Fan controller project (also, no more Arduino)

December 2, 2015

A while ago I started on a rather fun project: a temperature-controlled fan to keep our consoles from melting in their enclosed media cabinet under the TV (never mind the creepy cat).


Sure, I could’ve bought one, but sometimes it’s nice to build stuff with your own hands (and head). After a few prototypes using various Arduino boards I decided to make my own board based on the ATmega32U4 from Atmel in order to get the small form factor I was going for.

After some quality time spent in the lab the hardware design was done.


The first prototype was on proto-board and that sucked bad… I mean baaaad… sparks-between-solder-points bad… so on to making my own board with the laser printer and a bunch of nasty chemicals…


Yeaaa… that’s not gonna happen… Hopeless! I finally went to my trusted friends at https://oshpark.com for help, and here’s how it looks today (after I mounted some components on it):


Now that the hardware is done, time to do the software. Burn the bootloader and on we go with the Arduino IDE.

What can I say… turns out it’s a bit slow… so I got Atmel Studio and gave that a try. It’s really awesome! And… it just imports Arduino sketches as native projects. But then you’re still running Arduino code. Don’t get me wrong, Arduino code is just like the Arduino boards: great for prototyping, but if you wanna do native stuff it’s better to write native code.

Here’s a simple example of Arduino digitalWrite disassembly:

00000806 <digitalWrite>:
     806:	1f 93       	push	r17
     808:	cf 93       	push	r28
     80a:	df 93       	push	r29
     80c:	28 2f       	mov	r18, r24
     80e:	30 e0       	ldi	r19, 0x00	; 0
     810:	f9 01       	movw	r30, r18
     812:	ec 50       	subi	r30, 0x0C	; 12
     814:	ff 4f       	sbci	r31, 0xFF	; 255
     816:	84 91       	lpm	r24, Z
     818:	f9 01       	movw	r30, r18
     81a:	ee 5e       	subi	r30, 0xEE	; 238
     81c:	fe 4f       	sbci	r31, 0xFE	; 254
     81e:	d4 91       	lpm	r29, Z
     820:	f9 01       	movw	r30, r18
     822:	e0 5d       	subi	r30, 0xD0	; 208
     824:	fe 4f       	sbci	r31, 0xFE	; 254
     826:	c4 91       	lpm	r28, Z
     828:	cc 23       	and	r28, r28
     82a:	c9 f0       	breq	.+50     	; 0x85e <digitalWrite+0x58>
     82c:	16 2f       	mov	r17, r22
     82e:	81 11       	cpse	r24, r1
     830:	0e 94 8a 03 	call	0x714	; 0x714 <_ZL10turnOffPWMh>
     834:	ec 2f       	mov	r30, r28
     836:	f0 e0       	ldi	r31, 0x00	; 0
     838:	ee 0f       	add	r30, r30
     83a:	ff 1f       	adc	r31, r31
     83c:	e4 5a       	subi	r30, 0xA4	; 164
     83e:	fe 4f       	sbci	r31, 0xFE	; 254
     840:	a5 91       	lpm	r26, Z+
     842:	b4 91       	lpm	r27, Z
     844:	8f b7       	in	r24, 0x3f	; 63
     846:	f8 94       	cli
     848:	11 11       	cpse	r17, r1
     84a:	05 c0       	rjmp	.+10     	; 0x856 <digitalWrite+0x50>
     84c:	9c 91       	ld	r25, X
     84e:	ed 2f       	mov	r30, r29
     850:	e0 95       	com	r30
     852:	e9 23       	and	r30, r25
     854:	02 c0       	rjmp	.+4      	; 0x85a <digitalWrite+0x54>
     856:	ec 91       	ld	r30, X
     858:	ed 2b       	or	r30, r29
     85a:	ec 93       	st	X, r30
     85c:	8f bf       	out	0x3f, r24	; 63
     85e:	df 91       	pop	r29
     860:	cf 91       	pop	r28
     862:	1f 91       	pop	r17
     864:	08 95       	ret

That is atrocious! Never mind the number of cycles, but man… branches… function calls… AAAAARGH!

Time to write my own…

For contrast, here’s my generated code for digitalWrite:

1f6:	2d 9a       	sbi	0x05, 5	; 5

Yes! That’s right. One bloody line. Remember though, that’s *generated* code. The source code handles every case just like Arduino does. It supports PWM and setting the mode to input, input_pullup, output, all the good things. It’s just that the generated code is a single instruction instead of the 47-instruction routine above.

It’s good to realize compilers are pretty smart nowadays and it’s sad to see people don’t take advantage of that. I’m not saying trust the compiler blindly, but once you’ve verified it does what you expect it to do, then use it! That’s what compilers were invented for in the first place!

The key to making this super optimal is proper use of inlines and consts.

Example:

static FORCE_INLINE  volatile uint8_t* GetDDRRegFromPin(const Pin& pin)
{
	if ( &PORTB == pin.port )
		return &DDRB;
	else if ( &PORTC == pin.port )
		return &DDRC;
	else if ( &PORTD == pin.port )
		return &DDRD;
	else if ( &PORTE == pin.port )
		return &DDRE;
	//else if ( &PORTF == pin.port )
	
	return &DDRF;
}

static FORCE_INLINE  volatile uint8_t* GetPINRegFromPin(const Pin& pin)
{
	if ( &PORTB == pin.port )
		return &PINB;
	else if ( &PORTC == pin.port )
		return &PINC;
	else if ( &PORTD == pin.port )
		return &PIND;
	else if ( &PORTE == pin.port )
		return &PINE;
	//else if ( &PORTF == pin.port )
	
	return &PINF;
}

struct Pin
{
 volatile uint8_t* const port; // pin's port physical address
 uint8_t mask; // pin's bit
 volatile uint8_t* const timer; // pin's timer physical address
 uint8_t timerBit; // timer's bit
 volatile uint16_t* const pwmReg;// PWM register address
 ...
 void FORCE_INLINE SetHigh() const { *port |= mask; }
 void FORCE_INLINE SetLow() const { *port &= ~mask; }
 void FORCE_INLINE SetDDRHigh() const { *GetDDRRegFromPin(*this) |= mask; }
 void FORCE_INLINE SetDDRLow() const { *GetDDRRegFromPin(*this) &= ~mask; }
 uint8_t FORCE_INLINE Get() const { return *GetPINRegFromPin(*this) & mask;}
 ...
};

Now… you might say ‘WTF is with all the branches ?!?!?! this is stupid!’

Well, not quite. If you look carefully, the input arguments are const. In most cases people set up pins using literal constants (as in numbers).
Once the functions are inlined, the compiler propagates those constants at compile time and then optimizes all the branches away. And that’s exactly what it does.

This only works if the consts are allowed to propagate properly. This means that the initialization of the pins has to be done in a certain way. Having them as members of a struct that you init in the constructor won’t do the trick. And if they’re not properly propagated you’ll end up generating all those branches from above and ain’t nobody got time for that… literally… you’re running on an 8-16MHz MCU…

So. This is how I initialize my LCD thingy:

static const HD44780::Config lcdContext =
{
	PINDESC_D12,
	PINDESC_D4,
	PINDESC_D14,
	PINDESC_D16,
	PINDESC_D15,
	PINDESC_D17,
};

For PINDESC_Dxx I took advantage of the compiler/preprocessor by making these crafty struct initializing macros for each of the pins.

#define PINDESC_D0   { &PORTD, (1 << 2), 0, 0, 0} // example PIN without PWM support
...
#define PINDESC_D5   { &PORTC, (1 << 6), &TCCR3A, 1 << COM3A1, &OCR3A} // PIN with PWM support
...
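
To try the idea without an AVR toolchain, here is a minimal desktop sketch of the same pattern. Everything in it is made up for illustration: the FAKE_* variables stand in for the real I/O registers from <avr/io.h>, and PinDesc, SKETCH_FORCE_INLINE, DdrFor, MakeOutput, SetHigh, kLed and LedOn are hypothetical names, not part of the library. The pin is an aggregate-initialized constant, like the PINDESC_Dxx macros, so after inlining the compiler knows which branch of DdrFor is taken.

```cpp
#include <cstdint>

// Stand-ins for AVR I/O registers so the sketch builds on a desktop compiler.
// On the real chip these would be PORTB, DDRB, etc. (assumption for illustration).
volatile uint8_t FAKE_PORTB = 0, FAKE_DDRB = 0, FAKE_PORTC = 0, FAKE_DDRC = 0;

#if defined(__GNUC__)
#define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_FORCE_INLINE inline
#endif

// Hypothetical, cut-down pin descriptor: port address plus bit mask.
struct PinDesc
{
    volatile uint8_t* port;
    uint8_t mask;
};

// Looks like a run-time branch, but with a constant pin and inlining the
// comparison can be decided at compile time.
static SKETCH_FORCE_INLINE volatile uint8_t* DdrFor(const PinDesc& pin)
{
    if (pin.port == &FAKE_PORTB)
        return &FAKE_DDRB;
    return &FAKE_DDRC;
}

static SKETCH_FORCE_INLINE void MakeOutput(const PinDesc& pin)
{
    volatile uint8_t* ddr = DdrFor(pin);
    *ddr = static_cast<uint8_t>(*ddr | pin.mask);
}

static SKETCH_FORCE_INLINE void SetHigh(const PinDesc& pin)
{
    *pin.port = static_cast<uint8_t>(*pin.port | pin.mask);
}

// Aggregate-initialized constant, in the spirit of the PINDESC_Dxx macros.
static const PinDesc kLed = { &FAKE_PORTB, static_cast<uint8_t>(1u << 5) };

void LedOn()
{
    MakeOutput(kLed); // sets bit 5 of the fake DDRB
    SetHigh(kLed);    // sets bit 5 of the fake PORTB
}
```

Whether the branches really disappear depends on the compiler and the optimization level, so it is worth checking the disassembly of LedOn the same way as for digitalWrite above.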

So now that the constants are properly propagated, the inlines are properly inlined and everything is fast and dandy, I’m happy!

Don’t you just love a happy ending? I know I do!

 

P.S.

You can download the library from GitHub here. Keep in mind that it is *not* a fully tested released product. It’s just a little something I’m using for my own projects. Use at your own risk.


Computing camera field of view from frustum planes

June 11, 2015

In one of our VR platforms at Unity today we needed to compute the camera field of view from a set of 6 frustum planes. The previous way to calculate that was pretty slow – it was using 1 acos, 2 tans and an atan. Using a few trig tricks, the new cost of that is a reciprocal sqrtf ( which is faster than a normal sqrtf and it’s tons faster than an acos) and an atanf.

First some identities:

1) cos(\beta) = \vec{N}_{top} \cdot \vec{N}_{near}

2) \alpha = \frac{\pi}{2} - \beta, \ and \ \beta=\frac{\pi}{2}-\alpha

3) tan(\alpha) = \frac{sin(\alpha)}{cos(\alpha)}

4) sin^2(\alpha) + cos^2(\alpha)  = 1

The image represents the geometrical construct to help visualize the problem a bit better. Nt is the normal of the top frustum plane, Nn is the normal of the near frustum plane (both unit length), \alpha is fov/2 and \beta is the angle between Nn and Nt.

frustum

From (2) and (1) we get

5)  cos(\beta) = sin(\alpha)  ( this can be derived by substituting (2) into (1) as follows:  cos(\beta) = cos(\frac{\pi}{2}-\alpha)=cos(\frac{\pi}{2})cos(\alpha)+sin(\frac{\pi}{2})sin(\alpha) = sin(\alpha) )

Using (4) and (5) we can rewrite (3) as follows:

6)  tan(\alpha)=\frac{cos(\beta)}{\sqrt{1-cos^2(\beta)}}

Note that dividing by the sqrtf in the denominator can be replaced with multiplying by a reciprocal sqrtf, which is usually way faster than a normal sqrtf.

 

To get the field of view, it’s enough to just do fov = 2\alpha = 2 * arctan(\frac{cos(\beta)}{\sqrt{1-cos^2(\beta)}})

 

The code ended up looking something like this:

float cosb = DotProduct(frustum[kTop].n, frustum[kNear].n);
float fov = 2.0f * atanf(cosb * rsqrtf(1.0f-cosb*cosb));
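
For anyone who wants to check the formula outside the engine, here is a self-contained sketch of the same two lines. Vec3, Dot, ReciprocalSqrt and FovFromPlaneNormals are hypothetical names, ReciprocalSqrt is a plain 1/sqrt stand-in for whatever fast rsqrtf the platform offers, and the sketch assumes both normals are unit length and point into the frustum.

```cpp
#include <cmath>

// Hypothetical minimal vector type for the sketch.
struct Vec3 { float x, y, z; };

inline float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Portable stand-in for a hardware reciprocal square root.
inline float ReciprocalSqrt(float v)
{
    return 1.0f / std::sqrt(v);
}

// Field of view in radians from the unit normals of the top and near planes.
// cos(beta) = Nt . Nn, tan(alpha) = cos(beta) / sqrt(1 - cos^2(beta)), fov = 2 * alpha.
inline float FovFromPlaneNormals(const Vec3& topNormal, const Vec3& nearNormal)
{
    const float cosb = Dot(topNormal, nearNormal);
    return 2.0f * std::atan(cosb * ReciprocalSqrt(1.0f - cosb * cosb));
}
```

With a camera looking down -z, a near normal of (0, 0, -1) and a top normal of (0, -sin(45°), -sin(45°)), the result is a 90 degree field of view. The formula breaks down when the two normals are parallel, since 1 - cos^2(\beta) is then zero.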

 


CNC Log: 1- Humble beginnings

January 1, 2015

I had this project on my mind for quite a while now and I’ve decided to finally start on it.

Instead of buying a kit or following plans, I thought it would be way more fun to make my own.
I also wanted to make it a bit more sturdy in the hope that one day I could make it actually mill steel – most frames out there are made of aluminum extrusion which is not great for stability.

So far I got the following things for it:
– 3 Axis Nema34 kit from eBay ( dual shaft motors @ 878oz torque, 7.8A/256microstep drivers & 200W 60VDC power supplies) stored in the electronics housing of a 36″ HP DesignJet 750C plotter I got off craigslist.
– 3W 445nm laser diode + driver. ( still waiting for a larger radiator heat sink to arrive from China )
– 2 16mm linear rails from the same HP DesignJet plotter.
– Matching linear bearing blocks from eBay that fit the rails above.
– 10m (33ft) of 1.5wide, 3mm pitch PU timing belt for driving the axes also off eBay.
– A bunch of steel square tubing in different sizes that I’m using to prototype the frame and the axes for now.
– Skateboard bearings & patio door bearings to prototype linear slides.
– Random nuts & bolts
– Random other things too small to mention.

Since you can’t really do any proper metal work without proper tools I had to update my collection with some more:
– 4.5in angle grinder with different abrasive cutting discs, grinding wheels & wire brushes.
– 7.5in circular saw with a few carbide tipped blades for cutting steel (it goes through 1/8in steel plate like it’s butter)
– large 20in bench drill press
– Impact wrench
– combination hammer/impact/drill
– orbital sander
– jigsaw with metal cutting blades
– files
– a bunch of other small things like screwdrivers, pliers, etc

There are also some things that I’d really like to add to my collection, but since they’re quite expensive new I have to wait until I find them on craigslist:
– MIG/FluxCore welder, as screwing things together is not always the sturdiest thing to do
– Plasma cutter (that I’d like to add as an attachment to the finished machine so it can improve itself)

Now, I’ve been playing with different things for the past couple of months and I figured I could really use a log of sorts to record the good ideas, the bad ideas, successful and failed attempts at doing different things.

Since the rails from the plotter came out different sizes it only made sense to start the video log with plenty of sparks so here it goes:


My first game with the Novag Obsidian chess computer

August 10, 2013


Managing virtual address spaces with MMAP

January 11, 2013

Being able to manage your own virtual address space(s) is sometimes really useful. Some platforms (Win32/Xbox360/PS3) allow you to do this in a very elegant fashion: you first allocate your address space and then you start mapping physical memory into it. OSX and Linux (to my limited knowledge) only support managing virtual memory through the MMAP system calls.

The details for mmap can be found here.

If you look through the manpage, you’ll notice there is no notion of address space, so how can we achieve what we want to achieve using this (primitive) system call API?

Well, it’s not that simple, but assuming (I know… I always say that assumption is the mother of all fuckups…) the mmap implementation is true to the design document, it will implement demand paging. This is something that we can really use. Since I can’t talk about Xbox360 or PS3 publicly, I can illustrate this with Win32’s VirtualAlloc.

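In order to be able to manage our own memory space successfully we need to have access to the following 4 operations:
– reserve a range of address space
– commit physical memory into part of that range
– decommit memory (give the physical pages back while keeping the addresses reserved)
– free the address space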

The minimalistic implementation of the four operations using the Win32 API is as follows:

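#include <windows.h>

// Reserve a range of virtual addresses without backing it with memory.
void* AllocateAddressSpace(size_t size)
{
    return VirtualAlloc(NULL, size, MEM_RESERVE, PAGE_NOACCESS);
}

// Back part of a reserved range with physical memory.
void* CommitMemory(void* addr, size_t size)
{
    return VirtualAlloc(addr, size, MEM_COMMIT, PAGE_READWRITE);
}

// Return the physical memory to the OS but keep the addresses reserved.
void DecommitMemory(void* addr, size_t size)
{
    VirtualFree(addr, size, MEM_DECOMMIT);
}

// Release the whole reservation; MEM_RELEASE requires a size of 0.
void FreeAddressSpace(void* addr, size_t size)
{
    (void)size; // unused, kept so the signature matches the mmap version
    VirtualFree(addr, 0, MEM_RELEASE);
}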

Now, with MMAP there is no “ownership” of pages until they’re mapped and used, and that makes the implementation of the address space concept really tricky.
In order to keep the pages reserved in your virtual address space, all you need to do is to just remap the address and then trick the TLB to remap the area as a freshly mapped region.
This, together with the demand paging will basically return the address into a reserved & uncommitted state hence returning the physical memory back to the OS.
The interesting part (and a bit upside-down from what you’d expect to see there) is in DecommitMemory.
The equivalent MMAP implementation would be somewhat like this:

#include <sys/mman.h>

void* AllocateAddressSpace(size_t size)
{
    void * ptr = mmap((void*)0, size, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0);
    msync(ptr, size, MS_SYNC|MS_INVALIDATE);
    return ptr;
}

void* CommitMemory(void* addr, size_t size)
{
    void * ptr = mmap(addr, size, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
    msync(addr, size, MS_SYNC|MS_INVALIDATE);
    return ptr;
}

void DecommitMemory(void* addr, size_t size)
{
    // instead of unmapping the address, we're just gonna trick 
    // the TLB to mark this as a new mapped area which, due to 
    // demand paging, will not be committed until used.

    mmap(addr, size, PROT_NONE, MAP_FIXED|MAP_PRIVATE|MAP_ANON, -1, 0);
    msync(addr, size, MS_SYNC|MS_INVALIDATE);
}

void FreeAddressSpace(void* addr, size_t size)
{
    msync(addr, size, MS_SYNC);
    munmap(addr, size);
}
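
For experimenting with the reserve/commit/decommit cycle in isolation, here is a rough sketch of the same mmap remapping trick with the return values checked. It is a simplification, not the code above: ReserveRange, CommitRange, DecommitRange, ReleaseRange and PageSize are hypothetical names, the msync calls are left out, and the commit uses MAP_PRIVATE instead of MAP_SHARED. Addresses and sizes are assumed to be multiples of the page size.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// Page size of the running system; ranges passed below should be multiples of it.
std::size_t PageSize()
{
    return static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
}

// Reserve addresses only: PROT_NONE pages cannot be touched, so demand
// paging never has a reason to back them with physical memory.
void* ReserveRange(std::size_t size)
{
    void* p = mmap(nullptr, size, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Map fresh read/write pages over part of the reservation.
bool CommitRange(void* addr, std::size_t size)
{
    void* p = mmap(addr, size, PROT_READ | PROT_WRITE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);
    return p != MAP_FAILED;
}

// Map fresh PROT_NONE pages over the range: the old pages go back to the OS
// while the addresses stay reserved.
bool DecommitRange(void* addr, std::size_t size)
{
    void* p = mmap(addr, size, PROT_NONE,
                   MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);
    return p != MAP_FAILED;
}

// Give the whole range back, addresses included.
bool ReleaseRange(void* addr, std::size_t size)
{
    return munmap(addr, size) == 0;
}
```

A quick way to see the decommit working is to commit a page, write to it, decommit it, commit it again and read it back: the second commit hands out a fresh zero-filled page, so the old contents are gone.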

I hope this will help someone save some time by not debugging nonsensical crashes in random and seemingly unrelated places…
It took around 2 days of debugging GPU driver and Mono crashes until I could come up with a solution and an explanation that would make sense to me…


…And out of nowhere… bubbles!

May 26, 2010

Last week I was in Romania for a short visit, and it so happened that when I ordered the tickets I was somewhere else. Because of my “spiritual” absence, I managed to order the worst possible tickets: a 4.5 hour wait in Amsterdam. Not enough time to get into the city properly but plenty of time to get bored.
While walking half-bored through the airport, I noticed some benches and next to one of them a power plug. So an idea struck: let’s plug in the laptop and have some Unity fun instead of letting boredom get the best of me. Now, I wanted to do something quick that would be finished before I boarded the plane, so why not bubbles.
You can see the result in the picture.

The “feature list” consists of: cube-mapped Fresnel reflections using Schlick’s approximation, chromatic aberration and vertex shader based wobbling.

You can download the Unity Package to play around with it at your own discretion.

Have fun.


Fight your fears with infinite ammo

May 24, 2010

Yet another simple game that came out of nothing 🙂 I was trying to write a controller for a crawler, and once I had them all over the walls there was only one thing missing – a minigun!

Enjoy!


At first there was nothing. 5 hours later, we had this. (v2.0)

December 31, 2009

It all started when I was showing off how fast it is to make a game with Unity. Next thing you know we started messing around and got this running. It only took like 2-3 hours to have the first level set up and playable. After some more polish it got to a stage where it could be shown to the world.

Features:
– It’s shiny 🙂
– A lil bit of fun from the physics driven game play

There’s a lot to do still:
– GUI
– more levels
– whatever our minds can conceive

So have fun, and stay tuned for updates 😉

Oh, and Happy New Year everyone!

************ Change log ************

– 13 January 2010

* Added another level ! Yay!
* Now you can rotate the camera by pressing space; it rotates in 90 degree increments

– 02 January 2010.

* Added score GUI
* Added sound effects
* Added background music
* Added prizes
* Added camera collision in case the ball becomes occluded
* Added a background skybox made by Danielle.


Rats racing super charged booster equipped sandals… how cool is that?!

November 21, 2009

So I guess you can figure out yourselves how this whole idea started… I don’t really know how the sandals ended up on the table, but I know that one of my colleagues – Alexandra – placed the little plush mice inside them. From the angle I was standing at it looked like they were racing, so .023 seconds later I thought: Hey! Could you imagine a better way of getting familiar with Unity than doing something crazy & fun? It could make a good FAFF project. So the next Friday I started working on it.

I modeled the sandal and the boosters and I got the mouse & the room model from my friend Danielle.

The whole “project” took like 2-3 hours of work – without the modeling of course.
I don’t think I’ll continue working on it since I managed to remove some of the max scenes but that’s no problem. I already have lots of other ideas for crazy casual games that I’ll try out at some point.

An image speaks 1000 words they say… Wonder how many words animated images speak… Without further ado, there you go:

W/S – forward / backward
Mouse – rotate
Space – jump.

This was moved to http://nervus.org/ratty/ratty.html


Teaching an old dog new tricks…

November 12, 2009

dog

Today I got a MacBook Pro so I can test some of the stuff I’m doing on the OS X version of Unity as well.
I had used Mac OS before, when I tried to see how Ableton Live works on it, but I never really tried to use it for programming.
What can I say… Xcode is really weird!
I’ve been using Microsoft IDEs since Visual C++ 6 or so. I’ve tried CodeWarrior, KDE’s KDevelop, CodeBlocks and all sorts of other obscure IDEs but nothing was as out-of-this-world as Xcode. It sometimes feels like “either me, or him” so I hope it’ll be me who makes it out alive 🙂
Ah, did I mention that I’ve only had it for like 8 hours or so? At least it frustrates quickly…

