If you pop over to the NVIDIA website and download the 3D card drivers for Linux, you'll note that there is a /usr/src/nv directory. In that directory is the source code to the "thin layer" against which NVIDIA links their binary blob into the Linux kernel. This, of course, makes no legal difference - NVIDIA is still extending the Linux kernel, and therefore it is unlawful to distribute the Linux kernel along with the NVIDIA drivers. But NVIDIA doesn't do that, so it's not a problem - for them.

Anyway, that's a side issue. I was thinking, recently, about the Linux kernel's "tainted" flag. Essentially, whenever you install a kernel module that isn't under some accepted open source license, the kernel sets a flag so that developers know there is a chance that any bugs you report might have been caused by code they can't fix. In general, this is not such a big deal, as kernel modules are typically small and easy to isolate. The NVIDIA graphics drivers, on the other hand, are not small - they are actually over 5 MB. Loading anything 5 MB in size into the kernel is typically a bad idea. It's much better to split some parts out into userland and use thunks to connect the kernel part to the userland part.
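(As an aside, you can see the flag for yourself - a minimal check, assuming a kernel recent enough to expose it through /proc:)

```shell
# The taint state lives in a single sysctl; 0 means untainted.
cat /proc/sys/kernel/tainted

# Loading a module with a non-GPL-compatible license sets the
# "proprietary module" bit (bit 0), so the value becomes odd.
```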
So, back to this /usr/src/nv directory. If you have a look, there's a bunch of source files, header files, makefiles and this nv-kernel.o file. That's the binary blob - they don't give us source for that bit. For me, it's 5,104,332 bytes, and most of the symbols have been replaced with names like _nv003299rm. This, of course, is to make it harder to understand for anyone trying to reverse engineer it.
Now, someone out there who builds their own kernel - I need your help (done, see comments). Go into the /usr/src/nv directory, run make and get this thing to build. You may have to screw around with paths and such, which is the reason I haven't done it myself. OK, got it to build? Great. Now run make clean, remove nv-kernel.o from the Makefile, and make again. You should get some errors - in particular, we're looking for "undefined symbol" errors. How many symbols are undefined?
This tells us how big the interface is between the source layer and the binary layer. If this interface is small enough, we can write a thunk for each symbol and move the binary layer into userland. If speed tests show that running the blob in userland isn't much slower, then we can create a module that contains no binary code, and maybe the Linux kernel developers will consider this "untainted" enough to be useful. Knowing that interface will also help with reverse engineering the binary part.