Saturday, May 27, 2006

Better Support for Relocatable ELF Files

Looking at how the Boomerang ELF loader handles symbols and relocations, I noticed that something was clearly wrong for relocatable files (i.e., .o files). The loader was assuming that the VirtualAddress members of the section table were set as they are in executable images. This is not the case: in a relocatable file they are typically zero. It is the duty of the loader to choose an arbitrary starting address and to load each section at an appropriate offset from that address. I decided that choosing the same default address that IDA Pro uses was probably a good idea, because I often switch between Boomerang and IDA Pro to gather information, especially information that Boomerang has gotten wrong. I also decided to delay loading any section whose name starts with ".rel." until all the other sections are loaded, because IDA Pro does so. I don't know why it does this, but I want my addresses to match those in IDA Pro, so I have to follow its lead.
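As a rough sketch of that layout step (not the actual Boomerang code, which is C++), the following assigns an address to each section from a chosen base and pushes the ".rel." sections to the end. The `Section` class, the `assign_addresses` function, and the alignment handling are assumptions made for illustration; the base address is left as a parameter.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """Hypothetical stand-in for one entry of the ELF section table."""
    name: str
    size: int
    align: int = 1  # 0 or 1 means "no alignment constraint"


def align_up(value, align):
    """Round value up to the next multiple of align."""
    if align <= 1:
        return value
    return (value + align - 1) // align * align


def assign_addresses(sections, base):
    """Return {section name: load address}, starting at base.

    Sections whose name starts with ".rel." are placed after all the
    others, keeping the original order within each group.
    """
    normal = [s for s in sections if not s.name.startswith(".rel.")]
    deferred = [s for s in sections if s.name.startswith(".rel.")]

    addresses = {}
    cursor = base
    for sec in normal + deferred:
        cursor = align_up(cursor, sec.align)
        addresses[sec.name] = cursor
        cursor += sec.size
    return addresses


if __name__ == "__main__":
    # Made-up section sizes and base address, for illustration only.
    table = [
        Section(".text", 0x25, 16),
        Section(".rel.text", 0x10, 4),
        Section(".data", 0x08, 4),
    ]
    for name, addr in assign_addresses(table, 0x1000).items():
        print(f"{name:10s} {addr:#x}")
```

Whether IDA Pro pads or aligns sections in exactly this way is something I have not verified here, so matching its addresses may need a different rounding rule.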

After fixing this, I noticed that all the symbols and relocations now pointed to addresses that were not in any section. In a relocatable file a symbol's value is an offset into its section, so every symbol that points into the .text section, for example, needs to have the base address of that section added to its offset. Relocations are stored as section-relative offsets too, and they point to symbols, so they also need to have the base address of the section added to their offset.
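The adjustment amounts to one addition per symbol or relocation. A minimal sketch, assuming the loader has already built a `section_bases` mapping from section index to load address (that mapping and both function names are hypothetical); the `SHN_*` constants are the standard ELF reserved section indices:

```python
SHN_UNDEF = 0           # symbol is undefined in this file
SHN_LORESERVE = 0xFF00  # start of the reserved section index range
SHN_ABS = 0xFFF1        # symbol value is absolute, not section-relative


def symbol_address(st_value, st_shndx, section_bases):
    """Turn a relocatable-file symbol into a loaded address.

    Returns None for symbols that have no address in this file
    (undefined, or reserved indices other than SHN_ABS).
    """
    if st_shndx == SHN_UNDEF:
        return None
    if st_shndx == SHN_ABS:
        return st_value
    if st_shndx >= SHN_LORESERVE:
        return None
    return section_bases[st_shndx] + st_value


def relocation_address(r_offset, target_shndx, section_bases):
    """Address of the location a relocation patches.

    target_shndx is the index of the section the relocation applies to.
    """
    return section_bases[target_shndx] + r_offset


if __name__ == "__main__":
    bases = {1: 0x1000, 2: 0x1028}  # made-up: 1 = .text, 2 = .data
    print(hex(symbol_address(0x10, 1, bases)))
    print(hex(relocation_address(0x04, 1, bases)))
```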

Some of this was already in the ELF loader, or at least some attempt had been made to add it.

Relocatable ELF files have no "entrypoints" as such, as they are not typically intended to be executed on their own. Typically the linker combines multiple .o files to produce an executable. So, at first, it would appear that attempting to decompile any .o file that doesn't have the main symbol in it, and where no specific entrypoints have been supplied by the user, would be a bit questionable. However, two examples spring to mind where the decompiler should be able to find an entrypoint. The first is kernel modules, where the initialization and teardown functions are valid entrypoints. The second is X11 modules, which have similar entrypoints. Both cases are obviously interesting targets for decompilation.
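One way this search could work is to look up a list of well-known names in the symbol table once the addresses have been fixed up. The sketch below is only that: `find_entry_points` and the `symbols` mapping are hypothetical, and the default names are an assumption based on the conventional Linux kernel module symbols `init_module` and `cleanup_module`. X11 modules would need their own names added to the list.

```python
# Assumed candidate names, in order of preference.
DEFAULT_CANDIDATES = ("main", "init_module", "cleanup_module")


def find_entry_points(symbols, candidates=DEFAULT_CANDIDATES):
    """Return [(name, address)] for each candidate present in symbols.

    symbols maps a symbol name to its loaded address, or to None when
    the symbol is undefined in this file.
    """
    found = []
    for name in candidates:
        address = symbols.get(name)
        if address is not None:
            found.append((name, address))
    return found


if __name__ == "__main__":
    # Made-up symbol table for a kernel-module-like object file.
    table = {
        "init_module": 0x1000,
        "cleanup_module": 0x1040,
        "printk": None,  # undefined, resolved elsewhere
    }
    for name, address in find_entry_points(table):
        print(f"{name} at {address:#x}")
```

Entrypoints supplied by the user would still take priority; a lookup like this only makes sense as a fallback when none are given.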
