This post is about a Python parser for ELF (Executable and Linkable Format), but I will briefly go through the ELF spec first. Funny story: I once gave a couple of presentations about DPI and thought it would be funny to include a few slides about GCC and ELF. I called it “The short sort of ELF” and, as expected, the joke didn’t land. Good thing I am not a comedian :)
The ELF
ELF is the standard UNIX executable format, supported by toolchains (compilers/linkers) and loaders. The figure, taken from the spec, shows the two views of an ELF file: linking and execution (loader).
The ELF header contains the fields that the parser uses to locate the following:
- section headers
- program headers
- string table
For the implementation, I used OrderedDict to represent the fields and created a generic parse function that uses attr_size_map to populate them.
class Elf64Hdr(BinResource):
    def __init__(self):
        pass

    def size_map(self):
        # Field order and sizes follow the ELF header layout in the spec
        attr_size_map = collections.OrderedDict()
        attr_size_map["e_ident"] = BIT64_DATA_TYPE.Elf64_Char.value * 16
        attr_size_map["e_type"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_machine"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_version"] = BIT64_DATA_TYPE.Elf64_Word.value
        attr_size_map["e_entry"] = BIT64_DATA_TYPE.Elf64_Addr.value
        attr_size_map["e_phoff"] = BIT64_DATA_TYPE.Elf64_Off.value
        attr_size_map["e_shoff"] = BIT64_DATA_TYPE.Elf64_Off.value
        attr_size_map["e_flags"] = BIT64_DATA_TYPE.Elf64_Word.value
        attr_size_map["e_ehsize"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_phentsize"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_phnum"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_shentsize"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_shnum"] = BIT64_DATA_TYPE.Elf64_Half.value
        attr_size_map["e_shstrndx"] = BIT64_DATA_TYPE.Elf64_Half.value
        return attr_size_map
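The generic parse function itself is not shown in this post. As a rough sketch of the idea, assuming the size map holds byte counts in field order, it could walk the map and slice the input accordingly. The name `parse_fields` and the choice to convert each field to an int are my assumptions for illustration; the real parser appears to keep the raw bytes and convert later.

```python
import collections


def parse_fields(data, size_map, offset=0, byteorder="little"):
    """Slice `data` into integer fields, in the order given by `size_map`."""
    fields = collections.OrderedDict()
    for name, size in size_map.items():
        chunk = data[offset:offset + size]
        if len(chunk) != size:
            raise ValueError(f"not enough data for field {name!r}")
        fields[name] = int.from_bytes(chunk, byteorder)
        offset += size
    return fields


# Hypothetical two-field map: a 2-byte half word followed by a 4-byte word.
demo_map = collections.OrderedDict([("e_type", 2), ("e_version", 4)])
demo = parse_fields(bytes([0x02, 0x00, 0x01, 0x00, 0x00, 0x00]), demo_map)
print(demo["e_type"], demo["e_version"])  # 2 1
```

Because the map is ordered, the same function can serve any header whose layout is a flat sequence of fixed-size fields.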
The spec defines enumerated values for several header fields. I used Python's Enum to mirror them.
class E_TYPE(Enum):
    ET_NONE = 0
    ET_REL = 1
    ET_EXEC = 2
    ET_DYN = 3
    ET_CORE = 4
    ET_LOOS = 0xfe00
    ET_HIOS = 0xfeff
    ET_LOPROC = 0xff00
    ET_HIPROC = 0xffff

class E_MACHINE(Enum):  # TODO: x86 and x86-64 for now
    EM_NONE = 0
    EM_386 = 3
    EM_X86_64 = 62

class E_VERSION(Enum):
    EV_NONE = 0
    EV_CURRENT = 1
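One thing to watch with this approach: the LO/HI pairs mark the ends of reserved ranges, so a value that falls between them has no enum member and `E_TYPE(value)` raises ValueError. A small sketch of a lookup that handles this follows; the helper `describe_e_type` is hypothetical and not part of the parser, and the enum is trimmed to keep the example short.

```python
from enum import Enum


class E_TYPE(Enum):
    ET_NONE = 0
    ET_REL = 1
    ET_EXEC = 2
    ET_DYN = 3
    ET_CORE = 4


def describe_e_type(value):
    """Return a readable name for an e_type value, including reserved ranges."""
    try:
        return E_TYPE(value).name
    except ValueError:
        pass
    if 0xFE00 <= value <= 0xFEFF:
        return "OS-specific (ET_LOOS..ET_HIOS)"
    if 0xFF00 <= value <= 0xFFFF:
        return "processor-specific (ET_LOPROC..ET_HIPROC)"
    return "unknown"


print(describe_e_type(3))       # ET_DYN
print(describe_e_type(0xFE10))  # OS-specific (ET_LOOS..ET_HIOS)
```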
Sections
The section header table is an array of Elf32_Shdr structures (Elf64_Shdr in 64-bit files). The section header is defined as follows.
Similar to the ELF header, I defined the section header as a map of fields that the binary parser's segment function consumes.
class Elf64Shdr(BinResource):
    def __init__(self, data):
        data_dict = common.segment_bin(data, self.size_map(), 0, 'lsb')
        common.append_attr(self, data_dict)

    def size_map(self):
        attr_size_map = collections.OrderedDict()
        attr_size_map["sh_name"] = BIT64_DATA_TYPE.Elf64_Word.value
        attr_size_map["sh_type"] = BIT64_DATA_TYPE.Elf64_Word.value
        attr_size_map["sh_flags"] = BIT64_DATA_TYPE.Elf64_Xword.value
        attr_size_map["sh_addr"] = BIT64_DATA_TYPE.Elf64_Addr.value
        attr_size_map["sh_offset"] = BIT64_DATA_TYPE.Elf64_Off.value
        attr_size_map["sh_size"] = BIT64_DATA_TYPE.Elf64_Xword.value
        attr_size_map["sh_link"] = BIT64_DATA_TYPE.Elf64_Word.value
        attr_size_map["sh_info"] = BIT64_DATA_TYPE.Elf64_Word.value
        # The spec names this field sh_addralign
        attr_size_map["sh_addralign"] = BIT64_DATA_TYPE.Elf64_Xword.value
        attr_size_map["sh_entsize"] = BIT64_DATA_TYPE.Elf64_Xword.value
        return attr_size_map
The following fields of the ELF header describe how to locate the section header table:
- e_shoff: file offset of the section header table
- e_shnum: number of section header entries
- e_shentsize: size of each section header entry
# Walk the table: e_shnum entries of e_shentsize bytes each, starting at e_shoff
start = common.bytearray_to_int(self.ehdr.e_shoff)
for x in range(0, common.bytearray_to_int(self.ehdr.e_shnum)):
    end = start + common.bytearray_to_int(self.ehdr.e_shentsize)
    sh = Elf64Shdr(self.file_bin[start:end])
    start = end
    self.sh_tbl.append(sh)
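The arithmetic behind that loop is simply `offset + index * entry_size`. The sketch below isolates it and adds a bounds check, which helps with truncated or corrupt files. The function name `table_entry_ranges` and its parameters are made up for illustration and are not part of the parser.

```python
def table_entry_ranges(table_offset, entry_size, entry_count, file_size):
    """Yield (start, end) byte ranges for each entry of a header table."""
    for index in range(entry_count):
        start = table_offset + index * entry_size
        end = start + entry_size
        if end > file_size:
            raise ValueError(f"entry {index} runs past the end of the file")
        yield start, end


ranges = list(table_entry_ranges(table_offset=64, entry_size=64,
                                 entry_count=3, file_size=256))
print(ranges)  # [(64, 128), (128, 192), (192, 256)]
```

The same helper would apply to the program header table, since it is laid out the same way with e_phoff, e_phentsize and e_phnum.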
Program Header
The program header table is parsed the same way as the section header table, using e_phoff, e_phnum and e_phentsize.
## parse program table if applicable
self.ph_tbl = []
if common.bytearray_to_int(self.ehdr.e_phnum) > 0:
    start = common.bytearray_to_int(self.ehdr.e_phoff)
    for x in range(0, common.bytearray_to_int(self.ehdr.e_phnum)):
        end = start + common.bytearray_to_int(self.ehdr.e_phentsize)
        ph = Elf64Phdr(self.file_bin[start:end])
        start = end
        self.ph_tbl.append(ph)
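The Elf64Phdr class is not shown in this post. For reference, here is a sketch of the 64-bit program header layout using the standard `struct` module instead of the size-map approach. The names `PHDR_FORMAT`, `Phdr` and `parse_phdr` are mine, and the format string assumes a little-endian file.

```python
import struct
from collections import namedtuple

# Elf64_Phdr field order: two 4-byte words, then six 8-byte values.
PHDR_FORMAT = "<IIQQQQQQ"  # "<" assumes a little-endian (ELFDATA2LSB) file
Phdr = namedtuple(
    "Phdr",
    "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align",
)


def parse_phdr(raw):
    """Unpack one 56-byte program header entry into a named tuple."""
    return Phdr._make(struct.unpack(PHDR_FORMAT, raw))


PT_LOAD = 1
# Hand-built sample entry: a loadable segment with flags 5 (read + execute).
sample = struct.pack(PHDR_FORMAT, PT_LOAD, 5, 0, 0x400000, 0x400000,
                     0x1000, 0x1000, 0x1000)
ph = parse_phdr(sample)
print(struct.calcsize(PHDR_FORMAT), ph.p_type, hex(ph.p_vaddr))  # 56 1 0x400000
```

Note that in the 64-bit layout p_flags comes right after p_type, unlike the 32-bit layout where it sits near the end.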
String Table
e_shstrndx is the index of the section header for the section name string table. So, we get that section header and parse the section it points to using unpack_str_table.
## parse e_shstrndx and back annotate the sh headers (sh_tbl)
sym_sh = self.sh_tbl[common.bytearray_to_int(self.ehdr.e_shstrndx)]
# sh_offset is the section's position in the file; sh_addr is its address in memory,
# so it does not belong in the file offset
start = common.bytearray_to_int(sym_sh.sh_offset)
end = start + common.bytearray_to_int(sym_sh.sh_size)
strtab = common.unpack_str_table(self.file_bin[start:end])
# NOTE: pairing by position assumes the names are stored in section order;
# the spec locates each name through the section's sh_name offset
for sh, nm in zip(self.sh_tbl, strtab):
    sh.real_name = nm
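Per the spec, each section's sh_name is a byte offset into this string table, and the name runs from that offset to the next NUL byte. A minimal sketch of that lookup, working on the raw bytes of the table, is shown below. The helper `section_name` is hypothetical and is not how unpack_str_table works.

```python
def section_name(strtab, name_offset):
    """Return the NUL-terminated string that starts at `name_offset` in `strtab`."""
    end = strtab.index(b"\x00", name_offset)
    return strtab[name_offset:end].decode("ascii")


# A tiny hand-built table: offset 0 always holds the empty string.
strtab = b"\x00.text\x00.shstrtab\x00"
print(repr(section_name(strtab, 0)))  # ''
print(section_name(strtab, 1))        # .text
print(section_name(strtab, 7))        # .shstrtab
print(section_name(strtab, 10))       # strtab
```

The last call shows why offsets matter: a name may start in the middle of another string, so the table cannot always be split on NUL and matched to sections by position.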
Or we can just use readelf like a normal person.