This post is a collection of quick notes on PE analysis for use while handling an incident. I may update it in the future if I find something else worth including. I have added some reference links where you can find more in-depth information about the PE format. Writing it was also an excuse to play with the pefile Python library and a way to learn something about the PE format.
PE (Portable Executable) is the format used by executable files on 32-bit (PE32) and 64-bit (PE32+) Windows operating systems. It contains all the data the OS needs to load and execute the code.
Common file extensions that use the PE format: .cpl, .exe, .dll, .ocx, .sys, .scr, .drv, .efi
A PE file is divided into headers and sections.
The headers are: DOS header, NT header, Data directories, Section Table
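To make the layout more concrete, here is a minimal sketch, using only the standard library, of how the first headers chain together: the DOS header starts with `MZ`, its `e_lfanew` field at offset 0x3C points to the `PE\0\0` signature of the NT header, the 20-byte File header follows, and the first field of the Optional header (Magic) tells PE32 from PE32+. The names `read_pe_layout` and `build_fake_header` and the synthetic buffer are my own, used only for illustration; in practice pefile does all of this for you.

```python
import struct

OPTIONAL_MAGIC = {0x10B: "PE32", 0x20B: "PE32+"}

def read_pe_layout(data):
    """Return (e_lfanew, machine, number_of_sections, optional_magic)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature in DOS header")
    # e_lfanew: 4-byte little-endian offset of the NT header, stored at 0x3C
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature in NT header")
    # The File header follows the 4-byte signature: Machine, NumberOfSections
    machine, n_sections = struct.unpack_from("<HH", data, e_lfanew + 4)
    # The Optional header starts after the 20-byte File header; Magic comes first
    (magic,) = struct.unpack_from("<H", data, e_lfanew + 24)
    return e_lfanew, machine, n_sections, magic

def build_fake_header():
    """Hypothetical, header-only buffer; not a loadable executable."""
    buf = bytearray(0x100)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)         # e_lfanew
    buf[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<HH", buf, 0x84, 0x8664, 3)   # AMD64 machine, 3 sections
    struct.pack_into("<H", buf, 0x80 + 24, 0x20B)   # Magic for PE32+
    return bytes(buf)

offset, machine, n_sections, magic = read_pe_layout(build_fake_header())
print(hex(offset), hex(machine), n_sections, OPTIONAL_MAGIC.get(magic, "unknown"))
```

Against a real file you would pass the result of `open(path, 'rb').read()` instead of the synthetic buffer.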
Here is a list of some of the most interesting things to look for when analyzing a PE file.
- The NT header has two headers inside: File header and Optional header
- The File header contains very useful information for incident handling (IH). Its Time Date Stamp field holds the date the file was compiled. If the date doesn’t make any sense (for example, it lies in the future or long before the software could have existed), the header may have been tampered with, which is a hint that the PE file is not legitimate.
To retrieve this value with the pefile library:
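    import time
    import pefile

    # Raw string so the backslashes in the Windows path are kept literally
    pe = pefile.PE(r'c:\windows\system32\cmd.exe')
    ts = pe.FILE_HEADER.TimeDateStamp
    # The field holds seconds since the Unix epoch, so format it as UTC
    print(time.strftime('%d-%m-%Y %H:%M:%S', time.gmtime(ts)))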
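The "does this date make sense" check can be sketched as a small helper. This is only an illustration under my own assumptions: the function name `timestamp_looks_suspicious` and the cut-off year are hypothetical, and you should adjust the cut-off to your environment. Keep in mind that the field is trivially forged and that some build systems write a deterministic value instead of a real date, so a hit is a hint, not proof.

```python
from datetime import datetime, timezone

# Assumption: nothing compiled before this year is expected on the host.
EARLIEST_PLAUSIBLE_YEAR = 1995

def timestamp_looks_suspicious(ts, now=None):
    """Return True if a TimeDateStamp value is zero, in the future, or too old."""
    now = now or datetime.now(timezone.utc)
    if ts == 0:
        return True
    compiled = datetime.fromtimestamp(ts, tz=timezone.utc)
    return compiled > now or compiled.year < EARLIEST_PLAUSIBLE_YEAR

# Fixed reference date so the example is reproducible
reference = datetime(2020, 1, 1, tzinfo=timezone.utc)
plausible = int(datetime(2015, 6, 1, tzinfo=timezone.utc).timestamp())
future = int(datetime(2030, 1, 1, tzinfo=timezone.utc).timestamp())
print(timestamp_looks_suspicious(plausible, reference))
print(timestamp_looks_suspicious(future, reference))
```

With pefile you would pass `pe.FILE_HEADER.TimeDateStamp` as `ts`.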
- In the Optional header, the Subsystem field tells us which kind of executable the file is. The two most common values are a command-line (console) program and a program with a graphical interface.
To check that value we can use the following code:
    import pefile

    pe = pefile.PE(r'C:\Windows\system32\cmd.exe')
    ss = pe.OPTIONAL_HEADER.Subsystem
    # 2 = Windows GUI, 3 = Windows console (CUI)
    print(ss)
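The Subsystem field is just a number, so a small lookup table makes the output easier to read. The sketch below covers only a few values from the PE specification; the dictionary and the helper `describe_subsystem` are my own names, not part of pefile.

```python
# A few subsystem values from the PE specification (not the full list)
SUBSYSTEM_NAMES = {
    1: "IMAGE_SUBSYSTEM_NATIVE",           # drivers and native system processes
    2: "IMAGE_SUBSYSTEM_WINDOWS_GUI",      # program with a graphical interface
    3: "IMAGE_SUBSYSTEM_WINDOWS_CUI",      # console (command-line) program
    10: "IMAGE_SUBSYSTEM_EFI_APPLICATION", # EFI application
}

def describe_subsystem(value):
    """Translate the numeric Subsystem field into a readable name."""
    return SUBSYSTEM_NAMES.get(value, "unknown (%d)" % value)

print(describe_subsystem(3))   # value expected for a console program
print(describe_subsystem(2))
```

With pefile you would call `describe_subsystem(pe.OPTIONAL_HEADER.Subsystem)`.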
- The Section Table describes each section of a PE file. The set of sections varies from file to file.
We can see all the sections with the following code:
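    import pefile

    pe = pefile.PE(r'C:\Windows\system32\cmd.exe')
    for section in pe.sections:
        # Name is an 8-byte field padded with NUL bytes
        print(section.Name.rstrip(b'\x00').decode('ascii', errors='replace'))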
- Inside each section header, we can find two very interesting values: Virtual Size and Size of Raw Data. The former is the amount of memory allocated for the section during loading. The latter is the size of the section's data on disk. These two values should be fairly similar (not identical, because alignment in memory and on disk can differ). If they are very different, i.e. one is much bigger than the other, the PE file may have been modified or packed. That could be an indicator of malware.
To access these data, we can use the pefile library as follows:
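    import pefile

    pe = pefile.PE(r'C:\Windows\system32\cmd.exe')
    for section in pe.sections:
        # Strip the NUL padding from the 8-byte name before printing
        name = section.Name.rstrip(b'\x00').decode('ascii', errors='replace')
        print(name, section.Misc_VirtualSize, section.SizeOfRawData)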
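The comparison between the two sizes can be automated with a rough heuristic. The sketch below is an assumption of mine, not a standard rule: the helper name `flag_size_mismatch`, the ratio of 10, and the sample values are all hypothetical. Sections holding uninitialized data can legitimately be larger in memory than on disk, so treat the result as a pointer for manual review.

```python
def flag_size_mismatch(sections, ratio=10):
    """Flag sections whose virtual and raw sizes differ by more than `ratio`.

    `sections` is an iterable of (name, virtual_size, raw_size) tuples.
    """
    flagged = []
    for name, virtual_size, raw_size in sections:
        if raw_size == 0 and virtual_size > 0:
            flagged.append((name, "no data on disk"))
        elif raw_size and virtual_size > raw_size * ratio:
            flagged.append((name, "virtual size much larger than raw size"))
        elif virtual_size and raw_size > virtual_size * ratio:
            flagged.append((name, "raw size much larger than virtual size"))
    return flagged

# Made-up values for illustration only
sample = [
    (".text", 0x1000, 0x1000),
    ("UPX0", 0x8000, 0),
    (".data", 0x20000, 0x400),
]
for name, reason in flag_size_mismatch(sample):
    print(name, "->", reason)
```

To feed it from pefile, build the tuples from `section.Name`, `section.Misc_VirtualSize` and `section.SizeOfRawData` for each entry in `pe.sections`.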
Some of the most common sections are .text, .rdata, .data and .rsrc.
There are other sections, but I don’t list them because that’s not the goal of this post.
- The import table lists the libraries, and the functions within them, that an executable needs in order to run. This table is usually quite big, even for a relatively simple executable such as the Windows calculator (calc.exe). A very small number of imports usually means that the file is packed. Malware is often packed to evade signature-based antivirus detection, so a short import table marks a file as suspicious.
Retrieving imports from cmd.exe:
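    import pefile

    pe = pefile.PE(r'c:\windows\system32\cmd.exe')
    for entry in pe.DIRECTORY_ENTRY_IMPORT:
        print(entry.dll.decode())
        for imp in entry.imports:
            # imp.name is None when the function is imported by ordinal
            name = imp.name.decode() if imp.name else 'ordinal %d' % imp.ordinal
            print('\t', hex(imp.address), name)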
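Counting the imports gives a quick first impression of whether the table is unusually small. The following is a minimal sketch under my own assumptions: the helper `summarize_imports` and the threshold of 10 functions are hypothetical and arbitrary, and the sample table is made up. A packed file often imports little more than what it needs to resolve the rest at run time, but small legitimate tools exist too.

```python
def summarize_imports(imports, min_functions=10):
    """Summarize an import table given as {dll_name: [function names]}."""
    total = sum(len(functions) for functions in imports.values())
    return {
        "dlls": len(imports),
        "functions": total,
        "few_imports": total < min_functions,  # arbitrary threshold, tune it
    }

# Made-up import table resembling what a packed file might show
sample_imports = {"KERNEL32.dll": ["LoadLibraryA", "GetProcAddress"]}
print(summarize_imports(sample_imports))
```

With pefile, the dictionary can be built by looping over `pe.DIRECTORY_ENTRY_IMPORT` and collecting `imp.name` for each entry in `entry.imports`.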
- PE Walk through
- Reverse Engineering III – PE Format [PDF]
- pefile @ code.google.com
- An In-Depth Look into the Win32 Portable Executable File Format
- Practical Malware Analysis – The Hands-On Guide to Dissecting Malicious Software, by Michael Sikorski and Andrew Honig [No Starch Press – 2012]
- Peering Inside the PE: A Tour of the Win32 Portable Executable File Format
- The Portable Executable File Format from Top to Bottom