During a penetration testing engagement, suppose you are required to backdoor a PE file with your own shellcode without increasing the size of the executable or altering its intended functionality, while hopefully making it fully undetectable (FUD). How would you do it? For example, after recon, you learn that a large number of employees use a certain program. The social engineering way into the victim’s network would be a phishing email to employees with a link to download an “updated version of that program”, which is actually a backdoored binary of the updated program. This post covers how to backdoor a legitimate x86 PE (Portable Executable) file by adding our own reverse TCP shellcode without increasing its size or altering its functionality. Different techniques for making the backdoored PE file fully undetectable (FUD) are also discussed. The focus at each step of the process is to make the backdoored file fully undetectable. The word “undetectable” here is used in the context of scan-time static analysis. An introductory understanding of the PE file format, x86 assembly and debugging is required. Each section builds upon the previous one and, for the sake of conciseness, no topic is repeated, so refer back and forth for clarity.
Self Imposed Restrictions for backdooring PE file
Our goal for backdooring the PE file is that it becomes fully undetectable by antiviruses, while the functionality of the backdoored program remains the same with no interruptions or errors. For antivirus scanning purposes we will be using NoDistribute. There are a lot of ways to make a binary undetectable: using crypters that encode the entire program and include a decoding stub to decode it at runtime, compressing the program using UPX, or using veil-framework or msfvenom encodings. We will not be using any such tools. The purpose is to keep it simple and elegant! For this reason, I have the following self-imposed restrictions:-
- No use of msfvenom encoding schemes, crypters, veil-framework or any such fancy tools.
- Size of the file should remain the same, which means no extra sections, decoder stubs, or compressing (UPX).
- The functionality of the backdoored program must remain the same with no error/interruptions.
The methods covered in this post are:-
- Adding a new section header to add shellcode.
- User interaction based shellcode trigger + code caves.
- Dual code caves with custom encoder + triggering shellcode upon user interaction.
Criteria for PE file selection for implanting backdoor
Unless you are forced to use a specific binary for backdooring, keep the following points in mind. They are not mandatory but preferred, because they help reduce the AV detection rate and make the end product more feasible.
- The executable should be small (< 10 MB). A smaller file is easier to transfer to the victim during a penetration testing engagement. You could email it in a ZIP or use other social engineering techniques. It is also more convenient to debug in case of issues.
- Backdoor a well-known product, for example uTorrent, network utilities like PuTTY, Sysinternals tools, WinRAR, 7zip etc. Using a known PE file is not required, but an AV is more likely to flag a backdoored unknown PE than a backdoored known PE, and the victim would be more inclined to execute a known program.
- Prefer PE files that are not protected by security features such as ASLR or DEP. Backdooring protected binaries is more complicated and makes no difference to the end product compared to normal PE files.
- It is preferable to use C/C++ native binaries.
- It is preferable to have a PE file whose legitimate functionality includes communicating over the network. This could fool a few antiviruses upon execution, when the backdoor shellcode makes a reverse connection to our desired box: some antiviruses would not flag it and would consider it part of the program’s functionality. Chances are that network monitoring solutions and people would also consider the malicious communication as legitimate functionality.
The program we will be backdooring is the 7Zip file archiver (GUI version). Firstly, let’s check if the file has ASLR enabled.
ASLR (Address Space Layout Randomization) randomizes the addresses each time the program is loaded in memory, so an attacker cannot rely on hardcoded addresses to exploit flaws or place shellcode.
As we can see in the above screenshot, not much in terms of binary protection. Let’s take a look at some other information about the 7zip binary.
The PE file is a 32-bit binary of about 500 KB, written in native code (C++). It seems like a good candidate for backdooring. Let’s dig in!
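The protection flags can also be read straight from the PE header instead of relying on a screenshot. The following is a minimal Python sketch and not the tool used in this post: it reads the DllCharacteristics field of the optional header and reports whether the ASLR (DYNAMIC_BASE) and DEP (NX_COMPAT) bits are set. The function name pe_mitigations is made up for illustration.

```python
import struct

# Bit flags of the DllCharacteristics field in the PE optional header.
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # image opts in to ASLR
IMAGE_DLLCHARACTERISTICS_NX_COMPAT = 0x0100     # image opts in to DEP


def pe_mitigations(data: bytes) -> dict:
    """Report the ASLR and DEP opt-in flags of a PE image given its raw bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    optional_header = e_lfanew + 4 + 20
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+ headers.
    (flags,) = struct.unpack_from("<H", data, optional_header + 70)
    return {
        "aslr": bool(flags & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE),
        "dep": bool(flags & IMAGE_DLLCHARACTERISTICS_NX_COMPAT),
    }
```

It could be called as pe_mitigations(open(path, "rb").read()), where path is whichever binary is being inspected.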
Backdooring PE file
There are two ways to backdoor Portable Executable (PE) files. Before demonstrating both of them separately, it is important to have a sense of what we mean by backdooring a PE file. In simple terms, we want a legitimate Windows executable like the 7zip archiver (used for demonstration) to have our shellcode in it, so that when the 7zip file is executed our shellcode gets executed as well, without the user knowing and without antiviruses detecting any malicious behavior. The program (7zip) should keep working correctly as well. The shellcode we will be using is a stageless msfvenom reverse TCP shell. Follow this link to know the difference between staged and stageless payloads.
Both of the methods described below have the same overall process and goal but different approaches to achieve it. The overall process is as follows:-
- Find an appropriate place in memory for implanting our shellcode, either in code caves or in a newly created section; both methods are demonstrated below.
- Copy the first few opcodes at the beginning of program execution, as they are needed later to restore the flow.
- Replace those instructions with our own opcodes to hijack the execution flow of the application to our desired location in memory.
- Add the shellcode to that memory location, which in this case is a stageless TCP reverse shell.
- Restore the saved registers and the instructions copied earlier to allow the normal execution flow to continue.
Adding a new Section header method
The idea behind this method is to create a new section header in the PE file, add our shellcode in the newly created section and later point the execution flow to that section. The new section header can be created using a tool such as LordPE.
- Open LordPE, go to the section headers and add a new section header (we added .hello) at the bottom.
- Set the virtual size and raw size to 1000 bytes. Note that 1000 is hexadecimal (4096 bytes in decimal).
- Make the section executable, writable and readable, as we have to place our shellcode in it.
- Save the file as original.
Now if we execute the file, it won’t work, because we have added a new section header of 1000h bytes but the section itself is empty.
To make the file work normally as intended, we have to add 1000h bytes at the end of the file: right now the file contains a section header declaring 1000h bytes, but there is no data behind it. We have to fill it with some value, and we are filling it with nulls (00). Use any hex editor to add 1000 hexadecimal bytes at the end of the file as shown below.
We have added null values at the end of the file and renamed it 7zFMAddedSection.exe. Before proceeding further, we have to make sure that our executable 7zFMAddedSection.exe works properly and that our new section has been added with the proper size and permissions. We can do that in Ollydbg by going to the memory section and double-clicking PE headers.
Hijack Execution Flow
We can see that our new section .hello is added with the designated permissions. The next step is to hijack the execution flow of the program to our newly added .hello section. When we execute the program, it should jump to the .hello section, where we will place our shellcode. Firstly, note down the first 5 opcodes, as we will need them later when restoring the execution flow. We copy the starting address of the .hello section (0047E000), open the program in Ollydbg and replace the first opcode at address 004538D8 with a JMP to 0047E000.
Right click -> Copy to executable -> all modifications -> Save file. We saved the file as 7zFMAddedSectionHijacked.exe (File names getting longer and we are just getting started!)
Up until now we have added a new section header and hijacked the execution flow to it. We open the file 7zFMAddedSectionHijacked.exe in Ollydbg. We expect the execution flow to be redirected to our newly added .hello section, which contains null values (remember we added nulls using a hex editor?).
Sweet! We have a long, empty .hello section. The next step is to add our shellcode at the start of this section so it gets triggered when the binary is executed.
As mentioned earlier, we will be using Metasploit’s stageless windows/shell_reverse_tcp shellcode. We are not using any encoding schemes provided by msfvenom, as most if not all of them are already flagged by antiviruses. To add the shellcode, we first need to push the registers onto the stack to save their state, using the PUSHAD and PUSHFD opcodes. At the end of the shellcode, we pop the registers back and restore the execution flow by pasting the initial (pre-hijack) program instructions copied earlier and jumping back, to make sure the functionality of 7zip is not disturbed. Here is the sequence of instructions:
- Restore Execution Flow…
We generate the Windows stageless reverse shellcode using the following arguments in msfvenom
Copy the shellcode and paste the hex in Ollydbg via right click > binary > binary paste; it will get disassembled to assembly code.
Now that we have our reverse TCP shellcode in the .hello section, it’s time to save the changes to the file. Before that, we need to make some modifications to our shellcode.
- At the end of the shellcode we see an opcode CALL EBP, which terminates the execution of the program after the shellcode is executed. We don’t want the program execution to terminate; in fact, we want the program to function normally after the shellcode execution. For this reason, we have to modify the opcode CALL EBP to NOP (no operation).
- Another modification is needed due to the presence of WaitForSingleObject in our shellcode. The WaitForSingleObject function takes a timeout argument in milliseconds and blocks the calling thread until the object it waits on is signaled or the timeout elapses. If the argument is -1, it waits for an infinite amount of time. This simply means that if we execute the binary it will spawn a reverse shell, but the normal functionality of 7zip would halt until we close our reverse shell. This post helps in finding and fixing WaitForSingleObject. We simply need to modify the opcode DEC INC, whose value is -1 (the argument for WaitForSingleObject), to NOP.
- Next, we need to POP the register values off the stack (to restore the pre-shellcode state) using POPFD and POPAD at the end of the shellcode.
- After POPFD and POPAD we need to add the 5 hijacked instructions (copied earlier when hijacking the execution flow) back, to make sure our 7zip program functions normally after shellcode execution.
- We save the modifications as 7zFMAddedSectionHijackedShelled.exe.
We set up a listener on Kali Box and execute the binary 7zFMAddedSectionHijackedShelled.exe. We get a shell. 7zip binary works fine as well with no interruption in functionality.
How are we doing detection wise?
Not so good! That was expected, though, since we added a new writable, executable section to the binary and used a known Metasploit shellcode without any encoding.
Pros of adding a new section header method
- You can create a large section. Large space means you don’t need to worry about room for the shellcode; you can even encode your shellcode a number of times without having to worry about its size. This could help in bypassing antiviruses.
Cons of adding a new section header method
- Adding a new section header and assigning it the execute flag could alert antiviruses. Not a good approach in terms of AV detection rate.
- It will also increase the size of the original file; again, we wouldn’t want to alert the AV or the victim about the change in file size.
- High detection rate.
Keeping in mind the cons of the new section header method, we will next look at two more methods that help us achieve usability and a low detection rate for the backdoor.
Triggering shellcode upon user interaction + Codecaves
What we have achieved so far is to create a new section header, place our shellcode in the new section, hijack the execution flow to our shellcode and then return to the normal functioning of the application. In this section, we will chain together two methods to achieve a low detection rate and to mitigate the shortcomings of the new section header method discussed above. Following are the techniques discussed:-
- How to trigger our shellcode based on user interaction with a specific functionality.
- How to find and use code caves.
Code caves are dead/empty blocks in the memory of a program which can be used to inject our own code. Instead of creating a new section, we can use existing code caves to implant our shellcode. We can find code caves of different sizes in almost any PE. The size of the code cave does matter! We want a code cave larger than our shellcode, so we can inject the shellcode without having to split it into smaller chunks.
The first step is to find a code cave. Cave Miner is a handy Python script for finding code caves: you provide the minimum size of the cave as a parameter and it shows you all the code caves larger than that size.
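The idea behind such a search can be sketched in a few lines of Python. This is a simplified illustration and not how Cave Miner itself is implemented: it only looks for runs of null bytes in a raw byte buffer and reports file offsets rather than virtual addresses. The name find_null_runs is hypothetical.

```python
def find_null_runs(data: bytes, min_size: int) -> list:
    """Return (offset, length) pairs for runs of 0x00 bytes of at least min_size."""
    runs = []
    start = None  # offset where the current run of nulls began, if any
    for offset, value in enumerate(data):
        if value == 0:
            if start is None:
                start = offset
        else:
            if start is not None and offset - start >= min_size:
                runs.append((start, offset - start))
            start = None
    # A run may extend to the very end of the buffer.
    if start is not None and len(data) - start >= min_size:
        runs.append((start, len(data) - start))
    return runs
```

Mapping a file offset to a virtual address additionally requires the section table, which this sketch deliberately leaves out.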
We got two code caves larger than 700 bytes, and both of them contain enough space for our shellcode. Note down the virtual address of both caves. The virtual address is the starting address of the cave. Later, we will hijack the execution flow by jumping to these virtual addresses. We will use both caves later; for now, we only require one cave to implant our shellcode. We can see that the code cave is only readable, so we have to make it writable and executable for it to execute our shellcode. We do that with LordPE.
Triggering Shellcode Upon user interaction
Now that we have a code cave we can jump to, we need to find a way to redirect the execution flow to our shellcode upon user interaction. Unlike in the previous method, we don’t want to hijack the execution flow right after the program is run. We want to let the program run normally and execute the shellcode upon user interaction with a specific functionality, for example clicking a specific tab. To accomplish this we need to find reference strings in the application. We can then hijack the address of a specific reference string by modifying it to jump to the code cave. This means that whenever that string is accessed in memory, the execution flow gets redirected to our code cave. Sounds good? Let’s see how we achieve this.
Open the 7zip program in Ollydbg > right click > search for > all referenced text strings
In reference strings, we found an interesting string, a domain (http://www.7-zip.org). The memory address of this domain gets accessed when a user clicks on about > domain.
Note that we can have multiple user interaction triggers that can be backdoored in a single program using the referenced strings found. For the sake of an example, we are using the domain button on about page which upon click opens the website www.7-zip.org in the browser. Our objective is to trigger shellcode whenever a user clicks on the domain button.
Now we have to add a breakpoint at the address of the domain string, so that we can then modify its opcode to jump to our code cave when a user clicks on the website button. We copy the address of the domain string, 0044A8E5, and add a breakpoint. We then click on the domain button in the 7zip program. The execution stops at the breakpoint, as seen in the below screenshot:-
Now we can modify this address to jump to the code cave, so that when a user clicks on the website button the execution flow jumps to our code cave, where in the next step we will place our shellcode. Firstly, we copy a couple of instructions after the 0044A8E5 address, as they will be needed again when we point the execution flow back after shellcode execution, to ensure the normal functionality of 7zip.
After modification to jmp 00477857 we save the executable as 7zFMUhijacked.exe . Note that the address 00477857 is the starting address of codecave 1.
We load the 7zFMUhijacked.exe in Ollydbg and let it execute normally, we then click on the website button. We are redirected to an empty code cave.
Nice! We have redirected the execution flow to the code cave upon user interaction. To keep this post concise, we will skip the next steps of adding and modifying the shellcode, as they are the same as explained above in “Adding shellcode” and “Modifying shellcode”.
We add the shellcode, modify it, restore the execution flow to where we hijacked it (0044A8E5) and save the file as 7zFMUhijackedShelled.exe. The shellcode used is the stageless Windows reverse TCP shell. We set up a netcat listener, run 7zFMUhijackedShelled.exe and click on the website button.
Everything worked as we expected and we got a shell back! Let’s see how we are doing detection wise.
That’s good! We are down from 16/36 to 3/38, thanks to code caves and triggering the shellcode upon user interaction with a specific functionality. This shows a weakness in the detection mechanism of most antiviruses, as they fail to detect a known msfvenom shellcode without any encoding just because it sits in a code cave and is triggered upon user interaction with a specific functionality. The detection rate of 3/38 is good but not good enough (fully undetectable). Considering the self-imposed restrictions, the only viable route from here seems to be custom encoding of the shellcode and decoding it in memory upon execution.
Custom Encoding Shellcode
Building upon what we previously achieved, executing shellcode from a code cave upon user interaction with a specific program functionality, we now want to encode the shellcode using an XOR encoder. Why XOR? For a couple of reasons: firstly, it is fairly easy to implement; secondly, we don’t have to write a separate decoder for it, because XORing a value twice with the same key gives back the original value. We will encode the shellcode with XOR once and save it on disk. Then we will XOR the encoded value again in memory at runtime to get back the original shellcode. Scan-time static analysis is unlikely to catch it, because the decoding is done in memory!
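The round-trip property that makes a separate decoder unnecessary is easy to check. A minimal Python sketch on harmless placeholder bytes, using the single-byte key 0B that appears later in this post (the helper name xor_bytes is made up for illustration):

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


sample = b"placeholder bytes"            # stand-in data, not real shellcode
encoded = xor_bytes(sample, 0x0B)        # first pass changes every byte
decoded = xor_bytes(encoded, 0x0B)       # second pass with the same key undoes it
print(decoded == sample)
```

The same routine serves as both encoder and decoder, which is why a single stub is enough for both directions.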
We require 2 code caves for this purpose: one for the shellcode and one for the encoder/decoder. In the section on finding code caves above, we found 2 code caves larger than 700 bytes; both of them have enough space for the shellcode and the encoder/decoder. Below is the flow chart diagram of the execution flow.
So, upon the user clicking the domain button, we want to hijack the program execution to CC2 (starting address 0047972e), which contains the XOR encoder/decoder stub opcodes. The stub encodes/decodes the shellcode that resides in CC1 (starting address 00477857). After CC2 execution is complete, we jump to CC1 to start the shellcode, which spawns a shell back. After the shellcode has run, we jump back from CC1 to where we initially hijacked the execution flow (the click on the domain button), to make sure the functionality of the 7zip program remains the same and the victim doesn’t notice any interruptions. Sounds like a long ride, let’s GO!
Note that the steps performed in the last section won’t be repeated, so refer back to it for hijacking execution upon user interaction, adding shellcode in code caves, modifying the shellcode and restoring the execution flow to where we hijacked it.
Firstly, we hijack the execution flow from address 0044A8E5 (clicking the domain button) to the CC2 starting address 0047972e and save the modifications as a file on disk. We run the modified 7zip file in Ollydbg and trigger the hijacking process by clicking on the domain button.
Now that we are in CC2, before writing our XOR encoder here, we will first jump to the starting address of CC1 and implant our shellcode, so that we get the exact addresses that we have to use in the XOR encoder. Note that the first step of hijacking to CC2 can also be performed at the end, as it won’t impact the overall execution flow illustrated in the flowchart above.
We jump to CC1, implant and modify the shellcode, and restore the execution flow to 0044A8E5, from where we hijacked to CC2, to ensure smooth execution of the 7zip program functionality after the shellcode. Note that implanting the shellcode, modifying it and restoring the execution flow are already explained in previous sections.
The above screenshot shows the bottom of the shellcode at CC1. Note down the address 0047799B: this is where the shellcode ends, and the next instructions are for restoring the execution flow. So we have to encode from the start of the shellcode at address 00477859 up to 0047799B.
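As a quick sanity check, the size of the range to encode can be computed from the two addresses. A small Python sketch, assuming the byte at the ending address is itself part of the range (the variable names are ours):

```python
start = 0x00477859   # first byte of the shellcode in CC1
end = 0x0047799B     # last byte of the shellcode in CC1
length = end - start + 1   # inclusive range, so add one

cave_minimum = 700   # both caves found earlier are larger than 700 bytes
print(f"range is {length} bytes ({length:#x})")
print("fits in the cave:", length < cave_minimum)
```

With these figures the range is 323 bytes, comfortably within a cave of more than 700 bytes.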
We move to 0047972e, the starting address of CC2, and write the XOR encoder. Following are the opcodes for the XOR encoder implementation.
- MOV ECX, 00477857 // Load the starting address of the shellcode into ECX.
- XOR BYTE PTR DS:[ECX],0B // Exclusive OR the byte that ECX points to with the key 0B
- INC ECX // Increase ECX to move to the next address
- CMP ECX,0047799B // Compare ECX with the ending address of the shellcode
- JLE SHORT 00479733 // If ECX has not passed the ending address, take a short jump back to the XOR operation
- JMP 7zFM2.00477857 // Otherwise, jump to the start of the shellcode
As we are encoding the opcodes in CC1, we have to make sure the section in which CC1 resides is writable, otherwise Ollydbg will show an access violation error. Refer back to the code caves section to see how to make it writable and executable.
We add a breakpoint at JMP 7zFM2.00477857, after the encoding is performed and we are about to jump back to the encoded shellcode. If we go back to CC1, we will see that our shellcode is encoded now.
All that is left is to save the modifications of both the shellcode at CC1 and the encoder at CC2 to a file, which we named 7zFMowned.exe. Let’s see if it’s working as intended.
How are we doing detection wise?
Great! We have achieved a fully undetectable backdoored PE file that remains functional and keeps the same size.