Payload in PDF

Infected PDFs have long been a favoured way to infect users, because this document format is very common and used by almost everyone. Moreover, there are many ways to exploit Acrobat Reader vulnerabilities, and it is a very stealthy and elegant way to launch malware.

In this article, I will show you how easy it is to craft a malicious PDF with custom shellcode, and trigger a vulnerability to execute a payload. We will also analyse the malicious PDF to learn how the payload is stored, and how to extract it.

This article is for research purposes only; don’t do bad things!


PDF Format

PDF is an object-oriented format defined by Adobe. It describes the organization of a document and preserves the dependencies the document needs (fonts, images, …). These objects are stored within the document as streams, most of the time encoded or compressed. A classic PDF document is laid out as a header, a body of numbered objects, a cross-reference table and a trailer. For more information, please read Adobe’s specification.
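To make the object and stream layout more concrete, here is a minimal Python sketch that walks the raw bytes of a PDF and lists each object, inflating streams that declare the /FlateDecode filter. The function name list_objects is hypothetical, and the regex-based parsing is a simplification: it ignores the cross-reference table, object streams and filter chains, so it is a triage aid under those assumptions rather than a real PDF parser.

```python
import re
import zlib

# Matches "N G obj ... endobj"; DOTALL so an object body may span lines.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
# Matches the raw bytes between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def list_objects(data: bytes):
    """Return a list of (object number, dictionary text, stream bytes or None)."""
    results = []
    for match in OBJ_RE.finditer(data):
        number = int(match.group(1))
        body = match.group(3)
        stream = STREAM_RE.search(body)
        if stream is None:
            # Object without a stream: keep its text as-is.
            results.append((number, body.strip(), None))
            continue
        header = body[: stream.start()].strip()
        raw = stream.group(1)
        if b"/FlateDecode" in header:
            try:
                raw = zlib.decompress(raw)
            except zlib.error:
                pass  # keep the undecoded bytes for manual inspection
        results.append((number, header, raw))
    return results
```

Calling list_objects(open("malicious.pdf", "rb").read()) and printing the first field of each tuple gives a quick inventory of the objects before opening the file in a dedicated tool.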



Infected PDF creation

We will create a fake PDF with Metasploit, containing an exploit attempt as well as a custom payload (code to execute). The exploit targets a specific version of Adobe Reader, so we will need to do some archaeology and find an ancient Reader version to install on the target machine.

So, first, let’s make this PDF. We will make an infected PDF that just opens the calculator (calc.exe) on the machine, just for demonstration. Open a Metasploit console (installation of Metasploit is not covered in this article) and type:

It should look like this:


Copy the file that has just been created (here /home/osboxes/.msf4/local/malicious.pdf) to a shared drive. You will need it to feed your target machine.


Infected PDF execution

On the target machine, download and install a vulnerable Adobe Reader version (Metasploit tells us it should be lower than 8.1.2). I chose to install version 8.1.1.

Once installed, execute the malicious.pdf file. You should see a calculator being spawned from the Adobe Reader process. That’s the exploit.


I’ve made another PDF but changed the payload slightly, just for fun:

Here’s the result. Adobe Reader now hosts a backdoor (reverse shell) that connects back to us and waits for commands.



Infected PDF analysis

Enough playing! Let’s see what’s inside that malicious PDF, and let’s try to extract the malicious payload (we’re still working with the calc.exe PDF).

First, we will need a tool called PDF Stream Dumper, so download it. Load the malicious PDF with it, and take some time to familiarize yourself with the tool.


We can start by checking if some exploit is detected by the tool using the “Exploit Scan” menu:

Indeed, there’s an exploit hidden in stream 6 (the one in blue on the capture).
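For readers without the tool at hand, the idea behind such a scan can be approximated with a keyword count over the raw file. The sketch below is an assumption about how a scan of this kind might work, not PDF Stream Dumper’s actual implementation; the keyword list and the function name scan_keywords are chosen for illustration only.

```python
import re

# Names often associated with automatic actions or scripting in PDFs,
# plus the JavaScript call reported by the scan in this article.
SUSPICIOUS = [b"/OpenAction", b"/AA", b"/JavaScript", b"/JS", b"/Launch", b"util.printf"]


def scan_keywords(data: bytes) -> dict:
    """Count occurrences of each suspicious keyword in raw PDF bytes."""
    counts = {}
    for name in SUSPICIOUS:
        # The lookahead stops a short keyword from matching a longer name
        # that merely starts with it.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

Keywords hidden inside compressed streams, or written with hex-escaped name characters, will not be seen by this naive count, so the streams should be decoded first.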

But let’s start at the beginning: when searching for exploits in a PDF, most of the time we encounter a heap spray created by JavaScript code. The heap spray pushes the payload onto the heap, ready to be executed once the vulnerability has been triggered.

If you open Stream 1, you can see:

We can read this as an OpenAction pointing to stream 5. Let’s move to stream 5:
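The same chain of references can be followed programmatically. The sketch below assumes the document uses indirect references of the form "/OpenAction N G R" and "/JS N G R" in uncompressed objects, which matches the layout described here; find_js_object is a hypothetical helper name, and inline action dictionaries are not handled.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+\d+\s+obj\b(.*?)endobj", re.DOTALL)
OPEN_ACTION_RE = re.compile(rb"/OpenAction\s+(\d+)\s+\d+\s+R")
JS_RE = re.compile(rb"/JS\s+(\d+)\s+\d+\s+R")


def find_js_object(data: bytes):
    """Return (action object number, JavaScript object number or None),
    or None when no indirect /OpenAction reference is found."""
    objects = {int(m.group(1)): m.group(2) for m in OBJ_RE.finditer(data)}
    for body in objects.values():
        action = OPEN_ACTION_RE.search(body)
        if action is None:
            continue
        action_num = int(action.group(1))
        # Look inside the referenced action object for the /JS reference.
        js = JS_RE.search(objects.get(action_num, b""))
        return action_num, (int(js.group(1)) if js else None)
    return None
```

On a file organized like the one in this article, the result would name the action object first and the object holding the JavaScript second.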

This tells the reader to execute the JavaScript located in stream 6. That stream contains plain JavaScript, so it’s time to open the “Javascript_UI” menu. We immediately recognize a big hex-encoded string, pushed into a variable for the heap spray. This is our payload:
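The encoded string can also be turned back into raw bytes outside the tool. The sketch below assumes the payload is written in the %uXXXX form consumed by JavaScript’s unescape(), which is common in heap-spray code; if the string in your sample is plain hex instead, bytes.fromhex() is enough. The function name decode_unescape is hypothetical.

```python
import re
import struct

ESCAPE_RE = re.compile(r"%u([0-9A-Fa-f]{4})|%([0-9A-Fa-f]{2})")


def decode_unescape(text: str) -> bytes:
    """Convert %uXXXX / %XX escapes into the bytes they occupy in memory.

    Each %uXXXX is a 16-bit code unit, stored little-endian on x86.
    Characters outside of escapes are ignored.
    """
    out = bytearray()
    for match in ESCAPE_RE.finditer(text):
        if match.group(1):
            out += struct.pack("<H", int(match.group(1), 16))
        else:
            out.append(int(match.group(2), 16))
    return bytes(out)
```

For example, decode_unescape("%u9090%uc0db") yields the four bytes 90 90 db c0, which can then be written to a file and handed to a disassembler or emulator.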


Fortunately, we have tools to manipulate it and understand what it does. Select the payload (the part between quotes), and open the “Shellcode_analysis” menu. Then choose “scDbg – LibEmu Emulation”. You will get a new window with the shellcode decoded into bytes (you can even save it to a file):


LibEmu is a library that emulates a processor and reports what the assembly code is trying to do. Just hit the “Launch” button and you will understand:


Here it is: we can clearly see that the shellcode just opens a calc.exe window and exits.
Let’s redo the same analysis for the other malicious PDF (reverse shell):


Self-explanatory, right? The shellcode loads the library needed to manipulate sockets (ws2_32.dll) and tries to connect back to the C&C.
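When no emulator is available, a quick look at the printable strings in the decoded shellcode can already hint at its purpose. This is a minimal sketch of a "strings"-style pass, with a hypothetical function name; it is only a first triage step, since shellcode often builds its strings piece by piece on the stack or encodes them, in which case emulation remains the reliable approach.

```python
import re


def printable_strings(blob: bytes, min_len: int = 4):
    """Return runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, blob)]
```

Library or program names showing up in the output are a good reason to look more closely at what the code does with them.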

I haven’t discussed the exploit itself yet: it is located at the end of the JavaScript code (as reported by the Exploit Scan, “util.printf – found in stream: 6”). It exploits a buffer overflow in the util.printf function to execute arbitrary code (here, our heap-sprayed shellcode).
