In the world of cybersecurity, new vulnerabilities are discovered almost every day, but some are more insidious than others. One such vulnerability, which has quietly caused havoc in countless applications, is the threat posed by deserialization attacks. Deserialization—the process of converting a data format into a usable object—seems like an innocent operation at first glance, but it hides a critical weakness.
For instance, in 2016, two security researchers, Alvaro Muñoz and Oleksandr Mirosh, uncovered how an insecure deserialization flaw in an enterprise web application allowed an attacker to execute arbitrary code on the server. The result? Full access to sensitive user data through remote code execution. This vulnerability was not caused by a flaw in the application’s core logic, but rather by how it handled serialized data: data that was meant to be harmless but was ultimately exploited to bypass security mechanisms.
What are Serialization and Deserialization?
Serialization is the process of converting an object, data structure, or state into a format that can be stored (e.g. in a file or database) or transmitted (e.g. over a network) and later reconstructed back into its original form. It enables data to be shared between systems, applications, or processes, even if they run on different platforms.
It converts an in-memory object, such as a dictionary or a Java class instance, into a format that can be easily stored or transmitted, such as JSON, XML, or binary data. This process involves transforming the object’s state into a linear representation that can be saved to a file or sent over a network.
On the other hand, deserialization is the reverse process, where the serialized data is read and converted back into the original object, reconstructing its structure and state. This two-way process allows for data persistence, communication, and sharing across different systems or applications.
The primary difference is that after deserialization, you are working with an actual object, while serialized data is just text or a byte stream.
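To make the distinction concrete, here is a minimal sketch in Python using the standard json module. The user dictionary is only an illustrative value; the point is the type of each variable before and after the round trip.

```python
import json

user = {"username": "john_doe", "age": 30}

# Serialization: the in-memory dict becomes plain text.
serialized = json.dumps(user)

# Deserialization: the text becomes a dict again.
restored = json.loads(serialized)

print(type(serialized).__name__)  # str
print(type(restored).__name__)    # dict
print(restored == user)           # True
```

The serialized value can be written to disk or sent over a socket, but only the restored value supports dictionary operations such as restored["age"].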
Why Does Serialization Matter?
Serialization and deserialization are fundamental concepts for converting objects to byte streams and vice versa. These operations are crucial for tasks such as saving object states to files, transferring objects over networks, or persisting data across different program executions.
One of the remarkable features of object serialization mechanisms is platform independence. For example, when you serialize an object in Java, the resulting byte stream can be read and deserialized on any platform, regardless of the operating system or architecture. This is a direct result of Java’s “write once, run anywhere” philosophy, which makes serialized objects highly portable.
If an object is serialized on a Windows machine, it can be deserialized on a Linux machine, a macOS system, or even a different architecture, such as ARM or x86. As long as the Java version is compatible and the class definitions remain consistent between the two platforms, the deserialization process will work seamlessly.
This ability to transfer serialized data between different platforms makes Java serialization an essential tool for distributed systems, such as client-server applications, web services, and cloud computing.
Technical Demonstration for Better Understanding
Imagine an e-commerce application that allows users to create an account on the application. The user details need to be stored temporarily and shared between different servers in a distributed system.
Consider the following user object in Python:
user = {
    'username': 'john_doe',
    'age': 30,
    'email': '[email protected]'
}
Serialization to XML: You can use libraries like xml.etree.ElementTree in Python to serialize the object to XML:
import xml.etree.ElementTree as ET
# Create XML from the data
user = ET.Element('user')
ET.SubElement(user, 'username').text = 'john_doe'
ET.SubElement(user, 'age').text = '30'
ET.SubElement(user, 'email').text = '[email protected]'
# Serialize to XML string
serialized_data = ET.tostring(user, encoding='unicode')
print("Serialized Data (XML Format):")
print(serialized_data)
Serialized Data (XML Format):
<user><username>john_doe</username><age>30</age><email>[email protected]</email></user>
Deserialization of XML Format:
# Deserialize XML data back into a Python object (dictionary)
root = ET.fromstring(serialized_data)
deserialized_data = {
    'username': root.find('username').text,
    'age': int(root.find('age').text),
    'email': root.find('email').text
}
print("\nDeserialized Data (Python Dictionary):")
print(deserialized_data)
Deserialized Data:
After deserialization, the data becomes a Python dictionary again:
{
    'username': 'john_doe',
    'age': 30,
    'email': '[email protected]'
}
Data Flow in Serialization and Deserialization
Taking PHP as an example, the functions serialize() and unserialize() and the magic methods __construct() and __destruct() often work together in scenarios involving object serialization and deserialization.
- __construct() method: When you create an object of a class, the __construct() method is called automatically. This method is typically used to initialize the object’s properties or perform setup tasks.
- serialize() function: The serialize() function converts an object into a storable representation, typically a string. This string can be saved to a file or database, or sent over a network.
- unserialize() function: The unserialize() function takes the serialized string and reconstructs the object. The class’s __construct() method is not executed during this process; if the class defines __wakeup(), PHP calls that method after reconstructing the object.
- __destruct() method: The __destruct() method is called when the object is destroyed (typically at the end of a script). It is often used for cleanup operations, such as closing database connections or releasing resources.
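The same lifecycle can be sketched in Python, where __init__ roughly plays the role of __construct() and __del__ the role of __destruct(). The Session class and the to_string() and from_string() helpers are hypothetical names invented for this illustration, and the immediate destructor call relies on CPython's reference counting.

```python
import json

events = []  # records which lifecycle hooks ran


class Session:
    def __init__(self, username):
        events.append("construct")
        self.username = username

    def __del__(self):
        events.append("destruct")


def to_string(obj):
    """Serialize only the object's state, not its behaviour."""
    return json.dumps(obj.__dict__)


def from_string(cls, text):
    """Rebuild an object from its state without running the constructor."""
    obj = cls.__new__(cls)  # allocate without calling __init__
    obj.__dict__.update(json.loads(text))
    return obj


original = Session("john_doe")
wire = to_string(original)
restored = from_string(Session, wire)
events_after_restore = list(events)

del restored  # last reference gone, so CPython runs __del__ right away
events_after_delete = list(events)

print(events_after_restore)  # ['construct']
print(events_after_delete)   # ['construct', 'destruct'] on CPython
```

Only one "construct" event is recorded even though two objects existed, yet the destructor still runs for the restored object. That asymmetry is what makes destructors attractive targets later in this article.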
What is Insecure Deserialization?
Insecure deserialization refers to the improper handling of serialized data that is untrusted or user-controlled. It occurs when an application deserializes (converts data back into objects or executable code) without validating or sanitizing the data adequately, allowing attackers to manipulate or inject malicious code into the deserialization process. This opens the door for a wide range of attacks, including remote code execution (RCE), privilege escalation, and data manipulation.
It is even possible to substitute a serialized object with one from a completely different class. Worryingly, any object from a class accessible to the application can be deserialized and instantiated, regardless of the class originally expected. This is why insecure deserialization is often referred to as an object injection vulnerability.
1) Privilege Escalation Using Insecure Deserialization
Consider a web application using PHP’s serialize() and unserialize() functions to store user data.
Step 1: Identification
The first step in exploiting insecure deserialization is to identify whether serialized data is being used. This can often be found in cookies, hidden form fields, or HTTP parameters. For example, you might notice data in a PHP serialization format, such as a:2:{s:8:"username";s:8:"john_doe";s:8:"userRole";s:5:"admin";}, or other encoded strings like JSON or Base64. Using tools like Burp Suite or OWASP ZAP, you can capture and inspect HTTP traffic for serialized data patterns.
Breaking It Down:
- a:2: Here a stands for array, and 2 indicates that the array contains two elements.
- s:8:"username"; Here s stands for string, and 8 is the length of the string "username", which is a key in this array.
Similarly, there are other elements in the array.
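The format described above can be reproduced with a few lines of Python. This is only a sketch covering arrays of strings; the helper names php_string() and php_string_array() are made up for this example and are not part of any library.

```python
def php_string(value: str) -> str:
    """Encode one string as s:<byte length>:"<value>";"""
    return 's:%d:"%s";' % (len(value.encode("utf-8")), value)


def php_string_array(items: dict) -> str:
    """Encode a dict of string keys and string values as a:<count>:{...}"""
    body = "".join(php_string(k) + php_string(v) for k, v in items.items())
    return "a:%d:{%s}" % (len(items), body)


encoded = php_string_array({"username": "john_doe", "userRole": "admin"})
print(encoded)
# a:2:{s:8:"username";s:8:"john_doe";s:8:"userRole";s:5:"admin";}
```

The length is counted in bytes rather than characters, which matters as soon as a value contains non-ASCII text.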
Step 2: Manipulation
After identifying serialized data, the next step is to manipulate it for exploitation. This involves decoding the data, making changes, and sending it back to the server. For instance, in a PHP serialized string like a:1:{s:8:"userRole";s:5:"admin";}, an attacker could change the userRole to superadmin by editing the string. The length prefix must be updated to match the new value (s:10:"superadmin"); otherwise unserialize() rejects the string. In cases where object serialization is used, attackers might inject malicious objects to trigger unintended behaviors, such as code execution. Tools like browser developer consoles for cookies or intercepting proxies like Burp Suite can assist in modifying and resending serialized data.
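The detail that trips up manual edits is the length prefix. As a rough sketch, the following Python helper rewrites one string value and recomputes its length; set_string_value() is a hypothetical name, and the regular expression only handles simple values without embedded double quotes.

```python
import re


def set_string_value(serialized: str, key: str, new_value: str) -> str:
    """Replace the string value stored under `key` and fix its length prefix."""
    key_token = 's:%d:"%s";' % (len(key.encode("utf-8")), key)
    value_token = 's:%d:"%s";' % (len(new_value.encode("utf-8")), new_value)
    pattern = re.compile(re.escape(key_token) + r's:\d+:"[^"]*";')
    return pattern.sub(lambda match: key_token + value_token, serialized, count=1)


original = 'a:1:{s:8:"userRole";s:5:"admin";}'
modified = set_string_value(original, "userRole", "superadmin")
print(modified)
# a:1:{s:8:"userRole";s:10:"superadmin";}
```

A server that trusts this cookie has no way to tell the modified string from one it issued itself, which is why the integrity checks discussed later matter.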
2) Inserting an Object of Another Class: Object Injection
Object injection is a critical technique used in insecure deserialization attacks, where an attacker exploits the deserialization process to make the application instantiate an object of a class the attacker chooses. Deserialization vulnerabilities arise because the application does not validate the type or integrity of the serialized data before converting it back into an object.
Attackers begin by identifying serialized data formats, often found in cookies, hidden form fields, or HTTP parameters. By inspecting serialized data, such as a PHP string like O:8:"UserInfo":2:{s:8:"username";s:8:"john_doe";s:5:"email";s:15:"j[email protected]";}, they can learn which class and properties the application expects.
Here O:8:"UserInfo":2 stands for an object of the class UserInfo, where 8 is the length of the class name UserInfo and 2 is the number of properties (key-value pairs) the object contains.
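The object header can be picked apart mechanically. The sketch below is a simplified reading of the format, with parse_object_header() as a hypothetical helper; it extracts the class name and property count and checks that the declared length matches the class name.

```python
import re

OBJECT_HEADER = re.compile(r'O:(\d+):"([^"]+)":(\d+):\{')


def parse_object_header(serialized: str):
    """Return class name, property count, and whether the length prefix is consistent."""
    match = OBJECT_HEADER.match(serialized)
    if match is None:
        return None
    declared_length, class_name, property_count = match.groups()
    return {
        "class": class_name,
        "properties": int(property_count),
        "length_matches": int(declared_length) == len(class_name.encode("utf-8")),
    }


info = parse_object_header('O:8:"UserInfo":1:{s:8:"username";s:8:"john_doe";}')
print(info)
# {'class': 'UserInfo', 'properties': 1, 'length_matches': True}
```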
Getting access to the Classes used in Application:
To execute a successful object injection, attackers must understand the available classes within the application. This is often achieved through several techniques, such as analyzing open-source code of commonly used frameworks like Laravel, Symfony, or Spring, which reveal class definitions and methods. Additionally, verbose error messages during testing can inadvertently disclose class paths, method signatures, or stack traces, which further assist in constructing a payload.
Crafting a malicious payload:
Suppose the application expects an object of the class User:
class User {
    public $username;
    public function __construct($username) {
        $this->username = $username;
    }
}
Inserting a malicious object is a key step in exploiting insecure deserialization vulnerabilities through object injection. The process begins with the attacker identifying a class within the application that has methods capable of executing unintended actions, such as command execution or file manipulation.
In PHP, magic methods like __wakeup() and __destruct() are commonly targeted because they run automatically: __wakeup() is invoked during deserialization, and __destruct() is invoked when the deserialized object is destroyed.
To craft the object, the attacker modifies the serialized data to represent an object of the chosen class and places the payload in one of its properties. The attacker cannot add new classes to the application; the class must already be defined or autoloadable. Suppose the application contains a class such as:
class Deletefiles {
    public $payload;
    public function __destruct() {
        system($this->payload); // Executes OS commands
    }
}
A serialized string like O:11:"Deletefiles":1:{s:7:"payload";s:15:"rm -rf /var/www";} can represent an object of the Deletefiles class with a payload property designed to delete critical files when executed.
Execution:
Once the serialized object is crafted, it is inserted into the application’s data flow, replacing legitimate serialized data, and sent to the server. If the server blindly deserializes this data without proper validation, the malicious object is instantiated, and the attacker’s payload is executed. This step is critical in converting deserialization vulnerabilities into active exploits, highlighting the danger of insecure deserialization in applications.
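The flow can be simulated safely in Python. Everything in this sketch is hypothetical: the class registry, the JSON-based wire format, and the cleanup() hook, which stands in for __destruct(). Instead of calling the operating system, the "gadget" class only records the command it would have run.

```python
import json

executed_commands = []  # stand-in for system(); nothing is actually run


class User:
    def cleanup(self):
        pass  # harmless


class DeleteFiles:
    def cleanup(self):
        # Plays the role of __destruct() calling system($this->payload)
        executed_commands.append(self.payload)


KNOWN_CLASSES = {"User": User, "DeleteFiles": DeleteFiles}


def naive_deserialize(text):
    """Instantiate whatever class the input names: the insecure pattern."""
    data = json.loads(text)
    cls = KNOWN_CLASSES[data["class"]]  # attacker-chosen, no allow-list
    obj = cls.__new__(cls)
    obj.__dict__.update(data["properties"])
    return obj


def handle_request(cookie):
    obj = naive_deserialize(cookie)  # the application expects a User here
    obj.cleanup()                    # the automatic hook runs on whatever arrived


handle_request('{"class": "User", "properties": {"username": "john_doe"}}')
handle_request('{"class": "DeleteFiles", "properties": {"payload": "rm -rf /var/www"}}')
print(executed_commands)  # ['rm -rf /var/www']
```

The application code never mentions DeleteFiles, yet its hook runs because the input, not the code, chose the class.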
3) Exploiting Insecure Deserialization with Remote Code Execution
Insecure deserialization can lead to one of the most severe exploits in application security, i.e., Remote Code Execution (RCE). Consider a vulnerable PHP application that processes user-provided serialized data without validation. The application includes a class called Commandexec:
<?php
class Commandexec {
    public $command;
    public function __construct($cmd) {
        $this->command = $cmd;
    }
    public function __destruct() {
        system($this->command); // Executes the command on the server
    }
}
if (isset($_POST['data'])) {
    $obj = unserialize($_POST['data']); // Untrusted input is deserialized directly
}
?>
Understanding the Vulnerability:
The application unserializes user-supplied data ($_POST['data']) without validating it. The Commandexec class has a __destruct() method that automatically executes the command stored in the $command property when the object is destroyed.
Crafting the Malicious Payload:
The attacker creates a serialized object of the Commandexec class and sets the $command property to a malicious command, such as fetching the server’s /etc/passwd file:
O:11:"Commandexec":1:{s:7:"command";s:15:"cat /etc/passwd";}
Payload Execution:
The application unserializes the data and instantiates a Commandexec object with the command property set to cat /etc/passwd. When the script execution ends, the __destruct() method is called automatically, executing the system() function with the command cat /etc/passwd. The server responds with the contents of /etc/passwd, leaking sensitive information.
Mitigating the Risk of Deserialization Attacks:
Deserialization attacks are a significant security risk in many applications, but there are several effective strategies for mitigating this risk. By following best practices in input validation, secure coding techniques, and regular security audits, developers can greatly reduce the chances of deserialization vulnerabilities being exploited.
1) Using JSON or XML to avoid object injection and code execution:
JSON (JavaScript Object Notation) is a text-based format that doesn’t map directly to executable objects. It’s simply a structured data format consisting of key-value pairs or arrays, and unlike PHP or Java native serialization it carries no class names that tell the parser which object to instantiate.
Safe JSON Example: Here’s an example of a simple object in JSON format:
{
    "username": "admin",
    "email": "[email protected]"
}
When this JSON is deserialized into an object, it’s treated as simple data (a dictionary with username and email), not as an object with methods or code that can be executed. Even if this JSON is manipulated, it doesn’t contain any means to execute arbitrary code.
If an attacker tries to modify this JSON data, they would just change the values, but they wouldn’t be able to inject malicious executable objects.
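A short Python sketch shows the effect. With default settings (no custom hooks passed), json.loads() only produces dictionaries, lists, strings, numbers, booleans, and None, so a payload that names a class is still just text. The "__class__" key below is an arbitrary field an attacker might try and has no special meaning to the parser.

```python
import json

payload = '{"username": "admin", "__class__": "Commandexec", "command": "cat /etc/passwd"}'
decoded = json.loads(payload)

print(type(decoded).__name__)             # dict
print(type(decoded["command"]).__name__)  # str
```

The values still need validation before use, but parsing them cannot instantiate a class or run a method on its own.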
2) Implementing Input Validation:
Ensure that any input received by your application is validated before it undergoes deserialization. Strict input validation can prevent malicious data from being processed. This includes validating both the format and the type of data. For instance, only allow serialized objects from known, trusted sources, and reject any unexpected or malformed data.
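Additionally, you can restrict deserialization to the classes you expect. In PHP 7.0 and later, for example, unserialize() accepts an allowed_classes option, so unserialize($data, ['allowed_classes' => ['User']]) instantiates only User objects and turns any other class into an inert __PHP_Incomplete_Class. This restricts object injection attacks and prevents the deserialization of untrusted classes.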
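As a rough pre-check before data reaches the deserializer, the class names referenced in a PHP serialized string can be compared against an allow-list. The sketch below is an assumption-laden illustration: ALLOWED_CLASSES and the helper names are hypothetical, and the pattern is deliberately conservative, so a string value that merely looks like an object token is also rejected.

```python
import re

CLASS_TOKEN = re.compile(r'[OC]:\d+:"([^"]+)"')
ALLOWED_CLASSES = {"User"}  # hypothetical allow-list for this application


def referenced_classes(serialized: str) -> set:
    """Collect every class name that appears in an object token."""
    return set(CLASS_TOKEN.findall(serialized))


def is_acceptable(serialized: str) -> bool:
    """Accept the input only if every referenced class is on the allow-list."""
    return referenced_classes(serialized) <= ALLOWED_CLASSES


print(is_acceptable('O:4:"User":1:{s:8:"username";s:8:"john_doe";}'))              # True
print(is_acceptable('O:11:"Deletefiles":1:{s:7:"payload";s:15:"rm -rf /var/www";}'))  # False
```

A check like this complements, rather than replaces, restrictions enforced by the deserializer itself.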
3) Use of Digital Signatures to prevent data manipulation:
Another strategy to protect against deserialization attacks is to use digital signatures or keyed hashes (HMACs) to verify the integrity of the serialized data. By attaching a signature to the serialized object, you can ensure that the data has not been tampered with in transit. A plain, unkeyed hash is not sufficient, because an attacker who modifies the data can simply recompute it. If the integrity check fails, the deserialization process should be aborted.
For example, in PHP, you can compute an HMAC of the serialized data with a server-side secret and compare it before deserializing:
// $secretKey never leaves the server; $serializedData and $receivedMac arrive with the request
$expectedMac = hash_hmac('sha256', $serializedData, $secretKey);
if (hash_equals($expectedMac, $receivedMac)) {
    $object = unserialize($serializedData);
} else {
    die("Data integrity check failed.");
}
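The same sign-then-verify pattern can be sketched in Python with the standard hmac module. SECRET_KEY is a placeholder value and sign() and verify() are hypothetical helper names; in practice the key would come from server-side configuration.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # placeholder; keep the real key server-side


def sign(serialized: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the serialized data."""
    return hmac.new(SECRET_KEY, serialized, hashlib.sha256).hexdigest()


def verify(serialized: bytes, signature: str) -> bool:
    """Check the tag in constant time before the data is deserialized."""
    return hmac.compare_digest(sign(serialized), signature)


data = b'a:1:{s:8:"userRole";s:4:"user";}'
tag = sign(data)
tampered = b'a:1:{s:8:"userRole";s:5:"admin";}'

print(verify(data, tag))      # True
print(verify(tampered, tag))  # False
```

Because the attacker does not know the key, changing the role also invalidates the tag, and the server can refuse to deserialize.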
4) Web Application Firewalls to Prevent Deserialization Attacks:
Advanced WAFs can employ anomaly detection to identify unusual or unexpected behavior in serialized data, such as large object graphs or unexpected class names. These anomalies may indicate that an attacker is trying to exploit deserialization vulnerabilities. WAFs can also block HTTP requests that include malicious serialized data, particularly those with suspicious HTTP headers, request bodies, or URL parameters that could trigger insecure deserialization. For example, the WAF could intercept an HTTP request that contains a serialized object with an unexpected class.
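A very small version of such a rule can be sketched in Python. It flags values that contain a PHP object token or that Base64-decode to the Java serialization stream header (the bytes AC ED 00 05). The function name and the choice of signatures are assumptions for illustration; a production WAF rule set would be considerably broader.

```python
import base64
import re

PHP_OBJECT = re.compile(r'(^|[;{])[OC]:\d+:"')
JAVA_MAGIC = b"\xac\xed\x00\x05"  # header of a Java serialization stream


def looks_like_serialized_object(value: str) -> bool:
    """Heuristically flag parameters that carry serialized objects."""
    if PHP_OBJECT.search(value):
        return True
    try:
        decoded = base64.b64decode(value, validate=True)
    except ValueError:  # not valid Base64 (binascii.Error is a ValueError)
        return False
    return decoded.startswith(JAVA_MAGIC)


java_like = base64.b64encode(JAVA_MAGIC + b"sr").decode("ascii")

print(looks_like_serialized_object('O:11:"Deletefiles":1:{}'))           # True
print(looks_like_serialized_object(java_like))                           # True
print(looks_like_serialized_object('a:1:{s:8:"userRole";s:4:"user";}'))  # False
print(looks_like_serialized_object("john_doe"))                          # False
```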
5) Disable Dangerous Methods to Prevent Malicious Code Execution:
Many programming languages, including PHP, provide magic methods such as __destruct(), __wakeup(), or __call(), which can execute code when an object is destroyed, restored, or used. These methods can be exploited by attackers if insecure deserialization is possible. To mitigate this, avoid or restrict the use of these dangerous methods in classes that may be instantiated from user-supplied data. Here is an example of a safe class that does not have a __destruct() method and whose __wakeup() method executes nothing:
class SafeClass {
    public $username;
    public $userRole;
    // Keep __wakeup() empty so nothing runs during deserialization
    public function __wakeup() {
        // No code executed here, so there is nothing for an injected object to trigger
    }
}
$serializedObject = 'O:9:"SafeClass":2:{s:8:"username";s:8:"john_doe";s:8:"userRole";s:5:"admin";}';
// Deserialization is now safer
$object = unserialize($serializedObject);
By contrast, the following class has a __destruct() method that runs code on attacker-controlled data when the script ends:
class UnsafeClass {
    public $username;
    public $userRole;
    // Dangerous __destruct() method that can be exploited
    public function __destruct() {
        // Attackers control $username, so they influence what is written when the object is destroyed
        file_put_contents('log.txt', 'Object with username ' . $this->username . ' destroyed at ' . date('Y-m-d H:i:s') . PHP_EOL, FILE_APPEND);
    }
}
// Serialized object containing user data
$serializedObject = 'O:11:"UnsafeClass":2:{s:8:"username";s:8:"john_doe";s:8:"userRole";s:5:"admin";}';
// Deserialization
$object = unserialize($serializedObject);
6) Mitigation Strategies in PHP and Java for Insecure Deserialization:
PHP: Unsafe deserialization example
$user_data = unserialize($user_input); // Dangerous if $user_input comes from an untrusted source
// Mitigated version: validate the input before deserialization
// (isValidUserData() stands for an application-specific check)
if (isValidUserData($user_input)) {
    $user_data = unserialize($user_input);
} else {
    // Handle error or reject the input
}
Use json_decode() Instead of unserialize() in PHP:
Instead of unserialize(), which allows for class instantiation and can lead to object injection vulnerabilities, use json_decode() for deserializing data that comes from external sources.
For example,
// Safer deserialization example with JSON
$user_data = json_decode($user_input, true); // true converts JSON objects into PHP associative arrays
JSON serialization is typically safer because it does not involve executing class constructors or deserialization of objects. JSON serialization only converts simple data structures like arrays and objects, and it avoids the execution of any class methods or code during deserialization.
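Decoding JSON safely is only half of the job; the decoded structure should still be checked against what the application expects. The following Python sketch assumes a hypothetical two-field schema (EXPECTED_FIELDS) and a hypothetical parse_user() helper that rejects anything else.

```python
import json

EXPECTED_FIELDS = {"username": str, "age": int}  # hypothetical schema


def parse_user(raw: str) -> dict:
    """Decode JSON and accept it only if it matches the expected fields and types."""
    data = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(data, dict) or set(data) != set(EXPECTED_FIELDS):
        raise ValueError("unexpected structure")
    for field, expected_type in EXPECTED_FIELDS.items():
        if type(data[field]) is not expected_type:
            raise ValueError("wrong type for %s" % field)
    return data


print(parse_user('{"username": "john_doe", "age": 30}'))
# {'username': 'john_doe', 'age': 30}

try:
    parse_user('{"username": "john_doe", "age": 30, "userRole": "admin"}')
except ValueError as error:
    print("rejected:", error)  # rejected: unexpected structure
```

Rejecting unknown fields outright prevents an attacker from smuggling in extra keys, such as a role, that later code might trust.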
Java:
To reduce the risk of deserialization attacks in Java, one effective approach is to limit class instantiation during deserialization. By doing so, you can ensure that only trusted and authorized classes are instantiated from the deserialized data, preventing malicious objects from being deserialized.
Custom ObjectInputStream Class: The code defines a subclass of ObjectInputStream, which is the standard class in Java used to deserialize objects from a stream of bytes. By extending ObjectInputStream, we can customize the deserialization behavior to control which classes can be deserialized.
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;

public class SecureObjectInputStream extends ObjectInputStream {
    public SecureObjectInputStream(InputStream in) throws IOException {
        super(in); // Pass the InputStream to the parent class constructor
    }
}
Here, the constructor takes an InputStream as an argument and calls the parent class (ObjectInputStream) constructor to initialize the stream. ObjectInputStream reads the serialized data from the input stream and transforms it into Java objects.
The critical part of the mitigation is the resolveClass() method, which is overridden to check the class name of the object being deserialized. The default behavior of resolveClass() is to automatically load and return the class corresponding to the stream data. However, by overriding it, you can implement custom logic to restrict the classes that can be deserialized.
// Placed inside SecureObjectInputStream; requires import java.io.ObjectStreamClass
@Override
protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
    // Only allow deserialization of trusted classes
    if (desc.getName().equals("com.example.SafeClass")) {
        return super.resolveClass(desc); // Call the default behavior if the class is safe
    } else {
        throw new ClassNotFoundException("Unauthorized class deserialization attempt");
    }
}
The resolveClass() method takes an ObjectStreamClass object (desc) that represents the class metadata of the object being deserialized. The desc.getName() method returns the fully qualified name of the class being deserialized. The if statement checks if the class name is "com.example.SafeClass", a trusted class. If the class is trusted, super.resolveClass(desc) is called, allowing the object to be deserialized. If the class is not trusted, a ClassNotFoundException is thrown, rejecting the deserialization of that class. This prevents potentially malicious or unauthorized classes from being deserialized and instantiated.
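The same allow-list idea carries over to other languages. In Python, for instance, pickle.Unpickler exposes a find_class() hook that plays a role similar to resolveClass(). The sketch below is an analogue rather than a translation of the Java code: the ALLOWED set and the restricted_loads() helper are assumptions made for this example.

```python
import collections
import datetime
import io
import pickle

ALLOWED = {("collections", "OrderedDict")}  # hypothetical allow-list


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a class or function by name
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError("Unauthorized class: %s.%s" % (module, name))


def restricted_loads(data: bytes):
    """Deserialize pickle data while refusing classes outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed_result = restricted_loads(pickle.dumps(collections.OrderedDict(a=1)))

try:
    restricted_loads(pickle.dumps(datetime.date(2020, 1, 1)))
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(allowed_result)  # OrderedDict([('a', 1)]) or OrderedDict({'a': 1}), depending on the Python version
print(blocked)         # True
```

As with the Java version, the decision is made before the class is loaded, so a rejected class never gets the chance to run any of its code.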
Conclusion
Insecure deserialization is a critical security vulnerability that can lead to severe consequences, including remote code execution, privilege escalation, and data manipulation. As we’ve explored, the risks associated with deserialization can be mitigated using several strategies. These include validating and sanitizing serialized data, verifying its integrity with signatures, restricting the use of dangerous methods like __destruct() and __wakeup(), and implementing mechanisms to control which classes can be deserialized (such as with the resolveClass() method in Java). While deserialization offers valuable functionality for object persistence and transfer, its misuse or mishandling can lead to devastating breaches. By adopting a proactive approach to securing deserialization processes, organizations can safeguard their applications and protect sensitive data from malicious actors.