Pickles, Shorts and Jokers: A study on Java deserialization

By Bernardo Melo

Introduction

When asked by some members of Tempest’s Technical Consulting team which subject I would choose for the last stage of the internship program – a research paper on a particular security topic – I was initially unsure how to answer. Between suggestions and topics chosen by other interns, I felt that I would like to address a theme that I had previously encountered but would still be challenging enough. So, among some of the ideas suggested by team members, I chose insecure deserialization, addressing cases of the problem in applications designed in the Java language.

Included in the OWASP Top 10 for the first time in the 2017 edition, insecure deserialization flaws can cause severe damage to the applications that contain them. They can lead to problems such as remote code execution and arbitrary file reads and writes, as well as violations of application-specific business rules.

But “what the heck” is insecure deserialization, and which issues does the vulnerability pose for applications designed in Java nowadays?

To answer these questions, we first need to understand some important concepts.

What is deserialization?

In the context of programming languages, serialization is the name given to the process in which an object is converted into a sequence of bytes, a stream, in order to be stored or transmitted. From an object in memory, we obtain a representation of it, which preserves its state, and enables it to be reconstructed at a future time.

Analogously, deserialization consists of the opposite process, in which an object is reconstructed within a running program, from the information present in a stream.

Serialization and deserialization are widely used in web applications built with PHP, .NET and Java, among other technologies. They often play a crucial role in the functioning of applications, either by transmitting serialized user information or by storing a given object for later use.

Java Serialization

The Java language supports the serialization process natively: it provides two interfaces, implemented by the classes whose objects are to be serialized, that determine how those objects are serialized and deserialized. They are:

Serializable – The default interface. It declares no methods and simply marks which classes take part in the serialization protocol. With it, the serialization logic is by default the responsibility of the JVM rather than the developer. Below is the signature of the interface in question:
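As a minimal sketch, the JDK's declaration of the interface is reproduced in a comment below, followed by a class that opts in to serialization. Point and MarkerDemo are hypothetical names used only for illustration.

```java
import java.io.Serializable;

// In the JDK, java.io.Serializable is declared with an empty body:
//     public interface Serializable { }
// It is a marker interface: a class opts in simply by naming it.

// Hypothetical example class, used only for illustration.
class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    int x;
    int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class MarkerDemo {
    // True when the object's class has opted in to Java serialization.
    static boolean isSerializable(Object o) {
        return o instanceof Serializable;
    }

    public static void main(String[] args) {
        System.out.println(isSerializable(new Point(1, 2))); // prints true
        System.out.println(isSerializable(new Object()));    // prints false
    }
}
```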

Externalizable – An interface that extends Serializable and transfers responsibility for the serialization logic to the developer, who must implement the methods that perform serialization and deserialization. Below is its declaration:
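The sketch below reproduces the two methods declared by the JDK interface in a comment, then shows a class implementing them. Temperature and ExternalizableDemo are hypothetical names, and the choice to store a single double is only an illustration of the developer deciding what goes into the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// In the JDK, java.io.Externalizable extends Serializable and declares:
//     void writeExternal(ObjectOutput out) throws IOException;
//     void readExternal(ObjectInput in) throws IOException, ClassNotFoundException;

// Hypothetical class: the developer decides exactly what is written and read.
class Temperature implements Externalizable {
    double celsius;

    // Externalizable classes need a public no-argument constructor;
    // it is called before readExternal during deserialization.
    public Temperature() { }

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeDouble(celsius);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        celsius = in.readDouble();
    }
}

class ExternalizableDemo {
    // Serializes the object in memory and reads it back.
    static Temperature roundTrip(Temperature t) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(t);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return (Temperature) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Temperature(21.5)).celsius); // prints 21.5
    }
}
```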

Serialized objects commonly appear in Java applications in places such as HTTP request parameters and cookie values, among others. In addition, the Remote Method Invocation (RMI) API also makes use of the serialization protocol in its communication.

Okay, but how exactly does this process occur?

Serialization and deserialization – How do they work?

After including the Serializable interface in the definition of a class, it’s possible to call specific methods to serialize or deserialize an object belonging to it. Such methods are:

java.io.ObjectOutputStream.writeObject, for serialization,

java.io.ObjectInputStream.readObject, for deserialization.

In addition, some specific methods can be defined by classes that implement one of the interfaces discussed above: The Magic Methods.

Magic Methods, what are they?

Defined within the classes that implement one of the interfaces discussed, Magic Methods are special methods that allow the programmer to insert additional logic that will be executed during the process.

Among the various Magic Methods, the writeObject and readObject methods stand out, commonly used to add functionality to the serialization and deserialization processes.

It’s worth noting that, as we can observe from their signatures, these methods aren’t the same methods discussed in the previous section (yes, it’s confusing!). Although they have the same names, Magic Methods are methods called during serialization, and are implemented by the programmer. The writeObject and readObject methods, discussed previously, are invoked to start the serialization and deserialization process, and are called elsewhere in the application code.
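To make the distinction concrete, here is a minimal sketch of the two Magic Methods with the signatures the serialization mechanism looks for. Session, its fields, and MagicMethodDemo are hypothetical names. The stream methods with the same names are the ones called in roundTrip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only to show where the Magic Methods live.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    String user;
    transient boolean restored; // transient: never written to the stream

    Session(String user) {
        this.user = user;
    }

    // Magic Method: private, and invoked reflectively by the serialization
    // machinery while ObjectOutputStream.writeObject(session) is running.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // write the non-transient fields as usual
    }

    // Magic Method: invoked while ObjectInputStream.readObject() is running.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restore the non-transient fields
        restored = true;        // additional logic added by the programmer
    }
}

class MagicMethodDemo {
    // Calls the stream methods, which in turn trigger the Magic Methods.
    static Session roundTrip(Session s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(s); // starts serialization
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(buf.toByteArray()))) {
                return (Session) ois.readObject(); // starts deserialization
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("alice"));
        System.out.println(copy.user + " " + copy.restored); // prints alice true
    }
}
```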

It’s also worth mentioning that serialization is effectively performed by the Magic Methods writeExternal and readExternal, which must be implemented by the programmer when the interface used is Externalizable. Within their definitions, it’s possible to use the methods of the ObjectOutput and ObjectInput interfaces to define the behavior of the serialization logic.

Serialization – Examples

Below is an example of an object being serialized. In it, we can observe how the writeObject method is invoked to start the process.

In the following image, we have the definition of the User class, which will be serialized. Note that it implements the Serializable interface:

We can observe its attributes, which already have values defined for demonstrative purposes.

Next, we see the code snippet where the writeObject method is called, passing the user object as a parameter. The information contained in it is written to the object oos, of the ObjectOutputStream class, which, in turn, is instantiated from an object of the FileOutputStream class.
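A sketch of what these two pieces might look like follows. The attribute names and values in User are placeholders, since the original ones are not reproduced here. The temporary file and the SerializeToFile helper are also assumptions made for the example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// The class to be serialized; attribute names and values are placeholders.
class User implements Serializable {
    private static final long serialVersionUID = 1L;
    String name = "Alice";
    int age = 30;
}

class SerializeToFile {
    // Writes the user to a temporary file and returns the bytes that were written.
    static byte[] serializeToTempFile(User user) {
        try {
            Path file = Files.createTempFile("user", ".ser");
            // oos is built on top of a FileOutputStream, as described in the text.
            try (FileOutputStream fos = new FileOutputStream(file.toFile());
                 ObjectOutputStream oos = new ObjectOutputStream(fos)) {
                oos.writeObject(user); // starts the serialization process
            }
            byte[] written = Files.readAllBytes(file);
            Files.delete(file);
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = serializeToTempFile(new User());
        System.out.println(bytes.length + " bytes written");
    }
}
```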

After the process is finished, we obtain the stream represented in the image below. Its bytes are grouped in sets, and represent important information, such as names, attribute values and metadata, which will serve as a reference to reconstruct the object later.

An important detail is that the first bytes, ac ed 00 05, are always the same: ac ed is the magic number that marks the beginning of a serialized Java object, and 00 05 is the version of the serialization protocol. This is very convenient when identifying serialized Java objects in HTTP requests, where they are usually encoded in Base64. In that encoding, the initial bytes take the form “rO0AB”.
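Both forms of the header can be checked with a few lines of code. The sketch below serializes an arbitrary string in memory and prints the first four bytes and the first five Base64 characters. StreamHeaderDemo and its helper methods are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Base64;

class StreamHeaderDemo {
    // Serializes any serializable object into a byte array.
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(o);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Formats the first four bytes of the stream as lowercase hex.
    static String hexPrefix(byte[] data) {
        return String.format("%02x %02x %02x %02x",
                data[0] & 0xff, data[1] & 0xff, data[2] & 0xff, data[3] & 0xff);
    }

    // Heuristic for spotting serialized Java objects in Base64-encoded values.
    static boolean looksLikeJavaSerialized(String base64) {
        return base64.startsWith("rO0AB");
    }

    public static void main(String[] args) {
        byte[] data = serialize("hello");
        String encoded = Base64.getEncoder().encodeToString(data);
        System.out.println(hexPrefix(data));           // prints ac ed 00 05
        System.out.println(encoded.substring(0, 5));   // prints rO0AB
    }
}
```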

Whew! After all this information, we finally have the necessary knowledge to understand the problem itself.

So… What exactly is insecure deserialization?

Insecure deserialization – What is it?

Insecure object deserialization is the name given when an application deserializes untrusted data that can be modified by the user. This creates the opportunity for such data to be constructed in a malicious manner, allowing an attacker to change the operation of the application under certain circumstances.

In general, insecure deserialization flaws in Java can be separated into two types:

– flaws where values of structures already present in the serialized object are modified, resulting in the possibility of attacks related to access control, breaking business rules, etc.

– flaws where the attacker takes advantage of classes accessible to the application to change the behavior of the deserialization to be performed. Flaws of this type tend to be quite severe, and can create problems ranging from arbitrary reading or writing of files to command execution in the most serious cases. It’s important to point out that a class doesn’t need to be in use by the application for an object that belongs to it to be deserialized. Any class available to the application is part of the attack surface, as long as it can be deserialized, that is, as long as it implements one of the interfaces discussed above.

Okay, but how does that happen?

Remember that Magic Methods give developers the freedom to insert additional logic into deserialization? These methods are invoked during the process itself. Knowing this, attackers can alter the object that is passed as input for deserialization in order to create a chain of invocations of methods already present in the application (the famous gadgets), which can end up severely compromising it if the process is carried out without due care.

Gadgets – What are they?

Briefly, gadgets are pieces of code already present in the application that help an attacker achieve a certain goal. They aren’t necessarily harmful and may have the function of passing the received input to another method.

The problem arises when an attacker manages to concatenate a series of gadgets, resulting in the passage of malicious data to a last gadget, which will be responsible for causing the real damage. These are known as sink gadgets.

Sink gadgets can, for example, execute arbitrary commands from their input, or perform file writing. The set of gadgets concatenated together in order to exploit some flaw is called a gadget chain.

Gadgets and Gadget Chains: Examples (and exercises!)

Now, how about taking a look at some examples?

Below, we have some scenarios where an application is vulnerable to insecure deserialization flaws. Can you identify them?

Example 1 – (Business) rules were made to be broken

Suppose we are testing an application. During the test, we realize that a request, like the example below, was made:

You can see that the value of the user cookie is a string in a format we have seen before: it has rO0AB as its initial characters! This strongly suggests that we are dealing with a serialized Java object. We can confirm our suspicion without much difficulty. To do so, we decode the string, which is Base64-encoded.

We get the following string as a result:

Indeed, the value of user is a serialized Java object! Let’s analyze the object a bit better with the help of a hex editor:

From the analysis of the stream found, we can infer attributes belonging to the class that originated the serialized object. Let’s see:

In yellow, we have the names of the serialized fields: Age, userID and name. In red, we have their values: 26, 3793 and “Bob”.

If we reconstructed the class that serialized the stream above, taking into account the types of each field, we would have something similar to the image below:

Now that we know the structure of the class, what kind of changes could we make to the serialized object that would cause some kind of impact on the functioning of the application?
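One possible answer, sketched under several assumptions: the class name UserCookie, the field types (two ints and a String) and the lower-case spelling of age are guesses made for illustration, and the "server" side is simulated locally. In a real test, the class name, package and serialVersionUID would have to match the ones recorded in the captured stream. The idea is simply to deserialize the cookie, change a field such as userID, and serialize it again.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

// Hypothetical reconstruction of the class behind the cookie.
class UserCookie implements Serializable {
    private static final long serialVersionUID = 1L;
    int age;
    int userID;
    String name;

    UserCookie(int age, int userID, String name) {
        this.age = age;
        this.userID = userID;
        this.name = name;
    }
}

class CookieTamperDemo {
    // Serializes the object and Base64-encodes it, as the cookie value would be.
    static String encode(UserCookie u) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(u);
            }
            return Base64.getEncoder().encodeToString(buf.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Base64-decodes the cookie value and deserializes it.
    static UserCookie decode(String cookie) {
        byte[] raw = Base64.getDecoder().decode(cookie);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (UserCookie) ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Rebuilds the cookie with a different userID, leaving other fields intact.
    static String tamper(String cookie, int newUserId) {
        UserCookie u = decode(cookie);
        u.userID = newUserId;
        return encode(u);
    }

    public static void main(String[] args) {
        String original = encode(new UserCookie(26, 3793, "Bob"));
        String forged = tamper(original, 1);
        System.out.println(decode(forged).userID); // prints 1
    }
}
```

If the application trusts userID from the cookie without further checks, the forged value would let one user act as another, which is the kind of broken business rule this example is about.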

Example 2 – From one node to the other

You have been hired to perform a code review on a financial application built in Java, in order to find possible security flaws. During the test, you come across the following class:

You have been informed by the developers that the application receives a serialized object, belonging to the above class, via a parameter of an HTTP request, which starts from the users’ browser.

Since this is a financial application, where security is crucial, the developers decided to perform additional checks on the object received from their users. For this, they implemented the Magic Method readObject, called during deserialization. In it, the queryDB() method is invoked, where a query to the database is performed, in order to complete the appropriate user validation checks.

Right after the query, the run() method of the logger object is also invoked, so that the operations performed are recorded and can be analyzed later in case of security incidents.

The logger is of type Runnable, an interface. In object-oriented programming, a variable declared with an interface type can hold an instance of any class that implements that interface. Thus, an object of any class that implements Runnable can be assigned to logger.

The signature of the Runnable interface can be seen in the image below:

As you can see, the interface has only one method: run().

In the application, the logger object is usually instantiated as an object of the UserLog class, which is part of a library imported by the application. Its definition is shown below:

Realizing that the User class plays a highly important role in the application, you make some notes and decide to proceed with the review.

A few days later, while continuing the code review, you come across the following class:

Knowing that the above class is part of the same codebase as the project under review, what could an attacker do with it? And more importantly, what would the malicious payload look like?
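The classes from this scenario are not reproduced here, so the following is only a sketch of the general idea under stated assumptions. User, UserLog, logger, queryDB() and run() follow the description above, but their bodies are invented. MaintenanceTask is a hypothetical stand-in for the class found later in the review, and its run() method merely records a string instead of performing a dangerous action.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// The logger normally used by the application (body invented).
class UserLog implements Runnable, Serializable {
    private static final long serialVersionUID = 1L;
    public void run() { /* would record the operation */ }
}

// Hypothetical gadget: another serializable Runnable in the same codebase.
class MaintenanceTask implements Runnable, Serializable {
    private static final long serialVersionUID = 1L;
    static final List<String> EXECUTED = new ArrayList<>(); // harmless stand-in for a real sink
    String action;

    MaintenanceTask(String action) {
        this.action = action;
    }

    public void run() {
        EXECUTED.add(action); // attacker-controlled data reaches the sink
    }
}

class User implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
    Runnable logger = new UserLog();

    User(String name) {
        this.name = name;
    }

    private void queryDB() { /* validation query omitted */ }

    // Magic Method: runs during deserialization, before the caller sees the object.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        queryDB();
        logger.run(); // runs whatever Runnable the stream supplied
    }
}

class GadgetChainDemo {
    // Attacker side: a User whose logger is swapped for the gadget.
    static byte[] buildPayload(String action) {
        User u = new User("mallory");
        u.logger = new MaintenanceTask(action);
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(u);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Server side: deserializes the HTTP parameter without any filtering.
    static void deserialize(byte[] payload) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) {
            ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        deserialize(buildPayload("attacker-chosen action"));
        System.out.println(MaintenanceTask.EXECUTED); // prints [attacker-chosen action]
    }
}
```

The point of the sketch is that the server never calls MaintenanceTask itself: the call is reached only because readObject invokes run() on a field whose concrete class is chosen by whoever produced the stream.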

Insecure deserialization – Mitigation measures

Finally, what should we do to mitigate problems like those shown above?

The answer is simple: don’t deserialize objects from sources that are not fully trusted!

However, this isn’t always possible. Many applications have been designed in such a way that deserialization of objects, which can be modified by users, is not only inevitable, but also essential. Changing this would imply making deep structural changes, which are often limited by the technologies used to build the application itself.

Got it! So, as an alternative we should find and eliminate gadgets from the application, right?

No!

While it may make sense at first, “hunting” for gadgets is not a good idea as it doesn’t tackle the root of the problem: deserializing untrusted objects. New gadgets are being discovered all the time, and not finding known gadgets in your application doesn’t necessarily mean that it’s free from unsafe object deserialization.

Besides, large projects have hundreds of libraries and thousands of classes, and examining each one of them thoroughly for gadgets is an extremely time-consuming, inefficient and error-prone job.

Given all this, we have a few alternatives:

– Restrict deserialization: by default, ObjectInputStream deserializes any class that implements the Serializable interface. This behavior can be changed with allow/deny lists, so that only the desired classes can be deserialized. There are a few ways to do this: implementing a new class that inherits from ObjectInputStream and overrides resolveClass, a method called internally when the stream names a class and before any object of that class is created, or using existing libraries that offer similar functionality, such as SerialKiller and SafeObjectInputStream. Calls to ObjectInputStream in the code should then be replaced by calls to one of the new classes. Approaches of this type are known as look-ahead deserialization.

– Implement integrity checks: another alternative is, where possible, to use integrity-checking mechanisms, such as digital signatures, to ensure that the stream to be deserialized has not been modified since its creation.

– Run the code that performs deserialization in lower-privilege environments: this limits the damage an attacker can cause if the deserialization is exploited.

In addition to all this, deserialization operations should also be monitored in the application through logging, allowing errors, problems and suspicious activities to be analyzed later if necessary.

Finally, it is worth mentioning that the techniques described here are not definitive solutions, and the ideal approach would involve employing multiple techniques together in order to minimize the chances of problems occurring.
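As an illustration of the integrity-check alternative, the sketch below refuses to deserialize a stream unless its authentication tag verifies. Note that it uses an HMAC (a keyed hash with a shared secret) rather than an asymmetric digital signature, which is a substitution made to keep the example short. SignedStream, its methods and the hard-coded demo key are assumptions. Key management is out of scope here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class SignedStream {
    private static final String ALGORITHM = "HmacSHA256";

    // Computes the HMAC-SHA256 tag of the serialized bytes.
    static byte[] tag(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(key, ALGORITHM));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Compares the expected and received tags in constant time.
    static boolean isAuthentic(byte[] key, byte[] data, byte[] receivedTag) {
        return MessageDigest.isEqual(tag(key, data), receivedTag);
    }

    // Deserializes only after the tag has been verified.
    static Object readIfAuthentic(byte[] key, byte[] data, byte[] receivedTag) {
        if (!isAuthentic(key, data, receivedTag)) {
            throw new SecurityException("stream failed the integrity check");
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(buf)) {
                oos.writeObject(o);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-do-not-reuse".getBytes(StandardCharsets.UTF_8);
        byte[] data = serialize("hello");
        byte[] tag = tag(key, data);
        System.out.println(readIfAuthentic(key, data, tag)); // prints hello

        data[data.length - 1] ^= 1; // simulate tampering in transit
        System.out.println(isAuthentic(key, data, tag));     // prints false
    }
}
```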

Conclusion

That’s it! I hope this (not so brief!) article has contributed to the understanding of this problem, which greatly impacted the internet around 2015, but is still far from extinct.

References

Below, I leave some interesting links for those who wish to dig a little deeper:

CONTRAST-SECURITY-OSS. Contrast-rO0 – SafeObjectInputStream. Available at: https://github.com/Contrast-Security-OSS/contrast-rO0. Accessed on: June 21, 2023.

FROHOFF, Chris; LAWRENCE, Gabriel. Marshalling Pickles – Chris Frohoff & Gabriel Lawrence – OWASP AppSec California 2015. Available at: https://www.youtube.com/watch?v=KSA7vUkXGSg. Accessed on: June 21, 2023.

FROHOFF, Chris. OWASP SD: Deserialize my shorts. Available at: https://frohoff.github.io/owaspsd-deserialize-my-shorts/. Accessed on: June 21, 2023.

FROHOFF, Chris. Ysoserial. Available at: https://github.com/frohoff/ysoserial. Accessed on: June 21, 2023.

IKKISOFT. SerialKiller. Available at: https://github.com/ikkisoft/SerialKiller. Accessed on: June 21, 2023.

ORACLE. Java Object Serialization Specification. Available at: https://docs.oracle.com/javase/8/docs/platform/serialization/spec/serialTOC.html. Accessed on: June 21, 2023.

PORT SWIGGER. Web Security Academy – Insecure Deserialization. Available at: https://portswigger.net/web-security/deserialization. Accessed on: June 21, 2023.