The Java Deserialization Bug

matt_heimer · on Nov 8, 2015

The linked article says "No such safe method is available for Java serialization, so your best bet in Java is not to use Java serialization unless you absolutely trust whoever is producing the data".

This is wrong. It is trivial to mitigate this issue and it only takes a couple of lines of code. For a step-by-step guide from 2013 see http://www.ibm.com/developerworks/library/se-lookahead/
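
A minimal sketch of the look-ahead idea, not the IBM article's exact code: subclass ObjectInputStream and override resolveClass so that a class is rejected before it is loaded or instantiated. The names LookAheadObjectInputStream and LookAheadDemo and the contents of the allow-list are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Set;

// Hypothetical name: rejects any class that is not on the allow-list.
class LookAheadObjectInputStream extends ObjectInputStream {
    private final Set<String> allowed;

    LookAheadObjectInputStream(InputStream in, Set<String> allowed) throws IOException {
        super(in);
        this.allowed = allowed;
    }

    // Called for each class descriptor in the stream, before the class is loaded.
    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        if (!allowed.contains(desc.getName())) {
            throw new InvalidClassException(desc.getName(), "not allowed for deserialization");
        }
        return super.resolveClass(desc);
    }
}

class LookAheadDemo {
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object readAllowed(byte[] data, Set<String> allowed)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new LookAheadObjectInputStream(new ByteArrayInputStream(data), allowed)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Integer's serializable superclass Number has its own descriptor, so list both.
        Set<String> allowed = Set.of("java.lang.Integer", "java.lang.Number");
        System.out.println(readAllowed(serialize(42), allowed));
        try {
            readAllowed(serialize(new ArrayList<String>()), allowed);
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.classname);
        }
    }
}
```

The allow-list has to name every serializable class in the expected object graph, including serializable superclasses, so in practice it is built from the types the application actually exchanges.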

pron · on Nov 8, 2015

Another way to solve this is to only call readResolve/readObject(ObjectInputStream) under a different security policy (a different AccessControlContext from the caller). The problem is that both approaches (lookahead or security policy) require you to use a custom ObjectInputStream class.

wsargent · on Nov 12, 2015

There are two problems with the lookahead approach:

1) It only applies to the code you write. Libraries can still use insecure object serialization under the hood, i.e. you can't apply this to a built-in app server running JMX.

2) It doesn't account for pathological objects, i.e. a billion laughs attack: https://gist.github.com/coekie/a27cc406fc9f3dc7a70d

More details here: https://tersesystems.com/2015/11/08/closing-the-open-door-of...

matt_heimer · on Nov 9, 2015

The article has been updated to include subclassing ObjectInputStream as a possible mitigation.

matt_heimer · on Nov 8, 2015

Better information on which class (InvokerTransformer) is being exploited: http://foxglovesecurity.com/2015/11/06/what-do-weblogic-webs...

The related Apache ticket: https://issues.apache.org/jira/browse/COLLECTIONS-580

alblue · on Nov 8, 2015

It's not just Apache Commons that's affected - other libraries such as Spring and Groovy have similar vulnerabilities. My write up at InfoQ is here:

https://news.ycombinator.com/item?id=10526953

kevinherron · on Nov 8, 2015

This is quite serious; not sure how we didn't all start hearing about this back at the end of January.

matt_heimer · on Nov 8, 2015

It isn't a new concern; you would have had to start listening back in 2005 (or probably earlier): https://inst.eecs.berkeley.edu/~cs161/fa05/Notes/objectSeria...

draw_down · on Nov 8, 2015

January?

matt_heimer · on Nov 9, 2015

January was when some slides from a talk were posted - http://www.slideshare.net/frohoff1/appseccali-2015-marshalli...

jtheory · on Nov 9, 2015

I don't think I've ever worked on a project that deserialized untrusted Java objects; where does this come up?

I suppose -- if you have a Java client for a (probably) EJB-based server; most of the cases where I imagine this sort of setup are with internal-only solutions, though, which mitigates the risk somewhat (but obv. it would still need to be corrected).

I've read about banks in some countries still commonly using Java applets for site access -- that could be a seriously dangerous attack vector.

Nelson69 · on Nov 9, 2015

I've been trying to figure that out too. Part of it is certainly some kind of bug hunter fame that is being played for.

That just seems like bad architecture, period. Now if this can be exploited with RMI or something, I wouldn't be too shocked to see some client/server type applications using that as an interface.

Are there session cookies that are serialized objects or something?

notZeroDay · on Nov 9, 2015

This is not so much a "zero-day" vulnerability, since the topic was discussed back in January, albeit with little fanfare. That it did not receive much attention until last week doesn't mean it's a sudden exploit discovered being used maliciously in the wild.

This was a tech talk intended to broach the subject of dangerous, but standard features of the Java serialization API, providing a demonstration of proof of concept attack vectors.

The truth is that this is not actually a bug. The code works as the design of the OPTIONAL THIRD PARTY library intends. The problem is that this is code that people carelessly deploy, accidentally exposing themselves to attack, failing to understand the full scope and capabilities of their servers and the features they've accidentally included by default.

aaronharnly · on Nov 8, 2015

Trust is relative, of course; one should be extremely wary of data that came from an external user, or a client-side application. None of us would say one should naively deserialize such data.

But then there's data that comes from another team's server calling yours as a service; or from another server internal to your service; or off an internal message bus; or from one's own database.

Each of these has a different degree of trust, and different considerations to weigh.

CountHackulus · on Nov 8, 2015

So it's not a Java bug, it's an Apache Commons-Collection bug. Granted, it's an extremely popular library, but it's not a "Java" bug.

skybrian · on Nov 8, 2015

The bug is that doing deserialization safely is difficult in the presence of inheritance, since it's an open-world type system.

If a serializable object contains a field of type Foo, this implies that all subtypes of Foo can be transmitted, whether they were designed with serialization in mind or not. This is especially bad for commonly used base types such as List or Exception. At the limit, if you have a field of type Object or Any, there's no choice but to use an explicit whitelist.
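
To make that concrete, here is a small sketch under stated assumptions: Envelope, Gadget, and OpenWorldDemo are hypothetical classes invented for illustration, and the "side effect" is only a flag. It shows that a field declared as Object accepts any Serializable class on the classpath, and that the class's private readObject runs during deserialization, before the caller gets a chance to inspect what arrived.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical message type: the field is declared as Object,
// so any Serializable class on the classpath can travel in it.
class Envelope implements Serializable {
    private static final long serialVersionUID = 1L;
    final Object payload;

    Envelope(Object payload) {
        this.payload = payload;
    }
}

// Hypothetical stand-in for a class that was never meant to arrive from outside.
class Gadget implements Serializable {
    private static final long serialVersionUID = 1L;
    static boolean sideEffectRan = false;

    // Invoked by ObjectInputStream while the stream is being read.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        sideEffectRan = true; // a real gadget would do something harmful here
    }
}

class OpenWorldDemo {
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // The receiver only ever asked for an Envelope...
        Envelope e = (Envelope) roundTrip(new Envelope(new Gadget()));
        // ...but Gadget.readObject has already run by the time the cast happens.
        System.out.println("payload type: " + e.payload.getClass().getName());
        System.out.println("side effect ran: " + Gadget.sideEffectRan);
    }
}
```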

Contrast with how Go does unmarshaling in its standard library (which works with structs and arrays but not interfaces), and functional languages (which use unions in the form of algebraic data types, not inheritance). These are closed-world type systems where we can generate the whitelist by walking the type tree from the base type.

pron · on Nov 9, 2015

> The bug is that doing deserialization safely is difficult in the presence of inheritance, since it's an open-world type system.

Well, yes and no. Java's open-world type system is one of its greatest features. To counteract any security holes this may cause, Java has a very strong security model, where each piece of executable code is associated with a security context based on its origin, and what it is allowed to do is restricted by a security policy based on the code's domain. In this case, however, no foreign code was directly introduced. It was found to be possible to exploit code already found within the application's safe domain (a third-party library).

The Java solution is simple: during deserialization of external data, treat custom deserialization code as if it were foreign, and execute it within a foreign security context. It would be best if this were done directly in the JDK code, and I assume a patch will address this.

hyperpallium · on Nov 9, 2015

Restricting serialization to JSON-like objects loses the benefit of being able to "serialize objects". (However, having that separate internal format for data may well be a better way to go, in general.)

The algebraic approach, in this context, is similar to a superclass naming its subclasses (rather than a subclass naming its superclasses, as Java does). I'm not sure, but I suspect this open-world choice is deliberate, allowing extensibility. Java is very attentive to security considerations elsewhere. It would also be too dramatic a change for mainstream OO programming languages.

A third approach is explicitly closed-world: naming the set of classes permitted for deserialization, as per some other commenters. e.g. JAXB does this by necessity, for binding to XML Schema's algebraic type system.

pron · on Nov 9, 2015

> naming the set of classes permitted for deserialization

Or simply having ObjectInputStream execute readResolve/readObject(ObjectInputStream) under a different security context. No need for whitelisting.

kevinherron · on Nov 8, 2015

It's not a commons-collections bug. There is nothing wrong with the existence of the classes being used from the commons-collections on their own. In fact, I'm sure they serve a purpose.

But, in an application that deserializes objects from an untrusted source, the fact that they happen to be on the class path leads to them being available to use in an undesirable manner.

pron · on Nov 9, 2015

I agree. It is an ObjectInputStream security hole which needs to be patched by having it execute readResolve/readObject(ObjectInputStream) under a different security context.

matt_heimer · on Nov 8, 2015

The native execution bug is in the Apache Commons code, but the serialization issue is an attack vector that can be exploited if you are too lazy to implement some form of whitelisting. Think about class loading. Does your JVM/classloader implement class unloading? How much memory can I use in your JVM if I cause every single class on your CLASSPATH to load into memory?

It would have been nice if ObjectInputStream was abstract and required a subclass that provided a whitelist of classes.

barrkel · on Nov 9, 2015

Or even a simple predicate.
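
For readers on Java 9 or later: the JDK has since added java.io.ObjectInputFilter, which is close to this predicate idea. Below is a minimal sketch, assuming a Java 9+ runtime; FilterDemo and the "java.lang only" rule are illustrative choices, not a recommended policy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

class FilterDemo {
    // The predicate: allow classes in java.lang, reject every other class.
    static final ObjectInputFilter JAVA_LANG_ONLY = info -> {
        Class<?> c = info.serialClass();
        if (c == null) {
            // Some checks carry no class (for example reference or depth checks).
            return ObjectInputFilter.Status.UNDECIDED;
        }
        return c.getName().startsWith("java.lang.")
                ? ObjectInputFilter.Status.ALLOWED
                : ObjectInputFilter.Status.REJECTED;
    };

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object readFiltered(byte[] data, ObjectInputFilter filter)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter); // must be set before the first read
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readFiltered(serialize(42), JAVA_LANG_ONLY));
        try {
            readFiltered(serialize(new ArrayList<String>()), JAVA_LANG_ONLY);
        } catch (InvalidClassException e) {
            System.out.println("rejected by filter: " + e.getMessage());
        }
    }
}
```

Unlike the subclassing approach, the filter is attached to a plain ObjectInputStream, so no custom stream class is needed.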

TazeTSchnitzel · on Nov 8, 2015

> arbitrary object deserialization [...] is inherently unsafe

Is it always? In a language with side-effects (like Java), maybe, but in something like Haskell, surely it would be completely safe?

barrkel · on Nov 9, 2015

The distinction between file formats and programs is mostly convention rather than semantic.

The logical structure of a file parser is almost indistinguishable from that of a language parser, and most file parsers include what is effectively an interpreter to massage the low-level file instructions into something more meaningful in the loading program's domain model.

A file format is a program that, when evaluated, constructs an in-memory data structure.

The degree to which this is a problem depends on the side effects (if any) of data structure construction, and the set of data structures available to be constructed. And of course how the data structure is used later; if the data structure represents instructions (this is perfectly possible even in pure languages like Haskell) or may behave other than expected with inheritance, more problems ensue.

That's all independent of simple bugs in the parser, such as buffer overflows when trusting lengths and offsets embedded in the file, or infinite loops with naive offset chasing.

Filligree · on Nov 8, 2015

Yes. The binary/cereal libraries are perfectly safe unless there's a bug in the compiler/RTS somewhere.

CJefferson · on Nov 9, 2015

In Haskell you could still escape with System.IO.Unsafe at any time.

Of course, you could check that your deserialized code isn't doing anything unsafe, but that's fairly easy to do in Java too.

codygman · on Nov 10, 2015

The way to do so in Haskell is putting "Safe" in the Language section of a cabal file, and it covers other stuff like partial functions IIRC.

Are you saying that Java has something similar to SafeHaskell? Are you saying they are nearly the same? If yes to the last one, I feel that might be untrue.

It's quite possible I just read through your comment too quickly though.

CJefferson · on Nov 10, 2015

In Java, it's easy to run code under a SecurityManager, which lets you carefully control exactly what other classes the code can access. This gives you really fine-grained control, and lets you control very carefully what code is doing.