I don't have an article, but here's a super quick rundown. Log4j is a very common logging framework used in Java. It very often gets pulled in along with other dependencies, so it's easy to be using it without even realizing it. It has a feature that allows it to download and run code just by logging specially formatted strings. So if someone can cause your server to log these strings, it will run whatever code they want. On Minecraft servers, for example, all chat messages get logged at some point, so just posting the right string in chat will cause the server to execute the code you tell it to download.
Okay, I know I am not a Real Programmer, but even I know that user content is to be Not Trusted. Isn't it like a Security 101 principle that user content is always potentially dangerous, and to be treated accordingly?
> Meanwhile, library developers didn't realize app devs would be feeding untrusted strings to their library.
What?!
It's a logging library. I would expect users of it to be feeding user input, so that in the case of a bug, I can look at logs to see what input triggered a bug.
> It's a logging library. I would expect users of it to be feeding user input,
I agree with you, but I did see a couple of days ago someone with the diametrically opposite opinion: that we should never log user input, with a link to https://owasp.org/www-community/attacks/Log_Injection (plus this bug) as the justification.
Seems like a strange conclusion to draw. I mean, taking input from one user and presenting it to another creates the opportunity for XSS attacks, but obviously you wouldn't use that to argue that you should never show one user's input to another, because then no website could contain user-generated content. Forums would not exist and the entire web would be non-interactive.
Nah...logging user input is a must to be able to perform digital forensics and incident response. Certainly knowing exactly how an attack was triggered would help in preventing it in the future.
Just filter the CRs and LFs to prevent log forging, and make sure log files are not accessible from the web app. They should be in /var/log, not in the web root.
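For what it's worth, the CR/LF filtering can be as small as this. A minimal Python sketch; `sanitize_for_log` is a made-up helper name, not part of any logging library, and escaping (rather than dropping) the characters is just one reasonable choice:

```python
def sanitize_for_log(value: str) -> str:
    """Escape CR and LF so one user-supplied value cannot span several log lines."""
    # Escape backslashes first so a literal "\n" typed by the user stays
    # distinguishable from a real newline that we escaped.
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )
```

You would call it on the untrusted part only, e.g. `logging.info("login failed for user=%s", sanitize_for_log(username))`, so a forged "second line" ends up as visible `\n` text inside the original entry.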
Your instinct is right, and in hindsight that's easy to say. Practically speaking, however, "treated accordingly" generally means "do not execute this string as a command (or pass it to anything that might do so, like a SQL statement)". It was entirely reasonable (though, we have now discovered, false) to expect that a logging framework would not execute the strings that it received.
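To make the SQL half of that concrete, here is a small Python sketch using the standard-library `sqlite3` module. The table, the function names, and the hostile input are all made up for illustration:

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a single user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES ('alice')")
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str):
    # The user's string is spliced into the statement, so the database
    # parses it as SQL: this is "executing the string as a command".
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str):
    # The user's string is bound as a parameter, so it is only ever data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
```

With the input `nobody' OR '1'='1`, the unsafe version returns every row while the safe version returns nothing, because no user has that literal name.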
Yes, but oftentimes you want to log user-generated events (especially ones that might otherwise be ephemeral) to create, well, logs of what has happened. You expect the log library to dump whatever string you hand it to the correct location (based on config, log level, etc.), with any necessary sanitizing, and you otherwise forget about it. You don't expect the log library to try to execute any part of the string.
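A toy model of that expectation, in Python. This is emphatically not Log4j's code: `expand`, `unsafe_log`, `safe_log`, the `%m` placeholder handling, and the `${kind:arg}` regex are all simplified stand-ins invented for illustration. The only point is the difference between expanding lookups in the trusted layout pattern and expanding them in the finished line, which includes the message:

```python
import re

# Matches toy lookups of the form ${kind:argument}.
LOOKUP = re.compile(r"\$\{(\w+):([^}]*)\}")


def expand(text, resolvers):
    """Replace each ${kind:arg} with resolvers[kind](arg), if such a resolver exists."""
    def repl(match):
        kind, arg = match.group(1), match.group(2)
        resolver = resolvers.get(kind)
        return resolver(arg) if resolver else match.group(0)
    return LOOKUP.sub(repl, text)


def unsafe_log(pattern, message, resolvers):
    # Flawed order: the message is substituted first, then the whole line
    # is expanded, so lookups typed by a user get resolved too.
    return expand(pattern.replace("%m", message), resolvers)


def safe_log(pattern, message, resolvers):
    # Only the trusted pattern is expanded; the message is inserted verbatim.
    return message.join(expand(part, resolvers) for part in pattern.split("%m"))
```

If a resolver did something dangerous (say, fetching a remote object), `unsafe_log` would trigger it for any user who can get text logged, while `safe_log` would just write the `${...}` text to the log as-is.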
As best I've been able to tell, this was not an intentional feature; it was added by the original author for configuration, so that they could drop LDAP URLs into the log4j configuration file, thus using LDAP as a "configuration server". I don't think they realized this would cross paths with every single log message as well.
Mind you, I also think that original intention was idiotic: now your Java application can't boot up unless your LDAP server is working, and for what? That's the kind of thing that makes global restarts & outage recovery a disaster.
If you look at the change that caused this, the intent was to pattern-match JNDI requests in incoming log statements and run them. The only configuration in the change was to specify the pattern match for JNDI. There was no intention to run JNDI from the configuration; the intention was always to read log statements and run those.
It uses the routing appender, which is intended to send logs different ways via string matching, and added code saying that if the matched string has ${jndi... it should run that JNDI code.
The change did what it said it would do. It's akin to someone submitting a patch to run eval(log_statement).
That it passed review and was accepted is frightening.
I will admit that I've used it, though. I have an IRC bot written in Python with a "!calc" command that calls eval(). But I use `ast` to build the syntax tree, then walk it and compare every AST object to a whitelist, so you can only use numbers and math operators. Anything else, such as CALLs or strings, will throw an error.
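Roughly what that looks like, as a minimal sketch rather than the bot's actual code. `safe_calc` and the exact whitelist are assumptions on my part; I left out `**` on purpose, since something like `9**9**9` would still hang the bot even with only numbers and operators allowed:

```python
import ast

# Node types a pure arithmetic expression is allowed to contain.
_ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Mod,
    ast.UAdd, ast.USub,
)


def safe_calc(expr: str):
    """Evaluate an arithmetic expression after checking every AST node against a whitelist."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        # Constants also cover strings, bytes, None and booleans; allow numbers only.
        if isinstance(node, ast.Constant) and type(node.value) not in (int, float):
            raise ValueError(f"disallowed constant: {node.value!r}")
    # Only reached when the tree holds nothing but numbers and math operators.
    return eval(compile(tree, "<calc>", "eval"), {"__builtins__": {}}, {})
```

`safe_calc("2 + 3 * 4")` gives `14`, while `safe_calc("__import__('os')")` raises `ValueError` before anything is evaluated, because the tree contains a `Call` node.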