How to find vulnerable log4j instances across your AWS EC2 instances

Introduction

A critical-severity vulnerability in Log4j, one of the most ubiquitous Java logging packages, was discovered a couple of weeks ago and the Internet is still abuzz with the news. Rightly so, since successful exploitation of this bug allows remote, unauthenticated command execution and system takeover.

“The vulnerability arises because the Log4j package mixes command and data resulting in execution of strings that are interpreted as a command.”

Over the last two weeks, three vulnerabilities have been disclosed in Log4j: CVE-2021-44228, CVE-2021-45046 and CVE-2021-45105. Their impact ranges from full remote code execution to denial of service under special circumstances. The gravest of these, CVE-2021-44228, allows straightforward remote command execution over the Internet without any authentication, resulting in a complete compromise of data and system confidentiality, integrity and availability. This gives it a CVSS score of 10 (Critical).

This article shows how to automate the process of discovering whether a vulnerable copy of log4j exists on the AWS EC2 instances in your infrastructure. Although the methodology is described using AWS and Linux machines, the techniques can be broadly adapted to most systems that can be accessed remotely (SSH, WinRM etc.).

Note: On Windows, the find command and the command used to obtain the SHA1 hash of each discovered JAR would need to change. We have covered this in a previous post.

What is Apache Log4j?

Log4j is a Java-based logging utility. It is part of Apache Logging Services, a project of the Apache Software Foundation. The utility is used to log error and debug messages in hundreds of millions of software installations across the world and is the default logging package in a lot of popular SaaS and online services including Amazon, Apple iCloud, Twitter, Cisco, Cloudflare etc.

The current version of Log4j is version 2 which has significant improvements over version 1 (no longer supported).

The vulnerable versions of Log4j are 2.0 to 2.16.0. Log4j v2.17.0 was released on December 18th and patches all three vulnerabilities that have been made public.

How to automate detection of the vulnerability on a server

This section shows how you can automate the detection of the vulnerable version across different machines. The overall process is:

1. Make a list of all vulnerable versions of log4j
2. Create file hashes for these versions
3. Remotely run a command on a machine to find all JAR files
4. Compute file hashes for these discovered JAR files
5. Compare the hashes for these JAR files with hashes obtained in Step 2
6. Repeat steps 3 to 5 on the next machine until all machines are covered

Let’s see these steps in detail and some AWS specific ways of executing commands remotely as root on EC2 instances.

Step 1 - Download the versions that are vulnerable and create a file hash for each version.

Note: You can use the list that we have created from here if you want to jump to Step 3 - https://github.com/Kloudle/vulnerable-log4j-jar-hashes

The impacted versions of Log4j are 2.0 to 2.16.0. We can obtain the list from the official Log4j archive at https://archive.apache.org/dist/logging/log4j/. Below is the list of vulnerable versions.

Bash script to download all the vulnerable versions. Here, vulnerable-log4j-versions.txt contains all the vulnerable versions of log4j, one per line, and is read line by line.

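#!/bin/bash

# Download the binary archive for every version listed in
# vulnerable-log4j-versions.txt (one version per line).
while IFS= read -r version || [ -n "$version" ]
do
    # Skip blank lines in the version list
    [ -n "$version" ] || continue
    echo "Downloading log4j-$version"
    DOWNLOAD_URL="https://archive.apache.org/dist/logging/log4j/$version/apache-log4j-$version-bin.tar.gz"
    echo "Archive Download URL: $DOWNLOAD_URL"
    # -P saves the archive into ./temp_archives/, creating it if needed
    wget "$DOWNLOAD_URL" -P ./temp_archives/
done < vulnerable-log4j-versions.txt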

Once the vulnerable versions are downloaded, we need to extract all the .tar.gz archives into a directory.
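The extraction step could look something like the sketch below. It assumes the archives were saved to ./temp_archives/ by the download script and unpacks each one next to its archive, so the extracted directories are ready for Step 2. The function name extract_archives is only illustrative.

```shell
#!/bin/bash

# extract_archives DIR: unpack every .tar.gz archive found in DIR, in place.
# (The function name is illustrative and not part of the original scripts.)
extract_archives() {
    local dir="$1"
    local archive
    for archive in "$dir"/*.tar.gz
    do
        # Skip the literal pattern when no archive matches
        [ -f "$archive" ] || continue
        echo "Extracting $archive"
        tar -xzf "$archive" -C "$dir"
    done
}

# Usage: extract_archives ./temp_archives
```

Each archive unpacks into its own directory, which keeps the JAR files of different versions apart.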

Step 2 - Generating the hash for each version downloaded

sha1sum is a program that calculates and verifies SHA-1 hashes. It is commonly used to verify the integrity of files. In the previous step we downloaded and extracted the .tar.gz archives of all the vulnerable versions of log4j. We now compute the sha1sum of every .jar file and store the results in a .csv file.

Bash script to compute the sha1sum of all the .jar files

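#!/bin/bash
# Collect "<jar name>,<sha1>" for every JAR in the extracted archives
# into a single CSV in the directory the script is started from.
OUTPUT_CSV="$PWD/log4j-vuln-versions-sha1sum.csv"
cd ./temp_archives/ || exit 1
for dir in */
do
    for file in "$dir"*.jar
    do
        # Skip the literal pattern when a directory holds no JAR files
        [ -f "$file" ] || continue
        # sha1sum prints "<hash>  <path>"; keep only the hash
        echo "$(basename "$file"),$(sha1sum "$file" | cut -d ' ' -f1)" | tee -a "$OUTPUT_CSV"
    done
done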

This creates a CSV with the list of SHA1 hashes of JAR files that are known to be vulnerable.

Step 3 - Running Against a Server

Running a command on a remote server requires some kind of access. Usually you would use SSH for Linux and WMI or WinRM for Windows. Cloud providers may offer additional options. In this example, our targets are AWS EC2 instances and we use AWS Systems Manager (SSM) to run commands on them remotely. An additional advantage in this context is that these commands run as root, ensuring all areas of your file system are scanned.

SSM is a very cool AWS service that helps you automate management tasks on EC2 instances, such as collecting system inventory, applying operating system (OS) patches, automating the creation of Amazon Machine Images (AMIs), and configuring operating systems and applications at scale, all using AWS IAM credentials.

Here’s the bash shell script that will use SSM to run a find command and compute the SHA1 hash for the discovered JAR files.

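#!/bin/bash

# Quote *.jar so the remote shell does not expand the glob before find sees it
COMMAND_ID=`aws ssm send-command --document-name "AWS-RunShellScript" --targets '[{"Key":"InstanceIds","Values":["i-**********"]}]' --parameters 'commands=["#!/bin/bash","find / -type f -iname \"*.jar\" -exec sha1sum {} \\;"]' | jq -r '.Command.CommandId'`
echo $COMMAND_ID
# Wait for the command to finish, then fetch and print its output
aws ssm wait command-executed --command-id "$COMMAND_ID" --instance-id "i-**********"
COMMAND_DETAILS=`aws ssm list-command-invocations --command-id $COMMAND_ID --details`
echo "$COMMAND_DETAILS"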

The script runs the find command using SSM’s send-command and obtains the Command ID of the executed command (an AWS-side reference, not a system-level process ID). It then uses another SSM command, list-command-invocations, to fetch the actual output of the find command.
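To work with the hashes rather than the raw JSON, the output field can be pulled out with the AWS CLI's own --query option. The sketch below assumes the command targeted a single instance. It keeps only lines that look like sha1sum output and drops anything else, such as error messages from find. The function name fetch_remote_hashes is hypothetical.

```shell
#!/bin/bash

# fetch_remote_hashes COMMAND_ID: print the "<sha1>  <path>" lines produced
# by the remote find/sha1sum run. Assumes a single target instance.
# (The function name is illustrative and not part of the original scripts.)
fetch_remote_hashes() {
    local command_id="$1"
    aws ssm list-command-invocations --command-id "$command_id" --details \
        --query 'CommandInvocations[].CommandPlugins[].Output' --output text |
        # Keep only lines that start with a 40-character hex SHA-1 hash
        grep -E '^[0-9a-f]{40}  '
}

# Usage: fetch_remote_hashes "$COMMAND_ID" > remote-jar-hashes.txt
```

Output returned this way is truncated by SSM for long results, so on hosts with many JAR files you may prefer to have send-command write the output to S3 using its --output-s3-bucket-name option.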

If you do not want to use SSM or are unsure if all the machines within your AWS EC2 support SSM then you could do this the old school way of running the command over SSH using a bash script or via an automation framework like Ansible.
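A rough sketch of the SSH route is shown below. It assumes key-based SSH access to each host, sudo without a password prompt on the target, and a hypothetical file containing one hostname per line. The function names are illustrative.

```shell
#!/bin/bash

# scan_host_over_ssh HOST: print "<sha1>  <path>" for every JAR on HOST.
# Assumes key-based SSH access and passwordless sudo on the target.
scan_host_over_ssh() {
    local host="$1"
    ssh -o BatchMode=yes "$host" \
        "sudo find / -type f -iname '*.jar' -exec sha1sum {} \; 2>/dev/null"
}

# scan_hosts FILE: scan every host listed (one per line) in FILE and save
# each host's results to <host>-jar-hashes.txt in the current directory.
scan_hosts() {
    local hosts_file="$1"
    local host
    while IFS= read -r host || [ -n "$host" ]
    do
        # Skip blank lines in the host list
        [ -n "$host" ] || continue
        # Redirect stdin so ssh does not consume the rest of the host list
        scan_host_over_ssh "$host" < /dev/null > "${host}-jar-hashes.txt"
    done < "$hosts_file"
}

# Usage: scan_hosts hosts.txt
```

Each host gets its own results file, so a failed connection to one machine does not affect the hashes collected from the others.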

Step 4 - Compare the hashes from Step 3 with your Master List

The hashes generated in Step 2 for all the vulnerable versions of log4j can then be compared, manually or using automation, with the hashes generated in Step 3 for all the JAR files present on the remote EC2 instance.
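One possible way to automate the comparison is sketched below. It assumes the master list is a CSV of "<jar name>,<sha1>" lines as written in Step 2, and that the remote results have been saved to a file of "<sha1>  <path>" lines as printed by sha1sum. The function name and both file names are hypothetical.

```shell
#!/bin/bash

# find_vulnerable_jars MASTER_CSV REMOTE_HASHES
#   MASTER_CSV:    "<jar name>,<sha1>" lines, as produced in Step 2
#   REMOTE_HASHES: "<sha1>  <path>" lines, as printed by sha1sum in Step 3
# Prints every remote line whose hash appears in the master list.
# (The function name is illustrative and not part of the original scripts.)
find_vulnerable_jars() {
    local master_csv="$1"
    local remote_hashes="$2"
    awk '
        # First file: remember each known-vulnerable hash and its JAR name
        NR == FNR {
            split($0, field, ",")
            if (field[2] != "") known[field[2]] = field[1]
            next
        }
        # Second file: the first whitespace-separated field is the hash
        ($1 in known) { print $0 " (matches " known[$1] ")" }
    ' "$master_csv" "$remote_hashes"
}

# Usage: find_vulnerable_jars log4j-vuln-versions-sha1sum.csv remote-jar-hashes.txt
```

An empty result only means that no JAR matched a known hash. A copy of log4j repackaged inside another JAR has a different hash and would not be caught by this comparison.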

Any detected instance can then be quarantined and patched with the new version before the machine is used again in production.

Conclusion

The article explains the methodology used to automate the discovery of vulnerable instances of Log4j within a large set of machines on AWS using hash comparison. The methodology can be tweaked to work with any kind of instance that can be remotely administered.

The log4j vulnerabilities are here to stay, at least for the next several months. As developers, DevOps folks and administrators continue to identify new instances within their networks, and offensive security researchers continue to find novel ways of attacking this vulnerability across known and unknown software, Log4j will remain a hot topic.

To ensure your systems are not exploited, a defence in depth approach works well, but you would still need to patch your systems at some point. Automating the discovery, and potentially even the patching, of vulnerable systems remotely is something administrators would want when working with hundreds of instances whose vulnerability status is unknown.

Riyaz Walikar

Founder & Chief of R&D

Riyaz is the founder and Chief of R&D at Kloudle, where he hunts for cloud misconfigurations so developers don’t have to. With over 15 years of experience breaking into systems, he’s led offensive security at PwC and product security across APAC for Citrix. Riyaz created the Kubernetes security testing methodology at Appsecco, blending frameworks like MITRE ATT&CK, OWASP, and PTES. He’s passionate about teaching people how to hack—and how to stay secure.
