Software Composition Analysis

What, How, and Where Open Source Gets Pulled into a Codebase

Stephen Gates — Tue, 07 Dec 2021 13:04:48 +0000

The vast majority of software developers in the industry today are paid to solve business problems. Regardless of whether they work for small independent software vendors or Fortune 500 companies, solving such problems is now one of their primary responsibilities. Given the time and the opportunity, many software developers would write as much functionality into their applications as they possibly could from scratch. However, that can be very time consuming: first, they have to debug and fix it, and then, they have to maintain it (or better yet, enhance it).

Third-Party Extensions Are the Answer

To increase productivity and save a great deal of time, developers often use code written by third parties rather than rebuilding the same generic functionality across multiple applications. While there has always been a market for commercially licensed and supported extensions (including modules, packages, libraries, and frameworks), the vast majority of the third-party code used today is open source. This means that there is no marketplace and no purchase order; rather, a few extra lines of someone else’s code are simply imported into the software.

This can cause problems with licensing and disclosure if it is not accurately tracked and monitored. That is why software composition analysis (SCA) products are worth their weight in gold. These solutions find all of the third-party packages that are in use, then identify the corresponding licenses. They can even show whether those packages are out of date or have known security vulnerabilities reported against them.
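To make the idea of "finding what is in use" concrete, here is a minimal sketch that lists the packages installed in the current Python environment along with the license each one declares. It relies only on the standard library's importlib.metadata; the function name installed_packages is made up for illustration, and a real SCA product does far more than this (lockfile parsing, transitive resolution, vulnerability matching).

```python
from importlib import metadata


def installed_packages():
    """Return name, version, and declared license for each installed distribution."""
    inventory = []
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name")
        if not name:
            continue  # skip broken or partial installs
        inventory.append({
            "name": name,
            "version": dist.version,
            # Many packages leave this field empty, so fall back to a marker.
            "license": meta.get("License") or "UNKNOWN",
        })
    return sorted(inventory, key=lambda item: item["name"].lower())
```

Printing the result of installed_packages() gives a rough first inventory. Note that the declared license field is self-reported by each package and is often missing.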

It’s All Based on Open Source

Even the lowest level of an application stack (the language and runtime engine) is often open source. The most popular languages in use these days are all open source, or at least have open source distributions. Go, Python, PHP, Ruby, and JavaScript are all open source by default, and even languages that are traditionally commercially supported have open source distributions like OpenJDK (for Java) and gcc (for C/C++).

After you’ve chosen your language, you’ll likely want some structure in place so that you don’t have to write basic functionality like dependency management and data management yourself. Well over half of all Java applications use the Spring framework as their starting point. PHP applications often use Laravel, while JavaScript applications often use React and Bootstrap, among others.

Frameworks and languages form a solid foundation for any application, but the bulk of open source influence can be found in the staggering number of modules that are available as packages and libraries which can be easily integrated into applications.

How Easy Is It to Find Open Source Modules and Libraries?

Any web search for any type of functionality will often return results that link to places like GitHub, GitLab, PyPI, and many other sites. So how do you find what you need?

Let’s say that you want to make a particular form a little more secure by including a CAPTCHA. If you don’t know where to start, just head over to your favorite search engine and enter, “captcha library for Python.” In our case, the first result is an open source library that can be installed via pip (pip is the standard utility used in the Python ecosystem to install modules).

Installing this module is as simple as typing, “pip install captcha.”

Now, with just a couple lines of code, a whole new set of tested and proven functionality is added to the application in minutes.

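from captcha.image import ImageCaptcha

# The font paths are placeholders; point them at real .ttf files.
image = ImageCaptcha(fonts=['/path/A.ttf', '/path/B.ttf'])
# generate() returns the rendered image as an in-memory buffer.
data = image.generate('1234')
# write() renders the same text and saves it to a file.
image.write('1234', 'out.png')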

For another, more real-world example, let’s say that you have a web application that needs to be able to pick a date from a calendar. To show you how to do this, we will use the jQuery UI library, which builds on jQuery, has a great deal of functionality, and is easy to use.

The first step is to add the jQuery and jQuery UI files to the web page in question. There are two stylesheets and two script files that need to be imported. These are added between the head tags. The next step is to define the datepicker function, which activates the appropriate pieces of the library. The final step is to define where to put it on the page using an input field.

The code looks like this:

[HTML listing: a page titled “jQuery UI Datepicker - Default functionality” with a “Date:” input field]

The finished web page looks like this:

When you select the date input, it will present a calendar. You can stylize it, of course, but this example shows the simplicity that open source libraries can provide.

Conclusion

Without intentional effort, you will not find a single modern microservice or web site that doesn’t have open source somewhere in the components that it relies on or ships. The question isn’t whether you use open source, but whether you have a full view of it. Different open source licenses have different restrictions around distribution. The GPL requires that source code be made available when software built on GPL code is distributed, whereas Apache and BSD licenses mainly require that copyright and license notices be preserved. This can determine which open source libraries and modules you are able to include in a given application.

In any case, rather than just trusting the development team to document everything they do (and we all know how much developers love to document things), a better and more viable long-term solution would be to build a pipeline to catch all of the open source code early, before it could introduce known security vulnerabilities or licensing complications. Better yet, you could integrate SCA into your source code repository and let tools like Checkmarx Software Composition Analysis do the heavy lifting for you.

Vince Power is an Enterprise Architect with a focus on digital transformation built with cloud enabled technologies. He has extensive experience working with Agile development organizations delivering their applications and services using DevOps principles including security controls, identity management, and test automation. You can find @vincepower on Twitter.

Download our Ultimate Guide to SCA Here.

SBOM: How to Create One Using Checkmarx SCA

Stephen Gates — Mon, 15 Nov 2021 15:44:27 +0000

In the first post in this SBOM series, we discussed what an SBOM is and why you should care. As previously mentioned, generating an SBOM report may sound relatively simple, but in most cases, it’s not. As you likely know, modern software projects make use of a long list of third-party open source packages, each of which often calls on many other packages as dependencies. This can create an extensive tree of direct dependencies, dependencies of dependencies, and so on. Simply put, trying to create and manage an SBOM using a spreadsheet is nearly impossible, and if you try to manage your open source usage this way, it will likely get out of hand very quickly.

Another Caveat

The next caveat to consider is that SBOM reports should follow a standard format that includes detailed information about each involved component. At a minimum, a report needs to give each component’s name, supplier name, version, hashes and other unique identifiers, dependency relationship, author of SBOM data, and a timestamp. The report also needs to cover every software modification and update to reflect the current status of the project. An SBOM report is best produced by an automated process that is integrated into your CI/CD pipeline.
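As a rough illustration of that minimum field list, the sketch below models one SBOM entry as a Python dataclass and reports which of the minimum fields are still empty. The class name SbomEntry, its attribute names, and the helper missing_fields are all hypothetical and do not correspond to any particular SBOM standard.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SbomEntry:
    name: str
    supplier: str
    version: str
    hashes: dict   # e.g. {"SHA-256": "<hex digest>"}
    purl: str      # a unique identifier such as a package URL
    depends_on: list = field(default_factory=list)  # dependency relationship
    sbom_author: str = ""
    timestamp: str = ""


def missing_fields(entry):
    """Return the names of minimum fields that are empty.

    depends_on is skipped because a component may legitimately have no dependencies.
    """
    return [f.name for f in fields(entry)
            if f.name != "depends_on" and not getattr(entry, f.name)]
```

A check like this could run in a pipeline step so that an incomplete entry fails the build instead of reaching a published report.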

SBOM Methodology That Actually Enhances Security

The first and most fundamental task in generating an SBOM is analyzing the software dependencies, which is a natural undertaking for software composition analysis (SCA) solutions such as Checkmarx SCA. However, the ultimate purpose of an SBOM is not just to provide a list of ingredients, but to identify potential risk. A standard SBOM provides a list of ingredients but no simple way to detect and measure risks associated with third-party dependencies. So, what else do you need to enhance software security? Simple: vulnerability and license risk information.

To meet the need for a more comprehensive SBOM, Checkmarx SCA leverages our existing infrastructure for identifying vulnerabilities, in addition to license and supply chain risks, to supplement the standard SBOM info. This creates an SBOM that provides valuable insight into the risks associated with your third-party components instead of just a list of ingredients. This methodology exceeds the requirements for what a simple SBOM contains.

The SBOM reports generated from Checkmarx SCA use the existing CycloneDX SBOM format, and SPDX and SWID formats will be added soon. The reports also provide additional “property” fields showing important risk data that organizations need to know about. The reports can be exported in XML or JSON format, making them easy for organizations to consume, track, and update.

How to Generate an SBOM from Checkmarx SCA

Using the Checkmarx SCA User Interface

Navigate to the Scan Results screen for the most recent scan of the desired project.
Click on the “SBOM” button. The SBOM configuration dialog is shown below:

Select the SBOM standard. Currently, only CycloneDX is available.
Select the output format: XML or JSON.
Click “Generate SBOM.”

The SBOM report will be downloaded and can be viewed in any standard XML or JSON viewer.

How to Add CI/CD Integration

Checkmarx SCA provides plugins and CLI tools for various CI/CD pipelines. One method for running Checkmarx SCA scans via CLI commands is the CxSCA Resolver, which is an on-premises utility for resolving and extracting dependencies. The following section describes how to export SBOM reports using the CxSCA Resolver.

How to Generate SBOM Using SCA Resolver

An SBOM report can be exported via the CxSCA Resolver CLI using the --report-extension and --report-type arguments.

Example (use either Xml or Json as the report extension):

./ScaResolver -s /home/jack/src/MyApp -n MyApp -a Checkmarx -u jack -p 'demo123!' --report-extension Xml --report-type CycloneDx
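If you drive the resolver from a Python build script, one way to assemble that command is sketched below. It uses only the arguments that appear in the example, assuming the report flags are spelled with double hyphens; the function names are made up for illustration, and nothing here is an official Checkmarx wrapper.

```python
import subprocess


def build_resolver_command(source_dir, project, account, user, password,
                           extension="Json", resolver="./ScaResolver"):
    """Assemble the argument list for an SBOM export (a sketch, not an official API)."""
    if extension not in ("Xml", "Json"):
        raise ValueError("extension must be 'Xml' or 'Json'")
    return [
        resolver,
        "-s", source_dir,
        "-n", project,
        "-a", account,
        "-u", user,
        "-p", password,
        "--report-extension", extension,
        "--report-type", "CycloneDx",
    ]


def run_resolver(command):
    """Run the resolver and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with characters such as the exclamation mark in the sample password. In a real pipeline you would likely read the password from a secret store rather than hard-coding it.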

SBOM Content

Below is a view of the SBOM content, which is part of the SBOM Checkmarx SCA generates.

The standard SBOM fields are ID (purl), Component Name, Version, License, and Hashes. All of these are included in every Checkmarx SCA SBOM as required fields.

In addition, we add a Properties section with extended information about the risks associated with each library.
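To give a feel for how these fields sit together, here is a minimal sketch of a CycloneDX-style JSON document built in Python. The component (example-lib), its license, and the property names under the example: prefix are all invented for illustration; the actual property names that Checkmarx SCA emits are not shown in this post.

```python
import hashlib
import json

# Placeholder bytes standing in for the real package archive.
artifact = b"placeholder bytes standing in for the package archive"

component = {
    "type": "library",
    "bom-ref": "pkg:pypi/example-lib@1.2.3",
    "name": "example-lib",                  # hypothetical package
    "version": "1.2.3",
    "purl": "pkg:pypi/example-lib@1.2.3",   # the ID (purl) field
    "licenses": [{"license": {"id": "MIT"}}],
    "hashes": [{"alg": "SHA-256",
                "content": hashlib.sha256(artifact).hexdigest()}],
    # Extended risk data goes into name/value properties (names are hypothetical).
    "properties": [
        {"name": "example:risk:highVulnerabilities", "value": "1"},
        {"name": "example:risk:outdated", "value": "true"},
    ],
}

bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.3",
    "version": 1,
    "components": [component],
    "dependencies": [{"ref": component["bom-ref"], "dependsOn": []}],
}

print(json.dumps(bom, indent=2))
```

The standard fields stay in their usual places, so a consumer that ignores the properties array still gets a usable component list.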

SBOM Component Dependencies

Below is a view of the component dependencies, which is part of the SBOM Checkmarx SCA generates.

Each component contains its dependent components, and each dependency section contains a set of required fields and a Properties section.

Conclusion

Checkmarx is dedicated to helping organizations secure the software they develop, one line of code at a time. In response to the proliferation of open source usage, recent supply chain attacks, and the executive order mentioned in the previous post, you can use Checkmarx SCA to easily create and maintain an SBOM of your own. Plus, you’ll get real-time risk data about the open source found in your codebase to help you manage your own risk better.

In the next blog in this SBOM/Software Supply Chain series, we’ll discuss the top three software supply chain risks you need to know about.

To see an SBOM being created live, don’t hesitate to request a demo.


SBOM: What It Is and Why You Should Care

Stephen Gates — Tue, 02 Nov 2021 12:18:43 +0000

Most health-conscious people pay close attention to labels in the grocery store. They want to know what’s in their food before they eat it, and they tend to make choices based on the ingredients, additives, preservatives, nutritional value, etc. The labels, and especially expiration dates, let buyers know if the food is healthy and safe to consume.

In this same way, it’s important to know what’s in your software before using it, so you know its ingredients are safe for your organization to consume. Since most software today is made up of open source components, combined with proprietary code (e.g., business logic) that makes everything work, having a list of all open source ingredients in the software you consume allows your organization to manage risk more effectively.

Therefore, leaning on what the manufacturing industry calls a “bill of materials” (BOM), we have the software BOM, or SBOM. An SBOM contains an accurate list of all open source software ingredients found in a software-based product. With this in mind—and due to a number of recent and notorious open source supply chain attacks drawing the attention of security experts, industry advocates, and even the US federal government—the current administration decided to act with regard to SBOMs.

On May 12, 2021, President Biden issued Executive Order 14028, “Improving the Nation’s Cybersecurity,” which states: “The term ‘Software Bill of Materials’ or ‘SBOM’ means a formal record containing the details and supply chain relationships of various components used in building software. Software developers and vendors often create products by assembling existing open source and commercial software components.”

Security experts agree that an initial step toward the EO’s goal of enhancing software supply chain security is transparency, and an SBOM is now required for anyone selling software to the US federal government and its agencies. So, do organizations that develop their own software in-house, to be used solely to support their own operations, need SBOMs for their own applications? The answer is likely yes.

Why an SBOM for the Software You Develop Makes Sense

In the past, organizations that developed their own software applications primarily did it in-house. Developers and security teams knew the origins of the core components that made up their applications and had full control over the recipe, so to speak. However, this model is no longer acceptable to most organizations, primarily due to time-to-market demands. Management and customers alike expect faster and more frequent releases/updates thanks to their familiarity with some of the media and retail giants that update their applications and release new builds on a daily basis—if not more frequently.

This rapid-release capability is largely owed to more organizations integrating significant amounts of open source into their application stacks. Due to this move, the open source supply chain, community, and contributors are expanding exponentially. Even a weekly build might pull in loads of open source components that may have been updated by the community since the last version in use, and if developers and security teams don’t allow the updates to be performed within their own applications, they will be deploying builds with potentially known vulnerabilities. If any of your applications import libraries from NPM, Maven Central, or any other registry, then you are using open source in your codebase.

If you have complete knowledge of what open source “ingredients” are required to build or compile the applications your organization relies on, then you can mitigate a number of risks when trying to improve the security of your applications. Therefore, if a new vulnerability (e.g., CVE) is issued, you can confirm if you are affected by comparing known vulnerable versions against your existing SBOM. If you have matches, you can quickly determine which issues must be resolved before the next build is released.
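That comparison can be sketched in a few lines. The example below assumes a simplified SBOM (a list of name and version pairs) and a hypothetical advisory that gives the first affected version and the first fixed version. It only handles plain dotted numeric versions; real version schemes call for a proper version parser.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2); only plain dotted numbers are supported."""
    return tuple(int(part) for part in text.split("."))


def affected_components(sbom, advisory):
    """Return SBOM entries whose version falls in [introduced, fixed)."""
    low = parse_version(advisory["introduced"])
    high = parse_version(advisory["fixed"])
    return [
        entry for entry in sbom
        if entry["name"] == advisory["package"]
        and low <= parse_version(entry["version"]) < high
    ]


# Hypothetical data for illustration only.
sbom = [
    {"name": "example-lib", "version": "1.4.1"},
    {"name": "example-lib", "version": "1.4.2"},
    {"name": "other-lib", "version": "1.0.0"},
]
advisory = {"package": "example-lib", "introduced": "1.0.0", "fixed": "1.4.2"}

print(affected_components(sbom, advisory))
```

Only the 1.4.1 entry matches here, because 1.4.2 is the first fixed version and other-lib is a different package.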

Ultimately, having a better view of the open source that your applications depend on will give you a clear view of your own vulnerabilities and associated risks. However, there are quite a few caveats concerning SBOMs that you should be aware of. The first is as follows.

SBOMs Are Not Spreadsheets

Generating an SBOM report may sound relatively simple, but in most cases, it’s not. As you likely know, modern software projects make use of a long list of third-party open source packages, each of which often calls on many other packages as dependencies. This can create an extensive tree of dependencies being used by your software in the form of direct dependencies, dependencies of dependencies, and so on. Simply put, trying to create and manage an SBOM using a spreadsheet is nearly impossible, and if you attempt to manage your open source usage in this fashion, it will likely get out of hand very quickly.
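To see how quickly the tree grows, consider this small sketch, which walks a made-up dependency graph and collects everything an application pulls in, directly or indirectly. The package names and the graph itself are hypothetical.

```python
def transitive_dependencies(graph, root):
    """Collect every package reachable from root, excluding root itself."""
    seen = set()
    stack = list(graph.get(root, []))
    while stack:
        package = stack.pop()
        if package in seen:
            continue  # already visited; this also guards against cycles
        seen.add(package)
        stack.extend(graph.get(package, []))
    return sorted(seen)


# Hypothetical graph: the application declares only two direct dependencies.
graph = {
    "app": ["web", "orm"],
    "web": ["http", "templates"],
    "orm": ["driver"],
    "http": ["tls"],
    "driver": ["tls"],
}

print(transitive_dependencies(graph, "app"))
```

Two declared dependencies turn into six packages that have to be tracked, and this toy graph is far shallower than a typical real project.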

At the end of the day, SBOMs just make sense. Understanding your own risk profile and doing everything possible to effectively manage and reduce your organization’s risk falls into the realm of due care, which is defined as, “the standard of care a reasonable person would exercise in the same situation or under similar circumstances.” If due care is not being upheld, then your organization, your developers, and your security teams could be viewed as negligent.

In the next blog in this SBOM/Software Supply Chain series, we’ll discuss what an SBOM report should include and highlight the easiest approach to generating a high-quality SBOM report using the Checkmarx Software Composition Analysis (CxSCA) solution.

To see an SBOM being created live, don’t hesitate to request a demo here.

To learn more, don’t forget to join our Technical Meetup Series to dive into topics like SBOM and open source libraries. Checkmarx experts Alex Cohen, James Brotsos, and I will walk you through security vulnerabilities you might not even know you had. We’ll also discuss the latest industry trends and application security best practices. It’ll be an interactive discussion, so bring your questions and pick our brains about how to improve your processes.


A Developer’s Guide to Managing Open Source Risks

Stephen Gates — Thu, 16 Sep 2021 16:36:34 +0000

We’re living in an open source world. If you’re a developer today, it’s very likely that – no matter where you work or what type of applications you build – you rely at least in part on open source. Indeed, the Linux Foundation reported in 2018 that 72 percent of organizations use open source software in one way or another, and that more than half actively incorporate open source code into their commercial products.

It’s easy to understand why open source is so pervasive. By importing open source libraries, extensions, and other resources into applications, developers save themselves from having to reinvent the wheel. They can reuse code that others have already written, which frees up more time to write innovative features that don’t yet exist.

Yet open source can have its downsides. In order to leverage open source responsibly and avoid the security and compliance challenges that often accompany open source code, developers need full visibility into the open source software they use and the risks associated with it.

To provide guidance on that point, this article walks through the most common risks of incorporating open source code into a larger codebase. It also identifies best practices for working with open source code.

Risk 1: Inconsistent Security Standards

When it comes to security, open source code varies tremendously. Some open source projects, like the Linux kernel, maintain very high security standards (although even they sometimes let vulnerabilities slip by). Others, like random tools on GitHub that were written by CS students for a class assignment, don’t always set the bar so high.

This doesn’t mean that open source software is always insecure. Sometimes, it’s very secure. But sometimes, it’s not very secure at all.

This inconsistency presents a major challenge for developers. It means they have to vet the security of third-party open source code on a case-by-case basis. Although open source advocates like to claim that “many eyeballs make all bugs shallow,” and that vulnerabilities are therefore likely to be discovered and fixed quickly by the open source community, the fact is that the number of eyeballs on a given open source codebase can vary significantly. The less attention an open source project gets, and the less experienced its developers, the more likely it is to have low security standards.

Risk 2: Unknown Source Code Origins

Sometimes, it’s hard to vet the security of open source code because it’s unclear where the code originated in the first place.

For example, you might find a source code tarball on a website that offers little information on who wrote the code. Maybe there is a README inside the file that attributes the code to someone, but you have no way of verifying its authenticity.

Or, perhaps you clone code from a git repository, assuming that the maintainer of the repo is the code’s author. But the repo could contain code that actually originated somewhere else, and was simply copied into the repo you use. Here again, any documentation files that mention authorship are difficult to verify.

Being unsure where code originated matters because it’s easier to trust code when you know it was written by experienced, well-intentioned developers. Obviously, you should scan the code for vulnerabilities either way, but you can make better decisions about whether or not to use third-party code if you have confidence in where it came from.

Risk 3: Licensing Non-Compliance

Developers have a tendency to think they’re experts in open source licenses. But most are not. They often misunderstand licensing terms and hold false beliefs, such as “the GPL says you can’t sell your code for money” or “under the MIT license, you can do whatever you want as long as you attribute the original developers.”

Misconceptions like these create risk in the form of non-compliance with open source licensing terms. If developers don’t understand the specific requirements of the licenses that govern each of the various open source components they use, they may violate licensing agreements. And while it’s easy to assume that no one will actually sue you for breaking an open source license, such lawsuits are more frequent than you may think.

Complicating matters is the fact that open source projects occasionally change their licensing terms – as Elastic famously did in 2021, for example. This means that it’s not enough to determine the licensing requirements of open source code the first time you use it. You need to reevaluate every time the code changes versions.

Best Practices for Managing Open Source Risks

The risks identified above are not a reason to avoid open source. When managed responsibly, open source provides developers a range of benefits that outweigh the risks.

To keep the risks in check, consider the following best practices for working with open source code:

Download code from the project’s website or from GitHub repos that are linked from the project’s site. This is better than pulling code from random GitHub repositories, where there is a risk that a seemingly legitimate repo actually contains vulnerability-ridden code.
Always scan your code, no matter who wrote it or how certain you are of its origin.
Continuously validate the licensing terms of all open source components that you use.
Collaborate with your legal team to verify that your practices surrounding open source code actually comply with licensing terms. Don’t assume your developers “just know” what the licenses require.
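The license validation practice lends itself to automation. Below is a minimal sketch that checks a component list against an allowlist of license identifiers; the allowlist and the component data are hypothetical, and the actual policy should come from your legal team rather than from a developer's guess.

```python
# Hypothetical policy: identifiers your legal team has approved.
ALLOWED_LICENSES = {"MIT", "BSD-3-Clause", "Apache-2.0"}


def license_violations(components, allowed=ALLOWED_LICENSES):
    """Return (name, license) pairs for components outside the allowlist."""
    return [
        (entry["name"], entry.get("license") or "UNKNOWN")
        for entry in components
        if entry.get("license") not in allowed
    ]


# Hypothetical component list for illustration.
components = [
    {"name": "example-lib", "license": "MIT"},
    {"name": "copyleft-lib", "license": "GPL-3.0-only"},
    {"name": "mystery-lib", "license": None},
]

print(license_violations(components))
```

Running a check like this on every build, rather than once at adoption time, is what catches a license that changes between versions.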

Again, open source is an excellent tool. The life of the modern developer would probably be considerably more tedious without the ubiquity of freely reusable open source code. But to avoid shooting yourself in the foot when working with open source, it’s crucial to manage the inherent risks.

Chris Tozzi has worked as a journalist and Linux systems administrator. He has particular interests in open source, agile infrastructure, and networking. He is Senior Editor of content and a DevOps Analyst at Fixate IO. His latest book, For Fun and Profit: A History of the Free and Open Source Software Revolution, was published in 2017.


Exploitable Path – Advanced Topics

Alex Livshiz — Wed, 24 Mar 2021 07:58:14 +0000

This is the third and final blog on Exploitable Path – a unique feature that allows our customers to prioritize vulnerabilities in open-source libraries. In the first blog, we introduced the concept of Exploitable Path and its importance. The conclusion was that a vulnerability in a library is considered exploitable when:

The vulnerable method in the library is called directly or indirectly from a user’s code.
An attacker can reach this method with a carefully crafted input and trigger the vulnerability.

In the second blog, we discussed some of the challenges in developing such a feature, and our unique approach. Mainly:

We use a query language over the CxSAST engine to abstract queries over source code. This allows a more language-agnostic approach, so that Exploitable Path works for every programming language supported by CxSAST.
We run various CxSAST queries to build a full call graph of a user’s source code and its libraries’ source code. By crossing it with vulnerability data, we can know if a vulnerability is exploitable or not.

In this last blog in the series, we will cover more advanced topics we faced during the development of Exploitable Path.

Challenge no. 1 – Supporting Multiple Library Versions

The public data on a CVE usually contains affected versions, but how can we use this information to support Exploitable Path across versions? Meaning, if the source code of a library changes between various versions, how can we have the required data for Exploitable Path for each of those versions?
Let’s assume we have a user’s source code that uses a single open-source library. This library contains a vulnerability, and using MITRE’s CVE data, we can figure out the affected versions.
To be able to assess if the vulnerability is exploitable, we need the following for each version of the library:

A call graph of the library’s code. This can be done automatically using CxSAST.
Is the current version vulnerable?
- If it is, the inner method in which the exploitation occurs is required.

Now the question is, “How can we find this inner method for each vulnerable version?” Going over each version manually is not practical, especially since a library can have hundreds of versions.
The first part of the solution is to find the inner method that’s vulnerable. Usually, a vulnerability is tied to a specific method (or methods) responsible for a certain piece of logic. Pull requests and commits for the relevant CVE help our analysts uncover the relevant method.
Next, we generate a fingerprint of the fix – if a version contains the fix, we can mark it as not vulnerable to this CVE. This is where our powerful static code analysis tool comes into play again, making it easy to re-assess hundreds of library versions for the vulnerability.
Re-assessing the affected versions of a vulnerability is crucial. As it turns out, this data on public websites like MITRE is often not precise. Versions that are marked as vulnerable can be safe and vice versa. This can be the result of human error, or even a slight difference in the version tags between the public registry and the git repository in which the library is developed. By searching for the fingerprint of the fix, we can ensure the quality and accuracy of our vulnerability data.
Using the in-depth analysis process, the vulnerable method is marked for every affected version, eventually resulting in a very accurate Exploitable Path scan.
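To illustrate the general idea of a fix fingerprint, here is a deliberately simplified sketch. It treats the fingerprint as the whitespace-normalized text of the fix and marks a version as vulnerable when that text is absent from its source. This is not how CxSAST does it; the function names, the sample sources, and the normalization are assumptions made purely for illustration.

```python
def normalize(code):
    """Strip all whitespace so formatting differences do not matter."""
    return "".join(code.split())


def contains_fix(source, fix_snippet):
    """True if the normalized source contains the normalized fix."""
    return normalize(fix_snippet) in normalize(source)


def vulnerable_versions(sources_by_version, fix_snippet):
    """Return the versions whose source lacks the fix fingerprint."""
    return sorted(
        version for version, source in sources_by_version.items()
        if not contains_fix(source, fix_snippet)
    )


# Hypothetical library sources: version 1.1 adds a length check.
sources = {
    "1.0": "def parse(c):\n    return load(c)\n",
    "1.1": "def parse(c):\n    if len(c) > MAX:\n        raise ValueError\n    return load(c)\n",
}
fix = "if len(c) > MAX:\n    raise ValueError"

print(vulnerable_versions(sources, fix))
```

A text match like this is brittle (a renamed variable defeats it), which is one reason a syntax-aware engine is the better tool for the job.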

Challenge no. 2 – Data Flow

Just because your code calls a vulnerable method, that doesn’t mean you are automatically at risk. To assess the risk properly (and avoid false positives), it’s crucial to have both a call graph and a DFG (Data Flow Graph) of the code.
Let’s start with an example, and assume that a method called parse(content) has a DoS (Denial of Service) vulnerability given the right input. If parse() is only called with a constant value, meaning parse(CONSTANT_VALUE), there is no attack surface for an attacker to exploit it and cause a DoS. On the other hand, if a user of the application controls the input parameter of parse(), it’s a different story. For example, this input can be a comment or other data provided by the user. In such a case, the attacker can easily exploit the vulnerability and craft the required input.
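The constant-versus-user-input distinction can be sketched with Python's built-in ast module, used here as a stand-in for a real static analysis engine. The function classify_calls is hypothetical and only looks at literal arguments; a real data flow analysis would also resolve named constants such as CONSTANT_VALUE and follow values across assignments.

```python
import ast


def classify_calls(source, method="parse"):
    """Return (line, verdict) for each direct call to `method`.

    The verdict is 'constant' when every positional argument is a literal,
    and 'non-constant' otherwise.
    """
    results = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == method):
            literal_only = all(isinstance(arg, ast.Constant) for arg in node.args)
            results.append((node.lineno, "constant" if literal_only else "non-constant"))
    return sorted(results)


SAMPLE = '''
parse("fixed text")
comment = input()
parse(comment)
'''

print(classify_calls(SAMPLE))
```

The first call offers no attack surface, while the second is fed by input() and deserves a closer look.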
The reality is more complex, as there are various ways data can be transferred in code:

Input parameters
Global or class members
The return value of another method invocation

Also, not all data paths are necessary for exploitation. For example, a method parseRequest(HttpRequest request, Config config) can be vulnerable to exploitation using only the HttpRequest.Content member in the request parameter.
Now we understand the importance, but how do you incorporate DFG in the process of assessing a vulnerability? To be more specific, how can we know that a vulnerability is exploitable from a data flow point of view?
First, we use CxSAST to build a DFG. We start at the vulnerable method and trace back the origin of each data point. Eventually we’ll reach one of the following cases:

A constant value. This is not exploitable, of course.
An input parameter of a method that is not called by other methods. This is a potential data flow compromise, as in the context of the static code scan, we don’t know how the method is invoked.
An internal method of the language is called, such as open() in Python.
A method of a different library is called, and its source code is not available.

The last two cases are the most interesting ones, and have two complementary approaches:

As a rule of thumb, mark those methods as a potential for data flow compromise since the inner implementation is unknown.
Mark specific methods as definite data flow compromises. For example, reading contents from a database, pipe, or file. The same goes for parsing HTTP packets, pulling a message from a message queue, etc.

These two approaches are the basis for DFG support in assessing a vulnerability for exploitability.
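A toy version of this backtracking is sketched below. The graph format, the kind labels, and the set of definite sources are all invented for illustration and say nothing about how CxSAST represents a DFG internally.

```python
# Hypothetical names of calls known to return attacker-reachable data.
DEFINITE_SOURCES = {"read_http_body", "read_file", "db_query", "queue_pop"}


def classify(graph, start):
    """Trace a value back to its origins and summarize the data flow risk.

    graph maps a value name to {"kind": ..., "from": [...], "call": ...}, where
    kind is one of: derived, constant, parameter, builtin, external.
    """
    verdict = "not exploitable"
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        info = graph[node]
        if info["kind"] == "derived":
            stack.extend(info["from"])   # keep walking backwards
        elif info["kind"] == "constant":
            continue                     # constants carry no attacker data
        elif info.get("call") in DEFINITE_SOURCES:
            return "definite"            # a known external data source
        else:
            verdict = "potential"        # unknown caller or unknown implementation
    return verdict


constant_flow = {"arg": {"kind": "derived", "from": ["lit"]},
                 "lit": {"kind": "constant"}}
http_flow = {"arg": {"kind": "derived", "from": ["body"]},
             "body": {"kind": "external", "call": "read_http_body"}}
param_flow = {"arg": {"kind": "derived", "from": ["p"]},
              "p": {"kind": "parameter"}}

print(classify(constant_flow, "arg"), classify(http_flow, "arg"), classify(param_flow, "arg"))
```

The rule of thumb shows up as the "potential" default, and the curated list of specific methods shows up as DEFINITE_SOURCES.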

Summary

In this blog we covered two additional advanced topics in Exploitable Path. We started with the problem of supporting various library versions, and how this is solved using the in-depth analysis process. Then, we discussed the integration of DFG in the vulnerability evaluation process, and how to backtrack the flow of data in the code.
With CxSCA, Checkmarx enables your organization to address open source vulnerabilities earlier in the SDLC and cut down on manual processes by reducing false positives and background noise, so you can deliver secure software faster and at scale. For a free demonstration of CxSCA, please contact us here.

Exploitable Path – How to Solve a Static Analysis Nightmare

Alex Livshiz — Wed, 03 Feb 2021 15:10:36 +0000

In my previous blog, I walked you through the reasoning and importance of the Exploitable Path feature in the Checkmarx CxSCA solution. We discussed the challenges of prioritizing vulnerabilities in open source dependencies and defined what it means for a vulnerability to be exploitable:

The vulnerable method in the library is called directly or indirectly from a user’s code.
An attacker can reach the method with a carefully crafted input and trigger the vulnerability.

Now that we know the scope of the problem, let’s dive into how uncovering an exploitable path is done.

Prerequisites

1. A SAST Engine

Every programming language has its set of quirks and features. Some use brackets; some don’t. Some are loosely typed; others are strict. To be able to develop Exploitable Path, we needed a certain level of abstraction: in effect, a “common language.” This is particularly hard when high-level concepts like “imports” behave differently across languages.
To solve this issue, Checkmarx uses its powerful CxSAST engine. CxSAST breaks down the code of every major language into an Abstract Syntax Tree (AST), which provides much of the needed abstraction. Imports, call graphs, method definitions, and invocations all become part of a tree.

2. An AST Query Language

With an AST in hand, the next step is a query language capable of even further abstraction. Checkmarx uses CxQuery, which can run queries to answer various questions, for example:

What are all the import statements in a codebase?
Which methods have no definition but only usage?
What’s the namespace of every file?

With a tool like CxQuery, you can get results in a unified format regardless of the programming language, such as C#, Java, or Python.

Assumptions

1. Vulnerable Methods Are Known

Usually, the public data on a CVE provides a CVSS score, affected products, and versions, etc. However, the inner method in which the vulnerability is triggered is usually unknown. To help with this dilemma, the CxSCA Research Team has application security analysts on board who are responsible for analyzing CVEs and finding the method in which the vulnerability occurs. So, for the rest of the post we can assume that for every CVE, we know the method that triggers it.

2. A SAST Scan Is Limited to One Project

You can think of a project as a folder containing all of the user’s source code, without the code of any third-party packages. This makes life easier since there’s a clear distinction between a user’s code and a dependency’s code.
For example, if the user’s code requires a single third-party package, two scans can be made:

A scan on the user code.
A scan on the third-party package.

Static Analysis Steps

Now that we’ve covered the prerequisites and assumptions, let’s understand the challenge itself by looking at the following example, written in Python.
Here’s a simple piece of code that imports an open source library and calls a method in it. This method in turn calls a vulnerable method.

The code of OSLib will be:
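A minimal sketch of the two files, reconstructed from the method names used in the steps below. The function bodies, the main() entry point, and the in-memory registration of OSLib (used only so the sketch runs as a single file) are assumptions.

```python
import sys
import types

# --- OSLib.py: the third-party package (bodies are assumed) ---
OSLIB_SOURCE = '''
def inner_vuln_method(data):
    # Stand-in for the method in which the vulnerability is triggered.
    return f"processed {data}"


def lib_foo(data):
    # Public API that indirectly reaches the vulnerable method.
    return inner_vuln_method(data)
'''

# Register the package in memory so the import below works in one file.
_oslib = types.ModuleType("OSLib")
exec(OSLIB_SOURCE, _oslib.__dict__)
sys.modules["OSLib"] = _oslib

# --- user code ---
from OSLib import lib_foo


def foo():
    # Defined in the user's code, so it resolves locally.
    return "user logic"


def main():
    foo()
    # Defined in OSLib, so it is unresolved from the user's point of view.
    return lib_foo("request data")


if __name__ == "__main__":
    print(main())
```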

Here are the steps:

1. Find Unresolved Methods in User’s Code

The user code is parsed with CxSAST, and a query is run to detect all methods that are called but have no definition in the user’s code. Such methods are unresolved and therefore belong to a third-party package. In our example, there are two calls:

foo() – is defined in the user code and hence resolved.
lib_foo() – is defined in OSLib, not in the user code, and hence is an unresolved method that must be imported.

In our case, there’s a single import to OSLib, so it’s obvious where the method was imported from.
Usually, there will be multiple imports, in which case a signature of the method is collected and searched across imported libraries. Assuming the code is functional and works, there will always be a single match.
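The idea behind this step can be sketched with Python’s standard ast module standing in for CxSAST and CxQuery, which are not shown here. The sample source and the unresolved_calls() helper are hypothetical, and the sketch only handles plain name calls and `from ... import ...` statements.

```python
import ast

# Hypothetical user code, matching the example described above.
USER_CODE = '''
from OSLib import lib_foo

def foo():
    return "user logic"

def main():
    foo()
    lib_foo("request data")
'''


def unresolved_calls(source):
    """Map each called-but-undefined name to the module it was imported from."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))

    # Names that have a definition in this code base.
    defined = {n.name for n in nodes if isinstance(n, ast.FunctionDef)}

    # Names brought in by `from module import name [as alias]`.
    imported = {}
    for n in nodes:
        if isinstance(n, ast.ImportFrom):
            for alias in n.names:
                imported[alias.asname or alias.name] = n.module

    # Names that are called directly, e.g. foo(...).
    called = {
        n.func.id
        for n in nodes
        if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
    }

    # A value of None means no matching import was found (e.g. a builtin).
    return {name: imported.get(name) for name in called - defined}


if __name__ == "__main__":
    print(unresolved_calls(USER_CODE))
```

With several imports, the same mapping tells the analysis which package to search for each unresolved method.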

2. Find Exported Methods in Package Code

The code of package OSLib is also parsed with CxSAST, and a query is run to find all exported methods. In languages like C# and Java, an exported method is a public method in a public class that can be used by the user’s code. In Python, all methods are public so the exported methods in our example will be lib_foo() and inner_vuln_method().
This data is essential since it’s used to match unresolved methods in the step above.
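As a rough illustration, again using Python’s ast module rather than CxQuery, exported methods of a Python package can be approximated as its top-level function definitions. The package source and the exported_methods() helper are assumptions made for this sketch.

```python
import ast

# Hypothetical source of the OSLib package.
PACKAGE_CODE = '''
def inner_vuln_method(data):
    return data

def lib_foo(data):
    return inner_vuln_method(data)
'''


def exported_methods(source):
    """Return the names of all top-level functions, sorted alphabetically."""
    tree = ast.parse(source)
    return sorted(
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )


if __name__ == "__main__":
    exported = exported_methods(PACKAGE_CODE)
    print(exported)
    # Matching an unresolved method from step 1 is then a membership test.
    print("lib_foo" in exported)
```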

3. Call Graph

A query for a call graph is run on both user’s code and package code.
For the user’s code, the graph is:

For the package code, the result is similar:
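Both graphs can be represented as adjacency lists that map each method to the methods it calls. The sketch below derives them with Python’s ast module as a stand-in for the CxQuery call graph query; the two source strings and the call_graph() helper are hypothetical.

```python
import ast

USER_CODE = '''
from OSLib import lib_foo

def foo():
    return "user logic"

def main():
    foo()
    lib_foo("request data")
'''

PACKAGE_CODE = '''
def inner_vuln_method(data):
    return data

def lib_foo(data):
    return inner_vuln_method(data)
'''


def call_graph(source):
    """Map every function name to the sorted list of names it calls."""
    tree = ast.parse(source)
    graph = {}
    for fn in ast.walk(tree):
        if not isinstance(fn, ast.FunctionDef):
            continue
        callees = set()
        # Collect direct calls by name found anywhere in the function.
        for node in ast.walk(fn):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callees.add(node.func.id)
        graph[fn.name] = sorted(callees)
    return graph


if __name__ == "__main__":
    print(call_graph(USER_CODE))
    print(call_graph(PACKAGE_CODE))
```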

4. Find Exploitable Path

Using all the data collected so far, a full call graph is built:

All methods in the graph are checked against the known vulnerable methods. In our example, inner_vuln_method() is the vulnerable method, and it is reachable from the user’s code, so an Exploitable Path is found.
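A minimal sketch of this final step, assuming the two call graphs are available as adjacency lists and that method names are unique across user and package code. The graph literals, the main entry point, and the helper names are hypothetical; a breadth-first search is one straightforward way to find a path, not necessarily the one CxSCA uses.

```python
from collections import deque

# Hypothetical results of the earlier steps.
USER_GRAPH = {"main": ["foo", "lib_foo"], "foo": []}
PACKAGE_GRAPH = {"lib_foo": ["inner_vuln_method"], "inner_vuln_method": []}
VULNERABLE_METHODS = {"inner_vuln_method"}


def merge_graphs(user, package):
    """Join both graphs; unresolved user calls link to package definitions."""
    full = {name: list(callees) for name, callees in user.items()}
    for name, callees in package.items():
        full.setdefault(name, []).extend(callees)
    return full


def find_exploitable_path(graph, entry, vulnerable):
    """Breadth-first search from entry to the first vulnerable method."""
    queue = deque([[entry]])
    seen = {entry}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in vulnerable:
            return path
        for callee in graph.get(current, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None  # The vulnerable method is not reachable from entry.


if __name__ == "__main__":
    full_graph = merge_graphs(USER_GRAPH, PACKAGE_GRAPH)
    print(find_exploitable_path(full_graph, "main", VULNERABLE_METHODS))
```

If the search returns None for every entry point, the vulnerability is present in the dependency but not reachable from the user’s code.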

Further Topics

The example above provided a simple demonstration of how Exploitable Path is analyzed, but in reality, this problem is much harder. Some other research questions we faced, which are not discussed in this blog post, are:

Detecting Exploitable Path in a dependency of a dependency
Matching challenges between user’s code and package code
Integration of DFG (Data Flow Graph)

Summary

By using CxSAST with queries written in CxQuery, we created an abstraction layer to statically detect vulnerabilities that are exploitable. A single algorithm can detect Exploitable Path across multiple programming languages, and unlike other solutions on the market, CxSCA can easily extend support for more languages. Currently, Java and Python are already supported, with many more languages to follow.
With CxSCA, Checkmarx enables your organization to address open source vulnerabilities earlier in the SDLC and cut down on manual processes by reducing false positives and background noise, so you can deliver secure software faster and at scale. For a free demonstration of CxSCA, please contact us here.

In the next post in this series, we’ll look at some of the challenges we faced as we developed the Exploitable Path feature.

Software Composition Analysis: Why Exploitable Path Is Imperative

Alex Livshiz — Wed, 20 Jan 2021 07:20:38 +0000

If you look at the way code is written today vs. a few years back, one of the major changes is the transition to open source. What was once considered an unsafe methodology has grown and matured, and now almost every software project uses open source libraries. Today, software engineers prefer to use existing open source code instead of writing everything themselves.

Open source code’s benefits are significant

Code development can be faster:
It’s now more about welding existing pieces together, rather than building them yourself. Open source libraries solve fundamental engineering problems, allowing engineers to focus their time on more complex tasks.
Tools like package managers make it easy to manage and add third-party dependencies:
Most programming languages and IDEs come with integrated package manager support.
Over time, the way APIs are exported and used becomes clearer and simpler:
Open source maintainers offer clear APIs, simple documentation, and code samples.

Every new technology has its risks, though, and attackers can exploit weak points in software that uses open source. An attacker can gain information about open source libraries used by an application, and in other cases, can simply maintain an arsenal of exploits for popular open source packages and attempt to use these until one succeeds. In the case of open source packages, attackers have full access to:

Its code, which they can scan for zero-day vulnerabilities.
Issues and security tickets that are managed on GitHub, GitLab, etc., which can help find vulnerable areas for exploitation.
Current and past vulnerabilities, which can be very helpful when the library in use is not up to date. These vulnerabilities have detailed descriptions and advisories, and even the patches themselves are open source. An attacker can utilize those vulnerabilities and attempt to attack the application, and if the application uses an old version of the library, the attack is likely to succeed.

To manage such risks, a software composition analysis (SCA) tool such as Checkmarx CxSCA detects the third-party libraries and versions in use and informs you of existing vulnerabilities. It’s important to recognize that not every reported vulnerability necessarily applies to your project, since some of the affected code may not be in use.

Prioritizing Vulnerabilities

Tracking existing vulnerabilities is important, but it’s not enough. The average project has dependencies that in turn have their own dependencies. Overall, there can be hundreds or thousands of libraries with hundreds of vulnerabilities in your project.

Fixing those vulnerabilities can take a lot of time, and developers also need to put effort into developing new features. Managing security vulnerabilities in third-party packages is not a one-time task but an ongoing process, so it’s important for an SCA tool to prioritize the risks. This way, developers know which risks are most crucial to address.

But how do you prioritize a vulnerability?

The popular method is to prioritize vulnerabilities by the CVSS—a score given to a vulnerability based on the impact, how easy it is to exploit, etc. Every vulnerability that is made public has this score. However, this methodology is too simplistic, since exploitability is the most crucial aspect.

Exploitability of a Vulnerability

Let’s assume that a vulnerability is triggered by a foo() method in a library you’re using. If your code doesn’t call foo() in any flow, either directly or indirectly, the vulnerability is in fact not exploitable. If so, the priority of fixing it is low and efforts should be redirected to exploitable vulnerabilities instead.

Looking at it from an attacker’s perspective, for a vulnerability to be exploitable:

The method foo() needs to be called. This can require a carefully crafted input, the processing of which will trigger a call for foo().
The attacker needs to control the data flow for foo(). Usually, calling a method with “regular input” won’t trigger any unwanted behavior. Unwanted behavior is triggered when a carefully crafted input from the attacker reaches foo(), meaning the vulnerable method needs to be callable and its input controlled.

Developers today can use an entire library for a single API method out of dozens of APIs. Also, libraries they use have their own third-party libraries, with only a partial use of available APIs. This means that given a vulnerability in one of your dependencies, the probability of exploiting it can be below 5%. This has serious implications:

Current prioritization of vulnerabilities is unfocused. Instead of fixing exploitable vulnerabilities first, effort is put into risks that may be irrelevant.
Many reported vulnerabilities are effectively false positives. You would assume that a critical risk is a top priority, but if the relevant code flow can’t be reached, there’s nothing critical here.
The true number of vulnerabilities that need to be addressed is actually much lower than assumed, and that’s good news for developers. Fewer vulnerabilities means far less effort to remediate them.
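The shift in prioritization described above can be sketched as a sort that puts exploitable findings first and only then falls back to the CVSS score. The Finding type, the identifiers, and the scores below are made-up illustration data, not real vulnerabilities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    ident: str          # Placeholder identifier, not a real CVE.
    cvss: float         # CVSS base score, 0.0 to 10.0.
    exploitable: bool   # True if an exploitable path was found.


def prioritize(findings):
    """Order findings: exploitable first, then by descending CVSS score."""
    return sorted(findings, key=lambda f: (not f.exploitable, -f.cvss))


if __name__ == "__main__":
    findings = [
        Finding("VULN-A", cvss=9.8, exploitable=False),
        Finding("VULN-B", cvss=6.5, exploitable=True),
        Finding("VULN-C", cvss=7.2, exploitable=True),
    ]
    for finding in prioritize(findings):
        print(finding.ident, finding.cvss, finding.exploitable)
```

Sorting by CVSS alone would put VULN-A at the top of the list even though no path reaches its vulnerable method.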

CxSCA uses our SAST engine (CxSAST) to statically analyze the project’s source code and the source code of all the packages it uses. By examining the resulting call graphs and data flows, it can evaluate the exploitability and risk of each vulnerability.

With CxSCA, Checkmarx enables your organization to address open source vulnerabilities earlier in the SDLC, and cut down on manual processes by reducing false positives and background noise, so you can deliver secure software faster and at scale. For a free demonstration of CxSCA, please contact us here.

In the next blog post, we’ll dig deeper into the research behind Exploitable Paths, sharing challenges and insights we collected along the way.

Apache Unomi CVE-2020-13942: RCE Vulnerabilities Discovered

Eugene Rojavski — Tue, 17 Nov 2020 09:00:14 +0000

“Apache Unomi is a Java Open Source customer data platform, a Java server designed to manage customers, leads and visitors’ data and help personalize customers experiences,” according to its website. Unomi can be used to integrate personalization and profile management within very different systems such as CMSs, CRMs, issue trackers, and native mobile applications. Unomi became an Apache Top-Level Project in 2019 and is built with high scalability and ease of integration in mind.
Unomi contains an abundance of data and features tight integrations with other systems, which makes it a highly desirable target for attackers. For that reason, the Checkmarx Security Research Team analyzed the platform to uncover potential security issues. The findings are detailed below.

Executive Summary: CVE-2020-13942

What We Found

Apache Unomi allowed remote attackers to send malicious requests with MVEL and OGNL expressions that could contain arbitrary classes, resulting in Remote Code Execution (RCE) with the privileges of the Unomi application. MVEL and OGNL expressions are evaluated by different classes inside different internal packages of the Unomi package, making them two separate vulnerabilities. The severity of these vulnerabilities is heightened since they can be exploited through a public endpoint, which should be kept public by design for the application to function correctly, with no authentication, and no prior knowledge on the attacker’s part.
Both vulnerabilities, designated as CVE-2020-13942, have a CVSS score of 10.0 (Critical), as they lead to complete compromise of the Unomi service’s confidentiality, integrity, and availability, in addition to allowing access to the underlying OS.

Details

Previous RCE Found in Unomi

Unomi offers a restricted API that allows retrieving and manipulating data, in addition to a public endpoint where applications can upload and retrieve user data. Unomi allows complex conditions in the requests to its endpoints.
Unomi conditions rely on expression languages (EL), such as OGNL or MVEL, to allow users to craft complex and granular queries. The EL-based conditions are evaluated before accessing data in the storage.
In versions prior to 1.5.1, these expression languages were not restricted at all, leaving Unomi vulnerable to RCE via Expression Language Injection. An attacker was able to execute arbitrary code and OS commands on the Unomi server by sending a single request. This vulnerability was classified as CVE-2020-11975 and was fixed. However, upon further investigation, the Checkmarx Security Research Team discovered that the fix was not sufficient and could be trivially bypassed.

Patch Not Sufficient – New Vulnerabilities Discovered

The patch for CVE-2020-11975 introduced SecureFilteringClassLoader, which checks the classes used in the expressions against an allowlist and a blocklist. The SecureFilteringClassLoader relies on the assumption that every class in both MVEL and OGNL expressions is loaded using the loadClass() method of the ClassLoader class. The SecureFilteringClassLoader overrides the ClassLoader loadClass method and introduces the allowlist and blocklist checks. This assumption happened to be incorrect. There are multiple ways of loading a class other than calling the loadClass() method, which leads to the security control bypass and leaves Unomi open to RCE.
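The underlying weakness, a filter that guards only one way of obtaining a class, can be illustrated with a loose Python analogy. This is not Unomi’s Java code: guarded_import(), evaluate(), and the blocklist are hypothetical, and Python’s import hook merely plays the role that loadClass() plays in the patch.

```python
import builtins
import os
import shutil

# Hypothetical blocklist, checked only when a module is imported.
BLOCKED_MODULES = {"os", "subprocess"}


def guarded_import(name, *args, **kwargs):
    """Import hook that rejects blocked modules (the single guarded path)."""
    if name.split(".")[0] in BLOCKED_MODULES:
        raise ImportError(f"module {name!r} is blocked")
    return builtins.__import__(name, *args, **kwargs)


def evaluate(expression):
    """Evaluate an expression with the guarded import and one helper object."""
    env = {
        "__builtins__": {"__import__": guarded_import},
        "shutil": shutil,  # Already-loaded object exposed to expressions.
    }
    return eval(expression, env)


if __name__ == "__main__":
    try:
        evaluate("__import__('os').name")
    except ImportError as exc:
        print("blocked:", exc)

    # Same module, reached through an object that is already loaded,
    # so the import hook is never consulted.
    print("bypassed:", evaluate("shutil.os.name"))
```

The check is sound for the path it covers, but any object already in scope that references a blocked module gives the expression another route to it.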
First, MVEL expressions can in some cases use already instantiated classes, like Runtime or System, without calling loadClass(). As a result, Unomi 1.5.1, the latest version at the time of the research, still evaluates MVEL expressions inside a condition even when they contain arbitrary classes.
The following HTTP request has a condition with a parameter containing an MVEL expression (script::Runtime r = Runtime.getRuntime(); r.exec("touch /tmp/POC");). Unomi parses the value and executes the code after script:: as an MVEL expression. The expression in the example below obtains a Runtime object and runs a “touch” OS command, which creates an empty file in the /tmp directory.

Vulnerability #1

Second, there is a way to load classes inside OGNL expressions without triggering the loadClass() call. The following HTTP request gets Runtime and executes an OS command using the Java reflection API.

Vulnerability #2

The payload may look scary, but it’s simply Runtime r = Runtime.getRuntime(); r.exec("touch /tmp/POC"); written using the reflection API and wrapped in OGNL syntax.
Both presented approaches successfully bypass the security control introduced in version 1.5.1, making it vulnerable to RCE in two different locations.

Possible Attack Scenarios

Unomi can be integrated with various data storage and data analytics systems that usually reside in the internal network. The vulnerability is triggered through a public endpoint and allows an attacker to run OS commands on the vulnerable server. The vulnerable public endpoint makes Unomi an ideal entry point to corporate networks. Its tight integration with other services also makes it a stepping stone for further lateral movement within an internal network.

Summary of Disclosure and Events

After discovering and validating the vulnerabilities, we notified Apache of our findings and worked with them throughout the remediation process until they informed us everything was appropriately patched.
To learn more about these types of vulnerabilities, OWASP and CWE have descriptions, examples, consequences, and related controls, as shown in the following links:

Additionally, read the code, analyze the fix, and learn how to mitigate similar issues via our interactive CxCodebashing lesson here.

Timeline of Disclosure

June 24, 2020 – Vulnerability disclosed to Apache Unomi developers
August 20, 2020 – Code with the fix merged to master branch
November 13, 2020 – Version 1.5.2 containing the fixed code is released
November 17, 2020 – public disclosure

Recommendations

The evaluation of user-defined expression language statements is dangerous and hard to constrain. Struts 2 is an excellent example of how hard it is to restrict dynamic OGNL expressions and avoid RCE. Such attempts to impose usage restrictions from within or on the EL, rather than restricting the use of tainted EL altogether, are an iterative approach rather than a definitive one. Instead, a more reliable means to prevent RCE is to remove support for arbitrary EL expressions entirely and provide a set of static expressions that rely on dynamic parameters.
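A minimal sketch of that recommendation, written in Python for brevity and not taken from Unomi: the server owns a fixed set of comparison operations, and a request can only choose one by name and supply data for it. The condition fields (field, op, value) and the operator names are assumptions made for this illustration.

```python
import operator

# Static, server-defined operations; requests can only pick one by name.
ALLOWED_OPERATORS = {
    "equals": operator.eq,
    "greaterThan": operator.gt,
    "lessThan": operator.lt,
    "contains": lambda actual, expected: expected in actual,
}


def evaluate_condition(profile, condition):
    """Apply one allow-listed comparison; request data is never executed."""
    compare = ALLOWED_OPERATORS.get(condition["op"])
    if compare is None:
        raise ValueError(f"unsupported operator: {condition['op']!r}")
    field = condition["field"]
    if field not in profile:
        return False
    return compare(profile[field], condition["value"])


if __name__ == "__main__":
    profile = {"age": 30, "country": "NL"}
    print(evaluate_condition(profile, {"field": "age", "op": "greaterThan", "value": 18}))
    try:
        evaluate_condition(profile, {"field": "age", "op": "script", "value": "anything"})
    except ValueError as exc:
        print("rejected:", exc)
```

Because the request only supplies a key into a fixed table plus plain values, there is no expression for an attacker to inject.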
Static Application Security Testing solutions, like CxSAST, can detect OGNL injections in source code and prevent this sort of vulnerability from making its way into production. Meanwhile, software composition analysis (SCA) solutions, such as CxSCA, will have the necessary data about the vulnerable package and will update CxSCA users as soon as the vulnerability is publicly disclosed. To learn how to mitigate similar issues, visit our CxCodebashing lesson here.

Final Words/Summary

This type of research is part of the Checkmarx Security Research Team’s ongoing efforts to drive the necessary changes in software security practices among all organizations. Checkmarx is committed to analyzing open source software to help development teams build and deploy more-secure applications. Our database of open source libraries and vulnerabilities is cultivated by the Checkmarx Security Research Team, empowering CxSCA customers with risk details, remediation guidance, and exclusive vulnerabilities that go beyond the NVD.
To learn more about this type of RCE vulnerability, read our blog about Struts 2. For more information or to speak to a Checkmarx expert about how to detect, prioritize, and remediate open source risks in your code, contact us.

It’s Time to Update Your Drupal Now!

Erez Yalon — Thu, 18 Jun 2020 15:52:12 +0000

As part of our ongoing mission to help organizations develop and deploy more secure software and applications, and in light of Checkmarx’s expanded insight into the open source security landscape with its recently launched SCA solution, the Checkmarx Security Research Team analyzed Drupal, an open source content management system (CMS) and one of the top 10 most used PHP resources (frameworks, libraries, etc.) used by our customers. Over one million websites run on Drupal, including enterprise and government sites worldwide.
Drupal had just released two major versions, which piqued our researchers’ interest. Once the team got to work on the two latest versions of Drupal, they quickly found that both were exploitable. Later, Drupal confirmed that every maintained version of Drupal (7.x, 8.8.x, 8.9.x) was easily exploitable by the same techniques.
These issues were discovered by Dor Tumarkin of the Checkmarx Security Research Team. Drupal acknowledged and patched the vulnerability, assigning it CVE-2020-13663. More information can be found below and on their security advisories page.

What is the Issue?

The Checkmarx Security Research Team identified a document object model-based cross-site scripting (DOM XSS for short) vulnerability in Drupal Core. This type of XSS attack is achievable if a web application writes data to the DOM without appropriately sanitizing it. In this case, an attacker can manipulate their input data to include XSS content on the web page, for example malicious JavaScript code, which in turn would be consumed by Drupal Core itself.

What is the Risk?

An attacker abusing this vulnerability can take over the administrator role of a Drupal-based website and get full control that allows changing of content, creating malicious links, stealing sensitive or financial data, or whatever else comes to mind.

Drupal Assigns CVE-2020-13663

Drupal rated the security risk of the vulnerability our team discovered as follows:

Risk: Critical
Access Complexity: Complex
Authentication: All/Anonymous Users
Confidentiality Impact: Certain Non-Public Data is Released
Integrity Impact: Some Data Can be Modified
Exploit (Zero-day Impact): Theoretical or White-hat (no public exploit code or documentation on development exists)
Target Distribution: All Module Configurations are Exploitable

Summary of Disclosure and Events

When the vulnerability was first discovered, the Checkmarx Security Research Team responsibly notified Drupal of its findings. Our team was asked to advise Drupal’s team after our disclosure, which we willingly did.
After we disclosed the vulnerability, the Drupal team’s sense of urgency and professionalism was quite notable, and a fix was made available within a week of our disclosure.
In accordance with Drupal’s disclosure guidelines and to give its users adequate time to update their software, Checkmarx will refrain from publishing a more technical report showing an in-depth walkthrough and proof-of-concept of exploiting this vulnerability for 60 days. In the meantime, we strongly encourage Drupal users to take action on recommended updates.

Recommendation

At this time, Checkmarx highly recommends that anyone using Drupal update the version in use immediately to the latest release, which contains a fix for this vulnerability.
Checkmarx customers using Checkmarx Software Composition Analysis (CxSCA) have already been automatically notified to update Drupal while running a scan of their code base.

Final Words

This type of research activity is part of the Checkmarx Security Research Team’s ongoing efforts to drive the necessary changes in software security practices among all organizations in an effort to improve security for everyone. Checkmarx is committed to analyzing the most prominent open source packages to help development teams ship more secure software and improve their software security risk posture. Our database of open source libraries and vulnerabilities is cultivated by the Checkmarx Security Research Team, empowering CxSCA with risk details, remediation guidance, and exclusive vulnerabilities that go beyond the NVD.
For more information or to speak to an expert about how to detect, prioritize, and remediate open source risks in your code, contact us.