Golang The world runs on code. We secure it. Tue, 22 Oct 2024 19:39:07 +0000 en-US hourly 1 https://wordpress.org/?v=6.6.1 https://checkmarx.com/wp-content/uploads/2024/06/cropped-cx_favicon-32x32.webp Golang 32 32 Race Conditions Can Exist in Go https://checkmarx.com/blog/race-conditions-can-exist-in-go/ Wed, 19 Aug 2020 09:05:45 +0000 https://www.checkmarx.com/?p=40898 Overview

The Go Programming Language (also known as Golang) is an open source programming language created by Google. Go is compiled and is statically typed as in C (with garbage collection). It has limited structural typing, memory safety features, and CSP-style concurrent features. In this article, I will cover Go Race Conditions from theory to practice including source code examples. I will discuss how to detect and solve race condition issues and the security impact they represent.

The theory

Race conditions are a common issue in software due to a computer program that expects a sequence of operations to execute in a specific order, but they run in another, making the software output unpredictable. These issues are common in concurrent computer programs where either multiple workers, processes, or threads shared a state.
On one hand, these issues are quite easy to introduce during the software development stage since concurrency is not easy to accomplish. On the other hand, they are difficult to detect/debug since the computer program behavior won’t always be deterministic.
For example, think about HTTP servers in general. They are a great example of concurrent computer programs since they are able to consistently handle simultaneous requests. For example, to track the number of visits/visitors to a web page, we can use a file on a local file system. When a new HTTP request is received, the process reads the current counter value from this file, increments it by 1, and writes the new value to the same file. Let’s explore this example in the next section.

In Practice

Let’s start by writing our HTTP server main() function, using Go http package

Now, we’ll implement the updateVisitsCounter() function, which is responsible to read/write the number of visits from/to a file in the filesystem. Please notice that error handling is missing for brevity.

The full source code can be found here. To run it, enter go run server.go and point your browser to https://localhost:8080. You should expect to see the visitor number increase every time the browser window is reloaded as shown below.

Figure 1

While running the server locally, with a single user accessing it, you should not experience the race condition. To make the race condition noticeable, let’s stress the server a little with a few concurrent HTTP requests. We can accomplish this by running concurrent-requests.go as shown in Figure 2.

Figure 2
On the left side of Figure 2, server.go is running and ready to receive requests. On the right side, the concurrent-requests.go will make 5 concurrent HTTP requests to it. Each output line starts with the request timestamp (Time.UnixNano()) and the request’s response.
Clearly, the output on the right side is not the expected one: four distinct requests received the same message, “Hello, you’re visitor #1”.
Notice that visitor #3, #4 and #5 are missing in Figure 2. Sorting the output by timestamp, we see that after visitor number #2 there were three other visitors who received the message, “Hello, you’re visitor #1”, as shown in Figure 3.

Figure 3
As explained in the previous section, our HTTP server implementation has:

  1. a shared state: the file in the filesystem where the counter is persisted (VISITS_COUNTER_FILE) and
  2. updateVisitsCounter() implementation expects read/write operations from/to the file in the filesystem to happen sequentially and in this strict order.

Figure 4 below illustrates the race condition with two concurrent workers: (worker1 and worker2)

Figure 4
Both workers read the value 0 (zero) from VISITS_COUNTER_FILE. Then worker1 updates the value to 1. While worker2 is still processing its first request, worker1 receives a new request doing a read immediately followed by a write operation, updating VISITS_COUNTER_FILE to 2. When worker2 finishes handling the request, it writes to VISITS_COUNTER_FILE the value 1: the read value 0 (zero) incremented by one unit. Finally, worker1 receives a new request, reading the VISITS_COUNTER_FILE, whose value is 1.

Security impact

Race conditions can be exploited to cause software malfunctioning, leading to problems such as Denial of Service and Privilege Escalation. Time of check to time of use (TOCTTOU) is a class of race conditions that may lead to privilege escalation.
Below are some examples of race conditions:

How to detect

As previously mentioned, it isn’t easy to find and solve race conditions, but care, diligence, and testing will certainly help. Despite of offering a clean way to write concurrent code, Go creators felt the need to add a race detector to help diagnose race conditions.
Go race detector works on all major operating systems, but only on 64bits systems, and it can only detect data races (accesses to memory from different threads or operations that impose ordering on memory accesses). Remember that race condition detection happens at runtime with a cost of 5-10x in memory increase, and 2-20x in execution time for a typical program.
To enable races detection, run your programs with the -race option, e.g., go run -race server.go. Go race detector won’t report any races on server.go because it isn’t a data race but, instead, a TOCTTOU race. Changing the implementation, replacing the VISITS_COUNTER_FILE file by a shared global variable visitsCounter to store number of visits (server-alternative.go), will allow us to see -race option output go run -race server-alternative.go as shown in Figure 5 below.

Figure 5

How to solve

Generally speaking, using mutual exclusion locks (mutex), semaphores, and other access and execution control primitives will help you preventing races. Packages like sync provides basic synchronization primitives such as mutual exclusion locks.
Adding a mutex to our HTTP server as in server-fixed.go implementation, fixes the race condition, leading to the expected result as shown in Figure 6.

Figure 6
Mutual exclusion locks role in computer programs is very simple. Before starting an operation on shared state, one worker should gain access to it acquiring mutex’s lock (Lock()). As soon as it happens, all other workers willing to manipulate the shared state will be blocked until the mutex’s lock is released (Unlock()). Once the mutex’s lock is released, the next worker will be able to manipulate the shared state.
In our example, acquiring mutex’s lock will enable the worker to read and write from/to the VISITS_COUNTER_FILE in this exact order, without interference from other workers. After updating the VISITS_COUNTER_FILE, then other workers will be able to do the same.
Of course, access and execution control primitives bring other challenges. For example, what if mutex’s lock never gets released? Workers will be perpetually denied access to the shared resource, potentially leading to a Denial of Service. This is a well-known problem in concurrent programming called starvation, which is out of the scope of this blog post.

Conclusion

Golang concurrency is awesome but there’s not much the compiler can do to prevent you from making mistakes leading to a Race Condition. Race conditions are something you’ll really want to avoid, and detection and debugging are both tough tasks.
While writing concurrent programs double check whether state (e.g., a file or variable) is accessed by more than one worker/thread/process at the same time. If so, you’ll need some access control mechanism such as mutexes or semaphores.

References

]]>
You Better Get Going with Go https://checkmarx.com/blog/you-better-get-going-with-go/ Tue, 18 Aug 2020 08:59:47 +0000 https://www.checkmarx.com/?p=40957 “I think Node (.js) is not the best system to build a massive server web. I would use Go for that. And honestly, that’s the reason why I left Node. It was the realization that: oh, actually, this is not the best server-side system ever.” (HS, 2017) This quote is by Ryan Dahl the author of Node.js.
Well, this is basically my story too. I left Node.js as well, and now I use the Go Language (Golang, Go) as my backend technology of choice. Although Node.js will always have a warm place in my heart, if you are looking for the best backend technology that is out there, I fully recommend Golang. Let me tell you why.

Go is Perfect

The motivation behind the creation of Go was that currently, there are many programming languages available. Each has its own pros and cons. There are some languages that are efficient, but not simple, like C and C++. And, on the other hand, there are simpler languages that are less efficient, like JS and Python.
But, a programming language should be perfect—both efficient and simple, right? Go is both, and a lot more. Let me elaborate a little further on why that is.

Go Syntax is Simple and Clean

Go syntax is something between C and Python with advantages from both languages. It has a garbage collector that is very useful.

  • It does not have objects, but it supports structures (structs) and any type can have a method.
  • It does not support inheritance, but does support compositions and interfaces.
  • You can have a pointer to a variable, but at same time, you don’t have to worry about referring to its methods.
  • It’s statically typed, but it’s non-strict because of the type inference.
  • Its package-based code management reduces the number of code lines.
  • And last but not least, Go has a simple concurrent model.

Let’s dig into some examples of the simplicity, but awesomeness of Go.

  • Swapping between variables is simple:
    • b, a = a, b
  • It allows you to import a package directly from GitHub or any other source manager:

o    import “github.com/pkg/errors”

  • By starting a Goroutine, it supports concurrent routing:

o    go runConcurrently()

  • Allows you to refer to a method of instance of some type, no matter if it’s a pointer or the actual instance:
    • instance.method()

Go is Efficient and Scalable

Thanks to the Go dependency model and its memory management, the compilation is very fast when compared to low-level languages, and even more so with high-level languages.
Go’s runtime performance is similar to C++ and C, making its performance quite notable.
In the context of scaling, Go is much faster than popular competitors thanks to the Goroutines model as you can see in the figure below.

When comparing Goroutines to Java threads, Goroutine threads consume ~2KB, while Java threads consume ~1MB.

Go is Widely Used and Super Easy to Learn

Go is an open source language with wide adoption and a fast-growing community. On the web, you can find lots of free and useful packages and many Q&As, FAQs, Tutorials, etc.
According to 2019 stack-overflow survey among developers, Go is one of the most popular languages. 8.2% of the survey respondents are programming in it, and 67.9% of the respondents mentioned Go as their most loved, dreaded, and wanted language. Go scored the highest rank (93% of the respondents) as a preferred language in a 2019 survey by Golang within Go developers, and as you can see in the Figure below, Go is growing in popularity.


In addition to that, Go is super easy to learn. Because of its friendly syntax and the great “Tour of Go” (that takes about two days to complete and covers all the basics you’ll need to get started programming in Go), after completing the tour, you will feel very confident with the language. When you start using the language, coding with it will become pretty easy overall. And after about two weeks of using it, it will likely become your preferred/native language.
Looking to quickly increase your knowledge of Golang security? Check out our Intro to Go Language guide!

So, hear this Gopher out!

You Better Get Going with Go.
Reference
HS, P. (2017, October 03). Episode 8: Interview with Ryan Dahl, Creator of Node.js. Retrieved July 07, 2020, from https://mappingthejourney.com/single-post/2017/08/31/episode-8-interview-with-ryan-dahl-creator-of-nodejs/

]]>
Diving Deep into Regular Expression Denial of Service (ReDoS) in Go https://checkmarx.com/blog/redos-go/ Mon, 07 May 2018 14:14:42 +0000 https://www.checkmarx.com/?p=23855 Go Programming Language (also known as Golang) is an open source programming language created by Google. Go is compiled, is statically typed as in C (with garbage collection), with limited structural typing, memory safety features and CSP-style concurrent features. In this blog post, we’ll recap Go’s security posture facing Regular Expression Denial of Service (ReDoS) attacks. But first, let’s start by explaining the concept of ReDoS and how such attacks can be exploited and mitigated. This blog post includes a set of practical examples using different programming languages, aiming to show how the Go implementation avoids ReDoS.

The topic of this report was motivated byongoing research on the topic of Go security, where we aim to discover vulnerabilities lurking in Go packages.
func sqli() {
username := r.Form.Get(“username”)
sql := “SELECT * FROM user WHERE username='” + username + “‘” row_fullname := db.QueryRow(sql)
fmt.Printf(“Welcome, %sn”, row_fullname)
}
Listing 1: Golang SQLi example

ReDoS

Regular Expression Denial of Service (ReDoS) is an algorithmic complexity attack that provokes a Denial of Service (DoS). ReDos attacks are caused by a regular expression that takes a very long time to be evaluated, exponentially related with the input size. This exceptionally long time in the evaluation process is due to the implementation of the regular expression in use, for example, recursive backtracking ones.
A regular expression, better known as a ‘regex’, is a sequence of characters that defines a search pattern, used to search for one or more characters within a string. One of the handy usages of a regex is information validation, i.e., ensuring that only properly formed data is being submitted.
For example, let’s pretend that we want to apply a regular expression over the username input of listing 1. Thus, a simple regex could be:
/^[a-zA-Z0-9_-]{3,10}$/
Listing 2: Regex example 1
The regular expression is contained between the slash characters and in the pattern 2 regex. We start by telling the parser to find the beginning of the string (ˆ), followed by any lowercase letter (a-z), uppercase letter (A-Z), number (0-9), an underscore, or a hyphen. The {3,10} section makes sure that the entered string has a length between three and ten characters. Finally, the $ represents the end of the string. For this regex, if we used the input checkmarx it would match the pattern:

Figure 1: Example – Regex 1

On the other hand, if we used a string like checkmarx’ OR SLEEP(10)– it would not match the pattern.

Figure 2: Example – Regex 1 Fail

Evil Regular Expressions

Even though the benefits of using regexs for input validations are great, depending on the way they are written and the engine used, a malicious user can leverage it and make the application or service unavailable. Thus, evil regexs are the root cause of the ReDoS issue. They are considered evil or malicious if they can stuck on crafted input. To understand this better, let’s consider the following regular expression:
/A(B|C+)+D/
Listing 3: Evil Regex example 1
In this scenario, this regex pattern starts by searching for the character ’A.’ Then, the following string must either be the character ’B’ or one or more ’C’s, (B|C+). The next + indicates that it can search for one or more occurrences of the previous string. Finally, the ’D’ ensures that the string is terminated by the character ’D’. To match this regular expression, any input of the following type would be accepted:
ABCD
ABCBD
ACD
ACBD
Listing 4: Valid input for Evil regex 1
To show the differences between the implementations used by different languages, we created simple programs in four different languages: Python, JavaScript, PHP and Go. All created programs use the regex from example 3. This benchmark was done incrementing passed inputs, allowing us to visually understand the different behaviors of the program depending on the variations of the inputs.
In the example code from listing 5, we show a simple Python implementation to evaluate the evil regex. We start by testing the valid inputs from listing 4. Then we send malicious inputs to try to get the program stuck.
regex = r”A(B|C+)+D”
test_str = raw_input(“Enter the string: “)
matches = re.finditer(regex, test_str) (…)
print (“It took: %s seconds” % elapsedTime)
Listing 5: Python Regex compiler
To craft malicious inputs, we started by incrementing valid and invalid inputs, and tweaking it according to the time differences between them. At some point, we found some relevant discrepancies. In figure 4, we show the attempted malicious payloads and the matching elapsed time for evaluation. We see that when a malicious input payload of type AC+E, where + represents one or more occurrences of the character C, is sent with more than 20 Cs, the elapsed time starts to double for each new C.

The same principle was applied to the other languages and we maintained the input cases in order to compare the results. The next example is for JavaScript. Listing 6 shows the JS code snippet:
const regex = /A(B|C+)+D/g; (…)
let m;
while ((m = regex.exec(str)) !== null) { (…)
} (…)
console.log(“It took: ” + seconds + ” seconds”);
Listing 6: JS Regex compiler
The obtained results for the JavaScript implementation are as follows:

The next example is PHP:
$re = ‘/A(B|C+)+D/’;
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0); (…)
echo $timediff.” seconds”;
Listing 7: PHP Regex compiler
Where the results from figures 7 and 8 produced these results:

Finally, an example for Go. We created the following program:
func main() {
re := regexp.MustCompile(`A(B|C+)+D`) (…)
for i, match := range re.FindAllString(str, -1) { fmt.Println(match, “found at index”, i)
} (…)
fmt.Println(“It took:”, elapsed.Seconds(), “seconds”)
}
Listing 8: Go Regex compiler
And produced the following results:

We can see that the results from PHP (8) and Go (10) implementations are much different from the Python (4) and JavaScript (6) results. The malicious inputs used in Python and JavaScript generated exponential time increments, versus the linear time responses for PHP and Go.
In the PHP regex implementation, we used the Perl Compatible Regular Expressions (PCRE) library 1, which uses backreferences. The official regex package [2] implemented in Go uses the RE2 engine 2, which does not support backreferences, and guarantees a linear time execution while avoiding regex denial of service.
In table 1, we summarize the obtained results for each programming language, where it is displayed the elapsed time in seconds for each input that was tested. For the purpose of this post, we only tested ten inputs, starting with  20 ’C’ characters and incrementing one unit. This clarifies that for the Python and JavaScript implementations the time doubles when a new C is added.
Finally, in the chart seen in figure 11, it becomes visually clear what the discrepancies are between the results and places of the performances of PHP and Go side by side.

Behind the Curtains

The most commonly used algorithms to implement regular expression matching are:

  • Perl-based
  • NFA-based

So far, we have seen how the PHP implementation uses the PCRE, Perl-based, and Go uses the RE2. In any case, to accomplish the regular expression matching, the engine builds a Nondeterministic Finite Automaton (NFA), which is a finite state machine where for each pair of state and input symbol, there may be several possible next states. Hence, for each input symbol, the NFA will transit to a new state until all the input symbols have been consumed. This will try all paths of the NFA until it reaches an accepting state, that is, where a match occurred or all the paths were attempted but with no match.
Considering the regex ˆ(a+)+$ and its correspondent NFA:

We can use the same methodology of inputting different sizes in order to understand the NFA behavior. To this, if we choose the input aaaaX, 16 possible paths will exist in the graph from figure 12.
If we modify the input to aaaaaaaaaaX, it will have 1024 steps. And if we change it to aaaaaaaaaaaaaaaaX, 65536 possible paths will be generated. Each additional ”a” doubles this number. This behavior is an extreme case and happens because the algorithm will go through all the possible paths until failing.
What happens behind the curtains is that any time a symbol is being tested by the engine and it fails to match the next one, it will backtrack and look for another way to compile the previous symbol. If this path gets too long, the number of backtracking steps will eventually become very large, resulting in catastrophic backtracking, leading to a possible denial of service.
If we take the example of the regular expression from listing 3, and with the help from the regex101 website, we can resume this behavior in a table, where displayed is the number of steps taken for a target input string.

Table 2: PCRE (PHP) - Benchmark

Table 2: PCRE (PHP) – Benchmark

From table 2, it is clear that each time we incremented the number of C’s by one unit, the engine took twice the number of steps. Using this engine from the regex101 website, if more than 998 C’s are used, it will respond with a catastrophic backtracking message:

This is the turning point between the PHP and Go implementations. In subsection 2.2, we saw that for the input used, the results in terms of elapsed time were very similar, but we did not test for extremely large inputs, as seen in figure 13, crashes the PHP implementation. This is avoided in Go.
As a matter of fact, all programming language engines (from this website) will have disastrous behaviors with this input – except for Go. JavaScript will respond with a timeout and the Python with a catastrophic backtracking. As for Go, it will resolve the string input in approximately 218ms. These results can be consulted here.
2.3.1 Easter Egg
It is also important that websites testing regular expressions can properly detect catastrophic cases:
Figure 14: Catastrophic Backtracking on Pythex

Figure 14: Catastrophic Backtracking on Pythex

A malicious user could take advantage of the lack of validation in this website to provoke a denial of service. Another example happens in the https://www.debuggex.com/ website. When a vulnerable regular expression is used with a malicious input, it will hang the page.
Figure 13: Example - Catastrophic Backtracking

Conclusion

In this blog post, we recapped what a regular expression is and how can it be leveraged to provoke a denial of service. We go through a set of examples where behaviors of different engines are shown. Specifically, we emphasize the Go behavior.
Despite possible recommendations and workarounds to avoid ReDoS, which revolve around the usual input sanitization, the best measure is to target the root cause, and so, focus on the implemented algorithm.
The implementation provided by the Go package (regexp) is guaranteed to run in time linear in the size of the input. A property that is not guaranteed by most open source implementations of regular expressions [3].

Using open source, but not sure what versions and components are in use?
Get a single holistic view of your application portfolio

References

]]>
Table 2: PCRE (PHP) - Benchmark Figure 14: Catastrophic Backtracking on Pythex Figure 13: Example - Catastrophic Backtracking