Easy Buffer Overflow Attack Tutorial for Beginner Programmers

When a software developer codes a variable into their application, the system allocates a specific number of bytes to hold data. Usually, the data passed to the variable is from user input, but it could also be from another system or application. Some languages have validation in place that makes them generally safe from buffer overflows (also called buffer overruns), but programs written in C or C++ could be vulnerable to buffer overflow attacks if the developer does not size variables correctly and validate the length of user input.

Buffer overflows generally happen in computer RAM (random access memory). Various components of a program are stored in RAM and space is allocated so that data can be moved from one location to another. Programmers occasionally make mistakes by not allocating enough memory space to store data, usually when a user sends an unexpected value that’s larger than the memory space allocated. This creates a buffer overflow, and generates undefined behavior that could be as simple as crashing the program or as dangerous as allowing an attacker to overwrite adjacent memory and execute their own malicious code.

An Overview of a Buffer Overflow Attack Example

Before stepping into code, you should first understand what happens in a buffer overflow attack. Let’s take the example of a username and password. The developer will code a variable for the username and password. Using the C language as an example, the code might look like this:

char username[] = "username";
char password[] = "password";

Note: In C there is a terminating ‘\0’ character, so the actual size of the character array for the username is 8 characters plus the termination character. The termination character of a character string can also create a buffer overflow scenario, but for simplicity we are excluding the ‘\0’ character for our examples.

With the username[] variable initialized to “username,” the following image represents the way computer memory stores each character.

Any character over the defined allocated memory space could create a buffer overflow. Therefore, if a developer allows a user or other program to enter more than 8 characters as input, you now have a buffer overflow attack scenario.
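To make the note about the termination character concrete, here is a minimal sketch that computes how many bytes a string really needs and whether it fits in a buffer of a given size. The helper names bytes_needed and fits_in_buffer are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Bytes needed to store text, including the terminating '\0'.
   Hypothetical helper, named here for illustration only. */
size_t bytes_needed(const char *text) {
    return strlen(text) + 1;
}

/* Returns 1 if text plus its terminator fits in a buffer of
   buffer_size bytes, and 0 otherwise. */
int fits_in_buffer(const char *text, size_t buffer_size) {
    return bytes_needed(text) <= buffer_size;
}
```

Under these assumptions, "username" needs 9 bytes, so it fits in the 9-byte array the compiler creates for username[], while any 9-character input does not.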

A Real-World Buffer Overflow Example in C Code Using strcpy()

One common function in C is strcpy(). The strcpy function copies the characters of one string, including the termination character, into another character array. It then returns a pointer to the destination, which should now hold the same contents as the source. It does not check whether the destination is large enough.
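As a small sketch of the return value, the function below copies a short string into a buffer of the same size and checks what strcpy hands back. The function name strcpy_returns_destination is made up for this example.

```c
#include <string.h>

/* Returns 1 when strcpy's return value is the destination pointer
   and the destination now holds the same characters as the source. */
int strcpy_returns_destination(void) {
    char source[] = "abc";
    char destination[sizeof(source)];   /* same size, so the copy fits */
    char *result = strcpy(destination, source);
    return result == destination && strcmp(destination, source) == 0;
}
```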

Here is an example of strcpy code:

#include <string.h>

int main() {

     /* 12 characters plus the terminating '\0' need 13 bytes. */
     char source[] = "mylongstring";
     char destination[13];
     strcpy(destination, source);
     return 0;

}

Every program in C has a main function where the execution of code starts. Our main function defines a source variable and uses the strcpy function to copy the source character string to the variable destination. It’s a simple snippet of code that assumes that the destination variable will never need to store more than 13 characters (remember that “mylongstring” is actually 13 characters including the termination ‘\0’ character).

With no user input, this function might be fine, but what happens if we add the ability for users to enter input that strcpy then copies from the user input to the destination variable? If we don’t handle this use case, we then introduce a strcpy buffer overflow.

Here is a buffer overflow PoC (proof of concept) using the previous code example:

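#include <stdio.h>
#include <string.h>

/* Deliberately vulnerable: do not use this pattern in real code. */
int main() {

    char source[8];
    char destination[10];
    scanf("%s", source);            /* no limit on the input length */
    strcat(source, "overflow");     /* appends 9 bytes to an 8-byte buffer */
    strcpy(destination, source);    /* no check against destination's size */
    return 0;

}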

If you haven’t noticed, there are three buffer overflow vulnerabilities in this code. They are:

The scanf function allows users to enter any number of characters for the source variable, which could be more than the allocated 8 characters (including the termination character) allowed in the variable definition. The scanf function takes user input and assigns it to the given variable, which in this case is the source variable.
The strcat function appends the string “overflow” to the source variable. This string alone needs 9 bytes (8 characters plus the termination character), so appending it to any user input writes past the end of the 8-byte source variable.
The strcpy function overflows the destination variable if the source string is longer than 9 characters, because destination holds only 9 characters plus the termination character. Anything longer writes past the end of destination, which is undefined behavior and often crashes the program.
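The arithmetic behind these three points can be sketched without triggering any overflow. The functions below only compute whether each step of the PoC would write past its buffer for a given input length. The constants mirror the array sizes in the PoC, and the function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SOURCE_SIZE 8        /* char source[8] in the PoC */
#define DESTINATION_SIZE 10  /* char destination[10] in the PoC */

/* Bytes source must hold after strcat appends "overflow": the user's
   characters, the 8 appended characters, and the terminating '\0'. */
size_t bytes_after_strcat(size_t input_length) {
    return input_length + strlen("overflow") + 1;
}

/* Each function returns 1 when that step would write past its buffer. */
int scanf_overflows(size_t input_length) {
    return input_length + 1 > SOURCE_SIZE;
}

int strcat_overflows(size_t input_length) {
    return bytes_after_strcat(input_length) > SOURCE_SIZE;
}

int strcpy_overflows(size_t input_length) {
    return bytes_after_strcat(input_length) > DESTINATION_SIZE;
}
```

For a one-character input, strcat already needs 10 bytes in the 8-byte source, while the resulting 9-character string still fits in destination. With two or more characters, the strcpy step overflows as well.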

How to Prevent a Buffer Overflow Attack

If you are a developer, you can practice buffer overflow exploit prevention by using functions that limit how many characters (or numbers) are written to a variable. Testing for buffer overflow exploits is much more difficult: you can review code to identify buffer overflow vulnerabilities, or repeatedly send malformed values as input and watch for failures.

First let’s fix the main function in the previous example. The following code is one way to stop buffer overflows:

#include <stdio.h>
#include <string.h>

int main() {

     char source[8];
     char destination[10];
     /* Read at most 7 characters, leaving room for the terminating '\0'. */
     if (scanf("%7s", source) != 1) {
          return 1;
     }
     /* Append only as many characters as still fit in source. */
     strncat(source, "overflow", sizeof(source) - strlen(source) - 1);
     /* Copy at most 9 characters, then terminate explicitly. */
     strncpy(destination, source, sizeof(destination) - 1);
     destination[sizeof(destination) - 1] = '\0';
     return 0;

}

You’ll notice all three calls were changed: strcat and strcpy were replaced with the bounded alternatives strncat and strncpy, and the scanf format string now includes a maximum field width. Here are the changes we made to remediate the possibility of a buffer overflow exploit:

The scanf function uses “%7s” as its format string, which means that at most 7 characters are taken from the input. Together with the termination character, that fills the 8-byte source variable exactly, so the user can no longer overflow it. The return value is also checked so that source is never used when no input was read.
The strcat function is replaced with the strncat function, which has an extra parameter for the maximum number of characters to append. That parameter is not the size of the destination, so the code passes the space still free in source: its size, minus the characters already stored, minus one for the termination character.
The strcpy function is replaced with the strncpy function, which also has an additional parameter for the maximum number of characters to copy. No more than 9 characters are copied, and because strncpy does not add a termination character when the source reaches that limit, the last byte of destination is set to ‘\0’ explicitly.
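Another common approach, not used in the fix above, is to build the final string with snprintf, which takes the destination size and always writes a termination character when that size is greater than zero. The following is only a sketch. The helper build_message is hypothetical, and it reports truncation through its return value so the caller can decide what to do.

```c
#include <stdio.h>
#include <string.h>

/* Builds "<input>overflow" in destination without writing past it.
   Returns 0 when the whole string fit and -1 when it was truncated
   or could not be formatted. build_message is a hypothetical helper. */
int build_message(char *destination, size_t destination_size, const char *input) {
    if (destination == NULL || input == NULL || destination_size == 0) {
        return -1;
    }
    /* snprintf returns the length the full string would have had. */
    int written = snprintf(destination, destination_size, "%soverflow", input);
    if (written < 0 || (size_t)written >= destination_size) {
        return -1;   /* truncated output is still '\0'-terminated */
    }
    return 0;
}
```

With a 10-byte destination, a one-character input fits exactly, while a 7-character input is cut off after 9 characters instead of overflowing.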

Testing for Buffer Overflow Vulnerabilities

Finding buffer overflows is often called a “needle in the haystack” scenario. From the outside, an attacker must try any malformed input across numerous opportunities where the software takes user-generated input. If you have an application written in C or C++, the best way to find buffer overflow possibilities is to review the code. Any function or input that takes user-generated values should be validated for size and type.
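As one possible illustration of validating input for both size and type, the sketch below accepts a string only if it is short enough and consists of a decimal integer within an expected range. The function parse_bounded_int and its parameters are assumptions made for this example, not part of any standard library.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Accepts text only if it is 1..max_length characters long and parses
   as a decimal integer in [min, max]. Stores the value in *out and
   returns 1 on success; returns 0 on any failure.
   parse_bounded_int is a hypothetical helper. */
int parse_bounded_int(const char *text, size_t max_length,
                      long min, long max, long *out) {
    if (text == NULL || out == NULL) return 0;

    size_t length = strlen(text);
    if (length == 0 || length > max_length) return 0;    /* size check */

    errno = 0;
    char *end = NULL;
    long value = strtol(text, &end, 10);
    if (errno == ERANGE) return 0;                       /* too large for long */
    if (end == text || *end != '\0') return 0;           /* type check */
    if (value < min || value > max) return 0;            /* range check */

    *out = value;
    return 1;
}
```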

Buffer overflow vulnerabilities are considered high-risk and could give attackers numerous exploit opportunities, including remote code execution (RCE). For this reason, you should have your code reviewed before deploying it to production.

Generally, a penetration tester will check for buffer overflows by:

Identifying where input is sent to the program, mainly user-generated input.
Sending automated or manually generated input to potentially vulnerable locations.
Logging when and how the application crashes after input is sent to the application.
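The steps above can be sketched as a tiny test harness. This is only an illustration under stated assumptions: store_name stands in for the code under test and NAME_CAPACITY for its buffer size, both hypothetical, and a rejected input stands in for the crash a real tester would look for in a vulnerable target.

```c
#include <stddef.h>
#include <string.h>

#define NAME_CAPACITY 8   /* hypothetical buffer size in the code under test */

/* Hypothetical function under test: stores a name only if it fits.
   Returns 0 when the input is accepted and -1 when it is rejected. */
int store_name(const char *input) {
    char name[NAME_CAPACITY];
    if (strlen(input) + 1 > sizeof(name)) return -1;
    memcpy(name, input, strlen(input) + 1);
    return 0;
}

/* Feeds inputs of length 0..max_length (all 'A') to store_name and
   returns the length of the shortest rejected input, or -1 if every
   input was accepted. */
int first_rejected_length(size_t max_length) {
    char input[256];
    if (max_length >= sizeof(input)) max_length = sizeof(input) - 1;
    for (size_t length = 0; length <= max_length; length++) {
        memset(input, 'A', length);
        input[length] = '\0';
        if (store_name(input) != 0) return (int)length;
    }
    return -1;
}
```

Here the boundary is found at 8 characters, the first length that no longer fits in the 8-byte buffer once the termination character is counted.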

One option for corporations that have several developers is to install Static Application Security Testing (SAST) tools. These tools run in the developer’s environment and constantly check for vulnerabilities such as buffer overflows. SAST tools alert developers before they compile or commit code so that vulnerabilities can be fixed early in the development lifecycle.

What Languages are Vulnerable to Buffer Overflows?

Although the right scenario could crash a program written in any language, the following languages, along with the operating systems largely written in them, are the primary targets for buffer overflow exploits:

C
C++
FORTRAN
Assembly
Operating systems

Developers might ask questions such as “Is Python vulnerable to a buffer overflow?” or “Is C# vulnerable to buffer overflows?” Most web-based applications are safe from buffer overflows, provided you use “normal” code. Here are a few examples:

C# buffer overflow: Only possible when using the unsafe keyword or otherwise bypassing array bounds checking, which trades safety for performance.
GoLang buffer overflow: Go is safe from buffer overflows provided that you do not use unsafe packages to override safety protections.
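Python buffer overflow: Python also checks bounds and will reallocate memory if a container needs to grow. An out-of-range index raises an IndexError exception instead of writing past the allocated space. (OverflowError is a different condition, raised when an arithmetic result is too large to represent.)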
JavaScript buffer overflow: JavaScript and other web-based languages are not vulnerable to buffer overflow exploits.
Java buffer overflow: Java is not vulnerable to buffer overflows, but the Java Virtual Machine (JVM) that runs code could be, because it’s written in C++. You could also introduce Java buffer overflow vulnerabilities if you call native code via the JNI (Java Native Interface).