Buffer overflow

From Citizendium
imported>Nick Johnson
(renamed "In Software" to "Software Debugging Tools")
imported>Nick Johnson
m (usafe --> usage)
Revision as of 10:13, 12 April 2007

In computers and computer security, a buffer overflow occurs when more data is written to a memory buffer than the buffer can hold. In certain programs, the excess data is written to memory beyond that buffer, overwriting other data. This error is among the most common types of computer security flaw, and its prevalence is due to the widespread use of languages such as C, which have no built-in mechanism to prevent buffer overflows.

Other names for this attack include "buffer overrun" and "Smashing the Stack," both of which describe the concept.[1]

Technical Explanation

A software execution stack exists for every process running on a computer. Parts of the stack contain program variables, and other parts contain control information such as saved program counter addresses. Many programs, often because of the nature of the language in which they were written, do not take adequate steps to ensure they cannot overwrite their stacks as a result of invalid inputs. As a result, it is possible to coerce such programs to overwrite their stacks with chosen data.

By overwriting parts of the stack, an attacker may modify variables within the program or, by overwriting a saved program counter address, redirect execution to other code, potentially code that the attacker placed onto the stack.

This can achieve unexpected results, ranging anywhere from the program crashing, to hijacking the execution context (and therefore, the security context) of the program in question. This simple concept has had profound implications in the annals of computer security.
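The overwrite described above can be illustrated with a minimal sketch in C. It does not touch a real stack: the struct `fake_frame` and the functions `unchecked_copy` and `build_payload` are hypothetical names invented for this illustration, and real frame layouts vary by compiler and platform. The sketch only shows how a copy that ignores the buffer's size lets input bytes land in the adjacent saved address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a stack frame: a local buffer followed by
 * a saved program counter. Real layouts depend on compiler and ABI. */
struct fake_frame {
    char buf[8];
    unsigned long saved_pc;
};

/* Copies len bytes into the frame starting at buf. The length is NOT
 * checked against sizeof buf, which is the bug being illustrated; it is
 * only checked against the whole struct so the demo stays well-defined. */
void unchecked_copy(struct fake_frame *f, const unsigned char *input, size_t len)
{
    assert(len <= sizeof *f);
    memcpy((unsigned char *)f, input, len);
}

/* Builds attacker-style input: filler up to the saved address, then a
 * chosen replacement value. Returns the payload length in bytes. */
size_t build_payload(unsigned char *out, unsigned long chosen_pc)
{
    size_t off = offsetof(struct fake_frame, saved_pc);
    memset(out, 'A', off);
    memcpy(out + off, &chosen_pc, sizeof chosen_pc);
    return off + sizeof chosen_pc;
}
```

After `unchecked_copy` runs on such a payload, `saved_pc` holds the value chosen by whoever supplied the input rather than the value the program stored there.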

Attempts at Overcoming This Vulnerability

Attempts at overcoming this vulnerability in a proactive way (rather than simply issuing software patches) have had limited success. Researchers in computer security have attempted to solve the buffer overflow attack problem both in software and in hardware. The most reliable way to close this attack vector is to write code that validates its input, including input lengths, wherever necessary.

Software Debugging Tools

Valgrind is an open source suite of tools that are designed to assist with debugging and improving the performance of software. It simulates the execution of code on a virtual x86 processor, and intercepts certain function calls, allowing for fine-grained buffer overflow detection on the heap.[2]

Splint is another open source toolset which performs static program analysis to detect common programming and security errors in C programs. Used on standard source code, or with annotated source code, it can help detect a large number of errors before a program is deployed.[3]

By The Operating System

Some operating systems, most notably the Unix variant OpenBSD, employ address randomization in an attempt to thwart many buffer overflow attacks. In this method, the operating system attempts to map allocated memory to random memory addresses during calls to malloc() and mmap(). This method will foil attacks which assume some address relationship between blocks of memory, such as one object occupying the space immediately preceding another.

Similarly, OpenBSD attempts to insert so-called guard pages before and after allocated blocks of memory. By marking these pages as inaccessible in the memory management unit's memory map, the operating system is notified of any read or write to a guard page. Thus, buffer overflows that escape one allocated block are trapped before they can reach another.
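The guard page idea can be sketched in user space with the POSIX calls mmap() and mprotect(). This is an illustration only, not OpenBSD's allocator: the helper names `guarded_alloc` and `guarded_free` are hypothetical, the sketch hands out exactly one usable page, and it assumes a system that supports anonymous mappings via MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps three pages with no access rights, then
 * opens up only the middle one. The first and last pages stay
 * inaccessible and act as guard pages. Returns NULL on failure. */
void *guarded_alloc(size_t *usable)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *base;
    size_t ps;

    if (page <= 0)
        return NULL;
    ps = (size_t)page;

    base = mmap(NULL, 3 * ps, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    if (mprotect(base + ps, ps, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, 3 * ps);
        return NULL;
    }
    if (usable != NULL)
        *usable = ps;
    return base + ps;
}

/* Releases a block returned by guarded_alloc, guard pages included.
 * Returns 0 on success, -1 on failure. */
int guarded_free(void *p)
{
    long page = sysconf(_SC_PAGESIZE);

    assert(p != NULL);
    if (page <= 0)
        return -1;
    return munmap((unsigned char *)p - (size_t)page, 3 * (size_t)page);
}
```

A write one byte past the usable page lands in the trailing guard page, and on typical systems the process receives a fault (SIGSEGV) instead of silently corrupting a neighbouring allocation.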

As Language Semantics or Library Functionality

One major cause of buffer overflow vulnerabilities in software systems has been the use of unsafe string manipulation functions, most notably C's strcpy() and strcat(). These functions copy bytes into a buffer but do not require the programmer to specify a maximum number of bytes to copy, and thus can result in buffer overflows. The first improvements over these two functions were strncpy() and strncat(), which take the maximum number of bytes to copy as an extra parameter. However, the semantics of these functions are difficult for programmers to understand, and their boundary cases are commonly misunderstood: strncpy(), for example, does not null-terminate the destination when the source is at least as long as the limit. More recently, the OpenBSD project has implemented the strlcpy() and strlcat() functions, which offer simplified semantics and are intended to be safer to use. These two functions have become common on other Unix-like operating systems.
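The difference between these families of functions can be shown in a short sketch. Because strlcpy() is not available in every C library, the example uses a hypothetical helper, `bounded_copy`, written here to mimic strlcpy()-style behaviour (always terminate, return the source length). It is an assumption of this illustration, not OpenBSD's code. The second function, also hypothetical, probes one strncpy() boundary case.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical strlcpy-like helper: copies at most size - 1 bytes,
 * always NUL-terminates when size > 0, and returns strlen(src) so the
 * caller can detect truncation (return value >= size). */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    assert(dst != NULL);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Returns 1 if strncpy() left a NUL terminator within the first size
 * bytes of dst, 0 otherwise. */
int strncpy_left_terminator(char *dst, const char *src, size_t size)
{
    memset(dst, 'X', size);
    strncpy(dst, src, size);
    return memchr(dst, '\0', size) != NULL;
}
```

With a four-byte destination, copying "abcdef" through strncpy() fills all four bytes and leaves no terminator, whereas `bounded_copy` stores "abc" plus a terminator and returns 6, signalling that the source was truncated.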

Another approach to the same goal is to simply replace unsafe languages, such as C, with higher-level languages, such as Perl, Java, or many others. Proponents argue that, since these languages include data structures that have automatic bounds checking and automatic memory management, they are less susceptible to buffer overflow attacks.

As Compiler Features

Main Article: canary value

Several groups have implemented security enhancements to compilers, hoping they can produce more secure code without forcing programmers to change their application's source code. Notable examples of this are StackGuard and ProPolice.

The method is simple. The compiler generates additional instructions, so that the function prologue will add a so-called canary value to the stack frame between the return address and the local variables. This canary value is a random number chosen when the program begins. Then, additional instructions are inserted into the function epilogue which check the canary value as it appears in the stack frame. If it is incorrect, the new instructions cause the program to go into a fail-safe mode (usually immediate termination), so as to control the program's worst-case behavior while under attack. Canary values can work because most stack smashing attacks which successfully overwrite the return address will also overwrite the canary value, and it is unlikely that the attacker will be able to guess the canary value. [4]
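The prologue and epilogue checks can be modelled in plain C. This is a simulation under stated assumptions, not what StackGuard or ProPolice actually emit: the struct `guarded_frame` and the three functions are hypothetical names, the frame is an ordinary struct rather than a real stack frame, and the canary is passed in as a parameter rather than drawn from a random source at program start.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical frame model: the canary sits between the local
 * variables and the saved return address. */
struct guarded_frame {
    char locals[16];
    unsigned long canary;
    unsigned long saved_pc;
};

/* Models the function prologue: store the canary and return address. */
void frame_prologue(struct guarded_frame *f, unsigned long canary,
                    unsigned long return_to)
{
    memset(f->locals, 0, sizeof f->locals);
    f->canary = canary;
    f->saved_pc = return_to;
}

/* Models a buggy function body: writes len bytes starting at locals
 * without checking len against sizeof locals. The assert only keeps
 * the demo inside the struct. */
void frame_write(struct guarded_frame *f, const unsigned char *data, size_t len)
{
    assert(len <= sizeof *f);
    memcpy((unsigned char *)f, data, len);
}

/* Models the function epilogue: returns 1 if the canary is intact,
 * 0 if it was overwritten (a real program would terminate here). */
int frame_epilogue_ok(const struct guarded_frame *f, unsigned long canary)
{
    return f->canary == canary;
}
```

A write that stays within `locals` leaves the canary intact. A write long enough to reach `saved_pc` must pass over the canary first, so the epilogue check fails before the corrupted return address would be used.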

At least four attacks have been developed against this sort of protection. [5]

In Hardware

Processor manufacturers have attempted to create a hardware solution to this problem, where parts of memory are segregated into areas marked as instructions that should be executed and areas marked as data, which should never be executed. This solution, when used properly, can prevent buffer overflow attacks in many cases.

AMD developed and marketed this feature first, and named it the NX (No eXecute) bit. Intel's name for this feature is the XD (eXecute Disable) bit. The two technologies are functionally the same and serve the same purpose.

Related Topics

External Links

"Smashing the Stack for Fun and Profit": this article is somewhat dated, but it covers this flaw in great technical detail.

References