2 buffer overflows

Software Security
Buffer Overflows: public enemy number 1
Erik Poll, Digital Security, Radboud University Nijmegen

Essence of the problem
Suppose in a C program we have an array of length 4:
    char buffer[4];
What happens if we execute the statement below?
    buffer[4] = 'a';
Anything can happen! The valid indices are 0 to 3, so this write lands outside the array and the behaviour is undefined.
If the data written (ie. the 'a') is user input that can be controlled by an attacker, this vulnerability can be exploited: anything that the attacker wants can happen.

Solution to this problem
• Check array bounds at runtime
  – Algol 60 proposed this back in 1960!
• Unfortunately, C and C++ have not adopted this solution, for efficiency reasons. (Ada, Perl, Python, Java, C#, and even Visual Basic have.)
• As a result, buffer overflows have been the no. 1 security problem in software ever since.
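A runtime bounds check can be written by hand in C. The sketch below is only an illustration: the helper name checked_store and its signature are assumptions, not something from the slides or from any library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store `value` at `index` only if the index lies
   inside the array. Returns true on success, false if the write would
   fall outside the array (or the array pointer is NULL). */
bool checked_store(char *array, size_t length, size_t index, char value)
{
    if (array == NULL || index >= length) {
        return false;   /* reject instead of corrupting adjacent memory */
    }
    array[index] = value;
    return true;
}
```

With char buffer[4], checked_store(buffer, sizeof buffer, 4, 'a') returns false instead of writing past the end. The cost is one comparison per access, which is the efficiency argument mentioned above.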

Problems caused by buffer overflows
• The first Internet worm, and all subsequent ones (CodeRed, Blaster, ...), exploited buffer overflows
• Buffer overflows cause on the order of 50% of all security alerts
  – Eg. check out CERT, cve.mitre.org, or bugtraq
• Trends
  – Attacks are getting cleverer
    • defeating ever more clever countermeasures
  – Attacks are getting easier to do, by script kiddies

Any C(++) code acting on untrusted input is at risk
Eg.
• code taking input over untrusted network
  – eg. sendmail, web browser, wireless network driver, ...
• code taking input from untrusted user on multi-user system
  – esp. services running with high privileges (as root on Unix/Linux, as SYSTEM on Windows)
• code acting on untrusted files
  – that have been downloaded or emailed
• also embedded software, eg. in devices with (wireless) network connection such as mobile phones with Bluetooth, wireless smartcards, airplane navigation systems, ...

How does buffer overflow work?

Memory management in C/C++
• Program responsible for its memory management
• Memory management is very error-prone
  – Who here has had a C(++) program crash with a segmentation fault?
• Typical bugs:
  – Writing past the bound of an array
  – Dangling pointers
    • missing initialisation, bad pointer arithmetic, incorrect de-allocation, double de-allocation, failed allocation, ...
  – Memory leaks
• For efficiency, these bugs are not detected at run time, as discussed before:
  – behaviour of a buggy program is undefined

Process memory layout
[Figure: process memory, from high addresses to low addresses]
• Arguments/Environment (high addresses)
• Stack: grows down, by procedure calls
• Unused memory
• Heap (dynamic data): grows up, eg. by malloc and new
• Static data (.data)
• Program code (.text, low addresses)

Stack overflow
The stack consists of Activation Records (ARs):
[Figure: AR main() above AR f(); from high to low addresses the stack holds x, the return address, buf[4..7] and buf[0..3]. The stack grows downwards, the buffer grows upwards.]
    void f(int x) {
        char buf[8];
        gets(buf);
    }
    void main() {
        f(…); …
    }
    void format_hard_disk() {…}

Stack overflow
What if gets() reads more than 8 bytes?
• The extra bytes are written past buf[7], upwards through the rest of the activation record of f(), and overwrite the return address
• An attacker who controls the input can make f() return to code of his choice, eg. format_hard_disk()
• Never use gets()!

Stack overflow
• Lots of details to get right for the attacker:
  – No nulls in (character-)strings
  – Filling in the correct return address:
    • Fake return address must be precisely positioned
    • Attacker might not know the address of his own string
  – Other overwritten data must not be used before return from function
  – …

Variants & causes
• Stack overflow is overflow of a buffer allocated on the stack
• Heap overflow is the same, for a buffer allocated on the heap
Common causes:
• careless programming with arrays and strings
  – esp. library functions for null-terminated strings
• problems with format strings
But other low-level coding defects can also result in buffer overflows, eg. integer overflows or data races

Example: gets
    char buf[20];
    gets(buf); // read user input until first EoL or EoF character
• Never use gets: it has no way to know the size of buf (it was removed from the C standard in C11)
• Use fgets(buf, size, stdin) instead
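A minimal sketch of the fgets replacement, under the assumption that over-long lines may simply be cut off. The helper name read_line is hypothetical; it takes the stream as a parameter so that it can be used with stdin or with a file.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf`, storing at most
   size-1 characters plus the terminating '\0'.
   Returns 0 on success, -1 on EOF, read error or invalid arguments. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (in == NULL || buf == NULL || size == 0 || size > INT_MAX) {
        return -1;
    }
    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return 0;
}
```

Typical use would be read_line(stdin, buf, sizeof buf). Note that the rest of an over-long line stays in the stream and is returned by the next call, so a real program still has to decide how to treat truncated input.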

Example: strcpy
    char dest[20];
    strcpy(dest, src); // copies string src to dest
• strcpy assumes dest is long enough, and assumes src is null-terminated
• Use strncpy(dest, src, size) instead

Spot the defect! (1)
    char buf[20];
    char prefix[] = "http://";
    ...
    strcpy(buf, prefix);             // copies the string prefix to buf
    strncat(buf, path, sizeof(buf)); // concatenates path to the string buf

Spot the defect! (1)
    char buf[20];
    char prefix[] = "http://";
    ...
    strcpy(buf, prefix);             // copies the string prefix to buf
    strncat(buf, path, sizeof(buf)); // concatenates path to the string buf
• strncat's 3rd parameter is the maximum number of chars to copy, not the buffer size
• Another common mistake is giving sizeof(path) as 3rd argument...
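One way to call strncat correctly is to pass the room that is left in the destination, excluding the terminator. The sketch below assumes that truncating path is acceptable as long as the caller is told about it; the function name build_url and its return convention are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: write prefix followed by path into buf, which has
   room for buf_size bytes. Returns 0 if everything fitted, -1 if path had
   to be truncated or the prefix alone does not fit. */
int build_url(char *buf, size_t buf_size, const char *prefix, const char *path)
{
    size_t prefix_len = strlen(prefix);
    if (buf_size == 0 || prefix_len >= buf_size) {
        return -1;
    }
    memcpy(buf, prefix, prefix_len + 1);       /* prefix plus its '\0' */
    size_t room = buf_size - prefix_len - 1;   /* chars left, excluding '\0' */
    strncat(buf, path, room);                  /* strncat always adds the '\0' */
    return strlen(path) <= room ? 0 : -1;
}
```

With a 20-byte buffer and the prefix "http://", room is 12, so at most 12 characters of path are appended.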

Spot the defect! (2)
    char src[9];
    char dest[9];
    char base_url[] = "www.ru.nl";
    strncpy(src, base_url, 9); // copies base_url to src
    strcpy(dest, src);         // copies src to dest

Spot the defect! (2)
    char src[9];
    char dest[9];
    char base_url[] = "www.ru.nl";
    strncpy(src, base_url, 9); // copies base_url to src
    strcpy(dest, src);         // copies src to dest
• base_url is 10 chars long, incl. its null terminator, so src won't be null-terminated
• so strcpy will overrun the buffer dest

Example: strcpy and strncpy
• Don't replace
      strcpy(dest, src)
  by
      strncpy(dest, src, sizeof(dest))
  but by
      strncpy(dest, src, sizeof(dest)-1);
      dest[sizeof(dest)-1] = '\0';
  if dest should be null-terminated!
• Btw: a strongly typed programming language could of course enforce that strings are always null-terminated...
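The two-step idiom can be wrapped in a small function so that the termination step cannot be forgotten. This is a sketch only; the name copy_terminated is made up for illustration, and the destination size is passed explicitly because sizeof(dest) only works on a real array, not on a pointer.

```c
#include <string.h>

/* Hypothetical helper: copy src into dest (dest_size bytes), truncating if
   necessary, and always leave dest null-terminated. */
void copy_terminated(char *dest, size_t dest_size, const char *src)
{
    if (dest_size == 0) {
        return;                         /* no room even for the terminator */
    }
    strncpy(dest, src, dest_size - 1);  /* copies at most dest_size-1 chars */
    dest[dest_size - 1] = '\0';         /* strncpy does not guarantee this */
}
```

With char dest[9], copying "www.ru.nl" yields "www.ru.n": the string is silently truncated, which is safe for memory but may still be wrong for the application.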

Spot the defect! (3)
    char *buf;
    int i, len;
    read(fd, &len, sizeof(len)); // read sizeof(len) bytes, ie. an int, and store these in len
    buf = malloc(len);
    read(fd, buf, len);
• We forget to check for bytes representing a negative int, so len might be negative
• A negative len is cast to unsigned when passed to malloc and read, so it turns into a huge length
• read then goes beyond the end of buf

Spot the defect! (3)
    char *buf;
    int i, len;
    read(fd, &len, sizeof(len));
    if (len < 0) { error("negative length"); return; }
    buf = malloc(len);
    read(fd, buf, len);
• Remaining problem may be that buf is not null-terminated

Spot the defect! (3)
    char *buf;
    int i, len;
    read(fd, &len, sizeof(len));
    if (len < 0) { error("negative length"); return; }
    buf = malloc(len+1);
    read(fd, buf, len);
    buf[len] = '\0'; // null terminate buf
• len+1 may result in integer overflow; we should check that len+1 is positive
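The checks discussed on these slides can be combined as in the sketch below. It uses stdio's fread instead of read(fd, ...) so that it does not depend on POSIX, and it assumes an application-level upper limit MAX_MESSAGE_LEN; both the limit and the function name read_message are assumptions for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_MESSAGE_LEN 4096   /* assumed application limit, not from the slides */

/* Hypothetical helper: read an int length followed by that many bytes.
   Returns a null-terminated heap buffer (caller frees), or NULL on error. */
char *read_message(FILE *in)
{
    int len;
    if (fread(&len, sizeof len, 1, in) != 1) {
        return NULL;
    }
    /* Rejects negative lengths, and keeps len+1 far away from overflow. */
    if (len < 0 || len > MAX_MESSAGE_LEN) {
        return NULL;
    }
    char *buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        return NULL;                        /* failed allocation */
    }
    if (fread(buf, 1, (size_t)len, in) != (size_t)len) {
        free(buf);                          /* short read: input was truncated */
        return NULL;
    }
    buf[len] = '\0';                        /* null terminate buf */
    return buf;
}
```

The upper bound does double duty: it rules out the len+1 overflow and it stops an attacker from requesting a very large allocation.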

Spot the defect! (4)
    #ifdef UNICODE
    #define _sntprintf _snwprintf
    #define TCHAR wchar_t
    #else
    #define _sntprintf _snprintf
    #define TCHAR char
    #endif
    TCHAR buff[MAX_SIZE];
    _sntprintf(buff, sizeof(buff), "%s\n", input);
[slide from presentation by Jon Pincus]

Spot the defect! (4)
    #ifdef UNICODE
    #define _sntprintf _snwprintf
    #define TCHAR wchar_t
    #else
    #define _sntprintf _snprintf
    #define TCHAR char
    #endif
    TCHAR buff[MAX_SIZE];
    _sntprintf(buff, sizeof(buff), "%s\n", input);
• _snwprintf's 2nd param is # of chars in buffer, not # of bytes
• The CodeRed worm exploited such an ANSI/Unicode mismatch
[slide from presentation by Jon Pincus]
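The usual remedy is to pass the number of elements rather than the number of bytes. The sketch below shows this with the standard C function swprintf, which also takes a count of wide characters; it is a stand-in for the Windows-specific _sntprintf family, and ELEMS, MAX_SIZE and format_line are assumed names for illustration.

```c
#include <stddef.h>
#include <wchar.h>

#define MAX_SIZE 16                               /* assumed value for illustration */
#define ELEMS(a) (sizeof(a) / sizeof((a)[0]))     /* element count of a real array */

/* Hypothetical helper: format input followed by a newline into a wide
   buffer with room for out_elems wide characters. Returns the number of
   wide characters written, or a negative value if it does not fit. */
int format_line(wchar_t *out, size_t out_elems, const wchar_t *input)
{
    return swprintf(out, out_elems, L"%ls\n", input);
}
```

For wchar_t buff[MAX_SIZE], the call format_line(buff, ELEMS(buff), input) passes 16, whereas sizeof(buff) would pass 32 or 64 depending on the platform's wchar_t size. ELEMS only works on arrays, not on pointers.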

Spot the defect! (5)
    #define MAX_BUF 256
    void BadCode(char* input) {
        short len;
        char buf[MAX_BUF];
        len = strlen(input);
        if (len < MAX_BUF) strcpy(buf, input);
    }

Spot the defect! (5)
    #define MAX_BUF 256
    void BadCode(char* input) {
        short len;
        char buf[MAX_BUF];
        len = strlen(input);
        if (len < MAX_BUF) strcpy(buf, input);
    }
• What if input is longer than 32K?
• len will be a negative number, due to integer overflow; hence: potential buffer overflow
• The integer overflow is the root problem, but the (stack) buffer overflow that this enables makes it exploitable
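A sketch of a version without the narrowing conversion: the length is kept in a size_t, the type that strlen returns. The name copy_input and the boolean return value are assumptions for illustration.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_BUF 256

/* Hypothetical helper: copy input into buf (buf_size bytes) only if it
   fits, including the null terminator. Returns true if it was copied. */
bool copy_input(char *buf, size_t buf_size, const char *input)
{
    size_t len = strlen(input);     /* same type as strlen: no truncation */
    if (len >= buf_size) {
        return false;               /* too long: refuse rather than overflow */
    }
    memcpy(buf, input, len + 1);    /* len + 1 also copies the '\0' */
    return true;
}
```

A 40000-character input is now rejected, where the short version would typically have seen a negative length and copied it.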

Spot the defect! (6)
    bool CopyStructs(InputFile* f, long count) {
        structs = new Structs[count];
        for (long i = 0; i < count; i++) {
            if (!(ReadFromFile(f, &structs[i]))) break;
        }
    }

Spot the defect! (6)
    bool CopyStructs(InputFile* f, long count) {
        structs = new Structs[count];
        for (long i = 0; i < count; i++) {
            if (!(ReadFromFile(f, &structs[i]))) break;
        }
    }
• new Structs[count] effectively does a malloc(count*sizeof(type)), which may cause integer overflow
• And this integer overflow can lead to a (heap) buffer overflow: the allocated block is then smaller than the loop assumes
• (The Microsoft Visual Studio 2005(!) C++ compiler adds a check to prevent this)
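In C the multiplication is written out by the programmer, so the overflow check has to be explicit too. A minimal sketch, with the hypothetical name alloc_array:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for count elements of elem_size bytes.
   Returns NULL if count * elem_size would overflow size_t, or if malloc
   itself fails. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL;   /* the product does not fit in a size_t */
    }
    return malloc(count * elem_size);
}
```

The standard calloc(count, elem_size) is an alternative that takes the two factors separately and additionally zeroes the memory.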

Spot the defect! (7)
    char buff1[MAX_SIZE], buff2[MAX_SIZE];
    // make sure url is a valid URL and fits in buff1 and buff2:
    if (!isValid(url)) return;
    if (strlen(url) > MAX_SIZE - 1) return;
    // copy url up to first separator, ie. first '/', to buff1
    out = buff1;
    do {
        // skip spaces
        if (*url != ' ') *out++ = *url;
    } while (*url++ != '/');
    strcpy(buff2, buff1);
    ...
[slide from presentation by Jon Pincus]

Spot the defect! (7)
Loop termination (exploited by Blaster)
    char buff1[MAX_SIZE], buff2[MAX_SIZE];
    // make sure url is a valid URL and fits in buff1 and buff2:
    if (!isValid(url)) return;
    if (strlen(url) > MAX_SIZE - 1) return;
    // copy url up to first separator, ie. first '/', to buff1
    out = buff1;
    do {
        // skip spaces
        if (*url != ' ') *out++ = *url;
    } while (*url++ != '/');
    strcpy(buff2, buff1);
    ...
• strlen(url) only gives the length up to the first null
• What if there is no '/' in the URL? The loop then runs on past the end of url and of buff1
[slide from presentation by Jon Pincus]

Spot the defect! (7)
Attempted fix: also stop at the end of the string
    char buff1[MAX_SIZE], buff2[MAX_SIZE];
    // make sure url is a valid URL and fits in buff1 and buff2:
    if (!isValid(url)) return;
    if (strlen(url) > MAX_SIZE - 1) return;
    // copy url up to first separator, ie. first '/', to buff1
    out = buff1;
    do {
        // skip spaces
        if (*url != ' ') *out++ = *url;
    } while ((*url++ != '/') && (*url != 0));
    strcpy(buff2, buff1);
    ...
[slide from presentation by Jon Pincus]

Spot the defect! (7)
    char buff1[MAX_SIZE], buff2[MAX_SIZE];
    // make sure url is a valid URL and fits in buff1 and buff2:
    if (!isValid(url)) return;
    if (strlen(url) > MAX_SIZE - 1) return;
    // copy url up to first separator, ie. first '/', to buff1
    out = buff1;
    do {
        // skip spaces
        if (*url != ' ') *out++ = *url;
    } while ((*url++ != '/') && (*url != 0));
    strcpy(buff2, buff1);
    ...
• Order of tests is wrong (note the first test includes ++)
• What about 0-length URLs?
• Is buff1 always null-terminated?
[slide from presentation by Jon Pincus]
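One possible repair, sketched under a few assumptions: the loop tests for the end of the string before it reads or advances, it tracks how much room is left in the output, and it always terminates the output. Unlike the slide code it does not copy the '/' itself. The function name copy_until_slash and its return convention are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: copy url up to (not including) the first '/' or the
   end of the string into out, skipping spaces. out has room for out_size
   bytes and is always null-terminated when out_size > 0.
   Returns the number of chars copied, or -1 if the result did not fit. */
int copy_until_slash(const char *url, char *out, size_t out_size)
{
    size_t n = 0;
    if (out_size == 0) {
        return -1;
    }
    for (; *url != '\0' && *url != '/'; url++) {
        if (*url == ' ') {
            continue;               /* skip spaces */
        }
        if (n + 1 >= out_size) {    /* keep one byte for the terminator */
            out[n] = '\0';
            return -1;
        }
        out[n++] = *url;
    }
    out[n] = '\0';
    return (int)n;
}
```

This covers the three questions on the slide: a URL without '/' stops at the null, a 0-length URL yields an empty string, and the output is null-terminated on every path.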

Spot the defect! (8)
    #include <stdio.h>
    int main(int argc, char* argv[]) {
        if (argc > 1) printf(argv[1]);
        return 0;
    }
This program is vulnerable to format string attacks, where calling the program with strings containing special characters can result in a buffer overflow attack.

Format string attacks
• Completely new type of attack, invented/discovered in 2000. Like integer overflows, it can lead to buffer overflows.
• Strings can contain special characters, eg. %s in
      printf("Cannot find file %s", filename);
  Such strings are called format strings
• What happens if we execute the code below?
      printf("Cannot find file %s");
• What may happen if we execute
      printf(string)
  where string is user-supplied? Esp. if it contains special characters, eg. %s, %x, %n, %hn?

Format string attacks
• %x reads and prints 4 bytes from the stack
  – this may leak sensitive data
• %n writes the number of characters printed so far to an address taken from the stack
  – this allows stack overflow attacks...
• Note that format strings break the "don't mix data & code" principle.
• Easy to spot & fix: replace printf(str) by printf("%s", str)
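The fix can be seen in a small sketch that uses snprintf so that the result is easy to inspect. The function name render_message is hypothetical; the point is only that the untrusted text is passed as an argument and never as the format.

```c
#include <stdio.h>

/* Hypothetical helper: write untrusted text into out using a constant
   format string, so that %x, %n etc. in the text are treated as data.
   Returns what snprintf returns: the length the full text needs. */
int render_message(char *out, size_t out_size, const char *untrusted)
{
    return snprintf(out, out_size, "%s", untrusted);
}
```

Calling render_message(out, sizeof out, "%x%x%n") simply stores the six characters %x%x%n in out; nothing is read from or written to the stack on behalf of the input.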

Dynamic countermeasures incl. stack canaries

Dynamic countermeasures
Protection by kernel
• non-executable memory (NOEXEC)
  – prevents attacker executing her code
• address space layout randomisation (ASLR)
  – generally makes attacker's life harder
• instruction set randomisation
  – hardware support needed to make this efficient enough
Protection inserted by the compiler
• stack canaries to prevent or detect malicious changes to the stack; examples to follow
• obfuscation of memory addresses
Stack canaries do not protect against heap overflows

Dynamic countermeasure: stack canaries
• introduced in StackGuard in gcc
• a dummy value (stack canary or cookie) is written on the stack in front of the return address and checked when the function returns
• a careless stack overflow will overwrite the canary, which can then be detected.
• a careful attacker can overwrite the canary with the correct value.
• additional countermeasures:
  – use a random value for the canary
  – XOR this random value with the return address
  – include string termination characters in the canary value
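The canary idea can be illustrated with a simulation. Real canaries are inserted by the compiler in the function prologue and epilogue; the sketch below only mimics that with an ordinary byte array standing in for a stack frame, so that the "overflow" stays inside memory the program owns. All names and sizes are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { BUF_BYTES = 8, CANARY_BYTES = 4, FRAME_BYTES = BUF_BYTES + CANARY_BYTES };

/* Simulated frame: bytes 0..7 play the role of the local buffer, bytes
   8..11 hold the canary that sits between the buffer and the return
   address. Copies n bytes of input into the frame without checking them
   against BUF_BYTES (n is capped at the frame size so that the simulation
   itself stays in bounds) and reports whether the canary survived. */
bool canary_intact_after_write(const unsigned char *input, size_t n, uint32_t canary)
{
    unsigned char frame[FRAME_BYTES] = {0};

    memcpy(frame + BUF_BYTES, &canary, CANARY_BYTES);  /* "prologue": place canary */

    if (n > FRAME_BYTES) {
        n = FRAME_BYTES;
    }
    memcpy(frame, input, n);                           /* the unchecked copy */

    /* "epilogue": compare before the return address would be used */
    return memcmp(frame + BUF_BYTES, &canary, CANARY_BYTES) == 0;
}
```

Writing 8 bytes leaves the canary intact; writing 12 bytes of 'A' destroys it and the check fails. A canary value such as 0x000AFF0D contains a null byte and a newline, which illustrates the last bullet above: string functions stop copying at such characters, so they cannot reproduce the canary.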

Further improvements
• PointGuard
  – also protects other data values, eg. function pointers, with canaries
• ProPolice's Stack Smashing Protection (SSP) by IBM
  – also re-orders stack elements to reduce potential for trouble
• Stackshield has a special stack for return addresses, and can disallow function pointers to the data segment

Dynamic countermeasures
NB none of these protections are perfect! Eg.
• even if attacks on return addresses are caught, the integrity of other data on the stack can still be abused
• clever attacks may leave canaries intact
• where do you store the "master" canary value?
  – a cleverer attack could change it
• none of this protects against heap overflows
  – eg. buffer overflow within a struct...
• ...

Windows 2003 Stack Protection
The subtle ways in which things can still go wrong...
• Enabled with /GS command line option
• Similar to StackGuard, except that when the canary is corrupted, control is transferred to an exception handler
• Exception handler information is stored ... on the stack
  – http://www.securityfocus.com/bid/8522/info
• Countermeasure: register exception handlers, and don't trust exception handlers that are not registered or that are on the stack
• Attackers may still abuse existing handlers or point to exception handlers outside the loaded module...

Countermeasures
• We can take countermeasures at different points in time
  – before we even begin programming
  – during development
  – when testing
  – when executing code
  to prevent, to detect (at (pre)compile time or at runtime), and to mitigate problems with buffer overflows

Prevention
• Don't use C or C++
• Better programmer awareness & training
  Eg. read (and make other people read)
  • Building Secure Software, J. Viega & G. McGraw, 2002
  • Writing Secure Code, M. Howard & D. LeBlanc, 2002
  • 19 deadly sins of software security, M. Howard, D. LeBlanc & J. Viega, 2005
  • Secure programming for Linux and UNIX HOWTO, D. Wheeler
  • Secure C coding, T. Sirainen

Dangerous C system calls
source: Building secure software, J. Viega & G. McGraw, 2002
• Extreme risk: gets
• High risk: strcpy, strcat, sprintf, scanf, sscanf, fscanf, vfscanf, vsscanf, streadd, strecpy, strtrns, realpath, syslog, getenv, getopt, getopt_long, getpass
• Moderate risk: getchar, fgetc, getc, read, bcopy
• Low risk: fgets, memcpy, snprintf, strccpy, strcadd, strncpy, strncat, vsnprintf

Prevention – use better string libraries
• there is a choice between using statically vs dynamically allocated buffers
  – static approach easy to get wrong, and chopping user input may still have unwanted effects
  – dynamic approach susceptible to out-of-memory errors, and need for failing safely
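The dynamic approach and its "failing safely" requirement can be sketched as follows. The type dynstr and the function dynstr_append are made up for illustration and do not correspond to any particular library; the point is that an allocation failure is reported to the caller and leaves the string unchanged.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string; start with {NULL, 0, 0}. */
typedef struct {
    char  *data;   /* null-terminated contents, or NULL when empty */
    size_t len;    /* number of chars, excluding the '\0' */
    size_t cap;    /* allocated bytes */
} dynstr;

/* Append text to s, growing the buffer as needed.
   Returns 0 on success; returns -1 on size overflow or allocation
   failure, in which case s is left exactly as it was. */
int dynstr_append(dynstr *s, const char *text)
{
    size_t add = strlen(text);
    if (add > SIZE_MAX - s->len - 1) {
        return -1;                      /* len + add + 1 would overflow */
    }
    size_t need = s->len + add + 1;
    if (need > s->cap) {
        char *p = realloc(s->data, need);
        if (p == NULL) {
            return -1;                  /* old buffer is still valid */
        }
        s->data = p;
        s->cap = need;
    }
    memcpy(s->data + s->len, text, add + 1);   /* includes the '\0' */
    s->len += add;
    return 0;
}
```

No input is chopped, but every call site now has to handle the -1 case, and an attacker who can supply very long input can still exhaust memory unless the application imposes its own limit.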

Better string libraries (1)
• libsafe.h provides safer, modified versions of eg. strcpy
  – prevents buffer overruns beyond current stack frame in the dangerous functions it redefines
• libverify: enhancement of libsafe
  – keeps copies of the stack return address on the heap, and checks if these match
• strlcpy(dst,src,size) and strlcat(dst,src,size), with size the size of dst, not the maximum length copied. Consistently used in OpenBSD

Better string libraries (2)
• glib.h provides the GString type for dynamically growing null-terminated strings in C
  – but failure to allocate will result in a crash that cannot be intercepted, which may not be acceptable
• Strsafe.h by Microsoft guarantees null-termination and always takes destination size as argument
• C++ string class
  – but data() and c_str() return low level C strings, ie. char*, and the result of data() is not always null-terminated on all platforms...

Detection before shipping
• Testing
  – Difficult! How to hit the right cases?
  – Fuzz testing (test for crash on long, random inputs) can be successful in finding some weaknesses
• Code reviews
  – Expensive & labour intensive
• Code scanning tools (aka static analysis), eg.
  – RATS, also for PHP, Python, Perl
  – Flawfinder, ITS4, Deputy, Splint
  – PREfix, PREfast by Microsoft
  plus other commercial tools
  – Coverity
  – Parasoft
  – Klocwork

More prevention & detection
• Bounds checkers
  – add additional bounds info for pointers and check these at run time
  – eg. Bcc, RTcc, CRED, ...
  – RICH prevents integer overflows
• Safe variants of C
  – adding bound checks, but also type checks and more (eg. garbage collection or region-based memory management)
  – eg. Cyclone (http://cyclone.thelanguage.org), CCured, Vault, Control-C, Fail-Safe C, …

More prevention & detection
The most extreme form of static analysis:
• Program verification
  – proving by mathematical means (eg. Hoare logic) that memory management of a program is safe
  – extremely labour-intensive
  – eg. hypervisor verification project by Microsoft & Verisoft:
    • http://www.microsoft.com/emic/verisoft.mspx

Reducing attack surface
• Not running or even installing certain software, and not enabling all features by default, mitigates the threat

Summary
• Buffer overflows are the top security vulnerability
• Any C(++) code acting on untrusted input is at risk
• Getting rid of buffer overflow weaknesses in C(++) code is hard (and may prove to be impossible)
  – Ongoing arms race between countermeasures and ever more clever attacks.
  – Attacks are not only getting cleverer, using them is getting easier

More general
Buffer overflow is an instance of three more general problems:
1) lack of input validation
2) mixing data & code
   – data and return address on the stack
3) believing in & relying on an abstraction
   – in this case, the abstraction of procedure calls offered by C
• Attacks often exploit holes in abstractions that are not 100% enforced

Moral of the story
• Don't use C(++), if you can avoid it
  – but use a language that provides memory safety, such as Java or C#
• If you do have to use C(++), become or hire an expert
• Required reading
  – A Comparison of Publicly Available Tools for Dynamic Buffer Overflow Prevention, by John Wilander and Mariam Kamkar

Optional Reading
• If you want/need to read some more to understand how buffer overflow attacks work, or are interested in a very comprehensive account of countermeasures:
  Y. Younan, W. Joosen, F. Piessens, Code injection in C and C++: a survey of vulnerabilities and countermeasures
