Memory Corruption#

A few extra notes on the text from last time#

The Executable and Linkable Format (ELF)#

The Executable and Linkable Format (ELF) is the standard binary format used on Linux and other Unix-like systems. When you compile a C program on Linux, the resulting executable is an ELF file. While we often think of binaries as just machine code, ELF files are actually quite structured and contain a lot of metadata that tells the operating system how to load and execute the program.

An ELF file is divided into several parts. The ELF header sits at the beginning and contains basic information like the architecture the binary was compiled for (e.g., x86_64), the entry point address where execution should start, and pointers to other important structures in the file. Following the header are the program headers and section headers, which describe how the binary should be loaded into memory and what different parts of the file contain.
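To make the header fields above a bit more concrete, here is a minimal sketch in C that checks whether a byte buffer starts with a 64-bit ELF header and extracts the entry point. The function name elf64_entry_point is made up for this example; the Elf64_Ehdr structure and the ELFMAG, SELFMAG, EI_CLASS and ELFCLASS64 constants come from <elf.h> as shipped with glibc on Linux. The sketch assumes the file uses the same byte order as the machine running the code.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the entry point if the buffer
 * starts with a 64-bit ELF header, returns 0 otherwise.
 * Assumption: the file's byte order matches the host's. */
int elf64_entry_point(const unsigned char *image, size_t len, uint64_t *entry)
{
    Elf64_Ehdr header;

    if (len < sizeof(header))
        return 0;
    memcpy(&header, image, sizeof(header));   /* copy to avoid unaligned reads */

    /* e_ident begins with the magic bytes 0x7f 'E' 'L' 'F' */
    if (memcmp(header.e_ident, ELFMAG, SELFMAG) != 0)
        return 0;
    if (header.e_ident[EI_CLASS] != ELFCLASS64)
        return 0;

    *entry = header.e_entry;                  /* address where execution starts */
    return 1;
}
```

Feeding it the first bytes of a binary read from disk should give the same entry point that readelf -h reports.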

A nice visualisation is this illustration from the Corkami file formats repo (we recommend you check out its other illustrations too):

The ELF file structure (from [Corkami](https://github.com/corkami/pics/blob/master/binary/ELF101.png))

(Image courtesy of Corkami, licensed under CC-BY-3.0)

Segments and sections#

ELF files contain both sections and segments. Sections are used primarily by the linker and debugger and contain things like code (.text section), initialized data (.data section), and uninitialized data (.bss section). Segments, on the other hand, describe how the file should be mapped into memory when the program runs. When the operating system loads an ELF binary, it reads the program headers to determine which segments need to be loaded into virtual memory and where.

Memory permissions#

This brings us to an important security feature: memory permissions. When requesting a virtual memory mapping from the operating system (for example with the mmap syscall), we can specify which operations are allowed on it: some regions should be readable and executable (like the code), while others should be readable and writable (like the stack and heap).

Each page of virtual memory can have a combination of three permissions:

Read (r) - The process can read data from this memory
Write (w) - The process can modify data in this memory
Execute (x) - The CPU can execute instructions from this memory

The ELF program headers specify these permissions for each segment. Typically:

The .text section (containing code) is mapped as r-x
The .data and .bss sections are mapped as rw-

You can inspect these permissions for a running process by reading /proc/<PID>/maps on Linux, which lists all memory mappings and their permissions, or by using the vmmap command in pwndbg.
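As a small illustration, the following C sketch parses lines in the /proc/<PID>/maps format and prints the executable mappings of the current process. Both function names are invented for this example. The only assumption about the file format is that each line starts with start-end followed by a four-character permission field such as r-xp.

```c
#include <stdio.h>

/* Hypothetical helper: parse one line of /proc/<PID>/maps.
 * Returns 1 on success and fills in the address range and the
 * 4-character permission string (e.g. "r-xp"); returns 0 otherwise. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
    /* Line format: start-end perms offset dev inode [path] */
    return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3;
}

/* Print every executable mapping of the current process. */
void print_executable_mappings(void)
{
    char line[512];
    FILE *maps = fopen("/proc/self/maps", "r");

    if (!maps)
        return;
    while (fgets(line, sizeof(line), maps)) {
        unsigned long start, end;
        char perms[5];

        /* perms[2] is the execute flag: 'x' or '-' */
        if (parse_maps_line(line, &start, &end, perms) && perms[2] == 'x')
            printf("%lx-%lx %s\n", start, end, perms);
    }
    fclose(maps);
}
```

On a typical system with NX enabled, the printed list should contain the code of the binary and its libraries, but not the stack or the heap.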

Dynamic linking and libc#

When you write a C program and use functions like printf, you might wonder where these functions actually come from. They’re not part of your program’s code - they exist in shared libraries that get linked to your program. This process is called dynamic linking.

Static vs Dynamic Linking#

There are two ways to include library code in your program:

Static linking - The library code is copied directly into your executable at compile time. The resulting binary is larger but self-contained.
Dynamic linking - Your binary contains references to library functions, and the actual code is loaded from shared library files (.so files on Linux) at runtime.

Most programs use dynamic linking by default because it has several advantages: smaller binary sizes, shared memory between processes using the same library, and the ability to update libraries without recompiling programs. However, this also means the program depends on having the correct libraries available on the system.

The C Standard Library (libc)#

The most important library for C programs is libc, the C standard library. On most Linux systems, this is specifically glibc (GNU C Library), though alternatives like musl exist. This library provides fundamental functions like:

I/O functions: printf, scanf, fopen, fclose
String manipulation: strcpy, strcmp, strlen
Memory allocation: malloc, free, calloc
System call wrappers: read, write, open

Even the simplest C program that just returns from main will, by default, be dynamically linked against libc, because the startup code that calls your main function comes from libc.

How Dynamic Linking Works#

When you compile a program with dynamic linking (the default), the compiler doesn’t include the actual code for library functions. Instead, it creates a symbol table that lists which functions need to be resolved, and the linker (ld) adds information about which libraries to load.

When the program starts, the dynamic linker (ld-linux.so on Linux) takes over before your main function runs. It:

Loads the required shared libraries into memory
Resolves symbols - figures out where each library function actually is in memory
Updates your program’s references to point to the correct addresses
Finally transfers control to your program’s entry point

You can see which dynamic libraries a binary depends on using the ldd command:

ldd ./output_binary

This might output something like:

linux-vdso.so.1 (0x00007f1920ade000)
libc.so.6 => /lib64/libc.so.6 (0x00007f1920800000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1920ae0000)

Here we can see our program depends on libc.so.6, and the dynamic linker found it at /lib64/libc.so.6. The addresses shown are where these libraries were loaded in one run (these will likely change between executions).

The Procedure Linkage Table (PLT) and Global Offset Table (GOT)#

If you look at the disassembly of our earlier Hello World program, you might have noticed the call to printf looked like call 401030 <printf@plt>. What’s this @plt?

For performance reasons, dynamic symbol resolution can be lazy - functions aren’t resolved until the first time they’re called. This is implemented using two special sections - PLT and GOT.

The first time you call printf, the following happens:

Your code calls printf@plt
The PLT stub checks the GOT entry for printf
If it’s not resolved yet, the PLT stub calls the dynamic linker to resolve it
The dynamic linker finds printf in libc and updates the GOT entry
Subsequent calls to printf@plt find the resolved address in the GOT and jump directly there
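The steps above can be imitated in plain C with a function pointer that plays the role of a GOT slot. This is only a toy model under simplifying assumptions (one slot, one library function, no real symbol lookup), and all names in it are invented, but it shows why only the first call pays for the resolution.

```c
typedef int (*func_ptr)(int);

/* The "library" function we ultimately want to reach. */
static int library_double(int x) { return 2 * x; }

static int resolver_stub(int x);

/* Toy GOT: a single slot that initially points at the resolver stub. */
static func_ptr toy_got[1] = { resolver_stub };
static int resolver_calls = 0;

/* Stand-in for the dynamic linker: "look up" the symbol, patch the
 * GOT slot, then forward the original call. */
static int resolver_stub(int x)
{
    resolver_calls++;
    toy_got[0] = library_double;
    return toy_got[0](x);
}

/* Toy PLT entry: always jumps through the GOT slot. */
int double_via_plt(int x)
{
    return toy_got[0](x);
}

/* How many times the resolver had to run so far. */
int toy_resolver_calls(void)
{
    return resolver_calls;
}
```

Calling double_via_plt twice runs the resolver once; the second call goes straight to library_double through the patched slot.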

PLT (Procedure Linkage Table) - Contains small stub functions that handle the dynamic linking. When you call a library function, you actually call its PLT entry.

You can inspect the .plt section from our binary using objdump:

objdump -M intel -d -j .plt ./output_binary

And the output will be:

Disassembly of section .plt:

0000000000401020 <printf@plt-0x10>:
  401020:	ff 35 ca 2f 00 00    	push   QWORD PTR [rip+0x2fca]        # 403ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
  401026:	ff 25 cc 2f 00 00    	jmp    QWORD PTR [rip+0x2fcc]        # 403ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
  40102c:	0f 1f 40 00          	nop    DWORD PTR [rax+0x0]

0000000000401030 <printf@plt>:
  401030:	ff 25 ca 2f 00 00    	jmp    QWORD PTR [rip+0x2fca]        # 404000 <printf@GLIBC_2.2.5>
  401036:	68 00 00 00 00       	push   0x0
  40103b:	e9 e0 ff ff ff       	jmp    401020 <_init+0x20>

GOT (Global Offset Table) - Contains the actual addresses of library functions once they’ve been resolved.

You can likewise inspect its contents in the binary using objdump:

objdump -M intel -s -j .got.plt ./output_binary

And the output will be mostly zeros:

Contents of section .got.plt:
 403fe8 f83d4000 00000000 00000000 00000000  .=@.............
 403ff8 00000000 00000000 36104000 00000000  ........6.@.....

However, once the binary is running and printf has been called, the table is populated:

 403fe8 f83d4000 00000000 00a0fbf7 ff7f0000  .=@.............
 403ff8 3088fdf7 ff7f0000 5eeec5f7 ff7f0000  0.......^.......

(For information on how we got this output, check out the debugging section in the previous lecture)

And Now Memory Corruption#

In the previous chapter, we showed how to exploit a simple buffer overflow to overwrite local variables and return addresses on the stack. While that gave us a taste of binary exploitation, real-world exploitation is significantly more complex - especially in recent years.

Modern systems employ multiple layers of defense against memory corruption attacks (luckily), but attackers have developed sophisticated techniques to bypass them. We will explore the various types of memory corruption vulnerabilities, the defenses designed to stop them, and the exploitation techniques used to bypass those defenses.

Memory Corruption Vulnerabilities#

Memory corruption occurs when a program writes data to a memory location in an unintended way. In memory-safe languages like Rust or Python, the language and its runtime prevent most of these errors, though not all of them. In C and C++, however, the programmer is responsible for memory safety, and mistakes are common.

Attack: Buffer Overflow#

A buffer overflow occurs when data is written past the end of a fixed-size buffer.

We’ve already seen this with the gets function. When the buffer lives on the stack, it is called a stack buffer overflow:

void vulnerable() {
    char buffer[64];
    gets(buffer);
}

As already mentioned, gets reads into the buffer until a newline and is totally oblivious to its size.

When an overflow like this happens to a buffer on the heap (dynamically allocated memory), it’s called a heap buffer overflow:

void vulnerable(const char *user_input) {
    char *buffer = malloc(64);
    strcpy(buffer, user_input);  // no length check: overflows if user_input needs more than 64 bytes
}

strcpy, as the name suggests, copies a string but, again, without considering the size of the target buffer.

Harder to spot are off-by-one errors, where only one byte past the buffer is written:

void vulnerable(const char *some_data) {
    char buffer[64];
    for (int i = 0; i <= 64; i++) {  // bug: <= lets i reach 64, one past the last valid index 63
        buffer[i] = some_data[i];
    }
}

Although this might seem harmless, as one byte is not a lot, it can often be exploited too. On the stack, one byte may be just enough to corrupt critical data (like the saved rbp); on the heap, it may be possible to corrupt the next free chunk’s pointer in the free list. In the linked example, an exploit is shown for a scenario where you can overflow just a single null byte!

Apart from gets and strcpy, other dangerous functions in libc include sprintf and scanf. The latter can be used like this: scanf("%s", buffer), again writing a string into a buffer without being aware of its size. Even though gets is rare nowadays, strcpy and other functions are still widely used.

Defense: Better APIs & Bounds Checking#

You might be asking by now how the API of these libc functions can allow reading into a buffer without knowing its size. That is the right question.

Although far from all real-world buffer overflow vulnerabilities stem from these functions, it’s always wise to use fgets(buffer, sizeof(buffer), stdin) instead of gets(buffer) (using gets is especially unjustifiable), strncpy(dest, src, sizeof(dest)) instead of strcpy(dest, src), snprintf(buffer, sizeof(buffer), ...) instead of sprintf(buffer, ...), etc. Two caveats apply: sizeof only yields the buffer size when applied to an actual array, not to a pointer, and strncpy does not NUL-terminate dest when src is sizeof(dest) bytes or longer, so the terminator has to be written explicitly.
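One way to wrap this advice into a reusable function is sketched below. The name copy_bounded is made up for this example; it relies only on the standard guarantee that snprintf writes at most the given number of bytes, terminator included, and returns the length the full output would have had.

```c
#include <stdio.h>

/* Hypothetical helper: copy src into dest (capacity dest_size bytes),
 * always NUL-terminating the result. Returns 1 if the whole string fit
 * and 0 if it had to be truncated. */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    int needed;

    if (dest_size == 0)
        return 0;
    /* snprintf never writes more than dest_size bytes, terminator included */
    needed = snprintf(dest, dest_size, "%s", src);
    return needed >= 0 && (size_t)needed < dest_size;
}
```

Because the capacity is passed explicitly, the function also works when dest is a pointer to heap memory, where sizeof(dest) would give the wrong answer.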

Defense: Stack Canaries#

Stack canaries (also called stack cookies) are a defense mechanism that detects stack buffer overflows before they can be exploited. The idea is simple: place a random value on the stack between local variables and the return address. Before returning from the function, check if that value has been modified. If it has, the program terminates.

Here’s how the stack looks with a canary:

[Lower addresses]
  buffer[0]
  ...
  buffer[63]
  Canary  <- Random value
  Saved rbp
  Return address
[Higher addresses]

When a buffer overflow occurs, it will overwrite the canary before reaching the return address. The compiler inserts code at the end of the function to verify the canary hasn’t changed:

void function() {
    // compiler inserts: canary = RANDOM_VALUE
    char buffer[64];
    gets(buffer);
    // compiler inserts: if (canary != RANDOM_VALUE) abort()
}

The canary value is randomized when the program starts and kept in a special memory location (on x86-64 Linux, in thread-local storage at fs:0x28), making it hard for attackers to guess.
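The check the compiler inserts can be imitated by hand. The sketch below is a toy model, not what GCC or Clang actually emit: the "frame" is an ordinary struct, the canary is a fixed made-up constant instead of a random per-process value, and the copy is capped at the struct size so that the example stays well-defined C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy "stack frame": the canary sits directly after the buffer, the way
 * the compiler places it between the locals and the saved registers. */
struct toy_frame {
    char buffer[16];
    uint64_t canary;
};

/* Made-up constant; a real canary is random and chosen at program start. */
#define TOY_CANARY UINT64_C(0x00c0ffee12345678)

/* Copies len bytes of input into the frame without checking them against
 * the buffer size (len is only capped at the frame size to keep the toy
 * well-defined). Returns 1 if the canary survived, 0 if it was clobbered. */
int toy_function(const void *input, size_t len)
{
    struct toy_frame frame;

    memset(&frame, 0, sizeof(frame));
    frame.canary = TOY_CANARY;            /* "prologue": store the canary */

    if (len > sizeof(frame))
        len = sizeof(frame);
    memcpy(&frame, input, len);           /* the buggy copy */

    return frame.canary == TOY_CANARY;    /* "epilogue": verify the canary */
}
```

Any input longer than 16 bytes spills into the canary and is detected, while shorter inputs pass unnoticed.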

This of course doesn’t protect against overwriting just the local variables, or against overflows happening on the heap. It also breaks when the attacker is able to leak the canary somehow, using an information leak vulnerability.

Many Linux distributions configure GCC and Clang to enable stack canaries by default. The behavior can be controlled with compiler arguments: -fstack-protector instruments functions with character buffers of at least 8 bytes, -fstack-protector-strong covers more functions (for example those with any local array), and -fno-stack-protector disables canaries.

Attack: Out-of-Bounds Access#

Out-of-bounds (OOB) access is a more general category that includes buffer overflows but also reading past array boundaries:

int array[10];
printf("%d\n", array[15]);  // reading OOB
array[20] = 42;             // writing OOB

In this example it’s quite obvious what’s wrong, but in complex software, where pointers are passed around pointing to structs that are converted to or embedded in one another, it’s easy to make a mistake. Such bugs are also common in the interpreters of scripting languages.

OOB vulnerabilities are defended against by the same mechanisms as buffer overflows: proper bounds checking.
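A minimal sketch of such bounds checking, for the ten-element array from the example above, could look like this. The accessor names are invented; the point is that every access goes through one place that compares the index against the length.

```c
#include <stddef.h>

#define ARRAY_LEN 10

static int array[ARRAY_LEN];

/* Hypothetical checked accessors: return 1 on success and 0 if the index
 * is out of bounds, in which case no memory is touched. */
int array_get(size_t index, int *out)
{
    if (index >= ARRAY_LEN)
        return 0;
    *out = array[index];
    return 1;
}

int array_set(size_t index, int value)
{
    if (index >= ARRAY_LEN)
        return 0;
    array[index] = value;
    return 1;
}
```

Taking the index as size_t is a deliberate choice: a negative int converts to a very large unsigned value and is rejected by the same comparison.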

Attack: Use-After-Free (UAF)#

Use-after-free occurs when a program continues to use a pointer after the memory it points to has been freed:

char *ptr = malloc(64);
strcpy(ptr, "Hello");
free(ptr);
printf("%s\n", ptr);  // UAF: reading freed memory

Why is this dangerous? Well, after free(), the memory is returned to the allocator and may be reused for another allocation. If an attacker can control what gets allocated in that location, they can control what the dangling pointer points to.

The attacker can perform so-called heap feng shui: crafting a series of allocations and deallocations such that a sensitive allocation (such as the struct holding user credentials in the Linux kernel) is placed in the location that is still accessible through the dangling pointer. They can then modify the struct in an unintended way through that pointer.

UAF vulnerabilities are particularly prevalent in codebases with complex object lifetimes, such as kernel code or concurrent programs.

Attack: Double Free#

Double free occurs when free() is called twice on the same pointer:

char *ptr = malloc(64);
free(ptr);
free(ptr);  // double free

Since the allocator uses a linked list to keep track of free chunks, freeing the same chunk twice creates a loop in this list. This can let the attacker insert an arbitrary pointer into the list and later receive that pointer as an allocation from malloc.¹

Defense: Heap Corruption Detection#

Memory allocators (the code implementing malloc and free) are aware of these issues and how they are being exploited, so they keep adding checks and protections to make them harder to exploit and easier to detect. Some try very hard.

For example, when you compile and run the code above on a glibc-based Linux system, you will get:

free(): double free detected in tcache 2
[1]    366472 IOT instruction (core dumped)  ./binary

This is glibc detecting what is happening and aborting execution. One way this detection works is tcache keys: a bit of extra information stored in each freed chunk, which lets the allocator detect whether a double free is happening.
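The idea of a key can be shown with a toy free list. The sketch below is loosely modeled on the approach described above and is not glibc's actual code: every type and function name is made up, the key is simply the address of the bin, and real allocators add further checks.

```c
#include <stddef.h>

/* Toy free chunk: the link to the next free chunk plus a "key" that marks
 * the chunk as currently sitting in a free list. */
struct toy_chunk {
    struct toy_chunk *next;
    const void *key;
};

struct toy_bin {
    struct toy_chunk *head;
};

/* Returns 1 if the chunk was added to the free list and 0 if a double
 * free was detected. */
int toy_free(struct toy_bin *bin, struct toy_chunk *chunk)
{
    if (chunk->key == bin) {
        /* The key matches, which could be a coincidence (stale user
         * data), so walk the list to confirm the chunk is really there. */
        for (struct toy_chunk *c = bin->head; c != NULL; c = c->next)
            if (c == chunk)
                return 0;
    }
    chunk->key = bin;
    chunk->next = bin->head;
    bin->head = chunk;
    return 1;
}

/* Hands out the most recently freed chunk and clears its key. */
struct toy_chunk *toy_alloc(struct toy_bin *bin)
{
    struct toy_chunk *chunk = bin->head;

    if (chunk != NULL) {
        bin->head = chunk->next;
        chunk->key = NULL;
    }
    return chunk;
}
```

The key makes the common case cheap: the list is only walked when a chunk already carries the marker, which is rare for legitimate frees.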

UAFs are a bit harder to detect; that is a topic for the next lecture.

Another good practice is zeroing pointers right after freeing them:

free(ptr);
ptr = NULL;

If ptr is freed again, nothing happens (freeing NULL does nothing). If it’s accessed in a UAF, the program will most likely crash, because the page at address 0 is almost never mapped, exactly for this purpose.

Attack: Integer Overflow/Underflow#

Integers have a fixed width in CPUs (and therefore a minimal and a maximal value), so overflows and underflows naturally happen. They often lead to memory corruption when the affected values are used in size calculations:

void vulnerable(size_t count) {
    size_t size = count * sizeof(int);  // usually 4x as big as count, so the multiplication can overflow
    int *array = malloc(size);
    for (size_t i = 0; i < count; i++) {
        array[i] = get_value();  // but this will loop exactly count times
    }
}

If count is large enough, count * sizeof(int) wraps around to a small value. The malloc() succeeds with a tiny allocation, but the loop writes far beyond it.²

As you know, integers can be signed or unsigned. A careless conversion between the two, sometimes implicit and easy to miss, can also be problematic:

size_t int_to_sizet(int i) {
    return (size_t)(i > 0 ? i : -i);  // strips the sign: -5 becomes 5
}

void vulnerable(int fd, int size) {
    if (size > MAX_SIZE) return;  // check seems safe, but there is no lower bound
    char *buffer = malloc(int_to_sizet(size));  // size = -5 allocates 5 bytes
    if (!buffer) return;
    ssize_t bytes_read = read(fd, buffer, (size_t)size);  // (size_t)-5 is a huge length
}

Here, a negative size passes the check, but malloc() and read() then see different values: malloc() gets the small absolute value, while read() gets a huge unsigned number.

Defense: Safe Integer Arithmetic#

It’s necessary to think carefully about how numbers are reinterpreted when converted from one representation to another. Checking the lower bound when accepting signed integers, and not using signed integers for values like sizes (using size_t instead), is a good idea. However, this is complicated by the fact that many standard libc functions return signed integers and use negative values for error codes.
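A sketch of this advice: validate the signed value once, at the boundary, and only pass a size_t around afterwards. The function name and the MAX_SIZE value of 1024 are assumptions made for this example.

```c
#include <stddef.h>

#define MAX_SIZE 1024   /* assumed limit for this example */

/* Hypothetical validator: accepts only sizes in 1..MAX_SIZE and hands the
 * value back as size_t, so later code never sees a negative int.
 * Returns 1 if the size is acceptable, 0 otherwise. */
int validate_size(int requested, size_t *size)
{
    if (requested <= 0 || requested > MAX_SIZE)
        return 0;                 /* lower AND upper bound are checked */
    *size = (size_t)requested;    /* safe: the value is known to be positive */
    return 1;
}
```

With this in place, both malloc() and read() receive the same validated size_t, so they cannot disagree about the length.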

Recently, checked integer arithmetic was added to the C standard library (in C23), allowing for easy overflow detection:

#include <stdckdint.h>  // C23
size_t size;
if (ckd_mul(&size, count, sizeof(int))) {
    // Overflow occurred
    return -1;
}
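Where <stdckdint.h> is not available yet, the same check can be written by hand with a division. The sketch below uses invented function names and contrasts the wrapping multiplication from the vulnerable example with a checked variant.

```c
#include <stddef.h>
#include <stdint.h>

/* What the vulnerable code computes: the product silently wraps around
 * modulo SIZE_MAX + 1. */
size_t array_bytes_unchecked(size_t count)
{
    return count * sizeof(int);
}

/* Checked variant: returns 1 and stores the size, or returns 0 if
 * count * sizeof(int) does not fit into a size_t. */
int array_bytes_checked(size_t count, size_t *bytes)
{
    if (count > SIZE_MAX / sizeof(int))
        return 0;                     /* the multiplication would overflow */
    *bytes = count * sizeof(int);
    return 1;
}
```

For count = SIZE_MAX / sizeof(int) + 2 the unchecked version wraps around to just sizeof(int) bytes, which is exactly the tiny allocation an attacker is hoping for, while the checked version refuses.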

Mitigations#

In the previous lecture we showed you pwntools and its checksec tool, which checks the security features/mitigations of a binary. But we didn’t explain what all the lines of its output mean. The output of checksec ./output_binary looked like this:

[*] './output_binary'
    Arch:       amd64-64-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x400000)
    Stripped:   No
    Debuginfo:  Yes

Arch is the CPU architecture of the binary. Stack refers to the stack canaries that we explained above. Stripped: No means the binary contains symbols, so we know the names of the functions in the binary. Debuginfo: Yes means additional information, such as variable names, is included. Now we will go over the remaining lines.

Non-Executable Memory (NX/DEP)#

NX, standing for Non-eXecutable, means that the stack (like other data regions) is mapped read-write (RW) only, without the execute permission.

If the stack were read-write-execute (RWX), as it is without NX, an attacker could inject their own code (compiled instructions, called shellcode) onto the stack. This is easy, because the stack is where user input usually ends up in local variables. Then, if they can overwrite the return address using a buffer overflow and know the address of the stack, they can simply return to their own code and gain arbitrary code execution.³

An attack this simple is mostly not possible nowadays, as NX has become standard, as has ASLR, described in the next section.

Address Space Layout Randomization (ASLR)#

ASLR randomizes the memory addresses where code and data are loaded. Without ASLR, programs load at predictable addresses every time. With ASLR, each time you run a program, things are loaded at different addresses.

This is not mentioned in the checksec output, as it is set at the system level. You can check whether it is enabled on your system by reading /proc/sys/kernel/randomize_va_space (where 0=off, 1=conservative, 2=full).

If it is enabled, the following memory regions are placed at randomized locations (and the offsets between them are also randomized):

Stack
Heap
Shared library locations (libc, etc.)
Binary itself (if PIE is enabled, see below)

You can see this easily by running ldd twice. This is what it looks like on a system without ASLR:

ldd /bin/ls

    linux-vdso.so.1 (0x00007ffff7fcd000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7a00000)
    ...

ldd /bin/ls

    linux-vdso.so.1 (0x00007ffff7fcd000)  # same address
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ffff7a00000)

And here with ASLR:

ldd /bin/ls

    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8a2b200000)

ldd /bin/ls

    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c9e400000)

When ASLR is enabled, tasks like “jumping to the stack” become much harder for attackers. They usually first need to acquire a “leak”: some address from the memory region in question, from which they can calculate the rest of the addresses (as they are at stable offsets within the same memory region).

However, on 32-bit systems ASLR is much less effective, as there are only around 16 bits of entropy in the randomization, meaning it can be bruteforced easily, compared to 28 bits of entropy on a 64-bit system.

A notable limitation is also the fact that when a process forks, the child inherits the parent’s memory layout, so the addresses are the same. For example, Android forks pretty much all app processes from one process (zygote), making ASLR more or less ineffective between userspace applications.

Position Independent Executables (PIE)#

checksec does, however, mention PIE. This stands for position-independent executable and allows even the location of the binary’s own code to be randomized. This hampers another class of exploitation methods (ROP), which we will get into below.

Relocation Read-Only (RELRO)#

If you recall the PLT and GOT explainer from earlier, RELRO protection revolves around them. Vulnerabilities like OOB read or write, UAF, and double free can often lead to an arbitrary write primitive (also known as a write-what-where primitive). Turning this primitive into arbitrary code execution can be tricky. One way attackers do this is by overwriting an entry in the GOT. By overwriting the entry for, let’s say, puts with the address of system, every call to puts becomes a call to system. So whenever puts is called anywhere in the code with an attacker-controlled argument, that becomes a way to run an arbitrary command.

When RELRO is set to full protection, the sensitive parts of the GOT are made read-only, so they cannot be overwritten in a GOT overwrite attack. In “partial” mode, it doesn’t really make a difference for GOT overwrites, as it only moves the GOT before the global variables, to prevent an overflow from them into the GOT.

Exploitation Techniques#

Now that we understand the mitigations and defenses, let’s explore how attackers bypass them or are blocked by them.

Shellcode#

Shellcode is the name for a small piece of machine code (compiled instructions) that an attacker injects and executes. It’s called “shellcode” because it traditionally spawns a shell, but it can do anything.

A traditional shellcode that spawns a shell (/bin/sh) on an x86-64 Linux system looks like this:

; execve("/bin/sh", NULL, NULL)
xor rdi, rdi
xor rsi, rsi
xor rdx, rdx
xor rax, rax
push rax
; 68 73 2f 2f 6e 69 62 2f
mov rbx, 68732f2f6e69622fH
push rbx
mov rdi, rsp
mov al, 59
syscall

Which assembles to:

\x48\x31\xff\x48\x31\xf6\x48\x31\xd2\x48\x31\xc0\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\xb0\x3b\x0f\x05

Projects like shell-storm or pwntools shellcraft, as shown in the last lecture, collect various shellcodes for different architectures and purposes.

Return-Oriented Programming (ROP)#

ROP is a code reuse technique that bypasses NX by chaining together small snippets of existing code called gadgets.

A gadget is a short sequence of instructions ending in ret, like:

pop rdi
ret

By carefully crafting the stack, we can chain gadgets together to perform arbitrary computation, all reusing existing executable code - from the binary itself, or from linked libraries, like libc.

Remember that ret pops an address from the stack and jumps to it. By placing multiple addresses on the stack, we can execute a chain of gadgets:

Stack:
  [address of gadget 1]  <- rsp when first ret executes
  [address of gadget 2]  <- rsp when gadget 1's ret executes
  [address of gadget 3]  <- rsp when gadget 2's ret executes
  ...

Suppose we want to call execve("/bin/sh", NULL, NULL). We need to:

Set rdi = pointer to "/bin/sh"
Set rsi = NULL
Set rdx = NULL
Set rax = 59 (execve syscall number)
Execute syscall

We find gadgets like:

gadget1: pop rdi; ret
gadget2: pop rsi; ret
gadget3: pop rdx; ret
gadget4: pop rax; ret
gadget5: syscall; ret

Stack setup:

[address of gadget1]  ; pop rdi; ret
[address of "/bin/sh"]
[address of gadget2]  ; pop rsi; ret
[0x0000000000000000]
[address of gadget3]  ; pop rdx; ret
[0x0000000000000000]
[address of gadget4]  ; pop rax; ret
[0x000000000000003b]  ; 59 in hex
[address of gadget5]  ; syscall; ret
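The same layout can be written down as an array of 64-bit words. The sketch below only builds the chain in memory; it does not exploit anything. The struct and function names are invented, and all addresses are placeholders that a real exploit would have to find for its concrete target.

```c
#include <stddef.h>
#include <stdint.h>

/* Addresses of the five gadgets plus the "/bin/sh" string. All of them
 * are placeholders to be filled in for a concrete target. */
struct rop_addrs {
    uint64_t pop_rdi, pop_rsi, pop_rdx, pop_rax, syscall_ret, bin_sh;
};

#define CHAIN_WORDS 9

/* Writes the chain in the order in which the successive ret instructions
 * consume it. Returns the number of 8-byte words written. */
size_t build_execve_chain(const struct rop_addrs *a, uint64_t chain[CHAIN_WORDS])
{
    size_t n = 0;

    chain[n++] = a->pop_rdi;      /* pop rdi; ret            */
    chain[n++] = a->bin_sh;       /*   rdi = "/bin/sh"       */
    chain[n++] = a->pop_rsi;      /* pop rsi; ret            */
    chain[n++] = 0;               /*   rsi = NULL            */
    chain[n++] = a->pop_rdx;      /* pop rdx; ret            */
    chain[n++] = 0;               /*   rdx = NULL            */
    chain[n++] = a->pop_rax;      /* pop rax; ret            */
    chain[n++] = 59;              /*   rax = execve (0x3b)   */
    chain[n++] = a->syscall_ret;  /* syscall; ret            */
    return n;
}
```

In an actual exploit these 72 bytes would be written over the saved return address, so that the first ret of the vulnerable function starts the chain.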

Tools like ROPgadget, ropper, or pwntools rop automatically search binaries for useful gadgets:

ROPgadget --binary /lib/x86_64-linux-gnu/libc.so.6 --only "pop|ret"
0x0000000000023b6a : pop rdi ; ret
0x0000000000023b68 : pop rsi ; ret
0x0000000000001b96 : pop rdx ; ret
...

Of course, ROP has the prerequisite of knowing the addresses of the code you want to reuse, which often means defeating ASLR, as well as having enough gadgets to do something meaningful.

ret2libc#

ret2libc (return-to-libc) is a simpler code reuse technique: instead of chaining gadgets, directly call useful library functions.

The most common target is system() from libc, which executes a shell command:

system("/bin/sh");

So, if we manage to pass a pointer to a prepared "/bin/sh" string as the argument and return to system, we have won.

The stack setup for x86-64 (where the first argument is passed in rdi) can look like this:

[address of "pop rdi; ret" gadget]
[address of "/bin/sh" string]
[address of system()]

Of course, if ASLR is enabled, we need to leak the address of libc somehow.
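Once one libc address has been leaked, the rest is arithmetic, because symbols sit at fixed offsets from the load address within a given libc build. The sketch below assumes the leaked value is the runtime address of puts. The two offsets are made up for illustration; real ones have to be read from the exact libc file the target uses.

```c
#include <stdint.h>

/* Offsets of two symbols inside one particular libc build. These numbers
 * are invented for this example. */
#define PUTS_OFFSET   UINT64_C(0x80e50)
#define SYSTEM_OFFSET UINT64_C(0x50d70)

/* The load address of libc is the leaked address minus the symbol offset. */
uint64_t libc_base_from_puts(uint64_t leaked_puts)
{
    return leaked_puts - PUTS_OFFSET;
}

/* Any other symbol is then the base plus its own offset. */
uint64_t system_address(uint64_t libc_base)
{
    return libc_base + SYSTEM_OFFSET;
}
```

A quick sanity check on the result is that the computed base should be page-aligned, that is, its lowest 12 bits should be zero.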

GOT Overwrite#

Already covered in the RELRO section, but let us link this nice tutorial about GOT overwrite from pwntools.

Information Leaks#

ASLR is a powerful defense, but it relies on addresses being secret. If an attacker can leak addresses, they can defeat ASLR.

Leaks usually involve some function printing or otherwise outputting more than it should:

void leak(char *buffer, size_t size) {
    char data[64];
    memcpy(buffer, data, size);  // If size > 64, leaks stack data to buffer
}
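A fixed version of this function clamps the requested size to the size of the source array. The sketch below uses an invented function name and fills data with a placeholder byte so that the example is self-contained.

```c
#include <stddef.h>
#include <string.h>

/* Copies at most sizeof(data) bytes into buffer and returns how many
 * bytes were actually copied. */
size_t copy_out_clamped(char *buffer, size_t size)
{
    char data[64];

    memset(data, 'x', sizeof(data));   /* stand-in for the real content */
    if (size > sizeof(data))
        size = sizeof(data);           /* never read past the end of data */
    memcpy(buffer, data, size);
    return size;
}
```

Returning the number of bytes copied lets the caller notice that fewer bytes than requested were delivered.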

Format String Bug#

Another common source of leaks is format string bugs.

A format string bug is a type of injection: attacker-controlled input containing format specifiers, like %p or %s, is passed to printf as the format string. These bugs are particularly powerful for reading, but also for writing memory:

char buffer[100];
gets(buffer);
printf(buffer);  // format string vulnerability

If the user inputs %p %p %p %p, printf will read values from the places where it expects its arguments (first registers, then the stack) and print them:

$ ./program
%p %p %p %p
0x7fffffffe3a0 0x7ffff7a03780 0x555555554740 0x4141414141414141

The %s format specifier treats its argument as a pointer and dereferences it. By carefully positioning values on the stack, we can read arbitrary memory:

Input: AAAAAAAA%7$s

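Assuming the start of our buffer is what printf treats as its 7th argument, %7$s uses the eight A bytes as a pointer. If the address 0x4141414141414141 happens to be valid, this reads memory from that location.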

To write arbitrary memory with printf, the %n format specifier can be used, which writes the number of bytes printed so far to an address:

int count;
printf("Hello%n\n", &count);  // count = 5

An attacker can use this to write arbitrary values to arbitrary addresses:

Input: AAAAAAAA%7$n

Writes the number of bytes printed to the address 0x4141414141414141.
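The usual fix is to never pass untrusted data as the format string itself, but as an argument to a constant format. A minimal sketch, with an invented function name:

```c
#include <stdio.h>

/* Formats untrusted text safely: the input is passed as an argument to a
 * constant "%s" format, so specifiers inside it are never interpreted.
 * Returns what snprintf returns: the length of the full output. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    return snprintf(out, out_size, "%s", untrusted);
}
```

With this, an input such as %p %n is copied to the output character by character instead of being acted upon.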

TLDR#

Memory Corruption Vulnerabilities:

Buffer Overflow: Writing past buffer boundaries on stack or heap

Out-of-Bounds: Reading/writing outside allocated memory, like not checking an index is in bounds

Use-After-Free: Accessing freed memory that might have been reallocated

Double Free: Freeing memory twice, corrupting allocator metadata

Integer Overflow: Arithmetic errors leading to incorrect size or bounds calculations

Defenses:

Stack Canaries: Random value protecting return addresses, detects overflows

NX/DEP: Without executable memory the attacker cannot execute shellcode

ASLR: Randomizes addresses of stack, heap, and libraries

PIE: Makes executable position-independent, enabling full ASLR

RELRO: Makes GOT read-only to prevent function pointer hijacking

Exploitation Techniques:

Shellcode: Injected machine code, limited by NX

ROP: Chain gadgets ending in ret to build arbitrary computation, bypasses NX

ret2libc: Call existing library functions like system()

GOT Overwrite: Hijack function pointers in Global Offset Table

Information Leaks: Leak addresses to defeat ASLR (format strings, buffer over-reads)

Further resources#

pwn.college — comprehensive course on binary exploitation
Nightmare — detailed binary exploitation writeups
how2heap — heap exploitation techniques
ROP Emporium — practice challenges for learning ROP
Fil-C - a memory-safe implementation of C and C++
LiveOverflow Binary Exploitation — bit outdated, but still relevant video series on binary exploitation
Smashing The Stack For Fun And Profit — the 1996 article that was one of the first places to introduce binary exploitation

¹ It’s not that easy nowadays with all the protections, but the principle remains the same.
² The example pictured is not very practical, but only a demonstration.
³ They don’t even have to know the exact address, by using a NOP Sled.