HACKVent 2014 - Day 20 writeup
I’ve sign up for the Hackvent event made by the guys from www.hacking-lab.com, which is a advent-like hacking competition. Every day there is a new challenge posted at midnight which has a to solved at best in the same day, the challenge becoming increasingly more difficult every week completed. The aim in every puzzle is to find either a qr-encoded x-mas ball with lead to the validation code, or a secret human-readable string which gives you the former ball when feeding into a validator (the “Ball-O-Matic”).
Here’s the write-up for the twentieth challenge, in which we will constantly be called a “hobo”.
Stop ! Hammer Time Part :
For the Day 20 Hackvent challenge, we were given the following instructions :
The url links to this website :
It’s quite the minimalist page, where there is only a mathematical equation embedded in a gif image - the equation being randomly selected at every new request - and two very interesting headers in the GET response :
X-Riddler : "INFO" X-Riddler-Howto :"I will reward you with a nice christmas ball if you solve 30 of my riddles in just a minute. Just send me back my cookies and POST your answer as 'result'. However, if one of your answers is wrong, you'll have to start over."
Fortunately, I got this hint quite early in the game (like about 15 mins since I’ve begun) but I know it has stumped other contestants. Along the headers , there is also some cookies :
HACKVent_Ticket : "edb20784c6810ed9aef2f1ec0031f071" HACKVent_User : "bHVjYXNn" PHPSESSID : "epflifvtjn8p1v9ghsnt2i3dt2"
The instructions are quite clear : I have to “solve” one way or another 30 equations under a minute, and send a POST request for every answer. There is still two hurdles : how am I gonna solve the 30 equations, and how I have to send back the answer.
One equation to solve every two seconds seemed difficult to do, knowing that there is at least 300ms of latency between the POST answer and the reception of the next challenge. Which leave less about 1.5 sec to crack the equation automatically. That’s why I initially though there were some kind of session fixation where you could have an impact on the
SESSID cookie and always getting the same string of equations. Kinda like when you set the seed for a pseudo-random generator. I try to set the
PHPSESSID value but to no avail. So instead I tried to solve the equations automatically.
Can you read it ? Part :
The first thing to check is to be sure to send back the cookies every time we want to send an answer. For that matter,
curl has some pretty nice built-ins options.
For the initial download :
curl -c cookies.txt "http://220.127.116.11/1JnjqflWseX_29Ow-YXH/" -o 1JnjqflWseX_29Ow-YXH1.gif
-c cookies.txtcan save the current cookies in the provided file
- http://hackvent.hacking-lab.com/1JnjqflWseX_29Ow-YXH is a redirection, so instead of tell curl to follow redirs, I rather used the IP 18.104.22.168.
-o 1JnjqflWseX_29Ow-YXH1.gif: save the gif file
For the answer :
curl -X POST -b cookies.txt --data "result="$payload "http://22.214.171.124/1JnjqflWseX_29Ow-YXH/" -o 1JnjqflWseX_29Ow-YXH2.gif
-X POSTtells curl to send the request as POST.
-b cookies.txtloads the cookies and send them with the message.
--data "result="$payload.This option add a result field in the sent message, telling the riddle what is our answer. I spent way too long trying to find this.
-o 1JnjqflWseX_29Ow-YXH2.gif: save the next equation (if our result is correct).
Now that we’ve solved the communication part, we need to be able to detect and solve correctly the equations given by the riddler. The only command-line OCR tool which is free is called tesseract, and I was impressed by the result. The only alternative to detect the symbol would be to build a database of letters and operators, and run a least-square estimator on the pixels difference to retrieve what could be the equation. Being lazy, I went with the external tool solution, even if I think the latter should work quite reliably.
Once installed, you can call tesseract this way :
tesseract equation.gif equation_output
The decoded text is then saved into
equation_output.txt. After the equation decoding, you also need a numerical evaluator, one you can tweak a bit in order to correct the errors made in the OCR part. I used this python’s numerical parser found in a StackOverflow answer :
Notice the ugly serie of
replace’s which are here to sanitize the decoded equation. the
* symbol was notoriously incorrectly recognized.
CSI Enhancing Part :
However, that wasn’t enough since tesseract had difficulties differenciating
8’s. I could answer several equations, but I always got one wrong answer every ten-ish equations, which wouldn’t cut it. I had to ease the work done for tesseract.
I used ImageMagick (I never ceased to be amazed by its efficiency) convert tool to enbiggen the gif equations before feeding into tesseract :
convert $equation.gif -resize "200%" big_equation.gif.
Once we get every part right, you just need to wrap it up using a simple shell script :
You get 30 differents gif files, the latest one being a x-mas qr-ball :p