# Cracking Rolling Code Locks the lazy way

I took some of my Christmas break time to solve as many challenges as I could in HackerOne’s CTF. Out of the remaining challenges I still had to solve, there was a couple of named *Model E1337 — Rolling Code Lock* and *Model E1337 v2 — Hardened Rolling Code Lock.* Both of the challenges had Math listed as a required skill, so I decided to put my brain to the test and give them a try.

## SPOILER ALERT

This write-up actually contains valid working ways of solving the before mentioned challenges. If you are only looking for tips and would like to figure out the problems yourself, stop here and DM me on Twitter (@bananabr). I will try to help as best as I can.

# The challenges

Both challenges are very similar, you are given an input field and must enter a number that matches the next output generated by a PRNG.

In both applications, if an incorrect number is entered, the user is presented with an error message showing the expected value. The main difference between v1 and v2 is the size of the expected number.

To solve both challenges I had to first get access to the PRNGs source code. I won’t be covering this part of the challenge but it shouldn’t take too much to find another write-up (spoiler alert #2) if you can’t do it.

*Model E1337 — Rolling Code Lock*

The first challenge’s PRNG source code is presented below.

I am lazy. This means that after reversing and understanding the first loop I was already bored. Being lazy doesn’t mean I am not stubborn though. So after struggling with pen and paper for a while, I decided to forget the math and tackle the problem with a more hacky approach.

For starters, I encapsulated the code into a class.

Then I decided to try out a theory.

Could this be solved by using symbolic execution?

For those who don’t know what symbolic execution is, here is a definition according to Wikipedia.

In computer science,

symbolic execution(alsosymbolic evaluationorsymbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inputs rather than obtaining actual inputs as normal execution of the program would. It thus arrives at expressions in terms of those symbols for expressions and variables in the program, and constraints in terms of those symbols for the possible outcomes of each conditional branch.

As my tool of choice, I used the z3-solver python lib. Although z3-solver is actually an SMT solver, it works well enough as a symbolic execution engine/tool for this challenge.

The first thing was to define the constraints z3-solver would work with. Those were pretty simple to define after some basic code review. The PRNG seed must be ≥0, ≤0xFFFFFFFF and the first 26 bits returned by the PRNG must match the number returned by the error message. To improve the chances of getting the seed right on the first try, I also included another constraint asking for extra 26 bits that must match the number reported by a second consecutive error message.

The following is a PoC exploit script that given the first two consecutive values returned by the challenge’s PRNG outputs the seed and the value to be entered to get the flag.

In my not so new workstation, the above script will run in about five seconds.

In summary, to get the second flag, reset the challenge instance so the PRNG is initialized with a new seed. Next, input two random numbers as input and take note of the expected numbers returned in the error messages. Finally, feed v1 and v2 in the PoC script and run it. That’s it, no math involved. Well, a lot of math involved actually, but I left it for z3 ;).

For those really interested in a mathematical approach to this challenge, I recommend this video.

*Model E1337 v2 — Hardened Rolling Code Lock*

To solve the second challenge I had to find a way to basically accomplish the same thing. Given the first two consecutive outputs from a PRNG, get the seed to find the third one. The following snippet contains the source code for the hardened version of the PRNG.

By comparing the above code with the one from the first challenge, it’s possible to enumerate the following differences:

- The provided seed is now 64-bits instead of 32
- There are three extra loops for each bit of output

These simple changes were supposed to be enough to prevent brute-force attempts. However, there is a bug in the setup function. Only the 32 rightmost bits of the provided seed are considered. This means that seed 0xAAAABBBBCCCCDDDD and seed 0x00006666CCCCDDDD will result in the same PRNG initial state. Because of this bug, the key strength of the PRNG is only 32-bits instead of 64.

Although 32-bits doesn’t seem like much, the extra loops plus the fact that this challenge generates 64 bits numbers instead of 26 are enough to make the z3 solution unfeasible. Nonetheless, I was still not convinced that I couldn’t use pure computational power to attack this PRNG.

After a couple of unsuccessful tries, I decided to go fancy and created a vector based version of the PRNG algorithm.

If you are wondering what difference does this actually makes I included some code samples below.

The following snippet uses a linear approach to generate 100,000 8 bits numbers from sequential seeds. This takes around 22 seconds on my workstation.

The following parallel version will do the same thing and run in around 7 seconds using all of my 8 cores.

The next vector-based version will do the exact same thing and run in 0.4 seconds!

Let’s do some simple math just to check how long would take to cover the full 32-bits space.

0xFFFFFFFF/100000 = 42949

42949*0.4 = 17179.600000000002

17179.600000000002/60/60 = 4.772111111111112

So for 8 bits numbers, we would need approximately 5 hours, which is not so bad. I could have just left the script running during a single night and it would be done. But that was still not enough. It’s 2020, it shouldn’t take this long to brute-force a simple PRNG using a 32-bits key. So, I decided to research how I could use my GPU to make my tool run faster and found cupy.

According to cupy’s own web site cupy is:

an open-source array library accelerated with NVIDIA CUDA. CuPy provides GPU accelerated computing with Python. CuPy uses CUDA-related libraries including cuBLAS, cuDNN, cuRand, cuSolver, cuSPARSE, cuFFT and NCCL to make full use of the GPU architecture.

In most cases, cupy works as a drop-in replacement for numpy, which I was already using for the vector-based class. Because of that, I literally only had to change the import line from “*import numpy as np”* to “*import cupy as np”* to test it.

Using the same test case as before, I compared the performance of numpy x cupy generating one million 8 bits outputs from sequential seeds. Numpy average time was around 8.307s while cupy’s was 1.960s! My new time prediction was then:

0xFFFFFFFF/1000000 = 4294.967295

4294.967295*1.960 = 8418.1358982

8418.1358982/60/60 = 2.3383710828333335

So on average, I would need 2.34 hours to find all 32-bits seeds whose first generate 8-bits are the same. That’s not bad at all, just a couple of Netflix shows and it would be done. But why 8 bits? Wasn’t the challenge generating 64 bits outputs? Yes indeed, but the idea was to avoid wasting cycles. If two seeds cannot produce the same 8 bits they for sure won’t produce the same 64. The idea was to drastically reduce the search space before performing the full 64 bits attack. The final exploit code is shown below.

The exploit works this way:

- Given the first two values V1 and V2 produced by the
*E1337 v2 PRNG,*search through the whole 32 bits space for seed candidates by comparing the first 8 bits generated by each seed to the leftmost 8 bits of V1 - Once all 8 bits candidates are found, look for those who can produce V1.
- For every seed that can produce V1, check if the next generated 64 bits output is equal to V2.
- If the seed is able to generate V1 and V2 as their first two 64 bits outputs, consider that seed valid, print it, and exit.

The following picture shows the script running and finding 39000 candidates out of a million. This means a lot of computing cycles in the final search run.

This next picture shows the exploit code finally finding a valid seed, which I used to get the flag.

At last, the output of the Linux *time* command, showing the whole process took only 106 minutes. Way less than my initial prediction.

# Final thoughts

I plan on taking the time to understand the mathematical approach to this challenge and recommend everyone to do the same. When v3 is released and the setup function is fixed, hammering the challenge probably won’t do. Even so, I learned a lot of new things solving these, and I hope this write-up could do the same for someone else.