If you haven’t already, I would recommend you check out the talk Yifan and I did at 35C3. We had a lot of content that we would have loved to speak about but did not have the required time to do so. This post will be addressing one of those, a fail in the F00D crypto processor. Before we dive into that, I will explain some fundamentals.
Cryptography on the Playstation Vita is covered under two devices that perform mostly the same function. “DMAC5” is such a device and it encapsulates cryptography performed by the ARM kernel, such as:
- Memory card encryption/decryption
- CMA PC backup encryption/decryption
- PFS encryption/decryption
- Kernel coredump encryption
- etc
DMAC5 can be broken down into two main components: the crypto device itself and the keyring associated with it. The keyring contains 0x20 slots of size 0x20 bytes. Normally, this keyring is configured by F00D with a handful of slots directly configurable from ARM. This keyring can then be used by DMAC5 by specifying a keyslot when performing a crypto operation.
In addition to DMAC5, there is another crypto device we call “bigmac” since it is only accessible by the “F00D” processor. Bigmac and DMAC5 have an almost identical interface, but bigmac is only accessible by F00D.
Just like DMAC5, bigmac also has an associated keyring. They are mostly the same except bigmac’s keyring has 0x800 slots of 0x20 bytes and each slot can be marked for specific capabilities. These capabilities can designate several thing:
- Which algorithm is permitted to access it (AES, HMAC, CMAC… etc).
- Which algorithm mode is permitted (encrypt, decrypt or both).
- Output destination – whether it can write results to memory or only to a specific group of slots.
- Whether or not it can be written to.
Most slots have no permissions and cannot be used. Some contain master keys which cannot be overwritten, and others are configurable and can be used for cryptography.
One of the critical usecases for bigmac is to “derive” keys. That is, take an encrypted key and “decrypt” it into a a keyslot. This allows Sony to update important keys without keeping them in plaintext. The only way it would be possible obtain these derived keys is to either compromise the master key used to decrypt these, or to somehow read out the keyslot.
Lets think about how a decryption operates traditionally with bigmac. We specify a few important details:
- The source containing the data we want to decrypt.
- The destination such as RAM address or a keyslot number.
- The keyslot we want as the key for the AES operation.
- The size of the data we want to decrypt.
Now, in the normal case for a key decryption we would provide an encrypted key, point the destination to a keyslot and set the size to 0x10 bytes. The first idea we can get to attack this would be to change the destination to memory. Unfortunately, all the keyslots used for deriving keys explicitly forbid using it without having the destination pointing to a keyslot.
The other idea would be to try different sizes. Interestingly, using a size that isn’t a multiple of the block cipher seems to work. In standard AES you are required to provide data in multiples of the block size, in this case 0x10. So, normally I would assume they are using a padding-scheme to work with non-divisible block sizes, but the result from bigmac is also the same size as the input.
After doing some experimentation I found that bigmac accepts sizes in multiples of 4 bytes. When performing a decryption of 4 bytes I noticed that the plaintext returned was different each time the engine was called. When doing an decryption of size larger than 0x10 bytes this property was not observed. This implies that partial bytes we input are mixed with the result from the previous calculation.
We can then try to model this behaviour. Lets assume that bigmac is just doing a dumb copy over some residual data. If we have 4 bytes of data 00 00 00 00
:
Okay, so we’ve learned that the 00 00 00 00
bytes were encrypted with residue data from whatever the previous operation was.
So how do we figure out what the residual data is? We need to get the bigmac internal buffer into a known state. To do this we decrypt a complete block of zeroes using a key of zeroes, that way we know plaintext, ciphertext and key.
Now lets assume that bigmac does not clear it’s internal buffer. We can confirm this now by decrypting 4 bytes of 00 00 00 00
with a key of zeroes:
As hypothesised, it seems than when we pass in data that is less that the block size, bigmac just blindly copies in the data without clearing its internal buffer. It then does the operation and copies out the size we input.
I’m sure you’re wondering how we can abuse this, or maybe you’ve even figured it out now. Lets mimic the key derivation process and imagine that there is a key: 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
that is stored in an encrypted format by some unknown encryption key. Lets say we decrypt this encrypted key into some keyslot. By doing this, bigmac’s internal buffer should contain the decrypted key. If we then ask bigmac to decrypt 4 bytes of 00 00 00 00
with an all zero key:
You can see that the internal buffer is now 00 00 00 00 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
and we have 4 bytes of the decrypted zeroes and key data. It’s important to note that even though the original key derivation must have the result stored into a keyslot, we are not bound to that restriction with our partial decryption. This allows us to control the key, observe ciphertext and partially control the plaintext.
With 4 bytes of this data being zero, we can bruteforce the rest with only 2^96 instead of 2^128. A huge improvement but still unreachable.
To remedy this, we repeat the process above but instead write 12 zeroes. After, we can determine 4 bytes of the key by bruteforcing. This is 2^32 which is easily achievable on a modern PC within a couple of minutes.
Repeat the process again with 8 zeroes and we can use the known value(s) we got from the the 12 zeroes step to determine another 4 bytes of the key.
Now we can revisit our 4 zeroes and determine another 4 bytes of the key with the value(s) we bruteforced.
The state now is that we have bruteforced 96 bits of the derived key in 3*2^32 attempts. To get the final 32 bits of the key the easiest method was just to use the full key to decrypt/encrypt some chosen plaintext. Then, bruteforce the last 32 bits by attempting to decrypt/encrypt the plaintext until we get the expected ciphertext.
Now you have successfully recovered the derived key. Viva la Raccoon exploit!
Some key points to takeaway from this. When bruteforcing it is possible that you will find collisions. This has happened a bunch of times to me, you just need to make sure to explore all possible avenues.
This method of key-recovery cannot be used to determine “master keys”. Master keys sit in a keyslot and are locked from being overwritten. This method is only good for data that is encrypted/decrypted within bigmac.
Such an example of a derivation is the AA key (see the end of our talk for the details of this key) for the F00D loaders. This key is derived by decrypting a part of the header with the master key (slot 0x208) into another slot (say, for example 0xA). This means that when/if Sony change the AA key, we can glitch to get bootrom code execution and perform this attack to get the new key. This isn’t possible outside of bootrom because Sony disables the 0x208 master key after boot.
This is an interesting bug and would have been easily fixed by forbidding non-block aligned sizes being passed into the crypto device. The other easy solution would have been to clear the internal buffer at the end of a crypto operation. As this bug is in hardware, it will always be possible to obtain derived keys. The only secure assets left for encryption/decryption in the system are the master keys.
Thanks to @qlutoo and Proxima for their work on this.
Make sure to follow me on twitter: @DaveeFTW