ECC collision HOWTO

published on Tue 29 October 2024 by

This is a small blog post I would have been happy to read while searching how to find ecc collision when dealing with how to exploit a limited write to flash.

No rocket science, just a little write-up.

Short Context

While working on a target, I found a primitive that allowed me to overwrite a flash page. By overwriting I mean, writing again. If you are not familiar with how some flash works, they must first be erased (aka, every bit is set to 1 (or 0) level, and the written (a bit can be flipped to the other state). Overwriting is doing the same flip again, but you can only flip on bit in the erased state to the written state.

ERASE
flash state:
0xffffffff

WRITE 0xffffff01

flash state:
0xffffff01

WRITE 0xffffff00

flash state:
0xffffff00

WRITE 0xffffff01

flash state:
0xffffff00

See ? On the last write, the last bit was not set back to 1, but kept to 0.

So, this set up with a limit on what we can overwrite to the flash page (eg, if this is the first page to be executed, a jump to another place), but this is something that can dealt with easily.

Let's imagine that we have the jmp 0x00abcd encoded as 0x8000abcd and that 0x00000000 is a nop

Inital flash state:
0x8000abcd 0xffffffff 0xffffffff ... 0xffffffff
[jmp 0xabcd]

WRITE 0x00000000 0x80e00000 0xffffffff

flash state:
0x00000000 0x80e00000 0xffffffff ... 0xffffffff
[nop, jmp 0xe00000]

and now we can jump to our (controlled) code

Here enter ECC

Unfortunatelly, the flash I was dealing with had a protection mechanism to deal with errors. Internally, every 256 bit pages of data was stored on 256+22 bits, and those 22 bits kept an ecc code to detect (and correct) errors.

(ECC stands for Error Code Correction)

A quick overview of what the problem is, with samller bits numbers:

Wen doing a normal write of data

ERASE
flash state:
DATA: 0xffffffff ECC: 0xff

WRITE 0xFFFFFF01 (calculated ecc=0x20)

flash state:
DATA: 0xffffff01 ECC: 0x20

An ecc code is calulated and written

For another data, the ecc is different:

ERASE
DATA: 0xffffffff ECC: 0xff

WRITE 0xffffff00 (calculated ecc=0x30)

flash state:
DATA: 0xffffff00 ECC: 0x30

This mess with with our overwriting scheme (since ecc bits are stored in the flash, they also cannot be switched back to erase state)

eg:

ERASE
flash state:
DATA: 0xffffffff ECC: 0xff

WRITE 0xFFFFFF01 (calculated ecc=0x20)

flash state:
DATA: 0xffffff01 ECC: 0x20

WRITE 0xFFFFFF00 (calculated ecc=0x30)

flash state:
DATA: 0xffffff00 ECC: 0x20 = (0x30 & 0x20)

And here, the flash controller will detect that 0xffffff00 and 0x20 don't match.

The necessary maths (or not)

Disclaimer, I am not (I mean, no longer) a math guy, so I'll just put here the basics things needed to understand.

The Ecc code generated is an hamming code. It has surely a lot of property, but what will interest us in the cas of ecc colisions, is that:

  • Is is generated by a simple multiplication by a generation matrice
  • at the bit level, the sum is a xor

(for more maths, go to wikipedia)

so:

Ecc(DATAS_BITS) = DATA_BITS x GEN_MATRICE

and:

Ecc(DATA1 ^ DATA2) = Ecc(DATA1) ^ Ecc(DATA2)

Calculating ECC

Back to our problem. We have:

Inital flash state:
0x8000abcd 0xffffffff 0xffffffff ... 0xffffffff ECC(inital_data)
[jmp 0xabcd]

We want:

flash state:
0x00000000 0x80e00000 0xffffffff ... 0xffffffff ECC(new_data)

With

ECC(new_data) & ECC(initial_data) == ECC(new_data)

(ie, no bit writtent to zero in initial_ecc must be switched back to 1 in new_ecc)

But we may also be happy with

flash state:
0x00000000 0x80e00000 0xefffffff ... 0xffffffff ECC(new_data_2)

With the same constraint:

ECC(new_data_2) & ECC(initial_data) == ECC(new_data_2).

The first 64 bits of our payload need to be fixed, but we don't care of the state of the (256 - 64) bits remaining. So we have some latitude to forge a payload that will satisfy the constraint.

Remember that Ecc are additive ?

Let's call bit_array_n the 256 bits set to 0 except for bit n set to 1.

So here, as we have

new_data_2 = new_data ^ bit_array_65

we have

ECC(new_data_2) == ECC(new_data) ^ ECC(bit_array_65)

Let's call ECC_n = ECC(bit_array_n ( the ECC value of 256 bits set to 0 and bit n set to 1). (We can use an unprotected dev board of the same chip to extract thoses).

And for any interesting test_data, we have N bits (i1, i2, ..., iN) such as

test_data = new_data ^ (bit_array_i1 ^  ... ^ bit_array_iN)

and

ECC(test_data) = ECC(new_data) ^ ECC_i1 ^  ... ^ ECC_iN

Finding the collision

Now we can easily compute the ECC of any payload we will try to overwrite. But we still needs to find a collision, ie, we need to find the set of bits from 65 to 256 to set to zero in order for the resulting ECC to be initial_ecc.

So we must find (i1, i2, ..., iN) within {65..256} such that

ECC_i1 ^  ... ^ ECC_iN = ECC(new_data) ^ ECC(initial_data)

In order to do so, we first build for each couple and triplet of bits:

ECC(i,j)   = ECC(i) ^ ECC(j)
ECC(i,j,k) = ECC(i) ^ ECC(j) ^ ECC(k)

And sort them by the number of bits set, and keep the ones with only 3,4 bits set.

We have something like:

ECC(069,146)      002100a0 (04 bits changed)
ECC(065,066)      00012880 (04 bits changed)
ECC(130,134)      00012804 (04 bits changed)
ECC(236,132)      00004281 (04 bits changed)
ECC(227,108)      00004805 (04 bits changed)
...
ECC(128,145,153)  00000284 (03 bits changed)
ECC(231,124,144)  00000282 (03 bits changed)
ECC(224,078,155)  00000281 (03 bits changed)
...

At that point, I stopped doing automated thing and started to search by hand which bit set would produce the desired ECC(new_data) ^ ECC(initial_data), noticing that some combo leads to a one bit setter (eg ECC(224,078,155) ^ ECC(236,132) = 0x4000)

I was lucky in that ECC(new_data) ^ ECC(initial_data) had only a few bit set. But hasn'it be the case, I would first search the ECC_i space that would have minised the number of bits set.

I also didn't handle the case where a bit is part of two set you want to use.

For the record, here is the sequence I identified:

target ecc 22e773
set_1:
00220202 ->  e571
set 2
00000502 ->  e073
set 3
00009011 ->  7062
set 4
00007000 ->    62
set 5 & 6
00020240
00020220 ->     2
set 7 & 8
00340002
00340000 ->     0

And voilĂ  !

I can now overwrite the flash page with a controlled jump without the flash complaining about Ecc errors.

Conclusion

I tried to make this as much simple as I could, but reading it back, not sure I managed to do so.

Also note that, in my case, the ECC was also calculated on the address of the flash page, but in the end, it does not change the approach used.

Feel free to share if you found this interesting.

Social Network

Categories

Feeds