# ECC collision HOWTO

published on Tue 29 October 2024 by Xilokar

This is a small blog post I would have been happy to read while searching how to find ecc collision when dealing with how to exploit a limited write to flash.

No rocket science, just a little write-up.

## Short Context

While working on a target, I found a primitive that allowed me to overwrite a flash page. By overwriting I mean, writing again. If you are not familiar with how some flash works, they must first be erased (aka, every bit is set to 1 (or 0) level, and the written (a bit can be flipped to the other state). Overwriting is doing the same flip again, but you can only flip on bit in the erased state to the written state.

ERASE flash state: 0xffffffff WRITE 0xffffff01 flash state: 0xffffff01 WRITE 0xffffff00 flash state: 0xffffff00 WRITE 0xffffff01 flash state: 0xffffff00

See ? On the last write, the last bit was not set back to 1, but kept to 0.

So, this set up with a limit on what we can overwrite to the flash page (eg, if this is the first page to be executed, a jump to another place), but this is something that can dealt with easily.

Let's imagine that we have the `jmp 0x00abcd` encoded as `0x8000abcd` and that `0x00000000` is a `nop`

Inital flash state: 0x8000abcd 0xffffffff 0xffffffff ... 0xffffffff [jmp 0xabcd] WRITE 0x00000000 0x80e00000 0xffffffff flash state: 0x00000000 0x80e00000 0xffffffff ... 0xffffffff [nop, jmp 0xe00000]

and now we can jump to our (controlled) code

## Here enter ECC

Unfortunatelly, the flash I was dealing with had a protection mechanism to deal with errors. Internally, every 256 bit pages of data was stored on 256+22 bits, and those 22 bits kept an ecc code to detect (and correct) errors.

(ECC stands for Error Code Correction)

A quick overview of what the problem is, with samller bits numbers:

Wen doing a normal write of data

ERASE flash state: DATA: 0xffffffff ECC: 0xff WRITE 0xFFFFFF01 (calculated ecc=0x20) flash state: DATA: 0xffffff01 ECC: 0x20

An ecc code is calulated and written

For another data, the ecc is different:

ERASE DATA: 0xffffffff ECC: 0xff WRITE 0xffffff00 (calculated ecc=0x30) flash state: DATA: 0xffffff00 ECC: 0x30

This mess with with our overwriting scheme (since ecc bits are stored in the flash, they also cannot be switched back to erase state)

eg:

ERASE flash state: DATA: 0xffffffff ECC: 0xff WRITE 0xFFFFFF01 (calculated ecc=0x20) flash state: DATA: 0xffffff01 ECC: 0x20 WRITE 0xFFFFFF00 (calculated ecc=0x30) flash state: DATA: 0xffffff00 ECC: 0x20 = (0x30 & 0x20)

And here, the flash controller will detect that `0xffffff00` and `0x20` don't match.

## The necessary maths (or not)

Disclaimer, I am not (I mean, no longer) a math guy, so I'll just put here the basics things needed to understand.

The Ecc code generated is an hamming code. It has surely a lot of property, but what will interest us in the cas of ecc colisions, is that:

- Is is generated by a simple multiplication by a generation matrice
- at the bit level, the sum is a xor

(for more maths, go to wikipedia)

so:

Ecc(DATAS_BITS) = DATA_BITS x GEN_MATRICE

and:

Ecc(DATA1 ^ DATA2) = Ecc(DATA1) ^ Ecc(DATA2)

## Calculating ECC

Back to our problem. We have:

Inital flash state: 0x8000abcd 0xffffffff 0xffffffff ... 0xffffffff ECC(inital_data) [jmp 0xabcd]

We want:

flash state: 0x00000000 0x80e00000 0xffffffff ... 0xffffffff ECC(new_data)

With

`ECC(new_data) & ECC(initial_data) == ECC(new_data)`

(ie, no bit writtent to zero in `initial_ecc` must be switched back to 1 in `new_ecc`)

But we may also be happy with

flash state: 0x00000000 0x80e00000 0xefffffff ... 0xffffffff ECC(new_data_2)

With the same constraint:

`ECC(new_data_2) & ECC(initial_data) == ECC(new_data_2)`.

The first 64 bits of our payload need to be fixed, but we don't care of the state of the (256 - 64) bits remaining. So we have some latitude to forge a payload that will satisfy the constraint.

Remember that `Ecc` are additive ?

Let's call `bit_array_n` the 256 bits set to 0 except for bit n set to 1.

So here, as we have

`new_data_2 = new_data ^ bit_array_65`

we have

`ECC(new_data_2) == ECC(new_data) ^ ECC(bit_array_65)`

Let's call `ECC_n = ECC(bit_array_n` ( the `ECC` value of 256 bits set to 0 and bit n set to 1). (We can use an unprotected dev board of the same chip to extract thoses).

And for any interesting test_data, we have `N` bits (`i1`, `i2`, ..., `iN`) such as

`test_data = new_data ^ (bit_array_i1 ^ ... ^ bit_array_iN)`

and

`ECC(test_data) = ECC(new_data) ^ ECC_i1 ^ ... ^ ECC_iN`

## Finding the collision

Now we can easily compute the ECC of any payload we will try to overwrite. But we still needs to find a collision, ie, we need to find the set of bits from 65 to 256 to set to zero in order for the resulting ECC to be `initial_ecc`.

So we must find (`i1`, `i2`, ..., `iN`) within `{65..256}` such that

`ECC_i1 ^ ... ^ ECC_iN = ECC(new_data) ^ ECC(initial_data)`

In order to do so, we first build for each couple and triplet of bits:

ECC(i,j) = ECC(i) ^ ECC(j) ECC(i,j,k) = ECC(i) ^ ECC(j) ^ ECC(k)

And sort them by the number of bits set, and keep the ones with only 3,4 bits set.

We have something like:

ECC(069,146) 002100a0 (04 bits changed) ECC(065,066) 00012880 (04 bits changed) ECC(130,134) 00012804 (04 bits changed) ECC(236,132) 00004281 (04 bits changed) ECC(227,108) 00004805 (04 bits changed) ... ECC(128,145,153) 00000284 (03 bits changed) ECC(231,124,144) 00000282 (03 bits changed) ECC(224,078,155) 00000281 (03 bits changed) ...

At that point, I stopped doing automated thing and started to search by hand which bit set would produce the desired `ECC(new_data) ^ ECC(initial_data)`, noticing that some combo leads to a one bit setter (eg `ECC(224,078,155) ^ ECC(236,132) = 0x4000`)

I was lucky in that `ECC(new_data) ^ ECC(initial_data)` had only a few bit set. But hasn'it be the case, I would first search the `ECC_i` space that would have minised the number of bits set.

I also didn't handle the case where a bit is part of two set you want to use.

For the record, here is the sequence I identified:

target ecc 22e773 set_1: 00220202 -> e571 set 2 00000502 -> e073 set 3 00009011 -> 7062 set 4 00007000 -> 62 set 5 & 6 00020240 00020220 -> 2 set 7 & 8 00340002 00340000 -> 0

And voilĂ !

I can now overwrite the flash page with a controlled jump without the flash complaining about `Ecc` errors.

## Conclusion

I tried to make this as much simple as I could, but reading it back, not sure I managed to do so.

Also note that, in my case, the `ECC` was also calculated on the address of the flash page, but in the end, it does not change the approach used.

Feel free to share if you found this interesting.