# Computing parity using CRC

It is a party trick rather than anything practically useful, but the following four functions all compute the same thing:

```
uint8_t parity_b(const uint64_t* data, size_t n) {
uint8_t acc = 0;
for (size_t i = 0; i < n; ++i) {
for (size_t j = 0; j < 64; ++j) {
if (data[i] & (1ull << j)) acc ^= 1;
}
}
return acc;
}
```

```
uint8_t parity_p(const uint64_t* data, size_t n) {
uint8_t acc = 0;
for (size_t i = 0; i < n; ++i) {
acc += _mm_popcnt_u64(data[i]);
}
return acc & 1;
}
```

```
uint8_t parity_x(const uint64_t* data, size_t n) {
uint64_t acc = 0;
for (size_t i = 0; i < n; ++i) {
acc ^= data[i];
}
return _mm_popcnt_u64(acc) & 1;
}
```

```
uint8_t parity_c(const uint64_t* data, size_t n) {
uint32_t acc = 0;
for (size_t i = 0; i < n; ++i) {
acc = _mm_crc32_u64(acc, data[i]);
}
return _mm_popcnt_u64(acc) & 1;
}
```

Going from `parity_b`

to `parity_p`

is an easy step: replace the inner loop with a `popcnt`

. Going from `parity_p`

to `parity_x`

is also an easy step: lift the `popcnt`

out of the loop. Hence it is not too surprising that `parity_b`

and `parity_p`

and `parity_x`

all compute the same thing. The `parity_c`

function is like `parity_x`

, except replacing `^=`

with `crc`

, and the *surprising* thing is that `parity_c`

computes the same thing as all the other functions. To understand *why*, we need to look at the polynomial used by the `crc`

instruction (the CRC32-C polynomial):

This polynomial can be expressed as the product of two separate polynomials:

Accordingly, CRC32-C isn't really a 32-bit checksum, it is really two separate things packed into 32 bits: a 1-bit checksum and a 31-bit checksum. The method of packing is slightly funky, and thus the method of *unpacking* is also slightly funky: from a mathematical point of view, to get the 1-bit checksum out of the 32-bit value, we need to view the 32-bit value as a Z_{2} polynomial and then compute the remainder modulo `x + 1`

. This is equivalent to iteratively applying the reduction rule `x = 1`

until nothing more can be reduced. If we were to start with the 4-bit polynomial `b`

and iteratively apply _{3} * x^{3} + b_{2} * x^{2} + b_{1} * x + b_{0}`x = 1`

, we'd get `b`

, which is just _{3} * 1*1*1 + b_{2} * 1*1 + b_{1} * 1 + b_{0}`b`

, which is just another way of saying _{3} + b_{2} + b_{1} + b_{0}`popcnt`

(mod 2). The same thing is true starting with a 32-bit polynomial: the remainder modulo `x + 1`

is just `popcnt`

(mod 2). In other words, we can use `popcnt`

to unpack the 1-bit checksum out of the 32-bit value, and a 1-bit checksum is just parity. Hence why `parity_c`

computes the same thing as all the other functions.

This combination of a 1-bit checksum and a 31-bit checksum can be seen as a *good* thing for checksumming purposes, as the two parts protect against different problems. Notably, every bit-flip in the data being checksummed will invert the 1-bit checksum, and hence any *odd* number of bit-flips will cause the 1-bit checksum to mismatch. The (ahem) flip-side is that the 1-bit checksum will *match* for any *even* number of bit-flips, which is where the 31-bit checksum comes in: it needs to protect against as many *even* number of bit-flips as possible.