Data Integrity: CRC with Swift on iOS

When working with any communication or storage medium such as Bluetooth, wi-fi (and even wired networks) or disk storage, data can get corrupted. In such cases, we need to make sure that the data we receive (or read) is the same data that was sent (or written). We need a mechanism to validate the integrity of the data.

There are many ways to accomplish this. The most common way is called Cyclic Redundancy Check (CRC). CRC is like a checksum that is appended to the data. It enables us to detect any unwanted changes in the transmitted or stored data.

Understanding the maths behind the CRC is a little challenging. But if you want to take a deep dive, we’ve got another article for you which covers all the theory and explains in detail how a CRC works. In this article, we will solely focus on the implementation of a CRC algorithm in Swift.

There are different kinds of CRCs that vary in their bit size. Common ones are CRC-8, CRC-16, and CRC-32. CRCs with larger bit sizes detect more errors, but come at the price of taking up more storage space. As an iOS or macOS developer, you will most likely not be the one to make this decision. A specific CRC bit size will most likely be required by the hardware you interact with.

Let’s first start with a simple way to calculate a CRC-32 for a given data array by using existing functionality. We’ll then move on to implement our own algorithm to calculate a CRC-16, which can be easily adapted for any CRC bit size.


Calculating a 32-bit CRC in Swift

Swift provides a built-in method to calculate the CRC-32 of any sequence of bytes. The crc32 method can be found in the zlib library, so you need to import that library in order to use it:

import zlib

The function to calculate the CRC-32 in zlib looks like this:

func crc32(crc: uLong, buf: UnsafePointer<Bytef>!, len: uInt)  

where uInt and uLong are simply type-aliases for UInt32 and UInt respectively.

In the above snippet, crc is the initial value, buf is the pointer to the data (buffer) and len is the length of that data buffer. (It’s a C-library, so don’t be surprised that it doesn’t look very “swifty”.)

Example

Let’s assume we have the following 7 bytes of data for which we want to calculate the CRC-32:

var inputBytes: [UInt8] = [212, 125, 31, 220, 15, 99]

The length of the that data buffer is

let length = UInt32(inputBytes.count)

Note that the crc32 function doesn’t have a simple data parameter, but accepts pointer to the memory location of the byte buffer instead. To get a reference to that pointer, we prefix the input with an ampersand.

let bufferPointer = &inputBytes

Now to calculate CRC, we combine all of these together

crc32(uLong(0), &inputBytes, uInt(inputBytes.count)) // returns 3873799252

The first parameter crc of that method is only used for calculating larger CRC for multiple chunks of data. In our case, we don’t have an initial CRC and can simply pass 0.

While this method certainly doesn’t have the most intuitive API, it certainly is a quick and safe way to get a CRC-32 without writing any of the code yourself. However, this is the only CRC function available in zlib. There is no function included to calculate any other type of CRC (such as CRC-8 or CRC-16 etc.) or any other variant of CRC-32 (with a different polynomial). For that reason, we’ll write our own implementation of a CRC function in the remaining sections of this article.


Calculating a 16-bit CRC in Swift

While there is only one distinct mathematical foundation behind the concept of a CRC, there are different ways to calculate it in Swift. We’ll start with the simplest implementation in order to understand how it works, assuming that we only have a single byte for which we calculate the CRC-16. We’ll then move on to generalize our solution for an arbitrary number of bytes, using an efficient algorithm.

The CRC-16 for a Single Byte

Let’s begin by specifying the requirements of our CRC-16 function: First of all, we need some input data on which we want to compute the CRC. We’ll start with a single byte of data (UInt8) and will later expand the algorithm to work for any number of bytes. To compute the CRC, we also need to choose a polynomial suited for our use case. (That’s a topic of its own and we won’t go into detail on how to pick a suitable polynomial as it will usually be predefined by the systems you interact with.)

As we want to have a CRC-16, we need an output of the same size (16 bits), so we pick a UInt16. The polynomial must also have the same size:

  • Inputs:
    • input: 1 Byte → UInt8
    • polynomial: 2 Bytes → UInt16
  • Output:
    • crc: 2 Bytes → UInt16

Now, let’s define a function based on these inputs and output:

func crc16(for input: UInt8, polynomial: UInt16) -> UInt16

Now comes our first little obstacle: the endianness – in other words: the order of bytes. Apple uses the little-endian format on their hardware, meaning that the most significant byte is at the end. However, we’ll be using the big-endian format in our algorithm. Thus, whenever we work with integers which are larger than 1 byte, we first need to reverse the byte order. So, we first convert the input to 16-bit and then use the bigEndian representation.

var result = UInt16(input).bigEndian

That’s our starting point. Next, we need to perform the following operations on each bit of the input. Since our function calculates the CRC of 1 byte, we need to perform these operations 8 times:

  1. Check if the most significant bit of the current result is 1. To get the most significant bit in big-endian format, we simply perform a logical and operation with a value of the same bit size, that has a 1 in its first (= most significant) bit and 0s in all other bits:
    • for 8-bit CRCs, we use 1000 0000, in hexadecimal: 0x80
    • for 16-bit CRCs, we use 1000 0000 0000 0000, in hexadecimal: 0x8000
    • etc.
let isMostSignificantBitOne = self & 0x8000 != 0
  1. Shift the result to the left by 1 bit.
result = result << 1
  1. If the most significant bit computed in step (1) was 1, we perform an XOR operation on the result and the polynomial. (From a mathematical perspective, we subtract the polynomial from the result. In binary representation, this is equivalent to the XOR operation.)
if isMostSignificantBitOne {
  result = result ^ polynomial
}

ℹ️ Note: The algorithm to compute a CRC is based on a polynomial division with modulo-2. If you want to know how this works in detail, we’ve explained how this works in our generic article on CRCs. For the scope of this article, suffice it to say that the modulo-2 division is a combination of left-shifts and XOR operations. That’s what we do here.

As mentioned before, we need to perform the above three operations on each bit (i.e. 8 times), so putting the pieces together, we end up with this function:

func crc16(for input: UInt8, polynomial: UInt16) -> UInt16 {
    var result = UInt16(input).bigEndian

    for _ in 0..<8 {
        let isMostSignificantBitOne = result & 0x8000 != 0

        result = result << 1

        if isMostSignificantBitOne {
            result = result ^ polynomial
        }
    }

    return result
}

We can also write this function using a functional approach with reduce instead of a for loop:

func crc16(for input: UInt8, polynomial: UInt16) -> UInt16 {
    var result = UInt16(input).bigEndian

    return (0..<8).reduce(result) { result, _ in
        let isMostSignificantBitOne = result & 0x8000 != 0
        guard isMostSignificantBitOne else { 
            return result << 1 
        }

        return (result << 1) ^ polynomial
    }
}

The CRC-16 for an Array of Bytes

The above algorithm works well and we could easily adapt it for other CRC sizes, using the same “recipe”. However, this function might become very slow, especially for larger CRCs and large byte arrays of input data. Fortunately, there is a mathematical trick we can apply to make the computation much more efficient.

CRC computation is slow because we have to go through each bit of every byte. For example, if you have an array with 128 bytes, you would need to iterate 8 ⨉ 128 = 1024 times. Now imagine a much longer array, it will need a lot of more iterations.

So how do we make the computation faster? The solution is to precompute a part of the algorithm that will always be the same, regardless of the number of input bytes. This precomputed part is then stored in an array (also called a look-up table).

The CRC is the remainder of a long division of the input bytes (dividend) and the polynomial (divisor). During that division, we have to go through all the bytes of the input. But instead of looping through each bit of each byte of the input, we now use our look-up table remainder of the division of any possible value of a single byte (0-255) by the polynomial. This saves us a lot of computing power and makes our calculation much faster.

Generating a CRC Look-up Table

  1. First, we need to choose a polynomial.
    A different polynomial will yield a different CRC value for the same input. In this example, we choose a very commonly used CRC-16 polynomial:
    x16 + x12 + x5 + 1
    (It’s called CRC16-CCITT.)
  2. Next, we convert the polynomial to its binary representation:
    1 0001 0000 0010 0001
    If this seems like magic, we have to disappoint you – it’s really pretty simple and straight-forward: Starting from the highest exponent, go through all exponents in decreasing order. Put a 1 for all exponents that are present and 0 for the missing ones. That’s how you end up with the binary representation of the polynomial.
  3. The resulting polynomial has 17 bits. Based on the mathematical properties of the CRC polynomials, we can discard the most significant bit (equivalent to a left-shift by 1 bit in big-endian representation) to make it a 16-bit polynomial:
    0001 0000 0010 0001
  4. We convert these 16 bits into their hexadecimal representation:
    0x1021
  5. Now we use this polynomial representation to generate the lookup table by dividing each possible value of the byte by the polynomial. The resulting array is our lookup-table:
let crc16Table = (0...255).map { byteValue in
    crc16(for: UInt16(byteValue), polynomial: 0x1021) 
}

Calculating the CRC with a Look-up Table

Using the CRC-16 table from above, we can easily calculate the CRC for any number of bytes. We go through the input bytes one by one and make use of the precomputed values.

func crc16(for inputs: [UInt8]) -> UInt16 {
    inputs.reduce(UInt16(0)) { remainder, byte in
        let bigEndianInput = UInt16(byte) << 8
        let index = (bigEndianInput ^ remainder) >> 8
        return crc16Table[Int(index)] ^ (remainder << 8)
    }
}

To understand the code in detail, you need some mathematical background on how the algorithm is derived (not part of this article). The core learning here is that we iterate over the input data byte-wise rather than bit-wise. That’s why this algorithm is much more efficient for long data streams.

One drawback of this method is that we always need to compute the CRC for the entire input data all at once. But oftentimes, we might have a number of smaller arrays for which we want to compute only one CRC. Using the above function, we would have to combine all of those smaller data arrays into one big array.

Instead, let’s make our lives a little easier and slightly modify our crc16 function so that we can compute the final CRC in chunks:

func crc16(for inputs: [UInt8], initialValue: UInt16 = 0) -> UInt16 {
    inputs.reduce(initialValue) { remainder, byte in
        let bigEndianInput = UInt16(byte) << 8
        let index = (bigEndianInput ^ remainder) >> 8
        return crc16Table[Int(index)] ^ (remainder << 8)
    }
}

We only changed two things:

  1. We added a new parameter initialValue (with a default value of 0).
  2. We replaced input.reduce(UInt16(0)) with input.reduce(initialValue).

With these changes, we can now split a large byte array of input data into an arbitrary number of smaller arrays that suit our needs and calculate the CRC in as many steps.

Example

Given the following input array of 5 bytes

[1, 2, 240, 125, 3]

we can calculate their CRC-16 like this:

crc16(for: [1, 2, 240, 125, 3])                     // returns 59917

Alternatively, we can also split the input array and compute the CRC-16 in two steps:

let initialCRC = crc16(for: [1, 2])                 // return 4979
crc16(for: [240, 125, 3], initialValue: initialCRC) // return 59917

In a real-world scenario, we usually have a struct or a class with typed numbers, for example:

let id: Int
let age: UInt32

In such cases, it’s convenient to have a property for each of your types that converts an instance of that type to its big-endian byte representation. For the example above, we can simply define an extension of the FixedWidthInteger protocol which will work for all Swift integer types:

extension FixedWidthInteger {
    public var bigEndianBytes: [UInt8] {
        [UInt8](withUnsafeBytes(of: self.bigEndian) { Data($0) })
    }
}

With this extenion, we can now easily calculate the CRC, either all at once:

crc16(for: id.bigEndianBytes + age.bigEndianBytes)

or in steps, for each variable at a time:

let idCRC = crc16(for: id.bigEndianBytes)
let totalCRC = crc16(for: age.bigEndianBytes, initialValue: idCRC)

That’s exactly the same concept used by zlib‘s CRC-32 function, introduced in the beginning:

func crc32(crc: uLong, buf: UnsafePointer<Bytef>!, len: uInt)  

It also has an initial value crc, so we can split the calculation of the CRC-32 the same way:

var firstInput: [UInt8] = [212, 125, 31]
var secondInput: [UInt8] = [220, 15, 99]

let firstCRC = crc32(uLong(0), &firstInput, UInt32(firstInput.count))
crc32(firstCRC, &secondInput, UInt32(secondInput.count)) // returns 3873799252

Conclusion

In this article, you’ve learned how to compute 16- and 32-bit CRCs in Swift. We used the system’s existing implementation for easily retrieving the CRC-32 for a variable-length byte array. We then explained how we can write our own implementation to calculate any type of CRC. You should now be able to

  • generate a CRC table for any type of CRC (and any type of polynomial)
  • (efficiently) calculate any type of CRC for a byte array of an arbitrary size

Where to Go From Here

🐣

Get notified when our next article is born!

(no spam, just one app-development-related article per month)