Data Integrity: CRC with Swift on iOS
When working with any communication or storage medium such as Bluetooth, Wi-Fi (or even wired networks), or disk storage, data can get corrupted. In such cases, we need to make sure that the data we receive (or read) is the same data that was sent (or written). We need a mechanism to validate the integrity of the data.
There are many ways to accomplish this. The most common way is called Cyclic Redundancy Check (CRC). CRC is like a checksum that is appended to the data. It enables us to detect any unwanted changes in the transmitted or stored data.
Understanding the maths behind the CRC is a little challenging. But if you want to take a deep dive, we’ve got another article for you which covers all the theory and explains in detail how a CRC works. In this article, we will solely focus on the implementation of a CRC algorithm in Swift.
There are different kinds of CRCs that vary in their bit size. Common ones are CRC-8, CRC-16, and CRC-32. CRCs with larger bit sizes detect more errors, but come at the price of taking up more storage space. As an iOS or macOS developer, you will most likely not be the one to make this decision. A specific CRC bit size will most likely be required by the hardware you interact with.
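In Swift terms, these widths map directly onto the fixed-width unsigned integer types. (The type aliases below are our own shorthand for illustration, not standard names:)

```swift
typealias CRC8  = UInt8   // 1 byte
typealias CRC16 = UInt16  // 2 bytes
typealias CRC32 = UInt32  // 4 bytes
```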
Let’s first start with a simple way to calculate a CRC-32 for a given data array by using existing functionality. We’ll then move on to implement our own algorithm to calculate a CRC-16, which can be easily adapted for any CRC bit size.
Calculating a 32-bit CRC in Swift
Swift provides a built-in method to calculate the CRC-32 of any sequence of bytes. The `crc32` method can be found in the `zlib` library, so you need to import that library in order to use it:
import zlib
The function to calculate the CRC-32 in `zlib` looks like this:
func crc32(crc: uLong, buf: UnsafePointer<Bytef>!, len: uInt)
where `uInt` and `uLong` are simply type aliases for `UInt32` and `UInt`, respectively.
In the above snippet, `crc` is the initial value, `buf` is the pointer to the data (buffer), and `len` is the length of that data buffer. (It’s a C library, so don’t be surprised that it doesn’t look very “swifty”.)
Example
Let’s assume we have the following 6 bytes of data for which we want to calculate the CRC-32:
var inputBytes: [UInt8] = [212, 125, 31, 220, 15, 99]
The length of that data buffer is:
let length = UInt32(inputBytes.count)
Note that the `crc32` function doesn’t have a simple `data` parameter, but accepts a pointer to the memory location of the byte buffer instead. To get that pointer, we prefix `inputBytes` with an ampersand directly at the call site. (Swift’s implicit conversion of `&inputBytes` to a pointer only works in a function’s argument position, so we can’t store the pointer in a separate variable.)
Now, to calculate the CRC, we combine all of these:
crc32(uLong(0), &inputBytes, uInt(inputBytes.count)) // returns 3873799252
The first parameter `crc` of that method is only needed when calculating one CRC over multiple chunks of data. In our case, we don’t have an initial CRC and can simply pass 0.
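If your bytes arrive as a Foundation `Data` value instead of an array (as they often do with Bluetooth or file APIs), you can pass them to `crc32` as well. Here’s a minimal sketch; the constant names are our own:

```swift
import Foundation
import zlib

let data = Data([212, 125, 31, 220, 15, 99])
let checksum = data.withUnsafeBytes { rawBuffer in
    // Rebind the raw buffer to zlib's byte type (Bytef is a UInt8 alias)
    // and hand its base address to crc32.
    crc32(0, rawBuffer.bindMemory(to: Bytef.self).baseAddress, uInt(data.count))
}
// checksum == 3873799252, the same value as above
```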
While this method certainly doesn’t have the most intuitive API, it is a quick and safe way to get a CRC-32 without writing any of the code yourself. However, `crc32` is the only CRC function available in `zlib`. There is no function included to calculate any other type of CRC (such as CRC-8 or CRC-16) or any other variant of CRC-32 (with a different polynomial). For that reason, we’ll write our own implementation of a CRC function in the remaining sections of this article.
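If you do stick with `zlib`’s CRC-32, a thin wrapper can at least hide the pointer handling at the call site. This is merely a convenience sketch of our own, not part of `zlib`:

```swift
import zlib

// A "swifty" convenience wrapper around zlib's crc32.
// The function name and signature are our own invention.
func crc32Checksum(of bytes: [UInt8], initialValue: uLong = 0) -> uLong {
    bytes.withUnsafeBufferPointer { buffer in
        crc32(initialValue, buffer.baseAddress, uInt(buffer.count))
    }
}

crc32Checksum(of: [212, 125, 31, 220, 15, 99]) // returns 3873799252
```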
Calculating a 16-bit CRC in Swift
While there is only one distinct mathematical foundation behind the concept of a CRC, there are different ways to calculate it in Swift. We’ll start with the simplest implementation in order to understand how it works, assuming that we only have a single byte for which we calculate the CRC-16. We’ll then move on to generalize our solution for an arbitrary number of bytes, using an efficient algorithm.
The CRC-16 for a Single Byte
Let’s begin by specifying the requirements of our CRC-16 function: First of all, we need some input data on which we want to compute the CRC. We’ll start with a single byte of data (`UInt8`) and will later expand the algorithm to work for any number of bytes. To compute the CRC, we also need to choose a polynomial suited for our use case. (That’s a topic of its own, and we won’t go into detail on how to pick a suitable polynomial, as it will usually be predefined by the systems you interact with.)
As we want to have a CRC-16, we need an output of the same size (16 bits), so we pick a `UInt16`. The polynomial must also have the same size:
- Inputs:
  - input: 1 byte → `UInt8`
  - polynomial: 2 bytes → `UInt16`
- Output:
  - crc: 2 bytes → `UInt16`
Now, let’s define a function based on these inputs and output:
func crc16(for input: UInt8, polynomial: UInt16) -> UInt16
Now comes our first little obstacle: the endianness, in other words, the order of bytes. Apple uses the little-endian format on its hardware, meaning that the most significant byte comes last. However, our algorithm works in the big-endian format, so whenever we work with integers larger than 1 byte, we first need to reverse the byte order. We therefore convert the `input` to 16 bits and then use its `bigEndian` representation:
var result = UInt16(input).bigEndian
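To see what this line actually does, here’s a quick check (assuming little-endian hardware, which all current Apple devices are):

```swift
let byte: UInt8 = 0xAB
UInt16(byte)           // 0x00AB - the input sits in the least significant byte
UInt16(byte).bigEndian // 0xAB00 - byte-swapped: the input is now in the most significant byte
```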
That’s our starting point. Next, we need to perform the following operations on each bit of the input. Since our function calculates the CRC of 1 byte, we need to perform these operations 8 times:
1. Check if the most significant bit of the current `result` is 1. To get the most significant bit, we simply perform a logical AND operation with a value of the same bit size that has a 1 in its first (= most significant) bit and 0s in all other bits:
   - for 8-bit CRCs, we use `1000 0000`, in hexadecimal: `0x80`
   - for 16-bit CRCs, we use `1000 0000 0000 0000`, in hexadecimal: `0x8000`
   - etc.

   let isMostSignificantBitOne = result & 0x8000 != 0

2. Shift the `result` to the left by 1 bit.

   result = result << 1

3. If the most significant bit computed in step (1) was 1, we perform an XOR operation on the `result` and the `polynomial`. (From a mathematical perspective, we subtract the polynomial from the result. In binary representation, this is equivalent to the XOR operation.)

   if isMostSignificantBitOne { result = result ^ polynomial }
ℹ️ Note: The algorithm to compute a CRC is based on polynomial division modulo 2. If you want to know how this works in detail, check our generic article on CRCs. For the scope of this article, suffice it to say that the modulo-2 division is a combination of left-shifts and XOR operations. That’s what we do here.
As mentioned before, we need to perform the above three operations on each bit (i.e. 8 times), so putting the pieces together, we end up with this function:
```swift
func crc16(for input: UInt8, polynomial: UInt16) -> UInt16 {
    var result = UInt16(input).bigEndian
    for _ in 0..<8 {
        let isMostSignificantBitOne = result & 0x8000 != 0
        result = result << 1
        if isMostSignificantBitOne {
            result = result ^ polynomial
        }
    }
    return result
}
```
We can also write this function using a functional approach with `reduce` instead of a `for` loop:
```swift
func crc16(for input: UInt8, polynomial: UInt16) -> UInt16 {
    let result = UInt16(input).bigEndian
    return (0..<8).reduce(result) { result, _ in
        let isMostSignificantBitOne = result & 0x8000 != 0
        guard isMostSignificantBitOne else {
            return result << 1
        }
        return (result << 1) ^ polynomial
    }
}
```
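Before we generalize this, a quick sanity check with two inputs whose CRC we can derive by hand (using the CRC-16-CCITT polynomial `0x1021` that we’ll also use later in this article):

```swift
crc16(for: 0, polynomial: 0x1021) // 0x0000 - an all-zero input leaves every bit at zero
crc16(for: 1, polynomial: 0x1021) // 0x1021 - the single 1-bit reaches the top after 7 shifts,
                                  //          then gets XORed with the polynomial once
```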
The CRC-16 for an Array of Bytes
The above algorithm works well and we could easily adapt it for other CRC sizes, using the same “recipe”. However, this function might become very slow, especially for larger CRCs and large byte arrays of input data. Fortunately, there is a mathematical trick we can apply to make the computation much more efficient.
CRC computation is slow because we have to go through each bit of every byte. For example, if you have an array with 128 bytes, you would need to iterate 8 × 128 = 1024 times. Now imagine a much longer array: it would need far more iterations.
So how do we make the computation faster? The solution is to precompute a part of the algorithm that will always be the same, regardless of the number of input bytes. This precomputed part is then stored in an array (also called a look-up table).
The CRC is the remainder of a long division of the input bytes (dividend) by the polynomial (divisor). During that division, we have to go through all the bytes of the input. But instead of looping through each bit of each byte, we look up the precomputed remainder of the division of each possible single-byte value (0–255) by the polynomial. This saves us a lot of computing power and makes our calculation much faster.
Generating a CRC Look-up Table
- First, we need to choose a polynomial. A different polynomial will yield a different CRC value for the same input. In this example, we choose a very commonly used CRC-16 polynomial: x¹⁶ + x¹² + x⁵ + 1 (it’s called CRC16-CCITT).
- Next, we convert the polynomial to its binary representation: `1 0001 0000 0010 0001`. If this seems like magic, we have to disappoint you – it’s really pretty simple and straightforward: starting from the highest exponent, go through all exponents in decreasing order. Put a 1 for all exponents that are present and a 0 for the missing ones. That’s how you end up with the binary representation of the polynomial.
- The resulting polynomial has 17 bits. Based on the mathematical properties of CRC polynomials, we can discard the most significant bit (equivalent to a left-shift by 1 bit in big-endian representation) to make it a 16-bit polynomial: `0001 0000 0010 0001`
- We convert these 16 bits into their hexadecimal representation: `0x1021`
- Now we use this polynomial representation to generate the look-up table by dividing each possible value of a byte by the polynomial. The resulting array is our look-up table:
```swift
let crc16Table: [UInt16] = (0...255).map { byteValue in
    crc16(for: UInt8(byteValue), polynomial: 0x1021)
}
```
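Since every entry is just the single-byte CRC from the previous section, we can spot-check the table against the values we derived by hand earlier:

```swift
crc16Table[0] // 0x0000
crc16Table[1] // 0x1021
```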
Calculating the CRC with a Look-up Table
Using the CRC-16 table from above, we can easily calculate the CRC for any number of bytes. We go through the input bytes one by one and make use of the precomputed values.
```swift
func crc16(for inputs: [UInt8]) -> UInt16 {
    inputs.reduce(UInt16(0)) { remainder, byte in
        let bigEndianInput = UInt16(byte) << 8
        let index = (bigEndianInput ^ remainder) >> 8
        return crc16Table[Int(index)] ^ (remainder << 8)
    }
}
```
To understand the code in detail, you need some mathematical background on how the algorithm is derived (not part of this article). The key takeaway is that we iterate over the input data byte-wise rather than bit-wise. That’s why this algorithm is much more efficient for long data streams.
One drawback of this method is that we always need to compute the CRC for the entire input data all at once. But oftentimes, we might have a number of smaller arrays for which we want to compute only one CRC. Using the above function, we would have to combine all of those smaller data arrays into one big array.
Instead, let’s make our lives a little easier and slightly modify our `crc16` function so that we can compute the final CRC in chunks:
```swift
func crc16(for inputs: [UInt8], initialValue: UInt16 = 0) -> UInt16 {
    inputs.reduce(initialValue) { remainder, byte in
        let bigEndianInput = UInt16(byte) << 8
        let index = (bigEndianInput ^ remainder) >> 8
        return crc16Table[Int(index)] ^ (remainder << 8)
    }
}
```
We only changed two things:
- We added a new parameter `initialValue` (with a default value of 0).
- We replaced `inputs.reduce(UInt16(0))` with `inputs.reduce(initialValue)`.
With these changes, we can now split a large byte array of input data into an arbitrary number of smaller arrays that suit our needs and calculate the CRC in as many steps.
Example
Given the following input array of 5 bytes:
[1, 2, 240, 125, 3]
we can calculate their CRC-16 like this:
crc16(for: [1, 2, 240, 125, 3]) // returns 59917
Alternatively, we can also split the input array and compute the CRC-16 in two steps:
```swift
let initialCRC = crc16(for: [1, 2])                  // returns 4979
crc16(for: [240, 125, 3], initialValue: initialCRC) // returns 59917
```
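In practice, the input bytes often arrive as a Foundation `Data` value rather than a `[UInt8]` array. In that case, a small convenience overload of our own (hypothetical, not part of the functions above) can simply forward the bytes:

```swift
import Foundation

// Convenience overload for Data inputs - a sketch, not part of the API above.
func crc16(for data: Data, initialValue: UInt16 = 0) -> UInt16 {
    crc16(for: [UInt8](data), initialValue: initialValue)
}

crc16(for: Data([1, 2, 240, 125, 3])) // returns 59917, same as above
```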
In a real-world scenario, we usually have a struct or a class with typed numbers, for example:
```swift
let id: Int
let age: UInt32
```
In such cases, it’s convenient to have a property for each of your types that converts an instance of that type to its big-endian byte representation. For the example above, we can simply define an extension of the `FixedWidthInteger` protocol, which will work for all Swift integer types:
```swift
import Foundation

extension FixedWidthInteger {
    public var bigEndianBytes: [UInt8] {
        [UInt8](withUnsafeBytes(of: self.bigEndian) { Data($0) })
    }
}
```
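A quick look at what this property produces:

```swift
let value: UInt32 = 1
value.bigEndianBytes // [0, 0, 0, 1] - most significant byte first
```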
With this extension, we can now easily calculate the CRC, either all at once:
crc16(for: id.bigEndianBytes + age.bigEndianBytes)
or in steps, for each variable at a time:
```swift
let idCRC = crc16(for: id.bigEndianBytes)
let totalCRC = crc16(for: age.bigEndianBytes, initialValue: idCRC)
```
That’s exactly the same concept used by `zlib`’s CRC-32 function, introduced in the beginning:
func crc32(crc: uLong, buf: UnsafePointer<Bytef>!, len: uInt)
It also has an initial value `crc`, so we can split the calculation of the CRC-32 the same way:
```swift
var firstInput: [UInt8] = [212, 125, 31]
var secondInput: [UInt8] = [220, 15, 99]

let firstCRC = crc32(uLong(0), &firstInput, UInt32(firstInput.count))
crc32(firstCRC, &secondInput, UInt32(secondInput.count)) // returns 3873799252
```
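To close the loop with the data-integrity scenario from the introduction: a receiver typically recomputes the CRC over the payload and compares it to the transmitted value. Here’s a minimal sketch, under the assumption that the sender appends the CRC-16 in big-endian order to the end of the frame (the function name and framing are our own):

```swift
// Hypothetical receiver-side check. Assumption: the last two bytes of the
// frame are the big-endian CRC-16 of the preceding payload bytes.
func isFrameValid(_ frame: [UInt8]) -> Bool {
    guard frame.count >= 2 else { return false }
    let payload = Array(frame.dropLast(2))
    let receivedCRC = UInt16(frame[frame.count - 2]) << 8 | UInt16(frame[frame.count - 1])
    return crc16(for: payload) == receivedCRC
}
```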
Conclusion
In this article, you’ve learned how to compute 16- and 32-bit CRCs in Swift. We used the system’s existing implementation to easily retrieve the CRC-32 of a variable-length byte array. We then explained how to write our own implementation to calculate any type of CRC. You should now be able to
- generate a CRC table for any type of CRC (and any type of polynomial)
- (efficiently) calculate any type of CRC for a byte array of an arbitrary size
Where to Go From Here
- You can find all the CRC implementations from this article (and some additional ones!) in our Swift CRC repository, ready for use.
- If you want to understand how the CRC works in more detail, please check our article on How to validate your Data with a Cyclic Redundancy Check.
- We also have an article with the respective CRC implementations in Kotlin for Android developers.
- CRCs are often used with Bluetooth Low Energy characteristics. If you want to learn how to decode BLE characteristics in general, we’ve got you covered as well: How to Read BLE Characteristics in Swift.