Are you tired of dealing with data transmission errors that corrupt your valuable information? Even if you don't work with data transmission on a daily basis, you've likely experienced errors in downloaded files or while copying data from one device to another. Data transmission errors are a common issue in computer networks, communication systems, and data storage. They can occur due to noise, interference, or other factors, which can lead to incorrect data being received or stored.
Fortunately, there is a solution to this problem: the Hamming Coding algorithm, a technique that is widely used for error detection and correction. In this topic, we will learn how to compute the hamming code, and in the following topic, this code shall be used to detect errors and fix them. Let's get down to business!
Compute Hamming code
Let's take a closer look at the algorithm itself. The idea is to add a set of new bits, also called parity bits, to detect and correct errors in the data. Each parity bit covers a specific set of data bits. The -th parity bit covers bit positions in the following way (denote ):
- Start from position .
- Include consecutive bits.
- Skip the next consecutive bits.
- Repeat steps 2 and 3 until the end of the string is reached.
The picture below illustrates the process for the first four parity bits. Pretty intuitive and simple, isn't it?
binary(n) returns the binary representative of n. Feel free to think about it in your free time...In order to create a Hamming Code of the data given as a binary string, the algorithm follows these steps:
-
Identify the number of parity bits required. This is done by finding the smallest number such that , where is the number of data bits.
-
Assign each data bit to a position in the code word. The data bits are placed in positions that are not powers of in the same order they are in the original word, just like in the picture below. The remaining positions are reserved for the parity bits and initialized equal to zero.
-
Calculate the value of each parity bit. Check the number of s (ones) in the set, covered by that parity bit: if it is odd, set the value of this parity bit equal to , otherwise set is .
-
Insert the parity bits into their positions in the code word to create the final message. The resulting code word, which is known as Hamming code, is transmitted or stored.
Later on, this code is stored or transmitted in order to check if there are any errors. How is this done? Be patient until the next topic.
Example: 1100101
Too abstract so far? Let's get our hands dirty and encode some string ourselves. A string of length seven will do, so let's decode the string . It should be easy if we follow the steps above:
- First, we have to find the number of parity bits needed. In our case, we have a string of length , hence and we need to find the smallest , such that . Some simple calculations would show that won't do it (try it!), while for we get .
- Next, we position the data bits and put placeholders equal to zero for parity bits, as shown below:
- It's time to calculate the parity bit values. We will perform the calculations in detail for and , while you are advised to check for and . Recall that we denote by the -th parity bit.
- As for the first parity bit, we have the following set covered by it (the red cells in the picture below). It should not be difficult to count, that there are ones in that set, i.e. an even number. Consequently, we set the value of the first parity bit equal to : .
- Similarly, and .
- Last but not least, we have the situation below for the fourth parity bit. We can see that there are 2 ones in the red zone, which is an even number. Hence, .
- As for the first parity bit, we have the following set covered by it (the red cells in the picture below). It should not be difficult to count, that there are ones in that set, i.e. an even number. Consequently, we set the value of the first parity bit equal to : .
- Finally, we have all our parity values, and we can write the Hamming code for our string in the format . We have computed all the values, so the encoded string will be .
Time complexity and pseudocode
Some of us find it difficult to turn words into code. To help with this, here is a pseudocode on Hamming encoding process:
// data = the binary string to be encoded
function compute_parity_bit(data, p_idx):
count = 0
for i in [p_idx, len(data)]:
if i & p_idx == p_idx then // the set covered by i-th parity bit
if data[i] == '1' then // count the number of 1's covered by the parity bit
count = count + 1
return '1' if count mod 2 == 1 else '0' // p_i = 1 if the number of ones is odd, 0 otherwise
function hamming_encode(data):
n = len(data) // number of data bits
r = min r s.t. 2^r >= n + r + 1 // number of parity bits
j = 0 // index of current data bit
// placing bits
for i in [1, n + r]:
if i is a power of 2 then
code = code + '0' // add a placeholder for the parity bit
else:
code = code + data[j] // add the next data bit
j = j + 1
// computing parity bits
for i in [0, r - 1]:
p_idx = 2^i // calculate the index of the parity bit
p_i = compute_parity_bit(code, p_idx) // calculate the parity bit value
code[p_idx] = p_i // replace the placeholder with the parity bit value
return code
Now, let's study the time complexity of this process, analyzing one step at a time:
- We can find the lowest number satisfying the condition, by using binary search, for example. Hmmm, this is a good moment to refresh and test your knowledge on binary search. Recall that the time complexity of this process would be , since can't be greater than . Indeed, is always greater than for large .
- Simply positioning the data bits requires linear time.
- Computing the value of one parity bit would require linear time since we scan all the characters in the string. However, we have parity bits, whose value we have to calculate. In total, we get .
- The last step consists of positioning parity bits, which are linear as well.
Gathering all together, we see that the third step outweighs the other steps, hence the time complexity of the encoding algorithm is .
Conclusion
Now it's time to recap the main information on this topic. Hamming coding is a technique that is widely used for error detection and correction, consisting of encoding, which we mentioned in this topic, and decoding, which is to be discussed in the next topic. The encoding process consists of 4 main steps:
- Identify the number of parity bits required.
- Assign each data bit to a position in the code word.
- Calculate the value of each parity bit.
- Append the parity bits to the code word to create the final message.
The time complexity of such an algorithm is estimated to be . Now that you have a clear idea of how the algorithm works, let's practice with some tasks.