Computer scienceAlgorithms and Data StructuresAlgorithmsString coding algorithms

Coding: overview

6 minutes read

Have you sent a message through a messaging app and thought to yourself whether someone is spying on your texts? Well, if you use a mainstream messaging app, your texts are safe from any prying eyes. All messaging apps generally change your messages before sending them through the internet. In this process, the data you want to send is rearranged and no one can understand the actual meaning. When the intended recipient gets your text it is converted back into its original understandable form. This is one of the many reasons to transform data into other forms.

encrypted text

In this topic, we are going to learn more about why we make these changes, the algorithms we use to do so, and how these algorithms work.

Encoding and decoding

The process of converting data into other forms is called encoding. The data that undergoes the process of encoding is known as encoded data. The process where the encoded data is reverted to its original form is called decoding.

String coding algorithms perform encoding and decoding on strings. But why do we need to do this?

Here are some reasons for string coding.

  1. Data Compression: we can use string coding algorithms to save disk space. String data can be encoded so that it takes less space than previously required. This process is referred to as Data Compression.

  2. Cryptography: this is the process of encoding data into unrecognizable clumps to hide its actual meaning from unwanted eyes. The encoded clumps can then be converted back into their original form when needed.

  3. Error Control: using the properties of the data being used, we can generate additional values that help us check whether the original data has been changed and where those changes were made. This allows us to find and fix errors that occur during data transfer.

We will explore the different types of string coding using analogies in the next step.

Concept of cryptography

Let's say you have an important test coming up. But you and your friends want to organize a party. You decide that the party will happen at your friend Jim's house and you need to spread the message

"Party today at 10 pm, Jim's house."

While you need to pass this on to all your friends you also don't want the teachers to know about the party. If you pass the information as it is there is a high probability that someone you don't want to see your message will see it.

So what can you do? The best option is that you use a special code language that only you and your friends can understand or translate.

Let's make a rule for this translation

Word

Convert To

Party

Lecture

pm

am

house

class

If we convert the given sentence based on the table we get,

"Lecture today at 10 am, Jim's class."

This sentence doesn't convey our original information. Only a select few who have the translation table can understand the true meaning.

encoded message and dictionary with meanings

A similar concept can be used for data transfer as well. When we don't want random people to be able to understand our messages, we can simply replace our original message with something else that can be converted back to its original form by the intended receiver. This will keep our data safe from unwanted hands.

Concept of data compression

(Don't Cheat on your tests!)

Now that the party is over you regret it immediately. You don't have any time to study now. So you think of making a cheat sheet to pass the test. As you make the sheet, you understand that if the sheet is too big, you have a higher chance of getting caught; if it is too small, you cannot contain enough information there to pass. You want the sheet to store as much information as possible using as little space as possible.

"Dijkstra's Algorithm finds the shortest path between a given node and all other nodes in a graph." is one of the sentences that you need to write on the sheet.

Since you want to maximize information in the sheet you can replace or remove some parts of the sentences and words such that no information is lost but the sentence uses less space.

"Dijkstra's Algorithm finds shortest path between given node and other nodes in graph."

Here some words were removed to make the sentence shorter without losing any information. While the grammar might become incorrect that is a non-issue because we are saving precious space.

At the same time, we can also shorten words.

"Dijk's Algo find shortest path betn given node and other nodes in graph"

compressed text

Here we replaced some words with something smaller that is easily understandable. This saves us more space. Enough for you to complete the sheet without it being too large.

One thing to note here is that this sentence barely saved any space on its own. However, if we increase the number of sentences the space saved will be more and more prominent and helps us optimize available storage resources.

Concept of error control

With the cheat sheet, you are doing very well on your test, but you notice that your friend Bob is struggling.

The test has 5 difficult multiple-choice questions and you need to deliver your answers to Bob. Being found helping will land you in trouble and you need to be careful. You take some time and tell Bob the answers when the teacher isn't paying attention.

"The answer is cbadc", you tell Bob. And Bob passes, right? Maybe not! What would happen if Bob doesn't hear you properly? What if Bob hears something else?

You need to do something so that Bob can distinguish whether the answers are correct or not. An easy way to do this would be to tell the answers multiple times but you might not get a chance to tell Bob the answer multiple times. So what can you do?

Well, there is a very interesting way to solve this problem. We can add another value at the end that allows us to check the answer.

In this case let's equate the value of a to 1, b to 2, c to 3, and d to 4. So every time the answer is 'a', we add 1 to the final value. If the answer is b, then we add 2, and so on.

In the end, "cbadc" would get the value 3+2+1+4+3 = 13. So instead of telling Bob the answer. You also give Bob the number 13 so that Bob can check whether they add up to 13 or not.

This way if "cbabc" was heard the number should have been 11 instead of 13 and you would know there is a problem and Bob could ask you to repeat the answer.

With this, you and Bob pass the test and live happily ever after!

Conclusion

In this topic, we got acquainted with the basics of coding, and here are some takeaways:

  • Data is regularly converted into other forms or formats.

  • Converting data into other forms is referred to as encoding whereas converting encoded data back to its original form is called decoding.

  • Data compression allows us to store more information using less space.

  • Encryption conceals information from unwanted eyes.

  • Error Control can be used to recognize errors caused during data transfer.

15 learners liked this piece of theory. 2 didn't like it. What about you?
Report a typo