
Intro to multithreading


If you are familiar with the notion of threading, you probably have an idea of how different parts of one process can run concurrently. A process is a running instance of a program that has its own memory and other resources. Most modern operating systems allow a process to run several threads. This means that multiple threads (units of execution) work together to achieve one common goal.

Every process has at least one thread: the main thread.
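As a small illustration, the sketch below asks Python's standard threading module about the main thread of the running program. It only inspects threads and does not create any.

```python
import threading

# The thread that the interpreter starts the program in is the main thread.
main = threading.main_thread()

print(main.name)                 # "MainThread" unless the name was changed
print(threading.active_count())  # number of threads alive right now
```

Even a program that never creates a thread reports at least one active thread.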

Let us take an example to illustrate the notion of multithreading: imagine that you are an avid fan of video games. In a video game, the process has to run different tasks: rendering, interaction with players, and, possibly, an internet connection. All these tasks must make progress concurrently because the game should stay responsive at all times. To accomplish this, it employs multithreading, where each thread is responsible for running a separate task.

Another example is when you create a simple application for registration at an event. People must register if they want to attend the event, so you have prepared a simple form for attendees. If the system that saves user data is a single-threaded application, it can only process one request at a time. But what if the event is a concert for millions of people? Processing one request at a time will slow down the performance drastically. That's why it is a good idea to make the application multithreaded to allow multiple requests to be processed at the same time.
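The registration scenario can be sketched roughly as follows. The save_registration function, the attendee names, and the 0.2-second sleep that stands in for a slow database write are all assumptions made for illustration; join() simply waits for a thread to finish.

```python
import time
from threading import Thread


def save_registration(name, saved):
    # Hypothetical handler: the sleep stands in for a slow database write.
    time.sleep(0.2)
    saved.append(name)


attendees = ["Ann", "Bob", "Cleo", "Dan"]

# Single-threaded: requests are processed one at a time.
saved_sequential = []
start = time.perf_counter()
for name in attendees:
    save_registration(name, saved_sequential)
sequential_time = time.perf_counter() - start

# Multithreaded: one thread per request, all waiting at the same time.
saved_threaded = []
start = time.perf_counter()
workers = [
    Thread(target=save_registration, args=(name, saved_threaded))
    for name in attendees
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # wait until this request has been saved
threaded_time = time.perf_counter() - start

print(f"one at a time: {sequential_time:.2f} s")
print(f"with threads:  {threaded_time:.2f} s")
```

The threaded version should finish in roughly the time of a single request, because the threads spend their waiting time side by side.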

Threading in Python

Almost everything in Python can be represented as an object, and threads are no exception. A thread in Python can hold data, can be passed as a parameter to a function, can be in different states (for example, not yet started, running, or finished), and can be stored in different data structures like dictionaries, lists, and so on.
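A minimal sketch of treating threads as ordinary objects, assuming the Thread class from the standard threading module. The work and describe functions and the worker names are made up for this example.

```python
from threading import Thread


def work():
    pass  # placeholder task


# Thread objects can be stored in ordinary data structures.
threads = {
    "first": Thread(target=work, name="worker-1"),
    "second": Thread(target=work, name="worker-2"),
}


def describe(thread):
    # A thread can be passed to a function like any other object.
    return f"{thread.name}: alive={thread.is_alive()}"


for thread in threads.values():
    print(describe(thread))  # not started yet, so alive=False
```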

Before jumping into code implementation, let's understand the concept of locking. A lock is a synchronization object that controls simultaneous access to a shared resource. A lock acts as a permit, and it plays a vital role in preventing data corruption.

A lock can be held by only one thread at a time. Other threads wait for the lock owner to complete its task and release the lock. Thanks to the lock mechanism, it is possible to control the competition between threads, ensuring that each of them performs its work without unwanted interference from the others.
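To make the idea concrete, here is a small sketch that uses Lock and Thread from the standard threading module. The shared counter, the add_many function, and the numbers are assumptions chosen for the example.

```python
from threading import Lock, Thread

state = {"count": 0}   # data shared by all threads
state_lock = Lock()    # the permit that guards the shared data


def add_many(times):
    for _ in range(times):
        # Only the thread that currently holds the lock may update the count.
        with state_lock:
            state["count"] += 1


workers = [Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # wait for every worker to finish

print(state["count"])  # 40000
```

The with statement acquires the lock on entry and releases it on exit, so no two threads can update the count at the same moment.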

Now, let's return to the notion of threading. Python offers two modules for thread-control in programs: _thread and threading. The main difference between them is that the _thread module implements a thread as a function, while the threading module offers an object-oriented approach to enable thread creation. Below you'll see examples that will give you an idea of how to implement threads using both built-in modules.

The _thread module

First, let's create a function called greet. It takes a lock object as an argument, waits for 3 seconds, prints the first part of a welcome message, and releases the lock. The parameter will be needed later, when we use a thread to execute the function:

import time

locks = []

def greet(lockobject):
    time.sleep(3)
    print('Hello, ')
    # Release the lock as we are done here
    lockobject.release()

Then we need to create a thread in which to execute our function. To start the thread, we pass the function and a tuple of its arguments (here, a single lock object):

import _thread

def create_thread():
    # Create a lock and acquire it
    lockobject = _thread.allocate_lock()
    lockobject.acquire()

    # Store it in the global lock list
    locks.append(lockobject)
    # Pass it to a new thread that can release the lock once done
    _thread.start_new_thread(greet, (lockobject,))

Let's continue with locks. A lock can be either locked or unlocked. It has two basic methods, acquire() and release(). When the state is unlocked, acquire() changes the state to locked and returns immediately. When the state is locked, acquire() blocks until a call to release() in another thread changes it to unlocked; then the acquire() call sets it to locked again and returns. Call the release() method only when the state is locked; it changes the state to unlocked and returns immediately. An attempt to release an unlocked lock raises a RuntimeError.
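The state changes described above can be observed in a single thread. The sketch below is only an illustration: it uses a non-blocking acquire(blocking=False) so that the program returns False instead of waiting forever for a lock it already holds.

```python
import _thread

lock = _thread.allocate_lock()
print(lock.locked())   # False: a new lock starts unlocked

lock.acquire()         # unlocked -> locked, returns immediately
print(lock.locked())   # True

# A non-blocking acquire returns False instead of waiting for a locked lock.
got_it = lock.acquire(blocking=False)
print(got_it)          # False

lock.release()         # locked -> unlocked

release_error = None
try:
    lock.release()     # the lock is already unlocked
except RuntimeError as error:
    release_error = error
    print("Cannot release:", error)
```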

Finally, we can call the create_thread function and print the rest of the greeting message:

create_thread()
print('world!')
# Wait for every thread to finish: each acquire() blocks
# until the corresponding thread has released its lock
for lock in locks:
    lock.acquire()

The threading module

In the following code snippet, we will recreate the above greeting function and then pass it as the target parameter to our thread. A target is a callable object that the thread's run() method invokes. Once we create the thread object, we must start it with the start() method.

import time
from threading import Thread


def greet():
    time.sleep(3)
    print('Hello, ')


t = Thread(target=greet)
t.start()

print('world!')

If you run either example, you will notice that even though we created the thread before the final print statement, world! is printed about 3 seconds before the Hello, string. This happens because the new thread runs concurrently with the main thread: the main thread does not wait for the delay to pass but goes straight on to the next lines.

You may also have noticed that creating a thread with the threading module is more straightforward than doing the same with the _thread module. In the first example, we had to create a lock as well. Otherwise, the main thread could reach the end of the program and exit before the new thread had a chance to run, and our Hello, message would not have been printed. The official documentation describes the _thread module as a low-level threading API, so it is considered good practice to use a higher-level module like threading, where you can simply wait for all the threads to exit. In the examples below, we will follow this recommendation.
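One way to wait for a thread with the threading module is its join() method, which the examples above do not use. The sketch below is a variation on the greeting example with a shorter delay; it collects the two parts in a list (an assumption made so that the order is easy to inspect) and prints them once the thread has finished.

```python
import time
from threading import Thread

events = []


def greet():
    time.sleep(0.5)
    events.append("Hello, ")


t = Thread(target=greet)
t.start()
t.join()                # block here until greet() has finished
events.append("world!")

print("".join(events))  # Hello, world!
```

Because the main thread waits in join(), the greeting now appears in the natural order.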

Multithreading in Python

Now that you have a good understanding of how to create a thread, it is time to take a step forward and see how a program behaves when we set up multiple threads.

import time
from threading import Thread


def cube_area(thread, length, delay=0):
    time.sleep(delay)
    # Surface area of a cube: six square faces
    print(
        f"{thread} ---> Area of a cube with an edge length of {length} is: "
        f"\t{6 * (length ** 2)}"
    )


def circle_area(thread, length, delay=0):
    time.sleep(delay)
    # Area of a circle, with pi approximated as 3.14
    print(
        f"{thread} ---> Area of a circle with a radius length of {length} is: "
        f"\t{3.14 * (length ** 2)}"
    )


# instantiate multiple threads with functions as targets and
# thread name, length as arguments

t1 = Thread(target=cube_area, args=("t1", 2))
t2 = Thread(target=circle_area, args=("t2", 3))

t3 = Thread(target=cube_area, args=("t3", 4))
t4 = Thread(target=circle_area, args=("t4", 6))

t5 = Thread(target=cube_area, args=("t5", 9))
t6 = Thread(target=circle_area, args=("t6", 8))

t1.start()
t2.start()
t3.start()
t4.start()
t5.start()
t6.start()

Since the delay parameter defaults to 0, no thread pauses, so the lines will typically be printed in the order in which the threads were started (the scheduler does not strictly guarantee this):

t1 ---> Area of a cube with an edge length of 2 is: 	24
t2 ---> Area of a circle with a radius length of 3 is: 	28.26
t3 ---> Area of a cube with an edge length of 4 is: 	96
t4 ---> Area of a circle with a radius length of 6 is: 	113.04
t5 ---> Area of a cube with an edge length of 9 is: 	486
t6 ---> Area of a circle with a radius length of 8 is: 	200.96

Let us play around with our delay variable and see what happens:

t1 = Thread(target=cube_area, args=("t1", 2, 3))
t2 = Thread(target=circle_area, args=("t2", 2, 2))

t3 = Thread(target=cube_area, args=("t3", 4, 1))
t4 = Thread(target=circle_area, args=("t4", 6, 2))

t5 = Thread(target=cube_area, args=("t5", 9, 4))
t6 = Thread(target=circle_area, args=("t6", 8, 3))

t1.start()
t2.start()
t3.start()
t4.start()
t5.start()
t6.start()

Now, the output will be as follows:

t3 ---> Area of a cube with an edge length of 4 is: 	96
t2 ---> Area of a circle with a radius length of 2 is: 	12.56
t4 ---> Area of a circle with a radius length of 6 is: 	113.04
t1 ---> Area of a cube with an edge length of 2 is: 	24
t6 ---> Area of a circle with a radius length of 8 is: 	200.96
t5 ---> Area of a cube with an edge length of 9 is: 	486

The lines with the shortest delay are printed first, while those with a longer delay are printed at the end of our program. Threads with equal delays, such as t2 and t4, may finish in either order. This behavior is possible thanks to multithreading; in a single-threaded program, the calls would run one after another in the order they are written, and each delay would hold up everything after it. This is just a simple example, but imagine processing an image, writing a huge file to a disk, or refreshing a resource on the internet. With multithreading, we can hand the part of our code that takes longer to a separate thread and continue with the rest of our program at the same time.
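The same pattern can be written more compactly by keeping the threads in a list. The sketch below is a hedged variation rather than part of the original example: the task function, the job names, and the delays are assumptions, and the delays are spaced far enough apart that the completion order is predictable.

```python
import time
from threading import Thread

finished = []  # names are appended as each task completes


def task(name, delay):
    time.sleep(delay)
    finished.append(name)


# Hypothetical (name, delay) pairs, started in this order.
jobs = [("slow", 0.6), ("fast", 0.1), ("medium", 0.3)]

threads = [Thread(target=task, args=job) for job in jobs]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for every thread before reading the results

print(finished)  # ['fast', 'medium', 'slow']
```

The tasks complete in order of their delays, not in the order in which they were started.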

Conclusion

In this introductory topic, we have briefly explained what a thread is and how it influences the execution of a process. We have also introduced you to two of Python's built-in modules: _thread and threading. A lock can be in a locked or an unlocked state; you can use locks to prevent data corruption. Finally, we've made a basic example of how to use multiple threads in Python.

Of course, this is only a beginning. We will continue to discuss multithreading in Python at length in other topics. But for now, let's turn to practice!
