
The subject of this topic probably won't bother you as a Python developer unless you work on C extensions or write multithreaded programs. However, you're likely to come across threads or processes sooner or later, at least for educational purposes. So, for the sake of a solid foundation, let's try to understand what a Global Interpreter Lock is.

The purpose of GIL

A Global Interpreter Lock (GIL) is a mechanism that protects Python objects from being accessed by several threads at once. It allows only one thread at a time to take control of the interpreter and execute bytecode. In other words, it prevents multiple threads from manipulating the same objects simultaneously, which could otherwise lead to crashes, data corruption, data loss, and other erratic behavior. Let's discuss this a little more thoroughly.

One way to solve the problem we've described above is to protect each object with its own lock, so that only the thread holding the lock can work with the object. Sounds logical, right? Not so fast, as this may lead to so-called deadlocks. A deadlock is a state in which each thread is waiting for another thread to release a lock. Let's clear it up a little. You have two threads; both need to manipulate objects A and B. The first one starts with A and locks it, while the other starts with B and locks it. You end up in a situation where the first thread is waiting for the release of B and the second for the release of A. Neither can proceed, so nothing happens.
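The two-thread scenario can be sketched with `threading.Lock`. This is only an illustration under assumptions, not how CPython manages its objects: the names `lock_a`, `lock_b`, `worker`, and `got_second` are made up for the example, and the second `acquire` uses a timeout so that the script reports the standoff instead of hanging forever.

```python
import threading

lock_a = threading.Lock()  # guards a hypothetical object A
lock_b = threading.Lock()  # guards a hypothetical object B

# Barriers make the outcome repeatable: both threads hold their first lock
# before trying the second, and neither lets go until both have tried.
both_hold_first = threading.Barrier(2)
both_tried = threading.Barrier(2)

got_second = {}  # thread name -> whether the second lock was obtained


def worker(name, first, second):
    with first:
        both_hold_first.wait()
        # Without the timeout this call would block forever: a deadlock.
        acquired = second.acquire(timeout=0.2)
        got_second[name] = acquired
        if acquired:
            second.release()
        both_tried.wait()


t1 = threading.Thread(target=worker, args=("thread-1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("thread-2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()

# Neither thread managed to get the lock held by the other one.
print(got_second)
```

Each thread takes the locks in the opposite order, which is exactly what makes the circular wait possible.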

Luckily for us, there is the GIL. The GIL is a single lock, and, as we have already discussed, it locks the whole interpreter. With only one lock to wait for, the interpreter itself cannot run into the circular wait described above. Thread safety is guaranteed by letting one thread do its job while all the others are blocked. After a while, the thread that has been executing bytecode is paused, and another one gets its turn. This way, threads run one at a time. To visualize the idea, look at this picture from the famous presentation about the GIL in Python given by David Beazley:

[Figure: Global Interpreter Lock]

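Does this mean that Python is single-threaded? In effect, yes, as far as Python code is concerned: a CPython program can start many threads, but only one of them executes bytecode at any given moment.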
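The hand-over between threads is not left to chance. In CPython 3, the running thread is asked to release the lock after a configurable switch interval so that waiting threads get their turn. A small sketch, assuming a CPython 3 interpreter (the variable name `interval` is just for illustration):

```python
import sys

# How long a thread may keep running before the interpreter
# asks it to release the GIL (in seconds).
interval = sys.getswitchinterval()
print('Switch interval:', interval, 'seconds.')
```

The value can be changed with `sys.setswitchinterval()`, although the default is rarely worth touching.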

The GIL and multithreading

If you have ever had a hard time solving serious multithreading issues, you have probably heard that the GIL doesn't have a good reputation in the developer community and often gets criticized for restricting Python to a single thread. Moreover, CPU-bound programs that use threads not only run on effectively one thread because of the lock but can also take longer to execute than their single-threaded counterparts. Let's consider the example from the talk we have referenced above and see for ourselves.

Here is a single-threaded program that counts backward starting from 50 000 000:

# single-threaded solution

import time

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

start = time.time()
countdown(COUNT)
end = time.time()

print('The execution took', end - start, 'seconds.')
# The execution took 2.751121997833252 seconds.

And here's a multithreaded solution of the same problem:

# multithreaded solution

import time
from threading import Thread

COUNT = 50000000

def countdown(n):
    while n > 0:
        n -= 1

t1 = Thread(target=countdown, args=(COUNT//2,))
t2 = Thread(target=countdown, args=(COUNT//2,))

start = time.time()
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()

print('The execution took', end - start, 'seconds.')
# The execution took 3.7550148963928223 seconds.

The final numbers depend on the machine you use, but you can spot the trend. We won't discuss here why this happens; you can find the answers in the talk above if you're interested in details.
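The slowdown concerns CPU-bound work. For contrast, here is a minimal sketch of threads that spend their time waiting rather than computing. It relies on the fact that `time.sleep` releases the GIL while the thread is blocked; the names `wait` and `PAUSE` are assumptions made for this example.

```python
import time
from threading import Thread

PAUSE = 0.2  # seconds each task spends waiting


def wait(seconds):
    # A blocking call: the GIL is released while the thread sleeps.
    time.sleep(seconds)


# Two waits, one after another.
start = time.perf_counter()
wait(PAUSE)
wait(PAUSE)
sequential = time.perf_counter() - start

# The same two waits in separate threads.
t1 = Thread(target=wait, args=(PAUSE,))
t2 = Thread(target=wait, args=(PAUSE,))
start = time.perf_counter()
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print('Sequential waits took', sequential, 'seconds.')
print('Threaded waits took', threaded, 'seconds.')
```

The two waits overlap in the threaded version, so it should finish in roughly the time of a single pause. This is why threads remain useful for I/O-bound tasks even with the GIL in place.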

For those Python developers who are keen on writing multithreaded programs, the picture looks pretty bad. However, don't be discouraged. There are at least a couple of workarounds.

  • Multiprocessing: the most common approach is to use processes instead of threads. In this case, each process gets its own Python interpreter with its own GIL, so the lock won't be a problem anymore;

  • Alternative interpreters: the GIL discussed here belongs to CPython, the reference implementation (standard PyPy has a GIL of its own, too); however, there are other Python interpreters you can use: Jython (written in Java) and IronPython (C#) have no GIL, and the experimental PyPy-STM version replaces it with software transactional memory.
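The multiprocessing workaround can be sketched by adapting the countdown example to the standard `multiprocessing` module. Treat it as a rough illustration rather than a benchmark: `run_in_processes` is a hypothetical helper introduced here, and the timing you get will depend on your machine and the number of CPU cores.

```python
# multiprocessing solution (a sketch)

import time
from multiprocessing import Process

COUNT = 50000000


def countdown(n):
    while n > 0:
        n -= 1


def run_in_processes(total, workers=2):
    """Split the countdown between `workers` processes; return elapsed seconds."""
    processes = [
        Process(target=countdown, args=(total // workers,))
        for _ in range(workers)
    ]
    start = time.perf_counter()
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return time.perf_counter() - start


# The guard is required on platforms that start child processes
# by importing this module again.
if __name__ == '__main__':
    print('The execution took', run_in_processes(COUNT), 'seconds.')
```

Since every process runs its own interpreter, the two halves of the countdown no longer compete for a single lock. The price is the extra cost of starting processes and passing data between them.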

Why is it still there?

The short answer is: because it's not that easy to remove it.

First of all, Python was designed before multithreading became widespread and gained most of its popularity thanks to the simplicity it offered. By the time multithreading became common, extensions wrapping existing C libraries were already appearing in Python, so it became necessary to provide consistency and thread safety somehow. One of the popular solutions back then was to use a GIL. Secondly, removing the GIL from Python would break many Python packages and modules that depend on it and, consequently, cause various incompatibility problems. Last but not least, the GIL has its advantages and disadvantages, as do its alternatives, but it was added for a valid reason. Everything has its purpose.

Conclusion

Let's sum up the points we've discussed in this topic:

  • The Python GIL is a mechanism that protects Python objects from being accessed by several threads at once;

  • GIL purpose: it prevents data corruption and deadlocks; it also guarantees thread safety;

  • GIL operation: it locks the interpreter and allows only one thread at a time to do its job;

  • GIL effect: only one thread executes Python bytecode at any moment, which makes CPU-bound Python code effectively single-threaded;

  • GIL workarounds for multithreaded programs: multiprocessing and alternative Python interpreters;

  • And, lastly, why it's still here with us: it's not that easy to get rid of, as removing it entails serious consequences.
