Python is a high-level programming language. It has many applications in machine learning, web development, and data analysis. It also contains easy syntax and provides various libraries for developers to increase productivity and optimize code. As a piece of code grows more complex, the program becomes inefficient. In this topic, we will learn to increase the performance of programs with built-in functions, list comprehension, generators, string concatenation, and other features.
Why optimize?
Code optimization is a way to perform a task faster and with fewer code lines. While doing this, the program consumes fewer resources. Why is optimizing so important? Let's consider an example with a search engine. If the new search engine is faster than the one we used before, it makes perfect sense to switch to the new one. That's why we need to use the best tools for development.
Built-in functions
With built-in functions, you can increase the speed of your code greatly. They have been tested and optimized to work better. If you can write your code with built-in functions, you'll improve the performance.
Let's observe the benefit of using built-in functions on the example:
from time import time
numbers = list(range(1, 10000000))
def total_number(number_list):
total = 0
for number in number_list:
total += number
return total
# Calculate the sum without built-in functions
start_time = time()
total_list = total_number(numbers)
end_time = time()
print(end_time - start_time) # 0.3405749
# Calculate the sum using built-in functions
start_time = time()
total_with_sum = sum(numbers)
end_time = time()
print(end_time - start_time) # 0.0446672
We import the time() method from the time module to observe the difference. We create a list of numbers from 1 to 10000000. First, we add the numbers in the function without using the built-in method. Then we add the numbers using the sum() method. With the help of the time() method, we observe that the difference is almost eight times.
You can access the entire list of built-in functions in the official documentation.
List comprehension
In Python, we often use loops to add elements to a list. While doing this, we perform various control operations. We can also use list comprehension to make this happen in a faster way. As an example, let's add numbers divisible by three to a list:
from time import time
# default approach
start = time()
new_list = []
for i in range(1, 1000000):
if i % 3 == 0:
new_list.append(i)
end = time()
print(end - start) # 0.0812482
# list comprehension
start = time()
new_list = [i for i in range(1, 1000000) if i % 3 == 0]
end = time()
print(end - start) # 0.0488176
Both approaches produce the same result. But list comprehension makes the code more readable, and it also makes our code run twice as fast.
Generators
A regular list saves all of its elements in memory, but this is not very efficient. As you remember, generators are Python structures that allow data to be loaded temporarily. In this way, you get the data you need only when you call it. What's more, there are functions in Python that use the generator structure by default, for example, map() and filter(). You may want to create your generator functions as well.
Now let's look at a function we've made with a generator:
from time import time
def get_odds(n):
for i in range(n):
if i % 2 != 0:
yield i
# Creating generator objects
start_time = time()
odd_generator = get_odds(10000000)
end_time = time()
print(end_time - start_time) # 0.000015
# You can iterate over a generator by using a for loop
for _ in range(100):
print(next(odd_generator))
In this example, we return odd numbers in a given range. Note that creating the generator happened in no time. However, you should understand that a generator object is different from a list. Generators cannot perform some operations such as indexing and slicing, but we can iterate over our values using a for loop. If we had opted for something else, we would have needed to store all of our numbers in a regular list. In this case, if the list is large enough, it could cause memory overflow or even a crash.
Proper string concatenation
In some programs, the + operator concatenates strings. With each plus operation, a new string is created and copied over the old data. Instead, we can use a more efficient join() method:
from time import time
start = time()
sentence = (
"I " + "currently " + "have " + "4 " + "windows " + "open " + "up... "
+ "And " + "I " + "don't " + "know " + "why.")
print(sentence)
end = time()
print(end - start) # 0.000000398
start = time()
sentence = " ".join([
"I", "currently", "have", "4", "windows", "open", "up...", "And",
"I" , "don't", "know", "why."])
print(sentence)
end = time()
print(end - start) # 0.000000226
We can also decide how to combine elements in the join method. In the example, we've combined it with space. In this way, the readability and speed of our code have increased.
Other improvements
Some Python libraries are written in the faster programming languages such as C and C++. Numpy and Pandas libraries can significantly increase the speed of your program.
Another important thing is to import third-party libraries properly: do not use dots when calling functions in these libraries to increase the speed of your program. Let's see an example:
# With the dot notation
import time
import collections
start_time = time.time()
counter_dict = collections.defaultdict()
end_time = time.time()
print(end_time - start_time) # 0.00000286
# Explicit notation
import time
from collections import defaultdict
start_time = time.time()
counter = defaultdict()
end_time = time.time()
print(end_time - start_time) # 0.00000166
In the code above, we have compared the two approaches with the help of the time() method. The first code runs slower because of dots; they create lookup operations. Also, explicit import enhances code readability.
io.StringIO in Python is an in-memory stream that allows you to read from and write to a string buffer as if it were a file. It provides a file-like interface for working with string data.
Using io.StringIO can save memory in certain scenarios because it avoids the need to store data in a physical file on disk. Instead, it keeps the data in memory, which can be more efficient and faster for certain operations. Let's look how to use this method.
import io
obj = io.StringIO()
obj.write('HyperSkill: ')
print('The best learning platform', file=obj)
print(obj.getvalue()) # HyperSkill: The best learning platform
obj.close()Conclusion
To sum up, in this topic, we've learned:
Benefits of optimizing our code;
Importance of using functions defined in Python when possible;
Importance of generators and list comprehension;
Effective string concatenation;
The correct way to call a method from a library.
Now let's move on to the practice.