The success of any software project depends on two kinds of speed: how quickly code can be written and how quickly it runs. Python is a well-known language that helps create software quickly, but it is slow compared to Java or C++. For most applications this trade-off is acceptable, since raw speed can be exchanged for the quick release of new features, but in some cases improving application speed is a non-negotiable ask.
This is where profilers enter the scene. Profilers are tools that measure the runtime of code and provide metrics that identify its slowest parts. That way, you know exactly what to optimize to get the best runtime.
Profiling in Python
Profiling a program means observing everything that executes while the program runs and measuring the time cost of each event, such as a function call or an object initialization. Profiling can be done deterministically (measuring every event as and when it happens) or statistically (sampling the program at regular intervals). The first approach is used by deterministic profilers such as cProfile, and the second by statistical profilers such as statprof. Look at a basic example of using cProfile:
import cProfile
import time

pr = cProfile.Profile()
pr.enable()

def let_me_sleep(time_in_seconds):
    time.sleep(time_in_seconds)

for i in range(5):
    let_me_sleep(i)

pr.disable()
pr.print_stats()
Using cProfile.Profile(), we call the enable method on the object to start the measurement and disable to stop it. The code in between sleeps once for each value of time_in_seconds, which ranges from 0 to 4. The sum of the first n natural numbers is n(n+1)/2; for n = 4 this gives 10, so we expect a total runtime of around 10 seconds for this code. And here is the output.
The meaning of the different columns is as follows. ncalls is the total number of calls to the function. tottime is the time spent inside the function itself, excluding time spent in the sub-functions it calls. cumtime is the cumulative time spent in the function including all sub-function calls. The two percall columns are tottime/ncalls and cumtime/ncalls respectively.
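If you want to sort or trim this table yourself, the same data can be processed with the standard pstats module; here is a minimal sketch that reuses the pr profiler object from the example above:

import pstats

# Wrap the profiler in a Stats object, sort by cumulative time,
# and print only the ten most expensive entries.
stats = pstats.Stats(pr)
stats.sort_stats('cumtime').print_stats(10)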
So in this example, we see time.sleep accounts for around 10 seconds, and the let_me_sleep function cumulatively sleeps for 10 seconds. In the real world, something like let_me_sleep would do something useful, such as querying a database or processing a network request, and we would understand its time cost by looking at the profiler stats and optimize it accordingly.
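For contrast, here is a rough sketch of statistical profiling with the third-party statprof package mentioned above (an assumption: it is installed with pip install statprof, and the exact start/stop/display call names may differ slightly between forks). Because a statistical profiler samples CPU time at intervals, a CPU-bound loop is used here instead of time.sleep:

import statprof

def busy_work(n):
    # CPU-bound loop so that the sampling profiler has something to catch
    total = 0
    for i in range(n):
        total += i * i
    return total

statprof.start()        # begin sampling
try:
    busy_work(5_000_000)
finally:
    statprof.stop()     # stop sampling
statprof.display()      # print the sampled statistics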
Profiling in Flask
Profiling Flask applications shows you where each request spends most of its time. This application-level information about the time spent on each request is valuable when trying to optimize the overall performance of the code. Let's create a simple Flask app and start looking at how to profile it:
from flask import Flask
import random
import time

app = Flask(__name__)

@app.route('/')
def get_hello():
    val = random.randint(0, 6) + 5
    time.sleep(val)
    return f"The server slept for {val} seconds"

if __name__ == '__main__':
    app.run()

This Flask server sleeps between 5 and 11 seconds (the value is random) and returns a message saying how many seconds it slept. In the real world, slowness in an app arises when it has to wait for a response from the network or a database, or when it does CPU-intensive work. Here we simulate that slowness using the time.sleep function.
To profile this app, we will use Werkzeug's built-in ProfilerMiddleware. The implementation looks as follows:
from flask import Flask
from werkzeug.middleware.profiler import ProfilerMiddleware

app = Flask(__name__)
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, sort_by=('cumtime', 'ncalls'))
# @app.route code below

In Flask, it is a common pattern to add middleware to the app; the middleware wraps the WSGI app and adds functionality to it, using the syntax:

app.wsgi_app = Middleware(app.wsgi_app)

If we run the code, our server should be up and running. Then we can send a request via a browser, and the output looks like this:
If we check the logs in the backend, they look something like this:
Note that in the setup of the profiler, we have configured it to sort by ('cumtime', 'ncalls'). This time, we see time.sleep takes the maximum time of 6 seconds and cumulatively the get_hello function takes the same amount of time.
Profiling recursive functions
The time spent processing a request can be due to a slow operation like the simulated network wait we saw above, but it can also be due to a recursive function that calls itself many times. Let us simulate a recursive function and see how profiling it looks in the same setup:
@app.route('/')
def get_hello():
    ncalls = random.randint(0, 6)
    val = random.randint(1, 3)
    start = time.time()

    def recurse_and_sleep(count):
        if count == ncalls:
            return
        time.sleep(val)
        recurse_and_sleep(count + 1)

    recurse_and_sleep(0)
    return f'''The server slept for {val} seconds for {ncalls} recursive calls,
time taken = {time.time() - start} seconds'''
We changed the get_hello function from our previous implementation. It defines two variables: ncalls (how many recursive calls are made) and val (how long the function sleeps between two recursive calls), and both are picked randomly. We also measure the time taken to run the code using time.time() and return this information to the caller. The output in the browser looks as follows:
And the profiled logs in the backend look like this:
Note that the profiler captures the number of recursive calls and the total time to run the function; the 7th recursive call is the one that hits the base case and exits the recursion.
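Outside of Flask, plain cProfile reports recursion in the same way: for a recursive function, the ncalls column is printed as two numbers, total calls and primitive (non-recursive) calls. A small standalone sketch:

import cProfile
import time

def recurse_and_sleep(count, ncalls, val):
    # Stop once the requested number of recursive calls has been made
    if count == ncalls:
        return
    time.sleep(val)
    recurse_and_sleep(count + 1, ncalls, val)

pr = cProfile.Profile()
pr.enable()
recurse_and_sleep(0, 6, 0.1)
pr.disable()
# For 6 recursive calls the stats show ncalls as 7/1:
# 7 calls in total, of which 1 is the original (primitive) call.
pr.print_stats()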
Profiling with restriction
As you can see, the logs are verbose, which sometimes makes it hard to spot the code causing the performance degradation. ProfilerMiddleware supports a parameter called restrictions, which lets you filter the printed stats by file name (as a regex) or limit how many lines are shown. Here we will keep only the entries from the file that contains our Flask application code. The declaration of the Flask app changes as follows:
from flask import Flask
from werkzeug.middleware.profiler import ProfilerMiddleware
import os

# Get the current file name and use it as a restriction while profiling
fname = os.path.basename(__file__)

app = Flask(__name__)
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, sort_by=('cumtime', 'ncalls'), restrictions=(fname,))
# @app.route code below

If we run the same recursive code in the server under profiling and send a request, our profile logs will look as follows.
It is a much cleaner way of identifying performance bottlenecks.
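The restrictions tuple is handed on to pstats under the hood, so besides a file-name regex an entry can also be an integer (keep at most that many lines) or a fraction between 0 and 1 (keep that share of the lines), applied in order. A sketch that first filters by the current file name and then keeps only the five most expensive entries:

from flask import Flask
from werkzeug.middleware.profiler import ProfilerMiddleware
import os

fname = os.path.basename(__file__)

app = Flask(__name__)
# First keep only entries from this file, then print at most 5 of them
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, sort_by=('cumtime', 'ncalls'),
                                  restrictions=(fname, 5))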
Profiling and visualization
Until now, we have seen our profile logs as command-line output, which is helpful but a bit inconvenient to digest in a large project with many different function calls. We can do better using a library called snakeviz to inspect the logs and make sense of them visually. We will continue with our recursive function example and look at its profile again. Here is how to do it.
Setup:
1. Install the snakeviz library with pip install snakeviz, and make a new profile directory to store the output of the profiled files. These files have a .prof extension.
2. Change the code at the beginning of the file where we add ProfilerMiddleware as follows:
from flask import Flask
from werkzeug.middleware.profiler import ProfilerMiddleware

app = Flask(__name__)
app.wsgi_app = ProfilerMiddleware(app.wsgi_app, sort_by=('cumtime', 'ncalls'),
                                  profile_dir="./profile_output", filename_format="{method}.{path}.prof")

In this code, we pass profile_dir to ProfilerMiddleware, which tells it where to write the .prof output files, and filename_format, which controls how those .prof files are named.
Execution:
1. Run the Flask server and send a request via the browser to profile a Flask endpoint. This creates the GET.root.prof file in the profile_output directory.
2. Run the snakeviz server on the profile_output directory with snakeviz -s -H 0.0.0.0 ./profile_output/ at the command line and check that the command output is as follows:

In the browser, open the link and select the GET.root.prof file created in the profile_output directory. The output looks like this. Now you can visually see the functions in red, which are the real bottlenecks in the code execution, a much easier way to understand the profiled output.
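If you prefer to stay on the command line, the same .prof dumps can also be opened with the standard pstats module; a minimal sketch assuming the GET.root.prof file produced above:

import pstats

# Load the dump written by ProfilerMiddleware and print the
# ten most expensive entries by cumulative time.
stats = pstats.Stats('./profile_output/GET.root.prof')
stats.sort_stats('cumtime').print_stats(10)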
Conclusion
In this topic, we started by exploring the reasons for profiling; as the saying goes, what gets measured gets improved. We profiled a basic piece of Python code and then applied the same technique to a Flask application. We learned how to read profile logs and how they reveal the time spent in both ordinarily slow functions and recursive ones. Lastly, we explored how to analyze the profiled output visually for better understanding. The Flask ecosystem offers numerous profiling tools, so choose the one that best fits your needs, and use profiling wisely to determine the optimal version of your code.