How to Profile Python Code With cProfile

May 10, 2023•5 min read

Languages, frameworks, tools, and trends

Programming is a sequential job involving writing code, testing, monitoring (profiling), reviewing the reports and results, and refactoring. After all, there’s no guarantee that the first written source code is in its best-optimized shape.

In this article, we're going to focus on the third step of the sequential procedure, which is profiling, with respect to cProfile in Python.

A brief look at profiling

Profiling explains the techniques for determining the parts and units of a program that take a tremendous amount of time to execute. With it, we can use the reports and numbers to improve performance as well as the execution time of functions and methods.

It’s important to note that profiling is not always the best answer for better performance. However, it’s suitable when we need some extra speed through our executions or if there is a huge request load over the servers and we need to manage the complexity of the components.

If you’re already satisfied with your program's execution time, there’s no need to bother with cProfile. However, it can be used when execution time has become a concern and it breaks other parts of the service.

For instance, your server is experiencing a lot of requests and heavy traffic owing to advertising efforts and users have noticed that it is responding slowly. This is where cProfile will come in handy if the service is Python-based and some Python function calls are causing the heavy traffic load.

Profiling is a game-changer when it comes to redesigning and restructuring a project and the codebase. Most times, you don't need to put effort into a full redesign task as a simple R&D and profiling would work.

cProfile in Python

In Python and *nix environments, there are many packages and Python libraries that provide facilities for evaluating the cost of source codes, whether it's security threats or the time it takes for a component to execute.

cProfile in Python is categorized as a standard library. In the next section, we'll use an example followed by the actual implementation and possible strategies for optimization. It’s a good illustration of what happens in real-world cases.

Scenario: There’s a large text file of novel titles separated by new lines. Some names have occurred twice or even multiple times. The goal is to read the file and create a local variable filled with all the novel names without duplicate names included. The main requirement is to use multiple functions as the solution.

Here is a short scheme of what we have in the text file.

The Name of the Wind
Dune
The Left Hand of Darkness
Dracula
Foundation
Dune
...

Plaintext

Solution: We need functions for reading the file, ignoring the duplicated values, and a main function for calling them in order.

read_file(path: str) -> list: Gets the file name/path and returns the list of titles.

ignore_duplicates(titles: list): Gets a list of titles (the value returned from read_file()) and returns a duplicate-safe list.

main(): Main function.

Keep in mind that the following code snippet is not perfect. We will come up with solutions with the help of cProfile to optimize the code and improve the execution time of each function.

#!/bin/python

def read_file(name: str) -> list:
    with open(name,'r') as file:
        titles = file.readlines()
    
    new_titles = []
    # casting off \n from end of titles
    for title in titles:
        new_titles.append(title.removesuffix('\n'))
    
    return new_titles


def ignore_duplicates(titles: list) -> list:
    new_titles = []
    for title in titles:
        if title in new_titles:
            continue
        else:
            new_titles.append(title)
    
    return new_titles

def main() -> None:
    original_titles = read_file('novels.txt')
    safe_titles = ignore_duplicates(original_titles)
    print(safe_titles)

if __name__ == '__main__':
    cProfile.run('main()')

Python

Executing the above script will show the safe_titles variable that contains the unique titles. On a lower-end machine, the same script takes more seconds to run and that's why we're not done yet.

Let's begin the cProfile part by importing this module first.

import cProfile
...

Python

At the very bottom of the script, replace the phrase main() with cProfile.run('main()'). It not only deals with its parameter as an interpreting line, but illustrates the information we need for optimizing.

321630 function calls in 0.510 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <frozen codecs>:260(__init__)
        1    0.000    0.000    0.000    0.000 <frozen codecs>:309(__init__)
      308    0.000    0.000    0.001    0.000 <frozen codecs>:319(decode)
        1    0.001    0.001    0.510    0.510 <string>:1(<module>)
        1    0.447    0.447    0.447    0.447 file.py:15(ignore_duplicates)
        1    0.001    0.001    0.509    0.509 file.py:25(main)
        1    0.036    0.036    0.059    0.059 file.py:3(read_file)
      308    0.000    0.000    0.000    0.000 {built-in method _codecs.utf_8_decode}
        1    0.000    0.000    0.510    0.510 {built-in method builtins.exec}
        1    0.001    0.001    0.001    0.001 {built-in method builtins.print}
        1    0.000    0.000    0.000    0.000 {built-in method io.open}
        1    0.000    0.000    0.000    0.000 {method '__exit__' of '_io._IOBase' objects}
   161002    0.005    0.000    0.005    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.009    0.009    0.010    0.010 {method 'readlines' of '_io._IOBase' objects}
   160000    0.009    0.000    0.009    0.000 {method 'removesuffix' of 'str' objects}

Python

In the first line, we have 321630 function calls in 0.5 seconds. This execution time is almost 5 seconds on the low-end machine.

We have two ~160K calls in two lines. To sort the depicted table, simply add the sort=COLUMN attribute to the cProfile.run() method.

 ...
    cProfile.run('main()', sort='ncalls') # sorting by number of calls

Python

Here is the result.

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   161002    0.005    0.000    0.005    0.000 {method 'append' of 'list' objects}
   160000    0.009    0.000    0.009    0.000 {method 'removesuffix' of 'str' objects}
      308    0.000    0.000    0.000    0.000 {built-in method _codecs.utf_8_decode}
      308    0.000    0.000    0.001    0.000 <frozen codecs>:319(decode)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.447    0.447    0.448    0.448 file.py:15(ignore_duplicates)

Python

We can see which lines get executed the most. We have two lines with over +160K executions! That's massive. It's time to resolve things.

read_file() optimization: We're looping through all names in order to remove the \n suffix and append the name into another list that is described like {method 'append' of 'list' objects} in the first row.

That's a heavy load on the system. Instead, we could do it slightly cleaner by returning file.read().splitlines() right within the with scope as follows. It's an impressive O(n) to O(1) complexity.

def read_file(name: str) -> list:
    with open(name,'r') as file:
        return file.read().splitlines()

Python

ignore_duplicates() optimization: Next, it's time to refactor the core function. There is an O(1) trick that we can use to make a list filled with unique items. This involves using sets.

def ignore_duplicates(titles: list) -> list:
    return list(set(titles))

Python

Now, our source code should look like the following snippet after all optimizations and refactors are applied.

#!/bin/python
import cProfile


def read_file(name: str) -> list:
    with open(name,'r') as file:
        return file.read().splitlines()


def ignore_duplicates(titles: list) -> list:
    return list(set(titles))


def main() -> None:
    original_titles = read_file('novels.txt')
    safe_titles = ignore_duplicates(original_titles)
    print(safe_titles)


if __name__ == '__main__':
    cProfile.run('main()')

Python

Let’s execute the program and wait for the result of cProfile to shine through even faster than before!

 15 function calls in 0.017 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.008    0.008    0.008    0.008 {method 'splitlines' of 'str' objects}
        1    0.005    0.005    0.005    0.005 file2.py:8(ignore_duplicates)
        1    0.001    0.001    0.017    0.017 <string>:1(<module>)
        1    0.001    0.001    0.001    0.001 {built-in method builtins.print}
        1    0.001    0.001    0.001    0.001 {built-in method _codecs.utf_8_decode}
        1    0.001    0.001    0.001    0.001 {method 'read' of '_io.TextIOWrapper' objects}
        1    0.000    0.000    0.009    0.009 file2.py:3(read_file)
        1    0.000    0.000    0.017    0.017 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method io.open}
        1    0.000    0.000    0.015    0.015 file2.py:12(main)
        1    0.000    0.000    0.000    0.000 {method '__exit__' of '_io._IOBase' objects}
        1    0.000    0.000    0.001    0.001 <frozen codecs>:319(decode)
        1    0.000    0.000    0.000    0.000 <frozen codecs>:309(__init__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 <frozen codecs>:260(__init__)

Python

There is a huge performance improvement here. We went from >300K function calls to only 15 calls. We also made it much faster, from 0.5s to 0.01, which is ~x50 quicker than before!

With the applied performance improvements, we can have the same functionality but 50 times faster on the low-spec machine as well.

Another way of applying cProfile through components is via decorators. Simply structure cProfile objects and with the help of io and pstats modules, we can make our cProfile much easier to use and more precise. A good matching module that helps manipulate the statistics returned from cProfile is pstats.

In the next section, we'll share some tricks to turn the solid raw information that cProfile provides us into better graphical shapes and reports using a third-party library in Python.

cProfile visualization in GUI

There are online services and tools that can be used to illustrate the output and reports of a cProfile Python execution. It's a nice practice for those who need the reports for meetings and technical events.

A quick library installation is needed. We assume that you've already installed virtualenv and have your environment activated.

$ pip install snakeviz

Bash

Now, we need to export the output of cProfile into a .prof file. The following commands do so and keep the statistics in a file named program.prof.

$ python -m cProfile -o program.prof file.py

Bash

As a last step, run the snakeviz command and it will illustrate the reports right on your browser.

$ snakeviz program.prof
snakeviz web server started on 127.0.0.1:8080; enter Ctrl-C to exit
http://127.0.0.1:8080/snakeviz/...

Bash

Browse the link and you'll see a panel like this.

cprofile python.webp

You can customize and filter the shown data as you wish. Feel free to work with the utilities that the tool provides.

Conclusion

Redesigning a codebase is costly but with the help of monitoring and profiling methodologies, you can avoid full redesigns and optimize it instead. In the cProfile example discussed in this article, you'll have plenty of information for optimizing as well as an accurate report of how your functions work over time. Your team can have sessions to analyze the results obtained from libraries and modules, such as cProfile in Python and pstat, for better performance.

Packaging in Python: Tools and Best Practices

Author
Sadra Yahyapour

Sadra is a Python back-end developer who loves the architectural design behind the software. A GitHub Campus Expert and open-source contributor. He spends his free time writing high-quality technical articles.

How to Profile Python Code With cProfile

A brief look at profiling

cProfile in Python

cProfile visualization in GUI

Conclusion

Share this post

Share