A new sampling profiler tool for Python developers, Py-Spy, gathers statistics about running Python programs without needing to instrument the code or even restart a running application.
Written by developer Ben Frederickson, Py-Spy can be installed via Python’s pip installer, and it runs on both Linux and Windows. This makes it uncomplicated to set up and useful in almost any environment where Python is running.
Most profiling systems for Python, says Frederickson in his project’s notes, require changes to the source code to instrument the application. Aside from the possible hassles of modifying source, this also means the profiling code has to run in the same process as the app itself. “This means it’s not generally safe to use these profilers for debugging issues in production services since they will usually have a noticeable impact on performance,” he writes.
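To make the contrast concrete, here is a minimal sketch of in-process profiling using cProfile from Python's standard library. The `busy_work` function is a hypothetical stand-in for application code, not something from Frederickson's notes. The point is that the profiler has to be switched on from inside the program being measured, so it shares that program's process and adds overhead to it.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical workload standing in for application code."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_in_process(n=200_000):
    # The profiler is enabled from within the application itself, so
    # every call it traces is traced inside this same process.
    profiler = cProfile.Profile()
    profiler.enable()
    busy_work(n)
    profiler.disable()

    # Render the collected statistics as text, sorted by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_in_process())
```

Getting this report required editing the program to wrap the call in `enable()` and `disable()`, which is exactly the kind of source change an external profiler avoids.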
A live profile of a running Python script. Stats are sampled from the Python executable and can be sorted according to overall time or percentage of time used.
Py-Spy takes a different approach. It runs as a separate process, takes in the process ID of the Python app to analyze, and uses the kernel-level APIs on the platform where it’s running to read the app’s memory. This way, claims Frederickson, Py-Spy is safe to use in production.
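The idea of sampling, as opposed to instrumenting, can be illustrated with a short sketch. This is only an analogy and not how Py-Spy is implemented: it samples a thread from inside the same process using the standard library's `sys._current_frames()`, whereas Py-Spy reads the interpreter's memory from outside. The `spin` worker, the sampling interval, and the duration are all assumptions chosen for the demonstration.

```python
import collections
import sys
import threading
import time


def spin(stop_event):
    """Hypothetical CPU-bound worker that the sampler observes."""
    while not stop_event.is_set():
        sum(i * i for i in range(1000))


def stack_names(frame):
    """Walk a frame's callers and return function names, outermost first."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    return tuple(reversed(names))


def sample_thread(thread_id, duration=0.3, interval=0.01):
    """Record the target thread's call stack at regular intervals."""
    counts = collections.Counter()
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[stack_names(frame)] += 1
        time.sleep(interval)
    return counts


def run_demo():
    stop_event = threading.Event()
    worker = threading.Thread(target=spin, args=(stop_event,))
    worker.start()
    try:
        return sample_thread(worker.ident)
    finally:
        stop_event.set()
        worker.join()


if __name__ == "__main__":
    for stack, hits in run_demo().most_common(3):
        print(hits, " > ".join(stack))
```

The worker's code is never modified. The sampler simply looks at where the thread happens to be every few milliseconds, and stacks that show up in more samples account for more of the running time.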
The resulting traces can be dumped to the console, where the functions sampled most often show up in a list, or visualized as a flame graph. Py-Spy also provides statistics for how much time a process spends waiting on the Python interpreter’s Global Interpreter Lock (GIL). The GIL enforces thread-safe memory management, but at the cost of multithreaded performance, so Py-Spy can provide some perspective on how much of an impact the GIL has on any application.
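Both views come from the same raw material, a pile of sampled call stacks. The sketch below shows one way such samples might be aggregated, under the assumption that each sample is a tuple of function names ordered from outermost to innermost. The sample data and function names are invented for illustration and do not come from Py-Spy's output. The "folded" text form, with frames joined by semicolons and followed by a count, is the input format commonly accepted by flame graph tools.

```python
from collections import Counter

# Hypothetical samples: each tuple is one observed call stack,
# ordered from the outermost caller to the innermost function.
SAMPLES = [
    ("main", "load", "parse"),
    ("main", "load", "parse"),
    ("main", "load", "read"),
    ("main", "render"),
]


def top_functions(samples):
    """Count how often each function was the innermost frame of a sample."""
    return Counter(stack[-1] for stack in samples).most_common()


def fold_stacks(samples):
    """Collapse identical stacks into 'frame;frame;frame count' lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {count}" for stack, count in sorted(counts.items())]


if __name__ == "__main__":
    print(top_functions(SAMPLES))
    print("\n".join(fold_stacks(SAMPLES)))
```

The list view answers "which function was running most often", while the folded stacks preserve the caller chain, which is what lets a flame graph show where that time sits in the program's structure.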
Py-Spy uses binaries written in Rust to accomplish much of its magic. Most of the binaries bundled into Python apps are written in C or C++, or in the Cython variant of Python that compiles to native C. But several projects have sprung up to make it easier to write Rust applications that interface with Python and vice versa, to leverage Rust’s memory safety and machine-native speeds.
Right now there’s one drawback to using Rust in this way: Python’s setuptools system doesn’t yet have integrated support for building and bundling a Rust binary. Normally, you’d need a Rust compiler on the system where Py-Spy was being installed. Frederickson worked out what he calls “a pretty terrible hack that might be useful to other people” to bundle the Py-Spy binaries in the pip install package.
Another limitation of Py-Spy itself is that it can’t gather information about C extensions for Python running in the same process, just the Python interpreter itself. However, Frederickson notes that it may be possible to do this with additional work, such as by using the libunwind library.