Many of us who depend on Python a great deal these days know it’s quite easy to make it run faster without having to change code at all. Sound too good to be true? What’s the catch?
The catch is this: You need to be using a relatively recent (within the last five years) x86/x86-64 processor, and your Python code needs to lean on NumPy, SciPy, scikit-learn, or mpi4py.
If that’s the case, you really should try Tip #1. If not, skip to Tip #2, which can help everyone but isn’t quite as magical.
Tip #1: Download and install the Intel Distribution for Python. It’s quick and painless, it’s completely free, and it doesn’t require any code changes. The few moments it takes are easily repaid in faster execution if you do much with Python. This is the biggest tune-up of all: simply install the Intel Distribution for Python to power either Python 2.7 or Python 3.6 code. You can do this on Linux, Windows, or macOS, and you can use conda or pip for package management. If you are an Anaconda user, you can simply set up an Intel channel (conda config --add channels intel) and bring in the Python you want (see the instructions in the Intel blog post titled Installing Intel Distribution for Python and Intel Performance Libraries with Anaconda).
I’ve helped quite a few different teams do this, and each was initially skeptical. I can report all have been surprised at the obvious boost in performance they received (each was using at least one of NumPy, SciPy, or scikit-learn). A couple of them were involved in startups doing machine learning work and were considering rewriting code in C simply for speed-ups. After tuning up their Python as I recommend and seeing it run faster, they decided that wasn’t needed.
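If you want to see the difference for yourself, one simple approach is to time the same NumPy-heavy snippet under your current Python and again under the Intel Distribution. The sketch below is only an illustration: the helper names (best_time, make_matmul_workload) and the matrix size are my own assumptions, not part of any Intel tooling, and a matrix multiply is just a stand-in for whatever your real code does.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (in seconds) over several calls of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def make_matmul_workload(n=500):
    """Hypothetical stand-in workload: multiply two n x n random matrices."""
    import numpy as np  # imported here so best_time() works even without NumPy

    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    # The matrices are built once, so only the multiply itself gets timed.
    return lambda: np.dot(a, b)


if __name__ == "__main__":
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed; nothing to measure.")
    else:
        print("NumPy version:", np.__version__)
        np.show_config()  # prints the BLAS/LAPACK libraries NumPy was built against
        seconds = best_time(make_matmul_workload())
        print("Best of 5 runs: %.4f s" % seconds)
```

Run the same script in both environments and compare the reported times. The numbers will depend entirely on your machine and workload, so treat them as a rough indication rather than a benchmark.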
Tip #2: The next big thing you can do is . . . understand your code. I highly recommend at least peeking at your code with the premier performance analysis tool for x86: the Intel® VTune Amplifier. For free. (For years, VTune has been a strong seller for Intel, but the latest version is also available to anyone for free.)
I’ve been a fan of VTune for several decades, including my years as a developer working for Intel. I love it because I can trust it to accurately tell me what code is causing what performance issues.
Now two big changes in recent years make it even better! It’s easier to use, and the current version is available for free.
Before anyone sells you on “do this” or “do that” to change your Python code for performance, be sure you know how your code is really behaving on the actual system. Surprises are more common than most people expect, and often a surprise has a happy upside by hinting strongly at changes that will enhance performance.
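As a lighter first look at where time goes, Python’s standard library includes the cProfile module. To be clear, this is not VTune: it only sees time spent in Python-level function calls and tells you nothing about the rest of the machine. The sketch below is a minimal illustration, and profile_call and slow_sum are hypothetical names I made up for it.

```python
import cProfile
import io
import pstats


def profile_call(fn, *args, top=10, **kwargs):
    """Run fn(*args, **kwargs) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = fn(*args, **kwargs)
    finally:
        profiler.disable()

    # Collect the report in a string instead of printing straight to stdout.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)  # show the top entries only
    return result, stream.getvalue()


def slow_sum(n):
    """Hypothetical hot spot: a pure-Python loop summing squares."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    result, report = profile_call(slow_sum, 200000)
    print("result:", result)
    print(report)
```

The report lists functions sorted by cumulative time, which is often enough to spot an obvious hot spot before moving on to a whole-system tool.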
The Intel VTune user guide has easy-to-follow instructions on doing Python Code Analysis with VTune. Best of all, VTune can accurately understand performance on the whole machine regardless of what application is causing it. (This means that you’ll see if a file open operation is horribly delayed by a virus checker, for instance.) Knowing where performance is lost is useful. At least you know what you’re up against!
Try it today!
Python is the foundation for so much work these days that it deserves a few minutes of tune‑up attention. My two simple tips can lead to higher-performance Python code. Here’s where to go for more information: