5 projects that push Python performance

It's spiffy and convenient, but most everyone who uses Python knows it's comparatively creaky -- orders of magnitude slower than C, Java, or JavaScript for CPU-intensive work. But several parties don't want to ditch all that's good about Python and instead have decided to boost its performance from the inside out.

If you want to make Python run faster on the same hardware, you have two basic options, each with a drawback:

  1. You can create a replacement for the default runtime used by the language (the CPython implementation) -- a major undertaking, but the result would be a drop-in replacement for CPython.
  2. You can rewrite existing Python code to take advantage of certain speed optimizations, which means more work for the programmer but doesn't require changes in the runtime.

Here are five possible ways the bar could be raised -- and in some cases already has been -- on Python performance.

PyPy

Among the candidates for a drop-in replacement for CPython, PyPy is easily the most visible (Quora, for instance, uses it in production). It also stands the best chance of becoming the default, as it's highly compatible with existing Python code.

PyPy uses just-in-time (JIT) compilation, the same technique used by Google Chrome's V8 JavaScript engine to speed up that language. The most recent release, PyPy 2.5, emerged at the beginning of February with a slew of performance improvements, among them better-integrated support for some common libraries used to accelerate Python performance such as NumPy.

Those using Python 3.x have to work with a separate build of the project, PyPy3. Unfortunately for lovers of bleeding-edge language features, that version supports up to Python 3.2.5 only, although support for 3.3 is in the works.
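To make the drop-in idea concrete, here is a minimal sketch of the kind of CPU-bound, pure-Python loop a JIT is meant to help. The function names (count_primes, timed_run) and the workload are illustrative assumptions, not anything taken from PyPy itself. The same file runs unmodified under CPython or PyPy and reports which interpreter executed it; no timings are claimed here, since they depend on the machine and the interpreter version.

```python
import platform
from timeit import default_timer


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def timed_run(limit=20000):
    """Return (interpreter name, result, elapsed seconds) for one run."""
    start = default_timer()
    result = count_primes(limit)
    elapsed = default_timer() - start
    # platform.python_implementation() reports e.g. "CPython" or "PyPy".
    return platform.python_implementation(), result, elapsed


if __name__ == "__main__":
    impl, result, elapsed = timed_run()
    print("{0}: {1} primes found in {2:.3f}s".format(impl, result, elapsed))
```

Running the script once with each interpreter gives a rough side-by-side comparison without touching the code, which is the whole appeal of a replacement runtime.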

Pyston

Pyston, sponsored by Dropbox, also uses JIT compilation to speed up Python, in its case built on the LLVM compiler infrastructure. Compared to PyPy, Pyston is in the very early stages -- it's at revision 0.2 so far and supports only a limited subset of the language's features. Much of the work has been divided between supporting core features of the language and bringing up performance of key benchmarks to an acceptable level. It'll be a while before Pyston can be considered remotely production-ready.

Nuitka

Rather than replace the Python runtime, some teams are doing away with a Python runtime entirely and seeking ways to transpile Python code to languages that run natively at high speed. Case in point: Nuitka, which converts Python to C++ code -- although it relies on executables from the existing Python runtimes to work its magic. That limits its portability, but there's no denying the value of the velocity gained from this conversion. Long-term plans for Nuitka include allowing Nuitka-compiled Python to interface directly with C code, allowing for even greater speed.

Cython

Cython (C extensions for Python) is a superset of Python, a version of the language that compiles to C and interfaces with C/C++ code. It's one way to write C extensions for Python (where code that needs to run fast can be implemented), but it can also be used on its own, separate from conventional Python code. The downside is that you're not really writing Python, so porting existing code wouldn't be totally automatic.

That said, Cython provides several advantages for the sake of speed not available in vanilla Python, among them variable typing à la C itself. A number of scientific packages for Python, such as scikit-learn, draw on Cython features like this to keep operations lean and fast.

Numba

Numba combines two of the previous approaches. From Cython, it takes the concept of speeding up the parts of the language that most need it (typically CPU-bound math); like Pyston, it does so via LLVM. Functions to be compiled with Numba are marked with a decorator, and Numba works hand-in-hand with NumPy to quicken the functions found. Unlike PyPy and Pyston, Numba doesn't replace the runtime: It compiles only the decorated functions, by default just in time when they are first called, while the rest of the program keeps running on ordinary CPython.
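As a rough illustration of the decorator approach, the sketch below marks one numeric function for compilation with Numba's njit decorator. The function sum_of_squares is a made-up example, and the fallback branch is an assumption added here so the file still runs, uncompiled, on a machine where Numba isn't installed.

```python
try:
    from numba import njit  # Numba's decorator for compiled functions
except ImportError:
    def njit(func):
        """Stand-in used only when Numba is missing: returns func unchanged."""
        return func


@njit
def sum_of_squares(n):
    """Sum i*i for i in range(n): a tight numeric loop, Numba's typical target."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(10))
```

Only the decorated function is handed to the compiler; everything else in the program stays ordinary Python, which is what makes this approach easy to apply to one hot spot at a time.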

Python creator Guido van Rossum is adamant that many of Python's performance issues can be traced to improper use of the language. CPU-heavy processing, for instance, can be hastened through a few methods touched on here -- using NumPy (for math), using the multiprocessing module, or making calls to external C code and thus avoiding the Global Interpreter Lock (GIL), which keeps the threads of a single Python process from running Python code on more than one core at a time. But since there's no viable replacement yet for the GIL in Python, it falls to others to come up with short-term solutions -- and maybe long-term ones, too.
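Of the methods just mentioned, multiprocessing is the easiest to show in a few lines. The sketch below is one hedged way to do it, with hypothetical helper names (split_range, partial_sum_of_squares) and an arbitrary workload: the job is cut into chunks, and each chunk goes to a separate worker process, which has its own interpreter and therefore its own GIL.

```python
from multiprocessing import Pool


def partial_sum_of_squares(bounds):
    """Sum i*i over the half-open range [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def split_range(total, parts):
    """Split range(total) into `parts` contiguous (start, stop) chunks."""
    step = total // parts
    chunks = [(i * step, (i + 1) * step) for i in range(parts)]
    # The last chunk absorbs any remainder left by integer division.
    chunks[-1] = (chunks[-1][0], total)
    return chunks


if __name__ == "__main__":
    chunks = split_range(1000000, 4)
    # Each chunk is computed in its own process, so the work can
    # spread across cores instead of queuing behind a single GIL.
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum_of_squares, chunks)
    print(sum(partials))
```

The trade-off is overhead: arguments and results are copied between processes, so this pays off only when each chunk represents a meaningful amount of computation.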

This story, "5 projects that push Python performance," was originally published by InfoWorld.
