Python and Cython

advanced
Published

March 31, 2024

Python’s ease of use and readability make it a favorite for many programmers. However, when performance becomes critical, Python’s interpreted nature can be a bottleneck. This is where Cython steps in, offering a powerful solution to bridge the gap between Python’s ease of development and the speed of compiled languages like C.

What is Cython?

Cython is a superset of Python that allows you to write C extensions for Python. It compiles your code (which looks largely like Python) into C, leveraging the speed of compiled languages while retaining much of Python’s syntax and ease of development. This means you can write performance-critical sections of your code in a language that’s almost Python, gaining significant speed improvements without completely rewriting everything in C or C++.

Why Use Cython?

Python’s Global Interpreter Lock (GIL) limits true parallelism in multithreaded applications. While multiprocessing can overcome this, it adds complexity. Cython allows you to bypass the GIL in certain scenarios, enabling more efficient multithreading for CPU-bound tasks.

Another key benefit is the ability to interact directly with C libraries. If you need to use existing C or C++ code, Cython provides a seamless way to integrate it into your Python project without sacrificing speed.

A Simple Cython Example

Let’s compare a pure Python function with its Cython equivalent:

Python (pure Python):

def py_sum_squares(n):
    total = 0
    for i in range(n):
        total += i*i
    return total

print(py_sum_squares(1000000))

Cython (.pyx file):

def cy_sum_squares(int n):
    cdef int i
    cdef long long total = 0  # Specify data types for optimization
    for i in range(n):
        total += i*i
    return total

To compile the Cython code:

  1. Save the Cython code as sum_squares.pyx.
  2. Create a setup.py file:
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("sum_squares.pyx")
)
  1. Run python setup.py build_ext --inplace in your terminal. This will generate a .so (or .pyd on Windows) file containing the compiled Cython code.

Now you can import and use the Cython function in your Python code:

import sum_squares

print(sum_squares.cy_sum_squares(1000000))

You’ll likely observe a substantial speed improvement with the Cython version, especially for larger values of n. The key here is specifying data types (cdef int i, cdef long long total) in the Cython code, which allows Cython to generate much more efficient C code.

Working with NumPy Arrays

Cython shines when working with NumPy arrays. Direct access to array elements without the overhead of Python’s indexing mechanisms yields dramatic speed boosts.

Python (using NumPy):

import numpy as np

def py_numpy_sum(arr):
    total = 0
    for i in range(len(arr)):
        total += arr[i]
    return total

arr = np.arange(1000000)
print(py_numpy_sum(arr))

Cython (using NumPy):

import numpy as np
cimport numpy as np

def cy_numpy_sum(np.ndarray[np.int64_t] arr): # Specify NumPy array type
    cdef long long total = 0
    cdef int i
    cdef int n = arr.shape[0]
    for i in range(n):
        total += arr[i]
    return total

arr = np.arange(1000000, dtype=np.int64)
print(cy_numpy_sum(arr))

Again, compile this using the same setup.py method as before. The Cython version will significantly outperform the pure Python version when dealing with large NumPy arrays. The type declaration np.ndarray[np.int64_t] arr is crucial for optimization.

Beyond the Basics

This post covers the fundamentals. Cython’s capabilities extend far beyond these simple examples, including memory management techniques, working with C++ code, and more advanced optimization strategies. Exploring its documentation will unlock its full potential for accelerating your Python projects.