Measure First

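Never optimize without profiling first. cProfile (also runnable as python -m cProfile) shows which functions consume the most time, and timeit measures how long a small snippet takes to run: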

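import cProfile

def slow_function():
    # Sum of squares; the generator expression dominates the runtime
    total = sum(i**2 for i in range(100000))
    return total

# Sort by cumulative time so the most expensive calls are listed first
cProfile.run('slow_function()', sort='cumulative')

import timeit

# Prints the total time in seconds for 10,000 runs of the statement
print(timeit.timeit('sum(range(1000))', number=10000))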
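
The full cProfile report can be long. As a minimal sketch, assuming only the top few entries matter, the statistics can be collected with cProfile.Profile and trimmed with pstats. The helper name profile_top and the default of five rows are illustrative assumptions, not part of the example above.

```python
import cProfile
import io
import pstats


def slow_function():
    return sum(i**2 for i in range(100000))


def profile_top(func, limit=5):
    """Profile func() and return a report of the `limit` costliest entries.

    profile_top is a hypothetical helper name used for illustration.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and print only the first `limit` rows
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_top(slow_function)
print(report)
```

Returning the report as a string rather than printing it directly makes it easy to log or compare between runs.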
  

Line Profiler

pip install line_profiler
# Mark each function to profile with @profile, then run:
kernprof -l -v myscript.py
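
kernprof -l only records functions decorated with @profile, a name it makes available as a builtin while the script runs. The following is a sketch of what myscript.py might contain; the function sum_of_squares is a hypothetical example, and the no-op fallback is an addition so the script also runs under plain python.

```python
# myscript.py (contents are a hypothetical example)
try:
    profile  # available as a builtin when run under `kernprof -l`
except NameError:
    def profile(func):
        # No-op fallback so the script still runs with plain `python`
        return func


@profile
def sum_of_squares(n):
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(100000))
```

With -v, kernprof prints per-line hit counts and timings for each decorated function once the script finishes.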
  

Common Optimizations

Use Built-ins and Comprehensions

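# Slow: an explicit loop with a method call on every iteration
result = []
for x in data:
    result.append(x * 2)

# Fast: the comprehension builds the same list without the repeated append lookups
result = [x * 2 for x in data]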
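
In keeping with "measure first", the difference is worth checking on your own data rather than assuming. A small sketch using timeit follows; the data size and repeat count are arbitrary assumptions, and the function names are illustrative.

```python
import timeit

data = list(range(10000))  # example input


def with_loop():
    result = []
    for x in data:
        result.append(x * 2)
    return result


def with_comprehension():
    return [x * 2 for x in data]


# Both approaches must produce the same list
assert with_loop() == with_comprehension()

loop_time = timeit.timeit(with_loop, number=200)
comp_time = timeit.timeit(with_comprehension, number=200)
print(f"loop: {loop_time:.3f}s  comprehension: {comp_time:.3f}s")
```

The exact ratio depends on the Python version and the work done per element, so treat the printed numbers as a local measurement only.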
  

Choose the Right Data Structure

# Average O(1) membership test; the same check on a list is O(n)
users_set = set(user_ids)
if user_id in users_set: ...

# Average O(1) lookup by key
cache = {}  # dict
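
To see the effect of the data structure, the same membership test can be timed against a list and a set. This is a sketch with made-up data: the 100,000 IDs and the choice of the last element as the target (the worst case for a list scan) are assumptions for illustration.

```python
import timeit

user_ids = list(range(100000))  # example data
users_set = set(user_ids)
target = 99999  # last element: worst case for a linear list scan

list_time = timeit.timeit(lambda: target in user_ids, number=100)
set_time = timeit.timeit(lambda: target in users_set, number=100)
print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

Building the set costs O(n) once, so the conversion pays off when the same collection is queried many times.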
  

Generators for Large Data

def read_large_file(path):
    with open(path) as f:
        for line in f:
            yield line.strip()
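
A possible way to consume the generator is shown below. It repeats read_large_file so the block is self-contained and writes a small temporary file as a stand-in for a large one; the file contents are invented for the example.

```python
import os
import tempfile


def read_large_file(path):
    with open(path) as f:
        for line in f:
            yield line.strip()


# Write a small stand-in file; a real input would be far larger
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\n\nbeta\ngamma\n")
    path = tmp.name

try:
    # Lines are consumed one at a time; the file is never held in a list
    non_empty = sum(1 for line in read_large_file(path) if line)
finally:
    os.remove(path)

print(non_empty)
```

Because the generator yields one line at a time, peak memory stays roughly constant regardless of file size.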
  

Caching

from functools import lru_cache

@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)
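
Whether a cache is actually helping can be checked with cache_info(), which lru_cache attaches to the decorated function. A short sketch that reuses the function above; the argument 30 is an arbitrary choice.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


value = fibonacci(30)
info = fibonacci.cache_info()  # named tuple: hits, misses, maxsize, currsize
print(value, info)

# Reset the cache, for example between benchmark runs
fibonacci.cache_clear()
```

A high ratio of hits to misses suggests the cache is paying for itself; a low ratio suggests the arguments rarely repeat and the cache only adds overhead.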
  

NumPy for Numerical Work

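import numpy as np

# Vectorized operations run in compiled code and are often 10-100x faster
# than pure Python loops for array math
matrix_a = np.random.rand(200, 200)  # example inputs
matrix_b = np.random.rand(200, 200)
result = np.dot(matrix_a, matrix_b)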
  

Multiprocessing for CPU-Bound Tasks

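from concurrent.futures import ProcessPoolExecutor

# process_item and items are placeholders for your own function and data.
# The __main__ guard is required on platforms that spawn worker processes.
if __name__ == '__main__':
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(process_item, items))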
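
For a complete, runnable variant, the sketch below substitutes a hypothetical CPU-bound task, count_primes, for process_item. The function names and the input limits are assumptions chosen for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound stand-in task: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n**0.5) + 1)):
            count += 1
    return count


def run_parallel(limits):
    # Each limit is handled in a separate worker process
    with ProcessPoolExecutor() as executor:
        return list(executor.map(count_primes, limits))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import
    print(run_parallel([10000, 20000, 30000]))
```

The worker function is defined at module level because it must be picklable to be sent to other processes. For very small tasks the cost of starting processes and moving data can outweigh the gain, so this is worth profiling as well.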
  

When to Use C Extensions

For critical hot loops, consider:

  • Cython — compile Python-like code to C
  • Numba — JIT compile with @jit decorator
  • PyPy — alternative Python interpreter with JIT

from numba import jit

@jit(nopython=True)
def fast_sum(arr):
    total = 0
    for x in arr:
        total += x
    return total
  

Anti-Patterns

  • Creating objects in tight loops unnecessarily
  • String concatenation with + in loops (use ''.join())
  • Reading entire large files into memory
  • Ignoring database query optimization (N+1 problem)
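
The string concatenation item above can be illustrated with a short comparison. This is a sketch with invented data; CPython sometimes optimizes += on strings in place, so the measured gap may be smaller than the quadratic worst case suggests.

```python
import timeit

words = [str(i) for i in range(10000)]  # example data


def concat_in_loop():
    text = ""
    for word in words:
        text += word  # may copy the growing string on each iteration
    return text


def join_once():
    return "".join(words)  # builds the result in a single call


# Both approaches must produce the same string
assert concat_in_loop() == join_once()

print("+=  :", timeit.timeit(concat_in_loop, number=100))
print("join:", timeit.timeit(join_once, number=100))
```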

Profile, optimize the bottleneck, verify improvement — repeat.