Table of Contents
- Concurrency vs. Parallelism: What’s the Difference?
- The Global Interpreter Lock (GIL): Python’s Hidden Constraint
- Threading in Python: Lightweight Concurrency
- Multiprocessing in Python: True Parallelism
- Simpler Concurrency with concurrent.futures
- Threading vs. Multiprocessing: When to Use Which?
- Best Practices
- References
1. Concurrency vs. Parallelism: What’s the Difference?
Before diving into Python’s tools, it’s critical to distinguish between concurrency and parallelism, two terms that are often confused:
- Concurrency: Managing multiple tasks whose lifetimes overlap in time. Tasks may start, run, and complete in any order, but they don’t necessarily execute simultaneously. Think of a chef juggling chopping vegetables, boiling water, and seasoning a dish: the tasks are interleaved but not done at the exact same time.
- Parallelism: Executing multiple tasks simultaneously (e.g., on separate CPU cores). This requires hardware support (multiple cores/processors). Think of two chefs working side by side: one chops while the other boils water, so the tasks run at the same time.
Python supports both, but the choice between threading and multiprocessing depends on whether your task is I/O-bound (waiting for input/output, e.g., network calls, file reads) or CPU-bound (intensive computations, e.g., mathematical modeling).
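One rough way to tell the two kinds of task apart is to compare wall-clock time with CPU time. The sketch below is illustrative only: io_bound_task, cpu_bound_task, and measure are hypothetical helpers, and time.sleep merely stands in for a real network or disk wait.

```python
import time

def io_bound_task(seconds=0.2):
    """Stand-in for a network call: the thread mostly waits."""
    time.sleep(seconds)

def cpu_bound_task(n=200_000):
    """Stand-in for heavy computation: the CPU is busy the whole time."""
    return sum(i * i for i in range(n))

def measure(func):
    """Return (wall_seconds, cpu_seconds) for a single call of func."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    func()
    return time.perf_counter() - wall0, time.process_time() - cpu0

io_wall, io_cpu = measure(io_bound_task)
cpu_wall, cpu_cpu = measure(cpu_bound_task)
print(f"I/O-bound: wall={io_wall:.3f}s cpu={io_cpu:.3f}s")
print(f"CPU-bound: wall={cpu_wall:.3f}s cpu={cpu_cpu:.3f}s")
```

For the simulated I/O task, CPU time should be a small fraction of wall time; for the computation, the two figures should be close to each other.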
2. The Global Interpreter Lock (GIL): Python’s Hidden Constraint
To understand why threading and multiprocessing behave differently in Python, we must first discuss the Global Interpreter Lock (GIL).
The GIL is a mutex (mutual exclusion lock) in CPython (Python’s default interpreter) that ensures only one thread executes Python bytecode at a time. This simplifies memory management (e.g., reference counting) but limits true parallelism for CPU-bound tasks:
- Effect on Threading: Even with multiple threads, only one can execute Python code at a time. For CPU-bound tasks, this means threading won’t speed up execution (threads take turns running). For I/O-bound tasks, however, threads spend most of their time waiting (e.g., for a network response), and a thread releases the GIL while it blocks on I/O, allowing other threads to run.
- Effect on Multiprocessing: Each process gets its own Python interpreter and memory space, bypassing the GIL. Thus, multiprocessing enables true parallelism for CPU-bound tasks.
Key Takeaway: The GIL is a per-interpreter lock. Threads share the same interpreter (and GIL), while processes do not.
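The effect can be observed with a small experiment. This is a sketch, not a benchmark: count_down and the workload size N are illustrative choices, and the timings depend on the machine and interpreter. On a standard CPython build with the GIL, the threaded run is usually about as slow as the sequential one; a free-threaded build may behave differently.

```python
import threading
import time

def count_down(n):
    """Pure-Python busy loop: CPU-bound, so it needs the GIL to make progress."""
    while n > 0:
        n -= 1
    return n

N = 2_000_000  # illustrative workload size (assumption)

# Run the work twice, one call after the other.
t0 = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - t0

# Run the same two calls in two threads.
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
t0 = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

print(f"sequential: {sequential:.3f}s, threaded: {threaded:.3f}s")
```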
3. Threading in Python: Lightweight Concurrency
Threading is ideal for I/O-bound tasks (e.g., web scraping, API calls) where tasks spend little time using the CPU and much time waiting. Python’s threading module provides a high-level interface for creating and managing threads.
3.1 The threading Module
The threading module simplifies thread management with the Thread class. Here’s a basic workflow:
- Define a task (function) to run in a thread.
- Create a Thread object, passing the task and arguments.
- Start the thread with start().
- Join the thread with join() to wait for it to finish (optional).
Example: Basic Thread Creation
import threading
import time

def print_numbers(name, delay):
    """Print numbers 1-5 with a delay."""
    for i in range(1, 6):
        time.sleep(delay)
        print(f"Thread {name}: {i}")

# Create threads
thread1 = threading.Thread(target=print_numbers, args=("A", 1))
thread2 = threading.Thread(target=print_numbers, args=("B", 1.5))

# Start threads
thread1.start()
thread2.start()

# Wait for threads to finish
thread1.join()
thread2.join()
print("Main thread finished.")
Output (order may vary):
Thread A: 1
Thread B: 1
Thread A: 2
Thread A: 3
Thread B: 2
Thread A: 4
Thread B: 3
Thread A: 5
Thread B: 4
Thread B: 5
Main thread finished.
Daemon Threads
By default, the main thread waits for all non-daemon threads to finish. To create a background thread that exits when the main thread exits, set daemon=True:
daemon_thread = threading.Thread(target=print_numbers, args=("Daemon", 1), daemon=True)
daemon_thread.start()
time.sleep(3) # Main thread sleeps; daemon runs
print("Main thread exiting (daemon thread may be killed).")
3.2 Thread Safety and Race Conditions
Threads share the same memory space, so they can access shared variables. This can lead to race conditions—when multiple threads modify a shared resource simultaneously, causing unexpected behavior.
Example: Race Condition
import threading

counter = 0

def increment_counter():
    global counter
    for _ in range(100000):
        counter += 1  # Not thread-safe!

# Create 10 threads
threads = [threading.Thread(target=increment_counter) for _ in range(10)]

# Start threads
for thread in threads:
    thread.start()

# Wait for threads to finish
for thread in threads:
    thread.join()

# May be less than 1,000,000; whether updates are lost depends on the Python version and timing
print(f"Expected counter: 1,000,000. Actual: {counter}")
Why? The operation counter += 1 isn’t atomic: it expands to a read, an add, and a write (roughly temp = counter; temp += 1; counter = temp). If two threads read the same value of counter before either one writes back, both store the same result and one increment is lost.
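The separate steps can be made visible with the standard dis module. This is a small illustrative sketch; the increment function is a hypothetical stand-in for the loop body above, and the exact opcode names between the load and the store vary across Python versions.

```python
import dis

counter = 0

def increment():
    global counter
    counter += 1

# Collect the names of the bytecode instructions that make up increment().
opnames = [instr.opname for instr in dis.get_instructions(increment)]
print(opnames)
```

The list contains a LOAD_GLOBAL for counter, an add instruction, and a later STORE_GLOBAL. A thread switch between the load and the store is what allows an update to be lost.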
Fixing Race Conditions with Locks
Use threading.Lock to enforce mutual exclusion—only one thread can hold the lock at a time:
import threading

counter = 0
lock = threading.Lock()  # Create a lock

def safe_increment():
    global counter
    for _ in range(100000):
        with lock:  # Acquire lock; release automatically when done
            counter += 1

threads = [threading.Thread(target=safe_increment) for _ in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Expected counter: 1,000,000. Actual: {counter}")  # Now correct!
3.3 Example: Threading for I/O-Bound Tasks
Let’s scrape multiple URLs concurrently to demonstrate threading’s value for I/O-bound tasks. We’ll use requests for HTTP calls and measure execution time.
import threading
import requests
import time

def fetch_url(url):
    """Fetch a URL and return its status code."""
    response = requests.get(url, timeout=10)  # Timeout so a stalled server cannot hang a thread
    return f"{url}: {response.status_code}"

# List of URLs to scrape
urls = [
    "https://www.google.com",
    "https://www.github.com",
    "https://www.python.org",
    "https://www.stackoverflow.com"
]

# Sequential execution
start = time.time()
for url in urls:
    print(fetch_url(url))
print(f"Sequential time: {time.time() - start:.2f}s")

# Threaded execution
start = time.time()
threads = []
results = []

def thread_task(url):
    # list.append is safe to call from several threads in CPython
    results.append(fetch_url(url))

for url in urls:
    thread = threading.Thread(target=thread_task, args=(url,))
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()

for result in results:
    print(result)
print(f"Threaded time: {time.time() - start:.2f}s")  # Usually faster; the speedup depends on network latency
Why Faster? Threads wait for network responses, so the GIL is released, allowing other threads to fetch URLs concurrently.
4. Multiprocessing in Python: True Parallelism
Multiprocessing is ideal for CPU-bound tasks (e.g., data processing, mathematical computations) where tasks require heavy CPU usage. The multiprocessing module spawns separate processes, each with its own Python interpreter and memory space.
4.1 The multiprocessing Module
The multiprocessing module mirrors threading in many ways but uses Process instead of Thread. Key differences:
- Processes don’t share memory by default (no race conditions, but harder to share data).
- Higher overhead than threads (separate memory spaces, slower startup).
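The first difference is easy to observe. In this minimal sketch (the names counter and bump are illustrative), a child process modifies a module-level variable, and the parent’s copy stays unchanged because each process has its own memory.

```python
import multiprocessing

counter = 0  # Module-level state; each process works on its own copy

def bump():
    """Runs in the child process and changes only the child's copy."""
    global counter
    counter += 1
    print(f"child:  counter = {counter}")

if __name__ == "__main__":
    child = multiprocessing.Process(target=bump)
    child.start()
    child.join()
    print(f"parent: counter = {counter}")  # The parent's value is untouched
```

To get a value back from the child, it has to be sent explicitly, for example through a queue or shared memory.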
Example: Basic Process Creation
import multiprocessing
import time

def square_numbers(name, numbers):
    """Square a list of numbers and print results."""
    for num in numbers:
        time.sleep(0.5)
        print(f"Process {name}: {num}^2 = {num**2}")

if __name__ == "__main__":  # Guard required where processes are spawned (e.g., Windows, macOS)
    # Split work into two processes
    process1 = multiprocessing.Process(
        target=square_numbers,
        args=("A", [1, 2, 3])
    )
    process2 = multiprocessing.Process(
        target=square_numbers,
        args=("B", [4, 5, 6])
    )

    # Start processes
    process1.start()
    process2.start()

    # Wait for processes to finish
    process1.join()
    process2.join()
    print("Main process finished.")
4.2 Inter-Process Communication (IPC)
Since processes don’t share memory, use these tools to pass data:
- Queue: Thread/process-safe FIFO queue for sending data between processes.
- Pipe: Two-way communication channel between two processes.
- Shared Memory: Value and Array for sharing primitive data types (e.g., integers, arrays).
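As a minimal sketch of the shared-memory option, the example below has four processes increment one multiprocessing.Value. The helper add_many and the counts used are illustrative assumptions; the lock obtained from get_lock() guards each read-modify-write so that no update is lost.

```python
import multiprocessing

def add_many(shared_total, n):
    """Add 1 to the shared integer n times, holding its lock for each update."""
    for _ in range(n):
        with shared_total.get_lock():
            shared_total.value += 1

if __name__ == "__main__":
    total = multiprocessing.Value("i", 0)  # "i" = C signed int, initial value 0
    workers = [
        multiprocessing.Process(target=add_many, args=(total, 1000))
        for _ in range(4)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    print(f"Shared total: {total.value}")
```

Shared memory suits small numeric state like counters and flags. For passing richer Python objects between processes, a Queue or Pipe is usually the simpler choice.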
Example: Using Queue for IPC
import multiprocessing

def producer(queue):
    """Add items to the queue."""
    for i in range(5):
        queue.put(i)
        print(f"Produced: {i}")

def consumer(queue):
    """Remove items from the queue."""
    while True:
        item = queue.get()
        if item is None:  # Sentinel to exit
            break
        print(f"Consumed: {item}")

if __name__ == "__main__":  # Guard required where processes are spawned (e.g., Windows, macOS)
    # Create a queue
    queue = multiprocessing.Queue()

    # Start producer and consumer
    producer_process = multiprocessing.Process(target=producer, args=(queue,))
    consumer_process = multiprocessing.Process(target=consumer, args=(queue,))
    producer_process.start()
    consumer_process.start()

    # Wait for producer to finish
    producer_process.join()

    # Send sentinel to consumer
    queue.put(None)
    consumer_process.join()
4.3 Example: Multiprocessing for CPU-Bound Tasks
Let’s compute prime numbers (a CPU-heavy task) using multiprocessing to leverage multiple cores.
import multiprocessing
import time

def is_prime(n):
    """Check if a number is prime."""
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def count_primes_in_range(start, end):
    """Count primes in the half-open range [start, end)."""
    count = 0
    for num in range(start, end):
        if is_prime(num):
            count += 1
    return count

def process_task(start, end, queue):
    """Worker: count primes in [start, end) and put the count on the queue."""
    queue.put(count_primes_in_range(start, end))

if __name__ == "__main__":  # Guard required where processes are spawned (e.g., Windows, macOS)
    # Define a large range to process
    start_range = 1
    end_range = 1_000_000

    # Sequential execution
    t0 = time.time()
    sequential_count = count_primes_in_range(start_range, end_range)
    print(f"Sequential primes: {sequential_count}")
    print(f"Sequential time: {time.time() - t0:.2f}s")

    # Multiprocessing execution (split range across CPU cores)
    num_processes = multiprocessing.cpu_count()  # Use all available cores
    chunk_size = (end_range - start_range) // num_processes
    ranges = [
        (start_range + i * chunk_size, start_range + (i + 1) * chunk_size)
        for i in range(num_processes)
    ]
    # Extend the last chunk so a range that does not divide evenly is fully covered
    ranges[-1] = (ranges[-1][0], end_range)

    # Create processes
    t0 = time.time()  # Restart the timer for the parallel run
    processes = []
    results = multiprocessing.Queue()  # To collect results
    for chunk_start, chunk_end in ranges:  # Distinct names so the timer variable is not overwritten
        process = multiprocessing.Process(
            target=process_task,
            args=(chunk_start, chunk_end, results)
        )
        processes.append(process)
        process.start()

    # Collect one result per process, then wait for the processes to exit
    total_primes = 0
    for _ in processes:
        total_primes += results.get()
    for process in processes:
        process.join()
    print(f"Multiprocessing primes: {total_primes}")
    print(f"Multiprocessing time: {time.time() - t0:.2f}s")  # Speedup depends on core count and process overhead
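Splitting a range into chunks is easy to get subtly wrong when its length is not a multiple of the worker count, because integer division silently drops the remainder. The helper below is one possible sketch; split_range is a hypothetical name, not part of the multiprocessing module. It spreads the leftover items over the first few chunks so that every number is covered exactly once.

```python
def split_range(start, end, parts):
    """Split the half-open range [start, end) into `parts` contiguous chunks.

    The first (length % parts) chunks are one item longer than the rest.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(end - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks

print(split_range(1, 11, 3))  # [(1, 5), (5, 8), (8, 11)]
```

Each (lo, hi) pair can be passed directly to a worker that counts primes in that range.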
5. Simpler Concurrency with concurrent.futures
The concurrent.futures module (introduced in Python 3.2) provides a high-level interface for threading and multiprocessing via ThreadPoolExecutor and ProcessPoolExecutor. These abstractions simplify task submission and result handling.
5.1 ThreadPoolExecutor
For I/O-bound tasks, ThreadPoolExecutor manages a pool of worker threads.
Example: Fetch URLs with ThreadPoolExecutor
import concurrent.futures
import requests
import time

def fetch_url(url):
    response = requests.get(url, timeout=10)  # Timeout so a stalled server cannot hang a worker
    return f"{url}: {response.status_code}"

urls = [
    "https://www.google.com",
    "https://www.github.com",
    "https://www.python.org",
    "https://www.stackoverflow.com"
]

start = time.time()
# Use ThreadPoolExecutor with 4 workers
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    # Map URLs to fetch_url (returns results in order)
    results = executor.map(fetch_url, urls)
    for result in results:
        print(result)
print(f"ThreadPoolExecutor time: {time.time() - start:.2f}s")
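When some tasks may fail, submit() combined with as_completed() lets you handle each result or exception as soon as it is ready instead of in input order. The sketch below avoids real network calls: fake_fetch is a hypothetical stand-in that sleeps and raises ValueError for a negative delay, and the job names are made up for illustration.

```python
import concurrent.futures
import time

def fake_fetch(name, delay):
    """Hypothetical stand-in for a network call; fails for negative delays."""
    if delay < 0:
        raise ValueError(f"bad delay for {name}")
    time.sleep(delay)
    return f"{name}: ok"

jobs = {"fast": 0.05, "slow": 0.2, "broken": -1}
outcomes = {}

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # Remember which future belongs to which job name.
    future_to_name = {
        executor.submit(fake_fetch, name, delay): name
        for name, delay in jobs.items()
    }
    for future in concurrent.futures.as_completed(future_to_name):
        name = future_to_name[future]
        try:
            outcomes[name] = future.result()  # Re-raises the task's exception, if any
        except ValueError as exc:
            outcomes[name] = f"failed ({exc})"

print(outcomes)
```

One failing task does not stop the others here, because each future’s exception is caught separately.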
5.2 ProcessPoolExecutor
For CPU-bound tasks, ProcessPoolExecutor manages a pool of worker processes.
Example: Prime Counting with ProcessPoolExecutor
import concurrent.futures
import multiprocessing
import time

def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def count_primes(start, end):
    return sum(1 for num in range(start, end) if is_prime(num))

if __name__ == "__main__":  # Guard required where processes are spawned (e.g., Windows, macOS)
    start_range, end_range = 1, 1_000_000
    num_processes = multiprocessing.cpu_count()
    chunk_size = (end_range - start_range) // num_processes
    ranges = [
        (start_range + i * chunk_size, start_range + (i + 1) * chunk_size)
        for i in range(num_processes)
    ]
    ranges[-1] = (ranges[-1][0], end_range)  # Cover any remainder in the last chunk

    t0 = time.time()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # Submit tasks and collect futures
        futures = [executor.submit(count_primes, lo, hi) for lo, hi in ranges]
        # Wait for all futures to complete and sum results
        total_primes = sum(future.result() for future in concurrent.futures.as_completed(futures))
    print(f"ProcessPoolExecutor primes: {total_primes}")
    print(f"ProcessPoolExecutor time: {time.time() - t0:.2f}s")
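An alternative to chunking the range by hand is to let the executor batch the work. The sketch below uses the chunksize argument of ProcessPoolExecutor.map, which sends items to workers in groups rather than one at a time. The helper name count_primes_with_map and the chunk size of 1000 are illustrative assumptions, not tuned values.

```python
import concurrent.futures

def is_prime(n):
    """Check if a number is prime by trial division."""
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def count_primes_with_map(numbers, chunksize=1000):
    """Count primes by mapping is_prime over numbers in worker processes."""
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # chunksize batches the items to reduce inter-process overhead
        return sum(executor.map(is_prime, numbers, chunksize=chunksize))

if __name__ == "__main__":
    print(count_primes_with_map(range(1, 100_000)))
```

With a very small chunksize, the cost of sending each number to a worker can outweigh the work itself, so it is worth benchmarking a few values.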
6. Threading vs. Multiprocessing: When to Use Which?
| Factor | Threading | Multiprocessing |
|---|---|---|
| Use Case | I/O-bound tasks (e.g., web requests) | CPU-bound tasks (e.g., data processing) |
| Parallelism | No (GIL limits Python execution) | Yes (separate interpreters) |
| Memory Sharing | Shared (use locks for safety) | Not shared (use IPC for communication) |
| Overhead | Low (lightweight threads) | High (separate memory spaces) |
| Debugging | Easier (shared memory) | Harder (separate processes) |
7. Best Practices
- Avoid Global Variables in Threads: Use function arguments or thread-local storage (threading.local()) to avoid race conditions.
- Use concurrent.futures for Simplicity: Prefer ThreadPoolExecutor/ProcessPoolExecutor over raw threading/multiprocessing for cleaner code.
- Limit Process/Thread Count: Each thread costs memory and scheduling overhead, so avoid creating more than ~1000. For processes, use os.cpu_count() to match available cores.
- Use Locks Sparingly: Overusing locks causes bottlenecks. Design code to minimize shared state.
- Test Both Approaches: Benchmark threading and multiprocessing for your specific task, since real-world performance may surprise you!
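As a sketch of the thread-local storage mentioned in the first practice, each thread below writes to the same attribute of one threading.local() object and reads back only its own value. The names request_context, handle, and the user strings are illustrative. The barrier forces all writes to happen before any read, so the isolation is actually exercised.

```python
import threading

request_context = threading.local()  # Each thread sees its own attributes
barrier = threading.Barrier(3)       # Make all three threads write before any reads
seen = {}                            # Thread name -> value that thread read back
seen_lock = threading.Lock()

def handle(user):
    request_context.user = user      # Visible only in the current thread
    barrier.wait()                   # By now every thread has set its own value
    value = request_context.user
    with seen_lock:
        seen[threading.current_thread().name] = value

threads = [
    threading.Thread(target=handle, args=(user,), name=f"worker-{user}")
    for user in ("alice", "bob", "carol")
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(seen)
print(hasattr(request_context, "user"))  # False: the main thread never set it
```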
8. References
- Python threading Module Documentation
- Python multiprocessing Module Documentation
- Python concurrent.futures Module Documentation
- Real Python: Python Multiprocessing Guide
- Fluent Python by Luciano Ramalho (Chapter 17: Concurrency with Futures)
- Python GIL Explained
By mastering threading and multiprocessing, you’ll unlock Python’s full potential for building fast, responsive applications. Remember: choose threading for I/O, multiprocessing for CPU, and concurrent.futures for simplicity! 🚀