1. Core System & I/O Modules
These modules handle low-level interactions with the system, file systems, and input/output operations.
sys: System-specific Parameters & Functions
The sys module provides access to variables and functions maintained by the Python interpreter. It’s essential for handling command-line arguments, controlling how a script exits, and working with standard input/output.
Key Components:
- sys.argv: List of command-line arguments passed to the script.
- sys.exit([status]): Exit the interpreter with an optional status code (0 = success).
- sys.stdin / sys.stdout / sys.stderr: File-like objects for standard input, output, and error streams.
- sys.modules: Dictionary mapping module names to loaded module objects.
Example: Access Command-Line Arguments
import sys
# sys.argv[0] is the script name; sys.argv[1:] are arguments
print(f"Script name: {sys.argv[0]}")
print(f"Arguments: {sys.argv[1:]}")
# Example usage: python script.py hello world
# Output:
# Script name: script.py
# Arguments: ['hello', 'world']
Example: Exit with Status Code
import sys
def main():
    if len(sys.argv) < 2:
        print("Error: Missing argument!", file=sys.stderr) # Write to stderr
        sys.exit(1) # Non-zero exit code indicates failure
    print(f"Hello, {sys.argv[1]}!")
    sys.exit(0) # Success
if __name__ == "__main__":
    main()
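Example: Inspect sys.modules and Write to Streams
The Key Components list also mentions sys.modules and the standard streams, which the examples above do not exercise. The following is a minimal sketch of both; the choice of the json module and the io.StringIO buffer are illustrative assumptions, not part of the original examples.

```python
import io
import json
import sys

# sys.modules caches every module that has been imported so far.
json_is_cached = "json" in sys.modules
# The cached entry is the very same object that the import statement bound.
same_object = sys.modules["json"] is json
print(f"json cached: {json_is_cached}, same object: {same_object}")

# The standard streams are ordinary file-like objects, so print() can
# target sys.stderr or any stand-in that offers a write() method.
buffer = io.StringIO()  # In-memory stand-in for a stream (illustrative)
print("captured line", file=buffer)
captured = buffer.getvalue()

sys.stdout.write(f"Buffer holds: {captured!r}\n")
print("This goes to the error stream", file=sys.stderr)
```

Because print() accepts any object with a write() method, the same pattern is handy for capturing output in tests.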
os: Operating System Interactions
The os module abstracts operating system (OS) differences, allowing you to interact with the file system, environment variables, and process management in a cross-platform way.
Key Components:
- os.environ: Dictionary-like object for environment variables.
- os.listdir(path): List files/directories in path.
- os.path: Submodule for path manipulation (e.g., os.path.join(), os.path.exists()).
- os.system(command): Execute a command in a subshell and return its exit status (the encoding of that value is platform-dependent).
Example: Read Environment Variables
import os
# Get the PATH environment variable
path = os.environ.get("PATH", "") # Fall back to an empty string if PATH is unset
print(f"System PATH: {path[:50]}...") # Truncate for readability
# Set a custom environment variable (temporary for the process)
os.environ["MY_APP_CONFIG"] = "/etc/myapp/config.ini"
print(f"Custom config path: {os.environ['MY_APP_CONFIG']}")
Example: List Files in a Directory
import os
current_dir = os.getcwd() # Get current working directory
print(f"Files in {current_dir}:")
for name in os.listdir(current_dir):
    # listdir returns bare names, so join each with the directory before checking
    if os.path.isfile(os.path.join(current_dir, name)): # Skip directories
        print(f" - {name}")
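Example: Build and Check Paths with os.path
The os.path helpers named in the Key Components list can be sketched as follows. This is an illustration under assumptions: it works inside a throwaway temporary directory, and the file name notes.txt is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # os.path.join inserts the correct separator for the current platform
    file_path = os.path.join(tmp_dir, "notes.txt")  # hypothetical file name
    existed_before = os.path.exists(file_path)

    with open(file_path, "w") as f:
        f.write("hello")
    existed_after = os.path.exists(file_path)

    # os.listdir returns bare names, not full paths
    names = os.listdir(tmp_dir)

    # Split the final path component into stem and extension
    base, ext = os.path.splitext(os.path.basename(file_path))

print(f"Existed before/after writing: {existed_before}/{existed_after}")
print(f"Directory contents: {names}")
print(f"Stem: {base}, extension: {ext}")
```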
pathlib: Object-Oriented File Paths
Introduced in Python 3.4, pathlib provides an object-oriented alternative to os.path for path manipulation. It makes path handling more intuitive and readable.
Key Components:
- Path: Core class representing a file system path.
- Path.joinpath(*paths): Combine paths (equivalent to os.path.join).
- Path.exists(): Check if the path exists.
- Path.glob(pattern): Find files matching a glob pattern (e.g., *.txt).
Example: Create and Query Paths
from pathlib import Path
# Create a Path object for the user's home directory
home = Path.home()
print(f"Home directory: {home}")
# Build a path to a documents folder
docs_path = home / "Documents" / "reports" # Use / operator to join paths
print(f"Reports path: {docs_path}")
# Check if the path exists; create it if not
if not docs_path.exists():
    docs_path.mkdir(parents=True, exist_ok=True) # parents=True creates nested dirs
    print(f"Created: {docs_path}")
# Find all .pdf files in the reports folder
pdf_files = list(docs_path.glob("*.pdf"))
print(f"PDF files found: {[f.name for f in pdf_files]}")
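Example: Compare joinpath, the / Operator, and os.path.join
To make the claimed equivalence between Path.joinpath and os.path.join concrete, here is a small sketch. It assumes a temporary directory populated with three hypothetical files so that it does not touch your home folder.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)

    # Create a few sample files (names are illustrative)
    for name in ("a.txt", "b.txt", "c.pdf"):
        (root / name).write_text("sample")

    # Three ways to build the same path
    via_method = root.joinpath("sub", "a.txt")
    via_operator = root / "sub" / "a.txt"
    via_os_path = Path(os.path.join(tmp_dir, "sub", "a.txt"))
    all_equal = via_method == via_operator == via_os_path

    # glob yields Path objects in arbitrary order; sort the names for stable output
    txt_names = sorted(p.name for p in root.glob("*.txt"))

    # Building a path never creates it on disk
    sub_exists = via_method.exists()

print(f"All three paths equal: {all_equal}")
print(f"Text files: {txt_names}")
print(f"'sub/a.txt' exists: {sub_exists}")
```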
2. Data Handling & Manipulation
These modules simplify working with dates, serialization, and advanced data structures.
datetime: Date & Time Management
The datetime module provides classes for manipulating dates, times, and time intervals with precision.
Key Components:
- date: Represents a date (year, month, day).
- time: Represents a time (hour, minute, second, microsecond).
- datetime: Combines date and time.
- timedelta: Represents a duration (e.g., 3 days, 2 hours).
- strftime(format) / strptime(string, format): Format/parse datetime strings.
Example: Create and Format Datetimes
from datetime import date, datetime, timedelta
# Create a date object
today = date.today()
print(f"Today: {today}") # Output: YYYY-MM-DD
# Create a datetime object (with time)
now = datetime.now()
print(f"Current time: {now}") # Output: YYYY-MM-DD HH:MM:SS.ffffff
# Add 7 days to today
next_week = today + timedelta(days=7)
print(f"Next week: {next_week}")
# Format datetime as a string (strftime)
formatted = now.strftime("%A, %B %d, %Y - %H:%M:%S")
print(f"Formatted: {formatted}") # Example: "Monday, January 01, 2024 - 14:30:45"
# Parse a string into a datetime (strptime)
date_str = "2023-12-25"
christmas = datetime.strptime(date_str, "%Y-%m-%d")
print(f"Parsed date: {christmas.date()}")
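Example: Measure Durations with timedelta
The example above adds a timedelta to a date. Subtracting two datetimes works the other way around and yields a timedelta. The sketch below uses fixed, made-up timestamps so that the results are reproducible.

```python
from datetime import date, datetime, time, timedelta

# Fixed timestamps chosen for illustration
start = datetime(2024, 1, 1, 9, 0)
end = datetime(2024, 1, 3, 17, 30)

# Subtracting two datetimes produces a timedelta
elapsed = end - start
print(f"Elapsed: {elapsed}")                # 2 days, 8:30:00
print(f"Whole days: {elapsed.days}")        # 2
hours = elapsed.total_seconds() / 3600
print(f"Total hours: {hours}")              # 56.5

# Combine a separate date and time into one datetime
meeting = datetime.combine(date(2024, 1, 1), time(14, 30))
print(f"Meeting: {meeting.isoformat()}")    # 2024-01-01T14:30:00
```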
json: JSON Serialization/Deserialization
The json module converts between Python objects (dicts, lists) and JSON strings/files, a common task for APIs, config files, and data storage.
Key Components:
- json.dump(obj, file): Write obj to a file as JSON.
- json.dumps(obj): Convert obj to a JSON string.
- json.load(file): Read JSON from a file into a Python object.
- json.loads(string): Parse a JSON string into a Python object.
Example: Serialize and Deserialize Data
import json
# Sample Python data
data = {
    "name": "Alice",
    "age": 30,
    "is_student": False,
    "hobbies": ["reading", "hiking"]
}
# Serialize to JSON string (dumps = "dump string")
json_str = json.dumps(data, indent=4) # indent for readability
print("JSON string:")
print(json_str)
# Serialize to a file (dump)
with open("data.json", "w") as f:
    json.dump(data, f, indent=4)
# Deserialize from file (load)
with open("data.json", "r") as f:
    loaded_data = json.load(f)
print("\nLoaded data:", loaded_data)
print("Name:", loaded_data["name"]) # Access like a dict
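Example: Parse a JSON String and Handle Bad Input
The file-based example does not show json.loads, so here is a brief sketch of it. The raw string and the parse_or_none helper are hypothetical and exist only for this illustration.

```python
import json

raw = '{"name": "Alice", "age": 30, "is_student": false, "nickname": null}'
parsed = json.loads(raw)

# JSON false/null map to Python False/None
print(parsed["is_student"], parsed["nickname"])

# Tuples are written as JSON arrays, so they come back as lists
round_tripped = json.loads(json.dumps((1, 2)))
print(round_tripped)


def parse_or_none(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


bad = parse_or_none("{not valid json}")
print(bad)
```

Catching json.JSONDecodeError is one reasonable way to deal with untrusted input; whether to return None or re-raise depends on the application.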
collections: Enhanced Data Structures
The collections module extends Python’s built-in data structures (lists, dicts, tuples) with specialized types for common use cases.
Key Components:
- namedtuple: Immutable tuple with named fields (e.g., Point(x=1, y=2)).
- deque: Double-ended queue for efficient appends/pops from both ends.
- defaultdict: Dict that creates missing keys on first access by calling a default factory (e.g., list or int).
- Counter: Counts hashable objects (e.g., word frequencies).
Example: namedtuple for Structured Data
from collections import namedtuple
# Define a named tuple type "Point" with fields x and y
Point = namedtuple("Point", ["x", "y"])
p = Point(x=5, y=10)
print(f"Point: ({p.x}, {p.y})") # Access by name
print(f"Tuple form: {tuple(p)}") # Still behaves like a tuple
Example: defaultdict to Avoid KeyErrors
from collections import defaultdict
# defaultdict with list as default (auto-creates empty list for new keys)
word_indices = defaultdict(list)
words = ["apple", "banana", "apple", "cherry", "banana"]
for idx, word in enumerate(words):
    word_indices[word].append(idx) # No KeyError for new words
print("Word indices:")
for word, indices in word_indices.items():
    print(f" {word}: {indices}")
Example: Counter for Frequency Counting
from collections import Counter
fruits = ["apple", "banana", "apple", "orange", "banana", "apple"]
count = Counter(fruits)
print("Fruit counts:", count)
print("Most common:", count.most_common(2)) # Top 2 most common
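Example: deque as a Queue and a Sliding Window
deque is listed among the key components but has no example above. A minimal sketch, with made-up data, might look like this:

```python
from collections import deque

# Appends and pops are efficient at both ends
queue = deque(["b", "c"])
queue.append("d")        # Add to the right end
queue.appendleft("a")    # Add to the left end
first = queue.popleft()  # Remove from the left end
last = queue.pop()       # Remove from the right end
print(first, last, list(queue))  # a d ['b', 'c']

# With maxlen set, the oldest items are discarded automatically
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(list(recent))  # [2, 3, 4]
```

The maxlen form is a convenient way to keep only the most recent N items, such as the last few log lines.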
itertools: Efficient Iteration Tools
The itertools module provides functions for creating and combining iterators, enabling memory-efficient loops and complex iterations (e.g., permutations, combinations).
Key Components:
- product: Cartesian product of iterables (e.g., product([1,2], ['a','b']) → (1,'a'), (1,'b'), (2,'a'), (2,'b')).
- permutations(iterable, r): All possible r-length permutations of iterable.
- chain: Combine multiple iterables into one (e.g., chain([1,2], [3,4]) → 1,2,3,4).
- islice: Slice an iterator without converting it to a list (memory-efficient).
Example: Generate Permutations
from itertools import permutations
# Generate all 2-length permutations of [1,2,3]
perms = permutations([1,2,3], r=2)
print("Permutations of length 2:", list(perms)) # Output: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
Example: Chain Iterables
from itertools import chain
list1 = [1, 2, 3]
list2 = ['a', 'b', 'c']
combined = chain(list1, list2)
print("Combined iterable:", list(combined)) # Output: [1, 2, 3, 'a', 'b', 'c']
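Example: product and islice
The remaining two functions from the list, product and islice, can be sketched as follows. The infinite generator of even numbers is an assumption added here to show why islice matters: it cannot be turned into a list first.

```python
from itertools import count, islice, product

# Cartesian product: every pairing of one item from each iterable
pairs = list(product([1, 2], ["a", "b"]))
print(pairs)  # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

# count() never ends, so slicing must happen lazily
evens = (n for n in count() if n % 2 == 0)
first_five = list(islice(evens, 5))
print(first_five)  # [0, 2, 4, 6, 8]
```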
3. Text Processing
re: Regular Expressions
The re module enables pattern matching and manipulation of text using regular expressions, which are powerful for tasks like validation, parsing, and search/replace.
Key Components:
- re.search(pattern, string): Search for pattern anywhere in string (returns a match object).
- re.match(pattern, string): Match pattern at the start of string.
- re.findall(pattern, string): Return all non-overlapping matches as a list.
- re.sub(pattern, repl, string): Replace matches of pattern with repl in string.
Example: Validate Email Addresses
import re
email_pattern = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
def is_valid_email(email):
    return re.match(email_pattern, email) is not None # match checks from start
print(is_valid_email("user@example.com")) # True
print(is_valid_email("invalid-email")) # False
Example: Extract URLs from Text
import re
text = "Visit https://python.org or http://example.com for more info."
url_pattern = r"https?://[^\s]+" # Matches http:// or https:// followed by non-whitespace
urls = re.findall(url_pattern, text)
print("Extracted URLs:", urls) # Output: ['https://python.org', 'http://example.com']
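Example: search vs. match, and Replacing with sub
The difference between re.search and re.match, and the use of re.sub, is easiest to see side by side. The sample sentence below is invented for this sketch.

```python
import re

text = "Order 1234 shipped on 2024-05-20."

# search scans the whole string; groups capture parts of the match
found = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
year, month, day = found.groups()
print(year, month, day)  # 2024 05 20

# match only succeeds at the start of the string, so this returns None
at_start = re.match(r"\d{4}-\d{2}-\d{2}", text)
print(at_start)  # None

# sub replaces every match; here each digit is masked
masked = re.sub(r"\d", "#", text)
print(masked)  # Order #### shipped on ####-##-##.
```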
4. Testing & Debugging
unittest: Unit Testing Framework
The unittest module (inspired by JUnit) provides tools for writing and running unit tests to validate code correctness.
Key Components:
- unittest.TestCase: Base class for test cases, with assertion methods (e.g., assertEqual, assertTrue).
- setUp() / tearDown(): Run before/after each test method.
- unittest.main(): Run the tests defined in the current module.
Example: Test a Simple Function
import unittest
def add(a, b):
    return a + b
class TestAddFunction(unittest.TestCase):
    def test_add_positive_numbers(self):
        self.assertEqual(add(2, 3), 5) # Assert 2+3=5
    def test_add_negative_numbers(self):
        self.assertEqual(add(-1, -1), -2) # Assert (-1)+(-1)=-2
    def test_add_zero(self):
        self.assertEqual(add(0, 5), 5) # Assert 0+5=5
if __name__ == "__main__":
    unittest.main() # Run all tests
Output:
...
----------------------------------------------------------------------
Ran 3 tests in 0.001s
OK
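Example: Fresh Fixtures with setUp and tearDown
setUp() and tearDown() are mentioned above but not demonstrated. The sketch below uses a hypothetical TestShoppingCart class whose fixture is a plain list. It runs the suite through TextTestRunner instead of unittest.main() so that the result object can be inspected afterwards; that choice is an assumption made for illustration.

```python
import unittest


class TestShoppingCart(unittest.TestCase):
    def setUp(self):
        # Runs before every test method: each test gets a fresh cart
        self.cart = ["apple"]

    def tearDown(self):
        # Runs after every test method, even if the test failed
        self.cart.clear()

    def test_add_item(self):
        self.cart.append("banana")
        self.assertEqual(self.cart, ["apple", "banana"])

    def test_starts_with_one_item(self):
        # Unaffected by test_add_item because setUp rebuilt the cart
        self.assertEqual(len(self.cart), 1)


# Load and run the tests programmatically
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestShoppingCart)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(f"Ran {result.testsRun} tests, success: {result.wasSuccessful()}")
```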
logging: Flexible Logging System
The logging module is a more flexible alternative to print statements for debugging and monitoring, offering configurable severity levels, output destinations, and formatting.
Key Components:
- Log levels: DEBUG (10), INFO (20), WARNING (30), ERROR (40), CRITICAL (50).
- logging.basicConfig(): Configure logging (level, format, file).
- logging.debug(msg) / info() / warning() / etc.: Log messages at the specified level.
Example: Basic Logging Setup
import logging
# Configure logging: write to file, set level to DEBUG, and format messages
logging.basicConfig(
    filename="app.log",
    level=logging.DEBUG, # Capture DEBUG and above
    format="%(asctime)s - %(levelname)s - %(message)s" # Include timestamp and level
)
logging.debug("This is a debug message (detailed info for debugging)")
logging.info("User 'alice' logged in")
logging.warning("Low disk space!")
logging.error("Failed to connect to database")
logging.critical("Server is down!")
app.log contents:
2024-05-20 14:30:00,123 - DEBUG - This is a debug message (detailed info for debugging)
2024-05-20 14:30:00,124 - INFO - User 'alice' logged in
2024-05-20 14:30:00,124 - WARNING - Low disk space!
2024-05-20 14:30:00,125 - ERROR - Failed to connect to database
2024-05-20 14:30:00,125 - CRITICAL - Server is down!
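Example: Filter Messages by Level
The numeric levels listed above determine which messages get through. As a rough sketch, the code below sets a named logger to WARNING and captures its output in memory. The logger name demo_app and the io.StringIO target are assumptions chosen so the result can be checked without writing a file.

```python
import io
import logging

# Send log output to an in-memory buffer instead of a file
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))

logger = logging.getLogger("demo_app")  # hypothetical logger name
logger.setLevel(logging.WARNING)        # Ignore anything below WARNING (30)
logger.addHandler(handler)
logger.propagate = False                # Keep messages away from the root logger

logger.info("User 'alice' logged in")   # Dropped: INFO (20) < WARNING (30)
logger.warning("Low disk space!")
logger.error("Failed to connect to database")

output = stream.getvalue()
print(output, end="")
```

Only the WARNING and ERROR lines appear in the buffer, since the INFO message falls below the configured threshold.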
5. Conclusion
Python’s standard library is a treasure trove of tools that streamline development across domains. From system interactions (sys, os) to data processing (datetime, json), text manipulation (re), and testing (unittest), these modules reduce reliance on third-party packages and give you built-in support for testing and logging your code.
This post covered only a subset of the standard library. Explore further modules like csv (CSV parsing), sqlite3 (database), socket (networking), and math (mathematical operations) to expand your toolkit. The key is to familiarize yourself with what’s available, so you can reach for the right module instead of reinventing the wheel.