Table of Contents
- Introduction to doctest
- How doctest Works
- Getting Started with doctest
- Writing Effective Doctests
- Advanced doctest Features
- Best Practices for Using doctest
- Limitations of doctest
- Conclusion
- References
Introduction to doctest
At its core, doctest is a testing framework that extracts code examples from docstrings (Python’s built-in documentation strings) and executes them to verify their correctness. These examples mimic interactive Python sessions (like those in the Python shell), making them both human-readable (for documentation) and machine-executable (for testing).
Why Use doctest?
- Documentation Validation: Ensures code examples in docstrings are always up-to-date with the actual code behavior.
- Simplicity: No need to learn a new testing syntax—examples look like standard Python shell input/output.
- Integration: Seamlessly embedded in your code, so documentation and tests live together.
How doctest Works
doctest operates in three key steps:
- Extraction: It scans docstrings (in modules, functions, classes, or methods) for lines starting with >>> (the Python prompt) and ... (continuation lines of multi-line input). These lines are treated as executable code.
- Execution: The extracted code is run by the regular Python interpreter, in a namespace copied from the module's globals.
- Comparison: The actual output from execution is compared to the expected output (the lines following the >>> / ... prompts in the docstring). If they match, the test passes; otherwise, it fails.
For example, a docstring like this:
def add(a, b):
    """Return the sum of two numbers.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b
doctest will execute add(2, 3) and check if the result is 5, then add(-1, 1) and check for 0.
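To make the extraction step concrete, here is a minimal sketch that uses doctest.DocTestParser to pull the examples out of the add docstring without running them. The variable names (parser, examples, pairs) are illustrative choices, not part of the original example.

```python
import doctest


def add(a, b):
    """Return the sum of two numbers.

    >>> add(2, 3)
    5
    >>> add(-1, 1)
    0
    """
    return a + b


# Extraction only: parse the docstring into Example objects, nothing is executed.
parser = doctest.DocTestParser()
examples = parser.get_examples(add.__doc__)

# Each Example carries the input (source) and the expected output (want).
pairs = [(ex.source.strip(), ex.want.strip()) for ex in examples]
for source, want in pairs:
    print(f"input: {source!r}  expected: {want!r}")
```

Each pair is what doctest later executes and compares: the source is run, and its output is checked against the expected text.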
Getting Started with doctest
Let’s walk through a hands-on example to see doctest in action.
Basic Example
Suppose we’re writing a function to calculate the factorial of a non-negative integer. We’ll include a docstring with usage examples, which doctest will validate.
def factorial(n):
    """Calculate the factorial of a non-negative integer n.

    Factorial of n (n!) is the product of all positive integers up to n.
    For n=0, the factorial is 1.

    Examples:
    >>> factorial(5)
    120
    >>> factorial(0)
    1
    >>> factorial(10)
    3628800
    """
    if not isinstance(n, int):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result
Here, the docstring includes three examples. Doctest will treat each >>> factorial(...) line as a test case and verify the output matches the expected number.
Running Doctests
There are two common ways to run doctests:
1. Via the Command Line
Use Python’s -m doctest flag to run doctests in a module. For a file named math_utils.py containing the factorial function above:
python -m doctest -v math_utils.py
-v (verbose mode) shows detailed output, including which tests passed or failed. Omit -v to see only failures.
Sample output (with -v):
Trying:
    factorial(5)
Expecting:
    120
ok
Trying:
    factorial(0)
Expecting:
    1
ok
Trying:
    factorial(10)
Expecting:
    3628800
ok
1 items had no tests:
    math_utils
1 items passed all tests:
   3 tests in math_utils.factorial
3 tests in 2 items.
3 passed and 0 failed.
Test passed.
2. Embedded in Code
Add a __main__ block to your module to run doctests when the script is executed directly:
if __name__ == "__main__":
    import doctest
    doctest.testmod()  # Runs all doctests in the module
Now, run the script normally:
python math_utils.py -v
This achieves the same result as the command-line approach.
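doctest.testmod() also returns a result object with failed and attempted counts, which is handy when a script or CI job needs an exit code. The sketch below is one way to use it; the in-memory math_utils module is built with types.ModuleType purely so the example is self-contained (in a real project you would import the module from its file), and exit_code is an illustrative name.

```python
import doctest
import types

# Hypothetical stand-in for math_utils.py, built in memory for this sketch.
SOURCE = '''
def factorial(n):
    """
    >>> factorial(5)
    120
    >>> factorial(0)
    1
    """
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result
'''
math_utils = types.ModuleType("math_utils")
exec(SOURCE, math_utils.__dict__)

# testmod() accepts a module object and returns counts of failed/attempted examples.
results = doctest.testmod(math_utils, verbose=False)

# Turn the counts into a conventional process exit code (0 means success).
exit_code = 1 if results.failed else 0
print(f"{results.attempted} examples run, {results.failed} failed")
```

Passing exit_code to sys.exit() inside the __main__ block would make a failing doctest fail the calling script as well.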
Writing Effective Doctests
To get the most out of doctest, follow these guidelines for writing clear, reliable examples.
Syntax Rules
doctest examples must follow strict syntax to be recognized:
- Input Lines: Start with >>> (the Python prompt). For multi-line inputs (e.g., loops, conditionals, function definitions), use ... for continuation lines:
    def greet(name):
        """
        >>> def f(x):
        ...     return x + 1
        ...
        >>> f(3)
        4
        """
- Output Lines: Immediately follow the input line(s) and contain the expected output. For an expression, this is the value as the interactive shell would display it (its repr). For print() statements, it’s the printed text.
- Exceptions: To test for exceptions, include the Traceback header line followed by the exception type and message. doctest ignores the stack between the two, so you can replace it with ...:
    >>> factorial("not_an_integer")
    Traceback (most recent call last):
        ...
    TypeError: n must be an integer
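As a quick check of the exception rule, the following sketch runs two traceback-style examples against the factorial function from earlier. It drives doctest.DocTestParser and doctest.DocTestRunner directly instead of going through a module; the EXAMPLES string and the test name "exception_examples" are made up for illustration.

```python
import doctest


def factorial(n):
    if not isinstance(n, int):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("n must be non-negative")
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


# Only the header line and the final "Type: message" line are compared;
# the stack in between is replaced by "...".
EXAMPLES = '''
>>> factorial("not_an_integer")
Traceback (most recent call last):
    ...
TypeError: n must be an integer
>>> factorial(-1)
Traceback (most recent call last):
    ...
ValueError: n must be non-negative
'''

test = doctest.DocTestParser().get_doctest(
    EXAMPLES, {"factorial": factorial}, "exception_examples", None, 0
)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results)
```

Both examples pass because the raised exception's type and message match the last line of each expected traceback.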
Handling Edge Cases
Doctests should include examples that reflect real-world usage, including edge cases:
- Invalid inputs (e.g., non-integers for factorial).
- Boundary values (e.g., n=0 for factorial).
- Empty collections (e.g., sum_list([])).
Example with edge cases:
def sum_list(numbers):
    """Sum a list of numbers.

    >>> sum_list([1, 2, 3])
    6
    >>> sum_list([])  # Edge case: empty list
    0
    >>> sum_list([-1, 1])
    0
    """
    return sum(numbers)
Whitespace and Formatting Tips
doctest is whitespace-sensitive, so minor formatting differences can cause test failures:
- Output Must Match Exactly: Extra or missing spaces, newlines, or punctuation will break the test. In the example below, the test fails if the expected line in the docstring ends with a trailing space while the function prints "Hello, Alice!" without one:
    >>> greet("Alice")
    Hello, Alice!
- Multi-line Outputs: Preserve line breaks and indentation in expected output:
    >>> print_list([1, 2, 3])
    [1, 2, 3]
- Use +ELLIPSIS for Dynamic Outputs: For outputs with non-deterministic parts (e.g., memory addresses, timestamps), enable the +ELLIPSIS flag so that ... in the expected output matches any text. The directive must be a comment on the example's own input line, not on a separate line before it:
    >>> object()  # Memory address varies; fails without the flag
    <object object at 0x...>
    >>> object()  # doctest: +ELLIPSIS
    <object object at 0x...>
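The comparison rules above can be tried in isolation by calling doctest.OutputChecker.check_output, the method doctest uses to compare expected and actual output. This is a small sketch; the memory address in got is a made-up value for illustration.

```python
import doctest

checker = doctest.OutputChecker()

# Exact comparison: a single trailing space in the actual output is a mismatch.
trailing_space = checker.check_output("Hello, Alice!\n", "Hello, Alice! \n", 0)

want = "<object object at 0x...>\n"
got = "<object object at 0x7f3a5c0d2e10>\n"  # hypothetical address

# Without the flag, "..." is compared literally and the check fails.
without_flag = checker.check_output(want, got, 0)

# With ELLIPSIS enabled, "..." matches any substring of the actual output.
with_flag = checker.check_output(want, got, doctest.ELLIPSIS)

print(trailing_space, without_flag, with_flag)
```

The first argument is the expected text from the docstring and the second is what the code actually produced.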
Advanced doctest Features
doctest includes powerful features for handling complex scenarios.
Skipping Tests
Mark tests as skipped with the # doctest: +SKIP directive to exclude them temporarily (e.g., for known failures or platform-specific code):
>>> factorial(1000) # doctest: +SKIP
# This test is skipped (large computation, takes too long)
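A small sketch of the effect, assuming a hypothetical undefined name slow_operation: because the example is marked +SKIP, doctest never evaluates it, so the missing name does not cause a failure. The EXAMPLES string and test name are illustrative.

```python
import doctest

# slow_operation is deliberately undefined; running it would raise NameError.
EXAMPLES = '''
>>> 1 + 1
2
>>> slow_operation()  # doctest: +SKIP
'never evaluated'
'''

test = doctest.DocTestParser().get_doctest(EXAMPLES, {}, "skip_example", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print("failed:", results.failed)
```

Removing the directive would turn the second example into a failure, since the NameError traceback would not match the expected output.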
Partial Output Matching with Ellipsis
The +ELLIPSIS option lets you use ... as a wildcard to match any substring in the output. Useful for dynamic values like timestamps or object IDs:
>>> import datetime
>>> datetime.datetime.now() # doctest: +ELLIPSIS
datetime.datetime(...)
Custom Output Checkers
For complex comparisons (e.g., floating-point precision, NumPy arrays), create a custom OutputChecker to override how doctest validates output.
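Example: Tolerate floating-point errors with a custom checker. doctest.testmod() has no checker parameter, so the checker is passed to a DocTestRunner instead:
import doctest
import math
import sys

class FloatChecker(doctest.OutputChecker):
    def check_output(self, want, got, optionflags):
        # Allow small differences in floating-point numbers
        if want.startswith("Approx: "):
            try:
                expected = float(want[8:])
                got_val = float(got)
            except ValueError:
                return False  # Actual output was not a number
            return math.isclose(expected, got_val, rel_tol=1e-9)
        return super().check_output(want, got, optionflags)

def circle_area(radius):
    """
    >>> circle_area(2)
    Approx: 12.566370614359172
    """
    return math.pi * radius ** 2

if __name__ == "__main__":
    # Find the doctests in this module and run them with the custom checker
    runner = doctest.DocTestRunner(checker=FloatChecker())
    for test in doctest.DocTestFinder().find(sys.modules[__name__]):
        runner.run(test)
    runner.summarize()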
Integrating with unittest/pytest
doctest works seamlessly with other testing frameworks:
- unittest: Use doctest.DocTestSuite to wrap doctests into a unittest test suite. Defining load_tests in a test module lets unittest discovery pick them up:
    import unittest
    import doctest
    import math_utils

    def load_tests(loader, tests, pattern):
        tests.addTests(doctest.DocTestSuite(math_utils))
        return tests
- pytest: pytest supports doctests natively. Running pytest --doctest-modules collects and runs the doctests in your project's Python modules.
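To see what the unittest integration produces, the sketch below builds a suite with doctest.DocTestSuite and runs it through a standard unittest runner. As an assumption made only to keep the example self-contained, math_utils is created in memory with types.ModuleType rather than imported from a file.

```python
import doctest
import io
import types
import unittest

# Hypothetical stand-in for math_utils.py, built in memory for this sketch.
SOURCE = '''
def factorial(n):
    """
    >>> factorial(5)
    120
    """
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result
'''
math_utils = types.ModuleType("math_utils")
exec(SOURCE, math_utils.__dict__)

# Each docstring that contains examples becomes one unittest test case.
suite = doctest.DocTestSuite(math_utils)

# Run the suite like any other unittest suite, capturing the report text.
outcome = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(outcome.testsRun, outcome.wasSuccessful())
```

Note the granularity: unittest reports one test per docstring, not one per >>> example.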
Best Practices for Using doctest
- Keep Examples Simple: Doctests are for documentation first. Use short, illustrative examples, not complex logic (save that for unit tests).
- Avoid Overusing Doctests: Don’t cram every test case into docstrings. Use doctests for critical examples and unit tests for exhaustive testing.
- Run Doctests Regularly: Include doctests in your CI/CD pipeline to catch regressions early.
- Document Non-Obvious Behavior: Add comments to explain why an example behaves a certain way. Put the comment on the input line; a comment placed after the expected output would be treated as part of that output and break the test:
    >>> factorial(0)  # By mathematical definition, 0! = 1
    1
Limitations of doctest
doctest is powerful but not perfect. Be aware of these drawbacks:
- Sensitive to Output Formatting: Changes in output (e.g., library updates altering __str__ methods) can break tests even if functionality is correct.
- Not for Complex Tests: Poorly suited for tests requiring setup/teardown, mocking, or complex assertions.
- Bloated Docstrings: Too many examples can make docstrings hard to read.
Conclusion
doctest is a unique tool that unites documentation and testing, ensuring your code examples are always accurate. By embedding simple, illustrative tests in docstrings, you keep documentation helpful and trustworthy. Use it alongside unit tests (e.g., unittest, pytest) for a balanced testing strategy: doctests for clarity, unit tests for depth.
Start small—add a few doctests to your most-used functions—and watch your documentation become more reliable than ever.