py4u guide

Using Python’s doctest to Validate Your Documentation

Documentation is the backbone of usable software. It guides users, explains functionality, and ensures consistency—but what happens when your code evolves and your documentation doesn’t? Outdated examples, incorrect outputs, or broken code snippets can frustrate users and erode trust. Enter **doctest**: a built-in Python module that bridges the gap between documentation and testing by turning code examples in your docstrings into executable tests. In this blog, we’ll explore how doctest works, how to write effective doctests, advanced features, best practices, and its limitations. By the end, you’ll be equipped to use doctest to keep your documentation accurate, your examples valid, and your users happy.

Table of Contents

  1. Introduction to doctest
  2. How doctest Works
  3. Getting Started with doctest
  4. Writing Effective Doctests
  5. Advanced doctest Features
  6. Best Practices for Using doctest
  7. Limitations of doctest
  8. Conclusion
  9. References

Introduction to doctest

At its core, doctest is a testing framework that extracts code examples from docstrings (Python’s built-in documentation strings) and executes them to verify their correctness. These examples mimic interactive Python sessions (like those in the Python shell), making them both human-readable (for documentation) and machine-executable (for testing).

Why Use doctest?

  • Documentation Validation: Ensures code examples in docstrings are always up-to-date with the actual code behavior.
  • Simplicity: No need to learn a new testing syntax—examples look like standard Python shell input/output.
  • Integration: Seamlessly embedded in your code, so documentation and tests live together.

How doctest Works

doctest operates in three key steps:

  1. Extraction: It scans docstrings (in modules, functions, classes, or methods) for lines starting with >>> (the Python prompt) and ... (for multi-line inputs). These lines are treated as executable code.
  2. Execution: The extracted code is run in a simulated Python environment.
  3. Comparison: The actual output from execution is compared to the expected output (the lines following the >>>/... prompts in the docstring). If they match, the test passes; otherwise, it fails.

For example, a docstring like this:

def add(a, b):  
    """Return the sum of two numbers.  

    >>> add(2, 3)  
    5  
    >>> add(-1, 1)  
    0  
    """  
    return a + b  

doctest will execute add(2, 3) and check if the result is 5, then add(-1, 1) and check for 0.

Getting Started with doctest

Let’s walk through a hands-on example to see doctest in action.

Basic Example

Suppose we’re writing a function to calculate the factorial of a non-negative integer. We’ll include a docstring with usage examples, which doctest will validate.

def factorial(n):  
    """Calculate the factorial of a non-negative integer n.  

    Factorial of n (n!) is the product of all positive integers up to n.  
    For n=0, the factorial is 1.  

    Examples:  
    >>> factorial(5)  
    120  
    >>> factorial(0)  
    1  
    >>> factorial(10)  
    3628800  
    """  
    if not isinstance(n, int):  
        raise TypeError("n must be an integer")  
    if n < 0:  
        raise ValueError("n must be non-negative")  
    result = 1  
    for i in range(1, n + 1):  
        result *= i  
    return result  

Here, the docstring includes three examples. Doctest will treat each >>> factorial(...) line as a test case and verify the output matches the expected number.

Running Doctests

There are two common ways to run doctests:

1. Via the Command Line

Use Python’s -m doctest flag to run doctests in a module. For a file named math_utils.py containing the factorial function above:

python -m doctest -v math_utils.py  
  • -v (verbose mode) shows detailed output, including which tests passed/failed. Omit -v to see only failures.

Sample output (with -v):

Trying:  
    factorial(5)  
Expecting:  
    120  
ok  
Trying:  
    factorial(0)  
Expecting:  
    1  
ok  
Trying:  
    factorial(10)  
Expecting:  
    3628800  
ok  
1 items had no tests:  
    math_utils  
1 items passed all tests:  
   3 tests in math_utils.factorial  
3 tests in 2 items.  
3 passed and 0 failed.  
Test passed.  

2. Embedded in Code

Add a __main__ block to your module to run doctests when the script is executed directly:

if __name__ == "__main__":  
    import doctest  
    doctest.testmod()  # Runs all doctests in the module  

Now, run the script normally:

python math_utils.py -v  

This achieves the same result as the command-line approach.

Writing Effective Doctests

To get the most out of doctest, follow these guidelines for writing clear, reliable examples.

Syntax Rules

doctest examples must follow strict syntax to be recognized:

  • Input Lines: Start with >>> (the Python prompt). For multi-line inputs (e.g., loops, conditionals), use ... for continuation lines:

    def greet(name):  
        """  
        >>> def f(x):  
        ...     return x + 1  
        ...  
        >>> f(3)  
        4  
        """  
  • Output Lines: Immediately follow the input line(s) and contain the expected output. For functions returning values, this is the return value. For print() statements, it’s the printed text.

  • Exceptions: To test for exceptions, include the Traceback message or just the exception type and message. Use ... to truncate long tracebacks:

    >>> factorial("not_an_integer")  
    Traceback (most recent call last):  
        ...  
    TypeError: n must be an integer  

Handling Edge Cases

Doctests should include examples that reflect real-world usage, including edge cases:

  • Invalid inputs (e.g., non-integers for factorial).
  • Boundary values (e.g., n=0 for factorial).
  • Empty collections (e.g., sum_list([])).

Example with edge cases:

def sum_list(numbers):  
    """Sum a list of numbers.  

    >>> sum_list([1, 2, 3])  
    6  
    >>> sum_list([])  # Edge case: empty list  
    0  
    >>> sum_list([-1, 1])  
    0  
    """  
    return sum(numbers)  

Whitespace and Formatting Tips

doctest is whitespace-sensitive, so minor formatting differences can cause test failures:

  • Output Must Match Exactly: Extra/missing spaces, newlines, or punctuation will break the test.

    # Bad: Extra space in output  
    >>> greet("Alice")  
    Hello, Alice!  # Expected: "Hello, Alice!" (no trailing space)  
  • Multi-line Outputs: Preserve line breaks and indentation in expected output:

    >>> print_list([1, 2, 3])  
    [1,  
     2,  
     3]  
  • Use +ELLIPSIS for Dynamic Outputs: For outputs with non-deterministic parts (e.g., memory addresses, timestamps), use the +ELLIPSIS flag to ignore parts of the output with ...:

    >>> id("test")  # Memory address varies  
    0x...  # Fails! Use +ELLIPSIS instead:  
    # doctest: +ELLIPSIS  
    >>> id("test")  
    0x...  

Advanced doctest Features

doctest includes powerful features for handling complex scenarios.

Skipping Tests

Mark tests as skipped with the # doctest: +SKIP directive to exclude them temporarily (e.g., for known failures or platform-specific code):

>>> factorial(1000)  # doctest: +SKIP  
# This test is skipped (large computation, takes too long)  

Partial Output Matching with Ellipsis

The +ELLIPSIS option lets you use ... as a wildcard to match any substring in the output. Useful for dynamic values like timestamps or object IDs:

>>> import datetime  
>>> datetime.datetime.now()  # doctest: +ELLIPSIS  
datetime.datetime(...)  

Custom Output Checkers

For complex comparisons (e.g., floating-point precision, NumPy arrays), create a custom OutputChecker to override how doctest validates output.

Example: Tolerate floating-point errors with +REPORT_NDIFF and a custom checker:

import doctest  
import math  

class FloatChecker(doctest.OutputChecker):  
    def check_output(self, want, got, optionflags):  
        # Allow small differences in floating-point numbers  
        if want.startswith("Approx: "):  
            expected = float(want[8:])  
            got_val = float(got)  
            return math.isclose(expected, got_val, rel_tol=1e-9)  
        return super().check_output(want, got, optionflags)  

def circle_area(radius):  
    """  
    >>> circle_area(2)  # doctest: +ELLIPSIS  
    Approx: 12.566370614359172  
    """  
    return math.pi * radius **2  

if __name__ == "__main__":  
    doctest.testmod(checker=FloatChecker())  

Integrating with unittest/pytest

doctest works seamlessly with other testing frameworks:

-** unittest **: Use unittest.DocTestSuite to wrap doctests into a unittest test suite:

import unittest  
import doctest  
import math_utils  

def load_tests(loader, tests, pattern):  
    tests.addTests(doctest.DocTestSuite(math_utils))  
    return tests  

-** pytest **: pytest natively supports doctests via the pytest --doctest-modules flag, which runs all doctests in your project.

Best Practices for Using doctest

-** Keep Examples Simple : Doctests are for documentation first. Use short, illustrative examples, not complex logic (save that for unit tests).
-
Avoid Overusing Doctests : Don’t cram every test case into docstrings. Use doctests for critical examples and unit tests for exhaustive testing.
-
Run Doctests Regularly : Include doctests in your CI/CD pipeline to catch regressions early.
-
Document Non-Obvious Behavior **: Add comments to explain why an example behaves a certain way:

>>> factorial(0)  
1  # By mathematical definition, 0! = 1  

Limitations of doctest

doctest is powerful but not perfect. Be aware of these drawbacks:

-** Sensitive to Output Formatting : Changes in output (e.g., library updates altering __str__ methods) can break tests even if functionality is correct.
-
Not for Complex Tests : Poorly suited for tests requiring setup/teardown, mocking, or complex assertions.
-
Bloated Docstrings**: Too many examples can make docstrings hard to read.

Conclusion

doctest is a unique tool that unites documentation and testing, ensuring your code examples are always accurate. By embedding simple, illustrative tests in docstrings, you keep documentation helpful and trustworthy. Use it alongside unit tests (e.g., unittest, pytest) for a balanced testing strategy: doctests for clarity, unit tests for depth.

Start small—add a few doctests to your most-used functions—and watch your documentation become more reliable than ever.

References