
You already know that comments are useful for explaining the inner logic of your code and clarifying implicit steps. Comments are intended for other people working on your program. However, there are also people who will only use your program: they don't need to read the code and understand all the implementation details, but they do need to learn how to use a certain function, module, and so on. To briefly describe what an object does and how to use it, you can provide a general description in a docstring, the main unit of documentation in Python.

In this topic, we will take a look at what is considered documentation in Python. Moreover, docstrings have several Python Enhancement Proposals dedicated to them, so we will also sum up the conventions of PEP 257 and PEP 287.

What is a docstring?

A docstring (documentation string) is a string literal written as the first statement in the definition of a module, a class, a method, or a function. It briefly describes the object's behavior and how to use it, for example, which parameters you should pass to a function.

Let's take an example right away! Python has a built-in statistics module. It contains several functions for statistical calculations, and all of them are described in the docs. Below is the source code of the median() function; everything within """ """ (triple-double quotes) is the docstring describing the function.

def median(data):
    """Return the median (middle value) of numeric data.

    When the number of data points is odd, return the middle data point.
    When the number of data points is even, the median is interpolated by
    taking the average of the two middle values:

    >>> median([1, 3, 5])
    3
    >>> median([1, 3, 5, 7])
    4.0

    """
    data = sorted(data)
    n = len(data)
    if n == 0:
        raise StatisticsError("no median for empty data")
    if n%2 == 1:
        return data[n//2]
    else:
        i = n//2
        return (data[i - 1] + data[i])/2

This docstring describes what the function does and how it treats the values passed to it. Note that triple-double quotes are the conventional way to delimit a docstring in Python, and that the summary should be a phrase ending with a period, as recommended by PEP 257; by convention, it also starts with a capital letter. What is more, PEP 8 recommends limiting each line of a docstring, just like a comment, to 72 characters.

We can access docstrings without reading the source code: for example, by using the __doc__ attribute:

import statistics

print(statistics.median.__doc__)
# Return the median (middle value) of numeric data.
#
#    When the number of data points is odd, return the middle data point.
#    When the number of data points is even, the median is interpolated by
#    taking the average of the two middle values:
#
#    >>> median([1, 3, 5])
#    3
#    >>> median([1, 3, 5, 7])
#    4.0

Alternatively, we can call the help() function on the object:

help(statistics.median)
# Help on function median in module statistics:
# 
# median(data)
#    Return the median (middle value) of numeric data.
#    
#    When the number of data points is odd, return the middle data point.
#    When the number of data points is even, the median is interpolated by
#    taking the average of the two middle values:
#    
#    >>> median([1, 3, 5])
#    3
#    >>> median([1, 3, 5, 7])
#    4.0

The help() function also lets you access the docstring of an object without importing it yourself. To do so, pass its name in quotes, for example, help('statistics.median').
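As a small complement to the two approaches above, the standard inspect module offers getdoc(), which returns the docstring with the common indentation and the surrounding blank lines removed. The sketch below compares it with the raw __doc__ attribute; the variable names raw_doc, clean_doc, and summary are chosen only for illustration.

```python
import inspect
import statistics

# Raw attribute: depending on the Python version, it may keep
# the indentation that the docstring has in the source file.
raw_doc = statistics.median.__doc__

# inspect.getdoc() strips the common indentation and the blank
# lines at the beginning and at the end of the docstring.
clean_doc = inspect.getdoc(statistics.median)

# The first line of the cleaned docstring is the one-line summary.
summary = clean_doc.splitlines()[0]
print(summary)
```

This can be handy when you want to show a docstring to a user or process it in your own code without worrying about how it was indented in the source.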

Now let's move further and learn more about docstrings.

Types of docstrings

The two main types of documentation strings in Python are one-liners and multi-liners. In the previous example, the docstring for the median() function is a multi-line docstring. However, the very first line of it can be regarded as a one-liner itself:

# Return the median (middle value) of numeric data.

A one-line docstring is a quick summary of your object, the shortest possible description. Ideally, it is easy to understand for anyone who uses a part of your program for the first time. Generally, it is better to provide a multi-line description, but in obvious cases that don't require further explanation, a one-liner is acceptable.

Naturally, multi-line docstrings contain a more detailed description. For example, the median() docstring includes a summary of the function, its behavior for odd and even numbers of data points, and two usage examples. The structure of a multi-line docstring can be summed up as:

  • A brief one-line description of the object's purpose;

  • A more elaborate explanation of the functionality, for instance, a list of classes a module has or usage examples.
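Usage examples written in the interactive-session style (lines starting with >>>, as in the median() docstring) can also be executed with the standard doctest module. The following is a minimal sketch, not the only way to do it: the factorial function and its two examples are made up for illustration, and the docstring is handed to doctest's parser and runner directly.

```python
import doctest


def count_factorial(num):
    """Return the factorial of the number.

    >>> count_factorial(0)
    1
    >>> count_factorial(5)
    120
    """
    if num == 0:
        return 1
    return num * count_factorial(num - 1)


# Turn the docstring into a DocTest object; the dictionary provides
# the names that the examples are allowed to use.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    count_factorial.__doc__,
    {"count_factorial": count_factorial},
    "count_factorial",
    None,
    0,
)

# Run the examples: the result holds the number of failed
# and attempted examples.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

If an example in the docstring stopped matching the real behavior of the function, the runner would report it as a failure, which helps to keep the documentation and the code in sync.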

Now, let's try to write a docstring!

One-line docstrings for functions and methods

First, let's create a small example with a one-line docstring. Below, we declare the count_factorial() function and specify what it does in triple-double quotes right after the declaration:

def count_factorial(num):
    """Return the factorial of the number."""
    if num == 0:
        return 1
    else:
        return num * count_factorial(num - 1)

Under PEP 257, you should follow these conventions for one-line docstrings of functions and methods:

  • The opening and the closing quotes should be on the same line.

  • There should be no blank lines either before or after the docstring.

  • The description should be phrased as a command, which is why we use wordings like """Return the factorial.""" or """Return the number.""" instead of """Returns the number.""" or """It returns the number.""".

  • The description should not be a signature that merely repeats the object's parameters and return value, like """count_factorial(num) -> int.""".
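Some of these conventions are mechanical enough to check in code. The sketch below is only an illustration: check_one_liner() is a hypothetical helper written for this topic, not a part of Python or of PEP 257, and it cannot judge whether the wording is imperative.

```python
def check_one_liner(doc):
    """Return a list of convention problems found in a one-line docstring."""
    if doc is None:
        return ["missing docstring"]
    problems = []
    if "\n" in doc:
        problems.append("spans more than one line")
    if doc != doc.strip():
        problems.append("has leading or trailing whitespace")
    if not doc.rstrip().endswith("."):
        problems.append("does not end with a period")
    if doc.strip() and not doc.strip()[0].isupper():
        problems.append("does not start with a capital letter")
    if "->" in doc:
        problems.append("looks like a signature")
    return problems


print(check_one_liner("Return the factorial of the number."))
print(check_one_liner("returns the number"))
print(check_one_liner("count_factorial(num) -> int."))
```

An empty list means that none of the checked problems were found; it does not prove that the docstring is well written.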

As in the example with median(), we can access the docstring via __doc__.

print(count_factorial.__doc__)
# Return the factorial of the number.

If your docstring contains backslashes, you should make it a raw string by adding the r prefix, as in r"""A \new example with \triple double-quotes.""". Otherwise, a backslash followed by certain letters, such as \n or \t, will be interpreted as an escape sequence.
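To see the difference, here is a short sketch with two functions that differ only in the r prefix. The function names plain() and raw() are invented for this illustration.

```python
def plain():
    """Split the text on \t characters."""


def raw():
    r"""Split the text on \t characters."""


# Without the prefix, \t is turned into a real tab character.
print(repr(plain.__doc__))

# With the prefix, the backslash and the letter t are kept as written.
print(repr(raw.__doc__))
```

Printing with repr() makes the difference visible: the first docstring contains a tab, while the second one contains a backslash followed by the letter t.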

Multi-line docstrings for functions and methods

Now, let's create a bit more elaborate description of the function's behavior. Following the general structure above, we start by leaving the first line unchanged: it continues to be the main summary of our function. Next, the multi-line docstring for a method or a function should include the information about the arguments, return values, and other points concerning it. In the example below, we indicate the right argument type for num, what value the function returns, and what this return value denotes:

def count_factorial(num):
    """Return the factorial of the number.

    Arguments:
    num -- an integer to count the factorial of.
    Return values:
    The integer factorial of the number.
    """
    if num == 0:
        return 1
    else:
        return num * count_factorial(num - 1)

As for style conventions, it is worth noting the following three things:

  1. The summary is separated from the detailed description using a single blank line.

  2. The docstring in the example starts right after the opening triple-double quotes. However, it is also possible to start the text on the line that follows the opening quotes:

    def count_factorial(num):
        """
        Return the factorial of the number.

        The rest of the docstring.
        """

  3. The detailed description is aligned with the opening quotes of the docstring; it gets no additional indent.

Classes and modules

Now, we will turn to class and module docstrings. PEP 257 proposes the following conventions:

  • Module docstrings should also provide a brief one-line description. After that, it is recommended to specify all classes, methods, functions, or any other of the module's objects.

  • In class docstrings, apart from the general purpose of the class, you should indicate the information about methods, instance variables, attributes, and so forth. Nevertheless, all these individual objects should still have their own docstrings, with more thorough information given.

To practice this, let's create an example that covers both cases. Below, we sketch the docstrings of the information.py module and the Person class.

# information.py module
"""The functionality for manipulating the user-related information."""


class Person:
    """The creation of the Person object and the related functionality."""

    def __init__(self, name, surname, birthdate):
        """The initializer for the class.

        Arguments:
        name -- a string representing the person's name.
        surname -- a string representing the person's surname.
        birthdate -- a string representing the person's birthdate.
        """
        self.name = name
        self.surname = surname
        self.birthdate = birthdate

    def calculate_age(self):
        """Return the current age of the person."""
        # the body of the method

First of all, note that the class constructor should be documented in the __init__ method. Also, according to PEP 257, we should insert a blank line after a class docstring to separate the class documentation and the first method.

In the given example, we don't give a comprehensive description of the module and the class, even though PEP 257 recommends it. The recommendation exists so that the docstring of an object alone tells the reader what the object contains. Why don't we follow it here? Imagine that you list in the module docstring every object that information.py contains, and then list in the class docstring every object that the Person class contains. Your documentation may become redundant: for example, you would describe what the calculate_age() method does three times: in the module docstring, in the class docstring, and, finally, in the method's own docstring. Besides, when you call help() or a documentation-generating tool on a class, you are likely to get an outline of all its methods even if the class docstring doesn't list them.

The help() function

You may have come across the help() function before: it is used to access the documentation of an object. If you call it without any arguments, it starts an interactive help utility. To get the documentation of a particular module, class, method, and so on, pass it as the argument. In the example below, we request the documentation for the Person class. The header reads "module __main__" because here the class is defined in the script that is being run; if you imported it from information.py, the header would name the information module instead.

help(Person)
# Help on class Person in module __main__:
#
# class Person(builtins.object)
#  |  The creation of the Person object and the related functionality.
#  |  
#  |  Methods defined here:
#  |  
#  |  __init__(self, name, surname, birthdate)
#  |      The initializer for the class.
#  |      
#  |      Arguments:
#  |      name -- a string representing the person's name.
#  |      surname -- a string representing the person's surname.
#  |      birthdate -- a string representing the person's birthdate.
#  |  
#  |  calculate_age(self)
#  |      Return the current age of the person.

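As you can see, we obtain the docstrings not only for the class but also for all of its methods. The real output also contains a section on data descriptors such as __dict__ and __weakref__, which is omitted here.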
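A similar outline can be collected in code, which is roughly what documentation tools do. The sketch below assumes a shortened version of the Person class; summarize() is a hypothetical helper that uses the standard inspect module to gather the first docstring line of every method defined in a class.

```python
import inspect


class Person:
    """The creation of the Person object and the related functionality."""

    def __init__(self, name, surname, birthdate):
        """The initializer for the class."""
        self.name = name
        self.surname = surname
        self.birthdate = birthdate

    def calculate_age(self):
        """Return the current age of the person."""


def summarize(cls):
    """Return a mapping of method names to their one-line summaries."""
    summary = {}
    # getmembers() with isfunction yields the functions defined in the class.
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        doc = inspect.getdoc(member)
        if doc:
            summary[name] = doc.splitlines()[0]
    return summary


for name, line in summarize(Person).items():
    print(f"{name}: {line}")
```

Because the summaries are taken from the methods themselves, the class docstring does not have to repeat them.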

Summary

In this topic, we covered recommendations concerning the style of writing docstrings with respect to PEP 257. Although this is the prevalent style of designing docstrings, other style guides also exist. For example, you can check out Google-format docstrings.

Generally, docstrings can be used to describe the behavior of modules, classes, functions, and so forth. Python uses triple-double quotes """ """ for them. The two main types of docstrings are one-liners and multi-liners.

You can find the object's docstring in its source code. To access it without looking through the source code, use the help() function or the __doc__ attribute.

Now, let's practice!

Read more on this topic in Commenting the Right Way in Python Code on Hyperskill Blog.
