Writing example-based tests

Note: Since we want to discover errors using unit tests, let us assume that we have not yet discussed the edge cases of the multiplication and division routines we have written.

Although Python has the built-in module unittest, another framework for unit tests, pytest, is easier to use and offers more functionality. Therefore, we will stick to pytest in this class. The ideas presented, however, can be used with any testing framework.

When working in a Jupyter notebook, the module ipytest can be very handy. It can be installed with

conda install -c conda-forge ipytest

Using conda, we can install pytest by executing

conda install -c anaconda pytest

in the console.

Suppose we have written the mul function in the file multiplication.py and the div function in the file division.py. We can then create the file test_mul_div_expl.py in the same directory and import both functions as:

from multiplication import mul
from division import div

Unit test with examples

We choose one example for each function and write

def test_mul_example():
    assert mul(3, 8) == 24

def test_div_example():
    assert div(17, 3) == 5

where we call each function on the selected example and compare the output with the expected outcome.
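As a small illustration of what these tests rely on: a plain assert does nothing when its condition holds and raises AssertionError otherwise, and that exception is what pytest catches and reports. The sketch below uses a hypothetical stand-in for mul (not the implementation from multiplication.py) and a hypothetical helper check, so that it runs on its own.

```python
# Hypothetical stand-in so the snippet is self-contained;
# the real mul lives in multiplication.py.
def mul(a, b):
    return a * b


def check(condition):
    """Return True if the assert passes, False if it raises AssertionError."""
    try:
        assert condition
    except AssertionError:
        return False
    return True


print(check(mul(3, 8) == 24))  # the assertion holds
print(check(mul(3, 8) == 25))  # the assertion fails
```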

After saving and exiting the document, we can execute

pytest

in the console. pytest will then find every .py file in the current directory (and its subdirectories) whose name begins with test_ or ends with _test, and execute every function inside whose name begins with test. If we only want to execute the test functions from one specific file, say test_mul_div_expl.py, we should call

pytest test_mul_div_expl.py

If any assert statement raises an exception, pytest will inform us about it. In this case, both tests pass and we should see

================================ 2 passed in 0.10s ================================

although the time may differ. It is good to see that the tests passed, but just because something works on one example does not mean it always works. One way to be more confident is to go through more examples. Instead of writing the same function for every example, we can use the function decorator @pytest.mark.parametrize provided by pytest.

Unit test with parametrized examples

We can use the decorator by importing pytest and writing

import pytest

@pytest.mark.parametrize(
    'a, b, expected', 
    [(3, 8, 24), (7, 4, 28), (14, 11, 154), (8, 53, 424)],
)
def test_mul_param_example(a, b, expected):
    assert mul(a, b) == expected

@pytest.mark.parametrize(
    'a, b, expected', 
    [(17, 3, 5), (21, 7, 3), (31, 2, 15), (6, 12, 0)],
)
def test_div_param_example(a, b, expected):
    assert div(a, b) == expected

The decorator @pytest.mark.parametrize feeds the test function with values and makes testing with multiple examples easy. It becomes tedious, however, if we want to try even more examples.
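Conceptually, the decorator calls the test function once per tuple. The following plain-Python sketch mimics that with a loop. It is only an approximation: pytest additionally reports every case as its own test. The names check_mul and run_cases are hypothetical helpers, and mul is again a stand-in for the function in multiplication.py.

```python
def mul(a, b):  # hypothetical stand-in for multiplication.mul
    return a * b


CASES = [(3, 8, 24), (7, 4, 28), (14, 11, 154), (8, 53, 424)]


def check_mul(a, b, expected):
    assert mul(a, b) == expected


def run_cases(test_function, cases):
    """Call test_function once per case and collect the cases whose assert failed."""
    failed = []
    for case in cases:
        try:
            test_function(*case)
        except AssertionError:
            failed.append(case)  # keep going, as pytest does with parametrized cases
    return failed


print(run_cases(check_mul, CASES))  # list of failing cases
```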

Unit test with random examples

By going through a large number of randomly generated examples, we may uncover rarely occurring errors. This method is not always applicable, since the expected output must somehow be available. In this case, however, we can simply use Python's built-in * and // operators to verify our own functions.

The following listing shows tests with 50 random examples each:

from random import randrange

N = 50

def test_mul_random_example():
    for _ in range(0, N):
        a = randrange(1_000)
        b = randrange(1_000)
        assert mul(a, b) == a * b

def test_div_random_example():
    for _ in range(0, N):
        a = randrange(1_000)
        b = randrange(1_000)
        assert div(a, b) == a // b

Running pytest will most likely give us 2 passes. To be more confident, we can increase the number of loops to, say, 700. Now, calling pytest several times, we might get something like

========================================= short test summary info ==========================================
FAILED test_mul_div.py::test_div_random_example -  ZeroDivisionError: integer division or modulo by zero
======================================= 1 failed, 1 passed in 0.20s =======================================

This tells us that a ZeroDivisionError exception occurred while running the test_div_random_example function. More information can be seen above the summary, and it should look like

    def test_div_random_example():
        for _ in range(0, N):
            a = randrange(1_000)
            b = randrange(1_000)
>           assert div(a, b) == a // b
E          ZeroDivisionError: integer division or modulo by zero

The arrow in the second-to-last line marks the code where the exception occurred. In this case, we have provided the floor division operator // with a zero on the right-hand side. We thus know that we should modify our own implementation to handle this case.
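One possible way to act on this, sketched under assumptions: suppose we decide that div should itself raise ZeroDivisionError when b is zero (this is only one option; returning a special value would be another). The random test then draws b starting from 1, and a separate test covers the zero divisor. The div below is a hypothetical repeated-subtraction stand-in, not the implementation from division.py.

```python
from random import randrange

N = 700


def div(a, b):
    """Hypothetical stand-in: floor division of non-negative a by b."""
    if b == 0:
        raise ZeroDivisionError("div: divisor must not be zero")
    quotient = 0
    while a >= b:
        a -= b
        quotient += 1
    return quotient


def test_div_random_example():
    for _ in range(N):
        a = randrange(1_000)
        b = randrange(1, 1_000)  # start at 1 so the reference a // b is defined
        assert div(a, b) == a // b


def test_div_by_zero():
    # The zero divisor gets its own test.
    try:
        div(17, 0)
    except ZeroDivisionError:
        return
    raise AssertionError("div(17, 0) did not raise ZeroDivisionError")
```

With pytest, the second test is usually written with the pytest.raises(ZeroDivisionError) context manager; the try/except form is used here only so that the sketch also runs without pytest installed.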

We have found the error without knowing the detailed implementation of the functions. This is desirable, since humans tend to overlook things when analyzing code, and some special cases might not be covered by testing with just a few examples. With 700 loops the test still passes about 50 % of the time, but if we increase the number of loops to several thousand or more, the test is almost guaranteed to fail and inform us about deficiencies in our functions.
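The 50 % figure can be checked with a short calculation, assuming b is drawn uniformly from the 1000 values 0 to 999, as randrange(1_000) does. A single draw avoids zero with probability 999/1000, so N independent draws all avoid it with probability (999/1000)^N. The function name below is a hypothetical helper.

```python
def prob_zero_divisor_hit(n_draws, n_values=1_000):
    """Probability that at least one of n_draws uniform draws from n_values values is 0."""
    return 1 - ((n_values - 1) / n_values) ** n_draws


for n in (50, 700, 5_000):
    print(f"N = {n:>5}: probability of hitting b == 0 is about {prob_zero_divisor_hit(n):.3f}")
```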

A reference method is available not only in our toy example but also in realistic cases. A common situation is a method that is intuitive, easy to implement and less error-prone, but has a long runtime. A more complicated implementation that runs faster can then be tested against this reference method. In our case, we could use naive_mul and naive_div as reference methods for mul and div, respectively.
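A minimal sketch of this idea follows. Both implementations are assumptions made for illustration, and the actual naive_mul and mul from this class may look different: naive_mul multiplies by repeated addition, which is easy to check by reading but needs b additions, while the hypothetical fast_mul uses shift-and-add and needs only one step per binary digit of b.

```python
from random import randrange


def naive_mul(a, b):
    """Reference: add a to itself b times (b must be non-negative)."""
    result = 0
    for _ in range(b):
        result += a
    return result


def fast_mul(a, b):
    """Hypothetical faster version: shift-and-add over the bits of b."""
    result = 0
    while b > 0:
        if b & 1:        # lowest bit of b is set: this power-of-two multiple of a counts
            result += a
        a <<= 1          # double a
        b >>= 1          # move on to the next bit of b
    return result


def test_fast_mul_against_reference():
    for _ in range(200):
        a = randrange(1_000)
        b = randrange(1_000)
        assert fast_mul(a, b) == naive_mul(a, b)
```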

But what if we really do not have a reference method to produce a large number of expected outputs? So-called property-based testing could help us in this case.