Use Gherkin to explicitly define requirements and test cases

Gherkin is a great tool for defining requirements: it forces the requester to state expectations and acceptance criteria explicitly. It also makes it clear how many test cases there are.

If you define requirements in plain English, a lot is left to interpretation, and complex requirements seem simple and effortless to implement. Let’s take a look at this requirement:

Provide a way to catch exceptions and log them with tags. Implement a decorator function, e.g. log_on_exception, to provide the feature.

Simple. Right?

Oh! We forgot to mention that we need the messages logged as UTF-8 encoded strings. You implicitly knew that, right?
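This is exactly the kind of detail a plain-English requirement leaves implicit. A tiny sketch of what the rule means in practice (the byte values are simply the UTF-8 encoding of a sample message used later in this article):

```python
# The implicit expectation, spelled out: the log API receives UTF-8
# encoded bytes, not a unicode string.
message = u"Gewährleistung"
encoded = message.encode("utf-8")

print(encoded)  # b'Gew\xc3\xa4hrleistung'
```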

Estimate effort

You need to provide 100% test coverage: not only execute each line of code, but also think of edge cases and implement tests for them.

Take a minute to think about the requirements and put them together in your mind before you read further.

Again, easy, right? We have access to the Product Owner, we ask questions, and we can quickly put together some test cases in our minds. We’ll forget about them in a minute, and other team members will come up with different sets of test cases. But that’s what planning poker is for, to account for different understandings of the issue, right? We can always make notes (no one does, but we could if we needed to, and we don’t need to). We’ll come up with the test cases again during the implementation of the issue. We’re agile.

Note the number of test cases you came up with. Write down how many test methods you will need to implement. Humor me; write down the number so you can compare it with what I implemented in the example below. How many assertions will there be in total?

Shift right

Let’s imagine that developers are overwhelmed with requests, always in a hurry, and want to push the implementation to the right as quickly as possible with minimum effort, the same way product people do: hey, it’s simple, let’s not waste time on descriptions.

This haste will backfire. More often than not you will end up with an incorrect implementation in production and will have to circle back. In the best case you will have something that does the job poorly, and you will declare it tech debt to fix later, meaning never.

You saved some time on the requirement definition. How does that compare to the money the company lost?

Shift left

Now let’s imagine that the Product Owner, with the help of a QA professional, did the thought process up front: defined clear acceptance criteria and wrote down all the edge cases they discovered. Provide explicit requirements that we can expand during the grooming session, and imagine that you write down all the things the developers come up with. Let’s not leave them to perishable memory, and let’s not depend on experience and on the developers’ ability to test the feature properly.

It’s never cheaper to fix defects later in the process than it is to fix them now. If you fix them in the backlog by providing a better requirement statement, that’s the cheapest it can get.

Let’s try to define the same requirements with Gherkin.

Feature: log_on_exception decorator
  Write a decorator named "log_on_exception" which takes a sequence of tags
  and logs a message in case an exception happens within the decorated
  function. Log the messages with the existing API:

  def log(utf8_string, tags):

  ??? Assumption to verify: the decorator will not handle
  the exception but will log it and raise it again; the only side effect will
  be the log call.

  Support Python 2.7 

  GIVEN fn with <params>
  GIVEN fn is decorated with log_on_exception(<tags>)

  Scenario Outline: Function raising exception
  GIVEN fn raises <exception> with <msg>
  WHEN fn is called with <arguments>
  THEN log is called with <utf8_string> and list of <tags>
  THEN <exception> with <msg> is raised

    Examples:
      | params   | arguments | tags           | exception  | msg              | utf8_string             |
      | one, two | 1, two=2  | "tag1", "tag2" | ValueError | "foo"            | "foo"                   |
      |          |           | "whatever"     | Exception  | "Gewährleistung" | "Gew\xc3\xa4hrleistung" |

  Scenario Outline: Function without exception
  GIVEN fn returns <result>
  WHEN fn is called with <arguments>
  THEN fn returns <result>
  BUT log is NOT called

    Examples:
      | params   | arguments | tags            | result |
      |          |           | "tag1", "tag2"  | None   |
      | one, two | 1, two=2  | "tag13", "tag4" | "foo"  |

This already looks more complicated than the initial one-liner requirement.

We have examples and some test cases provided already. Can you come up with more? Do you want to update the number of test methods you will need to implement? Write it down. Did the number of assertions increase?
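With the tables in place, the counting is mechanical: every Examples row is one concrete test case. Sketching the rows as plain data (columns abbreviated; names purely illustrative):

```python
# Each Examples row of a Scenario Outline is one concrete test case.
raising_rows = [
    # (exception, msg) columns of "Function raising exception"
    ("ValueError", u"foo"),
    ("Exception", u"Gewährleistung"),
]
ok_rows = [
    # (result,) column of "Function without exception"
    (None,),
    (u"foo",),
]

scenario_cases = len(raising_rows) + len(ok_rows)
print(scenario_cases)  # 4
```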

The implementation

Now let’s see the implementation. Again, think about and note the number of test cases and assertions you believe you need for 100% coverage.

# -*- coding: utf-8 -*-
import functools

def log(utf8_string, tags):
    """Existing logging API; implementation not shown here."""

def log_on_exception(*tags):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as ex:
                try:
                    msg = str(ex)
                except UnicodeError:
                    msg = unicode(ex).encode("UTF-8")
                log(msg, tags)
                raise
        return wrapper

    return decorator
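The snippet above is Python 2.7 code (`unicode` does not exist in Python 3). For readers on Python 3, here is a rough equivalent together with a usage example; the recording `log` stub is mine, standing in for the real logging API:

```python
import functools

def log(utf8_string, tags):
    # Stand-in for the real logging API: record calls so we can inspect them.
    log.calls.append((utf8_string, tags))
log.calls = []

def log_on_exception(*tags):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as ex:
                # str(ex) is already text in Python 3; encode for the API.
                log(str(ex).encode("utf-8"), tags)
                raise  # re-raise: logging is the only side effect
        return wrapper
    return decorator

@log_on_exception("tag1", "tag2")
def boom():
    raise ValueError(u"Gewährleistung")

try:
    boom()
except ValueError:
    pass  # the exception still propagates to the caller

print(log.calls)  # [(b'Gew\xc3\xa4hrleistung', ('tag1', 'tag2'))]
```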

The tests

Bear in mind that I want gradual tests that each test one thing. These tests take the Arrange-Act-Assert approach (or Given-When-Then), and Act is sometimes implemented in a fixture because the Act phase is shared by multiple test cases.

This approach allows me to think about testing one aspect of the feature without concerning myself with the others. Precision and focus. Removing or adding test cases will not affect the other test cases.

# -*- coding: utf-8 -*-
import decorators
import pytest
from decorators import log_on_exception

class TestLogOnExceptionDecorator:
    """
    The test suite below follows the Arrange-Act-Assert pattern, where
    the actual execution of the method under test is done in fixtures:
    result_ok and result_err.
    The test methods have just one and only one assert to make the test cases
    very specific and their fixture dependencies short.
    There is a TestLogOnExceptionDecoratorHardToMaintain suite for contrast.
    """

    @pytest.fixture
    def tags(self, faker):
        return (faker.word(), faker.word())

    @pytest.fixture
    def msg(self, faker):
        return faker.sentence()

    @pytest.fixture
    def exception(self, msg):
        return Exception(msg)

    @pytest.fixture
    def fn(self, mocker):
        fn = mocker.stub()
        fn.__name__ = "fn"
        fn.__qualname__ = "foo.fn"
        fn.__annotations__ = {}
        return fn

    @pytest.fixture
    def fn_err(self, fn, exception):
        fn.side_effect = exception
        return fn

    @pytest.fixture
    def dummy(self, faker):
        return faker.word()

    @pytest.fixture
    def fn_ok(self, fn, dummy):
        fn.return_value = dummy
        return fn

    @pytest.fixture
    def log(self, mocker):
        return mocker.spy(decorators, "log")

    @pytest.fixture
    def args(self):
        return ()

    @pytest.fixture
    def kwargs(self):
        return {}

    @pytest.fixture
    def result_ok(self, tags, fn_ok, args, kwargs):
        decorated = log_on_exception(*tags)(fn_ok)
        return decorated(*args, **kwargs)

    @pytest.fixture
    def result_err(self, tags, fn_err, args, kwargs, exception, msg):
        decorated = log_on_exception(*tags)(fn_err)
        with pytest.raises(Exception) as excinfo:
            decorated(*args, **kwargs)
        return excinfo.value

    def test_decorated_function_returning_result(self, result_ok, dummy):
        assert result_ok is dummy

    # result_ok performs the Act phase (calls the decorated function)
    @pytest.mark.usefixtures("result_ok")
    @pytest.mark.parametrize(
        argnames="args, kwargs",
        argvalues=[
            ([], {}),
            ([1], dict(second=2)),
        ],
    )
    def test_decorated_function_is_called_with_params(self, fn_ok, args, kwargs):
        fn_ok.assert_called_once_with(*args, **kwargs)

    # order of fixtures is important here, we need to spy the log method first
    @pytest.mark.usefixtures("log", "result_ok")
    def test_log_is_not_called_without_exception(self, log):
        log.assert_not_called()

    @pytest.mark.usefixtures("log", "result_err")
    def test_log_is_called_on_exception(self, log, msg, tags):
        log.assert_called_once_with(msg, tags)

    @pytest.mark.usefixtures("log", "result_err")
    @pytest.mark.parametrize(
        argnames="msg, utf8_encoded",
        argvalues=[
            (u"Żażółć gęślą jaźń", b"\xc5\xbb\x61\xc5\xbc\xc3\xb3\xc5\x82\xc4\x87\x20\x67\xc4\x99\xc5\x9b\x6c\xc4\x85\x20\x6a\x61\xc5\xba\xc5\x84"),
            (u"Gewährleistung", b"\x47\x65\x77\xc3\xa4\x68\x72\x6c\x65\x69\x73\x74\x75\x6e\x67"),
            (b"Plain text", b"Plain text"),
        ],
    )
    def test_log_is_called_encoded_str(self, log, utf8_encoded):
        assert log.call_args.args[0] == utf8_encoded

    def test_exception_is_reraised(self, result_err, exception):
        assert result_err is exception

class TestLogOnExceptionDecoratorHardToMaintain(TestLogOnExceptionDecorator):
    """
    This test suite has only 2 test methods that are only seemingly simpler
    than the base suite's: they assert multiple things, which makes them
    dependent on multiple fixtures and prone to errors when the set of test
    cases changes.
    """

    @pytest.mark.parametrize(
        argnames="args, kwargs",
        argvalues=[
            ([], {}),
            ([1], dict(second=2)),
        ],
    )
    def test_decorate_function_returning_result(self, tags, fn_ok, dummy, args, kwargs, log, msg):
        fn = log_on_exception(*tags)(fn_ok)
        assert fn(*args, **kwargs) is dummy
        fn_ok.assert_called_once_with(*args, **kwargs)
        log.assert_not_called()

    @pytest.mark.parametrize(
        argnames="args, kwargs",
        argvalues=[
            ([], {}),
            ([1], dict(second=2)),
        ],
    )
    def test_decorate_function_raising_exception(
        self, tags, fn_err, exception, msg, args, kwargs, log
    ):
        fn = log_on_exception(*tags)(fn_err)
        with pytest.raises(type(exception), match=msg):
            fn(*args, **kwargs)
        fn_err.assert_called_once_with(*args, **kwargs)
        log.assert_called_once_with(msg.encode("UTF-8"), tags)

Just 1 decorator – 8 test cases

If I count them right, there are at least 8 test cases for this really simple requirement.
