TDM 30100: Project 7 — 2022

Motivation: Code, especially newly written code, is refactored, updated, and improved frequently, which is why testing it is imperative. Tests are a good way to check that code works as intended. When you change code, you can run a suite of tests and feel confident (or at least more confident) that your changes are not introducing new bugs. While programming methods like TDD (test-driven development) are popular in some circles and unpopular in others, it is widely agreed that writing good tests is a useful skill and a good habit to have.

Context: This is the first of a series of two projects that explore writing unit tests and doc tests. In The Data Mine, we will focus on using pytest, doc tests, and mypy while writing code to manipulate and work with data.

Scope: Python, testing, pytest, mypy, doc tests

Learning Objectives
  • Write and run unit tests using pytest.

  • Include and run doc tests in your docstrings, using pytest.

  • Gain familiarity with mypy, and explain why static type checking can be useful.

  • Comprehend what a function is, and the components of a function in Python.

Make sure to read about and use the template found here, and the important information about project submissions here.

Dataset(s)

The following questions will use the following dataset(s):

  • /anvil/projects/tdm/data/*

Questions

Question 1

In the previous project, you were given a Python module and a test module. You found bugs in the code and fixed the functions so that they passed the provided tests. That was good practice in fixing code based on tests. In this project, you will get an opportunity to try test-driven development (TDD). This is (roughly) where you write your tests first, then write your code to pass those tests.

There are some good discussions on TDD here and here.

Start by choosing 1 dataset from our data directory: /anvil/projects/tdm/data. This will be the dataset you operate on for the remainder of the project. Alternatively, you may scrape data from the web as your "data source".

In a markdown cell, describe 3 functions that you will write, and what those functions should do.

Create two files: $HOME/project07/project07.py and $HOME/project07/test_project07.py.

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 2

Expand on question (1). In $HOME/project07/project07.py, create the 3 functions and write a detailed, Google-style docstring for each. Leave the body of each function blank, with just a pass keyword. For example:

def my_super_awesome_function(some_parameter: str) -> str:
    """
    My detailed google style docstring here...
    """
    pass

  • Make sure the reader can understand what your functions do based on the docstrings.

  • Your functions should not be anything trivial like splitting a string, summing data, or anything that could be easily accomplished in a single line of code using other built-in methods. This is your chance to get creative!

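pass is a keyword that performs no operation; it is a placeholder for code you have not written yet. A function body cannot be empty, so a function with no statements at all raises a syntax error. In the example above the docstring alone would technically satisfy that rule, but pass makes it explicit that the implementation is still to come.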
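As a rough illustration of the level of detail a Google-style docstring might carry, here is a sketch for a hypothetical function. The name longest_streak, its parameters, and its behavior are assumptions made up for this example; they are not part of the project, and your own functions should fit the dataset you chose.

```python
from typing import Sequence


def longest_streak(values: Sequence[float], threshold: float) -> int:
    """Return the length of the longest run of consecutive values above a threshold.

    Args:
        values: Numeric observations in their original order, for example
            the readings from one column of a dataset.
        threshold: Cutoff that a value must strictly exceed to extend a run.

    Returns:
        The number of values in the longest unbroken run above ``threshold``,
        or 0 if ``values`` is empty or no value exceeds the threshold.

    Examples:
        >>> longest_streak([1, 5, 6, 2, 7, 8, 9], threshold=4)
        3
    """
    pass  # Body deliberately left blank until the tests are written.
```

The Examples section doubles as a doc test. While the body is still just pass, that doc test is expected to fail, which matches the TDD workflow of this project.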

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 3

Write at least 2 unit tests (using pytest) for each function. Each assert counts as a unit test.

Write an additional test for each function (again, using pytest). For this set of tests, experiment with features from pytest that you haven’t tried before! This is the official pytest documentation. Some options could be: fixtures, parametrizing, or using temporary directories and files.

In a bash cell, run your pytest tests, which should all fail.

%%bash

cd $HOME/project07
python3 -m pytest

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 4

Begin writing your functions by filling in $HOME/project07/project07.py. Record each time you re-run your tests in a new bash cell, by running the following.

%%bash

cd $HOME/project07
python3 -m pytest

Please record each re-run of your tests in a new bash cell; you could end up with 10 or more cells where you’ve run your tests. We want to see the progression as you write the functions and how the failures change as you fix your code. You don’t need to record the changes you make to project07.py, but we do want to see the results of running the tests each time you run them.

Continue to re-run tests until they all pass and your functions work as intended.
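As a sketch of where this cycle ends up, here is one possible implementation of a hypothetical function, longest_streak, together with two tests that it now satisfies. The function and tests are assumptions made up for illustration and are shown in a single block only so the example is self-contained; in the project the function belongs in project07.py and the tests in test_project07.py.

```python
from typing import Sequence


def longest_streak(values: Sequence[float], threshold: float) -> int:
    """Return the length of the longest run of consecutive values above threshold."""
    longest = 0
    current = 0
    for value in values:
        if value > threshold:
            # Extend the current run and remember it if it is the longest so far.
            current += 1
            longest = max(longest, current)
        else:
            # Any value at or below the threshold breaks the run.
            current = 0
    return longest


def test_longest_streak_counts_consecutive_values():
    assert longest_streak([1, 5, 6, 2, 7, 8, 9], threshold=4) == 3


def test_longest_streak_is_zero_when_nothing_exceeds_threshold():
    assert longest_streak([1, 2, 3], threshold=10) == 0
```

In practice you would rarely get from a stub to a fully passing function in one step; each intermediate pytest run, with fewer failures than the last, is the progression this question asks you to record.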

Items to submit
  • Code used to solve this problem.

  • Output from running the code.

Question 5

This is perhaps a very different style of writing code for you, depending on your past experiences! After giving it a try, how do you like TDD? If you haven’t yet, read some articles online about TDD. Do you think you should always use TDD? What are your opinions? Write a few sentences about your thoughts and where you stand on TDD.

At the end of this project you should submit the following.

  • A .ipynb file with your results from running your tests initially in question (3), and repeatedly, until they pass, in question (4).

  • Your project07.py file with your passing functions, and beautiful docstrings.

  • Your test_project07.py file with at least 9 total (passing) tests, 3 of which should explore previously mentioned "new" features of pytest.

Items to submit
  • A few sentences in a markdown cell describing what you think about TDD.

Please make sure to double check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it, to make sure that what you think you submitted is what you actually submitted.

In addition, please review our submission guidelines before submitting your project.