TDM 10200: Project 4 — 2024

Motivation: In the last project, we began exploring control flow, including if statements and for loops. In this project, we will delve deeper into the concept of loops. There are three main types of loops that we will focus on: for loops, while loops, and nested loops.

Context: We will continue to work on some basic data types and will discuss some similar control flow concepts.

Scope: loops and basic data structures such as tuples, lists, and dicts

Dataset

All questions in this project will use the data sets in this directory:

/anvil/projects/tdm/data/noaa/

Just as we did in Project 3, please remember to use header=None when you read in the data set, and remember to use:

names=["id","date","element_code","value","mflag","qflag","sflag","obstime"]

to create the column names.
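As a minimal sketch of the read step described above, the following uses a small in-memory buffer with made-up rows so that it runs anywhere. The station ids and values are hypothetical, and the path mentioned in the comment assumes that 1880.csv sits directly inside the NOAA directory.

```python
import io
import pandas as pd

# Hypothetical stand-in rows. On Anvil you would instead pass a path such as
# "/anvil/projects/tdm/data/noaa/1880.csv" (assumed location) to read_csv.
sample_csv = io.StringIO(
    "STATION_A,18800101,PRCP,5,,,X,\n"
    "STATION_A,18800102,TMAX,120,,,X,\n"
    "STATION_B,18800101,PRCP,1300,,,X,\n"
)

column_names = ["id", "date", "element_code", "value",
                "mflag", "qflag", "sflag", "obstime"]

# header=None tells pandas that the file has no header row;
# names= supplies the column names ourselves.
df = pd.read_csv(sample_csv, header=None, names=column_names)
print(df.head())
```

Without header=None, pandas would treat the first data row as the header, and that row would be lost from the data.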

Readings and Resources

  • Make sure to read about and use the template found here, and read the important information about project submissions here.

We added three new videos to help you with Project 4.

Questions

Question 1 (2 points)

  1. Use Pandas iterrows to loop over the 1880.csv data set. For any row that has information about precipitation (in the element code), if the value column in that row is more than 1200, print the date for that row. (Hint: Ten rows meet this condition, so you should print a total of 10 dates.)

  2. Answer the same question as in question 1a, again using the 1880.csv data set, but this time use indexing instead of iterrows.

  3. Which of these two approaches is faster? Which one do you understand better?
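The two styles compared in this question can be illustrated on a tiny made-up DataFrame. This is only a sketch: the rows are hypothetical, and it assumes that precipitation is recorded under the element code "PRCP".

```python
import pandas as pd

# Hypothetical toy data using three of the project's column names.
toy = pd.DataFrame({
    "date": [18800101, 18800102, 18800103, 18800104],
    "element_code": ["PRCP", "TMAX", "PRCP", "PRCP"],
    "value": [1300, 1500, 40, 1250],
})

# Approach 1: visit each row with iterrows and test it individually.
dates_loop = []
for index, row in toy.iterrows():
    if row["element_code"] == "PRCP" and row["value"] > 1200:
        dates_loop.append(row["date"])

# Approach 2: build a boolean mask over whole columns and index with it.
mask = (toy["element_code"] == "PRCP") & (toy["value"] > 1200)
dates_indexed = toy.loc[mask, "date"].tolist()

print(dates_loop)
print(dates_indexed)
```

To compare speed in a notebook, one option is to put each approach in its own cell and start that cell with the %%time magic.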

Question 2 (2 points)

  1. Write a for loop that displays the average precipitation per year, for each year from 1800 to 1850. On each line of your output, print the year and the average precipitation for that year.

  2. Change your for loop to a while loop, which prints the average precipitation, for each year from 1800 to 1850 BUT stops printing after the first year with average precipitation 22 or higher. (Hint: You will see that, because of the behavior of your while loop, it should print the average precipitation for the years 1800 to 1813.)
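The stopping behavior of such a while loop can be sketched without any data files. Here the yearly averages are invented, `average_precip` is a hypothetical helper standing in for whatever computes the average from a year's file, and the sketch assumes the first qualifying year is still printed before the loop ends.

```python
# Hypothetical yearly averages; in the project these would be computed
# from each year's data file rather than looked up in a dictionary.
toy_averages = {1800: 10.0, 1801: 15.5, 1802: 23.0, 1803: 12.0}

def average_precip(year):
    """Hypothetical helper: return the average precipitation for a year."""
    return toy_averages[year]

threshold = 22
year = 1800
last_year = 1803
printed_years = []
keep_going = True

while keep_going and year <= last_year:
    avg = average_precip(year)
    print(year, avg)
    printed_years.append(year)
    # Once a year reaches the threshold, print nothing further.
    if avg >= threshold:
        keep_going = False
    year += 1
```

With these toy numbers the loop prints 1800, 1801, and 1802, and never reaches 1803.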

Question 3 (2 points)

  1. For the 1880.csv data, find the average precipitation for each id. Which id has the largest average precipitation? (Hint: The largest average precipitation is 610.)

  2. What is the average precipitation for the id USC00288878?
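One possible way to get a per-id average is a pandas groupby, sketched here on invented rows. The ids and values are hypothetical, and the "PRCP" element code for precipitation is an assumption; other approaches, such as an explicit loop over the ids, work too.

```python
import pandas as pd

# Hypothetical toy data; the TMAX row should not affect the result.
toy = pd.DataFrame({
    "id": ["A", "A", "B", "B", "B"],
    "element_code": ["PRCP", "PRCP", "PRCP", "TMAX", "PRCP"],
    "value": [10, 30, 50, 999, 70],
})

# Keep only precipitation rows, then average the value column within each id.
prcp = toy[toy["element_code"] == "PRCP"]
avg_by_id = prcp.groupby("id")["value"].mean()

# idxmax returns the index label (here, the id) with the largest average.
top_id = avg_by_id.idxmax()

print(avg_by_id)
print(top_id, avg_by_id.loc[top_id])
```

The average for one particular id can then be looked up by label, for example avg_by_id.loc["A"].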

Question 4 (2 points)

  1. Change the results from question 3a into a dictionary. (Hint: If you solved question 3a the way Dr. Ward did, you probably got a Series, which you can convert into a dictionary with the to_dict() method.)
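The conversion mentioned in the hint looks like this on a small hand-built Series. The labels and values are hypothetical and simply stand in for a Series of per-id averages.

```python
import pandas as pd

# Hypothetical Series of averages indexed by id.
avg_by_id = pd.Series({"A": 20.0, "B": 60.0}, name="value")

# to_dict() maps each index label to its value.
avg_dict = avg_by_id.to_dict()

print(avg_dict)
print(avg_dict["B"])
```

The index labels become the dictionary keys, so a single id's average can then be looked up with ordinary dictionary indexing.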

Question 5 (2 points)

  1. You have used while loops and for loops on this project. Please choose any two data sets that we have posted on Anvil (they do not have to be NOAA data sets from /anvil/projects/tdm/data/noaa/; it is totally OK to use any data set in /anvil/projects/tdm/data/). Use a while loop or a for loop to explore the data, and show us something cool that you learned about the two data sets that you picked. Anything is OK; you have some freedom to explore!

Project 04 Assignment Checklist

  • Jupyter Lab notebook with your code, comments and output for the assignment

    • firstname-lastname-project04.ipynb

  • Python file with code and comments for the assignment

    • firstname-lastname-project04.py

  • Submit files through Gradescope

Please double-check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, we recommend downloading your submission after submitting it, to make sure that what you think you submitted is what you actually submitted.

In addition, please review our submission guidelines before submitting your project.