TDM 10100: Project 13 — 2023

Motivation: This semester we took a deep dive into R and its packages. Let’s take a second to pat ourselves on the back for surviving a long semester and review what we have learned!

Make sure to read about, and use the template found here, and the important information about projects submissions here.

Dataset(s)

The project will use the following dataset:

  • /anvil/projects/tdm/data/icecream/combined/products.csv

Questions

Questions 1 (2 pts)

For question 1, read the dataset into a data.frame called orders

  1. Create a plot that shows, for each brand of ice cream, the total number of rating_count (in other words, the sum of those rating_count values) for each brand of icecream. There are 4 brands, so your solution should have 4 values altogether.

  • It might be worthwhile to make a dotchart.

Before solving Question 2, please build a data frame called bigDF from these three files

/anvil/projects/tdm/data/icecream/bj/reviews.csv

/anvil/projects/tdm/data/icecream/breyers/reviews.csv

/anvil/projects/tdm/data/icecream/talenti/reviews.csv

using this code:

mybrands <- c("bj", "breyers", "talenti")
myfiles <- paste0("/anvil/projects/tdm/data/icecream/", mybrands, "/reviews.csv")
bigDF <- do.call(rbind, lapply(myfiles, fread))

Use this data frame bigDF to answer Questions 2, 3, and 4:

Question 2 (2 pts)

  1. In which month-and-year pair were the most reviews given? (There is one review per line of this data frame bigDF.

  2. Make a plot that shows, for each year, the average number of stars in that year.

Question 3 (2 pts)

  1. Which key has the lowest average number of stars?

  2. There is one entry in which the text review has more than 2500 characters! Print the text of this review.

Question 4 (2 pts)

  1. Consider all of the authors of the reviews. Which author wrote the most reviews altogether? (Note: there are many blank authors, and there are a lot of Anonymous authors, but please ignore blank authors and Anonymous authors in this question.)

  2. Considering the 43 reviews written by the author that you found in question 4a, this author is usually happy and gives high ratings. BUT this author gave one review that only had 1 star. Print the text of that 1 star review from the author you found in question 4a.

Project 13 Assignment Checklist

  • Jupyter Lab notebook with your code, comments and output for the assignment

    • firstname-lastname-project13.ipynb

  • R code and comments for the assignment

    • firstname-lastname-project13.R.

  • Submit files through Gradescope

Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted.

In addition, please review our submission guidelines before submitting your project.