TDM 10100: Project 11 — 2023

Motivation: Selecting the right tools, understanding a problem, and knowing what is available to support you all take practice.
So far this semester, we have learned several tools in R for solving problems. This project is an opportunity for you to choose the tools and decide how to solve the problem presented.

We will also be looking at time series data, which records how one or more variables change over time. Data visualizations are especially helpful for exploring time series data.

Make sure to read about and use the template found here, and to read the important information about project submissions here.

Dataset(s)

The project will use the following dataset:

  • /anvil/projects/tdm/data/restaurant/orders.csv

Questions

Question 1 (2 pts)

Read the dataset /anvil/projects/tdm/data/restaurant/orders.csv into a data.frame named orders.

  1. Convert the created_at column to month, day, year format

  2. How many unique years are in the data.frame?

  3. Create a line plot that shows the average number of orders placed per day of the week (e.g. Monday, Tuesday, …).

  4. Write one to two sentences on what you notice in the graph
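
The date handling in this question can be sketched on a small made-up data frame. This is only an illustration under stated assumptions: orders_demo and its rows are invented, and the "YYYY-MM-DD HH:MM:SS" timestamp layout is a guess, so check what created_at actually looks like in orders.csv and adjust the format string before reusing any of it.

```r
# Toy stand-in for the orders data.frame (hypothetical rows, assumed timestamp layout).
orders_demo <- data.frame(
  created_at = c("2019-10-07 11:15:00", "2019-10-07 18:40:00",
                 "2019-10-08 09:05:00", "2019-10-14 20:30:00",
                 "2020-01-06 12:00:00"),
  stringsAsFactors = FALSE
)

# Parse the text into date-times, then derive the pieces we need.
stamp <- as.POSIXct(orders_demo$created_at,
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
orders_demo$mdy        <- format(stamp, "%m/%d/%Y")  # month/day/year label
orders_demo$year       <- format(stamp, "%Y")
orders_demo$order_date <- as.Date(stamp)

n_years <- length(unique(orders_demo$year))

# Count orders per calendar date. Dates with no orders do not appear in the
# table, so they are not averaged in as zeros.
daily_counts <- table(orders_demo$order_date)
count_dates  <- as.Date(names(daily_counts))

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday, independent of locale.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
count_day <- factor(day_names[as.POSIXlt(count_dates)$wday + 1],
                    levels = day_names)

# Average the daily counts within each day of the week (NA where no data).
avg_by_day <- tapply(as.vector(daily_counts), count_day, mean)

# Line plot with the day names on the x axis.
plot(seq_along(avg_by_day), avg_by_day, type = "b", xaxt = "n",
     xlab = "Day of week", ylab = "Average orders per day")
axis(1, at = seq_along(avg_by_day), labels = names(avg_by_day))
```

Averaging the per-date counts, rather than dividing the total number of orders by seven, is one reasonable reading of "average number of orders per day of the week"; other interpretations are possible as long as you explain your choice.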

Question 2 (2 pts)

  1. Identify the top 5 vendors (vendor_id) with the highest number of orders over the years (based on created_at for time reference)

  2. For these top 5 vendors, determine the average grand_total amount for the orders they received each year

  3. Comment on any interesting patterns you observe in the average total amount across these vendors, and on how it changed over the years.

You can use either tapply or the aggregate function to group and summarize the data.
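
To show how the two functions differ, here is a minimal sketch on invented data. The column names vendor_id, grand_total, and created_at come from the questions above, but the rows, the timestamp layout, and the helper name top_n are assumptions made for illustration (top_n is 2 here only because the toy data has three vendors).

```r
# Toy stand-in for the orders data.frame (hypothetical values).
orders_demo <- data.frame(
  vendor_id   = c(10, 10, 10, 20, 20, 30, 10, 20),
  grand_total = c(12, 8, 10, 20, 30, 5, 14, 10),
  created_at  = c("2019-03-01 12:00:00", "2019-06-15 13:30:00",
                  "2020-01-10 19:00:00", "2019-02-02 18:00:00",
                  "2020-05-05 11:45:00", "2019-07-07 10:00:00",
                  "2020-08-08 20:15:00", "2020-09-09 09:30:00"),
  stringsAsFactors = FALSE
)
stamp <- as.POSIXct(orders_demo$created_at,
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
orders_demo$year <- format(stamp, "%Y")

# Rank vendors by number of orders and keep the top few.
top_n <- 2
vendor_counts <- sort(table(orders_demo$vendor_id), decreasing = TRUE)
top_vendors   <- names(vendor_counts)[seq_len(top_n)]
top_orders    <- orders_demo[as.character(orders_demo$vendor_id) %in% top_vendors, ]

# tapply returns a vendor-by-year matrix of means.
avg_tapply <- tapply(top_orders$grand_total,
                     list(top_orders$vendor_id, top_orders$year),
                     mean)

# aggregate returns the same means as a long data.frame, one row per group.
avg_aggregate <- aggregate(grand_total ~ vendor_id + year,
                           data = top_orders, FUN = mean)

print(avg_tapply)
print(avg_aggregate)
```

The matrix from tapply is convenient for reading across years at a glance, while the long data.frame from aggregate is usually easier to pass to a plotting function.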

Question 3 (2 pts)

  1. Using the created_at field, find out how many orders were placed after 5 pm and how many were placed before 5 pm.

  2. Create a bar chart that compares the number of orders placed after 5 pm with the number of orders before 5 pm, for each day of the week

You can use the library ggplot2 for this question.

You can find more information about ggplot2 at ggplot2.tidyverse.org
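
As a rough sketch of the before/after split and a grouped bar chart, the example below uses invented timestamps. It assumes the "YYYY-MM-DD HH:MM:SS" layout and treats an order at exactly 17:00:00 as "after 5 pm"; both are assumptions you should check or restate for the real data. The plotting step is wrapped in a requireNamespace check so the counting part still runs where ggplot2 is not installed.

```r
# Toy stand-in for the orders data.frame (hypothetical rows).
orders_demo <- data.frame(
  created_at = c("2019-10-07 11:15:00", "2019-10-07 18:40:00",
                 "2019-10-08 09:05:00", "2019-10-08 17:00:00",
                 "2019-10-09 21:10:00", "2019-10-09 16:59:59"),
  stringsAsFactors = FALSE
)
stamp <- as.POSIXct(orders_demo$created_at,
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Hour of day on a 24-hour clock; hour >= 17 is treated as "after 5 pm".
order_hour <- as.POSIXlt(stamp)$hour
orders_demo$period <- ifelse(order_hour >= 17, "After 5 pm", "Before 5 pm")

# Locale-independent day names (wday is 0 for Sunday through 6 for Saturday).
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
orders_demo$weekday <- factor(day_names[as.POSIXlt(stamp)$wday + 1],
                              levels = day_names)

# Overall counts, then counts for every weekday/period combination.
period_totals <- table(orders_demo$period)
counts <- as.data.frame(table(weekday = orders_demo$weekday,
                              period  = orders_demo$period))

# Side-by-side bars: one pair of bars per day of the week.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(counts, aes(x = weekday, y = Freq, fill = period)) +
    geom_col(position = "dodge") +
    labs(x = "Day of week", y = "Number of orders", fill = "Time of order")
  print(p)
}
```

Here geom_col is used because the counts are computed beforehand; geom_bar on the raw rows would be an equally valid choice.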

Question 4 (2 pts)

Looking at the data, is there something that you find interesting? Create 3 new graphs, and explain what you see, and why you chose each specific type of plot.

Project 11 Assignment Checklist

  • Jupyter Lab notebook with your code, comments and output for the assignment

    • firstname-lastname-project11.ipynb

  • R code and comments for the assignment

    • firstname-lastname-project11.R

  • Submit files through Gradescope

Please double-check that your submission is complete and contains all of your code and output before submitting. If you are on a spotty internet connection, we recommend downloading your submission after submitting it to make sure that what you think you submitted is what you actually submitted.

In addition, please review our submission guidelines before submitting your project.