Think Summer: Project 4 — 2024

Question 1

For the show the Gilmore Girls, there are 7 seasons listed in the IMDB database. Find the average rating of each of the seven seasons. Hint: Use AVG for find the average, and GROUP BY the season_number. Make a plot or dotchart to show the average rating for each season in R.

Question 2

Identify the six most popular episodes of the show Grey’s Anatomy (where "popular" denotes a high rating).

Question 3

Make a dotchart in R showing the results of the previous question. Hint: You can use your work from SQL, and export the results to a dataframe called myDF in R. Then you can use something like:

# use a dbGetQuery here, to import the SQL results to R, and then
myresults <- myDF$rating
names(myresults) <- myDF$primary_title
dotchart(myresults)

Question 4

Make a plot or dotchart showing the total amount of money donated in each of the top 10 states, during the 2000 federal election cycle.

Question 5

Make a dotchart that shows how many movies premiered in each year. You do not need to show all of the years; there are too many years! Just show the number of movies premiered in each year since the year 2000.

Question 6

Among the three big New York City airports (JFK, LGA, EWR), which of these airports had the worst DepDelay (on average) in 2005? (Can you solve this with 1 line of R, using a tapply (rather than using 3 separate lines of R)? Hint: After you run the tapply, you can index your results using [c("JFK", "LGA", "EWR")] to lookup all 3 airports at once.)

Question 7

Use LIKE to analyze the primary_title of all IMDB titles: First determine how many titles have Batman anywhere in the title, and then determine how many titles have Superman anywhere in the title? Which one occurs more often?

Question 8

How much money was donated during the 2000 federal election cycle by people who have PURDUE listed somewhere in their employer name? How much money was donated by people who have MICROSOFT listed somewhere in their employer name? Hint: You might use the grep or the grepl (which is a logical grep) to solve this one.

Question 9

How much money was donated during the 2000 federal election cycle by people from your hometown? (Be sure to match the city and the state.)

Question 10

During the years 2000 to 2020, how many people (from the people table) died in each year? Make a plot or dotchart to show the number of people who died in each year.

Question 11

Consider only the flights that arrive to Indianapolis (airport code IND), i.e., for which Indianapolis is the destination. What are the 10 most popular origin airports? Make a plot or dotchart to show the number of flights from each of these 10 most popular origin airports (with Indianapolis as the destination airport).

Question 12

Create your own interesting question based on the things you have learned this week. What insights can you find?