Think Summer: Day 1 Notes — 2024

Loading the database

%sql sqlite:////anvil/projects/tdm/data/movies_and_tv/imdb.db

Extracting a few rows from the each of the 6 tables

%%sql
SELECT * FROM titles LIMIT 5;
%%sql
SELECT * FROM episodes LIMIT 5;
%%sql
SELECT * FROM people LIMIT 5;
%%sql
SELECT * FROM ratings LIMIT 5;
%%sql
SELECT * FROM crew LIMIT 5;
%%sql
SELECT * FROM akas LIMIT 5;

We can see how many rows were in each table, as follows:

%%sql
SELECT COUNT(*) FROM titles LIMIT 5;
%%sql
SELECT COUNT(*) FROM episodes LIMIT 5;
%%sql
SELECT COUNT(*) FROM people LIMIT 5;

We can also start to investigate individual people, for instance:

We can look Jack Black up, by his username:

%%sql
SELECT * FROM people WHERE person_id = 'nm0085312' LIMIT 5;

Or we can lookup people by their name directly:

%%sql
SELECT * FROM people WHERE name = 'Jack Black' LIMIT 5;
%%sql
SELECT * FROM people WHERE name = 'Ryan Reynolds' LIMIT 5;
%%sql
SELECT * FROM people WHERE name = 'Hayden Christensen' LIMIT 5;

Community is a show that ran from 2009 to 2015

%%sql
SELECT * FROM titles WHERE title_id = 'tt1439629' LIMIT 5;

Friends is one of Dr Ward’s favorite shows. We can find it here:

%%sql
SELECT * FROM titles WHERE title_id = 'tt0108778' LIMIT 5;

or like this:

%%sql
SELECT * FROM titles WHERE (primary_title = 'Friends') AND (premiered > 1992) LIMIT 5;

These are the episodes from Friends:

%%sql
SELECT * FROM episodes WHERE show_title_id = 'tt0108778' LIMIT 5;

and one particular episode is called "The One Where Chandler Doesn’t Like Dogs"

%%sql
SELECT * FROM titles WHERE title_id = 'tt0583431' LIMIT 5;

That episode is in season 7, episode 8:

%%sql
SELECT * FROM episodes WHERE episode_title_id = 'tt0583431' LIMIT 5;

Here is the breakdown of the number of episodes per season:

%%sql
SELECT COUNT(*), season_number FROM episodes
WHERE show_title_id = 'tt0108778' GROUP BY season_number LIMIT 15;

Tobey Maguire was born in 1975

%%sql
SELECT * FROM people WHERE person_id = 'nm0001497' LIMIT 5;

There are a total of 8064259 titles in the titles table.

%%sql
SELECT COUNT(*) FROM titles LIMIT 5;

These are the first 5 people in the people table.

%%sql
SELECT * FROM people LIMIT 5;

These are the first 5 episodes in the episodes table.

%%sql
SELECT * FROM episodes LIMIT 5;

These are the first 5 people in the crew table.

%%sql
SELECT * FROM crew LIMIT 5;

Only 3 movies have more than 2 million ratings

%%sql
SELECT * FROM ratings WHERE votes > 2000000 LIMIT 5;

Let’s find how many people were born in each year (after 1850).

%%sql
SELECT COUNT(*), born FROM people WHERE born > 1850
GROUP BY born LIMIT 200;

There are 487731 titles with rating 7.4 or higher.

%%sql
SELECT COUNT(*) FROM ratings WHERE rating >= 7.4 LIMIT 5;

The Family Guy has 374 episodes.

%%sql
SELECT COUNT(*) FROM episodes WHERE show_title_id = 'tt0182576' LIMIT 5;

These are five of the films where George Lucas was on the crew.

%%sql
SELECT * FROM crew WHERE person_id = 'nm0000184' LIMIT 5;

These are the number of times that he played each role in the crew:

%%sql
SELECT COUNT(*), category FROM crew WHERE person_id = 'nm0000184' GROUP BY category LIMIT 50;

We can investigate how many titles premiered in each year, by grouping things together according to the year that the title premiered, and by ordering the results according to the year that the title premiered. The "desc" specifies that we want the results in descending order, i.e., with the largest result first (where "largest" means the "last year", because we are ordering by the years).

%%sql
SELECT COUNT(*), premiered FROM titles
GROUP BY premiered ORDER BY premiered DESC LIMIT 20;

The Family Guy premiered in 1999 and ended in 2022.

%%sql
SELECT * FROM titles WHERE title_id = 'tt0182576' LIMIT 5;

If you want to find the first five Comedies, you can find the ones where the genres are like Comedy, possibly with some other characters before and after:

%%sql
SELECT * FROM titles WHERE genres LIKE '%Comedy%' LIMIT 5;

Similarly, you can find actors and actresses with Audrey in their name:

%%sql
SELECT * FROM people WHERE name LIKE '%Audrey%' LIMIT 5;