# Think Summer: Project 1 Solutions — 2022

## 1. How many tables are in the SQL database?

One method is to load the SQL database

``%sql sqlite:////anvil/projects/tdm/data/movies_and_tv/imdb.db``

and select the names of the tables, as we explained in the introduction to the course. There are six tables: `akas`, `crew`, `episodes`, `people`, `ratings`, `titles`

``````%%sql
SELECT
name
FROM
sqlite_master
WHERE
TYPE IN('table', 'view')
AND name NOT LIKE 'sqlite_%'
ORDER BY
1;``````

or we can do this in R:

``````%%R
library(RSQLite)
conn <- dbConnect(RSQLite::SQLite(), "/anvil/projects/tdm/data/movies_and_tv/imdb.db")
dbListTables(conn)``````

## 2. Find and print the title_id, rating, and number of votes (votes) for all movies that received at least 2 million votes.

In a second query (and new cell), use the information you found in the previous query to identify the primary_title of these movies.

These are the movies with at least 2 million votes:

``````%%sql
SELECT * FROM ratings WHERE votes >= 2000000 LIMIT 5;``````

and then we can lookup their titles:

``````%%sql
SELECT * FROM titles WHERE title_id = 'tt0111161' OR title_id = 'tt0468569' OR title_id = 'tt1375666' LIMIT 5;``````

## 3. Find the primary_title of every movie that is over 2 hours long or that premiered after 1990.

Order the result from newest premier year to oldest, and limit the output to 15 movies. Make sure premiered and runtime_minutes are not NULL.

We just add the conditions to the query about the titles table.

``````%%sql
SELECT * FROM titles WHERE (type == 'movie') AND (runtime_minutes IS NOT NULL) AND (premiered IS NOT NULL) AND ((runtime_minutes > 120) OR (premiered > 1990)) ORDER BY premiered DESC LIMIT 15;``````

Now we can find the total number of such movies, using the `COUNT`:

``````%%sql
SELECT COUNT(*) FROM titles WHERE (type == 'movie') AND (runtime_minutes IS NOT NULL) AND (premiered IS NOT NULL) AND ((runtime_minutes > 120) OR (premiered > 1990)) ORDER BY premiered DESC LIMIT 15;``````

## 4. What actor has the longest name? Answer this question using just SQL.

We can use the `length` function, as follows:

``````%%sql
SELECT *, length(name) FROM people ORDER BY length(name) DESC LIMIT 5;``````

## 5. LIKE is a very powerful tool. You can read about SQLite’s version of LIKE here. Use LIKE and/or R to get a count of how many movies (type='movie') that starts with each letter of the alphabet. Can you think of another way to do this? If so, show us, and explain what you did!

I used the `substr` function, which gets a substring from the string. In this case, I used it to just get the first letter of the `primary_title`.

``````%%sql
SELECT substr(primary_title,1,1), count(substr(primary_title,1,1)) FROM titles GROUP BY substr(primary_title,1,1) ORDER BY count(substr(primary_title,1,1)) DESC LIMIT 5;``````