5. SQL in R
Demo
|
Please use |
We will show how to make SQL calls from R, so that we can (for example) make plots related to our SQL queries. When we make a SQL call in R, the data is returned as an R database.
Connect to the IMDb SQLite database
conn <- dbConnect(RSQLite::SQLite(), "/anvil/projects/tdm/data/movies_and_tv/imdb.db")
Also load libraries we will be using in R:
library(RSQLite)
library(dplyr)
library(ggplot2)
Check available tables:
dbListTables(conn)
Write the first SQL query in R:
dbGetQuery(conn, "SELECT * FROM titles LIMIT 5;")
Start asking some questions: what type of titles Quentin Tarantino mainly works on (following query takes some time)
tarantino_types <- dbGetQuery(conn, "
SELECT t.type, COUNT(*) AS count
FROM titles AS t
JOIN crew AS c ON t.title_id = c.title_id
JOIN people AS p ON c.person_id = p.person_id
WHERE p.name = 'Quentin Tarantino'
GROUP BY t.type
ORDER BY count DESC;
")
tarantino_types
Plot title types
ggplot(tarantino_types, aes(x = reorder(type, count), y = count)) +
geom_bar(stat = "identity", fill = "#0073C2FF") +
coord_flip() +
labs(title = "Types of Titles Quentin Tarantino Worked On",
x = "Title Type", y = "Count") +
theme_minimal()
|
SQL syntax is NOT case sensitive. However, remember that R is case sensitive, so please be careful with the capitalization of data frame and variable names. |