Baseball Examples Using the Lahman Database
These examples depend on this database:
/anvil/projects/tdm/data/lahman/lahman.db
Players With At Least 60 Home Runs
There are only a few players with at least 60 home runs during the same season. First we find their playerID values:
SELECT playerID, yearID, teamID, HR FROM Batting WHERE HR >= 60;
To get the information about the full names of the players in this query, we need to use the Batting table and the People table too:
SELECT b.playerID, b.yearID, b.teamID, p.nameFirst, p.nameLast, b.HR
FROM Batting b INNER JOIN People p ON b.playerID = p.playerID WHERE
b.HR >= 60;
We might want to also know the full names of the teams that they played on.
SELECT b.playerID, b.yearID, b.teamID, p.nameFirst, p.nameLast, b.HR, t.name
FROM Batting b INNER JOIN People p ON b.playerID = p.playerID
INNER JOIN Teams t ON (t.yearID = b.yearID AND t.teamID = b.teamID) WHERE
b.HR >= 60;
Babe Ruth
Now we can count the number of seasons in which Babe Ruth hit at least 50 HR’s:
SELECT b.playerID, p.nameFirst, p.nameLast, t.name, COUNT(*) FROM
Batting b INNER JOIN People p ON b.playerID = p.playerID INNER JOIN
Teams t ON (t.yearID = b.yearID AND t.teamID = b.teamID) WHERE b.HR >=
50 AND b.playerID = 'ruthba01';
Finally, we can count the number of seasons for each player in which the player hit over 50 HR’s.
SELECT b.playerID, p.nameFirst, p.nameLast, t.name, COUNT(*) FROM
Batting b INNER JOIN People p ON b.playerID = p.playerID INNER JOIN
Teams t ON (t.yearID = b.yearID AND t.teamID = b.teamID) WHERE b.HR >=
50 GROUP BY b.playerID;
Players with more than 30 home runs
Now we find the players who have hit more than 30 home runs per year, grouped according to player, and displaying the count of the times this happened:
SELECT p.nameFirst, p.nameLast, b.playerID, b.teamID, t.name, COUNT(*) FROM Batting b JOIN People p ON p.playerID = b.playerID JOIN Teams t ON b.yearID = t.yearID AND b.teamID = t.teamID WHERE b.HR >= 30 GROUP BY b.playerID;
Now we calculate the same feat, but no longer grouping by playerID. Instead, we can show the actual number of home runs and the year:
SELECT p.nameFirst, p.nameLast, b.playerID, b.HR, b.yearID, b.teamID, t.name FROM Batting b JOIN People p ON p.playerID = b.playerID JOIN Teams t ON b.yearID = t.yearID AND b.teamID = t.teamID WHERE b.HR >= 30 AND t.yearID >= 1990;
Same thing, but only considering a specific year, namely, 1995:
SELECT p.nameFirst, p.nameLast, b.playerID, b.HR, b.yearID, b.teamID, t.name FROM Batting b JOIN People p ON p.playerID = b.playerID JOIN Teams t ON b.yearID = t.yearID AND b.teamID = t.teamID WHERE b.HR >= 30 AND t.yearID = 1995;
Now, for each year since 1991, we count the number of players with at least 30 home runs during that year:
SELECT b.yearID, COUNT(*) FROM Batting b WHERE b.HR >= 30 AND b.yearID >= 1991 GROUP BY b.yearID;
Players with At Least 50 Home Runs
List each player who hit 50 or more home runs during at least one year since 1991:
SELECT DISTINCT(p.playerID) FROM Batting b JOIN People p ON p.playerID = b.playerID JOIN Teams t ON b.yearID = t.yearID AND b.teamID = t.teamID WHERE b.HR >= 50 AND t.yearID >= 1991 GROUP BY b.playerID;
Boston Red Sox in 2008
Next we focus on gathering data about the Boston Red Sox. We first find out which teamID to use by looking at:
SELECT DISTINCT(teamID), name FROM Teams WHERE name LIKE 'Boston%';
Now we get the names of all Boston Red Sox players in 2008:
SELECT p.playerID, b.yearID, b.teamID, p.nameFirst, p.nameLast, t.name
FROM Batting b INNER JOIN People p ON b.playerID = p.playerID INNER
JOIN Teams t ON (t.yearID = b.yearID AND t.teamID = b.teamID) WHERE
(t.teamID = 'BOS' AND t.yearID = 2008);
Players Who Attended Purdue University
Find the first and last names of all the players who attended Purdue University. Then add the years (starting and ending years) for which each of these people played for Purdue. Finally, add the columns for the total number of home runs for each player in the Major Leagues, as well as the range of years that each player was active in the Major Leagues.
The 15 players are:
SELECT DISTINCT(p.playerID) FROM People p INNER JOIN CollegePlaying c ON p.playerID=c.playerID WHERE c.schoolID='purdue';
Now we add the years when they each played for Purdue:
SELECT p.nameFirst, p.nameLast, MIN(c.yearID), MAX(c.yearID) FROM People p INNER JOIN CollegePlaying c on p.playerID=c.playerID WHERE c.schoolID='purdue' GROUP BY p.playerID;
Finally, we add columns for the total number of home runs for each player in the Major Leagues, as well as the range of years that each player was active in the Major Leagues.
SELECT p.nameFirst, p.nameLast, MIN(c.yearID), MAX(c.yearID), SUM(b.HR), MIN(b.yearID), MAX(b.yearID) FROM People p INNER JOIN CollegePlaying c on p.playerID=c.playerID INNER JOIN Batting b ON p.playerID = b.playerID WHERE c.schoolID='purdue' GROUP BY p.playerID;
Players Who Attended Colleges In The State Of Indiana
Revise your solution to the previous question, so that it includes not only the players who attended college at Purdue, but rather, all of the players who attended a college from the State of Indiana.
SELECT s.schoolID, p.nameFirst, p.nameLast, MIN(c.yearID), MAX(c.yearID), SUM(b.HR), MIN(b.yearID), MAX(b.yearID) FROM People p INNER JOIN CollegePlaying c on p.playerID=c.playerID INNER JOIN Batting b ON p.playerID = b.playerID INNER JOIN Schools s ON c.schoolID = s.schoolID WHERE s.state = 'IN' GROUP BY p.playerID;
Teams With Many Wins
Find all teams that won 110 or more games in a season. List the year, the team’s full name, and the number of wins in that season.
SELECT t.yearID, t.name, t.W FROM Teams t WHERE t.W >= 110;
Now include (also) the first and last name of the manager for the team.
SELECT t.yearID, t.name, t.W, p.nameFirst, p.nameLast FROM Teams t INNER JOIN Managers m ON m.yearID=t.yearID and m.teamID=t.teamID INNER JOIN People p ON p.playerID=m.playerID WHERE t.W >= 110;
Chicago Cubs
Find all the years in which the Chicago Cubs were the League Champions, and this time include the first and last name of the manager for the team.
SELECT t.yearID, t.name, t.teamID, p.nameFirst, p.nameLast FROM Teams t INNER JOIN Managers mng ON mng.yearID=t.yearID INNER JOIN People p ON p.playerID=mng.playerID AND mng.teamID=t.teamID WHERE LgWin = 'Y' AND name='Chicago Cubs';
Stealing Bases
Find all players who stole 100 or more bases during a season. For each such player, list the player’s first name, last name, the number of bases stolen, the year, and the player’s team name.
SELECT p.nameFirst, p.nameLast, b.SB, b.yearID, t.name FROM People p INNER JOIN Batting b ON p.playerID = b.playerID INNER JOIN Teams t ON b.yearID=t.yearID AND b.teamID=t.teamID WHERE b.SB >= 100 AND b.SB != '';
Large Salaries
List the rows from the Salaries table in which a player made at least 30 million dollars in a single season, but now, for each such occurrence, list the player’s first name, last name, the year, the salary, the teamID, and the Team’s full name.
SELECT p.nameFirst, p.nameLast, s.yearID, s.salary, t.teamID, t.name FROM People p INNER JOIN Salaries s ON p.playerID = s.playerID INNER JOIN Teams t ON t.yearID = s.yearID AND t.teamID = s.teamID WHERE s.salary >= 30000000;