Baseball Examples Using the Lahman Database

These examples depend on this database:

/anvil/projects/tdm/data/lahman/lahman.db

Parks in Indiana

Display the ballparks (from the Parks table) located in Indiana.

SELECT * FROM parks WHERE state = 'IN';

Currently Active Teams

Find the names of all 30 active teams in 2023.

SELECT franchName FROM TeamsFranchises WHERE active=='Y';

Chicago Cubs

Find all the years in which the Chicago Cubs were the League Champions.

SELECT t.yearID, t.name, t.teamID FROM Teams t WHERE LgWin = 'Y' AND name='Chicago Cubs';

Wade Boggs Salary

Find the salaries for Wade Boggs (playerID = boggswa01) for each year of his career.

SELECT * FROM Salaries WHERE playerID = 'boggswa01';

Characteristics of Cal Ripken Jr.

Cal Ripken Sr. and Cal Ripken Jr. both had long careers with the Baltimore Orioles:

SELECT * FROM People WHERE nameFirst = 'Cal' AND nameLast = 'Ripken' LIMIT 5;

Find Cal Ripken Jr.'s weight, height, and whether he bats left or right handed.

SELECT weight, height, bats FROM People WHERE playerID = 'ripkeca01' LIMIT 5;

Roberto Clemente

Find the number of at-bats, home runs, and stolen bases for Roberto Clemente (playerID = 'clemero01') by year, given in ascending order according to the year.

SELECT yearID, AB, HR, SB FROM Batting WHERE playerID = 'clemero01' ORDER BY yearID;

Hall Of Fame Inductions

Find the number of players inducted into the Hall Of Fame each year since 2000.

SELECT h.yearID, COUNT(*) FROM HallOfFame h WHERE h.yearID >= 2000 AND h.inducted = 'Y' AND h.category='Player' GROUP BY h.yearID;

Ichiro Suzuki

Ichiro Suzuki is regarded as one of the greatest hitters of all time because of his prowess in both American and Japanese professional baseball. Find the number of hits (H) and home runs (HR) for Ichiro Suzuki (playerID = suzukic01) during each year of his career.

SELECT teamID, yearID, H, HR FROM batting WHERE playerID == 'suzukic01';

Notice that he was traded in 2012, so if we want his total number of hits (H) and home runs (HR) in each year, we need to group the values together, according to the yearID. In such a case, it no longer makes sense to list the teamID because he has multiple teamID values in 2012.

SELECT yearID, SUM(H), SUM(HR) FROM batting WHERE playerID == 'suzukic01' GROUP BY yearID;

This story about Iris Skinner and Ichiro Suzuki is one of my favorite baseball stories:

Pitchers With Many Wins

Consider only the seasons since 1960. Find the number of pitchers in each season who had 25 or more wins in that season.

SELECT yearID, COUNT(*) FROM Pitching WHERE W >= 25 AND yearID >= 1960 GROUP BY yearID;

Hall of Fame Veterans Committee

The Hall of Fame Veterans Committee nominates Hall of Fame players that are forgotten, so to say, in baseball history. The committee allows players to recognize and give justice to their underrated, underviewed, or undervalued peers. Find all playerID and yearID values corresponding to players who were selectged for the Hall of Fame by the Veterans Committee between 1990 and 2000. Put the list in descending order.

SELECT playerID, yearID FROM halloffame WHERE votedBy = 'Veterans' AND inducted = 'Y' AND yearID BETWEEN 1990 AND 2000 ORDER BY yearID DESC;

Most Shutouts

Which pitcher has thrown the most shutouts (SHO) in the National League since 2010?

SELECT playerID, SUM(SHO) as mycounts FROM pitching WHERE yearID >= 2010 AND lgID = 'NL' GROUP BY playerID ORDER BY mycounts DESC LIMIT 1;

Florida teams

There are only two teams from Florida, namely, the Marlins (who have played as the Florida Marlins and now as the Miami Marlins, with franchID = 'FLA') and the Rays (who have played as the Tampa Bay Devil Rays and now as the Tampa Bay Rays, with franchID = 'TBD'). Considering only these two teams, find their attendance in each year, with all Marlins results given first (in ascending order by year) and then all Rays results given second (in ascending order by year).

SELECT yearID, franchID, attendance FROM teams WHERE franchID = 'TBD' OR franchID == 'FLA' ORDER BY franchID, yearID;

Toronto Blue Jays Attendance

Sometimes a baseball team plays in different stadiums, for instance, due to a stadium renovation or due to the COVID pandemic. Consider the Toronto Blue Jays (teamkey = 'TOR') attendance, as described in the Homegames table. Notice that there are multiple rows for the yearkey values 1989, 2001, 2021. Now SUM the values for each yearkey, to get the total attendance in each year. List your results according to the yearkey, in descreasing order.

SELECT yearkey, SUM(attendance) FROM homegames WHERE teamkey = 'TOR' GROUP BY yearkey ORDER BY yearkey DESC;

The Most Errors in Baseball

Herman Long committed the most errors in the history of baseball.

SELECT playerID, SUM(E) as mycounts FROM fielding GROUP BY playerID ORDER BY mycounts DESC LIMIT 1;

Strikeouts for Batters and Pitchers

Strikeouts are good for pitchers but bad for batters. During the 2014 season, which pitcher had the most strikeouts, and how many strikeouts did they have? Which batter had the most strikeouts, and how many strikeouts did they have? (It is OK to just find the playerID for this pitcher and this batter.

SELECT playerID, SO FROM Pitching WHERE yearID = 2014 ORDER BY SO DESC LIMIT (1);

SELECT playerID, SO FROM Batting WHERE yearID = 2014 ORDER BY SO DESC LIMIT (1);

Large Salaries

List the rows from the Salaries table in which a player made at least 30 million dollars in a single season.

SELECT * FROM Salaries WHERE salary >= 30000000 ORDER BY salary DESC;

At Least One Complete Game In A Year

In recent years, it is rare for pitchers to throw a complete game. For this question, in each year, consider only the pitchers (from the Pitching table) who pitched at least 1 complete game (in other words, who have CG > 0). In each year, how many players achieved this feat, i.e., had at least one complete game?

In 1871, there were 14 such pitchers, and in 2023, there were 25 such pitchers, but the number changed a lot over the years.

SELECT COUNT(*), yearID FROM Pitching WHERE CG > 0 GROUP BY yearID;

Cy Young Award

Which pitcher has received the Cy Young Award the most times altogether? How many times did he receive this award? It suffices to just find this pitcher’s playerID.

SELECT COUNT(*) as mycounts, playerID FROM awardsplayers WHERE awardID = 'Cy Young Award' GROUP BY playerID ORDER BY mycounts DESC LIMIT 1;

Most World Series

Find the franchise that has won the World Series the most times.

SELECT COUNT(*) as mycounts, franchID FROM Teams WHERE WSWin = 'Y' GROUP BY franchID ORDER BY mycounts DESC LIMIT 1;

Most Runs Batted In

In the 2023 season, find the batter with the most runs batted in. It suffices to simply find their playerID.

In the Batting table, the runs batted in are found in the RBI column.

SELECT playerID, SUM(RBI) as mysum FROM Batting WHERE yearID = 2023 GROUP BY playerID ORDER BY mysum DESC LIMIT 1;

Catchers Who Won the Gold Glove Award More Than Once In Their Careers

The Gold Glove award recognizes players with excellent fielding abilities. Identify the catchers who have been recognized with this award more than once in their careers.

SELECT COUNT(*) as mycounts, playerID FROM AwardsPlayers WHERE awardID = 'Gold Glove' AND notes = 'C' GROUP BY playerID HAVING mycounts > 1 ORDER BY mycounts DESC;

Very Successful Manager in the 1990s

For this question, consider only games player during the 1990s. Find the playerID for each manager who won at least 60 percent of their games. Because there are some managers who were involved in only a few games, include the criteria that the managers was in more than 100 games during this time period.

There was only one such manager to achieve this feat in the 1990s.

SELECT CAST(SUM(W) AS FLOAT)/(SUM(W)+SUM(L)) as myrecord, playerID, SUM(W), SUM(L)
FROM managers
WHERE yearID BETWEEN 1990 AND 1999
GROUP BY playerID
HAVING myrecord > 0.6 AND SUM(W) + SUM(L) > 100
ORDER BY myrecord DESC;

Journeymen

The term journeyman is one "who plays for numerous clubs during his career":

Make a list of all of the pitchers who have played for 12 or more teams during their careers.

SELECT COUNT(DISTINCT teamID) as mycounts, playerID FROM Pitching GROUP BY playerID ORDER BY mycounts DESC LIMIT 15;