Apply Functions

`tapply`

The documentation for `tapply` is a bit more specific than for the other apply functions. Its main arguments are (X, INDEX, FUN): X is an object that the `split` function can be applied to (typically a vector), INDEX is a factor (or a list of factors) of the same length as X by which X is grouped, and FUN is the function to apply, as before.

To simplify this definition: `tapply` splits X into groups according to INDEX and applies FUN to each group, returning one result per group.
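As a quick illustration of that grouping behavior, here is a minimal sketch on made-up data. The vectors `sales` and `store` are hypothetical and not part of the Iowa data set; they only stand in for X and INDEX.

```r
# Hypothetical toy data: one sale amount per row, and the store it belongs to
sales <- c(10, 20, 5, 15, 30)
store <- factor(c("A", "B", "A", "C", "B"))

# X = sales, INDEX = store, FUN = sum:
# split sales by store, then sum within each store
totals <- tapply(sales, store, sum)

# totals is named by the levels of store: A = 15, B = 50, C = 15
print(totals)
```

The result has one entry per group, and its names are the group labels, which is what makes it easy to sort and inspect afterwards.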

Examples

Using the Iowa liquor sales file, use `fread` to read all 27 million rows of the data set again, but this time read in only the columns "Zip Code", "Category Name", and "Sale (Dollars)". Find the 10 "Zip Code" values with the largest total "Sale (Dollars)", and give those "Zip Code" values along with their totals.

Solution

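# load data.table, which provides fread
library(data.table)

# read in only the three columns we need
iowa2 <- fread("/anvil/projects/tdm/data/iowa_liquor_sales/iowa_liquor_sales.csv", select=c("Zip Code", "Category Name", "Sale (Dollars)"))

# total sales for each zip code, then the 10 largest totals
zip_sales <- tapply(iowa2$`Sale (Dollars)`, iowa2$`Zip Code`, sum)
head(sort(zip_sales, decreasing=TRUE), 10)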

Zip Code    Sale (Dollars)
50320       132861227.43
52402       108460935.17
52240       106827908.74
50266       95956448.74
51501       84485599.04
52241       80224356.18
50613       70716357.28
50311       65407916.64
52722       63447651.28
50021       61328202.38
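The sort-then-head pattern used in the solution can be tried on a small scale first. The following is a sketch on hypothetical data (`sales`, `store`, and `top2` are made-up names), showing how to pull out both the group labels and their totals from the result of `tapply`.

```r
# Hypothetical toy data, not from the Iowa file
sales <- c(10, 20, 5, 15, 30, 40)
store <- c("A", "B", "A", "C", "B", "D")

# Total per store: A = 15, B = 50, C = 15, D = 40
totals <- tapply(sales, store, sum)

# Sort from largest to smallest and keep the first 2 entries
top2 <- head(sort(totals, decreasing = TRUE), 2)

# The names carry the group labels; the values are the totals
data.frame(store = names(top2), total = as.vector(top2))
```

Here the two largest totals belong to stores B (50) and D (40), in that order.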

Using the Iowa liquor sales file, find the 10 "Category Name" values with the largest total "Sale (Dollars)", and give those "Category Name" values along with their totals.

Solution

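# read in data (this step can be skipped if iowa2 is already loaded from the previous example)
library(data.table)
iowa2 <- fread("/anvil/projects/tdm/data/iowa_liquor_sales/iowa_liquor_sales.csv", select=c("Zip Code", "Category Name", "Sale (Dollars)"))

# total sales for each category, then the 10 largest totals
category_sales <- tapply(iowa2$`Sale (Dollars)`, iowa2$`Category Name`, sum)
head(sort(category_sales, decreasing=TRUE), 10)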

Category Name                Sale (Dollars)
CANADIAN WHISKIES            457612891.06
AMERICAN VODKAS              380307151.309999
STRAIGHT BOURBON WHISKIES    257794861.83
SPICED RUM                   254362805.42
WHISKEY LIQUEUR              199736754.69
IMPORTED VODKAS              183082358.92
TENNESSEE WHISKIES           162676709.12
100% AGAVE TEQUILA           124223944.31
BLENDED WHISKIES             109152590.55
IMPORTED BRANDIES            88413645.9