Apply Functions
tapply
The documentation definition of tapply is a bit more specific than the others: the arguments are now (X, INDEX, FUN), where X is an object to which the split function applies, INDEX is a factor by which X is grouped, and FUN is a function, as before.
To simplify this definition, we can say that tapply applies FUN to X when X is grouped by INDEX.
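As a minimal sketch of that definition, the toy example below groups a small vector of amounts by a factor and sums each group. The vectors `spend` and `region` are made-up illustration data, not values from the transactions file. The second call spells out the split-then-apply idea by hand, which should give the same totals.

```r
# Hypothetical toy data (not taken from the transactions file)
spend  <- c(10, 20, 5, 15, 30)
region <- factor(c("EAST", "WEST", "EAST", "SOUTH", "WEST"))

# tapply: group spend by region, then apply sum to each group
totals <- tapply(spend, region, sum)
print(totals)

# The same idea written out by hand: split into groups, then apply sum
by_hand <- sapply(split(spend, region), sum)
print(by_hand)
```

Both results are indexed by the factor levels, so a single group can be looked up by name, for example `totals[["WEST"]]`.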
Examples
Using the 5000_transactions csv file, find the sum of the amount spent (in the SPEND column) at each of the store regions (the STORE_R column).
# read in data
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/8451/The_Complete_Journey_2_Master/5000_transactions.csv")
tapply(myDF$SPEND, myDF$STORE_R, sum, na.rm=TRUE)
CENTRAL  8897305.13999992
EAST     11699446.8599998
SOUTH    7957920.76999994
WEST     9680106.5399999
In the 5000_transactions csv file, find the total amount of money spent in 2016 altogether, and the total amount of money spent in 2017 altogether.
tapply(myDF$SPEND, myDF$YEAR, sum, na.rm=TRUE)
2016  19051720.0099997
2017  19183059.2999997
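The solutions above pass na.rm=TRUE after the function name. Any extra arguments given to tapply are forwarded to FUN, so this is the same as calling sum(..., na.rm=TRUE) on each group. A small sketch with made-up values (the vectors `spend` and `year` are hypothetical) shows the difference it makes when a group contains a missing value:

```r
# Hypothetical toy data with one missing amount in 2016
spend <- c(10, NA, 5)
year  <- c(2016, 2016, 2017)

# Without na.rm, any group containing NA sums to NA
with_na <- tapply(spend, year, sum)
print(with_na)

# With na.rm=TRUE forwarded to sum, the NA is dropped before summing
without_na <- tapply(spend, year, sum, na.rm = TRUE)
print(without_na)
```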
In the 5000_transactions csv file, show the top 10 types of PRODUCT_NUM, according to the total amount of money spent on those products (i.e., according to the sum of the SPEND column for those 10 PRODUCT_NUM values).
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/8451/The_Complete_Journey_2_Master/5000_transactions.csv")
tail(sort(tapply(myDF$SPEND, myDF$PRODUCT_NUM, sum)), n=10)
89415    50032.42
8523     53845.65
1344763  58170.84
4889358  63823.61
85201    65605.34
766108   66085
74424    75787.49
85311    102928.59
1367192  111433.78
8511     131399.78
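Because sort orders values from smallest to largest by default, tail(..., n=10) returns the ten largest totals with the biggest one last. If a largest-first listing is preferred, one alternative is to sort in decreasing order and take the head. The sketch below uses made-up `spend` and `product` vectors (hypothetical, not from the transactions file) and keeps the top 2 rather than the top 10:

```r
# Hypothetical toy data
spend   <- c(4, 9, 1, 7, 3, 6)
product <- c("a", "b", "c", "a", "b", "d")

# Totals per product: a = 11, b = 12, c = 1, d = 6
totals <- tapply(spend, product, sum)

# Ascending sort, then take the last two (largest appears last)
top2_ascending <- tail(sort(totals), n = 2)
print(top2_ascending)

# Descending sort, then take the first two (largest appears first)
top2_descending <- head(sort(totals, decreasing = TRUE), n = 2)
print(top2_descending)
```

Both approaches select the same products; only the display order differs.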
In the 5000_transactions csv file, show the sum of the values in the SPEND column according to the 8 possible pairs of YEAR and STORE_R values.
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/8451/The_Complete_Journey_2_Master/5000_transactions.csv")
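# group SPEND by every (YEAR, STORE_R) pair and sum within each pair
# (stored as totals so the built-in sum function is not masked)
totals <- tapply(myDF$SPEND, list(myDF$YEAR, myDF$STORE_R), sum)
print(totals)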
     CENTRAL    EAST   SOUTH    WEST
2016 4471801 5829166 3996751 4754003
2017 4425505 5870281 3961170 4926104
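When INDEX is a list of two grouping vectors, tapply returns a matrix with one row per level of the first vector and one column per level of the second. A small sketch with made-up values (the vectors `spend`, `year`, and `region` are hypothetical) shows how to read individual cells from such a result:

```r
# Hypothetical toy data
spend  <- c(10, 20, 5, 15, 30, 25)
year   <- c(2016, 2016, 2017, 2017, 2016, 2017)
region <- c("EAST", "WEST", "EAST", "WEST", "EAST", "WEST")

# Rows are years, columns are regions
totals <- tapply(spend, list(year, region), sum)
print(totals)

# Look up one cell by row and column name (row names are character strings)
east_2016 <- totals["2016", "EAST"]   # 10 + 30
west_2017 <- totals["2017", "WEST"]   # 15 + 25
```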