Apply Functions
tapply
The documentation definition for tapply is a bit more specific than the others: the arguments are now (X, INDEX, FUN), where X is an object to which the split function applies, INDEX is a factor by which X is grouped, and FUN is a function, as before.
To simplify this definition, we can say tapply applies FUN to X when X is grouped by INDEX.
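As a minimal sketch of that definition, the example below groups a few made-up fare amounts by day and sums each group. The vectors amounts and days are hypothetical data invented for illustration. The second computation, using split and sapply, is included only to show that grouping by hand gives the same totals.

```r
# Hypothetical data: fares for five rides taken on two days.
amounts <- c(10.5, 7.25, 3.0, 12.0, 8.0)
days <- c("2018-01-01", "2018-01-02", "2018-01-01", "2018-01-02", "2018-01-02")

# X = amounts, INDEX = days, FUN = sum:
# group the amounts by day, then sum each group.
totals <- tapply(amounts, days, sum)
print(totals)

# The same grouping done by hand: split the amounts by day,
# then apply sum to each piece.
by_hand <- sapply(split(amounts, days), sum)
print(by_hand)
```

The result of tapply is named by the distinct values of INDEX, so a single day's total can be looked up with totals["2018-01-01"].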
Examples
Use the sapply function to run the following function on each month of 2018 to get the total amount of money spent on taxi cab rides each day. Then use the tapply function to add up the amounts spent per day.
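myfares <- function(mymonth) {
  # Read only the two columns used below: pickup datetime and total amount.
  myDF <- fread(paste0("/anvil/projects/tdm/data/taxi/yellow/yellow_tripdata_2018-", mymonth, ".csv"), select=c(2,17))
  # Sum the total amounts, grouped by pickup date.
  mytable <- tapply(myDF$total_amount, as.Date(myDF$tpep_pickup_datetime), sum)
  return(mytable)
}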
Solution
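library(data.table)

# Daily totals for one month of 2018 yellow taxi data.
myfares <- function(mymonth) {
  myDF <- fread(paste0("/anvil/projects/tdm/data/taxi/yellow/yellow_tripdata_2018-", mymonth, ".csv"), select=c(2,17))
  mytable <- tapply(myDF$total_amount, as.Date(myDF$tpep_pickup_datetime), sum)
  return(mytable)
}

# Run myfares on months "01" through "12".
myresults <- sapply( sprintf("%02d", 1:12), myfares )
# Drop the month names so that c() keeps only the dates as names.
names(myresults) <- NULL
# Combine the monthly tables into one named vector.
v <- do.call(c, myresults)
# Add up the amounts per day.
mytotals <- tapply(v, names(v), sum)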
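The combining steps at the end of the solution can be tried on a small scale without reading the taxi files. The sketch below uses a hypothetical two-month list shaped like the output of sapply over myfares, with invented dates and amounts. It assumes that a date can show up in more than one monthly result, which is the situation the final tapply handles.

```r
# Hypothetical per-month results: named vectors of daily totals,
# keyed by month, with one date appearing in both months.
myresults <- list(
  "01" = c("2018-01-30" = 100, "2018-01-31" = 200, "2018-02-01" = 5),
  "02" = c("2018-02-01" = 300, "2018-02-02" = 400)
)

# Without this step, c() would prefix each name with the month key,
# for example "01.2018-01-30".
names(myresults) <- NULL

# Flatten the list into one named vector; "2018-02-01" appears twice.
v <- do.call(c, myresults)

# Add up the entries that share a date.
mytotals <- tapply(v, names(v), sum)
print(mytotals)
```

Because the monthly results have different lengths, sapply cannot simplify them and returns a list, which is why do.call(c, ...) is needed to flatten it.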
sapply
sapply behaves identically to lapply unless the output can be simplified, in which case sapply performs that simplification: when every result has the same length, it returns a vector or matrix instead of a list. The following occurs when we run sapply in place of lapply on our squares vector.
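The difference between the two functions can be sketched as follows. The squares vector here is a hypothetical stand-in, since its original definition is not shown in this section, and sqrt is chosen only as a convenient function to apply.

```r
# Hypothetical stand-in for the squares vector mentioned above.
squares <- c(1, 4, 9, 16)

# lapply always returns a list, with one element per input.
as_list <- lapply(squares, sqrt)
str(as_list)

# sapply simplifies the same results to a numeric vector,
# because every result has length 1.
as_vector <- sapply(squares, sqrt)
print(as_vector)

# When the results have different lengths, sapply cannot simplify
# and returns a list, just like lapply.
ragged <- sapply(squares, function(x) seq_len(sqrt(x)))
str(ragged)
```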
Examples
Use the sapply function to run the following function on each month of 2018 to get the total amount of money spent on taxi cab rides each day.
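myfares <- function(mymonth) {
  # Read only the two columns used below: pickup datetime and total amount.
  myDF <- fread(paste0("/anvil/projects/tdm/data/taxi/yellow/yellow_tripdata_2018-", mymonth, ".csv"), select=c(2,17))
  # Sum the total amounts, grouped by pickup date.
  mytable <- tapply(myDF$total_amount, as.Date(myDF$tpep_pickup_datetime), sum)
  return(mytable)
}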
Solution
library(data.table)

# Daily totals for one month of 2018 yellow taxi data.
myfares <- function(mymonth) {
  myDF <- fread(paste0("/anvil/projects/tdm/data/taxi/yellow/yellow_tripdata_2018-", mymonth, ".csv"), select=c(2,17))
  mytable <- tapply(myDF$total_amount, as.Date(myDF$tpep_pickup_datetime), sum)
  return(mytable)
}

# Run myfares on months "01" through "12"; the result has one entry per month.
myresults <- sapply( sprintf("%02d", 1:12), myfares )
myresults