Apply Functions

tapply

The documentation for tapply is a bit more specific than for the other apply functions. Its arguments are (X, INDEX, FUN): X is an object to which the split function can be applied (typically a vector), INDEX is a factor (or a list of factors, each the same length as X) by which X is grouped, and FUN is the function to apply, as before.

To simplify this definition, we can say tapply applies FUN to each group of X, where the groups are defined by INDEX.
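
As a minimal sketch of this grouping behavior, the toy vectors below (hypothetical values and names, not taken from the reviews data) show tapply splitting a vector of scores by a factor and averaging each group.

```r
# Hypothetical toy data: five scores and the style each score belongs to
scores <- c(4.0, 3.0, 5.0, 4.0, 3.0)
styles <- factor(c("ale", "lager", "ale", "lager", "ale"))

# Apply mean to scores, grouped by styles:
# "ale" averages 4, 5, and 3; "lager" averages 3 and 4
group_means <- tapply(scores, styles, mean)
print(group_means)
```

The result has one entry per level of styles, named after that level: here the ale scores average to 4 and the lager scores to 3.5.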

Examples

Using the reviews_sample.csv file, show the three dates on which the mean score is 5.

Click to see solution
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/beer/reviews_sample.csv")

tail(sort(tapply(myDF$score, myDF$date, mean, na.rm=TRUE)), n=3)
2001-04-26 2001-06-18 2002-01-26
         5          5          5

Using the reviews_sample.csv file, show a table of the mean score for each year and month pair.

Click to see solution
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/beer/reviews_sample.csv")

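# Convert the date column to Date so format() can extract the year and month
myDF$date <- as.Date(myDF$date)

years <- format(myDF$date, "%Y")
months <- format(myDF$date, "%m")

# Rows are years, columns are months; each cell is the mean score for that pair
mean_scores <- tapply(myDF$score, list(years, months), mean)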

print(mean_scores)
           01       02       03       04       05       06       07       08
1998 3.770000 3.396667 4.092000 3.840000 3.702000 4.700000 3.100000 3.823333
1999       NA 3.613333       NA       NA 3.820000 3.850000 3.880000       NA
2000       NA 4.300000       NA 3.880000       NA 4.470000 3.995000       NA
2001 4.220000 4.488000 4.403333 3.053333       NA 4.012000 4.080000 3.905455
2002 4.246667 3.706000 3.933846 3.831224 3.887788 3.782655 3.950776 3.628201
2003 3.842596 3.921875 3.840573 3.929500 3.895977 3.768022 3.742609 3.710635
2004 3.892104 3.822910 3.757987 3.825360 3.826656 3.798576 3.816569 3.861793
2005 3.872065 3.805870 3.884944 3.806607 3.743355 3.859615 3.769045 3.784184
2006 3.821626 3.789613 3.803201 3.833529 3.816436 3.847766 3.799106 3.795228
2007 3.796619 3.820563 3.785231 3.820230 3.768441 3.721336 3.809563 3.710408
2008 3.897296 3.879322 3.825841 3.866337 3.819464 3.824667 3.833681 3.845346
2009 3.868856 3.839302 3.847518 3.846370 3.892921 3.872649 3.851616 3.850718
2010 3.810428 3.886246 3.884490 3.869777 3.838745 3.838772 3.806898 3.842232
2011 3.861355 3.839600 3.839057 3.841564 3.844314 3.840459 3.855617 3.809778
2012 3.827531 3.813721 3.842391 3.856536 3.843407 3.827998 3.843218 3.818722
2013 3.930060 3.922282 3.945560 3.949689 3.945230 3.883678 3.849965 3.847642
2014 3.894819 3.919469 3.923504 3.890891 3.886415 3.909815 3.872173 3.872730
2015 3.997097 3.996901 4.005002 3.991280 3.984110 3.979467 3.967579 3.963439
2016 3.986488 4.001558 3.987044 3.950565 3.970315 3.982854 3.987852 3.993576
2017 4.011244 4.036964 4.025383 4.010692 3.986720 3.978366 3.978893 3.998201
2018 4.025227 4.030995 4.013674 4.007635 3.999648 4.001002 3.948450 3.980969
           09       10       11       12
1998 3.355000 3.910000       NA 3.930000
1999       NA 3.500000 3.880000 4.000000
2000 3.885000 3.880000 4.670000 3.400000
2001 4.010556 3.948000 4.112069 3.851053
2002 3.798758 3.784247 3.885028 3.832537
2003 3.761452 3.771104 3.790879 3.802826
2004 3.802122 3.784444 3.741100 3.843094
2005 3.795644 3.782152 3.855852 3.860837
2006 3.826782 3.764831 3.802075 3.804746
2007 3.769330 3.826076 3.779580 3.834992
2008 3.824287 3.817620 3.841760 3.816298
2009 3.809730 3.862528 3.851910 3.860305
2010 3.844956 3.807355 3.844931 3.876926
2011 3.799865 3.854808 4.132859 4.013434
2012 3.826335 3.831577 3.869356 3.853065
2013 3.822392 3.860475 3.875164 3.885565
2014 3.885820 3.921727 3.933716 3.967668
2015 3.967281 3.967198 3.985564 3.993312
2016 4.005607 3.990623 4.007644 4.010456
2017 4.005115 4.002342 4.008391 4.045614
2018 3.992782       NA       NA       NA
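
When INDEX is a list of two factors, tapply returns a matrix with one row per level of the first factor and one column per level of the second, and it fills any combination that has no data with NA. That is presumably why some year and month cells in the table above are NA. A minimal sketch, using hypothetical toy dates and scores rather than the reviews data:

```r
# Hypothetical toy data: four reviews with dates and scores
review_dates <- as.Date(c("2001-01-15", "2001-01-20", "2001-03-02", "2002-01-09"))
review_scores <- c(4.0, 5.0, 3.0, 4.5)

# Build one grouping vector per dimension of the result
years <- format(review_dates, "%Y")
months <- format(review_dates, "%m")

# Rows are years, columns are months; each cell holds the mean score
toy_means <- tapply(review_scores, list(years, months), mean)
print(toy_means)
```

Here the cell for 2001 and month 01 is the mean of 4.0 and 5.0, while the cell for 2002 and month 03 is NA because no toy review falls in that pair.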