Apply Functions

tapply

The documentation definition for tapply is a bit more specific than the others, where the arguments are now (X, INDEX, FUN), with X being an object where the split function applies, INDEX is a factor by which X is grouped, and FUN is function as before.

To simplify this definition, we can say tapply applies FUN to X when X is grouped by INDEX.

Examples

In athlete_events.csv, find the average height of the athletes in each country (the country is the NOC column).

Click to see solution
# read in data
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/olympics/athlete_events.csv")

tapply(myDF$Height, myDF$NOC, sum, na.rm=TRUE)
    AFG     AHO     ALB     ALG     AND     ANG     ANT     ANZ     ARG     ARM
   9212    9042    9861   85255   23450   43660   20139    4595  403895   35935
    ARU     ASA     AUS     AUT     AZE     BAH     BAN     BAR     BDI     BEL
   7102    3689 1191954  575742   47670   62814    7977   33907    6560  316177
    BEN     BER     BHU     BIH     BIZ     BLR     BOH     BOL     BOT     BRA
  10421   24482    4829   20007   12779  297630    1044   19699   14768  597749
    BRN     BRU     BUL     BUR     CAF     CAM     CAN     CAY     CGO     CHA
  18383    1495  523332    6512    8426    9503 1422227   13104   14316    6003
    CHI     CHN     CIV     CMR     COD     COK     COL     COM     CPV     CRC
  94143  843337   28050   45145   11749    6339  166347    2563    3166   36604
    CRO     CRT     CUB     CYP     CZE     DEN     DJI     DMA     DOM     ECU
 153317       0  386897   33745  333355  323637    5376    3148   36510   44326
    EGY     ERI     ESA     ESP     EST     ETH     EUN     FIJ     FIN     FRA
 155129    6288   29334  766507  144338   59921  127136   36726  779838 1449532
    FRG     FSM     GAB     GAM     GBR     GBS     GDR     GEO     GEQ     GER
 581244    4031    9947    6822 1395588    3390  459949   49878    3916 1315364
    GHA     GRE     GRN     GUA     GUI     GUM     GUY     HAI     HKG     HON
  52686  339424    6306   56308   10719   14480   14656   12523  107634   24123
    HUN     INA     IND     IOA     IRI     IRL     IRQ     ISL     ISR     ISV
 831216   56946  152930   12038  119247  185445   28344   79023  102388   42704
    ITA     IVB     JAM     JOR     JPN     KAZ     KEN     KGZ     KIR     KOR
1410280    6788  143225   11950 1271132  237756  120731   35615    1836  668190
    KOS     KSA     KUW     LAO     LAT     LBA     LBR     LCA     LES     LIB
   1388   32378   38739    8488  140482    9771    9375    4722    8530   38660
    LIE     LTU     LUX     MAD     MAL     MAR     MAS     MAW     MDA     MDV
  48378  109806   73722   18868    2007  102420   78937   13276   39076    6878
    MEX     MGL     MHL     MKD     MLI     MLT     MNE     MON     MOZ     MRI
 418606   82368    2191   12767   11017   14613   17255   24324    9840   19352
    MTN     MYA     NAM     NBO     NCA     NED     NEP     NFL     NGR     NIG
   4861   15614   12273       0   13743  679545   13932     170  130864    4310
    NOR     NRU     NZL     OMA     PAK     PAN     PAR     PER     PHI     PLE
 593994    1839  368114   10858   63825   20454   17835   70980   89561    3054
    PLW     PNG     POL     POR     PRK     PUR     QAT     RHO     ROT     ROU
   3971   17213 1007774  194348  104528  129662   30672    2032    2039  604973
    RSA     RUS     RWA     SAA     SAM     SCG     SEN     SEY     SGP     SKN
 192186  853054    6045     479    8042   57566   61441   17550   44651    6833
    SLE     SLO     SMR     SOL     SOM     SRB     SRI     SSD     STP     SUD
  17881  182898   27715    2200    4752   72377   20864     520    2585   15999
    SUI     SUR     SVK     SWE     SWZ     SYR     TAN     TCH     TGA     THA
 734529    9768  185297  967471    9434   13433   25276  481899    6455  113134
    TJK     TKM     TLS     TOG     TPE     TTO     TUN     TUR     TUV     UAE
  11347    9182    1453    9153  178782   61587   94263  167905     663   22445
    UAR     UGA     UKR     UNK     URS     URU     USA     UZB     VAN     VEN
    362   38369  425540       0  869189   60842 2622879   80615    5248  137748
    VIE     VIN     VNM     WIF     YAR     YEM     YMD     YUG     ZAM     ZIM
  24110    4072    6299    3560    1674    4199     350  294766   22122   49929

Find the average height of the athletes in each sport. After finding these average heights, please sort your results.

Click to see solution
sort(tapply(myDF$HEIGHT, myDF$SPORT, mean, na.rm=TRUE), decreasing = TRUE)
               Basketball                Volleyball          Beach Volleyball
                 190.8699                  186.9948                  186.1450
               Water Polo                    Rowing                  Handball
                 184.8346                  184.1722                  183.3846
                 Baseball                Tug-Of-War                 Bobsleigh
                 182.5993                  182.4800                  181.4375
             Motorboating                Ice Hockey                    Tennis
                 181.0000                  178.9013                  178.8981
                 Swimming                  Canoeing              Jeu De Paume
                 178.5625                  178.5396                  178.5000
                  Sailing         Modern Pentathlon                   Fencing
                 178.2622                  177.9443                  177.1642
                Taekwondo                      Luge           Nordic Combined
                 176.7508                  176.6577                  176.5045
              Ski Jumping                 Athletics                  Skeleton
                 176.4028                  176.2563                  176.1886
                  Cycling                     Rugby                  Racquets
                 176.1088                  176.0968                  176.0000
                     Polo                  Football              Rugby Sevens
                 175.5000                  175.4022                  175.3636
         Art Competitions             Equestrianism                      Golf
                 174.6441                  174.3753                  174.2901
                  Curling                      Judo                 Badminton
                 174.2031                  174.1874                  174.1788
            Speed Skating                  Biathlon                  Lacrosse
                 174.0834                  174.0348                  174.0000
                Triathlon                  Shooting             Alpine Skiing
                 173.6458                  173.5722                  173.4891
                   Hockey      Cross Country Skiing                   Archery
                 173.3597                  173.2492                  173.2031
             Snowboarding                    Boxing                 Wrestling
                 173.0856                  172.8257                  172.3586
             Table Tennis          Freestyle Skiing Short Track Speed Skating
                 171.2538                  171.0130                  170.1082
                 Softball     Synchronized Swimming            Figure Skating
                 169.3951                  168.4815                  168.2022
      Rhythmic Gymnastics             Weightlifting                    Diving
                 167.8703                  167.8248                  166.6343
             Trampolining                Gymnastics
                 166.5828                  162.9360

Which sport are the athletes the tallest (on average)? Does this make sense intuitively, i.e., is height an advantage in this sport?

Click to see solution
# just the tallest on average sport, sepperated for clarity
head(sort(tapply(myDF$Height, myDF$Sport, mean, na.rm=TRUE), decreasing = TRUE), n=1)
Basketball: 190.869878897191