R base functions

table

table is a function used to build a contingency table, which is a table that shows counts for categorical data, from one or more categories. prop.table is a function that accepts table output, returning proportions of the counts.

Examples

In the Olympics data, which value appears in the "NOC" column the most times?

Click to see solution
myDF <- read.csv("/anvil/projects/tdm/data/olympics/athlete_events.csv")

head(sort(table(myDF$NOC), decreasing=TRUE), n=1)
 USA
18853

Which value appears in the "Name" column the most times?

Click to see solution
head(sort(table(myDF$Name), decreasing=TRUE), n=1)
Robert Tait McKenzie
                  58

How many rows correspond exactly to team "Denmark"?

Click to see solution
table(myDF$Team)['Denmark']
Denmark: 3424

grep & grepl

grep stands for " globally search for a regular expression and print all matches," just as in UNIX. The function allows you to use regular expressions to search for a pattern in a vector of strings or characters, and returns the index (indices) of the match(es).

Additionally, the function grepl (derived from grep-logical) uses the same inputs, but returns a logical vector, where TRUE indicates a match at that index, and FALSE indicates the opposite.

grep and grepl can be used on individual strings, though they match the entire string, not the index of the character that matches the regular expression. For grep, a hit would return [1] 1 and a miss would return integer(0). For grepl, you still get TRUE or FALSE.

Examples

How many rows have "Denmark" in the team name ("Denmark" may or may not be the exact team name)?

Click to see solution
table(grepl("Denmark", myDF$Team))["TRUE"]
TRUE: 3496

Find the names of the teams that have "Denmark" in the team name but are not exactly "Denmark".

Click to see solution
myDF$Team[grepl("Denmark", myDF$Team) & myDF$Team != "Denmark"]
    'Denmark/Sweden'
    'Denmark-2'
    'Denmark-1'
    'Denmark-1'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-2'
    'Denmark-2'
    'Miss Denmark 1964'
    'Denmark-1'
    'Denmark-1'
    'Denmark-2'
    'Denmark-2'
    'Denmark-3'
    'Denmark-1'
    'Denmark-2'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark-1'
    'Denmark-1'
    'Denmark-2'
    'Denmark-2'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-2'
    'Denmark-2'
    'Denmark-2'
    'Denmark-4'
    'Denmark-2'
    'Denmark-1'
    'Denmark/Sweden'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark-1'
    'Denmark-1'
    'Denmark-1'
    'Miss Denmark 1964'
    'Denmark-1'
    'Denmark-1'
    'Denmark-1'
    'Denmark-3'
    'Denmark-2'
    'Denmark-2'
    'Denmark-2'
    'Denmark-1'
    'Denmark/Sweden'
    'Denmark/Sweden'
    'Denmark-2'
    'Denmark/Sweden'
    'Denmark-1'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark-4'
    'Denmark-1'
    'Denmark-2'
    'Denmark-1'
    'Denmark/Sweden'