R base functions
table
table is a function used to build a contingency table, which is a table of counts for categorical data, from one or more categorical variables. prop.table is a function that accepts table output and returns the counts as proportions.
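As a minimal sketch of how the two functions fit together, the snippet below uses a small made-up vector (the name states and its values are hypothetical and do not come from the election data):

```r
# Hypothetical toy data, not taken from itcont1980.txt
states <- c("IN", "CA", "IN", "NY", "CA", "IN")

# table() counts how many times each distinct value occurs
counts <- table(states)
print(counts)   # CA: 2, IN: 3, NY: 1

# prop.table() divides each count by the total, so the results sum to 1
props <- prop.table(counts)
print(props)

# Sorting the table puts the most frequent value first
print(head(sort(counts, decreasing = TRUE), n = 1))   # IN: 3
```

The same pattern, applied to a column of a data frame, is what the example below relies on.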
Examples
Which value appears most often in the "STATE" column of itcont1980.txt?
library(data.table)
myDF <- fread("/anvil/projects/tdm/data/election/itcont1980.txt", quote="")
names(myDF) <- c("CMTE_ID", "AMNDT_IND", "RPT_TP", "TRANSACTION_PGI", "IMAGE_NUM", "TRANSACTION_TP", "ENTITY_TP", "NAME", "CITY", "STATE", "ZIP_CODE", "EMPLOYER", "OCCUPATION", "TRANSACTION_DT", "TRANSACTION_AMT", "OTHER_ID", "TRAN_ID", "FILE_NUM", "MEMO_CD", "MEMO_TEXT", "SUB_ID")
head(sort(table(myDF$STATE), decreasing=TRUE), n=1)
CA 3706
paste and paste0
paste is a function that converts vector elements to character strings and then concatenates them. It has a sep argument (default sep = " ") where the user can supply a string to place between the strings being pasted together. paste0 is a version of paste whose sep argument is "", meaning the strings are joined with no characters in between.
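To illustrate the difference in separators, here is a short sketch using two made-up vectors (city and state are hypothetical names and values chosen only for illustration):

```r
# Hypothetical toy vectors, not taken from itcont1980.txt
city  <- c("WEST LAFAYETTE", "CHICAGO")
state <- c("IN", "IL")

# Default separator is a single space
with_space <- paste(city, state)
print(with_space)   # "WEST LAFAYETTE IN" "CHICAGO IL"

# A custom separator is supplied through sep
with_comma <- paste(city, state, sep = ", ")
print(with_comma)   # "WEST LAFAYETTE, IN" "CHICAGO, IL"

# paste0 joins with no separator; non-character input is converted first
ids <- paste0("ID", 1:3)
print(ids)          # "ID1" "ID2" "ID3"
```

Both functions work element by element, so pasting two columns of equal length yields one combined string per row.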
Examples
Use the paste command to join the "CITY" and "STATE" columns, with the goal of determining the top 5 city-and-state locations where donations were made.
# n=6 rather than 5: one of the top entries (", ") has a blank city and state
head(sort(table(paste(myDF$CITY, myDF$STATE, sep=", ")), decreasing=TRUE), n=6)
   NEW YORK, NY               ,      HOUSTON, TX      DALLAS, TX  WASHINGTON, DC
          13862           11582            10146            6438            5890
LOS ANGELES, CA
           5866