Reading and Writing Data
If you are working with datasets in R, there are various ways to handle them. One of the most common tasks is importing your existing data into R. To do this, you first need to know your current working environment. You can use the following command to check your working directory:
getwd()
This command will return the path of your current working directory, which is where R looks for files by default when reading or saving data.
Also, .
means the current working directory.
Change your working directory:
setwd("the place you want to go!")
How do I read a csv file called grades.csv
into a data.frame?
If we were in /home/john/projects
, that is our current working directory.
./grades.csv
is the same as /home/john/projects/grades.csv
.
In this case, ./grades.csv
is the relative path.
dat <- read.csv("./grades.csv")
head(dat)
grade year
1 100 junior
2 99 sophomore
3 75 sophomore
4 74 sophomore
5 44 senior
6 69 junior
How do I read a csv file called grades.csv
into a data.frame using the function fread
?
Note: The fread
function is part of the data.table
package. It reads in datasets faster than read.csv
. You are strongly encouraged to use fread
to read large datasets in R.
library(data.table)
dat <- data.frame(fread("./grades.csv"))
head(dat)
grade year
1 100 junior
2 99 sophomore
3 75 sophomore
4 74 sophomore
5 44 senior
6 69 junior
How do I read a csv file called grades2.csv
where the data is separated by semi-colons (;) in a data.frame?
CSV stands for comma-separated values. So, read.csv has comma (,) as the default value for the sep parameter.
|
dat <- read.csv("./grades_semi.csv", sep=";")
head(dat)
grade year
1 100 junior
2 99 sophomore
3 75 sophomore
4 74 sophomore
5 44 senior
6 69 junior
How do I prevent R from reading in strings as factors when using a function like read.csv
?
In R 4.0+, strings are not read in as factors, so you don’t need to worry about this.
For R < 4.0 (any older R version than 4.0), use stringsAsFactors
.
dat <- read.csv("./grades.csv", stringsAsFactors=F)
head(dat)
grade year
1 100 junior
2 99 sophomore
3 75 sophomore
4 74 sophomore
5 44 senior
6 69 junior
How do I specify the type of 1 or more columns when reading in a csv file?
dat <- read.csv("./grades.csv", colClasses=c("grade"="character", "year"="factor"))
str(dat)
'data.frame': 10 obs. of 2 variables:
$ grade: chr "100" "99" "75" "74" ...
$ year : Factor w/ 4 levels "freshman","junior",..: 2 4 4 4 3 2 2 3 1 2
Given a list of csv files with the same columns, how can I read in and combine them into a one dataframe?
# We want to read in grades.csv, grades2.csv, and grades3.csv
# into a single dataframe.
list_of_files <- c("grades.csv", "grades2.csv", "grades3.csv")
results <- data.frame()
for (file in list_of_files) {
dat <- read.csv(file)
results <- rbind(results, dat)
}
dim(results)
[1] 32 2