str

Basics

str stands for structure, and is a function that allows you to look at the inner workings of your desired variable.

Though str is an abbreviation for String in other languages and the function strsplit, str by itself has nothing to do with Strings


Examples

How do I get the number of rows and columns in the mtcars dataset?

Click to see solution

mtcars is a dataset built-in to R, so we don’t have to do anything special before analyzing it with str. You can look at all the built-in R datasets by executing data() in the console.

str(mtcars)
'data.frame':	32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

We can see there are 32 observations (rows) of 11 variables (columns), so we have 32 rows and 11 columns.

What are the column types in Loblolly? How are they measuring tree age?

Click to see solution
str(Loblolly)
Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame':	84 obs. of  3 variables:
 $ height: num  4.51 10.89 28.72 41.74 52.7 ...
 $ age   : num  3 5 10 15 20 25 3 5 10 15 ...
 $ Seed  : Ord.factor w/ 14 levels "329"<"327"<"325"<..: 10 10 10 10 10 10 13 13 13 13 ...
 - attr(*, "formula")=Class 'formula'  language height ~ age | Seed
  .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
 - attr(*, "labels")=List of 2
  ..$ x: chr "Age of tree"
  ..$ y: chr "Height of tree"
 - attr(*, "units")=List of 2
  ..$ x: chr "(yr)"
  ..$ y: chr "(ft)"

Loblolly contains all numbers, and while we have two numeric variables in height and age, Seed is an ordered factor, indicating that the numbers in Seed function as labels.

As for age, our "labels" section shows us $ x refers to the age of the tree, and going to the corresponding row in "units" tells us that age is measured in years.

Just by glancing at the data and the str output, we see that the experiment involved height measurements on 14 different seed types after 3, 5, 10, 15, 20, and 25 years.