order

Basics

order is a function similar to sort: both arrange vectors in ascending or descending order. When order is given more than one vector, it uses each additional vector to break ties in the ones before it, making it the preferred method for sorting data.frames and matrices by one or more columns.

Unlike sort, order returns the indices of the sorted elements rather than the elements themselves, so indexing the original object with the output of order gives us our desired output.
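The difference between the two functions is easiest to see on a small vector. The following is a minimal sketch; the names x, grp, and val are made up for illustration and do not come from the examples below.

```r
x <- c(5, 2, 9, 2)

sort(x)       # the sorted values themselves
order(x)      # the positions of those values in x
x[order(x)]   # indexing x by order(x) reproduces sort(x)

# Descending order
x[order(x, decreasing = TRUE)]

# Ties in the first argument are broken by the second argument
grp <- c("b", "a", "b", "a")
val <- c(2, 4, 1, 3)
order(grp, val)   # "a" entries first, each group ordered by val
```

The last call is the pattern used for data.frames: each argument is a column, and later columns only matter where earlier ones are tied.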


Examples

Given a matrix, arrange it in ascending order using the first column.

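# matrix() fills column by column, so each line below is one column
my_mat <- matrix(c(1, 5, 0, 2,
                   10, 1, 2, 8,
                   9, 1, 0, 2), ncol = 3)
# Reorder the rows by the values in the first column
my_mat[order(my_mat[, 1]), ]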
     [,1] [,2] [,3]
[1,]    0    2    0
[2,]    1   10    9
[3,]    2    8    2
[4,]    5    1    1

Great, we got what we wanted! But what’s with all the brackets and commas? Let’s work from the inside to understand what’s going on:

my_mat[,1] uses column-specific indexing to get the matrix’s first column.

order(my_mat[,1]) sorts the first column in ascending order, then returns the indices of those values in the original my_mat. The output of this code is:

[1] 3 1 4 2

We can quickly verify the output by noting that column 1 sorted in ascending order is 0, 1, 2, 5. Those values are located at indices 3, 1, 4, and 2.

Finally, we note that the output of order(my_mat[,1]) is a vector, which we’ll call x for now. This means our final line of code is equivalent to my_mat[x, ]. Once again recalling indexing rules, we are asking R to display the 3rd, 1st, 4th, and 2nd rows of my_mat, in that order, which is our output in the example. This whole process is condensed into one line. Thanks, R!
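The walkthrough above can be unpacked into one step per line. This is only a sketch of the same computation; first_col and sorted_mat are names introduced here for illustration.

```r
my_mat <- matrix(c(1, 5, 0, 2,
                   10, 1, 2, 8,
                   9, 1, 0, 2), ncol = 3)

first_col <- my_mat[, 1]     # step 1: pull out the first column
x <- order(first_col)        # step 2: row indices that sort that column
sorted_mat <- my_mat[x, ]    # step 3: reorder the rows with those indices

first_col
x
sorted_mat

# The same idea with the rows in descending order of the first column
my_mat[order(my_mat[, 1], decreasing = TRUE), ]
```

Storing the indices in x first can be handy when the same ordering needs to be applied to more than one object.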

In the Orange data set, order by age and circumference to compare tree size at each stage.

Keep in mind that order goes left-to-right through its arguments when evaluating ties. If we don’t also include Orange$circumference when ordering by Orange$age, nothing breaks the ties in age, so order leaves the tied rows in their original order. Because the rows of Orange are stored tree by tree, that won’t give us a conclusive pattern in terms of growth: we’ll just get trees 1, 2, 3, 4, and 5 repeated.
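To see the tie behavior for ourselves, we can order by age alone and look at which trees come out first. This is a small sketch; by_age is a name introduced here for illustration, and Orange is the data set that ships with R.

```r
# Order by age only: nothing breaks the ties within each age
by_age <- Orange[order(Orange$age), ]

# Tied rows keep their original order, so the trees repeat 1 through 5
as.character(by_age$Tree[1:10])

head(by_age, 10)
```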

Orange[order(Orange$age, Orange$circumference),]
   Tree  age circumference
1     1  118            30
15    3  118            30
29    5  118            30
22    4  118            32
8     2  118            33
30    5  484            49
16    3  484            51
2     1  484            58
23    4  484            62
9     2  484            69
17    3  664            75
31    5  664            81
3     1  664            87
10    2  664           111
24    4  664           112
18    3 1004           108
4     1 1004           115
32    5 1004           125
11    2 1004           156
25    4 1004           167
19    3 1231           115
5     1 1231           120
33    5 1231           142
12    2 1231           172
26    4 1231           179
20    3 1372           139
6     1 1372           142
34    5 1372           174
13    2 1372           203
27    4 1372           209
21    3 1582           140
7     1 1582           145
35    5 1582           177
14    2 1582           203
28    4 1582           214

At each of the seven ages, from youngest to oldest, the ranking of trees from smallest to largest circumference is:

  • 1, 3, 5, 4, 2

  • 5, 3, 1, 4, 2

  • 3, 5, 1, 2, 4

  • 3, 1, 5, 2, 4

  • 3, 1, 5, 2, 4

  • 3, 1, 5, 2, 4

  • 3, 1, 5, 2, 4

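A pattern emerges from the 4th measurement onward: the ranking settles at 3, 1, 5, 2, 4, which gives us a general ranking for tree size. Helpfully, Orange$Tree is an ordered factor, and the same ordering appears in the Levels line printed at the end of its output.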
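Reading the rankings off a 35-row printout is error-prone, so it can help to let R build them. The sketch below is one possible way to do it, not the only one; sorted_orange and rankings are names introduced here for illustration.

```r
# Same ordering as in the solution above
sorted_orange <- Orange[order(Orange$age, Orange$circumference), ]

# Split the tree labels by age, then paste each group into one string
trees_by_age <- split(as.character(sorted_orange$Tree), sorted_orange$age)
rankings <- sapply(trees_by_age, paste, collapse = ", ")
rankings   # one ranking per age, named by age

# The factor levels of Tree carry the general ranking
levels(Orange$Tree)   # "3" "1" "5" "2" "4"
```

Because split keeps the rows of each group in the order they arrive, sorting first and splitting second preserves the smallest-to-largest ranking within every age.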