How to select the top rows of an R data frame based on groups of a factor column?



The head function shows the first rows of an R data frame, but it works on the whole data frame and ignores the groups defined by a factor column. When a data frame has many rows per group, head alone is therefore not enough; we need to extract the first rows of each group separately. This can be done with the by function: pass it the data frame, the factor column selected with single square brackets (for example iris["Species"]), the head function, and the number of rows n. Note that head returns the first rows of each group in their existing order, not the rows with the largest values.
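As a minimal sketch of this approach, the following uses a small made-up data frame. The names df, group, value and top2 are illustrative assumptions, not part of the original examples.

```r
# Hypothetical toy data frame; df, group and value are illustrative names only
df <- data.frame(
  group = factor(c("a", "a", "a", "b", "b", "b")),
  value = c(10, 20, 30, 40, 50, 60)
)

# by() splits df by the factor column and applies head() to each piece;
# n = 2 is passed through to head()
top2 <- by(df, df["group"], head, n = 2)
top2

# Each element of the result is itself a data frame,
# which can be extracted by its group name
top2[["a"]]
```

Selecting the factor column with single square brackets (df["group"]) keeps it as a one-column data frame, so the column name is used as the label in the printed output.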

Example

data(iris)
str(iris)
'data.frame': 150 obs. of 5 variables:
 $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
Top5_Based_on_Species<-by(iris,iris["Species"],head,n=5)
Top5_Based_on_Species

Output

Species: setosa
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
------------------------------------------------------------
Species: versicolor
   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
51          7.0         3.2          4.7         1.4 versicolor
52          6.4         3.2          4.5         1.5 versicolor
53          6.9         3.1          4.9         1.5 versicolor
54          5.5         2.3          4.0         1.3 versicolor
55          6.5         2.8          4.6         1.5 versicolor
------------------------------------------------------------
Species: virginica
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
101          6.3         3.3          6.0         2.5 virginica
102          5.8         2.7          5.1         1.9 virginica
103          7.1         3.0          5.9         2.1 virginica
104          6.3         2.9          5.6         1.8 virginica
105          6.5         3.0          5.8         2.2 virginica

Example

data(ToothGrowth)
str(ToothGrowth)
'data.frame': 60 obs. of 3 variables:
 $ len : num 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
Top10_Based_on_Supp<-by(ToothGrowth,ToothGrowth["supp"],head,n=10)
Top10_Based_on_Supp

Output

supp: OJ
    len supp dose
31 15.2   OJ  0.5
32 21.5   OJ  0.5
33 17.6   OJ  0.5
34  9.7   OJ  0.5
35 14.5   OJ  0.5
36 10.0   OJ  0.5
37  8.2   OJ  0.5
38  9.4   OJ  0.5
39 16.5   OJ  0.5
40  9.7   OJ  0.5
------------------------------------------------------------
supp: VC
    len supp dose
1   4.2   VC  0.5
2  11.5   VC  0.5
3   7.3   VC  0.5
4   5.8   VC  0.5
5   6.4   VC  0.5
6  10.0   VC  0.5
7  11.2   VC  0.5
8  11.2   VC  0.5
9   5.2   VC  0.5
10  7.0   VC  0.5

Example

data(CO2)
str(CO2)
Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame': 84 obs. of 5 variables:
 $ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
 $ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
 $ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
 $ conc : num 95 175 250 350 500 675 1000 95 175 250 ...
 $ uptake : num 16 30.4 34.8 37.2 35.3 39.2 39.7 13.6 27.3 37.1 ...
 - attr(*, "formula")=Class 'formula' language uptake ~ conc | Plant
 .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
 - attr(*, "outer")=Class 'formula' language ~Treatment * Type
 .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv>
 - attr(*, "labels")=List of 2
 ..$ x: chr "Ambient carbon dioxide concentration"
 ..$ y: chr "CO2 uptake rate"
 - attr(*, "units")=List of 2
 ..$ x: chr "(uL/L)"
 ..$ y: chr "(umol/m^2 s)"
Top5_Based_on_Treatment<-by(CO2,CO2["Treatment"],head,n=5)
Top5_Based_on_Treatment

Output

Treatment: nonchilled
  Plant   Type  Treatment conc uptake
1   Qn1 Quebec nonchilled   95   16.0
2   Qn1 Quebec nonchilled  175   30.4
3   Qn1 Quebec nonchilled  250   34.8
4   Qn1 Quebec nonchilled  350   37.2
5   Qn1 Quebec nonchilled  500   35.3
------------------------------------------------------------
Treatment: chilled
   Plant   Type Treatment conc uptake
22   Qc1 Quebec   chilled   95   14.2
23   Qc1 Quebec   chilled  175   24.1
24   Qc1 Quebec   chilled  250   30.3
25   Qc1 Quebec   chilled  350   34.6
26   Qc1 Quebec   chilled  500   32.5
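
The by function returns a list-like object with one data frame per group. If a single data frame is more convenient for further work, the pieces can be stacked back together. The following is a sketch of one way to do this in base R, using the iris data; the names top5 and top5_df are illustrative assumptions.

```r
data(iris)

# First five rows of each Species group, as in the first example
top5 <- by(iris, iris["Species"], head, n = 5)

# Stack the per-group data frames into one data frame;
# rbind() is called with every group's data frame as an argument
top5_df <- do.call(rbind, top5)

# Five rows per species, fifteen rows in total
table(top5_df$Species)
nrow(top5_df)
```

The row names of the combined data frame are built from the group names, so it remains possible to tell which group each row came from.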
Updated on: 2020-08-21T12:33:35+05:30
