How to create a subset for a factor level in an R data frame?



In data analysis, we often work with factor variables, each of which has several levels. Sometimes we want a subset of an R data frame that contains only the rows for one specific factor level, so that the analysis is restricted to that level. This can be done with the subset function, which takes the data frame and a logical condition on the factor column.
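As a minimal sketch of the general pattern, the snippet below uses R's built-in iris data set rather than the data from this article. The object name setosa is only illustrative.

```r
# iris ships with base R; Species is a factor with three levels.
levels(iris$Species)

# Keep only the rows where Species equals the level "setosa".
setosa <- subset(iris, Species == "setosa")

# Each species has 50 rows in iris, so the subset has 50 rows.
nrow(setosa)
```

The first argument is the data frame, and the second is a condition evaluated on its columns, so the column name can be written without the iris$ prefix.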

Example

Consider the below data frame −

> set.seed(99)
> Factor<-rep(c("India","China","USA","UK","Canada"),times=4)
> Percentage<-sample(1:100,20)
> df<-data.frame(Factor,Percentage)
> df
   Factor Percentage
1   India         48
2   China         33
3     USA         44
4      UK         22
5  Canada         62
6   India         32
7   China         13
8     USA         20
9      UK         31
10 Canada         68
11  India          9
12  China         82
13    USA         88
14     UK         30
15 Canada         86
16  India         84
17  China         95
18    USA         14
19     UK          4
20 Canada         78

Here, the variable Factor has five levels. To create a subset of the data frame for each of these levels, call subset with a condition on Factor, as shown below −

> India<-subset(df,Factor=="India")
> India
   Factor Percentage
1   India         48
6   India         32
11  India          9
16  India         84
> UK<-subset(df,Factor=="UK")
> UK
   Factor Percentage
4      UK         22
9      UK         31
14     UK         30
19     UK          4
> China<-subset(df,Factor=="China")
> China
   Factor Percentage
2   China         33
7   China         13
12  China         82
17  China         95
> USA<-subset(df,Factor=="USA")
> USA
   Factor Percentage
3     USA         44
8     USA         20
13    USA         88
18    USA         14
> Canada<-subset(df,Factor=="Canada")
> Canada
   Factor Percentage
5  Canada         62
10 Canada         68
15 Canada         86
20 Canada         78
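
Writing one subset call per level becomes repetitive as the number of levels grows. A possible alternative, sketched here as a suggestion rather than as part of the original example, is base R's split function, which returns a named list with one data frame per level. The data frame is rebuilt from the values printed above so that the sketch does not depend on the random number generator. The names by_level and india are illustrative, and stringsAsFactors = TRUE is set on the assumption that Factor should be a true factor (since R 4.0.0, data.frame keeps strings as character by default).

```r
# Rebuild the example data from the printed values (no dependence on sample()).
df <- data.frame(
  Factor = rep(c("India", "China", "USA", "UK", "Canada"), times = 4),
  Percentage = c(48, 33, 44, 22, 62, 32, 13, 20, 31, 68,
                 9, 82, 88, 30, 86, 84, 95, 14, 4, 78),
  stringsAsFactors = TRUE  # make Factor a factor even on R >= 4.0
)

# One data frame per level, stored in a list named by level.
by_level <- split(df, df$Factor)
by_level$India

# subset() keeps all five levels on the result's factor column;
# droplevels() removes the levels that no longer occur.
india <- droplevels(subset(df, Factor == "India"))
nlevels(india$Factor)
```

Dropping unused levels matters mainly for later steps such as table or plotting functions, which would otherwise still report the empty levels.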
Updated on: 2020-08-12T12:24:52+05:30
