How to create group names for consecutive duplicate values in an R data frame column?



Values in a column can be grouped in many ways. One of them is to treat each run of consecutive identical values as a group, so that a new group starts whenever the value differs from the one in the previous row. If every value differs from its neighbour, each row becomes its own group and the grouping adds little, but when values repeat in runs the grouping is useful. For this purpose, we can use the rleid function from the data.table package and paste a prefix onto the ids it returns, as shown in the examples below.
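Before turning to random data, it may help to see what rleid returns on a small fixed vector. The following is a minimal sketch, assuming the data.table package is installed; the names v, ids and grp are illustrative only.

```r
# rleid() is provided by the data.table package
library(data.table)

# A small fixed vector so the result is reproducible
v <- c(2, 2, 1, 1, 1, 2, 0, 0)

# The id increases by one each time the value differs from the previous element
ids <- rleid(v)
ids
# [1] 1 1 2 2 2 3 4 4

# Paste a prefix onto the ids to get readable group names
grp <- paste0("Grp", ids)
grp
# [1] "Grp1" "Grp1" "Grp2" "Grp2" "Grp2" "Grp3" "Grp4" "Grp4"
```

Note that the second run of 2 gets a new id (3) rather than reusing id 1: the ids identify consecutive runs, not distinct values.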

Example1

Consider the data frame below. Because sample() draws values at random, your values will differ unless you call set.seed() first −

> x <- sample(0:2, 20, replace = TRUE)
> df1 <- data.frame(x)
> df1

Output

   x
1  2
2  1
3  2
4  2
5  1
6  0
7  1
8  1
9  1
10 1
11 0
12 0
13 1
14 2
15 1
16 0
17 1
18 0
19 1
20 2

Creating the groups for values in x −

> library(data.table)
> df1$Grp <- paste0("Grp", rleid(df1$x))
> df1

Output

   x   Grp
1  2  Grp1
2  1  Grp2
3  2  Grp3
4  2  Grp3
5  1  Grp4
6  0  Grp5
7  1  Grp6
8  1  Grp6
9  1  Grp6
10 1  Grp6
11 0  Grp7
12 0  Grp7
13 1  Grp8
14 2  Grp9
15 1 Grp10
16 0 Grp11
17 1 Grp12
18 0 Grp13
19 1 Grp14
20 2 Grp15
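If data.table is not available, the same run ids can be built in base R with rle(). The sketch below is an alternative to the rleid call, not the method used in this article. It uses the first eight values of x from the output above so the result can be compared, and the names r, ids and df_base are illustrative only.

```r
# First eight values of x, copied from the output above
x <- c(2, 1, 2, 2, 1, 0, 1, 1)

# rle() returns the length and value of each run of consecutive equal values
r <- rle(x)

# Repeat each run's index as many times as the run is long
ids <- rep(seq_along(r$lengths), times = r$lengths)
ids
# [1] 1 2 3 3 4 5 6 6

# Attach the group names as a new column
df_base <- data.frame(x, Grp = paste0("Grp", ids))
df_base
```

The Grp column matches rows 1 to 8 of the rleid result shown above.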

Example2

> y <- sample(0:1, 20, replace = TRUE)
> df2 <- data.frame(y)
> df2

Output

   y
1  0
2  1
3  0
4  1
5  1
6  1
7  0
8  0
9  0
10 1
11 0
12 0
13 0
14 0
15 0
16 1
17 1
18 1
19 1
20 0

Creating the groups for values in y −

> df2$Category <- paste0("Category#", rleid(df2$y))
> df2

Output

   y   Category
1  0 Category#1
2  1 Category#2
3  0 Category#3
4  1 Category#4
5  1 Category#4
6  1 Category#4
7  0 Category#5
8  0 Category#5
9  0 Category#5
10 1 Category#6
11 0 Category#7
12 0 Category#7
13 0 Category#7
14 0 Category#7
15 0 Category#7
16 1 Category#8
17 1 Category#8
18 1 Category#8
19 1 Category#8
20 0 Category#9
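To check the result independently of the random draw, the y values printed above can be typed in directly. This is a minimal sketch, assuming data.table is installed; ids and run_lengths are illustrative names. It also uses tabulate() to count the rows in each group.

```r
library(data.table)

# The 20 values of y exactly as printed in the output above
y <- c(0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0)

# One id per run of consecutive equal values
ids <- rleid(y)
max(ids)
# [1] 9

# Number of rows in each consecutive group, in group order
run_lengths <- tabulate(ids)
run_lengths
# [1] 1 1 1 3 3 1 5 4 1
```

The nine ids correspond to Category#1 through Category#9, and the run lengths sum to the 20 rows of the data frame.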
Updated on: 2021-03-05T06:12:19+05:30
