How to select rows of an R data frame that are non-NA?



To select the rows of an R data frame that contain no NA values, we can use the complete.cases function with single square brackets. complete.cases returns a logical vector with one element per row, which is TRUE where the row has no missing values. For example, if we have a data frame called df that contains some missing values (NA), then the rows without any NA can be selected with the command df[complete.cases(df),].
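The two steps described above can be sketched on a small, hand-written data frame. The data frame df and the names keep and result are hypothetical and used only for illustration.

```r
# Hypothetical data frame used only for illustration
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c("x", "y", NA, "z")
)

# complete.cases() returns one logical value per row:
# TRUE when the row has no NA in any column
keep <- complete.cases(df)
print(keep)

# Use the logical vector as the row index; leaving the column
# index empty after the comma keeps all columns
result <- df[keep, ]
print(result)
```

Here rows 2 and 3 each contain an NA, so only rows 1 and 4 are kept, and they keep their original row names.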

Example1

Consider the following data frame −

> x1<-sample(c(1,NA),20,replace=TRUE)
> x2<-sample(c(5,NA),20,replace=TRUE)
> x3<-sample(c(3,NA),20,replace=TRUE)
> df1<-data.frame(x1,x2,x3)
> df1

Output

   x1 x2 x3
1   1 NA NA
2  NA  5  3
3   1  5 NA
4   1 NA NA
5  NA  5 NA
6  NA  5  3
7  NA  5 NA
8   1 NA  3
9  NA  5 NA
10 NA  5 NA
11 NA NA NA
12  1  5  3
13 NA  5  3
14 NA NA NA
15  1 NA NA
16 NA  5  3
17 NA NA  3
18 NA NA NA
19  1 NA  3
20 NA NA  3

Selecting rows of df1 that do not contain any NA −

> df1[complete.cases(df1),]

Output

   x1 x2 x3
12  1  5  3
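Because sample() draws at random, rerunning the commands of this example will generally produce a different data frame and a different set of complete rows. A minimal sketch of a reproducible variant follows. The seed value 123 and the name complete_rows are arbitrary assumptions, not part of the original example.

```r
# Fix the random number generator so the same data frame is
# produced on every run (the seed 123 is an arbitrary choice)
set.seed(123)

x1 <- sample(c(1, NA), 20, replace = TRUE)
x2 <- sample(c(5, NA), 20, replace = TRUE)
x3 <- sample(c(3, NA), 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3)

# Keep only the rows with no NA in any column
complete_rows <- df1[complete.cases(df1), ]
print(complete_rows)

# Sanity check: no NA should remain in the selected rows
has_na <- any(is.na(complete_rows))
print(has_na)
```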

Example2

> y1<-sample(c(rnorm(2),NA),20,replace=TRUE)
> y2<-sample(c(rnorm(2),NA),20,replace=TRUE)
> df2<-data.frame(y1,y2)
> df2

Output

           y1        y2
1  0.15079115 -0.626630
2  0.15079115        NA
3          NA -0.626630
4  0.15079115 -0.626630
5  0.15079115        NA
6  0.15079115 -0.626630
7  0.15079115        NA
8  0.15079115 -1.691553
9          NA -1.691553
10         NA -0.626630
11 0.15079115 -1.691553
12 0.15079115        NA
13         NA -1.691553
14         NA -1.691553
15 0.15079115 -1.691553
16         NA -0.626630
17 0.01495388 -0.626630
18 0.01495388 -1.691553
19 0.15079115 -1.691553
20         NA        NA

Selecting rows of df2 that do not contain any NA −

> df2[complete.cases(df2),]

Output

           y1        y2
1  0.15079115 -0.626630
4  0.15079115 -0.626630
6  0.15079115 -0.626630
8  0.15079115 -1.691553
11 0.15079115 -1.691553
15 0.15079115 -1.691553
17 0.01495388 -0.626630
18 0.01495388 -1.691553
19 0.15079115 -1.691553
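Before dropping rows, it can help to see how many rows are affected. Since complete.cases returns a logical vector, it can be summed directly. The sketch below uses a hypothetical four-row data frame named d, not df2 itself.

```r
# Hypothetical numeric data frame used only for illustration
d <- data.frame(
  y1 = c(0.5, NA, 1.2, NA),
  y2 = c(-0.6, -1.7, NA, NA)
)

ok <- complete.cases(d)

n_complete <- sum(ok)     # number of rows with no NA
n_dropped  <- sum(!ok)    # number of rows that would be removed
dropped_at <- which(!ok)  # positions of the rows that would be removed

print(n_complete)
print(n_dropped)
print(dropped_at)
```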

Example3

> z1<-sample(c("A",NA),20,replace=TRUE)
> z2<-sample(c("B",NA),20,replace=TRUE)
> z3<-sample(c("C",NA),20,replace=TRUE)
> df3<-data.frame(z1,z2,z3)
> df3

Output

     z1   z2   z3
1     A <NA>    C
2  <NA>    B    C
3  <NA> <NA> <NA>
4     A    B <NA>
5  <NA> <NA>    C
6     A <NA>    C
7     A    B    C
8  <NA>    B    C
9  <NA> <NA>    C
10 <NA> <NA>    C
11    A <NA>    C
12 <NA> <NA>    C
13    A    B    C
14    A    B    C
15 <NA> <NA> <NA>
16    A    B    C
17 <NA> <NA> <NA>
18 <NA> <NA>    C
19    A    B    C
20 <NA> <NA> <NA>

Selecting rows of df3 that do not contain any NA −

> df3[complete.cases(df3),]

Output

   z1 z2 z3
7   A  B  C
13  A  B  C
14  A  B  C
16  A  B  C
19  A  B  C
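In character columns R prints a missing value as <NA>, and complete.cases treats it like any other NA. The literal string "NA", however, is ordinary text and does not count as missing. A small sketch with a hypothetical data frame named z shows the difference.

```r
# Hypothetical character data frame used only for illustration:
# row 2 holds a real missing value, row 3 holds the text "NA"
z <- data.frame(
  z1 = c("A", NA, "NA"),
  z2 = c("B", "B", "B")
)

# Only the real missing value makes a row incomplete
flags <- complete.cases(z)
print(flags)

# Row 3 is kept because "NA" is a string, not a missing value
kept <- z[flags, ]
print(kept)
```

If a file stores missing entries as text, they may need to be converted to real NA values first (for example while reading the data) before complete.cases can detect them.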
Updated on: 2021-03-06T05:27:42+05:30
