 
How to remove rows based on blanks in a column from a data frame in R?
Data is sometimes entered incorrectly, which is why careful data cleaning must come before analysis. A data collector or a respondent might leave an answer blank when no option fits the question. Blanks also appear when a questionnaire is poorly designed or when a blank is entered by mistake. For a categorical variable, a control category might be recorded as blank, or a blank category might be kept as a placeholder for a category to be defined later. Whatever the reason, analysts run into this problem regularly. In the examples here, a blank is a single space character typed with the space bar, so it is stored as the string " " rather than as NA. Rows with such blanks in a column can be removed by subsetting the data frame with single square brackets, keeping only the rows where that column is not equal to " ".
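The subsetting idea can be sketched on a tiny hand-built data frame before moving to random data. The names `survey`, `answer`, and `score` are hypothetical and chosen only for illustration; `stringsAsFactors = FALSE` is set so that the column is a character vector in any R version.

```r
# Hypothetical survey data; a single space " " marks a blank answer.
survey <- data.frame(
  answer = c("yes", " ", "no", " ", "yes"),
  score  = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Build a logical vector that is TRUE for rows to keep,
# then use it as the row index inside single square brackets.
keep <- survey$answer != " "
cleaned <- survey[keep, ]

# Rows 1, 3 and 5 remain; the original row names are preserved.
print(cleaned)
```

The comma after `keep` matters: the logical vector selects rows, and the empty position after the comma keeps all columns.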
Example 1
Consider the following data frame:
> set.seed(24)
> x1<-sample(c(" ",1:5),20,replace=TRUE)
> x2<-rnorm(20,4,1.25)
> df1<-data.frame(x1,x2)
> df1
Output
   x1       x2
1   2 3.413674
2   1 3.581267
3   2 5.920315
4   4 4.762493
5   1 4.645420
6   5 3.907114
7   1 3.243554
8     1.862944
9   3 3.664134
10    3.189261
11    3.882362
12  4 3.893074
13  4 4.149414
14    3.854630
15  4 2.820216
16  4 3.957828
17  3 3.268216
18  4 4.766064
19  1 5.896403
20    4.821726
Removing rows with blanks:
Example
> df1[df1$x1 != " ", ]
Output
   x1       x2
1   2 3.413674
2   1 3.581267
3   2 5.920315
4   4 4.762493
5   1 4.645420
6   5 3.907114
7   1 3.243554
9   3 3.664134
12  4 3.893074
13  4 4.149414
15  4 2.820216
16  4 3.957828
17  3 3.268216
18  4 4.766064
19  1 5.896403
Example 2
> y1<-sample(c(" ",rpois(5,1)),20,replace=TRUE)
> y2<-rpois(20,5)
> df2<-data.frame(y1,y2)
> df2
Output
   y1 y2
1   1  2
2   0  4
3      3
4     10
5   0  6
6   0  5
7   0  7
8   0  3
9   1  1
10  1  6
11  2  7
12  2  5
13  0  5
14     3
15  0  5
16  0  3
17  1  4
18  0  4
19  2  2
20    14
Removing rows with blanks:
Example
> df2[df2$y1 != " ", ]
Output
   y1 y2
1   1  2
2   0  4
5   0  6
6   0  5
7   0  7
8   0  3
9   1  1
10  1  6
11  2  7
12  2  5
13  0  5
15  0  5
16  0  3
17  1  4
18  0  4
19  2  2
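Both examples compare against exactly one space, so an empty string or a run of several spaces would slip through. A more tolerant variant, sketched here on the assumption that any whitespace-only entry (and, optionally, NA) counts as a blank, uses base R's `trimws()`. The data frame `df3` and its columns `z1` and `z2` are hypothetical.

```r
# Hypothetical data with several kinds of blank: "", " ", "   " and NA.
df3 <- data.frame(
  z1 = c("1", "", " ", "2", "   ", NA, "3"),
  z2 = 1:7,
  stringsAsFactors = FALSE
)

# trimws() strips leading and trailing whitespace, so every
# whitespace-only entry becomes the empty string "".
# is.na() is checked as well so that NA rows are also dropped.
is_blank <- is.na(df3$z1) | trimws(df3$z1) == ""

cleaned3 <- df3[!is_blank, ]
rownames(cleaned3) <- NULL  # optional: renumber the rows from 1

print(cleaned3)
```

Whether NA should be treated as a blank depends on the data. If it should not, drop the `is.na()` term and handle missing values separately.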
