How to remove underscores from the column names of an R data frame?



When data is imported from an outside source, the column names often arrive with underscores separating their parts, typically because the original file used that format. To make the headers shorter and easier to read, we can remove the underscores, which is easily done by applying the gsub function to the names of the data frame.
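As a minimal sketch of the idea, gsub can first be tried on a plain character vector. The vector `headers` below is a made-up illustration, not data from this article, and `fixed = TRUE` is optional: it simply treats the pattern as a literal string rather than a regular expression.

```r
# Hypothetical header names, for illustration only
headers <- c("first_name", "last_name", "zip_code_5")

# gsub replaces every occurrence of the pattern in each element
clean <- gsub("_", "", headers, fixed = TRUE)
print(clean)
# [1] "firstname" "lastname"  "zipcode5"
```

The same call works on names(df) because names() returns a character vector of exactly this kind.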

Consider the following data frame −

Example


x_1<-sample(1:10,20,replace=TRUE)
x_2<-sample(1:10,20,replace=TRUE)
x_3<-sample(1:10,20,replace=TRUE)
x_4<-sample(1:10,20,replace=TRUE)
x_5<-sample(1:10,20,replace=TRUE)
df1<-data.frame(x_1,x_2,x_3,x_4,x_5)
df1

Output

   x_1 x_2 x_3 x_4 x_5
1   10   4   6   5  10
2    6  10   2   1   4
3    9   9   6   1   4
4    6   1   5   5   8
5    7   7   4   7   4
6    1   5   2   1   8
7    8   5   5   2   9
8    8   4   1   9   8
9    8   1   7   4   3
10   5   9   3  10   3
11   2   7   5   6   9
12  10   1   4   1   5
13   8  10  10   1   2
14   3  10   5   7   6
15   5   6   9   1  10
16   3   8   6   4   7
17   8   9   5   7   2
18   6  10   5   6   8
19   1   8   3   2   9
20   8   1   5  10   5

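Removing the underscores from the column names. Only the names change; the values in the output below differ from the earlier output, presumably because the random data were regenerated before this run −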

Example

names(df1)<-gsub("_","",names(df1))
df1

Output

   x1 x2 x3 x4 x5
1   6  8  2  9  6
2   1  9  3  4 10
3   2  1  8 10 10
4   4 10  3  6  1
5  10  6  6  6  5
6   9  4  6  6  2
7   3  9 10  5  9
8   8  1  5  3  8
9   4  9  2  5  6
10  9  3  3  5  4
11  7  1  4  6  3
12 10  6  3  3  1
13  7  6 10 10  8
14  9  6  4  1  1
15  7  5 10  2  1
16  1  3  7  4  8
17  2  1  7  2  8
18  1 10  8  2  3
19  8  7  6  6 10
20  3  8  9  8  3
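Because sample() draws new values on every run, it can be hard to see from printed output alone that renaming leaves the data untouched. The sketch below is one way to check this; the seed value and the small data frame `df` are assumptions made for illustration, not part of the example above.

```r
set.seed(1)  # assumption: any fixed seed, used only to make the run repeatable

# Small illustrative data frame with underscore-separated names
df <- data.frame(x_1 = sample(1:10, 5, replace = TRUE),
                 x_2 = sample(1:10, 5, replace = TRUE))

# Work on a copy so the original can be compared afterwards
renamed <- df
names(renamed) <- gsub("_", "", names(renamed))

print(names(renamed))

# Compare the columns with their names stripped: only the values are checked
same_values <- identical(unname(as.list(df)), unname(as.list(renamed)))
print(same_values)
```

Here names(renamed) should be "x1" and "x2", and same_values should be TRUE, since gsub only touches the names attribute.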

Let’s have a look at another example −

Example


y_1<-rnorm(20)
y_2<-rnorm(20,2,1)
y_3<-rnorm(20,2,0.5)
y_4<-rnorm(20,2,0.0003)
y_5<-rnorm(20,10,1)
df2<-data.frame(y_1,y_2,y_3,y_4,y_5)
df2

Output

            y_1       y_2      y_3      y_4       y_5
1   0.514450792 2.4374182 3.230083 1.999826 12.625661
2  -0.312792686 0.8350701 2.769788 1.999740  8.699441
3  -0.710758168 2.7832089 1.971917 2.000519  8.430542
4  -0.060647019 1.4626953 1.971298 2.000600  9.568890
5   2.363567996 0.8239008 2.626454 2.000266 10.038633
6   1.227010669 2.6716199 1.844929 1.999768  7.838243
7  -0.994717233 1.1798125 2.084188 1.999643 11.254072
8   2.584374114 1.6053897 2.453163 2.000089 11.256447
9   0.863363636 1.0685646 1.457286 2.000659 11.001834
10 -0.190736476 1.4468239 1.829696 2.000229 10.425032
11  0.716178594 2.7498080 2.406190 1.999487  9.906237
12 -1.670744103 1.1184815 2.206973 2.000288  8.993506
13  1.011970392 2.7794836 2.560877 2.000160 12.564313
14 -0.099591556 1.5176429 1.841669 2.000175 12.050816
15  3.230713917 1.8450534 2.065576 2.000189  9.243683
16  0.734370382 0.8649671 1.550325 2.000698 10.320533
17  1.156661539 3.8099910 2.842250 1.999826 10.134682
18 -0.496844480 2.0082680 1.456640 2.000119 10.498172
19 -0.001995988 1.7054230 2.702496 1.999963  8.572382
20 -0.190562902 2.6200714 1.822893 1.999612  9.683227

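Removing the underscores from the column names. Again only the names change; the values in the output below come from a separate random draw −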

Example

names(df2)<-gsub("_","",names(df2))
df2

Output

            y1        y2        y3       y4        y5
1   0.35283126 2.7403674 1.5855939 1.999599 10.615962
2   2.04048363 1.7570445 1.9365559 1.999934 10.734033
3  -0.99194313 1.9299296 3.4318183 2.000200  8.821012
4   0.03923376 2.8984508 1.3765896 1.999948  8.371278
5   0.48921437 1.7272755 2.0049735 1.999814 10.769563
6  -1.52296501 1.1843431 1.3387394 1.999670 10.984169
7  -0.43659539 3.0847073 2.0724138 2.000099 10.163438
8  -1.07562516 2.4046583 2.3631921 1.999976  8.119308
9   0.25897051 4.0599361 2.5180669 2.000179  8.780155
10  0.90011031 0.5844179 3.0924616 2.000156 10.945022
11 -1.01455924 1.3601391 1.3491111 2.000197 11.172243
12 -1.21902395 1.5613617 1.6721161 2.000014  9.752595
13  1.10335026 3.0485505 2.5479672 2.000200 10.851384
14  1.66150031 0.9157312 2.0733168 2.000298 10.045139
15 -2.88733135 1.6426962 1.4906487 1.999932 10.596103
16 -0.20689147 1.7962494 0.9636048 1.999893 10.489436
17 -0.66668766 2.0058826 1.7932363 2.000102 10.702172
18 -0.32072057 2.8834813 2.1764040 2.000017 10.699573
19 -0.29862766 4.6416591 2.8638125 1.999819 10.211451
20 -0.47632229 1.2781510 2.8128627 1.999981  9.046588
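The column names in both examples contain a single underscore each, so it may help to see what happens with several. The sketch below uses a hypothetical data frame `df3` (not from the examples above) to contrast gsub, which replaces every match, with sub, which replaces only the first, and to show that the replacement need not be an empty string.

```r
# Hypothetical data frame whose names contain more than one underscore
df3 <- data.frame(total_sales_2020 = 1:3,
                  avg_unit_price   = c(2.5, 3.0, 4.1))

all_removed   <- gsub("_", "", names(df3))   # every underscore removed
first_removed <- sub("_", "", names(df3))    # only the first underscore removed
dotted        <- gsub("_", ".", names(df3), fixed = TRUE)  # underscores become dots

print(all_removed)    # "totalsales2020" "avgunitprice"
print(first_removed)  # "totalsales_2020" "avgunit_price"
print(dotted)         # "total.sales.2020" "avg.unit.price"

# Assign whichever variant is wanted back to the data frame
names(df3) <- all_removed
```

For removing all underscores, gsub is therefore the safer choice when the number of underscores per name is not known in advance.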
Updated on: 2020-10-16T15:20:06+05:30
