How to convert a data frame with categorical columns to numeric in R?



We might want to convert categorical columns to numeric when an analysis of ordinal or nominal data requires numeric input. If the categories are represented by letters or words, the conversion is based on alphabetical order: factor() sorts the distinct values and as.numeric() returns each value's position in that sorted list. To understand the conversion, check out the examples below.
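As a minimal sketch of the mechanism, the snippet below uses a made-up vector x that is not part of the data frames in this article. It shows that the levels are the sorted distinct values and that the comparison uses whole strings, so two categories sharing a first letter still get different codes.

```r
# Illustrative vector (hypothetical, not from the data frames in this article)
x <- c("Hot", "Cold", "Bitter", "Cold", "Cool")

f <- factor(x)          # levels are the sorted distinct values
levels(f)               # "Bitter" "Cold" "Cool" "Hot"

codes <- as.numeric(f)  # position of each value among the levels
codes                   # 4 2 1 2 3
```

Calling levels() on the factor is a quick way to see which category each code stands for.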

Example 1


Consider the following data frame:

set.seed(100)  # fix the random seed so the sample is reproducible
x1 <- sample(LETTERS[1:4], 20, replace = TRUE)
x2 <- sample(LETTERS[1:4], 20, replace = TRUE)
x3 <- sample(LETTERS[1:4], 20, replace = TRUE)
x4 <- sample(LETTERS[1:4], 20, replace = TRUE)
df1 <- data.frame(x1, x2, x3, x4)
df1

Output

   x1 x2 x3 x4
1   B  C  C  B
2   C  D  A  A
3   B  B  D  A
4   D  A  C  A
5   C  D  D  B
6   A  C  B  D
7   B  C  B  C
8   B  D  A  C
9   D  B  A  C
10  C  A  B  A
11  D  B  B  A
12  B  C  A  B
13  B  D  C  D
14  D  D  C  B
15  C  B  A  C
16  B  D  C  A
17  B  D  A  B
18  C  D  D  D
19  C  A  C  C
20  C  C  C  B

Converting the columns in df1 to numeric:

Example

# Pool all columns into one factor, then replace each value with its level code
df1[] <- as.numeric(factor(as.matrix(df1)))
df1

Output

   x1 x2 x3 x4
1   2  3  3  2
2   3  4  1  1
3   2  2  4  1
4   4  1  3  1
5   3  4  4  2
6   1  3  2  4
7   2  3  2  3
8   2  4  1  3
9   4  2  1  3
10  3  1  2  1
11  4  2  2  1
12  2  3  1  2
13  2  4  3  4
14  4  4  3  2
15  3  2  1  3
16  2  4  3  1
17  2  4  1  2
18  3  4  4  4
19  3  1  3  3
20  3  3  3  2
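Because as.matrix() pools every column into a single factor, a category gets the same code in all columns (here A is 1, B is 2, C is 3 and D is 4). The sketch below contrasts this with converting each column separately via lapply(), which is a different approach from the one used in this article; the small data frame df and the names pooled and per_col are made up for illustration.

```r
# Hypothetical data frame: column a never contains the category "A"
df <- data.frame(a = c("B", "C", "B"), b = c("A", "B", "C"))

# Pooled conversion (the approach used above): one shared set of levels A, B, C
pooled <- df
pooled[] <- as.numeric(factor(as.matrix(df)))

# Per-column conversion: each column gets its own levels
per_col <- df
per_col[] <- lapply(df, function(col) as.numeric(factor(col)))

pooled$a    # 2 3 2  ("B" is the second level overall)
per_col$a   # 1 2 1  ("B" is the first level within column a)
```

The pooled form keeps codes comparable across columns, while the per-column form numbers each column independently.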

Example 2


# No seed is set here, so a fresh session will draw different values
y1 <- sample(c("Hot", "Cold", "Bitter"), 20, replace = TRUE)
y2 <- sample(c("Hot", "Cold", "Bitter"), 20, replace = TRUE)
y3 <- sample(c("Hot", "Cold", "Bitter"), 20, replace = TRUE)
df2 <- data.frame(y1, y2, y3)
df2

Output

       y1     y2     y3
1  Bitter    Hot   Cold
2  Bitter   Cold    Hot
3  Bitter Bitter   Cold
4    Cold    Hot Bitter
5  Bitter   Cold   Cold
6    Cold    Hot Bitter
7    Cold   Cold   Cold
8     Hot   Cold Bitter
9  Bitter Bitter Bitter
10 Bitter    Hot Bitter
11 Bitter   Cold   Cold
12 Bitter Bitter    Hot
13    Hot Bitter Bitter
14   Cold Bitter   Cold
15   Cold Bitter Bitter
16    Hot Bitter    Hot
17 Bitter   Cold   Cold
18    Hot   Cold Bitter
19    Hot    Hot   Cold
20    Hot Bitter   Cold

Converting the columns in df2 to numeric:

Example

# Pool all columns into one factor, then replace each value with its level code
df2[] <- as.numeric(factor(as.matrix(df2)))
df2

Output

   y1 y2 y3
1   1  3  2
2   1  2  3
3   1  1  2
4   2  3  1
5   1  2  2
6   2  3  1
7   2  2  2
8   3  2  1
9   1  1  1
10  1  3  1
11  1  2  2
12  1  1  3
13  3  1  1
14  2  1  2
15  2  1  1
16  3  1  3
17  1  2  2
18  3  2  1
19  3  3  2
20  3  1  2

Here, the categories are numbered in alphabetical order: Bitter is 1, Cold is 2 and Hot is 3.
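For ordinal data, the default numbering may not match the intended order of the categories. One possible way to control it, sketched here with a made-up vector rating that is not part of this article's data, is to pass the levels to factor() explicitly.

```r
# Hypothetical ordinal ratings, not from the article's data frames
rating <- c("Low", "High", "Medium", "Low")

# Default numbering uses sorted levels: High, Low, Medium
default_codes <- as.numeric(factor(rating))
default_codes   # 2 1 3 2

# Explicit levels make the codes follow the intended order
ordered_codes <- as.numeric(factor(rating, levels = c("Low", "Medium", "High")))
ordered_codes   # 1 3 2 1
```

Any value not listed in levels becomes NA, so the explicit list should cover every category present in the data.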

Updated on: 2021-02-09T12:02:25+05:30
