How to find the sum of values based on a key in another column of an R data frame?



If a data frame has a key column, we can treat that column as the grouping (independent) variable and compute summary statistics such as the sum, mean, standard deviation, or range of another (dependent) column for each key value. This can be done by combining the with and tapply functions, as shown in the examples below.
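The same pattern applies to statistics other than the sum. The following is a minimal sketch on a small made-up data frame; the names scores, group, and value are hypothetical and are not part of the examples in this article.

```r
# Hypothetical data: 'group' is the key column, 'value' is the measured column.
scores <- data.frame(
  group = c("A", "B", "A", "C", "B", "A"),
  value = c(2, 4, 6, 1, 8, 10)
)

# with() lets tapply() refer to the columns by name instead of scores$value.
group_sum  <- with(scores, tapply(value, group, FUN = sum))
group_mean <- with(scores, tapply(value, group, FUN = mean))
group_sd   <- with(scores, tapply(value, group, FUN = sd))

group_sum
group_mean
group_sd
```

Note that sd returns NA for a key that has only one row (key "C" here). A function such as range returns two values per key, so tapply returns a list in that case rather than a plain named vector.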

Consider the following data frame −

Example

x1 <- sample(c("A", "B", "C"), 20, replace = TRUE)
y1 <- rpois(20, 5)
df1 <- data.frame(x1, y1)
df1

Output

   x1 y1
1   C  0
2   A  4
3   C  5
4   C  5
5   A  5
6   C  3
7   B  7
8   B  6
9   C  6
10  C 13
11  C  6
12  C  5
13  C  6
14  A  7
15  B  4
16  C  1
17  C  7
18  B  6
19  B  3
20  B  5

Finding the sum of y1 for each value of x1 −

with(df1,tapply(y1,x1,FUN=sum))

 A  B  C
16 31 57
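Because the data frame is built with sample and rpois, each run produces different values. As a sketch for checking the result, the printed data above can be typed in directly and one key verified by hand; the variable names sums and manual_A are introduced here only for illustration.

```r
# Rebuild df1 from the printed output above so the result is repeatable.
x1 <- c("C", "A", "C", "C", "A", "C", "B", "B", "C", "C",
        "C", "C", "C", "A", "B", "C", "C", "B", "B", "B")
y1 <- c(0, 4, 5, 5, 5, 3, 7, 6, 6, 13,
        6, 5, 6, 7, 4, 1, 7, 6, 3, 5)
df1 <- data.frame(x1, y1)

# Grouped sums, one element per distinct key in x1.
sums <- with(df1, tapply(y1, x1, FUN = sum))

# Manual check for a single key: keep the rows where x1 is "A" and add them up.
manual_A <- sum(df1$y1[df1$x1 == "A"])

sums
manual_A
```

The manual sum for key "A" should agree with the "A" element of the tapply result.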

Example

x2 <- sample(c("India", "Indonesia", "UK"), 20, replace = TRUE)
y2 <- rpois(20, 10)
df2 <- data.frame(x2, y2)
df2

Output

          x2 y2
1      India 11
2      India  8
3  Indonesia 16
4      India  8
5  Indonesia 10
6         UK 16
7      India 16
8  Indonesia  9
9  Indonesia 11
10     India  9
11        UK  7
12     India 14
13 Indonesia  9
14     India 12
15        UK  8
16 Indonesia 10
17        UK 14
18     India  9
19     India 13
20 Indonesia 10

Finding the sum of y2 for each value of x2 −

with(df2,tapply(y2,x2,FUN=sum))

    India Indonesia        UK
      100        75        45
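tapply returns a named vector. If a data frame is more convenient for further processing, base R's aggregate function is one possible alternative. This is a sketch of a different technique from the with and tapply approach used in this article; the data is typed in from the output above, and by_key is a hypothetical name.

```r
# Rebuild df2 from the printed output above so the result is repeatable.
x2 <- c("India", "India", "Indonesia", "India", "Indonesia",
        "UK", "India", "Indonesia", "Indonesia", "India",
        "UK", "India", "Indonesia", "India", "UK",
        "Indonesia", "UK", "India", "India", "Indonesia")
y2 <- c(11, 8, 16, 8, 10, 16, 16, 9, 11, 9,
        7, 14, 9, 12, 8, 10, 14, 9, 13, 10)
df2 <- data.frame(x2, y2)

# Formula interface: sum y2 within each level of x2.
# The result is a data frame with one row per key.
by_key <- aggregate(y2 ~ x2, data = df2, FUN = sum)
by_key
```

The result has a column x2 holding the keys and a column y2 holding the per-key sums, which should match the tapply values above.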
Updated on: 2021-02-06T08:23:34+05:30
