How to find the frequency for all columns based on a condition in R?



To find the frequency of values that satisfy a condition in every column of a data frame, we can use a for loop: for each column, we subset the values that meet the condition and take the length of the result.

For example, if we have a data frame called df and we want to count the values in each column that are greater than 5, we can use the following command:

Columns <- vector()
for(i in 1:ncol(df)){
   Columns[i] <- length(df[df[,i] > 5, i])
}
Columns
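As a point of comparison, the same count can usually be obtained without a loop. The sketch below is not part of the original command; it uses a hypothetical toy data frame (invented here for illustration) and base R's colSums() on the logical matrix produced by the comparison. It also shows one behaviour worth knowing: the loop counts NA entries as matches, because indexing with NA returns NA, while sum-based counting with na.rm = TRUE skips them.

```r
# Hypothetical toy data frame (not from the article), with one NA in column a
df <- data.frame(a = c(1, 6, 9, NA),
                 b = c(7, 8, 2, 10),
                 c = c(0, 3, 4, 5))

# Loop-based count, as in the command above
Columns <- vector()
for (i in 1:ncol(df)) {
  Columns[i] <- length(df[df[, i] > 5, i])
}
Columns   # column a is reported as 3 because the NA row is carried along

# Vectorized count: df > 5 is a logical matrix, colSums() counts TRUE per column
counts <- colSums(df > 5, na.rm = TRUE)
counts    # column a is 2 here, since the NA is ignored
```

If the data contain no missing values, both approaches should give the same numbers; with missing values, which result is wanted depends on how NA should be treated.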

Example 1

The following snippet creates a sample data frame:

x1<-rpois(20,1)
x2<-rpois(20,2)
x3<-rpois(20,3)
df1<-data.frame(x1,x2,x3)
df1

The following data frame is created (the values will differ from run to run, because rpois() draws random numbers and no seed is set):

   x1 x2 x3
1   1  1  1
2   0  1  3
3   1  3  3
4   2  4  2
5   1  2  1
6   0  7  0
7   1  1  2
8   2  1  3
9   0  6  1
10  0  5  3
11  2  1  4
12  2  2 10
13  1  1  4
14  1  2  3
15  0  2  2
16  0  2  3
17  0  1  3
18  0  4  4
19  0  4  6
20  3  1  3

To count the values greater than 2 in each column of df1, add the following code to the above snippet:

Columns1 <- vector()
for(i in 1:ncol(df1)){
   Columns1[i] <- length(df1[df1[,i] > 2, i])
}
Columns1

Output

Running all of the above snippets as a single program on the data frame shown above generates the following output:

[1] 1 7 13
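Because df1 is generated randomly, rerunning the snippet will not reproduce this exact result. The sketch below rebuilds df1 from the values printed above, so the counts can be checked; it uses sapply() with sum() instead of the loop, which is a substitution made here for brevity rather than the article's own method.

```r
# Rebuild df1 from the printed values so the result is reproducible
x1 <- c(1, 0, 1, 2, 1, 0, 1, 2, 0, 0, 2, 2, 1, 1, 0, 0, 0, 0, 0, 3)
x2 <- c(1, 1, 3, 4, 2, 7, 1, 1, 6, 5, 1, 2, 1, 2, 2, 2, 1, 4, 4, 1)
x3 <- c(1, 3, 3, 2, 1, 0, 2, 3, 1, 3, 4, 10, 4, 3, 2, 3, 3, 4, 6, 3)
df1 <- data.frame(x1, x2, x3)

# For each column, count how many values are greater than 2
counts1 <- sapply(df1, function(col) sum(col > 2))
counts1   # x1 = 1, x2 = 7, x3 = 13, matching the output above
```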

Example 2

The following snippet creates a sample data frame:

y1<-rnorm(20)
y2<-rnorm(20)
y3<-rnorm(20)
df2<-data.frame(y1,y2,y3)
df2

The following data frame is created (the values will differ from run to run, because rnorm() draws random numbers and no seed is set):

           y1         y2         y3
1  -0.7446072  0.2772768 -0.2099932
2   0.4497256 -1.5064792 -0.7166337
3   0.8316262 -1.0904581  0.5837854
4  -0.2955840  1.8329734  1.9440828
5   1.4989187  0.7655811 -1.7222717
6   1.6513081 -1.4800745  0.9092251
7   0.7703807 -1.3972957 -0.6070779
8   0.8522162 -0.3482059 -0.7727520
9  -0.8581488  1.6068537 -2.3097855
10 -0.6890322  1.8891767 -1.3816252
11 -0.2896339  1.9209137  0.5935030
12 -0.9241086 -2.0833818  0.7365296
13 -1.1093938  1.4950127  1.5394590
14 -0.1203023 -0.7265817 -0.1850344
15 -0.1747876 -0.3429473  0.9155441
16  0.2678002 -0.4080068 -0.5372238
17  0.1292888  0.8621264 -1.0343519
18  1.0656223  0.3492514 -1.8643609
19 -1.0106256  0.3237296 -0.3930171
20  0.7498458 -0.1454423 -1.2903053

To count the values greater than 0.5 in each column of df2, add the following code to the above snippet:

Columns2 <- vector()
for(i in 1:ncol(df2)){
   Columns2[i] <- length(df2[df2[,i] > 0.5, i])
}
Columns2

Output

Running all of the above snippets as a single program on the data frame shown above generates the following output:

[1] 7 7 7
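Since both examples repeat the same loop with a different data frame and threshold, the pattern could be wrapped in a small helper. The sketch below is one possible way to do that: count_above is a hypothetical name introduced here, not a base R function, and the seed value is arbitrary and only serves to make the random data repeatable. No expected output is shown, because the counts depend on the generated sample.

```r
# Hypothetical helper: counts values greater than `threshold` in each column.
# NA values are skipped (na.rm = TRUE), unlike in the loop-based version.
count_above <- function(data, threshold) {
  vapply(data, function(col) sum(col > threshold, na.rm = TRUE), integer(1))
}

# Arbitrary seed so that repeated runs produce the same sample
set.seed(123)
df2 <- data.frame(y1 = rnorm(20), y2 = rnorm(20), y3 = rnorm(20))

# Named integer vector with one count per column
counts2 <- count_above(df2, 0.5)
counts2
```

The same helper could be called as count_above(df1, 2) for the first example.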
Updated on: 2021-11-05T07:34:13+05:30
