Python – Group and calculate the sum of column values of a Pandas DataFrame



We will consider an example of Car Sale Records and group month-wise to calculate the sum of Registration Price of car monthly. To sum, we use the sum() method.

At first, let’s say the following is our Pandas DataFrame with three columns −

dataFrame = pd.DataFrame(    {       "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"], "Date_of_Purchase": [ pd.Timestamp("2021-06-10"), pd.Timestamp("2021-07-11"), pd.Timestamp("2021-06-25"), pd.Timestamp("2021-06-29"), pd.Timestamp("2021-03-20"), pd.Timestamp("2021-01-22"), pd.Timestamp("2021-01-06"), pd.Timestamp("2021-01-04"), pd.Timestamp("2021-05-09") ], "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350] } ) 

Use the Grouper to select Date_of_Purchase column within groupby() function. The frequency freq is set ‘M’ to group by month-wise and sum is calculates using the sum() function −

print"\nGroup Dataframe by month...\n",dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', axis=0, freq='M')).sum()

Example

Following is the code −

import pandas as pd # dataframe with one of the columns as Date_of_Purchase dataFrame = pd.DataFrame(    {       "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],       "Date_of_Purchase": [          pd.Timestamp("2021-06-10"),          pd.Timestamp("2021-07-11"),          pd.Timestamp("2021-06-25"),          pd.Timestamp("2021-06-29"),          pd.Timestamp("2021-03-20"),          pd.Timestamp("2021-01-22"),          pd.Timestamp("2021-01-06"),          pd.Timestamp("2021-01-04"),          pd.Timestamp("2021-05-09") ], "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350] } ) print"DataFrame...\n",dataFrame # Grouper to select Date_of_Purchase column within groupby function # calculation the sum month-wise print"\nGroup Dataframe by month...\n",dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', axis=0, freq='M')).sum() 

Output

This will produce the following output −

DataFrame...         Car   Date_of_Purchase Reg_Price 0      Audi        2021-06-10 1000 1     Lexus        2021-07-11 1400 2     Tesla        2021-06-25 1100 3  Mercedes        2021-06-29 900 4       BMW        2021-03-20 1700 5    Toyota        2021-01-22 1800 6    Nissan        2021-01-06 1300 7   Bentley        2021-01-04 1150 8   Mustang        2021-05-09 1350 Group Dataframe by month... Reg_Price Date_of_Purchase 2021-01-31 4250.0 2021-02-28 NaN 2021-03-31 1700.0 2021-04-30 NaN 2021-05-31 1350.0 2021-06-30 3000.0 2021-07-31 1400.0
Updated on: 2021-09-16T07:19:05+05:30

2K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements