 
  Data Structure Data Structure
 Networking Networking
 RDBMS RDBMS
 Operating System Operating System
 Java Java
 MS Excel MS Excel
 iOS iOS
 HTML HTML
 CSS CSS
 Android Android
 Python Python
 C Programming C Programming
 C++ C++
 C# C#
 MongoDB MongoDB
 MySQL MySQL
 Javascript Javascript
 PHP PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Python - How to Group Pandas DataFrame by Year?
We will group Pandas DataFrame using the groupby(). Select the column to be used using the grouper function. We will group year-wise and calculate sum of Registration Price with year interval for our example shown below for Car Sale Records.
At first, let’s say the following is our Pandas DataFrame with three columns −
# dataframe with one of the columns as Date_of_Purchase dataFrame = pd.DataFrame(    {       "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],       "Date_of_Purchase": [pd.Timestamp("2021-06-10"),          pd.Timestamp("2019-07-11"),          pd.Timestamp("2016-06-25"),          pd.Timestamp("2021-06-29"),          pd.Timestamp("2020-03-20"),          pd.Timestamp("2019-01-22"),          pd.Timestamp("2011-01-06"),          pd.Timestamp("2013-01-04"),          pd.Timestamp("2014-05-09")       ],       "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]    } ) Next, use the Grouper to select Date_of_Purchase column within groupby function. The frequency is set as 3Y i.e. interval of 3 Years grouped.
Example
Following is the code −
import pandas as pd # dataframe with one of the columns as Date_of_Purchase dataFrame = pd.DataFrame(    {       "Car": ["Audi", "Lexus", "Tesla", "Mercedes", "BMW", "Toyota", "Nissan", "Bentley", "Mustang"],       "Date_of_Purchase": [pd.Timestamp("2021-06-10"),          pd.Timestamp("2019-07-11"),          pd.Timestamp("2016-06-25"),          pd.Timestamp("2021-06-29"),          pd.Timestamp("2020-03-20"),          pd.Timestamp("2019-01-22"),          pd.Timestamp("2011-01-06"),          pd.Timestamp("2013-01-04"), pd.Timestamp("2014-05-09")       ],       "Reg_Price": [1000, 1400, 1100, 900, 1700, 1800, 1300, 1150, 1350]    } ) print("DataFrame...\n",dataFrame) # Grouper to select Date_of_Purchase column within groupby function print("\nGroup Dataframe by 3 years...\n",dataFrame.groupby(pd.Grouper(key='Date_of_Purchase', axis=0, freq='3Y')).sum())  Output
This will produce the following output −
DataFrame... Car Date_of_Purchase Reg_Price 0 Audi 2021-06-10 1000 1 Lexus 2019-07-11 1400 2 Tesla 2016-06-25 1100 3 Mercedes 2021-06-29 900 4 BMW 2020-03-20 1700 5 Toyota 2019-01-22 1800 6 Nissan 2011-01-06 1300 7 Bentley 2013-01-04 1150 8 Mustang 2014-05-09 1350 Group Dataframe by 3 years... Reg_Price Date_of_Purchase 2011-12-31 1300 2014-12-31 2500 2017-12-31 1100 2020-12-31 4900 2023-12-31 1900
Advertisements
 