Hello polymaths,
The above-mentioned task is worth knowing for most Python developers working with data. Imagine you have several Excel files, several CSV files, or a single Excel file with multiple sheets, and you want to run a calculation that considers all of the data. One convenient way to do this is to collect each file's (or sheet's) name and data in a single DataFrame, with one column holding the names and another holding the DataFrames themselves.
Step 1
Create an empty list to append the names of each DataFrame.
Step 2
Create an empty list to append the data related to each DataFrame.
Step 3
Use a loop to iterate over each file or sheet.
Step 4
Perform any data cleaning or transformations, if necessary, and then append the name and the data to the empty lists created in Steps 1 and 2.
Step 5
Create a new DataFrame and pass the two lists above to its data parameter.
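The five steps can also be tried without any file on disk. The sketch below is only an illustration under stated assumptions: the `sources` dictionary, its keys and its values are made up to stand in for sheets or files you would normally read, and `dropna` is just one possible cleaning step.

```python
import pandas as pd

# Hypothetical stand-ins for sheets or files (names and values are made up)
sources = {
    "jan": pd.DataFrame({"item": ["a", "b", None], "qty": [1, 2, 3]}),
    "feb": pd.DataFrame({"item": ["c", "d", "e"], "qty": [4, 5, 6]}),
}

names = []       # Step 1: empty list for the names
dataframes = []  # Step 2: empty list for the data

# Step 3: loop over every source
for name, df in sources.items():
    # Step 4: example cleaning (drop rows with missing values), then append
    df = df.dropna()
    names.append(name)
    dataframes.append(df)

# Step 5: one row per source, with its name and its DataFrame
df_final = pd.DataFrame(data={"Name": names, "DataFrame": dataframes})

print(df_final["Name"].tolist())
print(df_final.shape)
```

Each cell in the "DataFrame" column holds a whole DataFrame object, so `df_final.loc[0, "DataFrame"]` gives back the cleaned data for the first source.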
Sample Code:-
# Import necessary libraries
import pandas as pd

ef = pd.ExcelFile("path/input.xlsx")  # Load the Excel file
dataframes = []  # Empty list to append the data of each file
names = []  # Empty list to append the name of each file

# Iterate through all the sheets within the Excel object
for i in ef.sheet_names:
    df = ef.parse(i)  # Store the data from each sheet as a DataFrame
    df_name = i  # Store the name of the DataFrame from each sheet
    # Perform data cleaning and transformations, if necessary
    dataframes.append(df)
    names.append(df_name)

# Create the final DataFrame
df_final = pd.DataFrame(data={"Name": names, "DataFrame": dataframes})
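Once the names and DataFrames are collected, one possible way to run a calculation over the entire data is to stack everything with `pd.concat`, using the names as keys. This is a minimal sketch and not part of the sample above: the sheet names, the `amount` column and its values are hypothetical, and it assumes all DataFrames share the same columns.

```python
import pandas as pd

# Hypothetical names and data, standing in for the two lists built in the loop
names = ["Sheet1", "Sheet2"]
dataframes = [
    pd.DataFrame({"amount": [10, 20]}),
    pd.DataFrame({"amount": [30]}),
]

# Stack all DataFrames; the keys become an outer index level called "Name"
combined = pd.concat(dataframes, keys=names, names=["Name", "row"])

# Turn the "Name" index level into a regular column
combined = combined.reset_index(level="Name")

# A calculation over the entire data, and the same calculation per source
total = combined["amount"].sum()
per_sheet = combined.groupby("Name")["amount"].sum()

print(total)
print(per_sheet)
```

Keeping the name as a column means every row still records which sheet or file it came from after the data is combined.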
Done