Python Forum

Full Version: Pandas, How to trigger parallel loop
Hi,
I have multiple CSV files in a folder, and I need to read each file, do some calculation (like getting the first column's sum), and concatenate the result into result_df. Is there a method in Python to achieve this? Reading and doing the calculation takes me about 2 minutes, so I have to wait a long time when there are many files.
Please show your code so far.
import os

import pandas as pd

df_result = pd.DataFrame()
# Pass each path component separately; embedded backslashes in the
# original call produced an invalid path.
directory = os.path.join("D:\\", "PythonCodes", "inputmultifiles")
for root, dirs, files in os.walk(directory):
    for file in files:
        f = os.path.join(root, file)
        if f.endswith(".csv"):
            ff = pd.read_csv(f)
            tmp = ff['Name']
            print(tmp)
            df_result = pd.concat([df_result, ff['Name']])
df_result = df_result.reset_index(drop=True)
df_result.columns = ['New_col']
If a file is large and takes time to read, each iteration has to wait for the previous one to finish. I want to use something like multithreading to trigger all the iterations at once and then combine the results from each iteration.
This is really a pandas question, not csv, as pandas reads in the data.
(Oct-28-2020, 02:21 PM)Larz60+ Wrote: [ -> ]This is really a pandas question, not csv, as pandas reads in the data.

Yes, but more generally I would put it this way: how do I trigger multithreading?
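One way to do this is with concurrent.futures from the standard library: hand each CSV file to a thread pool and concatenate the returned columns at the end. The sketch below is a minimal adaptation of the loop above; the folder path, the 'Name' column, and the helper names read_name_column and collect_names are assumptions based on the posted code, so adjust them to your data.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

def read_name_column(path):
    """Read one CSV file and return its 'Name' column."""
    return pd.read_csv(path)['Name']

def collect_names(directory):
    """Read every .csv in directory concurrently and stack the 'Name' columns."""
    paths = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".csv")
    ]
    # Each file is read in its own worker thread; pool.map returns the
    # results in the same order as paths.
    with ThreadPoolExecutor() as pool:
        columns = list(pool.map(read_name_column, paths))
    return pd.concat(columns, ignore_index=True).to_frame(name='New_col')
```

Threads help most when the time is spent on disk or network I/O. If the per-file calculation itself is the bottleneck, swapping ThreadPoolExecutor for ProcessPoolExecutor (same interface) sidesteps the GIL at the cost of inter-process overhead.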