Python Forum
Is there a more elegant way to concatenate data frames?
#1
Hi, I'm picturing a script that puts everything between the setting of "today" and the "print" in the code below inside a loop. The loop will read symbols from a Notepad text file, line by line, and I would replace the hardcoded Ticker settings below with the symbol just read. For the first record, the download call that currently uses the AAPL placeholder would run; for every later record, the call that currently uses the WMT placeholder would run instead.
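The file-reading step described above could look something like this. It is only a sketch, assuming a plain text file with one symbol per line; the helper name parse_tickers and the sample lines are made up for illustration and are not part of the original script.

```python
from typing import Iterable, List


def parse_tickers(lines: Iterable[str]) -> List[str]:
    """Return cleaned symbols, skipping blank lines, '#' comments and repeats."""
    seen = set()
    tickers = []
    for line in lines:
        symbol = line.strip().upper()
        if not symbol or symbol.startswith("#"):
            continue
        if symbol not in seen:
            seen.add(symbol)
            tickers.append(symbol)
    return tickers


# Stand-in for the lines of the text file (hypothetical contents)
sample_lines = ["AAPL\n", "wmt\n", "\n", "AAPL\n"]
tickers = parse_tickers(sample_lines)
print(tickers)
```

An open file is also an iterable of lines, so with a real file (the name tickers.txt is an assumption) the call would be: with open("tickers.txt") as f: tickers = parse_tickers(f)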

The download method allows passing multiple tickers in a call, but long term I think one call per symbol will be more flexible. From what I can tell, if you don't pass multiple symbols in a call, you don't get a column for Ticker.

Can I make this more elegant? It seems awkward.

import yfinance as yf
import pandas as pd
from datetime import date

today = date.today()

Ticker = "AAPL"
data1 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
data1["Ticker"] = Ticker

Ticker = "WMT"
data2 = yf.download(Ticker, start="2023-05-01", end=today).round(2)
data2["Ticker"] = Ticker

data1 = [data1, data2]
data1 = pd.concat(data1)
print(data1)
#2
Looks fine to me.
import yfinance as yf
import pandas as pd
from datetime import date, timedelta  # timedelta is needed for the start date

tickers = ("AAPL", "WMT")  # Or read from file
today = date.today()
start = today - timedelta(days=7)

data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)  # Put Ticker ahead of the price columns
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))

data = data.sort_index()  # Order rows by the Date index
print(data)
Output:
           Ticker    Open    High     Low   Close  Adj Close    Volume
Date
2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500
I moved the ticker column. I think it makes more sense to place it ahead of the financial information. I also sorted the resulting table by the date index and changed the starting date to a calculation instead of a string. Just for fun.
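One possible variation on the loop above: collect the frames in a list and call pd.concat once at the end, which drops the "data is None" branch and avoids copying the growing frame on every pass. The sketch below uses a made-up fake_download helper with invented prices in place of yf.download, so it runs without a network connection; it is not the yfinance API.

```python
import pandas as pd


def fake_download(ticker: str) -> pd.DataFrame:
    """Hypothetical stand-in for yf.download: two rows of invented prices."""
    index = pd.DatetimeIndex(["2023-06-06", "2023-06-07"], name="Date")
    return pd.DataFrame({"Close": [100.0, 101.0], "Volume": [10, 20]}, index=index)


tickers = ("AAPL", "WMT")

frames = []
for ticker in tickers:
    frame = fake_download(ticker)
    frame.insert(0, "Ticker", ticker)  # same column placement as in the loop above
    frames.append(frame)

# One concat for all frames; a stable sort keeps AAPL ahead of WMT within a date
data = pd.concat(frames).sort_index(kind="mergesort")
print(data)
```

With the real download call, fake_download(ticker) would be swapped for the yf.download(...) line from the loop above.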
#3
Much more elegant. Thank you.
#4
Some tips about dates in pandas. If you look at the output, Date is printed a line lower than the other column headers. That is because it is the index rather than a regular column, and it needs a fix.
Here I have removed the datetime import and used pandas' own date functionality.
Using both is fine, but once pandas is imported there is no need for an additional datetime import.
import yfinance as yf
import pandas as pd

tickers = ("AAPL", "WMT")  # Or read from file
today = pd.to_datetime("today")
start = today - pd.Timedelta(days=7)

data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))

data = data.sort_index()
print(data)
>>> data
           Ticker    Open    High     Low   Close  Adj Close    Volume
Date
2023-06-06   AAPL  179.97  180.12  177.43  179.21     179.21  64848400
2023-06-06    WMT  149.70  150.19  148.51  149.78     149.78   5005200
2023-06-07   AAPL  178.44  181.21  177.32  177.82     177.82  61944600
2023-06-07    WMT  149.25  150.36  149.04  150.00     150.00   8085500
2023-06-08   AAPL  177.90  180.84  177.46  180.57     180.57  50214900
2023-06-08    WMT  150.39  152.43  149.79  152.17     152.17   6291000
2023-06-09   AAPL  181.50  182.23  180.63  180.96     180.96  48870700
2023-06-09    WMT  152.16  153.72  151.60  153.09     153.09   5201300
2023-06-12   AAPL  181.27  183.89  180.97  183.79     183.79  54274900
2023-06-12    WMT  153.43  154.30  153.17  154.10     154.10   4904500
2023-06-13   AAPL  182.80  184.15  182.47  183.13     183.13  27582874
2023-06-13    WMT  154.52  155.49  154.07  155.40     155.40   1844848

>>> data.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 12 entries, 2023-06-06 to 2023-06-13
Data columns (total 7 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   Ticker     12 non-null     object
 1   Open       12 non-null     float64
 2   High       12 non-null     float64
 3   Low        12 non-null     float64
 4   Close      12 non-null     float64
 5   Adj Close  12 non-null     float64
 6   Volume     12 non-null     int64
dtypes: float64(5), int64(1), object(1)
memory usage: 768.0+ bytes
So info() lists no Date column, because Date is the index. To fix this:
>>> data = data.reset_index()
>>> data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 8 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   Date       12 non-null     datetime64[ns]
 1   Ticker     12 non-null     object
 2   Open       12 non-null     float64
 3   High       12 non-null     float64
 4   Low        12 non-null     float64
 5   Close      12 non-null     float64
 6   Adj Close  12 non-null     float64
 7   Volume     12 non-null     int64
dtypes: datetime64[ns](1), float64(5), int64(1), object(1)
memory usage: 900.0+ bytes
So now we have a working DataFrame in which Date is a regular column with dtype datetime64[ns].
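To see the index-versus-column difference in isolation, here is a small sketch on a hand-built frame. The two AAPL closing prices are copied from the output above; everything else is only there to make the snippet self-contained.

```python
import pandas as pd

index = pd.DatetimeIndex(["2023-06-06", "2023-06-07"], name="Date")
data = pd.DataFrame(
    {"Ticker": ["AAPL", "AAPL"], "Close": [179.21, 177.82]},
    index=index,
)

# Date is the index here, so it is not among the columns
print("Date" in data.columns)

# reset_index() moves the index into an ordinary column named Date
flat = data.reset_index()
print(list(flat.columns))
print(flat["Date"].dtype)
```

After reset_index(), functions that look columns up by name (plotting calls with x="Date", for example) can find Date directly.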
Then we can, for example, plot High and Low against Date for the last 90 days using seaborn.
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

tickers = ("AAPL", "WMT")  # Or read from file
today = pd.to_datetime("today")
start = today - pd.Timedelta(days=90)

data = None
for ticker in tickers:
    x = yf.download(ticker, start=start, end=today, progress=False).round(2)
    x.insert(0, "Ticker", ticker)
    if data is None:
        data = x
    else:
        data = pd.concat((data, x))

data = data.sort_index()
#print(data)
data = data.reset_index()

# Plot
plt.figure(figsize=(15, 6))
sns.set_style("darkgrid")
sns.lineplot(data=data, x='Date', y='High', label='High')
sns.lineplot(data=data, x='Date', y='Low', label='Low')
plt.xlabel('Date')
plt.ylabel('Price')
plt.title('High and Low Stock Prices')
plt.legend()
plt.show()
[Image: jluiKZ.png]

