End-of-year “roll-over”

Single-year datasets retrieved from the PSM3 API with utc=false have a quirk where the last few hours in the dataset are technically from the prior year.

My guess is that this comes from a process roughly as follows:

  • The API grabs a calendar year of data in UTC

  • The API converts to local time, thereby shifting a few January hours into the previous December

  • The API takes those few hours in the previous year and puts them at the end of the dataset, thereby creating a complete calendar year again
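To make the guessed process concrete, here is a minimal sketch that mimics the three steps with hourly timestamps and the standard library only. It is not how the API is actually implemented; the function name `simulate_rollover` is made up for illustration, and it assumes a fixed UTC offset west of Greenwich (negative offset), as for US locations.

```python
from datetime import datetime, timedelta, timezone


def simulate_rollover(year, utc_offset_hours):
    """Mimic the guessed roll-over process at hourly resolution.

    Hypothetical helper, assuming a fixed negative UTC offset.
    Returns (label, true_time) pairs in the order the dataset would list them.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))

    # Step 1: one calendar year of hourly timestamps in UTC.
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    n_hours = int((end - start) / timedelta(hours=1))
    utc_times = [start + timedelta(hours=i) for i in range(n_hours)]

    # Step 2: convert to local time; the first few January hours (UTC)
    # land in the previous December.
    local_times = [t.astimezone(local_tz) for t in utc_times]

    # Step 3: relabel those hours with the requested year and put them
    # at the end, so the labels form a complete calendar year again.
    in_year = [(t, t) for t in local_times if t.year == year]
    wrapped = [(t.replace(year=year), t) for t in local_times if t.year < year]
    return in_year + wrapped


rows = simulate_rollover(2018, -8)
for label, true_time in rows[-9:]:
    print(label.isoformat(), "<-", true_time.isoformat())
```

With an offset of -8 hours, the labels run continuously through the end of 2018, but the last eight rows are really the final eight hours of 2017.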

In any case, this roll-over can create some minor inconsistencies in the last N hours of the dataset, where N is the magnitude of the location’s UTC offset in hours. This notebook shows how it introduces discontinuities in temperature and humidity. The Solar Position notebook shows how it affects the solar position calculation.
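As a quick bit of arithmetic, the number of affected rows depends on both the UTC offset and the sampling interval. The helper below is a hypothetical illustration (`n_rolled_over_rows` is not part of pvlib), and the UTC-8 offset is an assumption for a Pacific-time location.

```python
from datetime import timedelta


def n_rolled_over_rows(utc_offset, interval_minutes):
    """Number of trailing rows affected, for a fixed UTC offset and interval."""
    return int(abs(utc_offset) / timedelta(minutes=interval_minutes))


# Assumed example: UTC-8 with 5-minute data.
n_rows = n_rolled_over_rows(timedelta(hours=-8), 5)
print(n_rows)
```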

[3]: 
import pvlib
import pandas as pd
import numpy as np

lat, lon = 40, -120

# note: get_psm3() specifies utc=false internally.
# nothing special about this place and time, other than the discontinuities
# being visually obvious.
df, meta = pvlib.iotools.get_psm3(lat, lon, 'DEMO_KEY', 'assessingsolar@gmail.com',
                                  names=2018, interval=5, map_variables=True,
                                  leap_day=True,
                                  attributes=['air_temperature', 'surface_pressure',
                                              'total_precipitable_water',
                                              'relative_humidity'])

# drop nuisance cols, only keep the ones we want:
df = df[['temp_air', 'pressure', 'precipitable_water', 'relative_humidity']]

Taking a look at the last few hours of the year, notice the discontinuities:

[4]: 
axes = df.loc['2018-12-29':].plot(subplots=True)
boundary = df.index[-1] + df.index[-1].utcoffset()
for ax in axes:
    ax.axvline(boundary, ls=':', c='k')
../_images/pages_rollover_3_0.png

Now, let’s move those last few hours back to the beginning where they originally came from. See how nicely it joins up with the rest of the data?

[5]: 
df_shift = df.copy()
df_shift.index = np.where(df_shift.index > boundary,
                          df_shift.index - pd.DateOffset(years=1),
                          df_shift.index)
df_shift = df_shift.sort_index()

axes = df_shift.loc[:'2018-01-02'].plot(subplots=True)
for ax in axes:
    ax.axvline(df.index[0], ls=':', c='k')
../_images/pages_rollover_5_0.png
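The shift above could be wrapped in a small reusable helper. The sketch below is one possible way to do it, checked on synthetic hourly data rather than real PSM3 output. The name `undo_rollover` is hypothetical, and the function assumes a tz-aware index with a fixed negative UTC offset whose trailing rows are the relabeled ones.

```python
from datetime import timedelta, timezone

import numpy as np
import pandas as pd


def undo_rollover(df):
    """Shift the trailing rolled-over rows back one year and re-sort.

    Hypothetical helper; assumes a tz-aware, fixed-offset index west of UTC
    where the rolled-over rows sit at the end of the frame.
    """
    idx = df.index
    # Same boundary as in the cells above: last label plus the (negative) offset.
    boundary = idx[-1] + idx[-1].utcoffset()
    wrapped = idx > boundary
    out = df.copy()
    # The wrapped rows are at the end, so this preserves the row order.
    out.index = idx[~wrapped].append(idx[wrapped] - pd.DateOffset(years=1))
    return out.sort_index()


# Synthetic stand-in for a PSM3 dataset: hourly labels for 2018 at UTC-8.
tz = timezone(timedelta(hours=-8))
labels = pd.date_range('2018-01-01 00:00', '2018-12-31 23:00',
                       freq=pd.Timedelta(hours=1), tz=tz)
n_wrapped = 8
# A ramp in "true" time: the last 8 rows really belong just before the first row.
signal = np.concatenate([np.arange(len(labels) - n_wrapped),
                         np.arange(-n_wrapped, 0)]).astype(float)
demo = pd.DataFrame({'signal': signal}, index=labels)

fixed = undo_rollover(demo)
print(fixed.head(10))
```

After the shift, the ramp increases by one per row with no jump at the start of the year, which is the same "joins up nicely" check as in the plot, just in numeric form.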
[6]: 
%load_ext watermark
%watermark --iversions -u -d -t

Last updated: 2022-09-21 18:07:13

numpy : 1.22.3
pandas: 1.5.0
pvlib : 0.9.3