Open
Description
The wfdb.Record.to_dataframe function generates a DataFrame from a Record object. The index of the resulting DataFrame is the elapsed or absolute time of each sample.
This code, however, will have significant rounding errors over a long record:
    if self.base_datetime is not None:
        index = pd.date_range(
            start=self.base_datetime,
            periods=self.sig_len,
            freq=pd.Timedelta(seconds=1 / self.fs),
        )
    else:
        index = pd.timedelta_range(
            start=pd.Timedelta(0),
            periods=self.sig_len,
            freq=pd.Timedelta(seconds=1 / self.fs),
        )

For example:
    $ python3
    >>> import wfdb
    >>> r = wfdb.rdrecord('81739927', pn_dir='mimic4wdb/0.1.0/waves/p100/p10014354/81739927')
    >>> str(r.base_datetime)
    '2148-08-16 09:00:17.566000'
    >>> r.fs
    62.4725
    >>> r.sig_len
    6661120
    >>> r.to_dataframe()
                                  I      II     III       V  aVR     Pleth       Resp
    2148-08-16 09:00:17.566000  NaN     NaN     NaN     NaN  NaN       NaN  -0.751374
    2148-08-16 09:00:17.582007  NaN     NaN     NaN     NaN  NaN       NaN  -0.751374
    2148-08-16 09:00:17.598014  NaN     NaN     NaN     NaN  NaN       NaN  -0.751374
    2148-08-16 09:00:17.614021  NaN     NaN     NaN     NaN  NaN       NaN  -0.751374
    2148-08-16 09:00:17.630028  NaN     NaN     NaN     NaN  NaN       NaN  -0.751374
    ...                          ..     ...     ...     ...  ...       ...        ...
    2148-08-17 14:37:22.033805  NaN  -0.220  -0.285  -0.025  NaN  0.404297   0.487477
    2148-08-17 14:37:22.049812  NaN  -0.030   0.005   0.025  NaN  0.396484   0.530238
    2148-08-17 14:37:22.065819  NaN  -0.065  -0.030  -0.015  NaN  0.386475   0.574832
    2148-08-17 14:37:22.081826  NaN  -0.265  -0.255  -0.125  NaN  0.375977   0.621258
    2148-08-17 14:37:22.097833  NaN  -0.550  -0.610  -0.355  NaN  0.366211   0.664020

    [6661120 rows x 7 columns]
    >>> str(r.get_absolute_time(6661119))
    '2148-08-17 14:37:22.384920'

    $ wfdbtime -r mimic4wdb/0.1.0/waves/p100/p10014354/81739927/ s6661119
           s6661119    29:37:04.819 [14:37:22.385 17/08/2148]

Here, get_absolute_time is correct to the nearest microsecond and the wfdbtime command is correct to the nearest millisecond. The last timestamp produced by to_dataframe, however, is off by 0.287 seconds.
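A rough sketch of where the 0.287 seconds could come from, using only the standard library. The index rows above advance by exactly 16.007 ms each, which suggests the per-sample step is being rounded to whole microseconds before it is accumulated. `datetime.timedelta` rounds to microseconds as well, so it is used here as a stand-in for the pandas behaviour; that equivalence is an assumption and may depend on the pandas version.

```python
from datetime import timedelta

fs = 62.4725            # sampling frequency of the record above
sig_len = 6661120       # number of samples in the record above
last = sig_len - 1      # sample number of the final row

# Step between samples, rounded to whole microseconds by datetime.timedelta.
rounded_step = timedelta(seconds=1 / fs)

# Elapsed time of the last sample when the rounded step is accumulated,
# which is what a start + freq range effectively does.
accumulated = rounded_step * last

# Elapsed time of the last sample computed in one step from its sample number.
direct = timedelta(seconds=last / fs)

# Difference between the two, in seconds.
drift = (direct - accumulated).total_seconds()
print(rounded_step, accumulated, direct, drift)
```

The per-step rounding error is only a few hundredths of a microsecond, but it is multiplied by more than six million samples.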
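I think this would be avoided by passing start, end, and periods to date_range or timedelta_range, rather than start, periods, and freq. That way the last timestamp is computed directly instead of being accumulated from a rounded step.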
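A minimal sketch of that suggestion, not a patch against wfdb itself. The helper name `sample_time_index` and its signature are hypothetical; `sig_len`, `fs`, and `base_datetime` mirror the Record attributes used in the code above. It assumes `pd.date_range` and `pd.timedelta_range` space the points linearly when given `start`, `end`, and `periods` without `freq`.

```python
import datetime

import pandas as pd


def sample_time_index(sig_len, fs, base_datetime=None):
    """Hypothetical helper: build the index from start, end and periods."""
    # Elapsed time of the last sample, computed once from its sample number
    # rather than by accumulating a rounded per-sample step.
    duration = pd.Timedelta(seconds=(sig_len - 1) / fs)
    if base_datetime is not None:
        start = pd.Timestamp(base_datetime)
        return pd.date_range(start=start, end=start + duration, periods=sig_len)
    return pd.timedelta_range(start=pd.Timedelta(0), end=duration, periods=sig_len)


# Values taken from the record in the example above.
base = datetime.datetime(2148, 8, 16, 9, 0, 17, 566000)
index = sample_time_index(6661120, 62.4725, base)
print(index[0], index[-1])
```

With this approach each timestamp is rounded on its own to the resolution of the index, so the error should stay bounded instead of growing with the length of the record.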