DEV Community

Cover image for Fetch Values from Previous Non-Null Value Rows — From SQL to SPL #24
Judith-Data-Processing-Hacks
Judith-Data-Processing-Hacks

Posted on

Fetch Values from Previous Non-Null Value Rows — From SQL to SPL #24

Problem description & analysis:

The database table organisation_user_link stores the current state of the account, where dossier_created is the account creation time.

source table 1

The database table organisation_user_link_status_history stores the change history of account status.

source table 2

Task: Now, based on the specified query date, list the account status for each day within the period from today (March 14, 2024) to the query date (March 1, 2024). Requirement: Reasonably fill in the status of blank dates, for example, from today to the latest change date, it should be filled in as today’s, and from the latest change date to the next nearest change date, it should be filled in as the latest’s; Supplement the account creation date; Finally, sort by account and date in reverse order.

expected results

Code comparisons:

SQL

WITH RECURSIVE dates ( date ) AS ( SELECT DATE('2024-03-01') UNION ALL SELECT DATE(date) + INTERVAL 1 DAY FROM dates WHERE DATE(DATE) < (NOW() - INTERVAL 1 DAY) ), current_history_data_query AS ( SELECT current_history_data.* FROM ( SELECT DATE(timestamp) AS date, user_id, organisation_id, status_id, stopped_reason_id, dossier_created, 'history-data' AS src FROM ( SELECT oulsh.user_id, oulsh.organisation_id, oulsh.timestamp, oulsh.status_id, oulsh.stopped_reason_id, oul.dossier_created, ROW_NUMBER() OVER (PARTITION BY oulsh.user_id, oulsh.organisation_id, DATE(oulsh.timestamp) ORDER BY oulsh.timestamp DESC) AS row_num FROM organisation_user_link_status_history AS oulsh INNER JOIN organisation_user_link AS oul ON oulsh.user_id = oul.user_id AND oulsh.organisation_id = oul.organisation_id ) AS numbered_rows WHERE row_num = 1 AND DATE(timestamp) != DATE(NOW()) UNION ALL SELECT CURRENT_DATE AS date, oul.user_id, oul.organisation_id, oul.status_id, oul.stopped_reason_id, oul.dossier_created, 'current-data' AS src FROM organisation_user_link AS oul ) AS current_history_data ORDER BY DATE DESC ) SELECT d.date, u.user_id, u.organisation_id, ( SELECT status_id FROM current_history_data_query WHERE user_id = u.user_id AND organisation_id = u.organisation_id AND date >= d.date ORDER BY date ASC LIMIT 1 ) AS status_id, ( SELECT stopped_reason_id FROM current_history_data_query WHERE user_id = u.user_id AND organisation_id = u.organisation_id AND date >= d.date ORDER BY date ASC LIMIT 1 ) AS stopped_reason_id, ( SELECT dossier_created FROM current_history_data_query WHERE user_id = u.user_id AND organisation_id = u.organisation_id AND date >= d.date ORDER BY date ASC LIMIT 1 ) AS dossier_created FROM dates d JOIN (SELECT DISTINCT user_id, organisation_id FROM organisation_user_link) u ORDER BY d.date DESC, u.user_id; 
Enter fullscreen mode Exit fullscreen mode

SQL uses recursive subqueries to create date sequences, which have complex structures. It uses multi-layer nested queries and window functions to mark state changes, and then fills in blank date data with join statements, making the code cumbersome.

SPL: SPL does not aggregate after grouping by accounts, but continues to calculate subsets of the groups. SPL provides functions for generating date sequences and records based on date sequences. 👉🏻Try.DEMO

SPL code

A1, A2: Load data. The current status table has added a date field with a value of today’s date. The historical status table has added the account creation date with an empty value.

A3: Merge current and historical status, group by account, but do not aggregate.

A4: Process each group of data: Generate a date sequence based on the dates of each record and the next record. If it is the last record, generate a date sequence with the day before the query date; Generate a new batch of records for each date sequence, where the creation date comes from the first record in this group. The function ‘periods’ generates a date sequence, @x represents not including the posterior endpoint. The function news generates a batch of new records for each record in the current group.


Try esProc SPL for Free — esProc SPL FREE Download.

Top comments (1)

Collapse
 
judith677 profile image
Judith-Data-Processing-Hacks

Feel free to download esProce SPL and give it a try!

💥Discord
💥Reddit