spantaleev
Contributor

In pgStatWalReceiverQueryTemplate, the order of the columns (when hasFlushedLSN == true) is:

  • ...
  • receive_start_lsn
  • flushed_lsn
  • receive_start_tli
  • ...

However, columns were scanned in this order:

  • ...
  • receive_start_lsn -> receiveStartLsn
  • receive_start_tli -> flushedLsn (!)
  • flushed_lsn -> receiveStartTli (!)
  • ...

This incorrect hydration of variables also manifests as swapped values for the pg_stat_wal_receiver_flushed_lsn and pg_stat_wal_receiver_receive_start_tli metrics.
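The mismatch described above can be reproduced without a database. The following is a minimal sketch, not the exporter's actual code: `scanPositional` is a hypothetical stand-in for the positional behaviour of `(*sql.Rows).Scan`, the struct and helper names are invented for illustration, and all three columns are assumed to land in the same integer type (which would explain why the swap produces wrong values rather than a scan error).

```go
package main

import "fmt"

// walReceiverRow holds the three fields discussed in this PR.
// Using int64 for all of them is an assumption made for illustration.
type walReceiverRow struct {
	receiveStartLsn int64
	flushedLsn      int64
	receiveStartTli int64
}

// scanPositional is a hypothetical stand-in for (*sql.Rows).Scan:
// the i-th column value is copied into the i-th destination,
// regardless of what the destination variable is called.
func scanPositional(cols []int64, dest ...*int64) error {
	if len(cols) != len(dest) {
		return fmt.Errorf("expected %d destinations, got %d", len(cols), len(dest))
	}
	for i := range cols {
		*dest[i] = cols[i]
	}
	return nil
}

// scanSwapped mirrors the buggy order: the second and third
// destinations do not match the query's column order.
func scanSwapped(cols []int64) (walReceiverRow, error) {
	var r walReceiverRow
	err := scanPositional(cols, &r.receiveStartLsn, &r.receiveStartTli, &r.flushedLsn)
	return r, err
}

// scanInQueryOrder lists destinations in the same order as the columns:
// receive_start_lsn, flushed_lsn, receive_start_tli.
func scanInQueryOrder(cols []int64) (walReceiverRow, error) {
	var r walReceiverRow
	err := scanPositional(cols, &r.receiveStartLsn, &r.flushedLsn, &r.receiveStartTli)
	return r, err
}

func main() {
	// Made-up values in query column order:
	// receive_start_lsn, flushed_lsn, receive_start_tli.
	cols := []int64{100, 200, 1}

	bad, _ := scanSwapped(cols)
	good, _ := scanInQueryOrder(cols)

	fmt.Printf("swapped:     flushedLsn=%d receiveStartTli=%d\n", bad.flushedLsn, bad.receiveStartTli)
	fmt.Printf("query order: flushedLsn=%d receiveStartTli=%d\n", good.flushedLsn, good.receiveStartTli)
}
```

Because both destinations accept the same type in this sketch, the swapped call succeeds silently and the only symptom is that the two values end up in each other's variables.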

This seems to be a bug that has existed since the initial implementation:

  • 2d7e152
  • prometheus-community#844

In this patch, I'm:

  • fixing the .Scan(), so that it hydrates variables in the correct order

  • adjusting the order in which metrics are pushed out to the channel, to follow the order we consume them in (.., receive_start_lsn, flushed_lsn, receive_start_tli, ..)

  • adjusting the walreceiver tests, to follow the new order (which matches .Scan())

  • fixing a small indentation issue in pgStatWalReceiverQueryTemplate

@cristiangreco
Contributor

Hi @spantaleev! Do you mind fixing the DCO check please?

…lector

In `pgStatWalReceiverQueryTemplate`, the order of the columns (when `hasFlushedLSN == true`) is:

- ...
- `receive_start_lsn`
- `flushed_lsn`
- `receive_start_tli`
- ...

However, columns were scanned in this order:

- ...
- `receive_start_lsn` -> `receiveStartLsn`
- `receive_start_tli` -> `flushedLsn` (!)
- `flushed_lsn` -> `receiveStartTli` (!)
- ...

This incorrect hydration of variables also manifests as swapped values for the `pg_stat_wal_receiver_flushed_lsn` and `pg_stat_wal_receiver_receive_start_tli` metrics.

This seems to be a bug that has existed since the initial implementation:

- 2d7e152
- prometheus-community#844

In this patch, I'm:

- fixing the `.Scan()`, so that it hydrates variables in the correct order
- adjusting the order in which metrics are pushed out to the channel, to follow the order we consume them in (.., `receive_start_lsn`, `flushed_lsn`, `receive_start_tli`, ..)
- adjusting the walreceiver tests, to follow the new order (which matches `.Scan()`)
- fixing a small indentation issue in `pgStatWalReceiverQueryTemplate`

Signed-off-by: Slavi Pantaleev <slavi@devture.com>
@spantaleev spantaleev force-pushed the fix-walreceiver-swapped-values branch from b1cc9bc to 024c1fd Compare September 29, 2025 11:07
@spantaleev
Contributor Author

I've fixed the sign-off. Sorry for missing that the first time around!

@cristiangreco cristiangreco merged commit ef2736e into prometheus-community:master Sep 29, 2025
11 checks passed
cristiangreco added a commit that referenced this pull request Sep 29, 2025
* [BUGFIX] Fix swapped `flushedLsn` and `receiveStartTli` for `wal_receiver` collector by @spantaleev in #1198
* [BUGFIX] Fix superfluous semicolon breaking query in `process_idle` by @sysadmind in #1197 and #1201

cristiangreco added a commit that referenced this pull request Sep 29, 2025
* [BUGFIX] Fix swapped `flushedLsn` and `receiveStartTli` for `wal_receiver` collector by @spantaleev in #1198
* [BUGFIX] Fix superfluous semicolon breaking query in `process_idle` by @sysadmind in #1197 and #1201
Signed-off-by: Cristian Greco <cristian@regolo.cc>

sysadmind pushed a commit that referenced this pull request Sep 29, 2025
* [BUGFIX] Fix swapped `flushedLsn` and `receiveStartTli` for `wal_receiver` collector by @spantaleev in #1198
* [BUGFIX] Fix superfluous semicolon breaking query in `process_idle` by @sysadmind in #1197 and #1201
Signed-off-by: Cristian Greco <cristian@regolo.cc>