Skip to content

Conversation

@atoulme
Copy link
Contributor

@atoulme atoulme commented Oct 25, 2025

Fixes #2232

Changes

Add the ability to remove a synchronous instrument identified by its attributes. The instrument will no longer report.

@dashpole
Copy link
Contributor

One thing that has been asked for is to be able to delete multiple series at once, since callers often don't have access to the complete list of attribute sets they've previously incremented. E.g. remove(http.target=foo) would remove all streams for that http.target. One way to solve this is by matching all attribute sets which don't have the keys provided. Remove without any arguments would match and remove all streams from the instrument. Unfortunately, it wouldn't be backwards-compatible to change this later, so we need to decide if this is important from the start.

The other big question (which we need to resolve in the SDK spec PR), is how this impacts start time handling for cumulative metrics in the SDK. If an attribute set is deleted and recreated, the resulting metric data must have non-overlapping start-end time ranges since the cumulative value has been (presumably) reset. One way to solve this would be to require per-attribute-set start times in the SDK (#4184).

@jack-berg
Copy link
Member

A more flexible version of @dashpole's suggestion might look like: remove(Predicate<Attribute> predicate), where the predicate is invoked for each series, with usage in java like:

instrument.remove(attributes -> true); // Remove all series instrument.remove(attributes -> attribute.get("http.route").equals("/v1/foo/bar")); // Remove all series where http.route=/v1/foo/bar instrument.remove(attributes -> attributes.get("http.route").startsWith("/v1/foo")); // Remove all series matching pattern http.route=/v1/foo.* 

If an attribute set is deleted and recreated, the resulting metric data must have non-overlapping start-end time ranges since the cumulative value has been (presumably) reset. One way to solve this would be to require per-attribute-set start times in the SDK (#4184).

Yeah for example, in java, we always use the same SDK start time for all cumulative series. Whether they see their first data at start time or days later, the start is always the same. I think your suggestion to track per-attribute-set start times in the SDK seems reasonable, but that seems to imply a behavior change from the single constant start time that we currently do java. Are there implications for this? Are there cases where a user with a cumulative backend would be upset to see new series with a start time corresponding to the window where data was first recorded?

@jack-berg
Copy link
Member

Also, if I call something like instrument.remove(attributes -> true) to delete all series, its not clear whether I intend to record additional data in the future. If I don't intend to record additional data, then I would probably view the fact that the SDK continues to have memory allocated to the instrument as a memory leak. But the SDK can't free up all the resources for that instrument without knowing for sure that I won't record again.

Makes me wonder if we need a top level instrument level close / remove method, as well as the fine grained method for removing specific series.

@dashpole
Copy link
Contributor

Are there implications for this? Are there cases where a user with a cumulative backend would be upset to see new series with a start time corresponding to the window where data was first recorded?

I don't think it will impact Prometheus in any negative way (@ArthurSens might know more, since he opened the original issue).

Depending on how strict you want to be, it may make it harder to aggregate timeseries with different start timestamps. If you user the earliest start timestamp, you may be missing data, and not produce an accurate cumulative for the entire time range.

@carlosalberto
Copy link
Contributor

cc @jmacd

@atoulme
Copy link
Contributor Author

atoulme commented Oct 28, 2025

Makes me wonder if we need a top level instrument level close / remove method, as well as the fine grained method for removing specific series.

This can be added later and separately from this effort, from what I can tell.

@atoulme
Copy link
Contributor Author

atoulme commented Oct 28, 2025

If an attribute set is deleted and recreated, the resulting metric data must have non-overlapping start-end time ranges since the cumulative value has been (presumably) reset. One way to solve this would be to require per-attribute-set start times in the SDK (#4184).

My concrete use case is tied to a queue system where we report the count of events seen. See https://github.com/open-telemetry/opentelemetry-java-contrib/blob/main/ibm-mq-metrics/model/metrics.yaml#L216

When we lose contact with the queue manager, we can no longer report this information. If we get back in contact with the queue manager, we must recreate the time series with the new information, resetting the counter and its start time. We typically alert on delta changes, so we want the time series to be separate.

@dashpole
Copy link
Contributor

we must recreate the time series with the new information, resetting the counter and its start time. We typically alert on delta changes, so we want the time series to be separate.

How do you reset the start time? All SDKs i'm aware of in OTel today set the cumulative start time at instrument creation time, and never reset it.

@ArthurSens
Copy link
Member

ArthurSens commented Oct 28, 2025

Are there implications for this? Are there cases where a user with a cumulative backend would be upset to see new series with a start time corresponding to the window where data was first recorded?

Actually, we would be happier to see this :) I've opened #4184 a long time ago, but never found the time to continue working on it. A start time per time series would help us provide more accurate increase rates.

@atoulme atoulme marked this pull request as ready for review October 29, 2025 22:27
@atoulme atoulme requested review from a team as code owners October 29, 2025 22:27
@atoulme
Copy link
Contributor Author

atoulme commented Oct 29, 2025

Since this is now sponsored, I am marking this PR ready for review. The discussion continues.

@atoulme
Copy link
Contributor Author

atoulme commented Oct 29, 2025

we must recreate the time series with the new information, resetting the counter and its start time. We typically alert on delta changes, so we want the time series to be separate.

How do you reset the start time? All SDKs i'm aware of in OTel today set the cumulative start time at instrument creation time, and never reset it.

Let's put this in as a requirement and try it out in the POCs, see how we fare.

@github-actions
Copy link

github-actions bot commented Nov 6, 2025

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Nov 6, 2025

##### Remove

Status: Development
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Status: Development
**Status**: [Development](../document-status.md)

##### Remove

Status: Development
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Status: Development
**Status**: [Development](../document-status.md)

##### Remove

Status: Development
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Status: Development
**Status**: [Development](../document-status.md)

##### Remove

Status: Development
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Status: Development
**Status**: [Development](../document-status.md)

Status: Development

Unregister the Counter. It will no longer be reported.
Copy link
Member

@pellared pellared Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Assuming that I do not miss something) This is not true. This does not registers the whole counter, but a data point (if I correctly remember/understand the terminology). Maybe it would be better to rename the operation to RemoveDataPoint so that we can add Remove in future that would unregister the whole instrument (and not only a given data point)?

The same comment applies to other instruments.

EDIT: I see similar comments like #4702 (comment) 😉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we aren't unregistering the entire counter. I don't like RemoveDataPoint, as DataPoint is not really an API concept. I would suggest phrasing this as "Unregister the attribute set".

@pellared pellared removed the Stale label Nov 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

7 participants