Account for soon-to-be-written data when checking the flood-stage watermark #88693

@DaveCTurner

Today the flood-stage watermark decides that a disk is in danger of filling up by looking only at the amount of data already on disk, which does not account for data on the node that is in flight towards the disk. A node with ample heap but a small, slow disk could in principle have so much data in flight that it fills the disk from below the flood-stage watermark all the way up to 100% before indexing can be blocked.

We should be more conservative about disk usage to protect against these cases. We can quantify the amount of in-flight data using the indexing pressure metrics and the size of the indexing memory buffer, so maybe we should account for this usage when checking the flood-stage watermark.
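
To make the proposal concrete, here is a minimal sketch of what such a check might look like. It is not Elasticsearch code: `DiskSnapshot`, its fields, and `exceeds_flood_stage` are hypothetical names invented for illustration, and it assumes, as a rough simplification, that each in-flight byte in memory ends up as one byte on disk. The 95% default mirrors the default flood-stage setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiskSnapshot:
    """Hypothetical view of one node's disk plus its in-flight indexing data."""
    total_bytes: int
    free_bytes: int
    indexing_pressure_bytes: int  # bytes held by in-progress indexing operations
    indexing_buffer_bytes: int    # bytes in the indexing memory buffer, not yet on disk


def exceeds_flood_stage(snapshot: DiskSnapshot, flood_stage_fraction: float = 0.95) -> bool:
    """Return True if on-disk usage plus in-flight data reaches the watermark."""
    used_bytes = snapshot.total_bytes - snapshot.free_bytes
    # Assumption: in-flight bytes map 1:1 to future on-disk bytes.
    in_flight_bytes = snapshot.indexing_pressure_bytes + snapshot.indexing_buffer_bytes
    projected_used_bytes = used_bytes + in_flight_bytes
    return projected_used_bytes >= flood_stage_fraction * snapshot.total_bytes
```

With a 1000-byte disk that is 90% full, a check based on disk usage alone stays below a 95% watermark, but adding 60 bytes of in-flight data pushes the projected usage to 96% and trips it. The 1:1 mapping is only a stand-in; a real implementation would need to decide how to weight in-memory bytes against their eventual on-disk size.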

Relates #88606 (and IMO merges are a bigger problem)

Metadata

    Labels

    :Distributed Indexing/CRUD: A catch all label for issues around indexing, updating and getting a doc by id. Not search.
    Supportability: Improve our (devs, SREs, support eng, users) ability to troubleshoot/self-service product better.
    Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
