Store root-namespace storage statistics on database

Problem to solve

Today we check storage statistics using a GROUP BY operator on ProjectStatistics, and it's one of the longest-running transactions in production (https://gitlab.com/gitlab-org/gitlab-ce/issues/62488)

We use this information in a public storage-counter API at the group level. Once we start enforcing storage limits, we will need to rely on this query more often.

Also, our billing schema is based on root-namespace aggregation, but this query does not aggregate to the root namespace.
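For illustration, the root-namespace aggregation we need can be sketched in plain Ruby over in-memory rows (the attribute names and the `root_namespace_id` field are assumptions, not the actual schema):

```ruby
# Illustrative only: plain-Ruby approximation of the per-root-namespace
# aggregation the slow GROUP BY query performs. Field names are assumptions.
project_statistics = [
  { root_namespace_id: 1, repository_size: 100, lfs_objects_size: 20 },
  { root_namespace_id: 1, repository_size: 50,  lfs_objects_size: 0 },
  { root_namespace_id: 2, repository_size: 10,  lfs_objects_size: 5 }
]

# Group the per-project rows by root namespace, then sum each size column.
totals = project_statistics
  .group_by { |row| row[:root_namespace_id] }
  .transform_values do |rows|
    {
      repository_size:  rows.sum { |r| r[:repository_size] },
      lfs_objects_size: rows.sum { |r| r[:lfs_objects_size] }
    }
  end
```

Doing this on every API call is expensive at scale, which is why the proposal below precomputes and stores the totals.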

Technical bits

Proposal

  1. Create a new model with the same attributes as ProjectStatistics.*_size. The purpose of this model will be to hold the information in an aggregated form.
  2. Update the statistics in this model asynchronously, to avoid large database transactions. (See the backend section for the technical details.)
  3. Rework !28277 (merged) to make use of this new query - https://gitlab.com/gitlab-org/gitlab-ce/issues/62796
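A minimal sketch of what the aggregated model could look like, using a plain Ruby object in place of the ActiveRecord model (the class name, attribute list, and method names here are hypothetical, not the final schema):

```ruby
# Hypothetical sketch of the new aggregated model. A PORO stands in for the
# ActiveRecord model; attribute names mirror ProjectStatistics.*_size but are
# illustrative assumptions.
class RootNamespaceStorageStatistics
  SIZE_ATTRIBUTES = %i[repository_size lfs_objects_size
                       build_artifacts_size packages_size wiki_size].freeze

  attr_reader :namespace_id, :sizes

  def initialize(namespace_id)
    @namespace_id = namespace_id
    @sizes = SIZE_ATTRIBUTES.to_h { |attr| [attr, 0] }
  end

  # Recalculate totals from the per-project statistics rows belonging to this
  # root namespace (plain hashes stand in for ProjectStatistics records).
  def recalculate!(project_statistics)
    SIZE_ATTRIBUTES.each do |attr|
      @sizes[attr] = project_statistics.sum { |row| row.fetch(attr, 0) }
    end
    self
  end

  # Total storage across all size attributes.
  def storage_size
    @sizes.values.sum
  end
end
```

Storing one such row per root namespace turns the group-level read into a single-row lookup instead of a GROUP BY across all projects.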

Development log

Decisions

Backend implications

Prework

Technical details (%12.1)

  1. Create root_namespace_storage_statistics with all the ProjectStatistics.*_size attributes
  2. Create a second table (namespace_aggregation_schedules) with two columns id and namespace_id.
  3. Whenever the statistics of a project change, we insert a row into namespace_aggregation_schedules
    • We don't insert a new row if one already exists for the namespace.
    • Insertion is done through a callback and a Sidekiq job. We can't do it in the same transaction, because ProjectStatistics is already involved in a large one (https://gitlab.com/gitlab-org/gitlab-ce/issues/62488)
  4. After inserting the row, we schedule a new worker X hours into the future.
  5. This job will:
    • Update the root namespace storage statistics by querying all the namespaces through a service.
    • Delete the related namespace_aggregation_schedules row after the update
  6. We also need to create another Sidekiq job that traverses any remaining rows in namespace_aggregation_schedules and schedules a job for every pending row.
  7. Hide all these changes behind a feature flag
  8. We will read the caching interval from Redis, defaulting to once every 3 hours
  9. We will experiment with tweaking the interval, aiming for a smaller value
  10. When we remove the feature flag, the interval must either be hardcoded or converted to an application setting (to be decided)
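The scheduling and deduplication logic in steps 3–5 can be sketched as an in-memory model, with a Set standing in for the namespace_aggregation_schedules table and an array standing in for Sidekiq's delayed-job queue (all class and method names here are hypothetical):

```ruby
require 'set'

# Illustrative in-memory model of the dedup-insert + delayed-worker flow.
# Not the real implementation: @pending stands in for the
# namespace_aggregation_schedules table, @scheduled for Sidekiq's queue.
class AggregationScheduler
  attr_reader :scheduled

  def initialize
    @pending = Set.new
    @scheduled = []
  end

  # Step 3: called whenever a project's statistics change. Skips insertion if
  # a row already exists for the namespace, so repeated changes within the
  # window schedule only one aggregation.
  def schedule(namespace_id, delay_hours: 3)
    return if @pending.include?(namespace_id)

    @pending.add(namespace_id)
    # Stand-in for Worker.perform_in(delay_hours.hours, namespace_id)
    @scheduled << [namespace_id, delay_hours]
  end

  # Step 5: the delayed worker recalculates the root-namespace statistics
  # (the caller supplies the aggregation as a block) and then deletes the
  # schedule row, allowing future changes to schedule a fresh run.
  def run_worker(namespace_id, &aggregate)
    aggregate.call(namespace_id)
    @pending.delete(namespace_id)
  end

  def pending?(namespace_id)
    @pending.include?(namespace_id)
  end
end
```

The dedup check plus the delayed run gives a simple debounce: at most one aggregation per root namespace per interval, no matter how many project-statistics updates occur in between.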

Merge Requests

Edited by Mayra Cabrera