
Measure per-request/per-worker memory allocation statistics

Problem

Currently, we scrape the amount of memory allocated across all work being executed. This does not provide the granularity needed to understand per-context memory allocations (per-request or per-worker execution).

This is because globally sampled stats are affected by all threads executing concurrently.
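
To illustrate, the naive measurement we can build today diffs a process-global counter around a request; a minimal sketch using the existing `GC.stat` API (the middleware name and response header are illustrative):

```ruby
# Naive per-request measurement on top of a process-global counter.
# GC.stat(:total_allocated_objects) counts allocations from ALL threads,
# so in a multi-threaded server (e.g. Puma) the delta below also includes
# whatever the other threads allocated concurrently.
class NaiveAllocationTracking
  def initialize(app)
    @app = app
  end

  def call(env)
    before = GC.stat(:total_allocated_objects)
    status, headers, body = @app.call(env)
    delta = GC.stat(:total_allocated_objects) - before # polluted by other threads
    headers['X-Allocated-Objects'] = delta.to_s        # illustrative reporting only
    [status, headers, body]
  end
end
```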

Idea

We should have a predictable way to measure how much memory is allocated by each request/worker during its execution, to understand:

  • GC pressure from heap slot allocations
  • the number of malloc calls (e.g. for Strings)
  • the size of malloc calls (e.g. when processing large blobs of data)
  • allocations per second of execution
  • allocation size per second of execution (the per-second rates are sketched below)
  • (maybe) a histogram of memory allocations
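
For instance, the per-second metrics are straightforward derivations from the raw counters measured over one execution; a worked micro-example with hypothetical numbers and field names:

```ruby
# Illustrative derivation of per-second rates (all values hypothetical).
mem_objects = 241_500    # objects allocated during the request
mem_bytes   = 18_734_592 # bytes passed through malloc during the request
duration_s  = 0.183      # wall-clock duration of the request

allocations_per_second = mem_objects / duration_s # ~1.32M objects/s
bytes_per_second       = mem_bytes / duration_s   # ~102 MB/s
```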

Requirements

  • be thread-safe and measure only within the context of a given execution
  • log all counters so they can easily be scraped using ELK (an illustrative log line follows this list)
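
For example, each request's structured log line could carry the counters as top-level fields so ELK can index and aggregate them without extra parsing (field names and values below are illustrative, not a final schema):

```json
{"method":"GET","path":"/api/v4/projects","duration_s":0.183,"mem_objects":241500,"mem_mallocs":8421,"mem_bytes":18734592}
```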

Solution

  • Extend the Ruby VM (and try to upstream the patch) to provide the ability to measure allocations done in a given thread.
  • Expose these counters as part of our logs so they can be scraped.
  • Analyze the data.
  • Provide dashboards taking into account per-request and per-feature_category information about allocations per unit of execution or per unit of time (a consumption sketch follows this list).
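
A minimal sketch of how the Rails side could consume such a counter, assuming the VM patch exposes a hypothetical `Thread#memory_allocations` method returning raw per-thread counters (the method name and payload shape are assumptions, not a confirmed API):

```ruby
# Hypothetical wrapper around the patched-VM API. Thread#memory_allocations
# (its name and returned keys) is an assumption about what the patch exposes.
module MemoryInstrumentation
  def self.with_measurement(logger)
    return yield unless Thread.current.respond_to?(:memory_allocations)

    start  = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    before = Thread.current.memory_allocations # e.g. { mem_objects: ..., mem_bytes: ... }
    result = yield
    after  = Thread.current.memory_allocations
    duration_s = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

    # Per-execution deltas, emitted as structured fields for ELK.
    counters = after.to_h { |key, value| [key, value - before[key]] }
    logger.info(counters.merge(duration_s: duration_s))
    result
  end
end
```

Since the counters live on `Thread.current`, the deltas are unaffected by other threads, which satisfies the thread-safety requirement above.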

What needs to be done?

1. Harder and longer

  • Get the Ruby VM patch merged upstream and wait until we update to Ruby 3.1 or 3.2
  • Update GitLab Rails to use the patch and log the data

2. Likely easier, but requires updating our components (chosen path)

Summary

I decided to take the road of manually patching our stack for the time being, while hopefully getting this merged upstream. Assuming the patch gets merged and we update to Ruby 3.1, we would have this supported out of the box and could simply drop the patch.
