Support tagfiltertree for fast matching metricIDs to queries #4310
Open
shaan420 wants to merge 8 commits into master from snair/tagfiltertree
13cf2c2 Support tagfiltertree for fast matching metricIDs to queries (shaan420)
4c96204 lint fixes (shaan420)
bde1d75 lint fixes (shaan420)
ee6c313 lint fixes (shaan420)
b9368bb review comments (shaan420)
3423225 fix bug in Match() logic (shaan420)
c65aa85 remove golint since it is deprecated. It was causing issues with dete… (shaan420)
41769a2 update README.md (shaan420)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | @@ -186,7 +186,6 @@ linters: | |
| - gci | ||
| - goconst | ||
| - gocritic | ||
| - golint | ||
| - gosimple | ||
| - govet | ||
| - ineffassign | ||
| | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| # Tag Filter Tree | ||
| | ||
| ## Motivation | ||
| There are many cases where we want to match an input metricID against | ||
| a set of tag filter patterns. Iterating through each pattern individually and matching | ||
| it is expensive, since this has to be done for every incoming metricID. | ||
| One example is dropping incoming metricIDs that match any | ||
| of the configured tag filter patterns. Instead of iterating over all configured tag | ||
| filter patterns, they can be structured as a tree, which quickly prunes non-relevant | ||
| tag filter patterns from being matched and also reduces the redundancy of matching | ||
| the same tag filters, shared between several patterns, over and over. | ||
| | ||
| Example: | ||
| Consider a set of tag filter patterns: | ||
| 1. service:foo env:prod* name:num_requests | ||
| 2. service:foo env:staging* name:*num_errors | ||
| | ||
| When an input metricID comes in: | ||
| service:foo env:staging-123 name:login.num_errors | ||
| it would be wasteful to match "service:foo" once for each pattern. Also, once we know that | ||
| "env:staging*" matched, we only need to look at "name:*num_errors" to find a match for the | ||
| input metricID. | ||
| | ||
| ## Usage | ||
| First create a trie using New() and then add tagFilters using AddTagFilter(). | ||
| Collaborator: And then I guess you use Contributor Author: Done! | ||
| The tags within a filter can be specified in any order, but to condense the compiled | ||
| output of the trie, try to specify the most common tags first | ||
| and in the same order across filters. | ||
| For instance, if you have a tag "service" that you expect to be present | ||
| in all filters, specify it first and then specify the remaining tags | ||
| in the filter. | ||
| In other words, order the tag filters within each tag filter pattern by how commonly | ||
| each tag filter is used. | ||
| | ||
| There are two ways to use the Match() API. | ||
| 1. Match All | ||
| When calling AddTagFilter() you can attach data to that tag filter pattern. This data | ||
| can be retrieved during Match(). A single Match() call with an input metricID | ||
| can match multiple tag filter patterns and thus return multiple data items. The 2nd parameter | ||
| passed to Match() is a data slice to hold the matched data. | ||
| | ||
| 2. Match Any | ||
| If we only want to know whether the input metricID matched any of the | ||
| tag filter patterns, and do not care which ones in particular, we can call Match() | ||
| with the 2nd parameter as nil. This results in an even faster code path than the one above. | ||
| | ||
| ```go | ||
| tree := New[*Rule]() | ||
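| for _, rule := range tt.rules { | ||
| rule := rule // per-iteration copy so that &rule is distinct for each rule (needed before Go 1.22) | ||
| for _, tagFilterPattern := range rule.TagFiltersPatterns { | ||
| err := tree.AddTagFilter(tagFilterPattern, &rule) | ||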
| require.NoError(t, err) | ||
| } | ||
| } | ||
| | ||
| ... | ||
| // check match all | ||
| data := make([]*Rule, 0) | ||
| matched, err := tree.Match(inputTags, &data) | ||
| | ||
| ... | ||
| // check match any | ||
| anyMatched, err := tree.Match(inputTags, nil) | ||
| ... | ||
| ``` | ||
| | ||
| ## Caveats | ||
| The trie might return duplicates, and it is up to the caller to de-duplicate the results. | ||
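The caveat above leaves de-duplication to the caller. A minimal sketch of one way to do that is shown here; `Rule` and `dedupe` are hypothetical names introduced for illustration and are not part of this PR. It assumes the attached data is a comparable type such as a pointer, so that repeated matches of the same rule compare equal.

```go
package main

import "fmt"

// Rule is a hypothetical stand-in for whatever data the caller attaches
// to a tag filter pattern; it is not the type used in this PR.
type Rule struct {
	Name string
}

// dedupe returns the input without repeated elements, preserving the order
// of first occurrence. It works for any comparable type, including pointers.
func dedupe[T comparable](in []T) []T {
	seen := make(map[T]struct{}, len(in))
	out := make([]T, 0, len(in))
	for _, v := range in {
		if _, ok := seen[v]; ok {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	a, b := &Rule{Name: "a"}, &Rule{Name: "b"}
	// Simulated match output in which rule "a" was reported twice.
	matched := []*Rule{a, b, a}
	for _, r := range dedupe(matched) {
		fmt.Println(r.Name)
	}
}
```

Pointers are compared by identity here, so two distinct `Rule` values with equal fields would both be kept; whether that is the desired behavior depends on how the caller attaches data.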
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| package tagfiltertree | ||
| | ||
| import "github.com/m3db/m3/src/metrics/filters" | ||
| | ||
| // Options is a set of options for the tag filter tree. | ||
| type Options interface { | ||
| TagFilterOptions() filters.TagsFilterOptions | ||
| SetTagFilterOptions(tf filters.TagsFilterOptions) Options | ||
| } | ||
| | ||
| type options struct { | ||
| tagFilterOptions filters.TagsFilterOptions | ||
| } | ||
| | ||
| // NewOptions creates a new set of options. | ||
| func NewOptions() Options { | ||
| return &options{} | ||
| } | ||
| | ||
| // TagFilterOptions returns the tag filter options. | ||
| func (o *options) TagFilterOptions() filters.TagsFilterOptions { | ||
| return o.tagFilterOptions | ||
| } | ||
| | ||
| // SetTagFilterOptions sets the tag filter options. | ||
| func (o *options) SetTagFilterOptions(tf filters.TagsFilterOptions) Options { | ||
| opts := *o | ||
| opts.tagFilterOptions = tf | ||
| return &opts | ||
| } |
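`SetTagFilterOptions` in the diff above copies the receiver before modifying it, so the original options value is left untouched. The sketch below illustrates that copy-on-set behavior in isolation. It assumes a hypothetical `tagsFilterOptions` struct as a stand-in for `filters.TagsFilterOptions` and returns the concrete type rather than the `Options` interface, purely to keep the example self-contained.

```go
package main

import "fmt"

// tagsFilterOptions is a hypothetical stand-in for filters.TagsFilterOptions.
type tagsFilterOptions struct {
	label string
}

type options struct {
	tagFilterOptions tagsFilterOptions
}

// TagFilterOptions returns the stored tag filter options.
func (o *options) TagFilterOptions() tagsFilterOptions {
	return o.tagFilterOptions
}

// SetTagFilterOptions returns a modified copy and leaves the receiver unchanged.
func (o *options) SetTagFilterOptions(tf tagsFilterOptions) *options {
	opts := *o
	opts.tagFilterOptions = tf
	return &opts
}

func main() {
	base := &options{}
	custom := base.SetTagFilterOptions(tagsFilterOptions{label: "custom"})
	// base keeps its zero value; only the returned copy carries the new setting.
	fmt.Printf("base=%q custom=%q\n",
		base.TagFilterOptions().label, custom.TagFilterOptions().label)
}
```

Because the setter returns a new value, callers have to use the return value; calling it and discarding the result has no effect.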
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,40 @@ | ||
| package tagfiltertree | ||
| | ||
| import "math/bits" | ||
| | ||
| // PointerSet is a set of pointers backed by a bitmap to | ||
| // represent a sparse set of at most 128 pointers. | ||
| type PointerSet struct { | ||
| bits [2]uint64 // Using 2 uint64 gives us 128 bits (0 to 127). | ||
| } | ||
| | ||
| // Set adds a pointer at index i (0 <= i < 128). | ||
| func (ps *PointerSet) Set(i byte) { | ||
| if i < 64 { | ||
| ps.bits[0] |= (1 << i) | ||
| } else { | ||
| ps.bits[1] |= (1 << (i - 64)) | ||
| } | ||
| } | ||
| | ||
| // IsSet checks if a pointer is present at index i. | ||
| func (ps *PointerSet) IsSet(i byte) bool { | ||
| if i < 64 { | ||
| return ps.bits[0]&(1<<i) != 0 | ||
| } | ||
| return ps.bits[1]&(1<<(i-64)) != 0 | ||
| } | ||
| | ||
| // CountSetBitsUntil counts how many bits are set to 1 up to index i (inclusive). | ||
| func (ps *PointerSet) CountSetBitsUntil(i byte) int { | ||
| if i < 64 { | ||
| // Count bits in the first uint64 up to index i. | ||
| return bits.OnesCount64(ps.bits[0] & ((1 << (i + 1)) - 1)) | ||
| } | ||
| | ||
| // Count all bits in the first uint64. | ||
| count := bits.OnesCount64(ps.bits[0]) | ||
| // Count bits in the second uint64 up to index i - 64. | ||
| count += bits.OnesCount64(ps.bits[1] & ((1 << (i - 64 + 1)) - 1)) | ||
| return count | ||
| } |
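A bitmap with a "count set bits up to i" operation is commonly used to index into a densely packed slice: the number of set bits up to and including index i, minus one, is the position of i's entry. The diff shown here does not include the code that uses `PointerSet`, so the following is only a sketch of that general technique. `sparseChildren`, `put` and `get` are hypothetical names, and `PointerSet` is repeated from the diff so the example compiles on its own.

```go
package main

import (
	"fmt"
	"math/bits"
)

// PointerSet is repeated from the diff above so this sketch is self-contained.
type PointerSet struct {
	bits [2]uint64
}

func (ps *PointerSet) Set(i byte) {
	if i < 64 {
		ps.bits[0] |= (1 << i)
	} else {
		ps.bits[1] |= (1 << (i - 64))
	}
}

func (ps *PointerSet) IsSet(i byte) bool {
	if i < 64 {
		return ps.bits[0]&(1<<i) != 0
	}
	return ps.bits[1]&(1<<(i-64)) != 0
}

func (ps *PointerSet) CountSetBitsUntil(i byte) int {
	if i < 64 {
		return bits.OnesCount64(ps.bits[0] & ((1 << (i + 1)) - 1))
	}
	count := bits.OnesCount64(ps.bits[0])
	count += bits.OnesCount64(ps.bits[1] & ((1 << (i - 64 + 1)) - 1))
	return count
}

// sparseChildren is a hypothetical container that stores values only for
// the indices that are present, ordered by index.
type sparseChildren struct {
	present PointerSet
	values  []string
}

// put stores v at logical index i (0 <= i < 128); larger indices are ignored.
func (s *sparseChildren) put(i byte, v string) {
	if i > 127 {
		return
	}
	if s.present.IsSet(i) {
		s.values[s.present.CountSetBitsUntil(i)-1] = v
		return
	}
	s.present.Set(i)
	pos := s.present.CountSetBitsUntil(i) - 1
	// Grow by one and shift the tail right to open a slot at pos.
	s.values = append(s.values, "")
	copy(s.values[pos+1:], s.values[pos:])
	s.values[pos] = v
}

// get returns the value at logical index i, if one was stored.
func (s *sparseChildren) get(i byte) (string, bool) {
	if !s.present.IsSet(i) {
		return "", false
	}
	return s.values[s.present.CountSetBitsUntil(i)-1], true
}

func main() {
	var s sparseChildren
	s.put(5, "five")
	s.put(100, "hundred")
	s.put(2, "two")
	v, ok := s.get(100)
	fmt.Println(v, ok, len(s.values))
}
```

The trade-off in this sketch is that lookups stay constant time while memory is proportional to the number of stored entries rather than to the 128 possible indices; inserts pay for shifting the slice.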
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| package tagfiltertree | ||
| | ||
| import ( | ||
| "math" | ||
| "testing" | ||
| | ||
| "github.com/stretchr/testify/require" | ||
| ) | ||
| | ||
| func TestPointerSetCountBits(t *testing.T) { | ||
| tests := []struct { | ||
| name string | ||
| setBits []uint64 | ||
| expected int | ||
| }{ | ||
| { | ||
| name: "empty set", | ||
| setBits: []uint64{0, 0}, | ||
| expected: 0, | ||
| }, | ||
| { | ||
| name: "single set bit", | ||
| setBits: []uint64{0, 1}, | ||
| expected: 1, | ||
| }, | ||
| { | ||
| name: "multiple set bits", | ||
| setBits: []uint64{7, 7}, | ||
| expected: 6, | ||
| }, | ||
| { | ||
| name: "all set bits", | ||
| setBits: []uint64{math.MaxUint64, math.MaxUint64}, | ||
| expected: 128, | ||
| }, | ||
| } | ||
| | ||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| ps := PointerSet{} | ||
| l := tt.setBits[0] | ||
| r := tt.setBits[1] | ||
| var i byte | ||
| for i = 0; i < 128; i++ { | ||
| if i < 64 { | ||
| if l&0x1 == 1 { | ||
| ps.Set(i) | ||
| } | ||
| l >>= 1 | ||
| } else { | ||
| if r&0x1 == 1 { | ||
| ps.Set(i) | ||
| } | ||
| r >>= 1 | ||
| } | ||
| } | ||
| | ||
| require.Equal(t, tt.expected, ps.CountSetBitsUntil(127)) | ||
| }) | ||
| } | ||
| } |
Why?
golint is deprecated now, and mainly it was not detecting types correctly when generics are used.