@forsaken628
Collaborator

@forsaken628 forsaken628 commented Apr 25, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Added the setting enable_shuffle_sort, disabled by default.

root@localhost:8000/default/default> explain pipeline select * from numbers(10) order by number;

explain pipeline select * from numbers(10) order by number

-[ EXPLAIN ]-----------------------------------
digraph {
    0 [ label = "NumbersSourceTransform" ]
    1 [ label = "Resize" ]
    2 [ label = "SortPartialTransform" ] // block level sort
    3 [ label = "SortPartialTransform" ]
    4 [ label = "TransformSortMergeCollect" ] // collect all block, then sample, may spill
    5 [ label = "TransformSortMergeCollect" ]
    6 [ label = "Resize" ]
    7 [ label = "SortBoundBroadcast" ] // broadcast sample
    8 [ label = "TransformSortRestore" ] // restore block by bound
    9 [ label = "SortBoundEdge" ] // update meta
    10 [ label = "ScatterTransform(SortBound)" ] // scatter
    11 [ label = "TransformScatterExchangeSerializer" ]
    12 [ label = "ExchangeShuffleTransform" ]
    13 [ label = "ExchangeWriterSink" ]
    14 [ label = "DummyTransform" ]
    15 [ label = "ExchangeWriterSink" ]
    16 [ label = "ExchangeSourceReader" ]
    17 [ label = "ExchangeSourceReader" ]
    18 [ label = "TransformExchangeDeserializer" ]
    19 [ label = "TransformExchangeDeserializer" ]
    20 [ label = "TransformExchangeDeserializer" ]
    21 [ label = "BoundedMultiSortMerge" ] // multi stream merge sort
    22 [ label = "DummyTransform" ]
    23 [ label = "ExchangeSourceReader" ]
    24 [ label = "ExchangeSourceReader" ]
    25 [ label = "TransformExchangeDeserializer" ]
    26 [ label = "TransformExchangeDeserializer" ]
    27 [ label = "TransformExchangeDeserializer" ]
    28 [ label = "SortRoute" ] // final merge
    29 [ label = "CompoundBlockOperator(Project)" ]
    0 -> 1 [ label = "" ]
    1 -> 2 [ label = "from: 0, to: 0" ]
    1 -> 3 [ label = "from: 1, to: 0" ]
    2 -> 4 [ label = "" ]
    3 -> 5 [ label = "" ]
    4 -> 6 [ label = "from: 0, to: 0" ]
    5 -> 6 [ label = "from: 0, to: 1" ]
    6 -> 7 [ label = "" ]
    7 -> 8 [ label = "" ]
    8 -> 9 [ label = "" ]
    9 -> 10 [ label = "" ]
    10 -> 11 [ label = "" ]
    11 -> 12 [ label = "" ]
    12 -> 13 [ label = "from: 0, to: 0" ]
    12 -> 14 [ label = "from: 1, to: 0" ]
    12 -> 15 [ label = "from: 2, to: 0" ]
    14 -> 18 [ label = "" ]
    16 -> 19 [ label = "" ]
    17 -> 20 [ label = "" ]
    18 -> 21 [ label = "from: 0, to: 0" ]
    19 -> 21 [ label = "from: 0, to: 1" ]
    20 -> 21 [ label = "from: 0, to: 2" ]
    21 -> 22 [ label = "" ]
    22 -> 25 [ label = "" ]
    23 -> 26 [ label = "" ]
    24 -> 27 [ label = "" ]
    25 -> 28 [ label = "from: 0, to: 0" ]
    26 -> 28 [ label = "from: 0, to: 1" ]
    27 -> 28 [ label = "from: 0, to: 2" ]
    28 -> 29 [ label = "" ]
}

Cut-point selection is complex and will require a lot of work to optimize.
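
For readers unfamiliar with the technique, the stages annotated in the pipeline (block-level sort, sample, restore by bound, scatter, multi-stream merge, final merge) can be illustrated with a small standalone sketch. This is not the Databend implementation: the function names `choose_bounds` and `range_shuffle_sort`, the `samples_per_run` parameter, and the evenly spaced sampling strategy are all assumptions made for illustration, and the real cut-point selection is more involved.

```python
import bisect
import heapq


def choose_bounds(sorted_runs, num_ranges, samples_per_run=4):
    """Pick up to num_ranges - 1 cut points from samples of each sorted run.

    Hypothetical helper: evenly spaced sampling is an assumption, not the
    strategy used by the PR.
    """
    samples = []
    for run in sorted_runs:
        if not run:
            continue
        step = max(1, len(run) // samples_per_run)
        samples.extend(run[::step])
    samples.sort()
    if not samples:
        return []
    bounds = []
    for i in range(1, num_ranges):
        candidate = samples[i * len(samples) // num_ranges]
        # Keep bounds strictly increasing so no range is defined twice.
        if not bounds or candidate > bounds[-1]:
            bounds.append(candidate)
    return bounds


def range_shuffle_sort(partitions, num_ranges):
    """Sort rows spread over several partitions by range shuffling."""
    # 1. Sort every input partition locally (block-level sort).
    runs = [sorted(p) for p in partitions]
    # 2. Sample the sorted runs and derive shared cut points.
    bounds = choose_bounds(runs, num_ranges)
    # 3. Scatter: cut each sorted run at the bounds; slice i holds the
    #    values that fall into range i (values <= bounds[i]).
    buckets = [[] for _ in range(len(bounds) + 1)]
    for run in runs:
        start = 0
        for i, bound in enumerate(bounds):
            end = bisect.bisect_right(run, bound, start)
            buckets[i].append(run[start:end])
            start = end
        buckets[len(bounds)].append(run[start:])
    # 4. Merge the sorted slices of each range (multi-stream merge sort).
    merged = [list(heapq.merge(*slices)) for slices in buckets]
    # 5. Final step: ranges are disjoint and ordered, so concatenate them.
    return [value for rng in merged for value in rng]
```

Because every range only receives values between two neighbouring bounds, each range can be merged independently and the results concatenated in bound order. How evenly the work is balanced depends entirely on how well the bounds are chosen, which is where the cut-point complexity mentioned above comes in.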

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Apr 25, 2025
@forsaken628 forsaken628 added the ci-benchmark Benchmark: run all test label Apr 26, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-17853-4194c2b-1745717419

note: this image tag is only available for internal use.

@forsaken628 forsaken628 added ci-benchmark Benchmark: run all test and removed ci-benchmark Benchmark: run all test labels Apr 27, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-17853-049468d-1745761027

note: this image tag is only available for internal use.

@forsaken628 forsaken628 changed the title feat(query): range shuffle sort feat(query): range shuffle sort for standalone mode Apr 28, 2025
@forsaken628 forsaken628 removed the ci-benchmark Benchmark: run all test label Apr 28, 2025
@forsaken628 forsaken628 marked this pull request as ready for review April 28, 2025 03:00
@forsaken628 forsaken628 requested a review from sundy-li April 28, 2025 07:28
@forsaken628 forsaken628 marked this pull request as draft May 6, 2025 04:19
Signed-off-by: coldWater <forsaken628@gmail.com>
@forsaken628 forsaken628 changed the title feat(query): range shuffle sort for standalone mode feat(query): support shuffle sort Jul 14, 2025
@sundy-li sundy-li requested a review from zhang2014 July 16, 2025 07:34
@forsaken628 forsaken628 marked this pull request as ready for review July 17, 2025 14:16
Member

@sundy-li sundy-li left a comment

LGTM

@sundy-li sundy-li merged commit 378c6cc into databendlabs:main Jul 29, 2025
86 checks passed
@forsaken628 forsaken628 deleted the range-shuffle branch July 30, 2025 02:56
Labels

pr-feature this PR introduces a new feature to the codebase

2 participants