Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query Performance Nga Tran Staff Engineer, InfluxData July 14, 2021
● InfluxData - Staff Engineer ● Tableau/Salesforce (2 years) ○ Sr. Manager of Automatic Statistics ● Vertica RDBMS (over a decade) ○ Engineer of Query Optimizer ○ Director of Engineering (R&D) ● ELCA (4 years)
Outline ● Non-distributed vs Distributed Databases ● Splitting Data to Gain Query Performance ○ Sharding, Partitioning, Encoding, and Sorting ● Impacts of different data setups on Query Performance
Distributed Database Non-Distributed DB: 1-node cluster ● 1 machine ● Data is loaded & then queried on that node Distributed DB: Cluster of many nodes ● Several machines shares the work ● Data is horizontally split between nodes ● Data is queried from all nodes Node Non-Distributed DB Node 1 Node 2 Node n N nodes, each plays the same role and talks to each other Distributed DB Row 1 Row 2 …….. Row a Row a+1 Row a+2 ……….. Row b Row x+1 Row x+2 ……….. Row n
Distributed Database Non-Distributed DB: 1-node cluster ● 1 machine ● Data is loaded & then queried on that node Distributed DB: Cluster of many nodes ● Several machines shares the work ● Data is horizontally split between nodes ● Data is queried from all nodes → How to split data to gain query performance? Node Non-Distributed DB Node 1 Node 2 Node n N nodes, each plays the same role and talks to each other Distributed DB Row 1 Row 2 …….. Row a Row a+1 Row a+2 ……….. Row b Row x+1 Row x+2 ……….. Row n
Splitting Data to Gain Query Performance ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by
Splitting Data to Gain Query Performance ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by → Let us dig into examples
Line_Item o_okey o_date o_pri 1 2021.05.01 2 2 2021.05.01 1 3 2021.05.02 1 4 2021.05.02 3 5 2021.05.02 1 Examples: Two tables Order & Line_Item Order l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 2 pot 20 2021.05.01 2 pan 25 2021.05.04 3 shirt 30 2021.05.10 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 5 kayak 200 2021.05.05 5 lifevest 20 2021.05.02
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 3 shirt 30 2021.05.2 5 kayak 200 2021.05.07 5 lifevest 20 2021.05.02 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 3 shirt 30 2021.05.2 5 lifevest 20 2021.05.02 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 desk 100 2021.05.07 1 mouse 10 2021.05.07 5 kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate (3,1) shirt 30 2021.05.2 (5,1) lifevest 20 2021.05.02 (1, 2) chair 50 2021.05.03 monitor 130 2021.05.03 (1,2) desk 100 2021.05.07 mouse 10 2021.05.07 (5,1) kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate (2,1) pot 20 2021.05.01 (2,1) pan 25 2021.05.04 (4,1) bike 120 2021.05.04 (4,1) helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
Impacts of the setups on query performance
Examples: Query SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
Examples: Query - Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 3 shirt 30 2021.05.2 5 kayak 200 2021.05.07 5 lifevest 20 2021.05.02 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Back to Shard setup Node 1 Node 2 Order Line_Item Line_Item Order
Examples: Query - Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date, o_pri ORDER BY revenue desc, o_date; YES ● Join: l_okey = o_key ○ → all odd keys in node 1 and even keys in node 2 ○ → Node 1 and node 2 join data on their local node. No need to shuffle data between nodes before joining. ● Group By: l_okey, o_date, o_pri ○ → Similarly, same group-by keys are in the same nodes. Each node can aggregate data without the need to reshuffle data
Examples: Query - Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey?
Examples: Query - Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey? ● Join: l_okey = o_key ○ → Need to reshuffle data so same join keys land on the same nodes before joining. Many ways: ■ Reshard on the fly both Order on o_okey and Line_Item on l_okey ■ Broadcast small table (o_okey) to other nodes ● Group By: l_okey, o_date, o_pri ○ → If after the join the data is shared on l_okey, nothing is needed. Otherwise, either: ■ Reshard data on l_okey to 2 nodes ■ Send everything to one node to do the final group-by
Examples: Query - Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey? ● → Not sharded on join keys will lead to extra on-the-fly reshard or broadcast cost ● → Not already (re-)sharded on group-by keys before the group-by operator will lead to either ○ Reshard or ○ The final node has to do all the group-by work
Examples: Query - Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 3 shirt 30 2021.05.2 5 lifevest 20 2021.05.02 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 desk 100 2021.05.07 1 mouse 10 2021.05.07 5 kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Back to Partition Setup Node 1 Node 2 Order Line_Item Line_Item Order
Examples: Query - Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; Yes ● Filter: o_date < 2021.05.02 and l_shipdate > 2021.05.03 ○ → Prune partitions not in the filter ranges
Examples: Query - Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate?
Examples: Query - Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate? ● → nothing to prune early, we have to scan all column data and apply the filter ranges
Examples: Query - Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
Sharded : Order: (o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate (3,1) shirt 30 2021.05.2 (5,1) lifevest 20 2021.05.02 (1, 2) chair 50 2021.05.03 monitor 130 2021.05.03 (1,2) desk 100 2021.05.07 mouse 10 2021.05.07 (5,1) kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate (2,1) pot 20 2021.05.01 (2,1) pan 25 2021.05.04 (4,1) bike 120 2021.05.04 (4,1) helmet 30 2021.05.10 Back to Encoding and Sorting Setup Node 1 Node 2 Order Line_Item Line_Item Order
Examples: Query - Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; Yes ● Join: l_okey = o_key ○ → use fast & more memory efficient merge join because data already sorted on the join keys ○ → l_okey can be kept in RLE during join ● Group By: l_okey, o_date,o_pri ○ → Group-by key is sorted and no need doing hash groupby, simply group data as we get new batches until we reach higher value
Examples: Query - Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey?
Examples: Query - Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey? ● → use hash join instead (usually slower and requires more memory than merge join) ● → use hash-group-by method (similarly, usually slower and requires more memory than pipe-lined group-by) ● → If there are only a few line items per order, the RLE won’t save much space
Database Designer: ● Topic for another talk ● Startup: Ottertune https://ottertune.com ○ Database Optimization on Autopilot How to design sharding, partitioning, encoding, and sorting for a combination of queries?
So what we have demonstrated today? ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by → Can you think of examples for the cases we have not covered?
Thank you

Impacts of Sharding, Partitioning, Encoding, and Sorting on Distributed Query Performance

  • 1.
    Impacts of Sharding,Partitioning, Encoding, and Sorting on Distributed Query Performance Nga Tran Staff Engineer, InfluxData July 14, 2021
  • 2.
    ● InfluxData -Staff Engineer ● Tableau/Salesforce (2 years) ○ Sr. Manager of Automatic Statistics ● Vertica RDBMS (over a decade) ○ Engineer of Query Optimizer ○ Director of Engineering (R&D) ● ELCA (4 years)
  • 3.
    Outline ● Non-distributed vsDistributed Databases ● Splitting Data to Gain Query Performance ○ Sharding, Partitioning, Encoding, and Sorting ● Impacts of different data setups on Query Performance
  • 4.
    Distributed Database Non-Distributed DB:1-node cluster ● 1 machine ● Data is loaded & then queried on that node Distributed DB: Cluster of many nodes ● Several machines shares the work ● Data is horizontally split between nodes ● Data is queried from all nodes Node Non-Distributed DB Node 1 Node 2 Node n N nodes, each plays the same role and talks to each other Distributed DB Row 1 Row 2 …….. Row a Row a+1 Row a+2 ……….. Row b Row x+1 Row x+2 ……….. Row n
  • 5.
    Distributed Database Non-Distributed DB:1-node cluster ● 1 machine ● Data is loaded & then queried on that node Distributed DB: Cluster of many nodes ● Several machines shares the work ● Data is horizontally split between nodes ● Data is queried from all nodes → How to split data to gain query performance? Node Non-Distributed DB Node 1 Node 2 Node n N nodes, each plays the same role and talks to each other Distributed DB Row 1 Row 2 …….. Row a Row a+1 Row a+2 ……….. Row b Row x+1 Row x+2 ……….. Row n
  • 6.
    Splitting Data toGain Query Performance ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by
  • 7.
    Splitting Data toGain Query Performance ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by → Let us dig into examples
  • 8.
    Line_Item o_okey o_date o_pri 12021.05.01 2 2 2021.05.01 1 3 2021.05.02 1 4 2021.05.02 3 5 2021.05.02 1 Examples: Two tables Order & Line_Item Order l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 2 pot 20 2021.05.01 2 pan 25 2021.05.04 3 shirt 30 2021.05.10 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 5 kayak 200 2021.05.05 5 lifevest 20 2021.05.02
  • 9.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 3 shirt 30 2021.05.2 5 kayak 200 2021.05.07 5 lifevest 20 2021.05.02 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
  • 10.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 3 shirt 30 2021.05.2 5 lifevest 20 2021.05.02 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 desk 100 2021.05.07 1 mouse 10 2021.05.07 5 kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
  • 11.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate (3,1) shirt 30 2021.05.2 (5,1) lifevest 20 2021.05.02 (1, 2) chair 50 2021.05.03 monitor 130 2021.05.03 (1,2) desk 100 2021.05.07 mouse 10 2021.05.07 (5,1) kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate (2,1) pot 20 2021.05.01 (2,1) pan 25 2021.05.04 (4,1) bike 120 2021.05.04 (4,1) helmet 30 2021.05.10 Examples: 2-node cluster Node 1 Node 2 Order Line_Item Line_Item Order
  • 12.
    Impacts of thesetups on query performance
  • 13.
    Examples: Query SELECT l_okey, sum(l_price)as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 14.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 15.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 1 desk 100 2021.05.07 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 mouse 10 2021.05.07 3 shirt 30 2021.05.2 5 kayak 200 2021.05.07 5 lifevest 20 2021.05.02 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Back to Shard setup Node 1 Node 2 Order Line_Item Line_Item Order
  • 16.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date, o_pri ORDER BY revenue desc, o_date; YES ● Join: l_okey = o_key ○ → all odd keys in node 1 and even keys in node 2 ○ → Node 1 and node 2 join data on their local node. No need to shuffle data between nodes before joining. ● Group By: l_okey, o_date, o_pri ○ → Similarly, same group-by keys are in the same nodes. Each node can aggregate data without the need to reshuffle data
  • 17.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey?
  • 18.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey? ● Join: l_okey = o_key ○ → Need to reshuffle data so same join keys land on the same nodes before joining. Many ways: ■ Reshard on the fly both Order on o_okey and Line_Item on l_okey ■ Broadcast small table (o_okey) to other nodes ● Group By: l_okey, o_date, o_pri ○ → If after the join the data is shared on l_okey, nothing is needed. Otherwise, either: ■ Reshard data on l_okey to 2 nodes ■ Send everything to one node to do the final group-by
  • 19.
    Examples: Query -Do the shards help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_key, o_date, o_pri ORDER BY revenue desc, o_date; What if Order not sharded on o_okey & Line_item not sharded on l_okey? ● → Not sharded on join keys will lead to extra on-the-fly reshard or broadcast cost ● → Not already (re-)sharded on group-by keys before the group-by operator will lead to either ○ Reshard or ○ The final node has to do all the group-by work
  • 20.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 21.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate 3 shirt 30 2021.05.2 5 lifevest 20 2021.05.02 1 chair 50 2021.05.03 1 monitor 130 2021.05.03 1 desk 100 2021.05.07 1 mouse 10 2021.05.07 5 kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate 2 pot 20 2021.05.01 2 pan 25 2021.05.04 4 bike 120 2021.05.04 4 helmet 30 2021.05.10 Back to Partition Setup Node 1 Node 2 Order Line_Item Line_Item Order
  • 22.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; Yes ● Filter: o_date < 2021.05.02 and l_shipdate > 2021.05.03 ○ → Prune partitions not in the filter ranges
  • 23.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate?
  • 24.
    Examples: Query -Do the partitions help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not partitioned on o_date and Line_Item not partitioned on l_shipdate? ● → nothing to prune early, we have to scan all column data and apply the filter ranges
  • 25.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date;
  • 26.
    Sharded : Order:(o_okey % 2) & Line_Item: (l_okey % 2) Partitioned : Order: (o_date) & Line_Item: (l_shipdate) Encoded & Sorted : Order: (o_okey) & Line_Item: RLE(l_okey) o_okey o_date o_pri 1 2021.05.01 2 3 2021.05.01 1 5 2021.05.02 1 l_okey l_name l_price l_shipdate (3,1) shirt 30 2021.05.2 (5,1) lifevest 20 2021.05.02 (1, 2) chair 50 2021.05.03 monitor 130 2021.05.03 (1,2) desk 100 2021.05.07 mouse 10 2021.05.07 (5,1) kayak 200 2021.05.07 o_okey o_date o_pri 2 2021.05.01 1 4 2021.05.02 3 l_okey l_name l_price l_shipdate (2,1) pot 20 2021.05.01 (2,1) pan 25 2021.05.04 (4,1) bike 120 2021.05.04 (4,1) helmet 30 2021.05.10 Back to Encoding and Sorting Setup Node 1 Node 2 Order Line_Item Line_Item Order
  • 27.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; Yes ● Join: l_okey = o_key ○ → use fast & more memory efficient merge join because data already sorted on the join keys ○ → l_okey can be kept in RLE during join ● Group By: l_okey, o_date,o_pri ○ → Group-by key is sorted and no need doing hash groupby, simply group data as we get new batches until we reach higher value
  • 28.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey?
  • 29.
    Examples: Query -Do the encoding & sorting help? SELECT l_okey, sum(l_price) as revenue, o_date, o_pri FROM customer, orders, lineitem WHERE l_okey = o_key and o_date < 2021.05.02 and l_shipdate > 2021.05.03 GROUP BY l_okey, o_date,o_pri ORDER BY revenue desc, o_date; What if Order is not sorted on o_okey and Line_Item is not RLE on l_okey? ● → use hash join instead (usually slower and requires more memory than merge join) ● → use hash-group-by method (similarly, usually slower and requires more memory than pipe-lined group-by) ● → If there are only a few line items per order, the RLE won’t save much space
  • 30.
    Database Designer: ● Topicfor another talk ● Startup: Ottertune https://ottertune.com ○ Database Optimization on Autopilot How to design sharding, partitioning, encoding, and sorting for a combination of queries?
  • 31.
    So what wehave demonstrated today? ● Sharding ○ Horizontally split a table into N non-overlapping shards ■ → each node will (equally) share 1/n of the workload: ● Load 1/n data to each node ● Query: join & group-by on each node share 1/n workload ● Partitioning ○ Each shard is further split into smaller partitions for better data filtering, deleting, fanning out, local parallelism ● Encoding ○ Each column is encoded (sorted & compressed) to further help on join, filtering, group-by, order-by → Can you think of examples for the cases we have not covered?
  • 32.