David J. DeWitt MIT Download slides and donate to a great cause: BrentOzar.com/go/dewitt SQL Query Optimization: Why Is It So Hard To Get Right?
How About a Quiz to Start! 2 • Who painted this picture? o Mondrian? o Picasso? o Ingres? • Actually it was the SQL Server query optimizer!! o Plan space for TPC-H query 8 as the parameter values for Acct-Bal and ExtendedPrice are varied o Each color represents a different query plan o Yikes! P1 P2 P3 P4 SQL Server
Today … I am going to talk about SQL query optimization My hope is that you will leave understanding why all database systems sometimes produce really bad plans Starting with the fundamental principals And why the move to the Cloud could be be a game changer
Anonymous Quote 4
The Role of the Query Optimizer (100,000 ft view) Query Optimizer SQL Statement Awesome Query Plan Magic Happens
What’s the Magic? Select o_year, sum(case when nation = 'BRAZIL' then volume else 0 end) / sum(volume) from ( select YEAR(O_ORDERDATE) as o_year, L_EXTENDEDPRICE * (1 - L_DISCOUNT) as volume, n2.N_NAME as nation from PART, SUPPLIER, LINEITEM, ORDERS, CUSTOMER, NATION n1, NATION n2, REGION where P_PARTKEY = L_PARTKEY and S_SUPPKEY = L_SUPPKEY and L_ORDERKEY = O_ORDERKEY and O_CUSTKEY = C_CUSTKEY and C_NATIONKEY = n1.N_NATIONKEY and n1.N_REGIONKEY = R_REGIONKEY and R_NAME = 'AMERICA‘ and S_NATIONKEY = n2.N_NATIONKEY and O_ORDERDATE between '1995-01-01' and '1996-12-31' and P_TYPE = 'ECONOMY ANODIZED STEEL' and S_ACCTBAL <= constant-1 and L_EXTENDEDPRICE <= constant-2 ) as all_nations group by o_year order by o_year Consider Query 8 of the TPC-H benchmark: Plan 1 Plan 2 Plan 3 Plan 4 Plan 5 … There about 22 million alternative ways of executing this query! A very big haystack to be searching through The QO must select a plan that runs in seconds or minutes, not days or weeks! Should not take hours or days to pick a plan!
Some Historical Background • Cost-based query optimization was invented by Pat Selinger as part of the IBM System R project in the late 1970s (System R became DB2) • Remains the hardest part of building a DBMS 30+ years later o Progress is hindered by fear of regressions o Far too frequently the QO picks an inefficient plan • Situation further complicated by advances in hardware and the rest of the DBMS software o Hardware is 1000X bigger and faster o DB software is 10X faster o Queries over huge amounts of data are possible IF the QO picks the right plan 7 Hardware Software Queries 1000X 10X Huge!
Database System More Precisely: The Role of the Query Optimizer 8 Transform SQL queries into an efficient execution plan Query Execution Engine Query OptimizerParserSQL Query Logical operator tree Physical operator tree Logical operators: what they do e.g., union, selection, project, join, grouping Physical operators: how they do it e.g., nested loop join, sort-merge join, hash join, index join
A First Example 9 Query Execution Engine Query Optimizer Parser SELECT Average(Rating) FROM Reviews WHERE MID = 932 Reviews Date CID MID Rating … … … … Logical operator tree Avg (Rating) Select MID = 932 Reviews Query Plan #1 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews Query Plan #2 or
Query Plan #1 • Plan starts by scanning the entire Reviews table o # of disk I/Os will be equal to the # of pages in the Reviews table o I/Os will be sequential. Each I/O will require about 0.1 milliseconds (0.0001 seconds) • Filter predicate “MID = 932” is applied to all rows • Only rows that satisfy the predicate are passed on to the average computation 10 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932
Query Plan #2 • MID index is used to retrieve only those rows whose MID field (attribute) is equal to 932 o Since index is not “clustered”, about one disk I/O will be performed for each row o Each disk I/O will require a random seek and will take about 3 milliseconds (ms) • Retrieved rows will be passed to the average computation 11 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews
Which Plan Will be Faster? • Query optimizer must pick between the two plans by estimating the cost of each • To estimate the cost of a plan, the QO must: o Estimate the selectivity of the predicate MID=932 o Calculate the cost of both plans in terms of CPU time and I/O time • The QO uses statistics about each table to make these estimates • The “best” plan depends on how many reviews there are for movie with MID = 932 12 Query Plan #1 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews Query Plan #2 Vs. How many reviews for the movie with MID = 932 will there be? Best Query Plan or? ??
A Slightly More Complex Query • Consider the query: • Optimizer might first enumerate three physical plans: 13 Filter Rating > 9 Sequential Scan Reviews Filter 7/1 < Date > 7/31 Rating Index Filter 7/1 < Date < 7/31 Index Lookup Rating > 9 Reviews Filter Rating > 9 Index Lookup 7/1 < Date > 7/31 Reviews Date Index SF = .01 SF = .01 SF = .10 SF = .10 Cost = 11 seconds Cost = 100 seconds Cost = 25 seconds • Then, estimate selectivity factors • Then, calculate total cost • Finally, pick the plan with the lowest cost SELECT * FROM Reviews WHERE 7/1< date < 7/31 AND rating > 9
Enumerate logically equivalent plans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4
Equivalence Rules 15 Select and join operators commute with each other Select Select Customers Select Select Customers Join Customers Reviews Join Reviews Customers Join Customers Reviews Join Movies Join Customers Join Reviews Movies Join operators are associative
Equivalence Rules (cont.) 16 Project [CID, Name] Customers Project [Name] Project operators cascade Project [Name] Customers Select operator distributes over joins Select Join Customers Reviews Select Join Customers Reviews
Example of Equivalent Logical Plans SELECT M.Title, M.Director FROM Movies M, Reviews R, Customers C WHERE C.City = “N.Y.” AND R.Rating > 7 AND M.MID = R.MID AND C.CID = R.CID • One possible logical plan: 17 Join SelectC.City = “N.Y” Select R.Rating > 7 JoinC.CID = R.CID R.MID = M.MID Customers Reviews Project M.Title, M.Director Movies MID Title Director Earnings 1 2 … CID Name Address City 5 11 … Date CID MID Rating 7/3 11 2 8 7/3 5 2 4 … Find titles and director names of movies with a rating > 7 from customers residing in NYC Customers Reviews Movies
Five Logically “Equivalent” Plans 18 Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Join Customers Reviews Join Movies Select Project The “original” plan Selects distribute over joins rule Join Customers Reviews Join Movies Select Project Select Selects commute rule
Four More! 19 Select Select Join Customers Reviews Project Join Movies The “original” plan Select CustomersSelect Reviews Project Join Movies Join Select Customers Select Reviews Project Join Movies Join Select CustomersSelect Reviews Project Join Movies Join Select Reviews Join Movies Customers Project Join Select Join commutativity rule Select commutativity rule
9 Logically Equivalent Plans, In Total 20 Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Join Customers Reviews Join Movies Select Project Select Customers Select Reviews Project Join Movies Join Select Customers Select Reviews Project Join Movies Join Select Reviews Join Movies Customers Project Join Select Select CustomersSelect Reviews Project Join Movies Join Join Customers Reviews Join Movies Select Project Select  All 9 logical plans will produce the same result  For each of these 9 plans there is a large number of alternative physical plans that the optimizer can choose from
Enumerate logically equivalent plans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓
Physical Plan Example • Assume that the optimizer has: o Three join strategies that it can select from: • nested loops (NL), sort-merge join (SMJ), and hash join (HJ) o Two selection strategies: • sequential scan (SS) and index scan (IS) • Consider JUST ONE of the 9 logical plans • Here is one possible physical plan 22 Select Select Join Customers Reviews Project Join Movies SS IS HJ Customers Reviews Project NL Movies Sequential Scan Index Scan Hash Join Nested loops join • There are actually 36 possible physical alternatives for this single logical plan. (I was too lazy to draw pictures of all 36). • With 9 equivalent logical plans, there are 324 = (9 * 36) physical plans that the optimizer must enumerate and cost as part of the search for the best execution plan for the query And this was a VERY simple query! • Later we will look at how dynamic programming is used to explore the space of logical and physical plans w/o enumerating the entire plan space
Enumerate logically equivalent plans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans. • Estimate the selectivity factor and output cardinality of each predicate • Estimate the cost of each operator Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓ ✓
Selectivity Estimation • Task of estimating how many rows will satisfy a predicate such as Movies.MID=932 • Plan quality is highly dependent on quality of the estimates that the query optimizer makes 24 0 1 2 3 4 5 • Histograms are the standard technique used to estimate selectivity factors for predicates on a single table • Many different flavors: o Equi-Width o Equi-Height o Max-Diff o …
5 52 83 6 10 157 125 17 55 37 56 38 19 48 56 83 43 37 5 7 0 20 40 60 80 100 120 140 160 180 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Histogram Motivation 25 # of Reviews for each customer (total of 939 rows) Customer ID (CID) values in Reviews Table Some examples: #1) Predicate: CID = 9 Actual Sel. Factor = 55/939 = .059 #2) Predicate: 2 <= CID <= 3 Actual Sel. Factor = 135/939 = .144 In general, there is not enough space in the catalogs to store summary statistics for each distinct attribute value The solution: histograms
Equi-Width Histogram Example 26 CID Values Count Count 1-4 17-2013-169-125-8 Equi-width histogram Yikes! 8X error!! 0 20 40 60 80 100 120 140 160 180 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 146 309 186 206 92 0 50 100 150 200 250 300 350 All buckets cover roughly the same key range Example #1: Predicate: CID = 9 Actual Sel. Factor = 55/939= .059 Estimated Sel. Factor = (186/4)/939 = .050 Example #2: Predicate: CID = 5 Actual Sel. Factor = 10/939 = .011 Estimated Sel. Factor = (309/4)/993 =.082
156 157 142 148 161 175 0 50 100 150 200 Equi-Height Histograms 27 Count Count Equi-height histogram Divide ranges so that all buckets contain roughly the same number of values 1-5 16-2012-159-117-86 0 20 40 60 80 100 120 140 160 180 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Example #2: Predicate: CID = 6 Actual Sel. Factor = 157/939 = .167 Estimated Sel. Factor = (157/1)/993 = .167 Example #2: Predicate: CID = 6 Actual Sel. Factor = 157/939 = .167 Estimated Sel. Factor = (309/4)/993 = .082 Example #1: Predicate: CID = 5 Actual Sel. Factor = 10/939 = .011 Estimated Sel. Factor = (309/4)/993 =.082 Example #1: Predicate: CID = 5 Actual Sel. Factor = 10/939 = .011 Estimated Sel. Factor = (156/5)/993 = .033 Equi-width vs. Equi-Height 28 1-4 17-2013-169-125-8 Equi-width Equi-height 156 157 142 148 161 175 0 50 100 150 200 146 309 186 206 92 0 50 100 150 200 250 300 350 1-5 16-2012-159-117-86
Histogram Summary • Histograms are a critical tool for estimating selectivity factors for selection predicates 29 Errors still occur, however! • Other statistics stored by the DBMS for each table include # of rows, # of pages, …
Enumerate logically equivalent plans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans. • Estimate the selectivity factor and output cardinality of each predicate • Estimate the cost of each operator Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓ ✓
Estimating Costs • Two key costs that the optimizer considers: o I/O time – cost of reading pages from mass storage o CPU time – cost of applying predicates and operating on tuples in memory • Actual values are highly dependent on CPU and I/O subsystem on which the query will be run o Further complicating the job of the query optimizer • For a parallel database system such as SQL DW, the cost of redistributing/shuffling rows must also be considered 31 vs.
An Example • Query: o SELECT Avg(Rating) FROM Reviews WHERE MID = 932 • Two physical query plans: 32 Reviews Date CID MID Rating Plan #1 Avg_agg [Cnt, Sum] Sequential Scan Reviews Filter MID = 932 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews Plan #2 Which plan is cheaper ???
Plan #1 33 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932 • Filter is applied to 10M rows • The optimizer estimates that 100 rows will satisfy the predicate • Table is 100K pages with 100 rows/page • Sorted on date • Average computation is applied to 100 rows • Reviews is scanned sequentially at 100 MB/second • I/O time of scan is 8 seconds • At 0.1 microseconds/row, filter consumes 1 second of CPU time • At 0.1 microseconds/row, avg consumes .00001 seconds of CPU time Optimizer estimates total execution time of 9 seconds
Plan #2 34 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews • 100 rows are estimated to satisfy the predicate • Average computation is applied to 100 rows • At 0.1 microseconds/row, average consumes .00001 seconds of CPU time • 100 rows are retrieved using the MID index • Since table is sorted on date field (and not MID field), each I/O requires a random disk I/O – about .003 seconds per disk I/O • I/O time will be .3 seconds Optimizer estimates total execution time of 0.3 seconds The estimate for Plan #1 was 9 seconds, so Plan #2 is clearly the better choice
But … • What if the estimate of the number of rows that satisfy the predicate MID = 932 is WRONG? o E.g. 10,000 rows instead of 100 rows 35 0.01 0.1 1 10 100 1000 10 100 1,000 10,000 100,000 Time(#sec) # of rows Sequential Scan Non-Clustered Index Non-clustered Index is better here Sequential scan is better here
Estimating Join Costs • Three basic join methods: o Nested-loops join o Sort-merge join o Hash-join • Very different performance characteristics • Critical for optimizer to carefully pick which method to use when 36 Join SelectC.City = “NY” Select R.Rating > 7 JoinC.CID = R.CID R.MID = M.MID Customers Reviews ProjectM.Title, M.Director Movies
Sort-Merge Join Algorithm Sort Reviews on MID column (unless already sorted) Sort Movies on MID column (unless already sorted) “Merge” two sorted tables: Scan each table sequential in tandem { For current row r of Reviews For current row m of Movies if r.MID = m.MID produce output row Advance r and m cursors } 37 Cost = |R| + |M| I/Os Merge Join Sort Sort Reviews (|R| pages) Movies (|M| pages) Reviews.MID = Movies.MID Cost = 4 * |M| I/Os Total I/O cost = 5*|R| + 5*|M| I/Os Cost = 4 * |R| I/Os Main Idea: Sort R and M on the join column (MID), then scan them to do a ``merge’’ (on join column), and output result tuples.
Nested-Loops Join For each page Ri, 1≤ i ≤ |R|, of Reviews { Read page Ri from disk For each Mj, 1≤ j ≤ |M|, of Movies { Read page Mj from disk For all rows r on page Ri { For all rows m on page Mj { if r.MID = m.MID produce output row } } } } 38 I/O Cost = |R| + |R| * |M| Nested Loops Join Movies (|M| pages) Reviews (|R| pages) Reviews.MID = Movies.MID Main Idea: Scan R, and for each tuple in R probe tuples in M (by scanning it). Output result tuples.
Main Idea: Scan R, and for each tuple in R probe tuples in M (by probing its index). Output result tuples. Index-Nested Loops For each page Ri, 1≤ i ≤ |R|, of Reviews { Read page Ri from disk For all rows r on page Ri { Use MID index on Movies to fetch rows with MID attributes = r.MID Form output row for each returned row } } 39 Movies (|M| pages) Nested Loops Join Reviews Reviews.MID = Movies.MID Index Lookup using r.MID MID Index (|R| pages) Sorted on date column Cost = |R| + |R| * (||R||/|R|) * 2 • 2 I/Os: 1 index I/O + 1 movie I/O as Reviews table is sorted on date column • ||R|| is # of rows in R • ||R||/|R| gives the average number of rows of R per page Notice that since Reviews is ordered on the Date column (and not MID), so each row of the Movies table retrieved incurs two random disk I/Os: • one to the index and • one to the table
Estimating Result Cardinalities • Consider the query SELECT * FROM Reviews WHERE 7/1 < date < 7/31 AND rating > 9 • Assume Reviews has 1M rows • Assume following selectivity factors: 40 Sel. Factor # of qualifying rows 7/1 < date < 7/31 0.1 100,000 Review > 9 0.01 10,000 • How many output rows will the query produce? o If predicates are not correlated • .1 * .01 * 1M = 1,000 rows o If predicates are correlated could be as high as • .1 * 1M = 100,000 rows Why does this matter?
1 10 100 1000 10000 0.000001 0.00001 0.0001 0.001 0.01 0.1 1 Time(#sec) Selectivity factor of predicate on Reviews table Nested Loops Sort Merge Index NL This is Why! Assume that: • Reviews table is 10,000 pages with 80 rows/page • Movies table is 2,000 pages • The primary index on Movies is on the MID column 41 Join R.MID = M.MID Select Reviews Project Movies Rating > 9 and 7/1 < date < 7/31 The consequences of incorrectly estimating the selectivity of the predicate on Reviews can be HUGE INL N L SM Note that each join algorithm has a region where it provides the best performance
Multidimensional Histograms • Used to capture correlation between attributes • A 2-D example 42 0 50 100 150 200 250 300 350 400 450 500 151 198 229 152 156 303 314 361 392 315 319 466 191 238 269 192 196 343 211 258 289 212 216 363 97 144 175 98 102 249
A Little Bit About Estimating Join Cardinalities • Question: Given a join of R and S, what is the range of possible result sizes (in #of tuples)? o Suppose the join is on a key for R and S Students(sid, sname, did), Dorm(did,d.addr) Select S.sid, D.address From Students S, Dorms D Where S.did = D.did What is the cardinality? 43 A student can only live in at most 1 dorm: • each S tuple can match with at most 1 D tuple • cardinality (S join D) = cardinality of S
• General case: join on {A} (where {A} is key for neither) o estimate each tuple r of R generates uniform number of matches in S and each tuple s of S generates uniform number of matches in R, e.g. SF = min(||R|| * ||S|| / NKeys(A,S), ||S|| * ||R|| / NKeys(A,R)) e.g., SELECT M.title, R.title FROM Movies M, Reviews R WHERE M.title = R.title Movies: 100 tuples, 75 unique titles  1.3 rows for each title Reviews: 20 tuples, 10 unique titles  2 rows for each title Estimating Join Cardinality = 100*20/10 = 200 = 20*100/75 = 26.6
Enumerate logically equivalent plans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans. • Estimate the selectivity factor and output cardinality of each predicate • Estimate the cost of each operator Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓ ✓ Enumerate How big is the plan space for a query involving N tables? enumerate It turns out that the answer depends on the “shape” of the query
Two Common Query “Shapes” 46 A B Join Join Join Join C D F “Star” Join Queries A B C D FJoin JoinJoin Join “Chain” Join Queries Number of logically equivalent alternatives # of Tables Star Chain 2 2 2 4 48 40 5 384 224 6 3,840 1,344 8 645,120 54,912 10 18,579,450 2,489,344 In practice, “typical” queries fall somewhere between these two extremes
Pruning the Plan Space • Consider only left-deep query plans to reduce the search space 47 A B C Join Join Join Join E D Left Deep Join Join Join Join ED A B C Bushy Star Join Queries Chain Join Queries # of Tables Bushy Left-Deep Bushy Left Deep 2 2 2 2 2 4 48 12 40 8 5 384 48 224 16 6 3,840 240 1,344 32 8 645,120 10,080 54,912 128 10 18,579,450 725,760 2,489,344 512 These are counts of logical plans only! With: i) 3 join methods ii) n joins in a query There will be 3 n physical plans for each logical planExample: For a left-deep, 8 table star join query there will be: i) 10,080 different logical plans ii) 22,044,960 different physical plans!! Solution: Use some form of dynamic programming (either bottom up or top down) to search the plan space heuristically Sometimes these heuristics will cause the best plan to be missed!!
• Optimization is performed in N passes (if N relations are joined): o Pass 1: Find the best (lowest cost) 1-relation plan for each relation. o Pass 2: Find the best way to join the result of each 1-relation plan (as the outer/left table) to another relation (as the inner/right table) to generate all 2-relation plans. o Pass N: Find best way to join result of a (N-1)-relation plan (as outer) to the N’th relation to generate all N-relation plans. • At each pass, for each subset of relations, prune all plans except those o Lowest cost plan overall, plus o Lowest cost plan for each interesting order of the rows • Order by, group by, aggregates etc. handled as the final step Bottom-Up QO Using Dynamic Programming In spite of pruning plan space, this approach is still exponential in the # of tables. Interesting orders include orders that facilitate the execution of joins, aggregates, and order by clauses subsequently by the query
49 A A SS A IS B B SS C C SS C IS D D SS D IS27 387313 42 9518 All single relation plans All tables First, generate all single relation plans: A Select Join Join C Select Join D B Select An Example: Legend: SS – sequential scan IS – index scan – cost5 Prune
50 B SS 73 A SS A IS 2713 D SS42 C IS 18 All single relation plans after pruning Then, All Two Relation Plans
Two Relation Plans Starting With A 51 B SS 73 A IS 27 A SS13 D SS42 C IS 18 A Select Join Join C Select Join D B Select A SS B SS NLJ A IS B SS NLJ A IS B SS SMJ A SS B SS SMJJoin Select A B A.a = B.a 1013 822315 293 Single relation plans Prune Let’s assume there are 2 alternative join methods for the QO to select from: 1. NLJ = Nested Loops Join 2. SMJ = Sort Merge Join
Two Relation Plans Starting With B 52 B SS A SS NLJ B SS A SS SMJ B SS NLJ A IS B SS SMJ A IS Select D B JoinB.b = D.b B SS D SS NLJ B SS D SS SMJ NLJ B SS C IS B SS SMJ C IS A Select Join Join C Select Join D B Select 1013 315 756 293 1520 432 2321 932 Single relation plansB SS 73 A IS 27 A SS13 D SS42 C IS 18 Prune
Two Relation Plans Starting With C 53 Select C B JoinB.C = C.c NLJ B SS C IS B SS SMJ C IS A Select Join Join C Select Join D B Select 6520 932 Single relation plansB SS 73 A IS 27 A SS13 D SS42 C IS 18 Prune
Two Relation Plans Starting With D 54 Select D B JoinB.b = D.b D SS B SS NLJ D SS B SS SMJ A Select Join Join C Select Join D B Select 1520 432 Single relation plans B SS 73 A IS 27 A SS13 D SS42 C IS 18 Prune
Further Prune Two Relation Plans 55 A IS B SS SMJ D SS B SS SMJ Pruned two relation plans B SS SMJ C IS B SS SMJ A IS B SS D SS SMJ B SS SMJ C IS A Select Join Join C Select Join D B Select
Next, All Three Relation Plans 56 A IS B SS SMJ Fully pruned two relation plans B SS SMJ C IS B SS D SS SMJ A Select Join Join C Select Join D B Select NLJ C IS A IS B SS SMJ SMJ C IS A IS B SS SMJ D SS NLJ A IS B SS SMJ D SS SMJ A IS B SS SMJ 1) Consider the Two Relation Plans That Start With A
Next, All Three Relation Plans 57 A IS B SS SMJ Fully pruned two relation plans B SS SMJ C IS B SS D SS SMJ A Select Join Join C Select Join D B Select B SS D SS SMJ A SS NLJ B SS D SS SMJ A SS SMJ NLJ A IS B SS D SS SMJ SMJ A IS B SS D SS SMJ NLJ C IS B SS D SS SMJ SMJ C IS B SS D SS SMJ 2) Consider the Two Relation Plans That Start With B
Next, All Three Relation Plans 58 A IS B SS SMJ Fully pruned two relation plansB SS SMJ C IS B SS D SS SMJ A Select Join Join C Select Join D B Select B SS SMJ C IS NLJ A IS SMJ A IS B SS SMJ C IS D SS NLJ C IS B SS SMJ D SS SMJ C IS B SS SMJ 3) Consider the Two Relation Plans That Start With C
You Have Now Seen the Theory • But the reality is: o Optimizer still pick bad plans too frequently for a variety of reasons: • Statistics can be missing, out-of-date, incorrect • Cardinality estimates assume uniformly distributed values but data values are skewed • Attribute values are correlated with one another: o Make = “Honda” and Model = “Accord” • Cost estimates are based on formulas that do not take into account the characteristics of the machine on which the query will actually be run o Regressions happen due hardware and software upgrades 59 What can be done to improve the situation?
Opportunities for Improvement • Develop tools that give us a better understanding of what goes wrong • Improve plan stability • Use of feedback from the QE to the QO to improve statistics and cost estimates 60
Towards a Better Understanding of QO Behavior • Picasso Project – Jayant Haritsa, IIT Bangalore o Bing “Picasso Haritsa” to find the project’s web site o Tool is available for SQL Server, Oracle, PostgreSQL, DB2, Sybase • Simple but powerful idea: • For a given query such as SELECT * from A, B WHERE A.a = B.b and A.c <= constant-1 and B.d <= constant-2 • Systematically vary constant-1 and constant-2 • Obtain query plan and estimated cost from the query optimizer for each combination of input parameters • Plot the results 61
Example: TPC-H Query 8 select o_year, sum(case when nation = 'BRAZIL' then volume else 0 end) / sum(volume) from ( select YEAR(O_ORDERDATE) as o_year, L_EXTENDEDPRICE * (1 - L_DISCOUNT) as volume, n2.N_NAME as nation from PART, SUPPLIER, LINEITEM, ORDERS, CUSTOMER, NATION n1, NATION n2, REGION where P_PARTKEY = L_PARTKEY and S_SUPPKEY = L_SUPPKEY and L_ORDERKEY = O_ORDERKEY and O_CUSTKEY = C_CUSTKEY and C_NATIONKEY = n1.N_NATIONKEY and n1.N_REGIONKEY = R_REGIONKEY and R_NAME = 'AMERICA‘ and S_NATIONKEY = n2.N_NATIONKEY and O_ORDERDATE between '1995-01-01' and '1996-12-31' and P_TYPE = 'ECONOMY ANODIZED STEEL' and S_ACCTBAL <= constant-1 and L_EXTENDEDPRICE <= constant-2 ) as all_nations group by o_year order by o_year
Resulting Plan Space • SQL Server 2008 R2 • A total of 90,000 queries o 300 different values for both L_ExtendedPrice and S_AcctBal • 204 different plans!! o Each distinct plan is assigned a unique color • Zooming in to the [0,20:0,40] region: 63 Key takeaway: If plan choice is so sensitive to the constants used, it will undoubtedly be sensitive to errors in statistics and cardinality estimates  Intuitively, this seems very bad!
• Recall this graph of join algorithm performance • While the two “nested loops” algorithms are faster at low selectivity factors, they are not as “stable” across the entire range of selectivity factors How Might We Do Better? 64 1 10 100 1000 10000 Time(#sec) Selectivity factor of predicate on Reviews table Nested Loops Sort Merge Index NL Join R.MID = M.MID Select Reviews Project Movies Rating > 9 and 7/1 < date < 7/31 INL N L SM
“Reduced” Plan Diagrams • Robustness is somehow tied to the number of plans o Fewer plans => more robust plans • For TPC-H query 8, it is possible to use only 30 plans (instead of 204) by picking more robust plans that are slightly slower (10% max, 2% avg) • Since each plan covers a larger region it will be less sensitive to errors in estimating cardinalities and costs 65 Reduced plan space for TPC-H query 8
How Might We Do Better? • At QO time, have the QO annotate compiled query plans with statistics (e.g. expected cardinalities) and check operators • At runtime, check operators collect the actual statistics and compare actual vs. predicted • Opens up a number of avenues for improving QO performance Especially in the CLOUD! INL A IS B SS SMJ C IS Check Check C IS Check B SS SMJ A IS INL 66
QO In the Cloud • What is different? o On prem, a DB vendor has essentially no insight to how its product is used o In the cloud, vendor knows • Schema information (tables, indices, …) • The hardware being used • The complete query workload • For each query, the optimized plan & its estimated cost, the actually running cost, and the selectivity of each operator • Use this information to build an optimizer that learns.
A Learning QO 68 OptimizerQuery Statistics Statistics Tracker Executor Database Check Check C IS Check B SS SMJ A IS INL Catalogs Observed StatsOriginal & Observed Optimization of subsequent queries benefits from the observed statistics and operator costs Query Plan Observed Costs
Key Points To Remember For The Quiz • Query optimization is harder than rocket science o The other components are trivial in comparison • Three key phases of QO o Enumeration of logical plan space o Enumeration of alternative physical plans o Selectivity estimation and costing • The QO team of every DB vendor lives in fear of regressions o How exactly do you expect them to make forward progress?

SQL Query Optimization: Why Is It So Hard to Get Right?

  • 1.
    David J. DeWitt MIT Downloadslides and donate to a great cause: BrentOzar.com/go/dewitt SQL Query Optimization: Why Is It So Hard To Get Right?
  • 2.
    How About aQuiz to Start! 2 • Who painted this picture? o Mondrian? o Picasso? o Ingres? • Actually it was the SQL Server query optimizer!! o Plan space for TPC-H query 8 as the parameter values for Acct-Bal and ExtendedPrice are varied o Each color represents a different query plan o Yikes! P1 P2 P3 P4 SQL Server
  • 3.
    Today … I amgoing to talk about SQL query optimization My hope is that you will leave understanding why all database systems sometimes produce really bad plans Starting with the fundamental principals And why the move to the Cloud could be be a game changer
  • 4.
  • 5.
    The Role ofthe Query Optimizer (100,000 ft view) Query Optimizer SQL Statement Awesome Query Plan Magic Happens
  • 6.
    What’s the Magic? Selecto_year, sum(case when nation = 'BRAZIL' then volume else 0 end) / sum(volume) from ( select YEAR(O_ORDERDATE) as o_year, L_EXTENDEDPRICE * (1 - L_DISCOUNT) as volume, n2.N_NAME as nation from PART, SUPPLIER, LINEITEM, ORDERS, CUSTOMER, NATION n1, NATION n2, REGION where P_PARTKEY = L_PARTKEY and S_SUPPKEY = L_SUPPKEY and L_ORDERKEY = O_ORDERKEY and O_CUSTKEY = C_CUSTKEY and C_NATIONKEY = n1.N_NATIONKEY and n1.N_REGIONKEY = R_REGIONKEY and R_NAME = 'AMERICA‘ and S_NATIONKEY = n2.N_NATIONKEY and O_ORDERDATE between '1995-01-01' and '1996-12-31' and P_TYPE = 'ECONOMY ANODIZED STEEL' and S_ACCTBAL <= constant-1 and L_EXTENDEDPRICE <= constant-2 ) as all_nations group by o_year order by o_year Consider Query 8 of the TPC-H benchmark: Plan 1 Plan 2 Plan 3 Plan 4 Plan 5 … There about 22 million alternative ways of executing this query! A very big haystack to be searching through The QO must select a plan that runs in seconds or minutes, not days or weeks! Should not take hours or days to pick a plan!
  • 7.
    Some Historical Background •Cost-based query optimization was invented by Pat Selinger as part of the IBM System R project in the late 1970s (System R became DB2) • Remains the hardest part of building a DBMS 30+ years later o Progress is hindered by fear of regressions o Far too frequently the QO picks an inefficient plan • Situation further complicated by advances in hardware and the rest of the DBMS software o Hardware is 1000X bigger and faster o DB software is 10X faster o Queries over huge amounts of data are possible IF the QO picks the right plan 7 Hardware Software Queries 1000X 10X Huge!
  • 8.
    Database System More Precisely: TheRole of the Query Optimizer 8 Transform SQL queries into an efficient execution plan Query Execution Engine Query OptimizerParserSQL Query Logical operator tree Physical operator tree Logical operators: what they do e.g., union, selection, project, join, grouping Physical operators: how they do it e.g., nested loop join, sort-merge join, hash join, index join
  • 9.
    A First Example 9 Query Execution Engine Query Optimizer Parser SELECT Average(Rating) FROMReviews WHERE MID = 932 Reviews Date CID MID Rating … … … … Logical operator tree Avg (Rating) Select MID = 932 Reviews Query Plan #1 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews Query Plan #2 or
  • 10.
    Query Plan #1 •Plan starts by scanning the entire Reviews table o # of disk I/Os will be equal to the # of pages in the Reviews table o I/Os will be sequential. Each I/O will require about 0.1 milliseconds (0.0001 seconds) • Filter predicate “MID = 932” is applied to all rows • Only rows that satisfy the predicate are passed on to the average computation 10 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932
  • 11.
    Query Plan #2 •MID index is used to retrieve only those rows whose MID field (attribute) is equal to 932 o Since index is not “clustered”, about one disk I/O will be performed for each row o Each disk I/O will require a random seek and will take about 3 milliseconds (ms) • Retrieved rows will be passed to the average computation 11 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews
  • 12.
    Which Plan Willbe Faster? • Query optimizer must pick between the two plans by estimating the cost of each • To estimate the cost of a plan, the QO must: o Estimate the selectivity of the predicate MID=932 o Calculate the cost of both plans in terms of CPU time and I/O time • The QO uses statistics about each table to make these estimates • The “best” plan depends on how many reviews there are for movie with MID = 932 12 Query Plan #1 Avg_agg [Cnt, Sum] Scan Reviews Filter MID = 932 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews Query Plan #2 Vs. How many reviews for the movie with MID = 932 will there be? Best Query Plan or? ??
  • 13.
    A Slightly MoreComplex Query • Consider the query: • Optimizer might first enumerate three physical plans: 13 Filter Rating > 9 Sequential Scan Reviews Filter 7/1 < Date > 7/31 Rating Index Filter 7/1 < Date < 7/31 Index Lookup Rating > 9 Reviews Filter Rating > 9 Index Lookup 7/1 < Date > 7/31 Reviews Date Index SF = .01 SF = .01 SF = .10 SF = .10 Cost = 11 seconds Cost = 100 seconds Cost = 25 seconds • Then, estimate selectivity factors • Then, calculate total cost • Finally, pick the plan with the lowest cost SELECT * FROM Reviews WHERE 7/1< date < 7/31 AND rating > 9
  • 14.
    Enumerate logically equivalentplans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4
  • 15.
    Equivalence Rules 15 Select andjoin operators commute with each other Select Select Customers Select Select Customers Join Customers Reviews Join Reviews Customers Join Customers Reviews Join Movies Join Customers Join Reviews Movies Join operators are associative
  • 16.
    Equivalence Rules (cont.) 16 Project [CID,Name] Customers Project [Name] Project operators cascade Project [Name] Customers Select operator distributes over joins Select Join Customers Reviews Select Join Customers Reviews
  • 17.
    Example of EquivalentLogical Plans SELECT M.Title, M.Director FROM Movies M, Reviews R, Customers C WHERE C.City = “N.Y.” AND R.Rating > 7 AND M.MID = R.MID AND C.CID = R.CID • One possible logical plan: 17 Join SelectC.City = “N.Y” Select R.Rating > 7 JoinC.CID = R.CID R.MID = M.MID Customers Reviews Project M.Title, M.Director Movies MID Title Director Earnings 1 2 … CID Name Address City 5 11 … Date CID MID Rating 7/3 11 2 8 7/3 5 2 4 … Find titles and director names of movies with a rating > 7 from customers residing in NYC Customers Reviews Movies
  • 18.
    Five Logically “Equivalent”Plans 18 Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Join Customers Reviews Join Movies Select Project The “original” plan Selects distribute over joins rule Join Customers Reviews Join Movies Select Project Select Selects commute rule
  • 19.
    Four More! 19 Select Select Join CustomersReviews Project Join Movies The “original” plan Select CustomersSelect Reviews Project Join Movies Join Select Customers Select Reviews Project Join Movies Join Select CustomersSelect Reviews Project Join Movies Join Select Reviews Join Movies Customers Project Join Select Join commutativity rule Select commutativity rule
  • 20.
    9 Logically EquivalentPlans, In Total 20 Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Select Join Customers Reviews Project Join Movies Select Join Customers Reviews Join Movies Select Project Select Customers Select Reviews Project Join Movies Join Select Customers Select Reviews Project Join Movies Join Select Reviews Join Movies Customers Project Join Select Select CustomersSelect Reviews Project Join Movies Join Join Customers Reviews Join Movies Select Project Select  All 9 logical plans will produce the same result  For each of these 9 plans there is a large number of alternative physical plans that the optimizer can choose from
  • 21.
    Enumerate logically equivalentplans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓
  • 22.
    Physical Plan Example •Assume that the optimizer has: o Three join strategies that it can select from: • nested loops (NL), sort-merge join (SMJ), and hash join (HJ) o Two selection strategies: • sequential scan (SS) and index scan (IS) • Consider JUST ONE of the 9 logical plans • Here is one possible physical plan 22 Select Select Join Customers Reviews Project Join Movies SS IS HJ Customers Reviews Project NL Movies Sequential Scan Index Scan Hash Join Nested loops join • There are actually 36 possible physical alternatives for this single logical plan. (I was too lazy to draw pictures of all 36). • With 9 equivalent logical plans, there are 324 = (9 * 36) physical plans that the optimizer must enumerate and cost as part of the search for the best execution plan for the query And this was a VERY simple query! • Later we will look at how dynamic programming is used to explore the space of logical and physical plans w/o enumerating the entire plan space
  • 23.
    Enumerate logically equivalentplans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans. • Estimate the selectivity factor and output cardinality of each predicate • Estimate the cost of each operator Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓ ✓
  • 24.
    Selectivity Estimation • Taskof estimating how many rows will satisfy a predicate such as Movies.MID=932 • Plan quality is highly dependent on quality of the estimates that the query optimizer makes 24 0 1 2 3 4 5 • Histograms are the standard technique used to estimate selectivity factors for predicates on a single table • Many different flavors: o Equi-Width o Equi-Height o Max-Diff o …
  • 25.
    5 52 83 6 10 157 125 17 55 37 56 38 19 48 56 83 43 37 5 7 0 20 40 60 80 100 120 140 160 180 12 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Histogram Motivation 25 # of Reviews for each customer (total of 939 rows) Customer ID (CID) values in Reviews Table Some examples: #1) Predicate: CID = 9 Actual Sel. Factor = 55/939 = .059 #2) Predicate: 2 <= CID <= 3 Actual Sel. Factor = 135/939 = .144 In general, there is not enough space in the catalogs to store summary statistics for each distinct attribute value The solution: histograms
  • 26.
    Equi-Width Histogram Example 26 CIDValues Count Count 1-4 17-2013-169-125-8 Equi-width histogram Yikes! 8X error!! 0 20 40 60 80 100 120 140 160 180 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 146 309 186 206 92 0 50 100 150 200 250 300 350 All buckets cover roughly the same key range Example #1: Predicate: CID = 9 Actual Sel. Factor = 55/939= .059 Estimated Sel. Factor = (186/4)/939 = .050 Example #2: Predicate: CID = 5 Actual Sel. Factor = 10/939 = .011 Estimated Sel. Factor = (309/4)/993 =.082
  • 27.
    156 157 142 148 161 175 0 50 100 150 200 Equi-HeightHistograms 27 Count Count Equi-height histogram Divide ranges so that all buckets contain roughly the same number of values 1-5 16-2012-159-117-86 0 20 40 60 80 100 120 140 160 180 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
  • 28.
    Example #2: Predicate:CID = 6 Actual Sel. Factor = 157/939 = .167 Estimated Sel. Factor = (157/1)/993 = .167 Example #2: Predicate: CID = 6 Actual Sel. Factor = 157/939 = .167 Estimated Sel. Factor = (309/4)/993 = .082 Example #1: Predicate: CID = 5 Actual Sel. Factor = 10/939 = .011 Estimated Sel. Factor = (309/4)/993 =.082 Example #1: Predicate: CID = 5 Actual Sel. Factor = 10/939 = .011 Estimated Sel. Factor = (156/5)/993 = .033 Equi-width vs. Equi-Height 28 1-4 17-2013-169-125-8 Equi-width Equi-height 156 157 142 148 161 175 0 50 100 150 200 146 309 186 206 92 0 50 100 150 200 250 300 350 1-5 16-2012-159-117-86
  • 29.
    Histogram Summary • Histogramsare a critical tool for estimating selectivity factors for selection predicates 29 Errors still occur, however! • Other statistics stored by the DBMS for each table include # of rows, # of pages, …
  • 30.
    Enumerate logically equivalentplans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans. • Estimate the selectivity factor and output cardinality of each predicate • Estimate the cost of each operator Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓ ✓
  • 31.
    Estimating Costs • Twokey costs that the optimizer considers: o I/O time – cost of reading pages from mass storage o CPU time – cost of applying predicates and operating on tuples in memory • Actual values are highly dependent on CPU and I/O subsystem on which the query will be run o Further complicating the job of the query optimizer • For a parallel database system such as SQL DW, the cost of redistributing/shuffling rows must also be considered 31 vs.
  • 32.
    An Example • Query: oSELECT Avg(Rating) FROM Reviews WHERE MID = 932 • Two physical query plans: 32 Reviews Date CID MID Rating Plan #1 Avg_agg [Cnt, Sum] Sequential Scan Reviews Filter MID = 932 Avg_agg [Cnt, Sum] Index Lookup MID = 932 MID Index Reviews Plan #2 Which plan is cheaper ???
  • 33.
    Plan #1 33 Avg_agg [Cnt, Sum] Scan Reviews Filter MID= 932 • Filter is applied to 10M rows • The optimizer estimates that 100 rows will satisfy the predicate • Table is 100K pages with 100 rows/page • Sorted on date • Average computation is applied to 100 rows • Reviews is scanned sequentially at 100 MB/second • I/O time of scan is 8 seconds • At 0.1 microseconds/row, filter consumes 1 second of CPU time • At 0.1 microseconds/row, avg consumes .00001 seconds of CPU time Optimizer estimates total execution time of 9 seconds
  • 34.
    Plan #2 34 Avg_agg [Cnt, Sum] IndexLookup MID = 932 MID Index Reviews • 100 rows are estimated to satisfy the predicate • Average computation is applied to 100 rows • At 0.1 microseconds/row, average consumes .00001 seconds of CPU time • 100 rows are retrieved using the MID index • Since table is sorted on date field (and not MID field), each I/O requires a random disk I/O – about .003 seconds per disk I/O • I/O time will be .3 seconds Optimizer estimates total execution time of 0.3 seconds The estimate for Plan #1 was 9 seconds, so Plan #2 is clearly the better choice
  • 35.
    But … • Whatif the estimate of the number of rows that satisfy the predicate MID = 932 is WRONG? o E.g. 10,000 rows instead of 100 rows 35 0.01 0.1 1 10 100 1000 10 100 1,000 10,000 100,000 Time(#sec) # of rows Sequential Scan Non-Clustered Index Non-clustered Index is better here Sequential scan is better here
  • 36.
    Estimating Join Costs •Three basic join methods: o Nested-loops join o Sort-merge join o Hash-join • Very different performance characteristics • Critical for optimizer to carefully pick which method to use when 36 Join SelectC.City = “NY” Select R.Rating > 7 JoinC.CID = R.CID R.MID = M.MID Customers Reviews ProjectM.Title, M.Director Movies
  • 37.
    Sort-Merge Join Algorithm SortReviews on MID column (unless already sorted) Sort Movies on MID column (unless already sorted) “Merge” two sorted tables: Scan each table sequential in tandem { For current row r of Reviews For current row m of Movies if r.MID = m.MID produce output row Advance r and m cursors } 37 Cost = |R| + |M| I/Os Merge Join Sort Sort Reviews (|R| pages) Movies (|M| pages) Reviews.MID = Movies.MID Cost = 4 * |M| I/Os Total I/O cost = 5*|R| + 5*|M| I/Os Cost = 4 * |R| I/Os Main Idea: Sort R and M on the join column (MID), then scan them to do a ``merge’’ (on join column), and output result tuples.
  • 38.
    Nested-Loops Join For eachpage Ri, 1≤ i ≤ |R|, of Reviews { Read page Ri from disk For each Mj, 1≤ j ≤ |M|, of Movies { Read page Mj from disk For all rows r on page Ri { For all rows m on page Mj { if r.MID = m.MID produce output row } } } } 38 I/O Cost = |R| + |R| * |M| Nested Loops Join Movies (|M| pages) Reviews (|R| pages) Reviews.MID = Movies.MID Main Idea: Scan R, and for each tuple in R probe tuples in M (by scanning it). Output result tuples.
  • 39.
    Main Idea: ScanR, and for each tuple in R probe tuples in M (by probing its index). Output result tuples. Index-Nested Loops For each page Ri, 1≤ i ≤ |R|, of Reviews { Read page Ri from disk For all rows r on page Ri { Use MID index on Movies to fetch rows with MID attributes = r.MID Form output row for each returned row } } 39 Movies (|M| pages) Nested Loops Join Reviews Reviews.MID = Movies.MID Index Lookup using r.MID MID Index (|R| pages) Sorted on date column Cost = |R| + |R| * (||R||/|R|) * 2 • 2 I/Os: 1 index I/O + 1 movie I/O as Reviews table is sorted on date column • ||R|| is # of rows in R • ||R||/|R| gives the average number of rows of R per page Notice that since Reviews is ordered on the Date column (and not MID), so each row of the Movies table retrieved incurs two random disk I/Os: • one to the index and • one to the table
  • 40.
    Estimating Result Cardinalities •Consider the query SELECT * FROM Reviews WHERE 7/1 < date < 7/31 AND rating > 9 • Assume Reviews has 1M rows • Assume following selectivity factors: 40 Sel. Factor # of qualifying rows 7/1 < date < 7/31 0.1 100,000 Review > 9 0.01 10,000 • How many output rows will the query produce? o If predicates are not correlated • .1 * .01 * 1M = 1,000 rows o If predicates are correlated could be as high as • .1 * 1M = 100,000 rows Why does this matter?
  • 41.
    1 10 100 1000 10000 0.000001 0.00001 0.00010.001 0.01 0.1 1 Time(#sec) Selectivity factor of predicate on Reviews table Nested Loops Sort Merge Index NL This is Why! Assume that: • Reviews table is 10,000 pages with 80 rows/page • Movies table is 2,000 pages • The primary index on Movies is on the MID column 41 Join R.MID = M.MID Select Reviews Project Movies Rating > 9 and 7/1 < date < 7/31 The consequences of incorrectly estimating the selectivity of the predicate on Reviews can be HUGE INL N L SM Note that each join algorithm has a region where it provides the best performance
  • 42.
    Multidimensional Histograms • Usedto capture correlation between attributes • A 2-D example 42 0 50 100 150 200 250 300 350 400 450 500 151 198 229 152 156 303 314 361 392 315 319 466 191 238 269 192 196 343 211 258 289 212 216 363 97 144 175 98 102 249
  • 43.
    A Little BitAbout Estimating Join Cardinalities • Question: Given a join of R and S, what is the range of possible result sizes (in #of tuples)? o Suppose the join is on a key for R and S Students(sid, sname, did), Dorm(did,d.addr) Select S.sid, D.address From Students S, Dorms D Where S.did = D.did What is the cardinality? 43 A student can only live in at most 1 dorm: • each S tuple can match with at most 1 D tuple • cardinality (S join D) = cardinality of S
  • 44.
    • General case:join on {A} (where {A} is key for neither) o estimate each tuple r of R generates uniform number of matches in S and each tuple s of S generates uniform number of matches in R, e.g. SF = min(||R|| * ||S|| / NKeys(A,S), ||S|| * ||R|| / NKeys(A,R)) e.g., SELECT M.title, R.title FROM Movies M, Reviews R WHERE M.title = R.title Movies: 100 tuples, 75 unique titles  1.3 rows for each title Reviews: 20 tuples, 10 unique titles  2 rows for each title Estimating Join Cardinality = 100*20/10 = 200 = 20*100/75 = 26.6
  • 45.
    Enumerate logically equivalentplans by applying equivalence rules For each logically equivalent plan, enumerate all alternative physical query plans Estimate the cost of each of the alternative physical query plans. • Estimate the selectivity factor and output cardinality of each predicate • Estimate the cost of each operator Run the plan with lowest estimated overall cost Query Optimization: The Main Steps ✓ 2 1 3 4 ✓ ✓ Enumerate How big is the plan space for a query involving N tables? enumerate It turns out that the answer depends on the “shape” of the query
  • 46.
    Two Common Query“Shapes” 46 A B Join Join Join Join C D F “Star” Join Queries A B C D FJoin JoinJoin Join “Chain” Join Queries Number of logically equivalent alternatives # of Tables Star Chain 2 2 2 4 48 40 5 384 224 6 3,840 1,344 8 645,120 54,912 10 18,579,450 2,489,344 In practice, “typical” queries fall somewhere between these two extremes
  • 47.
    Pruning the PlanSpace • Consider only left-deep query plans to reduce the search space 47 A B C Join Join Join Join E D Left Deep Join Join Join Join ED A B C Bushy Star Join Queries Chain Join Queries # of Tables Bushy Left-Deep Bushy Left Deep 2 2 2 2 2 4 48 12 40 8 5 384 48 224 16 6 3,840 240 1,344 32 8 645,120 10,080 54,912 128 10 18,579,450 725,760 2,489,344 512 These are counts of logical plans only! With: i) 3 join methods ii) n joins in a query There will be 3 n physical plans for each logical planExample: For a left-deep, 8 table star join query there will be: i) 10,080 different logical plans ii) 22,044,960 different physical plans!! Solution: Use some form of dynamic programming (either bottom up or top down) to search the plan space heuristically Sometimes these heuristics will cause the best plan to be missed!!
  • 48.
    • Optimization isperformed in N passes (if N relations are joined): o Pass 1: Find the best (lowest cost) 1-relation plan for each relation. o Pass 2: Find the best way to join the result of each 1-relation plan (as the outer/left table) to another relation (as the inner/right table) to generate all 2-relation plans. o Pass N: Find best way to join result of a (N-1)-relation plan (as outer) to the N’th relation to generate all N-relation plans. • At each pass, for each subset of relations, prune all plans except those o Lowest cost plan overall, plus o Lowest cost plan for each interesting order of the rows • Order by, group by, aggregates etc. handled as the final step Bottom-Up QO Using Dynamic Programming In spite of pruning plan space, this approach is still exponential in the # of tables. Interesting orders include orders that facilitate the execution of joins, aggregates, and order by clauses subsequently by the query
  • 49.
    49 A A SS A IS B B SS C C SS C IS D D SS D IS27 387313 429518 All single relation plans All tables First, generate all single relation plans: A Select Join Join C Select Join D B Select An Example: Legend: SS – sequential scan IS – index scan – cost5 Prune
  • 50.
    50 B SS 73 A SS A IS 2713 D SS42 C IS 18 Allsingle relation plans after pruning Then, All Two Relation Plans
  • 51.
    Two Relation Plans StartingWith A 51 B SS 73 A IS 27 A SS13 D SS42 C IS 18 A Select Join Join C Select Join D B Select A SS B SS NLJ A IS B SS NLJ A IS B SS SMJ A SS B SS SMJJoin Select A B A.a = B.a 1013 822315 293 Single relation plans Prune Let’s assume there are 2 alternative join methods for the QO to select from: 1. NLJ = Nested Loops Join 2. SMJ = Sort Merge Join
  • 52.
    Two Relation Plans StartingWith B 52 B SS A SS NLJ B SS A SS SMJ B SS NLJ A IS B SS SMJ A IS Select D B JoinB.b = D.b B SS D SS NLJ B SS D SS SMJ NLJ B SS C IS B SS SMJ C IS A Select Join Join C Select Join D B Select 1013 315 756 293 1520 432 2321 932 Single relation plansB SS 73 A IS 27 A SS13 D SS42 C IS 18 Prune
  • 53.
    Two Relation Plans StartingWith C 53 Select C B JoinB.C = C.c NLJ B SS C IS B SS SMJ C IS A Select Join Join C Select Join D B Select 6520 932 Single relation plansB SS 73 A IS 27 A SS13 D SS42 C IS 18 Prune
  • 54.
    Two Relation Plans StartingWith D 54 Select D B JoinB.b = D.b D SS B SS NLJ D SS B SS SMJ A Select Join Join C Select Join D B Select 1520 432 Single relation plans B SS 73 A IS 27 A SS13 D SS42 C IS 18 Prune
  • 55.
    Further Prune Two RelationPlans 55 A IS B SS SMJ D SS B SS SMJ Pruned two relation plans B SS SMJ C IS B SS SMJ A IS B SS D SS SMJ B SS SMJ C IS A Select Join Join C Select Join D B Select
  • 56.
    Next, All Three RelationPlans 56 A IS B SS SMJ Fully pruned two relation plans B SS SMJ C IS B SS D SS SMJ A Select Join Join C Select Join D B Select NLJ C IS A IS B SS SMJ SMJ C IS A IS B SS SMJ D SS NLJ A IS B SS SMJ D SS SMJ A IS B SS SMJ 1) Consider the Two Relation Plans That Start With A
  • 57.
    Next, All Three RelationPlans 57 A IS B SS SMJ Fully pruned two relation plans B SS SMJ C IS B SS D SS SMJ A Select Join Join C Select Join D B Select B SS D SS SMJ A SS NLJ B SS D SS SMJ A SS SMJ NLJ A IS B SS D SS SMJ SMJ A IS B SS D SS SMJ NLJ C IS B SS D SS SMJ SMJ C IS B SS D SS SMJ 2) Consider the Two Relation Plans That Start With B
  • 58.
    Next, All Three RelationPlans 58 A IS B SS SMJ Fully pruned two relation plansB SS SMJ C IS B SS D SS SMJ A Select Join Join C Select Join D B Select B SS SMJ C IS NLJ A IS SMJ A IS B SS SMJ C IS D SS NLJ C IS B SS SMJ D SS SMJ C IS B SS SMJ 3) Consider the Two Relation Plans That Start With C
  • 59.
    You Have NowSeen the Theory • But the reality is: o Optimizer still pick bad plans too frequently for a variety of reasons: • Statistics can be missing, out-of-date, incorrect • Cardinality estimates assume uniformly distributed values but data values are skewed • Attribute values are correlated with one another: o Make = “Honda” and Model = “Accord” • Cost estimates are based on formulas that do not take into account the characteristics of the machine on which the query will actually be run o Regressions happen due hardware and software upgrades 59 What can be done to improve the situation?
  • 60.
    Opportunities for Improvement •Develop tools that give us a better understanding of what goes wrong • Improve plan stability • Use of feedback from the QE to the QO to improve statistics and cost estimates 60
  • 61.
    Towards a Better Understandingof QO Behavior • Picasso Project – Jayant Haritsa, IIT Bangalore o Bing “Picasso Haritsa” to find the project’s web site o Tool is available for SQL Server, Oracle, PostgreSQL, DB2, Sybase • Simple but powerful idea: • For a given query such as SELECT * from A, B WHERE A.a = B.b and A.c <= constant-1 and B.d <= constant-2 • Systematically vary constant-1 and constant-2 • Obtain query plan and estimated cost from the query optimizer for each combination of input parameters • Plot the results 61
  • 62.
    Example: TPC-H Query8 select o_year, sum(case when nation = 'BRAZIL' then volume else 0 end) / sum(volume) from ( select YEAR(O_ORDERDATE) as o_year, L_EXTENDEDPRICE * (1 - L_DISCOUNT) as volume, n2.N_NAME as nation from PART, SUPPLIER, LINEITEM, ORDERS, CUSTOMER, NATION n1, NATION n2, REGION where P_PARTKEY = L_PARTKEY and S_SUPPKEY = L_SUPPKEY and L_ORDERKEY = O_ORDERKEY and O_CUSTKEY = C_CUSTKEY and C_NATIONKEY = n1.N_NATIONKEY and n1.N_REGIONKEY = R_REGIONKEY and R_NAME = 'AMERICA‘ and S_NATIONKEY = n2.N_NATIONKEY and O_ORDERDATE between '1995-01-01' and '1996-12-31' and P_TYPE = 'ECONOMY ANODIZED STEEL' and S_ACCTBAL <= constant-1 and L_EXTENDEDPRICE <= constant-2 ) as all_nations group by o_year order by o_year
  • 63.
    Resulting Plan Space •SQL Server 2008 R2 • A total of 90,000 queries o 300 different values for both L_ExtendedPrice and S_AcctBal • 204 different plans!! o Each distinct plan is assigned a unique color • Zooming in to the [0,20:0,40] region: 63 Key takeaway: If plan choice is so sensitive to the constants used, it will undoubtedly be sensitive to errors in statistics and cardinality estimates  Intuitively, this seems very bad!
  • 64.
    • Recall thisgraph of join algorithm performance • While the two “nested loops” algorithms are faster at low selectivity factors, they are not as “stable” across the entire range of selectivity factors How Might We Do Better? 64 1 10 100 1000 10000 Time(#sec) Selectivity factor of predicate on Reviews table Nested Loops Sort Merge Index NL Join R.MID = M.MID Select Reviews Project Movies Rating > 9 and 7/1 < date < 7/31 INL N L SM
  • 65.
    “Reduced” Plan Diagrams •Robustness is somehow tied to the number of plans o Fewer plans => more robust plans • For TPC-H query 8, it is possible to use only 30 plans (instead of 204) by picking more robust plans that are slightly slower (10% max, 2% avg) • Since each plan covers a larger region it will be less sensitive to errors in estimating cardinalities and costs 65 Reduced plan space for TPC-H query 8
  • 66.
    How Might WeDo Better? • At QO time, have the QO annotate compiled query plans with statistics (e.g. expected cardinalities) and check operators • At runtime, check operators collect the actual statistics and compare actual vs. predicted • Opens up a number of avenues for improving QO performance Especially in the CLOUD! INL A IS B SS SMJ C IS Check Check C IS Check B SS SMJ A IS INL 66
  • 67.
    QO In theCloud • What is different? o On prem, a DB vendor has essentially no insight to how its product is used o In the cloud, vendor knows • Schema information (tables, indices, …) • The hardware being used • The complete query workload • For each query, the optimized plan & its estimated cost, the actually running cost, and the selectivity of each operator • Use this information to build an optimizer that learns.
  • 68.
    A Learning QO 68 OptimizerQuery Statistics Statistics Tracker ExecutorDatabase Check Check C IS Check B SS SMJ A IS INL Catalogs Observed StatsOriginal & Observed Optimization of subsequent queries benefits from the observed statistics and operator costs Query Plan Observed Costs
  • 69.
    Key Points ToRemember For The Quiz • Query optimization is harder than rocket science o The other components are trivial in comparison • Three key phases of QO o Enumeration of logical plan space o Enumeration of alternative physical plans o Selectivity estimation and costing • The QO team of every DB vendor lives in fear of regressions o How exactly do you expect them to make forward progress?