 Chapter 11  Database Performance Tuning and Query Optimization ISBN-13: 978-1337627900 ISBN-10: 1337627909 Buy Book Amazon.com URL Modified by: Usman Tariq, PhD Associate Professor, PSAU Office ☎ 00966 11 588 8386
Learning Objectives  In this chapter, the student will learn:  Basic database performance-tuning concepts  How a DBMS processes SQL queries  About the importance of indexes in query processing  About the types of decisions the query optimizer has to make  Some common practices used to write efficient SQL code  How to formulate queries and tune the DBMS for optimal performance 2
Database Performance-Tuning Concepts  Goal of database performance is to execute queries as fast as possible  Database performance tuning: Set of activities and procedures that reduce response time of database system  Fine-tuning the performance of a system requires that all factors must operate at optimum level with minimal bottlenecks 3
Table 11.1 - General Guidelines for Better System Performance 4
Performance Tuning: Client and Server  Client side  SQL performance tuning: Generates SQL query that returns correct answer in least amount of time  Using minimum amount of resources at server  Server side  DBMS performance tuning: DBMS environment configured to respond to clients’ requests as fast as possible  Optimum use of existing resources 5
“Low-end” Server Specifications 6
“Enterprise” Server Specifications 7
DATABASE STATISTICS MEASUREMENTS 8
Example: to generate statistics for the VENDOR table, you would use: 1. In Oracle:  ANALYZE TABLE VENDOR COMPUTE STATISTICS; 2. In MySQL:  ANALYZE TABLE VENDOR; 3. In SQL Server: UPDATE STATISTICS VENDOR; 9 It is common to periodically regenerate the statistics for database objects, especially database objects that are subject to frequent change.
Database analysis 10 Growth trend of the database in total, the data files, the transaction logs, and the filestream data, for the selected period
Select more fields to avoid SELECT DISTINCT  Inefficient:  SELECT * FROM Customers  Efficient:  SELECT FirstName, LastName, Address, City, State, Zip FROM Customers  Inefficient and inaccurate:  SELECT DISTINCT FirstName, LastName, State FROM Customers 11
SELECT DISTINCT statement in MySQL 12 Will take at least 10 minutes to furnish results
Query Processing [1/2]  Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution plan.  Execution. The DBMS executes the SQL query using the chosen execution plan.  Fetching. The DBMS fetches the data and sends the result set back to the client. 13 SQL query is: 1. Validated for syntax compliance 2. Validated against the data dictionary to ensure that table names and column names are correct 3. Validated against the data dictionary to ensure that the user has proper access rights 4. Analyzed and decomposed into more atomic components
Query Processing [2/2] 14
SQL Parsing Phase  Query is broken down into smaller units  Original SQL query is transformed into slightly different version of the original SQL code which is fully equivalent and more efficient  Query optimizer: Analyzes SQL query and finds most efficient way to access data  Access plans: DBMS-specific and translate client’s SQL query into a series of complex I/O operations 15
SQL Parsing  Syntax Check  Oracle Database must check each SQL statement for syntactic validity.  A statement that breaks a rule for well-formed SQL syntax fails the check. For example, the following statement fails because the keyword FROM is misspelled as FORM: 16 external
SQL Parsing  Semantic Check  A semantic check determines whether a statement is meaningful, for example, whether the objects and columns in the statement exist.  A syntactically correct statement can fail a semantic check, as shown in the following example of a query of a nonexistent table: 17 external
SQL Parsing  Shared Pool Check  Shared Pool is a portion of the System Global Area (SGA) that contains shared memory constructs such as shared SQL areas.  During the parse, the database performs a shared pool check to determine whether it can skip resource-intensive steps of statement processing. 18 external
SQL Parsing Phase  If Query Access Plan already exists for query in SQL cache, DBMS reuses it  If not, optimizer evaluates various plans and chooses one to be placed in SQL cache for use 19 ParallelQueryAccessPlan
Create joins with INNER JOIN rather than WHERE  SELECT Customers.CustomerID, Customers.Name, Sales.LastSaleDate FROM Customers, Sales WHERE Customers.CustomerID = Sales.CustomerID 20 SELECT Customers.CustomerID, Customers.Name, Sales.LastSaleDate FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID The database would only generate the #### desired records where CustomerID is equal
Use WHERE instead of HAVING to define filters [1/2]  SELECT Customers.CustomerID, Customers.Name, Count(Sales.SalesID) FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID GROUP BY Customers.CustomerID, Customers.Name HAVING Sales.LastSaleDate BETWEEN #1/1/2019# AND #12/31/2019# 21 SELECT Customers.CustomerID, Customers.Name, Count(Sales.SalesID) FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID WHERE Sales.LastSaleDate BETWEEN #10/10/2019# AND #17/10/2019# GROUP BY Customers.CustomerID, Customers.Name
Use WHERE instead of HAVING to define filters [2/2] 22 SELECT Customers.CustomerID, Customers.Name, Count(Sales.SalesID) FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID WHERE Sales.LastSaleDate BETWEEN #10/10/2019# AND #17/10/2019# GROUP BY Customers.CustomerID, Customers.Name HAVING Count(Sales.SalesID) > 5 HAVING should only be used when filtering on an aggregated field. In the query above, we could additionally filter for customers with greater than 5 sales using a HAVING statement.
23 1. Proper indexing 2. Retrieve the relevant data only 3. Getting rid of correlated subqueries 4. Using or avoiding temporary tables according to requirement 5. Avoid coding loops OPTIMIZATION 1. Proper indexing 2. Retrieve the relevant data only 3. Getting rid of correlated subqueries 4. Using or avoiding temporary tables according to requirement 5. Avoid coding loops MONITORING PERFORMANCE 1. Monitor Availability and Resource Consumption 2. Measure and Compare Throughput 3. Monitor Expensive Queries 4. Track Database Changes 5. Monitor Logs A data mart is a subset of a data warehouse oriented to a specific business line. Data marts contain repositories of summarized data collected for analysis on a specific section or unit within an organization, for example, the sales department. The Oracle online table reorganization package, (dbms_redefinition) is used to reorganize tables while they are accepting updates. Partitioning is the database process where very large tables are divided into multiple smaller parts. In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function. external
Full Table Scan: Example 24 The following statement queries monthly salaries over SAR 4000
Nulls: Example 25
Indexes and Query Optimization  Indexes are crucial in speeding up data access because they facilitate searching, sorting, and using aggregate functions and even join operations. 26
Index Range Scan: Example 27 SELECT * FROM employees WHERE department_id = 20 AND salary > 1000; If indexes are so important, why not index every column in every table? Indexing every column in every table overtaxes the DBMS in terms of index- maintenance processing, especially if the table has many attributes and rows, or requires many inserts, updates, and deletes.
Database Query Optimization Modes  Algorithms proposed for query optimization are based on: 1. Selection of the optimum order to achieve the fastest execution time 2. Selection of sites to be accessed to minimize communication costs  Evaluated on the basis of: 1. Operation mode 2. Timing of its optimization 28
Query Blocks  Each SELECT block in the original SQL statement is represented internally by a query block. A query block can be a top-level statement, subquery, or unmerged view. 29 external
Query Optimizer Operations 30 The optimizer contains three components: the transformer, estimator, and plan generator. external
31 Plan Generator external
Join Method Example Adaptive Query Plans SELECT product_name FROM order_items o, prod_info p WHERE o.unit_price = 15 AND quantity > 1 AND p.product_id = o.product_id 32 external Hybrid Hash Distribution EXPLAIN PLAN FOR SELECT /*+ parallel(4) full(e) full(d) */ department_name, sum(salary) FROM employees e, departments d WHERE d.department_id=e.department_id GROUP BY department_name;
Using Hints to Affect Optimizer Choices 33
DBMS tuning hierarchical level 34 external
Adaptive Query Plans 35 Statistics Feedback SELECT o.order_id, v.product_name FROM orders o, ( SELECT order_id, product_name FROM order_items o, product_information p WHERE p.product_id = o.product_id AND list_price < 50 AND min_price < 40 ) v WHERE o.order_id = v.order_id external
Classification of Operation Modes  Automatic query optimization: DBMS finds the most cost-effective access path without user intervention  Manual query optimization: Requires that the optimization be selected and scheduled by the end user or programmer 36
Classification Based on Timing of Optimization  Static query optimization: best optimization strategy is selected when the query is compiled by the DBMS  Takes place at compilation time  Dynamic query optimization: Access strategy is dynamically determined by the DBMS at run time, using the most up-to-date information about the database  Takes place at execution time 37
Classification Based on Type of Information Used to Optimize the Query  Statistically based query optimization algorithm: Statistics are used by the DBMS to determine the best access strategy  Statistical information is generated by DBMS through:  Dynamic statistical generation mode  Manual statistical generation mode  Rule-based query optimization algorithm: based on a set of user-defined rules to determine the best query access strategy 38
DBMS Architecture  All data in a database are stored in data files  Data files automatically expand in predefined increments known as extends  Data files are grouped in file groups or table spaces  Table space or file group: Logical grouping of several data files that store data with similar characteristics  Data cache or buffer cache: Shared, reserved memory area  Stores most recently accessed data blocks in RAM 39
DBMS Architecture  SQL cache or procedure cache: Stores most recently executed SQL statements or PL/SQL procedures  DBMS retrieves data from permanent storage and places them in RAM  Input/output request: Low-level data access operation that reads or writes data to and from computer devices  Data cache is faster than working with data files  Majority of performance-tuning activities focus on minimizing I/O operations 40
Find Current Location of Data and Log File of All the Database 41 SELECT name, physical_name AS current_file_location FROM sys.master_files
42 DBMS - File Structure external
Figure 11.1 - Basic DBMS Architecture 43
SQL Execution Phase  All I/O operations indicated in the access plan are executed  Locks are acquired  Data are retrieved and placed in data cache  Transaction management commands are processed 44
SQL Execution Phase 45 Using SQL Access Advisor external
SQL Fetching Phase  Rows of resulting query result set are returned to client  DBMS may use temporary table space to store temporary data  Database server coordinates the movement of the result set rows from the server cache to the client cache 46
Indexes and Query Optimization  Indexes  Help speed up data access  Facilitate searching, sorting, using aggregate functions, and join operations  Ordered set of values that contain the index key and pointers  More efficient than a full table scan 47 SELECT * FROM Table WITH(INDEX(Index_Name)) external
Indexes and Query Optimization  Data sparsity: Number of different values a column could have  Data structures used to implement indexes:  Hash indexes  B-tree indexes  Bitmap indexes  DBMSs determine best type of index to use 48
Creating a Named b-Tree Composite 49 external
Optimizer Choices  Rule-based optimizer: Uses preset rules and points to determine the best approach to execute a query  Cost-based optimizer: Uses algorithms based on statistics about objects being accessed to determine the best approach to execute a query 50
Using Hints to Affect Optimizer Choices  Optimizer might not choose the best execution plan  Makes decisions based on existing statistics, which might be old  Might choose less-efficient decisions  Optimizer hints: Special instructions for the optimizer, embedded in the SQL command text 51
Optimizer Hints  A hint is embedded within a SQL comment.  The hint comment must immediately follow the first keyword of a SQL statement block. You can use either style of comment: a slash-star (/*) or pair of dashes (--). The plus-sign (+) hint delimiter must immediately follow the comment delimiter, with no space permitted before the plus sign, as in the following fragment: 52 external
Table 11.5 - Optimizer Hints 53
SQL Performance Tuning  Evaluated from client perspective  Most current relational DBMSs perform automatic query optimization at the server end  Most SQL performance optimization techniques are DBMS-specific and thus rarely portable  Majority of performance problems are related to poorly written SQL code 54
Index Selectivity  Measure of the likelihood that an index will be used in query processing  Indexes are used when a subset of rows from a large table is to be selected based on a given condition  Index cannot always be used to improve performance  Function-based index: Based on a specific SQL function or expression 55
Conditional Expressions  Expressed within WHERE or HAVING clauses of a SQL statement  Restricts the output of a query to only rows matching conditional criteria  Guidelines to write efficient conditional expressions in SQL code  Use simple columns or literals as operands  Numeric field comparisons are faster than character, date, and NULL comparisons 56
SQL Expressions  Oracle Database does not accept all forms of expressions in all parts of all SQL statements. Information URL 57 external
Conditional Expressions  Equality comparisons are faster than inequality comparisons  Transform conditional expressions to use literals  Write equality conditions first when using multiple conditional expressions  When using multiple AND conditions, write the condition most likely to be false first  When using multiple OR conditions, put the condition most likely to be true first  Avoid the use of NOT logical operator 58
Query Formulation  Identify what columns and computations are required  Identify source tables  Determine how to join tables  Determine what selection criteria are needed  Determine the order in which to display the output 59
DBMS Performance Tuning  Managing DBMS processes in primary memory and the structures in physical storage  DBMS performance tuning at server end focuses on setting parameters used for: 1. Data cache 2. SQL cache 3. Sort cache 4. Optimizer mode  In-memory database: Store large portions of the database in primary storage 60
DBMS Performance Tuning  Recommendations for physical storage of databases:  Use RAID (Redundant Array of Independent Disks) to provide a balance between performance improvement and fault tolerance  Minimize disk contention  Put high-usage tables in their own table spaces  Assign separate data files in separate storage volumes for indexes, system, and high-usage tables 61
DBMS Performance Tuning  Take advantage of the various table storage organizations in the database  Index-organized table or clustered index table: Stores the end-user data and the index data in consecutive locations in permanent storage  Partition tables based on usage  Use denormalized tables where appropriate  Store computed and aggregate attributes in tables 62
63 Databases, tablespaces, and datafiles are closely related, but they have important differences: • An Oracle database consists of one or more logical storage units called tablespaces, which collectively store all of the database's data. • Each tablespace in an Oracle database consists of one or more files called datafiles, which are physical structures that conform to the operating system in which Oracle is running. • A database's data is collectively stored in the datafiles that constitute each tablespace of the database. For example, the simplest Oracle database would have one tablespace and one datafile. Another database can have three tablespaces, each consisting of two datafiles (for a total of six datafiles). external
CREATE TABLESPACE  Create a Shadow Tablespace: Example 64 The following example creates a 10 MB undo tablespace undots1:
Creating a Temporary Tablespace: Example  This statement shows how the temporary tablespace that serves as the default temporary tablespace for database users in the sample database was created: 65
Enabling Autoextend for a Tablespace: Example 66 This statement creates a tablespace named tbs_02 with one data file. When more space is required, 500 kilobyte extents will be added up to a maximum size of 100 megabytes:
Using Multiple Tablespaces  You can use multiple tablespaces to perform the following tasks: 1. Control disk space allocation for database data 2. Assign specific space quotas for database users 3. Control availability of data by taking individual tablespaces online or offline 4. Perform partial database backup or recovery operations 5. Allocate data storage across devices to improve performance 67 external
Using Tablespaces  A database administrator can use tablespaces to do the following actions: 1. Create new tablespaces 2. Add datafiles to tablespaces 3. Set and alter default segment storage settings for segments created in a tablespace 4. Make a tablespace read only or read/write 5. Make a tablespace temporary or permanent 6. Rename tablespaces 7. Drop tablespaces 68 external
Create TABLESPCE: Example 69 CREATE TABLESPACE lmtbsb DATAFILE '/u02/oracle/data/lmtbsb01.dbf' SIZE 50M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 128K; In this example, a 128K extent size is specified. Each 128K extent (which, if the tablespace block size is 2K, is equivalent to 64 Oracle blocks) is represented by a bit in the extent bitmap for this file.
Make a tablespace read only 70 ALTER TABLESPACE sales_1 READ ONLY; ALTER TABLESPACE LMS_Student READ ONLY;
Creating Oracle Managed Files: Examples 71 CREATE TABLESPACE omf_ts2 DATAFILE AUTOEXTEND OFF; The following example creates a tablespace with an Oracle-managed data file of 100M that is not autoextensible: ALTER SYSTEM SET DB_CREATE_FILE_DEST = '$ORACLE_HOME/rdbms/dbs'; CREATE TABLESPACE omf_ts1; The following example sets the default location for data file creation and creates a tablespace with a data file in the default location. The data file is 100M and is autoextensible with an unlimited maximum size:

Database performance tuning and query optimization

  • 1.
     Chapter 11 Database Performance Tuning and Query Optimization ISBN-13: 978-1337627900 ISBN-10: 1337627909 Buy Book Amazon.com URL Modified by: Usman Tariq, PhD Associate Professor, PSAU Office ☎ 00966 11 588 8386
  • 2.
    Learning Objectives  Inthis chapter, the student will learn:  Basic database performance-tuning concepts  How a DBMS processes SQL queries  About the importance of indexes in query processing  About the types of decisions the query optimizer has to make  Some common practices used to write efficient SQL code  How to formulate queries and tune the DBMS for optimal performance 2
  • 3.
    Database Performance-Tuning Concepts Goal of database performance is to execute queries as fast as possible  Database performance tuning: Set of activities and procedures that reduce response time of database system  Fine-tuning the performance of a system requires that all factors must operate at optimum level with minimal bottlenecks 3
  • 4.
    Table 11.1 -General Guidelines for Better System Performance 4
  • 5.
    Performance Tuning: Clientand Server  Client side  SQL performance tuning: Generates SQL query that returns correct answer in least amount of time  Using minimum amount of resources at server  Server side  DBMS performance tuning: DBMS environment configured to respond to clients’ requests as fast as possible  Optimum use of existing resources 5
  • 6.
  • 7.
  • 8.
  • 9.
    Example: to generatestatistics for the VENDOR table, you would use: 1. In Oracle:  ANALYZE TABLE VENDOR COMPUTE STATISTICS; 2. In MySQL:  ANALYZE TABLE VENDOR; 3. In SQL Server: UPDATE STATISTICS VENDOR; 9 It is common to periodically regenerate the statistics for database objects, especially database objects that are subject to frequent change.
  • 10.
    Database analysis 10 Growth trendof the database in total, the data files, the transaction logs, and the filestream data, for the selected period
  • 11.
    Select more fieldsto avoid SELECT DISTINCT  Inefficient:  SELECT * FROM Customers  Efficient:  SELECT FirstName, LastName, Address, City, State, Zip FROM Customers  Inefficient and inaccurate:  SELECT DISTINCT FirstName, LastName, State FROM Customers 11
  • 12.
    SELECT DISTINCT statementin MySQL 12 Will take at least 10 minutes to furnish results
  • 13.
    Query Processing [1/2] Parsing. The DBMS parses the SQL query and chooses the most efficient access/execution plan.  Execution. The DBMS executes the SQL query using the chosen execution plan.  Fetching. The DBMS fetches the data and sends the result set back to the client. 13 SQL query is: 1. Validated for syntax compliance 2. Validated against the data dictionary to ensure that table names and column names are correct 3. Validated against the data dictionary to ensure that the user has proper access rights 4. Analyzed and decomposed into more atomic components
  • 14.
  • 15.
    SQL Parsing Phase Query is broken down into smaller units  Original SQL query is transformed into slightly different version of the original SQL code which is fully equivalent and more efficient  Query optimizer: Analyzes SQL query and finds most efficient way to access data  Access plans: DBMS-specific and translate client’s SQL query into a series of complex I/O operations 15
  • 16.
    SQL Parsing  SyntaxCheck  Oracle Database must check each SQL statement for syntactic validity.  A statement that breaks a rule for well-formed SQL syntax fails the check. For example, the following statement fails because the keyword FROM is misspelled as FORM: 16 external
  • 17.
    SQL Parsing  SemanticCheck  A semantic check determines whether a statement is meaningful, for example, whether the objects and columns in the statement exist.  A syntactically correct statement can fail a semantic check, as shown in the following example of a query of a nonexistent table: 17 external
  • 18.
    SQL Parsing  SharedPool Check  Shared Pool is a portion of the System Global Area (SGA) that contains shared memory constructs such as shared SQL areas.  During the parse, the database performs a shared pool check to determine whether it can skip resource-intensive steps of statement processing. 18 external
  • 19.
    SQL Parsing Phase If Query Access Plan already exists for query in SQL cache, DBMS reuses it  If not, optimizer evaluates various plans and chooses one to be placed in SQL cache for use 19 ParallelQueryAccessPlan
  • 20.
    Create joins withINNER JOIN rather than WHERE  SELECT Customers.CustomerID, Customers.Name, Sales.LastSaleDate FROM Customers, Sales WHERE Customers.CustomerID = Sales.CustomerID 20 SELECT Customers.CustomerID, Customers.Name, Sales.LastSaleDate FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID The database would only generate the #### desired records where CustomerID is equal
  • 21.
    Use WHERE insteadof HAVING to define filters [1/2]  SELECT Customers.CustomerID, Customers.Name, Count(Sales.SalesID) FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID GROUP BY Customers.CustomerID, Customers.Name HAVING Sales.LastSaleDate BETWEEN #1/1/2019# AND #12/31/2019# 21 SELECT Customers.CustomerID, Customers.Name, Count(Sales.SalesID) FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID WHERE Sales.LastSaleDate BETWEEN #10/10/2019# AND #17/10/2019# GROUP BY Customers.CustomerID, Customers.Name
  • 22.
    Use WHERE insteadof HAVING to define filters [2/2] 22 SELECT Customers.CustomerID, Customers.Name, Count(Sales.SalesID) FROM Customers INNER JOIN Sales ON Customers.CustomerID = Sales.CustomerID WHERE Sales.LastSaleDate BETWEEN #10/10/2019# AND #17/10/2019# GROUP BY Customers.CustomerID, Customers.Name HAVING Count(Sales.SalesID) > 5 HAVING should only be used when filtering on an aggregated field. In the query above, we could additionally filter for customers with greater than 5 sales using a HAVING statement.
  • 23.
    23 1. Proper indexing 2.Retrieve the relevant data only 3. Getting rid of correlated subqueries 4. Using or avoiding temporary tables according to requirement 5. Avoid coding loops OPTIMIZATION 1. Proper indexing 2. Retrieve the relevant data only 3. Getting rid of correlated subqueries 4. Using or avoiding temporary tables according to requirement 5. Avoid coding loops MONITORING PERFORMANCE 1. Monitor Availability and Resource Consumption 2. Measure and Compare Throughput 3. Monitor Expensive Queries 4. Track Database Changes 5. Monitor Logs A data mart is a subset of a data warehouse oriented to a specific business line. Data marts contain repositories of summarized data collected for analysis on a specific section or unit within an organization, for example, the sales department. The Oracle online table reorganization package, (dbms_redefinition) is used to reorganize tables while they are accepting updates. Partitioning is the database process where very large tables are divided into multiple smaller parts. In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function. external
  • 24.
    Full Table Scan:Example 24 The following statement queries monthly salaries over SAR 4000
  • 25.
  • 26.
    Indexes and QueryOptimization  Indexes are crucial in speeding up data access because they facilitate searching, sorting, and using aggregate functions and even join operations. 26
  • 27.
    Index Range Scan:Example 27 SELECT * FROM employees WHERE department_id = 20 AND salary > 1000; If indexes are so important, why not index every column in every table? Indexing every column in every table overtaxes the DBMS in terms of index- maintenance processing, especially if the table has many attributes and rows, or requires many inserts, updates, and deletes.
  • 28.
    Database Query OptimizationModes  Algorithms proposed for query optimization are based on: 1. Selection of the optimum order to achieve the fastest execution time 2. Selection of sites to be accessed to minimize communication costs  Evaluated on the basis of: 1. Operation mode 2. Timing of its optimization 28
  • 29.
    Query Blocks  EachSELECT block in the original SQL statement is represented internally by a query block. A query block can be a top-level statement, subquery, or unmerged view. 29 external
  • 30.
    Query Optimizer Operations 30 Theoptimizer contains three components: the transformer, estimator, and plan generator. external
  • 31.
  • 32.
    Join Method Example AdaptiveQuery Plans SELECT product_name FROM order_items o, prod_info p WHERE o.unit_price = 15 AND quantity > 1 AND p.product_id = o.product_id 32 external Hybrid Hash Distribution EXPLAIN PLAN FOR SELECT /*+ parallel(4) full(e) full(d) */ department_name, sum(salary) FROM employees e, departments d WHERE d.department_id=e.department_id GROUP BY department_name;
  • 33.
    Using Hints toAffect Optimizer Choices 33
  • 34.
    DBMS tuning hierarchicallevel 34 external
  • 35.
    Adaptive Query Plans 35 StatisticsFeedback SELECT o.order_id, v.product_name FROM orders o, ( SELECT order_id, product_name FROM order_items o, product_information p WHERE p.product_id = o.product_id AND list_price < 50 AND min_price < 40 ) v WHERE o.order_id = v.order_id external
  • 36.
    Classification of OperationModes  Automatic query optimization: DBMS finds the most cost-effective access path without user intervention  Manual query optimization: Requires that the optimization be selected and scheduled by the end user or programmer 36
  • 37.
    Classification Based onTiming of Optimization  Static query optimization: best optimization strategy is selected when the query is compiled by the DBMS  Takes place at compilation time  Dynamic query optimization: Access strategy is dynamically determined by the DBMS at run time, using the most up-to-date information about the database  Takes place at execution time 37
  • 38.
    Classification Based onType of Information Used to Optimize the Query  Statistically based query optimization algorithm: Statistics are used by the DBMS to determine the best access strategy  Statistical information is generated by DBMS through:  Dynamic statistical generation mode  Manual statistical generation mode  Rule-based query optimization algorithm: based on a set of user-defined rules to determine the best query access strategy 38
  • 39.
    DBMS Architecture  Alldata in a database are stored in data files  Data files automatically expand in predefined increments known as extends  Data files are grouped in file groups or table spaces  Table space or file group: Logical grouping of several data files that store data with similar characteristics  Data cache or buffer cache: Shared, reserved memory area  Stores most recently accessed data blocks in RAM 39
  • 40.
    DBMS Architecture  SQLcache or procedure cache: Stores most recently executed SQL statements or PL/SQL procedures  DBMS retrieves data from permanent storage and places them in RAM  Input/output request: Low-level data access operation that reads or writes data to and from computer devices  Data cache is faster than working with data files  Majority of performance-tuning activities focus on minimizing I/O operations 40
  • 41.
    Find Current Locationof Data and Log File of All the Database 41 SELECT name, physical_name AS current_file_location FROM sys.master_files
  • 42.
    42 DBMS - FileStructure external
  • 43.
    Figure 11.1 -Basic DBMS Architecture 43
  • 44.
    SQL Execution Phase All I/O operations indicated in the access plan are executed  Locks are acquired  Data are retrieved and placed in data cache  Transaction management commands are processed 44
  • 45.
    SQL Execution Phase 45 UsingSQL Access Advisor external
  • 46.
    SQL Fetching Phase Rows of resulting query result set are returned to client  DBMS may use temporary table space to store temporary data  Database server coordinates the movement of the result set rows from the server cache to the client cache 46
  • 47.
    Indexes and QueryOptimization  Indexes  Help speed up data access  Facilitate searching, sorting, using aggregate functions, and join operations  Ordered set of values that contain the index key and pointers  More efficient than a full table scan 47 SELECT * FROM Table WITH(INDEX(Index_Name)) external
  • 48.
    Indexes and QueryOptimization  Data sparsity: Number of different values a column could have  Data structures used to implement indexes:  Hash indexes  B-tree indexes  Bitmap indexes  DBMSs determine best type of index to use 48
  • 49.
    Creating a Namedb-Tree Composite 49 external
  • 50.
    Optimizer Choices  Rule-basedoptimizer: Uses preset rules and points to determine the best approach to execute a query  Cost-based optimizer: Uses algorithms based on statistics about objects being accessed to determine the best approach to execute a query 50
  • 51.
    Using Hints toAffect Optimizer Choices  Optimizer might not choose the best execution plan  Makes decisions based on existing statistics, which might be old  Might choose less-efficient decisions  Optimizer hints: Special instructions for the optimizer, embedded in the SQL command text 51
  • 52.
    Optimizer Hints  Ahint is embedded within a SQL comment.  The hint comment must immediately follow the first keyword of a SQL statement block. You can use either style of comment: a slash-star (/*) or pair of dashes (--). The plus-sign (+) hint delimiter must immediately follow the comment delimiter, with no space permitted before the plus sign, as in the following fragment: 52 external
  • 53.
    Table 11.5 -Optimizer Hints 53
  • 54.
    SQL Performance Tuning Evaluated from client perspective  Most current relational DBMSs perform automatic query optimization at the server end  Most SQL performance optimization techniques are DBMS-specific and thus rarely portable  Majority of performance problems are related to poorly written SQL code 54
  • 55.
    Index Selectivity  Measureof the likelihood that an index will be used in query processing  Indexes are used when a subset of rows from a large table is to be selected based on a given condition  Index cannot always be used to improve performance  Function-based index: Based on a specific SQL function or expression 55
  • 56.
    Conditional Expressions  Expressedwithin WHERE or HAVING clauses of a SQL statement  Restricts the output of a query to only rows matching conditional criteria  Guidelines to write efficient conditional expressions in SQL code  Use simple columns or literals as operands  Numeric field comparisons are faster than character, date, and NULL comparisons 56
  • 57.
    SQL Expressions  OracleDatabase does not accept all forms of expressions in all parts of all SQL statements. Information URL 57 external
  • 58.
    Conditional Expressions  Equalitycomparisons are faster than inequality comparisons  Transform conditional expressions to use literals  Write equality conditions first when using multiple conditional expressions  When using multiple AND conditions, write the condition most likely to be false first  When using multiple OR conditions, put the condition most likely to be true first  Avoid the use of NOT logical operator 58
  • 59.
    Query Formulation  Identifywhat columns and computations are required  Identify source tables  Determine how to join tables  Determine what selection criteria are needed  Determine the order in which to display the output 59
  • 60.
    DBMS Performance Tuning Managing DBMS processes in primary memory and the structures in physical storage  DBMS performance tuning at server end focuses on setting parameters used for: 1. Data cache 2. SQL cache 3. Sort cache 4. Optimizer mode  In-memory database: Store large portions of the database in primary storage 60
  • 61.
    DBMS Performance Tuning Recommendations for physical storage of databases:  Use RAID (Redundant Array of Independent Disks) to provide a balance between performance improvement and fault tolerance  Minimize disk contention  Put high-usage tables in their own table spaces  Assign separate data files in separate storage volumes for indexes, system, and high-usage tables 61
  • 62.
    DBMS Performance Tuning Take advantage of the various table storage organizations in the database  Index-organized table or clustered index table: Stores the end-user data and the index data in consecutive locations in permanent storage  Partition tables based on usage  Use denormalized tables where appropriate  Store computed and aggregate attributes in tables 62
  • 63.
    63 Databases, tablespaces, anddatafiles are closely related, but they have important differences: • An Oracle database consists of one or more logical storage units called tablespaces, which collectively store all of the database's data. • Each tablespace in an Oracle database consists of one or more files called datafiles, which are physical structures that conform to the operating system in which Oracle is running. • A database's data is collectively stored in the datafiles that constitute each tablespace of the database. For example, the simplest Oracle database would have one tablespace and one datafile. Another database can have three tablespaces, each consisting of two datafiles (for a total of six datafiles). external
  • 64.
    CREATE TABLESPACE  Createa Shadow Tablespace: Example 64 The following example creates a 10 MB undo tablespace undots1:
  • 65.
    Creating a TemporaryTablespace: Example  This statement shows how the temporary tablespace that serves as the default temporary tablespace for database users in the sample database was created: 65
  • 66.
    Enabling Autoextend fora Tablespace: Example 66 This statement creates a tablespace named tbs_02 with one data file. When more space is required, 500 kilobyte extents will be added up to a maximum size of 100 megabytes:
  • 67.
    Using Multiple Tablespaces You can use multiple tablespaces to perform the following tasks: 1. Control disk space allocation for database data 2. Assign specific space quotas for database users 3. Control availability of data by taking individual tablespaces online or offline 4. Perform partial database backup or recovery operations 5. Allocate data storage across devices to improve performance 67 external
  • 68.
    Using Tablespaces  Adatabase administrator can use tablespaces to do the following actions: 1. Create new tablespaces 2. Add datafiles to tablespaces 3. Set and alter default segment storage settings for segments created in a tablespace 4. Make a tablespace read only or read/write 5. Make a tablespace temporary or permanent 6. Rename tablespaces 7. Drop tablespaces 68 external
  • 69.
    Create TABLESPCE: Example 69 CREATETABLESPACE lmtbsb DATAFILE '/u02/oracle/data/lmtbsb01.dbf' SIZE 50M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 128K; In this example, a 128K extent size is specified. Each 128K extent (which, if the tablespace block size is 2K, is equivalent to 64 Oracle blocks) is represented by a bit in the extent bitmap for this file.
  • 70.
    Make a tablespaceread only 70 ALTER TABLESPACE sales_1 READ ONLY; ALTER TABLESPACE LMS_Student READ ONLY;
  • 71.
    Creating Oracle ManagedFiles: Examples 71 CREATE TABLESPACE omf_ts2 DATAFILE AUTOEXTEND OFF; The following example creates a tablespace with an Oracle-managed data file of 100M that is not autoextensible: ALTER SYSTEM SET DB_CREATE_FILE_DEST = '$ORACLE_HOME/rdbms/dbs'; CREATE TABLESPACE omf_ts1; The following example sets the default location for data file creation and creates a tablespace with a data file in the default location. The data file is 100M and is autoextensible with an unlimited maximum size:

Editor's Notes

  • #3 Indexes are special lookup tables that the database search engine can use to speed up data retrieval. Simply put, an index is a pointer to data in a table. An index in a database is very similar to an index in the back of a book.
  • #9 A histogram is a graphical display of data using bars of different heights. In a histogram, each bar groups numbers into ranges. Taller bars show that more data falls in that range. A histogram displays the shape and spread of continuous sample data. s
  • #10 When you generate statistics for a table, all related indexes are also analyzed. However, you could generate statistics for a single index by using the following command, where VEND_NDX is the name of the index: ANALYZE INDEX VEND_NDX COMPUTE STATISTICS;
  • #12 The SELECT DISTINCT statement is used to return only distinct (different) values. Inside a table, a column often contains many duplicate values; and sometimes you only want to list the different (distinct) values.
  • #13 SELECT DISTINCT sp.atcoCode, sp.name, sp.longitude, sp.latitude FROM `transportdata`.stoppoints as sp INNER JOIN `vehicledata`.gtfsstop_times as st ON sp.atcoCode = st.fk_atco_code INNER JOIN `vehicledata`.gtfstrips as trip ON st.trip_id = trip.trip_id INNER JOIN `vehicledata`.gtfsroutes as route ON trip.route_id = route.route_id INNER JOIN `vehicledata`.gtfsagencys as agency ON route.agency_id = agency.agency_id WHERE agency.agency_id IN (1,2,3,4);
  • #19 This graphic shows a shared pool check. At the top are three boxes laid on top of one another, each box smaller than the one behind it. The smallest box shows hash values and is labeled Shared SQL Area. The second box is labeled Shared Pool. The outer box is labeled SGA. Below this box is another box labeled PGA. Within the PGA box is a box labeled Private SQL Area, which contains a hash value. A double-ended arrow connects the upper and lower boxes and is labeled "Comparison of hash values." To the right of the PGA box is a person icon labeled User Process. The icons are connected by a double-sided arrow. Above the User Process icon is an "Update ...." statement. An arrow leads from the User Process below so the Server Process icon below.
  • #21 The INNER JOIN selects all rows from both participating tables as long as there is a match between the columns.
  • #22 HAVING should only be used when filtering on an aggregated field. 
  • #27 If there is no index, the DBMS will perform a full-table scan and read all 14,786 customer rows. Assuming that the index STATE_NDX is created (and analyzed), the DBMS will automatically use the index to locate the first customer with a state equal to 'FL' and hen proceed to read all subsequent CUSTOMER rows, using the row IDs in the index as a guide. Assuming that only five rows meet the condition CUS_STATE = 'FL' there are five accesses to the index and five accesses to the data, for a total of 10 I/O accesses. The DBMS would be saved from reading approximately 14,776 I/O requests for customer rows that do not meet the criteria. That is a lot of CPU cycles!
  • #46 This flowchart diagram shows the SQL Tuning Sets APIs. The following SQL Tuning Set API procedures are shown: CREATE_SQLSET, DROP_SQLSET, LOAD_SQLSET, UNPACK_STQTAB_SQLSET, and PACK_STQTAB_SQLSET. All procedures either lead into or from the SQL Tuning Set (STS). Three input sources are shown leading into the LOAD_SQLSET procedure: cursor cache, AWR, and another STS.
  • #50 Assume that the value of SPARSEINDEX is BTREE. The following statements define two objects: a named composite that has a b-tree index and base dimensions of market and a variable called expenses that is dimensioned by the month dimension and the market.product composite.
  • #58 { simple_expression | compound_expression | calc_meas_expression | case_expression | cursor_expression | datetime_expression | function_expression | interval_expression | JSON_object_access_expr | model_expression | object_access_expression | scalar_subquery_expression | type_constructor_expression | variable_expression }
  • #65 CREATE BIGFILE TABLESPACE <shadow_tablespace_name> DATAFILE <datafile_name> SIZE 100000000 BLOCKSIZE 8K LOST WRITE PROTECTION
  • #66 CREATE TEMPORARY TABLESPACE temp_demo TEMPFILE 'temp01.dbf' SIZE 5M AUTOEXTEND ON;
  • #67 CREATE TABLESPACE tbs_02 DATAFILE 'diskb:tbs_f5.dbf' SIZE 500K REUSE AUTOEXTEND ON NEXT 500K MAXSIZE 100M;
  • #70 CREATE TABLESPACE lmtbsb DATAFILE '/u02/oracle/data/lmtbsb01.dbf' SIZE 50M EXTENT MANAGEMENT LOCAL UNIFORM SIZE 128K;