Performance troubleshooting
using Active Session History
OUG Ireland
Conference 2012
Marcin Przepirowski
Intro : About me
Oracle DBA / consultant / trainer since 2000
Oracle ACE since 2010
Blogger - http://oracleprof.blogspot.com/
Agenda
Performance tuning problems and goals
DBTime is money
Average Active Session
Active Session History
Case study
Performance tuning
Define a metrics to measure
best metrics are related to time
and business activities
Compare performance with system baseline
and business goals
IT
USER
Tuning goals
99 % of response time for SQL/business
transaction has to be X seconds, rest has to be
maximum Y seconds
For 99 % of executions - X number of
rows/business operations have to be
processed in Y seconds
DBTime is money
DB Time
Oracle doc Database time represents the total
time spent in database calls, and is an indicator of
the total instance workload
Sum of CPU time and not idle wait time (IO and
others)
ASH samples counts = DB Time in seconds
(proof in DB Time Oracle Performance Tuning: Theory and Practice by
Graham Wood, John Beresniewicz)
DB Time
DB Time increase when system load increase
number of active sessions increase
number of calls increase
DB Time increase when system
throughput/performance decrease
IO time increase
non-idle events time increase (locks, network
problems)
7
Average Active Session
Average Active Session
AAS = DB Time / Elapsed clock time
Average Active Session using ASH samples
AAS = ASH samples ( t=1s)/ Elapsed clock time
For more detail go to Appendix A
of Average active sessions: the magic metric ? By John Beresniewicz
8
Average Active Session
Time
t0
t1
AAS = (DB TIME(t1) DB TIME(t0) ) / (t1 t0)
9
Average Active Session
Time
t0
2
t1
2
No of samples
AAS = ASH samples/ (t1 t0)
10
AAS using ASH
DB TIME
ASH samples
Time
t0
t1
Be aware of using small number of samples poor
and misleading results
More samples better results
11
Average Active Session
Total upper AAS boundary is equal to number
of sessions connected to instance
AAS 0 database is idle
AAS < no of CPU no system bottleneck
AAS >> no of CPU database bottleneck
12
Average Active Session - CPU
Upper boundary for CPU class number of
CPU cores
If CPU class is close to upper boundary there is
a CPU bottleneck in the system
13
Average Active Session - IO
Upper boundary for IO vs. maximum system
IO/s
Single IO operation time
single block read
multiple block read
Number of IO requests in one ASH sample
#IO requests / sample 1 s / max ( sbrt, mbrt )
14
Average Active Session - IO
System max IO/s = 8000 IO/s
Avg IO time = 5 ms (0.005 s)
Max #IO per ASH sample = 1 / 0.005 = 200
Upper boundary for User/System IO class is
8000 / 200 = 40
15
Active Session History
Active Session History is system-wide activity
trace
Active sessions are sampled and keep inmemory
Active != Idle - not idle event or on CPU
Sampled every second
Inserted into circular buffer in SGA
MMON Light take care about it
16
Active Session History
Started with first release of Oracle 10g
Enhancement in all next versions number of
columns/metrics increased
Together with AWR can provide historical
information every 1/10 of samples are store
in persistent tables
17
Active Session History
45 columns in 10.2.0.4
66 columns in 11.1.0.7
93 columns in 11.2.0.1
Important new columns:
- sql_plan_line_id 11g
- machine 11gR2
- delta columns 11gR2
18
Active Session History
V$ACTIVE_SESSION_HISTORY
Main view based on in-memory X$ASH table.
Circular buffer size from 1 to 128 MB
about 2 MB / CPU.
Flushed to disk:
- every hour / AWR sample
- when buffer is 2/3 full
19
Active Session History
DBA_HIST_ACTIVE_SESS_HISTORY
Flushed history of ASH persistent table
1/10 of sampled data
Partitioned for easier purging
Part of AWR system
20
Using ASH ON CPU
Column name
Value
SESSION_STATE
ON CPU
WAIT_TIME
Non zero value
EVENT, P1, P2, P3,
CURRENT_OBJ#,
CURRENT_FILE#,
CURRENT_BLOCK#
Maybe not cleared
from previous event
21
Using ASH - WAITING
Column name
Value
SESSION_STATE
WAITING
WAIT_TIME
TIME_WAITED
only last sample is
updated/fixed after
event DO NOT use in
ASH calculations
22
Using ASH - WAITING
SQL> select sample_time, event, TIME_WAITED from
v$active_session_history where session_id = 170
and sql_id = '6548zp29zsqgj' order by 1;
SAMPLE_TIME
-----------------08.39.59.483 AM
08.40.00.483 AM
08.40.01.483 AM
08.40.02.483 AM
. . .
08.40.18.503 AM
08.40.19.503 AM
EVENT
TIME_WAITED
------------------------------- ----------enq: TX - row lock contention
0
enq: TX - row lock contention
0
enq: TX - row lock contention
0
enq: TX - row lock contention
0
enq: TX - row lock contention
enq: TX - row lock contention
0
21295426
23
Using ASH - WAITING
EVENTS 1142 in 11.2.0.2
WAIT CLASS
Concurrency
User I/O
System I/O
Administrative
Other
Configuration
Cluster
Application
Queuing
Scheduler
Network
Commit
24
ASH - Math
Use count(*) for calculate wait or ON CPU
time
Group by session_id, sql_id, client_id, etc....
As ASH is a sample do not use time_waited
column in calculations:
sum(time_waited)
avg(time_waited)
etc.
25
ASH - Math
Example of query Profile of SQL queries from ASH
select
sql_id,
event,
count(*) cnt
from v$active_session_history
group by sql_id,event
26
Using ASH - time
sample_time when it happen
Using samples across time
in-memory 1 second sample
on-disk 10 second sample
select count(*)
from v$active_session_history ...
select count(*) * 10
from dba_hist_active_sess_history ...
27
Using ASH - Drilldown
session_id
user_id
program
module
action
machine
client_id
sql_id
pl/sql
blocking_session
object
file
etc.
28
Top SQL
select sql_id,
round((count(*) / sum(count(*)) over
())*100,2) ActPct
from v$active_session_history
where sql_id is not null
and sample_time > sysdate5/1440
group by sql_id
order by ActPct
29
Top Session
select session_id, round((count(*) /
sum(count(*)) over ())*100,2) ActPct
from v$active_session_history
where
sample_time > sysdate5/1440
group by session_id
order by ActPct desc
30
Top program
select program, round((count(*) /
sum(count(*)) over ())*100,2) ActPct
from v$active_session_history
where
sample_time > sysdate5/1440
group by program
order by ActPct desc
31
Blocking session
select
session_id, event, blocking_session
from v$active_session_history
where
blocking_session is not null
and sample_time > sysdate - 5/1440;
32
OEM 11g
33
OEM 12c ASH Analytics
34
OEM 12c ASH Analytics
35
OEM 12c ASH Analytics
36
OEM 12c ASH Analytics
37
OEM 12c ASH Analytics
38
ASH availability
Oracle 10g onwards
Oracle Enterprise Edition only
Diagnostic and Tuning Pack required
Accessibility:
Oracle Enterprise Manager Performance Tab
Text / html reports ASHRPT.SQL
V$ and DBA_HIST tables
39
Other options
3rd party products DB Optimizer, I3, Spotlight
Free solution Simulating-ASH
http://ashmasters.com/
https://sourceforge.net/projects/orasash
https://github.com/pioro/orasash
Interactive ASH tool with statistics
Snapper v. 3.0. by Tanel Poder
40
Case study
Average Active Sessions vs. system throughput
Is big stuff always a problem ?
Do I need to investigate a big stuff ?
Compare it with throughput first DB Time
will increase when system load increase
41
Case study
AverageTransactions/s - 80.23
AverageTransactions/s - 98.18
42
Case Study
Average Active Session system load
Load profile
SQL plan flip
SQL query profile
43
AAS system load
select mtime, round(sum(c1),2) AAS_WAIT,
round(sum(c2),2) AAS_CPU,
round(sum(cnt),2) AAS
from ( select
to_char(sample_time,'YYYY-MM-DD HH24) mtime,
decode(session_state,'WAITING',count(*),0)/360 c1,
decode(session_state,'ON CPU',count(*),0) /360 c2,
count(*)/360 cnt
from dba_hist_active_sess_history
group by to_char(sample_time,'YYYY-MM-DD HH24'),
session_state
)
group by mtime order by mtime;
44
AAS system load
MTIME
AAS_WAIT AAS_CPU
AAS
------------- -------- ------- -----2012-03-03 08
6.36
4.23 10.58
2012-03-03 09
17.91
6.64 24.54
2012-03-03 10
33.10
8.63 41.73
2012-03-03 11
29.34
8.90 38.24
2012-03-03 12
29.76
8.76 38.52
2012-03-03 13
33.30
9.59 42.89
2012-03-03 14
38.02
9.40 47.42
2012-03-03 15
21.05
6.11 27.16
2012-03-03 16
3.81
1.63
5.44
2012-03-03 17
3.84
1.41
5.25
45
AAS system load
46
Load profile
select * from (
select
decode(session_state,'WAITING',event,'ON CPU'),
count(*) cnt,
count(*) / (sum(count(*)) over ()) * 100 pct
from dba_hist_active_sess_history
where
sample_time between X and Y
group by
decode(session_state,'WAITING',event,'ON CPU')
order by cnt desc
) where rownum <= 5;
47
Load profile
EVENT
CNT
PCT
------------------------------ ---------- ------enq: US - contention
8059
53.65
ON CPU
3107
20.68
db file sequential read
645
4.29
latch: undo global data
536
3.57
enq: TX - row lock contention
446
2.97
ON CPU
db file sequential read
enq: US - contention
LNS wait on SENDREQ
LGWR-LNS wait on channel
1347
524
365
215
201
38.38
14.93
10.40
6.13
5.73
48
SQL plan flip
select sql_id, count(*)
from (
select distinct sql_id, sql_plan_hash_value
from dba_hist_active_sess_history
where sql_opname <> 'INSERT' and user_id = X
and sql_plan_hash_value <> 0)
group by sql_id
having count(*) > 1
order by 2;
49
SQL plan flip
SQL_ID
COUNT(*)
------------- ---------a7rf2v00g2y9t
2
8w2904am00nbq
2
dyuz1k3bdr36s
2
7upuf7wbk7nbf
2
da6wsxtsz2ftk
3
gur9jrbxvbvur
No sql_opname filter
50
SQL plan flip ?
select distinct sql_id, sql_plan_hash_value
from dba_hist_active_sess_history
where sql_id = 'gur9jrbxvbvur';
SQL_ID
SQL_PLAN_HASH_VALUE
------------- ------------------gur9jrbxvbvur
2263242137
gur9jrbxvbvur
3926490670
select sql_id from v$sql
where plan_hash_value = 3926490670;
SQL_ID
------------62fbdqwqht3x1
51
Lazy clean up
select *
from table(dbms_xplan.display_cursor
('62fbdqwqht3x1',null));
SQL_ID 62fbdqwqht3x1, child number 0
------------------------------------select KVAL_SEQUENCE.nextval from dual
Plan hash value: 3926490670
---------------------|
0 | SELECT STATEMENT |
|
1 | SEQUENCE
|
|
2 |
FAST DUAL
|
52
SQL plan flip
select
distinct sql_id, sql_plan_hash_value, sql_opname
from dba_hist_active_sess_history
where sql_id = 'dyuz1k3bdr36s';
SQL_ID
SQL_PLAN_HASH_VALUE SQL_OPNAME
------------- ------------------- ----------dyuz1k3bdr36s
760792886 SELECT
dyuz1k3bdr36s
2913971644 SELECT
53
SQL plan flip
select
to_char(sample_time, 'YYYY-MM-DD HH24') sample_time,
sql_id, sql_plan_hash_value
from dba_hist_active_sess_history
where
sql_id = 'dyuz1k3bdr36s'
group by
to_char(sample_time, 'YYYY-MM-DD HH24'),
sql_id, sql_plan_hash_value
order by sample_time;
54
SQL plan flip
SAMPLE_TIME
------------2012-03-01 08
2012-03-01 09
2012-03-01 10
2012-03-01 10
2012-03-01 11
. . .
2012-03-02 11
2012-03-02 12
2012-03-02 12
2012-03-02 13
2012-03-02 14
SQL_ID
SQL_PLAN_HASH_VALUE
------------- ------------------dyuz1k3bdr36s
760792886
dyuz1k3bdr36s
760792886
dyuz1k3bdr36s
760792886
dyuz1k3bdr36s
2913971644
dyuz1k3bdr36s
2913971644
dyuz1k3bdr36s
dyuz1k3bdr36s
dyuz1k3bdr36s
dyuz1k3bdr36s
dyuz1k3bdr36s
2913971644
760792886
2913971644
760792886
760792886
55
SQL query profile
select sql_plan_hash_value, sql_exec_id,
count(*) cnt
from dba_hist_active_sess_history
where sample_time between X and Y
and sql_id = 'dyuz1k3bdr36s'
group by sql_plan_hash_value,sql_exec_id
order by sql_plan_hash_value, cnt;
SQL_PLAN_HASH_VALUE SQL_EXEC_ID
CNT
------------------- ----------- ---------760792886
17324679
1
2913971644
17324251
51
56
SQL query profile
select
decode(session_state,'WAITING',event,'ON CPU'),
count(*) cnt
from dba_hist_active_sess_history
where sql_id = 'dyuz1k3bdr36s'
and SQL_EXEC_ID = 999999999
group by
decode(session_state,'WAITING',event,'ON CPU')
order by cnt;
57
SQL query profile
SQL EXEC ID - 17324255
EVENT
CNT
----------------------------- ---------db file sequential read
3
ON CPU
6
db file scattered read
41
SQL EXEC ID - 17324679
EVENT
CNT
----------------------------- ---------db file scattered read
1
58
Q&A
59