Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1

Troubleshoo4ng the Most Complex Performance Issue I’ve ever seen Tanel Poder hAp://blog.tanelpoder.com hAp://tech.e2sn.com www.enkitec.com 1

Intro: About me •  Tanel Põder Oracle Database Performance geek Exadata Performance geek Hadoop Performance geek •  Enkitec •  Consultant •  Researcher •  Technology Evangelist •  Just moved to Dallas •  ANer Tallinn -‐> Stockholm -‐> London -‐> Cancun -‐> Singapore www.enkitec.com Expert Oracle Exadata book (with Kerry Osborne and Randy Johnson of Enkitec) 2

Two issues -‐ actually •  For warm-‐up: •  cursor pin: S wait events and sporadic CPU spikes •  Read more from my blog entry: •  hAp://blog.tanelpoder.com/2010/04/21/cursor-‐pin-‐s-‐waits-‐sporadic-‐cpu-‐ spikes-‐and-‐systema4c-‐troubleshoo4ng/ •  Or just google for “cursor pin s” www.enkitec.com 3

Environment •  •  •  •  High-‐concurrency, high-‐visibility OLTP database Oracle 11.1.0.7 single-‐instance, dedicated server processes HP-‐UX on Itanium 32 CPUs, 128 GB RAM •  Thousands of end users •  Mul4ple WebLogic applica4on servers talking to database via connec4on pools www.enkitec.com 4

The problem •  •  •  •  Sporadic extreme slowness of Oracle DB and the server Slowness lasts for 1 .. 20 minutes at a 4me… Queries don’t answer or extremely slow Can’t even log on to OS during that 4me •  New SSH connec4ons succeded once the spike was over •  It takes minutes to run simple OS commands during the problem 4me •  This is a global server-‐wide problem – everyone complains! So, the scope of this problem is global, server-‐wide. Therefore we can use global, server-‐wide metrics to diagnose the problem. www.enkitec.com 5

Let’s pick and diagnose one occurrence of this problem •  The database response 4mes extremely bad again around 18:10 and this lasted for about 5 minutes… •  If it’s the users who report the problem (as opposed to applica4on side measurements), then there may be some discrepancies in the user reported 4mes vs actual problem 4me www.enkitec.com 6

Ini4al AWR Report Number of sessions has grown by ~500! Snap Id Snap Time Sessions Curs/Sess --------- ------------------- -------- --------Begin Snap: 61921 30-Oct-10 18:00:10 2,383 28.9 Using a 20 minute End Snap: 61922 30-Oct-10 18:20:20 2,863 24.7 report for diagnosing Elapsed: 20.17 (mins) a 5 minute problem?! DB Time: 559.31 (mins) Avg % DB Event Waits Time(s) (ms) time Wait Class --------------------------- --------- ----------- ------ ------ ---------db file sequential read 2,135,668 21,468 10 64.0 User I/O DB CPU 5,860 17.5 log file sync 92,720 1,498 16 4.5 Commit read by other session 91,676 1,307 14 3.9 User I/O SQL*Net message from dblink 525 1,132 2155 3.4 Network This 66% idle is an average over 20 minutes! Host CPU (CPUs: ~~~~~~~~ 32 Cores: 32 Sockets: 32) Load Average Begin End %User %System %WIO %Idle --------- --------- --------- --------- --------- --------0.37 0.31 16.3 17.6 12.1 66.1 www.enkitec.com 7

ASH data (shown in OEM) •  Average ac4ve sessions showed something diﬀerent •  Note that this data is from another period of 4me when a similar spike happened •  In worst 4mes there were up to 220 ac4ve sessions trying to be on CPU! •  Thanks to beAer granularity we see the spikes instead of some 20-‐minute or hourly averages… •  The problem with ASH samples is that it looks into session state from inside Oracle •  Perhaps the starva5on is due to some other applica5on / instance in the server? www.enkitec.com 8

How many logons were done? Number of sessions has grown by ~500! Snap Id Snap Time Sessions Curs/Sess --------- ------------------- -------- --------Begin Snap: 61921 30-Oct-10 18:00:10 2,383 28.9 End Snap: 61922 30-Oct-10 18:20:20 2,863 24.7 Elapsed: 20.17 (mins) DB Time: 559.31 (mins) Statistic Total per Second -------------------------------- ------------------ -------------index fetch by key 24,174,148 19,971.0 index scans kdiixs1 24,565,055 20,293.9 leaf node 90-10 splits 5,865 4.9 leaf node splits 14,529 12.0 lob reads 34,480 28.5 lob writes 1,623,273 1,341.0 lob writes unaligned 1,623,266 1,341.0 logons cumulative 2,550 2.1 messages received 133,740 110.5 messages sent 133,740 110.5 min active SCN optimization appl 538,358 444.8 no buffer to keep pinned count 6,331 5.2 no work - consistent read gets 146,703,542 121,196.1 opened cursors cumulative 4,168,700 3,443.9 www.enkitec.com 2.1 logons per second, but we don’t know how these logons are per Trans ------------distributed over 43.6 the 20 minute 44.3 period! 0.0 0.0 0.1 2.9 2.9 0.0 0.2 0.2 1.0 0.0 264.7 7.5 9

OS level metrics don’t lie (well, they do, but less ;-‐) | | | Date | Time |CPU % | 10/30/2010|17:45:00| 25.06| 10/30/2010|17:50:00| 24.77| 10/30/2010|17:55:00| 24.60| 10/30/2010|18:00:00| 25.95| 10/30/2010|18:05:00| 22.88| 10/30/2010|18:10:00| 66.89| 10/30/2010|18:15:00| 24.51| 10/30/2010|18:20:00| 25.47| 10/30/2010|18:25:00| 28.38| Phys | IO Rt | 7149.5| 5334.8| 7176.4| 7556.2| 5379.5| 4544.6| 7544.9| 5144.0| 10139.5| Phys |Memory|Pg Out | VM Pg | KB Rt | % | Rate | Scan Rt | 145817.6| 74.07| 0.0| 0.0| 60928.0| 73.98| 0.0| 0.0| 186368.0| 73.98| 0.0| 0.0| 192307.2| 74.11| 0.0| 0.0| 67584.0| 74.15| 0.0| 0.0| 58060.8| 76.97| 0.0| 0.0| 159334.4| 76.40| 0.0| 0.0| 59187.2| 75.04| 0.0| 0.0| 151552.0| 74.37| 0.0| 0.0| 1)  What does the Time 18:10:00 mean, beginning of the monitoring interval or end? 2)  66.89% busy during 5 minutes may actually mean 100% busy during ~3 minutes out of 5, but we don’t know that for sure without measuring in more detail (beAer granularity)… www.enkitec.com 10

Measuring CPU u4liza4on in more detail The spike lasted from 18:11 to 18:14 (3 min) Around 90% in Kernel mode!!! www.enkitec.com 11

Checkpoint – measured evidence so far •  Fact: We have a 100% CPU u4liza4on spike, las4ng 3 minutes •  Fact: 90% of it is spent in KERNEL mode •  Fact: We have over 2500 logons done during 20 minute period •  2.1 logons / second on average (which doesn’t sound bad) •  Kernel mode CPU usage is usually caused by system calls •  …or some internal kernel thread ac4vity 100% CPU usage doesn’t always automa4cally mean you have a serious CPU starva4on problem. The CPU runqueue length would indicate you how much starva4on (wai4ng for CPU service) there is. However seeing 90% of CPU used in KERNEL mode is deﬁnitely not normal for an Oracle database server. www.enkitec.com 12

Diagnosing 90% kernel-‐mode CPU usage spikes… 1.  Systema4c •  •  •  •  Break down this 90% of Kernel mode CPU usage Proﬁling! Oh, this is a produc4on system and the problem is acute & ongoing On Solaris, I’d have used Dtrace stack() probe to record OS kernel stack traces most common on CPU (google for dstackprof) •  •  Or lockstat as it reports spins on spinlocks (which consume kernel CPU) But this was HP-‐UX and I didn’t know the tools needed •  •  But I knew what numbers I wanted to see! We sent a request to HP-‐UX support: “How do we measure & break down where is kernel mode CPU used?” 2.  Check for usual suspects •  Fast, cheap checks to rule out or ﬁnd known troublemakers www.enkitec.com 13

Kernel mode CPU usage spikes – the usual suspects •  Before star4ng the systema4c troubleshoo4ng & drilldown, do quick checks for usual suspects •  Remember, the client has a business problem, 4me is of essence… 1.  Logon (or logoﬀ) storms •  Spawning, ini4alizing new processes, opening ﬁles and aAaching to SGA means system calls, kernel CPU usage 2.  Oracle code gevng into some crazy loop (due to a bug) •  Semop(), yield(), read /proc/…, getrusage(), etc loop 3.  OS kernel spinlock conten4on •  •  Not so usual suspects really… Variety of reasons… ONen due to bugs in OS or some kernel module www.enkitec.com 14

Measuring logon storms •  Use the AUD$ records or “logons cumula@ve” number from V$SYSSTAT or AWR, right? •  Wrong! •  logons cumula5ve number is incremented by the session itself – aNer it has logged on, the same applies to audit records! 1.  2.  3.  4.  5.  ANer the listener connec4on has been established… The process has been started… It has aAached to SGA SHM segments… Audit ﬁle has been wriAen (if needed) … Process, session SGA structures have been created •  Memory from OS and shared pool allocated (shared pool latches!) 6.  Session has been authen4cated 7.  Then the logons cumula4ve is incremented! www.enkitec.com The logon storm may have started way before these logons ﬁnally succeeded! 15

Measuring logon storms •  Logon storms should be measured at the listener level •  Process listener.log using a script: $ tail listener.log 30-OCT-2010 23:22:03 30-OCT-2010 23:22:08 30-OCT-2010 23:22:08 30-OCT-2010 23:22:09 * * * * (CONNECT_DATA=…) (CONNECT_DATA=…) (CONNECT_DATA=…) service_update * * establish * E2SNDB * 0 * establish * E2SNDB * 0 * establish * E2SNDB * 0 E2SNDB * 0 $ fgrep "30-OCT-2010" listener.log | fgrep "establish" | awk '{ print $1 " " $2 }' | awk -F: '{ print $1 }' | sort | uniq –c … 5564 30-OCT-2010 20 5571 30-OCT-2010 21 5703 30-OCT-2010 22 5703 logons during 4828 30-OCT-2010 23 22:00 – 22:59 … www.enkitec.com 16

Measuring logon storms •  Logon storms should be measured at the listener level •  Process listener.log using a script: $ fgrep "30-OCT-2010 22:" listener.log | fgrep "establish" | awk '{ print $1 " " $2 }' | awk -F: '{ print $1 ":" $2 }' | sort | uniq –c … 88 30-OCT-2010 22:00 120 30-OCT-2010 22:01 94 30-OCT-2010 22:02 How many connec4ons 94 30-OCT-2010 22:03 listener established every 95 30-OCT-2010 22:04 minute (this data from non-‐ 120 30-OCT-2010 22:05 problem 4me) 79 30-OCT-2010 22:06 101 30-OCT-2010 22:07 85 30-OCT-2010 22:08 100 30-OCT-2010 22:09 85 30-OCT-2010 22:10 89 30-OCT-2010 22:11 83 30-OCT-2010 22:12 93 30-OCT-2010 22:13 www.enkitec.com 17

Con4nuing the OS kernel mode CPU usage diagnosis Let’s trace system calls by one CPU heavy Oracle process # tusc -cp 8021 ( Attached to process 8021 ("oracleXYZ (LOCAL=NO)") [64-bit] ) ( Detaching from process 8021 ("oracleXYZ (LOCAL=NO)") ) Syscall open ----Total Interes4ngly 3 open() syscalls take over a second in total. Could this be caused by the general CPU starva4on in the server? Seconds 1.05 ----1.05 Calls 3 ----3 The next step should have been to check which ﬁle did we try to open (but the spike ended before we could do that) www.enkitec.com Errors 3 ----3 Seeing errors isn’t a problem itself, as this is how Oracle and libs some4mes check for existence of a ﬁle… 18

Con4nuing the OS kernel mode CPU usage diagnosis •  HP-‐UX support got back to us and recommended the use of runki tool for measuring kernel CPU u4liza4on in detail •  It had to be installed as root •  It measured a lot of things happening in kernel, wri4ng a big output ﬁle •  on Solaris there’s a tool called TNF trace, AIX has trace command for such full kernel tracing •  Now we had to wait for the problem to happen again •  Didn’t have to wait for too long… •  We sent the raw trace dump to HP Support, so they could run something like “tkprof” on that traceﬁle •  It basically just summed up the kernel spinlock wait, spin events by lock, object locked etc www.enkitec.com 19

kiprof – proﬁled runki output Total Hardclock traces: 30239 ================================ State Count Percent USER 7130 23.58 SYS 22965 75.94 IDLE 24 0.08 SSYS 120 0.40 That’s basically spinning for locks (adap4ve decisions to spin or wait) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Kernel Functions executed during profile Count Pct State Function Self-‐explanatory. We ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 15615 51.64% SYS timed_preArbitration are spinning for a lock 7130 23.58% USER OTHER 1275 4.22% SYS spinlock 1008 3.33% SYS wait_for_lock_spinner VxFS directory block 488 1.61% SYS vx_dirbread read? 464 1.53% SYS vx_bc_getblk 417 1.38% SYS preArbitration 338 1.12% SYS vx_dirscan VxFS directory 291 0.96% SYS vx_bc_brelse contents scan !!! www.enkitec.com 20

Drilling down to kernel spinlock spinning The main spinlock experiencing spinning was related to VxFS: spn%kern 7.24 0.11 0.04 0.04 0.01 cumpct 7.24 7.36 7.40 7.44 7.45 seconds 71.96 1.14 0.44 0.42 0.12 spn%cpu lock name 3.77 FS:vxfs:bc_freelist_lock spin 0.06 FS:vxfs:inode spin for sleep lock 0.02 FS:vxfs:i_spinspin lock 0.02 Sleep Queue lock 0.01 v_count_lock Oracle processes were the main ones spinning: spn%spn 99.17 0.60 0.08 0.05 0.03 0.03 cumpct 99.17 99.77 99.85 99.91 99.93 99.96 spnsec 73.82 0.45 0.06 0.04 0.02 0.02 usrsec 880.32 0.00 1.42 0.00 0.00 0.00 kernsec spn%kern process name 543.70 13.58 oracle 7.37 6.11 vxfsd 11.52 0.52 tnslsnr 402.60 0.01 [IDLE] 0.50 4.00 xyz 0.43 4.65 sadc www.enkitec.com 21

Which system calls? $ /opt/ktools/bin/kiinfo -kitrace pid=28679,syscalls -ts 1103_1307 | grep open | sort -r -n -k 13,13 | more 2.550292 cpu=6 seqcnt=43808107 pid=28679 ktid=526366 utid=1 open [5] err=17 ret1=12 ret2=1 syscallbeg= 3.295138 ktcsys= 0.000000 A0=0x9fffffff A1= 0xffffceb0 A2=0 A3=0xd02 A4=0 A5=432 fname='/opt/apps/oracle/ admin/xyzdb/adump/xyzdb_ora_28679_21.aud' 8.789893 cpu=6 seqcnt=43825746 pid=28679 ktid=526366 utid=1 open [5] err=17 ret1=12 ret2=1 syscallbeg= 3.172851 ktcsys= 0.000000 A0=0x9fffffff A1= 0xffffceb0 A2=0 A3=0xd02 A4=0 A5=432 fname='/opt/apps/oracle/ admin/xyzdb/adump/xyzdb_ora_28679_23.aud' 5.617009 cpu=6 seqcnt=43816702 pid=28679 ktid=526366 utid=1 open [5] err=17 ret1=12 ret2=1 syscallbeg= 3.066686 ktcsys= 0.000000 A0=0x9fffffff A1= 0xffffceb0 A2=0 A3=0xd02 A4=0 A5=432 fname='/opt/apps/oracle/ admin/xyzdb/adump/xyzdb_ora_28679_22.aud' 18.143066 cpu=6 seqcnt=43848837 pid=28679 ktid=526366 utid=1 open [5] err=17 ret1=12 ret2=1 syscallbeg= 2.576237 ktcsys= 0.000000 A0=0x9fffffff A1= 0xffffceb0 A2=0 A3=0xd02 A4=0 A5=432 fname='/opt/apps/oracle/ admin/xyzdb/adump/xyzdb_ora_28679_26.aud' www.enkitec.com 22

Audit file des4ona4on •  New audit file name format in 11g… •  A new file is created for each audit file where the SPID collides with a previous file… •  Every @me when crea@ng a new audit file, Oracle has to check whether such file already exists with suffix _1, then _2, _3, etc.. $ cd /u01/app/oracle/admin/E2SNDB/adump $ ls -l | head total 4788 -rw-r----- 1 oracle dba 735 Feb 28 16:06 -rw-r----- 1 oracle dba 710 Oct 16 17:58 -rw-r----- 1 oracle dba 735 Oct 16 17:58 -rw-r----- 1 oracle dba 735 Feb 27 17:53 -rw-r----- 1 oracle dba 736 Oct 16 17:58 -rw-r----- 1 oracle dba 740 Oct 16 17:58 -rw-r----- 1 oracle dba 735 Feb 28 16:07 -rw-r----- 1 oracle dba 735 Feb 24 17:44 -rw-r----- 1 oracle dba 735 Dec 22 21:28 www.enkitec.com e2sndb_ora_10028_1.aud e2sndb_ora_10082_1.aud e2sndb_ora_10082_2.aud e2sndb_ora_10095_1.aud e2sndb_ora_10120_1.aud e2sndb_ora_10125_1.aud e2sndb_ora_10158_1.aud e2sndb_ora_10206_1.aud e2sndb_ora_10482_1.aud 23

Shouldn’t the audit files be created only for SYSDBA and SYSOPER access? •  In theory, yes. •  In prac4ce in our case, no. •  Bug 9744092: EXCESSIVE AMOUNT OF AUD FILES BEING GENERATED IN 11.1 •  Oracle generated a new .aud file for every new database connec4on! •  Not just SYSOPER/SYSDBA like normally •  This is all despite having audit_trail = DB •  Normally the .aud files in audit_dump_dest are not related to regular audit trail, but for SYSOPER/SYSDBA logon, startup/shutdown audi4ng •  …and if AUDIT_SYS_OPERATIONS = TRUE then all commands issued as SYS www.enkitec.com 24

Bug 9744092: EXCESSIVE AMOUNT OF AUD FILES BEING GENERATED IN 11.1 PROBLEM: -------After upgrade to 11.1 the system creates 10 - 16 *aud files per minute in audit_file_dest. Out of 9528 *.aud files that the customer uploaded, 9124 files recorded ACTION:[3] "102". DIAGNOSTIC ANALYSIS: -------------------The change of behavior (move audit action 102 from aud$ to audit_file_dest when audit_trail=DB.) is due to a fix for an unpublished bug 5476184 in 11.1. It is not an intended feature for 11G. WORKAROUND: ----------Manual delete of audit files www.enkitec.com 25

Diagnosis 1.  Thanks to bug 9744092 and a behavior change in Oracle 11.1 a new audit file was created for each new connec4on to DB •  If a file already existed, Oracle checked if a similar file name with larger suffix value (_2, _3, _4 etc) existed 2.  The audit_file_dest eventually had over 1.5 M files in it! •  For each logon, mul4ple file existence checks (open() syscalls) had to be done 3.  Checking whether a file existed (open syscall -‐> directory entry scan) became very slow – and it’s done in kernel mode •  •  A spinlock was held during the directory entry scan Other new Oracle processes also wanted to do the directory scan, resul4ng in spinlock conten4on and further Kernel mode CPU usage 4.  When the DB got slow – app servers fired up hundreds of new connec4ons to “make things faster” •  This all fed back to the problem – even more conten4on & spinning www.enkitec.com 26

Limi4ng logon storms Use Oracle Listener connec4on rate limiter (11gR1+) listener.ora: LISTENER= (ADDRESS_LIST= (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521)(RATE_LIMIT=5)) (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1522)(RATE_LIMIT=10)) (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1523)) ) Oracle Documenta5on: Oracle Net Listener Parameters (listener.ora) hQp://download.oracle.com/docs/cd/B28359_01/network.111/b28317/listener.htm Also, it is possible to limit logoﬀ storm rate _logout_storm_rate parameter (instance-‐wide) www.enkitec.com 27

Troubleshoo4ng sporadic system performance issues Right Data !!! •  Right scope – if your problem lasts for seconds, this should be the granularity of your data too •  OS level data, in addi4on to the database metrics •  Ideally OS level metrics sampled mul4ple 4mes per minute www.enkitec.com 28

Conclusions •  Logon storms are evil! •  They will amplify any performance hiccups as they cause extra load just when the resources are scarcest •  Connec4on pools ﬁring up hundreds of new connec4ons are evil! •  Know your limits (both max connec4ons and max connect rate / sec) •  Here’s a thought: •  If you have planned the servers’ capacity to support N-‐thousand connec4ons anyway (by allowing connec4on pools grow that high), why not create this amount of connec4ons right away? •  This would avoid logon storms during worst 4mes as all connec4ons have already been created! www.enkitec.com 29

Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1

In this document

More Related Content

What's hot

Similar to Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1

More from Tanel Poder

Recently uploaded

Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 1