Jagjeet's Oracle Blog!

December 12, 2011

Library cache: mutex X

Filed under: Oracle 11g — Tags: , , — Jagjeet Singh @ 5:24 pm

Recently, I was engaged in Exadata POV for benchmarking for Ebiz R12 on Exa* boxes. We had quarter rack running 11.2.0 db with 2 node RAC Cluster (24 CPU) and all other components/Middle Teir including Form, Web, CM were configured on Exalogic Server. Objective was to achieve best possible scalability numbers on Exa* boxes for few Ebiz modules.

Both boxes were connected with Infiniband network, Test scenarios were given for Order2Cash and Self-services modules. Idea was to utilized Load runner to generate the load using few hundred users. After struggling a while we could streamlined our testing scripts and now were ready to focus on performance.

We had several hiccups while running performance tests but issues were eroded by following generic guidelines on performance tuning, we were doing good progress until we stuck with “library cache: mutex X” issues which was kind of road block for us. Each time we touch 600 concurrent users for Order 2 Cash, we had high waits on mutex issue, CPU utilization were constantly touching 100%.  Have a look to below screenshot, most of the sessions were waiting on concurrency.

One of my colleague immediately pointed out about known bug in Oracle 11g for mutex issues, We could find several hits for the bug but apply a patch was not an option due to conflict with

Exadata Bundle Patch. Latest BP did not solve the issue and we were left out to do something by own ( if we can ).

Plan B – as always, Google is the only option for people like me ( may not be the good solution for production like instances ), I was lucky to came across with http://andreynikolaev.wordpress.com/2011/05/01/divide-and-conquer-the-true-mutex-contention/    Andrey S. Nikolaev has a great post sharing his experience to handle mutex issues. I was excited to see if out database is also suffering with same issues.

I used script from his post “library_cache_objects_heavily_accessed.sql” and below was output from our DB.

Select * from (
Select case when (kglhdadr = kglhdpar) then 'Parent' else 'Child '||kglobt09 end cursor,
kglhdadr ADDRESS,substr(kglnaobj,1,20) name, kglnahsh hash_value,kglobtyd type,kglobt23 LOCKED_TOTAL,kglobt24 PINNED_TOTAL,kglhdexc EXECUTIONS,kglhdnsp NAMESPACE
from x$kglob
order by kglobt24 desc)
where rownum <= 20

ADDRESS          NAME                 HASH_VALUE TYPE    Locked    Pinned  Executed  Nsp
---------------- -------------------- ---------- ------- ------ --------- --------- ----
0000000547522910 IGI_GEN_VERT          729966110 PACKAGE ######  10501405         0    1
0000000557C98F28 MO_GLOBAL            3732637996 PACKAGE ######   7413354     30388    1
000000054FF4ABA8 IGI_GEN_VERT         1541933742 PACKAGE   2617   5250401         0    2

000000054F798C58 MO_GLOBAL            3743427754 PACKAGE  13501   5069333  31299942    2

000000054B903E78 OE_MSG_PUB            247466199 PACKAGE ######   1338812      2692    1
000000054BBBFD70 OE_SYS_PARAMETERS    2653012139 PACKAGE ######   1132706     71580    1
000000054F381F10 OE_OE_FORM_LINE      3689477926 PACKAGE ######   1015546         0    1
000000054BE04E50 DBMS_UTILITY         1023687597 PACKAGE  12098   1011637         0    1
000000055773BBA0 SELECT PROCESS_ENABL 1617001630 CURSOR     326   1007630   1007478    0
000000055B047128 DBMS_UTILITY         1295727999 PACKAGE   3748   1003292  45896957    2

000000054B904C58 OE_DEBUG_PUB         2263617911 PACKAGE ######    958636         0    1
000000054B904A18 OE_SYS_PARAMETERS_PV 2314508377 PACKAGE    953    950874         0    1
00000005472CDE20 OE_SYS_PARAMETERS    2476839672 PACKAGE    953    950874   8680985    2

00000005472CD068 OE_SYS_PARAMETERS_PV 2626471498 PACKAGE    953    950874   8862791    2

00000005472C2DC8 OE_MSG_PUB           1210870744 PACKAGE    635    937875   3267991    2

I was interested to *PINNED* column, it tells high number of calls for IGI_GEN_VERT and MO_GLOBAL packages.By default, Oracle creates multiple copies of object based on CPU Core. But still this could lead to contention if you have highly concurrent users using same package. I came to know about undocumented dbms_shared_pool.markhot() procedure Markhot procedure creates multiple copies of object in shared pool which can eliminate the contention.

Since, it was not an production instance, I decided to give it a try by marking both packages as hot.

exec dbms_shared_pool.markhot('APPS','MO_GLOBAL',1);
exec dbms_shared_pool.markhot('APPS','MO_GLOBAL',2);
exec dbms_shared_pool.markhot(hash=>3732637996,NAMESPACE=>0);

exec dbms_shared_pool.markhot('APPS','MO_GLOBAL',1);
exec dbms_shared_pool.markhot('APPS','MO_GLOBAL',2);
exec dbms_shared_pool.markhot(hash=>3743427754,NAMESPACE=>0);

exec dbms_shared_pool.markhot('APPS','IGI_GEN_VERT',1);
exec dbms_shared_pool.markhot('APPS','IGI_GEN_VERT',2);
exec dbms_shared_pool.markhot(hash=>729966110,NAMESPACE=>0);

exec dbms_shared_pool.markhot('APPS','IGI_GEN_VERT',1);
exec dbms_shared_pool.markhot('APPS','IGI_GEN_VERT',2);
exec dbms_shared_pool.markhot(hash=>1541933742,NAMESPACE=>0);

Markhot procedure needs to be called for Package and Package body individually, and after that

Wolla .. mutex X waits were reduced to 3.3%..

Top 5 Timed Foreground Events
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
DB CPU                                           33,055          84.1
library cache: mutex X              209,094       1,307      6    3.3 Concurrenc
SQL*Net more data from client     5,516,878       1,220      0    3.1 Network
enq: UL - contention                     97         521   5367    1.3 Applicatio
log file sync                        72,433         514      7    1.3 Commit

For me- that was a lifesaver and it reduced the contetntion, Thanks again to Andrey S. Nikolaev

Blog at WordPress.com.