Jagjeet's Oracle Blog!

May 13, 2012

Sqlplus Dashboard For RAC

I have been passionate to work with CLI due to flexibility, lightweight and easy access. Long back, I had an idea to write a sqlplus script to provide a dashboard kind of utility but could never figure out the auto refresh functionality until unless Tanel & Adrian provided moats.

I got the Idea and solution of auto refresh functionality in sqlplus. Big thanks to great Tanel & Adrian. Majority of my time I spent doing bench-marking on Exadata, so I decided to write a sqlplus dashboard kind of utility for RAC, idea was to get some critical metrics for performance benchmarks on one screen and have a automatic refresh capability ( similar EM “TOP Activity” page )

It took almost 3 weeks for me to write this utility from scratch, It tells you info about Exadata smart scan/storage index saving/Flash cache/ CPU usage/TOP sqls/waits/Cluster interconnect traffic/Top segments by GC* waits and a lot more info. which can be useful to monitor system performance. Based on current formatting, It can support up to 8 node cluster information easily.
I tried my best to make this utility lightweight by reading data from gv$ views once and store in plsql/collections for subsequent requests, Could not find session in EM top activity page running this utility. As of today, it will only work with 11g RAC instances or single instance with gv$* views [catclust.sql].

here is video link… video quality isn’t that great. Must be seen in HD mode.

I have tried to explain about metrics below.


############################################
-- Below section shows metrics instance wise
############################################


+Inst-------+CPUIDL%--IO%-USR%--SYS%+--Tprse/s--+Hprse/s+--PhyWIO/s-+-PhyWMB/s-+--PhyRIO/s--+-PhyRMB/s-+-SessLIO/s--+---Exec/s-+RedoMB/s+Commit/s+-ExSSMB/s-+-ExSIMB/s+-ExFCRh/s+
|AIA1       | 67.8     0  31.2    .7|       2714|      0|       2623|        21|        1183|         9|      370001|      8471|      11|    2001|         0|        0|      980|
|AIA2       | 68.2     0  31.1    .6|       2709|      0|       2491|        22|         928|         7|      340040|      8609|      11|    2004|         0|        0|      703|
+-----------+-----------------------+-----------+-------+-----------+----------+------------+----------+------------+----------+--------+--------+----------+---------+---------+


+Inst-------+ <-- Instance name     +CPUIDL%--IO%-USR%--SYS%+ 
|AIA1       |                       | 67.8     0  31.2    .7| -- This sections shows CPU usage IDLE% IO%, USER% SYS% 
|AIA2       |                       | 68.2     0  31.1    .6| -- For Exadata, we may not see IO Usage as Grid disks exposed to ASM only 
+-----------+                       +-----------------------+




+--Tprse/s--+Hprse/s+--PhyWIO/s-+-PhyWMB/s-+--PhyRIO/s--+-PhyRMB/s-+-SessLIO/s--+
|       2714|      0|       2623|        21|        1183|         9|      370001|
|       2709|      0|       2491|        22|         928|         7|      340040|
+-----------+-------+-----------+----------+------------+----------+------------+

--  Tprse/s   : Total parse per second
--  Hprse/s   : Hard parse per second
--  PhyWIO/s  : Physical Write IO per second
--  PhyWMB/s  : Physical Write MB per second                                                         
--  PhyRIO/s  : Physical Read IO per second								
--  PhyRMB/s  : Physical Read MB per second								
--  SessLIO/s : Session Logical IO per second								
		


+---Exec/s-+RedoMB/s+Commit/s+
|      8471|      11|    2001|
|      8609|      11|    2004|
+----------+--------+--------+

-- Exec/s   : Total Executions for the Instance per second
-- RedoMB/s : Redo Generation MB per second 
-- Commit/s : User commit per second






-- Exadata Related Stats
+-ExSSMB/s-+-ExSIMB/s+-ExFCRh/s+
|         0|        0|      980|
|         0|        0|      703|
+----------+---------+---------+

-- ExSSMB/s  : cell physical IO interconnect bytes returned by smart scan MB per second 
-- ExSIMB/s  : cell physical IO bytes saved by storage index MB per second
-- ExFCRh/s  : cell flash cache read hits per second 

--     Note : Above snapshot was taken for OLTP kind of workload, hence no 
--            smartscan/storage index was in picture, only Flash Cache

 
#########################################
-- Below section shows Cluster Level info
#########################################

+IMPACT%-+--TOP WAIT EVENTS-----------------------+-WAIT CLASS--------+ 
|  51.2% | transaction                            | Other             | 
|  28.2% | enq: TX - index contention             | Concurrency       | 
|   8.7% | gc buffer busy acquire                 | Cluster           | 
|     5% | cell single block physical read        | User I/O          | 
|   2.2% | gc current grant busy                  | Cluster           | 
+--------+----------------------------------------+-------------------+ 

 Above sections shows TOP 5 wait events for the last sample [ default 6 second ] 	

-- IMPACT%    :  Impact was calulcated based on AWR formula, Time spent on individual wait event / Total Wait Time * 100
-- WAIT EVENT :  Name of the Wait Event 
-- WAIT CLASS :  Name of the Wait Class

  
 * * Cluster Level info * *
+IMPACT%-+ TOP SQLS -(child)-+-TOP SESSIONS-------INST:SID----------------------------------------+
|  26.5% | bk2mk4dh179y7(2)  | 1:1043, 1:1174, 1:1239, 1:1304, ......                             |
|  17.6% | bk2mk4dh179y7(0)  | 1:1045, 1:1109, 1:1110, 1:1174, .......                            |
|  11.8% | 7jjp97nb9h2up(1)  | 2:1048, 2:1109, 2:1112, 2:1174,                                    |
|   7.5% | 0r5xv5d42p3p6(2)  | 1:653, 1:979, 2:1047, 2:1501, 2:588, 2:68  .....                   |
|   7.3% | apgu0yan9pb6m(5)  | 1:653, 1:979, 2:1047, 2:1501, 2:588, 2:68  ......                  |
+--------+-------------------+--------------------------------------------------------------------+

 Above sections shows TOP 5 SQLs for the last sample [ default 6 second ] 	

-- IMPACT%    :  Impact was calculated based on AWR formula, Time spent on individual sql / Total Wait Time * 100
-- TOP SQL - (child) : Sql id along with child info
-- INST:SID   :  Instance ID and SID for the session running sql, due to space limitation, only few sessions will shown






   * * Cluster Level info * *

           -------+                                 ACTIVE SESSIONS GRAPH                              +----
     Active   375 |                                                                                    | 375
   Sessions   349 |                                                                                    | 349
              322 |                            #                    #                                  | 322
              295 |                            #   # ##       #   # #                                  | 295
              268 |                            #   # ##       #   # #                  #          #_288| 268
              242 |                            #   # ###  #   #   # #                 ### #       #    | 242
              215 |                            #   # ###  #   #   # #                 #####       #    | 215
              188 |                            #   # ###  #   #   # #                 #####  #    #    | 188
              161 |                           ##################  #########     ##    #####  ######    | 161
              134 |                           #############################     #### ##############    | 134
              109 |                           ##############################    ###################    | 109
     	       81 |                           ##############################  #####################    | 81
               24 |             -MAX CPUs [24]-####################################################    | 24
                1 |                           #####################################################    | 1
                0 +------------------------------------------------------------------------------------+ 0
		  ^ 01:42:22                     01:49:02 ^                             01:55:42 ^



-- Above sections shows total active sessions across the cluster, time window calculated based on sample time [6 second default ]
-- total 80 samples are printed in graph, one line represents one sample time. Active Session range will change automatically
-- based on active sessions and graph will adjust accordingly. Idea was to have some trend about the active session history.
-- MAX CPU will also be shown to have some idea about Active sessions Vs CPUs available.

-- Note : if value falls in between 2 range buckets then it will be shown in front of lower value, for example - in above screenshot 
--        current active sessions were 288 and it was displayed in front of 268, same applies to MAX CPU info as well
 

   * * Instance level info * *

    |  Global  |  Global  | Estd.     | 
    |  Cache   |  Cache   | Intercnt  | 
Inst|  Blocks  |  Blocks  | Traffic   | 
  ID|  Sent/s  |  Rcvd/s  | MB/s      | 
+---+----------+----------+-----------+ 
  1 |     10831|     10188|      173.3| 
  2 |     10650|     10586|        175| 
                                        

-- Above sections shows Global Cache info instance wise

-- InstID : Instance ID no.
-- GC Blocks Sents/s :  Global cache blocks sent by instance per second
-- GC Blocks Rcvd/s  :  Global cache blocks received by instance per second
-- Estd. Interconnect Traffic MB :  Estimated traffic MB for interconnect by instance





 * * Cluster Level info * *
                             
 | TOP Segments by GC*     |    
 | Waits                   |    
 | IMPACT% [Type:Segment]  |   
 +-------------------------+    
 | 40.8% IND:REFERENCE_INSTANCE 
 | 23.3% IND:MEDIATOR_INSTANCE  
 | 17.4% IND:REFERENCE_INSTANCE 
 |  5.8% IND:S_DOCK_TXN_LOG_P1  
    4.2% IND:COMPOSITE_INSTANCE                                                                         

-- TOP Segments based on all Global Cache* events, Impact is calculated based on GC* waits on individual
-- Segment / Total GC* waits * 100





  * * Cluster level info * * 

+--SqlID--------+--SqlText----------------------------------------------------------------------------+-LongstDur-+-InstCnt-+-Cnt-+--CPU%--+--CONC%--+--CLUS%--+-IO%-+---ReadMb-+
| 0r5xv5d42p3p6 | INSERT INTO REFERENCE_INSTANCE (ID, BINDING_TYPE, REFERENCE_NAME, UPDATED_TIME, PRO |           |    2    |  12 |      0 |      0  |      0  |    0|         0|
+---------------+-------------------------------------------------------------------------------------+-----------+---------+-----+--------+---------+---------+-----+----------+
| bk2mk4dh179y7 | select VALUE from XREF_DATA  where XREF_COLUMN_NAME = :1  and XREF_TABLE_NAME = :2  | 00:00:09  |    2    |  46 |    59% |     3%  |      0  |   4%|       227|
+---------------+-------------------------------------------------------------------------------------+-----------+---------+-----+--------+---------+---------+-----+----------+
| 7jjp97nb9h2up | INSERT INTO COMPOSITE_INSTANCE (ID, COMPOSITE_DN, CONVERSATION_ID, UPDATED_BY, PARE |           |    2    |  14 |      0 |      0  |      0  |    0|         0|
+---------------+-------------------------------------------------------------------------------------|-----------+---------+-----+--------+---------+---------+-----+----------+
| apgu0yan9pb6m | insert into  XREF_DATA  ( XREF_TABLE_NAME, XREF_COLUMN_NAME, ROW_NUMBER, VALUE, IS  |           |    2    |  14 |      0 |      0  |      0  |    0|         0|
+---------------+-------------------------------------------------------------------------------------|-----------+---------+-----+--------+---------+---------+-----+----------+

-- Above sections shows cummulative info for the TOP Sqls 

-- LongstDur : if multiple sessions are running same sql, this column would tell timing for the first query, longest duration.
-- InstCnt   : Total no. of Instance running same sql at time of sample [ including parallel and non-parallel query ]
-- Cnt       : Total sessions running same query across all the instances [ including parallel and non-parallel ] 

-- CPU%  : %time spent on CPU
-- CONC% : %time spent on Concurrency
-- CLU%  : %time spent on Cluster waits
-- IO%   : %time spent on IO

--ReadMB : Total physical read mb by the query, cummulative for all running queries across the cluster

-- Note: Most of the columns are fetched from gV$sql_Monitor, will only be populated if query running more than 6 seconds on CPU/IO


Code can be downloaded here.

Please do not forget to share your suggestions/feedback. Thanks –

Note: This utility uses ASH and v$sql_monitor which requires diagnostics and tuning pack !

Advertisements

Create a free website or blog at WordPress.com.