Thursday, December 20, 2012

How to find total IO of the database instance

Total IO of database instance is sum of the physical reads, physical writes and redo writes. There are several views to find these values.
v$sysmetric - Reports metric values for only the most current time sample 60 secs.
v$sysmetric_summary - Reports metric values for time sample of 1 hour.
v$sysmetric_history - Reports metric values every 60 sec from the time instance is up. Better way to analyse IO using this view to take deltas between two time periods.
dba_hist_sysmetric_history - All the above views are refreshed when the instance is restarted. This view, part of AWR, stores the historical stats. I have used this view for my report.

Query:
======


set lines 350 pages 50 feedback off
set markup html on
alter session set nls_date_format='dd-mm-yyyy hh24:mi';
col Phys_IO_Tot_MBps_AVG for 999999999999
col Phys_IO_Tot_MBps_MAX for 999999999999
col Phys_IOPS_Tot_AVG for 999999999999
col Phys_IOPS_Tot_MAX for 999999999999
col Host_CPU_util_AVG for 999999999999
col Host_CPU_util_MAX for 999999999999

spool IO_stats.html
--Prompt 'CPU below 40% is not reported'
--Prompt 'Total MBPS below 20MB is not reported'
--Prompt 'Total number IO per sec below 500 is not reported'
select 'Report Run date',sysdate from dual;
select min(begin_time) "Start Time",
snap_id,
sum(case metric_name when 'Physical Read Total Bytes Per Sec' then average end)/1024/1024 +
sum(case metric_name when 'Physical Write Total Bytes Per Sec' then average end)/1024/1024 +
sum(case metric_name when 'Redo Generated Per Sec' then average end)/1024/1024 Phys_IO_Tot_MBps_AVG,
sum(case metric_name when 'Physical Read Total Bytes Per Sec' then maxval end)/1024/1024 +
sum(case metric_name when 'Physical Write Total Bytes Per Sec' then maxval end)/1024/1024 +
sum(case metric_name when 'Redo Generated Per Sec' then maxval end)/1024/1024 Phys_IO_Tot_MBps_MAX,
sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then average end) +
sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then average end) +
sum(case metric_name when 'Redo Writes Per Sec' then average end) Phys_IOPS_Tot_AVG,
sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then maxval end) +
sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then maxval end) +
sum(case metric_name when 'Redo Writes Per Sec' then maxval end) Phys_IOPS_Tot_MAX,
sum(case metric_name when 'Host CPU Utilization (%)' then average end) Host_CPU_util_AVG,
sum(case metric_name when 'Host CPU Utilization (%)' then maxval end) Host_CPU_util_MAX
from dba_hist_sysmetric_summary

--having sum(case metric_name when 'Host CPU Utilization (%)' then average end) >= 40
--and
--sum(case metric_name when 'Physical Read Total Bytes Per Sec' then average end)/1024/1024 +
--sum(case metric_name when 'Physical Write Total Bytes Per Sec' then average end)/1024/1024 +
--sum(case metric_name when 'Redo Generated Per Sec' then average end)/1024/1024 >= 20
--and
--sum(case metric_name when 'Physical Read Total IO Requests Per Sec' then average end) +
--sum(case metric_name when 'Physical Write Total IO Requests Per Sec' then average end) +
--sum(case metric_name when 'Redo Writes Per Sec' then average end) >= 500
group by snap_id
order by snap_id;
spool off


Output: (in HTML)
==============


Start Time     SNAP_ID  PHYS_IO_TOT_MBPS_AVG  PHYS_IO_TOT_MBPS_MAX  PHYS_IOPS_TOT_AVG  PHYS_IOPS_TOT_MAX  HOST_CPU_UTIL_AVG  HOST_CPU_UTIL_MAX 
01-10-2012 01:00  36623       7       69      154    1267       9       18 
01-10-2012 01:59  36624       5       46      173    1622       8       27 
01-10-2012 02:59  36625       7       67      155    1324       9       29 
01-10-2012 03:59  36626       3       58      41    620       6       14 
01-10-2012 05:00  36627       19      119      270    2074       13      45 
01-10-2012 06:00  36628       9       102      201    1650       10      30 
01-10-2012 06:59  36629       10      96      227    2080       11      22 
01-10-2012 07:59  36630       11      126      132    1314       13      34 
01-10-2012 09:00  36631       11      110      243    2685       18      34 
01-10-2012 10:00  36632       5       81      87    654       30      71

........

Analysis of output
==============

Definitions:
IOPS  :
The standard unit of measurement for I/O operations per second. Should include all reads and writes.
Mbytes/s: Mega”Bytes” per sec
*****_AVG columns should be read to calculate the total value (can be daily/weekly) for a column.
*****_MAX columns show us the peak usage of a particular column. This can be used to check the busy timings.

You can also uncomment the commented lines to suppress the inactive periods, hence useful for management reporting. You can also play around with "where begin_time like '%-10-2012%'" to focus on time.

Getting more out the script:The same query can be used for IO designing. We can use the value of PHYS_IO_TOT_MBPS_MAX and PHYS_IOPS_TOT_MAX columns.
Also there an another metric "I/O Megabytes per Second" which servers same purpose as PHYS_IO_TOT_MBPS_AVG. But I decided to use PHYS_IO_TOT_MBPS_AVG instead. The values of both the metrics might not match, as all metrics are sampled - they are digital observations - snapshots, sampled "as of a point in time". And they are not sample atomically - we don't get all metrics at the same precise exact point in time.
There is also a interesting Column STANDARD_DEVIATION, a statistical term. If the difference between STANDARD_DEVIATION and AVERAGE is large, means that the data is highly scattered. In my case the STANDARD_DEVIATION valued reached 1 million sometimes. This tells me to avoid taking AVERAGE values for designing.
----

1 comment:

  1. Hi,
    I was looking at your article. Thank you for very informative piece.

    In your article, you added 3 metrics; { physical read total + physical write total + redo } to arrive at combined averages and maximums. However, another public article explicitly mentions that physical total metrics already include redo metrics as well. For your reference, I am including the link below:

    http://flashdba.com/2014/02/26/oracle-awr-reports-understanding-io-statistics/

    "[Note: There was another section here detailing how to find and include the I/O generated by redo into the totals, but after consultation with guru and legend Tanel Poder it's come to my attention that this is incorrect. In fact, reads and writes to redo logs are included in the physical read/write total statistics...]"

    I would be curious know if you have any comment about the validity of explicit addition of redo to totals to obtain combined totals.

    thanks

    ReplyDelete