Oracle Tablespace Monitoring

I carefully monitor Oracle Tablespace utilization because I don’t buy into the space is cheap mantra. It’s especially important when we are dealing with SAN/FC/FLASH based storage.  If you can effectively find the root cause of the vanishing space in an Oracle Tablespace you can save a good chunk of money for your organization and make your own life much easier by having fewer requests for new LUNs from the storage team.

Back in 2014 I was working for a company that operated a very large internet site and our space consumption was at the rate of 800GB every month, we were adding 4x400GB LUNs to our RAC cluster every two months. Then, one month we almost doubled our space consumption rate for no apparent reason. I decided that it was time to find the root cause of the issue and developed a series of scripts that helped us find the root cause in no time. I’ll show you how it works so you can adopt it in your workflow and DB monitoring.

The two scripts are:

  1. segs2: Fast Extending Segments in the last X minutes for given TS
  2. segs3: Fast Extending Segments Since Last Datafile Creation for given TS

These scripts use Oracle AWR, specifically dba_hist_seg_stat and dba_hist_seg_stat_obj so it’s important that you have a recent AWR snapshot to work with – you can easily create one using the following command:

exec dbms_workload_repository.create_snapshot();

Be careful – don’t add a call to dbms_workload_repository.create_snapshot into the actual scripts and then run it over and over again – doing so can put a strain on a busy database.

Once you have a fresh AWR Snapshot created it’s very easy to use the scripts – simply login to the Oracle DB in question via sqlplus and call the script:

For example to find which segments grew in the last 60 minutes in a tablespace DATA run segs2.sql as follows:

wget https://s3.amazonaws.com/mve-shared/segs2.sql
sqlplus / as sysdba
@segs2.sql 60 DATA

or if you are interested in what segments grew in a tablespace DATA since the last datafile was added run segs3.sql as follows:

wget https://s3.amazonaws.com/mve-shared/segs3.sql
sqlplus / as sysdba
@segs3.sql DATA

I am considering incorporating the above two scripts into the new Oracle Monitoring Framework I am developing, so if you think this is valuable – sign up for the EvenTorEx Mailing List to start the discussion and get the private beta as soon as it’s ready (no SPAM here I guarantee it!):

EvenTorEx Mailing List

December 16, 2015

Posted In: Operations, Scripts

Tags: , , ,

Oracle ASH Monitoring

I have over 60 Oracle diagnostic SQL scripts in my arsenal.  A lot of them hit ASH and AWR.  I’ve tested these scripts on a high traffic web-site backed by an 6 node Oracle RAC Cluster.  Everyone of these scripts saved my day at one point, but there is one script that stands out from them all – Oracle ASH Top Waits Script: h1.sql.

Oracle ASH Top Waits

This little gem is the #1 thing I reach for when I am asked to troubleshoot a performance problem that was reported hours ago. For example lets say it’s 9:30am and I get a call from a dev saying that her APEX web application is hanging – she suspects a locking issue. All I need to know at this point is when the problem was first reported – armed with the start time (lets say 7:00am) I simply do this:

sqlplus / as sysdba
@h1 0700 0930 0 -1 -1

What’s happening here? There are 5 parameters:

1: start HHMI     [0700 = 7:00am]
2: end HHMI       [0930 = 9:30am]
3: days back      [0 = today; or 7 = seven days back]
4: instance       [1 = INST_ID=1, give -1 for all]
5: service_hash   [1225919510 = dba_services.name_hash, give -1 for all]

and here’s what I get back:

Oracle ASH Top Waits Script output sample

Can you spot the problem? It’s “SQL*Net message from dblink”, there are no locks – it’s a simple problem of a badly written distributed query that is waiting on remote DB. That was easy! One other hidden benefit of this script is that it saves it’s output in a table under a RUN_ID (in this case RUN_ID=81) allowing you to compare the output of two RUN_IDs and clearly spot the differences in values grouped by WAIT EVENT. This is extremely valuable especially when someone says – “this used to work last week!”, you simply do this:

sqlplus / as sysdba
@h1 0700 0930 7 -1 -1

The “7” in the third parameter instructs the script to look 7 days back for the same time frame (7-9:30am). The output of the above report will have it’s own RUN_ID=82 (next in sequence) and you can now compare the two using a special script h1d.sql like so:

@h1d 81 82

We didn’t need OEM or any GUI apps to get to the bottom of the problem – all because the diagnostics data is already in AWR tables and is available to us directly from command line / sqlplus. Years ago – we’d have to sample v$session_wait to get similar diagnostics, in fact, I wrote a complete monitoring system that utilized such technique. But now, Oracle built this into the core engine in a form of Active Session History (ASH) that automatically samples this data every second with practically no overhead to the database! That is an incredible level of instrumentation available to us and it would be a shame not to utilize it beyond what the OEM reports are capable of.

Note however that ASH sampling is short lived – only 1/10th of it’s sampling is saved in AWR based on some internal thresholds Oracle came up with. Now imagine that we:

  1. Take a script like h1.sql and refactor it to use ASH instead of AWR because ASH samples are at higher fidelity (AWR only get’s 1/10th of ASH data).
  2. Run this script every 3 minutes to capture the live heartbeat of the database.
  3. Define your own thresholds on top of this sampling and get notified if something is amiss.
  4. Save all this data for historical purposes so that even if AWR gets wiped out you have solid performance metrics years later.
  5. Wrap this all up in an easy to deploy (single binary) distribution which only takes a minute to install on a new host.

Are you interested? Does this sound like something you’d like in your shop? If yes – then I’d like to get some feedback from you because I am building it! Sign up for the EvenTorEx Mailing List to start the discussion and get the private beta as soon as it’s ready (no SPAM here I guarantee it!):

EvenTorEx Mailing List

December 2, 2015

Posted In: Operations, Scripts

Tags: , ,

09′ IOUG Collaborate – notes for Oracle DBAs

I just reviewed Tom Kyte’s presentation from the 09′ IOUG Collaborate event and made a list of things that are specifically interesting to an Oracle DBA with the links for further research into each feature.

I. Those managing PHP/Oracle shops will be happy to learn that there’s a new Database Resident Connection Pool [DRCP] that performs as well as dedicated server (it really is) with benefits of shared server model.

II. Are you sill using TOAD? Perhaps it’s time to give Oracle Sql Developer a shoot because it’s offering a host of Sql/PLSQL Tuning Features built right in:

June 10, 2009

Posted In: Tuning

Tags: , , , , , ,