Early Monday morning, I was told our data warehouse loading process took much longer than before.
The application has been running fine for months, and there was no recent changes made on application, system, or database. The data volume is about the same as before.
AWR reports showed that the bottleneck was on system IO, looks like storage IO performance is much worse than before. So I asked the storage engineer the check storage battery status of StorageTek.
Yes, our batteres expired!
Vendor is delivering new batteries for us, replacing batteries should solve our issue.
A few years ago, when I first started as a DBA, I encountered similar issue on an OLTP database.
All of sudden, one database became very slow. I spent much more time, but still couldn't figure out what went wrong. Only after a few days, our storage engineer noticed the battery expiration in HDS.
After replacing new batteries, database performance became normal again.
Why expired batteries affect the system I/O?
Joshua Townsend has a very good article on storage basics.
In storage, there is some amount of cache RAM acting as a buffer to physical disks, I/O operations on cache is much faster than the disks.
In case of battery expiration, storage will disable the cache, all the I/O operations directly goes to the disks, which are much slower than cache RAM.
For details, please visit Joshua's blog:
http://vmtoday.com/2010/03/storage-basics-part-v-controllers-cache-and-coalescing/
Showing posts with label AWR. Show all posts
Showing posts with label AWR. Show all posts
Wednesday, 24 April 2013
Thursday, 6 December 2012
ORA-27300 and swap error
Recently I took over a HPUX running oracle 10g, when checking the alert log, I saw a few ORA-27300, ORA-27301, ORA-27301 errors, /var/log/messages also shows this:
Deferred swap reservation failure pid: 12345.
Apparently system doesn't have enough swap space for oracle.
$ swapinfo -am
Mb Mb Mb PCT START/ Mb
TYPE AVAIL USED FREE USED LIMIT RESERVE PRI NAME
dev 6656 195 6461 3% 0 - 1 /dev/vg00/lvol2
reserve - 3134 -3134
memory 4536 1109 3427 24%
swapinfo shows that system has 4.5G physical memory, swap space is 6.6G, a bit smaller but quite close to oracle recommanded 4.5X1.5G
further check found that system hit ORA-27300 daily between 18:00 and 20:00, there must be some job running during this period causing the excessive swap usage.
AWR report shows this sql suspicous:
SELECT table_a.column_1, table_a.column_2, table_a.column_3
BULK COLLECT INTO
col_1, col_2, col_3
FROM table_a, table_b, table_c
WHERE table_a.created_date >= to_date(:1,'dd/mm/yyyy:hh24:mi')
AND table_a.created_date <= to_date(:2,'dd/mm/yyyy:hh24:mi');
two issues with this sql:
1. cartition join table_b table_c, yet no columns from table_b and table_c are selected columns
the cartition join of table_b and table_c produces about 2 billion rows,
so if there is 1 row in table_a match the WHERE condition, sql will try to load the same row 2 billion times in memory.
2. use BULK COLLECT INTO directly
according to oracle, BULK COLLECT should always be used with LIMIT
It turns out that the sql is part of a procedure, which runs daily at 18:00.
Since the job has been failing for months, I just disabled it, informed development team to rectify it.
So far two weeks passed, both server and database are working fine, no more ORA-27300 or "Deferred swap reservation failure" appears.
reference:
http://www.oracle-base.com/articles/9i/bulk-binds-and-record-processing-9i.php
http://www.oracle.com/technetwork/issue-archive/2008/08-mar/o28plsql-095155.html
Deferred swap reservation failure pid: 12345.
Apparently system doesn't have enough swap space for oracle.
$ swapinfo -am
Mb Mb Mb PCT START/ Mb
TYPE AVAIL USED FREE USED LIMIT RESERVE PRI NAME
dev 6656 195 6461 3% 0 - 1 /dev/vg00/lvol2
reserve - 3134 -3134
memory 4536 1109 3427 24%
swapinfo shows that system has 4.5G physical memory, swap space is 6.6G, a bit smaller but quite close to oracle recommanded 4.5X1.5G
further check found that system hit ORA-27300 daily between 18:00 and 20:00, there must be some job running during this period causing the excessive swap usage.
AWR report shows this sql suspicous:
SELECT table_a.column_1, table_a.column_2, table_a.column_3
BULK COLLECT INTO
col_1, col_2, col_3
FROM table_a, table_b, table_c
WHERE table_a.created_date >= to_date(:1,'dd/mm/yyyy:hh24:mi')
AND table_a.created_date <= to_date(:2,'dd/mm/yyyy:hh24:mi');
two issues with this sql:
1. cartition join table_b table_c, yet no columns from table_b and table_c are selected columns
the cartition join of table_b and table_c produces about 2 billion rows,
so if there is 1 row in table_a match the WHERE condition, sql will try to load the same row 2 billion times in memory.
2. use BULK COLLECT INTO directly
according to oracle, BULK COLLECT should always be used with LIMIT
It turns out that the sql is part of a procedure, which runs daily at 18:00.
Since the job has been failing for months, I just disabled it, informed development team to rectify it.
So far two weeks passed, both server and database are working fine, no more ORA-27300 or "Deferred swap reservation failure" appears.
reference:
http://www.oracle-base.com/articles/9i/bulk-binds-and-record-processing-9i.php
http://www.oracle.com/technetwork/issue-archive/2008/08-mar/o28plsql-095155.html
Subscribe to:
Comments (Atom)
loading..