
Saturday, 6 March 2010

Multiple blocksizes in an Oracle Database


In short: Not useful.

More elaborately: Messy, and a waste of memory-space and admin-effort.


Let me explain:

I've come across this discussion again, and I consider myself lucky that Charles Hooper has already done all the research.

I'm summarizing his findings here:

Multiple Blocksizes do NOT offer Any Proven Advantage.

The theory about more efficient indexes and a better-managed cache is good, and I don't deny there is sound reasoning behind it. But in practice, having multiple blocksizes and multiple db-nK-caches doesn't make a difference.

The extra (little bit of) work it takes is simply wasted.

And most likely you end up wasting cache-space, because you hard-divide the cache into fixed chunks and prevent the Oracle LRU-mechanism from utilizing all available cache as it sees fit.
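
To make it concrete, this is the kind of setup I am arguing against (a sketch; the sizes, names and file path are made up):

-- carve a fixed 512M out of the SGA for 16K buffers only
SQL> alter system set db_16k_cache_size = 512M scope=both;

-- a tablespace with a non-default blocksize, tied to that cache
SQL> create tablespace ts_idx_16k
     datafile '/u01/oradata/orcl/ts_idx_16k_01.dbf' size 1G
     blocksize 16K;

That 512M is now hard-carved out: even when the 16K segments sit idle, the default cache cannot touch it.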

Of course, you may be the exception that proves the rule, and I would like to hear from you in that case, but until further notice, I'll stick with my motto:

Simple = Better.

Friday, 30 October 2009

Oracle Performance Tuning with hit ratios - Risky.

The Ratio, as drawn by Leonardo da Vinci.

While he was sitting in front of a fireplace somewhere in New England, Martin Widlake prompted this quicky-post.


First my Confession: I still use "Ratios" in assessing and fixing performance.

I know current wisdom is that ratios and aggregated numbers are not good enough and should not be used in tuning. I agree that ratios can be abused, and can be hugely deceptive.

But when guesstimating the scalability or performance of an OLTP database-application, I use these three numbers:

Buffer-gets per row, 10 or less.
Buffer-gets per execute, 100 or less.
Buffer-gets per transaction, 1000 or less.

Martin will immediately see why, IMHO, his result of 380 for the 1st or 2nd ratio was too high.

I picked those numbers for several, Simple, reasons.

Firstly, these numbers are simple to remember.

Secondly, these numbers are simple to determine. You can get them from the v$-views, from traces, from AWR (licensed), or from statspack (free!).

Finally, these numbers (10, 100, 1000) have generally served me well in places where I looked into heavily loaded OLTP systems. I don't recall a single problem with any statement or transaction that managed to stay within those limits.
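
For the curious: the first two ratios roll straight out of v$sqlarea, and a rough system-wide version of the third out of v$sysstat. A sketch, assuming you have select-rights on the v$-views:

-- buffer-gets per execute and per row, worst offenders first
SQL> select sql_id,
            executions,
            round(buffer_gets / executions)                gets_per_exec,
            round(buffer_gets / nullif(rows_processed, 0)) gets_per_row
       from v$sqlarea
      where executions > 100
      order by gets_per_exec desc;

-- rough, system-wide buffer-gets per transaction
SQL> select round(
              (select value from v$sysstat where name = 'session logical reads')
            / nullif((select value from v$sysstat
                       where name = 'user commits'), 0)) gets_per_txn
       from dual;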


Now be careful. Your situation will be different. And you should not strive for these numbers per se, but rather for happy customers. Whether these numbers really are appropriate for you depends on the number of calls you make, the type of work your users and systems do, the capacity of your hardware, and the demands placed on your system. Some systems operate comfortably with one zero added to each number.

But if you need millisecond response times, or need to serve thousands of requests per minute, then my advice is: strive for those numbers for every statement and transaction on the critical path.

Any query or transaction that consistently breaks one of those ratios should be looked into.

And if your system doesn't meet the numbers, there are three fixes:
You either Eliminate the statements (just don't do it),
Optimize the offending statements (improve the where-clause or the indexes), or
Contain those statements (run them less frequently, and not at peak-load periods).




A further word of warning on the general danger of "Ratios" is appropriate.

Phi, or the Golden Ratio of 1.6180... Maybe we should look for some relevant way to use that, just for fun.

We have had a clear demonstration, by Connor McDonald I think it was, that hit-ratios are not a good indicator of "performance". Connor was able to "set" the hit-ratio to whatever his manager wanted, with very little impact on actual performance [link?].

Other performance specialists, notably Cary Millsap, have since proven beyond doubt that ratios and other "aggregate" metrics are not good indicators of performance.

Back in the nineties, my colleagues and I used scripts based on numerous books (from the nineties...) to determine all sorts of (hit)ratios for cache-size, shared-pool, redo and other settings. The ratios were derived by high-tech scripts we crafted, based on the books we had bought. And from the nicely formatted SQL*Plus output we determined that we needed more memory, a larger shared-pool, more redo-buffers or larger redo-files.

In very few cases did those ratios point us to the real problem and in even fewer cases did the advice make any difference.

In almost every case, the real solution came from bstat/estat, tkprof, Statspack, AWR or ASH. And most of the time, we were brought straight to the heaviest query in v$sql.

That query was then either made more efficient (fewer buffer-gets per execute) or eliminated (just don't do it). In 90% of cases that solved the problem (and sometimes improved a few ratios as a by-product).
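
If you want to try that at home, something along these lines will do (a sketch):

-- the usual suspects: top-5 statements by total buffer gets
SQL> select *
       from ( select sql_id, buffer_gets, executions,
                     substr(sql_text, 1, 60) sql_text
                from v$sql
               order by buffer_gets desc )
      where rownum <= 5;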


My Conclusions:
Most Ratios are indeed of little use - they do not address the real problem.
But Some ratios can help you, if you know what you are doing (and if you know what your customer is trying to do).

With or without Ratios, you still need to find out what your system does, and use common sense to fix it.


I'll save a few other examples of Ratios for later, notably one I use to find locking problems. I think Martin can already guess which one.

Wednesday, 18 March 2009

Simple lessons from the DBMS SIG

Crisis or not, there was a reasonably good turnout at the last DBMS Special Interest Group this Tuesday (17th March).

If you need to justify the membership cost and the time to attend SIGs like this, the one single item is probably Phil Davies from Oracle Support, who talks you through the latest scary bugs. Some of those I can almost recite by now, but I know full well how useful these horror-stories can be for projects that are preparing to upgrade or migrate.

The XPLAN presentation by David Kurtz was instructive, as there are still a few options in dbms_xplan that I plan to explore when time or necessity arises.
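
One example of the kind of option I mean (a sketch; the table name is just a stand-in):

-- run the statement with runtime statistics collected...
SQL> select /*+ gather_plan_statistics */ count(*) from my_table;

-- ...then ask for the actual plan, with actual rowcounts next to the estimates
SQL> select * from table(
       dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));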

For monitoring, the rapidly delivered presentation by Jason Lester was refreshing. We all know what Grid Control can do by now, but there is always a need for more, for different, and for more customizable tools. I do not need to be convinced of the advantages of Open Source Software (I did a similar topic myself in ... eeeh... 2004, for a different forum, and I still stand by that message).

Although I have seen Zabbix (http://www.zabbix.org/) used more often than Nagios (www.nagios.org), I would like to add my own endorsement for any OSS or otherwise "independent" monitoring tool. You need some independent tool next to Grid Control and other vendor-supplied solutions (OpenView still sprouts up everywhere - sure, we use it if someone has paid for it, or enforced it onto us).

But quite often, for the actual, Real Application Monitoring, we will also implement our own home-grown or OSS solution next to Grid Control. You generally need a tool where you can add and tweak your own monitoring, and that you can rely on to do (or build) the stuff that the commercial tools don't do, or don't want to do. And despite what the books, the courses, and the management say, any DBA or sysadmin worth his salt will DIY in this area. I am still guilty of jotting down my own unix and sqlplus scripts in various situations. They do Exactly what I need, when I need it. Some of my unix/linux scripts have dates in them going back to 1994 (sad, eh?).
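
As an example, one of those home-grown checks, roughly as I would jot it down today (a sketch; the 90% threshold is arbitrary):

-- flag tablespaces that are over 90% used
SQL> select df.tablespace_name,
            round(100 * (1 - nvl(fs.free_bytes, 0) / df.bytes)) pct_used
       from ( select tablespace_name, sum(bytes) bytes
                from dba_data_files group by tablespace_name ) df
       left join
            ( select tablespace_name, sum(bytes) free_bytes
                from dba_free_space group by tablespace_name ) fs
         on df.tablespace_name = fs.tablespace_name
      where 100 * (1 - nvl(fs.free_bytes, 0) / df.bytes) > 90
      order by pct_used desc;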


Mental note: do a topic on "simple monitoring" sometime.


Pete Finnigan had his moment explaining the uses and pitfalls of VPD (or is that RLS, or FGAC? ...). Always nice to hear PXF himself explain (the lack of) security in his down-to-earth way. He keeps going on about security being a "layered" thing, and how adding more security can itself turn against you if you don't do it properly and exactly Right. Well done, cheers Pete!
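
For those who have never touched VPD: the core of it is surprisingly small. A minimal sketch (all names hypothetical), where users only get to see their own rows:

-- the policy function returns a predicate that Oracle appends
-- to every query against the protected table
SQL> create or replace function f_own_rows (
       p_schema in varchar2, p_object in varchar2 ) return varchar2
     as
     begin
       return 'owner_id = sys_context(''userenv'', ''session_user'')';
     end;
     /

-- attach the function to the table as an RLS policy
SQL> begin
       dbms_rls.add_policy(
         object_schema   => 'APP',
         object_name     => 'ORDERS',
         policy_name     => 'ORDERS_OWN_ROWS',
         policy_function => 'F_OWN_ROWS',
         statement_types => 'SELECT' );
     end;
     /

And that, of course, is exactly where Pete's "layers" warning bites: the policy only protects what it is attached to, and only for the statement-types you list.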

The cherry on the cake was Jonathan -scratchpad- Lewis explaining how Simple (and yet Complicated) the analysis of Statspack (aka AWR, if you have the budget) really is.
Jonathan did a Great job of letting statspack-reports explain themselves, with constant "challenging" and checking of his own assumptions.

Key messages that I retained, somewhat biased by my own experience, were:

1. Read from the top; get a feeling for the overall functionality and the load of the system at hand. How much time, how many CPUs, how much memory and IO was used. Quite basic.
2. Relate the work done, as shown in statspack, to the capacity of the underlying system: was the database challenging the hardware or not? Is it a capacity problem, or a single-user, single-application problem?
3. Don't be afraid to ask questions. Always.
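
(The raw material is simple enough to produce, by the way. A sketch, assuming statspack is installed:)

-- take a snapshot before and after the interval of interest
SQL> exec statspack.snap;

-- then generate the report between two snap-ids
SQL> @?/rdbms/admin/spreport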

Jonathan refuses to "write the book" on statspack, with the excuse that there would be too much material to cover, and he is afraid to leave out items that are deemed critically important. My reply to him is along the lines of: Real Application Statspack-reading is about common sense.
And if Jonathan can convey some good messages in a 1-hour presentation, surely it must be possible to write a not-too-complex book that helps the average reader out there in 90% of the cases? For the other 10% you can always hire a specialist.

I become more and more tempted to write a few "simple" things about Statspack, the Real Fantastic CBO, and the blessings of proper physical design for Real Applications. The book by Tom Kyte is all you really need (need link).
Hm, will I too then get sucked into the wonderful CBO? (Real Application CBO; it gets it right most of the time... )
My next presentation maybe.

Friday, 6 March 2009

deadlocks: get rid of the bitmap indexes

This one is for a certain Peter, he is a DBA at one of my customers.

We had a problem: Deadlocks happening. Bitmap index.

Users: Complaining, deadlock errors ... ???
we: That bitmap index is causing the deadlocks.
Duhvelopers: But we NEED that index for job xyz, and others Might use it too.
we: Sure, but that index is causing your deadlocks
Users: still complaining!


Yes, we know there is more to Deadlocks, but this was the Bitmap!
Peter is the DBA who found Bug 6175584.
And in the lengthy process of convincing Metalink to see that bug, he learned more about ORA-00060 than you will ever need - until version 11.2 comes along, at least.



Anyway, this bitmap index...

SQL> alter index <owner>.<index_name> monitoring usage;

This showed the index was indeed being used. But by whom?
Only by the one intended job, or also by other, more user-relevant queries?
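
The monitoring result itself turns up in v$object_usage (run this as the index owner; a sketch):

-- YES in the USED column only tells you the index was touched,
-- not by whom, or how often
SQL> select index_name, used, start_monitoring
       from v$object_usage
      where index_name like '%STATUS_BMP';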

Peter, who is the local onsite-DBA where I am only a passing consultant, finally managed to convince the "architects" and other title-bearers that the index should go.

It went, and with it went the deadlock-error.
General Performance also felt better.

Apparently nobody but the batch job was using that index.
Or if they did, it was not helping them.
Problem Solved.

Then Anjo Kolk, over a nice meal, mentioned he had seen something similar, and had it fixed while on his hands-free carphone.
And I should have realized how simple it is to "prove" this case to the Architects.

A Query like this:
SQL> select sql_id, object_name
from v$sql_plan
where object_name like '%STATUS_BMP';

Or a similar query against DBA_HIST_SQL_PLAN (the WRH$ data in the AWR) will show you immediately which statements have used a particular index!
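
The AWR variant, for cursors that have already aged out of the shared pool (a sketch; mind the Diagnostics Pack license):

SQL> select distinct sql_id, object_name
       from dba_hist_sql_plan
      where object_name like '%STATUS_BMP';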

Peter is off to check a few other indexes straight away.
Kudos to Anjo for pointing it out. Sometimes a Simple solution is right under your nose. But it takes a nice "incident" to learn even simple lessons.

As for the developers and architects who concocted it all: no hard feelings. After all, the Funtastic manual says bitmaps are for low-cardinality columns. And at least their tests had been really fast using "this bitmap thing".
They had no bad intentions, and where would we be without their complex solutions and indexes, eh?