Showing posts with label physical datamodel. Show all posts
Showing posts with label physical datamodel. Show all posts

Saturday, 6 March 2010

multible blocksizes in an Oracle Database


In short: Not useful.

More elaborated: Messy, waste of memory-space and admin-effort.


Let me explain:

I've come across this discussion again and I considered myself lucky that Charles Hooper has done all the research already.

I'm summarizing his findings here:

Multiple Blocksizes do NOT offer Any Proven Advantage.

The theory about more efficient indexes and better managed cache is good, and I dont deny there is good reasoning behind it. But in practice, having multiple blocksizes and multiple db-nK-caches doesnt make a difference.

It is a waste of the extra (little bit of) work.

And most likely, you end up wasteing cache-space because you hard-divide the cache space into chunks and you prevent the Oracle LRU-mechanism from utilizing all available cache as it sees fit.

Of course, you may be the exception that proves the rule, and I would like to hear from you in that case, but until further notice, I'll stick with my motto:

Simple = Better.

Wednesday, 18 March 2009

Simple lessons from the DBMS SIG

Crisis or not, there was a reasonably good turnout on the last DBMS Special Interest group this Tuesday (17th March).

If you need to justify membership cost and time to attend SIGs like this, the one single item is probably Phil Davies from Oracle Support who talks you through the latest scary bugs. Some of those, I can almost recite by now, but I know full well how useful this horror-stories can be for project who are preparing to upgrade or migrate.

The XPLAN presetation by David Kurtz was instructive, as there are still a few options in dbms_xplan that I plan to explore when time or necessity arise.

For monitoring, the rapidly delivered presentation from Jason Lester was refreshing. We all know what Grid Control can do by now, but there is always need for more, for different and for more customizable tools. I do not need to be convinced of the advantages of Open Source Software (I did a similar topic myself in ... eeeh... 2004, for a different forum and I stil stand by that message).

Although I have seen Zabbix (http://www.zabbix.org/) used more often then Nagios (www.nagios.org), I would like to add my own endorsement for any OSS or otherwise "independent" monitoring tool. You need some independent tool, next to GridControl and other vendor supplied solutions (openview still sprouts up everywhere - sure we use it if someone has paid for it, or enforced it onto us.)

But quite often, for the actual, Real Application Monitoring we will also implement our own home-grown or OSS solution next to the GridControl. You generally need a tool where you can add/tweak your own monitoring, and that you can rely on to do (or build) the stuff that the commecial tools dont do, or dont want to do. And despite what the book, the courses, and the management say, any DBA or sysadmin worth his salt will DIY in this area. I am also still guilty of jotting up my own unix and sqlplus scripts in various situations. They will do Exactly what I need when I need it. Some of my unix/linux scripts have dates in them going back to 1994 (sad eh?).


Mental note: do a topic on "simple monitoring" sometime.


Pete Finnigan had his moment with explaining the uses and pitfalls of VPD (or is that RLS or FGAC? ... ). Always nice to hear PXF himself explain (the lack of) security in his down to earth way. He keeps going on about security being a "layered" thing, and how adding more security can also itself against you if you dont do it properly and exactly Right. Well done, Cheers Pete!

The cherry on the cake was Jonathan -scratchpad- Lewis explaining how Simple (and yet Complicated) the analysis of Statspack (aka AWR, if you have the budget) really is.
Jonathan did a Great job of letting statspack-reports explain themselves, and with constant "challenging" and checking of his own assumptions.

Key messages that I retained, somewhat biased by my own experience, were:

1. Read from the top, get a feeling for overall functionality and the load of the system in hand. How much time, how many CPUs, how much memory and IO was used. Quite basic.
2. Relate the work done, as shown in statspack, to the capacity of the underlying
system: was the database challenging the hardware or not ? Is it a capacity problem or a single-user, single-application problem.
3. Dont be afraid to ask questions. always.

Jonathan refuses to "write the book" on statspack, with the excuse that there will be too much material to cover, and he is afraid to leave out items that are deemed critically important. My reply to him is along the lines of: Real Application Statspackreading is about common sense.
And if Jonahtan can convey some good messages in a 1 hour presentation, surely
it must be possible to write a not-too-complex book to help the average
reader out there in 90% of the cases? For the other 10% you can always hire a specialist.

I become more and more tempted to write a few "simple" things about Statspack, the Real Fantastic CBO, and the blessings of proper physical design for Real Appliations. The book by Tom Kyte is all you really need (need link).
Hm, will I too then get sucked into the wonderful CBO? (Real Application CBO; it gets it right most of the time... )
My next presentation maybe.

Sunday, 6 May 2007

Index Organized Tables IOTs, the forgotten benefits

Check if your database can benefit from IOTs. You could be surprised.

Index Organized tables are possibly the most under used feature after the clustered table (now there is another underused item …). IOT’s also have some additional benefits that the manual seems to forget.

In brief: If you have tables with parent-child relations, or dependent tables where you often retrieve all related records at once, then you should read up on Index Organized Tables. Notice that this includes nearly every database out there.

Wake up call and Warning
This feature is available since 8i, and only now, recently, does it get a number of Bugs reported against itself (not in the last place because one of our major projects boldly went out using IOTs). Have those bugs been “dormant” all the time ? Check out the list of Expected fixes for 9.2.0.8 and see for yourself (Note:358776.1, when I checked it 28-Mar-06). 3 out of the 7 fixes mentioned under IOT are bugs found by us, and a bug-search for IOT turns up even more issues. The information and the dates on the bugs and fixes suggest that most of them have also been around in 10.x. Be Careful.
Despite all these issues, IOT remains a highly recommended feature.


First a recap.

Briefly described, the IOT stores all data of a table in the leaf-blocks of the PK-index. This means there is no table-segment anymore, only an index segment. You can verify this from user_segments.

On a well defined IOT, all records with the same parent key are grouped together in one single block or in a low number of adjacent blocks. This is an advantage on retrieval. All related data is generally retrieved in just one IO action. This speeds up retrieval (less IO is needed), but also makes more efficient use of the buffer-cache because the related records do not get dispersed over many blocks. The overhead of unused cache space is reduced. So far the known advantages.


Now the bonus features.

Bonus Feature #1: More data in the secondary index. As with normal tables, You can define other indexes on the IOT table. But those indexes now have an additional feature: The secondary index doesn’t contain a physical rowid, instead it contains the actual values of the PK fields. This means the 2ndary index is often ideal for index-only-lookup or just to join additional tables after retrieving only the index-blocks. There is no need to visit the table to fetch the PK fields. For those who used to inflate the 2ndary indexes to obtain
Index-Only-Lookups: Use of IOTs can automatically lead to more IOL.

A good example is an m:n relation where the linking table contains just two id’s. A conventional design would call for a heap table with two indexes. A total of 3 segments. An IOT would result in two segments: the PK and one index. Again, there is no table-segment. The PK-segment contains both columns in the leaf-blocks. And the 2ndary index contains the PK-values as an alternative to the ROWID. Any query always needs to search only one (index-) segment, and in each case, the required data will be grouped together in a single or low number of blocks, resulting in minimal IO and efficient use of buffer cache.

Bonus Feature #2: Reduced density, avoiding of hot-blocks. Because there is no separate PK-segment, and because data and index are together in one segment, the nr of records per block is lower. Lower Density.
Normally, the PKs are the most dense segments in a system, and therefore most likely to cause contention on DML transactions most likely to show up in gv$cache_transfer on a RAC system. On an IOT, the PK automatically contains fewer records per block because of the extra space required to store the “data” columns. If paired with a higher then default value of pctfree, an IOT can be used to reduce the impact of a “hot” index. This is especially useful in a RAC environment where hot blocks need to be transferred between instances.


Some of the downsides.

Inserts and updates are slower. This is partly a myth. In like for like comparisons, you may find the difference is not that big. When people complain about the slow loading of data into IOTs they often compare loading an IOT against a table with no indexes at all. If you can do without indexes, you can also do without IOTs. A more honest comparison would be to include the additional time required to create the indexes after loading, or to compare against a table with a PK already defined.
Another way to speed up loading is to use the “ordered” option.

A quirk: the physical guess.
The 2ndary index on a IOT contains a so called row-guess to allow fast access to the leaf block of the PK were the row was originally stored. These row-guesses can grow stale, and the index statistics reflect this in the pct_direct_access value. I have never been able to demonstrate the effect, let alone the benefit of this “row guess”.
A Challenge to investigative minds out there: Can someone prove me the usage and benefit of the row-guess. Ideally, a 10049 trace should be able to show the impact of the row-guess, and of “migrated” rows and stale row-guesses. (Update in 2009, With thanks to Barry Jones, who actually demonstrated 2ndary index deteriorated over time. Impressive).

Some Best Practices for IOT:

1) bad news first: check the bugs, even on 10.2.
Although we have reaped good benefits from IOTs on a multi-Terabyte database, we did stumble across a number of Nasty bugs. We had notable problems on the gathering of statistics (dbms_stats) and with the use of Function Based Indexes (FBIs). On IOTs you can use both dbms_stats and FBIs, but beware of the bugs. Metalink is your friend.

2) Determine a good pctfree.
Set this value higher then default. We tend to use 30 instead of the default 10. This will at least partly avoid the general problem of slower DML on IOTs. Your optimal value will vary, depending in the amount of updates and inserts you do.

3) create your own overflow segments.
If your table contains many columns or large records, consider defining an Overflow segment and use it to store the columns that you will need less-frequently. If you avoid creating overflow segments, you may find that oracle does this for you. You probably want control over this yourself to avoid surprises.

4) Keep blevel down, use partitioning.
As access to the table-data will always go via the PK, the table-access as shown in an explain-plan is actually another PK lookup (note that I’m very skeptical about the row-guess). Therefore, the blevel of the PK has an impact on the number of blocks that need visiting before a data-record is retrieved. By partitioning the IOT you can make sure that the blevel stays sufficiently low. IMHO, a blevel of 2 is generally achievable, 3 is acceptable, and 4 is too high.
Nb: monitoring blevel seems an often overlooked item, and it still applies to conventional indexes as well. But the impact of a high blevel on the PK of an IOT can be especially devastating.


Final words:

Start by reading up on the IOT feature, but then.Test, Verify, Tweak, and re-Test for your specific situation. IOTs are useful in a specific set of circumstances: Retrieval of related sets, avoidance of hot-spots, RAC scalability.
As always, Your Mileage May Vary.

Further reading:
- RTFM (a bit of a duh, but yep, those pdfs are very useful)
- metalink (same duh, but there is some really good info out there too, bugs and all)
- OTN : http://www.oracle.com/technology/products/oracle9i/daily/sept04.html