Tuesday, 11 December 2007

UKOUG 2007 Conference - the Lessons.

Finally, I can offload some of my UKOUG stuff.


The UK Oracle User Group... go say hi to them...First the Budget-justification (management is reading - sometimes). Like every year, I can justify visiting UKOUG as a training cost. I have learned more in four days UKOUG then I would have on just about any (Oracle) course and have saved my employer some money at that.


By taking the time to see other “users”, I have picked up some interesting new notions and ideas. Not to mention how a particular colleague has found a way to save a large, high-profile project some significant money. Money that a Large vendOR or some partner-reseller would gladly have pocketed.

In other words: information that would probably not have been spelled out clearly in a “commercial” course.


The hint is in Lesson 1:

Sometimes the best way of doing something is to … Not do it at all.

(going back, I think to Dave Ensor, possibly further).


Enough messages to management now, on with the real stuff.


CERN: Testing the limits.


The CERN guys were impressive. As with the CERN-physics, They are going places with the software were some of us may follow in a few years time. Others may never have to go that far, but we can all benefit from their experience.


What sets them apart is not only their pioneering and their IT-skills. They also have the urge, the liberty (and the duty?) to share their knowledge freely.


They had some interesting experiences with RAC, Clustering, Scaling, en software distribution. They notably had some gentle criticism on the deployment-process of the software. See the presentation on RPM-ing Oracle that was particularly in my line of reasoning.

An RPM or similar distribution, paired with a re-vamped OFA would make our favORite software much more manageable (Those who saw my rant presentation this year know why).


The input from CERN can be very valuable to many organizations (commercial and others) out there. They fact that they can present this at the conference is another demonstration of the useful role that UKOUG has in knowledge transfer.


CERN employees can be quite open and can share their experience more easily then commercial users with competitors listening. They can also more easily and openly criticize the vendor then commercial consultants are able to. Most “independent partners” both large and small, will always remain

“diplomatic” towards the dominant party, as they need the business and revenue that is benignly handed down to them (and can be taken away on bad-behavior).


Meanwhile, CERN is a highly visible, and hopefully “independent” user of the product. It is possibly one of the very few large customers to which oracle might sometimes listen.


Lesson 2: Criticism is Good.

Events like this, allow for some constructive criticism towards the vendor.
Criticism that would otherwise mostly get lost in the “commercial” spin processes. It is my own belief that this criticism can make the product more competitive. And ultimately, both the vendor and the user-community will benefit.


Other presentations: Many Good ones…


I cannot resist mentioning how Michael Doherty confirmed my opinion:

Keeping a system “Up” is never Simple (as we would like it to be simple…)

We can keep trying though.


Lesson 3: Simplicity is Always better!


There are many more memorable presentations I want to mention, but they will have to wait until the next installment. The good topics range from “disater recovery”, use of streams (aka logical standby), all the way to various RAC-internals and ASM. And of course the inevitable topics on optimization and CBO (which I keep attending, because of course we have been bitten by CBO, again).


Hm, tempted to insert a semi-funny CBO lessons here:

4a) CBO is never wrong, but your design/code/sys-statistics/seg-statistics/data/hints/outlines can be very Very VERY wrong…

But there is hope: According to Jonathan Lewis, commenting on the unpredictable behavior of CBO, notably on upgrades, and on the feeble attempts by mere mortals to use hints or otherwise control the beast:

4b) “Sometimes, you Can get lucky.”


Before pressing the publish-button, I specifically wanted to insert two “Thank You” items.


First a “Thank You” to Dough Burns for reviving OFA.


The old, rather boring workhorse of OFA was conceived by Cary Millsap and put to Good use in Oracle versions 6, 7 and 8. It even became part of the official oracle distribution. Dough has now made it his mission to re-educate the world about how sensible good OFA practices are.

It doesn’t seem to matter that Oracle still has OFA in as part of their manuals, and even as part of their default installations. This standards still seems to get both ignored and abused by many. Visit Dough for more.


And have a read of the presentation by Robyn Sands. Same problem, different angle. Interesting. As I have hinted in my own presentation, OFA can also do with a bit of good attention. Maybe we should make some suggestions to Oracle.


Finally, a humble “Thank You” to those who sat out my own disastrous presentation. I’m sure they had fun seeing me sweat when the equipment didnt work as expected. Plan A failed. Plan B failed.. Improvise!


As a final lesson, here is a more detailed account of what (nearly) went wrong. But I basically had to do the first 10 minutes without slides.

The ICC technician saved the day. He deserves a medal.


I think everybody picked up the last lessons there:

Do not go head-first with the latest + greatest versions. Or if you do: Test Properly, Test again, and be very, very prepared.

And to survive disasters: Prepare (have backups and standbys), prepare (have more copies), prepare (document), and prepare further (test)…

Friday, 16 November 2007

Databases and Systems, what kind of Relation.

Over the years, I’ve had several interesting discussion on the relationship between “databases” and “systems”. It comes down to this: Should there be many databases on a single system (m:1, the traditional approach), or should there be many systems underneath one database (1:m, grid)? And when is 1:1 appropriate?

Current hardware is powerful enough to allow “supernodes” that can run hundreds of databases. We are also confronted with Virtualization or “carve-up” of hardware by way of XEN, VMware, or vendor specific product that create domains or partitions.
With the current push towards “virtualization” of systems and the (in)capabilities of Oracle, it may just be worth to re-start some of the n:m discussion (did it ever go away?).

It is time to take a position on the many-to-many relationship between databases and systems.

This very story can also be found is actually the text version of my presentation "Databases Everywhere", and can be found from various UKOUG and other sites. (include a link...)

To summarize it for those architects and (account-)managers who are in a hurry:
I am in favor of 1:1 wherever possible, I support 1:m (RAC, grid) if really, really needed (but just please think about it one more time). RAC is a wonderful piece of technology that can serve many other vendors as a good example. It can be made to work. I will try to indicate under which conditions I think RAC can or cannot be applied.

Finally, I will only tolerate the old-fashioned m:1 for non-critical situations or on systems where there is some sort of risk-mitigation against interference between the (instances of) multiple databases.

After explaining these positions, I also have a list of recommendations for customers, providers and even for the Big-Oracle itself.

Those of you interested: read on. Possibly prove me wrong.
Others, keep browsing, the truth is out there, whichever version or vendor you want to see.

For the record: the bottom of the text contains links to all our major “partners”. Please keep the invites coming.

Intro, and some definitions.

On conventional systems, we generally find 1 or more databases running on a single system (*nix or even windows). For example, for concept-testing of a DG setup with cascaded standby, my laptop has run 5 databases simultaneously. Slow, but running. It just takes loads of memory and careful (memory-) parameter setting. It illustrates the capabilities of the Oracle database that the concept, once proven on a laptop could then be used to clone a 3TB production system (this time not on Windows). But the laptop Proof-of-Concept also illustrated an important issue when running multiple databases on a single system: Contention. The databases were visibly (and audibly from disk and fan) competing for IO and CPU resource.


NB: Can’t resist to put one in for Deb and Nigel: Let the world know that Deb got more memory into her laptop than Nigel, but then both David and I still have more memory in our laptops then the two of you together, so there ;-). Size matters.
Techies rule.
Back to serious business.



With the current possibilities for virtualization, you can take hardware and split it into many “systems” using VMware, OracleVM, any other Xen or vendor-specific tools for domains, lpars, etc..

Each resulting system can then be used to run instance(s) of 1 or more databases. When is this useful and how far should we take this?
With Oracle RAC (GRID, anyone?) you can take a database and distribute its instances over many system. When does this add benefits?

First a few quick definitions to delimit the playing field. Please not that the definitions are for the purpose of this argument only. They don’t pretend to be scientific or final. Or even to be correct.

System: a running *nix (or win*) instance containing a process list and an amount of addressable memory. On short: a running instance of an “operating system”.
A system can be Virtual if the hardware seen by the operating system is not identical to the actual underlying kit. This is the case when a larger system is split into “virtual” units by use of Xen, VMware or some vendor-specific layer of software or firmware. Some of those can also be modified dynamically (e.g.: Rolling). Some definitions and a good description can be found here:
http://encyclopedia.thefreedictionary.com/virtualization.

Database: an (Oracle) database, containing one system tablespace, and one user called SYS (I’m still trying to find the “essence” of an Oracle database, how about the sys.obj$ entries?). Note that the use of DataGuard can mystify the definition of a “database”, because each DG-clone can represent “the database”. The actual Single Point of Truth (SPoT) is where the current “primary instantiation” of the database resides.

Also note that my definitions do not include the binaries, or the ORACLE_HOME, as part of the database or the system. Indeed, systems and databases can be used in situations where the software needed to run them is “shared”. Most system only need a few “system specific” files in /var and /etc. I will always point out VMS as the ultimate mother of all clustered systems whereby files are shared between multiple nodes. But that requires a Clustered File System (CFS), and that opens a different discussion altogether. Suffice to say that a CFS is very suitable to ensure that all machines can be connected to the same, identical software and are guaranteed to run the same version of the binaries.

Now let me briefly elaborate on the different options.

Conventional deployment - many:1.

On conventional systems, we often see many databases running on a single (unix) instace.

The DBA can look after the databases, and the unix administrators have only one entity to watch. Any collision between databases will have to be handled by the DBA. Note that in most of these conventional cases all databases share the same software-tree (multi databases running from the same oracle_home).
These systems tend to have a relaxed SLA. The utilization varies and sometimes we see high percentages of CPU or IO bandwidth being consumed by a single database.

The main disadvantage of conventional systems is the fact that there is only one system, and all databases meet there. A problem with either a single database or with the system itself can quickly contaminate all databases on the system. And upgrades of system- or oracle-software lead to simultaneous outages of all databases on the system.

In case of system- or hardware failures many databases must be recovered simultaneously or re-started on one or more (other) systems (requiring some prioritization). The simultaneous recovery of many databases may lead to a brief period of overload and/or chaos an possibly a domino effect on other systems or components.

The advantages of an m:1 configuration are the simplicity of a “single system”, which is easy on the system-admin, and the often cheaper license structure when using per-system license.

Simple and Robust 1:1

On system with a high load or a stringent SLA, we tend to see 1:1 relationships: a single database, whereby the sole and single instance of the database runs in a single unix system. This is sometimes referred to as the monogamous configuration.
By having the whole system to itself, the database can benefit from all the resources available. There is no interaction (disturbance) from adjacent instances or other processes. Determining parameters is relatively simple.

However, as Oracle (-sales) will point out, this 1:1 configuration generally means the system is grossly under-utilized. It also means the database can still suffer from unix- or hardware failures.
But I like the simplicity of this configuration, I think this is the most “robust” solution and is applicable to the majority of databases. Whenever a problem occurs on one of the systems, only one database is affected, and recovery-efforts can be concentrated on a single database and application, reducing the risk of a domino-effect.

The 1:1 situation also lends itself very well to hardware-clustering or “cold failover” whereby a database is re-started on another system (node) in case the underlying system or hardware fails. Only one database needs to be re-started or recovered.

Since a 1:1 configuration requires many “systems”, it is attractive to use server virtualization. By running multiple “virtual” systems on a single piece of hardware, you can quickly create the required number of “separate” or isolated systems. When doing this, keep in mind that the underlying hardware remains the single point of failure. When one or more virtual system are meant to replace one-another in case of failure, they should preferably run on separated hardware.

Real Application Clusters, RAC - 1 : many

As a techie, I like the technology behind RAC. It is a wonderful thing to play with and I like the challenge to master this thoroughbred in real-life system. But I have to be careful not to be running a “solution looking for a problem”.

We tend to see RAC databases in organizations with formal and very stringent SLA’s and with the budget and the resources to try and meet these requirements.

OPS and RAC eliminate the SPoF of the “system” and deploy the database over multiple systems. Theoretically this works nicely and even provides dynamic provisioning of system resources (you can utilize all available kit, and you can add more kit as needed). In practice, many have pointed out the relative complex setup, the high price, and the other shortcomings of RAC (link to Miraculous, Famous Danish company).

We have indeed seen successful deployments of multi-instance database on some very large kit. In some cases, the impressive amount of hardware was able to hide badly designed application-code for quite some time. And by constantly distributing the (mainly CPU-) workload over the available unix systems, the coders got away with some appallingly inefficient constructions. Some of these cases have demonstrated the viability (and sometimes vulnerability) of RAC quite nicely, although a better design or implementation might have been cheaper (I prefer brains over iron, always!, in case of doubt: reduce the size of the hardware and tell the IT crowd to JFDI).
Note however, that it is important to let the hardware boys and vendors it their way a bit too. Riding in some extra hardware makes them happy, and is good for our relationship with these vendors. They might invite us on future projects.

And to pour further praise on the Oracle techies, TAF has saved our systems several times when a node got in trouble and died. Database-nodes die mostly through software errors, core dumps or memory leaks, and sometimes through human errors. The underlying hardware is rarely a problem, as these high-end-hardware systems are built to keep running run even if a salty ocean wave runs through the lower floors of the building : unintentionally (Kathryn, are you reading this?).
Please remark that even when you have RAC and TAF-capabilities, you still need to code your application to correctly trap and handle the failover-events. Otherwise, it only works on “idle” connections.

Note that most databases can (be made to) run on RAC, provided there are no obvious bottlenecks such as an ordered-no-cache sequence or some very hot blocks with running-totals. And those bottlenecks can generally be un-designed.
NB: Oracle currently seems to have the following position on RAC: if it doesn’t run (or scale) on RAC, your design is wrong.

Suffice to say that where Very Fast Failover (TAF or FCF/FAN) are needed, RAC has no equal. And for systems that have extreme hardware (CPU) requirements, the RAC scale-out model is also beneficial.

Question : Now what to choose ?

The classic answer: It Depends.
However, I will try to provide some guidance and some opinions.

The only factor that really matters is “The Business” (duh).
What does your business need and what can it afford. For simple or undemanding SLAs the traditional m:1 configuration is often sufficient and cost-effective. For businesses that have more stringent demands or for providers that risk being sued by their (business-)customers over a broken SLA, a 1:1 is advised, possibly with some cold-failover mechanism. And finally, if you really need the additional 10 minutes, or if you need the scale-out features, AND if you can afford the resources for testing, training and ongoing maintenance, a 1:m configuration (aka RAC or grid) can be your choice.

To figure out what your business needs (and can afford), you can either think for yourself, or you can give yourself and your department more credibility by engaging (principle-, business-) consultants to do cost-benefit analysis, risk-assessments or FEMA (that is: Failure Escalation Mode Analysis, not the other FEMA). They will especially stress the business-cost of downtime and any related loss of data/productivity/customers/orders/MegaWatts/ That is enough FUD for the moment. Back to more practical matters.

On the practical level, there are some factors that come into play. There is a) the preference (eeh: dictate) of your system administrators, your SAN engineers, your ASP, or your hosting provider. Then there is b) your commercial relationship with Oracle which will determine how high your licensing cost will turn out. But there may be others, such as c) the capabilities and preferences of your DBA. We will not even go into items like d) the availability of test-systems to prove and maintain your architecture, or e) the available rack-space.

The first important factor that often comes into play is the preference (or the pricing-policy) of the system-admin team or the Hosting-Organization (the ASP). Are they capable of handling many systems or do they prefer a low number of unixes ? Can they quickly build and clone systems for provisioning? What price do your ASP’s charge for additional systems ? This may determine your capability to run “multiple” systems.
Some organizations, by choice or by force, still get away with running just 1 large unix box with everything on it: HR, CRM, Logistics, and sometimes they even have their dev/test/uat environments on the very same box. Feasible, but with most of the drawback of a m:1 configuration.

The next factor is often License cost. How is your relation with Oracle, commercially? If you have to pay list-price, you will want to stick with “conventional”. Here, Oracle shoots itself in the foot: a lower price on RAC would speed up acceptance of the RAC and GRID model.
Machiavelli did suspect this was done to buy sufficient Beta-time to find all quirks, possibly to find a solution for instance-affinity and to give customers the time to come up with a solution for the friendly delivered “Your Design is Wrong” consultancy-audit outcomes.

And last but not least, what do your DBA’s prefer, and how trained and comfortable are they with RAC/grid? The traditional choice, m:1, despite its disadvantages (contention, domino-effects) is still the easiest to maintain for a DBA. Choosing a 1:1 configuration brings on a slightly higher workload, but has the advantage of more robustness and easier, isolated, troubleshooting since databases and systems do not affect one-another when trouble or maintenance occurs (yes, yes, someone must shout “utilization” now, thank you).

The choice for a RAC or grid configuration tends to create a significant overhead. We politely disagree with Oracle at this point that the new grid-control alleviates all problems.
And even if GC and its agents do try to take away a lot of the routine-tasks, Knowledge and Experience can never be completely replaced by a GUI. This aspect, the Human- or Operator dimension, tends to be the most under-estimated factor when (prematurely) implementing RAC/Grid.

Recommendations:

For customers and end-user businesses:
Move carefully from m:1 to 1:1. The 1:1 configuraion is at this moment arguably the most robust way to run a database. Consider using virtual systems to support a 1:1 deployment, but beware of the possible contention and SPoF on the underlying physical layers. Move on to 1:m (RAC) for cases with specific needs (failover or scale-out). Only use RAC in cases where you must, but then don’t hesitate to use it. It can (be made to) work, and it will work (eventually), and you may have to learn these tricks eventually. Start on the first valid occasion.

For ASP and hosting organizations:
Learn how to handle clusters, clones and virtual systems. These tools will give you an edge in flexibility. Then offer your customers the possibility to host many _identical_ systems at a rebate. Your customers will buy more as 1:1 and 1:m systems proliferate, and in the long run you will benefit. Hosting Companies and some vendor pricing-policies are the largest obstacle when moving from m:1 to 1:1 or even on to 1:m. Innovative customers will try to move to cheaper and more flexible platforms, and even lagging customers will eventually follow. If the hosting provider can quickly “provision” at acceptable cost, he can be seen as a partner in commoditization, rather then as an obstacle to flexibility.

For System-admins:
Learn how to handle a multitude of systems, learn how to keep them in sync, and how to clone or (re-)build systems quickly.
NB: for the addicts: investigate the use of Clustered File Systems (CFS).

For DBA’s and system-admins:
Aim to deploy databases and systems in a 1:1 fashion. The “isolation” of each database and system greatly facilitates admin- and troubleshooting activities.
Also get used to replicating or sharing software through OUI or other mechanisms.
And if possible, start to work with a CFS.
Sharing storage at the “filesystem” level can facilitate the juggling of multiple systems very much. Even NFS (supported, but not recommended is a usable alternative.
A CFS can offer great advantages by sharing files across nodes and this can simplify software-deployment and distribution. You will always need two copies of software for redundancy-purposes, but please think before making he 3rd, 4th or 42nd copy. Sharing is better the copying, especially at high numbers. It is easier to manage a small number of shared oracle_home trees then to have 42 or more copies that need to be rsynced or otherwise kept identical.

For Oracle:
It is appalling to find that a grid (aka an OPS database) was easier to build and maintain under VMS or Tru64 with Oracle 8174 then it is with the current 10g versions.
Please pursue the development of OCFS and facilitate shared-binary installs on OCFS and other CFS platforms. This will help proliferation of your GRID strategy, and will get you more market-share and revenue in the long run. The oracle-inventory mechanism and the configuration of agents tend to make life difficult for deployment of shared-binaries. Any viable grid should IMHO include the shared use of a software tree and not depend on endless replication of executables.

Shutdown (normal)

By adopting a grid-strategy, Oracle has greatly increased the options for its customers. And for those of you who don’t know it yet: find the Oracle Sponsored GRID-INDEX. Various hard- and software vendors have added to the palette of choices by implementing their own versions of Grid, clusters, or virtualization. All this new (eh, apologies: innovative but proven) technology can be put to good use, but only when making the correct choices.

We hope the preceding information can offer some help, and we would like to close down by adding one more item of advice: try to base you choices on simplicity

Relevant links

The usual brownie-point partner-links and marketing buzzwords on grid etc :

http://www.oracle.com/global/nl/corporate/press/200634.html
http://www.oracle.com/global/eu/pressroom/emeagridreport4.pdf
http://www.hp.com/techservers/grid/index.html
http://www.ibm.com
http://www.sun.com

More on server-virtualisation (start here!):
http://encyclopedia.thefreedictionary.com/virtualization

An introduction to Clustered File Systems:
http://www.beedub.com/clusterfs.html
http://www.oracle.com/technology/pub/articles/calish_filesys2.html

Some good, albeit biassed arguments for CFS can also be found here:
http://www.polyserver.com (look for articles by Kevin Closson).

A classic on RAC:
http://www.miracleas.dk/WritingsFromMogens/YouProbablyDontNeedRACUSVersion.pdf

Just for fun:
http://www.userfriendly.org
http://www.dilbert.com

Last bootnote for all Alliance Managers and other people in control of party-invites: take note of my cross-links to our “powerful partners”. And don’t worry, since you didn’t bother to read all of the text, neither did the real decision-makers, hence no damage is done. Oh, and please keep me on the invite-lists, I still appreciate good food and quality entertainment.

Now shoot me (over lunch?).

Wednesday, 24 October 2007

ILM is TNT. The Next TLA

First the positive words:

Information Lifecycle Management (ILM) is a great way to dig into data-management, partitioning, and serious considerations about your datamodel.
You should always do ILM, in one form or another.

However .... ,

The term is now being hijacked by the Marketing Departments of Storage- System-, Database-, application-, ETL-, and just about any other FUD-touting vendor.

Oracle is also promoting ILM. They even provide an ILM (gui-)tool for free.

Why?
Well, ILM relies on Partitioning, and Partitioning is ... licensed. Ka-Tching.

If you go down the ILM route, the usual disclaimers apply:
- use at own risk.
- RTFM (...) please.
- proceed with extreme caution.

But it will be a good learning experience and as a buzzword, ILM looks good on your CV.

Hence vendors in various layers of the stack (disk/storage, database, ETL, business-intelligence, compliance, architects, consultancy etc...) all try to sell ILM "solutions" (e.g. ILM-slower/older-disks, ILM-partitioning, ILM-extract-tools, ILM-methodologies, ILM-audits, ILM-patterns, ILM-best-practices, ...)


If you like bling and flash, and good entertainment, this is where you should be in the next months : Ai, El, and Em - ILM.

Quite fun to see:
The trainees, the D-division, and all other abitious employees are sent out to promote this TLA. Expect lots of seminars, catered-food and drinkies...

And some of it looks ok, credible even. Like I said: Every good datamodel should have some ILM in it. Its just that we used to call it "archiving" (=boring) or "cleansing" (sounds like washroom sanitation maintenance dunnit ?).

ILM is definitely the way to go. Unless you can spot TNT coming (The Next TLA) ...
I'm waiting for the bang.

Monday, 7 May 2007

sitemap of SimpleoOracleDBA

Here are the introductions to PdvFirstBlog and SimpleOracleDBA



Various Oracle Tips and Rants and Opinions, all to do with practicalities.

  • Simple Oracle DBA - Availability, Scalability, Manageability... It has to be Simple

  • Bitmap Indexes on an OLTP system - ORA-00060... Classic Deadlock

  • Backup on a different Level - in the SOA layer.

  • How To Pool Connections - One of the challenges

  • Databases Everywhwere - Systems and Databases, how many databases.

  • ILM, Information LifeCycle Management - another Three Letter Acronym.

  • Index Organized Tables - Many Benefits

  • txt - txt.

  • txt - txt.




  • Various Rants, all to do with practicalities.

  • In times of Crisis you call the DBA - Who you gonna call...

  • Rationalize TOAD - Toad will find a way...

  • M O O W - Miracle Oracle Open World...

  • My blog-plug for the Conference - UK Oracle USergroup...




  • Travel related stuff.

  • Angel of the North - My most visited leasure page...




  • The usual links to groups etc..



    My own linked in profile (shameless plug)


    my agenda (normally at the right hand side of the blog)




    UKOUG main url



    Miracle BV



    the home of Oracle-L : Best Mailing list



    link to Oracle Forum.

    Sunday, 6 May 2007

    Index Organized Tables IOTs, the forgotten benefits

    Check if your database can benefit from IOTs. You could be surprised.

    Index Organized tables are possibly the most under used feature after the clustered table (now there is another underused item …). IOT’s also have some additional benefits that the manual seems to forget.

    In brief: If you have tables with parent-child relations, or dependent tables where you often retrieve all related records at once, then you should read up on Index Organized Tables. Notice that this includes nearly every database out there.

    Wake up call and Warning
    This feature is available since 8i, and only now, recently, does it get a number of Bugs reported against itself (not in the last place because one of our major projects boldly went out using IOTs). Have those bugs been “dormant” all the time ? Check out the list of Expected fixes for 9.2.0.8 and see for yourself (Note:358776.1, when I checked it 28-Mar-06). 3 out of the 7 fixes mentioned under IOT are bugs found by us, and a bug-search for IOT turns up even more issues. The information and the dates on the bugs and fixes suggest that most of them have also been around in 10.x. Be Careful.
    Despite all these issues, IOT remains a highly recommended feature.


    First a recap.

    Briefly described, the IOT stores all data of a table in the leaf-blocks of the PK-index. This means there is no table-segment anymore, only an index segment. You can verify this from user_segments.

    On a well defined IOT, all records with the same parent key are grouped together in one single block or in a low number of adjacent blocks. This is an advantage on retrieval. All related data is generally retrieved in just one IO action. This speeds up retrieval (less IO is needed), but also makes more efficient use of the buffer-cache because the related records do not get dispersed over many blocks. The overhead of unused cache space is reduced. So far the known advantages.


    Now the bonus features.

    Bonus Feature #1: More data in the secondary index. As with normal tables, You can define other indexes on the IOT table. But those indexes now have an additional feature: The secondary index doesn’t contain a physical rowid, instead it contains the actual values of the PK fields. This means the 2ndary index is often ideal for index-only-lookup or just to join additional tables after retrieving only the index-blocks. There is no need to visit the table to fetch the PK fields. For those who used to inflate the 2ndary indexes to obtain
    Index-Only-Lookups: Use of IOTs can automatically lead to more IOL.

    A good example is an m:n relation where the linking table contains just two id’s. A conventional design would call for a heap table with two indexes. A total of 3 segments. An IOT would result in two segments: the PK and one index. Again, there is no table-segment. The PK-segment contains both columns in the leaf-blocks. And the 2ndary index contains the PK-values as an alternative to the ROWID. Any query always needs to search only one (index-) segment, and in each case, the required data will be grouped together in a single or low number of blocks, resulting in minimal IO and efficient use of buffer cache.

    Bonus Feature #2: Reduced density, avoiding of hot-blocks. Because there is no separate PK-segment, and because data and index are together in one segment, the nr of records per block is lower. Lower Density.
    Normally, the PKs are the most dense segments in a system, and therefore most likely to cause contention on DML transactions most likely to show up in gv$cache_transfer on a RAC system. On an IOT, the PK automatically contains fewer records per block because of the extra space required to store the “data” columns. If paired with a higher then default value of pctfree, an IOT can be used to reduce the impact of a “hot” index. This is especially useful in a RAC environment where hot blocks need to be transferred between instances.


    Some of the downsides.

    Inserts and updates are slower. This is partly a myth. In like for like comparisons, you may find the difference is not that big. When people complain about the slow loading of data into IOTs they often compare loading an IOT against a table with no indexes at all. If you can do without indexes, you can also do without IOTs. A more honest comparison would be to include the additional time required to create the indexes after loading, or to compare against a table with a PK already defined.
    Another way to speed up loading is to use the “ordered” option.

    A quirk: the physical guess.
    The 2ndary index on a IOT contains a so called row-guess to allow fast access to the leaf block of the PK were the row was originally stored. These row-guesses can grow stale, and the index statistics reflect this in the pct_direct_access value. I have never been able to demonstrate the effect, let alone the benefit of this “row guess”.
    A Challenge to investigative minds out there: Can someone prove me the usage and benefit of the row-guess. Ideally, a 10049 trace should be able to show the impact of the row-guess, and of “migrated” rows and stale row-guesses. (Update in 2009, With thanks to Barry Jones, who actually demonstrated 2ndary index deteriorated over time. Impressive).

    Some Best Practices for IOT:

    1) bad news first: check the bugs, even on 10.2.
    Although we have reaped good benefits from IOTs on a multi-Terabyte database, we did stumble across a number of Nasty bugs. We had notable problems on the gathering of statistics (dbms_stats) and with the use of Function Based Indexes (FBIs). On IOTs you can use both dbms_stats and FBIs, but beware of the bugs. Metalink is your friend.

    2) Determine a good pctfree.
    Set this value higher then default. We tend to use 30 instead of the default 10. This will at least partly avoid the general problem of slower DML on IOTs. Your optimal value will vary, depending in the amount of updates and inserts you do.

    3) create your own overflow segments.
    If your table contains many columns or large records, consider defining an Overflow segment and use it to store the columns that you will need less-frequently. If you avoid creating overflow segments, you may find that oracle does this for you. You probably want control over this yourself to avoid surprises.

    4) Keep blevel down, use partitioning.
    As access to the table-data will always go via the PK, the table-access as shown in an explain-plan is actually another PK lookup (note that I’m very skeptical about the row-guess). Therefore, the blevel of the PK has an impact on the number of blocks that need visiting before a data-record is retrieved. By partitioning the IOT you can make sure that the blevel stays sufficiently low. IMHO, a blevel of 2 is generally achievable, 3 is acceptable, and 4 is too high.
    Nb: monitoring blevel seems an often overlooked item, and it still applies to conventional indexes as well. But the impact of a high blevel on the PK of an IOT can be especially devastating.


    Final words:

    Start by reading up on the IOT feature, but then.Test, Verify, Tweak, and re-Test for your specific situation. IOTs are useful in a specific set of circumstances: Retrieval of related sets, avoidance of hot-spots, RAC scalability.
    As always, Your Mileage May Vary.

    Further reading:
    - RTFM (a bit of a duh, but yep, those pdfs are very useful)
    - metalink (same duh, but there is some really good info out there too, bugs and all)
    - OTN : http://www.oracle.com/technology/products/oracle9i/daily/sept04.html

    Friday, 4 May 2007

    Simple Oracle DBA

    Introduction ? Simple !

    This blog will try to keep life simple for Oracle DBA's.

    Because Simplicity is the Ultimate Sophistication (L. da Vinci)

    Because Complexity sells better (E.W. Dijkstra)

    Because life is complicated enough.

    Simple Oracle : Oxymoron ?




    This presenter had a Very Drastic way to keep the audience awake! have a look...Presentations

    Been there and done that, and got the presentations and papers to show for it.


    Below are the links to my most Favored Rants material. Most of my public presentations have grown from either client-pitches, project-blunders or "architectural" frustrations. Hence most of them are slightly missionary: I want to convince the audience of something. Something Simple.


    High Availability: How to keep it simple (ppt)
    Talk customers through the options for high-availability.
    Start with "What do you really need", going through "what can you afford", and then match that with the possibilities. Ever more Complex...
    When in doubt: Simplify!
    The link is to my presentation from the Miracle DBF in 2006, but the message still stands.


    Databases Everywhere (ppt)
    Everywhere indeed.
    What can we do to keep the systems running, and keep life simple for the DBA? DBA Two-dot-Oh even.


    Upgrade Nightmares (pdf)
    My down-to-earth vision on upgrades.
    And what you can do to keep life simple.


    Is my Backup Covered (pdf)
    It is about Recovery!
    Can you Recover? Can someone else do your recovery for you?
    The most boring subject, but surprisingly turned into my most asked-for ppt


    Backup, at a different level
    A paper/blogpost, aimed a SOA "architects".
    Rather then cover "backup" at the disk- or database level, you can use messaging systems (SOA, ESB, BEPL) to distribute your data over two or more systems.
    Redundancy at a higher level in the stack. Use at own Risk!


    Index Organized Tables, and all the benefits.
    Blogpost/paper, aimed at developers/implementors and ... DBAs.
    I have driven many to despair with a strong belief in the efficiency of IOTs and other seemingly excotic table- and index definitions to store data physically.


    And if you feel like presenting at a usergroup, but dont know what about:
    Remember your last big frustration or problem ?
    Try turning that into a presentation. Surprisingly, that worked for me.