Saturday, 23 May 2009

Single Table Hash Clusters: Size Matters.

OK, so we know that in certain situations, with the right design, the right size-assumption, and the correct usage, a Single Table Hash Cluster seens to be the most efficient way to get to a row, using a single block get.


I'll have to ask Richard Foote for confirmation on "the fastest way" as I still think that Index Organized Tables are very efficient indeed, and have less downsides. But since IOTs are B-tree indexes, they will result in 2 or more block-gets when the blevel increases.
IOTs are an old hobby of me, but they also nearly ended my career once. Complicated story. Some of you know it. Better not divulge.



I now want to know what happens if we get outside the "comfort zone" of an STHC. What if the well-intended design-assumptions about the number and the size of the records grow wrong.
In both cases, I expect the hash-mechanism to become less efficient.

Caveat: I can only test a limited number of cases, and will focus on the overflow-case. The case where too-many records end up in one block is harder to prove, as it will probably involve hash-collisions and extra CPU effort to retrieve a record. I'm lazy, and will focus on stuff I can prove with just simple queries and the odd autotrace.


Recap

Our initial STHC was created as a table with 64K records and an average recordsize of aporox 75 bytes. The original script to create it is still here (demo_sthc.sql). It was developed on 10.2.0.3 (winXP) with an 8k blocksize.

To play with sizing, and get a feeling for the storage-efficiency, I'll first do some sizing-reports to compare STHC to other options. I'm not worried about disk-storage (disk is cheap, innit?). But blocks in memory still take up valuable cache, and the reading, getting and scanning of blocks is still "work". Effort that can be avoided. Hence I always assume small (and Simple) is beautiful.


To view sizes of objects, I use an auxiliary script segsize

In the original demo, the segements for heap, iot and cluster had the following sizes:

TABLE_NAME    NUM_ROWS   BLOCKS empblck avgspc chncnt AVG_ROW_LEN COMPRES
---------- ----------- -------- ------- ------ ------ ----------- -------
HEAP 65,535 720 47 855 0 77 DISABLED
IOTB 65,535 0 79 DISABLED
CLUT 65,535 1,014 9 2,886 0 78 DISABLED


SEGMENT_NAME KBYTES BLOCKS EXTENTS
-------------------- --------- -------- -------
HEAP 6144 768 21
HEAP_PK 4096 512 19
------- --------- -------- -------
Total for HEAP 10240 1,280 40


IOTB_PK 11264 1,408 26
-------- --------- -------- -------
Total for IOT 11264 1,408 26


CLU 8192 1,024 23
CLUT_PK 6144 768 21
-------- --------- -------- -------
Total for Cluster 14336 1,792 44




The cluster I created for my first demo seems relatively large and inefficient compared to the heap-table with the same data, as it takes up roughly 30% more space. That was probably because I didnt care for the size- and hashkey parameters yet.


All indexes are relatively large compared to their tables, due to the long and rather funny VC2 field that is the PK.

If we replace the key-values with, say, 'abcd'||to_char(rownum) or a sequenced number(n) field instead of the spelled-out numbers, all segments and notably the indexes turn out much smaller. But my original "Business Case" has long-ish strings of format AAAAnnnnnnnnn in a VC2 field as the PK (and search-criteria). GUID-strings of 64 hex-chars stored in a VC2 would be a comparable case.
And now I'm stuck with this demo...

btw: bad practice to store numbers in VC2 if you already _know_ they will be numbers, please dont.




Setting the SIZE

My initial errors were to not specify SIZE and to specify the wrong nr of HASHKEYS, supposedly also leading to hash-collisions. I'm a bit surprised nobody spotted it. But I guess there are not that many readers eh ?

I had already created the cluster with HASHKEYS 100, 200, 500, 1200 and 2000 and I still did not specify size (it is not mandatory, and I'm lazeeee).

Go on, run the script (demo_sthc_hashkeys.sql) for yourself
(At your own risk! Please dont sue me if it F***s Cleans up your data...)

SQL > @http://www.hotlinkfiles.com/files/2562031_70t6l/demo_sthc_hashkeys

(you did not have a table called CLU01 out there did you ?)

From this, I found that Oracle allocated the storage (in extents of 64KB) such that the nr of blocks in the cluster was just higher then the nr of specified hashkeys. It seems to assume that one hashkey per blocks will be required
(Question: Is this default behaviour when you omit the SIZE parameter? Consistent over all versions? Does this hold for all blocksizes? I dont know yet...).

I now decided to do it by-the-book and specified SIZE and HASHKEYS. And my next script (demo_sthc_75x64k.sql) would, I hoped, create a proper cluster ready to hold 64K records of 75 bytes each:

Oh come on, run it! (at your own risk etc...)

SQL > @http://www.hotlinkfiles.com/files/2562028_kvses/demo_sthc_75x64k 


For those of you too lazy or too scared, part of it looks like this:
CREATE CLUSTER clu ( key varchar2(78 byte) ) 
SINGLE TABLE
SIZE 75
HASHKEYS 65536;

CREATE TABLE clut (
key VARCHAR2(78 BYTE),
num_key NUMBER,
date_modified DATE,
padding VARCHAR2(300 BYTE)
, constraint clut_pk primary key ( key)
)
CLUSTER clu ( key );



And the allocated storage for the resulting (still-empty) cluster is ....:



SEGMENT_NAME KBYTES BLOCKS EXTENTS
-------------------- --------- -------- -------
CLU 5120 640 20
CLUT_PK 64 8 1
--------- --------- -------- -------
Total 5184 648 21



That 5 MB (in my case) is surprisingly close to that of the original heap-table (6 MB in my case).
But also surprisingly close to 65535 x 75 = 4.8 MB.

Will this pre-created cluster-segment contain sufficient free-space?
And will it be able to cater for contingencies ?

Let me insert the data...

(if you had run the script at your own SQL-prompt, you would know all the answers by now...)
SYSTEM @ DB10  > insert into clut select * from heap ;

65535 rows created.

SYSTEM @ DB10 > commit ;

Commit complete.

SYSTEM @ DB10 >
SYSTEM @ DB10 > analyze table clut compute statistics ;

Table analyzed.

SYSTEM @ DB10 > set echo off

SEGMENT_NAME KB BLOCKS
-------------------- ---------- --------
CLU 8192 1,024
CLUT_PK 6144 768




Heeeey.
After I inserted the same 64K rows, the cluster has increased in size to the same 8M size of my previously created cluster.

Next time, I might as well create my STHC with a bit of contingency. And for further experimenting I will aim to create this particular cluster with SIZE 100 and HASHKEYS 80000 to end up with the correct size straight away (demo_sthc_100x80k.sql):

One more time I will try to tempt you to run one of my scritps...
(at your own risk, of course...)

SQL > @http://www.hotlinkfiles.com/files/2562061_v4rto/demo_sthc_100x80k



This cluster is now finally created at the right size (space allocated) and if you ran the script (did you? on 8k blocks?) you can verify that it absorbed all the rows without growing.
Supposedly, every row now really IS accessible by a single block get, and hash-collisions should be avoided.


I did verify a few selects, similar to the ones in the previous episode and I did find the 1(one) consistent get again - At least that still worked ...


Lessons from this so far:
- Pay attention to SIZE and HASHKEYS, and dont run with defaults here.
- Even properly(?) specified parameters dont seem to suffice. Always Verify!
- And (minor issue but...) my STHC still takes up more space then my heap-table.

PS: I will not be surprised or offended if someone comes out and says:
That is all wrong, and it is very much different on this other system... Please do!
Such is life with Oracle.
And we may be able to learn something still.


In the next experiment, I will try to overfill the STCH with data to see if any problems arise from that.


It takes some bloody long train-journeys, and a significant follow-up to investigate all this obscure stuff. And all the searching leaves you with a feeling you know even less.
Arcane stuff that nobody uses anyway.







Oh, btw, For those of you who want to set up a so-called mission-critical system where my demo-scripts would "accidentally" delete the data owned by your boss:
You Do Run this stuff AT YOUR OWN RISK. Period.


3 comments:

Martin said...

Thanks for the second instalment Piet. Disc is cheap huh? Tell that to anyone who has fulled their controller and needs a second one...

I always like to avoid wasted space, for the reasons you say. however, I'd be happy for a STHC to be twice the size of the heap or IOT because...I hope to do only one consistent get, not two or three, per value selected. So, unless I am constantly fetching all of the data in the table and thus have cached the whole thing in my buffer cache, I think I would gain rather than lose with a "bigger" STHC.

I need to do my own tests but I am about to set up a new V11 test server "soon", I'll try your scripts there and my thoughts and let you know.

Radoslav said...

HASHKEYS is rounded to the nearest greater prime number. 65535, is not a prime number..

SIZE is amount of data you expect to be associated with key, not size of key. - Oracle computes from this number how much keys can be stored in single block.

Tom Kyte's Database Architecture book is a great resource.

sap erp 6.0 said...

Single table hash clusters are explained in this post. The definition is described very well. It also gives you idea of the exact size of the cluster. It also tells you why size matters s much. I am very impressed with your work.