Friday, 1 August 2025

Oracle23ai and python - How to fix (avoid) a MERGE statement.

TL;DR: Still using Oracle and python, and loving the combination. But some Annoying High-Frequency MERGE statements from several python programs caused too many RoundTrips... Trying to fix that.

And can I just say: RoundTrips are Evil !

Image: Vaguely related, we have lots of Traffic, and we are trying to MERGE...


Background: need to Eliminate RoundTrips.

For a number of python programs we have logic that checks on "existing data". One example is to find a certain SOURCE. The table holding the sources looks like this (simplified):
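
Roughly like this (a simplified sketch; names and details are assumed for illustration, the real DDL is in tst_merge_src.sql, linked at the bottom):

create table sources
( id        number generated always as identity  primary key
, src_name  varchar2(128)  not null
, constraint sources_uk  unique ( src_name )
);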

If a source-record, by whatever SRC_NAME, exists, we want to get the ID, and if the source does not exist, we need to create it and return its newly assigned ID. 

Perfect case for a MERGE-statement, right?

Except that we do many source-checks per second, and each of those stmnts becomes a RoundTrip. These merge-stmnts were 2nd on our high-frequency-list. And since we eliminated the single-record-inserts, they are now Top-of-Problem-List.


The Problem (simplified): 

The function will check for existence of a source-record. The MERGE statement was the most obvious choice here. Our MERGE statement looked like this (somewhat simplified):
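
Something along these lines (a sketch; table- and column-names are assumed, and because a plain MERGE gives no easy way to hand the ID back to the client, one way to keep it to a single RoundTrip is a small anonymous PL/SQL block that merges and then fetches the ID into an out-bind):

merge_src_sql = """
begin
  merge into sources s
  using ( select :src_name as src_name from dual ) n
  on ( s.src_name = n.src_name )
  when not matched then
    insert ( src_name )
    values ( n.src_name );

  select id into :src_id from sources where src_name = :src_name;
end;"""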

The MERGE stmnt is quite standard: check for existence of a given name, and if it does not exist: create it. Then return the ID.

Also notice the (global) variable src_dict = {}: it will hold the pairs of SRC_NAME and ID as we find them.

The function that uses this MERGE statement gets called at very high frequency in several places. And in the original situation, every call causes a round-trip to the RDBMS.

The original function (simplified) looks like this:

Image will pop out if you click it..
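
In sketch-form (re-using the merge_src_sql string from above; exact names and error-handling in the real code will differ), the function does roughly this:

def get_src_id_rdbms ( con, src_name ):
    # con is an open python-oracledb connection.
    # Look up (or create) the source in the RDBMS: one RoundTrip per call.
    cur = con.cursor ()
    src_id = cur.var ( int )          # out-bind to catch the returned ID
    cur.execute ( merge_src_sql, { "src_name": src_name, "src_id": src_id } )
    cur.close ()
    return src_id.getvalue ()         # commit is left to the caller, once per batch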

Quite straightforward: create a cursor, assign the bind-variables, execute the statement, and catch the returned ID. Job Done. Except that this would sometimes run at 100s per sec. and show millions of executes per hour in AWR.

Note that we effectively check for parent-records or LOV-records before an insert of a detail-record. I can imagine other constructs, such as triggers or a PL/SQL function to contain this logic. But here we are...

Note Also: On the topic of the MERGE-stmnt, allow me a sidestep to this blog by Oren Nakdimon about a concurrency-issue with MERGE (link), but that is out of scope for me at this point. Maybe Later.


The Possible Solution:

Thinking back to my previous "Fix for Roundtrips" (link), some form of local buffering or a "local cache" would seem appropriate. But there were some issues:

- Uniqueness: Any new record, e.g. a newly-found SRC_NAME, should Ideally be "claimed" with a new ID in the RDBMS Immediately, to prevent other systems from assigning different IDs to the same source-name.

- Timeliness: A local buffer would _always_ be out of date, especially when multiple running jobs were likely to discover the same or similar sources in the same timespan. Ideally, the local buffer would always be up-to-date, or kept in sync, with the Database.

In short: The Truth is In the Database, the Single Point of Truth (SPOT, classic problem of copy-to-cache...).

- And preferably no "Slurp" of all data: A local buffer could potentially be large, but not every program-run needs all the records. Most programs would only need a small set of the data, typically 10-20 source-records (but they do millions of unnecessary merge-check-retrieve for that small set of SRC_NAMEs). A pro-active "Slurp" of a large set of LOV-data would not be desirable.

One of the "lucky" aspects of our merge-problem was that the SOURCE-data, for this process, was insert/lookup-only. Any updates (comments, modifications, or even combining of sources) would happen elsewhere. The "worker programs" just needed to fetch an ID, or create an ID where none existed yet. 

But any new ID would have to be "immediately" stored into the RDBMS to have it available for others.


The concept-solution.

With some thought, the following pseudo-code came to mind (some architects will call this a "design pattern"):

The image will pop out if you click it..
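
Paraphrased, the idea came down to this:

get_src_id ( src_name ):
    if src_name is already in the local dict:
        return the ID from the dict              # no RoundTrip needed
    else:
        merge + fetch the ID from the RDBMS      # one RoundTrip, claims the ID
        store ( src_name, ID ) in the dict       # so the next call stays local
        return the ID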

The comments speak for themselves. I chose a python structure called a DICT, which I think is similar to an associative array in PL/SQL.

Note that at this point of writing, I do not yet know if that choice was "optimal", but it seemed to work just fine in our cases. Again something to investigate Later...


Let's put it to Code.

This idea was relatively easy to code. A link to complete and hopefully runnable setup- and test-scripts is at the end of the blog.

The new function First checks if the SRC_NAME is present in a DICT, and if not, it calls the "old function" to check against the RDBMS. Then it returns the ID.

It looks like this:
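
Simplified (re-using the RDBMS-function sketched above; the real code in tst_merge_src.py has a bit more):

src_dict = {}      # global cache: pairs of SRC_NAME -> ID

def get_src_id ( con, src_name ):
    # Check the local cache first; only go to the RDBMS for names we have not seen yet.
    if src_name not in src_dict:
        src_dict [ src_name ] = get_src_id_rdbms ( con, src_name )
    return src_dict [ src_name ]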

That was Easy enough (In practice there is a bit more, but out of scope). And it also seemed to pass all of my testing. 

But was it really Better...?


Now Race it for Speed....

To find if something is "faster", I tend to just call the thing 100s or even millions of times, and compare timings.

Our test-case is best described as: 

- The potential names are of the form "src_123", using numbers from 1 to 1000.

- At the start, the table contains 501 records with SRC_NAMEs and IDs (the odd numbers). Setup is via the file tst_merge_src.sql (links to all files below).

- The program will randomly generate names of the format "src_123". It then checks each name, and adds a new record if needed.

- We run this for 100 random names and report timing + statistics... We check the timings, and repeat for another 100 random names, until Control-C. A sketch of such a test-loop is shown below.
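
A sketch of such a test-loop (simplified; the real tst_merge_src.py also reports the RoundTrips and other statistics via ora_logon.py):

import random
import time

def test_loop ( con, n_names = 100 ):
    # Check n_names random names and report the rate in records/sec.
    t0 = time.perf_counter ()
    for _ in range ( n_names ):
        name = "src_" + str ( random.randint ( 1, 1000 ) )
        get_src_id ( con, name )      # or get_src_id_rdbms ( con, name ) for the old version
    con.commit ()                     # commit once per test-loop
    elapsed = time.perf_counter () - t0
    print ( f"{n_names} names in {elapsed:.3f} sec"
            f" = {n_names / elapsed:.0f} records/sec, dict size: {len ( src_dict )}" )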


The Original Function with Merge: 200 records/sec, steady.

Here is the first run of 100 checks, using the old function.

Click image to pop-out larger version...

The top-right terminal shows stdout of the test-program: 

It did 106 RoundTrips to test 100 records (100 merges plus 6 to connect and some overhead). It managed to do this at a rate of 212 records/sec. Check also the time-difference of the two lines at the start (blue mark): the test took about 0.5 sec for 100 records, which confirms: 200/sec.

To the Left, we see a terminal with the count of the records: At the start of the test, there were only the 500 + 1 existing old sources. After the first pass of 100 records, the random-mechanism found 51 (even-numbered) new names and merged them into the table. Those newly-found names are immediately "claimed" with an ID. Any other program finding the same names would be able to pick them up and use the Correct ID.

Let's hit enter a few times and do some tests. After 6 runs:

Click to pop out larger version

Now the program did 621 RTs (600 records checked, and some overhead for commit and statistics). The speed is still about 180/sec. In fact, it varied a little, but the rate was stable around 200 records/sec.

Meanwhile, the record-count now shows 223 new records added (i.e. 223 new names found + merged). And this program will keep processing at this rate.

The Speed is 200 records/sec, no matter how long we run the test.

Time to hit Control-C and start testing the new function...


The New Function: using a DICT with cached values..

On the first run with the new function we found the same speed:

Click image to pop out more readable version...

No improvement yet. It added 100 RTs and was still only processing at 200/sec. 

But the DICT now contains 97 elements: this is the start of our "cache"...

The record-count showed the total of NEW records now at 257; some 34 records were added in this round.

Let's now hit enter a few more times and get to 10 runs with the new function. As we run more records, the DICT fills up and the cache-effect starts to work:

Click to pop out more readable version..

Now we have a speed of 500 records/sec, and the DICT contains 637 records (out of a potential maximum of 1000). And the number of RTs is down to about 30 per test-loop of 100.

As we run more + more test-loops of 100 records, most of the 1000 potential names end up in the DICT, the number of RTs needed decreases, and the speed in nr of records/sec goes up steadily...

Click to pop out larger, readable version

At the 25th test-loop, the DICT holds 902 values, and per test-of-100 we are down to about 10 RTs. The measured speed has gone up to 1797 records/sec: that is more than 8x faster than the original function.

This Cache-Mechanism Works !


Reflexions... 

The local cache will Not Know about Deletes or Updates. But for most LOV- or parent-table records, deletes are unlikely. And from the nature of this data: the SRC_NAME is not likely to change over time (it is effectively the Alternate-Key).

As always, I do not like to add additional code to a system: There is enough "maintenance" already without me adding smart-tricks... Weigh the benefits carefully in your own situations.

I do not know (yet) how efficient the DICT-lookups are on large(r) sets. But I assume a local operation (inside 1 process) is generally much more efficient than a call-out to a (remote) RDBMS. Still, it is something to keep in mind, and maybe check in future tests.
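
For Later, a quick-and-dirty check could be as simple as timing lookups on a larger DICT, something like:

import timeit

# Build a DICT of 1 million entries and time 1 million lookups (illustrative only).
big_dict = { "src_" + str ( i ): i for i in range ( 1, 1_000_001 ) }
secs = timeit.timeit ( lambda: big_dict [ "src_999999" ], number = 1_000_000 )
print ( f"1M dict-lookups took {secs:.3f} sec" )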

I don't rule out that certain DataFrames also solve this problem. I hope they Do. But I'm not yet sufficiently fluent in DataFrames to comment much. Maybe Later.

Alternatives.... Use a Local Database or a file?  Someone suggested copying relevant LOV-data into a local store, either SQLite or some copy-file type of cache. I would hesitate to do this, but it may be an option. This is not uncommon in systems that deploy Microservices.


Summary, Wrap-up.

The caching mechanism worked. 

And again, the (evil) impact of RoundTrips is demonstrated.

(In This Particular system, YMMV!)

By eliminating calls to and from the database, we reduce the workload on the python program and on the database.

The python-program does not have to call / fill / run statements and wait for the returns.. 

This frees up time and resources for other work. => Win.

The RDBMS no longer gets 1000s of "identical and near-useless" calls, and it does not have to use its precious processing power to serve those merges anymore.

This frees up resources at the RDBMS side for other work too. => Win.

Needless to say, if you have your software distributed around servers or over datacentres at distance, the impact of Latency and RoundTrips is Even Bigger. 

I said it before (+/- 1998): Chattiness is the next Challenge in IT.


-- -- -- -- -- End of this blog, for Now -- -- -- -- --

Appendix A: links to sourcefiles.


tst_merge_src.sql : the test table (from tst_item_labels.sql from earlier blog)

tst_merge_src.py : the test/demo python code. (it needs some imports!)


The program imports the following: os, sys, array, random, time, datetime and dotenv. 

And of course the oracle-driver: python-oracledb.

All of which are either standard in your python-installation, or can be installed with pip3.


And I use some helper-files; you'll need those for import:

prefix.py : function pp(*argv), prefix the stdout lines with file + timestamp

duration.py : the stopwatch-utility I use.

ora_logon.py : functions to logon to database, and to report data from v$mystat

.env : edit this to include your scott/tiger@orcl, dotenv will read it.

-- -- -- -- -- End of this blog, for Real -- -- -- -- -- 
