Thursday, 14 August 2025

Oracle23ai and python - Measuring (Network-)Time.

TL;DR: I wanted a utility-function in python to report the "time spent", notably looking for "Network-Time". It evolved during coding into a time-reporting-function. It Worked. But I learned something new again.

Image: waiting for a bus, you never know when...


Background: (I'm not obsessed with - ) Roundtrips and Latency.

From earlier work, I have a number of exploratory python code-snippets to study and measure "network time" for (python-)programs that talk to the database. It so happened I came across some "Chatty" programs that were running in network-configurations with high Latency. That sparked a lot of tuning or "optimisation" (simple: removing roundtrips). But there is still a lot to be gained, notably by deploying code closer to the RDBMS: By Reducing Latency.

At first, I set out to write two functions:
 - f_rt_time_lost() : to report "time lost" on RoundTrips in the network. 
 - f_run_pings() : to imitate the ping-behaviour: report pingtimes to stdout.

And if I had those two, I might as well create a test-program: db_ping.py..?

In the process of building and testing those, I also found the need for something akin to linux time, to report how much time a program has spent on CPU (on the client side). 
And as a good DBA,  of course I wanted to report on the DB-time from my RDBMS.
So this "project" got a little out of hand, code-wise. I typed up a lot more than I had originally expected.

But to stay focussed: I wanted to "Report" on the time lost (in network...) by various programs.. so most of the effort went into collecting and reporting the basics:

The formula for time "lost in the Network":

    Network-time = 
        RoundTripTime (RTT)  *  number of RoundTrips  (nrRTs)

In between I started thinking of "total time", and whether it would be possible to put time in perspective: How much is spent "in the App", and how much is spent "in the RDBMS".

The next equation I can think of is this one: The Equation of Time...

    Total time = 
        App-time (spent running code from client-program) 
    + DB-time (spent running SQL or PL/SQL in RDBMS-session) 
    + Network-time (nrRTs * RTT).
[    + Idle- or user-input-time (sleep, keyboard-wait, other?... ). ] 
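For my own bookkeeping, this equation reduces to a trivial residual-calculation: Idle-time is whatever remains once the other three are known. A minimal sketch (all numbers invented for illustration):

```python
# Sketch: the "Equation of Time" as a residual calculation.
# All input values below are made-up examples, not measurements.

def f_idle_time(total_s, app_s, db_s, nr_rts, rtt_s):
    """Idle time = Total time minus App-, DB- and Network-time."""
    network_s = nr_rts * rtt_s          # Network-time = nrRTs * RTT
    return total_s - (app_s + db_s + network_s)

# example: 9 sec total, 0.5 sec app, 0.2 sec db, 100 RTs at 30ms each
idle = f_idle_time(9.0, 0.5, 0.2, 100, 0.030)
print(f"idle time: {idle:.2f} sec")     # 9.0 - (0.5 + 0.2 + 3.0) = 5.3
```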

If I can report on those items I can make fairly clear-cut cases for troubleshooting: Try reducing the biggest time-consumer. And if that happens to be "the network" then we know where to try and improve. 
If it is Not the Network, then we probably still know where to look. And we can better predict what kind of impact any improvement can have.

Note1: At this point, real-techies start to suggest "profilers", or even a fancy "flame graph". But in my environment I don't have the luxury nor the knowledge-base to go and do extensive profiling. I will have to help my colleagues with the tools we have, and with tricks they can understand and implement.

Note2: Anything that runs multi-threaded or uses PX (parallel processing of SQL), will pose additional challenges. I'm deliberately keeping those out of scope.

Note3: Equation of Time.. EoT.. I'm deliberately not going to use that abbreviation.

For now..... 
Most programs already pop out a "summary" of their processing to stdout, and I wanted to add my own little item to that.

What if Every Program could produce that Equation of Time at the end of its run... ?


Some musings on measuring time.

In IT it is always very much "in vogue" to blame network or latency, or "our WAN connection" for any slow response-times or long-running-programs. 
Which is Why I want to Know.. I want to be able to say with more confidence: "this is a network issue". 
And since I am not a "network engineer", not even a proper sysadmin, I need Data, and I need data that I can "Understand" myself and that I can Explain Clearly to Managers and other non-tellectuals.

With the Oracle Database, we are used to AWR (or STATSPACK), and ASH, to report on activity and we know how to use that information for "tuning". But those tools have difficulty covering network activity. 
At session-level I can get the nr of RoundTrips (RTs) from a session, And I can have a session report how much DB-time it spent. 

The two items missing from AWR are, IMHO, App-time and Network-time. 
App time would be the time spent "in the App", e.g. on CPU or other activities at the client-side (some of it maybe even waiting for a keyboard-enter).
Network-time would be the time spent on "transport" to and from the RDBMS. And that Network-time is possibly the least-productive of them all. It is the "waste" that we are trying to eliminate...

I realise full well that AWR, based on the RDBMS, will not be able to measure App-time or Network-time, and I am not blaming Oracle for not providing this data. But I can look for that data myself...


Let's put it in code..

From earlier tweaking, I had some code more or less ready to copy-in. 
For python, I decided on the time module, and using time.perf_counter_ns() as probably the most precise way to measure time (see also an older blog on duration.py).

Last year, I had already thrown together some "stopwatch" ideas, so there was some re-use of a module called duration.py (it now also has a function inspired by the linux time command - maybe a good topic for later...).

In short: I can cover the App-time, both active-time and elapsed clocktime from the python time module.

The functions I used come from the time module.

time.perf_counter_ns () : a clock counting nanoseconds. The reference point is undefined, hence you need to save the value when your program starts and then check the difference to know how many nanoseconds have elapsed.

time.process_time_ns () : the nr of nano-seconds spent on user- or system-time. See this as "busy-time". When compared to linux time, this should be close to the sum of user+sys time (check this...?)
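A quick illustration of the difference between the two clocks: a sleep advances the wall-clock but consumes (almost) no CPU:

```python
import time

t0_wall = time.perf_counter_ns()     # wall-clock reference point
t0_cpu  = time.process_time_ns()     # CPU (user+sys) reference point

time.sleep(0.2)                      # idle: wall-clock ticks, CPU does not

wall_ms = (time.perf_counter_ns() - t0_wall) / 1_000_000
cpu_ms  = (time.process_time_ns() - t0_cpu) / 1_000_000

print(f"elapsed: {wall_ms:8.3f} ms")   # roughly 200 ms
print(f"on cpu : {cpu_ms:8.3f} ms")    # close to 0 ms
```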

Then I needed DB-time, and number of RoundTrips: those could be gotten from v$mystat and v$sess_time_model. I used this query:
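The query itself is in the screenshot below; my reconstruction of it (hedged: not the exact version, but the stat-names are as they appear in the v$-views, wrapped in a helper for re-use) would be something like:

```python
# Hedged reconstruction of the query: RoundTrips and DB-time (centisec)
# from v$mystat, plus DB-time (microsec) from v$sess_time_model.
SQL_SESSION_STATS = """
select sn.name, ms.value
from   v$mystat ms
join   v$statname sn on sn.statistic# = ms.statistic#
where  sn.name in ( 'SQL*Net roundtrips to/from client', 'DB time' )
union all
select 'DB time (usec)', stm.value
from   v$sess_time_model stm
where  stm.sid = sys_context ( 'userenv', 'sid' )
and    stm.stat_name = 'DB time'
"""

def f_get_session_stats(conn):
    """Return { stat_name : value } for the current session."""
    with conn.cursor() as cur:
        cur.execute(SQL_SESSION_STATS)
        return dict(cur.fetchall())
```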


This gives me the nr of RoundTrips for the session, and the DB-time from both v$mystat and from v$sess_time_model. I used the time-model as it returns microseconds and claims to be more precise, but the old-fashioned DB-time from v$mystat, in centiseconds, achieves the same result once the session has consumed a few seconds of time.

To find the RoundTripTime (RTT), I need a ping, or a number of pings. And then multiply ping-time by the total nr of RTs that the session has done.

I sampled RTT using 5 pings, with 1 second in between. More on that later...  
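The sampling-loop is simple enough; sketched here with the round-trip call injected as a parameter (in the real code that would be Connection.ping() from python-oracledb - the function- and parameter-names are my invention for this sketch), so the timing-logic stands out:

```python
import time

def f_sample_rtt(f_ping, nr_pings=5, pause_s=1.0):
    """Time nr_pings calls to f_ping, return the average RTT in ms.
       f_ping is any zero-argument round-trip call, e.g. conn.ping."""
    rtt_ms = []
    for _ in range(nr_pings):
        t0 = time.perf_counter_ns()
        f_ping()                                         # one round trip
        rtt_ms.append((time.perf_counter_ns() - t0) / 1_000_000)
        if pause_s:
            time.sleep(pause_s)                          # pause between pings
    return sum(rtt_ms) / len(rtt_ms)
```

With pause_s=0 this becomes the "do the ping faster" variant.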

And as the code approached completeness, I had:
App-time: from time.process_time_ns()
DB-time: from v$sess_time_model
Network-time: from multiplying RoundTrips x Ping-time
And Total-time: from the wall-clock of time.perf_counter().

That looked quite complete and realistic. 


The Test-setup and first run...

While coding, I had the function included in ora_login.py , my module with oracle-utilities for use in our python programs (link below).

And for testing, I had the function called at the end of do_sql_times.py, as described in an earlier blog.

This program allows me to run an ad-hoc query against the RDBMS. It uses the dotenv utility to get credentials from the file .env. And it reports on how many microseconds each record takes to "retrieve". I've used that program to measure timings before, notably to show the effects of arraysize and prefetchrows.

A test-run with a query on dual looked like this:


After the Connect, this program notifies that "array" and "prefetch" are set to non-defaults of 1 (to maximise RoundTrips). 

Every record will be a RoundTrip, to maximise the number of round trips and thus maximise Network-time.

The very simple query is just 1 record, and that seems to take 49 ms (I happen to know that could be the approximate RoundTrip time for this system). 

The 2nd part, the "reporting of the time" is interesting: We collect the ping-time, we retrieve DB-time and RTs from the RDBMS, and we can calculate "all the values":


Note that the ping-times to this system are fairly high, and variable. This is the kind of system we have trouble with... 

And then we see the data we want from the programs:
App-time: measured by time.process_time()
DB-time: from the DB
Network-time: pingtime x nr RoundTrips
Idle time: calculated.
Total time: measured from time.perf_counter()

The first time I tested this, I was surprised by the large, 5sec, idle time in a program that I thought did very little. Then I realised: The loop for the 5 pings has a 1-sec sleep in it, and that generates the (approx) 5 sec of Idle-time.

As a separate check, I've used the linux time command to time the run of the program, and the values concur, even for a very trivial, short-running program:

Real: the elapsed time, close to the 8sec reported by python.
User and sys: Together they are +/- close to the App-time reported by python.

And the data from time correspond, with a margin of error, to the data reported by my python contraption.

So I now have a function I can call at the finish of a program that will report how time was spent. And a verification via linux time looks acceptable. 

So far it looks promising, usable (but you can feel a problem coming...)


Test some Longer running queries...

Then I did a very simple test that turned out funny. Query 100 records (with one RoundTrip per record), and I found this:


Let me point out: 
Each record seems to take between 30 and 60 ms to retrieve (forced to do a RoundTrip every record)
The ping-samples are much longer than that, with an average of 175ms ?
And the Idle time is reported Negative ??

On first impression: the individual records come back (from a RoundTrip) in some 40ms, which is expected on this system. And if it takes 40ms to fetch a record, the RTT should be lower than that.
But the ping-samples are higher and show more irregularity than expected. 

The same test, against a database running nearby (ping-time around 1ms), shows this:


On this easy-to-reach RDBMS, ping-time shows less variance, and the Idle-time is as expected: 5 seconds from 5 calls to ping plus sleep-1-sec to sample RT-time.

It seems that on the "irregular system" the sampling of the ping-times (5 pings in 5 seconds) comes back with values that are too high, and possibly more irregular than they are in reality.
And from over-estimating the RTT, the Idle-time is then calculated as ... Negative?

Something is Not Right.


Partial solution: Do the ping "faster".

One apparent fix was to modify the ping-sampling, and remove the sleep-time between pings. This gave more regular and lower RTT times:


But it mainly shows that pings (and RoundTrips, RTT) were faster when done at high frequency. And we don't know at what frequency the regular python-programs do their round-trips, especially while we are trying to Eliminate the RoundTrips in the first place.


Wrap Up: This helps, but there is More... 

This measurement is only Partly Helpful.  First the list of positives...

DB-time: We get a view of how much time is spent in the database. If this time is significant, we can go in and optimize some of the queries. Classic DBA-tuning stuff... (but not our biggest challenge on this system).

App-time: We get an idea of how much time is spent by the python-code and the measure from time.process_time () seems to concur, but not quite, with what the linux time command provides as user + sys time.

RoundTrips: We can record the exact number of RTs per program, and thus have a better measure for its chattiness (but we already knew that from AWR-data...). In our test runs, the program did 110 RTs in about 9 seconds, say +/- 12 per second. No idea if that is good or bad, and we also don't quite know in what "pattern" they were done. As it turns out this program did most of the roundtrips in a quick sequence while fetching 100 rows. Other programs will display different patterns...

Network time: Not Quite. This "measure" is still unreliable, mainly due to the uncertainty in measuring ping-time or RoundTripTime (RTT). 
Ongoing experiments will probably be with some "ping" utility.
Ongoing experiments will probably be with some "ping" utility.

Current measure: For the moment, I'm reducing (removing) the sleep-time between pings when sampling. But I do not know the behaviour of the python-programs with regards to their frequency of RTs to the RDBMS, or the pauses between SQL-Net calls.

Root Causes: We suspect this behaviour is due to the nature of our systems. Most of the Databases run on VMs. But a number of (test)databases are running on containers. The latency (and ping times) vary depending on the locations in the networks and the time of day, and possibly the phase of the moon... In general, the VMs seem to have more problems than the containers with test-databases...

Don't tell the manager, but... we don't really "Know Everything" yet.

Main goal for the moment remains: Find and Remove RoundTrips....

-- -- -- -- -- End of this blog post, for now -- -- -- -- -- 

Appendix : links to code and scripts

do_sql_times.py : the program used in testing. 

You can use this program to run "any query" and get timing data out of it. The actual function to report the timing is ora_time_spent(the_conn) and is defined in the file ora_login.py.

The utilities used:

ora_login.py : Functions to logon, some utilities, and now also time-reporting.

duration.py : Time-measurement (and some reporting).

prefix.py : Enable a prefixed print to stdout, add timestamp in microsec.

.env : The config file for dotenv.

Beware: the .env now contains non-defaults for arraysize and prefetchrows. If you run with those set to 1 (one), some of your code may cause excessive RoundTrips! 

-- -- -- -- -- End of this blogpost, for real -- -- -- -- --




Tuesday, 12 August 2025

Oracle23ai and python - Pingtime and RoundTrips

TL;DR: from my obsession with finding and measuring RoundTrips, I got to tweak with pings and RTs. Some findings, some code. And maybe more to investigate if and when that is deemed relevant... 

image: queues, roundtrips, time-spent... 


Background: RoundTrips, Latency.

When moving our python-code around to find the best (safest, and cheapest) place to run, I got obsessed with network-delays. Notably RoundTrips and ping-times. You can find Proof of my obsession in some of the previous blogs...  

From using ping (on linux and mac-OS) and tnsping, and my earlier doodling with the ping-function from the python-oracledb driver, I wanted to incorporate a "latency measurement" into the python (problem-)code to collect data, maybe help advise on where to deploy the programs.
And another logical follow-up would be to build a db_ping(.py) utility for when tnsping was not available (thin driver...). Later.

Mind you: a lot of detailed "network stuff" is way out of my "knowledge zone", but there are Two Things I know: 
1) Fast Connections, low latency, makes our programs "run better" and 
2) Keep the nr of RoundTrips Down, no matter what. 
So I should be measuring and counting those items.

Here are some notes, some warnings, and some code to re-use if you want.


The DBA view: AWR (STATSPACK) does not know...

As a DBA I have a tendency to look for "data" in AWR (Automatic Workload Repository), ASH (Active Session History) and similar information provided by the RDBMS. It will give me things like user-calls and SQL*Net Round Trips, and information on the SQL-statements done by the application. Those tools are Excellent for troubleshooting Database-behaviour, and to advise app-coders on where+how to improve their SQL. But the data from AWR has two flaws for my purpose:
1) For the RoundTrips, AWR does not produce any "ping-times", no Latency is measured.
2) It can only report on things from "inside" the RDBMS. Any "time used" or "time lost" on the client-side is not registered.

To measure Latency, you need something similar to ping. And to measure time consumed by the application, you need to "profile" its behaviour or measure consumed-time and consumed-CPU somehow. Not something that the RDBMS can "see" without additional help. (multi-component-view needed, topic for future blog..?)


Find the ping-time.

Firstly, I felt the need to report the RoundTrip-times before and after various (python)programs. 
(Note that good old tnsping was also not available because of the thin driver...)

I had already, innocently, incorporated something like :


os.system('ping -c10 www.oracle.com ' ) 


But I quickly found out that ping (and my other old-time favorite: traceroute) were not always present in the (production-, container-) locations where the python code was running. And getting tools "included in the build" was complicated. (tip: nobody seems to worry about the pip3-stuff because "it does not run under root". No further comment, and I am not hacking our own systems... Never).

So I built a few (amateurish) functions for checking connectivity and had them included in some of the python-rollouts.

Link to complete source below, but here is my quick code to measure a single ping and a series of 100 pings:
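The screenshot shows the actual code; the essence of f_rt_1ping() is roughly this (a hedged sketch, using the ping() call of the python-oracledb driver - the internals here are my reconstruction):

```python
import time

def f_rt_1ping(conn):
    """Do one database round trip, return its duration in ms."""
    t0 = time.perf_counter_ns()
    conn.ping()             # python-oracledb: one round trip, no SQL involved
    return (time.perf_counter_ns() - t0) / 1_000_000
```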


In this code you can see there is a function, f_rt_1ping(), that is called once at the very start of the "main" program. And then it is called again, in a loop of 100. The result looked like this:


That first ping seemed to have taken about 28ms, and that is more or less what I expected from the linux-pings to this system.
The loop of 100 pings yielded an average of 25ms. Also in expected range.

Note: Please disregard the burps about starting MAIN, and the connection-info: I Need to Be Sure of the address, instance, user etc.. to avoid reporting data from the wrong database (oh yes, that still happens a lot...not kidding - link to older blog..).


Calibrate: make sure there is no overhead.

Because I am slightly paranoid, I wanted to make sure there was no hidden "overhead" in my ping process, so I also built a calibrate-function. I would measure the same activity with and without the actual ping. The result was reassuring: 

The overhead of code looping around a number of pings is negligible, see program run below. And I +/- knew this from an earlier experiment as well..


Something to Include into python programs...

To collect some data, I experimented with code to do "a sample of pings" inside the regular python programs. It would ping for a pre-determined interval (e.g. say 2 seconds?), or a pre-determined number of pings (say 100?). The output could go to stdout and get collected into the log. In future, I could also envisage storing the data in a table (it is an RDBMS after all).

For now: I was happy to have an approximate RoundTrip time (in ms) in stdout.


Even Better: I also know the RoundTrips and the DB-time..

From earlier work, I already had a way to interrogate V$MYSTAT. This would tell me what the RDBMS thought my session had done. I could re-use that sampling of V$ to report on the session activity, as seen from the RDBMS. It would tell me how many RoundTrips a program had done.

So near the end of each program (more accurate: before closing the DB-Connection), I could report the number of RoundTrips to stdout.


Let's run the test-program to completion:

With all this data to be collected, the end-result, to be put at the bottom of a python-program, will look like this:


With just a little extra code, any (relevant) python-program could, once it has done its job, report this data into stdout for examination by the sysadmins. And to myself, the DBA...
Notice that this program (just a test-thing), has done... 
 - a test of 1 ping, which yielded 27.266 ms RoundTrip time.
 - a test of 100 pings, which yielded avg 27.373 ms.
 - a test of 2sec calibrate-pinging, with an avg (netto) pingtime of 27.0266.. ms.

And from the session-statistics, we detect:
 - 177 RoundTrips.

Also notice: I can use the linux time command to report the real+user time consumed by the program. I would like to use that somehow. Maybe Later.

Note: I'll ignore the difference between db-time (microseconds from the latest time_model and centiseconds from the old mystats) for the moment. I might ask some ppl who know more about oracle when we meet. And I'll keep an eye on that when running more relevant programs once the code is included...

(And Yes, I'll think about popping this data into some log-table - that is definitely an option! Later)


Making Management happy, with numbers...

So by collecting RT-times and number of RTs, I have "measures" for each program or program-run.

High RT times, high Latency, will cause time lost in "the network".

High nr of RTs, Chattiness, will cause time lost in "the network".

The product of the two should be approx the total time lost on network-traffic; call that Network-time.

So by collecting these numbers, I can "guesstimate" the time lost in the network. And I can indicate the possible gains from either bringing down the RTs or placing the program closer to the RDBMS.

From the example above: with 177 Roundtrips, at an average of 27ms, those round-trips may have consumed about 177 x 27 = 4779 ms, or 4.8 sec of "network time".
This we can report to our Manager...
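As a sanity-check, that guesstimate in code (numbers taken from the example run above):

```python
# Guesstimate of Network-time, with the numbers from the test-run.
nr_rts = 177       # RoundTrips, from the session statistics
rtt_ms = 27        # average ping-time in ms, measured

network_ms = nr_rts * rtt_ms
print(f"network time: {network_ms} ms")   # 4779 ms, about 4.8 sec
```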

I have a slumbering idea to take it even further: 
I can try to measure time spent in python on the app, on the container: App-time.
And I can try to measure time spent in the RDBMS (DB-time, from V$MYSTAT or V$SESS_TIME_MODEL). 
And for non-threaded, non-parallel processes, the total of time: App-time + DB-time + Net-time => Should add up to the running time of the program.

Maybe later. I have some more ideas to tweak code and experiment. 
Focus for now is on reducing Chattiness and Hopefully reducing Latency.


Wrap up: It helps to collect Data.

At this point in time, I am just happy to collect "data" and use it as a broad-indicator. Finding the programs with most RTs. And relating the RTs to "sensible values".

For example, if a certain program does Millions of RTs, but only processes a few thousand records (and if we somehow know that with certainty...): something is suspicious and it may need examination of code...

And in our case, any component(-location) with a high latency will also get some attention.

At least by collecting the data, I have arguments to discuss with various colleagues.

More to come. Always more Ideas. Maybe if time.

-- -- -- -- -- End of this blogpost, for now -- -- -- -- --

Appendix : links to code.. 

tst_netw.py : the test-program used for the demos and screenshots above.

Plus these additional files, needed to import helper functions into test-program:

ora_login.py :  login and some other oracle-helper functions.

duration.py : time-keeping and reporting functions - instrumentation

prefix.py : print the filename + timestamp to stdout - instrumentation.

.env : to replace credentials with your own scott/tiger@host:port:service

that code: Take it! 

Feel free to copy or plagiarize (do you really need a large disclaimer..?). Just Copy What You think is Useful. 

If you have suggestions for improvements: Let me know. I'm always interested in discussing these things if time+place allow. Preferably in the margin of a Good Conference, or on one of the social channels (bsky, slack, email...).

Oh, and you may need to install (pip3) the following:
pip3 install oracledb   
pip3 install python-dotenv
(datetime and array come with the python standard library, no pip3 needed.)

-- -- -- -- -- End of this blogpost, for Real -- -- -- -- --

Tuesday, 5 August 2025

Oracle23ai and python - too many RoundTrips, Fixed.

TL;DR: Time got lost due to leftover code. In our case, some well-intended "defensive coding" caused additional RoundTrips and some commit-wait time.

Found it. Fixed it.
And at the same time, we (re)learned a little about coding, and on how to measure, and how to "instrument" our code.


Image: waiting for a train. I've spent a lot of time waiting for trains, waiting for planes, and waiting for traffic jams..


Background: Suspiciously slow python code.

After some careful testing we stumbled across parts of the program that did more RoundTrips than the data justified. And RoundTrips were still the item that seemed to cause us most "wait-time".


Testing: Run the suspicious code in "isolated" test.

To better observe, we re-worked the UnitTest to run just the parts of the code that we suspected to cause "slowness". After much simplification, we ended up with this result:


We see the result of 1000 loops of "insert item". That took 14.17 seconds, corresponding to a "speed" of 70.58 records/sec.

From the session info (V$MYSTAT) we see that the session did about 1000 executes, which makes sense for 1000 inserted records.
But the surprising numbers are the RoundTrips and the user-calls: 3000 each ??
It seems this code does 3 Roundtrips for every record inserted....

We also know that this particular setup at the time had a latency of around 3-4ms per RT. Meaning that 3000 RTs would consume close to 12sec of the 14sec elapsed.

Furthermore, from V$MYSTAT, we notice that the RDBMS consumed 160 Centiseconds, or 1.6 sec of time.

Note1: The db-connect and the measurement-queries represent the additional overhead of 14 executes, 4 roundtrips and 5 user-calls. We know this from earlier testing of similar code, see previous blogs (link). 

Note2: I also realise that reading our stdout takes some getting used to; we sometimes appear a little "over instrumented". My bad. But it Helped!

Note3: the DB-time for the session is reported at 160 CentiSeconds, e.g. this session only consumed 1.6 seconds of time at the RDBMS. The rest of the elapsed 14 sec is spent "elsewhere", and we suspect 12sec of that were spent in RTs... Interesting? Something for later research?

Anyway, from this test, it seems the insert of these particular item-records contains some overhead. Time to read the code...



Code Inspection: Aha.

Once we had the suspect isolated as "inserting of items", we examined the code involved. And after some searching, the simplified version of what we found was this:
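The screenshot shows the real code; a hedged reconstruction of the pattern (table- and function-names are mine, not the developer's) goes something like this, with the three RoundTrips per record marked:

```python
# Reconstruction of the problem-pattern: 3 RoundTrips per inserted record.
# Names (items, f_insert_item) are invented for this sketch.
def f_insert_item(conn, item_name):
    stmt = "insert into items ( item_name ) values ( :1 ) returning id into :2"
    cur = conn.cursor()
    try:
        cur.parse(stmt)                   # RT 1: the "defensive" parse-check
    except Exception:
        print("offending SQL: ", stmt)    # on error: always show the SQL
        raise
        return 0                          # unreachable, but it was there...
    id_var = cur.var(int)
    cur.execute(stmt, [item_name, id_var])   # RT 2: the actual insert
    conn.commit()                         # RT 3: the leftover commit
    return id_var.getvalue()[0]
```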


There are two things "wrong" with this code.

Firstly, the Try-Except block. It will catch parse-errors. But after catching the exception, it does not really know "what to do with the error". It simply re-raises it and then quits. You can smile at "return 0" as well.. 
But what mattered in our case: the parse-attempt causes a RoundTrip to the RDBMS (and RTs are considered Evil, in our case).

Secondly, there is a Commit after the insert. Oops. Transaction-Logic was not supposed to be part of the "insert function". That commit was a leftover from the "development time", when commits were useful to preserve the state (the data) that would otherwise get lost if the program did an exit-on-error.
But guess what: a commit causes a RoundTrip (and RTs are considered Evil, in our case).

Note: The developer was Innocent: he had created this code with the Best Intentions. There was a reason for putting the parse and print-of-stmnt there.  On errors, I will _Always_ ask for the offending SQL. And over time, as the datamodel evolved, columns were added/modified, causing statements to fail. Hence the need to "print the SQL".
Please understand that I don't want to unfairly blame the DEV...


The improved version.

It was not difficult to remove the offending code. And after some edit, we ended up with this (simplified!) version:
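Again a reconstruction rather than the exact code (same invented names as in the sketch of the problem-pattern): the cleaned-up function only does the insert and leaves transaction-logic to the caller:

```python
# Reconstruction of the improved version: 1 RoundTrip, no parse, no commit.
def f_insert_item(conn, item_name):
    stmt = "insert into items ( item_name ) values ( :1 ) returning id into :2"
    cur = conn.cursor()
    id_var = cur.var(int)
    cur.execute(stmt, [item_name, id_var])   # the only round trip
    return id_var.getvalue()[0]              # commit is up to the caller
```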


Now the function only does the insert, and returns the newly-found ID. Parse and Commit have been removed. This should avoid some round trips. 

Will this speed up our (test)code ? Let's run it... 


And Voila, the same 1000 inserts are now done in 4.46 seconds, and with only 1000 Round-Trips. Since on this setup we know that an RT is generally about 4ms, the timing-info makes sense: We eliminated 2000 RTs, and at 4ms per RT that would make the program run about 8sec faster from eliminating RTs alone... 
And especially for a "far-away" application with very high numbers of inserts, this is going to make a difference.

But there is more: The reported DB-time from V$MYSTAT is now down to just 14 CentiSeconds. That might be "explained" by the 1000 commits of the previous version of the code. A commit would realistically (on this old test-system) take about 1ms. Interesting.

The parse-time we save by not parsing is likely to be minimal: re-parsing the same statement is a Very Efficient operation thanks to the Oracle Shared_Pool.

The total gains from the new code seem to be about 10sec: 8sec gained from eliminating RTs, and 1.4 sec gained in DB-time, probably from eliminating Commits. Total gain close to 10 sec. And that concurs with the measurement.

In summary: The new reported "speed" is now 224 records/sec.
About 3x faster.


Wrap-up, on the (Human-) Coding-side.. 

Defensive Coding is generally not a bad thing, and the "problem-code" was built with the Best Intentions.

The parse-call was intended to check the connection, check the privs, and check the syntax of the statement. Except that sometimes, the addition of "being careful" results in time spent on un-necessary activity. These "checks" were better done in the Testing-phase, or ultimately by single calls at the start of the program. Not inside a high-frequency call.

Key points: 
Spend more time informing DEVs.
Pay attention to (Unit-)Testing, maybe include reporting of statistics and time ? 
Do occasional code-reviews ?


Wrap up, on the (geek-tech) Measurement-side:

We have the start of instrumentation: python does some time-keeping and prints timestamps.  And the captured data from V$MYSTAT helped to identify the additional RoundTrips and user-calls. We also noted the DB-time: after removing parse- and commit- overhead, there is Notably less time reported as DB-time.
It would be nice to know a little more on the "time spent" by the various components in more detail... 

Future improvements come to mind: better use of the timing-information available in python (time.perf_counter_ns?). And what about the data available in Oracle (the v$sess_time_model?). Maybe also use the linux time command?
Would it be possible to "measure" time on both sides: application + RDBMS, and then attribute the "missing time" to RoundTrips ?  Future Blog(s).. ? 

Key point: 
Improve on Instrumentation ?

(spoiler: possibly more ideas on instrumentation coming... )


-- -- -- -- -- End of this Blogpost, for now -- -- -- -- --

Appendix: links to code..

tst_merge_src.sql : set up the tables for this test (re-used from previous test)

tst_ins_item.py : the test-code used in the demo



Plus these additional files, needed to import helper functions into test-program:

ora_login.py :  login and some other oracle-helper functions.

duration.py : time-keeping and reporting functions - instrumentation

prefix.py : print the filename + timestamp to stdout - instrumentation.

.env : to replace credentials with your own scott/tiger@host:port:service

And you may need to install (pip3) the following:
pip3 install oracledb   
pip3 install python-dotenv
(datetime and array come with the python standard library, no pip3 needed.)

-- -- -- -- -- End of this Blogpost, for Real -- -- -- -- -- 

Friday, 1 August 2025

Oracle23ai and python - How to fix (avoid) a MERGE statement.

TL;DR: Still using Oracle and python, and loving the combination. But some Annoying High-Frequency MERGE statements from several python programs caused too many RoundTrips... Trying to fix that.

And can I just say: RoundTrips are Evil !

Image: Vaguely related, we have lots of Traffic, and we are trying to MERGE...


Background: need to Eliminate RoundTrips.

For a number of python programs we have logic that checks on "existing data". One example is to find a certain SOURCE. The table holding the sources looks like this (simplified):
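As a sketch (the real DDL is in the image; column-names are assumed from the text, not copied from the actual schema):

```sql
-- Sketch with assumed names: a sources-table with generated ID
-- and a unique name to look up.
create table sources
( id        number generated always as identity primary key
, src_name  varchar2 ( 128 )  not null  unique
, created   date  default sysdate
);
```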

If a source-record, by whatever SRC_NAME, exists, we want to get the ID, and if the source does not exist, we need to create it and return its newly assigned ID. 

Perfect case for a MERGE-statement, righ ?

Except that we do many source-checks per second, and each of those stmnts becomes a RoundTrip. These merge-stmnts were 2nd on our high-frequency-list. And since we eliminated the single-record-inserts, they are now Top-of-Problem-List.


The Problem (simplified): 

The function will check for existence of a source-record. The MERGE statement was the most obvious to use here. Our MERGE statement looked like this (somewhat simplified):
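For readers without the image, here is a hedged reconstruction of what such a MERGE might look like (the SRC table and the ID/SRC_NAME columns are assumptions based on the text, not the author's actual code; note that Oracle's MERGE has no RETURNING clause, so the ID is typically fetched with a separate SELECT):

```python
# Hypothetical reconstruction -- table- and column-names are assumptions.
merge_src_sql = """
    merge /* mrg_src */ into src s
    using ( select :src_name as src_name from dual ) n
       on ( s.src_name = n.src_name )
     when not matched then
          insert ( src_name ) values ( n.src_name )"""

# Oracle MERGE has no RETURNING clause; the (new or existing) ID
# is fetched with a separate statement:
select_src_id_sql = "select id from src where src_name = :src_name"
```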

The MERGE stmnt is quite standard: check for existence of a given name, and if it does not exist: create it. Then return the ID.

Also notice the (global) variable: src_dict = {}, it will hold the pairs of SRC_NAME and ID as we find them.

The function that uses this MERGE statement gets called at very high frequency in several places. And in the original situation, every call causes a round-trip to the RDBMS.

The original function (simplified) looks like this:

Image will pop out if you click it..
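As a stand-in for the image, the function might be sketched like this (not the author's actual code; the cursor is passed in, and the two SQL texts are assumptions from the text above):

```python
def f_get_src_id_db(cur, src_name):
    """Check/create a source-record in the RDBMS and return its ID.
    Every call costs at least one RoundTrip (here even two)."""
    # claim the name if it does not exist yet (hypothetical SQL)
    cur.execute(
        """merge /* mrg_src */ into src s
           using ( select :src_name as src_name from dual ) n
              on ( s.src_name = n.src_name )
            when not matched then
                 insert ( src_name ) values ( n.src_name )""",
        src_name=src_name)
    # fetch the (new or existing) ID
    cur.execute("select id from src where src_name = :src_name",
                src_name=src_name)
    (src_id,) = cur.fetchone()
    return src_id
```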

Quite straightforward: create a cursor, assign the bind-variables, execute the statement, and catch the returned ID. Job Done. Except that this would sometimes run at 100s per sec. and show millions of executes per hour in AWR.

Note that we effectively check for parent-records or LOV-records before an insert of a detail-record. I can imagine other constructs, such as triggers or a PL/SQL function to contain this logic. But here we are...

Note Also: On the topic of MERGE-stmnt, allow me a sidestep to this blog by Oren Nakdimon about concurrency-issue with MERGE (link), but that is out of scope for me at this point. Maybe Later.


The Possible Solution:

Thinking back to my previous "Fix for Roundtrips" (link), some form of local buffering or a "local cache" would seem appropriate. But there were some issues:

- Uniqueness: Any new record, e.g. newly-found SRC_NAME, should Ideally be "claimed" with a new ID into the RDBMS Immediately to prevent other systems from assigning different IDs to the same source-name.

- Timeliness: A local buffer would _always_ be out of date, especially when multiple running jobs were likely to discover the same or similar sources in the same timespan. Ideally, the local buffer would always have to be up-to-date, or kept in sync, with the Database.

In short: The Truth is In the Database, the Single Point of Truth (SPOT, classic problem of copy-to-cache...).

- And preferably no "Slurp" of all data: A local buffer could potentially be large, but not every program-run needs all the records. Most programs would only need a small set of the data, typically 10-20 source-records (but they do millions of unnecessary merge-check-retrieve for that small set of SRC_NAMEs). A pro-active "Slurp" of a large set of LOV-data would not be desirable.

One of the "lucky" aspects of our merge-problem was that the SOURCE-data, for this process, was insert/lookup-only. Any updates (comments, modifications, or even combining of sources) would happen elsewhere. The "worker programs" just needed to fetch an ID, or create an ID where none existed yet. 

But any new ID would have to be "immediately" stored into the RDBMS to have it available for others.


The concept-solution.

With some thought, the following pseudo-code came to mind (some architects will call this a "design pattern"):

The image will pop out if you click it..

The comments speak for themselves. I chose a python structure called a DICT, which I think is similar to an associative array in PL/SQL.

Note that at this point of writing, I do not yet know if that choice was "optimal", but it seemed to work just fine in our cases. Again something to investigate Later...


Let's put it to Code.

This idea was relatively easy to code. A link to complete and hopefully runnable setup- and test-scripts is at the end of the blog.

The new function First checks if the SRC_NAME is present in a DICT, and if not, then calls the "old function" to check against the RDBMS.  Then returns the ID.

It looks like this:
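A minimal sketch of such a cached lookup (the function- and variable-names are my assumptions; in the real code the "old function" is presumably called directly, here it is passed in as a parameter so the sketch runs stand-alone):

```python
src_dict = {}   # global cache: pairs of SRC_NAME -> ID, filled as we go

def f_get_src_id(src_name, f_db_lookup):
    """Return the ID for src_name: first try the local DICT, and only
    on a cache-miss call the old MERGE-function (one RoundTrip)."""
    if src_name in src_dict:
        return src_dict[src_name]      # cache-hit: zero RoundTrips
    src_id = f_db_lookup(src_name)     # cache-miss: ask the RDBMS
    src_dict[src_name] = src_id        # remember for next time
    return src_id
```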

That was Easy enough (In practice there is a bit more, but out of scope). And it also seemed to pass all of my testing. 

But was it really Better...?


Now Race it for Speed....

To find out if something is "faster", I tend to just call the thing 100s or even millions of times, and compare timings.

Our test-case is best described as: 

- The potential names are "src_123", using numbers from 1-1000.

- At the start, the table contains 501 records with SRC_NAMEs and IDs (the odd numbers). Setup is via the file tst_merge_src.sql (links to all files below).

- Program will generate names randomly of format "src_123". Then check the name, and add a new record if needed.

- We run this for 100-random-names and report timing + statistics... We check the timings, and repeat for another 100-random-names. Until Control-C.
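The name-generator for such a test can be as simple as this (the "src_123" format is from the text; the generator itself is my sketch):

```python
import random

def f_random_src_name(max_nr=1000):
    """Return a random name of the form 'src_123', numbers 1..max_nr."""
    return f"src_{random.randint(1, max_nr)}"
```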


The Original Function with Merge: 200 records/sec, steady.

Here is the first run of 100 checks, using the old function.

Click image to pop-out larger version...

The top-right terminal shows stdout of the test-program: 

It did 106 RoundTrips to test 100 records (100 merges plus 6 to connect and some overhead). It managed to do this at a rate of 212 records/sec. Check also the time-difference of the two lines at start (blue mark): the test took about 0.5 of a sec for 100 records, which confirms: 200/sec.

To the Left, we see a terminal with the count of the records: At the start of the test, there were only the 500 + 1 existing old sources. After the first pass of 100 records, the random-mechanism found 51 (even-numbered) new names and merged them into the table. Those newly-found names are immediately "claimed" with an ID. Any other program finding the same names would be able to pick them up and use the Correct ID.

Let's hit enter a few times and do some tests. After 6 runs:

Click to pop out larger version

Now the program did 621 RTs (600 records checked, and some overhead for commit and statistics). And the speed was still around 180/sec. In fact, it varied a little, but the rate was stable around 200 records/sec.

Meanwhile, the record-count now shows 223 new records added (e.g. 223 new names found + merged). And this program will keep processing at this rate.

The Speed is 200 records/sec, no matter how long we run the test.

Time to hit Control-C and start testing the new function...


The New Function: using a DICT with cached values..

On the first run with the new function we found the same speed:

Click image to pop out more readable version...

No improvement yet. It added 100 RTs and was still only processing at 200/sec. 

But the DICT now contains 97 elements, this is the start of our "cache"... 

The record-count showed the total of NEW records now 257, some 34 records were added in this round.

Let's now hit enter a few more times and get to 10-runs with the new function, and as we run more records, the DICT fills up and the cache-effect starts to work:

Click to pop out more readable version..

Now we have a speed of 500 records/sec and the DICT now contains 637 records (out of a potential maximum of 1000). And the number of RTs per run-of-100 is down to about 30 per testloop.

As we run more+more test-loops of 100 records, most of the 1000 potential names end up in the DICT, the number of RTs needed decreases, and the speed in nr of records/sec goes up steadily...

Click to pop out larger, readable version

At the 25th test-loop, the DICT holds 902 values, and per test-of-100 we are down to about 10 RTs. The measured speed has gone up to 1797 records/sec: that is more than 8x faster than the original function.

This Cache-Mechanism Works !


Reflections...

The local cache will Not Know about Deletes or Updates. But for most LOV- or parent-table records, deletes are unlikely. And from the nature of this data: the SRC_NAME is not likely to change over time (it is effectively the Alternate-Key).

As always, I do not like to add additional code to a system: There is enough "maintenance" already without me adding smart-tricks... Weigh the benefits carefully in your own situations.

I do not know (yet) how efficient the DICT-lookups are on large(r) sets. But I assume a local operation (inside 1 process) is generally much more efficient than a call-out to a (remote) RDBMS. Still something to keep in mind, maybe check in future tests.

I don't rule out that certain DataFrames also solve this problem. I hope they Do. But I'm not yet sufficiently fluent in DataFrames to comment much. Maybe Later.

Alternatives.... Use a Local Database or a file?  Someone suggested to copy relevant LOV-data into a local store, either a SQLite, or some copy-file type of cache. I would hesitate to do this, but it may be an option. This is not uncommon in systems that deploy Microservices.


Summary, Wrap-up.

The caching mechanism worked. 

And again, the (evil) impact of RoundTrips is demonstrated.

(In This Particular system, YMMV!)

By eliminating calls to and from the database, we reduce the workload on the python program and on the database.

The python-program does not have to call / fill / run statements and wait for the returns.. 

This frees up time and resources for other work. => Win.

The RDBMS no longer gets 1000s of "identical and near-useless" calls, and it does not have to use its precious processing power to serve those merges anymore.

This frees up resources at the RDBMS side for other work too. => Win.

Needless to say, if you have your software distributed around servers or over datacentres at distance, the impact of Latency and RoundTrips is Even Bigger. 

I said it before (+/- 1998): Chattiness is the next Challenge in IT.


-- -- -- -- -- End of this blog, for Now -- -- -- -- --

Appendix A: links to sourcefiles.


tst_merge_src.sql : the test table (from tst_item_labels.sql from earlier blog)

tst_merge_src.py : the testdemo python code. (it needs some imports!)


The program imports the following: os, sys, array, random, time, datetime and dotenv. 

And of course the oracle-driver: python-oracledb.

All of which are either standard in your python-installation, or can be installed with pip3.


And I use some helper-files, you'll need those for import:

prefix.py : function pp(*argv), prefix the stdout lines with file + timestamp

duration.py : the stopwatch-utility I use.

ora_logon.py : functions to logon to database, and to report data from v$mystat

.env : edit this to include your scott/tiger@orcl, dotenv will read it.

-- -- -- -- -- End of this blog, for Real -- -- -- -- -- 

Wednesday, 30 July 2025

oracle23ai and python - eliminate RoundTrips.

TL;DR: Searching for RoundTrips to eliminate between python and Oracle. In this case, we "collect" the individual inserts to do them in a few large "batches". The result was Surprising, in a Good Way.

And then... Some pieces of program I would rather not have to write (re-constructing the wheel...). But Hey, it wasn't too hard, and it Really Helped.

Old Lesson (re)Learned: Row-by-Row == Slow-by-Slow.

Image: Several Cities have an efficient metro-train running next to an ever-jammed highway. I remember notably Chicago. But the view from the train from AMS airport to Utrecht is often very similar.


Background: Processing data (outside of the RDBMS), and TAPIs

Some systems want their data "processed" by python or other tools that are not part of the Oracle RDBMS (yet). Even if I think that taking data to and from the Database is generally not #SmartDB (link to asktom), sometimes this needs to be done. 

But when every individual record (ins/up) becomes a RoundTrip, you will notice.

Luckily, a lot of our tables already have rudimentary TAPIs (TAPI = Table Application Program Interface). And some of these TAPI-functions caused a lot of those infamous Round-Trips.


TAPI - a Good Concept - until it is not...

The concept might be rather old (80s, 90s?), but it still serves. You'll find similar concepts in ORM-frameworks like JOOQ and Hibernate (links)

In our case, our python code will generally handle the creation (insert/update) of a record in a separate function (f_ins_item... ). These TAPIs will typically handle things like: 1) ensure there is a parent-object, or maybe create one,  2) and handle MERGE-functionality when required to prevent insertion of duplicates. 3) verify (or create) the necessary metadata or lookup-data.

This is a Good Idea, as it centralises the logic for tables in a few, easy to find, functions.

Most of these TAPI functions do their Good Work quietly in the background. In our cases, the "create/update of a record" is not the most time-consuming activity of a program, but rather the outcome of a much longer process. Not a noticeable bottleneck. Mostly.

But all of these TAPIs are single-record functions: they act on 1 record at a time. And when processing large numbers of records, that TAPI-function and the round trip(s) it does can become a time-consuming activity. 

And "network" is a funny resource-consumer: you end up with both Application and Database doing seemingly "nothing" until you know where to look (In our case: AWR and application-logs-to-stdout, but this story is not about the "diagnose" it is about the "Fix").


TAPIs causing round-trips - Too Many RoundTrips.

As it was, the most executed statements on a particular system were the Inserts of "Details" in the datamodel: Records at the fringes of the ERD-diagrams that would receive 100s or even 1000s of records as+when details become known about an item (e.g. the "generic datamodel strikes again", different topic...).

The nature of those TAPIs is 1-record-at-a-time. And sometimes that hurts. From application-log (lines with epoch-time printed to stdout) we could learn that the insert-function was often called, and was a time-consumer. The RDBMS had some notable INSERTS as "High Frequency" (top of list for "ordered by executions"), but not as notable inefficient or resource-consuming statements. 

The whole picture of a slow-program, a relatively quiet RDBMS, and the AWR-numbers about executions and RoundTrips, was enough to warrant a little test.

What if we could "batch" those inserts and prove that 1-row-at-a-time was really the inefficient part of the system ?


Test: Catching the insert of a record...

For test-purposes, I simplified the table like this (will put link to script below):
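A hedged sketch of the simplified table (the real DDL is in the script linked below; the table name src_item_label appears later in this post, but the column-list beyond ID and CREATED_DT, and the referenced parent-tables, are my assumptions):

```python
# Hypothetical, simplified DDL -- see the linked .sql script for the real one.
ddl_src_item_label = """
    create table src_item_label
    ( id          number generated always as identity primary key
    , created_dt  date default sysdate not null
    , item_id     number not null   -- FK to the item (assumed parent)
    , src_id      number not null   -- FK to the source (assumed parent)
    , label_text  varchar2(2000)    -- free-format-ish payload
    )"""
```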

The ID and CREATED_DT get generated by the RDBMS on insert. The FKs are self-explanatory. The Real-World case is a record of some 20 columns with a few more constraints (think: optional columns for dates, times, intervals, validity, valid lat/long, various field- and record-level constraints that can take some DB-CPU to validate, but never much....). And the (average)size for records varies between 500bytes and 2000bytes, depending on the item, the source and the label.

The Original insert looked (simplified) like this: 

We have a function that can be called whenever a "label" is detected for an Item. The properties of the label need to be stored, with a few FK-references and various detail-properties that go in the columns or (free-format-ish) fields of the record. 

Needless to say the FKs must exist for this to work. Cases where the FK-parents may have to be inserted are more complicated. And checking of various "validations" for columns can also take code + time. For the sake of demo, this is a much-simplified example.

Notice this SQL-statement is watermarked as /* t1 ...*/ for easy spotting in tests. Watermarking can also be Very Useful in deployment. Just saying.

Also notice: this function does Not Commit. The TX-logic is done elsewhere.

In short, several INSERT statement of this nature were The Most Executed stmnts from our problem-programs...


Buffering in a list, the Simple Concept.

The fact that all(?) inserts in the original version go via a single function is a Great Start. All we have to do is "catch" the inserts, collect a sufficient number of them, and then send those to the RDBMS as a single statement using something like cursor.executemany (see this good example in the docu..)

In pseudo code:

That pseudo-code speaks for itself: store new records in a list (of records), and insert them when you have a significant collection. The Inspiration came partly from what I knew about RoundTrips and previous programming effort. And from a Very Readable example that can be found in the python-oracledb doc on "Batch Execution" (link).

Two main things to check in this concept: 1) Do Not Forget to check and insert any leftover items in the list before program commits or exits. 2) Avoid errors with an Empty-list, e.g. when no records at all are in the list, stop the function from throwing an error.

Other than that: Piece of Cake, right ? 

Note: Python-adepts may recognise this as "using dataframes". Very Similar. Except that at this point, I use self-coded lists for simplicity and demo. It is quite possible that our dev-team will, over time, adopt some form of data-frames (Pandas, Apache-PySpark) in future. You can lead a horse to water.... but Maybe Later.

Late-Edit: As I am writing this blog, Christopher Jones is writing about DataFrame support in the latest python-oracledb release. Check this!


Let's put it to Code:

Note: the complete, hopefully runnable, program and scripts are linked at the bottom of the blog...

We start by defining some necessary (global-)variables and constants:

The implementation will need a (global-) list to add records: itl_list

It needs a length at which to do the inserts and re-initialize the list: itl_list_max_len

And we have the SQL to do the work: itl_list_sql_ins

The SQL-statement does not have to be defined global, but putting it here cleans up the def-function code. In practice, having the SQL inside or near the function can help with coding and code-reading. You choose whatever is convenient for you.

With this in place, we can re-write the insert-functions, in two parts: First the function to catch the records:

This "insert" function no longer interacts with the RDBMS, instead it appends the records to the (global) list. 

In practice, there might be additional logic to check the "validity" of the record before adding it to the list. I've left that out in this example for simplicity.

But the add2list Does check for the size of the list. And when itl_list_max_len is reached: it calls the function to insert the records from the list and to reset the list.

The function that does the actual insert looks like this:


The special case is an empty-list: Do Nothing and return 0, no records inserted.

If there is data in the list: then insert it. 

Note that in this example, we do not (yet) check/re-check the validity of the data before handing it to the cursor. Any serious data-anomaly could throw a nasty error.

As return-value the function reports the actual number of rows processed by the cursor, assuming that it was the nr of inserted records.

This code "compiled and ran" and all seemed Well..

So Far So Good. But did it Help ? 


Let's Race It....

To compare, I pasted together a program that will do two loops of n_sec. One loop of original, individual inserts. And another loop of n_sec of list-buffered-inserts. Let's see what comes out faster...

Note: On early testing, I started with n_sec = 120sec of inserts. The Array-insert was so fast it threw an error: ORA-01653. Good Start. I adjusted the timings downwards a bit...

So I have two while-loops that each try to insert "records" as fast as they can for _only_ 10 seconds. 
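The timed loop can be sketched like this (a hedged stand-in for the actual test-driver; the insert-call is a parameter so the sketch runs stand-alone):

```python
import time

def f_run_loop(n_sec, f_ins_one):
    """Call f_ins_one() as fast as possible for n_sec seconds;
    return the nr of calls (= nr of records offered)."""
    n_loops = 0
    t_end = time.perf_counter() + n_sec
    while time.perf_counter() < t_end:
        f_ins_one()
        n_loops += 1
    return n_loops
```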

The first loop does 10 seconds of "individual inserts", it uses the original call for each individual record. The results to stdout looked like this:

Notice the number of loops (records): 2109, or 210.9 records per sec. Then notice: 2113 RoundTrips (minus the 4 from program-startup). Yep, 1 RT per record. Reporting the stats and the commit will add 2 more RTs, and bring the total to 2115 RTs before the next test starts.

(Also notice, I sneakily eliminated hard-parsing to beautify my results a little... )

The second loop does 10 seconds of append-to-list, with the occasional insert-into-table when the list gets to 1000 records. Stdout report looks like this:

Wow... That 10sec loop inserted 339.339 records....? A factor of 150x more. No wonder the earlier (120 sec) test hit my tablespace-size-limit.

First verification: count (*) in the database, Yep: over 600.000 records (there was a 1st run to eliminate the overhead of hard-parses...). Looks correct.

Second verification : the nr of RoundTrips. Those 339.339 new records, at 1000 records per execute, would have caused 340 RTs.. The reported nr of RTs is 2455. And minus the previous 2115 RTs, that is ... 340 RTs. That seems to concur Very Well.

Final check: V$SQLAREA (and counts) after two(!) runs of the program:

The Statements show up in the Shared_pool, and look at those numbers:

The individual-inserts /* t1 indiv */  have consumed about 273 microseconds per execute, for 1 row per execute, or 273 microsecond of DB-time Per Row.

The list-insert, marked /* t2 list */, with 1000 rows per execute, has consumed 12,854 microseconds per execute, but with 1000 rows per execute, that is Only about 13 microseconds of precious DB-time consumed per Row. 
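A quick back-of-envelope check of those V$SQLAREA numbers:

```python
# Numbers as reported above: microseconds of DB-time per execute.
t1_us, t1_rows = 273, 1        # /* t1 indiv */ : 1 row per execute
t2_us, t2_rows = 12_854, 1000  # /* t2 list  */ : 1000 rows per execute

us_per_row_indiv = t1_us / t1_rows   # 273 us of DB-time per row
us_per_row_list  = t2_us / t2_rows   # ~12.9 us of DB-time per row

# roughly 21x less DB-time consumed per row:
print(round(us_per_row_indiv / us_per_row_list, 1))
```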

This Thing Rocks!


Some sobering thoughts...

This is essentially an Old Lesson (re)Learned: Row-by-Row = Slow-by-Slow. We Knew this since, ah.. 1995.

The First Thing we missed with the new function was the RETURNING-ID. All of our TAPI functions so far return the ID (primary key) of newly inserted or merged record. In the case of bulk-detail-records that is not a problem. But for inserting new meta-data, new lookup-data or otherwise data that is needed for further processing, this can be an obstacle. In our case, we will only build list-inserts for records where we do not need that return-value. Minor problem so far...

Validation of records by the RDBMS, e.g. Constraints inside the Database, can be more complicated on bulk-inserts. Bulk-error processing is possible, but not always simple. With individual records, it is easier to catch errors with try ... except blocks. In our case, there are records where we don't want bulk (yet) for this reason. You decide how important this is to you.

Extra code means additional dependencies and additional (future-)maintenance. Especially difficult to argue when written to "work around a problem" rather than to add something (functionally-)useful to a system. In this case, I had to write two functions to replace the original (TAPI-)insert. And future programmers/users need to take into account that the leftover-data in the list needs 1-more-insert to clear out.

For this example: Someone will forget to purge the array at some point, and (inexplicably) lose the contents of the last batch...

I would recommend to only apply this trick when you Know it is going to make a Big Difference, and when your team is capable of understanding the additional (brain-)work.

Alternatively, you can search for existing solutions. Certain DataFrames for python may already solve this problem for you. The pandas-dataframe (link) looked promising, but on first-search it did not provide exactly what we were looking for.


Further items to explore..

Record-types. The equivalent of what PL/SQL has: EMP%TYPE could help in defining a structure to hold data. It could make the lists easier to manage and it can do some data-checks before adding data to the list. It would reduce potential for errors on the actual insert. Maybe Later.

Data-Frames or similar toolkits might "do the work for us". For java there are JOOQ and Hibernate. Python has several DataFrame options, such as Pandas and Apache-pySpark and some of those may have potential. Maybe Later.

Geeky: How big can the array be (in nr-records and/or in memory-footprint) before it shows signs of deterioration? For the moment, any number above, say, 100 will clearly benefit the system by reducing RoundTrips and overhead. But is there an optimum or some upper limit? Maybe Later.


Summary: Batch-Processing (array-processing) Works !

From this (over-simplified) testrun, I want to point out the Two Main Benefits of Array-Processing:

1. The program managed to insert 150x more records in the same 10sec interval. That is a clear increase in capacity for the app-program. The biggest benefit is in reducing the overhead, the call- and roundtrip-time per record.

2. The consumption of DB-resources on the RDBMS-side is much more efficient as well. Because the RDBMS can now handle "bigger chunks" per call, it spends less time on the same amount of "ingest data". This Benefits the RDBMS as well.

The numbers on this test Really Surprised me. Again. Despite using systems with only single-digit-millisec latency. And I knew RTs were costly, I've seen + fixed this kind of problem before. But I didn't expect the difference would be This Big.

This problem is as old as the app-rdbms dichotomy. And yet we don't seem to learn.

This test in particular also illustrates how un-necessary RoundTrips can slow down Both the App and the RDBMS: not just by losing time waiting for latency, but also from the incurred additional processing. 

RoundTrips are the Next Important Challenge in IT...

This test demonstrated: Both components, the Application and the RDBMS, Gain from batch- or array-processing and reduced RoundTrips.


-- -- -- -- -- End of this blogpost, for Now -- -- -- -- --


Appendix 1: links to Scripts.

tst_ins_labels.sql : set up the datamodel.

tst_ins_label.py : the demo program with two while loops


You need following "importable files" to run the tst_ins_label.py program:

ora_login.py : the login utility (uses dotenv) and the session-info.

prefix.py : contains pp ( *argv ) to print to stdout with timing info.

duration.py : my stopwatch, time-measurement for python programs.

.env : used by dotenv to store + get credentials and other info.


And to verify the results, you can use:

tst_2arr.sql : view SQL in shared_pool, and count records in src_item_label.


Feel free to copy or even re-type. 

Re-use of code is often a myth, unless you have typed or at least modified the code yourself.

If you have suggestions for improvements: Let me know. I'm always interested in discussing these things if time+place allow. Preferably in the margin of a Good Conference, or on one of the social channels (bsky, slack, email...).

-- -- -- -- -- End of this blogpost, for Real -- -- -- -- --