Thursday, 17 July 2025

Oracle 23ai and python - Beware of Latency and Roundtrips

 TLDR: While investigating the combination of Oracle23ai and python, I stumbled across some test-code and findings that may be useful to others. Lets start with the Latency.

Aim of this exercise was to find "Roundtrip and Latency Overhead" for a given connection or program. Chattiness is damaging with high-latency connections.

Chattiness: Comic-character Achille Talon is Chattiness in Person (google him!)


Background: running python code.

In general: this "cloud-only" industry needs to start understanding Latency.

Most (all) of my python work is just plumbing to get data to/from the RDBMS. But in having some python jobs "deployed" into containers in various locations, I have to either prove or disprove that "Bandwidth" or "Latency" are problems. 

This turned into an obsession with Timing (link to previous blog), and especially Round Trips.

Spoiler: it is not often Bandwidth, it more often Latency and Chattiness... If you do enough round-trips, that latency adds up.

I've started carefully by measuring some response-times, and trying to get a grip on how latency (expressed as time, mostly milliseconds or microseconds).

(turned out to be more than one blog-story to tell. stay tuned..... ) 


Why measure all this..

If I can get there, my aim would be twofold:

1. Measure and Report the (average..) Latency or RoundTripTime for a connection between a specific app running and an instance.

2. Measure and Report the nr of Roundtrips for a given program, depending on the amount of data to process.

Knowing both values will help to decide on placement of components: how big is the impact of reduced latency, and help to predict runtimes. But these measurements should also stimulate coders to try to reduce round-trips.


Caveat: This is still Simple and very Basic tooling.

In the examples below I will use very basic, straightforward connections. Most of my tools are stand-alone and do not have (and do not want) the luxury of the Oracle Client Library. And I do not use (do not need...) connection-pools.

As I was (still am!) a python n00b, some of my code may seem overly simple to more experienced users. Quite often I discover that "my problem has been solved by someone else, 10yrs ago...". Feel free to let me know where I need to improve.


Connecting and first measurements.

To test things out, I used a copy of do_sql.py (links below) from an earlier blog, and some utilities I constructed for use with python to ease login and to report some statistics on the connected session. I defined some funcions:

ora_logon (): returns a connection (or kicks out an error if not...)

ora_sess_info ( some_conn ): shows metrics from the connection.

pp ( args* ) :  print output, prefixed with the sourcefile-name and a timestamp

In the my (test-)program, do_sql_mtrcs.py, I will first use them to do a Logon, e.g. open a connection, and report the metrics. Start of the program looked like this:

The first version of the program does nothing yet, it just logs on and reports session-stats. By reporting the ora_sess_info 3 times, I get an idea of how much overhead each step is. Mind that I have already optimised the ora_sess_info to do just 1 (One) RoundTrip and 1 sort.

The result on stdout looks like this:

The timing-info in the log-lines gives us the approximate times for each step. 

Notice that "logon", e.g. opening a connection (in my case thin-mode!) seems to take 3 roundtrips, 10 executes and 4 user-calls. Subsequent call to ora_sess_info () seem to take up 1 roundtrip, 1 execute and 1 user-call each. I did also verify that later by running the 1000s of those calls against a near-silent instance and check in v$sysstats and by running an AWR-report.

So now I know the overhead of my calls (and I put them in comments in the python source-code). In a later version, I might add more to "Measure + Report" information on a connection.

Caveat: Those overheads should be verified for other-versions and future-versions of the driver and of the database. In my case I only use the thin-mode at the moment.

Now the timings: 

For logon, the call to ora_logon() : 2.635146 - 2.485878 = 0.149268 seconds... Say 0.15 sec for an action that we know involves 3 RoundTrips.

The approx time for each call to ora_sess_info() is about 0.05 sec (50 ms...) for what we know involvers 1 RoundTrip.  

First rough indication: a RoundTrip is somewhere below 50ms, and probably much less because each of the calls involves the roundtrips Plus the actual query on the RDBMS, Plus the output to stdout and the overhead of the python-function calls.

How can we measure this better ... ? 


Ping to the Database

The python-oracledb (link) contains a ping function, which is documented to Explicitly do One Round Trip. In the same docu-page, you find the is_healty() function which does Not do a roundtrip, and does perform a full connection check.

I decided to do a test of 3 cases, of 30 seconds each:

1- Running an empty loop to see how much time my python code takes. Count the loops and find time-per-loop.

2- A call to is_healthy() to see how much time that would take (no roundtrip!). Count the loops, and find time-per-call.

3- A call to ping()to generate real real RoundTrip, count the loops and find time-per-RoundTri (and substract the time-per-loop to counter for python-overhead)

The code looked like this:


This function contains 3 while loops that will repeat an action for n_sec and report on the time + time-per-call.

I called the function with n_sec set to 30 seconds.. And the outcome was: 

The time per empty-loop was 0.000314 ms, e.g. less than a single-microsecond of overhead because of the while-loop.

And the avg time for a ping was 1.53 ms. The outcomes of several runs were consistent to 1 or 2 digits (on this system!).

Also check the nr of round trips reported by ora_sess_info () : 15156, which is exactly 3 more than the nr of ping-loops reported. This concurs with the fact that we noticed earlier: logon took 3 RoundTrips. We can conclude that both python (the n_count) and the RDBMS (from v$mystats) reported the same number of RoundTrips happening. And AWR confirmed this separately by reporting a similar number of RT and UserCalls on testing this program.

Confirmed: Each of those ping() calls did 1 RT and 1 UserCall.

The average time for a call to is_healthy() was 0.000680 ms, still less than a microsecond, very short compared to the ping and closer to the "empty loop", hence my impression that this function is efficient but definitely does not do a RoundTrip, e.g. not a complete Health-Check for a connetion (?).

At this point I have an impression that the RoundTrip to the database for this combination of app-platform and location is about 1.5 ms


Wrapping up: Roundtrips  x Latency = Time in the Nework

We can count RoundTrips from V$MYSTAT, V$SESSTAT or by analysing AWR. We have established a way to measure Latency between application and RDBMS from connection.ping(). It was approx 1.5ms on this system.

Once we have these numbers, we can start guesstimating the impact of network-changes or application-moves. 

A program that needs 1000 roundtrips to do its job, in a system with latency of 2ms, will loose 2 sec of real time just in the network. Any increase in Latency.. or a higher nr of RoundTrips will impact time needed by the program. 

By regularly measuring and reporting those numbers, we keep an eye on the impact or our network and the Quality of Service.

Notice that we hardly did any real queries yet. Next blogs will add a few more insights in the traffic between app and RDBMS.


-- -- -- -- -- End of this blogpost, for now -- -- -- -- --


Appendix: Links to scripts.

do_sql_mtrcs.py:  logon + get 1st metrics (.. it can do more, Later)

rt3_ping.py : try to measure the RoundTrips.

ora_login.py : utilities to logon to database and report v$mystats.

prefix.py : utility to prefix print with filename + timestamp to stdout.

.env : edit this file to include your own scott/tiger credentials.

You will also need python components : python-oracledb and dotenv, which you can pip into your system.


-- -- -- -- -- End of this blogpost, for Real -- -- -- -- --

No comments: