Tuesday, 15 July 2025

Oracle23ai : Measuring time more precisely, and observing my Servers.

 TL;DR: In a few old scripts, it was time to replace sysdate with systimestamp, for very geeky reasons. The reason to revisit these old scripts was to increase precision and to adapt them for use together with some python-code - More on that Later.

Oh, and I'll show how to check/verify the CPUs in a system (with a very old but simple trick).



Background: some old scripts, time to upgrade.

Since way back, Oracle7, I had two scripts, and two similar functions: spin.sql and sleep.sql. The sleep.sql is evident: put a process to sleep for N seconds. The spin.sql would do the opposite: make the attached connection run full-speed, thus generating N seconds of load on the database server. Why these scripts... you'll see.

Note that I first created those as runnable-scripts, but very soon also as select-able functions for easy use in statements. I'll demo some of it, and I'll put the sourcecode in links below.

For "sleep", having a script provided very flexible: in cases where the execute-grant on DBMS_LOCK was not given immediately. It may even have started with just a host-call to the OS sleep program, hehehe. (Dear DBA, can I would like to have the following GRANTs..., approval could take a few weeks...).

But in the back of my head was always the nagging knowledge: those scripts are not very exact. We Can Do Better...


Running the "old versions" of the scripts...

The lack of precision was notable on very short calls, and was often evident when I used set-timing-on:

You can see me running the script-versions of each. The sleep 1 and sleep 1.1 calls provide an elapsed time nearly Exactly equal to the specified nr of seconds. DBMS_LOCK.SLEEP is documented at 1/100 sec precision, and at the time that was fine.

But spin.sql, when called with various values, provides elapsed-times that are often less, but sometimes more (?) than the specified value.

And this is probably because start and end are measured in Whole-Seconds taken from SYSDATE. So if you started a script or function at the tail-end of a second, you would lose the starting-part of that second. This was Oracle7, remember.

But for spin.sql, or the call to the spinf() function, the actual "elapsed" time was never exactly the asked-for time, due to rounding errors on the "Seconds" of SYSDATE.

Also note: For the script-version of sleep, I would insert a host-sleep call if it was too complicated getting the grant on DBMS_LOCK from the DBA (sigh..). The actual precision from a linux sleep call would be around 1/100 sec as well, often less precise due to various overheads. Still, at the time, that was good enough for me.

And normally precision was not a problem, as typical usage was for intervals of many seconds or minutes. Still, we could do better. And with the use of python came the time module (link). Here, I suddenly got a system pretending to be nanosecond-precise with calls like time.time() and time.perf_counter_ns(): Nanoseconds...!


Systimestamp: FF9 measures down to nanoseconds (??).

Using SYSDATE sort of shows the age of those scripts. And I didn't mind, I never had much need for better accuracy than 1 Second anyway, sleep or spin. But still.... Since Oracle 9, there was SYSTIMESTAMP, which could be reported up to 9 decimals (e.g. nanoseconds...). Plus, in Oracle-18c, there was the more accessible sleep-function in sys.dbms_session instead of the old DBMS_LOCK (soon to be deprecated!). It was time to upgrade my scripts.

For starters, here is how I got the "epoch" out of my RDBMS:


(nb: geeks can now see or find-out when I wrote/tested this blogpost...)
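For reference, a minimal sketch of one way to pull epoch-seconds (with fractions) out of the database from python. The connection details are placeholders, and my epoch.sql (linked below) collects a few more variations of the query:

import time
import oracledb   # the python-oracledb driver

# placeholder credentials - in real life these come from dotenv (see later posts)
ora_conn = oracledb.connect ( user="scott", password="tiger", dsn="localhost/freepdb1" )

# epoch-seconds, including the fraction, straight from SYSTIMESTAMP
sql_epoch = """
  select extract ( day    from diff ) * 86400
       + extract ( hour   from diff ) * 3600
       + extract ( minute from diff ) * 60
       + extract ( second from diff )        as epoch_secs
  from ( select sys_extract_utc ( systimestamp )
                - timestamp '1970-01-01 00:00:00'  as diff
         from dual )"""

with ora_conn.cursor() as cur:
    db_epoch = cur.execute ( sql_epoch ).fetchone()[0]

print ( f"db epoch : {db_epoch:.6f}" )
print ( f"py epoch : {time.time():.6f}" )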

So with an accuracy of microseconds, I could improve on the exact timings of those scripts ? 

And were these scripts of any use at all ?


Use-cases: CPU speed and nr of CPUs available.

Use-case 1: compare CPU-Speed.

Way back, the use of spin.sql was mainly play-time. See how I could "load the server". But very soon, the first Two serious Use-Cases popped up: 1- Get a feeling for the CPU speed by counting the nr of loops/sec. And 2- Verify the nr of CPUs in a system.

For a newly arrived Exadata, to replace our VMWare serverpark, we got asked: how much faster would this make "our application" ? The app in question was 92% or more just DB_CPU:

This application was Not doing typical database work and it was Not IO bound. It was purely CPU-bound. But a jump to the next screen would take just over 30sec, all of it CPU-work on 1000s of function calls (don't ask..., I presented on this "performance problem" in 2015 or so...).

So on the old system, we ran spin.sql for 30 or 60 sec, and let it report the nr of (empty) loops, just to get a feeling for the speed. Then we did the same on the Exadata, and we got about 3x the nr of loops per sec. Hence our prediction was: this will _only_ give you about a 3x increase, but not more than that. And we were surprisingly close to the mark.

Use-case 2: check nr of CPUs

The other real-world story goes back to +/- 2008: a customer had a new Sun M-series. The system was given on trial, and supposedly had 24 CPUs. Of course, I was curious how fast it would be, and how it would stand up to real Load. So I ran spin.sql and got back a nice, large number for loops/sec. And I could quickly fire up 12 or more (putty + nohup) spin-scripts against the system and try to make it blow steam out of the serverroom.

With a 24-CPU system, I had expected my 12 scripts spinning away to load the machine to about 50% (or more). To my surprise, it only loaded the system up to about 40%. Either this system throttled my jobs, or we had more CPUs...

Testing with 4 spinning scripts: about 13% load

Testing with 8 spinning scripts: about 27% load... I could see a trend.

When I showed this to the sysadmin, he was surprised, but went off to do his own verifications. About 30min later he came back: Don't Tell anyone, but we have 32 CPUs in the box. They probably forgot to "disable" 8 of them... He went back to his cubicle a Very Happy Man.

Note that with many systems having the cpu_count set to something lower than the actual machine-capacity, this trick has become of limited use, and your test will get strangled by "resmgr: cpu quantum" (if your DBA was vigilant..)

But at the time, I think Oracle version 10.2, I was kinda Surprised at what my little playful scrippie had unearthed...


Now for Modernising... Functions, and more Precision.

Since python has a set of time-functions (link) that pretend to do nanosecond precision, I wanted to try and pair my Oracle-scripts to that. Although most Oracle implementations seem to stop at microseconds (giving fractions of Seconds as .123456000, where the trailing zeros are the would-be nanoseconds).

So I considered it time to rework my scripts to do something similar. I started by creating functions rather than scripts. From the code of the existing scripts, I created the functions spinf( n_sec ) and sleepf( n_sec ). This meant I could also put "spin" and "sleep" into SQL queries. A bit similar to what a competing RDBMS also does (link? nah..).

The sleepf remained at Centiseconds for the moment, as both DBMS_LOCK.SLEEP (soon to be deprecated!) and DBMS_SESSION.SLEEP are documented to be up to 1/100 of a sec.

The resulting functions are defined in the sourcefile mk_spinf.sql (link below), and they can be used in queries. Here are some examples for spinf:

Notice the results are still "inaccurate" at low values because of the use of whole seconds (from SYSDATE accuracy). The request to spin for 1.1 seconds resulted in a spin-duration of 1.5 seconds, and the resulting loops/sec is estimated too high.

Of course, we can have the imprecision "averaged out" by using longer intervals, as was the habit with the old code anyway. But still...

So the next version of the function was spinf_ns, where the suffix _ns would suggest nano-second precision, similar to the same suffix in python. Let's run it a few times..

Firstly, we did not get down to nanosecond precision, not yet. But I kept the suffix anyway, because it looks good, and it is python-y.

But the actual elapsed times are quite close to the 1/100 sec precision now. And that is much better than before.
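For comparison, here is a rough python analogue of the same spin-idea, driven by time.perf_counter_ns(). This is just a sketch, not the PL/SQL from mk_spinf.sql:

import time

def spin_ns ( n_sec: float ) -> float:
    """Busy-loop for roughly n_sec seconds, return the nr of loops per second."""
    loops   = 0
    t_start = time.perf_counter_ns()
    t_end   = t_start + int ( n_sec * 1_000_000_000 )
    while time.perf_counter_ns() < t_end:
        loops += 1
    elapsed = ( time.perf_counter_ns() - t_start ) / 1_000_000_000
    return loops / elapsed

print ( f"{spin_ns ( 1.1 ):,.0f} loops/sec" )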

I can now more or less control, specify, how much time I want a session to consume. I may use that to do some measurements later (future blog-plans...)


Similar with sleepf( n_sec ): allow for short powernaps, and verify.

While at it, I also created a function sleepf( n ), where n can give you a precision of up to 2 decimals:

And voila, now the actual sleep-times are very close to the specified ones. You never know when it might be useful to slow down a query.


One more tool: Measure what my session did...

Just to top it off, I re-visited another script as well: mytime.sql (link below), to report on the time spent inside the session. The view V$MYSTAT is a very nice way to see what Your Session actually Did. I've used it to see how much DB-time I have consumed (and roundtrips, and executes, and user-calls).
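The real mytime.sql is linked below; as a sketch, the python-equivalent of the kind of query it runs (assuming an open python-oracledb connection ora_conn) could look like this:

sql_mytime = """
    select sn.name, ms.value
    from   v$mystat   ms
    join   v$statname sn  on sn.statistic# = ms.statistic#
    where  sn.name in ( 'DB time', 'user calls', 'execute count'
                      , 'SQL*Net roundtrips to/from client' )"""

with ora_conn.cursor() as cur:
    for name, value in cur.execute ( sql_mytime ):
        print ( f"  {name:40s} {value:>12}" )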

Here is how you can combine spinf_ns () and mytime.sql:

The first run of @mytime.sql tells us the session has consumed 1,101 centiseconds of DB time (so that would be 11.01 seconds). I then run spinf_ns to add 8.97 seconds of "activity". Taking into account the expected overhead of the "measurement", I should end up close to 2,000 centiSeconds (20 sec).

And as you can verify: That Worked. 

QED.

 

Wrapping it all up, future plans...

I have improved on my existing toolset. I have more precise instrumentation, notably to match the python tools (the ambitious _ns suffix). 

And I have a few ideas to put them to work by instrumenting my python- and PL/SQL code.

I plan to use some of this to investigate python-code, notably with regards to time lost in roundtrips. If I can get it all to work, and if it makes sense, I'll write up the results... (link to future blog here...)


-- -- -- -- -- End of this blog, for now -- -- 2 more appendices -- -- -- -- --


Appendix 1 : links to scripts.

sleep.sql : old script for sleeping N sec.

spin.sql : old script for spin, output nr-loops per sec.

mk_spinf.sql: create functions for sleep and spin.

dbtime.sql: report DB time consumed by the instance (since startup).

mytime.sql: report DB time consumed by the current session (since logon)

epoch.sql : my various attempts to extract epoch-seconds...

Feel free to steal and adjust for your own usage (the most re-used code is the code you write/edit/tweak yourself!)

And, Just Curious: how many loops/sec does your system get out of spinf_ns.sql ... ? Ping me the results if you ever run any of the spin-stmnts.


Appendix 2: More measurements: check the SLOB tool by Kevin Closson 

Over time, I have used spin.sql as basis for other scripts, notably to count nr-commits per sec and nr of Physical IOs. But because others did a much better job of determining these metrics, I've put those out of scope for now. Maybe later.  But you could learn a few things from checking the work of Kevin Closson (webpage-link) on the SLOB tool to benchmark Physical IO (github link).

-- -- -- -- -- -- End of this Blog, Really -- -- -- -- -- --




Monday, 14 July 2025

Oracle23ai and python - Just run this Query.

TL;DR: When I started to combine Oracle23ai and python, I wrote some Terrible pieces of code. One stands out as outright Dangerous (but Useful): How to Execute Any Select-Query...


Summary: Dynamic SQL, and Ad-Hoc queries.

I re-used the program victor.py (see blog last year) to run adhoc-queries: just give it a string of valid sql, and loop-print the results. Who needs SQL*Plus eh?

To copy this trick, you don't even need to download the file (links at bottom); you can almost type it faster yourself.

Up-front Warning: Do Not use this program (this trick) on Production-data. You will see why - and that could be worthy of a whole separate blog...



Background: Do Some Queries...

My first steps into Python were focussed on "fetching data", e.g. SELECT, preferably in some SmartDB-way. That data got handed to PyTorch (link) for further processing. Soon after came the "insert of results". All very Basic, nothing fancy.
A need for debugging-inspection of (large-ish) vectors resulted in the program victor.py to fetch+inspect individual vectors (see earlier blog).

And as I experimented with the (interactive) program for vector-inspection, I "discovered" I could use the same program for adhoc query+display of regular records. That provided a nice playground to experiment. It also turned out quite useful for situations where python-access to the data was the easiest, or even the _only_, way to connect to certain schemas (don't ask, talk to our CISO...).


Return results of a query, simple...

You probably discovered for yourself that this is the quickest way to verify a (select) query in python:

cursor = ora_conn.cursor()
for row in cursor.execute ( "select * from emp" ):
  print ( row )


You can even run queries "interactively" from the python-prompt, after you have opened the connection and defined the cursor:

>>>
>>> for row in cur.execute ( " select * from emp" ):
...   print ( row )
...
(7369, 'SMITH', 'CLERK', 7902, datetime.datetime(1980, 12, 17, 0, 0), 800.0, None, 20)
(7499, 'ALLEN', 'SALESMAN', 7698, datetime.datetime(1981, 2, 20, 0, 0), 1600.0, 300.0, 30)
(7521, 'WARD', 'SALESMAN', 7698, datetime.datetime(1981, 2, 22, 0, 0), 1250.0, 500.0, 30)
(7566, 'JONES', 'MANAGER', 7839, datetime.datetime(1981, 4, 2, 0, 0), 2975.0, None, 20)
(7654, 'MARTIN', 'SALESMAN', 7698, datetime.datetime(1981, 9, 28, 0, 0), 1250.0, 1400.0, 30)
(7698, 'BLAKE', 'MANAGER', 7839, datetime.datetime(1981, 5, 1, 0, 0), 2850.0, None, 30)
(7782, 'CLARK', 'MANAGER', 7839, datetime.datetime(1981, 6, 9, 0, 0), 2450.0, None, 10)
(7788, 'SCOTT', 'ANALYST', 7566, datetime.datetime(1982, 12, 9, 0, 0), 3000.0, None, 20)
(7839, 'KING', 'PRESIDENT', None, datetime.datetime(1981, 11, 17, 0, 0), 5000.0, None, 10)
(7844, 'TURNER', 'SALESMAN', 7698, datetime.datetime(1981, 9, 8, 0, 0), 1500.0, 0.0, 30)
(7876, 'ADAMS', 'CLERK', 7788, datetime.datetime(1983, 1, 12, 0, 0), 1100.0, None, 20)
(7900, 'JAMES', 'CLERK', 7698, datetime.datetime(1981, 12, 3, 0, 0), 950.0, None, 30)
(7902, 'FORD', 'ANALYST', 7566, datetime.datetime(1981, 12, 3, 0, 0), 3000.0, None, 20)
(7934, 'MILLER', 'CLERK', 7782, datetime.datetime(1982, 1, 23, 0, 0), 1300.0, None, 40)
>>>


That was a Nice Discovery: once I had a connection-object open, I could just test queries from the prompt! But it can be a bit awkward if you are typo-prone like me...


So, why I wrapped it into a program...

Once I found myself testing queries that way, I came up with the little program do_sql1.py. It will prompt for a query, and try to execute whatever SQL was entered. And display the result line by line (or an error...). The result looked like this:


As you can see, the program opens a connection (using credentials from dotenv, see earlier blog), and it prompts the user for input of a query.
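As a minimal sketch (a hypothetical reconstruction, not the actual do_sql1.py, which also does the prefix-printing and timing and is linked below), the core of such a program is just:

import oracledb

# placeholder credentials - the real program picks these up via dotenv
ora_conn = oracledb.connect ( user="scott", password="tiger", dsn="localhost/freepdb1" )

sql_text = input ( "enter a select-statement (one line): " )

with ora_conn.cursor() as cursor:
    try:
        for row in cursor.execute ( sql_text ):
            print ( row )
    except oracledb.DatabaseError as e:
        print ( "error:", e )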

So let's give it a query...: 

select * from dept order by dname

I typed the statement, and hit enter:


After typing a query and hitting enter, all on one single line (the input is read as a single line), I get ... results! And from the timing in the print-stmnts, I even get an idea of where the time is spent, something I could use later in troubleshooting.


[intermezzo] Instrumentation: filename and timestamp.

From the output above, you can see that I print the (source)filename and a timestamp in microseconds in front of every line. To do so I am using a self-coded module called prefix.py (link at bottom) which might also merit a mention in a blog sometime. Printing those timestamps sometimes reveals some of the inner workings (separate story, maybe later...). 
As I found: the time.perf_counter (link) from python tries to go to nano-second precision. Promising for future use...
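The gist of that prefix-idea, as a small hypothetical sketch (the real prefix.py is linked below):

import datetime, os, sys

def pp ( *args ):
    """print with the source-filename and a microsecond timestamp prefixed."""
    stamp = datetime.datetime.now().strftime ( "%H:%M:%S.%f" )
    fname = os.path.basename ( sys.argv[0] )
    print ( fname, stamp, *args )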


More Usage by using Stdin and argv[1]

I quickly discovered that I could use stdin to feed it pre-edited queries, as long as the inputfile was a one-liner. And I also started detecting sys.argv[1] and using that as the query to run. All still very primitive, but I found it usable for all kinds of debug-purposes.
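That input-selection logic boils down to something like this (a sketch, not the literal code from do_sql1.py):

import sys

if len ( sys.argv ) > 1:
    sql_text = sys.argv[1]                   # query given as an argument
else:
    sql_text = sys.stdin.readline().strip()  # one line from stdin (typed, or piped via echo)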

For example, typing:

echo "SELECT ename, sal from EMP order by ename" | python3 do_sql1.py


yields the results, to stdout, like this:


Looks familiar ?
And we begin to notice the time-differences between the output rows. The nr of microseconds for the first row is 945422 - 937535 = 7887 microsec. Most of the subsequent rows get fetched in less than 100 microsec (e.g. >10x faster), which is probably an indicator of work going on behind the scenes for that first row, e.g. a round trip ?



Wrapping up: Risky, but Much more Useful than expected.

Normally, I would be the first to say: What a useless- and Trivial program. 
Because as query-tools, SQL*Plus (link) and SQLcl (download link) do a much better job as adhoc- or Query-interfaces. 

Warning and Caveat - do I really have to say this Out Loud, again...:
This little program is very similar to doing "SQL-Injection" - Be Careful!

But I found myself using this do_sql1.py a lot while tweaking and troubleshooting python-oracle programs. 

For one, I could run this program from any python-container to test a db-connection.
It verifies: Does the connection go where you think it should? 
It does testing: And I could test what a certain query would return in That Particular Container. This is how I find out that a container is accidentally connected to the wrong database, or the wrong schema (yep... quite often!).
And when I finally started reading the documentation for oracledb (link), this program proved easily-adaptable to explore various settings and options. In the end, it proved much more useful than expected.

As always, I don't think many of you will copy it as-is. But I suspect some of you have similar tools, or will build similar tools. And often the best re-use of code is from the code you typed (or adapted) by yourself.

Have Fun ! 

-- -- -- -- -- -- End of this blogpost, for now -- -- -- -- -- -- 

Appendix: links to sourcecode

do_sql1.py : the actual program.
prefix.py : the module used to format + print output
ora_login.py : the module used to open connections, from earlier blog (link)

-- -- -- -- -- End of this blogpost, for real -- -- -- -- -- -- 










Thursday, 10 July 2025

Oracle 23ai, python, and please Go Easy on my system...

TL;DR: Running Python against an Oracle 23ai database is Nice, but Don't Overload. I devised a few ways to throttle long-running python processes. First trick was to (dynamically) inject sleep-calls. And then I refined that to react on the actual load on the RDBMS.


Summary: 

At first, I had an editable parameter-file to vary the sleep-time between loops. But then I refined it: The python program checks regularly on (average-)active-sessions, compares it to cpu_count, and takes a sleep when the system is considered "busy". The parameters to control this are imported by dotenv (link) and can be modified on the fly if needed. This allows long- or always-running jobs to retract to the background when system is busy, and to benefit from quiet intervals when they occur.

And by reacting to Avg-Active-Session and cpu_count values they automatically, dynamically, adjust to the load on the RDBMS, even regardless of the hardware underneath.


Background: Batches running python code.

Sometimes I have to build + run utilities for python-jobs that utilise the RDBMS. Some of those programs can run for Hours, if not Days, while doing loads/extracts/analysis/summaries... One request was to prevent Overload, e.g. to allow the python-jobs to "stop running" at times when other systems and users were busy with More Important Work. So in the end, I had to define and measure "busy" and act accordingly.

Note: I used to check with vmstat or sar (sysstat) to detect how busy my system is, or how much (percentage) idle I had left. Nowadays, I prefer to get my data from the RDBMS, for two reasons: 1) there are systems out there with SQL-only access (e.g. no linux prompt anymore...). And 2) systems nowadays will run CDBs with multiple PDBs and databases limited by the CPU_COUNT parameter. Your "server" may have plenty of capacity (or not), but you can only get 2 CPUs from it... (merits a separate story, try running the 23ai-free-version with the cpu_count parameter hard-set to 2 on an 8 cpu machine - anecdote later...).

Anyway, this all started very simple, with some calls to time.sleep ( n_sec ).




First Attempt: simple sleep( n_sec) in every main loop.

The long-running programs are re-startable, which is good practice anyway because sh*t^h^h^h^h hiccups happen. So when the background-batch-processing needed to yield to more urgent work, we would kill some jobs (stop the containers), and re-start them later. Crude but simple.

But that meant the processing often stopped completely, and sometimes we forgot to re-start...

Next up was: stop the program (or the container running it), edit a sleep-call into some main loop, and re-start it. The program would run at a slower pace. That worked. Sort-of.

But it still meant the process had to be killed and edited. And if left alone, it would either run slooow for the rest of the job, thus taking much longer than usual. Or it would need another kill+edit+restart when the system had more process-capacity. Plus, for some processes, the startup-overhead was significant. It worked, but it was still messy.


Next Attempt: dynamic sleep time, from an editable config-file.

Introducing: Throttle.py. This little module would read the dotenv-file at regular intervals and adjust the sleep-time to a value given in .env. 

The code to implement this can be found in throttle.py (link below). The module is meant to be imported, but running it stand alone will perform a test- and demo-sequence.

To "throttle" a running program, now we just had to edit the .env and put in a larger value for THR_SLEEP_TIME:

# throttle, sleeptime
THR_SLEEP_TIME=3

Note: The re-read interval for the dot-env file with the latest (editable) threshold was hardcoded at 60sec to ensure a relatively responsive system. The overhead of reading the file was considered minimal compared to 60sec. If you have to fetch this file from some remote location with significant time (latency), you can consider a larger interval.
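The core of the idea, as a minimal sketch (the real throttle.py, linked below, does a bit more and includes the test-sequence):

import os, time
from dotenv import load_dotenv   # pip install python-dotenv

_last_read  = 0.0
_sleep_time = 1.0

def throttle ():
    """Sleep for THR_SLEEP_TIME seconds; re-read the .env at most once per 60 sec."""
    global _last_read, _sleep_time
    if time.time() - _last_read > 60:
        load_dotenv ( override=True )       # pick up edits made to .env
        _sleep_time = float ( os.getenv ( "THR_SLEEP_TIME", "1" ) )
        _last_read  = time.time()
    time.sleep ( _sleep_time )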

We now had several advantages: No more kill+restart, but rather an easily-editable value in the config file (the .env). This worked Fine for a while. It was a bit like opening and closing a valve from a reservoir. So far so good.

But this trick still required manual intervention: ogling the system-load and editing .env.  Surely, there was a way to automate and improve on this.


The Database is the real, precious, Bottleneck.

The "load" on the system was, in this case, best monitored from the RDBMS, the Database. Hence, we needed some probe into that system to determine how much spare capacity it had to process our python jobs. 

And that is where AAS comes in: Average Active Sessions (link - Google turned up this rather old webpage, but the idea is actually older still). By using the statistic for DB-time, the database will tell you how much processing was done in 1 unit of time.

AAS = Delta-DB_Time (seconds) / Elapsed_time (seconds). 

So an AAS of 3 means that running processes consumed 3 seconds of time in a 1 second interval. Hence the Unit of Measurement for AAS can be expressed as Sessions or Processes.

Knowing AAS is nice, but what is my available capacity ?

Your AAS tells you how many processes (on average) are running. But you need to relate it to your CPU_COUNT to know how "loaded" your system is. Hence we calculate percentage-busy as:

PCT_BUSY = 100 * AAS / CPU_COUNT.

On a system where AAS=3 (e.g. 3 processes busy) and a cpu_count of 4, the pct_busy will be about 75 percent, hence there is only about 25 percent capacity, or one cpu, still idle. But on a system with 24 cpus, an AAS of 3 means it is only 100*3/24=12 percent busy and some 21 CPUs are still available to service more processes.

Using this simple pct_busy as a threshold would more or less adjust to large and small systems. So for the moment, we decided to stick with this easy "percentage" (see notes below on setting the threshold..).

So, let's build this...


Putting it to work: the python code.

First, to get the relevant info from the database, I needed a query.

And to determine delta_DB_time and delta_epoch_time, I needed two "global" variables to persist over time, to keep db_time and epoch.

Those definitions were simple:


The function to execute the code starts by reading the dot-env and fetching the relevant values  from the database.

Note that the parameter-file (dot-env) is read at Every Call to this function. When called too frequently (single-seconds...), the overhead can be significant. Hence the ora_aas_chk() should ideally be called at intervals of at least 30sec or higher.

Now that we have all the information, we can start the logic:


If it is the first call, and there are no "previous" values, we do nothing (pass)

Else, if there are previous values, we calculate the AAS and PCT_BUSY.

Once we have the values,  we can determine if a pause is justified. I've left the print(debug) stmnts in for the moment:

The Threshold determines if we need a pause (sleep). And at the end of the function we "set" the global variables that will be used to calculate the "deltas" on the next call. For good measure, we return the total time spent in ms.
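Put together, a minimal sketch of such a function could look like this. The variable names, the way the values are fetched, and the defaults are illustrative; the real code lives in ora_login.py, linked below:

import os, time
from dotenv import load_dotenv

g_prev_dbtime = None    # 'DB time' (centiseconds) at the previous call
g_prev_epoch  = None    # wall-clock epoch at the previous call

def ora_aas_chk ( ora_conn ) -> float:
    """Check pct_busy, sleep when above threshold; return the time spent in ms."""
    global g_prev_dbtime, g_prev_epoch
    t_start = time.perf_counter()

    load_dotenv ( override=True )
    threshold_pct = float ( os.getenv ( "ORA_AAS_THRESHOLD_PCT", "50" ) )
    pause_secs    = float ( os.getenv ( "ORA_AAS_PAUSE_SECS",    "10" ) )

    sql = """select ( select value            from v$sysstat   where name = 'DB time'   )
                  , ( select to_number(value) from v$parameter where name = 'cpu_count' )
             from dual"""
    with ora_conn.cursor() as cur:
        db_time, cpu_count = cur.execute ( sql ).fetchone()
    epoch = time.time()

    if g_prev_dbtime is not None:
        aas      = ( db_time - g_prev_dbtime ) / 100 / ( epoch - g_prev_epoch )
        pct_busy = 100 * aas / cpu_count
        if pct_busy > threshold_pct:
            time.sleep ( pause_secs )        # database considered busy: back off

    g_prev_dbtime, g_prev_epoch = db_time, epoch
    return ( time.perf_counter() - t_start ) * 1000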

If you insist on strictly "sleeping" your program until the pct_busy is below the specified threshold, you can use a while-loop: 

while ora_aas_chk ( ora_conn ) > 1000:
    # sleep and only continue when pct_busy Below Threshold.
    pass

Repeat the call until the return-value is below, say, 1000 ms (assuming your pause-setting is more than 1 sec, as it should be). Any return of more than 1 sec will signify we did pause-sleep. Hence, your database was considered "busy", and you may want to hold back your python-job  from processing.


The code: added to ora_login.py

The logical place, at the time, to put this code was in ora_login.py (link to earlier blog..). In hindsight, I should maybe have isolated it as an individual module. Later.

When running ora_login.py as a standalone program, it performs a self-check, where the output from ora_aas_chk() looks like this:

Notice that a call to ora_aas_chk() on this system takes about 15 ms, which includes both the reading of the dot-env and the roundtrip to the database to pick up the values.

This "debug output" is especially relevant when using the check for the first time in a a new environment. It gives you an idea how fast your loops run, and how much overhead the ora_aas takes.

The sleep-time is specified in Seconds, a nudge to linux-sleep. It is recommended to insert calls to ora_aas_chk() in code where you can expect at least 1 sec between calls. More prudent would be at least 30 or 60 seconds, or even much longer intervals. Longer intervals between calls both avoid the overhead of the calls and reduce the risk of flip-flopping between fast and slow states of the python program.


Final Remarks (and more in the appendix)

In short: Now we have something that we can configure to adjust to actual activity on the database, and thus benefit (automatically) from quiet periods.

Future... ? I have several more ideas to beautify this little snippet of code: I could add an arg* to make it run "silent" (simple). And I would consider peeking into vmstat or sar on the Database-server (not that simple..) to take into account more of the load from other components. Maybe in Future.

But I'm sure most DBAs and sysadmins have similar (but better!) ideas and tools. If this helps some of you even a little, the good work is done.

Feel free to copy the code and modify it. Remember: Your most re-used code is the code written by Yourself.

So then maybe if we meet at some event or conference, you can tell me what I did wrong, and why your version is Much Better. I Want to Learn.


-- -- -- -- -- -- End of this blogpost, for now -- -- -- -- -- --


Appendices: Links to code and some more notes..

Note: the python files should perform a self-test when run stand-alone.

throttle.py : the first attempt, no connection, hence no database required.

ora_login.py : the (runnable) source, requires a connection to test.

.env : an example of dotenv-file (you must edit scott/tiger@localhost, unless....)


Question - why not resource-manager... 

Mostly because doing it in python was a hobby-item. Partly because resmgr only kicks in when the system is already fully loaded, and partly from laziness, as none of the DBAs wanted to tweak the resource-plans and consumer-groups.

And of course, we do not run all our programs as SCOTT with full DBA privs...;-)


A Note on setting the ora_aas_threshold_pct:

I've chosen a percentage because many DBAs and Sysadmins think in percentages when they check for load. They will check vmstat, iostat or sar (from the linux sysstat package). These tools give you percentages for user-cpu, system-cpu and idle-cpu. And many old-hand admins check whether the percentage idle-cpu is still at least in double digits to see if their system is (over)loaded.

In future, rather than using a percentage threshold, we can consider checking for "cpus-available", calculated as:

CPUS_AVAILABLE = CPU_COUNT - AAS.

But this merits a separate discussion. Maybe later.


A Note on AAS, and the measuring interval.

You can try to measure Active Sessions by query on v$session (status = ACTIVE or otherwise), but that gives you a "point-measure" of the very moment your query runs. 

AWR and Statspack do a good job, by measuring AAS over an interval, which is by default One Hour. But a 1-hour average may be a bit long for loads that change more dynamically. And some systems do not even have AWR (or Statspack) activated. I needed a value that is reliably there, and an interval relevant to the system and to the job at hand, hence I started doing my own sampling of DB_Time from v$sysstat. My AAS, which is a ratio of seconds-per-second, can then be calculated as:

AAS = Delta-DB_Time (seconds) / Elapsed_time (seconds). 

When using ora_aas_chk(), the actual Elapsed_time will be determined by the loop where the function-call is included. This seems like a reasonable compromise: on long running loops (10s of minutes, hours), the average is determined over the same interval as the expected next loop will take.

One Warning though: On very short intervals (single seconds, a minute), there may be some inaccuracy. Also take into account that long running PL/SQL calls only report their "work done" into V$SYSSTAT when the call is completed. This means that a heavy+long call to PL/SQL may hold up a CPU, but not be reflected in a relatively short interval between two calls to ora_aas_chk().

The user is Warned.

-- -- -- -- End of blogpost - For Real -- -- -- -- 

Wednesday, 2 July 2025

oracle 23ai and some python items.

TL;DR: When using Oracle23ai and python, I developed some utilities. Here are a few for your inspiration. Links to runnable (copy-able) code in text and at bottom.


Background: Running python to process data.

Because I have a customer who wants to do ingest and repeated processing of data with python-tools, I have written some fairly basic python programs. In doing so, I could not resist creating some utilities (functions, modules) for various "system-administrative tasks". It started simply, with recording of time, and monitoring of heavy programs. It developed into something more. Below I will examine some ideas and link to my resulting code. When it all got too much and too long, I split off some items for later blogs.

It is Nothing very Fancy, just little tools. Use them for Ideas. Copy and Modify as much as you want.



Illustration of Measuring tools... 


Timers: how long does my loop take, and where does the time go ?

When you do a long, repeating, looping task, you want to measure the time. A lot of my functions already have local variables like start_dt and end_dt. But I needed something more global to span multiple functions. The result was two (global) variables to keep time and some set/get timer-functions.

One variable (g_tmr_total) is meant to measure "total time of an item", and can be set at "start" and queried by "tmr_total()". The other variable (g_tmr_durat) is meant to measure and re-set for shorter intervals. The functions are more or less self-explanatory:
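As a sketch of the idea (hypothetical function names; the real code is in duration.py, linked below):

import time

g_tmr_total = time.perf_counter()    # set once, at the "start" of the program/item
g_tmr_durat = time.perf_counter()    # reset for each shorter interval

def tmr_total () -> float:
    """Seconds elapsed since g_tmr_total was (re)set."""
    return time.perf_counter() - g_tmr_total

def tmr_durat ( reset: bool = True ) -> float:
    """Seconds elapsed since the previous interval; optionally reset the interval-timer."""
    global g_tmr_durat
    elapsed = time.perf_counter() - g_tmr_durat
    if reset:
        g_tmr_durat = time.perf_counter()
    return elapsed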


And here is the link to the source, duration.py. You can import it into your program like this:

from duration import *

When you run the file stand-alone, it will do a self-test and demonstrate some timings. The output should look like this:

This should all be self-explanatory but let me know if you have any questions or problems running on your system...

Notice there is one other function in there that allows you to "spin" a single CPU in your system for a given nr of seconds, in my case PI-seconds (3.14...sec). That function is out of scope for this blog, but it was useful for the test+demo. Maybe more later.

Notice also I used the linux time command to double check the total timing. Always a useful command.

If you are into precise timing or optimising of python programs, you should also start by studying the time module. For example time.process_time() will give you the time spent "on your program". And you should check packages like timeit (link) for more advanced timing-ideas.

In Future, I will consider adding push and pop of timers onto a LIFO stack. This will allow more timers. But it will add the burden of book-keeping to know which timer is on top. Alternatively, there already are packages like timeit that do similar things.


Database-work: do you have an autotrace ?

Timing of activity is one thing, but as a DBA-geek, I also want to know what my RDBMS was doing. Inspired by sql-autotrace, I built a function to report on session-statistics. This would allow me to report the "work done by the session" at any given point. From earlier use, I already had a query very similar to this:

There is the SQL, defined as a constant inside the function ora_sess_info() that will display the relevant stats. Now completing that into a python program was easy. I had to add this code:
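A minimal sketch of that function; the SQL and the chosen statistics are illustrative, the real version is in ora_login.py (linked below):

def ora_sess_info ( ora_conn ):
    """Report the work done by the current session, autotrace-style, to stdout."""
    sql_sess_info = """
        select sn.name, ms.value
        from   v$mystat   ms
        join   v$statname sn  on sn.statistic# = ms.statistic#
        where  sn.name in ( 'DB time', 'CPU used by this session'
                          , 'user calls', 'execute count'
                          , 'session logical reads', 'physical reads'
                          , 'SQL*Net roundtrips to/from client' )
        order by sn.name"""
    with ora_conn.cursor() as cursor:
        for name, value in cursor.execute ( sql_sess_info ):
            print ( f"  {name:40s} {value:>15,}" )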

The simplest way of running and displaying a query from python. And when the function is called, your output is something like this:

Notice a few things:

You need access to V$MYSTAT (still, sometimes this is an item to discuss...)

The actual function needs a connection-object. Of course I never know which connections are available or how the connection-objects are named. They may not even be global. Hence it needs to be passed as an argument to the function.

Output is currently as text to stdout. In future, more sophisticated output might be enabled (e.g. json-format, or insert-into-some-log-table)

The function measures statistics "since connect". To reset the values, you can close and re-connect, but I would not do that with any high-frequency as the creation of a connection does take time and resources.

For Future additions, I had several ideas still. I would consider including the SQL-stmnt in .env, and picking it up with dotenv. That would allow me to customize the query. And I would also consider storing some metrics in a table, to create a log of runs for particular programs and allow monitoring over time.

I was also considering crafting a function to report on the last-used plan of a given query, just to keep an eye on sudden plan changes (another one for the todo.txt).

But from actual usage, some other "requirements" came to light. Possible material for the next instalment of the blog (insert link when written?)


Summary : Feel free to copy and modify, YMMV. And there is more....

Like I said before, the most re-used code is the code that you create yourself (also called the Not Invented Here syndrome). So feel free to just copy and create your own varieties.

If I can give a few ppl ideas: Good!

As hinted earlier, once these snippets of code were in use, more ideas and unexpected requirements popped up. Let me think about those and I'll pop out another blog-post (future link here...)

And maybe, hopefully, we meet at some event or conference: Feel free to tell me what I did wrong, and how you totally improved my code...


-- -- -- -- -- -- -- -- end of this blog, for now -- -- -- -- -- -- --

Links to code and constants:

duration.py

ora_login.py (requires the dot-env file for credentials and constants)

.env (you need to edit to replace scott/tiger@localhost...)

Both modules are "self testing", if you run them they will produce some output to stdout. If you have all the requirements, they should ideally run without errors ;-)

-- -- -- -- -- -- -- -- end of this blog, really -- -- -- -- -- -- --

Monday, 7 April 2025

Oracle 23ai and python. some simple code.

TL;DR: Python and Oracle 23ai: A few helpful ideas, programs and code-snippets that made my life with python and Oracle easier. This one is about Connecting and Connection-info.


Background:

Below are some of the other small pieces of code that I split off into separate py-files and that are included into most of my (oracle-db connected) programs.

Maybe they can help you.

As with every language, after a while you develop habits and small tools that (hopefully) make you more productive or more secure in your work. In my case, with the combination of python and Oracle, I have only built very simple programs. And I do nearly all my work just from a terminal on either linux, mac or sometimes even windows. I use vi (and occasionally notepad or an IDE) as editors, and often still use print/f or dbms_output statements for debugging (showing my age eh...).

Last year I built a small python program to print out the contents of vectors stored in Oracle (blog on "victor" here...). And I have a few more "tools" that I built to help in my occasional programs or in my regular dba-ish tasks.

Caveat 1: Of course you will build your own. I know ppl re-use code much better when they first type it themselves. I Know you will do this differently, and you will do it Much Better.

Caveat 2: Don't expect high-tech: What I did with python is very simple, most of it is work that gets "pushed to the DBA". Things that I would call "plumber-stuff": processing incoming files, updating some meta-data like LoV lists, producing simple interface-data, or very simple reports-to-mail. The kind of stuff that never seems to go away, no matter how sophisticated the new ecosystems become.


Environment-settings and other variables: python-dotenv.

As most programs have to work in various environments, I needed a way to quickly set or obtain the various variables for LOCAL, DEV, TEST, PRE and PROD. The easiest solution turned out to be a tool called python-dotenv.

If you don't know of this tool yet, read the intro (link), and play with it.

I am still grateful to Gianni and Jeff and a few others who tipped me about this gizmo. Thx Folks!

In my case, it started with keeping the oracle-connect information plus a few other more system- or env-specific variables that I need to distinguish between various environments. It also coincided nicely with the preference of my customer to "keep stuff in config files".

And suddenly I found python-dotenv very tempting for multiple use-cases, and  I started to hoard a lot of "settable variables" into the file. Very Useful.
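As a small illustration (the variable names here are made up, yours will differ): a .env file and the few lines of python needed to read it:

# contents of .env (one file per environment):
#   ORA_USER=scott
#   ORA_PASSWORD=tiger
#   ORA_DSN=localhost/freepdb1

import os
from dotenv import load_dotenv   # pip install python-dotenv

load_dotenv ()                              # reads .env from the current directory
ora_user = os.getenv ( "ORA_USER" )
ora_dsn  = os.getenv ( "ORA_DSN" )
print ( "will connect as", ora_user, "to", ora_dsn )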


Open a db-connection, use a function (part 1)

As all my programs relate to an RDBMS, I need all programs to "connect" to the database. The information needed to connect was easily stored in a dotenv-file, but the creation of a connection needed some attention. For several reasons:

- We want credentials to be separate from code. Not hardcoded (and this time we mean it, looking at you Scott and Tiger..). Credentials need to be both safely stored and easily modifiable (some systems still change passwords every 32 days...)

- We want to verify that a connection is to the Correct, Intended database. Preferably to some logfile or stdout stream (check part 2).

The easiest part was to pick up the credentials from the dotenv-file. To this end, we created a function that picks up the connect-info (including tiger...) from an env-file. So far so good:

The first attempt looked like this:
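Roughly like this, as a sketch (hypothetical variable names; the complete function is in ora_login.py, linked at the bottom):

import os
import oracledb
from dotenv import load_dotenv

def ora_logon ():
    """Pick up credentials from .env, connect (easy-connect syntax), return the connection."""
    load_dotenv ()
    ora_user = os.getenv ( "ORA_USER" )
    ora_pwd  = os.getenv ( "ORA_PASSWORD" )
    ora_dsn  = os.getenv ( "ORA_DSN" )      # easy-connect, e.g. host:1521/service

    print ( " ora_login: connecting as", ora_user, "to", ora_dsn )   # verify to stdout
    ora_conn = oracledb.connect ( user=ora_user, password=ora_pwd, dsn=ora_dsn )
    return ora_conn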

Notice several things:

- We print some of the info to stdout, just to verify. Comment it out if you like, but I Want this check. More Later.

- We use easy-connect syntax.

- The function returns the connection-object, so the connection does not have to be a global object.

If you want more complicated connects, if you prefer to use tns-aliases, or complete tns-entries, or other types of dsn: that would require a slight re-code. But you can stick to the concept of using dotenv; it is able to hold multi-line variables. Hence you have a lot of options. I didn't have a need for more complicated connection-specifications yet, but in most cases, I would store the information in dotenv.


Open a db-connection, and Verify to stdout (part 2)

A very common mistake, and a rather Impactful Mistake, is to connect some infrequently used, adhoc-running metadata-update-tool to the wrong database... Ooops! (of course, in Your case, the passwords between Dev and Prod would be different, but some systems are made by Murphy... )

My previous blogpost on this was "how to set SQL-Prompt"..

https://simpleoracledba.blogspot.com/2023/06/do-you-know-where-you-are.html

And I wanted something similar to my python-programs.

Easy... In python, it is very simple to take any valid SELECT, have it executed, and print the results to stdout:

I inserted this snip of code into the ora_logon function:

cursor = ora_conn.cursor()
for row in cursor.execute ( sql_show_conn ):
    print ( ' ora_login:', row[1] )

And now I can define sql_show_conn with any select I want, notably the ones I use in SQLcl and SQL*Plus to show my connection:

My favorite is:

  # customize this sql to show connection info on logon
  sql_show_conn="""
    select 2 as ordr
       , 'version : ' ||  substr ( banner_full, -12) as txt
    from v$version
  UNION ALL
    select 1
       , 'user    : '|| user || ' @ ' || global_name|| ' ' as txt
    FROM global_name     gn
  UNION ALL
    SELECT 3
       , 'prompt  : ' || user
         || ' @ ' ||db.name
         || ' @ '|| SYS_CONTEXT('USERENV','SERVER_HOST')
         || decode  (SYS_CONTEXT('USERENV','SERVER_HOST')
              , '98b6d46dd637',      ' (xe-dev)'
              , 'ip-nnn-nn-2-109',   ' (PROD)'
              , 'ip-nnn-nn-0-226',   ' (ACC)'
              , '2c4a51017a44',      ' (dckr-23ai)'
              , 'b88d087b53d3',      ' (dckr-23.7)'
              , ' (-envname-)')
         || ' > '
    FROM    v$database      db
  order by 1
  """


This is the equivalent of setting a fancy prompt in SQLcl to indicate "where you are" (link to old blog). 

This SQL, actually a UNION of three queries, roughly translates to: 1) show user and global_name, 2) show the version and 3) use a fancy decode to show which environment you are at, which I often use to list out dev/test/acc/prod and other varieties we use.

Because at a certain customer, the pdb-names and global names and service-names are often identical over several environments, we use the container-name or host-name to identify "where we are", and decode that to some "meaningful name" like "PROD".

And you can make the verification-query as fancy as you want (or just keep it simple). In some cases, it was turned into a local linux-style MOTD, and in some cases into a complete Welcome-Message including some instructions and warnings.

You can add items to this cursor at your heart's content, and the union + order trick allows you to re-arrange the outcomes. Knock yourself Out (some devs really do!)

The result, in my case, comes out like this:


The complete, include-able function, together with some test-code at the bottom of the file, can be found here: link to ora_login.py.


Summary: Feel free to re-use, but...

From thinking about and coding my own logon-function, using dotenv, I got some good benefits.

But, big Caveat: Tooling is Personal, and YMMV.

Over time I have taken to using a lot of self-built tools and scripts, and some tools+utils I picked up from 't internet or other coders (and mostly modified for my own use).

I did find that it is difficult to get others to re-use tools that I crafted, or any tool from anyone else.

I have a few other snippets of python code that I (re-)use a lot, and I will see how those develop over time (if at all..). So far, if my code or just my ideas do inspire some of you to improve your code: Job done.

My main tip to others on code-reuse is generally: Read, Consider, Try, ... and in case of doubt:  DIY. 

The most (re)used scripts and code will probably be the "built by you" version.

Enjoy !

-- -- -- -- -- end of this blogpost, for now -- -- -- --