Thursday, 10 July 2025

Oracle 23ai, python, and please Go Easy on my system...

TL;DR: Running Python against an Oracle 23ai database is Nice, but Don't Overload. I devised a few ways to throttle long-running python processes. The first trick was to (dynamically) inject sleep-calls. Then I refined that to react to the actual load on the RDBMS.


Summary: 

At first, I had an editable parameter-file to vary the sleep-time between loops. But then I refined it: The python program regularly checks the (average) active-sessions, compares that to cpu_count, and takes a sleep when the system is considered "busy". The parameters to control this are imported with dotenv (link) and can be modified on the fly if needed. This allows long- or always-running jobs to retract to the background when the system is busy, and to benefit from quiet intervals when they occur.

And by reacting to Average-Active-Sessions and cpu_count values, the jobs automatically and dynamically adjust to the load on the RDBMS, regardless of the hardware underneath.


Background: Batches running python code.

Sometimes I have to build + run utilities for python-jobs that utilise the RDBMS. Some of those programs can run for Hours, if not Days, while doing loads/extracts/analysis/summaries... One request was to prevent Overload, e.g. to allow the python-jobs to "stop running" at times when other systems and users were busy with "More Important Work". So in the end, I had to define and measure "busy" and act accordingly.

Note: I used to check with vmstat or sar (sysstat) to detect how busy my system is, or how much (percentage) idle I had left. Nowadays, I prefer to get my data from the RDBMS, for two reasons: 1) there are systems out there with SQL-only access (e.g. no linux prompt anymore...). And 2) systems nowadays will run CDBs with multiple PDBs, and databases limited by the CPU_COUNT parameter. Your "server" may have plenty of capacity (or not), but you can only get 2 CPUs from it... (merits a separate story; try running the 23ai-free-version with the cpu_count parameter hard-set to 2 on an 8 cpu machine - anecdote later...).

Anyway, this all started very simple, with some calls to time.sleep ( n_sec ).




First Attempt: simple sleep( n_sec) in every main loop.

The long-running programs are re-startable, which is good practice anyway because sh*t^h^h^h^h hiccups happen. So when the background batch-processing needed to yield to more urgent work, we would kill some jobs (stop the containers), and re-start them later. Crude but simple.

But that meant the processing often stopped completely, and sometimes we forgot to re-start...

Next up was: stop the program (or the container running it), edit a sleep-call in some main loop, and re-start it. The program would run at a slower pace. That worked. Sort-of.

But it still meant the process had to be killed and edited. And if left alone, it would either run slooow for the rest of the job, thus taking much longer than usual, or it would need another kill+edit+restart when the system had spare capacity again. Plus, for some processes, the startup-overhead was significant. It worked, but it was still messy.


Next Attempt: dynamic sleep time, from an editable config-file.

Introducing: Throttle.py. This little module would read the dotenv-file at regular intervals and adjust the sleep-time to a value given in .env. 

The code to implement this can be found in throttle.py (link below). The module is meant to be imported, but running it stand alone will perform a test- and demo-sequence.

To "throttle" a running program, now we just had to edit the .env and put in a larger value for THR_SLEEP_TIME:

# throttle, sleeptime
THR_SLEEP_TIME=3

Note: The re-read interval for the dot-env file with the latest (editable) threshold was hardcoded at 60sec to ensure a relatively responsive system. The overhead of reading the file was considered minimal compared to 60sec. If you have to fetch this file from some remote location with significant latency, you can consider a larger interval.
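
For illustration, the idea behind throttle.py boils down to something like this (a simplified sketch, not the actual module; apart from THR_SLEEP_TIME, the names here are illustrative):

import os
import time
from dotenv import load_dotenv

RE_READ_INTERVAL = 60        # re-read .env at most once per 60 sec (hardcoded)
g_last_read  = 0.0           # epoch of the last .env read
g_sleep_time = 0.0           # current sleep-time in seconds

def thr_sleep ():
    """Sleep THR_SLEEP_TIME seconds; re-read .env at most every 60 sec."""
    global g_last_read, g_sleep_time
    now = time.time ()
    if now - g_last_read > RE_READ_INTERVAL:
        load_dotenv ( override=True )          # pick up any edits to .env
        g_sleep_time = float ( os.getenv ( 'THR_SLEEP_TIME', '0' ) )
        g_last_read  = now
    if g_sleep_time > 0:
        time.sleep ( g_sleep_time )

if __name__ == '__main__':
    # stand-alone: a small test- and demo-loop
    for i in range ( 5 ):
        thr_sleep ()
        print ( ' demo-loop', i, 'slept', g_sleep_time, 'sec' )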

We now had several advantages: No more kill+restart, but rather an easily editable value in the config file (the .env). This worked Fine for a while. It was a bit like opening and closing a valve on a reservoir. So far so good.

But this trick still required manual intervention: eyeballing the system-load and editing .env. Surely, there was a way to automate and improve on this.


The Database is the real, precious, Bottleneck.

The "load" on the system was, in this case, best monitored from the RDBMS, the Database. Hence, we needed some probe into that system to determine how much spare capacity it had to process our python jobs. 

And that is where AAS comes in: Average Active Sessions (link - Google turned up this rather old webpage, but the idea is actually older still). By using the statistic for DB-time, the database will tell you how much processing was done in 1 unit of time.

AAS = Delta-DB_Time (seconds) / Elapsed_time (seconds). 

So an AAS of 3 means that running processes consumed 3 seconds of DB-time in a 1 second interval. Hence the Unit of Measurement for AAS can be expressed as Sessions or Processes.

Knowing AAS is nice, but what is my available capacity ?

Your AAS tells you how many processes (on average) are running. But you need to relate it to your CPU_COUNT to know how "loaded" your system is. Hence we calculate percentage-busy as:

PCT_BUSY = 100 * AAS / CPU_COUNT.

On a system where AAS=3 (e.g. 3 processes busy) and a count of 4 CPUs, the pct_busy will be about 75 percent, hence there is only about 25 percent capacity, or one cpu, still idle. But on a system with 24 cpus, an AAS of 3 means it is only 100*3/24 = 12.5 percent busy, and some 21 CPUs are still available to service more processes.

Using this simple pct_busy as a threshold would more or less adjust to both large and small systems. So for the moment, we decided to stick with this easy "percentage" (see notes below on setting the threshold..).

So, let's build this...


Putting it to work: the python code.

First to get the relevant info from the database, I needed a query.

And to determine delta_DB_time and delta_epoch_time, I needed two "global" variables to persist over time, to keep db_time and epoch.

Those definitions were simple:
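
In outline, it was something like this (a simplified sketch; the exact query and variable names in ora_login.py may differ). Remember that v$sysstat reports time-statistics like 'DB time' in centiseconds:

# the statistics we need: DB time (centiseconds) and cpu_count
sql_get_aas_info = """
  select ( select value from v$sysstat
           where name = 'DB time' )               as db_time_cs
       , ( select to_number ( value ) from v$parameter
           where name = 'cpu_count' )             as cpu_count
  from dual"""

# two "globals" to persist between calls, for the delta-calculations
g_prev_db_time = None     # DB time (centiseconds) at the previous call
g_prev_epoch   = None     # epoch (seconds) at the previous call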


The function to execute the code starts by reading the dot-env and fetching the relevant values  from the database.

Note that the parameter-file (dot-env) is read at Every Call to this function. When called too frequently (single-seconds...), the overhead can be significant. Hence the ora_aas_chk() should ideally be called at intervals of at least 30sec or higher.

Now that we have all the information, we can start the logic:


If it is the first call, and there are no "previous" values, we do nothing (pass)

Else, if there are previous values, we calculate the AAS and PCT_BUSY.

Once we have the values,  we can determine if a pause is justified. I've left the print(debug) stmnts in for the moment:

The Threshold determines if we need a pause (sleep). And at the end of the function we "set" the global variables that will be used to calculate the "deltas" on the next call. For good measure, we return the total time spent in ms.
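
Put together, the logic of ora_aas_chk() comes down to something like this (a simplified sketch, re-using the query and globals sketched above; the dot-env names other than ora_aas_threshold_pct are illustrative, and the real code in ora_login.py has more checks and debug-prints):

import os
import time
from dotenv import load_dotenv

def ora_aas_chk ( ora_conn ):
    """Check pct_busy of the RDBMS; sleep when above the threshold.
       Returns the total time spent in this call, in ms."""
    global g_prev_db_time, g_prev_epoch
    t_start = time.time ()

    load_dotenv ( override=True )      # pick up edited thresholds on the fly
    threshold_pct = float ( os.getenv ( 'ORA_AAS_THRESHOLD_PCT', '50' ) )
    sleep_sec     = float ( os.getenv ( 'ORA_AAS_SLEEP_TIME',    '10' ) )

    cursor = ora_conn.cursor ()
    db_time_cs, cpu_count = cursor.execute ( sql_get_aas_info ).fetchone ()
    epoch = time.time ()

    if g_prev_db_time is None:
        pass                           # first call: no previous values to compare
    else:
        delta_db_time = ( db_time_cs - g_prev_db_time ) / 100.0   # centisec -> sec
        delta_epoch   = epoch - g_prev_epoch
        aas      = delta_db_time / delta_epoch
        pct_busy = 100.0 * aas / cpu_count
        print ( f' ora_aas_chk: aas={aas:.2f}, pct_busy={pct_busy:.1f}' )   # debug
        if pct_busy > threshold_pct:
            time.sleep ( sleep_sec )   # database considered busy: hold back

    g_prev_db_time = db_time_cs        # remember the values for the next delta
    g_prev_epoch   = epoch
    return int ( ( time.time () - t_start ) * 1000 )   # total time spent, in ms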

If you insist on strictly "sleeping" your program until the pct_busy is below the specified threshold, you can use a while-loop: 

while ora_aas_chk ( ora_conn ) > 1000:
    # sleep and only continue when pct_busy Below Threshold.
    pass

Repeat the call until the return-value is below, say, 1000 ms (assuming your pause-setting is more than 1 sec, as it should be). Any return of more than 1 sec signifies that we did pause-sleep. Hence, your database was considered "busy", and you may want to hold back your python-job from processing.


The code: added to ora_login.py

The logical place, at the time, to put this code was in ora_login.py (link to earlier blog..). In hindsight, I should maybe have isolated it as an individual module. Later.

When running the ora_login.py as standalone program, it performs a self-check, where the output from ora_aas_chk() looks like this:

Notice that a call to ora_aas_chk() on this system takes about 15ms, which includes both the reading of the dot-env and the roundtrip to the database to pick up the values.

This "debug output" is especially relevant when using the check for the first time in a a new environment. It gives you an idea how fast your loops run, and how much overhead the ora_aas takes.

The sleep-time is specified in Seconds (a nudge to linux sleep). It is recommended to insert calls to ora_aas_chk() in code where you can expect at least 1 sec between calls. More prudent would be at least 30 or 60 seconds, or even much longer intervals. Longer intervals between calls avoid the overhead of the calls, and reduce the risk of flip-flopping between fast and slow states of the python program.
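
In a typical batch-loop, the call then sits at the top of each iteration, something like this (batches_to_process and process_batch are just placeholders for your own work):

for batch in batches_to_process:        # e.g. files, id-ranges, partitions...
    ora_aas_chk ( ora_conn )            # pauses here when the database is busy
    process_batch ( ora_conn, batch )   # the actual work, expected to take >= 1 sec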


Final Remarks (and more in the appendix)

In short: Now we have something that we can configure to adjust to actual activity on the database, and thus benefit (automatically) from quiet periods.

Future... ? I have several more ideas to beautify this little snippet of code: I could add an argument to make it run "silent" (simple). And I would consider peeking into vmstat or sar on the Database-server (not that simple..) to take into account more of the load from other components. Maybe in the Future.

But I'm sure most DBAs and sysadmins have similar (but better!) ideas and tools. If this helps some of you even a little, the good work is done.

Feel free to copy the code and modify it. Remember: Your most re-used code is the code written by Yourself.

So then maybe if we meet at some event or conference, you can tell me what I did wrong, and why your version is Much Better. I Want to Learn.


-- -- -- -- -- -- End of this blogpost, for now -- -- -- -- -- --


Appendices: Links to code and some more notes..

Note: the python files should perform a self-test when run stand-alone.

throttle.py : the first attempt, no connection, hence no database required.

ora_login.py : the (runnable) source, requires a connection to test.

.env : an example of dotenv-file (you must edit scott/tiger@localhost, unless....)


Question - why not resource-manager... 

Mostly because doing it in python was a hobby-item. Partly because resmgr only kicks in when the system is already fully loaded, and partly from laziness, as none of the DBAs wanted to tweak the resource-plans and consumer-groups.

And of course, we do not run all our programs as SCOTT with full DBA privs...;-)


A Note on setting the ora_aas_threshold_pct:

I've chosen a percentage because many DBAs and Sysadmins think in percentages when they check for load. They will check vmstat, iostat or sar (from the linux sysstat package). These tools give you percentages for user-cpu, system-cpu and idle-cpu. And many old-hand admins check whether the percentage idle-cpu is still at least in double digits to see if their system is (over)loaded.

In future, rather than using a percentage threshold, we can consider checking for "cpus-available", calculated as:

CPUS_AVAILABLE = CPU_COUNT - AAS.

But this merits a separate discussion. Maybe later.


A Note on AAS, and the measuring interval.

You can try to measure Active Sessions by querying v$session (status = ACTIVE or otherwise), but that gives you a "point-measure" of the very moment your query runs.

AWR and Statspack do a good job, by measuring AAS over an interval, which is by default One Hour. But a 1 hour average may be a bit long for loads that change more dynamically. And some systems don't even have AWR (or Statspack) activated. I needed a value that is reliably there, and an interval relevant to the system and to the job at hand, hence I started doing my own sampling of DB_Time from v$sysstat. My AAS, which is a ratio of seconds-per-second, can then be calculated as:

AAS = Delta-DB_Time (seconds) / Elapsed_time (seconds). 

When using ora_aas_chk(), the actual Elapsed_time will be determined by the loop where the function-call is included. This seems like a reasonable compromise: on long-running loops (tens of minutes, hours), the average is determined over roughly the same interval as the next loop is expected to take.

One Warning though: On very short intervals (single seconds, a minute), there may be some inaccuracy. Also take into account that long-running PL/SQL calls only report their "work done" into V$SYSSTAT when the call completes. This means that a heavy, long call to PL/SQL may hold up a CPU, but not be reflected in a relatively short interval between two calls to ora_aas_chk().

The user is Warned.

-- -- -- -- End of blogpost - For Real -- -- -- -- 

Wednesday, 2 July 2025

oracle 23ai and some python items.

TL;DR: When using Oracle23ai and python, I developed some utilities. Here are a few for your inspiration. Links to runnable (copy-able) code in text and at bottom.


Background: Running python to process data.

Because I have a customer who wants to do ingest and repeated processing of data with python-tools, I have written some fairly basic python programs. In doing so, I could not resist creating some utilities (functions, modules) for various "system-administrative tasks". It started simply with recording of time, and monitoring of heavy programs. It developed into something more. Below I will examine some ideas and link to my resulting code. When it all got too much and too long, I split off some items for later blogs.

It is Nothing very Fancy, just little tools. Use them for Ideas. Copy and Modify as much as you want.



Illustration of Measuring tools... 


Timers: how long does my loop take, and where does the time go ?

When you do a long, repeating, looping task, you want to measure the time. A lot of my functions already have local variables like start_dt and end_dt. But I needed something more global to span multiple functions. The result was two (global) variables to keep time and some set/get timer-functions.

One variable (g_tmr_total) is meant to measure "total time of an item", and can be set at "start" and queried by "tmr_total()". The other variable (g_tmr_durat) is meant to measure and re-set for shorter intervals. The functions are more or less self-explanatory:
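
In outline, duration.py comes down to something like this (a simplified sketch; apart from the two globals and tmr_total(), the function names here are illustrative):

import time

g_tmr_total = time.perf_counter ()   # set at "start", queried via tmr_total()
g_tmr_durat = time.perf_counter ()   # measured and re-set for shorter intervals

def tmr_start ():
    """(Re)set both timers."""
    global g_tmr_total, g_tmr_durat
    g_tmr_total = time.perf_counter ()
    g_tmr_durat = time.perf_counter ()

def tmr_total ():
    """Seconds elapsed since the total-timer was set."""
    return time.perf_counter () - g_tmr_total

def tmr_durat ( reset=True ):
    """Seconds since the duration-timer was last re-set; optionally re-set it."""
    global g_tmr_durat
    elapsed = time.perf_counter () - g_tmr_durat
    if reset:
        g_tmr_durat = time.perf_counter ()
    return elapsed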


And here is the link to the source, duration.py. You can import it into your program like this:

from duration import *

When you run the file stand-alone, it will do a self-test and demonstrate some timings. The output should look like this:

This should all be self-explanatory but let me know if you have any questions or problems running on your system...

Notice there is one other function in there that allows you to "spin" a single CPU in your system for a given nr of seconds, in my case PI-seconds (3.14...sec). That function is out of scope for this blog, but it was useful for the test+demo. Maybe more later.

Notice also I used the linux time command to double check the total timing. Always a useful command.

If you are into precise timing or optimising of python programs, you should also start by studying the time module. For example, time.process_time() will give you the time spent "on your program". And you should check packages like timeit (link) for more advanced timing-ideas.
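
For example, with just the standard library (the numbers will of course depend on your system):

import time, timeit

t0 = time.process_time ()                  # CPU-time spent in this process
sum ( i * i for i in range ( 10**6 ) )
print ( ' process_time:', time.process_time () - t0, 'sec' )

# timeit runs a statement many times and reports the total elapsed time
print ( ' timeit      :',
        timeit.timeit ( 'sum ( i*i for i in range ( 1000 ) )', number=1000 ), 'sec' )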

In Future, I will consider adding push and pop of timers onto a LIFO stack. This will allow more timers. But it will add the burden of book-keeping to know which timer is on top. Alternatively, there already are packages like timeit that do similar things.


Database-work: do you have an autotrace ?

Timing of activity is one thing, but as a DBA-geek, I also want to know what my RDBMS was doing. Inspired by sql-autotrace, I built a function to report on session-statistics. This would allow me to report the "work done by the session" at any given point. From earlier use, I already had a query very similar to this:

That SQL is defined as a constant inside the function ora_sess_info(), which will display the relevant stats. Completing that into a python program was easy; I had to add this code:
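
In outline, the function looks roughly like this (a simplified sketch: the real SQL constant in ora_login.py selects more, and differently filtered, statistics):

def ora_sess_info ( ora_conn ):
    """Report the "work done by this session" so far, autotrace-style."""
    # simplified version of the statistics-query
    sql_sess_stats = """
      select sn.name, st.value
      from   v$mystat   st
           , v$statname sn
      where  st.statistic# = sn.statistic#
      and    sn.name in ( 'session logical reads'
                        , 'physical reads'
                        , 'redo size'
                        , 'DB time' )
      order by sn.name"""
    cursor = ora_conn.cursor ()
    for name, value in cursor.execute ( sql_sess_stats ):
        print ( f' ora_sess_info: {name:<25} {value:>12}' )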

The simplest way of running and displaying a query from python. And when the function is called, your output is something like this:

Notice a few things:

You need access to V$MYSTAT (still, sometimes this is an item to discuss...)

The actual function needs a connection-object. Of course I never know which connections are available or how the connection-objects are named. They may not even be global. Hence it needs to be passed as an argument to the function.

Output is currently as text to stdout. In future, more sophisticated output might be enabled (e.g. json-format, or insert-into-some-log-table)

The function measures statistics "since connect". To reset the values, you can close and re-connect, but I would not do that with any high-frequency as the creation of a connection does take time and resources.

For Future additions, I still had several ideas. I would consider including the SQL-stmnt in .env, and picking it up with dotenv. That would allow me to customize the query. And I would also consider storing some metrics in a table, to create a log of runs for particular programs and allow monitoring over time.

I was also considering crafting a function to report on the last-used plan of a given query, just to keep an eye on sudden plan changes (another one for the todo.txt).

But from actual usage, some other "requirements" came to light. Possible material for the next instalment of the blog (insert link when written?)


Summary : Feel free to copy and modify, YMMV. And there is more....

Like I said before, the most re-used code is the code that you create yourself (also called the Not Invented Here syndrome). So feel free to just copy and create your own varieties.

If I can give a few ppl ideas: Good!

As hinted earlier, once these snippets of code were in use, more ideas and unexpected requirements popped up. Let me think about those and I'll pop out another blog-post (future link here...)

And maybe, hopefully, we meet at some event or conference: Feel free to tell me what I did wrong, and how you totally improved my code...


-- -- -- -- -- -- -- -- end of this blog, for now -- -- -- -- -- -- --

Links to code and constants:

duration.py

ora_login.py (requires the dot-env file for credentials and constants)

.env (you need to edit to replace scott/tiger@localhost...)

Both modules are "self testing", if you run them they will produce some output to stdout. If you have all the requirements, they should ideally run without errors ;-)

-- -- -- -- -- -- -- -- end of this blog, really -- -- -- -- -- -- --

Monday, 7 April 2025

Oracle 23ai and python. some simple code.

TL;DR: Python and Oracle 23ai: A few helpful ideas, programs and code-snippets that made my life with python and Oracle easier. This one is about Connecting and Connection-info.


Background:

Below are some of the other small pieces of code that I split off into separate py-files and that are included into most of my (oracle-db connected) programs.

Maybe they can help you.

As with every language, after a while you develop habits and small tools that (hopefully) make you more productive or more secure in your work. In my case, with the combination of python and Oracle, I have only built very simple programs. And I do nearly all my work just from a terminal on either linux, mac or sometimes even windows. I use vi (and occasionally notepad or an IDE) as editors, and often still use print/f or dbms_output statements for debugging (showing my age eh...).

Last year I built a small python program to print out the contents of vectors stored in Oracle (blog on "victor" here...). And I have a few more "tools" that I built to help in my occasional-program or in my regular dba-ish tasks.

Caveat 1: Of course you will build your own. I know ppl re-use code much better when they first type it themselves. I Know you will do this differently, and you will do it Much Better.

Caveat 2: Don't expect high-tech: What I did with python is very simple, most of it is work that gets "pushed to the DBA". Things that I would call "plumber-stuff": processing incoming files, updating some meta-data like LoV lists, producing simple interface-data, or very simple reports-to-mail. The kind of stuff that never seems to go away, no matter how sophisticated the new ecosystems become.


Environment-settings and other variables: python-dotenv.

As most programs have to work in various environments, I needed a way to quickly set or obtain the various variables for LOCAL, DEV, TEST, PRE and PROD. The easiest solution turned out to be a tool called python-dotenv.

If you don't know of this tool yet, read the intro (link), and play with it.

I am still grateful to Gianni and Jeff and a few others who tipped me about this gizmo. Thx Folks!

In my case, it started with keeping the oracle-connect information plus a few other system- or env-specific variables that I need to distinguish between various environments. It also coincided nicely with the preference of my cstmr to "keep stuff in config files".

And suddenly I found python-dotenv very tempting for multiple use-cases, and I started to hoard a lot of "settable variables" in the file. Very Useful.
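
A minimal example of how that works (the variable names and the dsn are just examples):

# .env  -- one file per environment (LOCAL, DEV, TEST, PRE, PROD)
#   ORA_USER=scott
#   ORA_PWD=tiger
#   ORA_DSN=localhost/freepdb1

import os
from dotenv import load_dotenv

load_dotenv ()                          # reads .env from the current directory
ora_user = os.getenv ( 'ORA_USER' )
ora_dsn  = os.getenv ( 'ORA_DSN' )
print ( ' using', ora_user, '@', ora_dsn )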


Open a db-connection, use a function (part 1)

As all my programs relate to an RDBMS, I need all programs to "connect" to the database. The information needed to connect was easily stored in a dotenv-file, but the creation of a connection needed some attention. For several reasons:

- We want credentials to be separate from code. Not hardcoded (and this time we mean it, looking at you Scott and Tiger..). Credentials need to be both safely stored and easily modifiable (some systems still change passwords every 32 days...)

- We want to verify that a connection is to the Correct, Intended database. Preferably to some logfile or stdout stream (check part 2).

The easiest part was to pick up the credentials from the dotenv-file. To that end, we created a function that picks up the connect-info (including tiger...) from an env-file. So far so good:

The first attempt looked like this:
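
It was something along these lines (a simplified sketch; the env-variable names are illustrative, the real ora_login.py is linked at the bottom):

import os
import oracledb
from dotenv import load_dotenv

def ora_logon ():
    """Create a connection from credentials kept in .env, and verify to stdout."""
    load_dotenv ()
    ora_user = os.getenv ( 'ORA_USER' )
    ora_pwd  = os.getenv ( 'ORA_PWD' )
    ora_dsn  = os.getenv ( 'ORA_DSN' )       # easy-connect: host[:port]/service

    print ( ' ora_logon: connecting as', ora_user, '@', ora_dsn )   # verify, see part 2
    ora_conn = oracledb.connect ( user=ora_user, password=ora_pwd, dsn=ora_dsn )
    return ora_conn                          # the connection does not need to be global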

Notice several things:

- We print some of the info to stdout, just to verify. Comment it out if you like, but I Want this check. More Later.

- We use easy-connect syntax.

- The function returns the connection-object, the connection does not have to be a global object.

If you want more complicated connects, if you prefer to use tns-aliases, or complete tns-entries, or other types of dsn: that would require a slight re-code. But you can stick to the concept of using dotenv; it is able to handle multi-line variables, hence you have a lot of options. I didn't have a need for more complicated connection-specifications yet, but in most cases I would still store the information in dotenv.


Open a db-connection, and Verify to stdout (part 2)

A very common mistake, and a rather Impactful Mistake, is to connect some infrequently used, adhoc-running metadata-update-tool to the wrong database... Ooops! (of course, in Your case, the passwords between Dev and Prod would be different, but some systems are made by Murphy... )

My previous blogpost on this was "how to set SQL-Prompt"..

https://simpleoracledba.blogspot.com/2023/06/do-you-know-where-you-are.html

And I wanted something similar to my python-programs.

Easy... In python, it is very simple to take any valid SELECT, have it executed, and print the results to stdout:

I inserted this snippet of code into the ora_logon function:

  cursor = ora_conn.cursor()
  for row in cursor.execute ( sql_show_conn ):
    print  ( ' ora_login:', row[1] )

And now I can define sql_show_conn with any select I want, notably the ones I use in SQLcl and SQL*Plus to show my connection:

My favorite is:

  # customize this sql to show connection info on logon
  sql_show_conn="""
    select 2 as ordr
       , 'version : ' ||  substr ( banner_full, -12) as txt
    from v$version
  UNION ALL
    select 1
       , 'user    : '|| user || ' @ ' || global_name|| ' ' as txt
    FROM global_name     gn
  UNION ALL
    SELECT 3
       , 'prompt  : ' || user
         || ' @ ' ||db.name
         || ' @ '|| SYS_CONTEXT('USERENV','SERVER_HOST')
         || decode  (SYS_CONTEXT('USERENV','SERVER_HOST')
              , '98b6d46dd637',      ' (xe-dev)'
              , 'ip-nnn-nn-2-109',   ' (PROD)'
              , 'ip-nnn-nn-0-226',   ' (ACC)'
              , '2c4a51017a44',      ' (dckr-23ai)'
              , 'b88d087b53d3',      ' (dckr-23.7)'
              , ' (-envname-)')
         || ' > '
    FROM    v$database      db
  order by 1
  """


This is the equivalent of setting a fancy prompt in SQLcl to indicate "where you are" (link to old blog). 

This SQL, actually a UNION of three queries, roughly translates to: 1) show user and global_name, 2) show the version, and 3) use a fancy decode to show which environment you are at, which I often use to list out dev/test/acc/prod and other varieties we use.

Because at a certain cstmr, the pdb-names and global names and service-names are often identical over several environments, we use the container-name or host-name to identify "where we are", and decode that to some "meaningful name" like "PROD".

And you can make the verification-query as fancy as you want (or just keep it simple). In some cases, it was turned into a local linux-style MOTD, and in some cases into a complete Welcome-Message including some instructions and warnings.

You can add items to this cursor to your heart's content, and the union + order trick allows you to re-arrange the outcomes. Knock yourself Out (some devs really do!)

The result, in my case, comes out like this:


The complete, include-able function, together with some test-code at the bottom of the file, can be found here: link to ora_login.py.


Summary: Feel free to re-use, but...

From thinking about and coding my own logon-function, and using dotenv, I got some good benefits.

But, big Caveat: Tooling is Personal, and YMMV.

Over time I have taken to using a lot of self-built tools and scripts, and some tools+utils I picked up from 't internet or other coders (and mostly modified for my own use).

I did find that it is difficult to get others to re-use tools that I crafted, or any tool from anyone else.

I have a few other snippets of python code that I (re-)use a lot, and I will see how those develop over time (if at all..). So far, if my code or just my ideas do inspire some of you to improve your code: Job done.

My main tip to others on code-reuse is generally: Read, Consider, Try, ... and in case of doubt:  DIY. 

The most (re)used scripts and code will probably be the "built by you" version.

Enjoy !

-- -- -- -- -- end of this blogpost, for now -- -- -- --

Monday, 10 March 2025

The Generic Tips (was: python and Oracle)

TL;DR: Some Very Basic Tips, to write better programs and tools. In my recent case: using python and Oracle. But in the handover to colleagues, I came to realise there are some tips+tricks that need re-iterating every few years. So Here Goes (again). 

My opinion is of course strongly "biased" towards an Oracle Database, but that is not always a bad thing.

The actual blog on my Oracle-python-tricks will have to wait a bit.. but it is in the pipeline (# mkfifo pythonstuff ; link to future blog...)


The things that tend to wash up on the desk of a DBA...



Background: Been There Done That (have T-shirts).

Originally, I set out to describe some of the python programs (functions, really) that I (re)used to make my life and that of my cstmr easier. As it turned out, we ended up discussing some "habits" and practices rather than the actual code. So just a few generic tips, which I also used to explain things to cstmrs. Hence I wrote that up before I dipped into my python code.

YMMV and "Never say Never": The tips applied to myself and my cstmr-situation. But I came across enough situations to say they apply in many case. However, Your Milage May Vary. So if you think they do not apply to you, I am fine with that. I'm Not going to go religious on it.

My content here applies to the type of (simple-) jobs that tend to wash up on the desk of a DBA: Interfaces, ad-hoc reports, occasional maintenance-jobs, loading "meta-data", that sort of thing.

Also, do not expect High-Tech. In my case, with the combination of python and Oracle, I have only built relatively simple programs. And I do nearly all my work just from a "terminal" on either linux, mac or sometimes even windows. I use mainly vi, SQL-Developer (and occasionally some IDE or even notepad) as editors, and often still use print/f or dbms_output statements for debugging (showing my age eh...).



Non-functional "Requirements": try to include in Specification.

Requirements, Specifications, or whatever name you have, often state: "ingest these datasets" or "produce a list of gizmos+items in that format", or "return a set of records with this data..." . And that is expected. 

The un-spoken or so called "Non Functional Requirements" are often along the lines of:

1) Make sure it runs in future infrastructure (if we move to "the other cloud"...)

2) Allow running- and maintenance by our outsourced-vendor-resources.

2a) Include some (unit-, integration-) test capability. testset...

3) Make sure it runs inside 1 min (One Minute, even if 10G of xml-data....)

4) Produce some output or report to allow control or verification (and keep that)

5) Allow only "authorised" ppl/tools to run, and/or prevent DoS situations.

 etc...

Those requirements are often unspecified but implied when you are asked to do "just this simple program". But it helps to write them down (briefly!) even if nobody wants to read them. You may also need some trace of those Non-Functionals to justify the choices that you make (e.g. why didn't you build this program in Go, in C++, or as a bash-script using just grep and awk? ).

Note: The checklist of 1 to ... etc  above would be Tip-nr-0, if I was so pedantic as to start numbering at Zero...

So here come the "Generic Tips" (after weeding out half)...



Tip nr 1: Write it down, Briefly but Clearly

Why? Well, See above...

But also: the reader of your writings is probably yourself or your replacement-person in 3 years, or even in 20yrs (Yes.. I've seen them). Hence, it helps if you can introduce your program with a few lines of both the "functional" and "non-functional" requirements (what does it need to do!).

By writing (or just summarising) what your program needs to do, you can also "verify" it with the user, the architect or whoever is asking for it to be built.

You may also want to include some "justification", e.g. why you chose PL/SQL over Python or FORTRAN (hehehe).



Tip nr 2: Draw one or more ERDs. For better Understanding!

To understand a program: it rarely helps to read the source-code. But a look at the tables, or the ERD of the underlying data, can do miracles for understanding. I tend to create my ERDs from the tables in the database and rarely work with other models than the Table-Relationship-Diagram.

Even if you have very simple data, it helps to have an ERD. Not only will an ERD help you define tables (if you even use tables at all...) and the way those tables relate. But a brief documentation of each data-element will also allow you to communicate purpose and usage of the data-elements.

In creating the ERD you will most likely "discover" a few things about the data you process. I notably walk down the lines (relations) and verify things like .. "This order has zero, one, or more details...", and then ask "Is that Correct...?"

When you have to explain your program to colleagues, or have to ask "more questions" from the users: an ERD is a great tool to communicate, and much more readable than "open this source code" (the user may not have the dev-tools installed...).



Tip 3: SmartDB: Use. Your. Database (e.g. use SQL and PL/SQL)

If the data is moving into or out of an RDBMS: Consider using SQL and PL/SQL as much as possible.

Even if your organisation prefers using VMware virtual machines, or your architect prefers the use of "spot-instances", or k8s, or some serverless FaaS solution... If your data ends up in a database, that database is the Safest, the Longest-Lasting, and probably the Easiest to Maintain place to "store + run your program".

I recall my managers, in the 90s and 00s demanding things like "use FORTRAN to convert incoming data", or "only C will do this fast enough". In the end most of that code ended up being converted (reduced!) into PL/SQL with good results.

The excuses to not use a database are plenty:

- Database CPU is too expensive (maybe, but the database is often the most efficient place to process and keep Data)

- PL/SQL requires a DBA... (well, a pl/sql person, actually, so ..?)

-  This is so simple, you should use grep/awk for this (ok, try that. and then introduce a small change in requirements...or start processing GBs of data...)

I must say: I have broken this rule of #SmartDB many times, often at the expense of myself and my cstmr. Nothing lives as long as the database...

Of course, YMMV. In some cases FORTRAN, C++ or python is advised bcse of available tools or libraries (as was sometimes my case). The main Valid reason to do processing outside of the RDBMS is, in my opinion, the use of certain libraries for specific functions (math, geo, or some specific engineering- or scientific-purpose library).

But In General, by solving the problems "inside the database", the code was the most efficient and the easiest to maintain (some links to the praise of #SmartDB...?).



Tip 4: Provide some Test Cases.

This may sound trivial, but it can really help. I'm not specifically advocating TDD (Test Driven Development), or some other "framework" that may require a lot of methodology (and work). But you need some test-cases, if only to prove that the original code you delivered still works.

Ideally, you have scripts (test_setup.sh and test_run.sh).

And those tests should not "break" or otherwise affect your production data. Hence this needs some thought, and probably needs to be embedded into the processes of the customer. 

All cstmrs have Test-platforms. Some cstmrs even have separate platforms where they run Production systems.



Tip 5: Plan Liberally, every IT tweak takes more time than expected.

The best known book on this is probably "The Mythical Man Month" by F.P. Brooks. One of my early managers made me read it. Recommended!

My rule of thumb is to sometimes multiply the original time-estimate. For simple systems: multiply by 3.14 (approximately pi, Archimedes...). For more complicated systems: multiply by 9.8 (approx pi-squared, or g).

Most ppl can remember pi and g (earth gravity).  Because simply stating "I multiplied by 10" sounds rather un-scientific.



Tip x : Stick with the tool(s), unless...

I also tend to include a warning: stick with the tool (bash, python, sqlcl, pl/sql...). If every little maintenance-job completely re-writes the program (in Java, bcse that is the future standard... ), you end up with a hodgepodge of code. If some future coder Insists on re-writing (in .cpp using chatgpt...): think hard, and possibly find a coder that Does understand the original tools/languages it was written in.

When I was an engineering student we were taught some rules around robustness, and maintainability. Such as how to choose components or parts: 

- Fewer Components make a system more Robust.

- Fewer Different Components make a system easier to Maintain.

I never forgot those.. and those simple rules also apply to IT-systems.



Oops... Writing Too Much ? 

This text is getting way too long, nobody reads that much anymore (so I went back and weeded out half of it).

I could add several more rules. Notably around the fact that the Oracle RDBMS now includes several neat features that make your life easier while still staying at the #SmartDB level, e.g. "With the Database". There are now options to process JSON, to build REST-services (recommended), and to use APEX for a simple window-on-data (recommended). And you can now include ONNX models (new in Oracle 23ai!). But all those are above + beyond the scope of what I wanted to write.

So there.



Summary: 

Write Some Specs, for yourself and your colleagues.

Plan carefully (and maybe read The Mythical Man Month...)

Make an ERD, it helps the understanding and facilitates discussions

Use #SmartDB whenever possible, it is more efficient.

Provide some Test-cases.

Limit your nr of tools/languages, prevent code-chaos.


That is about it.. Next instalment should be : python Tips + python Code ...