Monday, 7 April 2025

Oracle 23ai and python: some simple code.

TL;DR: Python and Oracle 23ai: A few helpful ideas, programs and code-snippets that made my life with python and Oracle easier. This one is about Connecting and Connection-info.


Background:

Below are some of the other small pieces of code that I split off into separate py-files and that are included in most of my (oracle-db connected) programs.

Maybe they can help you.

As with every language, after a while you develop habits and small tools that (hopefully) make you more productive or more secure in your work. In my case, with the combination of python and Oracle, I have only built very simple programs. And I do nearly all my work just from a terminal on either linux, mac or sometimes even windows. I use vi (and occasionally notepad or an IDE) as editors, and often still use print/f or dbms_output statements for debugging (showing my age eh...).

Last year I built a small python program to print out the contents of vectors stored in Oracle (blog on "victor" here...). And I have a few more "tools" that I built to help in my occasional-program or in my regular dba-ish tasks.

Caveat 1: Of course you will build your own. I know ppl re-use code much better when they first type it themselves. I Know you will do this differently, and you will do it Much Better. 

Caveat 2: Don't expect high-tech: What I did with python is very simple, most of it is "pushed to the DBA" job. Things that I would call "plumber-stuff": process incoming files, updating some meta-data like LoV lists, produce simple interface-data, or very simple reports-to-mail. The kind of stuff that never seems to go away, no matter how sophisticated the new ecosystems become.


Environment-settings and other variables: python-dotenv.

As most programs have to work in various environments, I needed a way to quickly set or obtain the various variables for LOCAL, DEV, TEST, PRE and PROD. The easiest solution turned out to be a tool called python-dotenv.

If you don't know of this tool yet, read the intro (link), and play with it. 

I am still grateful to Gianni and Jeff and a few others who tipped me about this gizmo. Thx Folks!

In my case, it started by keeping the oracle-connect information plus a few other more system- or env-specific variables that I need to distinguish between various environments. It also coincided nicely with the preference of my cstmr to "keep stuff in config files".

And suddenly I found python-dotenv very tempting for multiple use-cases, and I started to hoard a lot of "settable variables" into the file. Very Useful.
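To illustrate the idea (variable names here are my own examples, not from any real system): python-dotenv essentially reads KEY=VALUE pairs from a .env file into the process environment. A minimal, self-contained sketch of roughly what load_dotenv() does, with the demo .env inlined as a string:

```python
import os

# A demo .env, inlined here so the sketch is self-contained.
# Variable names are my own examples, not from any real system.
ENV_TEXT = """\
# .env for the DEV environment
ORA_USER=scott
ORA_PWD=tiger
ORA_HOST=localhost
ORA_PORT=1521
ORA_SERVICE=freepdb1
"""

def load_env_text(text):
    """Roughly what python-dotenv's load_dotenv() does:
    put KEY=VALUE pairs into os.environ, skipping comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition('=')
        os.environ.setdefault(key.strip(), value.strip())

load_env_text(ENV_TEXT)
print(os.environ['ORA_USER'], '@', os.environ['ORA_SERVICE'])
```

In real code you would of course just `pip install python-dotenv` and call `load_dotenv()`, pointing it at a per-environment file (LOCAL, DEV, TEST, PRE, PROD).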


Open a db-connection, use a function (part 1)

As all my programs relate to an RDBMS, I need all programs to "connect" to the database. The information needed to connect was easily stored in a dotenv-file, but the creation of a connection needed some attention. For several reasons:

- We want credentials to be separate from code. Not hardcoded (and this time we mean it, looking at you Scott and Tiger..). Credentials need to be both safely stored and easily modifiable (some systems still change passwords every 32 days...)

- We want to verify that a connection is to the Correct, Intended database. Preferably to some logfile or stdout stream (check part 2).

The easiest part was to pick up the credentials from the dotenv-file. To this end, we created a function that picks up the connect-info (including tiger...) from an env-file. So far so good:

The first attempt looked like this:
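(The original post shows this code as an image.) As a stand-in, here is a hedged sketch of what such a logon function could look like, assuming the python-oracledb driver and dotenv variable names of my own choosing (ORA_USER, ORA_PWD, ORA_HOST, ORA_PORT, ORA_SERVICE):

```python
import os

def build_dsn(host, port, service):
    """Easy-connect syntax: host:port/service_name."""
    return f"{host}:{port}/{service}"

def ora_logon():
    """Open and return a db-connection; credentials come from the
    environment (loaded earlier via python-dotenv), not from code."""
    import oracledb  # the python-oracledb driver; import kept local in this sketch

    dsn = build_dsn(os.getenv('ORA_HOST', 'localhost'),
                    os.getenv('ORA_PORT', '1521'),
                    os.getenv('ORA_SERVICE', 'freepdb1'))
    user = os.getenv('ORA_USER')

    # print some info to stdout, just to verify (see part 2)
    print(' ora_logon: connecting as', user, 'to', dsn)

    return oracledb.connect(user=user,
                            password=os.getenv('ORA_PWD'),
                            dsn=dsn)
```

A caller then just writes `conn = ora_logon()`: the connection object is returned, no global needed.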

Notice several things:

- We print some of the info to stdout, just to verify. Comment it out if you like, but I Want this check. More Later.

- We use easy-connect syntax.

- The function returns the connection-object, the connection does not have to be a global object.

If you want more complicated connects, if you prefer to use tns-aliases, or complete tns-entries, or other types of dsn: that would require a slight re-code. But you can stick to the concept of using dotenv, it is able to hold multi-line variables. Hence you have a lot of options. I didn't have a need for more complicated connection-specifications yet, but in most cases, I would store the information in dotenv.
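For example (a hypothetical entry, with made-up host and service names): python-dotenv keeps quoted multi-line values intact, so a full tns-style connect descriptor fits in one variable:

```
# .env: a complete connect descriptor as one multi-line variable
ORA_DSN="(DESCRIPTION=
  (ADDRESS=(PROTOCOL=TCP)(HOST=dbhost.example.com)(PORT=1521))
  (CONNECT_DATA=(SERVICE_NAME=freepdb1)))"
```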


Open a db-connection, and Verify to stdout (part 2)

A very common mistake, and a rather Impactful Mistake, is to connect some infrequently used, adhoc-running metadata-update-tool to the wrong database... Ooops! (of course, in Your case, the passwords between Dev and Prod would be different, but some systems are made by Murphy... )

My previous blogpost on this was "how to set SQL-Prompt"..

https://simpleoracledba.blogspot.com/2023/06/do-you-know-where-you-are.html

And I wanted something similar for my python-programs.

Easy... In python, it is very simple to take any valid SELECT, have it executed, and print the results to stdout:

I inserted this snip of code into the ora_logon function:

  cursor = ora_conn.cursor()
  for row in cursor.execute ( sql_show_conn ):
      print ( ' ora_login:', row[1] )

And now I can define sql_show_conn with any select I want, notably the ones I use in SQLcl and SQL*Plus to show my connection:

My favorite is:

  # customize this sql to show connection info on logon
  sql_show_conn = """
    select 2 as ordr
         , 'version : ' || substr ( banner_full, -12) as txt
    from v$version
  UNION ALL
    select 1
         , 'user    : ' || user || ' @ ' || global_name || ' ' as txt
    FROM global_name gn
  UNION ALL
    SELECT 3
         , 'prompt  : ' || user
           || ' @ ' || db.name
           || ' @ ' || SYS_CONTEXT('USERENV','SERVER_HOST')
           || decode ( SYS_CONTEXT('USERENV','SERVER_HOST')
                , '98b6d46dd637',    ' (xe-dev)'
                , 'ip-nnn-nn-2-109', ' (PROD)'
                , 'ip-nnn-nn-0-226', ' (ACC)'
                , '2c4a51017a44',    ' (dckr-23ai)'
                , 'b88d087b53d3',    ' (dckr-23.7)'
                , ' (-envname-)')
           || ' > '
    FROM v$database db
  order by 1
  """


This is the equivalent of setting a fancy prompt in SQLcl to indicate "where you are" (link to old blog). 

This SQL, actually a UNION of three queries, roughly translates to: 1) show user and global_name, 2) show the version, and 3) use a fancy decode to show which environment you are at, which I often use to list out dev/test/acc/prod and other varieties we use.

Because at a certain cstmr, the pdb-names and global names and service-names are often identical over several environments, we use the container-name or host-name to identify "where we are", and decode that to some "meaningful name" like "PROD". 

And you can make the verification-query as fancy as you want (or just keep it simple). In some cases, it was turned into a local linux-style MOTD, and in some cases into a complete Welcome-Message including some instructions and warnings.

You can add items to this cursor to your heart's content, and the union + order trick allows you to re-arrange the outcomes. Knock yourself Out (some devs really do!)

The result, in my case, comes out like this:


The complete, include-able function, together with some test-code at the bottom of the file, can be found here: link to ora_login.py.


Summary: Feel free to re-use, but...

From thinking about and coding my own logon-function, using dotenv, I got some good benefits. 

But, big Caveat: Tooling is Personal, and YMMV.

Over time I have taken to using a lot of self-built tools and scripts, and some tools+utils I picked up from 't internet or other coders (and mostly modified for my own use).

I did find that it is difficult to get others to re-use tools that I crafted, or any tool from anyone else.

I have a few other snippets of python code that I (re-)use a lot, and I will see how those develop over time (if at all..). So far, if my code or just my ideas do inspire some of you to improve your code: Job done.

My main tip to others on code-reuse is generally: Read, Consider, Try, ... and in case of doubt:  DIY. 

The most (re)used scripts and code will probably be the "build your own" version.

Enjoy !

-- -- -- -- -- end of this blogpost, for now -- -- -- --

Monday, 10 March 2025

The Generic Tips (was: python and Oracle)

TL;DR: Some Very Basic Tips, to write better programs and tools. In my recent case: using python and Oracle. But in the handover to colleagues, I came to realise there are some tips+tricks that need re-iterating every few years. So Here Goes (again). 

My opinion is of course strongly "biased" towards an Oracle Database, but that is not always a bad thing.

The actual blog on my Oracle-python-tricks will have to wait a bit.. but it is in the pipeline (# mkfifo pythonstuff ; link to future blog...) 


The things that tend to wash up on the desk of a DBA...



Background: Been There Done That (have T-shirts).

Originally, I set out to describe some of the python programs (functions, really) that I (re)used to make my life and that of my cstmr easier. As it turned out, we ended up discussing some "habits" and practices rather than the actual code. So just a few generic tips, which I also used to explain things to cstmrs. Hence I wrote that up before I dipped into my python code.

YMMV and "Never say Never": The tips applied to myself and my cstmr-situation. But I came across enough situations to say they apply in many cases. However, Your Mileage May Vary. So if you think they do not apply to you, I am fine with that. I'm Not going to go religious on it.

My content here applies to the type of (simple-) jobs that tend to wash up on the desk of a DBA: Interfaces, ad-hoc reports, occasional maintenance-jobs, loading "meta-data"... that sort of thing.

Also, do not expect High-Tech. In my case, with the combination of python and Oracle, I have only built relatively simple programs. And I do nearly all my work just from a "terminal" on either linux, mac or sometimes even windows. I use mainly vi, SQL-Developer (and occasionally some IDE or even notepad) as editors, and often still use print/f or dbms_output statements for debugging (showing my age eh...). 



Non-functional "Requirements": try to include in Specification.

Requirements, Specifications, or whatever name you have, often state: "ingest these datasets" or "produce a list of gizmos+items in that format", or "return a set of records with this data..." . And that is expected. 

The un-spoken or so called "Non Functional Requirements" are often along the lines of:

1) Make sure it runs in future infrastructure (if we move to "the other cloud"...)

2) Allow running- and maintenance by our outsourced-vendor-resources.

2a) Include some (unit-, integration-) test capability. testset...

3) Make sure it runs inside 1 min (One Minute, even if 10G of xml-data....)

4) Produce some output or report to allow control or verification (and keep that)

5) Allow only "authorised" ppl/tools to run, and/or prevent DoS situations.

 etc...

Those requirements are often unspecified but implied when you are asked to do "just this simple program". But it helps to write them down (briefly!) even if nobody wants to read them. You may also need some trace of those Non-Functionals to justify the choices that you make (e.g. why didn't you build this program in Go, in C++, or as a bash-script using just grep and awk?).

Note: The checklist of 1 to ... etc  above would be Tip-nr-0, if I was so pedantic as to start numbering at Zero...

So here come the "Generic Tips" (after weeding out half)...



Tip nr 1: Write it down, Briefly but Clearly

Why? Well, See above...

But also: the reader of your writings is probably yourself or your replacement-person in 3 years, or even in 20yrs (Yes.. I've seen them). Hence, it helps if you can introduce your program with a few lines of both the "functional" and "non-functional" requirements (what does it need to do!).

By writing (or just summarising) what your program needs to do, you can also "verify" it with the user, the architect or whoever is asking for it to be built.

You may also want to include some "justification", e.g. why you choose PL/SQL over Python or FORTRAN (hehehe).



Tip nr 2: Draw one or more ERDs. For better Understanding!

To understand a program: it rarely helps to read the source-code. But a look at the tables, or the ERD of the underlying data, can do miracles for understanding. I tend to create my ERDs from the tables in the database and rarely work with other models than the Table-Relationship-Diagram. 

Even if you have very simple data, it helps to have an ERD. Not only will an ERD help you define tables (if you even use tables at all...) and the way those tables relate. But a brief documentation of each data-element will also allow you to communicate purpose and usage of the data-elements. 

In creating the ERD you will most likely "discover" a few things about the data you process. I notably walk down the lines (relations) and verify things like .. "This order has zero, one, or more details...", and then ask "Is that Correct...?"

When you have to explain your program to colleagues, or have to ask "more questions" from the users: an ERD is a great tool to communicate, and much more readable than "open this source code" (the user may not have the dev-tools installed...).



Tip 3: SmartDB: Use. Your. Database (e.g. use SQL and PL/SQL)

If the data is moving into or out of an RDBMS: Consider using SQL and PL/SQL as much as possible.

Even if your organisation prefers using VMware virtual machines, or your architect prefers the use of "spot-instances", or k8s, or some serverless FaaS solution... If your data ends up in a database, that database is the Safest, the Longest-Lasting, and probably the Easiest to Maintain place to "store + run your program".

I recall my managers, in the 90s and 00s demanding things like "use FORTRAN to convert incoming data", or "only C will do this fast enough". In the end most of that code ended up being converted (reduced!) into PL/SQL with good results.

The excuses to not use a database are plenty:

- Database CPU is too expensive (maybe, but the database is often the most efficient place to process and keep Data)

- PL/SQL requires a DBA... (well, a pl/sql person, actually, so ..?)

- This is so simple, you should use grep/awk for this (ok, try that. and then introduce a small change in requirements... or start processing GBs of data...)

I must say: I have broken this rule of #SmartDB many times, often at the expense of myself and my cstmr. Nothing lives as long as the database...

Of course, YMMV. In some cases FORTRAN, C++ or python is advised bcse of available tools or libraries (as was sometimes my case). The main Valid reason to do processing outside of the RDBMS is, in my opinion, the use of certain libraries for specific functions (math, geo, or some specific engineering- or scientific-purpose library).

But In General, by solving the problems "inside the database", the code was the most efficient and the easiest to maintain (some links to the praise of #SmartDB...?).



Tip 4: Provide some Test Cases.

This may sound trivial, but it can really help. I'm not specifically advocating TDD (Test Driven Development), or some other "framework" that may require a lot of methodology (and work). But you need some test-cases, if only to prove that the original code you delivered still works.

Ideally, you have scripts (test_setup.sh and test_run.sh).

And those tests should not "break" or otherwise affect your production data. Hence this needs some thought, and probably needs to be embedded into the processes of the customer. 

All cstmrs have Test-platforms. Some cstmrs even have separate platforms where they run Production systems.
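As a minimal illustration of the idea (the helper and its test are hypothetical, not from any real project): even a few plain assert-based tests, kept in a file next to the code and run with pytest, already prove that the delivered behaviour still works:

```python
# test_helpers.py -- run with: python -m pytest test_helpers.py

def clean_lov_code(raw):
    """Hypothetical helper: normalise a List-of-Values code before loading."""
    return raw.strip().upper()

def test_clean_lov_code():
    # the test doubles as documentation of the expected behaviour
    assert clean_lov_code('  nl ') == 'NL'
    assert clean_lov_code('Be') == 'BE'
```

Point such tests at a test-schema or test-platform, never at production data.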



Tip 5: Plan Liberally, every IT tweak takes more time than expected.

The best known book on this is probably "The Mythical Man Month" by F.P. Brooks. One of my early managers made me read it. Recommended!

My rule of thumb is to sometimes multiply the original time-estimate. For simple systems: multiply by 3.14 ( approximately pi, Archimedes...). For more complicated systems: multiply by 9.8 ( approx pi-square or g)

Most ppl can remember pi and g (earth gravity).  Because simply stating "I multiplied by 10" sounds rather un-scientific.
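The rule of thumb above as a tiny sketch (the function name is mine):

```python
import math

def plan_estimate(days, complicated=False):
    """Multiply the naive estimate: ~pi (3.14) for simple systems,
    ~pi**2 (~9.87, close to g) for complicated ones."""
    factor = math.pi ** 2 if complicated else math.pi
    return days * factor

print(round(plan_estimate(10), 1))        # simple system: ~31.4 days
print(round(plan_estimate(10, True), 1))  # complicated system: ~98.7 days
```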



Tip x : Stick with the tool(s), unless...

I also tend to include a warning: stick with the tool (bash, python, sqlcl, pl/sql...). If every little-maintenance job completely re-writes the program (in Java, bcse that is the future standard... ), you end up with a hodgepodge of code. If some future coder Insists on re-writing (in .cpp using chatgpt...): think hard, and possibly find a coder that Does understand the original tools/languages it was written in.

When I was an engineering student we were taught some rules around robustness, and maintainability. Such as how to choose components or parts: 

- Less Components make a system more Robust.

- Less Different Components make a system easier to Maintain.

I never forgot those.. and those simple rules also apply to IT-systems.



Oops... Writing Too Much ? 

This text is getting way too long, nobody reads that much anymore (so I went back and weeded out half of it).

I could add several more rules. Notably around the fact that the Oracle RDBMS now includes several neat features that make your life easier, while still staying at the #SmartDB level, e.g. "With the Database". There are now options to process JSON, to build REST-services (recommended), and the use of APEX for a simple window-on-data (recommended). And you can now include ONNX models (new in Oracle 23ai!). But all those are above + beyond the scope of what I wanted to write.

So there.



Summary: 

Write Some Specs, for yourself and your colleagues.

Plan carefully (and maybe read The Mythical Man Month...)

Make an ERD, it helps the understanding and facilitates discussions

Use #SmartDB whenever possible, it is more efficient.

Provide some Test-cases.

Limit your nr of tools/languages, prevent code-chaos.


That is about it.. Next instalment should be : python Tips + python Code ...