Online Statistics Gathering has been introduced in Oracle 12c. I think it is a very important building block to establish a consistent approach in dealing with optimizer statistics in your ETL jobs. I have already written some posts about it a while ago (first, second and third). When I was preparing a talk about optimizer statistics for data engineers recently, I took a closer look at it again. Oracle has now continuously improved the feature over several versions and has lifted many restrictions. Let’s see what I mean.
Continue readingSQL macros and the WITH clause
SQL macros and WITH clause are not known to be great friends: you cannot call a SQL macro in a WITH clause and if you want to define a table macro returning a query containing a WITH subquery(ies), then you won’t be able to use scalar parameters in this subquery. In the previous post I tried to explain another restriction from the technical perspective looking at the SQL macro expansion trace information. In this post I will try to understand the WITH clause related limitations using the same approach.
Continue readingSQL Macros – Changing The Query Shape Using Parameters?
One thing I learned early on when I was getting to know SQL macros is that we cannot replace each and every part of a SQL statement with a macro. In the same way, working with table SQL macros, we cannot represent just any part of the result string by referencing parameters inside it. Sometimes we need to use string concatenation. But why it behaves this way? In this post I will look at the available trace information about SQL macro expansion during hard parse and try to understand its impact on this restriction. After that we’ll also look how to overcome it and what to consider.
Continue readingObject Statistics in Your Data Pipelines (Part 2)
For me, building efficient data pipelines means thinking about performance from day one. And that means a conscious and thoughtful approach to optimizer statistics. It is not just this Autotask, which always runs in the database in the evenings or at weekends and is the responsibility of the DBAs. Every data engineer knows the data to be processed better, knows when and how it is loaded and processed. They should also be better able to recognize when and whether the statistics are critical at certain process steps and therefore when and how the statistics should be collected in the best possible way. That’s their job too! This is my point of view, which I would like to explain and emphasize with the posts in this series
Continue readingObject Statistics in Your Data Pipelines
As my electrical engineering lecturer at university used to joke, there are only two faults in electricity: 1) No contact where it is needed. 2) There is contact where it is not needed. You can often think the same way when looking for the causes of a poor execution plan: no stats were gathered when the optimizer needed them for a good execution plan or someone gathered stats when it was inappropriate.
Continue readingParameterized Views with SQL Macros (Part 2)
One of the most mentioned use cases for table SQL macros is the implementation, which works like a parameterized view. My first post on this is from 2020 and it ends with a thought that we have many advantages here but no real view at all. And indeed, I have heard this argument several times since then. Views are schema objects that we have been familiar with for decades, they are very easy to develop and deploy, and their query code is easily accessible via data dictionary. On the other hand, maintaining query code within SQL macros can be a bit more difficult and understandably not to everyone’s preference. But what if you could have a cake and eat it too? Here is how…
Continue readingSQL Macros – Some Less Obvious Facts (Part 2)
This second part of the “less obvious facts” series is about the COLUMNS pseudo-operator. It has been introduced in 18c as one of the variadic pseudo-operators, intended to operate with a variable number of operands. Unfortunately, no other pseudo-operators have been introduced since then, so that it is worth looking at what exactly the COLUMNS operator is, how to use it and how it can potentially be “misused”.
Continue readingSQL Macros – Some Less Obvious Facts (Part 1)
Since I have written a lot about SQL macros, I have tested some less obvious, sometimes surprising cases. But expanding about them in the original posts would just overload them, so I decided to make a dedicated post for that. The other good reason for this is that the SQL macros are, in my opinion, not documented in enough detail. So additional research and testing can be useful.
Continue readingSQL Macros – what is in the CBO trace?
In the previous post I showed how to get to the final query after SQL macro expansion using CBO trace. But there are still a few interesting things in the trace file that are worth looking at.
Continue readingBeware of NO_DATA_FOUND in your PTF!
A few days ago I received a comment/question on the older post about dynamically transposing rows to column with Polymorphic Table Functions (PTF). Back then I overlooked a bug in the example code, but the explanation takes a bit longer, so i decided to write a new post about it. The PTF was working initially but after inserting new data started to return wrong results – all NULL’s for some columns where we know there are actually values present. So, what’s going on?
Continue reading