CALENDAR Functions in Oracle 26ai – Part 2

Do I Still Need a Time Dimension?

In the previous post, we explored the new CALENDAR functions introduced in Oracle 26ai — their syntax, strengths, and some subtle caveats. Now it’s time to look at the architectural question: if date logic has become easier to express directly in SQL, does that reduce the need for a dedicated time dimension in analytical models?

Availability and Performance Considerations

The new CALENDAR functions are SQL functions. They are not available in PL/SQL expressions the way direct date arithmetic is in procedural code, at least for now; knowing Oracle’s history, that might change in a future release. Their primary design target is analytical SQL. Compare the two calls in Listing 1: in PL/SQL, the new functions can only be used inside embedded SQL statements, so their execution context is always the SQL engine.

SQL> declare 
v_date date := date '2021-01-01';
begin
  dbms_output.put_line(CALENDAR_MONTH_OF_YEAR(v_date)); 
end;
/
Error at line 2
ORA-06550: line 4, column 24:
PLS-00201: identifier 'CALENDAR_MONTH_OF_YEAR' must be declared
ORA-06550: line 4, column 3:
PL/SQL: Statement ignored
Help: https://docs.oracle.com/error-help/db/ora-06550/

SQL> declare 
v_date date := date '2021-01-01';
v_n number;
begin
  SELECT CALENDAR_MONTH_OF_YEAR(v_date) INTO v_n;
  dbms_output.put_line(v_n);  
end;
/

PL/SQL procedure successfully completed.

Listing 1: CALENDAR functions are currently only available in the SQL engine

In simple benchmarks, the functions appear to be slightly slower than equivalent handcrafted expressions — roughly in the range of 5–10% in my tests. This is not surprising. The built-in functions likely cover more edge cases and perform additional validation compared to minimal DIY expressions.

In practice, this overhead is acceptable and not a reason to avoid using them. Especially in analytical workloads, clarity and correctness often outweigh marginal micro-optimizations. Moreover, local testing (e.g., Podman on Windows in my case) may distort absolute timing results.

What should be avoided, however, is encapsulating complex DIY date expressions—or even the new CALENDAR functions themselves—inside plain PL/SQL scalar functions and calling them from SQL.

In that case, each row evaluation introduces a context switch from the SQL engine to the PL/SQL engine. If the PL/SQL function itself executes SQL (as the second block in Listing 1 does), control switches back to the SQL engine again. The result is a SQL → PL/SQL → SQL roundtrip per row, which can become very expensive.

If abstraction is required, scalar SQL macros are the better alternative: they expand at parse time and preserve optimizer visibility. If the expressions are as simple as calling one or more nested SQL functions, enabling the SQL transpiler may also be a viable option; perhaps I will expand on that in a future post.

The more relevant performance question, however, is not whether CALENDAR functions are slightly slower than handcrafted expressions, but whether computing all attributes per row can compete with joining to a pre-populated time dimension.

Do I Still Need a Time Dimension?

Analytic queries typically run in one of three situations: “in place” over operational data that has not been specially prepared for the purpose; against an analytical data store without a dedicated data model (the one-big-table, or OBT, approach); or against a properly modeled star schema with a time dimension. In the first two cases you will probably need to do some date calculations “on the fly”, and this is where the new functions can be very handy. But does that mean we should use them instead of referencing a time dimension in a star/snowflake schema, just because it has become so much easier?

No! There are at least two good reasons not to. The first is maintainability: you typically reference the time dimension from your facts through a single date column or some kind of surrogate key, and your BI tools know about these connections. Whenever you decide to extend the time dimension, say with additional descriptors, extra hierarchy levels, or fiscal periods that were not on your agenda until then, you extend this single table, propagate the change to your BI tool, and everyone can start using it. With on-the-fly calculations, you would probably have to touch each and every report instead.

The second reason is performance: you populate your time dimension once, or perhaps regularly (which can even mean daily), but not as part of every query. So let’s check (see Listing 2) how fast a join with the time dimension table is compared to a calculation on the fly.

SQL> set timing on
SQL> SET AUTOTRACE TRACEONLY STATISTICS
SQL> SELECT
  2    t.calendar_year,
  3    t.calendar_quarter_number,
  4    t.calendar_quarter_desc,
  5    t.end_of_cal_quarter,
  6    t.calendar_month_number,
  7    t.calendar_month_desc,
  8    t.calendar_month_name,
  9    t.end_of_cal_month,
 10    t.calendar_week_number,
 11    t.week_ending_day,
 12    t.day_number_in_week,
 13    t.day_number_in_month,
 14    t.day_name,
 15    SUM(s.amount_sold) AS amount_sold
 16  FROM sales_bench s
 17  JOIN sh.times t
 18    ON t.time_id = s.time_id
 19  GROUP BY ALL;

1459 rows selected.

Elapsed: 00:00:02.96

Statistics
----------------------------------------------------------
         55  recursive calls
          0  db block gets
     221768  consistent gets
     221576  physical reads
          0  redo size
      79927  bytes sent via SQL*Net to client
       1418  bytes received via SQL*Net from client
         99  SQL*Net roundtrips to/from client
          5  sorts (memory)
          0  sorts (disk)
       1459  rows processed

SQL>
SQL>
SQL> SELECT
  2    CALENDAR_YEAR_NUMBER(s.time_id) AS calendar_year,
  3    CALENDAR_QUARTER_OF_YEAR(s.time_id) AS calendar_quarter_number,
  4    CALENDAR_QUARTER(s.time_id,'YYYY-"0"Q')  AS calendar_quarter_desc,
  5    CALENDAR_QUARTER_END_DATE(s.time_id) AS end_of_cal_quarter,
  6    CALENDAR_MONTH_OF_YEAR(s.time_id) AS calendar_month_number,
  7    CALENDAR_MONTH(s.time_id, 'YYYY-MM') AS calendar_month_desc,
  8    TRIM(REPLACE(CALENDAR_MONTH(s.time_id, 'MonthSYYYY'), CALENDAR_YEAR(s.time_id))) AS calendar_month_name,
  9    CALENDAR_MONTH_END_DATE(s.time_id) AS end_of_cal_month,
 10    CALENDAR_WEEK_OF_YEAR(s.time_id) AS calendar_week_number,
 11    CALENDAR_WEEK_END_DATE(s.time_id) AS week_ending_day,
 12    CALENDAR_DAY_OF_WEEK(s.time_id, 'POSITION') AS day_number_in_week,
 13    CALENDAR_DAY_OF_MONTH(s.time_id) AS day_number_in_month,
 14    TRIM(REPLACE(CALENDAR_DAY(s.time_id, 'DayWWYYYY'), CALENDAR_WEEK(s.time_id, 'WWYYYY'))) AS day_name,
 15    SUM(s.amount_sold) AS amount_sold
 16  FROM sales_bench s
 17  GROUP BY ALL;

1459 rows selected.

Elapsed: 00:01:55.78

Statistics
----------------------------------------------------------
         14  recursive calls
          5  db block gets
     221617  consistent gets
     221576  physical reads
        972  redo size
      79133  bytes sent via SQL*Net to client
       1418  bytes received via SQL*Net from client
         99  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       1459  rows processed

Listing 2: Time dimension vs. on-the-fly calculation

Roughly three seconds compared to almost two minutes. The difference is not that CALENDAR functions are inherently slow, but that computing multiple date attributes per row cannot compete with joining to a precomputed dimension.

I deliberately selected many columns, so one could argue that the large number of function calls amplifies the effect and that real-world queries might use fewer attributes. However, once the join to the time dimension is performed, retrieving additional columns adds almost no incremental cost, whereas each on-the-fly attribute must be computed row by row.

Conclusion

CALENDAR functions are a valuable addition for ETL development, refactoring legacy SQL, and ad-hoc analytical queries where readability matters. They encapsulate complex fiscal and retail logic and can also be extremely useful when populating a time dimension itself. However, in established star-schema environments, they complement rather than replace a properly designed time dimension.

One more thing I was happy to read was that CALENDAR functions are “part of the broader Select for Analysis (SfA) initiative, designed to reduce verbosity and improve readability.” I’m eager to see more of these features in future releases.
