SQL macros in Oracle 19c

Now that I have created a patched Oracle database docker image with Release Update 19.8 (19.8.0.0.DBRU:200714) to play with, I can finally start testing the backport of SQL macros! (UPDATE: As of now you can play with SQL macros on Autonomous Databases in Oracle Cloud (also free tier!) and on livesql.oracle.com)

SQL Macros have been introduced in Oracle 20c. As of now, Oracle 20c is only available as preview version in Oracle Cloud. That’s why I was quite excited when Connor McDonald pointed out on Twitter that they were backported to 19c. Let’s see if it works, and how.

It is a bug 30324180, where the backport is mentioned. And the fix is first included in 19.6. So, strictly speaking, you do not need 19.8 that I’m working with – 19.6 should also be just fine. (UPDATE: 19.6 is mentioned in the bug document, but I was able to run the example from this post on an ADW database in the cloud with 19.5 right now)

Let’s start with a basic example from my previous post – a scalar macro function JOB_DURATION:

SQL> CREATE OR REPLACE FUNCTION job_duration RETURN VARCHAR2 
  2  SQL_MACRO(SCALAR) 
  3  AS
  4  BEGIN
  5     RETURN q'! FLOOR(MONTHS_BETWEEN (SYSDATE, hiredate)/12) !';
  6  END;
  7  /

Function JOB_DURATION compiled

LINE/COL  ERROR
--------- -----------------------------------------------
2/10      PLS-00103: Encountered the symbol "(" 
when expecting one of the following:     ...

Listing 1: no scalar macros in 19c

Couldn’t compile… Why? The error message complains about the parenthesis around SCALAR, but not about the word SQL_MACRO. There is a good hint in the description of the “backport” bug 30324180 to understand this. Scalar macros were intentionally not backported! So you only have table semantic macros (which are also the default in 20c) and it seems that you cannot explicitly specify the type. The good news here is that you don’t have to change anything when you migrate to 20c. So let’s focus on the table semantic functions then.

Get your data CSV formatted

As an example, I wanted to take something that would be useful in itself and at the same time can show the other capabilities of what can be achieved with table SQL macros. Since I work a lot as an ETL engineer, I am often faced with the task of exporting data as CSV. Not my favorite format, to be honest, especially because you lose a lot of metadata. But there are often project requirements you have no control over. And if you have to export data as CSV, it is very important that it is done consistently in the same way in larger projects. And that’s where SQL macros can be very helpful! Let’s see why and how.

Our goal is to develop a generic function:

  • which we can feed with arbitrary data stream
  • which handles incoming data in a consistent way according to established conventions
  • in particular, the function should produce a header row with column names, format all date and timestamp fields according to ISO 8601 and enclose all text fields with quotation marks, whereby these must first be “escaped” in the text
  • which we can query from getting CSV-formatted output

For example having the view EMPV defined in previous posts, the invocation will be like this:

SQL> select * from get_csv(empv);

CSV_ROW                                                                                             
----------------------------------------------------------------------------------------------------
"EMPVID","EMPNO","ENAME","JOBNO","MGR","HIREDATE","SAL","COMM","DEPTNO","VALID_FROM","VALID_TO"
2,7499,"ALLEN",2,7698,1981-02-20,1600,300,30,1981-02-20,1989-12-31
3,7521,"WARD",2,7698,1981-02-22,1250,500,30,1981-02-22,1989-12-31
4,7566,"JONES",4,7839,1981-04-02,2975,,20,1981-04-02,1989-12-31

Listing 2: the desired usage and output

What then happens with this output, for example, persisting it as a file, is out of scope here.

I wrote about the basics of table SQL macros in a couple of previous posts, for example about building hash keys with SQL macros, so I’m not going to repeat all explanations but rather to show that things are working exactly the same in 19c.

create or replace function get_csv (p_input_data dbms_tf.table_t
                                   ,p_delimiter varchar2 default ','
                                   ,p_quote varchar2 default '"'
                                   ,p_escape varchar2 default '\'
                                   ) return clob sql_macro as
v_sql clob;
v_name dbms_id;
v_header clob;
v_delimiter_rec varchar2(9):= '||'''||p_delimiter||'''||';
v_record clob;
v_rec varchar2(300);
begin
  for i in 1..p_input_data.column.count loop
    v_name := p_input_data.column(i).description.name;
    if p_input_data.column(i).description.type 
         in (dbms_tf.type_varchar2
           , dbms_tf.type_char
           , dbms_tf.type_clob) then 
      if p_quote is not null then 
          -- escape quotation characters in text fields
          v_rec := 'replace('||v_name||','''||p_quote||''','''||p_escape||p_quote||''')';
          -- enclose text in quotation marks
          v_rec :=  ''''||p_quote||'''||'||v_rec||'||'''||p_quote||'''';
      else 
          v_rec := v_name;
      end if;
    elsif p_input_data.column(i).description.type = dbms_tf.type_number then
      v_rec := 'to_char('||v_name||')';
    elsif p_input_data.column(i).description.type = dbms_tf.type_date then
      v_rec := 'to_char('||v_name||',''YYYY-MM-DD'')';
    elsif p_input_data.column(i).description.type = dbms_tf.type_timestamp then
      v_rec := 'to_char('||v_name||',''YYYY-MM-DD"T"HH24:MI:SS.FF6'')';      
    end if;
    v_header := v_header || p_delimiter || v_name;
    v_record := v_record || v_delimiter_rec || v_rec;
  end loop;
  v_header := SUBSTR(v_header, LENGTH(p_delimiter) + 1);
  v_record := SUBSTR(v_header, LENGTH(v_delimiter_rec) + 1);
  v_sql := q'[
     with d as (
       select '%PH_HEADER%' as csv_row, 1 as row_order  from dual 
       union all 
       select %PH_RECORD% as csv_row , 2 as row_order from get_csv.p_input_data
       ) 
     select csv_row from d order by row_order
           ]';
  v_sql := replace(replace (v_sql, '%PH_RECORD%', v_record), '%PH_HEADER%' , v_header);
  return v_sql;
end;
/

Listing 3: Implementation of SQL macro function GET_CSV

Let’s look at the implementation in more detail.

  • Line 1-5: Declare the function as sql_macro. If we want a generic function capable to work with arbitrary input data sets we need a parameter of type DBMS_TF.TABLE_T. This is the way we get metadata about the data set helping us to implement the required logic. As with any other PL/SQL function, we can also declare “scalar” parameter when needed along with their default values as well.
  • Line 13: Loop over the column collection we got in P_INPUT_DATA. This contains column names and also data types
  • Lines 15-33: Implement business requirements for handling different data types, e.g. escaping characters, quoting texts, formatting dates, etc. These rules can be as simple or as complex as you need. Important is that you can establish them only once and expose through SQL macro function
  • Lines 34-38: We prepare the strings containing a select list for the column header and the data itself: in a loop and also some post processing
  • Lines 39-47: Now we are ready to build the result string that will be returned from the function. It should be a valid SQL query that will be substituted in in FROM clause of the original query instead of a SQL macro function invocation. In our case this is a UNION ALL combining the header row and the actual formatted data. Ordering by artificial column row_order helps us to output header first. IMPORTANT! Note how we are using the parameter P_INPUT_DATA at line 43! It doesn’t make sense at first sight. But keep in mind, this reference will be substituted with the real table name or view or subquery, whatever you have used in original query as a parameter for the function. EVEN MORE IMPORTANT! This substitution seems only to take place upon returning the string from the function. If you were using the parameter P_INPUT_DATA elsewhere in the function, for instance, to query some data from the table for some reason, this wouldn’t work. See an example in my post about dynamic pivot.

Now we can call the function GET_CSV and produce the desired output:

SQL> select * from get_csv(empv);

CSV_ROW                                                                                             
----------------------------------------------------------------------------------------------------
"EMPVID","EMPNO","ENAME","JOBNO","MGR","HIREDATE","SAL","COMM","DEPTNO","VALID_FROM","VALID_TO"
2,7499,"ALLEN",2,7698,1981-02-20,1600,300,30,1981-02-20,1989-12-31
3,7521,"WARD",2,7698,1981-02-22,1250,500,30,1981-02-22,1989-12-31
4,7566,"JONES",4,7839,1981-04-02,2975,,20,1981-04-02,1989-12-31
5,7654,"MARTIN",2,7698,1981-09-28,1250,1400,30,1981-09-28,1989-12-31
6,7698,"BLAKE",4,7839,1981-05-01,2850,,30,1981-05-01,1989-12-31

Listing 4: the function works as designed

If you have already learned a little about SQL macros, you would know, that rather a different query is executed at runtime. Our function GET_CSV will run only at parse time and return a piece of SQL the original query will select from. How to see what will actually be executed? In Oracle 19c, as well as in Oracle 20c, I have yet only found one way to do this: using DBMS_UTILITY.expand_sql_text. I have formatted the output a bit:

SQL> DECLARE
  2    l_clob CLOB;
  3  BEGIN
  4    DBMS_UTILITY.expand_sql_text (
  5      input_sql_text  => q'[select * from get_csv (empv)]',
  6      output_sql_text => l_clob  );
  7    DBMS_OUTPUT.put_line(l_clob);
  8  END;
  9  /

PL/SQL procedure successfully completed.

SELECT "A1"."CSV_ROW" "CSV_ROW" 
FROM  (
  SELECT "A4"."CSV_ROW" "CSV_ROW" 
  FROM  (
     SELECT "A5"."CSV_ROW" "CSV_ROW" 
     FROM  ( 
       (SELECT '"EMPVID","EMPNO","ENAME","JOBNO","MGR","HIREDATE","SAL","COMM","DEPTNO","VALID_FROM","VALID_TO"' "CSV_ROW",1 "ROW_ORDER" 
        FROM "SYS"."DUAL" "A7") 
        UNION ALL  
       (SELECT TO_CHAR("A6"."EMPVID")||','||
               TO_CHAR("A6"."EMPNO")||','||
               '"'||REPLACE("A6"."ENAME",'"','\"')||'"'||','||
               TO_CHAR("A6"."JOBNO")||','||
               TO_CHAR("A6"."MGR")||','||
               TO_CHAR("A6"."HIREDATE",'YYYY-MM-DD')||','||
               TO_CHAR("A6"."SAL")||','||
               TO_CHAR("A6"."COMM")||','||
               TO_CHAR("A6"."DEPTNO")||','||
               TO_CHAR("A6"."VALID_FROM",'YYYY-MM-DD')||','||
               TO_CHAR("A6"."VALID_TO",'YYYY-MM-DD') "CSV_ROW"
               , 2 "ROW_ORDER" 
        FROM  (
            SELECT "A3"."EMPVID" "EMPVID","A3"."EMPNO" "EMPNO"
                    ,"A3"."ENAME" "ENAME","A3"."JOBNO" "JOBNO"
                    ,"A3"."MGR" "MGR","A3"."HIREDATE" "HIREDATE"
                    ,"A3"."SAL" "SAL","A3"."COMM" "COMM"
                    ,"A3"."DEPTNO" "DEPTNO","A3"."VALID_FROM" "VALID_FROM"
                    ,"A3"."VALID_TO" "VALID_TO" 
            FROM "ONFTEST"."EMPV" "A3"
               ) "A6"
        )
     ) "A5" 
     ORDER BY "A5"."ROW_ORDER"
   ) "A4"
) "A1"

Listing 5: what happens in the background

As you can see, Oracle does a lot of work in background rewriting the original query. Just by comparing the original and the actually executed query, the key advantage of SQL macros becomes obvious. We can implement complex business logic in a consistent way and provide a simple interface through a SQL macro function. This business logic is not encapsulated in a PL/SQL function at run time (causing the context switch) but fully exposed to the database and query optimizer. The SQL macro function is not tied to particular table or view but can be generically used with different data sets, even subqueries.

Speaking of subqueries, you cannot pass them directly as the following listing shows, but you can do it with named queries defined in WITH clause.

SQL> select * from get_csv ((select empno, ename from empv));

SQL Error: ORA-00913: too many values

SQL> 
SQL> 
SQL> with subq as (select empno, ename from empv)
  2  select * from get_csv(subq);

CSV_ROW                       
------------------------------
"EMPNO","ENAME"
7499,"ALLEN"
7521,"WARD"
7566,"JONES"
7654,"MARTIN"
7698,"BLAKE"
...

Listing 6: using with a subquery

Looks good, but I also came across something that looks like a bug to me. When using a subquery as an input for SQL macro and generating some fields directly in this subquery, it will throw ORA-62558, even for some supported data types. In the following listing we can see no problem with the column HIRE_DATE coming from the table, but generating another date column using sysdate is not possible. Converting the generated date to string solves the problem though.

SQL> -- this works
SQL> with subq as (select empno, ename, hiredate 
  2                from empv 
  3                fetch first 5 rows only)
  4  select * from get_csv(subq);

CSV_ROW                                           
--------------------------------------------------
"EMPNO","ENAME","HIREDATE"
7654,"MARTIN",1981-09-28
7521,"WARD",1981-02-22
7566,"JONES",1981-04-02
7369,"SMITH",1980-12-17
7499,"ALLEN",1981-02-20

6 rows selected. 

SQL> 
SQL> -- doesn't work
SQL> with subq as (select empno, ename, hiredate, sysdate as extraction_dt 
  2                from empv
  3                fetch first 5 rows only)
  4  select * from get_csv(subq);

Error at Command Line : 235 Column : 15
Error report -
SQL Error: ORA-62558: Unsupported data types (DATE) for polymorphic table function
62558. 00000 -  "Unsupported data types (%s) for polymorphic table function"
*Cause:    Non-scalar data types were used in polymorphic table function.
           Only scalar data types are supported.
*Action:   Use only scalar data types in polymorphic table function.
SQL> 
SQL> -- works after casting to char datatype
SQL> with subq as (select empno, ename, hiredate
  2                     , to_char(sysdate,'yyyy-mm-dd') as extraction_dt 
  3                from empv
  4                fetch first 5 rows only)
  5  select * from get_csv(subq);

CSV_ROW                                           
--------------------------------------------------
"EMPNO","ENAME","HIREDATE","EXTRACTION_DT"
7654,"MARTIN",1981-09-28,"2020-10-03"
7521,"WARD",1981-02-22,"2020-10-03"
7566,"JONES",1981-04-02,"2020-10-03"
7369,"SMITH",1980-12-17,"2020-10-03"
7499,"ALLEN",1981-02-20,"2020-10-03"

6 rows selected. 

Listing 7: bug with subquery-generated columns

I could experience the same behavior with other data types like TIMESTAMP or INTERVAL and not only with SQL macros but also with polymorphic table functions as well. I think I should open an SR for this. But I don’t really think it is a show stopper for using SQL macros.

UPDATE: I’ve investigated the reasons for this behavior in the follow-up post Using Subqueries and Views with PTF or SQL macros. After all, I’m not sure, whether it is a bug at all. You can better workaround the problem by casting SYSDATE as DATE again.

As you can probably guess, I have become a big fan of SQL macros over the last few months. Okay, they are to be taken with a certain amount of caution, but they bring clear advantages. All in all, I am very excited to be able to work with SQL macros in 19c now. After all, 19c will be the release that I will have to work with for a few more years at most of my customers.

Related Posts

8 thoughts on “SQL macros in Oracle 19c

  1. Randolf Eberle-Geist

    Hi Andrej,

    I’ve just tried to reproduce the GET_CSV SQL Macro function on 19c, and I get varying results.

    In 19.7 the GET_CSV function in principle works as shown here, but when I try to specify non-default parameters, for example for “p_delimiter”, like “SELECT * FROM GET_CSV(, p_delimiter => ‘;’)”, then “p_delimiter” actually seems to be NULL inside the function and the result is unexpected in that sense that the header row is blank and the delimiter is NULL instead of “;”.

    Testing the same on “LiveSQL” which runs 19.8 the GET_CSV function doesn’t work at all and returns just “ORA-64626: invalid SQL text returned from SQL macro: “. I haven’t debugged further, but my actual point is that I face a similar problem in a different implementation, in that sense that passed parameters into the function are NULL.

    Can you confirm this or do you see different behaviour in 19c and if yes, which version?

    Do you happen to have a 20c instance available and can confirm that specifying non-default parameters for GET_CSV works as expected?

    Thanks,
    Randolf

    Reply
    1. admin Post author

      Hi Randolf,
      I can confirm that scalar function parameters are NULL inside the function in 19.8! That was not the case in 20c as far as I remember and as far as I can see from my previous posts. Unfortunately, I don’t have access to 20c right now. But maybe I can test this in a few days.
      Thanks,
      Andrej

      Reply
  2. Randolf Eberle-Geist

    Hi Andrej,

    some more information: The problem I got on “LiveSQL” was related to a bug I think in the GET_CSV function – using LTRIM in an inappropriate way when cutting off the leading delimiters from V_HEADER and V_RECORD. I replaced this with a SUBSTR expression instead which worked around the problem – for example V_RECORD := SUBSTR(V_RECORD, LENGTH(V_DELIMITER_REC) + 1) instead of V_RECORD := LTRIM(V_RECORD, V_DELIMITER_REC). The problem is that LTRIM treats the second argument as *set* of characters that will be trimmed, most people assume it uses the *string* to cut off. So LTRIM might trim too many characters in some cases – in particular when a character column comes as first column.

    The original problem remains: Specifying any non-default parameters for the scalar function parameters like p_delimiter leads to unexpected results, in that sense that the parameter value seems to become NULL inside the function… Any ideas?

    Thanks,
    Randolf

    Reply
    1. admin Post author

      Hi Randolf,
      Thank you for pointing this out. Funny, actually I know this, but bad habits are hard to overcome. I’ve corrected the code. But the problem passing scalar parameters remains.
      Regards
      Andrej

      Reply
  3. connor mcdonald

    The nulling of such parameters is intentional – if we let you use the *value* within the parameter, then the SQL injection possibilities are huge.
    The intent of these parameters that you can reference their *name* within the generated SQL string, and we will do the necessary translation at execution time.
    For example, to (say) concatenate 2 columns with delimiter “p_delim” I would NOT do the following: (I’m using concat to help keep the quote nesting etc at bay :-))

    sql := concat(
    ‘col1’,
    ‘||’,
    ””,
    p_delim, <=== ie, the *value* of p_delim
    '''',
    '||'
    'col2')

    ie, with the intent of generating the string: col1||'-'||col2

    I would do this:

    sql := concat(
    'col1',
    '||',
    'p_delim', <=== note in quotes, this is the *string* p_delim
    '||'
    'col2')

    to generate this: col1||p_delim||col2

    Then at *execution* time, you would the output of:

    john-smith
    mike-jones
    etc etc

    Reply
  4. connor mcdonald

    Thus we can tweak your (excellent) CSV routine to look something like the below

    SQL> set pagesize 30000
    SQL> set echo on
    SQL> create or replace
      2   function get_csv(p_input_data dbms_tf.table_t
      3   ,p_delimiter varchar2 default ','
      4   ,p_quote varchar2 default '"'
      5   ,p_escape varchar2 default '\'
      6   ) return clob sql_macro as
      7   v_sql varchar2(32767);
      8   v_name varchar2(200);
      9   v_header varchar2(32767);
     10   v_delimiter_rec varchar2(24):= '||p_delimiter||';
     11   v_record varchar2(32767);
     12   v_rec varchar2(300);
     13   begin
     14   for i in 1..p_input_data.column.count loop
     15   v_name := p_input_data.column(i).description.name;
     16  
     17   if p_input_data.column(i).description.type
     18   in (dbms_tf.type_varchar2
     19   , dbms_tf.type_char
     20   , dbms_tf.type_clob)
     21   then
     22   if p_quote is not null then
     23   -- escape quotation characters in text fields
     24   v_rec := 'replace('||v_name||','''||p_quote||''','''||p_escape||p_quote||''')';
     25   -- enclose text in quotation marks
     26   v_rec :=  ''''||p_quote||'''||'||v_rec||'||'''||p_quote||'''';
     27   else
     28   v_rec := v_name;
     29   end if;
     30   elsif p_input_data.column(i).description.type = dbms_tf.type_number then
     31   v_rec := 'to_char('||v_name||')';
     32   elsif p_input_data.column(i).description.type = dbms_tf.type_date then
     33   v_rec := 'to_char('||v_name||',''YYYY-MM-DD'')';
     34   elsif p_input_data.column(i).description.type = dbms_tf.type_timestamp then
     35   v_rec := 'to_char('||v_name||',''YYYY-MM-DD"T"HH24:MI:SS.FF6'')';
     36   end if;
     37   v_header := v_header || '||p_delimiter||' ||''''|| v_name ||'''';
     38   v_record := v_record || v_delimiter_rec || v_rec;
     39  
     40   end loop;
     41   v_header := SUBSTR(v_header, LENGTH('||p_delimiter||') + 1);
     42   v_record := SUBSTR(v_record, LENGTH(v_delimiter_rec) + 1);
     43  
     44   v_sql := q'[
     45   select @PH_HEADER@ as csv_row from dual
     46   union all
     47   select @PH_RECORD@ as csv_row from get_csv.p_input_data
     48   ]';
     49  
     50   v_sql := replace(replace (v_sql, '@PH_RECORD@', v_record), '@PH_HEADER@' , v_header);
     51  
     52   DBMS_OUTPUT.put_line('v_sql='||v_sql);
     53  
     54   return v_sql;
     55   end;
     56  /
    
    Function GET_CSV compiled
    
    SQL> 
    SQL> set serveroutput on
    SQL> 
    SQL> create or replace
      2  view empv as select empno, ename from scott.emp;
    
    View EMPV created.
    
    SQL> 
    SQL> select * from get_csv(p_input_data=>empv);
    
    CSV_ROW                                                        
    ---------------------------------------------------------------
    "EMPNO","ENAME"
    7369,"SMITH"
    7499,"ALLEN"
    7521,"WARD"
    7566,"JONES"
    7654,"MARTIN"
    7698,"BLAKE"
    7782,"CLARK"
    7788,"SCOTT"
    7839,"KING"
    7844,"TURNER"
    7876,"ADAMS"
    7900,"JAMES"
    7902,"FORD"
    7934,"MILLER"
    
    15 rows selected. 
    
    SQL> 
    SQL> select * from get_csv(p_input_data=>empv,p_delimiter=>':');
    
    CSV_ROW                                                        
    ---------------------------------------------------------------
    "EMPNO":"ENAME"
    7369:"SMITH"
    7499:"ALLEN"
    7521:"WARD"
    7566:"JONES"
    7654:"MARTIN"
    7698:"BLAKE"
    7782:"CLARK"
    7788:"SCOTT"
    7839:"KING"
    7844:"TURNER"
    7876:"ADAMS"
    7900:"JAMES"
    7902:"FORD"
    7934:"MILLER"
    
    15 rows selected. 
    
    
    Reply
  5. p3consulting

    Under 21c, I got problem with “select @PH_RECORD@ as csv_row from get_csv.p_input_data”
    replacing by “select @PH_RECORD@ as csv_row from p_input_data” solved the issue.

    Also a simple improvement should to use NLS parameters instead of hardcoded format for date and timestamp (and add support fo r dbms_tf.type_timestamp_tz)

    v_date_fmt varchar2(128);
    v_timestamp_fmt varchar2(128);
    v_timestamptz_fmt varchar2(128);
    begin
    select value into v_date_fmt from v$nls_parameters
    where parameter = ‘NLS_DATE_FORMAT’ ;
    select value into v_timestamp_fmt from v$nls_parameters
    where parameter = ‘NLS_TIMESTAMP_FORMAT’ ;
    select value into v_timestamptz_fmt from v$nls_parameters
    where parameter = ‘NLS_TIMESTAMP_TZ_FORMAT’ ;

    elsif p_input_data.column(i).description.type = dbms_tf.type_date then
    v_rec := ‘to_char(‘||v_name||’,”’ || v_date_fmt || ”’)’;
    elsif p_input_data.column(i).description.type = dbms_tf.type_timestamp then
    v_rec := ‘to_char(‘||v_name||’,”’ || v_timestamp_fmt || ”’)’;
    elsif p_input_data.column(i).description.type = dbms_tf.type_timestamp_tz then
    v_rec := ‘to_char(‘||v_name||’,”’ || v_timestamptz_fmt || ”’)’;

    Reply

Leave a Reply to admin Cancel reply

Your email address will not be published. Required fields are marked *