SQL Macros – Some Less Obvious Facts (Part 2)

This second part of the “less obvious facts” series is about the COLUMNS pseudo-operator. It has been introduced in 18c as one of the variadic pseudo-operators, intended to operate with a variable number of operands. Unfortunately, no other pseudo-operators have been introduced since then, so that it is worth looking at what exactly the COLUMNS operator is, how to use it and how it can potentially be “misused”.

TL;DR

If you don’t have much time to read a long story here are the key facts you can read and decide whether you should take the time ;-):

  • COLUMNS is intended to pass columns lists, it will be converted to a collection at parse time – you have to process this collection accordingly
  • As long as you are not using quoted identifiers for the columns involved in your query you can pass some additional information like ASC/DESC if you enclose in double quotes by yourself
  • COLUMNS can also be used for passing column names along with data types
  • COLUMNS operator can be used with scalar SQL macros as well

COLUMNS pseudo operator

It was originally intended for use with PTF’s. As we can see, these two topics are closely linked. The SQL macros seem to use the PTF infrastructure, but this is not clearly mentioned in the documentation. One of the places where you can clearly see this is the COLUMNS operator.

Figure 2: COLUMNS pseudo operator

Consider the Figure 2:

  • COLUMNS is intended to pass column lists
  • Despite of this, no check is conducted that columns passed exist in a table. In fact, the database doesn’t even know what table is meant
  • Only literals are allowed, no bind variables.
  • No empty list – at least one element is required
  • The elements should be valid SQL identifiers. You can quote them if needed. That means, all characters except CHR(0) and double quotes are allowed
  • At parsing phase the list passed with COLUMNS pseudo-operator will be converted by the database to the data type DBMS_TF.COLUMNS_T which is a table of DBMS_QUOTED_ID. All elements will be enclosed in double quotes if not already. There is also another possible use with different data type: see further down in this post.
  • You can access this PL/SQL table as usual in your macro body. How and whether you use it in your result expression is entirely up to you
  • The resulting SQL which will be executed doesn’t contain the COLUMNS-operator anymore, but completely dependent from what you’ve done with this list in your SQL macro body.
  • You can’t just leave the parameter name in the result string of the macro because there is no substitution that takes place like with scalar or table parameters. You have to concatenate for this

Passing more than just column names?

What if I want to pass more than just a plain column list? For example I want to pass a column list for an ORDER BY clause. This can additionally contain ASC/DESC or NULLS LAST/FIRST. Hmm… Actually, you can do this, as long as you are not using quoted identifiers for the column names. Just add what you need and enclose the whole expression in double quotes. Consider Listing 1:

create or replace function top_n (p_tab in dbms_tf.table_t
                                , p_limit in number
                                , p_order in dbms_tf.columns_t) 
return varchar2 sql_macro is 
v_order_list varchar2(2000);
begin
  -- turn PL/SQL table to comma separated list for ORDER BY clause
  select listagg(replace(column_value,'"'),',') into v_order_list from table (p_order);
  return 'SELECT * FROM top_n.p_tab ORDER BY '||v_order_list||
         ' FETCH FIRST top_n.p_limit ROWS ONLY';
end;
/

SQL> 
SQL> select * 
  2  from top_n(emp, 7, columns("comm desc nulls last")) e;

     EMPNO ENAME      JOB              MGR HIREDATE          SAL       COMM     DEPTNO
---------- ---------- --------- ---------- ---------- ---------- ---------- ----------
      7654 MARTIN     SALESMAN        7698 1981-09-28       1250       1400         30
      7521 WARD       SALESMAN        7698 1981-02-22       1250        500         30
      7499 ALLEN      SALESMAN        7698 1981-02-20       1600        300         30
      7844 TURNER     SALESMAN        7698 1981-09-08       1500          0         30
      7369 SMITH      CLERK           7902 1980-12-17        800                    20
      7698 BLAKE      MANAGER         7839 1981-05-01       2850                    30
      7782 CLARK      MANAGER         7839 1981-06-09       2450                    10

7 rows selected. 

Listing 1: using double quotes to pass more than just a column name

It is up to you to call it a “hack” or not. Of course, we could as well pass the ORDER BY clause as a string. Keep in mind that in this case you can’t process this parameter in any way, because it would be NULL inside your macro body. Everything you can do is to hand it over to the return string of the macro. Or we could just use another “half-hack” and pass the string as an element of the collection (table of varchar2) which will be fully visible inside the macro.

COLUMNS with data types

You can pass column names along with their (scalar) data types using COLUMNS pseudo-operator. The section of the documentation about this is very easy to overlook and I haven’t found any example of this use. In this case the input via COLUMNS is turned into DBMS_TF.COLUMNS_WITH_TYPE_T data type, which is not a simple list of quoted SQL identifiers more, but a complex record structure. Personally, I didn’t come up with a good use case for that yet. The one that came in mind is casting the input to the defined data types. See the Listing 2, which handles just VARCHAR2 and NUMBER datatypes to demonstrate the idea. We pass all columns as is, unless we have found a match in the passed COLUMNS operator along with the desired data types.

SQL> create or replace function cols_type (p_tab in dbms_tf.table_t
  2                                  , p_cols in dbms_tf.COLUMNS_WITH_TYPE_T) 
  3  return varchar2 sql_macro is 
  4  v_cols varchar2(10000);
  5  v_found boolean;
  6  begin
  7    for k in 1..p_tab.column.count loop
  8      v_found := false;
  9      for i in 1..p_cols.count loop
 10        if p_cols(i).name = p_tab.column(k).description.name then 
 11          case  p_cols(i).type 
 12            when  dbms_tf.TYPE_VARCHAR2 then  
 13              v_cols := v_cols || ', cast('||p_cols(i).name||' as varchar2('||p_cols(i).max_len||') ) as '||p_cols(i).name;
 14            when  dbms_tf.TYPE_NUMBER then  
 15              v_cols := v_cols || ', cast('||p_cols(i).name||' as number('||p_cols(i).precision||','||p_cols(i).scale||') ) as '||p_cols(i).name;
 16          end case;
 17          v_found := true;
 18          exit;
 19        end if;
 20      end loop;
 21      if not v_found then     
 22        v_cols := v_cols || ', '||p_tab.column(k).description.name;  
 23      end if;
 24    end loop;
 25    v_cols := substr(v_cols,2);
 26    return 'SELECT '||v_cols||' FROM p_tab ';
 27  end;
 28  /

Function COLS_TYPE compiled

SQL> 
SQL> set echo on
SQL> select * 
2  from cols_type(emp, columns(empno varchar2(2), job varchar2(3), sal number(6,-2)));

EM ENAME      JOB        MGR HIREDATE          SAL       COMM     DEPTNO
-- ---------- --- ---------- ---------- ---------- ---------- ----------
73 SMITH      CLE       7902 1980-12-17        800                    20
74 ALLEN      SAL       7698 1981-02-20       1600        300         30
75 WARD       SAL       7698 1981-02-22       1300        500         30
75 JONES      MAN       7839 1981-04-02       3000                    20
76 MARTIN     SAL       7698 1981-09-28       1300       1400         30
76 BLAKE      MAN       7839 1981-05-01       2900                    30
77 CLARK      MAN       7839 1981-06-09       2500                    10
77 SCOTT      ANA       7566 1987-04-19       3000                    20
78 KING       PRE            1981-11-17       5000                    10
78 TURNER     SAL       7698 1981-09-08       1500          0         30
78 ADAMS      CLE       7788 1987-05-23       1100                    20

Listing 2: COLUMNS with data types

Using COLUMNS with scalar SQL macros

It is not possible to use table operator with scalar macros – it is documented or at least caught by compilation exception PLS-00777: scalar SQL macro cannot have argument of type DBMS_TF.TABLE_T. But nothing prevents the use of COLUMN parameters with scalar macros if you need to. What can be the use case? Something where you benefit from operating with a variable number of operands of the same kind. In my career as data engineer I often had to deal with wide tables consisting of hundreds of numeric fields which have to be summarized to build up some KPI’s. I mean, not across rows, but across columns of the same row. I think I would have benefited from the function like in Listing 3. It adds some useful features like making the column list unique (if you end up with the list of 30-40 columns it is pretty easy to overlook duplicates) and surrounding the columns with NVL().

SQL> create or replace function  sum_up (p_sum_cols in  dbms_tf.columns_t) 
  2  return varchar2 sql_macro(scalar) as
  3  v_result varchar2(32767);
  4  v_sum_cols dbms_tf.columns_t;
  5  begin
  6      v_sum_cols := set (p_sum_cols); 
  7      for i in 1..v_sum_cols.count loop
  8          v_result := v_result||'+NVL('||v_sum_cols(i)||',0)';
  9      end loop;
 10      return substr(v_result,2);
 11  end;
 12  /

Function SUM_UP compiled

SQL> select empno, ename, sal, comm, sum_up(columns(sal, sal, comm)) result 
  2  from emp e;

     EMPNO ENAME             SAL       COMM     RESULT
---------- ---------- ---------- ---------- ----------
      7369 SMITH             800                   800
      7499 ALLEN            1600        300       1900
      7521 WARD             1250        500       1750
      7566 JONES            2975                  2975
      7654 MARTIN           1250       1400       2650
      7698 BLAKE            2850                  2850
...

Listing 3: Scalar SQL macros with COLUMNS pseudo operator

What about a REPLACE function which is able to handle not just one pair of search and replace expressions but an “infinite” number of them?

SQL> create or replace function  multireplace (p_input_str in varchar2
  2                                          , p_replace_chain in  dbms_tf.columns_t) 
  3  return varchar2 sql_macro(scalar) as
  5  v_result varchar2(32767);
  6  v_result_mid varchar2(32767);
  7  begin
  8      for i in 1..p_replace_chain.count loop
  9        if mod(i,2) = 0 then 
 10          v_result := v_result||'replace(';
 11          v_result_mid := v_result_mid||','''||replace(p_replace_chain(i-1),'"')
 12                          ||''','''||replace(p_replace_chain(i),'"')||''')';
 13        end if;
 14      end loop;
 15      return v_result||'p_input_str'||v_result_mid;
 16  end;
 17  /

Function MULTIREPLACE compiled

SQL> with t_input as (select 'Use nested REPLACE functions' str)
  2  select multireplace(str, COLUMNS("REPLACE", "function", 
  3                                   "functions", "instead",
  4                                   "nested", "MULTIREPLACE")) result
  5  from t_input;

RESULT                           
---------------------------------
Use MULTIREPLACE function instead

Listing 4: “misusing” the COLUMNS operator – passing something completely different than columns

Well, after all, I don’t think it would be a suitable example and I don’t advocate to use the operator this way. The first problem is that you can’t pass NULL/ empty strings or strings containing double quotes. Maybe I’m wrong, but misusing of COLUMNS operator this way looks more like a hack to me. Especially if you consider that it is just as well possible to implement this with user defined collection types instead of COLUMNS:

SQL> set echo on
SQL> create or replace function  multireplace_nt (p_input_str in varchar2
  2                                          , p_replace_chain in  sys.odcivarchar2list) 
  3  return varchar2 sql_macro(scalar) as
  4  v_result varchar2(32767);
  5  v_result_mid varchar2(32767);
  6  begin
  7      for i in 1..p_replace_chain.count loop
  8        if mod(i,2) = 0 then 
  9          v_result := v_result||'replace(';
 10          v_result_mid := v_result_mid||','''||replace(p_replace_chain(i-1),'"')
 11                          ||''','''||replace(p_replace_chain(i),'"')||''')';
 12        end if;
 13      end loop;
 14      return v_result||'p_input_str'||v_result_mid;
 15  end;
 16  /

Function MULTIREPLACE_NT compiled

SQL> 
SQL> with t_input as (select 'Use nested REPLACE functions' str)
  2  select multireplace_nt(str, sys.odcivarchar2list('REPLACE', '', 
  3                                                   'functions', 'instead', 
  4                                                   'nested', 'MULTIREPLACE')) result
  5  from t_input;

RESULT                   
-------------------------
Use MULTIREPLACE  instead 

Listing 5: Using a pre-defined collection sys.odcivarchar2list instead of misusing COLUMNS

Conclusion

The way COLUMNS pseudo operator was described as one of the variadic-pseudo-operators in the 18c documentation back then made me think that there will be more to come. I don’t know if it was meant that way, but no more have been introduced so far. So we looked at this one, how it works, how to use it and how to do a little bit more with it.

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *