This second part of the “less obvious facts” series is about the COLUMNS pseudo-operator. It has been introduced in 18c as one of the variadic pseudo-operators, intended to operate with a variable number of operands. Unfortunately, no other pseudo-operators have been introduced since then, so that it is worth looking at what exactly the COLUMNS operator is, how to use it and how it can potentially be “misused”.
TL;DR
If you don’t have much time to read a long story here are the key facts you can read and decide whether you should take the time ;-):
- COLUMNS is intended to pass columns lists, it will be converted to a collection at parse time – you have to process this collection accordingly
- As long as you are not using quoted identifiers for the columns involved in your query you can pass some additional information like ASC/DESC if you enclose in double quotes by yourself
- COLUMNS can also be used for passing column names along with data types
- COLUMNS operator can be used with scalar SQL macros as well
COLUMNS pseudo operator
It was originally intended for use with PTF’s. As we can see, these two topics are closely linked. The SQL macros seem to use the PTF infrastructure, but this is not clearly mentioned in the documentation. One of the places where you can clearly see this is the COLUMNS operator.
Consider the Figure 2:
- COLUMNS is intended to pass column lists
- Despite of this, no check is conducted that columns passed exist in a table. In fact, the database doesn’t even know what table is meant
- Only literals are allowed, no bind variables.
- No empty list – at least one element is required
- The elements should be valid SQL identifiers. You can quote them if needed. That means, all characters except CHR(0) and double quotes are allowed
- At parsing phase the list passed with COLUMNS pseudo-operator will be converted by the database to the data type DBMS_TF.COLUMNS_T which is a table of DBMS_QUOTED_ID. All elements will be enclosed in double quotes if not already. There is also another possible use with different data type: see further down in this post.
- You can access this PL/SQL table as usual in your macro body. How and whether you use it in your result expression is entirely up to you
- The resulting SQL which will be executed doesn’t contain the COLUMNS-operator anymore, but completely dependent from what you’ve done with this list in your SQL macro body.
- You can’t just leave the parameter name in the result string of the macro because there is no substitution that takes place like with scalar or table parameters. You have to concatenate for this
Passing more than just column names?
What if I want to pass more than just a plain column list? For example I want to pass a column list for an ORDER BY clause. This can additionally contain ASC/DESC or NULLS LAST/FIRST. Hmm… Actually, you can do this, as long as you are not using quoted identifiers for the column names. Just add what you need and enclose the whole expression in double quotes. Consider Listing 1:
create or replace function top_n (p_tab in dbms_tf.table_t
, p_limit in number
, p_order in dbms_tf.columns_t)
return varchar2 sql_macro is
v_order_list varchar2(2000);
begin
-- turn PL/SQL table to comma separated list for ORDER BY clause
select listagg(replace(column_value,'"'),',') into v_order_list from table (p_order);
return 'SELECT * FROM top_n.p_tab ORDER BY '||v_order_list||
' FETCH FIRST top_n.p_limit ROWS ONLY';
end;
/
SQL>
SQL> select *
2 from top_n(emp, 7, columns("comm desc nulls last")) e;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- ---------- ---------- ---------- ----------
7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
7369 SMITH CLERK 7902 1980-12-17 800 20
7698 BLAKE MANAGER 7839 1981-05-01 2850 30
7782 CLARK MANAGER 7839 1981-06-09 2450 10
7 rows selected.
Listing 1: using double quotes to pass more than just a column name
It is up to you to call it a “hack” or not. Of course, we could as well pass the ORDER BY clause as a string. Keep in mind that in this case you can’t process this parameter in any way, because it would be NULL inside your macro body. Everything you can do is to hand it over to the return string of the macro. Or we could just use another “half-hack” and pass the string as an element of the collection (table of varchar2) which will be fully visible inside the macro.
COLUMNS with data types
You can pass column names along with their (scalar) data types using COLUMNS pseudo-operator. The section of the documentation about this is very easy to overlook and I haven’t found any example of this use. In this case the input via COLUMNS is turned into DBMS_TF.COLUMNS_WITH_TYPE_T data type, which is not a simple list of quoted SQL identifiers more, but a complex record structure. Personally, I didn’t come up with a good use case for that yet. The one that came in mind is casting the input to the defined data types. See the Listing 2, which handles just VARCHAR2 and NUMBER datatypes to demonstrate the idea. We pass all columns as is, unless we have found a match in the passed COLUMNS operator along with the desired data types.
SQL> create or replace function cols_type (p_tab in dbms_tf.table_t
2 , p_cols in dbms_tf.COLUMNS_WITH_TYPE_T)
3 return varchar2 sql_macro is
4 v_cols varchar2(10000);
5 v_found boolean;
6 begin
7 for k in 1..p_tab.column.count loop
8 v_found := false;
9 for i in 1..p_cols.count loop
10 if p_cols(i).name = p_tab.column(k).description.name then
11 case p_cols(i).type
12 when dbms_tf.TYPE_VARCHAR2 then
13 v_cols := v_cols || ', cast('||p_cols(i).name||' as varchar2('||p_cols(i).max_len||') ) as '||p_cols(i).name;
14 when dbms_tf.TYPE_NUMBER then
15 v_cols := v_cols || ', cast('||p_cols(i).name||' as number('||p_cols(i).precision||','||p_cols(i).scale||') ) as '||p_cols(i).name;
16 end case;
17 v_found := true;
18 exit;
19 end if;
20 end loop;
21 if not v_found then
22 v_cols := v_cols || ', '||p_tab.column(k).description.name;
23 end if;
24 end loop;
25 v_cols := substr(v_cols,2);
26 return 'SELECT '||v_cols||' FROM p_tab ';
27 end;
28 /
Function COLS_TYPE compiled
SQL>
SQL> set echo on
SQL> select *
2 from cols_type(emp, columns(empno varchar2(2), job varchar2(3), sal number(6,-2)));
EM ENAME JOB MGR HIREDATE SAL COMM DEPTNO
-- ---------- --- ---------- ---------- ---------- ---------- ----------
73 SMITH CLE 7902 1980-12-17 800 20
74 ALLEN SAL 7698 1981-02-20 1600 300 30
75 WARD SAL 7698 1981-02-22 1300 500 30
75 JONES MAN 7839 1981-04-02 3000 20
76 MARTIN SAL 7698 1981-09-28 1300 1400 30
76 BLAKE MAN 7839 1981-05-01 2900 30
77 CLARK MAN 7839 1981-06-09 2500 10
77 SCOTT ANA 7566 1987-04-19 3000 20
78 KING PRE 1981-11-17 5000 10
78 TURNER SAL 7698 1981-09-08 1500 0 30
78 ADAMS CLE 7788 1987-05-23 1100 20
Listing 2: COLUMNS with data types
Using COLUMNS with scalar SQL macros
It is not possible to use table operator with scalar macros – it is documented or at least caught by compilation exception PLS-00777: scalar SQL macro cannot have argument of type DBMS_TF.TABLE_T. But nothing prevents the use of COLUMN parameters with scalar macros if you need to. What can be the use case? Something where you benefit from operating with a variable number of operands of the same kind. In my career as data engineer I often had to deal with wide tables consisting of hundreds of numeric fields which have to be summarized to build up some KPI’s. I mean, not across rows, but across columns of the same row. I think I would have benefited from the function like in Listing 3. It adds some useful features like making the column list unique (if you end up with the list of 30-40 columns it is pretty easy to overlook duplicates) and surrounding the columns with NVL().
SQL> create or replace function sum_up (p_sum_cols in dbms_tf.columns_t)
2 return varchar2 sql_macro(scalar) as
3 v_result varchar2(32767);
4 v_sum_cols dbms_tf.columns_t;
5 begin
6 v_sum_cols := set (p_sum_cols);
7 for i in 1..v_sum_cols.count loop
8 v_result := v_result||'+NVL('||v_sum_cols(i)||',0)';
9 end loop;
10 return substr(v_result,2);
11 end;
12 /
Function SUM_UP compiled
SQL> select empno, ename, sal, comm, sum_up(columns(sal, sal, comm)) result
2 from emp e;
EMPNO ENAME SAL COMM RESULT
---------- ---------- ---------- ---------- ----------
7369 SMITH 800 800
7499 ALLEN 1600 300 1900
7521 WARD 1250 500 1750
7566 JONES 2975 2975
7654 MARTIN 1250 1400 2650
7698 BLAKE 2850 2850
...
Listing 3: Scalar SQL macros with COLUMNS pseudo operator
What about a REPLACE function which is able to handle not just one pair of search and replace expressions but an “infinite” number of them?
SQL> create or replace function multireplace (p_input_str in varchar2
2 , p_replace_chain in dbms_tf.columns_t)
3 return varchar2 sql_macro(scalar) as
5 v_result varchar2(32767);
6 v_result_mid varchar2(32767);
7 begin
8 for i in 1..p_replace_chain.count loop
9 if mod(i,2) = 0 then
10 v_result := v_result||'replace(';
11 v_result_mid := v_result_mid||','''||replace(p_replace_chain(i-1),'"')
12 ||''','''||replace(p_replace_chain(i),'"')||''')';
13 end if;
14 end loop;
15 return v_result||'p_input_str'||v_result_mid;
16 end;
17 /
Function MULTIREPLACE compiled
SQL> with t_input as (select 'Use nested REPLACE functions' str)
2 select multireplace(str, COLUMNS("REPLACE", "function",
3 "functions", "instead",
4 "nested", "MULTIREPLACE")) result
5 from t_input;
RESULT
---------------------------------
Use MULTIREPLACE function instead
Listing 4: “misusing” the COLUMNS operator – passing something completely different than columns
Well, after all, I don’t think it would be a suitable example and I don’t advocate to use the operator this way. The first problem is that you can’t pass NULL/ empty strings or strings containing double quotes. Maybe I’m wrong, but misusing of COLUMNS operator this way looks more like a hack to me. Especially if you consider that it is just as well possible to implement this with user defined collection types instead of COLUMNS:
SQL> set echo on
SQL> create or replace function multireplace_nt (p_input_str in varchar2
2 , p_replace_chain in sys.odcivarchar2list)
3 return varchar2 sql_macro(scalar) as
4 v_result varchar2(32767);
5 v_result_mid varchar2(32767);
6 begin
7 for i in 1..p_replace_chain.count loop
8 if mod(i,2) = 0 then
9 v_result := v_result||'replace(';
10 v_result_mid := v_result_mid||','''||replace(p_replace_chain(i-1),'"')
11 ||''','''||replace(p_replace_chain(i),'"')||''')';
12 end if;
13 end loop;
14 return v_result||'p_input_str'||v_result_mid;
15 end;
16 /
Function MULTIREPLACE_NT compiled
SQL>
SQL> with t_input as (select 'Use nested REPLACE functions' str)
2 select multireplace_nt(str, sys.odcivarchar2list('REPLACE', '',
3 'functions', 'instead',
4 'nested', 'MULTIREPLACE')) result
5 from t_input;
RESULT
-------------------------
Use MULTIREPLACE instead
Listing 5: Using a pre-defined collection sys.odcivarchar2list instead of misusing COLUMNS
Conclusion
The way COLUMNS pseudo operator was described as one of the variadic-pseudo-operators in the 18c documentation back then made me think that there will be more to come. I don’t know if it was meant that way, but no more have been introduced so far. So we looked at this one, how it works, how to use it and how to do a little bit more with it.