After the previous post about building an UNPIVOT query via a SQL macro to output table rows as key-value pairs, I thought of another use case for the UNPIVOT operator where developing a reusable SQL macro makes sense. From time to time you need to spot the difference between almost identical rows. You know they are different, and you can easily verify this using MINUS or GROUP BY, but if you want to know exactly which column(s) differ, you need another approach.
Let's start with a simple table to test with. We have three unique rows identified by ID and VERS_ID, a system-generated CREATED timestamp, and three columns C1, C2 and C3 where some values differ across those three rows.
SQL> CREATE TABLE diff_test (
  2    id      NUMBER,
  3    vers_id NUMBER,
  4    created TIMESTAMP(6) default systimestamp,
  5    c1      NUMBER,
  6    c2      VARCHAR2(20),
  7    c3      DATE
  8  );

Table DIFF_TEST created.

SQL> INSERT INTO diff_test (id, vers_id, c1, c2, c3) values (4711, 1, 100, 'A', DATE '2020-09-25');

1 row inserted.

SQL> INSERT INTO diff_test (id, vers_id, c1, c2, c3) values (4711, 2, 200, 'A', DATE '2020-09-25');

1 row inserted.

SQL> INSERT INTO diff_test (id, vers_id, c1, c2, c3) values (4711, 3, 200, 'B', DATE '2020-09-27');

1 row inserted.

SQL> SELECT * FROM diff_test;

        ID VERS_ID CREATED                     C1    C2  C3
---------- ------- --------------------------- ----- --- -------------------
      4711       1 22.12.20 11:45:28,892617000   100 A   25.09.2020 00:00:00
      4711       2 22.12.20 11:45:28,951802000   200 A   25.09.2020 00:00:00
      4711       3 22.12.20 11:45:29,009886000   200 B   27.09.2020 00:00:00
Listing 1: Test table
Of course, for this small table we can spot the difference immediately: between VERS_ID 1 and 2 the column C1 changes from 100 to 200, and between VERS_ID 2 and 3, C2 changes to "B" and C3 from 25.09 to 27.09. But what about those systems (you name it) with hundreds of columns in their tables?
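Just to illustrate the limitation: a quick MINUS check confirms that two versions differ, but it returns whole rows, so the column-by-column comparison is still left to you. A minimal sketch against the test table above (not one of the original queries) could look like this:

-- Which rows of version 2 are not identical to version 1?
-- MINUS only tells us that the rows differ; it returns the complete row
-- (200, 'A', 25.09.2020) and leaves the column-by-column comparison to us.
select c1, c2, c3 from diff_test where vers_id = 2
minus
select c1, c2, c3 from diff_test where vers_id = 1;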
Let’s put together the query step by step. To simplify the example, let’s say we will only compare exactly two rows. However, this does not make our function any less useful.
Step 1: UNPIVOT
First we will construct a new “single-column” unique key P_KEY and select all columns we want to check:
SQL> with data as (select * from diff_test where vers_id < 3)
  2  , src_query as (
  3      select "ID"||'#'||"VERS_ID" as p_key
  4           , "C1","C2","C3"
  5      from (data) )
  6  select * from src_query;

P_KEY    C1    C2  C3
-------- ----- --- -------------------
4711#1     100 A   25.09.2020 00:00:00
4711#2     200 A   25.09.2020 00:00:00
Listing 2: Select all columns and build a key
Next, just UNPIVOT all columns except the key. You can have a look at the previous post to see how to do this and avoid ORA-01790:
SQL> with data as (select * from diff_test where vers_id < 3)
  2  , src_query as (
  3      select ID||'#'||VERS_ID as p_key
  4           , C1,C2,C3
  5      from (data) )
  6  , unpivot_query as (
  7      select P_KEY, col_name, col_value
  8      from ( select P_KEY, to_char(C1) as C1
  9                  , C2
 10                  , to_char(C3,'YYYY-MM-DD HH24:MI:SS') as C3
 11             from src_query )
 12      unpivot include nulls (col_value for col_name in (C1,C2,C3)))
 13  select * from unpivot_query;

P_KEY    CO COL_VALUE
-------- -- ----------------------------------------
4711#1   C1 100
4711#1   C2 A
4711#1   C3 2020-09-25 00:00:00
4711#2   C1 200
4711#2   C2 A
4711#2   C3 2020-09-25 00:00:00

6 rows selected.
Listing 3: Cast all columns as string and UNPIVOT
Step 2: GROUP BY
Now that we have each column name/column value pair in a separate row, we can just GROUP BY column names and COUNT DISTINCT the values. If the values are different, you'll have COUNT > 1. That's it.
SQL> with data as (select * from diff_test where vers_id < 3)
  2  , src_query as (
  3      select ID||'#'||VERS_ID as p_key
  4           , C1,C2,C3
  5      from (data) )
  6  , unpivot_query as (
  7      select P_KEY, col_name, col_value
  8      from ( select P_KEY, to_char(C1) as C1
  9                  , C2
 10                  , to_char(C3,'YYYY-MM-DD HH24:MI:SS') as C3
 11             from src_query )
 12      unpivot include nulls (col_value for col_name in (C1,C2,C3)))
 13  select listagg(p_key,'->') within group (order by p_key ) as p_keys
 14       , col_name
 15       , min (col_value) keep (dense_rank first order by p_key) old_value
 16       , max (col_value) keep (dense_rank last order by p_key) new_value
 17  from unpivot_query
 18  group by col_name
 19  having count(distinct dump(col_value)) > 1 ;

P_KEYS           COL_NAME OLD_VALUE            NEW_VALUE
---------------- -------- -------------------- --------------------
4711#1->4711#2   C1       100                  200
Listing 4: GROUP BY can help to see differences
Why am I using DUMP(COL_VALUE)? Without this trick we would not be able to detect changes from NULL to some value or from some value to NULL. This is because COUNT(NULL) equals 0 (not NULL, but 0!). Consider this small example:
SQL> with data as (select null as col from dual
  2                union all
  3                select 1 from dual)
  4  select count(distinct col)
  5       , count(distinct dump(col))
  6  from data;

COUNT(DISTINCTCOL) COUNT(DISTINCTDUMP(COL))
------------------ ------------------------
                 1                        2
Listing 5: Count(NULL) = 0
Step 3: SQL macro
Now that we know the query from Listing 4 does what we need, we can implement the SQL macro function:
create or replace function find_diff (p_tab          dbms_tf.table_t
                                     , p_key          dbms_tf.columns_t
                                     , p_exclude_cols dbms_tf.columns_t default null)
  return clob sql_macro as
  v_sql             clob;
  v_key             varchar2(1000);
  v_all_cols        varchar2(32767);
  v_all_casted_cols varchar2(32767);
  v_name            dbms_quoted_id;
begin
  select listagg(column_value,'||''#''||')
    into v_key
    from table(p_key);

  FOR i IN 1..p_tab.column.count LOOP
    v_name := p_tab.column(i).description.name;
    IF (NOT v_name MEMBER OF p_exclude_cols OR p_exclude_cols IS NULL)
       AND NOT v_name MEMBER OF p_key THEN
      IF p_tab.column(i).description.type = dbms_tf.type_number THEN
        v_all_casted_cols := v_all_casted_cols||', to_char('||v_name||') as '||v_name;
      ELSIF p_tab.column(i).description.type = dbms_tf.type_date THEN
        v_all_casted_cols := v_all_casted_cols||', to_char('||v_name||',''YYYY-MM-DD HH24:MI:SS'') as '||v_name;
      ELSIF p_tab.column(i).description.type = dbms_tf.type_timestamp THEN
        v_all_casted_cols := v_all_casted_cols||', to_char('||v_name||',''YYYY-MM-DD HH24:MI:SS.FF9'') as '||v_name;
      ELSE
        v_all_casted_cols := v_all_casted_cols||','||v_name;
      END IF;
      v_all_cols := v_all_cols||','||v_name;
    END IF;
  END LOOP;

  v_all_cols        := ltrim(v_all_cols,',');
  v_all_casted_cols := ltrim(v_all_casted_cols,',');

  v_sql := q'{
    with src_query as (
        select %PH_KEY% as p_key, %PH_ALL_COLS%
        from p_tab )
    , unpivot_query as (
        select P_KEY, col_name, col_value
        from ( select P_KEY, %PH_ALL_CASTED_COLS%
               from src_query )
        unpivot include nulls (col_value for col_name in (%PH_ALL_COLS% ) ) )
    select listagg(p_key,'->') within group (order by p_key ) as p_keys
         , col_name
         , min (col_value) keep (dense_rank first order by p_key) old_value
         , max (col_value) keep (dense_rank last order by p_key) new_value
    from unpivot_query
    group by col_name
    having count(distinct dump(col_value)) > 1 }';

  v_sql := replace(replace(replace(v_sql,'%PH_KEY%',v_key)
                          ,'%PH_ALL_COLS%',v_all_cols)
                          ,'%PH_ALL_CASTED_COLS%', v_all_casted_cols);

  dbms_output.put_line (v_sql);
  DBMS_TF.TRACE(p_tab);
  DBMS_TF.TRACE(p_key);
  return v_sql;
end;
/
Listing 6: The implementation of the SQL macro FIND_DIFF
The function accepts one table parameter (DBMS_TF.TABLE_T) and two column list parameters (DBMS_TF.COLUMNS_T): one to pass the columns that make up the key (p_key) and one to pass the columns we don't want to compare (p_exclude_cols). The implementation is straightforward: loop over the columns, check whether they should be processed, and handle the conversion depending on the datatype (the IF/ELSIF block inside the loop). This kind of processing has been part of almost all table macro examples so far. Then we just build the query. I prefer to write down the whole query first, using placeholders such as %PH_ALL_CASTED_COLS% where the content of other variables should go. Instead of concatenating everything upfront, I replace those placeholders afterwards (the nested REPLACE calls near the end). I think this makes it easier to see, understand and edit the query as a whole.
One quite important detail: simply put the table parameter name (p_tab) in the FROM clause of your query where you want to select from the input. Whatever is passed in – a table, a view, a subquery, etc. – will simply be substituted there later.
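To make that substitution more tangible, here is a minimal, stripped-down sketch (the function name is made up for illustration): whatever the caller passes as p_tab ends up as the source of the returned query text.

-- Minimal sketch: the parameter name acts as a placeholder for the input
create or replace function just_pass_through (p_tab dbms_tf.table_t)
  return varchar2 sql_macro is
begin
  -- 'p_tab' in the returned text is resolved to whatever the caller passed in
  return 'select * from p_tab';
end;
/
-- select * from just_pass_through(diff_test);  -- behaves like: select * from diff_test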
Let’s test the function:
SQL> with data as (select * from diff_test where vers_id < 3)
  2  select * from find_diff(data, columns(id, vers_id));

P_KEYS           COL_NAME OLD_VALUE            NEW_VALUE
---------------- -------- -------------------- --------------------
4711#1->4711#2   C1       100                  200
4711#1->4711#2   CREATED  2020-12-22 11:45:28. 2020-12-22 11:45:28.
                          892617000            951802000

-- Exclude the column CREATED
SQL> with data as (select * from diff_test where vers_id < 3)
  2  select * from find_diff(data, columns(id, vers_id), columns(created));

P_KEYS           COL_NAME OLD_VALUE            NEW_VALUE
---------------- -------- -------------------- --------------------
4711#1->4711#2   C1       100                  200

-- Compare 2nd and 3rd versions
SQL> with data as (select * from diff_test where vers_id > 1)
  2  select * from find_diff(data, columns(id, vers_id), columns(created));

P_KEYS           COL_NAME OLD_VALUE            NEW_VALUE
---------------- -------- -------------------- --------------------
4711#2->4711#3   C2       A                    B
4711#2->4711#3   C3       2020-09-25 00:00:00  2020-09-27 00:00:00
Listing 7: Test the function
As you can see in the first output, comparing technical audit columns such as CREATED, MODIFIED and so on may not be what you want, and this is where the parameter p_exclude_cols comes in handy.
DBMS_TF.TRACE
Did you notice the two calls to DBMS_TF.TRACE at the end of the function? If you have already played around with polymorphic table functions (PTF) in 18c and above, you probably know this procedure. In fact, it is a wrapper around dbms_output.put_line that makes it easy to output the internal structures used in the context of PTFs. Since we are using some of these structures (DBMS_TF.TABLE_T and DBMS_TF.COLUMNS_T), we can call this logging procedure here just as we would in the DESCRIBE method of a PTF.
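Since DBMS_TF.TRACE writes via DBMS_OUTPUT, remember to enable server output in SQL*Plus or SQLcl if you want to reproduce the trace shown in Listing 8 yourself (a small reminder, not part of the original post):

SQL> set serveroutput on
SQL> with data as (select * from diff_test where vers_id < 3)
  2  select * from find_diff(data, columns(id, vers_id));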
........tab.column[1] = pass through column NOT a for read column
__________Type:...............NUMBER
__________Max Length:.........22
__________Name:..............."ID"
__________Name Length:........4
__________Precision:..........0
__________Scale:..............-127
........tab.column[2] = pass through column NOT a for read column
__________Type:...............NUMBER
__________Max Length:.........22
__________Name:..............."VERS_ID"
__________Name Length:........9
__________Precision:..........0
__________Scale:..............-127
........tab.column[3] = pass through column NOT a for read column
__________Type:...............TIMESTAMP
__________Max Length:.........20
__________Name:..............."CREATED"
__________Name Length:........9
........tab.column[4] = pass through column NOT a for read column
__________Type:...............NUMBER
__________Max Length:.........22
__________Name:..............."C1"
__________Name Length:........4
__________Precision:..........0
__________Scale:..............-127
........tab.column[5] = pass through column NOT a for read column
__________Type:...............VARCHAR2
__________Max Length:.........20
__________Name:..............."C2"
__________Name Length:........4
__________Charset Id:.........AL32UTF8
__________Collation:..........USING_NLS_COMP
........tab.column[6] = pass through column NOT a for read column
__________Type:...............DATE
__________Max Length:.........21
__________Name:..............."C3"
__________Name Length:........4
........columns[1] = "ID"
........columns[2] = "VERS_ID"
Listing 8: The output of DBMS_TF.TRACE
What happens in the background
It was very interesting to see which query is actually executed. As shown in Listing 9, you can use DBMS_UTILITY.EXPAND_SQL_TEXT for this; I have formatted the output a little. Quite extensive transformations are going on here! You can see how Oracle implements UNPIVOT: for three columns there are three queries combined with UNION ALL. Also interesting are the innermost subqueries, where our input query block reappears as the source for the macro function. Remember, in Listing 6 we referenced it only as the single p_tab parameter in the FROM clause.
DECLARE
  l_clob CLOB;
BEGIN
  DBMS_UTILITY.expand_sql_text (
    input_sql_text  => 'with data as (select * from diff_test where vers_id > 1) select * from find_diff(data, columns(id, vers_id), columns(created))',
    output_sql_text => l_clob );
  DBMS_OUTPUT.put_line(l_clob);
END;
/

SELECT "A1"."P_KEYS" "P_KEYS","A1"."COL_NAME" "COL_NAME","A1"."OLD_VALUE" "OLD_VALUE","A1"."NEW_VALUE" "NEW_VALUE"
FROM (SELECT "A6"."P_KEYS" "P_KEYS","A6"."COL_NAME" "COL_NAME","A6"."OLD_VALUE" "OLD_VALUE","A6"."NEW_VALUE" "NEW_VALUE"
      FROM (SELECT LISTAGG("A7"."P_KEY",'->') WITHIN GROUP ( ORDER BY "A7"."P_KEY") "P_KEYS"
                  ,"A7"."COL_NAME" "COL_NAME"
                  ,MIN("A7"."COL_VALUE") KEEP (DENSE_RANK FIRST ORDER BY "A7"."P_KEY") "OLD_VALUE"
                  ,MAX("A7"."COL_VALUE") KEEP (DENSE_RANK FIRST ORDER BY "A7"."P_KEY" DESC ) "NEW_VALUE"
            FROM (SELECT "A2"."P_KEY" "P_KEY","A2"."COL_NAME" "COL_NAME","A2"."COL_VALUE" "COL_VALUE"
                  FROM (SELECT "A10"."P_KEY" "P_KEY",'C1' "COL_NAME","A10"."C1" "COL_VALUE"
                        FROM (SELECT "A9"."P_KEY" "P_KEY",TO_CHAR("A9"."C1") "C1","A9"."C2" "C2",TO_CHAR("A9"."C3",'YYYY-MM-DD HH24:MI:SS') "C3"
                              FROM (SELECT TO_CHAR("A3"."ID")||'#'||TO_CHAR("A3"."VERS_ID") "P_KEY","A3"."C1" "C1","A3"."C2" "C2","A3"."C3" "C3"
                                    FROM (SELECT "A4"."ID" "ID","A4"."VERS_ID" "VERS_ID","A4"."CREATED" "CREATED","A4"."C1" "C1","A4"."C2" "C2","A4"."C3" "C3"
                                          FROM (SELECT "A5"."ID" "ID","A5"."VERS_ID" "VERS_ID","A5"."CREATED" "CREATED","A5"."C1" "C1","A5"."C2" "C2","A5"."C3" "C3"
                                                FROM "ONFTEST"."DIFF_TEST" "A5"
                                                WHERE "A5"."VERS_ID">1) "A4" ) "A3" ) "A9" ) "A10"
                        UNION ALL
                        SELECT "A10"."P_KEY" "P_KEY",'C2' "COL_NAME","A10"."C2" "COL_VALUE"
                        FROM (SELECT "A9"."P_KEY" "P_KEY",TO_CHAR("A9"."C1") "C1","A9"."C2" "C2",TO_CHAR("A9"."C3",'YYYY-MM-DD HH24:MI:SS') "C3"
                              FROM (SELECT TO_CHAR("A3"."ID")||'#'||TO_CHAR("A3"."VERS_ID") "P_KEY","A3"."C1" "C1","A3"."C2" "C2","A3"."C3" "C3"
                                    FROM (SELECT "A4"."ID" "ID","A4"."VERS_ID" "VERS_ID","A4"."CREATED" "CREATED","A4"."C1" "C1","A4"."C2" "C2","A4"."C3" "C3"
                                          FROM (SELECT "A5"."ID" "ID","A5"."VERS_ID" "VERS_ID","A5"."CREATED" "CREATED","A5"."C1" "C1","A5"."C2" "C2","A5"."C3" "C3"
                                                FROM "ONFTEST"."DIFF_TEST" "A5"
                                                WHERE "A5"."VERS_ID">1) "A4" ) "A3" ) "A9" ) "A10"
                        UNION ALL
                        SELECT "A10"."P_KEY" "P_KEY",'C3' "COL_NAME","A10"."C3" "COL_VALUE"
                        FROM (SELECT "A9"."P_KEY" "P_KEY",TO_CHAR("A9"."C1") "C1","A9"."C2" "C2",TO_CHAR("A9"."C3",'YYYY-MM-DD HH24:MI:SS') "C3"
                              FROM (SELECT TO_CHAR("A3"."ID")||'#'||TO_CHAR("A3"."VERS_ID") "P_KEY","A3"."C1" "C1","A3"."C2" "C2","A3"."C3" "C3"
                                    FROM (SELECT "A4"."ID" "ID","A4"."VERS_ID" "VERS_ID","A4"."CREATED" "CREATED","A4"."C1" "C1","A4"."C2" "C2","A4"."C3" "C3"
                                          FROM (SELECT "A5"."ID" "ID","A5"."VERS_ID" "VERS_ID","A5"."CREATED" "CREATED","A5"."C1" "C1","A5"."C2" "C2","A5"."C3" "C3"
                                                FROM "ONFTEST"."DIFF_TEST" "A5"
                                                WHERE "A5"."VERS_ID">1) "A4" ) "A3" ) "A9" ) "A10" ) "A2" ) "A7"
            GROUP BY "A7"."COL_NAME"
            HAVING COUNT(DISTINCT DUMP("A7"."COL_VALUE"))>1) "A6" ) "A1"
Listing 9: The actual query looks quite different
Bonus: Flashback query
You do not always have two rows in one table to compare. What if you only have one row that changes over time? Someone has just changed the row of interest and you want to know what is different. There are several ways to do this, but if the chances are good that the previous version can still be found in the undo segments, you can combine a flashback query with our SQL macro function to answer the question:
SQL> drop table test_emp_diff;

Table TEST_EMP_DIFF dropped.

SQL> create table test_emp_diff as select * from scott.emp;

Table TEST_EMP_DIFF created.

SQL> column tstp new_value tstp
SQL> select to_char(systimestamp, 'yyyy-mm-dd HH24:mi:ss.ff6') tstp from dual;

TSTP
--------------------------
2020-12-23 09:28:27.692766

SQL> update test_emp_diff set Job = 'DBA' where ename = 'SCOTT';

1 row updated.

SQL> with dat1 as (
  2      select 2 vers, a.*
  3      from test_emp_diff a
  4      union all
  5      select 1 vers, b.*
  6      from test_emp_diff
  7        as of timestamp (to_timestamp('&tstp', 'yyyy-mm-dd HH24:mi:ss.ff6')) b
  8  )
  9  , dat as (Select * from dat1 where ename ='SCOTT')
 10  select * from find_diff (dat, columns(ename, vers));

P_KEYS           COL_NAME OLD_VALUE            NEW_VALUE
---------------- -------- -------------------- --------------------
SCOTT#1->SCOTT#2 JOB      ANALYST              DBA
Listing 10: Using SQL macro with flashback query
Where to test?
You can test table SQL macros in Oracle Database starting with version 19.6. You can also play with them on LiveSQL. But best of all, you can test both scalar and table macros in the new release, Oracle 21c, with your free tier in Oracle Cloud! Really great news! Go for it!
A minor difference I’ve noticed in 21c is that prefixing the table parameter with the function name doesn’t work anymore. I have not yet found out if this was intentional or a bug.
BANNER
-----------------------------------------------------------------------------
Oracle Database 21c Enterprise Edition Release 21.0.0.0.0 - Production

SQL> create or replace function test_with_function (p_tab dbms_tf.table_t) return varchar2 sql_macro is
  2  begin
  3    return 'test_with_function.p_tab';
  4  end;
  5  /

Function TEST_WITH_FUNCTION compiled

SQL> create or replace function test_without_function (p_tab dbms_tf.table_t) return varchar2 sql_macro is
  2  begin
  3    return 'p_tab';
  4  end;
  5  /

Function TEST_WITHOUT_FUNCTION compiled

SQL> select * from test_with_function(dual);

SQL Error: ORA-00942: table or view does not exist
00942. 00000 -  "table or view does not exist"

SQL> select * from test_without_function(dual);

D
-
X

---------------

SQL> select banner from v$version;

BANNER
-----------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

SQL> create or replace function test_with_function (p_tab dbms_tf.table_t) return varchar2 sql_macro is
  2  begin
  3    return 'test_with_function.p_tab';
  4  end;
  5  /

Function TEST_WITH_FUNCTION compiled

SQL> create or replace function test_without_function (p_tab dbms_tf.table_t) return varchar2 sql_macro is
  2  begin
  3    return 'p_tab';
  4  end;
  5  /

Function TEST_WITHOUT_FUNCTION compiled

SQL> select * from test_with_function(dual);

D
-
X

SQL> select * from test_without_function(dual);

D
-
X
Listing 11: Differences in parameter naming 19c vs. 21c
This will probably be my last blog post this year (but who knows?). Next year I promise not to post only about SQL macros 😉 But in the near future I am eager to share two more interesting use cases. Merry Christmas! Have a good and healthy 2021!