LiveSQL is great place to start playing with new features. It provides a couple of very helpful demo scripts explaining how polymorphic table functions work. There I found a new script few days ago which uses PTF for dynamic pivot! WOW! According to my subjective perception, it seems to be one of the most desired features in Oracle SQL! But let’s have a closer look. Is this really feasible and mature enough to be used in production code?
I still think it is “just for fun”. I have investigated nearly the same approach in this post: Polymorphic Table Functions Example – (NOT) Transposing Rows to Columns . As the title suggests, I was not satisfied with the result. The problems remain the same:
The Function is DML-agnostic
We don’t have direct access to the data in DESCRIBE. But we need to look inside to know what columns we have to generate. So the workaround is to provide a query as a parameter and execute it in DESCRIBE. Sounds good, except that we are scanning the table already during SQL parsing. But that is secondary. The main problem with this approach, what happens if the data change?
It is very easy to test. Run the SQL, then insert a new job and an employee referencing this new job. Re-run the statement.
SQL> with jobs as (
2 select job_id
3 from hr.employees
4 )
5 select * from dynamic_pivot_ptf (
6 jobs
7 ,columns ( job_id )
8 ,'select distinct job_id from hr.employees'
9 );
AC_ACCOUNT AC_MGR AD_ASST AD_PRES AD_VP FI_ACCOUNT FI_MGR HR_REP IT_PROG MK_MAN MK_REP PR_REP PU_CLERK PU_MAN SA_MAN SA_REP SH_CLERK ST_CLERK ST_MAN
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1 1 1 1 2 5 1 1 5 1 1 1 5 1 5 30 20 20 5
SQL> insert into hr.jobs(job_id, job_title) values ('NEW_JOB','NEW_JOB');
1 row inserted.
SQL> insert into hr.employees
2 (EMPLOYEE_ID,LAST_NAME,EMAIL,HIRE_DATE,JOB_ID)
3 values
4 (999, 'Test','test@oracle.com', sysdate, 'NEW_JOB');
1 row inserted.
SQL> COMMIT;
Commit complete.
SQL> with jobs as (
2 select job_id
3 from hr.employees
4 )
5 select * from dynamic_pivot_ptf (
6 jobs
7 ,columns ( job_id )
8 ,'select distinct job_id from hr.employees'
9 );
AC_ACCOUNT AC_MGR AD_ASST AD_PRES AD_VP FI_ACCOUNT FI_MGR HR_REP IT_PROG MK_MAN MK_REP PR_REP PU_CLERK PU_MAN SA_MAN SA_REP SH_CLERK ST_CLERK ST_MAN
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1 1 1 1 2 5 1 1 5 1 1 1 5 1 5 30 20 20 5
Hmm… The result is the same? No new column? But why should it be different? The structure (new columns) which the PTF should return was defined by the DESCRIBE method during the first run. Nothing has happened to force the hard parse of our query again. What can help is flushing the shared pool:
SQL> alter system flush shared_pool;
System FLUSH altered.
SQL> with jobs as (
2 select job_id
3 from hr.employees
4 )
5 select * from dynamic_pivot_ptf (
6 jobs
7 ,columns ( job_id )
8 ,'select distinct job_id from hr.employees'
9 );
AC_ACCOUNT AC_MGR AD_ASST AD_PRES AD_VP FI_ACCOUNT FI_MGR HR_REP IT_PROG MK_MAN MK_REP NEW_JOB PR_REP PU_CLERK PU_MAN SA_MAN SA_REP SH_CLERK ST_CLERK ST_MAN
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
1 1 1 1 2 5 1 1 5 1 1 1 1 5 1 5 30 20 20 5
Now we can see the NEW_JOB. But honestly, nobody will do this! So the function is only useful on “static” tables. But why would you need a “dynamic” pivot on “static” tables? As far as I understand, this kind of issues are of caused by the PTF-architecture and are not so easy to fix in the future. I gladly let myself be convinced of the opposite but until then I don’t recommend querying any database state in the DESCRIBE function. IMO it is only safe to derive your output structure based on the input table and other PTF-parameters (use literals, no bind variables).
Partial results when applied to bigger table
The second problem was described in the recent post about table semantic PTF. The pivot function aggregates the source rows to count employees per job. So we expect only one resulting row. How to achieve this with a PTF? The idea is to hide all but the first row from the output (using row replication). Well, the solution is also the problem here. The exact definition of what you should do is: “hide all but the first row in the last rowset from the output”. You should hide ALL rows from the rowsets that are NOT the last one. But it’s impossible for now. We just don’t know! Hopefully this can be and will be fixed in one of the next releases. Look what happens, if we will have multiple rowsets and apply the same logic to all of them. Just re-run with a bigger table:
SQL> with jobs as (
2 select job_id
3 from hr.employees
4 cross join
5 (select 1 from dual connect by level >=15)
6 )
7 select * from dynamic_pivot_ptf (
8 jobs
9 ,columns ( job_id )
10 ,'select distinct job_id from hr.employees'
11 );
AC_ACCOUNT AC_MGR AD_ASST AD_PRES AD_VP FI_ACCOUNT FI_MGR HR_REP IT_PROG MK_MAN MK_REP PR_REP PU_CLERK PU_MAN SA_MAN SA_REP SH_CLERK ST_CLERK ST_MAN
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
10 10 10 10 20 50 10 10 50 10 10 10 50 10 50 299 180 180 45
5 5 5 5 10 25 5 5 25 5 5 5 25 5 25 151 120 120 30
Initially we had 107 rows in HR.EMPLOYEES table. After multiplying them 15 times via cross join we now become 1605 rows. That leads to two rowsets: the first one with 1024 rows and the second one with 581 rows in it. That’s why we can see two resulting rows, one per rowset. We need some kind of post processing to fix this. The good news in this example: the arrays with counts are re-initialized for every run of FETCH_ROWS, thus making the partial results fully additive. We only need to sum up across all result rows. ONLY? Now we have the same problem: the columns we are going to aggregate are dynamic again! Yet another PTF to do this aggregation? In case we hadn’t this re-initialization or couldn’t even use it (like in case if we wanted MAX instead of COUNT as a pivot aggregation function), then only one resulting row would contain the correct answer whereas all others should be simply thrown away. But how to distinguish them? Again, I hope Oracle will provide a solution to detect the last row set.
Summary
Due to described problems I don’t feel this PTF could be used for dynamic pivot in real life. But I’m sure, these demo scripts are very valuable, giving us a feeling what else can be done with PTF, how it can be done and what problems cannot be solved yet. Thanks @chrisrsaxon for providing them!