jump to navigation

Exclusion of Unioned SQL in Views? August 16, 2012

Posted by mwidlake in performance, SQL.
Tags: , , ,
trackback

Question – you have a view definition of the following format:

select 1 as d_type, col_a,col_b,col_c
from TAB_X, TAB_Y, TAB_Z
where {your predicates}
UNION
select 2 as d_type, col_a,col_b,col_c
from TAB_P, TAB_Q, TAB_R
where {your predicates}
UNION
select 3 as d_type, col_a,col_b,col_c
from TAB_X X, TAB_Y, TAB_Z
where {your predicates}

You now select from the view and your code includes a filter predicate such as:

“WHERE D_TYPE = 1”

What will the CBO do? Is it intelligent enough to do what a human would do and exclude the parts of the query that could not possibly return data you want? Ie exclude the SQL statements where d_type is set to 2 and 3. Or is that funcitonality not built into the CBO? After all, it is only software and someone would have to decide that there could be benefit of looking into set of unioned SQL selects to see if any could be excluded and then code that into the CBO.

After all, the CBO is not smart enough to realise it could easily satisfy “select min(col_a), max(col_a) from tab x” via two fast lookups on a supporting index. It scans the index.

So – what is your guess?

I asked 3 people. Not a massive straw poll but I got one vote for it will not exclude, one for it will and one smart answer of “I would expect it to exclude- but as you asked the question I am thinking it does not”.

Let’s do the test.

DB_TEST> @demo_union_view_2
DB_TEST> drop table mdw_source purge;
Table dropped.

  1  create table mdw_source
  2  tablespace users
  3  as select * from dba_objects
  4  /
Table created.

  1  create index mdso_idx on mdw_source(owner,object_name)
  2  tablespace users
  3  /
Index created.

DB_TEST> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'MDW_SOURCE',method_opt=>'for all columns size 1')
PL/SQL procedure successfully completed.

  1  select table_name,num_rows from dba_tab_statistics
  2  where owner=USER and table_name='MDW_SOURCE';

TABLE_NAME      NUM_ROWS
--------------- ----------
MDW_SOURCE      55895

  1  create or replace view mdw_v1 as
  2  select
  3    '1'              src_type
  4   ,ms.OWNER
  5   ,ms.OBJECT_NAME
  6   ,ms.SUBOBJECT_NAME
  7   ,ms.CREATED
  8  from mdw_source ms
  9  where ms.object_name like 'A%'
 10  union all
 11  select
 12    '2'              src_type
 13   ,ms.OWNER
 14   ,ms.OBJECT_NAME
 15   ,ms.SUBOBJECT_NAME
 16   ,ms.CREATED
 17  from mdw_source ms
 18  where ms.object_name like 'B%'
 19  UNION ALL
 20  select
 21    '3'              src_type
 22   ,ms.OWNER
 23   ,ms.OBJECT_NAME
 24   ,ms.SUBOBJECT_NAME
 25   ,ms.CREATED
 26  from mdw_source ms
 27  where ms.object_name like 'C%'
 28  /
View created.

So, a simple table is created from DBA_OBJECTS and a view over the top of the table consisting on 3 simple selects unioned together, with a static value (1, 2 or 3) generated for each of the 3 parts as SRC_TYPE

Now let’s run some tests:

DB_TEST> set autotrace on
DB_TEST> --
DB_TEST> -- below will do a select of a non-indexed column across the view
DB_TEST> select count(created) from mdw_v1
  2  /

COUNT(CREATED)
--------------
          1284

Execution Plan
----------------------------------------------------------
Plan hash value: 4259489107
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |     9 |   551   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |     9 |            |          |
|   2 |   VIEW               | MDW_V1     |  2463 | 22167 |   551   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   5 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   6 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets


DB_TEST> -- below will do a select of a non-indexed column across the view
DB_TEST> -- limiting it to "one" of the types
DB_TEST> select count(created)
  2  from mdw_v1
  3  where src_type='3'
  4  /

COUNT(CREATED)
--------------
           129


Execution Plan
----------------------------------------------------------
Plan hash value: 1188403912
------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    12 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |            |     1 |    12 |            |          |
|   2 |   VIEW                | MDW_V1     |   823 |  9876 |   184   (1)| 00:00:03 |
|   3 |    UNION-ALL          |            |       |       |            |          |
|*  4 |     FILTER            |            |       |       |            |          |
|*  5 |      TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  6 |     FILTER            |            |       |       |            |          |
|*  7 |      TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  8 |     TABLE ACCESS FULL | MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   6 - filter(NULL IS NOT NULL)
   7 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   8 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        807  consistent gets


DB_TEST> -- below will do a select of an indexed column across the view
DB_TEST> select count(owner) from mdw_v1
  2  /

COUNT(OWNER)
------------
        1284


Execution Plan
----------------------------------------------------------
Plan hash value: 4219520050
------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |     1 |    17 |   221   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE         |          |     1 |    17 |            |          |
|   2 |   VIEW                  | MDW_V1   |  2463 | 41871 |   221   (1)| 00:00:03 |
|   3 |    UNION-ALL            |          |       |       |            |          |
|*  4 |     INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  5 |     INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  6 |     INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   5 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   6 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        989  consistent gets


DB_TEST> --
DB_TEST> -- below will do a select of an indexed column across the view
DB_TEST> -- limiting it to "one" of the types
DB_TEST> select count(owner)
  2  from mdw_v1
  3  where src_type='3'
  4  /

COUNT(OWNER)
------------
         129


Execution Plan
----------------------------------------------------------
Plan hash value: 815942527
-------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |     1 |    20 |    74   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE          |          |     1 |    20 |            |          |
|   2 |   VIEW                   | MDW_V1   |   823 | 16460 |    74   (2)| 00:00:01 |
|   3 |    UNION-ALL             |          |       |       |            |          |
|*  4 |     FILTER               |          |       |       |            |          |
|*  5 |      INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  6 |     FILTER               |          |       |       |            |          |
|*  7 |      INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  8 |     INDEX FAST FULL SCAN | MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   6 - filter(NULL IS NOT NULL)
   7 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   8 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        331  consistent gets


DB_TEST> -- below will select against the undelying object we are interested in
DB_TEST> select count(created)
  2  from mdw_source ms
  3  where ms.object_name like 'C%'
  4  /

COUNT(CREATED)
--------------
           129


Execution Plan
----------------------------------------------------------
Plan hash value: 152094671

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |    34 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE    |            |     1 |    34 |            |          |
|*  2 |   TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        805  consistent gets

Look at the plans at the start.

When my query does not include SRC_TYPE,ie the first SQL statement, the plan is simple – it scans the table three times, estimates 821 rows per scan and a cost of 184 for each step. When I include the SRC_TYPE = ‘3’ in the next step we see that the plan still includes all three full table scans, each still estimated at a cost of 184.
When you provide a predicate that identifies the one sql statement that you want to run, it still runs all three. I tried a few further tests and could not force the exclusion of the “unwanted” queries from the plan.

I showed my friend above (the one who suggested it would not work as I had asked the question) the results and we agreed that it was a little unfortunate that the Oracle Optimizer Pixies had not coded in the ability to exclude unwanted Unioned statements, but we could understand it.

However, for those of you thinking “you are missing something” – you are right. {And, to be fair to my friend, my example was not quite so simplified as what I have posted here}.

I decided to quantify the level of the impact and so shifted my concentration to the buffer gets. Look at the first and second statements again. The cost of the total statement drops, from 551 to 184. The actual consistent gets resulting from executing the SQL is dropping from 2417 to 807. They are both dropping dramatically to approximately 1/3.

Now check out the filter predicates at step 4 and 6 – “NULL IS NOT NULL”. That will always be false. Everything “after” that statement in the plan will not be executed. The step in the plan is there and, as a step, has the same cost – but it is is not being executed.

The pattern is repeated when the simple SQL statement is on an indexed column and the index is being used (the next two statements) {and honesty forces me to admit I have included more output than I need for this example, as I wanted to confuse the issue slightly and maybe encourage you to not read the plans too carefully}.

So yes, the CBO IS excluding the individual unioned SQL statements based on the stated SRC_TYPE and the fixed value stated in each part of the union.

I’m quite impressed by that. Clever Oracle Optimizer Pixies.

To tie this back to reality, I initially looked at this as I am trying to tune some code that hides all the complexity of the application under one massive unioned uber-view over a set of 9 other complex, unioned views. The explain plan runs to close to 700 lines. I had utterly overlooked the filters and filter predicates in that – heck I missed it initially in my simple test 🙂 {which was more complex than in this post, in my defense}. Trying to work with a very slow statement based on nested, complex views that is used by pretty much the whole application is “fun”.

Also back in reality, the filter clause I see in my real code is along the lines of:

:B17 is null or is not null

This is as bind variable is being considerd – and that is actually one of the considerations back in this real world of the original issue. The level that the master query needs to run in is decided by this bind variable – otherwise the application developer would (should have?) just pointed at the sub-view they needed for that level. So the CBO does not know which chunk of the union’d code it can ignore until it sees that bind variable, so the plan shows what could be executed.

Of course, it could be argued that if you know the type at the time you query the database, rather than use the above union filtering trick to cut down your massively complex view to the part that is required, just query the sub view you need directly. A bit more code but a lot simpler system to look after.

It’s a nice SQL trick but I would keep it to simple examples!

Finaly, just for fun, here is the plan I’ve been trying to tune. As I said, close to 700 steps and only the highlighted grey section on the left is shown in more detail on the right.

Comments»

1. nlitchfield - August 16, 2012

I *think* that the CBO code probably is there as a result/enhancement of this nice *new* feature http://docs.oracle.com/cd/A57673_01/DOC/server/doc/A48506/partview.htm#265 – at least it seems to me that 3 union alls against a rowsource *ought* to be subject to the same rules of optimisation as 3 union alls against 3 different tables. Right now I’m off to party like its 1999. 🙂

mwidlake - August 16, 2012

I wondered if the topic of Partiton Views would come up or not 🙂

I did ponder on it myself, but this is a case of the CBO recognising that it can exclude code based on the select text, not the where predicates or from clause. Back in oracle 7 there was this limit to partition views:

“It has exactly one table in the FROM clause and it is a base table.”

Now, my simple example matches that limitation but the more complex real-world example I derived this from does not.

An example I dropped out of the post is interesting – it limits the select against the view by the very same WHERE clause that the individual views use:
1 select count(created)
2 from mdw_v1 mv
3 where mv.object_name like ‘C%’
COUNT(CREATED)
————–
129

Execution Plan
———————————————————-
Plan hash value: 4259489107
———————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
———————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 75 | 551 (1)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | 75 | | |
| 2 | VIEW | MDW_V1 | 824 | 61800 | 551 (1)| 00:00:07 |
| 3 | UNION-ALL | | | | | |
|* 4 | TABLE ACCESS FULL| MDW_SOURCE | 1 | 34 | 184 (1)| 00:00:03 |
|* 5 | TABLE ACCESS FULL| MDW_SOURCE | 2 | 68 | 184 (1)| 00:00:03 |
|* 6 | TABLE ACCESS FULL| MDW_SOURCE | 821 | 27914 | 184 (1)| 00:00:03 |
———————————————————————————–

Predicate Information (identified by operation id):
—————————————————
4 – filter(“MS”.”OBJECT_NAME” LIKE ‘A%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
5 – filter(“MS”.”OBJECT_NAME” LIKE ‘B%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
6 – filter(“MS”.”OBJECT_NAME” LIKE ‘C%’)

Statistics
———————————————————-
8 recursive calls
0 db block gets
2417 consistent gets

But as you can see, the CBO cannot use this WHERE clause information to exclude sections of the union – All three SQL statements are executed for a total of 2417 consistent gets.

Jonathan LewisJonathan Lewis - August 20, 2012

Martin,

Oracle can be very touchy about how it plays this game – but even in 9i, after partition views had been deprecated, it was much better than 7.3 at handling things that looked like partition views.

In your example in the previous comment, elimination would probably happen if you changed all the predicates to equalities like: substr(ms.object_name,1,1) = ‘A’

2. rsiz - August 16, 2012

–+ gather_plan_statistics
as a hint and a subsequent allstats display might also be very useful to you for your tuning of the actual monster 700 line plan with various common tacked on predicates. That way you can check what someone might run into that tends to be expensive and see where exactly. Good luck my friend. Nice post.

select * from table(dbms_xplan.display_cursor(format=>’COST ALLSTATS LAST’));

is very useful if you use a long enough linesize. Pixies or Leprechans? Bless them either way.

mwidlake - August 16, 2012

Ahhh, Mark, very good advise. I would also suggest it is worth viewing such nasties via the SQL monitor page in OEM (licencing etc allowing) – but the pesky thing has a few limits including not showing code more than ?300? lines long. I do not have the power to go altering the relevant underscore parameters to change that 😦

NB Leprechans in Oracle Corp are limited to the SQL Optimizer team and their queen 🙂

rsiz - August 16, 2012

Nods. SQL Monitor good. I think when the plan is really huge and you really need to examine it good old fashioned paper landscape with no line or page breaks so you can write notes on it works out best. I know there are display and annotation capabilities now, but somehow certain dinosaurs do better with physical scrolling of the paper on a large surface. (And yes, I’m one of those dinosaurs and I’m not quite willing to spring for a wall sized touch screen yet.)

And yes, I think everyone is glad SQLMaria is who she is and does what she does. And that even though my CBO slogan “The goal is good plans!” did not make the cut to be in the vote. (That remains MY slogan for the CBO and I continue to predict it will go viral.)

3. Rene - August 16, 2012

What tool did you use to cremte that graph?

mwidlake - August 16, 2012

It’s straight from OEM Rene – V10 OEM and you can do similar in later version. Given you have the tuning and diagnostics pack you pick a sql statement and one of the option is to show the plan By default it is the graph, which I am not fond of I prefer the textual version 🙂

4. Martin Preiss - August 16, 2012

Hi Martin,

when I saw your code example my first idea was: ok, it’s UNION ALL, so the CBO can play “divide and conquer” – but with a simple UNION it would be a different game. But when I removed the ALLs from the UNIONs I still got the same plan (and the same filter steps), only with the addition of a SORT UNIQUE step:

————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
————————————————————————————-
| 0 | SELECT STATEMENT | | 1 | 12 | 526 (1)| 00:00:03 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
| 2 | VIEW | MDW_V1 | 966 | 11592 | 526 (1)| 00:00:03 |
| 3 | SORT UNIQUE | | 966 | 39606 | 526 (100)| 00:00:03 |
| 4 | UNION-ALL | | | | | |
|* 5 | FILTER | | | | | |
|* 6 | TABLE ACCESS FULL| MDW_SOURCE | 964 | 39524 | 523 (0)| 00:00:03 |
|* 7 | FILTER | | | | | |
|* 8 | TABLE ACCESS FULL| MDW_SOURCE | 964 | 39524 | 523 (0)| 00:00:03 |
|* 9 | TABLE ACCESS FULL | MDW_SOURCE | 964 | 39524 | 523 (0)| 00:00:03 |
————————————————————————————-

So the SORT UNIQUE is postponed until the filtering could happen. Indeed clever and impressive.

mwidlake - August 16, 2012

I like your thinking there and that you tested it. Thanks for sharing the results.

In the real-world example this all sprang from, the developer has actually used a UNION to mask the fact that they get some duplicates, so I knew it would not make much difference. I stuck with UNION ALL simply to keep the explain plan that little bit simpler.

Dom Brooks - August 17, 2012

Not really relevant to the blog post but when talking about UNION vs UNION ALL, I always think this is something waiting to bite an unsuspecting developer who thinks that if two resultsets are mutually exclusive then a UNION can simply be replaced with a UNION ALL and a sort avoided:

SQL> select 1 col1
  2  from   dual
  3  connect by rownum <= 3
  4  union all
  5  select 2 col1
  6  from   dual
  7  connect by rownum <= 3; 

      COL1
----------
         1
         1
         1
         2
         2
         2

6 rows selected.

SQL> select 1 col1
  2  from   dual
  3  connect by rownum <= 3
  4  union
  5  select 2 col1
  6  from   dual
  7  connect by rownum <= 3; 

      COL1
----------
         1
         2

SQL> 
5. mwidlake - August 17, 2012

Which they can – if each individual result set itself consists of a set of unique results and there is something unique to each set 🙂
Again slightly off the thread, but whenever I see UNION or DISTINCT the little voices in my head are going “is this code as smart as it could be? Are they hiding some terrible chasm trap or sloppy design in there?”

6. Exclusion of Unioned SQL in Views – Followup « Martin Widlake's Yet Another Oracle Blog - August 20, 2012

[…] Last week I put up a post about how Oracle can filter out sections of a union view.. […]

7. wild card - September 5, 2018

Quality sites

Exclusion of Unioned SQL in Views? | Martin Widlake


Leave a reply to Martin Preiss Cancel reply