jump to navigation

Row Level Security 3 – In Pictures! November 22, 2012

Posted by mwidlake in database design, security.
Tags: , ,
3 comments

<..Part one intro and examples
<….Part two Permissions

I’ve noticed that there has not been a lot of traffic on this series on Row Level Security (data masking) so far – maybe due to how I am presenting the material? So here is a summary to date in picture/diagram format:

This is the flow of logic as to whether Row Level Security is applied to the SQL statement. Usually this is a query and usually it is on a table but the policy can be applied to other DML and also on views and synonyms:

These are the steps you need to add RLS to a table:

Below is a summary of how oracle fires and uses RLS functions and the predicates they provide:

An finally, a summary of what happens if an RLS function is invalid:

If you have not looked at the previous posts on this topic, more details are in there.

Enjoy!

Row Level Security Part 1 November 15, 2012

Posted by mwidlake in database design, security.
Tags: , ,
9 comments

I’ve been working a little on Row Level Security (RLS) recently and wanted to mention a few things, so first some groundwork.

If you want to limit the rows certain users can see, you might think to use views or you might think to use RLS (part of VPD – Virtual Private Database). You can also (from V10 I think) limit which columns users can see. An example is probably the best way to show this. I’m doing this on Oracle 11.2.0.3.

I have two users, MDW and MDW_OFFSHORE. MDW has DBA-type privileges and MDW_OFFSHORE has connect, resource and one or two other simple privs. I will now demonstrate creating and populating a simple table under MDW, adding RLS to it and how it alters what MDW_OFFSHORE sees.

MDW> -- create a table and show row level and column level rls
MDW> --
MDW> drop table TEST1 purge;
MDW> --
MDW> create table test1
  2  (id         number       not null
  3  ,cp_name    varchar2(30) not null
  4  ,other_name varchar2(30) not null
  5  ,MASK_FL    number
  6  )
  7  /
MDW> insert into test1 VALUES (1,'ERIC','ERIC THE RED',1);
MDW> insert into test1 VALUES (2,'BROWN','BOB THE BROWN',2);
MDW> insert into test1 VALUES (3,'GREEN','GARY THE GREEN',1);
MDW> insert into test1 VALUES (4,'BLUE','BILL THE BLUE',3);
MDW> insert into test1 VALUES (5,'BLACK','DAVE THE BLACK',4);
MDW> insert into test1 VALUES (6,'PURPLE','PETE THE PURPLE',5);
MDW> insert into test1 VALUES (7,'RED','ROGER THE RED',6);
MDW> --
MDW> COMMIT;
MDW> SELECT * FROM TEST1;

        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
MDW> grant all on test1 to mdw_onshore;
MDW> grant select on test1 to mdw_offshore;
MDW> --
--
--This is where I create a function to generate the RLS rules
--
MDW> CREATE OR REPLACE FUNCTION hide_rows (
  2   v_schema IN VARCHAR2
  3  ,v_objname IN VARCHAR2)
  4  RETURN VARCHAR2 AS
  5  con VARCHAR2 (200);
  6  BEGIN
  7  -- note the use of backets to make it an atomic test
  8  con := '(user='''||v_schema||''' or MASK_FL > 3)';
  9  RETURN (con);
 10  END hide_rows;
 11  /
MDW> --
--
-- Now add the actual RLS policy to the table TEST1 using the function hide_rows
MDW> BEGIN
  2    sys.DBMS_RLS.ADD_POLICY (
  3      object_schema     => 'MDW'
  4     ,object_name       => 'TEST1'
  5     ,policy_name       => 'hide_rows_policy'
  6     ,policy_function   => 'hide_rows'
  7     ,function_schema   => 'MDW'
  8                            );
  9  END;
 10  /
MDW> --

That’s it. In effect you create a function, in this hide_rows, that creates a WHERE predicate (more on that later) and associate it with the table by creating a policy hide_ rows_policy.

Now let’s see what impact it has:

-- as the main user:
--
MDW> select * from test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
MDW>
-- now as the offshore user:
mdw_offshore> select * from mdw.test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
3 rows selected.

As you can see, mdw_offshore is seeing only a subset of rows in the table. This subset is determined by the function hide_rows. This function returns a WHERE PREDICATE that is added to any query run against the table. It is actually very simply to see what that predicate is, you just call the function passing in the table owner and name:

MDW> select hide_rows(v_schema => user,v_objname=>’TEST1′) from dual;

HIDE_ROWS(V_SCHEMA=>USER,V_OBJNAME=>’TEST1′)
——————————————————————————-
(user=’MDW’ or MASK_FL > 3)

Note that my refernce to v_owner in the function comes out as the text MDW – a way to ensure the predicate does not filter data looked at by the table owner. This implies of course that yes, you can stop the table owner from seeing data!

By default, this extra WHERE PREDICATE is applied to any SELECT, INSERT, UPDATE or DELETE statement run against this table. The function has to accept the v_schema and v_objname parameters and output a varchar2 string. At the time of being called as part of a RLS policy, oracle passes in the owner and name of the object being masked (which is usually a table and sometimes a view).
There are a few things I think it is important to keep in mind in respect of this extra Where Predicate:-

  • You have very little control over what other where predicates will be applied to DML against this table, so keep it self-contained – which is why I bound it in brackets.
  • It has to be syntactically correct but no check is made for this when you create the function – it is simply a string being outputted, as far as Oracle is concerned, at the time of creation.
  • It can be any valid Where Predicate you like, so long as it does not reference other rows in the table with the policy is applied to
  • This could be fired a lot. The performance impact could be considerable, which is why CONTEXTs are often used (as they are memory resident and can be configured by eg logon triggers) but I have skipped over that. Go look at the manuals.

I mentioned that the masking of data could be applied to just a column rather than whole rows. So let’s demonstrate that. I can do this by simply dropping and reapplying the same function in a different way:

MDW> BEGIN
-- first drop the policy, simply by stating the object owner, name and policy
  2    sys.dbms_rls.drop_policy(
  3      object_schema     => 'MDW'
  4     ,object_name       => 'TEST1'
  5     ,policy_name       => 'hide_rows_policy'
  6                            );
-- and recreate - note the two new paramters 
  7    sys.DBMS_RLS.ADD_POLICY (
  8      object_schema     => 'MDW'
  9     ,object_name       => 'TEST1'
 10     ,policy_name       => 'hide_rows_policy'
 11     ,policy_function   => 'hide_rows'
 12     ,function_schema   => 'MDW'
 13     ,sec_relevant_cols => 'OTHER_NAME,MASK_FL'
 14     ,sec_relevant_cols_opt => sys.dbms_rls.ALL_ROWS
 15                            );
 16  END;
 17  /
PL/SQL procedure successfully completed.
--
-- as the main user
MDW> select * from test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
--
-- now as the offshore user:
mdw_offshore> select * from mdw.test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC
         2 BROWN
         3 GREEN
         4 BLUE
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
--
-- but what can look a little odd is that if mdw_offshore describes the table..
mdw_offshore> desc mdw.test1
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- --------------
 ID                                                    NOT NULL NUMBER
 CP_NAME                                               NOT NULL VARCHAR2(30)
 OTHER_NAME                                            NOT NULL VARCHAR2(30)
 MASK_FL                                                        NUMBER
--
-- they see that OTHER_NAME is NOT NULL but...
mdw_offshore> select count(*) from mdw.test1 where other_name is null;

  COUNT(*)
----------
         4

That’s rather nice, I think. The same function can be used and rather than hide the rows it masks the stated columns. The sec_relevant_cols => ‘OTHER_NAME,MASK_FL’ parameter states the columns to apply the function on and the sec_relevant_cols_opt => sys.dbms_rls.ALL_ROWS parameter means “show a row even if a column or columns are hidden”. Without it, you only see rows if you do not reference the masked columns (even if indirectly via eg “select (*)”) – see this nice article by Tim Hall if you want an example of that.

With column masking the string from the function is not actually used as a Where Predicate but as a Boolean TRUE/FALSE test. If the test results in TRUE, you see the column(s), otherwise they are hidden.

As you can see from the example, this column-level RLS can result in some odd looking (though intentional) results. A DESC of the table shows the column OTHER_NAME is mandatory but records are found with it set to null. There can also be the situation where a user can insert a value for the masked column, but then cannot see it when they select the data back, though other users can see the populated column.

A limitation of column-level masking is that you can only mask columns with NULL and not, for example, a string of “MASKED”. Thus if a column you are masking is not manadatory, how would you distinguish between a true NULL and a masked NULL? {A nod of thanks goes to Pete Finnigan for confirming this “NULL only” limitation.}

One of the appeals of RLS over eg views is that it is hard to circumvent, as it is applied at the level of the SQL. Another is that the same function can be applied to many tables and might also be applied to both column and row masking, if designed well.

That will almost do for an introduction to the use of RLS to mask rows and columns. However I’ll finish with just a few important points that follow on from this being a security feature.

  • I demonstrated that I could see what the function responsible for this security actually returns. If you do not protect unauthorised execution of that function, that could be a security hole.
  • I masked the column that is used to control if the OTHER_NAME column is masked. Would you need to also do this?
  • Data that is missing is data that could be of interest.
  • Anyone who can alter that function can undo your security.
  • Anyone who can view DBA_SOURCE (eg has SELECT ANY DICTIONARY) can view the function and see how it works.
  • As you can appreciate from the above, apply Security securely is a right pain in the proverbial.

Exclusion of Unioned SQL in Views – Followup August 20, 2012

Posted by mwidlake in database design, performance, SQL.
Tags: , ,
add a comment

Last week I put up a post about how Oracle can filter out sections of a union view..

Within the comments I put up another example where the CBO did not filter out all but one of the Union views despite my replicating the exact WHERE clause of one of the unioned statements. Jonathan Lewis posted a followup to say “Oracle can be very touchy about how it plays this game” and made a prediction of how the CBO would handle a slightly different scenario.

This was the extra scenario and I include brief details on creating the unioned view too. NB all on Oracle 11.2.0.2. {non-Exadata :-) }

  1  create table mdw_source
  2  tablespace users
  3  as select * from dba_objects

  1  create or replace view mdw_v1 as
  2  select
  3    '1'              src_type
  4   ,ms.OWNER ,ms.OBJECT_NAME,ms.SUBOBJECT_NAME ,ms.CREATED
  5  from mdw_source ms
  6  where ms.object_name like 'A%'
  7  union all
  8  select
  9    '2'              src_type
 10   ,ms.OWNER ,ms.OBJECT_NAME,ms.SUBOBJECT_NAME ,ms.CREATED
 11  from mdw_source ms
 12  where ms.object_name like 'B%'
 13  UNION ALL
 14  select
 15    '3'              src_type
 16   ,ms.OWNER ,ms.OBJECT_NAME,ms.SUBOBJECT_NAME ,ms.CREATED
 17  from mdw_source ms
 18  where ms.object_name like 'C%'

1 select count(created)
2 from mdw_v1 mv
3 where mv.object_name like ‘C%’

Execution Plan
———————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
———————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 75 | 551 (1)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | 75 | | |
| 2 | VIEW | MDW_V1 | 824 | 61800 | 551 (1)| 00:00:07 |
| 3 | UNION-ALL | | | | | |
|* 4 | TABLE ACCESS FULL| MDW_SOURCE | 1 | 34 | 184 (1)| 00:00:03 |
|* 5 | TABLE ACCESS FULL| MDW_SOURCE | 2 | 68 | 184 (1)| 00:00:03 |
|* 6 | TABLE ACCESS FULL| MDW_SOURCE | 821 | 27914 | 184 (1)| 00:00:03 |
———————————————————————————–

Predicate Information (identified by operation id):
—————————————————
4 – filter(“MS”.”OBJECT_NAME” LIKE ‘A%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
5 – filter(“MS”.”OBJECT_NAME” LIKE ‘B%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
6 – filter(“MS”.”OBJECT_NAME” LIKE ‘C%’)

Statistics
———————————————————-
8 recursive calls
0 db block gets
2417 consistent gets 

So, as you can see, despite me providing the WHERE clause in my final select as “where mv.object_name like ‘C%’” and the WHERE clauses of the unioned view are all similar and one is “ms.object_name like ‘C%’” the CBO fails to realise it can exclude all but one of the unioned SQL statements – despite the mutally exclusive filter predicates:

4 – filter(“MS”.”OBJECT_NAME” LIKE ‘A%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
5 – filter(“MS”.”OBJECT_NAME” LIKE ‘B%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)

I have to confess, in light of recent experience I would have wondered if these filter predicates would prevent the other two SQL statements being fired at execution time but the number of consistent gets confirms that it is not – 2417 compared to 805 or so for when the query exclusion occurs. You can check back to the prior post to verify that.

So, let’s do the tests Jonathan suggested. He suggested altering the view to use equality predicates so I have recreated the view as such:

TEST_DB> create or replace view mdw_v1 as
  2  select
  3    '1'              src_type
  4    ,ms.OWNER ,ms.OBJECT_NAME ,ms.SUBOBJECT_NAME ,ms.CREATED
  5  from mdw_source ms
  6  where substr(ms.object_name,1,1) ='A'
  7  union all
  8  select
  9    '2'              src_type
 10   ,ms.OWNER ,ms.OBJECT_NAME ,ms.SUBOBJECT_NAME ,ms.CREATED
 11  from mdw_source ms
 12  where substr(ms.object_name,1,1) ='B'
 13  UNION ALL
 14  select
 15    '3'              src_type
 16   ,ms.OWNER ,ms.OBJECT_NAME ,ms.SUBOBJECT_NAME ,ms.CREATED
 17  from mdw_source ms
 18  where substr(ms.object_name,1,1) ='C'

View created.

TEST_DB> set autotrace on
TEST_DB> --
TEST_DB> -- below will do a select of a non-indexed column across the view
TEST_DB> select count(created) from mdw_v1

COUNT(CREATED)
--------------
          1288

Execution Plan
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |     9 |   552   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |     9 |            |          |
|   2 |   VIEW               | MDW_V1     |  1677 | 15093 |   552   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   5 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   6 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets


TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via the src_type
TEST_DB> select count(created)
  2  from mdw_v1
  3  where src_type='3'

COUNT(CREATED)
--------------
           129

Execution Plan
------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    12 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |            |     1 |    12 |            |          |
|   2 |   VIEW                | MDW_V1     |   561 |  6732 |   184   (1)| 00:00:03 |
|   3 |    UNION-ALL          |            |       |       |            |          |
|*  4 |     FILTER            |            |       |       |            |          |
|*  5 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  6 |     FILTER            |            |       |       |            |          |
|*  7 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  8 |     TABLE ACCESS FULL | MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   6 - filter(NULL IS NOT NULL)
   7 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   8 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        807  consistent gets


TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via matching the where clause
TEST_DB> select count(created)
  2  from mdw_v1
  3  where substr(object_name,1,1) ='C'

COUNT(CREATED)
--------------
           129

Execution Plan
------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    75 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |            |     1 |    75 |            |          |
|   2 |   VIEW                | MDW_V1     |   561 | 42075 |   184   (1)| 00:00:03 |
|   3 |    UNION-ALL          |            |       |       |            |          |
|*  4 |     FILTER            |            |       |       |            |          |
|*  5 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  6 |     FILTER            |            |       |       |            |          |
|*  7 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  8 |     TABLE ACCESS FULL | MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   6 - filter(NULL IS NOT NULL)
   7 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   8 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        807  consistent gets


TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via NOT matching the where clause
TEST_DB> -- but maybe logical exclusion can be managed
TEST_DB> select count(created)
  2  from mdw_v1
  3  where object_name ='C'

COUNT(CREATED)
--------------
             0

Execution Plan
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |    75 |   551   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |    75 |            |          |
|   2 |   VIEW               | MDW_V1     |     3 |   225 |   551   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |     1 |    34 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |     1 |    34 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |     1 |    34 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME"='C' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   5 - filter("MS"."OBJECT_NAME"='C' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   6 - filter("MS"."OBJECT_NAME"='C' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets

TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via NOT matching the where clause
TEST_DB> -- but logically it is the same.
TEST_DB> select count(created)
  2  from mdw_v1
  3  where object_name like 'C%'

COUNT(CREATED)
--------------
           129

Execution Plan
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |    75 |   551   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |    75 |            |          |
|   2 |   VIEW               | MDW_V1     |    24 |  1800 |   551   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |     8 |   272 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |     8 |   272 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |     8 |   272 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME" LIKE 'C%' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   5 - filter("MS"."OBJECT_NAME" LIKE 'C%' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   6 - filter("MS"."OBJECT_NAME" LIKE 'C%' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets

Above, I create the view, do a quick select count(*) to cause a full scan of all three section in the union view, then do selects using various WHERE predicates to see what sections of the unioned SQl statements are excluded.

The results are very intersting. The filtering on sub_type still works fine, but now the WHERE predicate “where substr(object_name,1,1) =’C’” allows the CBO to exclude sections of the union, the “Null is not null” filters appear and the consistent gets is 807. The WHERE predicate “where object_name =’C'” which at first glance you might think to be logically the same but is not (we selected object_names where only the first character is checked but the whole object_name is selected) acts does not do any exclusion and I thought it might. There are no OBJECT_NAMES of ‘C’ but logically as a human you can see that such an object name could not be found by the first two unioned SQL statements in any case.

The last test is most interesting. The WHERE clause of the SQL select over the view is “where object_name like ‘C%’” and is logically the same as the one WHERE clause in the unioned view “where substr(object_name,1,1) =’C’“. So no exclusion occurs and again we see 2417 consistent gets.

It would seem the CBO Pixies either have not coded for that or, as Jonathan says Oracle can be “touchy” about this feature. Bottom line, if you are expecting this sort of unioned view SQL exclusion to occur – TEST!

Dropped Tables, Hiding Extents and Slow DBA_FREE_SPACE Queries February 16, 2012

Posted by mwidlake in database design, internals, performance.
Tags: , , ,
8 comments

My last post was on slow dictionary queries caused by having many, many thousands of extents in your database. This post is about a special case of this problem, which I encountered recently. It was on an Exadata box – Exadata is not particularly relevant to the issue, but I’m curious to see if mentioning Exadata will make this post more popular :-)

I was querying the used and free space on a very small database on the client’s X2-2 1/4 rack Exadata machine. The query was really slow, taking about 30 seconds. This is a FAST box, what is going on?

I quickly realised that the problem was specific to one tablespace:

>@SPC_SUM
Enter the tablespace (or leave null)&gt; DATA_01

TS_NAME              ORD      SUM_BLKS        SUM_K  MAX_CHNK_K NUM_CHNK
-------------------- ----- ----------- ------------ ----------- --------
DATA_01              alloc     262,144    2,097,152   2,097,152        1
                     free       63,128      505,024     504,384       11
2 rows selected.
Elapsed: 00:00:00.21

&gt; @SPC_SUM
Enter the tablespace (or leave null)&gt; USERS

TS_NAME              ORD      SUM_BLKS        SUM_K  MAX_CHNK_K NUM_CHNK
-------------------- ----- ----------- ------------ ----------- --------
USERS                alloc     748,320    5,986,560   5,372,160        2
                     free      127,904    1,023,232       6,144    3,058
2 rows selected.

Elapsed: 00:00:26.05

We can see it takes 0.21 seconds for tablespace DATA_01, 26.05 seconds for the USERS. Yes, USERS is larger but not 150 times larger. What is going on? Well, as the title of this post suggests, it is down to dropping tables. The below shows my route to that conclusion.

What I could see was that for the USERS tablespace the maximum chunk of free space was relatively small compared to the sum of free space – 6MB out of 1GB, pretty much – and that there was a lot of individual pieces of free space, 3,058. This tablespace was shattered into a lot of bits. So, what sort of extent size management do we have? How big are the tables and indexes in this tablespace {NB I already knew we had no partitions so I did not have to worry about that}.

select tablespace_name,initial_extent,next_extent
      ,extent_management,min_extlen,allocation_type
from dba_tablespaces where tablespace_name ='USERS'

TABLESPACE_NAME                INITIAL_EXTENT NEXT_EXTENT EXTENT_MAN MIN_EXTLEN ALLOCATIO
------------------------------ -------------- ----------- ---------- ---------- ---------
USERS                                   65536             LOCAL           65536 SYSTEM

select tablespace_name,blocks,count(*) from dba_extents
where tablespace_name = 'USERS'
group by tablespace_name,blocks
having count(*) >1
order by blocks desc,tablespace_name

TABLESPACE_NAME                    BLOCKS   COUNT(*)
------------------------------ ---------- ----------
USERS                                2560          2
USERS                                2048          3
USERS                                1536          7
USERS                                1408          5
USERS                                1280          2
USERS                                1248          2
USERS                                1152          2
USERS                                1024        229
USERS                                 896         15
USERS                                 768         21
USERS                                 736          3
USERS                                 720          3
USERS                                 704          2
USERS                                 672          2
USERS                                 640         25
USERS                                 624          2
USERS                                 576          2
USERS                                 512        117
USERS                                 400          2
USERS                                 384         34
USERS                                 360          2
USERS                                 312          2
USERS                                 288          4
USERS                                 256         49
USERS                                 248          2
USERS                                 240          2
USERS                                 192          5
USERS                                 160          4
USERS                                 128       1165
USERS                                   8       1788

30 rows selected.

So we have system controlled extent allocation, this should be fine. I did a quick check of the table contents – 300 or so tables and 200 or so indexes. And, as you can see from the check of extents above, very few larger extents and a lot of small ones. {As a little aside, note the relatively large number of 8-block, 128-block, 256-block and 1024-block extents; These are the sizes that LMTs with system managed extents tend to use unless a similar sized chunk is found to be available, in which case it is used}.

I did some other checks on indexes and segment sizes and it all boiled down to one thing. There were not a lot of things in this tablespace, what was there was small and… The total of all those table and index blocks was way less than the used space in the tablespace.

That is the major indicator of what is going on here. The difference in space used in the tablespace and the total of the visible tables and indexes.

The last link in the chain is the recycle bin.

select owner,ts_name,count(*) from dba_recyclebin group by owner,ts_name

OWNER                          TS_NAME                          COUNT(*)
------------------------------ ------------------------------ ----------
USER1                          USERS                              542356
USER1                                                                  2
WEGWEGWEG                      USERS                                  97
KKKUKUYLLX                     USERS                                 149
USOVFPKEKS                     USERS                                   3
....
ERHJTRTTTURT                   USERS                                   4

11 rows selected.

That’s 542,356 objects in the recyclebin for one user, in the one tablespace. My problem tablespace. The penny clicked, something I had already noticed and was a little uncomfortable about fell into place.

The client is using Oracle Data Integrator (ODI) to pull data together and put it into the database. The process they are using basically creates a table, uses it to load some data into and then pushes the data into the target tables. Then drops the table. The drop does not do a “DROP TABLE … PURGE;”. This is done many, many times per load cycle, which is intended to run several times a day.

Something you should always keep in mind with the recyclebin is that the tables, indexes and their extents that go into the recycle bin do not get reported in several of the key data dictionary views. I did mention this in a blog post way back but maybe a whole post on it is called for.

So, the tablespace was being shattered by the constant creating and dropping of small tables. It was hidden from easy view due to how the recyclebin is (not) exposed in data dictionary views.

It is not good practice to constantly create and drop lots of tables. As well as the admittedly rather odd impact that this posting is all about, there is a lot of internal work involved for the oracle database in creating and dropping tables. It is an overhead best avoided. The client had good reasons for this approach but now they are going to look at the alternatives.

I’ve not actually proven in this posting that all those dropped tables is the actual cause of the slow performance in querying the free space in that tablespace {though the corroborative evidence is very strong}. Neither have I said how we cleared up the mess. I’ll cover both of those in the next post on this mini-thread.

Oracle Nostalgia December 15, 2011

Posted by mwidlake in database design, development.
Tags: ,
16 comments

When preparing the material for my “Oracle Lego – an introduction to Database Design” presentation for the UKOUG last week, I was looking back at my notes from a course on the topic from “a few years back”. There were a few bits which made me smile.

Oracle’s [SQL] implementation conforms to ANSI standard, although referential integrity will not be enforced until version 7

Any other old geezers having flashbacks? I am so glad my first major Oracle development project swapped to using a Beta of V7 very early, so we had the integrity turned on during most of development. I had to help a few projects go from V6 to V7 and turn on the RI – it was usually very, very painful. Or impossible. I always think back to those nightmare experiences when some bright spark suggests turning off referential integrity for “ease of development” or “performance” reasons. There are good performance reasons for altering how you implement RI but, as I said during my presentation on database design, I have never, ever, ever seen a system with RI turned off that did not have damaged data.

Oracle’s optimiser is rule-based. Designing efficient queries involves taking advantage of the optimiser behaviour

.

You can tell this course was run in the UK due to the lack of ‘z’ in ‘optimiser’ :-). How many of use can now make a stab at the seven or eight significant rules from the 15 (16, 17 as versions advanced) in the list? Several rules were to do with Clusters so you did not care. Let’s think, what were the main things to keep in mind…

  • most significant table last in the FROM clause and order upwards in the order you wanted to visit {most significant being the one you felt you could most efficiently do the first filter against}
  •  WHERE clauses ordered downwards in the order you wanted them to be applied.
  • Order of preference to identify a row was something like ROWID, primary key, unique key, full non-unique key, partial unique key, partial non-unique key, full index scan, full table scan.
  • Disable index access by adding 0 to numeric columns and concatinating null to varchars.

I’ve not checked back in the manuals (I have a set of the V7 on my laptop) so I’m probably wrong.

Storage….selecting suitable values for storage parameters … will improve the final performance of the database

Considering the “suitable values for storage parameters” was perhaps my first real conscious step into being a performance/design guy {I was lucky to be on a project where designing for the RBO and matching those rules was just part of being any developer}, but the calculating of rows-per-block, initial and next extent, pctincrease (not always zero you know), initrans/maxtrans, segment to tablespace size… I learnt all about that and had spreadsheets for it all.

Now of course, all of the above about storage (and RBO) has pretty much disappeared. Oracle has made some of the contents of my brain redundant.

But some things have not changed at all in 18 years:

Users can be relied upon to know what they do NOT want, not what they want, which {unfortunately} is the premise from which analysis starts)

I think the above is the fundamental issue from which all iterative design methodologies spring. ie do not believe what the user says they want, show them something and fix it. It is probably human nature that we are not well able to express what we want but have no problem pointing out something is not at all what we want :-). Add in all the issues in respect of forgetting about the exceptions, assumed knowledge, incompatible vocabularies (the words your users say to you are as confusing as the techno-babble you fire back at them) and all analysis is fundamentally flawed.

Do some analysis – but then prototype like crazy. With real users.

Lack of Index and Constraint Comments November 24, 2011

Posted by mwidlake in Architecture, database design, development.
Tags: , , , ,
10 comments

Something I’ve just reminded myself of is that under Oracle you cannot add a comment on an index or a constraint. You can only add comments on tables, views, materialized views, columns of those object types and a couple of esoteric things like Operators, Editions and Indextypes.

Here is an example of adding comments to tables and columns:

set pause off feed off
drop table mdw purge;
create table mdw(id number,vc1 varchar2(10));
comment on table mdw is 'Martin Widlake''s simple test table';
comment on column mdw.id is 'simple numeric PK sourced from sequence mdw_seq';
comment on column mdw.vc1 is'allow some random text up to 10 characters';
--
desc user_tab_comments

 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------------------------------------
 TABLE_NAME                                            NOT NULL VARCHAR2(30)
 TABLE_TYPE                                                     VARCHAR2(11)
 COMMENTS                                                       VARCHAR2(4000)

--
select * from dba_tab_comments where table_name='MDW'
/
OWNER                          TABLE_NAME                     TABLE_TYPE
------------------------------ ------------------------------ -----------
COMMENTS
----------------------------------------------------------------------------------------------------
MDW                            MDW                            TABLE
Martin Widlake's simple test table

select * from dba_col_comments where table_name='MDW'
order by column_name
/
OWNER                          TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------ --------------
COMMENTS
----------------------------------------------------------------------------------------------------
MDW                            MDW                            ID
simple numeric PK sourced from sequence mdw_seq
MDW                            MDW                            VC1
allow some random text up to 10 characters
-- now to add a big comment so need to use the '-' line continuation character in sqlplus
--
comment on table mdw is 'this is my standard test table.-
 As you can see it is a simple table and has only two columns.-
 It will be populated with 42 rows as that is the solution to everything.'
/
select * from dba_tab_comments where table_name='MDW'
OWNER                          TABLE_NAME                     TABLE_TYPE
------------------------------ ------------------------------ -----------
COMMENTS
----------------------------------------------------------------------------------------------------
MDW                            MDW                            TABLE
this is my standard test table.  As you can see it is a simple table and has only two columns.  It w
ill be populated with 42 rows as that is the solution to everything.
--
/

Adding comments on tables, views and columns seems to have dropped out of fashion over the years but I think it is still a very useful feature of oracle and I still do add them (though I am getting a little slack about it myself over the last 3 or 4 years, which I must stop).

Comments are great, you can put 4000 characters of information into the database about each table, view and column. This can be a brief description of the object, a full explanation of what a column is to hold or even a list of typical entries for a column or table.

But you can’t add a comment on indexes or constraints. Why would I want to? Well, constraints and indexes should only be there for a reason and the reason is not always obvious from either the names of the columns or the name of the constraint or index, especially where you have a naming standard that forces you to name indexes and constraints after the columns they reference.

When you design a database, do a schema diagram or an ERD, you label your relationships between entities/tables. It tells you exactly what the relationship is. You might create an index to support a specific method of access or particular business function. You might alter the index in a way not immediately obvious to the casual observer, such as to allow queries that use the index to avoid having to visit the table. All of those things will, of course, be fully documented in the maintained project documentation in the central repository, available and used by all…

If I was able to add comments to constraints and indexes within the database then they would there. You move the system from one platform to the other, they are there. If for any wildly unlikely reason the central documentation lets you down, the information is always there in the database and easy to check. You may not be able to track down the original design documents but you have the database in front of you, so comments in that will persist and be very easy to find.

Lacking the ability to add comments on indexes and constraints, I have to put them at the table level, which I always feel is a kludge. I might actually raise an enhancement request for this, but as Oracle 12 is already nailed down, it will have to wait until Oracle 14. (A little bird told me Larry said there would be no Oracle 13…).

Follow

Get every new post delivered to your Inbox.

Join 156 other followers