jump to navigation

Row Level Security 3 – In Pictures! November 22, 2012

Posted by mwidlake in database design, security.
Tags: , ,
3 comments

<..Part one intro and examples
<….Part two Permissions

I’ve noticed that there has not been a lot of traffic on this series on Row Level Security (data masking) so far – maybe due to how I am presenting the material? So here is a summary to date in picture/diagram format:

This is the flow of logic as to whether Row Level Security is applied to the SQL statement. Usually this is a query and usually it is on a table but the policy can be applied to other DML and also on views and synonyms:

These are the steps you need to add RLS to a table:

Below is a summary of how oracle fires and uses RLS functions and the predicates they provide:

An finally, a summary of what happens if an RLS function is invalid:

If you have not looked at the previous posts on this topic, more details are in there.

Enjoy!

Row Level Security Part 2 – permissions November 20, 2012

Posted by mwidlake in internals, security, SQL.
Tags: , ,
3 comments

<..Part 1, introduction..
..Part 3 summary in pictures..>

In this second post on the topic of “an introduction to Row Level Security” I want to cover a few things about what permissions you need to implement RLS and some of the consequences. In my introduction in part one I just said my main user has “DBA type Privileges”.

{NB This is all on Oracle V11.2 and I believe everything below is applicable to V10 as well. Also, I should point out that I am not an Oracle security expert – but despite repeatedly saying this, it seems like at least once a year I am asked to improve a system’s security on the grounds of “more than we have now is an improvement”}.

Firstly, you need the permission to create and drop RLS policies. This is controlled by having the EXECUTE privilege on SYS.DBMS_RLS. That is it. If you can execute this package you can create and drop RLS Policies, enable and disable them and also create, manage and use Policy Groups.

Obviously EXECUTE on SYS.DBMS_RLS is a privilege you need to restrict to only “empowered” accounts- but you also want to be very careful about your definition of empowered. If an account has this privilege, it can alter (disable!) any RLS policies that ANYONE created on the instance. You may need to think carefully about having several accounts that can alter all the RLS policies across all schemas in your instance.

You might want to create a single user that has the EXECUTE on SYS.DBMS_RLS and use that to implement and control RLS across the instance {and even have the account locked when it is not needed}. Of course, under normal Oracle anyone with SYS access can run the procedure – but you have SYS access under control already, don’t you?… :-).

By default, who has this privilege?

MDW> @obj_privs
Enter value for obj_name: dbms_rls
Enter value for obj_own: sys
GRANTEE              OWNER        TABLE_NAME             GRANTOR  PRIVILEGE    GRA
-------------------- ------------ ---------------------- -------- ------------ ---
EXECUTE_CATALOG_ROLE SYS          DBMS_RLS               SYS      EXECUTE      NO
XDB                  SYS          DBMS_RLS               SYS      EXECUTE      NO
WMSYS                SYS          DBMS_RLS               SYS      EXECUTE      NO

--
-- and who has that EXECUTE_CATALOG_ROLE?
--
select * from dba_role_privs where granted_role='EXECUTE_CATALOG_ROLE'

GRANTEE                        GRANTED_ROLE                   ADM DEF
------------------------------ ------------------------------ --- ---
DBA                            EXECUTE_CATALOG_ROLE           YES YES
IMP_FULL_DATABASE              EXECUTE_CATALOG_ROLE           NO  YES
EXP_FULL_DATABASE              EXECUTE_CATALOG_ROLE           NO  YES
SYS                            EXECUTE_CATALOG_ROLE           YES YES

So, our old friend the DBA role has that privilege, via the EXECUTE_CATALOG_ROLE and IMP_/EXP_FULL_DATABASE.

Let’s go back to creating policies. I’m sticking to row-level RLS for now, not column-level. When you create a policy you basically are saying “when object X is accessed, run function Y and apply the resulting string to all access, as a predicate”

Lifted an example from the manual, but adding in a couple of bits the authors were too lazy to put in { :-) } :-

DBMS_RLS.ADD_POLICY (
 object_schema   =>   'hr'
,object_name     => 'employees'
,policy_name     => 'emp_policy'
,function_schema => 'hr_rls'
,policy_function => 'emp_sec'
,statement_types => 'select');

The function HR_RLS.EMP_SEC returns a string P1

When someone issues a statement that access the HR.EMPLOYEES table the function HR_RLS.EMP_SEC is executed as part of the parse and, internally, the following view is created and placed into the orginal statement:

SELECT * FROM hr.employees WHERE P1;

As this takes place as part of the parse part of the statement processing, errors with the function are found before any effort to access the actual data occur. It is also why it is not simple to avoid RLS – it is done as part of the overall SQL processing carried out by Oracle.

If the function is invalid, generates an error or the supplied “WHERE predicate” P1 causes the final statement to be syntactically incorrect, you get an error.
At this point Oracle fails “secure” and simply won’t fire the SQL statement. Even if you own the tables and have full privilege access to them. I think it is worth a quick demonstration

I am logged on as the owner of the table and RLS function, MDW:-

MDW> select * from test1
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
--
--All works OK
MDW> @rls_func_damage
--
CREATE OR REPLACE FUNCTION hide_rows (
 v_schema IN VARCHAR2
,v_objname IN VARCHAR2)
RETURN VARCHAR2 AS
con VARCHAR2 (200);
BEGIN
-- The line below states loser not user
con := (loser='''||v_schema||''' or MASK_FL > 3)';
RETURN (con);
END hide_rows;
/
Function created.
--
-- Let us check the predicate coming from the function
--
select hide_rows(v_schema=>user,v_objname=>'TEST1') from dual;

HIDE_ROWS(V_SCHEMA=>USER,V_OBJNAME=>'TEST1')
---------------------------------------------------------------------------------------
(loser='MDW' or MASK_FL > 3)

-- "loser" is not a valid piece of syntax or a known variable. So...
--
select * from TEST1
              *
ERROR at line 1:
ORA-28113: policy predicate has error

-- I now damage the statement creating the function so that it is syntactically incorrect
CREATE OR REPLACE FUNCTION hide_rows (
...
-- missing quote
con := (loser='''||v_schema||''' or MASK_FL > 3)';
...

Warning: Function created with compilation errors.

select * from TEST1
              *
ERROR at line 1:
ORA-28110: policy function or package MDW.HIDE_ROWS has error

--re-create the function as I started with

Function created.

select * from TEST1
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
...
         7 RED                            ROGER THE RED                           6
--
-- Back to a working system

As you can see, you get ORA-28110 if the function used by the RLS policy is damaged and ORA-28113 if the function returns a non-valid WHERE Predicate. In either case, the data in those tables is now inaccessible. That could be a problem…

It is of course very interesting that RLS is kind of “blind” to the functions it uses until it uses it. You can alter RLS activity if you can alter the functions.

Last post I mentioned that, as you RLS security is based on these functions, you should give thought to who can see them and change them. Hopefully the above highlights that. There is another access aspect to these functions that is key.

In the first example where I use DBMS_RLS.ADD_POLICY, I state function_schema – this is who owns the PL/SQL function. It defaults to the person who creates the Policy. The function is fired in the standard way with DEFINER RIGHTS – i.e. the rights of the function owner. {You could override this when creating the function such that it runs with executioner rights – but that strikes me as a potentially very confusing and a source of much pain and anguish}.

So the function can see what the owner can see, not the executioner. This has two key effects:

  • You can restrict the access to any DB objects that the function requires to the owner of the function. The end user does not need to have access and it may be more secure if they do not.
  • You have to grant access to objects directly to the function owner, via either object grants or system grants. Stored PL/SQL does not “see” roles if executed with definer rights.

So eg if your function is owned by MDW and references a master table called SEC_MASTER.MASKED_USERS then you need to grant select on SEC_MASTER.MASKED_USERS to MDW. No using roles.
{of course, calling a table MASKED_USERS might give anyone acquiring access to the DB a clue that it is important. I tend to call such “security important” tables things that are “obviously” boring, like ROTA_DATA.}

Finally, that WHERE Predicate P1 is in effect added to your SQL statement. It can be pretty much anything. It can reference other database objects. If it references other database objects THE EXECUTING USER MUST BE ABLE TO SEE THEM. Not the owner of the function; that function has been fired and the WHERE Predicate generated already. On actually running the SQL, the executing user must have access to all objects referenced, including those introduced by the P1 WHERE Predicate. Oh, and remember synonyms!

There is there a way to sidestep RLS and you sometimes need to.

If your logon has the EXEMPT ACCESS POLICY system privilege then RLS functions are utterly ignored by you. No check is made to see if policies apply to your SQL statements, let alone run them. You may need to use this privilege if you need to access functionality that RLS has issues with (see comments by Dom Brooks and Tony Sleight on my first post). You might also need it if you have errors with the policies but you need access to the data to keep the business moving. You probably need a locked-down user somewhere with the EXEMPT ACCESS POLICY privilege if you use RLS.
Of course, great care has to be taken in making use of this privilege as it side-steps all RLS_based security.
If you have master users that have full access to the data, this privilege also removes the overhead of the SQL engine identifying any policies, firing the function and including the extra predicates.

One very final point. I have found myself writing RLS functions that all check if the user was the owner of the table and, if so, allowing them to see all data. By granting EXEMPT ACCESS POLICY to that user I was able to do away with those checks and make the functions much simpler. In this case I did not so much think of EXEMPT ACCESS POLICY turning off security but the lack of it turning it on for restricted users.

That will do for part 2.

Row Level Security Part 1 November 15, 2012

Posted by mwidlake in database design, security.
Tags: , ,
9 comments

I’ve been working a little on Row Level Security (RLS) recently and wanted to mention a few things, so first some groundwork.

If you want to limit the rows certain users can see, you might think to use views or you might think to use RLS (part of VPD – Virtual Private Database). You can also (from V10 I think) limit which columns users can see. An example is probably the best way to show this. I’m doing this on Oracle 11.2.0.3.

I have two users, MDW and MDW_OFFSHORE. MDW has DBA-type privileges and MDW_OFFSHORE has connect, resource and one or two other simple privs. I will now demonstrate creating and populating a simple table under MDW, adding RLS to it and how it alters what MDW_OFFSHORE sees.

MDW> -- create a table and show row level and column level rls
MDW> --
MDW> drop table TEST1 purge;
MDW> --
MDW> create table test1
  2  (id         number       not null
  3  ,cp_name    varchar2(30) not null
  4  ,other_name varchar2(30) not null
  5  ,MASK_FL    number
  6  )
  7  /
MDW> insert into test1 VALUES (1,'ERIC','ERIC THE RED',1);
MDW> insert into test1 VALUES (2,'BROWN','BOB THE BROWN',2);
MDW> insert into test1 VALUES (3,'GREEN','GARY THE GREEN',1);
MDW> insert into test1 VALUES (4,'BLUE','BILL THE BLUE',3);
MDW> insert into test1 VALUES (5,'BLACK','DAVE THE BLACK',4);
MDW> insert into test1 VALUES (6,'PURPLE','PETE THE PURPLE',5);
MDW> insert into test1 VALUES (7,'RED','ROGER THE RED',6);
MDW> --
MDW> COMMIT;
MDW> SELECT * FROM TEST1;

        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
MDW> grant all on test1 to mdw_onshore;
MDW> grant select on test1 to mdw_offshore;
MDW> --
--
--This is where I create a function to generate the RLS rules
--
MDW> CREATE OR REPLACE FUNCTION hide_rows (
  2   v_schema IN VARCHAR2
  3  ,v_objname IN VARCHAR2)
  4  RETURN VARCHAR2 AS
  5  con VARCHAR2 (200);
  6  BEGIN
  7  -- note the use of backets to make it an atomic test
  8  con := '(user='''||v_schema||''' or MASK_FL > 3)';
  9  RETURN (con);
 10  END hide_rows;
 11  /
MDW> --
--
-- Now add the actual RLS policy to the table TEST1 using the function hide_rows
MDW> BEGIN
  2    sys.DBMS_RLS.ADD_POLICY (
  3      object_schema     => 'MDW'
  4     ,object_name       => 'TEST1'
  5     ,policy_name       => 'hide_rows_policy'
  6     ,policy_function   => 'hide_rows'
  7     ,function_schema   => 'MDW'
  8                            );
  9  END;
 10  /
MDW> --

That’s it. In effect you create a function, in this hide_rows, that creates a WHERE predicate (more on that later) and associate it with the table by creating a policy hide_ rows_policy.

Now let’s see what impact it has:

-- as the main user:
--
MDW> select * from test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
MDW>
-- now as the offshore user:
mdw_offshore> select * from mdw.test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
3 rows selected.

As you can see, mdw_offshore is seeing only a subset of rows in the table. This subset is determined by the function hide_rows. This function returns a WHERE PREDICATE that is added to any query run against the table. It is actually very simply to see what that predicate is, you just call the function passing in the table owner and name:

MDW> select hide_rows(v_schema => user,v_objname=>’TEST1′) from dual;

HIDE_ROWS(V_SCHEMA=>USER,V_OBJNAME=>’TEST1′)
——————————————————————————-
(user=’MDW’ or MASK_FL > 3)

Note that my refernce to v_owner in the function comes out as the text MDW – a way to ensure the predicate does not filter data looked at by the table owner. This implies of course that yes, you can stop the table owner from seeing data!

By default, this extra WHERE PREDICATE is applied to any SELECT, INSERT, UPDATE or DELETE statement run against this table. The function has to accept the v_schema and v_objname parameters and output a varchar2 string. At the time of being called as part of a RLS policy, oracle passes in the owner and name of the object being masked (which is usually a table and sometimes a view).
There are a few things I think it is important to keep in mind in respect of this extra Where Predicate:-

  • You have very little control over what other where predicates will be applied to DML against this table, so keep it self-contained – which is why I bound it in brackets.
  • It has to be syntactically correct but no check is made for this when you create the function – it is simply a string being outputted, as far as Oracle is concerned, at the time of creation.
  • It can be any valid Where Predicate you like, so long as it does not reference other rows in the table with the policy is applied to
  • This could be fired a lot. The performance impact could be considerable, which is why CONTEXTs are often used (as they are memory resident and can be configured by eg logon triggers) but I have skipped over that. Go look at the manuals.

I mentioned that the masking of data could be applied to just a column rather than whole rows. So let’s demonstrate that. I can do this by simply dropping and reapplying the same function in a different way:

MDW> BEGIN
-- first drop the policy, simply by stating the object owner, name and policy
  2    sys.dbms_rls.drop_policy(
  3      object_schema     => 'MDW'
  4     ,object_name       => 'TEST1'
  5     ,policy_name       => 'hide_rows_policy'
  6                            );
-- and recreate - note the two new paramters 
  7    sys.DBMS_RLS.ADD_POLICY (
  8      object_schema     => 'MDW'
  9     ,object_name       => 'TEST1'
 10     ,policy_name       => 'hide_rows_policy'
 11     ,policy_function   => 'hide_rows'
 12     ,function_schema   => 'MDW'
 13     ,sec_relevant_cols => 'OTHER_NAME,MASK_FL'
 14     ,sec_relevant_cols_opt => sys.dbms_rls.ALL_ROWS
 15                            );
 16  END;
 17  /
PL/SQL procedure successfully completed.
--
-- as the main user
MDW> select * from test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC                           ERIC THE RED                            1
         2 BROWN                          BOB THE BROWN                           2
         3 GREEN                          GARY THE GREEN                          1
         4 BLUE                           BILL THE BLUE                           3
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
--
-- now as the offshore user:
mdw_offshore> select * from mdw.test1
  2  /
        ID CP_NAME                        OTHER_NAME                        MASK_FL
---------- ------------------------------ ------------------------------ ----------
         1 ERIC
         2 BROWN
         3 GREEN
         4 BLUE
         5 BLACK                          DAVE THE BLACK                          4
         6 PURPLE                         PETE THE PURPLE                         5
         7 RED                            ROGER THE RED                           6
7 rows selected.
--
-- but what can look a little odd is that if mdw_offshore describes the table..
mdw_offshore> desc mdw.test1
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- --------------
 ID                                                    NOT NULL NUMBER
 CP_NAME                                               NOT NULL VARCHAR2(30)
 OTHER_NAME                                            NOT NULL VARCHAR2(30)
 MASK_FL                                                        NUMBER
--
-- they see that OTHER_NAME is NOT NULL but...
mdw_offshore> select count(*) from mdw.test1 where other_name is null;

  COUNT(*)
----------
         4

That’s rather nice, I think. The same function can be used and rather than hide the rows it masks the stated columns. The sec_relevant_cols => ‘OTHER_NAME,MASK_FL’ parameter states the columns to apply the function on and the sec_relevant_cols_opt => sys.dbms_rls.ALL_ROWS parameter means “show a row even if a column or columns are hidden”. Without it, you only see rows if you do not reference the masked columns (even if indirectly via eg “select (*)”) – see this nice article by Tim Hall if you want an example of that.

With column masking the string from the function is not actually used as a Where Predicate but as a Boolean TRUE/FALSE test. If the test results in TRUE, you see the column(s), otherwise they are hidden.

As you can see from the example, this column-level RLS can result in some odd looking (though intentional) results. A DESC of the table shows the column OTHER_NAME is mandatory but records are found with it set to null. There can also be the situation where a user can insert a value for the masked column, but then cannot see it when they select the data back, though other users can see the populated column.

A limitation of column-level masking is that you can only mask columns with NULL and not, for example, a string of “MASKED”. Thus if a column you are masking is not manadatory, how would you distinguish between a true NULL and a masked NULL? {A nod of thanks goes to Pete Finnigan for confirming this “NULL only” limitation.}

One of the appeals of RLS over eg views is that it is hard to circumvent, as it is applied at the level of the SQL. Another is that the same function can be applied to many tables and might also be applied to both column and row masking, if designed well.

That will almost do for an introduction to the use of RLS to mask rows and columns. However I’ll finish with just a few important points that follow on from this being a security feature.

  • I demonstrated that I could see what the function responsible for this security actually returns. If you do not protect unauthorised execution of that function, that could be a security hole.
  • I masked the column that is used to control if the OTHER_NAME column is masked. Would you need to also do this?
  • Data that is missing is data that could be of interest.
  • Anyone who can alter that function can undo your security.
  • Anyone who can view DBA_SOURCE (eg has SELECT ANY DICTIONARY) can view the function and see how it works.
  • As you can appreciate from the above, apply Security securely is a right pain in the proverbial.

Friday Philosophy – Is the CBO becoming way too complex? October 19, 2012

Posted by mwidlake in Friday Philosophy, performance.
Tags: , ,
20 comments

I was at the SIOUG annual conference in Slovenia this week (and a very good conference it was too) and I was watching a presentation by Christian Antognini about how the CBO learns by it’s mistakes. This was of course mostly about adaptive cursor sharing and cardinality feedback. Chris was also able to share a few tid-bits about 12c enhancements in this area. I can’t go into details, but basically it looks like the CBO is going to not only enhance those two features but there is a new one where the CBO can change the plan on the fly, as the same query progresses.

As I watched the presentation I found I was in two minds about this. Part of me was thinking “Yes, these features do help in the CBO realising it has chosen a poor plan and improving things for the next execution – and it will get better at it”. The other part was thinking “Oh dear, yet another source of confusion and of performance of queries changing when ‘nothing has changed’“.

It also solidified an idea I have for a new presentation, about how what you see in the execution plan may well not be what is actually executed. ie there are optional bits in there that do not get run at execution time, depending on the results of prior steps.

But I digress. What I ended up wondering at the end of Chris’s talk was this. Is the CBO getting just too complex? It is all very good that extra abilities are being added so that there is a better chance of a good plan being chosen in the first place and the various ways we can control the choice of the plan grows and grows. But we are long past the point when you could write even a simple sql statement and know what the plan will be before you explain it – let alone run it, check the plan and then run it again a few minutes later and be sure the plan will be the same.

Why does this bother me? For three reasons:

1) Performance will not be stable, so the user experience will not be consistent – and so they will be unhappy. Users will generally be happy if they run a report and it takes one minute, if it takes one minute each time. If it takes one minute and then it takes 5 seconds, they want it to always run in 5 seconds, else they fell cheated or that the system is broken. In fact, I am pretty sure a user will be happier if a report always takes 2 minutes rather than it take 5 seconds or 40 seconds but they never know which! (So long, that is, that they never, ever see it run in less than 2 minutes).

2) If performance of a particular SQL statement is absolutely crucial, I need to know how the CBO can and probably will satisfy it so that I can work that into my solution. Otherwise I am reduced to trying out informed options (or maybe random changes :-) ) in order to find the optimal solution – and I cannot be sure that the plan won’t suddenly swap later unless I fix the plan. OK, it might swap to be faster, but I only notice when it swaps to be a lot slower.

3) I’ve said this before, but I am starting to really wonder how anyone new to this starts understanding oracle performance tuning any more. Reading blogs and books about how Oracle works and what impacts the CBO teaches you a lot about what is possible but it only makes sense when you do little tests and proofs of concepts. ie you actually run simple examples and see what happens. But with all these options, considerations and now on-the-fly alterations to plans by the CBO, it is very confusing. Even I, after 20 years of doing performance tuning on and off, am constantly finding myself looking at relatively simple SQL and having to work out why the plan is as it is and satisfying myself that it has not changed. I started with the CBO when it was a lot simpler and I’ve been introduced to the complexities gradually, as they have been introduced, so learning about it has been easier for me.

Perhaps I should not worry about this. I mean, the CBO is getting better at doing what it does, that is good. It is getting hard for those younger and smarter than me to learn about it, so my career is protected. And there is more stuff to talk about so I can keep going to conferences and talking about them.

And with that happy idea, I’m off to start the weekend. Where was that cork screw?

DBMS_APPLICATION_INFO for Instrumentation September 3, 2012

Posted by mwidlake in development, Instrumentation, performance.
Tags: , , , ,
15 comments

I just wanted to put up a post about DBMS_APPLICATION_INFO. This is a fantastic little built-in PL/SQL package that Oracle has provided since Oracle 8 to allow you to instrument your code. i.e record what it is doing. I’m a big fan of DBMS_APPLICATION_INFO and have used it several times to help identify where in an application time is being spent and how that pattern of time has altered.

Some PL/SQL developers use it and some don’t. It seems to me that it’s use comes down to where you work, as most PL/SQL developers are aware of it – but not everyone uses it (a friend of mine made the comment recently that “all good PL/SQL developers use it“. I can understand his point but don’t 100% agree).

It is incredibly easy to use. You use the procedures SET_MODULE(module_name,action_name), SET_ACTION(action_name) and SET_CLIENT_INFO(client_info) to set the values of the corresponding columns in V$SESSION for your current session. This is done in memory, there is no inserting or updating of any table rows, so it is incredibly quick and light-weight. The below shows a subset of V$SESSION including the columns that get set:

desc v$session
Name Null? Type
----------------------------------------- -------- ----------------
SADDR RAW(8)
SID NUMBER
SERIAL# NUMBER
AUDSID NUMBER
PADDR RAW(8)
USER# NUMBER
USERNAME VARCHAR2(30)
COMMAND NUMBER
...
MODULE VARCHAR2(64)
MODULE_HASH NUMBER
ACTION VARCHAR2(64)
ACTION_HASH NUMBER
CLIENT_INFO VARCHAR2(64)
...

Just something odd to note. MODULE, ACTION and CLIENT_INFO are shown at VC(64) but if you check the documentation you will see that:

MODULE is limited to 48 bytes
ACTION is limited to 32 bytes
CLIENT_INFO can be set to the whole 64 bytes

I have no idea why MODULE and ACTION are limited in this way but it might be historic, backward compatibility with prior versions.

As a quick example of their use:

test_11_2> --demo_dai.sql
test_11_2> -- quick demo of dbms_application_info
test_11_2> COL sid form 9999
test_11_2> col username form a12
test_11_2> col module form a18
test_11_2> col action form a22
test_11_2> col client_info form a15
test_11_2> -- set module and action
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(MODULE_NAME=>'SALES_LOAD',ACTION_NAME=>'VALIDATE_ALL_PRICES')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC SALES_LOAD VALIDATE_ALL_PRICES
test_11_2> --
test_11_2> -- Update the action within a module
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_ACTION(ACTION_NAME=>'Update all Prices')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC SALES_LOAD Update all Prices

test_11_2> -- clear them
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(null,null)
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC

test_11_2> -- you can set just an action
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_action(ACTION_NAME=>'orphan process')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC orphan process

test_11_2> -- set them to something sensible
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_MODULE(MODULE_NAME=>'RETURNS_LOAD',ACTION_NAME=>'MATCH_TO_SALE')
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ ------------------ ---------------------- ---------------
650 ERIC RETURNS_LOAD MATCH_TO_SALE

test_11_2> -- client_info can be longer
test_11_2> EXEC DBMS_APPLICATION_INFO.SET_client_info(client_info=>'entered this module at '||sysdate)
test_11_2> select sid,username,module,action,client_info from v$session where sid=650;

SID USERNAME MODULE ACTION CLIENT_INFO
----- ------------ --------------- --------------- ----------------------------------------
650 ERIC RETURNS_LOAD MATCH_TO_SALE entered this module at 03-SEP-2012 13:07

The intention is that you set the MODULE as you enter a functional chunk of your application (be it a PL/SQL package or a chunk of JAVA, whatever {but be aware of the stateless nature of systems using middle tiers and pooled connections}) and then update the ACTION as you progress. You can set an action with no module, as demonstrated above, but I would recommend against it. On that topic I would make the following recommendations:

  • Set the MODULE at the start, eg as you enter a PL/SQL package’s main module
  • Set the MODULE before you do all the validation – you may wish to see if that validation is as quick as you think.
  • Use SET_ACTION to update the action prudently. ie when the activity of the code alters but not all the time. It will come clearer with use, but what you are after is to be able to judge the relative run durations of parts of the application. Too much detail can mask the overall picture. If you can alter code you can always add more instrumentation.
  • Clear the MODULE and ACTION explicitly as you exit a package. The activity is lightweight and you do not want any code that lacks this instrumentation to inherit values from calling code.
  • Do not set an ACTION without having set the MODULE earlier in the code.
  • Keep the text of these values short and specific.

You do not have to look at V$SESSION to see these values, which is good as you may lack the privs needed to do so. Since oracle 9 there have also been procedures to check what the values are set to. e.g:

test_11_2> BEGIN
2 DECLARE
3 V_MODULE VARCHAR2 (64);
4 V_ACTION VARCHAR2 (64);
5 begin
6 dbms_application_info.READ_MODULE(V_MODULE,V_ACTION);
7 DBMS_OUTPUT.PUT_LINE('Module IS '||V_MODULE||' Action IS '||V_ACTION);
8 END;
9 END;
10 /

Module IS RETURNS_LOAD Action IS MATCH_TO_SALE
test_11_2>

So as you can see, these values can be set, they can be accessed via the data dictionary or the package itself and, if you believe my wild claims, they are very lightweight to use. However, these values really come into themselves with eg OEM’s performance screens. I’ll leave examples of that to a second post. Here I just wanted to cover how easy it is to set and get the information.

That does lead me onto what I think is the main reason that DBMS_APPLICATION_INFO is not as widely used as it could {and should} be. Developers often lack access to eg OEM and so do not see the full benefit of using it. Also, developers tend to be {note the get-out-clause) more concerned with making the code work and less with monitoring it after it is released. DBA-types tend to be more concerned with monitoring it. Thus the DBA types would want it in the code and developers would not want to spend the time coding it in. We are all one big team though, so some negotiation and beer-buying (or management enforcement) might well resolve that one.

Friday Philosophy – Work Inside Life August 24, 2012

Posted by mwidlake in Friday Philosophy, off-topic, Private Life.
Tags:
add a comment

I know, the usual phrase is “Life Outside Work” but I like to think that, no matter how much we may like our jobs, our overall life is the key thing.

I was prompted to write today’s Friday Philosophy when I was reading Neil Chandler’s blog a few days ago and saw that he had posted about his up-coming {and now in progress} Banger Rally. He and some mates are doing something very cool in their spare time, taking part in a rally from, basically, Calais in France to Naples in Italy, in a “banger”. This is a UK (and wider?) term for an old, tired, worn out car. The most they could spend on the car was £250 and it has to go over some pretty extreme mountain passes. I presume it also has to carry the set of 4 people involved too, so not a lot of weight there then, Neil :-). You can follow their progress at this blog. I was in a pub in Leeds with Neil when he was considering one car for this rally.

I can’t claim that I do anything as striking as Neil’s current jaunt when not working, but I do like to spend some time doing things that are nothing to do with IT at all. For me, this tends to be physical tasks like cutting down trees with my chainsaw. Or building structures in the garden with wood (OK, half-building them – I constructed the below platform and another out into our pond maybe 18 months ago, both still have no balustrade around them and there is a 45cm gap between the bank and the pond platform – NEXT weekend I’ll maybe finish one. Maybe.)


Another garden task recently was building our wood-fired, mud-constructed pizza oven. My wife and I went on a 1-day course to learn how to make a mud-based pizza oven and then spent, ohhhh, about 10 days over 8 weeks building one! We dug a big hole in the garden to get some clean clay (we live in an area that is on top of clay), stole some straw from the horse that lives at the bottom of our garden (long story), bought some sand (because nicking it from beaches is both illegal and bad form) and mixed up our first batch of clay-sand-straw. Add in some old bricks and we got started. It took about 30 lots of mixture, a few alterations to the ratios as we went, but we ended up with this monster.

If you are wondering what the white stuff on the oven is, some ants decided to build their nest in the oven, between the layers I guess (you build the inner oven first and then add several extra mud layers to give insulation and a larger thermal mass, so that the oven stays warm longer).

Like anything, if we did it a second time we would probably make a much better job of it as we learnt so much from the first attempt. One of them was to invite friends with young children over to see us during the build. Kids love the idea of treading the mud mix and you can get a suprising amount of work out of a single ten-year-old (and very insistent I get the age right) German girl if they are allowed to get very muddy.

I found the whole process very relaxing (but very tiring) as you do not need to use too much brain power, slapping the mud mixture on the outside is deeply cathartic and you end up with something useful. Well, slightly useful.


Having built it we then had to learn how to fire it. As an ex-boy-scout I thought this would be easy. Getting the fire going is not too hard, you need to keep it burning well for about an hour to heat up the oven and, one big tip, don’t use any damp wood or wood that “spits” (like willow). Having a burning log explode and shower your almost-ready pizza with charcoal embers can lead to angry garden scenes and considerable bad language.

We can just about get a 9″ pizza in there. The pizza in this shot is actually about 6″. The oven was designed to take a 12″ pizza but, errm, I forgot to allow enough space to one side to push the fire into as well.

Another major thing to keep in mind? That arch! You can’t put anything in the oven that is larger than that hole. More importantly, you can’t get anything out that won’t easily come back through that hole. Sticking your arm inside to jiggle things about is also tricky as it gets up to around 300C! {for US readers – about a millon F}

As you can see, the results are anything but professional! And, compared to nipping down to the shops to buy a pizza and just slamming it into the kitchen oven, it takes a lot, lot longer. But you can drink beer or enjoy a bottle of win and sit in the garden whilst getting the mud oven fired and hot. Last weekend we had friends over and the ladies made the pizzas, I cooked them and the other gentlemen helped us work through the beer collection and ate. As for taste? Fantastic.

My wife is far more artistic than I and she makes decorative cakes, sews and now even makes hats when not auditing IT systems (Have you ever been involved in an IT audit and the person knows how to make lists and check them – but can’t plug in a PC? Well, Sue was once a DBA and also a Unix Sys Admin, so when she audits you, she finds what you are hiding!). As such, maybe as an industry we should encourage her hat-making

So what, if anything is the point of this Friday Philosophy, other than to show off my pizza oven and my wife’s hats?

Well, I see this outside work activity as important in the workplace as well, especially if you are having to manage people. Firstly, it is important, I feel, for us all to do something we really enjoy to relax and re-energize (and this can be “computers” for IT people, but as I get older I notice more of my friends who were utterly technology-focused in their 20’s are now moving more towards non-IT hobbies).

Secondly, what people do in their spare time can tell you a lot about the person and what keeps them happy. If they have hobbies that are artistic and their IT job is very regimented, it could indicate they would be happier with some more less rigid task in the mix, say some design work. If they have NO outside hobbies, you are probably working the poor dears a little too hard!

Thirdly, and I have seen this for real, you may have a person or a team that is not performing well in the office – but it turns out a lot of them do interesting and challenging things at home. Someone who can motivate themselves to be a part-time fireman or has the dedication to train each day for a sport has skills and energy. Why, as their manager, am I not seeing much of these traits in the office? Because they are not happy and/or motivated. So I have an issue to sort out. I don’t manage people at present, but I still find it interesting what people do and achieve outside the office as it can indicate that they have talents and skills not being seen in the workplace.

I wonder how I can introduce my chain-saw skills into performance tuning?

Exclusion of Unioned SQL in Views – Followup August 20, 2012

Posted by mwidlake in database design, performance, SQL.
Tags: , ,
add a comment

Last week I put up a post about how Oracle can filter out sections of a union view..

Within the comments I put up another example where the CBO did not filter out all but one of the Union views despite my replicating the exact WHERE clause of one of the unioned statements. Jonathan Lewis posted a followup to say “Oracle can be very touchy about how it plays this game” and made a prediction of how the CBO would handle a slightly different scenario.

This was the extra scenario and I include brief details on creating the unioned view too. NB all on Oracle 11.2.0.2. {non-Exadata :-) }

  1  create table mdw_source
  2  tablespace users
  3  as select * from dba_objects

  1  create or replace view mdw_v1 as
  2  select
  3    '1'              src_type
  4   ,ms.OWNER ,ms.OBJECT_NAME,ms.SUBOBJECT_NAME ,ms.CREATED
  5  from mdw_source ms
  6  where ms.object_name like 'A%'
  7  union all
  8  select
  9    '2'              src_type
 10   ,ms.OWNER ,ms.OBJECT_NAME,ms.SUBOBJECT_NAME ,ms.CREATED
 11  from mdw_source ms
 12  where ms.object_name like 'B%'
 13  UNION ALL
 14  select
 15    '3'              src_type
 16   ,ms.OWNER ,ms.OBJECT_NAME,ms.SUBOBJECT_NAME ,ms.CREATED
 17  from mdw_source ms
 18  where ms.object_name like 'C%'

1 select count(created)
2 from mdw_v1 mv
3 where mv.object_name like ‘C%’

Execution Plan
———————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
———————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 75 | 551 (1)| 00:00:07 |
| 1 | SORT AGGREGATE | | 1 | 75 | | |
| 2 | VIEW | MDW_V1 | 824 | 61800 | 551 (1)| 00:00:07 |
| 3 | UNION-ALL | | | | | |
|* 4 | TABLE ACCESS FULL| MDW_SOURCE | 1 | 34 | 184 (1)| 00:00:03 |
|* 5 | TABLE ACCESS FULL| MDW_SOURCE | 2 | 68 | 184 (1)| 00:00:03 |
|* 6 | TABLE ACCESS FULL| MDW_SOURCE | 821 | 27914 | 184 (1)| 00:00:03 |
———————————————————————————–

Predicate Information (identified by operation id):
—————————————————
4 – filter(“MS”.”OBJECT_NAME” LIKE ‘A%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
5 – filter(“MS”.”OBJECT_NAME” LIKE ‘B%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
6 – filter(“MS”.”OBJECT_NAME” LIKE ‘C%’)

Statistics
———————————————————-
8 recursive calls
0 db block gets
2417 consistent gets 

So, as you can see, despite me providing the WHERE clause in my final select as “where mv.object_name like ‘C%’” and the WHERE clauses of the unioned view are all similar and one is “ms.object_name like ‘C%’” the CBO fails to realise it can exclude all but one of the unioned SQL statements – despite the mutally exclusive filter predicates:

4 – filter(“MS”.”OBJECT_NAME” LIKE ‘A%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)
5 – filter(“MS”.”OBJECT_NAME” LIKE ‘B%’ AND “MS”.”OBJECT_NAME” LIKE ‘C%’)

I have to confess, in light of recent experience I would have wondered if these filter predicates would prevent the other two SQL statements being fired at execution time but the number of consistent gets confirms that it is not – 2417 compared to 805 or so for when the query exclusion occurs. You can check back to the prior post to verify that.

So, let’s do the tests Jonathan suggested. He suggested altering the view to use equality predicates so I have recreated the view as such:

TEST_DB> create or replace view mdw_v1 as
  2  select
  3    '1'              src_type
  4    ,ms.OWNER ,ms.OBJECT_NAME ,ms.SUBOBJECT_NAME ,ms.CREATED
  5  from mdw_source ms
  6  where substr(ms.object_name,1,1) ='A'
  7  union all
  8  select
  9    '2'              src_type
 10   ,ms.OWNER ,ms.OBJECT_NAME ,ms.SUBOBJECT_NAME ,ms.CREATED
 11  from mdw_source ms
 12  where substr(ms.object_name,1,1) ='B'
 13  UNION ALL
 14  select
 15    '3'              src_type
 16   ,ms.OWNER ,ms.OBJECT_NAME ,ms.SUBOBJECT_NAME ,ms.CREATED
 17  from mdw_source ms
 18  where substr(ms.object_name,1,1) ='C'

View created.

TEST_DB> set autotrace on
TEST_DB> --
TEST_DB> -- below will do a select of a non-indexed column across the view
TEST_DB> select count(created) from mdw_v1

COUNT(CREATED)
--------------
          1288

Execution Plan
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |     9 |   552   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |     9 |            |          |
|   2 |   VIEW               | MDW_V1     |  1677 | 15093 |   552   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   5 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   6 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets


TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via the src_type
TEST_DB> select count(created)
  2  from mdw_v1
  3  where src_type='3'

COUNT(CREATED)
--------------
           129

Execution Plan
------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    12 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |            |     1 |    12 |            |          |
|   2 |   VIEW                | MDW_V1     |   561 |  6732 |   184   (1)| 00:00:03 |
|   3 |    UNION-ALL          |            |       |       |            |          |
|*  4 |     FILTER            |            |       |       |            |          |
|*  5 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  6 |     FILTER            |            |       |       |            |          |
|*  7 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  8 |     TABLE ACCESS FULL | MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   6 - filter(NULL IS NOT NULL)
   7 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   8 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        807  consistent gets


TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via matching the where clause
TEST_DB> select count(created)
  2  from mdw_v1
  3  where substr(object_name,1,1) ='C'

COUNT(CREATED)
--------------
           129

Execution Plan
------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    75 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |            |     1 |    75 |            |          |
|   2 |   VIEW                | MDW_V1     |   561 | 42075 |   184   (1)| 00:00:03 |
|   3 |    UNION-ALL          |            |       |       |            |          |
|*  4 |     FILTER            |            |       |       |            |          |
|*  5 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  6 |     FILTER            |            |       |       |            |          |
|*  7 |      TABLE ACCESS FULL| MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
|*  8 |     TABLE ACCESS FULL | MDW_SOURCE |   559 | 19006 |   184   (1)| 00:00:03 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   6 - filter(NULL IS NOT NULL)
   7 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   8 - filter(SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        807  consistent gets


TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via NOT matching the where clause
TEST_DB> -- but maybe logical exclusion can be managed
TEST_DB> select count(created)
  2  from mdw_v1
  3  where object_name ='C'

COUNT(CREATED)
--------------
             0

Execution Plan
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |    75 |   551   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |    75 |            |          |
|   2 |   VIEW               | MDW_V1     |     3 |   225 |   551   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |     1 |    34 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |     1 |    34 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |     1 |    34 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME"='C' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   5 - filter("MS"."OBJECT_NAME"='C' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   6 - filter("MS"."OBJECT_NAME"='C' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets

TEST_DB> -- below will do a select of a non-indexed column
TEST_DB> -- specific to one of the Union's in the view via NOT matching the where clause
TEST_DB> -- but logically it is the same.
TEST_DB> select count(created)
  2  from mdw_v1
  3  where object_name like 'C%'

COUNT(CREATED)
--------------
           129

Execution Plan
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |    75 |   551   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |    75 |            |          |
|   2 |   VIEW               | MDW_V1     |    24 |  1800 |   551   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |     8 |   272 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |     8 |   272 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |     8 |   272 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME" LIKE 'C%' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='A')
   5 - filter("MS"."OBJECT_NAME" LIKE 'C%' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='B')
   6 - filter("MS"."OBJECT_NAME" LIKE 'C%' AND
              SUBSTR("MS"."OBJECT_NAME",1,1)='C')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets

Above, I create the view, do a quick select count(*) to cause a full scan of all three section in the union view, then do selects using various WHERE predicates to see what sections of the unioned SQl statements are excluded.

The results are very intersting. The filtering on sub_type still works fine, but now the WHERE predicate “where substr(object_name,1,1) =’C’” allows the CBO to exclude sections of the union, the “Null is not null” filters appear and the consistent gets is 807. The WHERE predicate “where object_name =’C'” which at first glance you might think to be logically the same but is not (we selected object_names where only the first character is checked but the whole object_name is selected) acts does not do any exclusion and I thought it might. There are no OBJECT_NAMES of ‘C’ but logically as a human you can see that such an object name could not be found by the first two unioned SQL statements in any case.

The last test is most interesting. The WHERE clause of the SQL select over the view is “where object_name like ‘C%’” and is logically the same as the one WHERE clause in the unioned view “where substr(object_name,1,1) =’C’“. So no exclusion occurs and again we see 2417 consistent gets.

It would seem the CBO Pixies either have not coded for that or, as Jonathan says Oracle can be “touchy” about this feature. Bottom line, if you are expecting this sort of unioned view SQL exclusion to occur – TEST!

Friday Philosophy – New Game: Phone Zombies! (You Too can Play) August 17, 2012

Posted by mwidlake in Friday Philosophy, humour, off-topic, Perceptions.
Tags: , ,
8 comments

I’m spending a lot more time in Central London at the moment due to current work commitments. A few weeks ago I was having a quiet stroll through the streets and had what I can only describe as an odd moment:

I looked around and found I was being converged upon by 5 or 6 people walking slowly and aimlessly towards me – all from different angles, all only vaguely aware of their surroundings, all looking like they were making straight for me. I instantly thought of one of the scenes from “Shaun of the Dead” {A cracking film, go hire it tonight}.

They were all on their smart phones of course, intent on the little glowing screens of whatever it was they could not drag their eyes from – despite them also trying to navigate a busy city landscape. As some of you know, I have a bit of a “hate-mild dislike-grudging acceptance” relationships with Smart Phones.

I was so struck by this scene that I nearly did not move in time, but finally I did step to one side as I watched them do this quite wonderful, little, shuffling-dance around each other. I think only one of them actually looked up properly, the others all did that micro-glance; frown; direction shift; re-engage-with-screen procedure that is becoming so common. As a species we must be somehow pre-designed to cope with this as none of them actually bumped into each other – but it took several micro-glance manoeuvres for some of them to make it through.

This has resulted in a new game I can’t stop playing as I make my way through London:

Phone Zombies – How many people can I see at any time who are effectively lobotomised by their personal electronic device?

I have a few rules:

  • I has to be an electronic device – phones, smart phones, crackberries, tablets, electronic books etc.
  • Real books and papers do not count.
  • They must be upright (so no sitting).
  • If they are moving they count.
  • If they are stopped in the middle of the path they count.
  • If they have put themselves in a doorway or some other sensible place they do not count.
  • Unless, even though they have done that, they are still e.g. blocking ingress and egress from the doorway.
  • A bonus point if they micro-glance manoeuvre.
  • 5 bonus points if contact is made with another person in the time I am watching.
  • 10 bonus points for contact with something inanimate {only once to date}.
  • 20 points if they go “uuurrrgggghhh” and have blood on them. {no one has got 20 bonus points yet, but I live in hope}

I think my best so far is about 14, but that is because two phone-zombies both walked into each other. Classic.

Exclusion of Unioned SQL in Views? August 16, 2012

Posted by mwidlake in performance, SQL.
Tags: , , ,
13 comments

Question – you have a view definition of the following format:

select 1 as d_type, col_a,col_b,col_c
from TAB_X, TAB_Y, TAB_Z
where {your predicates}
UNION
select 2 as d_type, col_a,col_b,col_c
from TAB_P, TAB_Q, TAB_R
where {your predicates}
UNION
select 3 as d_type, col_a,col_b,col_c
from TAB_X X, TAB_Y, TAB_Z
where {your predicates}

You now select from the view and your code includes a filter predicate such as:

“WHERE D_TYPE = 1″

What will the CBO do? Is it intelligent enough to do what a human would do and exclude the parts of the query that could not possibly return data you want? Ie exclude the SQL statements where d_type is set to 2 and 3. Or is that funcitonality not built into the CBO? After all, it is only software and someone would have to decide that there could be benefit of looking into set of unioned SQL selects to see if any could be excluded and then code that into the CBO.

After all, the CBO is not smart enough to realise it could easily satisfy “select min(col_a), max(col_a) from tab x” via two fast lookups on a supporting index. It scans the index.

So – what is your guess?

I asked 3 people. Not a massive straw poll but I got one vote for it will not exclude, one for it will and one smart answer of “I would expect it to exclude- but as you asked the question I am thinking it does not”.

Let’s do the test.

DB_TEST> @demo_union_view_2
DB_TEST> drop table mdw_source purge;
Table dropped.

  1  create table mdw_source
  2  tablespace users
  3  as select * from dba_objects
  4  /
Table created.

  1  create index mdso_idx on mdw_source(owner,object_name)
  2  tablespace users
  3  /
Index created.

DB_TEST> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'MDW_SOURCE',method_opt=>'for all columns size 1')
PL/SQL procedure successfully completed.

  1  select table_name,num_rows from dba_tab_statistics
  2  where owner=USER and table_name='MDW_SOURCE';

TABLE_NAME      NUM_ROWS
--------------- ----------
MDW_SOURCE      55895

  1  create or replace view mdw_v1 as
  2  select
  3    '1'              src_type
  4   ,ms.OWNER
  5   ,ms.OBJECT_NAME
  6   ,ms.SUBOBJECT_NAME
  7   ,ms.CREATED
  8  from mdw_source ms
  9  where ms.object_name like 'A%'
 10  union all
 11  select
 12    '2'              src_type
 13   ,ms.OWNER
 14   ,ms.OBJECT_NAME
 15   ,ms.SUBOBJECT_NAME
 16   ,ms.CREATED
 17  from mdw_source ms
 18  where ms.object_name like 'B%'
 19  UNION ALL
 20  select
 21    '3'              src_type
 22   ,ms.OWNER
 23   ,ms.OBJECT_NAME
 24   ,ms.SUBOBJECT_NAME
 25   ,ms.CREATED
 26  from mdw_source ms
 27  where ms.object_name like 'C%'
 28  /
View created.

So, a simple table is created from DBA_OBJECTS and a view over the top of the table consisting on 3 simple selects unioned together, with a static value (1, 2 or 3) generated for each of the 3 parts as SRC_TYPE

Now let’s run some tests:

DB_TEST> set autotrace on
DB_TEST> --
DB_TEST> -- below will do a select of a non-indexed column across the view
DB_TEST> select count(created) from mdw_v1
  2  /

COUNT(CREATED)
--------------
          1284

Execution Plan
----------------------------------------------------------
Plan hash value: 4259489107
-----------------------------------------------------------------------------------
| Id  | Operation            | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |            |     1 |     9 |   551   (1)| 00:00:07 |
|   1 |  SORT AGGREGATE      |            |     1 |     9 |            |          |
|   2 |   VIEW               | MDW_V1     |  2463 | 22167 |   551   (1)| 00:00:07 |
|   3 |    UNION-ALL         |            |       |       |            |          |
|*  4 |     TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  5 |     TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  6 |     TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   5 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   6 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
       2417  consistent gets


DB_TEST> -- below will do a select of a non-indexed column across the view
DB_TEST> -- limiting it to "one" of the types
DB_TEST> select count(created)
  2  from mdw_v1
  3  where src_type='3'
  4  /

COUNT(CREATED)
--------------
           129


Execution Plan
----------------------------------------------------------
Plan hash value: 1188403912
------------------------------------------------------------------------------------
| Id  | Operation             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |            |     1 |    12 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE       |            |     1 |    12 |            |          |
|   2 |   VIEW                | MDW_V1     |   823 |  9876 |   184   (1)| 00:00:03 |
|   3 |    UNION-ALL          |            |       |       |            |          |
|*  4 |     FILTER            |            |       |       |            |          |
|*  5 |      TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  6 |     FILTER            |            |       |       |            |          |
|*  7 |      TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
|*  8 |     TABLE ACCESS FULL | MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   6 - filter(NULL IS NOT NULL)
   7 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   8 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        807  consistent gets


DB_TEST> -- below will do a select of an indexed column across the view
DB_TEST> select count(owner) from mdw_v1
  2  /

COUNT(OWNER)
------------
        1284


Execution Plan
----------------------------------------------------------
Plan hash value: 4219520050
------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |     1 |    17 |   221   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE         |          |     1 |    17 |            |          |
|   2 |   VIEW                  | MDW_V1   |  2463 | 41871 |   221   (1)| 00:00:03 |
|   3 |    UNION-ALL            |          |       |       |            |          |
|*  4 |     INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  5 |     INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  6 |     INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   5 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   6 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        989  consistent gets


DB_TEST> --
DB_TEST> -- below will do a select of an indexed column across the view
DB_TEST> -- limiting it to "one" of the types
DB_TEST> select count(owner)
  2  from mdw_v1
  3  where src_type='3'
  4  /

COUNT(OWNER)
------------
         129


Execution Plan
----------------------------------------------------------
Plan hash value: 815942527
-------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          |     1 |    20 |    74   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE          |          |     1 |    20 |            |          |
|   2 |   VIEW                   | MDW_V1   |   823 | 16460 |    74   (2)| 00:00:01 |
|   3 |    UNION-ALL             |          |       |       |            |          |
|*  4 |     FILTER               |          |       |       |            |          |
|*  5 |      INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  6 |     FILTER               |          |       |       |            |          |
|*  7 |      INDEX FAST FULL SCAN| MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
|*  8 |     INDEX FAST FULL SCAN | MDSO_IDX |   821 | 26272 |    74   (2)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(NULL IS NOT NULL)
   5 - filter("MS"."OBJECT_NAME" LIKE 'A%')
   6 - filter(NULL IS NOT NULL)
   7 - filter("MS"."OBJECT_NAME" LIKE 'B%')
   8 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          8  recursive calls
          0  db block gets
        331  consistent gets


DB_TEST> -- below will select against the undelying object we are interested in
DB_TEST> select count(created)
  2  from mdw_source ms
  3  where ms.object_name like 'C%'
  4  /

COUNT(CREATED)
--------------
           129


Execution Plan
----------------------------------------------------------
Plan hash value: 152094671

---------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |     1 |    34 |   184   (1)| 00:00:03 |
|   1 |  SORT AGGREGATE    |            |     1 |    34 |            |          |
|*  2 |   TABLE ACCESS FULL| MDW_SOURCE |   821 | 27914 |   184   (1)| 00:00:03 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("MS"."OBJECT_NAME" LIKE 'C%')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        805  consistent gets

Look at the plans at the start.

When my query does not include SRC_TYPE,ie the first SQL statement, the plan is simple – it scans the table three times, estimates 821 rows per scan and a cost of 184 for each step. When I include the SRC_TYPE = ‘3’ in the next step we see that the plan still includes all three full table scans, each still estimated at a cost of 184.
When you provide a predicate that identifies the one sql statement that you want to run, it still runs all three. I tried a few further tests and could not force the exclusion of the “unwanted” queries from the plan.

I showed my friend above (the one who suggested it would not work as I had asked the question) the results and we agreed that it was a little unfortunate that the Oracle Optimizer Pixies had not coded in the ability to exclude unwanted Unioned statements, but we could understand it.

However, for those of you thinking “you are missing something” – you are right. {And, to be fair to my friend, my example was not quite so simplified as what I have posted here}.

I decided to quantify the level of the impact and so shifted my concentration to the buffer gets. Look at the first and second statements again. The cost of the total statement drops, from 551 to 184. The actual consistent gets resulting from executing the SQL is dropping from 2417 to 807. They are both dropping dramatically to approximately 1/3.

Now check out the filter predicates at step 4 and 6 – “NULL IS NOT NULL”. That will always be false. Everything “after” that statement in the plan will not be executed. The step in the plan is there and, as a step, has the same cost – but it is is not being executed.

The pattern is repeated when the simple SQL statement is on an indexed column and the index is being used (the next two statements) {and honesty forces me to admit I have included more output than I need for this example, as I wanted to confuse the issue slightly and maybe encourage you to not read the plans too carefully}.

So yes, the CBO IS excluding the individual unioned SQL statements based on the stated SRC_TYPE and the fixed value stated in each part of the union.

I’m quite impressed by that. Clever Oracle Optimizer Pixies.

To tie this back to reality, I initially looked at this as I am trying to tune some code that hides all the complexity of the application under one massive unioned uber-view over a set of 9 other complex, unioned views. The explain plan runs to close to 700 lines. I had utterly overlooked the filters and filter predicates in that – heck I missed it initially in my simple test :-) {which was more complex than in this post, in my defense}. Trying to work with a very slow statement based on nested, complex views that is used by pretty much the whole application is “fun”.

Also back in reality, the filter clause I see in my real code is along the lines of:

:B17 is null or is not null

This is as bind variable is being considerd – and that is actually one of the considerations back in this real world of the original issue. The level that the master query needs to run in is decided by this bind variable – otherwise the application developer would (should have?) just pointed at the sub-view they needed for that level. So the CBO does not know which chunk of the union’d code it can ignore until it sees that bind variable, so the plan shows what could be executed.

Of course, it could be argued that if you know the type at the time you query the database, rather than use the above union filtering trick to cut down your massively complex view to the part that is required, just query the sub view you need directly. A bit more code but a lot simpler system to look after.

It’s a nice SQL trick but I would keep it to simple examples!

Finaly, just for fun, here is the plan I’ve been trying to tune. As I said, close to 700 steps and only the highlighted grey section on the left is shown in more detail on the right.

Friday Philosophy – I Am An Exadata Expert August 10, 2012

Posted by mwidlake in Exadata, Friday Philosophy, Perceptions.
Tags: , , ,
5 comments

(Can I feel the angry fuming and dagger looks coming from certain quarters now?)

I am an Exadata Expert.

I must be! – I have logged onto an Exadata quarter rack and selected sysdate from Dual.

The pity is that, from some of the email threads and conversations I have had with people over the last 12 months, this is more real-world experience than some people I have heard of who are offering consultancy services. It’s also more experience than some people I have actually met, who have extolled their knowledge of Exadata – which is based solely on the presentations by Oracle sales people looking at the data sheets from 10,000 feet up and claiming it will solve world hunger.

Heck, hang the modesty – I am actually an Exadata Guru!

This must be true as I have presented on Exadata and it was a damned fine, technical presentation based on real-world experience and I have even debated, in public, the pros and cons of point releases of exadata. Touching base with reality once more, I did an intro talk “the first 5 things you need to know about Exadata” and the “debate” was asking Julian Dyke if he had considered the impact of serial direct IO on a performane issue he had seen and he had not only done so but looked into the issue far more than I – so he was able to correct me.

But joking aside –  I really am a true consulting demi-god when it comes to Exadata

I have years of experience across a wide range of Exadata platforms. That would be 0.5 years and I’ve worked intensively on just one system and am in a team now with some people who are proper experts. So a range of two. Yes, tongue is still firmly in cheek.

This situation always happens with the latest-greatest from Oracle (and obviously all other popular computing technologies). People feel the need to claim knowledge they do not have. Sometimes it is to try and get consultancy sales or employment, sometimes it is because they don’t want to be seen to be behind the times and sometimes it is because they are just deluded. The deluded have seen some presentations, a few blog posts and maybe even got the book and read the first few chapters and are honeslty convinced in their own minds that they now know enough to make effective use of the technology, teach {or, more usually, preach} others and so proclaim on it. {See Dunning Kruger effect, the certainty of idiots}. I’m certainly not arguing against going to presentations, reading blogs and books and learning, just don’t make the mistake of thinking theoretical, second-hand knowledge equates to expert.

With Exadata this situation is made worse as the kit is expensive and much of what makes it unusual cannot be replicated on a laptop, so you cannot as an individual set up a test system and play with it. Real world experince is required. This is growing but is still limited. So the bullshit to real skills quotient remains very, very high.

If you are looking for help or expertise with Exadata, how do you spot the people with real knowledge from the vocal but uninformed? Who do you turn to? {NB don’t call me – I’m busy for 6 months and I really am not an expert – as yet}. If your knowledge to date is based on sales presentations and tidbits from the net which may or may not be based on a depth of experience, it is going to be hard to spot. When I was still without real world experience I had an unfair advantage in that I saw email threads between my fellow OakTable members and of course some of those guys and gals really are experts. But I think I was still hoodwinked by the odd individual on the web or presenting and, I can tell you, though this background knowledge really helped – when I DID work on my first exadata system, I soon realised I did not understand a lot about the subtulties and not-so-subtulties of using a system where massively improved IO was available under key conditions. I had to put a lot of time and effort and testing to move from informed idoit to informed, partially experienced semi-idiot.

I know this issue of the non-expert proclaiming their skills really frustrates some people who do know their stuff for real and it is of course very annoying if you take someone’s advice (or even hire them) only to find their advice to be poor. Let’s face it, is is simple lying at best and potentially criminal mis-selling.

I guess the only way is for peopel needing help to seek the help of someone who has already proven themselves to be honest about their skills or can demonstrate a real-world level experience and success. I would suggest the real experts should do that most difficult task of pointing out the mistakes of the false prophets, but it is very tricky to do without looking like a smartarse or coming over as a big head or jealous.

I’ll finish on one thing. Last year I said how I thought maybe I should do more blog posts about things I did not know much about, and be honest about it and explore the process of learning. I did actualy draft out about 3 posts on such a topic but never pushed them out as I was way too busy to complete them… That and, being candid, I really did not want to look like an idiot. After all, this Oracle lark is what puts beer in my hand, hat fabric on my wife’s millinary worktop and food in my cat’s bowl. The topic was….? Correct, Exadata. Maybe I should dust them off and put them out for you all to laugh at.

Follow

Get every new post delivered to your Inbox.

Join 161 other followers