jump to navigation

Dynamic Sampling Hint Ignored May 20, 2010

Posted by mwidlake in internals, performance.
Tags: ,
trackback

This is a bit of an odd one. On V10.2.0.3 (I have not tried other versions yet) then the dynamic sampling hint appears to be ignored if the stats for the table say there are zero rows in the table. Even if there are in fact lots of rows.

I’ve not had chance to dig into Metalink etc yet {I’m putting together this post before I yet again get overwhelmed and fail to put up something I think is of interest} but if this turns out to be a general bug impacting other versions, it could catch people out. This is because one tactic to handle queries where the data volumes shifts all the time is to not gather stats and use the dynamic sampling hint. “Not gathering stats” could also be “gather stats when the table is empty and ignore it – the dynamic sampling hint will take care of everything”.

Here is my worked example. I use two tables, PERSON and PERSON_NAME, related on a column PERS_ID. {If you want the exact table definitions etc, mail me}.

--preserve my data
create table pena_backup as select * from person_name;

Table created.

--ensure I have stats
exec dbms_stats.gather_table_stats(ownname=>user,tabname =>'PERSON')
PL/SQL procedure successfully completed.

exec dbms_stats.gather_table_stats(ownname=>user,tabname =>'PERSON_NAME')
PL/SQL procedure successfully completed.

--Show I have data
select table_name,num_rows,last_analyzed from dba_tables
where owner=USER  AND TABLE_NAME IN ('PERSON','PERSON_NAME');

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -----------------
PERSON                              71913 20-MAY-2010 11:06
PERSON_NAME                        107567 20-MAY-2010 11:06

--Explain a simple select
explain plan for
select /* */ pers.surname, pers.first_forename
from    person pers, person_name pena
where pena.pers_id =pers.pers_id
and    pena.surname = 'BROWN'
/ 
Explained.

select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 2653984949

----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |  1427 | 45664 |   637   (5)| 00:00:03 |
|*  1 |  HASH JOIN         |             |  1427 | 45664 |   637   (5)| 00:00:03 |
|*  2 |   TABLE ACCESS FULL| PERSON_NAME |  1427 | 17124 |   320   (5)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| PERSON      | 71913 |  1404K|   314   (4)| 00:00:02 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PENA"."PERS_ID"="PERS"."PERS_ID")
   2 - filter("PENA"."SURNAME"='BROWN')

16 rows selected.

--Now explain with dynamic sampling
explain plan for
select /*+ dynamic_sampling (pena 6) */ pers.surname, pers.first_forename
from    person pers, person_name pena
where pena.pers_id =pers.pers_id
and    pena.surname = 'BROWN'
/
Explained.

select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 2653984949

----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |  1427 | 45664 |   637   (5)| 00:00:03 |
|*  1 |  HASH JOIN         |             |  1427 | 45664 |   637   (5)| 00:00:03 |
|*  2 |   TABLE ACCESS FULL| PERSON_NAME |  1427 | 17124 |   320   (5)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| PERSON      | 71913 |  1404K|   314   (4)| 00:00:02 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PENA"."PERS_ID"="PERS"."PERS_ID")
   2 - filter("PENA"."SURNAME"='BROWN')

Note
-----
   - dynamic sampling used for this statement

20 rows selected.

-- Note the appearance of the NOTE about dynamic sampling
-- If I did this again, I would add extra data to the table to get different stats
-- from the dynamic sampling


--truncate my person_name table and re-gather stats

truncate table person_name;
Table truncated.

exec dbms_stats.gather_table_stats(ownname=>user,tabname =>'PERSON_NAME')
PL/SQL procedure successfully completed.

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -----------------
PERSON                              71913 20-MAY-2010 11:06
PERSON_NAME                             0 20-MAY-2010 11:06


--now try and dynamic sample a known empty table
explain plan for
select /*+ dynamic_sampling (pena 6) */ pers.surname, pers.first_forename
from    person pers, person_name pena
where pena.pers_id =pers.pers_id
and    pena.surname = 'BROWN'; 
Explained.

select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 1295262714

---------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |     1 |    50 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID  | PERSON      |     1 |    20 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                |             |     1 |    50 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| PERSON_NAME |     1 |    30 |     1   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | PENA_SN     |     1 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | PERS_PEID   |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("PENA"."SURNAME"='BROWN')
   5 - access("PENA"."PERS_ID"="PERS"."PERS_ID")

18 rows selected.

-- I initially thought Oracle might be detecting that the 
-- table was empty - never had data inserted or had been truncated
-- and thus knew there was no point sampling...But...
 
-- put the data back
insert into person_name select * from pena_backup;

107567 rows created.

commit;
Commit complete.

--and explain again

Explained.
explain plan for
select /*+ dynamic_sampling (pena 6) */ pers.surname, pers.first_forename
from    person pers, person_name pena
where pena.pers_id =pers.pers_id
and    pena.surname = 'BROWN';

select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 1295262714

---------------------------------------------------------------------------------------------
| Id  | Operation                     | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |             |     1 |    50 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID  | PERSON      |     1 |    20 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                |             |     1 |    50 |     3   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| PERSON_NAME |     1 |    30 |     1   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | PENA_SN     |     1 |       |     1   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN           | PERS_PEID   |     1 |       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("PENA"."SURNAME"='BROWN')
   5 - access("PENA"."PERS_ID"="PERS"."PERS_ID")

18 rows selected.

-- Note, nothing about dynamic sampling, the rows and bytes expected are the same as
-- the query with no dynamic sampling hint

-- now gather stats again so oracle knows there is some data

exec dbms_stats.gather_table_stats(ownname=>user,tabname =>'PERSON_NAME') 
PL/SQL procedure successfully completed.

select table_name,num_rows,last_analyzed from dba_tables
where owner=USER  AND TABLE_NAME IN ('PERSON','PERSON_NAME');

TABLE_NAME                       NUM_ROWS LAST_ANALYZED
------------------------------ ---------- -----------------
PERSON                              71913 20-MAY-2010 11:06
PERSON_NAME                        107567 20-MAY-2010 11:09

--and now try dynamic sampling again

explain plan for
select /*+ dynamic_sampling (pena 6) */ pers.surname, pers.first_forename
from    person pers, person_name pena
where pena.pers_id =pers.pers_id
and    pena.surname = 'BROWN';
Explained.

select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
Plan hash value: 2653984949

----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             |  1427 | 45664 |   637   (5)| 00:00:03 |
|*  1 |  HASH JOIN         |             |  1427 | 45664 |   637   (5)| 00:00:03 |
|*  2 |   TABLE ACCESS FULL| PERSON_NAME |  1427 | 17124 |   320   (5)| 00:00:02 |
|   3 |   TABLE ACCESS FULL| PERSON      | 71913 |  1404K|   314   (4)| 00:00:02 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PENA"."PERS_ID"="PERS"."PERS_ID")
   2 - filter("PENA"."SURNAME"='BROWN')

Note
-----
   - dynamic sampling used for this statement

20 rows selected.

What prompted me to find this is I had to provide a DYNAMIC SAMPLING hint to a development team and I wanted to make sure it was working correctly first (it was a MERGE statement and I was not sure where to best put it). As you get no warnings about syntax errors with hints, it is best to prove they are correct before handing them over šŸ™‚
I was doing this in a dev system and, try as I might I could not get the hint to work. Because, as I now know, the dev system had no data in the driving table and stats had been gathere on it. I wasted 30 mins in a critical fix-on-fail due to this oddity.

I have not tried to see if DYNAMIC SAMPLING stated at the instance level is also ignored if there are zero-row stats against the table.

Comments»

1. dombrooks - May 20, 2010

Interesting find.
It looks to be “deliberate”, initially at least.

The key seems to be in the 10053 trace file

“** Not using dynamic sampling since cardinality statistic or hint equals 0.”
which you get in this case (or at least I did on different tables) or if you use a level of 0 in the hint.

mwidlake - May 20, 2010

Hi Dom,

Thanks for that. I intended (maybe still will) doing a 10053 trace when I have my test system in front of me. I see you tested on 9 so I guess that means this oddity is across versions and that I should dot my I’s and cross my T’s by doing the trace on 10.2. too

I did manage to do a bit of web searching though. Jonathan L has a nice articel on a related theme, see http://jonathanlewis.wordpress.com/2010/02/23/dynamic-sampling/. He starts with a summary of what Dynamic Sampling is and then has a Follow-up where he shows the CBO appearing to ignore the Dynamic Sampling hint but the 10053 trace shows that the CBO DOES the sampling, but then throws it away.

It would seem that with my special case, the CBO really does not do the instructed dynamic sampling. A rare but true example of a valid hint being ignored, maybe?

It does add to the argument that “Bad Statistics are worse than No Statistics” – and that stats saying the table is empty are Bad if the table is not.

Jonathan Lewis - May 21, 2010

Martin,

“It would seem that with my special case, the CBO really does not do the instructed dynamic sampling. A rare but true example of a valid hint being ignored, maybe?”

But the hint wasn’t ignored – it directed the optimizer to follow a certain strategy and print the explanation in the trace file !

Regards
Jonathan Lewis

mwidlake - May 21, 2010

Oh no! I know when you comment, Jonathan, I’ve usually got sumat wrong…

OK, the CBO parsed the hint, accepted the hint was telling it to dynamically sample. And then did not. It did not do the dynamic sample (going on what Dom says and I have no reason to doubt him).

The output says

ā€œ** Not using dynamic sampling since cardinality statistic or hint equals 0.ā€

It did not do what I asked even though it could.

Is this not like my wife telling me to go buy more milk to make coffee (the hint)
– I check the running shopping list she keeps and see she has written “run out of coffee” on the shopping list
– So do not go out and buy milk?

I heard her, I took an action but I did not do what was asked of me. I think my wife would accuse me of “ignoring” her. How was I to know she had since bought coffee and not updated the list? She told me to get milk (which she stupidly forgot) and I ignored her and now she won’t cook my tea?

{forgive the metaphor, having a very trying day}

I’d still say it ignored my hint. If the hint did not syntactically make sense, it would have not “heard” the hint. If the hint was impossible then it could not do the task so not doing it would not be ignoring it. The CBO could have dynamically sampled the table { and I really wanted it to}, it ignored me.

Or am I really missing the point, as is often the case?

Jonathan Lewis - May 23, 2010

Martin

It did not do what I asked even though it could.

This example, and the example of the use_concat() hint where something similar can happen, get much closer to quibbling over semantics. But you could argue that when Oracle runs a serial query after you’ve supplied a parallel(t 6) hint that it did not do what you asked even though it could – and again you would be able to see from the trace file that the optimizer had obeyed your hint, but not done what you wanted.

This is another example of either a bug, or a documentation error – and the worst problem with Oracle and hints is the terrible quality of the documentation..

Personally I would want your example registered as a bug since it is unreasonable behaviour, but the trace report suggests that it would be considered “expected behaviour” if you tried to report it to Oracle as a bug. Dom’s trace – and I failed to notice that my quote from his reply was going to appear in the trail before his reply – would also have said something about: “Performing dynamic sampling initial checks” – and the closing dump in the trace would have said that the hint was “resolved” and “not in error”.

mwidlake - May 23, 2010

If it is semantics, such things are usually best discussed in the pub šŸ™‚

I’d still say the CBO is ignoring the hint as there is nothing preventing the CBO from following the hint – except it has decided that existing stats of zero rows are to be believed and relied on. Stats of any other number {I’ve not tested them all, I hasten to add :-)} will result in the hint being followed. This would be particularly vexing if I was attempting to get the optimizer to consider the predicate guesses and column correlation that high values for dynamic sampling invoke.

You mention the Parallel hint and I am glad you did. This is, I think, a good of example of a hint that sometimes works as intended and is sometimes “ignored”, when nothing in the app or stats or even data volume has changed. But it is not ignored, it is that other parallel executions are already using up slaves so your statement cannot be allocated the parallel threads you asked for.

This could become a separate blog entry next week….

{I hope this follows your last comment correctly – I find it a little annoying that you can only have three levels of comment in wordpress – at least, with my chosen style anyway}.

Jonathan Lewis - May 25, 2010

Martin,

My comment about supplying a hint like parallel(t 6) and still seeing a serial plan wasn’t made with regard to there being no available slaves. It was about the fact that parallel(t 6) is not a directive to the optimizer to run parallel – even though that’s how is’t commonly seen. So if it doesn’t work for you when it “should” you might assume that the optimizer is ignoring the hint when it could have obeyed it, while I might point out that, based on my knowledge of how the hint is supposed to work, it had obeyed the hint and behaved correctly.

mwidlake - May 26, 2010

On the topic of the parallel hint, I take it you mean that the hint is saying “use parallelism and at this level {on this segment} if parallel execution is possible in the plan” as opposed to “choose a plan that allows parallelism and then use this value”? I think you are right that this is coming down to semantics, though also ignorance of what exactly the hint is saying is perhaps a big part? .

If I use a parallel hint, I know that the plan has to be suitable for parallel execution, and that a lack of parallel execution when using the hint probably is a result of this – so I have to look at the plan and see if I need to use other hints to get a plan that is suitable for parallel operations. If the plan is suitable for parallelism and I have parallel slaves available etc and the code is still executed serially… then that would be the CBO ignoring my hint (and no, I can’t remember ever seeing this for real).

I feel that the documentation does not help. The SQL manual for 10 does not state that the parallel hint will cause a plan to be chosen to support parallel operations but it also does not, in my opinion, make it at all clear that it does NOT do this – and many assume it would. After all, there is no other hint that means “if possible, choose an execution path that supports parallel”. Many think (and I did at first) that it is implicit that asking for a level of parallelism also means use parallelism.

In my original case, the use of Dynamic Sampling hint, is the use of dynamic sampling both possible and potentially favourable, and yet the hint is not acted upon? Is this not ignoring the hint? {I still think this is best discussed over a pint}.

Jonathan Lewis - May 26, 2010

Martin,

My next free evening looks like Wednesday 7th July if you want to round up the lads.
In the meantime here’s an example I wrote about the parallel hint and lack of (clarity in the) documentation: http://jonathanlewis.wordpress.com/2007/06/17/hints-again/

2. dombrooks - May 20, 2010

It does seem to contradict (or at least add a caveat to) the official documentation (there’s a surprise) about using the default number of dynamic sampling blocks for analysed tables.

P.S. Should have added that that line above from 10053 trace came from 9.2.0.8

3. dombrooks - May 20, 2010

I’m so sorry for spamming you like this.
It caught my interest. I should have waited until the end to comment….

On further investigation, I would now say that this doesn’t contradict the official documentation (depending on where you’re looking) because it does say that if a cardinality statistic exists, it will be used.

What it doesn’t say is that if that cardinality statistic is zero, it will stop dynamic estimation of selectivity.

What you can see in the 10053 traces (the ones I’m looking at) is the following:
1. if unanalysed then you get dynamic cardinality & dynamic selectivity estimates;
2. if analysed and cardinality is not zero, then you get original cardinality & dynamic selectivity estimate;
3. if analysed and cardinality is zero, then no dynamic anything.

However, in case 3, if you additionally use the dynamic_sampling_est_cdn hint, then you back cooking on dynamic sampling of cdn and sel.

dombrooks - May 20, 2010

And all this has just made the penny drop on something else I was working on – thanks!

mwidlake - May 21, 2010

Thanks again Dom, you have not only confirmed what I had seen (saving me the work) but given me a method around it – the dynamic_sampling_etc_cdn hint šŸ™‚

I’d say thay the dynamic sampling hint working if there are existing stats saying there are rows and being ignored if there are stats saying there are no rows is “very questionable” by Oracle. This is because it is pretty common, if you use partitions in your database, to create future partitions by splitting the terminal, empty partition {This avoids issues with making global indexes invalid and issues querying data on the paritions being split}.
– You split off your new partitions
– These empty partitions have stats gathered on them by either the automatic stats gathering job or your inhouse code
– The stats say the paritition has zero rows and any dynamic sampling hint in your code is ignored.
– data starts going into new partitions
– Oracle treats them as having no data, ignoring the dynamic sampling hint {which would be being invoked correctly for other partitions}
– You get plans based on zero {often rounded to 1} rows in the partition until it gets new stats and is thus seen to have some data.

Hmmm, I wonder if this could explain some problems I struggled to solve last year…

dombrooks - May 21, 2010

And of course, the stats change from zero, you don’t necessarily get the dynamic sampling you’re after immdeiately due to the rolling invalidation functionality in 10g onwards (as I was reminded of when doing the test Statistique suggested below).

mwidlake - May 21, 2010

That is if you rely on the auto stats job or do a schema/database gather and leave the no_invalidate as the default of dbms_stats.auto_invalidate. That option really bugs me:
a) I have to think if setting it to FALSE will cause immediate invalidation or not. no…invalidate…false… Yes, invalidate.
b) as you say, the random time interval of between 1 and 5 hours before the dependent cursors are invalidated. Random. Unpredicatable. Not really highlighted in the documentation (especially in V10.1 when all this auto stats stuff just quietly inserted itself onto your database…)

I tend to do specific table and index stats gathering and invalidate cursors as I go. What is a bit of parsing between friends.

4. Statistique - May 20, 2010

A nice test would be to create a table, load it which data, gather the stats and then set the numrows (and or numblks) stats to 0 for the table.

Does it still “ignore” the dynamic_sampling hint ???

I’m going to test it on my test instance when I get 2 minutes.

mwidlake - May 21, 2010

Hi Statistique,

It would be interesting to see if setting the individual elements of table stats to zero has the same impact as having them all zero via my method. Judging from the work Dom has done with trace, setting the num rows (cardinality) wouold do the trick.

Martin

dombrooks - May 21, 2010

> setting the num rows would do the trick
Yes, no surprises there. It does. I tried it at home last night (on 10.2.0.4). But I didn’t post back because I thought I’d exceeded my quota of comments. Oh…

mwidlake - May 21, 2010

We could rename the blog “The Martin and Dom show”? šŸ™‚ Keep ’em coming, so long as they are informative or funny, the more the better

5. Narendra - May 21, 2010

If I am not mistaken, there is difference between table “not having statistics” and table “having statistics that indicate zero rows”.
My understanding is dynamic sampling kicks in when there are no statistics on the table in question. In your test case, you truncated the table and “gathered” statistics again. The presence of statistics (showing zero rows in table) resulted in optimizer ignoring the dynamic sampling hint.
I believe if you “delete” table statistics (using dbms_stats.delete_table_stats), you will not only see blank (I guess NULL) value in NUM_ROWS column but that will also enable the optimizer to use dynamic sampling. I think it boils down to the difference between NULL and ZERO. Both are not same. šŸ™‚
I don’t have access to the database at present and hence can’t test this out. Does this make sense?

p.s. I remember reading Doug Burns’ series of blog posts about statistics collection of partitioned tables. I believe it is somewhat related as he demonstrates how oracle is able to calculate global statistics when there exists dummy statistics at partition level. I guess that also works on the same principle.

mwidlake - May 21, 2010

Hi Narendra,

Thanks for the comments. You are right about the difference between having no stats gathered and having stats gathered, but it was not the point I was making in this case. I’m afraid I assumed people knew that dynamic sampling was intended to gather stats on tables lacking stats and also to verify correleation between columns via sampling, even if stats had been gathered (depending on the level set for Dynamic Sampling). I had included an example of dynamic sampling occuring on the tables when they utterly lacked stats and I had no hint, but took it out as the example was just to long. Sorry about that if it led to confusion.

With a hint stating that the CBO should use dynamic sampling on a stated table, the CBO should do the sampling at the stated level irrespective of there being stats on the table. And that is the case so long as the stats for the table do not state zero rows, it would seem. And yes, zero is certainly not null šŸ™‚

Doug’s series of posts are indeed good. The process of calculating global stats from partition ones is slightly different, in that case the information is derived from the existing partition stats as opposed to sampling data again (at least it is in 10.2.0.3). However, I should check back on Doug’s post and also what happens in 11, as it might be that Oracle does some sampling to help calculate information that simply cannot be derived well from the partition-level stats to the global stats (like, distinct number of values for columns).

Thanks,

Martin

Narendra - May 21, 2010

Martin,

but it was not the point I was making in this case.
Apologies if I have missed the plot.
Iā€™m afraid I assumed people knew that dynamic sampling was intended to gather stats on tables lacking stats and also to verify correleation between columns via sampling, even if stats had been gathered (depending on the level set for Dynamic Sampling)
BTW, that statement is not literally accurate. As of 11g, Dynamic Sampling is not intended to “gather” stats. All it does is actually executes an additional query (during parsing) in order to get the information, which statistics provide otherwise (like number of rows, distinct values etc.)

mwidlake - May 21, 2010

Hi Narendra,

Sorry, I said that about “gathering stats” badly and you are right. Under dynamic sampling the CBO samples the table for data to help it cost the query. It does not “gather stats” in that the information is not then stored into the dictionary against the tables/indexes. The data is collected, used and discarded.

One of the impacts of dynamic sampling is that it is carried out each time the statement is parsed and then the information is thrown away. On a data warehouse this is usually fine as the main workload is long queries where a few seconds sampling the table(s) is affordable. On an OLTP system, the sampling can well take longer than the actual query takes to run and it can cripple the system

6. Narendra - May 21, 2010

Apologies for missing out on this earlier, but I believe the “cardinality feedback” mechanism introduced in 11g will probably take care of this issue, when the sql is executed multiple times.
Also, it seems I am bit late here as DomBrooks appears to have already done the work. šŸ™‚

7. Statistique - May 21, 2010

Hi,

Thanks for the feedback on my comments.

“Iā€™m afraid I assumed people knew that dynamic sampling was intended to gather stats on tables lacking stats and also to verify correleation between columns via sampling, even if stats had been gathered (depending on the level set for Dynamic Sampling).”

I must point out that I use dynamic_sampling on table that has stats on them but that I know in advance the CBO will estimate cardinality so far from truth that it needs some extra info. Since our architects doesn’t like when we does hint such as /*+ CARDINALITY (xxx 9999) USE_HASH (xxx yyy) etc */, we use dynamic_sampling on such query and the CBO goes back in the rigth track 75% of the time.

I’ll keep in mind that “zero rows anomaly” though !

Nice blogging btw (you AND dombrooks) !!!

mwidlake - May 21, 2010

Thanks Statistique, both for the kind words and also the extra input.

I also quite often use dynamic sampling when I know there are stats on the table. Sometimes it is because I know that the data may well have changed since the stats were gathered, sometime it is because I want to use a level 3, 4 or more gather to allow the CBO to spo correlation between values.

I can appreciate your architects not liking hints, I tend to use them only if I can’t get the stats to tell the optimiser what I want to do (or alter the code to better handle the situation). But I use hints when needed. I have still to make up my mind if I “like” cardinality hints or not. It is giving the optimizer more information but allowing it to still choose what it sees as the best access path and should not prevent new features being used, but it is a bit like sticking a constant into the code. Will the cardinality still be correct in 2 years time?

{Dom does have his own blog but he seems to like to add to mine šŸ™‚ }

8. Dynamic sampling pitfalls « savvinov - December 21, 2011

Leave a reply to dombrooks Cancel reply