
Pickler Fetch – What is it? August 11, 2011

Posted by mwidlake in performance, PL/SQL.

How many of you have seen the step COLLECTION ITERATOR PICKLER FETCH appearing in Explain Plans and wondered “what is a Pickler Fetch”?

{Note – make sure you check out the comments on this post if you are interested in Pickler Fetches. Timur Akhmadeev and Dom Brooks correct and extend my original efforts}.

----------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |            |       |       |    31 (100)|          |
|*  1 |  FILTER                               |            |       |       |            |          |
|   2 |   NESTED LOOPS                        |            |       |       |            |          |
|   3 |    NESTED LOOPS                       |            |     1 |    71 |    31   (4)| 00:00:01 |
|   4 |     SORT UNIQUE                       |            |     4 |     8 |    26   (0)| 00:00:01 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|            |     4 |     8 |    26   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN                  | CHHE_PK    |     1 |       |     2   (0)| 00:00:01 |
|   7 |    TABLE ACCESS BY INDEX ROWID        | CHILD_HEAP |     1 |    69 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

It is where Oracle works through a SQL array. I’ve usually seen this within PL/SQL, where an array variable containing a set of accounts or customers of interest is passed into a procedure or package and then used within a cursor. But you might also see it as the step processing the output of a pipelined function (one that returns a set of “rows”) being cast into a table. See this example by Tom Kyte. Tom also suggests that it is called a “Pickler” fetch because the data is pickled – packed and formatted. I never knew that; I just thought someone in Oracle development was having a giggle and it was “Pickled” as it was preserved from the PL/SQL side of the SQL engine. It seems that I was a little off-target with that.
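For a flavour of the pipelined function case, here is a minimal sketch {the type and function names are my own illustrative ones, not from Tom’s example}:

create or replace type num_list_t as table of number
/
create or replace function first_n (p_n in number)
return num_list_t pipelined
is
begin
  -- each PIPE ROW sends one element back; selecting from this via
  -- TABLE() shows up as a COLLECTION ITERATOR PICKLER FETCH step
  for i in 1..p_n loop
    pipe row (i);
  end loop;
  return;
end;
/
select column_value from table(first_n(5));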

{My thanks to Timur (see comments) who corrected me when I said it was a PL/SQL array. It is not, it is a SQL object}.

The above plan was generated for the below SQL statement:

cursor c_collect is 
select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select  id  from table(p_parents) p)
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1);

Anyway, I just wanted to make a few comments about Pickler Fetches:

  • The CBO is not very good at looking “inside” such arrays to determine the cardinality of that step. This can cause performance issues.
  • Because using them involves a few steps and potentially involves grants, it is possible for them to be “temporarily removed” during testing and so their impact is not seen.
  • They can leak memory. I think.

I’m now going to cover each of those points in turn. If you just wanted to know what a pickler fetch is and I’ve answered that for you, I suggest you go back to whatever you were doing before you started reading this 🙂

By “not very good at looking inside pickler fetches” I mean that, depending on the Oracle version, the CBO either makes a poor “base it on the first array seen” estimate (V11 up) or utterly fails to work out how many records are inside the SQL array (V10 down). From an Oracle 10.2.0.4 system I’ve got the two following examples:

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name              | Rows| Bytes | Cost (%CPU)| Time     | Pstart
--------------------------------------------------------------------------------------------------------------------
|   9 |      NESTED LOOPS                       |                   |   1 |   139 |   114   (1)| 00:00:01 |      
|* 10 |       HASH JOIN SEMI                    |                   |   1 |    50 |    11  (10)| 00:00:01 |      
|* 11 |        TABLE ACCESS BY INDEX ROWID      | ABCDE_ABCDEFGHIJ  |   3 |   144 |     4   (0)| 00:00:01 |      
|* 12 |         INDEX RANGE SCAN                | XX_PK             |   3 |       |     3   (0)| 00:00:01 |      
|  13 |        COLLECTION ITERATOR PICKLER FETCH|                   |     |       |            |          |      
|  14 |       PARTITION RANGE ITERATOR          |                   |   6 |   534 |   103   (0)| 00:00:01 |   KEY
|* 15 |        INDEX RANGE SCAN                 | AAXX_PK           |   6 |   534 |   103   (0)| 00:00:01 |   KEY

Note the cost and expected rows for the Pickler Fetch step. Or rather, the lack of them.

In the OEM screens this manifested itself as the Pickler Fetch step showing a cardinality of 0 {sorry, I cut off the column headings of the screenshot}. This resulted in the step having a cost of 0, and all the subsequent steps having an expected cardinality of one and very low costs – between 1 and 3 {again, not shown, sorry}.

The end result of this was that the CBO struggled to accurately cost any given plan and came up with several, usually quite terrible, plans that it swapped between as other table stats varied. The CBO was picking between very complex plans with total costs of 100 or 200 or so! Any difference was “significant”.

Please note, OPTIMIZER_DYNAMIC_SAMPLING was set to 4 on this system, and I tried hints and session settings at higher levels; they did not prompt the CBO to look into the array, on 10.2.0.4 at least.
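For reference, the sort of thing I tried looked like the below {the level and placement are illustrative – none of it prompted the CBO to sample the array on 10.2.0.4}:

alter session set optimizer_dynamic_sampling = 10;
--
-- and hinting the statement inside the procedure directly:
select /*+ dynamic_sampling(p 10) */ id
from   table(p_parents) p;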

In 11.1 things seem to be better, as is shown in the explain plan at the top of this post – the step has a cost. I have to confess I have not tested this very much on 11 {and if anyone has, feel free to correct me/add enlightenment via comments or pointers to other sources}, but it seems to set the cardinality to the number of elements the Pickler Fetch finds in the first iteration. Unless it uses the same sort of trick Oracle 11 now uses for bind variables (detecting when the value supplied is out of range and generating a new plan), this is going to lead to the old and much ‘loved’ issue of the plan being fixed by the first execution, irrespective of how suitable that plan is.
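{As an aside, if you want to check what cardinality actually got baked in at the hard parse, something like the below will show the plan of the cursor sitting in the shared pool – the sql_id placeholder is illustrative, look it up in v$sql first:}

select * from table(dbms_xplan.display_cursor('&sql_id', null));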

How do you fix this issue? Well, I resort to the cardinality hint. Usually the number of records being passed in the array is not too variable, and any half-decent value is better than nothing in Oracle 10 and before. As for 11, I prefer stating the value to risking a variable ‘first seen at parsing time’ setting. It is a judgement call. The below is from 11.1, but I’ve used it extensively in 10.2, where the impact is much more significant:

select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select  id  from table(p_parents) )
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1)
--
----------------------------------------------------------------------------
| Id  | Operation                             | Name       | Rows  | Bytes |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |            |       |       |
|*  1 |  FILTER                               |            |       |       |
|   2 |   NESTED LOOPS                        |            |       |       |
|   3 |    NESTED LOOPS                       |            |     1 |    71 |
|   4 |     SORT UNIQUE                       |            |     4 |     8 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|            |     4 |     8 |
|*  6 |     INDEX RANGE SCAN                  | CHHE_PK    |     1 |       |
|   7 |    TABLE ACCESS BY INDEX ROWID        | CHILD_HEAP |     1 |    69 |
----------------------------------------------------------------------------
--
-- and now with hint
--
select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select /*+ CARDINALITY(p,11) */ id  from table(p_parents) p)
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1)
--
-----------------------------------------------------------------------------
| Id  | Operation                             | Name       | Rows  | Bytes |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |            |       |       |
|*  1 |  FILTER                               |            |       |       |
|   2 |   NESTED LOOPS                        |            |       |       |
|   3 |    NESTED LOOPS                       |            |     1 |    71 |
|   4 |     SORT UNIQUE                       |            |    11 |    22 |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|            |    11 |    22 |
|*  6 |     INDEX RANGE SCAN                  | CHHE_PK    |     1 |       |
|   7 |    TABLE ACCESS BY INDEX ROWID        | CHILD_HEAP |     1 |    69 |
-----------------------------------------------------------------------------

Note the change of ROWS to 11 in step 5. In V10 this is a change from blank to 11 and, in real situations, do not be at all surprised if the plan changes dramatically – away from nested loop access and towards hash joins. {I should note, the cardinality hint is not documented in Oracle 10 or 11 and any use you make of it in live code is your responsibility. Sorry about that}.

What about my second point, about testing them? Well, as an example of testing Pickler processing of SQL arrays, which are defined SQL types, this is what I had to do to run my basic test:

-- pare_types_demo.sql
--
-- create and use some record and table types.
--
create or replace type pare_rec as object
(id number(10)
,name varchar2(100))
/
create or replace type pare_rec_t
as table of pare_rec
/
--
set serveroutput on size unlimited
--
create or replace procedure get_children
(p_parents in pare_rec_t
,p_date    in date)
is
cursor c_collect is 
select  PARE_ID
       ,CRE_DATE
       ,VC_1
from child_heap
where pare_id in (select id from table(p_parents))
and   cre_date >=trunc(p_date) and cre_date < trunc(p_date+1);
begin
  for v_rec in c_collect
  loop
    dbms_output.put_line('parent '||to_char(v_rec.pare_id)||' on '||to_char(v_rec.cre_date)
    ||' value '||v_rec.vc_1);
  end loop;
end;
/
--
declare
l_date        date           :=sysdate-100;
l_parents     pare_rec_t;
begin
  l_parents :=pare_rec_T(PARE_REC(1,'ERIC')
                        ,PARE_REC(2,'BARRY')
                        ,PARE_REC(3,'KERMIT')
                        ,PARE_REC(4,'GONZO')
                        );
  GET_CHILDREN (L_PARENTS,L_DATE);
END;
/

I had to create a table type, which is the SQL array, and this was based on an object type which I had to create first {you can have table types based on standard SQL types, but very often they are based on a “row” object}. After creating the stored procedure, I had to define and populate the array with a set of records, which I then passed in to my procedure call. {If you want to repeat this yourself, check out my postings on IOTs to get the table creation statement for table CHILD_HEAP}.
Now, I created those types, so I had access to them. If the types do not belong to you, you have to be granted execute on them to reference them. Not select – execute. Some sites have a pretty strict attitude to granting execute on anything, and types seem to get forgotten when the execute privileges against packages and procedures are set up. In a recent situation I was forced to do some testing work on Live, and it had taken people with big sticks to get me select access on the data. Execute privileges were totally refused. Calmly explaining why they were needed and how granting them was acceptable fell on not so much deaf as bricked-up ears.
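For illustration, the grants in question look like the below {run as the owner of the types; APP_TESTER is a made-up grantee}:

grant execute on pare_rec   to app_tester;
grant execute on pare_rec_t to app_tester;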

So, for testing, the reference to the array passed in is often replaced by a little sub-select. After all, quite often what is being passed in for a pickler fetch is actually a set of records {well, a subset of the rows} from a table, collected by a previous processing step.
As an example of such a change:

where  ev.SYS_ID                       = pag.SYS_ID
and    ev.EXT_ID                       = pag.EXT_ID
and    P_EXT_ID                        = pag.EXT_ID
and    P_SYS_ID                        = pag.SYS_ID
and    ev.CREATED_DT                  >= pag.START_DT
and (ev.ID, ev.TX) in (select ID,TX from table(P_ARRAY_IN))
--
--BECOMES
--
where  ev.SYS_ID                       = pag.SYS_ID
and    ev.EXT_ID                       = pag.EXT_ID
and    P_EXT_ID                        = pag.EXT_ID
and    P_SYS_ID                        = pag.SYS_ID
and    ev.CREATED_DT                  >= pag.START_DT
--and (ev.ID, ev.TX) in (select ID,TX from table(P_ARRAY_IN)) --NB swap back in before release!
and (ev.ID, ev.TX) in (select ID,TX from CUSTOMERS WHERE SURNAME='WIDLAKE' and FFNAME='MARTIN')

All that has changed is that we now have a little sub-select rather than the casting of the SQL array into a table and, heck, as the developer might say, those were the records that would have been passed in anyway; the code still works as expected and the same data comes back. No identifying which object types you need, no getting the execute permissions, no populating the array yourself in the test harness – you just swap the array back in later.

{I apologise to good developers, who even now are throwing imaginary darts at me. I know, Dawn, you would certainly not do this. But I’ve seen it a couple of times, and developers have enough on their plate without worrying about esoteric aspects of the CBO}

But the thing is, Oracle can look at that select, evaluate its cost and come up with an expected cardinality. The pickler fetch version has the issues I’ve just covered. I had to deal with a situation just like this last year – it does happen. In dev it was fine; in pre-live testing it was not.

What about memory leaks? Well, I had lots of issues with SQL arrays and memory leaks with Oracle 10.1 and 10.2 at one client site, and there is a documented bug in Oracle 8 with pickler fetches and memory leaks, but I have to confess a quick MetaLink search did not find any hits for Oracle 10 and 11. So maybe you should not trust me on that one. In the situation I saw, the arrays were massive – several MB at times – so if you are using SQL arrays to pass in a modest list of e.g. accounts or customers it is not going to be an issue anyway.

You know, this was just going to be a quick post on something I’ve been meaning to mention for months, not a small essay :-).

Comments»

1. Timur Akhmadeev - August 11, 2011

Hi Martin
“It is where Oracle works through a PL/SQL array”

It’s a SQL array, since PL/SQL types are not visible in SQL; even with a pipelined function returning a PL/SQL type, a corresponding SQL type is created behind the scenes.

2. frank gordon - August 11, 2011

I’d come across this when using XML DB in 9i. I’d thought pickler came from the Python object serialization, where the XML object was being turned into database rows. It did leak memory like a sieve!

mwidlake - August 11, 2011

Hi Gordon

I think you are on the right lines when you say that the pickler fetch came from the serialization of the XML object. Now that Timur has prodded me in the right direction, I would hazard a guess that some casting process was changing the XML object into a SQL array and the Pickler fetch was then working on that array.
I’m glad you had memory leaks (well, I’m not but you know what I mean). I think there are several known memory leak issues with XML processing in 9 and 10.

3. mwidlake - August 11, 2011

Hi Timur,

*thinks*… Yes, I see what you mean.
I’ve been lazy in my consideration of this, haven’t I? The fact that you have to create a table type in SQL, often based on an object type you previously created, means it is in the SQL domain. In my experience I’ve only ever seen it when PL/SQL is used (the result of a stored function, or the casting of an array as a table as I show) and so I class it in my mind as part of PL/SQL.

Thank you. I’ll update the text.

Martin

4. Dom Brooks - August 11, 2011

I tried to post a long comment but it just disappeared.
I will try to break it up a bit.

In terms of cardinality, the CBO will normally assume a default for the collection – like a temp table, related to block size.

So in older versions any estimate that is not in that ballpark is likely to be related to the plan.

Let’s see if I can post an example in this comment a) that actually illustrates what I want it to and b) without messing it up.

Firstly – note the default estimate for collection cardinality (this is using COLLECTION ITERATOR CONSTRUCTOR FETCH but it shouldn’t matter):

SQL> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
PL/SQL Release 9.2.0.8.0 - Production
CORE    9.2.0.8.0       Production
TNS for Solaris: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

SQL> alter session set optimizer_mode = all_rows;

Session altered.

SQL> 
SQL> explain plan for
  2  select *
  3  from   table(cast(sys.odcinumberlist('1','2','3','4','5')
  4                 as sys.odcinumberlist)) t;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
| Id  | Operation                             |  Name       | Rows  | Bytes | Cost (%CPU)|
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |             |  8168 | 16336 |    12   (9)|
|   1 |  COLLECTION ITERATOR CONSTRUCTOR FETCH|             |       |       |            |
------------------------------------------------------------------------------------------

7 rows selected.

SQL> 
Dom Brooks - August 11, 2011

Now if we join to another table and mess around with the join order we should see that our numbers go “missing”:

SQL> create table t1
  2  as
  3  select rownum col1
  4  from   dual
  5  connect by rownum <= 100000;

Table created.

SQL> 
SQL> create unique index i1 on t1(col1);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> explain plan for
  2  select /*+ leading (t) */
  3         *
  4  from   t1
  5  where  t1.col1 in (select value(t) 
  6                     from   table(cast(sys.odcinumberlist('1','2','3','4','5') 
  7                                    as sys.odcinumberlist)) t);

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------
| Id  | Operation                               |  Name       | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |             |  8168 | 49008 |    20  (40)|
|   1 |  NESTED LOOPS                           |             |  8168 | 49008 |    20  (40)|
|   2 |   SORT UNIQUE                           |             |       |       |            |
|   3 |    COLLECTION ITERATOR CONSTRUCTOR FETCH|             |       |       |            |
|*  4 |   INDEX UNIQUE SCAN                     | I1          |     1 |     4 |            |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."COL1"=VALUE(KOKBF$))

15 rows selected.

SQL> explain plan for
  2  select /*+ leading (t1) */
  3         *
  4  from   t1
  5  where  t1.col1 in (select value(t) 
  6                     from   table(cast(sys.odcinumberlist('1','2','3','4','5') 
  7                                    as sys.odcinumberlist)) t);

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------
| Id  | Operation                              |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |             |     1 |     6 |    21  (29)|
|*  1 |  HASH JOIN SEMI                        |             |     1 |     6 |    21  (29)|
|   2 |   INDEX FAST FULL SCAN                 | I1          |   100K|   390K|     5  (20)|
|   3 |   COLLECTION ITERATOR CONSTRUCTOR FETCH|             |       |       |            |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."COL1"=VALUE(KOKBF$))

14 rows selected.

SQL> 
Dom Brooks - August 11, 2011

Now, dynamic sampling was not extended to the TABLE function until something like 11.1.0.7.

So, whilst with dynamic_sampling there are certain sanity checks that kick in – which give the impression that it’s not being applied, and which can sometimes be circumvented using the hint dynamic_sampling_est_cdn – that wouldn’t make a difference here.

Once you are on a version where dynamic_sampling is applied, the level you used shouldn’t make a difference with the TABLE function, because the whole collection will be sampled.

Ignoring any features like cardinality feedback or adaptive cursor sharing (which should kick in for the TABLE function / collections), as per any other shared/shareable SQL plan, everything will depend on what was peeked/sampled at the initial hard parse.

OPT_ESTIMATE is another option.
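For illustration only, the undocumented form sometimes seen in SQL profiles looks something like this (syntax and behaviour are not guaranteed):

select /*+ opt_estimate(table, t, scale_rows=2) */ *
from   table(cast(sys.odcinumberlist(1,2,3,4,5)
               as sys.odcinumberlist)) t;

which scales the optimizer’s estimate for the collection rather than stating it outright.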

And for another alternative, consider the extensible optimiser.
Fully supported and documented and available in 10 onwards.
Adrian Billington has a nice walkthrough of the options here:
http://www.oracle-developer.net/display.php?id=427

But… thinking about it… for collections I would generally say that cardinality or opt_estimate is preferable to dynamic sampling.
Firstly, because in general you shouldn’t be using collections larger than a certain size, and secondly, there should not be such a variation in size that an appropriate cardinality hint does not do for all.
If this is not the case, then this might be an indicator that you’re not using collections appropriately.
Not sure about that last bit.

Historical bugs notwithstanding, I’ve made extensive use of arrays etc without any memory leak issues.
As with any feature, the main danger is misuse. For example, why bring data into a collection in
expensive PGA if you can achieve what you need with a single SQL statement
(i.e. “select … bulk collect … followed by forall insert” versus “insert … select”)?
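A minimal sketch of that contrast (table and type names made up):

-- collection route: rows staged in expensive PGA first
declare
  type t_ids is table of child_heap.pare_id%type;
  l_ids t_ids;
begin
  select pare_id bulk collect into l_ids from child_heap;
  forall i in 1 .. l_ids.count
    insert into child_copy (pare_id) values (l_ids(i));
end;
/
-- set-based route: no PGA staging at all
insert into child_copy (pare_id)
select pare_id from child_heap;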

Dom Brooks - August 11, 2011

See Tom Kyte page below for an illustration of cardinality feedback with collections / TABLE function:
http://tkyte.blogspot.com/2010/04/something-new-i-learned-about-estimated.html

mwidlake - August 11, 2011

Thanks Dom. It seems some of your attempts to post ended up being caught by WordPress’s spam filters. I think I’ve retrieved and cleaned them up now. Let me know if I’ve removed something you expected to see.

5. mwidlake - August 11, 2011

Dom, that is brilliant. Thank you. I’m glad to see that from later 11.1 versions dynamic sampling is extended to the table function. That could be a bit of a problem if you are passing in large arrays, as you say, but then I agree with you that passing in large arrays is probably indicative of a sub-optimal design.

If you have problems with long comments again, email me the text and I’ll add it to the body of the post with the relevant citation.

Dom Brooks - August 11, 2011

P.S. I said this:
“(I complicated it with a distinct inline view to get statements that should be equivalent).”

but then I changed my example so that comment is misleading/irrelevant.

Also, for Tom Kyte on cardinality feedback with TABLE function / collection see here:
http://tinyurl.com/3cvs6l8

mwidlake - August 11, 2011

I’ve tidied up the original comment, Dom, and yes, that is a good link. It links on to other good things… *sigh* this really did start out as just a brief break from IOTs…

