
INTERNAL_FUNCTION() Impact

April 21, 2010

Posted by mwidlake in internals, performance.

You may occasionally see something like the following in the filter predicates of an Explain Plan:

INTERNAL_FUNCTION("COL_1")>:P0

What is INTERNAL_FUNCTION, what is INTERNAL_FUNCTION() used for in Oracle explain plans, and why am I putting phrases like "ORACLE INTERNAL_FUNCTION partition exclusion" and "meaning of internal_function performance" almost as random strings in this blog?

Well, I want to get the hit rate up for this post because, when I did a quick search on INTERNAL_FUNCTION, the current top hit was this unfortunately misleading posting, which tells you "It is not a significant overhead in SQL execution performance". This internal function IS potentially a significant overhead in SQL execution performance. You can't add comments to that site {you can just book a course or order a book}, so I'm having to create my own page…

INTERNAL_FUNCTION is, as the above posting says, where Oracle does an implicit data conversion. In the case shown there, as in my example below and in most cases I have seen, that conversion is from TIMESTAMP to DATE.

Why is it an issue? Because it is a function applied to a column, and as such it can have a disastrous impact on SQL execution performance.
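To illustrate the general principle {a minimal sketch with made-up names, ORDERS and ORDER_DATE, not the real system}: wrap a function around a column and the optimizer can no longer use an index on, or the partition boundaries of, that column; leave the column bare and it can.

-- A function around the column: no index use, no partition pruning.
SELECT COUNT(*) FROM orders
WHERE  TRUNC(order_date) = DATE '2010-02-01';

-- The column left bare: index use and partition pruning are possible.
SELECT COUNT(*) FROM orders
WHERE  order_date >= DATE '2010-02-01'
AND    order_date <  DATE '2010-02-02';

INTERNAL_FUNCTION is exactly such a function around the column – just one that Oracle has added on your behalf.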

The code below (which has most select columns removed for simplicity and to protect the innocent) was performing very poorly, taking over 30 minutes to complete. This is all on Oracle 10.2.0.3 Enterprise Edition.

SELECT trans0_.ID as ID3_1_
...
, round(transfers0_.AMOUNT,2) as formula10_0_
FROM VW_PMTT trans0_
WHERE  (trans0_.MODIFIED_DT between :p0 AND :p1
        AND trans0_.PERS_ID = :p2)
and transfers0_.ACCOUNT_ID=:p3
ORDER BY transfers0_.MODIFIED_DT

This is the plan and the significant filter/access predicates:

Plan hash value: 3261981189
------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                 | Rows  |Bytes| Cost   |Pstart|Pstop|
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                      |       |     | 67841  |      |     |
|   1 |  SORT ORDER BY                    |                      |     8 | 920 | 67841  |      |     |
|   2 |   VIEW                            | VW_PMTP              |     8 | 920 | 67840  |      |     |
|   3 |    UNION-ALL                      |                      |       |     |        |      |     |
|*  4 |     FILTER                        |                      |       |     |        |      |     |
|   5 |      NESTED LOOPS                 |                      |     4 | 280 | 28490  |      |     |
|*  6 |       TABLE ACCESS BY INDEX ROWID | W_TBLP               |     1 |  14 |     3  |      |     |
|*  7 |        INDEX UNIQUE SCAN          | PK_W_TBLP            |     1 |     |     2  |      |     |
|   8 |       PARTITION RANGE ALL         |                      |     4 | 224 | 28487  |    1 | 1580|
|*  9 |        TABLE ACCESS BY LOC IND RID| W_PAYMENT            |     4 | 224 | 28487  |    1 | 1580|
|* 10 |         INDEX RANGE SCAN          | IDX_W_P_TRANS        | 66936 |     |  3345  |    1 | 1580|
...

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("PLY"."ACCOUNTID"=:P3)
   7 - access("PLY"."PERS_ID"=:P2)
   9 - filter(("PT"."TYPE"=1184769 AND INTERNAL_FUNCTION("PT"."MODIFIED")>=:P0 AND INTERNAL_FUNCTION("PT"."MODIFIED")<:P1))

As you can see, the view is translated into a simple two-table join {unioned with a second two-table join, which I again removed for clarity, but if anyone wants the full plan and statement, email me} where:

  • one table is accessed via a unique index (using bind variable P2),
  • the rows are filtered on P3,
  • this table is then joined to a partitioned table in a nested loop,
  • with the partitioned table accessed via a local index range scan.

At first glance the plan may look fine, but look at the Pstart and Pstop columns: 1 to 1580. That is every partition in the table.
The predicate information shows that INTERNAL_FUNCTION("PT"."MODIFIED") is being compared to :P0 and :P1.

The partition key on the table is:

>@chk_patr_key
Name of Table : W_PAYMENT

TAB_NAME                       OBJECT_TYP PSN COLUMN_NAME
------------------------------ ---------- --- -----------
USER.W_PAYMENT                 TABLE       1 MODIFIED

i.e. the column that is being filtered on.
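As an aside, chk_patr_key is one of my own scripts, but the same information comes straight from the data dictionary {a sketch; swap USER_ for ALL_ or DBA_ as appropriate}:

select name, object_type, column_position, column_name
from   user_part_key_columns
where  name = 'W_PAYMENT';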

So why is there no partition pruning? Partition pruning is not occurring because of the INTERNAL_FUNCTION being applied to that column. The CBO is not able to work out how the result of a function will map onto the partition boundaries of the column. Not even its own, internally used function 😛

For the above, the "date" bind variables were defined as:

p0 timestamp := to_timestamp('01-FEB-2010 00:00','DD-MON-YYYY HH24:MI');
p1 timestamp := to_timestamp('03-FEB-2010 00:00','DD-MON-YYYY HH24:MI');

I then altered the code so that the bind variables were defined as dates:

p0 date := to_date('01-FEB-2010','DD-MON-YYYY');
p1 date := to_date('03-FEB-2010','DD-MON-YYYY');
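For anyone who wants to reproduce the scenario, here is a minimal, self-contained sketch of the corrected version {simplified to a single predicate against the underlying table rather than the view, and assuming SERVEROUTPUT is on}:

declare
  p0 date := to_date('01-FEB-2010','DD-MON-YYYY');
  p1 date := to_date('03-FEB-2010','DD-MON-YYYY');
  v_cnt pls_integer;
begin
  select count(*) into v_cnt
  from   w_payment pt
  where  pt.modified >= p0
  and    pt.modified <  p1;
  dbms_output.put_line('rows found: '||v_cnt);
end;
/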

The plan and significant predicates were now:

Plan hash value: 346162108
------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name                 | Rows  |Bytes| Cost   |Pstart|Pstop|
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                      |       |     | 67841  |      |     |
|   1 |  SORT ORDER BY                    |                      |     4 | 460 | 67841  |      |     |
|   2 |   VIEW                            | VW_PMTP              |     4 | 460 | 67840  |      |     |
|   3 |    UNION-ALL                      |                      |       |     |        |      |     |
|*  4 |     FILTER                        |                      |       |     |        |      |     |
|   5 |      NESTED LOOPS                 |                      |     2 | 140 | 28490  |      |     |
|*  6 |       TABLE ACCESS BY INDEX ROWID | W_TBLP               |     1 |  14 |     3  |      |     |
|*  7 |        INDEX UNIQUE SCAN          | PK_W_TBLP            |     1 |     |     2  |      |     |
|   8 |       PARTITION RANGE ALL         |                      |     2 | 112 | 28487  |  KEY | KEY |
|*  9 |        TABLE ACCESS BY LOC IND RID| W_PAYMENT            |     2 | 112 | 28487  |  KEY | KEY |
|* 10 |         INDEX RANGE SCAN          | IDX_W_P_TRANS        | 66936 |     |  3345  |  KEY | KEY |
...

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter("PLY"."ACCOUNTID"=:P3)
   7 - access("PLY"."PERS_ID"=:P2)
   9 - filter(("PT"."TYPE"=1184769 AND "PT"."MODIFIED">=:P0 AND "PT"."MODIFIED"<:P1))

The expected rows and bytes drop, but the key difference in the plan is that Pstart and Pstop are now KEY KEY, i.e. the CBO will resolve the values of the bind variables at run time and prune the access to W_PAYMENT down to the matching partitions.

In this situation, the original query took over 30 minutes to return. The new code took 18 seconds.

I tried one final test. I set the bind variables to be varchar2 so that implicit data conversion would occur:
p0 varchar2(20) := '01-FEB-2010 00:00';
p1 varchar2(20) := '03-FEB-2010 00:00';

With VARCHAR2, the CBO was able to do the implicit conversion without INTERNAL_FUNCTION, and the plan and filter predicates were exactly as for native dates.
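If you cannot change the bind type at the client {it often comes out of a framework you do not control}, an alternative that should work – though I have not tested it here, so treat it as a sketch – is to convert the bind rather than the column, leaving the partition column bare:

-- Convert the TIMESTAMP bind to DATE in the SQL; the column itself
-- stays bare, so partition pruning remains possible.
SELECT count(*)
FROM   VW_PMTT trans0_
WHERE  trans0_.MODIFIED_DT >= CAST(:p0 AS DATE)
AND    trans0_.MODIFIED_DT <  CAST(:p1 AS DATE);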

It would seem this INTERNAL_FUNCTION issue blights TIMESTAMP-to-DATE implicit data conversion but not some other implicit data conversions. Without testing further, I can't be sure which.
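If you want to test a given conversion yourself, here is a rough sketch of one way to do it: run the predicate with the bind type under test from PL/SQL, then pull the real plan and predicates from the cursor cache {the table alias int_func_test is just a marker to make the cursor easy to find; substitute the sql_id and child_number the middle query returns}:

-- Run the predicate with the bind type under test (TIMESTAMP here).
declare
  p0 timestamp := to_timestamp('01-FEB-2010 00:00','DD-MON-YYYY HH24:MI');
  v_cnt pls_integer;
begin
  select count(*) into v_cnt
  from   w_payment int_func_test
  where  int_func_test.modified >= p0;
end;
/

-- Locate the cursor in the cache.
select sql_id, child_number
from   v$sql
where  sql_text like '%int_func_test%'
and    sql_text not like '%v$sql%';

-- Display its plan, including the predicate and partition information.
select * from table(dbms_xplan.display_cursor('<sql_id from above>', 0, 'TYPICAL'));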

This is a feature documented by Oracle. Unusually (and much to their credit), Oracle even documents the potential negative impact of this particular construct with timestamps and dates. See this section in chapter 5 of the Oracle 10g Data Warehousing Guide.

I had better balance the fact that I linked to an inaccurate posting on the subject with a few good ones.
This is a nice one from McLaud on the OraFAQ site, where he comes across it as a problem and, with help, shows how it is due to implicit conversion of a timestamp from Hibernate.
This blog on the issue arising with Java is a nice example of the fix (use a date to hold a date, not a timestamp).

Finally, this link is to an older but very, very valid rant by Tom Kyte about implicit data conversion and using the correct data type for your data. {Don’t go using numbers or varchars to hold dates, just don’t, and similarly you should avoid implicit data conversion}.

Comments»

1. oakesgr - April 21, 2010

Hi Martin,

nice post, something I’ll keep an eye out for going forward. I’ll add it to my list of ‘reasons to use correct datatypes’ which I like to bring out and moan about whenever I get the chance.

Cheers
Graham

2. mwidlake - April 21, 2010

I think anyone not using data types correctly should be given a very strong talking to. I mean, if you are going to abuse data types why not just create all tables called tab_1 to tab_nnnn and have columns col_aa to col_zz, each one datatype char(2000). Ultimate flexibility. You can reserve tab_1 and tab_2 to list and keep track of what all the tables (tab_1) and columns (tab_2) currently are used for…

Don’t get me started on calling columns silly things either. (Like “timestamp#” to hold a DATE in sys.aud$, for goodness sake. So they deprecate it and replace it with ntimestamp#. Tsch)

3. Tony Sleight - April 23, 2010

Our applications software developers have taken different approaches when encoding dates. Some store dates as a DATE data type, others store dates as VARCHAR2 data types. Luckily, we have no instances of DATE -> TIMESTAMP conversions (so far!), so by sheer luck, we have not experienced the performance issues highlighted in this article.
It’s a pity articles such as these do not come top of the Google searches instead of those which may lead you into a false sense of security.
As an aside, could I ask how you have managed to get the syntax highlighter scripts in your wordpress blog? I am currently writing a case study article on the enq: TX – Row contention wait event as a first blog attempt, and can put the tag in the body tag, but have no method of inserting the scripts in the header. Any hints would be appreciated.
Keep up the good work in debunking the Oracle myths!

mwidlake - April 25, 2010

Hi Tony,

To get the syntax highlighter/layout of code I use sourcecode. For general code you just use [sourcecode] and [/sourcecode] as tags {so, square brackets around the tags}. For code with SQL in it, the opening tag is [sourcecode language='sql'].

I covered the topic in one of my first posts, but sourcecode has been improved since, so my example showing the syntax now works.

Hope that helps

4. ezuall - April 28, 2010

Good post and I really enjoy your blog.

To try and address the issue of poor content being surfaced I’ve started an experiment (hence the strange domain name) to group and surface good quality content. Like I said it’s still an experiment, so the hosting is a bit slow, but if you want to check it out I added this blog post: http://www.blingninja.com/node/48

Looking forward to future posts. A large part of the most valuable Oracle related content is currently being created by a small group of knowledgeable professionals who share their experience freely.

mwidlake - April 28, 2010

Hi ezuall,

Thanks for the kind comments and the link, I’m glad you find useful stuff here.

Good luck with the experiment, I think it is a nice idea. I reckon it might be hard work to get it off the ground, but even as your own resource for collating what you feel is good information, I think you will find it very useful.

Martin

5. Neil Chandler - May 3, 2010

Martin,

Good article. I think it’s worth mentioning that implicit conversion is one of the main causes of hard-to-find bugs in systems too, and can also perplex the optimizer into bad plans (e.g. how big is the gap between 31-12-2009 and 01-01-2010 when represented in characters? It can be the difference between a relatively small index range scan and something much worse).

And I hope you realise that by linking to an article, inaccurate or otherwise, you increase its Google hit level and thus further promote myths and problems – hopefully keeping the likes of you (and me) in a job.

regs

Neil

mwidlake - May 3, 2010

Thanks Neil.

The last article I link to in the main post, Tom Kyte’s, is an excellent one on the abuse of wrong column types, and I utterly agree with you and Tom – putting dates into non-date columns is just wrong. They do it at a current site I am working for, and their excuse is that it allows them to insert “not known” values and special values. I’ve tried explaining why this is logically a bad idea, and the impact on the CBO’s perception of cardinality (as you describe), but to no avail so far. I’m fighting an established way of doing things, and retrofitting such things is a big ask.

As for increasing the number of links to the offending article, well yes, I increase the links by one. But the chap in question cross-links his own stuff so much it probably makes little impact. I was hoping people would link to my article, and even originally put in a suggestion that people did, but it came over as a desperate “reference me! reference me!” plea, so I pulled the line.

So Neil, I expect you to start up a blog just so you can reference my jolly good article 🙂

Martin

Jonathan Lewis - May 7, 2010

Martin,

If you want to avoid incrementing the score of a page, you can always add the rel="nofollow" attribute to the link. (I do this by switching to the HTML editor for WordPress so that I can see the raw text of the link; it goes after the href="…" bit, but before the closing angle bracket.)

6. mwidlake - May 7, 2010

Thanks for that Jonathan, I’ll use that as appropriate!

7. Ian Carney - October 11, 2011

Did anybody say “Data Model” or “Data Standards”?
ian

mwidlake - October 11, 2011

Ian, you are showing your age – wanting a data model for goodness sake! Let alone data standards. 🙂
But being serious, yes, I did mention standards at the time. If I remember correctly, they had them but did not follow them.

8. What the heck is the INTERNAL_FUNCTION in execution plan predicate section? | Tanel Poder's blog: IT & Mobile for Geeks and Pros - January 17, 2013

[…] the datatype conversion reason is already covered in Oracle docs and blogosphere. But there’s […]

9. lukaseder - December 29, 2014

This isn’t really a Hibernate problem as indicated in your last section, but a problem at the JDBC layer (and Hibernate as well as many other tools inherit it). It can be solved in Hibernate, though, using the approach shown in this blog post here:

Leaky Abstractions, or How to Bind Oracle DATE Correctly with Hibernate

mwidlake - December 30, 2014

Hi Lukas, thanks for the comment and also the link to how to address this issue within Hibernate.

