jump to navigation

Wrong way to Query on Dates – the Persistent Offender November 15, 2009

Posted by mwidlake in performance.
Tags: ,
trackback

Some of you, hopefully most of you, will find nothing new in this little posting. Except maybe the realisation that you are not the only one utterly frustrated by the ever-reoccuring nature of this. Actually, I should say “frustrated by these“. There are two things here I keep coming across, year after year after year…Actually, three things. {Maybe I am frustrated too easily, I need a holiday….}

{Maybe I should start a “weekend rant” as well as a”Friday Philospohy” thread…}

OK, you have a massive ORDERS table with a column ORDER_DATE on it, type DATE. This column has a normal B-tree index on it. You want to query for all ORDERS placed on a date, say 1st November 2009, so you issue a query with the following WHERE clause:

WHERE ORDER_DATE = ’01-NOV-09′

But you would not, would you? DATE columns can (and often do) contain a time portion. And in this case it does, so you would track down whoever specified this table, give them a “talking to”, get the column renamed to ORDER_DATETIME and use something like:

WHERE TRUNC(ORDER_DATETIME)=’01-NOV-09′

But you would not, would you? The function on ORDER_DATETIME will prevent the index being used (unless you added a function-based index, you clever so-and-so).
Now this is the first second thing that I keep seeing and have to run through with people. Some people simply have not yet been told that functions on columns stop indexes from being used, others know this but do not quite understand that a TRUNC {or SUBSTR on the first X characters} still stop indexes being used by Oracle, even though they know that “functions stop indexes being used” – It seems sensible to them, as humans, that if they had a printed index they would not think twice about doing a range check with the index. But the CBO is not a human and is not so smart. So you explain this to them.

So maybe you get to the point where the following is the WHERE clause to use:

WHERE ORDER_DATETIME >= ’01-NOV-09′
AND ORDER_DATETIME < ’02-NOV-09′

That will work. It will use the index on ORDER_DATETIME, it will pass unit testing, it will get released and work fine.

Until someone runs this on a system with a different NLS_DATE_FORMAT.
Or in a session with a different NLS_DATE_FORMAT.
Or on an application server/middle tier where the NLS_DATE_FORMAT is different and it all falls over.
{and for NLS_DATE_FORMAT, keep in mind that NLS_LANG and NLS_TERRITORY can implicitly alter NLS_DATE_FORMAT, so anyone supporting database systems over national boundaries may well be nodding their heads sagely right now}.

It falls over because there is an implicit data conversion of the string ’01-NOV-09′ to a date. It uses your default date format, which is not the same on all databases, on all middle tiers or even all sessions (I always alter my session to show the time portion and I seem to be one of the first to hit these issues, maybe as a result). It is far less common, but I must come across this NLS issue once a year minimum and usually it has been a bit of a major issue (like feeding back utterly the wrong data to a regulatory body, Ouch).

So, you need to explicitly state your character to date format conversion. And to all developers reading this, Yes I know, it is a pain to type those extra bits, but you get paid to do it! 🙂
WHERE ORDER_DATETIME >= TO_DATE(’01-NOV-2009′,’DD-MON-YYYY’)
AND ORDER_DATETIME < TO_DATE(’02-NOV-2009′,’DD-MON-YYYY’)

That will work and keep working.

So:

  • Call columns that hold a DATE with a time portion blar_DATETIME
  • Don’t have functions on the column, move them over to the other side of the WHERE statement(s)
  • Explicitly specify your date conversions. Always. Every time.

I know, I know, most of you knew all of that. Those of you who didn’t, well probably no one had told you before but now you have been told. And that is the heart of it, these things are easy to explain, easy to understand, but somehow seem to be missed when people are getting going with Oracle. They just need telling.

And I’m only able to tell so many. So, the rest of you, will you please tell all inexperienced Oracle people about these three things as I don’t think I can face another 20 years of telling people these things 🙂

{The serious final line is, there are lots and lots of interesting gotchas, what-ifs, quirks on things in Oracle, all of which are good to share, but perhaps the most important things to keep sharing are those things “we all know”, but those with less experience don’t.}

Comments»

1. girlgeek - November 15, 2009

It is not just a flaw of newbie developers. It is a failing in Oracle that such a common issue must be handled in such a cumbersome manner. Along with warning newbies of gotchas, I call on Oracle to make their data access friendlier.

mwidlake - November 15, 2009

Well, Oracle is kind of caught in that this is the way that Dates/datetimes and the NLS stuff has worked for a long while. To change it now would cause a huge impact on existing systems, so they can’t. It’s just a pity it was originally implemented in the way it was {ie times not shown as the default}.
All languages and protocols have their abberations, I just can’t help myself complaining about this particular one 🙂

Greg Rahn - November 16, 2009

I think the “failing” here is that Oracle offers implicit data type conversion. If Oracle enforced strict type checking then when the datatype of the right hand side of the comparison didn’t match the left side, it would return an error stating “datatype mismatch” (other databases do this). Simply put, everyone who writes/programs database queries needs to cast strings to dates (as well as validate other comparisons match in datatype).

Perhaps there should be an enhancement request for “strict_datatype_checking=TRUE”. …But then someone would complain that it makes programming more “difficult”.

Reminds me of a saying that one of my computer science university professors told me: “It might work, but it’s not right.”

mwidlake - November 16, 2009

The option to force strict data typing would certainly be a nice enhancement Greg, but I’m sure you are right in that people would not want it on as it makes developing code harder. I’d like
“insert into eric vales(x,y,z)”
(ie no column list for the “into”) to be illegal as it is done so often in development and breaks later on – but I use it so much when just testing things out or demonstrating things that it would trip me up!

Some databases (I’m thiking MySQL) try even harder than Oracle to do implicit data conversion. This works:
mysql> select ename, sal, job from emp
-> where SAL in (1000,2000,’3000′,4000,’5000apple’);
+——-+———+———–+
| ename | sal | job |
+——-+———+———–+
| SCOTT | 3000.00 | ANALYST |
| KING | 5000.00 | PRESIDENT |
| FORD | 3000.00 | ANALYST |
+——-+———+———–+

?5000Apple?
🙂

2. Bernard Polarski - November 16, 2009

I always put, for date in columns and predicates using the full format with YEAR first for sorting order when printed. so a date in a SQL is always to_char( mydate,’YYYY-MM-DD HH24:MI:SS’). NLS_DATE_FORMAT is an invention of Oracle Corps used to punch in their face all the naive user of Oracle.

mwidlake - November 16, 2009

Bernard, I now have an image of Oracle Support staff hitting users in the face…If only to distract them from the car-crash which is the new Metalink interface 🙂
I use the same format but often without the dashes and colons, to save typing and takes up less screen space when displayed.

3. dombrooks - November 16, 2009

> WHERE ORDER_DATETIME >= TO_DATE(‘01-NOV-2009′,’DD-MON-YYYY’)
Yes, this is all true. Don’t you also sometimes get the third thing? This where 01-NOV-2009 is a parameter of type date (hurrah!) but the developer is so-used to getting an ear-bashing over explicit conversion of date STRINGS that they do a TO_DATE on their date DATES and get another implicit conversion which can go wrong.

mwidlake - November 16, 2009

{quiet laugh} Yes, I have seen that one too. A sign of a developer trying very hard to do the right thing but not quite understanding what the right thing is {I can say this, I have been a developer on and off for years}.

4. wilfred - November 17, 2009

Ouch. TO_DATE(’01-NOV-2009′,’DD-MON-YYYY’) assumes your session is running with NLS_DATE_LANGUAGE set to English. November is not abbreviated to NOV in all languages.

So either use only numeric notations or use the third (optional) argument for TO_DATE to specify the NLS parameters:
TO_DATE(’01-NOV-2009′,’DD-MON-YYYY’, ‘NLS_DATE_LANGUAGE=American’)

mwidlake - November 17, 2009

Ahhhhh, you are right Wilfred.
Looks like I am trying very hard but not quite understanding what the right thing is…:-)
A problem with using numerical months (and why I stick with alpha) is the potential confusion with whether the month comes before or after the date (Americans get it wrong, unless you are American and then the British get it wrong).

5. Neil Chandler - November 20, 2009

Coincidence always surprises me. I have just emailed out to a few dozen of the Developers here in my department at *some*bank* an internal wiki page about how to use dates within Oracle. Most of them are Sybase/SQL Server Coders, but we migrated to Oracle in April. Most of them haven’t got a clue how Oracle dates work (or indeed what a database is)

2 examples from this weeks code review (there have been more) :

One guy was taking a date, converting it TO_CHAR and then converting TO_TIMESTAMP as he thought you couldn’t compare dates directly (I somewhat simplified and improved the performance of his code as he was magically able to use indexes.)

Another was converting the date TO_CHAR, and doing string manipulation to ensure the time was set to 12:00:00, then converting back TO_DATE for his SQL.

And don’t get me started on bind variables and when to use Dynamic SQL (for every SQL statement as far as some of the developers are concerned). *sigh*

Some days are better than others. At least it’s Friday today, and I don’t have to go out drinking with Martin again :-S

Oh, and when it comes to dates, how many of you use SYSDATE and how many use CURRENT_DATE ?

mwidlake - November 22, 2009

Was it you I was drinking with Neil? And I had such a good time – are you sure it was you? 🙂

I use sysdate. I am old. I have to check the manual on current date as I forget what it actually, really, especially in respect of daylight saving, is. And the SQL manual is particularly unhelpful on that one 🙂

6. Duke Ganote - March 2, 2010

Use a “Date Literal”

SELECT DATE ‘2010-03-02’ FROM DUAL

http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/sql_elements003.htm#sthref365


Leave a reply to wilfred Cancel reply