jump to navigation

Friday Philosophy – Smart or Smart-Arse? October 20, 2017

Posted by mwidlake in Friday Philosophy, humour, Perceptions, rant.
Tags: , , ,
1 comment so far

Many of you know what a “Smart-Arse” is. For those who do not…

A “Smart-arse” a person who is irritating because they behave as if they know everything or try to catch you out by misleading you.

A smart person will look at your problem and say something like “have you tried checking the array size?” and, 8 times out of 10, their input will help you solve your problem. It may not be THE answer but it makes you think about root causes.

A Smart-arse will say something more like “well, I would never have your problem as I would not have joined a company full of Java Nerds!!!”. Yeah, maybe that would have avoided my specific problem #1, but it is of no practical worth right now. .

You can usually pick out a smart-arse just by the tone of their voice. Think about the two situations above. The Smart person probably gave their advice in a quieter voice and with a passive or supporting manner. The Smart Arse person will usually have a higher, louder voice with a slightly sarcastic tone. Basically, in a superior or attention-seeking manner.

Another “Smart-arse” thing to do is to try to catch you out, in the misguided belief it makes them look cleverer than you.

In these situations always ask – “what is a Smart-arse hoping you won’t spot”

I’ll put my hand up right now. Sometimes, especially looking back on my past, I have been the smart-arse. (And, as humans, we hate the fault in others that we see in ourselves). And I bloody hate smart-arses. And I keep seeing smart-arse crap all over the internet. Let me give you an example. Look at the figure on the right.

This is the perfect example of the “Smart-Arse” question. You are faced with what looks like a simple logic puzzle and normally the tag line is something like “93% of people will get this WRONG!!! – Only Geniuses can solve it!!!!!!!”. They never cite a basis for the “93%” as it is as utterly made up and is as asinine and bloody annoying as whatever trick is in the post. What they are doing is giving you what looks like a genuine puzzle that needs a bit of thinking about but most of us can solve (though not you Dave, you really are an idiot). BUT! But they have hidden a detail. The are purposefully leading you astray so they can go “Aaa-Haaaa!!! Noooo! You forgot to check for the closed pipes!” (check tank 5 to 2). Or whatever the trick is.

This is “Smart-Arse”. It is not checking if you can solve a problem, if you are smart. It is checking if they can con you. Checking if they can give you a load of information and then go “Ahh HA!! Got ya!!! You did not check for the tiny bit of info we hid from you!!! O-hohohohho we are so clever!!!!”

Well, I have news for you, Smart-arse. You are a bloody idiot. Your answer is wrong, as any fool can see. (Bear with me on this…)

More boxes, same smart-arse shit

You may have seen other “tests” like this such as the one to the left – a bit more complex but the question is the same, which one fills up first.
In my head my response is always “which one fills up… *first*????”.

First! That is my response. Not which one but the fact that the question itself is wrong. It should be “which one fills up” full stop, as “any fool can see”. Not which one fills up first.

I better justify that claim.

Look at this second example, clearly labelled with the icon of utmost smarts Mr Einstein (who, I bet, could not plumb a toilet let alone all these pipes as, back in his time, there were no push-fittings – just copper and solder. I think he once said he was rubbish at practical tasks). They think the key “got ya” is that the pipe from C to D is blocked so water goes from C to J. And then from J to L, as the pipe to I exits higher than that to L. One sneaky trick and then a bit of good physics – it is not the input but the output that counts. So water pours into L and then to F – but not H as, again, a sneaky block has been inserted. So F fills up.

And only F fills up!!!

As as soon as it is full it overflows. It overflows below the height of any other buckets the fluid had flown through – and so no other bucket will fill. Their initial question is flawed. “Which will fill first” indicates one will fill second. No other bucket will fill second. The question is not logical! Bloody idiots.

I can’t say why I initially was struck by the fact that only one bucket would fill when I saw the first example of this, even before I spotted the blocked pipes, but we all think in different ways. It does not make me smarter, just different. I’m interested to see if any of you can point out a flaw in my logic above though as I have a sneaking suspicion I could still be wrong.

However, this is not the major flaw… (I told you that I was also an utterly insufferable smart-arse).

Figure 2 has a drip filling the A tank, figure 1 has a gushing tap filling tank 1. Now ask a simple question. No matter if the tap is gushing or dripping, can the pipe out of tank A (or 1) empty the water faster than the tap supplies it? Well, if the tap is dripping you would say “yes” – but if these tanks are 5mm cubed and the pipe out is less than 1mm thick then no! No scale is given. And in fig 1 the tap is gushing. Have any of you had a shower where the plug hole drains slower than the shower produces water? After 2 or 3 minutes your feet are in a shallow bath and if you keep the shower running it overflows into the rest of the bathroom.

With figure 1, the one with the gushing tap, my brain says that tank 1 will fill as the tap supplies water faster than it will exit through the pipe to tank 5. Tank 1 will fill and piss water all over the shop and whatever goes down the pipe to 5 will eventually fill that tank. Which of tanks 1 and 5 fills first is “it depends” (the classic answer to most I.T performance questions). The question is how much slower is the flow out of the pipe from tank 1 – if it is, on average, above half the rate of the tap flow then tank 5 will actually fill first. In any case, you have soaked the bathroom floor and the people in the flat below are banging on the front door…

With that new idea in your head, if you turn up the tap in figure 2 you can now see that which tank fills first is probably A or.. C – depending on the max flow out of the pipes (all pipes are the same bore so flow rate is the same, increasing header pressure in each tank as they fill allowing…) I think it might be C as it’s outflow is higher in relation to the tank top than B or C…

So depending on the tap flow rate, the drain pipe flow rate and the relative height of the clear output pipe it could be…. absolutely NOT the answer of the Smart-Arse original poster. That is the problem with smart-arses! They are so fixed on their clever “gotcha” answers that they stop thinking of the real world.

And don’t get me started on those images where bananas are added to cans of beer and divided by a plate of cakes, designed to look like some sort of Algebra test. Always they are being smart-arse. They try and hide the introduction of multipliers where all the first examples are addition, or you need to count the number of items, or yellow is 3 and green is 6, or it is in base 23. I was going to include an example (again, a really wrong one) but I’ll save that for another week when I am also in a bad mood.

And, of course, I am a “smart-arse” for pointing this all out. Did I say how much I dislike smart-arses?

I promise you, when you start looking for the smart-arse aspect to all those “are you smart enough” bollocks things on social media it just turns into so much blargh and you can either answer them easily or just decide you can’t be bothered being misdirected. And you can use that saved time for looking at funny kitten videos or, I don’t know, doing some productive work?

Is there any other relevance to your working life? Maybe. Next time your management structure asks you a seemingly benign question about what you are doing this weekend or when you think you are on leave (hang on, “think”?), or how minor will be the impact of a small change to how the business functions to the application you are developing – just switch on the bulb with “smart-arse” painted on it. They are asking you a question where they are expecting you to think in the clear, simplistic way most of us would. Now ask what the bloody hell they are up to.

Advertisements

Overloaded Indexes (for ODC Appreciation Day) October 10, 2017

Posted by mwidlake in database design, development, performance, SQL.
Tags: , ,
add a comment

ODC Appreciation Day is an idea that Tim Hall (aka Oracle-Base) came up with, to show out appreciation for the Oracle Technology Network (OTN)/Oracle Developer Community.

Fig 1 This is an efficient range scan

I want to show my support but rather than mention an Oracle “feature” I particularly like I’d like to mention a “trick” that I (and many others) often employ to help performance. That trick is Overloaded Indexes.

We can all appreciate that accessing data in a table via an index is a very good thing to do when we want just the few records for a given index key from the thousands or millions of rows in the table. As an example, we are looking up all the records for someone and we know their first name, last name and date of birth – a very common occurrence in things like hospital or billing systems. So our PERSON table has an index on those three columns. Oracle will read the root block of the index, read the one relevant block in each of the branch levels for the key, find all of the leaf block entries for that key – and then collect the relevant rows from the table. Fig 1 shows how we think of this working. i.e. most of the records we want to find will be close together in the table.

Actually, a lot of people who just “use” the Oracle database as a store of information sort-of think this is how an index always works. It efficiently identifies the rows that you want and that is the end of it. If the index is on the value(s) you are looking up rows for (say LAST_NAME, FIRST_NAME, DATE_OF_BIRTH) the index is ideal and that is as good as it gets.

Fig 2 – a more usual, less efficient index range scan

But in reality, the index lookup is often far less efficient than this and is more like fig 2. Working down the index structure to get all of the required index entries is exactly the same, but the rows you want are scattered all over the table. Oracle has to fetch many table blocks to get your data, maybe as many blocks as there records to be found. This is far from efficient.

So what can you do about this? You already have the “perfect” index, on LAST_NAME, FIRST_NAME, DATE_OF_BIRTH, the values you are looking up. Maybe you could add another column to the index to avoid those situations where there are many people with the same name and date of birth. But you may not have that extra information or it is simply not possible to identify the values in the table any more accurately, you really do need all the rows scattered though that table for the given search key.

There are “architectural” things you can do such as create the table as an Index Organised Table (see my little set of blogs about them starting here). You can also use various methods to group the relevant rows together in the table. But all of those methods are Big Impact. You need to recreate the table or plan for this eventuality up-front when you design the system.

But there is a very specific, easy thing you can do to address this particular problem, for the SQL statement you need to speed up. You can add all the columns your query needs into the index. This is an Overloaded Index.

An Overloaded Index holds not only the table columns in the WHERE clause but all the columns needed from that table for the query.

Why does this work? Because when Oracle identifies the range of keys for the key (LAST_NAME, FIRST_NAME, DATE_OF_BIRTH) all the other columns it needs are also in those index leaf entries and there is no need to get the rows from the table. All those lookups to the table are avoided. Fig 3 at the end of this article demonstrates this.

However, I’ll give you a real world example I had recently. A client had a SELECT statement something like the below, with the execution plan shown, and it was running too slowly for the business requirements:

SELECT (SUM(NVL(T.TRAN_VALUE_CR,0))-SUM(NVL(T.TRAN_VALUE_DB,0))) ,
        COUNT(*)
FROM    ACCOUNTS     A ,
        TRANSACTIONS T
WHERE   A.ACC_XYZ_IND      =:3
AND     A.ACC_ACCOUNT_NO   =:1             
AND     A.ACC_SUBACC_NO    =:2
AND     T.TRAN_XYZ_IND     =A.ACC_XYZ_IND
AND     T.TRAN_ACCOUNT_NO  =A.ACC_ACCOUNT_NO
AND     T.TRAN_SUBACC_NO   =A.ACC_SUBACC_NO
AND     T.TRAN_P_IND       =:4 
AND     T.TRAN_DLM_DATE    >=TO_DATE(:5,'YYYYMMDD')

------------------------------------------------------------
| Operation                                 | Name         | 
------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |
|   1 |  SORT AGGREGATE                     |              |
|   2 |   NESTED LOOPS                      |              |
|   3 |    NESTED LOOPS                     |              |
|*  4 |     INDEX RANGE SCAN                | ACC_PRIME    |
|*  5 |      INDEX RANGE SCAN               | TRAN2_3      |
|*  6 |    TABLE ACCESS BY LOCAL INDEX ROWID| TRANSACTIONS |
------------------------------------------------------------
Statistics
----------------------------------------------------------
       4740  consistent gets
       3317  physical reads

The index used on TRANSACTIONS is:

INDEX_NAME                   TABLE_NAME       PSN COL_NAME
---------------------------- ---------------- --- --------------
TRAN2_3                      TRANSACTIONS     1   TRAN_ACCOUNT_NO
TRAN2_3                      TRANSACTIONS     2   TRAN_SUBACC_NO
TRAN2_3                      TRANSACTIONS     3   TRAN_DLM_DATE

The index TRAN2_3 on the TRANSACTION table that you can see being used in the plan was for all the columns being used in the WHERE clause that actually helped identify the TRANSACTION records required – TRAN_ACCOUNT_NO, TRAN_SUBACC_NO and TRAN_DLM_DATE (the TRAN_XYZ_IND and TRAN_P_IND were always the same so “pointless” to index).

I added a new index to the TRANSACTION table. I added a new index rather than change the existing index as we did not want to impact other code and we wanted to be able to drop this new index if there were any unexpected problems. I added all the columns on the TRANSACTION table that were in the SELECT list, were in the the WHERE clauses even though they did not help better identify the rows needed. If there had been TRANSACTION columns in an ORDER BY or windowing clause, I would have added them too. So my index looked like this:

create index TRAN2_FQ on TRANSACTIONS
(TRAN_ACCOUNT_NO   ,TRAN_SUBACC_NO   ,TRAN_DLM_DATE
,TRAN_P_IND        ,TRAN_XYZ_IND
,TRAN_VALUE_CR     ,TRAN_VALUE_DB)

It is very, very important that the new index holds every column from the TRANSACTION table that the query needs.To prevent accessing the table, all the data the query needs for that table must be in the index.

The query could now satisfy the query by just using the new index, as the below explain plan shows.

----------------------------------------------------
| Id  | Operation                      | Name      |
----------------------------------------------------
|   0 | SELECT STATEMENT               |           |
|   1 |  SORT AGGREGATE                |           |
|   2 |   NESTED LOOPS                 |           |
|*  3 |    INDEX RANGE SCAN            | ACC_PRIME |
|*  4 |     INDEX RANGE SCAN           | TRAN2_FQ  |
----------------------------------------------------
Statistics
----------------------------------------------------
         56  consistent gets
         52  physical reads

There is now no line in the plan for visiting the table TRANSACTIONS and we are using the new TRAN2_FQ index. The consistent gets and physical reads to satisfy the query have gone down from 4740 and 3317 respectively to 56 and 52. I think that is good enough.

Fig 3 shows what is happening. The new index is effectively a “mini IOT” designed to support the given query.

Fig 3 – Tables? Who needs tables!

There are of course a few caveats. The new index needs to be maintained, which is an overhead on all INSERT/UPDATE/MERGE/DELETE activity on the table. The index will only remove the need to visit the table for queries that are very, very similar to the one it is designed for – ones that use the same rows from the TRANSACTIONS table or a subset of them. If you alter the query you, e.g. select another column from the TRANSACTION table you would need to revisit this overloaded index.

Finally, be careful of modifying existing indexes to overload them to support specific queries. If the index is there to support referential integrity you need to think carefully about this and the modified index may be less efficient for other queries that used the original index (as adding columns to an index make it “wider”).

BTW if you think you recognise this from a recent Oracle Scene article then you would be right. I was pushed for time, I used something I had written before 🙂