jump to navigation

SQL Quiz – How To Multiply across Rows February 22, 2012

Posted by mwidlake in SQL.
Tags:
trackback

A colleague came to me a couple of days ago with a SQL problem. He had something like this:

@get_source

NAME          INPUT
------------- -----
GROUP_1       5
GROUP_2       3
GROUP_3       4
GROUP_4       7
GROUP_5       3

What he wanted to do was multiply all the inputs across the groups, to get the total number of possible permutations. ie 5*3*4*7*3. The product of all the INPUTS. This has to be in straight SQL. Easy! You just need to… Ahhh… Hmmm…

No, I just could not think of a way to do it that my colleague could use.

- There is no group-by function that gives a product of a column {that I know of}
- We could not use PL/SQL {for reasons I won’t go into}, so no cursor looping or passing in an array, which would make it simple
- Neither of us could think of an analytical function that could take the result of itself in the prior row (though I suspect there may be a way to do it).
- The number of groups could and would vary from a few to possibly a hundred, so the old tricks of converting rows to columns or nesting so many sql statements would not help.

So, I asked my friend – the queen of SQL Query, {Boneist} {Oh she of the trombone playing pastime}.

She came straight back with an answer. In case you want to try and work out an answer yourself before seeing the solution, below is a cat picture. The answer she came up with is below that:

The key to the solution is natural logs {ln}. I don’t know about you, but I learnt about using logs at school and have pretty much not used them since. In summary:

If x=3*5*9
then ln(x) = ln(3)+ln(5)+ln(9)
= 1.09861+1.60944+2.19722
= 4.90527

ie using log converts multiplication to addition. You then use EXP, the inverse of ln, to convert your added-up log value into your result.

exp(4.90527) = 135

{NB if you use my figures, exp(4.90527) realy equals 134.999355, as I have truncated the log values shown. Oracle does this far more accurately internally but be aware you might get some slight rounding errors).

So, what we can do is simply use the SQL GROUP function SUM to add together the natural logs of all the rows:

sum(ln(input))
{grouped by the whole statement, so no group by is needed in this case}

As an example:

-- show the expected result first
select 3*7*4*5*1 from dual;

 3*7*4*5*1
----------
       420


select min(name),max(name),count(name)
,EXP (SUM (LN (gr_sum))) gr_prod
from
(select 'group_1' name, 3 gr_sum from dual
 union
 select 'group_2' name, 7 gr_sum from dual
 union
 select 'group_3' name, 4 gr_sum from dual
 union
 select 'group_4' name, 5 gr_sum from dual
 union
 select 'group_5' name, 1 gr_sum from dual
)
/

MIN(NAM MAX(NAM COUNT(NAME)    GR_PROD
------- ------- ----------- ----------
group_1 group_5           5        420

As you can see, it works – even if when you first look at the formula your brains {if you are not a mathematician} try to leak out of your head. Just try and remember what your mathematics teacher said about log books and how, before calculators, they were used to speed up manual long multiplication tasks by converting the task into log addition.

If you want more information on logs, see this discussion about how they are actually about growth or wikipedia if you must :-).

Boneist actually pointed me to this very nice post about using logs in oracle by Anju Parashar, which I have borrowed from.

One issues to be aware of (which is highlighted in Anuj Parashar’s article) is that you can’t get a log of negative values, as a consequence Oracle will give you an ora-01428 error:

select ln(-3) from dual;
select ln(-3) from dual
*
ERROR at line 1:
ORA-01428: argument ‘-3′ is out of range

Anuj gives a version of code that works if all values are negative, below I have one that copes with any number of negatives. Basically, you convert all the values to be multiplied to positive values and then make it negative if the count of negative values is odd. Mathematically, the result of a multiplication can ONLY be negative if there are an odd number of negative values.

,EXP (SUM (LN (abs(gr_sum))))
*decode (mod(sum(decode(sign(gr_sum),0,0,1,0, 1)),2)
,0,1,-1) correct_gr_prod

I’m sure that the above expression could be simplified, but I have to go and do the day job.

Finally, here is a little set of test cases covering the above, so you can play with this.

mdw1123> select 3*7*4*5*1 from dual;

 3*7*4*5*1
----------
       420

1 row selected.

mdw1123> --
mdw1123> select 'group_1' name, 3 gr_sum from dual
  2  union
  3  select 'group_2' name, 7 gr_sum from dual
  4  union
  5  select 'group_3' name, 4 gr_sum from dual
  6  union
  7  select 'group_4' name, 5 gr_sum from dual
  8  union
  9  select 'group_5' name, 1 gr_sum from dual
 10  /

NAME        GR_SUM
------- ----------
group_1          3
group_2          7
group_3          4
group_4          5
group_5          1

5 rows selected.

mdw1123>
mdw1123> select min(name),max(name),count(name)
  2  ,EXP (SUM (LN (gr_sum))) gr_prod
  3  from
  4  (select 'group_1' name, 3 gr_sum from dual
  5   union
  6   select 'group_2' name, 7 gr_sum from dual
  7   union
  8   select 'group_3' name, 4 gr_sum from dual
  9   union
 10   select 'group_4' name, 5 gr_sum from dual
 11   union
 12   select 'group_5' name, 1 gr_sum from dual
 13  )
 14  /

MIN(NAM MAX(NAM COUNT(NAME)    GR_PROD
------- ------- ----------- ----------
group_1 group_5           5        420

1 row selected.

mdw1123> --
mdw1123> -- now with a negative
mdw1123> select 'group_1' name, 3 gr_sum from dual
  2   union
  3   select 'group_2' name, -7 gr_sum from dual
  4   union
  5   select 'group_3' name, 4 gr_sum from dual
  6   union
  7   select 'group_4' name, 5 gr_sum from dual
  8   union
  9   select 'group_5' name, 1 gr_sum from dual
 10  /

NAME        GR_SUM
------- ----------
group_1          3
group_2         -7
group_3          4
group_4          5
group_5          1

5 rows selected.

mdw1123> -- and if the values contain negatives
mdw1123> select min(name),max(name),count(name)
  2  ,EXP (SUM (LN (abs(gr_sum)))) gr_prod
  3  ,mod(sum(decode(sign(gr_sum),0,0
  4                          ,1,0
  5                          ,  1)
  6           ),2) -- 0 if even number of negatives, else 1
  7           modifier
  8  ,EXP (SUM (LN (abs(gr_sum))))
  9   *decode (mod(sum(decode(sign(gr_sum),0,0,1,0,     1)),2)
 10         ,0,1,-1) correct_gr_prod
 11  from
 12  (select 'group_1' name, 3 gr_sum from dual
 13   union
 14   select 'group_2' name, -7 gr_sum from dual
 15   union
 16   select 'group_3' name, 4 gr_sum from dual
 17   union
 18   select 'group_4' name, 5 gr_sum from dual
 19   union
 20   select 'group_5' name, 1 gr_sum from dual
 21  )
 22  /

MIN(NAM MAX(NAM COUNT(NAME)    GR_PROD   MODIFIER CORRECT_GR_PROD
------- ------- ----------- ---------- ---------- ---------------
group_1 group_5           5        420          1            -420

1 row selected.
About these ads

Comments»

1. Mike Cox - February 22, 2012

That’s very impressive, I wouldn’t have got close, even if you’d hinted that the answer revolved round logs.

Assuming that you had a known number of ‘groups’ or at least no more than N then you could do it with lag/lead in oracle or row preceeding on other platforms bundled into an inline query but the logs solution is probably only way of getting it to work where then number of rows to be processed is unknown, so its a lot cleaner.

mwidlake - February 22, 2012

Yes, one of my first thoughts was to use Lag and have eg 10 Lag functions looking back up to 10 records and have that as a subquery to a piece of SQL pulling out the the first none-zero column – which would work up until you needed the product of 11 rows.

2. Charles Hooper - February 22, 2012

I would never have thought to use logs to solve the problem – that is an interesting solution.

I coulple of months ago Dom Brooks shared an approach on my blog that could be adapted to solve this problem. Essentially, the solution builds a math formula, and then evaluates the math formula within the SQL statement. The original solution provided by Dom Brooks:
http://hoopercharles.wordpress.com/2011/07/13/how-many-ways-to-solve-this-problem-add-the-sequential-numbers-x-through-y/#comment-3647

Adapted to solve this problem:

select xmlquery((listagg(gr_sum,' * ' ) within group (order by gr_sum)) returning content).getnumberval() PRODUCT
FROM
  (select 'group_1' name, 3 gr_sum from dual  
   union 
   select 'group_2' name, 7 gr_sum from dual  
   union 
   select 'group_3' name, 4 gr_sum from dual  
   union 
   select 'group_4' name, 5 gr_sum from dual  
   union 
   select 'group_5' name, 1 gr_sum from dual  
  );
 
   PRODUCT
----------
       420
select xmlquery((listagg(gr_sum,' * ' ) within group (order by gr_sum)) returning content).getnumberval() PRODUCT
FROM
  (select 'group_1' name, 3 gr_sum from dual  
   union 
   select 'group_2' name, -7 gr_sum from dual  
   union 
   select 'group_3' name, 4 gr_sum from dual  
   union 
   select 'group_4' name, 5 gr_sum from dual  
   union 
   select 'group_5' name, 1 gr_sum from dual  
  );
 
   PRODUCT
----------
      -420
mwidlake - February 22, 2012

That’s pretty cool too. I’ll have to play with that.

Your/Dom’s solution will make more sense to someone reading the code, I think

3. Rob van Wijk - February 22, 2012

Another alternative is to use the SQL model clause.

{code}SQL> with data as
2 ( select ‘group_1′ name, 3 gr_sum from dual union all
3 select ‘group_2′ name, 7 gr_sum from dual union all
4 select ‘group_3′ name, 4 gr_sum from dual union all
5 select ‘group_4′ name, 5 gr_sum from dual union all
6 select ‘group_5′ name, 3 gr_sum from dual
7 )
8 , including_product as
9 ( select name
10 , product
11 from data
12 model
13 return updated rows
14 dimension by (rownum r)
15 measures (name,gr_sum,1 product)
16 ( product[any] order by r = nvl(product[cv()-1],1) * gr_sum[cv()]
17 )
18 )
19 select min(name) min_group
20 , max(name) max_group
21 , count(*) count_group
22 , max(product) product
23 from including_product
24 /

MIN_GRO MAX_GRO COUNT_GROUP PRODUCT
——- ——- ———– ———-
group_1 group_5 5 1260

1 row selected.{code}

mwidlake - February 22, 2012

Hi Rob,

Thanks for that, I’m really happy that someone has provided a solution based on model.

I have to confess that my colleague and I briefly touched on using model but backed away from the idea as we have both only ever used it in the most simple way, and a few years ago when it first appeared.

We were not brave enough to try it!

4. Brian Tkatch - February 23, 2012

If the records come from different TABLEs, of course, a cartesian join does it for you:

SELECT COUNT(*) FROM
(SELECT 1 FROM (select ‘group_1′ name, 3 gr_sum from dual) CONNECT BY LEVEL <= gr_sum),
(SELECT 1 FROM (select 'group_2' name, 7 gr_sum from dual) CONNECT BY LEVEL <= gr_sum),
(SELECT 1 FROM (select 'group_3' name, 4 gr_sum from dual) CONNECT BY LEVEL <= gr_sum),
(SELECT 1 FROM (select 'group_4' name, 5 gr_sum from dual) CONNECT BY LEVEL <= gr_sum),
(SELECT 1 FROM (select 'group_5' name, 1 gr_sum from dual) CONNECT BY LEVEL <= gr_sum);

Of course, if you're not thinking, you might come up with the following and think it actually works. :)

WITH
Data
AS
(
select 'group_1' name, 3 gr_sum from dual union
select 'group_2' name, 7 gr_sum from dual union
select 'group_3' name, 4 gr_sum from dual union
select 'group_4' name, 5 gr_sum from dual union
select 'group_5' name, 1 gr_sum from dual
),
Silliness
AS
(
SELECT
COUNT(*) x
FROM
Data
CONNECT BY
LEVEL < gr_sum
)
SELECT
MIN(x)
- COUNT(*)
- SUM(gr_sum) "Really?"
FROM
Silliness,
Data;

mwidlake - February 23, 2012

Thanks Brian – how would you cope with there being a varying number of tables to get the information from?

Brian Tkatch - February 23, 2012

How would i cope? Don’t know. I was just mentioning the “obvious” answer, as joins without WHERE clauses is (logically) the starting point of any query with multiple TABLEs.

5. Bjørn - February 23, 2012

I must be an old, bitter man, since I learned math in highschool, I learned math at Uni, AND I remember lots of it. Upon reading the question, I immediately typed

$ sqlplus scott/tiger
SQL> select exp(sum(ln(deptno))) from dept;

Then again, I also own a selection of some 20 slide rules that I also remember how to use, and occasionally will use.

/Bjørn.

mwidlake - February 23, 2012

Hi Bjorn,

Good for you :-) I have never been shown how to use a slide rule.

I do take a perverse pleasure in using Oracle on half a million pounds worth of equipment as a simple calculator, rather than get the little Casio out of my top drawer

6. John Howard - February 23, 2012

Nice solutions. How about the user-defined aggregate function (possibly ruled out by “No PL/SQL”?)

SQL>create or replace type aggr_product as object
2 ( nb number,
3 static function ODCIAggregateInitialize(sctx IN OUT aggr_product) return number,
4 member function ODCIAggregateIterate(self IN OUT aggr_product, value IN NUMBER) return number,
5 member function ODCIAggregateTerminate(self IN aggr_product, returnValue OUT number, flags IN number) return number,
6 member function ODCIAggregateMerge(self IN OUT aggr_product, ctx2 IN aggr_product) return number
7 )
8 /

Type created.

SQL>create or replace type body aggr_product is
2 static function ODCIAggregateInitialize(sctx IN OUT aggr_product) return number is
3 begin
4 sctx := aggr_product(1);
5 return ODCIConst.Success;
6 end;
7
8 member function ODCIAggregateIterate(self IN OUT aggr_product, value IN NUMBER) return number is
9 begin
10 self.nb:=self.nb*value;
11 return ODCIConst.Success;
12 end;
13
14 member function ODCIAggregateTerminate(self IN aggr_product, returnValue OUT number, flags IN number) return number is
15 begin
16 returnValue := self.nb;
17 return ODCIConst.Success;
18 end;
19
20 member function ODCIAggregateMerge(self IN OUT aggr_product, ctx2 IN aggr_product) return number is
21 begin
22 self.nb := ctx2.nb;
23 return ODCIConst.Success;
24 end;
25 end;
26 /

Type body created.

SQL>CREATE OR REPLACE FUNCTION product (input number) RETURN number
2 PARALLEL_ENABLE AGGREGATE USING aggr_product;
3 /

Function created.

SQL>SELECT MIN(NAME),MAX(NAME),count(NAME)
2 ,product(gr_sum) gr_prod
3 from
4 (select 'group_1' name, 3 gr_sum from dual
5 union
6 select 'group_2' name, 7 gr_sum from dual
7 union
8 select 'group_3' name, 4 gr_sum from dual
9 union
10 select 'group_4' name, 5 gr_sum from dual
11 union
12 select 'group_5' name, 1 gr_sum from dual
13 )
14 /

MIN(NAM MAX(NAM COUNT(NAME) GR_PROD
------- ------- ----------- ----------
group_1 group_5 5 420

Mark D - March 3, 2012

I have to admit, the natural log solution is pretty darn cool, but as far as practical solutions go, John Howard’s is the approach I’d go with. However I’m fairly certain that line 22 should be:

self.nb := self.nb * ctx2.nb;

The ODCIAggregateMerge call is for when the SQL engine was working on two groups of data separately (during parallel execution, for example) and now wants to combine one result set with another. That would require multiplying self with the value being merged (ctx2 in this case).

mwidlake - March 3, 2012

Thanks Mark. Yeah, the natural log method really made me smile, but either John’s or Dom’s approach are probably the best to consider applying on a client site. Whatever you put in place, there has to be a reasonable chance that an average developer type is going to understand it after the event and the log stuff is just a bit too mathematical. Very cool but, unless I searched my own blog, in 5 years time I would probably just go “wow! How does THAT work!”

mwidlake - March 3, 2012

Dang, missed this for several days. Yes, nice answer but not allowed for my client as it used PL/SQL. Plus, I have to say, my personal PL/SQL skills are below me really understanding all of this. I need to brush up on my PL/SQL!!!! Many hanks for the input.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 156 other followers

%d bloggers like this: