An Oracle server – How Fast for £1,000 July 27, 2010

Posted by mwidlake in Architecture, performance, Testing.

Question: how fast an Oracle server can you create for £1,000?

{I’d really appreciate feedback and suggestions on this particular post}

The power of domestic PCs continues to grow, with four-core chips becoming pretty much standard and starting RAM looking more like 4GB than 2GB, with 8GB quite reasonable. So, how quick an Oracle server can you make based on a domestic PC? After all, those of us who play with Oracle in our spare time tend to use such machines and, in fact, they are often not far off what our smaller servers at the office really are. When I worked at the Wellcome Trust Sanger Institute, we had to make our IT budget pounds go a long way. We were, after all, a charity with a limited budget but also a scientific organisation with a huge demand for data and processing. So we used a lot of cheap kit.

I’m seriously thinking of giving this a go. I need a new PC anyway and so I am willing to use it, at least initially, to see what can be done.

If I do this, I’m going to need to set some boundaries on the exercise. How about:

  • The Oracle licence is being ignored in the cost {and please, I don’t need to be told how the licence can be more than the hardware costs!}. OS cost is included, though.
  • I am not aiming for enterprise-level resilience, so I am not going to consider hot-swappable components, dual redundant power supplies or things like that.
  • I am going to use new kit, so no scavenging or buying second-hand. It must all be easily available and repeatable.
  • I will use local storage in the server or connected to ports available on the server.
  • It will support a database of 1TB in size {yet to be designed}.
  • Oracle v11, Enterprise Edition, but nothing special like TimesTen or Exadata (unless Oracle are willing to sell me an Exadata box for a grand, in which case I’ll consider it).
  • I’m not considering backup and recovery performance {and this would be a serious oversight if this was a real system, but most places have central backup/recovery facilities}.

I would also have a few other things to decide.

The main one is “Do I use Linux or Windows?” Yes, you are all probably shouting “Linux!!!” but I have never been a Linux sys admin (I was an incredibly poor HP-UX sys admin for 3 months though) so it will take me more time to deal with issues under Linux – in work situations I have always had access to people who know all this stuff to sort out issues, but in this case I will be doing it on my own. On the other hand, you can just chuck Oracle on a standard Windows box and it works, and as a rule hardware just works under Windows. If I decide to use USB3 ports, for example, is it going to be a major pain getting drivers under Linux? But then, if I want the fastest Oracle box under a grand, why would I slow it down with Windows and spend money on the licence? I just want the box to run Oracle and a workload.

The second “software” decision is: how do I measure performance? I think I could be getting to grips with Dom Giles’ excellent Swingbench {BTW, nice tag line on that page, Dom :-)}. But it runs on Java and, guess what boys and girls? I’ve never been a Java developer. How limited are my skills! So that would take up some of my precious spare time too.

I’d love feedback on this, I’d love to know what hardware suggestions you would make, what you think about the overall idea, what else I need to consider to make the tests valid… I have a few ideas already for the hardware architecture and the intention would be to try lots of things but I’ll save that for a second post. After all, if I get no feedback I might just spend the money on a gaming machine and a week’s walking in the Lake District instead.

And if anyone wants to help with the cost, please send cheques to….

Team Work & The Science of Slacking July 23, 2010

Posted by mwidlake in Friday Philosophy, Management, Perceptions.

We all know that working in a team is more efficient than working on your own (and I did say a week or two back how I was enjoying the rare privilege of working in a team of performance guys). Many of us also know about team dynamics and creating a balanced team of ideas people, completer-finishers, implementers, strategists and so forth. Those of us who have been exposed to training courses or books on team management know all these good things about teams and how we are supposed to get the most out of them.

How many of us, though, have been introduced to the work of the French agronomist Max Ringelmann and the aspect of teams named after him, the Ringelmann Effect? In summary, the Ringelmann Effect proposes that people in teams try less hard than they do when working alone. Especially if they think no one is watching them.

Back at the start of the 20th century Ringelmann tested out his ideas using a tug-of-war experiment. He would get people to pull on a rope as hard as they could and record their efforts using a strain gauge. Then he would get them to pull on the rope as part of a team, from 2 to 8 people. As soon as people were part of a team, they pulled less hard. With two people in the team, each pulled 93% as hard as on their own, with three people this dropped down to 85% and with 4 it was just 77%. By the time there were 8 people in the team, effort was down to 50%.

This idea of shirking work more and more as the team increases in size became established in modern psychology and was given Mr Ringelmann’s name. Psychologists explain that when someone is part of a group effort then the outcome is not solely down to the individual and, as such, is not totally in their control. This acts as a demotivating factor and the person tries that little bit less hard. The larger the team, the greater the demotivation and the more significant the drop in effort. Ringelmann found that effort was down to 50% in a team of just 8, so how bad can the impact of a really large team be? I think most of us have at least witnessed, and quite possibly been in, the position of feeling like just a cog in a massive corporate team machine. Thoroughly demotivating (though, of course, we all of us still tried as hard as we could, didn’t we?).

The effect is also known under the far more entertaining title of Social Loafing.

Monsieur Ringelmann was far kinder at the time and pointed out that these chaps pulling on the rope could well have been suffering from a lack of synergy. They had not been trained together to pull as a team so that could account for the drop in effort, they were not synchronising their effort.

However, in the 1970s Alan Ingham at Washington University revisited Ringelmann’s work, and he was far sneakier. Sorry, he was a more rigorous scientist. He used stooges in his team of rope-pullers, blindfolded the participants, and put the one poor person pulling for real at the front of the team pulling the rope. Thus he could record the effort of the individual. Ingham found that there was indeed a drop in efficiency due to the team not pulling as one. But sadly, this was not the main factor. It remained that the drop in effort was mostly down to the perceived size of the rest of the team. The bottom line was shown to be the human capacity to try less hard when part of a team, with the drop in effort directly proportional to the size of the team.

We are of course not immune to this effect in the IT world, and a couple of people have even gone to the effort of checking that out: James Suleiman and Richard T Watson.

It seems the ways to reduce this problem are:-

  • Don’t give people boring jobs.
  • Don’t give the same job to several people and let them know they all have the same job.
  • Ask people how they are getting on and give them mini-goals along the way.
  • Actually reward them for success. Like saying “thank you” and NOT giving them yet another boring, hard job to do because they did the last one so well.

I think it is also a good argument for keeping teams small {I personally think 5 or 6 people is ideal} and split up large projects such that a single team can cope. Then give tasks to individuals or pairs of people.

If you like this sort of thing you might want to check out one of my first blog posts (though it is more an angry rant than a true discussion of the topic), which was on the Dunning-Kruger effect, where some people are unaware of their own limitations – though I did not know it was called the Dunning-Kruger effect until others told me, which only goes to show that maybe I am not aware of my own limits… Read the comments or click through to the links from there to get a better description of some people’s inability to gauge their own inabilities.

My laptop has a Bug July 20, 2010

Posted by mwidlake in biology.

My laptop is suffering from bugs, and I’m not talking software.

It is warm and sunny here in the Southeast of England, which is not always the case during the British Summer, and I am suffering an invasion of little insects. Specifically Thrips, or Thunderbugs. They are called Thunderbugs as they are supposed to appear in numbers when a thunderstorm is brewing. Like most Old Wives’ Tales it is utter rubbish. But kind of true too…

If you do not know, a thrip is usually a small insect about 0.15mm wide and maybe 0.4mm long. So small, but visible. About the size of this:

,

Yep, a comma on an average LCD panel. And that is where the problem is. One has got into my laptop and under my screen and it is sure to die. It is currently scurrying around at the far left of the screen and I’m considering a mercy killing before it wanders further across the screen into prime acreage. I had this before on my old laptop. In that case it died in the middle of the screen and for ever more looked suspiciously like a comma, or single ‘quote’, causing me confusion when it fell on top of emails, Word documents and…. code. It really was a pain when it came to code. Even now, if I use that old machine it sometimes catches me out. It can merge with a letter in new and exciting ways, to subtly change a word or command.

I’m obviously not alone, a quick web search threw up some other people complaining of the same issue.

And of course it is common knowledge that “bugs” in computing really did start out as insects getting fried in the electronics and valves of the very first machines in the mid-20th century. I wonder if that is really true or just another old myth? James Higgins seems to think it is real, and who am I to doubt him. He has a photo of the evidence after all.

More Memory Meanderings – IOPS and Form Factors July 19, 2010

Posted by mwidlake in Architecture, Management, performance.

I had a few comments when I posted on solid state memory last week and I also had a couple of interesting email discussions with people.

I seriously failed to make much of one of the key advantages of solid-state storage over disk storage, which is the far greater capacity for input/output operations per second (IOPS), a point which was picked up by Neil Chandler. Like many people, I have had discussions with the storage guys about why I think the storage is terribly slow and they think it is fast. They look at the total throughput from the storage to the server and tell me it is fine. It is not great, they say, but it is {let’s say for this example} passing 440MB a second over to the server. That is respectable and I should stop complaining.

The problem is, they are just looking at throughput, which seems to be the main metric they are concerned about after acreage. This is probably not really their fault, it is the way the vendors approach things too. However, my database is just concerned with creating, fetching and altering records, and it does so as input/output operations. Let us say a disk can manage 80 IOPS (which allows an average of 12.5ms to both seek to the record and read the data; even many modern 7,200rpm discs struggle to average less than 12ms seek time). We have 130 disks in this example storage array and there is no overhead from any sort of RAID or any bottleneck in passing the data back to the server. {This is of course utterly unbelievable, but if I have been a little harsh in not stating the discs can manage 8ms seek time, ignoring the RAID/HBA/network cost covers that}. Each disc is a “small” one of 500GB. They bought cheap disk to give us as many MB/£ as they could {10,000 and 15,000rpm disks will manage 120 and 160 IOPS but cost more per MB}.

Four sessions on my theoretical database are doing full table scans, 1MB of data per IO {Oracle’s usual maximum on 10.2}, each session receiving 100MB of data a second, so 400MB in total. Five discs {5 * 80 IOPS * 1MB} could supply that throughput. It is a perfect database world and there are no blocks in the cache already for these scans to interrupt the multi-block reads.

However, my system is primarily an OLTP system and the other IO is records being read via index lookups and single block reads or writes.

Each IO reads the minimum unit for the database, which is a block. A block here is 4k. Oracle can’t read just part of a block.

Thus the 40MB of other data being transferred from (or to) the storage is single block reads of 4k. 10,000 of them. I will need 10,000/80 disks to support that level of IO. That is 125 discs, running flat out.

So, I am using all my 130 discs and 96% of them are serving 40MB of requests and 4% are serving 400MB of requests. As you can see, as an OLTP database I do not care about acreage or throughput. I want IOPS. I need all those spindles to give me the IOPS I need.

What does the 40MB of requests actually equate to? Let us say our indexes are small and efficient and have a height of 3 (b-level of 2), so a root node, one level of branch nodes and then the leaf nodes. To get a row you need to read the root node, a branch node, a leaf node and then the table block. 4 IOs. So those 10,000 IOPS are allowing us to read or write 10,000/4 records a second, or 2,500 records.
You can read 2,500 records a second.

Sounds a lot? Well, let us say you are pulling up customer records onto a screen and the main page pulls data from 3 main tables (customer, address, account_summary) and translates 6 fields via lookups. I’ll be kind and say the lookups are tiny and each costs just one record fetch. So that is 9 record fetches for the customer screen, and if our 40MB of OLTP IO was all for looking up customers then you could show just under 280 customers a second, across all users of your database. If you want to pull up the first screen of the orders summary, each screen record is derived from 2 underlying main tables and again half a dozen lookups, but now with 10 records per summary page – that is 80 record fetches for the page. Looking at a customer and their order summary, you are down to under thirty a second across your whole organisation, doing nothing else.

You get the idea. 2,500 records a second is tiny. Especially as those 130 500GB disks give you 65TB of space to host your database on. Yes, it is potentially a big database.
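To pull the arithmetic together, here is a minimal PL/SQL sketch of the sums above (run with serveroutput on; the figures – 10,000 single-block IOPS left over, 4 IOs per indexed record, 9 and 80 record fetches per screen, the 99% hit ratio discussed below – are this post’s illustrative assumptions, not universal constants):

declare
  oltp_iops    constant number := 10000; -- single-block IOPS left after the scans
  ios_per_rec  constant number := 4;     -- root + branch + leaf + table block
  cust_recs    constant number := 9;     -- customer screen: 3 tables + 6 lookups
  ord_recs     constant number := 80;    -- orders page: (2 tables + 6 lookups) * 10 rows
  hit_ratio    constant number := 0.99;  -- buffer cache hit ratio (see below)
  recs_per_sec number;
begin
  recs_per_sec := oltp_iops / ios_per_rec;  -- 2,500 records a second
  dbms_output.put_line('records a second      : '||recs_per_sec);
  dbms_output.put_line('customer screens a sec: '||trunc(recs_per_sec/cust_recs));
  dbms_output.put_line('cust+orders a second  : '||trunc(recs_per_sec/(cust_recs+ord_recs)));
  -- at a 99% hit ratio each physical IO supports 100 logical IOs
  dbms_output.put_line('logical records a sec : '||trunc(recs_per_sec/(1-hit_ratio)));
end;
/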

The only way any of this works is due to the buffer cache. If you have a very healthy buffer cache hit ratio of 99% then you can see that your 2,500 records a second of physical IO coming in and out of the storage sub-system is actually supporting 250,000 logical-and-physical IOPS. {And in reality, many sites buffer at the application layer too}.

Using Solid State Storage would potentially give you a huge boost in performance for your OLTP system, even if the new technology was used to simply replicate disk storage.

I think you can tell that storage vendors are very aware of this issue, as seek time and IOPS are not metrics that tend to jump out of the literature for disk storage. In fact, often they are not mentioned at all. I have just been looking at some modern sales literature and white papers on storage from a couple of vendors and they do not even mention IOPS – but they happily quote acreage and maximum transfer rates. That is, until you get to information on Solid State Discs. Now, because the vendor can say good things about the situation, the information is there. On one HP white paper the figures given are:

                        Modern super-fast     Top-end
                        SAS disk drive        Solid State Disk
Sustained write         150MB/s               180MB/s
Sustained read          90MB/s                180MB/s
Random write (IOPS)     285                   5,000+
Random read (IOPS)      340                   20,000+

More and more these days, as a DBA you do not need or want to state your storage requirements in terms of acreage or maximum throughput – you will get those for free, so long as you state your IOPS requirements. Just say “I need 5,000 IOPS” and let the storage expert find the cheapest, smallest disks they can to provide it. You will have TBs of space.
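If you are wondering what IOPS figure to quote, a rough way to measure your current system (a sketch, assuming 10g or later, where v$sysstat carries these total-IO-request statistics) is to sample the request counters twice and divide the difference by the elapsed seconds:

-- snapshot the counters, wait a known interval, snapshot again;
-- (second snapshot - first) / elapsed seconds = your average IOPS
select name, value
from   v$sysstat
where  name in ('physical read total IO requests'
               ,'physical write total IO requests');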

With solid-state storage you would not need to over-specify storage acreage to get the IOPS, and this is why I said last week that you do not need solid state storage to match the capacity of current disks for this storage to take over. We would be back to the old situation where you buy so many cheap, small units to get the volume, IOPS are almost an accidental by-product. With 1GB discs you were always getting a bulk-buy discount :-)

I said that SSD would boost performance even if you used the technology to replicate the current disk storage. By this I mean that you get a chunk of solid-state disk with a SATA or SAS interface in a 3.5 inch format block and plug it in where a physical disk was plugged in, still sending chunks of 4k or 8k over the network to the block buffer cache. But does Oracle want to stick with the current block paradigm for requesting information and holding data in the block buffer cache? After all, why pass over and hold in memory a block of data when all the user wanted was a specific record? It might be better to hold specific records. I suspect that Oracle will stick with the block-based structure for a while yet as it is so established and key to the kernel, but I would not be at all surprised if something is being developed with Exadata in mind where data sets/records are buffered, and this could be used for data coming from solid state memory. A second cache which, if you are using Exadata or solid-state memory, holds single records. {I might come back to this in a later blog, this one is already getting bloated}.

This leads on to the physical side of solid-state discs. They currently conform to the 3.5” or 2.5” hard disc form factor but there is no need for them to do so. One friend commented that, with USB memory sticks, you could stick a female port on the back of a memory stick and a joint and you could just daisy-chain the USB sticks into each other, as a long snake. And then decorate your desk with them. Your storage could be looped around the ceiling as bunting. Being serious, though, with solid state storage you could have racks or rows of chips anywhere in the server box. In something like a laptop the storage could be an array 2mm high across the bottom of the chassis. For the server room you could have a 1U “server” and inside it a forest of chips mounted vertically, like row after row of teeth, with a simple fan at front and back to cool the teeth (if needed at all). And, as I said last time, with solid state being so much smaller and no need to keep to the old hard disk format, you could squeeze a hell of a lot of storage into a standard server box.

If you pulled the storage locally into your server, you would be back to the world of localised storage, but then LANs and WANs are so much faster now that if you had 10TB of storage local to your server, you could probably share it with other machines in the network relatively easily and yet have it available to the local server with as many and as fat a set of internal interfaces as you could get your provider to manage.

I’m going to, at long last, wrap up this current instalment on my thoughts with a business one. I am convinced that soon solid-state storage is going to be so far superior a proposition to traditional disks that demand will explode. And so it won’t get cheaper. I’m wondering if manufacturers will hit a point where they can sell as much as they can easily make and so hold the price higher. After all, what was the argument for Compact Discs costing twice as much as old cassette tapes, even when they had been available for 5 years? What you can get away with charging for it.

How often is v$sys_time_model updated? July 14, 2010

Posted by mwidlake in internals, performance.

I think this posting might go down as one of my more pointless contributions to the Oracle knowledge sphere :-)

I was looking at V$SYS_TIME_MODEL and V$SESS_TIME_MODEL and I just happened to run “select * from V$SYS_TIME_MODEL” several times in very quick succession. And I noticed the values for the various counters stayed the same between a couple of the runs.

“Hmmm, interesting” I thought “The values are only flushed down to the view ‘periodically’. I wonder how periodically?”… and thus I wasted a lunch time.

I used the below SQL*Plus/PL/SQL script to investigate the refreshing of v$sess_time_model. Yes, I know the title says v$sys_time_model, but the numbers are smaller and easier to check for the session version of the view and the two are virtually the same, so I can bet on the results being very similar. This is my script (and it is on 10.2.0.3 on Linux):

--test_vstm_upd
-- how often is v$sess_time_model updated
set trims on
set term off
set serveroutput on size unli
spool test_vstm_upd 
begin
  for a in 1..1000  loop
    for vrec in 
      (select rpad(rpad(to_char(a),4)||' '||to_char(systimestamp,'ss.ff')||' '|| stat_name||' ',45)||value rec
      from v$sess_time_model
-- replace with your current session ID
      where sid=1989
      and stat_id in (3649082374,2748282437,2821698184,2643905994)
      )
    loop
      dbms_output.put_line(vrec.rec);
    end loop;
    dbms_output.put_line(chr(9));
  end loop;
end;
/
set term on
spool off

As you can see, it simply loops around selecting four of the values from v$sess_time_model, including the loop counter and the current timestamp. The timestamp is evaluated afresh for each executed SQL statement.

Here is the output for the first three iterations:

1    53.389576 DB time                       475860419
1    53.389576 DB CPU                        402642660
1    53.389576 sql execute elapsed time      209780319
1    53.389576 PL/SQL execution elapsed time 52290858
	
2    53.408944 DB time                       475860419
2    53.408944 DB CPU                        402642660
2    53.408944 sql execute elapsed time      209780319
2    53.408944 PL/SQL execution elapsed time 52290858
	
3    53.429159 DB time                       475860419
3    53.429159 DB CPU                        402642660
3    53.429159 sql execute elapsed time      209780319
3    53.429159 PL/SQL execution elapsed time 52290858

As you can see, the timestamp is increasing by 2/100s of a second or so per loop, which is not as quick as I hoped, but it is a test box. Note that the counters for DB time, DB CPU, SQL execute elapsed time and PL/SQL execution elapsed time are constant.

A few iterations later we see the v$sess_time_model counters increment:

7    53.509351 DB time                       475860419
7    53.509351 DB CPU                        402642660
7    53.509351 sql execute elapsed time      209780319
7    53.509351 PL/SQL execution elapsed time 52291610
--all change!	
8    53.531378 DB time                       475871716
8    53.531378 DB CPU                        402653957
8    53.531378 sql execute elapsed time      209786745
8    53.531378 PL/SQL execution elapsed time 52292793
-- and stable	
9    53.555889 DB time                       475871716
9    53.555889 DB CPU                        402653957
9    53.555889 sql execute elapsed time      209786745
9    53.555889 PL/SQL execution elapsed time 52292793

The counters all increment between iteration 7 and 8 and then stay the same. I can’t tell how long the counters had been the same, I need to wait and see when they change again. How long until they increment again? Well, not very long, in fact just around 0.12 seconds:

14   53.650154 DB time                       475871716
14   53.650154 DB CPU                        402653957
14   53.650154 sql execute elapsed time      209786745
14   53.650154 PL/SQL execution elapsed time 52293064
-- change	
15   53.670358 DB time                       475881268
15   53.670358 DB CPU                        402663509
15   53.670358 sql execute elapsed time      209792803
15   53.670358 PL/SQL execution elapsed time 52294180
-- still changing	
16   53.689011 DB time                       475887530
16   53.689011 DB CPU                        402669771
16   53.689011 sql execute elapsed time      209794387
16   53.689011 PL/SQL execution elapsed time 52294180
-- and still changing	
17   53.710875 DB time                       475889549
17   53.710875 DB CPU                        402671790
17   53.710875 sql execute elapsed time      209796393
17   53.710875 PL/SQL execution elapsed time 52295342
-- ...still changing...
18   53.728168 DB time                       475893032
18   53.728168 DB CPU                        402675273
18   53.728168 sql execute elapsed time      209797665
18   53.728168 PL/SQL execution elapsed time 52295342
-- and stable	
19   53.744725 DB time                       475893032
19   53.744725 DB CPU                        402675273
19   53.744725 sql execute elapsed time      209797665
19   53.744725 PL/SQL execution elapsed time 52295342

This time, the increment occurs over several iterations of the loop before becoming stable again. All four values I am pulling out increment over these iterations.

The next increment comes four iterations or 0.1 seconds later and happens swiftly, between two iterations:

22   53.802486 DB time                       475893032
22   53.802486 DB CPU                        402675273
22   53.802486 sql execute elapsed time      209797665
22   53.802486 PL/SQL execution elapsed time 52295342
-- change	
23   53.822231 DB time                       475897963
23   53.822231 DB CPU                        402680204
23   53.822231 sql execute elapsed time      209800369
23   53.822231 PL/SQL execution elapsed time 52296904
-- stable	
24   53.840085 DB time                       475905724
24   53.840085 DB CPU                        402687965
24   53.840085 sql execute elapsed time      209803330
24   53.840085 PL/SQL execution elapsed time 52296904

So it seems that v$sess_time_model is incremented in steps, not constantly, and does so every 0.10 to 0.13 seconds or so. My work here is done.

Or is it?

No, it is not, as there is now a “massive” gap where the counters do not increment for almost 3/4 of a second, until iteration 127:

126  55.530398 DB time                       475905724
126  55.530398 DB CPU                        402687965
126  55.530398 sql execute elapsed time      209803775
126  55.530398 PL/SQL execution elapsed time 52297583
-- change	
127  55.545085 DB time                       475914013
127  55.545085 DB CPU                        402696254
127  55.545085 sql execute elapsed time      209809518
127  55.545085 PL/SQL execution elapsed time 52298886
-- still changing	
128  55.560141 DB time                       475921342
128  55.560141 DB CPU                        402703583
128  55.560141 sql execute elapsed time      209812345
128  55.560141 PL/SQL execution elapsed time 52299359
-- still changing	
129  55.574806 DB time                       475922705
129  55.574806 DB CPU                        402704946
129  55.574806 sql execute elapsed time      209812345
129  55.574806 PL/SQL execution elapsed time 52299359
-- stable	
130  55.589541 DB time                       475922705
130  55.589541 DB CPU                        402704946
130  55.589541 sql execute elapsed time      209812345
130  55.589541 PL/SQL execution elapsed time 52299359 

Again, the incrementing ran over a small number of iterations of my loop.

I think I have shown that all the values increment together so I will reduce my output to just the one counter and see when it increments and over how many iterations and see if a pattern appears:

25   53.860550 DB time                       475905724
-- 53.84 to 55.54, 1.7 seconds of stability
126  55.530398 DB time                       475905724
127  55.545085 DB time                       475914013
128  55.560141 DB time                       475921342
129  55.574806 DB time                       475922705
136  55.682402 DB time                       475922705
137  55.697191 DB time                       475956738
138  55.712266 DB time                       475969859
139  55.727820 DB time                       475974350
140  55.743315 DB time                       475982356
141  55.758749 DB time                       475994069
142  55.773602 DB time                       476004596
143  55.788472 DB time                       476004596
144  55.803295 DB time                       476007541
145  55.818136 DB time                       476011172
146  55.832886 DB time                       476020336
147  55.847772 DB time                       476025376
148  55.865303 DB time                       476036347
-- incremented with a couple of brief pauses over 0.34 seconds
149  55.881480 DB time                       476041481
150  55.896735 DB time                       476041481
...
200  56.664783 DB time                       476041481
-- 55.88 to 56.67 0.8 seconds of stability
201  56.679455 DB time                       476049162
-- increments over two iterations, 0.03 seconds
202  56.694092 DB time                       476052385
203  56.708733 DB time                       476052385
...
261  57.566902 DB time                       476052385
-- 56.69 to 57.59 0.9 seconds of stability
262  57.581582 DB time                       476052842
263  57.596218 DB time                       476058537
...
270  57.700212 DB time                       476058537
271  57.715371 DB time                       476060552
272  57.730797 DB time                       476063551
273  57.745700 DB time                       476074383
274  57.760351 DB time                       476079741
...
279  57.835162 DB time                       476079741
280  57.849966 DB time                       476080090
281  57.864782 DB time                       476090799
282  57.879446 DB time                       476100404
283  57.894553 DB time                       476103222
-- stable again after 0.3 seconds and a couple of mini-pauses
284  57.910592 DB time                       476103222
...
335  58.677438 DB time                       476103222
-- 57.91 to 58.69 0.8 seconds of stability
336  58.694704 DB time                       476113168
337  58.709995 DB time                       476113909
338  58.724782 DB time                       476119452
339  58.740756 DB time                       476119795
340  58.758659 DB time                       476129752
341  58.776040 DB time                       476132036
...
345  58.854895 DB time                       476132036
346  58.869516 DB time                       476138982
347  58.884100 DB time                       476145880
348  58.898772 DB time                       476160301
349  58.913401 DB time                       476178139
350  58.935391 DB time                       476183281
-- stable again after 0.27 seconds
351  58.955195 DB time                       476183281
...
395  59.608368 DB time                       476183281
-- 58.93 to 59.60, 0.68 seconds of stability
396  59.623062 DB time                       476187188
...
402  59.713566 DB time                       476187188
403  59.728220 DB time                       476194591
404  59.742900 DB time                       476204006
405  59.757544 DB time                       476210666
406  59.774934 DB time                       476216338
407  59.796595 DB time                       476228874
...
413  59.890172 DB time                       476228874
414  59.908436 DB time                       476238680
415  59.923166 DB time                       476251316
416  59.937805 DB time                       476259466
417  59.952540 DB time                       476261228
418  59.967215 DB time                       476277094
419  59.981914 DB time                       476282108
-- stable again after 0.29 seconds
420  00.000358 DB time                       476298216
...
529  01.684500 DB time                       476298216
-- 00.00 to 01.69 1.69 seconds of stability
530  01.699165 DB time                       476301888
531  01.714307 DB time                       476312510

I would say that we can draw a few conclusions from the above:

  • It is dangerous to look at a little bit of data and draw a firm conclusion, as I nearly did
  • The data in v$sess_time_model is only maintained in near real-time, not real time
  • The counters in v$sess_time_model increment together
  • The counters seem to increment in a slightly messy way over part of a second and then are stable for 3/4 of a second to a second or two

I wonder how many of you went “Oh dear” when I said I could derive what is true for v$sys_time_model from v$sess_time_model? Could I? Well, here is the modified script for v$sys_time_model:

--test_vstm_upd2
-- how often is v$sys_time_model updated
set trims on
set term off
set serveroutput on size unli
spool test_vstm_upd2 
begin
  for a in 1..1000  loop
    for vrec in 
      (select rpad(rpad(to_char(a),4)||' '||to_char(systimestamp,'ss.ff')||' '|| stat_name||' ',45)||value rec
      from v$sys_time_model
      --where sid=1989
      where stat_id in (3649082374)
      )
    loop
      dbms_output.put_line(vrec.rec);
    end loop;
--    dbms_output.put_line(chr(9));
  end loop;
end;
/
set term on
spool off

And a sample of my output:

1    43.187666 DB time                       14429733395433
2    43.188523 DB time                       14429733395755
3    43.188642 DB time                       14429733395905
4    43.188733 DB time                       14429733395905
5    43.188822 DB time                       14429733395905
6    43.188909 DB time                       14429733395905
--
7    43.188995 DB time                       14429733396491
8    43.189080 DB time                       14429733396491
9    43.189164 DB time                       14429733396491
10   43.189258 DB time                       14429733396491
11   43.189345 DB time                       14429733396491
12   43.189430 DB time                       14429733396491
13   43.189515 DB time                       14429733396491
14   43.189600 DB time                       14429733396491
15   43.189687 DB time                       14429733396491
16   43.189774 DB time                       14429733396491
17   43.189858 DB time                       14429733396491
18   43.189942 DB time                       14429733396491
19   43.190026 DB time                       14429733396491
20   43.190111 DB time                       14429733396491
--
21   43.190200 DB time                       14429733397436
22   43.190287 DB time                       14429733397436
23   43.190371 DB time                       14429733397436
24   43.190454 DB time                       14429733397436
25   43.190540 DB time                       14429733397436
26   43.190624 DB time                       14429733397436
27   43.190708 DB time                       14429733397436
--
28   43.190793 DB time                       14429733397764
29   43.190877 DB time                       14429733397764
30   43.190961 DB time                       14429733397764
31   43.191045 DB time                       14429733397764
32   43.191132 DB time                       14429733397764
33   43.191221 DB time                       14429733397764
34   43.191309 DB time                       14429733397764
35   43.191392 DB time                       14429733397764
--
36   43.191475 DB time                       14429733402416
37   43.191558 DB time                       14429733402416
--
38   43.191641 DB time                       14429733403070
39   43.191725 DB time                       14429733403070
40   43.191809 DB time                       14429733403070
41   43.191893 DB time                       14429733403070
42   43.191976 DB time                       14429733403070
43   43.192060 DB time                       14429733403070
44   43.192144 DB time                       14429733403070
45   43.192230 DB time                       14429733403070
46   43.192315 DB time                       14429733403070
47   43.192400 DB time                       14429733403070
48   43.192484 DB time                       14429733403070
49   43.192569 DB time                       14429733403070
50   43.192654 DB time                       14429733403070
--
51   43.192737 DB time                       14429733407045
52   43.192821 DB time                       14429733407045
53   43.192904 DB time                       14429733407045
54   43.192985 DB time                       14429733407045
55   43.193069 DB time                       14429733407045
56   43.193152 DB time                       14429733407045
57   43.193237 DB time                       14429733407045
58   43.193321 DB time                       14429733407045
59   43.193404 DB time                       14429733407045
60   43.193488 DB time                       14429733407045
61   43.193574 DB time                       14429733407045
--
62   43.193660 DB time                       14429733408897
63   43.193743 DB time                       14429733408897
64   43.193828 DB time                       14429733408897
65   43.193912 DB time                       14429733408897
66   43.193994 DB time                       14429733408897
67   43.194076 DB time                       14429733408897
--
68   43.194160 DB time                       14429733409208
69   43.194283 DB time                       14429733409208
70   43.194378 DB time                       14429733409208
--
71   43.194465 DB time                       14429733409267
72   43.194551 DB time                       14429733409267
73   43.194635 DB time                       14429733409267
74   43.194719 DB time                       14429733409267
75   43.194801 DB time                       14429733409267
76   43.194884 DB time                       14429733409267
--
77   43.194967 DB time                       14429733409863
78   43.195052 DB time                       14429733409863
--
79   43.195136 DB time                       14429733410499
80   43.195245 DB time                       14429733410499
81   43.195329 DB time                       14429733410499
82   43.195412 DB time                       14429733410499
83   43.195495 DB time                       14429733410499
84   43.195577 DB time                       14429733410499
85   43.195660 DB time                       14429733410499
86   43.195743 DB time                       14429733410499
87   43.195825 DB time                       14429733410499
88   43.195909 DB time                       14429733410499
89   43.195991 DB time                       14429733410499
90   43.196074 DB time                       14429733410499
91   43.196156 DB time                       14429733410499
92   43.196244 DB time                       14429733410499
93   43.196326 DB time                       14429733410499
94   43.196409 DB time                       14429733410499
--
95   43.196493 DB time                       14429733411732
96   43.196577 DB time                       14429733411732
97   43.196661 DB time                       14429733411732
98   43.196745 DB time                       14429733411732
99   43.196826 DB time                       14429733411732
--
100  43.196910 DB time                       14429733412107
101  43.196992 DB time                       14429733412410
102  43.197076 DB time                       14429733412410
103  43.197158 DB time                       14429733412410
104  43.197245 DB time                       14429733412410
105  43.197327 DB time                       14429733412410
106  43.197410 DB time                       14429733412410
107  43.197493 DB time                       14429733412410
108  43.197575 DB time                       14429733412410
109  43.197658 DB time                       14429733412410
--
110  43.197741 DB time                       14429733412981
111  43.197824 DB time                       14429733412981
112  43.197907 DB time                       14429733412981
113  43.197990 DB time                       14429733412981
114  43.198072 DB time                       14429733413001
115  43.198156 DB time                       14429733413001
116  43.198247 DB time                       14429733413001
117  43.198330 DB time                       14429733413001
--
118  43.198414 DB time                       14429733413300
119  43.198499 DB time                       14429733413300
120  43.198581 DB time                       14429733413300
121  43.198665 DB time                       14429733413300
122  43.198748 DB time                       14429733413300
123  43.198830 DB time                       14429733413300
124  43.198913 DB time                       14429733413300
--
125  43.198997 DB time                       14429733414262
126  43.199081 DB time                       14429733414262
127  43.199165 DB time                       14429733414262
128  43.199252 DB time                       14429733414262
129  43.199336 DB time                       14429733414262
130  43.199419 DB time                       14429733414262
131  43.199503 DB time                       14429733414262
--
132  43.199586 DB time                       14429733414569
133  43.199669 DB time                       14429733414569
134  43.199752 DB time                       14429733414569
135  43.199834 DB time                       14429733414569
136  43.199918 DB time                       14429733414569
137  43.200000 DB time                       14429733414569
138  43.200083 DB time                       14429733414569
139  43.200166 DB time                       14429733414569
140  43.200252 DB time                       14429733414569
--
141  43.200334 DB time                       14429733415145
142  43.200418 DB time                       14429733415145
--
143  43.200504 DB time                       14429733415335
144  43.200588 DB time                       14429733415335
145  43.200672 DB time                       14429733415335
146  43.200756 DB time                       14429733415335
147  43.200838 DB time                       14429733415335
148  43.200921 DB time                       14429733415335
149  43.201003 DB time                       14429733415335
150  43.201086 DB time                       14429733415335
151  43.201169 DB time                       14429733415335
152  43.201259 DB time                       14429733415335

I would say that we can draw a few conclusions from this latest test above!

  • It is dangerous to look at one thing and assume something closely related will be the same!
  • The data in v$sys_time_model is also being updated in bursts
  • The data in v$sys_time_model is actually updated very, very frequently, at around 1/1000 of a second intervals
  • It might be that v$sess_time_model is being updated for sessions in some sort of round-robin fashion and v$sys_time_model each time the v$sess version is updated
  • You can spend a lot of time looking at really quite obscure and possibly pointless stuff
  • The reason I am losing weight is I keep skipping lunch.

Memory Changes Everything July 12, 2010

Posted by mwidlake in Architecture, performance.

I’ve got this USB memory stick which I use to carry around my scripts, documents, presentations, Oracle manuals and enough music to keep me going for a few days. It is an 8GB Gizzmo Junior and it is tiny. By tiny I mean as wide as my little finger, the length of a matchstick and about the same thickness as said matchstick. So small that I did indeed lose the damn thing for 6 months before I realised it had got trapped behind a credit card in my wallet.

It cost me ten British pounds about 15 months ago (less than most 4GB USB sticks seem to cost now, but then it is nothing more than the memory chip and connectors wrapped in plastic) and it highlights how cheap solid-state “storage” is becoming.

Connected to this, I was looking at buying a new PC this week and this machine comes with 10 USB slots, if you include the ones on the supplied monitor and stubs on the motherboard.
10 USB slots, 8GB Gizzmo memory sticks… That would be 80GB of cheap and fast storage. Now get a few USB hubs and bulk-buy a few dozen cheap USB2 sticks and you could soon have a solid-state database of a few hundred GB for a thousand pounds. Then of course you can have fun seeing where the pinch-points in the system are (USB2 has a maximum speed per port, and going USB3 right now is going to break that 1 grand barrier. But give it a year…).

This really started me thinking about when memory-based storage would take over from spinning disk as the best option for enterprise-level storage, and my gut feeling is in about 5 years. I think it will be both technically possible and financially viable in much less than that, say as little as 2 years, though the cost of solid-state storage per MB will still be higher than disk by then – but it will potentially be much faster. A few considerations going through my mind were:-

  • Disk is getting a lot slower in relation to acreage. By this I mean that, for a single disc drive, capacity is doubling about every 18 months but seek time has hardly reduced in a decade, and transfer rate (reading from the physical platters to the unit’s buffer) is again almost stationary, at about 120MB/s for 10,000rpm disks and up towards 180MB/s for those very expensive and noisy 15,000rpm disks. Being a tad ridiculous to make the point, with modern 3TB disks you could build most Oracle databases on one disc. Let’s make it two in a RAID 10 configuration for redundancy. My point is, your 3TB database could well be being run right now, for real, across say 5 physical disks with a total sustainable physical throughput of around 500MB a second.
  • Solid state storage seems to be halving in price in more like 8-10 months.
  • IO subsystems are made faster by using RAID so that several physical discs can contribute to get towards the 300MB/s or so speed of the interface – but solid state is already that fast.
  • IO subsystems are made faster by building big caches into them and pre-fetching data that “might” be requested next. Oh, that is kind of solid state storage already.
  • Solid state storage, at least the cheap stuff in your USB stick, has the problem that you can only write to each bit a thousand or so times before it starts to get unreliable. But physical disk has exactly the same issue.
  • There are new methods of solid-state memory storage coming along – “New Scientist” had a nice article on it a few months ago, and these versions will be even higher density and more long-term reliable.
  • Seek time on solid-state memory is virtually zero, so random IO is going to be particularly fast compared to spinning disk.

Solid state memory needs less power, and thus less cooling, is silent, is potentially denser and is less vulnerable to temperature and humidity fluctuations. I can see it not needing to be kept in a specialist server room with all that air con and the ear defenders you need when you go in the room. Just somewhere with normal air con and a lock on the door should suffice.

We do not need solid state storage to match the size of current disks, or even be as cheap, for it to take over. As I have already pointed out, it is not acreage you need with physical disks but enough spindles and caches to make it fast enough in relation to the space. Further, we can afford to pay more for solid state if we do not need to keep it in such expensive clean-room like environments.

I can see that in a couple of years, for a given computer system, say a mixed-workload order processing system, to support the storage needs we will have maybe a dozen solid-state chunks of storage, perhaps themselves consisting of several small units of memory in some sort of RAID for resilience, all able to flood the IO channels into our processing server, and the issue will be getting the network and IO channels into the server to go fast enough. So don’t: stick all the storage directly into the server. You just got rid of half your SAN considerations.

I’m going to stop there. Partly because I have run out of time and partly because, in checking out what I am writing, I’ve just spotted that someone did a better job of this before me. Over to James Morle, who did a fantastic post on this very topic back in May. Stupid me for not checking out his blog more often. James also mentions that often it is not total throughput you are interested in at all but IOPS. That zero latency of solid-state memory is going to be great for supporting very high IOPS.

DBA_TAB_MODIFICATIONS can miss multi-table inserts July 5, 2010

Posted by mwidlake in internals.

Last week I blogged a piece all about DBA_TAB_MODIFICATIONS and how it captures all normal DML against tables. However, a word of warning: this may not be true for multi-table inserts. It appears that inserts via multi-table inserts are not recorded in DBA_TAB_MODIFICATIONS for 10.2.0.3 and a couple of 9.2 versions of Oracle.

{My thanks go to my comrade-in-arms Basavaraj Patil for this. He not only noticed the issue but tracked it down to multi-table inserts, came and told me about it and then said I could put it in my blog. This is his work really}.

I will not go into multi-table inserts right now, but it is a very nice feature introduced with Oracle 9 which allows you to insert a row into several tables at a time, or into one of several different tables. In fact, I think it is a relatively unknown feature. Check out this oracle-base article on new 9i features that mentions it, or go pull up the Oracle 10 SQL reference manual.

There is a bug with Oracle up to 10.2.0.3 and in 9.2.0.6 and 9.2.0.7.
It is fixed in 9.2.0.8, 10.2.0.4 and 11.1.0.6.
It is bug number 4572043 if you have Metalink.
It impacts direct load merge statements as well as multi-table inserts, but I have not tested and verified the direct load merge statement.

Below is a demonstration of the issue:

TDB>drop table obj_nonsys purge;

Table dropped.

TDB>drop table obj_sys purge;

Table dropped.

TDB>create table obj_nonsys
  2  as select * from all_objects where rownum<1
  3  /

Table created.

TDB>create table obj_sys
  2  as select * from all_objects where rownum<1
  3  /

Table created.

TDB>insert into obj_sys
  2  select * from dba_objects
  3  where owner in ('SYS','SYSTEM')
  4  and rownum <= 200
  5  /

200 rows created.

TDB>insert into obj_nonsys
  2  select * from dba_objects
  3  where owner not in ('SYS','SYSTEM')
  4  and rownum <= 150
  5  /

150 rows created.

TDB>commit;

Commit complete.
TDB>-- flush the changes down to the DBA_TAB_MODIFICATIONS table.
TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>--
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name in ('OBJ_SYS','OBJ_NONSYS');

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
OBJ_NONSYS          150          0          0 NO   05-JUL-2010 20:59            
OBJ_SYS             200          0          0 NO   05-JUL-2010 20:59            

2 rows selected.

TDB>-- As can be seen above, the inserts are correctly captured
TDB>-- And the below counts confirm this
TDB>select count(*) from obj_sys;

  COUNT(*)                                                                      
----------                                                                      
       200                                                                      

1 row selected.

TDB>select count(*) from obj_nonsys;

  COUNT(*)                                                                      
----------                                                                      
       150                                                                      

1 row selected.

TDB>-- Now the core of it. Multi-table inserts
TDB>-- NB this is not the best example of a multi-table insert but it will do.
TDB>insert when (owner='SYS' or owner ='SYSTEM')
  2  	      then into obj_sys values ( OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID
  3  				       ,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,STATUS
  4  				       ,TEMPORARY,GENERATED,SECONDARY)
  5  	    when (owner !='SYS' and owner !='SYSTEM')
  6  	      then into obj_nonsys values ( OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID
  7  				 ,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,STATUS
  8  				 ,TEMPORARY,GENERATED,SECONDARY)
  9  	    select  OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID
 10  				 ,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,STATUS
 11  				 ,TEMPORARY,GENERATED,SECONDARY
 12  	    from dba_objects
 13  	    where object_type='TABLE'
 14  	    and rownum <= 1000
 15  /

1000 rows created.

TDB>commit;

Commit complete.

TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>-- And what do we see in DTM?
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name in ('OBJ_SYS','OBJ_NONSYS');

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
OBJ_NONSYS          150          0          0 NO   05-JUL-2010 20:59            
OBJ_SYS             200          0          0 NO   05-JUL-2010 20:59            

2 rows selected.

TDB>-- Argh! No change to the number of inserts! They have been missed
TDB>--
TDB>-- Let me verify what is in the tables...
TDB>select count(*) from obj_sys;

  COUNT(*)                                                                      
----------                                                                      
      1025                                                                      

1 row selected.

TDB>select count(*) from obj_nonsys;

  COUNT(*)                                                                      
----------                                                                      
       325                                                                      

1 row selected.

TDB>-- And I will do similar inserts to what the multi-table insert would do just to make sure
TDB>-- there is nothing odd going on.
TDB>insert into obj_sys
  2  select * from dba_objects
  3  where object_type='TABLE'
  4  and owner in ('SYS','SYSTEM')
  5  and rownum <= 600
  6  /

600 rows created.

TDB>insert into obj_nonsys
  2  select * from dba_objects
  3  where object_type='TABLE'
  4  and owner not in ('SYS','SYSTEM')
  5  and rownum <= 400
  6  /

400 rows created.

TDB>commit;

Commit complete.

TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>--
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name in ('OBJ_SYS','OBJ_NONSYS');

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
OBJ_NONSYS          550          0          0 NO   05-JUL-2010 20:59            
OBJ_SYS             800          0          0 NO   05-JUL-2010 20:59            

2 rows selected.

TDB>select count(*) from obj_sys;

  COUNT(*)                                                                      
----------                                                                      
      1625                                                                      

1 row selected.

TDB>select count(*) from obj_nonsys;

  COUNT(*)                                                                      
----------                                                                      
       725                                                                      

1 row selected.

TDB>
TDB>-- Note, the counts have gone up as well of course and are now adrift from DTM

DBA_TAB_MODIFICATIONS July 2, 2010

Posted by mwidlake in internals, performance, statistics.
16 comments

I’ve mentioned DBA_TAB_MODIFICATIONS (and there is also of course ALL_TAB_MODIFICATIONS and USER_TAB_MODIFICATIONS) in a couple of posts, but mostly buried in long, rambling posts about other things. So here is an overview and demonstration of DBA_TAB_MODIFICATIONS. Skip to the end if you just want to see an example of how it works.

The example below is from 10.2.0.3, but it is pretty much the same for V11, and for V9 too, except where I say otherwise.

SYS.DBA_TAB_MODIFICATIONS is used by Oracle internally to track how many inserts, updates and deletes have been done to a table, table partition or table subpartition since the segment last had stats gathered on it with dbms_stats. It also records whether the segment has been truncated since the last stats gather. Why since the last stats gather? Because Oracle uses it internally to detect how much a segment has changed: if 10% or more of it has changed, Oracle deems it stale for stats-gathering purposes (under 11 you can alter the stale percentage). Note I said SYS.DBA_TAB_MODIFICATIONS – on 10 at least the public synonym for it is missing, so unlike most other DBA_ views you need to state the owner (or create a synonym). Here is the sort of thing you see in it on a real system:

OWNER.TABLE_NAME     INSERTS    UPDATES    DELETES TIMESTAMP         TRU
------------------- ---------- ---------- ---------- ----------------- ---
XXXXXXX.YYYYYYYYYYY   22598264          0          1 01-MAR-2010 02:08 NO
XXXXXXX.YYYYYYYYYYY          5          0          0 19-SEP-2007 01:47 NO
XXXXXXX.YYYYYYYYYYY     888766          0          0 01-MAR-2010 02:08 NO
XXXXXXX.YYYYYYYYYYY       3191       1486          0 27-NOV-2009 05:11 NO
XXXXXXX.YYYYYYYYYYY      34742          0          0 08-MAR-2010 15:16 NO
XXXXXXX.YYYYYYYYYYY          0       7192          0 02-JUL-2010 05:00 NO
XXXXXXX.YYYYYYYYYYY          0          1          0 10-MAR-2010 15:16 NO
XXXXXXX.YYYYYYYYYYY          8          8          8 26-JAN-2010 08:05 NO
XXXXXXX.YYYYYYYYYYY    1533536          0          2 01-MAR-2010 02:08 NO
XXXXXXX.YYYYYYYYYYY        281          0          0 11-SEP-2009 03:00 NO 
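As an aside, from 11g you can change that 10% staleness threshold per table via DBMS_STATS.SET_TABLE_PREFS. A minimal sketch, assuming a table called TEST_1:

-- 11g onwards: deem TEST_1 stale once 5% of it has changed (default is 10%)
exec DBMS_STATS.SET_TABLE_PREFS(ownname=>user,tabname=>'TEST_1',pname=>'STALE_PERCENT',pvalue=>'5')

-- check what is currently in force
select dbms_stats.get_prefs('STALE_PERCENT',user,'TEST_1') from dual;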

Under Oracle 9 you have to register a table as MONITORED before this information is gathered. Under 10 and 11 all tables/partitions/subpartitions are monitored and you can't turn that monitoring off {you can try, but Oracle ignores you :-) }.
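For completeness, the 9i registration is a simple ALTER TABLE; a sketch, using a made-up table name:

-- 9i: switch monitoring on for one table
alter table test_1 monitoring;
-- ...and off again (from 10g onwards this is accepted but effectively ignored)
alter table test_1 nomonitoring;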

The information can be very useful for checking how volatile a segment is and whether it has changed a lot since the last time stats were gathered on it. You can also add the values held in DBA_TAB_MODIFICATIONS to the NUM_ROWS value held for the segment and get a very accurate estimate of the current number of rows. It is a lot, lot faster than actually counting them!
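A minimal sketch of that fast row-count trick, assuming a non-partitioned table called TEST_1 (flush the monitoring information first, or the estimate will lag reality):

exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

select t.table_name
      ,nvl(t.num_rows,0) + nvl(m.inserts,0) - nvl(m.deletes,0) est_rows
from   dba_tables t
left outer join sys.dba_tab_modifications m
  on  m.table_owner    = t.owner
  and m.table_name     = t.table_name
  and m.partition_name is null  -- table-level record only, to avoid double counting
where  t.owner      = user
and    t.table_name = 'TEST_1'
/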

The information on inserts, updates and deletes is gathered for pretty much all DML against tables (see an up-coming post for an example of this not being true). Direct-load SQL*Loader and other direct-path activity can skip being recorded, but insert-append, using the /*+ append */ hint, is recorded correctly {I suspect this was not true for V9 and 10.1 but am no longer sure}. This information is initially held in memory and only later pushed into DBA_TAB_MODIFICATIONS, so you may not see the latest information. Under Oracle 9 this information is flushed down every 15 minutes, I believe; under 10.1 it is every 3 hours; and from 10.2 onwards the information is only flushed down when stats are gathered against the segment OR you manually flush it down to the database.

Flushing the latest information is achieved with DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO. It is relatively quick to run, normally taking no more than a few seconds.
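{I believe the documentation says you need the ANALYZE ANY system privilege to run the flush, so a non-DBA user may need a grant along these lines – SCOTT being just an example user:}

-- run as a DBA; assumption: ANALYZE ANY is what the flush call requires
grant analyze any to scott;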

When statistics are gathered on a segment, any corresponding row in DBA_TAB_MODIFICATIONS is deleted, not updated to zeros, and is recreated only when relevant inserts, updates, deletes or truncates occur on the segment.

The DBA_TAB_MODIFICATIONS view sits on top of sys.mon_mods_all$ as well as obj$, user$ and the usual suspects. sys.mon_mods_all$ does not contain any more information than the view exposes.

desc sys.dba_tab_modifications
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------------
 TABLE_OWNER                                                    VARCHAR2(30)
 TABLE_NAME                                                     VARCHAR2(30)
 PARTITION_NAME                                                 VARCHAR2(30)
 SUBPARTITION_NAME                                              VARCHAR2(30)
 INSERTS                                                        NUMBER
 UPDATES                                                        NUMBER
 DELETES                                                        NUMBER
 TIMESTAMP                                                      DATE
 TRUNCATED                                                      VARCHAR2(3)
 DROP_SEGMENTS                                                  NUMBER

-- View definition of SYS.DBA_TAB_MODIFICATIONS (TEXT from DBA_VIEWS)
TEXT
-----------------------------------------------------------------------
select u.name, o.name, null, null,
       m.inserts, m.updates, m.deletes, m.timestamp,
       decode(bitand(m.flags,1),1,'YES','NO'),
       m.drop_segments
from sys.mon_mods_all$ m, sys.obj$ o, sys.tab$ t, sys.user$ u
where o.obj# = m.obj# and o.obj# = t.obj# and o.owner# = u.user#
union all
select u.name, o.name, o.subname, null,
       m.inserts, m.updates, m.deletes, m.timestamp,
       decode(bitand(m.flags,1),1,'YES','NO'),
       m.drop_segments
from sys.mon_mods_all$ m, sys.obj$ o, sys.user$ u
where o.owner# = u.user# and o.obj# = m.obj# and o.type#=19
union all
select u.name, o.name, o2.subname, o.subname,
       m.inserts, m.updates, m.deletes, m.timestamp,
       decode(bitand(m.flags,1),1,'YES','NO'),
       m.drop_segments
from sys.mon_mods_all$ m, sys.obj$ o, sys.tabsubpart$ tsp, sys.obj$ o2,
     sys.user$ u
where o.obj# = m.obj# and o.owner# = u.user# and
      o.obj# = tsp.obj# and o2.obj# = tsp.pobj#

-- the underlying sys.mon_mods_all$ table
desc sys.mon_mods_all$
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- ------
 OBJ#                                                           NUMBER
 INSERTS                                                        NUMBER
 UPDATES                                                        NUMBER
 DELETES                                                        NUMBER
 TIMESTAMP                                                      DATE
 FLAGS                                                          NUMBER
 DROP_SEGMENTS                                                  NUMBER

Here is a demo of it in action:

TDB>-- clear down the test table.
TDB>-- if you do not have access to DBA_TAB_MODIFICATIONS change to ALL_TAB_MODIFICATIONS
TDB>drop table test_1 purge;

Table dropped.

TDB>create table test_1
  2  as select * from all_objects where rownum<1
  3  /

Table created.

TDB>select count(*) from test_1;

  COUNT(*)                                                                      
----------                                                                      
         0                                                                      

1 row selected.

TDB>-- WHAT IS IN DBA_TAB_MODIFICATIONS
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

no rows selected

TDB>-- OK, let's flush down the information
TDB>--
TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>-- Still nothing as no activity has occurred on the table.
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

no rows selected

TDB>-- Now create some data
TDB>insert into test_1
  2  select * from dba_objects where rownum <= 100
  3  /

100 rows created.

TDB>commit;

Commit complete.

TDB>select count(*) from test_1;

  COUNT(*)                                                                      
----------                                                                      
       100                                                                      

1 row selected.

TDB>-- WHAT IS IN DBA_TAB_MODIFICATIONS
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

no rows selected

TDB>-- OK, let's flush down the information
TDB>--
TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>--
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
TEST_1              100          0          0 NO   02-JUL-2010 10:21            

1 row selected.

TDB>-- The information in DBA_TAB_MODIFICATIONS is used by Oracle to detect if a table
TDB>-- (or a partition in a table) is stale - changed by 10% or more.
TDB>-- Gathering statistics on an object DELETES the record from DBA_TAB_MODIFICATIONS
TDB>-- rather than setting all the values to zero.
TDB>--
TDB>exec dbms_stats.gather_table_stats(ownname=>user,tabname=>'TEST_1')

PL/SQL procedure successfully completed.

TDB>--
TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

no rows selected

TDB>-- now do some activity again and flush it down to the dictionary
TDB>insert  into test_1
  2  select * from dba_objects where rownum <= 150
  3  /

150 rows created.

TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
TEST_1              150          0          0 NO   02-JUL-2010 10:21            

1 row selected.

TDB>-- Direct inserts via insert-append are also captured (I think this might have changed)
TDB>insert /*+ append */ into test_1
  2  select * from dba_objects where rownum <= 170
  3  /

170 rows created.

TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
TEST_1              320          0          0 NO   02-JUL-2010 10:21            

1 row selected.

TDB>-- now a mixed bag of inserts, updates and deletes
TDB>insert into test_1
  2  select * from dba_objects where rownum <= 75
  3  and owner not in ('SYS','SYSTEM')
  4  /

75 rows created.

TDB>--
TDB>update test_1 set created=sysdate
  2  where object_type !='TABLE'
  3  /

289 rows updated.

TDB>delete from test_1
  2  where object_type='SEQUENCE'
  3  /

10 rows deleted.

TDB>commit;

Commit complete.

TDB>--
TDB>exec DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO

PL/SQL procedure successfully completed.

TDB>select table_name,inserts,updates,deletes,truncated,timestamp
  2  from sys.dba_tab_modifications
  3  where table_owner=user and table_name= 'TEST_1'
  4  /

TABLE_NAME      INSERTS    UPDATES    DELETES trnc TIMESTAMP                    
------------ ---------- ---------- ---------- ---- -----------------            
TEST_1              395        289         10 NO   02-JUL-2010 10:21            

1 row selected.

If you want to see what else I have said about DBA_TAB_MODIFICATIONS, the end of this {way too long} post shows how to use it to get a very fast count of rows in a table, and if you have issues with permissions when flushing database monitoring information, see this link about the permissions needed.

Question: as the data gets flushed from memory to a data dictionary table, it persists across the database being bounced. But what happens to the data held in memory about table changes when there is a controlled shutdown, and when the database crashes?
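If you fancy testing the controlled-shutdown half of that yourself, a sketch along these lines should do it (TEST_1 as before, run as someone who can shut the instance down):

-- make some changes but do NOT flush the monitoring info manually
insert into test_1 select * from dba_objects where rownum <= 50;
commit;
shutdown immediate
startup
-- did the 50 inserts survive the bounce?
select inserts,updates,deletes
from   sys.dba_tab_modifications
where  table_owner=user and table_name='TEST_1';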
