Hope to see you at the IDUG North American Technical Conference in Phoenix, Arizona – May 12-16!

Find the Statements Using an Index Without Event Monitors

This post is specific to DB2 9.7. In DB2 10.1 and above, Usage Lists would be a better way to handle this.

In the course of writing my recent developerWorks article on low-cardinality indexes and an associated presentation, I found that I have dozes of indexes with a cardinality of 1 that are also recently used. I cannot come up with a way that using these indexes could help performance, so I want to understand the SQL that is using them. To do that, I need to find the SQL that uses an index. I also would like to do it without event monitors on a production system with minimal impact. Here’s my journey to do that.

I suspect the reason that the low-cardinality indexes are being used is because the SQL using them is using parameter markers, and therefore cannot make use of distribution statistics to know that they’re harmful rather than helpful to performance. I also want to know if the SQL using these indexes is frequently executed or some of the reporting workload that this e-commerce database is cursed with.

I also have absolutely no vendor or IBM tools other than what comes with DB2 ESE. This in itself is both a blessing and a curse. It takes me longer to do a lot of things than it would with tools, but without tools I learn so much more about how things work.

Where to Find Index Usage Information

Let’s start with where I got the list of indexes that have a cardinality of 1 and a lastused within the last 30 days. (this was run on 9/30/2013)

select 	i.lastused, 
		substr(t.tabname,1,15) as tabname, 
		substr(indname,1,10) as indname, 
		substr(colnames,1,16) as colnames, 
		smallint(fullkeycard)as fullkeycard, 
		integer(card) as table_card, 
		volatile 
from 	syscat.indexes i, 
		syscat.tables t 
where i.tabname=t.tabname and i.tabschema=t.tabschema 
	and t.tabschema not like 'SYS%'
	and card > 10
	and fullkeycard=1 
	and uniquerule='D' 
	and i.lastused > current date - 30 days 
with ur;

LASTUSED   TABNAME         INDNAME    COLNAMES         FULLKEYCARD TABLE_CARD  VOLATILE
---------- --------------- ---------- ---------------- ----------- ----------- --------
09/30/2013 ACACTGRP        I0000427   +MEMBER_ID                 1         412         
09/29/2013 ACRELGRP        I0000442   +MEMBER_ID                 1          19         
09/30/2013 ACRESGRP        I0000446   +MEMBER_ID                 1         216         
09/30/2013 ATCHTGT         I0000864   +ATTACHUSG_ID              1        3961         
09/30/2013 CALCODE         I0001247   +TXCDCLASS_ID              1        1019         
09/30/2013 CATENCALCD      I0000516   +TRADING_ID                1         208         
09/30/2013 CATENTRY        I0000375   +RANK                      1      279635         
09/30/2013 CATGPCALCD      I0000522   +TRADING_ID                1          32         
09/30/2013 CATGRPTPC       I0000531   +CATGROUP_ID               1          22         
09/30/2013 CATTOGRP        I0000373   +CATALOG_ID_LINK           1         627         
09/30/2013 CPENDORDER      I0000542   +STOREENT_ID               1          23         
09/30/2013 DISPCGPREL      I0000556   +MBRGRP_ID                 1          13         
09/30/2013 FFMCENTDS       I0000570   +STADDRESS_ID              1         785         
09/30/2013 LANGPAIR        I0000603   +STOREENT_ID               1          13         
09/30/2013 NOTIFY          I0000625   +MSGTYPE_ID                1       11818         
09/29/2013 OICOMPLIST      I0001263   +OICOMPREL_ID              1       14951         
09/30/2013 ORDERS          I0001267   +BUYERPO_ID                1     1184754         
09/30/2013 PARTICIPNT      I0000680   +TERMCOND_ID               1          73         
09/30/2013 PATTRVALUE      I0001270   +ATTACHMENT_ID             1       44644         
09/30/2013 PPCEXTDATA      I0001106   +PPCBATCH_ID               1     3633179         
09/30/2013 PPCPAYINST      I0000919   +RMA_ID+MARKFORD           1      241572         
09/30/2013 PPCPAYTRAN      I0001103   +PPCBATCH_ID               1      209810         
09/30/2013 SCHCMD          I0000758   +STOREENT_ID               1          21         
09/30/2013 TRADEPOSCN      I0000814   +PRODUCTSET_ID             1          43         
09/30/2013 PRELETEMPLATE   I0001344   +STOREENT_ID               1          13         
09/30/2013 PRICERULE       I0001342   +TYPE                      1          15         
09/30/2013 CATENTSUBS      I0001358   +SUBSCPTYPE_ID             1         162         
09/30/2013 SEOPAGEDEFSUBPA I0001415   +STOREENT_ID               1          14         
09/20/2013 CATGROUP        I0000376   +RANK                      1        1400         
09/30/2013 BUSEVENT        I0001104   +CHECKED                   1     6805318         
09/30/2013 STLOC           I0001211   +GEONODE_ID                1        4913         
09/29/2013 PPCEXTDATA      I0000927   +PPCPAYTRAN_ID             1     3633179         

  32 record(s) selected.

After digging around a bit, I determined there’s no good way to say match statements in mon_get_pkg_cache_stmt to indexes using system tables alone. So I determined I’m going to have to explain basically everything in my package cache to accomplish this. Explain must track the objects used, and indeed, I found the EXPLAIN_OBJECT table, which should list the indexes used and tie them back to which statement they were used for.

EXPLAINing Everything

My first thought on explaining everything was to use db2advis – since I knew that you can apply db2advis to the entire contents of your package cache, and can specify the option to capture explain information as well. Luckily, I thought that this might be resource-intensive and tried it on a non-production system first. I DO NOT RECOMMEND USING THIS METHOD on a production database. I used this syntax:

db2advis -d wc005s01 -q WSCOMUSR -g -p |tee db2advis.out

This ran for over 30 minutes. It did end up recommending some interesting indexes, but sifting through them would be a bit time consuming, and if I added all 192 it recommended, it would take over 5 GB of space – quite a bit for this ~100GB database. It wasn’t about the indexes anyway – I ran this to get the explain information. The big problem was that it spiked my CPU. CPU on a box that mostly runs below 20% was at 70-80% for that half hour, with spikes all the way to 100%. That would never work for my production server. And anyway, it’s doing more work than I need – I don’t need all that db2advis info, just the explain tables popluated.

Theoretically, I could just query mon_get_pkg_cache_stmt to get each of the statements, and run explains on them. But I wanted to use this to see if there was a more direct way of getting what I wanted. Digging around the IBM DB2 Info Center some more (some days I just love the Info Center), I found EXPLAIN_FROM_SECTION. It allows me to pass in an executable_id (which is available in mon_get_pkg_cache_stmt), and not much else. Armed with that, I came up with this, instead:

db2 -x "select executable_id from table(mon_get_pkg_cache_stmt(NULL, NULL, NULL, -2)) as t" >exe_id.list
date; cat exe_id.list |while read e; do db2 connect to wc005q01; db2 "CALL EXPLAIN_FROM_SECTION ($e, 'M', NULL, 0, NULL, ?, ?, ?, ?, ?)"; date; db2 connect reset; done >>explain_all.out

When I ran this on a nearly un-used test box, it used between 5% and 15% of CPU, when the CPU was normally 0% used. There were about 13,000 statements involved, and it’s a 12-cpu box, but obviously results vary based on hardware, package cache size, database purpose and design, etc. This is the results of my case study only.

It also ran for about half an hour. Part of that is my methodology. Running it in a stored procedure would certainly perform better – especially since my bash loop requires me to connect in every loop – significant overhead. But creating a stored procedure on stage and prod databases would require me to go through a change management process. I would be willing to do that once I understand if this process is useful, but not when I’m just trying it out.

After testing to make sure I wasn’t going to tank the system, I moved forward to an actually busy test system (ongoing CPU utilization still under 10%), it spiked CPU utilization up to 35%. Granted, this is a smaller box, not sized for load tests, with only 3 cpus and there were about 18,000 statements to explain. Running explains this way really seemed to peg one cpu, and leave the others alone.

Production has a smaller package cache (I have since altered the test environments to match prod), and has 12 cpus. It normally runs around 20% cpu utilization off hours, so I feel comfortable running the same process there. With only about 3,000 statements, it finished on prod in less than 10 minutes, and while running had CPU running 10% higher or less than what it looked like without it running.

Summary of the last 3 paragraphs: don’t run this on prod without testing the impact on your system first – it could be the straw that breaks the camel’s back.

Analyzing Explain Info for Indexes

After I had collected data as outlined above, I used this query to look at indexes on the same criteria I used to look at indexes above:

select 
	o.explain_time, 
	substr(o.object_schema,1,16) as schema, 
	substr(o.object_name,1,32) as name, 
	i.uniquerule,
	i.fullkeycard,
	t.card,
	substr(s.statement_text,1,50) as stmt 
from 	db2inst1.explain_object o, 
	db2inst1.explain_statement s, 
	syscat.tables t,
	syscat.indexes i
where 	s.explain_time = o.explain_time 
	and s.explain_requester=o.explain_requester 
	and s.source_name=o.source_name 
	and s.source_schema=o.source_schema 
	and s.explain_time > timestamp('2013-09-30-20.00.00.996909') 
	and o.object_type='IX'
	and o.object_schema = i.indschema and o.object_name = i.indname
		and i.tabname=t.tabname and i.tabschema=t.tabschema 
		and card > 10
		and i.fullkeycard=1 
		and uniquerule='D' 
with ur;

Note the conditions here – I’m looking for SQL that used indexes with a cardinality of 1 where the table has a cardinality greater than 10. In one execution of this process, I got output like this:

EXPLAIN_TIME               SCHEMA           NAME                             UNIQUERULE FULLKEYCARD          CARD                 STMT                                              
-------------------------- ---------------- -------------------------------- ---------- -------------------- -------------------- --------------------------------------------------
2013-10-01-09.41.44.460410 WSCOMUSR         I0001263                         D                             1                17742 DELETE FROM ORDERITEMS  WHERE ORDERITEMS_ID = ? AN
2013-10-01-09.41.44.460410 WSCOMUSR         I0001263                         D                             1                17742 $WITH CONTEXT$($CONSTRAINT$(WSCOMUSR.BKORDALLOC.F_
2013-10-01-09.41.51.030702 WSCOMUSR         I0000927                         D                             1              3769661 SELECT T1.PPCEXTDATA_ID, T1.ENCRYPTED, T1.MARKFORD
2013-10-01-09.41.51.030702 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q1.PPCEXTDATA_ID AS "PPCEXTDATA_ID", Q1.ENC
2013-10-01-10.01.28.520154 WSCOMUSR         I0000927                         D                             1              3769661 SELECT T1.PPCEXTDATA_ID, T1.ENCRYPTED, T1.MARKFORD
2013-10-01-10.01.28.520154 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q1.PPCEXTDATA_ID AS "PPCEXTDATA_ID", Q1.ENC
2013-10-01-10.04.31.072162 WSCOMUSR         I0000927                         D                             1              3769661 select ppcpaytran_id, count(*) from wscomusr.PPCEX
2013-10-01-10.04.31.072162 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q3.PPCPAYTRAN_ID AS "PPCPAYTRAN_ID", Q3.$C1
2013-10-01-15.21.23.483638 WSCOMUSR         I0001263                         D                             1                17742 DELETE FROM ORDERITEMS  WHERE ORDERITEMS_ID = ? AN
2013-10-01-15.21.23.483638 WSCOMUSR         I0001263                         D                             1                17742 $WITH CONTEXT$($CONSTRAINT$(WSCOMUSR.BKORDALLOC.F_
2013-10-01-15.21.31.235352 WSCOMUSR         I0000927                         D                             1              3769661 SELECT T1.PPCEXTDATA_ID, T1.ENCRYPTED, T1.MARKFORD
2013-10-01-15.21.31.235352 WSCOMUSR         I0000927                         D                             1              3769661 SELECT Q1.PPCEXTDATA_ID AS "PPCEXTDATA_ID", Q1.ENC

  12 record(s) selected.

One thing I noticed here – many times there are two entries for the same query – one for the original syntax and one for the optimized syntax that you see listed in the explain plan when you do an explain (which is not directly executable).

While this got some interesting results, it still did not quite get me as far as I wanted to go. I find that to understand HOW the sql is using these indexes, I need the values for the parameter markers in the queries. While this was an interesting exercise, in the end, I think I still need an event monitor to fully understand how the SQL is using the indexes, because only an event monitor will get me those values.. I’m hoping DB2 is using the indexes when the value of the column for the low-cardinality indexes is not represented in the index at all – for elimination. Because otherwise I don’t understand how an index with a cardinality of 1 could ever be useful.

But I can run explains and run the design advisor against this SQL to see what I can do as far as these indexes go.

1 comment… add one

  • Hi Amber, I liked your blog as I tried the same thing before and feels nice somebody else is trying to get the same info DB2 does not make easy to get.
    My experiences are:
    A) I’ve found SQLs where DB2 uses indexes which have nothing to do at all with the WHERE predicate. Like this particular query below uses the PK index (which is SINGLETRANSID column) but you don’t see it in the where predicate, even though there’s an index on ARCHFLAG+ARCHDATE+TMSTAMP columns.

    “SELECT Min(singletransID) FROM isaac.table WHERE ARCHFLAG = ? and Tmstamp < timestamp(current_date) – ? DAY"

    Only if you add 'and ARCHDATE=? and' in the predicate, DB2 uses the right index. So I think if we know indexes with fullkeycard=1 are not useful we should drop them anyway.

    Reply

Leave a Comment