DB2 Table Scans

What is a table scan?

DB2 has a number of ways of accessing the data it needs, but the most basic one – the one that will always work to get what it needs – is the table scan. A Table scan means that DB2 will read through every single row in a table to get the data it needs. Table scans appear on explain plans for some queries. My favorite method of generating an explain plan is described in this post: SQL Analysis Overview

Assuming the query has a where clause, DB2 may use a table scan to handle each row individually and ask “does this meet my criteria?”. If the answer is yes, then the row is added to the result set DB2 is building. If the answer is no, then the row is discarded and not used.

When do table scans occur?

DB2 may choose a table scan can whether your query is returning one row or every single row in a table. While a ‘select *’ with no where clause is almost certain to do a table scan, they can also occur in a scenario where you are returning only one row out of a million row table. A table scan is like DB2′s last resort – if it cannot find a an index to help, it will fall back on the table scan to examine each row in the table individually.

Table scans can even occur on tables you didn’t know existed. They can occur on temporary tables created just for this query (implicitly or explicitly). Since there is no index on some types of temporary tables, DB2 uses table scans a lot in this context. Check out my post on Fun with IN-lists and GENROW to see one of the ways this can look.

Why to avoid table scans

It is easy to see why table scans are a bad idea for selecting only a few rows out of a million row table. It seems less important to avoid with a small table of 6 or 7 rows that is being scanned to return just one row. Depending on how frequent each of those scenarios is, either one can be a performance killer. Consider table scans on small tables if done frequently just as much of a problem as they are on large tables.

Ways that table scans cause problems:

  1. Locking – Depending on isolation level, a table scan of even a small table can end up locking the whole table, reducing concurrency and increasing the likelihood of lock timeouts or even deadlocks. If it is a large table, it can even fill up the locklist and trigger a lock escalation to lock the whole table – which can occur at a variety of isolation levels. In my experience, locking issues are one of the primary causes of severe performance problems.
  2. Hogging the bufferpool – Depending on the size of the table and the size of the bufferpool, reading the whole table into the bufferpool can reduce your hit ratio by taking up too much space for the one table
  3. I/O – I/O takes time, and once you’ve run through your bufferpool, you could be going straight to disk, even with prefetch. Direct I/O is slow, and this can be a problem
  4. CPU Time – Comparing a million rows takes time, even if it all takes place in memory. You can actually see higher CPU time when a lot of table scans are occurring. CPU time may not be your overall bottleneck, but it can still be a problem, particularly if the database server is undersized on CPU to begin with.
  5. Increased Execution Time – A query that takes longer not only is more likely to have locking and concurrency issues, but it clearly is causing reduced response times to the end user.

Key indicators that table scans are occuring

Have I mentioned lately that I just love the ‘MON_GET…’ table functions introduced in 9.7? If you want to know which tables table scans are occuring on, it is only a quick query away, assuming you don’t care about a specific time frame other than since the last database activation:

$ db2 "select substr(tabschema,1,18) as tabschema, substr(tabname,1,30) as tabname, table_scans FROM TABLE(MON_GET_table('','',-2)) AS T order by table_scans desc fetch first 10 rows only with ur"

TABSCHEMA          TABNAME                        TABLE_SCANS
------------------ ------------------------------ --------------------
WSCOMUSR           ATTRVAL                                      567504
WSCOMUSR           STTPCUSG                                      67325
WSCOMUSR           XJORNADA                                      25758
WSCOMUSR           CTXMGMT                                       12671
WSCOMUSR           ACACTION                                      11919
WSCOMUSR           GRRGSTRNT                                     10618
WSCOMUSR           STOREENT                                       9452
WSCOMUSR           SHIPMODE                                       8427
WSCOMUSR           CMFILE                                         7445
WSCOMUSR           ATTRDICTSRCHCONF                               6824

  10 record(s) selected.

Calculating your Index Read Efficiency is key to figuring out if excessive table scans are occurring, and can be useful in knowing where to look to help resolve them. For e-commerce databases, Index Read Efficiency is excellent at 10 or lower, good at 100 or lower, and may be acceptable as high as 1000. The databases I support that I consider to be running well generally run between about 4 and 40.

Index Read Efficiency can be calculated at many different levels. Calculate it at the database (or workload) level like this:

$ db2 "select rows_read, rows_returned, rows_read/rows_returned as IREF FROM TABLE(MON_GET_WORKLOAD('',-2)) AS T where WORKLOAD_NAME='SYSDEFAULTUSERWORKLOAD' with ur"

-------------------- -------------------- --------------------
         96378957991            163061727                  591

  1 record(s) selected.

I like to collect the data above on an hourly basis, subtracting to get just the data from an hour, and I can then tell when the read efficiency is the biggest problem. I’m less worried about bad numbers during night time hours when batch processing or load jobs may be occurring than I am during peak transaction times.

Calculate Index Read Efficiency at the statement level like this:

            ELSE -1

-------------------- -------------------- ---------- -------------------- --------------- --------------------
select * from schsta               199364      47.95                   23         8668.00                   23
INSERT into DBAMON09                97400      23.42                    0           -1.00                    7
INSERT INTO DBAMON09                40656       9.77                    0           -1.00                    7
SELECT ATTR.ATTR_ID                 27630       6.64                27630            1.00                   45
INSERT into DBAMON09                14752       3.54                    0           -1.00                    7

  5 record(s) selected.

Calculating it at the statement level is particularly useful because it provides the worst queries over all to focus indexing efforts on. It is possible to use where clauses to pull out queries related to specific tables, though some of the ones that immediately come to mind (where uc(statement) like ‘%tablename%’) may not be very efficient in themselves.

How you may be able to avoid table scans

Can I stop every table scan? Absolutely not. If someone runs a ‘select *’ with no where clause, DB2 is just going to choose a table scan. It’s the right way to satisfy that kind of query. If you’re returning 90% of the table, it’s probably also appropriate. The line where it becomes inappropriate lies some place in the middle. Sometimes there are just too many conditions in the where clause to be able to handle the query any other way.

If I have a query that is executed very frequently and it returns about 10% of a table, that’s probably one I would want to index for.

That’s how to stop table scans – create indexes to access the data more efficiently. The first thing to look for in an index that I’m using to try to eliminate a table scan is to make sure that all of the columns in joins and where clauses are included in indexes in an order that DB2 can use them. Getting the right columns in those categories in indexes, you should be able to change a table scan into an index scan with a fetch back to the table to get any remaining data. db2advis is a great tool for figuring this out, though it’s not perfect.

The next level beyond that is index-only access. For “hogs and dogs” of SQL (the ones that perform bad and are frequently executed), index-only access is awesome. But it’s inefficient as far as storage space on disk, and if you go overboard with it you can negatively impact your insert performance. To achieve index-only access, you have to make sure every single column a query returns or references in any way is in one or more indexes in an order that DB2 can use. db2advis frequently recommends indexes that would achieve index-only access, and that’s one of the reasons it is important to carefully analyze the indexes it suggests – index-only access is not right for every query.

4 comments… add one

  • Hi Ember,

    Could you share your experience regarding index definitions.
    what is better for the index heading columns: highest to lowest or lowest to highest distinct values?.


    • Index column order depends first on the queries you’re indexing for. Putting columns in one order vs. another can allow one index to serve multiple queries. Even with the Jump Scans introduced in 10.1, db2 is still unlikely to use an index for a query that includes a column the query is not using in between columns that it is using. Indexing should depend on the SQL running more than the table itself. After that, it’s best to put the columns with the highest cardinality (number of distinct values) first. The reasoning for this is the structure of b-tree indexes – the more restrictive we are earlier in the index tree traversal, the fewer pages DB2 has to read. Even if the pages being read are in memory (and thus do not require I/Os), this can save CPU cycles. Does that help?

  • Hi again Ember,

    I’m playing with read efficiency and some other calculations from your various blogs using MON_GET_PKG_CACHE_STMT but I just can’t manage to get consistency between this table function and db2top ‘D’ option or SNAP_GET_DYN_SQL_V95. The last two are consistent in the results but MON_GET_PKG_CACHE_STMT always shows me something different (i.e. something simple like ordering by by total exec time or num executions).

    Honestly I don’t get it why MON_GET_PKG_CACHE_STMT shows not the same amount of values. Can you shed some light on this?.

    Thanks and regards

    • what specific differences are you seeing? differing numbers of statements, different statements, different values of counters/read efficency for the same statements?


Leave a Comment