Runstats – What and Why

You may also like...

14 Responses

  1. Great article.

    I wanted to say, that if you want to control how the statistics are executed, you must probably use a profile, and then, the automatic maintenance will respect that profile. This is a good reason to activate the automatic runstats and still have the control, by configuring the statistic profiles.
    “Automatic statistics collection respects the profile a user has specified by using the registered profile option in the SYSCAT.TABLES catalog table. If no profile is specified, automatic statistics collection uses the following options when collecting statistics:”

    • Ember Crooks says:

      But I do have to set a profile for each individual table, right? I can see that working, but as a control freak, I also want to know exactly when runstats happened across the board. When someone asks, I want to be able to say “Saturday Morning” or “Last Night” without even thinking about it.

  2. xandra says:


    Do I need to execute runstats for each table? can’t we run by schema?


  3. Amit says:

    What is recommendation for running runstats, say my data changes by x% then I should do runstats? What is the recommended x%, 10%, 20% 30 % etc

    • Ember Crooks says:

      I don’t have that number. I don’t like that method of running runstats, so I don’t know the thresholds associated with it.

  4. Murthy says:

    Is there any tool or SQL stmt to know when to run runstats as we do with REORGCHK.

    • Ember Crooks says:

      Usually if you’re doing runstats manually, then you have a fixed interval that you do runstats for all tables or perhaps categorize tables based on how often you need to do runstats on them based on data change rates (daily, weekly, monthly, etc).

      The driving factor is often data change rates, but I don’t have a specific query to track that. DB2’s automated runstats facilities do make decisions based on data change rates. But I don’t use them because I’m a control freak and I want to know when the last runstats was. Also because the data I’m querying in an e-commerce database tends to be the most recently added data.

      Personally I do runstats on all tables either daily or weekly depending on whether I’m in a build or post-go-live phase where data changes quickly or if I’m just in normal operation. This works well in e-commerce databases. I can see a point being made for different frequencies in different types of databases.

  5. Sanjay Trehan says:

    Few tables have range partitioning with 1 to 5 billion rows. Therefore, there are approximately 450 million rows per partition (with a non-partitioning index and 4 partitioning indexes). Can the optimizer use the statistics of the partition, if available, rather than the table statistics?

  6. Husain Bhavnagarwala says:

    If we use the runstats command similar to Ember’s (db2 runstats on table schema.table with distribution and detailed indexes all), do I need to run runstats on the individual indexes of that table (db2 runstats on table schema.table for index …..) ?

    • Ember Crooks says:

      First, you cannot do runstats on an individual index unless you are using range partitioning. Second, no, that syntax does all indexes in addition to the table.

  7. Shoaib says:

    Thanks Ember, how do you determine what columns currently have statistics collected on them. We have “Help stats tablename” in teradata, is there anything similar in db2. I’m accessing db2 from TSO

    • Ember Crooks says:

      In DB2, we generally go with all columns. I don’t have the sql off the top of my head, but I think you’d be able to see if only specific columns have distribution stats by querying sysstat.columns. A powerful strategy can also be to gather statistics on columns that are frequently queried together or joined on together and have correlations of some sort. See for information on how to see if that has been done.

Leave a Reply

Your email address will not be published. Required fields are marked *