DB2 Administrative SQL Cookbook: Listing Tables and Current Size

Note: updated on 2/6/2015 to include long, lob, and xml data at a commenter’s suggestion.

The DB2 Administrative SQL Cookbook is a series of blog entries that are short. Each focuses on one or a few SQL statements that may be useful when administering DB2 databases. They usually address system catalog views, snapshot tables/views, or snapshot table functions that are standard in all DB2 LUW implementations. The version the SQL has been tested on is noted.

Purpose

To list tables and their current sizes in descending order by size. Also lists whether table is column or row organized. Can be useful for finding large tables or for identifying large tables that do not match the organization of other tables in the database.

Version

Used on DB2 10.5, Fixpack 5. Version provided for older DB2 versions, below.

Statement

select  substr(t.tabschema,1,18) as tabschema
        , substr(t.tabname,1,40) as tabname
        , (COL_OBJECT_P_SIZE + DATA_OBJECT_P_SIZE + INDEX_OBJECT_P_SIZE + LONG_OBJECT_P_SIZE + LOB_OBJECT_P_SIZE + XML_OBJECT_P_SIZE)/1024 as tab_size_mb
        , tableorg
from    syscat.tables t
        join sysibmadm.admintabinfo ati
                on t.tabname=ati.tabname
                and t.tabschema=ati.tabschema
where   t.type='T'
        and t.tabschema not like ('SYS%')
order by 3 desc
with ur

Version that will probably work on 9.7 and up:

select  substr(t.tabschema,1,18) as tabschema
        , substr(t.tabname,1,40) as tabname
        , (DATA_OBJECT_P_SIZE + INDEX_OBJECT_P_SIZE)/1024 as tab_size_mb
from    syscat.tables t
        join sysibmadm.admintabinfo ati
                on t.tabname=ati.tabname
                and t.tabschema=ati.tabschema
where   t.type='T'
        and t.tabschema not like ('SYS%')
order by 3 desc
with ur

Sample Output

TABSCHEMA          TABNAME                                  TAB_SIZE_MB          TABLEORG
------------------ ---------------------------------------- -------------------- --------
WEBFILES_ODS       DI_DOCS                                                  2654 C
EBT_ODS            EBT_ACT_TYPE1                                            1573 C
WEBFILES_ODS       WBA_CASE_FILES                                           1353 C
SSIRS              ES_CSLD_VER2                                             1056 C
EBT_ODS            EBT_ACT_TYPE2                                             659 C
DW_VRU             CALL_HISTORY_TBL                                          191 R
GIS_UTIL           GEOCODED_STD_CIS_ADDR                                     181 C
GIS_UTIL           ARCGIS_CWS_RBC1                                           169 C
GIS_UTIL           GIS_ADDRESS_ARCHIVE                                       168 C
...

Caveats and Modifications

Hard numbers are used to limit the length of the schema and table names. If you have longer schema and table names, you may want to increase these numbers.
The table size reported here is the physical size of the table and it’s indexes. I find this more useful than a logical size or looking only at table size when I want to figure out what my largest tables really are.
Strictly speaking “with ur” is not needed, but I’m in the habit of adding it to the end of every query that doesn’t have an isolation requirement.

You may also like...

9 Responses

  1. Sandeep Sharma says:

    Hi Amber,

    Its a good article but have a little doubt, while calculating the size of a table shouldn’t we include LONG_OBJECT_P_SIZE, LOB_OBJECT_P_SIZE and XML_OBJECT_P_SIZE as well with DATA and INDEX object ?

    thanks..!!

  2. Anyanso P Agwu says:

    Ember,
    This is not a response to this post but more of a question regarding some earlier post that you wrote on “Your new best friend…the db2 MON_* table functions and views”. I want to ask you if by any chance you are in a position to share how you used these new functions and views especially how you built your tables for history/trending performance metrics. I am in need of how to do this and would appreciate your experience and guidance in this matter. Obviously you can guess I follow your blog religiously…
    Do let me know if you are able to help…

    Thanks

    A, Phillips Agwu

    • Ember Crooks says:

      I am not able to share my full code at this time, as it belongs to my former employer. I spent quite a bit of time on it, and was quite proud of the results. I started with the methodology and the scripts in this article:
      http://www.ibm.com/developerworks/data/library/techarticle/dm-1009db2monitoring1/

      Then I added history tables and table functions to populate those history tables. The history tables consisted of all of the columns of the original mon_get table function plus columns for a baseline timestamp, a snapshot timestamp and a record type. I added a group of table functions and tables to cover mon_get_pkg_cache_stmt, which is not in the referenced article. I also created functions to properly summarize data so I could store less granular data longer – in many databases, I found that a week’s worth of data was close to 5 GB. I then had a perl script that would:

      1. Capture baseline values (reset)
      2. Wait some specified amount of time (usually an hour, but for some situations, I would use 5 or 15 minutes – it was a variable I could pass in)
      3. Capture the history values to the history tables
      4. Summarize old data based on some retention guidelines
      5. Delete the oldest data from the database

      The perl script then ran hourly or so on all of my databases so I could go out and get trends and look at past performance. I kept all of the data in a separate tablespace in a separate storage group on a separate filesystem that did not auto resize so I did not risk my monitoring data filling anything up that would affect database availability.

      I could then query my history data the exact same way I would query the MON_GET table functions, simply adding a time component to it. I had queries that would help me find problem hours, and queries that would let me specify an hour or time range to see data on critical KPIs.

      Honestly, it was a heck of a lot of work, and I spent weeks on it. The data, however, was invaluable and easy to query and made performance troubleshooting so much easier. If you have the ability to get a third party performance tool, that’s probably easier and cheaper. But I learned so much in the process. After completing these, I was able to pass the certification exam for 9.7 Stored Procedure Developer cold, with no study time at all, and it’s not an easy exam. I used Merge statements for the first time, and learned what columns actually uniquely identified the various objects – some of them were surprising or hard to figure out.

  3. Christian says:

    Hello everyone,
    wow, thats a nice query. 🙂
    But what does the TableOrg column actually mean? I noticed the values “c” and “r” – does this stand for column or row?
    Thanks in advance.
    Best regards,

    Christian

  4. Jeroen says:

    I wonder why you need to join with SYSCAT.TABLES as both ‘tabschema’ and ‘tabname’ are included already in the ADMINTABINFO view.

  1. January 5, 2016

    […] DB2 Administrative SQL Cookbook: Listing Tables and Current Size […]

  2. March 3, 2016

    […] DB2 Administrative SQL Cookbook: Listing Tables and Current Size […]

Leave a Reply

Your email address will not be published. Required fields are marked *