DB2 Basics: What is a Reorg?
This is the first in a series of blog entries on reorgs. I was talking with a friend recently, and he pointed out that I only have a few articles on reorgs, and they’re very specific to complicated parts of reorg. I have scripted my own reorgs and made a point of educating myself on reorg, so I’m starting with several basic articles and may add in some more advanced articles.
Right now I plan to start with this article – What is a Reorg? – and do additional entries on how to determine if a reorg is needed and the different options when running a reorg. This series will also heavily cover reorgchk, and I may add another article in on the SQL that can be used to calculate reorgchk formulas.
Why Reorgs are Important
Reorgs are one of the keys to maintaining performance on a DB2 LUW database. One of the items on my list for the most basic regular tasks that DBAs need to perform is to run reorgs on the tables where reorgchk says they are needed on a regular basis. My databases have appropriate reorgs done on them weekly, though monthly may be frequently enough.
Over time, the regular activity in a database can cause minor issues that reorg can solve. If reorgs are not done in a timely manner, those minor issues can cause performance-killing and space-eating issues.
General Facts About Reorgs
A reorg is done at the table level, but may also specify an index. Some kinds of reorgs affect only tables or only indexes. A reorgchk can be run either at the table or the database level to help determine which kinds of reorgs are needed for what tables. Reorgchk requires current runstats to correctly make these determinations. Some kinds of reorgs make a table completely offline, while others keep the table online for a portion of the reorg.
Purposes of Reorgs
A clustering index in DB2 means an index that is defined with the
CLUSTER keyword. Clustering indexes means that DB2 attempts to maintain the table in the same order as the index. It is not guaranteed to be in the same order, though. If the data will not fit on the page that the clustering order would prefer, then DB2 will try nearby pages, but will not move data around to make it work. There is no clustering by default like there is for some other RDBMSes, like MS SQL Server. If you reorg a table with a clustering index, DB2 will reorder the table data to exactly match the order of the clustering index. If you add a clustering index, you will have to reorganize the table before the data will be reordered.
Clustering a Table Over an Index that is Not Defined as a Clustering Index
If you do not have a clustering index on a table, you can still use reorg to order the data in the order of any index, simply by specifying it in the reorg command. If you do this, DB2 makes no attempt to maintain the table data in the order of the index – it is only on reorg that this is done. If there is a clustering index on a table, you cannot reorg the data on a different index.
Grouping Empty Pages and Physical Continuity of Table Data
Over time, as rows are deleted from a table, the space that those rows took up is not freed. It may be reused by rows inserted into the tables, if they fit and APEND mode is not used for the table. But especially if a large amount of data has been deleted, the data on the table pages may actually be sparse. One of the main things that reorg does is to completely fill up the tables (respecting PCTFREE), and push all of the empty space to the end of the table. Depending on the reorg options you choose, it may also free that space up to be used by other tables in the tablespace. Other than freeing space on disk, the other advantage here is that data that may be frequently accessed together (by table scans or by prefetching) is physically located together on disk.
Compression Dictionary Generation
There are a number of times that compression dictionaries are generated for DB2 row level compression, and it may depend on your version. But there are situations where you may want to explicitly reset/recreate the dictionary, and there is an option on the reorg command to reset the compression dictionary
There are some operations that require a table reorganization after they are done. These include:
- Data type changes that increase the size of a varchar or vargraphic column
- Data type changes that decrease the size of a varchar or vargraphic column
- Altering a column to include
- Altering a column to inline LOBS
- Altering a column to compress the system default or turn off compression for the system default
- Altering a table to enable value compression
- Altering a table to drop a column
- Changing the
PCTFREEfor a table
- Altering a table to turn
- Altering a table or index to turn compression on
Note that some of the above are “reorg recommended” operations, where you can do up to 3 before a reorg is required, and you are limited in some of the operations that can be done against the table until a reorg is completed. Others require immediate reorganization before any actions may be performed against the table. Other things in the list above simply won’t get the benefit of the change until the reorg is completed.
In general, one of the purposes of reorgs may be to rebuild indexes. How, when, and even if this happens depends on the type of reorg you are running.
When rows are deleted from indexes, the RIDs are essentially marked as deleted, and not actually deleted. When they’re marked this way, they’re called pseudo deleted RIDs. There are two ways that these index entries are finally deleted. If MINPCTUSED is not set for the index (which is the default), then reorg can clean them up. When invoked for indexes, reorg can rebuild all of the indexes or it can go through a cleanup only mode to simply remove the pseudo-deleted rows and pages. This reorg process can solve problems with leaf page fragmentation, low-density indexes, indexes with too many levels, and also will respect PCTFREE – to ensure there is free space on index pages for additional inserts.
When reorging indexes for a table, you can only reorg all indexes for a table, and not an individual index. The exception is when you’re reorging non-partitioned indexes on a partitioned table.
This blog entry covered what reorgs do at a high level. Future entries about reorgs will cover reorgchk and how to determine what tables to reorg and the syntax and specifics on the various types of reorgs.