DB2 LUW – What is a Page?

The logical view of a database consists of the standard objects in any RDBMS – Tables, Indexes, etc. There are a number of layers of abstraction between this and the physical hardware level, both in the OS and within DB2.

Setting the Page Size

The smallest unit of I/O that DB2 can handle is a page. By default, the default page size is 4096. The default can be set to one of the other possible values when the databases is created using the PAGESIZE clause of the CREATE DATABASE statement.

The possible values for page size in a db2 database, no mater where it is referenced are:

  • 4K or 4,096 bytes
  • 8K or 8,192 bytes
  • 16K or 16,384 bytes
  • 32K or 32,768 bytes

Realistically, the page size is set for either a buffer pool or a tablespace. Setting it at the database level on database creation just changes the default from 4K to whatever you choose and changes the page size for all of the default tablespaces and the default bufferpool.

The page size for each bufferpool and tablespace is set at the time the buffer pool or tablespace is created, and cannot be changed after creation.

Tables can be moved to a tablespace of a different page size after creation using the ADMIN MOVE TABLE command, but that operation requires at the very least an exclusive lock, and may not support RI – I hear RI support is added in 10.1 Fixpack 2.

Choosing a Pagesize

In my experience, it is rare to have a database created with a different default page size. Every database I currently support has the default page size of 4K, and also has at least one tablespace and one bufferpool with each of the other page sizes.

The most common time you think about page sizes is when you’re creating a table. When DB2 stores data in a table, it CANNOT have a row size larger than the page size minus some overhead. So if you have a row greater than 4,005 bytes in width, you simply cannot keep it in a tablespace with a page size of 4K. The row size does not include large LOBs that you do not wish to in-line. But it does include the maximum value of every varchar field.

This is one area where DB2 differs from Oracle. To my knowledge, in Oracle, you can have a row that spans multiple pages. From what I hear, DB2 is planning to support that in a future release, but they’re planning to do it by plopping the rest of the row in a LOB – there was an audible groan when they announced that in a conference session I was in, due to the other problems in dealing with LOBs.

It is also important to think of the geometry of the table when choosing a page size. If you have a row size of 2010 bytes, that means that only one row will fit on every 4K page, and therefore nearly 50% of the space allocated to your table will be empty and wasted. A row size that large would do much better on an 8K or 16K or even 32K page. It is important to consider this for every table as you create it, and to revisit it with any column length alterations you make.

I have, on several different occasions, had a request from a developer to increase a column size, and have had to move the table to a different tablespace to accommodate the change because the increase pushed the row over the limit. Historically, moving a table between tablespaces has required the table to be offline – which can be problematic in a live production database. If you’re properly testing changes in at least one development environment, you will discover these kinds of issues before they get to production.

While page overhead is not exactly 100 bytes, it is usually not far from it, so to determine how many rows will fit on a page, you can usually use:

(pagesize-100)/maximum row length

Again, this does not count LOB data, but only LOB descriptors. LOB descriptors are stored on the data page. The remainder of the LOB is stored in another location, assuming you have not INLINED the LOBs. From the standpoint of a page, the main reason for using a LOB is to allow a large portion of unstructured data to be stored elsewhere – not directly on the data page. LOB descriptor sizes on the page depend on the size of the LOB and vary from 68-312 bytes.

Generally, smaller pages are preferable for OLTP and e-commerce databases because they allow you to handle less data at a time when you’re expecting smaller queries.

The total table size is another factor in choosing a page size. New tablespaces should generally be created as “LARGE” tablespaces. But “REGULAR” used to be our only option, and for REGULAR tablespaces with a 4K page size, the limit on table size in a DMS tablespace is just 64 GB (per partition). On more than one occasion I have dealt with hitting that limit, and it is not a pleasant experience. For LARGE tablespaces, the limit per partition for a 4K page size is 8 TB – much higher.

Since pagesize is set at the tablespace level, you can also consider the appropriate page size for your indexes, assuming you’re putting them in a different tablespace than the data. While you cannot select a tablespace for each index, but only an index tablespace for the table as a whole, you’ll want to consider possible indexing scenarios when chossing where your indexes go as well. The limit for index key (or row) size in db2 9.7 is the pagesize divided by 4. So for a 4k page size, it is 1024. Back on DB2 8.2, the limit was 1024 for all page sizes.

An interesting side note: if you’re not familiar with the “SQL and XML limits” page in any DB2 Info Center, I recommend looking it up and becoming familiar with it. That’s where I verified the index row limit and there are all kinds of interesting limits there.

Data Page Structure

Ok, this section is admitedly just a tad fuzzy. I had the darndest time getting information on this, even reaching out to some of my technical contacts at IBM. But I’m going to share what I do know in case it helps someone.

Each data page consists of several parts. First is the page header – a space of 91-100 bytes reserved for information about the page. The page header identifies the page number and some other mysterious information I cannot find much detail on.

Next comes the slot directory – which is variable in size, depending on how many rows are on the page. It lists RID’s that are on the page and the offset on the data page where the record begins. if the offset is -1, then that indicates a deleted record. In the structure of the rows themselves on the page, it appears that the records “start” at the bottom of the page. There may be open space between records due to any of the following:

  • Deleted records
  • Reduced size of VARCHAR values
  • Relocation of records due to increased size of the VARCHAR that will no longer allow the row to fit on the page

This open space cannot be used by additional records until after a reorg.

Finally, there may be continuous open space on a page that is left over or simply not used due to either deletes followed by reorgs or due to the pages simply not being filled yet.

DataPage

I found some references to each page also having a trailer, but they were not from sources I could consider credible, so there may or may not be a trailer on the page. Most of the information here comes from a page in the IBM DB2 Info Center. I would love to hear reader comments on this topic, or any references anyone may have with more detailed data.

Not every page in a table is a data page dedicated fully to user data. There is one extent of pages allocated for each table as the extent map for the table. Past a certain size, additional extent map extents may be required. There is also a Free Space Control Record every 500 pages that db2 uses when looking for space to insert a new row or move a re-located row.

Index Page Structure

Indexes are logically organized in a b-tree structure. Interestingly, the RIDs that index pages use are table-space relative in DMS tablespace containers, but object relative in SMS tablespace containers – perhaps this is one of the reasons there has never been an easy way to convert from SMS to DMS or vice-versa.

I have not been able to find a good representation of exactly what leaf and non-leaf pages look like for indexes. We do have the standard representation of pages in a b-tree index, that is:
b-tree

This shows us that we can expect to see index keys that delimit ranges on the root and intermediate non-leaf pages, and that on the leaf pages, we expect to see the index keys along with the RIDs that correspond to those index keys. There is still a page header that is between 91 and 100 bytes, But I don’t know if index leaf pages have the same slot directory that data pages do. Again, I welcome user comments and links on this topic.

Extent Size and Prefetch Size

The extent size and prefetch size are specified on a tablespace by tablespace basis(keywords EXTENTSIZE and PREFETCHSIZE on the CREATE TABLESPACE command) or as database-wide defaults (DFT_EXTENT_SZ and DFT_PREFETCH_SZ db cfg parameters). Extent sizes cannot ever be changed after tablespace creation. Prefetch sizes can be set to AUTOMATIC and be changed automatically by DB2, or can be altered manually using the ALTER TABLESPACE command. Both are specified as a number of pages.

The Extent size is the number of pages that are allocated to any objects in the each tablespace container at one time. The Prefetch size is the number of pages that are read into the bufferpool by the prefetchers for one read request. I’m not going to speak specifically to the tuning of these in this post.

Summary

By understanding pages, we come closer to understanding how DB2 handles some aspects of I/O. Minimally, a DBA needs to be able to pick the appropriate page size for a given table and it’s indexes.

You may also like...

14 Responses

  1. Brian says:

    Interesting article. So often, I have just used page sizes that were used in previous projects without giving them sufficient thought.

  2. Shivraj says:

    Very helpful and informative stuff.

  3. Anik says:

    Thanks again teacher 🙂

  4. Ted says:

    Hello, just wanted to take a quick moment to tell you how much I appreciate your website. I have been a db2 dba for 20 years and I still find myself learning something new on your website and refer to it quite often 🙂 Thanks!

  5. Aditya says:

    Its really informational and most important is the way of expressing the stuffs.
    Thanks a lot 🙂

  6. Pouria says:

    Thanks for your informative post. It clarifies a lot of details about the physical storage in DB2.
    I just have a quick question/confusion:
    (Assuming I am using DB2 10.1 – with *regular* table spaces) I have read in the IBM documentation that there cannot be more than 255 rows on a page. (for example here: http://www.ibm.com/developerworks/data/library/techarticle/0212wieser/ ). I am just trying to compare this one with your formula in “Choosing a page size” which seems to consider the whole space in a page (minus 100 bytes of overhead). I am just wondering if that formula is only correct for large tablespaces, or if there is sth about regular tablespaces that I am missing here.

    • Ember Crooks says:

      I would never use regular tablespaces anymore – only large ones.

      Large tablespaces have varying possible numbers of rows per page:
      Max rows per page

      There is a maximum number of rows per page, and for particularly narrow tables, that has to be considered.

  7. Hi Ember! This article you wrote still hot after 3 years. I have an additional question about performance when using larger tablespaces. Although I administrate a DB2 10.5 OTLP database with small tables that fit into a 4k pagesize tablespace, when I transfered them to a 32k prior to a compress and evaluated some performance tests, all DML operations performed a lot faster. It was not a problem with the old tablespace because I already have recreated maintaining 4k pagesize only changing from regular to large in order to allow more rows – it has reached regular maximum capacity. I have tested this approach in another databases, all of them OLTP and observed a reduction in I/O and faster SQL responses. All tablespace creation was made with default EXTENTSIZE and PREFETCH size, and all bufferpools are AUTOMATIC with STMM. I really cannot understand this, because it goes in the wrong way of all DB2 documentation. Is there any reason to see this increase in performance? BTW, all tables runs a lot of INSERT, UPDATE and DELETE and the tests considered them without compression.

    • Ember Crooks says:

      Without asking a lot of questions, I could not speculate as to why. There certainly could be reasons that things would work that way, but you’re right that generally, for OLTP, smaller page sizes are thought to perform better. EVERY single recommendation depends on so many factors, that when you can do testing like that, it is excellent to do so. It could be something as simple as a 32K page size matching up better with your storage stripe, strip, or LUN sizes or any one of a dozen other things.

      • Thanks for the quick response. I take care of 4 different customers running the same ERP software with DB2 9.7 and DB2 10.5. They’re all different in hardware specs, and I performed a test with the same table in all environments migrating to a 32k large tablespace without compress. At one case, a very complex ETL SQL SELECT that take more than 1 hour to execute turns out to finish in 3 minutes after the change (and I have migrated to a new 4k tablespace before the tests). I have access to an test environment pretty near to the production and trying different combinations of EXTENT and PREFETCH sizes to find out what’s going on.

        • Ember Crooks says:

          I am not surprised that a complex ETL SQL would perform better. However, the number one performance priority would usually be the transactions, and I’m curious as to the performance difference, if any for them.

          Queries dealing with larger amounts of data will generally benefit from a larger page size, particularly if the table is not very contiguous on disk.

  8. Aamir Malik says:

    I am in process of database migration from DB2/VSE 7.5.0 to DB2 LUW 10.5. How can i optimize the selection of page size, extent size, prefetchsize for migrated database.

    • Ember Crooks says:

      If the database being moved is a Data Warehouse or Analytics environment, then 32K is generally the page size to go with. If it’s OLTP, you may want to consider a smaller page size (4K or maybe 8K) for the tables that are particularly likely to singleton-row updates, inserts, and deletes. If you have the opportunity, it is best to test and compare configurations to see which is actually faster. Extent and prefetch size are going to be more dependent on your storage characteristics. If you can actually get details on your storage such as the strip size for a RAID array, then matching your extent size to to that is very useful. Prefetch size is often smaller for OLTP, and larger for DW/Analytics.

Leave a Reply

Your email address will not be published. Required fields are marked *