Loading Data into DB2 BLU Tables

Working with BLU has been a glorious learning experience for me. I say that truly, with both excitement and frustration. Whenever I have a DBA-character-building experience, I'm bound to share it with my readers.

The Scenario

A client has a small (1.7 TB) DPF data warehouse/ODS that I'm helping them move from 9.5 to 10.5 using BLU. Only a small subset of the data needs to be moved. I started by trying to load from a pipe, thinking that would be the fastest, but I learned a lot about how LOAD works for column-organized tables.

The Phases of LOAD for Column-Organized Tables

There are four major phases of LOAD for column-organized tables:

ANALYZE
LOAD
BUILD (indexes)
DELETE

On my first large table, I found that ANALYZE took 2.5 hours, LOAD took 20 minutes, and BUILD took 8 minutes. I was worried about the ANALYZE time. After a lot of online searching turned up precious few real pieces of information, I reached out to some friends who work for Big Blue to better understand what was going on.

ANALYZE

The analyze phase is critical. This is where compression dictionaries and BLU synopsis tables are built. I found references to shortening this phase by building a dictionary based on a subset of the data. But the magic of BLU is strongly dependent on really awesome compression. And really awesome compression comes from having the best possible data.

This phase is only triggered if:

  • Dictionaries are needed for the column-organized table
  • LOAD REPLACE or LOAD INSERT into an empty table is used
  • KEEPDICTIONARY is not specified

What DB2 is doing during the ANALYZE phase is building histograms to track the frequency of data values across all columns. The column compression dictionaries, which use approximate Huffman encoding (the most frequent values use the least space), are built from those histograms. For compiling this information, it is absolutely critical that the utility heap (UTIL_HEAP_SZ) is gigantic. You can use AUTOMATIC, but keep in mind that in the minimum recommended size environment (8 cores, 64 GB of memory), it is recommended that UTIL_HEAP_SZ be one million pages in size. If you have 128 GB of memory, then UTIL_HEAP_SZ should be four million pages in size.
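
As a sketch, checking and raising the utility heap looks something like this. `SAMPLE` is a placeholder database name, and the page count is the sizing guidance above (one million 4 KB pages for a 64 GB server); the commands are built as strings and echoed so they can be reviewed without a live DB2 instance:

```shell
# Sketch only: inspect and raise UTIL_HEAP_SZ (the utility heap, in 4 KB pages).
# "SAMPLE" is a placeholder database name. Commands are echoed, not executed,
# so the sketch can be reviewed before running against a real instance.
DBNAME=SAMPLE
CHECK_CMD="db2 get db cfg for $DBNAME | grep -i util_heap_sz"
# Setting a starting value with AUTOMATIC gives a large floor that DB2 can grow.
SET_CMD="db2 update db cfg for $DBNAME using UTIL_HEAP_SZ 1000000 AUTOMATIC"
echo "$CHECK_CMD"
echo "$SET_CMD"
```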

Additionally, if you are loading from a source that can only be scanned once, DB2 needs to scan that data twice – once for the ANALYZE phase and once for the LOAD phase. This means for pipes or other sources that can only be scanned once, DB2 will create a copy of the data on the TEMPFILES PATH specified on the load command. This can be time consuming and use a lot of disk space.
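
A hedged sketch of what that looks like on the LOAD command, with TEMPFILES PATH pointed at a filesystem that has plenty of free space (`datapipe1`, `/db2tmp/loadtmp`, and `schema.table` are all placeholder names, and the command is echoed for review rather than executed):

```shell
# Sketch: LOAD from a single-scan source (a pipe) with an explicit TEMPFILES PATH
# so the ANALYZE-phase copy of the data lands on a filesystem with enough room.
# All names are placeholders; the command is echoed, not run.
TMPPATH=/db2tmp/loadtmp
LOAD_CMD="db2 \"load from datapipe1 of del tempfiles path $TMPPATH messages test_load.msg replace into schema.table nonrecoverable\""
echo "$LOAD_CMD"
```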

LOAD

The LOAD phase is modified for column-organized tables. The following occurs for column-organized tables during the LOAD phase:

  • Column and page compression dictionaries (existing or built in the ANALYZE phase) are used to compress data
  • Compressed values are written to data pages
  • The Synopsis table is maintained
  • Keys are built for the page map index and any unique indexes

BUILD

For column-organized tables, the BUILD phase includes building the page map index and any unique indexes. Because this is BLU, no indexes other than Primary Key or Unique Constraint indexes are allowed, so this phase should be faster than in a traditional row-organized scenario.

DELETE

In addition to its traditional role of deleting any rows that are rejected because of primary key or unique constraint violations, the DELETE phase for BLU tables includes deleting any temporary files that were created so the data could be scanned twice.

Real World Experimentation

I don't pretend that my tests are exhaustive. I primarily wanted to compare, all other things being equal for a randomly selected table, the speed of the overall operation and of the individual phases when loading into a brand-new column-organized table in a fairly new BLU database with no other activity against it.

The table I'm using has CHAR, DATE, TIMESTAMP, and INTEGER data types, no LOBs, and no primary key or unique constraints. It has 64,371,541 rows in the source database, spread across 4 data partitions on one physical server. I'm dropping and re-creating the table between each test to ensure that nothing is hanging around from the last try.

Load from Pipe

The process for load from pipe is not difficult, but requires that the target database server can connect to the source database server. It also requires two sessions. The general steps are:

  1. (session 1) Connect to the source database from the server where the target database is
  2. (session 1) Make a pipe using the mkfifo command: `mkfifo datapipe1`
  3. (session 1) Export to the pipe created: `db2 "export to datapipe1 of del messages test.msg select * from schema.table with ur"`
  4. (session 2) Connect to the target database locally
  5. (session 2) Load from the pipe created: `db2 "load from datapipe1 of del messages test_load.msg replace into schema.table NONRECOVERABLE"`
  6. (session 3) Use `db2pd -utilities -repeat 5` to monitor the load progress

The db2pd -utilities command is good for monitoring because you can see progress and the time that each phase starts. If you’re expecting a longer load, you may want to go with a longer refresh interval than 5 seconds.
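
The pipe mechanics themselves can be demonstrated without DB2 at all. Here is a minimal sketch in which a background writer stands in for the export session and a reader stands in for the load session; only the FIFO plumbing is real, the DB2 steps are simulated:

```shell
# Demonstrate the FIFO plumbing behind load-from-pipe: a background writer
# (standing in for the "export" session) feeds the pipe while a reader
# (standing in for the "load" session) drains it. No DB2 required.
PIPE=$(mktemp -u /tmp/datapipe.XXXXXX)
mkfifo "$PIPE"
# "Export" session: write two delimited rows into the pipe in the background.
printf '1,"a"\n2,"b"\n' > "$PIPE" &
# "Load" session: read everything the writer sends through the pipe.
cat "$PIPE"
wait
rm -f "$PIPE"
```

Note that the writer blocks until a reader opens the pipe, which is why the export and load must run in separate sessions.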

Here’s the output from the end of my test scenario:

Utilities:
Address            ID         Type                   State      Invoker    Priority   StartTime           DBName   NumPhases  CurPhase   Description
0x000000020428F2E0 8          LOAD                   0          0          0          Wed Jan  7 18:52:27 SAMPLE   4           4           [LOADID: 126.2015-01-07-18.52.27.892240.0 (8;4)] [*LOCAL.db2inst1.NNNNNNNNNNNN] OFFLINE LOAD DEL AUTOMATIC INDEXING REPLACE NON-RECOVERABLE SCHEMA.TABLE1

Progress:
Address            ID         PhaseNum   CompletedWork                TotalWork                    StartTime           Description
0x000000020428F668 8          1          0 bytes                      0 bytes                      Wed Jan  7 18:52:27 SETUP
0x000000020428F820 8          2          64371541 rows                64371541 rows                Wed Jan  7 18:52:27 ANALYZE
0x000000020428F9A8 8          3          64371541 rows                64371541 rows                Wed Jan  7 19:18:54 LOAD
0x000000020428FB30 8          4          2 indexes                    2 indexes                    Wed Jan  7 19:41:55 BUILD

Database Member 0 -- Active -- Up 1 days 06:37:54 -- Date 2015-01-07-19.42.04.469122

The export command finishes at the end of the ANALYZE phase.
Note: I watched the load on a 5-second interval using the -repeat option of db2pd. There were times in the ANALYZE phase when the load seemed to hang, not advancing the row count for several minutes at a time – then it would jump forward – so if you see this, don't panic. Interestingly, I saw similar pauses in the LOAD phase at exactly the same numbers: at 15,604,493 rows for some reason, and at a couple of other points.

The results for this method: a total export/load time of 50 minutes – the ANALYZE phase took about 26 minutes, the LOAD phase about 23 minutes, and the BUILD phase about a minute. There was no DELETE phase.

Load from Cursor

Load from cursor for a remote source is not as hard as I had expected. There is no need to federate the source database. There are good instructions on how to load from cursor here: http://www.ibm.com/developerworks/data/library/techarticle/dm-0901fechner/

There are essentially three steps once the source database is cataloged on the server of the target database:

  1. On target server, connect to target database
  2. Declare cursor against the source database: `db2 "DECLARE C1 CURSOR DATABASE sourcedb user uname using pw for select * from schema.table with ur"`
  3. Load from cursor: `db2 "LOAD FROM C1 of CURSOR MESSAGES test_load.msg replace into schema.table nonrecoverable"`
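
The prerequisite cataloging step can be sketched like this. The node name, hostname, port, and database names are all placeholders, and the commands are echoed for review rather than executed:

```shell
# Sketch: catalog a remote source database so that DECLARE CURSOR DATABASE can
# reach it without federation. srcnode, srchost.example.com, 50000, and
# sourcedb are placeholders; commands are echoed, not run.
CATALOG_NODE="db2 catalog tcpip node srcnode remote srchost.example.com server 50000"
CATALOG_DB="db2 catalog database sourcedb at node srcnode"
echo "$CATALOG_NODE"
echo "$CATALOG_DB"
```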

And as for the results, all I can say is “Whoa, that was fast!”

It looked like this:

Utilities:
Address            ID         Type                   State      Invoker    Priority   StartTime           DBName   NumPhases  CurPhase   Description
0x000000020373EF80 10         LOAD                   0          0          0          Wed Jan  7 20:48:53 SAMPLE   4           4           [LOADID: 335.2015-01-07-20.48.53.091244.0 (8;4)] [*LOCAL.db2inst1.NNNNNNNNNNNN] OFFLINE LOAD CURSOR (DB) AUTOMATIC INDEXING REPLACE NON-RECOVERABLE SCHEMA.TABLE1

Progress:
Address            ID         PhaseNum   CompletedWork                TotalWork                    StartTime           Description
0x000000020373F308 10         1          0 bytes                      0 bytes                      Wed Jan  7 20:48:53 SETUP
0x000000020373F4C0 10         2          64371541 rows                64371541 rows                Wed Jan  7 20:48:59 ANALYZE
0x000000020373F648 10         3          64371541 rows                64371541 rows                Wed Jan  7 20:57:51 LOAD
0x000000020373F7D0 10         4          2 indexes                    2 indexes                    Wed Jan  7 21:00:21 BUILD

Database Member 0 -- Active -- Up 1 days 07:56:18 -- Date 2015-01-07-21.00.28.254954

In this case, the ANALYZE phase took 9 minutes, the LOAD took 3 minutes, and the BUILD phase took less than a minute, for a grand total of 12 minutes. That’s 1/4 the time the exact same data and the exact same table took with load from pipe.

Load from File

Finally, I thought I would try the file method as well. In this case, I’ll connect to the source database from the target database, export to a file on the target database server, and then load from that file on the target database server.

Export took 10 minutes:

-bash-4.1$ date; db2 "export to schema_table1.del of del messages manual_export_test.msg select * from schema.table1 with ur"; date
Wed Jan  7 21:13:14 PST 2015

Number of rows exported: 64371541

Wed Jan  7 21:23:53 PST 2015

The load portion was faster, particularly the ANALYZE phase:

Utilities:
Address            ID         Type                   State      Invoker    Priority   StartTime           DBName   NumPhases  CurPhase   Description
0x000000020477F100 11         LOAD                   0          0          0          Wed Jan  7 21:27:24 SAMPLE   4           3           [LOADID: 329.2015-01-07-21.27.24.015656.0 (8;4)] [*LOCAL.db2inst1.NNNNNNNNNNNN] OFFLINE LOAD DEL AUTOMATIC INDEXING REPLACE NON-RECOVERABLE SCHEMA.TABLE1

Progress:
Address            ID         PhaseNum   CompletedWork                TotalWork                    StartTime           Description
0x000000020477F488 11         1          0 bytes                      0 bytes                      Wed Jan  7 21:27:24 SETUP
0x000000020477F640 11         2          64371541 rows                64371541 rows                Wed Jan  7 21:27:24 ANALYZE
0x000000020477F7C8 11         3          63865512 rows                63865512 rows                Wed Jan  7 21:28:04 LOAD
0x000000020477F950 11         4          0 indexes                    2 indexes                    NotStarted          BUILD

Database Member 0 -- Active -- Up 1 days 08:26:30 -- Date 2015-01-07-21.30.40.629176

The ANALYZE phase took about 1.5 minutes, the LOAD phase took about 2 minutes, and the BUILD phase took less than 5 seconds.

That's a total of about thirteen minutes, including the export. In this case, load from file was faster than load from pipe and slightly slower than load from cursor.

Note that the count of indexes LOAD reports is 2, but I have no explicitly created indexes (no primary keys or unique constraints). One of the indexes it is building is the page map index for the table. I'm not sure what the other index is, but in my work, LOAD consistently lists two more indexes than anything I have explicitly defined.

Another scenario

I wanted to make sure there wasn't some factor I wasn't accounting for that caused the extreme difference for load from pipe. I thought maybe it was pulling pages into the bufferpool on the source server or something, so I tried a different table, with 97,110,306 rows and a primary key. In this case, I did the export-to-file method first and the pipe method second to compare.

File

For the export to file, it took 17 minutes to export – this was connecting from the target server to the source server, so that includes network time. In the output below, you can see that the load itself took 2 minutes in the ANALYZE phase, 10 minutes in the LOAD phase, and 2 minutes in the BUILD phase, for a total of 14 minutes.

Utilities:
Address            ID         Type                   State      Invoker    Priority   StartTime           DBName   NumPhases  CurPhase   Description
0x00000002036CEE00 12         LOAD                   0          0          0          Thu Jan  8 06:05:25 SAMPLE   4           4           [LOADID: 482.2015-01-08-06.05.25.336041.0 (20;4)] [*LOCAL.db2inst1.NNNNNNNNNNNN] OFFLINE LOAD DEL AUTOMATIC INDEXING REPLACE NON-RECOVERABLE DBA     .TEMP_TABLE2
Progress:
Address            ID         PhaseNum   CompletedWork                TotalWork                    StartTime           Description
0x00000002036CF188 12         1          0 bytes                      0 bytes                      Thu Jan  8 06:05:25 SETUP
0x00000002036CF340 12         2          97110306 rows                97110306 rows                Thu Jan  8 06:05:25 ANALYZE
0x00000002036CF4C8 12         3          97110306 rows                97110306 rows                Thu Jan  8 06:07:32 LOAD
0x00000002036CF650 12         4          0 indexes                    3 indexes                    Thu Jan  8 06:17:12 BUILD

Database Member 0 -- Active -- Up 1 days 17:14:58 -- Date 2015-01-08-06.19.08.353244

Pipe

The load using pipe data is below. The ANALYZE phase took 28 minutes, the LOAD phase took 13 minutes, and the BUILD phase took about a minute, for a total of 42 minutes – about twice as long, which is consistent with what I saw with the previous table in the previous experiment.

Utilities:
Address            ID         Type                   State      Invoker    Priority   StartTime           DBName   NumPhases  CurPhase   Description
0x00000002023DF5E0 13         LOAD                   0          0          0          Thu Jan  8 06:30:53 SAMPLE   4           4           [LOADID: 482.2015-01-08-06.30.53.492253.0 (10;18)] [*LOCAL.db2inst1.NNNNNNNNNNNN] OFFLINE LOAD DEL AUTOMATIC INDEXING REPLACE NON-RECOVERABLE DBA     .TEMP_TABLE2

Progress:
Address            ID         PhaseNum   CompletedWork                TotalWork                    StartTime           Description
0x00000002023DF968 13         1          0 bytes                      0 bytes                      Thu Jan  8 06:30:53 SETUP
0x00000002023DFB20 13         2          97110306 rows                97110306 rows                Thu Jan  8 06:30:53 ANALYZE
0x00000002023DFCA8 13         3          97110306 rows                97110306 rows                Thu Jan  8 06:58:13 LOAD
0x00000002023DFE30 13         4          3 indexes                    3 indexes                    Thu Jan  8 07:07:00 BUILD

Summary

Be careful not to over-extrapolate from my little test cases. So many factors can make a difference, and different strategies may work better in different scenarios. Overall, I plan to use load from cursor the most: it is the easiest, it is the method least likely to connect to the wrong database by accident, and it's easy to script quickly. I do wonder whether, for particularly large tables, load from file might be the best.
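
The scripted form of load from cursor boils down to something like this sketch. The database, credential, and schema names are all placeholders, and the generated commands are echoed for review rather than executed:

```shell
# Sketch: generate a DECLARE/LOAD pair per table for load from cursor.
# sourcedb, uname, pw, SRC, and TGT are placeholders; the commands are
# echoed rather than run so the output can be reviewed first.
SRCSCHEMA=SRC
TGTSCHEMA=TGT
printf 'TABLE1\nTABLE2\n' > tables.txt   # stand-in for the real table list
while read -r TAB; do
  echo "db2 \"DECLARE C1 CURSOR DATABASE sourcedb USER uname USING pw FOR SELECT * FROM ${SRCSCHEMA}.${TAB} WITH UR\""
  echo "db2 \"LOAD FROM C1 OF CURSOR MESSAGES ${TAB}.msg REPLACE INTO ${TGTSCHEMA}.${TAB} NONRECOVERABLE\""
done < tables.txt
rm -f tables.txt
```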

14 Responses

  1. Brenda Boshoff says:

    I can only agree with your findings. In our case, we have settled on load from cursor for BLU, as we load about 200 GB from z/OS DB2 v10.1 row-based tables to BLU 10.5.4 every night. Not much, but nevertheless. The loads to BLU v10.5.4 are on average a third faster than the load times (also using load from cursor) to the old row-based v9.7.5 database pre-BLU. The utility heap size was the biggest challenge, as you found as well. Ours is now set to a "trial and horror" minimum with AUTOMATIC on. I fully expected the load from cursor to take longer loading into BLU with the extra phase, but this has not been our experience at all. All the load jobs run faster. We have kept the primary keys.
    I need a t-shirt that says “I ❤️ BLU”

  2. Tomas says:

    Load from cursor from remote database without federation? Mind = blown. Funny how the syntax diagrams for DECLARE CURSOR make no mention of this functionality… I’ve gone through every infocenter from 9.1 thru 10.5. Not a peep. This is some seriously useful functionality, why hide it?

    “I learned something today.” My favorite kind of day – thanks Ember!

  3. Dmitry says:

    Hi Ember,

    Thanks for sharing your BLU experience.
    Did you try IMPORT/INSERT into column-organized tables?

    On my 10.5.4 environment (IBM says INSERT performance is improved several times over in this release), I see that IMPORT is 15 times slower (with any COMMITCOUNT) than LOAD – this was an import/load test of a 17M-row table from a file. Also, compression after inserts into a column-organized table is poor (8 times worse compared to LOAD).

    Best regards, Dmitry

    • Ember Crooks says:

      I did not try IMPORT/INSERT for two reasons. First, I know them to be slower than nonrecoverable LOAD in just about every situation. Second, the ONLY way you can build a compression dictionary for column-organized tables is through the LOAD command – you cannot do it with reorg. So if you must use IMPORT/INSERT, you must first attempt to generate a representative compression dictionary with a LOAD command using the RESETDICTIONARYONLY option to not actually load any data into the table.

  4. Norman Wong says:

    You can load with keepdictionary to avoid the analyze phase. If you have a very large table to load, load a representative sample first, then reload with the keepdictionary option. This can cut a huge amount of time.

    Check your percent column compressed to see if the sample is representative.

    • Ember Crooks says:

      The scenarios in which I would use this method are:

      1. The data for a table is over 128 GB
      2. For some reason I really cannot take the time for a full analyze

      I really prefer to use the full data whenever possible, because getting the compression right is so critical to future performance.

  5. Kulvant says:

    Hi Ember,

    We are offloading data into files but during offloading we are facing issues as below.

    The data stored in Database
    4.789928258849587598785605

    The data offloaded in a file
    4.7899282588495876

    Column is of datatype DOUBLE

    Could you please help here ?

    • Ember Crooks says:

      I don't know that answer, nor can I find it in quick searching. I would try a different export format if it works (IXF if you're using DEL, for example) and see if that changes things at all. I would also verify that the query doesn't constrain the value returned in any way. It does seem the sort of thing that a PMR would be able to resolve rather easily.

  6. Luis Reina says:

    Hi Ember,
    Great post, thanks! There is one more possibility that is even faster than the LOAD from cursor: you can use db2hpu to get the data out on the source machine and, instead of sending the output to a file (if BLU is on a different machine), use ssh to send it over the network to a pipe, and then LOAD from the pipe on the BLU machine.
    I will write a post with an example, and if you want I can share it here.
    By the way, I got to know you at IOD last year; you were so kind to come to the presentation I did with Raul Baron.
    Luis Reina

    • Ember Crooks says:

      True. HPU is an additionally licensed feature, so I don't tend to think of it. I found traditional load from pipe slower than load from cursor. I would be happy to see what you write up on it.

  7. Ram says:

    Hi Ember,

    Thanks for sharing insights in a very easy and understandable way. Your articles have been very useful for me. I am working on a script to automate declare and load from cursor. This is needed as the load has to be done frequently. I have created the part of the script that takes the list of both source and target tables, but I am unable to proceed to the declare cursor and load from cursor part. This is what I have done so far:

    #!/bin/ksh
    # cursor_load.ksh
    # parm1: Database name required
    # parm2: Source schema name required
    # parm3: Target schema name required
    . ~hipdvins/sqllib/db2profile

    DBNAME=$1
    SRCNAME=$2
    TARSNAME=$3
    OWN=$4

    if [[ $# -lt 3 ]]
    then
        echo "Usage: cursor_load.ksh <database name> <source schema> <target schema> [owner]"
        exit 1
    fi

    db2 connect to $DBNAME
    db2 -x "select rtrim(tabschema) || '.' || tabname from syscat.tables where tabschema = '$SRCNAME' and type = 'T' and owner = '$OWN'" > $SRCNAME.tables
    db2 -x "select tabname from syscat.tables where tabschema = '$TARSNAME' and type = 'T'" > $TARSNAME.tables

    Could you please guide me on how to generate a declare cursor stmt followed by load from cursor stmt

    • Ember Crooks says:

      Is the issue associating the source name with the target name? Is the target table name ever going to be different than the source table name? Is it just the schema name that changes? If the answer is that you're just trying to copy all tables from one schema to another, and the source and target table names are the same, just with different schemas, I think this would work:

      while read tab
      do
      db2 "DECLARE C1 CURSOR DATABASE $DBNAME user <username> using <password> for select * from ${SRCNAME}.${tab} with ur"
      db2 "LOAD FROM C1 of CURSOR MESSAGES $tab.msg replace into ${TARSNAME}.$tab nonrecoverable"
      done < $TARSNAME.tables

      The changes to this sample code would be replacing <username> and <password> with variables or values that make sense (if it's a remote database for the source). You might also have to use "copy yes to <path>" if this is an HADR environment (where <path> is on a shared filesystem).

      Does that help?

