Managing db2 transaction log files
There are two methods of logging that DB2 supports: Circular and Archive. I believe Oracle has similar modes.
To my extreme disgust, the default that Commerce uses if you don’t change anything is Circular logging. Circular logging is more often appropriate for databases where you don’t care about the data (seen it for databases supporting Tivoli and other vendors) or for a Data Warehousing and Decision Support databases where you have extremely well defined data loading processes that can easily be re-done on demand. You must also be willing to take regular outages for database backups because Circular logging does not allow you to take online backups. Circular logging also does not allow you to rollforward or back through transaction logs to reach a point in time – any restores are ONLY to the time a backup was taken.
On every new build, I move away from circular logging. I just don’t find it appropriate for a OLTP database, where your requirement often include very high availability and the ability to recover from all kinds of disasters with no data loss.
So why, then isn’t archive logging the default? Well, it requires proper management of transaction log files. Which can really get you in trouble if you don’t know what you’re doing. If you compress or delete an active transaction log, you will crash your database and have to restore from a backup. I’ve seen it happen, and it’s not fun. And the highest freqency of OS level backups you’re willing to do should be applied to the directories holding transaction log files.
I ensure that my archive logs are always on a separate path from the active ones so I and whoever gets paged out when a filesystem is filling up can easily see which is which.
Personally, I use scripts to manage my trasaction log files. I actually do most of it with my backup script. How long you keep them depends on your restore requirements and your overall backup/restore strategy. I also use a simple cron job to find files in the archive log path older than a certain time frame (1 day or 3 days is most common) and compress them. I hear that a nice safe way to delete logs is the prune logs command, but I never got used to it.
This is one of the areas where it is critical for DBA’s to have an excruciatingly high level of attention to detail.
Ok, ready for the most complicated part?
All the settings discussed are in the db cfg.
So the most important setting here is LOGRETAIN. If set to ‘NO’, then you have circular logging. If it is set to ‘Recovery’ then you have Archive logging. To enable archive logging, you simply update this parameter.
Second in my mind is LOGARCHMETH1. This parameter specifies the separate path for your archive logs to be sent to. It can be a location on DISK or TSM. Do not leave it set to ‘LOGRETAIN’.
WTH is this USEREXIT thing?
I undoubtedly have some newer DBAs wondering about this. The LOGARCHMETH1 parameter and others that dictate the location of archive logs was only introduced in db28 (or was it 7?). Before that, we had these nasty things called userexit programs that we had to locate C compilers to compile and be aware of the location of the uncompiled versions to make changes if needed. And the compiled file had to be in the right place with the right permissions. Really, I hated working with them. But the functionality is still in DB2 to use them. I imagine they could do things you can’t do natively, but the parameters are so good that it’d be a rare situation that you need them.
This is the size of each transaction log file. Generally my default for Commerce is 10000 (which I think Commerce itself actually sets on instance creation), but I’ve gone higher – it’s not unusual to go up to 40,000 while data is being loaded or for stagincopies.
This determines the number of log files of the size LOGFILSZ that compose the database’s active log files. These are all created on database activation (which happens on first connection), so you don’t want to go too large. But you do want to generally have the space here to handle your active logs. Most Commerce databases do well at around 12.
This determines the number of additional active logs that can be allocated as needed. LOGPRIMARY + LOGSECOND cannot exceed 255. The nice thing about LOGSECOND is that these are not allocated on database activation, but only as needed. The other awesome thing here is that they can be increased online – one of the few logging parameters that can be. I usually start with 50, but increase if there’s a specific need for more. Remember, these should not be used on an ongoing basis – just to handle spikes.
All the others
So there are all kinds of nifty things you can do with logging. Infinite logging, mirrored loging, logging to a raw device, etc. So I’m not going to cover all the logging parameters there are in this post.
Deleting or compressing an active log file
The best case if you delete or compress an active log file is that DB2 is able to recreate it. This may affect your ability to take online backups. The worst (and more likely) case is that your database ceases functioning and you have to restore from backup. Keep your active and archive logs in separate directories to help prevent this, and educate anyone who might try to alleviate a filesystem full. If you do get an error on an online backup referencing the inability to include a log file, take an offline backup just as soon as you can – you will be unable to take online backups until you do.
Filling up a filesystem due to not managing log files
If your archive log filesystem is separate and fills up, it doesn’t hurt anything. If the filesystem your active log path is on fills up, your database will be inaccessible until you clear up the filesystem full. The moment the filesystem is no longer full, the database will function, so there is no need to restore. I recommend monitoring for any filesystems involved in transaction logging.
Deleting too many log files and impacting recovery
If you’re on anything before DB2 9.5, make absolutely sure that you use the “include logs” keyword on the backup command. If you don’t, you may end up with a backup that is completely useless, because you MUST have at least one log file to restore from an online backup. When you delete log files, keep in mind your backup/recovery strategy. There’s very little worse than really needing to restore but being unable to do so because you’re missing a file. I recommend backing up your transaction logs to tape or through other OS level methods as frequently as you can.
Deleting recent files and impacting HADR
Sometimes HADR needs to access archive log files – especially if HADR is behind and needs to catch up. If you run into this situation, you have to re-set-up HADR using a database restore. If you’re using HADR, it is important to monitor HADR so you can catch failures as soon as possible and reduce the need for archive logs.
Log files too small
Tuning the size of your log files may be a topic for another post, but I’ll cover the highlights. Large deletes are the most likely to chew through everything you’ve got. The best solution is to break up large units of work into smaller pieces, especially deletes. Where that’s not possible (ahem, stagingcopy), you’ll need to increase any of LOGFILSZ, LOGPRIMARY, or LOGSECOND. Only LOGSECOND can be changed without recycling the database.
Log file saturation
This one confuses the heck out of new DBAs. You get what looks like a log file full, yet the disk is not full and a snapshot says there’s plenty of log space available. The problem here is that with archive logging, log files and each spot in those log files must be used sequentially – even if there are things that have already been committed. Normally the database is rolling through the logs, with the same number of files active at once, but constantly changing which files.
Sometimes an old connection is sitting out there hanging on to a page in the log file with an uncommitted unit of work. Then the connection becomes idle and stays that way, sometimes for days. Then DB2 gets to the point where it has to open another log file, and it can’t because that would be more than it is allowed to allocate. So it throws an error that looks pretty similar to log file full. In that case, you must force off the old idle connection. Details are written to the diag log, and you can also use a database snapshot to get the id of the connection holding the oldest log file.
This never happens to Commerce’s own connections, in my experience. It is usually a developer’s connection from what I’ve seen in Commerce databases. Commerce when functioning normally rarely has a connection with more than 5 minutes of idle time. So I like to have a db2 governor running that forces off connections that are IDLE for more than 4 hours.