What does a Commit mean in DB2?
What a Commit is
Database management systems have the concept of a commit. This concept has to do with grouping sets of actions together such that they all succeed or fail as a group. This represents the A in the famous ACID properties of a transaction system. The A stands for Atomicity – meaning that a transaction may consist of multiple small parts, but that if one of those parts fails, then the transaction as a whole must also fail to have made any changes to the data. This concept is especially critical in relational databases which may be normalized into many smaller tables, which leads to a transaction consisting of more than just a single update in a single location.
Assuming that an application does not have autocommit turned on, the following represents and example of a transaction and the use of a commit.
BEGIN TRANSACTION select columns from table a where id < NNNN; update table_a set col1=1234; update table_b set col2=5678; insert into table_c .... ; commit work; END TRANSACTION
In the above example, if the update statement for table_b fails, then the application will detect that and the commit work statement will never be executed or a rollback statement may be executed. DB2 will rollback the update to table_a and the insert into table_c.
A commit also has to do with the D in ACID - Durability. The durability property means that once a commit happens, the data persists (traditionally on disk), even if power is lost or other likely failures occur. This is a major reason that databases have transaction logs.
When Commits Occur
The DB2 command line usually has autocommit turned on by default. This means that if you are simply issuing db2 commands from a command line, then you do not generally have to commit. Many applications manage commits in other ways, including some GUI database administration or access applications.
When designing an application, designers and developers must be certain that commits are happening when they are supposed to.
DBAs also advocate for frequent commits during long-running actions like data loads or large deletes. Largely, this is so the transaction log files do not fill up.
Frequent commits support the highest level of concurrent data access.
What Commits Do
DB2 uses a method referred to as write-ahead logging. This means that when a commit occurs, the data is written directly to the transaction logs. Data gets to the tables and such on disk asynchronously, through the buffer pool. A commit does not write data out to the tables itself. This saves time - the end user is not sitting and waiting on as much I/O.
A side note - DB2 writes both redo and undo data to the transaction logs - it is not like Oracle where redo and undo logs are separate things.
DB2 has a memory buffer called the log buffer. As data is changed, it is written to the log buffer, committed or not. The log buffer is then written out to disk either when it becomes full or when any connection does a commit. The commit is not successful until it has been externalized from the log buffer to the log files on disk.
Uncommitted data can thus be written from the log buffer into the log files when this occurs. But DB2 tracks and knows which transactions are committed, and which are not.
In the image above, an agent writes a commit record to the log buffer (1). The logger process then writes the log buffer to disk (2) - either immediately or shortly based on MINCOMMIT and other factors. The agent waits for the acknowledgement from the logger process that the log records for the agent's commit have been externalized to disk (3), up to and including that commit record. More than one agent may be waiting at a time. Since this is where end users are waiting on physical I/O, it makes sense to apply your fastest disk to the transaction log files.
Note that the commit does not touch bufferpools or tablespace data. As statements have come through, they have been writing all the information for both undo and redo to the log buffer, so it's already there or in the transaction log files waiting for the commit.
If a database crashes, then when it comes back up, it goes through a process called crash recovery, which looks at the transaction log files and rolls forward through any transactions needed. After it completes that "forward phase" of crash recovery, it goes and rolls back any transactions that were in the log files and were not committed.