Checkpoint Restart in DB2 Part - I



>> Click here for Part II <<

Before getting into what is check point restart and how to implement it,you need to understand when and why you should go for it.

HERE’S THE SCENARIO:

Suppose, a batch program that basically reads an input file and posts the updates/inserts/deletes to DB2 tables in the database was abended before the end of the job because of some reasons; Is it possible to tell - How many input records were processed? Were any of the updates committed to the database or can the job be started from the beginning?

Assume that COMMIT logic was not coded for large batch jobs that process millions of records. If an ABEND occurs all database updates will be rolled back and the job can be resubmitted from the beginning. If an ABEND occurs near the end of the process, the rollback of all the updates is performed. Also, DB2 will maintain a large number of locks for a long period of time, reducing concurrency in the system. In fact, the program may ABEND if it tries to acquire more than the installation-defined maximum number of locks.

This involves setting up a batch-restart control table (CHECKPOINT_RESTART in our case) to store the last input record processed and other control information. The restart control table can also be used as an instrumentation table to control the execution, commit frequency, locking protocol and termination of batch
jobs.

One of the problems with restart is synchronizing DB2 tables and output files(If your program updates some output file). DB2 will rollback all work on DB2 tables to the last commit point; but for output files we have to delete all the records up to the last commit point. (One option to do this would be via a global temporary table, FILE_POSITION_GTT, See FILE REPOSITIONING section for further details.).

COMMIT Function

The COMMIT statement ends a unit of recovery and commits the relational database changes that were made in that unit of recovery. If relational databases are the only recoverable resources used by the application process, COMMIT also ends the unit of work. The unit of recovery in which the statement is executed is ended and a new unit of recovery is effectively started for the process. All changes made by ALTER, COMMENT ON, CREATE, DELETE, DROP, EXPLAIN, GRANT, INSERT, LABEL ON SQL connections are ended when any of the following apply:

► The connection is in the release pending state
► The connection is not in the release pending state but it is a remote connection and:

  • The DISCONNECT(AUTOMATIC) bind option is in effect, or
  • The DISCONNECT (CONDITIONAL) bind option is in effect and an open WITH HOLD cursor is not associated with the connection.
For existing connections,

► All open cursors that were declared with the WITH HOLD option are preserved, along with any 
SELECT statements that were prepared for those cursors.

► All other prepared statements are destroyed unless dynamic caching is enabled.

► If dynamic caching is enabled, then all prepared SELECT, INSERT, UPDATE 
and DELETE statements
 that are bound with KEEPDYNAMIC (YES) are kept past the commit.
 Prepared statements cannot be kept past a commit if:

      * SQL RELEASE has been issued for that site, or
      * Bind option DISCONNECT(AUTOMATIC) was used, or 
      * Bind option DISCONNECT (CONDITIONAL) was used and there are no hold cursors.

All implicitly acquired locks are released, except for those required for the cursors that were
  • A place to store the details (CHECKPOINT-COMMIT record) pertaining to the current execution of the program, like various counts (number of inserts/deletes/updates/selects), number of processed, processing dates, and other details which are needed in the program after a RESTART. 
  • A reliable FILE RE-POSITIONING logic with minimal changes to the existing PROCJCL.
  • Flexibility, to modify the commit frequency without changing the program code.
Where we can store this CHECKPOINT-COMMIT record?

We can store the CHECKPOINT-COMMIT record, COMMIT-FREQUENCY and other relevant
information in a DB2 table .

CHECKPOINT_RESTART TABLE DESCRIPTION

FILE RE-POSITIONING
At restart, all records written to the output file since the last commit will have to be removed. To accomplish this, FILE_POSITION_GTT global temporary table is used. SQL statements that use global temporary tables can run faster because:


• DB2 does not log changes to global temporary tables
• Global temporary tables do not experience lock contention
• DB2 creates an instance of the temp table OPEN/SELECT/INSERT/DELETE stmts. only occur:

• An instance of a temporary table exists at the current server until one of the following actions occur:

• The remove server connection under which the instance was created terminates
     For ROLLBACK stmt, DB2 deletes the instance of the temporary table.
     For COMMIT stmt, DB2 deletes the instance of the temporary table unless cursor
     for accessing temporary table is defined WITH HOLD and is open.
 •  The application process ends.



In next part we will discuss step by step process of check point restart installation along with sample code.