The first part : Checkpoint Restart in DB2 Part - I
In first part we understood what is check point restart and why we use it. We also covered the problem associated with Check point restart and solutions to those problems.
Now, in this post we will see the step by step implementation of check point restart logic.
CHECKPOINT/RESTART Implementation:
STEP1:
Create the CHECKPOINT-COMMIT record in the working storage section, to store
the data, which is needed for the next unit of recovery.
STEP2:
In the procedure division MAIN para:
First check the restart status flag i.e. RESTART-IND of CHECKPOINT_RESTART
table.
If RESTART-IND = ‘N’ then
if any output file exists open output file in OUTPUT mode
start the normal process
end
If RESTART-IND = ‘Y’ then
Move the SAVE-AREA information to CHECKPOINT-COMMIT record
if any output file exists
do the FILE REPOSITION:
Open the output file in INPUT mode.
Repeatedly
Read the output record and INSERT it into GLOBAL temp table
FILE_POSITION_GTT
Until the last unit of recovery write count.
Close the output file.
Open the output file in OUTPUT mode.
open a cursor for a table FILE_POSITION_GTT
repeatedly fetch a cursor and write the record information into the output file
until end of cursor
close a cursor
end
If input for the program is from cursor then skip the rows until COMMIT-KEY.
If input for the program is from file then skip the records until COMMIT-KEY.
End.
Note: For more than one output files, delete GTT after repositioning each output file.
STEP3:
Make a count for each Insert’s/Update’s/Deletes in RECORDS-PROCESSEDUOR variable.
STEP4:
Go thro’ the logic and find out the appropriate place where COMMIT WORK can be hosted. There check the frequency of COMMITS:
IF RECORDS-PROCESSED-UOR > COMMIT-FREQ
KEY (input) value of the program TO COMMIT-KEY
MOVE checkpoint-commit record length TO SAVE-AREA-LEN
MOVE checkpoint-commit record TO SAVE-AREA-TEXT
Update the CHECKPOINT_RESTART table with this information
END-COMMIT
STEP5:
Before STOP RUN statement; reset the RESTART flag of the
CHECKPOINT_RESTART table.
i.e. MOVE ‘N’ TO RESTART-IND
Update the CHECKPOINT_RESTART table with the above information.
Sample COBOL code for CHECKPOINT/RESTART Logic:
CHECKPOINT-COMMIT RECORD DEFINITION
***************************************************************
* GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN *
****************************************************************
EXEC SQL
DECLARE FPG-FPOS CURSOR FOR
SELECT RECORD_NUMBER ,RECORD_DETAIL
FROM FILE_POSITION_GTT
ORDER BY RECORD_NUMBER
END-EXEC.
**************************************************************
***** CHECK-POINT RESTART DATA DEFINITIONS *****
**************************************************************
01 COMMIT-REC.
02 FILLER PIC X(16) VALUE 'REC. PROCESSED: '.
02 COMMIT-KEY PIC 9(06) VALUE 0.
02 FILLER PIC X(14) VALUE 'TOTAL COUNTS: '.
02 COMMIT-COUNTS.
03 WS-REC-READ PIC 9(06) VALUE 0.
03 WS-REC-REJT PIC 9(06) VALUE 0.
03 WS-REC-WRIT PIC 9(06) VALUE 0.
03 WS-RECP-READ PIC 9(06) VALUE 0.
03 WS-RECP-UPDT PIC 9(06) VALUE 0.
01 CHKPRSL-VARS.
02 RECORDS-PROCESSED-UOR PIC S9(09) COMP VALUE +0.
**************************************************************
***** CHECK POINT RESTART LOGIC SECTION *****
**************************************************************
RESTART-CHECK.
MOVE 'XXXXXX ' TO PROGRAM-NAME.
PERFORM RESTART-SELECT.
IF RESTART-IND = 'Y'
MOVE SAVE-AREA-TEXT TO COMMIT-REC
If input is from cursor the skip until the commit-key
If input is from file then skip the records until the commit-key
END-IF.
**************************************************
***** CHECK RESTART STATUS *****
**************************************************
RESTART-SELECT.
MOVE 0 TO RECORD-PROCESSED-UOR.
EXEC SQL
SELECT RESTART_IND ,COMMIT_FREQ ,RUN_TYPE ,SAVE_AREA
INTO :RESTART-IND ,:COMMIT-FREQ ,:RUN-TYPE ,:SAVE-AREA
FROM CHECKPOINT_RESTART
WHERE PROGRAM_NAME = :PROGRAM-NAME
END-EXEC.
EVALUATE SQLCODE
WHEN 0
IF RESTART-IND = 'Y'
DISPLAY '* * * * * * * * * * * * * * * * * * * * * * * * * **********'
DISPLAY ' ***PROGRAM - ' PROGRAM-NAME ' RESTARTED***'
DISPLAY '* * * * * * * * * * * * * * * * * * * * * * * * * **********'
DISPLAY ' '
END-IF
WHEN 100
PERFORM RESTART-INSERT
WHEN OTHER
MOVE 'RESTART-SELECT ' TO WS-PARA-NAME
MOVE 'CHECKPOINT_RESTART SELECT ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
/
**************************************************************
***** INSERT THE NEW RESTART STATUS RECORD *****
**************************************************************
RESTART-INSERT.
MOVE SPACES TO CALL-TYPE.
MOVE SPACES TO CHECKPOINT-ID.
MOVE 'N' TO RESTART-IND.
MOVE 'B' TO RUN-TYPE.
MOVE +500 TO COMMIT-FREQ.
MOVE ZEROES TO COMMIT-SECONDS.
MOVE +4006 TO SAVE-AREA-LEN.
MOVE SPACES TO SAVE-AREA-TEXT.
EXEC SQL
INSERT INTO CHECKPOINT_RESTART
(
PROGRAM_NAME ,CALL_TYPE ,CHECKPOINT_ID ,RESTART_IND
,RUN_TYPE,COMMIT_FREQ ,COMMIT_SECONDS ,COMMIT_TIME
,SAVE_AREA
)
VALUES
(
:PROGRAM-NAME ,:CALL-TYPE ,:CHECKPOINT-ID ,:RESTART-IND ,:RUN-TYPE
,:COMMIT-FREQ ,:COMMIT-SECONDS, CURRENT TIMESTAMP ,:SAVE-AREA
)
END-EXEC.
EVALUATE SQLCODE
WHEN 0
CONTINUE
WHEN OTHER
MOVE 'RESTART-INSERT ' TO WS-PARA-NAME
MOVE 'CHECKPOINT_RESTART INSERT' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
**********************************************************
***** UPDATE THE CHECKPOINT RECORD *****
**********************************************************
RESTART-COMMIT.
MOVE 'Y' TO RESTART-IND.
EXEC SQL
UPDATE CHECKPOINT_RESTART
SET RESTART_IND = :RESTART-IND ,SAVE_AREA = :SAVE-AREA ,
COMMIT_TIME = CURRENT TIMESTAMP
WHERE PROGRAM_NAME = :PROGRAM-NAME
END-EXEC.
EVALUATE SQLCODE
WHEN 0
EXEC SQL COMMIT WORK END-EXEC
EVALUATE SQLCODE
WHEN 0
CONTINUE
WHEN OTHER
MOVE 'RESTART-COMMIT' TO WS-PARA-NAME
MOVE 'COMMIT ERROR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE
MOVE 0 TO RECORD-PROCESSED-UOR
WHEN OTHER
MOVE 'RESTART-COMMIT' TO WS-PARA-NAME
MOVE 'CHECKPOINT_RESTART UPDATE ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
*******************************************************************
***** RESET THE RESTART FLAG AT THE END OF PROGRAM *****
*******************************************************************
RESTART-RESET.
MOVE 0 TO RECORD-PROCESSED-UOR.
MOVE 'N' TO RESTART-IND.
EXEC SQL
UPDATE CHECKPOINT_RESTART
SET RESTART_IND = :RESTART-IND ,COMMIT_TIME = CURRENT TIMESTAMP
WHERE PROGRAM_NAME = :PROGRAM-NAME
END-EXEC.
EVALUATE SQLCODE
WHEN 0
EXEC SQL COMMIT WORK END-EXEC
WHEN OTHER
MOVE 'RESTART-RESET' TO WS-PARA-NAME
MOVE 'CHECKPOINT_RESTART DELETE ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
/
*************************************************************
***** OUTPUT FILE REPOSITION LOGIC SECTION *****
********* * ***** ********************************************
************************************************************************
***** GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN *****
*************************************************************************
FPG-OPEN.
EXEC SQL
OPEN FPG-FPOS
END-EXEC.
EVALUATE SQLCODE
WHEN 0
CONTINUE
WHEN OTHER
MOVE 'FPG-OPEN' TO WS-PARA-NAME
MOVE 'GLOBAL TEMP TABLE OPEN ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
************************************************************
**** GLOBAL TEMPORARY TABLE CURSOR FETCH *****
************************************************************
FPG-FETCH.
EXEC SQL
FETCH FPG-FPOS INTO :FPG-RECORD-NUMBER ,:FPG-RECORD-DETAIL
END-EXEC.
EVALUATE SQLCODE
WHEN 0
CONTINUE
WHEN +100
MOVE 0 TO FPG-RECORD-NUMBER
WHEN OTHER
MOVE 'FPG-FETCH ' TO WS-PARA-NAME
MOVE 'GLOBAL TEMP TABLE FETCH ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
************************************************************
***** GLOBAL TEMPORARY TABLE CURSOR CLOSE *****
************************************************************
FPG-CLOSE.
EXEC SQL
CLOSE FPG-FPOS
END-EXEC.
EVALUATE SQLCODE
WHEN 0
MOVE 0 TO FPG-RECORD-NUMBER
WHEN OTHER
MOVE 'FPG-FPOS-CLOSE ' TO WS-PARA-NAME
MOVE 'GLOBAL TEMP TABLE CLOSE ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
***********************************************************
***** GLOBAL TEMPORARY TABLE INSERTS *****
***********************************************************
FPG-INSERT.
ADD 1 TO FPG-RECORD-NUMBER.
EXEC SQL
INSERT INTO FILE_POSITION_GTT
(
RECORD_NUMBER ,RECORD_DETAIL
)
VALUES
(
:FPG-RECORD-NUMBER
,:FPG-RECORD-DETAIL
)
END-EXEC.
EVALUATE SQLCODE
WHEN 0
CONTINUE
WHEN OTHER
MOVE 'FPG-INSERT ' TO WS-PARA-NAME
MOVE 'GLOBAL TEMP TABL INSERT ERR' TO WS-PARA-MSG
PERFORM EXCEPTION-ROUTINE
END-EVALUATE.
/
RESTART-FILE-REPOSITION.
OPEN INPUT outputfile-name.
MOVE LENGTH OF output-record TO FPG-RECORD-DETAIL-LEN.
READ output-file INTO FPG-RECORD-DETAIL-TEXT.
PERFORM UNTIL FPG-RECORD-NUMBER >= output record count of last commit
PERFORM FPG-INSERT
READ output-file INTO FPG-RECORD-DETAIL-TEXT
END-PERFORM.
CLOSE output-filename
OPEN OUTPUT outputfile-name.
PERFORM FPG-OPEN.
PERFORM FPG-FETCH.
PERFORM UNTIL FPG-RECORD-NUMBER = 0
WRITE outputfile-record FROM FPG-RECORD-DETAIL-TEXT
PERFORM FPG-FETCH
END-PERFORM.
PERFORM FPG-CLOSE.
---------skip input file until the last commit------------------
DISPLAY ' *** ALREADY ' COMMIT-KEY ' RECORDS PROCESSED ***'.
DISPLAY ' '
DISPLAY ' '.
/
***********************************************************
************** E X C E P T I O N R O U T I N E **************
***********************************************************
EXCEPTION-ROUTINE.
MOVE SQLCODE TO WS-SQL-RET-CODE.
DISPLAY '*************************************************'.
DISPLAY '**** E R R O R M E S S A G E S ****'.
DISPLAY '*************************************************'.
DISPLAY '* ERROR IN PARA.....: ' WS-PARA-NAME.
DISPLAY '* MESSAGES.....: ' WS-PARA-MSG.
DISPLAY '*'.
DISPLAY '* SQL RETURN CODE..: ' WS-SQL-RET-CODE.
DISPLAY '*************************************************'.
Output file Disposition in JCL:
♦ In JCL, disposition must be given as DISP=(NEW,CATLG,CATLG) or DISP=(OLD,KEEP,KEEP)
♦ Override statement is needed for the output files if job abended:
1. GDG with DISP=(NEW,CATLG,CATLG)
Override stmt:
• Change +1 generation to 0 (current) generation
• DISP=(OLD,KEEP,KEEP)
2. GDG with DISP=(OLD,KEEP,KEEP)
Override stmt:
• Change +1 generation to 0 (current) generation
Output file with Disposition MOD:
• If output file is already existing, and program is appending records to that, then the File re-positioning must be handled in different way according to the requirements.
Internal Sort:
If any Commit-Restart program has Internal Sort, remove it and have an External Sort.
POINTS TO REMEMBER
- All the update programs must use COMMIT frequency from the CHECKPOINT_RESTART table only
- Avoid – Internal Sorts
- Avoid – Mass updates (Instead, use cursor with FOR
- UPDATE clause and update one record at a time)
- On-call analyst should back-up all the output files before
- restart (The procedure should be documented in APCDOC)
- Reports to dispatch should be sent to a flat file; send the file
- to dispatch up on successful completion of the job
- Save only the working storage variables that are required
- for RESTART in the CHECKPOINT_RESTART table
- RESET the RESTART_IND flag at the end of the program
- If COMMIT-RESTART logic is introduced in an existing
- program then make relevant changes to the PROCJCL.