Checkpoint Restart in DB2 Part - II

The first part : Checkpoint Restart in DB2 Part - I


In first part we understood what is check point restart and why we use it. We also covered the problem associated with Check point restart and solutions to those problems.

Now, in this post we will see the step by step implementation of check point restart logic.

CHECKPOINT/RESTART Implementation:

STEP1: 

Create the CHECKPOINT-COMMIT record in the working storage section, to store 
the data, which is needed for the next unit of recovery. 

STEP2:

In the procedure division MAIN para: 
First check the restart status flag i.e. RESTART-IND of CHECKPOINT_RESTART 
table. 

If RESTART-IND = ‘N’ then
   if any output file exists  open output file in OUTPUT mode 
   start the normal process 
end 
If RESTART-IND = ‘Y’ then
 Move the SAVE-AREA information to CHECKPOINT-COMMIT record 
 if any output file exists 
  do the FILE REPOSITION: 
    Open the output file in INPUT mode. 
    Repeatedly  
      Read the output record and INSERT it into GLOBAL temp table 
      FILE_POSITION_GTT 
    Until the last unit of recovery write count. 
    Close the output file. 
   Open the output file in OUTPUT mode. 
open a cursor for a table FILE_POSITION_GTT
repeatedly fetch a cursor and write the record information into the output file  
until end of cursor
close a cursor 
end 
      If input for the program is from cursor then skip the rows until COMMIT-KEY. 
      If input for the program is from file then skip the records until COMMIT-KEY. 
End. 

Note: For more than one output files, delete GTT after repositioning each output file.

STEP3

Make a count for each  Insert’s/Update’s/Deletes in  RECORDS-PROCESSEDUOR variable. 

STEP4:

Go thro’ the logic and find out the appropriate place where COMMIT WORK can be hosted. There check the frequency of COMMITS: 

IF RECORDS-PROCESSED-UOR > COMMIT-FREQ

    KEY (input) value of the program                 TO COMMIT-KEY       
    MOVE checkpoint-commit record length  TO SAVE-AREA-LEN 
    MOVE checkpoint-commit record       TO SAVE-AREA-TEXT 
    Update the CHECKPOINT_RESTART table with this information 

END-COMMIT 

STEP5: 

Before STOP RUN statement; reset the RESTART flag of the 
CHECKPOINT_RESTART table. 
i.e. MOVE ‘N’ TO RESTART-IND
     Update the CHECKPOINT_RESTART table with the above information.


Sample COBOL code for CHECKPOINT/RESTART Logic: 

CHECKPOINT-COMMIT RECORD DEFINITION

 ***************************************************************
* GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN *
****************************************************************

EXEC SQL 

   DECLARE FPG-FPOS CURSOR FOR 
   SELECT RECORD_NUMBER ,RECORD_DETAIL 
   FROM FILE_POSITION_GTT 
   ORDER BY RECORD_NUMBER
END-EXEC. 

**************************************************************
***** CHECK-POINT RESTART DATA DEFINITIONS  ***** 
**************************************************************

01 COMMIT-REC. 
02 FILLER  PIC X(16) VALUE 'REC. PROCESSED: '.
02 COMMIT-KEY  PIC 9(06) VALUE 0. 
02 FILLER  PIC X(14) VALUE 'TOTAL COUNTS: '.
02 COMMIT-COUNTS.                                              
                   03 WS-REC-READ           PIC 9(06) VALUE 0. 
                   03 WS-REC-REJT             PIC 9(06) VALUE 0. 
                   03 WS-REC-WRIT            PIC 9(06) VALUE 0. 
                   03 WS-RECP-READ         PIC 9(06) VALUE 0. 
                   03 WS-RECP-UPDT         PIC 9(06) VALUE 0. 
01 CHKPRSL-VARS. 
02 RECORDS-PROCESSED-UOR PIC S9(09) COMP VALUE +0. 

************************************************************** 
*****  CHECK POINT RESTART LOGIC SECTION   ***** 
************************************************************** 

RESTART-CHECK. 
            MOVE 'XXXXXX  ' TO PROGRAM-NAME.                              
PERFORM RESTART-SELECT. 
IF RESTART-IND = 'Y'
 MOVE SAVE-AREA-TEXT TO COMMIT-REC 
   If input is from cursor the skip until the commit-key 
   If input is from file then skip the records until the commit-key                          
END-IF.

************************************************** 
***** CHECK RESTART STATUS  ***** 
************************************************** 

RESTART-SELECT. 

MOVE 0 TO RECORD-PROCESSED-UOR. 

EXEC SQL 

   SELECT RESTART_IND ,COMMIT_FREQ ,RUN_TYPE ,SAVE_AREA 
   INTO :RESTART-IND ,:COMMIT-FREQ ,:RUN-TYPE ,:SAVE-AREA 
   FROM CHECKPOINT_RESTART 
   WHERE PROGRAM_NAME = :PROGRAM-NAME              
        
END-EXEC. 

EVALUATE SQLCODE 

    WHEN 0              
                                          
         IF RESTART-IND = 'Y'
            DISPLAY '* * * * * * * * * * * * * * * * * * * * * * * * * **********'
            DISPLAY ' ***PROGRAM - ' PROGRAM-NAME ' RESTARTED***'
            DISPLAY '* * * * * * * * * * * * * * * * * * * * * * * * * **********' 
            DISPLAY ' '
         END-IF 

    WHEN 100               
                                       
         PERFORM RESTART-INSERT

     WHEN OTHER   
                                                 
        MOVE 'RESTART-SELECT  ' TO WS-PARA-NAME   
        MOVE 'CHECKPOINT_RESTART SELECT ERR' TO WS-PARA-MSG    
        PERFORM EXCEPTION-ROUTINE 
END-EVALUATE. 

************************************************************** 
***** INSERT THE NEW RESTART STATUS RECORD ***** 
************************************************************** 

RESTART-INSERT. 
MOVE SPACES  TO CALL-TYPE. 
MOVE SPACES  TO CHECKPOINT-ID. 
            MOVE 'N'  TO RESTART-IND. 
            MOVE 'B'  TO RUN-TYPE. 
            MOVE +500  TO COMMIT-FREQ.                        
            MOVE ZEROES              TO COMMIT-SECONDS.                     
            MOVE +4006               TO SAVE-AREA-LEN.                      
MOVE SPACES  TO SAVE-AREA-TEXT. 
EXEC SQL 
    INSERT INTO CHECKPOINT_RESTART 
    (
      PROGRAM_NAME ,CALL_TYPE ,CHECKPOINT_ID ,RESTART_IND
     ,RUN_TYPE,COMMIT_FREQ ,COMMIT_SECONDS ,COMMIT_TIME 
     ,SAVE_AREA 
     ) 
VALUES 
   (
    :PROGRAM-NAME ,:CALL-TYPE ,:CHECKPOINT-ID ,:RESTART-IND ,:RUN-TYPE 
   ,:COMMIT-FREQ  ,:COMMIT-SECONDS, CURRENT TIMESTAMP ,:SAVE-AREA 
   ) 

END-EXEC. 

EVALUATE SQLCODE 
   WHEN 0                                                        
          CONTINUE 
   WHEN OTHER                                                    
           MOVE 'RESTART-INSERT  ' TO WS-PARA-NAME       
           MOVE 'CHECKPOINT_RESTART INSERT' TO WS-PARA-MSG        
           PERFORM EXCEPTION-ROUTINE 
END-EVALUATE.

********************************************************** 
***** UPDATE THE CHECKPOINT RECORD ***** 
********************************************************** 

RESTART-COMMIT. 
            MOVE 'Y' TO RESTART-IND. 
EXEC SQL 

   UPDATE CHECKPOINT_RESTART 
   SET RESTART_IND = :RESTART-IND ,SAVE_AREA = :SAVE-AREA ,
   COMMIT_TIME = CURRENT TIMESTAMP 
   WHERE PROGRAM_NAME = :PROGRAM-NAME 
                     
END-EXEC. 

EVALUATE SQLCODE 
  WHEN 0                
                                        
   EXEC SQL COMMIT WORK END-EXEC                            
       EVALUATE SQLCODE 
         WHEN 0                                                   
           CONTINUE 
         WHEN OTHER                                               
           MOVE 'RESTART-COMMIT' TO  WS-PARA-NAME              
           MOVE 'COMMIT ERROR' TO WS-PARA-MSG        
           PERFORM EXCEPTION-ROUTINE 
       END-EVALUATE 
       MOVE 0 TO RECORD-PROCESSED-UOR 

  WHEN OTHER                                                    
    MOVE 'RESTART-COMMIT' TO WS-PARA-NAME   
    MOVE 'CHECKPOINT_RESTART UPDATE ERR' TO WS-PARA-MSG    
    PERFORM EXCEPTION-ROUTINE 
END-EVALUATE.

******************************************************************* 
***** RESET THE RESTART FLAG AT THE END OF PROGRAM ***** 
******************************************************************* 

RESTART-RESET. 
MOVE 0 TO RECORD-PROCESSED-UOR. 
MOVE 'N' TO RESTART-IND. 

EXEC SQL 
   UPDATE CHECKPOINT_RESTART 
   SET RESTART_IND = :RESTART-IND ,COMMIT_TIME = CURRENT TIMESTAMP 
   WHERE PROGRAM_NAME = :PROGRAM-NAME                      
END-EXEC. 

EVALUATE SQLCODE 
   WHEN 0                                                        
        EXEC SQL COMMIT WORK END-EXEC                            
    WHEN OTHER                                                    
        MOVE 'RESTART-RESET' TO WS-PARA-NAME   
        MOVE 'CHECKPOINT_RESTART DELETE ERR' TO WS-PARA-MSG    
        PERFORM EXCEPTION-ROUTINE 
END-EVALUATE. 
*************************************************************
***** OUTPUT FILE REPOSITION LOGIC SECTION  ***** 
********* * ***** ********************************************

************************************************************************ 
***** GLOBAL TEMPORARY TABLE CURSOR DECLARATION & OPEN  ***** 
************************************************************************* 
FPG-OPEN. 

EXEC SQL 
  OPEN FPG-FPOS 
END-EXEC

EVALUATE SQLCODE 
    WHEN 0                                                        
         CONTINUE 
     WHEN OTHER                                                    
          MOVE 'FPG-OPEN'                    TO WS-PARA-NAME       
          MOVE 'GLOBAL TEMP TABLE OPEN  ERR' TO WS-PARA-MSG        
          PERFORM EXCEPTION-ROUTINE 
 END-EVALUATE. 

************************************************************
****   GLOBAL TEMPORARY TABLE CURSOR FETCH  ***** 
************************************************************

FPG-FETCH. 
EXEC SQL 

  FETCH FPG-FPOS INTO :FPG-RECORD-NUMBER ,:FPG-RECORD-DETAIL 

END-EXEC. 

EVALUATE SQLCODE 
   WHEN 0                                                        
        CONTINUE 
    WHEN +100                                                     
        MOVE 0 TO FPG-RECORD-NUMBER                            
     WHEN OTHER                                                    
        MOVE 'FPG-FETCH '                  TO WS-PARA-NAME       
        MOVE 'GLOBAL TEMP TABLE FETCH ERR' TO WS-PARA-MSG        
        PERFORM EXCEPTION-ROUTINE 
END-EVALUATE. 

************************************************************
***** GLOBAL TEMPORARY TABLE CURSOR CLOSE  ***** 
************************************************************

FPG-CLOSE. 

EXEC SQL 

   CLOSE FPG-FPOS
END-EXEC. 

EVALUATE SQLCODE 
     WHEN 0                                                        
          MOVE 0 TO FPG-RECORD-NUMBER                              
      WHEN OTHER                                                    
          MOVE 'FPG-FPOS-CLOSE '             TO WS-PARA-NAME       
          MOVE 'GLOBAL TEMP TABLE CLOSE ERR' TO WS-PARA-MSG        
          PERFORM EXCEPTION-ROUTINE 
END-EVALUATE. 

*********************************************************** 
***** GLOBAL TEMPORARY TABLE INSERTS  ***** 
*********************************************************** 

FPG-INSERT. 
ADD 1 TO FPG-RECORD-NUMBER.                                  
EXEC SQL 

   INSERT INTO FILE_POSITION_GTT 
  ( 
    RECORD_NUMBER ,RECORD_DETAIL 
   ) 
VALUES 
 :FPG-RECORD-NUMBER 
,:FPG-RECORD-DETAIL 

END-EXEC. 

EVALUATE SQLCODE 
    WHEN 0                                                        
         CONTINUE 
     WHEN OTHER                                                    
         MOVE 'FPG-INSERT  '              TO WS-PARA-NAME       
         MOVE 'GLOBAL TEMP TABL INSERT ERR' TO WS-PARA-MSG        
         PERFORM EXCEPTION-ROUTINE 
END-EVALUATE. 
RESTART-FILE-REPOSITION. 
OPEN INPUT outputfile-name.                                         
MOVE LENGTH OF output-record TO FPG-RECORD-DETAIL-LEN.             
READ output-file INTO FPG-RECORD-DETAIL-TEXT.                   
PERFORM UNTIL FPG-RECORD-NUMBER >= output record count of last commit
PERFORM FPG-INSERT 
READ output-file INTO FPG-RECORD-DETAIL-TEXT            
END-PERFORM. 
CLOSE output-filename
OPEN OUTPUT outputfile-name.                                        
PERFORM FPG-OPEN. 
PERFORM FPG-FETCH. 
PERFORM UNTIL FPG-RECORD-NUMBER = 0 
WRITE outputfile-record FROM FPG-RECORD-DETAIL-TEXT 
PERFORM FPG-FETCH 
END-PERFORM. 
PERFORM FPG-CLOSE. 
---------skip input file until the last commit------------------ 
DISPLAY '  *** ALREADY ' COMMIT-KEY ' RECORDS PROCESSED ***'. 
DISPLAY ' '
DISPLAY ' '.
 / 
*********************************************************** 
************** E X C E P T I O N  R O U T I N E **************
*********************************************************** 

EXCEPTION-ROUTINE. 

MOVE SQLCODE TO WS-SQL-RET-CODE.                          
            DISPLAY '*************************************************'.  
            DISPLAY '****  E R R O R   M E S S A G E S  ****'.  
            DISPLAY '*************************************************'.  
            DISPLAY '* ERROR IN PARA.....: ' WS-PARA-NAME.              
            DISPLAY '*        MESSAGES.....: ' WS-PARA-MSG.               
            DISPLAY '*'.
            DISPLAY '* SQL RETURN  CODE..: ' WS-SQL-RET-CODE.           
            DISPLAY '*************************************************'.  

Output file Disposition in JCL: 

♦  In JCL, disposition must be given as DISP=(NEW,CATLG,CATLG) or DISP=(OLD,KEEP,KEEP) 
♦  Override statement is needed for the output files if job abended: 
        1.  GDG with DISP=(NEW,CATLG,CATLG) 
              Override stmt: 
                 •  Change +1 generation to 0 (current) generation 
                 •  DISP=(OLD,KEEP,KEEP) 
       2.  GDG with DISP=(OLD,KEEP,KEEP) 
             Override stmt: 
                •  Change +1 generation to 0 (current) generation 

Output file with Disposition MOD: 

•  If output file is already existing, and program is appending records to that, then the File re-positioning must be handled in different way according to the requirements. 

Internal Sort: 

If any Commit-Restart program has Internal Sort, remove it and have an External Sort. 

POINTS TO REMEMBER 
  • All the update programs must use COMMIT frequency from the CHECKPOINT_RESTART table only 
  • Avoid – Internal Sorts 
  • Avoid – Mass updates (Instead, use cursor with FOR 
  • UPDATE clause and update one record at a time) 
  • On-call analyst should back-up all the output files before 
  • restart (The procedure should be documented in APCDOC) 
  • Reports to dispatch should be sent to a flat file; send the file 
  • to dispatch up on successful completion of the job 
  • Save only the working storage variables that are required 
  • for RESTART in the CHECKPOINT_RESTART table 
  • RESET the RESTART_IND flag at the end of the program 
  • If COMMIT-RESTART logic is introduced in an existing 
  • program then make relevant changes to the PROCJCL.                    
                                                   

6 comments:

Anonymous said...

What are the benefits of using DB2 table vs. flat files for re-positioning the restart-point? If the abend was caused by a DB2 internal issue, the last commit point was not able to be written out to the GTT table? What is this GTT table?
I understand, even this do happen, the re-start will be restarting from one-commit point behind. Aren’t all the batch output files need to be synchronized by using sort-merge step?

Anonymous said...

Hey there! This post could not be written any better! Reading this post reminds me of my
old room mate! He always kept talking about this. I will forward this page to him.
Fairly certain he will have a good read. Thanks for sharing!


My weblog tutorial

Anonymous said...

We stumbled over here from a different web address and thought I should check things out.
I like what I see so now i am following you. Look forward
to going over your web page again.

My web page - Cccam white sky card

Anonymous said...

I think the admin of this site is truly working
hard in favor of his website, because here every material is quality based
information.

Review my homepage Premium

Anonymous said...

Wow, this article is nice, my sister is analyzing
these things, so I am going to convey her.



my blog cardshare

Thi said...

Hi,

Suppose we have the scenario read from file and write into table aswell as file. How to handle this??

Please reply me to thiyamu23@gmail.com

We can use mod instead of gtt. Do you think special purpose??

Post a Comment