Understanding Locking in DB2 - I

Locking in DB2 is the least understood concept among the programmer's.
There are lots of chaos ranging from what is lock to why we need it to end up discussing what exactly is isolation level etc.and i don't blame them.

Some programmer's who have somewhat understanding about it ,they have there different set of questions to ask on it ,such as we have something called as LOCKSIZE parameter while creating tablespace, then we have bind parameter such as Isolation Level and Aquire and Release. Now what exactly the function of these parameter? do we need all of them for deciding the kind of lock DB2 will take on resource for my particular Application Program.


Well to answer these questions let me clarify first that , Locking is not as simple subject as to cover it into a single post , if one want to understand the detailing of it , it may take up reading fully dedicated book on it.

I will try to explain the overview of it in this post and will cover the various aspect of it in coming posts.

Well i will not dwell much into what is the definition of lock and why we need it . you can read it anywhere (better go for IBM manual )


So read on


There are three things which you need to understand first in order to get rid of the concept of Locking.


-- The object on which the LOCK to be taken (That is Table,Page,Row etc.).
    It can also be called as size of  the LOCK.

 
-- The mode or type of Locking (Share,Update etc.).
     This controls the degree to which the resource can be accessed concurrently
     by multiple processes.


-- The duration of lock.This refers to the length of time that the lock is held.

Now lets discuss the above mentioned point one by one.


Lock Size

The size of the lock determines how much data is within the scope of a lock (hence the term lock size).

For example, a page lock includes all rows on a page, while a row lock just covers a single row.

The lock size is determined by the LOCKSIZE parameter of the CREATE and ALTER TABLESPACE data definition language statements. The options are ANY, TABLESPACE,TABLE, PAGE, ROW.
ANY is the default. TABLE should only be used for segmented table spaces.

DB2 uses locks of the following sizes:

– Table space

– Partition

– Table

– Page

– Row

Out of theses first three are high level locks(table space,partition,table) and the next two (page and row) are called as low level locks.The low level locks cannot be taken before taking the corresponding high level lock for them while high level locks can be taken directly ( except intent share, intent exclusive, and share with intent exclusive.) I know this statement is bit of confusing here but had to go with flow,don't worry will explain below. 

Lock state or lock mode

The state of a lock, also called the mode, determines the type of access to the locked object permitted to the lock owner and to any other concurrent application processes. The lock states available for an object depend on the lock size.

There are three possible states for a low level lock:

- Share
- Update
- Exclusive

Now these three modes of locks are available for low level lock( page and row) while high level lock size have these three modes of locking with the addition of three intent locks referred to as intent share, intent exclusive, and share with intent exclusive.

The primary difference is that whenever an intent lock is held,DB2 also has to request a lock at the page or row level, hence the reference to low level locks.

The share, update, and exclusive locks at the table space, partition, and table level are also
referred to as gross locks because they do not require low level locks to be taken.

Now, i think you got what i was trying to say previously.

When to take a lock ?

The low level locks, that is page and row are acquired when necessary while a process is being executed.
When the high level locks, that is, table space, partition, or table, are acquired depends on the ACQUIRE BIND option.

For ACQUIRE(ALLOCATE), the high level locks are acquired at the first SQL call.

For ACQUIRE(USE), the high level locks are acquired at the first actual use of the table.

The ACQUIRE bind options have no influence on when high level locks are acquired for dynamic SQL, which is always when the object is first accessed.


So until now we are clear with size and modes of lock. Lets revise it.

We can have high level locks (lock size) which covers table,tablespace and partition also we have low level locks which are on row and pages.Then we have various modes of locking that is intent share,intent exclusive,share intent exclusive,share,update and exclusive.Out of theses six modes of locking , all six are applicable for high level locking while for low level locking only three modes( share,update and exclusive) are applicable.

Now,I must stop this post here so you can grasp it. Don't worry if you don't get the concept in one reading , go through the post number of times till you get clear picture out of it.

In next post  i will discuss what all these six modes of locking exactly mean in detail and there compatibility with each other. Then we have the third attribute remaining to discuss that is duration of lock (May be one after the next post)

>> Read Part II
 

DB2 University

Well well well this is very good news for the guys who want to learn and get certified in DB2.

IBM has started new program called as DB2 University.It is dedicated to helping people get educated and certified in DB2.

There are various free courses available on the site like .



  • SQL Fundamentals I







  • DB2 Essential Training II







  • DB2 Essential Training III







  • IBM Data Studio Essential training







  • Each of these offerings will be delivered as "instructor-led online" (ILO) courses. Instructors will monitor forums and grade assignments and a final exam. Courses are planned for 4-6 weeks, with deadlines for each assignment. Courses may be taken simultaneously. Though some courses do have pre-requisites, they will not be enforced at this point.

    DB2 Essential Training I isn't considered an ILO course. The final exam will be worth 100 percent.

    AND WHAT MORE


    ALL courses will offer "prizes," including:

    • FREE IBM certification vouchers
    • 50-percent discount certification vouchers
    • "Understanding DB2" book (worth $60 U.S.)

    So what are you waiting for go and Register now at DB2university.com

    Ten Reasons to Upgrade to DB2 V10

    Today i was just going through some material on DB2 v10 and came across a very good topic on one of the blog which gives you ten solid reasons to upgrade your shop with DB2 v10. I think this very good list to start up with and hence thought to share with you.

    So here it goes.

    1. CPU reductions for transactions, queries and batch jobs: While improvements to the DB2 10 code itself should significantly reduce CPU usage, IBM has an entire box full of new features and functions to help with performance. More on those later in the list.

    2. Scalability in the number of concurrent connections on a subsystem: DB2 10 now supports 10 times more users (up to 20,000 concurrent threads). There's that number 10 again.

    3. Improved availability through improved concurrency for the catalog, utilities and SQL.

    4. Improved availability through enhancements to online schema changes: This means you'll be able to make changes to indexes and table spaces without having to unload and reload the data. With the catalog redesign, links are removed in the catalog and directory. In addition, new lock avoidance techniques reduce the time a lock is held and prevent writers from blocking readers of data.

    5. Security is enhanced with better granularity for administrative privileges, data masking and new audit capabilities.

    6. Improved portability brought about by SQL enhancements.

    7. pureXML: With these enhancements, companies will be better equipped to adapt and respond to changing business needs.

    8. Temporal or versioned data.

    9. Hash, index include columns, access path stability, skip migration are all designed to reduce CPU consumption.

    10. Productivity improvements for database and systems administrators, and application programmers.

     Finally i must give the credit where it dues. So HERE it goes.

    DB2 Precompiler Option : LEVEL (A way to avoid binding)


    If there is any change in your existing DB2 Application program what are the steps you follow to finally put it back in production ??

    - You make the necessary changes to program as per the requirement.

    - You compile and link edit your program to create new LOAD MODULE

    - You bind the latest DBRM to PLAN/PACKAGE

    - You run the program by using current PLAN and LOAD MODULE.

    - Finally you deliver your correct LOAD MODULE and Source program to
      production libraries after successful testing.

    The third step i mentioned here about binding, is it necessary if the SQL within my program doesn't change. Logically speaking it should not because my DBRM remains the same , so should my PLAN.

    But it's not the case ,why ? because when you compile your program after the changes, the DB2 precompiler will place some new consistency token in your DBRM as well as the source program and if you don't bind the DBRM then your plan will have the previous consistency token when it was last time bind while your load module have the latest one and you will get the consistency token mismatch ( For more detail refer this.)

    So you need to bind your DBRM again into PLAN/PACKAGE even though the SQL statement in your program remain unchanged.

    Is it not overhead?? Yes it is. Then can i have some provision like when there is change in my COBOL program only and the SQL statement remain unchanged i should avoid this binding ??? 

    Yes there is.The DB2 Precompiler option LEVEL can be use to avoid binding in such case.


    Let's see how it works.

    The program level defines the level for a particular module. This information is
    stored in the consistency token, which is in an internal DB2 format. Overriding the program level in the consistency token is possible

    To override the construction of the consistency token by DB2, use the LEVEL (aaaa) option. DB2 uses the value that you choose for aaaa to generate the consistency token.

    aaaa is any alphanumeric value of up to seven characters.

    You can omit the suboption (aaaa). The resulting consistency token is blank.

     For COBOL, you need to specify the suboption.

    The LEVEL option can be abbreviated to L.

    However  the DSNH CLIST and the DB2I panels do not support this option .

    After providing this Level Option in DB2 Precompilation step. You just need to follow the following things.


    -   Change the source code (but not the SQL statements).

    -   Compile and link-edit the changed program.

    -   Run the application without rebinding a plan or package.


    A word of advice :

    Before making use of this option, please double check that you are NOT changing any SQL statement in your program else you will not get the expected results.

    VALUE (COALESCE ) function in DB2


    In one of my Previous post , i discuss about a way to handle a null in your application program (by using Null indicator). However this post explain one more way to handle a null by using Scalar function.

    As you know if my column is defined as NULL and i haven't use any null handling technique in my application program then i may get the error ( SQL code -305). To avoid this we can use null indicator (explained in detail here ).

    You can get around this situation by using couple of Scalar function too,which can be part of your SQL query.

    These Scalar function are VALUE and COALESCE. Both of these function can be used to supply a value whenever DB2 returns a NULL. Let's see how exactly these function works.

    Example :

    Let's see i want to select salary of an employees from an EMP table where salary is greater than 1000 and also salary column is defined as NULL.

    Then i can use the Value function as below.


         SELECT * FROM EMP
             WHERE VALUE (SALARY,0)  > 1000;


    Now this function VALUE (SALARY,0)  > 1000 will return the actual value of SALARY when the column is NOT NULL (means the employee have some value as a SALARY ) and return 0 when the SALARY for employee is NULL. You can put any value instead of 0 here and whenever the SALARY for particular employee is NULL it will be replaced by that default value you placed.The DATA Type of default value you are providing must match with the DATA Type of column (SALARY in this case).

    Simply you can read it as ,

    IF SALARY NOT NULL then SALARY else 0.

    COALESCE function works the same way as VALUE and have the same syntax.

    That is COALESCE (SALARY,0)

    UNIQUE WHERE NOT NULL

    Indexes are nothing but the ordered set of pointers to your actual data. It's main advantage is that , you can create it on DB2 columns to speed up the query processing.

    The inedexes created can be either unique or non unique.When you create unique index on a column , it can contain only the unique values and exactly one null if the column is defined as NULL.

    What if you want to allow multiple NULL for the same.

    This new clause WHERE NOT NULL is use to enable multiple NULLs to exist in Unique Index.This is useful when an index contain at least one nullable column, but all non null entries must be unique.  

    Binding DBRM directly to a PLAN is NOT allowed (DB2 V9 onwards)

    As you all know, you can Bind DBRM directly to a PLAN or to a Package but DB2 V9 onwards you will not be able to bind your dbrms to plan and you need to bind it to packages only and bind the packages into a plan.


    DBRM-based plans are deprecated in DB2 9 for z/OS. That is, no enhancements are
    expected to be implemented and, in a future release, these types of plans might no longer be supported.


    Well then obvious question would arise that , what we will do with the existing PLANs which has DBRM directly bounded to it.


    Here is the solution.


    To convert DBRMs bound within plans into packages, a new REBIND PLAN option COLLID
    (collection name or *) is introduced to perform the DBRM to package conversion.


    If the plan that you specify already contains both DBRMs and package list, the newly converted package entries will be inserted into the front of the existing package list.


    During the REBIND PLAN process, DB2 reads the SYSIBM.SYSDBRM catlog table to retrieve the DBRMs bound with the plan, binds them into packages and binds the packages into the plans. Each package is listed separately in the PKLIST, not as a collection.*.

    Finally, the REBIND deletes the DBRMs bound into the plans. After the rebind, there are no longer any SYSDBRM or SYSSTMT records associated with the plan. The SYSPACKAGE and SYSPACKSTMT tables are populated with new package statement data.


    Let's see few example , how would you do it. ( As a programmer mostly you will not be the one , who will be doing  this task in your shop but this is just for you reference)


    * Converting all PLAN


    The following examples converts DBRMs that are bound with plan z into packages under the  collection ID mycollection.

         REBIND PLAN(z) COLLID('mycollection');

    Rebinding multiple plans which may contain DBRMs

    In the following example, BIND will traverse through each plan that is specified in the SYSPLAN table and will convert the DBRMs accordingly, and until none of the DBRMs are bound with plans.

            
    REBIND PLAN (z1, z2, z3) COLLID (collection_id|*);

    Specifying a package list

    The following examples converts all DBRMs that are bound with plan z into packages under collection ID
            

     REBIND PLAN (z) COLLID (collection_id|*) PKLIST(z);


    Additionally, releases of DB2 after Version 9.1 do not support the ACQUIRE(ALLOCATE) bind option for the BIND PLAN and REBIND PLAN commands.

    And at last it make sense to not have any significance of the MEMBER parameter for BIND PLAN.
         

    IS [NOT] DISTINCT FROM

    This clause is a very beautiful addition in DB2 (V8 Onwards).
    Dealing with Null is always somewhat complex and you should always be cautious about the result of what you have written anything which involves Null.
    Null is absence of a value. Neither equalling blank, space nor any character.
    Ok , so what you will do if you want to compare the two columns in a Query.
    Oh that's so simple, you know

    SELECT ...
    FROM Table1 JOIN Table2
    ON Table1.col1 = Table2.col1

    The predicate can evaluate to either TRUE or FALSE.
    Are you sure , this query will suffice ?? Think once again.
    What if both the columns (col1 and col2) or even one of them contain NULL .
    [For  example (Table1.col1 = 4, Table2.col1 = NULL), (Table1.col1 = NULL, Table1.col2 = 14) and even (Table1.col1 = NULL, Table2.col1 = NULL)]
    The predicate evaluates to UNKNOWN whenever NULLs are involved.
    You can still get around this situation by adding following logic.

    (Table1.col1 = Table2.col1 OR (Table1.col1 IS NULL AND Table2.col1 IS NULL))

    This makes the expression awkward, and also you would probably cross your fingers in hope that this will not hurt optimization. Note that this expression will “accept” only the cases you want to treat as a match and “reject” all others, but  still Table1.col1 = 14, Table2.col1 = NULL evaluates to UNKNOWN.
    This means that

     NOT(Table1.col1 = Table2.col1 OR (Table1.col1 IS NULL AND Table2.col1 IS NULL))

    Will evaluate to UNKNOWN in those cases as well and not to TRUE.So if you want all non matches of the positive predicate to be accepted, you won’t use the positive predicate and apply NOT on top, rather rewrite your logic.
    e.g
    Table1.col1  Table2.col1
      OR (Table1.col1 IS NULL AND Table2.col1 IS NOT NULL)
      OR (Table1.col1 IS NOT NULL AND Table2.col1 IS NULL)

    Now add this to the predicate of your query. Notice how awkward it is.

    Well here comes the solution to the problem with the simpler approach.

    Add the clause IS [NOT] DISTINCT FROM  thats it.

    It does require getting used to, but once you do, it makes perfect sense. DISTINCT treats NULLs like known values. That is, one NULL is not
    distinct from another NULL, but a NULL is distinct from known values. 4 is distinct from 14, but 4 is not distinct from 4.

    Now let’s go back to the example I mentioned earlier
    You would use

    T1.col1 IS NOT DISTINCT FROM T2.col1

    This is ultimately simpler than the awkward logic mentioned above.











    .


    Null Indicator Variable (Indicator Variable)

    The Null indicator variable are used in an Application Program as a part of exception handling while dealing with the columns which are defined as NULL ( Or say the one which is not defined with NOT NULL).

    Let us consider the following scenario:

    EXEC SQL

      SELECT EMP_NUMBER,EMP_ADDRESS
      INTO :EMP_NUMBER,:EMP_ADDRESS
      FROM EMP

    END-EXEC

    Let's say EMP_ADDRESS column in EMP tables is defined as NULL column and one of the Employee has not updated his address in database so it will be considered as null. Now what should happen if we try to fetch the row for that employee, what the host variable: EMP_ADDRESS suppose to contain after the Select operation. No it will not contain spaces as Null is absence of value and not blanks, spaces or any other character.

    In this case you will get the error with -305 as SQLCODE.

    To avoid this we need to use null indicator in our program which can be used to process the program flow in case of null.

    The Null indicator variable should be defined for the every column which may contain the null value.

    It must be defined as 2 Byte binary variable.

    01   EMP_ADDRESS_INDICATOR S9(4) COMP

    You can then refer this variable in your query as below.

    EXEC SQL

      SELECT EMP_NUMBER,EMP_ADDRESS
      INTO :EMP_NUMBER,:EMP_ADDRESS :EMP_ADDRESS_INDICATOR
      FROM EMP

    END-EXEC

    Note there is no comma between EMP_ADDRESS and EMP_ADDRESS_INDICATOR.

    Now see how this null indicator resolves our problem.

    If one of the employees have EMP_ADDRESS as NULL then EMP_ADDRESS_INDICATOR will be automatically updated with Negative value.

    Then you can check like

    IF EMP_ADDRESS_INDICATOR IS NOT ZERO
    THEN < do some processing>

    EMP_ADDRESS_INDICATOR will contain zero if the EMP_ADDRESS has some value (NOT NULL)

    Null indicator can contain three value

    0 : If column is not null that is it have some value in it.
    -1: If column contains null
    -2: If column contains null as a part of data conversion.

    Some Tricky Pointes to remember with Null indicator:

    1> You have to define the Null indicator with S9(4) COMP ( 2 BYTE Binary) or else you will get the error as " Undefined Host variable " even though you have defined it ( but with some other picture clause)

    2>  The value which host variable going to contain depends on the null indicator irrespective of content we move to it.

    For ex : Suppose i have to insert some record in table.

    If i move some address to the variable EMP_ADDRESS

    Move 'address' to EMP_ADDRESS

    Then insert this record to EMP table but i move -1 to Null indicator variable EMP_ADDRESS_INDICATOR associated with  EMP_ADDRESS then irrespective of 'address' moved to EMP_ADDRESS null will be moved EMP_ADDRESS in EMP table.

    3> You have used Null indicator in your query but didn't check it's value later in your program you may not get -305 sqlcode but your program will behave in unpredicted manner.

    The Placement of Host Variable in an Application Program

    As you all know that the Host Variable are nothing but the variable which are host language (Cobol ) equivalent of DB2 table coloumn.
    You can create this variable manually or using the tool DCLGEN (Declaration Generator) and then intern include in your Application Program.

    There are couple of ways you can place this host variable in your program
    1> Include in your program by using following statement in working-storage section.
           EXEC SQL
                      INCLUDE < Member Name >
           END-EXEC.
    Member name should be the member in PDS which contains these host variable definition and you should provide this PDS Library in your JCL.
    OR
    2> Directly put your Host variable declaration in your working-storage section as normal cobol variable.

    You cannot put your host variable in program by using COPY statement as you can do in case of normal cobol variable.This is because the statement within INCLUDE get expanded at DB2 precompilation time while the COPY get expanded at compilation time.

    Lets say if you put your host variable declaration in COPY statement then at the DB2 precompilation time you will get the "Undeclared host variable " error.


    So try to avoid this pitfall in case of placement of host variable in your application program.



    Timeout and resource unavailable.(-911 with timeout vs -904)


    This is quite tricky one to know. Well i thought so.
    -904 SQLCODE: Unavailable resource.
    -911 SQLCODE: Deadlock or timeout.But i am talking about timeout here.So we will get timeout since the resource is not available so the application will go in suspention mode and wait for some time( equal to IRLMRWT DSNZPARM paramater)and then timeout.


    SO WHAT EXACTLY THE DIFFRANCE BETWEEN -911 WITH TIMEOUT AND -904.


    A -911 is a lock timeout and occurs only when trying to access data that is locked by another process.

    A -904 is a resource unavailable, and can be caused by any number of resources that are not available. The -904 will have an associated reason code and type code. The reason tells you the "why" and the type code will tell you the "what." Reason codes and type codes are documented in the DB2 for z/OS Codes manual which is GC19-2971 for Version 10.

    SQLCODE -818 vs -805 (Timestamp mismatch)

    I have seen programmer often tend to get confuse between the two.We know that we can get SQLCODE -818 when the consistency token ( Placed by DB2 precompiler ) between the load module and the DBRM doesn't match.On the other hand , we get -805 when any DBRM or Package is not found in Plan which supposed to be there.

     Now lets see how we resolve both the abend.

    In case of -818, if we have the correct DBRM (DBRM with same consistency token as there in the load module or in other word the output DBRM from the same precompiler step from where we got our load module) then we can just bind our plan with DBRM and intern run the job with this new plan and corresponding load module.In case you don't  have the correct DBRM with you, just precompile the job again create new load module( via compile, link-edit) and bind to crate new plan so you can be sure that at run time the consistency token matches.

    For -805, you just see which dbrm or Package is not there in plan and include it accordingly in plan by binding it in plan.


    Now the real problem is even though sometimes the particular DBRM or the Package does exist in plan we get -805. Then obvious question is why???? why am i still getting -805 in this case.



    The answer is main reason for getting -805 is not just DBRM or Package not found in plan but it is

    -805 DBRM OR PACKAGE NAME

    location-name.collection-id.progname.
    consistency -token NOT FOUND IN PLAN plan-name REASON reason


    Simply stated, it means that an application program attempted to use a package
    'location-name.collection-id.progname.consistency-token' that was not found.


    The DBRM name ‘dbrm-name’ matched one or more entries in the package list and the search of those entries did not find the package (that is, it is present but the consistency token does not match). In this case the LOADLIB and the DBRM from which the package was created are not the same.


    So you can get timestamp mismatch (consistency token mismatch ) in case of -805 also as in case of -818

    The difference is if you bind a
    DBRM to Package and then bind that Package to Plan and consistency token doesn't match then you get -805 where as If you bind DBRM directly to Plan ( without Packaging) and then  consistency token doesn't match you get -818.

    Database 2 : Programmers guide

    Here is the new blog completely dedicated to IBM DB2. I wish to discuss all DB2 related information with you.Hope it get along with you well.I will try to post on various topics in DB2 randomly.I will try to have best of best knowledge on db2 in this blog from programmer's point of view and hence the name.

    If you have anything to share,have any queries,any doubts please feel free to leave a comment and rest assured.

    Thanks.