Indexes in DB2 - Part I

In order to get the bottom of any topic, I always prefer getting the answer of  three important questions.

What is it ?
Why we need it?
What if we don't have it?

So, today we are going to understand the topic indexes in DB2 using these questions and try to learn as much as possible.

What is it??

Index is very powerful concept , which uses the pointers to actual data to more efficiently access the specific data item.

DB2 Indexes are the DB2 objects which can be created on columns of the table to speed up the processing of SQL queries and sometimes can also help in uniquely identifying each row in a table.

To give one very practical example of what indexes are , think of indexing in books which you often see either at the start or end of the book catalog.




Why we need it?

As we mentioned above, it helps in speeding up the query processing by allowing DB2 optimizer to choose most optimized path to access the data you are looking for.

It also helps (Depending on type of index) to uniquely identify each single row in your DB2 table.

Lets understand this using our example of books.

Consider any book of 1000 pages and you don't want to read a whole book but a specific information/chapter in that book. what you will do?

Will you start looking/reading from page 1 ,each and every page/line ,until you find the information you are looking for ?  If yes, what is the problem here?

Obviously, It will be very time consuming to get a small piece of information in big book. So here indexing of the books helps us , you can just look for the specific keyword in book index and that will tell you what all the pages in book have that keyword and then you just read those specific pages to find out the exact information you are looking for.

What it did, reduces a lot of time to get to the data and speed up our process of looking information.

DB2 indexes are same, book is DB2 table, pages in book are DB2 pages, lines are rows in a table and specific data you are looking is the value of columns of the DB2 tables.

What if we don't have it?

In above example of books, imaging you don't have indexes. You get the drill, right?

If there is a DB2 table with large amount of data in it and there are no indexes defined on any of the columns, and you fire any simple SELECT SQL query on it.

SELECT EMP_NAME FROM EMP_TABLE
WHERE EMP_NO = '123456'

It will start reading the each and every DB2 page sequentially until it finds the data matching to predicates in your query and it will consume lot of CPU and Elapsed time. With time comes lot of other consequence like if the query is not defined with UR(Uncommitted read) , the locks will be held on tables/pages/rows and other process/program trying to access the same table/data at around same time will have to wait and more likely the waiting process fails with TIMEOUT.

So proper indexing is very important and there are only pros of having indexes.

But in some rare cases there are times when indexing  a table is not the way to go.

Consider a table with very less data.. may be 100 or even 1000 rows. Here query can get you the data faster when there are no indexes.

This is because having index is two way read process, first the indexes have to be read and then the pointers/pages  mentioned by index have to be read and with table having very less rows it is rather faster to read the data sequentially than having indexes.

So consider having indexed when there are tables with large number of rows.

Now, in next post we will cover the answer of next set of questions like how it works??.How indexes work and what are the types of indexes etc.




 

DB2 Image Copy

In a Production environment, there are constant updates happening on your DB2 tables as a result of several jobs/programs functionality. These updates are nothing but the results of the execution of  Insert/Update/Delete statement on your various DB2 tables.

What if, I want to take a look at the data in tables before my today's batch pass ran ?? this may be for some analysis/understanding  or worse inadvertently something went  wrong and you want to recover the data back to where it was before the job ran. Is there a way? Wouldn't it be nice if some one can take a back up of my DB2 tables before the batch pass run so I can always go and get it back where we started.

Well , it's already happening the only thing is as a programmer/ developer you may not be knowing it. This is something done by using DB2 utility and by your DataBaseAdministrator .

This process of taking full back-ups of your data objects is called as DB2 full image copy. It is achieved by DB2 utility COPY.

You can make full image copies of a variety of data objects. Data objects include table spaces, table space partitions, data sets of nonpartitioned table spaces, index spaces, and index space partitions.

The following statement specifies that the COPY utility is to make a full image copy of the TABSPACE table space in database DATABASE.

COPY TABLESPACE TABSPACE.DATABASE

The COPY utility writes pages from the table space or index space to the output data sets.

So, if you want to take a look at old data before pass, all you have to do is dump that image copy dataset from COPY Utility into normal DASD file on disk.

There is also something called as incremental image copy, this as it name suggest does not take complete copy but only the records which has changed (Insert/Update/Delete) from last run .

The information of this COPY Utility can be found on the DB2 catalog table SYSIBM.SYSCOPY 

More on how JCL'S for DB2 Utility looks and how it works will be covered later in DB2 utility post which I will be doing shortly.

DB2 System Catalog Tables

Ever wondered, when any new DB2 object is created like Table,View,Stored Procedure,DBRM etc. where all the information is stored. Is there any place where I can look up to get all the details like all the Indexes of the table, all the columns in an index, Referential integrity on the table etc..

DB2 system catalog tables is the answer and Good Developer/programmer should know about it.

All the mainframe shops generally have tools to get this information.

These tools goes by different name , most common of those and people many know is by BMC.
BMC Change and Catalog Manager,BMC Main View for DB2 etc.

This tool will get you all this information and have easy to use User interface and navigation but this tool in background also use the information from DB2 Systems catalog table.

So, DB2 system catalog tables is the ultimate source where all information is recorded.

As the name suggest , there are set of special tables where information get recorded automatically whenever any object is created. With proper access we can just query these tables ,like we query any other tables, and get the desired information.

All catalog tables are qualified by schema SYSIBM.

Ex SYSIBM.SYSTABLES,SYSIBM.SYSKEYS etc.

Sample query -

SELECT * FROM SYSIBM.SYSTABLES
WHERE NAME = <Table name>

More about it in upcoming posts.






 

Cusror in DB2 ---> Part III (Multirow fetch)


==> Click Here to Read Part I <==


==> Click Here to Read Part II <==


In this post ,we are going to discuss how to fetch and process multiple rows (Multirow fetch) at a time using single fetch statement in DB2 cursor.

The multirow fetch was introduced in DB2 version 8 onwards.

Why to use it ?

By fetching multiple rows at once, your request can become more efficient and it can improve the performance by reducing the CPU time.

How to use it ?

As we know with regular Cursor fetch statement we can fetch one row at a time, values will be fetched into host variables and processed later into the program.

Ex -

EXEC SQL
      FETCH  CURSOR_NAME
      IN TO :A,:B
END-EXEC.


Now if we want to use Multirow Fetch feature, First of all our Cursor should be defined with something called as "With Rowset positioning".

A rowset is a group of rows that are operated on as a set. Such a cursor enables your program to retrieve more than one row using a single FETCH statement

Define Cursor:

EXEC SQL
    DECLARE CURSOR CURSOR_NAME
    WITH ROWSET POSITIONING
    FOR
    SELECT EMP_NAME FROM EMPTABLE
END-EXEC.

Now with this cursor when we use fetch statement we will get multiple rows at a time which normal host variable structure will not be able to handle it as it was designed to receive only one row at a time.  

So,  to use a multi-row fetch you must have defined the appropriate structures to receive multi-row data. This means you must defined an array of host variables into which the fetched rows can be placed. Each column fetched requires its own host variable array into which its values will be placed. If column is nullable then make sure we have separate  host variable array defined for null indicator.

Be sure to match the array size to the rowset size. This way a single FETCH statements can be written to retrieve more than a single row from the result set.

Fetch Cursor :

EXEC SQL
     FETCH ROWSET FROM CURSOR_NAME
     FOR 10 ROWS
     INTO :HOST-VAR-ARRAY
END-EXEC.

So it looks exactly like normal fetch statement except the word ROWSET and FOR 10 ROWS.

ROWSET keyword indicates that this is a cursor with Rowset positioning.
The FOR 10 ROWS clause specifies the size of the rowset to be returned. The maximum rowset size is 32,767.

Rowset cursors are very useful when you need to retrieve many rows or large amounts of data.
By retrieving multiple rows with a single FETCH, multiple trips between the application and the database can be eliminated, thereby improving network performance.


 

LOB Data type in DB2.

We have various different type of data types in DB2.  For ex- SMALL INT,INT,CHAR,VARCHAR..

These data types seems to work fine and do their job perfectly .But if you want to store large audio,video, images or any other file which has much larger size than 32kb they cant handle it.

VARCHAR ,VARGRAPHIC,VARBINARY has the storage limit of 32 KB.

For data objects that are larger than 32 KB, you have to use something called as large object (LOB) data types to store these objects.

DB2 provides three different type of LOB data types to store these data objects as strings of up to 2 GB in size:
Character large objects (CLOBs) :
 
Use this data type if your data is larger (or might grow larger) than the VARCHAR data type permits. It can store up to 2GB. For example, you can store information such as an employee resume, or the text of book in a CLOB.
 
Double-byte character large objects (DBCLOBs)
 
Use the DBCLOB data type to store large amounts of DBCS data, such as documents that use a DBCS character set.
 
Binary large objects (BLOBs)
 
Use the BLOB data type to store large amounts of non character data, such as pictures, voice, and mixed media.
If your data does not fit entirely within a data page, you can define one or more columns as LOB columns.

Example :

CREATE TABLE EMPLOYEE_SALARY
     (DEPTNO   CHAR(3)      NOT NULL, EMP_RESUME  BLOB(1G) )


 

DB2 SQL query in COBOL program - Bind Warning

When we write embedded SQL in COBOL-DB2 application program , we know that

  • It should be always within EXEC SQL END-EXEC.
  • If we are doing a SELECT or using a Predicate with specific value passed in variable for comparison, that variable should always be preceded by :

All the variables preceded by : called as Host Variable.

Ex :

EXEC SQL

  SELECT SAMPLE_DATE INTO :WS-DATE
  FROM TABLE_A WHERE DAY = :TABLE_A.DAY

END-EXEC.

Here :WS-DATE ,DAY OF TABLE_A are host variable , declared somewhere in working storage section with appropriate definition.

What would be the consequence if somehow we miss this variable to precede with :(Say TABLE_A.DAY was not preceded by : in above query)

Will we able to compile the program ?? Bind it ?? What error will we get , if query is getting executed.

First of all, yes program can be compiled, and bind will be successful. But there will be a warning message in your Bind step with SQLCODE 203 and SQLSTATE 01552.

So lesson learnt is if your bind is successful not necessary mean that all is well, carefully look at if there is any warning provided by DB2 precompiler.If you miss it , you will be in for surprise Production/Test failure. Just here in this case if you just go ahead and move the code with Bind warning and when program runs and try to execute this query it will fail with SQLCODE -206