Indexes in DB2 - Part I

In order to get the bottom of any topic, I always prefer getting the answer of  three important questions.

What is it ?
Why we need it?
What if we don't have it?

So, today we are going to understand the topic indexes in DB2 using these questions and try to learn as much as possible.

What is it??

Index is very powerful concept , which uses the pointers to actual data to more efficiently access the specific data item.

DB2 Indexes are the DB2 objects which can be created on columns of the table to speed up the processing of SQL queries and sometimes can also help in uniquely identifying each row in a table.

To give one very practical example of what indexes are , think of indexing in books which you often see either at the start or end of the book catalog.




Why we need it?

As we mentioned above, it helps in speeding up the query processing by allowing DB2 optimizer to choose most optimized path to access the data you are looking for.

It also helps (Depending on type of index) to uniquely identify each single row in your DB2 table.

Lets understand this using our example of books.

Consider any book of 1000 pages and you don't want to read a whole book but a specific information/chapter in that book. what you will do?

Will you start looking/reading from page 1 ,each and every page/line ,until you find the information you are looking for ?  If yes, what is the problem here?

Obviously, It will be very time consuming to get a small piece of information in big book. So here indexing of the books helps us , you can just look for the specific keyword in book index and that will tell you what all the pages in book have that keyword and then you just read those specific pages to find out the exact information you are looking for.

What it did, reduces a lot of time to get to the data and speed up our process of looking information.

DB2 indexes are same, book is DB2 table, pages in book are DB2 pages, lines are rows in a table and specific data you are looking is the value of columns of the DB2 tables.

What if we don't have it?

In above example of books, imaging you don't have indexes. You get the drill, right?

If there is a DB2 table with large amount of data in it and there are no indexes defined on any of the columns, and you fire any simple SELECT SQL query on it.

SELECT EMP_NAME FROM EMP_TABLE
WHERE EMP_NO = '123456'

It will start reading the each and every DB2 page sequentially until it finds the data matching to predicates in your query and it will consume lot of CPU and Elapsed time. With time comes lot of other consequence like if the query is not defined with UR(Uncommitted read) , the locks will be held on tables/pages/rows and other process/program trying to access the same table/data at around same time will have to wait and more likely the waiting process fails with TIMEOUT.

So proper indexing is very important and there are only pros of having indexes.

But in some rare cases there are times when indexing  a table is not the way to go.

Consider a table with very less data.. may be 100 or even 1000 rows. Here query can get you the data faster when there are no indexes.

This is because having index is two way read process, first the indexes have to be read and then the pointers/pages  mentioned by index have to be read and with table having very less rows it is rather faster to read the data sequentially than having indexes.

So consider having indexed when there are tables with large number of rows.

Now, in next post we will cover the answer of next set of questions like how it works??.How indexes work and what are the types of indexes etc.




 

DB2 Image Copy

In a Production environment, there are constant updates happening on your DB2 tables as a result of several jobs/programs functionality. These updates are nothing but the results of the execution of  Insert/Update/Delete statement on your various DB2 tables.

What if, I want to take a look at the data in tables before my today's batch pass ran ?? this may be for some analysis/understanding  or worse inadvertently something went  wrong and you want to recover the data back to where it was before the job ran. Is there a way? Wouldn't it be nice if some one can take a back up of my DB2 tables before the batch pass run so I can always go and get it back where we started.

Well , it's already happening the only thing is as a programmer/ developer you may not be knowing it. This is something done by using DB2 utility and by your DataBaseAdministrator .

This process of taking full back-ups of your data objects is called as DB2 full image copy. It is achieved by DB2 utility COPY.

You can make full image copies of a variety of data objects. Data objects include table spaces, table space partitions, data sets of nonpartitioned table spaces, index spaces, and index space partitions.

The following statement specifies that the COPY utility is to make a full image copy of the TABSPACE table space in database DATABASE.

COPY TABLESPACE TABSPACE.DATABASE

The COPY utility writes pages from the table space or index space to the output data sets.

So, if you want to take a look at old data before pass, all you have to do is dump that image copy dataset from COPY Utility into normal DASD file on disk.

There is also something called as incremental image copy, this as it name suggest does not take complete copy but only the records which has changed (Insert/Update/Delete) from last run .

The information of this COPY Utility can be found on the DB2 catalog table SYSIBM.SYSCOPY 

More on how JCL'S for DB2 Utility looks and how it works will be covered later in DB2 utility post which I will be doing shortly.