This clause is a very beautiful addition in DB2 (V8 Onwards).
Dealing with Null is always somewhat complex and you should always be cautious about the result of what you have written anything which involves Null.
Null is absence of a value. Neither equalling blank, space nor any character.
Ok , so what you will do if you want to compare the two columns in a Query.
Oh that's so simple, you know
SELECT ...
FROM Table1 JOIN Table2
ON Table1.col1 = Table2.col1
The predicate can evaluate to either TRUE or FALSE.
Are you sure , this query will suffice ?? Think once again.
What if both the columns (col1 and col2) or even one of them contain NULL .
[For example (Table1.col1 = 4, Table2.col1 = NULL), (Table1.col1 = NULL, Table1.col2 = 14) and even (Table1.col1 = NULL, Table2.col1 = NULL)]
The predicate evaluates to UNKNOWN whenever NULLs are involved.
You can still get around this situation by adding following logic.
(Table1.col1 = Table2.col1 OR (Table1.col1 IS NULL AND Table2.col1 IS NULL))
This makes the expression awkward, and also you would probably cross your fingers in hope that this will not hurt optimization. Note that this expression will “accept” only the cases you want to treat as a match and “reject” all others, but still Table1.col1 = 14, Table2.col1 = NULL evaluates to UNKNOWN.
This means that
NOT(Table1.col1 = Table2.col1 OR (Table1.col1 IS NULL AND Table2.col1 IS NULL))
Will evaluate to UNKNOWN in those cases as well and not to TRUE.So if you want all non matches of the positive predicate to be accepted, you won’t use the positive predicate and apply NOT on top, rather rewrite your logic.
e.g
Table1.col1 Table2.col1
OR (Table1.col1 IS NULL AND Table2.col1 IS NOT NULL)
OR (Table1.col1 IS NOT NULL AND Table2.col1 IS NULL)
Now add this to the predicate of your query. Notice how awkward it is.
Well here comes the solution to the problem with the simpler approach.
Add the clause IS [NOT] DISTINCT FROM thats it.
It does require getting used to, but once you do, it makes perfect sense. DISTINCT treats NULLs like known values. That is, one NULL is not
distinct from another NULL, but a NULL is distinct from known values. 4 is distinct from 14, but 4 is not distinct from 4.
distinct from another NULL, but a NULL is distinct from known values. 4 is distinct from 14, but 4 is not distinct from 4.
Now let’s go back to the example I mentioned earlier
You would use
T1.col1 IS NOT DISTINCT FROM T2.col1
This is ultimately simpler than the awkward logic mentioned above.
.
2 comments:
It is actually a nice and helpful piece of information.
I am happy that you simply shared this useful information with us.
Please stay us up to date like this. Thanks for sharing.
my homepage - diet that works
Hello There. I found your blog using msn. This is an extremely well written article.
I'll make sure to bookmark it and come back to read more of your useful information. Thanks for the post. I will definitely comeback.
Feel free to visit my web page - calories burnt walking
Post a Comment