Mimer SQL User's Manual TOC PREV NEXT INDEX

Mimer Developer Site

www.mimer.com/developer


Handling NULL Values


NULL values require special handling in SQL queries. NULL represents an unknown value, and strictly speaking NULL is never equal to NULL. (NULL values are however treated as equal for the purposes of GROUP BY, DISTINCT and UNION).

Searching for NULL

The search condition:

 WHERE column = NULL
 

will not retrieve any rows since NULL is not equal to anything. The condition for selecting NULL values is

 WHERE column IS NULL
 

The negated form (WHERE column IS NOT NULL) selects values which are not NULL (i.e. values which are known).

Find the names of the persons who made the reservations for those customers who have not yet checked in to the Hotel Skyline:
Note: 'Not checked in' is represented by NULL in the CHECKIN column.
 SELECT  RESERVED_FNAME, RESERVED_LNAME
 FROM    BOOK_GUEST
 WHERE   CHECKIN IS NULL
 AND     HOTELCODE = (SELECT  HOTELCODE 
                      FROM    HOTEL
                      WHERE   NAME = 'SKYLINE');
Result:

 RESERVED_FNAME
 RESERVED_LNAME
 OMAR
 CHAFIR      
 AGNETA
 ERIKSSON  
 SVEN
 LINDHOLM    
 HENRIK
 PIHL      
 URBAN
 FRANSSON   

Find the names of the guests who have checked in to the Hotel Laponia.
 SELECT  GUEST_FNAME, GUEST_LNAME
 FROM    BOOK_GUEST
 WHERE   CHECKIN IS NOT NULL
 AND     HOTELCODE = (SELECT  HOTELCODE 
                      FROM    HOTEL
                      WHERE   NAME = 'LAPONIA');
Result:

 GUEST_FNAME
 GUEST_LNAME
 CHRISTOPHER
 DATE
 STEN
 JOHANSEN     
 STEFAN
 HANSEN     
 GUNNAR
 ALVE       
 NILS
 KRISTOFERSEN 
 LARS
 HOLMER       
 KNUT
 KULLMER      
 JUDITH
 SMITH      
 ADOLF
 SCHMIDT     
 LAILA
 ZETTERBERG  
 MATS
 HANSSON      

Null values in ALL, ANY, IN and EXISTS Queries

Null values should be treated cautiously, particularly in ALL, ANY, IN and EXISTS queries.

The result of a comparison involving NULL is unknown, which is generally treated as false. This can lead to unexpected results.

For example, neither of the following conditions are true:

 <null>     IN (...,null,...)
 <null> NOT IN (...,null,...)
 

The first result is almost intuitive: since NULL is not equal to NULL, NULL is not a member of a set containing NULL.

But if NULL is not a member of a set containing NULL, the second result is intuitively true.

In fact, neither result is true or false: both are unknown. If NULL values are involved on either side of the comparison, IN and NOT IN are not complementary. Similar arguments apply to queries containing ALL or ANY:

Where are hotels with rooms that are more expensive than those at the hotel Skyline (hotel code SKY)?
 SELECT  NAME, CITY
 FROM    HOTEL AS H, ROOM_PRICES AS RP
 WHERE   H.HOTELCODE = RP.HOTELCODE
 AND     PRICE > ALL (SELECT  PRICE
                      FROM    ROOM_PRICES
                      WHERE   HOTELCODE = 'SKY');
 

This query works as long as there are no NULL values in the PRICE column. But introduce a new room type at Skyline with an unknown price, and the query results in an empty set.

Moreover, the reverse query, hotels that are cheaper than all rooms at Skyline, also results in an empty set. A justification for this is that as long as one price at Skyline is unknown, it is impossible to say whether rooms at other hotels are more or less expensive than those at Skyline.

Using Exists

It is always possible to rephrase a query using ALL, ANY or IN in terms of one using EXISTS (with an outer reference between the selection and the EXISTS condition). This is to be recommended if the NULL indicator is to be permitted in the comparison sets, since NULL handling is then written out explicitly in the query.

Thus, the query above can also be written as follows:

 SELECT  NAME, CITY
 FROM    HOTEL AS H, ROOM_PRICES AS RP
 WHERE   H.HOTELCODE = RP.HOTELCODE
 AND     NOT EXISTS (SELECT  *
                     FROM    ROOM_PRICES
                     WHERE   HOTELCODE = 'SKY'
                     AND     (   PRICE <= RP.PRICE
                              OR PRICE IS NULL
                              OR RP.PRICE IS NULL ));
 

This formulation may be read as 'Find hotels where no room at Skyline is cheaper than or the same price as any room in the hotel in question, as long as no prices are unknown'.

The explicit PRICE IS NULL clause tests that if either of the components of the comparison is NULL, then the subselect is not empty, NOT EXISTS is false, and no row is returned.

In general, a query of the form ($ stands for any comparison operator):

 SELECT  column-list
 FROM    table1
 WHERE   column1 $ ALL (SELECT  column2
                        FROM    table2
                        WHERE   condition)
 is equivalent to
 SELECT  column-list
 FROM    table1
 WHERE   NOT EXISTS (SELECT  *
                     FROM    table2
                     WHERE   condition
                     AND ( NOT table1.column1 $ table2.column2
                           OR  table1.column1 IS NULL
                           OR  table2.column2 IS NULL ));
 

A similar example is:

Where are hotels with rooms that have unknown prices or that are more expensive than rooms with known prices at hotel Skyline?
 SELECT  NAME, CITY
 FROM    HOTEL H, ROOM_PRICES RP
 WHERE   H.HOTELCODE = RP.HOTELCODE
 AND     NOT EXISTS (SELECT  *
                     FROM    ROOM_PRICES
                     WHERE   HOTELCODE = 'SKY'
                     AND     PRICE <= RP.PRICE);
 

This query does not exclude the occurrence of the NULL indicator from the comparisons. If there is an unknown price, then the hotel concerned will be included in the result set - even if the unknown price is at Skyline itself. (Skyline might have a room that is more expensive than all rooms with known prices at Skyline).

Formulated with ALL, this query would be:

 SELECT  NAME, CITY
 FROM    HOTEL H, ROOM_PRICES RP
 WHERE   H.HOTELCODE = RP.HOTELCODE
 AND     PRICE > ALL (SELECT  PRICE
                      FROM    ROOM_PRICES
                      WHERE   HOTELCODE = 'SKY'
                      AND     PRICE IS NOT NULL);
 

It is clear from the examples above that distinctions between queries involving NULL comparisons are subtle and are easily overlooked.

It is essential that the aim of a query is stringently defined before the query is formulated in SQL, and that the possible effects of NULL values in the search condition are considered.

There are many real-life examples where the presence of NULL has resulted in unforeseen and sometimes misleading data retrievals. It is advisable to define all columns in the database tables as NOT NULL except those where unknown values have a specific meaning, such as the CHECKIN and CHECKOUT columns in the BOOK_GUEST table. In this way the risks of confusion with NULL handling are minimized.



Upright Database Technology AB
Voice: +46 18 780 92 00
Fax: +46 18 780 92 40
dbtechnology@upright.se
Mimer SQL User's Manual TOC PREV NEXT INDEX