Retrieving Data from More than One Table
The examples presented up to now in this chapter have illustrated the essential features of simple SELECT statements with data retrieval from single tables. However, much of the power of SQL lies in the ability to perform joins through a single statement, i.e. to select data from two or more tables, using the search condition to link the tables in a meaningful way.
The Join Condition
In retrieving data from more than one table, the search condition or join condition specifies the way the tables are to be linked. For example:
List the billed charges for reservation number 1349:SELECT DESCRIPTION, COST FROM CHARGES, BILL WHERE RESERVATION = 1349 AND BILL.CHARGE_CODE = CHARGES.CHARGE_CODE;
The join condition here is BILL.CHARGE_CODE = CHARGES.CHARGE_CODE, which relates the charge code in table BILL (where amounts are listed) to the charge code in table CHARGES (where the text description of the charge code is listed).
Conceptually, the join first establishes a table containing all combinations of the rows in CHARGES with the rows in BILL, then selects those rows in which the two CHARGE_CODE values are equal. See Conceptual Description of the Selection Process for a fuller description of the conceptual SELECT process.
This does not necessarily represent the order in which the operations are actually performed; the order of evaluation of a complex SELECT statement is determined by the SQL optimizer, regardless of the order in which the component clauses are written.
Without the join condition, the result is a cross product of the columns in the tables in question, containing all possible combinations of the selected columns, for example:SELECT DESCRIPTION, COST FROM CHARGES, BILL WHERE RESERVATION = 1349;
It is easy to see that a carelessly formulated join query can produce a very large result table. Two tables of 100 rows each, for instance, give a cross product with 10,000 rows; three tables of 100 rows each give a cross product with 1,000,000 rows!
The risk of generating large (erroneous) result tables is particularly high in interactive SQL (e.g. when BSQL is used), where queries are so easily written and submitted.
In simple joins, all tables used in the join are listed in the FROM clause of the SELECT statement. This is in distinction to nested joins, where the search condition for one SELECT is expressed in terms of another SELECT, see Nested Selects.
An example of a simple join is the query described in The Join Condition.
Example:SELECT DESCRIPTION, COST FROM CHARGES, BILL WHERE BILL.CHARGE_CODE = CHARGES.CHARGE_CODE AND RESERVATION = 1349;
The form SELECT * may be used in a join query, but since this selects all columns in the result set, at least one column is usually duplicated.
Example:SELECT * FROM CHARGES, BILL ...;
Columns in the join query that are uniquely identified by the column name may be specified by name alone. Columns that have the same name in the joined tables must be qualified by their respective table names.
There is an alternative formulation of the query above:SELECT DESCRIPTION, COST FROM CHARGES JOIN BILL ON CHARGES.CHARGE_CODE = BILL.CHARGE_CODE AND RESERVATION = 1349;
All predicates that can be used in a where clause, except sub-selects, can be used in an on-clause. The join clause can be used as a statement on its own:CHARGES JOIN BILL ON CHARGES.CHARGE_CODE = BILL.CHARGE_CODE;
orCHARGES NATURAL JOIN BILL;
About Natural Joins
A natural join, joins the table on the condition of equality between any columns with the same name, in the two tables. In the first example, all columns from the two tables are present in the result. In the second example the join columns will only occur once. Thus, in the first case, the CHARGE_CODE column appears twice in the result, while there is only one occurrence of this column in the second result.
Nesting Join Clauses
It is possible to nest join-clauses, for example:
Select the status of all rooms at hotel LAPONIA:SELECT ROOMNO, STATUS FROM ROOMSTATUS NATURAL JOIN ROOMS JOIN HOTEL ON HOTEL.HOTELCODE = ROOMS.HOTELCODE AND HOTEL.NAME = 'LAPONIA';
The natural join between ROOMSTATUS and ROOMS is slightly contrived in this example and is present to demonstrate that joins can be nested. If the STATUS column in the ROOMS table was not a foreign key referencing the ROOMSTATUS table, the function of the join could be to validate values in the ROOMS.STATUS column.
Complex Search Conditions and Joins
A join query can join any number of tables using complex search conditions to select the relevant information from each table.
Select the total bill for guest Sten Johansen and list it in both Swedish and Danish crowns (SEK and DKK respectively).SELECT GUEST_LNAME, SUM(COST)/RATE AS TOTAL_BILL, CURRENCY FROM BOOK_GUEST, BILL, EXCHANGE_RATE WHERE GUEST_LNAME = 'JOHANSEN' AND (CURRENCY = 'DKK' OR CURRENCY = 'SEK') AND BOOK_GUEST.RESERVATION = BILL.RESERVATION GROUP BY GUEST_LNAME, CURRENCY, RATE;
In formulating a search condition for a join query, it can help to write out the columns that would appear in a complete cross-product of the tables. The search condition is then formulated as though the query was a simple SELECT from the cross-product table.
The joins in the previous sections were all inner joins. In an inner join between two tables, only rows that fulfill the join condition are present in the result.
An outer join, on the contrary, contains non-matching rows as well. The outer join has two options, LEFT and RIGHT.
Left Outer Join
Example:SELECT DESCRIPTION, COST FROM CHARGES LEFT OUTER JOIN BILL ON CHARGES.CHARGE_CODE = BILL.CHARGE_CODE AND RESERVATION = 1349;
In the example above all rows from the table to the left in the join clause, i.e. CHARGES, are present in the result. Non-matching rows from the BILL table are filled with null values in the result.
Observe the difference in result for the next statement and the previous one.SELECT DESCRIPTION, COST FROM CHARGES LEFT OUTER JOIN BILL ON CHARGES.CHARGE_CODE = BILL.CHARGE_CODE WHERE RESERVATION = 1349;
The reason is that conditions in the where clause are applied to the result of the join-clause and not to the joined tables as is the case with the conditions in the on-clause.
Right Outer Join
A right outer join will take all records from the table to the right in the join-clause.
Nesting Outer Joins
As with inner joins, it is possible to nest join-clauses. Nested joins can be of different types, i.e. both inner and outer joins.
The result of nested outer joins can be somewhat unexpected though, as it is the result of the first join-clause that is the left table in the next join, and not the right table in the first join-clause.
A form of SELECT, called a subselect, can be used in the search condition of a SELECT statement to form a nested query.
The main SELECT statement is then referred to as the outer select. For example:
Select the names of hotels which have rooms with a price under 750.SELECT NAME FROM HOTEL WHERE HOTELCODE IN (SELECT HOTELCODE FROM ROOM_PRICES WHERE PRICE < 750 );
To see how this works, evaluate the subselect first:SELECT HOTELCODE FROM ROOM_PRICES WHERE PRICE < 750;
Then use the result of the subselect in the search condition of the outer select:SELECT NAME FROM HOTEL WHERE HOTELCODE IN ('LAP','STG');
A subselect can be used in a search condition wherever the result of the subselect can provide the correct form of the data for the search condition.
Thus a subselect used with '=' must give a single value as a result.
A subselect used with IN, ALL or ANY must give a set of single values, see Retrieval with ALL, ANY, SOME.
A subselect used with EXISTS may give any result, see Retrieving Data Using EXISTS and NOT EXISTS.
Examples:WHERE column = (subselect) WHERE column IN (subselect) WHERE column = ALL (subselect) WHERE column = ANY (subselect) WHERE EXISTS (subselect)
Subselects and ORDER BY Clauses
Subselects cannot include ORDER BY clauses. The UNION operator can be used to combine two or more subselects in more complex statements, see Union Queries.
Many nested queries can equally well be written as simple joins. For example:
Select the names of hotels which have rooms with a price under 750.SELECT NAME FROM HOTEL WHERE HOTELCODE IN (SELECT HOTELCODE FROM ROOM_PRICES WHERE PRICE < 750 );
or alternativelySELECT DISTINCT NAME FROM HOTEL, ROOM_PRICES WHERE HOTEL.HOTELCODE = ROOM_PRICES.HOTELCODE AND ROOM_PRICES.PRICE < 750;
Both these queries give exactly the same result. In most cases, the choice of which form to use is a matter of personal preference. Choose the form which you can understand most easily; the clearest formulation is least likely to cause problems.
Subselects in Queries
Queries may contain any number of subselects, for example:
List hotels which have rooms that are more expensive than any of the rooms at the Hotel Laponia.SELECT NAME FROM HOTEL WHERE HOTELCODE IN (SELECT HOTELCODE FROM ROOM_PRICES WHERE PRICE > (SELECT MAX(PRICE) FROM ROOM_PRICES WHERE HOTELCODE = (SELECT HOTELCODE FROM HOTEL WHERE NAME = 'LAPONIA')));
Note the balanced parentheses for the nested levels.
It is particularly important at this level of complication to think carefully through the query to make sure that it is correctly formulated.
Often, writing some of the levels as simple joins can simplify the structure. The previous example may also be written:SELECT DISTINCT NAME FROM HOTEL, ROOM_PRICES WHERE HOTEL.HOTELCODE = ROOM_PRICES.HOTELCODE AND PRICE > (SELECT MAX(PRICE) FROM ROOM_PRICES, HOTEL WHERE ROOM_PRICES.HOTELCODE = HOTEL.HOTELCODE AND NAME = 'LAPONIA' );
Ordering Nested Queries
The ORDER BY clause may only be used in outer SELECT statements and not in subselects.
The following example is correct:SELECT NAME, ROOMTYPE, FROM_DATE, PRICE FROM HOTEL, ROOM_PRICES WHERE ROOMTYPE IN ('NSSGLS','NSSGLB') ORDER BY NAME;
The following example is incorrect:SELECT NAME, ROOMTYPE, FROM_DATE, PRICE FROM HOTEL, ROOM_PRICES WHERE HOTEL.HOTELCODE IN (SELECT HOTELCODE FROM ROOM_PRICES WHERE ROOMTYPE IN ('NSSGLS','NSSGLB') ORDER BY HOTELCODE);
A correlation name is a temporary name given to a table to represent a logical copy of the table within a query.
Correlation names can be up to a maximum of 128 characters long.
There are three uses for correlation names:
Simplifying Complex Queries Using Correlation Names
Using short correlation names into complicated queries can make the query easier to write and understand, particularly when qualified table names are used:SELECT HOTELADM.BOOK_GUEST.GUEST_LNAME, HOTELADM.HOTEL.NAME, SUM(COST) FROM HOTELADM.BOOK_GUEST, HOTELADM.HOTEL, HOTELADM.BILL WHERE HOTELADM.BILL.RESERVATION = HOTELADM.BOOK_GUEST.RESERVATION AND HOTELADM.HOTEL.HOTELCODE = 'WINS' GROUP BY HOTELADM.BOOK_GUEST.GUEST_LNAME, HOTELADM.HOTEL.NAME;
may be rewrittenSELECT G.GUEST_LNAME, H.NAME, SUM(COST) FROM HOTELADM.BOOK_GUEST AS G, HOTELADM.HOTEL AS H, HOTELADM.BILL AS B WHERE B.RESERVATION = G.RESERVATION AND H.HOTELCODE = 'WINS' GROUP BY G.GUEST_LNAME, H.NAME;
The keyword AS in the FROM clause may be omitted, but is recommended for clarity. Do not confuse AS in the FROM clause (defining a correlation name) with AS in the select list, see Setting Column Labels, defining a label.
About Correlation Names
Correlation names are local to the query in which they are defined.
When a correlation name is introduced for a table name, all references to the table in the same query must use the correlation name. The following expression is not accepted:... FROM HOTELADM.BOOK_GUEST AS G, ... WHERE H.RESERVATION = HOTELADM.BOOK_GUEST.RESERVATION
Joining a Table with Itself Using a Correlation Name
Joining a table with itself allows you to compare information in a table with other information in the same table. This can be done with a correlation name.
Select all pairs of hotels located in the same city:SELECT HOTEL.NAME, HOTEL.CITY FROM HOTEL, HOTEL AS COPY WHERE HOTEL.CITY = COPY.CITY AND HOTEL.NAME <> COPY.NAME;
Here, the table HOTEL is joined to a logical copy of itself called COPY.
The first search condition finds pairs of hotels in the same city, and the second eliminates 'pairs' with the same name. Without the second condition in the search condition, all hotel names would be selected!
Without correlation names, this kind of query cannot be formulated. The following query would select all the hotel names from the table:SELECT HOTEL.NAME, HOTEL.CITY FROM HOTEL WHERE HOTEL.CITY = HOTEL.CITY;
Outer References in Subselects Using Correlation Names
In some constructions using subselects, a subselect at a lower level may refer to a value in a table addressed at a higher level. This kind of reference is called an outer reference.SELECT NAME FROM HOTEL WHERE EXISTS (SELECT * FROM BOOK_GUEST WHERE HOTELCODE = HOTEL.HOTELCODE);
This kind of query processes the subselect for every row in the outer select, and the outer reference represents the value in the current outer select row. In descriptive terms, the query says 'For each row in HOTEL, select the NAME column if there are rows in BOOK_GUEST containing the current HOTELCODE value'.
If the qualifying name in an outer reference is not unambiguous in the context of the subselect, a correlation name must be defined in the outer select.
A correlation name may always be used for clarity, as in the following example:SELECT NAME FROM HOTEL AS H WHERE EXISTS (SELECT * FROM BOOK_GUEST WHERE HOTELCODE = H.HOTELCODE);
Retrieving Data Using EXISTS and NOT EXISTS
EXISTS is used to check for the existence of some row or rows which satisfy a specified condition. EXISTS differs from the other operators in that it does not compare specific values; instead, it tests whether a set of values is empty or not. The set of values is specified as a subselect.
The subselect following the EXISTS clause most often uses of 'SELECT *' as opposed to 'SELECT column-list' since EXISTS only searches to see if the set of values addressed by the subselect is empty or not - a specified column is seldom relevant in the subquery.
EXISTS (subselect) is true if the result set of the subselect is not empty
NOT EXISTS (subselect) is true if the result set of the subselect is empty
SELECT statements with EXISTS almost always include an outer reference linking the subselect to the outer select.
Examples of Exists
Find the names of hotels for which guests exist in the BOOK_GUEST table:SELECT NAME FROM HOTEL AS H WHERE EXISTS (SELECT * FROM BOOK_GUEST WHERE HOTELCODE = H.HOTELCODE);
Without the outer reference, the select becomes a conditional 'all-or-nothing' statement: perform the outer select if the subselect result is not empty, otherwise select nothing.
List all reservation numbers if anybody has checked out without paying:SELECT DISTINCT RESERVATION FROM BILL WHERE EXISTS (SELECT * FROM BOOK_GUEST WHERE CHECKOUT IS NOT NULL AND PAYMENT IS NULL);
Examples of NOT EXISTS
The next example illustrates NOT EXISTS:
Which hotels do not have double rooms with showers?SELECT NAME, HOTELCODE FROM HOTEL AS H WHERE NOT EXISTS (SELECT * FROM ROOMS WHERE HOTELCODE = H.HOTELCODE AND ROOMTYPE IN ('NSDBLS','SDBLS');
Negated EXISTS clauses must be handled with care. There are two semantic 'opposites' to EXISTS, with very different meanings:WHERE EXISTS (SELECT * FROM GUESTS WHERE GUEST = 'CODD')
is true if at least one guest is called CODD.WHERE NOT EXISTS (SELECT * FROM GUESTS WHERE GUEST = 'CODD')
is true if no guest is called CODD.
ButWHERE EXISTS (SELECT * FROM GUESTS WHERE GUEST <> 'CODD')
is true if at least one guest is not called CODD.WHERE NOT EXISTS (SELECT * FROM GUESTS WHERE GUEST <> 'CODD')
is true if no guest is not called CODD, that is, if every guest is called CODD.
The double negative in the previous example is an SQL implementation of the universal quantifier FORALL, see 'A Guide to DB2' by C. J. Date for more information on EXISTS and FORALL.
Retrieval with ALL, ANY, SOME
Subselects that return a set of values may be used in the quantified predicates ALL, ANY or SOME. ThusWHERE PRICE < ALL (subselect)
selects rows where the price is less than every value returned by the subselectWHERE PRICE < ANY (subselect)
selects rows where the price is less than at least one of the values returned by the subselect
Select room types and hotel codes for rooms with a price that differs from that of each room at Hotel Skyline:SELECT DISTINCT ROOMTYPE, HOTELCODE FROM ROOM_PRICES WHERE PRICE <> ALL (SELECT PRICE FROM ROOM_PRICES WHERE HOTELCODE = 'SKY');
If the result of the subselect is an empty set, ALL evaluates to true, while ANY or SOME evaluates to false.
An alternative to using ALL, ANY or SOME in a value comparison against a general sub-select, is to use EXISTS or NOT EXISTS to see if values are returned by a sub-select which only selects for specific values. For example:
Select the room type, price and hotel code for rooms which have the same price as a room at the hotel Skyline.SELECT ROOMTYPE, PRICE, HOTELCODE FROM ROOM_PRICES WHERE PRICE = ANY (SELECT PRICE FROM ROOM_PRICES WHERE HOTELCODE = 'SKY');
is equivalent toSELECT ROOMTYPE, PRICE, HOTELCODE FROM ROOM_PRICES RP WHERE EXISTS (SELECT * FROM ROOM_PRICES WHERE HOTELCODE = 'SKY' AND RP.PRICE = PRICE);
The UNION operator combines the results of two or more subselect clauses.
UNION first merges the result tables specified by the separate subselects and then eliminates duplicate rows from the merged set.
Select the codes for hotels which are in Stockholm or have single rooms with showers:SELECT HOTELCODE FROM HOTEL WHERE CITY = 'STOCKHOLM' UNION SELECT DISTINCT HOTELCODE FROM ROOMS WHERE ROOMTYPE IN ('NSSGLS','SSGLS');
The result is obtained by merging the results of the two subselects and eliminating duplicates:SELECT HOTELCODE SELECT DISTINCT HOTELCODE FROM HOTEL FROM ROOMS WHERE CITY = 'STOCKHOLM'; WHERE ROOMTYPE IN ('NSSGLS','SSGLS');
giving the result table:
To retain duplicates in the result table, use UNION ALL in place of UNION, see the Mimer SQL Reference Manual for details.
Columns which are merged by UNION must have compatible data types (numerical with numerical, character with character).
Subselects addressing more than one result column are merged column by column in the order of selection. The number of columns addressed in each subselect must be the same.
About Column Names
The column names in the result of a UNION are taken from the names in the first subselect. Use labels in the first subselect to assign different column names to the result table:
Merge the codes and names of hotels in Stockholm with the hotel codes and room type for rooms which are more expensive than any room at the St. George hotel.SELECT HOTELCODE AS CODE, NAME AS NAME_OR_TYPE FROM HOTEL WHERE CITY = 'STOCKHOLM' UNION SELECT HOTELCODE, ROOMTYPE FROM ROOM_PRICES WHERE PRICE > (SELECT MAX(PRICE) FROM ROOM_PRICES WHERE HOTELCODE = 'STG');
Subselects merged by UNION may not include an ORDER BY clause. However, the result of the UNION query may be ordered with an ORDER BY clause placed after the last query in the UNION.
UNION may not be used within a nested subselect. However, the results of nested queries may be joined by UNION.
More about Unions
Unions can also be used to combine information from the same table:
Find the highest and lowest prices for rooms at the Hotel Skyline:SELECT 'HIGHEST' AS PRICE, MAX(PRICE) AS AMOUNT FROM ROOM_PRICES WHERE HOTELCODE = 'SKY' UNION SELECT 'LOWEST', MIN(PRICE) FROM ROOM_PRICES WHERE HOTELCODE = 'SKY' ORDER BY AMOUNT;
Outer Joins and Unions
Unions can also be used to perform outer joins, joining information in a table or tables with information not listed in those tables (i.e. information that is null). For example:
List the room types available for each hotel code. Include a row for hotel codes which do not have a given room type with a shower:SELECT DISTINCT H.HOTELCODE, ROOMTYPE FROM ROOMS R, HOTEL H WHERE R.HOTELCODE = H.HOTELCODE UNION SELECT DISTINCT H.HOTELCODE, 'NO '|| ROOMTYPE AS ROOMTYPE FROM HOTEL H, ROOMS WHERE H.HOTELCODE = ROOMS.HOTELCODE AND NOT EXISTS (SELECT * FROM ROOMS R WHERE R.HOTELCODE = H.HOTELCODE AND ROOMTYPE LIKE '%S') ORDER BY HOTELCODE;
Note: UNION statements including DISTINCT treat NULL values as duplicates.
Unions and NULL
In UNION queries, the keyword NULL can be included in the column list of one or both of the queries, so that columns not represented in all of the queries in the statement are retained in the result set.
Upright Database Technology AB
Voice: +46 18 780 92 00
Fax: +46 18 780 92 40