|
|
Retrieving Data from More than One Table
The examples presented up to now in this chapter have illustrated the essential features of simple
SELECTstatements with data retrieval from single tables. However, much of the power of SQL lies in the ability to perform joins through a single statement, i.e. to select data from two or more tables, using the search condition to link the tables in a meaningful way.The Join Condition
In retrieving data from more than one table, the search condition or join condition specifies the way the tables are to be linked. For example:
List the product name in addition to the EAN and price:
SELECT product, ean_code, price FROM items JOIN products ON items.product_id = products.product_id;The join condition here is
ITEMS.PRODUCT_ID = PRODUCTS.PRODUCT_ID, which relates the product identifier in tableITEMS(where codes are listed) to the product identifier in tablePRODUCTS(where names are listed).Returns:
Conceptually, the join first establishes a table containing all combinations of the rows in
PRODUCTSwith the rows inITEMS, then selects those rows in which the twoPRODUCT_IDvalues are equal. See Conceptual Description of the Selection Process for a fuller description of the conceptualSELECTprocess.This does not necessarily represent the order in which the operations are actually performed; the order of evaluation of a complex
SELECTstatement is determined by the SQL optimizer, regardless of the order in which the component clauses are written.Cross Products
Without the join condition, the result is a cross product of the columns in the tables in question, containing all possible combinations of the selected columns, for example:
SELECT product, ean_code, price FROM items, products;Returns:
PRODUCT 'Murder In The Cathedral' 'Murder In The Cathedral' 'Murder In The Cathedral' 'Murder In The Cathedral' ...It is easy to see that a carelessly formulated join query can produce a very large result table. Two tables of 100 rows each, for instance, give a cross product with 10,000 rows; three tables of 100 rows each give a cross product with 1,000,000 rows!
The risk of generating large (erroneous) result tables is particularly high in interactive SQL (e.g. when using Mimer BSQL), where queries are so easily written and submitted.
Simple Joins
In simple joins, all tables used in the join are listed in the
FROMclause of theSELECTstatement. This is in distinction to nested joins, where the search condition for oneSELECTis expressed in terms of anotherSELECT, see Nested Selects.Example:
SELECT product, ean_code, price FROM items, products WHERE items.product_id = products.product_id;SELECT *
The form
SELECT *may be used in a join query, but since this selects all columns in the result set, at least one column is often duplicated (a join condition column).Example:
SELECT * FROM items JOIN products ON items.product_id = products.product_id;Returns:
ITEM_ID PRODUCT_ID FORMAT_ID RELEASE_DATE STATUS PRICE STOCK REORDER_LEVEL EAN_CODE PRODUCER_ID IMAGE_IDColumns in the join query that are uniquely identified by the column name may be specified by name alone. Columns that have the same name in the joined tables must be qualified by their respective table names.
The same query as above, but only three columns are returned:
SELECT product, ean_code, price FROM items JOIN products ON items.product_id = products.product_id;Nesting Join Clauses
It is possible to nest join-clauses, for example:
List the category in addition to the EAN and price for any items released in December 1996:
SELECT ean_code, price, category FROM items JOIN formats ON items.format_id = formats.format_id JOIN categories ON categories.category_id = formats.category_id WHERE release_date BETWEEN date'1996-12-01' AND date'1996-12-30';Result:
Complex Search Conditions and Joins
A join query can join any number of tables using complex search conditions to select the relevant information from each table.
List the product for any items with a release date in the future along with the item price in both Swedish and Danish crowns (SEK and DKK respectively):
SELECT product, CAST(price * exchange_rate AS DECIMAL(12,2)) AS cost, currency FROM items, products, currencies WHERE release_date > CURRENT_DATE AND products.product_id = items.product_id AND currencies.code IN ('SEK', 'DKK') ORDER BY product, currency;Result:
In formulating a search condition for a join query, it can help to write out the columns that would appear in a complete cross-product of the tables. The search condition is then formulated as though the query was a simple
SELECTfrom the cross-product table.Outer Joins
The joins in the previous sections were all inner joins. In an inner join between two tables, only rows that fulfill the join condition are present in the result.
An outer join, on the contrary, contains non-matching rows as well. The outer join has two options,
LEFTandRIGHT.Left Outer Join
Example:
SELECT ean_code, release_date, producer FROM items LEFT OUTER JOIN producers ON items.producer_id = producers.producer_id WHERE ean_code >= 800000000000 ORDER BY ean_code;Result:
In the example above all rows from the table to the left in the join clause, i.e.
ITEMS, are present in the result; non-matching rows from thePRODUCERStable are filled with null values in the result.Observe the difference in result for the next statement and the previous one.
SELECT ean_code, release_date, producer FROM items LEFT OUTER JOIN producers ON items.producer_id = producers.producer_id AND ean_code >= 800000000000 ORDER BY ean_code;Result:
The reason is that conditions in the where clause are applied to the result of the join-clause and not to the joined tables as is the case with the conditions in the on-clause.
Right Outer Join
A right outer join will take all records from the table to the right in the join-clause.
Nesting Outer Joins
As with inner joins, it is possible to nest join-clauses. Nested joins can be of different types, i.e. both inner and outer joins.
The result of nested outer joins can be somewhat unexpected though, as it is the result of the first join-clause that is the left table in the next join, not the right table in the first join-clause.
Example:
SELECT * FROM tableA LEFT JOIN tableB ON tableA.id = tableB.id LEFT JOIN tableC ON tableA.id = tableC.idThis query does first perform
tableA LEFT JOIN tableB. The result is then used as left table when performingLEFT JOIN tableC.To make this query clearer, parentheses can be added as:
SELECT * FROM (tableA LEFT JOIN tableB ON tableA.id = tableB.id) LEFT JOIN tableC ON tableA.id = tableC.idNested Selects
A form of
SELECT, called a subselect, can be used in the search condition of aSELECTstatement to form a nested query.The main
SELECTstatement is then referred to as the outer select.Select the products that have a release date in the future.
SELECT product FROM products WHERE product_id IN (SELECT product_id FROM items WHERE release_date > CURRENT_DATE);Result:
To see how this works, evaluate the subselect first:
SELECT product_id FROM items WHERE release_date > CURRENT_DATE;Result:
Then use the result of the subselect in the search condition of the outer select:
SELECT product FROM products WHERE product_id IN (30206, 30618, 31082, 31083);Result:
Using Subselects
A subselect can be used in a search condition wherever the result of the subselect can provide the correct form of the data for the search condition.
Thus a subselect used with `
=' must give a single value as a result.A subselect used with
IN,ALLorANYmust give a set of single values, see Retrieval with ALL, ANY, SOME.A subselect used with
EXISTSmay give any result, see Retrieving Data Using EXISTS and NOT EXISTS.Examples:
WHERE column = (subselect) WHERE column IN (subselect) WHERE column = ALL (subselect) WHERE column = ANY (subselect) WHERE EXISTS (subselect)The
UNION,EXCEPTandINTERSECToperators can be used to combine two or more subselects in more complex statements, see Union, Except and Intersect Queries.Nested Queries
Many nested queries can equally well be written as simple joins. For example:
Select the products that have a release date in the future.
SELECT product FROM products WHERE product_id IN (SELECT product_id FROM items WHERE release_date > CURRENT_DATE);SELECT DISTINCT product FROM products, items WHERE products.product_id = items.product_id AND items.release_date > CURRENT_DATE;Both these queries give exactly the same result. In most cases, the choice of which form to use is a matter of personal preference. Choose the form which you can understand most easily; the clearest formulation is least likely to cause problems.
Subselects in Queries
Queries may contain any number of subselects, for example:
List the producers (manufacturers) which have items that are more expensive than any of the items produced by Sony.
SELECT producer FROM producers WHERE producer_id IN (SELECT producer_id FROM items WHERE price > (SELECT MAX(price) FROM items WHERE producer_id = (SELECT producer_id FROM producers WHERE producer = 'SONY')));Note the balanced parentheses for the nested levels.
It is particularly important at this level of complication to think carefully through the query to make sure that it is correctly formulated.
Often, writing some of the levels as simple joins can simplify the structure. The previous example may also be written:
SELECT DISTINCT producer FROM producers, items WHERE producers.producer_id = items.producer_id AND price > (SELECT MAX(price) FROM items, producers WHERE items.producer_id = producers.producer_id AND producer = 'SONY');Correlation Names
A correlation name is a temporary name given to a table to represent a logical copy of the table within a query.
There are three uses for correlation names:
Simplifying Complex Queries Using Correlation Names
Using short correlation names into complicated queries can make the query easier to write and understand, particularly when qualified table names are used:
SELECT mimer_store_music.artists.artist, mimer_store.product_details.* FROM mimer_store.product_details, mimer_store_music.titles, mimer_store_music.artists WHERE mimer_store_music.titles.item_id = mimer_store.product_details.item_id AND mimer_store_music.artists.artist_id = mimer_store_music.titles.artist_id ORDER BY mimer_store_music.artists.artist;SELECT art.artist, pdt.* FROM mimer_store.product_details AS pdt, mimer_store_music.titles AS ttl, mimer_store_music.artists AS art WHERE ttl.item_id = pdt.item_id AND art.artist_id = ttl.artist_id ORDER BY art.artist;The keyword
ASin theFROMclause may be omitted, but is recommended for clarity.About Correlation Names
Correlation names are local to the query in which they are defined.
When a correlation name is introduced for a table name, all references to the table in the same query must use the correlation name.
The following expression is not accepted:
... FROM mimer_store.product_details AS pdt, mimer_store_music.titles AS ttl, ... WHERE ttl.item_id = mimer_store.product_details.item_idJoining a Table with Itself Using a Correlation Name
Joining a table with itself allows you to compare information in a table with other information in the same table. This can be done with a correlation name.
Select all currencies with the same exchange rate:
SELECT currencies.currency, currencies.code, currencies.exchange_rate FROM currencies, currencies AS copy WHERE currencies.exchange_rate = copy.exchange_rate AND currencies.currency <> copy.currency;Result:
Here, the table
CURRENCIESis joined to a logical copy of itself calledCOPY.The first search condition finds pairs of currencies with the same exchange rate, and the second eliminates 'pairs' with the same currency name. Without the second condition in the search criteria, all currencies would be selected!
Without correlation names, this kind of query cannot be formulated. The following query would select all the currencies from the table:
SELECT currency, code, exchange_rate FROM currencies WHERE currencies.exchange_rate = currencies.exchange_rate;Outer References in Subselects Using Correlation Names
In some constructions using subselects, a subselect at a lower level may refer to a value in a table addressed at a higher level. This kind of reference is called an outer reference.
SELECT currency FROM currencies WHERE EXISTS (SELECT * FROM countries WHERE currency_code = currencies.code);This kind of query processes the subselect for every row in the outer select, and the outer reference represents the value in the current outer select row. In descriptive terms, the query says 'For each row in
CURRENCIES, select theCURRENCYcolumn if there are rows inCOUNTRIEScontaining the currentCODEvalue'.If the qualifying name in an outer reference is not unambiguous in the context of the subselect, a correlation name must be defined in the outer select.
A correlation name may always be used for clarity, as in the following example:
SELECT currency FROM currencies AS c WHERE EXISTS (SELECT * FROM countries WHERE currency_code = c.code);Retrieving Data Using EXISTS and NOT EXISTS
EXISTSis used to check for the existence of some row or rows which satisfy a specified condition.EXISTSdiffers from the other operators in that it does not compare specific values; instead, it tests whether a set of values is empty or not. The set of values is specified as a subselect.The subselect following the
EXISTSclause most often uses of `SELECT *' as opposed to `SELECT column-list' sinceEXISTSonly searches to see if the set of values addressed by the subselect is empty or not - a specified column is seldom relevant in the subquery.
EXISTS(subselect) is true if the result set of the subselect is not empty
NOT EXISTS(subselect) is true if the result set of the subselect is empty
SELECTstatements withEXISTSalmost always include an outer reference linking the subselect to the outer select.Examples of EXISTS
Find all currencies that are used in the COUNTRIES table:
SELECT currency FROM currencies AS c WHERE EXISTS (SELECT * FROM countries WHERE currency_code = c.code);Without the outer reference, the select becomes a conditional `all-or-nothing' statement: perform the outer select if the subselect result is not empty, otherwise select nothing.
List all products where the producer (manufacturer) is not known:
SELECT product FROM products WHERE EXISTS (SELECT * FROM items WHERE producer_id IS NULL AND product_id = products.product_id);Examples of NOT EXISTS
The next example illustrates
NOT EXISTS:List all products where the producer (manufacturer) is not known:
SELECT product FROM products WHERE NOT EXISTS (SELECT * FROM items JOIN producers ON items.producer_id = producers.producer_id WHERE product_id = products.product_id);Result:
Negated EXISTS
Negated
EXISTSclauses must be handled with care. There are two semantic `opposites' toEXISTS, with very different meanings:WHERE EXISTS (SELECT * FROM artists WHERE artist = 'Enigma')is true if at least one artist is called
Enigma.WHERE NOT EXISTS (SELECT * FROM artists WHERE artist = 'Enigma')is true if no artist is called
Enigma.WHERE EXISTS (SELECT * FROM artists WHERE artist <> 'Enigma')is true if at least one artist is not called
Enigma.WHERE NOT EXISTS (SELECT * FROM artists WHERE artist <> 'Enigma')is true if no artist is not called
Enigma, that is if every artist is calledEnigma.Retrieval with ALL, ANY, SOME
Subselects that return a set of values may be used in the quantified predicates
ALL,ANYorSOME. ThusWHERE PRICE < ALL (subselect)selects rows where the price is less than every value returned by the subselect
WHERE PRICE < ANY (subselect)selects rows where the price is less than at least one of the values returned by the subselect
Select countries that have an exchange rate of less than one:
SELECT country FROM countries WHERE currency_code <> ALL (SELECT code FROM currencies WHERE exchange_rate >= 1.0);If the result of the subselect is an empty set,
ALLevaluates to true, whileANYorSOMEevaluates to false.An alternative to using
ALL,ANYorSOMEin a value comparison against a general sub-select, is to useEXISTSorNOT EXISTSto see if values are returned by a sub-select which only selects for specific values. For example:Select countries where the associated currency code contains the letter 'E' as the middle character in the code:
SELECT country FROM countries WHERE currency_code = ANY (SELECT code FROM currencies WHERE code LIKE '_E_');SELECT country FROM countries AS c WHERE EXISTS (SELECT * FROM currencies WHERE code LIKE '_E_' AND code = c.currency_code);Union, Except and Intersect Queries
The
UNION,EXCEPTandINTERSECToperators combine the results of two select clauses.
UNIONfirst merges the result tables specified by the separate selects and then eliminates duplicate rows from the merged set. (UNION ALLdoes not eliminate duplicate rows.)
EXCEPTtakes the distinct rows from the first select and returns the rows that do not appear in the second select. (EXCEPT ALLdoes not eliminate duplicate rows.)
INTERSECTtakes the results of two selects and returns only rows that appear in both selects, after removing duplicate rows from the final result set. (INTERSECT ALLdoes not eliminate duplicate rows.)Columns which are merged by
UNION,EXCEPTandINTERSECTmust have compatible data types (numerical with numerical, character with character).Subselects addressing more than one result column are merged column by column in the order of selection. The number of columns addressed in each subselect must be the same.
The column names in the result of a
UNION,EXCEPTorINTERSECTare taken from the names in the first subselect. Use labels in the first subselect to assign different column names to the result table.In
UNION,EXCEPTandINTERSECTqueries, you may need to add an empty column so that columns not represented in both queries in the statement are retained in the result set. This is done by casting a NULL value to a matching datatype.Example
SELECT ean_code, release_date, producer FROM items INNER JOIN producers ON items.producer_id = producers.producer_id UNION ALL SELECT ean_code, release_date, CAST(NULL AS char) FROM items WHERE NOT EXISTS (SELECT * FROM producers WHERE items.producer_id = producers.producer_id)UNION Examples
Select the different codes for currencies and countries that start with the letter 'D':
SELECT code FROM currencies WHERE code LIKE 'D%' UNION SELECT currency_code FROM countries WHERE country LIKE 'D%';The result is obtained by merging the results of the two selects and eliminating duplicates:
SELECT code SELECT currency_code FROM currencies FROM currencies WHERE code LIKE 'D%; WHERE country LIKE 'D%';and the
UNIONgives the result table:To retain duplicates in the result table, use UNION ALL in place of UNION, see the Mimer SQL Reference Manual, UNION or UNION ALL, for details.
Merge the codes and names of currencies where the code begins with 'D' with the codes and names of the countries where the country begins with 'D':
SELECT code, currency AS currency_or_country FROM currencies WHERE code LIKE 'D%' UNION SELECT currency_code, country FROM countries WHERE country LIKE 'D%' ORDER BY code;Result:
CURRENCY_OR_COUNTRY Djibouti Djibouti Francs Danish Kronor Denmark Dominican Pesos Dominican Republic Algerian Dinars DominicaFind the lowest and highest exchange_rates:
Unions can be used to combine information from the same table.
SELECT 'Highest', MAX(exchange_rate) AS rate FROM currencies UNION ALL SELECT 'Lowest', MIN(exchange_rate) FROM currencies ORDER BY rate;Result:
List the EAN, price and producer for each item, also add each item where the producer is not known:
Unions can be used to perform OUTER JOIN's, joining information in a table or tables with information not listed in those tables (i.e. information that is null).
SELECT ean_code, release_date, producer FROM items JOIN producers ON items.producer_id = producers.producer_id UNION ALL SELECT ean_code, release_date, '*** undefined ***' FROM items WHERE NOT EXISTS (SELECT * FROM producers WHERE producer_id = items.producer_id) ORDER BY ean_code;Result:
EXCEPT Examples
Select the codes from currencies, except those that also are found in countries, starting with the letter 'D':
SELECT code FROM currencies WHERE code LIKE 'D%' EXCEPT SELECT currency_code FROM countries WHERE country LIKE 'D%';The result is obtained by taking the first result and then remove the rows also found in the second select, and finally eliminating duplicates:
SELECT code SELECT currency_code FROM currencies FROM currencies WHERE code LIKE 'D%; WHERE country LIKE 'D%';and the
EXCEPTgives the result table:To retain duplicates in the result table, use
EXCEPT ALLin place ofEXCEPT, see the Mimer SQL Reference Manual, EXCEPT or EXCEPT ALL, for details.INTERSECT Examples
Select the codes from currencies and countries that exist in both tables, starting with the letter 'D':
SELECT code FROM currencies WHERE code LIKE 'D%' INTERSECT SELECT currency_code FROM countries WHERE country LIKE 'D%';The result is obtained by taking the first result and then remove the rows also found in the second select, and finally eliminating duplicates:
SELECT code SELECT currency_code FROM currencies FROM currencies WHERE code LIKE 'D%; WHERE country LIKE 'D%';and the
INTERSECTgives the result table:To retain duplicates in the result table, use
INTERSECT ALLin place ofINTERSECT, see the Mimer SQL Reference Manual, INTERSECT or INTERSECT ALL, for details.
|
Mimer Information Technology AB Voice: +46 18 780 92 00 Fax: +46 18 780 92 40 info@mimer.se |
|
|