Today, I am continuing our exploration of the SELECT reserved word by talking about sub-queries. A sub-query is a query that is nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another sub-query. Sub-queries can be used anywhere an expression is allowed or can be nested inside another sub-query.
I will be using the AdventureWorks Sample database supplied by Microsoft during this talk.
1 2 3 4 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Use the correct database USE AdventureWorks2008R2 GO </span> |
One usage a sub-query is with comparison operators in the where clause. The example below returns all the sales people by name and id who are above average sellers.
1 2 3 4 5 6 7 8 9 10 11 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- All sales people who have above average sales - 8 rows SELECT P.BusinessEntityID as Id, cast(P.LastName as char(20)) as LastName, cast(P.FirstName as char(20)) as FirstName FROM Sales.SalesPerson S1 INNER JOIN Person.Person P ON S1.BusinessEntityID = P.BusinessEntityID WHERE S1.SalesYtd > (SELECT AVG(S2.SalesYTD) FROM Sales.SalesPerson S2) </span> |
Another usage of a sub-query is with the IN (inclusion) clause. In short, the sub-query or inner select statement, returns a list of values in which column or field has to match one value in the list. The example below shows all female sales agents.
1 2 3 4 5 6 7 8 9 10 11 12 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- List all the female sales agents - 7 rows SELECT P.BusinessEntityID as Id, cast(P.LastName as char(20)) as LastName, cast(P.FirstName as char(20)) as FirstName FROM Person.Person P WHERE P.PersonType = 'SP' AND P.BusinessEntityID IN ( Select BusinessEntityID from HumanResources.Employee Where Gender = 'F' ) </span> |
The opposite of inclusion is exclusion. This condition is tested with the NOT IN (exclusion) clause. The example below shows all male sales agents.
1 2 3 4 5 6 7 8 9 10 11 12 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- List all the male sales agents - 10 rows SELECT P.BusinessEntityID as Id, cast(P.LastName as char(20)) as LastName, cast(P.FirstName as char(20)) as FirstName FROM Person.Person P WHERE P.PersonType = 'SP' AND P.BusinessEntityID NOT IN ( SELECT BusinessEntityID FROM HumanResources.Employee WHERE Gender = 'F' ) </span> |
The TSQL code below shows the number of distinct employees and total rate change records. This information is relevant to the existence and non-existence query examples coming up later.
1 2 3 4 5 6 7 8 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Total distinct employees SELECT COUNT(DISTINCT H.BusinessEntityID) as TotalEmployees FROM HumanResources.EmployeePayHistory H -- Total rate change records SELECT COUNT(H.BusinessEntityID) AS TotalRateChgs FROM HumanResources.EmployeePayHistory H </span> |
We are going to use the EXIST clause to determine all employees who have had at least on rate change in the past. Existence is determined by the correlated sub-query that does not return a empty set, no rows.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Show employees who have had rate changes - 13 rows SELECT P.BusinessEntityID as Id, cast(P.LastName as char(20)) as LastName, cast(P.FirstName as char(20)) as FirstName FROM Person.Person P WHERE EXISTS ( SELECT COUNT(*) FROM HumanResources.EmployeePayHistory H WHERE P.BusinessEntityID = H.BusinessEntityID HAVING COUNT(*) > 1 ) </span> |
The NOT EXIST clause determines if a empty set is returned by the sub-query. The next business problem is to find all the employees who have not had a rate change. One would think that add the word NOT to the TSQL statement above would solve our problem. However, it returns any non-matches including contacts who are not employees. We have to add an additional EXISTS clause to weed out all the non-employees.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Show employees who have had stayed at the same rate - 277 rows SELECT P.BusinessEntityID as Id, cast(P.LastName as char(20)) as LastName, cast(P.FirstName as char(20)) as FirstName FROM Person.Person P WHERE -- Includes non-employees + employees with one record change NOT EXISTS ( SELECT COUNT(*) FROM HumanResources.EmployeePayHistory H WHERE P.BusinessEntityID = H.BusinessEntityID HAVING COUNT(*) > 1 ) -- Restrict to just employees AND EXISTS ( SELECT * FROM HumanResources.EmployeePayHistory H WHERE P.BusinessEntityID = H.BusinessEntityID ) </span> |
Please note, queries that test exclusion or non existence always execute long than queries that test inclusion or existence. The reason for this is that a condition is meet when the first match is fond. On the other hand, all values have to be checked when testing for negativity.
Sub-queries can be used any where an expression can be used. The following example shows the list price of mountain bikes, the average prices of a mountain bike, and the difference from the average price for the given mountain bike. In short, two sub-queries are used as expressions.
1 2 3 4 5 6 7 8 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Show all mountain bikes, list, avg, and dif - 32 rows SELECT Name, ListPrice, (SELECT AVG(ListPrice) FROM Production.Product) AS Average, ListPrice - (SELECT AVG(ListPrice) FROM Production.Product) AS Difference FROM Production.Product WHERE ProductSubcategoryID = 1; </span> |
The ANY, SOME and ALL keywords were added to the SQL Server in the 2005 edition. The following example finds all customers who have not been given a territory id which is assigned to a sales person. This includes all rows in which territory id is null.
1 2 3 4 5 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Same thing as IN clause, returns 19820 customers are not assigned a territory SELECT COUNT(DISTINCT CustomerID) as Customers_Wo_Territory FROM Sales.Customer WHERE TerritoryID <> ANY (SELECT TerritoryID FROM Sales.SalesPerson); </span> |
The SOME is a great way to test for at least one condition is meet. The Adventure Works database has 32 mountain bikes in its product line. You have been tasked to find all Customer’s who have purchased a mountain bike and return their customer id. The query below does just that.
1 2 3 4 5 6 7 8 9 10 11 12 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- 4300 Unique Customers who order at least one mountain bike SELECT DISTINCT CustomerId FROM Sales.SalesOrderHeader H WHERE H.SalesOrderId IN ( -- 6185 Unique Orders SELECT DISTINCT SalesOrderId FROM Sales.SalesOrderDetail D WHERE D.ProductId = -- 32 Mountain Bikes SOME ( SELECT ProductId FROM Production.Product where name like '%mountain-%' ) ) </span> |
Another cool feature of subqueries is that they can be used with inserts, updates and deletes – data manipulation statements.
Let’s make believe that your manager wanted to have a fire sale on the mountain bikes to reduce stock. He has asked you to reduce the price by 50 percent. You begin the transaction, you update 32 records in the product table. He comes by your desk within minutes and tells you that he wants to wait 2 weeks before making this change. Therefore, you rollback the changes. The TSQL below executes the series of tasks.
1 2 3 4 5 6 7 8 9 10 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Used with INS, UPD, DEL BEGIN TRAN UPDATE Production.Product SET ListPrice = ListPrice / 2 WHERE ProductID IN (SELECT ProductID FROM Production.Product WHERE ProductSubcategoryID = 1); ROLLBACK </span> |
Last but not least, I am going to end the discussion with the ALL keyword. I worked on this query for a while since real life examples lend themselves to using this keyword in a stored proceedure. For instance, the books on line example show a stored proceedure called DaysToBuild which takes an @OrderID and @NumberOfDays as input. If a order can be build in the number of days, a true statement is shown; otherwise a false statement is shown.
Since this is not a real example of using the ALL keyword in a select statement, I decided to introduce two powerful techniques. First, common table expressions (CTE’s) were added to SQL Server in the 2005 edition. They return back a dynamic view of the data. Second, a tally table is a table that contains numbers 1 to n. Armed with these two CTE’s, I am able to identify all the customers who have ordered 30 or more of mountain bikes.
Again, this example is artificial since we could get the answer by adding a HAVING clause to the mountain bikes SELECT statement to return the results with one query.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
<span style="color: #008000; font-family: Lucida Console; font-size: small;">-- Count of mountain bike models by customer WITH cte_MOUNTAIN_BIKES(CustomerID, UniqueBikes) AS ( -- 4300 rows SELECT H.CustomerID, COUNT(DISTINCT D.ProductId) AS UniqueBikes FROM Sales.SalesOrderHeader H JOIN Sales.SalesOrderDetail D on H.SalesOrderID = D.SalesOrderID JOIN Production.Product P ON D.ProductID = P.ProductID WHERE P.name like '%mountain-%' GROUP BY H.CustomerID ), -- A tally table contains values 1 .. 30 cte_TALLY_TABLE(n) AS ( SELECT N FROM ( SELECT ROW_NUMBER() OVER ( ORDER BY SalesOrderID ) FROM Sales.SalesOrderHeader ) D ( N ) WHERE N < 30 ) -- Return all customers (7 rows) who have ordered 30 or more different bikes- SELECT * FROM cte_MOUNTAIN_BIKES WHERE UniqueBikes >= ALL (SELECT N FROM cte_TALLY_TABLE) ORDER BY UniqueBikes GO </span> |
The lesson learn from today’s talk is that sub-queries are very powerful for selecting data. The IN, EXIST, ANY, SOME and ALL keywords allow a SQL developer to test for all types of conditions in the result set. Data manipulation queries such as INSERTS, UPDATES and DELETES can use sub-queries to pin point what rows to update. Like views, deterministic sub-queries can be updatable but have rules that need to be followed.
Book marked, I love your site! :)
Thank you for the sensible critique. Me and my neighbor were just preparing to do some research on this. We got a grab a book from our area library but I think I learned more clear from this post. I am very glad to see such wonderful information being shared freely out there.
I love your writing style genuinely enjoying this site. “The glory of creation is in its infinite diversity.” by Gene Roddenberry.