1.6. Subqueries#

Last updated: September 13, 2024

A parenthesized SQL query statement (a subquery) can be used as a value in various places of a larger SQL query. We could use subqueries as scalars or sets.

1.6.1. Scalar Subqueries#

If a subquery returns a single tuple with a single attribute value, it can be treated as a scalar in expressions. Suppose we want to collect all the stops that happened at the same location as id = 123. We could issue a query as follows, with the parethesized statement as the subquery:

SELECT S1.id, S1.race, S1.location
FROM stops S1
WHERE S1.location = (
    SELECT S2.location FROM stops S2 WHERE S2.id = 123
);

Note that when using subqueries, it’s important to define relations with relevant variables (i.e., S1 and S2) such that when accessing attributes, it’s clear which relation’s attribute is being accessed. We could also rewrite this query to use a CTE (i.e., WITH statement) instead of a subquery:

WITH location_123 AS (
    SELECT location FROM Stops WHERE id = 123
)
SELECT S1.id, S1.race, S1.location
FROM Stops as S1, location_123
WHERE S1.location = location_123.location;

1.6.2. EXISTS#

We can use the EXISTS or NOT EXISTS keywords in WHERE clauses to use results of a subquery in set form. For example, suppose we want to determine all the Stops that are the only one in their zipcode:

WITH stops_zips AS (
    SELECT * FROM stops NATURAL JOIN zips
)
SELECT *
FROM stops_zips SZ1
WHERE NOT EXISTS (
    SELECT *
    FROM stops_zips SZ2
    WHERE SZ1.zipcode = SZ2.zipcode
            AND SZ1.id != SZ2.id
);

In the above query, the subquery returns a set of all zips that have more that one stop, i.e. the WHERE keyword finds all the tuples that have the same zipcode as another, but with a unique ID, which is then used with the NOT EXISTS keyword to return locations of stops that do not have an entry in the subquery set.