Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

2.2 Extended RA Operators

Last updated: September 6, 2024

While the six primitive operators can compose many of the relational operations used in SQL and other relational data systems, you may have noticed that very few of them operate on changing the data themselves (beyond adding and removing tuples). To encompass other sorts of data operations, we turn to extended RA operators, that manipulate data values themselves, e.g., outer joins to include null values, aggregations to group data values, and so on.

For now, we cover precisely one extended RA operator, the grouping/aggregation operator, γ\gamma.

Grouping and Aggregation

Suppose we had input relation RR with schema (C1,,Cn)(C_1, \dots, C_n). The output relation resulting from grouping on attributes (B1,,Bn)(B_1, \dots, B_n) where {B1,,Bn}{C1,,Cn}\{B_1, \dots, B_n\} \subset \{C_1, \dots, C_n\} and computing aggregations agg1,,aggkagg_1, \dots, agg_k can be written as γB1,,Bn,agg1,,aggk(R)\gamma_{B_1, \dots, B_n, agg_1, \dots, agg_k}(R). By convention, the grouped attributes come first, followed by the aggregations. Yes, it’s unwieldy. Good times.

SQL query to Relational Algebra

Suppose that we have the following query:

SELECT a1, a2, …, an, agg1(d1), agg2(d2), …, aggp(dp)
FROM R1, R2, …, Rk
WHERE C
GROUP BY b1, b2, …, bm
HAVING H

We could write the SQL order of execution of the above query using extended relational algebra as follows:

πlist(σH(γb1,,bm,agg1(d1),,aggp(dp)(σC(R1×Rk))))\pi_{list} \biggl( \sigma_H \biggl( \gamma_{b1, \dots, bm, agg1(d1), \dots, aggp(dp)} \bigl( \sigma_C (R1 \times \dots Rk ) \bigr) \biggr) \biggr), where

A more formal definition of this grouping/aggregation operator is available on Wikipedia, though we note that their notation is slightly different.

We leave your practice of this operator to homework.