Relational Algebra
- A procedural language that is evaluated like a mathematical expression
- Based on mathematical foundations of set theory
- We don’t care about relationships vs entities here—they are all sets
Core
- Operations
- Select: σcA
- Picks rows that meet the condition
- C is a condition that refers to attributes
- Projection: ΠLA
- Selects certain columns
- L is an ordered list of attributes from the schema
- Alters the schema and data
- Needs duplicate removal afterwards
- Product: A×B
- Cartesian product of two sets
- Pair each tuple of A with each tuple of B
- Schema is the attributes of A and then B, ie
- Attributes with the same name will be A.name, B.name
- Theta-Join: R1⋈cR2
- Equivalent to σc(R1×R2)
- Common form of boolean condition: AθB where θ is =, <, >, etc.
- Natural-Join: R1⋈R2
- A frequent type of join
- Connects two relations by equating attributes of the same name (ie key to foreign key)
- Projects out one copy of the equated attributes (removes the duplicate key)
- Renaming: ρR1(A1,…,An)(R2)
- Gives a new schema to a relation
- Alternate notation: R1(A1,…An) := (R2)
Complex Expressions
- Parenthesis override order of operations (like normal algebra)
- Three representations
- Sequences of assignments
- Expressions with parenthesis
- Trees
- Leaves are operands (either variables of relations or constant relations)
- Inner nodes are operators
- Precedence
- Unary (select, project, rename)
- Product, Join
- Intersection
- Union, Difference
Bag Semantics
- Bags (aka multiset) is like a set but an element can appear more than once
- Sets are also bags that happen to be a set
- Used because deduplication is an expensive operation that requires sorting
- Most commercial databases use bags semantics unless the attribute is explicitly declared to be unique
- SQL is implemented on top of bag semantics
Operators
- Union: Add counts
- Intersection: The minimum counts of elements that appear in both
- Difference: The number of times an object appears in A minus the number of times in B (without going negative)
- Selection: Same
- Projection: Do not eliminate duplicates
- Product and Joins: Same
- Bag laws != Set laws (some but not all hold)
- Ex: commutative law for union holds for both (because addition is commutative)
- Ex: Union idempotent law (S∪S=S) does not hold because each element would be duplicated
Equivalence Laws
Extended
- Eliminate duplicates from bags: δ
- Sort tuples: τ
- Extended projection
- Arithmetic, duplication of columns
- L can include arbitrary arithmetic
- Grouping and aggregation: γL(R)
- L is a list of elements that are either
- Grouping from the base table
- Aggregation
- Ex: sum, average, min, max
- Outer Join
- Include “dangling tuples” (tuples that do not join with anything) in the result