Structural Query Language (SQL)
Queries
Select
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
- Target-list
- A list of attributes of relations in relation list
as
and =
both can name fields
- Can additionally contain arithmetic (ex:
age1 = S.age - 5
or 2 * S.age as age2
)
- Relation-list
- A list of relation names
- Possibly with a range-variable after each name
- If multiple are included, the cross product is taken
- Qualification: Comparisons
- Formatted
attr1 op const
or attr1 op attr2
- Where op is one of <,>,=,≤,≥,=
- Combined using
AND
, OR
, and NOT
- String comparison can also be done with
LIKE
_
stands for any one character and
%
stands for 0 or more arbitrary characters.
DISTINCT
: optional keyword indicating that the answer should not contain duplicates
- Can add
ORDER BY attribute-list
and LIMIT
to make things run faster
- Grouping
- Add
GROUP BY grouping-list HAVING group-qualification
- grouping list = grouping attributes
- target list = contains grouping attribute names or terms with aggregate operations
- group-qualification = boolean conditions on grouping attributes or group aggregates
- Each answer tuple corresponds to a group, and these attributes must have a single value per group
- One answer tuple is generated per group
- Grouping additionally allows you to include the attribute that is was grouped over alongside an aggregation in the select
Nested Queries
- A
WHERE
, FROM
, and HAVING
clauses can contain an SQL query
- Using an
IN
within the WHERE
connecting to the nested clause is common
- Semantics
- Conceptual nested loop evaluation
- For each top level tuple, check the qualification by computing the subquery
- Using the result
Tuple IN R
Val > ANY/EVERY R
- `Tuple = ANY/EVERY R
Joins
- Can specify natural, outer, inner
- Uses bag semantics
- Theta join:
R JOIN S ON <condition>
- Outer Joins:
R [NATURAL] [LEFT/RIGHT/FULL] OUTER JOIN S [ON <condition>]
- left = pad dangling tuples of R only
- right = pad dangling tuples of S only
- full = pad both (default)
Set Operations
INTERSECT
, UNION
, MINUS/EXCEPT
- Use set semantics by default
- Because these are done via sorting, so it is cheap to also do deduplication afterwards
- Can opt back into bag semantics by adding
ALL
after the operation
Aggregate Operators
COUNT (*)
COUNT ([DISTINCT] A)
SUM ([DISTINCT] A)
AVG ([DISTINCT] A)
MAX (A)
MIN (A)
Evaluation
- Semantics of an SQL query are defined using a conceptual evaluation strategy
- Compute the cross-product of the relationship-list
- Discard the resulting tuples if they fail qualifications
- Delete attributes that are not in the target-list
- If distinct is specified, eliminate duplicate rows
- Conceptual Evaluation Strategy != Actual Evaluation Strategy
- The actual database uses a more efficient approach that generates the same answer
Data Modification Language
- A modification command does not return a result but does change the database in some way
Insert
INSERT INTO <relation> VALUES (<list of values>)
- Can specify attribute order
INSERT INTO table(attr1, attr2) VALUES (val1, val2)
- Can use subqueries to insert data from one table into another
INSERT INTO <relation> (<subquery>)
Delete
DELETE FROM <relation> WHERE <condition>
- Where can be a full query condition
- If where is not specified, it just deletes everything
Updates
UPDATE <relation>
SET <list of attribute assignments>
WHERE <condition on tuples>
Optimizations
- Push selection down the tree
- Change the order of joints (start a join with the smaller table)