Multidimensional Data Model
- Collection of numeric measures, which depend on a set of dimensions
- Dimension hierarchies
- For each dimension, the set of values can be organized into a hierarchy
Implementation
- MOLAP
- Multidimensional data is stored physically in a multidimensional array
- Fast array operations (reading), expensive set operations
- ROLAP
- Multidimensional data is stored as relations
- Fast set operations (set, union, etc), meh array operations
- Implementation
- The main relation (which relates dimensions to a measure) is called the fact table
- Each dimension an have additional attributes and an associated dimension table
- Fact tables are much larger than dimensional tables
- Star Schema
- One fact table with many dimension tables
- Because the fact table is in the middle and links all of the dimension tables together
Marginals
- Compresses over a dimension of a hypercube
- Aggregates over dimension(s) and stores as a slice
- Visualized as next to the cube
- Speeds up other queries
Cube Operator
- If there are k dimensions, there ark 2^k possible sql group by queries that can be generated through pivoting on a subset of dimensions
GROUP BY attr1, attr2, ... WITH CUBE
- Equivalent to rolling up on all subsets of
{attr1, attr2, ...}
- Generates a table with the schema of
{attr1, attr2, ...}
where the attributes that were aggregated over for each row is null