Multidimensional Data Model

  • Collection of numeric measures, which depend on a set of dimensions
  • Dimension hierarchies
    • For each dimension, the set of values can be organized into a hierarchy

Implementation

  • MOLAP
    • Multidimensional data is stored physically in a multidimensional array
    • Fast array operations (reading), expensive set operations
  • ROLAP
    • Multidimensional data is stored as relations
    • Fast set operations (set, union, etc), meh array operations
    • Implementation
      • The main relation (which relates dimensions to a measure) is called the fact table
      • Each dimension an have additional attributes and an associated dimension table
      • Fact tables are much larger than dimensional tables
    • Star Schema
      • One fact table with many dimension tables
      • Because the fact table is in the middle and links all of the dimension tables together

Marginals

  • Compresses over a dimension of a hypercube
  • Aggregates over dimension(s) and stores as a slice
  • Visualized as next to the cube
  • Speeds up other queries

Cube Operator

  • If there are k dimensions, there ark 2^k possible sql group by queries that can be generated through pivoting on a subset of dimensions
  • GROUP BY attr1, attr2, ... WITH CUBE
  • Equivalent to rolling up on all subsets of {attr1, attr2, ...}
  • Generates a table with the schema of {attr1, attr2, ...} where the attributes that were aggregated over for each row is null