Schema Refinement

  • Problems caused by bad schema design
    • Redundant storage
    • Update anomaly: one occurrence is changed, but not all occurrences
    • Insertion anomaly: cannot insert a piece of data without first inserting another
    • Deletion anomaly: valid fact is lost when a tuple is deleted
  • In poor design, one ER entity becomes multiple tuples

Decomposition

  • Main refinement technique
  • Ex: replacing ABCD with AB and BCD, or ACD and ABD
  • Do we need to decompose a relation? Use FDs and normal forms
  • Two important properties of decomposition:
    • Lossless-join
      • It should be possible to project the original relations onto the decomposed schema and then reconstruct the original
    • Dependency-preservation (decide efficiency in checking the IC’s)
      • it should be possible to check in the projected relations whether all the given FD’s are satisfied
  • Trade-offs: Query Performance vs. Data Redundancy

Methods