Megan Taylor

front-end dev, volunteacher, news & data junkie, bibliophile, Flyers fan, sci-fi geek and kitteh servant

Intro to Databases: Relational Design Theory, Motivation and Overview

Designing a database schema
usually many designs possible
some are much better than others
how do we choose?

Often use higher-level design tools, but…
some designers go straight to relations
useful to understand why tools produce certain schemas

very nice theory for relational database design

Example: College application info
SSN and name
colleges applying to
high schools attended (with city)
hobbies

Apply(SSN, sName, cName, HS, HScity, hobby)

Design anomalies
redundancy
update anomaly – update facts differently in different places
deletion anomaly – inadvertently delete

Different design for the same data – no anomalies, can reconstruct all original data

Student(SSN, sName)
Apply(SSN, cName)
HighSchool(SSN, HS)
Located(HS, HScity)
Hobbies(SSN, hobby)

could also move HScity to HighSchool and get rid of Located, or move hobby to Apply and get rid of Hobbies

best design depends on what data represents in the real world as well as removing anomalies.

Design by decomposition
start with “mega” relations containing everything
decompose into smaller, better relations with same info
can do decomposition automatically

Automatic decomposition
“mega” relations + properties of the data
system decomposes based on properties
final set of relations satisfies normal form (no anomalies, no lost information)

Properties and Normal Forms
functional dependencies > Boyce-Codd Normal Form
+ multivalued dependencies > Fourth Normal Form

Fourth Normal Form is stricter than BCNF

Functional Dependencies and Boyce-Codd Normal Form

Apply(SSN, sName, cName)

redundancy, update and deletion anomalies
storing SSN, sName pair once for each college

Functional Dependency: SSN -> sName
same SSN always has same sName
should store each SSN’s sName only once

BCNF: if A -> B then A is a key

Decompose: Student(SSN, sName) Apply(SSN, cName)

Multivalued Dependencies and Fourth Normal Form

Apply(SSN, cName, HS)

redundancy, update and deletion anomalies
multiplicative effect: student who applies to X college and attended Y high schools has X*Y tuples in the database
Not addressed by BCNF: no functional dependencies

Multivalued Dependencies SSN ->> cName
Given SSN has every combination of cName with HS
should store each cName and each HS for an SSN once

Fourth Normal FormL if A ->> B then A is a key

Decompose: Apply (SSN, cName) HighSchool(SSN, HS)

November 7, 2011 | Comments Off on Intro to Databases: Relational Design Theory, Motivation and Overview | Categories: Posts | Permalink

Comments are closed.

%d bloggers like this: