Problems with a Bad DB Design
● Redundancy : Duplication of Data
● Insert Anomalies: It is not possible to insert some data with out
inserting some other data.
● Delete Anomalies: It is not possible to delete some data with out
loosing some other data.
● Update Anomalies: If we want to modify some data, we have to
modify all the copies of it.
● Memory Wastage: Redundant storage of data always leads to
memory wastage.
Normalization
● Normalization is the process of schema refinement to
remove redundant storage of data in the database.
● We normalize the database through decomposition.
● Decomposition is the process of dividing a bigger table into
smaller multiple tables.
● To ensure the consistency of database the decomposition must
be loss-less decomposition.
● Lossy decomposition leads to inconsistency of database.
Normalization
● Based on to which extent we want to normalize the database,
we have different normal forms:
➢First Normal Form (1NF)
➢Second Normal Form (2NF)
➢Third Normal Form (3NF)
➢Boyce Codd Normal Form (BCNF)
➢Fourth Normal Form (4NF)
➢Fifth Normal Form (5NF)
Functional Dependency
● In a Given Relation R, X and Y are attributes.
Attribute Y is functionally dependent on
attribute X, if each value of X determines
exactly one value in Y.
● This is Represented as X -> Y
● X Determines the value of Y or
● Y is Functionally Dependent on X
● Ex: Marks -> Grade
● Studentid -> StudentName
Determinant:
● Attribute X can be defined as determinant if it
uniquely defines the attribute value Y in a given
relation R.
● It is Represented as X -> Y, which means
● Attribute X decides attribute Y.
● X is called Determinant.
Functional Dependency types:
1)Fully Functional Dependency.
2)Partial Functional Dependency.
3)Transitive Functional Dependency.
Let us consider the Student_Cource_Result Table.
Sid and Cid both together is called the composite Key
1) Fully Functional Dependency
Ex: Sid + Cid -> Marks
Both studentid and courceid determines Marks.
2) Partial Functional dependency
Ex: sid -> sname ( Sid is alone enough to determine the
Sname, Cid is not Required)
cid -> cname ( Cid is alone enough to determine the
Cname, Sid is not Required)
3) Transitive Functional Dependency
Consider:
SidCid->Marks-> Grade.
● In the above representation Sid and Cid Determines
Marks, and Marks Determines Grade.
● That is Sid Cid determines Grade.
Note: Both Partial and Transitive Dependencies must be
eliminated to make the Database design Good.
First Normal Form (1NF)
● A Relation ‘R’ is said to be in first normal form if and only
if all the attributes of ‘R’ are atomic in nature.
● Composite and Multi valued attributes should be avoided.
● Atomic: The Smallest levels to which data may be broken
down and remains meaningful.
Relation which is not in 1NF
Relation in 1NF
Second Normal Form (2NF)
● A Relation ‘R’ is said to be in second normal form if and
only if: (i) it is in first normal form.
(ii) no partial dependency exists between
non key attributes and key attributes.
Relations in 2NF
Relations in 2NF
Third Normal Form (3NF)
● A Relation ‘R’ is said to be in third normal form if and
only if: (i) it is in second normal form.
(ii) no transitive dependency exists between
non key attributes and key attributes.
Boyce Codd Normal Form (BCNF)
● A Relation ‘R’ is said to be in BCNF if and only
(i) it is in 3NF
(ii) all the determinants are key Attributes.
● BCNF is a strong 3NF. But not every 3NF relation is
in BCNF
● If a table has only one non-composite candidate key
and if it is in 3NF, then the table will also be in BCNF.
Normalization Merits and Demerits
Merits:
● Removes the redundancy to the greater extent.
● Removes insert, delete and update anomalies.
Demerits:
● Performance of Data retrieval (select) operations will be
affected.
● Sometimes full normalization may not be desirable.
Fourth Normal Form (4NF)
● A Relation ‘R’ is said to be in fourth normal form if and only
if: (i) it is in BCNF.
(ii) it does not contain more than one multi
valued dependency.
● Multi valued dependency is the special case of Functional
dependency
X->Y (Every X value determines exactly one Y value)
X->>Y (Every X value may determine more than one Y
value)
EmpName ProjectName AssistName
Smith X John
smith X Anna
smith Y John
smith Y Anna
Course Teacher Book
DBMS YRM Korth
DBMS YRM Raghuram
Krishnan
DBMS KSH Korth
DBMS KSH Raghuram
Krishnan
DBMS CSR Korth
DBMS CSR Raghuram
Krishnan
Course Teacher Course Book
DBMS YRM DBMS Korth
DBMS KSH
DBMS Raghuram
DBMS CSR Krishnan
Fifth Normal Form (5NF)
● A Relation ‘R’ is said to be in fifth normal form if and
only if: (i) it is in 4NF.
(ii) it does not contain join dependency.
● A Relation that has a join dependency can not be
divided into two (or more) relations such that the
resulting relations can be combined to form the
original relation.
● 5NF is also called as PJNF (Project Join Normal
Form).
Department Course Student
Department Course Course Student
CSE CP ED Ravi
ME ED BEE Ramesh
ECE BEE CP Rajesh
Department Student
CSE Ravi
ME Ramesh
ECE Rajesh
Proj Project Emp Emp Category Hourly
id name id name wages
E501 Ravi A 600
P101 Banking E502 Ramesh B 500
E503 Rajesh C 400
P102 Shopping E501 Ravi A 600
E507 Siva B 500