ScholarQuill logoScholarQuillUniversity Notes
  • Notes
  • Past Papers
  • Blogs
  • Todo
Login
ScholarQuill logoScholarQuillUniversity Notes
Login
NotesPast PapersBlogsTodo
More
SubjectsDiscussionCGPA CalculatorGPA CalculatorStudent PortalCourse Outline
About
About usPrivacy PolicyReportContact
Notes
Past Papers
Blogs
Todo
Analytics
    Current Subject
    🧩
    Database Systems
    CC-215
    Progress0 / 34 topics
    Topics
    1. Basic Database Concepts2. Database Approach vs File Based System3. Database Architecture4. Three Level Schema Architecture5. Data Independence6. Relational Data Model7. Attributes8. Schemas9. Tuples10. Domains11. Relation Instances12. Keys of Relations13. Integrity Constraints14. Relational Algebra15. Selection in Relational Algebra16. Projection in Relational Algebra17. Cartesian Product in Relational Algebra18. Types of Joins19. Normalization20. Functional Dependencies21. Normal Forms22. Entity-Relationship Model23. Entity Sets24. Attributes in Entity-Relationship Model25. Relationship in Entity-Relationship Model26. Entity-Relationship Diagrams27. Structured Query Language (SQL)28. Joins in SQL29. Sub-Queries in SQL30. Grouping and Aggregation in SQL31. Concurrency Control32. Database Backup and Recovery33. Indexes34. NoSQL Systems
    CC-215›Normalization
    Database SystemsTopic 19 of 34

    Normalization

    6 minread
    1,063words
    Intermediatelevel

    Normalization in Database Design

    Normalization is a process used in database design to organize the data in a way that minimizes redundancy and dependency. The goal of normalization is to reduce the possibility of data anomalies and ensure that the database structure is efficient, logical, and consistent.

    Normalization typically involves decomposing a large, complex table into smaller, simpler tables, and establishing relationships between them. This helps improve the integrity of the data, making it easier to maintain and update.

    Normalization follows a series of stages called normal forms. There are several normal forms, and each normal form builds on the previous one, with the primary goal of reducing redundancy and preventing anomalies (insert, update, delete anomalies).

    Key Objectives of Normalization:

    1. Eliminate Redundancy: Redundant data (such as repeating values) leads to wasted space and can cause inconsistencies in data updates.
    2. Improve Data Integrity: By organizing data into logical units, normalization helps ensure that the data is accurate and consistent.
    3. Simplify Queries: Normalized data structures make queries more efficient by ensuring that related data is logically grouped.
    4. Minimize Update, Insert, and Delete Anomalies: By reducing data redundancy, normalization prevents problems where updating one piece of data leads to inconsistencies across the database.

    Normal Forms

    Normalization proceeds through a series of Normal Forms (NF). Each normal form addresses specific types of redundancy or anomaly.

    1. First Normal Form (1NF)

    A relation is in 1NF if:

    • All attributes contain only atomic (indivisible) values.
    • Each column contains only one value per row (no repeating groups or arrays).
    • Each record (row) is unique.

    Key points of 1NF:

    • There should be no multi-valued attributes (for example, a column storing multiple phone numbers).
    • Each column must contain values of a single type.
    Example (Before 1NF):
    StudentID Name Phone Numbers
    1 Alice 12345, 67890
    2 Bob 98765
    After converting to 1NF:
    StudentID Name Phone Number
    1 Alice 12345
    1 Alice 67890
    2 Bob 98765
    • The "Phone Numbers" attribute has been broken down into individual records, making the data atomic.

    2. Second Normal Form (2NF)

    A relation is in 2NF if:

    • It is in 1NF.
    • Every non-key attribute is fully functionally dependent on the entire primary key (i.e., there are no partial dependencies).

    Key points of 2NF:

    • A relation is in 2NF if there are no attributes that depend on only part of a composite primary key.
    • This applies only to tables with composite primary keys (i.e., primary keys made of more than one column).
    Example (Before 2NF):

    Consider a table storing information about students and their courses:

    StudentID CourseID Instructor InstructorPhone
    1 C101 Dr. Smith 12345
    1 C102 Dr. Johnson 67890
    2 C101 Dr. Smith 12345

    Here, the primary key is a composite of StudentID and CourseID. The problem is that Instructor and InstructorPhone depend only on CourseID, not the entire composite key.

    After converting to 2NF:
    • Create two tables to remove partial dependency:

    Student-Course Table:

    StudentID CourseID
    1 C101
    1 C102
    2 C101

    Instructor Table:

    CourseID Instructor InstructorPhone
    C101 Dr. Smith 12345
    C102 Dr. Johnson 67890
    • Now, there are no partial dependencies, and each non-key attribute is fully dependent on the entire primary key.

    3. Third Normal Form (3NF)

    A relation is in 3NF if:

    • It is in 2NF.
    • There are no transitive dependencies; that is, no non-key attribute depends on another non-key attribute.

    Key points of 3NF:

    • Every non-key attribute must be directly dependent on the primary key, not on other non-key attributes.
    • Eliminates transitive dependencies, which occur when one non-key attribute depends on another non-key attribute.
    Example (Before 3NF):

    Consider a table that stores information about employees and their departments:

    EmployeeID Name DeptID DeptName DeptLocation
    1 Alice D1 HR New York
    2 Bob D2 IT San Francisco
    3 Charlie D1 HR New York

    Here, DeptName and DeptLocation are dependent on DeptID, not directly on EmployeeID. Thus, we have a transitive dependency (Employee → DeptID → DeptName, DeptLocation).

    After converting to 3NF:
    • Split the table into two tables:

    Employee Table:

    EmployeeID Name DeptID
    1 Alice D1
    2 Bob D2
    3 Charlie D1

    Department Table:

    DeptID DeptName DeptLocation
    D1 HR New York
    D2 IT San Francisco
    • Now, there are no transitive dependencies. Each non-key attribute is only dependent on the primary key.

    4. Boyce-Codd Normal Form (BCNF)

    A relation is in BCNF if:

    • It is in 3NF.
    • For every functional dependency, the left side (determinant) is a superkey.

    Key points of BCNF:

    • A relation is in BCNF if it does not contain any exceptions to the rule that every determinant must be a candidate key.
    Example (Before BCNF):

    Consider a table where StudentID and CourseID are both keys:

    StudentID CourseID Instructor InstructorPhone
    1 C101 Dr. Smith 12345
    1 C102 Dr. Johnson 67890
    2 C101 Dr. Smith 12345

    Here, Instructor and InstructorPhone depend on CourseID, but CourseID is not a superkey.

    After converting to BCNF:
    • Split into two tables:

    Student-Course Table:

    StudentID CourseID
    1 C101
    1 C102
    2 C101

    Course-Instructor Table:

    CourseID Instructor InstructorPhone
    C101 Dr. Smith 12345
    C102 Dr. Johnson 67890
    • Now, the relation is in BCNF because every determinant (i.e., CourseID) is a superkey.

    5. Fourth Normal Form (4NF)

    A relation is in 4NF if:

    • It is in BCNF.
    • There are no multi-valued dependencies.

    Key points of 4NF:

    • A multi-valued dependency occurs when one attribute determines multiple values of another attribute, but there is no logical relationship between them.
    Example (Before 4NF):
    EmployeeID Skill Language
    1 Programming English
    1 Management Spanish

    Here, an employee can have multiple skills and multiple languages. This is a case of multi-valued dependency.

    After converting to 4NF:
    • Split into two tables:

    Employee-Skill Table:

    EmployeeID Skill
    1 Programming
    1 Management

    Employee-Language Table:

    EmployeeID Language
    1 English
    1 Spanish
    • Now, there are no multi-valued dependencies.

    Summary of Normal Forms:

    • 1NF: Eliminate duplicate records and ensure all attributes are atomic.
    • 2NF: Remove partial dependencies (when a non-key attribute depends only on part of a composite primary key).
    • 3NF: Remove transitive dependencies (non-key attributes dependent on other non-key attributes).
    • BCNF: Ensure every determinant is a superkey.
    • 4NF: Eliminate multi-valued dependencies.

    Normalization ensures that the database schema is clean, consistent, and free from redundancy. The trade-off is that normalization can lead to more tables and more complex queries due to the need for joins.

    Previous topic 18
    Types of Joins
    Next topic 20
    Functional Dependencies

    Past Papers

    Open this section to load past papers

    Click on Show Past Papers to see past papers.
    On This Page
      Reading Stats
      Est. reading time6 min
      Word count1,063
      Code examples0
      DifficultyIntermediate