In this post we’re going to take a look at how we can use different forms of database normalization to normalize our designs, and overcome their violation. Database normalization is a database design technique that consists of 1NF, 2NF, and 3NF normalization forms. Database normalization ensures no redundancy of data in tables.
Lets take a look at the three basic forms of normalization with their appropriate examples:
1 NF: 1 NF say there should be No repeating column
In this form we will see how repeating column can occur and the way to remove the repeating column.
Above figure showing that, in customer table the column order is repeating per customer, it not satisfying the rules of 1 NF. This violation left the data inconsistency and duplicity in database.
Disadvantage of avoiding 1 NF Rules:
- Improper Database design structure: suppose a customer need to order 4 orders, but you can see in above image there are space for only 3 orders. So you cannot able to store order number 4, for that by looking on above figure you need to alter table to add a new column order4.
- Wastage of space: A second customer arrives and he order only two order Order1 and Order2. Now the table for this customer only using two column order1 and order2 and the space for third Column is empty and space for Order3 remain unused
- Duplicity : the column is repeating if you avoid the 1 NF rules
Implementing 1NF database Design: to overcome of duplicity issue, I am going to split table into two separate tables. The following image is a implementation of rules of 1NF form. I have split the
Repeating column in another table called order table.
You can see MANOJ has order 3 items in order table with ORDERID 1, 2, 3. And ASHOK only ordered 2 items ORDERID 4 and 5.
Having this type of implementation 1NF have following advantages:
Advantage of having 1 NF Rules:
Proper database design structure:
We do not need to alter tables if customers order a new order. A new order should be recorded in order table with having unique Key OrderId.
If customers order more than one order item or only one order, then the spaces of these records are used without any wastage in order table.
No duplicity of Column
By implementation of 1NF design, you can see that 1NF have removed the redundancy of column
- 2NF say all attribute depend upon full primary key.
- The table should satisfy the condition of 1NF
- Tables have proper relationship using foreign key.
In following image, you can see the CUSTOMERORDERDETAIL table. So in following table we have two primary keys CUSTOMERID AND ORDERID.
So in this table customer1 has ordered electronic, base metal and agriculture and customer 2 has order hard metal and soft metal. I am storing this information along with customer name. But problem in this table is that for each time I have the customer id I have to put the customer name and phone. If MANOJ has ordered 10 orders then I will have to store his name and phone number ten times so this is called redundancy.
Disadvantage of avoiding 2NF rules
We can see that customer name and phone are not fully depended on order id, it depends upon customer id irrespectively with the order id the customer has the same name. Customer name only depends upon partial relationship that is customer id. This left inconsistent data in our database.
Slow DML queries
If I have to update the phone number of MANOJ with customer id 1 than I have to update phone number same number of redundancy of customer id 1. It makes our DML query slow because of processing redundancy data.
Implementing 2NF Database design: to overcome the redundancy and partial dependency, I am going to split this table. If we will not store the customer name and phone in CUSTOMERODERDETAIL table then we don’t have the problem. I am going to split this table into two parts of table along with relationship. So we have two type of information in this table customer information and order information. By making join on customer table and order table we can get the information that is in CUSTOMERORDERDETAIL.
I have split the CUSTOMERORDERDETAIL table to two parts. And I have made the join on these tables so that I can get all information that is in CUSTOMERORDERDETAIL
So if you look following image I have store the phone number and customer name of particular CUSTOMERID only one time. Similarly I have store ORDERID only one in order table.
Consistent data: one piece of information store in one place
Fast DML queries: 2nF make DML query fast because we do not have redundant data.
- 3NF say that all column non transitively depend upon primary key
- Must satisfied all condition of 1NF and 2NF
So by above rules we can say that all the non key attributes should be non transitively depend on the primary key.
Let see following figure, in following figure I have invoice table. InvoiceName is a primary key.
So city attribute should be non transitively depend upon InvoiceName and country attribute should be non transitively depend upon InvoiceName. Here we can see that country is not directly depending upon InvoiceName. We can derive the country from the city.
In above example we can see the transitive violation Ashok belong to India, because Ashok belong to Mohali. Manoj belong to India because manoj belong to panchkula this called transitive information that need to avoid.
Disadvantage of avoiding 3NF rules:
- Transitive data
- Redundant data
Implementing 3NF design: To overcome the issue of transitive information I have split the transitive data to new table called tbllocation
Now in invoice table ASHOK belong to MOHALI and in location table MOHALI belong to India. So now I have removed the transitive information in invoice table.