A database is said to be in 5NF, if and only if,
- It’s in 4NF
- If we can decompose table further to eliminate redundancy and anomaly, and when we re-join the decomposed tables by means of candidate keys, we should not be losing the original data or any new record set should not arise. In simple words, joining two or more decomposed table should not lose records nor create new records.
Consider an example of different Subjects taught by different lecturers and the lecturers taking classes for different semesters.
Note: Please consider that Semester 1 has Mathematics, Physics and Chemistry and Semester 2 has only Mathematics in its academic year!!
In above table, Rose takes both Mathematics and Physics class for Semester 1, but she does not take Physics class for Semester 2. In this case, combination of all these 3 fields is required to identify a valid data. Imagine we want to add a new class – Semester3 but do not know which Subject and who will be taking that subject. We would be simply inserting a new entry with Class as Semester3 and leaving Lecturer and subject as NULL. As we discussed above, it’s not a good to have such entries. Moreover, all the three columns together act as a primary key, we cannot leave other two columns blank!
Hence we have to decompose the table in such a way that it satisfies all the rules till 4NF and when join them by using keys, it should yield correct record. Here, we can represent each lecturer’s Subject area and their classes in a better way. We can divide above table into three – (SUBJECT, LECTURER), (LECTURER, CLASS), (SUBJECT, CLASS)
Now, each of combinations is in three different tables. If we need to identify who is teaching which subject to which semester, we need join the keys of each table and get the result.
For example, who teaches Physics to Semester 1, we would be selecting Physics and Semester1 from table 3 above, join with table1 using Subject to filter out the lecturer names. Then join with table2 using Lecturer to get correct lecturer name. That is we joined key columns of each table to get the correct data. Hence there is no lose or new data – satisfying 5NF condition.
SELECT t3.Class, t3.Subject, t1.Lecturer FROM TABLE3 t3, TABLE3 t2, TABLE3 t1, where t3.Class = 'SEMESTER1' and t3.SUBJECT= 'PHYSICS' AND t3.Subject = t1.Subject AND t3.Class = t2.Class AND t1.Lecturer = t2.Lecturer;