Every database which has tables and constraints need not be a relational database system. Any database which simply has relational data model is not a relational database system (RDBMS). There are certain rules for a database to be perfect RDBMS. These rules are developed by Dr Edgar F Codd (EF Codd) in 1985 to define a perfect RDBMS. For a RDBMS to be a perfect RDBMS, it has to follow his rules. But no RDBMS can obey all his rules.
EF Codd has developed 13 rules for a database to be a RDBMS. According to him, all these rule help to have perfect RDBMS and hence correct data and relation among the objects in database. But none of the database follows all these rules; but obeys to some extent. For example, oracle follows only 8.5 Codd’s rules.
Let us see the rules one by one.
This is the foundational Rule. This rule states that any database system should have characteristics as relational, as a database and as a management system to be RDBMS.
That means a database should be a relational by having the relation / mapping among the tables in the database. They have to be related to one another by means of constraints/ relation. There should not be any independent tables hanging in the database.
RDBMS is a database i.e.; it stores the data in a well organized form called tables. It should be able to handle large amount of information too. In short, it should meet the objectives of a database.
RDBMS is management system – that means it should be able to manage the data, relation, retrieval, update, delete, permission on the objects. It should be able handle all these administrative tasks without affecting the objectives of database. It should be performing all these tasks by using query languages.
A database consists of lot of data – may be user data and the data about these data or metadata. Each group of these data must be stored in a table in the form of rows and columns. Each cell in the table should have these datas. The order of rows and columns in the table should not affect the meaning of the table. Each cell should have single data. There should not be any group/range of values separated by comma, space or hyphen (Normalized data). This should be the only way to store the data in a database. This rule is satisfied by all the databases.
Order of storing personal details about ‘James’ and ‘Antony’ in PERSON table should not have any difference. There should be flexibility of storing them in any order in a row. Similarly, storing Person name first and then his address should be same as storing address and then his name. It does not make any difference on the meaning of table.
Guaranteed Access Rule
This rule refers to the primary key. It states that any data/column/attribute in the table should be able logically accessed by using the table in which it is stored, the primary key column of the table and the column which we want to access. When combination of these 3 is used, it should give the correct result. Any column/ cell value should not be directly accessed without specifying the table and primary key.
Address of Kathy STUDENT + STUDENT_ID (Kathy) + ADDRESS is the right way of getting any cell value.
Address of Kathy Troy should be able to access like this.
Systematic Treatment of NULL
This rule states about handling the NULLs in the database. As database consists of various types of data, each cell will have different datatypes. If any of the cell value is unknown, or not applicable or missing, it cannot be represent as zero or empty. It will be always represented as NULL. This NULL should be acting irrespective of the datatype used for the cell. When used in logical or arithmetical operation, it should result the value correctly.
Adding NULL to numeric 5 should result NULL –
5+ unknown = unknown 5+ NULL = NULL
5+ NULL! = 5 or 0
It should not result in any zero or numeric value. DBMS should be strong enough to handle these NULLs according to the situation and the datatypes.
Active Online Catalog
This rule illustrates data dictionary. Metadata should be maintained for all the data in the database. These metadata should also be stored as tables, rows and columns. It should also have access privileges. In short, these metadata stored in the data dictionary should also obey all the characteristics of a database. Also, it should have correct up to date data. We should be able to access these metadata by using same query language that we use to access the database.
SELECT * FROM ALL_TAB; -- ALL_TAB is the table which has the table definitions that the user owns and has access. It is queried using the same SQL query that we use in the database.
Comprehensive Data Sub language Rule
Any RDBMS database should not be directly accessed. It should always be accessed by using some strong query language. This query language should be able to access the data, manipulate the data and maintain the consistency and integrity of the database. They query should make sure that the transaction is fully complete or not done at all.
SQL is a structured query language which support creating tables / views/ constraints/indexes, accessing the records of tables/views (SELECT), manipulating the records by insert/delete/update, provides security by giving different level of access rights (GRANT and REVOKE) and integrity and consistency by using constraints.
Any database without any query language is not a RDBMS. Database can be accessed by using query language directly or using them in the application.
View Updating Rule
Views are the virtual tables created by using queries to show the partial view of the table. That is views are subset of table, it is only partial table with few rows and columns. This rule states that views are also be able to get updated as we do with its table.
Suppose we have created a view on Employee table, in which we have details of the employees who work for particular department, say ‘Testing’. Here STUDENT is the whole table and STUDENT_TEST is the view with Testing Employees. According to this rule, we should be able to update the records in STUDENT_VIEW.
But in real database systems, we cannot give this privilege on views. Basic intension of creating the view is to give the group of data to the user in the form of table. When lengthy queries have to be written to get some details from the database, view shortens the length of the query and gives more meaningful and shorter query. In such case, updating the view is not feasible. Although updating the view will update the table used for creating it, it is not recommended by most of the database. Hence this rule is not used in most of the database.
High-level insert, update, and delete
This rule states that every query language used by the database should support INSERT, DELETE and UPDATE on the records. It should also support set operations like UNION, UNION ALL, MINUS, INTERSECT and INTERSECT ALL. All these operation should not be restricted to single table or row at a time. It should be able to handle multiple tables and rows in its operation.
Suppose employees got 5% hike in a year. Then their salary has to be updated to reflect the new salary. Since this is the annual hike given to the employees, this increment is applicable for all the employees. Hence, the query should not be written for updating the salary one by one for thousands of employee. A single query should be strong enough to update the entire employee’s salary at a time.
Physical Data Independence
If there is any change in the physical storage of the data, it should not affect the data at the logical or external view.
If the data stored in one disk is transferred to another disk, then the user viewing the data should not feel the difference or delay in access time. The user should be able to access the data as he was accessing before. Similarly, if the file name for the table is changed in the memory, it should not affect the table or the user viewing the table. This is known as physical independence and database should support this feature.
Logical Data Independence
This is similar to physical data independence. Here if there are any changes to the logical view, then it should not be reflected in the user view.
If we split the EMPLOYEE table according to his department into multiple employee tables, the user viewing the employee table should not feel that these records are coming from different tables. These split tables should be able to get joined and show the result. In our example we can use UNION and display the results to the user.
But in ideal scenario, this is difficult to achieve since all the logical and user view will be tied so strongly that they will be almost same.
Database should be able apply integrity rules by using its query languages. It should not be dependent on any external factor or application to maintain the integrity. The keys and constraints in the database should be strong enough to handle the integrity. A good RDBMS should be independent of the frontend application. It should at least support primary key and foreign key integrity constraints.
Suppose we want to insert an employee for department 50 using an application. But department 50 does not exists in the system. In such case, the application should not perform the task of fetching if department 50 exists, if not insert the department and then inserting the employee. It should all handled by the database.
The database can be located at the user server or at any other network. The end user should not be able to know about the database servers. He should be able to get the records as if he is pulling the records locally. Even if the database is located in different servers, the accessibility time should be comparatively less.
When a query is fired in the database, it will be converted into low level language so that it can be understood by the underlying systems to retrieve the data. In such case, when accessing or manipulating the records at low level language, there should not be any loopholes that alter the integrity of the database. In other words, even thought the query written does not change the integrity of the tables, the converted low level language should be same as the query written. It should not be converted into some other low level language which changes the data integrity in the database or performs some unwanted actions in the database.
Update Student’s address query should always be converted into low level language which updates the address record in the student file in the memory. It should not be updating any other record in the file nor inserting some malicious record into the file/memory.