Database is composed of lots of tables and views. Each table will have lots of data in them. Main purpose of storing all these data in the database is to get them handy when it is required. But when a table is very huge, searching for single record is really a difficult task. The usual method of searching the record in the table is to start from the beginning record by record, till we get the record. But it is not feasible when table is large. It will take lot of time.
In order to reduce the time to fetch the record, another table like structure is introduced where pointer to the memory address is stored for each record. This is called as index. This index table will have the columns which are frequently used for retrieval of records and the address of data block in the memory. It speeds up the data retrieval time and makes the fetch faster. Primary key itself acts as an index in a table. We can create any number of indexes, but we have to be cautious while creating index since it slows down data insertion and updation speed. Creating an index depends on the table depends on the frequency of table access, the columns/ combination of columns which are used for fetching the records etc. Before creating the index, developer has to thoroughly understand all these about the table, and then create the index. Otherwise it will degrade the performance of record fetch as well as other transaction like insert/ delete/ update.
Index can be created on one or more columns. If the index is created on two or more columns to form one index, then the index is called composite index or concatenated index.
Index is usually created to enhance the performance of retrieval of records in huge tables. But sometimes, improper creation of index and usage will cause bad performance of the query. Hence proper management of the index is required. We have to follow few rules for the index management.
- Create Indexes on huge tables to enhance the performance of record retrieval. Query retrieval in small tables will be usually faster. Hence they need not have any indexes.
- Indexes are created on frequently accessed tables – means tables on which frequent record fetches are applied. Usually the columns which are frequently involved in fetching the records – the columns which comes in the condition clause (WHERE Clause) of a query are used to index. Indexes should be created on columns with unique values and having range of values. So proper table and columns need to be indexed.
- Columns with non-null values – If we create index on columns with NULL value, it is of no use. We cannot fetch the required record.
- If there are lots of updates on the column, such columns should not be used in index.
- Drop the indexes, if they are no more required in the database. Unwanted and unused indexes always lead to bad performance of a query. It sometimes deviate the execution path of the query. That is if there is a better performance of a query by full table scan or by using some other index, but because of the existence of unwanted index, the DBMS will be forced to use this index. Hence affecting the performance.
- Order of the columns in the index also matters in the performance. When Indexes are created on set of columns, it is usually created on the order of the columns. That means when we fire fetch query, the execution path will first fetch for the address location based on first column in the index, and the subsequent columns. Hence if we use less frequently accessed columns first in the index, it will not boost the performance. Usually indexes involving two or more columns, the order should be most frequently accessed columns to less frequently accessed columns.
- Any table can have any number of indexes. But these indexes should increase the performance of query retrieval. But if there is lot of insertion / deletion/updation, then having more index is not a good idea. These transactions will require index table to be updated accordingly. For example, if we have inserted some records into table, then index table also have to be inserted with index columns. If there is an update/delete, index table is also have to be updated/deleted. This is an overhead to the database and it will decrease the performance of insertion / deletion / updation.
- New indexes should be created once records are inserted into the table. Otherwise it will degrade the performance of insertion as it has to update each record for index.
Let us see how an index is created, deleted and updated using SQL.
Syntax for creating an index:
CREATE INDEX index_name ON table_name (column/s); –Index can be created on single column or combinations of columns.
CREATE INDEX idx_stdName ON STUDENT (STD_NAME); — creates index on student name
CREATE INDEX idx_empName ON EMPLOYEE (EMP_FIRST_NAME, EMP_LAST_NAME); — combinations of columns are used to create index.
If only one column is used to create an index, then it is called as single column index and if there is more than one column, then it is called as composite index. Implicit indexes are the one which are indexes by default. That means database considers them as index when the table is created. Primary key and unique key columns are implicit indexes. Unique index can be explicitly created as below. This index enhances the performance as well as avoids duplicate column value.
CREATE UNIQUE INDEX index_name ON table_name (column/s);
CREATE UNIQUE INDEX idx_persLicence ON PERSON (LICENCE_NUMBER);
If any index is no longer required then it can be dropped.
DROP INDEX index_name; — where index_name is the name of the index to be dropped.
DROP INDEX idx_persLicence;
Index can be renamed too.
ALTER INDEX index_name RENAME TO new_index_name; — where index_name is renamed to new_index_name
ALTER INDEX idx_stdName RENAME TO idx_stud_Name;
Types of Index
There are different types of indexes. They are used / created depending on the requirement. It all depends on what type of data exists in those columns, and frequency of access.
This is the default index created when we create any index. We need not explicitly specify the index as B- tree. The syntax CREATE INDEX will create B-tree index. These indexes will have the column value and the pointer to the address location in the memory where the actual record is stored. This index will have intermediary nodes where it will have pointers to the address location and the leaf node where the actual data. These indexes are suitable for columns which are more frequently accessed and primary keys. These indexes sort the records and store it in the memory.
There are 4 sub-types of B-Tree index
Here data itself acts as an index and whole record is stored in the B+ index file. That is all columns of the table together considered as one index. A primary key have to be defined so that each row can be uniquely identified. The records are stored in the same structure as they are created.
Here records are stored in the descending order of index key column in B+ tree files.
Reverse key indexes
In this method of indexing, the index key column value is stored in the reverse order.
For example, say index is created on STD_ID in the STUDENT table. Suppose STD_ID has values 100,101,102 and 103. Then the reverse key index would be 001, 101,201 and 301 respectively.
B+ tree Cluster Index
Here, cluster key of the table is used in the index. Thus, each index in this method will point to set of records with same cluster keys.
This type of index is similar to any normal index. But this is applicable to partitioned tables. The index created on the whole table is known as global index. The indexes specific to each partition are called local index.
These indexes are used on less unique columns. For example, columns like Gender or columns with values Yes/No, or very limited values are used to have bitmap index. In this method, the column values are stored as bits. There will be as many bits as the rows exist in the table and each bit represents respective row. The bit value will be 1 if that row contains the value, else it will be zero.
In the below example, table has 4 rows and hence bitmap index on Gender and Semester has 4 bits, each bit representing the each row. In the gender index for male, we can see that only first row has ‘M’ and hence the index has value ‘1000’ and female index is ‘0111’ because 2nd,3rd and 4th rows have gender as female.
Bitmap index can be created on set of columns from same or different tables. Such indexes are called bitmap join indexes.
This type of index will be created on columns of tables after applying some functions on those columns. Some of the functions that are frequently used are UPPER, LOWER, TRUNC etc. These indexes are created because, if we have index only on the column, and the query with this column has some functions on it, then the index will not be used. For example look at below query.
SELECT * FROM EMP WHERE UPPER (EMP_NAME) = ’ADAM’;
In above case, suppose we have index on EMP_NAME. But this index will not be used to parse above query as it has function defined on it. Suppose we have created function based index instead of normal index, then the index will be used to traverse the query.
CREATE INDEX idx_empname ON EMP (UPPER (EMP_NAME));
this index is created by the user for the application specific domain. They need not be actual index. They can exist in the database or in the form of file. For example some documents that are used in the application, pictures/videos used in the applications.
Creating a new index consumes lot of resource. But the actual behavior of the index will be known only after it is created. But when the developer and testers are checking for the performance issues, they cannot afford for creating and deleting the index each time. Hence virtual indexes are created which are not really created but it will affect the query plans and performance to show up as if it is present in the database. Thus it reduces the resource consumption and boosts the performance test. If this virtual index works as required, then a real index is created latter.