Evaluation of Expressions in DBMS

 

So far we have seen how to write a query in SQL. Now let us try to understand how DBMS evaluates the query written in SQL. i.e.; how it breaks them into pieces to get the records quickly.

There are two methods of evaluating the query.

Materialization

In this method, queries are broken into individual queries and then the results of which are used to get the final result. To be more specific, suppose there is a requirement to find the students who are studying in class ‘DESIGN_01’.

SELECT * FROM STUDENT s, CLASS c 
     WHERE s.CLASS_ID = c.CLASS_ID AND c.CLASS_NAME = ‘DESIGN_01’;

Here we can observe two queries: one is to select the CLASS_ID of ‘DESIGN_01’ and another is to select the student details of the CLASS_ID retrieved in the first query.
The DBMS also does the same. It breaks the query into two as mentioned above. Once it is broken, it evaluates the first query and stores it in the temporary table in the memory. This temporary table data will be then used to evaluate the second query.

This is the example of two level queries in materialization method. We can have any number of levels and so many numbers of temporary tables.

Although this method looks simple, the cost of this type of evaluation is always more. It takes the time to evaluate and write into temporary table, then retrieve from this temporary table and query to get the next level of result and so on. Hence cost of evaluation in this method is:
Cost = cost of individual SELECT + cost of write into temporary table

Pipelining

In this method, DBMS do not store the records into temporary tables. Instead, it queries each query and result of which will be passed to next query to process and so on. It will process the query one after the other and each will use the result of previous query for its processing.

In the example above, CLASS_ID of DESIGN_01 is passed to the STUDENT table to get the student details.

In this method no extra cost of writing into temporary tables. It has only cost of evaluation of individual queries; hence it has better performance than materialization.

There are two types of pipelining:

Demand Driven or Lazy evaluation

In this method, the result of lower level queries are not passed to the higher level automatically. It will be passed to higher level only when it is requested by the higher level. In this method, it retains the result value and state with it and it will be transferred to the next level only when it is requested.

In our example above, CLASS_ID for DESIGN_01 will be retrieved, but it will not be passed to STUDENT query only when it is requested. Once it gets the request, it is passed to student query and that query will be processed.

Producer Driven or Eager Pipelining

In this method, the lower level queries eagerly pass the results to higher level queries. It does not wait for the higher level queries to request for the results. In this method, lower level query creates a buffer to store the results and the higher level queries pulls the results for its use. If the buffer is full, then the lower level query waits for the higher level query to empty it. Hence it is also called as PULL and PUSH pipelining.

There are still more methods of pipelining like Linear and non-linear methods of pipelining, left deep tree, right deep tree etc.

Translate »