Correlated Sub Queries in DBMS

Correlated Sub Queries are also similar to sub queries but here the outer query is executed first and inner query is executed for each records of outer query. That is inner query is executed as many times as the outer query results.
Suppose we need to select the students whose marks have been entered into MARKS table. (We can write much simpler query than below by not writing WHERE clause in the sub query, but below query is written to show how correlated sub query is written and executed)

SELECT * 
       FROM STUDENT s 
       WHERE STD_ID IN (SELECT STD_ID 	FROM MARKS m WHERE s.STD_ID = m.STD_ID);

Here we can see that outer query column and inner query column are joined to get the result. This query fetches all the records from STUDENT table and joins with the STD_ID in MARKS table. It returns the records only if there is a matching STD_ID in MARKS.

When we observe above query, we see that the STD_ID in the SELECT statement of sub query is not really required. It is simply used to maintain the structure of the SELECT statement. The main task of sub query here is to certify if the STD_ID exists in MARKS table or not by checking it in the WHERE clause and then display the result. Hence STD_ID used in the WHERE clause of outer query and in the SELECT statement of inner query does not have any significance.

Hence DBMS provides another clause EXISTS to use in such cases. The above correlated sub query can be rewritten as below (in either of way):

SELECT * 
       FROM STUDENT s 
       WHERE EXISTS 
	 (SELECT 1
		FROM MARKS m
	        WHERE s.STD_ID = m.STD_ID);

 

SELECT * 
       FROM STUDENT s 
       WHERE EXISTS 
	 (SELECT STD_ID 
		FROM MARKS m
	        WHERE s.STD_ID = m.STD_ID);

We can even use NOT EXISTS clause which performs opposite of EXISTS. Below query returns the students who didn’t get the marks yet.

SELECT * 
       FROM STUDENT s 
       WHERE NOT EXISTS 
	 (SELECT 1
		FROM MARKS m
	        WHERE s.STD_ID = m.STD_ID);

 

SELECT * 
       FROM STUDENT s 
       WHERE NOT EXISTS 
	 (SELECT STD_ID 
		FROM MARKS m
	        WHERE s.STD_ID = m.STD_ID);

We can use correlated sub queries as a column in the SELECT statement too. In below query AVG_MARK is a column got from correlated sub query which finds the average mark of each student.

SELECT
   STD_ID, 
   STD_NAME,
   (SELECT AVG (mark) 
      FROM MARKS m
      WHERE m.STD_ID = s.STD_ID) AS AVG_MARK
   FROM STUDENT s;

 Difference between Sub Query and Correlated Sub Query

Sub Query

Correlated Sub Query

Inner Query is executed First.

Outer Query is executed first.

Inner query is executed only once and its result is used by outer query.

Inner query is executed for each of the records that outer query returns.

Uses using =, <, >, >=, <=, IN, BETWEEN operators.

Can use using =, <, >, >=, <=, IN, BETWEEN operators, but it mainly uses EXISTS and NOT EXISTS clause.

Always outer query columns are compared with inner query but there are no explicit joins in the inner query with outer query columns.

There should be some joins between the outer and inner query columns in the inner query.

Is always used in the WHERE clause.

Is used in WHERE clause as well as columns of SELECT statement.

Depending on the number of columns returned by inner query, operators should be used in the outer query.

There is no restrictions on the operators if EXISTS or NOT EXISTS are used. The condition inside the inner query should be correct.

Performance is better as inner query is executed only once and outer query is executed based on the result of inner query.

It will be bit slow if the outer table has large number of records. This is because, when each record of outer query is retrieved, the inner query is executed. The number of execution of inner query depends on the number of records returned by the outer query.

Translate »