There are many terms in SQL whose usage is almost similar and when it comes to implementing them, it always leads to confusion among developers. Correlated Subquery and Noncorrelated Subquery are one of them. Understanding the distinction between them is important for optimizing SQL queries to receive desired output. In this tutorial, we will learn how Correlated Subquery and Noncorrelated Subquery are different from each other and determine scenarios in which each is appropriate to use.
What is a Subquery?
A subquery is a SQL query that is nested within another SQL query. While writing SQL statements, subqueries can be used in various parts such as SELECT, WHERE, HAVING or FROM clauses, to fetch data based on specific criteria. Subqueries offer developers some great functionality such as filtering, joining and manipulating data.
Syntax:
SELECT column01, column02
FROM table01
WHERE column01 = (
SELECT column03 FROM table02
WHERE condition
);
Code language: SQL (Structured Query Language) (sql)
Example:
We are going to use the SALE table for the rest of our examples.
Let’s say we want to find the names of employees who have made sales of min of 2000 from the SALE table.
SELECT NAMES
FROM SALE
WHERE ID IN (
SELECT ID
FROM SALE
GROUP BY ID
HAVING MIN(SALES_DONE) >= 2000
);
Code language: SQL (Structured Query Language) (sql)
Output:
As we can see in the above query, the output will be dependent on the data in the SALE table. The record of employees’ names who meet the criteria is later fetched.
Also Read: Joins Vs Subqueries in MySQL
Correlated Subquery in SQL
Correlated subqueries are dependent on the outer query, this means that the subquery uses data value from the outer query to complete its task. This can lead to decreased performance and increased execution time, especially when we are dealing with large datasets.
Syntax:
SELECT column01, column02
FROM table01
WHERE comparison_operator (SELECT column01, column02
FROM table02
WHERE table02.column = table01.column);
Code language: SQL (Structured Query Language) (sql)
Example:
Let’s say we want to find out the record of employees whose sales are greater than the average sales amount across all the employees in the department.
SELECT e.NAMES, e.SALES_DONE
FROM SALE e
WHERE e. SALES_DONE > (SELECT AVG(e1.SALES_DONE)
FROM SALE e1
WHERE e1.DEPARTMENT_ID = e.DEPARTMENT_ID)
ORDER BY e.SALES_DONE DESC;
Code language: SQL (Structured Query Language) (sql)
Output:
Here the outer query selects the name of employees and sales done, whereas the inner query calculates the average salary within each department.
Noncorrelated Subquery in SQL
A noncorrelated subquery is an independent subquery which can be executed on its own and it does not depend on the results of an outer query. Noncorrelated queries are executed once and then used in the main query. Using noncorrelated subqueries can sometimes lead to better performance and execution time as compared to correlated subqueries, especially in the case of large datasets.
Syntax:
SELECT column01, column02
FROM table01
WHERE column01 = (
SELECT column03
FROM table02
WHERE condition
);
Code language: SQL (Structured Query Language) (sql)
Example:
Let’s say we want to fetch the records of the names and sales amounts of employees who achieved the maximum sales amount in the department from the SALE table.
SELECT NAMES, SALES_DONE
FROM SALE
WHERE SALES_DONE = (
SELECT MAX(SALES_DONE)
FROM SALE
);
Code language: SQL (Structured Query Language) (sql)
Output:
In the above query, the inner subquery is independent of the outer subquery, we can know this by just executing the inner subquery.
SELECT MAX(SALES_DONE)
FROM SALE;
Output:
Best Practices for Correlated and Noncorrelated Subqueries
- Optimizing correlated subqueries: If it is necessary to use a correlated subquery, consider optimizing the subquery so that execution time can be minimized and use indexes.
- Performing noncorrelated subqueries: Use a noncorrelated subquery to enhance the performance by seeing if the required information can be fetched independently of the outer query.
- Testing and measures: To determine the most efficient solution for the use case, always test and measure the performance of different query approaches with the specified data.
Primary Difference Between Correlated and Noncorrelated Subqueries
- Complexity: Correlated queries can be more complex in comparison to noncorrelated subqueries.
- Execution: Correlated queries run for each row in the outer query whereas, noncorrelated runs once.
- Dependency: Correlated fetched data values from the outer query whereas, noncorrelated is independent.
- Performance: Noncorrelated can be turned out as faster because of fewer executions.
Conclusion
In this tutorial, we have learned how different correlated subquery is from noncorrelated subquery with the help of multiple examples. Also, we have seen in which situations we can use them to get good performance and can be optimised based on the specified datasets. This is one of the most asked topics in SQL interviews. We hope you enjoyed it.
Reference
https://stackoverflow.com/questions/17268848/difference-between-subquery-and-correlated-subquery