SQL JOINs are often a good solution over the subqueries, but not every time. There might be chances join statement will not work faster where the subquery will and vice-versa. New learners get easily confused between the JOIN statement and the subqueries and which one to use.
In this tutorial, we will see the differences between the join statement and the subquery and when to use which one. So, let’s get started!
Introduction to Joins and Subqueries
Let’s talk about the JOIN statement first. The JOIN statement is used to join the data of two or more tables and bring out the result as a single set of records. The joins are very useful when you have a relationship between two tables using the primary-foreign key.
The subqueries are also used to join the data of two or more tables. A subquery is also called an inner query or a nested query. A subquery is basically a query inside the query.
Note that, in the JOIN statement, only a single SELECT statement is present with the names of multiple tables. Whereas, in the subquery, there is another query with the SELECT statement present apart from the outer SELECT statement. That means, there can be multiple select statements present in the subqueries.
Subqueries and join statements can be used alternatively. However, sometimes the subquery becomes the only option to get the result. However, a join statement can be replaced with a very long subquery.
Note that, before the JOIN statement was introduced in MySQL, only subquery was the option to write the complex logic. That means, everything that we write using the JOIN, can be written using the subquery.
Let’s take a few examples to understand how the same logic can be written using a subquery as well as the JOIN.
When You Should Write Subqueries With JOINs
Note that, queries written using the JOIN statement are often more readable than the subqueries. Those are easy to understand and sometimes more efficient as well.
Here, we will create two tables to demonstrate the subquery and the JOIN statement for the same problem statements.
The first table is the product which consists of the product id, name and price. The second is the sales table which consists of the sales id, product id, sales year and the total sales amount.
Check the table description and the table data below.
Products Table description and the data:-
Sales Table description and the data:-
Example 1- Subquery Returns a Single Value
Here, the subquery returns only a single record which further is used by the outer query to find the records accordingly.
For example, we will find the product names having sales of greater than 3000 Rs.
SELECT * FROM products WHERE product_id=(SELECT product_id FROM sales WHERE sales_amount>=3000 AND product_id=products.product_id);Code language: SQL (Structured Query Language) (sql)
Now let’s write the query for the same problem using the JOIN.
SELECT p.product_id,p.product_name,p.product_price FROM products p JOIN sales ON p.product_id=sales.product_id WHERE sales_amount>=3000;Code language: SQL (Structured Query Language) (sql)
Here, we have simply joined two tables -products and sales, using the join statement and used the where clause to apply the filtering condition.
If you execute the above query, you will get the exact same output.
Example 2- Subquery Returns a Multiple Value
When the inner query (subquery) returns multiple records, we can use the IN keyword. On the other hand, the NOT IN keyword returns the data which does not include the result of the IN keyword.
Let’s take an example of it.
Here, we will write a query to return the product details which is sold at least one time. This means the products which have records in the sales table are sold at least once.
SELECT * FROM products WHERE product_id IN (SELECT product_id FROM sales); SELECT * FROM products WHERE product_id NOT IN (SELECT product_id FROM sales);Code language: SQL (Structured Query Language) (sql)
The first query returns the data of the products which are sold at least once. Whereas the second query returns the data of the products which are not sold for once.
As you can see in the result, the first query returns all product names. This is because all products have at least one sale. Therefore, the second query returns an empty set as there is no such product which does not have even a single sale.
Now let’s write the same queries using the JOIN.
SELECT DISTINCT p.product_id,p.product_name,p.product_price FROM products p JOIN sales ON p.product_id=sales.product_id; SELECT DISTINCT p.product_id,p.product_name,p.product_price FROM products p JOIN sales s ON p.product_id=s.product_id WHERE s.product_id IS NULL;Code language: SQL (Structured Query Language) (sql)
Here, we have used the DISTINCT clause to remove duplicate records in the first query. In the second query, we have used the IS NULL clause to check if the product id in the sales table is null.
Using the JOIN statement, the query has become simpler and more readable.
Example 3- Join Can Not Replace the Subquery
Now we will take an example where the JOIN can not be used to get the required output.
We will find the products which have prices above the average price of all products.
SELECT * FROM products WHERE product_price >= (SELECT AVG(product_price) FROM products);Code language: SQL (Structured Query Language) (sql)
Here, we calculate the average product price in the subquery. Later, we use that query to check if the product prices from the outer query are greater than it.
As you can see in the output, we have got the correct result.
Advantages and Disadvantages of JOIN
- Sometimes complex queries written using the JOIN executes faster.
- It becomes a lot easier to read and understand the query written with the JOIN instead of the subqueries.
- If the query is relatively smaller, the subquery is preferred over the join.
- It’s a bit confusing to understand which type of JOIN should be used to get the desired output.
Advantages and Disadvantages of Subqueries
- If we break down the complex logic into chunks, the subquery becomes more readable.
- Subqueries can bring any result that sometimes join can’t.
- The MySQL optimizer algorithms are so advanced that it automatically finds the best way to execute the JOIN queries, resulting in faster execution for JOIN over the subquery.
- Sometimes the query becomes too lengthy and JOIN can do the same in short which also becomes easy to read.
In this tutorial, we have seen how subqueries and join differ from each other and which is the best choice to write queries. We have taken a few points into consideration while choosing the JOIN and the subquery. We recommend you learn both methods as both of them are used everywhere and will help you to find the best solutions.