Window Functions Are Not Allowed in WHERE Clauses

Featured

If you have been writing SQL queries for a while, you must know the importance of the WHERE clause. When it comes to data extraction based on specific conditions. However, there are some limitations you must have faced while trying to use the window function in WHERE clauses and the limitation can seem confusing at first glance. In this tutorial, we will learn about the reason behind this limitation and also we will explore ways to achieve the same result.

Understanding Window Function

Window functions operate on a set of rows which is related to the current row. It allows us to execute calculations on a collection of rows that are connected to the current row. Common window functions include ROW_NUMBER(), RANK(), DENSE_RANK(), SUM(), and OVER().

For example, let’s say we have a table with employee data. We can use a window function to calculate the salary for each department without grouping the results allowing us to retain individual rows but do aggregate calculations in defined data partitions. This provides insights into how each employee’s salary compares to the average within their department, facilitating deeper analysis and decision-making processes.

WHERE Clause & Its Purpose

The WHERE clause in SQL serves a unique purpose. It filters rows from the result set based on specified conditions. Conditions in the WHERE clause are used before rows are grouped to figure out which rows are included in the final result set.

For example, let’s say we have a table with sales data. We can use the WHERE clause to fetch the record of sales for only specific products, for a specific period of time. This targeted search provides users with valuable insights from the database that enable them to make informed decisions.

Why Window Functions Are Restricted in WHERE Clause?

We cannot use window functions in the WHERE clause because of the logical order of query processing in SQL. The WHERE clause is evaluated before the window functions. Since window functions depend on the result set returned by the query, they cannot be evaluated in the same step as the WHERE clause.

The logical order of operation in SQL is:

From the above, we can see that window functions can be used in SELECT and ORDER BY. It is not accessible in GROUP BY, WHERE and HAVING clauses.

Alternative Approaches

Using CTE

This approach allows us to calculate aggregated values, such as averages, for specific groups of data and then it allows us to compare those values within the main query.

Example

Let’s say we have a table named sales given below:

sales

If we want to filter out orders where the order amount is greater than the average order amount for each customer.

WITH average_order_amount AS (
    SELECT
        customer_id,
        AVG(order_amount) AS avg_order_amount
    FROM
        sales
    GROUP BY
        customer_id
)
SELECT
    order_id,
    customer_id,
    order_date,
    order_amount
FROM
    sales
WHERE
    order_amount > (
        SELECT
            avg_order_amount
        FROM
            average_order_amount
        WHERE
            customer_id = sales.customer_id
    );
Code language: SQL (Structured Query Language) (sql)

Output

Example
Example

In the WITH, we have calculated the average order amount for each customer by selecting the customer_id and computing the average order_amount by using the AVG() function. This creates a temporary table-like structure named average_order_amount.

In the main query, we select columns from the sales table and compare the order_amount with the average order amount for the respective customer. We found this by comparing the order_amount in the main query with the average order amount obtained from the average_order_amount CTE using a subquery.

Using Derived Tables

Derived tables which are also known as inline views can be used to calculate window functions in a separate query and then join the result with the main query. This allows us to utilize the window function results in the WHERE clauses indirectly.

Example

Let’s say we want to calculate the average order amount for each customer and then filter orders based on whether their order amount exceeds the average for their respective customer.

SELECT 
    s.order_id, 
    s.customer_id, 
    s.order_date, 
    s.order_amount
FROM 
    sales AS s
JOIN 
    (SELECT 
        customer_id, 
        AVG(order_amount) AS avg_order_amount
    FROM 
        sales
    GROUP BY 
        customer_id
    ) AS avg_orders ON s.customer_id = avg_orders.customer_id
WHERE 
    s.order_amount > avg_orders.avg_order_amount;
Code language: SQL (Structured Query Language) (sql)

Output

Example
Example

From the above example, as you can see in the main query, we select columns from the sales table and join it with the derived table (avg_orders) based on the customer_id column. This allows us to bring in the calculated average order amount for each customer into the main query

In the WHERE clause, we compared the order_amount from the sales table to the average order amount obtained from the derived table.

Using HAVING Clause with Aggregations

In this approach, instead of using a window function directly in the WHERE clause, we use a combination of GROUP BY, aggregate functions, and a HAVING clause to achieve the same filtering effect based on aggregate results

Example

Let’s say we want to calculate the average order amount for each customer in the sales table.

SELECT 
    customer_id, 
    AVG(order_amount) AS avg_order_amount
FROM 
    sales
GROUP BY 
    customer_id
HAVING 
    AVG(order_amount) > 100;Code language: SQL (Structured Query Language) (sql)

Output

Example
Example

From the above example, as we can see the HAVING clause filters out the customers where the average order amount is greater than 100.

This query elaborates an alternative way to filter data based on aggregated results, which obtains a similar result to using window functions in the WHERE clause.

Conclusion

In this tutorial, we have learned about the reasons behind the restrictions of the WHERE clause with window functions and additionally, we have seen the alternative approaches to deal with this problem. We believe that if you follow these techniques correctly, you will not only expand your SQL knowledge but also gain a deeper understanding of how to use the power of window processing to extract meaningful insights from your data.

Reference

https://stackoverflow.com/questions/13997177/why-no-windowed-functions-in-where-clauses