5 Examples of SQL GROUP BY (Beginner Friendly)

Featured

Data aggregation is one of the important concepts when we talk about databases. If you have writing queries in SQL for a while, you must have come across the GROUP BY clause. In this tutorial, we will present some GROUP BY examples and demonstrate how they can be solved to extract valuable insights from data sets.

Overview of GROUP BY

GROUP BY clause in SQL is used to group the rows that have similar values into summary rows with the help of some aggregate function like SUM, AVG, COUNT, MIN, and MAX. It performs calculations on grouped data. This clause is important for summarizing and analyzing large datasets, that allow users to extract meaningful data from their database.

Syntax:

SELECT column01, function_name(column02)
FROM table01
WHERE condition
GROUP BY column01, column02
ORDER BY column01, column02;Code language: SQL (Structured Query Language) (sql)

Below is the table named Course, we are going to use this table for the rest of our tutorial.

Course
Course

Example 1: GROUP BY With One Column

Let’s say we want to fetch the record of the total duration of courses for each date from the Course table.

SELECT date, SUM(duration) AS total_duration
FROM Course
GROUP BY date;Code language: SQL (Structured Query Language) (sql)

Output:

Example 01
Example 01

In the above example, each row represents one date with the total duration of courses (total_duration) which is scheduled on the date. Whereas, The ‘total_duration‘ column shows the sum of the duration of all the courses which is scheduled on a particular date.

Example 2: GROUP BY With Two Column

Let’s say we want to fetch the record of the average duration for a course in a given month of the year 2024 from the Course table.

SELECT
EXTRACT( YEAR FROM date) AS year,
EXTRACT( MONTH FROM date) AS month,
ROUND(AVG(duration), 2) AS avg_time
FROM Course
GROUP BY
EXTRACT( YEAR FROM date),
EXTRACT( MONTH FROM date);Code language: SQL (Structured Query Language) (sql)

Output:

Example 02
Example 02

In the above example, we can observe that for March 2024, the average duration of the course is 135.00. It’s a single row in the result set which tells that there is only one distinct-month combination in the Course table.

Note: Every column name that is specified in the SELECT statement must either be included in the GROUP BY clause or used in the aggregate functions.

Example 3: GROUP BY and ORDER BY

Let’s say we want to fetch the record of the total duration of courses for each course and then order the results which will be based on the total duration from the Course table.

SELECT course, SUM(duration) AS total_duration
FROM Course
GROUP BY course
ORDER BY total_duration DESC;Code language: SQL (Structured Query Language) (sql)

Output:

Example 03
Example 03

In the above example, each course category with its corresponding total duration of courses is shown. The ORDER BY clause allows users to specify the order in which they want to fetch the result set.

Example 4: GROUP BY and HAVING

Let’s say we want to fetch the records of the course categories that have an average duration of more than 120 minutes from the Course table.

SELECT course, AVG(duration) AS avg_duration
FROM Course
GROUP BY course
HAVING AVG(duration) > 120;Code language: SQL (Structured Query Language) (sql)

Output:

Example 04
Example 04

In the above example, each course category with its corresponding average duration of courses greater than 120 minutes is shown. The HAVING clause allows users to filter out the grouped rows.

Example 5: GROUP BY, HAVING and WHERE

Finally, if we want to fetch the record of the course categories that have an average duration greater than 120 minutes and which were conducted after March 15, 2024, from the Course table.

SELECT course, AVG(duration) AS avg_duration
FROM Course
WHERE date > '2024-03-15'
GROUP BY course
HAVING AVG(duration) > 120;Code language: SQL (Structured Query Language) (sql)

Output:

Example 05
Example 05

From the above example, you can see that, how each course category has an average course time of more than 120 minutes and how those courses were held after March 15, 2024.

Note: The WHERE and HAVING clauses kinda look similar but they actually differ from each other. HAVING is used to filter out the group of rows whereas WHERE is used to filter out single rows before they are even grouped.

Conclusion

In this tutorial, we have analysed multiple scenarios where GROUP BY can be used. We have gone through easy to complex problems and analysed them accordingly. GROUP BY is an essential clause when it comes to data aggregation. If you want to learn about it deeply, Click here.

Reference

https://stackoverflow.com/questions/7434657/how-does-group-by-work