SQL (5) GROUP BY and HAVING

Share

Let’s move further in the aggregations functions that covered in the last article. In this article, we will cover GROUP BY and HAVING clauses.

■ GROUP BY

What does it mean to group by? To group by specifying a column means that the values of the specified column are expressed in the same row.

At first glance, it looks similar to DISTINCT, which eliminates redundancy. The difference between GROUP BY and DISTINCT is whether to use the aggregation function. With GROUP BY, the group is handed over as an argument of the aggregation function. Check out the following examples.

NameScorePlace
AAA501
BBB1001
CCC802

I want to group the data above by Place and get the average score for each group. The query statement is as follows:

SELECT Place,
       AVG(SCORE) AS AVG_SCORE
FROM Scores
GROUP BY Place;

The results will come out as follows:

PlaceAVG_SCORE
150
280

One more caveat is that an aggregate function must be used to SELECT columns other than those specified in GROUP BY. In other words, if you place with GROUP BY as in the query statement above, you can use Place in SELECT without aggregation function, but you must use Score with aggregation function.

■ WHERE vs. HAVING

Now let’s look at the difference between WHERE and HAVING, which you use to specify conditionals.

For WHERE, the data is filtered before becoming a group. Conversely, HAVING filters the data after grouping. Thus, rows removed by WHERE are not included in the group.

For example, HAVING can be used in the following cases:

SELECT Product, 
SUM(Sales)
FROM Products
GROUP BY Product
HAVING SUM(Sales) >= 100;

To interpret query statements, only products with a total sales of 100 or more per product are filtered in the Products table. It would also be good to recall that the aggregation function cannot be used in the conditional expression of the WHERE clause. This is due to the order in which SQL processes, as the processing of retrieving rows in WHERE clause runs before the processing of grouping them into GROUP BY.

The internal processing order is WHERE > GROUP BY > HAVING > SELECT > ORDER BY.

Note that the order in which GROUP BY and several constructs are written is as follows:

Remember that ORDER BY is almost always at the end.

SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY

Also, note that NULL is also grouped if the column to be grouped has NULL values. That is, if there are multiple NULLs, they are grouped into one NULL.

2 Comments
Leave a Reply

Your email address will not be published. Required fields are marked *