Let’s move further in the aggregations functions that covered in the last article. In this article, we will cover GROUP BY and HAVING clauses.
■ GROUP BY
What does it mean to group by? To group by specifying a column means that the values of the specified column are expressed in the same row.
At first glance, it looks similar to DISTINCT, which eliminates redundancy. The difference between GROUP BY and DISTINCT is whether to use the aggregation function. With GROUP BY, the group is handed over as an argument of the aggregation function. Check out the following examples.
I want to group the data above by Place and get the average score for each group. The query statement is as follows:
SELECT Place, AVG(SCORE) AS AVG_SCORE FROM Scores GROUP BY Place;
The results will come out as follows:
One more caveat is that an aggregate function must be used to SELECT columns other than those specified in GROUP BY. In other words, if you place with GROUP BY as in the query statement above, you can use Place in SELECT without aggregation function, but you must use Score with aggregation function.
■ WHERE vs. HAVING
Now let’s look at the difference between WHERE and HAVING, which you use to specify conditionals.
For WHERE, the data is filtered before becoming a group. Conversely, HAVING filters the data after grouping. Thus, rows removed by WHERE are not included in the group.
For example, HAVING can be used in the following cases:
SELECT Product, SUM(Sales) FROM Products GROUP BY Product HAVING SUM(Sales) >= 100;
To interpret query statements, only products with a total sales of 100 or more per product are filtered in the Products table. It would also be good to recall that the aggregation function cannot be used in the conditional expression of the WHERE clause. This is due to the order in which SQL processes, as the processing of retrieving rows in WHERE clause runs before the processing of grouping them into GROUP BY.
The internal processing order is WHERE > GROUP BY > HAVING > SELECT > ORDER BY.
Note that the order in which GROUP BY and several constructs are written is as follows:
Remember that ORDER BY is almost always at the end.
Also, note that NULL is also grouped if the column to be grouped has NULL values. That is, if there are multiple NULLs, they are grouped into one NULL.