Group by aggregate fields with SQL query in SQL SERVER
Submitted by oscar on Mon, 08/31/2009 - 14:28
If you are reading this you are facing a situation where you have created an SQL query using a case statement. Your query consists of derived fields using an SQL CASE statement and some aggregate function that needs to perform its duty based on the derived field.
If you don’t need the background on the example listed and are seeking the technical how-to, then scroll down to the bottom of the page to the section that says “Working Example”.
Example.
I have a table named SALES_TABLE. Each record in my table is a sales order. There are no duplicates of the same order in that table, meaning that each order processed is distinct from the other.
My fiscal year begins in September and ends in August of the following year. So, if today’s date is September 5, 2010, then the fiscal year that I am in is fiscal year 2011. The quarters in my fiscal year are grouped by September, October, November as Quarter 1, December, January, February as Quarter 2. March, April and May make up Quarter 3. Lastly, June, July and August make up Quarter 4.
Unfortunately, in my data I don’t have a field that produces the fiscal year period of when the sale took place.
I’ve been asked to create a simple report that displays the total number of sales per quarter in each fiscal year for all the years we have in our database.
Essentially the output is supposed to look something like:
ORDER_COUNT FISCAL_YEAR QUARTER=========== =========== =======120 1999 Q196 1999 Q2180 1999 Q3232 1999 Q4As mentioned before, we don’t have fields that store the values of the fiscal year or the quarter of when the sale took place. We only have the sales date (SALES_DT). So the first thing that you want to do is derive the values Q1 through Q2 based on the month the sales took place. In addition, you want to add one (1) to the sales date when the sale took place anytime between September and December so that your actual years become fiscal years.
Below is the logic on how to do that;
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 12 THEN YEAR(SALE_DT) + 1
ELSE YEAR(SALE_DT)
END AS YR,
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 11 THEN 'QTR 01'
WHEN MONTH(SALE_DT) = 12 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 1 AND 2 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 3 AND 5 THEN 'QTR 03'
WHEN MONTH(SALE_DT) BETWEEN 6 AND 8 THEN 'QTR 04'
ELSE 'ERROR'
END AS QTR
As you can see, the first CASE statement requests that if the MONTH in the sales date is between month 9 and 12 (September and December), then add one year to the current year. Otherwise, leave it as is.
The second case statement does something similar, however it says, if the sale date is between 9 and 11, assign the value of ‘QTR 01’. If the month of the transactions falls in December, January and February, then give the value ‘QTR 02’ and so on.
What we want is to get a count based on year and quarter. So, one would assume that with the CASE logic in place, you can simply write
SELECT COUNT(*),
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 12 THEN YEAR(SALE_DT) + 1
ELSE YEAR(SALE_DT)
END AS YR,
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 11 THEN 'QTR 01'
WHEN MONTH(SALE_DT) = 12 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 1 AND 2 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 3 AND 5 THEN 'QTR 03'
WHEN MONTH(SALE_DT) BETWEEN 6 AND 8 THEN 'QTR 04'
ELSE 'ERROR'
END AS QTR
FROM DATABASE.dbo.SALES_TABLE
GROUP BY YR, QTR
However, SQL SERVER will not recognize the column aliases you provided to be used in the GROUP BY clause.
Msg 207, Level 16, State 1, Line 15
Invalid column name 'YR'.
Msg 207, Level 16, State 1, Line 15
Invalid column name 'QTR'.
Working Example
You will have to go a step further and in your group clause, provide the same CASE logic as you did in your SELECT clause. The complete SQL query is written below.
SELECT COUNT(*),
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 12 THEN YEAR(SALE_DT) + 1
ELSE YEAR(SALE_DT)
END AS YR,
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 11 THEN 'QTR 01'
WHEN MONTH(SALE_DT) = 12 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 1 AND 2 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 3 AND 5 THEN 'QTR 03'
WHEN MONTH(SALE_DT) BETWEEN 6 AND 8 THEN 'QTR 04'
ELSE 'ERROR'
END AS QTR
FROM DATABASE.dbo.SALES_TABLE
GROUP BY
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 11 THEN 'QTR 01'
WHEN MONTH(SALE_DT) = 12 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 1 AND 2 THEN 'QTR 02'
WHEN MONTH(SALE_DT) BETWEEN 3 AND 5 THEN 'QTR 03'
WHEN MONTH(SALE_DT) BETWEEN 6 AND 8 THEN 'QTR 04'
ELSE 'ERROR'
END,
CASE
WHEN MONTH(SALE_DT) BETWEEN 9 AND 12 THEN YEAR(SALE_DT) + 1
ELSE YEAR(SALE_DT)
END
ORDER BY YR, QTR;
That concludes this lesson on how to group (or use aggregate functions) on derived fields in SQL Server.
Note, other databases that I’ve dealt with in the past don’t require this extra step in the group by clause. You can simply group by column alias, column name with function or numeral.