
We summarize the basic SQL clauses and keywords used to aggregate data in the list below.

Clause: SELECT
Purpose: Specifies which fields will be included in the query result. This clause is always found in SQL query statements.
Example:
SELECT Sum(Amount) As Total
FROM Sales
WHERE CustomerID='ABC'

Clause: FROM
Purpose: Specifies which tables the data will come from. This clause is found in virtually all SQL query statements.
Example:
SELECT Sum(Amount) As Total
FROM Sales
WHERE CustomerID='ABC'

Clause: WHERE
Purpose: Specifies what subset of the data will be used. Its conditions are always non-aggregate. This clause is almost always found in SQL query statements.
Example:
SELECT Sum(Amount) As Total
FROM Sales
WHERE CustomerID='ABC'

Clause: JOIN
Purpose: Links more than one table together. It is often found in more complex queries that require retrieving data from more than one table. There are several forms (INNER JOIN, OUTER JOIN, LEFT JOIN, RIGHT JOIN, ...); the OUTER keyword is usually optional, so LEFT JOIN means the same as LEFT OUTER JOIN.
Example:
SELECT Customers.CustomerName, Sales.CustomerID, Sum(Sales.Amount) As Total
FROM Sales INNER JOIN Customers ON Customers.CustomerID = Sales.CustomerID
WHERE Sales.CustomerID='ABC'
GROUP BY Customers.CustomerName, Sales.CustomerID

Clause: GROUP BY
Purpose: Specifies the fields by which data should be aggregated. In this example, we group by CustomerID so that we get a summary of the total purchases per customer.
Example:
SELECT CustomerID, Sum(Amount) As Total
FROM Sales
GROUP BY CustomerID

Clause: HAVING
Purpose: HAVING is very similar to WHERE, except that its conditions are aggregate in nature. In this example, we return summaries only for customers who have purchased more than 60,000 worth of items.
Example:
SELECT CustomerID, Sum(Amount) As Total
FROM Sales
GROUP BY CustomerID
HAVING Sum(Amount) > 60000

Clause: Aggregation Functions (SUM, COUNT, AVG)
Purpose: Aggregate functions summarize data by rolling up a set of data items into a single item. A few basic ones exist in most systems that support SQL, and many others are specific to a particular DBMS. An important thing to note is that if a column in the result set is not an aggregate field, then it must be included in the GROUP BY clause.
Example:
SELECT CustomerID, Sum(Amount) As Total, Count(*) As SaleCount, AVG(Amount) As AverageOrder
FROM Sales
GROUP BY CustomerID
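
As a rough sketch of how these clauses combine in one statement, the query below filters individual rows with WHERE before grouping and then filters the grouped totals with HAVING. It assumes the Sales table also has the SaleDate column used later in this document. The date cutoff and the 60000 threshold are illustrative values only.

```sql
-- Assumed table layout: Sales(CustomerID, Amount, SaleDate)
SELECT CustomerID, Sum(Amount) As Total, Count(*) As SaleCount
FROM Sales
WHERE SaleDate > '1999-12-30'   -- non-aggregate condition, applied to rows before grouping
GROUP BY CustomerID             -- the only non-aggregate column in the SELECT list
HAVING Sum(Amount) > 60000      -- aggregate condition, applied to each group after grouping
```

A sale dated on or before the cutoff never reaches the grouping step, so it contributes to neither Total nor SaleCount. A customer whose remaining sales total 60000 or less is dropped from the result entirely.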

Application

Data aggregation is the process of taking numerous records and collapsing them into a single
summary record. When aggregating data, one must decide what records must be considered in
the summary and how these records should be summarized. Data can be summarized based on
certain fields (attributes) in a database or derived attributes.
The examples below were run in SQL Server 7.0, so the syntax may differ slightly if you are
working with a different DBMS.

SELECT SaleDate, Amount
FROM Sales
WHERE SaleDate > '1999-12-30'

The above example filters rows with a date comparison that uses a string representation of a
date. The rows it returns are the dataset that we will summarize.
*Note: SQL Server 7 performs an implicit string-to-date conversion. Some DBMSs require an
explicit conversion, such as CAST to a date type or CONVERT. Also, the valid string
representations of dates may vary by locale or DBMS.
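
Where an explicit conversion is preferred, the filter might be written as in the sketch below. It assumes SaleDate is a datetime column and uses SQL Server syntax; other systems have their own conversion functions and date literal forms, so treat it as one possible spelling.

```sql
-- CONVERT with style 112 reads the string as yyyymmdd
SELECT SaleDate, Amount
FROM Sales
WHERE SaleDate > CONVERT(datetime, '19991230', 112)

-- The same filter written with CAST
SELECT SaleDate, Amount
FROM Sales
WHERE SaleDate > CAST('19991230' AS datetime)
```

The unseparated yyyymmdd form is used here because SQL Server interprets it the same way regardless of the session's language or date format settings.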

Once the data to summarize has been chosen, the next step is to decide how to aggregate it.
Data can be aggregated in many different ways. For example, if we are analyzing which days
of the week are best for sales, we may want to know the average sale amount for each day of
the week.
SELECT DATEPART(dw,SaleDate) As DayOfWeek, AVG(Amount) As AvgDaySale
FROM Sales
WHERE SaleDate > '1999-12-30'
GROUP BY DATEPART(dw,SaleDate)
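
A possible extension of the day-of-week query is sketched below. It adds the weekday name and a sale count, and sorts the result by day number. The DayName and SaleCount aliases are chosen here for illustration. Note that AVG(Amount) averages the individual sale rows that fall on each weekday, and that the number DATEPART(dw, ...) assigns to each weekday depends on the session's SET DATEFIRST setting.

```sql
SELECT DATEPART(dw, SaleDate) As DayOfWeek,
       DATENAME(dw, SaleDate) As DayName,     -- weekday name, e.g. Monday
       Count(*) As SaleCount,                 -- number of sale rows on that weekday
       AVG(Amount) As AvgDaySale              -- average amount per sale row
FROM Sales
WHERE SaleDate > '1999-12-30'
GROUP BY DATEPART(dw, SaleDate), DATENAME(dw, SaleDate)
ORDER BY DATEPART(dw, SaleDate)
```

Both date expressions appear in GROUP BY because neither is an aggregate, following the rule given for aggregation functions above.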
