SQL 102

Photo by S O C I A L . C U T on Unsplash

After we can get the raw data, the next step an analyst do is usually to check for some statistics of it. Using SQL we can do that in straightforward manner as follows.

Aggregation Function

Most common one including getting minimum, maximum, average and total value of certain columns. We can do this by calling the column name and the functions. The syntax still follow the basic structure of a query as follows:

SELECT 
column_1, column_2, AVG(column_3) AS average_value
FROM table_1

If we try to run the above command, it will give us an error because it’s not complete. The next syntax we need to put is called GROUP BY

SELECT column_1, column_2, AVG(column_3) AS average_value
FROM table_1
GROUP BY column_1, column_2

GROUP BY syntax must be followed by the baseline columns, in this example is column_1 and column_2. Basically we tell SQL to what is the average if we split the data based on the value within column_1 and column_2.

Closing

Examining the data at hand can give us interesting finding already like spotting outliers or invalid data. That’s it for this post. Hope you find it useful and see you in the next post!

Leave a comment