SQL Tutorials - 3

6 Operator/ Clause
6.1 Select
6.2 Distinct 
6.3 Where
6.4 Like
6.5 Between
6.6 In & Not in
6.7 Order by
6.8 Limit
6.9 Offset
6.10 Group by
6.11 Describe
6.12 Case
6.13 Aliases
6.14 Top
6.15 And & Or
7 Function
7.1 Aggregate Function
7.2 Windows Function
7.2.1 Aggregate Function as Windows Function
7.2.2 Ranking Windows Function
7.2.3 Positional Windows Function


We will use Rainfall data in India for demonstrating different concepts of Operators and clauses in SQL. You may download the dataset from the below clickable link

Link for SQL Rainfall in India Dataset

6.1 Select

After importing the data in the RDMS application. you may execute the query for getting the summary of dataset. SELECT Statement in SQL is used to retrieve, display output or fetch data from a database.

Select * from rainfall; 

( * means query should return all the columns of queried table )

Output - 

6.2 Distinct 

Distinct keyword is used with select keyword to fetch unique values.

SELECT DISTINCT * FROM rainfall;

6.3 Where

Where is used for filtering data by matching pattern

                  Greater Than

>=                 -  Greater than or Equal to

                  -  Less Than

<=                 -  Less than or Equal to

=                   -  Equal to

<>                 Not Equal to

BETWEEN  -  In an inclusive Range

LIKE            -  Search for a pattern

IN                 -  To specify multiple possible values


SELECT * FROM rainfall WHERE DISTRICT = 'MYSORE';

6.4 Like

The LIKE operator is used in a WHERE clause to search for a specified pattern in a column.

  • The percent sign (%) represents zero, one, or multiple characters
  •  The underscore sign (_) represents one, single character
The following are the rules for pattern matching with the LIKE Clause:

  • ‘a%’ Match strings which start with ‘a’
  • ‘%a’ Match strings with end with ‘a’
  • ‘a%t’ Match strings which contain the start with ‘a’ and end with ‘t’.
  • ‘%wow%’ - Match strings which contain the substring ‘wow’ in them at any  position.
  • ‘_wow%’ - Match strings which contain the substring ‘wow’ in them at the second position.
  • ‘_a%’ -  Match strings which contain ‘a’ at the second position.
  • ‘a_ _%’ - Match strings which start with ‘a’ and contain at least 2 more characters.
  • 'a%o' - Match string that starts with ‘a’ and ends with ‘o’

Example of the see use of like query -

SELECT * FROM rainfall WHERE district LIKE 'a%';


Similarly you can try practice fetching other patterns with like query

6.5 Between

The SQL BETWEEN condition is used to search an expression within range of values. It can be used in a SELECT, INSERT, UPDATE, or DELETE statement. BETWEEN condition can be used with numeric and date values

Using BETWEEN with Numeric Values: 

SELECT * FROM rainfall WHERE AUG BETWEEN 250 AND 280;

Using NOT operator with BETWEEN 

SELECT * FROM rainfall WHERE AUG NOT BETWEEN 250 AND 280;

6.6 In & Not In

The SQL IN condition is used fetch records when it matches the expression defined in the list of values. It is used to remove the need of multiple OR condition in SELECT, INSERT, UPDATE or DELETE. IN condition can be used with numeric, dates and alphabetic expressions. NOT IN condition is used to exclude the values  defined in the list

Using IN with Numeric Values: 

SELECT * FROM rainfall WHERE AUG IN ( 291.4, 427.8, 480.7);

6.7 Order by

The ORDER BY clause is used to sort the data of specific field in a ascending and descending order by using keyword asc & desc respectively.

SELECT * FROM rainfall ORDER BY DISTRICT desc;

6.8 Limit

If there are large number of  tuples satisfying the query conditions & we require to see some of them then Limit Clause is handy.

SELECT * FROM rainfall ORDER BY DISTRICT desc LIMIT 3;


6.9 offset

OFFSET is the extension of LIMIT clause. By using OFFSET clause the mentioned number records will be skipped.

SELECT * FROM rainfall ORDER BY DISTRICT ASC LIMIT 3 OFFSET 2;

6.10 Group by

The GROUP BY Statement in SQL is used to arrange identical data into groups with the help of aggregate functions when the condition satisfies.

SELECT STATE_UT_NAME, COUNT(*) 
FROM rainfall GROUP BY STATE_UT_NAME HAVING COUNT(*) >5;


SELECT COUNT(DISTRICT), STATE_UT_NAME FROM rainfall GROUP BY STATE_UT_NAME ASC;

6.11 Describe 

DESCRIBE or DESC command is used to describe the structure of a table.

Describe rainfall;

or 

Desc rainfall;

6.12 Case 

The CASE statement goes through conditions and returns a value when the first condition is met (like an if-then-else statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause.

If there is no ELSE part and no conditions are true, it returns NULL.

SELECT DISTRICT, ANNUAL,
CASE
    WHEN ANNUAL > 3000 THEN 'The quantity is greater than 300'
    WHEN ANNUAL = 3000 THEN 'The quantity is 300'
    ELSE 'The quantity is under 3000'
END AS Annual_rainfall
FROM rainfall;
6.13 Aliases 
Aliases are the temporary names given to table or column for the purpose of a particular SQL query. Aliases is just a temporary change and table name does not change in the original database used when table or column names are big or not very readable.
SELECT STATE_UT_NAME as Region FROM rainfall;
6.14 Top

TOP clause is used to fetch limited number of rows from top of table.

SELECT TOP 2 * FROM rainfall;

6.15 And & Or

The AND & OR operators are used for filtering the data and getting precise results based on conditions. AND and OR operators are used with the WHERE clause.

AND operator - Operators display only those records when condition 1 & Condition 2 are True.

SELECT* FROM rainfall WHERE STATE_UT_NAME = 'DELHI' AND DISTRICT = 'NORTH DELHI';


OR operatorOperators display only those records when condition 1 or Condition 2 are True.

SELECT* FROM rainfall WHERE STATE_UT_NAME = 'DELHI' OR DISTRICT = 'PANIPAT';


7.1 Aggregate Function

An aggregate function in SQL returns one value after calculating multiple values of a column.

Various Aggregate Functions

1) Count() - Returns total number of non null records.

2) Sum() - Sum all Non Null values of Column.

3) Avg() - AVG function returns the average of all non-Null values.

4) Min() - MIN function is used to find the minimum value of a certain column.

5) Max() - MAX function is used to find maximum value of certain column.

6) FIRST()

7) LAST()


Example 1 - using Count

SELECT COUNT(DISTRICT) FROM rainfall;

Example 2 - using Max

SELECT STATE_UT_NAME, MAX(ANNUAL) AS max_Annual FROM rainfall GROUP BY STATE_UT_NAME;

Similarly you can try other Aggregate function as well.

7.2 Windows Function

Window functions applies aggregate and ranking functions over a particular set of rows. Window functions are distinguished from other SQL functions by the presence of an OVER clause. If a function has an OVER clause, then it is a window function. If it lacks an OVER clause, then it is an ordinary aggregate or scalar function.

We Can use PARTITION & ORDER BY with Over clause.

PARTITION is used to divide the result set into partitions and perform computation & ORDER BY is used to arrange data in sequential order.


7.2.1 Using Aggregate function as windows function

SELECT STATE_UT_NAME, ANNUAL, 

MAX(ANNUAL) over(PARTITION BY STATE_UT_NAME) AS max_Annual

FROM rainfall;

For understanding 'over(partition by STATE_UT_NAME)' can be considered as replacement of group by clause.


7.2.2 Ranking Windows Function

1) Row_Number() - It assigns consecutive integers to all the rows, no two rows can have same row number.

2) Rank() - Rank is assigned such that rank 1 given to the first row and rows which are having same value are assigned with same rank. For the next rank after two same rank values, one rank value will be skipped. i.e If first 2 rows have same it will be given as rank 1 and third row will be assigned as rank 3. 

3) Dense_Rank() - Just like rank function first row is assigned rank 1 and rows having same value have same rank. The difference between RANK() and DENSE_RANK() is that in DENSE_RANK(), for the next rank after two same rank, consecutive integer is used, no rank is skipped. 


Example  - using Row_Numer()

SELECT STATE_UT_NAME, ANNUAL,

ROW_NUMBER( ) over(PARTITION BY STATE_UT_NAME) AS Row_Number 

FROM rainfall;


Similarly you can try using Other Ranking Windows function as well.


7.2.3 Positional Windows Function

4) Lag () - The LAG() function return value of the previous row in result.

5) Lead () - The LAG() function return value of the next row in result.


Example - using Lag()

SELECT STATE_UT_NAME, ANNUAL,

LAG(ANNUAL) over(PARTITION BY STATE_UT_NAME ORDER BY STATE_UT_NAME) AS lag

FROM rainfall;


Similarly you can try using Lead function as well.