In this topic, I'll talk about the procedure of execution for a SQL SELECT statement.
Basically there're 6 steps for a query:
1. Get Data - FROM, JOIN
2. Filter Row - WHERE
3. Group - GROUP BY
4. Group Filter - HAVING
5. Return Selected Result - SELECT
6. Order or Page - ORDER BY/LIMIT/OFFSET
0. Prepare the data.
Here I will use two tables to demonstrate the process.
CREATE DATABASE dbtest;
USE dbtest;
CREATE TABLE employees(employee_name VARCHAR(30) NOT NULL, department_id INT);
CREATE TABLE departments(department_id INT NOT NULL, department_name VARCHAR(30));
INSERT INTO departments VALUES(1, 'HR');
INSERT INTO departments VALUES(2, 'R&D');
INSERT INTO departments VALUES(3, 'Marketing');
INSERT INTO departments VALUES(4, 'Sales');
INSERT INTO employees VALUES('Andy', 1);
INSERT INTO employees VALUES('Bob', 2);
INSERT INTO employees VALUES('Candy',3);
INSERT INTO employees VALUES('David',1);
INSERT INTO employees VALUES('Ella',2);
INSERT INTO employees VALUES('Fann',1);
INSERT INTO employees VALUES('Gary',4);
INSERT INTO employees VALUES('Henry',4);
INSERT INTO employees VALUES('Ivy',4);
INSERT INTO employees VALUES('Jack',4);
We use the below statement to explain the process.
SELECT d.department_name AS "Department Name"
FROM employees e
JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_name != 'Marketing'
GROUP BY d.department_name
HAVING COUNT(*) >= 3
ORDER BY d.department_name DESC
LIMIT 1;
1. Get data
The first step is to get the data by FROM and JOIN. It will return the cardisian product. We have 10 * 4 = 40 rows in this result. It's equal to:
SELECT *
FROM employees e
JOIN departments d;
| employee_name | department_id | department_id | department_name |
| Andy | 1 | 4 | Sales |
| Andy | 1 | 3 | Marketing |
| Andy | 1 | 2 | R&D |
| Andy | 1 | 1 | HR |
| Bob | 2 | 4 | Sales |
| Bob | 2 | 3 | Marketing |
| Bob | 2 | 2 | R&D |
| Bob | 2 | 1 | HR |
| Candy | 3 | 4 | Sales |
| Candy | 3 | 3 | Marketing |
| Candy | 3 | 2 | R&D |
| Candy | 3 | 1 | HR |
| David | 1 | 4 | Sales |
| David | 1 | 3 | Marketing |
| David | 1 | 2 | R&D |
| David | 1 | 1 | HR |
| Ella | 2 | 4 | Sales |
| Ella | 2 | 3 | Marketing |
| Ella | 2 | 2 | R&D |
| Ella | 2 | 1 | HR |
| Fann | 1 | 4 | Sales |
| Fann | 1 | 3 | Marketing |
| Fann | 1 | 2 | R&D |
| Fann | 1 | 1 | HR |
| Gary | 4 | 4 | Sales |
| Gary | 4 | 3 | Marketing |
| Gary | 4 | 2 | R&D |
| Gary | 4 | 1 | HR |
| Henry | 4 | 4 | Sales |
| Henry | 4 | 3 | Marketing |
| Henry | 4 | 2 | R&D |
| Henry | 4 | 1 | HR |
| Ivy | 4 | 4 | Sales |
| Ivy | 4 | 3 | Marketing |
| Ivy | 4 | 2 | R&D |
| Ivy | 4 | 1 | HR |
| Jack | 4 | 4 | Sales |
| Jack | 4 | 3 | Marketing |
| Jack | 4 | 2 | R&D |
| Jack | 4 | 1 | HR |
Based on above cardisian product, the ON condition will be applied to it. It will return 10 rows.
It's like the statement.
SELECT *
FROM employees e
JOIN departments d
ON e.department_id = d.department_id;
| employee_name | department_id | department_id | department_name |
| Andy | 1 | 1 | HR |
| Bob | 2 | 2 | R&D |
| Candy | 3 | 3 | Marketing |
| David | 1 | 1 | HR |
| Ella | 2 | 2 | R&D |
| Fann | 1 | 1 | HR |
| Gary | 4 | 4 | Sales |
| Henry | 4 | 4 | Sales |
| Ivy | 4 | 4 | Sales |
| Jack | 4 | 4 | Sales |
2. Filter Rows by WHERE
Then WHERE clause is applied to get the qualified rows. It will return 9 rows.
SELECT *
FROM employees e
JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_name != 'Marketing';
| employee_name | department_id | department_id | department_name |
| Andy | 1 | 1 | HR |
| Bob | 2 | 2 | R&D |
| David | 1 | 1 | HR |
| Ella | 2 | 2 | R&D |
| Fann | 1 | 1 | HR |
| Gary | 4 | 4 | Sales |
| Henry | 4 | 4 | Sales |
| Ivy | 4 | 4 | Sales |
| Jack | 4 | 4 | Sales |
3. Group the data with GROUP BY
GROUP BY d.department_name;
Based on the department name, it will group the data. It also has 3 rows like the way:
| employee_name | department_id | department_id | department_name |
| Andy | 1 | 1 | HR |
| Fann | 1 | 1 | |
| David | 1 | 1 | |
| Bob | 2 | 2 | R&D |
| Ella | 2 | 2 | |
| Gary | 4 | 4 | Sales |
| Henry | 4 | 4 | |
| Ivy | 4 | 4 | |
| Jack | 4 | 4 |
4. Group filter by HAVING
HAVING COUNT(*) >= 3
Only HR and Sales which has at least 3 employees are left.
| employee_name | department_id | department_id | department_name |
| Andy | 1 | 1 | HR |
| Fann | 1 | 1 | |
| David | 1 | 1 | |
| Gary | 4 | 4 | Sales |
| Henry | 4 | 4 | |
| Ivy | 4 | 4 | |
| Jack | 4 | 4 |
5. Return selected result
Then execute SELECT clause:
SELECT d.department_name AS "Department Name"
FROM employees e
JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_name != 'Marketing'
GROUP BY d.department_name
HAVING COUNT(*) >= 3;
| Department Name |
| HR |
| Sales |
6. Order or Page
ORDER BY is applied to the result in Step 5:
SELECT d.department_name AS "Department Name"
FROM employees e
JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_name != 'Marketing'
GROUP BY d.department_name
HAVING COUNT(*) >= 3
ORDER BY d.department_name DESC;
| Department Name |
| Sales |
| HR |
Finally LIMIT is applied. Only the first record is returned after LIMIT 1:
SELECT d.department_name AS "Department Name"
FROM employees e
JOIN departments d
ON e.department_id = d.department_id
WHERE d.department_name != 'Marketing'
GROUP BY d.department_name
HAVING COUNT(*) >= 3
ORDER BY d.department_name DESC
LIMIT 1;
| Department Name |
| Sales |










