Calculate the employee tenure time by differencing between from_date
and to_date
in the dept_emp
table using the AGE()
function.
Why are some of the calculated values so large?
SELECT
from_date,
to_date,
?? AS employee_tenure
FROM dept_emp
LIMIT 30
;
In this lecture, we will go over conditional functions and expressions available in PostgreSQL. These are the equivalent of writing IF .. THEN .. in other languages.
The three we will cover are:
NULLIF()
functionCOALESCE()
functionCASE .. WHEN ..
expressionPostgreSQL documentation can be found at: https://www.postgresql.org/docs/current/functions-conditional.html.
NULLIF()
NULLIF(value1, value2)
The NULLIF()
function returns a NULL value if value1 equals value2; otherwise it returns value1.
This is the equivalent of pseudo code:
if value1 == value2:
return NULL
else:
return value1
An example of NULLIF()
:
SELECT
dept_name,
NULLIF(dept_name, 'Customer Service') AS dept_name_wo_customer_service
FROM departments
;
Outputs the following:
dept_name | dept_name_wo_customer_service
-------------------+-------------------------------
Customer Service |
Development | Development
Finance | Finance
Human Resources | Human Resources
Marketing | Marketing
Production | Production
Quality Management | Quality Management
Research | Research
Sales | Sales
Exercise:
Use NULLIF()
to create a new column in which if the row value in to_date
is equal to '9999-01-01'
then let the value in the new column be null, otherwise default back to the original value in to_date
.
SELECT
to_date,
???
FROM dept_emp
LIMIT 30
;
COALESCE()
The COALESCE()
function accepts an unlimited number of arguments. It returns the first argument that is not null. If all arguments are null, the COALESCE()
function will return null.
In many ways, COALESCE()
is the inverse of IFNULL()
.
The syntax for COALESCE()
is as follows:
COALESCE (argument_1, argument_2, …)
This is the equivalent of pseudo code:
if argument_1 != NULL:
return argument_1
elif argument_2 != NULL:
return argument_2
...
else:
return NULL
Consider the following SQL:
SELECT
dept_name,
NULLIF(dept_name, 'Customer Service') AS dept_name_wo_customer_service,
COALESCE(NULLIF(dept_name, 'Customer Service'), 'CS') AS dept_name_w_cs
FROM departments
;
Outputs the following:
dept_name | dept_name_wo_customer_service | dept_name_w_cs
-------------------+-------------------------------+--------------------
Customer Service | | CS
Development | Development | Development
Finance | Finance | Finance
Human Resources | Human Resources | Human Resources
Marketing | Marketing | Marketing
Production | Production | Production
Quality Management | Quality Management | Quality Management
Research | Research | Research
Sales | Sales | Sales
Breakout Exercise:
to_date_revised
doing?
SELECT
from_date,
to_date,
COALESCE(NULLIF(to_date, '9999-01-01'), CURRENT_DATE) AS to_date_revised
FROM dept_emp
LIMIT 30
;
from_date
and the newly created to_date_revised
column in question 1, does this new and improved employee tenure column make sense now? Write the SQL for it.CASE .. WHEN ..
statementsThe PostgreSQL CASE .. WHEN ..
expression is the same as IF/THEN/ELSE statements in other programming languages.
The CASE .. WHEN ..
expression has two writing forms:
The general form of the CASE .. WHEN ..
statement is as follows:
CASE
WHEN condition_1 THEN result_1
WHEN condition_2 THEN result_2
[WHEN ...]
[ELSE else_result]
END
In this syntax, each condition (condition_1, condition_2…) is a boolean expression that returns either true or false.
When a condition evaluates to false, the CASE expression evaluates the next condition from the top to bottom until it finds a condition that evaluates to true.
If a condition evaluates to true, the CASE expression returns the corresponding result that follows the condition. For example, if the condition_2 evaluates to true, the CASE expression returns the result_2. Also, it immediately stops evaluating the next expression.
In case all conditions evaluate to false, the CASE expression returns the result (else_result) that follows the ELSE keyword. If you omit the ELSE clause, the CASE expression returns NULL.
A beginner’s example:
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
END AS dept_name_short
FROM departments
;
An example with ELSE:
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
ELSE 'not CS'
END AS dept_name_short
FROM departments
;
Note that the ELSE does not have to be a string literal. Here, we can default the ELSE to the original values in the dept_name
column.
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
ELSE dept_name
END AS dept_name_short
FROM departments
;
Exercise:
Convert the following pseudo code logic to SQL to create the new column called dept_name_short
:
if dept_name == 'Customer Service':
return 'CS'
else:
return ''
Fill in the SQL below:
SELECT
dept_name,
??? AS dept_name_short
FROM departments
;
An example with multiple WHEN’s:
Now let’s look at an example with multiple conditions to evaluate.
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
WHEN dept_name = 'Development' THEN 'Dev'
ELSE 'N/A'
END AS dept_name_short
FROM departments
;
In pseudo code, this is the same as:
if dept_name == 'Customer Service':
return 'CS'
elif dept_name == 'Development':
return 'Dev'
else:
return 'N/A'
Exercise:
Write the SQL query that generates the output below based on the column dept_name
in the departments
table:
dept_name | dept_name_short
-------------------+-----------------
Customer Service | CS
Development | Dev
Finance | Fnce
Human Resources | HR
Marketing | Mktg
Production | Prod
Quality Management | QA
Research | R&D
Sales | Sales
Fill in the SQL below:
SELECT
dept_name,
??? AS dept_name_short
FROM departments
;
PostgreSQL provides another form of the CASE expression called the simple form, which is as follows:
CASE expression
WHEN value_1 THEN result_1
WHEN value_2 THEN result_2
[WHEN ...]
[ELSE else_result]
END
The CASE first evaluates the expression and compares the result with each value( value_1, value_2, …) in the WHEN clauses sequentially until it finds the match.
Once the result of the expression equals a value (value1, value2, etc.) in a WHEN clause, the CASE returns the corresponding result in the THEN clause.
If CASE does not find any matches, it returns the else_result which follows the ELSE, or NULL value if the ELSE is not available.
A beginner’s example:
The general form:
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
END as dept_name_short
FROM departments
;
The simple form
SELECT
dept_name,
CASE dept_name
WHEN 'Customer Service' THEN 'CS'
END as dept_name_short
FROM departments
;
A more complex example with multiple conditions and an ELSE:
With more conditions, the simple form is more concise to type:
The general form:
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
WHEN dept_name = 'Development' THEN 'Dev'
ELSE 'N/A'
END as dept_name_short
FROM departments
;
The simple form:
SELECT
dept_name,
CASE dept_name
WHEN 'Customer Service' THEN 'CS'
WHEN 'Development' THEN 'Dev'
ELSE 'N/A'
END as dept_name_short
FROM departments
;
Regardless of the form, the psuedo code itself is still the same, and so is the output:
dept_name | dept_name_short
-------------------+-----------------
Customer Service | CS
Development | Dev
Finance | N/A
Human Resources | N/A
Marketing | N/A
Production | N/A
Quality Management | N/A
Research | N/A
Sales | N/A
(9 rows)
Exercise:
Re-write the following SQL, which uses the general form of CASE .. WHEN
into the simple form.
SELECT
dept_name,
CASE
WHEN dept_name = 'Customer Service' THEN 'CS'
WHEN dept_name = 'Development' THEN 'Dev'
WHEN dept_name = 'Finance' THEN 'Fnce'
WHEN dept_name = 'Human Resources' THEN 'HR'
WHEN dept_name = 'Marketing' THEN 'Mktg'
WHEN dept_name = 'Production' THEN 'Prod'
WHEN dept_name = 'Quality Management' THEN 'QA'
WHEN dept_name = 'Research' THEN 'R&D'
ELSE dept_name
END AS dept_name_short
FROM departments
;
Breakout Exercise:
Please write the SQL that will create the following output using the departments
table:
dept_no | dept_name | dept_name_bundled
--------+--------------------+-------------------
d009 | Customer Service | Business
d005 | Development | Engineering
d002 | Finance | Business
d003 | Human Resources | Business
d001 | Marketing | Business
d004 | Production | Engineering
d006 | Quality Management | Engineering
d008 | Research | Engineering
d007 | Sales | Business
CASE .. WHEN ..
with aggregation functionsOn many occasions, it’s helpful to combine the conditional logic from CASE .. WHEN ..
with aggregation functions such as SUM()
, MIN()
, MAX()
, etc.
Consider the following:
This SQL query returns the number of rows in the table salaries
by creating a temporary column of 1’s for each row, then summing down that row.
SELECT
SUM(1) AS cnt_rows
FROM salaries
;
However, if we want to apply conditional logic to the temporary column so that the value is only equal to 1 if the condition is met (e.g. emp_no = 90869
), we can first do the following:
SELECT
emp_no,
CASE
WHEN emp_no = 90689 THEN 1
ELSE 0
END AS flg_emp_90689
FROM salaries
LIMIT 10 -- this is just to limit the printout
;
Which gives an output like this:
emp_no | flg_emp_90689
--------+---------------
90689 | 1
90689 | 1
90689 | 1
90690 | 0
90691 | 0
90691 | 0
90691 | 0
90691 | 0
90691 | 0
90691 | 0
(10 rows)
Then, we can apply the aggregation function on top of this column created by conditional logic to calculate how many rows in the salaries
table have the emp_no
equal to 90689
.
SELECT
SUM(
CASE
WHEN emp_no = 90689 THEN 1
ELSE 0
END
) AS cnt_emp_90689_occurences
FROM salaries
;
Exercise:
Run the following SQL. In English, what does it do?
SELECT
SUM(
CASE
WHEN gender = 'M' THEN 1
ELSE 0
END
) AS cnt_emp_male,
SUM(
CASE
WHEN gender = 'F' THEN 1
ELSE 0
END
) AS cnt_emp_female
FROM employees
;
More aggregation functions beyond SUM()
The combination of conditional logic with aggregation functions does not just stop with SUM()
and zeroes and ones. All aggregation functions are fair game.
Consider the following:
We can create a new column where if the emp_no = 90589
then retain the original value from the same row salary
column, otherwise let the value be 0.
SELECT
emp_no,
salary,
CASE
WHEN emp_no = 90689 THEN salary
ELSE 0
END AS flg_emp_90689_salary
FROM salaries
LIMIT 10 -- this is just to limit the printout
;
Which gives an output like this:
emp_no | salary | flg_emp_90689_salary
--------+--------+----------------------
90689 | 98100 | 98100
90689 | 101016 | 101016
90689 | 102549 | 102549
90690 | 40000 | 0
90691 | 65744 | 0
90691 | 66904 | 0
90691 | 71074 | 0
90691 | 71198 | 0
90691 | 74916 | 0
90691 | 74444 | 0
(10 rows)
Which then can be aggregated, like so:
SELECT
SUM(
CASE
WHEN emp_no = 90689 THEN salary
ELSE 0
END
) AS emp_90689_salary
FROM salaries
;
Note however, that SUM()
, like all aggregation functions, simply ignore NULL values. That is why, the ELSE
clause is not actually necessary, and the logic will return the same value.
SELECT
SUM(
CASE
WHEN emp_no = 90689 THEN salary
END
) AS emp_90689_salary
FROM salaries
;
Breakout Exercises:
SELECT
MIN(
CASE
WHEN emp_no = 90689 THEN salary
ELSE 0
END
) AS emp_90689_min_salary,
MAX(
CASE
WHEN emp_no = 90689 THEN salary
ELSE 0
END
) AS emp_90689_max_salary
FROM salaries
;
SELECT
SUM(
CASE
WHEN title like '%Staff%'
THEN 1
ELSE 0
END
) AS cnt_staff
FROM titles
;
Write a SQL query that returns the number of rows in the employees
table where the birth_date
is equal to '1961-01-01'
.
salaries
table, write a SQL query that returns the highest salary for employees employed in 2000. You can use the from_date
to represent employment date and to extract the employment year.