Home

Lecture 14: 2nd Set Breakout Exercise Solutions


Ex 1

The duplication is caused by the table dept_emp, where several employee ID’s (emp_no) appear more than once in this table.

You could infer by comparing the number of rows against the unique count of employee ID’s in each table:

SELECT
  COUNT(DISTINCT emp_no) AS cnt_unique_emp_no,
  SUM(1) AS num_rows
FROM dept_emp
;
 count  |  sum
--------+--------
 300024 | 331603
(1 row)

To actually find the emp_no’s that are repeated, use this query:

SELECT emp_no
FROM dept_emp
GROUP BY 1
HAVING SUM(1) > 1
;

One way to interpret this result beyond just SQL is that employees might be moving between departments during their employment, and every department switch results in a new row in the dept_emp table.


Ex 2

SELECT
  employees.first_name,
  employees.last_name,
  dept_emp.from_date,
  dept_emp.to_date,
  departments.dept_name
FROM departments
INNER JOIN dept_emp
  ON departments.dept_no = dept_emp.dept_no
INNER JOIN employees
  ON dept_emp.emp_no = employees.emp_no
ORDER BY employees.emp_no, dept_emp.from_date    
;

Ex 3

SELECT
  employees.first_name,
  employees.last_name,
  titles.title AS current_job_title,
  salaries.salary AS current_job_salary
FROM employees
LEFT JOIN titles
  ON employees.emp_no = titles.emp_no
LEFT JOIN salaries
  ON employees.emp_no = salaries.emp_no
WHERE titles.to_date = '9999-01-01'
  AND salaries.to_date = '9999-01-01'
;

This query results in 240,124 rows because out of 300,024 employees in the database, some of them have left the company and are filtered out by the WHERE clauses.

If you rather keep all 300,024 employees, and only have NULLs for the current job title and current job salary, then the query could be written like the following: By using 2 conditions to create each table join, we stipulate that the JOIN only happens if emp_no’s match AND that the to_date criteria is also met. Otherwise, the titles.title column value and the salaries.salary column value remain NULLs.

SELECT
  employees.first_name,
  employees.last_name,
  titles.title AS current_job_title,
  salaries.salary AS current_job_salary
FROM employees
LEFT JOIN titles
  ON employees.emp_no = titles.emp_no
  AND titles.to_date = '9999-01-01'
LEFT JOIN salaries
  ON employees.emp_no = salaries.emp_no
  AND salaries.to_date = '9999-01-01'
;

This query now results in 300,024 rows, which is the count for ALL employees in our database, irregardless of whether they are currently still employed here or not.