Apply the following psql commands to see the metadata information on the PostgreSQL databases and specifically on the tables in the employees
database.
\l
: list all databases in PostgreSQL\c employees
: connect to the employees database\dt
: list all tables in the current database (e.g. employees
database)\d employees
: to describe the table (e.g. employees
) and print the table schema as well as the data type for each column.You should see something like this:
Table "public.employees"
Column | Type | Collation | Nullable | Default
------------+-----------------------+-----------+----------+---------
emp_no | integer | | not null |
birth_date | date | | not null |
first_name | character varying(14) | | not null |
last_name | character varying(16) | | not null |
gender | character varying | | not null |
hire_date | date | | not null |
Up until now, we have been casual in our definition of data types (e.g. “text” or “number”). Here, we will take a more formal approach.
The data type defines what kind of value(s) a column can hold. Each column in a database table is required to have a column name and a data type. The SQL developer must decide on the data type when creating the table.
The data type is a guideline for SQL to understand what type of data is expected inside each column and also identifies how SQL will interact with the stored data. For example, we can only apply SQL string/text functions to pre-defined text columns.
A data type error
ROUND()
is a mathematical function. Let’s see what happens when it is applied to a text column.
SELECT ROUND(gender)
FROM employees
;
The error generated is a data type error:
HINT: No function matches the given name and argument types.
You might need to add explicit type casts.
Data types
PostgreSQL supports many data types. The four major ones are:
For the full list of data types, check out PostgreSQL’s official documentation at https://www.postgresql.org/docs/current/datatype.html.
Note that some data types have many different names (see the aliases column in the documentation) and not all PostgreSQL data types have equivalents outside of PostgreSQL.
Also note, quotation marks play a large role in data types as well. The differentiation between single quotes and double quotes is that something unique to PostgreSQL:
', '
when concatenating first name and last name)PostgreSQL provides three character data types: CHAR(n), VARCHAR(n), and TEXT
CHAR(n)
, also known as character (n)
, is the fixed-length character with space padded. If you insert a string that is shorter than the length of the column, PostgreSQL pads spaces. If you insert a string that is longer than the length of the column, PostgreSQL will issue an error.
VARCHAR(n)
, also known as character varying (n)
, is the variable-length character string. With VARCHAR(n), you can store up to n characters. PostgreSQL does not pad spaces when the stored string is shorter than the length of the column.
TEXT
is the variable-length character string. Theoretically, text data is a character string with unlimited length.
Exercise:
How many maximum characters can the column dept_name
in the table departments
store?
PostgreSQL provides two distinct types for number data:
Integer
There are two main types of integers (think: whole numbers):
SMALLINT
) is 2-byte signed integer that has a range from -32,768 to 32,767.INT
) is a 4-byte integer that has a range from -2,147,483,648 to 2,147,483,647.Floating-point numbers
There are three main types of floating-point numbers (think: decimals):
float(n)
is a floating-point number whose precision, at least, n, up to a maximum of 8 bytes.real
or float8
is a 4-byte floating-point number.numeric
or numeric(p,s)
is a real number with p digits with s number after the decimal point.Exercise:
Which column(s) in the salaries
table are numeric? What kind of numeric data type, specifically?
The time (temporal) data types allow you to store date and/or time (e.g. datetime) data.
PostgreSQL has five main data types in this category:
DATE
stores the dates only.TIME
stores the time of day values.TIMESTAMP
stores both date and time values.TIMESTAMPTZ
is a timezone-aware timestamp data type. It is the abbreviation for timestamp with the time zone. The TIMESTAMPTZ is the PostgreSQL’s extension to the SQL standard’s temporal data types.INTERVAL
stores periods of time (e.g. minutes, hours, days)Exercise:
Find the data type(s) for columns to_date
and from_date
in the salaries
table.
A PostgreSQL boolean data type has three possible values:
true
false
null
(missing data)When loading data into a PostgreSQL database and the column is declared to be a Boolean data type, the following values are accepted:
true
.false
.There isn’t a boolean type column in the employees
database, but we will be working with booleans quite a bit when we encounter data filtering with the WHERE
clause.