- Learn
- PostgreSQL
- PostgreSQL CREATE TABLE
PostgreSQL CREATE TABLE
Everything you need to know about creating tables in PostgreSQL.
Basic syntax
The CREATE TABLE statement defines a new table in your database. At minimum, you need a table name and at least one column:
Each column gets a name and a data type. You can add as many columns as you need, separated by commas.
A more complete example:
This creates a users table with a primary key, a unique email column, a required display name, and a timestamp that defaults to the current time.
Data types
PostgreSQL has a large set of built-in data types. Here are the ones you'll use most often.
Numeric types
| Type | Storage | Range | Use case |
|------|---------|-------|----------|
| smallint | 2 bytes | -32,768 to 32,767 | Small counters, status codes |
| integer | 4 bytes | -2.1 billion to 2.1 billion | Most integer columns |
| bigint | 8 bytes | -9.2 quintillion to 9.2 quintillion | Large IDs, row counts |
| numeric(p, s) | variable | up to 131,072 digits | Money, precision math |
| real | 4 bytes | 6 decimal digits precision | Scientific data (approximate) |
| double precision | 8 bytes | 15 decimal digits precision | Scientific data (approximate) |
For most integer columns, integer is the right choice. Use bigint when you expect billions of rows or are working with external IDs that may exceed integer range.
For money and financial data, always use numeric. Floating point types (real, double precision) introduce rounding errors that will cause problems with currency.
Text types
| Type | Description |
|------|-------------|
| text | Variable-length string, no limit |
| varchar(n) | Variable-length string, max n characters |
| char(n) | Fixed-length string, padded with spaces |
In PostgreSQL, text and varchar perform identically. There is no performance benefit to using varchar(255) over text. The varchar(n) limit can be useful as a data integrity check (for example, ensuring a country code is no more than 2 characters), but if you're adding arbitrary limits like 255, you're better off using text and validating length in your application.
Date and time types
| Type | Description | Example |
|------|-------------|---------|
| date | Calendar date | 2024-03-15 |
| time | Time of day (no timezone) | 14:30:00 |
| timestamp | Date and time (no timezone) | 2024-03-15 14:30:00 |
| timestamptz | Date and time (with timezone) | 2024-03-15 14:30:00+00 |
| interval | Time span | 2 hours 30 minutes |
Always prefer timestamptz over timestamp for storing points in time. A plain timestamp has no timezone information, which leads to confusion when your servers, users, or database sessions are in different time zones.
Boolean
The boolean type stores true, false, or NULL.
UUID
The uuid type stores 128-bit universally unique identifiers.
The gen_random_uuid() function is built into PostgreSQL 13 and later. For older versions, you'll need the uuid-ossp extension.
JSON and JSONB
PostgreSQL supports storing JSON data. Always use jsonb over json. The jsonb type stores data in a binary format that's faster to query and supports indexing.
Arrays
PostgreSQL supports array columns for any built-in type:
Arrays work well for small, simple lists. For more complex relationships, a separate junction table is usually the better approach.
Constraints
Constraints enforce rules on your data at the database level. They prevent bad data from getting into your tables, regardless of which application or script inserts the data.
NOT NULL
Requires that a column always has a value:
Default for any column is nullable. You must explicitly add NOT NULL to make a column required. A good rule of thumb: start with NOT NULL on everything, then remove it only when you have a clear reason for allowing nulls.
UNIQUE
Ensures no two rows have the same value in a column:
You can also create unique constraints across multiple columns:
This allows a user to be in many teams and a team to have many users, but the same user cannot be in the same team twice.
Note that NULL values are not considered equal in unique constraints. Two rows can both have NULL in a unique column. If you need to prevent duplicate nulls, use a unique index with COALESCE or a partial index.
CHECK
Validates that values meet a condition:
You can also name your constraints, which makes error messages more readable:
Named constraints appear in error messages:
ERROR: new row for relation "products" violates check constraint "sale_below_price"
DEFAULT
Sets a default value when no value is provided during insert:
Defaults can be constant values, expressions, or function calls. Common defaults include now() for timestamps, gen_random_uuid() for UUIDs, and '{}' for empty JSON objects.
Primary keys
Every table should have a primary key. It uniquely identifies each row and PostgreSQL automatically creates an index on it.
Serial (auto-incrementing integer)
The traditional approach in PostgreSQL:
serial is shorthand that creates an integer column with an auto-incrementing sequence. There are three sizes:
| Type | Underlying type | Max value |
|------|----------------|-----------|
| smallserial | smallint | 32,767 |
| serial | integer | 2,147,483,647 |
| bigserial | bigint | 9,223,372,036,854,775,807 |
For most tables, serial is fine. If you expect more than 2 billion rows, use bigserial.
Identity columns (SQL standard)
The modern, SQL-standard way to create auto-incrementing columns. Preferred over serial in PostgreSQL 10 and later:
Or, if you sometimes need to set the ID manually:
GENERATED ALWAYS prevents manual inserts of the ID (you can override with OVERRIDING SYSTEM VALUE if needed). GENERATED BY DEFAULT allows manual inserts but auto-generates when no value is provided.
Why prefer identity over serial? Identity columns are part of the SQL standard, have clearer ownership semantics, and dropping the column automatically drops the sequence. With serial, the sequence is a separate object you may need to manage independently.
UUID primary keys
UUIDs are useful when you need IDs that are unique across systems, or when you don't want sequential IDs exposed to users:
Tradeoffs compared to integer primary keys:
- UUIDs are 16 bytes vs 4 bytes for integer (or 8 bytes for bigint)
- Random UUIDs cause more index fragmentation
- UUIDs are safe to generate on the client side
- UUIDs don't reveal row counts or insertion order
For most applications, integer primary keys work well. UUIDs make sense for distributed systems, public-facing IDs, or when you need to generate IDs before inserting.
Foreign keys
Foreign keys create links between tables and enforce referential integrity. They guarantee that a referenced row exists.
The REFERENCES users (id) clause means that every user_id value in the orders table must match an existing id in the users table. Trying to insert an order with a non-existent user_id will fail.
ON DELETE behavior
What happens when you delete a referenced row? You control this with ON DELETE:
CASCADE is convenient but dangerous. Deleting a user will silently delete all their orders, invoices, and anything else that cascades. For important data, RESTRICT (the default) is safer. It forces you to handle related data explicitly.
ON UPDATE behavior
The same options exist for updates. Most commonly used when the referenced column might change:
If a product's SKU changes, all order_items referencing it will be updated automatically.
Named foreign key constraints
For complex tables, naming your foreign keys improves error messages:
Indexing foreign key columns
PostgreSQL does not automatically create indexes on foreign key columns. You should add them yourself, especially for columns you'll join or filter on:
Without this index, queries joining orders to users (and ON DELETE checks) will require a full table scan on orders.
Common patterns
Timestamps
Most tables benefit from created_at and updated_at columns:
PostgreSQL doesn't automatically update updated_at. You need a trigger:
This trigger works on any table with an updated_at column. Create the function once, then add the trigger to each table that needs it.
Soft delete
Instead of deleting rows, mark them as deleted:
Then filter out deleted rows in your queries:
You can create a partial index to speed up queries on active rows:
One important consideration: the UNIQUE constraint on email will prevent you from creating a new account with the same email as a soft-deleted one. If that's a problem, replace the unique constraint with a partial unique index:
Enum-like status columns
You have two options for status columns: text with a check constraint, or a PostgreSQL enum type.
Using a check constraint:
Using an enum type:
Enums give you type safety and slightly better storage. Check constraints are more flexible (easier to add or remove values). For most cases, a check constraint is simpler and works fine.
CREATE TABLE AS
Create a new table from the results of a query:
This creates the table and populates it with data in one step. The new table has the same column types as the query results, but does not inherit constraints, indexes, or defaults from the source table.
You can also create an empty table with the same structure:
CREATE TABLE AS is useful for creating materialized snapshots, staging tables for data migrations, and reporting tables.
CREATE TABLE LIKE
Copy the structure of an existing table, including (optionally) its constraints, defaults, and indexes:
You can be selective about what to include:
The options are INCLUDING DEFAULTS, INCLUDING CONSTRAINTS, INCLUDING INDEXES, INCLUDING STORAGE, INCLUDING COMMENTS, INCLUDING GENERATED, INCLUDING IDENTITY, and INCLUDING ALL.
Unlike CREATE TABLE AS, this creates an empty table. It copies structure, not data.
IF NOT EXISTS
Avoid errors when a table already exists:
If the table exists, PostgreSQL skips the statement and issues a notice instead of an error. This is useful in migration scripts and initialization code where you may not know if the table has already been created.
Note that IF NOT EXISTS only checks the table name. It does not verify that the existing table has the same columns or structure. If a users table exists with different columns, this statement will silently do nothing.
Temporary tables
Temporary tables exist only for the duration of your database session (or transaction). They're visible only to the session that created them.
To drop the temporary table at the end of the transaction instead of the session:
Temporary tables are useful for complex data transformations, staging imported data, and storing intermediate results in multi-step queries.
Unlogged tables
Unlogged tables skip write-ahead logging (WAL), making inserts and updates significantly faster. The tradeoff: data is lost if PostgreSQL crashes or restarts.
Good use cases for unlogged tables:
- Caches and session data that can be rebuilt
- Staging tables for ETL pipelines
- Temporary analytics buffers
- Build or processing queues where loss is acceptable
Do not use unlogged tables for data you can't afford to lose.
Partitioned tables
Table partitioning splits a large table into smaller pieces while keeping a single logical table for queries. PostgreSQL supports three partitioning strategies.
Range partitioning
The most common approach. Splits data by value ranges, typically dates:
Queries that filter on created_at will only scan the relevant partitions.
List partitioning
Splits data by a list of values:
Hash partitioning
Distributes rows evenly across partitions using a hash function:
Partitioning adds complexity. It's worth considering when a table exceeds tens of millions of rows or when you need to efficiently drop old data (detaching a partition is instant, compared to deleting millions of rows).
Schema qualification
By default, tables are created in the public schema. You can specify a different schema:
Schemas are useful for organizing tables by domain, managing permissions, and separating application concerns.
Putting it all together
Here's a realistic set of tables for an e-commerce application:
This setup gives you referential integrity, input validation, sensible defaults, and the indexes you need for common queries. If you're working with a schema like this, a GUI database client like DB Pro can help you visualize table relationships and run queries without switching between the terminal and your editor.
Common mistakes
Forgetting to index foreign keys
PostgreSQL creates an index on the primary key automatically, but not on foreign key columns. If you join on orders.user_id or delete a user (which checks for referencing orders), PostgreSQL will do a full table scan without an index.
Always create indexes on your foreign key columns.
Using serial when you should use bigserial
Two billion sounds like a lot, but high-traffic tables can hit the serial limit faster than expected. If you're creating a table that will have many inserts (events, logs, analytics), use bigserial or bigint GENERATED ALWAYS AS IDENTITY from the start. Changing the column type later requires a table rewrite.
Choosing varchar(255) out of habit
This is a MySQL convention that has no benefit in PostgreSQL. The database doesn't allocate 255 bytes per row. text and varchar(255) perform identically. Use text unless you have a specific length limit you want to enforce.
Missing NOT NULL constraints
Every nullable column is a potential source of bugs. NULL values propagate through expressions in unexpected ways (NULL + 1 is NULL, NULL = NULL is NULL, not true). Add NOT NULL to every column that should always have a value.
Storing money as floating point
Never use real or double precision for monetary values:
Not setting ON DELETE behavior on foreign keys
The default is RESTRICT, which prevents deleting a parent row if children exist. This is often the right choice, but you should make it an explicit decision. Think about what should happen when the parent row is deleted and set ON DELETE accordingly.
Creating tables without a primary key
Every table needs a primary key. Without one, you have no guaranteed way to identify a specific row. Updates and deletes become ambiguous. Replication tools may not work. ORMs will complain.
Even junction tables should have a primary key:
Quick reference
| Command | Description |
|---------|-------------|
| CREATE TABLE name (...) | Create a new table |
| CREATE TABLE IF NOT EXISTS name (...) | Create only if it doesn't exist |
| CREATE TABLE name AS SELECT ... | Create from query results |
| CREATE TABLE name (LIKE other INCLUDING ALL) | Copy structure from another table |
| CREATE TEMPORARY TABLE name (...) | Create session-scoped table |
| CREATE UNLOGGED TABLE name (...) | Create table without WAL |
| CREATE TABLE name (...) PARTITION BY RANGE (col) | Create partitioned table |
| DROP TABLE name | Remove a table |
| DROP TABLE IF EXISTS name | Remove if it exists |
| ALTER TABLE name ADD COLUMN ... | Add a column |
| ALTER TABLE name DROP COLUMN ... | Remove a column |
Summary
Creating tables well means thinking about your data before writing any application code. Pick the right data types, add constraints to prevent bad data, index your foreign keys, and use sensible defaults.
Start with NOT NULL on everything, use timestamptz for timestamps, prefer text over varchar(n), and add created_at and updated_at to most tables. The few extra minutes spent on a well-designed schema will save you hours of debugging later.