PostgreSQL data types

PostgreSQL's data types and when to use each one.

PostgreSQL ships with a rich set of built-in data types. Picking the right type for each column makes your schema more efficient, prevents invalid data from being stored, and lets the query planner make better decisions.

Numeric types

PostgreSQL offers several numeric types for different use cases.

| Type | Storage | Range | |------|---------|-------| | smallint | 2 bytes | -32,768 to 32,767 | | integer | 4 bytes | -2,147,483,648 to 2,147,483,647 | | bigint | 8 bytes | -9.2 quintillion to 9.2 quintillion | | numeric(p, s) | variable | up to 131,072 digits before decimal | | real | 4 bytes | 6 decimal digits precision (approximate) | | double precision | 8 bytes | 15 decimal digits precision (approximate) |

Use integer for most whole-number columns. Move to bigint when you expect more than 2 billion rows, or when you're storing external IDs that may exceed integer range (tweet IDs, for example).

Use numeric for money and any calculation where exactness matters. Floating point types (real, double precision) cannot represent most decimal fractions exactly, which causes rounding errors that compound over time.

SQL

CREATE TABLE products (
    id integer PRIMARY KEY,
    name text NOT NULL,
    price numeric(10, 2) NOT NULL,       -- exact: 10 digits total, 2 after decimal
    weight_kg double precision,           -- approximate: fine for physical measurements
    stock_count integer NOT NULL DEFAULT 0
);

Serial and auto-increment

serial, bigserial, and smallserial are shorthand types that create an integer column backed by an auto-incrementing sequence. They are the traditional way to make a surrogate primary key.

SQL

CREATE TABLE orders (
    id serial PRIMARY KEY,
    customer_id integer NOT NULL
);

In PostgreSQL 10 and later, GENERATED ALWAYS AS IDENTITY is the preferred SQL-standard alternative:

SQL

CREATE TABLE orders (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id integer NOT NULL
);

Both work. Identity columns have cleaner semantics: dropping the column automatically drops the sequence, and ownership is unambiguous.

Text types

| Type | Description | |------|-------------| | text | Variable-length string, no limit | | varchar(n) | Variable-length string, max n characters | | char(n) | Fixed-length string, padded with spaces to length n |

In PostgreSQL, text and varchar have identical performance. The database does not pre-allocate storage based on the declared length. This differs from MySQL and SQL Server, where choosing varchar(255) over text can affect performance or indexing.

Use text by default. Use varchar(n) only when you want the database to enforce a maximum length for business logic reasons, such as a country code or a phone extension. Avoid char(n) for most purposes; the space-padding behaviour is rarely useful and can cause surprising results in comparisons.

SQL

CREATE TABLE users (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email text NOT NULL UNIQUE,
    country_code varchar(2) NOT NULL,   -- enforce 2-char limit as a data rule
    bio text                             -- no arbitrary length cap needed
);

Boolean

The boolean type stores true, false, or NULL. PostgreSQL accepts a wide range of input literals.

| Accepted as true | Accepted as false | |--------------------|---------------------| | true, 't', 'yes', 'on', '1' | false, 'f', 'no', 'off', '0' |

SQL

CREATE TABLE feature_flags (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL UNIQUE,
    is_enabled boolean NOT NULL DEFAULT false
);

INSERT INTO feature_flags (name, is_enabled) VALUES
    ('dark_mode', true),
    ('beta_checkout', false);

Date and time types

| Type | Description | Example | |------|-------------|---------| | date | Calendar date only | 2024-03-15 | | time | Time of day, no timezone | 14:30:00 | | timestamp | Date and time, no timezone | 2024-03-15 14:30:00 | | timestamptz | Date and time, with timezone | 2024-03-15 14:30:00+00 | | interval | A span of time | 2 hours 30 minutes |

The most important distinction is between timestamp and timestamptz. A plain timestamp stores the literal date and time you give it with no timezone context. A timestamptz (timestamp with time zone) stores the moment in UTC and converts it to the session's time zone on display.

If your application ever runs in more than one time zone, or if your database server and application server are in different zones, timestamp will cause bugs that are hard to trace. Use timestamptz for nearly everything.

SQL

CREATE TABLE events (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title text NOT NULL,
    starts_at timestamptz NOT NULL,
    ends_at timestamptz NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
);

Use date when you genuinely only care about the calendar date and time is irrelevant (a birthdate, a fiscal quarter start date).

JSON types

PostgreSQL has two JSON types:

| Type | Storage | Ordering preserved | Indexable | Duplicate keys | |------|---------|--------------------|-----------|----------------| | json | Raw text | Yes | No | Allowed (last wins on read) | | jsonb | Binary | No | Yes | Not allowed |

jsonb is the right choice for almost all use cases. It strips insignificant whitespace, deduplicates keys, and can be indexed with GIN indexes for fast containment queries. The only reason to prefer json is if you need to preserve the exact original text of the JSON input (whitespace, key order, duplicate keys), which is rare.

SQL

CREATE TABLE orders (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    customer_id integer NOT NULL,
    metadata jsonb NOT NULL DEFAULT '{}',
    line_items jsonb NOT NULL
);

-- GIN index for fast containment queries
CREATE INDEX idx_orders_metadata ON orders USING gin (metadata);

-- Query orders where metadata contains a specific key/value
SELECT * FROM orders WHERE metadata @> '{"source": "web"}';

Arrays

Any PostgreSQL data type can be used as an array column by appending []. Arrays work well for small, fixed-purpose lists such as tags or permission flags.

SQL

CREATE TABLE posts (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title text NOT NULL,
    tags text[] NOT NULL DEFAULT '{}',
    allowed_roles integer[]
);

INSERT INTO posts (title, tags, allowed_roles)
VALUES ('Getting started', ARRAY['tutorial', 'beginner'], ARRAY[1, 2, 3]);

Use ANY to query for rows where an array contains a specific value, and the @> containment operator to check that an array includes all elements of another array.

SQL

-- Posts with the 'tutorial' tag
SELECT * FROM posts WHERE 'tutorial' = ANY(tags);

-- Posts that have both 'tutorial' and 'beginner' tags
SELECT * FROM posts WHERE tags @> ARRAY['tutorial', 'beginner'];

For complex many-to-many relationships, a separate junction table is usually better than an array column. Arrays shine for simple, bounded lists that you query in bulk.

UUID

The uuid type stores a 128-bit universally unique identifier in 16 bytes. PostgreSQL 13 and later include gen_random_uuid() as a built-in function.

SQL

CREATE TABLE sessions (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id integer NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    expires_at timestamptz NOT NULL
);

UUIDs versus integer primary keys:

| | Integer / serial | UUID | |--|------------------|------| | Storage | 4 bytes | 16 bytes | | Readability | Easy to type and share | Long and opaque | | Sequential | Yes (predictable) | No (random v4) | | Client-side generation | Requires a round-trip | Safe to generate anywhere | | Reveals row count | Yes | No |

Choose UUIDs when you need to generate IDs before hitting the database (useful in distributed systems or offline-first apps), when you want to avoid exposing sequential IDs in URLs, or when merging data from multiple sources. Stick with integers for most internal tables where simplicity and storage efficiency matter.

Other useful types

inet and cidr

inet stores an IPv4 or IPv6 address, optionally with a subnet mask. cidr stores a network address and requires that the host bits are zero.

SQL

CREATE TABLE blocked_addresses (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    address inet NOT NULL,
    reason text NOT NULL,
    blocked_at timestamptz NOT NULL DEFAULT now()
);

INSERT INTO blocked_addresses (address, reason)
VALUES ('192.168.1.100', 'repeated failed logins');

PostgreSQL includes network operators for these types, such as checking whether an address falls within a subnet (<<).

bytea

bytea stores raw binary data as a byte sequence. It is used for file contents, encrypted values, and any data that is not valid text.

SQL

CREATE TABLE attachments (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    filename text NOT NULL,
    content_type text NOT NULL,
    data bytea NOT NULL
);

Enum types

Enums define a fixed set of allowed string values at the type level. They offer slightly better storage than text with a check constraint, and enforce the allowed values across all tables that use the type.

SQL

CREATE TYPE order_status AS ENUM (
    'pending', 'processing', 'shipped', 'delivered', 'cancelled'
);

CREATE TABLE orders (
    id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    status order_status NOT NULL DEFAULT 'pending',
    created_at timestamptz NOT NULL DEFAULT now()
);

The tradeoff: adding a new value to an enum requires ALTER TYPE ... ADD VALUE, and removing or renaming values requires more work. A text column with a CHECK constraint is simpler to change. For stable, well-defined sets of values, enums are a clean choice.

Choosing the right type

| Situation | Type to use | |-----------|-------------| | Whole numbers, IDs, counters | integer | | IDs expected to exceed 2 billion | bigint | | Auto-incrementing primary key | integer GENERATED ALWAYS AS IDENTITY | | Money, financial calculations | numeric(p, s) | | Scientific measurements (approximate OK) | double precision | | General text, names, descriptions | text | | Text with a meaningful max length | varchar(n) | | True/false flags | boolean | | Points in time | timestamptz | | Calendar dates only | date | | JSON data | jsonb | | Simple lists of values | text[] or integer[] | | Globally unique identifiers | uuid | | IP addresses | inet | | Binary data | bytea | | Fixed set of string values | enum or text with check constraint |

When in doubt, start with the simpler type. It is easier to change from integer to bigint early in a project than to fix data integrity problems caused by using double precision for money.