A practical, developer-friendly guide to PostgreSQL with real, copy-pasteable code. This article covers core concepts, architecture highlights (brief), and—most importantly—hands-on examples: DDL, DML, indexes, JSONB, full-text search, partitioning, transactions, and short client examples in Node.js and Python. Use it as a dev.to post or a quick reference.
Why PostgreSQL (short)
PostgreSQL is a powerful open-source object-relational database. It gives you SQL standards compliance, extensibility (custom types, operators), excellent JSON support (jsonb), robust concurrency via MVCC, a rich extension ecosystem (PostGIS, TimescaleDB), and many index types.
Quick setup (commands)
Run these in a Unix shell (assuming the postgres user exists):
```bash
# start psql as the postgres user
sudo -u postgres psql
```

```sql
-- inside psql: create DB and user
CREATE DATABASE demo;
CREATE USER demo_user WITH PASSWORD 'demo_pass';
GRANT ALL PRIVILEGES ON DATABASE demo TO demo_user;
\q
```

Connect as the new user:
```bash
psql -h localhost -U demo_user -d demo
# or
PGPASSWORD=demo_pass psql -h localhost -U demo_user -d demo
```

Schema: Blog example
We’ll build a small blog schema to demonstrate features.
```sql
-- create tables
CREATE TABLE authors (
  id SERIAL PRIMARY KEY,
  username TEXT NOT NULL UNIQUE,
  email TEXT NOT NULL UNIQUE,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);

CREATE TABLE posts (
  id BIGSERIAL PRIMARY KEY,
  author_id INT NOT NULL REFERENCES authors(id) ON DELETE CASCADE,
  title TEXT NOT NULL,
  body TEXT NOT NULL,
  tags TEXT[],                -- simple array of tags
  metadata JSONB,             -- flexible JSON metadata
  published_at TIMESTAMP WITH TIME ZONE,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
```

Basic CRUD
Insert an author and a post:
```sql
INSERT INTO authors (username, email)
VALUES ('alice', 'alice@example.com')
RETURNING id;

INSERT INTO posts (author_id, title, body, tags, metadata, published_at)
VALUES (1, 'Hello Postgres', 'This is the body',
        ARRAY['postgres','db'], '{"views": 0, "likes": 0}', now())
RETURNING id;
```

Select with a JOIN:
```sql
SELECT p.id, p.title, a.username, p.published_at
FROM posts p
JOIN authors a ON a.id = p.author_id
WHERE p.published_at IS NOT NULL
ORDER BY p.published_at DESC
LIMIT 10;
```

Update and delete:
```sql
UPDATE posts
SET metadata = jsonb_set(
      metadata,
      '{views}',
      ((metadata->>'views')::int + 1)::text::jsonb
    )
WHERE id = 1;

DELETE FROM posts WHERE id = 123;
```
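A closely related pattern is the upsert. Here's a minimal sketch (an addition, not part of the original walkthrough) using `ON CONFLICT` against the unique constraint on `authors.username`:

```sql
-- insert alice, or update her email if the username already exists
INSERT INTO authors (username, email)
VALUES ('alice', 'alice@new-example.com')
ON CONFLICT (username)
DO UPDATE SET email = EXCLUDED.email
RETURNING id, email;
```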
Indexes & EXPLAIN

Create commonly used indexes:
```sql
-- index on foreign key for faster joins
CREATE INDEX idx_posts_author ON posts(author_id);

-- index on published_at for range queries/sorting
CREATE INDEX idx_posts_published_at ON posts(published_at);

-- GIN index for tags array
CREATE INDEX idx_posts_tags_gin ON posts USING GIN (tags);

-- GIN index for jsonb metadata or text search on JSON fields
CREATE INDEX idx_posts_metadata_gin ON posts USING GIN (metadata);
```

Use EXPLAIN ANALYZE to inspect plans:
```sql
EXPLAIN ANALYZE
SELECT * FROM posts WHERE tags @> ARRAY['postgres'];
```

Read the output: look for Index Scan vs Seq Scan, estimated vs actual rows, and timing. Use it to decide whether to change indexes or rewrite queries.
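One refinement worth knowing (an addition here, not from the walkthrough above): a partial index covers only the rows a hot query touches, keeping the index small. Since the listing query filters on `published_at IS NOT NULL`, a sketch might look like:

```sql
-- partial index: only published posts are indexed
CREATE INDEX idx_posts_published_only
ON posts (published_at DESC)
WHERE published_at IS NOT NULL;
```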
JSONB: storing and querying JSON
Store structured data and query it efficiently.
```sql
-- add more metadata
UPDATE posts
SET metadata = metadata || '{"comments": [{"user":"bob","text":"nice!"}], "views": 10}'
WHERE id = 1;

-- read a scalar from JSONB
SELECT metadata->>'views' AS views_text FROM posts WHERE id = 1;

-- query by JSON key
SELECT * FROM posts
WHERE metadata->>'views' IS NOT NULL
  AND (metadata->>'views')::int > 100;

-- find posts where metadata has a nested field
SELECT * FROM posts
WHERE metadata @> '{"comments": [{"user": "bob"}]}';
```

Combine jsonb with a GIN index for speed:
```sql
-- jsonb_path_ops: smaller and faster, but supports only containment (@>)
CREATE INDEX idx_posts_metadata_gin_path_ops
ON posts USING GIN (metadata jsonb_path_ops);

-- the default GIN opclass (already created above as idx_posts_metadata_gin)
-- additionally handles key-existence operators (?, ?|, ?&)
```
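For deeper traversal, SQL/JSON path expressions (PostgreSQL 12+) are handy; a small sketch, assuming the comments structure written above:

```sql
-- pull every commenter name out of the metadata
SELECT jsonb_path_query(metadata, '$.comments[*].user')
FROM posts
WHERE id = 1;
```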
Full-text search (simple)

Add a tsvector column or compute it on the fly.
```sql
-- add a tsvector column for efficient search
ALTER TABLE posts ADD COLUMN search_vector tsvector;

-- populate it (concatenate title and body)
UPDATE posts
SET search_vector = to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''));

-- keep it updated with a trigger
CREATE FUNCTION posts_search_vector_update() RETURNS trigger AS $$
BEGIN
  NEW.search_vector := to_tsvector('english', coalesce(NEW.title,'') || ' ' || coalesce(NEW.body,''));
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_posts_search_vector
BEFORE INSERT OR UPDATE ON posts
FOR EACH ROW EXECUTE FUNCTION posts_search_vector_update();

-- index the vector
CREATE INDEX idx_posts_search_vector ON posts USING GIN (search_vector);

-- search
SELECT id, title,
       ts_rank(search_vector, plainto_tsquery('english', 'postgres tutorial')) AS rank
FROM posts
WHERE search_vector @@ plainto_tsquery('english', 'postgres tutorial')
ORDER BY rank DESC
LIMIT 10;
```
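On PostgreSQL 12+, a stored generated column can replace the trigger entirely; a sketch (using a fresh column name, since search_vector already exists above):

```sql
-- generated column: recomputed automatically on INSERT/UPDATE, no trigger needed
ALTER TABLE posts
ADD COLUMN search_vector_gen tsvector
GENERATED ALWAYS AS (
  to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))
) STORED;
```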
Window Functions & Analytics

Example: top authors by number of posts, with running totals.
```sql
SELECT author_id,
       cnt,
       SUM(cnt) OVER (ORDER BY cnt DESC
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM (
  SELECT author_id, COUNT(*) AS cnt
  FROM posts
  GROUP BY author_id
) sub
ORDER BY cnt DESC;
```
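Another everyday window-function pattern (added here as a sketch, not from the original example) is "latest row per group" with ROW_NUMBER():

```sql
-- newest post per author
SELECT *
FROM (
  SELECT p.*,
         ROW_NUMBER() OVER (PARTITION BY author_id
                            ORDER BY published_at DESC NULLS LAST) AS rn
  FROM posts p
) ranked
WHERE rn = 1;
```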
Transactions & LOCKs (safe updates)

Use transactions to ensure atomic updates:
```sql
BEGIN;

-- example: increment view count safely
UPDATE posts
SET metadata = jsonb_set(metadata, '{views}', ((metadata->>'views')::int + 1)::text::jsonb)
WHERE id = 1;

-- do other related writes...

COMMIT;
```

If you need stronger isolation or to avoid race conditions, you can lock a row:
```sql
BEGIN;

SELECT * FROM posts WHERE id = 1 FOR UPDATE;
-- now safe to read/modify and write back

COMMIT;
```
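For queue-like workloads, FOR UPDATE SKIP LOCKED lets concurrent workers each claim a different row without blocking; a sketch (treating unpublished posts as a work queue purely for illustration):

```sql
BEGIN;

-- claim one unpublished post that no other worker currently holds
SELECT id FROM posts
WHERE published_at IS NULL
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- ...process the row, then update it...

COMMIT;
```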
Partitioning (time-based)

Partition posts by year (declarative partitioning):
```sql
-- create the partitioned table
-- note: a primary key on a partitioned table must include the partition key
CREATE TABLE posts_partitioned (
  id bigserial,
  author_id int NOT NULL,
  title text,
  body text,
  created_at timestamptz NOT NULL DEFAULT now(),
  PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

-- create partitions
CREATE TABLE posts_2024 PARTITION OF posts_partitioned
  FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE posts_2025 PARTITION OF posts_partitioned
  FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
```

Inserts are routed automatically to the correct partition. Partitioning helps with performance and maintenance on large tables.
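A small addition worth making (PostgreSQL 11+): a DEFAULT partition catches rows that fall outside every defined range instead of failing the insert:

```sql
-- catch-all partition for out-of-range rows
CREATE TABLE posts_default PARTITION OF posts_partitioned DEFAULT;
```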
Logical replication example (basic)
Set up a publication on the primary:
```sql
-- on the primary
CREATE PUBLICATION my_publication FOR TABLE posts, authors;
```

On the subscriber:
```sql
-- on the subscriber
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=primary_host port=5432 dbname=demo user=replicator password=secret'
PUBLICATION my_publication;
```

This keeps the tables replicated via logical replication. (Note: replication requires server-side configuration of WAL settings and pg_hba.conf; see the server docs.)
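The server-side prerequisites boil down to roughly this on the primary (a hedged sketch; `replicator` is the role name assumed by the connection string above, and a matching pg_hba.conf entry plus SELECT privileges on the published tables are still needed):

```sql
-- logical decoding must be enabled (takes effect after a restart)
ALTER SYSTEM SET wal_level = 'logical';

-- a role the subscriber connects as
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';
```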
Short client examples
Node.js (using pg)
```js
// npm install pg
const { Pool } = require('pg');

const pool = new Pool({
  user: 'demo_user',
  host: 'localhost',
  database: 'demo',
  password: 'demo_pass',
  port: 5432
});

async function getRecentPosts() {
  const res = await pool.query(
    `SELECT p.id, p.title, a.username
     FROM posts p
     JOIN authors a ON a.id = p.author_id
     WHERE p.published_at IS NOT NULL
     ORDER BY p.published_at DESC
     LIMIT 10`
  );
  return res.rows;
}

getRecentPosts()
  .then(rows => { console.log(rows); pool.end(); })
  .catch(err => { console.error(err); pool.end(); });
```

Use connection pooling (PgBouncer) for production.
Python (using psycopg)
```python
# pip install psycopg[binary]
import psycopg

dsn = "dbname=demo user=demo_user password=demo_pass host=localhost"

with psycopg.connect(dsn) as conn:
    with conn.cursor() as cur:
        cur.execute("""
            SELECT p.id, p.title, a.username
            FROM posts p
            JOIN authors a ON a.id = p.author_id
            WHERE p.published_at IS NOT NULL
            ORDER BY p.published_at DESC
            LIMIT 10
        """)
        for row in cur.fetchall():
            print(row)
```
Tips & Best Practices (practical)

- Use proper data types: smallint/int/bigint where appropriate; `timestamptz` for timezone-aware times.
- Index the columns you filter and join on, but avoid excessive indexes, which slow writes.
- Run `EXPLAIN ANALYZE` before optimizing; measure before changing.
- VACUUM / autovacuum: keep it healthy; tune it if you have many updates and deletes.
- Avoid long transactions; they hold old row versions and cause bloat.
- Use prepared statements or parameterized queries to avoid SQL injection and improve plan stability.
- Monitor `pg_stat_activity` to find blocking queries and long-running transactions (a sample query follows this list).
- Backups: use `pg_basebackup` for physical backups or `pg_dump` for logical ones.
- Use connection pooling (PgBouncer) to handle many short-lived connections.
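As promised in the monitoring tip above, a minimal sketch of such a query (the columns are from the standard pg_stat_activity view):

```sql
-- active sessions ordered by how long their current query has been running
SELECT pid, state, wait_event_type, now() - query_start AS runtime, query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC NULLS LAST;
```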
Advanced directions (where to go next)
- PostGIS for geospatial workloads.
- Logical replication + logical decoding for CDC (change data capture).
- TimescaleDB for time-series.
- Custom extensions (create types, operators).
- Parallel queries and server tuning (`work_mem`, `shared_buffers`, `effective_cache_size`, etc.).
- Security: roles, row-level security (RLS), TLS (a minimal RLS sketch follows this list).
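To make that last bullet concrete, a minimal RLS sketch against the posts table; `app.current_author_id` is an assumed session setting for illustration, not something defined earlier in this post:

```sql
-- authors may only update their own posts
ALTER TABLE posts ENABLE ROW LEVEL SECURITY;

CREATE POLICY posts_owner_update ON posts
FOR UPDATE
USING (author_id = current_setting('app.current_author_id')::int);
```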
Conclusion
PostgreSQL is more than a database—it’s a platform you can extend and tune for a wide variety of workloads. Above you have a compact, practical set of examples: schema, indexes, JSONB, full-text, partitioning, transactions, and client code. Copy the snippets, run them locally, and use EXPLAIN ANALYZE to see how queries behave on your data.