Skip to main content

PostgreSQL 101

PostgreSQL, often simply "Postgres," is a powerful, open-source object-relational database system. It has a strong reputation for reliability, feature robustness, and performance. It is known for its strong adherence to SQL standards and its advanced features like support for complex queries, foreign keys, and triggers.

Overview

Unlike many other relational databases, PostgreSQL is object-relational, meaning it supports not just traditional data types, but also custom types, composite types, and even custom functions. This makes it highly extensible and suitable for a wide range of applications, from simple web apps to complex data warehousing and geospatial applications.

Installation

The installation process for PostgreSQL varies by operating system. Here are the general steps for a Debian-based Linux distribution like Ubuntu.

Ubuntu/Debian

  1. Update Package Index:

    sudo apt update
  2. Install the PostgreSQL package:

    sudo apt install postgresql postgresql-contrib
  3. Start and Enable the PostgreSQL service:

    sudo systemctl start postgresql
    sudo systemctl enable postgresql

Basic Concepts and Commands

Connecting to PostgreSQL

By default, PostgreSQL creates a user named postgres with the postgres role. To connect, you first need to switch to this user.

sudo -i -u postgres
psql

This will log you into the PostgreSQL interactive terminal.

Common SQL Commands

  • List Databases:

    \l
  • Create a Database:

    CREATE DATABASE my_new_postgres;
  • Connect to a Database:

    \c my_new_postgres
  • Create a Table:

    CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    price NUMERIC(10, 2) NOT NULL
    );
  • Insert Data:

    INSERT INTO products (name, price) VALUES ('Laptop', 1200.50);
  • Select Data:

    SELECT * FROM products;
  • Exit the psql shell:

    \q

Docker Installation

Running PostgreSQL in a Docker container is a common practice for development and testing.

  1. Pull the PostgreSQL Image:

    docker pull postgres:latest
  2. Run the PostgreSQL Container:

    docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres:latest

    This command starts a new container named some-postgres, sets the password for the postgres user, and runs it in detached mode.

  3. Connect to the Container:

    docker exec -it some-postgres psql -U postgres

Kubernetes Deployment

Deploying PostgreSQL on Kubernetes can be achieved using a Deployment, a Service, and a PersistentVolumeClaim. For production environments, using a PostgreSQL Operator is recommended.

  1. Create a Secret for the Password:

    apiVersion: v1
    kind: Secret
    metadata:
    name: postgres-pass
    type: Opaque
    data:
    password: <your-base64-encoded-password>
  2. Create a PersistentVolumeClaim:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
    name: postgres-pv-claim
    spec:
    accessModes:
    - ReadWriteOnce
    resources:
    requests:
    storage: 10Gi
  3. Create a Deployment:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
    name: postgres
    spec:
    replicas: 1
    selector:
    matchLabels:
    app: postgres
    template:
    metadata:
    labels:
    app: postgres
    spec:
    containers:
    - name: postgres
    image: postgres:latest
    env:
    - name: POSTGRES_PASSWORD
    valueFrom:
    secretKeyRef:
    name: postgres-pass
    key: password
    ports:
    - containerPort: 5432
    volumeMounts:
    - name: postgres-persistent-storage
    mountPath: /var/lib/postgresql/data
    volumes:
    - name: postgres-persistent-storage
    persistentVolumeClaim:
    claimName: postgres-pv-claim

Configuration and Tuning

  • Configuration File: The main configuration file for PostgreSQL is postgresql.conf. The location of this file varies, but it is typically found in /etc/postgresql/<version>/main/postgresql.conf.
  • Key Configuration Options:
    • shared_buffers: This is a critical setting for performance. A good starting point is 25% of the system's memory.
    • work_mem: This setting controls the amount of memory used for sorting and hashing operations.
    • max_connections: This setting controls the maximum number of concurrent connections.
  • Tuning:
    • Vacuuming: PostgreSQL uses a process called VACUUM to reclaim storage occupied by dead tuples. Autovacuum is enabled by default, but it may need to be tuned for high-write workloads.
    • Indexing: Use appropriate index types (e.g., B-tree, GIN, GiST) to optimize queries.
    • Monitoring: Use tools like pg_stat_statements and external monitoring solutions like pgmetrics or Prometheus with the postgres_exporter to monitor performance.