Deploying Full-Stack Apps on a Single VPS with Docker and Nginx

There is a pervasive assumption in the industry that scaling a web application requires Kubernetes, managed container services, or at minimum a multi-server architecture. I deploy and operate multiple production applications -- Errandoo, NCHRecruitPro, LeadsNeoForge, and ForgeCadNeo -- on single VPS instances. Each server costs about $12 per month. Each handles over 10,000 daily users. Here is exactly how.

The Case for a Single VPS

Before diving into configuration files, let me address the question you are probably already asking: when is a single VPS enough?

A single VPS with 4GB RAM and 2 vCPUs can handle surprisingly heavy workloads when properly configured. Nginx can serve 10,000+ concurrent connections. PostgreSQL with proper indexing handles millions of rows. Node.js with clustering saturates both CPU cores. The bottleneck is almost never the hardware -- it is misconfiguration, missing indexes, or unoptimized queries.
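To put "10,000 daily users" in perspective, it helps to convert it into requests per second. The sketch below does that arithmetic; the requests-per-user figure and the peak-to-average ratio are illustrative assumptions, not measurements from these applications.

```javascript
// All three inputs are illustrative assumptions; substitute your own numbers.
const dailyUsers = 10_000;
const requestsPerUserPerDay = 50; // assumed
const peakToAverageRatio = 10;    // assumed: peak traffic is 10x the daily average

const SECONDS_PER_DAY = 86_400;

// Convert a daily request volume into average and peak requests per second.
function requestsPerSecond(users, requestsPerUser, peakRatio) {
  const average = (users * requestsPerUser) / SECONDS_PER_DAY;
  return { average, peak: average * peakRatio };
}

const load = requestsPerSecond(dailyUsers, requestsPerUserPerDay, peakToAverageRatio);
console.log(`average: ${load.average.toFixed(1)} req/s, peak: ${load.peak.toFixed(1)} req/s`);
// prints "average: 5.8 req/s, peak: 57.9 req/s"
```

Under these assumptions even the peak is a few dozen requests per second, which is why configuration tends to matter more than hardware at this scale.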

You should consider scaling beyond a single VPS when: you need geographic redundancy (users on multiple continents), your database exceeds the VPS disk capacity, you have sustained CPU usage above 80% after optimization, or you need zero-downtime deployments with blue-green switching. For everything else, a single VPS with Docker Compose is the right starting point.

The Complete Docker Compose Configuration

This is the actual docker-compose.prod.yml I use in production, annotated with explanations for every decision.

# docker-compose.prod.yml
version: "3.8"

services:
  # ---------- Frontend ----------
  web:
    build:
      context: .
      dockerfile: apps/web/Dockerfile
      args:
        - NODE_ENV=production
    restart: unless-stopped
    mem_limit: 512m
    cpus: 0.5
    environment:
      - NODE_ENV=production
      - NEXT_PUBLIC_API_URL=https://api.errandoo.com
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
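    # Publish on loopback only: the host's Nginx proxies to 127.0.0.1:3000,
    # and the port stays unreachable from the internet
    ports:
      - "127.0.0.1:3000:3000"
    networks:
      - frontend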
    depends_on:
      api:
        condition: service_healthy

  # ---------- Backend API ----------
  api:
    build:
      context: .
      dockerfile: apps/api/Dockerfile
    restart: unless-stopped
    mem_limit: 2g
    cpus: 1.5
    env_file: .env.production
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:5000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
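    # Publish on loopback only: the host's Nginx proxies to 127.0.0.1:5000
    ports:
      - "127.0.0.1:5000:5000"
    networks:
      - frontend
      - backend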
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  # ---------- Database ----------
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    mem_limit: 1g
    cpus: 1.0
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./scripts/init-db.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d ${DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend
    # Never expose DB port to host in production
    # ports are intentionally omitted

  # ---------- Cache / Queue ----------
  redis:
    image: valkey/valkey:8-alpine
    restart: unless-stopped
    mem_limit: 256m
    command: valkey-server --maxmemory 200mb --maxmemory-policy allkeys-lru
    volumes:
      - redisdata:/data
    healthcheck:
      test: ["CMD", "valkey-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  # ---------- Background Workers ----------
  worker:
    build:
      context: .
      dockerfile: apps/api/Dockerfile
    command: node dist/worker.js
    restart: unless-stopped
    mem_limit: 512m
    cpus: 0.5
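    env_file: .env.production
    # The shared image's HEALTHCHECK probes port 5000. Assuming worker.js does not
    # serve HTTP, disable it here so the worker is not reported as unhealthy.
    healthcheck:
      disable: true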
    networks:
      - backend
    depends_on:
      api:
        condition: service_healthy

volumes:
  pgdata:
    driver: local
  redisdata:
    driver: local

networks:
  frontend:
  backend:

A few things to notice. Every service has an explicit memory limit, and every service except the cache has a CPU limit as well. Without these, a memory leak in one container can starve the entire server. The PostgreSQL container publishes no ports to the host; it is reachable only on the backend network. The health checks use wget --spider instead of curl because Alpine images ship BusyBox wget but not curl, and installing curl would add a few megabytes to every image. The start_period on the web and api health checks gives those containers time to boot before Docker starts counting failures.
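One thing worth checking whenever you set per-container limits is how they add up against the host. The sketch below sums the mem_limit values from the Compose file above and compares them with a 4GB host; the helper names are made up for illustration.

```javascript
// Convert a Compose-style size string such as "512m" or "2g" to MiB.
function toMiB(limit) {
  const match = /^(\d+(?:\.\d+)?)([kmg])b?$/i.exec(limit.trim());
  if (!match) throw new Error(`Unrecognized memory limit: ${limit}`);
  const factor = { k: 1 / 1024, m: 1, g: 1024 }[match[2].toLowerCase()];
  return Number(match[1]) * factor;
}

// mem_limit values copied from docker-compose.prod.yml above.
const memLimits = { web: '512m', api: '2g', postgres: '1g', redis: '256m', worker: '512m' };

// Add up every limit in MiB.
function totalMiB(limits) {
  return Object.values(limits).reduce((sum, limit) => sum + toMiB(limit), 0);
}

const hostMiB = 4 * 1024;
const total = totalMiB(memLimits);
console.log(`Sum of limits: ${total} MiB of ${hostMiB} MiB (${Math.round((total / hostMiB) * 100)}%)`);
// prints "Sum of limits: 4352 MiB of 4096 MiB (106%)"
```

The limits are ceilings, not reservations, so a sum slightly above physical RAM is workable as long as the containers do not all hit their ceilings at once. If they might, lower the largest limit or add swap.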

Multi-Stage Docker Builds

Image size directly affects deployment speed and disk usage. A naive Node.js Dockerfile produces images over 1GB. Multi-stage builds cut that to roughly 200MB.

# apps/api/Dockerfile
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN corepack enable && pnpm install --frozen-lockfile --prod=false

# Stage 2: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN corepack enable && pnpm run build
# Prune dev dependencies after build
RUN pnpm prune --prod

# Stage 3: Production runtime
FROM node:20-alpine AS runner
WORKDIR /app

# Security: run as non-root user
RUN addgroup --system appgroup && adduser --system appuser --ingroup appgroup

# Copy only what we need
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/package.json ./

USER appuser
EXPOSE 5000

HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD wget --spider -q http://localhost:5000/health || exit 1

CMD ["node", "dist/main.js"]

Stage 1 installs all dependencies, including dev dependencies. Stage 2 compiles the TypeScript and then prunes the dev dependencies. Stage 3 starts from a clean Alpine image and copies only the compiled output and production dependencies. The resulting image is typically 180-220MB instead of 1.2GB. Running as a non-root user inside the container is a basic security measure: it does not make a container escape impossible, but it limits what a compromised process can do.
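The HEALTHCHECK instruction in the Dockerfile, like the Compose health checks, assumes the API answers on /health at port 5000. Here is a minimal sketch of such an endpoint using Node's built-in http module rather than the NestJS stack the API actually uses. The probe functions are placeholders; a real service would run something like a SELECT 1 and a cache PING.

```javascript
const http = require('http');

// Hypothetical dependency probes; replace the bodies with real checks.
const probes = {
  postgres: async () => true,
  cache: async () => true,
};

// Run every probe and report 200 only if all of them succeed.
async function runProbes(probeMap) {
  const checks = {};
  for (const [name, probe] of Object.entries(probeMap)) {
    try {
      checks[name] = (await probe()) === true;
    } catch {
      checks[name] = false; // a throwing probe counts as a failed dependency
    }
  }
  const healthy = Object.values(checks).every(Boolean);
  return {
    statusCode: healthy ? 200 : 503,
    body: { status: healthy ? 'ok' : 'degraded', checks },
  };
}

// Answer /health with the probe result; everything else is a 404 in this sketch.
async function handleRequest(req, res) {
  if (req.url !== '/health') {
    res.writeHead(404);
    res.end();
    return;
  }
  const { statusCode, body } = await runProbes(probes);
  res.writeHead(statusCode, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify(body));
}

// Call start() from the service entry point.
function start(port = 5000) {
  return http.createServer(handleRequest).listen(port);
}
```

Returning 503 when a dependency is down makes wget exit non-zero, so Docker marks the container unhealthy and anything waiting on service_healthy keeps waiting.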

Nginx Configuration: The Full Picture

Nginx is the front door to everything. It handles SSL termination, compression, rate limiting, WebSocket proxying, and static file caching. Here is the complete configuration.

# /etc/nginx/nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}

http {
    # ---------- Basic Settings ----------
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    server_tokens off;  # Hide Nginx version
    client_max_body_size 50m;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # ---------- Gzip Compression ----------
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 4;
    gzip_min_length 256;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml
        application/rss+xml
        image/svg+xml;

    # ---------- Rate Limiting ----------
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=3r/s;
    limit_req_status 429;

    # ---------- Logging ----------
    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    '$request_time $upstream_response_time';

    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log warn;

    # ---------- SSL Settings ----------
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
    ssl_session_tickets off;

    # OCSP Stapling
    ssl_stapling on;
    ssl_stapling_verify on;

    # ---------- Security Headers ----------
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;

    include /etc/nginx/conf.d/*.conf;
}

# /etc/nginx/conf.d/errandoo.conf
upstream web_app {
    server 127.0.0.1:3000;
    keepalive 32;
}

upstream api_app {
    server 127.0.0.1:5000;
    keepalive 32;
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name errandoo.com www.errandoo.com api.errandoo.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name errandoo.com www.errandoo.com;

    ssl_certificate /etc/letsencrypt/live/errandoo.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/errandoo.com/privkey.pem;

    # Static assets with aggressive caching
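    location /_next/static/ {
        proxy_pass http://web_app;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # needed for upstream keepalive
        proxy_set_header Host $host;
        expires 365d;
        # Note: an add_header inside a location replaces the headers inherited
        # from the http block, so the security headers are not sent here
        add_header Cache-Control "public, immutable";
    }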

    # General rate limit for pages
    location / {
        limit_req zone=general burst=20 nodelay;
        proxy_pass http://web_app;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # needed for upstream keepalive
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

server {
    listen 443 ssl http2;
    server_name api.errandoo.com;

    ssl_certificate /etc/letsencrypt/live/errandoo.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/errandoo.com/privkey.pem;

    # Auth endpoints: strict rate limit
    location /auth/ {
        limit_req zone=auth burst=5 nodelay;
        proxy_pass http://api_app;
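        proxy_http_version 1.1;
        proxy_set_header Connection "";  # needed for upstream keepalive
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;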
    }

    # WebSocket endpoint for Socket.IO
    location /socket.io/ {
        proxy_pass http://api_app;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 86400;  # Keep WS connections alive for 24h
    }

    # API endpoints
    location / {
        limit_req zone=api burst=50 nodelay;
        proxy_pass http://api_app;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # needed for upstream keepalive
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

The three rate-limiting zones serve different purposes. The auth zone is aggressive (3 requests per second) to prevent brute-force login attempts. The api zone is more generous (30r/s) for authenticated API calls. The general zone handles page loads. The burst parameter allows temporary spikes without dropping legitimate traffic -- a user rapidly clicking through pages will not get 429 errors.
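To see how rate and burst interact, here is a simplified model of one limit_req entry for a single client IP with nodelay. It follows the leaky-bucket accounting as I understand it from Nginx's behavior (each accepted request adds one to an "excess" counter that drains at the configured rate). It is a teaching sketch and not a reimplementation of Nginx.

```javascript
// Simplified model of limit_req with nodelay for one client.
function createLimiter(ratePerSecond, burst) {
  let excess = 0;   // requests currently "queued" above the steady rate
  let last = null;  // time of the last accepted request, in seconds

  return function allow(nowSeconds) {
    if (last === null) {
      last = nowSeconds; // first request from this client is always accepted
      return true;
    }
    const drained = ratePerSecond * (nowSeconds - last);
    const next = Math.max(0, excess - drained + 1);
    if (next > burst) return false; // Nginx would answer 429; state is unchanged
    excess = next;
    last = nowSeconds;
    return true;
  };
}

// The auth zone from the configuration above: rate=3r/s, burst=5.
const auth = createLimiter(3, 5);
const instant = Array.from({ length: 10 }, () => auth(0));
console.log(instant.filter(Boolean).length); // prints 6
```

Ten simultaneous login attempts yield six accepted requests (one plus the burst of five) and four 429 responses. One second later the bucket has drained enough to accept three more, which matches the 3r/s steady rate.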

Let's Encrypt with Auto-Renewal

SSL certificates are non-negotiable in production. Let's Encrypt provides them for free, and Certbot automates the entire lifecycle.

# Initial certificate generation
sudo certbot certonly --nginx \
  -d errandoo.com \
  -d www.errandoo.com \
  -d api.errandoo.com \
  --email admin@errandoo.com \
  --agree-tos \
  --no-eff-email

# Auto-renewal is set up via systemd timer (Certbot installs this automatically)
# Verify with:
sudo systemctl list-timers | grep certbot

# Manual test of renewal
sudo certbot renew --dry-run

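# Post-renewal hook to reload Nginx. Save the three lines below as
# /etc/letsencrypt/renewal-hooks/deploy/reload-nginx.sh and make the file
# executable (chmod +x); Certbot only runs executable hooks.
#!/bin/bash
systemctl reload nginx
echo "$(date): Nginx reloaded after cert renewal" >> /var/log/certbot-deploy.log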

Certbot's systemd timer runs twice daily and only renews certificates within 30 days of expiration. The deploy hook reloads Nginx after renewal so the new certificate takes effect without downtime.
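Auto-renewal can fail silently, so it is worth checking the served certificate from the outside as well. The sketch below uses Node's tls module to read the expiry date and compute the days remaining. The function names are my own, and the 30-day figure in the usage note simply mirrors Certbot's renewal window.

```javascript
const tls = require('tls');

const DAY_MS = 24 * 60 * 60 * 1000;

// Whole days between now and the certificate's expiry date.
function daysUntil(validTo, now = new Date()) {
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / DAY_MS);
}

// Connect to the host and resolve with the certificate's valid_to string.
function fetchCertExpiry(host, port = 443) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({ host, port, servername: host }, () => {
      const cert = socket.getPeerCertificate();
      socket.end();
      resolve(cert.valid_to);
    });
    socket.on('error', reject);
  });
}

// Usage (performs a network call, so it is left commented out):
// fetchCertExpiry('errandoo.com').then((validTo) => {
//   console.log(`${daysUntil(validTo)} days left`);
// });
```

If the number ever drops well below 30, renewal has been failing for a while and deserves an alert.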

Monitoring Stack: Grafana, Loki, and Uptime Kuma

Running production without monitoring is operating blind. My monitoring stack runs on the same VPS (yes, really -- monitoring a single server does not need its own server) and consists of three components: Loki for log storage (fed by Promtail), Grafana for dashboards, and Uptime Kuma for uptime checks.

# docker-compose.monitoring.yml
version: "3.8"

services:
  # ---------- Log Aggregation ----------
  loki:
    image: grafana/loki:2.9.0
    restart: unless-stopped
    mem_limit: 256m
    volumes:
      - ./monitoring/loki-config.yml:/etc/loki/local-config.yaml
      - lokidata:/loki
    command: -config.file=/etc/loki/local-config.yaml
    networks:
      - monitoring

  promtail:
    image: grafana/promtail:2.9.0
    restart: unless-stopped
    mem_limit: 128m
    volumes:
      - ./monitoring/promtail-config.yml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    command: -config.file=/etc/promtail/config.yml
    networks:
      - monitoring
    depends_on:
      - loki

  # ---------- Dashboards ----------
  grafana:
    image: grafana/grafana:10.2.0
    restart: unless-stopped
    mem_limit: 256m
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SERVER_ROOT_URL=https://monitor.errandoo.com
    volumes:
      - grafanadata:/var/lib/grafana
      - ./monitoring/provisioning:/etc/grafana/provisioning
    networks:
      - monitoring
      - frontend

  # ---------- Uptime Monitoring ----------
  uptime-kuma:
    image: louislam/uptime-kuma:1
    restart: unless-stopped
    mem_limit: 256m
    volumes:
      - uptimedata:/app/data
    networks:
      - monitoring
      - frontend

volumes:
  lokidata:
  grafanadata:
  uptimedata:

networks:
  monitoring:
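  frontend:
    external: true
    # Compose creates the app stack's network as <project>_frontend; this name
    # assumes the project is called "errandoo" (adjust to match yours)
    name: errandoo_frontend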

Promtail tails Docker container logs and Nginx access logs, labels them by service, and ships them to Loki. Grafana queries Loki for dashboards showing request rates, error rates, and response times. Uptime Kuma pings every endpoint every 60 seconds and sends alerts when something goes down.
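Request rates and response times come from the Nginx access log, so it helps to know exactly what one line contains. The sketch below parses a line written in the log_format shown earlier. It runs outside Loki purely to illustrate the field layout, and the sample line is invented.

```javascript
// Matches the "main" log_format from nginx.conf above.
const ACCESS_LINE =
  /^(\S+) - (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+) "([^"]*)" "([^"]*)" (\S+) (\S+)$/;

// Parse one access-log line into an object, or return null if it does not match.
function parseAccessLog(line) {
  const m = ACCESS_LINE.exec(line);
  if (!m) return null;
  const [method = '', path = ''] = m[4].split(' ');
  return {
    remoteAddr: m[1],
    time: m[3],
    method,
    path,
    status: Number(m[5]),
    bytesSent: Number(m[6]),
    requestTime: Number(m[9]),
    // Nginx logs "-" when the request never reached an upstream
    upstreamTime: m[10] === '-' ? null : Number(m[10]),
  };
}

const sample =
  '203.0.113.7 - - [10/Jan/2025:12:00:01 +0000] "GET /api/orders HTTP/1.1" 200 512 "-" "Mozilla/5.0" 0.042 0.040';
const entry = parseAccessLog(sample);
console.log(entry.status, entry.path, entry.requestTime);
```

The gap between requestTime and upstreamTime is the time spent in Nginx and on the network rather than in the application, which is useful when deciding where a slow request is actually slow.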

Telegram Alerts for Critical Events

Dashboards are great for analysis, but you need push alerts for outages. I run a simple Node.js script as a systemd service that monitors critical metrics and sends Telegram messages.

// alert-bot.js
const https = require('https');
const { execSync } = require('child_process');

const TELEGRAM_BOT_TOKEN = process.env.TELEGRAM_BOT_TOKEN;
const TELEGRAM_CHAT_ID = process.env.TELEGRAM_CHAT_ID;
const CHECK_INTERVAL = 60_000; // 1 minute

// Send a plain-text message through the Telegram Bot API.
// parse_mode is omitted on purpose: error messages may contain "<" or "&",
// which Telegram rejects when asked to parse the text as HTML.
function sendAlert(message) {
  const url = `https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage`;
  const body = JSON.stringify({
    chat_id: TELEGRAM_CHAT_ID,
    text: `[ALERT] ${new Date().toISOString()}\n\n${message}`,
  });

  return new Promise((resolve, reject) => {
    const req = https.request(url, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body),
      },
    }, (res) => {
      res.resume(); // drain the response so the socket is released
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.write(body);
    req.end();
  });
}

// Each check returns an alert message, or null when everything is fine.
function checkDiskUsage() {
  const output = execSync("df -h / | tail -1 | awk '{print $5}'").toString().trim();
  const usage = parseInt(output, 10); // "87%" -> 87
  return usage > 85
    ? `Disk usage critical: ${usage}%\nServer: ${process.env.SERVER_NAME}`
    : null;
}

function checkMemoryUsage() {
  const output = execSync("free | grep Mem | awk '{printf \"%.0f\", $3/$2 * 100}'").toString().trim();
  const usage = parseInt(output, 10);
  return usage > 90
    ? `Memory usage critical: ${usage}%\nServer: ${process.env.SERVER_NAME}`
    : null;
}

function checkContainerHealth() {
  const output = execSync('docker ps --format "{{.Names}} {{.Status}}"').toString();
  const unhealthy = output.split('\n')
    .filter(line => line.includes('unhealthy') || line.includes('Restarting'));
  return unhealthy.length > 0 ? `Unhealthy containers:\n${unhealthy.join('\n')}` : null;
}

// Run every check independently so one failure does not skip the others,
// and await each send so a rejected promise cannot crash the process.
async function runChecks() {
  for (const check of [checkDiskUsage, checkMemoryUsage, checkContainerHealth]) {
    try {
      const message = check();
      if (message) await sendAlert(message);
    } catch (err) {
      console.error(`${check.name} failed: ${err.message}`);
      sendAlert(`Monitor script error: ${err.message}`)
        .catch((sendErr) => console.error(`Could not send alert: ${sendErr.message}`));
    }
  }
}

// Run checks every minute
setInterval(runChecks, CHECK_INTERVAL);
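A check that runs every minute will also alert every minute for as long as the condition persists. A small cooldown keeps the channel readable. This is a sketch of one way to do it; the helper name and the 15-minute window are my own choices and not part of the script above.

```javascript
// Returns a function that allows one alert per key per cooldown window.
function createCooldown(cooldownMs) {
  const lastSent = new Map(); // alert key -> timestamp of the last allowed alert

  return function shouldSend(key, now = Date.now()) {
    const previous = lastSent.get(key);
    if (previous !== undefined && now - previous < cooldownMs) {
      return false; // same alert fired recently; stay quiet
    }
    lastSent.set(key, now);
    return true;
  };
}

// Example: at most one alert per condition every 15 minutes.
const shouldSend = createCooldown(15 * 60 * 1000);
console.log(shouldSend('disk', 0));       // true: first alert goes out
console.log(shouldSend('disk', 60_000));  // false: still inside the window
console.log(shouldSend('memory', 60_000)); // true: a different condition
```

Each check would call shouldSend with a stable key (for example 'disk') before sending. Keying by condition rather than by message text matters because the message changes as the percentage moves.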

Backup Strategy

Backups are the thing you never think about until you need them. I use a two-tier backup approach: local daily backups with 7-day retention, and weekly offsite backups to an S3-compatible bucket.

#!/bin/bash
# /opt/scripts/backup.sh
# Runs daily via cron: 0 3 * * * /opt/scripts/backup.sh

set -euo pipefail

BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d_%H%M%S)
DB_CONTAINER="errandoo-postgres-1"
S3_BUCKET="s3://errandoo-backups"

# Cron does not inherit your login shell's environment, so DB_USER and DB_NAME
# must be set explicitly (for example in the crontab). Fail early if they are not.
: "${DB_USER:?DB_USER is not set}"
: "${DB_NAME:?DB_NAME is not set}"

mkdir -p "$BACKUP_DIR"

# 1. PostgreSQL dump
docker exec "$DB_CONTAINER" pg_dump -U "$DB_USER" -d "$DB_NAME" \
  --format=custom --compress=9 \
  > "$BACKUP_DIR/db_${DATE}.dump"

# 2. Compress uploads directory
tar -czf "$BACKUP_DIR/uploads_${DATE}.tar.gz" /opt/errandoo/uploads/

# 3. Backup Docker volumes metadata
docker volume ls --format '{{.Name}}' > "$BACKUP_DIR/volumes_${DATE}.txt"

# 4. Clean local backups older than 7 days
find "$BACKUP_DIR" -type f -mtime +7 -delete

# 5. Weekly offsite backup (runs on Sundays)
if [ "$(date +%u)" -eq 7 ]; then
  aws s3 cp "$BACKUP_DIR/db_${DATE}.dump" "$S3_BUCKET/weekly/" \
    --storage-class STANDARD_IA
  echo "$(date): Offsite backup completed" >> /var/log/backup.log
fi

echo "$(date): Backup completed" >> /var/log/backup.log
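A cron job that stops running produces no error, only an absence of new files. One way to catch that is to check the age of the newest dump. The sketch below derives the age from the db_YYYYMMDD_HHMMSS.dump naming used by the script; the function names are illustrative, and the 26-hour threshold in the usage note is an assumption.

```javascript
const DUMP_NAME = /^db_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.dump$/;

// Recover the timestamp embedded in a dump file name, or null for other files.
function backupTime(fileName) {
  const m = DUMP_NAME.exec(fileName);
  if (!m) return null;
  const [year, month, day, hour, minute, second] = m.slice(1).map(Number);
  // Local time, matching the `date` call in the backup script
  return new Date(year, month - 1, day, hour, minute, second);
}

// Age of the most recent dump in hours; Infinity when there is none.
function newestBackupAgeHours(fileNames, now = new Date()) {
  const times = fileNames
    .map(backupTime)
    .filter(Boolean)
    .map((time) => time.getTime());
  if (times.length === 0) return Infinity;
  return (now.getTime() - Math.max(...times)) / 3_600_000;
}

const files = ['db_20250110_030000.dump', 'db_20250111_030000.dump', 'uploads_20250111_030000.tar.gz'];
const ageHours = newestBackupAgeHours(files, new Date(2025, 0, 11, 9, 0, 0));
console.log(ageHours); // prints 6
```

In practice you would pass the result of fs.readdirSync('/opt/backups') and alert when the age exceeds roughly 26 hours, which leaves some slack around the daily 03:00 run.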

Cost Comparison: VPS vs Managed Services

Here is a real cost comparison for running Errandoo's stack (Next.js frontend, NestJS API, PostgreSQL, Redis, background workers):

Component	Single VPS	Managed Services (AWS/Vercel)
Compute	$12/mo (4GB/2vCPU VPS)	$73/mo (EC2 t3.medium or Vercel Pro)
Database	Included	$25/mo (RDS db.t3.micro)
Redis/Cache	Included	$15/mo (ElastiCache t3.micro)
SSL	Free (Let's Encrypt)	Free (ACM)
Monitoring	Included (self-hosted)	$30/mo (Datadog/New Relic basic)
Storage (50GB)	Included	$5/mo (EBS)
Bandwidth	Included (2TB)	$9/mo (100GB egress)
Total	$12/mo	$157/mo

That is a 13x cost difference. The managed services approach has genuine advantages -- automated failover, managed patches, less operational work -- but for a bootstrapped product or side project, $12/month versus $157/month is the difference between sustainable and not.
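The totals are easy to verify. The sketch below just adds up the managed-services column from the table and derives the ratio and the yearly difference; the figures are the ones quoted above.

```javascript
// Monthly figures in USD, copied from the comparison table.
const managed = {
  compute: 73,
  database: 25,
  cache: 15,
  ssl: 0,
  monitoring: 30,
  storage: 5,
  bandwidth: 9,
};
const vpsMonthly = 12;

const managedMonthly = Object.values(managed).reduce((sum, cost) => sum + cost, 0);
const ratio = managedMonthly / vpsMonthly;
const yearlyDifference = (managedMonthly - vpsMonthly) * 12;

console.log(`managed: $${managedMonthly}/mo, ratio: ${ratio.toFixed(1)}x, yearly difference: $${yearlyDifference}`);
// prints "managed: $157/mo, ratio: 13.1x, yearly difference: $1740"
```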

Security Hardening Checklist

Running your own VPS means you are responsible for security. Here is the checklist I follow for every new server.

SSH hardening: Disable password auth, use key-only login, change default port, install fail2ban
Firewall: UFW with default-deny incoming, allow only 80, 443, and your SSH port
Automatic updates: Enable unattended-upgrades for security patches
Docker security: Run containers as non-root, never expose database ports to host, use read-only filesystems where possible
Secrets management: Use .env files with restricted permissions (chmod 600), never commit to git
Log monitoring: Loki alerts on repeated 401/403 responses and suspicious patterns
Backups: Tested restore procedure (a backup you have not tested is not a backup)

When to Scale Beyond a Single VPS

A single VPS is not forever. Here are the signals that it is time to scale: your database needs more than 80% of available RAM for its working set, you need sub-second failover for high-availability requirements, you are serving users across multiple continents and latency matters, or your deployments cannot tolerate even a few seconds of downtime.

When that time comes, the Docker Compose setup translates cleanly to Docker Swarm or even Kubernetes -- the containers, health checks, and networking concepts are the same. But do not start there. Start with a $12 VPS and scale when the metrics tell you to, not when Hacker News tells you to.

The Case for a Single VPS

Before diving into configuration files, let me address the question you are already thinking: when is a single VPS enough?

The Complete Docker Compose Configuration

This is the actual docker-compose.yml I use in production, annotated with explanations for every decision.

# docker-compose.prod.yml
version: "3.8"

services:
  # ---------- Frontend ----------
  web:
    build:
      context: .
      dockerfile: apps/web/Dockerfile
      args:
        - NODE_ENV=production
    restart: unless-stopped
    mem_limit: 512m
    cpus: 0.5
    environment:
      - NODE_ENV=production
      - NEXT_PUBLIC_API_URL=https://api.errandoo.com
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    networks:
      - frontend
    depends_on:
      api:
        condition: service_healthy

  # ---------- Backend API ----------
  api:
    build:
      context: .
      dockerfile: apps/api/Dockerfile
    restart: unless-stopped
    mem_limit: 2g
    cpus: 1.5
    env_file: .env.production
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:5000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
    networks:
      - frontend
      - backend
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  # ---------- Database ----------
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    mem_limit: 1g
    cpus: 1.0
    environment:
      POSTGRES_DB: ${DB_NAME}
      POSTGRES_USER: ${DB_USER}
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./scripts/init-db.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${DB_USER} -d ${DB_NAME}"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend
    # Never expose DB port to host in production
    # ports are intentionally omitted

  # ---------- Cache / Queue ----------
  redis:
    image: valkey/valkey:8-alpine
    restart: unless-stopped
    mem_limit: 256m
    command: valkey-server --maxmemory 200mb --maxmemory-policy allkeys-lru
    volumes:
      - redisdata:/data
    healthcheck:
      test: ["CMD", "valkey-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backend

  # ---------- Background Workers ----------
  worker:
    build:
      context: .
      dockerfile: apps/api/Dockerfile
    command: node dist/worker.js
    restart: unless-stopped
    mem_limit: 512m
    cpus: 0.5
    env_file: .env.production
    networks:
      - backend
    depends_on:
      api:
        condition: service_healthy

volumes:
  pgdata:
    driver: local
  redisdata:
    driver: local

networks:
  frontend:
  backend:

Multi-Stage Docker Builds

Image size directly impacts deployment speed and memory usage. A naive Node.js Dockerfile produces images over 1GB. Multi-stage builds cut that to under 200MB.

# apps/api/Dockerfile
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json pnpm-lock.yaml ./
RUN corepack enable && pnpm install --frozen-lockfile --prod=false

# Stage 2: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN corepack enable && pnpm run build
# Prune dev dependencies after build
RUN pnpm prune --prod

# Stage 3: Production runtime
FROM node:20-alpine AS runner
WORKDIR /app

# Security: run as non-root user
RUN addgroup --system appgroup && adduser --system appuser --ingroup appgroup

# Copy only what we need
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/package.json ./

USER appuser
EXPOSE 5000

HEALTHCHECK --interval=30s --timeout=10s --retries=3 \
  CMD wget --spider -q http://localhost:5000/health || exit 1

CMD ["node", "dist/main.js"]

Nginx Configuration: The Full Picture

Nginx is the front door to everything. It handles SSL termination, compression, rate limiting, WebSocket proxying, and static file caching. Here is the complete configuration.

# /etc/nginx/nginx.conf
user www-data;
worker_processes auto;
pid /run/nginx.pid;
worker_rlimit_nofile 65535;

events {
    worker_connections 4096;
    multi_accept on;
    use epoll;
}

http {
    # ---------- Basic Settings ----------
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    server_tokens off;  # Hide Nginx version
    client_max_body_size 50m;

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    # ---------- Gzip Compression ----------
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 4;
    gzip_min_length 256;
    gzip_types
        text/plain
        text/css
        text/xml
        text/javascript
        application/json
        application/javascript
        application/xml
        application/rss+xml
        image/svg+xml;

    # ---------- Rate Limiting ----------
    limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
    limit_req_zone $binary_remote_addr zone=auth:10m rate=3r/s;
    limit_req_status 429;

    # ---------- Logging ----------
    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    '$request_time $upstream_response_time';

    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log warn;

    # ---------- SSL Settings ----------
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
    ssl_prefer_server_ciphers off;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 1d;
    ssl_session_tickets off;

    # OCSP Stapling
    ssl_stapling on;
    ssl_stapling_verify on;

    # ---------- Security Headers ----------
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-XSS-Protection "1; mode=block" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains" always;

    include /etc/nginx/conf.d/*.conf;
}

# /etc/nginx/conf.d/errandoo.conf
upstream web_app {
    server 127.0.0.1:3000;
    keepalive 32;
}

upstream api_app {
    server 127.0.0.1:5000;
    keepalive 32;
}

# Redirect HTTP to HTTPS
server {
    listen 80;
    server_name errandoo.com www.errandoo.com api.errandoo.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;
    server_name errandoo.com www.errandoo.com;

    ssl_certificate /etc/letsencrypt/live/errandoo.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/errandoo.com/privkey.pem;

    # Static assets with aggressive caching
    location /_next/static/ {
        proxy_pass http://web_app;
        expires 365d;
        add_header Cache-Control "public, immutable";
    }

    # General rate limit for pages
    location / {
        limit_req zone=general burst=20 nodelay;
        proxy_pass http://web_app;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

server {
    listen 443 ssl http2;
    server_name api.errandoo.com;

    ssl_certificate /etc/letsencrypt/live/errandoo.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/errandoo.com/privkey.pem;

    # Auth endpoints: strict rate limit
    location /auth/ {
        limit_req zone=auth burst=5 nodelay;
        proxy_pass http://api_app;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    # WebSocket endpoint for Socket.IO
    location /socket.io/ {
        proxy_pass http://api_app;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_read_timeout 86400;  # Keep WS connections alive for 24h
    }

    # API endpoints
    location / {
        limit_req zone=api burst=50 nodelay;
        proxy_pass http://api_app;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Let's Encrypt with Auto-Renewal

SSL certificates are non-negotiable in production. Let's Encrypt provides them for free, and Certbot automates the entire lifecycle.

# Initial certificate generation
sudo certbot certonly --nginx \
  -d errandoo.com \
  -d www.errandoo.com \
  -d api.errandoo.com \
  --email admin@errandoo.com \
  --agree-tos \
  --no-eff-email

# Auto-renewal is set up via systemd timer (Certbot installs this automatically)
# Verify with:
sudo systemctl list-timers | grep certbot

# Manual test of renewal
sudo certbot renew --dry-run

# Post-renewal hook to reload Nginx
# /etc/letsencrypt/renewal-hooks/deploy/reload-nginx.sh
#!/bin/bash
systemctl reload nginx
echo "$(date): Nginx reloaded after cert renewal" >> /var/log/certbot-deploy.log

Certbot's systemd timer runs twice daily and only renews certificates within 30 days of expiration. The deploy hook reloads Nginx after renewal so the new certificate takes effect without downtime.

Monitoring Stack: Grafana, Loki, and Uptime Kuma

# docker-compose.monitoring.yml
version: "3.8"

services:
  # ---------- Log Aggregation ----------
  loki:
    image: grafana/loki:2.9.0
    restart: unless-stopped
    mem_limit: 256m
    volumes:
      - ./monitoring/loki-config.yml:/etc/loki/local-config.yaml
      - lokidata:/loki
    command: -config.file=/etc/loki/local-config.yaml
    networks:
      - monitoring

  promtail:
    image: grafana/promtail:2.9.0
    restart: unless-stopped
    mem_limit: 128m
    volumes:
      - ./monitoring/promtail-config.yml:/etc/promtail/config.yml
      - /var/log:/var/log:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
    command: -config.file=/etc/promtail/config.yml
    networks:
      - monitoring
    depends_on:
      - loki

  # ---------- Dashboards ----------
  grafana:
    image: grafana/grafana:10.2.0
    restart: unless-stopped
    mem_limit: 256m
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SERVER_ROOT_URL=https://monitor.errandoo.com
    volumes:
      - grafanadata:/var/lib/grafana
      - ./monitoring/provisioning:/etc/grafana/provisioning
    networks:
      - monitoring
      - frontend

  # ---------- Uptime Monitoring ----------
  uptime-kuma:
    image: louislam/uptime-kuma:1
    restart: unless-stopped
    mem_limit: 256m
    volumes:
      - uptimedata:/app/data
    networks:
      - monitoring
      - frontend

volumes:
  lokidata:
  grafanadata:
  uptimedata:

networks:
  monitoring:
  frontend:
    external: true

Telegram Alerts for Critical Events

Dashboards are great for analysis, but you need push alerts for outages. I run a simple Node.js script as a systemd service that monitors critical metrics and sends Telegram messages.

// alert-bot.js
const https = require('https');
const { execSync } = require('child_process');

const TELEGRAM_BOT_TOKEN = process.env.TELEGRAM_BOT_TOKEN;
const TELEGRAM_CHAT_ID = process.env.TELEGRAM_CHAT_ID;
const CHECK_INTERVAL = 60_000; // 1 minute

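// Send a message to Telegram. This never rejects: a failed alert is logged
// instead, so callers that do not await the result cannot crash the process
// with an unhandled rejection.
function sendAlert(message) {
  const url = `https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage`;
  // Sent as plain text (no parse_mode): error messages may contain characters
  // such as "<" that Telegram rejects as malformed HTML.
  const body = JSON.stringify({
    chat_id: TELEGRAM_CHAT_ID,
    text: `[ALERT] ${new Date().toISOString()}\n\n${message}`,
  });

  return new Promise((resolve) => {
    const req = https.request(url, { method: 'POST', headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body),
    }}, (res) => {
      if (res.statusCode !== 200) {
        console.error(`Telegram API returned HTTP ${res.statusCode}`);
      }
      res.resume(); // drain the response so the socket is released
      res.on('end', resolve);
    });
    req.on('error', (err) => {
      console.error(`Failed to send alert: ${err.message}`);
      resolve();
    });
    req.write(body);
    req.end();
  });
}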

function checkDiskUsage() {
  const output = execSync("df -h / | tail -1 | awk '{print $5}'").toString().trim();
  const usage = parseInt(output);
  if (usage > 85) {
    sendAlert(`Disk usage critical: ${usage}%\nServer: ${process.env.SERVER_NAME}`);
  }
}

function checkMemoryUsage() {
  const output = execSync("free | grep Mem | awk '{printf \"%.0f\", $3/$2 * 100}'").toString().trim();
  const usage = parseInt(output);
  if (usage > 90) {
    sendAlert(`Memory usage critical: ${usage}%\nServer: ${process.env.SERVER_NAME}`);
  }
}

function checkContainerHealth() {
  const output = execSync('docker ps --format "{{.Names}} {{.Status}}"').toString();
  const unhealthy = output.split('\n')
    .filter(line => line.includes('unhealthy') || line.includes('Restarting'));

  if (unhealthy.length > 0) {
    sendAlert(`Unhealthy containers:\n${unhealthy.join('\n')}`);
  }
}

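// Run checks every minute. Each check is isolated so that one failing
// command (for example, docker not responding) does not skip the others.
setInterval(() => {
  for (const check of [checkDiskUsage, checkMemoryUsage, checkContainerHealth]) {
    try {
      check();
    } catch (err) {
      sendAlert(`Monitor script error in ${check.name}: ${err.message}`)
        .catch(console.error);
    }
  }
}, CHECK_INTERVAL);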
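
The script reads its settings from environment variables and is meant to run as a systemd service. A minimal sketch of installing it that way follows. The unit name, the `/opt/scripts/alert-bot.js` path, the `/etc/alert-bot.env` file, and the location of the `node` binary are all assumptions rather than details of the setup described here.

```shell
#!/bin/bash
# install-alert-bot.sh: hypothetical installer; every path below is an assumption.
set -euo pipefail

# Write the unit file into directory $1 (normally /etc/systemd/system).
write_unit() {
  cat > "$1/alert-bot.service" <<'EOF'
[Unit]
Description=Telegram alert bot
After=network-online.target docker.service
Wants=network-online.target

[Service]
# Holds TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID and SERVER_NAME (chmod 600)
EnvironmentFile=/etc/alert-bot.env
ExecStart=/usr/bin/node /opt/scripts/alert-bot.js
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
}

# Nothing runs unless explicitly requested: sudo ./install-alert-bot.sh install
if [ "${1:-}" = "install" ]; then
  write_unit /etc/systemd/system
  systemctl daemon-reload
  systemctl enable --now alert-bot.service
fi
```

With no `User=` line the service runs as root, which is what lets `docker ps` work without extra group setup. Keeping the token in an `EnvironmentFile` rather than in the unit itself matches the advice to keep secrets in files with restricted permissions.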

Backup Strategy

Backups are the thing you never think about until you need them. I use a two-tier backup approach: local daily backups with 7-day retention, and weekly offsite backups to an S3-compatible bucket.

#!/bin/bash
# /opt/scripts/backup.sh
# Runs daily via cron: 0 3 * * * /opt/scripts/backup.sh

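set -euo pipefail

BACKUP_DIR="/opt/backups"
DATE=$(date +%Y%m%d_%H%M%S)
DB_CONTAINER="errandoo-postgres-1"
S3_BUCKET="s3://errandoo-backups"

# Cron starts jobs with a minimal environment. Fail with a clear message
# if the database credentials were not provided to this job.
: "${DB_USER:?DB_USER must be set}"
: "${DB_NAME:?DB_NAME must be set}"

mkdir -p "$BACKUP_DIR"

# 1. PostgreSQL dump
docker exec "$DB_CONTAINER" pg_dump -U "$DB_USER" -d "$DB_NAME" \
  --format=custom --compress=9 \
  > "$BACKUP_DIR/db_${DATE}.dump"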

# 2. Compress uploads directory
tar -czf "$BACKUP_DIR/uploads_${DATE}.tar.gz" /opt/errandoo/uploads/

# 3. Backup Docker volumes metadata
docker volume ls --format '{{.Name}}' > "$BACKUP_DIR/volumes_${DATE}.txt"

# 4. Clean local backups older than 7 days
find "$BACKUP_DIR" -type f -mtime +7 -delete

# 5. Weekly offsite backup (runs on Sundays)
if [ "$(date +%u)" -eq 7 ]; then
  aws s3 cp "$BACKUP_DIR/db_${DATE}.dump" "$S3_BUCKET/weekly/" \
    --storage-class STANDARD_IA
  echo "$(date): Offsite backup completed" >> /var/log/backup.log
fi

echo "$(date): Backup completed" >> /var/log/backup.log
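
A dump file on disk only proves that `pg_dump` ran, not that the data can be restored. The following sketch restores the newest dump into a throwaway database and counts the tables that came back. It is an illustration under stated assumptions: the script name, the `restore_check` database name, and the function names are hypothetical, `DB_USER` is assumed to be allowed to create databases, and the tables are assumed to live in the `public` schema.

```shell
#!/bin/bash
# /opt/scripts/verify-backup.sh: hypothetical companion to backup.sh.
set -euo pipefail

BACKUP_DIR="${BACKUP_DIR:-/opt/backups}"
DB_CONTAINER="${DB_CONTAINER:-errandoo-postgres-1}"
SCRATCH_DB="restore_check"   # assumed name for a throwaway database

# Print the newest db_*.dump in directory $1. The timestamped names written
# by backup.sh sort chronologically, so the last glob match is the newest.
latest_dump() {
  local dir="$1" newest="" f
  for f in "$dir"/db_*.dump; do
    if [ -e "$f" ]; then newest="$f"; fi
  done
  if [ -z "$newest" ]; then
    echo "no db_*.dump files in $dir" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}

# Restore dump $1 into the scratch database, print its table count, clean up.
verify_restore() {
  local dump="$1"
  : "${DB_USER:?DB_USER must be set}"
  # Reading the archive's table of contents catches truncated files early.
  docker exec -i "$DB_CONTAINER" pg_restore --list < "$dump" > /dev/null
  docker exec "$DB_CONTAINER" dropdb -U "$DB_USER" --if-exists "$SCRATCH_DB"
  docker exec "$DB_CONTAINER" createdb -U "$DB_USER" "$SCRATCH_DB"
  docker exec -i "$DB_CONTAINER" pg_restore -U "$DB_USER" -d "$SCRATCH_DB" \
    --no-owner --exit-on-error < "$dump"
  docker exec "$DB_CONTAINER" psql -U "$DB_USER" -d "$SCRATCH_DB" -tAc \
    "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'"
  docker exec "$DB_CONTAINER" dropdb -U "$DB_USER" "$SCRATCH_DB"
}

# Nothing runs unless explicitly requested: DB_USER=... ./verify-backup.sh run
if [ "${1:-}" = "run" ]; then
  dump=$(latest_dump "$BACKUP_DIR")
  verify_restore "$dump"
fi
```

A table count of zero, or any non-zero exit status, means the backup should not be trusted. Comparing row counts of a few key tables against production would be a stronger check, at the cost of tying the script to a specific schema.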

Cost Comparison: VPS vs Managed Services

Here is a real cost comparison for running Errandoo's stack (Next.js frontend, NestJS API, PostgreSQL, Redis, background workers):

Component	Single VPS	Managed Services (AWS/Vercel)
Compute	$12/mo (4GB/2vCPU VPS)	$73/mo (EC2 t3.medium or Vercel Pro)
Database	Included	$25/mo (RDS db.t3.micro)
Redis/Cache	Included	$15/mo (ElastiCache t3.micro)
SSL	Free (Let's Encrypt)	Free (ACM)
Monitoring	Included (self-hosted)	$30/mo (Datadog/New Relic basic)
Storage (50GB)	Included	$5/mo (EBS)
Bandwidth	Included (2TB)	$9/mo (100GB egress)
Total	$12/mo	$157/mo

Security Hardening Checklist

Running your own VPS means you are responsible for security. Here is the checklist I follow for every new server.

SSH hardening: Disable password auth, use key-only login, change default port, install fail2ban
Firewall: UFW with default-deny incoming, allow only 80, 443, and your SSH port
Automatic updates: Enable unattended-upgrades for security patches
Docker security: Run containers as non-root, never expose database ports to host, use read-only filesystems where possible
Secrets management: Use .env files with restricted permissions (chmod 600), never commit to git
Log monitoring: Loki alerts on repeated 401/403 responses and suspicious patterns
Backups: Tested restore procedure (a backup you have not tested is not a backup)
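
The SSH, firewall, and automatic-update items translate into a handful of commands. What follows is a minimal sketch rather than a complete hardening script. It assumes Debian or Ubuntu with OpenSSH and `ufw` installed; port 2222, the drop-in file name, and the function names are illustrative assumptions, not values from the servers described here.

```shell
#!/bin/bash
# harden.sh: sketch of the SSH, firewall, and update items from the checklist.
# Assumptions (not from the article): Debian/Ubuntu, SSH moved to port 2222.
set -euo pipefail

SSH_PORT="${SSH_PORT:-2222}"

# Write an sshd drop-in into directory $1 that sets port $2 and disables
# password logins. sshd keeps the first value it reads for most keywords,
# so a low-numbered file name wins over later drop-ins in the same directory.
write_sshd_dropin() {
  local dir="$1" port="$2"
  cat > "$dir/00-hardening.conf" <<EOF
Port $port
PasswordAuthentication no
PubkeyAuthentication yes
EOF
}

apply_hardening() {
  # SSH: key-only login on a non-default port; validate before reloading.
  write_sshd_dropin /etc/ssh/sshd_config.d "$SSH_PORT"
  sshd -t
  systemctl reload ssh

  # Firewall: default-deny incoming, allow only HTTP, HTTPS, and SSH.
  ufw default deny incoming
  ufw default allow outgoing
  ufw allow 80/tcp
  ufw allow 443/tcp
  ufw allow "${SSH_PORT}/tcp"
  ufw --force enable

  # Brute-force protection and automatic security patches.
  apt-get update
  apt-get install -y fail2ban unattended-upgrades
}

# Nothing runs unless explicitly requested: sudo ./harden.sh apply
if [ "${1:-}" = "apply" ]; then
  apply_hardening
fi
```

Keep the current SSH session open and confirm a new login on the new port from a second terminal before closing it. On recent Ubuntu releases where sshd is socket-activated, the port change may also require restarting `ssh.socket`. Ports published by Docker are opened through Docker's own iptables rules and are not filtered by UFW, which is one more reason never to publish database ports to the host.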
