Real-Time Delivery Tracking with NestJS, Socket.IO, and PostGIS
Abhishek Sharma
Software Developer
Errandoo is a hyperlocal delivery platform—think DoorDash but for a specific metropolitan market where riders pick up anything from groceries to pharmacy orders and deliver within a 5km radius. The defining technical challenge is real-time tracking: from the moment a rider accepts a task, the customer needs to see a live dot moving on a map, with position updates every 3-5 seconds. This is not a polling problem. This is a WebSocket problem, and here is how I built it with NestJS, Socket.IO, and PostGIS.
Why WebSockets Over Server-Sent Events
The first architectural question is always "why not SSE?" Server-Sent Events are simpler, work over standard HTTP, and handle the primary use case (server pushing location updates to the customer) perfectly. Here is why I chose WebSockets anyway:
- Bidirectional communication. Riders send GPS coordinates to the server. Customers receive those coordinates from the server. SSE only handles server-to-client. With SSE, riders would need a separate REST endpoint for sending updates, which means two connection management paths instead of one.
- Room-based broadcasting. Socket.IO rooms are a first-class primitive that maps perfectly to our domain: one room per active task, with the rider and customer both joined. Broadcasting to a room is a single call. With SSE, I would need to maintain subscriber lists manually and iterate over connections.
- Connection state management. Socket.IO tracks connection/disconnection events, supports automatic reconnection with exponential backoff, and handles transport fallback (WebSocket to long-polling) transparently. Building this on top of SSE means reinventing Socket.IO poorly.
- Acknowledgements. When the rider sends a location update, the server can acknowledge receipt. This matters for detecting stale connections: if acknowledgements stop arriving, the rider app knows its updates are not reaching the server and can reconnect instead of assuming the link is healthy.
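To make the room-based broadcasting point concrete, here is a rough sketch of the bookkeeping an SSE-based design would have to own itself. The `RoomRegistry` class and its method names are hypothetical and are not part of Socket.IO or Errandoo's code; Socket.IO's `join`, `leave`, and `to(room).emit()` cover the same ground out of the box.

```typescript
// Hypothetical sketch: the manual subscriber tracking that Socket.IO rooms replace.
type Send = (event: string, payload: unknown) => void;

class RoomRegistry {
  private readonly rooms = new Map<string, Map<string, Send>>();

  join(room: string, clientId: string, send: Send): void {
    const members = this.rooms.get(room) ?? new Map<string, Send>();
    members.set(clientId, send);
    this.rooms.set(room, members);
  }

  leave(room: string, clientId: string): void {
    const members = this.rooms.get(room);
    if (!members) return;
    members.delete(clientId);
    // Drop empty rooms so finished tasks do not accumulate in memory.
    if (members.size === 0) this.rooms.delete(room);
  }

  // Deliver to every member except the sender; returns how many were reached.
  broadcast(room: string, senderId: string, event: string, payload: unknown): number {
    const members = this.rooms.get(room);
    if (!members) return 0;
    let delivered = 0;
    members.forEach((send, clientId) => {
      if (clientId === senderId) return;
      send(event, payload);
      delivered++;
    });
    return delivered;
  }

  roomCount(): number {
    return this.rooms.size;
  }
}
```

Even this toy version ignores dead connections, reconnection, and multi-instance fan-out, which is the part Socket.IO handles for you.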
The tradeoff is complexity. WebSockets are stateful connections that do not play well with horizontal scaling without a Redis adapter. For Errandoo's scale (hundreds of concurrent deliveries, not tens of thousands), a single NestJS instance with Socket.IO handles the load comfortably. The Redis adapter is implemented, but for now it mainly keeps rooms in sync while an old and a new instance run side by side during deployments.
System Architecture
+----------------+        +------------------+        +----------------+
|   Rider App    |<--WS-->|  NestJS Gateway  |<--WS-->|  Customer App  |
| (React Native) |        |   (Socket.IO)    |        | (Next.js PWA)  |
+----------------+        +------------------+        +----------------+
                                   |
                    +--------------+--------------+
                    |              |              |
               +----------+   +----------+  +------------+
               | PostGIS  |   |  BullMQ  |  |   Redis    |
               | (Spatial |   |  (Async  |  | (Socket.IO |
               | Queries) |   |  Jobs)   |  |  Adapter)  |
               +----------+   +----------+  +------------+
                                   |
                     +-------------+-------------+
                     |             |             |
                +---------+   +---------+  +-----------+
                |Fast2SMS |   |   FCM   |  | Cashfree  |
                |  (OTP)  |   | (Push)  |  | (Payment) |
                +---------+   +---------+  +-----------+
Monitoring: Grafana + Loki + Sentry + Uptime Kuma
The NestJS WebSocket Gateway
NestJS has first-class support for WebSocket gateways through the @nestjs/websockets package. The gateway is a class decorated with @WebSocketGateway that handles connection lifecycle and message events. Here is the tracking gateway:
// src/tracking/tracking.gateway.ts
import {
WebSocketGateway,
WebSocketServer,
SubscribeMessage,
OnGatewayConnection,
OnGatewayDisconnect,
ConnectedSocket,
MessageBody,
} from "@nestjs/websockets";
import { Server, Socket } from "socket.io";
import { UseGuards, Logger } from "@nestjs/common";
import { WsJwtGuard } from "../auth/guards/ws-jwt.guard";
import { TrackingService } from "./tracking.service";
interface LocationUpdate {
taskId: string;
latitude: number;
longitude: number;
heading: number; // compass bearing in degrees
speed: number; // km/h
accuracy: number; // GPS accuracy in meters
timestamp: number; // Unix ms from device
}
@WebSocketGateway({
cors: {
origin: process.env.ALLOWED_ORIGINS?.split(",") || ["http://localhost:3000"],
credentials: true,
},
namespace: "/tracking",
pingInterval: 10000, // 10s ping to detect dead connections
pingTimeout: 5000, // 5s timeout before considering disconnected
})
export class TrackingGateway implements OnGatewayConnection, OnGatewayDisconnect {
@WebSocketServer()
server: Server;
private readonly logger = new Logger(TrackingGateway.name);
constructor(private readonly trackingService: TrackingService) {}
async handleConnection(client: Socket) {
try {
const user = await this.trackingService.authenticateSocket(client);
client.data.userId = user.id;
client.data.role = user.role;
this.logger.log(`Client connected: ${user.id} (${user.role})`);
} catch (error) {
this.logger.warn(`Unauthorized connection attempt: ${error.message}`);
client.disconnect();
}
}
async handleDisconnect(client: Socket) {
const { userId, role } = client.data;
this.logger.log(`Client disconnected: ${userId} (${role})`);
// If a rider disconnects, mark them as offline after a grace period
if (role === "rider") {
await this.trackingService.handleRiderDisconnect(userId);
}
}
@SubscribeMessage("join:task")
async handleJoinTask(
@ConnectedSocket() client: Socket,
@MessageBody() data: { taskId: string },
) {
const canJoin = await this.trackingService.canAccessTask(
client.data.userId,
data.taskId,
);
if (!canJoin) {
client.emit("error", { message: "Not authorized for this task" });
return;
}
await client.join(`task:${data.taskId}`);
this.logger.log(`${client.data.userId} joined room task:${data.taskId}`);
// Send the last known location immediately
const lastLocation = await this.trackingService.getLastLocation(data.taskId);
if (lastLocation) {
client.emit("location:current", lastLocation);
}
}
@SubscribeMessage("location:update")
async handleLocationUpdate(
@ConnectedSocket() client: Socket,
@MessageBody() data: LocationUpdate,
) {
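// Only riders can send location updates, and only for a task room they have
// already joined (join:task is where the canAccessTask check happens)
if (client.data.role !== "rider" || !client.rooms.has(`task:${data.taskId}`)) {
return;
}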
// Validate GPS data quality
if (data.accuracy > 50) {
// GPS accuracy worse than 50m: skip this update
return;
}
// Persist location to PostGIS
await this.trackingService.saveLocation(client.data.userId, data);
// Broadcast to everyone in the task room EXCEPT the sender
client.to(`task:${data.taskId}`).emit("location:update", {
latitude: data.latitude,
longitude: data.longitude,
heading: data.heading,
speed: data.speed,
timestamp: data.timestamp,
serverTimestamp: Date.now(),
});
}
@SubscribeMessage("task:status")
async handleTaskStatus(
@ConnectedSocket() client: Socket,
@MessageBody() data: { taskId: string; status: string },
) {
if (client.data.role !== "rider") return;
await this.trackingService.updateTaskStatus(data.taskId, data.status);
// Broadcast status change to the room
this.server.to(`task:${data.taskId}`).emit("task:status", {
taskId: data.taskId,
status: data.status,
timestamp: Date.now(),
});
// If task is completed, clean up the room
if (data.status === "delivered") {
const sockets = await this.server.in(`task:${data.taskId}`).fetchSockets();
for (const socket of sockets) {
socket.leave(`task:${data.taskId}`);
}
}
}
}
Several design decisions in this gateway are worth examining:
GPS accuracy filtering. Mobile GPS is unreliable. Indoors, under bridges, in urban canyons—accuracy degrades to 100m+. Showing a rider jumping 100 meters between updates destroys user confidence. We filter out any update with accuracy worse than 50 meters. The customer sees a smooth (slightly delayed) track rather than a jittery one.
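The accuracy check can be pulled into a small, testable helper. The sketch below mirrors the gateway's 50-meter cutoff; the `isUsableFix` name and the extra coordinate-range validation are assumptions added for illustration, not something the gateway above does.

```typescript
// Hypothetical helper mirroring the gateway's 50 m accuracy cutoff.
interface GpsFix {
  latitude: number;
  longitude: number;
  accuracy: number; // reported GPS accuracy radius in meters
}

const MAX_ACCURACY_METERS = 50;

function isUsableFix(fix: GpsFix, maxAccuracy: number = MAX_ACCURACY_METERS): boolean {
  // Assumption beyond the original filter: reject NaN/Infinity and
  // out-of-range coordinates coming from a misbehaving client.
  const coordsValid =
    Number.isFinite(fix.latitude) &&
    Number.isFinite(fix.longitude) &&
    Math.abs(fix.latitude) <= 90 &&
    Math.abs(fix.longitude) <= 180;
  // A larger accuracy value means a less certain fix.
  const accurateEnough =
    Number.isFinite(fix.accuracy) && fix.accuracy >= 0 && fix.accuracy <= maxAccuracy;
  return coordsValid && accurateEnough;
}
```

A fix reporting exactly 50 meters still passes, matching the gateway's `accuracy > 50` condition.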
Server timestamp injection. Every broadcast carries both the device timestamp and a server timestamp. The gap between the two approximates how long the update took to travel from the rider's phone to the server (assuming the device clock is roughly in sync), and the client can use it to adjust interpolation: an update stamped two seconds before it reached the server describes a position the rider has already left.
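One way a client might use the two timestamps is simple dead reckoning: estimate the delay, then project the last known position forward along the reported heading. This is a minimal sketch, not Errandoo's client code. The function names are hypothetical, the delay estimate assumes the device and server clocks are roughly synchronized, and the flat-earth math is only reasonable over the few meters a rider covers in a couple of seconds.

```typescript
interface TrackedPosition {
  latitude: number;
  longitude: number;
  heading: number; // compass bearing in degrees, 0 = north, 90 = east
  speed: number; // km/h, as in the LocationUpdate payload
}

const METERS_PER_DEGREE_LAT = 111_320; // approximate

// Device-to-server delay, clamped at zero to tolerate small clock skew.
function uplinkDelayMs(update: { timestamp: number; serverTimestamp: number }): number {
  return Math.max(0, update.serverTimestamp - update.timestamp);
}

// Project a position forward by delayMs along its heading at its current speed.
function projectForward(
  pos: TrackedPosition,
  delayMs: number,
): { latitude: number; longitude: number } {
  const distanceMeters = (pos.speed / 3.6) * (delayMs / 1000);
  const bearing = (pos.heading * Math.PI) / 180;
  const northMeters = distanceMeters * Math.cos(bearing);
  const eastMeters = distanceMeters * Math.sin(bearing);
  // A degree of longitude shrinks with the cosine of the latitude.
  const metersPerDegreeLng =
    METERS_PER_DEGREE_LAT * Math.cos((pos.latitude * Math.PI) / 180);
  return {
    latitude: pos.latitude + northMeters / METERS_PER_DEGREE_LAT,
    longitude: pos.longitude + eastMeters / metersPerDegreeLng,
  };
}
```

A rider heading due north at 36 km/h moves about 10 meters in one second, so a one-second delay shifts the projected latitude by roughly 0.00009 degrees.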
Room cleanup on delivery. When a task reaches "delivered" status, all sockets are explicitly removed from the room. This prevents memory leaks from accumulated rooms. In a 24-hour period, Errandoo processes hundreds of deliveries—each creates a room, and each needs to be cleaned up.
Handling Disconnections and Reconnections
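Network reliability on mobile devices in delivery scenarios is poor. Riders move through areas with bad coverage, enter basements for pickups, and switch between WiFi and cellular. The disconnection handling strategy has three layers: a grace period before a rider is treated as offline, a status update once that period expires, and a customer notification for any task still in progress: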
// src/tracking/tracking.service.ts (partial)
import { Injectable, Logger } from "@nestjs/common";
import { PrismaService } from "../prisma/prisma.service";
import { InjectQueue } from "@nestjs/bullmq";
import { Queue } from "bullmq";
@Injectable()
export class TrackingService {
private readonly logger = new Logger(TrackingService.name);
private readonly OFFLINE_GRACE_PERIOD_MS = 30_000; // 30 seconds
constructor(
private readonly prisma: PrismaService,
@InjectQueue("notifications") private notificationQueue: Queue,
) {}
async handleRiderDisconnect(riderId: string) {
// Don't immediately mark offline. Mobile connections are flaky.
// Wait 30 seconds, then check if they reconnected.
setTimeout(async () => {
const rider = await this.prisma.riderSession.findFirst({
where: { riderId, isConnected: true },
});
if (!rider) {
// Still disconnected after grace period
await this.prisma.rider.update({
where: { id: riderId },
data: { status: "OFFLINE", lastSeen: new Date() },
});
// Check if rider has active tasks
const activeTasks = await this.prisma.task.findMany({
where: { riderId, status: { in: ["ACCEPTED", "PICKED_UP", "IN_TRANSIT"] } },
});
for (const task of activeTasks) {
// Notify the customer that rider connectivity was lost
await this.notificationQueue.add("rider-offline", {
taskId: task.id,
customerId: task.customerId,
message: "Your rider is temporarily offline. Tracking will resume when they reconnect.",
});
}
this.logger.warn(
`Rider ${riderId} offline with ${activeTasks.length} active tasks`,
);
}
}, this.OFFLINE_GRACE_PERIOD_MS);
}
async saveLocation(riderId: string, data: LocationUpdate) {
// Update rider's current position (PostGIS geography type)
await this.prisma.$executeRaw`
UPDATE "Rider"
SET
current_location = ST_SetSRID(ST_MakePoint(${data.longitude}, ${data.latitude}), 4326)::geography,
heading = ${data.heading},
speed = ${data.speed},
last_location_update = NOW(),
status = 'ONLINE'
WHERE id = ${riderId}
`;
// Append to location history for route reconstruction
await this.prisma.locationHistory.create({
data: {
riderId,
taskId: data.taskId,
// LocationHistory stores plain latitude/longitude columns (see the Prisma schema)
latitude: data.latitude,
longitude: data.longitude,
heading: data.heading,
speed: data.speed,
accuracy: data.accuracy,
deviceTimestamp: new Date(data.timestamp),
},
});
}
}
The 30-second grace period is calibrated from production data. In our market, 85% of rider disconnections resolve within 15 seconds (cellular handoff, brief coverage gap), so a 30-second window filters out most false "rider offline" notifications. Only persistent disconnections trigger customer alerts.
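One limitation of a timer per disconnect is that a timer started by an earlier disconnect can fire shortly after a later one, cutting the second grace window short, and pending timers vanish if the process restarts. A possible variation, sketched below with hypothetical names, records disconnect times explicitly and lets a periodic sweep decide who has been gone long enough. Timestamps are passed in so the logic can be tested without real timers.

```typescript
// Hypothetical sketch: explicit disconnect bookkeeping instead of one timer per event.
class DisconnectTracker {
  private readonly disconnectedAt = new Map<string, number>();

  constructor(private readonly graceMs: number = 30_000) {}

  markDisconnected(riderId: string, nowMs: number): void {
    // Every new disconnect restarts the grace window for that rider.
    this.disconnectedAt.set(riderId, nowMs);
  }

  markReconnected(riderId: string): void {
    this.disconnectedAt.delete(riderId);
  }

  // Returns riders whose grace period has elapsed and stops tracking them.
  collectExpired(nowMs: number): string[] {
    const expired: string[] = [];
    this.disconnectedAt.forEach((since, riderId) => {
      if (nowMs - since >= this.graceMs) expired.push(riderId);
    });
    expired.forEach((riderId) => this.disconnectedAt.delete(riderId));
    return expired;
  }
}
```

In a service, something like a short interval could call `collectExpired(Date.now())` and run the offline and notification steps for each returned rider. In a multi-instance deployment the map would need to live in shared storage such as Redis.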
PostGIS Spatial Queries: Finding Nearby Riders
When a customer creates a new task, we need to find available riders within a 5km radius and rank them by proximity. This is a spatial query, and PostgreSQL with PostGIS handles it efficiently using a geography index:
-- Prisma migration: add PostGIS geography column and spatial index
-- migration.sql
ALTER TABLE "Rider" ADD COLUMN current_location geography(Point, 4326);
CREATE INDEX idx_rider_location_gist
ON "Rider"
USING GIST (current_location);
-- The core nearest-rider query
SELECT
r.id,
r.name,
r.rating,
r.total_deliveries,
r.vehicle_type,
ST_Distance(
r.current_location,
ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography
) AS distance_meters
FROM "Rider" r
WHERE
r.status = 'ONLINE'
AND r.is_available = true
AND r.last_location_update > NOW() - INTERVAL '5 minutes'
AND ST_DWithin(
r.current_location,
ST_SetSRID(ST_MakePoint($1, $2), 4326)::geography,
5000 -- 5000 meters = 5km radius
)
ORDER BY distance_meters ASC
LIMIT 10;
A few critical details here:
Geography type, not geometry. PostGIS has two spatial types: geometry (planar, Cartesian coordinates) and geography (spherical, latitude/longitude on the Earth's surface). For delivery tracking, we need geography because we are working with real-world lat/lng coordinates and need distance calculations in meters, not degrees. Using geometry with SRID 4326 would give distances in degrees, which are meaningless for "find riders within 5km."
ST_DWithin with geography. When used with the geography type, ST_DWithin correctly handles the curvature of the Earth and accepts the distance parameter in meters. It also leverages the GIST index for efficient spatial filtering before computing exact distances. Without the spatial index, this query scans every rider row. With the GIST index, it narrows down to riders in the approximate bounding box first, then computes exact distances only for candidates.
Freshness filter. The last_location_update > NOW() - INTERVAL '5 minutes' clause excludes riders whose last GPS update is stale. A rider might be "ONLINE" in status but their app is frozen or their GPS is off. The freshness filter ensures we only consider riders who are actively transmitting.
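To see why degrees are a poor distance unit, it helps to compute what one degree is worth in meters at different latitudes. The haversine function below is a spherical approximation written for illustration. It is not what PostGIS runs internally (geography calculations use a spheroid by default, so results can differ by a fraction of a percent), and the function name is hypothetical.

```typescript
const EARTH_RADIUS_METERS = 6_371_000; // mean Earth radius

// Great-circle distance between two lat/lng points on a sphere, in meters.
function haversineMeters(lat1: number, lng1: number, lat2: number, lng2: number): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

// One degree of longitude at the equator versus at 60 degrees north.
const equatorMeters = haversineMeters(0, 0, 0, 1);
const northernMeters = haversineMeters(60, 0, 60, 1);
```

On this model one degree of longitude is about 111 km at the equator but only about 56 km at 60 degrees north, so a fixed radius in degrees does not correspond to any fixed distance on the ground.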
In NestJS, the nearest-rider query lives in a service method that calls Prisma's $queryRaw:
// src/riders/riders.service.ts
import { Injectable } from "@nestjs/common";
import { PrismaService } from "../prisma/prisma.service";
import { Prisma } from "@prisma/client";
interface NearbyRider {
id: string;
name: string;
rating: number;
totalDeliveries: number;
vehicleType: string;
distanceMeters: number;
}
@Injectable()
export class RidersService {
constructor(private readonly prisma: PrismaService) {}
async findNearbyRiders(
longitude: number,
latitude: number,
radiusMeters: number = 5000,
limit: number = 10,
): Promise<NearbyRider[]> {
const riders = await this.prisma.$queryRaw<NearbyRider[]>`
SELECT
r.id,
r.name,
r.rating,
r.total_deliveries AS "totalDeliveries",
r.vehicle_type AS "vehicleType",
ROUND(ST_Distance(
r.current_location,
ST_SetSRID(ST_MakePoint(${longitude}, ${latitude}), 4326)::geography
))::int AS "distanceMeters" -- integer cast so Prisma returns a number, not a Decimal
FROM "Rider" r
WHERE
r.status = 'ONLINE'
AND r.is_available = true
AND r.last_location_update > NOW() - INTERVAL '5 minutes'
AND ST_DWithin(
r.current_location,
ST_SetSRID(ST_MakePoint(${longitude}, ${latitude}), 4326)::geography,
${radiusMeters}
)
ORDER BY "distanceMeters" ASC
LIMIT ${limit}
`;
return riders;
}
}
BullMQ Queue Architecture for Async Operations
Not everything should happen in the request-response cycle. OTP delivery via Fast2SMS, push notifications via Firebase Cloud Messaging, and payment processing via Cashfree are all operations that should be queued and processed asynchronously. BullMQ (the successor to Bull, backed by Redis) handles this with named queues and typed workers:
// src/queues/queue.module.ts
import { BullModule } from "@nestjs/bullmq";
import { Module } from "@nestjs/common";
import { NotificationProcessor } from "./processors/notification.processor";
import { PaymentProcessor } from "./processors/payment.processor";
import { OtpProcessor } from "./processors/otp.processor";
@Module({
imports: [
BullModule.registerQueue(
{ name: "notifications" },
{ name: "payments" },
{ name: "otp" },
),
],
providers: [NotificationProcessor, PaymentProcessor, OtpProcessor],
exports: [BullModule],
})
export class QueueModule {}
// src/queues/processors/notification.processor.ts
import { Processor, WorkerHost } from "@nestjs/bullmq";
import { Logger } from "@nestjs/common";
import { Job } from "bullmq";
import * as admin from "firebase-admin";
interface PushNotificationJob {
userId: string;
title: string;
body: string;
data?: Record<string, string>;
}
interface RiderOfflineJob {
taskId: string;
customerId: string;
message: string;
}
@Processor("notifications")
export class NotificationProcessor extends WorkerHost {
private readonly logger = new Logger(NotificationProcessor.name);
async process(job: Job<PushNotificationJob | RiderOfflineJob>) {
switch (job.name) {
case "push":
return this.sendPush(job as Job<PushNotificationJob>);
case "rider-offline":
return this.handleRiderOffline(job as Job<RiderOfflineJob>);
default:
this.logger.warn(`Unknown job type: ${job.name}`);
}
}
private async sendPush(job: Job<PushNotificationJob>) {
const { userId, title, body, data } = job.data;
// Fetch user's FCM tokens (they may have multiple devices)
const tokens = await this.getUserFcmTokens(userId);
if (tokens.length === 0) {
this.logger.warn(`No FCM tokens for user ${userId}`);
return;
}
const message: admin.messaging.MulticastMessage = {
tokens,
notification: { title, body },
data: data || {},
android: {
priority: "high",
notification: { channelId: "delivery_updates" },
},
apns: {
payload: {
aps: {
sound: "default",
badge: 1,
},
},
},
};
const response = await admin.messaging().sendEachForMulticast(message);
this.logger.log(
`Push sent to ${userId}: ${response.successCount} success, ` +
`${response.failureCount} failures`,
);
// Clean up invalid tokens
response.responses.forEach((resp, idx) => {
if (resp.error?.code === "messaging/registration-token-not-registered") {
this.removeInvalidToken(tokens[idx]);
}
});
}
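private async handleRiderOffline(job: Job<RiderOfflineJob>) {
// Reuse the push path: sendPush only reads job.data, so wrap a typed payload
// in a minimal Job-shaped object instead of casting the whole thing to any.
const payload: PushNotificationJob = {
userId: job.data.customerId,
title: "Rider Connectivity Issue",
body: job.data.message,
data: { taskId: job.data.taskId, type: "rider_offline" },
};
await this.sendPush({ data: payload } as unknown as Job<PushNotificationJob>);
}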
}
The queue architecture provides three critical benefits:
1. Failure isolation. If Fast2SMS is down, OTP jobs queue up and retry with exponential backoff. The rest of the application continues functioning. Without queues, a Fast2SMS outage would block the entire signup flow.
2. Rate limiting compliance. Fast2SMS has API rate limits. BullMQ's built-in rate limiter (limiter: { max: 10, duration: 1000 }) ensures we never exceed the limit, even during traffic spikes.
3. Observability. BullMQ integrates with Bull Board for a visual dashboard of queue depths, processing rates, and failed jobs. In production, this dashboard is behind the admin panel and is the first thing I check when debugging delivery issues.
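The retry behavior behind the failure-isolation point can be sketched as plain data plus a small helper. The option names (`attempts`, `backoff.type`, `backoff.delay`) follow the shape BullMQ documents for job options, and the delay formula is the one its documentation gives for exponential backoff. The specific numbers and the helper names are illustrative assumptions, not Errandoo's production values.

```typescript
// Plain-object sketch of per-job retry options; values are illustrative.
const otpJobOptions = {
  attempts: 5, // total tries, including the first one
  backoff: { type: "exponential" as const, delay: 1000 }, // base delay in ms
};

// Delay before the retry that follows the Nth failed attempt (N starts at 1),
// using the documented formula 2^(N - 1) * delay.
function exponentialBackoffMs(failedAttempts: number, baseDelayMs: number): number {
  return Math.round(Math.pow(2, failedAttempts - 1) * baseDelayMs);
}

// Lists the wait before each retry for a given options object.
function retrySchedule(options: typeof otpJobOptions): number[] {
  const delays: number[] = [];
  // `attempts` counts the first try, so there are attempts - 1 retries.
  for (let failed = 1; failed < options.attempts; failed++) {
    delays.push(exponentialBackoffMs(failed, options.backoff.delay));
  }
  return delays;
}
```

With these values an OTP job that keeps failing would be retried after roughly 1, 2, 4, and 8 seconds before landing in the failed set. In BullMQ, options like these are passed as the third argument to `queue.add(name, data, options)`.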
The Full Event Flow: GPS Update to Customer Screen
Let me trace a single location update through the entire system to show how all the pieces connect:
1. Rider's phone GPS fires [Device, every 3s]
2. React Native app sends "location:update" [WebSocket emit]
3. NestJS TrackingGateway receives event [Server, <50ms]
4. GPS accuracy check (skip if > 50m) [Gateway filter]
5. Prisma $executeRaw: UPDATE Rider SET [PostGIS, ~5ms]
current_location = ST_MakePoint(...)
6. Prisma create: LocationHistory [PostgreSQL, ~3ms]
7. Socket.IO: client.to(room).emit() [Broadcast, <1ms]
8. Customer's browser receives event [WebSocket, ~50ms]
9. React state update -> map marker moves [Client render, ~16ms]
Total end-to-end latency: ~120-200ms
The 120-200ms end-to-end latency is well under the 500ms threshold where users perceive lag in real-time tracking. Network transit (steps 3 and 8) accounts for most of it; on the server side, the main cost is step 5 (the PostGIS write), which averages 5ms but can spike to 20ms under load. The location history write (step 6) does not need to sit on the critical path for the customer's experience. As written, the gateway awaits it before broadcasting; it could be moved to a queue if it became a bottleneck, but at current scale, inline writes are fine.
Prisma Schema for Location Tracking
The Prisma schema for the spatial data model needs a workaround, since Prisma does not natively support PostGIS geography types: the current_location column is created in a raw SQL migration, left out of the Prisma model, and accessed only through raw queries:
// prisma/schema.prisma (relevant models)
model Rider {
id String @id @default(cuid())
name String
phone String @unique
email String?
vehicleType String @map("vehicle_type")
status RiderStatus @default(OFFLINE)
isAvailable Boolean @default(false) @map("is_available")
rating Float @default(5.0)
totalDeliveries Int @default(0) @map("total_deliveries")
heading Float?
speed Float?
lastLocationUpdate DateTime? @map("last_location_update")
lastSeen DateTime? @map("last_seen")
// current_location is a PostGIS geography column
// managed via raw SQL migration, not Prisma native
// Prisma reads/writes use $queryRaw / $executeRaw
tasks Task[]
locationHistory LocationHistory[]
sessions RiderSession[]
createdAt DateTime @default(now()) @map("created_at")
updatedAt DateTime @updatedAt @map("updated_at")
@@map("Rider")
}
model LocationHistory {
id String @id @default(cuid())
riderId String @map("rider_id")
taskId String? @map("task_id")
latitude Float
longitude Float
heading Float?
speed Float?
accuracy Float?
deviceTimestamp DateTime @map("device_timestamp")
createdAt DateTime @default(now()) @map("created_at")
rider Rider @relation(fields: [riderId], references: [id])
task Task? @relation(fields: [taskId], references: [id])
@@index([riderId, createdAt])
@@index([taskId, createdAt])
@@map("LocationHistory")
}
model Task {
id String @id @default(cuid())
customerId String @map("customer_id")
riderId String? @map("rider_id")
status TaskStatus @default(PENDING)
pickupLat Float @map("pickup_lat")
pickupLng Float @map("pickup_lng")
pickupAddress String @map("pickup_address")
dropoffLat Float @map("dropoff_lat")
dropoffLng Float @map("dropoff_lng")
dropoffAddress String @map("dropoff_address")
estimatedDistance Float? @map("estimated_distance")
actualDistance Float? @map("actual_distance")
fare Float?
customer Customer @relation(fields: [customerId], references: [id])
rider Rider? @relation(fields: [riderId], references: [id])
locationHistory LocationHistory[]
createdAt DateTime @default(now()) @map("created_at")
updatedAt DateTime @updatedAt @map("updated_at")
@@map("Task")
}
enum RiderStatus {
ONLINE
OFFLINE
ON_DELIVERY
}
enum TaskStatus {
PENDING
SEARCHING
ACCEPTED
PICKED_UP
IN_TRANSIT
DELIVERED
CANCELLED
}
Scaling Considerations
Errandoo currently operates in a single metro area with 50-80 concurrent riders during peak hours. At this scale, a single NestJS instance handles everything comfortably. But the architecture is designed for horizontal scaling when needed:
Socket.IO Redis Adapter. When running multiple NestJS instances behind a load balancer, Socket.IO's Redis adapter ensures that a client.to(room).emit() call on instance A reaches clients connected to instance B. The adapter is already configured:
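// src/main.ts (Socket.IO Redis adapter setup; the last four lines run inside bootstrap())
import { INestApplicationContext } from "@nestjs/common";
import { IoAdapter } from "@nestjs/platform-socket.io";
import { createAdapter } from "@socket.io/redis-adapter";
import { createClient } from "redis";
import { ServerOptions } from "socket.io";
class RedisIoAdapter extends IoAdapter {
  constructor(app: INestApplicationContext, private readonly redisAdapter: ReturnType<typeof createAdapter>) {
    super(app);
  }
  createIOServer(port: number, options?: ServerOptions) {
    const server = super.createIOServer(port, options); // let IoAdapter build the Socket.IO server
    server.adapter(this.redisAdapter); // route room broadcasts through Redis pub/sub
    return server;
  }
}
const pubClient = createClient({ url: process.env.REDIS_URL });
const subClient = pubClient.duplicate();
await Promise.all([pubClient.connect(), subClient.connect()]);
app.useWebSocketAdapter(new RedisIoAdapter(app, createAdapter(pubClient, subClient)));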
Location write batching. At higher scale, writing every GPS update to PostgreSQL individually becomes a bottleneck. The next optimization would be to batch location updates in Redis (GEOADD for current position, XADD to a stream for history) and flush to PostgreSQL every 10-30 seconds. This converts thousands of individual writes into a handful of batch inserts.
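The batching idea can be illustrated with an in-memory stand-in for the Redis structures. This is a sketch under stated assumptions: `LocationBatcher` and its methods are hypothetical, the buffer lives in process memory rather than in Redis (so it would not survive a restart or work across instances), and the caller is responsible for turning the flushed batch into a bulk insert.

```typescript
interface BufferedLocation {
  riderId: string;
  latitude: number;
  longitude: number;
  timestamp: number; // Unix ms from device
}

// Hypothetical in-memory buffer standing in for GEOADD (latest) and XADD (history).
class LocationBatcher {
  private history: BufferedLocation[] = [];
  private readonly latest = new Map<string, BufferedLocation>();

  add(update: BufferedLocation): void {
    this.history.push(update);
    const current = this.latest.get(update.riderId);
    // Keep only the newest fix per rider for the "current position" write.
    if (!current || update.timestamp >= current.timestamp) {
      this.latest.set(update.riderId, update);
    }
  }

  // Hands back everything buffered so far and resets the buffer.
  flush(): { history: BufferedLocation[]; latest: BufferedLocation[] } {
    const batch = {
      history: this.history,
      latest: Array.from(this.latest.values()),
    };
    this.history = [];
    this.latest.clear();
    return batch;
  }
}
```

A flush on a 10-30 second interval would then issue one multi-row insert for `history` and one update per rider in `latest`, instead of two writes per GPS update.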
Read replicas for spatial queries. The nearest-rider query is read-heavy and can be directed to a PostgreSQL read replica. Replica lag of up to ~5 seconds is acceptable for finding nearby riders since riders do not teleport.
Production Monitoring Stack
For a delivery platform, downtime means riders idle, customers frustrated, and revenue lost. The monitoring stack is multi-layered:
- Grafana + Loki: Centralized log aggregation and dashboard. Key metrics: WebSocket connection count, location updates per second, queue depth, API response times. Alerting rules fire when WebSocket connections drop suddenly (potential deployment issue) or queue depth exceeds 100 (potential downstream service outage).
- Sentry: Error tracking with source maps for both NestJS and the Next.js customer app. Every unhandled exception includes the full stack trace, request context, and user/rider ID. P0 errors (payment failures, delivery assignment crashes) trigger immediate Slack alerts.
- Uptime Kuma: Self-hosted uptime monitoring. Checks the health endpoint every 30 seconds and alerts on degradation. Also monitors external dependencies (Fast2SMS, Cashfree, FCM) to distinguish between internal failures and upstream outages.
Lessons Learned
1. GPS data is noisier than you expect. Urban environments, building reflections, and device hardware variability produce GPS data that jumps, drifts, and occasionally teleports. The accuracy filter (reject updates worse than 50m) and the client-side interpolation (smooth between updates using heading and speed) are essential for a usable tracking experience. Raw GPS data plotted on a map looks like a drunk rider.
2. PostGIS geography vs. geometry is a critical distinction. I initially used the geometry type with SRID 4326, which measures distances in degrees. A query for "riders within 5000" returned every rider on the planet because a radius of 5000 degrees covers the entire coordinate space. The fix was casting to geography, which changes the distance unit to meters. This is a common PostGIS mistake that wastes hours of debugging.
3. WebSocket disconnection handling is harder than WebSocket connection handling. Setting up the connection is easy. Detecting that a connection is dead, waiting an appropriate grace period, notifying affected parties, and correctly handling the reconnection (rejoining rooms, replaying missed events) is where 60% of the WebSocket engineering effort goes.
4. BullMQ over Bull. BullMQ is the official successor to Bull: a TypeScript rewrite from the same maintainers, and the place where new features land now that Bull is in maintenance mode. It uses Redis Streams for queue events, while the jobs themselves still live in Redis lists, sorted sets, and hashes, so whether jobs survive a Redis restart depends on Redis persistence (AOF or RDB) being enabled. If you are starting a new project, there is no reason to use Bull.
5. Monorepo pays for itself immediately. Errandoo uses pnpm workspaces with Turborepo. Shared types between the NestJS backend, Next.js customer app, and React Native rider app prevent the single most common bug in full-stack projects: API contract mismatches. When I add a field to the location update event, TypeScript errors surface in all three apps simultaneously. That alone justifies the monorepo complexity.
Real-time tracking is one of those features that looks simple on the surface—a dot moving on a map—but involves a surprising depth of engineering across WebSockets, spatial databases, async job processing, and mobile network resilience. The NestJS + Socket.IO + PostGIS stack handles it well, and the architectural patterns here scale comfortably to thousands of concurrent deliveries with the optimizations outlined above.