01 Clarify the Problem & Scope 5–7 min
"Let me restate — we're designing a ride-hailing platform like Uber. A rider requests a ride, the system matches them with a nearby driver, and we track the trip in real-time through pickup, ride, and drop-off. Let me ask a few scoping questions."
Questions I'd Ask
  • What outcome are we optimizing for? → Trip completion rate (rider requests a ride → ride finishes successfully). Secondary: rider wait time, driver utilization, ETA accuracy. This tells us the system's job is MATCHING and RELIABILITY, not just speed. A fast match that leads to a cancelled trip is worse than a slower match that completes.
  • Which side of the marketplace? Both rider and driver, or just one? → Assume both.
  • Ride types? Just point-to-point, or also shared rides (UberPool), scheduled rides? → Start with point-to-point only. Mention pool as an extension.
  • Geography? Single city, single country, or global? → Design for multi-city, discuss global as evolution.
  • Payment? In scope or out of scope? → Out of scope for deep dive; acknowledge it exists.
  • Scale? Roughly how many concurrent riders/drivers? → ~100M riders, ~5M drivers globally, ~20M trips/day.
  • Latency expectations? How fast should matching happen? → Rider sees a driver within 5–10 seconds of requesting.
Agreed Scope (write on Excalidraw canvas)
In Scope | Out of Scope
Rider requests a ride | Payment processing
Driver location tracking | Driver onboarding / identity verification
Matching rider → driver | Shared rides (UberPool)
Real-time trip tracking | Scheduled rides
Pricing & surge | Ratings & reviews
ETA computation | Chat/calling between rider & driver
Core Use Cases (ranked by priority)
  • UC1: Rider requests ride → system matches to nearest available driver → driver accepts
  • UC2: Driver continuously reports location → system tracks in real-time
  • UC3: Both rider and driver see live trip progress (pickup → in-ride → drop-off)
  • UC4: System computes fare estimate before ride and final fare after
Non-Functional Requirements
  • Availability > Consistency — better to show a slightly stale driver location than to fail a ride request
  • Matching latency — <10 seconds from request to driver assignment
  • Location freshness — driver location updates every 3–5 seconds
  • Trip state must be durable — can't lose an in-progress trip (strongly consistent)
  • Surge pricing can be eventually consistent — a few seconds of staleness is fine
Notice how I've already made tradeoff decisions: availability > consistency for location, but strong consistency for trip state. This sets up architectural decisions later.
02 Back-of-the-Envelope Estimation 3–5 min
"Let me run quick numbers to identify what drives our architecture."
Trips / Day
20M
~230 trips/sec average, ~1K/sec peak
Active Drivers
~2M concurrent
5M total, ~40% online at peak
Location Updates
~500K/sec
2M drivers × 1 update every 4 sec
Location Storage
~4.3 TB/day
500K/sec × 100 bytes × 86,400 sec
Trip Records
~20 GB/day
20M trips × ~1KB each
WebSocket Connections
~3M concurrent
2M drivers + ~1M active riders at peak
Key insight: The location update pipeline (500K writes/sec) is the hottest path in the system. This is what drives the architecture — it needs a write-optimized, in-memory spatial index, not a traditional database on the write path.
Key insight: 3M concurrent WebSocket connections means we need a horizontally scalable real-time layer. A single server can hold ~50K–100K connections, so we need 30–60 connection servers.
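As a quick sanity check, the stated inputs can be multiplied through in a few lines. This is only a sketch: the variable names are illustrative, and the 100-byte location record and ~1KB trip record are the sizes assumed in the estimates above.

```python
SECONDS_PER_DAY = 86_400

trips_per_day = 20_000_000
total_drivers = 5_000_000
online_percent = 40                    # share of drivers online at peak
update_interval_s = 4                  # one location update every 4 seconds
bytes_per_update = 100                 # assumed size of one location record
bytes_per_trip = 1_000                 # assumed size of one trip record (~1KB)
active_riders = 1_000_000
conns_per_server = (50_000, 100_000)   # low and high capacity per WebSocket server

avg_trips_per_sec = trips_per_day / SECONDS_PER_DAY
concurrent_drivers = total_drivers * online_percent // 100
updates_per_sec = concurrent_drivers // update_interval_s
location_bytes_per_day = updates_per_sec * bytes_per_update * SECONDS_PER_DAY
trip_bytes_per_day = trips_per_day * bytes_per_trip
ws_connections = concurrent_drivers + active_riders
# Fewest servers at the high per-server capacity, most at the low capacity.
servers_needed = (ws_connections // conns_per_server[1],
                  ws_connections // conns_per_server[0])

print(f"avg trips/sec:        {avg_trips_per_sec:.0f}")
print(f"location updates/sec: {updates_per_sec:,}")
print(f"location storage/day: {location_bytes_per_day / 1e12:.2f} TB")
print(f"trip storage/day:     {trip_bytes_per_day / 1e9:.0f} GB")
print(f"WebSocket servers:    {servers_needed[0]}-{servers_needed[1]}")
```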
03 High-Level Design 8–12 min
"Let me sketch the major components and data flows. I'll walk through the ride request flow first, then the driver location tracking flow."
Key Architecture Decisions
"Here's WHY I chose each technology — mapping requirements to tradeoffs. Every choice has a rejected alternative and a consequence."
Requirement | Decision | Why (and what was rejected) | Consistency
ACID for trip state (financial record) | PostgreSQL (sharded by city) | Trips involve payments + state machine transitions that must be atomic. DynamoDB supports transactions, but with per-transaction item limits and less query flexibility. | CP
500K location writes/sec, append-only | Cassandra (time-series) | Pure append, never updated. Linear write scaling. PostgreSQL would compete for IOPS with trips. | AP
Sub-ms spatial queries for matching | Redis Geospatial | GEOSEARCH (GEORADIUS before Redis 6.2) in <1ms. Stale entries need a timestamp-based sweep, because geo-set members cannot carry individual TTLs. PostGIS is 10-50ms, too slow for real-time matching. | AP
Decouple billing, analytics, ML from hot path | Kafka event stream | Durable log with independent consumers. Direct service calls would create tight coupling + no replay. |
Handle 100K QPS surge without session affinity | Stateless services + API Gateway | Any pod handles any request. Stateful servers can't scale independently or fail over cleanly. |
Real-time push to rider/driver apps | WebSocket (not polling) | Server-initiated push for location + trip state updates. Polling at 4s intervals wastes bandwidth and adds latency. |
Major Components (what to draw on Excalidraw)
[Diagram: High-Level Architecture. Clients: 📱 Rider App, 🚗 Driver App (location push). Edge / load balancing: 🌐 API Gateway (rate limit · auth · route). Application services: 🔌 WebSocket (real-time push), 🚦 Trip Service (state machine), 🔀 Matching (spatial index), 📍 Location (geohash grid), 💰 Pricing (surge model). Caching: ⚡ Redis (sessions · geo · surge). Message queue / async: 📨 Kafka (trip events · location stream). Data stores: 🐘 PostgreSQL (trips · users · payments), 📊 Cassandra (location history).]

📱 Rider App CLIENT

  • Request ride, see ETA, track trip
  • WebSocket for real-time updates

🚗 Driver App CLIENT

  • Send location, accept/reject rides
  • WebSocket for dispatch & trip updates

🌐 API Gateway EDGE

  • AuthN, rate limiting, routing
  • REST for CRUD, WS upgrade for real-time

🔌 WebSocket Service REAL-TIME

  • Holds 3M persistent connections
  • Horizontally scaled (30–60 nodes)
  • Pushes trip updates & dispatch offers

📍 Location Service HOT PATH

  • Ingests 500K loc updates/sec
  • In-memory spatial index (geohash grid)
  • "Find nearest drivers" queries

🔀 Matching / Dispatch Service CORE

  • Queries Location Service for nearby drivers
  • Ranks by ETA, rating, vehicle type
  • Sends offer to best driver via WS

🚦 Trip Service CORE

  • Trip state machine (requested → matched → arrived → in_progress → completed)
  • Strongly consistent (PostgreSQL)

💰 Pricing Service CORE

  • Fare estimate before ride
  • Surge multiplier computation
  • Final fare calculation after ride
Flow 1: Rider Requests a Ride
  • Rider App → API Gateway → Trip Service: creates trip (status: REQUESTED)
  • Trip Service → Matching Service → Location Service: "find 5 nearest available drivers within 5km" (spatial index lookup)
  • Matching Service: rank by ETA, pick best
  • Matching Service → WebSocket Service → Driver App: "ride offer: pickup at X, rider is 3 min away" (accept / reject)
  • On accept → update Trip (status: MATCHED)
  • On reject → offer to next driver
Flow 2: Driver Location Tracking
  • Driver App → WebSocket Service → Location Service: update in-memory spatial index
  • Location Service → Kafka → Location History DB: async write for persistence (append-only, for analytics & ETA models)
Flow 3: Live Trip Tracking
  • Driver App → Location Service → Trip Service
  • Trip Service → Rider App: push update via WS (rider sees driver moving on map)
"That covers the three main flows. The hottest path is the location ingestion pipeline — 500K writes/sec. The most critical path is the trip state machine — we can't lose trip state. I'd like to deep-dive into the Location Service first since it's the most architecturally interesting. Sound good?"
04 Deep Dives 25–30 min
Deep Dive 1: Location Service & Driver Matching (~10 min)
The core challenge: Index 2M driver locations, handle 500K updates/sec, and answer "find nearest available drivers" queries in <50ms.

Spatial Indexing Strategy — Geohash Grid
Divide the world into geohash cells (precision 6 = ~1.2km × 0.6km). Each cell maps to a set of driver IDs. When a rider requests a ride, compute their geohash and query that cell plus neighboring cells.

Why geohash over R-tree or quadtree? Geohashes are strings, so they're trivially partitioned across a distributed in-memory store (Redis or custom). R-trees give better query precision but are harder to distribute. At our scale, the simplicity of geohash-based sharding wins. The tradeoff is edge-case inaccuracy at cell boundaries — mitigated by always querying neighboring cells.
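To make the cell lookup concrete, here is a minimal pure-Python sketch of standard geohash encoding plus a "this cell and its 8 neighbors" helper. The function names are illustrative, and the neighbor lookup simply re-encodes points offset by one cell width or height, which is one simple way to do it rather than the only one.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    """Encode a point by alternately bisecting longitude and latitude."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    bits = []
    use_lng = True  # geohash bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits.append(1)
                lng_lo = mid
            else:
                bits.append(0)
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        use_lng = not use_lng
    chars = []
    for i in range(0, len(bits), 5):  # 5 bits per base32 character
        value = 0
        for bit in bits[i:i + 5]:
            value = (value << 1) | bit
        chars.append(BASE32[value])
    return "".join(chars)

def cell_and_neighbors(lat: float, lng: float, precision: int = 6) -> set:
    """Return the rider's cell plus the surrounding cells (fewer than 9 near the poles)."""
    total_bits = precision * 5
    width = 360.0 / (1 << ((total_bits + 1) // 2))  # cell width in degrees of longitude
    height = 180.0 / (1 << (total_bits // 2))       # cell height in degrees of latitude
    cells = set()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            nlat = min(max(lat + dy * height, -90.0), 90.0)     # clamp at the poles
            nlng = ((lng + dx * width + 180.0) % 360.0) - 180.0  # wrap at the antimeridian
            cells.add(geohash_encode(nlat, nlng, precision))
    return cells

print(geohash_encode(57.64911, 10.40744, 6))  # u4pruy
print(len(cell_and_neighbors(57.64911, 10.40744)))
```

Because each cell is just a string, the resulting set can be used directly as lookup keys in whatever store holds the per-cell driver sets.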

In-Memory Architecture

Component | Tech | Rationale
Spatial Index | Redis Cluster (sorted sets per geohash) | In-memory, O(log N) updates, 500K writes/sec is feasible across a cluster
Driver State | Redis hash per driver | Status (available/on_trip/offline), current geohash, vehicle type
Persistence | Kafka → Cassandra | Location history for ETA models and analytics. Async, not on hot path
── Redis: Spatial Index ──
SORTED SET key: geohash:{cell_id}
  member: driver_id
  score: timestamp (for TTL-based eviction of stale drivers)

── Redis: Driver State ──
HASH key: driver:{driver_id}
  status: available | on_trip | offline
  geohash: string
  lat/lng: float
  vehicle_type: economy | premium | xl
  last_updated: timestamp
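The two Redis structures above can be mimicked with plain dictionaries to show how an update moves a driver between cells and how a lookup filters candidates. This is an in-memory stand-in, not Redis client code; the class name, method names, and the 30-second staleness window are assumptions made for illustration.

```python
import time
from collections import defaultdict

class DriverIndex:
    """In-memory stand-in for the per-geohash sorted sets and per-driver hashes."""

    def __init__(self, stale_after_s: float = 30.0):
        self.cells = defaultdict(dict)  # geohash -> {driver_id: last_seen timestamp}
        self.drivers = {}               # driver_id -> state dict
        self.stale_after_s = stale_after_s

    def update(self, driver_id, cell, lat, lng,
               status="available", vehicle_type="economy", now=None):
        now = time.time() if now is None else now
        previous = self.drivers.get(driver_id)
        # If the driver crossed into a new cell, remove them from the old one.
        if previous is not None and previous["geohash"] != cell:
            self.cells[previous["geohash"]].pop(driver_id, None)
        self.cells[cell][driver_id] = now
        self.drivers[driver_id] = {
            "status": status, "geohash": cell, "lat": lat, "lng": lng,
            "vehicle_type": vehicle_type, "last_updated": now,
        }

    def candidates(self, cells, vehicle_type, now=None):
        """Available, non-stale drivers of the requested type in the given cells."""
        now = time.time() if now is None else now
        found = []
        for cell in cells:
            for driver_id, last_seen in self.cells.get(cell, {}).items():
                if now - last_seen > self.stale_after_s:
                    continue  # skip drivers whose last update is too old
                state = self.drivers[driver_id]
                if state["status"] == "available" and state["vehicle_type"] == vehicle_type:
                    found.append(driver_id)
        return found

index = DriverIndex()
index.update("d1", "u4pruy", 57.649, 10.407, now=100.0)
index.update("d2", "u4pruv", 57.650, 10.400, status="on_trip", now=100.0)
index.update("d1", "u4pruz", 57.651, 10.410, now=102.0)  # d1 moves to a new cell
print(index.candidates(["u4pruz", "u4pruv"], "economy", now=103.0))
```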

Matching Algorithm

  • Step 1: Compute rider's geohash → query that cell + 8 neighbors → get candidate driver IDs
  • Step 2: Filter by status=available, vehicle_type match
  • Step 3: Compute ETA for top candidates (call Maps/routing service)
  • Step 4: Rank by ETA (primary), driver rating (secondary)
  • Step 5: Send ride offer to top-ranked driver via WebSocket
  • Step 6: If rejected or timeout (10s) → offer to next driver. Max 3 attempts before expanding search radius.
Why not broadcast to all nearby drivers? Broadcasting creates a thundering herd — multiple drivers accept the same ride, then we cancel all but one, causing bad driver experience. Serial dispatch with timeout is simpler and more predictable. Tradeoff: slightly higher matching latency (potentially 10–30s if first drivers reject), but much better UX for drivers.
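Steps 4 to 6 (rank, offer serially, stop after three attempts) can be sketched as follows. `send_offer` is a hypothetical callback standing in for the WebSocket round trip; it is assumed to return True on accept, False on reject, and None when the 10-second timeout elapses.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

@dataclass(frozen=True)
class Candidate:
    driver_id: str
    eta_s: float   # estimated time to pickup, in seconds
    rating: float

def rank(candidates: Iterable[Candidate]) -> List[Candidate]:
    """ETA ascending first, then rating descending as the tiebreaker."""
    return sorted(candidates, key=lambda c: (c.eta_s, -c.rating))

def dispatch(candidates: Iterable[Candidate],
             send_offer: Callable[[str], Optional[bool]],
             max_attempts: int = 3) -> Optional[str]:
    """Offer the ride to one driver at a time; return the first driver who accepts."""
    for candidate in rank(candidates)[:max_attempts]:
        if send_offer(candidate.driver_id) is True:
            return candidate.driver_id
        # False (rejected) or None (timed out): fall through to the next driver.
    return None  # caller would expand the search radius and try again

# Canned responses in place of real drivers.
responses = {"a": False, "b": None, "c": True, "d": True}
pool = [
    Candidate("c", 200, 4.9),
    Candidate("a", 120, 4.5),
    Candidate("b", 120, 4.2),
    Candidate("d", 300, 5.0),
]
print(dispatch(pool, responses.get))  # c
```

Only one driver holds an offer at any moment, which is what avoids the multiple-accept cancellation problem described above.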
Hot spot mitigation: In dense areas (airports, stadiums) a single geohash cell could have thousands of drivers. We can use finer-grained geohashes (precision 7 or 8) in high-density areas, or use a two-level index: coarse for initial lookup, fine for ranking.
Deep Dive 2: Trip Service & State Machine (~8 min)
The core challenge: Trip state must never be lost. Transitions must be atomic. Both rider and driver must see the same state.

State Machine

[Diagram: State Machine. REQUESTED → MATCHED → DRIVER_ARRIVED → IN_PROGRESS → COMPLETED, with REQUESTED → NO_DRIVERS when no driver is found.]
Storage: PostgreSQL — Trip state is our most critical data. It involves multi-field updates (status, driver_id, fare, timestamps) and we need ACID transactions. The write volume (~230 trips/sec average, ~1K peak) is well within Postgres capacity. We'd shard by city or region if we outgrow a single instance. Tradeoff vs. DynamoDB: we lose auto-scaling but gain transactional guarantees and query flexibility for operational dashboards.
trips
  id              UUID PK
  rider_id        UUID FK → users
  driver_id       UUID FK → users (nullable until matched)
  status          ENUM (requested, matched, arrived, in_progress, completed, cancelled)
  pickup_lat/lng  FLOAT
  dropoff_lat/lng FLOAT
  fare_estimate   DECIMAL
  fare_final      DECIMAL (null until completed)
  surge_mult      DECIMAL
  requested_at    TIMESTAMP
  matched_at      TIMESTAMP
  started_at      TIMESTAMP
  completed_at    TIMESTAMP
  city_id         INT (partition key for sharding)
-- Indexes: (rider_id, status), (driver_id, status), (city_id, requested_at)
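One way to make transitions atomic is a conditional UPDATE that only succeeds if the row is still in the expected state. The sketch below uses Python's built-in sqlite3 purely as a stand-in for PostgreSQL, on a two-column version of the trips table. The transition table, including where cancellation and no_drivers are allowed, is an assumption rather than something the design specifies.

```python
import sqlite3

# Assumed legal transitions; the cancellation and no_drivers edges are illustrative.
ALLOWED = {
    "requested": {"matched", "no_drivers", "cancelled"},
    "matched": {"arrived", "cancelled"},
    "arrived": {"in_progress", "cancelled"},
    "in_progress": {"completed"},
}

def transition(conn: sqlite3.Connection, trip_id: str, old: str, new: str) -> bool:
    """Move a trip from old to new; return False if another writer got there first."""
    if new not in ALLOWED.get(old, set()):
        raise ValueError(f"illegal transition {old} -> {new}")
    with conn:  # commits on success, rolls back on exception
        cursor = conn.execute(
            "UPDATE trips SET status = ? WHERE id = ? AND status = ?",
            (new, trip_id, old),
        )
    return cursor.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO trips VALUES ('t1', 'requested')")
conn.commit()

print(transition(conn, "t1", "requested", "matched"))  # True: first writer wins
print(transition(conn, "t1", "requested", "matched"))  # False: row is no longer 'requested'
```

The `AND status = ?` predicate is what prevents two concurrent callers from both moving the same trip out of the same state.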

Event Publishing
On every state transition, Trip Service publishes an event to Kafka: trip.status_changed {trip_id, old_status, new_status, timestamp}. Consumers include: WebSocket Service (push to rider/driver apps), Pricing Service (trigger fare calculation on COMPLETED), Analytics pipeline.

Why event-driven for updates instead of polling? With 20M trips/day, polling from multiple consumers would create massive DB load. Publishing events on state transitions decouples consumers, allows fan-out, and keeps the Trip DB write-path clean. Tradeoff: we accept eventual consistency for the push updates (ms-level delay), but the source of truth (Postgres) is always strongly consistent.
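As an illustration of the event shape, the sketch below builds the `trip.status_changed` record as a (topic, key, value) triple without calling any Kafka client. Keying by trip_id is an assumption made here: with Kafka's key-based partitioning, records that share a key land on the same partition, so consumers would see one trip's transitions in order.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple

TOPIC = "trip.status_changed"

@dataclass(frozen=True)
class TripStatusChanged:
    trip_id: str
    old_status: str
    new_status: str
    timestamp: float

def to_record(event: TripStatusChanged) -> Tuple[str, bytes, bytes]:
    """Serialize an event into the (topic, key, value) a producer would send."""
    key = event.trip_id.encode("utf-8")  # assumed partition key: trip_id
    value = json.dumps(asdict(event), sort_keys=True).encode("utf-8")
    return TOPIC, key, value

topic, key, value = to_record(
    TripStatusChanged("t1", "requested", "matched", 1700000000.0)
)
print(topic, key, value)
```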
Deep Dive 3: Pricing & Surge (~7 min)
The core challenge: Compute accurate fare estimates in <100ms. Dynamically adjust prices (surge) based on supply/demand without being stale or oscillating wildly.

Fare Calculation

── Fare Formula ──
fare = (base_fare + (per_mile_rate × distance) + (per_minute_rate × duration) + booking_fee) × surge_multiplier

── Where ──
distance, duration = from routing/maps service (precomputed ETA)
surge_multiplier = from Surge Service (per geohash cell)
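A worked example of the fare formula, using Decimal to avoid floating-point rounding on money. Two assumptions are made here: the surge multiplier scales the whole sum (including the booking fee), and all rates in the example are made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

def compute_fare(base_fare: Decimal, per_mile_rate: Decimal, miles: Decimal,
                 per_minute_rate: Decimal, minutes: Decimal,
                 booking_fee: Decimal, surge_multiplier: Decimal) -> Decimal:
    """Sum the fare components, apply surge, and round to whole cents."""
    subtotal = (base_fare
                + per_mile_rate * miles
                + per_minute_rate * minutes
                + booking_fee)
    return (subtotal * surge_multiplier).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Hypothetical rates: 2.50 base, 1.20 per mile, 0.30 per minute, 2.00 booking fee.
estimate = compute_fare(
    base_fare=Decimal("2.50"),
    per_mile_rate=Decimal("1.20"), miles=Decimal("5"),
    per_minute_rate=Decimal("0.30"), minutes=Decimal("15"),
    booking_fee=Decimal("2.00"),
    surge_multiplier=Decimal("1.5"),
)
print(estimate)  # 22.50  (2.50 + 6.00 + 4.50 + 2.00 = 15.00, times 1.5)
```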

Surge Pricing Design

  • Input signals: Demand (ride requests per cell per minute) and supply (available drivers per cell)
  • Computation: A background job runs every 30–60 seconds per city. For each geohash cell, compute demand/supply ratio → map to surge multiplier via a configurable curve.
  • Storage: Redis hash — surge:{city}:{geohash} → multiplier. TTL of 2 minutes (auto-expire if job fails).
  • Smoothing: Apply exponential moving average to prevent wild oscillation. Multiplier changes are capped at ±0.5x per cycle.
Why precomputed + cached, not real-time per-request? Computing supply/demand per request would require querying the Location Service for every fare estimate — that's 1K QPS hitting the spatial index just for pricing. Precomputing every 30–60s and caching in Redis keeps the pricing path fast (<5ms lookup) and decouples it from the location hot path. Tradeoff: surge can be up to 60s stale, which is acceptable — riders see the multiplier before confirming.
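The per-cell update run by the background job might look like the sketch below. The demand/supply curve (a linear ratio clamped to the range 1.0 to 3.0) and the smoothing factor alpha=0.5 are placeholders, since the design only says the curve is configurable; the ±0.5 cap per cycle comes from the text.

```python
def raw_multiplier(demand: int, supply: int, max_mult: float = 3.0) -> float:
    """Hypothetical curve: demand/supply ratio clamped to [1.0, max_mult]."""
    if supply <= 0:
        return max_mult if demand > 0 else 1.0
    return min(max(1.0, demand / supply), max_mult)

def next_multiplier(previous: float, demand: int, supply: int,
                    alpha: float = 0.5, max_step: float = 0.5) -> float:
    """Blend the new target with the previous value (EMA), then cap the change."""
    target = raw_multiplier(demand, supply)
    smoothed = alpha * target + (1 - alpha) * previous
    step = max(-max_step, min(max_step, smoothed - previous))
    return round(previous + step, 2)

# A cell where demand stays at 3x supply: the multiplier ramps up gradually.
multiplier = 1.0
history = []
for _ in range(4):
    multiplier = next_multiplier(multiplier, demand=60, supply=20)
    history.append(multiplier)
print(history)  # [1.5, 2.0, 2.5, 2.75]
```

The cap dominates while the gap to the target is large, and the moving average takes over as the multiplier approaches it, which is what damps oscillation.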
Fairness guard: The fare shown at request time is the most the rider pays. If surge drops during the ride, they pay the lower of the estimate and the actual fare. This is a product decision, but it has architectural implications: we need to store the surge_multiplier at request time on the trip record.
Deep Dive 4: Data Model & Storage Summary (~5 min)
Data | Store | Access Pattern | Consistency
Driver Location (live) | Redis Cluster | 500K writes/sec, spatial queries | Eventual (best-effort)
Driver Location (history) | Cassandra | Append-only, time-range queries for ETA/analytics | Eventual
Trip State | PostgreSQL | CRUD by trip_id, query by rider/driver + status | Strong (ACID)
User Profiles | PostgreSQL | Read by user_id, low write frequency | Strong
Surge Multipliers | Redis | Read by geohash, written every 30–60s | Eventual (TTL 2m)
Event Stream | Kafka | Pub/sub for trip events, location events | Ordered per partition
Why not one database for everything? Because the access patterns are radically different. Location needs in-memory spatial indexing at 500K/sec — no traditional DB handles that. Trips need ACID — Redis doesn't provide that. History is append-only at massive volume — Cassandra's sweet spot. Each store is matched to its access pattern.
📡 API Design
REST Endpoints
POST /v1/rides/estimate: Get fare estimate + ETA before requesting
  Request: {pickup: {lat, lng}, dropoff: {lat, lng}, vehicle_type}
  Response: {fare_estimate, surge_multiplier, eta_minutes, ride_token}
POST /v1/rides: Request a ride (rider)
  Request: {ride_token, pickup, dropoff, vehicle_type, payment_method_id}
  Response: {trip_id, status: "REQUESTED"} (subsequent updates come via WebSocket)
PUT /v1/rides/{trip_id}/accept: Driver accepts ride offer
  Response: {trip_id, rider_info, pickup, dropoff}
PUT /v1/rides/{trip_id}/status: Transition trip state
  Request: {status: "ARRIVED" | "IN_PROGRESS" | "COMPLETED" | "CANCELLED"}
GET /v1/rides/{trip_id}: Get trip details
WebSocket Channels
WS /v1/ws/driver (Driver): send location, receive dispatch offers
  Inbound: {type: "location", lat, lng, timestamp} (every 4s)
  Outbound: {type: "ride_offer", trip_id, pickup, rider_name, fare_estimate}
  Outbound: {type: "trip_update", trip_id, status, ...}
WS /v1/ws/rider (Rider): receive trip updates & driver location
  Outbound: {type: "trip_update", trip_id, status, driver_location, eta}
  Outbound: {type: "driver_location", lat, lng} (every 4s during active trip)
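On the server side, the inbound driver message would be parsed and validated before it touches the spatial index. A minimal sketch follows, assuming the message arrives as a JSON text frame; the class and function names are illustrative.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class LocationUpdate:
    lat: float
    lng: float
    timestamp: float

def parse_location(raw: str) -> LocationUpdate:
    """Parse one inbound driver frame; raise ValueError if it is not a valid location."""
    message = json.loads(raw)
    if message.get("type") != "location":
        raise ValueError("not a location message")
    lat, lng = float(message["lat"]), float(message["lng"])
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        raise ValueError("coordinates out of range")
    return LocationUpdate(lat, lng, float(message["timestamp"]))

frame = '{"type": "location", "lat": 57.64911, "lng": 10.40744, "timestamp": 1700000000}'
print(parse_location(frame))
```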
05 Cross-Cutting Concerns 10–12 min
Storage Architecture Summary
What goes where and why. Each data store is chosen for its access pattern — not by default. The question isn't "which database?" but "what are the read/write patterns, consistency requirements, and scale characteristics?"
Data | Store | Why This Store
Trip state & history | PostgreSQL | ACID for state machine transitions. Sharded by city_id. Trips are financial records, so strong consistency is required.
Driver locations (live) | Redis (Geospatial) | GEOADD plus GEOSEARCH (GEORADIUS before Redis 6.2) for spatial queries. Locations older than 30s are treated as stale and evicted. ~2M active drivers in memory.
Location history | Cassandra | Append-only time-series writes. Partitioned by driver_id + date. High write throughput, no updates needed.
Surge pricing zones | Redis | Precomputed per geohash cell. Updated every 30–60s by pricing service. Read-heavy, low latency required.
User profiles & payments | PostgreSQL | Relational data with foreign keys. PCI-compliant vault for payment tokens. Read-heavy, strong consistency.
Trip events stream | Kafka | Durable event log. trip.requested, trip.matched, trip.completed. Consumed by analytics, billing, ETA models.
Failure Scenarios & Mitigation
Location Service node dies: Redis Cluster auto-failover to replica. Drivers re-register location on reconnect. A few seconds of stale data is acceptable per our consistency model.
WebSocket server crashes: Clients reconnect to a different node (sticky sessions via driver/rider ID). Trip state is in Postgres, so no data is lost, just a brief reconnection delay.
Driver accepts but goes offline: Trip Service has a heartbeat timeout. If no location update from driver for 30s after MATCHED, auto-reassign to next driver. Rider gets notified.
Matching Service can't find drivers: Expand search radius in concentric rings (5km → 10km → 15km). After 3 expansions, return NO_DRIVERS status. Trigger surge pricing increase for that area.
Kafka consumer lag: Location history writes fall behind but the real-time spatial index (Redis) is unaffected. Trip events: if WebSocket push is delayed, rider/driver can poll GET /v1/rides/{trip_id} as fallback.
PostgreSQL (Trip DB) down: This is our most critical failure. Primary-replica setup with automatic failover. If both are down, no new trips can be created and we show "service temporarily unavailable." In-progress trips continue (driver has cached trip info) but state transitions queue until recovery.
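The "expand the search radius, then give up" mitigation reduces to a short loop. In this sketch `search` is a hypothetical callback that returns the drivers found within a given radius, and the three rings are the ones named above.

```python
from typing import Callable, List, Sequence, Tuple

def find_with_expansion(search: Callable[[int], List[str]],
                        radii_km: Sequence[int] = (5, 10, 15)) -> Tuple[str, List[str]]:
    """Try each ring in turn; report NO_DRIVERS once every ring comes back empty."""
    for radius in radii_km:
        drivers = search(radius)
        if drivers:
            return "FOUND", drivers
    return "NO_DRIVERS", []

# Canned results: nobody within 5km, one driver within 10km.
drivers_by_radius = {5: [], 10: ["d42"], 15: ["d42", "d77"]}
print(find_with_expansion(lambda radius: drivers_by_radius[radius]))
```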
Scalability Bottlenecks
At Scale | What Breaks | Mitigation
10× (200M trips/day) | Single-city Postgres shard gets hot. Redis spatial index memory pressure. | Shard Trip DB by city_id. Partition Redis by geo-region.
100× (2B trips/day) | Matching Service becomes bottleneck because ETA computation is expensive. WebSocket tier needs 3,000–6,000 nodes. | Pre-compute ETAs in a spatial grid. Move to gRPC between internal services. Regional deployments.
Observability
  • Golden signals per service: Request rate, error rate (4xx/5xx), p50/p99 latency, saturation (CPU, memory, connections)
  • Business metrics: Match rate (% of requests that get a driver), time-to-match, surge coverage, cancellation rate
  • Distributed tracing: Trace from ride request → matching → dispatch → acceptance. Critical for debugging "why did matching take 30 seconds?"
  • Alerting: Match rate drops below 80%, time-to-match p99 > 30s, Trip DB replication lag > 5s, WebSocket reconnection spike
Security Highlights
  • AuthN: JWT tokens per session, short-lived (15 min) with refresh tokens. Separate tokens for rider and driver roles.
  • Location privacy: Rider's exact location is only shared with matched driver. Never expose driver home address.
  • Rate limiting: Ride requests capped per user (prevent abuse). Location updates capped per driver (prevent spoofing).
  • Fare tampering: Surge multiplier is server-side only. ride_token is signed — client can't modify fare estimate.
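One common way to make a token like ride_token tamper-evident is an HMAC over the estimate payload. The sketch below uses only the standard library; the token layout (base64 payload, a dot, base64 signature) and the hard-coded demo secret are assumptions for illustration, not the design's actual format.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-only-secret"  # assumption: real keys would come from a secrets manager

def sign_estimate(payload: dict, secret: bytes = SECRET) -> str:
    """Return '<base64 payload>.<base64 HMAC-SHA256 signature>'."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(body).decode("ascii") + "."
            + base64.urlsafe_b64encode(signature).decode("ascii"))

def verify_token(token: str, secret: bytes = SECRET) -> dict:
    """Return the payload if the signature matches; raise ValueError otherwise."""
    body_b64, signature_b64 = token.split(".")
    body = base64.urlsafe_b64decode(body_b64)
    expected = hmac.new(secret, body, hashlib.sha256).digest()
    # compare_digest runs in constant time, which avoids leaking match position.
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(signature_b64)):
        raise ValueError("ride_token signature mismatch")
    return json.loads(body)

token = sign_estimate({"fare_estimate": "22.50", "surge_multiplier": "1.5"})
print(verify_token(token))
```

A client that edits the payload cannot produce a matching signature without the server-side secret, so the server can trust the estimate it reads back from the token.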
06 Wrap-Up & Evolution 3–5 min
"To summarize: the system is organized around three main concerns — a write-heavy, in-memory location tracking pipeline using Redis with geohash-based spatial indexing; a strongly consistent trip state machine on PostgreSQL with event-driven updates via Kafka; and a decoupled pricing service with precomputed surge multipliers. The key tradeoff throughout is eventual consistency for location and surge data in exchange for low-latency reads, with strong consistency reserved for trip state where correctness matters most."
What I'd Build Next
Extension | Why It Matters | Architecture Impact
Shared Rides (Pool) | Higher utilization, lower fares | Matching becomes a combinatorial optimization problem. Needs a batch matching window instead of serial dispatch.
Scheduled Rides | Airport pickups, commute planning | New "scheduled" state in trip machine. Background scheduler that triggers matching N minutes before pickup.
Multi-Region | Latency for global users | Regional deployments with independent Location + Trip services per region. Cross-region user profile replication.
ML-based ETA | More accurate estimates | Train on Cassandra location history. Serve from a feature store. Replace simple routing API call.
Fraud Detection | Fake GPS, driver collusion | Stream processing on location events. Anomaly detection on speed/teleportation. Async, so it doesn't block the hot path.
Closing framing: This design optimizes for the two things that matter most for a ride-hailing platform — fast matching (location + dispatch) and reliable trip state (never lose a ride). Everything else is built around supporting those two pillars.
07 Interview Q&A Practice
"Here are the hardest questions an interviewer would ask about this design, and how to answer them. Each answer demonstrates deep understanding of the tradeoffs, not just surface knowledge."
Q1

How would you handle a sudden 5x spike in ride requests during a concert ending?

A

This is exactly what surge pricing is designed for — it's not just revenue, it's a load-shedding mechanism. The pricing service detects demand/supply imbalance per geohash cell and raises the multiplier, which does two things: (1) reduces demand by discouraging price-sensitive riders, and (2) increases supply by incentivizing drivers to move toward the surge zone. On the infrastructure side, the matching service is already sharded by city, so a local spike doesn't affect other cities. The WebSocket service scales horizontally — each instance handles a partition of connected drivers. The real bottleneck would be the matching algorithm itself: with 5x requests but not 5x drivers, we'd want to increase the search radius gradually and batch match requests to avoid the thundering herd problem.

Q2

Why Cassandra for location history instead of just using PostgreSQL?

A

Location history is a pure append workload: we never update a past location, only write new ones. At 2M active drivers updating every 4 seconds, that's ~500K writes/sec sustained. PostgreSQL could handle this with partitioning, but Cassandra gives us: (1) linear write scalability (just add nodes), (2) no single point of failure, since any node can accept writes, (3) natural time-series partitioning with driver_id + date as partition key, and (4) automatic TTL-based expiration for old data. The tradeoff is we can't do complex joins, but we never need to. Location history queries are always "give me driver X's positions between time A and B," which is a single-partition scan in Cassandra when the range falls within one day. PostgreSQL is the right choice for trips because trips have relational integrity requirements (rider, driver, payment, route all linked), but location is just a high-velocity stream.

Q3

What happens if the Matching Service goes down mid-request?

A

The Trip Service is the source of truth, not the Matching Service. When a rider requests a ride, the Trip Service creates a trip record in state REQUESTED with a TTL. The Matching Service is called asynchronously — if it fails, the trip stays in REQUESTED state. We have two safety nets: (1) a retry loop in the Trip Service that re-calls Matching every 5 seconds while the trip is in REQUESTED state, and (2) a timeout — if no match after 60 seconds, the trip transitions to NO_DRIVERS and the rider is notified. Because matching is idempotent (it reads driver locations and returns the best match), retries are safe. If the entire Matching Service fleet is down, riders see "no drivers available" — degraded but not broken. The worst failure mode would be matching succeeding but the response being lost — which is why the Trip Service does the state transition, not the Matching Service.
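The two safety nets (retry every 5 seconds, give up after 60 seconds) can be sketched as a loop with an injected clock so it runs instantly in a test. `try_match` is a hypothetical call into the Matching Service that returns a driver ID, returns None, or raises ConnectionError when the service is unreachable.

```python
from typing import Callable, Optional, Tuple

def match_with_retry(try_match: Callable[[], Optional[str]],
                     now: Callable[[], float],
                     sleep: Callable[[float], None],
                     retry_every_s: float = 5.0,
                     deadline_s: float = 60.0) -> Tuple[str, Optional[str]]:
    """Retry matching until a driver is found or the deadline passes."""
    started = now()
    while now() - started < deadline_s:
        try:
            driver_id = try_match()
        except ConnectionError:
            driver_id = None  # Matching Service unreachable; retry on the next tick
        if driver_id is not None:
            return "MATCHED", driver_id
        sleep(retry_every_s)
    return "NO_DRIVERS", None

class FakeClock:
    """Test double; production code would pass time.monotonic and time.sleep."""
    def __init__(self) -> None:
        self.t = 0.0
    def now(self) -> float:
        return self.t
    def sleep(self, seconds: float) -> None:
        self.t += seconds

clock = FakeClock()
print(match_with_retry(lambda: None, clock.now, clock.sleep))  # ('NO_DRIVERS', None)
print(clock.t)  # 60.0
```

The function only reports the outcome; the Trip Service would still be the component that writes MATCHED or NO_DRIVERS to the trip record.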

Q4

How do you ensure a driver doesn't get assigned to two rides simultaneously?

A

This is a distributed locking problem. When the Matching Service selects a driver, it must atomically mark that driver as "unavailable" before returning. We use Redis with a SET NX (set-if-not-exists) lock: `SET driver:{id}:trip trip_123 NX EX 30`. If the SET succeeds, the driver is ours. If it fails, someone else claimed them — pick the next candidate. The 30-second TTL is a safety net if the Trip Service crashes after locking but before completing the match. The Matching Service returns a ranked list of candidates and tries them in order until one lock succeeds. This means we never double-assign, but we might occasionally skip an optimal match if there's contention — which is fine, the second-best driver 200m away is nearly as good as the best driver 150m away.
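The claim loop can be sketched without a live Redis server by using a small in-memory stand-in. `FakeRedis` below imitates only the behavior of `SET key value NX EX seconds` (it follows redis-py's `set(name, value, nx=..., ex=...)` call shape, returning True on success and None when NX blocks the write); everything else about it is a simplification for illustration.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

class FakeRedis:
    """In-memory stand-in that imitates SET with the NX and EX options."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}
        self._clock = clock

    def set(self, name: str, value: str, nx: bool = False,
            ex: Optional[int] = None) -> Optional[bool]:
        now = self._clock()
        current = self._data.get(name)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # the existing key has expired
        if nx and current is not None:
            return None     # NX: refuse to overwrite a live key
        self._data[name] = (value, now + ex if ex is not None else None)
        return True

def claim_driver(redis: FakeRedis, ranked_driver_ids: Iterable[str],
                 trip_id: str, ttl_s: int = 30) -> Optional[str]:
    """Walk the ranked list and return the first driver whose lock we win."""
    for driver_id in ranked_driver_ids:
        if redis.set(f"driver:{driver_id}:trip", trip_id, nx=True, ex=ttl_s):
            return driver_id
    return None

current_time = [0.0]
redis = FakeRedis(clock=lambda: current_time[0])
print(claim_driver(redis, ["d1", "d2"], "trip_123"))  # d1
print(claim_driver(redis, ["d1", "d2"], "trip_456"))  # d2 (d1 is already locked)
```

Swapping in a real client should only require passing an object whose `set` accepts the same `nx` and `ex` keyword arguments.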

Q5

Why not use a single database for everything?

A

Because the access patterns are fundamentally different and a single database would force you to compromise on all of them. Trips need ACID transactions (state machine transitions must be atomic). Locations need 500K writes/sec with geographic queries. Surge zones need sub-millisecond reads. User sessions need TTL-based expiration. A single PostgreSQL instance could technically store all of this, but: (1) the location write volume would compete with trip transaction throughput, (2) geographic queries on Postgres require the PostGIS extension, which doesn't scale horizontally as cleanly as Redis geo queries, and (3) you'd need to over-provision for the union of all workload peaks. Polyglot persistence lets each store be optimized for its access pattern and scaled independently. The cost is operational complexity (more systems to monitor and maintain), which is why we only split when the access pattern genuinely demands it.