- What outcome are we optimizing for? → Trip completion rate (rider requests a ride → ride finishes successfully). Secondary: rider wait time, driver utilization, ETA accuracy. This tells us the system's job is MATCHING and RELIABILITY, not just speed. A fast match that leads to a cancelled trip is worse than a slower match that completes.
- Which side of the marketplace? Both rider and driver, or just one? → Assume both.
- Ride types? Just point-to-point, or also shared rides (UberPool), scheduled rides? → Start with point-to-point only. Mention pool as an extension.
- Geography? Single city, single country, or global? → Design for multi-city, discuss global as evolution.
- Payment? In scope or out of scope? → Out of scope for deep dive; acknowledge it exists.
- Scale? Roughly how many concurrent riders/drivers? → ~100M riders, ~5M drivers globally, ~20M trips/day.
- Latency expectations? How fast should matching happen? → Rider sees a driver within 5–10 seconds of requesting.
| In Scope | Out of Scope |
|---|---|
| Rider requests a ride | Payment processing |
| Driver location tracking | Driver onboarding / identity verification |
| Matching rider → driver | Shared rides (UberPool) |
| Real-time trip tracking | Scheduled rides |
| Pricing & surge | Ratings & reviews |
| ETA computation | Chat/calling between rider & driver |
- UC1: Rider requests ride → system matches to nearest available driver → driver accepts
- UC2: Driver continuously reports location → system tracks in real-time
- UC3: Both rider and driver see live trip progress (pickup → in-ride → drop-off)
- UC4: System computes fare estimate before ride and final fare after
- Availability > Consistency — better to show a slightly stale driver location than to fail a ride request
- Matching latency — <10 seconds from request to driver assignment
- Location freshness — driver location updates every 3–5 seconds
- Trip state must be durable — can't lose an in-progress trip (strongly consistent)
- Surge pricing can be eventually consistent — a few seconds of staleness is fine
| Requirement | Decision | Why (and what was rejected) | Consistency |
|---|---|---|---|
| ACID for trip state (financial record) | PostgreSQL (sharded by city) | Trips involve payments + state machine transitions that must be atomic. DynamoDB was rejected: its transactions are capped in size and it offers no joins or foreign keys. | CP |
| 125K location writes/sec, append-only | Cassandra (time-series) | Pure append, never updated. Linear write scaling. PostgreSQL would compete for IOPS with trips. | AP |
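| Sub-ms spatial queries for matching | Redis Geospatial | GEOSEARCH (GEORADIUS is deprecated since Redis 6.2) typically answers in <1ms from memory. Members of a geo set cannot carry their own TTL, so stale entries are pruned with a per-driver heartbeat key (with TTL) or a periodic sweep. PostGIS at roughly 10-50ms per query is too slow for real-time matching. | AP |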
| Decouple billing, analytics, ML from hot path | Kafka event stream | Durable log with independent consumers. Direct service calls would create tight coupling + no replay. | — |
| Handle 100K QPS surge without session affinity | Stateless services + API Gateway | Any pod handles any request. Stateful servers can't scale independently or fail over cleanly. | — |
| Real-time push to rider/driver apps | WebSocket (not polling) | Server-initiated push for location + trip state updates. Polling at 4s intervals wastes bandwidth and adds latency. | — |
📱 Rider App CLIENT
- Request ride, see ETA, track trip
- WebSocket for real-time updates
🚗 Driver App CLIENT
- Send location, accept/reject rides
- WebSocket for dispatch & trip updates
🌐 API Gateway EDGE
- AuthN, rate limiting, routing
- REST for CRUD, WS upgrade for real-time
🔌 WebSocket Service REAL-TIME
- Holds 3M persistent connections
- Horizontally scaled (30–60 nodes)
- Pushes trip updates & dispatch offers
📍 Location Service HOT PATH
- Ingests ~125K location updates/sec (500K active drivers, one update every 4s)
- In-memory spatial index (geohash grid)
- "Find nearest drivers" queries
🔀 Matching / Dispatch Service CORE
- Queries Location Service for nearby drivers
- Ranks by ETA, rating, vehicle type
- Sends offer to best driver via WS
🚦 Trip Service CORE
- Trip state machine (requested → matched → pickup → in_ride → completed)
- Strongly consistent (PostgreSQL)
💰 Pricing Service CORE
- Fare estimate before ride
- Surge multiplier computation
- Final fare calculation after ride
Spatial Indexing Strategy — Geohash Grid
Divide the world into geohash cells (precision 6 = ~1.2km × 0.6km). Each cell maps to a set of driver IDs. When a rider requests a ride, compute their geohash and query that cell plus neighboring cells.
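The cell lookup described above can be sketched in a few lines. This is a minimal illustration, not the production index: `geohash_encode` and `cell_and_neighbors` are hypothetical helper names, and the neighbor cells are found by re-encoding points shifted by one cell width or height, which is a simple alternative to the usual bit-manipulation neighbor tables.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    """Standard geohash: interleave longitude and latitude bisection bits."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    bits, bit_count, use_lng = 0, 0, True
    chars = []
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits, lng_lo = (bits << 1) | 1, mid
            else:
                bits, lng_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

def cell_and_neighbors(lat: float, lng: float, precision: int = 6) -> set:
    """Rider's cell plus its 8 neighbors, found by shifting one cell at a time."""
    lng_bits = (5 * precision + 1) // 2
    lat_bits = (5 * precision) // 2
    cell_w = 360.0 / 2 ** lng_bits   # cell width in degrees of longitude
    cell_h = 180.0 / 2 ** lat_bits   # cell height in degrees of latitude
    cells = set()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            n_lat = min(max(lat + dy * cell_h, -90.0), 90.0)    # clamp at poles
            n_lng = ((lng + dx * cell_w + 180.0) % 360.0) - 180.0  # wrap at ±180
            cells.add(geohash_encode(n_lat, n_lng, precision))
    return cells
```

Each returned cell ID would then be used as the lookup key for the set of driver IDs in that cell.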
In-Memory Architecture
| Component | Tech | Rationale |
|---|---|---|
| Spatial Index | Redis Cluster (sorted sets per geohash) | In-memory, O(log N) updates, ~125K writes/sec is feasible across a cluster |
| Driver State | Redis hash per driver | Status (available/busy/offline), current geohash, vehicle type |
| Persistence | Kafka → Cassandra | Location history for ETA models and analytics. Async — not on hot path |
Matching Algorithm
- Step 1: Compute rider's geohash → query that cell + 8 neighbors → get candidate driver IDs
- Step 2: Filter by status=available, vehicle_type match
- Step 3: Compute ETA for top candidates (call Maps/routing service)
- Step 4: Rank by ETA (primary), driver rating (secondary)
- Step 5: Send ride offer to top-ranked driver via WebSocket
- Step 6: If rejected or timeout (10s) → offer to next driver. Max 3 attempts before expanding search radius.
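Steps 2 to 6 can be sketched as a filter, a sort, and a bounded offer loop. This is a minimal sketch under assumptions: `Candidate`, `rank_candidates`, and `dispatch` are hypothetical names, ETAs are assumed to be already computed, and `send_offer` stands in for the WebSocket offer plus the 10-second accept timeout (it returns `True` only if the driver accepts in time).

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

@dataclass(frozen=True)
class Candidate:
    driver_id: str
    status: str          # "available", "busy", or "offline"
    vehicle_type: str
    eta_seconds: float   # assumed to come from the routing service
    rating: float

def rank_candidates(candidates: Iterable[Candidate], vehicle_type: str) -> List[Candidate]:
    """Step 2 (filter) and step 4 (rank by ETA, then by higher rating)."""
    eligible = [
        c for c in candidates
        if c.status == "available" and c.vehicle_type == vehicle_type
    ]
    return sorted(eligible, key=lambda c: (c.eta_seconds, -c.rating))

def dispatch(
    candidates: Iterable[Candidate],
    vehicle_type: str,
    send_offer: Callable[[str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Steps 5 and 6: offer to drivers in rank order, at most max_attempts times.

    Returns the accepting driver's ID, or None, in which case the caller
    would expand the search radius and try again.
    """
    for candidate in rank_candidates(candidates, vehicle_type)[:max_attempts]:
        if send_offer(candidate.driver_id):
            return candidate.driver_id
    return None
```

Serial offers keep the logic simple, at the cost of up to 30 seconds of waiting in the worst case (three 10-second timeouts).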
State Machine
Event Publishing
On every state transition, Trip Service publishes an event to Kafka: trip.status_changed {trip_id, old_status, new_status, timestamp}. Consumers include: WebSocket Service (push to rider/driver apps), Pricing Service (trigger fare calculation on COMPLETED), Analytics pipeline.
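A minimal sketch of the transition check and event publication follows. The forward states come from the Trip Service description (requested → matched → pickup → in_ride → completed); the `cancelled` state, the `ALLOWED` table, and the `transition` and `publish` names are assumptions for illustration. `publish` stands in for a Kafka producer call.

```python
import time
from typing import Callable, Dict, Set

# Allowed transitions. "cancelled" is an assumption added for illustration.
ALLOWED: Dict[str, Set[str]] = {
    "requested": {"matched", "cancelled"},
    "matched": {"pickup", "cancelled"},
    "pickup": {"in_ride", "cancelled"},
    "in_ride": {"completed"},
    "completed": set(),
    "cancelled": set(),
}

class InvalidTransition(Exception):
    """Raised when a status change is not permitted from the current state."""

def transition(
    trip: dict,
    new_status: str,
    publish: Callable[[str, dict], None],
) -> dict:
    """Validate the transition, return the updated trip, and publish the event."""
    old_status = trip["status"]
    if new_status not in ALLOWED[old_status]:
        raise InvalidTransition(f"{old_status} -> {new_status}")
    updated = {**trip, "status": new_status}
    publish("trip.status_changed", {
        "trip_id": trip["trip_id"],
        "old_status": old_status,
        "new_status": new_status,
        "timestamp": time.time(),
    })
    return updated
```

In a real deployment the status write to PostgreSQL and the event publication would need to be tied together (for example with an outbox table) so that a crash between the two cannot lose an event; that part is outside this sketch.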
Fare Calculation
Surge Pricing Design
- Input signals: Demand (ride requests per cell per minute) and supply (available drivers per cell)
- Computation: A background job runs every 30–60 seconds per city. For each geohash cell, compute demand/supply ratio → map to surge multiplier via a configurable curve.
- Storage: Redis key `surge:{city}:{geohash}` → multiplier (one key per cell), with a TTL of 2 minutes so the value auto-expires if the job fails.
- Smoothing: Apply an exponential moving average to prevent wild oscillation. Multiplier changes are capped at ±0.5x per cycle.
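The per-cell computation can be sketched as below. The curve here (the demand/supply ratio clamped to the range 1.0 to 3.0) and the smoothing factor `alpha=0.5` are assumptions chosen for illustration; the notes only say the curve is configurable. `raw_multiplier` and `next_multiplier` are hypothetical names.

```python
def raw_multiplier(demand: int, supply: int, max_multiplier: float = 3.0) -> float:
    """Map the demand/supply ratio to a multiplier (assumed linear, clamped curve)."""
    ratio = demand / max(supply, 1)  # avoid division by zero in empty cells
    return min(max(1.0, ratio), max_multiplier)

def next_multiplier(
    previous: float,
    demand: int,
    supply: int,
    alpha: float = 0.5,
    max_step: float = 0.5,
) -> float:
    """One job cycle: EMA smoothing, then cap the change at ±max_step."""
    target = raw_multiplier(demand, supply)
    smoothed = alpha * target + (1 - alpha) * previous
    step = max(-max_step, min(max_step, smoothed - previous))
    return round(previous + step, 2)

# A cell jumps from balanced to 4x demand: the multiplier climbs in capped steps.
m = 1.0
for _ in range(3):
    m = next_multiplier(m, demand=40, supply=10)
    print(m)  # 1.5, then 2.0, then 2.5
```

The value returned for each cell is what the job would write to that cell's Redis key, refreshing the 2-minute TTL on every cycle.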
| Data | Store | Access Pattern | Consistency |
|---|---|---|---|
| Driver Location (live) | Redis Cluster | ~125K writes/sec, spatial queries | Eventual (best-effort) |
| Driver Location (history) | Cassandra | Append-only, time-range queries for ETA/analytics | Eventual |
| Trip State | PostgreSQL | CRUD by trip_id, query by rider/driver + status | Strong (ACID) |
| User Profiles | PostgreSQL | Read by user_id, low write frequency | Strong |
| Surge Multipliers | Redis | Read by geohash, written every 30–60s | Eventual (TTL 2m) |
| Event Stream | Kafka | Pub/sub for trip events, location events | Ordered per partition |
- Request: {pickup: {lat, lng}, dropoff: {lat, lng}, vehicle_type}
  Response: {fare_estimate, surge_multiplier, eta_minutes, ride_token}
- Request: {ride_token, pickup, dropoff, vehicle_type, payment_method_id}
  Response: {trip_id, status: "REQUESTED"} (subsequent updates come via WebSocket)
- {trip_id, rider_info, pickup, dropoff}
- {status: "ARRIVED" | "IN_PROGRESS" | "COMPLETED" | "CANCELLED"}
- {type: "location", lat, lng, timestamp} (every 4s)
- Outbound: {type: "ride_offer", trip_id, pickup, rider_name, fare_estimate}
- Outbound: {type: "trip_update", trip_id, status, ...}
- {type: "trip_update", trip_id, status, driver_location, eta}
- Outbound: {type: "driver_location", lat, lng} (every 4s during active trip)
| Data | Store | Why This Store |
|---|---|---|
| Trip state & history | PostgreSQL | ACID for state machine transitions. Sharded by city_id. Trips are financial records — strong consistency required. |
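| Driver locations (live) | Redis (Geospatial) | GEOADD/GEOSEARCH for spatial queries. Locations with no update for 30s are pruned using a per-driver heartbeat key with TTL=30s, since members of a geo set cannot expire individually. ~500K active drivers in memory. |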
| Location history | Cassandra | Append-only time-series writes. Partitioned by driver_id + date. High write throughput, no updates needed. |
| Surge pricing zones | Redis | Precomputed per geohash cell. Updated every 30s by pricing service. Read-heavy, low latency required. |
| User profiles & payments | PostgreSQL | Relational data with foreign keys. PCI-compliant vault for payment tokens. Read-heavy, strong consistency. |
| Trip events stream | Kafka | Durable event log. trip.requested, trip.matched, trip.completed. Consumed by analytics, billing, ETA models. |
| At Scale | What Breaks | Mitigation |
|---|---|---|
| 10× (200M trips/day) | Single-city Postgres shard gets hot. Redis spatial index memory pressure. | Split hot city shards further (for example by city_id plus a hash of trip_id). Partition Redis by geo-region. |
| 100× (2B trips/day) | Matching Service becomes bottleneck — ETA computation is expensive. WebSocket tier needs 300-600 nodes. | Pre-compute ETAs in a spatial grid. Move to gRPC between internal services. Regional deployments. |
- Golden signals per service: Request rate, error rate (4xx/5xx), p50/p99 latency, saturation (CPU, memory, connections)
- Business metrics: Match rate (% of requests that get a driver), time-to-match, surge coverage, cancellation rate
- Distributed tracing: Trace from ride request → matching → dispatch → acceptance. Critical for debugging "why did matching take 30 seconds?"
- Alerting: Match rate drops below 80%, time-to-match p99 > 30s, Trip DB replication lag > 5s, WebSocket reconnection spike
- AuthN: JWT tokens per session, short-lived (15 min) with refresh tokens. Separate tokens for rider and driver roles.
- Location privacy: Rider's exact location is only shared with matched driver. Never expose driver home address.
- Rate limiting: Ride requests capped per user (prevent abuse). Location updates capped per driver (limits flooding; spoofed GPS still needs separate anomaly detection).
- Fare tampering: Surge multiplier is server-side only. `ride_token` is signed, so the client can't modify the fare estimate.
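One way to make `ride_token` tamper-evident is an HMAC over the serialized estimate. The sketch below assumes HMAC-SHA256 with a server-held secret; the notes do not specify the signing scheme, and `sign_ride_token`, `verify_ride_token`, and the token layout are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: in practice loaded from a secret store

def sign_ride_token(payload: dict, secret: bytes = SECRET) -> str:
    """Serialize the fare estimate and append an HMAC-SHA256 signature."""
    raw = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    body = base64.urlsafe_b64encode(raw).decode()
    signature = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"

def verify_ride_token(token: str, secret: bytes = SECRET) -> Optional[dict]:
    """Return the payload if the signature is valid, otherwise None."""
    body, _, signature = token.rpartition(".")
    expected = hmac.new(secret, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):  # constant-time comparison
        return None
    return json.loads(base64.urlsafe_b64decode(body))
```

When the rider confirms the request, the server verifies the token and uses the fare and surge values inside it, ignoring any price fields the client sends separately. A real token would also carry an expiry so an old estimate cannot be replayed.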
| Extension | Why It Matters | Architecture Impact |
|---|---|---|
| Shared Rides (Pool) | Higher utilization, lower fares | Matching becomes a combinatorial optimization problem. Needs a batch matching window instead of serial dispatch. |
| Scheduled Rides | Airport pickups, commute planning | New "scheduled" state in trip machine. Background scheduler that triggers matching N minutes before pickup. |
| Multi-Region | Latency for global users | Regional deployments with independent Location + Trip services per region. Cross-region user profile replication. |
| ML-based ETA | More accurate estimates | Train on Cassandra location history. Serve from a feature store. Replace simple routing API call. |
| Fraud Detection | Fake GPS, driver collusion | Stream processing on location events. Anomaly detection on speed/teleportation. Async — doesn't block hot path. |
How would you handle a sudden 5x spike in ride requests during a concert ending?
This is exactly what surge pricing is designed for — it's not just revenue, it's a load-shedding mechanism. The pricing service detects demand/supply imbalance per geohash cell and raises the multiplier, which does two things: (1) reduces demand by discouraging price-sensitive riders, and (2) increases supply by incentivizing drivers to move toward the surge zone. On the infrastructure side, the matching service is already sharded by city, so a local spike doesn't affect other cities. The WebSocket service scales horizontally — each instance handles a partition of connected drivers. The real bottleneck would be the matching algorithm itself: with 5x requests but not 5x drivers, we'd want to increase the search radius gradually and batch match requests to avoid the thundering herd problem.
Why Cassandra for location history instead of just using PostgreSQL?
Location history is a pure append workload — we never update a past location, only write new ones. At 500K active drivers updating every 4 seconds, that's ~125K writes/sec sustained. PostgreSQL could handle this with partitioning, but Cassandra gives us: (1) linear write scalability — just add nodes, (2) no single point of failure — any node can accept writes, (3) natural time-series partitioning with driver_id + date as partition key, and (4) automatic TTL-based expiration for old data. The tradeoff is we can't do complex joins — but we never need to. Location history queries are always "give me driver X's positions between time A and B," which is a single partition scan in Cassandra. PostgreSQL is the right choice for trips because trips have relational integrity requirements (rider, driver, payment, route all linked), but location is just a high-velocity stream.
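The write-rate figure follows directly from the two numbers in the answer. The daily row count below is an upper bound that assumes the active driver count stays at 500K around the clock.

```python
ACTIVE_DRIVERS = 500_000
UPDATE_INTERVAL_S = 4
SECONDS_PER_DAY = 86_400

writes_per_sec = ACTIVE_DRIVERS // UPDATE_INTERVAL_S
rows_per_day = writes_per_sec * SECONDS_PER_DAY  # upper bound: constant load all day

print(writes_per_sec)  # 125000
print(rows_per_day)    # 10800000000
```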
What happens if the Matching Service goes down mid-request?
The Trip Service is the source of truth, not the Matching Service. When a rider requests a ride, the Trip Service creates a trip record in state REQUESTED with a TTL. The Matching Service is called asynchronously — if it fails, the trip stays in REQUESTED state. We have two safety nets: (1) a retry loop in the Trip Service that re-calls Matching every 5 seconds while the trip is in REQUESTED state, and (2) a timeout — if no match after 60 seconds, the trip transitions to NO_DRIVERS and the rider is notified. Because matching is idempotent (it reads driver locations and returns the best match), retries are safe. If the entire Matching Service fleet is down, riders see "no drivers available" — degraded but not broken. The worst failure mode would be matching succeeding but the response being lost — which is why the Trip Service does the state transition, not the Matching Service.
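The two safety nets (retry every 5 seconds, give up after 60 seconds) can be sketched as a single loop. `await_match` and `try_match` are hypothetical names; `try_match` stands in for the call to the Matching Service and returns a driver ID or `None`. The clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional, Tuple

def await_match(
    try_match: Callable[[], Optional[str]],
    retry_interval_s: float = 5.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    now: Callable[[], float] = time.monotonic,
) -> Tuple[str, Optional[str]]:
    """Retry matching while the trip is REQUESTED; give up after timeout_s."""
    deadline = now() + timeout_s
    while True:
        try:
            driver_id = try_match()
        except Exception:
            # Matching Service unreachable: treat as "no match yet" and retry.
            driver_id = None
        if driver_id is not None:
            return ("MATCHED", driver_id)
        if now() + retry_interval_s > deadline:
            return ("NO_DRIVERS", None)
        sleep(retry_interval_s)
```

The caller (the Trip Service in this design) performs the actual state transition based on the returned status, which keeps the Matching Service free of durable state.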
How do you ensure a driver doesn't get assigned to two rides simultaneously?
This is a distributed locking problem. A selected driver must be atomically marked "unavailable" before the assignment is confirmed. We use Redis with a SET NX (set-if-not-exists) lock: `SET driver:{id}:trip trip_123 NX EX 30`. If the SET succeeds, the driver is ours. If it fails, another trip has already claimed them, so we pick the next candidate. The 30-second TTL is a safety net if the Trip Service crashes after locking but before completing the match. The Matching Service returns a ranked list of candidates, and the Trip Service tries them in order until one lock succeeds, which keeps matching itself read-only and safe to retry. This means we never double-assign, but we might occasionally skip an optimal match if there's contention. That is acceptable: the second-best driver 200m away is nearly as good as the best driver 150m away.
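The claim loop can be sketched as follows. `claim_first_available` is a hypothetical helper; it expects a client whose `set(name, value, nx=..., ex=...)` behaves like redis-py's `Redis.set`, which returns a truthy value on success and `None` when `NX` blocks the write. `FakeRedis` is an in-memory stand-in included only so the sketch runs without a server.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

def claim_first_available(
    client,
    trip_id: str,
    ranked_driver_ids: Iterable[str],
    ttl_s: int = 30,
) -> Optional[str]:
    """Try SET NX EX on each candidate in rank order; return the first one claimed."""
    for driver_id in ranked_driver_ids:
        if client.set(f"driver:{driver_id}:trip", trip_id, nx=True, ex=ttl_s):
            return driver_id
    return None

class FakeRedis:
    """In-memory stand-in for demos; mimics SET with NX and EX only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}
        self._clock = clock

    def set(self, name, value, nx=False, ex=None):
        now = self._clock()
        current = self._data.get(name)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # key has expired
        if nx and current is not None:
            return None  # NX: key exists, write refused
        self._data[name] = (value, now + ex if ex is not None else None)
        return True

store = FakeRedis()
print(claim_first_available(store, "trip_1", ["d1", "d2"]))  # d1
print(claim_first_available(store, "trip_2", ["d1", "d2"]))  # d2
print(claim_first_available(store, "trip_3", ["d1", "d2"]))  # None
```

Because the lock expires after 30 seconds, a completed match would also need to record the driver as busy in durable or longer-lived state before the TTL runs out.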
Why not use a single database for everything?
Because the access patterns are fundamentally different and a single database would force you to compromise on all of them. Trips need ACID transactions (state machine transitions must be atomic). Locations need 125K writes/sec with geographic queries. Surge zones need sub-millisecond reads. User sessions need TTL-based expiration. A single PostgreSQL instance could technically store all of this, but: (1) the location write volume would compete with trip transaction throughput, (2) geographic queries on Postgres require PostGIS extensions that don't scale horizontally as cleanly as Redis GEORADIUS, and (3) you'd need to over-provision for the union of all workload peaks. Polyglot persistence lets each store be optimized for its access pattern and scaled independently. The cost is operational complexity — more systems to monitor and maintain — which is why we only split when the access pattern genuinely demands it.