01 Clarify the Problem & Scope 5–7 min
"Let me restate: we're designing an e-commerce platform like Amazon. Users browse a product catalog, search for items, add them to a cart, and check out. The system needs to manage inventory across warehouses, process orders, and handle fulfillment. Let me scope this down."
Questions I'd Ask
  • What outcome are we optimizing for? → Purchase conversion rate (session → order) and customer lifetime value. Secondary: delivery speed (same-day/next-day), selection breadth. This shapes architecture: the browse path must be FAST (an often-cited figure is that every 100ms of latency costs ~1% of sales), the buy path must be CORRECT (no overselling), and the delivery promise must be HONEST (showing "delivery by Tuesday" when it's actually Thursday destroys trust).
  • Marketplace or first-party only? Do we support third-party sellers, or just our own inventory? → Both. Marketplace model with our own warehousing (FBA-like) is most interesting.
  • Core flow? Browse → search → product detail → cart → checkout → order tracking? → Yes, this is the primary flow.
  • Inventory model? Single warehouse or distributed? → Multi-warehouse. Inventory is distributed across regions.
  • Payment? → Acknowledge integration with a payment provider; don't deep-dive the payment gateway itself.
  • Recommendations? → Mention as a component, not a deep dive.
  • Scale? → ~300M active customers, ~500M products in catalog, ~5M orders/day normal, ~50M on peak days (Prime Day, Black Friday).
  • Flash sales / peak events? → Yes, must handle 10× traffic spikes. This is a critical constraint.
Agreed Scope
In Scope | Out of Scope
Product catalog & detail pages | Seller portal / seller onboarding
Product search | Recommendation engine internals
Shopping cart | Returns & refunds
Checkout & inventory reservation | Delivery logistics / last-mile
Order processing pipeline | Reviews & ratings
Inventory management (multi-warehouse) | Prime membership system
Order tracking | Advertising platform
Core Use Cases (ranked)
  • UC1: User browses/searches products → views product detail page with price, availability, and delivery estimate
  • UC2: User adds item to cart → cart persists across sessions
  • UC3: User checks out → inventory reserved → payment processed → order created
  • UC4: Order is fulfilled → user tracks status (placed → packed → shipped → delivered)
Non-Functional Requirements
  • Inventory accuracy is paramount: we must NEVER oversell (sell more than we have). This means strong consistency on the inventory decrement path.
  • Product catalog reads are eventually consistent: a product page showing a price that's 30 seconds stale is fine. Availability > freshness for browsing.
  • Checkout must be ACID: inventory reservation + order creation is a transaction that must not partially fail.
  • 10× spike tolerance: must handle Black Friday / Prime Day without degradation. This means the system can't be designed for average load only.
  • Cart is durable: users expect their cart to survive app restarts and multi-device access.
  • Catalog is read-heavy: 100:1 read:write ratio on product pages. Most users browse, few buy.
  • Global: multi-region serving for low latency. Inventory is regional (warehouse-specific).
The defining tension of this system: product browsing wants availability and speed (eventually consistent, heavily cached), but checkout wants correctness (strongly consistent, transactional). The architecture must cleanly separate these two worlds.
02 Back-of-the-Envelope Estimation 3–5 min
"Let me run the numbers. I want to separate normal load from peak load since the 10× spike is a first-class design constraint."
Product Page Views / Day
~5B
300M users × ~17 pages/day. ~58K/sec avg, ~500K/sec peak.
Search Queries / Day
~1B
~12K/sec avg, ~100K/sec peak.
Cart Updates / Day
~200M
~2.3K/sec avg, ~20K/sec peak. Add/remove/update quantity.
Orders / Day
5M norm / 50M peak
~58/sec avg, ~580/sec peak. Each order avg 3 items → ~1.7K inventory ops/sec peak.
Product Catalog Size
500M products
~2KB per product record → ~1TB of structured data. Images: ~50TB total.
Inventory Records
~2B rows
500M products × avg 4 warehouse locations. ~200GB structured data.
Read:Write Ratio (catalog)
~100:1
5B page views vs. ~50M catalog updates/day.
Conversion Rate
~2-3%
Of users who browse, few buy. System must optimize for the 97% browsing AND the 3% buying.
Key insight #1: 500K product page views/sec at peak. The catalog MUST be served from cache/CDN. Hitting the database for every product page is impossible.
Key insight #2: Inventory operations are relatively low volume (~1.7K/sec peak) BUT require strong consistency. This is a correctness problem, not a throughput problem. We can use a traditional relational DB with row-level locking.
Key insight #3: The 10× spike between normal and peak means we need elastic compute and aggressive caching. Designing for peak = overprovisioned 90% of the time. Designing for average = crashing on Black Friday. Solution: auto-scaling + queue-based order processing to absorb spikes.
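The arithmetic behind these estimates can be checked with a few lines. This is a minimal sketch that only replays the figures stated above; the variable names are illustrative, and the per-second values are daily averages rather than measured rates.

```python
SECONDS_PER_DAY = 86_400


def per_second(events_per_day: float) -> float:
    """Average events per second for a given daily total."""
    return events_per_day / SECONDS_PER_DAY


# Inputs taken from the estimates above.
page_views_per_day = 300e6 * 17      # 300M users x ~17 pages/day, about 5B
orders_peak_per_day = 50e6           # peak-day orders
items_per_order = 3                  # average items per order

avg_page_views_per_sec = per_second(page_views_per_day)        # on the order of 58-59K/sec
peak_orders_per_sec = per_second(orders_peak_per_day)          # roughly 580/sec
peak_inventory_ops_per_sec = peak_orders_per_sec * items_per_order  # roughly 1.7K/sec

catalog_bytes = 500e6 * 2_000        # 500M products x ~2KB each, about 1TB
inventory_rows = 500e6 * 4           # 500M products x ~4 warehouse locations, about 2B rows
```

The gap between ~59K page views/sec and ~1.7K inventory ops/sec is what motivates treating browsing as a throughput problem and inventory as a correctness problem.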
03 High-Level Design 8–12 min
"I'll draw the major services and walk through three flows: browsing a product page, the checkout critical path, and order fulfillment."
Key Architecture Decisions
"Here's WHY I chose each technology, mapping requirements to tradeoffs. Every choice has a rejected alternative and a consequence."
Requirement | Decision | Why (and what was rejected) | Consistency
Never oversell inventory (financial integrity) | PostgreSQL with compare-and-swap | UPDATE ... WHERE count > 0 is atomic. Eventual consistency would allow overselling during flash sales. | CP
Cart: spiky traffic, simple key-value | DynamoDB (not PostgreSQL) | Single-key lookups, TTL expiration, auto-scaling. No relational joins needed. PostgreSQL would need manual scaling for Black Friday spikes. | AP
Product search with facets + filters | Elasticsearch (not PostgreSQL full-text) | Faceted aggregations (brand, price range, ratings) are native in ES. PostgreSQL full-text search can't do facets efficiently. | AP
Order state requires ACID | PostgreSQL for orders (sharded by customer_id) | 95% of queries are "my orders", which are single-shard. Order → payment → shipment requires foreign keys + transactions. | CP
Fulfillment is async (doesn't block checkout) | Kafka / SQS for order events | Checkout completes in <3s. Warehouse processing takes hours. Queue decouples critical path from async pipeline. | -
Product images at CDN scale | S3 + CDN with content-addressable URLs | URL includes content hash → zero cache invalidation needed. New image = new URL. Old URLs expire naturally. | -
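The content-addressable URL idea in the last row can be sketched as follows. The CDN host, path layout, and function name are hypothetical; the only point is that the URL is derived from the image bytes, so changed content always gets a new URL and cached copies never need invalidating.

```python
import hashlib

CDN_BASE = "https://cdn.example.com/images"  # hypothetical CDN host, for illustration only


def image_url(image_bytes: bytes, extension: str = "jpg") -> str:
    """Build a URL whose path is the SHA-256 digest of the image content."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"{CDN_BASE}/{digest}.{extension}"


old_url = image_url(b"original product photo bytes")
new_url = image_url(b"re-shot product photo bytes")
# old_url and new_url differ, so the CDN can cache both indefinitely.
```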
Major Components
[Figure: High-Level Architecture. Layers, top to bottom: Clients (Web, iOS, Android apps); Edge / Load Balancing (CDN for product images and static assets; API Gateway for auth, rate limiting, routing); Application Services (Catalog, Search, Cart, Inventory, Order/Checkout, Fulfillment); Caching (Redis: cart, sessions, catalog); Message Queue / Async (Kafka / SQS: order events, fulfillment); Data Stores (PostgreSQL: orders, users; DynamoDB: cart, inventory; Elasticsearch: product search; S3: images, assets).]

📱 Client Apps CLIENT

  • Web, iOS, Android
  • REST API for CRUD, CDN for static assets

🌐 API Gateway + CDN EDGE

  • AuthN, rate limiting, routing
  • CDN caches product pages, images

📦 Product Catalog Service READ PATH

  • Product details, pricing, images
  • Heavily cached (Redis + CDN)
  • Seller updates via async pipeline

🔍 Search Service READ PATH

  • Full-text search + faceted filtering
  • Elasticsearch cluster
  • Autocomplete, spell correction

🛒 Cart Service STATEFUL

  • Add/remove/update cart items
  • Persists across sessions & devices
  • DynamoDB or Redis + DB backing

📊 Inventory Service CRITICAL

  • Real-time stock counts per warehouse
  • Reserve / release / decrement
  • Strong consistency: the hardest piece

💳 Checkout / Order Service CRITICAL

  • Orchestrates: inventory → payment → order creation
  • ACID transaction across steps
  • Idempotent (retry-safe)

📬 Order Fulfillment Service ASYNC

  • Picks warehouse, generates shipping
  • State machine: placed → packed → shipped → delivered
  • Integrates with shipping carriers
Flow 1: Product Browsing (the fast path)
Client → CDN
  cache HIT → return immediately (~5ms)
  cache MISS ↓
API Gateway → Catalog Service
  1. check Redis cache
  2. miss → read from Products DB
  3. enrich with availability (Inventory cache)
  4. populate cache, return
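The Catalog Service steps in this flow follow the cache-aside pattern. Below is a minimal sketch in which an in-memory TTL dictionary stands in for Redis; `TTLCache`, `get_product`, and the two loader callbacks are hypothetical names, not part of any real client library.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis: key -> (value, expiry time)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[dict, float]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:   # entry is stale, treat as a miss
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._data[key] = (value, self._clock() + self._ttl)


def get_product(product_id: str, cache: TTLCache,
                load_from_db: Callable[[str], dict],
                load_availability: Callable[[str], bool]) -> dict:
    """Cache-aside read: check cache, fall back to DB, enrich, repopulate."""
    cached = cache.get(product_id)                            # 1. check cache
    if cached is not None:
        return cached
    product = dict(load_from_db(product_id))                  # 2. miss: read source of truth
    product["availability"] = load_availability(product_id)   # 3. enrich (approximate)
    cache.set(product_id, product)                            # 4. populate cache
    return product
```

Because availability is cached together with the product, it can be stale for up to one TTL, which matches the "approximate availability" stance taken for the browse path.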
Flow 2: Checkout (the critical path)
Client → API Gateway → Order Service
  1. Fetch cart items from Cart Service
  2. Validate prices (re-fetch from Catalog)
  3. Select best warehouse per item (nearest with stock)
Order Service → Inventory Service
  success → continue
  failure → return "out of stock" for that item
Order Service → Payment Service (charge payment method)
  success → create order record (status: PLACED)
  failure → RELEASE inventory reservation
Order Service → Kafka (publish)
  consumers: Fulfillment (async), Notification (email confirm)
Flow 3: Order Fulfillment (async pipeline)
[Figure: Fulfillment Service]
"The most architecturally interesting challenge is the inventory reservation at checkout: how do we prevent overselling without creating a bottleneck? That's what I'd like to deep-dive first. Second, the order pipeline is a good saga / state machine discussion."
04 Deep Dives 25–30 min
Deep Dive 1: Inventory Management & Checkout (~12 min)
The core challenge: Never oversell. When 1,000 users simultaneously try to buy the last 5 units of a hot item on Black Friday, exactly 5 succeed and 995 get "out of stock." This is a concurrency control problem at the database level.

Inventory Model: Reserve → Confirm → Release

  • Available stock = total_stock - reserved_stock - sold_stock
  • Reserve: At checkout, atomically increment reserved_stock, guarded by a check that available stock covers the quantity. The item is "held" for this order.
  • Confirm: On successful payment, convert the reservation to a sale (decrement reserved_stock, increment sold_stock).
  • Release: On failed payment, timeout, or cancellation, release the reservation (decrement reserved_stock, which returns the units to available stock).
  • TTL on reservations: If payment isn't confirmed within 10 minutes, the reservation auto-expires. A background job sweeps expired reservations.
── Inventory DB (PostgreSQL, sharded by product_id) ──
inventory
  product_id      UUID
  warehouse_id    UUID
  total_stock     INT        // total units in this warehouse
  reserved_stock  INT        // currently held by in-flight checkouts
  sold_stock      INT        // confirmed sold
  updated_at      TIMESTAMP
  -- available = total_stock - reserved_stock - sold_stock
  PRIMARY KEY (product_id, warehouse_id)
reservations
  id              UUID PK
  order_id        UUID
  product_id      UUID
  warehouse_id    UUID
  quantity        INT
  status          ENUM (reserved, confirmed, released, expired)
  expires_at      TIMESTAMP  // 10 min from creation
  created_at      TIMESTAMP

The Critical SQL: Atomic Reservation

── Atomic inventory reservation (no overselling) ──
UPDATE inventory
SET reserved_stock = reserved_stock + :quantity
WHERE product_id = :product_id
  AND warehouse_id = :warehouse_id
  AND (total_stock - reserved_stock - sold_stock) >= :quantity;
-- If affected_rows = 0 → insufficient stock → return "out of stock"
-- If affected_rows = 1 → reservation succeeded
-- The WHERE clause + UPDATE is atomic (row-level lock in Postgres)
-- No explicit SELECT FOR UPDATE needed: the conditional UPDATE is sufficient
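To make the affected-rows check concrete, here is a small runnable sketch of the reserve / confirm / release cycle. It uses Python's built-in sqlite3 purely as a stand-in so it runs without a server; the design above targets PostgreSQL, and the helper names (`make_db`, `reserve`, `confirm`, `release`) are illustrative assumptions. It is single-connection, so it shows the guard logic rather than real concurrency.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory inventory table mirroring the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE inventory (
            product_id     TEXT NOT NULL,
            warehouse_id   TEXT NOT NULL,
            total_stock    INTEGER NOT NULL,
            reserved_stock INTEGER NOT NULL DEFAULT 0,
            sold_stock     INTEGER NOT NULL DEFAULT 0,
            PRIMARY KEY (product_id, warehouse_id)
        )""")
    return conn


def _update(conn: sqlite3.Connection, set_clause: str, guard: str,
            product_id: str, warehouse_id: str, quantity: int) -> bool:
    """Run one guarded UPDATE; True only if exactly one row changed."""
    cur = conn.execute(
        f"UPDATE inventory SET {set_clause} "
        f"WHERE product_id = :p AND warehouse_id = :w AND {guard}",
        {"p": product_id, "w": warehouse_id, "q": quantity},
    )
    conn.commit()
    return cur.rowcount == 1


def reserve(conn, product_id, warehouse_id, quantity) -> bool:
    # Succeeds only while available stock covers the requested quantity.
    return _update(conn, "reserved_stock = reserved_stock + :q",
                   "(total_stock - reserved_stock - sold_stock) >= :q",
                   product_id, warehouse_id, quantity)


def confirm(conn, product_id, warehouse_id, quantity) -> bool:
    # Payment succeeded: move units from reserved to sold.
    return _update(conn,
                   "reserved_stock = reserved_stock - :q, sold_stock = sold_stock + :q",
                   "reserved_stock >= :q", product_id, warehouse_id, quantity)


def release(conn, product_id, warehouse_id, quantity) -> bool:
    # Payment failed or the reservation expired: return units to available.
    return _update(conn, "reserved_stock = reserved_stock - :q",
                   "reserved_stock >= :q", product_id, warehouse_id, quantity)
```

With 5 units in stock, any number of single-unit `reserve` calls yields exactly 5 successes; the rest see zero affected rows and map to "out of stock".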

Pessimistic Locking (SELECT FOR UPDATE)

  • Lock the row → read → check → update → release lock
  • ✅ Simple to reason about
  • ❌ Locks held during payment processing → deadlocks under load
  • ❌ Hot items create lock contention (1000 threads waiting for same row)
❌ Don't lock during payment; lock only during reservation

Conditional UPDATE (our approach)

  • Single atomic UPDATE with WHERE guard → no explicit lock held
  • ✅ Lock duration is microseconds (just the UPDATE)
  • ✅ No deadlocks: single statement
  • ✅ Hot items handled fine: Postgres serializes row-level UPDATEs
  • ⚠️ Under extreme contention (>1000 concurrent), consider Redis-based counter as a pre-filter
✅ Default approach: fast, correct, simple
Why not Redis for inventory counts? Redis is faster, but inventory is a CORRECTNESS problem. If Redis crashes before persisting, we lose reservation state and risk overselling. A SQL conditional UPDATE gives us ACID + durability + simple recovery. At ~1.7K inventory ops/sec peak, Postgres handles this easily. Tradeoff: higher latency per operation (~5ms vs. Redis ~1ms), but correctness outweighs speed here. Redis IS used as a pre-filter ("approximate availability" check) to reject clearly-out-of-stock requests before hitting the DB.

Hot Item Strategy (Lightning Deals / Last Unit)

  • Problem: 10,000 users try to buy 100 units simultaneously β†’ 10,000 DB transactions hitting the same row.
  • Pre-filter with Redis: Maintain an approximate counter in Redis. Requests check Redis first; if the counter says 0, reject immediately without hitting the DB. Only let through ~2× the remaining stock.
  • Queue-based checkout for flash sales: For specific sale events, route checkout requests through an SQS queue. Process sequentially at DB level. Users get "your order is being processed" response and poll for status.
  • Warehouse spreading: Split hot item inventory across multiple warehouse rows. Reserve from any warehouse with stock. Reduces per-row contention.
The two-level architecture: Approximate availability (Redis, eventually consistent, for product page display) + exact reservation (Postgres, strongly consistent, for checkout). Users see "In Stock" on the page (cached, might be slightly wrong), but actual reservation at checkout is always accurate.
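The pre-filter described above can be sketched as an admission counter. This is a hedged illustration: a lock-protected in-process counter stands in for the Redis counter, and the class name and `overadmit_factor` parameter are assumptions chosen to mirror the "~2× the remaining stock" rule. Admission is not a reservation; every admitted request still goes through the conditional UPDATE in the database.

```python
import threading


class AdmissionPreFilter:
    """Approximate gate in front of the inventory DB (stand-in for a Redis counter)."""

    def __init__(self, remaining_stock: int, overadmit_factor: float = 2.0):
        # Admit more requests than there is stock, because some admitted
        # checkouts will fail payment or be abandoned.
        self._tokens = int(remaining_stock * overadmit_factor)
        self._lock = threading.Lock()

    def try_admit(self) -> bool:
        """Return True if the request may proceed to the exact DB reservation."""
        with self._lock:
            if self._tokens <= 0:
                return False          # clearly out of stock: reject without touching the DB
            self._tokens -= 1
            return True


prefilter = AdmissionPreFilter(remaining_stock=100)
admitted = sum(prefilter.try_admit() for _ in range(10_000))
# Only a bounded number of the 10,000 requests reach the database.
```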

Checkout Orchestration (Saga Pattern)

  • Step 1: Reserve inventory → on failure: return error, stop.
  • Step 2: Charge payment → on failure: RELEASE inventory reservation, return error.
  • Step 3: Create order record → on failure: REFUND payment, RELEASE inventory, return error.
  • Step 4: Publish order.placed event → on failure: order exists but fulfillment retries from event.
Orchestration vs. Choreography saga? Orchestration: the Order Service is the central coordinator that calls each step and handles compensating actions. With choreography (each service reacts to events), the failure recovery becomes distributed and much harder to debug. At this complexity level, a single orchestrator with clear compensating transactions is simpler. Tradeoff: the Order Service is a single point of coordination (but not a single point of failure, since it's stateless and horizontally scaled, with the order state in the DB).
Idempotency: Every checkout request gets a client-generated idempotency_key. Order Service checks if this key already has an order → returns existing order. This prevents double-charges on retries (network timeout, user double-clicks). Critical for payment safety.
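The four saga steps and the idempotency check can be sketched as a small orchestrator. Everything here is illustrative: the step callables, `StepFailed`, and the in-memory key map are assumptions standing in for real service calls and a durable store, and publish failures are simply swallowed on the assumption that a separate retry mechanism re-publishes the event.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


class StepFailed(Exception):
    """Raised by a step (reserve, charge, create_order, publish) that fails."""


@dataclass
class CheckoutSaga:
    reserve: Callable[[str], None]
    charge: Callable[[str], None]
    create_order: Callable[[str], str]
    release: Callable[[str], None]       # compensates reserve
    refund: Callable[[str], None]        # compensates charge
    publish: Callable[[str], None]
    orders_by_key: Dict[str, str] = field(default_factory=dict)

    def checkout(self, idempotency_key: str) -> str:
        # Idempotency: a repeated key returns the existing order, no re-charge.
        if idempotency_key in self.orders_by_key:
            return self.orders_by_key[idempotency_key]

        self.reserve(idempotency_key)            # step 1: nothing to undo on failure
        try:
            self.charge(idempotency_key)         # step 2
        except StepFailed:
            self.release(idempotency_key)
            raise
        try:
            order_id = self.create_order(idempotency_key)   # step 3
        except StepFailed:
            self.refund(idempotency_key)
            self.release(idempotency_key)
            raise
        self.orders_by_key[idempotency_key] = order_id
        try:
            self.publish(order_id)               # step 4
        except StepFailed:
            pass  # order already exists; publishing is retried out of band (assumption)
        return order_id
```

Compensations run in reverse order of the steps they undo, which keeps each failure path easy to read in one place.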
Deep Dive 2: Order Processing Pipeline (~8 min)
The core challenge: Process 5M orders/day (50M peak) through a multi-step fulfillment pipeline with reliable state tracking. Each order may span multiple warehouses and shipping carriers.

Order State Machine

PLACED → (warehouse assigned) → PROCESSING → (items picked) → PACKED → (carrier picked up) → SHIPPED → (carrier confirms) → DELIVERED
PLACED or PROCESSING → (payment failed, late) → CANCELLED
Multi-item orders: if items ship from different warehouses, each creates a separate SHIPMENT. Order status = worst-case of its shipments.
CANCELLED can happen from PLACED or PROCESSING (not after PACKED). On cancel: release inventory, initiate refund.
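The state machine can be written down as a transition table. This is a minimal sketch; it reads "worst-case of its shipments" as "the least advanced shipment status", which is an interpretation, and the function names are hypothetical.

```python
from typing import Dict, List, Set

# Allowed order transitions, as described above.
ALLOWED: Dict[str, Set[str]] = {
    "PLACED": {"PROCESSING", "CANCELLED"},
    "PROCESSING": {"PACKED", "CANCELLED"},
    "PACKED": {"SHIPPED"},          # no cancellation once packed
    "SHIPPED": {"DELIVERED"},
    "DELIVERED": set(),
    "CANCELLED": set(),
}

# Shipment statuses from least to most advanced (from the shipments schema).
SHIPMENT_PROGRESS: List[str] = ["pending", "picked", "packed", "shipped", "delivered"]


def transition(current: str, target: str) -> str:
    """Return the new status, or raise ValueError for an illegal transition."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {target}")
    return target


def least_advanced_shipment(statuses: List[str]) -> str:
    """Order-level view of a multi-shipment order: the slowest shipment wins."""
    return min(statuses, key=SHIPMENT_PROGRESS.index)
```

Keeping the table in one place makes it easy to reject out-of-order events, for example a late "carrier picked up" event for an order that was already cancelled.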
Storage: PostgreSQL for orders. Same rationale as the Trip Service in the Uber design: orders are our most critical business records, we need ACID, and the write volume (~580/sec peak) is well within Postgres capacity. Sharded by customer_id for efficient "my orders" queries.
── Orders DB (PostgreSQL, sharded by customer_id) ──
orders
  id                  UUID PK
  customer_id         BIGINT (shard key)
  status              ENUM (placed, processing, packed, shipped, delivered, cancelled)
  total_amount        DECIMAL
  currency            CHAR(3)
  shipping_address    JSONB
  payment_id          UUID (ref to payment provider txn)
  idempotency_key     UUID UNIQUE
  placed_at           TIMESTAMP
  updated_at          TIMESTAMP
order_items
  id                  UUID PK
  order_id            UUID FK → orders
  product_id          UUID
  warehouse_id        UUID
  quantity            INT
  unit_price          DECIMAL (price at time of purchase)
  reservation_id      UUID FK → reservations
shipments
  id                  UUID PK
  order_id            UUID FK → orders
  warehouse_id        UUID
  carrier             VARCHAR
  tracking_number     VARCHAR
  status              ENUM (pending, picked, packed, shipped, delivered)
  estimated_delivery  TIMESTAMP
  shipped_at          TIMESTAMP
-- Indexes: (customer_id, placed_at DESC), (status) for fulfillment queries

Event-Driven Fulfillment

  • Event flow: order.placed → Fulfillment Service consumes → assigns warehouse → creates shipments → publishes shipment.created → Warehouse System picks items → shipment.packed → Carrier integration → shipment.shipped → Carrier webhook → shipment.delivered.
  • Each state transition: Update order/shipment status in DB, publish event to Kafka, Notification Service sends email/push to customer.
  • Warehouse selection: For each item, pick the warehouse that (a) has stock AND (b) is closest to the shipping address. Goal: minimize shipping cost and delivery time. If one warehouse has all items, prefer a single shipment. If not, split into multiple shipments.
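The warehouse selection rule can be sketched as a two-pass greedy choice. This is a simplified illustration under stated assumptions: stock and distance are given as plain dictionaries, distance is a single number per warehouse, and the function name is hypothetical. A production version would also weigh shipping cost and carrier options.

```python
from typing import Dict, Optional


def assign_warehouses(items: Dict[str, int],
                      stock: Dict[str, Dict[str, int]],
                      distance_km: Dict[str, float]) -> Optional[Dict[str, str]]:
    """Map each product_id to a warehouse_id, or return None if an item is unavailable.

    items:       product_id -> quantity ordered
    stock:       warehouse_id -> {product_id -> available units}
    distance_km: warehouse_id -> distance to the shipping address
    """
    by_distance = sorted(stock, key=lambda w: distance_km[w])

    def has(warehouse: str, product: str, qty: int) -> bool:
        return stock[warehouse].get(product, 0) >= qty

    # Pass 1: prefer a single shipment from the nearest warehouse holding everything.
    for warehouse in by_distance:
        if all(has(warehouse, p, q) for p, q in items.items()):
            return {p: warehouse for p in items}

    # Pass 2: split the order, taking the nearest warehouse with stock per item.
    assignment: Dict[str, str] = {}
    for product, qty in items.items():
        choice = next((w for w in by_distance if has(w, product, qty)), None)
        if choice is None:
            return None
        assignment[product] = choice
    return assignment
```

Note the trade-off encoded in pass 1: a farther warehouse that holds every item beats two nearer warehouses, because one shipment is preferred over a split.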
Why event-driven instead of synchronous orchestration for fulfillment? Fulfillment is inherently async: it takes hours to days. Synchronous calls would mean holding connections open or polling. Event-driven lets each stage proceed independently, with Kafka providing durability. If the packing service is temporarily down, events queue up and process when it recovers. Tradeoff: harder to debug end-to-end (need distributed tracing), but the natural async nature of physical fulfillment makes events the right fit.
Delivery estimate: Shown at checkout. Computed by: warehouse selection (distance) + warehouse processing SLA (24h avg) + carrier transit time (carrier API). This is a latency-sensitive calculation done synchronously at checkout but backed by precomputed carrier transit-time tables.
Deep Dive 3: Product Catalog & Search (~5 min)
The core challenge: Serve 500M products at 500K reads/sec peak with <100ms latency. Index for full-text search with faceted filtering (category, price range, brand, rating).

Catalog Architecture

Layer | Tech | What It Serves | TTL / Freshness
CDN | CloudFront | Product page HTML, images, static data | 5–15 min TTL
Application Cache | Redis Cluster | Product objects (structured data for API) | 30s–5 min TTL
Primary Store | PostgreSQL (sharded) | Source of truth for product data | Strong
Search Index | Elasticsearch | Full-text search, faceted filtering, autocomplete | Near-real-time (seconds lag)
Media | S3 + CDN | Product images (multiple sizes) | Immutable (versioned URLs)
Why Elasticsearch for search instead of database full-text? Postgres full-text search works for simple queries but breaks down at our scale: 500M products with faceted filtering (category AND price range AND brand AND rating AND availability) requires an inverted index purpose-built for this. Elasticsearch gives us relevance scoring, fuzzy matching, autocomplete, and horizontal scaling. Tradeoff: it's not our source of truth; it's a derived index. Product updates flow through Kafka to an ES indexing consumer. There's a 1-5 second lag between a seller updating a price and the search index reflecting it.
── Products DB (PostgreSQL, sharded by product_id) ──
products
  id            UUID PK
  seller_id     BIGINT
  title         VARCHAR(500)
  description   TEXT
  category_id   INT
  brand         VARCHAR
  price         DECIMAL
  currency      CHAR(3)
  image_urls    TEXT[] (S3 keys)
  attributes    JSONB (color, size, weight, etc.)
  avg_rating    DECIMAL (denormalized)
  review_count  INT (denormalized)
  status        ENUM (active, inactive, suspended)
  created_at    TIMESTAMP
  updated_at    TIMESTAMP
── Elasticsearch Index ──
products_index
  All fields from products table, PLUS:
  availability     BOOLEAN (approximate, updated periodically)
  search_keywords  TEXT (title + brand + category + attributes, analyzed)
  // Mapping: title → text (analyzed) + title.keyword (exact match)
  // Facets: category_id, brand, price (range), avg_rating (range)

Catalog Update Pipeline

  • Seller updates product → write to Products DB → publish product.updated to Kafka
  • Consumers: (1) ES Indexer updates search index, (2) Cache Invalidator deletes Redis + CDN cache entries
  • Price changes are applied immediately in DB but may take 30s–5 min to propagate through all cache layers. The checkout flow always re-validates price from DB (not cache) to prevent stale-price purchases.
Price integrity: The price shown on the product page (from cache) is a display price. The price charged at checkout is always fetched fresh from the Products DB. If the price changed between browsing and buying, the checkout shows "price has changed" and asks for confirmation. This protects both the customer and the seller.
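The price re-validation at checkout amounts to comparing the prices the customer saw with a fresh read from the source of truth. A minimal sketch, assuming both sides are available as dictionaries keyed by product_id (the function name and data shapes are hypothetical):

```python
from decimal import Decimal
from typing import Dict, Tuple


def find_price_changes(displayed: Dict[str, Decimal],
                       current: Dict[str, Decimal]) -> Dict[str, Tuple[Decimal, Decimal]]:
    """Return product_id -> (price shown to the customer, fresh DB price) for every mismatch."""
    changes: Dict[str, Tuple[Decimal, Decimal]] = {}
    for product_id, shown in displayed.items():
        fresh = current[product_id]
        if fresh != shown:
            changes[product_id] = (shown, fresh)
    return changes


changed = find_price_changes(
    {"p1": Decimal("19.99"), "p2": Decimal("5.00")},
    {"p1": Decimal("21.99"), "p2": Decimal("5.00")},
)
# A non-empty result means checkout should pause and show "price has changed".
```

Using Decimal rather than float avoids rounding surprises when comparing monetary amounts.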
Deep Dive 4: Data Model & Storage Summary (~5 min)
Data | Store | Access Pattern | Consistency
Product Catalog | PostgreSQL (sharded) + Redis + CDN | 500K reads/sec peak (from cache). 50M seller updates/day. | Strong writes, eventual reads
Search Index | Elasticsearch (cluster) | 100K search queries/sec peak. Faceted filtering + ranking. | Near-real-time (seconds lag from source)
Inventory | PostgreSQL (sharded by product_id) | ~1.7K reservations/sec peak. Conditional UPDATEs. | Strong (ACID), non-negotiable
Shopping Cart | DynamoDB (or Redis + DB) | ~20K updates/sec peak. Key-value by user_id. | Strong (durable across sessions)
Orders | PostgreSQL (sharded by customer_id) | ~580 writes/sec peak. "My orders" query. | Strong (ACID)
Shipments | PostgreSQL (same cluster as orders) | Updated by fulfillment pipeline. Query by order_id. | Strong
User Profiles | PostgreSQL + Redis cache | Read-heavy, addresses, payment methods. | Strong writes, cached reads
Product Images | S3 + CDN | ~50TB total. Immutable (versioned). Served from CDN edge. | Eventual (CDN propagation)
Events | Kafka | order.placed, product.updated, shipment.status_changed, etc. | Ordered per partition
Why DynamoDB for cart instead of Redis? Cart data must be durable (survive app restarts, persist across devices). Redis with persistence (RDB/AOF) can do this, but DynamoDB gives us built-in durability, auto-scaling (critical for 10× spikes), and single-digit ms reads with zero ops burden. The access pattern is simple key-value (user_id → cart items) which is DynamoDB's sweet spot. Tradeoff: higher per-request cost than Redis, but carts are small (~5KB avg) and the ops simplicity is worth it at scale.
Why Postgres for both inventory AND orders? They're different databases (different shard keys: product_id vs. customer_id), but same technology because both need ACID transactions. The checkout saga coordinates between them. We don't try to do a distributed transaction across them; instead, we use the saga with compensating actions. If the order write fails after inventory reservation, we release the reservation as compensation.
📑 API Design
Product Catalog
GET /v1/products/{product_id} - Product detail page
Response: {id, title, description, price, images[], attributes, avg_rating, review_count, availability}
Heavily cached (CDN + Redis). Availability is approximate.
GET /v1/search?q={query}&category=&price_min=&price_max=&brand=&sort=&cursor=&limit=
Backed by Elasticsearch. Response: {products[], facets: {categories[], brands[], price_ranges[]}, total_count, next_cursor}
Shopping Cart
GET /v1/cart - Get current user's cart
Response: {items: [{product_id, quantity, unit_price, product_snapshot}], subtotal}
POST /v1/cart/items - Add item to cart
Request: {product_id, quantity}
PUT /v1/cart/items/{product_id} - Update quantity
Request: {quantity} (set to 0 to remove)
DELETE /v1/cart - Clear entire cart
Checkout & Orders
POST /v1/checkout - Process checkout (the critical path)
Request: {shipping_address_id, payment_method_id, idempotency_key}
Response: {order_id, status: "placed", estimated_delivery, total_charged}
Headers: Idempotency-Key: {uuid} (prevents double charges on retry)
GET /v1/orders?cursor=&limit=20 - My orders (paginated)
GET /v1/orders/{order_id} - Order detail with shipment tracking
Response: {order, items[], shipments: [{id, status, tracking_number, carrier, estimated_delivery}]}
POST /v1/orders/{order_id}/cancel - Cancel order (if before PACKED)
Triggers: release inventory reservation, initiate refund.
05 Cross-Cutting Concerns 10–12 min
Failure Scenarios & Mitigation
Inventory DB shard down: Primary-replica failover. During failover (~30s), checkouts for products on that shard queue and retry. Product browsing is unaffected (served from cache). The Redis pre-filter still rejects clearly-out-of-stock requests.
Payment provider timeout: Inventory is already reserved with 10-min TTL. Retry payment with exponential backoff. If retries are exhausted, release the reservation and return an error to the user. They can retry checkout manually.
Double checkout (user clicks twice): Idempotency key catches this. Second request returns existing order without creating a duplicate or double-charging.
Reservation expires before payment: Background sweeper releases expired reservations every 60s. If payment succeeds AFTER the reservation expired, the order creation step will attempt to re-reserve; if stock is gone, refund payment and notify user.
Elasticsearch cluster down: Search is degraded but browsing continues (product pages served from cache/DB). Fallback: route search queries to a simpler DB-backed search (slower, less relevant). Alert on-call.
Kafka consumer lag (fulfillment): Orders are placed and recorded in DB. Fulfillment is delayed but not lost. Consumers catch up. Customer sees "processing" slightly longer than usual.
CDN origin overloaded: Stale-while-revalidate: CDN serves stale content while fetching fresh in background. Origin has circuit breakers. Even a 5-min stale product page is acceptable.
Black Friday 10× spike: Pre-warm CDN and Redis caches for top 10K products. Auto-scale API and checkout services. Queue-based checkout for flash deals. Degrade non-critical features (recommendations, reviews) to protect the purchase path.
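The "retry payment with exponential backoff" mitigation can be sketched as below. The function name, default delays, and the use of full jitter are assumptions rather than anything the design specifies; the sketch also assumes the payment call raises TimeoutError on a provider timeout and that each retry reuses the same idempotency key, so a retry after a slow-but-successful charge cannot double-charge.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(operation: Callable[[], T],
                       max_attempts: int = 4,
                       base_delay: float = 0.5,
                       max_delay: float = 8.0,
                       sleep: Callable[[float], None] = time.sleep,
                       rng: Callable[[], float] = random.random) -> T:
    """Call operation(), retrying on TimeoutError with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise                      # retries exhausted: caller releases the reservation
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * rng())           # full jitter spreads out synchronized retries
    raise RuntimeError("max_attempts must be at least 1")
```

The total retry budget should stay well under the 10-minute reservation TTL, otherwise the payment can succeed after the hold has already expired.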
Scalability Bottlenecks
At Scale | What Breaks | Mitigation
10× (50M orders/day) | Inventory DB row contention on hot products. Single Elasticsearch cluster at query limits. Order DB write throughput. | Redis pre-filter + queue-based checkout for hot items. ES cluster scale-out (add data nodes). Shard Order DB by customer region.
100× (500M orders/day) | Saga orchestration becomes a bottleneck, with the Order Service coordinating too many steps. Kafka partition throughput. Global inventory coordination. | Move to choreography saga at this scale. Regional Kafka clusters. Per-region inventory with cross-region rebalancing. CQRS: separate order write model from read model.
Consistency Model Summary
Data | Model | Rationale
Inventory reservation | Strong (ACID) | Overselling is unacceptable. Row-level locks on conditional UPDATE.
Order creation | Strong (ACID) | Can't lose or duplicate orders. Financial record.
Product page display | Eventual (5 min) | Slightly stale price/description is invisible to users.
Availability on product page | Eventual (30s) | Approximate. Real check at checkout. "In Stock" might be wrong briefly.
Search results | Eventual (seconds) | ES index lags behind source of truth. Acceptable for discovery.
Shopping cart | Strong (durable) | Users expect cart to persist. DynamoDB reads are eventually consistent by default; requesting strongly consistent reads (ConsistentRead) gives read-after-write behavior for the cart.
Order status updates | Eventual (seconds) | Event-driven. Slight lag between warehouse scan and user seeing "shipped."
Observability
  • Checkout pipeline: End-to-end latency from "click Buy" to order confirmation. Success rate. Failure breakdown (inventory, payment, system error). Inventory reservation hit rate (% that succeed on first attempt).
  • Business metrics: Conversion rate (browse-to-buy), cart abandonment rate, average order value, orders/second real-time.
  • Infrastructure: Inventory DB transaction latency (p99), Redis cache hit ratio (target: >95%), ES query latency, Kafka consumer lag per topic.
  • Alerting: Checkout error rate >1%, inventory DB p99 >50ms, reservation timeout rate spikes, payment failure rate >5%, Kafka lag >1000 messages.
Security
  • Price integrity: Prices are NEVER trusted from the client. Checkout always re-fetches price from DB. Prevents price manipulation attacks.
  • Inventory attacks: Rate limit cart additions (prevent bots reserving all stock). Reservation TTL ensures held stock is released.
  • Payment security: PCI compliance β€” payment details never touch our servers. Tokenized via payment provider (Stripe, Adyen). Only token stored.
  • Bot protection: CAPTCHA on checkout during flash sales. Device fingerprinting. Purchase quantity limits per customer per product.
  • Data encryption: All PII encrypted at rest. TLS in transit. Shipping addresses and payment tokens in separate, access-controlled stores.
06 Wrap-Up & Evolution 3–5 min
"To summarize: the architecture is built around a clean separation between the browsing path and the buying path. Browsing is eventually consistent, aggressively cached across CDN and Redis, and designed for 500K reads/sec with minimal backend load. Buying is strongly consistent: inventory reservation uses atomic conditional UPDATEs in PostgreSQL to guarantee no overselling, checkout follows a saga pattern with compensating transactions for each failure mode, and orders are ACID-committed. The bridge between these two worlds is a two-level inventory model: approximate availability for display (Redis, fast, slightly stale) and exact reservation for purchase (Postgres, correct, ACID). For 10× spikes like Black Friday, we use a Redis pre-filter to shed clearly-failed requests before they hit the DB, queue-based checkout for flash sales, and aggressive CDN warming for hot products."
What I'd Build Next
Extension | Why It Matters | Architecture Impact
Recommendation Engine | Often cited as driving ~35% of Amazon's revenue | Collaborative filtering + content-based. Precomputed per-user recommendations stored in a feature store. Served at product page and cart. Adds read load but no write-path changes.
Reviews & Ratings | Trust signal, conversion driver | Separate service. Write-light, read-heavy (denormalized avg_rating on product). Moderation pipeline (similar to content moderation).
Returns & Refunds | Post-purchase experience | Reverse of order pipeline: return requested → item received → refund processed → inventory restocked. Separate state machine.
Seller Analytics | Marketplace health | CQRS: read model built from order + inventory events via CDC to a data warehouse. Separate from operational path.
Multi-Region Active-Active | Global latency, disaster recovery | Regional inventory DBs (each region owns its warehouse inventory). Catalog replicated read-only. Orders owned by the region closest to customer. Cross-region conflict resolution for shared resources.
Dynamic Pricing | Competitive positioning, margin optimization | ML model trained on demand signals, competitor prices, inventory levels. Updates prices asynchronously. Same cache invalidation pipeline.
Subscribe & Save | Recurring revenue | Scheduler service that creates orders on cadence. Requires handling out-of-stock, price changes, and payment failures gracefully.
Closing framing: This design is defined by ONE fundamental tension: the browsing experience demands speed and availability (cached, eventual, optimistic), while the buying experience demands correctness (transactional, consistent, pessimistic about inventory). The architecture resolves this by cleanly separating the two paths and bridging them with a two-level inventory model. Every technology choice (CDN for catalog, Redis for approximate availability, Postgres for reservations, DynamoDB for cart, Kafka for async fulfillment) follows from which side of this tension it serves.
07 Interview Q&A Practice
"Here are the hardest questions an interviewer would ask about this design, and how to answer them. Each answer demonstrates deep understanding of the tradeoffs, not just surface knowledge."
Q1

How do you prevent overselling during a flash sale when 100K users click "Buy" simultaneously?

A

This is the defining challenge. The inventory decrement MUST be strongly consistent, so we use a conditional atomic update (compare-and-swap in spirit): `UPDATE inventory SET count = count - 1 WHERE product_id = X AND count > 0`. Once count reaches 0, the WHERE clause no longer matches, so later attempts update zero rows and the application treats that as "sold out." But hitting a single row in PostgreSQL with 100K concurrent writes would be a disaster, because every writer queues on the same row lock. So we use a reservation pattern: (1) when the user clicks "Add to Cart," we don't decrement inventory in PostgreSQL; we create a soft reservation in Redis with a 10-minute TTL, (2) Redis DECR is atomic and handles 100K+/sec, (3) the final hard decrement in PostgreSQL only happens at checkout, where traffic is naturally lower (only ~10-20% of carts convert). If the Redis reservation expires (user abandoned cart), the count is restored. Key expiry alone does not increment a counter, so a release step (a sweeper or an expiry listener, for example) has to add the unit back. This means during a flash sale, Redis absorbs the thundering herd, and PostgreSQL sees a manageable write rate. The tradeoff: a user might see "In Stock" but fail at checkout if all reservations are taken, which is better than overselling.
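The two levels described above can be sketched in a few lines. This is a minimal illustration, not the production design: SQLite stands in for PostgreSQL so the conditional `UPDATE` is runnable anywhere, and an in-memory class stands in for the Redis counter. The names `hard_decrement` and `SoftReservations` are hypothetical.

```python
import sqlite3


def hard_decrement(conn, product_id):
    """Take one unit with a conditional UPDATE; False means sold out."""
    cur = conn.execute(
        "UPDATE inventory SET count = count - 1 "
        "WHERE product_id = ? AND count > 0",
        (product_id,),
    )
    conn.commit()
    # Zero rows updated means the WHERE clause did not match (count was 0).
    return cur.rowcount == 1


class SoftReservations:
    """In-memory stand-in for the Redis soft-reservation counter."""

    def __init__(self, available, ttl_seconds=600.0):
        self.available = available
        self.ttl = ttl_seconds
        self.holds = {}  # user_id -> expiry timestamp

    def _release_expired(self, now):
        # Explicit release step: expired holds give their unit back.
        for user_id, expires_at in list(self.holds.items()):
            if expires_at <= now:
                del self.holds[user_id]
                self.available += 1

    def reserve(self, user_id, now):
        self._release_expired(now)
        if user_id in self.holds:
            return True  # repeat clicks by the same user are a no-op
        if self.available <= 0:
            return False
        self.available -= 1
        self.holds[user_id] = now + self.ttl
        return True
```

With two units in stock, three calls to `hard_decrement` succeed twice and then fail, and the stored count never goes negative. In a real deployment the soft layer would need the reserve-and-record step to be atomic on the Redis side as well (for instance via a server-side script), which this single-threaded sketch does not model.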

Q2

Why DynamoDB for the cart instead of PostgreSQL?

A

Carts are session-scoped, user-specific, and have a simple access pattern: get cart by user_id, add/remove items, expire after 30 days. DynamoDB excels here because: (1) single-key lookups at <5ms regardless of scale, (2) auto-scaling: cart traffic is extremely spiky (Black Friday), and DynamoDB in on-demand capacity mode absorbs this without manual capacity planning, (3) TTL-based expiration built in, so abandoned carts are deleted automatically, (4) no fixed schema beyond the key, so cart items can have variable attributes (gift wrap, custom engraving). PostgreSQL could work, but we'd need to manage connection pools for spiky traffic, handle TTL cleanup ourselves, and the relational model doesn't buy us anything since carts have no joins. The cart is a document, not a relational entity. However, at checkout, we DO copy the final cart into PostgreSQL as an order, because orders have relational integrity (linked to payments, shipments, invoices).
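As a rough illustration of "the cart is a document," here is what one cart item might look like before it is written. DynamoDB's TTL feature reads a numeric attribute holding a Unix epoch timestamp in seconds; the attribute names (`user_id`, `lines`, `expires_at`) and the helper `build_cart_item` are assumptions made for this sketch, and no AWS call is made.

```python
import time

CART_TTL_SECONDS = 30 * 24 * 60 * 60  # 30 days, per the answer above


def build_cart_item(user_id, lines, now=None):
    """Build a cart document keyed by user_id with a TTL attribute."""
    now = int(time.time()) if now is None else int(now)
    return {
        "user_id": user_id,  # partition key (assumed name)
        "lines": lines,      # each line may carry different attributes
        "updated_at": now,
        # Epoch-seconds attribute that a TTL setting would point at.
        "expires_at": now + CART_TTL_SECONDS,
    }


cart = build_cart_item(
    "user-42",
    [
        {"sku": "mug-01", "qty": 2},
        {"sku": "pen-07", "qty": 1, "engraving": "A.B."},  # variable attribute
    ],
)
```

TTL deletion runs as a background process and is not instantaneous, so a reader that cares about exact expiry would still compare `expires_at` against the current time.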

Q3

Walk me through what happens when a payment fails after inventory is decremented.

A

This is a saga pattern. The checkout flow is: (1) decrement inventory, (2) charge payment, (3) create order. If payment fails after step 1, we need a compensating transaction: re-increment inventory. The Order Service orchestrates this, and each step publishes an event to Kafka. If step 2 fails, the Order Service publishes an `inventory.release` event, and the Inventory Service adds the count back. The order moves to PAYMENT_FAILED state. Key design decisions: (1) we decrement inventory BEFORE charging payment, not after, because a charge that succeeds with no inventory behind it is worse (we'd have to refund), (2) the saga has a timeout: if the payment gateway doesn't respond in 30 seconds, we release inventory and show an error, (3) each step is idempotent, so retrying an inventory decrement with the same order_id is a no-op if it already succeeded. The Kafka event log gives us an audit trail and the ability to replay failed sagas.
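The orchestration and its compensating step can be sketched without any infrastructure. This is a simplified in-process model under stated assumptions: Kafka events are replaced by direct method calls, the payment gateway is a plain callable, and the names `Inventory`, `run_checkout`, and `PaymentDeclined`, plus the `OUT_OF_STOCK` and `CONFIRMED` states, are hypothetical. Only `PAYMENT_FAILED` comes from the design above.

```python
class PaymentDeclined(Exception):
    """Raised by the (hypothetical) payment callable on a declined charge."""


class Inventory:
    def __init__(self, count):
        self.count = count
        self.held = set()  # order_ids whose decrement has been applied

    def decrement(self, order_id):
        if order_id in self.held:
            return True  # idempotent: a retry with the same order_id is a no-op
        if self.count <= 0:
            return False
        self.count -= 1
        self.held.add(order_id)
        return True

    def release(self, order_id):
        # Compensating action; also idempotent.
        if order_id in self.held:
            self.held.remove(order_id)
            self.count += 1


def run_checkout(order_id, inventory, charge):
    """Step 1: decrement. Step 2: charge. Compensate if step 2 fails."""
    if not inventory.decrement(order_id):
        return "OUT_OF_STOCK"
    try:
        charge(order_id)
    except (PaymentDeclined, TimeoutError):
        inventory.release(order_id)  # stands in for the inventory.release event
        return "PAYMENT_FAILED"
    return "CONFIRMED"
```

A declined charge leaves the stock count where it started, and a later order can still take the unit. Treating a timeout like a decline mirrors decision (2), although a real system would also have to reconcile charges that succeeded after the timeout fired.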

Q4

How would you design the search service to handle "running shoes under $50" with filtering?

A

This is an Elasticsearch problem, not a PostgreSQL problem. Product catalog data lives in two stores: PostgreSQL is the source of truth (for orders, inventory), and Elasticsearch is a derived search index (for discovery). The search query "running shoes under $50" becomes an ES query: `bool: must: [match: "running shoes" on title+description], filter: [range: price < 50, term: category = "shoes"]`. Filters use ES filter context (cacheable, no scoring), while text matching uses query context (scored by relevance). We add faceted aggregations so the UI can show "Brand: Nike (234), Adidas (189)..." in the sidebar. The index is not written in the same transaction as PostgreSQL; it is updated asynchronously. A product price change in PostgreSQL publishes a Kafka event, which the search indexer consumes to update ES. This means search results can be ~5 seconds stale, which is acceptable. For price-sensitive queries, the product detail page always reads from PostgreSQL (source of truth), so the user sees the accurate price before buying.
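Spelled out as a request body, the query described above might look like the following. The structure (`bool`, `must`, `filter`, `range`, `term`, `multi_match`, `aggs`/`terms`) is standard Elasticsearch query DSL; the field names (`title`, `description`, `price`, `category`, `brand`), the aggregation name `by_brand`, and the helper `build_search_body` are assumptions about the index mapping, and `brand` is assumed to be a keyword field.

```python
import json


def build_search_body(text, max_price, category):
    """Build a search body: scored text match plus unscored filters and facets."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [
                    {"multi_match": {"query": text,
                                     "fields": ["title", "description"]}}
                ],
                # Filter context: yes/no matching, no scoring.
                "filter": [
                    {"range": {"price": {"lt": max_price}}},
                    {"term": {"category": category}},
                ],
            }
        },
        # Facet counts for the sidebar, e.g. products per brand.
        "aggs": {"by_brand": {"terms": {"field": "brand"}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("running shoes", 50, "shoes"), indent=2))
```

Turning the free-text phrase "under $50" into `max_price=50` is a separate query-understanding step that this sketch takes as given.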

Q5

What's your CDN invalidation strategy when a product image changes?

A

We don't invalidate; we use content-addressable URLs. Every product image URL includes a hash of the content: `cdn.example.com/images/{hash}.jpg`. When a seller uploads a new image, it gets a new hash, and the product record is updated to point to the new URL. The old URL remains valid in CDN cache until it naturally expires (TTL 30 days), but no one references it anymore. This means: (1) zero cache invalidation needed, (2) browsers cache images aggressively (the URL literally changes when content changes), (3) rollback is trivial (revert the product record to the old hash). The one exception is the product listing page HTML, which references the image URL. It has a short CDN TTL (5 minutes) so it picks up image URL changes quickly. The images themselves are immutable, so they can stay cached for their full TTL without ever needing a purge.
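A minimal sketch of the URL scheme, assuming SHA-256 as the content hash (the answer does not name one) and an `https://` scheme in front of the `cdn.example.com/images/{hash}.jpg` pattern. The helpers `image_url` and `cache_headers` are hypothetical.

```python
import hashlib

CDN_BASE = "https://cdn.example.com/images"  # scheme is an assumption
IMAGE_TTL_SECONDS = 30 * 24 * 60 * 60        # 30 days, per the answer above


def image_url(image_bytes):
    """Derive the URL from the bytes, so new content always gets a new URL."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"{CDN_BASE}/{digest}.jpg"


def cache_headers():
    """Headers for an immutable asset: long max-age, never revalidated."""
    return {"Cache-Control": f"public, max-age={IMAGE_TTL_SECONDS}, immutable"}


old_url = image_url(b"front-view-v1")
new_url = image_url(b"front-view-v2")
# The product record simply swaps old_url for new_url; nothing is purged.
```

Uploading identical bytes twice yields the same URL, which also gives deduplication for free. Rollback amounts to writing `old_url` back into the product record.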