Bloomfield & Co. requires a cloud-native, omnichannel commerce platform capable of serving a web storefront, native mobile applications (iOS & Android), and in-store point-of-sale terminals across UK pop-up locations.
The recommended architecture is a microservices-based system deployed on AWS, following a domain-driven design decomposition. It will be capable of handling 50,000 monthly active users at launch and scaling to 500,000 within 18 months with minimal re-architecture.
Key architectural decisions:
• API-first design — all channels consume a single versioned REST gateway (a GraphQL federation layer is a medium-term addition)
• Event-driven inventory & order pipeline — decouples channel demand spikes from fulfilment
• PCI-DSS scope minimisation via Stripe Elements (card data never touches Bloomfield servers)
• GDPR-compliant data residency — all PII remains within eu-west-2 (London) region
• Infrastructure-as-Code with Terraform; zero manual click-ops in production
Estimated monthly infrastructure cost at launch: £1,400–£1,900. At 500k MAU: £4,800–£6,200.
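As a rough sanity check on the estimate above, the cost per monthly active user can be derived from the quoted ranges. The sketch below uses only the figures in this summary; the function name is illustrative and not part of the design.

```python
def cost_per_mau_pence(monthly_cost_gbp: float, mau: int) -> float:
    """Return infrastructure cost per monthly active user, in pence."""
    return monthly_cost_gbp / mau * 100


# Ranges quoted above: launch (50k MAU) and scaled (500k MAU).
launch = (cost_per_mau_pence(1_400, 50_000), cost_per_mau_pence(1_900, 50_000))
scaled = (cost_per_mau_pence(4_800, 500_000), cost_per_mau_pence(6_200, 500_000))

print(f"Launch:   {launch[0]:.2f}p to {launch[1]:.2f}p per MAU")
print(f"500k MAU: {scaled[0]:.2f}p to {scaled[1]:.2f}p per MAU")
```

On these figures the per-user cost falls by roughly two thirds as traffic grows tenfold, which is the economy of scale the estimate implies.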
FUNCTIONAL REQUIREMENTS
────────────────────────
• Product catalogue browsing & search (faceted filters, real-time stock levels)
• Shopping cart, checkout, and order management across web, iOS, Android, and POS
• Customer account management: registration, login, address book, order history
• Promotions & discount-code engine
• Inventory management with warehouse & pop-up store stock sync
• Order fulfilment workflow: pick, pack, despatch, tracking integration (DPD/Royal Mail)
• In-store POS: barcode scan → inventory lookup → card payment → receipt
NON-FUNCTIONAL REQUIREMENTS
──────────────────────────
• Availability: 99.9% SLA (≤8.76 hrs downtime/yr)
• p95 API response: < 300 ms under peak load
• Peak concurrency: 5,000 simultaneous sessions (assumes 10× spike on Black Friday)
• Horizontal auto-scaling — no manual intervention during traffic spikes
• RTO: 1 hour / RPO: 15 minutes
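The availability target translates into a concrete downtime budget. A minimal worked calculation, assuming a 365-day year and a 30-day month (both simplifications):

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def downtime_budget_hours(availability_pct: float,
                          period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of permitted downtime in a period for a given availability target."""
    return period_hours * (1 - availability_pct / 100)


yearly_hours = downtime_budget_hours(99.9)
monthly_minutes = downtime_budget_hours(99.9, period_hours=30 * 24) * 60

print(f"99.9% allows about {yearly_hours:.2f} hours per year")
print(f"99.9% allows about {monthly_minutes:.1f} minutes per 30-day month")
```

A single incident that runs to the full 1-hour RTO would therefore use more than a month's worth of the budget.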
COMPLIANCE
──────────
• PCI-DSS Level 3 (Stripe as payment processor, SAQ-A eligibility)
• UK GDPR & Data Protection Act 2018
• Accessibility: WCAG 2.1 AA for web storefront
ASSUMPTIONS
───────────
• AWS eu-west-2 (London) as primary region; no multi-region requirement at launch
• Stripe used for all card payments; no direct card data handling by Bloomfield
• Team of 8 engineers: 3 backend, 2 frontend, 1 mobile, 1 DevOps, 1 QA
• Existing inventory managed in a legacy Shopify instance; migration is out of scope
ARCHITECTURAL STYLE
───────────────────
Domain-driven microservices on AWS ECS (Fargate), fronted by a single API Gateway. Services communicate synchronously via REST for read paths and asynchronously via Amazon EventBridge for write/event paths (orders, inventory mutations).
DOMAIN DECOMPOSITION
─────────────────────
1. Catalogue Service — product data, search index (OpenSearch), image CDN (CloudFront + S3)
2. Inventory Service — stock levels, warehouse sync, reservation holds
3. Cart & Checkout Service — basket state (ElastiCache/Redis), checkout orchestration
4. Order Service — order lifecycle FSM, fulfilment events, tracking webhooks
5. Customer Identity Service — Cognito-backed auth, profile, address book
6. Payment Service — Stripe payment intent creation & webhook processing (no card data stored)
7. Promotions Service — discount codes, flash-sale rules, eligibility engine
8. Notification Service — transactional email (SES) + push notifications (SNS + APNs/FCM)
9. POS Adapter Service — lightweight bridge for in-store terminals → core API
DATA TIER
──────────
• Amazon Aurora PostgreSQL (Multi-AZ) — transactional data (orders, customers, inventory)
• Amazon OpenSearch Service — product search & autocomplete
• ElastiCache for Redis — sessions, cart state, rate-limiting counters
• S3 + CloudFront — product images, static assets
FRONTEND DELIVERY
──────────────────
• Web: Next.js (SSR) deployed via AWS Amplify / CloudFront
• Mobile: React Native; thin API consumer sharing business logic with web
• POS: Stripe Terminal SDK on iPad; communicates through POS Adapter Service
┌─────────────────────────────────────────────────────────────────────────────┐
│                     BLOOMFIELD & CO. — AWS ARCHITECTURE                     │
└─────────────────────────────────────────────────────────────────────────────┘
CLIENTS
  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
  │ Web Browser  │  │ iOS App      │  │ Android App  │  │ POS iPad     │
  │ (Next.js)    │  │(React Native)│  │(React Native)│  │(Stripe Term.)│
  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘
         │                 │                 │                 │
         └─────────────────┴────────┬────────┴─────────────────┘
                                    │ HTTPS
                                    ▼
       ┌─────────────────────────────────────────────────────────┐
       │           CloudFront CDN + WAF (OWASP rules)            │
       └────────────────────────────┬────────────────────────────┘
                                    │
                                    ▼
       ┌─────────────────────────────────────────────────────────┐
       │             API Gateway (REST + WebSocket)              │
       │           Rate limiting · Auth (Cognito JWT)            │
       └────┬────────────┬────────────┬────────────┬─────────────┘
            │            │            │            │
       ┌────▼────┐  ┌────▼────┐  ┌────▼────┐  ┌────▼────┐
       │Catalogue│  │ Cart &  │  │  Order  │  │Customer │  ... (ECS Fargate services)
       │ Service │  │Checkout │  │ Service │  │Identity │
       └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘
            │            │            │            │
            └────────────┴─────┬──────┴────────────┘
                               │ Events (Amazon EventBridge)
            ┌──────────────────┼──────────────────┐
       ┌────▼────┐        ┌────▼────┐      ┌──────▼──────┐
       │Inventory│        │ Payment │      │Notification │
       │ Service │        │ Service │      │   Service   │
       │         │        │(Stripe) │      │(SES/SNS/FCM)│
       └────┬────┘        └─────────┘      └─────────────┘
            │
DATA TIER   │
       ┌────▼─────────────────┬──────────────┬───────────────────┐
       │ Aurora PostgreSQL    │ OpenSearch   │ ElastiCache Redis │
       │ (Multi-AZ)           │ (Search)     │ (Sessions/Cart)   │
       └──────────────────────┴──────────────┴───────────────────┘
STORAGE & CDN
       ┌─────────────────────────────────────────────────────────┐
       │      S3 (Images/Assets) → CloudFront (Global CDN)       │
       └─────────────────────────────────────────────────────────┘
OBSERVABILITY
       ┌─────────────────────────────────────────────────────────┐
       │   CloudWatch Logs · X-Ray Tracing · Grafana Dashboard   │
       └─────────────────────────────────────────────────────────┘
CATALOGUE SERVICE
• Technology: Node.js (Fastify), TypeScript
• Responsibilities: Product CRUD, variant management, price rules, image upload to S3
• Search: Syncs to OpenSearch via EventBridge "product.updated" events
• Caching: CloudFront TTL 300s for product detail pages; Redis for browse pages
• Scaling: ECS Fargate; auto-scales on CPU > 60%
CART & CHECKOUT SERVICE
• Technology: Python (FastAPI)
• Cart state stored in Redis with 24-hour TTL; recovered from DB on cache miss
• Checkout orchestration: reservation → payment intent → order create (saga pattern)
• Idempotency keys on all Stripe API calls so that retried requests are safe; inbound webhook events de-duplicated by event ID to handle duplicate deliveries
ORDER SERVICE
• Technology: Python (FastAPI)
• Finite state machine: pending → confirmed → picking → packed → despatched → delivered
• State transitions published to EventBridge; consumers: Inventory, Notification, Analytics
• Webhook receiver for carrier tracking updates (DPD / Royal Mail)
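The order lifecycle above maps naturally onto a transition table. This is a sketch under stated assumptions: only the six listed states are modelled (no cancellation or refund branch, since none is specified), and the event shape and the name `order.state_changed` are hypothetical.

```python
from typing import Dict, Set

# Allowed transitions, taken from the lifecycle listed above.
TRANSITIONS: Dict[str, Set[str]] = {
    "pending": {"confirmed"},
    "confirmed": {"picking"},
    "picking": {"packed"},
    "packed": {"despatched"},
    "despatched": {"delivered"},
    "delivered": set(),
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in TRANSITIONS."""


def advance(current: str, target: str) -> Dict[str, object]:
    """Validate a transition and return the event that would be published."""
    if target not in TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {target}")
    # Hypothetical event envelope; the real schema is not defined in this document.
    return {"detail_type": "order.state_changed",
            "detail": {"from": current, "to": target}}


event = advance("pending", "confirmed")
print(event)
```

Rejecting anything outside the table keeps skipped steps (for example pending straight to despatched) from reaching downstream consumers.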
CUSTOMER IDENTITY SERVICE
• Technology: Amazon Cognito user pool (managed)
• Supports: email/password, Google Sign-In, Apple Sign-In
• JWT access tokens (15-min TTL) + refresh tokens (30-day TTL)
• GDPR: right-to-erasure implemented via Cognito account delete + async PII scrub job
PAYMENT SERVICE
• Technology: Python (FastAPI)
• Never stores card data; delegates entirely to Stripe
• Handles: payment intent lifecycle, 3DS2 challenges, refunds, disputes
• Stripe webhook signature validation on all inbound events
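For illustration, signature validation works roughly as follows: Stripe sends a `Stripe-Signature` header containing a timestamp (`t`) and one or more `v1` signatures, each an HMAC-SHA256 of `"{timestamp}.{raw body}"` keyed with the endpoint's signing secret. The standard-library sketch below shows the idea; in practice Stripe's official library (`stripe.Webhook.construct_event` in Python) would normally do this. The function name, the 300-second tolerance default, and the test secret are assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            tolerance_s: int = 300,
                            now: Optional[float] = None) -> bool:
    """Return True if the header carries a valid, recent v1 signature."""
    parts = [item.strip().split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k == "t"), None)
    candidates = [v for k, v in parts if k == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance_s:
        return False  # too old or too far in the future: possible replay
    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return any(hmac.compare_digest(expected.encode(), c.encode()) for c in candidates)


# Demonstration with a made-up secret and event body.
demo_secret = "whsec_example"
demo_body = b'{"id": "evt_123"}'
demo_ts = 1_700_000_000
demo_sig = hmac.new(demo_secret.encode(), f"{demo_ts}.".encode() + demo_body,
                    hashlib.sha256).hexdigest()
demo_header = f"t={demo_ts},v1={demo_sig}"
print(verify_stripe_signature(demo_body, demo_header, demo_secret, now=demo_ts + 5))
```

The check must run against the raw request body, before any JSON parsing, because re-serialising the payload changes the bytes that were signed.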
INVENTORY SERVICE
• Technology: Go (Gin)
• Manages: warehouse stock, pop-up store allocation, reservation holds (TTL 15 min)
• Concurrency-safe stock decrement: uses PostgreSQL advisory locks to serialise writes per SKU and prevent overselling
• Publishes "stock.low" events when SKU drops below reorder threshold
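The reservation-hold behaviour can be sketched in memory as follows. This illustrates the rules only (15-minute hold expiry, a `stock.low` event when a SKU drops below its threshold); the class and method names are hypothetical, and the real service is described as Go backed by PostgreSQL.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

HOLD_TTL_S = 15 * 60  # reservation holds expire after 15 minutes


@dataclass
class SkuStock:
    on_hand: int
    reorder_threshold: int
    # hold_id -> (quantity, expiry time in seconds)
    holds: Dict[str, Tuple[int, float]] = field(default_factory=dict)

    def available(self, now: float) -> int:
        """Stock not covered by a live hold; expired holds are dropped first."""
        self.holds = {h: (q, exp) for h, (q, exp) in self.holds.items() if exp > now}
        return self.on_hand - sum(q for q, _ in self.holds.values())

    def reserve(self, hold_id: str, qty: int, now: float) -> bool:
        """Place a hold if enough unreserved stock exists."""
        if qty <= 0 or qty > self.available(now):
            return False
        self.holds[hold_id] = (qty, now + HOLD_TTL_S)
        return True

    def commit(self, hold_id: str, now: float) -> List[str]:
        """Turn a live hold into a decrement; return event names to publish."""
        self.available(now)  # purge expired holds
        qty, _ = self.holds.pop(hold_id)  # KeyError if the hold is missing or expired
        self.on_hand -= qty
        return ["stock.low"] if self.on_hand < self.reorder_threshold else []


demo = SkuStock(on_hand=10, reorder_threshold=5)
print(demo.reserve("basket-1", 6, now=0.0))   # hold placed
print(demo.reserve("basket-2", 6, now=60.0))  # refused: only 4 units unreserved
print(demo.commit("basket-1", now=120.0))     # stock falls to 4, below threshold
```

Because expired holds are purged before every availability check, an abandoned basket releases its stock without a separate clean-up job in this sketch.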
NOTIFICATION SERVICE
• Email: Amazon SES (transactional templates — order confirmation, despatch, refund)
• Push: Amazon SNS → APNs (iOS) / FCM (Android)
• Rate-limited to avoid SES reputation damage; exponential back-off on delivery failure
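Exponential back-off on delivery failure could look like the sketch below. The base delay, the 60-second cap, and the use of full jitter are assumptions; the document does not specify them.

```python
import random
from typing import List, Optional


def backoff_ceilings(attempts: int, base_s: float = 1.0,
                     cap_s: float = 60.0) -> List[float]:
    """Upper bound of the wait before each retry: base * 2^n, capped."""
    return [min(cap_s, base_s * 2 ** n) for n in range(attempts)]


def backoff_delays(attempts: int, base_s: float = 1.0, cap_s: float = 60.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Full-jitter delays: each wait is uniform between 0 and its ceiling."""
    rng = rng or random.Random()
    return [rng.uniform(0, ceiling)
            for ceiling in backoff_ceilings(attempts, base_s, cap_s)]


print(backoff_ceilings(8))
print(backoff_delays(8))
```

Jitter spreads retries out so that many failed sends do not all hit the provider again at the same instant.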
POS ADAPTER SERVICE
• Technology: Node.js (Express)
• Authenticates POS terminals using device certificates (mutual TLS)
• Translates Stripe Terminal events into Order Service commands
• Offline mode: queues transactions locally on iPad when connectivity lost; syncs on reconnect
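The offline mode above amounts to a durable first-in, first-out queue whose entries carry an idempotency key, so a replay after reconnect cannot create duplicate orders. A language-neutral sketch in Python, assuming an in-memory list in place of on-device storage; the class name and the `send` callback are hypothetical.

```python
import uuid
from typing import Callable, Dict, List


class OfflineQueue:
    """Holds POS transactions while offline and replays them in order."""

    def __init__(self) -> None:
        self.pending: List[Dict] = []

    def record(self, sale: Dict) -> str:
        """Queue a sale with a unique key the server can use to de-duplicate."""
        txn_id = str(uuid.uuid4())
        self.pending.append({"idempotency_key": txn_id, **sale})
        return txn_id

    def sync(self, send: Callable[[Dict], bool]) -> int:
        """Send queued sales oldest first; stop at the first failure."""
        sent = 0
        while self.pending:
            if not send(self.pending[0]):
                break  # still offline or server error: keep the rest queued
            self.pending.pop(0)
            sent += 1
        return sent


queue = OfflineQueue()
queue.record({"sku": "TEE-001", "qty": 1})
queue.record({"sku": "MUG-002", "qty": 2})
print(queue.sync(lambda txn: True), "transactions synced")
```

An entry is removed only after `send` reports success, so a connection drop mid-sync leaves the unsent sales in place for the next attempt.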
NETWORK SECURITY
• All public traffic routed through CloudFront + AWS WAF
• WAF rules: OWASP Top 10, rate limiting (500 req/min per IP), geo-blocking (optional)
• Services in private VPC subnets; no public IPs on ECS tasks
• VPC security groups: principle of least privilege between services
• Inter-service traffic over HTTPS with mutual TLS, using certificates issued by AWS Private CA and managed through AWS Certificate Manager
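AWS WAF enforces the per-IP limit as a managed rate-based rule, so no application code is needed for it. Purely to illustrate what a 500 requests per minute per IP limit means, here is a sliding-window sketch; the class name is hypothetical and the in-memory store stands in for whatever the enforcing layer uses.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allows at most `limit` requests per client within any `window_s` seconds."""

    def __init__(self, limit: int = 500, window_s: float = 60.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self.hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        window = self.hits[ip]
        # Drop timestamps that have fallen out of the window.
        while window and window[0] <= now - self.window_s:
            window.popleft()
        if len(window) >= self.limit:
            return False
        window.append(now)
        return True


limiter = SlidingWindowLimiter(limit=500, window_s=60.0)
accepted = sum(limiter.allow("203.0.113.7", now=i * 0.01) for i in range(600))
print(accepted, "of 600 requests accepted within the first six seconds")
```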
AUTHENTICATION & AUTHORISATION
• Customer auth: Cognito JWTs; API Gateway validates signature before forwarding
• Service-to-service auth: IAM roles + VPC-internal trust (no external exposure)
• Admin dashboard: separate Cognito user pool with MFA enforced
• POS terminal auth: device certificates (rotated quarterly)
PCI-DSS COMPLIANCE
• SAQ-A eligible: Stripe Elements renders card fields in Stripe-hosted iframes
• Bloomfield servers never see, store, or transmit raw card numbers
• Stripe's PCI-DSS Level 1 certification covers all card-data scope
• Annual Bloomfield self-assessment questionnaire (SAQ-A) required
UK GDPR
• All PII stored in eu-west-2 (London); no cross-border transfer without an approved safeguard (IDTA or the UK Addendum to the EU SCCs)
• Data retention: order data 7 years (tax obligation), marketing preferences 3 years
• Right to erasure: Cognito delete triggers async Lambda to scrub PII from all tables
• Privacy notice and cookie consent managed via OneTrust integration
• DPA signed with all sub-processors (AWS, Stripe, DPD)
SECRETS MANAGEMENT
• AWS Secrets Manager for all credentials (DB passwords, Stripe keys, API keys)
• Secrets rotated automatically every 30 days where possible
• No secrets in environment variables or source code
VULNERABILITY MANAGEMENT
• Container images scanned by Amazon ECR on push (critical findings block deploy)
• Dependabot PRs reviewed weekly
• Annual penetration test by CREST-accredited firm
RISK 1 — Database as single point of failure
Likelihood: Low | Impact: Critical
Mitigation: Aurora Multi-AZ with automatic failover (~30s RTO). Read replicas for reporting queries to protect primary. Daily snapshots retained 35 days. RDS Proxy to pool connections and absorb reconnect storms during failover.
RISK 2 — Black Friday traffic spike overwhelms services
Likelihood: Medium | Impact: High
Mitigation: ECS service auto-scaling on CPU and ALB request-count metrics. Catalogue and product pages served from CloudFront cache (origin shielded). Load test to 5× expected peak before Q4. Circuit breakers on downstream service calls (e.g. opossum for Node.js, pybreaker for Python, gobreaker for Go) to fail fast rather than cascade.
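The fail-fast behaviour of a circuit breaker can be sketched as below. This is a simplified illustration, not a replacement for a maintained library; the thresholds, class names, and the explicit `now` parameter (used to keep the example deterministic) are assumptions.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the downstream service."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3,
                 reset_timeout_s: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T], now: float) -> T:
        if self.opened_at is not None and now - self.opened_at < self.reset_timeout_s:
            raise CircuitOpenError("failing fast; downstream recently unhealthy")
        # Closed, or half-open after the timeout: let one call through.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now  # open (or re-open) the circuit
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result


breaker = CircuitBreaker(failure_threshold=2, reset_timeout_s=30.0)
print(breaker.call(lambda: "stock ok", now=0.0))
```

While the circuit is open, callers get an immediate error instead of waiting on timeouts, which is what stops one slow dependency from exhausting the caller's own capacity.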
RISK 3 — Stripe outage disrupts checkout
Likelihood: Low | Impact: High
Mitigation: Monitor the Stripe status page; surface a maintenance banner when the service is degraded. Queue failed payment intents for retry. Consider Stripe's Adaptive Acceptance to recover falsely declined card payments (it reduces declines but does not help during an outage). Business continuity plan: temporary "pay by BACS on invoice" option for B2B accounts.
RISK 4 — PII data breach
Likelihood: Low | Impact: Critical
Mitigation: Column-level encryption on PII fields (AWS KMS CMK). RDS audit logging. GuardDuty for anomaly detection. VPC Flow Logs retained 90 days. Incident response playbook rehearsed quarterly. ICO breach notification within 72 hours.
RISK 5 — Team of 8 becomes delivery bottleneck at scale
Likelihood: High | Impact: Medium
Mitigation: Invest in internal developer portal (Backstage) from Month 3. Standardise service scaffolding templates. Automate on-call runbooks (PagerDuty + Runbook Automation). Hire 2 additional engineers (1 backend, 1 SRE) before Black Friday.
RISK 6 — Offline POS terminals lose sales
Likelihood: Medium | Impact: Medium
Mitigation: React Native POS app stores transactions locally in encrypted SQLite when offline. Syncs automatically on reconnect. Cash-payment fallback procedure documented for staff.
SHORT TERM (0–3 months)
──────────────────────
1. Deploy MVP with Catalogue, Cart, Order, Identity, and Payment services only — defer Promotions and POS Adapter to Month 2.
2. Use Terraform Cloud for remote state; enforce PR review on all infra changes.
3. Implement structured logging (JSON → CloudWatch → Grafana) from Day 1. Debugging without logs is expensive.
4. Set up Stripe Radar rules and fraud scoring before accepting live payments.
5. Conduct pre-launch security review: AWS Trusted Advisor + manual config audit.
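For the structured-logging recommendation, a minimal standard-library sketch of JSON log output is shown below. The field names, the `context` convention, and the logger name are assumptions; a shared logging library across services would serve the same purpose.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional structured fields passed via extra={"context": {...}}.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("order-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.info("order confirmed",
            extra={"context": {"order_id": "ord_123", "state": "confirmed"}})
```

One JSON object per line lets CloudWatch Logs Insights and Grafana filter on fields such as `order_id` without regex parsing.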
MEDIUM TERM (3–9 months)
────────────────────────
1. Introduce a GraphQL federation layer (Apollo) on top of REST services to allow mobile clients to fetch composite views in a single request — reduces chattiness on 4G networks.
2. Route reporting and other read-heavy queries to Aurora read replicas, and keep search traffic on OpenSearch, to protect the primary DB write path.
3. Implement A/B testing infrastructure (Amazon CloudWatch Evidently) for checkout funnel optimisation.
4. Build customer data platform (CDP) pipeline: EventBridge → Kinesis Firehose → S3 (Parquet) → Athena for analytics.
LONG TERM (9–18 months)
───────────────────────
1. Multi-region failover assessment — if EU expansion planned, evaluate eu-central-1 (Frankfurt) active-passive setup.
2. Evaluate moving ECS Fargate tasks to AWS Graviton (Arm64) for an estimated ~20% compute cost reduction.
3. Consider platform team structure: internal shared services (auth, notifications, payments) as products consumed by feature squads.
4. Reassess PCI scope annually as payment volume grows — may need PCI-DSS Level 2 assessment at >1M transactions/year.
TECHNOLOGY BETS TO WATCH
─────────────────────────
• Amazon Bedrock for AI-powered product search & personalisation (already available within the chosen AWS platform)
• Stripe Issuing for employee expense cards and supplier payments
• React Server Components to further reduce the client-side JavaScript bundle, which matters most on mobile connections