Case Study

UPI Migration: DC to AWS Cloud

A technical deep dive into migrating the critical UPI payment infrastructure of PPBL (Paytm Payments Bank) from physical data centers to AWS Cloud with zero downtime.

Executive Summary

The entire UPI stack at PPBL (Paytm Payments Bank) had to be migrated from on-premises data centers to AWS Cloud. The initiative required detailed planning across networking, security, data synchronization, deployment orchestration, and compliance certifications (PCI DSS, ISO 27001, SOC 2) to avoid customer impact and maintain regulatory adherence.

The Challenge

Technical Constraints

  • Entire UPI stack running on legacy on-prem infrastructure
  • Complex networking requirements (VPC, subnets, peering, NAT/IGW, PrivateLink, and Direct Connect for NPCI and bank connectivity)
  • Large-scale data migration with consistency requirements
  • Multi-service dependencies and tight coupling

Business Requirements

  • Zero tolerance for downtime in UPI transaction processing
  • Strict regulatory compliance and audit requirements
  • Minimal customer impact during migration
  • Rollback capability at every migration phase
  • Replacement of obsolete and outdated tool versions with modern, supported alternatives
  • An architecture designed for maximum automation and minimal manual intervention

Migration Approach

Phase 1: Foundation

Weeks 1-3
  • AWS account setup with landing zone best practices
  • VPC design: Multi-AZ architecture with public/private subnets
  • Network connectivity: VPN, Direct Connect, peering configurations
  • Security groups, NACLs, and IAM roles/policies
  • EKS cluster provisioning with Istio service mesh
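
The multi-AZ public/private subnet layout above can be sketched with Python's standard `ipaddress` module. This is a minimal illustration, not the actual PPBL network plan: the VPC CIDR `10.0.0.0/16`, the `/20` subnet size, the `ap-south-1` availability zone names, and the `plan_subnets` helper are all assumptions made for the example.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, azs: list[str], new_prefix: int = 20) -> dict[str, dict[str, str]]:
    """Carve one public and one private subnet per AZ out of a VPC CIDR.

    Hypothetical helper: the CIDR, prefix length, and AZ names are
    illustrative assumptions, not values from the migration.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    # Equal-sized, non-overlapping blocks in ascending address order.
    blocks = list(vpc.subnets(new_prefix=new_prefix))
    needed = 2 * len(azs)
    if len(blocks) < needed:
        raise ValueError(f"{vpc_cidr} only yields {len(blocks)} /{new_prefix} blocks, need {needed}")

    plan = {}
    for i, az in enumerate(azs):
        plan[az] = {
            "public": str(blocks[2 * i]),
            "private": str(blocks[2 * i + 1]),
        }
    return plan


plan = plan_subnets("10.0.0.0/16", ["ap-south-1a", "ap-south-1b", "ap-south-1c"])
for az, tiers in plan.items():
    print(az, tiers)
```

Computing the layout up front makes overlaps with on-premises ranges or peered VPCs easier to catch before anything is provisioned.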

Phase 2: Data Migration

Weeks 4-6
  • Database replication setup (Aerospike and other data stores)
  • Kafka topic migration and consumer group synchronization
  • S3 bucket creation with lifecycle policies
  • Data validation and consistency checks
  • Performance baseline establishment
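
One way to approach the data validation and consistency checks listed above is to compare per-record digests between the source and target stores. The sketch below uses in-memory dictionaries as stand-ins for the real stores; the `record_digest` and `diff_stores` helpers and the sample transaction records are hypothetical.

```python
import hashlib
import json


def record_digest(record: dict) -> str:
    """Stable SHA-256 digest of a record, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_stores(source: dict[str, dict], target: dict[str, dict]) -> dict[str, list[str]]:
    """Report keys that are missing, unexpected, or different in the target."""
    missing = sorted(k for k in source if k not in target)
    extra = sorted(k for k in target if k not in source)
    mismatched = sorted(
        k for k in source
        if k in target and record_digest(source[k]) != record_digest(target[k])
    )
    return {"missing": missing, "extra": extra, "mismatched": mismatched}


# Illustrative data only: dicts stand in for the on-prem and AWS data stores.
on_prem = {
    "txn-1": {"amount": 100, "status": "SUCCESS"},
    "txn-2": {"amount": 250, "status": "PENDING"},
    "txn-3": {"amount": 75, "status": "SUCCESS"},
}
aws = {
    "txn-1": {"status": "SUCCESS", "amount": 100},
    "txn-2": {"amount": 250, "status": "SUCCESS"},
}
print(diff_stores(on_prem, aws))
```

In a real migration the comparison would typically run over sampled or partitioned key ranges rather than full in-memory copies.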

Phase 3: Application Deployment

Weeks 7-10
  • Containerization of all UPI microservices
  • Helm chart creation and GitOps setup with Argo CD
  • Blue/green deployment preparation
  • Service mesh configuration (Istio routing rules)
  • Observability stack deployment (Prometheus, Grafana)

Phase 4: Cutover & Validation

Weeks 11-12
  • Progressive traffic shift using Argo Rollouts canaries
  • Real-time monitoring of latencies, error rates, throughput
  • DR runbook execution and failover testing
  • Final cutover during low-traffic window
  • Post-migration optimization and tuning
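
The progressive traffic shift can be illustrated with a small gating loop: raise the canary weight step by step, check the observed metrics at each step, and send all traffic back to the stable version as soon as a threshold is breached. This is a conceptual sketch only, not the Argo Rollouts API. The weight steps, thresholds, `Metrics` type, and `observe` callback are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Metrics:
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th percentile latency in milliseconds


def run_canary(
    steps: list[int],
    observe: Callable[[int], Metrics],
    max_error_rate: float = 0.001,   # assumed threshold
    max_p95_ms: float = 300.0,       # assumed threshold
) -> tuple[int, list[tuple[int, bool]]]:
    """Walk through canary weights; return (final_weight, history).

    final_weight is 0 when a step fails its checks (traffic goes back
    to the stable version), otherwise the last weight in `steps`.
    """
    history = []
    for weight in steps:
        metrics = observe(weight)
        healthy = (
            metrics.error_rate <= max_error_rate
            and metrics.p95_latency_ms <= max_p95_ms
        )
        history.append((weight, healthy))
        if not healthy:
            return 0, history
    return steps[-1], history


# Simulated observation: error rate degrades once the canary takes 50% of traffic.
def degraded_at_half(weight: int) -> Metrics:
    return Metrics(error_rate=0.0002 if weight < 50 else 0.02, p95_latency_ms=180.0)


print(run_canary([5, 25, 50, 100], degraded_at_half))
```

Small early steps limit how many transactions are exposed if the new environment misbehaves.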

Risks & Mitigations

Data inconsistency during replication

Mitigation: Implemented dual-write pattern with reconciliation jobs; automated consistency checks pre-cutover
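
A dual-write pattern with reconciliation can be sketched as follows: every write goes to the existing store first (treated here as the source of truth), then to the new store; if the second write fails, the key is recorded and repaired later by a reconciliation pass. The `FlakyStore` and `DualWriter` classes are hypothetical stand-ins, and the choice of which store is authoritative is an assumption.

```python
class FlakyStore(dict):
    """Dict-backed stand-in for the target store that can be made unavailable."""

    available = True

    def put(self, key: str, value: dict) -> None:
        if not self.available:
            raise ConnectionError("target store unavailable")
        self[key] = value


class DualWriter:
    def __init__(self, primary: dict, secondary: FlakyStore):
        self.primary = primary          # assumed source of truth (on-prem)
        self.secondary = secondary      # new store (cloud)
        self.pending: set[str] = set()  # keys whose secondary write failed

    def write(self, key: str, value: dict) -> None:
        self.primary[key] = value
        try:
            self.secondary.put(key, value)
            self.pending.discard(key)
        except ConnectionError:
            self.pending.add(key)

    def reconcile(self) -> int:
        """Replay pending keys from the primary; return how many were repaired."""
        repaired = 0
        for key in sorted(self.pending):
            try:
                self.secondary.put(key, self.primary[key])
            except ConnectionError:
                continue  # still unavailable, keep the key pending
            self.pending.discard(key)
            repaired += 1
        return repaired


on_prem, cloud = {}, FlakyStore()
writer = DualWriter(on_prem, cloud)
writer.write("txn-1", {"amount": 100})
cloud.available = False
writer.write("txn-2", {"amount": 250})   # secondary write fails, key is queued
cloud.available = True
print(writer.reconcile(), dict(cloud) == on_prem)
```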

Network latency increase

Mitigation: Direct Connect for low-latency connectivity; performance benchmarking at each phase
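
Per-phase benchmarking can be reduced to a simple comparison: compute the P95 latency of a baseline sample and of a candidate sample, and flag a regression when the candidate exceeds the baseline by more than a tolerance. The sketch below uses the nearest-rank percentile definition; the 10% tolerance and the sample values are assumptions.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def is_regression(
    baseline: list[float],
    candidate: list[float],
    pct: float = 95.0,
    tolerance: float = 0.10,  # assumed: allow up to 10% slower than baseline
) -> bool:
    """True when the candidate percentile exceeds baseline by more than the tolerance."""
    return percentile(candidate, pct) > percentile(baseline, pct) * (1 + tolerance)


# Illustrative latencies in milliseconds.
on_prem_ms = [float(x) for x in range(1, 101)]
cloud_ms = [x * 1.2 for x in on_prem_ms]
print(percentile(on_prem_ms, 95), is_regression(on_prem_ms, cloud_ms))
```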

Service dependency failures

Mitigation: Circuit breakers, retries with exponential backoff, comprehensive monitoring dashboards
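
The two resilience patterns named above can be sketched in a few lines. The retry helper backs off exponentially with jitter between attempts; the circuit breaker stops calling a failing dependency after a number of consecutive failures and allows a trial call once a cool-down has passed. All names, thresholds, and delays here are illustrative assumptions. In a service-mesh setup such policies are often configured at the mesh layer rather than in application code.

```python
import random
import time


def retry_with_backoff(func, attempts=4, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call func, retrying on ConnectionError with exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time the circuit opened, None when closed

    def call(self, func):
        if self.opened_at is not None and self.clock() - self.opened_at < self.reset_after:
            raise CircuitOpenError("dependency calls are temporarily blocked")
        # Closed, or cool-down elapsed (half-open): let the call through.
        try:
            result = func()
        except ConnectionError:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.opened_at = None
        return result


# Usage: a dependency that fails twice, then recovers.
state = {"calls": 0}

def flaky_dependency():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("downstream unavailable")
    return "ok"

print(retry_with_backoff(flaky_dependency, sleep=lambda s: None))
```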

Security compliance gaps

Mitigation: IRSA for pod-level IAM, encryption at rest/in transit, audit logging, CIS benchmarks
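
For context on IRSA (IAM Roles for Service Accounts): an IAM role is bound to a single Kubernetes service account through a trust policy on the cluster's OIDC provider, so only pods running under that service account can assume the role. The sketch below builds such a trust policy document. The account ID, OIDC provider ID, namespace, and service account name are placeholders, and `irsa_trust_policy` is a hypothetical helper.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str, namespace: str, service_account: str) -> dict:
    """Build an IAM trust policy that lets one Kubernetes service account assume a role.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to exactly one service account.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Placeholder values for illustration only.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_provider="oidc.eks.ap-south-1.amazonaws.com/id/EXAMPLE",
    namespace="upi",
    service_account="upi-switch",
)
print(json.dumps(policy, indent=2))
```

Scoping the `sub` condition to one namespace and service account keeps credentials at pod level instead of sharing the node's instance role.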

Rollback complexity

Mitigation: Automated rollback scripts, traffic shift granularity with Argo Rollouts, tested rollback procedures

Outcomes & Impact

  • Zero-downtime migration: 100% of UPI services
  • P95 latency: 30% lower
  • Auto-scaling efficiency: +200%
  • Total migration time: 12 weeks

Key Achievements

  • Successfully migrated 100% of UPI services without any customer-facing incidents
  • Established GitOps practices with Argo CD for declarative infrastructure
  • Implemented progressive delivery with canary deployments and automated rollbacks
  • Achieved regulatory compliance with comprehensive audit trails and security controls
  • Reduced infrastructure costs through right-sizing and efficient resource utilization
  • Built comprehensive observability with SLO/SLA dashboards and alerting

Technologies & Tools

AWS EKS
Kubernetes
Istio
Argo CD
Argo Rollouts
KEDA
Terraform
Helm
Kafka
Aerospike
Prometheus
Grafana
AWS VPC
AWS ALB/NLB
AWS S3
AWS IAM
Direct Connect
GitOps
Canary Deployments
Blue/Green
Service Mesh