Backend Architectures Deep Dive

Deployment models — server, serverless, edge

Long-running process, Function-as-a-Service, edge runtime. Each imposes different constraints on your framework, your DB connections, and your mental model.


Long-running deployment: the default that many frameworks assume. Django, Rails, Spring, and Go binaries were designed for it. A process that stays up benefits from a warm JIT (where the runtime has one), persistent DB connection pools, in-memory caches, and background threads. Scaling: horizontal (more containers) + vertical (bigger instances). Orchestration: ECS, Kubernetes, Nomad, or platform-as-a-service (Heroku, Fly, Railway). Ideal for: WebSockets, SSE, heavy background jobs, large in-process caches, long-running batch work.

Cold start reality by runtime (Function-as-a-Service, rough figures): Go/Rust binaries ~30 ms; Node.js ~100 ms; Python ~200–500 ms (framework imports add more: roughly +500 ms for Django); JVM/Spring Boot 1–5 seconds without SnapStart, ~200 ms with SnapStart (AWS). Mitigation paths: (a) provisioned concurrency (pre-warmed Lambda instances, at a fixed cost); (b) smaller bundles and lazy imports; (c) edge runtimes (V8 isolates, sub-5 ms cold starts); (d) SnapStart for Java. Cold-start cost compounds in bursty workloads, because every scale-out spawns fresh instances that each pay it.
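Mitigation (b), lazy imports, can be sketched as a memoizing wrapper: the expensive setup runs on the first request that needs it rather than at boot, so cheap routes do not pay for it during a cold start. This is a minimal sketch, not a prescribed pattern; `buildReportEngine`, `getReportEngine`, and `handle` are hypothetical names used only for illustration.

```typescript
// Generic memoizing wrapper: runs `init` on the first call only, then reuses
// the value for as long as this instance stays warm.
function lazy<T>(init: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: init() };
    }
    return cached.value;
  };
}

// Hypothetical stand-in for an expensive dependency (a large SDK, a parser, ...).
let initCount = 0;
function buildReportEngine(): { render: (name: string) => string } {
  initCount += 1; // counts how often the expensive setup actually runs
  return { render: (name) => `report:${name}` };
}

const getReportEngine = lazy(buildReportEngine);

// Hypothetical handler: only non-health paths touch the heavy dependency.
function handle(path: string): string {
  if (path === "/health") {
    return "ok"; // fast path, no heavy initialization
  }
  return getReportEngine().render(path);
}
```

In a real bundle the same idea is usually expressed with a dynamic `import()` inside `init`, which also keeps the module out of the startup parse path.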

Serverless constraints you must design around: (a) connection pooling death: each new Lambda instance opens its own DB connections, so at scale you exhaust Postgres's max_connections (default 100). Solutions: RDS Proxy (AWS), Prisma Data Proxy, Neon's serverless driver, external PgBouncer. (b) no post-response work: on AWS the function is frozen once the response is sent; on Vercel and Cloudflare you can extend its lifetime with background patterns like ctx.waitUntil. (c) ephemeral filesystem: /tmp belongs to a single instance; files may survive between warm invocations on that instance, but nothing is guaranteed to persist, so treat it as scratch space. (d) package size limits: Lambda allows 250 MB unzipped for zip deployments; Vercel is similar. Heavy ML libraries (torch, pandas) may not fit.
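The arithmetic behind constraint (a) is worth making explicit. The sketch below estimates how many concurrent function instances a database can absorb before connections are refused. The `PoolBudget` shape, the reserved-connection figure, and the pool size of 5 are illustrative assumptions, not values from any provider.

```typescript
interface PoolBudget {
  maxConnections: number;      // Postgres max_connections
  reservedConnections: number; // assumed headroom for admin tools, migrations, other services
  poolSizePerInstance: number; // connections each function instance may open
}

// Largest number of concurrent instances the database can absorb
// before new connections start being refused.
function maxSafeInstances(budget: PoolBudget): number {
  const usable = budget.maxConnections - budget.reservedConnections;
  if (usable <= 0 || budget.poolSizePerInstance <= 0) {
    return 0;
  }
  return Math.floor(usable / budget.poolSizePerInstance);
}

// Connections demanded at a given concurrency, and whether that fits the budget.
function checkBurst(
  budget: PoolBudget,
  concurrentInstances: number,
): { demanded: number; fits: boolean } {
  const demanded = concurrentInstances * budget.poolSizePerInstance;
  const usable = budget.maxConnections - budget.reservedConnections;
  return { demanded, fits: demanded <= usable };
}

const budget: PoolBudget = {
  maxConnections: 100,
  reservedConnections: 10,
  poolSizePerInstance: 5,
};

console.log(maxSafeInstances(budget)); // 18
console.log(checkBurst(budget, 200));  // { demanded: 1000, fits: false }
```

With these assumed numbers a burst to 200 instances asks for ten times what the database allows, which is why the fix is usually an external pooler or proxy rather than a bigger per-instance pool.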

Edge runtime (V8 isolates) specifics: Cloudflare Workers, Vercel Edge, and Deno Deploy share the 'V8 isolate' execution model, which is NOT Node.js. Restrictions: (a) no fs, child_process, or net (network access goes through fetch); (b) no native addons; (c) limited memory (128 MB typical); (d) limited CPU time per request (10–50 ms of CPU; I/O wait doesn't count); (e) many Node-only libs (bcrypt native, sharp, Prisma classic) won't work. In Next.js, the route segment config selects the runtime per route: export const runtime = 'nodejs' or export const runtime = 'edge'.
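As a rough illustration of what edge-compatible code looks like, the handler below relies only on Web-standard globals (Request, Response, URL) and touches no Node built-ins. The function name `handleRequest` and the `/api/hello` URL are hypothetical. In a Next.js App Router route file, a handler like this would be exported as GET next to the runtime config line.

```typescript
// Uses only Web-standard globals that V8-isolate runtimes provide:
// no fs, child_process, net, or native addons.
function handleRequest(request: Request): Response {
  const url = new URL(request.url);

  if (request.method !== "GET") {
    return new Response("Method Not Allowed", {
      status: 405,
      headers: { Allow: "GET" },
    });
  }

  // Cheap, CPU-light work only: edge CPU budgets are small.
  const name = url.searchParams.get("name") ?? "world";
  return new Response(JSON.stringify({ greeting: `hello ${name}` }), {
    status: 200,
    headers: { "content-type": "application/json" },
  });
}

// In a Next.js route file this would sit alongside:
//   export const runtime = 'edge';
//   export const GET = handleRequest;
```

Because it depends only on Web APIs, the same function also runs unchanged on recent Node.js versions, which makes it easy to move a route between the two runtimes.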

Framework affinity by deployment: Django/Rails/Spring prefer long-running. Running them on Lambda (via Mangum, Zappa, or framework wrappers) 'works' but fights the model (huge cold starts, connection pool churn). Next.js, Hono, Elysia, and many modern JavaScript frameworks explicitly target serverless + edge: minimal boot, cold-start-friendly. Fastify is built on Node's HTTP server, so it fits long-running and (through an adapter) serverless deployments rather than edge isolates. Go and Rust are deployment-agnostic by default thanks to static binaries + fast startup.

Cost model differences: long-running = fixed monthly fee per instance, even when idle. Serverless = per-invocation fee + duration × memory (GB-seconds, metered per millisecond). Edge = per-invocation, often cheaper, billed on CPU time only (I/O wait is free). Crossover points: for bursty, low-baseline workloads, serverless + edge save money. For steady high throughput (thousands of req/sec sustained), long-running containers are cheaper per request.
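The crossover can be estimated with a few lines of arithmetic. The sketch below compares a per-invocation bill with a fixed monthly fee. All prices here are round placeholder numbers chosen to keep the arithmetic readable, not quotes from any provider, and `ServerlessPricing` is a hypothetical type.

```typescript
interface ServerlessPricing {
  perMillionRequests: number; // USD per 1,000,000 invocations (placeholder value below)
  perGbSecond: number;        // USD per GB-second of billed duration (placeholder value below)
}

// Monthly serverless bill: invocation fee plus duration x memory.
function serverlessMonthlyCost(
  requests: number,
  avgDurationMs: number,
  memoryGb: number,
  pricing: ServerlessPricing,
): number {
  const requestCost = (requests / 1000000) * pricing.perMillionRequests;
  const gbSeconds = requests * (avgDurationMs / 1000) * memoryGb;
  return requestCost + gbSeconds * pricing.perGbSecond;
}

// Monthly request volume at which serverless costs as much as a fixed-fee instance.
function breakEvenRequests(
  fixedMonthlyCost: number,
  avgDurationMs: number,
  memoryGb: number,
  pricing: ServerlessPricing,
): number {
  const costPerRequest =
    pricing.perMillionRequests / 1000000 +
    (avgDurationMs / 1000) * memoryGb * pricing.perGbSecond;
  return fixedMonthlyCost / costPerRequest;
}

// Placeholder prices, assumed for illustration only.
const pricing: ServerlessPricing = { perMillionRequests: 0.2, perGbSecond: 0.00002 };

// 10M requests/month at 100 ms and 0.5 GB: about 12 USD.
console.log(serverlessMonthlyCost(10000000, 100, 0.5, pricing));
// Against an assumed 60 USD/month container: about 50M requests/month.
console.log(breakEvenRequests(60, 100, 0.5, pricing));
```

Below the break-even volume the per-invocation model wins; above it, the fixed fee does, provided the instance can actually serve that load. Real comparisons also need to account for how many instances the steady load requires.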

The hybrid reality at scale: most real systems mix models. Long-running containers for background workers + WebSockets + stateful services; edge functions for auth + personalization + rewrites; serverless for bursty API endpoints + scheduled jobs. Don't force one model to fit everything.
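One way to picture the mix is a small per-workload decision function. The sketch below is a hypothetical heuristic that simply encodes the rules of thumb from this lesson; the `Workload` fields and the 50 ms CPU threshold (taken from the upper end of the edge CPU range mentioned earlier) are assumptions, and a real decision would weigh measurements, not booleans.

```typescript
type DeploymentModel = "long-running" | "serverless" | "edge";

interface Workload {
  needsPersistentConnections: boolean; // WebSockets, SSE
  needsBackgroundWork: boolean;        // workers, long batch jobs
  needsNodeApis: boolean;              // fs, native addons, Node-only libraries
  cpuMsPerRequest: number;             // typical CPU time per request
  traffic: "steady-high" | "bursty";
}

// Hypothetical heuristic: picks a model per workload, not per system.
function suggestModel(w: Workload): DeploymentModel {
  if (w.needsPersistentConnections || w.needsBackgroundWork) {
    return "long-running"; // stateful or long-lived work needs a process that stays up
  }
  if (w.traffic === "steady-high") {
    return "long-running"; // sustained throughput favors a fixed-fee instance
  }
  if (!w.needsNodeApis && w.cpuMsPerRequest <= 50) {
    return "edge"; // light, Web-API-only work fits isolate limits
  }
  return "serverless"; // bursty work that needs Node APIs or more CPU
}

const authCheck: Workload = {
  needsPersistentConnections: false,
  needsBackgroundWork: false,
  needsNodeApis: false,
  cpuMsPerRequest: 3,
  traffic: "bursty",
};
console.log(suggestModel(authCheck)); // "edge"
```

Applied across a system, a function like this tends to return different answers for different components, which is the point: the unit of choice is the workload.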

Grounded on https://vercel.com/docs/functions/runtimes
