
From Laravel Monolith to Cloud-Native: A Strangler-Fig Migration Playbook

March 10, 2026 · 5 min read

Most "modernize the monolith" articles assume you're starting with a healthy CI/CD pipeline, comprehensive tests, and a team that meets every Tuesday. In real life you're starting with a five-year-old PHP application that nobody's fully read in eighteen months, and the only person who remembers why a particular controller exists left in 2022.

Here's the playbook that's worked for me, first at the Supreme Court of Justice of Panama and now, in much the same shape, at Telered.

The shortest definition of "strangler fig"

You don't rewrite the application. You build new functionality next to it, in your target stack. The new code intercepts a small slice of traffic at the edge. You expand that slice over time. The old monolith eventually has nothing left to do.

The metaphor — Martin Fowler borrowed it from a tree that grows around its host until the host is consumed — is doing a lot of work. What it really means in code:

        ┌────────────────────────┐
        │  Reverse proxy / WAF   │  ◀── all traffic enters here
        └─────┬──────────┬───────┘
              │          │
              │  /new/*  │  /*       (everything else)
              ▼          ▼
       ┌──────────┐  ┌──────────────┐
       │  New     │  │  Legacy      │
       │  service │  │  Laravel     │
       │  (AWS)   │  │  monolith    │
       └──────────┘  └──────────────┘

The reverse proxy is the only piece that knows the new system exists. Today it routes 5% of traffic to the new code; in six months it routes 95%.
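
In practice that fork is a handful of routing rules. Here's a minimal sketch assuming nginx is the proxy; the upstream names, addresses, and the /new/ prefix are placeholders, and the same idea maps onto an ALB listener rule or an API Gateway route:

    # Placeholder upstreams: one for the new service, one for the monolith.
    upstream new_service    { server 10.0.2.10:3000; }
    upstream legacy_laravel { server 10.0.1.10:8080; }

    server {
        listen 80;
        server_name app.example.com;

        # The one slice the new code owns today.
        location /new/ {
            proxy_pass http://new_service;
        }

        # Everything else still hits the monolith.
        location / {
            proxy_pass http://legacy_laravel;
        }
    }

Growing the new system's share of traffic is then a matter of adding location blocks, not touching application code.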

What I keep, what I replace

The Laravel apps I've inherited are usually a mix of three things:

  1. Domain logic — the actual business rules. Keep this for as long as you can. It's been hardened by reality.
  2. Plumbing — auth, routing, ORM, caching. Laravel does this well; if it works, don't touch it.
  3. Coupling to the runtime — sessions in database, cron jobs in the monolith's scheduler, file uploads to local disk, blocking calls to slow APIs. This is what kills monoliths in the cloud.

The migration is mostly about the third bucket. The domain logic comes along for the ride.

Order of operations that actually works

Step 1: Externalize state. Sessions go to Redis or DynamoDB. Files go to S3. Cache goes to ElastiCache. Anything that lives on the local filesystem of the PHP server has to leave first, because the moment you put the app behind a load balancer with two instances, that state breaks.
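
In a stock Laravel app most of this step is configuration rather than code. A sketch of the relevant entries, as excerpts from the standard config files (exact env key names vary a little between Laravel versions, so treat these as illustrative):

    // config/session.php — sessions move off the local/database driver into Redis.
    'driver' => env('SESSION_DRIVER', 'redis'),

    // config/cache.php — cache backed by Redis (ElastiCache once you're on AWS).
    'default' => env('CACHE_DRIVER', 'redis'),

    // config/filesystems.php — uploads land on S3 instead of the local disk.
    'default' => env('FILESYSTEM_DISK', 's3'),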

Step 2: Containerize, don't lift-and-shift. Wrap the Laravel app in a Docker image. Run it on ECS or EKS. The first deploy should be functionally identical to what's running on-prem. If anything is different, you've already changed two variables at once and your debugging just got 4x harder.

Step 3: Put the proxy in. Now you have an entry point that can fork traffic. Even if 100% goes to the legacy app for now, the proxy is what unlocks every step after this.

Step 4: Extract a strangler. Pick the easiest, lowest-risk endpoint. Not the most-trafficked, not the most painful — the easiest. A health check, a public-facing read-only API, a webhook receiver. Build it in your target stack (Node.js + Lambda for me, lately), point the proxy at it for that one route.

Step 5: Watch metrics for a week. Not a day, a week. Things you missed will surface on Tuesday at 14:00 when a scheduled job runs.

Step 6: Repeat. One endpoint at a time. Some endpoints will resist — they have weird transactional dependencies on other parts of the monolith. Those are the last to migrate, and sometimes they never do.

Things I had to learn the hard way

Eloquent's relationships are sneaky. A model that looks innocent often has a with('orders.items.product.category') somewhere upstream. When you extract a service that returns "just an Order", you discover the legacy frontend was depending on six levels of nested data. Use API resources / DTOs at the boundary on day one.
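
What "API resources / DTOs at the boundary" looks like in Laravel terms is roughly this; the class and field names (OrderResource, ItemResource) are illustrative, not from any real app:

    <?php

    namespace App\Http\Resources;

    use Illuminate\Http\Resources\Json\JsonResource;

    // The contract is these fields and nothing else, regardless of what
    // relations the model happened to have eager-loaded upstream.
    class OrderResource extends JsonResource
    {
        public function toArray($request): array
        {
            return [
                'id'     => $this->id,
                'status' => $this->status,
                'total'  => $this->total,
                // Nested data is explicit and opt-in, not whatever with() dragged along.
                'items'  => ItemResource::collection($this->whenLoaded('items')),
            ];
        }
    }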

Laravel's queue isn't free to migrate. If you're moving from database queue to SQS, your code stays the same — but the job class needs to be deserializable on both sides during the transition. Don't rename a job class halfway through the cutover.
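
The switch itself is mostly configuration; the thing to hold constant is the job class's fully qualified name, because Laravel serializes it into the queued payload. A sketch of the SQS side (the queue URL, region, and the SendInvoice job name are placeholders):

    // config/queue.php — point the default connection at SQS. Old database-queue
    // workers and new SQS workers can run side by side during the transition.
    'default' => env('QUEUE_CONNECTION', 'sqs'),

    'connections' => [
        'sqs' => [
            'driver' => 'sqs',
            'key'    => env('AWS_ACCESS_KEY_ID'),
            'secret' => env('AWS_SECRET_ACCESS_KEY'),
            'prefix' => env('SQS_PREFIX', 'https://sqs.us-east-1.amazonaws.com/your-account-id'),
            'queue'  => env('SQS_QUEUE', 'default'),
            'region' => env('AWS_DEFAULT_REGION', 'us-east-1'),
        ],
    ],

    // The payload carries the job's class name (e.g. App\Jobs\SendInvoice).
    // Renaming or moving that class mid-cutover strands every job already queued.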

Soft deletes are a footgun. withTrashed() exists for a reason and the new service almost certainly doesn't know about it. Half my migration bugs were rows the legacy app silently filtered out and the new code happily returned.
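
A sketch of the mismatch; the Order model and customer_id column are illustrative, and deleted_at is Laravel's default soft-delete column:

    // Legacy side: the SoftDeletes trait adds a global scope, so this query
    // quietly becomes "... WHERE deleted_at IS NULL".
    $orders = Order::where('customer_id', $customerId)->get();

    // Seeing soft-deleted rows is an explicit opt-in.
    $allOrders = Order::withTrashed()->where('customer_id', $customerId)->get();

    // A new service querying the same table directly gets every row, trashed
    // or not, unless it remembers to add the deleted_at IS NULL filter itself.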

Don't dual-write to two databases. Pick one source of truth per table. If the new service needs that data, it reads from the legacy DB during the transition (yes, awkward; yes, temporary). Dual-writes are one of those things that look fine at 100 records and burn at 100,000.

The hardest call: when do you cut the legacy off?

There's a temptation to declare victory the moment 80% of traffic is on the new stack. Don't. The remaining 20% is usually:

  • The admin panel only three people use, but they're the people who matter
  • The batch jobs that run on the 1st and 15th of the month
  • A CSV export endpoint someone built in 2019 that turns out to power a regulatory report

Until those are migrated, the legacy app is still production. You're paying for it in maintenance, security patching, and team attention. The right time to retire it is when you can run a full month — including end-of-month — without anyone touching the legacy codebase. Not before.

What the strangler pattern buys you

The selling point isn't "you get to use Lambda." It's that at every point during the migration, the system works. You can stop. You can roll back. You can pause for a quarter while a regulatory audit eats your team's calendar. The big-bang rewrite doesn't give you any of that — and the big-bang rewrite is also the project that quietly fails 60% of the time.

For systems that real people depend on, "boring and reversible" beats "elegant and revolutionary" every single time.
