What is progressive extraction in PHP refactoring?

Progressive extraction keeps the monolith running while extracting clean bounded contexts module by module alongside ongoing feature delivery. Business logic moves out of controllers, use cases emerge, and the legacy code shrinks progressively. Continuous delivery starts from sprint 1.

How does bounded context decomposition help AI-assisted development?

A 500,000-line monolith overwhelms any AI context window, but a bounded context of 15,000–30,000 lines fits comfortably. The AI model can then reason about a coherent domain, suggest meaningful refactorings, and generate accurate tests because the code scope is focused and well-structured.

What does clean code mean in PHP refactoring?

Clean code in PHP follows the principles of Robert C. Martin (Uncle Bob): functions that do one thing, names that reveal intent, no side effects, and the boy scout rule – leave the code cleaner than you found it. Combined with SOLID principles, these rules guide legacy refactoring toward a codebase that is readable, testable and safe to change over time.

PHP Refactoring & Legacy Modernization

What is a bubble context strategy for legacy modernization?

The bubble context strategy isolates legacy code behind an anti-corruption layer (ACL). New development is built cleanly outside the bubble and communicates with legacy only through the ACL. The bubble gradually empties as responsibilities transfer out. It works best for high-risk legacy code and larger teams.

Most PHP applications don’t need a full rewrite – they need a clear modernization strategy. As an expert PHP and Symfony developer, I help product teams progressively refactor existing codebases: extracting business logic from controllers, introducing use cases, splitting into bounded contexts, and building solid foundations for future development. No big-bang rewrites. Steady, measurable improvement aligned with your roadmap.

Legacy PHP refactoring: before (coupled monolith) and after (bounded contexts)

My refactoring approach

Legacy PHP projects share common patterns: tight coupling, no tests, mixed responsibilities, business logic buried in controllers. The challenge is not identifying the problems – it’s making progress without breaking what works.

My typical sequence:

Functional smoke tests – protect critical paths before touching anything
Extract business logic – move it out of controllers into explicit use cases (application layer)
Identify bounded contexts – apply DDD principles to define coherent module boundaries
Introduce abstractions – ports and adapters to prevent implementation details leaking into the domain
Discover and refine – new contexts emerge as the model clarifies; previous contexts get leaner
Enrich the domain model – replace anemic models with rich entities and value objects, then add domain events, integration events and inter-context clients
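
Steps 2 and 4 of the sequence above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from a real project: the `UserRepository` port, the `InMemoryUserRepository` adapter and the array-based controller signature are hypothetical names chosen for the example (PHP 8.1+). In a Symfony application the controller would receive a Request object, and the adapter would typically be backed by Doctrine.

```php
<?php
declare(strict_types=1);

// Domain: a plain object with no framework dependency.
final class User
{
    public function __construct(public readonly string $email) {}
}

// Port: the application layer states what it needs, not how it is done.
interface UserRepository
{
    public function existsWithEmail(string $email): bool;
    public function save(User $user): void;
}

// Use case: the business rule that used to live in the controller.
final class RegisterUser
{
    public function __construct(private UserRepository $users) {}

    public function execute(string $email): User
    {
        $email = strtolower(trim($email));
        if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
            throw new InvalidArgumentException('Invalid email address.');
        }
        if ($this->users->existsWithEmail($email)) {
            throw new DomainException('Email already registered.');
        }
        $user = new User($email);
        $this->users->save($user);
        return $user;
    }
}

// Adapter (hypothetical): in-memory stand-in for a database-backed repository.
final class InMemoryUserRepository implements UserRepository
{
    /** @var array<string, User> */
    private array $byEmail = [];

    public function existsWithEmail(string $email): bool
    {
        return isset($this->byEmail[$email]);
    }

    public function save(User $user): void
    {
        $this->byEmail[$user->email] = $user;
    }
}

// Controller: reduced to translating input and output.
final class RegisterUserController
{
    public function __construct(private RegisterUser $registerUser) {}

    /**
     * @param array<string, mixed> $post
     * @return array{status: int, body: string}
     */
    public function __invoke(array $post): array
    {
        try {
            $user = $this->registerUser->execute((string) ($post['email'] ?? ''));
            return ['status' => 201, 'body' => $user->email];
        } catch (InvalidArgumentException | DomainException $e) {
            return ['status' => 422, 'body' => $e->getMessage()];
        }
    }
}
```

Because the use case depends only on the port, it can be tested with the in-memory adapter, without HTTP or a database. That is what makes the later steps safe to carry out.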

I also use AI-assisted tools (including Claude Code) to accelerate the codebase mapping phase and speed up pattern detection across large codebases.

This sequence describes Strategy A – progressive extraction. For codebases where the risk is too high to work directly inside the existing code, Strategy B takes a fundamentally different approach (see below).

Why splitting a complex problem into smaller ones changes everything

The core principle behind bounded contexts is simple: a problem that is too large to reason about clearly becomes manageable when split into smaller, well-defined problems. Each bounded context has a clear responsibility, its own vocabulary and its own team ownership. A developer joining the project can understand a single context without needing to hold the entire system in their head.

This is not just a software architecture argument – it has a direct impact on how you work with modern AI tools.

The AI context window effect

A legacy monolith of 500,000 lines of code is impossible to feed entirely to an AI assistant. Even with large context windows, the model loses coherence, misses domain subtleties and produces generic suggestions that don’t reflect the actual business rules.

Once the codebase is split into bounded contexts of 15,000–30,000 lines each, the situation changes completely. Each context fits comfortably within an AI’s working context. Tools like Claude Code can then:

Understand the domain model of a specific context with precision
Suggest refactorings that are consistent with the existing domain vocabulary
Generate tests that match the actual business rules, not generic stubs
Detect inconsistencies and naming drift within the context
Speed up the mapping of the next context to extract

In my experience, analysis that used to take weeks can be done in days. Bounded context decomposition is not just good architecture – it is what makes AI-assisted development effective on complex legacy codebases.
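
To check whether a context falls in the 15,000–30,000 line range, a rough measurement is enough. The sketch below assumes that each top-level directory under the source root is one bounded context. The function name `countPhpLinesPerContext` is hypothetical, and raw line counts are only a proxy for what an AI tool actually consumes.

```php
<?php
declare(strict_types=1);

/**
 * Counts lines of PHP per top-level directory of $srcDir.
 * Assumption: one top-level directory = one bounded context.
 *
 * @return array<string, int> context name => number of lines
 */
function countPhpLinesPerContext(string $srcDir): array
{
    $totals = [];
    foreach (new DirectoryIterator($srcDir) as $entry) {
        if ($entry->isDot() || !$entry->isDir()) {
            continue; // skip "." / ".." and loose files at the root
        }
        $lines = 0;
        $files = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($entry->getPathname(), FilesystemIterator::SKIP_DOTS)
        );
        foreach ($files as $file) {
            if ($file->isFile() && $file->getExtension() === 'php') {
                // file() returns one array element per line; empty or unreadable files count as 0
                $lines += count(file($file->getPathname()) ?: []);
            }
        }
        $totals[$entry->getFilename()] = $lines;
    }
    ksort($totals);
    return $totals;
}
```

Calling `countPhpLinesPerContext(__DIR__ . '/src')` returns an array such as context name => line count, which shows at a glance which contexts are still too large to hand to an AI assistant in one piece.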

Component-based packaging makes it concrete

Splitting into bounded contexts only pays off fully when the codebase structure reflects it. The classic approach, Symfony's default convention, organises code by technical layer:

Layer-based structure (Symfony default, but not ideal)

src/
  📁 Controller/   ← all domains mixed
  📁 Repository/   ← all domains mixed
  📁 Service/      ← all domains mixed

Pointing an AI tool at src/Repository/ gives it a mix of persistence logic from every domain. The context is incoherent – the model cannot reason about a specific business domain because the files don’t reflect domain boundaries.

Component-based packaging changes this:

Component-based structure

src/
  📂 Identity/
    📁 Infrastructure/Persistence/DoctrineUserRepository.php
    📁 Infrastructure/Http/RegisterUserController.php
    📁 Application/UseCase/RegisterUser.php
    📄 Domain/User.php
  📂 Orders/
    📁 Infrastructure/
    📁 Application/
    📁 Domain/

Now each bounded context is a self-contained package. Pointing Claude Code at src/Identity/ gives it a coherent, domain-scoped context – the full stack of one business capability, nothing else. This is what makes AI assistance precise rather than generic. And as a side effect, it also makes future microservice extraction straightforward: the boundary is already drawn in the file system.
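
A boundary drawn in the file system can also be checked automatically. The sketch below is a deliberately simple, regex-based illustration: it assumes a PSR-4 root namespace of `App` where the second namespace segment is the context name, and it only inspects class `use` imports (not `use function` imports or inline fully qualified names). The function name `findCrossContextImports` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Returns the `use` imports in $source that point into another bounded context.
 *
 * @param string   $ownContext  context the file belongs to, e.g. "Identity"
 * @param string[] $allContexts every context name under the root namespace
 * @return string[] names imported from other contexts
 */
function findCrossContextImports(
    string $source,
    string $ownContext,
    array $allContexts,
    string $rootNamespace = 'App'
): array {
    // Matches lines such as: use App\Orders\Domain\Order;
    preg_match_all('/^\s*use\s+([A-Za-z0-9_\\\\]+)/m', $source, $matches);

    $violations = [];
    foreach ($matches[1] as $import) {
        $parts = explode('\\', ltrim($import, '\\'));
        if (count($parts) < 2 || $parts[0] !== $rootNamespace) {
            continue; // vendor import, global class or trait usage
        }
        $context = $parts[1];
        if ($context !== $ownContext && in_array($context, $allContexts, true)) {
            $violations[] = $import;
        }
    }
    return $violations;
}
```

Run against a file in src/Identity/ that imports `App\Orders\Domain\Order`, the function reports that import. Whether such a dependency is forbidden outright or allowed through a published interface is a design decision for each project; the sketch simply makes the crossings visible.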

Which strategy fits your project?

Not every legacy codebase calls for the same approach. The right strategy depends on the risk level of the existing code, the team size, and how well the business domain is understood upfront.

Two refactoring strategies: progressive extraction vs bubble context

Strategy A – Progressive extraction
The monolith keeps running throughout. Module by module, clean bounded contexts are extracted alongside ongoing feature delivery – continuous delivery from sprint 1. Business logic moves out of controllers, use cases emerge, the domain model gets richer. The legacy code shrinks progressively. The tradeoff: you’re working inside messy code while improving it, which requires discipline.

Best for: projects under a continuous delivery constraint, small teams, and a need for immediately visible results.

Strategy B – Bubble context
The legacy code is isolated in a “bubble” protected by an anti-corruption layer (ACL). New development and refactored modules are built cleanly outside the bubble, communicating with legacy only through the ACL. The bubble gradually empties as responsibilities are transferred out. The tradeoff: there is an upfront investment phase to set up the ACL correctly – during this period delivery slows down. Once the ACL is in place, a steady continuous delivery rhythm resumes.

Best for: highly risky or untouchable legacy, larger team, business domain well understood upfront, long-term internal ownership.
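
As a rough sketch of what the ACL in Strategy B does, the translator below converts a row in a legacy format into a clean object used outside the bubble. Everything here is an assumption made for illustration: the `Customer` class, the `LegacyCustomerTranslator` name, the column names `CUST_NM`, `EMAIL_ADDR` and `STAT_CD`, and the convention that `'A'` means active (PHP 8.1+).

```php
<?php
declare(strict_types=1);

// Clean model used outside the bubble.
final class Customer
{
    public function __construct(
        public readonly string $name,
        public readonly string $email,
        public readonly bool $active,
    ) {}
}

// Anti-corruption layer: the only place that knows the legacy format.
final class LegacyCustomerTranslator
{
    /** @param array<string, mixed> $row a row as the legacy code returns it */
    public function toCustomer(array $row): Customer
    {
        // Fail loudly if the legacy side changes shape, instead of leaking bad data.
        foreach (['CUST_NM', 'EMAIL_ADDR', 'STAT_CD'] as $column) {
            if (!array_key_exists($column, $row)) {
                throw new UnexpectedValueException("Legacy row is missing {$column}.");
            }
        }

        return new Customer(
            name: trim((string) $row['CUST_NM']),
            email: strtolower(trim((string) $row['EMAIL_ADDR'])),
            // Assumed legacy convention: 'A' means active, anything else inactive.
            active: $row['STAT_CD'] === 'A',
        );
    }
}
```

New code only ever sees `Customer`. When a responsibility moves out of the bubble, the translator for it can be deleted without touching the code that depends on the clean model.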

There is no universal right answer. The first conversation I have with a team before touching any code is about which strategy fits their specific situation – and being honest about what each one costs.

Common refactoring missions

PHP 5/7 to PHP 8+ migration
Symfony 2/3/4 upgrade to the latest version (not just the latest LTS)
Introducing automated testing to untested codebases
Extracting services and APIs from monolithic applications
Replacing anemic domain models with proper entities, value objects and domain events
Decoupling business logic from framework and infrastructure layers

Key patterns

Applied selectively based on what the codebase needs, not as a dogmatic checklist.

Clean code – following the principles laid out by Robert C. Martin (Uncle Bob): functions that do one thing, names that reveal intent, no side effects, the boy scout rule (“leave the code cleaner than you found it”). Combined with SOLID, these principles turn a legacy codebase into code that is readable, testable and safe to change.
Hexagonal architecture – separate what the business does from how it connects to the outside world (database, HTTP, email…). The domain never knows about technical details.
DDD bounded contexts – split a large domain into smaller, autonomous modules, each with its own vocabulary. A “user” in billing doesn’t mean the same thing as a “user” in identity management.
CQRS – separate the code that changes state (commands) from the code that reads it (queries). Simplifies each side and makes the intent explicit.
Strangler fig – replace a legacy component progressively, without ever stopping the system. The new code grows around the old one until the old one can be removed.
Mikado method – map out the dependency tree of a refactoring, then resolve it leaf by leaf, starting with the changes that have no prerequisites, to minimise impact on the running system.
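
The ordering behind the Mikado method can be expressed as a small graph traversal: prerequisites first, the goal last. The sketch below is one way to compute that order with a depth-first, post-order walk. The function name `mikadoOrder` and the shape of the `$prerequisites` array are assumptions made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Orders the nodes of a Mikado graph so that every prerequisite
 * comes before the change that depends on it (leaves first, goal last).
 *
 * @param array<string, string[]> $prerequisites change => changes it requires first
 * @return string[]
 */
function mikadoOrder(string $goal, array $prerequisites): array
{
    $order = [];
    $state = []; // node => 'visiting' | 'done'

    $visit = function (string $node) use (&$visit, &$order, &$state, $prerequisites): void {
        if (($state[$node] ?? null) === 'done') {
            return; // already scheduled via another branch
        }
        if (($state[$node] ?? null) === 'visiting') {
            throw new LogicException("Circular prerequisite at {$node}.");
        }
        $state[$node] = 'visiting';
        foreach ($prerequisites[$node] ?? [] as $required) {
            $visit($required);
        }
        $state[$node] = 'done';
        $order[] = $node; // appended only after all its prerequisites
    };

    $visit($goal);
    return $order;
}
```

Given a goal "Extract RegisterUser use case" that requires "Introduce UserRepository port" and "Add smoke test for registration", where the port itself also requires the smoke test, the function returns the smoke test first, then the port, then the goal. Each step can be merged on its own while the system keeps running.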
