The Bot Market
Editorial
17 Apr 2026 · 14 min read

Your codebase wasn't built for AI. Here's how to bridge the gap.

By Andy Webb


Let's describe a scene I think many of you will recognise. Your CEO has just come back from a conference absolutely buzzing about AI. "We need to integrate AI into the product," they say. "Our competitors are doing it. Customers are asking for it. Let's ship something by Q3." That's almost literally the conversation I had many moons ago during my BBC tenure, when we were early adopters.

You nod. You smile. You go back to your desk and stare at a codebase that was started in 2017 (if you're lucky). There are 180,000 lines of code (or a million) across 1,200 files (or 10,000). The original architect left two years ago and took most of the context with them. The documentation is a README last updated in 2021 and a Confluence page describing a system architecture that hasn't existed since the pandemic. And there are three modules nobody dares touch, because the last person who refactored one of them caused a two-day outage and a very uncomfortable meeting with the CTO.

And you're supposed to "add AI" to this…

I'm not going to pretend this is easy. But I am going to tell you that the answer isn't dumping your entire codebase into a context window and hoping for the best. And the honest first step is almost certainly not the one your CEO has in mind.

The "just add AI" approach and why it falls flat on its face

The most common mistake I see is teams trying to bolt AI onto existing systems without preparing the ground underneath. They connect their codebase to Claude or Copilot and expect it to just understand everything. It doesn't. And the failures are subtle enough to be genuinely dangerous.

Large language models are remarkable at pattern recognition, but they don't understand your business logic. They don't know that the function called processOrder() has a side effect that quietly updates three separate database tables. They don't know that the configuration file on line 847 overrides the default behaviour in production but not in staging. They don't know that the variable named temp isn't temporary at all. It's been "temporary" since 2019 and now handles critical authentication logic that half the company depends on.

When you feed undocumented code into an AI tool, you get confident-sounding suggestions that are wrong in ways that are hard to spot. The model fills in the gaps with plausible assumptions. Those assumptions compile. They pass the happy-path tests. They break in production at 2am on a Saturday when you're three pints deep at your mate's birthday.

Step 1: Document the codebase you actually have

Before you use AI to change your code, use AI to understand your code. I know. It's the boring step. Nobody's going to stand up at a board meeting and say "this quarter we documented our codebase" to a round of applause. But it's the foundation that makes everything else work.

Don't try to do the whole codebase at once. Pick the scariest module first, the one where the dragons live. Then work through these prompts systematically.

Map the territory. Open Cursor, Claude Code, or whatever AI coding tool you use and start here:

Analyse the directory structure of /src/modules/[your-module]. For each file, give me: (1) A one-sentence description of its purpose. (2) Its imports and exports. (3) Any responsibilities that seem beyond what its filename suggests. Flag anything that looks like it's doing more than one job.

This gives you the map. You'll immediately spot files called utils.js that are secretly doing critical business logic, or a helpers.ts that's become a 2,000-line dumping ground for everything that didn't have an obvious home.
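If you want a quick sanity check on what the AI reports back, a few lines of script can build a rough version of the same map mechanically. Here's a minimal Python sketch; the regexes cover only the simplest ES-style import/export forms, and the module path is whatever you point it at, so treat it as a starting point rather than a parser:

```python
import re
from pathlib import Path

# Simplified patterns for ES-style modules; they won't catch every syntax variant.
IMPORT_RE = re.compile(r"^\s*import\s+.*?from\s+['\"](.+?)['\"]", re.MULTILINE)
EXPORT_RE = re.compile(
    r"^\s*export\s+(?:default\s+)?(?:function|class|const|let)\s+(\w+)", re.MULTILINE
)

def map_module(root: str) -> dict:
    """Return {file: {"imports": [...], "exports": [...]}} for every .js/.ts file."""
    result = {}
    for path in sorted(Path(root).rglob("*.[jt]s")):
        text = path.read_text(encoding="utf-8", errors="replace")
        result[str(path.relative_to(root))] = {
            "imports": IMPORT_RE.findall(text),
            "exports": EXPORT_RE.findall(text),
        }
    return result
```

Diffing this mechanical map against the AI's one-sentence descriptions is a cheap way to catch files the model skimmed or hallucinated.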

Hunt the side effects. This is where legacy code gets dangerous. Run this prompt on every module that touches data:

For each public function in /src/modules/[your-module], identify: (1) Every database write, API call, or external service interaction it triggers, including through nested function calls. (2) Any global state it modifies. (3) Any event emitters or message queue publishes it fires. (4) Any file system operations. Present this as a table with columns: Function Name, Direct Side Effects, Indirect Side Effects (via called functions), Risk Level (high/medium/low).

I tested this on The Bot Market codebase and it caught three routing assumptions I'd made that were wrong. On a monolith with years of accumulated decisions, the side effect map alone could save you from a production incident.
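You can also pre-screen function bodies yourself before asking the AI to do the careful pass. This is a crude keyword heuristic, not real analysis (the call patterns below are illustrative and you'd extend them for your own stack), but it's useful for deciding which functions deserve the full side-effect prompt first:

```python
import re

# Call patterns that usually indicate a side effect; extend for your own stack.
SIDE_EFFECT_PATTERNS = {
    "db write": re.compile(r"\b(?:insert|update|delete|save)\s*\("),
    "http call": re.compile(r"\b(?:fetch|axios)\s*[.(]"),
    "event emit": re.compile(r"\.emit\s*\("),
    "file io": re.compile(r"\b(?:writeFile|open)\s*\("),
}

def classify_risk(source: str) -> tuple[list[str], str]:
    """Return (detected side-effect kinds, coarse risk level) for one function body."""
    hits = [kind for kind, pat in SIDE_EFFECT_PATTERNS.items() if pat.search(source)]
    risk = "high" if len(hits) >= 2 else "medium" if hits else "low"
    return hits, risk
```

Anything this flags as "high" goes to the front of the queue for the proper AI-assisted table above.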

Generate the missing docs. Now the heavy lifting. For each significant file:

Write documentation for [filename] in the following format:

Overview

What this file does and why it exists, in 2-3 sentences.

Dependencies

What this file imports and why, grouped by internal modules and external packages.

Public Interface

For each exported function/class: its purpose, parameters with types and descriptions, return value, and one example of how it's called elsewhere in the codebase.

Side Effects and Gotchas

Anything a developer touching this file for the first time needs to know to avoid breaking something. Be specific about production vs staging behaviour differences.

Technical Debt

Anything that's clearly a workaround, a known issue, or a "temporary" solution that became permanent.

Here's the critical part: don't trust this output blindly. Have the engineer who knows this code best spend 20 minutes reviewing each file's documentation. They'll correct the AI's mistakes and, more importantly, they'll add the tribal knowledge that only exists in their head. "This function looks unused but it's called dynamically from the scheduler" or "Don't change the order of these operations or the payment gateway rejects the request."

The review process is the real product. The AI draft is just the catalyst that makes the review efficient instead of agonising.

Step 2: Map your dependencies before they map you

If you read part one of this series, you know vendor dependency is an existential risk. The same principle applies inside your codebase. Before you can integrate AI safely, you need to know what depends on what.

Build the internal dependency graph:

Analyse the import/require statements across the entire /src directory. Build me a dependency graph showing: (1) Which modules depend on which other modules. (2) Which modules are depended upon by the most other modules (highest fan-in). (3) Any circular dependencies. (4) Any modules that import from more than 5 other internal modules. Present the top 10 most-depended-upon modules in a ranked table.

The modules at the top of that list are your riskiest targets for change and your most important targets for documentation. If auth.ts is imported by 47 other files, you need to understand it completely before any AI touches it.
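Fan-in ranking and cycle detection are simple enough to verify by hand once you have the import data. Here's a small Python sketch over a plain adjacency dict (module names are hypothetical); the naive depth-first search will report each cycle once per entry point, which is fine for an audit:

```python
from collections import defaultdict

def fan_in(graph: dict[str, list[str]]) -> list[tuple[str, int]]:
    """Rank modules by how many other modules import them (highest fan-in first)."""
    counts = defaultdict(int)
    for deps in graph.values():
        for dep in deps:
            counts[dep] += 1
    return sorted(counts.items(), key=lambda kv: -kv[1])

def find_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Depth-first search for circular import chains."""
    cycles, stack = [], []
    def visit(node):
        if node in stack:
            # Record the chain from the first occurrence back to this node.
            cycles.append(stack[stack.index(node):] + [node])
            return
        stack.append(node)
        for dep in graph.get(node, []):
            visit(dep)
        stack.pop()
    for node in graph:
        visit(node)
    return cycles
```

Feed it the import data from the AI's analysis and compare the top-10 table; disagreements usually mean the model missed a dynamic import somewhere.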

Map the external integrations:

Find every external API call, database connection, third-party SDK usage, and webhook endpoint in the codebase. For each one, list: (1) The service name and endpoint URL pattern. (2) Which files make the call. (3) Whether there's error handling around it. (4) Whether there's a retry mechanism. (5) Whether the connection details are hardcoded or configurable. Present as a table sorted by number of call sites.

This map tells you exactly where your system touches the outside world. Every one of these integration points is a potential AI integration point, and every one without error handling is a production incident waiting to happen.
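A rough mechanical version of that table takes a regex and a counter. This sketch groups hardcoded URLs by hostname and counts call sites per file (the filenames and endpoints below are made up for illustration); it obviously won't find connections built from config values, which is exactly why the AI prompt above asks about hardcoded vs configurable:

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://[\w.-]+(?:/[\w./-]*)?")

def external_call_sites(sources: dict[str, str]) -> list[tuple[str, int, list[str]]]:
    """Given {filename: source}, return (hostname, call-site count, files), busiest first."""
    hits = Counter()
    files = {}
    for name, text in sources.items():
        for url in URL_RE.findall(text):
            host = url.split("/")[2]  # group call sites by hostname
            hits[host] += 1
            files.setdefault(host, set()).add(name)
    return sorted(
        ((host, n, sorted(files[host])) for host, n in hits.items()),
        key=lambda row: -row[1],
    )
```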

Step 3: Audit and label your data before it goes anywhere near a model


This is the step most teams skip entirely. It's also the one that determines whether your AI integration is an asset or a lawsuit.

Find out what you actually have:

Scan the database schema at [connection/path]. For each table, give me: (1) Table name and estimated row count. (2) Column names with data types. (3) Which columns likely contain personally identifiable information (names, emails, phone numbers, addresses, IP addresses, dates of birth). (4) Which columns contain free-text fields that could contain anything. (5) The last modified date of the most recent record. (6) Any columns that appear to store data in inconsistent formats. Present as a table with a PII Risk column rated high/medium/low/none.

Now you know where the sensitive data lives. Every column flagged as high PII risk needs a decision: can this data be sent to an external AI model? Under what conditions? With what consent? This isn't just good practice; it's a GDPR and EU AI Act requirement.
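Column-name heuristics are a useful first pass before the AI scan, and cheap to keep in CI. This is a deliberately simple sketch (the patterns are illustrative, and name-based matching alone is never sufficient; a real audit must sample the values too):

```python
import re

# Column-name heuristics only; a real audit must also sample the actual values.
PII_RULES = [
    ("high", re.compile(r"(ssn|passport|dob|date_of_birth|card_number)", re.I)),
    ("high", re.compile(r"(email|phone|address)", re.I)),
    ("medium", re.compile(r"(name|user_agent)", re.I)),
]

def pii_risk(column: str) -> str:
    """Rate a column's likely PII risk from its name alone."""
    for level, pattern in PII_RULES:
        if pattern.search(column):
            return level
    return "none"
```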

Assess data quality for AI readiness:

Analyse the data in [table/collection name] and report: (1) Percentage of null or empty values per column. (2) Percentage of values that don't match the expected format (e.g. phone numbers with letters, dates in mixed formats). (3) Any columns where the same information appears to be stored in different formats (e.g. "UK", "United Kingdom", "GB", "Great Britain"). (4) Any columns that appear to have been repurposed (the column name suggests one thing but the data contains something else). (5) A data quality score from 1-10 for each column with justification.

AI models trained or prompted with inconsistent data produce inconsistent results. If your address field contains "London", "london", "LONDON", "London, UK", and "Greater London Area", your AI features will inherit all of that chaos. Clean it now or debug it forever.
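The two numbers worth computing per column are the null rate and the count of distinct normalised values, because the second one surfaces exactly the "UK" versus "United Kingdom" problem. A minimal sketch (normalisation here is just trim-and-lowercase; a real pipeline would map synonyms too):

```python
def column_quality(values: list) -> dict:
    """Coarse quality report for one column: null rate and distinct value variants."""
    total = len(values)
    nulls = sum(1 for v in values if v in (None, "", "NULL"))
    # Trim and lowercase so 'UK' and 'uk' collapse; 'United Kingdom' stays distinct,
    # which is the point: it shows up in the variant count.
    variants = {str(v).strip().lower() for v in values if v not in (None, "", "NULL")}
    return {
        "null_pct": round(100 * nulls / total, 1) if total else 0.0,
        "distinct_normalised": len(variants),
    }
```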

Create your data classification:

Based on the PII analysis and data quality assessment, create a data classification document with three tiers:

GREEN: Data that can be sent to external AI APIs freely. No PII, good quality, non-sensitive.

AMBER: Data that can be sent to AI APIs only after anonymisation or with explicit consent. Contains PII or commercially sensitive information but has legitimate AI use cases.

RED: Data that must never leave your infrastructure. Highly sensitive PII, financial data, health data, or data subject to specific regulatory requirements.

For each table/collection, assign a tier and list what processing is needed before it can be used with AI at the GREEN level.

This classification document is your AI data policy in concrete terms. When an engineer asks "can I use this data with Claude?", they check the document. No ambiguity. No guesswork. No accidental GDPR violations.
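The document becomes genuinely unambiguous when it's also enforced in code. Here's one way to sketch the gate, with a hypothetical classification dict standing in for your real audit output (table names are invented for illustration):

```python
# Hypothetical classification document as data; yours comes from the audit above.
CLASSIFICATION = {
    "product_catalogue": "GREEN",
    "support_tickets": "AMBER",   # free text may contain PII; anonymise first
    "payment_methods": "RED",
}

def can_send_to_external_ai(table: str, anonymised: bool = False) -> bool:
    """Gate every outbound AI call through the classification document."""
    tier = CLASSIFICATION.get(table, "RED")  # unknown data is treated as RED
    if tier == "GREEN":
        return True
    if tier == "AMBER":
        return anonymised
    return False
```

Note the default: anything not in the document is treated as RED, so forgetting to classify a new table fails safe.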

Step 4: Find the seams for AI integration


Now you know your code, your dependencies, and your data. You can finally talk about where AI actually fits.

Don't rewrite the monolith. Use the strangler fig pattern instead. The name comes from tropical trees that grow around existing trees, gradually replacing them without ever felling the original. Nobody has ever completed a full rewrite on time and on budget. I will die on this hill.

Identify the integration points:

Based on the codebase analysis, identify the top 5 locations where an AI-powered feature could be added with the least disruption. For each candidate, assess: (1) Can it be implemented as a new service that calls the existing system through defined APIs, rather than modifying the monolith directly? (2) What data does it need, and is that data classified as GREEN in our data classification? (3) If the AI feature fails or produces bad output, does the existing system continue to work? (4) What's the smallest useful version we could ship to test the value? (5) Rate the integration difficulty from 1-10 with justification.

The best AI integration points share three characteristics: they can be built alongside the monolith rather than inside it, they use data that's already classified as safe, and they degrade gracefully when the AI component has a bad day.

Design the wrapper:

For the highest-priority integration point, design an API contract between the existing system and the new AI-powered service. Define: (1) The endpoint(s) the monolith will call. (2) The request payload with exact field names and types. (3) The response payload with exact field names and types. (4) The timeout and fallback behaviour (what happens if the AI service is slow or down). (5) The error response format. (6) Rate limiting requirements.

This API contract is your abstraction layer in miniature. The monolith calls a defined interface. Behind that interface, you can swap models, change providers, or rewrite the AI logic entirely without the monolith knowing or caring. That's the portability principle from article one applied at the feature level.
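To make the timeout-and-fallback part concrete, here's a minimal sketch of such a contract for a hypothetical "suggested reply" feature. The field names and canned fallback are illustrative, not a real API; the point is the shape: a typed request, a typed response with a `degraded` flag, and a wrapper that never lets an AI failure break the calling system:

```python
from dataclasses import dataclass

# Hypothetical contract for a 'suggested reply' feature; field names are illustrative.
@dataclass
class SuggestRequest:
    ticket_id: str
    customer_message: str   # must be GREEN-classified text

@dataclass
class SuggestResponse:
    suggestion: str
    model_used: str
    degraded: bool          # true when the fallback path produced the answer

def suggest_reply(req: SuggestRequest, call_model, timeout_s: float = 2.0) -> SuggestResponse:
    """Call the AI service; on any failure, degrade gracefully instead of breaking the caller."""
    try:
        text = call_model(req.customer_message, timeout=timeout_s)
        return SuggestResponse(suggestion=text, model_used="primary", degraded=False)
    except Exception:
        # The existing system keeps working: return a safe canned reply.
        return SuggestResponse(
            suggestion="Thanks for your message. An agent will reply shortly.",
            model_used="none",
            degraded=True,
        )
```

Because `call_model` is injected, swapping providers or models changes nothing on the monolith's side of the contract.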

Tools I've tested that actually help

I've been testing tools specifically designed for this problem. No affiliate links here, just what I've found.

Cursor is genuinely good at navigating large codebases. Its context awareness means it understands relationships between files in a way standalone ChatGPT can't match. Every prompt above works well in Cursor. I built this entire website with it.

Claude Code is excellent for the investigative prompts, the "scan the entire codebase and tell me about X" tasks. It handles the dependency mapping and side-effect hunting prompts particularly well because it can traverse large file trees in a single session.

Sourcegraph Cody is built specifically for enterprise codebase intelligence. If you have a monolith spanning multiple repositories, it's worth evaluating for cross-repository understanding.

None of these tools are magic. They all need human oversight. They all make mistakes on edge cases and business logic that only exists in someone's head. But used as investigation assistants rather than autonomous agents, they dramatically reduce the time from "nobody understands this code" to "we know enough to change it safely."

Making the case to the people who control the budget

Some of you are probably reading this and thinking: "This all makes sense, but my CEO wants a chatbot by Q3, not a documentation sprint." I've been there. Here's how I'd frame it.

"We can ship a chatbot in Q3. But without preparing the foundation first, it'll be fragile, it'll require constant firefighting, and it'll break in ways that damage customer trust. If we spend six weeks on documentation, data classification, and integration design first, the chatbot we ship in Q4 will be better, cheaper to maintain, and safer. More importantly, it'll be the first of many AI features built on a solid foundation, rather than a one-off demo that becomes technical debt the moment it goes live."

You're not saying no. You're not saying slow down. You're saying "let me do this properly so we can go faster for longer."

The four steps above (document, map dependencies, classify data, find the seams) can be done in parallel by different team members. A senior engineer on documentation. A data engineer on the data audit. An architect on the integration design. Six weeks, three people, part-time. The output is a codebase that's ready for AI rather than a codebase that's hoping AI will fix itself.

Sustainable AI isn't slower. It's faster over any timeframe longer than one quarter. And last time I checked, most companies plan to be around for more than three months.

This is part three of the Sustainable AI series. Next article: The AI bill nobody budgeted for.
