Quick reference: AI integration cost by project type
| Project type | Typical cost | Timeline |
|---|---|---|
| AI chatbot (added to existing product) | £8,000 – £25,000 | 4–8 weeks |
| Document processing pipeline | £15,000 – £40,000 | 6–10 weeks |
| RAG system (knowledge base Q&A) | £20,000 – £55,000 | 8–14 weeks |
| AI workflow automation | £25,000 – £70,000 | 10–18 weeks |
| Custom AI agent (multi-step reasoning) | £35,000 – £80,000 | 12–20 weeks |
| LLM fine-tuning on proprietary data | £30,000 – £90,000 | 10–20 weeks |
These figures are for end-to-end delivery: scoping, prompt engineering, integration, testing, monitoring setup, and deployment. They assume your existing infrastructure (API, data storage, authentication) is already in place. If you’re building from scratch, add the cost of the surrounding product.
AI integration vs. building an AI product
This distinction matters for budgeting. AI integration means adding AI capabilities to something that already exists — your CRM, your internal tool, your SaaS product. The surrounding infrastructure is already there; you’re layering intelligence on top. This is typically 30–60% cheaper than building a new AI product from scratch because you’re not paying for the scaffolding.
Building an AI product means the AI is the core of what you’re building and there’s nothing yet to integrate into. Combine the AI integration cost with the cost of the surrounding application.
Most UK businesses asking about AI integration fall into the first category. The question is usually “how do we add AI to what we already have”, not “how do we build a new AI company”.
What drives AI integration cost
1. Where your data lives and how clean it is
AI systems are only as good as the data they reason over. If your documents are in a structured format with consistent naming and accessible via an API, integration is straightforward. If your data is spread across SharePoint, email inboxes, PDFs with inconsistent formatting, and a 15-year-old on-premise SQL server, data preparation alone can add £10,000–£30,000 to the project.
The single biggest hidden cost in most AI integration projects is data cleanup and normalisation. Ask any AI consultancy what percentage of their projects get delayed by data quality issues and the honest answer is “most of them”.
2. Which LLM you use and how you use it
Prompting a cloud-hosted LLM (GPT-4o, Claude 3.5 Sonnet) via API is the fastest path. The integration itself is straightforward; the engineering effort goes into prompt design, output parsing, error handling, and evaluation. Build cost is lower, but ongoing API costs are real.
Self-hosting an open-source LLM (Llama 3, Mistral, Qwen) removes the per-token API cost but requires GPU infrastructure, model serving setup, and ongoing maintenance. The infrastructure overhead typically adds £15,000–£30,000 to the build cost. It only pays off at scale — roughly above 5 million tokens per day in production.
For most UK SMEs, using OpenAI or Anthropic APIs is cheaper overall even at significant volume, because the saved ongoing costs don’t amortise the higher build cost for years.
3. Whether you need RAG or just prompting
Many AI use cases can be handled with well-designed prompts against a base LLM, with no retrieval layer needed. A customer service chatbot that answers questions about your standard product range may only need a system prompt with your knowledge embedded directly.
But when your knowledge base is too large to fit in a context window, changes frequently, or needs to reference specific documents, you need a RAG (Retrieval-Augmented Generation) system. This means a vector database (Pinecone, Weaviate, Qdrant), an embedding pipeline, a retrieval layer, and logic to merge retrieved context with the user query. RAG adds £15,000–£30,000 to the project and is where a significant amount of AI integration effort actually goes.
4. Evaluation and safety
Production AI systems require evaluation infrastructure. You need to know when the LLM is hallucinating, when answers degrade after a model update, and when user queries fall outside your system’s competence. Building a proper evaluation framework — test sets, automated regression checks, human review workflows — adds 2–4 weeks to any serious project. Skipping it means your AI system will silently degrade without you knowing.
5. Integration depth
Adding a chat widget to a website is a different scope than integrating AI into your core business workflow. The further the AI goes into your production system — writing directly to your database, taking actions in your CRM, sending emails on behalf of users — the more engineering effort goes into error handling, retry logic, audit trails, and safety guardrails.
Cost by project type
AI chatbot (£8,000–£25,000)
A chatbot added to an existing product or website. This covers:
- System prompt design and tuning against your domain
- API integration with OpenAI, Anthropic, or a self-hosted model
- Conversation state management and memory handling
- Widget or API endpoint for your frontend to consume
- Basic evaluation against common user queries
The lower end (£8,000–£12,000) is a stateless chatbot answering questions from a fixed knowledge base embedded in the system prompt. The upper end (£18,000–£25,000) includes conversation history, user personalisation, escalation to human agents, and a content management interface for the knowledge base.
Document processing pipeline (£15,000–£40,000)
Automated extraction, classification, or summarisation of documents — invoices, contracts, reports, forms, medical records. This covers:
- Document ingestion (upload API, email inbox, SharePoint, S3)
- OCR and text extraction for scanned documents
- LLM extraction and structuring of target fields
- Validation logic and confidence scoring
- Output routing to downstream systems
- Human review queue for low-confidence extractions
Document processing is where the data quality problem hits hardest. Clean, digital PDFs with consistent structure are quick to process. Handwritten forms, mixed-format scans, or multilingual documents each add significant complexity.
RAG system — knowledge base Q&A (£20,000–£55,000)
A system that lets users query a large corpus of internal knowledge — documentation, past proposals, support tickets, policy documents, research. The key engineering work is in the retrieval layer:
- Embedding pipeline to convert documents to vector representations
- Vector database setup and indexing (Pinecone, Weaviate, Qdrant)
- Chunking strategy (how to split documents for best retrieval)
- Re-ranking and context compression
- Answer generation with source citation
- Freshness pipeline to keep the index current as content changes
AI workflow automation (£25,000–£70,000)
AI taking actions inside your business workflow — drafting emails, updating CRM records, routing support tickets, generating reports, summarising meeting notes. The engineering complexity comes from the actions themselves:
- Tool use / function calling setup so the LLM can invoke real actions
- Audit trail for every AI action (what was done, when, with what data)
- Human-in-the-loop controls for high-risk actions
- Retry logic, error handling, and fallback paths
- Integration with your existing workflow tooling (Zapier, Make, custom API)
Custom AI agent (£35,000–£80,000)
An agent that can plan, reason across multiple steps, use multiple tools, and handle complex tasks autonomously. This is the most expensive and the most engineering-intensive category. The cost comes from:
- Agent architecture (ReAct, plan-and-execute, multi-agent)
- Tool library design and safe execution sandboxing
- Memory system (short-term context + long-term retrieval)
- Comprehensive evaluation because failures are harder to detect
- Extensive safety testing before production deployment
LLM cost comparison for UK businesses
| Model | Build cost impact | Ongoing API cost | Best for |
|---|---|---|---|
| OpenAI GPT-4o | Lowest (best tooling) | £0.003/1K tokens | Most production use cases, complex reasoning |
| Anthropic Claude 3.5 Sonnet | Low (excellent docs) | £0.003/1K tokens | Long context, document analysis, careful responses |
| Anthropic Claude Haiku 4.5 | Low | £0.0003/1K tokens | High-volume simple tasks, classification |
| Open-source (Llama 3, Mistral) | +£15k–£30k infrastructure | Compute only | High volume (>5M tokens/day), sensitive data, offline |
The open-source trap: Self-hosting an LLM sounds cheaper because there’s no per-token cost. But the GPU infrastructure, model serving layer, monitoring, and ongoing maintenance add £15,000–£30,000 to your build cost. For most UK SMEs processing fewer than 5 million tokens per day, the API cost is lower than the infrastructure overhead. Do the maths for your specific usage volume before choosing open-source to save money.
Hidden costs in AI integration projects
LLM API costs after launch
Almost never included in integration quotes. OpenAI and Anthropic charge per token. Typical costs for a deployed business application:
- Small internal tool (50–200 queries/day): £50–£200/month
- Customer-facing chatbot (500–2,000 queries/day): £300–£1,500/month
- High-volume document processing (10,000+ docs/month): £1,000–£5,000/month
Vector database hosting
Pinecone, Weaviate, and Qdrant cloud plans: £50–£500/month depending on index size and query volume. Again, rarely in the initial quote.
Prompt maintenance
When OpenAI or Anthropic releases a new model version, your prompts may need tuning to maintain output quality. Budget 5–10% of your initial build cost annually for prompt maintenance. This is essentially unavoidable for production systems.
Evaluation infrastructure
If your supplier didn’t build an evaluation framework as part of the project, you won’t know when your AI is failing. Retrofitting evaluation after the fact typically costs £5,000–£15,000.
Red flags in AI integration quotes
Question the quote if it:
Start with prompt engineering, not fine-tuning
The most common expensive mistake in AI integration projects is jumping to fine-tuning before exhausting what’s achievable with prompt engineering alone. Fine-tuning costs £30,000–£90,000 to do properly. The same outcome is achievable with prompt engineering for £8,000–£20,000 in the vast majority of business use cases.
The right progression is: prompt engineering first → RAG if you need dynamic knowledge → fine-tuning only if you have a specific style, format, or domain that genuinely can’t be captured in a prompt. Most UK businesses never need to fine-tune a model.
Summary
AI integration in the UK costs £8,000–£25,000 for a chatbot, £15,000–£40,000 for a document processing pipeline, £20,000–£55,000 for a RAG system, and £25,000–£80,000 for workflow automation or a custom agent. The biggest drivers of cost are data quality, integration depth, whether you need a retrieval layer, and how much evaluation infrastructure is built in.
For most UK SMEs, the most cost-effective approach is to start with prompt engineering against a cloud-hosted LLM (GPT-4o or Claude), build a solid evaluation framework, and add RAG only when your knowledge base genuinely requires it. Fine-tuning is almost never the right starting point.