The situation: Manual document processing bottleneck
The company processes 500-600 shipments per day across multiple clients. Each shipment requires:
- Receipt verification (scanning and validating incoming documents)
- Data extraction (pulling key fields like shipment ID, weight, destination, costs)
- Invoice matching (ensuring invoice amount matches actual shipment cost)
- Exception handling (flagging mismatches or missing documents)
- Manual input into ERP system
Three staff members were spending 4 hours each per day (12 hours total) on these document processing tasks. This was purely manual: scanning documents, reading them, typing data into their ERP system (an old legacy system with no API).
The cost: £12 per hour × 12 hours × 220 working days per year = £31,680 annually in labour just on manual document processing.
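The same arithmetic can be written as a small function so the figure is easy to recompute with different staffing or rates. This is only a sketch: the function name and parameter names are illustrative, and the inputs are the figures quoted above.

```python
def annual_labour_cost(staff: int, hours_per_day: float,
                       hourly_rate: float, working_days: int) -> float:
    """Yearly cost of the time staff spend on manual document processing."""
    return staff * hours_per_day * hourly_rate * working_days


# Figures from this case study: 3 staff, 4 hours each per day, £12/hour, 220 days.
cost = annual_labour_cost(staff=3, hours_per_day=4, hourly_rate=12, working_days=220)
print(f"£{cost:,.0f}")  # £31,680
```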
Discovery: What problem are we actually solving?
Before diving into solutions, we asked:
- What's the actual bottleneck? (Data extraction from documents)
- How many document types? (4 main types: shipping receipts, invoices, delivery confirmations, customs documents)
- What's the error rate currently? (5-8% of documents have data entry errors, caught in monthly audit)
- How would this data be used? (ERP input, not decision-making)
The actual requirement was to accurately extract key fields from documents and format them for ERP input. Accuracy target: 95%+, high enough that the remaining errors are caught in spot checks rather than in post-processing.
Solution: Document AI pipeline
We built a simple but effective pipeline:
- Document ingestion: Staff scan documents into a folder (or email them to a dedicated address). The system automatically extracts the page images.
- Classification: AI classifies each document type (shipping receipt, invoice, etc.). Accuracy: 98%.
- Field extraction: For each document type, AI extracts key fields (shipment ID, weight, destination, cost, dates). Accuracy: 96%.
- Validation: Automated checks flag mismatches (e.g., invoice amount vs. shipment cost) and missing data.
- Human review: Staff review flagged documents (10-15% of volume). Clear, structured format makes review fast (30 seconds per document vs. 5 minutes previously).
- ERP input: Validated data is formatted and exported as CSV for import into the ERP system (the legacy system has no API, so a direct integration was not an option).
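The validation, review-routing and export steps can be sketched roughly as follows. This is a minimal illustration rather than the team's actual code: the field names, the `ExtractedDocument` class, the flag strings and the one-penny tolerance are all assumptions, and the classification and extraction steps are taken as already done.

```python
import csv
import io
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation

# Hypothetical schema: the source does not list the exact ERP columns.
REQUIRED_FIELDS = ("shipment_id", "weight_kg", "destination", "cost")


@dataclass
class ExtractedDocument:
    doc_type: str                               # e.g. "invoice"
    fields: dict                                # field name -> extracted text or None
    flags: list = field(default_factory=list)   # reasons for human review


def validate(doc: ExtractedDocument, shipment_cost: Decimal,
             tolerance: Decimal = Decimal("0.01")) -> ExtractedDocument:
    """Flag missing fields and invoice amounts that differ from the shipment cost."""
    for name in REQUIRED_FIELDS:
        if not doc.fields.get(name):
            doc.flags.append(f"missing:{name}")
    cost = doc.fields.get("cost")
    if cost:
        try:
            if abs(Decimal(cost) - shipment_cost) > tolerance:
                doc.flags.append("cost_mismatch")
        except InvalidOperation:
            doc.flags.append("unparseable:cost")
    return doc


def to_erp_csv(docs: list) -> str:
    """Export only unflagged documents; flagged ones wait for human review."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_FIELDS, extrasaction="ignore")
    writer.writeheader()
    for doc in docs:
        if not doc.flags:
            writer.writerow(doc.fields)
    return buffer.getvalue()


ok = validate(
    ExtractedDocument("invoice", {"shipment_id": "S-1001", "weight_kg": "12.5",
                                  "destination": "Leeds", "cost": "250.00"}),
    shipment_cost=Decimal("250.00"),
)
bad = validate(
    ExtractedDocument("invoice", {"shipment_id": "S-1002", "weight_kg": "8.0",
                                  "destination": None, "cost": "310.00"}),
    shipment_cost=Decimal("301.00"),
)
print(bad.flags)  # ['missing:destination', 'cost_mismatch']
export = to_erp_csv([ok, bad])
print(export)
```

Using `Decimal` rather than floats avoids rounding surprises when comparing money, and keeping the reasons in `flags` gives reviewers a structured explanation of why a document was held back.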
Implementation: 12 weeks, £8,500 investment
Weeks 1-2: Discovery & Training Data
- Reviewed 200 sample documents across all types
- Defined extraction rules and validation logic
- Identified edge cases (incomplete documents, poor scans)
Weeks 3-6: Model Training & Integration
- Trained document AI model on labeled examples
- Built the export integration for their ERP system (file-based, because the legacy system has no API)
- Created staff dashboard for review and approval
Weeks 7-9: Testing & Refinement
- Tested on 1,000 real documents
- Fine-tuned extraction rules based on real errors
- Accuracy improved from 92% to 96% through iteration
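The source does not say how accuracy was measured. One plausible approach, sketched below as an assumption, is field-level exact match against hand-labelled test documents, ignoring case and surrounding whitespace. The function name and sample data are illustrative.

```python
def field_accuracy(predictions: list[dict], labels: list[dict]) -> float:
    """Share of labelled fields whose extracted value matches the label.

    Values are compared as strings after trimming whitespace and lowercasing.
    A field that is missing from the prediction counts as wrong.
    """
    correct = total = 0
    for predicted, truth in zip(predictions, labels):
        for name, expected in truth.items():
            total += 1
            got = predicted.get(name)
            if got is not None and str(got).strip().lower() == str(expected).strip().lower():
                correct += 1
    return correct / total if total else 0.0


# Two hypothetical test documents with two labelled fields each.
labels = [
    {"shipment_id": "S-1001", "cost": "250.00"},
    {"shipment_id": "S-1002", "cost": "310.00"},
]
predictions = [
    {"shipment_id": "S-1001", "cost": "250.00"},
    {"shipment_id": "s-1002 ", "cost": "301.00"},  # digits transposed in cost
]
print(field_accuracy(predictions, labels))  # 0.75
```

Tracking this per field and per document type, rather than as one overall number, makes it easier to see which extraction rules need refining between iterations.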
Weeks 10-12: Rollout & Training
- Trained staff on new workflow
- Ran parallel with manual process for 2 weeks
- Full switchover in week 12
Total cost: £8,500 (internal development, not outsourced)
Results: Real numbers
Error rate: Down from 5-8% (caught in the monthly audit) to under 2%. The remaining errors are human review errors.
Staff impact: Instead of three staff members spending 12 hours a day between them on document processing, about 0.5 FTE is needed for review and exceptions. One staff member now handles all document processing (10 hours/week) alongside other responsibilities.
Cost breakdown (Year 1):
- AI service subscription (Claude API, document processing): £2,400/year
- Infrastructure & hosting: £1,200/year
- Maintenance & updates: 0.1 FTE = £4,000/year
- Total ongoing cost: £7,600/year
Net savings Year 1: £19,800 (time freed) - £7,600 (running costs) - £8,500 (implementation) = £3,700 net benefit. Break-even: roughly 5 months measured against gross savings alone, or about 8 months once running costs are included.
Year 2 onwards: £19,800/year savings with £7,600/year running costs = £12,200/year net benefit.
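The year-one and steady-state figures, plus the payback period, can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted in this section; the function and variable names are illustrative. The payback period depends on whether running costs are netted off against the savings, so both variants are shown.

```python
def payback_months(implementation_cost: float, annual_savings: float,
                   annual_running_cost: float = 0.0) -> float:
    """Months needed for monthly net savings to cover the up-front cost."""
    monthly_net = (annual_savings - annual_running_cost) / 12
    if monthly_net <= 0:
        raise ValueError("no payback: running costs meet or exceed savings")
    return implementation_cost / monthly_net


# Figures quoted in this case study (GBP).
SAVINGS = 19_800        # value of staff time freed per year
RUNNING = 7_600         # ongoing cost per year
IMPLEMENTATION = 8_500  # one-off build cost

year_one_net = SAVINGS - RUNNING - IMPLEMENTATION
steady_state_net = SAVINGS - RUNNING
gross_payback = payback_months(IMPLEMENTATION, SAVINGS)
net_payback = payback_months(IMPLEMENTATION, SAVINGS, RUNNING)

print(f"Year 1 net benefit: £{year_one_net:,}")          # £3,700
print(f"Year 2+ net benefit: £{steady_state_net:,}")     # £12,200
print(f"Payback on gross savings: {gross_payback:.1f} months")  # 5.2 months
print(f"Payback on net savings: {net_payback:.1f} months")      # 8.4 months
```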
What would they do differently?
Looking back, the team identified three learning points:
- Start with fewer document types: They tried all 4 types at once. Starting with the highest-volume type (shipping receipts, 60% of documents) would have shown ROI faster.
- Better scanning standards upfront: 5% of documents were difficult to read (poor scans, faded text). Spending 2 weeks teaching scanning best practices would have improved accuracy.
- Automate more validation: They kept some validation checks manual. Fully automating cross-checks between documents and ERP data would save another 2 hours/week.
Is document automation right for your business?
Document automation works well when:
- You process 500+ documents per month (high volume justifies the effort)
- Documents follow consistent formats (structured documents, not freeform PDFs)
- Data extraction is high-value (saving time or preventing errors costs real money)
- You have 2+ FTEs doing manual processing (ROI is there)
It's not the right fit if your documents are completely unstructured, highly variable in format, or if you process very low volumes.
Key insight: The best automation projects aren't "let's automate everything." They're "here's one painful, high-volume, rule-based process; let's solve that." This logistics company solved one problem deeply and recovered its investment within the first year.