
AI-Powered Invoice Processing

Logistics Company

The Problem

A mid-sized logistics company receives 80–120 supplier invoices daily in a variety of formats: PDFs from established suppliers with consistent layouts, scanned paper documents from smaller vendors, and the occasional Excel file from international partners.

Their two-person accounts team was spending 3 hours every morning manually entering invoice data into their ERP system. Beyond the time cost, the error rate on manual entry was running at ~4%. That sounds small until you consider that a 4% error rate on payment processing creates significant downstream reconciliation work.

They had looked at traditional OCR solutions but found them brittle: a supplier changing their invoice template required reconfiguration, and handwritten annotations were consistently missed.

Our Solution

We built a document intelligence pipeline using large language models (LLMs) as the core extraction engine, augmented with traditional document processing tools for preprocessing.

Architecture

Why LLMs Beat Traditional OCR Here

Traditional OCR extracts text but doesn't understand it. An LLM understands that "Qty: 24 × Pallet / SKU: WH-4491 / Unit price: €12.50" and "24 pallets of WH-4491 @ 12.50 EUR each" are the same thing expressed differently.

We used GPT-4o with structured output (JSON schema enforcement) to extract a standardized invoice object from each document. The prompt was carefully engineered to handle:

  • Multiple line items per invoice
  • Different date formats (DD/MM/YYYY, MM-DD-YYYY, written months)
  • Currency variations (€, EUR, HRK legacy amounts)
  • VAT line items vs. subtotals
  • Croatian/English/German language invoices

Validation Layer

Raw LLM extraction isn't enough for financial data. We built a validation layer that:

  1. Cross-references extracted vendor names against the vendor master (fuzzy matching with a similarity threshold)
  2. Validates product codes against the product catalog
  3. Checks VAT calculations mathematically
  4. Flags invoices where the extracted total doesn't match the sum of the line items

Invoices that pass all validations go directly to ERP draft creation. Invoices that fail any check go to a human review queue with the specific issue highlighted.
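The four checks and the routing rule can be sketched roughly as follows. This is a simplified illustration using only the Python standard library: the 0.85 similarity threshold, the one-cent tolerance, the single VAT rate per invoice, and all type and function names are assumptions rather than details from the project.

```python
from dataclasses import dataclass
from decimal import Decimal
from difflib import SequenceMatcher

VENDOR_MATCH_THRESHOLD = 0.85   # assumed similarity cut-off (0..1)
TOLERANCE = Decimal("0.01")     # assumed rounding tolerance, one cent
CENT = Decimal("0.01")


@dataclass
class Line:
    sku: str
    quantity: Decimal
    unit_price: Decimal


@dataclass
class Extracted:
    vendor_name: str
    lines: list
    subtotal: Decimal
    vat_rate: Decimal     # e.g. Decimal("0.25"); one rate per invoice assumed
    vat_amount: Decimal
    total: Decimal


def best_vendor_match(name, vendor_master):
    """Return (vendor, score) for the closest vendor-master entry."""
    if not vendor_master:
        return None, 0.0
    scored = [(v, SequenceMatcher(None, name.lower(), v.lower()).ratio())
              for v in vendor_master]
    return max(scored, key=lambda pair: pair[1])


def validate(inv, vendor_master, catalog):
    """Run all checks and return a list of human-readable issues."""
    issues = []

    # 1. Fuzzy vendor match against the vendor master.
    vendor, score = best_vendor_match(inv.vendor_name, vendor_master)
    if score < VENDOR_MATCH_THRESHOLD:
        issues.append(f"vendor '{inv.vendor_name}' not matched (best: {vendor})")

    # 2. Product codes must exist in the catalog.
    for line in inv.lines:
        if line.sku not in catalog:
            issues.append(f"unknown product code {line.sku}")

    # 3. VAT amount must equal subtotal * rate, rounded to cents.
    expected_vat = (inv.subtotal * inv.vat_rate).quantize(CENT)
    if abs(expected_vat - inv.vat_amount) > TOLERANCE:
        issues.append(f"VAT {inv.vat_amount} != expected {expected_vat}")

    # 4. Totals must reconcile: lines -> subtotal, subtotal + VAT -> total.
    line_sum = sum((l.quantity * l.unit_price for l in inv.lines), Decimal("0"))
    if abs(line_sum - inv.subtotal) > TOLERANCE:
        issues.append(f"subtotal {inv.subtotal} != line sum {line_sum}")
    if abs(inv.subtotal + inv.vat_amount - inv.total) > TOLERANCE:
        issues.append(f"total {inv.total} != subtotal + VAT")

    return issues


def route(issues):
    """Clean invoices become ERP drafts; anything else goes to review."""
    return "review_queue" if issues else "erp_draft"
```

Returning the full list of issues, rather than stopping at the first failure, is what lets the review queue show the specific problem for each flagged invoice.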

ERP Integration

The company uses SAP Business One. We integrated via the SAP Business One Service Layer REST API to create purchase invoice drafts. The accounts team now reviews and approves drafts rather than entering data, a fundamentally different (and much faster) workflow.
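As an illustration of the draft-creation step, the sketch below builds a request body and an unsent HTTP request. The `Drafts` path, the `DocObjectCode` value, the document field names, and the `B1SESSION` cookie reflect a general understanding of the Service Layer and should be checked against the metadata of the actual instance. The helper names, base URL, and sample values are hypothetical.

```python
import json
import urllib.request


def build_draft_payload(card_code, vendor_ref, doc_date, lines):
    """Build a purchase-invoice draft body from validated invoice data.

    `lines` is an iterable of (item_code, quantity, unit_price) tuples.
    Field names are assumptions to verify against the Service Layer metadata.
    """
    return {
        "DocObjectCode": "oPurchaseInvoices",  # draft of a purchase invoice
        "CardCode": card_code,                 # vendor code from the vendor master
        "NumAtCard": vendor_ref,               # the supplier's own invoice number
        "DocDate": doc_date,                   # ISO date string
        "DocumentLines": [
            {"ItemCode": item, "Quantity": qty, "UnitPrice": price}
            for item, qty, price in lines
        ],
    }


def build_draft_request(base_url, session_id, payload):
    """Prepare (but do not send) the POST that would create the draft."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/Drafts",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Cookie": f"B1SESSION={session_id}",  # session from a prior login
        },
        method="POST",
    )


payload = build_draft_payload(
    card_code="V10001",                 # hypothetical vendor code
    vendor_ref="INV-0001",
    doc_date="2024-03-05",
    lines=[("WH-4491", 24, 12.5)],
)
request = build_draft_request("https://sap.example.local:50000/b1s/v1", "SESSION_ID", payload)
```

Creating a draft rather than a posted document keeps a human approval step between extraction and the ledger.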

Results

After 60 days in production, processing 8,400+ invoices:

Metric | Before | After
Daily processing time | ~3 hours | ~15 minutes
Error rate on extraction | N/A (manual) | 2.7% (flagged for review)
Auto-approved rate | - | 74% (no human touch needed)
Cost per invoice processed | ~€0.85 labour | ~€0.09 (compute + API costs)

The 74% fully-automated rate means the accounts team handles roughly 20–30 invoices manually per day instead of 80–120. The time saving is real: they now close out invoice processing before 9:30 AM every day.
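The figures above can be cross-checked with a few lines of arithmetic. Since the volume and cost numbers are approximate, the results are rough estimates, and the function names are illustrative.

```python
from decimal import Decimal

AUTO_APPROVED_PCT = 74
DAILY_VOLUME = (80, 120)


def manual_per_day(volume, auto_pct=AUTO_APPROVED_PCT):
    """Invoices per day that still need a human, given the auto-approved share."""
    return volume * (100 - auto_pct) / 100


def labour_saving(invoices, before=Decimal("0.85"), after=Decimal("0.09")):
    """Approximate saving in EUR from the per-invoice cost difference."""
    return (before - after) * invoices


manual_range = tuple(manual_per_day(v) for v in DAILY_VOLUME)
print(manual_range)          # (20.8, 31.2)
print(labour_saving(8400))   # 6384.00
```

So 26% of 80–120 invoices is about 21–31 per day, in line with the rough figure quoted, and the per-invoice cost difference comes to roughly €6,400 over the first 8,400 invoices.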

What We'd Do Differently

The initial version relied entirely on GPT-4o, which was expensive for high-volume processing. In a subsequent optimization, we introduced a routing layer: simple, clean PDFs from known vendors use a cheaper, faster model (GPT-4o-mini); complex or unfamiliar documents escalate to GPT-4o. This reduced API costs by ~60% without affecting accuracy.
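A routing rule of this kind might look like the sketch below. The criteria for "simple" (a PDF with a text layer, at most three pages) and the `DocumentInfo` fields are assumptions chosen for illustration. Only the two model names come from the text.

```python
from dataclasses import dataclass
from typing import Optional

CHEAP_MODEL = "gpt-4o-mini"
STRONG_MODEL = "gpt-4o"


@dataclass(frozen=True)
class DocumentInfo:
    vendor_id: Optional[str]   # None when the sender could not be identified
    file_type: str             # "pdf", "scan" or "xlsx" (assumed labels)
    has_text_layer: bool       # False for scanned images
    page_count: int


def choose_model(doc, known_vendors, max_pages=3):
    """Send clean PDFs from known vendors to the cheap model, the rest to the strong one."""
    is_simple = (
        doc.file_type == "pdf"
        and doc.has_text_layer
        and doc.page_count <= max_pages
    )
    is_known = doc.vendor_id is not None and doc.vendor_id in known_vendors
    return CHEAP_MODEL if (is_simple and is_known) else STRONG_MODEL


known = {"V10001"}
clean_pdf = DocumentInfo("V10001", "pdf", True, 1)
scanned = DocumentInfo("V10001", "scan", False, 1)
unknown_sender = DocumentInfo(None, "pdf", True, 1)
```

Defaulting to the stronger model whenever either condition fails keeps the cheaper path limited to the documents that are least likely to be misread.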

We'd also invest more in the review UI from day one. The initial version was functional but basic. When we learned that the accounts team was spending most of their 15 minutes in the review queue, we built a proper review interface that shows the original document side-by-side with the extracted fields, which cut review time roughly in half.
