Architecture
System Overview
Receipt OCR App (Next.js on Cloudflare Workers)
├── Storage Brain SDK → Cloudflare R2 (file uploads)
├── Google Cloud Vision → OCR (text extraction)
├── OpenRouter → LLM (classification + chat)
└── Local DataBrainAdapter → Data Brain API → Cloudflare D1 (structured data)

┌─────────────────────────────────────────────────────────────────────┐
│           Receipt OCR App (Next.js / Cloudflare Workers)            │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌────────────┐  ┌──────────────┐  ┌──────────────┐  ┌───────────┐  │
│  │   Upload   │  │  Dashboard   │  │   AI Chat    │  │    API    │  │
│  │    Page    │  │  (4 views)   │  │   Sidebar    │  │  Routes   │  │
│  └─────┬──────┘  └──────┬───────┘  └──────┬───────┘  └─────┬─────┘  │
│        │                │                 │                │        │
│        ▼                ▼                 ▼                ▼        │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                        Component Layer                        │  │
│  │ ┌──────────────┐  ┌─────────────┐  ┌────────────────────────┐ │  │
│  │ │   Receipt    │  │ Data Table  │  │      ChatSidebar       │ │  │
│  │ │   Uploader   │  │  (4 views)  │  │ (SSE + tool approval)  │ │  │
│  │ └──────┬───────┘  └──────┬──────┘  └────────────┬───────────┘ │  │
│  └────────┼─────────────────┼──────────────────────┼─────────────┘  │
│           │                 │                      │                │
└───────────┼─────────────────┼──────────────────────┼────────────────┘
            │                 │                      │
     ┌──────┘         ┌───────┘           ┌──────────┘
     ▼                ▼                   ▼
┌───────────┐  ┌───────────────┐  ┌──────────────┐  ┌───────────────┐
│  Storage  │  │  Data Table   │  │  OpenRouter  │  │ Google Cloud  │
│ Brain SDK │  │ React + Local │  │  (LLM API)   │  │  Vision API   │
│           │  │   DataBrain   │  │              │  │     (OCR)     │
│           │  │    Adapter    │  │              │  │               │
└─────┬─────┘  └───────┬───────┘  └──────────────┘  └───────────────┘
      │                │
      ▼                ▼
┌───────────┐  ┌───────────────┐
│ Cloudflare│  │  Data Brain   │
│    R2     │  │   API → D1    │
└───────────┘  └───────────────┘
Page Structure
Upload Page (/)
- Drag-and-drop zone for images and PDFs
- Three-phase progress: uploading, OCR processing, saving
- File type validation (images + PDF)
- Automatic redirect to dashboard on success
Dashboard (/dashboard)
- Powered by @marlinjai/data-table-react
- 4 pre-configured views:
  - Table -- grouped by Category (default)
  - By Konto -- grouped by SKR03 account number
  - Board -- Kanban-style board grouped by Status
  - Calendar -- date-based view using the Date column
- Column management, multi-row selection, search, filter, pagination
- Inline cell editing
- AI Chat sidebar toggle
Data Flow
Upload Flow
User drops image or PDF
        │
        ▼
Upload to Storage Brain (R2)
        │
        ▼
POST /api/ocr with fileId
        │
        ▼
Fetch file from Storage Brain → send to Google Cloud Vision API
        │   (images: images:annotate, PDFs: files:annotate up to 5 pages)
        ▼
Return OcrResult { fullText, blocks (with bounding boxes), confidence }
        │
        ▼
extractReceiptFields(ocrResult) → heuristic field extraction
        │   → vendor, gross, net, taxRate, date, category, konto, name
        ▼
POST /api/classify-single (optional AI classification)
        │   → category, konto, zuordnung, confidence, reasoning
        ▼
Create row in receipts table via DataBrainAdapter
        │
        ▼
Redirect to dashboard
AI Chat Flow
User opens chat sidebar → types message
        │
        ▼
POST /api/chat (streaming SSE)
        │   system prompt includes: table schema, select options,
        │   SKR03 mappings, zuordnung options, user rules
        ▼
LLM responds with text + optional tool_calls
        │
        ▼
Frontend receives SSE events:
  ├── text_delta → rendered incrementally
  ├── tool_use → displayed as pending action
  │     │
  │     ├── read-only tool (get_rows, get_columns, get_select_options)
  │     │     → auto-executed, result sent back as tool_result
  │     │
  │     └── write tool (update_cells, bulk_update, create_row, delete_rows)
  │           → requires user approval ("Apply" / "Apply All")
  │           → on approval: executed client-side, result sent back
  │
  └── done → response complete
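The approval rule in the flow above can be sketched as a small routing function. The tool names are taken from the flow; the `ToolCall` type, the `Routing` values, and the `routeToolCall` helper are hypothetical names used for illustration and are not taken from the app's source.

```typescript
// Hypothetical sketch: decide whether a tool call from the LLM may run
// automatically or has to wait for the user's approval in the sidebar.
type ToolCall = { id: string; name: string; input: unknown };
type Routing = "auto_execute" | "needs_approval" | "unknown_tool";

// Tool names as listed in the AI Chat Flow.
const READ_ONLY_TOOLS = new Set(["get_rows", "get_columns", "get_select_options"]);
const WRITE_TOOLS = new Set(["update_cells", "bulk_update", "create_row", "delete_rows"]);

function routeToolCall(call: ToolCall): Routing {
  if (READ_ONLY_TOOLS.has(call.name)) return "auto_execute";
  if (WRITE_TOOLS.has(call.name)) return "needs_approval";
  // Assumption: anything the client does not recognize is never executed.
  return "unknown_tool";
}
```

Keeping the two sets explicit means a newly added tool is not executed until someone decides which group it belongs to.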
Field Extraction Engine
Located at src/lib/extract-receipt-fields.ts (~500 lines). Returns an ExtractionResult with name, vendor, gross, net, taxRate, date, category, and konto.
Amount Extraction (multi-pass)
- Net: looks for lines matching subtotal/netto/before-tax labels
- Tax: looks for tax/VAT/MwSt labels (excluding total lines)
- Gross (4 passes):
- High-priority: "grand total", "amount due", "balance due"
- Medium-priority: generic "total" (excluding subtotal/tax lines)
- EU keywords: "gesamt", "summe", "brutto"
- Fallback: largest amount found anywhere in the text
- Derivation: if 2 of 3 values are found, the third is calculated
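The derivation step can be illustrated with a minimal sketch. It assumes that "Tax" means the tax amount (not the rate), so that gross = net + tax; the `Amounts` type and the `deriveMissingAmount` helper are hypothetical names, and rounding to two decimals is also an assumption.

```typescript
// Hypothetical sketch of the "2 of 3" derivation, assuming gross = net + tax.
type Amounts = { net?: number; tax?: number; gross?: number };

// Round to cents to avoid floating-point noise such as 10.000000000000002.
const round2 = (n: number): number => Math.round(n * 100) / 100;

function deriveMissingAmount(a: Amounts): Amounts {
  const { net, tax, gross } = a;
  if (net !== undefined && tax !== undefined && gross === undefined) {
    return { net, tax, gross: round2(net + tax) };
  }
  if (gross !== undefined && tax !== undefined && net === undefined) {
    return { gross, tax, net: round2(gross - tax) };
  }
  if (gross !== undefined && net !== undefined && tax === undefined) {
    return { gross, net, tax: round2(gross - net) };
  }
  // Fewer than two values found, or all three already present: nothing to derive.
  return a;
}
```

For example, `deriveMissingAmount({ net: 100, tax: 19 })` fills in a gross of 119.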
Vendor Extraction
- Primary: spatial extraction from OCR bounding boxes (topmost non-noise block)
- Fallback: first non-noise line in the first 8 lines of OCR text
- Noise filter: skips pure numbers, addresses, metadata labels, generic headings
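A simplified sketch of this two-step lookup follows. The `OcrBlock` shape (text plus a `top` coordinate), the noise patterns, and the `pickVendor` helper are assumptions made for illustration; the real noise filter also covers addresses, which are omitted here.

```typescript
// Hypothetical block shape: `top` is the y-coordinate of the bounding box,
// where a smaller value means the block sits higher on the page.
type OcrBlock = { text: string; top: number };

// Illustrative noise patterns only; the actual filter is more extensive.
const NOISE_PATTERNS: RegExp[] = [
  /^[\d\s.,:\/-]+$/, // pure numbers, dates, amounts
  /^(receipt|invoice|rechnung|quittung)$/i, // generic headings
  /^(date|tel|phone|fax|ust-?id)\b/i, // metadata labels
];

function isNoise(text: string): boolean {
  const t = text.trim();
  return t.length < 2 || NOISE_PATTERNS.some((p) => p.test(t));
}

function pickVendor(blocks: OcrBlock[], fullText: string): string | undefined {
  // Primary: topmost block that is not noise.
  const spatial = [...blocks].sort((a, b) => a.top - b.top).find((b) => !isNoise(b.text));
  if (spatial) return spatial.text.trim();
  // Fallback: first non-noise line within the first 8 lines of the OCR text.
  return fullText
    .split("\n")
    .slice(0, 8)
    .map((line) => line.trim())
    .find((line) => !isNoise(line));
}
```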
Date Extraction
- Priority: labeled dates ("Date:", "Invoice Date:") first
- Formats: ISO (2024-01-15), EU dot (15.01.2024), US slash (01/15/2024), named months (Jan 15, 2024)
- Skips expiry/card dates
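The rules above could look roughly like the following sketch, which normalizes the four formats to ISO 8601. The function names, the label list, and the expiry keywords are assumptions; the sketch also treats every slash date as US order (MM/DD/YYYY), which the real extractor may handle differently.

```typescript
const MONTHS = ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"];

// Left-pad day/month values to two digits.
const pad = (v: string | number): string => {
  const s = String(v);
  return s.length < 2 ? "0" + s : s;
};

// Try each supported format in turn and return an ISO date string.
function parseDateToken(s: string): string | undefined {
  let m = s.match(/\b(\d{4})-(\d{2})-(\d{2})\b/); // ISO: YYYY-MM-DD
  if (m) return `${m[1]}-${m[2]}-${m[3]}`;
  m = s.match(/\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b/); // EU dot: DD.MM.YYYY
  if (m) return `${m[3]}-${pad(m[2])}-${pad(m[1])}`;
  m = s.match(/\b(\d{1,2})\/(\d{1,2})\/(\d{4})\b/); // US slash: MM/DD/YYYY
  if (m) return `${m[3]}-${pad(m[1])}-${pad(m[2])}`;
  m = s.match(/\b([A-Za-z]{3})[a-z]*\.? (\d{1,2}), (\d{4})\b/); // named month
  if (m) {
    const idx = MONTHS.indexOf(m[1].toLowerCase());
    if (idx >= 0) return `${m[3]}-${pad(idx + 1)}-${pad(m[2])}`;
  }
  return undefined;
}

function extractDate(text: string): string | undefined {
  // Drop lines that look like card or expiry information (assumed keywords).
  const lines = text.split("\n").filter((l) => !/\b(exp|expiry|expires|valid thru|card)\b/i.test(l));
  // Labeled lines are tried first, then every remaining line in order.
  const labeled = lines.filter((l) => /\b(invoice date|date|datum)\s*:/i.test(l));
  for (const line of [...labeled, ...lines]) {
    const d = parseDateToken(line);
    if (d) return d;
  }
  return undefined;
}
```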
Category Inference (3-pass)
- Vendor lookup: matches vendor name against ~80 known vendors (e.g., "starbucks" -> Bewirtung)
- Keyword scan: matches full OCR text against category keyword patterns
- Item patterns: checks for specific line-item hints (e.g., "cappuccino" -> Bewirtung)
- Falls back to "Sonstige Ausgaben" if no match
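The passes can be pictured as a chain of lookups that stops at the first hit. In this sketch only the "starbucks" and "cappuccino" entries come from the examples above; every other table entry, and the `inferCategory` and `firstMatch` names, are hypothetical placeholders.

```typescript
const FALLBACK_CATEGORY = "Sonstige Ausgaben";

// Pass 1 table: the real one holds ~80 vendors.
const VENDOR_CATEGORIES: Record<string, string> = {
  starbucks: "Bewirtung",
  lufthansa: "Reisekosten", // hypothetical entry
};

// Pass 2 table: category keywords matched against the full OCR text (hypothetical patterns).
const KEYWORD_PATTERNS: Array<[RegExp, string]> = [
  [/\b(hotel|flug|bahn)\b/i, "Reisekosten"],
  [/\b(lizenz|subscription)\b/i, "Software & Lizenzen"],
];

// Pass 3 table: line-item hints.
const ITEM_PATTERNS: Array<[RegExp, string]> = [[/\bcappuccino\b/i, "Bewirtung"]];

function firstMatch(text: string, patterns: Array<[RegExp, string]>): string | undefined {
  for (const [pattern, category] of patterns) {
    if (pattern.test(text)) return category;
  }
  return undefined;
}

function inferCategory(vendor: string | undefined, ocrText: string): string {
  const v = (vendor ?? "").toLowerCase();
  for (const known of Object.keys(VENDOR_CATEGORIES)) {
    if (v.includes(known)) return VENDOR_CATEGORIES[known];
  }
  return firstMatch(ocrText, KEYWORD_PATTERNS) ?? firstMatch(ocrText, ITEM_PATTERNS) ?? FALLBACK_CATEGORY;
}
```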
Local DataBrainAdapter
The app uses a local DataBrainAdapter at src/lib/data-brain-adapter.ts (not imported from the npm package @marlinjai/data-table-adapter-data-brain). It extends BaseDatabaseAdapter from @marlinjai/data-table-core and delegates all calls to a DataBrain SDK client.
// src/lib/data-brain-adapter.ts
import { BaseDatabaseAdapter } from '@marlinjai/data-table-core';
import { DataBrain } from '@marlinjai/data-brain-sdk';

export class DataBrainAdapter extends BaseDatabaseAdapter {
  private readonly client: DataBrain;

  constructor(config: { baseUrl: string; apiKey: string; workspaceId?: string }) {
    super();
    this.client = new DataBrain({ apiKey: config.apiKey, baseUrl: config.baseUrl });
  }

  // ... delegates ~30 methods to this.client
}

Receipt Table Schema
| Column | Type | Description |
|---|---|---|
| Name | text | Composite summary (primary column) |
| Vendor | text | Merchant name (OCR spatial extraction) |
| Gross | number | Total amount incl. tax |
| Net | number | Amount before tax |
| Tax Rate | number | Tax percentage (e.g. 19 for 19%) |
| Date | date | Receipt date (ISO 8601) |
| Category | select | SKR03 expense category (10 options) |
| Konto | text | SKR03 account number (e.g. "4650") |
| Status | select | Pending / Processed / Rejected |
| Confidence | number | OCR or AI classification confidence |
| Receipt Image | url | Link to original file in Storage Brain |
| OCR Text | text | Raw OCR text for AI context |
| Zuordnung | select | Dynamic column: Universitat / Geschaftlich / Privat |
SKR03 Category-to-Konto Mapping
| Category | Konto |
|---|---|
| Bewirtung | 4650 |
| Reisekosten | 4670 |
| Burobedarf | 4930 |
| Software & Lizenzen | 4806 |
| Telefon & Internet | 4920 |
| Hardware & IT | 4855 |
| Miete & Nebenkosten | 4210 |
| Versicherungen | 4360 |
| Fachliteratur | 4940 |
| Sonstige Ausgaben | 4900 |
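In code, the table above maps naturally onto a lookup object. The values are copied from the table; the `SKR03_KONTO` and `kontoForCategory` names are hypothetical, and falling back to the "Sonstige Ausgaben" account for unknown categories is an assumption rather than documented behavior.

```typescript
// Category-to-Konto mapping, copied from the table above.
const SKR03_KONTO: Record<string, string> = {
  "Bewirtung": "4650",
  "Reisekosten": "4670",
  "Burobedarf": "4930",
  "Software & Lizenzen": "4806",
  "Telefon & Internet": "4920",
  "Hardware & IT": "4855",
  "Miete & Nebenkosten": "4210",
  "Versicherungen": "4360",
  "Fachliteratur": "4940",
  "Sonstige Ausgaben": "4900",
};

// Assumption: unknown categories resolve to the "Sonstige Ausgaben" account.
function kontoForCategory(category: string): string {
  return SKR03_KONTO[category] ?? SKR03_KONTO["Sonstige Ausgaben"];
}
```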
API Routes
| Route | Method | Purpose |
|---|---|---|
| /api/ocr | POST | Fetches file from Storage Brain, sends to Google Cloud Vision, returns OcrResult |
| /api/classify-single | POST | LLM classification of a single receipt via OpenRouter |
| /api/chat | POST | Streaming AI chat with tool use (SSE) |
| /api/files/[fileId] | GET | Proxies file downloads from Storage Brain |
Environment Configuration
# Storage Brain (file uploads to R2)
NEXT_PUBLIC_STORAGE_BRAIN_API_KEY=sk_live_...
NEXT_PUBLIC_STORAGE_BRAIN_URL=https://storage-brain-api.marlin-pohl.workers.dev
# Data Brain (structured data persistence)
NEXT_PUBLIC_DATA_BRAIN_API_KEY=db_live_...
NEXT_PUBLIC_DATA_BRAIN_URL=https://data-brain.workers.dev
# Google Cloud Vision (OCR)
GOOGLE_CLOUD_VISION_API_KEY=AIza...
# OpenRouter (AI classification + chat)
OPENROUTER_API_KEY=sk-or-v1-...
# Optional: override AI models
# AI_MODEL=anthropic/claude-sonnet-4-20250514
# AI_CLASSIFY_MODEL=anthropic/claude-sonnet-4-20250514
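A startup check for these variables might look like the sketch below. The key names are taken from the listing above; the `missingEnvKeys` helper is hypothetical and not part of the app, and the two optional model overrides are deliberately left out of the required list.

```typescript
// Required variables, as listed in the configuration above.
const REQUIRED_ENV_KEYS = [
  "NEXT_PUBLIC_STORAGE_BRAIN_API_KEY",
  "NEXT_PUBLIC_STORAGE_BRAIN_URL",
  "NEXT_PUBLIC_DATA_BRAIN_API_KEY",
  "NEXT_PUBLIC_DATA_BRAIN_URL",
  "GOOGLE_CLOUD_VISION_API_KEY",
  "OPENROUTER_API_KEY",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvKeys(env: Env): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}
```

In a Node-compatible runtime this could be called with `process.env` during startup, failing fast when the returned list is not empty.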
Deployment
Target: Cloudflare Workers via @opennextjs/cloudflare
The app is deployed at receipts.lumitra.co. Server-side secrets (GOOGLE_CLOUD_VISION_API_KEY, OPENROUTER_API_KEY) are configured as Cloudflare Workers secrets. Client-side env vars use the NEXT_PUBLIC_ prefix.