This tutorial demonstrates how to build a production-grade, agentic invoice processing pipeline using Pydantic AI and LangGraph for structured data extraction and human-in-the-loop validation.

Introduction

Automated invoice processing has transitioned from rigid, template-based Optical Character Recognition (OCR) to flexible, LLM-driven Intelligent Document Processing (IDP). In 2026, the technical challenge is no longer just "reading" text, but ensuring the high-precision extraction of structured data, validating financial logic, and managing complex workflows where human intervention is required for edge cases.

Traditional OCR systems often fail when faced with non-standard layouts, handwritten notes, or multi-page tables. Modern agentic architectures solve this by treating document extraction as a multi-step reasoning process. By using Pydantic AI for type-safe extraction and LangGraph for stateful orchestration, developers can build systems that not only extract data but also verify its integrity against business rules and external databases.

This tutorial provides a comprehensive guide to implementing an autonomous invoice processing agent capable of handling unstructured PDF data, performing mathematical validation, and routing exceptions to a human reviewer.

Objectives

By the end of this tutorial, you will:

  1. Define robust data schemas for financial documents using Pydantic.
  2. Implement a multi-modal extraction agent using Pydantic AI.
  3. Construct a stateful workflow with LangGraph to handle validation and human-in-the-loop (HITL) requirements.
  4. Integrate automated verification logic to ensure sum-total consistency and tax compliance.
  5. Deploy the solution as a containerized microservice using FastAPI and Docker.

Prerequisites

To follow this tutorial, you require the following tools and environment:

  • Python 3.12+: The latest stable version is required for advanced type hinting and performance. Python Downloads
  • Pydantic AI: For structured LLM outputs and dependency injection. Pydantic AI Documentation
  • LangGraph: For managing agentic state and transitions. LangGraph Documentation
  • Docker: For environment isolation and deployment. Docker Documentation
  • API Keys: Access to a multi-modal LLM such as GPT-4o or Gemini 1.5 Pro.

Implementation / Step-by-Step

1. Defining the Structured Data Schema

The foundation of any IDP system is a strictly defined schema. In financial contexts, data integrity is non-negotiable. We use Pydantic to define the structure of an invoice, including line items, tax rates, and vendor information.

from datetime import date
from typing import List, Optional
from pydantic import BaseModel, Field, field_validator

class InvoiceLineItem(BaseModel):
    description: str = Field(description="Description of the product or service")
    quantity: float = Field(description="The number of units purchased")
    unit_price: float = Field(description="The price per single unit")
    total_amount: float = Field(description="The total amount for this line item (qty * price)")

    @field_validator("total_amount")
    @classmethod
    def validate_line_total(cls, v: float, info) -> float:
        values = info.data
        expected = values.get("quantity", 0) * values.get("unit_price", 0)
        if abs(v - expected) > 0.01:
            raise ValueError(f"Line item total {v} does not match quantity * unit_price {expected}")
        return v

class InvoiceSchema(BaseModel):
    invoice_number: str = Field(description="Unique identifier for the invoice")
    vendor_name: str = Field(description="The legal name of the entity issuing the invoice")
    vendor_vat_id: Optional[str] = Field(None, description="The VAT or Tax ID of the vendor")
    date_issued: date = Field(description="The date the invoice was created")
    line_items: List[InvoiceLineItem] = Field(description="List of individual items billed")
    subtotal: float = Field(description="Total amount before taxes")
    tax_amount: float = Field(description="Total tax amount applied")
    total_amount_due: float = Field(description="Final amount to be paid")

    @field_validator("total_amount_due")
    @classmethod
    def validate_invoice_total(cls, v: float, info) -> float:
        values = info.data
        subtotal = values.get("subtotal", 0)
        tax = values.get("tax_amount", 0)
        if abs(v - (subtotal + tax)) > 0.01:
            raise ValueError("Total amount due must equal subtotal + tax_amount")
        return v

The use of field_validator ensures that the LLM does not just extract text but also adheres to mathematical constraints. If the LLM extracts a total that does not match the sum of its parts, the validation layer will trigger a retry or an error.

2. Implementing the Extraction Agent with Pydantic AI

Pydantic AI provides a streamlined interface for forcing LLMs to return data that conforms to a specific Pydantic model. We will initialize an agent that utilizes a multi-modal model to process the invoice image or PDF.

import os
from pydantic_ai import Agent, RunContext
from dataclasses import dataclass

@dataclass
class AgentDeps:
    api_key: str
    organization_id: str

invoice_agent = Agent(
    'openai:gpt-4o',
    deps_type=AgentDeps,
    result_type=InvoiceSchema,
    system_prompt=(
        "You are a professional financial auditor. Extract all relevant data from the "
        "provided invoice document. Ensure all currency values are floats and dates "
        "are in ISO format. If a value is missing but can be calculated, calculate it."
    ),
)

async def extract_invoice_data(image_path: str, deps: AgentDeps):
    # In a real scenario, you would pass the image bytes or a URL
    result = await invoice_agent.run(
        f"Process this invoice: {image_path}",
        deps=deps
    )
    return result.data

The result_type parameter is critical. It leverages the LLM's tool-calling capabilities to ensure the output is a valid JSON object that matches our InvoiceSchema. This eliminates the need for manual regex parsing or post-extraction cleaning.

3. Orchestrating the Workflow with LangGraph

Invoice processing is rarely a single-step operation. It involves extraction, validation, and potentially a human-in-the-loop step if the confidence score is low or if business rules are violated. LangGraph allows us to define this as a state machine.

from typing import TypedDict, Annotated, Union
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    file_path: str
    raw_data: Optional[InvoiceSchema]
    validation_errors: List[str]
    status: str
    requires_human: bool

def extraction_node(state: AgentState):
    # Call the Pydantic AI agent defined in Step 2
    try:
        data = invoice_agent.run_sync(f"Process: {state['file_path']}")
        return {**state, "raw_data": data.data, "status": "extracted"}
    except Exception as e:
        return {**state, "validation_errors": [str(e)], "requires_human": True}

def validation_node(state: AgentState):
    data = state["raw_data"]
    errors = []
    
    # Custom business logic validation
    if data.total_amount_due > 10000:
        errors.append("High-value invoice requires manual approval")
    
    if errors:
        return {**state, "validation_errors": errors, "requires_human": True}
    return {**state, "status": "validated", "requires_human": False}

def human_review_node(state: AgentState):
    # This node pauses execution in a production environment
    # until a human provides input via a UI.
    print(f"PENDING HUMAN REVIEW: {state['validation_errors']}")
    return {**state, "status": "reviewed", "requires_human": False}

def router(state: AgentState):
    if state["requires_human"]:
        return "human_review"
    return END

# Construct the Graph
workflow = StateGraph(AgentState)

workflow.add_node("extract", extraction_node)
workflow.add_node("validate", validation_node)
workflow.add_node("human_review", human_review_node)

workflow.set_entry_point("extract")
workflow.add_edge("extract", "validate")
workflow.add_conditional_edges("validate", router)
workflow.add_edge("human_review", END)

app = workflow.compile()

4. Why Use LangGraph Over Simple Loops?

Using a state graph provides several architectural advantages for enterprise AI:

  • Persistence: LangGraph can save the state of a workflow to a database. If a human review takes three days, the process can be resumed exactly where it left off.
  • Error Recovery: If the extraction fails due to a transient API error, the graph can be configured to retry only the failed node rather than restarting the entire pipeline.
  • Traceability: Every transition between nodes is logged, providing a clear audit trail of how an invoice moved from "Received" to "Approved."

5. Performance and Accuracy Comparison

When evaluating AI invoice processing solutions, it is necessary to compare the agentic approach against traditional methods.

| Feature | Traditional OCR (Tesseract/Regex) | LLM-Only Extraction | Agentic IDP (Pydantic AI + LangGraph) | | : | : | : | : | | Accuracy (Unstructured) | Low (requires templates) | High | Very High (with self-correction) | | Logic Validation | Manual/Hardcoded | None (prone to hallucination) | Automated via Pydantic Validators | | Edge Case Handling | Fails | Inconsistent | Routed to Human-in-the-loop | | Setup Time | Weeks (per template) | Minutes | Hours (schema-based) | | Auditability | High | Low (Black box) | High (State history) |

6. Deployment Considerations

For a production environment in 2026, the system should be deployed as a microservice. FastAPI is the preferred framework due to its native support for Pydantic and asynchronous execution.

$ pip install fastapi uvicorn pydantic-ai langgraph
$ docker build -t invoice-agent-service .

The service should expose an endpoint that accepts a file upload, initiates the LangGraph workflow, and returns a job ID. Since invoice processing can be time-consuming, an asynchronous pattern using a message broker (like RabbitMQ or Redis) is recommended for scaling.

7. Handling GDPR and Data Privacy

In the European enterprise context, processing financial documents requires strict adherence to GDPR. When using LLMs for invoice processing:

  • PII Redaction: Before sending data to an external LLM API, sensitive information not required for the business logic (e.g., personal phone numbers) should be redacted.
  • Private Deployments: For highly sensitive data, utilize private cloud deployments of models (e.g., Azure OpenAI or local Llama 3 instances) to ensure data does not leave the sovereign boundary.
  • Data Retention: Ensure that the persistence layer of LangGraph complies with data minimization principles, deleting state information once the invoice is successfully exported to the ERP system.

Conclusion

Automating invoice processing with AI in 2026 requires more than