System Overview

Understand the bizSupply platform architecture, core capabilities, and how documents flow through the processing pipeline.

Last updated: 2026-04-01

What is bizSupply?

bizSupply is an intelligent document processing platform that helps organizations ingest, classify, extract, and aggregate information from unstructured documents at scale. Built on a plugin-based architecture with LLM-powered extraction, bizSupply turns raw documents into structured, actionable data accessible via a REST API.

Whether you are processing invoices, contracts, insurance claims, or any domain-specific document type, bizSupply provides the infrastructure for end-to-end document processing pipelines, so you do not have to build the ingestion, storage, and job-processing layers yourself.

Key Capabilities

Multi-tenant architecture — isolated workspaces per organization with role-based access control and tenant-scoped data.
Plugin system — extend the platform with custom Source, Classification, Extraction, and Aggregation plugins written in Python.
Ontology-based extraction — define exactly what data to extract using structured ontologies with typed fields and validation rules.
Source integration — ingest documents from email (IMAP), cloud storage, APIs, and file uploads through configurable source plugins.
LLM-powered processing — leverage large language models for intelligent classification, field extraction, and data normalization.
Secure credential management — store and manage OAuth2, IMAP, and API key credentials with encrypted-at-rest storage.
REST API — full programmatic access to every platform capability, from pipeline creation to document retrieval.
Real-time updates — monitor job progress and receive status updates through polling endpoints and webhook notifications.
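
To make the idea of an ontology with typed fields and validation rules concrete, here is a minimal sketch. It is not bizSupply's actual ontology format: the OntologyField class, the INVOICE_ONTOLOGY list, and the validate function are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class OntologyField:
    """Hypothetical field definition: a name, a Python type, and an optional rule."""
    name: str
    py_type: type
    required: bool = True
    validator: Optional[Callable[[Any], bool]] = None


# Assumed example ontology for an invoice; the field names are illustrative.
INVOICE_ONTOLOGY = [
    OntologyField("vendor_name", str),
    OntologyField("total_amount", float, validator=lambda v: v >= 0),
    OntologyField("invoice_date", str, required=False),
]


def validate(record: dict, ontology: list) -> list:
    """Return a list of problems found in record; an empty list means it is valid."""
    errors = []
    for field in ontology:
        if field.name not in record:
            if field.required:
                errors.append(f"{field.name}: missing required field")
            continue
        value = record[field.name]
        if not isinstance(value, field.py_type):
            errors.append(f"{field.name}: expected {field.py_type.__name__}")
        elif field.validator is not None and not field.validator(value):
            errors.append(f"{field.name}: failed validation rule")
    return errors
```

Calling validate({"vendor_name": "Acme", "total_amount": 12.5}, INVOICE_ONTOLOGY) returns an empty list, while a record with a missing vendor name or a negative total returns one message per problem.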

How It Works

bizSupply processes documents through a six-step pipeline. Plugins handle four of the steps (ingest, classify, extract, and aggregate), giving you control over the processing logic, while the platform itself handles storage and API access.

Ingest — A Source plugin fetches documents from an external system (email inbox, cloud storage, API endpoint, or direct upload). Documents are received as raw binary content with metadata.
Store — The platform stores each document with a unique identifier, preserving the original file alongside extracted metadata such as filename, MIME type, size, and ingestion timestamp.
Classify — A Classification plugin analyzes the document content and assigns one or more document types (e.g., "invoice", "purchase_order", "receipt"). Classification determines which extraction ontology to apply.
Extract — An Extraction plugin processes the document against the matched ontology, pulling out structured fields such as vendor name, total amount, line items, and dates. LLM prompts are constructed dynamically from the ontology definition.
Aggregate — An Aggregation plugin normalizes, enriches, and combines extracted data across documents. This step handles deduplication, cross-referencing, currency conversion, and any domain-specific business logic.
Access — Processed documents and their extracted data are available through the REST API. Query by document ID, filter by type, search across fields, or export in bulk.
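
The six steps above can be sketched as a small in-memory driver. This is only an illustration of the ordering and the data handed from step to step, not the platform's implementation: run_pipeline and the four toy_* functions are hypothetical stand-ins for real plugins, and real classification and extraction would call an LLM rather than match strings.

```python
import uuid
from datetime import datetime, timezone


def run_pipeline(source, classify, extract, aggregate):
    """Run the six steps in order over every document the source yields."""
    store = {}
    for filename, content in source():                  # 1. Ingest
        doc_id = str(uuid.uuid4())                      # 2. Store under a unique ID
        store[doc_id] = {
            "filename": filename,
            "size": len(content),
            "ingested_at": datetime.now(timezone.utc).isoformat(),
            "content": content,
        }
    for doc in store.values():
        doc["types"] = classify(doc["content"])                  # 3. Classify
        doc["fields"] = extract(doc["content"], doc["types"])    # 4. Extract
    summary = aggregate([d["fields"] for d in store.values()])   # 5. Aggregate
    return store, summary                               # 6. Access: caller reads results


# Toy stand-ins for the four plugin types (assumptions for this sketch).
def toy_source():
    yield "a.txt", b"invoice total=10"
    yield "b.txt", b"invoice total=5"


def toy_classify(content):
    return ["invoice"] if b"invoice" in content else ["unknown"]


def toy_extract(content, types):
    if "invoice" not in types:
        return {}
    return {"total": float(content.decode().split("total=")[1])}


def toy_aggregate(records):
    return {"total_spend": sum(r.get("total", 0.0) for r in records)}


store, summary = run_pipeline(toy_source, toy_classify, toy_extract, toy_aggregate)
```

Passing the plugins in as plain callables mirrors the idea that each stage can be swapped independently of the others.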

Plugin System

Plugins are the building blocks of every bizSupply pipeline. Each plugin type handles a specific stage of document processing.

Plugin Type	Purpose	Example Use Case
Source	Ingests documents from external systems into bizSupply.	Fetch invoices from an IMAP mailbox or pull files from a SharePoint folder.
Classification	Analyzes documents and assigns document types.	Determine whether a PDF is an invoice, a contract, or a receipt using LLM analysis.
Extraction	Pulls structured data from documents based on an ontology.	Extract vendor name, invoice number, line items, and total amount from an invoice.
Aggregation	Normalizes and combines extracted data across documents.	Deduplicate vendor records and aggregate monthly spend totals.

Architecture Principles

Multi-Tenancy

Every API request is scoped to a tenant. Documents, plugins, ontologies, pipelines, and credentials are fully isolated between tenants. A tenant represents an organization or workspace, and all resources are accessed through tenant-scoped API endpoints.
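
One common way to get this kind of isolation is to make the tenant ID part of every storage key, so a lookup made under one tenant can never resolve another tenant's resource. The sketch below illustrates that general technique; TenantScopedStore is a hypothetical class, not a description of how bizSupply stores data internally.

```python
class TenantScopedStore:
    """In-memory store where every key includes the tenant ID."""

    def __init__(self):
        self._data = {}  # (tenant_id, doc_id) -> document

    def put(self, tenant_id, doc_id, doc):
        self._data[(tenant_id, doc_id)] = doc

    def get(self, tenant_id, doc_id):
        # A document owned by another tenant looks exactly like a missing one.
        try:
            return self._data[(tenant_id, doc_id)]
        except KeyError:
            raise LookupError(f"document {doc_id!r} not found for this tenant") from None

    def list_ids(self, tenant_id):
        return sorted(d for (t, d) in self._data if t == tenant_id)


docs = TenantScopedStore()
docs.put("acme", "d1", {"type": "invoice"})
```

Here docs.get("acme", "d1") returns the document, while docs.get("globex", "d1") raises LookupError, even though the document ID exists.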

Plugin Isolation

Plugins execute in isolated environments with controlled access to platform services. Each plugin receives only the data it needs — a Source plugin gets credentials, a Classification plugin gets document content, and an Extraction plugin gets both content and ontology fields. Plugins cannot access other tenants' data or interfere with other plugin executions.
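
The "only the data it needs" rule can be pictured as an allow-list applied to the execution context before a plugin runs. The following is a sketch under assumptions: ALLOWED_INPUTS, scoped_inputs, and the context keys are hypothetical names, and Aggregation plugins are left out because the text above does not say what they receive.

```python
from types import MappingProxyType

# Inputs per plugin type, taken from the description above (key names are assumed).
ALLOWED_INPUTS = {
    "source": {"credentials"},
    "classification": {"content"},
    "extraction": {"content", "ontology_fields"},
}


def scoped_inputs(plugin_type, context):
    """Return a read-only view holding only the keys this plugin type may see."""
    allowed = ALLOWED_INPUTS[plugin_type]
    visible = {key: value for key, value in context.items() if key in allowed}
    return MappingProxyType(visible)


full_context = {
    "credentials": {"api_key": "secret"},
    "content": b"%PDF-...",
    "ontology_fields": ["vendor_name", "total_amount"],
}
classifier_view = scoped_inputs("classification", full_context)
```

In this sketch classifier_view contains only the content key, and because it is a MappingProxyType the plugin cannot add or replace entries in it.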

Scalability

bizSupply processes documents asynchronously through a job queue. Pipelines can handle thousands of documents in a single execution, with each document processed independently. The platform scales horizontally — adding more workers increases throughput without architectural changes. Jobs report progress in real time, and failed documents can be retried individually without reprocessing the entire batch.
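
The per-document independence and individual retry described above can be sketched with a thread pool from the Python standard library. This is an illustration of the pattern rather than the platform's job queue: run_job and the flaky processor are hypothetical, and a real deployment would use a durable queue with separate worker processes.

```python
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_job(doc_ids, process, workers=4):
    """Process each document independently; collect failures instead of aborting."""
    results, failed = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {doc_id: pool.submit(process, doc_id) for doc_id in doc_ids}
        for doc_id, future in futures.items():
            try:
                results[doc_id] = future.result()
            except Exception as exc:  # one bad document must not fail the batch
                failed[doc_id] = exc
    return results, failed


# Demo processor that fails once for "doc-2", then succeeds on retry.
attempts = Counter()
_lock = threading.Lock()


def flaky(doc_id):
    with _lock:
        attempts[doc_id] += 1
        attempt = attempts[doc_id]
    if doc_id == "doc-2" and attempt == 1:
        raise RuntimeError("transient failure")
    return f"processed {doc_id}"


results, failed = run_job(["doc-1", "doc-2", "doc-3"], flaky)
retried, still_failed = run_job(list(failed), flaky)  # retry only the failures
```

Only the failed document is resubmitted in the second call, so documents that already succeeded are not processed again.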

API-First Design

Every capability in bizSupply is accessible through the REST API. The web interface, CLI tools, and MCP server all use the same API endpoints, so anything you can do in the UI can also be automated. All API responses use a consistent JSON structure, and list endpoints support pagination, filtering, and sorting.
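
As an illustration of consuming a paginated API from a script, the sketch below walks pages until no further cursor is returned. The response shape is an assumption made for this example: the "items" and "next_cursor" keys are hypothetical and are not taken from the bizSupply API reference, and fake_fetch stands in for a real HTTP call.

```python
def iter_items(fetch_page):
    """Yield every item across all pages, following the cursor until it runs out."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:
            return


# Stand-in for an HTTP request; the two pages are keyed by the cursor that fetches them.
_PAGES = {
    None: {"items": ["doc-1", "doc-2"], "next_cursor": "p2"},
    "p2": {"items": ["doc-3"], "next_cursor": None},
}


def fake_fetch(cursor):
    return _PAGES[cursor]


all_docs = list(iter_items(fake_fetch))
```

Keeping the page-fetching function separate from the iteration logic means the same loop works whether pages come from a live endpoint or from a test fixture.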
