Plugin Interface Specification

Complete reference for plugin base classes, method signatures, service methods, and return type contracts.

Last updated: 2026-04-01

This document specifies the complete interface contract for bizSupply plugins. Every plugin must conform to these interfaces to be registered and executed by the platform.


Plugin Contract

All plugins share a common contract: they extend a base class, implement a single required method, and return a specific type. The platform calls your method during pipeline execution, passing the relevant data for that stage.

Plugins have access to platform services through inherited methods on the base class. These services provide LLM access, logging, configuration, and credential retrieval.


ClassificationPlugin

Analyzes a document and assigns a document type string.

classifier.py (python)
from bizsupply_sdk import ClassificationPlugin


class MyClassifier(ClassificationPlugin):
    name = "my-classifier"
    version = "1.0.0"

    def classify(self, document) -> str:
        """
        Classify a document and return its type.

        Args:
            document: Document object
                - document.content: str — extracted text (up to 100KB)
                - document.filename: str — original filename
                - document.mime_type: str — MIME type
                - document.metadata: dict — ingestion metadata

        Returns:
            str — document type (e.g., "invoice", "contract").
                  Must be non-empty; should match a taxonomy in a registered ontology for extraction to proceed.

        Raises:
            PluginError — for expected, handleable failures.
        """
        ...
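
As a rough illustration of the contract above, the following sketch fills in classify() with a simple keyword count. The Document and ClassificationPlugin definitions here are local stand-ins so the snippet runs without the SDK, and the keyword table and the "other" fallback type are hypothetical; a real plugin would import from bizsupply_sdk and return types that exist in a registered ontology.

```python
from dataclasses import dataclass, field


# Stand-ins so the sketch runs on its own (assumption: the real
# ClassificationPlugin and document object come from bizsupply_sdk).
@dataclass
class Document:
    content: str
    filename: str
    mime_type: str = "text/plain"
    metadata: dict = field(default_factory=dict)


class ClassificationPlugin:
    pass


class KeywordClassifier(ClassificationPlugin):
    name = "keyword-classifier"
    version = "1.0.0"

    # Hypothetical keyword table; the type names must exist in your ontology.
    KEYWORDS = {
        "invoice": ("invoice number", "amount due"),
        "contract": ("agreement", "hereinafter"),
    }

    def classify(self, document) -> str:
        text = document.content.lower()
        # Score each type by how many of its keywords occur in the text.
        scores = {
            doc_type: sum(keyword in text for keyword in keywords)
            for doc_type, keywords in self.KEYWORDS.items()
        }
        best_type, best_score = max(scores.items(), key=lambda item: item[1])
        # "other" is a hypothetical fallback type for unmatched documents.
        return best_type if best_score > 0 else "other"
```

A production classifier would more likely call self.prompt_llm(); the keyword version only shows the required shape: one method, one document in, one non-empty string out.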

ExtractionPlugin

Extracts structured fields from a document based on an ontology definition.

extractor.py (python)
from bizsupply_sdk import ExtractionPlugin


class MyExtractor(ExtractionPlugin):
    name = "my-extractor"
    version = "1.0.0"

    def extract(self, document, fields: list[dict]) -> dict:
        """
        Extract structured data from a document.

        Args:
            document: Document object (same as ClassificationPlugin)
            fields: list[dict] — field definitions from the ontology.
                Each field dict contains:
                - name: str — field name (e.g., "vendor_name")
                - type: str — field type (string, number, date, boolean, array)
                - description: str — human-readable description
                - required: bool — whether the field is mandatory

        Returns:
            dict — mapping of field names to extracted values.
                   Keys must match field names from the ontology.
                   Example: {"vendor_name": "Acme Corp", "total_amount": 1500.00}

        Raises:
            PluginError — for expected, handleable failures.
        """
        ...
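
One possible extract() body is sketched below: it asks the LLM for JSON, parses the reply, and drops any key the ontology does not define. ExtractionPlugin is a local stand-in whose prompt_llm() returns a canned reply so the snippet runs without the platform; the prompt wording is illustrative only.

```python
import json


class ExtractionPlugin:
    """Stand-in for the SDK base class (assumption), with a canned LLM reply."""

    def prompt_llm(self, prompt: str, **options) -> str:
        return '{"vendor_name": "Acme Corp", "total_amount": 1500.0, "notes": "n/a"}'


class JsonExtractor(ExtractionPlugin):
    name = "json-extractor"
    version = "1.0.0"

    def extract(self, document, fields: list[dict]) -> dict:
        field_names = {field["name"] for field in fields}
        raw = self.prompt_llm(
            f"Return a JSON object with the fields {sorted(field_names)}.\n\n{document}"
        )
        parsed = json.loads(raw)
        # Keep only keys that match field names from the ontology.
        return {key: value for key, value in parsed.items() if key in field_names}
```

Filtering the parsed reply keeps stray keys from the model (here, "notes") out of the returned mapping.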

SourcePlugin

Fetches documents from an external system and returns them for ingestion.

source.py (python)
from bizsupply_sdk import SourcePlugin, RawDocument


class MySource(SourcePlugin):
    name = "my-source"
    version = "1.0.0"

    def fetch_documents(self) -> list[RawDocument]:
        """
        Fetch documents from an external source.

        Access credentials via self.get_credential(credential_name).

        Returns:
            list[RawDocument] — list of documents to ingest.
                Each RawDocument contains:
                - content: bytes — raw file content
                - filename: str — suggested filename
                - mime_type: str — MIME type
                - metadata: dict — optional key-value metadata

        Raises:
            PluginError — for connection failures, auth errors, etc.
        """
        ...
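
The sketch below shows one way fetch_documents() could build its return value, using a local folder as the "external system". RawDocument and SourcePlugin are local stand-ins for the SDK types, and inbox_dir is a hypothetical parameter; a real source would typically call self.get_credential() and talk to a remote service instead.

```python
import mimetypes
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class RawDocument:  # stand-in for bizsupply_sdk.RawDocument (assumption)
    content: bytes
    filename: str
    mime_type: str
    metadata: dict = field(default_factory=dict)


class SourcePlugin:  # stand-in for the SDK base class (assumption)
    pass


class FolderSource(SourcePlugin):
    name = "folder-source"
    version = "1.0.0"
    inbox_dir = "inbox"  # hypothetical parameter

    def fetch_documents(self) -> list[RawDocument]:
        documents = []
        for path in sorted(Path(self.inbox_dir).iterdir()):
            if not path.is_file():
                continue
            content = path.read_bytes()
            if not content:
                continue  # skip empty files: content must be non-empty
            mime_type, _ = mimetypes.guess_type(path.name)
            documents.append(
                RawDocument(
                    content=content,
                    filename=path.name,
                    # Fall back to a generic type when the extension is unknown.
                    mime_type=mime_type or "application/octet-stream",
                    metadata={"source_path": str(path)},
                )
            )
        return documents
```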

AggregationPlugin

Processes a batch of extracted documents and returns aggregated results.

aggregator.py (python)
from bizsupply_sdk import AggregationPlugin


class MyAggregator(AggregationPlugin):
    name = "my-aggregator"
    version = "1.0.0"

    def aggregate(self, documents: list) -> dict:
        """
        Aggregate data across multiple processed documents.

        Args:
            documents: list — documents with extracted fields.
                Each document has:
                - document.id: str
                - document.document_type: str
                - document.fields: dict — extracted key-value data
                - document.metadata: dict

        Returns:
            dict — aggregated results. Structure is plugin-defined.
                   Example: {"total_spend": 45000, "vendor_count": 12}

        Raises:
            PluginError — for processing failures.
        """
        ...
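
A minimal aggregate() that produces the example result shape might look like the sketch below. AggregationPlugin is a local stand-in for the SDK base class, and the field names total_amount and vendor_name are assumed to come from your ontology.

```python
from types import SimpleNamespace


class AggregationPlugin:  # stand-in for the SDK base class (assumption)
    pass


class SpendAggregator(AggregationPlugin):
    name = "spend-aggregator"
    version = "1.0.0"

    def aggregate(self, documents: list) -> dict:
        total = 0.0
        vendors = set()
        for document in documents:
            if document.document_type != "invoice":
                continue  # only invoices contribute to spend
            amount = document.fields.get("total_amount")
            if isinstance(amount, (int, float)):
                total += amount
            vendor = document.fields.get("vendor_name")
            if vendor:
                vendors.add(vendor)
        return {"total_spend": total, "vendor_count": len(vendors)}


# Usage with SimpleNamespace objects standing in for processed documents.
sample = [
    SimpleNamespace(id="1", document_type="invoice",
                    fields={"vendor_name": "Acme Corp", "total_amount": 100.0}, metadata={}),
    SimpleNamespace(id="2", document_type="invoice",
                    fields={"vendor_name": "Acme Corp", "total_amount": 50.0}, metadata={}),
    SimpleNamespace(id="3", document_type="contract", fields={}, metadata={}),
]
summary = SpendAggregator().aggregate(sample)
```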

Available Service Methods

All plugin base classes provide access to platform services through inherited methods, available on self inside your plugin. The exception is get_credential(), which is available only in SourcePlugin.

prompt_llm()

Sends a prompt to the platform's LLM service and returns the text response. This is the primary way plugins interact with large language models.

python
# Basic usage
result = self.prompt_llm("Classify this document: ...")

# With options
result = self.prompt_llm(
    prompt="Extract the vendor name from: ...",
    model="gemini-2.0-flash",   # Default: platform-configured model
    temperature=0.1,             # Default: 0.2
    max_tokens=500,              # Default: 1024
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| prompt | str | (required) | The prompt text to send to the LLM. |
| model | str | Platform default | LLM model identifier. Depends on platform configuration. |
| temperature | float | 0.2 | Sampling temperature. Lower values produce more deterministic output. |
| max_tokens | int | 1024 | Maximum tokens in the LLM response. |

format_fields_for_prompt()

Converts an ontology field list into a formatted string suitable for inclusion in an LLM prompt. This ensures consistent prompt formatting across extraction plugins.

python
fields_text = self.format_fields_for_prompt(fields)
# Output:
# - vendor_name (string, required): The name of the vendor or supplier.
# - invoice_number (string, required): The unique invoice identifier.
# - total_amount (number, required): The total amount due, including taxes.

prompt = f"""Extract the following fields from this document:
{fields_text}

Document content:
{document.content[:4000]}

Return a JSON object with the field values."""

result = self.prompt_llm(prompt)
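
When testing prompts outside the platform, a local approximation of this formatter can be handy. The function below is not the SDK implementation; it simply reproduces the line format shown in the output above, and the "optional" label for non-required fields is an assumption.

```python
def format_fields(fields: list[dict]) -> str:
    """Render one '- name (type, required|optional): description' line per field."""
    lines = []
    for field in fields:
        # "optional" is an assumed label; the source only shows required fields.
        requirement = "required" if field.get("required") else "optional"
        lines.append(
            f"- {field['name']} ({field['type']}, {requirement}): {field['description']}"
        )
    return "\n".join(lines)


example_fields = [
    {
        "name": "vendor_name",
        "type": "string",
        "description": "The name of the vendor or supplier.",
        "required": True,
    },
    {
        "name": "total_amount",
        "type": "number",
        "description": "The total amount due, including taxes.",
        "required": True,
    },
]
fields_text = format_fields(example_fields)
```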

get_credential()

Retrieves a stored credential by name. Only available in SourcePlugin. The credential is decrypted at retrieval time.

python
cred = self.get_credential("accounts-payable-imap")
# cred.host -> "imap.company.com"
# cred.port -> 993
# cred.username -> "ap@company.com"
# cred.password -> "decrypted-password"

log()

Writes a log entry that is captured in the job execution log. Available levels: debug, info, warning, error.

python
self.log("info", f"Processing document: {document.filename}")
self.log("warning", "Document content is shorter than expected.")
self.log("error", f"LLM returned invalid JSON: {result}")

get_config()

Retrieves a configuration value set for this plugin instance in the pipeline. This is how pipeline-level parameters are passed to plugins. The second argument is the default returned when the pipeline does not set that key.

python
threshold = self.get_config("confidence_threshold", 0.8)
max_pages = self.get_config("max_pages", 50)

Configurable Parameters Pattern

Plugins can declare configurable parameters as class attributes with default values. These parameters can be overridden per-pipeline through the pipeline configuration.

python
class InvoiceExtractor(ExtractionPlugin):
    name = "invoice-extractor"
    version = "2.1.0"

    # Configurable parameters with defaults
    confidence_threshold: float = 0.8
    max_content_length: int = 8000
    include_line_items: bool = True
    supported_currencies: list[str] = ["USD", "EUR", "GBP"]

    def extract(self, document, fields) -> dict:
        # Access parameters via self or get_config
        threshold = self.get_config("confidence_threshold", self.confidence_threshold)
        max_len = self.get_config("max_content_length", self.max_content_length)
        ...

When registering the plugin, declare the config schema so the platform can validate pipeline configurations:

config_schema.json (json)
{
  "config_schema": {
    "confidence_threshold": {
      "type": "number",
      "default": 0.8,
      "min": 0.0,
      "max": 1.0,
      "description": "Minimum confidence score to accept an extraction."
    },
    "max_content_length": {
      "type": "integer",
      "default": 8000,
      "description": "Maximum characters of document content to process."
    },
    "include_line_items": {
      "type": "boolean",
      "default": true,
      "description": "Whether to extract individual line items."
    }
  }
}
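
To make the role of the schema concrete, the sketch below checks a pipeline configuration against it and fills in defaults. The platform's actual validation logic is not documented here; validate_config and resolve_config are hypothetical helpers that only illustrate how type, min, max, and default could be applied.

```python
# Schema values copied from the config_schema example above (descriptions omitted).
CONFIG_SCHEMA = {
    "confidence_threshold": {"type": "number", "default": 0.8, "min": 0.0, "max": 1.0},
    "max_content_length": {"type": "integer", "default": 8000},
    "include_line_items": {"type": "boolean", "default": True},
}

# bool is a subclass of int in Python, so exclude it from the numeric checks.
TYPE_CHECKS = {
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
}


def validate_config(config: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the config is valid."""
    errors = []
    for key, value in config.items():
        spec = schema.get(key)
        if spec is None:
            errors.append(f"{key}: not declared in config_schema")
            continue
        if not TYPE_CHECKS[spec["type"]](value):
            errors.append(f"{key}: expected {spec['type']}")
            continue
        if "min" in spec and value < spec["min"]:
            errors.append(f"{key}: below minimum {spec['min']}")
        if "max" in spec and value > spec["max"]:
            errors.append(f"{key}: above maximum {spec['max']}")
    return errors


def resolve_config(config: dict, schema: dict) -> dict:
    """Start from schema defaults, then apply the pipeline's overrides."""
    resolved = {key: spec["default"] for key, spec in schema.items()}
    resolved.update(config)
    return resolved
```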

Return Types

Each plugin type has a strict return type contract:

| Plugin Type | Return Type | Validation |
|---|---|---|
| ClassificationPlugin | str | Must be a non-empty string. Should match a registered ontology taxonomy for extraction to proceed. |
| ExtractionPlugin | dict[str, Any] | Keys must correspond to field names defined in the ontology. Values are validated against field types (string, number, date, boolean, array). |
| SourcePlugin | list[RawDocument] | Each RawDocument must have non-empty content (bytes), a filename (str), and a mime_type (str). |
| AggregationPlugin | dict[str, Any] | Free-form dictionary. Structure is plugin-defined. Stored in the job results. |

Error Handling

Use PluginError for expected failures. Set retryable=True if the error is transient (e.g., network timeout) and the platform should retry the document.

python
from bizsupply_sdk import PluginError

# Non-retryable error — document is fundamentally unprocessable
raise PluginError(
    "Document has no extractable text content.",
    retryable=False,
)

# Retryable error — transient failure, try again
raise PluginError(
    "LLM service timed out.",
    retryable=True,
)

Note: Unhandled exceptions (anything other than PluginError) cause the entire job to fail immediately. Always catch unexpected errors and wrap them in PluginError with an appropriate retryable flag.
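
One way to follow this advice is a small wrapper around the plugin's real work, sketched below. PluginError is redefined locally so the snippet runs without the SDK (the retryable attribute name on the instance is an assumption), run_safely is a hypothetical helper, and treating TimeoutError and ConnectionError as transient is a design choice rather than platform behavior.

```python
class PluginError(Exception):
    """Stand-in for bizsupply_sdk.PluginError (assumption)."""

    def __init__(self, message: str, retryable: bool = False):
        super().__init__(message)
        self.retryable = retryable


def run_safely(work, document):
    """Call work(document) and convert any failure into a PluginError."""
    try:
        return work(document)
    except PluginError:
        raise  # already in the form the platform expects
    except (TimeoutError, ConnectionError) as exc:
        # Transient failures: ask the platform to retry the document.
        raise PluginError(f"Transient failure: {exc}", retryable=True) from exc
    except Exception as exc:
        # Anything else is treated as permanent for this document.
        raise PluginError(f"Unexpected failure: {exc}", retryable=False) from exc
```

Inside a plugin, extract() or classify() could delegate to run_safely(self._do_extract, document), where _do_extract is whatever private method holds the real logic.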