Plugin Service API Reference

Complete reference for all service methods, data models, and utilities available to bizSupply plugins at runtime.

Last updated: 2026-04-06

All methods described here are inherited from the plugin base classes and are accessible via self inside your plugin code.


Core Service Methods

Every plugin type (Source, Classification, Extraction, Aggregation) has access to these four core service methods.

prompt_llm()

Sends a prompt to the platform LLM service and returns the text response.

Signature:

```python
def prompt_llm(
    self,
    prompt: str,
    *,
    model: str | None = None,
    temperature: float = 0.2,
    max_tokens: int = 1024,
    stop_sequences: list[str] | None = None,
    response_format: str | None = None,
) -> str:
    """
    Send a prompt to the LLM and return the text response.

    Args:
        prompt: The prompt text to send.
        model: LLM model identifier (default: platform-configured).
            Supported: "gemini-2.0-flash", "gemini-2.0-pro",
            "claude-sonnet-4-20250514", "gpt-4o".
        temperature: Sampling temperature, 0.0-1.0 (default: 0.2).
        max_tokens: Maximum response tokens (default: 1024).
        stop_sequences: Optional list of strings that stop generation.
        response_format: Optional hint ("json", "text"). When "json",
            the LLM is instructed to return valid JSON.

    Returns:
        str — the LLM's text response.

    Raises:
        PluginError(retryable=True) — on LLM timeout or rate limit.
        PluginError(retryable=False) — on invalid prompt or auth failure.
    """
```

Usage examples:

```python
# Basic classification prompt
result = self.prompt_llm("Classify this document: ...")

# Extraction with JSON response format
result = self.prompt_llm(
    prompt="Extract fields as JSON: ...",
    model="gemini-2.0-flash",
    temperature=0.1,
    max_tokens=2000,
    response_format="json",
)

# With stop sequences
result = self.prompt_llm(
    prompt="List the top 3 vendors:\n1.",
    stop_sequences=["4."],
    max_tokens=500,
)
```
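Because PluginError carries a retryable flag, transient LLM failures (timeouts, rate limits) can be retried with backoff while permanent ones propagate immediately. The sketch below illustrates one such pattern; the local PluginError class is a stand-in for the bizsupply_sdk import, assumed only to expose the retryable attribute documented above, and prompt_with_retry is a hypothetical helper, not part of the SDK.

```python
import time


# Stand-in for bizsupply_sdk.PluginError; assumes a `retryable`
# attribute as documented in the Raises section above.
class PluginError(Exception):
    def __init__(self, message: str, retryable: bool = False):
        super().__init__(message)
        self.retryable = retryable


def prompt_with_retry(call, attempts: int = 3, backoff: float = 1.0):
    """Invoke `call`, retrying retryable PluginErrors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except PluginError as err:
            # Give up on non-retryable errors or when attempts are exhausted.
            if not err.retryable or attempt == attempts - 1:
                raise
            time.sleep(backoff * (2 ** attempt))
```

Inside a plugin this might be invoked as `prompt_with_retry(lambda: self.prompt_llm(...))`.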

get_prompt()

Loads a registered prompt template by name. Returns the template string with placeholders intact.

```python
def get_prompt(self, name: str) -> str:
    """
    Retrieve a registered prompt template.

    Args:
        name: The prompt name (as registered via the API).

    Returns:
        str — the prompt template with {{PLACEHOLDER}} markers.

    Raises:
        PluginError(retryable=False) — if the prompt does not exist.
    """

# Usage
template = self.get_prompt("invoice-classifier")
prompt = template.replace("{{DOCUMENT_CONTENT}}", document.content[:4000])
```
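When a template carries several {{PLACEHOLDER}} markers, chained replace calls get clumsy and silently skip misspelled markers. A small helper can fill them in one pass and fail loudly on a missing value; this is an illustrative sketch, not an SDK function.

```python
import re


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Fill every {{NAME}} placeholder from `values`.

    Raises KeyError for a marker with no value, so missing
    substitutions surface immediately instead of reaching the LLM.
    """
    def _sub(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value for placeholder {name}")
        return values[name]

    return re.sub(r"\{\{(\w+)\}\}", _sub, template)
```

In a plugin this might look like `render_prompt(self.get_prompt("invoice-classifier"), {"DOCUMENT_CONTENT": document.content[:4000]})`.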

format_fields_for_prompt()

Converts ontology field definitions into a formatted string for LLM prompts.

```python
def format_fields_for_prompt(self, fields: list[dict]) -> str:
    """
    Format ontology fields into a human-readable string.

    Args:
        fields: List of field dicts with name, type, description, required.

    Returns:
        str — formatted field list, one per line. Example:
            - vendor_name (string, required): The name of the vendor.
            - total_amount (number, required): Total amount due.
            - due_date (date, optional): Payment due date.
    """

# Usage
fields_text = self.format_fields_for_prompt(fields)
prompt = f"Extract these fields:\n{fields_text}\nDocument: {doc.content}"
```
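For reference, the output shape shown in the docstring can be reproduced in plain Python. The sketch below only illustrates that shape so you know what the LLM will see; it is not the SDK's implementation.

```python
def format_fields(fields: list[dict]) -> str:
    """Render field dicts as '- name (type, required|optional): description' lines."""
    lines = []
    for f in fields:
        req = "required" if f.get("required") else "optional"
        lines.append(f"- {f['name']} ({f['type']}, {req}): {f['description']}")
    return "\n".join(lines)
```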

log()

Writes entries to the job execution log. Entries are visible in the platform UI and API.

```python
def log(self, level: str, message: str) -> None:
    """
    Write a log entry.

    Args:
        level: Log level — "debug", "info", "warning", or "error".
        message: The log message.
    """

# Usage
self.log("info", f"Processing document: {document.filename}")
self.log("warning", "Field 'due_date' not found in document.")
self.log("error", f"LLM returned invalid JSON: {result[:100]}")
self.log("debug", f"Raw LLM response: {result}")
```

Data Models

These are the core data models used throughout the plugin system.

Document

Represents a document in the platform. Passed to classify() and extract() methods.

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique document identifier (platform-assigned). |
| content | str | Extracted text content (up to 100KB). |
| filename | str | Original filename. |
| mime_type | str | MIME type (application/pdf, image/png, etc.). |
| metadata | dict | Ingestion metadata (source info, email headers, etc.). |
| document_type | str \| None | Classification label (set after the classification stage). |
| status | str | Processing status: pending, classified, extracted, failed. |
| created_at | datetime | Ingestion timestamp. |

ExtractionResult

The return type for extract(). Wraps extracted field values with optional confidence scores.

```python
from bizsupply_sdk import ExtractionResult

result = ExtractionResult()

# Set a field value with optional confidence
result.set_field("vendor_name", "Acme Corp", confidence=0.95)
result.set_field("total_amount", 1500.00, confidence=0.92)
result.set_field("line_items", [
    {"description": "Widget A", "quantity": 10, "unit_price": 50.00},
    {"description": "Widget B", "quantity": 5, "unit_price": 100.00},
])

# Access fields
print(result.fields)
# {"vendor_name": "Acme Corp", "total_amount": 1500.00, "line_items": [...]}

print(result.confidences)
# {"vendor_name": 0.95, "total_amount": 0.92}
```

DocumentInput

Used by source plugins to yield new documents for ingestion.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| content | bytes | Yes | Raw file content. |
| filename | str | Yes | Suggested filename. |
| mime_type | str | Yes | MIME type of the content. |
| metadata | dict | No | Arbitrary key-value metadata passed to downstream plugins. |

BaseSourceState

Base class for source plugin state models. Extend this to track your plugin's position in the source.

```python
from bizsupply_sdk import BaseSourceState
from datetime import datetime


class MySourceState(BaseSourceState):
    last_id: int = 0
    last_run: datetime | None = None
    cursor: str = ""
    total_fetched: int = 0
```
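A typical incremental source combines DocumentInput with a state model: yield one DocumentInput per unseen record and advance the state's cursor as you go. The sketch below stubs both classes as dataclasses shaped like the definitions above (the real ones come from bizsupply_sdk) purely to show the pattern; poll() and its record format are hypothetical.

```python
from dataclasses import dataclass, field


# Stand-ins for the SDK models, shaped like the definitions above.
@dataclass
class DocumentInput:
    content: bytes
    filename: str
    mime_type: str
    metadata: dict = field(default_factory=dict)


@dataclass
class MySourceState:
    last_id: int = 0
    total_fetched: int = 0


def poll(state: MySourceState, records: list[dict]):
    """Yield a DocumentInput per unseen record, advancing last_id."""
    for rec in records:
        if rec["id"] <= state.last_id:
            continue  # already ingested on a previous run
        yield DocumentInput(
            content=rec["body"],
            filename=f"record-{rec['id']}.pdf",
            mime_type="application/pdf",
            metadata={"source_id": rec["id"]},
        )
        state.last_id = rec["id"]
        state.total_fetched += 1
```

Because the Engine saves state after each run, a second invocation with the same records yields nothing new.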

DynamicCredential

Provides access to stored credentials in source plugins. Fields are defined by the plugin's credential_fields.

```python
# In a source plugin:
cred = self.credentials  # DynamicCredential

# Access fields as attributes
host = cred.host          # str
port = cred.port          # int
username = cred.username  # str
password = cred.password  # str (decrypted)

# For OAuth:
token = cred.access_token  # str (auto-refreshed)
```

OntologyField

Represents a single field in an ontology definition. Passed to extract() in the fields list.

| Property | Type | Description |
| --- | --- | --- |
| name | str | Field name (e.g., "vendor_name"). Must be unique within the ontology. |
| type | str | Field type: string, number, date, boolean, array, object. |
| description | str | Human-readable description used in LLM prompts. |
| required | bool | Whether the field must be extracted (True) or is optional (False). |
| sub_fields | list[OntologyField] \| None | Nested field definitions (for array and object types). |
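For array and object types, sub_fields describes the element structure. Expressed as plain dicts (the shape extract() receives in its fields list), a line_items definition might look like this; the specific field names are illustrative.

```python
# Hypothetical nested field definition for an invoice line-items array.
line_items_field = {
    "name": "line_items",
    "type": "array",
    "description": "Individual invoice line items.",
    "required": True,
    "sub_fields": [
        {"name": "description", "type": "string",
         "description": "Item description.", "required": True},
        {"name": "quantity", "type": "number",
         "description": "Quantity ordered.", "required": True},
        {"name": "unit_price", "type": "number",
         "description": "Price per unit.", "required": False},
    ],
}
```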

OntologyNode

Represents a node in the ontology taxonomy tree. Used for hierarchical classification.

| Property | Type | Description |
| --- | --- | --- |
| label | str | The classification label for this node. |
| description | str | Human-readable description. |
| children | list[OntologyNode] | Child nodes in the taxonomy. |
| fields | list[OntologyField] | Fields associated with this node (leaf nodes only). |

OntologyManifest

The complete ontology definition including taxonomy tree and field definitions.

| Property | Type | Description |
| --- | --- | --- |
| id | str | Unique ontology identifier. |
| name | str | Ontology name. |
| version | str | Ontology version. |
| root | OntologyNode | Root node of the taxonomy tree. |
| created_at | datetime | Creation timestamp. |

ExtendedDocument

Used by benchmark plugins. Extends Document with ground truth fields for scoring.

| Property | Type | Description |
| --- | --- | --- |
| (all Document fields) | ... | Inherits all Document properties. |
| ground_truth | dict | Expected field values (the "correct" answers for scoring). |
| extracted_fields | dict | The values extracted by the extraction plugin being benchmarked. |

ScoredDocument

A document with attached benchmark scores. Produced by the score() method of benchmarks.

| Property | Type | Description |
| --- | --- | --- |
| document | ExtendedDocument | The extended document that was scored. |
| score | float | The computed score for this document (0.0-1.0). |
| details | dict | Per-field scoring details and breakdown. |
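A benchmark's score() typically compares ground_truth against extracted_fields field by field and returns both the overall score and a per-field breakdown for details. The sketch below shows one minimal exact-match strategy; it is illustrative, not the platform's scoring algorithm.

```python
def score_fields(ground_truth: dict, extracted: dict) -> tuple[float, dict]:
    """Exact-match scoring: fraction of ground-truth fields whose
    extracted value matches. Returns (score, per-field details)."""
    if not ground_truth:
        return 1.0, {}  # nothing to check counts as a perfect score
    details = {
        name: extracted.get(name) == expected
        for name, expected in ground_truth.items()
    }
    score = sum(details.values()) / len(details)
    return score, details
```

Real benchmarks usually add per-type tolerance (e.g. fuzzy string matching, numeric epsilons) rather than strict equality.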

MatchRule and MatchCondition

Used by benchmark plugins to define aggregation rules for scoring.

```python
from bizsupply_sdk import MatchRule, MatchCondition

# A MatchRule defines how to aggregate scores for a group of documents
rule = MatchRule(
    name="high-value-invoices",
    conditions=[
        MatchCondition(field="total_amount", operator="gte", value=10000),
        MatchCondition(field="currency", operator="eq", value="USD"),
    ],
    aggregation="mean",  # mean, median, min, max, sum
)

# MatchCondition operators
# eq, neq, gt, gte, lt, lte, contains, not_contains, regex
```
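To make the operator and aggregation lists concrete, the sketch below evaluates conditions against plain field dicts. It is an illustration of the documented semantics, not the SDK's matcher.

```python
import re
import statistics

# One callable per documented MatchCondition operator.
OPERATORS = {
    "eq": lambda a, b: a == b,
    "neq": lambda a, b: a != b,
    "gt": lambda a, b: a > b,
    "gte": lambda a, b: a >= b,
    "lt": lambda a, b: a < b,
    "lte": lambda a, b: a <= b,
    "contains": lambda a, b: b in a,
    "not_contains": lambda a, b: b not in a,
    "regex": lambda a, b: re.search(b, a) is not None,
}


def matches(fields: dict, conditions: list[dict]) -> bool:
    """True when every condition holds for the given field values."""
    return all(
        OPERATORS[c["operator"]](fields[c["field"]], c["value"])
        for c in conditions
    )


def aggregate(scores: list[float], how: str) -> float:
    """Combine matched-document scores with the documented aggregations."""
    funcs = {"mean": statistics.mean, "median": statistics.median,
             "min": min, "max": max, "sum": sum}
    return funcs[how](scores)
```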

Engine Responsibilities by Plugin Type

The Engine handles different lifecycle tasks depending on the plugin type:

| Plugin Type | Engine Provides | Engine Handles After |
| --- | --- | --- |
| SourcePlugin | Credentials (self.credentials), state (self.state) | Creates Document records, saves state |
| ClassificationPlugin | Document with content | Routes document to matching ontology, sets document_type |
| ExtractionPlugin | Document + ontology fields list | Validates extracted values, stores results |
| AggregationPlugin | List of processed documents with extracted fields | Stores aggregated results in job output |

Complete Import Reference

All importable symbols from the bizsupply_sdk package:

imports.py:

```python
# Plugin base classes
from bizsupply_sdk import (
    ClassificationPlugin,
    ExtractionPlugin,
    SourcePlugin,
    AggregationPlugin,
)

# Data models
from bizsupply_sdk import (
    Document,
    DocumentInput,
    ExtractionResult,
    RawDocument,
    BaseSourceState,
    DynamicCredential,
)

# Ontology models
from bizsupply_sdk import (
    OntologyField,
    OntologyNode,
    OntologyManifest,
)

# Benchmark models
from bizsupply_sdk import (
    BenchmarkPlugin,
    ExtendedDocument,
    ScoredDocument,
    MatchRule,
    MatchCondition,
)

# Error handling
from bizsupply_sdk import PluginError

# Utilities
from bizsupply_sdk.utils import (
    parse_date,         # str -> datetime
    parse_currency,     # "$1,500.00" -> 1500.00
    normalize_text,     # strip, lowercase, remove special chars
    truncate_content,   # safe unicode-aware truncation
    guess_mime_type,    # filename -> MIME type
    base64_decode,      # base64 str -> bytes
)
```