Plugin Service API Reference
Complete reference for all service methods, data models, and utilities available to bizSupply plugins at runtime.
These methods are inherited from the plugin base classes and are accessible via `self` inside your plugin code.
Core Service Methods
Every plugin type (Source, Classification, Extraction, Aggregation) has access to these four core service methods.
prompt_llm()
Sends a prompt to the platform LLM service and returns the text response.
```python
def prompt_llm(
    self,
    prompt: str,
    *,
    model: str | None = None,
    temperature: float = 0.2,
    max_tokens: int = 1024,
    stop_sequences: list[str] | None = None,
    response_format: str | None = None,
) -> str:
    """
    Send a prompt to the LLM and return the text response.

    Args:
        prompt: The prompt text to send.
        model: LLM model identifier (default: platform-configured).
            Supported: "gemini-2.0-flash", "gemini-2.0-pro",
            "claude-sonnet-4-20250514", "gpt-4o".
        temperature: Sampling temperature, 0.0-1.0 (default: 0.2).
        max_tokens: Maximum response tokens (default: 1024).
        stop_sequences: Optional list of strings that stop generation.
        response_format: Optional hint ("json", "text"). When "json",
            the LLM is instructed to return valid JSON.

    Returns:
        str — the LLM's text response.

    Raises:
        PluginError(retryable=True) — on LLM timeout or rate limit.
        PluginError(retryable=False) — on invalid prompt or auth failure.
    """
```

Usage examples:
```python
# Basic classification prompt
result = self.prompt_llm("Classify this document: ...")

# Extraction with JSON response format
result = self.prompt_llm(
    prompt="Extract fields as JSON: ...",
    model="gemini-2.0-flash",
    temperature=0.1,
    max_tokens=2000,
    response_format="json",
)

# With stop sequences
result = self.prompt_llm(
    prompt="List the top 3 vendors:\n1.",
    stop_sequences=["4."],
    max_tokens=500,
)
```

get_prompt()
Loads a registered prompt template by name. Returns the template string with placeholders intact.
```python
def get_prompt(self, name: str) -> str:
    """
    Retrieve a registered prompt template.

    Args:
        name: The prompt name (as registered via the API).

    Returns:
        str — the prompt template with {{PLACEHOLDER}} markers.

    Raises:
        PluginError(retryable=False) — if the prompt does not exist.
    """

# Usage
template = self.get_prompt("invoice-classifier")
prompt = template.replace("{{DOCUMENT_CONTENT}}", document.content[:4000])
```

format_fields_for_prompt()
Converts ontology field definitions into a formatted string for LLM prompts.
```python
def format_fields_for_prompt(self, fields: list[dict]) -> str:
    """
    Format ontology fields into a human-readable string.

    Args:
        fields: List of field dicts with name, type, description, required.

    Returns:
        str — formatted field list, one per line. Example:
            - vendor_name (string, required): The name of the vendor.
            - total_amount (number, required): Total amount due.
            - due_date (date, optional): Payment due date.
    """

# Usage
fields_text = self.format_fields_for_prompt(fields)
prompt = f"Extract these fields:\n{fields_text}\nDocument: {doc.content}"
```

log()
Writes entries to the job execution log. Entries are visible in the platform UI and API.
```python
def log(self, level: str, message: str) -> None:
    """
    Write a log entry.

    Args:
        level: Log level — "debug", "info", "warning", or "error".
        message: The log message.
    """

# Usage
self.log("info", f"Processing document: {document.filename}")
self.log("warning", "Field 'due_date' not found in document.")
self.log("error", f"LLM returned invalid JSON: {result[:100]}")
self.log("debug", f"Raw LLM response: {result}")
```

Data Models
These are the core data models used throughout the plugin system.
Document
Represents a document in the platform. Passed to classify() and extract() methods.
| Property | Type | Description |
|---|---|---|
| id | str | Unique document identifier (platform-assigned). |
| content | str | Extracted text content (up to 100KB). |
| filename | str | Original filename. |
| mime_type | str | MIME type (application/pdf, image/png, etc.). |
| metadata | dict | Ingestion metadata (source info, email headers, etc.). |
| document_type | str \| None | Classification label (set after classification stage). |
| status | str | Processing status: pending, classified, extracted, failed. |
| created_at | datetime | Ingestion timestamp. |
ExtractionResult
The return type for extract(). Wraps extracted field values with optional confidence scores.
```python
from bizsupply_sdk import ExtractionResult

result = ExtractionResult()

# Set a field value with optional confidence
result.set_field("vendor_name", "Acme Corp", confidence=0.95)
result.set_field("total_amount", 1500.00, confidence=0.92)
result.set_field("line_items", [
    {"description": "Widget A", "quantity": 10, "unit_price": 50.00},
    {"description": "Widget B", "quantity": 5, "unit_price": 100.00},
])

# Access fields
print(result.fields)
# {"vendor_name": "Acme Corp", "total_amount": 1500.00, "line_items": [...]}
print(result.confidences)
# {"vendor_name": 0.95, "total_amount": 0.92}
```

DocumentInput
Used by source plugins to yield new documents for ingestion.
| Property | Type | Required | Description |
|---|---|---|---|
| content | bytes | Yes | Raw file content. |
| filename | str | Yes | Suggested filename. |
| mime_type | str | Yes | MIME type of the content. |
| metadata | dict | No | Arbitrary key-value metadata passed to downstream plugins. |
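As an illustration of the required and optional properties, here is a hedged sketch of a fetch loop that yields DocumentInput-shaped records. Plain dicts stand in for the SDK class, and `fetch_new_documents` is a hypothetical helper name:

```python
# Hypothetical fetch loop yielding records shaped like DocumentInput;
# a real source plugin would yield bizsupply_sdk.DocumentInput instances.
def fetch_new_documents(raw_files: list[tuple[str, bytes]]):
    for name, data in raw_files:
        yield {
            "content": data,                      # raw bytes (required)
            "filename": name,                     # suggested filename (required)
            "mime_type": "application/pdf",       # required
            "metadata": {"source": "sftp-drop"},  # optional; reaches downstream plugins
        }

docs = list(fetch_new_documents([("a.pdf", b"%PDF-1.4 ...")]))
```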
BaseSourceState
Base class for source plugin state models. Extend this to track your plugin's position in the source.
```python
from bizsupply_sdk import BaseSourceState
from datetime import datetime

class MySourceState(BaseSourceState):
    last_id: int = 0
    last_run: datetime | None = None
    cursor: str = ""
    total_fetched: int = 0
```

DynamicCredential
Provides access to stored credentials in source plugins. Fields are defined by the plugin's credential_fields.
```python
# In a source plugin:
cred = self.credentials  # DynamicCredential

# Access fields as attributes
host = cred.host          # str
port = cred.port          # int
username = cred.username  # str
password = cred.password  # str (decrypted)

# For OAuth:
token = cred.access_token  # str (auto-refreshed)
```

OntologyField
Represents a single field in an ontology definition. Passed to extract() in the fields list.
| Property | Type | Description |
|---|---|---|
| name | str | Field name (e.g., "vendor_name"). Must be unique within the ontology. |
| type | str | Field type: string, number, date, boolean, array, object. |
| description | str | Human-readable description used in LLM prompts. |
| required | bool | Whether the field must be extracted (True) or is optional (False). |
| sub_fields | list[OntologyField] \| None | Nested field definitions (for array and object types). |
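For example, a nested field definition might look like the following. These are illustrative plain dicts shaped like the table above (the field names and values are made up, not platform defaults); note how sub_fields nests under an array type:

```python
# Sub-field definitions for each entry in an array-typed field.
line_item_fields = [
    {"name": "description", "type": "string",
     "description": "Item description.", "required": True},
    {"name": "quantity", "type": "number",
     "description": "Quantity ordered.", "required": True},
]

# Top-level field list, mixing a scalar field and a nested array field.
fields = [
    {"name": "vendor_name", "type": "string",
     "description": "The name of the vendor.", "required": True},
    {"name": "line_items", "type": "array",
     "description": "Invoice line items.", "required": False,
     "sub_fields": line_item_fields},
]
```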
OntologyNode
Represents a node in the ontology taxonomy tree. Used for hierarchical classification.
| Property | Type | Description |
|---|---|---|
| label | str | The classification label for this node. |
| description | str | Human-readable description. |
| children | list[OntologyNode] | Child nodes in the taxonomy. |
| fields | list[OntologyField] | Fields associated with this node (leaf nodes only). |
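Since classification labels live on the leaves, a plugin typically walks the tree to collect them. A minimal sketch, using a local dataclass stand-in for OntologyNode (fields omitted for brevity) and a hypothetical `leaf_labels` helper:

```python
from dataclasses import dataclass, field

# Local stand-in for OntologyNode (see table above).
@dataclass
class OntologyNode:
    label: str
    description: str = ""
    children: list["OntologyNode"] = field(default_factory=list)

def leaf_labels(node: OntologyNode) -> list[str]:
    """Collect the labels of all leaf nodes, depth-first."""
    if not node.children:
        return [node.label]
    labels: list[str] = []
    for child in node.children:
        labels.extend(leaf_labels(child))
    return labels

root = OntologyNode("documents", children=[
    OntologyNode("financial", children=[OntologyNode("invoice"), OntologyNode("receipt")]),
    OntologyNode("legal", children=[OntologyNode("contract")]),
])
print(leaf_labels(root))  # ['invoice', 'receipt', 'contract']
```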
OntologyManifest
The complete ontology definition including taxonomy tree and field definitions.
| Property | Type | Description |
|---|---|---|
| id | str | Unique ontology identifier. |
| name | str | Ontology name. |
| version | str | Ontology version. |
| root | OntologyNode | Root node of the taxonomy tree. |
| created_at | datetime | Creation timestamp. |
ExtendedDocument
Used by benchmark plugins. Extends Document with ground truth fields for scoring.
| Property | Type | Description |
|---|---|---|
| (all Document fields) | ... | Inherits all Document properties. |
| ground_truth | dict | Expected field values (the "correct" answers for scoring). |
| extracted_fields | dict | The values extracted by the extraction plugin being benchmarked. |
ScoredDocument
A document with attached benchmark scores. Produced by the score() method of benchmarks.
| Property | Type | Description |
|---|---|---|
| document | ExtendedDocument | The extended document that was scored. |
| score | float | The computed score for this document (0.0-1.0). |
| details | dict | Per-field scoring details and breakdown. |
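One way a benchmark might derive the score and details values above is naive exact-match scoring per field. This is an illustrative sketch, not the platform's scoring algorithm; `score_fields` is a hypothetical helper:

```python
# Naive exact-match scorer: fraction of ground-truth fields whose
# extracted value matches exactly, plus a per-field breakdown.
def score_fields(ground_truth: dict, extracted: dict) -> tuple[float, dict]:
    details = {
        name: {
            "expected": expected,
            "got": extracted.get(name),
            "match": extracted.get(name) == expected,
        }
        for name, expected in ground_truth.items()
    }
    matches = sum(1 for d in details.values() if d["match"])
    score = matches / len(details) if details else 0.0
    return score, details

score, details = score_fields(
    {"vendor_name": "Acme Corp", "total_amount": 1500.00},
    {"vendor_name": "Acme Corp", "total_amount": 1450.00},
)
# score == 0.5; details["total_amount"]["match"] is False
```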
MatchRule and MatchCondition
Used by benchmark plugins to define aggregation rules for scoring.
```python
from bizsupply_sdk import MatchRule, MatchCondition

# A MatchRule defines how to aggregate scores for a group of documents
rule = MatchRule(
    name="high-value-invoices",
    conditions=[
        MatchCondition(field="total_amount", operator="gte", value=10000),
        MatchCondition(field="currency", operator="eq", value="USD"),
    ],
    aggregation="mean",  # mean, median, min, max, sum
)

# MatchCondition operators:
# eq, neq, gt, gte, lt, lte, contains, not_contains, regex
```

Engine Responsibilities by Plugin Type
The Engine handles different lifecycle tasks depending on the plugin type:
| Plugin Type | Engine Provides | Engine Handles After |
|---|---|---|
| SourcePlugin | Credentials (self.credentials), State (self.state) | Creates Document records, saves state |
| ClassificationPlugin | Document with content | Routes document to matching ontology, sets document_type |
| ExtractionPlugin | Document + ontology fields list | Validates extracted values, stores results |
| AggregationPlugin | List of processed documents with extracted fields | Stores aggregated results in job output |
Complete Import Reference
All importable symbols from the bizsupply_sdk package:
```python
# Plugin base classes
from bizsupply_sdk import (
    ClassificationPlugin,
    ExtractionPlugin,
    SourcePlugin,
    AggregationPlugin,
)

# Data models
from bizsupply_sdk import (
    Document,
    DocumentInput,
    ExtractionResult,
    RawDocument,
    BaseSourceState,
    DynamicCredential,
)

# Ontology models
from bizsupply_sdk import (
    OntologyField,
    OntologyNode,
    OntologyManifest,
)

# Benchmark models
from bizsupply_sdk import (
    BenchmarkPlugin,
    ExtendedDocument,
    ScoredDocument,
    MatchRule,
    MatchCondition,
)

# Error handling
from bizsupply_sdk import PluginError

# Utilities
from bizsupply_sdk.utils import (
    parse_date,        # str -> datetime
    parse_currency,    # "$1,500.00" -> 1500.00
    normalize_text,    # strip, lowercase, remove special chars
    truncate_content,  # safe unicode-aware truncation
    guess_mime_type,   # filename -> MIME type
    base64_decode,     # base64 str -> bytes
)
```