WASM API Reference
Complete API reference for the Blazen WebAssembly SDK
init()
Initialize the WASM module. Must be called once before using any other export.
import init from '@blazen/sdk';
await init();
Returns a Promise<void>. Subsequent calls are no-ops.
CompletionModel
A chat completion model. Created via static factory methods for each provider.
Provider Factory Methods
Most factory methods take no arguments: API keys are read from environment variables at runtime (OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, FAL_KEY, OPENROUTER_API_KEY, etc.). To override the default model, chain .withModel(...) on the returned instance.
The exceptions are Azure, which requires resourceName and deploymentName as arguments (there’s no single global endpoint), and Bedrock, which requires region.
| Method | Signature |
|---|---|
| CompletionModel.openai | () |
| CompletionModel.anthropic | () |
| CompletionModel.gemini | () |
| CompletionModel.azure | (resourceName: string, deploymentName: string) |
| CompletionModel.fal | () |
| CompletionModel.openrouter | () |
| CompletionModel.groq | () |
| CompletionModel.together | () |
| CompletionModel.mistral | () |
| CompletionModel.deepseek | () |
| CompletionModel.fireworks | () |
| CompletionModel.perplexity | () |
| CompletionModel.xai | () |
| CompletionModel.cohere | () |
| CompletionModel.bedrock | (region: string) |
const model = CompletionModel.openai();
const claude = CompletionModel.anthropic();
const gemini = CompletionModel.gemini();
const azure = CompletionModel.azure('my-resource', 'my-deployment');
const fal = CompletionModel.fal();
const groq = CompletionModel.groq().withModel('llama-3.3-70b-versatile');
const bedrock = CompletionModel.bedrock('us-east-1');
model.withModel(modelId: string): CompletionModel
Override the default model ID for this provider instance. Returns a new CompletionModel (WASM does not mutate in place).
const model = CompletionModel.openai().withModel('gpt-4o-mini');
Properties
| Property | Type | Description |
|---|---|---|
| .modelId | string | The model identifier string |
await model.complete(messages: ChatMessage[]): CompletionResponse
Perform a chat completion.
const response = await model.complete([
ChatMessage.system('You are a helpful assistant.'),
ChatMessage.user('What is 2 + 2?'),
]);
console.log(response.content);
await model.completeWithOptions(messages: ChatMessage[], options: CompletionOptions): CompletionResponse
Perform a chat completion with additional options.
const response = await model.completeWithOptions(
[ChatMessage.user('Write a haiku about WASM.')],
{ temperature: 0.7, maxTokens: 100 }
);
await model.stream(messages: ChatMessage[], onChunk: (chunk) => void): void
Stream a chat completion. The callback receives each chunk as it arrives.
await model.stream(
[ChatMessage.user('Tell me a story')],
(chunk) => {
if (chunk.delta) process.stdout.write(chunk.delta);
}
);
Each chunk has the shape:
{
delta?: string; // Text content delta
finishReason?: string; // Set on the final chunk
toolCalls: ToolCall[]; // Tool calls, if any
}
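To make the chunk shape concrete, here is a minimal sketch of folding a sequence of chunks into a single result. `collectChunks` is a hypothetical helper, not an SDK export, and it assumes each chunk's `toolCalls` entries are complete calls rather than partial deltas.

```typescript
// Shapes copied from this reference; ToolCall is reduced to its documented fields.
interface ToolCall {
  id: string;
  name: string;
  arguments: object;
}

interface StreamChunk {
  delta?: string;
  finishReason?: string;
  toolCalls: ToolCall[];
}

interface CollectedStream {
  text: string;
  finishReason: string | undefined;
  toolCalls: ToolCall[];
}

// Hypothetical helper: concatenates text deltas and keeps the last finishReason.
function collectChunks(chunks: StreamChunk[]): CollectedStream {
  let text = "";
  let finishReason: string | undefined;
  const toolCalls: ToolCall[] = [];
  for (const chunk of chunks) {
    if (chunk.delta) text += chunk.delta;
    if (chunk.finishReason) finishReason = chunk.finishReason;
    toolCalls.push(...chunk.toolCalls);
  }
  return { text, finishReason, toolCalls };
}

const collected = collectChunks([
  { delta: "Hel", toolCalls: [] },
  { delta: "lo", toolCalls: [] },
  { finishReason: "stop", toolCalls: [] },
]);
```

In a real `model.stream(...)` callback the same accumulation would happen one chunk at a time instead of over an array.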
Middleware Decorators
Each decorator returns a new CompletionModel wrapping the original.
model.withRetry(maxRetries?: number): CompletionModel
Automatic retry with exponential backoff on transient failures. Defaults to 3 retries.
const resilient = model.withRetry(5);
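The retry behavior can be pictured with a small synchronous sketch. This is not the SDK's implementation: the 200 ms base delay, the doubling factor, and the `isTransient` predicate are assumptions chosen for illustration, and the delays are only recorded rather than slept.

```typescript
interface RetryOutcome<T> {
  value: T;
  attempts: number;
  plannedDelaysMs: number[];
}

// Retries `op` on transient errors, up to `maxRetries` extra attempts.
// baseMs and the factor of 2 are assumed values, not documented SDK defaults.
function retryWithBackoff<T>(
  op: (attempt: number) => T,
  isTransient: (err: unknown) => boolean,
  maxRetries: number = 3,
  baseMs: number = 200,
): RetryOutcome<T> {
  const plannedDelaysMs: number[] = [];
  for (let attempt = 0; ; attempt++) {
    try {
      return { value: op(attempt), attempts: attempt + 1, plannedDelaysMs };
    } catch (err) {
      // Give up when retries are exhausted or the error is not transient.
      if (attempt >= maxRetries || !isTransient(err)) throw err;
      plannedDelaysMs.push(baseMs * Math.pow(2, attempt));
    }
  }
}

// The operation fails twice with a transient error, then succeeds.
const outcome = retryWithBackoff(
  (attempt) => {
    if (attempt < 2) throw new Error("503 service unavailable");
    return "ok";
  },
  (err) => err instanceof Error && err.message.startsWith("503"),
);
```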
model.withCache(ttlSeconds?: number, maxEntries?: number): CompletionModel
In-memory response cache. Streaming requests bypass the cache.
const cached = model.withCache(600, 500);
| Parameter | Default | Description |
|---|---|---|
| ttlSeconds | 300 | Cache entry TTL in seconds. |
| maxEntries | 1000 | Maximum entries before eviction. |
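The interaction of the two parameters can be sketched with a tiny TTL cache. This is an illustration under assumptions, not the SDK's cache: the eviction order (oldest insertion first), the string cache key, and the injectable clock are all hypothetical.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(
    ttlSeconds: number = 300,
    maxEntries: number = 1000,
    now: () => number = () => Date.now(),
  ) {
    this.ttlMs = ttlSeconds * 1000;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entries are dropped lazily on read
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      // Assumed policy: evict the oldest inserted entry (Map keeps insertion order).
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// A fake clock keeps the example deterministic.
let fakeNowMs = 0;
const ttlCache = new TtlCache<string>(600, 2, () => fakeNowMs);
ttlCache.set("a", "first");
ttlCache.set("b", "second");
ttlCache.set("c", "third"); // capacity is 2, so "a" is evicted
const evicted = ttlCache.get("a");
const fresh = ttlCache.get("c");
fakeNowMs = 600 * 1000; // advance the clock by the full TTL
const expired = ttlCache.get("c");
```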
CompletionModel.withFallback(models: CompletionModel[]): CompletionModel
Static method. Tries providers in order; falls back on transient errors.
const model = CompletionModel.withFallback([
CompletionModel.openai(),
CompletionModel.groq(),
]);
ChatMessage
A class for building typed chat messages.
Static Factory Methods
| Method | Description |
|---|---|
| ChatMessage.system(content: string) | Create a system message |
| ChatMessage.user(content: string) | Create a user message |
| ChatMessage.assistant(content: string) | Create an assistant message |
| ChatMessage.tool(content: string) | Create a tool result message |
| ChatMessage.toolResultMessage(callId: string, name: string, content: string) | Create a tool result message with a tool-call ID and function name. (Named toolResultMessage to avoid colliding with the .toolResult instance getter that surfaces the structured payload of an existing message.) |
| ChatMessage.userImageUrl(text: string, url: string, mediaType?: string) | User message with text and an image URL |
| ChatMessage.userImageBase64(text: string, data: string, mediaType: string) | User message with text and a base64-encoded image |
const msg = ChatMessage.user('Hello');
const sys = ChatMessage.system('You are a helpful assistant.');
const img = ChatMessage.userImageUrl('Describe this:', 'https://example.com/photo.jpg');
Constructor
new ChatMessage({ role?: string, content?: string, parts?: ContentPart[] })
Properties
| Property | Type | Description |
|---|---|---|
| .role | string | "system", "user", "assistant", or "tool" |
| .content | string \| undefined | The text content of the message |
| .toolCallId | string \| undefined | The tool-call ID this message is responding to (only set for tool-result messages) |
| .name | string \| undefined | The function name of the tool that produced this result (only set for tool-result messages) |
JSON Shape
ChatMessage.toJSON() (and the entries in the messages array returned by runAgent) match the tsify-generated ChatMessage interface:
interface ChatMessage {
role: "system" | "user" | "assistant" | "tool";
content: MessageContent;
tool_call_id?: string;
name?: string;
tool_calls?: ToolCall[];
tool_result?: ToolOutput; // structured tool-result payload (see below)
}
The tool_result field is populated when a tool handler returns a non-string value or supplies an llm_override. Plain-string tool results live in content as MessageContent::Text instead. The field name is tool_result (snake_case) because tsify preserves Rust field naming.
Tip. The tsify-generated interface lives in crates/blazen-wasm-sdk/pkg/blazen_wasm_sdk.d.ts and is regenerated on every pnpm build (or wasm-pack build --target bundler) inside crates/blazen-wasm-sdk/. See the WASM Quickstart for the build flow.
ToolOutput
The two-channel tool result emitted by JS tool handlers and surfaced on ChatMessage.tool_result.
interface ToolOutput {
/**
* The full structured payload the caller sees programmatically. Any
* JSON-serializable value (object, array, string, number, etc).
*/
data: any;
/**
* Optional override for the body sent back to the model on the next
* turn. When `null` / absent, each provider applies its default
* conversion from `data`.
*/
llm_override?: LlmPayload | null;
}
The two channels exist because what the rest of your application wants to consume from a tool (full structured data, large blobs, internal IDs) is rarely the best thing to feed back to the LLM (token-heavy, leaks internal shape). Set data to the rich payload your code consumes, and use llm_override when you want to send the model a trimmed summary or a provider-specific shape instead.
The WASM dispatcher accepts either llm_override (snake_case) or llmOverride (camelCase) when a JS handler returns a structured object. Both spellings are normalized before the value is parsed, so you can use whichever fits your JS code.
Tool handler return shapes
The WASM tool dispatcher (js_to_tool_output in crates/blazen-wasm-sdk/src/agent.rs) accepts two shapes from a handler:
// 1. Bare value: wrapped automatically as { data: <value>, llm_override: null }.
const tool = {
name: 'getWeather',
description: 'Get the current weather for a city',
parameters: { type: 'object', properties: { city: { type: 'string' } }, required: ['city'] },
handler: async (args) => ({ temp: 22, condition: 'cloudy', city: args.city }),
};
// 2. Structured ToolOutput: object literal with a `data` key.
const structuredTool = {
name: 'fetchProfile',
description: 'Fetch a full user profile',
parameters: { type: 'object', properties: { userId: { type: 'string' } }, required: ['userId'] },
handler: async (args) => {
const profile = await db.users.findById(args.userId); // huge blob
return {
data: profile, // caller sees full record
llmOverride: { // model sees compact summary
kind: 'text',
text: `User ${profile.name} (id=${profile.id})`,
},
};
},
};
If a handler returns a string and that string happens to parse as JSON describing a ToolOutput, the dispatcher unpacks it. Otherwise the string is preserved as plain text. If ToolOutput deserialization fails (for instance, a malformed llm_override), the dispatcher silently falls back to wrapping the raw value as { data, llm_override: null } rather than throwing.
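The rules above can be summarized in a short sketch. This is a TypeScript approximation of the described behavior, not a port of js_to_tool_output: `normalizeToolOutput` and `parseLlmPayload` are hypothetical names, ContentPart and ProviderId are simplified to `unknown` and `string`, and "raw value" in the fallback case is read as the handler's entire return value.

```typescript
type LlmPayload =
  | { kind: "text"; text: string }
  | { kind: "json"; value: unknown }
  | { kind: "parts"; parts: unknown[] }
  | { kind: "provider_raw"; provider: string; value: unknown };

interface ToolOutput {
  data: unknown;
  llm_override: LlmPayload | null;
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Returns the payload when it matches a documented shape, otherwise undefined.
function parseLlmPayload(v: unknown): LlmPayload | undefined {
  if (!isRecord(v)) return undefined;
  const kind = v["kind"];
  const text = v["text"];
  const parts = v["parts"];
  const provider = v["provider"];
  if (kind === "text" && typeof text === "string") return { kind: "text", text };
  if (kind === "json" && "value" in v) return { kind: "json", value: v["value"] };
  if (kind === "parts" && Array.isArray(parts)) return { kind: "parts", parts };
  if (kind === "provider_raw" && typeof provider === "string" && "value" in v) {
    return { kind: "provider_raw", provider, value: v["value"] };
  }
  return undefined;
}

function normalizeToolOutput(raw: unknown): ToolOutput {
  let candidate: unknown = raw;
  if (typeof raw === "string") {
    try {
      candidate = JSON.parse(raw); // a string may carry a JSON-encoded ToolOutput
    } catch {
      return { data: raw, llm_override: null }; // plain text stays plain text
    }
  }
  if (isRecord(candidate) && "data" in candidate) {
    // Accept both spellings of the override key.
    const override = candidate["llm_override"] ?? candidate["llmOverride"];
    if (override === undefined || override === null) {
      return { data: candidate["data"], llm_override: null };
    }
    const parsed = parseLlmPayload(override);
    if (parsed) return { data: candidate["data"], llm_override: parsed };
  }
  // Bare values and malformed ToolOutput objects are wrapped as-is.
  return { data: raw, llm_override: null };
}

const bare = normalizeToolOutput({ temp: 22 });
const camel = normalizeToolOutput({ data: 1, llmOverride: { kind: "text", text: "one" } });
const fromString = normalizeToolOutput('{"data":2,"llm_override":{"kind":"text","text":"two"}}');
const plainText = normalizeToolOutput("sunny");
const malformed = normalizeToolOutput({ data: 3, llm_override: { kind: "text" } });
```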
LlmPayload
The override sent to the LLM on the next turn (the optional llm_override field of ToolOutput). Discriminated union keyed by kind:
| kind | Shape | Description |
|---|---|---|
| "text" | { kind: "text"; text: string } | Plain text. Works on every provider universally. |
| "json" | { kind: "json"; value: any } | Structured JSON. Anthropic and Gemini consume it natively; OpenAI / Responses / Azure / Fal stringify at the wire boundary. |
| "parts" | { kind: "parts"; parts: ContentPart[] } | Multimodal content blocks. Anthropic supports natively as tool_result.content blocks; OpenAI falls back to text; Gemini falls back to a JSON object. |
| "provider_raw" | { kind: "provider_raw"; provider: ProviderId; value: any } | Provider-specific escape hatch. Only the named provider sees value; every other provider falls back to the default conversion from ToolOutput.data. |
ProviderId is the snake_case enum:
type ProviderId =
| "openai"
| "openai_compat"
| "azure"
| "anthropic"
| "gemini"
| "responses"
| "fal";
Per-provider behavior for data (when llm_override is null)
When no override is set, each provider applies its default conversion to ToolOutput.data:
- OpenAI / OpenAI-compat / Azure / Responses / Fal: the value is JSON-stringified once and sent as the tool-result string.
- Anthropic: structured data becomes [{ type: "text", text: <stringified> }] so it lands inside tool_result.content blocks.
- Gemini: a structured object passes through as the response field of the function-response part; scalar values (numbers, booleans, strings) are wrapped as { result: scalar } to satisfy Gemini’s object-only contract.
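The three default conversions listed above can be sketched as a single function. `defaultToolResultBody` is a hypothetical name, and two details are assumptions not stated in this reference: string data is passed through without extra quoting, and Gemini wraps arrays and null the same way it wraps scalars.

```typescript
type ProviderId =
  | "openai"
  | "openai_compat"
  | "azure"
  | "anthropic"
  | "gemini"
  | "responses"
  | "fal";

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Assumption: strings are sent as-is instead of being JSON-quoted a second time.
function stringifyOnce(data: unknown): string {
  return typeof data === "string" ? data : JSON.stringify(data);
}

function defaultToolResultBody(provider: ProviderId, data: unknown): unknown {
  switch (provider) {
    case "anthropic":
      // Lands inside tool_result.content as a single text block.
      return [{ type: "text", text: stringifyOnce(data) }];
    case "gemini":
      // Objects pass through; everything else is wrapped to satisfy the object-only contract.
      return isPlainObject(data) ? data : { result: data };
    default:
      // OpenAI, OpenAI-compat, Azure, Responses, Fal: one JSON string.
      return stringifyOnce(data);
  }
}

const openaiBody = defaultToolResultBody("openai", { ok: true });
const anthropicBody = defaultToolResultBody("anthropic", { ok: true });
const geminiObject = defaultToolResultBody("gemini", { ok: true });
const geminiScalar = defaultToolResultBody("gemini", 17);
```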
Examples
// Send a compact text summary while keeping the full record in `data`.
return {
data: { id: 42, items: bigArray, internal: '...' },
llmOverride: { kind: 'text', text: '42 items processed' },
};
// Force JSON semantics regardless of provider default.
return {
data: { ok: true, count: 17 },
llmOverride: { kind: 'json', value: { ok: true, count: 17 } },
};
// Multimodal: send an image back to the model (Anthropic native; falls
// back to text on OpenAI; falls back to a JSON object on Gemini).
return {
data: { url: 'https://example.com/photo.png' },
llmOverride: {
kind: 'parts',
parts: [
{ type: 'text', text: 'Here is the photo:' },
{ type: 'image', source: { url: 'https://example.com/photo.png' } },
],
},
};
// Anthropic-only escape hatch: send a raw provider blob.
return {
data: { ok: true },
llmOverride: {
kind: 'provider_raw',
provider: 'anthropic',
value: { type: 'text', text: 'anthropic-only payload' },
},
};
CompletionResponse
Returned by model.complete() and model.completeWithOptions().
interface CompletionResponse {
content?: string; // The generated text
toolCalls: ToolCall[]; // Tool calls requested by the model
usage?: TokenUsage; // Token usage statistics
model: string; // Model name used
finishReason?: string; // "stop", "tool_calls", etc.
cost?: number; // Cost in USD
timing?: RequestTiming; // Request timing breakdown
metadata: object; // Raw provider-specific metadata
}
CompletionOptions
Options for completeWithOptions().
interface CompletionOptions {
temperature?: number; // Sampling temperature (0.0 - 2.0)
maxTokens?: number; // Maximum tokens to generate
topP?: number; // Nucleus sampling parameter
model?: string; // Override the default model ID
tools?: ToolDefinition[]; // Tool definitions for function calling
}
ToolCall
A tool invocation requested by the model.
| Property | Type | Description |
|---|---|---|
| .id | string | Unique identifier for the tool call |
| .name | string | Name of the tool to invoke |
| .arguments | object | Parsed JSON arguments |
ToolDefinition
Describes a tool that the model may invoke.
interface ToolDefinition {
name: string; // Unique tool name
description: string; // Human-readable description
parameters: object; // JSON Schema for the tool's parameters
}
Content Subsystem
Browser/edge-friendly subset of Blazen’s multimodal content store. The WASM surface omits filesystem-bound APIs available in the native crate — there is no localFile() factory, and ContentStore does not expose a metadata() method.
ContentKind
String union describing the taxonomy of stored content. Treat as #[non_exhaustive] — new kinds may be added in future releases.
| Value | Description |
|---|---|
| "image" | Raster or vector image |
| "audio" | Audio clip |
| "video" | Video clip |
| "document" | Document (PDF, text, etc.) |
| "three_d_model" | 3D model (glTF, OBJ, etc.) |
| "cad" | CAD file (STEP, IGES, etc.) |
| "archive" | Archive (zip, tar, etc.) |
| "font" | Font file |
| "code" | Source code |
| "data" | Structured data (CSV, JSON, etc.) |
| "other" | Catch-all for anything else |
ContentHandle
Opaque, store-issued reference to a piece of content. Field names mirror the wire format (snake_case).
interface ContentHandle {
id: string; // Opaque store-defined identifier
kind: ContentKind; // Used for type-checking at the tool-input boundary
mime_type?: string; // MIME type if known
byte_size?: number; // Byte size if known
display_name?: string; // Human-readable name (e.g. original filename)
}
ImageSource
Discriminated union of every way an image can be supplied to a ChatMessage.
type ImageSource =
| { type: "url"; url: string }
| { type: "base64"; data: string }
| { type: "file"; path: string }
| { type: "provider_file"; provider: ProviderId; id: string }
| { type: "handle"; handle: ContentHandle };
The handle variant defers resolution to the active ContentStore, which substitutes one of the other variants at request-build time.
ContentStore
Abstract handle-issuing store. Build built-in instances via the static factories below, supply user-defined backends via ContentStore.custom({...}), or extend ContentStore in JS / TypeScript and override the methods you need. Resources are released by free() or via the explicit-resource-management protocol ([Symbol.dispose]).
class ContentStore {
// Subclass-friendly base constructor (call from `super()`)
constructor();
// Factories
static inMemory(): ContentStore;
static openaiFiles(apiKey: string): ContentStore;
static anthropicFiles(apiKey: string): ContentStore;
static geminiFiles(apiKey: string): ContentStore;
static falStorage(apiKey: string): ContentStore;
static custom(options: CustomContentStoreOptions): ContentStore;
// Instance methods
put(
body: Uint8Array | string,
kindHint?: string | null,
mimeType?: string | null,
displayName?: string | null,
): Promise<ContentHandle>;
resolve(handle: ContentHandle): Promise<unknown>; // MediaSource-shaped JS object
fetchBytes(handle: ContentHandle): Promise<Uint8Array>;
delete(handle: ContentHandle): Promise<void>;
// Lifecycle
free(): void;
[Symbol.dispose](): void;
}
put accepts either a Uint8Array of bytes or a string URL (URL support depends on the backing store). kindHint is the wire string — for example "image" or "three_d_model" — and overrides auto-detection.
import { ContentStore } from '@blazen/sdk';
// Explicit-resource-management form (preferred)
using store = ContentStore.inMemory();
const handle = await store.put(bytes, 'image', 'image/png', 'logo.png');
const media = await store.resolve(handle);
// Manual lifecycle
const fallback = ContentStore.openaiFiles(apiKey);
try {
const h = await fallback.put(bytes, 'document', 'application/pdf');
} finally {
fallback.free();
}
Subclassing ContentStore
ContentStore is subclassable from JavaScript / TypeScript via wasm-bindgen. Override the methods your backend needs; the SDK wraps your subclass in a Rust adapter that dispatches into your JS async functions via js_sys::Function::call + wasm_bindgen_futures::JsFuture.
import { ContentStore } from "@blazen/sdk";
import type { ContentHandle } from "@blazen/sdk";
class IndexedDBContentStore extends ContentStore {
constructor() {
super();
}
async put(body, hint) {
// ... persist to IndexedDB / OPFS / fetch+rehost ...
return { id: "...", kind: "image" };
}
async resolve(handle) {
return { sourceType: "url", url: "..." };
}
async fetchBytes(handle) {
return new Uint8Array([/* ... */]);
}
// Optional:
async fetchStream(handle) { return new Uint8Array([/* ... */]); }
async delete(handle) { /* no-op */ }
}
Subclasses MUST override put, resolve, fetchBytes. The base-class default impls throw a JsError so any missing override fails clearly rather than silently recursing via super().
ContentStore.custom({...})
Callback-based factory. Direct JS mirror of Rust CustomContentStore::builder.
ContentStore.custom(options: {
put: (body: any, hint: any) => Promise<ContentHandle>;
resolve: (handle: ContentHandle) => Promise<any>; // serialized MediaSource
fetchBytes: (handle: ContentHandle) => Promise<Uint8Array>;
fetchStream?: (handle: ContentHandle) => Promise<Uint8Array>; // single-chunk for now
delete?: (handle: ContentHandle) => Promise<void>;
}): ContentStore
put, resolve, fetchBytes are required. fetchStream and delete are optional. The body arrives as a JS object shaped like {type: "bytes", data: [...]} / {type: "url", url} / {type: "provider_file", provider, id} / {type: "stream", stream: ReadableStream<Uint8Array>, sizeHint: number | null} (no local_path in WASM since there’s no filesystem).
resolve returns a serialized MediaSource JS object. fetchBytes returns a Uint8Array. fetchStream may return either a Uint8Array / number[] (legacy, single-chunk) or a ReadableStream<Uint8Array> for true chunk-by-chunk streaming.
Built-in stores
| Factory | Purpose |
|---|---|
| ContentStore.inMemory() | Ephemeral in-WASM-memory store. Bytes live in WASM heap; URL/provider-file inputs are recorded by reference. |
| ContentStore.openaiFiles(apiKey) | Uploads to the OpenAI Files API. apiKey is sent as Bearer <apiKey>. |
| ContentStore.anthropicFiles(apiKey) | Uploads to the Anthropic Files API. apiKey is sent as x-api-key. |
| ContentStore.geminiFiles(apiKey) | Uploads to the Google AI / Gemini Files API. |
| ContentStore.falStorage(apiKey) | Uploads to fal.ai storage. |
| ContentStore.custom({...}) | User-defined backend via async callbacks (see above). |
Tool-input schema helpers
Each helper returns a JSON Schema fragment with a x-blazen-content-ref extension that tells Blazen’s resolver which content kind the model is expected to pass.
| Helper | Kind |
|---|---|
| imageInput(name, description) | image |
| audioInput(name, description) | audio |
| videoInput(name, description) | video |
| fileInput(name, description) | document |
| threeDInput(name, description) | three_d_model |
| cadInput(name, description) | cad |
import { imageInput } from '@blazen/sdk';
const params = imageInput('photo', 'The user-supplied photograph to analyze');
// {
// type: "object",
// properties: {
// photo: {
// type: "string",
// description: "The user-supplied photograph to analyze",
// "x-blazen-content-ref": { kind: "image" }
// }
// },
// required: ["photo"]
// }
How resolution works
The model only ever sees the schema’s string type and passes the handle id back as a plain string. The x-blazen-content-ref extension is invisible to providers. Before invoking the tool handler, Blazen’s resolver looks up the id in the active ContentStore, fetches a typed content shape (e.g. an ImageSource), and substitutes it into the tool arguments. Handlers therefore receive resolved content rather than raw ids.
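The substitution step can be sketched as follows. This is an illustration only: `resolveContentRefs` and `InMemoryLookup` are hypothetical stand-ins for Blazen's resolver and the active ContentStore, the resolved shape is limited to the url variant, and the errors thrown for unknown ids or mismatched kinds are assumptions.

```typescript
interface ContentHandle {
  id: string;
  kind: string;
  mime_type?: string;
}

type ResolvedSource = { type: "url"; url: string };

interface PropertySchema {
  type: string;
  description?: string;
  "x-blazen-content-ref"?: { kind: string };
}

interface ToolSchema {
  type: "object";
  properties: Record<string, PropertySchema>;
  required?: string[];
}

// Hypothetical stand-in for the active ContentStore: maps handle ids to resolved sources.
class InMemoryLookup {
  private readonly items = new Map<string, { handle: ContentHandle; source: ResolvedSource }>();

  register(handle: ContentHandle, source: ResolvedSource): void {
    this.items.set(handle.id, { handle, source });
  }

  lookup(id: string): { handle: ContentHandle; source: ResolvedSource } | undefined {
    return this.items.get(id);
  }
}

// Replaces handle ids with resolved content for every property tagged with the extension.
function resolveContentRefs(
  schema: ToolSchema,
  args: Record<string, unknown>,
  store: InMemoryLookup,
): Record<string, unknown> {
  const resolved: Record<string, unknown> = { ...args };
  for (const key of Object.keys(schema.properties)) {
    const ref = schema.properties[key]?.["x-blazen-content-ref"];
    const value = args[key];
    if (!ref || typeof value !== "string") continue;
    const entry = store.lookup(value);
    if (!entry) throw new Error(`unknown content handle: ${value}`);
    if (entry.handle.kind !== ref.kind) {
      throw new Error(`expected ${ref.kind} content, got ${entry.handle.kind}`);
    }
    resolved[key] = entry.source;
  }
  return resolved;
}

const photoSchema: ToolSchema = {
  type: "object",
  properties: {
    photo: { type: "string", "x-blazen-content-ref": { kind: "image" } },
    caption: { type: "string" },
  },
  required: ["photo"],
};

const lookupStore = new InMemoryLookup();
lookupStore.register(
  { id: "h-1", kind: "image", mime_type: "image/png" },
  { type: "url", url: "https://example.com/photo.png" },
);

// The model passed the handle id as a plain string; the handler sees the resolved source.
const resolvedArgs = resolveContentRefs(photoSchema, { photo: "h-1", caption: "hi" }, lookupStore);
```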
TokenUsage
| Property | Type | Description |
|---|---|---|
| .promptTokens | number | Tokens in the prompt |
| .completionTokens | number | Tokens in the completion |
| .totalTokens | number | Total tokens used |
RequestTiming
| Property | Type | Description |
|---|---|---|
| .queueMs | number \| undefined | Time in queue (ms) |
| .executionMs | number \| undefined | Execution time (ms) |
| .totalMs | number \| undefined | Total wall-clock time (ms) |
runAgent
Run an agentic tool-calling loop.
const result = await runAgent(model, messages, tools, options?);
Parameters
| Parameter | Type | Description |
|---|---|---|
| model | CompletionModel | The completion model to use |
| messages | ChatMessage[] | Initial conversation messages |
| tools | ToolDef[] | Tool definitions. Each tool object has name, description, parameters (JSON Schema), and a handler(args) => any \| Promise<any> that returns either a bare JSON-serializable value or a structured ToolOutput (an object with a data key, plus an optional llm_override / llmOverride). See the tool handler return shapes above. |
| options | AgentRunOptions? | Optional configuration |
AgentRunOptions
interface AgentRunOptions {
toolConcurrency?: number; // Max concurrent tool calls per round (default: 0 = unlimited)
maxIterations?: number; // Max tool-calling iterations (default: 10)
systemPrompt?: string; // System prompt prepended to conversation
temperature?: number; // Sampling temperature
maxTokens?: number; // Max tokens per call
addFinishTool?: boolean; // Add a built-in "finish" tool
}
AgentResult
interface AgentResult {
content?: string; // Final text response
messages: ChatMessage[]; // Full message history (each entry matches the tsify ChatMessage interface)
iterations: number; // Number of iterations
totalUsage?: TokenUsage; // Aggregated token usage
totalCost?: number; // Aggregated cost in USD
}
Tool-result messages in messages carry a tool_result?: ToolOutput field whenever the handler returned a non-string data or supplied an llm_override. Plain string returns appear as content: { Text: "..." } on a role: "tool" message with no tool_result field.
Workflow
new Workflow(name: string)
Create a new workflow instance.
.addStep(name: string, eventTypes: string[], handler: StepHandler)
Register a step that listens for one or more event types.
wf.addStep('process', ['MyEvent'], async (event, ctx) => {
return { type: 'blazen::StopEvent', result: { done: true } };
});
await wf.run(input: object): any
Run the workflow to completion. The input is passed as the data field of a synthetic StartEvent. Returns a Promise that resolves to the result field of the final StopEvent.
await wf.runStreaming(input: any, callback: (event: any) => void): Promise<any>
Run the workflow and forward each lifecycle event to a JS callback as it occurs, resolving with the terminal payload once the workflow completes. Mirrors the Node binding’s runStreaming(input, onEvent). Stream events are subscribed before the engine begins dispatching, so no events are missed between dispatch and subscription.
await wf.runStreaming({ topic: 'TS' }, (event) => {
console.log(event.event_type, event.data);
});
The callback is invoked with { event_type, data } per event. Errors raised synchronously by the listener are swallowed so a misbehaving callback does not abort the run.
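The error-swallowing behavior can be pictured with a small sketch. `forwardEvents` is a hypothetical function that mimics the described semantics over an array of events; it is not how the engine is implemented.

```typescript
interface StreamEvent {
  event_type: string;
  data: unknown;
}

// Invokes the listener for every event; a throwing listener does not stop delivery.
function forwardEvents(events: StreamEvent[], listener: (event: StreamEvent) => void): number {
  let delivered = 0;
  for (const event of events) {
    try {
      listener(event);
      delivered++;
    } catch {
      // Swallowed: a misbehaving callback must not abort the run.
    }
  }
  return delivered;
}

const seenTypes: string[] = [];
const deliveredCount = forwardEvents(
  [
    { event_type: "A", data: null },
    { event_type: "B", data: null },
    { event_type: "C", data: null },
  ],
  (event) => {
    seenTypes.push(event.event_type);
    if (event.event_type === "B") throw new Error("listener bug");
  },
);
```

Because failures are silent, a listener that needs error reporting should wrap its own body in try/catch and log there.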
await wf.runWithHandler(input: any): Promise<WorkflowHandler>
Build and dispatch the workflow, returning the live WorkflowHandler instead of awaiting the terminal event. Use this when you want to drive the run yourself: call awaitResult() for the final payload, pause() / snapshot() for mid-flight state, nextEvent() / streamEvents() for events, or cancel() / abort() to tear it down. Functionally equivalent to runHandler(input) — the JS-side name runWithHandler exists for parity with the Node binding.
const handler = await wf.runWithHandler({ topic: 'TS' });
await handler.streamEvents((ev) => console.log(ev));
const result = await handler.awaitResult();
wf.setSessionPausePolicy(policy: string): void
Configure how live session refs are treated when the workflow is paused or snapshotted. Mirrors the Node binding’s setSessionPausePolicy. The policy is applied when the workflow is dispatched via run / runHandler / runStreaming / runWithHandler / resumeFromSnapshot / resumeWithSerializableRefs.
| Policy | Behavior |
|---|---|
| "pickle_or_error" (default) | Pickle live refs if the binding supports it; error otherwise. |
| "pickle_or_serialize" | Pickle if possible; fall back to user-supplied byte serialization. |
| "warn_drop" | Log a warning and drop live refs from the snapshot. |
| "hard_error" | Always error if any live refs are present. |
PascalCase spellings (PickleOrError, PickleOrSerialize, WarnDrop, HardError) are also accepted.
WorkflowBuilder exposes the same method (chainable, returns WorkflowBuilder).
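As an illustration of the accepted spellings, the following sketch maps both forms onto the snake_case names. `normalizePausePolicy` is a hypothetical helper, and throwing on an unrecognized string is an assumption about how invalid input is handled.

```typescript
type SessionPausePolicy =
  | "pickle_or_error"
  | "pickle_or_serialize"
  | "warn_drop"
  | "hard_error";

// Both documented spellings of each policy map to the snake_case form.
const POLICY_SPELLINGS = new Map<string, SessionPausePolicy>([
  ["pickle_or_error", "pickle_or_error"],
  ["PickleOrError", "pickle_or_error"],
  ["pickle_or_serialize", "pickle_or_serialize"],
  ["PickleOrSerialize", "pickle_or_serialize"],
  ["warn_drop", "warn_drop"],
  ["WarnDrop", "warn_drop"],
  ["hard_error", "hard_error"],
  ["HardError", "hard_error"],
]);

function normalizePausePolicy(input?: string): SessionPausePolicy {
  if (input === undefined) return "pickle_or_error"; // documented default
  const policy = POLICY_SPELLINGS.get(input);
  if (policy === undefined) throw new Error(`unknown session pause policy: ${input}`);
  return policy;
}

const fromPascal = normalizePausePolicy("WarnDrop");
const fromSnake = normalizePausePolicy("pickle_or_serialize");
const defaultPolicy = normalizePausePolicy();
```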
await wf.resumeWithSerializableRefs(snapshot: any, deserializers: Record<string, (bytes: Uint8Array) => unknown>): Promise<WorkflowHandler>
Resume a workflow from a snapshot whose __blazen_serialized_session_refs sidecar carries JS-serialized session refs. The deserializers object maps type_tag strings to (bytes: Uint8Array) => unknown callbacks. For every entry in the sidecar whose tag appears in deserializers, the callback is invoked synchronously with the captured bytes; the return value is ignored (callbacks should populate any application state they need).
The snapshot’s bytes are also exposed inside step handlers via ctx.getSessionRefSerializable(key) after resume, mirroring the Node binding’s path.
const handler = await wf.resumeWithSerializableRefs(snapshot, {
'app::EmbeddingHandle': (bytes) => {
myStore.rehydrate(bytes); // populate user-side state
},
});
const result = await handler.awaitResult();
Snapshots without serialized session refs work fine with resumeFromSnapshot(); this method is only required when the original pause used SessionPausePolicy::PickleOrSerialize.
WorkflowHandler
Live handle to an in-flight workflow run, returned by runHandler() / runWithHandler() / resumeFromSnapshot() / resumeWithSerializableRefs(). The handler lets JS callers drive a workflow run beyond the simple “fire and forget” pattern of run().
Methods
| Method | Signature | Description |
|---|---|---|
| awaitResult | () => Promise<any> | Await the workflow’s terminal payload. Consumes the inner handler. |
| pause | () => Promise<any> | Park the event loop and capture a quiescent snapshot. |
| snapshot | () => Promise<any> | Capture the current snapshot without halting the loop. Use this for logging or telemetry; pair pause() -> snapshot() -> resumeInPlace() for a quiescent view. |
| resumeInPlace | () => void | Resume a paused event loop. The same handler instance remains valid for awaitResult / nextEvent etc. |
| cancel | () => void | Tear down the event loop. Best-effort; errors if the loop has already exited. |
| abort | () => void | Pure alias for cancel(). Matches JsWorkflowHandler::abort in the Node bindings; use whichever name reads better. |
| runId | () => Promise<any> | Return the run’s UUID as a string. First call captures a snapshot to read it, then caches. |
| nextEvent | () => Promise<any> | Pull the next event from the broadcast stream. Resolves with null when the stream closes. |
| streamEvents | (callback: (event: any) => void) => Promise<void> | Subscribe to the broadcast stream and forward each event to a JS callback until the stream closes. Mirrors the Node binding. A single Promise drives the subscription, so there is no need to wrap repeated nextEvent() calls. |
| respondToInput | (requestId: string, response: any) => void | Deliver a human-in-the-loop response to a workflow that auto-parked on an InputRequestEvent. Pass the matching request_id and a JSON-serializable response value. |
Streaming events
const handler = await wf.runWithHandler({ topic: 'WASM' });
await handler.streamEvents((event) => {
console.log(event.event_type, event.data);
});
const result = await handler.awaitResult();
Events emitted before streamEvents() is called are not replayed — call it before awaitResult() to avoid races.
Pause and snapshot
const handler = await wf.runWithHandler({ topic: 'WASM' });
await handler.pause();
const snap = await handler.snapshot();
localStorage.setItem('snap', JSON.stringify(snap));
handler.resumeInPlace();
const result = await handler.awaitResult();
Human-in-the-loop
const handler = await wf.runWithHandler({});
const event = await handler.nextEvent();
if (event?.event_type === 'InputRequestEvent') {
const userInput = await prompt(event.data.prompt);
handler.respondToInput(event.data.request_id, { answer: userInput });
}
const result = await handler.awaitResult();
Pipeline
Pipelines compose multiple Workflows into a sequential or parallel chain. Each Stage wraps one workflow plus optional input-mapping and conditional-execution callbacks; the resulting Pipeline runs the stages in order, threading the previous stage’s output into the next stage’s input.
new PipelineBuilder(name: string)
Construct a new builder with the given pipeline name.
Methods
| Method | Signature | Description |
|---|---|---|
| pipelineBuilder.stage(stage) | (stage: Stage) => void | Append a sequential stage. Consumes the stage: the same Stage instance cannot be added to two pipelines. |
| pipelineBuilder.parallel(parallel) | (parallel: ParallelStage) => void | Append a parallel stage that fans out across multiple branches. |
| pipelineBuilder.timeoutPerStage(seconds) | (seconds: number) => void | Set a per-stage timeout. Exceeding it surfaces as a stage failure with WorkflowError::Timeout. |
| pipelineBuilder.onPersist(callback) | (callback: (snapshot: any) => Promise<void>) => void | Persist callback that receives a typed PipelineSnapshot (serialized to a JS object via serde-wasm-bindgen) after each stage completes. The engine awaits the returned Promise before continuing. |
| pipelineBuilder.onPersistJson(callback) | (callback: (json: string) => Promise<void>) => void | Persist callback that receives the snapshot serialized as a JSON string. The engine awaits the returned Promise before continuing. |
| pipelineBuilder.build() | () => Pipeline | Finalize and return a runnable Pipeline. |
IndexedDB persistence example
const builder = new PipelineBuilder('research');
builder.stage(new Stage('outline', outlineWf));
builder.stage(new Stage('draft', draftWf, (state) => ({ outline: state.outline })));
builder.onPersistJson(async (json) => {
const db = await openDb();
await db.put('snapshots', { id: 'research', json });
});
const pipeline = builder.build();
Stage
new Stage(
name: string,
workflow: Workflow,
input_mapper?: ((state: BlazenState) => unknown) | null,
condition?: ((state: BlazenState) => boolean) | null,
);
input_mapper is an optional (state: BlazenState) => unknown JS callable invoked before the stage runs. Its return value becomes the workflow’s input. When null / undefined, the previous stage’s output (or the pipeline input for the first stage) is passed through directly.
condition is an optional (state: BlazenState) => boolean JS callable that decides whether the stage runs. When null / undefined the stage always runs; when the callable returns false the stage is skipped (its StageResult.skipped is true and output is null).
const stage = new Stage(
'summarize',
summarizeWf,
(state) => ({ text: state.draft, maxWords: 100 }),
(state) => state.draft != null && state.draft.length > 200,
);
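The interplay of input_mapper and condition can be sketched with a synchronous stand-in for the pipeline loop. This is not the engine: `runStages` is hypothetical, workflows are replaced by plain functions, and two points are assumptions, namely that state is keyed by stage name and that a skipped stage leaves the previous output in place for the next stage.

```typescript
type PipelineState = Record<string, unknown>;

interface StageSketch {
  name: string;
  run: (input: unknown) => unknown; // stands in for a Workflow
  inputMapper?: (state: PipelineState) => unknown;
  condition?: (state: PipelineState) => boolean;
}

interface StageResultSketch {
  name: string;
  skipped: boolean;
  output: unknown;
}

function runStages(stages: StageSketch[], pipelineInput: unknown): StageResultSketch[] {
  const state: PipelineState = {};
  const results: StageResultSketch[] = [];
  let previousOutput: unknown = pipelineInput;
  for (const stage of stages) {
    // A false condition skips the stage: skipped is true and output is null.
    if (stage.condition && !stage.condition(state)) {
      results.push({ name: stage.name, skipped: true, output: null });
      continue;
    }
    // Without a mapper, the previous output (or the pipeline input) passes through.
    const input = stage.inputMapper ? stage.inputMapper(state) : previousOutput;
    const output = stage.run(input);
    state[stage.name] = output; // assumed: state is keyed by stage name
    previousOutput = output;
    results.push({ name: stage.name, skipped: false, output });
  }
  return results;
}

const stageResults = runStages(
  [
    { name: "draft", run: () => "short draft" },
    {
      name: "summarize",
      run: (input) => input,
      condition: (state) => {
        const draft = state["draft"];
        return typeof draft === "string" && draft.length > 200;
      },
    },
    {
      name: "publish",
      run: (input) => input,
      inputMapper: (state) => ({ text: state["draft"] }),
    },
  ],
  { topic: "WASM" },
);
```

Here the draft is shorter than 200 characters, so the summarize stage is skipped and publish receives the mapped draft text.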
ParallelStage
new ParallelStage(name: string, branches: Stage[], join_strategy?: JoinStrategy | null);
Each branch is a Stage; branches execute concurrently and are joined according to JoinStrategy.WaitAll (default) or JoinStrategy.FirstCompletes. Branch Stage instances are consumed when the parallel stage is constructed.
import { ParallelStage, Stage, JoinStrategy } from '@blazen/sdk';
const fanOut = new ParallelStage(
'fan-out',
[new Stage('a', wfA), new Stage('b', wfB)],
JoinStrategy.WaitAll,
);
Context (WasmContext)
Shared workflow context accessible by all steps. Unlike the Node.js SDK, all methods are synchronous — no await needed.
StateValue
Values stored in the context can be any StateValue:
type StateValue = string | number | boolean | null | Uint8Array | StateValue[] | { [key: string]: StateValue };
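Because the union is recursive, it can be handy to validate a value before handing it to the context. The following is a minimal sketch of such a check. `isStateValue` is a hypothetical helper, not an SDK export, and it is deliberately strict: it only accepts plain objects, which may be narrower than what the runtime itself tolerates.

```typescript
// Mirrors the StateValue union documented above.
type StateValue =
  | string
  | number
  | boolean
  | null
  | Uint8Array
  | StateValue[]
  | { [key: string]: StateValue };

// Hypothetical helper (not part of @blazen/sdk): returns true only when
// `value` is built entirely from StateValue-compatible parts.
function isStateValue(value: unknown): value is StateValue {
  if (value === null) return true;
  const t = typeof value;
  if (t === 'string' || t === 'number' || t === 'boolean') return true;
  if (value instanceof Uint8Array) return true;
  if (Array.isArray(value)) return value.every((item) => isStateValue(item));
  if (t === 'object') {
    // Accept plain objects only; Dates, Maps, and class instances are rejected.
    const proto = Object.getPrototypeOf(value);
    if (proto !== Object.prototype && proto !== null) return false;
    const record = value as Record<string, unknown>;
    return Object.keys(record).every((k) => isStateValue(record[k]));
  }
  return false; // undefined, functions, symbols, bigint
}
```

A guard like this is mostly useful at the boundary where untyped data (for example, parsed JSON plus binary blobs) enters a step handler.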
Methods
| Method | Signature | Description |
|---|---|---|
ctx.set(key, value) | (key: string, value: StateValue) => void | Store a value. Auto-detects Uint8Array and stores it as binary; everything else is stored as-is. |
ctx.get(key) | (key: string) => StateValue | null | Retrieve a value. Returns Uint8Array for binary data, the original JsValue for everything else, or null if the key is missing. |
ctx.setBytes(key, data) | (key: string, data: Uint8Array) => void | Explicitly store binary data. |
ctx.getBytes(key) | (key: string) => Uint8Array | null | Retrieve binary data. Returns null if the key is missing. |
ctx.sendEvent(event) | (event: object) => void | Queue an event into the workflow event loop. |
ctx.writeEventToStream(event) | (event: object) => void | No-op in WASM. Present for API compatibility with the Node.js and Rust SDKs. |
ctx.runId() | () => string | Returns the unique UUID v4 for the current workflow run. |
ctx.insertSessionRefSerializable(typeName, bytes) | (typeName: string, bytes: Uint8Array) => string | Store an opaque, user-serialized payload in the session-ref registry under a fresh registry key. typeName is a stable identifier the caller chooses (e.g. "app::EmbeddingHandle"); it is captured into snapshot metadata along with the bytes when the workflow is paused under SessionPausePolicy::PickleOrSerialize. Returns the registry key as a string. JS code must serialize the value itself (typically into a Uint8Array) before calling and deserialize on retrieval. |
ctx.getSessionRefSerializable(key) | (key: string) => { typeName: string; bytes: Uint8Array } | null | Retrieve a previously inserted opaque payload. Returns null if the registry has no entry under key, or if the entry exists but was inserted via the non-serializable path (set / setBytes / language-specific live refs). |
The session-ref-serializable wire format is cross-binding compatible with the Node binding’s NodeSessionRefSerializable: the same typeName / bytes pair round-trips through a snapshot taken on one binding and resumed on the other.
// Inside a step handler.
const key = ctx.insertSessionRefSerializable(
'app::EmbeddingHandle',
new TextEncoder().encode(JSON.stringify({ id: 42 })),
);
ctx.state.set('embedding_key', key);
// In a later step (or after resume).
const stored = ctx.getSessionRefSerializable(ctx.state.get('embedding_key') as string);
if (stored) {
const obj = JSON.parse(new TextDecoder().decode(stored.bytes));
}
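Since callers must serialize and deserialize the payload themselves, a small pair of wrappers can keep the encoding and the typeName check in one place. This is a sketch under assumptions: `SessionRefContext`, `putJsonRef`, and `getJsonRef` are hypothetical names, the interface only restates the two method signatures from the table above, and JSON is just one possible encoding.

```typescript
// Minimal structural view of the two context methods documented above.
interface SessionRefContext {
  insertSessionRefSerializable(typeName: string, bytes: Uint8Array): string;
  getSessionRefSerializable(key: string): { typeName: string; bytes: Uint8Array } | null;
}

// Hypothetical helper: JSON-encode `value` and store it under `typeName`.
function putJsonRef(ctx: SessionRefContext, typeName: string, value: unknown): string {
  const bytes = new TextEncoder().encode(JSON.stringify(value));
  return ctx.insertSessionRefSerializable(typeName, bytes);
}

// Hypothetical helper: load the payload and refuse to decode a different type.
function getJsonRef<T>(ctx: SessionRefContext, key: string, expectedType: string): T | null {
  const stored = ctx.getSessionRefSerializable(key);
  if (stored === null) return null;
  if (stored.typeName !== expectedType) {
    throw new Error(`expected ${expectedType}, found ${stored.typeName}`);
  }
  return JSON.parse(new TextDecoder().decode(stored.bytes)) as T;
}
```

Checking the typeName on retrieval guards against decoding bytes written by a different part of the application under a reused key.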
Properties
| Property | Type | Description |
|---|---|---|
ctx.workflowName | string | Getter property returning the workflow name. |
ctx.state | StateNamespace | Getter returning the persistable workflow state namespace. Survives snapshotting when the WASM runner gains snapshot support. Routes through the same JS / bytes dispatch as ctx.set / ctx.get. |
ctx.session | SessionNamespace | Getter returning the live in-process JS reference namespace. Identity IS preserved within a single workflow run (unlike the Node bindings). Excluded from snapshots. |
StateNamespace
Persistable workflow state, accessed via ctx.state. Routes values through the same set / get / setBytes / getBytes dispatch as the legacy ctx.set / ctx.get, so anything stored here will survive snapshotting once the WASM runner gains snapshot support.
All methods are synchronous — the WASM runtime has no tokio.
Methods
| Method | Signature | Description |
|---|---|---|
state.set(key, value) | (key: string, value: unknown) => void | Store a value. Auto-detects Uint8Array and stores it as binary; everything else is stored as-is. |
state.get(key) | (key: string) => unknown | Retrieve a value. Returns Uint8Array for binary data, the original JsValue for everything else, or null if the key is missing. |
state.setBytes(key, data) | (key: string, data: Uint8Array) => void | Explicitly store binary data. |
state.getBytes(key) | (key: string) => Uint8Array | null | Retrieve binary data. Returns null if the key is missing. |
ctx.state.set("counter", 5);
const count = ctx.state.get("counter");
SessionNamespace
Live in-process JS references, accessed via ctx.session. Values are stored as raw JsValue in a separate map and are excluded from any snapshot.
Identity IS preserved within a run on WASM. Because the WASM runtime is single-threaded, session values are stored as raw JsValue and ctx.session.get(key) === obj holds after ctx.session.set(key, obj). This is a meaningful differentiator from the Node bindings, where identity is not preserved due to napi-rs threading constraints (values are round-tripped through serde_json::Value).
All methods are synchronous.
Methods
| Method | Signature | Description |
|---|---|---|
session.set(key, value) | (key: string, value: unknown) => void | Store a live JS reference under key. The value is kept as-is. |
session.get(key) | (key: string) => unknown | Retrieve the value previously stored under key. Returns null if missing. |
session.has(key) | (key: string) => boolean | Check whether a value exists under key. |
session.remove(key) | (key: string) => void | Remove the value stored under key. |
const conn = openConnection();
ctx.session.set("conn", conn);
console.log(ctx.session.get("conn") === conn); // true
WASM does not currently support cross-process snapshot/resume of session entries. Session values exist only within a single workflow run.
BlazenState
A protocol for structured per-field state storage in the WASM context. Objects carrying the __blazen_state__: true marker are automatically decomposed by ctx.set() and reconstructed by ctx.get().
BlazenStateMeta
Configuration is read from the object’s constructor via a static meta property.
interface BlazenStateMeta {
  /** Field names excluded from serialization (recreated via restore). */
  transient?: string[];
  /** Name of the method to call after reconstruction. */
  restore?: string;
}
Detection
Any JS object with the property __blazen_state__ set to a truthy value is treated as a BlazenState:
const state = new MyState();
state.__blazen_state__ = true; // enables the protocol
Per-field storage
When ctx.set(key, state) receives a BlazenState object:
- Each enumerable field is stored individually at {key}.{fieldName} (skipping the marker and transient fields).
- A metadata entry is written at {key}.__blazen_meta__ recording the field list, class name, transient array, and restore method name.
When ctx.get(key) finds a {key}.__blazen_meta__ entry:
- Each recorded field is loaded individually.
- The fields are assembled into a new plain object.
- The __blazen_state__ marker is set on the result.
- If a restore method name was recorded, that method is called on the reconstructed object.
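The store and load steps above can be sketched against a plain Map to make the key layout concrete. This is an illustration only, not the SDK's implementation: `decompose`, `reconstruct`, `Store`, and `StoredMeta` are hypothetical names. The text does not say how the restore method is located on a plain object, so this sketch takes an optional prototype argument to make that step work.

```typescript
// Hypothetical stand-in for the context's flat key/value storage.
type Store = Map<string, unknown>;

interface StateMeta {
  transient?: string[];
  restore?: string;
}

interface StoredMeta {
  fields: string[];
  className: string;
  transient: string[];
  restore: string | null;
}

// Write each enumerable field to `${key}.${field}`, then record the metadata entry.
function decompose(store: Store, key: string, state: object): void {
  const ctor = state.constructor as unknown as { name: string; meta?: StateMeta };
  const transient = ctor.meta?.transient ?? [];
  const fields = Object.keys(state).filter(
    (f) => f !== '__blazen_state__' && transient.indexOf(f) === -1,
  );
  for (const f of fields) {
    store.set(`${key}.${f}`, (state as Record<string, unknown>)[f]);
  }
  const meta: StoredMeta = {
    fields,
    className: ctor.name,
    transient,
    restore: ctor.meta?.restore ?? null,
  };
  store.set(`${key}.__blazen_meta__`, meta);
}

// Rebuild the object from its per-field entries. `proto` is an assumption of
// this sketch: it lets the named restore method be found on the result.
function reconstruct(
  store: Store,
  key: string,
  proto: object | null = null,
): Record<string, unknown> | null {
  const meta = store.get(`${key}.__blazen_meta__`) as StoredMeta | undefined;
  if (meta === undefined) return null;
  const result: Record<string, unknown> = Object.create(proto ?? Object.prototype);
  for (const f of meta.fields) result[f] = store.get(`${key}.${f}`);
  result['__blazen_state__'] = true;
  if (meta.restore !== null) {
    const restore = result[meta.restore];
    if (typeof restore === 'function') restore.call(result);
  }
  return result;
}
```

Storing fields individually means a single large field can be rewritten without reserializing the whole object, and transient fields never reach the store at all.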
restore()
The restore entry in meta is a string naming the method on the instance to call after reconstruction. This method receives no arguments and is called synchronously:
class MyState {
  dbPath = '';
  conn = null;
  static meta = {
    transient: ['conn'],
    restore: 'reconnect',
  };
  reconnect() {
    this.conn = openDb(this.dbPath);
  }
}
Synchronous execution
All BlazenState operations in WASM are synchronous. Unlike the Node.js SDK (where saveTo() / loadFrom() return Promises), the WASM context processes BlazenState objects inline during ctx.set() and ctx.get() — no await needed.
Events
Events are plain objects with a type field.
Start Event
{ type: 'blazen::StartEvent', ...input }
Stop Event
{ type: 'blazen::StopEvent', result: { ... } }
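Because events are plain objects, small constructors and guards are enough to keep the type strings in one place. The helpers below are a hypothetical sketch, not SDK exports. Note one design choice: `makeStartEvent` writes `type` after spreading the input, so an input that happens to contain its own `type` field cannot overwrite the event type.

```typescript
interface BlazenEvent {
  type: string;
  [key: string]: unknown;
}

const START_EVENT = 'blazen::StartEvent';
const STOP_EVENT = 'blazen::StopEvent';

// Hypothetical helper: `type` is written last so it always wins over input fields.
function makeStartEvent(input: Record<string, unknown>): BlazenEvent {
  return { ...input, type: START_EVENT };
}

// Hypothetical helper: wraps a result in the documented stop-event shape.
function makeStopEvent(result: unknown): BlazenEvent {
  return { type: STOP_EVENT, result };
}

function isStopEvent(event: BlazenEvent): boolean {
  return event.type === STOP_EVENT;
}
```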
EmbeddingModel
Generate vector embeddings from text. Created via static factory methods. The hosted-provider factory methods (openai, together, cohere, fireworks) take no arguments; API keys are read from environment variables (OPENAI_API_KEY, TOGETHER_API_KEY, COHERE_API_KEY, FIREWORKS_API_KEY).
import { EmbeddingModel } from '@blazen/sdk';
const model = EmbeddingModel.openai();
const together = EmbeddingModel.together();
const cohere = EmbeddingModel.cohere();
const fireworks = EmbeddingModel.fireworks();
Provider Factory Methods
| Method | Default Model | Default Dimensions |
|---|---|---|
EmbeddingModel.openai() | text-embedding-3-small | 1536 |
EmbeddingModel.together() | togethercomputer/m2-bert-80M-8k-retrieval | 768 |
EmbeddingModel.cohere() | embed-v4.0 | 1024 |
EmbeddingModel.fireworks() | nomic-ai/nomic-embed-text-v1.5 | 768 |
Properties
| Property | Type | Description |
|---|---|---|
.modelId | string | The model identifier. |
.dimensions | number | Output vector dimensionality. |
await model.embed(texts: string[]): Promise<number[][]>
Embed one or more texts, returning a nested array of float vectors.
const result = await model.embed(['Hello', 'World']);
console.log(result.length); // 2
console.log(result[0].length); // 1536
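Embedding vectors are usually compared with cosine similarity. The helper below is a minimal sketch of that calculation; `cosineSimilarity` is not an SDK export. It accepts any array-like of numbers, so it works with both number[] rows and Float32Array vectors.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

With the result of `model.embed(['Hello', 'World'])`, the two texts could be compared as `cosineSimilarity(result[0], result[1])`.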
EmbeddingModel.tract(modelUrl: string, tokenizerUrl: string, options?: TractOptions | null): Promise<EmbeddingModel>
Local embedding via tract-onnx — pure-Rust ONNX inference that runs entirely inside the WASM module with no JS libraries required. Both URLs are fetched via web_sys::fetch because the hf-hub crate is not available on wasm32; both endpoints must respond with CORS headers permitting the calling origin.
import { EmbeddingModel, TractOptions } from '@blazen/sdk';
const opts = new TractOptions();
opts.modelName = 'BGESmallENV15';
const embedder = await EmbeddingModel.tract(
'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/onnx/model.onnx',
'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/tokenizer.json',
opts,
);
const vecs = await embedder.embed(['Hello world']);
TractEmbedModel
Standalone wasm-only ONNX embedding model. The same backend powers EmbeddingModel.tract(...), but TractEmbedModel is exposed directly for callers who want the typed class without going through the generic EmbeddingModel factory.
TractEmbedModel.create(modelUrl: string, tokenizerUrl: string, options?: TractOptions | null): Promise<TractEmbedModel>
Async constructor. The ONNX weights and tokenizer.json are fetched over HTTP via web_sys::fetch (the hf-hub crate doesn’t compile to wasm32). modelUrl should point to a raw ONNX protobuf; tokenizerUrl to a HuggingFace-format tokenizer.json. Both URLs must be CORS-enabled.
import { TractEmbedModel, TractOptions } from '@blazen/sdk';
const opts = new TractOptions();
opts.modelName = 'BGESmallENV15';
opts.maxBatchSize = 32;
const model = await TractEmbedModel.create(
'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/onnx/model.onnx',
'https://huggingface.co/Xenova/bge-small-en-v1.5/resolve/main/tokenizer.json',
opts,
);
console.log(model.modelId, model.dimensions);
const vectors = await model.embed(['hello', 'world']);
Properties
| Property | Type | Description |
|---|---|---|
.modelId | string | The Hugging Face model id this instance was loaded from. |
.dimensions | number | Output embedding dimensionality. |
await model.embed(texts: string[]): Promise<Float32Array[]>
Embed one or more texts. Returns an array of Float32Array vectors, one per input text.
MediaSource
Top-level type alias re-exporting ImageSource so the same Url / Base64 shape is reused across image, audio, video, and file modalities:
type MediaSource = ImageSource;
Use MediaSource in your own type annotations whenever the modality is generic; the runtime shape is identical to ImageSource.
Token Estimation
Lightweight token counting functions available without external data files.
estimateTokens(text: string, contextSize?: number): number
Estimate token count for a string (~3.5 characters per token).
import { estimateTokens } from '@blazen/sdk';
const count = estimateTokens('Hello, world!'); // 4
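The "~3.5 characters per token" heuristic can be reproduced in a few lines, which is useful for sanity-checking estimates outside the WASM module. This is a sketch, not the SDK's code: rounding up is an assumption that happens to agree with the example above, and the real function may count characters differently (for instance by bytes or code points).

```typescript
// Assumed constant, taken from the "~3.5 characters per token" note above.
const CHARS_PER_TOKEN = 3.5;

// Hypothetical re-implementation of the heuristic; rounds up to a whole token.
function approxTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}
```

For 'Hello, world!' (13 characters) this gives 13 / 3.5 ≈ 3.71, which rounds up to 4.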
countMessageTokens(messages: object[], contextSize?: number): number
Estimate total tokens for an array of chat messages (plain objects with role and content fields). Includes per-message overhead.
import { countMessageTokens } from '@blazen/sdk';
const count = countMessageTokens([
{ role: 'system', content: 'You are helpful.' },
{ role: 'user', content: 'Hello!' },
]);
contextSize defaults to 128000 if omitted.
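A common use of a message-level estimate is trimming chat history to fit a token budget. The sketch below shows one simple policy: drop the oldest non-system messages first. `trimToBudget` is a hypothetical helper, and the counter is passed in as a parameter so that countMessageTokens (or any other estimator) can be supplied by the caller.

```typescript
interface ChatLike {
  role: string;
  content: string;
}

type TokenCounter = (messages: ChatLike[]) => number;

// Drop the oldest non-system messages until the estimate fits the budget.
// The input array is not modified.
function trimToBudget(messages: ChatLike[], budget: number, count: TokenCounter): ChatLike[] {
  const kept = [...messages];
  while (kept.length > 0 && count(kept) > budget) {
    const idx = kept.findIndex((m) => m.role !== 'system');
    if (idx === -1) break; // only system messages remain
    kept.splice(idx, 1);
  }
  return kept;
}
```

In application code the counter would typically be `(m) => countMessageTokens(m)`, with the budget set somewhat below the model's context size to leave room for the reply.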
Custom Providers via JS Handlers
CompletionModel and EmbeddingModel can be created from JavaScript handler functions using static factory methods. This lets you implement custom providers without subclassing.
CompletionModel.fromJsHandler
const model = CompletionModel.fromJsHandler("my-llm", async (request) => {
  // request contains messages, tools, temperature, etc.
  return {
    content: "Hello from my custom model",
    model: "my-llm",
  };
});
const response = await model.complete([ChatMessage.user("Hi")]);
The handler receives a request object and should return a CompletionResponse-shaped object.
EmbeddingModel.fromJsHandler
const embedder = EmbeddingModel.fromJsHandler("my-embedder", 128, async (texts) => {
return texts.map(() => new Array(128).fill(0.1));
});
const result = await embedder.embed(["Hello", "World"]);
The handler receives a string[] and should return a number[][] of embeddings.
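For tests and offline demos, a deterministic handler is often more useful than constant vectors. The following sketch hashes each whitespace-separated token into a bucket and L2-normalizes the counts. `hashEmbedding` and `makeHashHandler` are hypothetical names, and the vectors carry no semantic meaning beyond shared tokens.

```typescript
// Deterministic, dependency-free stand-in for a real embedding model.
function hashEmbedding(text: string, dims: number): number[] {
  const vec = new Array<number>(dims).fill(0);
  const tokens = text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  for (const token of tokens) {
    // FNV-1a-style 32-bit hash over UTF-16 code units selects a bucket.
    let h = 0x811c9dc5;
    for (let i = 0; i < token.length; i++) {
      h ^= token.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    vec[h % dims] += 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vec : vec.map((v) => v / norm);
}

// Builds a handler with the (texts: string[]) => Promise<number[][]> shape
// described above.
const makeHashHandler = (dims: number) =>
  async (texts: string[]): Promise<number[][]> =>
    texts.map((t) => hashEmbedding(t, dims));
```

Such a handler could then be registered with `EmbeddingModel.fromJsHandler("hash-embedder", 128, makeHashHandler(128))`, keeping the declared dimension and the handler's output length in sync.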
Per-Capability Provider Classes
Seven provider classes let you implement a single compute capability by passing handler functions to the constructor.
| Class | Constructor Handler | Description |
|---|---|---|
TTSProvider | (request) => Promise<any> | Text-to-speech synthesis |
MusicProvider | { generateMusic, generateSfx } | Music and sound effect generation |
ImageProvider | { generateImage, upscaleImage } | Image generation and upscaling |
VideoProvider | { textToVideo, imageToVideo } | Video generation |
ThreeDProvider | (request) => Promise<any> | 3D model generation |
BackgroundRemovalProvider | (request) => Promise<any> | Background removal |
VoiceProvider | { cloneVoice, listVoices, deleteVoice } | Voice cloning and management |
Constructor
Single-method providers take a provider ID and a handler function:
const tts = new TTSProvider("elevenlabs", async (request) => {
const audio = await elevenlabs.textToSpeech(request);
return { audioData: audio, format: "mp3" };
});
const result = await tts.textToSpeech({ text: "Hello world", voice: "alice" });
Multi-method providers take a provider ID and a handlers object:
const music = new MusicProvider("suno", {
  generateMusic: async (request) => { /* ... */ },
  generateSfx: async (request) => { /* ... */ },
});
const image = new ImageProvider("dalle", {
  generateImage: async (request) => { /* ... */ },
  upscaleImage: async (request) => { /* ... */ },
});
const video = new VideoProvider("runway", {
  textToVideo: async (request) => { /* ... */ },
  imageToVideo: async (request) => { /* ... */ },
});
const voice = new VoiceProvider("elevenlabs", {
  cloneVoice: async (request) => { /* ... */ },
  listVoices: async () => { /* ... */ },
  deleteVoice: async (voiceId) => { /* ... */ },
});
MemoryBackend
Custom memory storage backends are created by passing handler functions to the MemoryBackend constructor.
const backend = new MemoryBackend({
  put: async (entry) => { /* store entry */ },
  get: async (id) => { /* retrieve by id, return null if missing */ },
  delete: async (id) => { /* delete by id, return true if existed */ },
  list: async () => { /* return all entries */ },
  len: async () => { /* return entry count */ },
  searchByBands: async (bands, limit) => { /* return candidates */ },
});
Handler Methods
| Method | Signature | Description |
|---|---|---|
put | (entry: any) => Promise<void> | Insert or update a stored entry. |
get | (id: string) => Promise<any | null> | Retrieve a stored entry by id. |
delete | (id: string) => Promise<boolean> | Delete an entry by id. Returns true if it existed. |
list | () => Promise<any[]> | Return all stored entries. |
len | () => Promise<number> | Return the number of stored entries. |
searchByBands | (bands: any, limit: number) => Promise<any[]> | Return candidate entries sharing at least one LSH band. |
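As an illustration of the six handlers, here is a minimal Map-backed sketch. The entry shape is an assumption: the reference types entries as any, so the `id` and `bands` field names in `EntryLike` are illustrative, and `MapEntryStore` and `toHandlers` are hypothetical names. The logic is kept synchronous so it is easy to unit test, with thin async wrappers on top.

```typescript
// Assumed entry shape; `id` and `bands` are illustrative field names.
interface EntryLike {
  id: string;
  bands: string[];
  [key: string]: unknown;
}

class MapEntryStore {
  private readonly entries = new Map<string, EntryLike>();

  put(entry: EntryLike): void { this.entries.set(entry.id, entry); }
  get(id: string): EntryLike | null { return this.entries.get(id) ?? null; }
  delete(id: string): boolean { return this.entries.delete(id); }
  list(): EntryLike[] { return Array.from(this.entries.values()); }
  len(): number { return this.entries.size; }

  // Candidates are entries sharing at least one band with the query.
  searchByBands(bands: string[], limit: number): EntryLike[] {
    const wanted = new Set(bands);
    return this.list()
      .filter((e) => e.bands.some((b) => wanted.has(b)))
      .slice(0, limit);
  }
}

// Thin async wrappers matching the handler signatures in the table above.
function toHandlers(store: MapEntryStore) {
  return {
    put: async (entry: EntryLike) => store.put(entry),
    get: async (id: string) => store.get(id),
    delete: async (id: string) => store.delete(id),
    list: async () => store.list(),
    len: async () => store.len(),
    searchByBands: async (bands: string[], limit: number) =>
      store.searchByBands(bands, limit),
  };
}
```

The result of `toHandlers(new MapEntryStore())` has the same keys as the object passed to the MemoryBackend constructor above.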
InMemoryBackend
A typed, Rust-native in-memory MemoryBackend implementation. Unlike MemoryBackend (which round-trips every call through user-supplied JS callbacks), InMemoryBackend keeps reads and writes inside the WASM linear memory — no JS overhead per call.
import { InMemoryBackend, Memory, EmbeddingModel } from '@blazen/sdk';
const backend = new InMemoryBackend();
const embedder = EmbeddingModel.openai();
const memory = Memory.fromBackend(embedder, backend);
await memory.add('doc1', 'hello world', null);
Methods
| Method | Signature | Description |
|---|---|---|
put | (entry: WasmStoredEntry) => Promise<void> | Insert or update a stored entry. |
get | (id: string) => Promise<WasmStoredEntry | null> | Retrieve a stored entry by id. |
delete | (id: string) => Promise<boolean> | Delete an entry by id. Returns true if it existed. |
list | () => Promise<WasmStoredEntry[]> | Return all stored entries. |
len | () => Promise<number> | Return the number of stored entries. |
isEmpty | () => Promise<boolean> | true if the backend contains no entries. |
searchByBands | (bands: string[], limit: number) => Promise<WasmStoredEntry[]> | Return candidate entries sharing at least one LSH band. |
Memory factory methods
| Factory | Signature | Description |
|---|---|---|
Memory.fromBackend | (embedder: EmbeddingModel, backend: InMemoryBackend) => Memory | Full-mode memory (embedding-based search) backed by a typed InMemoryBackend. |
Memory.localFromBackend | (backend: InMemoryBackend) => Memory | Local-only mode (SimHash only, searchLocal() available; search() rejects) backed by a typed InMemoryBackend. |
const localMem = Memory.localFromBackend(new InMemoryBackend());
await localMem.add('doc', 'hello world', null);
const hits = await localMem.searchLocal('hello', 5, null);
MemoryResult
Standalone class representing a single result returned by Memory.search() / Memory.searchLocal(). Exposed primarily as a typed return value for downstream code that wants to construct MemoryResults from JS (e.g. when implementing a custom MemoryStore).
Constructor
new MemoryResult(id: string, text: string, score: number, metadata: any);
Properties
| Property | Type | Description |
|---|---|---|
.id | string | The entry identifier. |
.text | string | The stored text content. |
.score | number | Similarity score in [0, 1], higher means more similar. |
.metadata | any | Arbitrary metadata, decoded from JSON to a JS value. |
ModelManager
Per-pool memory budget-aware model manager with LRU eviction. Not typically used in WASM (where GPU model loading is uncommon), but available for tracking model state across CPU and GPU pools.
ModelManager is a memory budget bookkeeper, not a performance scheduler. It answers “will this fit?”, not “will this run fast?”. Whether a 70B model loaded on CPU is useful at 1–3 tok/s is a workload-choice question the manager intentionally does not answer.
Backed by the real blazen_manager::ModelManager. Method names match the native and Node bindings; the WASM constructor takes one or two positional number arguments in gigabytes (no options object). Unlike the Node binding, WASM byte-quantity getters return plain number (f64), so there is no BigInt migration on this surface: JS doubles carry 53 bits of mantissa, more than enough for any realistic memory budget.
Constructor
const manager = new ModelManager(8); // 8 GB CPU pool budget; no GPU pool
const manager = new ModelManager(8, 24); // 8 GB CPU pool + 24 GB GPU pool (rare for WASM, but supported)
| Argument | Type | Description |
|---|---|---|
cpuRamGb | number | Host RAM budget in gigabytes for the "cpu" pool. Converted to bytes internally (cpuRamGb * 1_073_741_824). |
gpuVramGb | number? | Optional VRAM budget in gigabytes for the "gpu:0" pool. Omit if your app only loads CPU models (the common case in browsers). |
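The gigabyte-to-byte conversion quoted in the table is simple enough to mirror in application code, for example when computing a memoryEstimateBytes value in the same units. This is a sketch: `gbToBytes` is a hypothetical helper, and the input validation is this example's own choice rather than documented SDK behavior.

```typescript
const BYTES_PER_GB = 1_073_741_824; // 2^30, matching the conversion quoted above

// Hypothetical helper: convert a gigabyte budget to bytes.
function gbToBytes(gb: number): number {
  if (!Number.isFinite(gb) || gb < 0) {
    throw new RangeError(`invalid budget: ${gb}`);
  }
  return gb * BYTES_PER_GB;
}
```

An 8 GB budget is 8,589,934,592 bytes and a 24 GB budget is 25,769,803,776 bytes, both far below the 2^53 limit for exactly representable integers in a JS number.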
Methods
| Method | Signature | Description |
|---|---|---|
register | await manager.register(id, model, memoryEstimateBytes, lifecycle) | Register a model with its estimated memory footprint (memoryEstimateBytes: number bytes — host RAM if on CPU, GPU VRAM otherwise) and a JS lifecycle object. The lifecycle accepts async load() / unload() / isLoaded() plus optional memoryBytes() (async) and device() (sync — returns the pool string, defaults to "cpu"). |
load | await manager.load(id) | Load a model, evicting LRU models in the same pool if needed. |
unload | await manager.unload(id) | Unload a model and free its memory. |
isLoaded | await manager.isLoaded(id): boolean | Check if a model is currently loaded. |
ensureLoaded | await manager.ensureLoaded(id) | Alias for load(). |
usedBytes | await manager.usedBytes(pool?: string): number | Bytes currently used by loaded models in the given pool. pool defaults to "cpu". Invalid pool labels reject with invalid pool label '<x>': expected 'cpu', 'gpu', or 'gpu:N' where N is a non-negative integer. |
availableBytes | await manager.availableBytes(pool?: string): number | Bytes still available within the given pool’s budget. Same default and validation as usedBytes. |
pools | manager.pools(): Array<{ pool: string; budgetBytes: number }> | Sync. List every configured pool and its byte budget. |
status | await manager.status(): { id: string; loaded: boolean; memoryEstimateBytes: number; pool: string }[] | Status of all registered models. |
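The lifecycle object passed to register can be sketched as follows. The `ModelLifecycle` interface is this example's reading of the register description above (async load / unload / isLoaded, optional async memoryBytes, optional sync device), and `makeLifecycle` is a hypothetical factory. Treat the exact return types as assumptions.

```typescript
interface ModelLifecycle {
  load(): Promise<void>;
  unload(): Promise<void>;
  isLoaded(): Promise<boolean>;
  memoryBytes?(): Promise<number>;
  device?(): string;
}

// Hypothetical factory: wraps caller-supplied load/unload callbacks and
// tracks the loaded flag so repeated calls are idempotent.
function makeLifecycle(opts: {
  load: () => Promise<void>;
  unload: () => Promise<void>;
  bytes: number;
  pool?: string;
}): ModelLifecycle {
  let loaded = false;
  return {
    async load() {
      if (!loaded) {
        await opts.load();
        loaded = true;
      }
    },
    async unload() {
      if (loaded) {
        await opts.unload();
        loaded = false;
      }
    },
    async isLoaded() {
      return loaded;
    },
    async memoryBytes() {
      return opts.bytes;
    },
    device() {
      return opts.pool ?? 'cpu'; // "cpu" is the documented default pool
    },
  };
}
```

The returned object would be passed as the lifecycle argument of `manager.register(...)`, with `opts.bytes` matching the memoryEstimateBytes given for the same model.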
Properties
| Property | Type | Description |
|---|---|---|
.budgetBytes | number | Read-only getter returning the configured CPU pool budget in bytes (cpuRamGb * 1_073_741_824). For GPU pool budgets, use pools(). |
ModelRegistry
JS-callback ABC for advertising a model catalog. Wraps a JS object implementing listModels() and getModel(modelId) so browser code can plug a custom registry into Blazen’s model-info lookup surface. Mirrors the trait at blazen_llm::traits::ModelRegistry and reaches parity with PyModelRegistry (Python) / JsModelRegistry (Node).
Constructor
import init, { ModelRegistry } from "@blazen/sdk";
import type { ModelInfo } from "@blazen/sdk";
await init();
const registry = new ModelRegistry({
  async listModels(): Promise<ModelInfo[]> {
    const res = await fetch("/api/models");
    return res.json();
  },
  async getModel(modelId: string): Promise<ModelInfo | null> {
    const res = await fetch(`/api/models/${modelId}`);
    return res.ok ? res.json() : null;
  },
});
const models = await registry.listModels();
The constructor argument must implement the ModelRegistryImpl interface (auto-emitted into crates/blazen-wasm-sdk/pkg/blazen_wasm_sdk.d.ts):
export interface ModelRegistryImpl {
listModels(): Promise<ModelInfo[]> | ModelInfo[];
getModel(modelId: string): Promise<ModelInfo | null> | ModelInfo | null;
}
Both methods may return either a Promise or a synchronous value; the binding awaits whichever is produced.
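Since synchronous return values are accepted, a fixed catalog needs no network calls at all. The sketch below builds such an implementation from an array. `staticRegistry` is a hypothetical helper, and the id extractor is passed in explicitly so the sketch does not have to assume which ModelInfo field holds the identifier.

```typescript
// Hypothetical helper: an in-memory object with the listModels/getModel shape.
function staticRegistry<T>(models: readonly T[], idOf: (model: T) => string) {
  const byId = new Map<string, T>();
  for (const m of models) byId.set(idOf(m), m);
  return {
    // Both methods return synchronously, which the binding is documented to accept.
    listModels: (): T[] => Array.from(byId.values()),
    getModel: (modelId: string): T | null => byId.get(modelId) ?? null,
  };
}
```

Assuming the catalog entries are ModelInfo objects, the result could be passed directly to `new ModelRegistry(...)`.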
Methods
| Method | Signature | Description |
|---|---|---|
listModels | await registry.listModels(): Promise<ModelInfo[]> | Returns whatever the JS listModels() callback resolved to. |
getModel | await registry.getModel(modelId: string): Promise<ModelInfo | null> | Returns whatever the JS getModel() callback resolved to, or null if the model is unknown. |
The registry returns plain ModelInfo objects — the same tsify-generated shape produced elsewhere on the WASM surface and documented in crates/blazen-wasm-sdk/pkg/blazen_wasm_sdk.d.ts.
Pricing Functions
registerPricing()
Register custom pricing for a model.
import { registerPricing } from "@blazen/sdk";
registerPricing("my-model", 1.0, 2.0);
// Arguments: modelId, inputPerMillion, outputPerMillion
lookupPricing()
Look up pricing for a model by ID. Returns null if the model is unknown.
import { lookupPricing } from "@blazen/sdk";
const pricing = lookupPricing("gpt-4o");
if (pricing) {
console.log(`Input: $${pricing.inputPerMillion}/M tokens`);
}
The returned object has the shape:
{
inputPerMillion: number;
outputPerMillion: number;
}
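Given that shape, the cost of a request is a straightforward per-million calculation. The helper below is a minimal sketch; `estimateCostUsd` is a hypothetical name, and it assumes the two rates are expressed in dollars per million tokens, as the `$` in the example above suggests.

```typescript
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// cost = (inputTokens * inputRate + outputTokens * outputRate) / 1,000,000
function estimateCostUsd(pricing: Pricing, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens * pricing.inputPerMillion + outputTokens * pricing.outputPerMillion) /
    1_000_000
  );
}
```

With the rates registered earlier (1.0 input, 2.0 output), a call using 500,000 input tokens and 250,000 output tokens comes to (500,000 × 1.0 + 250,000 × 2.0) / 1,000,000 = 1.0.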
OTLP Telemetry
OpenTelemetry trace export over HTTP/protobuf, available behind the otlp-http Cargo feature on the blazen-wasm-sdk crate. The default opentelemetry-otlp/grpc-tonic transport is wasm-incompatible because tonic requires tokio networking that does not exist on wasm32, so the WASM build instead routes spans through a custom WasmFetchHttpClient that posts protobuf bodies via web_sys::fetch.
new OtlpConfig(endpoint: string, serviceName: string)
| Argument | Type | Description |
|---|---|---|
endpoint | string | Full HTTP/protobuf traces endpoint, e.g. "http://localhost:4318/v1/traces". |
serviceName | string | Reported to the backend as the service.name resource attribute. |
Read-only getters: .endpoint, .serviceName.
initOtlp(config: OtlpConfig): void
Install the global OTLP exporter and a tracing-subscriber stack with an OpenTelemetry layer. Must be called once at startup; subsequent calls fail because the global subscriber can only be installed a single time.
import init, { OtlpConfig, initOtlp } from '@blazen/sdk';
await init();
const cfg = new OtlpConfig('http://localhost:4318/v1/traces', 'my-wasm-app');
initOtlp(cfg);
// All subsequent workflow / pipeline / completion spans are exported.
If the collector is unreachable the export simply drops spans; it never blocks the calling workflow.
Error Handling
All errors are thrown as JavaScript Error objects. The message format indicates the category:
| Error Pattern | Description |
|---|---|
"authentication failed: ..." | Invalid or expired API key |
"rate limited" | Provider rate limit hit |
"timed out after {ms}ms" | Request timed out |
"{provider} error: ..." | Provider-specific error |
"invalid input: ..." | Validation error |
"unsupported: ..." | Feature not supported by provider |
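try {
  const response = await model.complete([ChatMessage.user('Hello')]);
} catch (e) {
  // e is typed unknown in strict TypeScript, so narrow it before reading message.
  if (e instanceof Error && e.message.startsWith('rate limited')) {
    // Back off and retry
  }
}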
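When several categories need handling, the message patterns in the table can be mapped to a small enum-like type in one place. This is a sketch under assumptions: `classifyError` and `isRetryable` are hypothetical helpers, the category names are this example's own, and treating only rate limits and timeouts as retryable is a policy choice rather than SDK behavior.

```typescript
type ErrorCategory =
  | 'auth'
  | 'rate_limit'
  | 'timeout'
  | 'invalid_input'
  | 'unsupported'
  | 'provider'
  | 'unknown';

// Map an error message to a category using the documented prefixes.
function classifyError(err: unknown): ErrorCategory {
  const message = err instanceof Error ? err.message : String(err);
  if (message.startsWith('authentication failed')) return 'auth';
  if (message.startsWith('rate limited')) return 'rate_limit';
  if (/^timed out after \d+ms/.test(message)) return 'timeout';
  if (message.startsWith('invalid input')) return 'invalid_input';
  if (message.startsWith('unsupported')) return 'unsupported';
  // Checked last because "{provider} error: ..." is the least specific pattern.
  if (/^\S+ error: /.test(message)) return 'provider';
  return 'unknown';
}

// Policy assumption: only transient failures are worth retrying.
function isRetryable(category: ErrorCategory): boolean {
  return category === 'rate_limit' || category === 'timeout';
}
```

Inside a catch block, `isRetryable(classifyError(e))` can then decide whether to back off and retry or to surface the error.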