Rust API Reference

Complete API reference for blazen-llm in Rust

Feature Flags

openai: Enables OpenAiProvider and OpenAiCompatProvider (covers OpenRouter, Groq, Together, Mistral, DeepSeek, Fireworks, Perplexity, xAI, Cohere, Bedrock)
anthropic: Enables AnthropicProvider
gemini: Enables GeminiProvider
fal: Enables FalProvider (compute: image, video, audio, 3D)
azure: Enables AzureOpenAiProvider
all-providers: Enables all provider implementations
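In Cargo.toml these map to the usual features list. A sketch (the version number is a placeholder):

```toml
# Hypothetical Cargo.toml entry; the version shown is a placeholder.
[dependencies]
blazen-llm = { version = "0.1", features = ["openai", "anthropic"] }

# Or enable every provider at once:
# blazen-llm = { version = "0.1", features = ["all-providers"] }
```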

Core LLM Traits

CompletionModel

The central trait every LLM provider must implement. Supports both one-shot and streaming completions.

#[async_trait]
pub trait CompletionModel: Send + Sync {
    fn model_id(&self) -> &str;

    async fn complete(
        &self,
        request: CompletionRequest,
    ) -> Result<CompletionResponse, BlazenError>;

    async fn stream(
        &self,
        request: CompletionRequest,
    ) -> Result<
        Pin<Box<dyn Stream<Item = Result<StreamChunk, BlazenError>> + Send>>,
        BlazenError,
    >;
}

Usage:

use blazen_llm::{CompletionModel, CompletionRequest, ChatMessage};
use blazen_llm::providers::openai::OpenAiProvider;

let model = OpenAiProvider::new("sk-...");
let request = CompletionRequest::new(vec![
    ChatMessage::user("What is 2 + 2?"),
]);
let response = model.complete(request).await?;
println!("{}", response.content.unwrap_or_default());

Streaming:

use futures_util::StreamExt;

let request = CompletionRequest::new(vec![
    ChatMessage::user("Tell me a story"),
]);
let mut stream = model.stream(request).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(delta) = &chunk.delta {
        print!("{delta}");
    }
}

StructuredOutput

Extract typed data from a model using JSON Schema constraints. This trait has a blanket implementation for every CompletionModel — providers do not need to implement it.

#[async_trait]
pub trait StructuredOutput: CompletionModel {
    async fn extract<T: JsonSchema + DeserializeOwned + Send>(
        &self,
        messages: Vec<ChatMessage>,
    ) -> Result<StructuredResponse<T>, BlazenError>;
}

// Blanket impl: every CompletionModel automatically gets this.
impl<M: CompletionModel> StructuredOutput for M {}

T must implement schemars::JsonSchema and serde::de::DeserializeOwned. The schema is derived at call time via schemars::schema_for! and injected into the request’s response_format.

Usage:

use schemars::JsonSchema;
use serde::Deserialize;
use blazen_llm::StructuredOutput;

#[derive(JsonSchema, Deserialize)]
struct Sentiment {
    label: String,
    score: f64,
}

let result = model.extract::<Sentiment>(vec![
    ChatMessage::user("Analyze sentiment: 'I love Rust'"),
]).await?;
println!("{}: {}", result.data.label, result.data.score);

EmbeddingModel

Produces vector embeddings for text inputs.

#[async_trait]
pub trait EmbeddingModel: Send + Sync {
    fn model_id(&self) -> &str;
    fn dimensions(&self) -> usize;
    async fn embed(&self, texts: &[String]) -> Result<EmbeddingResponse, BlazenError>;
}

Usage:

let texts = vec!["Hello world".into(), "Goodbye world".into()];
let response = embedding_model.embed(&texts).await?;
for (i, vector) in response.embeddings.iter().enumerate() {
    println!("text {i}: {} dimensions", vector.len());
}
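Once you have vectors, a common next step is similarity search. A minimal cosine-similarity helper (this is standard post-processing, not part of blazen-llm):

```rust
// Cosine similarity between two embedding vectors.
// Not part of blazen-llm; shown only as a typical consumer of EmbeddingResponse.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0 // degenerate vectors have no meaningful direction
    } else {
        dot / (norm_a * norm_b)
    }
}
```

Apply it pairwise to response.embeddings to rank inputs by semantic closeness.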

Tool

A callable tool that can be invoked by an LLM during a conversation.

#[async_trait]
pub trait Tool: Send + Sync {
    fn definition(&self) -> ToolDefinition;
    async fn execute(
        &self,
        arguments: serde_json::Value,
    ) -> Result<serde_json::Value, BlazenError>;
}

Usage:

use blazen_llm::{Tool, ToolDefinition, BlazenError};

struct WeatherTool;

#[async_trait::async_trait]
impl Tool for WeatherTool {
    fn definition(&self) -> ToolDefinition {
        ToolDefinition {
            name: "get_weather".into(),
            description: "Get the current weather for a city".into(),
            parameters: serde_json::json!({
                "type": "object",
                "properties": {
                    "city": { "type": "string" }
                },
                "required": ["city"]
            }),
        }
    }

    async fn execute(
        &self,
        arguments: serde_json::Value,
    ) -> Result<serde_json::Value, BlazenError> {
        let city = arguments["city"].as_str().unwrap_or("unknown");
        Ok(serde_json::json!({ "temp": 72, "city": city }))
    }
}

ModelRegistry

Allows providers to advertise their available models.

#[async_trait]
pub trait ModelRegistry: Send + Sync {
    async fn list_models(&self) -> Result<Vec<ModelInfo>, BlazenError>;
    async fn get_model(&self, model_id: &str) -> Result<Option<ModelInfo>, BlazenError>;
}

ModelInfo

id (String): Model identifier used in API requests (e.g. "gpt-4o")
name (Option<String>): Human-readable display name
provider (String): Provider that serves this model
context_length (Option<u64>): Maximum context window in tokens
pricing (Option<ModelPricing>): Pricing information
capabilities (ModelCapabilities): What this model can do

ModelPricing

input_per_million (Option<f64>): Cost per million input tokens (USD)
output_per_million (Option<f64>): Cost per million output tokens (USD)
per_image (Option<f64>): Cost per image (image generation models)
per_second (Option<f64>): Cost per second of compute
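The per-million rates combine with TokenUsage counts in the obvious way. A sketch of the arithmetic (the crate's own cost fields may also account for per-image or per-second pricing):

```rust
// Cost arithmetic implied by input_per_million / output_per_million.
// Illustrative only; not the crate's internal cost logic.
fn estimate_cost_usd(
    prompt_tokens: u32,
    completion_tokens: u32,
    input_per_million: f64,
    output_per_million: f64,
) -> f64 {
    prompt_tokens as f64 / 1_000_000.0 * input_per_million
        + completion_tokens as f64 / 1_000_000.0 * output_per_million
}
```

For example, 2,000 prompt tokens and 500 completion tokens at $3 / $15 per million come to $0.006 + $0.0075 = $0.0135.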

ModelCapabilities

chat (bool): Supports chat completions
streaming (bool): Supports streaming responses
tool_use (bool): Supports tool/function calling
structured_output (bool): Supports JSON schema constraints
vision (bool): Supports image inputs
image_generation (bool): Supports image generation
embeddings (bool): Supports text embeddings
video_generation (bool): Video generation support
text_to_speech (bool): Text-to-speech synthesis
speech_to_text (bool): Speech-to-text transcription
audio_generation (bool): Audio generation (music, SFX)
three_d_generation (bool): 3D model generation

Types

ChatMessage

A single message in a chat conversation.

role (Role): Who produced this message
content (MessageContent): The message payload

Constructors:

// Text messages
ChatMessage::system("You are a helpful assistant")
ChatMessage::user("Hello!")
ChatMessage::assistant("Hi there!")
ChatMessage::tool("{ \"result\": 42 }")

// Multimodal messages
ChatMessage::user_image_url("Describe this", "https://img.com/a.png", Some("image/png"))
ChatMessage::user_image_base64("What is this?", "iVBORw0K...", "image/jpeg")
ChatMessage::user_parts(vec![
    ContentPart::Text { text: "Look at this:".into() },
    ContentPart::Image(ImageContent {
        source: ImageSource::Url { url: "https://...".into() },
        media_type: Some("image/png".into()),
    }),
    ContentPart::File(FileContent {
        source: ImageSource::Url { url: "https://...".into() },
        media_type: "application/pdf".into(),
        filename: Some("doc.pdf".into()),
    }),
])

Role

pub enum Role {
    System,
    User,
    Assistant,
    Tool,
}

MessageContent

pub enum MessageContent {
    Text(String),
    Image(ImageContent),
    Parts(Vec<ContentPart>),
}

Methods:

as_text(&self) -> Option<&str>: Return the text if this is a Text variant
as_parts(&self) -> Vec<ContentPart>: Convert any variant into a Vec<ContentPart>
text_content(&self) -> Option<String>: Extract and concatenate all text content

MessageContent implements From<&str> and From<String>.


ContentPart

pub enum ContentPart {
    Text { text: String },
    Image(ImageContent),
    File(FileContent),
}

ImageContent

source (ImageSource): URL or base64 data
media_type (Option<String>): MIME type (e.g. "image/png")

ImageSource

pub enum ImageSource {
    Url { url: String },
    Base64 { data: String },
}

FileContent

source (ImageSource): URL or base64 data
media_type (String): MIME type (e.g. "application/pdf")
filename (Option<String>): Optional filename for display

CompletionRequest

A provider-agnostic request for a chat completion.

messages (Vec<ChatMessage>): The conversation history
tools (Vec<ToolDefinition>): Tools available for the model to invoke
temperature (Option<f32>): Sampling temperature (0.0 = deterministic, 2.0 = very random)
max_tokens (Option<u32>): Maximum number of tokens to generate
top_p (Option<f32>): Nucleus sampling parameter
response_format (Option<serde_json::Value>): JSON Schema for structured output
model (Option<String>): Override the provider's default model
modalities (Option<Vec<String>>): Output modalities (e.g. ["text"], ["image", "text"])
image_config (Option<serde_json::Value>): Image generation configuration (model-specific)
audio_config (Option<serde_json::Value>): Audio output configuration (voice, format, etc.)

Builder pattern:

let request = CompletionRequest::new(vec![ChatMessage::user("Hello")])
    .with_tools(tool_defs)
    .with_temperature(0.7)
    .with_max_tokens(1024)
    .with_top_p(0.9)
    .with_response_format(schema_json)
    .with_model("gpt-4o")
    .with_modalities(vec!["text".into(), "image".into()])
    .with_image_config(serde_json::json!({ "size": "1024x1024" }))
    .with_audio_config(serde_json::json!({ "voice": "alloy" }));

CompletionResponse

The result of a non-streaming chat completion.

content (Option<String>): Text content of the assistant's reply
tool_calls (Vec<ToolCall>): Tool invocations requested by the model
usage (Option<TokenUsage>): Token usage statistics
model (String): The model that produced this response
finish_reason (Option<String>): Why the model stopped (e.g. "stop", "tool_use")
cost (Option<f64>): Estimated cost in USD
timing (Option<RequestTiming>): Request timing breakdown
images (Vec<GeneratedImage>): Generated images (multimodal models)
audio (Vec<GeneratedAudio>): Generated audio (TTS / multimodal)
videos (Vec<GeneratedVideo>): Generated videos
metadata (serde_json::Value): Provider-specific metadata

StructuredResponse<T>

Response from structured output extraction, preserving metadata.

data (T): The extracted structured data
usage (Option<TokenUsage>): Token usage statistics
model (String): The model that produced this response
cost (Option<f64>): Estimated cost in USD
timing (Option<RequestTiming>): Request timing
metadata (serde_json::Value): Provider-specific metadata

EmbeddingResponse

Response from an embedding operation.

embeddings (Vec<Vec<f32>>): The embedding vectors (one per input text)
model (String): The model used
usage (Option<TokenUsage>): Token usage statistics
cost (Option<f64>): Estimated cost in USD
timing (Option<RequestTiming>): Request timing
metadata (serde_json::Value): Provider-specific metadata

RequestTiming

Timing metadata for a request.

queue_ms (Option<u64>): Time spent waiting in queue (ms)
execution_ms (Option<u64>): Time spent executing the request (ms)
total_ms (Option<u64>): Total wall-clock time from submit to response (ms)

TokenUsage

Token usage statistics for a completion request.

prompt_tokens (u32): Tokens in the prompt / input
completion_tokens (u32): Tokens in the completion / output
total_tokens (u32): Total tokens consumed (prompt + completion)

ToolDefinition

Describes a tool that the model may invoke.

name (String): Unique name of the tool
description (String): Human-readable description
parameters (serde_json::Value): JSON Schema describing the tool's input parameters

ToolCall

A tool invocation requested by the model.

id (String): Provider-assigned identifier for this invocation
name (String): Name of the tool to invoke
arguments (serde_json::Value): Arguments to pass, as JSON

StreamChunk

A single chunk from a streaming completion response.

delta (Option<String>): Incremental text content
tool_calls (Vec<ToolCall>): Tool invocations completed in this chunk
finish_reason (Option<String>): Present in the final chunk to indicate why generation stopped

Agent System

The agent system implements the standard LLM + tool calling loop: send messages with tool definitions, execute any tool calls the model makes, feed results back, and repeat until the model stops or max_iterations is reached.

run_agent()

Run the agent loop without event callbacks.

pub async fn run_agent(
    model: &dyn CompletionModel,
    messages: Vec<ChatMessage>,
    config: AgentConfig,
) -> Result<AgentResult, BlazenError>

run_agent_with_callback()

Run the agent loop, emitting AgentEvents to the supplied callback.

pub async fn run_agent_with_callback(
    model: &dyn CompletionModel,
    messages: Vec<ChatMessage>,
    config: AgentConfig,
    on_event: impl Fn(AgentEvent) + Send + Sync,
) -> Result<AgentResult, BlazenError>

The loop works as follows:

  1. Build a CompletionRequest with the full message history and all tool definitions.
  2. Call the model.
  3. If the model responds with no tool calls, return immediately.
  4. If the model invoked the built-in “finish” tool (when enabled), extract the answer and return.
  5. Otherwise, execute each tool call, append results to messages, go back to step 1.
  6. If max_iterations is reached, make one final call without tools to force a text answer.
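The steps above can be sketched with stand-in types (synchronous and much simpler than the crate's real async signatures; this shows the control flow only):

```rust
// Stand-in types: the real crate uses ChatMessage, CompletionResponse, ToolCall.
struct SketchToolCall {
    name: String,
    arguments: String,
}

struct SketchResponse {
    content: Option<String>,
    tool_calls: Vec<SketchToolCall>,
}

// Control-flow sketch of run_agent: call the model, execute any tool calls,
// feed the results back, and stop when the model answers or the budget runs out.
fn run_agent_sketch(
    mut call_model: impl FnMut(&[String]) -> SketchResponse, // stands in for CompletionModel::complete
    mut execute_tool: impl FnMut(&SketchToolCall) -> String, // stands in for Tool::execute
    mut messages: Vec<String>,
    max_iterations: u32,
) -> (SketchResponse, u32) {
    for iteration in 0..max_iterations {
        let response = call_model(&messages); // steps 1-2
        if response.tool_calls.is_empty() {
            return (response, iteration); // step 3: plain text answer, done
        }
        for call in &response.tool_calls {
            // step 5: run each tool and append its result to the history
            let result = execute_tool(call);
            messages.push(format!("tool:{}:{}", call.name, result));
        }
    }
    // step 6: budget exhausted; one last call (the real loop drops the tools here)
    (call_model(&messages), max_iterations)
}
```

The returned count matches AgentResult::iterations: the number of tool call rounds that actually ran.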

Usage:

use std::sync::Arc;
use blazen_llm::{run_agent, AgentConfig, ChatMessage};

let config = AgentConfig::new(vec![Arc::new(WeatherTool)])
    .with_system_prompt("You are a helpful assistant with weather tools.")
    .with_max_iterations(5)
    .with_finish_tool()
    .with_temperature(0.7)
    .with_max_tokens(2048);

let result = run_agent(
    &model,
    vec![ChatMessage::user("What's the weather in Paris?")],
    config,
).await?;

println!("Answer: {}", result.response.content.unwrap_or_default());
println!("Iterations: {}", result.iterations);
println!("Total cost: ${:.4}", result.total_cost.unwrap_or(0.0));

With callback:

use blazen_llm::{run_agent_with_callback, AgentEvent};

let result = run_agent_with_callback(
    &model,
    vec![ChatMessage::user("What's the weather?")],
    config,
    |event| match &event {
        AgentEvent::ToolCalled { iteration, tool_call } => {
            println!("[iter {iteration}] calling tool: {}", tool_call.name);
        }
        AgentEvent::ToolResult { tool_name, result, .. } => {
            println!("  {tool_name} -> {result}");
        }
        AgentEvent::IterationComplete { iteration, had_tool_calls } => {
            println!("[iter {iteration}] done (tools: {had_tool_calls})");
        }
    },
).await?;

AgentConfig

Configuration for the agentic tool execution loop.

max_iterations (u32, default: 10): Maximum tool call rounds before forcing a stop
tools (Vec<Arc<dyn Tool>>, required): Tools available to the agent
add_finish_tool (bool, default: false): Add an implicit "finish" tool the model can call to exit early
system_prompt (Option<String>, default: None): System prompt prepended to messages
temperature (Option<f32>, default: None): Sampling temperature
max_tokens (Option<u32>, default: None): Maximum tokens per completion call

Builder pattern:

AgentConfig::new(tools)
    .with_max_iterations(5)
    .with_system_prompt("You are helpful.")
    .with_finish_tool()
    .with_temperature(0.7)
    .with_max_tokens(2048)

AgentResult

Result of an agent run.

response (CompletionResponse): The final completion response
messages (Vec<ChatMessage>): Full message history including all tool calls and results
iterations (u32): Number of tool call rounds that occurred
total_usage (Option<TokenUsage>): Aggregated token usage across all rounds
total_cost (Option<f64>): Aggregated cost across all rounds
timing (Option<RequestTiming>): Total wall-clock time for the entire agent run

AgentEvent

Events emitted during agent execution (passed to the callback in run_agent_with_callback).

pub enum AgentEvent {
    ToolCalled {
        iteration: u32,
        tool_call: ToolCall,
    },
    ToolResult {
        iteration: u32,
        tool_name: String,
        result: serde_json::Value,
    },
    IterationComplete {
        iteration: u32,
        had_tool_calls: bool,
    },
}

Context

The Context object is a shared key-value store available in every workflow step. It provides three storage tiers and methods for event routing, streaming, and state management.

State Storage

Typed JSON: set() / get()

Store and retrieve any Serialize / DeserializeOwned type. Values are held internally as StateValue::Json.

// Store a typed value (anything implementing Serialize)
ctx.set("user_id", serde_json::json!("user_123"));
ctx.set("doc_count", serde_json::json!(5));

// Retrieve with type inference
let user_id: String = serde_json::from_value(ctx.get("user_id").unwrap()).unwrap();
let doc_count: i64 = serde_json::from_value(ctx.get("doc_count").unwrap()).unwrap();

Binary: set_bytes() / get_bytes()

Store raw Vec<u8> data. Values are held as StateValue::Bytes. No serialization requirement — useful for model weights, protobuf, bincode, or any binary format.

ctx.set_bytes("weights", vec![0x01, 0x02, 0x03]);
let bytes: Vec<u8> = ctx.get_bytes("weights").unwrap();

Raw StateValue: set_value() / get_value()

Work with the StateValue enum directly for full control over the storage variant, including the Native variant used by language bindings.

use blazen::context::StateValue;

ctx.set_value("config", StateValue::Json(serde_json::json!({"retries": 3})));
ctx.set_value("blob", StateValue::Bytes(vec![0xDE, 0xAD].into()));
ctx.set_value("py_obj", StateValue::Native(pickle_bytes.into()));

match ctx.get_value("config") {
    Some(StateValue::Json(v)) => { /* structured data */ }
    Some(StateValue::Bytes(b)) => { /* raw bytes */ }
    Some(StateValue::Native(b)) => { /* platform-serialized opaque bytes */ }
    None => { /* key not found */ }
}

StateValue

pub enum StateValue {
    Json(serde_json::Value),
    Bytes(BytesWrapper),
    Native(BytesWrapper),
}

Variants:

Json(serde_json::Value): Structured, serializable data. Used by ctx.set() / ctx.get().
Bytes(BytesWrapper): Raw binary data. Used by ctx.set_bytes() / ctx.get_bytes().
Native(BytesWrapper): Platform-serialized opaque objects (e.g., Python pickle bytes). Preserved across language boundaries without deserialization.

Run Identity

ctx.run_id() -> &str

Returns the unique identifier for the current workflow run.

Event Routing

ctx.send_event(event: impl Event)

Programmatically route an event into the workflow. Use this when a step needs to emit multiple events or decide at runtime which path to take. When using send_event, the step returns () instead of an event type.

ctx.write_event_to_stream(event: impl Event)

Publish an event to the workflow’s external event stream, observable by callers via stream_events(). Useful for progress reporting and live updates.

State Snapshot and Restore

ctx.collect_events() -> Vec<Box<dyn Event>>
ctx.snapshot_state() -> ContextSnapshot
ctx.restore_state(snapshot: ContextSnapshot)

collect_events(): Drain all pending events from the context.
snapshot_state(): Capture the entire context state as a serializable snapshot (for checkpointing / pause-resume).
restore_state(snapshot): Restore context from a previously captured snapshot.

Compute Platform

The compute module provides a unified trait system for async, job-based media generation providers (fal.ai, Replicate, RunPod, etc.) that model a submit-poll-retrieve workflow for GPU workloads.

ComputeProvider

The base trait for compute providers.

#[async_trait]
pub trait ComputeProvider: Send + Sync {
    fn provider_id(&self) -> &str;

    async fn submit(&self, request: ComputeRequest) -> Result<JobHandle, BlazenError>;

    async fn status(&self, job: &JobHandle) -> Result<JobStatus, BlazenError>;

    async fn result(&self, job: JobHandle) -> Result<ComputeResult, BlazenError>;

    async fn cancel(&self, job: &JobHandle) -> Result<(), BlazenError>;

    // Default: submit then wait for result
    async fn run(&self, request: ComputeRequest) -> Result<ComputeResult, BlazenError> {
        let job = self.submit(request).await?;
        self.result(job).await
    }
}
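When you want progress reporting instead of the blocking run(), submit/status/result compose into a poll loop. A generic sketch of that loop (synchronous stand-ins; the real trait is async and would sleep between polls):

```rust
// Mirrors JobStatus: Queued/Running are non-terminal; the rest end the loop.
enum SketchStatus {
    Queued,
    Running,
    Completed,
    Failed(String),
    Cancelled,
}

// Poll a status source until the job reaches a terminal state or the poll
// budget is exhausted. Stands in for calling ComputeProvider::status in a loop.
fn poll_until_done(
    mut poll_status: impl FnMut() -> SketchStatus,
    max_polls: u32,
) -> Result<(), String> {
    for _ in 0..max_polls {
        match poll_status() {
            // non-terminal: real code would sleep between polls here
            SketchStatus::Queued | SketchStatus::Running => continue,
            SketchStatus::Completed => return Ok(()), // now fetch via result()
            SketchStatus::Failed(err) => return Err(err),
            SketchStatus::Cancelled => return Err("job cancelled".to_string()),
        }
    }
    Err("poll budget exhausted".to_string())
}
```

On Ok(()), the caller retrieves the output with result(job); on Err, cancel() or surface the failure.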

ImageGeneration

Image generation and upscaling. Requires ComputeProvider as a supertrait.

#[async_trait]
pub trait ImageGeneration: ComputeProvider {
    async fn generate_image(&self, request: ImageRequest) -> Result<ImageResult, BlazenError>;
    async fn upscale_image(&self, request: UpscaleRequest) -> Result<ImageResult, BlazenError>;
}

Usage:

use blazen_llm::compute::{ImageGeneration, ImageRequest};

let result = provider.generate_image(
    ImageRequest::new("a cat in space")
        .with_size(1024, 1024)
        .with_count(2)
        .with_negative_prompt("blurry")
        .with_model("flux-dev"),
).await?;

for image in &result.images {
    println!("url: {:?}, {}x{}", image.media.url, image.width.unwrap_or(0), image.height.unwrap_or(0));
}

VideoGeneration

Video synthesis from text or images. Requires ComputeProvider as a supertrait.

#[async_trait]
pub trait VideoGeneration: ComputeProvider {
    async fn text_to_video(&self, request: VideoRequest) -> Result<VideoResult, BlazenError>;
    async fn image_to_video(&self, request: VideoRequest) -> Result<VideoResult, BlazenError>;
}

Usage:

use blazen_llm::compute::{VideoGeneration, VideoRequest};

// Text-to-video
let result = provider.text_to_video(
    VideoRequest::new("a sunset timelapse")
        .with_duration(5.0)
        .with_size(1920, 1080)
        .with_model("kling"),
).await?;

// Image-to-video
let result = provider.image_to_video(
    VideoRequest::for_image("https://example.com/img.png", "animate this scene")
        .with_duration(3.0),
).await?;

AudioGeneration

Audio synthesis including TTS, music, and sound effects. Requires ComputeProvider as a supertrait.

#[async_trait]
pub trait AudioGeneration: ComputeProvider {
    async fn text_to_speech(&self, request: SpeechRequest) -> Result<AudioResult, BlazenError>;

    // Default: returns BlazenError::Unsupported
    async fn generate_music(&self, request: MusicRequest) -> Result<AudioResult, BlazenError>;

    // Default: returns BlazenError::Unsupported
    async fn generate_sfx(&self, request: MusicRequest) -> Result<AudioResult, BlazenError>;
}

generate_music() and generate_sfx() have default implementations that return BlazenError::Unsupported. Providers override only the methods they support.

Usage:

use blazen_llm::compute::{AudioGeneration, SpeechRequest, MusicRequest};

let speech = provider.text_to_speech(
    SpeechRequest::new("Hello world")
        .with_voice("alloy")
        .with_language("en")
        .with_speed(1.0)
        .with_voice_url("https://example.com/voice.wav") // voice cloning
        .with_model("tts-1"),
).await?;

let music = provider.generate_music(
    MusicRequest::new("upbeat jazz")
        .with_duration(30.0)
        .with_model("musicgen"),
).await?;

Transcription

Audio transcription (speech-to-text). Requires ComputeProvider as a supertrait.

#[async_trait]
pub trait Transcription: ComputeProvider {
    async fn transcribe(
        &self,
        request: TranscriptionRequest,
    ) -> Result<TranscriptionResult, BlazenError>;
}

Usage:

use blazen_llm::compute::{Transcription, TranscriptionRequest};

let result = provider.transcribe(
    TranscriptionRequest::new("https://example.com/audio.mp3")
        .with_language("en")
        .with_diarize(true)
        .with_model("whisper-v3"),
).await?;

println!("Full text: {}", result.text);
for segment in &result.segments {
    println!("[{:.1}s - {:.1}s] {}: {}",
        segment.start, segment.end,
        segment.speaker.as_deref().unwrap_or("?"),
        segment.text,
    );
}

ThreeDGeneration

3D model generation from text or images. Requires ComputeProvider as a supertrait.

#[async_trait]
pub trait ThreeDGeneration: ComputeProvider {
    async fn generate_3d(&self, request: ThreeDRequest) -> Result<ThreeDResult, BlazenError>;
}

Usage:

use blazen_llm::compute::{ThreeDGeneration, ThreeDRequest};

// Text-to-3D
let result = provider.generate_3d(
    ThreeDRequest::new("a 3D cat")
        .with_format("glb")
        .with_model("triposr"),
).await?;

// Image-to-3D
let result = provider.generate_3d(
    ThreeDRequest::from_image("https://example.com/cat.png")
        .with_format("obj"),
).await?;

for model_3d in &result.models {
    println!("vertices: {:?}, faces: {:?}, textures: {}, animations: {}",
        model_3d.vertex_count, model_3d.face_count,
        model_3d.has_textures, model_3d.has_animations,
    );
}

Compute Request Types

ImageRequest

prompt (String): Text prompt describing the desired image
negative_prompt (Option<String>): Things to avoid in the image
width (Option<u32>): Desired width in pixels
height (Option<u32>): Desired height in pixels
num_images (Option<u32>): Number of images to generate
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: ImageRequest::new(prompt).with_size(w, h).with_count(n).with_negative_prompt(p).with_model(m)

UpscaleRequest

image_url (String): URL of the image to upscale
scale (f32): Scale factor (e.g. 2.0, 4.0)
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: UpscaleRequest::new(url, scale).with_model(m)

VideoRequest

prompt (String): Text prompt
image_url (Option<String>): Source image for image-to-video
duration_seconds (Option<f32>): Desired duration in seconds
negative_prompt (Option<String>): Things to avoid
width (Option<u32>): Desired width in pixels
height (Option<u32>): Desired height in pixels
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: VideoRequest::new(prompt) or VideoRequest::for_image(url, prompt), then .with_duration(s).with_size(w, h).with_model(m)

SpeechRequest

text (String): Text to synthesize
voice (Option<String>): Voice identifier (provider-specific)
voice_url (Option<String>): Reference voice URL for voice cloning
language (Option<String>): Language code (e.g. "en", "fr")
speed (Option<f32>): Speed multiplier (1.0 = normal)
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: SpeechRequest::new(text).with_voice(v).with_voice_url(url).with_language(l).with_speed(s).with_model(m)

MusicRequest

prompt (String): Text prompt
duration_seconds (Option<f32>): Desired duration in seconds
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: MusicRequest::new(prompt).with_duration(s).with_model(m)

TranscriptionRequest

audio_url (String): URL of the audio file
language (Option<String>): Language hint
diarize (bool, default: false): Enable speaker diarization
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: TranscriptionRequest::new(url).with_language(l).with_diarize(true).with_model(m)

ThreeDRequest

prompt (String): Text prompt
image_url (Option<String>): Source image for image-to-3D
format (Option<String>): Output format (e.g. "glb", "obj", "usdz")
model (Option<String>): Model override
parameters (serde_json::Value): Additional provider-specific parameters

Builder: ThreeDRequest::new(prompt) or ThreeDRequest::from_image(url), then .with_format(f).with_model(m)


Compute Result Types

ImageResult

images (Vec<GeneratedImage>): The generated/upscaled images
timing (RequestTiming): Request timing breakdown
cost (Option<f64>): Cost in USD
metadata (serde_json::Value): Provider-specific metadata

VideoResult

videos (Vec<GeneratedVideo>): The generated videos
timing (RequestTiming): Request timing breakdown
cost (Option<f64>): Cost in USD
metadata (serde_json::Value): Provider-specific metadata

AudioResult

audio (Vec<GeneratedAudio>): The generated audio clips
timing (RequestTiming): Request timing breakdown
cost (Option<f64>): Cost in USD
metadata (serde_json::Value): Provider-specific metadata

ThreeDResult

models (Vec<Generated3DModel>): The generated 3D models
timing (RequestTiming): Request timing breakdown
cost (Option<f64>): Cost in USD
metadata (serde_json::Value): Provider-specific metadata

TranscriptionResult

text (String): Full transcribed text
segments (Vec<TranscriptionSegment>): Time-aligned segments
language (Option<String>): Detected/specified language code
timing (RequestTiming): Request timing breakdown
cost (Option<f64>): Cost in USD
metadata (serde_json::Value): Provider-specific metadata

TranscriptionSegment

text (String): Transcribed text for this segment
start (f64): Start time in seconds
end (f64): End time in seconds
speaker (Option<String>): Speaker label (if diarization was enabled)

Compute Job Types

ComputeRequest

model (String): Model/endpoint to run (e.g. "fal-ai/flux/dev")
input (serde_json::Value): Input parameters as JSON (model-specific)
webhook (Option<String>): Webhook URL for async completion notification

ComputeResult

job (Option<JobHandle>): The job handle that produced this result
output (serde_json::Value): Output data (model-specific JSON)
timing (RequestTiming): Request timing breakdown
cost (Option<f64>): Cost in USD
metadata (serde_json::Value): Provider-specific metadata

JobHandle

id (String): Provider-assigned job identifier
provider (String): Provider name (e.g. "fal")
model (String): Model/endpoint that was invoked
submitted_at (DateTime<Utc>): When the job was submitted

JobStatus

pub enum JobStatus {
    Queued,
    Running,
    Completed,
    Failed { error: String },
    Cancelled,
}

Media

MediaType

Exhaustive enumeration of media formats with detection support. Covers images, video, audio, 3D models, documents, and a catch-all Other variant.

Variants:

Image: Png, Jpeg, WebP, Gif, Svg, Bmp, Tiff, Avif, Ico
Video: Mp4, WebM, Mov, Avi, Mkv
Audio: Mp3, Wav, Ogg, Flac, Aac, M4a, WebmAudio
3D: Glb, Gltf, Obj, Fbx, Usdz, Stl, Ply
Document: Pdf
Catch-all: Other { mime: String }

Methods:

mime(&self) -> &str: Return the MIME type string
extension(&self) -> &str: Return the canonical file extension (no dot)
magic_bytes(&self) -> Option<&'static [u8]>: Return the magic byte signature, if any
detect(bytes: &[u8]) -> Option<Self>: Detect media type from file header bytes
from_mime(mime: &str) -> Self: Parse a MIME string (unknown = Other)
from_extension(ext: &str) -> Self: Parse a file extension (unknown = Other)
is_image(&self) -> bool: Is this an image format?
is_video(&self) -> bool: Is this a video format?
is_audio(&self) -> bool: Is this an audio format?
is_3d(&self) -> bool: Is this a 3D model format?
is_vector(&self) -> bool: Is this a text-based format (SVG, GLTF, OBJ)?

MediaType implements Display (outputs the MIME string).

Example:

use blazen_llm::MediaType;

let mt = MediaType::from_extension("png");
assert_eq!(mt.mime(), "image/png");
assert!(mt.is_image());

// Detect from raw bytes
let bytes = [0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A];
assert_eq!(MediaType::detect(&bytes), Some(MediaType::Png));

MediaOutput

A single piece of generated media content. At least one of url, base64, or raw_content will be populated.

url (Option<String>): URL where the media can be downloaded
base64 (Option<String>): Base64-encoded media data
raw_content (Option<String>): Raw text content (SVG, OBJ, GLTF JSON)
media_type (MediaType): Format of the media
file_size (Option<u64>): File size in bytes
metadata (serde_json::Value): Provider-specific metadata

Constructors:

let output = MediaOutput::from_url("https://example.com/img.png", MediaType::Png);
let output = MediaOutput::from_base64("iVBORw0KGgo=", MediaType::Png);

GeneratedImage

media (MediaOutput): The image media output
width (Option<u32>): Width in pixels
height (Option<u32>): Height in pixels

GeneratedVideo

media (MediaOutput): The video media output
width (Option<u32>): Width in pixels
height (Option<u32>): Height in pixels
duration_seconds (Option<f32>): Duration in seconds
fps (Option<f32>): Frames per second

GeneratedAudio

media (MediaOutput): The audio media output
duration_seconds (Option<f32>): Duration in seconds
sample_rate (Option<u32>): Sample rate in Hz
channels (Option<u8>): Number of audio channels

Generated3DModel

media (MediaOutput): The 3D model media output
vertex_count (Option<u64>): Total vertex count
face_count (Option<u64>): Total face/triangle count
has_textures (bool): Whether the model includes textures
has_animations (bool): Whether the model includes animations

Error Handling

BlazenError

The unified error type for all Blazen LLM and compute operations.

| Variant | Fields | Description |
|---|---|---|
| `Auth` | `message: String` | Authentication failed |
| `RateLimit` | `retry_after_ms: Option<u64>` | Rate limited by the provider |
| `Timeout` | `elapsed_ms: u64` | Request timed out |
| `Provider` | `provider: String, message: String, status_code: Option<u16>` | Provider-specific error |
| `Validation` | `field: Option<String>, message: String` | Invalid input |
| `ContentPolicy` | `message: String` | Content policy violation |
| `Unsupported` | `message: String` | Requested capability is not supported |
| `Serialization` | `String` | JSON serialization/deserialization error |
| `Request` | `message: String, source: Option<Box<dyn Error>>` | Network or request-level failure |
| `Completion` | `CompletionErrorKind` | LLM completion-specific error |
| `Compute` | `ComputeErrorKind` | Compute job-specific error |
| `Media` | `MediaErrorKind` | Media-specific error |
| `Tool` | `name: Option<String>, message: String` | Tool execution error |

CompletionErrorKind

| Variant | Description |
|---|---|
| `NoContent` | Model returned no content |
| `ModelNotFound(String)` | Model not found |
| `InvalidResponse(String)` | Invalid response from the model |
| `Stream(String)` | Streaming error |

ComputeErrorKind

| Variant | Fields | Description |
|---|---|---|
| `JobFailed` | `message: String, error_type: Option<String>, retryable: bool` | Compute job failed |
| `Cancelled` | | Job was cancelled |
| `QuotaExceeded` | `message: String` | Provider quota exceeded |

MediaErrorKind

| Variant | Fields | Description |
|---|---|---|
| `Invalid` | `media_type: Option<String>, message: String` | Invalid media |
| `TooLarge` | `size_bytes: u64, max_bytes: u64` | Media exceeds size limit |

is_retryable()

```rust
impl BlazenError {
    pub fn is_retryable(&self) -> bool;
}
```

Returns true for RateLimit, Timeout, Request, provider errors with status >= 500, and ComputeErrorKind::JobFailed where retryable is true.
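A typical retry loop built on is_retryable might look like the following sketch. The backoff policy, attempt count, and the assumption that CompletionRequest implements Clone (and that a tokio runtime is available) are illustrative choices, not library guarantees:

```rust
use std::time::Duration;
use blazen_llm::{BlazenError, CompletionModel, CompletionRequest, CompletionResponse};

// Illustrative helper: retry only errors the library marks retryable,
// honoring the provider's retry_after_ms hint when rate limited.
async fn complete_with_retry<M: CompletionModel>(
    model: &M,
    request: CompletionRequest,
    max_attempts: u32,
) -> Result<CompletionResponse, BlazenError> {
    let mut attempt = 0;
    loop {
        match model.complete(request.clone()).await {
            Ok(resp) => return Ok(resp),
            Err(e) if e.is_retryable() && attempt + 1 < max_attempts => {
                let backoff_ms = match &e {
                    BlazenError::RateLimit { retry_after_ms: Some(ms) } => *ms,
                    _ => 250 * 2u64.pow(attempt), // simple exponential backoff
                };
                tokio::time::sleep(Duration::from_millis(backoff_ms)).await;
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}
```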

Convenience Constructors

```rust
BlazenError::auth("invalid api key")
BlazenError::timeout(5000)
BlazenError::timeout_from_duration(elapsed)
BlazenError::request("connection reset")
BlazenError::unsupported("music generation not available")
BlazenError::provider("openai", "internal server error")
BlazenError::validation("prompt must not be empty")
BlazenError::tool_error("unknown tool: foo")
BlazenError::no_content()
BlazenError::model_not_found("gpt-5")
BlazenError::invalid_response("missing content field")
BlazenError::stream_error("unexpected EOF")
BlazenError::job_failed("GPU out of memory")
BlazenError::cancelled()
```

BlazenError also implements From<serde_json::Error> for automatic conversion.
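That From impl lets serde_json fallible calls propagate with `?` inside any function returning Result<_, BlazenError>, converting automatically into the Serialization variant:

```rust
use blazen_llm::BlazenError;
use serde_json::Value;

fn parse_payload(raw: &str) -> Result<Value, BlazenError> {
    // serde_json::Error converts into BlazenError via `?`
    let value: Value = serde_json::from_str(raw)?;
    Ok(value)
}
```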


Custom Providers

Implementing CompletionModel

```rust
use blazen_llm::{
    CompletionModel, CompletionRequest, CompletionResponse, StreamChunk, BlazenError,
};
use std::pin::Pin;
use futures_util::Stream;

struct MyProvider {
    api_key: String,
}

#[async_trait::async_trait]
impl CompletionModel for MyProvider {
    fn model_id(&self) -> &str {
        "my-custom-model"
    }

    async fn complete(
        &self,
        request: CompletionRequest,
    ) -> Result<CompletionResponse, BlazenError> {
        // Your HTTP/gRPC/local inference logic here
        todo!()
    }

    async fn stream(
        &self,
        request: CompletionRequest,
    ) -> Result<
        Pin<Box<dyn Stream<Item = Result<StreamChunk, BlazenError>> + Send>>,
        BlazenError,
    > {
        // Your streaming implementation here
        todo!()
    }
}
```

Once implemented, MyProvider automatically gets StructuredOutput via the blanket impl, so model.extract::<T>(messages) works out of the box.
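For example, assuming extract is called with the chat messages as shown above and the target type derives JsonSchema and Deserialize (the Sentiment struct here is illustrative):

```rust
use blazen_llm::{ChatMessage, StructuredOutput};
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(JsonSchema, Deserialize)]
struct Sentiment {
    label: String,
    confidence: f32,
}

let provider = MyProvider { api_key: "...".into() };
let sentiment: Sentiment = provider
    .extract(vec![ChatMessage::user("This crate is fantastic!")])
    .await?;
println!("{} ({:.2})", sentiment.label, sentiment.confidence);
```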

Implementing ComputeProvider + ImageGeneration

```rust
use blazen_llm::compute::*;
use blazen_llm::BlazenError;

struct MyImageProvider {
    api_key: String,
}

#[async_trait::async_trait]
impl ComputeProvider for MyImageProvider {
    fn provider_id(&self) -> &str { "my-image-provider" }

    async fn submit(&self, request: ComputeRequest) -> Result<JobHandle, BlazenError> {
        todo!()
    }

    async fn status(&self, job: &JobHandle) -> Result<JobStatus, BlazenError> {
        todo!()
    }

    async fn result(&self, job: JobHandle) -> Result<ComputeResult, BlazenError> {
        todo!()
    }

    async fn cancel(&self, job: &JobHandle) -> Result<(), BlazenError> {
        todo!()
    }
}

#[async_trait::async_trait]
impl ImageGeneration for MyImageProvider {
    async fn generate_image(
        &self,
        request: ImageRequest,
    ) -> Result<ImageResult, BlazenError> {
        // Convert ImageRequest to your provider's format and call the API
        todo!()
    }

    async fn upscale_image(
        &self,
        request: UpscaleRequest,
    ) -> Result<ImageResult, BlazenError> {
        todo!()
    }
}
```
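With both traits in place, callers can reach MyImageProvider through the high-level ImageGeneration methods or drive the ComputeProvider job lifecycle directly. A sketch of both paths (the `ImageRequest::new` and `ComputeRequest::new` constructors are assumptions for illustration; consult the actual request builder API):

```rust
let provider = MyImageProvider { api_key: "...".into() };

// High-level path: one call returns the finished image.
// ImageRequest::new(..) is an assumed constructor for this sketch.
let result = provider
    .generate_image(ImageRequest::new("a watercolor fox"))
    .await?;

// Low-level path: submit a job and poll it yourself.
// ComputeRequest::new(..) is likewise an assumed constructor.
let job = provider.submit(ComputeRequest::new("a watercolor fox")).await?;
let status = provider.status(&job).await?;
```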

Built-in Providers

| Provider | Feature | Traits Implemented |
|---|---|---|
| `OpenAiProvider` | `openai` | `CompletionModel`, `StructuredOutput` |
| `OpenAiCompatProvider` | `openai` | `CompletionModel`, `StructuredOutput`, `ModelRegistry` |
| `AnthropicProvider` | `anthropic` | `CompletionModel`, `StructuredOutput` |
| `GeminiProvider` | `gemini` | `CompletionModel`, `StructuredOutput`, `ModelRegistry` |
| `AzureOpenAiProvider` | `azure` | `CompletionModel`, `StructuredOutput` |
| `FalProvider` | `fal` | `CompletionModel`, `StructuredOutput`, `ComputeProvider`, `ImageGeneration`, `VideoGeneration`, `AudioGeneration`, `Transcription` |

OpenAiCompatProvider Presets

OpenAiCompatProvider works with any OpenAI-compatible endpoint. Named constructors are provided for popular services:

```rust
use blazen_llm::providers::openai_compat::OpenAiCompatProvider;

let groq = OpenAiCompatProvider::groq("gsk-...");
let openrouter = OpenAiCompatProvider::openrouter("sk-or-...");
let together = OpenAiCompatProvider::together("...");
let mistral = OpenAiCompatProvider::mistral("...");
let deepseek = OpenAiCompatProvider::deepseek("...");
let fireworks = OpenAiCompatProvider::fireworks("...");
let perplexity = OpenAiCompatProvider::perplexity("...");
let xai = OpenAiCompatProvider::xai("...");
let cohere = OpenAiCompatProvider::cohere("...");
let bedrock = OpenAiCompatProvider::bedrock("...", "us-east-1");
```
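Each preset returns an OpenAiCompatProvider, so the standard CompletionModel flow applies unchanged:

```rust
use blazen_llm::{ChatMessage, CompletionModel, CompletionRequest};
use blazen_llm::providers::openai_compat::OpenAiCompatProvider;

let groq = OpenAiCompatProvider::groq("gsk-...");
let response = groq
    .complete(CompletionRequest::new(vec![
        ChatMessage::user("Summarize Rust's ownership model in one sentence."),
    ]))
    .await?;
println!("{}", response.content.unwrap_or_default());
```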