Embeddings
Generate vector embeddings with Blazen in Node.js
Blazen provides a unified EmbeddingModel interface for generating vector embeddings across multiple providers. The API mirrors CompletionModel: create a model with a static factory method, then call embed().
Create an Embedding Model
import { EmbeddingModel } from "blazen";
// OpenAI (default: text-embedding-3-small, 1536 dimensions)
// Reads OPENAI_API_KEY from the environment by default.
const model = EmbeddingModel.openai();
// Or pass an API key explicitly.
const openaiModel = EmbeddingModel.openai({ apiKey: "sk-..." });
// Together AI
const togetherModel = EmbeddingModel.together({ apiKey: "tok-..." });
// Cohere
const cohereModel = EmbeddingModel.cohere({ apiKey: "co-..." });
// Fireworks AI
const fireworksModel = EmbeddingModel.fireworks({ apiKey: "fw-..." });
Generate Embeddings
Pass an array of strings to embed(). It returns an EmbeddingResponse with one vector per input text.
const response = await model.embed(["Hello, world!", "Goodbye, world!"]);
console.log(response.embeddings.length); // 2
console.log(response.embeddings[0].length); // 1536 (dimensionality)
console.log(response.model); // "text-embedding-3-small"
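Once you have vectors, the usual next step is to compare them. The sketch below computes cosine similarity between two embeddings. `cosineSimilarity` is a hypothetical helper written for illustration, not part of Blazen, and it assumes both inputs are plain number arrays of equal length, such as two rows of `response.embeddings`.

```javascript
// Hypothetical helper, not part of the Blazen API.
// Computes the cosine similarity between two equal-length vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero vector has no direction, so treat its similarity as 0.
  return denom === 0 ? 0 : dot / denom;
}
```

With the response above, `cosineSimilarity(response.embeddings[0], response.embeddings[1])` would give a score between -1 and 1, where higher values mean the two texts are closer in meaning.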
EmbeddingResponse
The response object has the following fields:
| Property | Type | Description |
|---|---|---|
| .embeddings | number[][] | One vector per input text. |
| .model | string | Model that produced the embeddings. |
| .usage | TokenUsage \| undefined | Token usage statistics. |
| .cost | number \| undefined | Estimated cost in USD. |
| .timing | RequestTiming \| undefined | Request timing breakdown. |
Model Properties
console.log(model.modelId); // "text-embedding-3-small"
console.log(model.dimensions); // 1536
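One practical use of `dimensions` is as a sanity check before writing vectors to a store that expects a fixed width. The following is a minimal sketch; `assertDimensions` is a hypothetical helper, not a Blazen function, and it assumes the embeddings are a plain array of number arrays.

```javascript
// Hypothetical helper, not part of the Blazen API.
// Throws if any vector does not have the expected number of dimensions;
// otherwise returns the number of vectors checked.
function assertDimensions(embeddings, expected) {
  embeddings.forEach((vector, i) => {
    if (vector.length !== expected) {
      throw new Error(
        `Embedding ${i} has ${vector.length} dimensions, expected ${expected}`
      );
    }
  });
  return embeddings.length;
}
```

A call such as `assertDimensions(response.embeddings, model.dimensions)` would then fail fast if a response ever came back with an unexpected vector size.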
Local Embeddings
Blazen can generate embeddings entirely on your machine using its built-in embed backend. It needs no API key, makes no network calls after the initial model download, and is completely free. The embed backend runs through ONNX Runtime on glibc Linux, macOS, and Windows, and through the pure-Rust tract engine on musl; the facade picks the right underlying implementation automatically for your target.
Setup
Local embeddings are available when Blazen is built with the embed feature. The default package installed by npm install blazen includes it.
Usage
import { EmbeddingModel } from "blazen";
// Use the default model (BAAI/bge-small-en-v1.5, 384 dimensions)
const model = EmbeddingModel.embed();
// Or specify a model and other options explicitly
const customModel = EmbeddingModel.embed({
  modelName: "BGESmallENV15",
  cacheDir: "/tmp/models",
  maxBatchSize: 256,
  showDownloadProgress: true,
});
const response = await model.embed(["hello", "world"]);
console.log(response.embeddings.length); // 2
console.log(response.embeddings[0].length); // 384
EmbedOptions
| Field | Type | Default | Description |
|---|---|---|---|
| modelName | string \| undefined | "BGESmallENV15" | Embed model variant name. |
| cacheDir | string \| undefined | backend default | Directory where downloaded models are cached. |
| maxBatchSize | number \| undefined | 256 | Maximum batch size for embedding. |
| showDownloadProgress | boolean \| undefined | false | Print a progress bar during model download. |
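To make the idea behind `maxBatchSize` concrete, the sketch below splits a list of texts into batches of at most a given size. This only illustrates the concept; `toBatches` is a hypothetical helper, and how Blazen's backend batches inputs internally is not described here.

```javascript
// Hypothetical helper, not part of the Blazen API.
// Splits an array of texts into batches of at most `maxBatchSize` items.
function toBatches(texts, maxBatchSize = 256) {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < texts.length; i += maxBatchSize) {
    batches.push(texts.slice(i, i + maxBatchSize));
  }
  return batches;
}
```

A smaller batch size lowers peak memory use at the cost of more passes through the model, which is the usual trade-off when tuning this kind of setting.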
Drop-in with Memory
A local embedding model is a regular EmbeddingModel, so it plugs into Memory with no changes:
import { EmbeddingModel, Memory, InMemoryBackend } from "blazen";
const model = EmbeddingModel.embed();
const memory = new Memory(model, new InMemoryBackend());
await memory.add("doc1", "Paris is the capital of France");
const results = await memory.search("capital of France", 5);
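Conceptually, a semantic search like the one above embeds the query and ranks stored vectors by similarity. The sketch below shows a brute-force version of that ranking. It is not Blazen's Memory implementation: `topK`, the `{ id, vector }` entry shape, and the assumption that the second argument to `search` is a result limit are all hypothetical and used only for illustration.

```javascript
// Hypothetical sketch, not Blazen's Memory implementation.
// Ranks stored entries by cosine similarity to a query vector and
// returns the `limit` best matches, highest score first.
function topK(queryVector, entries, limit) {
  const norm = (v) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  const queryNorm = norm(queryVector);
  return entries
    .map(({ id, vector }) => {
      const dot = vector.reduce((sum, x, i) => sum + x * queryVector[i], 0);
      const denom = norm(vector) * queryNorm;
      // Zero vectors get a score of 0 instead of NaN.
      return { id, score: denom === 0 ? 0 : dot / denom };
    })
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

A linear scan like this is fine for small collections; larger ones usually call for an approximate nearest-neighbor index instead.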
Model Download
The first call to embed() (or memory.add()) downloads the ONNX model weights. For BGESmallENV15, the download is roughly 33 MB. After the first run, the model is cached locally and no further network access is required.
Use Cases
Embeddings are the building block for semantic search, RAG pipelines, clustering, and classification. A typical pattern inside a workflow step:
import { EmbeddingModel } from "blazen";
const embedModel = EmbeddingModel.openai({ apiKey: "sk-..." });
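// `wf` is assumed to be a workflow instance created elsewhere.
wf.addStep("embed_documents", ["DocumentsReady"], async (event, ctx) => {
  // event.documents is expected to be an array of strings.
  const response = await embedModel.embed(event.documents);
  // Store the vectors in the workflow context for later steps.
  await ctx.set("vectors", response.embeddings);
  return { type: "SearchEvent" };
});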