Semantic Search
Semantic search retrieves past messages by similarity rather than recency. It requires an embedding adapter (text to vectors) and a vector adapter (storage and search).
Configuration
import { Agent, Memory, AiSdkEmbeddingAdapter, InMemoryVectorAdapter } from "@voltagent/core";
import { LibSQLMemoryAdapter, LibSQLVectorAdapter } from "@voltagent/libsql";
import { openai } from "@ai-sdk/openai";
const memory = new Memory({
storage: new LibSQLMemoryAdapter({ url: "file:./.voltagent/memory.db" }),
embedding: new AiSdkEmbeddingAdapter(openai.embedding("text-embedding-3-small")),
vector: new LibSQLVectorAdapter({ url: "file:./.voltagent/memory.db" }), // or InMemoryVectorAdapter() for dev
enableCache: true, // optional embedding cache
});
const agent = new Agent({
name: "Assistant",
model: openai("gpt-4o-mini"),
memory,
});
Available Adapters
Embedding:
- AiSdkEmbeddingAdapter (@voltagent/core) - Wraps any AI SDK embedding model
Vector Storage:
- InMemoryVectorAdapter (@voltagent/core) - Development only
- LibSQLVectorAdapter (@voltagent/libsql) - Persistent vectors in SQLite/LibSQL/Turso
- PostgresVectorAdapter (@voltagent/postgres) - Persistent vectors in Postgres with pgvector (sketch below)
- ManagedMemoryVectorAdapter (@voltagent/voltagent-memory) - VoltOps-hosted vectors
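PostgresVectorAdapter is the only adapter not configured elsewhere on this page. A minimal sketch, assuming its constructor accepts a connection string the way the LibSQL adapter accepts a url; the option name here is an assumption, so verify it against the @voltagent/postgres docs:
import { PostgresVectorAdapter } from "@voltagent/postgres";
// Assumption: a connection option analogous to LibSQLVectorAdapter's `url`.
// Check the @voltagent/postgres docs for the exact constructor options.
const vector = new PostgresVectorAdapter({
  connection: process.env.DATABASE_URL, // e.g. postgres://user:pass@host:5432/db
});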
Usage
Enable semantic search per generation call:
const result = await agent.generateText("What pricing model did we discuss?", {
userId: "user-123",
conversationId: "thread-abc",
semanticMemory: {
enabled: true, // default: auto-enabled when vector support is present
semanticLimit: 5, // number of similar messages to retrieve
semanticThreshold: 0.7, // minimum similarity score (0-1)
mergeStrategy: "append", // "prepend" | "append" | "interleave"
},
});
Default Behavior
When embedding and vector adapters are configured:
- Semantic search auto-enables for calls with userId and conversationId
- Default semanticLimit: 5 messages
- Default semanticThreshold: 0.7
- Default mergeStrategy: "append" (recent messages first, then similar messages)
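With both adapters configured, a plain call picks up all of these defaults; no semanticMemory block is needed:
// Semantic search runs automatically here: up to 5 similar messages at a
// 0.7 threshold, appended after the recent conversation history.
const reply = await agent.generateText("Summarize our pricing discussion", {
  userId: "user-123",
  conversationId: "thread-abc",
});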
Merge Strategies
- append (default): [recent messages] + [similar messages] - preserves chronological order
- prepend: [similar messages] + [recent messages] - emphasizes relevance
- interleave: Alternates between similar and recent messages
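A sketch of what the three strategies do with the two result sets. mergeMessages is a hypothetical helper for illustration only: the real merging (including deduplication) happens inside the Memory class, and the alternation order for interleave is an assumption.
type MergeStrategy = "prepend" | "append" | "interleave";
// Hypothetical helper illustrating the three strategies (not library code).
function mergeMessages<T>(recent: T[], similar: T[], strategy: MergeStrategy): T[] {
  switch (strategy) {
    case "append": // recent first, then similar
      return [...recent, ...similar];
    case "prepend": // similar first, then recent
      return [...similar, ...recent];
    case "interleave": {
      // Alternate similar/recent; starting with similar is an assumption.
      const merged: T[] = [];
      for (let i = 0; i < Math.max(recent.length, similar.length); i++) {
        if (i < similar.length) merged.push(similar[i]);
        if (i < recent.length) merged.push(recent[i]);
      }
      return merged;
    }
  }
}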
How It Works
On Message Save
When saving messages, the Memory class:
- Extracts text content from UIMessage.parts
- Generates embeddings via the embedding adapter
- Stores vectors with metadata:
  - ID: msg_${conversationId}_${message.id}
  - Metadata: { messageId, conversationId, userId, role, createdAt }
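Putting the two bullets above together, each stored record has roughly this shape. buildVectorRecord is a hypothetical helper for illustration, and the timestamp format is an assumption:
import type { UIMessage } from "ai";
// Hypothetical helper showing the record shape described above
// (not VoltAgent internals).
function buildVectorRecord(
  message: UIMessage,
  conversationId: string,
  userId: string,
  vector: number[] // produced by the embedding adapter
) {
  return {
    id: `msg_${conversationId}_${message.id}`,
    vector,
    metadata: {
      messageId: message.id,
      conversationId,
      userId,
      role: message.role,
      createdAt: new Date().toISOString(), // assumption: ISO-8601 timestamp
    },
  };
}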
On Message Retrieval
When semantic search is enabled:
- Embed the current query
- Search for similar vectors using the vector adapter
- Retrieve matching messages by ID
- Merge with recent messages using the configured strategy
- Remove duplicates (messages in both sets)
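The first three steps can be reproduced with the documented searchSimilar API (shown in full below); the merge and dedupe steps happen inside the Memory class:
// Steps 1-2: embed the query and search the vector store.
const hits = await memory.searchSimilar("pricing discussion", {
  limit: 5,
  threshold: 0.7,
  filter: { userId: "user-123", conversationId: "thread-abc" },
});
// Step 3: the matching message IDs come back in the result metadata.
const messageIds = hits.map((hit) => hit.metadata.messageId);
// Steps 4-5: Memory merges these with recent messages per mergeStrategy
// and drops any message that appears in both sets.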
Programmatic API
Direct search without agent generation:
// Check if vectors are configured
const hasVectors = memory.hasVectorSupport(); // boolean
// Search similar messages
const results = await memory.searchSimilar("pricing discussion", {
limit: 10,
threshold: 0.8,
filter: { userId: "user-123", conversationId: "thread-abc" },
});
for (const result of results) {
console.log(result.id, result.score, result.metadata);
}
Embedding Cache
Enable caching to avoid re-embedding identical text:
const memory = new Memory({
storage: new LibSQLMemoryAdapter({ url: "file:./.voltagent/memory.db" }),
embedding: new AiSdkEmbeddingAdapter(openai.embedding("text-embedding-3-small")),
vector: new LibSQLVectorAdapter({ url: "file:./.voltagent/memory.db" }),
enableCache: true, // enable cache
cacheSize: 1000, // max entries (default: 1000)
cacheTTL: 3600000, // TTL in ms (default: 1 hour)
});
The cache stores text → vector mappings in memory with LRU eviction.
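A minimal sketch of the LRU-with-TTL idea (illustrative only, not VoltAgent's implementation):
// Minimal LRU + TTL cache sketch. A Map preserves insertion order, so the
// first key is always the least recently used entry.
class EmbeddingCache {
  private entries = new Map<string, { vector: number[]; expiresAt: number }>();
  constructor(private maxSize = 1000, private ttlMs = 3_600_000) {}

  get(text: string): number[] | undefined {
    const hit = this.entries.get(text);
    if (!hit || hit.expiresAt < Date.now()) return undefined;
    this.entries.delete(text); // re-insert to mark as most recently used
    this.entries.set(text, hit);
    return hit.vector;
  }

  set(text: string, vector: number[]): void {
    if (this.entries.size >= this.maxSize) {
      this.entries.delete(this.entries.keys().next().value!); // evict LRU
    }
    this.entries.set(text, { vector, expiresAt: Date.now() + this.ttlMs });
  }
}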
Vector Adapters
InMemoryVectorAdapter
import { InMemoryVectorAdapter } from "@voltagent/core";
const vector = new InMemoryVectorAdapter();
- Uses cosine similarity (see the sketch after this list)
- Supports metadata filtering
- Lost on restart (use for development only)
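For reference, cosine similarity scores two vectors by the angle between them; a minimal version (illustrative, not the adapter's code):
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns a value in [-1, 1];
// for embedding vectors, values closer to 1 mean more similar text.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}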
LibSQLVectorAdapter
import { LibSQLVectorAdapter } from "@voltagent/libsql";
// Local SQLite
const vector = new LibSQLVectorAdapter({ url: "file:./.voltagent/memory.db" });
// In-memory (testing)
const vector = new LibSQLVectorAdapter({ url: ":memory:" });
// Turso
const vector = new LibSQLVectorAdapter({
url: "libsql://your-db.turso.io",
authToken: process.env.TURSO_AUTH_TOKEN,
});
- Stores vectors as BLOBs
- Supports metadata filtering and thresholds
- Persistent across restarts
ManagedMemoryVectorAdapter
import { ManagedMemoryVectorAdapter } from "@voltagent/voltagent-memory";
const vector = new ManagedMemoryVectorAdapter({
databaseName: "production-memory",
// voltOpsClient optional (auto-resolves from environment)
});
- VoltOps-hosted vectors with zero setup
- See Managed Memory for configuration
Example: Full Semantic Search Setup
import { Agent, Memory, AiSdkEmbeddingAdapter } from "@voltagent/core";
import { LibSQLMemoryAdapter, LibSQLVectorAdapter } from "@voltagent/libsql";
import { openai } from "@ai-sdk/openai";
const memory = new Memory({
storage: new LibSQLMemoryAdapter({ url: "file:./.voltagent/memory.db" }),
embedding: new AiSdkEmbeddingAdapter(openai.embedding("text-embedding-3-small")),
vector: new LibSQLVectorAdapter({ url: "file:./.voltagent/memory.db" }),
enableCache: true,
});
const agent = new Agent({
name: "Research Assistant",
instructions: "Help users recall past discussions and find relevant information.",
model: openai("gpt-4o-mini"),
memory,
});
// Semantic search automatically enabled
const result = await agent.generateText(
"What did we decide about the API authentication approach?",
{
userId: "user-123",
conversationId: "project-alpha",
semanticMemory: {
semanticLimit: 10,
semanticThreshold: 0.75,
},
}
);
Learn More
- Working Memory - Maintain compact context across turns
- Managed Memory - Zero-setup vector storage
- LibSQL / Turso - Self-hosted vector storage