# Semantic Recall
Vector-indexed long-term memory for retrieving relevant past knowledge using embedding similarity search.
## Overview
Semantic Recall is a memory layer designed for vector-indexed long-term storage. It stores items as embeddings and retrieves them by similarity to the current query, making it ideal for large knowledge bases, past interactions, or document retrieval.
- **Slot:** `400` (`Slot.SEMANTIC_RECALL`)
- **Scope:** `global`
## Concept
Unlike other memory layers that operate on structured state, Semantic Recall integrates with a vector store to perform approximate nearest-neighbor search. The `recall` hook embeds the current query and retrieves the most relevant items within the token budget. The `store` hook embeds new content and indexes it for future retrieval.
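The similarity measure underlying this search is typically cosine similarity between embedding vectors. A minimal sketch (illustrative only; `cosineSimilarity` is not part of Noetic):

```typescript
// Cosine similarity scores two embeddings by the angle between them:
// 1 means identical direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Real vector databases index embeddings so they can approximate this search without comparing the query against every stored vector.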
## Building a Semantic Recall Layer
Noetic does not ship a built-in vector store. Instead, you implement the `MemoryHooks` interface with your preferred vector database. Here is the general shape:
```ts
import type { MemoryLayer } from '@noetic/core';
import { Slot } from '@noetic/core';

interface VectorStore {
  upsert(id: string, embedding: number[], text: string, metadata?: Record<string, unknown>): Promise<void>;
  query(embedding: number[], topK: number): Promise<Array<{ id: string; text: string; score: number }>>;
}

interface SemanticRecallConfig {
  vectorStore: VectorStore;
  embed: (text: string) => Promise<number[]>;
  topK?: number;
}

function semanticRecall(config: SemanticRecallConfig): MemoryLayer {
  return {
    id: 'semantic-recall',
    name: 'Semantic Recall',
    slot: Slot.SEMANTIC_RECALL,
    scope: 'global',
    budget: { min: 500, max: 3000 },
    hooks: {
      async init() {
        return { state: {} };
      },
      async recall({ query, budget }) {
        // Embed the query and fetch the nearest stored items.
        const embedding = await config.embed(query);
        const results = await config.vectorStore.query(
          embedding,
          config.topK ?? 10,
        );
        const items = results.map((r) => ({
          id: r.id,
          type: 'message' as const,
          role: 'developer' as const,
          status: 'completed' as const,
          content: [{ type: 'input_text' as const, text: r.text }],
        }));
        return {
          items,
          // Rough token estimate: ~4 characters per token.
          tokenCount: items.reduce(
            (sum, item) => sum + Math.ceil(item.content[0].text.length / 4),
            0,
          ),
        };
      },
      async store({ newItems }) {
        // Embed and index new message text for future retrieval.
        for (const item of newItems) {
          if (item.type !== 'message') continue;
          const text = item.content
            .filter((c) => c.type === 'output_text' || c.type === 'input_text')
            .map((c) => ('text' in c ? c.text : ''))
            .join('');
          if (!text) continue;
          const embedding = await config.embed(text);
          await config.vectorStore.upsert(item.id, embedding, text);
        }
        return { state: {} };
      },
    },
  };
}
```

## Adapter Examples
You can plug in any vector database:
- **Pinecone:** use `@pinecone-database/pinecone` as the vector store backend
- **Qdrant:** use `@qdrant/js-client-rest`
- **ChromaDB:** use `chromadb`
- **pgvector:** use raw SQL with `pg` and the `vector` extension
- **In-memory:** use a simple array with cosine similarity for development
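The in-memory option can be sketched as a tiny exact-search store that satisfies the `VectorStore` interface above (illustrative only; `InMemoryVectorStore` is not part of Noetic and does no approximate indexing, so it suits small development datasets):

```typescript
interface Entry {
  id: string;
  embedding: number[];
  text: string;
  metadata?: Record<string, unknown>;
}

// Development-only store: a Map of entries with exact cosine-similarity search.
class InMemoryVectorStore {
  private entries = new Map<string, Entry>();

  async upsert(id: string, embedding: number[], text: string, metadata?: Record<string, unknown>): Promise<void> {
    this.entries.set(id, { id, embedding, text, metadata });
  }

  async query(embedding: number[], topK: number): Promise<Array<{ id: string; text: string; score: number }>> {
    const scored = [...this.entries.values()].map((e) => ({
      id: e.id,
      text: e.text,
      score: cosine(embedding, e.embedding),
    }));
    // Exact search: sort by similarity, descending, and keep the top K.
    scored.sort((a, b) => b.score - a.score);
    return scored.slice(0, topK);
  }
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Guard against zero vectors to avoid dividing by zero.
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}
```

Swapping this out for a real database later only requires another object with the same `upsert` and `query` methods.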
## Configuration Tips
- Set `topK` based on your token budget. Each retrieved document consumes tokens.
- Use `budget: { min, max }` to let the allocator distribute spare capacity.
- Consider a `scope` of `'resource'` if you want per-project knowledge isolation instead of global.
- The `store` timeout should account for embedding API latency.
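As a back-of-the-envelope check, `topK` can be derived from the token budget and your average document length. A sketch, assuming the rough 4-characters-per-token ratio (`estimateTopK` is a hypothetical helper, not a Noetic API):

```typescript
// Pick a topK so that the retrieved documents roughly fit the allocated budget.
function estimateTopK(budgetTokens: number, avgDocChars: number): number {
  const avgDocTokens = Math.ceil(avgDocChars / 4);
  // Always retrieve at least one document, even on tight budgets.
  return Math.max(1, Math.floor(budgetTokens / avgDocTokens));
}
```

For example, a 3,000-token budget with ~1,200-character documents (~300 tokens each) suggests a `topK` of about 10.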
## Next Steps
- Custom Layers -- full guide to building any memory layer
- Memory Layer System -- how slots, scopes, and budgets work together