Latest: v0.1.2 — Query-driven filtering, multilang support, LoCoMo eval suite
A plugin that extracts and retains semantic entities and relations from conversation histories, enriching agent context with semantic memory.
This plugin intercepts the tape context building process to:
- Extract semantics from conversation entries using an LLM
- Store snapshots of entities (people, tasks, concepts) and relations between them
- Inject memory into subsequent agent prompts, enabling long-context awareness
The plugin follows Bub's philosophy: it's completely optional, zero-config after installation, and hooks into the existing build_tape_context architecture without modifying core.
Query-Driven Filtering — Cuts memory block size by 78-81% with minimal accuracy loss.
- extract_cues(): Deterministic cue extraction from user queries (language-aware, 14 languages supported). Cues are used to filter entities and relations before injection.
- _format_snapshots_filtered(): Renders only entities and relations relevant to the current question, with 1-hop relation traversal to preserve answer-reachable nodes.
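The filtering idea can be sketched roughly as follows. This is a minimal illustration, assuming entities and relations are plain dicts shaped like the stored snapshot format; the helper names `extract_cues_sketch` and `filter_snapshot`, the tiny stopword set, and the substring-matching rule are hypothetical and are not the plugin's actual implementation.

```python
import re

# Hypothetical, tiny English stopword list; the plugin ships per-language data.
STOPWORDS = {"a", "an", "the", "what", "who", "did", "do", "does",
             "is", "was", "to", "of", "in", "on"}


def extract_cues_sketch(query: str) -> set[str]:
    """Lowercase the query, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"\w+", query.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 1}


def filter_snapshot(entities: list[dict], relations: list[dict], cues: set[str]):
    """Keep entities whose name matches a cue, plus their 1-hop neighbours."""
    # Seed set: entities whose name contains any cue.
    seeds = {e["id"] for e in entities
             if any(c in e["name"].lower() for c in cues)}
    # 1-hop expansion: keep every relation touching a seed, and both endpoints.
    kept_relations = [r for r in relations
                      if r["from"] in seeds or r["to"] in seeds]
    kept_ids = set(seeds)
    for r in kept_relations:
        kept_ids.update((r["from"], r["to"]))
    kept_entities = [e for e in entities if e["id"] in kept_ids]
    return kept_entities, kept_relations


entities = [
    {"id": "e1", "type": "person", "name": "Alice"},
    {"id": "e2", "type": "task", "name": "deploy_v1"},
    {"id": "e3", "type": "person", "name": "Bob"},
]
relations = [{"from": "e1", "to": "e2", "type": "created"}]

cues = extract_cues_sketch("What did Alice do?")
kept_entities, kept_relations = filter_snapshot(entities, relations, cues)
print([e["name"] for e in kept_entities])
```

In this sketch, Bob is dropped because no cue matches him and no kept relation reaches him, while deploy_v1 survives through the 1-hop traversal from Alice.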
LoCoMo Evaluation Suite — Two benchmark scripts against the ACL 2024 LoCoMo dataset.
- scripts/eval_locomo.py: Recall/precision benchmark with token savings per category.
- scripts/eval_locomo_judge.py: Mem0-protocol LLM-judge accuracy evaluation. Session timeline injection fixes temporal recall (0% → 50-100%). Baseline accuracy reaches 75-100% on DeepSeek (above Mem0's published 67%).
Multilang Support — 14 languages via i18n entity patterns.
- Language-aware stopwords, candidate patterns, and multi-word extraction for: en, zh-CN, zh-TW, ja, ko, ru, de, fr, es, it, pt-br, be, hi, id.
- BUB_SEMANTIC_LANGS env var controls active languages (default: en).
- Regex-based fallback for languages without i18n data.
Performance — LLM extraction uses EmptyTapeStore to avoid unnecessary persistence;
snapshot loading is cached within a turn to avoid repeated I/O.
Breaking changes: None (additive release).
The plugin is already registered in pyproject.toml:
[project.entry-points."bub"]
semantic_memory = "bub.plugins.semantic_memory.hook_impl:SemanticMemoryPlugin"
Bub's framework automatically loads and instantiates it on startup. No additional setup required.
- Input: Agent receives a new message, tape entries are loaded
- Extract: LLM analyzes entries and identifies:
  - Entities: people, tasks, events, concepts
  - Relations: created, depends_on, mentions, etc.
- Store: SemanticSnapshot is appended to ~/.bub/tapes/semantic/{tape_id}.jsonl
- Load: All historical snapshots for this tape are loaded
- Inject: Semantic memory is formatted as a system prompt block and prepended to the context
- Output: Agent receives enriched context with semantic awareness
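The Store and Load steps above amount to appending to and reading back a JSONL file. The following is a minimal sketch of that pattern only; the function names `append_snapshot` and `load_snapshots` are hypothetical, snapshots are shown as plain dicts, and a temporary directory stands in for ~/.bub/tapes/semantic/.

```python
import json
import tempfile
from pathlib import Path


def append_snapshot(store_dir: Path, tape_id: str, snapshot: dict) -> Path:
    """Append one snapshot as a single JSON line to <store_dir>/<tape_id>.jsonl."""
    store_dir.mkdir(parents=True, exist_ok=True)
    path = store_dir / f"{tape_id}.jsonl"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(snapshot, ensure_ascii=False) + "\n")
    return path


def load_snapshots(store_dir: Path, tape_id: str) -> list[dict]:
    """Load every snapshot for a tape; a missing file means no memory yet."""
    path = store_dir / f"{tape_id}.jsonl"
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# A temporary directory stands in for the real storage location in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    store_dir = Path(tmp)
    append_snapshot(store_dir, "demo_tape",
                    {"entities": [{"id": "e1", "name": "Alice"}], "relations": []})
    append_snapshot(store_dir, "demo_tape",
                    {"entities": [{"id": "e2", "name": "deploy_v1"}], "relations": []})
    snapshots = load_snapshots(store_dir, "demo_tape")
    missing = load_snapshots(store_dir, "unknown_tape")

print(len(snapshots), len(missing))
```

Append-only JSONL keeps each turn's write cheap and leaves earlier snapshots untouched, which fits the "load all historical snapshots" step.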
Given this conversation:
User: "Alice created a task to deploy v1.0"
Agent: [responds]
User: "What did Alice do?"
On the second turn, the agent sees:
## Semantic Memory
### Entities (2):
- person:alice
- task:deploy_v1 (v1.0 deployment)
### Relations (1):
- alice --created--> deploy_v1
---
[rest of context]
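A block like the one above could be rendered from stored snapshots roughly as follows. This is a sketch under assumptions: the function name `format_memory_block` is hypothetical, snapshots are plain dicts in the stored format, and the parenthesised descriptions shown in the example are omitted.

```python
def format_memory_block(snapshots: list[dict]) -> str:
    """Merge snapshots and render them as a markdown-style memory block."""
    entities: dict[str, dict] = {}
    relations: list[dict] = []
    seen: set[tuple] = set()
    for snap in snapshots:
        for ent in snap.get("entities", []):
            entities[ent["id"]] = ent  # later snapshots overwrite earlier ones
        for rel in snap.get("relations", []):
            key = (rel["from"], rel["type"], rel["to"])
            if key not in seen:  # skip relations repeated across snapshots
                seen.add(key)
                relations.append(rel)

    def label(entity_id: str) -> str:
        # Fall back to the raw id if a relation points at an unknown entity.
        ent = entities.get(entity_id)
        return ent["name"].lower() if ent else entity_id

    lines = ["## Semantic Memory", f"### Entities ({len(entities)}):"]
    lines += [f"- {e['type']}:{e['name'].lower()}" for e in entities.values()]
    lines.append(f"### Relations ({len(relations)}):")
    lines += [f"- {label(r['from'])} --{r['type']}--> {label(r['to'])}"
              for r in relations]
    lines.append("---")
    return "\n".join(lines)


snapshot = {
    "entities": [
        {"id": "ent_abc123", "type": "person", "name": "Alice"},
        {"id": "ent_def456", "type": "task", "name": "deploy_v1"},
    ],
    "relations": [{"from": "ent_abc123", "to": "ent_def456", "type": "created"}],
}
block = format_memory_block([snapshot, snapshot])
print(block)
```

Passing the same snapshot twice still yields two entities and one relation, since entities are keyed by id and relations are deduplicated.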
- models.py: Pydantic dataclasses for Entity, Relation, SemanticSnapshot
- extractor.py: LLM-based extraction from tape entries
- store.py: JSONL file storage at ~/.bub/tapes/semantic/
- context.py: Formatting snapshots into system prompts
- hook_impl.py: Bub hookimpl that wires everything together
Snapshots are stored as JSONL (one JSON object per line). A single snapshot, pretty-printed here for readability:
{
"entities": [
{"id": "ent_abc123", "type": "person", "name": "Alice", "metadata": {}},
{"id": "ent_def456", "type": "task", "name": "deploy_v1", "metadata": {"version": "1.0"}}
],
"relations": [
{"from": "ent_abc123", "to": "ent_def456", "type": "created", "metadata": {}}
],
"tape_id": "527c9ae0c6f31e05__0b871d5e50e7c192",
"anchor_id": "anchor_001",
"created_at": "2026-06-06T09:35:00Z"
}
The plugin reuses your main LLM settings (BUB_MODEL, BUB_API_KEY, etc.):
# Your existing setup (e.g., DeepSeek)
export BUB_MODEL=deepseek:deepseek-chat
export BUB_API_KEY=sk-...
Optional configuration:
# Multilang support (default: en)
export BUB_SEMANTIC_LANGS=en,zh-CN,ja
# Query-driven filtering (off by default; set to enable)
export BUB_SEMANTIC_QUERY_DRIVEN=1
No separate credentials needed.
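For illustration, the two optional variables could be parsed along these lines. This is only a sketch: the function name `read_semantic_config` is hypothetical, and treating any non-empty value of BUB_SEMANTIC_QUERY_DRIVEN as "enabled" is an assumption based on the "set to enable" comment, not confirmed plugin behavior.

```python
import os


def read_semantic_config(env=None) -> dict:
    """Parse the two optional settings from an environment mapping."""
    env = os.environ if env is None else env
    raw_langs = env.get("BUB_SEMANTIC_LANGS", "en")
    # Split the comma-separated list and ignore stray whitespace or empty items.
    langs = [code.strip() for code in raw_langs.split(",") if code.strip()]
    # Assumption: any non-empty value turns query-driven filtering on.
    query_driven = bool(env.get("BUB_SEMANTIC_QUERY_DRIVEN", "").strip())
    return {"langs": langs or ["en"], "query_driven": query_driven}


config = read_semantic_config(
    {"BUB_SEMANTIC_LANGS": "en, zh-CN,ja", "BUB_SEMANTIC_QUERY_DRIVEN": "1"}
)
defaults = read_semantic_config({})
print(config)
print(defaults)
```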
Run the test suite:
uv run pytest tests/plugins/semantic_memory/test_semantic_memory.py -v
Coverage: 43 tests across unit and integration scenarios:
- Entity/Relation serialization
- JSONL storage I/O
- LLM extraction with mocks
- Context building
- Multi-turn memory retention
$ uv run bub chat
bub > Alice is a data scientist.
Agent > Got it.
bub > What is Alice's profession?
Agent > Alice is a data scientist. (retrieved from semantic memory)
bub > ,tape.info
[Shows: 2 entries, 1 anchor, ... semantic snapshots: 2]
You: "I need to fix a critical bug in the payment module"
Bot: [Uses semantic memory to track bug, module]
You: "What was I working on?"
Bot: [Recalls semantic memory: bug:critical_payment, module:payment]
$ cat ~/.bub/tapes/semantic/527c9ae0c6f31e05__0b871d5e50e7c192.jsonl | python -m json.tool --json-lines
[Shows stored entities and relations]
- Each extraction call: ~300-500 tokens (depends on entry volume)
- Estimated overhead: +10-20% per turn (configurable via extraction prompt)
- JSONL format: ~1-2 KB per snapshot (grows with entities/relations)
- Typical session: ~50-100 KB
- Extraction is async, non-blocking
- First turn (with extraction): ~500ms extra
- Subsequent turns: ~50ms extra (just loading snapshots)
If semantic extraction fails for any reason:
- LLM error: Returns empty snapshot, continues
- Invalid JSON: Logged as warning, continues
- Storage error: Logged, continues with base context
The agent always works; semantic memory is an optional enhancement.
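The fail-open behavior described above can be sketched as a wrapper that never raises. The names `safe_extract` and `empty_snapshot` are hypothetical, and `call_llm` stands in for whatever callable performs the real extraction request; the plugin's actual error handling may differ.

```python
import json
import logging

logger = logging.getLogger("semantic_memory_sketch")


def empty_snapshot() -> dict:
    """Return a fresh empty snapshot so callers never share mutable lists."""
    return {"entities": [], "relations": []}


def safe_extract(call_llm, text: str) -> dict:
    """Run extraction but never raise: any failure yields an empty snapshot."""
    try:
        raw = call_llm(text)
    except Exception as exc:  # LLM or network error
        logger.warning("semantic extraction failed: %s", exc)
        return empty_snapshot()
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError) as exc:  # invalid or non-string JSON payload
        logger.warning("invalid extraction JSON: %s", exc)
        return empty_snapshot()
    if not isinstance(parsed, dict):
        return empty_snapshot()
    return {
        "entities": parsed.get("entities", []),
        "relations": parsed.get("relations", []),
    }


def failing_llm(text: str) -> str:
    raise RuntimeError("provider unavailable")


on_error = safe_extract(failing_llm, "hi")
on_bad_json = safe_extract(lambda t: "not json", "hi")
on_success = safe_extract(
    lambda t: '{"entities": [{"id": "e1"}], "relations": []}', "hi"
)
print(on_error, on_bad_json, on_success)
```

Because every failure path returns the same empty shape, the caller can format and inject the result without special-casing errors.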
- Vector embeddings for semantic similarity search
- ✅ Query-driven context filtering (reduce prompt bloat by 78-81%)
- ✅ Language-aware cue extraction (14 languages)
- Entity dependency analysis (who depends on what)
- Centrality metrics (who/what is most important)
- Causal reasoning (what led to what)
- Cross-session entity resolution
- Long-term memory across multiple conversations
- Persistent entity graph (not just per-tape)
Q: Plugin not loading?
A: Check that the entry point is registered:
python -c "import importlib.metadata; print(list(importlib.metadata.entry_points(group='bub')))"
Q: Semantic snapshots not appearing?
A: Check that the ~/.bub/tapes/semantic/ directory exists. Check logs with BUB_VERBOSE=1.
Q: LLM calls are expensive?
A: Reduce extraction frequency or use a cheaper model (e.g., DeepSeek distill). Future releases will support model selection per plugin.
Build context with semantic memory. Called by the framework automatically.
Args:
- entries: Iterable of TapeEntry objects
- context: TapeContext instance
- llm: republic.LLM instance (optional; if None, returns base context)
- store: SemanticStore instance (optional; if None, returns base context)
Returns: List of message dicts ready for model input
Extract entities and relations from tape entries.
Args:
- entries: List of TapeEntry objects
- llm: republic.LLM instance for extraction
- tape_id: Session/tape identifier
- anchor_id: Optional anchor point identifier
- max_tokens: Max tokens for LLM response
Returns: SemanticSnapshot with extracted entities/relations
This plugin is part of Bub's extensibility model. To extend:
- Custom entity types: Modify Entity.type enum in models.py
- Custom extractors: Replace or wrap extractor.py
- Custom storage: Implement SemanticStore interface
- Custom formatters: Replace _format_snapshots in context.py
All without modifying Bub core.
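As an illustration of the custom storage option, an in-memory store might look like the sketch below. The real SemanticStore interface is not shown in this document, so the `append` and `load` method names and signatures here are assumptions; check store.py for the actual contract before adapting this.

```python
from collections import defaultdict
from typing import Protocol


class SemanticStoreLike(Protocol):
    """Hypothetical interface; the real method names live in store.py."""

    def append(self, tape_id: str, snapshot: dict) -> None: ...

    def load(self, tape_id: str) -> list[dict]: ...


class InMemorySemanticStore:
    """Keeps snapshots in a dict, useful for tests or ephemeral sessions."""

    def __init__(self) -> None:
        self._snapshots: dict[str, list[dict]] = defaultdict(list)

    def append(self, tape_id: str, snapshot: dict) -> None:
        self._snapshots[tape_id].append(snapshot)

    def load(self, tape_id: str) -> list[dict]:
        # Return a copy so callers cannot mutate the stored list.
        return list(self._snapshots.get(tape_id, []))


store: SemanticStoreLike = InMemorySemanticStore()
store.append("tape_a", {"entities": [], "relations": []})
print(len(store.load("tape_a")), len(store.load("tape_b")))
```

Using a structural Protocol means a replacement store only has to provide matching methods; it does not need to inherit from any plugin class.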
Same as Bub (Apache 2.0)
Questions? See Bub documentation or open an issue.