Semantic Search
How MemNexus finds memories by meaning using vector embeddings and hybrid search.
Traditional search matches keywords. Semantic search matches meaning. MemNexus combines both for the best results.
How it works
When you create a memory, MemNexus generates a vector embedding — a mathematical representation of the memory's meaning in 1536-dimensional space. When you search, your query is also embedded, and MemNexus finds memories whose vectors are closest to your query vector.
"that deployment issue last week"
│
▼ Embed query
[0.012, -0.034, 0.089, ...] (1536 dimensions)
│
▼ Cosine similarity
Compare against all memory embeddings
│
▼ Rank by similarity
Results ordered by meaning match
This means "that deployment issue" finds memories about deployments even if the memory doesn't contain the word "deployment" — because the meanings are close in vector space.
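The closeness measure above can be sketched in plain Python. The three-component vectors here are toy stand-ins for the real 1536-dimensional embeddings; the numbers are illustrative only:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (real ones have 1536 dimensions).
query        = [0.012, -0.034, 0.089]
memory_close = [0.010, -0.030, 0.090]   # similar meaning -> similar direction
memory_far   = [0.900,  0.100, -0.050]  # unrelated meaning

# The conceptually closer memory scores higher, regardless of shared keywords.
print(cosine_similarity(query, memory_close) > cosine_similarity(query, memory_far))  # True
```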
The embedding model
MemNexus uses OpenAI's text-embedding-3-small model to generate embeddings:
- Dimensions: 1,536
- Max tokens: 8,191
- Similarity metric: Cosine similarity
- Index type: Neo4j vector index
Each memory's content is embedded once at creation time and stored alongside the memory node in Neo4j. Query embeddings are generated at search time.
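That embed-once-at-creation, embed-again-at-query pattern can be sketched as follows. This is not MemNexus's actual API: `MemoryStore` is an illustrative in-memory stand-in for the Neo4j-backed store, and `fake_embed` is a deterministic placeholder for the OpenAI embedding call:

```python
import hashlib

def fake_embed(text, dims=8):
    """Deterministic stand-in for an embedding API call (the real model emits 1536 dims)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dims]]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    """Illustrative store: embeds content once at creation, embeds queries at search time."""

    def __init__(self, embed=fake_embed):
        self.embed = embed
        self.memories = []  # (content, embedding) pairs

    def create(self, content):
        # The embedding is computed once and stored alongside the memory.
        self.memories.append((content, self.embed(content)))

    def search(self, query, top_k=3):
        # The query is embedded at search time, then compared against stored vectors.
        q = self.embed(query)
        scored = [(content, cosine(q, vec)) for content, vec in self.memories]
        return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]
```

In the real system the stored vectors live in a Neo4j vector index rather than a Python list, but the lifecycle is the same: one embedding per memory at write time, one per query at read time.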
Hybrid search
Pure semantic search is powerful but not perfect — it can miss exact keyword matches. Pure keyword search finds exact terms but misses meaning. MemNexus runs both and merges the results.
The pipeline
Search query
│
├──→ Semantic search (vector similarity)
│ └── Ranks by meaning
│
├──→ Keyword search (full-text index)
│ └── Ranks by term frequency
│
└──→ Reciprocal Rank Fusion (RRF)
└── Merges and re-ranks results
Step 1: Semantic search
The query is embedded using OpenAI, then compared against memory embeddings using cosine similarity via Neo4j's vector index.
Good at: Finding conceptually related content
Example: "infrastructure problems" matches memories about "server outages" and "database connection failures"
Step 2: Keyword search
The query is matched against memory content using Neo4j's full-text index (powered by Apache Lucene).
Good at: Exact matches, error codes, specific names
Example: "ECONNREFUSED" finds memories containing that exact error string
Step 3: Reciprocal Rank Fusion
Results from both searches are combined using RRF. Each result gets a score based on its rank in each search:
RRF_score = Σ (1 / (k + rank_i))
Where k is a constant (typically 60) and rank_i is the result's position in each search. Results that rank high in both searches get the highest combined scores.
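The fusion step can be sketched directly from that formula. The two ranked lists and memory IDs below are illustrative:

```python
def rrf_merge(ranked_lists, k=60):
    """Reciprocal Rank Fusion: score each item by the sum of 1/(k + rank) across lists."""
    scores = {}
    for ranking in ranked_lists:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["mem-a", "mem-b", "mem-c"]  # ranked by meaning
keyword  = ["mem-b", "mem-d", "mem-a"]  # ranked by term frequency

print(rrf_merge([semantic, keyword]))  # → ['mem-b', 'mem-a', 'mem-d', 'mem-c']
```

Note how `mem-b`, which appears near the top of both lists, beats `mem-a`, which is first in one list but third in the other.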
Why hybrid search matters
| Query | Semantic only | Keyword only | Hybrid |
|---|---|---|---|
| "that deployment issue" | Finds related memories | Misses if "deployment" isn't in content | Best of both |
| "ECONNREFUSED error" | Finds error-related memories | Finds exact matches | Combines precision and recall |
| "Node.js best practices" | Finds conceptual matches | Finds exact mentions | Comprehensive results |
Similarity scores
Search results include a similarity score from 0 to 1:
- 0.9-1.0 — Very close match (almost identical meaning)
- 0.7-0.9 — Strong match (clearly related)
- 0.5-0.7 — Moderate match (somewhat related)
- Below 0.5 — Weak match (tangentially related)
The default threshold is 0.7, which provides a good balance of relevance and recall.
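Applying the threshold is a simple filter over scored results. The memories and scores below are made up; the 0.7 cutoff mirrors the default described above:

```python
def filter_by_threshold(results, threshold=0.7):
    """Keep only (content, score) results at or above the similarity threshold, best first."""
    kept = [r for r in results if r[1] >= threshold]
    return sorted(kept, key=lambda r: r[1], reverse=True)

results = [("deploy postmortem", 0.91), ("db outage", 0.74), ("lunch notes", 0.31)]
print(filter_by_threshold(results))  # → [('deploy postmortem', 0.91), ('db outage', 0.74)]
```

Raising the threshold trades recall for precision: at 0.9 only the near-identical match would survive.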
Search tips
- Natural language works best — Write queries like you'd describe the memory to a person
- Be specific — "React state management with Redux" outperforms "state"
- Use topic filters — Narrow results when you know the category
- Combine with time filters — `--recent 7d` limits results to recent memories
- Try rephrasing — If initial results aren't great, rephrase the query