I wanted a setup where my assistant could search my notes locally—fast, private, and without shipping my personal text to a third-party embedding API. The goal was simple:
- Keep “memory” (markdown notes) on my machine
- Use a local embedding + vector search backend
- Prefer GPU acceleration ...
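The core of that setup is just "embed every note, then rank by cosine similarity against an embedded query." Here is a minimal, self-contained sketch of that loop. The `embed` function below is a deliberately toy stand-in (a hash-based bag of character trigrams) so the example runs anywhere; in the real setup it would be replaced by a local embedding model running on the GPU.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy stand-in embedding: hash character trigrams into a fixed-size
    # vector. A real setup would call a local embedding model here; this
    # stub only exists to make the search loop runnable.
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        bucket = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def search(query: str, notes: dict[str, str], top_k: int = 3):
    # Brute-force cosine similarity over all notes. For a personal corpus
    # of markdown files this is plenty fast; a vector index only matters
    # at much larger scale.
    q = embed(query)
    scored = [(name, float(embed(body) @ q)) for name, body in notes.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

# Hypothetical note files standing in for the markdown "memory".
notes = {
    "gpu.md": "Notes on CUDA drivers and GPU acceleration for embeddings.",
    "garden.md": "Tomato seedlings need hardening off before transplanting.",
}
print(search("gpu embedding acceleration", notes, top_k=1))
```

Everything stays on the local machine: the notes, the (stand-in) embedding step, and the similarity ranking, which is the whole point of the setup.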