Hybrid (semantic + keyword) search tool.
Opinionated:
- HNSW cosine similarity through nmslib
- BM25 Lucene variant through bm25s
- ONNX models only, default CPU execution through onnxruntime
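Hybrid search means the two ranked lists (HNSW cosine-similarity hits and BM25 hits) have to be merged into one. The snippet below is a minimal, hypothetical sketch using reciprocal rank fusion, one common merging scheme; searchlite's actual fusion strategy is not documented here, and the doc ids are made up.

```python
# Conceptual sketch of hybrid fusion via reciprocal rank fusion (RRF).
# NOTE: this is an illustration, not searchlite's API or fusion method.

def rrf(rankings, k=60):
    """Merge several ranked lists of doc ids into one fused ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Documents ranked high in any list accumulate more score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["d3", "d1", "d2"]  # hypothetical HNSW (vector) hits
keyword = ["d3", "d2", "d4"]   # hypothetical BM25 (keyword) hits
print(rrf([semantic, keyword]))
```

A document that appears near the top of both lists ("d3" here) ends up ranked first, which is the behavior you want from a hybrid retriever.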
Trying to keep dependencies minimal. Dependency graph:
```text
searchlite v0.1.0
├── bm25s v0.2.14
│   ├── numpy v2.4.0
│   └── scipy v1.16.3
│       └── numpy v2.4.0
├── nmslib v2.1.2
│   ├── numpy v2.4.0
│   ├── pybind11 v3.0.1
│   └── scipy v1.16.3 (*)
├── numpy v2.4.0
├── onnxruntime v1.23.2
│   ├── coloredlogs v15.0.1
│   │   └── humanfriendly v10.0
│   ├── flatbuffers v25.12.19
│   ├── numpy v2.4.0
│   ├── packaging v25.0
│   ├── protobuf v6.33.2
│   └── sympy v1.14.0
│       └── mpmath v1.3.0
└── tokenizers v0.22.1
    └── huggingface-hub v1.2.3
        ├── filelock v3.20.1
        ├── fsspec v2025.12.0
        ├── hf-xet v1.2.0
        ├── httpx v0.28.1
        │   ├── anyio v4.12.0
        │   │   ├── idna v3.11
        │   │   └── typing-extensions v4.15.0
        │   ├── certifi v2025.11.12
        │   ├── httpcore v1.0.9
        │   │   ├── certifi v2025.11.12
        │   │   └── h11 v0.16.0
        │   └── idna v3.11
        ├── packaging v25.0
        ├── pyyaml v6.0.3
        ├── shellingham v1.5.4
        ├── tqdm v4.67.1
        ├── typer-slim v0.20.1
        │   ├── click v8.3.1
        │   └── typing-extensions v4.15.0
        └── typing-extensions v4.15.0
```
- [ ] Improve CLI
- [ ] Server
- [ ] Dockerize
- [ ] Release to PyPI?
- Download an ONNX model from Hugging Face, such as mixedbread-ai/mxbai-embed-large-v1. Also download the model's `tokenizer.json`.
- Do the indexing. See `examples/hybrid_idx.py`. It is better to have `INDEX_FILE_PATH` and `INDEX_DIR_PATH` point inside a `data/` directory.