neil-vqa/searchlite

searchlite

Hybrid (semantic + keyword) search tool.

Opinionated:

  • HNSW cosine similarity through nmslib
  • BM25 lucene variant through bm25s
  • ONNX models only, default CPU execution through onnxruntime
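
The README does not say how the semantic (HNSW) and keyword (BM25) results are combined. One common option is reciprocal rank fusion (RRF); the sketch below illustrates that technique only and is not taken from searchlite's code. The function name, the document ids, and the constant `k=60` are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    Hypothetical helper: each doc gets 1 / (k + rank) from every list
    it appears in, and the contributions are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up result lists standing in for the two retrievers.
semantic_hits = ["doc3", "doc1", "doc7"]  # e.g. from an HNSW cosine index
keyword_hits = ["doc1", "doc9", "doc3"]   # e.g. from a BM25 index
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

Documents that rank well in both lists (here `doc1` and `doc3`) rise to the top, while documents found by only one retriever still appear further down.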

Trying to be minimal with dependencies. Dependency graph:
searchlite v0.1.0
├── bm25s v0.2.14
│   ├── numpy v2.4.0
│   └── scipy v1.16.3
│       └── numpy v2.4.0
├── nmslib v2.1.2
│   ├── numpy v2.4.0
│   ├── pybind11 v3.0.1
│   └── scipy v1.16.3 (*)
├── numpy v2.4.0
├── onnxruntime v1.23.2
│   ├── coloredlogs v15.0.1
│   │   └── humanfriendly v10.0
│   ├── flatbuffers v25.12.19
│   ├── numpy v2.4.0
│   ├── packaging v25.0
│   ├── protobuf v6.33.2
│   └── sympy v1.14.0
│       └── mpmath v1.3.0
└── tokenizers v0.22.1
    └── huggingface-hub v1.2.3
        ├── filelock v3.20.1
        ├── fsspec v2025.12.0
        ├── hf-xet v1.2.0
        ├── httpx v0.28.1
        │   ├── anyio v4.12.0
        │   │   ├── idna v3.11
        │   │   └── typing-extensions v4.15.0
        │   ├── certifi v2025.11.12
        │   ├── httpcore v1.0.9
        │   │   ├── certifi v2025.11.12
        │   │   └── h11 v0.16.0
        │   └── idna v3.11
        ├── packaging v25.0
        ├── pyyaml v6.0.3
        ├── shellingham v1.5.4
        ├── tqdm v4.67.1
        ├── typer-slim v0.20.1
        │   ├── click v8.3.1
        │   └── typing-extensions v4.15.0
        └── typing-extensions v4.15.0

Roadmap?

  • [ ] Improve CLI
  • [ ] Server
  • [ ] Dockerize
  • [ ] Release to PyPI?

Get started

  1. Download an ONNX model from Hugging Face, such as mixedbread-ai/mxbai-embed-large-v1, along with the model's tokenizer.json.

  2. Build the index. See examples/hybrid_idx.py. It is best to keep INDEX_FILE_PATH and INDEX_DIR_PATH inside a data/ directory.
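
As a rough sketch of the layout suggested in step 2, the two paths could be defined relative to a `data/` directory like this. Only the names `INDEX_FILE_PATH` and `INDEX_DIR_PATH` come from the README; the file and directory names below are hypothetical, and what each path stores is defined by examples/hybrid_idx.py, not by this snippet.

```python
from pathlib import Path

# Keep all index artifacts together under data/ (created if missing).
DATA_DIR = Path("data")
DATA_DIR.mkdir(parents=True, exist_ok=True)

# Hypothetical names, chosen only for illustration.
INDEX_FILE_PATH = DATA_DIR / "semantic.index"
INDEX_DIR_PATH = DATA_DIR / "keyword_index"
```

Keeping both under one directory makes it easy to exclude index artifacts from version control or to mount them as a single volume later.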
