
LM WebUI πŸ›‘οΈ

LM-WebUI is a unified local AI interface and LLM runtime platform, built for privacy-first, sovereign AI systems. It natively supports local GGUF model inference, Ollama, and API-based models such as OpenAI and Gemini, along with multimodal RAG pipelines and persistent vector memory.

Run AI under your control


Built as open source for the community, developers, system integrators, and organizations that require local inference, reproducibility, and infrastructure-level control, lm-webui bridges the power of modern cloud LLM features with the integrity of local data ownership.

Run fully offline, integrate with cloud APIs when needed, and deploy across environments without sacrificing performance, privacy, or sovereignty.

⚠️ Work in Progress (WIP): lm-webui is under active development. Features, APIs, and architecture may change as the project evolves. Contributions, feedback, and early testing are welcome, but expect breaking changes.


πŸš€ Quick Start

One-Line Installation (Recommended)

curl -sSL https://raw.githubusercontent.com/lm-webui/lm-webui/main/install.sh | bash

This will:

  1. Check for Docker and Docker Compose
  2. Clone the repository (if needed)
  3. Set up environment configuration
  4. Build and start the Docker containers
  5. Provide access instructions

Access the application at http://localhost:7070
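If you prefer to inspect what a piped installer does before running it, step 1 above boils down to a prerequisite check along these lines (a sketch only; the real install.sh is authoritative, and `check_prereqs` is a hypothetical helper):

```shell
# Sketch of the installer's first step: verify required tools exist on PATH.
check_prereqs() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "error: '$cmd' is required but not installed" >&2
      return 1
    fi
  done
  return 0
}

# The script would stop before touching Docker if prerequisites are missing:
# check_prereqs docker git || exit 1
```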


⚑ Core Features

  • Authentication: Secure JWT-based authentication with refresh tokens and persistent user sessions. Designed for multi-user deployments and role-aware environments.
  • WebSocket Streaming: Bidirectional streaming with structured events, typing indicators, cancellation support, and step-by-step reasoning visibility.
  • Hardware Acceleration: Automatic CUDA, ROCm, and Metal detection with dynamic memory and layer optimization for efficient local execution across GPUs and CPUs.
  • GGUF Runtime: Built-in GGUF model lifecycle management (download, load, quantize, and serve models locally) with HuggingFace compatibility.
  • RAG Engine: Modular retrieval pipeline powered by Qdrant for vector search, reranking, semantic chunking, and context injection.
  • Multimodal Processing: Image and document processing with OCR, embedding, and structured content extraction for unified chat workflows.
  • Knowledge Graph: Triplet-based semantic memory and entity relationship tracking to enhance long-term contextual understanding.
  • Self-Hosted Ready: Effortless on-prem, private cloud, and isolated deployments with no required external telemetry.
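To illustrate the chunking stage of a retrieval pipeline, a minimal overlap-based text splitter might look like this (a simplified sketch; the actual RAG engine uses semantic chunking and Qdrant-backed retrieval, and `chunk_text` is a hypothetical helper, not project code):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so context is
    not lost at chunk boundaries (a stand-in for semantic chunking)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

Real semantic chunkers split on sentence and topic boundaries rather than fixed offsets, but the overlap idea is the same: adjacent chunks share a window of text so a retrieved chunk carries its surrounding context.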

πŸ€— GGUF Runtime Highlights

  • Model Management: Upload/download GGUF models with progress tracking
  • HuggingFace Integration: Direct download from HuggingFace repositories
  • Hardware Compatibility: Automatic model validation for your system
  • Local Registry: Manage and organize local GGUF models
  • Seamless Integration: Use GGUF models directly in chat conversations
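The validation step in a local registry can be as simple as checking the GGUF magic header. The sketch below is illustrative only (`scan_gguf_models` is a hypothetical helper, not the project's actual registry code); the four-byte `GGUF` magic at the start of every GGUF file comes from the GGUF specification:

```python
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file per the GGUF spec

def scan_gguf_models(directory: str) -> list[dict]:
    """Walk a directory and return basic metadata for files that carry
    the GGUF magic header (a stand-in for the local model registry)."""
    models = []
    for path in Path(directory).glob("*.gguf"):
        with path.open("rb") as f:
            header = f.read(8)
        if header[:4] != GGUF_MAGIC:
            continue  # skip files that only have the .gguf extension
        version = struct.unpack("<I", header[4:8])[0] if len(header) == 8 else None
        models.append({
            "name": path.name,
            "size_bytes": path.stat().st_size,
            "gguf_version": version,
        })
    return models
```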

πŸ“– Documentation

For detailed documentation, see the docs/ directory.


Architecture Overview

lm-webui follows a modern microservices-inspired architecture:

lm-webui/
β”œβ”€β”€ backend/           # FastAPI backend with WebSocket streaming
β”‚   β”œβ”€β”€ app/          # Application code
β”‚   β”‚   β”œβ”€β”€ routes/   # API endpoints (chat, auth, gguf, etc.)
β”‚   β”‚   β”œβ”€β”€ streaming/# WebSocket streaming system
β”‚   β”‚   β”œβ”€β”€ rag/      # RAG pipeline with vector search
β”‚   β”‚   β”œβ”€β”€ services/ # Core services (GGUF, model management, etc.)
β”‚   β”‚   β”œβ”€β”€ hardware/ # Hardware acceleration detection
β”‚   β”‚   └── security/ # Authentication & encryption
β”‚   └── tests/        # Backend tests
β”œβ”€β”€ frontend/         # React/TypeScript frontend
β”‚   β”œβ”€β”€ src/          # Source code
β”‚   β”‚   β”œβ”€β”€ components/# UI components
β”‚   β”‚   β”œβ”€β”€ services/ # API and WebSocket services
β”‚   β”‚   β”œβ”€β”€ hooks/    # Custom React hooks
β”‚   β”‚   └── types/    # TypeScript type definitions
β”‚   └── __tests__/    # Frontend tests
└── docs/             # Documentation
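To make the streaming/ module concrete: the structured events mentioned under Core Features (tokens, typing indicators, reasoning steps, cancellation) could be modeled roughly as below. The event names and shape are illustrative assumptions, not the project's actual schema:

```python
import json
from dataclasses import dataclass, field
from enum import Enum

class EventType(str, Enum):
    # Illustrative event names; the real streaming module defines its own set.
    TOKEN = "token"
    TYPING = "typing"
    REASONING_STEP = "reasoning_step"
    CANCELLED = "cancelled"
    DONE = "done"

@dataclass
class StreamEvent:
    type: EventType
    data: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize to the JSON text frame sent over the WebSocket."""
        return json.dumps({"type": self.type.value, "data": self.data})
```

Typed events like these let the frontend switch on `type` to drive the UI (append a token, show a typing indicator, reveal a reasoning step) from a single WebSocket connection.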

πŸ”§ Development

Prerequisites

  • Backend: Python 3.9+, PostgreSQL/SQLite
  • Frontend: Node.js 16+, npm/yarn
  • Optional: Docker, CUDA/ROCm for GPU acceleration
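For a sense of what CUDA/ROCm detection involves, here is a minimal best-effort sketch. The tool names (`nvidia-smi`, `rocminfo`, `rocm-smi`) are real vendor utilities, but `detect_accelerator` is a hypothetical stand-in for the backend's hardware/ module, not its actual logic:

```python
import platform
import shutil

def detect_accelerator() -> str:
    """Best-effort guess at the available acceleration backend."""
    if platform.system() == "Darwin":
        return "metal"  # Apple GPUs are driven through Metal
    if shutil.which("nvidia-smi"):
        return "cuda"   # NVIDIA driver tooling present on PATH
    if shutil.which("rocminfo") or shutil.which("rocm-smi"):
        return "rocm"   # AMD ROCm tooling present on PATH
    return "cpu"        # fall back to CPU-only inference
```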

Development Setup

For development work, you can use either the Docker-based setup or manual installation:

Docker-based Development

# Quick setup using the installation script
curl -sSL https://raw.githubusercontent.com/lm-webui/lm-webui/main/install.sh | bash

# Or manually with Docker Compose
git clone https://github.com/lm-webui/lm-webui.git
cd lm-webui
docker-compose up --build

Manual Development Setup

# 1. Clone and setup
git clone https://github.com/lm-webui/lm-webui.git
cd lm-webui

# 2. Backend setup
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-test.txt  # For testing

# 3. Frontend setup
cd ../frontend
npm install

# 4. Run tests
cd ../backend && pytest
cd ../frontend && npm test

πŸ“ Roadmap & Known Limitations

Known Limitations

  • Some multimodal pipelines are still experimental
  • Hardware acceleration behavior may vary across GPU vendors
  • RAG metadata handling is functional but not yet fully standardized
  • Media library under development

Roadmap (High-Level)

Near-term

  • Stabilize core orchestration APIs and configuration schema
  • Improve GGUF deployment automation and quantization presets
  • Expand hardware detection and backend fallback logic

Mid-term

  • Add stronger RAG governance (source versioning, metadata filters)
  • Introduce model bundle validation and optional signature checks
  • Improve workflow reproducibility and export/import support

Long-term

  • Advanced scheduling for multi-GPU and multi-model workloads
  • Adapter/LoRA management for task-specific fine-tuning
  • Enterprise features

🀝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ”— Links


⭐ Star History

Let’s shape the future of local AI together πŸ€œπŸ€›
