
Scan for code locally stored and find specific agenda


cbroberg/codescan


CodeScan

AI-powered semantic code search tool for your local repositories. Find code by describing what you're looking for in natural language.

Example: "Where did I implement Microsoft Teams notifications?" or "Show me authentication logic using OAuth"

Features

  • 🔍 Semantic Search: AI-powered understanding of code intent, not just keyword matching
  • 💬 Interactive Chat: Multi-turn conversation mode for exploratory code discovery
  • 📦 Local Indexing: SQLite database with semantic chunking of functions and classes
  • 🚀 CLI Interface: Terminal-based interaction for quick searches
  • 🎨 Web GUI: Next.js dashboard (coming in Phase 5)
  • 🔗 VS Code Integration: Click files to open directly in VS Code
  • ⚡ Fast: Two-stage search (keyword pre-filter + semantic analysis) for speed
  • 📊 Technology Detection: Automatically identifies tech stack in your repositories

Quick Start

Prerequisites

  • Node.js 20+ and pnpm (or npm)
  • ANTHROPIC_API_KEY (Claude API key) in .env
  • VS Code (optional, for file opening)

Installation

  1. Clone the repository and install dependencies:

    git clone https://github.com/cbroberg/codescan.git
    cd codescan
    pnpm install
  2. Create .env file with your API key:

    cp .env.example .env
    # Edit .env and add your ANTHROPIC_API_KEY=sk-ant-...
  3. Build packages:

    pnpm build

Global CLI Installation (Optional)

To run the codescan command directly instead of going through pnpm cli, link the CLI globally:

# Link CLI to system PATH
cd packages/cli
pnpm link --global

# Now you can use codescan from anywhere:
codescan init
codescan index
codescan search "your query"
codescan server start

If using npm instead of pnpm:

cd packages/cli
npm link

To unlink later:

pnpm unlink --global
# or: npm unlink -g codescan

Usage

Starting the API Server

The CLI commands need the backend API server running. Start it in one terminal:

If using global CLI (recommended):

codescan server start

Or if using pnpm directly:

pnpm server
# Or: pnpm cli server start

The server runs on http://localhost:3000 by default. Set API_PORT in .env to use a different port.

Initializing CodeScan

Configure which directories to search:

With global CLI:

codescan init /path/to/your/code "My Code"

Or with pnpm:

pnpm cli init /path/to/your/code "My Code"

You'll be prompted to add directories. Example:

? Enter directory path: /Users/yourusername/projects
? Include subdirectories? Yes
? Add another directory? No

Building the Index

Index your repositories (one-time or periodic):

With global CLI:

codescan index

Or with pnpm:

pnpm cli index

This scans files, extracts functions and classes, and builds the semantic search index. The first run can take a minute or two, depending on code volume.

Searching Code

Quick Search

Search from command line:

With global CLI:

codescan search "Microsoft Teams notifications"
codescan search "authentication logic" --tech teams-sdk
codescan search "React hooks" --repo my-web-app --max 10

Or with pnpm:

pnpm cli search "Microsoft Teams notifications"
pnpm cli search "authentication logic" --tech teams-sdk
pnpm cli search "React hooks" --repo my-web-app --max 10

Options:

  • --tech <technology> - Filter by tech (e.g., "react", "teams-sdk")
  • --repo <repository> - Search specific repository
  • --lang <language> - Filter by language (e.g., "typescript", "python")
  • --max <number> - Maximum results (default: 20)
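
The options above amount to a set of filters plus a result cap. As a rough illustration of how they could combine, here is a minimal sketch. The `SearchOptions` and `SearchHit` types and the `applyOptions` helper are hypothetical names chosen for this example and are not taken from the CodeScan source.

```typescript
// Hypothetical shapes for illustration only; not CodeScan's actual types.
interface SearchOptions {
  tech?: string; // --tech
  repo?: string; // --repo
  lang?: string; // --lang
  max?: number;  // --max (documented default: 20)
}

interface SearchHit {
  repo: string;
  file: string;
  language: string;
  tech: string[];
  score: number;
}

// Keep only hits that match every supplied filter, best score first,
// capped at the requested maximum.
function applyOptions(hits: SearchHit[], options: SearchOptions): SearchHit[] {
  const { tech, repo, lang, max = 20 } = options;
  return hits
    .filter((hit) => tech === undefined || hit.tech.includes(tech))
    .filter((hit) => repo === undefined || hit.repo === repo)
    .filter((hit) => lang === undefined || hit.language === lang)
    .sort((a, b) => b.score - a.score)
    .slice(0, max);
}

const sampleHits: SearchHit[] = [
  { repo: "my-web-app", file: "src/hooks/useAuth.ts", language: "typescript", tech: ["react"], score: 82 },
  { repo: "auth-service", file: "src/lib/oauth.ts", language: "typescript", tech: ["express"], score: 91 },
  { repo: "my-web-app", file: "src/hooks/useTheme.ts", language: "typescript", tech: ["react"], score: 67 },
];

// Equivalent in spirit to: codescan search "React hooks" --repo my-web-app --max 10
const filtered = applyOptions(sampleHits, { repo: "my-web-app", max: 10 });
console.log(filtered.map((hit) => hit.file));
```

Filters that are omitted match everything, so a bare search with no flags only applies the default cap of 20.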

Interactive Chat

For exploratory searching with follow-ups:

With global CLI:

codescan chat

Or with pnpm:

pnpm cli chat

Example session:

> Find code using OAuth

🔍 Searching repositories...

Found 5 matches:
[auth-service] src/lib/oauth.ts - Login implementation
[web-app] src/auth/provider.ts - OAuth provider setup
...

> Show me the first one in detail
[Full code displayed with syntax highlighting]

> How does it integrate with the API?
[AI responds and searches for related code...]

Checking Status

View indexing progress and statistics:

With global CLI:

codescan status

Or with pnpm:

pnpm cli status

Shows:

  • Index status (indexing, ready)
  • Total repositories, files, chunks
  • Database size
  • Recent searches

Listing Repositories

See all indexed repositories:

With global CLI:

codescan repos

Or with pnpm:

pnpm cli repos

Shows repository name, path, file count, tech stack, and last indexed time.

Configuration

CodeScan stores configuration in your home directory: ~/.config/codescan/config.json

The file is created automatically on first run with default values. Edit it manually or run codescan init to reconfigure.

Example configuration:

{
  "searchPaths": [
    {
      "path": "/Users/yourusername/projects",
      "name": "My Projects",
      "exclude": ["node_modules", "dist", ".git"]
    }
  ],
  "search": {
    "maxResults": 20,
    "minRelevanceScore": 50
  },
  "ai": {
    "model": "claude-3-5-sonnet-20241022",
    "maxTokens": 4096
  }
}
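
Because the file can be edited by hand, it may end up with fields missing. The sketch below shows one way to type the configuration and fill gaps with fallback values. The type names and the `withDefaults` helper are hypothetical, and the fallback values are simply copied from the example above; whether they match CodeScan's real defaults is an assumption.

```typescript
// Types mirror the example config.json; the names are illustrative.
interface SearchPath {
  path: string;
  name: string;
  exclude: string[];
}

interface CodeScanConfig {
  searchPaths: SearchPath[];
  search: { maxResults: number; minRelevanceScore: number };
  ai: { model: string; maxTokens: number };
}

// Fallback values taken from the example configuration (assumed defaults).
const FALLBACK_CONFIG: CodeScanConfig = {
  searchPaths: [],
  search: { maxResults: 20, minRelevanceScore: 50 },
  ai: { model: "claude-3-5-sonnet-20241022", maxTokens: 4096 },
};

// A hand-edited file may omit any section or any field inside a section.
type PartialConfig = {
  searchPaths?: SearchPath[];
  search?: Partial<CodeScanConfig["search"]>;
  ai?: Partial<CodeScanConfig["ai"]>;
};

// Hypothetical helper: merge a partial config over the fallback values.
function withDefaults(raw: PartialConfig): CodeScanConfig {
  return {
    searchPaths: raw.searchPaths ?? FALLBACK_CONFIG.searchPaths,
    search: { ...FALLBACK_CONFIG.search, ...raw.search },
    ai: { ...FALLBACK_CONFIG.ai, ...raw.ai },
  };
}

// Example: a file that only overrides search.maxResults.
const parsedConfig = JSON.parse('{"search": {"maxResults": 5}}') as PartialConfig;
const effectiveConfig = withDefaults(parsedConfig);
console.log(effectiveConfig.search, effectiveConfig.ai.model);
```

Merging per section rather than per file means overriding one field, such as maxResults, does not discard its siblings.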

Architecture

Monorepo structure with four packages:

  • packages/api - Express backend server with:

    • SQLite semantic indexing
    • Two-stage search engine
    • Claude API integration for semantic analysis
    • RESTful API endpoints
    • Chat session management
  • packages/cli - Terminal interface:

    • Commands for indexing, searching, chatting
    • Interactive prompts with Inquirer
    • Formatted output with syntax highlighting
    • VS Code integration links
  • packages/web - Next.js GUI (in development):

    • Dashboard with statistics
    • Search interface with filters
    • Interactive chat
    • Code preview with Monaco Editor
    • Repository management
  • packages/shared - Shared TypeScript types used by all packages

How It Works

Two-Stage Search

  1. Stage 1 - Keyword Pre-filtering (fast):

    • SQLite FTS5 full-text search
    • Narrows the index to roughly 100-1000 candidate code chunks in under 100 ms
    • Cheap (no API calls)
  2. Stage 2 - Semantic Analysis (accurate):

    • Send top candidates to Claude API
    • AI scores relevance (0-100)
    • Explains why each result matches
    • Ranks by multiple signals: semantic score, keyword quality, recency, code simplicity, tech match
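
The two stages can be sketched as a small pipeline. This is a simplified illustration and not the project's implementation: stage 1 here is a plain in-memory term match standing in for SQLite FTS5, and stage 2 takes an injected scoring function standing in for the Claude API call, so the example runs offline. All names (`Chunk`, `keywordPrefilter`, `twoStageSearch`, `fakeScorer`) are hypothetical.

```typescript
interface Chunk {
  id: number;
  file: string;
  text: string;
}

interface ScoredChunk {
  chunk: Chunk;
  score: number; // relevance on the 0-100 scale described above
}

// Stand-in for the Claude call: any async function returning a 0-100 score.
type SemanticScorer = (query: string, chunk: Chunk) => Promise<number>;

// Stage 1 (stand-in for FTS5): keep chunks containing at least one query
// term, ordered by how many distinct terms they contain.
function keywordPrefilter(chunks: Chunk[], query: string, limit: number): Chunk[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return chunks
    .map((chunk) => {
      const text = chunk.text.toLowerCase();
      return { chunk, hits: terms.filter((term) => text.includes(term)).length };
    })
    .filter((candidate) => candidate.hits > 0)
    .sort((a, b) => b.hits - a.hits)
    .slice(0, limit)
    .map((candidate) => candidate.chunk);
}

// Stage 2: score only the surviving candidates, drop low scores, rank the rest.
async function twoStageSearch(
  chunks: Chunk[],
  query: string,
  scorer: SemanticScorer,
  candidateLimit = 100,
  minRelevanceScore = 50,
  maxResults = 20,
): Promise<ScoredChunk[]> {
  const candidates = keywordPrefilter(chunks, query, candidateLimit);
  const scored = await Promise.all(
    candidates.map(async (chunk) => ({ chunk, score: await scorer(query, chunk) })),
  );
  return scored
    .filter((result) => result.score >= minRelevanceScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}

const demoChunks: Chunk[] = [
  { id: 1, file: "src/lib/oauth.ts", text: "function login() { /* oauth token exchange */ }" },
  { id: 2, file: "src/notify/teams.ts", text: "function sendTeamsNotification() {}" },
  { id: 3, file: "src/util/date.ts", text: "function formatDate() {}" },
];

// Fake scorer for the demo; a real one would call the model.
const fakeScorer: SemanticScorer = async (_query, chunk) =>
  chunk.file.includes("oauth") ? 90 : 10;

twoStageSearch(demoChunks, "oauth login", fakeScorer).then((results) => {
  console.log(results.map((result) => `${result.chunk.file} (${result.score})`));
});
```

The point of the split is cost: the expensive scorer only ever sees the chunks that survive the cheap pre-filter.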

Indexing Process

  1. Scan configured directories for source code
  2. Parse files using tree-sitter AST parser (TypeScript, JavaScript, Python, etc.)
  3. Extract functions, classes, methods as semantic chunks
  4. Detect technology stack (package.json, imports)
  5. Store in SQLite with full-text search indexes
  6. Watch for file changes (upcoming feature)
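
Steps 1 and 4 above can be illustrated with Node's standard library alone. The sketch below is an assumption-laden simplification: the extension list, the `scanSourceFiles` and `detectTech` helpers, and the dependency-to-label table are all hypothetical. The AST-based chunk extraction in steps 2 and 3 is deliberately left out.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed set of source extensions; the real list may differ.
const SOURCE_EXTENSIONS = new Set([".ts", ".tsx", ".js", ".jsx", ".py"]);

// Step 1: recursively collect source files, skipping excluded names
// (the same idea as the "exclude" array in config.json).
function scanSourceFiles(root: string, exclude: string[]): string[] {
  const found: string[] = [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    if (exclude.includes(entry.name)) continue;
    const fullPath = path.join(root, entry.name);
    if (entry.isDirectory()) {
      found.push(...scanSourceFiles(fullPath, exclude));
    } else if (entry.isFile() && SOURCE_EXTENSIONS.has(path.extname(entry.name))) {
      found.push(fullPath);
    }
  }
  return found.sort();
}

// Step 4: map npm dependency names to tech labels (hypothetical table).
const TECH_BY_DEPENDENCY = new Map<string, string>([
  ["react", "react"],
  ["next", "nextjs"],
  ["express", "express"],
  ["@microsoft/teams-js", "teams-sdk"],
]);

interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

function detectTech(pkg: PackageJson): string[] {
  const names = Object.keys({ ...pkg.dependencies, ...pkg.devDependencies });
  const labels = names
    .map((name) => TECH_BY_DEPENDENCY.get(name))
    .filter((label): label is string => label !== undefined);
  return [...new Set(labels)].sort();
}

// Demo on a throwaway directory so the sketch is safe to run anywhere.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "codescan-demo-"));
fs.mkdirSync(path.join(demoRoot, "src"));
fs.mkdirSync(path.join(demoRoot, "node_modules"));
fs.writeFileSync(path.join(demoRoot, "src", "app.ts"), "export const x = 1;\n");
fs.writeFileSync(path.join(demoRoot, "node_modules", "dep.js"), "");
fs.writeFileSync(path.join(demoRoot, "README.md"), "");

const scannedFiles = scanSourceFiles(demoRoot, ["node_modules", "dist", ".git"]);
const detectedTech = detectTech({ dependencies: { react: "^18.0.0", express: "^4.0.0" } });
console.log(scannedFiles.map((file) => path.relative(demoRoot, file)), detectedTech);

fs.rmSync(demoRoot, { recursive: true, force: true });
```

Excluding node_modules, dist, and .git during the walk, rather than filtering afterwards, avoids descending into what are usually the largest directories.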

Development

Available scripts (run with pnpm):

# Start everything
pnpm dev              # Runs API + CLI in watch mode

# Run packages individually
pnpm api              # Just start API server
pnpm cli              # CLI commands in watch mode
pnpm web              # Start Next.js dev server (Phase 5)

# Build
pnpm build            # Build all packages
pnpm clean            # Remove dist/ folders

# Monorepo
pnpm turbo:run <task>  # Run task across all packages with Turborepo

Environment Variables

Create .env in project root:

# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional (defaults shown)
NODE_ENV=development
API_PORT=3000
DB_PATH=~/.config/codescan/index.db
LOG_LEVEL=info
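
As an illustration of how these variables and their defaults could be resolved at startup, here is a minimal sketch. The `Settings` shape and the `loadSettings` and `expandHome` helpers are hypothetical names, not CodeScan's code. One detail worth noting: a leading ~ is expanded by the shell, not by Node.js, so a DB_PATH read from a .env file needs explicit expansion before it is used as a file path.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical settings shape for this sketch.
interface Settings {
  anthropicApiKey: string;
  nodeEnv: string;
  apiPort: number;
  dbPath: string;
  logLevel: string;
}

// Expand a leading "~" to the user's home directory.
function expandHome(filePath: string, home: string = os.homedir()): string {
  if (filePath === "~") return home;
  return filePath.startsWith("~/") ? path.join(home, filePath.slice(2)) : filePath;
}

// Resolve settings from an environment map, applying the documented defaults.
function loadSettings(env: Record<string, string | undefined>): Settings {
  const anthropicApiKey = env["ANTHROPIC_API_KEY"];
  if (!anthropicApiKey) {
    throw new Error("ANTHROPIC_API_KEY is required");
  }
  const apiPort = Number(env["API_PORT"] ?? "3000");
  if (!Number.isInteger(apiPort) || apiPort < 1 || apiPort > 65535) {
    throw new Error(`Invalid API_PORT: ${env["API_PORT"]}`);
  }
  return {
    anthropicApiKey,
    nodeEnv: env["NODE_ENV"] ?? "development",
    apiPort,
    dbPath: expandHome(env["DB_PATH"] ?? "~/.config/codescan/index.db"),
    logLevel: env["LOG_LEVEL"] ?? "info",
  };
}

// Example with only the required variable set; pass process.env in real use.
const demoSettings = loadSettings({ ANTHROPIC_API_KEY: "sk-ant-example" });
console.log(demoSettings.apiPort, demoSettings.logLevel, demoSettings.dbPath);
```

Failing fast on a missing key or an invalid port surfaces configuration mistakes at startup instead of on the first search.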

Tips

  • CLI commands too long? Link the CLI globally (see Global CLI Installation) to run codescan directly.
  • First indexing slow? That's normal and depends on your code volume: 1000 files ≈ 30 seconds.
  • API key cost? Semantic search uses Claude API. Cost depends on query complexity (~$0.01-$0.10 per search).
  • Want to re-index? Just run codescan index or pnpm cli index again. It detects changes.
  • API not running? Start it in another terminal with codescan server start or pnpm server
  • Something not working? Check logs with LOG_LEVEL=debug pnpm api for detailed output.

Next Steps

  • Phase 5: Web GUI with dashboard and visual search interface
  • Real-time file watching for incremental index updates
  • Support for more languages (Go, Rust, Java)
  • Code diff analysis and git blame integration
  • Export search results to various formats

License

MIT
