agentic coding capabilities #4

Merged
DavidVeksler merged 2 commits into master from claude/agentic-coding-brainstorm-014mNjVVdMnGYzBMP4Co4QDa
Nov 24, 2025

@DavidVeksler
Owner

…oding

This major update transforms CodeContext from a simple file dumper into an intelligent context server optimized for agentic coding workflows.

New Features:

  • MCP (Model Context Protocol) server implementation

    • Native integration with Claude Code, Cline, and other MCP clients
    • Four powerful MCP tools: GetCodeContext, GetProjectStructure, ListProjectFiles, GetFileContent
    • Stdio transport for seamless subprocess integration
  • Token Budget Optimization System

    • TokenCounter service for approximate, character-based token estimation
    • FileRelevanceScorer with multi-factor relevance scoring (filename, path, content, importance)
    • TokenBudgetOptimizer with three strategies: GreedyByScore, ValueOptimized, Balanced
    • Intelligent file selection to maximize relevance within token constraints
  • Enhanced Program Architecture

    • Dual-mode support: CLI mode (original) and MCP server mode (new)
    • Command-line flag --mcp/--server to enable MCP mode
    • Refactored CLI code into ProgramCli.cs for separation of concerns
  • Comprehensive Documentation

    • Updated README with MCP setup instructions
    • Detailed explanations of token optimization strategies
    • Example workflows and usage patterns
    • MCP configuration example file

Technical Implementation:

  • Added ModelContextProtocol and Microsoft.Extensions.Hosting NuGet packages
  • Implemented MCP tools using attribute-based discovery pattern
  • Relevance scoring algorithm with configurable weights
  • Multiple selection strategies for different use cases
  • Task-specific context generation vs. whole-codebase dumps
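
The weighted scoring and budget-constrained selection listed above could look roughly like the following sketch. The weight values, the `Candidate` record, and the method names are illustrative assumptions, not the code merged in this PR; only the strategy name GreedyByScore and the four scoring factors come from the description.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of a scored file; not the PR's actual type.
public sealed record Candidate(string Path, double Score, int Tokens);

public static class BudgetSketch
{
    // Assumed weights for illustration; the PR only says they are configurable.
    public const double FileNameWeight = 0.3;
    public const double FilePathWeight = 0.2;
    public const double ContentWeight = 0.4;
    public const double ImportanceWeight = 0.1;

    // Combine four factor scores (each expected in [0, 1]) into one relevance score.
    public static double Combine(double fileName, double filePath, double content, double importance) =>
        fileName * FileNameWeight + filePath * FilePathWeight +
        content * ContentWeight + importance * ImportanceWeight;

    // Greedy-by-score: take files in descending score order, skipping any that no longer fit.
    public static List<Candidate> SelectGreedyByScore(IEnumerable<Candidate> files, int tokenBudget)
    {
        var selected = new List<Candidate>();
        var remaining = tokenBudget;
        foreach (var file in files.OrderByDescending(f => f.Score))
        {
            if (file.Tokens > remaining) continue; // too large for what is left
            selected.Add(file);
            remaining -= file.Tokens;
        }
        return selected;
    }
}
```

A greedy pass like this is simple and predictable, but it can leave budget unused when one high-scoring file is large; that trade-off is presumably why the PR offers more than one strategy.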

Benefits for Agentic Coding:

  • Token efficiency: Only send relevant files, not entire codebases
  • Task-specific context: Intelligent file selection based on task description
  • Scalable: Works with large codebases through smart sampling
  • Flexible: Multiple optimization strategies for different scenarios
  • Integration-ready: Native MCP support for modern AI coding tools

This update positions CodeContext as essential infrastructure for agentic coding, similar to how LSP became fundamental for modern IDEs.

Copilot AI review requested due to automatic review settings November 24, 2025 01:02

Copilot AI left a comment

Pull request overview

This PR transforms CodeContext from a simple CLI tool into a comprehensive agentic coding platform by adding MCP (Model Context Protocol) server capabilities and intelligent token budget optimization. The changes enable native integration with AI coding assistants like Claude Code while maintaining backward compatibility with the original CLI mode.

Key Changes:

  • Added MCP server mode with four powerful tools (GetCodeContext, GetProjectStructure, ListProjectFiles, GetFileContent) for AI agent integration
  • Implemented token budget optimization system with multiple strategies (GreedyByScore, ValueOptimized, Balanced) and intelligent relevance scoring
  • Refactored architecture to support dual-mode operation (CLI and MCP server) with proper separation of concerns

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 21 comments.

| File | Description |
| --- | --- |
| Program.cs | Refactored entry point to support dual-mode operation (CLI vs MCP server) with command-line flag detection |
| ProgramCli.cs | Extracted CLI-specific logic from original Program.cs for clean separation of concerns |
| Mcp/CodeContextTools.cs | Implements four MCP server tools with file scanning, relevance scoring, and context generation |
| Services/TokenCounter.cs | Provides token estimation for code and natural language using character-based approximations |
| Services/FileRelevanceScorer.cs | Multi-factor relevance scoring algorithm weighing filename, path, content, and importance |
| Services/TokenBudgetOptimizer.cs | Implements three optimization strategies for file selection within token constraints |
| CodeContext.csproj | Adds ModelContextProtocol and Microsoft.Extensions.Hosting package dependencies |
| mcp-config.example.json | Example MCP server configuration for integration with AI coding tools |
| README.md | Comprehensive documentation updates covering MCP setup, token optimization, and usage examples |

Comment on lines +243 to +253
var fullPath = Path.Combine(projectPath, relativePath);

if (!File.Exists(fullPath))
{
    output.AppendLine($"## {relativePath}");
    output.AppendLine("❌ File not found");
    output.AppendLine();
    continue;
}

var content = File.ReadAllText(fullPath);

Copilot AI Nov 24, 2025

Potential path traversal vulnerability. A malicious user could provide file paths like "../../../etc/passwd" to read files outside the project directory. Consider validating that the resolved fullPath is still within the projectPath:

var fullPath = Path.GetFullPath(Path.Combine(projectPath, relativePath));
if (!fullPath.StartsWith(Path.GetFullPath(projectPath) + Path.DirectorySeparatorChar))
{
    output.AppendLine($"## {relativePath}");
    output.AppendLine("❌ Invalid path - outside project directory");
    continue;
}
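
A slightly more defensive variant of the check above, as a sketch. The `PathGuard` class and `IsWithinRoot` name are hypothetical. The comparison is made ordinal explicitly, and the root is trimmed before the separator is appended so that a project path already ending in a separator is still handled; a sibling directory such as `proj-evil` does not pass as being inside `proj`.

```csharp
using System;
using System.IO;

public static class PathGuard
{
    // Returns true when relativePath, resolved against root, stays inside root.
    public static bool IsWithinRoot(string root, string relativePath)
    {
        // Normalize the root and make sure it ends with exactly one separator.
        var fullRoot = Path.GetFullPath(root)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            + Path.DirectorySeparatorChar;

        // GetFullPath collapses "." and ".." segments before the comparison.
        var fullPath = Path.GetFullPath(Path.Combine(root, relativePath));

        // Ordinal comparison assumes a case-sensitive file system; on
        // case-insensitive volumes OrdinalIgnoreCase may be the better choice.
        return fullPath.StartsWith(fullRoot, StringComparison.Ordinal);
    }
}
```

Note that this sketch only inspects the path string; it does not resolve symbolic links, which would need separate handling if they are a concern.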

Copilot uses AI. Check for mistakes.
Comment on lines +160 to +163
// Penalize very long files (might be generated/verbose)
// This would need actual file size, using path length as proxy
if (filePath.Length > 100)
    score -= 0.1;

Copilot AI Nov 24, 2025

The file path length is used as a proxy for file size to penalize long files, but this is misleading. A file with a very long path (e.g., deep nested directories) but small content would be incorrectly penalized. Consider either:

  1. Passing actual file size/content length as a parameter
  2. Removing this heuristic entirely since it doesn't correlate with file verbosity
  3. Documenting this clearly as a limitation
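
Option 1 might be sketched as below. `SizeHeuristic` and `ApplySizePenalty` are hypothetical names, the 50 KB cutoff is an assumed value, and the 0.1 penalty simply mirrors the magnitude in the snippet under review.

```csharp
using System;

public static class SizeHeuristic
{
    public const long LargeFileThresholdBytes = 50 * 1024; // assumed cutoff
    public const double LargeFilePenalty = 0.1;            // same magnitude as the reviewed code

    // Penalize by actual size in bytes, never by path length; clamp at zero.
    public static double ApplySizePenalty(double score, long fileSizeBytes) =>
        fileSizeBytes > LargeFileThresholdBytes
            ? Math.Max(0.0, score - LargeFilePenalty)
            : score;
}
```

The caller would supply the size from whatever it already has on hand, for example the length of the loaded content or `new FileInfo(path).Length`.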

"--",
"--mcp"
],
"description": "CodeContext MCP Server - Intelligent code context generation with token budget optimization"

Copilot AI Nov 24, 2025

[nitpick] The example configuration uses dotnet run, which is inefficient for production use because it runs a build step on every invocation. Consider showing the use of a compiled executable instead:

"command": "/path/to/CodeContext/bin/Release/net9.0/CodeContext",
"args": ["--mcp"]

Or keep both examples showing the development approach (dotnet run) and production approach (compiled executable).

Suggested change
- "description": "CodeContext MCP Server - Intelligent code context generation with token budget optimization"
+ "description": "CodeContext MCP Server (Development) - Uses 'dotnet run' for rapid iteration and debugging"
+ },
+ "codecontext-prod": {
+   "command": "/path/to/CodeContext/bin/Release/net9.0/CodeContext",
+   "args": [
+     "--mcp"
+   ],
+   "description": "CodeContext MCP Server (Production) - Uses compiled executable for efficient deployment"

Comment on lines +322 to +324
catch
{
// Skip directories that can't be accessed

Copilot AI Nov 24, 2025

Empty catch blocks swallow all exceptions silently. Consider at least logging the error or being more specific about which exceptions to catch (e.g., UnauthorizedAccessException, IOException).

Suggested change
- catch
- {
-     // Skip directories that can't be accessed
+ catch (Exception ex)
+ {
+     // Skip directories that can't be accessed
+     System.Diagnostics.Debug.WriteLine($"[CollectFiles] Could not access directory '{currentPath}': {ex.Message}");
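
If narrowing the catch is preferred over catching `Exception`, a sketch along these lines could work. `SafeWalk.CollectFiles` is a hypothetical stand-in for the PR's method, and logging to `Console.Error` is an assumption (stderr is used here because stdout carries the protocol messages under a stdio transport).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SafeWalk
{
    // Recursively collects file paths, skipping directories that cannot be read.
    public static List<string> CollectFiles(string currentPath)
    {
        var files = new List<string>();
        try
        {
            files.AddRange(Directory.GetFiles(currentPath));
            foreach (var dir in Directory.GetDirectories(currentPath))
                files.AddRange(CollectFiles(dir));
        }
        catch (Exception ex) when (ex is UnauthorizedAccessException or IOException)
        {
            // DirectoryNotFoundException derives from IOException, so it is covered too.
            Console.Error.WriteLine($"[CollectFiles] Skipping '{currentPath}': {ex.Message}");
        }
        return files;
    }
}
```

Any other exception type still propagates, which keeps genuine bugs visible instead of silently producing an incomplete file list.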

}

// Reserve tokens for project structure if requested
var reservedTokens = includeStructure ? 2000 : 100; // Structure + overhead

Copilot AI Nov 24, 2025

[nitpick] The comment says "Structure + overhead" but the code uses a fixed 2000 tokens for structure, which seems quite high. Consider making this configurable or calculating it dynamically based on actual project structure size using EstimateTokensNaturalLanguage from TokenCounter. A small project might not need 2000 tokens for structure, wasting budget.
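
One way to size the reservation from the rendered structure instead of a constant, sketched under assumptions: the four-characters-per-token ratio is a common rough heuristic standing in for the PR's `EstimateTokensNaturalLanguage`, and `ReserveTokens`, the 2000-token cap, and the 100-token overhead are hypothetical choices.

```csharp
using System;

public static class ReservationSketch
{
    public const int BaseOverheadTokens = 100;   // headers, formatting, metadata
    public const int MaxStructureTokens = 2000;  // assumed upper bound

    // Rough stand-in estimator: about four characters per token, rounded up.
    public static int EstimateTokens(string text) =>
        string.IsNullOrEmpty(text) ? 0 : (text.Length + 3) / 4;

    // Reserve what the structure needs (capped), plus fixed overhead.
    public static int ReserveTokens(bool includeStructure, string projectStructure)
    {
        if (!includeStructure) return BaseOverheadTokens;
        var structureTokens = Math.Min(EstimateTokens(projectStructure), MaxStructureTokens);
        return structureTokens + BaseOverheadTokens;
    }
}
```

With this shape a small project gives most of its budget back to file content, while a very large tree is still bounded by the cap.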

var structureTokens = EstimateTokensNaturalLanguage(projectStructure);
var fileTokens = files.Sum(f => EstimateTokensForFile(f.path, f.content));

const int overhead = 100; // Headers, formatting, metadata

Copilot AI Nov 24, 2025

[nitpick] Magic number 100 is used without explanation. Consider defining this as a named constant (e.g., private const int BaseOverheadTokens = 100;) to improve maintainability and make it easier to adjust if needed.

{
// File paths are typically short, use exact word counting
var parts = filePath.Split('/', '\\', '.');
return parts.Length + 2; // Add overhead for formatting

Copilot AI Nov 24, 2025

[nitpick] Magic number 2 is used for overhead. Consider defining this as a named constant to improve code clarity and maintainability.

Comment on lines +112 to +128
    var pathLower = filePath.ToLowerInvariant();
    if (keywords.Count == 0)
        return 0.5;

    var matchCount = keywords.Count(keyword => pathLower.Contains(keyword));
    return Math.Min(1.0, matchCount / (double)keywords.Count);
}

/// <summary>
/// Scores based on content relevance.
/// </summary>
private static double ScoreContent(string content, List<string> keywords)
{
    if (string.IsNullOrWhiteSpace(content) || keywords.Count == 0)
        return 0.3; // Low default score

    var contentLower = content.ToLowerInvariant();

Copilot AI Nov 24, 2025

Performance concern: ToLowerInvariant() is called repeatedly on potentially large content strings during scoring (lines 112, 128). For large files, this creates unnecessary string allocations. Consider caching the lowercased content or using case-insensitive string operations like IndexOf(keyword, StringComparison.OrdinalIgnoreCase) to avoid creating new strings.
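
The case-insensitive alternative could be sketched as follows. `KeywordCount.CountOccurrences` is a hypothetical helper, not a method from the PR; it counts non-overlapping matches with `IndexOf` and `StringComparison.OrdinalIgnoreCase`, so no lowercased copy of the content is allocated.

```csharp
using System;

public static class KeywordCount
{
    // Counts non-overlapping, case-insensitive occurrences of keyword in content.
    public static int CountOccurrences(string content, string keyword)
    {
        if (string.IsNullOrEmpty(content) || string.IsNullOrEmpty(keyword)) return 0;

        var count = 0;
        var index = 0;
        while ((index = content.IndexOf(keyword, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            count++;
            index += keyword.Length; // continue searching after this match
        }
        return count;
    }
}
```

For example, `KeywordCount.CountOccurrences("Token token TOKEN", "token")` returns 3.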

var fileChecker = new FileFilterService(filterConfig, gitIgnoreParser);
var scanner = new ProjectScanner(fileChecker, _console);

var files = await Task.Run(() => GetAllProjectFiles(scanner, projectPath));

Copilot AI Nov 24, 2025

Using Task.Run here is unnecessary for synchronous I/O work. Simply call GetAllProjectFiles(scanner, projectPath) directly without wrapping it in Task.Run.

Suggested change
- var files = await Task.Run(() => GetAllProjectFiles(scanner, projectPath));
+ var files = GetAllProjectFiles(scanner, projectPath);

});

// Normalize by content length and keyword count
var density = totalMatches / (double)(content.Length / 100 + 1);

Copilot AI Nov 24, 2025

Possible loss of precision: content.Length / 100 is integer division, so its fractional part is discarded before the cast to double.

Suggested change
- var density = totalMatches / (double)(content.Length / 100 + 1);
+ var density = totalMatches / (content.Length / 100.0 + 1);
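
To make the difference concrete, here is a small sketch comparing the two forms. `DensityDemo` and its method names are hypothetical; the expressions are the ones from the suggestion.

```csharp
using System;

public static class DensityDemo
{
    // Original form: Length / 100 is integer division, so 150 / 100 == 1.
    public static double Truncating(int totalMatches, int contentLength) =>
        totalMatches / (double)(contentLength / 100 + 1);

    // Suggested form: 100.0 forces floating-point division, so 150 / 100.0 == 1.5.
    public static double Fractional(int totalMatches, int contentLength) =>
        totalMatches / (contentLength / 100.0 + 1);
}
```

For 5 matches in 150 characters, `Truncating` divides by 2 and yields 2.5, while `Fractional` divides by 2.5 and yields 2.0. The truncating form therefore treats every length from 100 to 199 identically.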

…uality

This commit addresses all feedback from PR #4's Copilot AI review:

Security Fixes:
- Add PathSecurity utility to prevent path traversal attacks
- Validate all file paths in GetFileContent to prevent directory traversal
- Replace empty catch blocks with specific exception types

Performance Improvements:
- Remove unnecessary Task.Run wrappers (synchronous I/O doesn't benefit)
- Optimize ToLowerInvariant() calls to avoid repeated allocations on large strings
- Cache lowercase conversions once per scoring operation

Code Quality Improvements:
- Replace magic numbers with named constants throughout
  * Scoring weights (FileNameWeight, FilePathWeight, etc.)
  * Scoring parameters (NeutralScore, MaxMatchesPerKeyword, etc.)
  * File importance boost values (ReadmeBoost, ConfigBoost, etc.)
  * Token reservation constants (StructureTokenReservation)
- Fix misleading file path length heuristic
  * Now uses actual file size instead of path length
  * Properly penalizes large files (50KB+ threshold)

Build Fixes:
- Fix .NET 10.0 target framework error in test project (downgrade to .NET 9.0)

Technical Details:
- PathSecurity.ValidatePathWithinRoot ensures resolved paths stay within project
- SecurityException thrown for path traversal attempts
- Specific exception handling for UnauthorizedAccessException, IOException, DirectoryNotFoundException
- Reduced token reservation from 2000 to 1000 for more reasonable small project handling
- All magic numbers extracted to const fields with descriptive names
@DavidVeksler DavidVeksler merged commit cf14129 into master Nov 24, 2025
1 of 4 checks passed

3 participants