Skip to content

Refactor codebase with functional programming principles#3

Merged
DavidVeksler merged 3 commits intomasterfrom
claude/refactor-functional-programming-01VtJuXiyZe2cAtaCxiRm8HX
Nov 23, 2025
Merged

Refactor codebase with functional programming principles#3
DavidVeksler merged 3 commits intomasterfrom
claude/refactor-functional-programming-01VtJuXiyZe2cAtaCxiRm8HX

Conversation

@DavidVeksler
Copy link
Owner

Comprehensive refactoring applying functional programming principles throughout the codebase: immutability, pure functions, composability, and explicit I/O boundaries.

CORE TRANSFORMATIONS:

GitIgnoreParser.cs:

  • Convert to immutable record type with ImmutableArray and ImmutableDictionary
  • Introduce static factory methods (FromFile, FromPatterns, Empty)
  • Separate I/O operations (FromFile) from pure pattern creation (FromPatterns)
  • Implement lazy, immutable regex cache
  • Remove all mutable state (_patterns, _regexCache mutation)

FileFilterService.cs:

  • Decompose ShouldSkip into composable predicate functions
  • Create pure filter predicates (IsReparsePoint, IsIgnoredFileName, etc.)
  • Extract GetExtensions as pure function using yield return
  • Separate I/O (ReadFirstLines) from business logic (IsGeneratedCode)
  • Replace mutable _gitIgnoreLoaded flag with Lazy
  • Use pattern matching and switch expressions for data flow

GitHelper.cs:

  • Replace imperative while loop with pure tail recursion
  • Implement FindRepositoryRootRecursive with pattern matching
  • Extract HasGitDirectory as pure predicate
  • Achieve immutable, declarative directory traversal

ProjectScanner.cs:

  • Introduce immutable ScanContext record for scan state
  • Replace StringBuilder accumulation with yield return enumeration
  • Eliminate mutable _gitRepoRoot field
  • Decompose scanning into pure recursive functions
  • Separate formatting functions (FormatDirectoryEntry, FormatFileEntry)
  • Use LINQ and ImmutableArray for collection building
  • Isolate side effects (progress reporting, error logging) clearly

ContentBuilder.cs:

  • Introduce ContentSection record for lazy content generation
  • Replace StringBuilder with functional composition using SelectMany
  • Use ImmutableArray.Builder for section assembly
  • Implement lazy evaluation with Func delegates

PathResolver.cs:

  • Extract SelectPath as pure function
  • Refactor GetFolderName using functional composition with ?? operators
  • Separate pure path transformation from I/O operations
  • Use pattern matching in GetOutputPath
  • Create small, composable pure functions (TryGetDirectoryName, CleanRootPath)

FileUtilities.cs:

  • Separate I/O (CheckFileBinaryContent) from analysis (IsStreamBinary)
  • Extract pure functions: CalculateBinaryRatio, IsBinaryByte
  • Define Utf8Bom as immutable static readonly field
  • Use SequenceEqual for BOM comparison
  • Achieve clear functional pipeline: read → analyze → return

ConfigLoader.cs:

  • Introduce Result monad for functional error handling
  • Separate ReadConfigFile (I/O) from ParseConfig (pure)
  • Use pattern matching with Match method
  • Eliminate imperative try-catch flow

OutputFormatter.cs:

  • Separate ResolveOutputPath (pure) from WriteFile (I/O)
  • Extract SerializeToJson as pure function
  • Use StringComparison.OrdinalIgnoreCase for case-insensitive comparison
  • Compose WriteToFile from pure and impure operations

StatsCalculator.cs:

  • Introduce immutable ProjectStats record
  • Separate GatherStats (I/O) from FormatStats (pure)
  • Use pattern matching for null handling
  • Extract CountFiles and CountLines as focused functions

Program.cs:

  • Load GitIgnore patterns upfront using pattern matching
  • Pass immutable GitIgnoreParser to FileFilterService constructor
  • Clearly define I/O boundaries at application entry point

FUNCTIONAL PROGRAMMING PRINCIPLES APPLIED:

✓ Immutability: ImmutableArray, ImmutableDictionary, sealed records
✓ Pure functions: Explicit separation of computation from side effects
✓ Composability: Small, focused functions that compose elegantly
✓ Explicit data flow: No hidden state, all dependencies visible
✓ I/O isolation: Clear boundaries between pure logic and effects
✓ Type safety: Leverage C# records and pattern matching
✓ Lazy evaluation: Yield return, Lazy, Func delegates
✓ Declarative style: LINQ, pattern matching, expression-bodied members
✓ Error handling: Result types, null pattern matching vs exceptions
✓ Recursion: Tail-recursive functions replacing imperative loops

BENEFITS:

  • Enhanced testability: Pure functions easy to test in isolation
  • Thread safety: Immutable state eliminates race conditions
  • Maintainability: Clear data flow, explicit dependencies
  • Composability: Small functions combine into complex operations
  • Predictability: Same inputs always produce same outputs
  • Debugging: Deterministic execution, no hidden state mutations

Comprehensive refactoring applying functional programming principles throughout the codebase:
immutability, pure functions, composability, and explicit I/O boundaries.

CORE TRANSFORMATIONS:

GitIgnoreParser.cs:
- Convert to immutable record type with ImmutableArray and ImmutableDictionary
- Introduce static factory methods (FromFile, FromPatterns, Empty)
- Separate I/O operations (FromFile) from pure pattern creation (FromPatterns)
- Implement lazy, immutable regex cache
- Remove all mutable state (_patterns, _regexCache mutation)

FileFilterService.cs:
- Decompose ShouldSkip into composable predicate functions
- Create pure filter predicates (IsReparsePoint, IsIgnoredFileName, etc.)
- Extract GetExtensions as pure function using yield return
- Separate I/O (ReadFirstLines) from business logic (IsGeneratedCode)
- Replace mutable _gitIgnoreLoaded flag with Lazy<GitIgnoreParser>
- Use pattern matching and switch expressions for data flow

GitHelper.cs:
- Replace imperative while loop with pure tail recursion
- Implement FindRepositoryRootRecursive with pattern matching
- Extract HasGitDirectory as pure predicate
- Achieve immutable, declarative directory traversal

ProjectScanner.cs:
- Introduce immutable ScanContext record for scan state
- Replace StringBuilder accumulation with yield return enumeration
- Eliminate mutable _gitRepoRoot field
- Decompose scanning into pure recursive functions
- Separate formatting functions (FormatDirectoryEntry, FormatFileEntry)
- Use LINQ and ImmutableArray for collection building
- Isolate side effects (progress reporting, error logging) clearly

ContentBuilder.cs:
- Introduce ContentSection record for lazy content generation
- Replace StringBuilder with functional composition using SelectMany
- Use ImmutableArray.Builder for section assembly
- Implement lazy evaluation with Func<string> delegates

PathResolver.cs:
- Extract SelectPath as pure function
- Refactor GetFolderName using functional composition with ?? operators
- Separate pure path transformation from I/O operations
- Use pattern matching in GetOutputPath
- Create small, composable pure functions (TryGetDirectoryName, CleanRootPath)

FileUtilities.cs:
- Separate I/O (CheckFileBinaryContent) from analysis (IsStreamBinary)
- Extract pure functions: CalculateBinaryRatio, IsBinaryByte
- Define Utf8Bom as immutable static readonly field
- Use SequenceEqual for BOM comparison
- Achieve clear functional pipeline: read → analyze → return

ConfigLoader.cs:
- Introduce Result<T> monad for functional error handling
- Separate ReadConfigFile (I/O) from ParseConfig (pure)
- Use pattern matching with Match method
- Eliminate imperative try-catch flow

OutputFormatter.cs:
- Separate ResolveOutputPath (pure) from WriteFile (I/O)
- Extract SerializeToJson as pure function
- Use StringComparison.OrdinalIgnoreCase for case-insensitive comparison
- Compose WriteToFile from pure and impure operations

StatsCalculator.cs:
- Introduce immutable ProjectStats record
- Separate GatherStats (I/O) from FormatStats (pure)
- Use pattern matching for null handling
- Extract CountFiles and CountLines as focused functions

Program.cs:
- Load GitIgnore patterns upfront using pattern matching
- Pass immutable GitIgnoreParser to FileFilterService constructor
- Clearly define I/O boundaries at application entry point

FUNCTIONAL PROGRAMMING PRINCIPLES APPLIED:

✓ Immutability: ImmutableArray, ImmutableDictionary, sealed records
✓ Pure functions: Explicit separation of computation from side effects
✓ Composability: Small, focused functions that compose elegantly
✓ Explicit data flow: No hidden state, all dependencies visible
✓ I/O isolation: Clear boundaries between pure logic and effects
✓ Type safety: Leverage C# records and pattern matching
✓ Lazy evaluation: Yield return, Lazy<T>, Func<T> delegates
✓ Declarative style: LINQ, pattern matching, expression-bodied members
✓ Error handling: Result types, null pattern matching vs exceptions
✓ Recursion: Tail-recursive functions replacing imperative loops

BENEFITS:

- Enhanced testability: Pure functions easy to test in isolation
- Thread safety: Immutable state eliminates race conditions
- Maintainability: Clear data flow, explicit dependencies
- Composability: Small functions combine into complex operations
- Predictability: Same inputs always produce same outputs
- Debugging: Deterministic execution, no hidden state mutations
Copilot AI review requested due to automatic review settings November 23, 2025 04:23
- Fix ConfigLoader.cs: Use 'new' keyword for Result<T> record instantiation
- Fix FileUtilities.cs: Handle partial reads properly to address CA2022 warning
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the codebase to apply functional programming principles including immutability, pure functions, composability, and explicit I/O boundaries. The refactoring touches core services like file filtering, project scanning, and content building.

Key changes:

  • Converts mutable classes to immutable records with ImmutableArray and ImmutableDictionary
  • Extracts smaller, composable functions from larger imperative methods
  • Introduces lazy evaluation patterns using Lazy<T>, Func<T>, and yield return

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
Utils/GitHelper.cs Replaces imperative while loop with tail-recursive directory traversal
Utils/FileUtilities.cs Separates I/O operations from binary content analysis with extracted pure functions
Services/StatsCalculator.cs Introduces immutable ProjectStats record and separates statistics gathering from formatting
Services/ProjectScanner.cs Refactors with immutable ScanContext, removes mutable state, uses lazy evaluation with yield return
Services/PathResolver.cs Extracts pure path transformation functions and uses pattern matching for control flow
Services/OutputFormatter.cs Separates path resolution logic from file writing operations
Services/GitIgnoreParser.cs Converts to immutable record with static factory methods and lazy regex cache
Services/FileFilterService.cs Decomposes filtering into composable predicates, replaces mutable flag with Lazy<T>
Services/ContentBuilder.cs Introduces lazy content generation with ContentSection record and functional composition
Services/ConfigLoader.cs Implements Result<T> monad for functional error handling
Program.cs Loads GitIgnore patterns upfront and passes immutable parser to services

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

};

/// <summary>
/// Pure predicate: checks if a directory contains a .git subdirectory.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims HasGitDirectory is a "pure predicate", but it performs I/O by calling Directory.Exists(), which is a side effect. Pure functions should not perform I/O operations. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure predicate: checks if a directory contains a .git subdirectory.
/// Checks if a directory contains a .git subdirectory by performing I/O.

Copilot uses AI. Check for mistakes.
Comment on lines +50 to +51
/// Pure function: checks if stream starts with UTF-8 BOM.
/// Resets stream position after checking.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims HasUtf8Bom is a "pure function", but it performs I/O by reading from the stream and modifies the stream's position. Pure functions should not perform I/O or have side effects. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure function: checks if stream starts with UTF-8 BOM.
/// Resets stream position after checking.
/// I/O operation: checks if stream starts with UTF-8 BOM.
/// Reads from the stream and resets stream position after checking; this method has I/O side effects.

Copilot uses AI. Check for mistakes.
}

/// <summary>
/// Pure predicate to determine if an entry should be included.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims ShouldIncludeEntry is a "pure predicate", but it performs I/O by creating a FileInfo object which accesses the file system. Pure functions should not perform I/O operations. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure predicate to determine if an entry should be included.
/// Predicate to determine if an entry should be included. May perform I/O and has side effects.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +71
/// Pure recursive function to generate directory structure lines.
/// Uses lazy evaluation via yield return.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims GetProjectStructureLines is a "pure recursive function", but it performs I/O by calling Directory.Exists() and GetFilteredEntries(). Pure functions should not perform I/O operations. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure recursive function to generate directory structure lines.
/// Uses lazy evaluation via yield return.
/// Recursively generates directory structure lines, performing I/O operations.
/// Uses lazy evaluation via yield return. This function accesses the file system and is not pure.

Copilot uses AI. Check for mistakes.
Comment on lines +175 to +176
/// Pure recursive function to enumerate all files in directory tree.
/// Uses lazy evaluation via yield return.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims EnumerateFilesRecursively is a "pure recursive function", but it performs I/O by calling Directory.EnumerateFiles() and Directory.EnumerateDirectories() through helper methods. Pure functions should not perform I/O operations. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure recursive function to enumerate all files in directory tree.
/// Uses lazy evaluation via yield return.
/// Recursively enumerates all files in the directory tree using lazy evaluation via yield return.
/// Performs I/O operations to access the file system.

Copilot uses AI. Check for mistakes.

private static bool CheckBinaryContent(FileStream stream, int chunkSize, double threshold)
/// <summary>
/// Pure function: checks if stream content exceeds binary threshold.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims HasBinaryContent is a "pure function", but it performs I/O by reading from the stream. Pure functions should not perform I/O operations. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure function: checks if stream content exceeds binary threshold.
/// I/O operation: reads stream content and checks if binary threshold is exceeded.

Copilot uses AI. Check for mistakes.

return structure.ToString();
/// <summary>
/// Pure function to get filtered and sorted directory entries.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims GetFilteredEntries is a "pure function", but it performs I/O by calling Directory.EnumerateFileSystemEntries() and has side effects by calling _console.WriteLine() in the exception handler. Pure functions should not perform I/O or have side effects. Consider updating the documentation to reflect that this function has I/O side effects.

Suggested change
/// Pure function to get filtered and sorted directory entries.
/// Gets filtered and sorted directory entries for a given path.
/// Performs I/O and may have side effects (e.g., console output on error).

Copilot uses AI. Check for mistakes.
Comment on lines +235 to 265
/// <summary>
/// Pure predicate to check if a file should be included.
/// </summary>
private bool ShouldIncludeFile(string filePath, string rootPath)
{
try
{
var fileInfo = new FileInfo(filePath);
return !_fileChecker.ShouldSkip(fileInfo, rootPath);
}
catch
{
return false;
}
}

// Recursively process subdirectories
foreach (var subDir in Directory.EnumerateDirectories(directory, "*", enumerationOptions))
{
try
{
var dirInfo = new DirectoryInfo(subDir);
if (!_fileChecker.ShouldSkip(dirInfo, rootPath))
{
EnumerateFilesRecursively(subDir, rootPath, files);
}
}
catch
{
// Skip directories that can't be accessed
}
}
/// <summary>
/// Pure predicate to check if a directory should be included.
/// </summary>
private bool ShouldIncludeDirectory(string dirPath, string rootPath)
{
try
{
var dirInfo = new DirectoryInfo(dirPath);
return !_fileChecker.ShouldSkip(dirInfo, rootPath);
}
catch (Exception ex)
catch
{
_console.WriteLine($"\n⚠️ Warning: Could not enumerate directory {directory}: {ex.Message}");
return false;
}
}
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims ShouldIncludeFile and ShouldIncludeDirectory are "pure predicates", but they perform I/O by creating FileInfo and DirectoryInfo objects which access the file system. Pure functions should not perform I/O operations. Consider updating the documentation to reflect that these functions have I/O side effects.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +74
/// Pure function: serializes content to JSON with timestamp.
/// </summary>
private static string SerializeToJson(string content) =>
JsonSerializer.Serialize(
new { content, timestamp = DateTime.Now },
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SerializeToJson function is marked as "pure function" but includes DateTime.Now, which introduces non-determinism - calling this function multiple times with the same content will produce different results due to different timestamps. This violates the definition of a pure function. Consider either removing the timestamp or updating the documentation to acknowledge this side effect.

Suggested change
/// Pure function: serializes content to JSON with timestamp.
/// </summary>
private static string SerializeToJson(string content) =>
JsonSerializer.Serialize(
new { content, timestamp = DateTime.Now },
/// Pure function: serializes content to JSON.
/// </summary>
private static string SerializeToJson(string content) =>
JsonSerializer.Serialize(
new { content },

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +31
/// Pure function that generates content sections based on configuration.
/// Uses declarative approach with LINQ and immutable collections.
Copy link

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims BuildContentSections is a "pure function", but it doesn't actually execute the content generation - it defers it by storing Func<string> delegates. The functions referenced (_scanner.GetProjectStructure and _scanner.GetFileContents) perform I/O operations. Consider updating the documentation to clarify that this function builds a lazy evaluation structure rather than being truly pure, since the deferred operations will perform I/O when executed.

Suggested change
/// Pure function that generates content sections based on configuration.
/// Uses declarative approach with LINQ and immutable collections.
/// Generates content sections based on configuration using a declarative approach.
/// Builds a lazy evaluation structure: the returned delegates will perform I/O when evaluated.
/// Uses LINQ and immutable collections to assemble content sections.

Copilot uses AI. Check for mistakes.
…tion

Addresses Copilot AI code review feedback from PR #3.
Corrects mislabeled "pure" functions and fixes non-determinism issues.

ISSUES FIXED:

1. Non-Deterministic Function
   - OutputFormatter.SerializeToJson: Now accepts DateTime as parameter
   - Makes function deterministic (same inputs = same outputs)
   - DateTime.Now moved to call site (WriteToFile)

2. Mislabeled Pure Functions (performing I/O)
   GitHelper.cs:
   - HasGitDirectory: "pure predicate" → "I/O operation"
     (calls Directory.Exists)

   FileUtilities.cs:
   - IsStreamBinary: "pure function" → "I/O operation"
   - HasUtf8Bom: "pure function" → "I/O operation"
     (reads from stream, modifies position)
   - HasBinaryContent: "pure function" → "I/O operation"

   ProjectScanner.cs:
   - GetProjectStructureLines: "pure recursive" → "recursive I/O"
   - GetFilteredEntries: "pure function" → "I/O operation"
     (includes console logging side effects)
   - EnumerateFilesRecursively: "pure recursive" → "recursive I/O"
   - GetFilteredFiles: "pure enumeration" → "I/O operation"
   - GetFilteredDirectories: "pure enumeration" → "I/O operation"
   - ShouldIncludeEntry: "pure predicate" → "predicate (may perform I/O)"
   - ShouldIncludeFile: "pure predicate" → "predicate (may perform I/O)"
   - ShouldIncludeDirectory: "pure predicate" → "predicate (may perform I/O)"

   ConfigLoader.cs:
   - ParseConfig: "pure function" → documents console logging side effect

   ContentBuilder.cs:
   - BuildContentSections: "pure function" → "performs I/O via scanner"

   OutputFormatter.cs:
   - ResolveOutputPath: "pure function" → "I/O operation"
     (calls Directory.Exists)

   PathResolver.cs:
   - GetFolderName: "pure function" → "may perform I/O"
   - TryGetDirectoryName: "pure function" → "may perform I/O"
     (creates DirectoryInfo)
   - ExtractRootFolderName: "pure function" → "may perform I/O"
     (calls Path.GetFullPath)

PRINCIPLES CLARIFIED:

A function is only "pure" if:
1. Returns same output for same input (deterministic)
2. Has no observable side effects
3. Does not perform I/O operations
4. Does not modify external state
5. Does not depend on external mutable state

Functions that perform I/O, access file system, write to console,
or depend on current time are now correctly labeled as I/O operations
or documented with their side effects.

This maintains functional programming benefits while being honest
about where effects occur in the codebase.
@DavidVeksler DavidVeksler merged commit 66ca6bb into master Nov 23, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants