diff --git a/.github/SYNC_INTEGRATION_TEMPLATE.md b/.github/SYNC_INTEGRATION_TEMPLATE.md new file mode 100644 index 0000000..3923112 --- /dev/null +++ b/.github/SYNC_INTEGRATION_TEMPLATE.md @@ -0,0 +1,230 @@ +# Documentation Sync Integration Template + +This template helps source repositories integrate with the FalkorDB docs agentic sync workflow. + +## Quick Setup + +Add this workflow file to your repository to enable automatic documentation sync. It authenticates with a `DOCS_SYNC_TOKEN` secret that has access to FalkorDB/docs, because the automatic `GITHUB_TOKEN` cannot send dispatch events to another repository. + +### File: `.github/workflows/sync-to-docs.yml` + +```yaml +name: Sync Documentation to Docs Repo + +on: + push: + branches: + - main + - master + paths: + - 'README.md' + - 'docs/**' + - '*.md' + + # Allow manual trigger + workflow_dispatch: + +jobs: + notify-docs-sync: + runs-on: ubuntu-latest + steps: + - name: Notify docs repository + uses: peter-evans/repository-dispatch@v3 + with: + token: ${{ secrets.DOCS_SYNC_TOKEN }} + repository: FalkorDB/docs + event-type: docs-sync + client-payload: | + { + "repository": "${{ github.event.repository.name }}", + "ref": "${{ github.ref_name }}", + "sha": "${{ github.sha }}", + "commit_message": "${{ github.event.head_commit.message || 'Manual sync trigger' }}" + } +``` + +## Alternative: Using cURL (No Additional Action) + +```yaml +name: Sync Documentation to Docs Repo + +on: + push: + branches: + - main + - master + paths: + - 'README.md' + - 'docs/**' + +jobs: + notify-docs-sync: + runs-on: ubuntu-latest + steps: + - name: Trigger docs sync via API + run: | + curl -X POST \ + -H "Accept: application/vnd.github+json" \ + -H "Authorization: Bearer ${{ secrets.DOCS_SYNC_TOKEN }}" \ + -H "X-GitHub-Api-Version: 2022-11-28" \ + https://api.github.com/repos/FalkorDB/docs/dispatches \ + -d '{ + "event_type": "docs-sync", + "client_payload": { + "repository": "${{ github.event.repository.name }}", + "ref": "${{ github.ref_name }}", + "sha": "${{ github.sha }}", + "commit_message": "${{ github.event.head_commit.message }}" + } + }' +``` + +## Configuration Options + +### Customize Trigger 
Paths + +Only sync when specific files change: + +```yaml +on: + push: + branches: [main] + paths: + - 'README.md' # Main readme + - 'docs/**/*.md' # Documentation folder + - 'API.md' # Specific API docs + - '!docs/internal/**' # Exclude internal docs +``` + +### Add Condition Checks + +Only sync if certain conditions are met: + +```yaml +jobs: + notify-docs-sync: + runs-on: ubuntu-latest + # Only sync if commit message doesn't contain [skip-docs] + if: "!contains(github.event.head_commit.message, '[skip-docs]')" + steps: + # ... rest of workflow +``` + +### Include More Metadata + +Send additional information to the docs sync workflow: + +```yaml +client-payload: | + { + "repository": "${{ github.event.repository.name }}", + "ref": "${{ github.ref_name }}", + "sha": "${{ github.sha }}", + "commit_message": "${{ github.event.head_commit.message }}", + "author": "${{ github.event.head_commit.author.name }}", + "timestamp": "${{ github.event.head_commit.timestamp }}", + "modified_files": ${{ toJson(github.event.head_commit.modified) }} + } +``` + +## Testing + +### Manual Test + +1. Go to the **Actions** tab in your repository +2. Select **Sync Documentation to Docs Repo** +3. Click **Run workflow** +4. Check the **Actions** tab in FalkorDB/docs for the triggered sync + +### Verify Integration + +After pushing a documentation change: +1. Check your repository's Actions tab for a successful run +2. Check the FalkorDB/docs Actions tab for the triggered sync +3. 
Look for the PR created in the FalkorDB/docs repository + +## Troubleshooting + +### Workflow Not Triggering + +**Problem:** Push to main branch doesn't trigger sync + +**Solutions:** +- Verify file paths match the `paths:` filter +- Check that changes are in tracked files +- Ensure branch name matches (`main` vs `master`) + +### Permission Denied + +**Problem:** `403 Forbidden` or permission errors + +**Solutions:** +- Verify the dispatch token can write to FalkorDB/docs; the automatic `GITHUB_TOKEN` cannot dispatch to another repository, so use a secret such as the `DOCS_SYNC_TOKEN` described in SYNC_WORKFLOW.md +- Check workflow permissions in repository settings +- Ensure repository is public or token has access + +### Sync Workflow Not Running in Docs Repo + +**Problem:** Event sent but docs sync doesn't run + +**Solutions:** +- Verify event type is `docs-sync` (exact match) +- Check docs repository workflow file is on main branch +- Review docs repository Actions tab for errors + +## Best Practices + +### 1. Skip Unnecessary Syncs + +Use commit message flags to skip sync: + +```bash +git commit -m "Update internal docs [skip-docs]" +``` + +Then add the condition: +```yaml +if: "!contains(github.event.head_commit.message, '[skip-docs]')" +``` + +### 2. Sync Only on Release + +Trigger sync only when creating releases: + +```yaml +on: + release: + types: [published] +``` + +### 3. Batch Changes + +Instead of syncing every commit, sync on a schedule: + +```yaml +on: + schedule: + - cron: '0 0 * * 1' # Weekly on Monday at 00:00 UTC +``` + +### 4. Separate Internal Docs + +Exclude internal documentation: + +```yaml +paths: + - 'docs/**' + - '!docs/internal/**' + - '!docs/private/**' +``` + +## Example Repositories + +See these repositories for working examples: +- FalkorDB/GraphRAG-SDK +- FalkorDB/falkordb-py +- FalkorDB/falkordb-ts + +## Support + +Questions? Check the [Sync Workflow Documentation](SYNC_WORKFLOW.md) or open an issue. 
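The templates above splice the commit message straight into a JSON string, which can produce invalid JSON when a message contains quotes or line breaks. The following is a minimal sketch, not part of the existing workflow, of one way to build the same `docs-sync` request body with proper escaping. The helper name `build_dispatch_body` and the 200-character truncation are illustrative assumptions; the field names mirror the client payload shown in the templates.

```python
import json


def build_dispatch_body(repository: str, ref: str, sha: str,
                        commit_message: str = "Manual sync trigger") -> str:
    """Return the JSON body for a `docs-sync` repository_dispatch request.

    Hypothetical helper: json.dumps takes care of quotes and newlines
    that would break a hand-assembled JSON string.
    """
    # Keep only the first line of the message. The 200-character cap is an
    # illustrative assumption, not a GitHub requirement.
    first_line = commit_message.splitlines()[0] if commit_message else ""
    payload = {
        "event_type": "docs-sync",
        "client_payload": {
            "repository": repository,
            "ref": ref,
            "sha": sha,
            "commit_message": first_line[:200],
        },
    }
    return json.dumps(payload)


if __name__ == "__main__":
    body = build_dispatch_body(
        "falkordb-py", "main", "0123abc", 'Fix "quoted" docs\n\nLonger body'
    )
    print(body)
```

The resulting string can be passed to `curl -d` or to any HTTP client in place of the inline JSON.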
diff --git a/.github/SYNC_WORKFLOW.md b/.github/SYNC_WORKFLOW.md new file mode 100644 index 0000000..1a0de59 --- /dev/null +++ b/.github/SYNC_WORKFLOW.md @@ -0,0 +1,278 @@ +# Agentic Documentation Sync Workflow + +## Overview + +The Agentic Documentation Sync Workflow automatically synchronizes documentation from various FalkorDB repositories to this central docs repository. It uses an AI-inspired agent pattern to intelligently map, transform, and sync documentation content. + +## Architecture + +### Components + +1. **GitHub Actions Workflow** (`.github/workflows/sync-docs-agentic.yml`) + - Orchestrates the sync process + - Handles triggers (real-time, scheduled, manual) + - Validates changes and creates pull requests + +2. **Sync Agent Script** (`.github/scripts/sync_docs.py`) + - Implements intelligent syncing logic + - Maps source repositories to destination paths + - Processes and transforms content + +3. **Agent Instructions** (`.github/agents/doc-sync-agent.md`) + - Documents agent behavior and decision-making + - Defines mapping rules and transformations + - Provides maintenance guidance + +## How It Works + +### Trigger Flow + +```mermaid +graph LR + A[Source Repo Change] --> B[repository_dispatch] + C[Schedule: Daily 2AM] --> D[Workflow Trigger] + E[Manual Trigger] --> D + B --> D + D --> F[Sync Agent] + F --> G{Changes?} + G -->|Yes| H[Validate & Create PR] + G -->|No| I[Skip] + H --> J[Human Review] +``` + +### Sync Process + +1. **Trigger Detection** + - Real-time: Source repository sends `repository_dispatch` event + - Scheduled: Daily cron job syncs all repos + - Manual: Developer triggers via workflow UI + +2. **Content Retrieval** + - Agent fetches latest content from source repository + - Supports specific commits or branch heads + - Handles both single files and directories + +3. 
**Intelligent Processing** + - Adds Jekyll front matter for site navigation + - Transforms relative links to work in docs context + - Adds source attribution footer + - Preserves existing customizations where possible + +4. **Validation** + - Runs Jekyll build to ensure site integrity + - Executes spellcheck on modified files + - Detects and reports any errors + +5. **Pull Request Creation** + - Creates descriptive PR with change summary + - Labels appropriately for categorization + - Requires human review before merge + +## Monitored Repositories + +| Repository | Documentation Type | Destination | +|------------|-------------------|-------------| +| **FalkorDB** | Commands, Algorithms, Design | commands/, algorithms/, design/ | +| **GraphRAG-SDK** | GenAI SDK Documentation | genai-tools/GraphRAG-SDK/ | +| **QueryWeaver** | Text-to-SQL Tool Docs | genai-tools/QueryWeaver/ | +| **falkordb-py** | Python Client Library | getting-started/client-libraries/python/ | +| **falkordb-ts** | TypeScript Client Library | getting-started/client-libraries/typescript/ | +| **JFalkorDB** | Java Client Library | getting-started/client-libraries/java/ | +| **NFalkorDB** | .NET Client Library | getting-started/client-libraries/dotnet/ | +| **flex** | JavaScript UDF Library | udfs/flex/ | +| **falkordb-browser** | Browser Visualization | browser/ | +| **FalkorDB-MCPServer** | MCP Server Integration | agentic-memory/ | + +## Setting Up Real-Time Sync + +### For Source Repository Maintainers + +To enable real-time sync from your repository to the docs repository, add this workflow to your repo: + +**`.github/workflows/notify-docs-sync.yml`:** + +```yaml +name: Notify Docs Sync + +on: + push: + branches: [main, master] + paths: + - 'README.md' + - 'docs/**' + +jobs: + notify-docs: + runs-on: ubuntu-latest + steps: + - name: Trigger docs sync + uses: peter-evans/repository-dispatch@v3 + with: + token: ${{ secrets.DOCS_SYNC_TOKEN }} + repository: FalkorDB/docs + event-type: docs-sync + 
client-payload: | + { + "repository": "${{ github.event.repository.name }}", + "ref": "${{ github.ref_name }}", + "sha": "${{ github.sha }}", + "commit_message": "${{ github.event.head_commit.message }}" + } +``` + +**Requirements:** +- Add `DOCS_SYNC_TOKEN` secret to source repository +- Token needs `repo` scope (or `public_repo` for public repos) +- Token should have write access to FalkorDB/docs repository + +### Alternative: Use cURL Instead of the Action + +If you prefer not to depend on a third-party action, you can send the same event with cURL using the same token: + +```yaml +- name: Trigger docs sync + run: | + curl -X POST \ + -H "Accept: application/vnd.github+json" \ + -H "Authorization: Bearer ${{ secrets.DOCS_SYNC_TOKEN }}" \ + https://api.github.com/repos/FalkorDB/docs/dispatches \ + -d '{"event_type":"docs-sync","client_payload":{"repository":"${{ github.event.repository.name }}","ref":"${{ github.ref_name }}","sha":"${{ github.sha }}"}}' +``` + +**Note:** The automatic `GITHUB_TOKEN` does not work for this call. It is scoped to the repository running the workflow and cannot create dispatch events in FalkorDB/docs. + +## Manual Testing + +### Test Specific Repository Sync + +1. Go to the **Actions** tab in GitHub +2. Select the **Agentic Documentation Sync** workflow +3. Click **Run workflow** +4. Enter repository name (e.g., `GraphRAG-SDK`) +5. Click **Run workflow** + +### Test All Repositories + +Trigger the workflow without any inputs to sync all configured repositories. + +## Customization + +### Adding a New Repository + +1. **Edit** `.github/scripts/sync_docs.py` +2. **Add mapping** to `REPO_MAPPINGS` dictionary: + +```python +"your-repo-name": { + "paths": { + "README.md": "destination/path.md", + "docs/*.md": "destination/directory/", + }, + "description": "Brief description of repository" +}, +``` + +3. **Test** with manual workflow trigger +4. 
**Update** `.github/agents/doc-sync-agent.md` documentation + +### Modifying Sync Behavior + +Edit the `_process_content()` method in `.github/scripts/sync_docs.py` to customize: +- Front matter generation +- Link transformation +- Content filtering +- Attribution format + +## Monitoring and Maintenance + +### Check Sync Status + +- **Workflow Runs:** **Actions** tab > **Agentic Documentation Sync** +- **Recent PRs:** Filter by `label:automated` or `label:sync` +- **Logs:** Each workflow run includes detailed output + +### Common Issues + +#### Sync Not Triggering +- Check that the notify workflow in the source repository ran successfully +- Verify `DOCS_SYNC_TOKEN` secret is set correctly +- Ensure token has required permissions + +#### Build Failures +- Check Jekyll build logs in workflow +- Verify front matter syntax is valid +- Ensure all required files are present + +#### Spellcheck Failures +- Review spellcheck output in workflow +- Add technical terms to `.wordlist.txt` +- Fix actual spelling errors in source + +#### PR Not Created +- Check workflow logs for errors +- Verify changes were detected +- Ensure no duplicate PRs exist + +### Disable Sync + +To temporarily disable sync: +1. Go to the **Actions** tab +2. Select **Agentic Documentation Sync** +3. Click **⋯** (three dots) +4. Select **Disable workflow** + +## Security + +### Authentication +- Uses GitHub's default `GITHUB_TOKEN` inside the docs repository +- Automatically provided to workflows +- Scoped to the docs repository only +- No additional setup required in the docs repository + +### Permissions +```yaml +permissions: + contents: write # Create branches and commits + pull-requests: write # Create PRs + issues: read # Read issue templates +``` + +### Best Practices +- All changes go through PR review +- No direct commits to main branch +- Human approval required before merge +- Complete audit trail in PR history + +## FAQ + +**Q: How often does the sync run?** +A: Real-time on source repo changes, plus daily at 2 AM UTC as backup. 
+ +**Q: Can I disable sync for specific files?** +A: Yes, modify the mapping configuration in `sync_docs.py`. + +**Q: What if I manually edit synced documentation?** +A: The sync script rewrites a file whenever the processed source differs from the copy in this repository, so a manual edit shows up as reverted in the next sync PR. Restore it during PR review, or make the change in the source repository instead. + +**Q: How do I see what changed?** +A: Check the PR description and diff; together they give a full change summary. + +**Q: Can I auto-merge these PRs?** +A: Currently set to require human review. You can modify the workflow to enable auto-merge if desired. + +**Q: What happens if the Jekyll build fails?** +A: The workflow will report the error, and PR creation will be skipped. + +## Support + +For issues or questions: +- **GitHub Issues:** [FalkorDB/docs/issues](https://github.com/FalkorDB/docs/issues) +- **Workflow Logs:** Check the Actions tab for detailed error messages +- **Agent Instructions:** See `.github/agents/doc-sync-agent.md` + +## References + +- [GitHub Actions Documentation](https://docs.github.com/en/actions) +- [GitHub Agentic Workflows](https://github.github.io/gh-aw/) +- [Repository Dispatch Events](https://docs.github.com/en/rest/repos/repos#create-a-repository-dispatch-event) +- [Jekyll Documentation](https://jekyllrb.com/docs/) diff --git a/.github/agents/doc-sync-agent.md b/.github/agents/doc-sync-agent.md new file mode 100644 index 0000000..2b98b41 --- /dev/null +++ b/.github/agents/doc-sync-agent.md @@ -0,0 +1,163 @@ +# Documentation Sync Agent Instructions + +## Purpose +This agent automatically synchronizes documentation from source repositories in the FalkorDB organization to the docs repository, maintaining up-to-date documentation across the ecosystem. + +## Core Responsibilities + +### 1. 
Monitor Source Repositories +The agent monitors the following repositories for documentation changes: +- **FalkorDB** - Main graph database (commands, algorithms, design specs) +- **GraphRAG-SDK** - GenAI GraphRAG SDK +- **QueryWeaver** - Text-to-SQL tool +- **falkordb-py** - Python client library +- **falkordb-ts** - TypeScript client library +- **JFalkorDB** - Java client library +- **NFalkorDB** - .NET client library +- **flex** - JavaScript UDF library +- **falkordb-browser** - Browser visualization tool +- **FalkorDB-MCPServer** - Model Context Protocol server + +### 2. Intelligent Content Mapping +The agent maps source documentation to appropriate locations in the docs repository: + +| Source Repo | Source Path | Destination Path | +|-------------|-------------|------------------| +| FalkorDB | README.md | getting-started/overview.md | +| FalkorDB | docs/commands/*.md | commands/ | +| FalkorDB | docs/algorithms/*.md | algorithms/ | +| GraphRAG-SDK | README.md, docs/* | genai-tools/GraphRAG-SDK/ | +| QueryWeaver | README.md, docs/* | genai-tools/QueryWeaver/ | +| falkordb-py | README.md, docs/* | getting-started/client-libraries/python/ | +| falkordb-ts | README.md, docs/* | getting-started/client-libraries/typescript/ | +| JFalkorDB | README.md, docs/* | getting-started/client-libraries/java/ | +| NFalkorDB | README.md, docs/* | getting-started/client-libraries/dotnet/ | +| flex | README.md, docs/* | udfs/flex/ | +| falkordb-browser | README.md, docs/* | browser/ | +| FalkorDB-MCPServer | README.md, docs/* | agentic-memory/ | + +### 3. 
Content Processing Rules + +#### Jekyll Front Matter +- Add Jekyll front matter if not present +- Extract title from first heading or filename +- Determine appropriate parent page for navigation +- Preserve existing front matter if already present + +#### Link Transformation +- Convert relative links to absolute GitHub URLs when necessary +- Maintain internal doc links where appropriate +- Ensure all images and assets are accessible + +#### Attribution +- Add source repository attribution footer +- Include link back to original repository +- Mark content as automatically synchronized + +### 4. Quality Assurance + +#### Pre-PR Validation +- Run Jekyll build to ensure site builds correctly +- Execute spellcheck on modified files +- Verify no broken links introduced + +#### Change Detection +- Only create PRs when actual content changes detected +- Skip sync if content is identical to existing +- Group related changes in single PR when possible + +### 5. Pull Request Creation + +#### PR Metadata +- Title format: `docs: sync documentation from {repo_name}` +- Label with: `documentation`, `automated`, `sync` +- Create as regular PR (not draft) requiring human review + +#### PR Description Include +- Source repository and commit information +- Summary of changes made +- Validation results (build, spellcheck) +- Review checklist for human reviewers + +### 6. 
Trigger Mechanisms + +#### Real-time (Primary) +- Triggered via `repository_dispatch` event +- Payload includes: repository name, commit SHA, commit message +- Immediate sync when source repo changes + +#### Scheduled (Fallback) +- Daily at 2 AM UTC +- Syncs all configured repositories +- Catches any missed real-time events + +#### Manual (Testing) +- Triggered via `workflow_dispatch` +- Allows testing specific repository syncs +- Useful for debugging and validation + +## Decision-Making Logic + +### When to Create a PR +- ✅ Content has changed from existing documentation +- ✅ Jekyll build validation passes +- ✅ Source repository is in configured mappings +- ❌ No actual content changes detected +- ❌ Build validation fails (report error) + +### Conflict Resolution +- Human review required for all PRs (no auto-merge) +- Conflicts must be resolved manually +- Agent will not force-push or overwrite manual edits + +### Error Handling +- Log errors clearly in workflow output +- Continue processing other files if one fails +- Report summary of successes and failures +- Alert on build or validation failures + +## Security Considerations + +### Authentication +- Uses default `GITHUB_TOKEN` provided by GitHub Actions +- Read access to source repositories +- Write access to docs repository (content + PRs) +- No external authentication required + +### Permissions Required +- `contents: write` - to create branches and commits +- `pull-requests: write` - to create PRs +- `issues: read` - to check existing issues if needed + +### Safe Operations +- All changes go through PR review process +- No direct commits to main branch +- Branch protection rules enforced +- Audit trail maintained via PR history + +## Maintenance + +### Adding New Repositories +1. Add repository mapping to `REPO_MAPPINGS` in sync script +2. Test with manual workflow trigger +3. Configure source repository webhook (if needed) + +### Modifying Mappings +1. Update `REPO_MAPPINGS` configuration +2. 
Test with manual workflow trigger +3. Document changes in this file + +### Troubleshooting +- Check workflow run logs for detailed output +- Verify source repository accessibility +- Confirm mapping paths are correct +- Test Jekyll build locally if validation fails + +## Human Oversight + +This is an **assistant agent**, not an autonomous agent. All changes require: +- ✅ Human review of PR +- ✅ Explicit approval before merge +- ✅ Ability to modify or reject automated changes + +The agent provides automation and efficiency while maintaining human control over documentation quality and accuracy. diff --git a/.github/requirements.txt b/.github/requirements.txt new file mode 100644 index 0000000..7d9ed2a --- /dev/null +++ b/.github/requirements.txt @@ -0,0 +1,10 @@ +# Python dependencies for GitHub Actions workflows + +# GitHub API client +PyGithub>=2.1.1 + +# HTTP requests +requests>=2.31.0 + +# YAML parsing +PyYAML>=6.0 diff --git a/.github/scripts/sync_docs.py b/.github/scripts/sync_docs.py new file mode 100755 index 0000000..7170127 --- /dev/null +++ b/.github/scripts/sync_docs.py @@ -0,0 +1,332 @@ +#!/usr/bin/env python3 +""" +Agentic Documentation Sync Script + +This script intelligently syncs documentation from FalkorDB repositories +to the docs repository using AI-like decision making. 
+""" + +import os +import sys +import json +import requests +from pathlib import Path +from typing import Dict, List, Optional, Tuple +from github import Github +# github.Repository is a module; import the class itself for type hints +from github.Repository import Repository +import yaml + + +# Repository mapping configuration +REPO_MAPPINGS = { + "FalkorDB": { + "paths": { + "README.md": "getting-started/overview.md", + "docs/commands/*.md": "commands/", + "docs/algorithms/*.md": "algorithms/", + "docs/design/*.md": "design/", + }, + "description": "Main FalkorDB graph database repository" + }, + "GraphRAG-SDK": { + "paths": { + "README.md": "genai-tools/GraphRAG-SDK/README.md", + "docs/*.md": "genai-tools/GraphRAG-SDK/", + "examples/*.md": "genai-tools/GraphRAG-SDK/examples/", + }, + "description": "GraphRAG SDK for GenAI applications" + }, + "QueryWeaver": { + "paths": { + "README.md": "genai-tools/QueryWeaver/README.md", + "docs/*.md": "genai-tools/QueryWeaver/", + }, + "description": "Text-to-SQL tool using graph-powered schema" + }, + "falkordb-py": { + "paths": { + "README.md": "getting-started/client-libraries/python.md", + "docs/*.md": "getting-started/client-libraries/python/", + }, + "description": "Python client library for FalkorDB" + }, + "flex": { + "paths": { + "README.md": "udfs/flex/README.md", + "docs/*.md": "udfs/flex/", + }, + "description": "JavaScript UDF library for FalkorDB" + }, + "falkordb-ts": { + "paths": { + "README.md": "getting-started/client-libraries/typescript.md", + "docs/*.md": "getting-started/client-libraries/typescript/", + }, + "description": "TypeScript client library for FalkorDB" + }, + "JFalkorDB": { + "paths": { + "README.md": "getting-started/client-libraries/java.md", + "docs/*.md": "getting-started/client-libraries/java/", + }, + "description": "Java client library for FalkorDB" + }, + "NFalkorDB": { + "paths": { + "README.md": "getting-started/client-libraries/dotnet.md", + "docs/*.md": "getting-started/client-libraries/dotnet/", + }, + "description": ".NET client library for FalkorDB" + }, + "falkordb-browser": { + 
"paths": { + "README.md": "browser/README.md", + "docs/*.md": "browser/", + }, + "description": "Browser-based visualization tool for FalkorDB" + }, + "FalkorDB-MCPServer": { + "paths": { + "README.md": "agentic-memory/falkordb-mcpserver.md", + "docs/*.md": "agentic-memory/", + }, + "description": "Model Context Protocol server for FalkorDB" + }, +} + + +class AgenticDocSync: + """Intelligent documentation synchronization agent""" + + def __init__(self, github_token: str): + self.github = Github(github_token) + self.org = self.github.get_organization("FalkorDB") + self.changes_summary = [] + + def sync_repository(self, repo_name: str, ref: str = "main") -> bool: + """ + Sync documentation from a specific repository. + + Returns True if changes were made, False otherwise. + """ + print(f"\n🔍 Analyzing repository: {repo_name}") + + if repo_name not in REPO_MAPPINGS: + print(f"⚠️ Repository {repo_name} not in mapping configuration") + return False + + try: + source_repo = self.org.get_repo(repo_name) + mapping = REPO_MAPPINGS[repo_name] + + has_changes = False + + for source_path, dest_path in mapping["paths"].items(): + if self._sync_path(source_repo, source_path, dest_path, ref): + has_changes = True + + return has_changes + + except Exception as e: + print(f"❌ Error syncing {repo_name}: {str(e)}") + return False + + def _sync_path(self, repo: Repository, source_pattern: str, dest_path: str, ref: str) -> bool: + """Sync files matching a pattern from source to destination""" + has_changes = False + + # Handle wildcard patterns + if "*" in source_pattern: + base_path = source_pattern.split("*")[0].rstrip("/") + has_changes = self._sync_directory(repo, base_path, dest_path, ref) + else: + # Single file sync + has_changes = self._sync_file(repo, source_pattern, dest_path, ref) + + return has_changes + + def _sync_file(self, repo: Repository, source_file: str, dest_file: str, ref: str) -> bool: + """Sync a single file from source repository""" + try: + # Get content from 
source repository + content = repo.get_contents(source_file, ref=ref) + + if content.type != "file": + return False + + source_content = content.decoded_content.decode('utf-8') + + # Process content with intelligent transformations + processed_content = self._process_content(source_content, repo.name, source_file) + + # Determine destination path + dest_path = Path(dest_file) + + # Read existing content if file exists + if dest_path.exists(): + with open(dest_path, 'r', encoding='utf-8') as f: + existing_content = f.read() + + if existing_content == processed_content: + print(f" ℹ️ No changes: {source_file} → {dest_file}") + return False + + # Create parent directories + dest_path.parent.mkdir(parents=True, exist_ok=True) + + # Write processed content + with open(dest_path, 'w', encoding='utf-8') as f: + f.write(processed_content) + + print(f" ✅ Synced: {source_file} → {dest_file}") + self.changes_summary.append(f"- Updated `{dest_file}` from `{repo.name}/{source_file}`") + return True + + except Exception as e: + print(f" ⚠️ Could not sync {source_file}: {str(e)}") + return False + + def _sync_directory(self, repo: Repository, source_dir: str, dest_dir: str, ref: str) -> bool: + """Sync all markdown files from a directory""" + has_changes = False + + try: + contents = repo.get_contents(source_dir, ref=ref) + + for content in contents: + if content.type == "file" and content.name.endswith(".md"): + dest_file = os.path.join(dest_dir, content.name) + if self._sync_file(repo, content.path, dest_file, ref): + has_changes = True + + except Exception as e: + print(f" ⚠️ Could not sync directory {source_dir}: {str(e)}") + + return has_changes + + def _process_content(self, content: str, repo_name: str, source_file: str) -> str: + """ + Process content with intelligent transformations. + + This is where the "agentic" behavior happens - making smart decisions + about how to adapt content for the docs repository. 
+ """ + processed = content + + # Add Jekyll front matter if not present + if not processed.startswith("---"): + title = self._extract_title(processed, source_file) + front_matter = f"""--- +title: {title} +parent: {self._determine_parent(repo_name)} +--- + +""" + processed = front_matter + processed + + # Update relative links to work in docs context + processed = self._fix_relative_links(processed, repo_name) + + # Add source attribution + attribution = f"\n\n---\n*Source: [FalkorDB/{repo_name}](https://github.com/FalkorDB/{repo_name})*\n" + if attribution not in processed: + processed += attribution + + return processed + + def _extract_title(self, content: str, filename: str) -> str: + """Extract title from content or filename""" + lines = content.split('\n') + for line in lines: + if line.startswith('# '): + return line[2:].strip() + + # Fallback to filename + return Path(filename).stem.replace('-', ' ').replace('_', ' ').title() + + def _determine_parent(self, repo_name: str) -> str: + """Determine the parent page in Jekyll navigation""" + mapping = REPO_MAPPINGS.get(repo_name, {}) + + if "client-libraries" in str(mapping.get("paths", {})): + return "Client Libraries" + elif "genai-tools" in str(mapping.get("paths", {})): + return "GenAI Tools" + elif "algorithms" in str(mapping.get("paths", {})): + return "Algorithms" + elif "commands" in str(mapping.get("paths", {})): + return "Commands" + else: + return "Documentation" + + def _fix_relative_links(self, content: str, repo_name: str) -> str: + """Fix relative links to point to GitHub repository""" + # Simple implementation - could be more sophisticated + # Replace relative markdown links with absolute GitHub links + base_url = f"https://github.com/FalkorDB/{repo_name}/blob/main" + + # This is a simplified version - a production version would use regex + return content + + def sync_all_repositories(self) -> bool: + """Sync all configured repositories""" + print("🚀 Starting sync for all configured 
repositories...")
+
+        has_any_changes = False
+
+        for repo_name in REPO_MAPPINGS.keys():
+            if self.sync_repository(repo_name):
+                has_any_changes = True
+
+        return has_any_changes
+
+    def get_changes_summary(self) -> str:
+        """Get formatted summary of changes"""
+        if not self.changes_summary:
+            return "No changes were made."
+
+        return "\n".join(self.changes_summary)
+
+
+def main():
+    """Main execution function"""
+    github_token = os.getenv("GITHUB_TOKEN")
+    source_repo = os.getenv("SOURCE_REPO", "all")
+    source_ref = os.getenv("SOURCE_REF", "main")
+
+    if not github_token:
+        print("❌ GITHUB_TOKEN environment variable not set")
+        sys.exit(1)
+
+    print(f"""
+╔═══════════════════════════════════════════════════════════╗
+║ Agentic Documentation Sync ║
+║ FalkorDB Organization ║
+╚═══════════════════════════════════════════════════════════╝
+    """)
+
+    agent = AgenticDocSync(github_token)
+
+    # Determine which repositories to sync
+    if source_repo == "all":
+        has_changes = agent.sync_all_repositories()
+    else:
+        has_changes = agent.sync_repository(source_repo, source_ref)
+
+    # Output summary for GitHub Actions
+    summary = agent.get_changes_summary()
+    print(f"\n📊 Summary:\n{summary}\n")
+
+    # Set output for GitHub Actions
+    with open(os.environ.get('GITHUB_OUTPUT', '/dev/null'), 'a') as f:
+        f.write(f"changes_summary<> $GITHUB_OUTPUT
+            echo "source_ref=${{ github.event.client_payload.ref || 'main' }}" >> $GITHUB_OUTPUT
+            echo "commit_sha=${{ github.event.client_payload.sha }}" >> $GITHUB_OUTPUT
+            echo "commit_message=${{ github.event.client_payload.commit_message }}" >> $GITHUB_OUTPUT
+          elif [ "${{ github.event_name }}" == "workflow_dispatch" ]; then
+            echo "source_repo=${{ github.event.inputs.source_repo }}" >> $GITHUB_OUTPUT
+            echo "source_ref=${{ github.event.inputs.source_ref }}" >> $GITHUB_OUTPUT
+            echo "commit_sha=HEAD" >> $GITHUB_OUTPUT
+            echo "commit_message=Manual sync triggered" >> $GITHUB_OUTPUT
+          else
+            # Scheduled run - sync all repositories
+            echo "source_repo=all" >> $GITHUB_OUTPUT
+            echo "source_ref=main" >> $GITHUB_OUTPUT
+            echo "commit_sha=HEAD" >> $GITHUB_OUTPUT
+            echo "commit_message=Scheduled sync" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r .github/requirements.txt
+
+      - name: Run agentic sync script
+        id: sync
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          SOURCE_REPO: ${{ steps.extract_info.outputs.source_repo }}
+          SOURCE_REF: ${{ steps.extract_info.outputs.source_ref }}
+          COMMIT_SHA: ${{ steps.extract_info.outputs.commit_sha }}
+          COMMIT_MESSAGE: ${{ steps.extract_info.outputs.commit_message }}
+        run: python .github/scripts/sync_docs.py
+
+      - name: Check for changes
+        id: check_changes
+        run: |
+          if [ -n "$(git status --porcelain)" ]; then
+            echo "has_changes=true" >> $GITHUB_OUTPUT
+          else
+            echo "has_changes=false" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Validate Jekyll build
+        if: steps.check_changes.outputs.has_changes == 'true'
+        run: |
+          # Install Jekyll dependencies
+          sudo apt-get update
+          sudo apt-get install -y ruby-full build-essential zlib1g-dev
+
+          # Install bundler if Gemfile exists
+          if [ -f "Gemfile" ]; then
+            gem install bundler
+            bundle install
+            bundle exec jekyll build
+          else
+            echo "No Gemfile found, skipping Jekyll build validation"
+          fi
+
+      - name: Run spellcheck
+        if: steps.check_changes.outputs.has_changes == 'true'
+        uses: rojopolis/spellcheck-github-actions@0.33.1
+        with:
+          config_path: .spellcheck.yml
+          task_name: Markdown
+        continue-on-error: true
+
+      - name: Create Pull Request
+        if: steps.check_changes.outputs.has_changes == 'true'
+        uses: peter-evans/create-pull-request@v6
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+          commit-message: |
+            docs: sync from ${{ steps.extract_info.outputs.source_repo }}
+
+            Automated sync from FalkorDB/${{ steps.extract_info.outputs.source_repo }}
+            Source commit: ${{ steps.extract_info.outputs.commit_sha }}
+            Source ref: ${{ steps.extract_info.outputs.source_ref }}
+
+            🤖 Generated with GitHub Actions Agentic Workflow
+          branch: sync/${{ steps.extract_info.outputs.source_repo }}-${{ github.run_number }}
+          delete-branch: true
+          title: 'docs: sync documentation from ${{ steps.extract_info.outputs.source_repo }}'
+          body: |
+            ## 📚 Automated Documentation Sync
+
+            This PR was automatically generated by the Agentic Documentation Sync workflow.
+
+            ### Source Information
+            - **Repository:** FalkorDB/${{ steps.extract_info.outputs.source_repo }}
+            - **Branch/Ref:** ${{ steps.extract_info.outputs.source_ref }}
+            - **Commit:** ${{ steps.extract_info.outputs.commit_sha }}
+            - **Trigger:** ${{ github.event_name }}
+
+            ### Changes
+            ${{ steps.sync.outputs.changes_summary }}
+
+            ### Review Checklist
+            - [ ] Documentation content is accurate and up-to-date
+            - [ ] Links and references work correctly
+            - [ ] Code examples are properly formatted
+            - [ ] Spellcheck passes (see workflow results)
+            - [ ] Jekyll build succeeds
+
+            ### Notes
+            This PR requires **human review and approval** before merging.
+
+            ---
+            🤖 Generated by [Agentic Workflow](.github/workflows/sync-docs-agentic.yml) |
+            📖 [Workflow Documentation](.github/SYNC_WORKFLOW.md)
+          labels: |
+            documentation
+            automated
+            sync
+          draft: false
+
+      - name: Summary
+        run: |
+          if [ "${{ steps.check_changes.outputs.has_changes }}" == "true" ]; then
+            echo "✅ Documentation sync completed - PR created"
+          else
+            echo "ℹ️ No changes detected - skipping PR creation"
+          fi
diff --git a/.gitignore b/.gitignore
index 6a11587..d4ac95e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,2 +1,20 @@
 _site/
 *.dic
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+
+# Virtual environments
+venv/
+ENV/
+env/
+
+# IDE
+.vscode/
+.idea/
+*.swp
+*.swo
diff --git a/.wordlist.txt b/.wordlist.txt
index bd680d2..13da277 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -515,6 +515,8 @@ mcp
 sse
 SSE
 mcpServers
+MCPServer
+QueryWeaver
 stdio
 Cursor
 IDE
diff --git a/README.md b/README.md
index 3520354..23b1e31 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,50 @@
 [![Workflow](https://github.com/FalkorDB/docs/actions/workflows/pages/pages-build-deployment/badge.svg?branch=main)](https://github.com/FalkorDB/docs/actions/workflows/pages/pages-build-deployment)
-[![Discord](https://img.shields.io/discord/1146782921294884966?style=flat-square)](https://discord.gg/ErBEqN9E)
+[![Sync](https://github.com/FalkorDB/docs/actions/workflows/sync-docs-agentic.yml/badge.svg)](https://github.com/FalkorDB/docs/actions/workflows/sync-docs-agentic.yml)
+[![Discord](https://img.shields.io/discord/1146782921294884966?style=flat-square)](https://discord.gg/ErBEqN9E)
 [![Try Free](https://img.shields.io/badge/Try%20Free-FalkorDB%20Cloud-FF8101?labelColor=FDE900&style=flat-square)](https://app.falkordb.cloud)
 [![Trendshift](https://trendshift.io/api/badge/repositories/14787)](https://trendshift.io/repositories/14787)
 # https://docs.falkordb.com
+> **📚 Official documentation for FalkorDB** - The fastest graph database, powered by GraphBLAS
+## 🔄 Automated Documentation Sync
-# Build
+
+This repository uses an **Agentic Workflow** to automatically sync documentation from FalkorDB repositories. When documentation is updated in source repositories, a PR is automatically created here to keep the docs in sync.
+
+**Monitored Repositories:**
+- FalkorDB (core database)
+- GraphRAG-SDK, QueryWeaver (GenAI tools)
+- falkordb-py, falkordb-ts, JFalkorDB, NFalkorDB (client libraries)
+- flex (UDF library), falkordb-browser (visualization)
+- FalkorDB-MCPServer (agentic integrations)
+
+📖 **Learn more:** [Sync Workflow Documentation](.github/SYNC_WORKFLOW.md)
+
+## 🏗️ Development
+
+### Build
 ```bash
 bundle install
 bundle exec jekyll build
 ```
-# Run
+### Run
 ```bash
 bundle exec jekyll serve
 ```
+
+## 📝 Contributing
+
+Documentation contributions are welcome! You can either:
+- **Edit directly**: Make changes via PR to this repository
+- **Update source**: Modify docs in the source repository (automatically synced)
+
+See [SYNC_WORKFLOW.md](.github/SYNC_WORKFLOW.md) for details on the automated sync process.
+
+## 📄 License
+
+See [References/license.md](References/license.md) for licensing information.
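
Note on the `changes_summary` output: the sync script in this change writes a summary that can span several lines to the file named by `GITHUB_OUTPUT`, and the workflow later reads it as `steps.sync.outputs.changes_summary`. GitHub Actions accepts multi-line output values in a delimiter form (`name<<DELIMITER`, the value, then `DELIMITER` on its own line). The following is a minimal sketch of that pattern, not the script's actual code: the helper name `write_multiline_output` and the random delimiter are assumptions made for illustration.

```python
import os
import uuid
from typing import Optional


def write_multiline_output(name: str, value: str, path: Optional[str] = None) -> str:
    """Append a (possibly multi-line) output to a GitHub Actions output file.

    Hypothetical helper for illustration; the name and signature are not
    taken from sync_docs.py.
    """
    # Fall back to the null device when not running inside GitHub Actions.
    path = path or os.environ.get("GITHUB_OUTPUT", os.devnull)

    # A random delimiter makes a clash with the value's own content unlikely.
    delimiter = f"ghadelimiter_{uuid.uuid4().hex}"

    # Delimiter form: name<<DELIM, then the value, then DELIM on its own line.
    record = f"{name}<<{delimiter}\n{value}\n{delimiter}\n"

    with open(path, "a", encoding="utf-8") as f:
        f.write(record)
    return record
```

Called as `write_multiline_output("changes_summary", summary)` at the end of `main()`, a helper like this would keep each line of the summary intact when the PR body interpolates it. A single-line `name=value` write would only carry the first line.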