From 79a579774343c26029830c47ade221ef93294266 Mon Sep 17 00:00:00 2001
From: Rita Chen
Date: Tue, 17 Mar 2026 16:32:59 -0400
Subject: [PATCH] ghcp(crawl): ex0 bootstrap scaffolding

---
 .copilot-track/crawl/README.md   | 95 ++++++++++++++++++++++++++++++++
 ai-track-docs/SYSTEM-OVERVIEW.md | 41 ++++++++++++++
 ai-track-docs/architecture.mmd   | 50 +++++++++++++++++
 ai-track-docs/build-test.md      | 95 ++++++++++++++++++++++++++++++++
 4 files changed, 281 insertions(+)
 create mode 100644 .copilot-track/crawl/README.md
 create mode 100644 ai-track-docs/SYSTEM-OVERVIEW.md
 create mode 100644 ai-track-docs/architecture.mmd
 create mode 100644 ai-track-docs/build-test.md

diff --git a/.copilot-track/crawl/README.md b/.copilot-track/crawl/README.md
new file mode 100644
index 000000000..c2cc29ff6
--- /dev/null
+++ b/.copilot-track/crawl/README.md
@@ -0,0 +1,95 @@
+# Copilot Crawl Track — README
+
+This directory (`.copilot-track/crawl/`) holds AI-assisted crawl-and-modernisation artefacts for the **MarkLogic Java Client API** repository. It is not part of the production library; contents are for developer guidance and AI context only.
+
+---
+
+## What Is the Crawl Track?
+
+The crawl track is an incremental, evidence-driven workflow for making large-scale changes to this codebase using AI assistance (GitHub Copilot / agent mode). Changes are broken into small, reviewable PRs that form a **chain** — each PR builds on the previous one.
+
+---
+
+## Chain-PR Pattern
+
+A chain-PR is a sequence of pull requests where each PR:
+
+1. Targets the **previous PR's branch** (not `main`) as its base, creating a linear dependency chain.
+2. Carries a **single, focused concern** (e.g., "migrate HTTP client from HttpClient to OkHttp", "update Jackson version", "replace deprecated API calls in DocMgr").
+3. Is reviewed and merged in order — do **not** merge PR N+1 before PR N is merged and its branch updated.
+
+```
+main ← PR-1 (foundation) ← PR-2 (layer A) ← PR-3 (layer B) ← ...
+```
+
+When the base PR merges, rebase subsequent PRs down the chain to keep them conflict-free:
+
+```bash
+git fetch origin
+git checkout feature/crawl-layer-B
+git rebase origin/feature/crawl-layer-A
+git push --force-with-lease
+```
+
+---
+
+## Evidence in PRs
+
+Every crawl PR must include evidence that the change is safe. Accepted evidence types:
+
+| Evidence | Where to add it |
+| ------------------------------------------------ | -------------------------------------------------------------------------- |
+| Passing CI green-check (unit + functional tests) | Shown automatically on the PR by Jenkins |
+| Before/after compile output | Paste in PR description under `## Build evidence` |
+| Test-coverage delta | Add `## Test delta` section; attach Gradle test report if coverage dropped |
+| Copilot prompt used | Add `## Prompt used` section (see below) |
+| Manual verification steps | Add `## Manual verification` with exact commands run |
+
+PRs that lack evidence will be marked **needs-evidence** and not merged.
+
+---
+
+## Prompt Usage
+
+AI prompts that drove a crawl change belong in the PR description under `## Prompt used`. This creates an audit trail and lets reviewers reproduce or adjust the change.
+
+**Template:**
+
+```markdown
+## Prompt used
+
+> Agent mode, model: claude-sonnet-4-6
+>
+> "Migrate all usages of `com.marklogic.client.impl.OkHttpServices` constructor that
+> pass a plain `String` password to instead use `char[]` and call `Arrays.fill` after use.
+> Do not modify test files. Only change files under marklogic-client-api/src/main/."
+
+Files changed by prompt:
+Files reviewed manually:
+```
+
+Storing prompts in PRs helps future crawl passes understand _why_ a change was made, not just _what_ changed.
+
+---
+
+## Adding New Crawl Artefacts
+
+Place any generated files, diff summaries, or migration notes inside this directory as flat Markdown or JSON files. Suggested naming:
+
+```
+.copilot-track/crawl/
+├── README.md ← this file
+├── 001-.md ← plan / notes for crawl step 1
+├── 002-.md ← plan / notes for crawl step 2
+└── ...
+```
+
+Keep each step file small (< 200 lines). Reference `ai-track-docs/` for system-level context.
+
+---
+
+## Related Docs
+
+- [ai-track-docs/SYSTEM-OVERVIEW.md](../../ai-track-docs/SYSTEM-OVERVIEW.md) — what this project does and how it is structured
+- [ai-track-docs/build-test.md](../../ai-track-docs/build-test.md) — how to build and run tests locally
+- [ai-track-docs/architecture.mmd](../../ai-track-docs/architecture.mmd) — Mermaid architecture diagram
diff --git a/ai-track-docs/SYSTEM-OVERVIEW.md b/ai-track-docs/SYSTEM-OVERVIEW.md
new file mode 100644
index 000000000..107376b0c
--- /dev/null
+++ b/ai-track-docs/SYSTEM-OVERVIEW.md
@@ -0,0 +1,41 @@
+# System Overview — MarkLogic Java Client API
+
+## Purpose
+
+The **MarkLogic Java Client API** (`com.marklogic:marklogic-client-api`) is a Java library that exposes MarkLogic Server's REST API as a type-safe, fluent Java interface. It supports reading, writing, deleting, and querying JSON, XML, binary, and text documents, as well as ACID multi-statement transactions, semantic (SPARQL/RDF), Full-text search, alerting, Data Services, and Row Manager (Optic API).
+
+## Repository Root
+
+`marklogic-client-api-parent` (Gradle multi-project).
+
+## Modules
+
+| Module | Description |
+| -------------------------------------- | ---------------------------------------------------------------------------- |
+| `marklogic-client-api` | Core library — all production source code |
+| `marklogic-client-api-functionaltests` | Functional / integration tests requiring a live MarkLogic instance |
+| `ml-development-tools` | Kotlin-based developer tooling (code generation helpers) |
+| `test-app` | MarkLogic application deployed to the test server (modules, schemas, config) |
+| `examples` | Standalone usage examples |
+
+## Runtime Requirements
+
+- **Java 17** (minimum; Java 21 also supported and tested in CI)
+- **MarkLogic Server** (for integration/functional tests) — started via `docker-compose.yaml`
+
+## Technology Stack
+
+- Build: **Gradle** (wrapper at `./gradlew`)
+- Test framework: **JUnit 5** (unit) + MarkLogic functional test harness
+- CI: **Jenkins** (`Jenkinsfile`) — Docker-based MarkLogic, parallel Java 17/21 builds
+- Primary language: **Java**; developer tooling in **Kotlin**
+
+## Key External Dependencies
+
+- OkHttp (HTTP client transport)
+- Jackson (JSON serialization)
+- SLF4J / Logback (logging)
+
+## Relationship to MarkLogic Server
+
+All network communication travels over the **MarkLogic REST Management and Client APIs** (typically port 8000/8002). The library never connects directly to MarkLogic's internal ports; authentication is via HTTP Digest or certificate.
diff --git a/ai-track-docs/architecture.mmd b/ai-track-docs/architecture.mmd
new file mode 100644
index 000000000..f98ba7bc7
--- /dev/null
+++ b/ai-track-docs/architecture.mmd
@@ -0,0 +1,50 @@
+%%{init: {"theme": "neutral"}}%%
+graph TD
+    subgraph callers["Calling Code"]
+        APP["Java Application / Examples"]
+    end
+
+    subgraph core["marklogic-client-api (core)"]
+        DatabaseClient["DatabaseClient\n(entry-point factory)"]
+        DocMgr["Document Managers\n(JSON / XML / Text / Binary / Generic)"]
+        QueryMgr["QueryManager\n(search, cts, SPARQL)"]
+        TxMgr["TransactionManager\n(ACID multi-statement)"]
+        RowMgr["RowManager\n(Optic / SQL)"]
+        DataSvc["Data Services\n(generated proxies)"]
+        RESTServices["RESTServices\n(OkHttp transport layer)"]
+    end
+
+    subgraph devtools["ml-development-tools"]
+        CodeGen["Proxy Code Generator\n(Kotlin)"]
+    end
+
+    subgraph testapp["test-app"]
+        MLConfig["ml-config\n(DB, forests, REST server)"]
+        MLModules["ml-modules\n(XQuery / SJS modules)"]
+        MLSchemas["ml-schemas\n(TDE templates)"]
+    end
+
+    subgraph server["MarkLogic Server (Docker / remote)"]
+        REST["REST Client API\n(port 8000 / 8002)"]
+        XDBC["e-node internal"]
+    end
+
+    APP --> DatabaseClient
+    DatabaseClient --> DocMgr
+    DatabaseClient --> QueryMgr
+    DatabaseClient --> TxMgr
+    DatabaseClient --> RowMgr
+    DatabaseClient --> DataSvc
+    DocMgr --> RESTServices
+    QueryMgr --> RESTServices
+    TxMgr --> RESTServices
+    RowMgr --> RESTServices
+    DataSvc --> RESTServices
+    RESTServices -->|"HTTP Digest / Cert auth"| REST
+    REST --> XDBC
+
+    CodeGen -->|"generates Java proxy classes"| DataSvc
+
+    MLConfig -->|"mlDeploy"| REST
+    MLModules -->|"mlDeploy"| REST
+    MLSchemas -->|"mlReloadSchemas"| REST
diff --git a/ai-track-docs/build-test.md b/ai-track-docs/build-test.md
new file mode 100644
index 000000000..299d816fc
--- /dev/null
+++ b/ai-track-docs/build-test.md
@@ -0,0 +1,95 @@
+# Build & Test Guide
+
+## Prerequisites
+
+| Requirement | Notes |
+| -------------- | -------------------------------------------------------------------------------------------- |
+| JDK 17+ | JDK 21 also works; set `JAVA_HOME` or rely on Gradle toolchain auto-provisioning |
+| Docker | Required for functional tests (MarkLogic container) |
+| Gradle wrapper | Use `./gradlew` (Linux/macOS) or `gradlew.bat` (Windows); do **not** install Gradle globally |
+
+---
+
+## Quick Build (no tests)
+
+```bash
+./gradlew clean build -x test
+```
+
+---
+
+## Unit Tests — core library only
+
+```bash
+./gradlew :marklogic-client-api:test
+```
+
+Unit tests have **no external dependencies**; they run without MarkLogic.
+
+---
+
+## Developer-Tools Tests
+
+```bash
+./gradlew :ml-development-tools:test
+```
+
+---
+
+## Functional / Integration Tests
+
+Functional tests require a running MarkLogic instance. Start it with Docker Compose first:
+
+```bash
+docker compose up -d
+```
+
+Then deploy the test application and run the functional tests:
+
+```bash
+./gradlew :test-app:mlDeploy :test-app:mlReloadSchemas
+./gradlew :marklogic-client-api-functionaltests:test
+```
+
+> **Tip:** The `Jenkinsfile` contains the authoritative CI test sequence if the local steps diverge.
+
+---
+
+## Running a Specific Test Class
+
+```bash
+./gradlew :marklogic-client-api:test --tests "com.marklogic.client.test.SomeTest"
+```
+
+---
+
+## Gradle Properties
+
+Key properties live in `gradle.properties` and `marklogic-client-api/gradle.properties`. Override on the command line with `-P=`:
+
+```bash
+./gradlew :marklogic-client-api:test -PmlHost=localhost -PmlPort=8000
+```
+
+---
+
+## Build Artifacts
+
+After a successful build the JAR is at:
+
+```
+marklogic-client-api/build/libs/marklogic-client-api-.jar
+```
+
+---
+
+## Linting / Static Analysis
+
+No dedicated lint step is configured in the current Gradle build.
+
+---
+
+## Common Pitfalls
+
+- Functional tests **will hang or fail** if Docker is not running or MarkLogic has not finished starting. Wait ~30 s after `docker compose up -d` before deploying.
+- Java toolchain auto-provisioning requires internet access on first run. On air-gapped machines set `org.gradle.java.installations.paths` in `~/.gradle/gradle.properties`.
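
Review note on the "Wait ~30 s" pitfall in `build-test.md`: a fixed sleep can be too short on a slow machine and is wasted time on a fast one. One possible alternative is a small retry helper that polls a probe command until it succeeds. The sketch below is not part of this patch; `wait_for` is a hypothetical helper name, and any probe URL or credentials passed to it are assumptions rather than values taken from this repository.

```shell
# wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of the repository): runs COMMAND repeatedly
# until it exits 0 or MAX_ATTEMPTS attempts have been made.
# Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Discard the probe's output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    # Sleep only between attempts, not after the final failure.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

A possible use, assuming the Manage API answers on port 8002 and that `ML_USER` and `ML_PASSWORD` hold valid credentials (both are assumptions): `wait_for 30 2 curl --silent --fail --anyauth --user "$ML_USER:$ML_PASSWORD" http://localhost:8002/manage/v1 && ./gradlew :test-app:mlDeploy :test-app:mlReloadSchemas`. The helper takes the probe as an argument, so the same loop works with whatever readiness check the `Jenkinsfile` actually uses.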