Expand validation docs for dandi-cli#1822 (companion JSONL, grouping, VisiData) #235

Open
yarikoptic wants to merge 4 commits into master from enh-validation

@yarikoptic yarikoptic commented Apr 4, 2026

Summary

Major expansion of the Validating Files documentation to cover the new validation features from dandi/dandi-cli#1822.

  • Rewrite validating-files.md from ~53 to ~360 lines with new sections:
    • dandi validate usage with tabbed output format examples (text/JSON/YAML/JSONL)
    • Filtering (--min-severity, --ignore) and grouping (-g severity -g id) with real output
    • Saving/loading results (--output, --load, automatic JSONL companion files)
    • Validation during upload — companion JSONL saved alongside logs
    • Reviewing results with VisiData (key bindings table + embedded asciinema recording)
    • Validation result schema reference (fields, origin, severity levels)
  • Enable mkdocs-material features: pymdownx.highlight, tabbed, snippets, code copy/annotate
  • Pre-generated example outputs from real dandi validate runs on Dandiset 000027 (NWB) and bids-examples/bids-error-examples (BIDS)
  • Scripted asciinema VisiData demo using datalad/screencaster pattern, with headless recording driver (record.sh via Xvfb + xdotool)
  • Cross-references from uploading-data.md to new validation companion file docs
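
To make the grouping idea concrete, here is a minimal Python sketch of what grouping by severity and then by id amounts to when applied to a JSONL file of results. It is not dandi-cli's implementation. The sample records, the rule ids, and the field names `severity`, `id`, and `path` are assumptions made for illustration only.

```python
import json
from collections import defaultdict

# Hypothetical records: the field names and rule ids below are assumptions
# for illustration, not the confirmed dandi-cli result schema.
SAMPLE_JSONL = """\
{"id": "EXAMPLE.missing_readme", "severity": "ERROR", "path": "README"}
{"id": "EXAMPLE.bad_extension", "severity": "ERROR", "path": "sub-01/data.txt"}
{"id": "EXAMPLE.bad_extension", "severity": "ERROR", "path": "sub-02/data.txt"}
{"id": "EXAMPLE.short_description", "severity": "WARNING", "path": "dandiset.yaml"}
"""


def load_jsonl(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def group_records(records, keys):
    """Group records by the tuple of values stored under `keys`."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec.get(k) for k in keys)].append(rec)
    return dict(groups)


records = load_jsonl(SAMPLE_JSONL)
grouped = group_records(records, ["severity", "id"])

# Print one summary line per (severity, id) group with its record count.
for (severity, rule_id), items in sorted(grouped.items()):
    print(f"{severity} {rule_id}: {len(items)}")
```

Because JSONL keeps one record per line, companion files from several runs could be concatenated and passed through the same two functions.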

New files

| Path | Purpose |
|------|---------|
| docs/examples/validation/*.{txt,jsonl,yaml,json} | Pre-generated validation output examples |
| docs/examples/validation/visidata-demo.cast | Asciinema recording of VisiData exploration |
| scripts/generate-validation-examples.sh | Reproducible script to regenerate all examples |
| scripts/visidata-demo/{demo.sh,record.sh,dot_visidatarc} | VisiData demo automation |

Dependencies

Test plan

  • mkdocs build succeeds with no errors
  • Content tabs (text/JSON/YAML/JSONL) render correctly
  • Asciinema player loads and plays the VisiData demo
  • Cross-links between validating-files.md and uploading-data.md resolve
  • Verify rendering on deployed preview (after push)

🤖 Generated with Claude Code

Extra TODOs for humans:

  • review the rendered asciinema demo: I tuned some of the delays, but we might want to simplify it, etc.
  • not sure if CI is involved yet anyhow, but ideally we should make them reproducible and regenerate on CI as dandi-cli and underlying validators might change etc.
  • this PR contains a framework to produce and demonstrate asciinema recordings of VisiData navigation, inspired by what I did in https://github.com/con/visidata-demos/tree/master/psychoinformatics-1 . If we keep reusing it, we might want/need to extract it somewhere else for centralized management, etc.
  • decide which of the above to do here and which later
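
As a starting point for the reproducibility item, a CI job could regenerate the examples into a scratch directory and fail if they differ from the committed ones. The Python sketch below shows only the comparison step. The directory names are hypothetical, and it assumes the regeneration script can be pointed at a separate output directory, which has not been verified here.

```python
import filecmp
import tempfile
from pathlib import Path


def list_files(root: Path) -> set[str]:
    """Return POSIX-style paths of all files under `root`, relative to it."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def find_drift(committed: Path, regenerated: Path) -> list[str]:
    """Return relative paths that differ in content or exist on one side only."""
    old, new = list_files(committed), list_files(regenerated)
    drift = set(old ^ new)  # files present on only one side
    for rel in old & new:
        # shallow=False forces a byte-by-byte content comparison
        if not filecmp.cmp(committed / rel, regenerated / rel, shallow=False):
            drift.add(rel)
    return sorted(drift)


# Tiny self-contained demo using throwaway directories (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    committed = Path(tmp) / "committed"
    regenerated = Path(tmp) / "regenerated"
    committed.mkdir()
    regenerated.mkdir()
    (committed / "a.txt").write_text("same\n")
    (regenerated / "a.txt").write_text("same\n")
    (committed / "b.jsonl").write_text("old\n")
    (regenerated / "b.jsonl").write_text("new\n")
    (regenerated / "c.yaml").write_text("extra\n")
    drift = find_drift(committed, regenerated)

print(drift)
```

A CI step could exit non-zero whenever the returned list is non-empty. Recordings such as the `.cast` file embed timing information and would likely need to be excluded or compared more loosely.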

netlify bot commented Apr 5, 2026

Deploy Preview for dandi-docs ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | de1b5a3 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/dandi-docs/deploys/69d2af22af07d000083d0d3e |
| 😎 Deploy Preview | https://deploy-preview-235--dandi-docs.netlify.app |


@yarikoptic What is your plan to guarantee the examples stay up to date? I don't see any CRON / CI here to check or notify of differences / submit updates

Don't want things getting out of sync

CodyCBakerPhD commented Apr 5, 2026

@yarikoptic Thank you for setting up the deploy preview, that helps considerably

Do you have any guess as to how hard it might be to do something like that for the DANDI Archive? (if only the web front end since I imagine backend would be tough)

EDIT: I see

not sure if CI is involved yet anyhow, but ideally we should make them reproducible and regenerate on CI as dandi-cli and underlying validators might change etc.

Can you look into this? My original thought was to have such things on the DANDI-CLI testing suite since that is closer to where code changes would break such things


On a dataset with actual errors (missing README, wrong file extension):

(a `console` code block follows in the diff)

@yarikoptic Remind me, are these lines of code tested through a doctest of some kind?

yarikoptic and others added 4 commits April 5, 2026 14:50
…rsist validation logs)

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "yolo -v /home/yoh/proj/dandi/dandi-cli-enh-validators:/home/yoh/proj/dandi/dandi-cli-enh-validators:ro -- 'In /home/yoh/proj/dandi/dandi-cli-enh-validators which is submitted as dandi/dandi-cli#1822 we significantly improved validation interfacing -- we serialize validation outputs and store so we could reload and potentially review with different filtering or use external tools like visidata to navigate.  I would like here to improve our https://docs.dandiarchive.org/user-guide-sharing/validating-files/ section with improved documentation, reflecting the state of that PR. We should demonstrate that we store companion validation files during upload so they could be re-reviewed/analyzed. We should show basic use of visidata to quickly review them.  Could use bids-examples repo and some sample dandisets (should be sufficiently small) to show how e.g. to compose multiple validation files. Ideally we should script production of example outputs, and/or store/share validation output example for easier access.  Do research how other projects using mkdocs produce similar demo walkthroughs, and what we have done so far in this repo. Do research, build plan for content and also implementation details.'",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
…rsist validation logs)

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "yolo -v /home/yoh/proj/dandi/dandi-cli-enh-validators:/home/yoh/proj/dandi/dandi-cli-enh-validators:ro -- --resume",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
- Fix asciinema-player cast file path for use_directory_urls (../ -> ../../)
- Rename "Validating BIDS Files" -> "Validating BIDS Datasets"
- Longer pauses after demo-say narration (2s -> 4s) for readability
- Add pauses after typed VisiData commands (go-col-regex, search)
- Use clean "validation-demo $" prompt instead of leaking absolute paths
- Add record.sh driver for headless asciinema recording via Xvfb
- Re-record visidata-demo.cast with all improvements

Co-Authored-By: Claude Code 2.1.92 / Claude Opus 4.6 <noreply@anthropic.com>
yarikoptic commented Apr 5, 2026

Do you have any guess as to how hard it might be to do something like that for the DANDI Archive? (if only the web front end since I imagine backend would be tough)

We have already had this for a long time -- it is a frontend and backend, IIRC working against staging S3, e.g. from dandi/dandi-archive#2771

image
