feat(code-chunk): full integration of YAML, TOML, JSON, and JSONL by matperez · Pull Request #28 · supermemoryai/code-chunk

matperez · 2026-02-22T13:07:39Z

What

Adds full support for YAML, TOML, JSON, and JSONL in the code-chunk pipeline: language detection, tree-sitter parsing, entity extraction (top-level sections), and chunking with size limits.
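As a rough illustration of the language-detection step, a minimal sketch is shown below. The function name `detectDataLanguage`, the lookup table, and the exact extension list are assumptions made for this example; they are not taken from the PR diff.

```typescript
// Hypothetical helper: maps a file extension to one of the four new Language values.
// The extension list is an assumption; the PR may register a different set.
type DataLanguage = "yaml" | "toml" | "json" | "jsonl";

const EXTENSION_TO_LANGUAGE: Record<string, DataLanguage> = {
  ".yaml": "yaml",
  ".yml": "yaml",
  ".toml": "toml",
  ".json": "json",
  ".jsonl": "jsonl",
};

function detectDataLanguage(filePath: string): DataLanguage | null {
  const dot = filePath.lastIndexOf(".");
  if (dot === -1) return null; // no extension at all
  const ext = filePath.slice(dot).toLowerCase();
  return EXTENSION_TO_LANGUAGE[ext] ?? null; // null for unknown extensions
}
```

Returning `null` for unknown extensions leaves the decision about a fallback to the caller.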

Why

Config and data files (YAML, TOML, JSON, JSONL) should be chunked like code: entities = top-level sections, boundaries and size controlled by the same node-based algorithm. This allows indexing and retrieval of config/docs in the same way as source files.
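To make "entities = top-level sections" concrete, here is a small sketch for JSON only. It uses `JSON.parse` instead of the tree-sitter queries the PR relies on, so it is a simplification and not the PR's extraction code; `SectionEntity` and `topLevelSections` are hypothetical names.

```typescript
// Simplified stand-in for tree-sitter based extraction (assumption: JSON only).
// Each top-level key of the root object becomes one "section" entity.
interface SectionEntity {
  type: "section";
  name: string;
}

function topLevelSections(jsonText: string): SectionEntity[] {
  const value: unknown = JSON.parse(jsonText); // throws on invalid JSON
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    return []; // a scalar or array root has no named sections
  }
  return Object.keys(value).map((name) => ({ type: "section" as const, name }));
}
```

For a document such as `{"name": "app", "scripts": {"build": "tsc"}}`, this yields two sections, `name` and `scripts`; nested keys like `build` stay inside their parent section.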

Notes

JSONL is handled without a single-file AST: lines are accumulated until maxChunkSize is reached; each line is treated as one entity (named after its first key, or "line N" if it has none).
New Language values: yaml, toml, json, jsonl. New EntityType: section.
Dependencies: tree-sitter-json, @tree-sitter-grammars/tree-sitter-yaml, @tree-sitter-grammars/tree-sitter-toml.
Tests added/updated for parser, extract, and integration (stream) for all four formats.
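The JSONL note above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `chunkJsonl`, `JsonlChunk`, and `JsonlEntity` are hypothetical names, `maxChunkSize` is measured in characters, blank lines are skipped, and "line N" uses 1-based line numbers.

```typescript
// Hypothetical types for this sketch; the real library's chunk shape may differ.
interface JsonlEntity { name: string; line: number; }
interface JsonlChunk { text: string; entities: JsonlEntity[]; }

// Name an entity after the first key of the line's object, else "line N".
function entityName(line: string, lineNumber: number): string {
  try {
    const parsed: unknown = JSON.parse(line);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      const firstKey = Object.keys(parsed)[0];
      if (firstKey !== undefined) return firstKey;
    }
  } catch {
    // Malformed JSON on this line: fall back to the positional name.
  }
  return `line ${lineNumber}`;
}

function chunkJsonl(source: string, maxChunkSize: number): JsonlChunk[] {
  const chunks: JsonlChunk[] = [];
  let lines: string[] = [];
  let entities: JsonlEntity[] = [];
  let size = 0; // characters in the current chunk, including joining newlines

  const flush = (): void => {
    if (lines.length === 0) return;
    chunks.push({ text: lines.join("\n"), entities });
    lines = [];
    entities = [];
    size = 0;
  };

  source.split("\n").forEach((raw, index) => {
    const line = raw.trim(); // also drops a trailing "\r"
    if (line === "") return; // assumption: blank lines carry no entity
    // Start a new chunk if adding this line (plus a newline) would exceed the limit.
    if (lines.length > 0 && size + 1 + line.length > maxChunkSize) flush();
    size += lines.length > 0 ? line.length + 1 : line.length;
    lines.push(line);
    entities.push({ name: entityName(line, index + 1), line: index + 1 });
  });
  flush();
  return chunks;
}
```

In this sketch a single line longer than `maxChunkSize` still becomes its own chunk and is never split, since each line is one entity.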

- Added languages: yaml, toml, json, jsonl, and the entity type section
- Parser: tree-sitter grammars for yaml/toml/json, file extensions
- Entity extraction: queries and a fallback for top-level sections
- extractName/extractSignature and imports for the new formats
- JSONL: separate line-based chunking branch (without an AST of the whole file)
- Tests: parser, extract, integration for all four formats

matperez changed the title ~~feat(code-chunk): полная интеграция YAML, TOML, JSON и JSONL~~ feat(code-chunk): full integration of YAML, TOML, JSON, and JSONL Feb 22, 2026

matperez wants to merge 1 commit into supermemoryai:main from matperez:feature/data-formats-yaml-toml-json-jsonl
