Recovered from filesystem after data loss. This squashes ~58 commits originally made between 2026-03-23 and 2026-03-28. The full original reflog is preserved in docs/recovered-git-history.md.

New OoT-specific factories:
- OoTSceneFactory (OOT:SCENE, OOT:ROOM): scene command parsing and binary export
- OoTSkeletonFactory: skeleton, limb, and skin vertex support
- OoTAnimationFactory: normal, curve, legacy, and player animations
- OoTCollisionFactory: collision mesh with camera data and waterboxes
- OoTArrayFactory: Shipwright-compatible VTX and Vec3s arrays

Modified upstream:
- DisplayListFactory: OoT cross-segment DList handling, VTX consolidation, virtual segment 0x80, G_BRANCH_Z discovery, ZAPD compatibility fixes
- Companion: OoT factory registration, BUILD_OOT cmake option
- ResourceType: OoT type codes (OSKL, OSLB, OANM, OROM, OCOL, OPTH, OTXT)

Tooling (soh/):
- zapd_to_torch.py: converts ZAPDTR/OTRExporter XML to Torch YAML
- test_assets.sh, check.sh, verify.sh, manifest.sh, lib.sh: test harness
- list_assets.py: asset manifest query tool

Status at time of loss: 20,432 assets passing, 0 failures. 14,355 scene assets in progress (scene/room factory implemented, iterating on binary format correctness). OoTTextFactory was not recovered and needs recreation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- identify_roms.sh: identifies OoT ROMs by SHA1, renames to standardized format, handles duplicates
- extract_dma.py: extracts DMA tables from all 17 ROM versions using Shipwright filelists, outputs JSON keyed by filename
- Pre-computed DMA tables for all 17 versions (14 unique)
- Manifests directory with gitignore for generated hash files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
config.yml moved from soh/ to soh/assets/yml/ where Torch expects it. Generated per-version YAML dirs are gitignored via local .gitignore rather than the top-level one. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add lib/libyaz0/ with decode support following libmio0/libyay0 pattern
- Wire YAZ0 into Decompressor::Decode and AutoDecode
- Add missing PendingVtx struct in DeferredVtx namespace
- Add missing IS_VIRTUAL_SEGMENT macro in BaseFactory.h
- Add libyaz0 to CMake C_FILES glob

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- TranslateAddr now recognizes high segments (>= 0x80) when they exist in the segment map, not just standard segments (0x01-0x1F)
- ASSET_PTR extracts segment offset for virtual segments too, preventing raw 0x80XXXXXX addresses from being used as buffer offsets

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
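To illustrate the segment handling described in this commit, here is a minimal Python sketch. It is not Torch's C++ implementation: `split_segmented`, `translate_addr`, and the `segment_map` dictionary (segment number to ROM base offset) are hypothetical names chosen for the example.

```python
def split_segmented(addr: int) -> tuple[int, int]:
    """Split a 32-bit segmented address into (segment, offset)."""
    return (addr >> 24) & 0xFF, addr & 0x00FFFFFF


def translate_addr(addr: int, segment_map: dict[int, int]) -> int:
    """Return the ROM offset for addr, or addr unchanged if its segment is unmapped.

    segment_map is a hypothetical {segment number: ROM base offset} table.
    """
    segment, offset = split_segmented(addr)
    # Any segment present in the map is accepted, including high ones (>= 0x80),
    # so 0x80XXXXXX is reduced to base + offset instead of being used raw.
    if segment in segment_map:
        return segment_map[segment] + offset
    return addr
```

With a map of `{0x80: 0x00A00000}`, the address `0x80001234` resolves to `0x00A01234` rather than being treated as a raw buffer offset.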
OTRExporter writes 0-byte files for LimbTable entries. BlobFactory crashed when trying to Write() a null buffer. Guard the write with an empty check. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Empty blobs (e.g. LimbTable) now write 0 bytes to match OTRExporter reference output instead of writing a header with size 0
- test_assets.sh auto-logs to soh/logs/ with timestamp
- New compare_asset.sh tool for hex-diffing individual assets between reference and generated O2R

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
*.o2r are generated archive files. torch.hash.yml is a Torch build cache tracking which YAMLs have been processed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Hash all extracted files in a single sha256sum call instead of one process per file
- Redirect torch output to a log file instead of piping through grep
- Collapse duplicate jq reduce into one pass with inline fail count

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrites the asset test script in Python to avoid per-file process spawning. YAML collection, O2R extraction, and hashing are all done in-process. Hashes assets directly from the zip without extracting to disk. 107s → 1.6s for 17,516 object assets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
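A rough sketch of hashing archive members in-process, without extracting to disk, might look like the following. It is not the actual test_assets.py code: the function name `hash_archive` is hypothetical, and SHA-256 is assumed only because the earlier shell version used sha256sum.

```python
import hashlib
import zipfile


def hash_archive(source) -> dict[str, str]:
    """Map each file name in a zip archive to the SHA-256 of its contents.

    source may be a path or a binary file-like object.
    """
    hashes = {}
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # read() decompresses the member in memory; nothing is written to disk
            hashes[info.filename] = hashlib.sha256(archive.read(info)).hexdigest()
    return hashes
```

Because everything happens in one process, the per-file cost is a decompress and a hash rather than a process spawn.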
- Add BUILD_OOT option (default ON) following pattern of other games, defines OOT_SUPPORT so OoT factories are registered
- Stub OoTTextFactory so it compiles (real impl is task HarbourMasters#5)
- Expose DeferredVtx::BeginDefer in DisplayListFactory.h so OoTSceneFactory can call it

Enables 16,952 additional assets: 12,377 → 29,329 passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Enable GFX auto-discovery for auto-discovered limbs (previously disabled, causing 573 limbs to have empty DList paths)
- Fix LOD limb DList suffix: use "FarDL" instead of "DL2" to match OTRExporter/ZAPDTR naming convention
- Fix Curve limb DList suffixes: "CurveDL"/"Curve2DL" to match ZAPDTR
- Resolve LOD far DList before near, so shared-address limbs use the Far name for both fields (matches OTRExporter behavior)
- Rewrite compare_asset.sh as compare_asset.py (takes two O2Rs, no torch run needed)
- test_assets.py now saves generated.o2r to soh/o2r/ by default

Objects: 17,322 passed, 1 failed (MTX), 193 not generated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OTRExporter writes a 0-byte file for each skeleton's limb array (e.g. gKeeseSkeletonLimbs). Add this to the skeleton factory's parse to match. Objects: 17,515 passed, 1 failed (MTX), 0 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OTRExporter/ZAPDTR reads the N64 Mtx as 16 sequential int32 BE values and writes them back as-is. Our exporter was writing individual uint16 int-part values, which produced byte-swapped output within each 32-bit word. Now reads and stores the raw int32 values in the parser and writes them in the binary exporter, matching the reference format. Objects: 17,516 passed, 0 failed. Code: 11 passed, 0 failed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
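The difference between the two approaches can be sketched in Python. This is an illustration only, not the Torch exporter: the function names are hypothetical, little-endian output is an assumption (based on the O2R being described as LE elsewhere in this history), and the half-word variant is a simplified stand-in for the old behavior.

```python
import struct

MTX_SIZE = 64  # an N64 Mtx is 16 32-bit words


def convert_mtx_words(rom_bytes: bytes) -> bytes:
    """Read 16 big-endian int32 values and write them back unchanged (as LE)."""
    words = struct.unpack(">16i", rom_bytes[:MTX_SIZE])
    return struct.pack("<16i", *words)


def convert_mtx_halfwords(rom_bytes: bytes) -> bytes:
    """Simplified stand-in for the old path: convert each uint16 on its own."""
    halves = struct.unpack(">32H", rom_bytes[:MTX_SIZE])
    return struct.pack("<32H", *halves)
```

For ROM bytes `00 01 02 03`, the word-based path emits `03 02 01 00`, while the half-word path emits `01 00 03 02`, so the two outputs disagree inside every 32-bit word.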
When multiple segments map to the same physical ROM address (common for overlays which alias segments 8-13 to their code data), the virtual address patcher was returning a segment 0x0D address instead of segment 0x80. This caused texture lookups to fail because textures are registered under segment 0x80 offsets in the YAML. Now explicitly prefers segment 0x80 when it maps to the same physical address, matching how YAML offsets are declared. Overlays: 325 passed, 0 failed (was 101 failures). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
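The preference rule can be sketched as a reverse lookup from a physical address to a segmented one. This is a hypothetical Python illustration, not the Torch patcher: `to_segmented` and `segment_map` are invented names, and falling back to the lowest matching segment when 0x80 is absent is an assumption made only to keep the example deterministic.

```python
from typing import Optional


def to_segmented(phys: int, segment_map: dict[int, int],
                 preferred: int = 0x80) -> Optional[int]:
    """Return a segmented address for phys, preferring the `preferred` segment."""
    candidates = []
    for segment, base in segment_map.items():
        offset = phys - base
        if 0 <= offset <= 0x00FFFFFF:
            candidates.append((segment, offset))
    if not candidates:
        return None
    for segment, offset in candidates:
        # Several segments may alias the same ROM data; pick 0x80 when it matches
        if segment == preferred:
            return (segment << 24) | offset
    segment, offset = min(candidates)  # assumed fallback: lowest segment number
    return (segment << 24) | offset
```

With segments 0x08, 0x0D, and 0x80 all based at the same ROM address, the lookup yields a 0x80XXXXXX address, which is the form the YAML offsets use.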
Scene/room DLists are auto-discovered by the scene factory with room-prefixed names matching OTRExporter output. Pre-declared DList entries from ZAPDTR XMLs used different naming (gXxxDL_ vs xxx_room_0DL_) causing mismatches. Scenes: 10,729 passed, 0 failed (was 27 failures). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Room mesh DLists are auto-discovered by the scene factory with correct room-prefixed names. Pre-declared DLists from ZAPDTR XMLs (both room-named and scene-named) conflict with auto-discovery. 18 scene-level DLists declared in room files (e.g. gKinsutaDL_0030B0) are now missing — these need to be handled by the scene factory or a separate mechanism. Tracked as part of scene work. 31,156 passed, 1 failed (version), 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Scene/room alternate headers (SetAlternateHeaders command) are now recursively processed as sub-assets. Processing is deferred until after the primary header's commands (especially SetMesh) complete, so primary DLists are registered first and alternate headers reuse their names for shared ROM addresses. DeferredVtx state is saved/restored around each alternate header to prevent VTX consolidation corruption. Exposes SaveAndClearPending/RestorePending and PendingVtx struct in DisplayListFactory.h for use by scene factory. 31,436 passed (+280), 128 scene failures (Sets/Cutscenes), 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Alternate headers pass parent's baseName for sub-asset naming (DLists, backgrounds, cutscenes, pathways) so names match OTRExporter which doesn't prefix with Set_
- Fix cutscene suffix: "CutsceneData" instead of "Cs" to match OTRExporter's GetSegmentedPtrName convention

31,501 passed (+345 from session start), 108 failed, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Cutscenes use entryName (with Set_ prefix) matching OTRExporter
- Pathways use baseName (parent name) matching OTRExporter
- Fix cutscene suffix: CutsceneData instead of Cs

31,583 passed, 109 failed (84 Set command data, 24 cutscenes, 1 version).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use getNeighborSize to limit pathway entry scanning instead of a hard 256 maximum. This helps some alternate headers with tight boundaries, though pathway count inference remains imperfect without XML metadata. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OTRExporter creates empty placeholder files for actor list data (e.g. Bmori1_room_0ActorEntry_000054). Add these as companion files in the scene factory. 32,151 passed (+568), 109 failed, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OoT alternate headers reference the same DLists as primary headers under Set_-prefixed names. OTRExporter creates both files with identical content.

- Add RegisterAssetAlias to Companion for creating duplicate O2R entries with the same binary data under different names
- Scene factory uses entryName for DList symbols and ResolveGfxWithAlias to register aliases when an existing DList is found at the same offset
- Alias files are written during the export phase using the already-serialized binary data (zero re-parsing overhead)

34,539 passed (+2,388), 109 failed, 738 not generated. Session total: 12,377 → 34,539 (34.9% → 97.6%).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace naive 0xFFFFFFFF scan with a command-aware parser that correctly determines cutscene boundaries by parsing the command structure (ID + entry count + entry size per type). Handles camera splines (terminated by continueFlag), scene transitions (0x2D), destinations (0x3E8), and standard commands. Cutscene sizes are now correct, but content still differs from reference because OTRExporter re-serializes with different byte ordering (ROM is BE, O2R is LE with CMD_HH packing). Full re-serialization is the next step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Document the BE→LE field re-packing needed for each command type. Raw copy doesn't work because OTRExporter uses CMD_HH/CMD_BBH/CMD_HBB macros to pack fields into uint32 words differently than ROM layout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace raw cutscene copy with proper BE→LE re-serialization using CMD_HH/CMD_BBH/CMD_HBB field packing to match OTRExporter output. Handles camera splines, actor cues, misc/lighting/BGM, textbox, rumble, settime, transition, and destination commands. 33 additional cutscenes now match. 76 failures remain (likely a subtle issue with uint16/uint32 field reading in some entries). 34,572 passed (97.7%), 76 failed, 738 not generated. Session total: 12,377 → 34,572 (34.9% → 97.7%). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actor cue entries have rotY/rotZ as the 3rd word packed with CMD_HH, not a raw uint32. Differentiate actor cues from misc/lighting/BGM commands to apply correct packing. 34,602 passed (97.8%), 46 failed (44 cutscene, 1 pathway, 1 version). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
34,602/35,386 (97.8%) passing. Remaining: 44 cutscene format issues, 598 audio (no factory), 135 scene sub-assets, 4 text. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract cutscene re-serialization into reusable SerializeCutscene function. Register OOT:CUTSCENE factory for YAML-declared cutscenes (gXxxCs assets from ZAPDTR XML). 34,698 passed (+50), 0 failed, 688 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 289 lines of inline cutscene serialization with a call to the reusable SerializeCutscene function. No behavior change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register OOT:PATH factory for YAML-declared path assets (gXxxPath from ZAPDTR XML). Reads num_paths pathway entries from ROM and serializes with the same format as scene companion pathway files. No doubling for standalone paths (doubling only occurs in SetPathways command handler). 34,726 passed (+28), 0 failed, 660 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both the SetPathways handler and the standalone OOT:PATH factory now call the shared SerializePathways function. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Read JPEG screen buffer data (320x240x2 = 153600 bytes) from ROM at the source address in SetMesh type 1 entries. Write as Background companion files matching OTRExporter format (IGBO header + size + data). Handles both single background (format 1) and multiple backgrounds (format 2). 34,761 passed (+35), 0 failed, 625 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both mesh type 1 format 1 (single) and format 2 (multiple) background handlers now call the shared helper function. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
g-prefixed scene DLists are at different ROM offsets from room mesh DLists, but including them in YAML still causes 838 failures. They interfere with gAddrMap lookups during scene factory processing. Need a companion-file approach instead. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
YAML approach causes 838 regressions because GFX factory processing during YAML parse corrupts DeferredVtx state for subsequent scene factory. Document root cause and alternative approaches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Keep g-prefixed DLists in room YAMLs but sort OOT:ROOM/OOT:SCENE entries before GFX entries within each file. This ensures the scene factory processes rooms first with clean VTX state, preventing auto-discovery conflicts from pre-registered VTX addresses. 34,779 passed (+18), 0 failed, 607 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
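The reordering can be sketched as a stable sort over one file's entries. This is a hypothetical illustration rather than the zapd_to_torch.py code; it assumes each entry is a dictionary carrying a `type` key, and `order_entries` is an invented name.

```python
SCENE_TYPES = {"OOT:ROOM", "OOT:SCENE"}


def order_entries(entries: dict[str, dict]) -> dict[str, dict]:
    """Move OOT:ROOM/OOT:SCENE entries ahead of everything else in one YAML file."""
    # The key is False for scene/room entries and True otherwise; False sorts first.
    # sorted() is stable, so entries keep their relative order within each group.
    ordered = sorted(entries.items(),
                     key=lambda item: item[1].get("type") not in SCENE_TYPES)
    return dict(ordered)
```

Since Python dictionaries preserve insertion order, dumping the returned mapping keeps the room and scene entries at the top of the file.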
Stop skipping room-prefixed DLists from room XML files. Some (like spot00_room_0DL_012B20, spot16_room_0DL_00AA48) are child DLists not discovered by SetMesh and need to be in the YAML. The YAML entry ordering (OOT:ROOM before GFX) prevents VTX auto-discovery conflicts. AddAsset deduplicates mesh DLists that were already auto-discovered by the scene factory. 34,783 passed (+4), 0 failed, 603 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive plan verified against actual OTRExporter and ZAPDTR source. Documents multi-segment ROM structure, binary formats for samples/fonts/sequences, and implementation approach. Key finding: audio data spans 4 separate DMA entries (code, Audiobank, Audiotable, Audioseq), not a single segment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 0: YAML setup, Step 1: main entry, Step 2: load segments, Step 3: sequences (+110), Step 4: samples (+449), Step 5: fonts (+38). Each step independently verifiable before proceeding. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract OoT audio table offsets, sequence names, and sample names from Shipwright XML. Auto-add Audiobank/Audioseq/Audiotable segments to the audio YAML. Enhanced _format_asset to handle nested list/dict structures for audio sample banks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register OOT:AUDIO factory that creates the main audio/audio entry (64-byte OAUD header with version 2). Fix audio YAML path to avoid double nesting (audio.yml not audio/audio.yml). 34,784 passed (+1), 0 failed, 602 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse audio table headers from decompressed code segment. Extract 110 sequences from Audioseq ROM data with metadata (font indices, medium, cachePolicy). Write as OSEQ companion files. 34,893 passed (+109), 0 failed, 493 not generated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ZAPDTR treats sequence entries with size=0 as aliases: ptr field is an index to another sequence entry whose data should be used. Sequence 087_File_Select aliases sequence 40. 34,894 passed, 0 failed. All 110 sequences passing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
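The alias rule can be sketched as follows. This is a simplified Python illustration, not the Torch factory: `SeqEntry` and `resolve_sequence` are hypothetical names, the table is modeled as a plain list, and only one level of aliasing is followed.

```python
from typing import NamedTuple


class SeqEntry(NamedTuple):
    ptr: int   # offset into Audioseq, or a sequence index when size == 0
    size: int  # byte length of the sequence; 0 marks an alias


def resolve_sequence(index: int, table: list[SeqEntry], audioseq: bytes) -> bytes:
    """Return the sequence bytes for index, following a size-0 alias once."""
    entry = table[index]
    if entry.size == 0:
        # ptr is reinterpreted as the index of the entry that owns the data
        entry = table[entry.ptr]
    return audioseq[entry.ptr:entry.ptr + entry.size]
```

An entry such as `SeqEntry(ptr=1, size=0)` therefore yields the same bytes as entry 1.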
Documents the safe BinaryReader-based approach for parsing Audiobank structures, the exact pointer chains for drums/instruments/SFX, and the OSMP output format. Corrects the raw pointer arithmetic approach that caused segfaults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse font structures from Audiobank to discover all unique samples. Use LUS::BinaryReader with BE endianness and bounds checking for all ROM data reads. Extract sample data from Audiotable with loop metadata and ADPCM book data. 35,321 passed (+437), 3 failed (sample size discrepancies), 62 not generated (38 fonts + misc). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents cross-bank sample naming collision for Tom Drum, Drum Sidestick, and Windchimes. Identifies root cause and proposes fix options. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use absolute Audiotable offset (not relative) as the sample name lookup key, matching ZAPDTR's ZAudio.cpp:174. Only bank 1 (base=0) resolves named paths; other banks get fallback names like sample_5_00420C20. Fixes 3 data mismatches and 8 missing samples. Result: 449/449 samples pass (was 441 with wrong data for 3). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
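A hedged sketch of the lookup: the key is the bank base plus the sample's relative offset, and an unnamed sample gets a generated name. The function and parameter names are hypothetical, and reading the fallback as `sample_<bank index>_<absolute offset in hex>` is an assumption inferred from the example name in this commit.

```python
def sample_name(bank_index: int, bank_base: int, relative_offset: int,
                named_samples: dict[int, str]) -> str:
    """Look a sample up by its absolute Audiotable offset, with a fallback name."""
    absolute = bank_base + relative_offset
    name = named_samples.get(absolute)
    if name is not None:
        return name
    # Assumed fallback layout: bank index, then the absolute offset as 8 hex digits
    return f"sample_{bank_index}_{absolute:08X}"
```

Keying on the absolute offset means two banks that reuse the same relative offset no longer collide on one name.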
Documents the OSFT binary format, parsing details from ZAPDTR, and implementation approach for the remaining 38 audio font assets. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse drums, instruments, and SFX from Audiobank with envelope data and sample references. Replicate ZAPDTR stack residue behavior for invalid instrument entries. Add font name extraction to YAML generator. 596/598 audio assets pass. 2 fonts differ by 29 bytes total in dead data (uninitialized fields in invalid instruments from ZAPDTR UB). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ZAPDTR reuses the same stack slot for DrumEntry and InstrumentEntry. Invalid instruments before any valid one inherit the last drum's field values: drum.pan→inst.loaded, drum.loaded→inst.normalRangeLo, with padding/offset bytes mapping to zero. 598/598 audio assets now pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Parse message tables from compressed code segment and text data from uncompressed message_data_static segments. Handles PAL languages (ger/fra) with separate lang_offset pointer tables. 4/4 text assets pass. 35,385/35,386 total (only portVersion remaining). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents binary format (7 bytes: endianness flag + 3x uint16 BE), generation in OTRExporter, runtime consumption in SoH, and root cause of why Torch doesn't generate it (missing -u/--version CLI flag). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add Big Endian flag byte to ParseVersionString matching OTRExporter format (7 bytes: endianness + 3x uint16 BE). Pass -u 9.2.0 to torch in test_assets.py. 35,386/35,386 assets pass (100%). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
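The 7-byte layout (one flag byte followed by three big-endian uint16 values) can be sketched in Python. This is not the ParseVersionString implementation: `pack_version` is a hypothetical name, and the flag value is left to the caller because its actual value is not stated here.

```python
import struct


def pack_version(version: str, endian_flag: int) -> bytes:
    """Pack 'major.minor.patch' as one flag byte plus three big-endian uint16s."""
    major, minor, patch = (int(part) for part in version.split("."))
    # ">" disables padding, so the result is exactly 1 + 2 + 2 + 2 = 7 bytes
    return struct.pack(">BHHH", endian_flag, major, minor, patch)
```

For example, `pack_version("9.2.0", flag)` produces the flag byte followed by `00 09 00 02 00 00`.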
Explains why YAML generation requires a reference O2R for VTX backfill: VTX assets aren't in XML, and DeferredVtx auto-naming doesn't match ZAPDTR conventions that SoH expects at runtime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Smart approach to match the o2r 1:1, at least as far as quickly iterating to get to parity. Is it faster? As an aside, a lot of the parts I was unhappy with in terms of the flow between the rom and the o2r have more to do with the formatting (things being unnecessarily different from how they are stored in the rom), and this just inherits those issues, right?
not yet, it was during some parts of iteration but some complexity in processing was added that slowed it down, i plan to dig into perf improvements
yeah, the goal of this effort is a drop-in replacement for zapdtr/otrexporter. i want a testable way to say "ok, torch does what we used to use zapdtr/otrexporter to do." once that's working i'm all for making improvements to the file structure etc, but i don't want to block a tooling switch on that.
what this does
generates an o2r that matches what https://github.com/briaguya0/Shipwright/tree/fix-skinvtxcnt-ub (just dev from when i started with HarbourMasters/ZAPDTR#37 included) generates when using a PAL GC (sha1: 0227D7C0074F2D0AC935631990DA8EC5914597B4) rom. since zip files aren't generated deterministically the comparison was done file-by-file within the extracted archive.
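A minimal sketch of a file-by-file comparison between a reference and a generated archive, assuming both are ordinary zip files. `compare_archives` and the report keys are hypothetical names, not the ones used by the PR's scripts.

```python
import zipfile


def compare_archives(reference, generated) -> dict[str, list[str]]:
    """Compare two zip archives member by member, ignoring container metadata."""
    report = {"passed": [], "failed": [], "not_generated": []}
    with zipfile.ZipFile(reference) as ref, zipfile.ZipFile(generated) as gen:
        gen_names = set(gen.namelist())
        for name in sorted(ref.namelist()):
            if name.endswith("/"):
                continue  # directory entry, no content to compare
            if name not in gen_names:
                report["not_generated"].append(name)
            elif ref.read(name) == gen.read(name):
                report["passed"].append(name)
            else:
                report["failed"].append(name)
    return report
```

Comparing decompressed member contents sidesteps differences in entry order, timestamps, and compression settings between the two archives.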
relevant files
- zapd_to_torch.py
- test_assets.py

why this is POC/draft
- soh dir shouldn't be in here

things to look into