[SYSTEMDS-3891] OOC Pipelining Support, New Primitives, and New Operators #2409

Merged
janniklinde merged 1 commit into apache:main from janniklinde:OOCPipeliningAndFixes
Feb 19, 2026

@janniklinde
Contributor

This PR introduces pipelining support for operators that produce at most one MatrixBlock per incoming block. The current out-of-core primitives are highly concurrent, which can lead to fan-out (and consequent OOMs) when a source stream produces blocks quickly, so this implementation aims to process downstream operations in the same thread. To reliably release objects that are still referenced from the caller's stack frames, we defer the downstream task using a thread-local context (if available). Because the deferred call still executes within the caller's task, a sequence of pipelined operators completes before new tasks are taken from the task queue.
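
To illustrate the deferral described above, here is a minimal, hypothetical Java sketch. It is not the code from this PR: `DeferredPipelineSketch`, `runTask`, and `emit` are made-up names, and the actual implementation may differ substantially. The sketch assumes a task wrapper that installs a thread-local queue. An operator that emits a block enqueues the downstream call instead of invoking it, so the operator's own stack frame unwinds before the downstream operator runs, still on the same thread and inside the same task.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names are SystemDS classes. */
final class DeferredPipelineSketch {
    // Per-thread queue of deferred downstream calls; null means no task context is active.
    private static final ThreadLocal<Deque<Runnable>> CONTEXT = new ThreadLocal<>();

    /** Runs a task and drains everything it deferred before returning. */
    static void runTask(Runnable task) {
        if (CONTEXT.get() != null) { // nested call: the outer drain loop handles deferrals
            task.run();
            return;
        }
        Deque<Runnable> deferred = new ArrayDeque<>();
        CONTEXT.set(deferred);
        try {
            task.run();
            Runnable next;
            // Deferred calls may enqueue further calls; keep draining until empty.
            while ((next = deferred.poll()) != null)
                next.run();
        } finally {
            CONTEXT.remove();
        }
    }

    /** Hands a block downstream: deferred if a context exists, otherwise called directly. */
    static <T> void emit(T block, Consumer<T> downstream) {
        Deque<Runnable> deferred = CONTEXT.get();
        if (deferred != null)
            deferred.add(() -> downstream.accept(block));
        else
            downstream.accept(block);
    }

    /** Demo: two chained one-in/one-out operators; returns the order of events. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        runTask(() -> {
            emit(1, (Integer a) -> {
                log.add("op1 start");
                emit(a + 1, (Integer b) -> log.add("op2 got " + b));
                log.add("op1 end");
            });
            log.add("source returned");
        });
        return log;
    }
}
```

In `demo()`, the recorded order is `source returned`, `op1 start`, `op1 end`, `op2 got 2`: each operator returns before its downstream consumer starts, yet the whole chain finishes before `runTask` returns and the thread could pick up another task.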
Additionally, this PR adds new primitives required for general matrix multiplication, along with new operators that rely on them.
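
As a rough illustration of why a general matrix multiply needs a join plus an aggregation, consider the blocked product C[i,j] = sum over k of A[i,k] * B[k,j]: every A tile with column index k must meet every B tile with row index k (a many-to-many join on k), and the partial products are then summed per output index (i,j). The Java sketch below is hypothetical and not taken from this PR; `JoinAggregateSketch` and `Tile` are invented names, and plain `double` values stand in for matrix blocks.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; scalar values stand in for matrix blocks. */
final class JoinAggregateSketch {
    /** One tile of a blocked matrix: block-row index, block-column index, payload. */
    static final class Tile {
        final int row;
        final int col;
        final double value;

        Tile(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    /** Computes C = A * B over tiles: join on A.col == B.row, then sum per (row, col). */
    static Map<List<Integer>, Double> multiply(List<Tile> a, List<Tile> b) {
        // Build side of the join: group B tiles by their block-row index k.
        Map<Integer, List<Tile>> bByRow = new HashMap<>();
        for (Tile t : b)
            bByRow.computeIfAbsent(t.row, k -> new ArrayList<>()).add(t);

        // Probe side: each A tile matches all B tiles sharing its k (many-to-many),
        // and partial products are aggregated into the output cell (i, j).
        Map<List<Integer>, Double> c = new HashMap<>();
        for (Tile ta : a)
            for (Tile tb : bByRow.getOrDefault(ta.col, List.of()))
                c.merge(List.of(ta.row, tb.col), ta.value * tb.value, Double::sum);
        return c;
    }
}
```

A streaming out-of-core version would additionally have to decide when an output cell is complete (all k contributions seen) so that it can be emitted and released; the sketch ignores that and keeps everything in memory.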
We also added (optional) messaging capabilities to operators and streams, which may be required to communicate stream capabilities in the future (e.g., targeted requests of tiles, cached, ...). While it is still uncertain to what degree these messages can be used, they are required for future experiments.
As streams sometimes need size information, they may now hold the corresponding CacheableData object. I am not sure whether these references can cause memory-management issues.
Finally, this PR contains various bugfixes, safety checks, improved error handling, and additional cache features.
Sorry (again) for the large PR.

@codecov

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 56.74677% with 702 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.64%. Comparing base (b394e32) to head (f88898f).
⚠️ Report is 9 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../sysds/runtime/ooc/cache/OOCLRUCacheScheduler.java | 57.83% | 50 Missing and 28 partials ⚠️ |
| ...sysds/runtime/instructions/ooc/OOCInstruction.java | 79.89% | 47 Missing and 30 partials ⚠️ |
| ...ime/instructions/ooc/MapMMChainOOCInstruction.java | 53.98% | 61 Missing and 14 partials ⚠️ |
| ...he/sysds/runtime/ooc/cache/OOCMatrixIOHandler.java | 48.27% | 60 Missing and 15 partials ⚠️ |
| ...che/sysds/runtime/ooc/cache/DeferredReadQueue.java | 42.10% | 58 Missing and 8 partials ⚠️ |
| .../sysds/runtime/instructions/ooc/CachingStream.java | 44.54% | 44 Missing and 17 partials ⚠️ |
| ...untime/instructions/ooc/SubscribableTaskQueue.java | 47.05% | 37 Missing and 8 partials ⚠️ |
| ...he/sysds/runtime/ooc/stream/FilteredOOCStream.java | 21.73% | 34 Missing and 2 partials ⚠️ |
| ...sysds/runtime/instructions/ooc/PlaybackStream.java | 27.50% | 26 Missing and 3 partials ⚠️ |
| ...instructions/ooc/MatrixIndexingOOCInstruction.java | 56.86% | 18 Missing and 4 partials ⚠️ |
| ... and 18 more | | |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2409      +/-   ##
============================================
+ Coverage     71.51%   71.64%   +0.13%     
- Complexity    47441    47750     +309     
============================================
  Files          1539     1547       +8     
  Lines        182605   183790    +1185     
  Branches      35916    36084     +168     
============================================
+ Hits         130585   131674    +1089     
+ Misses        42028    41990      -38     
- Partials       9992    10126     +134     

Better Error Handling
Streams hold underlying CacheableData<?> ?
Generalized primitives for many-to-many joins
Primitive for (multi-)aggregations
Various bugfixes
Added cache features to merge/prioritize deferred requests
@janniklinde janniklinde force-pushed the OOCPipeliningAndFixes branch from f92dc69 to f88898f on February 19, 2026 14:04
@janniklinde janniklinde merged commit 3f30d08 into apache:main Feb 19, 2026
80 of 85 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in SystemDS PR Queue Feb 19, 2026