
[S#3543] Piece-wise linear compression of column groups first working prototype #2415 #2420

Open
mori49 wants to merge 25 commits into apache:main from mori49:main

Conversation

@mori49 mori49 commented Jan 29, 2026

No description provided.

@janniklinde janniklinde (Contributor) left a comment

Thank you for your first contribution @mori49, this is a good start. I left some comments in the code. You used segmented least squares, which is a fine approach (even though control over the actual loss is quite limited). One limiting factor is the O(n³) compression complexity, which is not viable for production compression. This particular approach can be optimized to O(n²), for example by precomputing prefix sums or an SSE matrix (please first address the smaller formatting issues and other code suggestions before attempting that optimization).
In general, we may think of a more lightweight and accurate method to preserve targetLoss as an upper bound (this will come after the first submission deadline).
Also, please avoid German comments or variable/method names in your contribution.
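To make the O(n²) suggestion concrete: a minimal sketch of the prefix-sum idea, not code from this PR. It assumes 1-D points (x_k = k, y_k = values[k]); the class and method names are illustrative only. With five prefix arrays built in O(n), the least-squares SSE of any segment [i, j] is an O(1) lookup, so evaluating all O(n²) candidate segments in the DP costs O(n²) overall instead of O(n³).

```java
// Sketch: O(n) precomputation of prefix sums so the least-squares SSE of
// any segment [i, j] can be evaluated in O(1), bringing segmented least
// squares from O(n^3) down to O(n^2). Names are illustrative.
public class SegmentSSE {
	private final double[] px, py, pxx, pxy, pyy; // prefix sums; index 0 = empty prefix

	public SegmentSSE(double[] y) {
		int n = y.length;
		px = new double[n + 1]; py = new double[n + 1];
		pxx = new double[n + 1]; pxy = new double[n + 1]; pyy = new double[n + 1];
		for (int k = 0; k < n; k++) {
			double x = k;
			px[k + 1] = px[k] + x;
			py[k + 1] = py[k] + y[k];
			pxx[k + 1] = pxx[k] + x * x;
			pxy[k + 1] = pxy[k] + x * y[k];
			pyy[k + 1] = pyy[k] + y[k] * y[k];
		}
	}

	/** Residual sum of squares of the best-fit line on points i..j (inclusive), in O(1). */
	public double sse(int i, int j) {
		double n = j - i + 1;
		double sx = px[j + 1] - px[i], sy = py[j + 1] - py[i];
		double sxx = pxx[j + 1] - pxx[i], sxy = pxy[j + 1] - pxy[i], syy = pyy[j + 1] - pyy[i];
		double cxx = sxx - sx * sx / n; // centered (co)variances, unnormalized
		double cxy = sxy - sx * sy / n;
		double cyy = syy - sy * sy / n;
		if (cxx == 0) // single point or degenerate x
			return Math.max(cyy, 0);
		return Math.max(cyy - cxy * cxy / cxx, 0); // clamp tiny negative round-off
	}
}
```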

Contributor:

Move this class from colgroup/scheme package to colgroup/.
In general, all methods that are currently unimplemented should throw new NotImplementedException()

Contributor:

This file should not be part of the PR. You can keep it locally but you should untrack it and not add it to your commits. You could use git rm --cached bin/systemds-standalone.sh.

Contributor:

It seems like you reformatted the file to revert the tabs -> spaces conversion, which is good. However, there are still many unnecessary changes. I would recommend that you revert this file to the original state of the repository and then only add the enum value PiecewiseLinear to CompressionType.

Comment on lines 1077 to 1250
Contributor:

To keep this file clean, I recommend that you create a new class called PiecewiseLinearUtils in the package functional. Your compressPiecewiseLinearFunctional(...) then just calls PiecewiseLinearUtils.compressSegmentedLeastSquares(...).

Contributor:

Here, please revert the file. Did you change anything in this file (except the tabs -> spaces conversion, which should be reverted)?

You might consider creating a variable double targetLoss and a method public CompressionSettingsBuilder setTargetLoss(double loss) {...}. If you then add targetLoss as a parameter of the CompressionSettings constructor, you can set the target loss directly via the CompressionSettingsBuilder.
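The builder suggestion could look roughly like this. The names setTargetLoss and targetLoss follow the reviewer's wording, but the surrounding classes are simplified stand-ins here, not the real SystemDS CompressionSettings/CompressionSettingsBuilder:

```java
// Sketch of the suggested builder extension (simplified stand-ins for the
// real CompressionSettings / CompressionSettingsBuilder classes).
public class CompressionSettingsBuilder {
	private double targetLoss = 0.0; // default: lossless behavior assumed

	public CompressionSettingsBuilder setTargetLoss(double loss) {
		this.targetLoss = loss;
		return this; // fluent style, matching the existing builder setters
	}

	public CompressionSettings create() {
		// pass the target loss through the settings constructor
		return new CompressionSettings(targetLoss);
	}

	public static class CompressionSettings {
		public final double targetLoss;

		public CompressionSettings(double targetLoss) {
			this.targetLoss = targetLoss;
		}
	}
}
```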

Contributor:

Remove the JUnit Jupiter assertions; they will cause the build to fail, as we don't use Jupiter.

Contributor:

There should be no underscores in method names.

Move this test file to test/component/compress/colgroup.

You have a lot of isolated tests (which also look autogenerated rather than handwritten). It would be nice to have more meaningful tests. Please remove some redundant ones and add tests on randomly generated data (with a fixed seed) where you create a ColGroupPiecewiseLinearCompressed and then call decompressToDenseBlock. Then compare the result to the original data and compute a loss (which should be no more than some upper bound).
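The suggested round-trip test could be sketched as follows. Since the real ColGroupPiecewiseLinearCompressed and decompressToDenseBlock are not available here, the compression step is a self-contained stand-in (one least-squares line per fixed-size block); only the test pattern itself — fixed seed, compress, decompress, compare against a loss bound — is what the review asks for:

```java
// Sketch of the suggested round-trip test: generate random data with a fixed
// seed, "compress" it piecewise-linearly (one least-squares line per fixed-size
// block, standing in for ColGroupPiecewiseLinearCompressed), reconstruct,
// and check that the loss stays under an upper bound.
public class PiecewiseRoundTripSketch {

	/** Fit y = a + b*x on each block of blockSize rows and reconstruct. */
	public static double[] compressDecompress(double[] y, int blockSize) {
		double[] out = new double[y.length];
		for (int start = 0; start < y.length; start += blockSize) {
			int end = Math.min(start + blockSize, y.length);
			double n = end - start, sx = 0, sy = 0, sxx = 0, sxy = 0;
			for (int k = start; k < end; k++) {
				sx += k; sy += y[k]; sxx += (double) k * k; sxy += k * y[k];
			}
			double denom = n * sxx - sx * sx;
			double b = denom == 0 ? 0 : (n * sxy - sx * sy) / denom; // slope
			double a = (sy - b * sx) / n;                            // intercept
			for (int k = start; k < end; k++)
				out[k] = a + b * k;
		}
		return out;
	}

	/** Mean squared error between original and reconstructed data. */
	public static double mse(double[] a, double[] b) {
		double s = 0;
		for (int k = 0; k < a.length; k++)
			s += (a[k] - b[k]) * (a[k] - b[k]);
		return s / a.length;
	}
}
```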

Comment on lines 139 to 142
Contributor:

Weird comment

Contributor:

You take the first column, which is fine for now, but in a finished implementation you would either repeat compression on every column or do a multidimensional regression, where you treat a 'row' of all indices as a vector.
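The first option (repeating the compression on every column) could be sketched like this; a single-segment linear fit per column stands in for the full piecewise scheme, and all names are illustrative rather than taken from the PR:

```java
// Sketch: instead of compressing only column 0, repeat the (here:
// single-segment) least-squares linear fit independently for every column
// of the group. Names are illustrative.
public class PerColumnLinearFit {

	/** fits[c] = {intercept, slope} of the least-squares line for column c over row index x. */
	public static double[][] fitColumns(double[][] data) {
		int nRows = data.length, nCols = data[0].length;
		double[][] fits = new double[nCols][2];
		for (int c = 0; c < nCols; c++) {
			double sx = 0, sy = 0, sxx = 0, sxy = 0;
			for (int r = 0; r < nRows; r++) {
				sx += r; sy += data[r][c]; sxx += (double) r * r; sxy += r * data[r][c];
			}
			double denom = nRows * sxx - sx * sx;
			double slope = denom == 0 ? 0 : (nRows * sxy - sx * sy) / denom;
			fits[c][1] = slope;
			fits[c][0] = (sy - slope * sx) / nRows; // intercept
		}
		return fits;
	}
}
```

The multidimensional alternative would instead regress each row, viewed as a vector in R^nCols, against the row index in one pass, which shares segment boundaries across all columns of the group.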

Contributor:

Avoid emojis. Also, they are usually a hint of LLM-generated code (which is strictly forbidden for your submissions).

Labels: None yet
Projects: Status: In Progress
2 participants