Skip to content

Matrix compression for federated learning broadcasts#2524

Open
nirvanjhurreepriv wants to merge 18 commits into
apache:mainfrom
nirvanjhurreepriv:feature/compression
Open

Matrix compression for federated learning broadcasts#2524
nirvanjhurreepriv wants to merge 18 commits into
apache:mainfrom
nirvanjhurreepriv:feature/compression

Conversation

@nirvanjhurreepriv

Copy link
Copy Markdown

Authors (for submission)

  • Charansurya Udaysingh Jhurree, 488213
  • Deniz Durmusoglu, 530387
  • Yagiz Bartu Arslan, 529386

Summary

Implements matrix compression techniques to reduce communication overhead in SystemDS's federated learning backend.

Changes

  • New package: org.apache.sysds.runtime.compress
    • TopK Sparsification (up to 66x compression)
    • Probabilistic Quantization (4-16x compression, unbiased)
    • CompressionConfig, CompressionFactory, MatrixCompressor interface
  • Integration into FederationMap.broadcast() with safe fallback
  • 19 unit tests, all passing

Testing

mvn test -Dtest=TopKCompressorTest,ProbabilisticQuantizationCompressorTest

@ywcb00 ywcb00 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the PR. :)
I added some comments inline. Please have a look at them and try to make the corresponding modifications in the code.
Also, please make sure that the other GitHub workflows succeed, especially the Java Tests and the Coding Style workflows.
Thank you and all the best,
David

Comment on lines +5 to +6
*
* @author Nirvan C. UdaysinghJhurree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Author information is already stored by git. No need to add it explicitly to the function header.

Comment on lines +5 to +6
*
* @author Nirvan C. Udaysingh Jhurree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove author information.

Comment on lines +23 to +24
*
*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty lines.

Comment on lines +9 to +10
*
*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty lines.

Comment on lines +16 to +17
*
*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty lines.

*
*
*/
public class ProbabilisticQuantizationCompressorTest {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can extend from the AutomatedTestBase class and leverage the utility functions in org.apache.sysds.test.TestUtils.

[This applies to both test classes that are added with this PR]


// Normalize to [0, 1]
double normalized = (value - min) / (max - min);
normalized = Math.max(0.0, Math.min(1.0, normalized)); // Clamp

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this line? Can the value be smaller than zero or larger than 1 after normalizing it to [0, 1]?

// Find bounding level indices
double scaled = normalized * (levels - 1);
int lowerIdx = (int) scaled;
int upperIdx = Math.min(lowerIdx + 1, levels - 1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the Math.min operation here?

Comment on lines +85 to +87
public org.apache.sysds.runtime.matrix.data.MatrixBlock decompress(CompressedMatrix compressed)
throws org.apache.sysds.runtime.compress.exceptions.DecompressionException {
return (org.apache.sysds.runtime.matrix.data.MatrixBlock) compressed.getCompressedData();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import the MatrixBlock and use it as MatrixBlock instead of writing the full path.

Comment on lines +52 to +54
case ONE_BIT_CS:
throw new UnsupportedOperationException(
"1-Bit Compressed Sensing not yet implemented");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you still planning on implementing compressed sensing?
If yes, here is a reference that might be useful: https://doi.org/10.1109/CISS.2008.4558487
Otherwise, please remove this case.

@github-project-automation github-project-automation Bot moved this from In Progress to In Review in SystemDS PR Queue Jun 30, 2026
@nirvanjhurreepriv

Copy link
Copy Markdown
Author

@ywcb00 I have addressed all the review feedback in commit fcca7c0. Here is a summary of what was changed:

  • Moved package to org.apache.sysds.runtime.controlprogram.federated.compression
  • Added Apache v2.0 license headers to all files
  • Removed @author tags
  • Merged CompressionFactory.create() methods into one
  • Removed ONE_BIT_CS (not planning to implement because of time constraint)
  • Removed getCompressionType() from interface and implementations
  • Gated compression behind DMLScript.FEDERATED_COMPRESSION flag
  • Removed unnecessary clamp and Math.min in quantization
  • Moved tests to org.apache.sysds.test.functions.federated.io, extended AutomatedTestBase
  • Merged related unit tests
  • Removed custom CI/CD workflow

I have verified locally that checkstyle passes and all 14 tests pass. Could you please approve the workflows so the Java Tests and Coding Style checks can run on your end as well?
Thank you for your time and input :)

Best,
Nirvan

@nirvanjhurreepriv

Copy link
Copy Markdown
Author

@ywcb00 All issues are now fixed in the latest commit:

Checkstyle: converted all indentation to tabs
LicenseCheck: added proper Apache v2.0 headers to test files and removed unnecessary AutomatedTestBase inheritance (its tearDown() requires Hadoop FS setup which pure unit tests don't need)
All 14 compression tests pass locally with 0 errors

The remaining CI failures are pre-existing and unrelated to our changes: 721 unapproved files in docs/ and vendor JS/CSS, and BuiltinMultiLogRegTest.testMultiLogRegInterceptSpark2 timing out after 1000s on the CI runner.

Thanks,
Nirvan

@codecov

codecov Bot commented Jul 3, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 29.67864% with 372 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.51%. Comparing base (17803b1) to head (be8f869).
⚠️ Report is 26 commits behind head on main.

Files with missing lines Patch % Lines
...he/sysds/runtime/compress/TopK/TopKCompressor.java 0.00% 91 Missing ⚠️
...ntization/ProbabilisticQuantizationCompressor.java 0.00% 59 Missing ⚠️
...ache/sysds/runtime/compress/CompressionConfig.java 0.00% 26 Missing ⚠️
...ogram/federated/compression/CompressionConfig.java 0.00% 26 Missing ⚠️
...pache/sysds/runtime/compress/CompressedMatrix.java 0.00% 22 Missing ⚠️
...ram/federated/compression/TopK/TopKCompressor.java 77.77% 17 Missing and 3 partials ⚠️
...s/runtime/compress/Quantization/QuantizedData.java 0.00% 16 Missing ⚠️
...che/sysds/runtime/compress/CompressionFactory.java 0.00% 15 Missing ⚠️
...apache/sysds/runtime/compress/CompressionType.java 0.00% 15 Missing ⚠️
...untime/controlprogram/federated/FederationMap.java 0.00% 10 Missing and 1 partial ⚠️
... and 13 more
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2524      +/-   ##
============================================
+ Coverage     71.41%   71.51%   +0.10%     
- Complexity    48817    49366     +549     
============================================
  Files          1572     1602      +30     
  Lines        189089   191045    +1956     
  Branches      37101    37430     +329     
============================================
+ Hits         135030   136627    +1597     
- Misses        43596    43884     +288     
- Partials      10463    10534      +71     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

2 participants