[AMORO-4099] Add table-summary metric collection option for non-optimizing tables #4101

Draft
j1wonpark wants to merge 1 commit into apache:master from j1wonpark:feature/table-summary-without-optimizing
@j1wonpark
Contributor

Why are the changes needed?

Currently, table_summary metrics (total files, file sizes, health score, etc.) are only collected when self-optimizing.enabled=true. The metric update path (setTableSummary()) is gated behind the optimizingConfig.isEnabled() check in TableRuntimeRefreshExecutor.tryEvaluatingPendingInput(), so tables with self-optimizing disabled always show 0 or N/A in Grafana dashboards.

This PR decouples metric collection from self-optimizing execution by introducing a new table property.
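
As a rough illustration of the intended configuration, the sketch below reads the two property keys from a plain map. The `readBoolean` helper and the fallback defaults (`true` for `self-optimizing.enabled`, `false` for the new property) are assumptions made for this example and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class TableSummaryPropertySketch {
    // Property keys as named in this PR description.
    static final String SELF_OPTIMIZING_ENABLED = "self-optimizing.enabled";
    static final String TABLE_SUMMARY_ENABLED = "self-optimizing.table-summary.enabled";

    // Hypothetical helper: returns the parsed boolean, or defaultValue when the key is absent.
    static boolean readBoolean(Map<String, String> props, String key, boolean defaultValue) {
        String raw = props.get(key);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        // A table that opts out of self-optimizing but still wants summary metrics.
        Map<String, String> props = new HashMap<>();
        props.put(SELF_OPTIMIZING_ENABLED, "false");
        props.put(TABLE_SUMMARY_ENABLED, "true");

        // The default values passed here are assumptions for illustration only.
        boolean optimizing = readBoolean(props, SELF_OPTIMIZING_ENABLED, true);
        boolean summary = readBoolean(props, TABLE_SUMMARY_ENABLED, false);
        System.out.println("optimizing=" + optimizing + ", tableSummary=" + summary);
    }
}
```

With these two properties set, a table would be expected to report summary metrics without being scheduled for optimizing.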

Close #4099.

Brief change log

  • Add self-optimizing.table-summary.enabled table property constant in TableProperties
  • Add tableSummaryEnabled field to OptimizingConfig with getter/setter
  • Parse the new property in TableConfigurations.parseOptimizingConfig()
  • Add else if branch in TableRuntimeRefreshExecutor.tryEvaluatingPendingInput() to collect summary metrics when optimizing is disabled but table-summary is enabled
  • Add TestTableSummaryWithoutOptimizing test class with 2 test cases
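
The branch described in the change log can be sketched as a small decision function. This is a simplified model, not the code in `TableRuntimeRefreshExecutor`: the `Action` enum and the `decide` method are hypothetical names, and the real method may check further conditions that are not mentioned in this description.

```java
public class SummaryGateSketch {
    // Hypothetical outcomes of one refresh cycle for a single table.
    enum Action { EVALUATE_PENDING_INPUT, COLLECT_SUMMARY_ONLY, SKIP }

    // Models the if / else-if gate: the existing optimizing path takes priority,
    // and the new branch applies only when optimizing is disabled.
    static Action decide(boolean optimizingEnabled, boolean tableSummaryEnabled) {
        if (optimizingEnabled) {
            // Existing path: pending input is evaluated and the summary is updated.
            return Action.EVALUATE_PENDING_INPUT;
        } else if (tableSummaryEnabled) {
            // New path: update summary metrics without scheduling optimizing.
            return Action.COLLECT_SUMMARY_ONLY;
        }
        // Both disabled: metrics stay at their initial values.
        return Action.SKIP;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, false));
        System.out.println(decide(false, true));
        System.out.println(decide(false, false));
    }
}
```

The last two calls in `main` correspond to the two test cases added by this PR: summary enabled with optimizing off, and both disabled.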

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible
    • testSummaryCollectedWhenOptimizingDisabledAndSummaryEnabled: verifies metrics are updated when optimizing is off but summary is enabled
    • testSummaryNotCollectedWhenBothDisabled: verifies metrics remain at initial values when both are disabled
  • Add screenshots for manual tests if appropriate
  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? JavaDocs

…izing tables

Signed-off-by: Jiwon Park <jpark92@outlook.kr>
@codecov-commenter

Codecov Report

❌ Patch coverage is 50.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 28.86%. Comparing base (fda105e) to head (88f649c).
⚠️ Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/org/apache/amoro/config/OptimizingConfig.java | 0.00% | 5 Missing ⚠️ |
| .../scheduler/inline/TableRuntimeRefreshExecutor.java | 80.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #4101      +/-   ##
============================================
+ Coverage     22.39%   28.86%   +6.46%     
- Complexity     2552     3951    +1399     
============================================
  Files           458      654     +196     
  Lines         42116    52175   +10059     
  Branches       5917     6637     +720     
============================================
+ Hits           9433    15061    +5628     
- Misses        31871    36013    +4142     
- Partials        812     1101     +289     

| Flag | Coverage Δ |
|---|---|
| core | 28.86% <50.00%> (?) |
| trino | ? |

@j1wonpark j1wonpark marked this pull request as draft February 27, 2026 03:03
Development

Successfully merging this pull request may close these issues.

[Improvement]: Collect table_summary metrics independently of self-optimizing
