Skip to content

fix: TAP-10529 inaccurate statistics of output indicators in share cd…#3091

Open
mnianqi wants to merge 1 commit intorelease-v4.13.0from
TAP-10529-v4.13.0
Open

fix: TAP-10529 inaccurate statistics of output indicators in share cd…#3091
mnianqi wants to merge 1 commit intorelease-v4.13.0from
TAP-10529-v4.13.0

Conversation

@mnianqi
Copy link
Contributor

@mnianqi mnianqi commented Mar 19, 2026

…c task

@augmentcode
Copy link

augmentcode bot commented Mar 19, 2026

🤖 Augment PR Summary

Summary: This PR adjusts how observable metric handlers are managed in order to address inaccurate output indicator statistics in shared CDC tasks.

Changes:

  • Switches per-table, per-data-node, and per-processor handler maps from eager initialization to lazy initialization
  • Adds null-guards when stopping the task before closing handler instances
  • Initializes handler maps on-demand when init/table-count aspects are processed

Technical Notes: The handler map lifecycle is now dependent on which aspects are actually observed during a task run, which affects when these maps may be null.

🤖 Was this summary useful? React with 👍 or 👎

Copy link

@augmentcode augmentcode bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. 3 suggestions posted.

Fix All in Augment

Comment augment review to trigger a new review at any time.

private final Map<String, TableSampleHandler> tableSampleHandlers = new ConcurrentHashMap<>();
private final Map<String, DataNodeSampleHandler> dataNodeSampleHandlers = new ConcurrentHashMap<>();
private final Map<String, ProcessorNodeSampleHandler> processorNodeSampleHandlers = new ConcurrentHashMap<>();
private Map<String, TableSampleHandler> tableSampleHandlers;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since tableSampleHandlers now starts as null, any direct dereference can throw NPE (e.g. handleSnapshotWriteTableCompleteFunc uses tableSampleHandlers::get when sourceTableName is non-null). This seems reachable if table-count/handler initialization never happened for a run but snapshot completion still fires.

Severity: high

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

private final Map<String, DataNodeSampleHandler> dataNodeSampleHandlers = new ConcurrentHashMap<>();
private final Map<String, ProcessorNodeSampleHandler> processorNodeSampleHandlers = new ConcurrentHashMap<>();
private Map<String, TableSampleHandler> tableSampleHandlers;
private Map<String, DataNodeSampleHandler> dataNodeSampleHandlers;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With dataNodeSampleHandlers now nullable, several handlers still call dataNodeSampleHandlers.get(...) without a prior null-check (e.g. handleBatchSize, handleDataNodeClose). If any of those aspects can occur before handleDataNodeInit (or in flows where init isn’t observed), this will NPE.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

private final Map<String, ProcessorNodeSampleHandler> processorNodeSampleHandlers = new ConcurrentHashMap<>();
private Map<String, TableSampleHandler> tableSampleHandlers;
private Map<String, DataNodeSampleHandler> dataNodeSampleHandlers;
private Map<String, ProcessorNodeSampleHandler> processorNodeSampleHandlers;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

processorNodeSampleHandlers is now nullable, but handleProcessorNodeClose/handleProcessorNodeProcess dereference it via processorNodeSampleHandlers.get(nodeId) without guarding for null. If close/process aspects can be emitted without a prior init being observed, this will NPE.

Severity: medium

Fix This in Augment

🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant