[Improvement]: Refactor process API and table runtime extension interfaces by baiyangtx · Pull Request #4097 · apache/amoro

baiyangtx · 2026-02-18T11:02:24Z

Why are the changes needed?

This MR introduces the new process extension APIs in the common module, without touching most AMS internals yet.

Brief change log

Promote ProcessFactory to a richer abstraction in amoro-common, preparing for plugin-based table process extension.
Adjust Action/IcebergActions and TableRuntime to align with the new process model.
Move ActionCoordinator into amoro-common so it can be shared as a public abstraction.
Introduce ProcessTriggerStrategy to describe trigger policies for processes.
Clean up legacy process state classes which will be replaced by the new abstractions in follow-up MRs.

Note

This commit only updates the common module and the shared ActionCoordinator API; AMS-side wiring and runtime refactors will be done in separate branches/MRs as discussed.

How was this patch tested?

Add some test cases that check the changes thoroughly including negative and positive cases if possible
Add screenshots for manual tests if appropriate
Run test locally before making a pull request

Documentation

Does this pull request introduce a new feature? (no)
If yes, how is the feature documented? (not documented)

# Context

Split from upstream PR apache#4081 to make review easier.

This MR introduces the new process extension APIs in common module, without touching most AMS internals yet.

# Changes

- Promote `ProcessFactory` to a richer abstraction in `amoro-common`, preparing for plugin-based table process extension.
- Adjust `Action`/`IcebergActions` and `TableRuntime` to align with the new process model.
- Move `ActionCoordinator` into `amoro-common` so it can be shared as a public abstraction.
- Introduce `ProcessTriggerStrategy` to describe trigger policies for processes.
- Clean up legacy process state classes which will be replaced by the new abstractions in follow-up MRs.

# Notes

- This commit only updates the common module and the shared `ActionCoordinator` API; AMS-side wiring and runtime refactors will be done in separate branches/MRs as discussed.

Co-Authored-By: Aime <aime@bytedance.com>
Change-Id: If84ada8fcae1cfb11577d56d3866db7ce0949102

majin1102

Thanks for working on this.
Left some comments

majin1102 · 2026-02-21T08:27:18Z

amoro-common/src/main/java/org/apache/amoro/process/ProcessFactory.java

   * @return target process which has not been submitted yet.
   */
-  AmoroProcess recover(TableRuntime tableRuntime, TableProcessState state);
+  TableProcess recover(TableRuntime tableRuntime, TableProcessStore store);


I think we should also use Optional here, for cases where recovery goes wrong or there are incompatibility issues.

Maybe in this case you should throw an exception?

How about defining an UnableRecoverProcess exception?

These approaches aren't mutually exclusive. We can throw exceptions or return null to skip this process when needed.

When the recover method is called, there must be a record of Process information in a managed state within the system database. From a system‑level perspective, if recovery is not possible at this point, it should be considered an abnormal situation, meaning the system needs to clean up resources and mark the Process as failed or abnormal. Therefore, I suggest explicitly indicating through an exception that this Process can no longer be recovered.
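The exception-based option discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `UnableRecoverProcessException` is the name proposed above and is not confirmed to exist in the PR, while `StoredProcess`, the `"optimizing"` action check, and `recoverOrMarkFailed` are hypothetical stand-ins for `TableProcessStore` and the scheduler-side handling.

```java
class RecoverSketch {

  /** Hypothetical exception, named after the suggestion in this thread. */
  static final class UnableRecoverProcessException extends RuntimeException {
    UnableRecoverProcessException(String message) {
      super(message);
    }
  }

  /** Simplified stand-in for the persisted record (TableProcessStore in the PR). */
  static final class StoredProcess {
    final long processId;
    final String action;

    StoredProcess(long processId, String action) {
      this.processId = processId;
      this.action = action;
    }
  }

  /** Recovers a process description, or throws when the record cannot be handled. */
  static String recover(StoredProcess store) {
    // Assumption: this factory only understands one action type.
    if (!"optimizing".equals(store.action)) {
      throw new UnableRecoverProcessException(
          "Cannot recover process " + store.processId + ": unsupported action " + store.action);
    }
    return "process-" + store.processId;
  }

  /** Caller side: an unrecoverable process is marked failed instead of being skipped. */
  static String recoverOrMarkFailed(StoredProcess store) {
    try {
      return "RECOVERED:" + recover(store);
    } catch (UnableRecoverProcessException e) {
      // A real scheduler would clean up resources and persist the failed status here.
      return "FAILED:" + store.processId;
    }
  }

  public static void main(String[] args) {
    System.out.println(recoverOrMarkFailed(new StoredProcess(1L, "optimizing")));
    System.out.println(recoverOrMarkFailed(new StoredProcess(2L, "unknown")));
  }
}
```

Compared with returning `Optional` or `null`, the exception carries the reason for the failure to the caller, which matches the point above that an unrecoverable managed process is an abnormal state that needs explicit cleanup.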

majin1102 · 2026-02-21T08:29:48Z

amoro-common/src/main/java/org/apache/amoro/process/ProcessTriggerStrategy.java

+import java.time.Duration;
+
+/** Process trigger strategy. */
+public final class ProcessTriggerStrategy {


Why do we need this?

I originally thought we should always trigger a process on new snapshots, on config changes, and on a time basis. To me, this should not be extensible.

At least, the factory should tell the scheduler the trigger interval.

I can provide a default implementation.
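One way to read "the factory should tell the scheduler the trigger interval" together with "a default implementation" is sketched below. The field name `triggerInterval`, the 30-second default, and the `Factory` interface are assumptions made for illustration; the actual members of `ProcessTriggerStrategy` are not shown in this thread.

```java
import java.time.Duration;

class TriggerStrategySketch {

  /** Hypothetical, simplified stand-in for ProcessTriggerStrategy; real fields may differ. */
  static final class Strategy {
    /** Assumed default; the value is illustrative only. */
    static final Strategy DEFAULT = new Strategy(Duration.ofSeconds(30));

    private final Duration triggerInterval;

    Strategy(Duration triggerInterval) {
      if (triggerInterval.isNegative() || triggerInterval.isZero()) {
        throw new IllegalArgumentException("triggerInterval must be positive");
      }
      this.triggerInterval = triggerInterval;
    }

    Duration triggerInterval() {
      return triggerInterval;
    }
  }

  /** Hypothetical factory contract: implementations inherit the default unless they override. */
  interface Factory {
    default Strategy triggerStrategy() {
      return Strategy.DEFAULT;
    }
  }

  public static void main(String[] args) {
    Factory defaults = new Factory() {};
    Factory custom =
        new Factory() {
          @Override
          public Strategy triggerStrategy() {
            return new Strategy(Duration.ofMinutes(5));
          }
        };
    System.out.println(defaults.triggerStrategy().triggerInterval());
    System.out.println(custom.triggerStrategy().triggerInterval());
  }
}
```

With a default method, a plugin that does not care about scheduling details implements nothing extra, while one that needs a different cadence overrides a single method.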

majin1102 · 2026-02-21T08:32:51Z

amoro-common/src/main/java/org/apache/amoro/table/TableRuntimeFactory.java

 /** Table runtime factory. */
 public interface TableRuntimeFactory extends ActivePlugin {

+  List<ActionCoordinator> supportedCoordinators();


So this extension should be aware of the implementation of Coordinator?

This feels a little weird for the Lance scenario. Can we provide an abstract TableRuntime if the coordinator is really necessary here? (Seriously, I think this should be a legacy issue?)

You do not need to implement this plugin.

The default TableRuntimeFactory will create all coordinators automatically; the only plugin you need to provide is ProcessFactory.

You mean I don't need to provide a LanceTableRuntime?

How can I return a LanceTableConfig? Could you provide a simple demo?

You can call TableRuntime.getTableConfig(): Map<String, String>
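As a rough demo of the suggestion above, a plugin could build its own typed configuration from the `Map<String, String>` returned by `TableRuntime.getTableConfig()`. Everything below other than that map type is an assumption: `LanceConfig`, its fields, and the property keys are hypothetical and only show the parsing pattern.

```java
import java.util.HashMap;
import java.util.Map;

class TableConfigSketch {

  /** Hypothetical typed view over the raw table properties; key names are illustrative. */
  static final class LanceConfig {
    final boolean compactionEnabled;
    final long targetFileSizeBytes;

    private LanceConfig(boolean compactionEnabled, long targetFileSizeBytes) {
      this.compactionEnabled = compactionEnabled;
      this.targetFileSizeBytes = targetFileSizeBytes;
    }

    /** Parses the raw map, falling back to assumed defaults for missing keys. */
    static LanceConfig from(Map<String, String> props) {
      boolean enabled =
          Boolean.parseBoolean(props.getOrDefault("lance.compaction.enabled", "true"));
      long size =
          Long.parseLong(props.getOrDefault("lance.compaction.target-size-bytes", "134217728"));
      return new LanceConfig(enabled, size);
    }
  }

  public static void main(String[] args) {
    // Stand-in for the map a process factory would obtain from TableRuntime.getTableConfig().
    Map<String, String> tableConfig = new HashMap<>();
    tableConfig.put("lance.compaction.enabled", "false");

    LanceConfig config = LanceConfig.from(tableConfig);
    System.out.println(config.compactionEnabled + " " + config.targetFileSizeBytes);
  }
}
```

This keeps the format-specific config type inside the plugin, so no custom TableRuntime implementation is needed just to expose it.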

majin1102 · 2026-02-21T08:35:44Z

amoro-common/src/main/java/org/apache/amoro/TableRuntime.java

   */
  List<? extends TableProcessStore> getProcessStates(Action action);

+  void registerProcess(TableProcessStore processStore);


I think we should always use the table runtime to trigger a process instead of registering a process into the table runtime.

If not, I don't see the point of putting factories into runtimes.

This method is used by the process scheduler.

You do not need to care about this method. Also, you no longer need to provide an implementation of TableRuntime.

getProcessStates -> getProcessStore or getProcess

majin1102 · 2026-02-21T08:38:16Z

amoro-common/src/main/java/org/apache/amoro/Action.java

-   * the weight number of this action, the bigger the weight number, the higher positions of
-   * schedulers or front pages
-   */
-  private final int weight;


The weight is not used so far.
But it was designed to decide which action should be prioritized when multiple actions are scheduled in one resource group.

I was considering using a priority here with a default value. WDYT?
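The priority idea could be sketched as follows. This is not the `Action` class from the PR: `PrioritizedAction`, `DEFAULT_PRIORITY`, and `scheduleOrder` are hypothetical names, and "higher value runs first" is an assumed convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ActionPrioritySketch {

  /** Hypothetical action with an optional priority; not the Action class from the PR. */
  static final class PrioritizedAction {
    static final int DEFAULT_PRIORITY = 0;

    final String name;
    final int priority;

    PrioritizedAction(String name) {
      this(name, DEFAULT_PRIORITY);
    }

    PrioritizedAction(String name, int priority) {
      this.name = name;
      this.priority = priority;
    }
  }

  /** Returns a copy ordered by descending priority; ties keep their original order. */
  static List<PrioritizedAction> scheduleOrder(List<PrioritizedAction> actions) {
    List<PrioritizedAction> sorted = new ArrayList<>(actions);
    sorted.sort(Comparator.comparingInt((PrioritizedAction a) -> a.priority).reversed());
    return sorted;
  }

  public static void main(String[] args) {
    List<PrioritizedAction> actions =
        Arrays.asList(
            new PrioritizedAction("expire-snapshots"),
            new PrioritizedAction("optimizing", 10),
            new PrioritizedAction("clean-orphan-files"));
    for (PrioritizedAction action : scheduleOrder(actions)) {
      System.out.println(action.name + " (priority " + action.priority + ")");
    }
  }
}
```

A default value means existing actions need no changes, and only the actions that must run first in a shared resource group declare an explicit priority.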

majin1102

LGTM

github-actions bot added the module:common label Feb 18, 2026

baiyangtx and others added 2 commits February 20, 2026 10:13

Merge branch 'master' into process-factory/new-process-api

f1f8ecf

fix compile error

50f1e7e

github-actions bot added the module:ams-server Ams server module label Feb 20, 2026

baiyangtx marked this pull request as ready for review February 20, 2026 08:45

baiyangtx requested a review from majin1102 February 20, 2026 08:48

zhangyongxiang.alpha added 2 commits February 20, 2026 21:20

fix compile error

26812d8

fix compile error3

638c0bf

majin1102 reviewed Feb 21, 2026

zhangyongxiang.alpha added 2 commits February 25, 2026 10:39

getTableConfig

9490191

refactor

d665c55

majin1102 approved these changes Feb 25, 2026

majin1102 merged commit 743e6f1 into apache:master Feb 25, 2026
6 checks passed

baiyangtx deleted the process-factory/new-process-api branch February 25, 2026 12:27

baiyangtx mentioned this pull request Feb 26, 2026

[Improvement]: Load process factories via DefaultTableRuntimeFactory #4100

Open
