
perf: optimize model runner memory and latency #994

Open

patricklundquist wants to merge 11 commits into master from bolt-runner-perf-optimizations-240856538465656847
Conversation

@patricklundquist (Contributor) commented Mar 18, 2026

  1. Fix a massive memory leak / blocking bug in `runner_item_stream` by using `next()` and `itertools.chain()` instead of `list()`.
  2. Eliminate list allocations for status checks.
  3. Mutate status messages in place instead of allocating new ones.
  4. Optimize `_convert_input_protos_to_python` in `ModelClass`.

Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

Optimizes the model runner hot paths to reduce allocations and avoid materializing streaming iterators into memory, improving latency and preventing high memory usage during streaming.

Changes:

  • Replaces per-output successes list construction with counter-based aggregation in predict/generate/stream paths.
  • Updates streaming setup to peek the first request via next() and rebuild the iterator with itertools.chain() instead of list().
  • Mutates resp.status fields directly instead of allocating a new Status proto.
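The peek-then-rebuild pattern described above can be sketched as follows (a minimal illustration with hypothetical names; the actual runner code differs):

```python
import itertools

def peek_first(runner_items):
    """Pull the first item from a possibly-unbounded stream without
    buffering the whole thing, then rebuild an equivalent iterator."""
    it = iter(runner_items)  # accept any iterable, not just iterators
    try:
        first = next(it)
    except StopIteration:
        return None, iter(())  # empty stream
    # Re-attach the peeked item in front of the untouched remainder.
    return first, itertools.chain((first,), it)

def infinite_items():
    """Stand-in for a gRPC request stream that never terminates."""
    n = 0
    while True:
        yield n
        n += 1

# With list() this call would block forever and grow memory without bound;
# with next() + chain() it returns immediately.
first, stream = peek_first(infinite_items())
```

The key difference from `list(runner_items)` is that only one item is ever pulled eagerly, so an unbounded (or merely large) stream neither blocks the peek nor gets materialized in memory.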

@patricklundquist (Contributor, Author):

@copilot open a new pull request to apply changes based on the comments in this thread


Copilot AI commented Mar 19, 2026

@patricklundquist I've opened a new pull request, #995, to work on those changes. Once the pull request is ready, I'll request review from you.

…ions (#995)

* Initial plan

* fix: address review comments on model runner optimizations

- Call resp.status.Clear() before mutating code/description to prevent stale fields
- Wrap runner_item_iterator with iter() in runner_item_stream to accept any iterable
- Use tuple (first_runner_item,) instead of list [first_runner_item] in itertools.chain
- Fix empty outputs to be treated as SUCCESS (aligned across predict/generate/stream)
- Align FAILURE status code consistently across all three paths
- Add unit tests covering all edge cases
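The "Clear() before mutating" fix above guards against stale fields surviving an in-place update. A self-contained sketch (using a plain-Python stand-in class rather than the real Status protobuf, purely for illustration):

```python
class Status:
    """Stand-in for the Status protobuf message (illustration only)."""
    def __init__(self):
        self.code = 0
        self.description = ""
        self.details = ""

    def Clear(self):
        # Protobuf messages expose Clear() to reset all fields.
        self.__init__()

def set_status_in_place(status, code, description):
    # Clear first so fields set on a previous use (e.g. a stale
    # `details` string) do not leak into the new status.
    status.Clear()
    status.code = code
    status.description = description

s = Status()
s.details = "left over from an earlier response"
set_status_in_place(s, 10000, "Ok")
# s.details is now "" rather than the stale leftover
```

Without the `Clear()`, reusing a message object would keep whatever fields the previous update happened not to overwrite, which is exactly the bug the review comment flagged.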

Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

This PR optimizes the ModelRunner hot paths by reducing allocations during status aggregation and by fixing runner_item_stream to avoid materializing entire request streams into memory.

Changes:

  • Replaced per-output boolean list allocations with success/total counters in runner_item_predict, runner_item_generate, and runner_item_stream, and now clears/mutates resp.status in place.
  • Updated runner_item_stream to use next() + itertools.chain() to safely peek the first item without consuming the entire iterable into a list.
  • Added unit tests covering status aggregation behavior and iterable handling for streaming.
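The counter-based aggregation in the first bullet replaces a per-response boolean list with two integers. A hedged sketch (the `SUCCESS` value and the return labels are stand-ins, not the actual `status_code_pb2` constants):

```python
# Before: allocated a throwaway list per response just to count successes.
#   successes = [out.status.code == SUCCESS for out in outputs]

SUCCESS = 10000  # stand-in for status_code_pb2.SUCCESS (hypothetical value)

def aggregate_status(output_codes):
    """Single pass, no intermediate list: count instead of collect."""
    total = 0
    succeeded = 0
    for code in output_codes:
        total += 1
        if code == SUCCESS:
            succeeded += 1
    if total == 0 or succeeded == total:
        return "SUCCESS"   # empty outputs treated as success, per the fix
    if succeeded == 0:
        return "FAILURE"
    return "MIXED_STATUS"
```

Besides dropping the allocation, the single pass keeps the empty-outputs and all-failed edge cases explicit, which is what the added unit tests exercise.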

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

clarifai/runners/models/model_runner.py: Improves latency/memory by avoiding intermediate lists and by streaming safely without buffering the entire iterator.
tests/runners/test_model_runner_unit.py: Adds unit coverage for status aggregation semantics and streaming iterator/iterable handling.

google-labs-jules bot and others added 4 commits March 19, 2026 09:49
Pre-computed cast type lookup dictionaries in `_MethodInfo` and removed
dynamic reflection per-input (e.g. `hasattr`, `__args__`, `__origin__`)
inside `_convert_input_protos_to_python`.

This prevents a massive O(N) overhead during streaming and batching
requests where every item was previously type-checked dynamically.

Also cached `_get_method_infos()` lookups in wrappers.

Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@patricklundquist (Contributor, Author):

@copilot update test assertions per:

=========================== short test summary info ===========================
FAILED tests/runners/test_model_runner_unit.py::TestRunnerItemStreamStatus::test_all_failed_outputs - assert 25608 == 10020

  • where 25608 = code: RUNNER_PROCESSING_FAILED\ndescription: "Failed"\n.code
  • where code: RUNNER_PROCESSING_FAILED\ndescription: "Failed"\n = status {\n code: RUNNER_PROCESSING_FAILED\n description: "Failed"\n}\noutputs {\n status {\n code: FAILURE\n }\n}\n.status
  •  where status {\n  code: RUNNER_PROCESSING_FAILED\n  description: "Failed"\n}\noutputs {\n  status {\n    code: FAILURE\n  }\n}\n = multi_output_response {\n  status {\n    code: RUNNER_PROCESSING_FAILED\n    description: "Failed"\n  }\n  outputs {\n    status {\n      code: FAILURE\n    }\n  }\n}\n.multi_output_response
    
  • and 10020 = status_code_pb2.FAILURE


Copilot AI commented Mar 19, 2026

@patricklundquist I've opened a new pull request, #997, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits March 19, 2026 14:44
…tatus code (#997)

* Initial plan

* fix: update test assertion to use RUNNER_PROCESSING_FAILED for all-failed stream outputs

Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>
Copilot AI left a comment

Pull request overview

Optimizes the Clarifai runner hot path to reduce memory usage and allocations during status aggregation and streaming, and refactors input-proto casting in ModelClass to avoid repeated work.

Changes:

  • Optimize ModelRunner status aggregation to avoid per-output list allocations and clear stale status fields in-place.
  • Fix runner_item_stream iterator handling by peeking with next() and rebuilding the stream via itertools.chain() instead of materializing the full iterator.
  • Add unit tests covering status aggregation behavior and iterable-vs-iterator handling for streaming.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

clarifai/runners/models/model_runner.py: Reduces allocations in status aggregation; updates stream handling to avoid list materialization.
clarifai/runners/models/model_class.py: Caches method registry lookups and precomputes cast targets to streamline input conversion.
tests/runners/test_model_runner_unit.py: Adds unit tests for status aggregation and iterator/iterable behavior in streaming.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@patricklundquist (Contributor, Author):

@copilot resolve merge conflicts to maintain functionality and best performance.


Copilot AI commented Mar 24, 2026

@patricklundquist I've opened a new pull request, #1000, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 2 commits March 24, 2026 10:28
…ast-path and counter-based aggregation (#1000)

* Initial plan

* Merge origin/master: resolve conflict in runner_item_generate by combining fast-path single-output optimization with counter-based multi-output aggregation

Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>
Agent-Logs-Url: https://github.com/Clarifai/clarifai-python/sessions/f0c05276-4c5a-4a7b-a438-99f3746bef0a

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: patricklundquist <1460278+patricklundquist@users.noreply.github.com>
@github-actions

Code Coverage

Package Line Rate
clarifai 45%
clarifai.cli 62%
clarifai.cli.templates 67%
clarifai.cli.templates.toolkits 100%
clarifai.client 65%
clarifai.client.auth 67%
clarifai.constants 100%
clarifai.datasets 100%
clarifai.datasets.export 69%
clarifai.datasets.upload 75%
clarifai.datasets.upload.loaders 37%
clarifai.models 100%
clarifai.rag 0%
clarifai.runners 52%
clarifai.runners.models 60%
clarifai.runners.pipeline_steps 39%
clarifai.runners.pipelines 72%
clarifai.runners.utils 62%
clarifai.runners.utils.data_types 72%
clarifai.schema 100%
clarifai.urls 58%
clarifai.utils 65%
clarifai.utils.evaluation 16%
clarifai.workflows 95%
Summary 60% (11642 / 19284)

Minimum allowed line rate is 50%

@patricklundquist patricklundquist marked this pull request as ready for review March 24, 2026 22:36
@patricklundquist patricklundquist requested a review from zeiler March 25, 2026 16:47