
feat: add ReasoningItem output type for Responses API #324

Open

robinnarsinghranabhat wants to merge 2 commits into llamastack:main from robinnarsinghranabhat:feat/add-reasoning-item-output-type

Conversation

robinnarsinghranabhat commented Mar 25, 2026

Summary

The LlamaStack client's Output union was missing a ReasoningItem variant, causing type: "reasoning" output items from the Responses API to be deserialized as the generic OutputOpenAIResponseMessageOutput instead of a dedicated type. This adds OutputOpenAIResponseReasoningItem to both non-streaming and streaming response types.

Problem

When a server returns reasoning output (from reasoning-capable models), the response includes items with type: "reasoning". The client's discriminated Output union had no variant for this type, so Pydantic fell back to OutputOpenAIResponseMessageOutput.

Why it appears to "work" but is actually broken

The client's BaseModel is configured with extra: 'allow', so Pydantic silently accepts unknown fields like summary and encrypted_content as untyped extras on the wrong class:

resp.output[0]
# OutputOpenAIResponseMessageOutput(
#   content=None,          <-- None! reasoning data is NOT here
#   role=None,             <-- None! not a message
#   id='rs_resp_874653',
#   status=None,
#   type='reasoning',      <-- correct type string, wrong Python class
#   summary=[{'type': 'summary_text', 'text': '...'}],   <-- untyped extra field (raw dict)
#   encrypted_content='...'                                <-- untyped extra field
# )

This means:

  • content is None, so resp.output[0].content[0].text raises TypeError: 'NoneType' object is not subscriptable
  • summary exists but as raw dicts — no .text attribute, no type validation, no IDE autocompletion
  • role is None — the class expects it as a required Literal["system", "developer", "user", "assistant"] but Pydantic's extra: 'allow' lets it slide
  • No type safety — code checking isinstance(item, ReasoningItem) would fail
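The failure mode above can be reproduced in isolation with vanilla Pydantic. This is an illustrative sketch, not the client's actual code: the model names (MessageOutput, ReasoningItem, SummaryText) are hypothetical stand-ins, and the lenient `type: Optional[str]` on the fallback member approximates the client's permissive union matching. It shows how extra="allow" lets an unrecognized type="reasoning" payload "validate" into the wrong class, and how adding the variant to the union fixes typed access:

```python
from typing import List, Literal, Optional, Union
from pydantic import BaseModel, ConfigDict, TypeAdapter

class MessageOutput(BaseModel):
    # Lenient fallback member: accepts any "type" string, keeps
    # unknown fields (like "summary") as untyped extras.
    model_config = ConfigDict(extra="allow")
    type: Optional[str] = None
    content: Optional[str] = None
    role: Optional[str] = None

class SummaryText(BaseModel):
    type: Literal["summary_text"]
    text: str

class ReasoningItem(BaseModel):
    model_config = ConfigDict(extra="allow")
    type: Literal["reasoning"]
    summary: List[SummaryText] = []

payload = {
    "type": "reasoning",
    "summary": [{"type": "summary_text", "text": "2 + 2 = 4"}],
}

# Before the fix: no ReasoningItem in the union, so the payload lands
# in the fallback class with reasoning fields as raw, untyped extras.
before = TypeAdapter(MessageOutput).validate_python(payload)
print(type(before).__name__)   # MessageOutput
print(before.content)          # None
print(before.summary[0])       # raw dict, no .text attribute

# After the fix: with ReasoningItem in the union, the matching variant
# wins and summary entries become typed objects.
after = TypeAdapter(Union[ReasoningItem, MessageOutput]).validate_python(payload)
print(type(after).__name__)    # ReasoningItem
print(after.summary[0].text)   # typed access works
```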

Setup to reproduce

# Pull a reasoning-capable model
ollama pull gpt-oss:20b

# Install clients
pip install llama-stack-client openai

Before (broken) — non-streaming

Using OpenAI client as reference (correct behavior):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
MODEL = "gpt-oss:20b"

resp = client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=False,
)

for item in resp.output:
    print(f"type={item.type!r}, python_type={type(item).__name__}")
# type='reasoning', python_type=ResponseReasoningItem   <-- correct
# type='message',   python_type=ResponseOutputMessage    <-- correct

Same call with LlamaStack client (broken):

from llama_stack_client import LlamaStackClient

ls_client = LlamaStackClient(base_url="http://localhost:11434/")
MODEL = "gpt-oss:20b"

resp = ls_client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=False,
)

for item in resp.output:
    print(f"type={item.type!r}, python_type={type(item).__name__}")
# type='reasoning', python_type=OutputOpenAIResponseMessageOutput   <-- WRONG
# type='message',   python_type=OutputOpenAIResponseMessageOutput

resp.output[0].content[0].text
# TypeError: 'NoneType' object is not subscriptable

After (fixed) — non-streaming

from llama_stack_client import LlamaStackClient

ls_client = LlamaStackClient(base_url="http://localhost:11434/")
MODEL = "gpt-oss:20b"

resp = ls_client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=False,
)

for item in resp.output:
    print(f"type={item.type!r}, python_type={type(item).__name__}")
# type='reasoning', python_type=OutputOpenAIResponseReasoningItem   <-- correct!
# type='message',   python_type=OutputOpenAIResponseMessageOutput

# Reasoning fields properly typed and accessible:
print(resp.output[0].summary[0].type)   # 'summary_text'  <-- typed object, not raw dict

import json
print(json.dumps(resp.output[0].model_dump(), indent=2))
# {
#   "id": "rs_resp_645183",
#   "summary": [
#     {
#       "text": "The user asks \"What is 2 + 2?\" ...",
#       "type": "summary_text"
#     }
#   ],
#   "content": null,
#   "encrypted_content": "The user asks \"What is 2 + 2?\" ...",
#   "status": null,
#   "type": "reasoning"
# }

After (fixed) — streaming

events = list(ls_client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=True,
))

for e in events:
    etype = getattr(e, 'type', None)
    item = getattr(e, 'item', None)
    if item and hasattr(item, 'type'):
        print(f"  {etype:<45} item.type={item.type}")
    else:
        print(f"  {etype}")

# Output:
#   response.created
#   response.in_progress
#   response.output_item.added
#   response.reasoning_summary_text.delta
#   ...
#   response.reasoning_summary_text.done
#   response.output_item.done
#   response.output_item.added
#   response.content_part.added
#   response.output_text.delta
#   ...
#   response.output_text.done
#   response.content_part.done
#   response.output_item.done
#   response.completed

# Final output correctly typed:
final = events[-1].response.output
print(type(final[0]).__name__)  # OutputOpenAIResponseReasoningItem
print(type(final[1]).__name__)  # OutputOpenAIResponseMessageOutput

Changes

  • src/llama_stack_client/types/response_object.py: added OutputOpenAIResponseReasoningItem, OutputOpenAIResponseReasoningItemContent, and OutputOpenAIResponseReasoningItemSummary, and added the new item type to the Output discriminated union
  • src/llama_stack_client/types/response_object_stream.py: added matching reasoning types to the OutputItemAdded and OutputItemDone item unions
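For orientation, the shape of the added type can be inferred from the model_dump() output shown earlier. This is a rough sketch only: the field names match the dump, but the exact annotations in response_object.py (in particular the content element type) may differ:

```python
from typing import List, Literal, Optional
from pydantic import BaseModel

class OutputOpenAIResponseReasoningItemSummary(BaseModel):
    text: str
    type: Literal["summary_text"]

class OutputOpenAIResponseReasoningItemContent(BaseModel):
    # Element type is an assumption; the PR's actual annotation may differ.
    text: str
    type: str

class OutputOpenAIResponseReasoningItem(BaseModel):
    id: str
    summary: List[OutputOpenAIResponseReasoningItemSummary]
    type: Literal["reasoning"]
    content: Optional[List[OutputOpenAIResponseReasoningItemContent]] = None
    encrypted_content: Optional[str] = None
    status: Optional[str] = None

# Round-trip the payload from the dump above through the sketched model.
item = OutputOpenAIResponseReasoningItem(
    id="rs_resp_645183",
    summary=[{"text": "The user asks ...", "type": "summary_text"}],
    type="reasoning",
)
print(item.summary[0].type)   # summary_text
print(item.content)           # None
```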

Test plan

  • type: "reasoning" correctly deserializes into OutputOpenAIResponseReasoningItem in both streaming and non-streaming modes
  • Tested with OpenAI client against Ollama (gpt-oss:20b) as ground truth
  • Non-reasoning responses unaffected

meta-cla bot added the cla signed label Mar 25, 2026
The Output discriminated union was missing a ReasoningItem variant,
causing type="reasoning" output items from the Responses API to
fall back to OutputOpenAIResponseMessageOutput with Pydantic warnings
and broken content access.

Adds OutputOpenAIResponseReasoningItem (with Content and Summary
subtypes) to both non-streaming and streaming response types.
robinnarsinghranabhat force-pushed the feat/add-reasoning-item-output-type branch from 81922da to dcd21d9 on March 25, 2026 at 21:04