
feat: add ReasoningItem output type for Responses API #324

Open

robinnarsinghranabhat wants to merge 2 commits into llamastack:main from robinnarsinghranabhat:feat/add-reasoning-item-output-type

Conversation

robinnarsinghranabhat commented Mar 25, 2026

Summary

The LlamaStack client's Output union was missing a ReasoningItem variant, causing type: "reasoning" output items from the Responses API to be deserialized as the generic OutputOpenAIResponseMessageOutput instead of a dedicated type. This adds OutputOpenAIResponseReasoningItem to both non-streaming and streaming response types.

Problem

When a server returns reasoning output (from reasoning-capable models), the response includes items with type: "reasoning". The client's discriminated Output union had no variant for this type, so Pydantic fell back to OutputOpenAIResponseMessageOutput.

Why it appears to "work" but is actually broken

The client's BaseModel is configured with extra: 'allow', so Pydantic silently accepts unknown fields like summary and encrypted_content as untyped extras on the wrong class:

resp.output[0]
# OutputOpenAIResponseMessageOutput(
#   content=None,          <-- None! reasoning data is NOT here
#   role=None,             <-- None! not a message
#   id='rs_resp_874653',
#   status=None,
#   type='reasoning',      <-- correct type string, wrong Python class
#   summary=[{'type': 'summary_text', 'text': '...'}],   <-- untyped extra field (raw dict)
#   encrypted_content='...'                                <-- untyped extra field
# )

This means:

  • content is None, so resp.output[0].content[0].text raises TypeError: 'NoneType' object is not subscriptable
  • summary exists but as raw dicts — no .text attribute, no type validation, no IDE autocompletion
  • role is None — the class expects it as a required Literal["system", "developer", "user", "assistant"] but Pydantic's extra: 'allow' lets it slide
  • No type safety — code checking isinstance(item, ReasoningItem) would fail
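The failure mode above can be reproduced in isolation with vanilla Pydantic. This is an illustrative sketch, not the client's actual code: the model names (MessageOutput, ReasoningItem, SummaryText) are hypothetical stand-ins, and the lenient `type: Optional[str]` on the fallback member approximates the client's permissive union matching. It shows how extra="allow" lets an unrecognized type="reasoning" payload "validate" into the wrong class, and how adding the variant to the union fixes typed access:

```python
from typing import List, Literal, Optional, Union
from pydantic import BaseModel, ConfigDict, TypeAdapter

class MessageOutput(BaseModel):
    # Lenient fallback member: accepts any "type" string, keeps
    # unknown fields (like "summary") as untyped extras.
    model_config = ConfigDict(extra="allow")
    type: Optional[str] = None
    content: Optional[str] = None
    role: Optional[str] = None

class SummaryText(BaseModel):
    type: Literal["summary_text"]
    text: str

class ReasoningItem(BaseModel):
    model_config = ConfigDict(extra="allow")
    type: Literal["reasoning"]
    summary: List[SummaryText] = []

payload = {
    "type": "reasoning",
    "summary": [{"type": "summary_text", "text": "2 + 2 = 4"}],
}

# Before the fix: no ReasoningItem in the union, so the payload lands
# in the fallback class with reasoning fields as raw, untyped extras.
before = TypeAdapter(MessageOutput).validate_python(payload)
print(type(before).__name__)   # MessageOutput
print(before.content)          # None
print(before.summary[0])       # raw dict, no .text attribute

# After the fix: with ReasoningItem in the union, the matching variant
# wins and summary entries become typed objects.
after = TypeAdapter(Union[ReasoningItem, MessageOutput]).validate_python(payload)
print(type(after).__name__)    # ReasoningItem
print(after.summary[0].text)   # typed access works
```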

Setup to reproduce

# Pull a reasoning-capable model
ollama pull gpt-oss:20b

# Install clients
pip install llama-stack-client openai

Before (broken) — non-streaming

Using OpenAI client as reference (correct behavior):

from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
MODEL = "gpt-oss:20b"

resp = client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=False,
)

for item in resp.output:
    print(f"type={item.type!r}, python_type={type(item).__name__}")
# type='reasoning', python_type=ResponseReasoningItem   <-- correct
# type='message',   python_type=ResponseOutputMessage    <-- correct

Same call with LlamaStack client (broken):

from llama_stack_client import LlamaStackClient

ls_client = LlamaStackClient(base_url="http://localhost:11434/")
MODEL = "gpt-oss:20b"

resp = ls_client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=False,
)

for item in resp.output:
    print(f"type={item.type!r}, python_type={type(item).__name__}")
# type='reasoning', python_type=OutputOpenAIResponseMessageOutput   <-- WRONG
# type='message',   python_type=OutputOpenAIResponseMessageOutput

resp.output[0].content[0].text
# TypeError: 'NoneType' object is not subscriptable

After (fixed) — non-streaming

from llama_stack_client import LlamaStackClient

ls_client = LlamaStackClient(base_url="http://localhost:11434/")
MODEL = "gpt-oss:20b"

resp = ls_client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=False,
)

for item in resp.output:
    print(f"type={item.type!r}, python_type={type(item).__name__}")
# type='reasoning', python_type=OutputOpenAIResponseReasoningItem   <-- correct!
# type='message',   python_type=OutputOpenAIResponseMessageOutput

# Reasoning fields properly typed and accessible:
print(resp.output[0].summary[0].type)   # 'summary_text'  <-- typed object, not raw dict

import json
print(json.dumps(resp.output[0].model_dump(), indent=2))
# {
#   "id": "rs_resp_645183",
#   "summary": [
#     {
#       "text": "The user asks \"What is 2 + 2?\" ...",
#       "type": "summary_text"
#     }
#   ],
#   "content": null,
#   "encrypted_content": "The user asks \"What is 2 + 2?\" ...",
#   "status": null,
#   "type": "reasoning"
# }

After (fixed) — streaming

events = list(ls_client.responses.create(
    model=MODEL,
    input="What is 2 + 2? Think step by step.",
    reasoning={"effort": "low"},
    stream=True,
))

for e in events:
    etype = getattr(e, 'type', None)
    item = getattr(e, 'item', None)
    if item and hasattr(item, 'type'):
        print(f"  {etype:<45} item.type={item.type}")
    else:
        print(f"  {etype}")

# Output:
#   response.created
#   response.in_progress
#   response.output_item.added
#   response.reasoning_summary_text.delta
#   ...
#   response.reasoning_summary_text.done
#   response.output_item.done
#   response.output_item.added
#   response.content_part.added
#   response.output_text.delta
#   ...
#   response.output_text.done
#   response.content_part.done
#   response.output_item.done
#   response.completed

# Final output correctly typed:
final = events[-1].response.output
print(type(final[0]).__name__)  # OutputOpenAIResponseReasoningItem
print(type(final[1]).__name__)  # OutputOpenAIResponseMessageOutput

Changes

  • src/llama_stack_client/types/response_object.py: added OutputOpenAIResponseReasoningItem, OutputOpenAIResponseReasoningItemContent, and OutputOpenAIResponseReasoningItemSummary, and added the new item type to the Output discriminated union
  • src/llama_stack_client/types/response_object_stream.py: added matching reasoning types to the OutputItemAdded and OutputItemDone item unions
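For orientation, the shape of the added type can be inferred from the model_dump() output shown earlier. This is a rough sketch only: the field names match the dump, but the exact annotations in response_object.py (in particular the content element type) may differ:

```python
from typing import List, Literal, Optional
from pydantic import BaseModel

class OutputOpenAIResponseReasoningItemSummary(BaseModel):
    text: str
    type: Literal["summary_text"]

class OutputOpenAIResponseReasoningItemContent(BaseModel):
    # Element type is an assumption; the PR's actual annotation may differ.
    text: str
    type: str

class OutputOpenAIResponseReasoningItem(BaseModel):
    id: str
    summary: List[OutputOpenAIResponseReasoningItemSummary]
    type: Literal["reasoning"]
    content: Optional[List[OutputOpenAIResponseReasoningItemContent]] = None
    encrypted_content: Optional[str] = None
    status: Optional[str] = None

# Round-trip the payload from the dump above through the sketched model.
item = OutputOpenAIResponseReasoningItem(
    id="rs_resp_645183",
    summary=[{"text": "The user asks ...", "type": "summary_text"}],
    type="reasoning",
)
print(item.summary[0].type)   # summary_text
print(item.content)           # None
```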

Test plan

  • type: "reasoning" correctly deserializes into OutputOpenAIResponseReasoningItem in both streaming and non-streaming modes
  • Tested with OpenAI client against Ollama (gpt-oss:20b) as ground truth
  • Non-reasoning responses unaffected

meta-cla bot added the cla signed label Mar 25, 2026
The Output discriminated union was missing a ReasoningItem variant,
causing type="reasoning" output items from the Responses API to
fall back to OutputOpenAIResponseMessageOutput with Pydantic warnings
and broken content access.

Adds OutputOpenAIResponseReasoningItem (with Content and Summary
subtypes) to both non-streaming and streaming response types.
robinnarsinghranabhat force-pushed the feat/add-reasoning-item-output-type branch from 81922da to dcd21d9 on March 25, 2026 at 21:04