Homebrew:

```shell
brew tap RayBytes/chatmock
brew install chatmock
```

pipx:

```shell
pipx install chatmock
```

Or download a prebuilt binary from the releases page (macOS & Windows). For Docker, see DOCKER.md.
```shell
# 1. Sign in with your ChatGPT account
chatmock login

# 2. Start the server
chatmock serve
```

The server runs at http://127.0.0.1:8000 by default. Use http://127.0.0.1:8000/v1 as the base URL for OpenAI-compatible apps.
Python

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",
    api_key="anything",  # not checked
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
```

cURL

```shell
curl http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "hello"}]
  }'
```

Supported models:

- gpt-5.4
- gpt-5.4-mini
- gpt-5.2
- gpt-5.1
- gpt-5
- gpt-5.3-codex
- gpt-5.3-codex-spark
- gpt-5.2-codex
- gpt-5-codex
- gpt-5.1-codex
- gpt-5.1-codex-max
- gpt-5.1-codex-mini
- codex-mini
- Tool / function calling
- Vision / image input
- Thinking summaries (via think tags)
- Configurable thinking effort
- Fast mode for supported models
- Web search tool
- OpenAI-compatible `/v1/responses` (HTTP + WebSocket)
- Ollama-compatible endpoints
- Reasoning effort exposed as separate models (optional)
All flags go after chatmock serve. These can also be set as environment variables.
| Flag | Env var | Options | Default | Description |
|---|---|---|---|---|
| `--reasoning-effort` | `CHATGPT_LOCAL_REASONING_EFFORT` | none, minimal, low, medium, high, xhigh | medium | How hard the model thinks |
| `--reasoning-summary` | `CHATGPT_LOCAL_REASONING_SUMMARY` | auto, concise, detailed, none | auto | Thinking summary verbosity |
| `--reasoning-compat` | `CHATGPT_LOCAL_REASONING_COMPAT` | legacy, o3, think-tags | think-tags | How reasoning is returned to the client |
| `--fast-mode` | `CHATGPT_LOCAL_FAST_MODE` | true/false | false | Priority processing for supported models |
| `--enable-web-search` | `CHATGPT_LOCAL_ENABLE_WEB_SEARCH` | true/false | false | Allow the model to search the web |
| `--expose-reasoning-models` | `CHATGPT_LOCAL_EXPOSE_REASONING_MODELS` | true/false | false | List each reasoning level as its own model |
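For example, the same setting can be passed either way (a sketch based on the table above; `high` is an arbitrary choice):

```shell
# Flag form
chatmock serve --reasoning-effort high

# Equivalent environment-variable form
CHATGPT_LOCAL_REASONING_EFFORT=high chatmock serve
```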
Web search in a request

```json
{
  "model": "gpt-5.4",
  "messages": [{"role": "user", "content": "latest news on ..."}],
  "responses_tools": [{"type": "web_search"}],
  "responses_tool_choice": "auto"
}
```

Fast mode in a request
```json
{
  "model": "gpt-5.4",
  "input": "summarize this",
  "fast_mode": true
}
```

Use responsibly and at your own risk. This project is not affiliated with OpenAI.
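The non-standard fields in the request examples above can also be sent from the official `openai` Python client, which accepts an `extra_body` argument for fields outside the standard API. A minimal sketch (it only builds the request body; the commented-out call assumes a ChatMock server is running locally):

```python
import json

# ChatMock-specific fields from the web-search example above.
extra = {
    "responses_tools": [{"type": "web_search"}],
    "responses_tool_choice": "auto",
}

# The full request body ChatMock would receive. With the `openai` client:
#   client.chat.completions.create(model=..., messages=..., extra_body=extra)
payload = {
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "latest news on ..."}],
    **extra,
}
print(json.dumps(payload, indent=2))
```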