One local endpoint. Every model you have access to. Any API format.
opencode-llm-proxy is an OpenCode plugin that starts a local HTTP server on http://127.0.0.1:4010. It translates between the API format your tool speaks and whichever LLM provider OpenCode has configured — so you never reconfigure the same models twice.
```
Your tool (OpenAI / Anthropic / Gemini SDK)
        │
        ▼  http://127.0.0.1:4010
opencode-llm-proxy
        │
        ▼  OpenCode SDK
GitHub Copilot · Anthropic · Gemini · Ollama · OpenRouter · Bedrock · …
```
Supported API formats — all with streaming:
| Format | Endpoint |
|---|---|
| OpenAI Chat Completions | POST /v1/chat/completions |
| OpenAI Responses API | POST /v1/responses |
| Anthropic Messages API | POST /v1/messages |
| Google Gemini | POST /v1beta/models/:model:generateContent |
Most LLM tools speak exactly one API dialect. OpenCode already manages connections to every provider you use. This proxy bridges the two — your tools keep working as-is, and you change which model they use in one place.
Common situations it solves:
- You have a GitHub Copilot subscription. Open WebUI, Chatbox, or a VS Code extension only accepts an OpenAI-compatible URL. Point them at the proxy — done.
- You run Ollama locally. Your Python scripts use the OpenAI SDK. Set `base_url` to the proxy and use your Ollama model IDs directly.
- You want to swap models without code changes. Your app talks to the proxy; you change the model in OpenCode config.
- You want to share your models on a LAN. Expose the proxy on `0.0.0.0` and give teammates the URL.
- You use the Anthropic SDK but want to route through GitHub Copilot or Bedrock. No code change in the SDK: just point it at the proxy.
```bash
npm install opencode-llm-proxy
```

Add to `opencode.json`:

```json
{
  "plugin": ["opencode-llm-proxy"]
}
```

Start OpenCode; the proxy starts automatically:

```bash
opencode
```

Send a request:

```bash
curl http://127.0.0.1:4010/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "github-copilot/claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Install from npm:

```bash
npm install opencode-llm-proxy
```

Add to your global `~/.config/opencode/opencode.json` (works everywhere) or a project-level `opencode.json`:

```json
{
  "plugin": ["opencode-llm-proxy"]
}
```

Or download the plugin file directly. Global (loaded for every OpenCode session):

```bash
curl -o ~/.config/opencode/plugins/llm-proxy.js \
  https://raw.githubusercontent.com/KochC/opencode-llm-proxy/main/index.js
```

Per-project (loaded only in this directory):

```bash
mkdir -p .opencode/plugins
curl -o .opencode/plugins/llm-proxy.js \
  https://raw.githubusercontent.com/KochC/opencode-llm-proxy/main/index.js
```

The proxy is configured through environment variables:

| Variable | Default | Description |
|---|---|---|
| `OPENCODE_LLM_PROXY_HOST` | `127.0.0.1` | Bind address. Set to `0.0.0.0` to expose on a LAN or in Docker. |
| `OPENCODE_LLM_PROXY_PORT` | `4010` | TCP port. |
| `OPENCODE_LLM_PROXY_TOKEN` | (unset) | Bearer token required on every request. Unset = no auth. |
| `OPENCODE_LLM_PROXY_CORS_ORIGIN` | `*` | `Access-Control-Allow-Origin` value for browser clients. |
For example, to expose the proxy on your LAN with token auth:

```bash
OPENCODE_LLM_PROXY_HOST=0.0.0.0 \
OPENCODE_LLM_PROXY_TOKEN=my-secret \
opencode
```

OpenAI SDK (TypeScript):

```typescript
import OpenAI from "openai"

const client = new OpenAI({
  baseURL: "http://127.0.0.1:4010/v1",
  apiKey: "unused",
})

const response = await client.chat.completions.create({
  model: "github-copilot/claude-sonnet-4.6",
  messages: [{ role: "user", content: "Explain recursion." }],
})
```

OpenAI SDK (Python):

```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:4010/v1", api_key="unused")

response = client.chat.completions.create(
    model="ollama/qwen2.5-coder",
    messages=[{"role": "user", "content": "Write a Python function to reverse a string."}],
)
print(response.choices[0].message.content)
```

Anthropic SDK (Python):

```python
import anthropic

client = anthropic.Anthropic(
    base_url="http://127.0.0.1:4010",
    api_key="unused",
)

message = client.messages.create(
    model="anthropic/claude-3-5-sonnet",
    max_tokens=1024,
    messages=[{"role": "user", "content": "What is the Pythagorean theorem?"}],
)
print(message.content[0].text)
```

Anthropic SDK (TypeScript):

```typescript
import Anthropic from "@anthropic-ai/sdk"

const client = new Anthropic({
  baseURL: "http://127.0.0.1:4010",
  apiKey: "unused",
})

const message = await client.messages.create({
  model: "anthropic/claude-opus-4",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Explain async/await." }],
})
```

Google Generative AI SDK (JavaScript; note the base URL is passed as the second argument to `getGenerativeModel`, not to the constructor):

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai"

const genAI = new GoogleGenerativeAI("unused")
const model = genAI.getGenerativeModel(
  { model: "google/gemini-2.0-flash" },
  { baseUrl: "http://127.0.0.1:4010" },
)
const result = await model.generateContent("What is machine learning?")
console.log(result.response.text())
```

LangChain (Python):

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="anthropic/claude-3-5-sonnet",
    openai_api_base="http://127.0.0.1:4010/v1",
    openai_api_key="unused",
)
response = llm.invoke("What are the SOLID principles?")
print(response.content)
```

Open WebUI:

- Settings → Connections → OpenAI API
- Set API Base URL to `http://127.0.0.1:4010/v1`
- Leave API Key blank (or set it to your `OPENCODE_LLM_PROXY_TOKEN`)
- Save; all your OpenCode models appear in the model picker

> Running Open WebUI in Docker? Use `http://host.docker.internal:4010/v1` and set `OPENCODE_LLM_PROXY_HOST=0.0.0.0`.
Settings → AI Provider → OpenAI API → set API Host to http://127.0.0.1:4010.
In `~/.continue/config.json`:

```json
{
  "models": [
    {
      "title": "Claude via OpenCode",
      "provider": "openai",
      "model": "anthropic/claude-3-5-sonnet",
      "apiBase": "http://127.0.0.1:4010/v1",
      "apiKey": "unused"
    }
  ]
}
```

In `~/.config/zed/settings.json`:

```json
{
  "language_models": {
    "openai": {
      "api_url": "http://127.0.0.1:4010/v1",
      "available_models": [
        {
          "name": "github-copilot/claude-sonnet-4.6",
          "display_name": "Claude (OpenCode)",
          "max_tokens": 8096
        }
      ]
    }
  }
}
```

```bash
curl http://127.0.0.1:4010/v1/models | jq '.data[].id'
# "github-copilot/claude-sonnet-4.6"
# "anthropic/claude-3-5-sonnet"
# "ollama/qwen2.5-coder"
# ...
```

Use `provider/model` for clarity. Bare model IDs (e.g. `gpt-4o`) work if unambiguous across your providers.
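Because IDs follow the `provider/model` convention, the `/v1/models` listing is easy to slice by provider. A small illustrative helper (`models_for_provider` is ours, not part of the proxy):

```python
def models_for_provider(model_ids: list[str], provider: str) -> list[str]:
    """Keep only the IDs belonging to one provider, using the
    provider/model prefix convention."""
    return [m for m in model_ids if m.startswith(provider + "/")]

ids = [
    "github-copilot/claude-sonnet-4.6",
    "anthropic/claude-3-5-sonnet",
    "ollama/qwen2.5-coder",
]
print(models_for_provider(ids, "ollama"))  # ['ollama/qwen2.5-coder']
```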
To force a specific provider without changing the model string, add the header:

```
x-opencode-provider: anthropic
```

The health check returns:

```json
{ "healthy": true, "service": "opencode-openai-proxy" }
```

`GET /v1/models` returns all models from all configured providers in OpenAI list format.
`POST /v1/chat/completions`: OpenAI Chat Completions. Required fields: `model`, `messages`. Optional: `stream`, `temperature`, `max_tokens`.

`POST /v1/responses`: OpenAI Responses API. Required fields: `model`, `input`. Optional: `instructions`, `stream`, `max_output_tokens`.

`POST /v1/messages`: Anthropic Messages API. Required fields: `model`, `messages`. Optional: `system`, `max_tokens`, `stream`. Errors are returned in Anthropic format: `{ "type": "error", "error": { "type": "...", "message": "..." } }`.

`POST /v1beta/models/:model:generateContent`: Google Gemini, non-streaming. Model name in the URL path. Required field: `contents`. Optional: `systemInstruction`, `generationConfig`.

The streaming variant behaves the same but returns a newline-delimited JSON stream.
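Since the Gemini route embeds the model name in the URL path rather than the JSON body, building the request URL looks roughly like this (base URL and path pattern from this document; the helper name and default are ours):

```python
def gemini_generate_url(model: str, base: str = "http://127.0.0.1:4010") -> str:
    """Build the non-streaming Gemini endpoint URL; the model name
    is part of the path, not the request body."""
    return f"{base}/v1beta/models/{model}:generateContent"

print(gemini_generate_url("gemini-2.0-flash"))
# http://127.0.0.1:4010/v1beta/models/gemini-2.0-flash:generateContent
```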
Each request:

- Is authenticated if `OPENCODE_LLM_PROXY_TOKEN` is set
- Has its model resolved from `provider/model`, a bare model ID, or the Gemini URL path
- Creates a temporary OpenCode session (visible in the session list)
- Sends the prompt via `client.session.prompt` / `client.session.promptAsync`
- Returns the response in the same format as the request
Streaming uses OpenCode's client.event.subscribe() SSE stream. Text deltas are forwarded in real time.
- Text only — image, audio, and file inputs are ignored
- No tool/function calling — all OpenCode tools are disabled for proxy sessions
- No cross-request session state — send full conversation history on every request
- Temperature and max tokens are advisory (passed as system prompt hints)
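Because there is no cross-request session state, multi-turn chat means resending the whole transcript each turn. A sketch of the minimal client-side bookkeeping this implies (helper names are ours):

```python
# The proxy keeps no session state between requests, so the client
# owns the conversation history and resends all of it every turn.
history: list[dict] = []

def next_payload(user_text: str, model: str = "ollama/qwen2.5-coder") -> dict:
    """Append the new user turn and return a full request body."""
    history.append({"role": "user", "content": user_text})
    return {"model": model, "messages": list(history)}

def record_reply(text: str) -> None:
    history.append({"role": "assistant", "content": text})

payload = next_payload("Hello!")
record_reply("Hi! How can I help?")
payload = next_payload("Summarize our chat.")
print(len(payload["messages"]))  # 3: every prior turn is included
```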
MIT