Add push and pull functions commands #70

Merged
Parker Henderson (parkerhendo) merged 25 commits into main from functions-push-pull-command
Mar 20, 2026

@parkerhendo
Contributor

Add bt functions push and bt functions pull

Adds push/pull commands for managing Braintrust functions as code. Push bundles local TypeScript/Python files and uploads function definitions (code functions, prompts, tools) to Braintrust. Pull materializes prompt definitions from Braintrust back to local files.

Commands

bt functions push [PATH...] --if-exists <error|replace|ignore>

Flag                       Short  Default  Description
--file <PATH>                              File or directory paths (also accepts positional args)
--if-exists                       error    Behavior when slug already exists: error, replace, ignore
--language                        auto     Force runtime: auto, javascript, python
--runner <BIN>                             Override runner binary (e.g. tsx, vite-node, python)
--tsconfig <PATH>                          tsconfig path for JS runner and bundler
--external-packages <PKG>                  Additional packages to mark external during JS bundling
--requirements <PATH>                      Python requirements file
--terminate-on-failure            false    Stop after first hard failure
--yes                      -y     false    Skip confirmation prompt

bt functions pull [SLUG...] --language <typescript|python>

Flag                 Short  Default       Description
--slug               -s                   Function slug(s) to pull (also accepts positional args)
--output-dir <PATH>         ./braintrust  Destination directory for generated files
--language                  typescript    Output language: typescript, python
--project-id                              Scope to a specific project by ID
--id                                      Select a specific function by ID
--version                                 Version selector
--force                     false         Overwrite existing files
--verbose                   false         Show skipped/unsupported records

How it works

Push runs a JS/Python runner to discover function and prompt registrations from source files, bundles code functions with esbuild, and uploads definitions to Braintrust via /insert-functions. Handles both modern (toFunctionDefinition) and legacy (project.prompts.create()) prompt registration paths.
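The `--if-exists` modes can be read as a small decision function. The following is a minimal sketch based only on the flag description above; the names `IfExistsMode`, `PushAction`, and `resolveConflict` are hypothetical, and the actual CLI is written in Rust and may split this logic differently between client and server.

```typescript
// Hypothetical sketch of the three --if-exists modes.
// Names and action labels are illustrative, not taken from the bt source.
type IfExistsMode = "error" | "replace" | "ignore";
type PushAction = "insert" | "overwrite" | "skip";

function resolveConflict(
  slug: string,
  existingSlugs: ReadonlySet<string>,
  mode: IfExistsMode,
): PushAction {
  // No conflict: the definition is uploaded regardless of mode.
  if (!existingSlugs.has(slug)) {
    return "insert";
  }
  if (mode === "replace") {
    return "overwrite";
  }
  if (mode === "ignore") {
    return "skip";
  }
  // mode === "error" (the default): treat the conflict as a hard failure.
  throw new Error(`function slug already exists: ${slug}`);
}
```

With `error` as the default, an accidental push over an existing slug fails loudly, and overwriting requires an explicit `--if-exists replace`.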

Pull fetches prompt definitions from a project and writes them as TypeScript or Python source files. Non-prompt function types are skipped.
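The fetch-and-filter step of pull might look roughly like the loop below. This is a sketch under stated assumptions: the record shape, the `kind` field, the `nextCursor` field, and the `fetchPage` callback are all hypothetical stand-ins, since the real API response is not shown in this PR.

```typescript
// Hypothetical record and page shapes (assumed, not the real API response).
interface FunctionRecord {
  slug: string;
  kind: string; // assumed discriminator, e.g. "prompt" or "code"
}

interface Page {
  records: FunctionRecord[];
  nextCursor?: string; // absent on the last page
}

type FetchPage = (cursor?: string) => Promise<Page>;

// Follow cursors until the listing is exhausted, keeping prompts and
// counting everything else as skipped.
async function collectPrompts(
  fetchPage: FetchPage,
): Promise<{ prompts: FunctionRecord[]; skipped: number }> {
  const prompts: FunctionRecord[] = [];
  let skipped = 0;
  let cursor: string | undefined = undefined;
  do {
    const page: Page = await fetchPage(cursor);
    for (const record of page.records) {
      if (record.kind === "prompt") {
        prompts.push(record);
      } else {
        skipped += 1;
      }
    }
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return { prompts, skipped };
}
```

A skipped count like this is the kind of information `--verbose` could surface for unsupported records.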

Infrastructure

  • JS bundler script (esbuild) and runner scripts for TS/Python
  • Shared runner-common.ts and python_runner_common.py modules
  • Atomic file writes, git helpers, source language detection, Python interpreter resolution

Test plan

  • cargo test --test functions — 20 integration tests covering push/pull CLI fixtures, mock server round-trips, edge cases
  • Manual: bt functions push prompt.ts --if-exists replace pushes both code functions and prompts
  • Manual: bt functions pull --slug <slug> materializes prompt to local file

@github-actions

github-actions bot commented Mar 10, 2026

Latest downloadable build artifacts for this PR commit 1acb3e34c2e8:

Available artifact names
  • artifacts-build-global
  • artifacts-build-local-x86_64-apple-darwin
  • artifacts-build-local-x86_64-pc-windows-msvc
  • artifacts-build-local-x86_64-unknown-linux-musl
  • artifacts-build-local-x86_64-unknown-linux-gnu
  • artifacts-build-local-aarch64-apple-darwin
  • artifacts-build-local-aarch64-unknown-linux-musl
  • artifacts-build-local-aarch64-unknown-linux-gnu
  • artifacts-plan-dist-manifest
  • cargo-dist-cache

Contributor Author

Updated the integration tests here. Had to make a few tweaks to get them to pass, but all are green now.

https://github.com/braintrustdata/braintrust/pull/11810

@nselvidge
Contributor

Parker Henderson (@parkerhendo) I pulled this branch and had some failing tests. Looking at our CI, it looks like we don't run all new tests by default, so I don't think the new tests are getting run in CI.

Introduce foundational modules that the push and pull commands will
build upon:

- source_language: SourceLanguage enum and extension classification
  for distinguishing JS/TS from Python files
- utils/fs_atomic: atomic file writes via temp-then-rename
- utils/git: GitRepo discovery and dirty-state detection
- js_runner: runner script materialization and JS runtime discovery
  (tsx, vite-node, ts-node, deno)
- python_runner: Python interpreter resolution with venv support
- scripts/runner-common.ts: shared TS types for runner manifests
- scripts/python_runner_common.py: shared Python utilities for module
  loading, file normalization, and source collection

Introduce the command scaffolding for `bt functions push` and
`bt functions pull`:

- functions/mod.rs: PushArgs, PullArgs, AuthContext, IfExistsMode,
  FunctionsLanguage enums, and refactored context resolution
- functions/api.rs: paginated function listing, code upload slots,
  bundle upload, and batch insert endpoints
- functions/report.rs: structured report types for JSON output with
  HardFailureReason, SoftSkipReason, and summary types
- auth.rs: AvailableOrg struct and list_available_orgs()
- http.rs: put_signed_url() for uploading to signed URLs

Add the full push pipeline for deploying local function definitions to
Braintrust:

- push.rs: file classification, runner invocation, manifest parsing,
  project preflight, bundle compression/upload, and batch insert with
  structured reporting
- functions-runner.ts: TS/JS runner that imports user files, inspects
  the Braintrust global registry, and emits a JSON manifest
- functions-runner.py: Python runner with bundle collection (entry
  module + source files) for server-side execution

Supports both TypeScript and Python source files with automatic
language detection, interactive org/project selection, and
configurable conflict resolution (error/replace/ignore).

Add the pull pipeline for downloading Braintrust function definitions
as local source files:

- Paginated fetching with cursor-based pagination and snapshot
  consistency
- Code generation for both TypeScript and Python with proper imports,
  typed prompt definitions, and recursive JSON value formatting
- Safety checks: git dirty detection, existing file protection, and
  force flag override
- Sanitized identifiers and filenames with Windows reserved name
  handling
- Per-project directory organization with atomic file writes
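
The identifier and filename sanitization mentioned above could be sketched as follows. The replacement rules, the `function` fallback stem, and the helper names are assumptions for illustration; only the Windows reserved device names and the `test-04ef` naming (which matches the pulled examples later in this thread) are known facts. The real implementation lives in the Rust CLI and may differ.

```typescript
// Windows device names that are unsafe as file stems (case-insensitive).
const WINDOWS_RESERVED = new Set<string>(["con", "prn", "aux", "nul"]);
for (let i = 1; i <= 9; i++) {
  WINDOWS_RESERVED.add(`com${i}`);
  WINDOWS_RESERVED.add(`lpt${i}`);
}

// Turn a slug into a safe file stem. The exact rules here are assumptions.
function sanitizeFileStem(slug: string): string {
  let stem = slug
    .replace(/[^A-Za-z0-9._-]+/g, "-") // replace path separators and other unsafe characters
    .replace(/^[-.]+|[-.]+$/g, ""); // strip leading/trailing dots and dashes
  if (stem === "") {
    stem = "function"; // hypothetical fallback name
  }
  // Reserved names are unsafe even with an extension, so check the part before the first dot.
  const base = stem.toLowerCase().split(".")[0];
  if (WINDOWS_RESERVED.has(base)) {
    stem = `_${stem}`;
  }
  return stem;
}

// Turn a slug into a variable name in camelCase (TypeScript) or snake_case (Python).
function toIdentifier(slug: string, style: "camel" | "snake"): string {
  const words = slug.split(/[^A-Za-z0-9]+/).filter((w) => w.length > 0);
  let ident =
    style === "snake"
      ? words.join("_").toLowerCase()
      : words
          .map((w, i) => (i === 0 ? w : w.charAt(0).toUpperCase() + w.slice(1)))
          .join("");
  // Identifiers cannot be empty or start with a digit.
  if (ident === "" || /^[0-9]/.test(ident)) {
    ident = `_${ident}`;
  }
  return ident;
}
```

For the slug `test-04ef`, this yields `test04ef` in camel style and `test_04ef` in snake style.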
Comprehensive test coverage for the push and pull commands:

- 16 CLI fixture tests covering help text, flag validation, env var
  parsing, language selection, and argument conflict detection
- Mock API server (actix-web) with handlers for login, projects,
  upload slots, bundle upload, and function insert/list
- Integration tests for full push and pull flows against mock server
- JS and Python runner manifest validation tests
- Python bundle validation and cross-file module purge tests
Contributor

One bigger issue with function schemas, one minor issue we should resolve, and one documentation nit. I had Codex go through and make a suite of end-to-end tests running through all of the functionality, and verified it's correct and matches existing SDK behavior, but it's tough to review a 10k-line PR with high confidence. Not blocking for this PR, but I think if we break up large chunks of work like this in the future, we'll have an easier time in review going through and understanding code quality.

&mut seen_names,
);
out.push_str(&format!("{var_name} = project.prompts.create(\n"));
out.push_str(&format!(

Looks like the TS renderer includes the id and version but the Python renderer doesn't. I made a prompt and pulled it to compare:

I'm actually not sure what the correct behavior is, but we should probably be consistent.

TS

export const test04ef = project.prompts.create({
  id: "2fab7153-4a5f-4ce3-bd9b-aee9e34a441b",
  name: "Test",
  slug: "test-04ef",
  version: "1000196843330666497",
  messages: [
    {
      content: "hello world",
      role: "system"
    }
  ],
  model: "gpt-5-mini",
  params: {
    temperature: 0,
    use_cache: true
  },
});

Python


test_04ef = project.prompts.create(
    name="Test",
    slug="test-04ef",
    messages=[
        {
            "content": "hello world",
            "role": "system"
        }
    ],
    model="gpt-5-mini",
    params={
        "temperature": 0,
        "use_cache": True
    },
)

Contributor Author

Fixed this. Thanks for flagging!

Not a blocking comment, but we're doing several different things in this file, and I think it would be easier to maintain if we decomposed it into smaller, more focused pieces. Again, I don't think this should block the PR, but we should keep an eye out for stuff like this.

return zodToJsonSchemaFn;
}

function schemaToJsonSchema(schema: unknown): JsonObject | undefined {

looks like there's a bug in here. if you create a function with the legacy sdk you get this function_schema:

legacy_row.function_schema == {
    "parameters": {
      "type": "object",
      "$schema": "http://json-schema.org/draft-07/schema#",
      "required": ["orderId"],
      "properties": {
        "orderId": {
          "type": "string",
          "description": "The order ID"
        }
      },
      "additionalProperties": false
    },
    "returns": {
      "type": "object",
      "$schema": "http://json-schema.org/draft-07/schema#",
      "required": ["status"],
      "properties": {
        "status": {
          "type": "string"
        }
      },
      "additionalProperties": false
    }
  }

if you create the same function with the bt cli you get:


  rust_row.function_schema == {
    "parameters": {
      "_def": {
        "typeName": "ZodObject",
        "unknownKeys": "strip",
        "catchall": {
          "_def": { "typeName": "ZodNever" },
          "~standard": { "vendor": "zod", "version": 1 }
        }
      },
      "_cached": null,
      "~standard": { "vendor": "zod", "version": 1 }
    },
    "returns": {
      "_def": {
        "typeName": "ZodObject",
        "unknownKeys": "strip",
        "catchall": {
          "_def": { "typeName": "ZodNever" },
          "~standard": { "vendor": "zod", "version": 1 }
        }
      },
      "_cached": null,
      "~standard": { "vendor": "zod", "version": 1 }
    }
  }

I think it's because we convert to a JSON value and then check isJsonObject. The serialized Zod object passes that check, so we return early and never get to where we call the converter:


  const normalizedSchema = toJsonValue(schema as JsonValue);
  if (isJsonObject(normalizedSchema)) {
    return normalizedSchema;
  }
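
One way to avoid the early return, sketched under assumptions: detect schema objects and run the converter before the plain-JSON check. The fix that actually landed in this PR is not shown in the thread, so this is only an illustration. `looksLikeZodSchema` and the `convert` parameter are hypothetical, with `convert` standing in for whatever converter the runner loads, and the `toJsonValue` normalization step is omitted for brevity.

```typescript
type JsonValue =
  | null
  | boolean
  | number
  | string
  | JsonValue[]
  | { [key: string]: JsonValue };
type JsonObject = { [key: string]: JsonValue };

// Hypothetical stand-in for the real schema-to-JSON-Schema converter.
type SchemaConverter = (schema: unknown) => JsonObject;

function isJsonObject(value: unknown): value is JsonObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Assumption: schema instances can be recognized by their `_def` property,
// which is what leaked into the stored function_schema above.
function looksLikeZodSchema(schema: unknown): boolean {
  return typeof schema === "object" && schema !== null && "_def" in schema;
}

function schemaToJsonSchema(
  schema: unknown,
  convert: SchemaConverter,
): JsonObject | undefined {
  // Convert schema objects first; serializing them directly leaks
  // internals such as `_def` and `~standard`.
  if (looksLikeZodSchema(schema)) {
    return convert(schema);
  }
  // Values that are already plain JSON objects pass through unchanged.
  if (isJsonObject(schema)) {
    return schema;
  }
  return undefined;
}
```

The point of the ordering is that the pass-through branch can only be reached by values that are not schema instances.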

Contributor Author

Fixed as well

Contributor

thanks for the fixes. except for the failing tests this looks good!

@parkerhendo Parker Henderson (parkerhendo) merged commit 213da80 into main Mar 20, 2026
32 checks passed