adding bulk endpoints and auth token endpoint #7

Open

fcollman wants to merge 11 commits into main from bulk_skeleton_improvements

Conversation

@fcollman (Contributor) commented:

Addresses user reports of hitting the 10-skeleton limit in get_bulk_skeletons() even when all requested skeletons already exist in the GCS cache. The existing limit was designed to prevent blocking on skeleton generation, but incorrectly also throttled retrieval of pre-existing cached skeletons. This PR adds two new endpoints that separate those concerns.

Changes

POST //bulk/get_cached_skeletons//<output_format> — retrieves up to 500 already-cached skeletons per call. Skips per-RID is_valid_nodes() validation against the chunkedgraph (the main bottleneck of the existing endpoint). Returns a structured dict with three keys: skeletons (data for found RIDs), missing (not in cache), and async_queued (queued for async generation if generate_missing=true). Rate-limited by the new get_cached_skeletons_bulk category.

POST //bulk/get_skeleton_token/ — generates a short-lived, downscoped GCS OAuth2 Bearer token scoped to read-only access on the skeleton bucket prefix for the given datastack and version. The client can use this token to download skeleton H5 files directly from GCS without routing through this service, which is significantly faster for bulk access. Returns the token, expiry, bucket name, GCS object paths for each cached RID, and a list of missing RIDs.

New constants: MAX_BULK_CACHED_SKELETONS = 500

New dependencies: google.auth.downscoped (already available via google-auth)
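
A client consuming the new cached-bulk endpoint has to stay under the 500-RID cap and merge the structured responses itself. A minimal sketch, assuming the three response keys described above (`skeletons`, `missing`, `async_queued`); the `fetch_chunk` callable stands in for the real HTTP POST and is hypothetical:

```python
# Client-side batching against the endpoint's 500-RID cap.
# MAX_BULK_CACHED_SKELETONS mirrors the constant added in this PR;
# fetch_chunk is a hypothetical stand-in for the actual HTTP POST.
from typing import Callable, Dict, List

MAX_BULK_CACHED_SKELETONS = 500  # per-call cap from this PR

def get_cached_skeletons_batched(
    rids: List[int],
    fetch_chunk: Callable[[List[int]], Dict],
) -> Dict:
    """Split rids into <=500-item chunks and merge the structured responses.

    Each chunk response is expected to carry the three keys described in
    the PR: 'skeletons', 'missing', and 'async_queued'.
    """
    merged = {"skeletons": {}, "missing": [], "async_queued": []}
    for start in range(0, len(rids), MAX_BULK_CACHED_SKELETONS):
        chunk = rids[start : start + MAX_BULK_CACHED_SKELETONS]
        resp = fetch_chunk(chunk)
        merged["skeletons"].update(resp.get("skeletons", {}))
        merged["missing"].extend(resp.get("missing", []))
        merged["async_queued"].extend(resp.get("async_queued", []))
    return merged
```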

Copilot AI left a comment

Pull request overview

This PR adds two new bulk-oriented API endpoints to improve high-volume skeleton access by (1) separating cached retrieval from generation and (2) enabling direct GCS downloads via a downscoped OAuth2 token, addressing reports of the existing bulk endpoint hitting limits even when skeletons are already cached.

Changes:

  • Add a bulk endpoint to fetch already-cached skeletons with a higher per-call RID limit and optional async-queueing of missing RIDs.
  • Add an endpoint that returns a downscoped, short-lived GCS Bearer token plus object paths for cached skeleton H5 files.
  • Introduce MAX_BULK_CACHED_SKELETONS = 500 and wire new API routes to the service layer.
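
The downscoping described above amounts to attaching a Credential Access Boundary that restricts a token to object reads under one datastack/version prefix. A hedged sketch of that boundary as plain JSON (the PR's actual implementation uses `google.auth.downscoped`; the role, resource format, and CEL expression shown follow the GCS access-boundary conventions and are illustrative here):

```python
# Illustrative Credential Access Boundary limiting a downscoped token to
# read-only access under one datastack/version prefix in the skeleton
# bucket. Field names follow the STS access-boundary JSON shape.
def make_read_only_boundary(bucket: str, datastack: str, version: int) -> dict:
    """Build an access-boundary rule scoping object reads to one prefix."""
    prefix = f"{datastack}/{version}/"
    return {
        "accessBoundary": {
            "accessBoundaryRules": [
                {
                    "availableResource": (
                        f"//storage.googleapis.com/projects/_/buckets/{bucket}"
                    ),
                    "availablePermissions": ["inRole:roles/storage.objectViewer"],
                    "availabilityCondition": {
                        "expression": (
                            "resource.name.startsWith("
                            f"'projects/_/buckets/{bucket}/objects/{prefix}')"
                        )
                    },
                }
            ]
        }
    }
```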

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

File                                  Description
skeletonservice/datasets/service.py   Implements cached-bulk retrieval logic and downscoped GCS token generation, plus the new bulk limit constant.
skeletonservice/datasets/api.py       Exposes new POST routes for cached-bulk retrieval and token issuance, with rate-limiting and auth decorators.


Comment on lines +2521 to +2523
datastack_name_remapped = DATASTACK_NAME_REMAPPING[datastack_name] if datastack_name in DATASTACK_NAME_REMAPPING else datastack_name
skvn_prefix = f"{bucket_extra_prefix}{datastack_name_remapped}/{HIGHEST_SKELETON_VERSION}/"
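
The conditional remapping above can be written more tightly with `dict.get`, which falls back to the key itself when no remap entry exists; a small sketch with the same behavior (the dict contents and version value below are illustrative, not the module's real data):

```python
# Equivalent of the remapping line under review, using dict.get.
DATASTACK_NAME_REMAPPING = {"old_name": "new_name"}  # illustrative contents
HIGHEST_SKELETON_VERSION = 4                          # illustrative value
bucket_extra_prefix = ""                              # illustrative value

def skeleton_prefix(datastack_name: str) -> str:
    # dict.get(key, default) collapses the "if key in dict else key" ternary.
    remapped = DATASTACK_NAME_REMAPPING.get(datastack_name, datastack_name)
    return f"{bucket_extra_prefix}{remapped}/{HIGHEST_SKELETON_VERSION}/"
```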

fcollman (Contributor, Author) commented:

@copilot apply changes based on this feedback

from skeletonservice.datasets import limiter
from skeletonservice.datasets.limiter import *
- from skeletonservice.datasets.service import NEUROGLANCER_SKELETON_VERSION, SKELETON_DEFAULT_VERSION_PARAMS, SKELETON_VERSION_PARAMS, SkeletonService
+ from skeletonservice.datasets.service import NEUROGLANCER_SKELETON_VERSION, SKELETON_DEFAULT_VERSION_PARAMS, SKELETON_VERSION_PARAMS, SkeletonService, MAX_BULK_CACHED_SKELETONS
fcollman (Contributor, Author) commented:

@copilot please enforce the limit at the server side
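
The server-side check requested here (later landed as "Enforce MAX_BULK_CACHED_SKELETONS limit at the API layer") can be sketched as a standalone validator; this only illustrates the idea, the real check sits in the API layer behind the route decorators:

```python
# Minimal sketch of server-side enforcement of the 500-RID cap, so an
# oversized bulk request is rejected before any cache lookups run.
MAX_BULK_CACHED_SKELETONS = 500

def validate_bulk_rids(rids: list) -> list:
    """Raise if the request exceeds the per-call bulk limit."""
    if len(rids) > MAX_BULK_CACHED_SKELETONS:
        raise ValueError(
            f"Requested {len(rids)} root ids; the bulk cached-skeleton "
            f"endpoint accepts at most {MAX_BULK_CACHED_SKELETONS} per call."
        )
    return rids
```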

Comment on lines +2346 to +2356
@staticmethod
def get_cached_skeletons_bulk_by_datastack_and_rids(
datastack_name: str,
rids: List,
bucket: str,
root_resolution: List,
collapse_soma: bool,
collapse_radius: int,
skeleton_version: int = 0,
output_format: str = "flatdict",
generate_missing_skeletons: bool = False,
fcollman (Contributor, Author) commented:

@copilot apply changes based on this feedback

Comment on lines +2425 to +2432
skeleton = SkeletonService.get_skeleton_by_datastack_and_rid(
datastack_name,
rid,
output_format,
bucket,
root_resolution,
collapse_soma,
collapse_radius,
Copilot AI added a commit that referenced this pull request Mar 14, 2026
…n validation, RID limits, fast-path, remove unused param, add tests

Co-authored-by: fcollman <782341+fcollman@users.noreply.github.com>

Copilot AI commented Mar 18, 2026

@fcollman I've opened a new pull request, #10, to work on those changes. Once the pull request is ready, I'll request review from you.


Copilot AI commented Mar 18, 2026

@fcollman I've opened a new pull request, #11, to work on those changes. Once the pull request is ready, I'll request review from you.


Copilot AI commented Mar 18, 2026

@fcollman I've opened a new pull request, #12, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 6 commits March 18, 2026 09:55 (Co-authored-by: fcollman <782341+fcollman@users.noreply.github.com>), including:

  • Enforce MAX_BULK_CACHED_SKELETONS limit at the API layer
  • Fix hard-coded HIGHEST_SKELETON_VERSION in get_skeleton_token_by_datastack
  • Add unit tests for bulk cached skeleton and token endpoints


3 participants