
python(feat): add export api for sift client #490

Open
wei-qlu wants to merge 53 commits into main from python/data-export-high-level-interface

Conversation

@wei-qlu
Contributor

@wei-qlu wei-qlu commented Mar 10, 2026

What was changed

Adds a high-level data export API to sift_client to reach parity with sift_py (soon to be deprecated).

  • Add DataExportsAPI with a single export method that accepts domain objects or raw string IDs for runs, assets, or time-range-based exports and returns a Job handle
  • wait_and_download polls until complete, then downloads and extracts the exported files
  • Add _resolve_calculated_channels helper that resolves name-based channel identifiers to UUIDs via the channels API (the export API requires UUIDs but CalculatedChannel objects store names)
  • Add ExportsLowLevelClient that wraps the gRPC protobuf calls; the high-level client has zero proto imports
  • Register sync wrapper and type stubs; wire up client.exports and client.async_.exports

Verification

  • Unit tests for input validation, domain object resolution, low-level delegation, wait_and_download status handling, and calculated channel resolution
  • Integration tests for async and sync export paths
  • Manual export testing locally

@wei-qlu wei-qlu changed the title Python/data export high level interface python(feat): add export api for sift client Mar 10, 2026
@wei-qlu wei-qlu marked this pull request as ready for review March 12, 2026 21:19
@wei-qlu wei-qlu requested a review from nathan-sift March 16, 2026 19:52
@wei-qlu wei-qlu marked this pull request as draft March 17, 2026 00:39
nathan-sift previously approved these changes Mar 17, 2026
Contributor

@nathan-sift nathan-sift left a comment

This looks good, thanks for the changes! Approving, but it looks like you might need to re-update the stubs.


# Run the synchronous request in a thread pool to avoid blocking the event loop
loop = asyncio.get_event_loop()
extracted_files = await loop.run_in_executor(
Collaborator

@alexluck-sift alexluck-sift Mar 18, 2026

is this necessary and does it even work in the sync context? ExportsAPIAsync functions are already async.

From Claude:
wait_until_complete is not safe for sync generation

This is the most serious issue. The async wait_until_complete
does:
loop = asyncio.get_event_loop()
extracted_files = await loop.run_in_executor(
    None, self._download_and_extract, ...)

When the sync wrapper calls this, it's already running in a
dedicated event loop thread (per the architecture in
README.md). asyncio.get_event_loop() inside that context may
return a different loop than the one actually running the
coroutine, especially in Python 3.10+ where this is
deprecated. Even with get_running_loop(), the run_in_executor
call spawns a thread from within the dedicated gRPC thread —
this works but is fragile and untested.

Compare with how other resources handle blocking I/O: they
don't. Every other resource is pure async gRPC calls. Exports
is unique in mixing async gRPC with sync HTTP (requests.get)
via run_in_executor.

Contributor Author

run_in_executor is needed because requests.get() is synchronous (even when called through rest_client) and would block the event loop during download. Switching to get_running_loop() handles the issue (in sync mode as well) since it always returns the loop currently executing the coroutine.

Collaborator

should we make a util/wrapper for REST calls such that we can handle this the same anytime we need to use REST? For example with file attachments

Contributor Author

That's a good idea, added a util function wrapping the executor for blocking calls.

Resolved

from sift_client.transport.rest_transport import RestClient


def download_and_extract_zip(
Collaborator

can this be broken into two separate utils? a download (that we can use for any downloads such as file attachments, check what we do there) and the zip extraction? These are separate concerns

Contributor Author

@wei-qlu wei-qlu Mar 20, 2026

resolved, broken into two separate functions and renamed the file to be more generic (_internal/util/file.py)

),
)

return extracted_files
Collaborator

this is confusing naming since the files may or may not be extracted.

Once the concerns are separated a bit, it may be a bit more clear to:

zip_file_path = download_presigned_file(...)

if not extract:
    return [zip_file_path]

return extract_zip(zip_file_path, delete=True, ...)

Contributor Author

@wei-qlu wei-qlu Mar 20, 2026

resolved, renamed to extract_zip to be more specific

async def test_export_by_run(self, exports_api_async, sift_client):
runs = await sift_client.async_.runs.list_(limit=1)
assert runs, "No runs available"
job = await exports_api_async.export(runs=[runs[0]], output_format=CSV)
Collaborator

be careful with this, the run could have a lot of data, can we scope this in a safer way? Or perhaps we link this to an ingestion test and we then have full control of what data exists and are exporting

Contributor Author

Resolved, added an upper-bound timestamp (only 10s of data gets exported)

assets = await sift_client.async_.assets.list_(limit=1)
assert assets, "No assets available"
now = datetime.now(timezone.utc)
job = await exports_api_async.export(
Collaborator

same comment as for the run export

Contributor Author

Resolved, added an upper-bound timestamp (only 10s of data gets exported)

assert len(files) > 0
assert all(f.exists() for f in files)

def test_sync_export_by_run(self, exports_api_sync, sift_client):
Collaborator

Since we are adding some unique loop handling (outside of what we are already doing in the sync wrapper), we should make sure we are sufficiently testing the following scenarios:

  • Plain sync — no event loop active, user calls sync API directly (already tested)
  • User's own event loop — user has their own loop and calls sync API from within it
  • Sync from async — user calls sync API from inside a running async def

While this was tested extensively during development on the sync_wrapper, could you also please add these tests to the test_sync_wrapper for completeness?

Contributor Author

Resolved, added tests for all three scenarios (plain sync, user's own loop, sync from async) in test_sync_wrapper.py
