fix(logging): recover gracefully when log file deleted mid-session (#711) by livepeer-tessa · Pull Request #712 · daydreamlive/scope

livepeer-tessa · 2026-03-18T06:24:54Z

Summary

Fixes #711.

On fal.ai workers the OS may clean up /tmp while the process is still running. When RotatingFileHandler's stream is closed (e.g. during doRollover()) and the log directory has since been removed, every subsequent log call emits a noisy --- Logging error --- traceback to stderr instead of writing to the log file.

Root cause

The failure path is:

1. RotatingFileHandler accumulates 5 MB → calls doRollover()
2. doRollover() closes the stream and sets self.stream = None
3. Meanwhile the OS deletes /tmp/.daydream-scope/…/logs/
4. doRollover() (or the next shouldRollover() call) tries self._open() → FileNotFoundError
5. Python's default handleError dumps a traceback to stderr for every subsequent log record
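
The failure path can be reproduced with the stdlib handler alone. The following is a rough sketch, not code from this PR: `CountingHandler` and `reproduce_failure` are hypothetical names, `backupCount=1` is an assumed value, and handleError() is overridden only to count calls instead of printing tracebacks.

```python
import logging
import shutil
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


class CountingHandler(RotatingFileHandler):
    """Stock handler that counts handleError() calls instead of printing them."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.error_count = 0

    def handleError(self, record):
        self.error_count += 1


def reproduce_failure(records=3):
    """Return (handleError call count, whether the log directory exists)."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp) / "logs"
        log_dir.mkdir()
        handler = CountingHandler(
            log_dir / "app.log", maxBytes=5 * 1024 * 1024, backupCount=1
        )
        handler.emit(logging.makeLogRecord({"msg": "before cleanup"}))

        # Simulate the state doRollover() leaves behind: stream closed and None.
        handler.stream.close()
        handler.stream = None
        # Simulate the OS cleaning up /tmp underneath the running process.
        shutil.rmtree(log_dir)

        # Each record now fails in self._open() and lands in handleError().
        for i in range(records):
            handler.emit(logging.makeLogRecord({"msg": f"after cleanup {i}"}))
        handler.close()
        return handler.error_count, log_dir.exists()
```

Calling `reproduce_failure()` should report one handleError() call per record, with the directory never recreated, which matches the per-record traceback noise described above.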

Fix

Introduce ResilientRotatingFileHandler (in logs_config.py), a drop-in subclass of RotatingFileHandler that:

- Overrides shouldRollover(): catches FileNotFoundError, recreates the log directory + file via _reopen_stream(), and retries the rollover check.
- Overrides emit(): catches FileNotFoundError, recreates the log directory + file, and retries the write.
- Falls back to the standard handleError() path only when recovery itself fails (truly unrecoverable errors).
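
A minimal sketch of how such a handler could look, based only on the description above. The method bodies, the `_emit_once` helper, the `demo` function, and `backupCount=1` are assumptions for illustration and may differ from the code in logs_config.py. One design point worth noting: the stdlib emit() already routes every exception to handleError(), so this sketch reimplements emit() rather than wrapping super().emit().

```python
import logging
import os
import shutil
import tempfile
from logging.handlers import RotatingFileHandler
from pathlib import Path


class ResilientRotatingFileHandler(RotatingFileHandler):
    """Illustrative sketch only; the PR's real implementation may differ."""

    def _reopen_stream(self):
        # Discard any stale stream, recreate the directory, open a fresh file.
        if self.stream is not None:
            try:
                self.stream.close()
            except OSError:
                pass
            self.stream = None
        os.makedirs(os.path.dirname(self.baseFilename), exist_ok=True)
        self.stream = self._open()

    def shouldRollover(self, record):
        try:
            return super().shouldRollover(record)
        except FileNotFoundError:
            self._reopen_stream()
            return super().shouldRollover(record)

    def _emit_once(self, record):
        # Hypothetical helper: the same steps as the stdlib emit(), minus its
        # blanket "except Exception" that sends everything to handleError().
        if self.shouldRollover(record):
            self.doRollover()
        logging.FileHandler.emit(self, record)

    def emit(self, record):
        try:
            try:
                self._emit_once(record)
            except FileNotFoundError:
                self._reopen_stream()
                self._emit_once(record)
        except Exception:
            # Recovery itself failed: fall back to the standard error path.
            self.handleError(record)


def demo():
    """Delete the log directory while the stream is closed, then log again."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp) / "logs"
        log_dir.mkdir()
        log_file = log_dir / "app.log"
        handler = ResilientRotatingFileHandler(
            log_file, maxBytes=5 * 1024 * 1024, backupCount=1  # backupCount assumed
        )
        handler.emit(logging.makeLogRecord({"msg": "before cleanup"}))
        handler.stream.close()  # state left behind by doRollover()
        handler.stream = None
        shutil.rmtree(log_dir)  # simulated /tmp cleanup
        handler.emit(logging.makeLogRecord({"msg": "after cleanup"}))
        handler.close()
        return log_file.read_text()
```

In this sketch, `demo()` returns the contents of the recreated log file, which holds only the record written after the simulated cleanup; the earlier record is lost along with the deleted directory.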

_configure_logging() in app.py now uses ResilientRotatingFileHandler instead of the stdlib class.

Tests

Added TestResilientRotatingFileHandler in tests/test_logs_config.py:

- Normal emit works
- Recovery after directory deletion (with stream closed)
- Recovery after file deletion (with stream closed, directory intact)
- shouldRollover() recovery after directory deletion
- Fallback to handleError when recovery itself fails

All 22 tests pass.

Without a heartbeat, aiohttp does not send WebSocket ping frames, so NAT gateways, proxies, and firewalls can silently drop idle TCP connections. This manifests as code=1006 (abnormal closure / no close frame) after 10-30 minutes of use.

Set heartbeat=30.0 on ws_connect so aiohttp sends a ping frame every 30 seconds, keeping the connection alive through middleboxes.

Fixes #707

Signed-off-by: livepeer-robot <robot@livepeer.org>
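
For illustration, the heartbeat setting might be applied roughly as follows. This is a sketch under assumptions, not the code from the commit: `connect_with_heartbeat` is a hypothetical function, and the message loop is a placeholder for whatever the client actually does with incoming frames.

```python
async def connect_with_heartbeat(url: str, heartbeat: float = 30.0) -> None:
    """Open a WebSocket on which aiohttp pings every `heartbeat` seconds."""
    # Imported lazily so this sketch can be loaded without aiohttp installed.
    import aiohttp

    async with aiohttp.ClientSession() as session:
        # heartbeat=30.0 asks aiohttp to send a ping frame every 30 seconds,
        # which keeps idle connections alive through NATs and proxies.
        async with session.ws_connect(url, heartbeat=heartbeat) as ws:
            async for msg in ws:
                if msg.type == aiohttp.WSMsgType.TEXT:
                    print(msg.data)  # placeholder for real message handling
                elif msg.type == aiohttp.WSMsgType.ERROR:
                    break
```

It would be run with something like `asyncio.run(connect_with_heartbeat(url))`, where `url` is the WebSocket endpoint of the deployment.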

On fal.ai workers, the OS may clean up /tmp while the process is still running (issue #711). When RotatingFileHandler's stream is closed (e.g. during doRollover()) and the log directory has been removed, subsequent log calls emit noisy '--- Logging error ---' tracebacks to stderr instead of writing to a log file.

Introduce ResilientRotatingFileHandler, a subclass that:
- Overrides shouldRollover() to catch FileNotFoundError, recreate the log directory/file via _reopen_stream(), and retry the check.
- Overrides emit() to catch FileNotFoundError, recreate the log directory/file via _reopen_stream(), and retry the write.
- Falls back to the standard handleError() path only when recovery itself fails (truly unrecoverable errors).

Use ResilientRotatingFileHandler in _configure_logging() in app.py instead of the stdlib RotatingFileHandler.

Add unit tests covering: normal operation, recovery after directory deletion, recovery after file deletion (stream previously closed), and fallback to handleError on unrecoverable errors.

Fixes #711

Signed-off-by: livepeer-robot <robot@livepeer.org>

coderabbitai · 2026-03-18T06:25:01Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


github-actions · 2026-03-18T06:33:49Z

🚀 fal.ai Preview Deployment


App ID	`daydream/scope-pr-712--preview`
WebSocket	`wss://fal.run/daydream/scope-pr-712--preview/ws`
Commit	`ff3bbb6`

Testing

Connect to this preview deployment by running this on your branch:

uv run build && SCOPE_CLOUD_APP_ID="daydream/scope-pr-712--preview/ws" uv run daydream-scope

🧪 E2E tests will run automatically against this deployment.

github-actions · 2026-03-18T06:37:03Z

✅ E2E Tests passed


Status	passed
fal App	`daydream/scope-pr-712--preview`
Run	View logs

Test Artifacts

Check the workflow run for screenshots.

livepeer-robot added 2 commits March 17, 2026 16:31

livepeer-tessa requested review from emranemran and mjh1 March 18, 2026 06:24

livepeer-tessa wants to merge 2 commits into main from fix/711-resilient-rotating-file-handler
