Description
Hello Temporal Team,
We are implementing observability for our Temporal‑based project.
We are using OpenTelemetry + New Relic, and our stack is based on Spring Boot 3.5, Java 25, and temporal‑spring‑boot‑starter.
We’re currently facing two issues:
1. Long‑running Workflow / Activity spans never end
For long‑running workflows and activities, Temporal-OTel generates the following spans:
- StartWorkflow
- RunWorkflow
- StartActivity
- RunActivity
Because the RunWorkflow and RunActivity spans cover the entire execution, they remain open (not “ended”) for hours or even days.
As a result:
- The long‑running parent spans stay in memory and are not exported while they remain open
- Their child spans finish quickly and get exported immediately
- Jaeger reports: "invalid parent span id"
- New Relic shows: "missing parent span"
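To make the mechanism concrete, here is a minimal, library-free sketch of what we believe is happening. It assumes the backend only receives spans once they have ended, which matches our understanding of how the OpenTelemetry SDK exports spans. The `Span` record, the `Backend` class, and the span IDs are hypothetical stand-ins for illustration, not Temporal or OpenTelemetry APIs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MissingParentSketch {

    // Hypothetical, simplified span model; not the OpenTelemetry or Temporal API.
    public record Span(String name, String spanId, String parentSpanId) {}

    // Stand-in for a tracing backend: it only ever sees spans that have ended.
    public static final class Backend {
        private final List<Span> received = new ArrayList<>();

        public void export(Span endedSpan) {
            received.add(endedSpan);
        }

        // Names of received spans whose parent id matches no span received so far.
        public List<String> spansWithMissingParent() {
            Set<String> knownIds = new HashSet<>();
            for (Span s : received) {
                knownIds.add(s.spanId());
            }
            List<String> orphans = new ArrayList<>();
            for (Span s : received) {
                if (s.parentSpanId() != null && !knownIds.contains(s.parentSpanId())) {
                    orphans.add(s.name());
                }
            }
            return orphans;
        }
    }

    public static void main(String[] args) {
        Backend backend = new Backend();
        Span runWorkflow = new Span("RunWorkflow", "a1", null);      // stays open for hours or days
        Span startActivity = new Span("StartActivity", "b2", "a1"); // child of RunWorkflow

        backend.export(startActivity);                          // the child ends right away
        System.out.println(backend.spansWithMissingParent());   // prints [StartActivity]

        backend.export(runWorkflow);                            // the parent ends much later
        System.out.println(backend.spansWithMissingParent());   // prints []
    }
}
```

Until the parent span ends and is exported, the backend has no span matching the child's parent ID, which is consistent with the errors listed above.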
Is there any recommended configuration or best practice to handle long‑running spans?
Specifically:
- Is there a way to disable generating RunWorkflow / RunActivity spans?
- Or an API to manually end these spans early?
- Or any recommended pattern to avoid missing‑parent problems for long‑running Temporal executions?
2. Spans after long retry delays appear in a different trace group
For retrying Activities, when the retry delay exceeds ~90 seconds, New Relic starts a new trace group, even though the spans share the same trace ID.
This causes trace fragmentation (split traces).
This seems like a New Relic–specific behavior, but I’m not sure whether other Temporal users have similar experiences with long‑running or delayed spans.
Questions
1. How should long‑running RunWorkflow and RunActivity spans be handled according to best practices?
2. Is the recommended approach to avoid long‑running spans entirely?
3. Is there any official guidance for retry delays that may break trace parent–child relationships?
Thank you very much. Any insights or recommended configurations would be greatly appreciated.