[Tracing] Implement initial OTLP traces weblog tests#6363
[Tracing] Implement initial OTLP traces weblog tests#6363zacharycmontoya wants to merge 32 commits intomainfrom
Conversation
…ple. Notably, this creates a new scenario APM_TRACING_OTLP to enable the environment variables needed to configure the SDK to export traces as OTLP.
…e only use the "/v1/metrics" subpath
|
|
|
✨ Fix all issues with BitsAI or with Cursor
|
…v, since users do not need this feature for OTLP export to work
…e we're not immediately implementing that
…ing specifically: - At runtime determine if the request is JSON - If JSON, look up proto field names by their camelCase representation. Otherwise, look up field names by their snake_case representation - If JSON, assert that the 'traceId' and 'spanId' fields are case-insensitive hexadecimal strings, rather than base64-encoded strings - If JSON, assert that enums (e.g. span.kind and span.status.code) are encoded using an integer, not a string representation of the enum value name - Regardless of protocol, get the time before and after the test HTTP request is issued, and assert that the span's reported 'start_time_unix_nano' and 'end_time_unix_nano' fall in this range - Regardless of protocol, make the 'http.method' and 'http.status_code' span attribute assertions more flexible by also testing against their stable OpenTelemetry HTTP equivalents of 'http.request.method' and 'http.response.status_code', respectively
…ision of 0 is respected for OTLP traces by default
Since JSON must be expressed in lowerCamelCase (according to the OpenTelemetry spec), we can consolidate our parsing and assertions on that style of field names
….py to the proxy, in utils/proxy/traces/otlp_v1.py
… explain why other scenarios are facing issues with the "Library not ready" messages
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f73ad4c0b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| yield data.get("request"), content, span | ||
| break # Skip to next span |
There was a problem hiding this comment.
Return all matching OTLP spans for a request
Dropping out of the span loop after the first RID match causes get_otel_spans() to undercount spans when a single OTLP payload contains multiple matching spans in the same scopeSpans block (for example, server + framework spans that both carry the request user-agent tag). This can make tests/otel/test_tracing_otlp.py pass even when extra spans are exported, weakening the regression signal for this new scenario.
Useful? React with 👍 / 👎.
|
As far as I can tell, it's only the |
| # Assert that the span fields match the expected values | ||
| span_start_time_ns = int(span["startTimeUnixNano"]) | ||
| span_end_time_ns = int(span["endTimeUnixNano"]) | ||
| assert span_start_time_ns >= self.start_time_ns |
There was a problem hiding this comment.
this line is flaky for me.
Claude says "self.start_time_ns is captured using time.time_ns() on the test host machine, while span_start_time_ns comes from the weblog container. If the container's clock is behind the host's clock (even by a small amount due to clock skew or Docker clock drift), the span start time will appear earlier than self.start_time_ns, failing the assertion".
Does it make any sense? or should I look into my code? :3
There was a problem hiding this comment.
I'm getting this issue too, I'll look into fixing this!
There was a problem hiding this comment.
OK I think I've fixed it in the latest set of commits by making a call to the weblog container to get the time via Unix's date utility. Let me know if you're still experiencing this!
|
There are some recent changes on this scenario (#6445), @obordeau or @smola , do we expect to have some instability here? @zacharycmontoya, waiting for a response, you can rebase. If it does not fix the issue, I'll look closely, and if needed, force-merge on your call. |
I think the SDS cassette broke (the JSON looks badly formatted) with this PR and made the AI Guard scenario failing. Looking on a fix :) cc @cbeauchesne |
|
Gotta log off, trying to fix it here Fix AI Guard SDS cassette |
…' call to the weblog container
…s the tracing OTLP export feature
|
Just merged the fix Fix AI Guard SDS system test, you can update your branch. Sorry for the inconvenience 🥲 |
.github/workflows/run-end-to-end.yml
Outdated
| if: always() && steps.build.outcome == 'success' && contains(inputs.scenarios, '"APM_TRACING_OTLP"') | ||
| run: ./run.sh APM_TRACING_OTLP | ||
| env: | ||
| DD_API_KEY: ${{ secrets.DD_API_KEY }} |
There was a problem hiding this comment.
In theory, we don't need this, you can use a fake key and it'll work the same (and if not, ping me, I'll fix that).
There was a problem hiding this comment.
Yes I think you're right. I'll either remove this entirely or use a fake key, stay tuned for updates 😎
- Assert that resource attribute telemetry.sdk.name=datadog - Assert that span attribute span.type=web - Assert that span attribute operation.name is present
…on't implement the latest required span attributes
2538f35 to
470194c
Compare
Motivation
We are seeing increased demand for exporting traces as OTLP from our DD SDKs (rather than Datadog-proprietary MessagePack), so we are prototyping and establishing requirements for generating OTLP traces payloads. This is only the first of a series of PRs to establish clear expectations for what the generated OTLP traces and trace stats will look like.
Changes
APM_TRACING_OTLPscenario to test the weblog application with the configuration needed for the DD SDK to export traces using OTLP. This also adds ainclude_opentelemetryproperty to theEndToEndScenarioto set up the OpenTelemetry interface.tests/otel/test_tracing_otlp.py::Test_Otel_Tracing_OTLP::test_tracingto send a request to the weblog app usingweblog.get("/")and asserts properties of the OTLP trace payloadget_otel_spansandget_trace_statsto retrieve the OTLP payloads for test assertionsWorkflow
🚀 Once your PR is reviewed and the CI green, you can merge it!
🛟 #apm-shared-testing 🛟
Reviewer checklist
tests/ormanifests/is modified ? I have the approval from R&P teambuild-XXX-imagelabel is present