agents,lib,src,test: add traceSampleRate support #430

Open
santigimeno wants to merge 1 commit into node-v24.x-nsolid-v6.x from santi/tracing_rate

Conversation

Member

@santigimeno santigimeno commented Mar 3, 2026

Add end-to-end traceSampleRate handling across config, runtime propagation, tracing decisions, and regression tests.

Why:

  • Enable configurable probabilistic trace sampling with predictable behavior.
  • Ensure consistent semantics across all config entry points.
  • Prevent invalid updates from corrupting current sampling behavior.
  • Keep transaction consistency by deciding sampling at the root span only.

What changed:

  • Added traceSampleRate parsing and normalization in JS config paths with explicit default fallback and finite/range validation in [0, 1].
  • Added native config sanitization for traceSampleRate to reject invalid values before merge, preserving previous valid configuration.
  • Ensured runtime sampling state is synchronized from effective current config after updates to avoid stale shared-memory sample rates.
  • Added gRPC reconfigure support for traceSampleRate in proto and agent mapping, including generated protobuf updates.
  • Updated tracing logic so root spans perform the sampling decision and child spans inherit parent traceFlags.
  • Extended tests for:
    • invalid value handling (including NaN/Infinity)
    • env/package bootstrap behavior
    • partial updates preserving existing traceSampleRate
    • gRPC invalid-update fallback behavior
    • sampling behavior at 0%, 50% (tolerance), and 100%
    • worker-thread sampling behavior
    • explicit parent/child trace consistency assertions
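
The root-only sampling rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in lib/internal/otel/trace.js: `decideTraceFlags`, the two flag constants, and the injectable `random` parameter are hypothetical names introduced here. It assumes a span's parent exposes a `traceFlags` field and that the sampled bit is `0x1`, as in W3C Trace Context.

```javascript
'use strict';

// Hypothetical constants; 0x1 is the "sampled" bit in W3C trace-flags.
const TRACE_FLAGS_NONE = 0x0;
const TRACE_FLAGS_SAMPLED = 0x1;

// Decide sampling once per trace: only a root span (no parent) draws a random
// number; a child copies its parent's flags so a trace is never half-recorded.
// `random` is injectable purely so the decision can be tested deterministically.
function decideTraceFlags(parent, sampleRate, random = Math.random) {
  if (parent !== undefined) {
    return parent.traceFlags;
  }
  // random() is in [0, 1), so a rate of 0 never samples and 1 always samples.
  return random() < sampleRate ? TRACE_FLAGS_SAMPLED : TRACE_FLAGS_NONE;
}

// Usage: whatever the root decides, the child agrees with it.
const root = { traceFlags: decideTraceFlags(undefined, 0.5) };
const child = { traceFlags: decideTraceFlags(root, 0.5) };
console.log(child.traceFlags === root.traceFlags); // true
```

Because `Math.random()` never returns 1, the strict `<` comparison gives exact behavior at both ends of the range, which is what the 0% and 100% test cases listed above rely on.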

Summary by CodeRabbit

  • New Features

    • Probabilistic trace sampling for root spans, configurable via env, package config, or runtime API; default 1.0 and observable from JS at runtime.
    • Runtime reconfiguration preserves and propagates the trace sampling rate.
  • Bug Fixes

    • Invalid, non-finite, or out-of-range sampling values are sanitized/rejected and do not overwrite a prior valid rate.
  • Tests

    • Added tests and fixtures covering config precedence, validation, persistence, and sampling behavior.

@santigimeno santigimeno requested a review from RafaelGSS March 3, 2026 16:01
@santigimeno santigimeno self-assigned this Mar 3, 2026

coderabbitai bot commented Mar 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cdcccd3c-34cb-49bd-bf8c-4a86cbbbe0e1

📥 Commits

Reviewing files that changed from the base of the PR and between 85d36e4 and d4333fe.

📒 Files selected for processing (4)
  • agents/grpc/proto/reconfigure.proto
  • agents/grpc/src/grpc_agent.cc
  • agents/grpc/src/proto/reconfigure.pb.cc
  • agents/grpc/src/proto/reconfigure.pb.h
✅ Files skipped from review due to trivial changes (1)
  • agents/grpc/proto/reconfigure.proto

Walkthrough

Adds a new traceSampleRate (double, 0–1) configuration propagated through protobufs, gRPC reconfigure, runtime C++ APIs, and JS tracing; implements validation, runtime propagation to per-env bindings, probabilistic root-span sampling, and accompanying tests and fixtures.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Proto schema & generated code: agents/grpc/proto/reconfigure.proto, agents/grpc/src/proto/reconfigure.pb.h, agents/grpc/src/proto/reconfigure.pb.cc | Added optional traceSampleRate (double, field 14) to ReconfigureBody. Updated generated C++: field storage, accessors, has-bit layout, parse table, serialization, merge/clear/size logic, and descriptor metadata. |
| gRPC agent logic: agents/grpc/src/grpc_agent.cc | Read traceSampleRate from incoming reconfigure JSON/config and set body->set_tracesamplerate(...); propagate traceSampleRate into outgoing reconfigure JSON when present. |
| Runtime C++ API & binding: src/nsolid/nsolid_api.h, src/nsolid/nsolid_api.cc | Added EnvList atomic trace_sample_rate_ and per-EnvInst trace_sample_rate_; validation helpers (validate_trace_sample_rate, validate_config), update_tracing_sample_rate to sanitize and propagate rate to EnvInsts, and exported JS ArrayBuffer trace_sample_rate for bindings. |
| JS config & tracing: lib/nsolid.js, lib/internal/otel/trace.js | Introduced DEFAULT_TRACING_SAMPLING_RATE, parseTraceSampleRate() for normalization/validation, config handling for traceSampleRate; root-span sampling changed to probabilistic decision using binding.trace_sample_rate, children inherit parent traceFlags. |
| Tests & fixtures: test/fixtures/nsolid-trace-sample-rate-package.json, test/fixtures/test-nsolid-config-trace-sample-rate-env-script.js, test/addons/nsolid-tracing/test-otel-basic2.js, test/agents/test-grpc-reconfigure.mjs, test/parallel/test-nsolid-config-trace-sample-rate.js, test/parallel/test-nsolid-config-trace-sample-rate-env.js, test/parallel/test-nsolid-trace-sample-rate-sampling.js | Added fixtures and tests covering env/package precedence, validation (reject out-of-range/non-finite), reconfigure propagation, child span flag inheritance, and sampling behavior (including worker-thread sampling counts and rate preservation on invalid updates). |
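
The "reject before merge, keep the previous valid rate" behavior summarized for the runtime layer can be illustrated with a small JavaScript analog. The real logic lives in C++ (validate_trace_sample_rate, validate_config); `isValidSampleRate` and `mergeConfig` below are hypothetical names, and the `interval` key is just a placeholder for any unrelated setting.

```javascript
'use strict';

// A rate is acceptable only if it is a finite number in the closed range [0, 1].
function isValidSampleRate(value) {
  return typeof value === 'number' &&
         Number.isFinite(value) &&
         value >= 0 && value <= 1;
}

// Merge a partial update into the current config. An invalid traceSampleRate
// is dropped from the update *before* the merge, so the previous rate survives
// while the other keys in the same update are still applied.
function mergeConfig(current, update) {
  const sanitized = { ...update };
  if ('traceSampleRate' in sanitized &&
      !isValidSampleRate(sanitized.traceSampleRate)) {
    delete sanitized.traceSampleRate;
  }
  return { ...current, ...sanitized };
}

// Usage: the out-of-range rate is ignored, the unrelated key is updated.
const before = { traceSampleRate: 0.4, interval: 3000 };
const after = mergeConfig(before, { traceSampleRate: 2, interval: 1000 });
console.log(after.traceSampleRate, after.interval); // 0.4 1000
```

Sanitizing the update rather than the merged result is the design point: validating after the merge would leave no record of what the previous valid value was.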

Sequence Diagram

sequenceDiagram
    participant App as JavaScript App
    participant Config as nsolid.config
    participant gRPC as gRPC Agent
    participant EnvList as EnvList (C++)
    participant Binding as binding.trace_sample_rate
    participant Tracer as OpenTelemetry Tracer
    participant Span as Root Span

    App->>Config: updateConfig({ traceSampleRate: X })
    Config->>Config: parseTraceSampleRate(X)
    Config->>gRPC: send reconfigure (traceSampleRate: X)
    gRPC->>EnvList: receive reconfigure
    EnvList->>EnvList: validate_trace_sample_rate -> sanitize
    EnvList->>EnvList: update_tracing_sample_rate(X)
    EnvList->>Binding: write trace_sample_rate (shared buffer)

    App->>Tracer: start root span
    Tracer->>Binding: read trace_sample_rate
    Tracer->>Span: decide sampled = Math.random() < trace_sample_rate
    alt sampled
        Span->>Span: set traceFlags = SAMPLED
    else not sampled
        Span->>Span: set traceFlags = NONE
    end
    Span-->>Tracer: return span
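
The "shared buffer" step in the diagram can be pictured with a plain typed-array sketch. It only illustrates the mechanism of one side writing a double into an ArrayBuffer and the tracer reading it through its own view; the buffer here is created in JavaScript as a stand-in, whereas the PR describes it as exported by the native binding. All names in the block are hypothetical.

```javascript
'use strict';

// Stand-in for the buffer the native side would export: 8 bytes, one double.
const traceSampleRateBuffer = new ArrayBuffer(8);

// Two independent views over the same memory: one plays the native writer,
// the other plays the JS tracer that reads the rate on each root span.
const writerView = new Float64Array(traceSampleRateBuffer);
const readerView = new Float64Array(traceSampleRateBuffer);

// Writer side: called after a config update has been validated.
function updateTracingSampleRate(rate) {
  writerView[0] = rate;
}

// Reader side: no function call into native code, just a memory read.
function currentSampleRate() {
  return readerView[0];
}

updateTracingSampleRate(1); // the default rate stated in the PR summary
console.log(currentSampleRate()); // 1

updateTracingSampleRate(0.25);
console.log(currentSampleRate()); // 0.25
```

This also shows why the PR stresses re-synchronizing the buffer from the effective config after every update: the reader trusts whatever value was last written, so a skipped write would leave a stale rate in place.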

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • RafaelGSS
  • EHortua

Poem

🐰 I nibbled a rate from zero to one,

tucked it in configs, now tracing is fun.
Roots roll the dice, children keep the light,
sanitized and shared—every trace takes flight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 4.65% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main change: adding traceSampleRate support across agents, lib, src, and test components. It is concise, specific, and directly reflects the primary objective of the changeset. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
test/agents/test-grpc-reconfigure.mjs (1)

171-208: Consider adding NaN/Infinity invalid-rate cases in this gRPC path test.

This block currently checks only out-of-range finite numbers. Extending it with Number.NaN and ±Infinity would better guard the gRPC reconfigure edge cases already covered in non-gRPC tests.

✅ Minimal test extension
-        const invalidRates = [2, -0.5];
+        const invalidRates = [2, -0.5, Number.NaN, Number.POSITIVE_INFINITY, Number.NEGATIVE_INFINITY];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/agents/test-grpc-reconfigure.mjs` around lines 171 - 208, Update the
test "should preserve previous traceSampleRate for invalid values" to include
NaN and Infinity cases: add Number.NaN, Number.POSITIVE_INFINITY and
Number.NEGATIVE_INFINITY (or +/-Infinity) to the invalidRates array used with
grpcServer.reconfigure and client.config assertions so
grpcServer.reconfigure(agentId, { traceSampleRate: invalidRate }) is exercised
for NaN/Infinity and the existing assert.strictEqual checks that
nsolidConfig.traceSampleRate remained 0.4 still apply.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agents/grpc/src/grpc_agent.cc`:
- Around line 1761-1763: PopulateReconfigureEvent currently omits the
traceSampleRate field when building the outgoing reconfigure event body; update
PopulateReconfigureEvent to set body.traceSampleRate from the internal config so
the outgoing event mirrors inbound updates (i.e., when you previously read
body.tracesamplerate() on incoming updates, ensure PopulateReconfigureEvent
calls the corresponding setter to populate body.traceSampleRate in the event).
Locate the PopulateReconfigureEvent function in grpc_agent.cc and add the
traceSampleRate assignment using the same source/field used for other mapped
settings so reconfigure responses include traceSampleRate.

In `@lib/nsolid.js`:
- Around line 1144-1154: The parseTraceSampleRate function currently coerces
unintended types via unary +; update parseTraceSampleRate to only accept inputs
that are typeof 'number' or 'string', explicitly reject booleans and other
types, trim string inputs and return undefined for empty/whitespace-only
strings, then convert the trimmed numeric string (or the number input) to a
numeric value and validate with NumberIsFinite and range checks (>=0 && <=1);
ensure the function returns undefined for invalid types, whitespace-only
strings, NaN, non-finite numbers, or values outside the 0–1 range while
returning the numeric rate for valid inputs.
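
For reference, the stricter parsing requested here could look roughly like the sketch below. It is not the function from lib/nsolid.js; it uses `Number.isFinite` in place of the `NumberIsFinite` primordial so that it runs standalone. The motivation is that unary `+` coerces more than intended: `+true` is 1, and `+''`, `+'  '`, and `+null` are all 0, each of which would pass a plain [0, 1] range check.

```javascript
'use strict';

// Returns the numeric rate for valid input, undefined for anything else.
function parseTraceSampleRate(value) {
  let rate;
  if (typeof value === 'number') {
    rate = value;
  } else if (typeof value === 'string') {
    const trimmed = value.trim();
    // Number('') is 0, so empty and whitespace-only strings are rejected here.
    if (trimmed === '') return undefined;
    rate = Number(trimmed);
  } else {
    // Booleans, null, undefined, arrays, and objects are rejected outright.
    return undefined;
  }
  // Rejects NaN, +/-Infinity, and anything outside the closed range [0, 1].
  if (!Number.isFinite(rate) || rate < 0 || rate > 1) return undefined;
  return rate;
}

console.log(parseTraceSampleRate(' 0.5 ')); // 0.5
console.log(parseTraceSampleRate(true));    // undefined
console.log(parseTraceSampleRate('   '));   // undefined
```

Note that `Number()` still accepts forms such as `'1e-1'` or `'0x1'`; whether those should be allowed is a separate decision this sketch does not take.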

In `@test/parallel/test-nsolid-config-trace-sample-rate-env.js`:
- Around line 15-18: The test currently spreads process.env into the child env
(env: {...process.env, ...envVars}), which can leak ambient NSOLID_* variables
and make the test nondeterministic; update the env construction in
test/parallel/test-nsolid-config-trace-sample-rate-env.js to explicitly filter
out any keys starting with "NSOLID_" (e.g., build a filteredEnv =
Object.fromEntries(Object.entries(process.env).filter(([k]) =>
!k.startsWith('NSOLID_'))) and then use env: { ...filteredEnv, ...envVars } so
the child process is isolated from ambient NSOLID_* variables while still
allowing envVars to override.
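
The environment isolation suggested here can be factored into a small helper, sketched below. `buildChildEnv` is a hypothetical name, and the `NSOLID_EXAMPLE_*` variables are made-up placeholders rather than real N|Solid settings. The helper takes the parent environment as an argument instead of reading `process.env` directly, which keeps it easy to exercise in isolation.

```javascript
'use strict';

// Build a child-process environment that drops every ambient NSOLID_* key
// from the parent, then layers the test's own variables on top.
function buildChildEnv(parentEnv, envVars) {
  const filteredEnv = Object.fromEntries(
    Object.entries(parentEnv).filter(([key]) => !key.startsWith('NSOLID_')),
  );
  return { ...filteredEnv, ...envVars };
}

// Usage with placeholder variable names.
const env = buildChildEnv(
  { PATH: '/usr/bin', NSOLID_EXAMPLE_AMBIENT: 'leaked' },
  { NSOLID_EXAMPLE_OVERRIDE: '0.5' },
);
console.log(Object.keys(env)); // [ 'PATH', 'NSOLID_EXAMPLE_OVERRIDE' ]
```

In the test itself this would be passed as `env: buildChildEnv(process.env, envVars)` to the child-process call.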

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 369ce34 and 5048dab.

📒 Files selected for processing (15)
  • agents/grpc/proto/reconfigure.proto
  • agents/grpc/src/grpc_agent.cc
  • agents/grpc/src/proto/reconfigure.pb.cc
  • agents/grpc/src/proto/reconfigure.pb.h
  • lib/internal/otel/trace.js
  • lib/nsolid.js
  • src/nsolid/nsolid_api.cc
  • src/nsolid/nsolid_api.h
  • test/addons/nsolid-tracing/test-otel-basic2.js
  • test/agents/test-grpc-reconfigure.mjs
  • test/fixtures/nsolid-trace-sample-rate-package.json
  • test/fixtures/test-nsolid-config-trace-sample-rate-env-script.js
  • test/parallel/test-nsolid-config-trace-sample-rate-env.js
  • test/parallel/test-nsolid-config-trace-sample-rate.js
  • test/parallel/test-nsolid-trace-sample-rate-sampling.js

RafaelGSS
RafaelGSS previously approved these changes Mar 3, 2026
Member

@RafaelGSS RafaelGSS left a comment

RSLGTM

@santigimeno santigimeno requested review from EHortua and RafaelGSS March 9, 2026 09:04