feat: Add support for upstream trace headers as request ID#2124
feat: Add support for upstream trace headers as request ID#2124captainp1an3t wants to merge 3 commits into
Conversation
5d6cd69 to
db4ea25
Compare
matt0x6F
left a comment
There was a problem hiding this comment.
Thanks for your contribution to Athens!
This is good work, but I'm curious if we could rather derive the request ID from the active span context rather than switching on header values as such.
|
|
||
| const ( | ||
| SourceAthens Source = "athens" | ||
| SourceB3 Source = "b3" |
There was a problem hiding this comment.
B3 supports multiple and single header propagation: https://github.com/openzipkin/b3-propagation#single-header
I'll take a look if this is possible. My assumption is that it depends on what the capabilities of the opencensus middleware are, but I'll see what I can do. |
I haven't abandoned this, but I need to investigate how #2138 impacts my changes. |
Use OTel propagator to extract trace ID from incoming W3C traceparent or B3 headers as the request ID. Falls back to Athens-Request-ID header, then generates a UUID. Propagator is registered unconditionally so trace correlation works regardless of exporter configuration.
315aa33 to
5979345
Compare
|
@matt0x6F Can you take a fresh look at this? |
Correlate Athens request IDs with distributed traces
Problem
When Athens runs behind a service mesh (Istio, Linkerd) or API gateway, there's no way to correlate Athens logs with distributed traces. The mesh injects trace headers (
traceparent,x-b3-traceid) into incoming requests, but Athens ignores them and generates its own request ID. This forces teams to manually correlate logs and traces by timestamp and path.Solution
Use the OpenTelemetry propagator to extract trace context from incoming requests and adopt the trace ID as the Athens request ID. If trace headers are present, the trace ID is used; otherwise the existing behavior is preserved.
Note: The original PR was proposed prior to #2138 getting merged. The conversation/comments history may appear disjointed.
Capture Order of the incoming trace id:
traceparent(W3C) orb3/X-B3-TraceId(Zipkin) headers → use itAthens-Request-IDheader (legacy) → use itWhat changed
pkg/middleware/requestid.gopkg/middleware/requestid_test.gopkg/observ/observ.goRegisterPropagator()(W3C TraceContext + B3); called unconditionally so trace correlation works even without a trace exportercmd/proxy/actions/app.goobserv.RegisterPropagator()unconditionally before exporter setupgo.mod/go.sumgo.opentelemetry.io/contrib/propagators/b3Design note
There was an attempt to satisfy this comment HOWEVER
otelhttp.NewHandler(which wraps the router) always creates a span with a valid trace ID, even when no trace headers are present. Thus there is no way to distinguish "trace ID came from an upstream header" vs. "otelhttp auto-generated one" simply by looking span alone.So the middleware does not read the trace ID from the active span context (
trace.SpanFromContext). Instead, the middleware re-invokes the global OTel propagator against the request headers to detect whether a remote trace context was actually propagated. This is format-agnostic (supports any propagation format registered in the composite propagator) and avoids manual header parsing, but it is technically re-extracting from headers rather than purely deriving from the span.Backwards compatibility
No breaking changes: Existing deployments that use
Athens-Request-IDor no trace headers see identical behavior. Only when trace headers are present does behavior change (request ID becomes the trace ID instead of a UUID).How I Tested It
Unit tests
Added a few unit tests for failure cases. Ran full suite.
Integration Tests
Ran Athens in Docker (
make run-docker) with Mongo 6 and Jaeger.Note: OTLP tracing was enabled in the test environment but is not required as trace correlation works regardless of exporter configuration.
Test 1: W3C
traceparentTest 2: B3 multi-header
Test 3: B3 single header
Test 4: Both W3C and B3 present → W3C wins
Test 5:
Athens-Request-IDheader (no trace headers)Test 6: No headers → UUID fallback