feat(rand): add DD_TRACE_SECURE_RANDOM support to span ID generation #3873

Merged
bwoebi merged 2 commits into DataDog:master from litianningdatadog:tianning.li/dd-trace-secure-random on May 15, 2026

litianningdatadog (Contributor) commented May 11, 2026

Tech Doc

https://datadoghq.atlassian.net/browse/SVLS-9142

Background

To reduce cold-start latency, Firecracker-based container platforms snapshot the entire process memory of a warmed-up instance and reuse it to launch new instances. Every resumed instance starts from the same frozen memory image, including any userspace PRNG state that was initialized before the snapshot was taken.

Motivation

Standard PRNGs seed once at startup and produce a deterministic sequence from that point forward. When Firecracker resumes thousands of instances from the same snapshot, every one of them begins at the same position in that sequence. Concurrent instances then generate identical trace IDs and span IDs, corrupting distributed traces and making sampled data statistically meaningless.
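
The collision described above can be reproduced in a few lines. The following is a minimal sketch, assuming a SplitMix64 generator as a stand-in for a tracer's PRNG; the `demo_*` names are hypothetical and not part of any tracer. Copying the state struct plays the role of restoring two instances from one snapshot.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in userspace PRNG (SplitMix64). This is an illustrative assumption,
 * not the generator used by any Datadog tracer. */
typedef struct {
    uint64_t state;
} demo_prng;

static uint64_t demo_prng_next(demo_prng *p) {
    uint64_t z = (p->state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Simulates two instances resumed from one memory snapshot.
 * Returns 1 if both instances emit the same sequence of `draws` IDs. */
static int demo_snapshot_collides(uint64_t seed, int draws) {
    assert(draws > 0);

    demo_prng warmed = { seed };
    (void)demo_prng_next(&warmed);               /* work done before the snapshot */

    demo_prng instance_a, instance_b;
    memcpy(&instance_a, &warmed, sizeof warmed); /* first "restore" */
    memcpy(&instance_b, &warmed, sizeof warmed); /* second "restore" */

    for (int i = 0; i < draws; i++) {
        if (demo_prng_next(&instance_a) != demo_prng_next(&instance_b)) {
            return 0;
        }
    }
    return 1;
}
```

Because the generator is a pure function of its state, `demo_snapshot_collides` returns 1 for any seed and any number of draws: nothing inside either copy can tell it apart from the other.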

The fix cannot live inside each language tracer: from inside a resumed process, a cold start and a snapshot restore are indistinguishable. The process simply wakes up already initialized — no changed PID, no kernel signal, no env var that differs between the two cases.

Solution

serverless-init (PID 1) is the only process that knows, before exec, that the child will run in a Firecracker snapshot environment. It injects DD_TRACE_SECURE_RANDOM=true into the child's environment before launch. Each tracer reads this flag at startup and switches ID generation to draw directly from the kernel entropy pool on every call (getrandom(2)). With no userspace PRNG state to freeze, every resumed instance generates an independent sequence — regardless of when the snapshot was taken.
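
The launcher side of this handoff can be sketched as follows. This is a hedged illustration in C, not serverless-init's actual code: `demo_inject_flag` and `demo_exec_child` are hypothetical names, and the sketch only shows that a variable set before `execvp` is inherited by the program that replaces the process image. A real init process would typically fork before calling exec so that it can keep supervising the child.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes setenv() under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: set the flag in the current environment so that any
 * program started via exec inherits it. Returns 0 on success, -1 on error. */
static int demo_inject_flag(void) {
    return setenv("DD_TRACE_SECURE_RANDOM", "true", 1 /* overwrite */);
}

/* Hypothetical helper: inject the flag, then replace this process with the
 * child program. Returns -1 only if injection or exec fails. */
static int demo_exec_child(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    if (demo_inject_flag() != 0) {
        return -1;
    }
    execvp(argv[0], argv);
    return -1; /* reached only when execvp itself failed */
}
```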

Summary

  • Adds DD_TRACE_SECURE_RANDOM boolean config option (default false) to ext/configuration.h
  • When DD_TRACE_SECURE_RANDOM=true, ddtrace_generate_span_id() calls php_random_bytes_silent() — which reads from the OS entropy pool (getrandom(2)) per call — instead of the MT19937-64 PRNG
  • The existing MT path and ddtrace_seed_prng() RINIT seeding are fully preserved for non-secure-random deployments
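
The dispatch described in the summary can be approximated outside PHP. The sketch below is not the code in ext/random.c: it calls getrandom(2) directly instead of php_random_bytes_silent(), uses SplitMix64 as a stand-in for MT19937-64, and reads the environment on every call rather than once at startup. All `demo_*` names, the accepted flag spellings, and the fallback to the PRNG on a kernel error are assumptions made for illustration. It assumes Linux with glibc 2.25 or later.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes getenv/setenv-family APIs under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/random.h>
#include <sys/types.h>

/* Stand-in for the seeded userspace PRNG path (SplitMix64, not MT19937-64). */
static uint64_t demo_prng_state;

static void demo_seed_prng(uint64_t seed) { demo_prng_state = seed; }

static uint64_t demo_prng_next(void) {
    uint64_t z = (demo_prng_state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Hypothetical flag parsing; the real config parser may accept other values. */
static int demo_secure_random_enabled(void) {
    const char *v = getenv("DD_TRACE_SECURE_RANDOM");
    return v != NULL && (strcmp(v, "true") == 0 || strcmp(v, "1") == 0);
}

/* Fills one 64-bit value from the kernel on each call, keeping no userspace
 * state between calls. Returns 0 on success, -1 on error. */
static int demo_kernel_u64(uint64_t *out) {
    assert(out != NULL);
    unsigned char buf[sizeof(uint64_t)];
    size_t filled = 0;
    while (filled < sizeof buf) {
        ssize_t n = getrandom(buf + filled, sizeof buf - filled, 0);
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted by a signal: retry */
            }
            return -1;
        }
        filled += (size_t)n;
    }
    memcpy(out, buf, sizeof buf);
    return 0;
}

/* Hypothetical analogue of the span ID dispatch: kernel entropy when the
 * flag is set, the seeded PRNG otherwise (or if the kernel call fails,
 * which is an assumption of this sketch). */
static uint64_t demo_generate_span_id(void) {
    uint64_t id;
    if (demo_secure_random_enabled() && demo_kernel_u64(&id) == 0) {
        return id;
    }
    return demo_prng_next();
}
```

With the flag unset, reseeding with the same value reproduces the same ID. With the flag set to true, the seed has no influence on the result, which is the property the PR's secure_random_ignores_prng_seed.phpt test checks for the real extension.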

Motivation

PHP's span ID generator uses MT19937-64, a Mersenne Twister whose
312-element state array lives as a static C variable in ddtrace.so.
The state is seeded at RINIT (once per PHP request) from
php_random_int_silent(), which is cryptographically secure, but the
derived PRNG state is captured verbatim in any process memory snapshot.

In process-snapshot environments, instances restored from the same
snapshot have identical MT state and thus produce the same span ID
sequence. When DD_TRACE_SECURE_RANDOM=true, the MT is bypassed
entirely: php_random_bytes_silent() calls getrandom(2) directly
on each invocation with no userspace PRNG state, so snapshot-restored
instances draw independent entropy immediately.

This flag is intended to be injected automatically by serverless-init,
the container's init process (PID 1), before it spawns the user application.

Test plan

  • tests/ext/secure_random_generates_nonzero_ids.phpt — verifies non-zero,
    distinct IDs under the DD_TRACE_SECURE_RANDOM=true path
  • tests/ext/secure_random_ignores_prng_seed.phpt — verifies that a fixed
    DD_TRACE_DEBUG_PRNG_SEED does not constrain output when
    DD_TRACE_SECURE_RANDOM=true (proving the MT is fully bypassed)
  • Both tests pass on PHP 8.3 debug build

Compatibility

php_random_bytes_silent is available via the existing conditional includes
in ext/random.c: ext/standard/php_random.h (PHP < 8.4) and
ext/random/php_random.h (PHP ≥ 8.4). No new header dependencies are needed.

🤖 Generated with Claude Code

When DD_TRACE_SECURE_RANDOM=true, ddtrace_generate_span_id() bypasses
the MT19937-64 thread-local state and calls php_random_bytes_silent()
instead, which reads from the OS entropy pool (getrandom(2)) on every
invocation with no userspace PRNG state.

This ensures span IDs are drawn from the kernel entropy pool on every
call, making them safe in process-snapshot environments where PRNG state
seeded at startup would be identical across all resumed instances.

The existing MT path and ddtrace_seed_prng() RINIT seeding are
unchanged; the 128-bit trace ID .time component is a Unix timestamp and
requires no fix.

Tests: secure_random_generates_nonzero_ids.phpt verifies non-zero
distinct IDs under the CSPRNG path; secure_random_ignores_prng_seed.phpt
verifies that a fixed DD_TRACE_DEBUG_PRNG_SEED does not constrain output
when DD_TRACE_SECURE_RANDOM=true.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

litianningdatadog marked this pull request as ready for review May 13, 2026 01:57
litianningdatadog requested a review from a team as a code owner May 13, 2026 01:57

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ad49f5d1d2

Comment thread ext/random.c

datadog-official Bot commented May 15, 2026

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 60.67% (-0.02%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 45ba56e

@bwoebi bwoebi merged commit 09d8943 into DataDog:master May 15, 2026
2105 of 2122 checks passed