[SVLS-8349] Add CPU Enhanced Metrics in Windows Azure Functions by kathiehuang · Pull Request #133 · DataDog/serverless-components

kathiehuang · 2026-05-18T17:55:01Z

What does this PR do?

Adds CPU usage enhanced metric azure.functions.enhanced.cpu.usage (nanocores) for Windows Azure Functions. This feature is already available on Linux Azure Functions (see #77).

Reads cumulative CPU time using the Windows Job Object API, which accumulates total user and kernel CPU time across all processes in the job object.
- A Job Object in a Windows Azure Function consists of its IIS worker process, language worker process, and our agent process.
  - How we found this is explained in Calculating CPU Enhanced Metrics in Windows and Linux Azure Functions.

Motivation

https://datadoghq.atlassian.net/browse/SVLS-8349

Describe how to test/QA your changes

Build with serverless-compat-self-monitoring.

This was deployed with the serverless-compat-self-monitoring pipeline across all runtimes and hosting plans. All Windows hosting plans were tested to confirm that metrics are submitted correctly.

Testing that compares the metric against Azure Monitor is documented in the internal doc Enhanced Metrics in the Serverless Compatibility Layer.

Copilot

Pull request overview

Adds Windows support for the Azure Functions enhanced CPU usage metric (azure.functions.enhanced.cpu.usage) by reading cumulative CPU time from the Windows Job Object API and enabling enhanced metrics in Windows Azure Functions.

Changes:

- Enable DD_ENHANCED_METRICS_ENABLED-controlled enhanced metrics for Azure Functions on Windows (no longer hard-disabled).
- Implement Windows CPU total collection via QueryInformationJobObject and convert Job Object accounting time to nanoseconds for downstream usage-rate computation.
- Add the windows crate dependency behind the existing windows-enhanced-metrics feature.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| crates/datadog-serverless-compat/src/main.rs | Enables enhanced metrics in Azure Functions based on `DD_ENHANCED_METRICS_ENABLED` for all platforms (including Windows). |
| crates/datadog-metrics-collector/src/azure_windows.rs | Implements Job Object CPU time reads for Windows to support enhanced CPU usage metric computation. |
| crates/datadog-metrics-collector/Cargo.toml | Adds optional `windows` dependency and wires it to `windows-enhanced-metrics`. |
| Cargo.lock | Locks new `windows` crate transitive dependencies for Windows builds. |

Comments suppressed due to low confidence (1)

crates/datadog-metrics-collector/src/azure_windows.rs:41

This adds new Windows-specific CPU collection logic, but there are no unit tests covering the conversion from Job Object 100-ns units to the CpuStats.total nanoseconds value (including overflow/negative handling). Consider extracting the conversion into a pure function and adding a small #[cfg(windows)] test to prevent regressions.

            // TotalUserTime and TotalKernelTime are in 100-nanosecond units - multiply by 100 to get nanoseconds
            let total_ns = (info.TotalUserTime + info.TotalKernelTime) as u64 * 100;
            Some(CpuStats { total: total_ns })
        }

kathiehuang · 2026-05-18T19:00:53Z

+//! This module provides functionality to read CPU usage from Windows Job Objects.
 //!
-//! All CPU metrics will be reported in nanocores (1 core = 1,000,000,000 nanocores).
+//! All CPU metrics are reported in nanocores (1 core = 1,000,000,000 nanocores).

 use crate::azure_cpu::{CpuStats, CpuStatsReader};


Specified that CPU time is stored in nanoseconds and reported in nanocores in 0e97d78
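
For context on the units: a cumulative CPU time in nanoseconds only becomes a nanocore rate once two samples are compared. The downstream usage-rate computation is not part of this diff, so the following is a minimal sketch under assumptions. The function name `cpu_usage_nanocores` and its parameters are hypothetical and not taken from the crate.

```rust
/// Hypothetical helper (not from the crate): turns two cumulative CPU-time
/// samples, both in nanoseconds, into a usage rate in nanocores.
///
/// nanocores = (CPU nanoseconds consumed / wall-clock nanoseconds elapsed) * 1e9
fn cpu_usage_nanocores(
    prev_total_ns: u64,
    curr_total_ns: u64,
    elapsed_wall_ns: u64,
) -> Option<f64> {
    // No time elapsed between samples: a rate is undefined.
    if elapsed_wall_ns == 0 {
        return None;
    }
    // A cumulative counter should never go backwards; treat that as "no data".
    let delta_cpu_ns = curr_total_ns.checked_sub(prev_total_ns)?;
    Some(delta_cpu_ns as f64 / elapsed_wall_ns as f64 * 1_000_000_000.0)
}
```

Under this sketch, one second of CPU time consumed over one second of wall-clock time yields 1,000,000,000 nanocores, which matches the "1 core = 1,000,000,000 nanocores" note in the module comment.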

kathiehuang · 2026-05-18T19:01:54Z

+fn read_cpu_usage_from_job_object() -> Option<CpuStats> {
+    let mut info = JOBOBJECT_BASIC_ACCOUNTING_INFORMATION::default();
+    let result = unsafe {
+        QueryInformationJobObject(
+            None, // If the handle is None, the current process's job object is used
+            JobObjectBasicAccountingInformation, // The type of info to retrieve
+            &mut info as *mut _ as *mut _, // Pointer to the struct that will store the info
+            std::mem::size_of::<JOBOBJECT_BASIC_ACCOUNTING_INFORMATION>() as u32,
+            None,
+        )
+    };
+    match result {
+        Ok(()) => {


Added a safety comment in 0e97d78 following https://std-dev-guide.rust-lang.org/policy/safety-comments.html

kathiehuang · 2026-05-18T19:02:32Z

+    match result {
+        Ok(()) => {
+            // TotalUserTime and TotalKernelTime are in 100-nanosecond units - multiply by 100 to get nanoseconds
+            let total_ns = (info.TotalUserTime + info.TotalKernelTime) as u64 * 100;


Guarded against overflow by using try_from and checked add/mul in 0e97d78
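
The updated code from 0e97d78 is not quoted in this thread. A minimal sketch of what a `try_from` plus checked add/mul conversion could look like, extracted as a pure function so it can be unit tested, is shown here. The name `job_object_time_to_ns` is hypothetical, and the sketch assumes the two accounting fields arrive as signed 64-bit integers in 100-nanosecond units.

```rust
/// Hypothetical pure helper (name not taken from the PR): converts Job Object
/// accounting times, given in 100-nanosecond units, into total nanoseconds.
///
/// Returns `None` if either input is negative or if the arithmetic overflows.
fn job_object_time_to_ns(total_user_100ns: i64, total_kernel_100ns: i64) -> Option<u64> {
    // try_from rejects negative values instead of wrapping them as `as u64` would.
    let user = u64::try_from(total_user_100ns).ok()?;
    let kernel = u64::try_from(total_kernel_100ns).ok()?;
    // checked_add / checked_mul return None on overflow instead of wrapping or panicking.
    user.checked_add(kernel)?.checked_mul(100)
}
```

Compared with `(user + kernel) as u64 * 100`, this version turns a negative or overflowing reading into a missing sample rather than a silently wrong value.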

kathiehuang · 2026-05-18T19:02:43Z

+        Err(_) => {
+            debug!("Failed to read CPU usage from Job Object");


Logged the error in 0e97d78

kathiehuang · 2026-05-18T19:04:16Z

@codex review

chatgpt-codex-connector · 2026-05-18T19:07:09Z

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e5cf36972a

chatgpt-codex-connector · 2026-05-18T20:26:39Z

 libdd-common = { git = "https://github.com/DataDog/libdatadog", rev = "4ae8ebe252451374c292efd159ce254c3f5a72e0", default-features = false }

+[target.'cfg(windows)'.dependencies]
+windows = { version = "0.62", features = ["Win32_System_JobObjects"], optional = true }


Add Win32_Foundation to the Windows bindings

When building the Windows target with windows-enhanced-metrics, this dependency only enables Win32_System_JobObjects, but the windows bindings gate QueryInformationJobObject/HANDLE behind the Win32_Foundation feature as well. As a result, azure_windows.rs cannot resolve the imported API in the Windows enhanced-metrics build; include Win32_Foundation in this feature list.

Win32_Foundation is already included transitively through the feature chain defined in the windows crate's Cargo.toml:

Win32_System_JobObjects = ["Win32_System"]
Win32_System = ["Win32"]
Win32 = ["Win32_Foundation"]

The passing CI build confirms this.

duncanpharvey · 2026-05-18T20:51:18Z

+libdd-common = { git = "https://github.com/DataDog/libdatadog", rev = "4ae8ebe252451374c292efd159ce254c3f5a72e0", default-features = false }
+
+[target.'cfg(windows)'.dependencies]
+windows = { version = "0.62", features = ["Win32_System_JobObjects"], optional = true }


It looks like windows pulls in additional dependencies. Are all of these dependencies needed? If you set default-features = false, are some or all of them omitted while CPU stats can still be collected?

Let's try to minimize the increase in build artifact size as much as possible.

serverless-components/Cargo.lock

Lines 3773 to 3778 in e5cf369

dependencies = [
 "windows-collections",
 "windows-core",
 "windows-future",
 "windows-numerics",
]

I tried setting default-features = false but it didn't change the Cargo.lock - it looks like all four of these are hard dependencies of the windows crate and not features: https://github.com/microsoft/windows-rs/blob/d3a079bfe1bde71bceebfc6628f4c55a044cd501/crates/libs/windows/Cargo.toml#L15

I think if we want to avoid these subcrates we could switch to windows-sys. The code would be slightly more verbose but the compile footprint would be smaller: https://kennykerr.ca/rust-getting-started/windows-or-windows-sys.html

I tested windows-sys: the output was the same, the metrics look right, and the binary size went down by 11,264 bytes. Updated in c404f84

apiarian-datadog · 2026-05-18T21:11:58Z

+    // SAFETY: `info` is a stack-allocated `JOBOBJECT_BASIC_ACCOUNTING_INFORMATION` initialized via `default()`, so the compiler guarantees its alignment.
+    // The buffer size argument is `size_of::<JOBOBJECT_BASIC_ACCOUNTING_INFORMATION>()`, which exactly matches `info`, so the API cannot write out of bounds.
+    // Passing `None` for the job handle is documented to use the current process's job object.
+    let result = unsafe {


can we add unit tests for this? do we already have windows test runners in ci? if not, maybe now with "unsafe" in the mix is a good time to add them?

Yes, we have Windows test runners in CI! I added unit tests for the CPU time conversion logic that cover negative values and overflow in a75f3b5, but I think the only way to test the unsafe block would be to check that the API call works and that read_cpu_usage_from_job_object doesn't return None?

Calculate CPU usage and submit metric

fd1b9e7

kathiehuang requested a review from Copilot May 18, 2026 17:57

Copilot started reviewing on behalf of kathiehuang May 18, 2026 17:58

Copilot AI reviewed May 18, 2026

kathiehuang added 2 commits May 18, 2026 15:00

Guard against overflow, add safety comment for QueryInformationJobObject, log error

0e97d78

Update license

9cb4483

nit: whitespace

e5cf369

kathiehuang marked this pull request as ready for review May 18, 2026 20:22

kathiehuang requested review from a team as code owners May 18, 2026 20:22

kathiehuang requested review from apiarian-datadog, duncanpharvey and jchrostek-dd and removed request for a team May 18, 2026 20:22

chatgpt-codex-connector Bot reviewed May 18, 2026

duncanpharvey reviewed May 18, 2026

apiarian-datadog reviewed May 18, 2026

Add unit tests for CPU time conversion logic

a75f3b5

kathiehuang added 2 commits May 19, 2026 15:35

Use windows-sys instead of windows

c404f84

Update license

001bc15
