[BUG6108338] Update windows documentation for onnxruntime quantization with Cuda13.x#1368

Merged
kevalmorabia97 merged 1 commit into main from
ynankani/bug_6108338
Apr 29, 2026

Conversation


@ynankani ynankani commented Apr 29, 2026

What does this PR do?

Type of change: Documentation

Update windows documentation for onnxruntime quantization with Cuda13.x

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: N/A

Summary by CodeRabbit

  • Documentation
    • Updated Windows installation guide with CUDA 13.x-specific setup instructions for GPU-accelerated dependencies, including CuPy and ONNX Runtime configuration with nightly builds.

…n with Cuda13.x

Signed-off-by: ynankani <ynankani@nvidia.com>

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Documentation update to the Windows installation guide, adding CUDA 13.x-specific setup instructions. Users are guided to replace CUDA 12-based packages (CuPy and ONNX Runtime) with their CUDA 13 equivalents using specified nightly builds and a pre-release package.

Changes

  • Windows Installation Documentation: docs/source/getting_started/windows/_installation_standalone.rst
    Added CUDA 13.x-specific installation instructions, including commands to uninstall CUDA 12 packages and install cupy-cuda13x, ONNX Runtime CUDA 13 nightly builds, and the onnxruntime-genai-cuda pre-release package.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks: 6 passed ✅
  • Description Check (Passed): Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check (Passed): The title clearly and specifically describes the main change: updating Windows documentation for ONNX Runtime quantization with CUDA 13.x. It directly relates to the documentation update shown in the raw summary.
  • Docstring Coverage (Passed): No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Linked Issues Check (Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check (Passed): Check skipped because no linked issues were found for this pull request.
  • Security Anti-Patterns (Passed): The modified documentation file contains only reStructuredText content with no Python code changes.


@ynankani ynankani requested a review from kevalmorabia97 April 29, 2026 10:06

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/source/getting_started/windows/_installation_standalone.rst (1)

67-82: Consider adding context about nightly builds and onnxruntime-genai-cuda.

While the instructions are technically correct for users with CUDA 13.x, the documentation could be enhanced with additional context:

  1. Purpose of onnxruntime-genai-cuda: The documentation doesn't explain what this package is or why it's being installed. Consider adding a brief explanation for users unfamiliar with the GenAI components.

  2. Nightly build implications: Using nightly builds from a custom index URL can have implications for stability and support. Consider adding a note about this, such as:

    • "Note: These instructions use nightly builds which may be less stable than official releases."
  3. When to apply these instructions: While line 67 mentions "If you are using CUDA 13.x", consider making this more prominent (e.g., using a note/warning admonition in reStructuredText).

Example enhancement:

.. note::
   **Only for CUDA 13.x users**: If you have CUDA 13.x installed, follow these steps to update the GPU acceleration packages. These instructions use nightly builds which may be less stable than official releases.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/getting_started/windows/_installation_standalone.rst` around
lines 67 - 82, Add brief explanatory context and a stability warning around the
CUDA 13.x steps: explain what onnxruntime-genai-cuda provides (GenAI-specific
GPU-accelerated runtime components) and why users install
onnxruntime-gpu/cupy-cuda13x, add a short admonition (reST note/warning) that
these steps are only for CUDA 13.x users, and include a clear warning that the
ONNX Runtime CUDA 13 packages are nightly builds from a custom index URL
(https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ort-cuda-13-nightly/...)
which may be less stable and not officially supported; place this note above the
numbered uninstall/install commands so readers see the caveats before running
the pip commands (reference symbols: onnxruntime-genai-cuda, onnxruntime-gpu,
cupy-cuda13x, CUDA 13.x, nightly).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 46756e41-a013-4041-9d8f-353a6442d80b

📥 Commits

Reviewing files that changed from the base of the PR and between 077e29a and a4cac9e.

📒 Files selected for processing (1)
  • docs/source/getting_started/windows/_installation_standalone.rst

Comment on lines +71 to +74
1. Uninstall ``cupy-cuda12x`` and install ``cupy-cuda13x``.
2. Uninstall ``onnxruntime-genai-cuda`` and ``onnxruntime-gpu``.
3. Install ONNX Runtime CUDA 13 nightly and the pre-release ``onnxruntime-genai-cuda`` package.
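The three numbered steps quoted above can be sketched as pip commands. This is an illustrative sketch, not the guide's exact wording: the script only prints the commands so they can be reviewed before running (pipe the output into `sh` to execute), and ORT_CUDA13_INDEX is a placeholder, since the review shows only a truncated nightly feed URL; take the full index URL from the installation guide itself.

```shell
#!/bin/sh
# Placeholder index URL: the real ort-cuda-13-nightly feed URL is
# truncated in the review, so substitute the one from the docs.
ORT_CUDA13_INDEX="https://example.invalid/ort-cuda-13-nightly/simple"

# Print (rather than run) the pip commands for the three steps.
cuda13_pip_commands() {
    # Step 1: swap CuPy to the CUDA 13 build.
    echo "pip uninstall -y cupy-cuda12x"
    echo "pip install cupy-cuda13x"
    # Step 2: remove the CUDA 12 ONNX Runtime packages.
    echo "pip uninstall -y onnxruntime-genai-cuda onnxruntime-gpu"
    # Step 3: install the CUDA 13 nightly ONNX Runtime from the custom
    # index, plus the pre-release onnxruntime-genai-cuda package.
    echo "pip install --pre onnxruntime-gpu --index-url $ORT_CUDA13_INDEX"
    echo "pip install --pre onnxruntime-genai-cuda"
}

cuda13_pip_commands
```

Printing first mirrors the review's caution about nightly builds: the commands touch GPU runtime packages, so it is worth confirming them against the current environment before executing.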


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for any references to onnxruntime-genai in project configuration files

# Search for onnxruntime-genai in dependency files
rg -i 'onnxruntime-genai' --type=toml --type=txt --type=cfg -g '!**/*.rst'

Repository: NVIDIA/Model-Optimizer

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Search for onnxruntime-genai references with corrected syntax
rg -i 'onnxruntime-genai' --type=toml --type=txt

Repository: NVIDIA/Model-Optimizer

Length of output: 156


🏁 Script executed:

#!/bin/bash
# Also search more broadly without type restrictions
rg -i 'onnxruntime-genai'

Repository: NVIDIA/Model-Optimizer

Length of output: 7460


🏁 Script executed:

#!/bin/bash
# Read the full context of the documentation file around lines 67-82
cat -n docs/source/getting_started/windows/_installation_standalone.rst | sed -n '60,90p'

Repository: NVIDIA/Model-Optimizer

Length of output: 1951


🏁 Script executed:

#!/bin/bash
# Check pyproject.toml for any onnxruntime references
rg -i 'onnxruntime' pyproject.toml

Repository: NVIDIA/Model-Optimizer

Length of output: 677


🏁 Script executed:

#!/bin/bash
# Check if there are earlier sections in the same file that mention installing onnxruntime-genai-cuda
head -n 70 docs/source/getting_started/windows/_installation_standalone.rst | cat -n

Repository: NVIDIA/Model-Optimizer

Length of output: 4397


🏁 Script executed:

#!/bin/bash
# Also check if _installation_with_olive.rst (which mentions onnxruntime-genai-cuda) is a prerequisite or related guide
grep -n "onnxruntime-genai" docs/source/getting_started/windows/_installation_with_olive.rst | head -20

Repository: NVIDIA/Model-Optimizer

Length of output: 496


Clarify the context for uninstalling onnxruntime-genai-cuda in Section 5.

Step 2 instructs users to uninstall onnxruntime-genai-cuda, but this package is not installed as part of the standalone ModelOpt-Windows installation guide (Sections 1-4). Users following this guide from the beginning would not have this package, making the uninstall step confusing and potentially causing errors.

While onnxruntime-genai-cuda is used in other installation paths (e.g., the Olive workflow guide), Section 5 lacks context about when this package would be present.

Please clarify:

  • Is Section 5 intended for users who previously installed onnxruntime-genai-cuda through alternative paths?
  • Should the uninstall step be conditional (e.g., "If you previously installed onnxruntime-genai-cuda from another guide, uninstall it")?
  • Or should onnxruntime-genai-cuda be added to the prerequisite installations before Section 5?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/getting_started/windows/_installation_standalone.rst` around
lines 71 - 74, Clarify that Step 2's uninstall of onnxruntime-genai-cuda is
conditional by updating Section 5 to state that this step applies only if the
user previously installed onnxruntime-genai-cuda via another guide (e.g., the
Olive workflow); change the wording to "If you previously installed
onnxruntime-genai-cuda, uninstall it" or alternatively add
onnxruntime-genai-cuda to the prerequisite list before Section 5 if the intent
is that it should have been installed earlier; ensure the text references the
package name onnxruntime-genai-cuda and the other packages mentioned
(onnxruntime-gpu, cupy-cuda12x/cupy-cuda13x) so readers know when the uninstall
is required.


github-actions Bot commented Apr 29, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-29 10:45 UTC


codecov Bot commented Apr 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.48%. Comparing base (077e29a) to head (a4cac9e).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1368   +/-   ##
=======================================
  Coverage   76.48%   76.48%           
=======================================
  Files         471      471           
  Lines       50487    50487           
=======================================
  Hits        38617    38617           
  Misses      11870    11870           
Flag Coverage Δ
unit 52.78% <ø> (ø)

Flags with carried forward coverage won't be shown.


@kevalmorabia97 kevalmorabia97 added the cherry-pick-0.44.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Apr 29, 2026
@kevalmorabia97 kevalmorabia97 merged commit 9bb917d into main Apr 29, 2026
32 checks passed
@kevalmorabia97 kevalmorabia97 deleted the ynankani/bug_6108338 branch April 29, 2026 10:45