[BUG6108338] Update windows documentation for onnxruntime quantization with Cuda13.x#1368

Merged
kevalmorabia97 merged 1 commit into main from
ynankani/bug_6108338
Apr 29, 2026

Conversation


@ynankani ynankani commented Apr 29, 2026

What does this PR do?

Type of change: Documentation

Update windows documentation for onnxruntime quantization with Cuda13.x

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed (git commit -s -S).

Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded trust_remote_code=True, torch.load(..., weights_only=False), pickle, etc.).

  • Is this change backward compatible?: N/A
  • If you copied code from any other sources or added a new PIP dependency, did you follow guidance in CONTRIBUTING.md: N/A
  • Did you write any new necessary tests?: N/A
  • Did you update Changelog?: N/A

Summary by CodeRabbit

  • Documentation
    • Updated Windows installation guide with CUDA 13.x-specific setup instructions for GPU-accelerated dependencies, including CuPy and ONNX Runtime configuration with nightly builds.

…n with Cuda13.x

Signed-off-by: ynankani <ynankani@nvidia.com>

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Documentation update to the Windows installation guide, adding CUDA 13.x-specific setup instructions. Users are guided to replace CUDA 12-based packages (CuPy and ONNX Runtime) with their CUDA 13 equivalents using specified nightly builds and a pre-release package.

Changes

  • Windows Installation Documentation: docs/source/getting_started/windows/_installation_standalone.rst
    Added CUDA 13.x-specific installation instructions, including commands to uninstall CUDA 12 packages and install cupy-cuda13x, ONNX Runtime CUDA 13 nightly builds, and the onnxruntime-genai-cuda pre-release package.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks: 6 passed ✅
  • Description Check (Passed): Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check (Passed): The title clearly and specifically describes the main change: updating Windows documentation for ONNX Runtime quantization with CUDA 13.x. It directly relates to the documentation update shown in the raw summary.
  • Docstring Coverage (Passed): No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Linked Issues Check (Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check (Passed): Check skipped because no linked issues were found for this pull request.
  • Security Anti-Patterns (Passed): The modified documentation file contains only reStructuredText content with no Python code changes.


@ynankani ynankani requested a review from kevalmorabia97 April 29, 2026 10:06

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
docs/source/getting_started/windows/_installation_standalone.rst (1)

67-82: Consider adding context about nightly builds and onnxruntime-genai-cuda.

While the instructions are technically correct for users with CUDA 13.x, the documentation could be enhanced with additional context:

  1. Purpose of onnxruntime-genai-cuda: The documentation doesn't explain what this package is or why it's being installed. Consider adding a brief explanation for users unfamiliar with the GenAI components.

  2. Nightly build implications: Using nightly builds from a custom index URL can have implications for stability and support. Consider adding a note about this, such as:

    • "Note: These instructions use nightly builds which may be less stable than official releases."
  3. When to apply these instructions: While line 67 mentions "If you are using CUDA 13.x", consider making this more prominent (e.g., using a note/warning admonition in reStructuredText).

Example enhancement:

.. note::
   **Only for CUDA 13.x users**: If you have CUDA 13.x installed, follow these steps to update the GPU acceleration packages. These instructions use nightly builds which may be less stable than official releases.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/getting_started/windows/_installation_standalone.rst` around
lines 67 - 82, Add brief explanatory context and a stability warning around the
CUDA 13.x steps: explain what onnxruntime-genai-cuda provides (GenAI-specific
GPU-accelerated runtime components) and why users install
onnxruntime-gpu/cupy-cuda13x, add a short admonition (reST note/warning) that
these steps are only for CUDA 13.x users, and include a clear warning that the
ONNX Runtime CUDA 13 packages are nightly builds from a custom index URL
(https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ort-cuda-13-nightly/...)
which may be less stable and not officially supported; place this note above the
numbered uninstall/install commands so readers see the caveats before running
the pip commands (reference symbols: onnxruntime-genai-cuda, onnxruntime-gpu,
cupy-cuda13x, CUDA 13.x, nightly).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 46756e41-a013-4041-9d8f-353a6442d80b

📥 Commits

Reviewing files that changed from the base of the PR and between 077e29a and a4cac9e.

📒 Files selected for processing (1)
  • docs/source/getting_started/windows/_installation_standalone.rst

Comment on lines +71 to +74
1. Uninstall ``cupy-cuda12x`` and install ``cupy-cuda13x``.
2. Uninstall ``onnxruntime-genai-cuda`` and ``onnxruntime-gpu``.
3. Install ONNX Runtime CUDA 13 nightly and the pre-release ``onnxruntime-genai-cuda`` package.
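The three numbered steps quoted above can be sketched as pip commands. This is an illustrative sketch, not the guide's exact wording: the script only prints the commands so they can be reviewed before running (pipe the output into `sh` to execute), and ORT_CUDA13_INDEX is a placeholder, since the review shows only a truncated nightly feed URL; take the full index URL from the installation guide itself.

```shell
#!/bin/sh
# Placeholder index URL: the real ort-cuda-13-nightly feed URL is
# truncated in the review, so substitute the one from the docs.
ORT_CUDA13_INDEX="https://example.invalid/ort-cuda-13-nightly/simple"

# Print (rather than run) the pip commands for the three steps.
cuda13_pip_commands() {
    # Step 1: swap CuPy to the CUDA 13 build.
    echo "pip uninstall -y cupy-cuda12x"
    echo "pip install cupy-cuda13x"
    # Step 2: remove the CUDA 12 ONNX Runtime packages.
    echo "pip uninstall -y onnxruntime-genai-cuda onnxruntime-gpu"
    # Step 3: install the CUDA 13 nightly ONNX Runtime from the custom
    # index, plus the pre-release onnxruntime-genai-cuda package.
    echo "pip install --pre onnxruntime-gpu --index-url $ORT_CUDA13_INDEX"
    echo "pip install --pre onnxruntime-genai-cuda"
}

cuda13_pip_commands
```

Printing first mirrors the review's caution about nightly builds: the commands touch GPU runtime packages, so it is worth confirming them against the current environment before executing.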


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for any references to onnxruntime-genai in project configuration files

# Search for onnxruntime-genai in dependency files
rg -i 'onnxruntime-genai' --type=toml --type=txt --type=cfg -g '!**/*.rst'

Repository: NVIDIA/Model-Optimizer

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Search for onnxruntime-genai references with corrected syntax
rg -i 'onnxruntime-genai' --type=toml --type=txt

Repository: NVIDIA/Model-Optimizer

Length of output: 156


🏁 Script executed:

#!/bin/bash
# Also search more broadly without type restrictions
rg -i 'onnxruntime-genai'

Repository: NVIDIA/Model-Optimizer

Length of output: 7460


🏁 Script executed:

#!/bin/bash
# Read the full context of the documentation file around lines 67-82
cat -n docs/source/getting_started/windows/_installation_standalone.rst | sed -n '60,90p'

Repository: NVIDIA/Model-Optimizer

Length of output: 1951


🏁 Script executed:

#!/bin/bash
# Check pyproject.toml for any onnxruntime references
rg -i 'onnxruntime' pyproject.toml

Repository: NVIDIA/Model-Optimizer

Length of output: 677


🏁 Script executed:

#!/bin/bash
# Check if there are earlier sections in the same file that mention installing onnxruntime-genai-cuda
head -n 70 docs/source/getting_started/windows/_installation_standalone.rst | cat -n

Repository: NVIDIA/Model-Optimizer

Length of output: 4397


🏁 Script executed:

#!/bin/bash
# Also check if _installation_with_olive.rst (which mentions onnxruntime-genai-cuda) is a prerequisite or related guide
grep -n "onnxruntime-genai" docs/source/getting_started/windows/_installation_with_olive.rst | head -20

Repository: NVIDIA/Model-Optimizer

Length of output: 496


Clarify the context for uninstalling onnxruntime-genai-cuda in Section 5.

Step 2 instructs users to uninstall onnxruntime-genai-cuda, but this package is not installed as part of the standalone ModelOpt-Windows installation guide (Sections 1-4). Users following this guide from the beginning would not have this package, making the uninstall step confusing and potentially causing errors.

While onnxruntime-genai-cuda is used in other installation paths (e.g., the Olive workflow guide), Section 5 lacks context about when this package would be present.

Please clarify:

  • Is Section 5 intended for users who previously installed onnxruntime-genai-cuda through alternative paths?
  • Should the uninstall step be conditional (e.g., "If you previously installed onnxruntime-genai-cuda from another guide, uninstall it")?
  • Or should onnxruntime-genai-cuda be added to the prerequisite installations before Section 5?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/source/getting_started/windows/_installation_standalone.rst` around
lines 71 - 74, Clarify that Step 2's uninstall of onnxruntime-genai-cuda is
conditional by updating Section 5 to state that this step applies only if the
user previously installed onnxruntime-genai-cuda via another guide (e.g., the
Olive workflow); change the wording to "If you previously installed
onnxruntime-genai-cuda, uninstall it" or alternatively add
onnxruntime-genai-cuda to the prerequisite list before Section 5 if the intent
is that it should have been installed earlier; ensure the text references the
package name onnxruntime-genai-cuda and the other packages mentioned
(onnxruntime-gpu, cupy-cuda12x/cupy-cuda13x) so readers know when the uninstall
is required.


github-actions Bot commented Apr 29, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-04-29 10:45 UTC


codecov Bot commented Apr 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.48%. Comparing base (077e29a) to head (a4cac9e).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1368   +/-   ##
=======================================
  Coverage   76.48%   76.48%           
=======================================
  Files         471      471           
  Lines       50487    50487           
=======================================
  Hits        38617    38617           
  Misses      11870    11870           
Flag Coverage Δ
unit 52.78% <ø> (ø)

Flags with carried forward coverage won't be shown.


@kevalmorabia97 kevalmorabia97 added the cherry-pick-0.44.0 After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc label Apr 29, 2026
@kevalmorabia97 kevalmorabia97 merged commit 9bb917d into main Apr 29, 2026
32 checks passed
@kevalmorabia97 kevalmorabia97 deleted the ynankani/bug_6108338 branch April 29, 2026 10:45