docs(extraction): OCR v2 defaults, captioning link, B200 nemotron-parse (26.05, NVBug 6204537) by kheiss-uwzoo · Pull Request #2103 · NVIDIA/NeMo-Retriever

kheiss-uwzoo · 2026-05-22T20:00:46Z

Summary

Doc-only updates for 26.05 extraction documentation.

NVBug 6204537: Correct the support matrix so B200 shows nemotron-parse as deployable (1 GPU, ~16GB disk), matching successful Helm deployment of nemotron-parse-v1.2 (NIMCache Ready, NIMService/Pod Running/Ready). RTX PRO 6000 and H200 NVL remain Not supported; footnote ² still applies only to 32GB (RTX PRO 4500). Documentation correctness only; separate from end-to-end SDK workflow issues in NVBug 6198661.
OCR v2 multilingual defaults: Adds a Nemotron OCR v2 language mode section. Local Hugging Face inference defaults to multilingual (multi), and the --ocr-lang english and --ocr-version v1 overrides are documented. Also adds a cross-link from OCR and scanned documents.
Helm / NIM (26.05): Points to nimOperator.ocr; when the chart targets nemotron-ocr-v2, the deployed NIM also defaults to multilingual (confirm repository / tag before upgrade).
Captioning Related link: Trims redundant hardware prose from the image-captioning cross-link in multimodal-extraction.md.

Test plan

MkDocs build for extraction docs
Verify anchor #nemotron-ocr-v2-language-mode resolves
Confirm nemotron-parse B200 cells render as 1 / ~16GB in the support matrix table
Spot-check OCR section and captioning Related link in rendered HTML
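
The anchor check in the test plan could be scripted rather than done by eye. The following is a minimal sketch, assuming the built page is available as an HTML string and that MkDocs emits heading anchors as `id` attributes. `AnchorCollector`, `anchor_exists`, and the sample fragment are hypothetical names for illustration, not part of the repository.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect every id attribute found in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def anchor_exists(html_text, anchor):
    """Return True if an element whose id matches the anchor is present."""
    collector = AnchorCollector()
    collector.feed(html_text)
    return anchor.lstrip("#") in collector.ids


# Hypothetical rendered fragment; a real check would read the built page from disk.
sample_html = '<h3 id="nemotron-ocr-v2-language-mode">Nemotron OCR v2 language mode</h3>'
print(anchor_exists(sample_html, "#nemotron-ocr-v2-language-mode"))
```

A real run would feed in the rendered support matrix page from the MkDocs output directory instead of `sample_html`.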

Document local HuggingFace and Helm OCR NIM language defaults for Nemotron OCR v2.

greptile-apps · 2026-05-22T20:02:44Z

Greptile Summary

This docs-only PR adds OCR v2 multilingual default documentation for the 26.05 release: a new #nemotron-ocr-v2-language-mode subsection in the support matrix and a cross-link paragraph in the OCR section of multimodal-extraction.md. It also trims redundant link text from the image-captioning related link.

New OCR v2 language-mode section in prerequisites-support-matrix.md documents that local Hugging Face inference defaults to multilingual mode and explains how to override it via CLI/API; the Helm/NIM path is covered with a pointer to nimOperator.ocr in values.yaml.
Cross-link paragraph added to multimodal-extraction.md#ocr-and-scanned-documents summarising the multilingual default inline and directing Kubernetes users to the new support-matrix section.
Unrelated support-matrix edit: the nemotron-parse rows for B200 were silently changed from Not supported to 1 GPU / ~16GB disk, which is not mentioned anywhere in the PR description and needs explicit confirmation.

Confidence Score: 4/5

Safe to merge once the nemotron-parse B200 row change is confirmed or reverted.

The OCR v2 language-mode content and the captioning link trim are straightforward and correct. However, the support matrix rows for nemotron-parse on B200 were silently flipped from 'Not supported' to supported without any mention in the PR description or commit message. If that data is wrong, it will direct users into a broken Kubernetes deployment; the PR should explicitly confirm whether that change was intentional before merging.

docs/docs/extraction/prerequisites-support-matrix.md — the nemotron-parse B200 rows at lines 121–122 need author confirmation.

Important Files Changed

Filename	Overview
docs/docs/extraction/prerequisites-support-matrix.md	Adds OCR v2 language-mode subsection and cross-links to values.yaml; also silently changes nemotron-parse support for B200 from "Not supported" to "1 GPU / ~16GB" with no mention in the PR description.
docs/docs/extraction/multimodal-extraction.md	Adds one paragraph introducing Nemotron OCR v2 multilingual defaults in the OCR section and trims redundant description text from the image-captioning related link; both changes are clean and match the stated PR scope.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User runs OCR extraction] --> B{Deployment mode?}
    B -->|Local HF weights| C[Nemotron OCR v2\ndefault: multilingual mode]
    B -->|Helm / NIM| D[nimOperator.ocr block\nin values.yaml]
    C --> E{Override needed?}
    E -->|English-only v2| F[--ocr-lang english]
    E -->|Legacy engine| G[--ocr-version v1]
    E -->|No override| H[Runs multilingual multi]
    D --> I{nemotron-ocr-v2 targeted?}
    I -->|Yes| J[NIM also defaults\nto multilingual]
    I -->|No| K[Confirm repository\nand tag before upgrade]
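
The branching in the flowchart above can also be read as a small decision function. This is only an illustrative sketch: `resolve_ocr_behavior` and its parameters are hypothetical and are not part of the `nemo-retriever` API. The precedence of `--ocr-version v1` over `--ocr-lang english` is an assumption, since the flowchart treats them as separate branches.

```python
def resolve_ocr_behavior(deployment, ocr_lang=None, ocr_version=None, targets_ocr_v2=False):
    """Hypothetical helper mirroring the flowchart; not part of nemo-retriever.

    deployment: "local" (Hugging Face weights) or "helm" (NIM via the Helm chart).
    ocr_lang / ocr_version: values of the local --ocr-lang / --ocr-version overrides.
    targets_ocr_v2: whether the nimOperator.ocr block targets nemotron-ocr-v2.
    """
    if deployment == "local":
        # Assumption: the legacy-engine override wins if both overrides are given.
        if ocr_version == "v1":
            return "legacy v1 engine (English-only)"
        if ocr_lang == "english":
            return "Nemotron OCR v2, English-only"
        return "Nemotron OCR v2, multilingual (multi)"
    if deployment == "helm":
        if targets_ocr_v2:
            return "OCR NIM, multilingual by default"
        return "confirm repository and tag in values.yaml"
    raise ValueError(f"unknown deployment mode: {deployment!r}")


print(resolve_ocr_behavior("local"))
print(resolve_ocr_behavior("local", ocr_lang="english"))
print(resolve_ocr_behavior("helm", targets_ocr_v2=True))
```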

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
docs/docs/extraction/prerequisites-support-matrix.md:121-122
**Undocumented nemotron-parse B200 support change**

The PR description covers only OCR v2 language defaults and the captioning link trim, but these two rows silently flip nemotron-parse from `Not supported → 1 GPU / ~16GB disk` for the B200. If that change is intentional and verified against the B200 runtime, it should be called out in the PR description so it gets a proper review. If it's a copy-paste accident, it should be reverted — incorrectly marking B200 as supported would send users into a broken deployment path.

### Issue 2 of 2
docs/docs/extraction/prerequisites-support-matrix.md:77
The brand name "Hugging Face" is spelled with a space consistently elsewhere in both changed files (and in NVIDIA docs), but this sentence uses `HuggingFace` (no space). Keeping it consistent avoids confusion.

```suggestion
    **Local Hugging Face inference:** When you deploy locally with Hugging Face model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.
```

Reviews (3): Last reviewed commit: "docs(extraction): mark B200 supported fo..."

greptile-apps · 2026-05-22T20:02:48Z

+
+    **Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.
+
+    **Helm / NIM (26.05):** The [NeMo Retriever Helm chart](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/README.md) deploys the core OCR NIM under [`nimOperator.ocr`](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/values.yaml#L817-L852). When that block targets **nemotron-ocr-v2** for your release, the deployed NIM also runs in multilingual mode by default. Confirm the `repository` and `tag` in `values.yaml` before you upgrade.


Hardcoded line-range anchor in values.yaml link will go stale

The URL values.yaml#L817-L852 pins specific line numbers that will drift the moment anyone adds or removes lines above that block in values.yaml. When the anchor breaks, readers land at the wrong section with no error. Consider linking to the file root (values.yaml) without the fragment, or to a named heading/comment in values.yaml that is stable across edits.
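
One way to catch this class of link before it goes stale is a small scan over the Markdown sources. The sketch below assumes GitHub-style `#L<start>-L<end>` fragments. `find_line_pinned_links` and `strip_line_fragment` are hypothetical helper names, and the sample text is adapted from the paragraph under review.

```python
import re

# Matches GitHub URLs ending in a line fragment such as #L817 or #L817-L852.
LINE_FRAGMENT_LINK = re.compile(r"https://github\.com/[^\s)]+#L\d+(?:-L\d+)?")


def find_line_pinned_links(markdown_text):
    """Return every GitHub URL in the text that pins specific line numbers."""
    return LINE_FRAGMENT_LINK.findall(markdown_text)


def strip_line_fragment(url):
    """Drop a trailing #L... fragment so the link points at the file itself."""
    return re.sub(r"#L\d+(?:-L\d+)?$", "", url)


sample = (
    "deploys the core OCR NIM under [`nimOperator.ocr`]"
    "(https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/values.yaml#L817-L852); "
    "see the [chart README]"
    "(https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/README.md)."
)

for link in find_line_pinned_links(sample):
    print(link, "->", strip_line_fragment(link))
```

Dropping the fragment is the simpler of the two fixes suggested above. Linking to a stable named comment in `values.yaml` would need a manual edit instead.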

Prompt To Fix With AI

This is a comment left during a code review.
Path: docs/docs/extraction/prerequisites-support-matrix.md
Line: 79
Comment: **Hardcoded line-range anchor in `values.yaml` link will go stale** The URL `values.yaml#L817-L852` pins specific line numbers that will drift the moment anyone adds or removes lines above that block in `values.yaml`. When the anchor breaks, readers land at the wrong section with no error. Consider linking to the file root (`values.yaml`) without the fragment, or to a named heading/comment in `values.yaml` that is stable across edits.
How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-05-22T20:02:49Z

+
+!!! note
+
+    **Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.


CLI link targets main instead of the 26.05 branch

The anchor text says this is 26.05-specific guidance, but the CLI link resolves to https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli. If the CLI interface diverges between main and 26.05, readers following this link from the versioned docs will see instructions that may not match their installed release. Consider pinning to 26.05 (or the appropriate release tag) for consistency with the rest of this section.
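
A rewrite along these lines could be applied mechanically across the versioned docs. This is a rough sketch, assuming only links into `NVIDIA/NeMo-Retriever` under `tree/main` or `blob/main` should be re-pinned. `pin_links_to_release` is a hypothetical helper, and whether each target path actually exists on the `26.05` branch is not checked here.

```python
import re

# Captures the prefix up to the branch segment and the slash that follows it.
MAIN_BRANCH_LINK = re.compile(
    r"(https://github\.com/NVIDIA/NeMo-Retriever/(?:tree|blob)/)main(/)"
)


def pin_links_to_release(markdown_text, release="26.05"):
    """Rewrite repository links that target main so they target a release branch."""
    # A function replacement avoids ambiguity between group references and
    # digits in the release name.
    return MAIN_BRANCH_LINK.sub(
        lambda m: m.group(1) + release + m.group(2), markdown_text
    )


sample = (
    "pass `--ocr-lang english` on the "
    "[CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli)"
)
print(pin_links_to_release(sample))
```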

Prompt To Fix With AI

This is a comment left during a code review.
Path: docs/docs/extraction/prerequisites-support-matrix.md
Line: 77
Comment: **CLI link targets `main` instead of the `26.05` branch** The anchor text says this is 26.05-specific guidance, but the CLI link resolves to `https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli`. If the CLI interface diverges between `main` and `26.05`, readers following this link from the versioned docs will see instructions that may not match their installed release. Consider pinning to `26.05` (or the appropriate release tag) for consistency with the rest of this section.
How can I resolve this? If you propose a fix, please make it concise.

docs(extraction): note OCR v2 multilingual defaults

c0cc98c

Document local HuggingFace and Helm OCR NIM language defaults for Nemotron OCR v2.

kheiss-uwzoo requested review from a team as code owners May 22, 2026 20:00

kheiss-uwzoo requested review from drobison00 and removed request for a team May 22, 2026 20:00

greptile-apps Bot reviewed May 22, 2026

docs(extraction): drop NIM hardware prose from captioning Related link

cb3a441

kheiss-uwzoo changed the title ~~docs(extraction): note OCR v2 multilingual defaults (26.05)~~ OCR v2 multilingual defaults (26.05) May 22, 2026

docs(extraction): mark B200 supported for nemotron-parse (NVBug 6204537)

f357a56

Correct support matrix GPU and disk columns for B200; Helm can deploy nemotron-parse-v1.2 on B200. Doc-only; separate from SDK workflow tracking in NVBug 6198661.

kheiss-uwzoo changed the title ~~OCR v2 multilingual defaults (26.05)~~ docs(extraction): OCR v2 defaults, captioning link, B200 nemotron-parse (26.05, NVBug 6204537) May 22, 2026

kheiss-uwzoo wants to merge 3 commits into NVIDIA:26.05 from kheiss-uwzoo:kheiss/docs-ocr-v2-multilingual-2605
