docs(extraction): OCR v2 defaults, captioning link, B200 nemotron-parse (26.05, NVBug 6204537) #2103

Open
kheiss-uwzoo wants to merge 3 commits into NVIDIA:26.05 from kheiss-uwzoo:kheiss/docs-ocr-v2-multilingual-2605

@kheiss-uwzoo kheiss-uwzoo commented May 22, 2026

Summary

Doc-only updates for 26.05 extraction documentation.

  • NVBug 6204537: Correct the support matrix so B200 shows nemotron-parse as deployable (1 GPU, ~16GB disk), matching successful Helm deployment of nemotron-parse-v1.2 (NIMCache Ready, NIMService/Pod Running/Ready). RTX Pro 6000 and H200 NVL remain Not supported; footnote ² still applies only to 32GB (RTX PRO 4500). Documentation correctness only; separate from end-to-end SDK workflow issues in NVBug 6198661.
  • OCR v2 multilingual defaults: Adds a Nemotron OCR v2 language mode subsection. Local Hugging Face inference defaults to multilingual (multi), and the --ocr-lang english and --ocr-version v1 overrides are documented. Also adds a cross-link from OCR and scanned documents.
  • Helm / NIM (26.05): Points to nimOperator.ocr; when the chart targets nemotron-ocr-v2, the deployed NIM also defaults to multilingual (confirm repository / tag before upgrade).
  • Captioning Related link: Trims redundant hardware prose from the image-captioning cross-link in multimodal-extraction.md.

Test plan

  • MkDocs build for extraction docs
  • Verify anchor #nemotron-ocr-v2-language-mode resolves
  • Confirm nemotron-parse B200 cells render as 1 / ~16GB in the support matrix table
  • Spot-check OCR section and captioning Related link in rendered HTML
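The anchor check in the test plan could be scripted rather than done by eye. The following is a minimal sketch, not part of this PR: it assumes the new subsection heading reads "Nemotron OCR v2 language mode" (only the anchor is given in the description), and `slugify` is a simplified approximation of the slug that MkDocs generates, so the rendered HTML remains the authority.

```python
import re
from typing import Set


def slugify(heading: str) -> str:
    """Approximate a heading slug: lowercase, drop punctuation, hyphenate whitespace."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)  # remove punctuation other than hyphens
    return re.sub(r"[\s-]+", "-", text).strip("-")


def heading_anchors(markdown: str) -> Set[str]:
    """Collect slugs for every ATX heading (a line starting with 1-6 '#' characters).

    Note: this simple scan does not skip '#' comment lines inside fenced code blocks.
    """
    anchors = set()
    for line in markdown.splitlines():
        match = re.match(r"^\s{0,3}#{1,6}\s+(.*?)\s*#*\s*$", line)
        if match:
            anchors.add(slugify(match.group(1)))
    return anchors


# Hypothetical excerpt; the real heading text in the support matrix may differ.
sample = """# Support matrix

### Nemotron OCR v2 language mode

Some text.
"""

print("nemotron-ocr-v2-language-mode" in heading_anchors(sample))  # True
```

In practice the function would be run against the contents of prerequisites-support-matrix.md instead of the inline sample.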

Document local HuggingFace and Helm OCR NIM language defaults for Nemotron OCR v2.
@kheiss-uwzoo kheiss-uwzoo requested review from a team as code owners May 22, 2026 20:00
@kheiss-uwzoo kheiss-uwzoo requested review from drobison00 and removed request for a team May 22, 2026 20:00

greptile-apps Bot commented May 22, 2026

Greptile Summary

This docs-only PR adds OCR v2 multilingual default documentation for the 26.05 release: a new #nemotron-ocr-v2-language-mode subsection in the support matrix and a cross-link paragraph in the OCR section of multimodal-extraction.md. It also trims redundant link text from the image-captioning related link.

  • New OCR v2 language-mode section in prerequisites-support-matrix.md documents that local Hugging Face inference defaults to multilingual mode and explains how to override it via CLI/API; the Helm/NIM path is covered with a pointer to nimOperator.ocr in values.yaml.
  • Cross-link paragraph added to multimodal-extraction.md#ocr-and-scanned-documents summarising the multilingual default inline and directing Kubernetes users to the new support-matrix section.
  • Unrelated support-matrix edit: the nemotron-parse rows for B200 were silently changed from Not supported to 1 GPU / ~16GB disk, which is not mentioned anywhere in the PR description and needs explicit confirmation.

Confidence Score: 4/5

Safe to merge once the nemotron-parse B200 row change is confirmed or reverted.

The OCR v2 language-mode content and the captioning link trim are straightforward and correct. However, the support matrix rows for nemotron-parse on B200 were silently flipped from 'Not supported' to supported without any mention in the PR description or commit message. If that data is wrong, it will direct users into a broken Kubernetes deployment; the PR should explicitly confirm whether that change was intentional before merging.

docs/docs/extraction/prerequisites-support-matrix.md — the nemotron-parse B200 rows at lines 121–122 need author confirmation.

Important Files Changed

| Filename | Overview |
| --- | --- |
| docs/docs/extraction/prerequisites-support-matrix.md | Adds OCR v2 language-mode subsection and cross-links to values.yaml; also silently changes nemotron-parse support for B200 from "Not supported" to "1 GPU / ~16GB" with no mention in the PR description. |
| docs/docs/extraction/multimodal-extraction.md | Adds one paragraph introducing Nemotron OCR v2 multilingual defaults in the OCR section and trims redundant description text from the image-captioning related link; both changes are clean and match the stated PR scope. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User runs OCR extraction] --> B{Deployment mode?}
    B -->|Local HF weights| C[Nemotron OCR v2\ndefault: multilingual mode]
    B -->|Helm / NIM| D[nimOperator.ocr block\nin values.yaml]
    C --> E{Override needed?}
    E -->|English-only v2| F[--ocr-lang english]
    E -->|Legacy engine| G[--ocr-version v1]
    E -->|No override| H[Runs multilingual multi]
    D --> I{nemotron-ocr-v2 targeted?}
    I -->|Yes| J[NIM also defaults\nto multilingual]
    I -->|No| K[Confirm repository\nand tag before upgrade]
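The decision flow in the flowchart can also be read as a small function. This is an illustrative sketch only: `resolve_ocr_mode` and its parameters are hypothetical names, not the nemo-retriever API, and the returned strings are labels for this example.

```python
from typing import Optional


def resolve_ocr_mode(deployment: str,
                     ocr_version: str = "v2",
                     ocr_lang: Optional[str] = None,
                     chart_targets_v2: bool = True) -> str:
    """Return a label for the OCR behavior the flowchart describes (hypothetical helper)."""
    if deployment == "local":
        # Local Hugging Face weights: overrides apply here.
        if ocr_version == "v1":
            return "v1 legacy English-only"
        if ocr_lang == "english":
            return "v2 english"
        return "v2 multi"  # documented default: multilingual
    if deployment == "helm":
        # Local language selectors are not sent on remote requests,
        # so ocr_version and ocr_lang are deliberately ignored here.
        if chart_targets_v2:
            return "v2 multi"
        return "confirm repository and tag in values.yaml"
    raise ValueError(f"unknown deployment mode: {deployment!r}")


print(resolve_ocr_mode("local"))
print(resolve_ocr_mode("local", ocr_lang="english"))
print(resolve_ocr_mode("helm", ocr_lang="english"))
```

The last call shows the point the documentation makes about remote endpoints: a local language override has no effect on what the deployed NIM runs.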
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
docs/docs/extraction/prerequisites-support-matrix.md:121-122
**Undocumented nemotron-parse B200 support change**

The PR description covers only OCR v2 language defaults and the captioning link trim, but these two rows silently flip nemotron-parse from `Not supported → 1 GPU / ~16GB disk` for the B200. If that change is intentional and verified against the B200 runtime, it should be called out in the PR description so it gets a proper review. If it's a copy-paste accident, it should be reverted — incorrectly marking B200 as supported would send users into a broken deployment path.

### Issue 2 of 2
docs/docs/extraction/prerequisites-support-matrix.md:77
The brand name "Hugging Face" is spelled with a space consistently elsewhere in both changed files (and in NVIDIA docs), but this sentence uses `HuggingFace` (no space). Keeping it consistent avoids confusion.

```suggestion
    **Local Hugging Face inference:** When you deploy locally with Hugging Face model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.
```
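A spelling fix like the one suggested above could be applied across a file with a small helper. This is a sketch under the assumption that inline code spans (text between backticks) should be left untouched; `fix_brand_spelling` is a hypothetical name, and the helper would also rewrite the word inside URLs, so its output needs review.

```python
import re

# Capturing group so re.split keeps the inline code spans in the result.
CODE_SPAN = re.compile(r"(`[^`]*`)")


def fix_brand_spelling(text: str) -> str:
    """Replace 'HuggingFace' with 'Hugging Face' in prose, skipping inline code spans."""
    parts = CODE_SPAN.split(text)
    # Even indexes are prose; odd indexes are the captured code spans.
    return "".join(
        part if index % 2 else part.replace("HuggingFace", "Hugging Face")
        for index, part in enumerate(parts)
    )


print(fix_brand_spelling("deploy locally with HuggingFace model weights"))
print(fix_brand_spelling("this sentence uses `HuggingFace` (no space)"))
```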

Reviews (3): Last reviewed commit: "docs(extraction): mark B200 supported fo..." | Re-trigger Greptile


**Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.

**Helm / NIM (26.05):** The [NeMo Retriever Helm chart](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/README.md) deploys the core OCR NIM under [`nimOperator.ocr`](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/values.yaml#L817-L852). When that block targets **nemotron-ocr-v2** for your release, the deployed NIM also runs in multilingual mode by default. Confirm the `repository` and `tag` in `values.yaml` before you upgrade.

P2 Hardcoded line-range anchor in values.yaml link will go stale

The URL values.yaml#L817-L852 pins specific line numbers that will drift the moment anyone adds or removes lines above that block in values.yaml. When the anchor breaks, readers land at the wrong section with no error. Consider linking to the file root (values.yaml) without the fragment, or to a named heading/comment in values.yaml that is stable across edits.

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/docs/extraction/prerequisites-support-matrix.md
Line: 79

Comment:
**Hardcoded line-range anchor in `values.yaml` link will go stale**

The URL `values.yaml#L817-L852` pins specific line numbers that will drift the moment anyone adds or removes lines above that block in `values.yaml`. When the anchor breaks, readers land at the wrong section with no error. Consider linking to the file root (`values.yaml`) without the fragment, or to a named heading/comment in `values.yaml` that is stable across edits.

How can I resolve this? If you propose a fix, please make it concise.
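One way to act on this comment is to detect and strip line-range fragments from GitHub links before they go stale. The sketch below is an assumption about how such a check might look, not tooling from this repository; `LINE_ANCHOR`, `find_line_anchors`, and `strip_line_anchors` are hypothetical names.

```python
import re
from typing import List

# A GitHub URL followed by a line fragment such as #L817 or #L817-L852.
LINE_ANCHOR = re.compile(r"(https://github\.com/[^\s)#]+)#L\d+(?:-L\d+)?")


def find_line_anchors(markdown: str) -> List[str]:
    """Return every GitHub URL in the text that pins a line or line range."""
    return [match.group(0) for match in LINE_ANCHOR.finditer(markdown)]


def strip_line_anchors(markdown: str) -> str:
    """Drop the line fragment so the link points at the file root."""
    return LINE_ANCHOR.sub(r"\1", markdown)


link = ("[`nimOperator.ocr`](https://github.com/NVIDIA/NeMo-Retriever/"
        "blob/26.05/nemo_retriever/helm/values.yaml#L817-L852)")

print(find_line_anchors(link))
print(strip_line_anchors(link))
```

Linking to the file root loses the pointer to the `nimOperator.ocr` block, so the surrounding prose should still name the key for readers to search for.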


!!! note

**Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.

P2 CLI link targets main instead of the 26.05 branch

The anchor text says this is 26.05-specific guidance, but the CLI link resolves to https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli. If the CLI interface diverges between main and 26.05, readers following this link from the versioned docs will see instructions that may not match their installed release. Consider pinning to 26.05 (or the appropriate release tag) for consistency with the rest of this section.

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/docs/extraction/prerequisites-support-matrix.md
Line: 77

Comment:
**CLI link targets `main` instead of the `26.05` branch**

The anchor text says this is 26.05-specific guidance, but the CLI link resolves to `https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli`. If the CLI interface diverges between `main` and `26.05`, readers following this link from the versioned docs will see instructions that may not match their installed release. Consider pinning to `26.05` (or the appropriate release tag) for consistency with the rest of this section.

How can I resolve this? If you propose a fix, please make it concise.
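The re-pinning suggested here can be sketched as a text rewrite. This is illustrative only: `pin_links` is a hypothetical helper, and it does not verify that the target path actually exists on the `26.05` branch, which should be confirmed before changing the link.

```python
import re

# Matches tree/blob links into the repository that point at the main branch.
MAIN_REF = re.compile(
    r"(https://github\.com/NVIDIA/NeMo-Retriever/(?:tree|blob)/)main(/)"
)


def pin_links(markdown: str, ref: str = "26.05") -> str:
    """Rewrite links that target main so they target the given branch or tag."""
    return MAIN_REF.sub(lambda m: f"{m.group(1)}{ref}{m.group(2)}", markdown)


cli_link = "https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli"
print(pin_links(cli_link))
```

Links that are already pinned to a release branch are left unchanged, so the rewrite is safe to run repeatedly over the same file.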

@kheiss-uwzoo kheiss-uwzoo changed the title from "docs(extraction): note OCR v2 multilingual defaults (26.05)" to "OCR v2 multilingual defaults (26.05)" May 22, 2026
Correct support matrix GPU and disk columns for B200; Helm can deploy nemotron-parse-v1.2 on B200. Doc-only; separate from SDK workflow tracking in NVBug 6198661.
@kheiss-uwzoo kheiss-uwzoo changed the title from "OCR v2 multilingual defaults (26.05)" to "docs(extraction): OCR v2 defaults, captioning link, B200 nemotron-parse (26.05, NVBug 6204537)" May 22, 2026