docs(extraction): OCR v2 defaults, captioning link, B200 nemotron-parse (26.05, NVBug 6204537) #2103
Document local Hugging Face and Helm OCR NIM language defaults for Nemotron OCR v2.
Greptile Summary
This docs-only PR adds OCR v2 multilingual default documentation for the 26.05 release: a new
| Filename | Overview |
|---|---|
| docs/docs/extraction/prerequisites-support-matrix.md | Adds OCR v2 language-mode subsection and cross-links to values.yaml; also silently changes nemotron-parse support for B200 from "Not supported" to "1 GPU / ~16GB" with no mention in the PR description. |
| docs/docs/extraction/multimodal-extraction.md | Adds one paragraph introducing Nemotron OCR v2 multilingual defaults in the OCR section and trims redundant description text from the image-captioning related link; both changes are clean and match the stated PR scope. |
Flowchart
```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[User runs OCR extraction] --> B{Deployment mode?}
    B -->|Local HF weights| C[Nemotron OCR v2\ndefault: multilingual mode]
    B -->|Helm / NIM| D[nimOperator.ocr block\nin values.yaml]
    C --> E{Override needed?}
    E -->|English-only v2| F[--ocr-lang english]
    E -->|Legacy engine| G[--ocr-version v1]
    E -->|No override| H[Runs multilingual multi]
    D --> I{nemotron-ocr-v2 targeted?}
    I -->|Yes| J[NIM also defaults\nto multilingual]
    I -->|No| K[Confirm repository\nand tag before upgrade]
```
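The local branch of the flowchart can also be read as a small decision function. The following is only a sketch of the documented behavior, not the library's implementation: `resolve_ocr`, `OcrSelection`, and the `remote_url` parameter are hypothetical names introduced for illustration, while `ocr_lang`, `ocr_version`, `english`, `v1`, and `multi` come from the documentation text under review.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OcrSelection:
    """Hypothetical result type: which engine runs and in which language mode."""
    engine: str              # "v1", "v2", or "remote"
    language: Optional[str]  # None when a remote NIM decides for itself


def resolve_ocr(remote_url: Optional[str] = None,
                ocr_version: Optional[str] = None,
                ocr_lang: Optional[str] = None) -> OcrSelection:
    """Illustrative model of the documented defaults (names are assumptions)."""
    # Remote OCR NIM endpoints use their own model and language behavior;
    # local language selectors are not sent on remote requests.
    if remote_url:
        return OcrSelection("remote", None)
    # --ocr-version v1 selects the legacy English-only engine.
    if ocr_version == "v1":
        return OcrSelection("v1", "english")
    # Default local engine is Nemotron OCR v2 in multilingual mode ("multi");
    # --ocr-lang english narrows v2 to English only.
    return OcrSelection("v2", ocr_lang or "multi")
```

Calling `resolve_ocr()` with no arguments models the "No override" path, and passing `ocr_lang="english"` models the English-only v2 path.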
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
docs/docs/extraction/prerequisites-support-matrix.md:121-122
**Undocumented nemotron-parse B200 support change**
The PR description covers only OCR v2 language defaults and the captioning link trim, but these two rows silently flip nemotron-parse from `Not supported → 1 GPU / ~16GB disk` for the B200. If that change is intentional and verified against the B200 runtime, it should be called out in the PR description so it gets a proper review. If it's a copy-paste accident, it should be reverted — incorrectly marking B200 as supported would send users into a broken deployment path.
### Issue 2 of 2
docs/docs/extraction/prerequisites-support-matrix.md:77
The brand name "Hugging Face" is spelled with a space consistently elsewhere in both changed files (and in NVIDIA docs), but this sentence uses `HuggingFace` (no space). Keeping it consistent avoids confusion.
```suggestion
**Local Hugging Face inference:** When you deploy locally with Hugging Face model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.
```
Reviews (3): Last reviewed commit: "docs(extraction): mark B200 supported fo..."
**Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.
**Helm / NIM (26.05):** The [NeMo Retriever Helm chart](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/README.md) deploys the core OCR NIM under [`nimOperator.ocr`](https://github.com/NVIDIA/NeMo-Retriever/blob/26.05/nemo_retriever/helm/values.yaml#L817-L852). When that block targets **nemotron-ocr-v2** for your release, the deployed NIM also runs in multilingual mode by default. Confirm the `repository` and `tag` in `values.yaml` before you upgrade.
docs/docs/extraction/prerequisites-support-matrix.md:79
**Hardcoded line-range anchor in `values.yaml` link will go stale**
The URL `values.yaml#L817-L852` pins specific line numbers that will drift the moment anyone adds or removes lines above that block in `values.yaml`. When the anchor breaks, readers land at the wrong section with no error. Consider linking to the file root (`values.yaml`) without the fragment, or to a named heading/comment in `values.yaml` that is stable across edits.
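One way to catch the stale-anchor problem described in the comment on the `values.yaml` link is a small docs lint that flags GitHub links carrying `#L…` line fragments. This is a minimal sketch under the assumption that links are plain `https://github.com/...` URLs in Markdown; the function names are hypothetical and not part of any existing tooling in the repository.

```python
import re

# Matches GitHub URLs that end in a line fragment such as #L817 or #L817-L852.
LINE_ANCHOR = re.compile(r"https://github\.com/[^\s)]+#L\d+(?:-L\d+)?")


def find_line_anchored_links(markdown: str) -> list[str]:
    """Return every GitHub URL in the text that pins a line or line range."""
    return LINE_ANCHOR.findall(markdown)


def strip_line_anchor(url: str) -> str:
    """Drop a trailing #L… fragment so the link points at the file root."""
    return re.sub(r"#L\d+(?:-L\d+)?$", "", url)
```

Running `find_line_anchored_links` over the changed Markdown files and applying `strip_line_anchor` to each hit would implement the first of the two remedies the comment proposes, linking to the file root.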
!!! note

**Local Hugging Face inference:** When you deploy locally with HuggingFace model weights (for example `pip install "nemo-retriever[local]"` and GPU inference without remote OCR NIM URLs), the default OCR engine is **Nemotron OCR v2**, which runs in **multilingual** mode by default (`multi`). For English-only v2, pass `--ocr-lang english` on the [CLI](https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli) or set the equivalent `ocr_lang` parameter in the Python API. Use `--ocr-version v1` for the legacy English-only engine. Remote OCR NIM endpoints use their own model and language behavior; local OCR language selectors are not sent on remote requests.
docs/docs/extraction/prerequisites-support-matrix.md:77
**CLI link targets `main` instead of the `26.05` branch**
The anchor text says this is 26.05-specific guidance, but the CLI link resolves to `https://github.com/NVIDIA/NeMo-Retriever/tree/main/nemo_retriever/docs/cli`. If the CLI interface diverges between `main` and `26.05`, readers following this link from the versioned docs will see instructions that may not match their installed release. Consider pinning to `26.05` (or the appropriate release tag) for consistency with the rest of this section.

Correct support matrix GPU and disk columns for B200; Helm can deploy nemotron-parse-v1.2 on B200. Doc-only; separate from SDK workflow tracking in NVBug 6198661.
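The `main` versus `26.05` concern raised in the review comment on the CLI link could be checked mechanically. The sketch below is an assumption-laden illustration, not existing project tooling: `unpinned_links` is a hypothetical helper, and it assumes the ref is a single path segment directly after `blob/` or `tree/` (branch names containing `/` would need different handling).

```python
import re

# Captures the ref (branch or tag) that follows /blob/ or /tree/ in a GitHub URL.
GITHUB_REF = re.compile(
    r"https://github\.com/[\w.-]+/[\w.-]+/(?:blob|tree)/(?P<ref>[^/\s)]+)[^\s)]*"
)


def unpinned_links(markdown: str, expected_ref: str) -> list[str]:
    """Return GitHub URLs whose ref differs from the expected release ref."""
    return [
        match.group(0)
        for match in GITHUB_REF.finditer(markdown)
        if match.group("ref") != expected_ref
    ]
```

For the versioned docs discussed here, `expected_ref` would be `"26.05"`, so a link through `tree/main` is reported while links through `blob/26.05` pass.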
Summary
Doc-only updates for 26.05 extraction documentation.
- `1` GPU, `~16GB` disk), matching successful Helm deployment of `nemotron-parse-v1.2` (NIMCache Ready, NIMService/Pod Running/Ready). RTX Pro 6000 and H200 NVL remain Not supported; footnote ² still applies only to 32GB (RTX PRO 4500). Documentation correctness only; separate from end-to-end SDK workflow issues in NVBug 6198661.
- `multi`), with `--ocr-lang english` / `--ocr-version v1` documented, and a cross-link from OCR and scanned documents.
- `nimOperator.ocr`; when the chart targets nemotron-ocr-v2, the deployed NIM also defaults to multilingual (confirm `repository` / `tag` before upgrade).
- `multimodal-extraction.md`.

Test plan
- `#nemotron-ocr-v2-language-mode` resolves
- `1` / `~16GB` in the support matrix table