feat(constants)!: switch URLs to v0.9.0 layout + add MODEL_REGISTRY #1148
msluszniak wants to merge 22 commits
URL refresh
-----------
Every URL constant in the library now points at the restructured HF
layout under `resolve/v0.9.0`. File names follow
`<model>_<size>_<backend>_<precision>.pte`, files sit under per-size
and per-backend directories. Affects:
- modelUrls.ts: 170 URL refs rewritten to new paths. The 8da4w-typo
file `lfm2_5_350m_xnnpack_8w4da.pte` is corrected to `..._8da4w.pte`.
- ocr/models.ts: CRAFT detector URL + CRNN per-language URL template
switch to the new `<lang>/xnnpack/crnn_<lang>_xnnpack_fp32.pte` shape.
- tts/models.ts: Kokoro consts re-rooted to
`<size>/xnnpack/kokoro_<size>_<component>_xnnpack_fp32.pte`.
- tts/voices.ts: voices/ and phonemizer/ asset paths kept in place;
only the `${VERSION_TAG}` value bumps.
- versions.ts: VERSION_TAG -> resolve/v0.9.0. NEXT_VERSION_TAG
collapsed into VERSION_TAG. PREVIOUS_VERSION_TAG=resolve/v0.8.0
retained for the two @deprecated Llama QLoRA aliases (LLAMA3_2_*_QLORA)
that continue to resolve their v0.8.0 file. SpinQuant is the canonical
quantized Llama 3.2 variant going forward.
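The naming scheme above can be sketched as a small path builder. This is only an illustration: `ptePath` and `pteUrl` are hypothetical helpers (not part of the library), the repository base URL is a placeholder, and the `<size>/<backend>/` directory order is assumed from the FastSAM layout mentioned later in this PR.

```typescript
// Hypothetical helpers: they only illustrate the naming scheme, they are not library code.
type PteFile = {
  model: string; // e.g. 'fast_sam'
  size: string; // e.g. 's'
  backend: string; // e.g. 'xnnpack'
  precision: string; // e.g. 'fp32'
};

// Builds `<size>/<backend>/<model>_<size>_<backend>_<precision>.pte`.
function ptePath({ model, size, backend, precision }: PteFile): string {
  return `${size}/${backend}/${model}_${size}_${backend}_${precision}.pte`;
}

// Joins a repository base URL (placeholder), the version tag, and the file path.
function pteUrl(repoBase: string, versionTag: string, file: PteFile): string {
  return `${repoBase}/${versionTag}/${ptePath(file)}`;
}

const exampleUrl = pteUrl('https://example.invalid/some-repo', 'resolve/v0.9.0', {
  model: 'fast_sam',
  size: 's',
  backend: 'xnnpack',
  precision: 'fp32',
});
```

Keeping the file name derivable from the same four tokens as the directory path is what makes typos like `8w4da` vs `8da4w` easier to catch.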
MODEL_REGISTRY
--------------
Adds `constants/modelRegistry.ts` — a typed accessor grouped by
capability (LLM, VLM, CLASSIFICATION, OBJECT_DETECTION,
SEMANTIC_SEGMENTATION, INSTANCE_SEGMENTATION, STYLE_TRANSFER,
SPEECH_TO_TEXT, TEXT_EMBEDDING, IMAGE_EMBEDDING, IMAGE_GENERATION,
VAD). Each entry is callable with `{ quant, backend }`:
MODEL_REGISTRY.LLM.LLAMA3_2_3B // default (base)
MODEL_REGISTRY.LLM.LLAMA3_2_3B({ quant: true }) // SpinQuant
When read as a value (object access), returns the default config; when
called, resolves the requested variant. `backend` is accepted in the
signature for forward-compat but the library still picks via
`Platform.OS` at module load.
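One way such a dual-shaped accessor (readable as a value and callable as a function) could be built is by copying the default config onto the function object. The sketch below is an assumption about the technique, not the code in `modelRegistry.ts`; `makeDualAccessor`, the `ModelConfig` fields, and the sample values are all hypothetical.

```typescript
// Hypothetical config shape; field names are illustrative only.
type ModelConfig = { modelName: string; modelSource: string };
type Options = { quant?: boolean };

// An accessor that is both a config object and a function returning a config.
type DualAccessor = ModelConfig & ((opts?: Options) => ModelConfig);

function makeDualAccessor(base: ModelConfig, quantized: ModelConfig): DualAccessor {
  const resolve = (opts: Options = {}): ModelConfig => (opts.quant ? quantized : base);
  // Copy the default (base) config's fields onto the function object itself.
  return Object.assign(resolve, base);
}

const llamaSketch = makeDualAccessor(
  { modelName: 'llama-base-sketch', modelSource: 'base.pte' },
  { modelName: 'llama-quant-sketch', modelSource: 'quant.pte' }
);

const asValue = llamaSketch.modelName; // read as a value: default config
const asCall = llamaSketch({ quant: true }).modelName; // called: requested variant
```

The config uses `modelName` rather than `name` on purpose: functions already carry a read-only `name` property, so copying a `name` field onto one would fail.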
The previous flat `MODEL_REGISTRY = { ALL_MODELS: {...} }` export in
modelUrls.ts is removed; its internal-only consumer (the urlToModelName
lookup) now reads from a private `_ALL_MODELS` array.
Resolves the JS-API side of the HF naming convention migration.
The umbrella lfm-2.5 HF repo hosts two distinct models — the text LLM
(1.2B + 350M) and the vision-language model (1.6B + 450M). The
migrator collapsed the VL size tokens (`vl_1_6b`, `vl_450m`) to bare
numeric sizes, making VL 1.6B indistinguishable from a hypothetical
text 1.6B variant. It also left the four per-variant tokenizers at
their legacy `lfm2.5-*/` paths instead of moving them next to the new
backend dirs.
HF state (separate commits on the repo):
- VL .pte files renamed to `vl_<size>/xnnpack/lfm_2_5_vl_<size>_*.pte`
- tokenizers moved into `<size>/` and `vl_<size>/` next to each cell
- legacy `lfm2.5-*-instruct/` and `lfm2.5-VL-*/` dirs cleaned out
- config.json files refreshed (vl_* configs now carry
`model: lfm_2_5_vl` + `capabilities: [vision, text-generation]`)
This commit refreshes the matching URL constants in modelUrls.ts so
every LFM2.5 model points at its new HF path.
Covers the new grouped MODEL_REGISTRY shape (capability groups with
callable accessors), the `{ quant, backend }` options, default vs
quantized resolution, the still-supported direct-import pattern, and a
short migration note from the previous flat `ALL_MODELS` dict.
22 files updated across apps/llm, apps/computer-vision, apps/speech,
apps/text-embeddings, and apps/bare-rn. Each flat model-constant import
is replaced with the corresponding `MODEL_REGISTRY.<GROUP>.<NAME>` (or
`(...)({ quant: true })` for quantized variants). Llama QLoRA aliases
remain imported under their flat names — they're deprecated and not
part of the registry.
Net effect: -242 / +158 lines (collapsed imports, terser callsites).
Apps now serve as the canonical usage example for the typed registry.
…ctions
`useState` auto-invokes function-typed initial values as lazy initializers, so passing a MODEL_REGISTRY accessor unwraps it into a plain config, breaking reference equality against the accessor stored in MODELS. Compare by modelName (falling back to `===` for picker users without one, e.g. VoiceConfig).
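The pitfall can be reproduced without React. In the sketch below, `initState` is a stand-in that mimics only the initial-value rule of `useState` (a function-typed initial value is called and its result is stored); the accessor and config are hypothetical.

```typescript
// Stand-in for the initial-value rule of React's useState:
// a function-typed initial value is called once and its result is stored.
function initState<T>(initial: T | (() => T)): T {
  return typeof initial === 'function' ? (initial as () => T)() : initial;
}

type ModelConfig = { modelName: string };
type Accessor = ModelConfig & (() => ModelConfig);

const config: ModelConfig = { modelName: 'demo-model' };
// Hypothetical dual-shaped accessor: a function that also carries the config fields.
const accessor: Accessor = Object.assign(() => config, config);

// What a picker would hold after `useState(accessor)`: the unwrapped plain config.
const stored = initState<ModelConfig | Accessor>(accessor);

const sameReference = stored === accessor; // the function was unwrapped, so this is false
const sameModel = stored.modelName === accessor.modelName; // comparing by name still works
```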
Each accessor's `backend` parameter is now typed to exactly the backends the model ships with; passing an unsupported one is a compile-time error. `Platform.OS` still picks the default when `backend` is omitted. The per-backend (quant × backend) variant matrix lives in modelRegistry.ts so modelUrls.ts stays flat-per-model. Unifies DISTILUSE_BASE_MULTILINGUAL_CASED_V2 to one accessor with xnnpack + coreml; the _8DA4W and _COREML named constants stay as deprecated aliases.
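A minimal sketch of how a `backend` option can be restricted to the backends a model ships with, using the keys of a variant table. `makeAccessor` and both sample models are hypothetical, and the default backend is passed in explicitly here rather than derived from `Platform.OS`.

```typescript
type ModelConfig = { modelName: string; modelSource: string };

// Builds an accessor whose `backend` option only accepts the keys of `variants`.
function makeAccessor<V extends Record<string, ModelConfig>>(
  variants: V,
  defaultBackend: keyof V & string
) {
  return (opts: { backend?: keyof V & string } = {}): ModelConfig =>
    variants[opts.backend ?? defaultBackend];
}

// Hypothetical xnnpack-only model.
const llamaSketch = makeAccessor(
  { xnnpack: { modelName: 'llama-sketch', modelSource: 'llama_xnnpack.pte' } },
  'xnnpack'
);

// Hypothetical model published for two backends.
const embeddingSketch = makeAccessor(
  {
    xnnpack: { modelName: 'embed-sketch', modelSource: 'embed_xnnpack.pte' },
    coreml: { modelName: 'embed-sketch', modelSource: 'embed_coreml.pte' },
  },
  'xnnpack'
);

const viaCoreml = embeddingSketch({ backend: 'coreml' }); // accepted: coreml is a key
// @ts-expect-error 'coreml' is not among this model's backends
llamaSketch({ backend: 'coreml' });
```

Because the allowed backends come from the variant table itself, adding a backend to a model widens its accessor's type without any separate type declaration.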
…ariant
Bare accessors (and undefined `quant`) now resolve to the quantized
variant when one is published; pass `{ quant: false }` to opt out. Docs
and example apps are updated to match — dual pickers keep both rows by
making the FP32 entry the explicit opt-out.
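The default described above can be sketched as a small resolution function. `VariantCell` and `resolveVariant` are hypothetical names, and falling back to the base variant when `quant: true` is requested but nothing quantized is published is an assumption, not confirmed behaviour.

```typescript
type ModelConfig = { modelName: string };

// Hypothetical cell: the unquantized variant plus an optional quantized one.
type VariantCell = { base: ModelConfig; quantized?: ModelConfig };

function resolveVariant(cell: VariantCell, quant?: boolean): ModelConfig {
  // Explicit opt-out, or no quantized variant published: use the base variant.
  if (quant === false || cell.quantized === undefined) {
    return cell.base;
  }
  // `quant` is true or undefined and a quantized variant exists.
  return cell.quantized;
}

const both: VariantCell = {
  base: { modelName: 'demo-fp32' },
  quantized: { modelName: 'demo-quant' },
};
const baseOnly: VariantCell = { base: { modelName: 'plain-fp32' } };

const bare = resolveVariant(both); // quantized by default
const optOut = resolveVariant(both, false); // explicit opt-out
const noQuant = resolveVariant(baseOnly); // nothing quantized published
```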
There are some problems with HF repos:
That's what I managed to find; someone else should probably also have a look at the updated HF repos. I will be testing example apps now.
Each repo's v0.9.0 tag was retagged or rewritten today to match
MODEL_SPEC.md. Sync the constants:
- clip-vit-base-patch32: image/text encoders now live under their
component-tokenized filenames (the migration silently dropped text
and pointed both image and text constants at the image encoder).
- deeplab-v3, fcn: separate filenames per backbone (mobilenet_v3_large,
resnet50, resnet101 for deeplab; resnet50, resnet101 for fcn). All
three deeplab constants previously resolved to the same blob.
- fast-sam: size-tokened paths (S vs X) for both xnnpack and coreml.
FASTSAM_S and FASTSAM_X were aliased to the X variant.
- qwen-3.5: tokenizers moved out of the legacy Qwen3.5-{0.8B,2B}/
directories into the canonical <size>/ layout matching lfm-2.5.
… size-first paths
`MODEL_REGISTRY` accessors default to the quantized variant when one is published, so the non-quantized slot is no longer the runtime default. Rename the `BackendCell.default` key to `base` so the field name matches what it actually represents (the unquantized base variant) and stops fighting the new runtime default. `PlatformDefaults.default` is unrelated (platform-fallback backend) and is unchanged.
Also point FastSAM URLs at the size-first HF layout, `<size>/<backend>/fast_sam_<size>_<backend>_<precision>.pte`, matching the yolo26-seg / lfm-2.5 convention.
@barhanc I don't think these are sizes per se in this case. You have specified only the backbone of the final model. These models might be smaller or bigger, but the names do not directly indicate size (unlike families where models differ in parameter count and the name derives from that number, or is explicitly s, m, l, xl, etc.).
modelRegistry.ts duplicated the same `${URL_PREFIX}-…/${VERSION_TAG}/…`
strings that modelUrls.ts already had inline in each Platform.OS branch.
Hoist a single set of per-backend URL constants into modelUrls.ts and
have both consumers reference them, so each URL string lives in exactly
one place.
- Add per-backend exports for efficientnet-v2-s, ssdlite320-mobilenet-v3-
large, rfdetr-nano-detector, rfdetr-nano-segmentation, fast-sam {s,x},
distiluse-base-multilingual-cased-v2.
- Add `styleTransferUrls(display, slug)` helper for the 4 style-transfer
styles; the registry's `styleTransferVariants` now consumes it.
- Drop the now-unused `URL_PREFIX, VERSION_TAG` import from
modelRegistry.ts.
Addresses #1148 (comment)
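The single-source-of-truth idea from this commit can be sketched as follows. All names and URLs are hypothetical placeholders, the OS string is passed in instead of reading `Platform.OS` so the sketch stays self-contained, and the iOS-to-CoreML mapping is an assumption for illustration.

```typescript
// Hypothetical per-backend URL constants, each string defined exactly once.
const DEMO_MODEL_XNNPACK = 'https://example.invalid/demo/xnnpack/demo_xnnpack_fp32.pte';
const DEMO_MODEL_COREML = 'https://example.invalid/demo/coreml/demo_coreml_fp32.pte';

// Consumer 1: a platform-default lookup (assumes iOS prefers the CoreML file).
function demoModelFor(os: string): string {
  return os === 'ios' ? DEMO_MODEL_COREML : DEMO_MODEL_XNNPACK;
}

// Consumer 2: a registry-style variant table that reuses the same constants.
const demoVariants = {
  xnnpack: DEMO_MODEL_XNNPACK,
  coreml: DEMO_MODEL_COREML,
};
```

With this shape, a path change on HF touches one constant and both consumers pick it up.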
…al/tts/ocr
Adopts Bartek's feedback on #1148: the accessor is no longer dual-shaped (value AND function). Each leaf is a pure function: call it (optionally with `{ quant, backend }`) to get the resolved config. This eliminates the `useState` lazy-init footgun and `useMemo`/`useCallback` dep hazards, so pickers fall back to plain `===` reference equality (drops the `sameValue` workaround across four `ModelPicker.tsx` files).
Renames:
- `MODEL_REGISTRY` → `models` (lowercase top-level)
- group keys lowercased: `LLM` → `llm`, etc.
- per Kuba: `vlm` → `multimodal` (anticipates audio-capable LMs like Gemma 4)
Adds:
- `models.text_to_speech` group: `kokoro_small`, `kokoro_medium`, plus voices as plain configs under `voices` (no quant/backend axis).
- `models.ocr({ language })` parameterized accessor: covers all ISO language tokens via a runtime map built from the existing `OCR_<LANGUAGE>` exports.
Example apps (22 files, ~150 substitutions) migrated by script. bare-rn demo swapped from `llama3_2_1b` to `lfm2_5_1_2b_instruct` per Kuba's note. Docs rewritten with the new syntax + TTS + OCR sections.
Relaxes the project's `camelcase` rule with `properties: 'never'` so the lowercase snake_case keys in `models` (which mirror the `.pte` filename convention) pass without per-file disables. Variable and function names still require camelCase.
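A parameterized accessor backed by a runtime map might look roughly like this. The two constants are hypothetical stand-ins for the per-language OCR exports, the language tokens (`en`, `pl`) and config fields are assumed, and throwing on an unknown language is a choice made for this sketch only.

```typescript
type OcrConfig = { modelName: string; language: string };

// Hypothetical stand-ins for the per-language OCR constants.
const OCR_ENGLISH_SKETCH: OcrConfig = { modelName: 'ocr-en-sketch', language: 'en' };
const OCR_POLISH_SKETCH: OcrConfig = { modelName: 'ocr-pl-sketch', language: 'pl' };

// Runtime map from language token to config, built once from the constants.
const ocrByLanguage = new Map<string, OcrConfig>(
  [OCR_ENGLISH_SKETCH, OCR_POLISH_SKETCH].map((c): [string, OcrConfig] => [c.language, c])
);

// Pure function: the same language always resolves to the same config object.
function ocr({ language }: { language: string }): OcrConfig {
  const config = ocrByLanguage.get(language);
  if (config === undefined) {
    throw new Error(`No OCR model for language "${language}"`);
  }
  return config;
}

const english = ocr({ language: 'en' });
```

Because the map hands back the same object on every call, plain `===` comparisons between resolved configs hold in this sketch.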
Per Kuba's review on #1148: hoist a camelCase alias for any group used ≥ 2 times in a file, e.g.
const instanceSegmentation = models.instance_segmentation;
const objectDetection = models.object_detection;
Then `models.instance_segmentation.yolo26n_seg()` becomes `instanceSegmentation.yolo26n_seg()`. Applied to 14 files where it actually reduces noise.
Skips aliasing when the camelCase name would shadow an existing local identifier, which is common in the LLM/STT/embeddings screens where `llm`, `speechToText`, `imageEmbedding` etc. already name hook return values or temporaries.
…version
The 0.8.x doc linked at `/docs/next/api-reference/variables/MODEL_REGISTRY`, which broke once the next-version docs renamed the variable to `models`. Use a relative path within the 0.8.x version so the link is independent of later renames.
barhanc
left a comment
In docs we have many snippets that use the old API for selecting the model, these should probably be changed as well.
…`, add `pose_estimation`
- `models.multimodal` is renamed to `models.lmm` (Large Multimodal Models), which is clearer than the generic "multimodal" tag now that LMMs span audio+vision.
- `privacy_filter_*` is moved out of `models.classification` into its own `models.privacy_filter` group (it has a dedicated hook and a distinct model-name union, so grouping under `classification` was misleading).
- Pose estimation is added as `models.pose_estimation` (`yolo26n`).
Example apps that select these models are updated.
Every code snippet that selected a model via a named constant (`LFM2_5_1_2B_INSTRUCT`, `OCR_ENGLISH`, `KOKORO_MEDIUM`, …) is rewritten to use the typed `models.<group>.<entry>()` accessor. Snippets default to the quantized variant where one is published — the constant-import path still works and is documented in the Model Registry guide. webrtc-integration intentionally keeps the named-constant style: the snippet sits next to imports from other libraries and the bare constant reads cleaner in that context.
- Drop three `eslint-disable camelcase` comments that became no-ops once the `camelcase` rule was relaxed to `properties: 'never'`.
- Add a `@returns` line and trim the blank lines in `styleTransferUrls`'s JSDoc so `jsdoc/require-returns` and `jsdoc/tag-lines` pass.
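For reference, the relaxed rule corresponds to a rules fragment along these lines. The `properties: 'never'` option is ESLint's documented setting for skipping property-name checks; the `'error'` severity and the surrounding object are assumptions, since the project's actual config file is not shown here.

```typescript
// Rules fragment only; the 'error' severity is an assumption.
const rules = {
  camelcase: ['error', { properties: 'never' }],
} as const;

// Passes under `properties: 'never'`: snake_case used as a property key.
const registrySketch = { instance_segmentation: 'demo' };

// Still reported: a snake_case variable or function name,
// for example `const my_alias = registrySketch.instance_segmentation;`.
const myAlias = registrySketch.instance_segmentation;
```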
We can just put all these models under |
Agreed, I'm also in favour of moving them under llm. Also regarding this one:
Do you have any particular solution in mind?
I guess we already have something like this in place, since the user can check it like this:
const LFM2_5_VL = models.llm.lfm2_5_vl_1_6b();
console.log(LFM2_5_VL.capabilities);
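Building on that idea, a capability check could be wrapped in a small helper. This is a sketch under assumptions: `supportsVision` is hypothetical, the config shape is simplified, and the `'vision'` / `'text-generation'` strings are taken from the HF `config.json` description earlier in this PR rather than from the resolved JS config.

```typescript
// Hypothetical config shape; only the `capabilities` field comes from the discussion.
type LlmConfig = { modelName: string; capabilities: readonly string[] };

// Returns true when the resolved config advertises image input.
function supportsVision(config: LlmConfig): boolean {
  return config.capabilities.includes('vision');
}

const vlSketch: LlmConfig = {
  modelName: 'vl-sketch',
  capabilities: ['vision', 'text-generation'],
};
const textSketch: LlmConfig = {
  modelName: 'text-sketch',
  capabilities: ['text-generation'],
};
```

A picker could use such a check to decide whether to show an image-attachment control without knowing which group the model was listed under.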
Vision-capable LLMs are still language models at their core and go through the same `useLLM` hook / `LLMModule` module — splitting them into a separate `lmm` group only forced users to know which file the model lives in. Move `lfm2_5_vl_1_6b` and `lfm2_5_vl_450m` under `models.llm`, drop the empty `lmm` group, and update call sites in the multimodal_llm example app + docs.
The two standalone `LFM2.5-VL-*` rows pointed at HF `tree/main/...` subdirs that resolve back to the aggregated repo, so they added a clickable link without exposing any extra content. Fold them into the existing `LFM2.5` row (which already links to the aggregated repo) and surface both `450M-VL` and `1.6B-VL` in the sizes column. Bump the row's `Capabilities` to `vision` to reflect the VL variants.
Qwen 3.5 (0.8B, 2B) and Bielik v3.0 (1.5B) ship in the registry but the family table didn't list them. Slot Qwen 3.5 next to the other Qwen rows and add Bielik above the LFM2.5 entry.
- `:::danger` made the beta notice read like a stop-ship warning when the intent is just "subject to change"; switch to `:::info`.
- Add NPM/PNPM/YARN tabs for the three peer dependencies so users don't have to assemble the install command by hand.
LFM-2.5 1.2B Instruct is the model the README/docs highlight and what the rest of the example apps use as the default; the LLM playground was the only screen still booting to Llama 3.2 1B SpinQuant.
Description
Refreshes every URL constant to the restructured HF layout under `resolve/v0.9.0` and adds the typed `models` accessor.
URL refresh
All URLs follow `<model>_<size>_<backend>_<precision>.pte`; files sit under per-size and per-backend directories on HF.
- `modelUrls.ts`: every URL rewritten; multi-backend URLs hoisted here so the registry stays declarative. The `lfm2_5_350m_xnnpack_8w4da.pte` typo is corrected to `_8da4w.pte`.
- `ocr/models.ts`, `tts/models.ts`, `tts/voices.ts`: paths updated to the new shape.
- `versions.ts`: `VERSION_TAG` → `resolve/v0.9.0`; `PREVIOUS_VERSION_TAG = resolve/v0.8.0` retained for the `@deprecated` Llama QLoRA aliases.
`models` accessor
New `constants/modelRegistry.ts` exports `models`, a typed accessor grouped one-to-one with hooks:
| Group | Hook |
| --- | --- |
| `llm` | `useLLM` (includes vision-capable LLMs like `lfm2_5_vl_*`) |
| `classification` | `useClassification` |
| `privacy_filter` | `usePrivacyFilter` |
| `object_detection` | `useObjectDetection` |
| `pose_estimation` | `usePoseEstimation` |
| `semantic_segmentation` | `useSemanticSegmentation` |
| `instance_segmentation` | `useInstanceSegmentation` |
| `style_transfer` | `useStyleTransfer` |
| `speech_to_text` | `useSpeechToText` |
| `text_to_speech` | `useTextToSpeech` |
| `text_embedding` | `useTextEmbeddings` |
| `image_embedding` | `useImageEmbeddings` |
| `image_generation` | `useTextToImage` |
| `vad` | `useVAD` |
| `ocr` | `useOCR` / `useVerticalOCR` |
Each entry is a function: call it (optionally with `{ quant, backend }`) to get the resolved config.
- The `backend` parameter is typed to exactly the backends each model ships with: `models.llm.llama3_2_3b({ backend: 'coreml' })` is a compile-time error (xnnpack-only).
- When `{ quant }` is omitted, the accessor resolves to the quantized variant if one is published; pass `{ quant: false }` to opt out.
- `text_to_speech` exposes `kokoro_small` / `kokoro_medium` plus plain voice configs under `voices.*`.
ESLint's `camelcase` rule is relaxed to `properties: 'never'` so the snake_case property keys pass while bindings/functions stay camelCase.
Migration
Individual constant imports (`LLAMA3_2_1B_SPINQUANT`, `KOKORO_MEDIUM`, etc.) still work; the new accessor is the recommended path. The flat `MODEL_REGISTRY = { ALL_MODELS: {...} }` export from `modelUrls.ts` is removed; the internal `getModelNameForUrl` lookup is preserved.
Example apps + docs
- Example apps use `models.*()`. Heavily-used groups are aliased at the top of the file (`const segmentation = models.semantic_segmentation;`).
- Pickers compare by `modelName` to handle accessor-function values.
- `bare-rn` LLM demo switched to LFM-2.5.
- Docs snippets use the `models.<group>.<entry>()` accessor across `03-hooks/**`, `04-typescript-api/**`, `01-fundamentals/**`, etc. The webrtc-integration page intentionally keeps the named-constant style, since the snippet reads cleaner alongside imports from other libraries.
Deprecations
`LLAMA3_2_3B_QLORA`, `LLAMA3_2_1B_QLORA`: `@deprecated`; the .pte files stay at `v0.8.0` and the constants still resolve those URLs. Use `LLAMA3_2_*_SPINQUANT` going forward.
Introduces a breaking change?
URL paths under `${VERSION_TAG}` change: code that hardcoded `resolve/v0.8.0` URLs through the constants keeps working only if it read them at runtime. The flat `MODEL_REGISTRY` export is removed in favour of the new `models` accessor.
Type of change
Tested on
`yarn typecheck` and `yarn lint` clean across the monorepo. Every example app runs against the v0.9.0 HF state.
Testing instructions
In application code:
Related issues
#431
#612
Checklist