Add :safetensors_reader option to load_model/2 #456
Merged
jonatanklosko merged 1 commit on May 18, 2026
Conversation
Allows callers to override the function used to read `.safetensors` parameter files. The override receives a file path and must return a map from tensor name to an `Nx.Tensor` or `Nx.LazyContainer`. Defaults to the existing `&Safetensors.read!(&1, lazy: true)`, so behaviour is unchanged when the option is not supplied. This is a small seam for custom readers — for example, a memory-mapped loader backed by a resource binary, which can keep peak memory bounded to a single tensor when loading very large checkpoints.
jonatanklosko
approved these changes
May 18, 2026
Summary
Adds a `:safetensors_reader` option to `Bumblebee.load_model/2`. When supplied, the override is used instead of `&Safetensors.read!(&1, lazy: true)` to read `.safetensors` parameter files. The override receives a file path and must return a map from tensor name to an `Nx.Tensor` or any term implementing `Nx.LazyContainer`. This is the same shape `Safetensors.read!/2` already returns, so the rest of the loading pipeline (`Bumblebee.Conversion.PyTorchParams.load_params!/4`) is unchanged.

Default behaviour is identical when the option is not supplied.
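The seam described above can be sketched without any dependencies. This is only an illustration of the pattern, not Bumblebee's actual code: `LoaderSketch`, `default_reader/1`, `read_params/2`, and `tracing_reader` are hypothetical names, and plain lists stand in for `Nx.Tensor` values so the snippet runs on its own.

```elixir
defmodule LoaderSketch do
  @moduledoc """
  Hypothetical stand-in for the loading path; not Bumblebee's actual code.
  """

  # Stand-in for `&Safetensors.read!(&1, lazy: true)`. It returns a fixed map
  # from tensor name to a placeholder value (assumption: lists instead of
  # tensors) so the sketch has no dependencies.
  def default_reader(_path) do
    %{"dense.kernel" => [1.0, 2.0], "dense.bias" => [0.0]}
  end

  # Uses the override when `:safetensors_reader` is supplied and falls back
  # to the default reader otherwise.
  def read_params(path, opts \\ []) do
    reader = Keyword.get(opts, :safetensors_reader, &default_reader/1)
    reader.(path)
  end
end

# A custom reader that records the path it was given, then delegates to the
# default reader so the returned map has the same shape.
tracing_reader = fn path ->
  send(self(), {:read, path})
  LoaderSketch.default_reader(path)
end

default_params = LoaderSketch.read_params("model.safetensors")

custom_params =
  LoaderSketch.read_params("model.safetensors", safetensors_reader: tracing_reader)
```

With Bumblebee itself, the call would presumably take the form `Bumblebee.load_model(repo, safetensors_reader: &MyReader.read!/1)`, where `MyReader` is a hypothetical module whose `read!/1` returns the map described above.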
Motivation
This is a small seam that enables custom safetensors readers without forking Bumblebee. The concrete use case is a memory-mapped reader backed by a NIF resource binary (`enif_make_resource_binary`), which keeps peak BEAM memory bounded to a single tensor when loading very large checkpoints: pages are demand-faulted by the OS and freed after the backend transfer, instead of being `pread`-ed and copied into BEAM heap per tensor.

The current `Safetensors.read!/2` lazy path already streams reasonably well, but the per-tensor `File.open` + `pread` + binary copy is a real cost on multi-GB checkpoints, and there's no way to swap it today because the loader is wired directly to the module name at `bumblebee.ex:771`.

The option is scoped to safetensors only. PyTorch pickle isn't a candidate for the same treatment because it requires structural parsing rather than byte-range reads.
Test plan
`mix test test/bumblebee_test.exs --only describe:"load_model/2"` passes locally (5/5).
[...] `.safetensors` files and that the resulting params have the same keys as the default reader.