Safe, zero-friction Rust bindings for Google LiteRT 2.x — on-device ML
inference and LLM text generation. Add a crate to Cargo.toml and run
cargo build. No Bazel, no CMake, no libclang on user machines.
[dependencies]
litert = "0.3"

use litert::{CompilationOptions, CompiledModel, Environment, Model, TensorBuffer};
let env = Environment::new()?;
let model = Model::from_file("mobilenet.tflite")?;
let compiled = CompiledModel::new(env, model, &CompilationOptions::new()?)?;
// ... fill input buffers, compiled.run(...), read outputs ...
# Ok::<(), litert::Error>(())

[dependencies]
litertlm = "0.3"

use litertlm::{Backend, Engine, EngineSettings, SamplerParams};
let engine = Engine::new(
    EngineSettings::new("Qwen3-0.6B.litertlm")
        .backend(Backend::Gpu)
        .max_num_tokens(512),
)?;
// Streaming (token-by-token)
let mut conv = engine.create_conversation(SamplerParams::default().top_p(0.95))?;
conv.send_message_stream("Explain Rust lifetimes", |chunk| {
    print!("{chunk}");
})?;
// Or blocking
let mut session = engine.create_session(SamplerParams::default().top_p(0.95))?;
let response = session.generate("Explain Rust lifetimes")?;
# Ok::<(), litertlm::Error>(())

| Other options | Friction |
|---|---|
| Build LiteRT from source via CMake | Bazel or CMake + protoc + flatc + abseil + Android NDK on your box |
| Invoke via Python (ai-edge-litert) | Python interpreter + wheel dependency graph |
| Hand-roll FFI against TFLite C API | Maintain a sysroot per target + track header drift manually |
litert-rs takes the same upstream runtime binaries Google publishes, pins
each by SHA-256, and downloads them into a user-level cache the first time
cargo build runs. Your app links against that cached libLiteRt.{so,dylib,dll}.

| Crate | What it is | crates.io |
|---|---|---|
| litert | Safe ML inference wrappers (CompiledModel, TensorBuffer, GPU) | 0.3.x |
| litertlm | Safe LLM text generation (Engine, Session, Conversation streaming) | 0.3.x |
| litert-sys | Raw FFI: LiteRT 2.x C API | 0.3.x |
| litert-lm-sys | Raw FFI: LiteRT-LM C engine API | 0.3.x |

| Rust target | CPU | GPU accelerator(s) shipped | Source |
|---|---|---|---|
| aarch64-apple-darwin | ✅ | Metal, WebGPU | litert-lm prebuilt (Git LFS) |
| x86_64-unknown-linux-gnu | ✅ | WebGPU | litert-lm prebuilt (Git LFS) |
| aarch64-unknown-linux-gnu | ✅ | WebGPU | litert-lm prebuilt (Git LFS) |
| x86_64-pc-windows-msvc | ✅ | WebGPU | litert-lm prebuilt (Git LFS) |
| aarch64-linux-android | ✅ | OpenCL/GL (via ClGlAccelerator) | LiteRT Maven AAR |
| x86_64-linux-android | ✅ | OpenCL/GL | LiteRT Maven AAR |
| wasm32-unknown-emscripten | ✅ | None (XNNPACK only; GPU deferred) | LiteRT-rs CMake+emcc build |
| aarch64-apple-ios | ⏳ | None | deferred (no upstream prebuilt) |

litertlm / litert-lm-sys (LLM inference) are desktop/Android only in this
release. WASM support for the LLM stack is on the 0.4.0 roadmap; see
wasm-patches/litert-lm-v0.10.2/.

| Variable | Effect |
|---|---|
| LITERT_LIB_DIR | Directory containing libLiteRt.{so,dylib,dll}. Bypasses the downloader. |
| LITERT_NO_DOWNLOAD | Fail the build if any prebuilt is missing from cache (air-gapped CI). |
| LITERT_CACHE_DIR | Override the cache root. Default: $XDG_CACHE_HOME/litert-sys. |

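To make the precedence between these variables concrete, here is a minimal sketch of how a build script could resolve them. It is not the actual litert-sys logic: the `LibSource` type, the `resolve_lib_source` function, and the `.cache` fallback are assumptions introduced for illustration, and the real default may differ per operating system.

```rust
use std::path::PathBuf;

/// Hypothetical outcome of resolving where libLiteRt should come from.
#[derive(Debug, PartialEq)]
enum LibSource {
    /// LITERT_LIB_DIR was set: use that directory and skip the downloader.
    UserDir(PathBuf),
    /// Use the cache root; `allow_download` is false when LITERT_NO_DOWNLOAD is set.
    Cache { root: PathBuf, allow_download: bool },
}

/// Resolve the library source from an environment lookup.
/// `lookup` is injected so the logic can run without touching the real environment.
fn resolve_lib_source(lookup: impl Fn(&str) -> Option<String>) -> LibSource {
    // LITERT_LIB_DIR wins over everything else.
    if let Some(dir) = lookup("LITERT_LIB_DIR") {
        return LibSource::UserDir(PathBuf::from(dir));
    }
    let root = lookup("LITERT_CACHE_DIR")
        .map(PathBuf::from)
        .or_else(|| lookup("XDG_CACHE_HOME").map(|x| PathBuf::from(x).join("litert-sys")))
        // Assumption: fallback used only when neither variable is set.
        .unwrap_or_else(|| PathBuf::from(".cache").join("litert-sys"));
    let allow_download = lookup("LITERT_NO_DOWNLOAD").is_none();
    LibSource::Cache { root, allow_download }
}

fn main() {
    // Read the real process environment and show the decision.
    let source = resolve_lib_source(|key| std::env::var(key).ok());
    println!("{source:?}");
}
```

Passing the lookup as a closure keeps the decision logic testable with a fake environment.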
The prebuilt libLiteRt.dylib Google ships has install_name=@rpath/libLiteRt.dylib
and wasn't linked with -headerpad_max_install_names, so we can't rewrite that
identifier to an absolute path post-download. litert-sys' build script emits
an -rpath flag for its own tests and examples, but Cargo's rustc-link-arg
does not propagate to downstream consumer binaries. Without action, the
binaries your crate produces on macOS will fail at launch with:
dyld: Library not loaded: @rpath/libLiteRt.dylib
Fix it once per downstream crate — add this tiny build.rs
next to your Cargo.toml:
// build.rs
fn main() {
    // `litert-sys` declares `links = "LiteRt"` and publishes its cache
    // directory as `DEP_LITERT_LIB_DIR`. Embedding it as an rpath makes
    // dyld find libLiteRt.dylib without DYLD_LIBRARY_PATH.
    if let Ok(dir) = std::env::var("DEP_LITERT_LIB_DIR") {
        println!("cargo:rustc-link-arg=-Wl,-rpath,{dir}");
    }
}

Alternatively, prefix individual invocations with
DYLD_LIBRARY_PATH=$(cargo xtask cache-dir), or link with
RUSTFLAGS="-C link-arg=-Wl,-rpath,/path/to/cache".
Linux, Windows, and Android are unaffected.
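Since only macOS needs the rpath, a downstream build.rs can gate the flag on the target OS. The following is a sketch under that assumption, not the project's recommended script: `rpath_directive` is a hypothetical helper, while `CARGO_CFG_TARGET_OS` is the variable Cargo sets for build scripts to describe the target (not the host).

```rust
// build.rs (sketch)
use std::env;

/// Build the Cargo directive that embeds `lib_dir` as an rpath,
/// but only when the target OS is macOS. Returns None otherwise.
fn rpath_directive(target_os: Option<&str>, lib_dir: Option<&str>) -> Option<String> {
    match (target_os, lib_dir) {
        (Some("macos"), Some(dir)) => Some(format!("cargo:rustc-link-arg=-Wl,-rpath,{dir}")),
        _ => None,
    }
}

fn main() {
    // Both variables are provided by Cargo when this runs as a build script;
    // DEP_LITERT_LIB_DIR comes from litert-sys as described above.
    let target_os = env::var("CARGO_CFG_TARGET_OS").ok();
    let lib_dir = env::var("DEP_LITERT_LIB_DIR").ok();
    if let Some(line) = rpath_directive(target_os.as_deref(), lib_dir.as_deref()) {
        println!("{line}");
    }
}
```

Checking the target rather than the host keeps the behavior correct when cross-compiling.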
litert and litert-sys cross-compile to wasm32-unknown-emscripten for
in-browser ML inference, server-side WASM (Cloudflare Workers, wasmtime),
or Node.js. The runtime is TFLite + XNNPACK CPU kernels, statically
linked into the produced .wasm.
- emsdk ≥ 5.0.7. Install via
  git clone https://github.com/emscripten-core/emsdk && cd emsdk && ./emsdk install latest && ./emsdk activate latest.
- rustup target add wasm32-unknown-emscripten.
- source $EMSDK/emsdk_env.sh before each build session.

source $EMSDK/emsdk_env.sh
# NODERAWFS=1 lets the WASM module read host files (model.tflite) under
# Node/wasmtime. ALLOW_MEMORY_GROWTH=1 lets the heap grow past 16 MB so
# larger models load. Drop both for a browser bundle (use --preload-file
# or fetch+MEMFS instead).
RUSTFLAGS="-C link-arg=-sNODERAWFS=1 -C link-arg=-sALLOW_MEMORY_GROWTH=1" \
cargo build -p litert --example add_wasm \
  --target wasm32-unknown-emscripten --release

Output (target/wasm32-unknown-emscripten/release/examples/):

- add_wasm.wasm: ~5–12 MB WebAssembly module (12 MB debug, smaller in release with -Oz).
- add_wasm.js: emscripten JS shim that knows how to instantiate the .wasm.

node target/wasm32-unknown-emscripten/release/examples/add_wasm.js
# add_10x10.tflite — WASM CPU inference
# first 5 outputs: [100.0, 102.0, 104.0, 106.0, 108.0]
# last 5 outputs: [290.0, 292.0, 294.0, 296.0, 298.0]

Drop NODERAWFS=1 (no host filesystem in browsers). Embed the model with
emcc's --preload-file model.tflite (bundles into a .data sidecar), or
fetch() it from JS and write to MEMFS before calling Model::from_file,
or use Model::from_bytes with a Vec<u8> you fetch()'d. Then drop the
.wasm + .js into a static page:
<script src="add_wasm.js"></script>

While the v0.3.0 prebuilt tarball is being staged, build the static
archives locally and point litert-sys at them via LITERT_LIB_DIR:
# 1. Clone + patch upstream LiteRT
git clone --depth=1 --branch=v2.1.4 \
https://github.com/google-ai-edge/LiteRT.git /tmp/litert
cd /tmp/litert
git apply $LITERT_RS_DIR/wasm-patches/litert-v2.1.4/01-cmake-emscripten-support.patch
# 2. Cross-compile via CMake + emcc
emcmake cmake -S litert -B litert/build-wasm \
-DCMAKE_BUILD_TYPE=Release \
-DLITERT_ENABLE_GPU=OFF -DLITERT_ENABLE_NPU=OFF \
-DLITERT_DISABLE_KLEIDIAI=ON -DLITERT_BUILD_TESTS=OFF \
-DTFLITE_ENABLE_GPU=OFF
emmake cmake --build litert/build-wasm \
--target litert_runtime_c_api_shared_lib -j
# 3. Flatten archives into a single dir for litert-sys
mkdir -p /tmp/litert-wasm-libs
find litert/build-wasm \( -name "lib*.a" ! -path "*/testdata/*" ! -name "input.a" \) \
-exec cp {} /tmp/litert-wasm-libs/ \;
# 4. Build
cd $LITERT_RS_DIR
LITERT_LIB_DIR=/tmp/litert-wasm-libs \
RUSTFLAGS="-C link-arg=-sNODERAWFS=1 -C link-arg=-sALLOW_MEMORY_GROWTH=1" \
  cargo build -p litert --example add_wasm --target wasm32-unknown-emscripten

Once the build-litert-wasm.yml
GitHub Actions workflow has uploaded a SHA-pinned tarball, cargo build
will download and verify it automatically — no LITERT_LIB_DIR, no emsdk
setup needed for end users on the WASM target.

- CPU only. WebGPU acceleration is on the 0.4.0 roadmap.
- No set_global_log_severity. The WASM build doesn't export the logger-control symbols, so the call returns Error::Unsupported and LiteRT logs at default verbosity.
- No LLM stack. litertlm / litert-lm-sys need separate fork patches to LiteRT-LM (orchestrator + transitive C++ deps); this is a 0.4.0 milestone.

End users only need cargo. The sections below are for contributors who want
to regenerate bindings or build for foreign targets locally.

- rustup with stable + any target triples you want to exercise.
- A container engine for foreign-target builds: Docker or Podman.
  - macOS: brew install podman && podman machine init && podman machine start.
  - Linux: your distro's Docker/Podman packages.
- cross ≥ 0.2.5: cargo install cross --locked.
  - If you're using Podman: export CROSS_CONTAINER_ENGINE=podman.

macOS and Windows target toolchains run natively, not through cross.
cross is only invoked for Linux + Android targets.
All cross-target chores flow through a single xtask binary.
cargo xtask targets # list every supported Rust target triple
cargo xtask regen-bindings # rebuild litert-sys bindings for every target
cargo xtask regen-bindings --target aarch64-apple-darwin # single target
cargo xtask build-all # cross-build the workspace for every target

regen-bindings dispatches automatically:

- Host target → native cargo build -p litert-sys --features generate-bindings.
- Foreign target → cross build …, which runs bindgen inside a container image that already has libclang + the target sysroot installed (see Cross.toml).

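The dispatch rule can be sketched as a small pure function. This follows the two bullets above literally and is not the xtask source: `Runner` and `pick_runner` are hypothetical names, the host triple is hard-coded for illustration, and the real tool may treat macOS and Windows targets differently.

```rust
/// Hypothetical choice of tool for one binding-regeneration run.
#[derive(Debug, PartialEq)]
enum Runner {
    /// Target equals the host triple: plain `cargo build` on the host.
    NativeCargo,
    /// Any other triple: `cross build` inside a container image.
    Cross,
}

/// Pick the runner by comparing the requested target with the host triple.
fn pick_runner(host: &str, target: &str) -> Runner {
    if host == target {
        Runner::NativeCargo
    } else {
        Runner::Cross
    }
}

fn main() {
    // Assumed host triple for illustration; a real tool would detect it.
    let host = "aarch64-apple-darwin";
    for target in [
        "aarch64-apple-darwin",
        "aarch64-unknown-linux-gnu",
        "aarch64-linux-android",
    ] {
        println!("{target}: {:?}", pick_runner(host, target));
    }
}
```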
.github/workflows/ci.yml runs three matrices on
every push and PR:

- native: macOS arm64 (tests + build), Linux x86_64 (tests + fmt --check + clippy -D warnings), Windows x86_64 (build).
- cross: Linux arm64, Android arm64, Android x86_64 (build only).
- bindings-drift: regenerates all 6 target binding files via cargo xtask regen-bindings and fails on any git diff. If drift is detected, the regenerated files are uploaded as a build artifact so they can be inspected or accepted in a PR.

Drift is the authoritative check: if the CI-generated bindings for a target differ from what's committed, that's the signal to update the committed file.
On the first cargo build for a given target, litert-sys/build.rs emits a
one-time warning while it fetches the pinned prebuilt libraries:
warning: litert-sys: downloading 4 file(s) of LiteRT prebuilt v0.10.2
for target `macos_arm64` into
/Users/you/Library/Caches/litert-sys/v0.10.2/aarch64-apple-darwin
(first build only)
Subsequent builds hit the cache (with a size check; a SHA-256 verified marker file short-circuits rehashing). Deleting the cache directory or bumping the pinned upstream version triggers a fresh download + re-verification.
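The cache-hit path described above (a cheap size check plus a marker file that records an earlier successful SHA-256 verification) could look roughly like this. It is a sketch with assumptions: the marker file name, `is_cache_hit`, and `mark_verified` are hypothetical, and the hashing step itself is omitted.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical marker file name; the real build script may use a different one.
const MARKER: &str = ".sha256-verified";

/// Returns true when `file` can be reused without rehashing: it exists,
/// has the pinned size, and a marker records an earlier successful hash check.
fn is_cache_hit(dir: &Path, file: &str, expected_size: u64) -> bool {
    let size_ok = fs::metadata(dir.join(file))
        .map(|m| m.is_file() && m.len() == expected_size)
        .unwrap_or(false);
    size_ok && dir.join(MARKER).is_file()
}

/// Record that verification succeeded. Call only after the full hash check passes.
fn mark_verified(dir: &Path) -> io::Result<()> {
    fs::write(dir.join(MARKER), b"ok")
}

fn main() -> io::Result<()> {
    // Use a throwaway directory so the sketch never touches a real cache.
    let dir = std::env::temp_dir().join(format!("litert-cache-sketch-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let payload = b"fake library bytes";
    fs::write(dir.join("libLiteRt.so"), payload)?;
    let size = payload.len() as u64;

    println!("before marker: {}", is_cache_hit(&dir, "libLiteRt.so", size));
    mark_verified(&dir)?;
    println!("after marker: {}", is_cache_hit(&dir, "libLiteRt.so", size));
    println!("wrong size: {}", is_cache_hit(&dir, "libLiteRt.so", size + 1));

    fs::remove_dir_all(&dir)
}
```

Deleting the directory removes the marker along with the files, which matches the behavior above where a wiped cache forces a fresh download and re-verification.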
# 1. Add the triple to:
# Cargo.toml workspace members (if needed)
# Cross.toml (pre-build apt-get for libclang)
# litert-sys/build.rs target_spec() — with pinned checksums
# xtask/src/main.rs TARGETS
#
# 2. Regenerate:
cargo xtask regen-bindings --target <new-target>
#
# 3. Commit litert-sys/src/bindings/<new-target>.rs and push.

This project is a binding, not a fork. The runtime that does the work (model loading, graph compilation, kernel execution, GPU/NPU delegation) is Google's LiteRT and LiteRT-LM:

- LiteRT (Apache-2.0, © 2024–2026 Google LLC): C API headers vendored under third_party/litert-v2.1.4/, source: https://github.com/google-ai-edge/LiteRT.
- LiteRT-LM (Apache-2.0, © 2024–2026 Google LLC): source of the prebuilt libLiteRt.* + accelerator plugins we download: https://github.com/google-ai-edge/litert-lm.
- TensorFlow Lite, XNNPACK, abseil-cpp, flatbuffers, protobuf and the rest of the transitive open-source stack that LiteRT itself is built on.

See NOTICE for the full attribution and third_party/litert-v2.1.4/LICENSE for the upstream LiteRT license text.
Licensed under the Apache License, Version 2.0. By contributing you agree that your contribution is licensed under the same terms.