LiteRT-rs

Safe, zero-friction Rust bindings for Google LiteRT 2.x — on-device ML inference and LLM text generation. Add a crate to Cargo.toml and cargo build. No Bazel, no CMake, no libclang on user machines.

ML inference (`litert`)

[dependencies]
litert = "0.3"

use litert::{CompilationOptions, CompiledModel, Environment, Model, TensorBuffer};

let env = Environment::new()?;
let model = Model::from_file("mobilenet.tflite")?;
let compiled = CompiledModel::new(env, model, &CompilationOptions::new()?)?;
// ... fill input buffers, compiled.run(...), read outputs ...
# Ok::<(), litert::Error>(())

LLM text generation (`litertlm`)

[dependencies]
litertlm = "0.3"

use litertlm::{Backend, Engine, EngineSettings, SamplerParams};

let engine = Engine::new(
    EngineSettings::new("Qwen3-0.6B.litertlm")
        .backend(Backend::Gpu)
        .max_num_tokens(512),
)?;

// Streaming (token-by-token)
let mut conv = engine.create_conversation(SamplerParams::default().top_p(0.95))?;
conv.send_message_stream("Explain Rust lifetimes", |chunk| {
    print!("{chunk}");
})?;

// Or blocking
let mut session = engine.create_session(SamplerParams::default().top_p(0.95))?;
let response = session.generate("Explain Rust lifetimes")?;
# Ok::<(), litertlm::Error>(())

Why

Other options	Friction
Build LiteRT from source via CMake	Bazel or CMake + protoc + flatc + abseil + Android NDK on your box
Invoke via Python (`ai-edge-litert`)	Python interpreter + wheel dependency graph
Hand-roll FFI against TFLite C API	Maintain a sysroot per target + track header drift manually

litert-rs takes the same upstream runtime binaries Google publishes, pins each by SHA-256, and downloads them into a user-level cache the first time cargo build runs. Your app links against that cached libLiteRt.{so,dylib,dll}.

Crates

Crate	What it is	crates.io
`litert`	Safe ML inference wrappers (CompiledModel, TensorBuffer, GPU)	0.3.x
`litertlm`	Safe LLM text generation (Engine, Session, Conversation streaming)	0.3.x
`litert-sys`	Raw FFI — LiteRT 2.x C API	0.3.x
`litert-lm-sys`	Raw FFI — LiteRT-LM C engine API	0.3.x

Platform support

Rust target	CPU	GPU accelerator(s) shipped	Source
`aarch64-apple-darwin`	✅	Metal, WebGPU	litert-lm prebuilt (Git LFS)
`x86_64-unknown-linux-gnu`	✅	WebGPU	litert-lm prebuilt (Git LFS)
`aarch64-unknown-linux-gnu`	✅	WebGPU	litert-lm prebuilt (Git LFS)
`x86_64-pc-windows-msvc`	✅	WebGPU	litert-lm prebuilt (Git LFS)
`aarch64-linux-android`	✅	OpenCL/GL (via `ClGlAccelerator`)	LiteRT Maven AAR
`x86_64-linux-android`	✅	OpenCL/GL	LiteRT Maven AAR
`wasm32-unknown-emscripten`	✅	— (XNNPACK only; GPU deferred)	LiteRT-rs CMake+emcc build
`aarch64-apple-ios`	⏳	—	deferred (no upstream prebuilt)

litertlm / litert-lm-sys (LLM inference) are desktop/Android only this release. WASM support for the LLM stack is on the 0.4.0 roadmap — see wasm-patches/litert-lm-v0.10.2/.

Environment variables (escape hatches)

Variable	Effect
`LITERT_LIB_DIR`	Directory containing `libLiteRt.{so,dylib,dll}`. Bypasses the downloader.
`LITERT_NO_DOWNLOAD`	Fail the build if any prebuilt is missing from cache (air-gapped CI).
`LITERT_CACHE_DIR`	Override the cache root. Default: `$XDG_CACHE_HOME/litert-sys`.

macOS downstream binaries (one-time setup)

The prebuilt libLiteRt.dylib Google ships has install_name=@rpath/libLiteRt.dylib and wasn't linked with -headerpad_max_install_names, so we can't rewrite that identifier to an absolute path post-download. litert-sys' build script emits an -rpath flag for its own tests and examples, but Cargo's rustc-link-arg does not propagate to downstream consumer binaries. Without action, the binaries your crate produces on macOS will fail at launch with:

dyld: Library not loaded: @rpath/libLiteRt.dylib

Fix it once per downstream crate — add this tiny build.rs next to your Cargo.toml:

// build.rs
fn main() {
    // `litert-sys` declares `links = "LiteRt"` and publishes its cache
    // directory as `DEP_LITERT_LIB_DIR`. Embedding it as an rpath makes
    // dyld find libLiteRt.dylib without DYLD_LIBRARY_PATH.
    if let Ok(dir) = std::env::var("DEP_LITERT_LIB_DIR") {
        println!("cargo:rustc-link-arg=-Wl,-rpath,{dir}");
    }
}

Alternatively, prefix individual invocations with DYLD_LIBRARY_PATH=$(cargo xtask cache-dir), or link with RUSTFLAGS="-C link-arg=-Wl,-rpath,/path/to/cache".

Linux, Windows, and Android are unaffected.

WebAssembly (browser + Node.js + wasmtime)

litert and litert-sys cross-compile to wasm32-unknown-emscripten for in-browser ML inference, server-side WASM (Cloudflare Workers, wasmtime), or Node.js. The runtime is TFLite + XNNPACK CPU kernels, statically linked into the produced .wasm.

Prerequisites

emsdk ≥ 5.0.7. Install via git clone https://github.com/emscripten-core/emsdk && cd emsdk && ./emsdk install latest && ./emsdk activate latest.
rustup target add wasm32-unknown-emscripten.
source $EMSDK/emsdk_env.sh before each build session.

Build

source $EMSDK/emsdk_env.sh

# NODERAWFS=1 lets the WASM module read host files (model.tflite) under
# Node/wasmtime. ALLOW_MEMORY_GROWTH=1 lets the heap grow past 16 MB so
# larger models load. Drop both for a browser bundle (use --preload-file
# or fetch+MEMFS instead).
RUSTFLAGS="-C link-arg=-sNODERAWFS=1 -C link-arg=-sALLOW_MEMORY_GROWTH=1" \
  cargo build -p litert --example add_wasm \
              --target wasm32-unknown-emscripten --release

Output (target/wasm32-unknown-emscripten/release/examples/):

add_wasm.wasm — ~5–12 MB WebAssembly module (12 MB debug, smaller in release with -Oz).
add_wasm.js — emscripten JS shim that knows how to instantiate the .wasm.

Run (Node.js)

node target/wasm32-unknown-emscripten/release/examples/add_wasm.js
# add_10x10.tflite — WASM CPU inference
# first 5 outputs: [100.0, 102.0, 104.0, 106.0, 108.0]
# last 5 outputs:  [290.0, 292.0, 294.0, 296.0, 298.0]

Browser

Drop NODERAWFS=1 (no host filesystem in browsers). Embed the model with emcc's --preload-file model.tflite (bundles into a .data sidecar), or fetch() it from JS and write to MEMFS before calling Model::from_file, or use Model::from_bytes with a Vec<u8> you fetch()'d. Then drop the .wasm + .js into a static page:

<script src="add_wasm.js"></script>

Building from source (no published artifact yet)

While the v0.3.0 prebuilt tarball is being staged, build the static archives locally and point litert-sys at them via LITERT_LIB_DIR:

# 1. Clone + patch upstream LiteRT
git clone --depth=1 --branch=v2.1.4 \
    https://github.com/google-ai-edge/LiteRT.git /tmp/litert
cd /tmp/litert
git apply $LITERT_RS_DIR/wasm-patches/litert-v2.1.4/01-cmake-emscripten-support.patch

# 2. Cross-compile via CMake + emcc
emcmake cmake -S litert -B litert/build-wasm \
    -DCMAKE_BUILD_TYPE=Release \
    -DLITERT_ENABLE_GPU=OFF -DLITERT_ENABLE_NPU=OFF \
    -DLITERT_DISABLE_KLEIDIAI=ON -DLITERT_BUILD_TESTS=OFF \
    -DTFLITE_ENABLE_GPU=OFF
emmake cmake --build litert/build-wasm \
    --target litert_runtime_c_api_shared_lib -j

# 3. Flatten archives into a single dir for litert-sys
mkdir -p /tmp/litert-wasm-libs
find litert/build-wasm \( -name "lib*.a" ! -path "*/testdata/*" ! -name "input.a" \) \
    -exec cp {} /tmp/litert-wasm-libs/ \;

# 4. Build
cd $LITERT_RS_DIR
LITERT_LIB_DIR=/tmp/litert-wasm-libs \
RUSTFLAGS="-C link-arg=-sNODERAWFS=1 -C link-arg=-sALLOW_MEMORY_GROWTH=1" \
  cargo build -p litert --example add_wasm --target wasm32-unknown-emscripten

Once the build-litert-wasm.yml GitHub Actions workflow has uploaded a SHA-pinned tarball, cargo build will download and verify it automatically — no LITERT_LIB_DIR, no emsdk setup needed for end users on the WASM target.

Limitations (v0.3.0)

CPU only. WebGPU acceleration is on the 0.4.0 roadmap.
No set_global_log_severity. The WASM build doesn't export the logger-control symbols. Returns Error::Unsupported; LiteRT logs at default verbosity.
No LLM stack. litertlm / litert-lm-sys need separate fork patches to LiteRT-LM (orchestrator + transitive C++ deps); 0.4.0 milestone.

Cross-platform development

End users only need cargo. The sections below are for contributors who want to regenerate bindings or build for foreign targets locally.

Tooling prerequisites (contributors only)

rustup with stable + any target triples you want to exercise.
A container engine for foreign-target builds: Docker or Podman.
- macOS: brew install podman && podman machine init && podman machine start.
- Linux: your distro's Docker/Podman packages.
cross ≥ 0.2.5: cargo install cross --locked.
If you're using Podman: export CROSS_CONTAINER_ENGINE=podman.

macOS and Windows target toolchains run natively, not through cross. cross is only invoked for Linux + Android targets.

Workspace automation

All cross-target chores flow through a single xtask binary.

cargo xtask targets            # list every supported Rust target triple
cargo xtask regen-bindings     # rebuild litert-sys bindings for every target
cargo xtask regen-bindings --target aarch64-apple-darwin   # single target
cargo xtask build-all          # cross-build the workspace for every target

regen-bindings dispatches automatically:

Host target → native cargo build -p litert-sys --features generate-bindings.
Foreign target → cross build …, which runs bindgen inside a container image that already has libclang + the target sysroot installed (see Cross.toml).

CI

.github/workflows/ci.yml runs three matrices on every push and PR:

native — macOS arm64 (tests + build), Linux x86_64 (tests + fmt --check + clippy -D warnings), Windows x86_64 (build).
cross — Linux arm64, Android arm64, Android x86_64 (build only).
bindings-drift — regenerates all 6 target binding files via cargo xtask regen-bindings and fails on any git diff. If drift is detected the regenerated files are uploaded as a build artifact so they can be inspected or accepted in a PR.

Drift is the authoritative check: if the CI-generated bindings for a target differ from what's committed, that's the signal to update the committed file.

First-build download

On the first cargo build for a given target, litert-sys/build.rs emits a one-time warning while it fetches the pinned prebuilt libraries:

warning: litert-sys: downloading 4 file(s) of LiteRT prebuilt v0.10.2
         for target `macos_arm64` into
         /Users/you/Library/Caches/litert-sys/v0.10.2/aarch64-apple-darwin
         (first build only)

Subsequent builds hit the cache (with a size check; a SHA-256 verified marker file short-circuits rehashing). Deleting the cache directory or bumping the pinned upstream version triggers a fresh download + re-verification.

Regenerating bindings for a new target

# 1. Add the triple to:
#      Cargo.toml workspace members (if needed)
#      Cross.toml (pre-build apt-get for libclang)
#      litert-sys/build.rs target_spec()  — with pinned checksums
#      xtask/src/main.rs TARGETS
#
# 2. Regenerate:
cargo xtask regen-bindings --target <new-target>
#
# 3. Commit litert-sys/src/bindings/<new-target>.rs and push.

Credits

This project is a binding, not a fork. The runtime that does the work — model loading, graph compilation, kernel execution, GPU/NPU delegation — is Google's LiteRT and LiteRT-LM:

LiteRT (Apache-2.0, © 2024–2026 Google LLC) — C API headers vendored under third_party/litert-v2.1.4/, source: https://github.com/google-ai-edge/LiteRT.
LiteRT-LM (Apache-2.0, © 2024–2026 Google LLC) — source of the prebuilt libLiteRt.* + accelerator plugins we download: https://github.com/google-ai-edge/litert-lm.
TensorFlow Lite, XNNPACK, abseil-cpp, flatbuffers, protobuf and the rest of the transitive open-source stack that LiteRT itself is built on.

See NOTICE for the full attribution and third_party/litert-v2.1.4/LICENSE for the upstream LiteRT license text.

License

Licensed under the Apache License, Version 2.0. By contributing you agree that your contribution is licensed under the same terms.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LiteRT-rs

ML inference (`litert`)

LLM text generation (`litertlm`)

Why

Crates

Platform support

Environment variables (escape hatches)

macOS downstream binaries (one-time setup)

WebAssembly (browser + Node.js + wasmtime)

Prerequisites

Build

Run (Node.js)

Browser

Building from source (no published artifact yet)

Limitations (v0.3.0)

Cross-platform development

Tooling prerequisites (contributors only)

Workspace automation

CI

First-build download

Regenerating bindings for a new target

Credits

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 70 Commits
.cargo		.cargo
.github/workflows		.github/workflows
litert-lm-sys		litert-lm-sys
litert-lm		litert-lm
litert-sys		litert-sys
litert		litert
wasm-patches		wasm-patches
xtask		xtask
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
Cargo.toml		Cargo.toml
Cross.toml		Cross.toml
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

LiteRT-rs

ML inference (litert)

LLM text generation (litertlm)

Why

Crates

Platform support

Environment variables (escape hatches)

macOS downstream binaries (one-time setup)

WebAssembly (browser + Node.js + wasmtime)

Prerequisites

Build

Run (Node.js)

Browser

Building from source (no published artifact yet)

Limitations (v0.3.0)

Cross-platform development

Tooling prerequisites (contributors only)

Workspace automation

CI

First-build download

Regenerating bindings for a new target

Credits

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

ML inference (`litert`)

LLM text generation (`litertlm`)

Packages