leps_localizer: --ios-bundle should emit a runtime-loadable ZIP (compiled .mlmodelc, assetID-wrapped) #77

@mihow

What's wrong

research/leps_localizer/scripts/export_classifier.py --ios-bundle currently emits a staging directory containing:

<model-id>.mlpackage/
<model-id>-category-map.json
<model-id>.model-info.json

The downstream consumer (LepsAI iOS, mihow/LepsAI) downloads ZIPs from s3://ami-models/mobile/<assetID>.zip at runtime via its AssetManager and tries to load <assetID>.mlmodelc directly with MLModel(contentsOf:). That call expects a compiled .mlmodelc, which comes from Apple's toolchain (Xcode at build time, or xcrun coremlc compile), not the raw .mlpackage the exporter emits. So shipping the staging dir as-is breaks silently: the ZIP extracts cleanly, but the model file the app looks up is missing.

This was hit during the 2026-05-08 ship of global-butterflies-resnet50-512 to LepsAI. Worked around by manually compiling on a macOS VM (xcrun coremlc compile in.mlpackage outdir/) and re-zipping.

What the consumer expects

Match the working na-butterflies-v3.zip layout exactly. LepsAI/Services/AssetDownloader.swift::flattenIfNeeded looks for a single top-level dir matching assetID:

<assetID>/
  <assetID>.mlmodelc/
    weights/weight.bin
    coremldata.bin
    metadata.json
    model.mil
    analytics/coremldata.bin
  <assetID>-category-map.json
  <assetID>.model-info.json
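
The layout above can be checked mechanically before uploading. The following is a minimal sketch, not part of the exporter or of LepsAI: check_ios_zip_layout is a hypothetical helper name, and it only verifies what this issue describes (a single top-level dir named after the asset ID, a compiled .mlmodelc inside it, and the two JSON sidecars). It does not reproduce whatever else flattenIfNeeded may tolerate.

```python
import zipfile


def check_ios_zip_layout(zip_path, asset_id):
    """Return a list of layout problems; an empty list means the ZIP matches."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()

    problems = []

    # The consumer looks for exactly one top-level directory named <assetID>.
    top_level = {name.split("/", 1)[0] for name in names if name}
    if top_level != {asset_id}:
        problems.append(
            f"expected single top-level dir {asset_id!r}, found {sorted(top_level)}"
        )

    prefix = f"{asset_id}/"

    # At least one entry must live under the compiled model directory.
    compiled_dir = f"{prefix}{asset_id}.mlmodelc/"
    if not any(name.startswith(compiled_dir) and name != compiled_dir for name in names):
        problems.append(f"no files under {compiled_dir}")

    # Both JSON sidecars must sit next to the compiled model.
    for sidecar in (f"{asset_id}-category-map.json", f"{asset_id}.model-info.json"):
        if prefix + sidecar not in names:
            problems.append(f"missing {prefix}{sidecar}")

    # An uncompiled package is the failure mode described in this issue.
    if any(".mlpackage/" in name for name in names):
        problems.append("contains an uncompiled .mlpackage")

    return problems
```

Running it against a ZIP of the current staging dir should report the missing top-level dir and the uncompiled .mlpackage, while a ZIP built to the layout above should return an empty list.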

Suggested fix

Add a follow-on step to --ios-bundle (or a new --ios-zip flag) that:

  1. Compiles the .mlpackage to .mlmodelc via xcrun coremlc compile (must run on macOS: gate behind platform.system() == "Darwin" and surface a clear error otherwise, since the compile is Apple-toolchain-only).
  2. Wraps the three artifacts (.mlmodelc/, -category-map.json, .model-info.json) in a parent dir named <model-id>/.
  3. ZIPs that parent dir to <out_dir>/<model-id>.zip.
  4. Prints the sha256 + size for the catalog entry on the consumer side.
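
The platform gate in step 1 could be factored into a small preflight check so that the error is clear both off macOS and on a Mac without Xcode. This is only a sketch: ios_zip_preflight_error is a hypothetical name, and the system and which parameters exist purely so the check can be exercised without a Mac.

```python
import platform
import shutil


def ios_zip_preflight_error(system=None, which=shutil.which):
    """Return an error message if coremlc cannot run here, else None."""
    system = system or platform.system()
    # coremlc is Apple-toolchain-only, so anything but macOS is a hard stop.
    if system != "Darwin":
        return f"--ios-zip requires macOS (xcrun coremlc compile); running on {system}"
    # On macOS, xcrun still has to be reachable for the compile step.
    if which("xcrun") is None:
        return "--ios-zip needs 'xcrun' on PATH (is Xcode installed?)"
    return None
```

The exporter could call this once, before doing any work, and pass a non-None result to SystemExit.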

Sketch:

import hashlib
import platform
import shutil
import subprocess
import zipfile

if args.ios_bundle and args.ios_zip:
    # coremlc ships only with Apple's toolchain, so fail fast elsewhere.
    if platform.system() != "Darwin":
        raise SystemExit("--ios-zip requires macOS (xcrun coremlc compile)")
    # Compile <model-id>.mlpackage into _compile/<model-id>.mlmodelc.
    mlmodelc_parent = out_dir / "_compile"
    mlmodelc_parent.mkdir(exist_ok=True)
    subprocess.run(
        ["xcrun", "coremlc", "compile", str(mlpackage_path), str(mlmodelc_parent)],
        check=True,
    )
    # Assemble the <model-id>/ wrapper directory the consumer expects.
    pkg_dir = out_dir / args.model_id
    if pkg_dir.exists():
        shutil.rmtree(pkg_dir)
    pkg_dir.mkdir()
    shutil.move(str(mlmodelc_parent / f"{args.model_id}.mlmodelc"), str(pkg_dir))
    shutil.copy2(cat_map_path, pkg_dir)
    shutil.copy2(info_path, pkg_dir)
    # Archive paths are relative to out_dir, so every entry sits under <model-id>/.
    zip_path = out_dir / f"{args.model_id}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in sorted(pkg_dir.rglob("*")):
            zf.write(p, p.relative_to(out_dir))
    # Report the values needed for the consumer-side catalog entry.
    sha = hashlib.sha256(zip_path.read_bytes()).hexdigest()
    print(f"ios-zip -> {zip_path}  sha256={sha}  size={zip_path.stat().st_size}")
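
For step 4, the hash can also be computed in chunks so a large model ZIP is never read into memory at once. A minimal sketch follows; catalog_fields is a hypothetical helper, and the key names (assetID, sha256, sizeBytes) are assumptions, since the consumer's catalog schema is not shown in this issue.

```python
import hashlib
from pathlib import Path


def catalog_fields(zip_path, chunk_size=1 << 20):
    """Return the sha256 and size of a ZIP, hashing it chunk by chunk."""
    zip_path = Path(zip_path)
    digest = hashlib.sha256()
    with zip_path.open("rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "assetID": zip_path.stem,  # assumed key name; <model-id>.zip -> <model-id>
        "sha256": digest.hexdigest(),
        "sizeBytes": zip_path.stat().st_size,  # assumed key name
    }
```

Printing the result with json.dumps(catalog_fields(zip_path), indent=2) would give something that can be pasted into the catalog after renaming the keys to match the real schema.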

Alternative (cross-platform): add a sibling shell helper scripts/pack_ios_zip.sh that runs on a macOS VM/host and consumes the staging dir, so the Linux trainer doesn't need to think about platform gates.

Why it's worth doing

The packaging step is currently tribal knowledge — the failure mode (silent on-device skip) is hard to diagnose without inspecting na-butterflies-v3.zip for comparison. Baking it into the exporter avoids the next person re-discovering it, and gives a stable artifact to upload.

References

  • Working pattern: https://object-arbutus.cloud.computecanada.ca/ami-models/mobile/na-butterflies-v3.zip (inspect with `unzip -l`).
  • Consumer code: mihow/LepsAI LepsAI/Services/AssetDownloader.swift (extraction) and LepsAI/Services/AssetManager.swift (lookup).
  • Script that hit the bug: `research/leps_localizer/scripts/export_classifier.py` line ~621 (the `--ios-bundle` block).
