Resolve AppKit and Agent Skills versions from compatibility manifest#5139

Merged
simonfaltum merged 10 commits into main from pkosiec/appkit-version-pinning
May 12, 2026

Member

@pkosiec pkosiec commented Apr 30, 2026

Summary

Introduces a CLI compatibility manifest (internal/build/cli-compat.json) that maps CLI versions to compatible AppKit template and Agent Skills versions. This enables template updates to reach users without CLI releases.

Manifest format

The manifest is purely range-based — each versioned entry defines a range floor that applies to that CLI version and all versions above it, up to the next entry. The manifest should be sparse: only add a new entry when a compatibility boundary changes (e.g., new AppKit templates require specific CLI features).
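
As an illustration of the sparse, range-floor layout described above, here is a minimal Go sketch that parses and validates such a manifest. The JSON field names (`appkit`, `agent_skills`), the sample version numbers, and the `parseManifest` helper are all assumptions made for this example; the PR text does not show the actual schema of `cli-compat.json`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Entry is a hypothetical manifest entry. The field names are assumptions,
// not the schema used by internal/build/cli-compat.json.
type Entry struct {
	AppKit      string `json:"appkit"`
	AgentSkills string `json:"agent_skills"`
}

// Manifest maps a CLI version (the range floor) to the versions that are
// compatible with that CLI version and everything above it, up to the next key.
type Manifest map[string]Entry

// sampleManifest is made-up data: two entries, so CLI 0.250.0 through
// 0.259.x share the first entry and 0.260.0 onward use the second.
const sampleManifest = `{
  "0.250.0": {"appkit": "1.0.0", "agent_skills": "0.1.0"},
  "0.260.0": {"appkit": "1.2.0", "agent_skills": "0.2.0"}
}`

// parseManifest decodes the JSON and rejects entries with empty values.
func parseManifest(data []byte) (Manifest, error) {
	var m Manifest
	if err := json.Unmarshal(data, &m); err != nil {
		return nil, fmt.Errorf("parse manifest: %w", err)
	}
	for version, e := range m {
		if e.AppKit == "" || e.AgentSkills == "" {
			return nil, fmt.Errorf("manifest entry %q is missing a version", version)
		}
	}
	return m, nil
}

func main() {
	m, err := parseManifest([]byte(sampleManifest))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(m), m["0.260.0"].AppKit) // prints: 2 1.2.0
}
```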

Resolution

  • Exact match → use that entry
  • Between entries → nearest lower version
  • Newer than all → highest versioned entry
  • Dev builds (0.0.0-dev*) → highest versioned entry
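
The four rules above reduce to "pick the highest range floor that is not greater than the CLI version". A minimal sketch of that lookup follows; it is not the code from `libs/clicompat`. The `resolve` and `parseVersion` helpers are hypothetical, version parsing is simplified to plain `major.minor.patch`, and returning an error for a CLI older than every entry is an assumption, since the description does not cover that case.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseVersion converts "major.minor.patch" into three integers.
// Pre-release and build suffixes are dropped, which is a simplification.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	core := strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(core, "-+"); i >= 0 {
		core = core[:i]
	}
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// resolve returns the manifest key (range floor) that applies to cliVersion.
func resolve(floors []string, cliVersion string) (string, error) {
	if len(floors) == 0 {
		return "", fmt.Errorf("manifest has no entries")
	}
	type floor struct {
		raw string
		v   [3]int
	}
	parsed := make([]floor, 0, len(floors))
	for _, f := range floors {
		v, err := parseVersion(f)
		if err != nil {
			return "", err
		}
		parsed = append(parsed, floor{raw: f, v: v})
	}
	sort.Slice(parsed, func(i, j int) bool { return less(parsed[i].v, parsed[j].v) })

	// Dev builds (0.0.0-dev*) resolve to the highest versioned entry.
	if strings.HasPrefix(cliVersion, "0.0.0-dev") {
		return parsed[len(parsed)-1].raw, nil
	}
	cur, err := parseVersion(cliVersion)
	if err != nil {
		return "", err
	}
	// Walk from highest to lowest; the first floor <= cur wins. This covers
	// the exact-match, between-entries, and newer-than-all cases.
	for i := len(parsed) - 1; i >= 0; i-- {
		if !less(cur, parsed[i].v) {
			return parsed[i].raw, nil
		}
	}
	// Assumption: a CLI older than every floor is treated as an error.
	return "", fmt.Errorf("CLI version %s is older than every manifest entry", cliVersion)
}

func main() {
	floors := []string{"0.250.0", "0.260.0", "0.270.0"}
	for _, v := range []string{"0.260.0", "0.255.3", "0.300.0", "0.0.0-dev+abc123"} {
		floor, err := resolve(floors, v)
		fmt.Println(v, "->", floor, err)
	}
}
```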

Manifest sources (fallback chain)

  1. Fresh local cache (< 1h)
  2. Remote fetch from GitHub (with retry)
  3. Stale local cache or embedded manifest fallback

Set DATABRICKS_FORCE_EMBEDDED_COMPAT=true to skip remote fetch and use only the embedded manifest (useful for local development).
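
The fallback chain can be sketched with the three sources injected as functions, which keeps the ordering visible without touching disk or network. This is an illustration only: the `sources` struct, `cachedManifest`, and `fetchManifest` are hypothetical names, the `Manifest` type is reduced to a label, and writing a freshly fetched manifest back to the cache is omitted.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// Manifest is reduced to a label naming where it came from.
type Manifest struct{ Source string }

// cachedManifest pairs a cached manifest with its age on disk.
type cachedManifest struct {
	manifest Manifest
	age      time.Duration
}

// sources bundles the manifest sources as functions (hypothetical shape).
type sources struct {
	readCache   func() (cachedManifest, bool)
	fetchRemote func() (Manifest, error)
	embedded    func() Manifest
}

// freshFor mirrors the "< 1h" freshness window from the list above.
const freshFor = time.Hour

var errOffline = errors.New("remote unavailable")

// fetchManifest walks the chain: fresh cache, remote, stale cache, embedded.
func fetchManifest(s sources, forceEmbedded bool) Manifest {
	if forceEmbedded {
		return s.embedded()
	}
	c, haveCache := s.readCache()
	if haveCache && c.age < freshFor {
		return c.manifest // 1. fresh local cache
	}
	if m, err := s.fetchRemote(); err == nil {
		return m // 2. remote fetch succeeded
	}
	if haveCache {
		return c.manifest // 3. stale cache is better than nothing
	}
	return s.embedded() // 3. last resort: copy compiled into the binary
}

func main() {
	s := sources{
		readCache: func() (cachedManifest, bool) {
			return cachedManifest{manifest: Manifest{Source: "stale cache"}, age: 2 * freshFor}, true
		},
		fetchRemote: func() (Manifest, error) { return Manifest{}, errOffline },
		embedded:    func() Manifest { return Manifest{Source: "embedded"} },
	}
	force := os.Getenv("DATABRICKS_FORCE_EMBEDDED_COMPAT") == "true"
	fmt.Println(fetchManifest(s, force).Source)
}
```

With the remote down and only an expired cache available, the sketch returns the stale cache; with the environment variable set to `true`, it returns the embedded manifest without consulting either.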

Companion PRs

Screenshot

image

@pkosiec pkosiec temporarily deployed to test-trigger-is April 30, 2026 15:10 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is April 30, 2026 15:20 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is April 30, 2026 16:12 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is April 30, 2026 16:16 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is April 30, 2026 16:25 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is April 30, 2026 16:35 — with GitHub Actions Inactive
@pkosiec pkosiec marked this pull request as ready for review April 30, 2026 16:39
Contributor

@renaudhartert-db renaudhartert-db left a comment

Thanks for the PR @pkosiec. Could we have a chat internally about what you're trying to achieve? I'd like to make sure that this is aligned with the overall direction we're planning to evolve that command toward.

@pkosiec pkosiec temporarily deployed to test-trigger-is May 5, 2026 12:35 — with GitHub Actions Inactive
@pkosiec pkosiec changed the title feat: resolve AppKit template version from compatibility manifest feat: CLI compatibility manifest with 3-tier fallback May 5, 2026
@pkosiec pkosiec marked this pull request as draft May 5, 2026 12:43
@pkosiec pkosiec force-pushed the pkosiec/appkit-version-pinning branch from d7d005f to 3241ae2 Compare May 5, 2026 13:00
@pkosiec pkosiec temporarily deployed to test-trigger-is May 5, 2026 13:00 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is May 5, 2026 13:17 — with GitHub Actions Inactive
@pkosiec pkosiec changed the title feat: CLI compatibility manifest with 3-tier fallback Resolve AppKit and Agent Skills versions from compatibility manifest May 5, 2026
@pkosiec pkosiec temporarily deployed to test-trigger-is May 5, 2026 13:24 — with GitHub Actions Inactive
@pkosiec pkosiec temporarily deployed to test-trigger-is May 5, 2026 13:30 — with GitHub Actions Inactive
pkosiec added 4 commits May 11, 2026 11:49
- Fix stale libs/depversions/ references in README (renamed to libs/clicompat/)
- Fix incorrect "next" key description (only used for dev builds, not newer-than-all)
- Fix import ordering (clicompat before cmdctx/cmdio alphabetically)
- Fix slices import grouping in clicompat_test.go (stdlib, not separate group)
- Fix error format: semicolon instead of period before hint text
- Fix FetchManifest godoc: "4-tier fallback" to match numbered list
- Fix writeLocalManifest comment: explain temp-file-then-rename pattern
- Fix README example values to match actual manifest
- Fix flaky test: pre-populate cache to avoid real network calls

Co-authored-by: Isaac
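
For readers unfamiliar with the temp-file-then-rename pattern mentioned in the commit above, here is a generic Go sketch. It is not the `writeLocalManifest` implementation from this PR; `writeFileAtomic` and `demo` are hypothetical names. The idea is that a reader never sees a half-written file, because the final path only ever points at a complete one.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temporary file in the same directory as
// path and then renames it over path. Keeping the temp file in the same
// directory keeps the rename on one filesystem.
func writeFileAtomic(path string, data []byte) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// Remove the temp file if any later step fails.
	defer func() {
		if err != nil {
			os.Remove(tmp.Name())
		}
	}()
	if _, err = tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo writes the same file twice in a scratch directory and returns the
// final contents, showing that the second write replaces the first.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "clicompat-demo-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "cli-compat.json")
	for _, body := range []string{`{"v":1}`, `{"v":2}`} {
		if err := writeFileAtomic(path, []byte(body)); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	fmt.Println(demo())
}
```
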
When the manifest resolves to a version that doesn't exist as a git tag
(404), retry with the version from the embedded manifest. Only triggers
on "not found" errors, not transient network failures.

Also:
- Rename EmbeddedDefaultAppKitVersion/EmbeddedResolve* to
  ResolveEmbeddedAppKitVersion/ResolveEmbeddedAgentSkillsVersion
- Remove duplicate log lines (keep only log.Warnf, drop cmdio.LogString)
- Drop ctx parameter from ResolveEmbedded* (not needed)

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
Address all review comments: add sentinel ErrNotFound and typed
HTTPStatusError, validate manifest entry values, centralize
not-found fallback for skills and AppKit consumers, skip retries
on 4xx, add User-Agent header, support DATABRICKS_CACHE_ENABLED,
and document pruning policy and trust model.

Co-authored-by: Isaac
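
A rough sketch of how a sentinel `ErrNotFound`, a typed `HTTPStatusError`, and "skip retries on 4xx" can fit together. Only the two type names come from the commit message; their fields, the `Is` method, and the `withRetry` and `IsNotFound` helpers are assumptions for illustration, and backoff between attempts is left out.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrNotFound is a sentinel that callers can test with errors.Is.
var ErrNotFound = errors.New("not found")

// HTTPStatusError carries the status code of a failed response.
// The single-field shape is an assumption.
type HTTPStatusError struct{ StatusCode int }

func (e *HTTPStatusError) Error() string {
	return fmt.Sprintf("unexpected HTTP status %d", e.StatusCode)
}

// Is makes errors.Is(err, ErrNotFound) succeed for a 404 response.
func (e *HTTPStatusError) Is(target error) bool {
	return target == ErrNotFound && e.StatusCode == http.StatusNotFound
}

// IsNotFound is a small convenience wrapper around errors.Is.
func IsNotFound(err error) bool { return errors.Is(err, ErrNotFound) }

// retryable reports whether another attempt could plausibly succeed.
// 4xx responses are treated as permanent; everything else is retried.
func retryable(err error) bool {
	var statusErr *HTTPStatusError
	if errors.As(err, &statusErr) {
		return statusErr.StatusCode < 400 || statusErr.StatusCode >= 500
	}
	return true // for example, a network error
}

// withRetry calls fetch up to attempts times and returns the number of
// calls made together with the last error.
func withRetry(attempts int, fetch func() error) (int, error) {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 1; i <= attempts; i++ {
		if err = fetch(); err == nil || !retryable(err) {
			return i, err
		}
	}
	return attempts, err
}

func main() {
	calls, err := withRetry(3, func() error {
		return &HTTPStatusError{StatusCode: http.StatusNotFound}
	})
	fmt.Println(calls, err, IsNotFound(err)) // prints: 1 unexpected HTTP status 404 true
}
```
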
- Only fall back to embedded version when version was auto-resolved,
  not when user explicitly passed --version or --branch
- Skip fallback if embedded version matches the one already tried
- Log warning when embedded fallback resolution itself fails

Co-authored-by: Isaac
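
The fallback conditions listed in the commits above (only on "not found", only when the version was auto-resolved, and only when the embedded version differs from the one already tried) can be sketched as a single guard. The `fetchWithFallback` function, its parameters, and `errTimeout` are hypothetical and exist only to make the conditions concrete.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound stands in for the sentinel named in the commit message.
var ErrNotFound = errors.New("not found")

// errTimeout is a made-up transient failure used in the demo.
var errTimeout = errors.New("timeout")

// fetchWithFallback fetches version and, if that version does not exist,
// retries once with the embedded version. explicit is true when the user
// chose the version themselves (for example via --version or --branch).
func fetchWithFallback(version string, explicit bool, embedded string, fetch func(string) error) (string, error) {
	err := fetch(version)
	if err == nil {
		return version, nil
	}
	switch {
	case !errors.Is(err, ErrNotFound):
		return "", err // transient failures do not trigger the fallback
	case explicit:
		return "", err // respect the user's explicit choice
	case embedded == version:
		return "", err // the embedded version was already tried
	}
	if fallbackErr := fetch(embedded); fallbackErr != nil {
		return "", fmt.Errorf("fallback to embedded version %s failed: %w", embedded, fallbackErr)
	}
	return embedded, nil
}

func main() {
	// Pretend the manifest resolved to 2.0.0, which has no git tag yet.
	fetch := func(v string) error {
		if v == "2.0.0" {
			return ErrNotFound
		}
		return nil
	}
	fmt.Println(fetchWithFallback("2.0.0", false, "1.5.0", fetch)) // prints: 1.5.0 <nil>
}
```
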
@pkosiec pkosiec force-pushed the pkosiec/appkit-version-pinning branch from c79f553 to 847438f Compare May 11, 2026 09:49
@pkosiec pkosiec temporarily deployed to test-trigger-is May 11, 2026 09:49 — with GitHub Actions Inactive
Comment thread experimental/aitools/lib/installer/installer.go
GetSkillsRef now returns whether the ref was explicitly set via env var.
FetchSkillsManifestWithFallback accepts allowFallback to skip the
embedded fallback when the user explicitly chose a ref.

Co-authored-by: Isaac
@pkosiec pkosiec temporarily deployed to test-trigger-is May 11, 2026 11:25 — with GitHub Actions Inactive
@simonfaltum
Member

The new libs/clicompat/ package rolls its own file cache (readLocalManifest, writeLocalManifest, manual DATABRICKS_CACHE_DIR / DATABRICKS_CACHE_ENABLED handling, hand-written temp-file-then-rename, mtime-based TTL). The CLI already has a shared cache library for this in libs/cache, and it's what every other on-disk cache in the repo uses. Let's not introduce a second caching convention.

Use libs/cache

  • cache.NewCache(ctx, component, ttl, metrics) + cache.GetOrCompute[T] / cache.Get[T] / cache.Put[T] handle JSON serialization, fingerprinting, atomic writes, per-key locking, env-var controls, and metrics integration.
  • Storage path: ~/.cache/databricks/<cli-version>/<component>/<hash>.json. Version-scoped, swept by databricks cache clear.
  • Introduced in #3678 (Initial implementation of the local cache layer).

Closest analog: the .well-known/databricks-config cache (#5011)

Same shape as this PR: remote HTTP fetch with a TTL and fallback behavior. See libs/hostmetadata/resolver.go, ~90 lines total. It wraps the SDK's HostMetadataResolver with cache.GetOrCompute for the hit-or-fetch path and cache.Get for a read-only negative-cache probe. Positive TTL 1h, negative TTL 60s.

Other call sites worth skimming

  • bundle/config/mutator/populate_current_user.go (#4202, Enable caching user identity by default) caches CurrentUser.Me(). Shows GetOrCompute with a real fingerprint.
  • bundle/config/mutator/initialize_cache.go shows the metrics wiring at bundle init.

Sketch for clicompat

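const (
    // Component name; it becomes part of the on-disk cache path.
    compatCacheComponent = "cli-compat"
    // Same 1h freshness window as the hand-rolled cache.
    compatCacheTTL       = 1 * time.Hour
)

// manifestFingerprint is the cache key. It has no fields because there is
// only one manifest to cache.
type manifestFingerprint struct{}

// FetchManifest returns the cached or freshly fetched manifest and falls
// back to the embedded copy when that fails.
func FetchManifest(ctx context.Context) (Manifest, error) {
    c := cache.NewCache(ctx, compatCacheComponent, compatCacheTTL, nil)
    m, err := cache.GetOrCompute[Manifest](ctx, c, manifestFingerprint{}, fetchRemoteWithRetry)
    if err == nil {
        return m, nil
    }
    return parseEmbeddedManifest()
}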

That removes cachedManifest, isFresh, readLocalManifest, writeLocalManifest, manifestLocalPath, and the manual env-var handling (~100 lines).

One open question: the stale-cache fallback (current tier 3a)

libs/cache doesn't expose a "return the cached value even if expired" read. Two options:

  1. Drop tier 3a and rely on the embedded manifest. The embedded copy is always in the binary, so we lose only the narrow window of "local cache just expired and remote happens to be down right now." Seems like a fair trade for adopting the shared primitive.
  2. Add cache.Peek / cache.GetStale to libs/cache as a follow-up, then preserve the 4-tier behavior.

I'd vote (1) for this PR.
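
To make option 2 concrete, here is a small in-memory sketch of a TTL cache with a separate stale read. It is not `libs/cache` and does not reflect its API; `TTLCache`, `Get`, `GetStale`, and `demo` are hypothetical names, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// entry stores a value together with the time it was written.
type entry[T any] struct {
	value    T
	storedAt time.Time
}

// TTLCache is a toy cache with an injectable clock for testing.
type TTLCache[T any] struct {
	ttl     time.Duration
	now     func() time.Time
	entries map[string]entry[T]
}

func NewTTLCache[T any](ttl time.Duration, now func() time.Time) *TTLCache[T] {
	return &TTLCache[T]{ttl: ttl, now: now, entries: make(map[string]entry[T])}
}

// Put stores value under key and stamps it with the current time.
func (c *TTLCache[T]) Put(key string, value T) {
	c.entries[key] = entry[T]{value: value, storedAt: c.now()}
}

// Get returns the value only while it is younger than the TTL.
func (c *TTLCache[T]) Get(key string) (T, bool) {
	e, ok := c.entries[key]
	if !ok || c.now().Sub(e.storedAt) >= c.ttl {
		var zero T
		return zero, false
	}
	return e.value, true
}

// GetStale returns the value regardless of its age. A caller would use it
// only after a refresh attempt has failed.
func (c *TTLCache[T]) GetStale(key string) (T, bool) {
	e, ok := c.entries[key]
	return e.value, ok
}

// demo stores a value, advances the fake clock past the TTL, and reports
// whether the fresh and stale reads still hit.
func demo() (freshHit, staleHit bool) {
	clock := time.Date(2026, 5, 12, 12, 0, 0, 0, time.UTC)
	c := NewTTLCache[string](time.Hour, func() time.Time { return clock })
	c.Put("manifest", "cached manifest")
	clock = clock.Add(2 * time.Hour)
	_, freshHit = c.Get("manifest")
	_, staleHit = c.GetStale("manifest")
	return freshHit, staleHit
}

func main() {
	fmt.Println(demo()) // prints: false true
}
```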

@pkosiec
Member Author

pkosiec commented May 12, 2026

@simonfaltum yeah, as discussed, it was done on purpose to fall back to a successfully fetched manifest (even if it is "outdated") in case GitHub is down 👍

Member

@simonfaltum simonfaltum left a comment

Did a deep dive on this in person, looks good

@pkosiec pkosiec temporarily deployed to test-trigger-is May 12, 2026 14:22 — with GitHub Actions Inactive
@pkosiec pkosiec force-pushed the pkosiec/appkit-version-pinning branch from 5e4c5df to 2f92468 Compare May 12, 2026 14:26
@pkosiec pkosiec temporarily deployed to test-trigger-is May 12, 2026 14:27 — with GitHub Actions Inactive
@pkosiec pkosiec force-pushed the pkosiec/appkit-version-pinning branch from 2f92468 to d526958 Compare May 12, 2026 14:27
…E_EMBEDDED_COMPAT

The manifest is now purely range-based: each versioned entry defines a
range floor that applies to that CLI version and all above it. The "next"
key was redundant since we always know the CLI version when updating the
manifest. Dev builds now resolve to the highest versioned entry.

Also adds DATABRICKS_FORCE_EMBEDDED_COMPAT=true env var to skip remote
fetch and use only the embedded manifest, useful for local development.

Co-authored-by: Isaac
@pkosiec pkosiec temporarily deployed to test-trigger-is May 12, 2026 14:28 — with GitHub Actions Inactive
@pkosiec pkosiec force-pushed the pkosiec/appkit-version-pinning branch from d526958 to ea39588 Compare May 12, 2026 14:28
@pkosiec pkosiec temporarily deployed to test-trigger-is May 12, 2026 14:29 — with GitHub Actions Inactive
Contributor

@renaudhartert-db renaudhartert-db left a comment

LGTM on the high-level, deferring final approval to Simon.

@simonfaltum simonfaltum merged commit 3e4cf68 into main May 12, 2026
31 of 32 checks passed
@simonfaltum simonfaltum deleted the pkosiec/appkit-version-pinning branch May 12, 2026 16:00
pkosiec added a commit to databricks/databricks-agent-skills that referenced this pull request May 12, 2026
## Summary

Adds a "Version resolution in Databricks CLI" section to CONTRIBUTING.md
explaining that the CLI uses `cli-compat.json` to determine which Agent
Skills version to install.

Companion PRs:
- [databricks/cli#5139](databricks/cli#5139)
- [databricks/appkit#333](databricks/appkit#333)

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
4 participants