Switch LFS backend from GitHub LFS to lfs.dimensionalos.com #2022
spomichter wants to merge 2 commits into legacy-dev-dont-merge from
Adds .lfsconfig pointing `git lfs` at our self-hosted giftless instance (see dimensionalOS/dimensional-lfs) backed by S3. All 75 LFS objects were migrated to the new layout before this change merged.

Effect:
- `git lfs pull` (no auth): matches GitHub LFS public-repo behavior
- `git lfs push`: requires a GitHub PAT with write access to this repo, validated by giftless against the GitHub API

Adds dimos/utils/test_lfs.py with three direct batch-API smoke tests (reachability, anonymous-write rejection, known-object roundtrip) so a broken LFS server gives a clearer signal in CI than a smudge failure.
Greptile Summary
This PR switches the Git LFS backend from GitHub LFS to a self-hosted giftless instance (
Confidence Score: 5/5
Safe to merge: the change is a two-line config redirect and three network smoke tests with no impact on application logic. Both changed files are additive and self-contained. No files require special attention beyond the minor diagnostic-quality note in dimos/utils/test_lfs.py.
Important Files Changed
Sequence Diagram
sequenceDiagram
participant Client as git lfs / test
participant LFS as lfs.dimensionalos.com (giftless)
participant GitHub as api.github.com
participant S3 as s3://dimos-github-lfs
Note over Client,S3: Anonymous download
Client->>LFS: POST /dimensionalOS/dimos/objects/batch
LFS-->>Client: 200 presigned-S3-URL
Client->>S3: GET presigned-S3-URL
S3-->>Client: object bytes
Note over Client,S3: Authenticated upload
Client->>LFS: POST /dimensionalOS/dimos/objects/batch + GitHub PAT
LFS->>GitHub: Verify PAT write permission
GitHub-->>LFS: 200 OK
LFS-->>Client: 200 presigned-S3-PUT-URL
Client->>S3: PUT presigned-S3-PUT-URL
S3-->>Client: 200 OK
Note over Client,S3: Unauthenticated upload blocked
Client->>LFS: POST /dimensionalOS/dimos/objects/batch (no auth)
LFS-->>Client: 403 Forbidden
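For reference, the batch call that opens each flow in the diagram can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the code in dimos/utils/test_lfs.py: the `https://` scheme and the helper name `build_batch_request` are assumptions, while the media type and body shape follow the public Git LFS batch API. The request is only built, never sent.

```python
import json
import urllib.request

# Assumed: https scheme; host and path are taken from the sequence diagram.
BATCH_URL = "https://lfs.dimensionalos.com/dimensionalOS/dimos/objects/batch"
LFS_MEDIA_TYPE = "application/vnd.git-lfs+json"


def build_batch_request(operation, oid, size):
    """Build (but do not send) an anonymous Git LFS batch API request."""
    if operation not in ("download", "upload"):
        raise ValueError(f"unsupported LFS operation: {operation!r}")
    body = {
        "operation": operation,
        "transfers": ["basic"],
        "objects": [{"oid": oid, "size": size}],
    }
    headers = {"Accept": LFS_MEDIA_TYPE, "Content-Type": LFS_MEDIA_TYPE}
    return urllib.request.Request(
        BATCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; per the diagram, an anonymous `"upload"` is expected to be rejected and an anonymous `"download"` to return a presigned S3 URL.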
Reviews (2): Last reviewed commit: "Merge branch 'dev' into lfs-server-cutov..."
@pytest.mark.slow
def test_anonymous_upload_is_forbidden():
    """An unauthenticated upload returns 403 — only repo collaborators can push."""
    response = _batch("upload", "0" * 64, 1)
    assert response.status_code == 403, response.text
The test pins 403, but HTTP semantics permit a server to return 401 for unauthenticated requests (signalling "please authenticate") rather than 403 ("authenticated but not permitted"). Giftless may return either code depending on configuration. Accepting both makes the test resilient to that variation without weakening the assertion that anonymous upload is rejected.
Suggested change:
@pytest.mark.slow
def test_anonymous_upload_is_forbidden():
    """An unauthenticated upload returns 401/403 — only repo collaborators can push."""
    response = _batch("upload", "0" * 64, 1)
    assert response.status_code in (401, 403), response.text
href = obj["actions"]["download"]["href"]
assert href.startswith("https://dimos-github-lfs.s3"), href
The startswith("https://dimos-github-lfs.s3") check couples the test to the virtual-hosted S3 URL style. AWS presigned URLs can also be path-style (https://s3.amazonaws.com/dimos-github-lfs/…), and the format may also include a regional subdomain variant. Checking for the bucket name anywhere in the URL is less brittle while still confirming the object came from the right bucket.
Suggested change:
href = obj["actions"]["download"]["href"]
assert "dimos-github-lfs" in href, href
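A stricter alternative to the substring check is to parse the URL and accept only the two S3 addressing styles mentioned above. The sketch below is illustrative: `href_targets_bucket` is a hypothetical helper, not part of this PR, and it assumes the presigned URL points at an AWS-style S3 hostname.

```python
from urllib.parse import urlsplit

BUCKET = "dimos-github-lfs"


def href_targets_bucket(href, bucket=BUCKET):
    """Return True if href addresses the bucket in virtual-hosted or path style."""
    parts = urlsplit(href)
    host = parts.hostname or ""
    # Virtual-hosted style: <bucket>.s3[.<region>].amazonaws.com/<key>
    if host.startswith(bucket + ".s3"):
        return True
    # Path style: s3[.<region>].amazonaws.com/<bucket>/<key>
    first_segment = parts.path.lstrip("/").split("/", 1)[0]
    return host.startswith("s3") and first_segment == bucket
```

In the roundtrip test this would read `assert href_targets_bucket(href), href`. Unlike a plain `in` check, it does not pass when the bucket name appears only in a query string or on an unrelated host.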
def _batch(operation: str, oid: str, size: int, *, auth=None):
    return requests.post(
auth parameter is defined but never exercised in any test
_batch accepts an auth kwarg intended for the authenticated-push path, but no test currently passes a real credential. If the giftless auth handler (PAT → GitHub permission check → S3 PUT signing) regresses, none of these tests would catch it. Consider adding a pytest.mark.slow test that uses a read-only test PAT (stored as a CI secret) to exercise at least the upload batch request, so the push auth path has some CI coverage.
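One way such a test could obtain credentials is sketched below. Everything named here is hypothetical: the environment variable `LFS_TEST_PAT`, the placeholder username, and both helpers are illustrative, and the sketch assumes `_batch` forwards `auth` as an HTTP basic-auth pair, which this PR does not show.

```python
import os

# Hypothetical names: neither the variable nor the username comes from this PR.
TOKEN_ENV_VAR = "LFS_TEST_PAT"
PLACEHOLDER_USERNAME = "ci"


def load_test_auth(environ=None):
    """Return a (username, token) pair, or None when no test token is set."""
    environ = os.environ if environ is None else environ
    token = environ.get(TOKEN_ENV_VAR, "").strip()
    return (PLACEHOLDER_USERNAME, token) if token else None


def authenticated_upload_status(batch, auth, oid="0" * 64, size=1):
    """Send one upload batch request with credentials and return its status code.

    `batch` is any callable with the same signature as `_batch` in test_lfs.py.
    """
    response = batch("upload", oid, size, auth=auth)
    return response.status_code
```

A slow test could call `pytest.skip` when `load_test_auth()` returns `None`, then assert on whichever status matches the token's permissions. That expected status depends on how giftless is configured and on the PAT's scope, so it is left to the caller here.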
Codecov Report
✅ All modified and coverable lines are covered by tests.
@pytest.mark.slow
def test_anonymous_upload_is_forbidden():
    """An unauthenticated upload returns 403 — only repo collaborators can push."""
What is the normal dev process for a third-party contributor?
Summary
- .lfsconfig pointing git lfs at our self-hosted server lfs.dimensionalos.com (see dimensionalOS/dimensional-lfs), a giftless instance on EC2 that signs presigned URLs to s3://dimos-github-lfs/lfs/dimensionalOS/dimos/<sha>. We're cutting over from GitHub LFS to escape its bandwidth budget cap.
- … cafe.jpg.tar.gz, b8cf30...7603).
- dimos/utils/test_lfs.py with three batch-API smoke tests so CI gives a clear "LFS server X is broken" signal rather than the looser "git lfs pull failed".

What changes for users
- git lfs pull: lfs.dimensionalos.com -> S3, no budget
- git lfs push: … dimensionalOS/dimos write permission, then signs an S3 PUT

For pull-only consumers, nothing changes: git lfs pull is still anonymous.

For push, devs need their git credential helper to supply a GitHub PAT with repo scope (or a fine-grained PAT with read/write contents on this repo). gh auth token works.

Test plan
- dev rebase
- dimos/utils/test_lfs.py (new) passes: exercises the LFS server's batch API directly
- dimos/utils/test_data.py::test_pull_file (existing slow test) passes: exercises the full git lfs pull -> smudge -> decompress pipeline against the new server
- git lfs push from a fresh checkout to confirm the GitHub-PAT auth path works end-to-end (can't fully test in CI since CI doesn't push)

Rollback
If the LFS server has issues post-merge, revert this PR. git lfs pull will go back to GitHub LFS automatically, since the LFS objects still exist there from before this migration. We haven't disabled or pruned GitHub LFS storage.

Related