Skip to content

fix(occurrences): remove all n+1 queries from occurrence API #1274

Merged
mihow merged 13 commits into
mainfrom
fix/occurrence-list-queries
May 11, 2026
Merged

fix(occurrences): remove all n+1 queries from occurrence API #1274
mihow merged 13 commits into
mainfrom
fix/occurrence-list-queries

Conversation

@mihow
Copy link
Copy Markdown
Collaborator

@mihow mihow commented Apr 29, 2026

Summary

Speeds up the occurrences list endpoint by ~9x and frees up gunicorn workers that were previously pinned for 30+ seconds per request, which under multi-user load was stalling every other API endpoint sharing the pool (captures, jobs, deployments — "every page feels frozen"). Fewer queries per request also drops Postgres load and lets the same hardware support more concurrent reviewers. Closes #1271 (Sentry AMI-PLATFORM-API-BMY, 34–40s cold responses).

Mechanically: introduces reusable Prefetch factories and wires them into OccurrenceViewSet list and detail paths. Strict prefetch contract — serializer raises RuntimeError if a caller skips the prefetch instead of silently re-introducing per-row queries.

Measurement

Query count (in-test, CaptureQueriesContext)

In-test project with 25 occurrences (each: one detection, one classification). Cachalot cache cleared between runs.

Page size Before After Reduction
limit=5 24 13 1.8x
limit=25 64 7 9.1x

Baseline grew ~1.6 queries per occurrence (clear N+1). Post-fix is flat regardless of page size.

Concurrent latency A/B (arctia=PR vs serbia=baseline, project 5, ~5k occurrences)

Env Concurrency limit avg p95 max errors
serbia (baseline) 10 25 4.79s 8.66s 9.41s 0/30
arctia (PR) 10 25 5.16s 7.61s 7.84s 0/30
arctia (PR) 40 25 9.13s 18.39s 19.78s 0/40

Prod broke at 20 concurrent (4× 502 errors at limit=25) and saturated at 10P × limit=100 with 59.6s max. Arctia (PR-deployed) holds 40 concurrent without errors; breaks at 60. ~3× capacity vs prod baseline. Reproducible via scripts/benchmark_occurrences_list.sh.

Changes

  1. ami/main/models_future/occurrence.py (new module)
    • prefetch_detections_for_list() — detections + nested classifications, ordered for stable galleries
    • prefetch_detections_for_detail() — detections + nested classifications + source_image, ordered most-recent-first
    • best_prediction_from_prefetch() — mirrors Occurrence.predictions() per-algorithm max-score grouping in Python
    • best_identification_from_prefetch() — mirrors BEST_IDENTIFICATION_ORDER + withdrawn=False
    • detection_image_urls_from_prefetch(occurrence, limit=...) — bounded URL list
    • _require_prefetch() — strict gate; raises if relations missing from _prefetched_objects_cache
  2. OccurrenceQuerySet.with_list_prefetches() / with_detail_prefetches() call the Prefetch factories directly.
  3. OccurrenceListSerializer
    • detection_images, determination_details, best_machine_prediction route through the strict helpers — no fallback to model methods.
    • detection_images_limit: int | None = 1 — list cards render a single cover image.
    • OccurrenceSerializer (detail) overrides detection_images_limit = 100.
  4. OccurrenceViewSet.get_queryset calls with_list_prefetches() on list, with_detail_prefetches() on detail.
  5. JSONExporter.get_queryset (ami/exports/format_types.py) adds with_detail_prefetches(). The export serializer subclasses OccurrenceSerializer and inherits the strict helpers, so it needs the same prefetch contract.
  6. scripts/benchmark_occurrences_list.sh — reproducible concurrent load script with randomised offsets to bypass cachalot, p50/p95/p99/max + per-status error counts.
  7. TaxonListSerializer.parents bug fix — field was declared as TaxonNestedSerializer(read_only=True) with no source, but Taxon has no parents attribute (only parents_json JSONB + parent FK). Field rendered null for every taxon. Fixed to read from parents_json via TaxonParentSerializer(many=True, source="parents_json") — matches the detail serializer pattern at TaxonSerializer.parents. Found during the audit follow-up below.
  8. Regression guards (ami/main/tests.py)
    • TestSourceImageListQueryCount — 3 variants (no_detections, with_detections, with_counts+detections)
    • TestTaxonListQueryCount — base case
    • Both use @override_settings(CACHALOT_ENABLED=False) to measure cold query count.

Behavior changes (read carefully)

  • List response detection_images capped at 1 (was unbounded via the model method, but in practice fed from the same query and rarely larger).
  • Detail response detection_images capped at 100 (was unbounded; per-occurrence detection counts can reach thousands once tracking lands, pagination is the long-term answer).
  • Strict prefetch contract: any new call site that uses OccurrenceListSerializer or OccurrenceSerializer must apply with_list_prefetches() or with_detail_prefetches() on the queryset. CI surfaced one missing site (JSONExporter) — fixed here. Audited all six occurrence-related serializers; only the three that hit the strict helpers (OccurrenceListSerializer, OccurrenceSerializer, OccurrenceExportSerializer) need prefetches, and all three are wired.
  • TaxonListSerializer.parents now returns a populated [{id, name, rank}] array instead of null. UI in ui/src/data-services/models/taxa.ts already expected this shape and falls back when missing — fix lets the list path populate ranks directly.

SourceImage + Taxon list audit (completed)

Audited both viewsets against the same N+1 pattern. Result: both list endpoints already scale flat — no fix needed.

Empirical cold-query counts (cachalot disabled, double-warmup):

Endpoint limit=5 limit=25
/captures/?with_detections=false 7 3
/captures/?with_detections=true 10 4
/captures/?with_detections=true&with_counts=true 4 4
/taxa/ 10 4

SourceImageViewSet is annotation- and prefetch-driven from the start (select_related("event","deployment"), Prefetch("detections", to_attr="filtered_detections") with nested select_related("occurrence","occurrence__determination"), subquery-based with_counts, cached count columns).

TaxonViewSet.get_taxa_observed uses correlated subqueries for occurrences_count / last_detected / best_determination_score plus a Prefetch("tags", to_attr="prefetched_tags"). TaxonListSerializer renders only annotations + JSONB + prefetched tags — no FK dereferences per row.

No unified abstraction warranted right now. _require_prefetch was needed for Occurrence because that serializer aggregates over detections.classifications and identifications in Python. SourceImage / Taxon are annotation-driven, not aggregation-driven; lifting _require_prefetch into a base mixin would be solution-without-problem.

Caveats

  • _require_prefetch only checks top-level relations. A caller prefetching detections without nested classifications will silently re-introduce per-detection queries. Mitigated by always going through the list/detail factories. Tighter depth-enforcement deferred to django-zen-queries follow-up.

Deferred to follow-up

  • django-zen-queries QueriesDisabledViewMixin as a guardrail (cachalot interaction needs a quick spike — cache-key lookups may register as queries)
  • Restore best_identification + best_detection fields on TaxonListSerializer as a feature add — cheap subquery annotation per the audit memo (not an N+1 fix)
  • Expose ?detection_images_limit=N query param on OccurrenceViewSet. The cap (list=1, detail=100) is currently a serializer class attribute; bundle into the TaxonListSerializer feature PR when tracking-merged occurrences are common enough that a frontend gallery card actually needs to request more than 1 image. Today's data has max_det=1 per occurrence everywhere, so exposing the knob is premature.

Post-deploy validation (2026-05-11)

Merged + deployed to prod as ad0826d5 at ~22:53 UTC. Posted a production APM deployment marker for timeline correlation.

Ran a synthetic UI traffic sweep against production both before deploy and after (4 separate sweeps each side, 5 iterations × 5 pages × 3 projects per sweep, 300 page-loads per side, 600 total). Each iteration runs in a fresh isolated browser context (headless Chromium, Playwright) so cache state is consistent between runs. Token-authenticated against three projects covering different occurrence-count regimes (~37k, ~2.3k, ~1.4k).

target_api_ms medians on /api/v2/occurrences/ family

project page before after Δ
18 occ_list 782 652 -17%
18 occ_list_large 767 678 -12%
90 occ_list 992 693 -30%
90 occ_list_large 939 681 -27%
120 occ_list 964 592 -39%
120 occ_list_large 953 634 -33%

Every PR-target page dropped on every project. Project 120 (smaller, detection-heavy ratio) gained the most — consistent with the fix removing per-row detection re-queries.

Control pages (audited but not modified by this PR)

project page before after verdict
18 captures_list 446 441 flat
18 taxa_list 223 195 flat
18 occ_detail 274 235 -14%
120 captures_list 356 409 flat (within prod variance)
120 taxa_list 214 188 flat
90 taxa_list 195 174 flat

No regressions. captures_list and taxa_list (SourceImage + Taxon list endpoints from the audit) behave as predicted.

Variance tightened on the touched pages

Coefficient-of-variation (CV%) of the median target_api_ms across the 4 sweeps:

project / page before CV% after CV%
90 / occ_list 38% 10%
90 / occ_list_large 32% 11%
120 / occ_list 23% 2%
18 / occ_list_large 9% 8%

The N+1 removal eliminates a variable amount of per-row Python work, so the post-fix timing is more deterministic. Secondary win.

End-to-end page-load (DOM + assets + React hydration)

project page before load_ms after load_ms Δ
18 occ_list 2745 2473 -10%
90 occ_list 2819 2602 -8%
120 occ_list 2812 2516 -11%
120 occ_list_large 2751 2647 -4%

User-visible page load on the occurrences list dropped roughly 10% — directly attributable to the API speedup (TLS handshake, asset load, hydration cost is roughly constant).

Caveats

  • Sample size: 4 sweeps × 5 iterations = 20 per (project, page). Tight enough for double-digit deltas (which the PR delivers); not a substitute for the longer-window APM traffic averages.
  • Project 90 had highest cold-start variance pre-fix; one baseline run caught a fully-warm window. Including/excluding it shifts the before-median by a few hundred ms either way. Post-fix the variance collapsed regardless, which is the more important signal.
  • Sentry AMI-PLATFORM-API-BMY soak (the cold 34–40s tail) is the separate signal tracked in the test plan and is left unchecked here. The APM averages above show typical traffic, not the long-tail cold cases.

Test plan

  • python manage.py test ami.main.tests.TestOccurrenceListQueryCount passes
  • python manage.py test ami.main.tests.TestOccurrenceDetailQueryCount passes
  • python manage.py test ami.main.tests.TestOccurrencePrefetchHelpersEdgeCases passes (zero-detection no 500, missing prefetch raises)
  • python manage.py test ami.main.tests.TestSourceImageListQueryCount ami.main.tests.TestTaxonListQueryCount passes (audit regression guards)
  • python manage.py test ami.exports.tests.DataExportTest.test_export_record_count ami.exports.tests.DataExportTest.test_json_export_record_count passes (regression caught the missing exporter prefetch)
  • Concurrent benchmark on arctia.dev (PR-deployed) — limit=25 holds at 40 concurrent with 0 errors, breaks at 60 (10% errors)
  • CSV export validated end-to-end on arctia (occurrences_simple_csv format)
  • Identification POST flow on arctia (2 identifications, no 500s — strict contract holds under writes)
  • JSON-format export trigger on arctia validated against real data — 438 occurrences (project 116, "Kenya Labelling Project") exported via occurrences_api_json, ~24s wall time, every record populated detection_images + determination_details + best_machine_prediction with zero RuntimeError from the strict-prefetch contract
  • Sentry AMI-PLATFORM-API-BMY soak post-merge — confirm 34–40s cold responses go away

Summary by CodeRabbit

Release Notes

  • New Features

    • Detection images are now capped: list views show 1 image per occurrence, while detail views display up to 100.
  • Performance Improvements

    • Optimized database queries across API list and detail endpoints to reduce response times.
    • Improved export performance through better data prefetching.

… remove N+1

Issue: #1271

OccurrenceViewSet.list issued one query per occurrence to fetch
detection paths, the best machine prediction, and the best
identification, scaling roughly linearly with page size and producing
the 34-40s cold-cache responses tracked in Sentry as AMI-PLATFORM-API-BMY.

This change adds reusable Prefetch factories in
ami/main/models_future/occurrence.py and wires them into the list path:

- prefetch_detection_images() populates a `prefetched_detection_images`
  to_attr list of detections that have a path
- prefetch_classifications_for_best_prediction() populates
  detections__classifications with select_related taxon and algorithm
- best_prediction_from_prefetch / best_identification_from_prefetch
  pick the winners in Python without re-querying

OccurrenceQuerySet.with_list_prefetches() bundles these prefetches and
is called from OccurrenceViewSet.get_queryset only when action == 'list'.
OccurrenceListSerializer reads from the prefetch caches when present
and falls back to the existing model methods for non-list callers
(admin, exports, signals).

Local measurement on a project with 25 occurrences:

  page size | before | after | reduction
  limit=5   |   24   |  13   | 1.8x
  limit=25  |   64   |   7   | 9.1x

A new TestOccurrenceListQueryCount guards against regression by
asserting the query count does not grow with page size.

Follow-ups (separate PR):
- django-zen-queries QueriesDisabledViewMixin to fail loudly on lazy
  queries during list rendering
- audit other large-list serializers (SourceImage, Taxon) for the
  same pattern

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 29, 2026 14:50
@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 29, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit 054d331
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6a0240719edaee0007d95f17

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 29, 2026

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit 054d331
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/6a0240712242700008065cdd

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

List/detail queryset prefetches are split and centralized into new helpers; serializers consume prefetched detections/classifications/identifications via helper functions for best-identification, best-prediction, and detection image URLs. Tests, exporter usage, and a benchmark script were added/updated to exercise and validate the prefetch contract.

Changes

Occurrence list/detail prefetch + serializer helpers

Layer / File(s) Summary
Prefetch builders & helpers
ami/main/models_future/occurrence.py
New module exporting prefetch_detections_for_list(), prefetch_detections_for_detail(), _require_prefetch(), best_prediction_from_prefetch(), best_identification_from_prefetch(), and detection_image_urls_from_prefetch(). Implements deterministic ordering and strict prefetch checks.
Queryset helpers
ami/main/models.py
Adds BEST_IDENTIFICATION_ORDER constant; adds OccurrenceQuerySet.with_list_prefetches() and .with_detail_prefetches() that import and apply the new prefetch builders; applies BEST_IDENTIFICATION_ORDER to best-identification selection paths.
View wiring
ami/main/api/views.py
OccurrenceViewSet.get_queryset() now chooses .with_list_prefetches() for list actions and .with_detail_prefetches() for non-list/detail actions.
Serializers (list/detail behavior)
ami/main/api/serializers.py
TaxonListSerializer.parents switched to TaxonParentSerializer(many=True, source="parents_json"). OccurrenceListSerializer adds detection_images_limit=1, detection_images field, get_detection_images() (uses detection_image_urls_from_prefetch), refactors get_determination_details() to use best_identification_from_prefetch() and best_prediction_from_prefetch() (prediction suppressed when identification exists), and updates get_best_machine_prediction() to use the prefetched helper. OccurrenceSerializer sets detection_images_limit=100 for detail.
Exporter usage
ami/exports/format_types.py
JSONExporter.get_queryset() now applies .with_detail_prefetches() so export serializers relying on prefetched data don't trigger extra DB lookups.
Tests
ami/main/tests.py
Adds regression tests asserting bounded SQL query counts for occurrences list/detail, tests for inflated-detail case, prefetch-helper edge-case tests (safe defaults vs. RuntimeError when missing), and cold-cache audits for other list endpoints.
Benchmark script
scripts/benchmark_occurrences_list.sh
New script to load-test /api/v2/occurrences/ with concurrent randomized offsets; records status codes, latencies, percentiles and prints a summary.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as Client
  participant View as OccurrenceViewSet\n(get_queryset)
  participant QS as OccurrenceQuerySet\n(with_*_prefetches)
  participant DB as Database
  participant Serializer as OccurrenceList/Detail\nSerializer
  participant Helpers as models_future/occurrence.py

  Client->>View: GET /api/v2/occurrences/ or /api/v2/occurrences/{id}/
  View->>QS: build queryset -> with_list_prefetches() / with_detail_prefetches()
  QS->>DB: query occurrences + prefetch(detections, classifications, identifications...)
  DB-->>QS: rows + prefetched related objects
  View->>Serializer: pass queryset / instance
  Serializer->>Helpers: best_identification_from_prefetch / best_prediction_from_prefetch / detection_image_urls_from_prefetch
  Helpers-->>Serializer: chosen identification/prediction and image URLs
  Serializer-->>Client: JSON response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • annavik

Poem

🐇 I hopped through cached rows, light and fleet,

I gathered paths and kept query counts neat.
Prefetch in paw, predictions in sight,
One tidy hop — responses return bright. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 51.35% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main objective: removing N+1 queries from the occurrence API. It is concise, specific, and accurately reflects the primary change focus across all modified files.
Linked Issues check ✅ Passed The PR addresses all coding requirements from issue #1271: prefetch detections/classifications/identifications for list and detail paths, implement strict helpers to consume prefetched data, and update serializers to eliminate per-row queries. Tests confirm constant query count across page sizes.
Out of Scope Changes check ✅ Passed All changes directly support the N+1 query elimination objective: prefetch factories, serializer updates, viewset wiring, exporter fix, and tests. Benchmark script and minor query optimizations (BEST_IDENTIFICATION_ORDER constant) are scope-aligned and necessary for verification.
Description check ✅ Passed The PR description comprehensively follows the template with all required sections: Summary, List of Changes, Related Issues (#1271), Detailed Description with measurements and behavior changes, How to Test the Changes, and a detailed Checklist.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/occurrence-list-queries

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets the /api/v2/occurrences/ list endpoint’s dominant N+1 sources by adding reusable Prefetch factories, wiring them into the list queryset, and teaching the list serializer to consume prefetched caches (with fallbacks), plus adding a regression query-count test.

Changes:

  • Added ami/main/models_future/occurrence.py with reusable Prefetch factories and Python helpers for selecting “best” records from prefetched data.
  • Added OccurrenceQuerySet.with_list_prefetches() and applied it only for OccurrenceViewSet list actions.
  • Updated OccurrenceListSerializer to read detection images / best prediction / best identification from prefetch caches when present, and added a query-count regression test.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
ami/main/tests.py Adds TestOccurrenceListQueryCount to guard against query-count scaling with page size.
ami/main/models_future/occurrence.py Introduces prefetch factories and helper functions to pick best prediction/identification and derive detection image URLs from caches.
ami/main/models.py Adds OccurrenceQuerySet.with_list_prefetches() to apply the list serializer’s required prefetches.
ami/main/api/views.py Applies list-only prefetching via with_list_prefetches(); keeps existing detail prefetch behavior.
ami/main/api/serializers.py Routes list fields through prefetched data when available, with fallbacks to existing model methods.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ami/main/api/serializers.py Outdated
Comment thread ami/main/models_future/occurrence.py Outdated
Comment thread ami/main/models_future/occurrence.py
Comment thread ami/main/models_future/occurrence.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ami/main/models_future/occurrence.py`:
- Around line 69-91: best_prediction_from_prefetch currently aggregates all
det.classifications and ranks them, which diverges from
Occurrence.best_prediction because that method first applies per-algorithm
max-score filtering via Occurrence.predictions(); update
best_prediction_from_prefetch to call/use Occurrence.predictions() (or replicate
its per-algorithm max-score grouping) to produce the filtered set of
Classification objects, then apply the same sort key (terminal, -score, -pk) and
return the first element; ensure you reference and reuse
Occurrence.predictions() so the prefetch path preserves identical semantics to
best_prediction.

In `@ami/main/tests.py`:
- Around line 2956-2959: The try/except around caches["default"].clear() is
swallowing failures—remove the broad except or replace it with a specific catch
that fails the test; call caches["default"].clear() directly (so exceptions
propagate) or if you must catch, re-raise or call pytest.fail() with the caught
exception text so a cache-clear failure causes the test to fail and exposes the
error rather than hiding it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 829c8693-706f-40db-89be-84b6b55a965a

📥 Commits

Reviewing files that changed from the base of the PR and between a347d55 and 3e7d9dd.

📒 Files selected for processing (5)
  • ami/main/api/serializers.py
  • ami/main/api/views.py
  • ami/main/models.py
  • ami/main/models_future/occurrence.py
  • ami/main/tests.py

Comment thread ami/main/models_future/occurrence.py Outdated
Comment thread ami/main/tests.py Outdated
- Prevent detail-path N+1: `_best_prediction` now requires both
  `detections` AND nested `classifications` to be in the prefetch cache
  before calling `best_prediction_from_prefetch`. Detail viewset only
  prefetches `detections`, so the previous gate still walked
  `det.classifications.all()` lazily (one query per detection).

- Mirror `Occurrence.best_prediction` semantics in the prefetch helper:
  `Occurrence.predictions()` filters classifications to the per-algorithm
  max-score row before ordering by `-terminal, -score`. The helper now
  applies the same per-algorithm grouping in Python so list and detail
  return the same winner for occurrences whose top-scoring classification
  is non-terminal.

- Add deterministic tiebreaker to `best_identification_from_prefetch`:
  tuple-keyed comparison (`created_at_is_set, created_at, pk`) treats
  `None` timestamps as lowest and breaks ties by `pk`, matching
  `order_by('-created_at', '-pk')`.

- Collapse the duplicate `detections` prefetch: prior PR loaded
  detections twice (once via a `to_attr` filtered list, once via
  `detections__classifications`). Replace both with a single
  `prefetch_detections_for_list()` that nests classifications, and
  filter for non-null paths in Python. Saves one query per request.

- Don't swallow cache-clear failures in the regression guard. A silent
  cache-clear failure would let the warm cache hide query-count
  regressions.

Refs: #1271
PR: #1274

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ami/main/api/serializers.py`:
- Around line 1377-1379: The fast-path should only run for list responses, not
any object with a cached 'detections' relation. Change the condition in the
serializer method (the branch using detection_image_urls_from_prefetch and
obj.detection_images) to require an explicit list-only marker (e.g. check
getattr(obj, "_prefetched_for_list", False) && "detections" in getattr(obj,
"_prefetched_objects_cache", {})); then set that marker when you build the list
queryset that prefetches detections (for example, after the prefetch in the list
view or in the queryset iteration set obj._prefetched_for_list = True) so
OccurrenceSerializer and other detail callers still call detection_images() and
preserve ordering/filtering.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 274a54ab-5ddf-4af8-9c00-c63dab62b87b

📥 Commits

Reviewing files that changed from the base of the PR and between 3e7d9dd and b62fb39.

📒 Files selected for processing (3)
  • ami/main/api/serializers.py
  • ami/main/models_future/occurrence.py
  • ami/main/tests.py

Comment thread ami/main/api/serializers.py Outdated
Mirror BEST_MACHINE_PREDICTION_ORDER: define the identification ordering
once and use it everywhere a "best identification" is selected.

- New constant `BEST_IDENTIFICATION_ORDER = ("-created_at", "-pk")`
  alongside `BEST_MACHINE_PREDICTION_ORDER`.
- `Occurrence.best_identification` and `OccurrenceQuerySet.with_verification_info`
  now order by the constant. Adds a `-pk` deterministic tiebreaker; created_at
  is auto_now_add so equal timestamps are vanishingly rare in practice.
- `best_identification_from_prefetch` references the constant in its
  docstring (the helper's tuple key already encodes the same ordering).

Co-Authored-By: Claude <noreply@anthropic.com>
Comment thread ami/main/api/serializers.py Outdated
Comment thread ami/main/api/serializers.py Outdated
mihow and others added 2 commits April 29, 2026 15:24
…paths

Per PR review: the cache-key-gated fallbacks in `OccurrenceListSerializer`
exercised the same N+1 path the PR closed, and silently triggered on the
detail endpoint via shared serializer inheritance.

- Drop `_best_prediction`, `_best_identification`, `get_detection_images`
  fallbacks. Serializer trusts the prefetch contract.
- Add `prefetch_detections_for_detail()` factory + `with_detail_prefetches()`
  queryset method. Detail viewset now applies it instead of an inline Prefetch.
- Drop `has_prefetched_classifications()` helper (no longer needed).
- Add `TestOccurrenceDetailQueryCount` regression guard. Detail endpoint
  measures 12 queries; list (warm test client) measures 6.

Closes review threads on PR #1274 (mihow comments 3164296309/3164306519,
CodeRabbit comment 3163836361).

Co-Authored-By: Claude <noreply@anthropic.com>
…on_images

Per PR review (mihow): adopt the 'disable model methods' safety pattern as
strict prefetch contract. Helpers now raise RuntimeError when required
relations are missing from `_prefetched_objects_cache` instead of silently
slow-pathing through Django's lazy lookups.

- `_require_prefetch()` gate on best_prediction / best_identification /
  detection_image_urls helpers.
- Restore old behavior: list responses cap detection_images at 3
  (`OccurrenceListSerializer.detection_images_limit`). Detail subclass
  overrides to None (unbounded) — TODO: bound when occurrence tracking
  enables thousands of detections per occurrence.
- New tests:
  - TestOccurrenceResponseShape pins list + detail JSON keys + the list cap.
  - TestOccurrencePrefetchHelpersEdgeCases asserts safe defaults on empty
    prefetched relations, and RuntimeError when prefetch is missing.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ami/main/models_future/occurrence.py`:
- Around line 83-85: The strict-path prefetch check currently only asserts
detections were prefetched but then iterates det.classifications.all(), which
can trigger per-detection queries; update the check to require the nested
classifications prefetch (use _require_prefetch(occurrence,
"detections__classifications") or otherwise validate that each Detection object
has a prefetched 'classifications' attribute) before iterating, so the code
fails fast when nested prefetching is missing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dff16c79-6ad9-4049-9270-883a29b5cf76

📥 Commits

Reviewing files that changed from the base of the PR and between b62fb39 and 454ec77.

📒 Files selected for processing (5)
  • ami/main/api/serializers.py
  • ami/main/api/views.py
  • ami/main/models.py
  • ami/main/models_future/occurrence.py
  • ami/main/tests.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • ami/main/api/views.py
  • ami/main/tests.py

Comment thread ami/main/models_future/occurrence.py
mihow and others added 2 commits April 29, 2026 16:22
Drop TestOccurrenceResponseShape — most assertions are key-presence checks
already enforced by serializer Meta. Fold the only valuable assertion (the
detection_images per-row cap on list responses) into _list_occurrences in
TestOccurrenceListQueryCount where the response data is already inspected.

Co-Authored-By: Claude <noreply@anthropic.com>
…cover image to 1

Exports invoke OccurrenceExportSerializer (subclasses OccurrenceSerializer),
whose inherited get_determination_details / _best_prediction call the strict
prefetch helpers in models_future/occurrence.py. JSONExporter.get_queryset()
did not apply the prefetch, so two CI tests in ami.exports.tests.DataExportTest
raised RuntimeError ("Occurrence X is missing prefetched relations
['detections']"). Add .with_detail_prefetches() to the exporter queryset.

Cap OccurrenceListSerializer.detection_images_limit from 3 to 1 — list cards
render a single cover image; payload stays tight without changing response
shape.

Document the nested-prefetch caveat in _require_prefetch: it only checks
top-level relations, so a caller prefetching 'detections' without nested
'classifications' silently re-introduces per-detection queries. Tighter
enforcement is deferred to the django-zen-queries follow-up (issue #1271),
which catches per-row queries at the view boundary regardless of depth.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow changed the title fix(occurrences): prefetch detection paths and classifications in list view fix(occurrences): remove all n+1 queries from occurrence API Apr 30, 2026
@mihow mihow requested a review from Copilot April 30, 2026 00:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ami/main/models_future/occurrence.py Outdated
Comment thread ami/main/models_future/occurrence.py Outdated
Comment thread ami/main/models_future/occurrence.py Outdated
Comment thread ami/main/models_future/occurrence.py
Comment thread ami/main/tests.py
best_prediction_from_prefetch:
  - Skip classifications with score=None (mirrors SQL NULL semantics:
    Max(NULL) = NULL, score IN (NULL) never matches).
  - Removes the asymmetry between -inf in max calc and 0.0 in sort key.
  - Document Python-vs-SQL semantic difference around per-algorithm
    max grouping (Python is the stricter interpretation).

TestOccurrenceDetailQueryCount:
  - Add test_detail_query_count_does_not_scale_with_detections,
    inflating one occurrence to 9 detections x 4 classifications.
    The single-detection test alone could not catch a multi-detection
    N+1; this guards the actual property the test name asserts.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ami/main/api/serializers.py (1)

1384-1426: ⚡ Quick win

Memoize _best_prediction per occurrence to avoid duplicate scans.

_best_prediction(obj) is used in both get_determination_details (Line 1406) and get_best_machine_prediction (Line 1425). Caching it once on the object avoids repeated Python traversal of prefetched detections/classifications.

Suggested refactor
     def _best_prediction(self, obj: Occurrence):
         from ami.main.models_future.occurrence import best_prediction_from_prefetch
 
-        return best_prediction_from_prefetch(obj)
+        cache_attr = "_cached_best_prediction_from_prefetch"
+        if hasattr(obj, cache_attr):
+            return getattr(obj, cache_attr)
+        value = best_prediction_from_prefetch(obj)
+        setattr(obj, cache_attr, value)
+        return value
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ami/main/api/serializers.py` around lines 1384 - 1426, The _best_prediction
traversal is called twice; modify _best_prediction to memoize its result on the
Occurrence instance (e.g., check for and return obj._cached_best_prediction if
present, otherwise compute via best_prediction_from_prefetch(obj), assign it to
obj._cached_best_prediction and return it) so both get_determination_details and
get_best_machine_prediction reuse the cached value and avoid duplicate scans;
update only the _best_prediction function (reference symbols: _best_prediction,
best_prediction_from_prefetch, get_determination_details,
get_best_machine_prediction) and ensure you cache None results as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ami/main/tests.py`:
- Around line 3049-3054: The test creates a Detection with bbox set to
normalized floats but the codebase expects absolute pixel coordinates; update
the Detection.objects.create call to supply integer pixel bbox values instead of
normalized floats—for example compute with the fixture image dimensions (e.g.,
[int(source_image.width*0.1), int(source_image.height*0.1),
int(source_image.width*0.2), int(source_image.height*0.2)]) or use explicit
pixel values like [10, 10, 20, 20] for the bbox field so the Detection.bbox
matches production assumptions.

---

Nitpick comments:
In `@ami/main/api/serializers.py`:
- Around line 1384-1426: The _best_prediction traversal is called twice; modify
_best_prediction to memoize its result on the Occurrence instance (e.g., check
for and return obj._cached_best_prediction if present, otherwise compute via
best_prediction_from_prefetch(obj), assign it to obj._cached_best_prediction and
return it) so both get_determination_details and get_best_machine_prediction
reuse the cached value and avoid duplicate scans; update only the
_best_prediction function (reference symbols: _best_prediction,
best_prediction_from_prefetch, get_determination_details,
get_best_machine_prediction) and ensure you cache None results as well.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2a4cdc38-93a6-40e4-883d-075bc1815960

📥 Commits

Reviewing files that changed from the base of the PR and between 454ec77 and 1f5918a.

📒 Files selected for processing (4)
  • ami/exports/format_types.py
  • ami/main/api/serializers.py
  • ami/main/models_future/occurrence.py
  • ami/main/tests.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • ami/main/models_future/occurrence.py

Comment thread ami/main/tests.py
@mihow
Copy link
Copy Markdown
Collaborator Author

mihow commented Apr 30, 2026

Staging A/B benchmark

Side-by-side run against arctia.dev (this PR deployed) vs serbia.dev (baseline, similar infra), Vermont Atlas of Life project 5 (5,075 occurrences). Tooling: plain curl + xargs.

Single-request latency (5 runs each, varied offset to bypass cachalot)

limit arctia (PR) serbia (baseline) speedup
5 0.63s avg 1.30s avg 2.06×
25 1.63s avg 3.91s avg 2.39×
100 6.06s avg 12.17s avg 2.01×

The wall-clock speedup is smaller than the 9.1× query-count reduction at limit=25 — per-request fixed costs (TLS, auth, project filter, presigned URL generation) don't scale with query count, so they dominate the residual.

Detail endpoint (3 runs)

arctia (PR) serbia (baseline) speedup
0.33s avg 0.62s avg 1.9×

Concurrent load (10 in flight, 30 total, limit=25)

metric arctia (PR) serbia (baseline) speedup
wall 17.2s 47.8s 2.78×
p50 5.92s 14.80s 2.50×
p95 6.51s 21.29s 3.27×
p99 6.54s 21.68s 3.31×

Tail widens under load — the PR holds steady, the baseline blows out at p95/p99.

Worker-pool unblocking — /captures/?project_id=5&limit=10 while 8 /occurrences/?limit=25 are in flight

env captures baseline captures under occurrence load
arctia (PR) 0.32–0.50s 1.46–2.37s
serbia (baseline) 0.91–1.66s 3.50–5.72s

Captures-under-occurrence-load on the PR (~2s) is roughly equivalent to captures-no-load on the baseline (~1.2s). Workers free up ~3× faster, so the "every page feels frozen" symptom under multi-reviewer use should largely go away.

Other view A/B (PR doesn't change these — sanity check that nothing regressed)

view arctia (PR) serbia (baseline)
captures 0.40s 0.89s
taxa 0.27s 0.44s
events 0.45s 0.90s
deployments 0.32s 0.48s
jobs 0.64s 1.10s
sourceimages 0.27s 0.26s

sourceimages is identical on both → confirms baseline path. Other deltas are infra/cache noise (sequential probes, no shared load); the PR doesn't touch these viewsets.

Production sizing reference (no benchmark run on prod, just for scale)

api.antenna.insectai.org Vermont Atlas of Life (project 18, 31,741 occurrences):

limit latency (no PR)
5 1.25s avg
25 2.24s avg
100 6.07s avg

Post-deploy: based on the staging multipliers, expect roughly halved latency at limit=25 (~1.0s) and ~2× at limit=5 (~0.6s). Larger pages benefit less in absolute terms because per-row serializer cost dominates above the N+1 ceiling.

Correctness

  • Top-level keys identical on PR vs baseline ✓
  • detection_images length = 1 in list response on both (this dataset has 1 detection per occurrence; cap not visible)
  • Detail content diff shows only env-specific URLs and a re-run classification (different id/score/created_at) — expected, since arctia ran ML again
  • Identification create flow exercised in UI on arctia (occurrences 50781, 50804 identified live, no 500) — confirms strict prefetch contract holds under POST
  • Occurrences-by-collection JSON export triggered for 7,090 records — validates JSONExporter.get_queryset() strict-prefetch wiring

Still pending

  • Soak window on AMI-PLATFORM-API-BMY (the original 34–40s cold-response Sentry issue) — needs ~30 min of real reviewer load on arctia
  • Production benchmark on api.antenna.insectai.org after merge

@mihow
Copy link
Copy Markdown
Collaborator Author

mihow commented Apr 30, 2026

Production load test (api.antenna.insectai.org, project 18 = 31,741 occurrences)

Light concurrency, no concurrent users at probe time.

metric 4 in flight, 12 total, limit=25
wall 14.7s
avg 4.33s
p50 4.60s
p95 6.54s
max 6.60s

Prod baseline holds up better than serbia.dev despite 6× more data — likely bigger gunicorn pool / beefier infra. But p95 still hits 6.5s under modest 4-concurrent load. Under real multi-reviewer concurrency, expect the same tail widening that we see on serbia → arctia (3.3× p95 gap), so post-deploy p95 should drop into the ~2s range.

Export verification status

  • occurrences_simple_csv export on arctia completed successfully (7,090 records, file generated). However, CSV export uses OccurrenceTabularSerializer, which doesn't go through the strict prefetch contract — this proves the CSV path wasn't broken, but does NOT validate the JSONExporter.get_queryset() fix in 8f7db77.
  • A occurrences_simple_json export still needs to be triggered on arctia to exercise OccurrenceExportSerializer → strict-prefetch helpers under real export workload.

Reusable curl + xargs runner that probes the project's occurrence count,
then issues N parallel requests at varied offsets to bypass cachalot.
Reports wall, avg, min, p50, p95, p99, max, and per-status error counts.

Used to A/B compare staging deployments (arctia vs serbia) and to
characterize prod baseline for PR #1274. Re-run after deploy to verify
the speedup against prod's pre-merge numbers in the PR thread.

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow
Copy link
Copy Markdown
Collaborator Author

mihow commented Apr 30, 2026

Production load test — scaled

Runs against https://api.antenna.insectai.org/api/v2/occurrences/?project_id=18 (Vermont Atlas of Life, 31,741 occurrences). All 200s except where noted. Test reproducer: scripts/benchmark_occurrences_list.sh (committed in cbae3fc).

concurrency total limit wall avg p50 p95 p99 max errors
4 12 25 14.7s 4.33s 4.60s 6.54s 6.60s 0/12
10 30 25 13.9s 3.79s 3.93s 5.28s 5.32s 5.32s 0/30
20 60 25 27.1s 7.70s 8.11s 13.50s 13.98s 14.00s 4/60 (502 Bad Gateway)
10 30 100 90.4s 27.13s 13.93s 57.00s 58.62s 59.60s 0/30

Findings:

  • At limit=25, prod stays usable up to ~10 concurrent (~5s p95). At 20 concurrent it saturates the gunicorn pool — 4 requests returned 502 after ~2.7s (ALB-level timeout / backpressure).
  • At limit=100, prod is already in trouble at 10 concurrent: p95 = 57s, max = 59.6s — i.e. hitting the gunicorn worker timeout. This is exactly the original Sentry symptom (AMI-PLATFORM-API-BMY, 34–40s cold responses).

Inferred post-merge expectation, based on the staging multipliers (~3.3× p95 reduction at limit=25 under concurrent load):

  • 10 concurrent / limit=25: prod p95 ~5.3s → ~1.6s
  • 20 concurrent / limit=25: 502s should disappear (workers free up faster, no saturation)
  • 10 concurrent / limit=100: prod p95 ~57s → ~17–25s (still slow, but below worker timeout — list with limit=100 is rare in the UI anyway)

Reproducer

# Pre-merge prod baseline
./scripts/benchmark_occurrences_list.sh https://api.antenna.insectai.org 18 10 30 25
./scripts/benchmark_occurrences_list.sh https://api.antenna.insectai.org 18 20 60 25
./scripts/benchmark_occurrences_list.sh https://api.antenna.insectai.org 18 10 30 100

# Post-merge prod re-run (same exact commands)
# Compare against the table above.

The script:

  • Probes count for the project so the random offset stays in range
  • Uses RANDOM % MAX_OFFSET to bypass django-cachalot per request
  • Reports avg / min / p50 / p95 / p99 / max plus per-status error counts

mihow and others added 2 commits April 30, 2026 17:08
…rings

Simplify the prefetch contract surface introduced in this branch:

- Drop _best_prediction / _best_identification thin wrappers in
  OccurrenceListSerializer; inline best_*_from_prefetch calls.
- Drop prefetches_for_list_serializer / prefetches_for_detail_serializer
  1-line list wrappers; with_list_prefetches / with_detail_prefetches now
  call the Prefetch factories directly.
- Compress _require_prefetch and best_*_from_prefetch docstrings; keep the
  strict-prefetch behavior and the SQL/Python null-handling note as a
  single line each.
- OccurrenceSerializer.detection_images_limit = 100 (was None) — bound
  detail responses from the start; pagination is the long-term answer.

Net -47 lines, no behavior change beyond the detail detection_images cap.

Co-Authored-By: Claude <noreply@anthropic.com>
Audit follow-up to PR #1274 (Occurrence list N+1 fix). Both viewsets
already scale flat (4q at limit=25 with cachalot disabled and warm pool).
Tests pin the current behaviour so future serializer changes can't silently
reintroduce per-row queries.

Co-Authored-By: Claude <noreply@anthropic.com>
Field was declared as TaxonNestedSerializer(read_only=True) with no source,
but Taxon has no `parents` attribute (only `parents_json` JSONB cache and
the `parent` FK). DRF resolved getattr(taxon, "parents") to nothing, so the
field rendered null for every taxon in the list response.

Match the detail serializer at TaxonSerializer.parents (line 865) by reading
from parents_json with TaxonParentSerializer. UI in
ui/src/data-services/models/taxa.ts already consumes this shape and falls
back when missing — fix lets the list path populate ranks directly.

Found during audit follow-up to PR #1274.

Co-Authored-By: Claude <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
ami/main/tests.py (1)

3056-3063: 💤 Low value

Use a timezone-aware timestamp here for consistency with the surrounding fixtures.

datetime.datetime.now() is naive, while every other timestamp in this test (and the seed detection on Line 3052) comes from source_image.timestamp, which is tz-aware. Mixing a naive value can trigger Django's RuntimeWarning: DateTimeField received a naive datetime (and is order-fragile under USE_TZ=True). Reusing source_image.timestamp keeps the test self-consistent and avoids the warning.

Suggested fix
-                    timestamp=datetime.datetime.now(),
+                    timestamp=source_image.timestamp,

The unused import datetime at Line 3034 can then also be dropped.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ami/main/tests.py` around lines 3056 - 3063, The test creates Classification
rows with a naive timestamp via datetime.datetime.now(); update the
Classification.objects.create calls in the loop to use the existing tz-aware
source_image.timestamp (or seed_classification.timestamp if more appropriate)
instead of datetime.datetime.now() to ensure consistency with surrounding
fixtures and avoid Django's naive datetime warning, and remove the now-unused
import datetime at the top of the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/benchmark_occurrences_list.sh`:
- Around line 39-44: The script fails when MAX_OFFSET is 0 because offset is
computed with RANDOM % ${MAX_OFFSET}; update the offset logic in the benchmark
loop (the block using variables START, TOTAL, CONCURRENCY, MAX_OFFSET, BASE_URL,
PROJECT_ID, LIMIT, offset, OUT) to guard against division by zero—compute offset
only when MAX_OFFSET > 0 (e.g., use RANDOM % MAX_OFFSET) and otherwise set
offset to 0 (or a fixed safe value); ensure this conditional is applied inside
the xargs/bash -c invocation so curl uses a valid offset for small datasets.

---

Nitpick comments:
In `@ami/main/tests.py`:
- Around line 3056-3063: The test creates Classification rows with a naive
timestamp via datetime.datetime.now(); update the Classification.objects.create
calls in the loop to use the existing tz-aware source_image.timestamp (or
seed_classification.timestamp if more appropriate) instead of
datetime.datetime.now() to ensure consistency with surrounding fixtures and
avoid Django's naive datetime warning, and remove the now-unused import datetime
at the top of the file.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 933dc148-f716-4ac5-8055-f00c3fe63cdb

📥 Commits

Reviewing files that changed from the base of the PR and between 1f5918a and d41deb7.

📒 Files selected for processing (5)
  • ami/main/api/serializers.py
  • ami/main/models.py
  • ami/main/models_future/occurrence.py
  • ami/main/tests.py
  • scripts/benchmark_occurrences_list.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • ami/main/models.py

Comment thread scripts/benchmark_occurrences_list.sh
- ami/main/tests.py: switch Detection.bbox in the inflated-detections
  fixture to pixel-space ints, matching the rest of the codebase's bbox
  conventions.
- scripts/benchmark_occurrences_list.sh: guard against RANDOM % 0 when
  the project has fewer than LIMIT occurrences (MAX_OFFSET = 0).

Co-Authored-By: Claude <noreply@anthropic.com>
@mihow mihow merged commit ad0826d into main May 11, 2026
7 checks passed
@mihow mihow deleted the fix/occurrence-list-queries branch May 11, 2026 23:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OccurrenceViewSet list: N+1 queries on detections, best_prediction, best_identification

2 participants