Skip to content

fix: resolve DB bootstrap issues on fresh installations#8108

Open
grantfitzsimmons wants to merge 9 commits into
mainfrom
issue-8107
Open

fix: resolve DB bootstrap issues on fresh installations#8108
grantfitzsimmons wants to merge 9 commits into
mainfrom
issue-8107

Conversation

@grantfitzsimmons
Copy link
Copy Markdown
Member

@grantfitzsimmons grantfitzsimmons commented May 21, 2026

Fixes #8107

This PR fixes first-run and recovery behavior for the development Docker setup when MariaDB starts with a database that does not contain a valid Specify schema.

Previously, the startup path could treat a partially initialized database as if it were ready, which led to confusing migration failures and restart loops. The new flow makes the behavior explicit:

  • if the database is empty, startup continues normally
  • if the database is non-empty and missing the Specify schema, startup now requires explicit opt-in before any destructive reset
  • if reset is not allowed, the container exits cleanly instead of looping

Changed

  • Removed the MariaDB bootstrap user settings that caused first-run initialization problems with the container image.
  • Updated sp7_db_setup_check.sh to distinguish between:
    • an empty database
    • an existing database with a valid legacy Specify schema
    • an existing database that is missing the Specify schema
  • Added an explicit ALLOW_DB_RESET=true opt-in for destructive reset of a non-empty, schema-less database. In our development environment, this is true in the default .env file. If it is not specified, the default is false.
  • Made startup fail fast when database setup cannot proceed, instead of continuing into later migration steps against an invalid schema.
  • Clarified docker-entrypoint.sh so it is obvious that:
    • manage.py base_specify_migration runs inside sp7_db_setup_check.sh
    • manage.py migrate runs inside sp7_db_setup_check.sh
    • only manage.py run_key_migration_functions is called afterward from the entrypoint

With DB Reset

Set this in .env:

ALLOW_DB_RESET=true

Expected specify7 logs:

specify7       | MariaDB is up and running.
mariadb        | 2026-05-21 16:43:43 3 [Warning] Aborted connection 3 to db: 'unconnected' user: 'unauthenticated' host: '172.18.0.9' (This connection closed normally without authentication)
specify7       | Client host as seen by MariaDB: '172.18.0.9'
specify7       | Database 'specify' exists but does not contain a Specify schema.
specify7       | Resetting non-empty database 'specify' because ALLOW_DB_RESET=true was set.
specify7       | Ensuring migrator user 'specify_migrator' exists for relevant hosts...
specify7       | Migrator user 'specify_migrator'@'%' already exists.
specify7       | Migrator user 'specify_migrator'@'172.18.0.9' already exists.
specify7       | Existing hosts for 'specify_migrator' in mysql.user:
specify7       | CONCAT("'", user, "'@'", host, "'")
specify7       | 'specify_migrator'@'%'
specify7       | 'specify_migrator'@'172.18.0.9'
specify7       | Granting privileges to migrator user 'specify_migrator'...
specify7       | Granting ALL PRIVILEGES ON `specify`.* TO 'specify_migrator'@'%'...
specify7       | Granting ALL PRIVILEGES ON `specify`.* TO 'specify_migrator'@'172.18.0.9'...
specify7       | Verified: migrator user 'specify_migrator' has usable access to 'specify'.
specify7       | Ensuring app user 'specify_user' exists for relevant hosts...
specify7       | App user 'specify_user'@'%' already exists.
specify7       | App user 'specify_user'@'172.18.0.9' already exists.
specify7       | Existing hosts for 'specify_user' in mysql.user:
specify7       | CONCAT("'", user, "'@'", host, "'")
specify7       | 'specify_user'@'%'
specify7       | 'specify_user'@'172.18.0.9'
specify7       | Granting required privileges to app user 'specify_user' for relevant hosts...
specify7       | Granting app privileges on `specify`.* to 'specify_user'@'%'...
specify7       | Granting app privileges on `specify`.* to 'specify_user'@'172.18.0.9'...
specify7       | Verified: app user 'specify_user' has required privileges on 'specify'.
specify7       | --------------------------------------------------
specify7       | Database and user setup complete.
specify7       | New database created: True
specify7       | New migrator user created: False
specify7       | New app user created: False
specify7       | --------------------------------------------------
specify7       | Blank or newly created database detected.

In this mode, the database is recreated and startup continues normally.

Without DB Reset

Set this in .env:

ALLOW_DB_RESET=false

Expected specify7 logs:

specify7       | MariaDB is up and running.
mariadb        | 2026-05-21 16:40:52 3 [Warning] Aborted connection 3 to db: 'unconnected' user: 'unauthenticated' host: '172.18.0.9' (This connection closed normally without authentication)
specify7       | Client host as seen by MariaDB: '172.18.0.9'
specify7       | Database 'specify' exists but does not contain a Specify schema. Please ensure that the database is either empty or contains a valid Specify schema.
specify7       | Error: Database 'specify' exists but lacks the Specify schema and is not empty.
specify7       | If you want to allow a destructive reset, set ALLOW_DB_RESET=true in your environment before starting.
specify7       | Otherwise restore a proper Specify database or delete the database entirely to start fresh.
specify7       | Database setup failed; skipping startup migrations.
specify7 exited with code 1

In this mode, the container exits cleanly and the database is left untouched.


This avoids two failure modes:

  • treating a partial or invalid database as if it were a valid first-run database
  • continuing into later migration steps after setup has already failed

Checklist

  • Self-review the PR after opening it to make sure the changes look good and self-explanatory
  • Add relevant issue to release milestone

Testing instructions

  1. Start from a clean checkout with the development Docker composition configured.
  2. Run docker compose up --build.
  3. Verify the startup behavior with and without ALLOW_DB_RESET=true when connecting to a dummy database that does not have a specifyuser table but does contain more than 1 table in it:
    • with ALLOW_DB_RESET=true, the stack should drop and recreate the database and continue startup
    • without it, startup should stop with the explicit database schema error and the container should not loop (for data preservation)

Summary by CodeRabbit

  • Chores

    • Updated Docker Compose configuration for MariaDB service setup.
  • Bug Fixes

    • Added database reset protection mechanism with a configurable safety setting (disabled by default for production).
    • Enhanced database validation during startup with improved error handling and legacy schema detection.
    • Database operations now fail gracefully with clear guidance when setup issues are encountered, preventing data corruption.

Review Change Stack

Add ALLOW_DB_RESET environment flag and enhance DB setup checks to avoid accidental data loss. .env: add ALLOW_DB_RESET=true with a comment recommending false in production. sp7_db_setup_check.sh: detect presence of a legacy Specify schema by checking for the 'specifyuser' table; if the database exists but lacks the schema, determine whether the DB is empty and proceed, or if non-empty require explicit opt-in via ALLOW_DB_RESET=true to drop and recreate the database. If ALLOW_DB_RESET is not set, the script now aborts with guidance. Also adjust the migration logic to use the legacy-schema check to choose the appropriate base_specify_migration invocation.
Remove the 'restart: unless-stopped' line from the development service in docker-compose.yml
Replace the previous tolerant invocation of sp7_db_setup_check.sh (which used set +e / set -e) with an explicit if-check. If the DB setup script succeeds, continue to run key migration functions; if it fails, print an error and exit with status 1 to avoid continuing startup with an incomplete DB setup.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a88cedf3-d9bc-474e-a1d8-cb9c7c138d36

📥 Commits

Reviewing files that changed from the base of the PR and between 5608205 and bea4028.

📒 Files selected for processing (3)
  • .env
  • docker-entrypoint.sh
  • sp7_db_setup_check.sh

📝 Walkthrough

Walkthrough

This PR hardens the database initialization flow by adding legacy schema detection, making database setup a hard startup requirement, and providing controlled reset behavior via environment configuration. MariaDB bootstrap is simplified by removing redundant user variables that conflict with the official image, and migration execution now branches on schema presence rather than database creation timing.

Changes

Database setup and legacy schema detection

Layer / File(s) Summary
Database reset configuration
.env, docker-compose.yml
ALLOW_DB_RESET=false environment variable gates destructive reset behavior; MYSQL_USER and MYSQL_PASSWORD removed from MariaDB service to prevent bootstrap conflicts; specify7 service restart policy removed.
Legacy Specify schema detection
sp7_db_setup_check.sh
New LEGACY_SCHEMA_EXISTS flag checks for specifyuser table in existing databases to distinguish between legacy installations and partial/empty database states.
Database reset and safety checks
sp7_db_setup_check.sh
When database exists without legacy schema, table count determines behavior: empty databases proceed as initial setup, non-empty databases reset only if ALLOW_DB_RESET=true, otherwise exit with guidance to restore or delete the database.
Migration execution based on schema state
sp7_db_setup_check.sh
base_specify_migration branching switches from NEW_DATABASE_CREATED to LEGACY_SCHEMA_EXISTS: legacy paths use --use-override to preserve existing data, new/reset databases run standard migration.
Startup enforcement of database setup
docker-entrypoint.sh
docker-entrypoint.sh now gates run_key_migration_functions on successful execution of sp7_db_setup_check.sh, exiting with status 1 on failure instead of continuing, making database readiness a hard prerequisite.
🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Automatic Tests ⚠️ Warning No automatic tests added for database bootstrap logic changes in sp7_db_setup_check.sh and docker-entrypoint.sh, despite PR template and issue #8107 requiring test coverage. Add automated tests for fresh DB init, ALLOW_DB_RESET scenarios, and legacy schema detection as required by the PR template checklist and issue #8107.
Testing Instructions ⚠️ Warning PR template's "Testing instructions" section is empty. Per objectives, instructions should verify fresh installs, ALLOW_DB_RESET=true/false cases, and legacy schema detection. Add clear testing steps covering: fresh database first-run, ALLOW_DB_RESET=true with non-empty schema-less DB, ALLOW_DB_RESET=false rejection, and legacy schema handling.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: fixing database bootstrap issues on fresh Docker installations by handling initialization state detection and optional reset logic.
Linked Issues check ✅ Passed The pull request addresses all core requirements from issue #8107: removes MariaDB bootstrap user conflicts, implements robust database-existence detection, prevents fake migration records, ensures idempotent startup/setup, and gates destructive resets with explicit opt-in.
Out of Scope Changes check ✅ Passed All changes directly address the linked issue requirements: .env adds ALLOW_DB_RESET setting, docker-compose.yml removes problematic bootstrap credentials, docker-entrypoint.sh gates migrations on setup success, and sp7_db_setup_check.sh implements state detection and conditional reset logic.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch issue-8107

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@grantfitzsimmons grantfitzsimmons marked this pull request as ready for review May 21, 2026 16:53
@grantfitzsimmons grantfitzsimmons added this to the 7.12.1 milestone May 21, 2026
@grantfitzsimmons grantfitzsimmons changed the title fix: remove MYSQL_USER and MYSQL_PASSWORD fix: resolve DB bootstrap issues on fresh installations May 21, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
sp7_db_setup_check.sh (1)

460-469: ⚡ Quick win

Fail fast on base_specify_migration so a half-migrated state can't slip through.

The conditional swap looks right, but neither base_specify_migration nor the subsequent migrate has its exit status checked. If base_specify_migration fails, the script still runs migrate and the script's overall exit status becomes migrate's — which can mask a failure and produce exactly the partial-migration state issue #8107 is trying to eliminate. A small guard makes the contract explicit:

♻️ Suggested guard
-if [[ "$LEGACY_SCHEMA_EXISTS" -eq 1 ]]; then
-  echo "Existing database detected."
-  ve/bin/python manage.py base_specify_migration --use-override --database=${MIGRATION_DB_ALIAS}
-else
-  echo "Blank or newly created database detected."
-  ve/bin/python manage.py base_specify_migration --database=${MIGRATION_DB_ALIAS}
-fi
-
-# Run Django migrations
-ve/bin/python manage.py migrate --database=${MIGRATION_DB_ALIAS}
+if [[ "$LEGACY_SCHEMA_EXISTS" -eq 1 ]]; then
+  echo "Existing database detected."
+  if ! ve/bin/python manage.py base_specify_migration --use-override --database="${MIGRATION_DB_ALIAS}"; then
+    echo "Error: base_specify_migration failed on existing database; aborting before Django migrations."
+    exit 1
+  fi
+else
+  echo "Blank or newly created database detected."
+  if ! ve/bin/python manage.py base_specify_migration --database="${MIGRATION_DB_ALIAS}"; then
+    echo "Error: base_specify_migration failed; aborting before Django migrations."
+    exit 1
+  fi
+fi
+
+# Run Django migrations
+if ! ve/bin/python manage.py migrate --database="${MIGRATION_DB_ALIAS}"; then
+  echo "Error: Django migrate failed."
+  exit 1
+fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@sp7_db_setup_check.sh` around lines 460 - 469, The script currently runs
ve/bin/python manage.py base_specify_migration (conditional on
LEGACY_SCHEMA_EXISTS) and then always runs ve/bin/python manage.py migrate
without checking return codes; update the flow to fail fast: after each
base_specify_migration invocation (both branches that call
base_specify_migration with --use-override or without), check its exit status
and if non-zero immediately exit the script with that status (so the subsequent
migrate is not run on a failed base_specify_migration); apply the same pattern
of explicit exit checking around the migrate invocation if you want symmetry,
referencing the commands/variables base_specify_migration, migrate,
LEGACY_SCHEMA_EXISTS, and MIGRATION_DB_ALIAS to locate the calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.env:
- Around line 6-8: The ALLOW_DB_RESET flag currently defaults to true which
enables destructive DB resets; change its default value to false so destructive
reset is opt-in by explicitly setting ALLOW_DB_RESET=true when needed, and
verify any code reading ALLOW_DB_RESET (env key name ALLOW_DB_RESET) treats
absence or "false"/"0" as disabled.

In `@docker-entrypoint.sh`:
- Line 15: The inline comment beside the command "ve/bin/python manage.py
run_key_migration_functions" is inaccurate because the command is already
active; update the comment to reflect that the migration runs on startup (or
remove the misleading "Uncomment if you want..." wording). Locate the line
containing ve/bin/python manage.py run_key_migration_functions and replace the
comment with a correct note such as "Runs key migration functions on startup" or
delete the comment entirely to avoid confusion.

In `@sp7_db_setup_check.sh`:
- Around line 148-150: The DB_TABLE_COUNT fallback of "|| DB_TABLE_COUNT=0" can
mask query failures and cause a populated DB to be treated as empty; change the
mariadb invocation that sets DB_TABLE_COUNT so that any non-zero exit or empty
output triggers an error path and exits (do not silently set DB_TABLE_COUNT=0).
Specifically, run the SELECT COUNT(*) command (the line that calls mariadb with
-sse and assigns DB_TABLE_COUNT) and if the command fails or produces an
empty/non-numeric result, log an explicit error and exit with non-zero status
instead of assigning 0; only allow DB_TABLE_COUNT==0 when the command succeeded
and returned a numeric 0, so the ALLOW_DB_RESET/abort logic that follows behaves
correctly.
- Around line 127-131: The current mariadb query can fail and leave
LEGACY_SCHEMA_EXISTS empty (treated as 0), so change the block that sets
LEGACY_SCHEMA_EXISTS (guarded by DB_EXISTS) to explicitly detect query failures:
run the mariadb command and capture its exit status and output into a variable,
then if the command exit code is non-zero or the output is empty/not an integer,
log an error and exit non‑zero (fail fast) rather than treating it as "no legacy
schema"; reference the existing symbols LEGACY_SCHEMA_EXISTS, DB_EXISTS, and the
mariadb invocation so you replace the direct command substitution with a
two-step capture/check and an explicit process exit on failure.

---

Nitpick comments:
In `@sp7_db_setup_check.sh`:
- Around line 460-469: The script currently runs ve/bin/python manage.py
base_specify_migration (conditional on LEGACY_SCHEMA_EXISTS) and then always
runs ve/bin/python manage.py migrate without checking return codes; update the
flow to fail fast: after each base_specify_migration invocation (both branches
that call base_specify_migration with --use-override or without), check its exit
status and if non-zero immediately exit the script with that status (so the
subsequent migrate is not run on a failed base_specify_migration); apply the
same pattern of explicit exit checking around the migrate invocation if you want
symmetry, referencing the commands/variables base_specify_migration, migrate,
LEGACY_SCHEMA_EXISTS, and MIGRATION_DB_ALIAS to locate the calls.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 73a2cfe6-6664-4db2-b1ae-c00a5b7f4c3b

📥 Commits

Reviewing files that changed from the base of the PR and between 66b1266 and 5608205.

📒 Files selected for processing (4)
  • .env
  • docker-compose.yml
  • docker-entrypoint.sh
  • sp7_db_setup_check.sh
💤 Files with no reviewable changes (1)
  • docker-compose.yml

Comment thread .env Outdated
Comment thread docker-entrypoint.sh Outdated
Comment thread sp7_db_setup_check.sh
Comment thread sp7_db_setup_check.sh Outdated
@github-project-automation github-project-automation Bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board May 21, 2026
out of an abundance of caution, this detects any query failures explicitly instead of silently defaulting to "no legacy schema".
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Dev Attention Needed

Development

Successfully merging this pull request may close these issues.

MariaDB bootstrap and partial-migration race cause failed first-run and FK errors on main

2 participants