From 22e298823a9869cc50983c1bfe2fd685381ef9b3 Mon Sep 17 00:00:00 2001 From: meskill <8974488+meskill@users.noreply.github.com> Date: Wed, 22 Apr 2026 09:59:30 +0000 Subject: [PATCH 1/2] feat: explain the configuration for copy_data retries --- docs/configuration/pgdog.toml/general.md | 12 ++++ docs/features/sharding/resharding/.pages | 2 +- docs/features/sharding/resharding/cutover.md | 2 +- .../features/sharding/resharding/databases.md | 2 +- docs/features/sharding/resharding/index.md | 4 +- .../sharding/resharding/{hash.md => move.md} | 58 ++++++++++++++++++- docs/features/sharding/resharding/schema.md | 6 +- docs/roadmap.md | 2 +- 8 files changed, 77 insertions(+), 11 deletions(-) rename docs/features/sharding/resharding/{hash.md => move.md} (82%) diff --git a/docs/configuration/pgdog.toml/general.md b/docs/configuration/pgdog.toml/general.md index 21aae155..2b3660d9 100644 --- a/docs/configuration/pgdog.toml/general.md +++ b/docs/configuration/pgdog.toml/general.md @@ -457,6 +457,18 @@ Available options: `text` format is required when migrating from `INTEGER` to `BIGINT` primary keys during resharding. +### `resharding_copy_retry_max_attempts` + +How many times PgDog retries a single table copy after a failure (e.g., a shard goes down mid-copy). The delay starts at [`resharding_copy_retry_min_delay`](#resharding_copy_retry_min_delay) and doubles each attempt, capped at 32× that value. + +Default: **`5`** + +### `resharding_copy_retry_min_delay` + +Starting delay between retry attempts, in milliseconds. Doubles on each attempt, capped at 32× this value (`32_000` ms at the default). + +Default: **`1_000`** (1s) + ### `reload_schema_on_ddl` !!! 
warning diff --git a/docs/features/sharding/resharding/.pages b/docs/features/sharding/resharding/.pages index c31f3414..4b392fdb 100644 --- a/docs/features/sharding/resharding/.pages +++ b/docs/features/sharding/resharding/.pages @@ -2,5 +2,5 @@ nav: - 'index.md' - 'databases.md' - 'schema.md' - - 'hash.md' + - 'move.md' - 'cutover.md' diff --git a/docs/features/sharding/resharding/cutover.md b/docs/features/sharding/resharding/cutover.md index a8cd718a..cdf74993 100644 --- a/docs/features/sharding/resharding/cutover.md +++ b/docs/features/sharding/resharding/cutover.md @@ -8,7 +8,7 @@ icon: material/set-right This is a new and experimental feature. Please make sure to test it before deploying to production and report any issues you find. -Traffic cutover involves moving application traffic (read and write queries) to the destination database. This happens when the source and destination databases are in sync: all source data is copied, and resharded with [logical replication](hash.md). +Traffic cutover involves moving application traffic (read and write queries) to the destination database. This happens when the source and destination databases are in sync: all source data is copied, and resharded with [logical replication](move.md). ## Performing the cutover diff --git a/docs/features/sharding/resharding/databases.md b/docs/features/sharding/resharding/databases.md index 73cbe9c8..34dd2403 100644 --- a/docs/features/sharding/resharding/databases.md +++ b/docs/features/sharding/resharding/databases.md @@ -8,7 +8,7 @@ PgDog's strategy for resharding Postgres databases is to create a new, independe ## Requirements -New databases should be **empty**: don't migrate your [table definitions](schema.md) or [data](hash.md). These will be taken care of automatically by PgDog. +New databases should be **empty**: don't migrate your [table definitions](schema.md) or [data](move.md). These will be taken care of automatically by PgDog. 
### Database users diff --git a/docs/features/sharding/resharding/index.md b/docs/features/sharding/resharding/index.md index cc536247..123133b6 100644 --- a/docs/features/sharding/resharding/index.md +++ b/docs/features/sharding/resharding/index.md @@ -24,7 +24,7 @@ The resharding process is composed of four independent operations. The first one |-|-| | [Create new cluster](databases.md) | Create a new set of empty databases that will be used for storing data in the new, sharded cluster. | | [Schema synchronization](schema.md) | Replicate table and index definitions to the new shards, making sure the new cluster has the same schema as the old one. | -| [Move & reshard data](hash.md) | Copy data using logical replication, while redistributing rows in-flight between new shards. | +| [Move & reshard data](move.md) | Copy data using logical replication, while redistributing rows in-flight between new shards. | | [Cutover traffic](cutover.md) | Make the new cluster service both reads and writes from the application, without taking downtime. | While each step can be executed separately by the operator, PgDog provides an [admin database](../../../administration/index.md) command to perform online resharding and traffic cutover steps in a completely automated fashion: @@ -50,5 +50,5 @@ The `` and `` parameters accept the name of the source and {{ next_steps_links([ ("Schema sync", "schema.md", "Synchronize table, index and other schema entities between the source and destination databases."), - ("Move data", "hash.md", "Redistribute data between shards using the configured sharding function. This happens without downtime and keeps the shards up-to-date with the source database until traffic cutover."), + ("Move data", "move.md", "Redistribute data between shards using the configured sharding function. 
This happens without downtime and keeps the shards up-to-date with the source database until traffic cutover."), ]) }} diff --git a/docs/features/sharding/resharding/hash.md b/docs/features/sharding/resharding/move.md similarity index 82% rename from docs/features/sharding/resharding/hash.md rename to docs/features/sharding/resharding/move.md index ce2a8213..8664964a 100644 --- a/docs/features/sharding/resharding/hash.md +++ b/docs/features/sharding/resharding/move.md @@ -121,9 +121,9 @@ PgDog will distribute the table copy load evenly between all replicas in the con To prevent the resharding process from impacting production queries, you can create a separate set of replicas just for resharding. Managed clouds (e.g., AWS RDS) make this especially easy, but require a warm-up period to fetch all the data from the backup snapshot, before they can read data at full speed of their storage volumes. - + To make sure dedicated replicas are not used for read queries in production, you can configure PgDog to use them for resharding only: - + ```toml [[databases]] name = "prod" @@ -215,6 +215,60 @@ FROM pg_replication_slots; ``` The replication delay between the two database clusters is measured in bytes. When that number reaches zero, the two databases are byte-for-byte identical, and traffic can be [cut over](cutover.md) to the destination database. +## Troubleshooting + +### Shard failure during copy + +If a shard goes down mid-copy, PgDog retries that table with exponential backoff - up to **5 attempts**, starting at **1 second** and doubling each time. + +The two cases behave differently: + +- **Destination shard down** — PgDog opens a fresh connection on the next attempt. +- **Source shard down** — the `TEMPORARY` slot for that table is re-created from scratch. No cleanup needed; `TEMPORARY` slots vanish when the connection closes (unlike the [permanent slot](#replication-slot) created at the start of the overall copy). 
+ +To change the defaults: + +```toml +[general] +resharding_copy_retry_max_attempts = 5 # per-table retry attempts +resharding_copy_retry_min_delay = 1000 # base backoff in ms; doubles each attempt, max 32× +``` + +### Rows remaining in destination after failure + +Most drops are clean: table copies run inside an implicit transaction, so Postgres rolls back on disconnect and the destination is left empty. + +The awkward case: the connection drops *after* `COPY` commits on some or all shards but before PgDog records it as done. The rows survive, and the next attempt hits primary key violations immediately. + +PgDog catches this and tells you exactly what to run: + +``` +data sync for "public"."orders" failed with rows remaining in destination; +truncate manually before retrying: TRUNCATE "public"."orders_new"; +``` + +Run the `TRUNCATE` on the destination (the table name is literal — copy it verbatim), then re-run `COPY_DATA`. + +### Binary format mismatch + +Different major Postgres versions can produce incompatible binary `COPY` data. PgDog surfaces this as a `BinaryFormatMismatch` error. Switch to text: + +```toml +[general] +resharding_copy_format = "text" +``` + +See [Integer primary keys](#integer-primary-keys) for the other common reason to use text format. + +### Replication slot not dropped after abort + +Aborting `COPY_DATA` without restarting it leaves the **permanent** replication slot sitting on the source. Postgres won't recycle WAL until it's gone. Drop it manually: + +```postgresql +SELECT pg_drop_replication_slot('slot_name'); +``` + +The slot name appears in the `COPY_DATA` startup log and in `SHOW REPLICATION_SLOTS`. 
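The backoff schedule implied by these two settings is easy to reason about. As an illustrative sketch (a model of the documented behavior, not PgDog's actual implementation), the delay before retry *n* is `min_delay × 2^(n-1)`, capped at 32× `min_delay`:

```python
def retry_delays(max_attempts: int = 5, min_delay_ms: int = 1000) -> list[int]:
    """Model of the documented backoff: the delay doubles on each
    attempt and is capped at 32x the configured minimum delay."""
    cap = 32 * min_delay_ms
    return [min(min_delay_ms * 2**attempt, cap) for attempt in range(max_attempts)]

# With the defaults (5 attempts, 1s base delay), the waits between attempts are:
print(retry_delays())  # [1000, 2000, 4000, 8000, 16000]
```

With the default settings the 32× cap is never hit; it only matters if `resharding_copy_retry_max_attempts` is raised above 6.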
## Next steps {{ next_steps_links([ diff --git a/docs/features/sharding/resharding/schema.md b/docs/features/sharding/resharding/schema.md index efc370e3..421ae98d 100644 --- a/docs/features/sharding/resharding/schema.md +++ b/docs/features/sharding/resharding/schema.md @@ -12,7 +12,7 @@ The schema synchronization process is composed of 4 distinct steps, all of which | Phase | Description | |-|-| | [Pre-data](#pre-data-phase) | Create identical tables on all shards along with the primary key constraint (and index). Secondary indexes are _not_ created yet. | -| [Post-data](#post-data-phase) | Create secondary indexes on all tables and shards. This is done after [moving data](hash.md), as a separate step, because it's considerably faster to create indexes on whole tables than while inserting individual rows. | +| [Post-data](#post-data-phase) | Create secondary indexes on all tables and shards. This is done after [moving data](move.md), as a separate step, because it's considerably faster to create indexes on whole tables than while inserting individual rows. | | [Cutover](#cutover) | This step is executed during traffic cutover, while application queries are blocked from executing on the database. | | Post-cutover | This step makes sure the rollback database cluster can handle reverse logical replication. | @@ -110,7 +110,7 @@ This will make sure all tables and schemas in your database are copied and resha ## Post-data phase -The post-data phase is performed after the [data copy](hash.md) is complete and tables have been synchronized with logical replication. Its job is to create all secondary indexes (e.g., `CREATE INDEX`). +The post-data phase is performed after the [data copy](move.md) is complete and tables have been synchronized with logical replication. Its job is to create all secondary indexes (e.g., `CREATE INDEX`). 
This step is performed after copying data because it makes the copy process considerably faster: Postgres doesn't need to update several indexes while writing rows into the tables. @@ -189,5 +189,5 @@ pg_dump_path = "/path/to/pg_dump" ## Next steps {{ next_steps_links([ - ("Move data", "hash.md", "Redistribute data between shards using the configured sharding function. This happens without downtime and keeps the shards up-to-date with the source database until traffic cutover."), + ("Move data", "move.md", "Redistribute data between shards using the configured sharding function. This happens without downtime and keeps the shards up-to-date with the source database until traffic cutover."), ]) }} diff --git a/docs/roadmap.md b/docs/roadmap.md index c2b8a94e..9923d704 100644 --- a/docs/roadmap.md +++ b/docs/roadmap.md @@ -89,7 +89,7 @@ Support for sorting rows in [cross-shard](features/sharding/cross-shard-queries/ | Feature | Status | Notes | |-|-|-| -| [Data sync](features/sharding/resharding/hash.md) | :material-wrench: | Sync table data with logical replication. | +| [Data sync](features/sharding/resharding/move.md) | :material-wrench: | Sync table data with logical replication. | | [Schema sync](features/sharding/resharding/schema.md) | :material-wrench: | Sync table, index and constraint definitions. | | Online rebalancing | :material-calendar-check: | Not automated yet, requires manual orchestration. 
| From 4779ffa12f7130e854ae2c861025487a19aa Mon Sep 17 00:00:00 2001 From: meskill <8974488+meskill@users.noreply.github.com> Date: Wed, 29 Apr 2026 09:06:09 +0000 Subject: [PATCH 2/2] fix review --- docs/features/sharding/resharding/move.md | 23 +++++++++++------------ mkdocs.yml | 1 + 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/features/sharding/resharding/move.md b/docs/features/sharding/resharding/move.md index 8664964a..0f203a01 100644 --- a/docs/features/sharding/resharding/move.md +++ b/docs/features/sharding/resharding/move.md @@ -219,14 +219,14 @@ The replication delay between the two database clusters is measured in bytes. Wh ### Shard failure during copy -If a shard goes down mid-copy, PgDog retries that table with exponential backoff - up to **5 attempts**, starting at **1 second** and doubling each time. +If a shard goes down mid-copy, PgDog retries that table with exponential backoff. By default, it makes up to **5 attempts**, starting at **1 second** and doubling each time. -The two cases behave differently: - -- **Destination shard down** — PgDog opens a fresh connection on the next attempt. -- **Source shard down** — the `TEMPORARY` slot for that table is re-created from scratch. No cleanup needed; `TEMPORARY` slots vanish when the connection closes (unlike the [permanent slot](#replication-slot) created at the start of the overall copy). +| Situation | Behavior | +|-|-| +| Destination shard down | New connection opened on the next attempt. | +| Source shard down | `TEMPORARY` replication slot re-created from scratch. It's cleaned up automatically when the connection closes before starting a new attempt. 
| -To change the defaults: +To change the defaults, configure [`resharding_copy_retry_max_attempts`](../../../configuration/pgdog.toml/general.md#resharding_copy_retry_max_attempts) and [`resharding_copy_retry_min_delay`](../../../configuration/pgdog.toml/general.md#resharding_copy_retry_min_delay) in `pgdog.toml`: ```toml [general] @@ -234,11 +234,10 @@ resharding_copy_retry_max_attempts = 5 # per-table retry attempts resharding_copy_retry_min_delay = 1000 # base backoff in ms; doubles each attempt, max 32× ``` -### Rows remaining in destination after failure +### Primary key violations when retrying after copy failure -Most drops are clean: table copies run inside an implicit transaction, so Postgres rolls back on disconnect and the destination is left empty. +In rare cases, if the connection drops after `COPY` commits but before PgDog records the table as done, some rows may remain in the destination. The next retry will hit primary key violations. -The awkward case: the connection drops *after* `COPY` commits on some or all shards but before PgDog records it as done. The rows survive, and the next attempt hits primary key violations immediately. PgDog catches this and tells you exactly what to run: @@ -247,7 +246,7 @@ data sync for "public"."orders" failed with rows remaining in destination; truncate manually before retrying: TRUNCATE "public"."orders_new"; ``` -Run the `TRUNCATE` on the destination (the table name is literal — copy it verbatim), then re-run `COPY_DATA`. +Run the `TRUNCATE` on the destination (copy the command above verbatim, and make sure to run it on the destination, not the source), then re-run `COPY_DATA`. ### Binary format mismatch @@ -260,9 +259,9 @@ resharding_copy_format = "text" See [Integer primary keys](#integer-primary-keys) for the other common reason to use text format. 
-### Replication slot not dropped after abort +### Replication slot not cleaned up after stop or crash -Aborting `COPY_DATA` without restarting it leaves the **permanent** replication slot sitting on the source. Postgres won't recycle WAL until it's gone. Drop it manually: +If `COPY_DATA` is stopped via `STOP_TASK` or PgDog exits unexpectedly, the **permanent** replication slot will remain on the source. Postgres will not recycle WAL until it is dropped. Clean it up manually: ```postgresql SELECT pg_drop_replication_slot('slot_name'); diff --git a/mkdocs.yml b/mkdocs.yml index 8506e956..4cc6e6dd 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -85,6 +85,7 @@ plugins: 'features/sharding/migrations.md': 'features/sharding/schema_management/migrations.md' 'features/sharding/primary-keys.md': 'features/sharding/sequences.md' 'features/sharding/schema_management/primary_keys.md': 'features/sharding/sequences.md' + 'features/sharding/resharding/hash.md': 'features/sharding/resharding/move.md' 'features/sharding/cross-shard/index.md': 'features/sharding/cross-shard-queries/index.md' extra: