[server] Support configurable time partition format for auto-partitioned tables #3200

Open
wattt3 wants to merge 2 commits into apache:main from wattt3:auto-partition-time-format

@wattt3 wattt3 commented Apr 25, 2026

Purpose

Linked issue: close #3191

Brief change log

  • Add 'table.auto-partition.time-format' to override the partition value format derived from the time unit
  • Validate pattern syntax at table creation
    • The zero-padded, lexicographically orderable contract is the user's responsibility (retention compares partition names via TreeMap).
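
To illustrate why the ordering contract matters, here is a minimal sketch (not code from this PR) that formats two consecutive days and lists the resulting names in TreeMap order. The class and method names are hypothetical; only the use of a TreeMap keyed by partition name follows the note above.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

// Hypothetical demo class; not part of Fluss.
final class LexOrderSketch {

    // Formats `days` consecutive dates and returns the names in TreeMap (string) order.
    static List<String> sortedNames(String pattern, LocalDate start, int days) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
        TreeMap<String, LocalDate> byName = new TreeMap<>();
        for (int i = 0; i < days; i++) {
            LocalDate day = start.plusDays(i);
            byName.put(fmt.format(day), day);
        }
        return new ArrayList<>(byName.keySet());
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2026, 1, 9);
        // Zero-padded fields: string order matches time order.
        System.out.println(sortedNames("yyyy-MM-dd", start, 2)); // [2026-01-09, 2026-01-10]
        // Unpadded fields: Jan 10 sorts before Jan 9.
        System.out.println(sortedNames("yyyy-M-d", start, 2)); // [2026-1-10, 2026-1-9]
    }
}
```

With the unpadded pattern, any logic that treats the first TreeMap key as the oldest partition would pick the wrong one, which is why the format is expected to be zero-padded.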

Tests

API and Format

Documentation

wattt3 commented Apr 25, 2026

@luoyuxia PTAL, thanks.

Copilot AI left a comment

Pull request overview

Adds a new table option to customize the string format used for time-based auto-partition values, propagating that option through partition generation/retention logic and documenting the new behavior.

Changes:

  • Introduce table.auto-partition.time-format (no default; derived from time-unit when unset) and wire it into auto-partition creation and retention.
  • Validate custom time format pattern syntax during table descriptor validation, with new unit tests.
  • Update docs and existing tests/call sites for updated partition utility APIs.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| website/docs/table-design/data-distribution/partitioning.md | Documents the new table.auto-partition.time-format option for auto-partitioned tables. |
| website/docs/engine-flink/options.md | Exposes the new option in Flink engine table options documentation. |
| fluss-server/src/test/java/org/apache/fluss/server/utils/TableDescriptorValidationTest.java | Adds coverage for accepting/rejecting custom time-format values at table creation validation time. |
| fluss-server/src/test/java/org/apache/fluss/server/coordinator/TableManagerITCase.java | Updates test helper call site for new generateAutoPartition(...) signature. |
| fluss-server/src/main/java/org/apache/fluss/server/utils/TableDescriptorValidation.java | Adds table-create-time validation for time-format pattern syntax. |
| fluss-server/src/main/java/org/apache/fluss/server/coordinator/AutoPartitionManager.java | Passes time-format through to partition pre-creation and retention cutoff calculation. |
| fluss-common/src/test/java/org/apache/fluss/utils/PartitionUtilsTest.java | Adds tests for partition name generation with a custom time format. |
| fluss-common/src/main/java/org/apache/fluss/utils/PartitionUtils.java | Extends partition time generation/validation to accept an optional custom time-format and uses Locale.ROOT. |
| fluss-common/src/main/java/org/apache/fluss/utils/AutoPartitionStrategy.java | Adds timeFormat to the resolved auto-partition strategy from table options. |
| fluss-common/src/main/java/org/apache/fluss/config/ConfigOptions.java | Defines the new table.auto-partition.time-format config option and its description. |

Comment on lines +484 to +501
String timeFormat = autoPartition.timeFormat();
if (timeFormat != null) {
    if (timeFormat.trim().isEmpty()) {
        throw new InvalidConfigException(
                String.format(
                        "'%s' must not be empty.",
                        ConfigOptions.TABLE_AUTO_PARTITION_TIME_FORMAT.key()));
    }
    try {
        DateTimeFormatter.ofPattern(timeFormat, Locale.ROOT);
    } catch (IllegalArgumentException e) {
        throw new InvalidConfigException(
                String.format(
                        "Invalid time format '%s' for '%s': %s",
                        timeFormat,
                        ConfigOptions.TABLE_AUTO_PARTITION_TIME_FORMAT.key(),
                        e.getMessage()));
    }

Copilot AI Apr 30, 2026

table.auto-partition.time-format is only validated for DateTimeFormatter pattern syntax. It can still generate partition values that are invalid for Fluss/ZooKeeper paths (e.g., /, ., spaces) or collide for a given time-unit (e.g., time-unit=HOUR with format yyyy-MM-dd produces identical values for different hours). This will lead to auto-partition creation failures or incorrect retention behavior at runtime. Consider validating at table creation that a sample formatted value passes the same partition-value rules (TablePath.detectInvalidName/validatePrefix) and that formatting differs between now and now + 1 <time-unit> (and ideally preserves lexicographic order).
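
A rough sketch of the "now versus now + 1 unit" check suggested here might look like the following. The class and method names are hypothetical and this is not the validation in the PR; it only assumes that the time unit can be mapped to a java.time TemporalUnit. A single sample is a smoke test rather than a proof of ordering, and the sample should sit away from a boundary of the next coarser unit (for HOUR, not the last hour of a day).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalUnit;
import java.util.Locale;

// Hypothetical helper; not part of Fluss.
final class TimeFormatCollisionCheck {

    // Returns true if the value for `sample` sorts strictly before the value for
    // `sample + 1 unit`. Equal values mean two partitions would collide.
    static boolean distinguishesConsecutiveUnits(
            String pattern, TemporalUnit unit, LocalDateTime sample) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
        String current = fmt.format(sample);
        String next = fmt.format(sample.plus(1, unit));
        return current.compareTo(next) < 0;
    }

    public static void main(String[] args) {
        LocalDateTime sample = LocalDateTime.of(2026, 1, 9, 9, 0);
        // HOUR with a day-level format: 09:00 and 10:00 produce the same value.
        System.out.println(
                distinguishesConsecutiveUnits("yyyy-MM-dd", ChronoUnit.HOURS, sample)); // false
        // HOUR with an hour-level format: values differ and stay ordered.
        System.out.println(
                distinguishesConsecutiveUnits("yyyyMMddHH", ChronoUnit.HOURS, sample)); // true
    }
}
```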

| table.auto-partition.enabled | Boolean | no | false | Whether enable auto partition for the table. Disable by default. When auto partition is enabled, the partitions of the table will be created automatically. |
| table.auto-partition.key | String | no | (none) | This configuration defines the time-based partition key to be used for auto-partitioning when a table is partitioned with multiple keys. Auto-partitioning utilizes a time-based partition key to handle partitions automatically, including creating new ones and removing outdated ones, by comparing the time value of the partition with the current system time. In the case of a table using multiple partition keys (such as a composite partitioning strategy), this feature determines which key should serve as the primary time dimension for making auto-partitioning decisions. And If the table has only one partition key, this config is not necessary. Otherwise, it must be specified. |
| table.auto-partition.time-unit | ENUM | no | DAY | The time granularity for auto created partitions. The default value is 'DAY'. Valid values are 'HOUR', 'DAY', 'MONTH', 'QUARTER', 'YEAR'. If the value is 'HOUR', the partition format for auto created is yyyyMMddHH. If the value is 'DAY', the partition format for auto created is yyyyMMdd. If the value is 'MONTH', the partition format for auto created is yyyyMM. If the value is 'QUARTER', the partition format for auto created is yyyyQ. If the value is 'YEAR', the partition format for auto created is yyyy. |
| table.auto-partition.time-format | String | no | (derived from unit) | The time format used for auto-created partition values. If not set, the format is derived from `table.auto-partition.time-unit` (e.g. `yyyyMMdd` for DAY). When set, this value overrides the format derived from the time unit, while the partition granularity still follows `table.auto-partition.time-unit`. A custom format must use zero-padded numeric fields covering at least the unit's precision so that partition values sort by time as strings (e.g. `yyyy-MM-dd` for DAY). |

Copilot AI Apr 30, 2026

The table.auto-partition.time-format docs don't mention the character constraints for partition values (Fluss only allows ASCII alphanumerics, _, and -). Formats like yyyy/MM/dd or those producing spaces will be accepted by the formatter but will create invalid partition names (and may fail when creating ZooKeeper nodes). Consider documenting the allowed output character set (and/or explicitly warning against /, ., spaces, etc.).

Suggested change
| table.auto-partition.time-format | String | no | (derived from unit) | The time format used for auto-created partition values. If not set, the format is derived from `table.auto-partition.time-unit` (e.g. `yyyyMMdd` for DAY). When set, this value overrides the format derived from the time unit, while the partition granularity still follows `table.auto-partition.time-unit`. A custom format must use zero-padded numeric fields covering at least the unit's precision so that partition values sort by time as strings (e.g. `yyyy-MM-dd` for DAY). |
| table.auto-partition.time-format | String | no | (derived from unit) | The time format used for auto-created partition values. If not set, the format is derived from `table.auto-partition.time-unit` (e.g. `yyyyMMdd` for DAY). When set, this value overrides the format derived from the time unit, while the partition granularity still follows `table.auto-partition.time-unit`. A custom format must use zero-padded numeric fields covering at least the unit's precision so that partition values sort by time as strings (e.g. `yyyy-MM-dd` for DAY). The formatted partition value must contain only ASCII letters and digits, `_`, or `-`. Do not use formats that produce `/`, `.`, spaces, `:`, or other characters outside this set, because the formatter may accept them but Fluss partition names do not. |
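
For illustration, the character constraint could be checked against a sample formatted value roughly as below. This is a sketch with hypothetical names: the regular expression mirrors the allowed set described in this review (ASCII letters, digits, `_`, `-`) and is an assumption, not the actual Fluss check (the review mentions TablePath.detectInvalidName for that).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper; not part of Fluss.
final class PartitionValueCharsetCheck {

    // Assumed allowed character set, taken from the review comment.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_-]+");

    // Formats a sample timestamp and checks every output character against ALLOWED.
    static boolean producesAllowedValue(String pattern, LocalDateTime sample) {
        String value = DateTimeFormatter.ofPattern(pattern, Locale.ROOT).format(sample);
        return ALLOWED.matcher(value).matches();
    }

    public static void main(String[] args) {
        LocalDateTime sample = LocalDateTime.of(2026, 1, 9, 9, 0);
        System.out.println(producesAllowedValue("yyyy-MM-dd", sample)); // true
        System.out.println(producesAllowedValue("yyyy/MM/dd", sample)); // false
        System.out.println(producesAllowedValue("yyyy.MM.dd", sample)); // false
    }
}
```

Checking the formatted output rather than the pattern string avoids having to reason about quoted literals and pattern letters in the format itself.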

| table.auto-partition.enabled | Boolean | false | Whether enable auto partition for the table. Disable by default. When auto partition is enabled, the partitions of the table will be created automatically. |
| table.auto-partition.key | String | (None) | This configuration defines the time-based partition key to be used for auto-partitioning when a table is partitioned with multiple keys. Auto-partitioning utilizes a time-based partition key to handle partitions automatically, including creating new ones and removing outdated ones, by comparing the time value of the partition with the current system time. In the case of a table using multiple partition keys (such as a composite partitioning strategy), this feature determines which key should serve as the primary time dimension for making auto-partitioning decisions. And If the table has only one partition key, this config is not necessary. Otherwise, it must be specified. |
| table.auto-partition.time-unit | ENUM | DAY | The time granularity for auto created partitions. The default value is `DAY`. Valid values are `HOUR`, `DAY`, `MONTH`, `QUARTER`, `YEAR`. If the value is `HOUR`, the partition format for auto created is yyyyMMddHH. If the value is `DAY`, the partition format for auto created is yyyyMMdd. If the value is `MONTH`, the partition format for auto created is yyyyMM. If the value is `QUARTER`, the partition format for auto created is yyyyQ. If the value is `YEAR`, the partition format for auto created is yyyy. |
| table.auto-partition.time-format | String | (derived from unit) | The time format used for auto-created partition values. If not set, the format is derived from `table.auto-partition.time-unit` (e.g. `yyyyMMdd` for DAY). When set, this value overrides the format derived from the time unit, while the partition granularity still follows `table.auto-partition.time-unit`. A custom format must use zero-padded numeric fields covering at least the unit's precision so that partition values sort by time as strings (e.g. `yyyy-MM-dd` for DAY). |

Copilot AI Apr 30, 2026

The table.auto-partition.time-format description doesn’t call out Fluss partition value character restrictions (only [A-Za-z0-9_-] are allowed). Without this, users may choose formats like yyyy/MM/dd that work as DateTimeFormatter patterns but will produce invalid partition names and fail at runtime. Consider adding a brief note about the allowed output characters / disallowed separators.

Suggested change
| table.auto-partition.time-format | String | (derived from unit) | The time format used for auto-created partition values. If not set, the format is derived from `table.auto-partition.time-unit` (e.g. `yyyyMMdd` for DAY). When set, this value overrides the format derived from the time unit, while the partition granularity still follows `table.auto-partition.time-unit`. A custom format must use zero-padded numeric fields covering at least the unit's precision so that partition values sort by time as strings (e.g. `yyyy-MM-dd` for DAY). |
| table.auto-partition.time-format | String | (derived from unit) | The time format used for auto-created partition values. If not set, the format is derived from `table.auto-partition.time-unit` (e.g. `yyyyMMdd` for DAY). When set, this value overrides the format derived from the time unit, while the partition granularity still follows `table.auto-partition.time-unit`. A custom format must use zero-padded numeric fields covering at least the unit's precision so that partition values sort by time as strings (e.g. `yyyy-MM-dd` for DAY). The formatted partition value must contain only characters allowed by Fluss partition values: `[A-Za-z0-9_-]`. Do not use separators or other characters outside this set (for example, `yyyy/MM/dd` is invalid because `/` is not allowed). |
