Skip to content

Introduce custom constraints to extend the smithy validation rules#1171

Open
Gravewalker666 wants to merge 7 commits intosmithy-lang:mainfrom
Gravewalker666:1089-allow-custom-validation-rules
Open

Introduce custom constraints to extend the smithy validation rules#1171
Gravewalker666 wants to merge 7 commits intosmithy-lang:mainfrom
Gravewalker666:1089-allow-custom-validation-rules

Conversation

@Gravewalker666
Copy link
Copy Markdown

@Gravewalker666 Gravewalker666 commented May 7, 2026

Issue #, if available:
#1089

Description of changes:

  • Introduced CustomConstraint interface for extensible validation rules via ServiceLoader
  • Added Validator.CustomConstraintProvider to discover and register custom constraints as schema extensions
  • Integrated applyCustomConstraints() into Validator.ShapeValidator to apply custom rules after standard validation
  • Added ValidationError.CustomValidationFailure record for custom constraint violations

Benchmark numbers:

  Benchmark                                      Baseline (ns/op)    Feature (ns/op)    Improvement
  StringValidationBench.stringComposite          53.19               51.27              +3.6%                            
  StringValidationBench.stringEnumOnly           11.14               11.18              -0.4%                          
  StringValidationBench.stringLengthOnly         10.28               10.92              -6.2%
  StringValidationBench.stringNoValidation        9.90                9.90               0.0%
  StringValidationBench.stringPatternOnly        51.39               52.44              -2.0%
  ValidatorBench.person                          15.74               15.76              -0.1%
  ValidatorBench.pojoWithValidatedCollections   150.44              146.93              +2.3%
  ValidatorBench.unvalidatedPojo                 15.05               12.91             +14.2%
  ValidatorBench.validatedPojo                   17.83               14.01             +21.4%

A record of benchmark runs I've done along the way is documented here (non-complete): https://pastebin.com/GTuJKkQN

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Gravewalker666 Gravewalker666 force-pushed the 1089-allow-custom-validation-rules branch from b597a16 to 00b7074 Compare May 7, 2026 11:58
@Gravewalker666 Gravewalker666 force-pushed the 1089-allow-custom-validation-rules branch from 00b7074 to 3163773 Compare May 7, 2026 12:08
@Gravewalker666 Gravewalker666 force-pushed the 1089-allow-custom-validation-rules branch from 363d10c to 21a9497 Compare May 7, 2026 12:37
@Gravewalker666 Gravewalker666 force-pushed the 1089-allow-custom-validation-rules branch from 21a9497 to f815d7a Compare May 7, 2026 12:46
@Gravewalker666 Gravewalker666 changed the title 1089 allow custom validation rules Introduce custom constraints to extend the smithy validation rules May 7, 2026
@Gravewalker666 Gravewalker666 marked this pull request as ready for review May 7, 2026 13:52
}
}

record CustomValidationFailure(String path, Schema schema, String message) implements ValidationError {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this needs to be defined here, ValidationError is an open public interface that custom integrations can implement and define their own errors in their packages? Feels like redundant to me

Comment on lines +519 to +520
for (var error: validationErrors) {
addError(error);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might worth adding a new addErrors method as adding multiple objects should be faster than adding then one by one. Also, having it this way might lead to the situation where, say, you have maxAllowedErrors=1 but custom validator returned 2 errors (and these are the only errors during the validation) - we will add only first but the second one has been generated already so maybe we should still add it? It feels like maxAllowedErrors is more like a circuit breaker (to stop further validation) but if the validation is done already, why not just add these errors?

}

private void applyCustomConstraints(Schema schema, Object value) {
if (!CustomConstraintProvider.HAS_CUSTOM_CONSTRAINTS) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels a bit odd to me - we have a static field but we operate on a instance basis. Should either this to be related to an instance of CustomConstraintProvider or other methods (e.g. provide, key) to be static in the class?

return;
}
for (var rule: customConstraints) {
var validationErrors = rule.validate(schema, value, createPath());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createPath should be created up front and reused. Or better still, we could give the validator a Supplier instead of eagerly computing the path.


checkListLength(schema, count);
}
applyCustomConstraints(schema, state);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, state passed to writeList and writeMap don't have to be the list/map being written (e.g., it could be streaming, or just canned content). The "state" value just acts as a carrier to avoid needing a closure that references up-values. That makes custom list and map validation tricky.

}
for (var rule: customConstraints) {
var validationErrors = rule.validate(schema, value, createPath());
for (var error: validationErrors) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (var error: validationErrors) {
for (var error : validationErrors) {


@Override
public List<CustomConstraint> provide(Schema schema) {
var rulesForThisSchema = new ArrayList<CustomConstraint>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems heavy. It adds a list to every schema we validate. I think instead of this, we could have CustomConstraint return an EnumSet<ShapeType> appliesTo(). Then we'd have a List<CustomConstraint>[] on the class, sized to the number of shape types. Then check if there's an entry per/shape being validated. Then there's no SchemaExtension and no need to punch an array/schema onto all schemas.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we'd have a List[] on the class, sized to the number of shape types

@mtdowling does it mean that there would be only one custom constraint possible per shape type? If so, which approach you'd choose to select which one should apply when there are multiple options? I'm curious if in this case the overall approach should be to pass "constraint per shape type" map to the validator constructor to avoid ambiguity when using SPI...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do away with the idea of per/shape custom validators, and instead have per/type custom validators (which there could be many per/type). Maybe with support for a wildcard which matches any type. If you need shape-specific validators (which would likely be rare), then that filtering can be done in validate by inspecting the Schema.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants