Fix ConditionParser to handle newlines before && operator#12038
Fix ConditionParser to handle newlines before && operator#12038blaspat wants to merge 4 commits into
Conversation
Fix for apache#11882 The tokenize() method only treated space as whitespace, so newlines became standalone tokens. When a line break appeared before &&, the operator was parsed as two separate & tokens instead of one &&. Added \n, \r, \t to whitespace handling and properly tokenize && as a single operator token regardless of surrounding whitespace. Signed-off-by: Blasius Patrick <blasius.patrick@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes profile activation condition parsing so formatted multiline conditions can place a line break around the && operator without breaking tokenization.
Changes:
- Treats newline, carriage return, and tab characters as whitespace in
ConditionParser. - Tokenizes
&&as a single logical AND operator when encountered directly. - Adds regression coverage for multiline
&&expressions.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
impl/maven-impl/src/main/java/org/apache/maven/impl/model/profile/ConditionParser.java |
Updates tokenizer whitespace/operator handling for multiline logical AND conditions. |
impl/maven-impl/src/test/java/org/apache/maven/impl/model/profile/ConditionParserTest.java |
Adds regression tests covering && with line breaks before, after, and around the operator. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
gnodet
left a comment
There was a problem hiding this comment.
Thanks for the fix, @blaspat! The root cause analysis is spot-on — whitespace and & delimiter handling in the tokenizer were the issue. Here are a few improvements to make the fix more robust before merging:
- Use
Character.isWhitespace(c)instead of enumerating' ','\n','\r','\t'— cleaner and handles edge cases - Handle
||symmetrically with&&—||has the same tokenization gap (not in the operator char list) - Remove
&from theX=lookahead —&=is not a valid operator in this language
See inline comments for concrete suggestions.
For the remaining arithmetic operators (-, *, /) that also are not handled as delimiter characters, I have filed #12052 as a separate follow-up since those need more careful design (e.g., - in property names).
Claude Code on behalf of Guillaume Nodet
| if ((c == '>' || c == '<' || c == '=' || c == '!' || c == '&') | ||
| && i + 1 < expression.length() | ||
| && expression.charAt(i + 1) == '=') { | ||
| tokens.add(c + "="); | ||
| i++; // Skip the next character | ||
| } else if (c == '&' && i + 1 < expression.length() && expression.charAt(i + 1) == '&') { | ||
| tokens.add("&&"); | ||
| i++; // Skip the next character |
There was a problem hiding this comment.
Two issues:
-
&in theX=group:&was added to the>=/<=/==/!=lookahead, which means&=would be tokenized as a valid two-char operator. But&=is not valid in this language — remove|| c == '&'from this condition. -
||not handled: The||operator has the exact same tokenization gap that&&had. The|character is not in the operator list, so||only works when surrounded by spaces. Handle it symmetrically with&&.
| if ((c == '>' || c == '<' || c == '=' || c == '!' || c == '&') | |
| && i + 1 < expression.length() | |
| && expression.charAt(i + 1) == '=') { | |
| tokens.add(c + "="); | |
| i++; // Skip the next character | |
| } else if (c == '&' && i + 1 < expression.length() && expression.charAt(i + 1) == '&') { | |
| tokens.add("&&"); | |
| i++; // Skip the next character | |
| if ((c == '>' || c == '<' || c == '=' || c == '!') | |
| && i + 1 < expression.length() | |
| && expression.charAt(i + 1) == '=') { | |
| tokens.add(c + "="); | |
| i++; // Skip the next character | |
| } else if (c == '&' && i + 1 < expression.length() && expression.charAt(i + 1) == '&') { | |
| tokens.add("&&"); | |
| i++; // Skip the next character | |
| } else if (c == '|' && i + 1 < expression.length() && expression.charAt(i + 1) == '|') { | |
| tokens.add("||"); | |
| i++; // Skip the next character |
Please also add corresponding test cases for || with newlines, similar to the && tests.
|
|
||
| @Test | ||
| void testNestedPropertyAlias() { | ||
| functions.put("property", args -> { |
There was a problem hiding this comment.
Please also add a test for || with newlines, similar to testAmpersandAmpersandTokenizerMultiline. For example:
@Test
void testPipePipeTokenizerMultiline() {
// || with newlines should work the same as &&
assertTrue((Boolean) parser.parse("${os.arch} == 'amd64' || ${os.name} == 'linux'"));
assertTrue((Boolean) parser.parse("${os.arch} == 'amd64'\n|| ${os.name} == 'linux'"));
assertTrue((Boolean) parser.parse("${os.arch} == 'amd64' ||\n${os.name} == 'linux'"));
assertTrue((Boolean) parser.parse("${os.arch} == 'amd64'\n||\n${os.name} == 'linux'"));
}…ile/ConditionParser.java Co-authored-by: Guillaume Nodet <gnodet@gmail.com>
…ile/ConditionParser.java Co-authored-by: Guillaume Nodet <gnodet@gmail.com>
…ile/ConditionParserTest.java Co-authored-by: Guillaume Nodet <gnodet@gmail.com>
|
@blaspat Thanks for applying the suggestions! Two items still need to be addressed before this can be merged: 1.
Current: if ((c == '>' || c == '<' || c == '=' || c == '!' || c == '&')
&& i + 1 < expression.length()
&& expression.charAt(i + 1) == '=') {Should be: if ((c == '>' || c == '<' || c == '=' || c == '!')
&& i + 1 < expression.length()
&& expression.charAt(i + 1) == '=') {2.
Please add a } else if (c == '&' && i + 1 < expression.length() && expression.charAt(i + 1) == '&') {
tokens.add("&&");
i++; // Skip the next character
} else if (c == '|' && i + 1 < expression.length() && expression.charAt(i + 1) == '|') {
tokens.add("||");
i++; // Skip the next character
} else {3. Add Claude Code on behalf of Guillaume Nodet |
Fix for #11882
The tokenize() method only treated space as whitespace, so newlines became standalone tokens. When a line break appeared before &&, the operator was parsed as two separate & tokens instead of one &&.
Added \n, \r, \t to whitespace handling and properly tokenize && as a single operator token regardless of surrounding whitespace.