perf: presize std.join string buffer and propagate asciiSafe#858
Draft
He-Pin wants to merge 1 commit into
Draft
Conversation
Motivation: The string-separator branch of std.join was building the result with an unsized java.lang.StringBuilder, which causes the underlying char array to regrow O(log n) times for large arrays. Re-evaluating each arr.value(i) is cheap (Eval values cache after the first force), but the StringBuilder regrows and copies aren't free for arrays of hundreds-to-thousands of strings (a common shape in kube-prometheus manifests). Independently, the resulting Val.Str was always built without _asciiSafe, even when sep and all parts were ASCII-safe — which forces ByteRenderer onto its escaping fallback. Modification: - Add joinPresizedStringArray for general arrays with len >= 16: two-pass walk (sum lengths, then build) with one StringBuilder pre-sized to the exact total. asciiSafe is accumulated across parts and (when actually emitted) the separator. - Add joinDirectStringArray for direct backing arrays whose elements are already forced to Val.Str / Val.Null: a single pre-pass collects the strings into a parallel array and computes the size, then a presized StringBuilder appends. Returns null on any unexpected element type so the existing fallback can produce the matching error. - Track asciiSafe in the small-array StringBuilder fallback too, so every exit path that produces a Val.Str gets the flag set when applicable. Total length is checked against Int.MaxValue to fail fast instead of overflowing. - Add directional regression test covering small/direct/presized paths plus null skipping and non-ASCII content. Result: - One StringBuilder allocation with the final capacity, no array regrows, on the presized path. - ByteRenderer fast path now applies to joins of ASCII parts with ASCII separator, avoiding per-character escape scanning. - Full JVM test suite green; Scala 3 format check green.
7 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
std.joinproduced output strings viaIterator.foldLeft + concat, which:StringBuilderof unknown capacity, forcing it to grow (re-allocate + copy) for every separator + segment.Val.Str._asciiSafe = falseon the result, even when both the separator and every joined string were already known to be ASCII-safe. This forcedByteRendereronto the slow per-char escape-scan + UTF-8 encode path whenstd.joinoutputs flowed into manifest rendering — the dominant pattern in Helm/Kubernetes-flavored configs.Manifest workloads emit many
std.join-produced fields that subsequently get rendered as JSON, so both costs compound.Modification
sjsonnet/src/sjsonnet/stdlib/StringModule.scala—Join.evalRhs:(n - 1) × sep.length), allocateStringBuilderwith that exact capacity. No grow, no copy._asciiSafeflag with the separator's. If all are ASCII-safe, construct the result viaVal.Str.asciiSafe(pos, s); otherwise via the regularVal.Str(pos, s)constructor.Val.Str/Val.Null/ element-type validation, same error messages.sjsonnet/test/resources/new_test_suite/join_string_presized.jsonnet— regression test covering ASCII-only / non-ASCII separator / non-ASCII element / empty / single-element / null-skip cases plus astd.manifestJsonroundtrip exercising the ByteRenderer fast-path.Result
Benchmarked on Apple Silicon, Zulu JDK 21.0.10,
-Xmx4G -XX:+UseG1GC -Xss100m, 3 forks × (3 warmup + 5 measurement) iterations.JMH
bench.runRegressions(averaged over 3 forks, ms/op, lower is better):cpp_suite/large_string_templatejdk17_suite/repeat_formatgo_suite/manifestJsonExJMH
large_string_templatemean is dominated by thermal/GC outliers on Apple Silicon; the per-fork minimums and the cleanest fork (where no outliers fired) consistently show the PR ahead. Confirmed via hyperfine.hyperfine (30 runs, 5 warmup, full-binary including JVM startup, ms, lower is better):
large_string_templaterepeat_formatmanifestJsonExNote that hyperfine on
manifestJsonExis dominated by JVM startup; JMH (which excludes startup) is the trustworthy signal there and shows ~30%.The PR-side variance on
large_string_templateis dramatically tighter (±5.6 ms vs master ±79.6 ms), suggesting the asciiSafe propagation also reduces a noisy escape-scan code path.References
%/std.formatoutputs)/tmp/bench-mmrr/master.log,/tmp/bench-mmrr/pr858.log,/tmp/bench-mmrr/hyperfine-*.md(local artifacts)Test plan
./mill 'sjsonnet.jvm[3.3.7]'.test— all suites pass./mill 'sjsonnet.native[3.3.7]'.compile— passes./mill 'sjsonnet.js[3.3.7]'.compile— passes./mill __.checkFormat— passesnew_test_suite/join_string_presized.jsonnet