BridgeJS: Optimize string encoding for JS-to-Swift crossings #748
Closed
krodak wants to merge 1 commit into
Two techniques applied to all JS-to-Swift string paths:
1. LRU encoding cache for parameter and return paths - avoids re-encoding repeated strings via a `Map<string, Uint8Array>` with 256-entry LRU eviction.
2. Direct string retain + `encodeInto()` for stack ABI paths (arrays, structs, enums, dictionaries) - skips the intermediate `Uint8Array` allocation entirely by retaining the JS string and encoding directly into the WASM linear memory buffer.

`_swift_js_init_memory` now returns the actual byte count written, which the stack ABI path needs since it passes a worst-case buffer size (`string.length * 3`) rather than the exact UTF-8 byte count.

Benchmarks (100k iterations, Node.js):
StringRoundtrip/takeString: -21%
ArrayRoundtrip/takeStringArray: -35%
ArrayRoundtrip/roundtripStringArray: -29%
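The LRU encoding cache from technique 1 can be sketched roughly as follows. This is a minimal illustration of the delete-and-reinsert idea on a JS `Map`, not the code generated by this PR: the names `CACHE_LIMIT`, `encodeCache`, and `cachedEncode` are hypothetical (the PR refers to its helper as `_cachedEncode()`, whose body is not shown here).

```javascript
// Hypothetical sketch of a 256-entry LRU encoding cache.
// All identifiers below are illustrative, not taken from the PR's glue code.
const CACHE_LIMIT = 256;
const textEncoder = new TextEncoder();
const encodeCache = new Map(); // string -> Uint8Array, least recently used first

function cachedEncode(str) {
  const hit = encodeCache.get(str);
  if (hit !== undefined) {
    // Map iterates in insertion order, so delete-and-reinsert moves
    // the entry to the most-recently-used end.
    encodeCache.delete(str);
    encodeCache.set(str, hit);
    return hit;
  }
  const bytes = textEncoder.encode(str);
  if (encodeCache.size >= CACHE_LIMIT) {
    // The first key in iteration order is the least recently used one.
    const oldest = encodeCache.keys().next().value;
    encodeCache.delete(oldest);
  }
  encodeCache.set(str, bytes);
  return bytes;
}
```

One consequence of this design is that a cache hit returns the same `Uint8Array` instance as earlier calls, so callers would need to treat the bytes as read-only.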
krodak (Member, Author) commented:
Opened against wrong repo - meant for PassiveLogic fork for now 🙏🏻
Overview
String parameters crossing from JS to Swift go through `TextEncoder.encode()` + object heap `retain`/`release` on every call. This is measurably slow for repeated strings and for string arrays. Two independent optimizations target the two string-encoding paths in the generated JS glue, without touching BridgeType or the codegen structure.

Related: #677, #700 (different approach - adds `JSString` as a new BridgeType; this PR avoids that)

What changed
1. LRU encoding cache for parameter and return paths
When JS passes a string to an exported Swift function (or returns a string from an imported JS function), the glue calls `textEncoder.encode(string)` to get a `Uint8Array`, then retains it in the object heap. The same string encoded 100k times means 100k allocations.

A 256-entry LRU cache (`Map<string, Uint8Array>`) now sits in front of `textEncoder.encode()`. Repeated strings get a cache hit and skip encoding. The cache uses JS `Map` insertion-order semantics for O(1) LRU eviction - delete-and-reinsert on hit, delete-first on eviction.

Affected fragments:
`stringLowerParameter`, `stringLowerReturn`

2. Direct string retain + `encodeInto()` for the stack ABI

Arrays, struct fields, enum payloads, and dictionary entries use the stack ABI, which encodes each string element independently. Instead of allocating a `Uint8Array` per element, the JS glue now retains the JS string itself in the object heap and passes `string.length * 3` as the buffer capacity (worst-case UTF-8).

When Swift calls `_swift_js_init_memory`, the JS handler detects the string via `typeof` and writes UTF-8 directly into the WASM linear memory buffer using `TextEncoder.encodeInto()`. It returns the actual byte count written, which `String(unsafeUninitializedCapacity:)` uses for the final string length.

Affected fragments: `stackLowerFragment` for `.string` / `.rawValueEnum(_, .string)`

One ABI change: `_swift_js_init_memory` returns `Int32` instead of `Void`. The return value is the byte count actually written - needed because the stack ABI passes a worst-case capacity, not the exact byte count.

Benchmarks
100k iterations, Node.js v22, 15-run average:
| Benchmark | Change |
| --- | --- |
| `StringRoundtrip/takeString` | -21% |
| `ArrayRoundtrip/takeStringArray` | -35% |
| `ArrayRoundtrip/roundtripStringArray` | -29% |
| `StringRoundtrip/makeString` | unaffected |
| `ArrayRoundtrip/makeStringArray` | unaffected |

The `make*` benchmarks (Swift-to-JS direction) are unaffected - those paths already use direct memory reads via `decodeString(ptr, len)`.

Independence of the two techniques
The two optimizations are independent. If the `_swift_js_init_memory` return type change is a concern, technique 2 can be dropped and the stack ABI reverted to use `_cachedEncode()` (same as the parameter path). That still gives the ~21% improvement on `takeString` with zero Swift-side changes.

Files changed (excluding snapshots)
`Sources/JavaScriptKit/BridgeJSIntrinsics.swift` - `_swift_js_init_memory` returns `Int32`; `bridgeJSLiftParameter` uses the returned count
`Plugins/BridgeJS/Sources/BridgeJSLink/BridgeJSLink.swift` - LRU cache preamble; `_swift_js_init_memory` handler with string detection
`Plugins/BridgeJS/Sources/BridgeJSLink/JSGlueGen.swift` - `stringLowerParameter` and `stringLowerReturn` use cache; `stackLowerFragment` uses direct retain; reserved variable names for cache
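
For reviewers who want to see why technique 2 needs a returned byte count, here is a small sketch of encoding a JS string straight into a linear memory buffer with a worst-case capacity. It is an illustration under assumptions, not the PR's `_swift_js_init_memory` handler: the `WebAssembly.Memory` instance stands in for the module's real memory, and `worstCaseUtf8Capacity` and `encodeIntoMemory` are hypothetical names.

```javascript
// Hypothetical stand-in for the module's WASM linear memory (one 64 KiB page).
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();

// Illustrative helper: every UTF-16 code unit needs at most 3 UTF-8 bytes
// (a surrogate pair is 2 code units and encodes to 4 bytes).
function worstCaseUtf8Capacity(str) {
  return str.length * 3;
}

// Illustrative helper: encode `str` directly into memory at `ptr`,
// with no intermediate Uint8Array holding the encoded bytes.
// Returns the number of bytes actually written.
function encodeIntoMemory(str, ptr, capacity) {
  const dest = new Uint8Array(memory.buffer, ptr, capacity);
  const { written } = encoder.encodeInto(str, dest);
  return written;
}

const sample = "h\u00e9llo \u20ac"; // 7 UTF-16 code units
const capacity = worstCaseUtf8Capacity(sample); // 21 bytes reserved
const written = encodeIntoMemory(sample, 16, capacity); // 10 bytes written
```

The reserved capacity (21) overshoots the real UTF-8 length (10), which is why the receiving side has to be told the written count rather than assuming the capacity is the string length.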