Commit 5b8e487

DavidQ committed
Refine text2speach-V2 voice filtering, character presets, and header consistency - PR_26130_010-text2speach-v2-language-filtering
1 parent aa8b746 commit 5b8e487

9 files changed

Lines changed: 719 additions & 130 deletions

docs/dev/reports/PR_26130_010-text2speach-v2-language-filtering.md

Lines changed: 86 additions & 36 deletions
Original file line number | Diff line number | Diff line change
@@ -2,29 +2,57 @@
22

33
## Purpose
44

5-
Make `text2speach-V2` match how browser SpeechSynthesis behaves: Language is selected first, and Voice only shows voices whose `SpeechSynthesisVoice.lang` matches that language.
5+
Make `text2speach-V2` follow the intended speech setup stack:
6+
7+
```text
8+
Gender -> Language -> Voice -> Voice Age -> Character preset -> SSML-like preset -> editable tuning controls
9+
```
10+
11+
Gender filters available Languages, Language filters Voice, Voice Age shapes pitch/rate, Character applies editable performance defaults, and SSML-like remains a separate delivery treatment.
12+
13+
The duplicate Announcer/Announcement concept was removed: `announcer` was a Character preset and `announcement` was an SSML-like preset, which made the two dropdowns feel like they were doing the same job. Character now owns persona/performance choices, while SSML-like owns text treatment.
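
The Gender -> Language -> Voice part of the stack described above can be pictured as a small cascade of pure filters. The following is a minimal sketch, not the tool's actual code: the `gender` field on each voice record, the sample voices, and the function names are all hypothetical, and falling back to the first matching language is an assumption made only for illustration.

```javascript
// Hypothetical voice records: `gender` is an assumed, pre-classified field,
// not part of the standard SpeechSynthesisVoice interface.
const sampleVoices = [
  { name: "Voice A", lang: "en-US", gender: "female" },
  { name: "Voice B", lang: "en-GB", gender: "male" },
  { name: "Voice C", lang: "es-ES", gender: "male" },
  { name: "Voice D", lang: "es-US", gender: "female" },
  { name: "Voice E", lang: "en-US", gender: "unknown" },
];

// "all" keeps every voice; any other value keeps only exact matches,
// so unknown-gender voices never leak into a specific bucket.
function matchesGender(voice, gender) {
  return gender === "all" || voice.gender === gender;
}

// Languages offered for the selected Gender, de-duplicated and sorted.
function languagesForGender(voices, gender) {
  const langs = voices.filter((v) => matchesGender(v, gender)).map((v) => v.lang);
  return [...new Set(langs)].sort((a, b) => a.localeCompare(b));
}

// Voices offered for the selected Gender and Language, sorted by name.
function voicesForSelection(voices, gender, lang) {
  return voices
    .filter((v) => matchesGender(v, gender) && v.lang === lang)
    .sort((a, b) => a.name.localeCompare(b.name));
}

// Resolve the whole cascade. An empty voice list means Speak should be disabled.
// Assumption: an invalid language falls back to the first matching language.
function resolveSelection(voices, gender, lang) {
  const languages = languagesForGender(voices, gender);
  const language = languages.includes(lang) ? lang : languages[0] ?? "";
  const matches = language ? voicesForSelection(voices, gender, language) : [];
  return { languages, language, voices: matches, canSpeak: matches.length > 0 };
}
```

With these sample records, `resolveSelection(sampleVoices, "neutral", "en-US")` yields no languages, no voices, and `canSpeak: false`, which mirrors the empty Neutral state described in this report.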
614

715
## Scope
816

9-
Changed only `text2speach-V2` language/voice UI, control filtering, schema/default ordering, and Workspace Manager V2 Playwright coverage.
17+
Changed only `text2speach-V2` option ordering, dropdown sorting, Gender/Language/Voice filtering, Voice Age shaping, voice metadata classification, Character preset defaults, the text2speach-V2 schema enum, the shared text-to-speech engine voice option metadata, and Workspace Manager V2 Playwright coverage.
1018

1119
No `start_of_day` files were changed.
1220

1321
`docs/dev/codex_commands.md` and `docs/dev/commit_comment.txt` were updated locally as required and remain ignored so they cannot be committed.
1422

1523
## Implementation Summary
1624

17-
- Moved Language above Voice in the Speech Options form.
18-
- Moved `language` before `voice` in the text2speach-V2 required-field/default/schema order.
19-
- Filtered Voice options to `speechSynthesis.getVoices()` entries whose `lang` matches the selected Language.
20-
- Added visible voice match details under Voice, including match count and voice names.
21-
- Auto-selects the first matching voice when a language change invalidates the previous selected voice.
22-
- Clears Voice, disables Speak, and logs a visible failure when the selected Language has no matching voices.
23-
- Preserved the existing queue payload shape and required queue item fields.
25+
- Stacked Speech Options in the requested order: Gender, Language, Voice, Voice Age, Character preset, SSML-like preset.
26+
- Added Voice Age dropdown with `Any`, `Adult`, `Child`, `Elder`, and `Teen`; `Any` is pinned first.
27+
- Voice Age no longer filters Language or Voice; it shapes pitch/rate before Character preset tuning.
28+
- Added shared voice option age metadata from the SpeechSynthesis voice model when present.
29+
- Added `voiceAge` to default queue data and the workspace-embeddable speech queue schema.
30+
- Added Voice Age shaping defaults that adjust pitch/rate for `Child`, `Teen`, `Adult`, and `Elder` while leaving `Any` neutral.
31+
- Rebuilds Language from voices matching the selected Gender.
32+
- Rebuilds Voice from voices matching the selected Gender and Language.
33+
- Shows explicit no-language/no-voice empty states when a selected Gender has no matching voices.
34+
- Sorted dropdown choices alphabetically while pinning requested defaults at the top: `Any`, `All`, `Manual`, and `Normal`.
35+
- Sorted runtime Language and Voice dropdowns by visible labels after SpeechSynthesis voices load.
36+
- Preserved Gender with `All`, `Male`, `Female`, and `Neutral`; it remains runtime-only because SpeechSynthesis has no standard gender utterance field.
37+
- Classifies explicit Neutral voices only when voice metadata/name says neutral, non-binary, or androgynous.
38+
- From the current browser voice list supplied by the user, there are no truly Neutral voices; generic Google voices are not treated as Neutral.
39+
- Classifies explicit male voices next, treats the `es-ES` Spanish Spain browser voice as Male, then treats female/known Google browser voices as Female.
40+
- Leaves unknown-gender voices out of Gender-specific buckets instead of treating unknown as Neutral.
41+
- Character presets now contain `manual`, `alert`, `calm`, `dramatic`, `narrator`, and `robot`.
42+
- SSML-like presets now contain `normal`, `slow`, and `whisper-ish`.
43+
- Character presets apply editable defaults:
44+
- `manual`: neutral baseline
45+
- `alert`: faster attention-getting baseline
46+
- `calm`: slower steady baseline
47+
- `dramatic`: brighter performance baseline
48+
- `narrator`: neutral narration baseline
49+
- `robot`: flatter, lower synthetic baseline
50+
- User slider/SSML changes after selecting a Character preset are preserved as manual tuning.
51+
- Preserved existing queue payload shape and required queue item fields.
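
The interaction between Voice Age shaping, Character preset defaults, and later manual tuning listed above could look roughly like the sketch below. All numeric values, the table names, and `resolveTuning` are hypothetical placeholders (the real defaults live in the tool's source), and combining age and character multiplicatively is an assumption. The clamping ranges follow the documented `SpeechSynthesisUtterance` limits: rate 0.1 to 10, pitch 0 to 2, volume 0 to 1.

```javascript
// Hypothetical age multipliers; "any" is neutral by design.
const AGE_SHAPING = {
  any: { pitch: 1.0, rate: 1.0 },
  adult: { pitch: 0.95, rate: 1.0 },
  child: { pitch: 1.4, rate: 1.1 },
  elder: { pitch: 0.85, rate: 0.9 },
  teen: { pitch: 1.2, rate: 1.05 },
};

// Hypothetical Character preset baselines.
const CHARACTER_DEFAULTS = {
  manual: { rate: 1.0, pitch: 1.0, volume: 1.0 },
  alert: { rate: 1.25, pitch: 1.1, volume: 1.0 },
  calm: { rate: 0.85, pitch: 0.95, volume: 0.9 },
  dramatic: { rate: 0.95, pitch: 1.5, volume: 1.0 },
  narrator: { rate: 1.0, pitch: 1.0, volume: 1.0 },
  robot: { rate: 0.9, pitch: 0.6, volume: 1.0 },
};

const clamp = (value, min, max) => Math.min(max, Math.max(min, value));

// Combine the Character baseline with age shaping, clamp to the valid
// utterance ranges, then let explicit user overrides win last so manual
// tuning after a preset is preserved.
function resolveTuning(character, voiceAge, overrides = {}) {
  const base = CHARACTER_DEFAULTS[character] ?? CHARACTER_DEFAULTS.manual;
  const age = AGE_SHAPING[voiceAge] ?? AGE_SHAPING.any;
  const shaped = {
    rate: clamp(base.rate * age.rate, 0.1, 10),
    pitch: clamp(base.pitch * age.pitch, 0, 2),
    volume: clamp(base.volume, 0, 1),
  };
  return { ...shaped, ...overrides };
}
```

Because Voice Age only touches pitch and rate in this sketch, switching back to `any` restores the current Character preset's own baseline without affecting the selected Language or Voice.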
2452

2553
## Tool Completion Status
2654

27-
Failing behavior before: Language and Voice were independent, so selecting Language did not change audible voice behavior or the Voice dropdown contents.
55+
Failing/unclear behavior before: Character and SSML-like both had announcement-style choices, there was no Voice Age filter, and age-specific voice selection could not be represented in the UI.
2856

2957
Tool fixed: `text2speach-V2`.
3058

@@ -36,17 +64,26 @@ Playwright impacted: Yes.
3664

3765
Coverage added/updated for:
3866

39-
- language-first control ordering
40-
- dynamic Voice filtering by selected Language
41-
- auto-selection of the first matching voice when the prior Voice becomes invalid
42-
- invalid voice reset behavior when a language has no matching voices
43-
- visible Voice match counts/details
44-
- delayed `voiceschanged` population respecting the selected Language filter
45-
- existing full TTS options, schema-valid default queue, speech actions, and Workspace Manager V2 launch behavior
46-
47-
Expected pass behavior: Language controls the Voice list, Voice never shows non-matching voices, selection adjustments are logged, and Speak is disabled when no matching voice exists.
48-
49-
Expected fail behavior: tests fail if Language is not first, non-matching voices appear, an invalid selected Voice remains active, Voice match details are missing, or delayed voice population ignores the language filter.
67+
- ordered controls: Gender, Language, Voice, Voice Age, Character preset, SSML-like preset
68+
- Voice Age options with `Any` pinned first
69+
- Voice Age pitch/rate shaping for Child and reset back to Any
70+
- Voice Age does not filter or clear selected Language/Voice
71+
- schema-required `voiceAge` on every speech queue item
72+
- sorted dropdown order with `Any`, `All`, `Manual`, and `Normal` pinned first
73+
- alphabetic runtime Language and Voice options
74+
- Character preset options: `manual`, `alert`, `calm`, `dramatic`, `narrator`, `robot`
75+
- SSML-like preset options: `normal`, `slow`, `whisper-ish`
76+
- schema enum matching the Character and SSML-like option split
77+
- Gender-filtered Language and Voice behavior
78+
- corrected voice gender buckets: explicit Neutral metadata appears under Neutral, unknown voices do not appear under Neutral, explicit male voices remain Male, `es-ES` appears under Male, Google browser voices without male markers move to Female, `es-US` remains Female, and empty Neutral disables Speak
79+
- Character preset default application to rate, pitch, volume, and SSML
80+
- Manual preset reset to neutral defaults
81+
- user adjustment after preset application
82+
- existing speech queue, speech actions, delayed voice loading, and Workspace Manager V2 launch behavior
83+
84+
Expected pass behavior: the top controls appear in the requested order, dropdowns are sorted with `Any`, `All`, `Manual`, and `Normal` pinned first, Gender filters Language, Language filters Voice, Voice Age adjusts pitch/rate without clearing Language/Voice, `es-ES` is Male, `es-US` remains Female, explicit Neutral metadata appears under Neutral, unknown voices are not mislabeled as Neutral, Character applies editable defaults, user tuning after a preset is reflected in the summary/speak action, and Character no longer duplicates SSML-like announcement treatment.
85+
86+
Expected fail behavior: tests fail if the order regresses, Voice Age disappears, age shaping stops changing pitch/rate, Voice Age clears or filters Language/Voice, dropdown sorting breaks, explicit Neutral metadata does not populate Neutral, unknown voices are mislabeled as Neutral, `es-ES` is missing from Male, `es-US` moves out of Female unexpectedly, Male/Female filtering breaks, Manual is missing, Character no longer applies defaults, user edits are overwritten unexpectedly, or voice filtering breaks.
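
The "sorted with `Any`, `All`, `Manual`, and `Normal` pinned first" rule that these tests guard can be sketched as a small helper. This is an illustrative sketch only: the `{ value, label }` option shape and the `sortOptions` name are assumptions, not the tool's confirmed API.

```javascript
// Values that stay at the top of their dropdown regardless of label order.
const PINNED_VALUES = new Set(["any", "all", "manual", "normal"]);

// Return a new array: pinned options first (in their original order),
// then the remaining options sorted by visible label, case-insensitively.
function sortOptions(options) {
  const pinned = options.filter((o) => PINNED_VALUES.has(o.value));
  const rest = options
    .filter((o) => !PINNED_VALUES.has(o.value))
    .sort((a, b) => a.label.localeCompare(b.label, undefined, { sensitivity: "base" }));
  return [...pinned, ...rest];
}

// Example input in arbitrary order.
const characterOptions = [
  { value: "robot", label: "Robot" },
  { value: "narrator", label: "Narrator" },
  { value: "manual", label: "Manual" },
  { value: "dramatic", label: "Dramatic" },
  { value: "calm", label: "Calm" },
  { value: "alert", label: "Alert" },
];

const sortedCharacterLabels = sortOptions(characterOptions).map((o) => o.label);
```

`filter` returns a fresh array, so the in-place `sort` never mutates the caller's option list.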
5087

5188
## Validation
5289

@@ -59,19 +96,21 @@ npm run test:workspace-v2
5996
Result:
6097

6198
```text
62-
26 passed
99+
28 passed
63100
```
64101

65-
Additional checks:
102+
Additional checks passed:
66103

67104
```text
68105
node --check src/engine/audio/TextToSpeechDefaults.js
106+
node --check src/engine/audio/TextToSpeechEngine.js
69107
node --check tools/text2speach-V2/js/controls/SpeechOptionsControl.js
70108
node --check tools/text2speach-V2/js/TextToSpeechToolApp.js
71109
node --check tools/text2speach-V2/js/bootstrap.js
72110
node --check tests/playwright/tools/WorkspaceManagerV2.spec.mjs
73111
node -e "JSON.parse(require('node:fs').readFileSync('tools/schemas/tools/text2speach-V2.schema.json','utf8'));"
74-
git diff --check
112+
rg -n -P "<script(?![^>]*\bsrc=)|<style|\son[a-zA-Z]+=" tools/text2speach-V2/index.html
113+
git diff --check HEAD -- .
75114
```
76115

77116
The workspace-v2 Playwright run also generated advisory V8 coverage reports:
@@ -81,41 +120,52 @@ The workspace-v2 Playwright run also generated advisory V8 coverage reports:
81120

82121
## Full Samples Smoke Test
83122

84-
Skipped. The full samples smoke test is intentionally out of scope because this PR is limited to text2speach-V2 language/voice filtering behavior and targeted Workspace Manager V2 coverage, not broad sample runtime behavior.
123+
Skipped. The full samples smoke test is intentionally out of scope because this change is limited to text2speach-V2 Gender/Language/Voice filtering, Voice Age shaping, preset behavior, and targeted Workspace Manager V2 coverage, not broad sample runtime behavior.
85124

86125
## ZIP Artifact
87126

88127
Repo-structured delta ZIP:
89128

90129
```text
91-
tmp/PR_26130_010-text2speach-v2-language-filtering_delta.zip
130+
tmp/PR_26130_010-text2speach-v2-age-before-character_delta.zip
92131
```
93132

94133
## Manual Validation Steps
95134

96135
1. Open `tools/text2speach-V2/index.html`.
97-
2. Confirm Language appears above Voice in Speech Options.
98-
3. Confirm the default `en-US` language shows only matching `en-US` voices and the visible Voice details line reports match count/name details.
99-
4. Change Language to `en-GB`; Voice should auto-select the first matching `en-GB` voice and the status log should report the adjustment.
100-
5. Change Language to a locale with no available voices; Voice should clear, Speak should disable, and the status log should explain that no matching voice exists.
101-
6. Change back to a language with matching voices and confirm Speak becomes available again.
102-
103-
Expected outcome: Voice options are always language-filtered, selection changes are visible/logged, and no non-matching or hidden fallback voice is used.
136+
2. Confirm Speech Options are stacked as Gender, Language, Voice, Voice Age, Character preset, SSML-like preset.
137+
3. Confirm Voice Age shows `Any` first, then the remaining values alphabetically.
138+
4. Confirm Gender shows `All` first, then the remaining values alphabetically.
139+
5. Confirm Character preset shows `Manual` first, then `Alert`, `Calm`, `Dramatic`, `Narrator`, and `Robot`.
140+
6. Confirm SSML-like preset shows `Normal` first, then `Slow` and `Whisper-ish`.
141+
7. Confirm Queue mode, Language, and Voice options are alphabetic by visible label.
142+
8. Select `Child` Voice Age and confirm pitch/rate change while Language, Voice, and Speak remain available.
143+
9. Select `Any` Voice Age to restore neutral pitch/rate for the current Character preset.
144+
10. Select `Male` and confirm Language includes `es-ES` plus languages with explicit male voices.
145+
11. Select `es-ES` and confirm the Voice dropdown shows `Google espanol`.
146+
12. Select `Female` and confirm `es-US` remains available while `es-ES` is not listed.
147+
13. Select `Neutral`; with the current available voice list, it should show no Neutral voice languages, no Neutral voices, and disabled Speak.
148+
14. In the mocked explicit-neutral Playwright path, confirm Neutral shows only the voice whose metadata is `gender: "neutral"`.
149+
15. In the mocked explicit-age Playwright path, confirm Child shows only the voice whose metadata is `age: "child"`.
150+
16. Select Language and confirm Voice narrows to matching voices.
151+
17. Select `Calm`, `Dramatic`, `Alert`, and `Robot`; Rate, Pitch, Volume, and SSML should change to each preset's defaults.
152+
18. Select `Manual`; Rate, Pitch, Volume, and SSML should return to neutral defaults.
153+
19. Select a Character preset, then adjust sliders/SSML manually and confirm the summary reflects the manual adjustments.
154+
155+
Expected outcome: the setup flow reads top-down, Character is a persona/performance preset, SSML-like is a separate delivery treatment, Gender filtering is explicit, Voice Age changes pitch/rate without changing selected voice, no pasted browser voice is mislabeled Neutral, and the user remains free to tune after selecting a preset.
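
The gender buckets exercised in steps 10 through 14 follow a fixed precedence: explicit neutral, then explicit male, then the `es-ES` special case, then female or known Google browser voices, with everything else left unknown. A rough sketch of that precedence follows. The `gender` metadata field and the `classifyVoiceGender` name are hypothetical (the standard `SpeechSynthesisVoice` exposes only `name`, `lang`, `default`, `localService`, and `voiceURI`), and the name-matching patterns are assumptions rather than the tool's actual rules.

```javascript
// Classify one voice into "neutral", "male", "female", or "unknown".
// Order matters: earlier rules take precedence over later ones.
function classifyVoiceGender(voice) {
  const name = (voice.name ?? "").toLowerCase();
  const meta = (voice.gender ?? "").toLowerCase(); // hypothetical metadata field
  const text = `${meta} ${name}`;

  // 1. Explicit neutral markers in metadata or name.
  if (/\b(neutral|non-binary|nonbinary|androgynous)\b/.test(text)) return "neutral";
  // 2. Explicit male markers; \bmale\b does not match inside "female".
  if (/\bmale\b/.test(text)) return "male";
  // 3. Special case from this report: the es-ES browser voice is Male.
  if (voice.lang === "es-ES") return "male";
  // 4. Explicit female markers, or Google browser voices without male markers.
  if (/\bfemale\b/.test(text) || name.startsWith("google ")) return "female";
  // 5. Unknown voices stay out of every Gender-specific bucket.
  return "unknown";
}
```

Returning `"unknown"` instead of defaulting to `"neutral"` is what keeps the Neutral bucket empty for a browser voice list that carries no explicit neutral metadata.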
104156

105157
## Changed Files
106158

107159
- `src/engine/audio/TextToSpeechDefaults.js`
160+
- `src/engine/audio/TextToSpeechEngine.js`
108161
- `tests/playwright/tools/WorkspaceManagerV2.spec.mjs`
109162
- `tools/schemas/tools/text2speach-V2.schema.json`
110163
- `tools/text2speach-V2/index.html`
111164
- `tools/text2speach-V2/js/TextToSpeechToolApp.js`
112165
- `tools/text2speach-V2/js/bootstrap.js`
113166
- `tools/text2speach-V2/js/controls/SpeechOptionsControl.js`
114-
- `tools/text2speach-V2/styles/text2speach-V2.css`
115167
- `docs/dev/reports/PR_26130_010-text2speach-v2-language-filtering.md`
116168
- `docs/dev/reports/codex_review.diff`
117169
- `docs/dev/reports/codex_changed_files.txt`
118-
- `docs/dev/reports/playwright_v8_coverage_report.txt`
119-
- `docs/dev/reports/coverage_changed_js_guardrail.txt`
120170
- `docs/dev/codex_commands.md`
121171
- `docs/dev/commit_comment.txt`
