Commit f244fe3

DavidQ committed
Add Text to Speech engine and backend roadmap items to planned tools section - PR_26130_014-tools-roadmap-tts-engine-planning
1 parent dda34d3 commit f244fe3

13 files changed

Lines changed: 313 additions & 74 deletions

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+# PR_26130_013-text-to-speech-v2-polish
+
+## Purpose
+
+Polish the existing `text2speach-V2` tool so user-facing surfaces read as `Text to Speech V2`, the header follows the shared first-class tool shell pattern, named sentence selection fully hydrates Speech Options, and Rate / Speed is capped to the practical browser `speechSynthesis` range used by the tool.
+
+## Scope
+
+Changed only Text to Speech V2 runtime/defaults/schema/UI, the visible tool registry/workspace navigation labels needed for launch surfaces, and targeted Workspace Manager V2 Playwright coverage.
+
+The internal tool id, schema id, DOM ids, path, folder, and workspace toolState key remain `text2speach-V2`.
+
+No `start_of_day` files were changed.
+
+## Implementation Summary
+
+- Updated visible tool title, card/nav labels, how-to/readme naming, sample copy, status messages, queue data name, and schema title to `Text to Speech V2`.
+- Added the shared `toolShellCommon.css` header frame and kept the tool-specific header class for scoped styling/tests.
+- Renamed the left queue accordion to `Named Sentences`.
+- Added complete named sentence defaults so each queue item carries language, voice, gender, age, presets, queue mode, auto speak, repeat, delay, volume, rate, pitch, and text.
+- Fixed named sentence selection hydration by rebinding the requested queue item voice when the selected item changes, preventing the previous voice selection from leaking into the newly selected item.
+- Added `gender` to required queue item options and to the schema/default queue payload.
+- Capped Rate / Speed at `2` in defaults, schema, UI range setup, engine utterance clamping, and Playwright assertions.
+
+## Playwright Impact
+
+Playwright impacted: Yes.
+
+Coverage added/updated for:
+
+- visible `Text to Speech V2` naming on the tools index, direct tool page, and Workspace Manager V2 launch tile
+- shared header layout class on the Text to Speech V2 intro/header
+- `Named Sentences` accordion label
+- schema-complete named sentence defaults including required `gender`
+- named sentence selection hydrating all Speech Options fields from selected JSON data
+- Rate / Speed slider maximum and schema maximum capped at `2`
+- updated status log text using the visible `Text to Speech V2` label
+
+Expected pass behavior: selecting each named sentence updates text, gender, language, voice, age, presets, queue mode, repeat, delay, volume, rate, pitch, and summary state from the selected JSON item.
+
+Expected fail behavior: tests fail if visible naming regresses to `text2speach-V2`, the shared header frame is missing, Rate / Speed exceeds `2`, queue items are schema-incomplete, or named sentence selection preserves stale option values.
+
+## Validation
+
+Passed:
+
+```text
+npm run test:workspace-v2
+```
+
+Result:
+
+```text
+28 passed
+```
+
+Additional checks passed:
+
+```text
+node --check src/engine/audio/TextToSpeechDefaults.js
+node --check src/engine/audio/TextToSpeechEngine.js
+node --check tools/text2speach-V2/js/TextToSpeechToolApp.js
+node --check tools/text2speach-V2/js/controls/SpeechOptionsControl.js
+node --check tools/workspace-manager-v2/js/services/WorkspaceManagerV2ContextService.js
+node --check tools/toolRegistry.js
+node --check tests/playwright/tools/WorkspaceManagerV2.spec.mjs
+node -e "JSON.parse(require('node:fs').readFileSync('tools/schemas/tools/text2speach-V2.schema.json','utf8'));"
+rg -n -P "<script(?![^>]*\bsrc=)|<style|\son[a-zA-Z]+=" tools/text2speach-V2/index.html tools/text2speach-V2/how_to_use.html
+git diff --check HEAD -- .
+```
+
+The inline HTML restriction scan returned no matches. `git diff --check` reported only the existing Windows line-ending warnings and no whitespace errors.
+
+## Full Samples Smoke Test
+
+Skipped. The full samples smoke test is intentionally out of scope because this PR is limited to Text to Speech V2 polish, schema/default alignment, named sentence hydration, and targeted Workspace Manager V2 Playwright coverage.
+
+## ZIP Artifact
+
+Repo-structured delta ZIP:
+
+```text
+tmp/PR_26130_013-text-to-speech-v2-polish_delta.zip
+```
+
+## Manual Validation Steps
+
+1. Open `tools/text2speach-V2/index.html`.
+2. Confirm the browser title and visible heading read `Text to Speech V2`.
+3. Confirm the intro/header uses the shared first-class tool shell framing and the queue accordion reads `Named Sentences`.
+4. Select `Alert warning`, `Narrator welcome`, and `Hero ready`; Speech Options should update every option from the selected named sentence.
+5. Confirm Rate / Speed cannot exceed `2`.
+
+Expected outcome: user-facing naming is polished, the internal `text2speach-V2` contract is preserved, named sentences hydrate all speech options, and the practical Rate / Speed cap is enforced.
+
+## Changed Files
+
+- `src/engine/audio/TextToSpeechDefaults.js`
+- `src/engine/audio/TextToSpeechEngine.js`
+- `tests/playwright/tools/WorkspaceManagerV2.spec.mjs`
+- `tools/schemas/tools/text2speach-V2.schema.json`
+- `tools/text2speach-V2/README.md`
+- `tools/text2speach-V2/how_to_use.html`
+- `tools/text2speach-V2/index.html`
+- `tools/text2speach-V2/js/TextToSpeechToolApp.js`
+- `tools/text2speach-V2/js/controls/SpeechOptionsControl.js`
+- `tools/text2speach-V2/styles/text2speach-V2.css`
+- `tools/toolRegistry.js`
+- `tools/workspace-manager-v2/js/services/WorkspaceManagerV2ContextService.js`
+- `docs/dev/reports/PR_26130_013-text-to-speech-v2-polish.md`
+- `docs/dev/reports/codex_review.diff`
+- `docs/dev/reports/codex_changed_files.txt`
+- `docs/dev/codex_commands.md`
+- `docs/dev/commit_comment.txt`
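
Review note on the hydration fix described in the Implementation Summary: the diff for `SpeechOptionsControl.js` is not shown on this page, so the following is only a minimal sketch of the idea, not the tool's actual code. It assumes a hypothetical `hydrateSpeechOptions` helper and trimmed-down stand-ins for the defaults and queue. It illustrates why rebuilding the options from defaults plus the selected item, instead of merging into the previous selection, keeps a stale voice from leaking through.

```javascript
// Hypothetical, trimmed-down stand-ins for the real defaults and queue.
const DEFAULT_OPTIONS = Object.freeze({
  gender: "any",
  language: "en-US",
  rate: 1,
  voice: ""
});

const QUEUE = Object.freeze([
  Object.freeze({
    id: "hero-ready",
    gender: "male-preferred",
    language: "en-GB",
    rate: 1.2,
    voice: "mock-google-uk-english-male"
  }),
  // Deliberately omits `voice` and `gender` to show what an incomplete item does.
  Object.freeze({ id: "partial-item", language: "en-US" })
]);

// Leaky approach: merging into the previous selection keeps stale fields.
function mergeIntoPrevious(previousOptions, item) {
  return { ...previousOptions, ...item };
}

// Hydrating approach: always start from defaults, then apply the selected item.
function hydrateSpeechOptions(defaults, item) {
  return { ...defaults, ...item };
}

const first = hydrateSpeechOptions(DEFAULT_OPTIONS, QUEUE[0]);
const leaky = mergeIntoPrevious(first, QUEUE[1]);
const hydrated = hydrateSpeechOptions(DEFAULT_OPTIONS, QUEUE[1]);

console.log(leaky.voice);    // stale voice carried over from "hero-ready"
console.log(hydrated.voice); // falls back to the default (empty) voice
```

Making every default queue item schema-complete, as this PR does, also removes the partial-item case for the shipped defaults; the hydrating approach still matters for user-supplied queue data.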

src/engine/audio/TextToSpeechDefaults.js

Lines changed: 34 additions & 5 deletions
@@ -1,6 +1,7 @@
 const TEXT_TO_SPEECH_TOOL_ID = "text2speach-V2";
 const TEXT_TO_SPEECH_SCHEMA_ID = "tools/schemas/tools/text2speach-V2.schema.json";
 const TEXT_TO_SPEECH_PAYLOAD_SCHEMA = "html-js-gaming.text2speach-V2";
+const TEXT_TO_SPEECH_DISPLAY_NAME = "Text to Speech V2";
 
 const TEXT_TO_SPEECH_LANGUAGE_OPTIONS = Object.freeze([
   Object.freeze({ label: "English (UK)", value: "en-GB" }),
@@ -68,7 +69,7 @@ const TEXT_TO_SPEECH_SSML_LIKE_PRESET_DEFAULTS = Object.freeze({
 const TEXT_TO_SPEECH_RANGE_DEFAULTS = Object.freeze({
   delayBetweenRepeatsMs: Object.freeze({ max: 5000, min: 0, step: 100, value: 0 }),
   pitch: Object.freeze({ max: 2, min: 0, step: 0.1, value: 1 }),
-  rate: Object.freeze({ max: 10, min: 0.1, step: 0.1, value: 1 }),
+  rate: Object.freeze({ max: 2, min: 0.1, step: 0.1, value: 1 }),
   volume: Object.freeze({ max: 1, min: 0, step: 0.01, value: 1 })
 });
 
@@ -85,6 +86,7 @@ const TEXT_TO_SPEECH_QUEUE_ITEM_REQUIRED_FIELDS = Object.freeze([
   "id",
   "name",
   "text",
+  "gender",
   "language",
   "voice",
   "voiceAge",
@@ -103,6 +105,7 @@ const TEXT_TO_SPEECH_DEFAULT_OPTIONS = Object.freeze({
   autoSpeak: false,
   characterPreset: "manual",
   delayBetweenRepeatsMs: TEXT_TO_SPEECH_RANGE_DEFAULTS.delayBetweenRepeatsMs.value,
+  gender: "any",
   language: "en-US",
   pitch: TEXT_TO_SPEECH_CHARACTER_PRESET_DEFAULTS.manual.pitch,
   queueMode: "replace",
@@ -119,33 +122,58 @@ const TEXT_TO_SPEECH_DEFAULT_QUEUE = Object.freeze([
     ...TEXT_TO_SPEECH_DEFAULT_OPTIONS,
     ...TEXT_TO_SPEECH_CHARACTER_PRESET_DEFAULTS.narrator,
     characterPreset: "narrator",
+    gender: "any",
     id: "narrator-welcome",
+    language: "en-US",
     name: "Narrator welcome",
-    text: "Welcome to Toolbox Aid. This is the default text2speach-V2 sample line for previewing narration, prompts, and menu feedback."
+    text: "Welcome to Toolbox Aid. This is the default Text to Speech V2 sample line for previewing narration, prompts, and menu feedback.",
+    voice: "mock-google-us-english",
+    voiceAge: "any"
   }),
   Object.freeze({
     ...TEXT_TO_SPEECH_DEFAULT_OPTIONS,
     ...TEXT_TO_SPEECH_CHARACTER_PRESET_DEFAULTS.dramatic,
+    autoSpeak: false,
     characterPreset: "dramatic",
+    delayBetweenRepeatsMs: 500,
+    gender: "male-preferred",
     id: "hero-ready",
+    language: "en-GB",
     name: "Hero ready",
-    text: "Systems ready. The hero prompt is queued for an upbeat menu confirmation."
+    pitch: 1.4,
+    queueMode: "append",
+    rate: 1.2,
+    repeatCount: 2,
+    text: "Systems ready. The hero prompt is queued for an upbeat menu confirmation.",
+    voice: "mock-google-uk-english-male",
+    voiceAge: "teen"
   }),
   Object.freeze({
     ...TEXT_TO_SPEECH_DEFAULT_OPTIONS,
     ...TEXT_TO_SPEECH_CHARACTER_PRESET_DEFAULTS.alert,
+    autoSpeak: false,
     characterPreset: "alert",
+    delayBetweenRepeatsMs: 1000,
+    gender: "female-preferred",
     id: "alert-warning",
+    language: "en-US",
     name: "Alert warning",
-    text: "Warning. Incoming hazard detected. Please confirm the next action."
+    pitch: 0.9,
+    queueMode: "replace",
+    rate: 1.3,
+    repeatCount: 3,
+    text: "Warning. Incoming hazard detected. Please confirm the next action.",
+    voice: "mock-microsoft-zira",
+    voiceAge: "adult",
+    volume: 0.9
   })
 ]);
 
 const TEXT_TO_SPEECH_DEFAULT_QUEUE_DATA = Object.freeze({
   $schema: TEXT_TO_SPEECH_SCHEMA_ID,
   schema: TEXT_TO_SPEECH_PAYLOAD_SCHEMA,
   version: 1,
-  name: "text2speach-V2 default queue",
+  name: "Text to Speech V2 default queue",
   queue: TEXT_TO_SPEECH_DEFAULT_QUEUE
 });
 
@@ -164,6 +192,7 @@ export {
   TEXT_TO_SPEECH_DEFAULT_QUEUE,
   TEXT_TO_SPEECH_DEFAULT_QUEUE_DATA,
   TEXT_TO_SPEECH_DEFAULTS,
+  TEXT_TO_SPEECH_DISPLAY_NAME,
   TEXT_TO_SPEECH_GENDER_FILTER_OPTIONS,
   TEXT_TO_SPEECH_LANGUAGE_OPTIONS,
   TEXT_TO_SPEECH_PAYLOAD_SCHEMA,

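Review note on the queue defaults: the items rely on object spread order, where later keys override earlier ones, so an explicit `pitch: 1.4` written after the preset spread wins over whatever the preset supplies. The sketch below illustrates that precedence together with a required-field check. The `missingRequiredFields` helper, the preset values, and the shortened field list are all hypothetical; the real preset contents and the full required-field list are not visible in this diff.

```javascript
// Shortened, hypothetical copy of the required-field list (the real one is longer).
const REQUIRED_FIELDS = Object.freeze([
  "id", "name", "text", "gender", "language", "voice", "voiceAge"
]);

// Placeholder values; the real defaults and preset contents are not shown in this diff.
const DEFAULT_OPTIONS = Object.freeze({ gender: "any", language: "en-US", pitch: 1, rate: 1 });
const DRAMATIC_PRESET = Object.freeze({ pitch: 0.8, rate: 0.9 });

// Returns the names of required fields that are absent from the item.
function missingRequiredFields(item, requiredFields) {
  return requiredFields.filter((field) => !(field in item));
}

const heroReady = Object.freeze({
  ...DEFAULT_OPTIONS,
  ...DRAMATIC_PRESET,
  gender: "male-preferred",
  id: "hero-ready",
  language: "en-GB",
  name: "Hero ready",
  pitch: 1.4, // explicit value overrides the preset's pitch
  text: "Systems ready.",
  voice: "mock-google-uk-english-male",
  voiceAge: "teen"
});

// An item shaped like the pre-change defaults: only id, name, and text of its own.
const incomplete = { id: "old-style", name: "Old style", text: "Hello" };

console.log(heroReady.pitch, heroReady.rate);
console.log(missingRequiredFields(heroReady, REQUIRED_FIELDS));
console.log(missingRequiredFields(incomplete, REQUIRED_FIELDS));
```

A check along these lines is one way the "schema-complete" Playwright assertion could be expressed; the spec file's actual assertions are not shown here.
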
src/engine/audio/TextToSpeechEngine.js

Lines changed: 15 additions & 5 deletions
@@ -1,4 +1,8 @@
-import { TEXT_TO_SPEECH_DEFAULTS } from "./TextToSpeechDefaults.js";
+import {
+  TEXT_TO_SPEECH_DEFAULTS,
+  TEXT_TO_SPEECH_DISPLAY_NAME,
+  TEXT_TO_SPEECH_RANGE_DEFAULTS
+} from "./TextToSpeechDefaults.js";
 
 function finiteNumber(value, fallback) {
   const number = Number(value);
@@ -119,18 +123,22 @@ class TextToSpeechEngine {
 
     const normalizedText = String(text || "").trim();
     if (!normalizedText) {
-      return { message: "text2speach-V2 text is required before speaking.", ok: false };
+      return { message: `${TEXT_TO_SPEECH_DISPLAY_NAME} text is required before speaking.`, ok: false };
     }
 
     const selectedVoice = this.voiceForValue(voice);
     if (!selectedVoice) {
-      return { message: `text2speach-V2 voice is required before speaking: ${voice || "(none selected)"}.`, ok: false };
+      return { message: `${TEXT_TO_SPEECH_DISPLAY_NAME} voice is required before speaking: ${voice || "(none selected)"}.`, ok: false };
     }
 
     const utterance = new this.Utterance(normalizedText);
     utterance.lang = String(language || selectedVoice.lang || TEXT_TO_SPEECH_DEFAULTS.language);
     utterance.pitch = boundedNumber(pitch, { fallback: TEXT_TO_SPEECH_DEFAULTS.pitch, max: 2, min: 0 });
-    utterance.rate = boundedNumber(rate, { fallback: TEXT_TO_SPEECH_DEFAULTS.rate, max: 10, min: 0.1 });
+    utterance.rate = boundedNumber(rate, {
+      fallback: TEXT_TO_SPEECH_DEFAULTS.rate,
+      max: TEXT_TO_SPEECH_RANGE_DEFAULTS.rate.max,
+      min: TEXT_TO_SPEECH_RANGE_DEFAULTS.rate.min
+    });
     utterance.volume = boundedNumber(volume, { fallback: TEXT_TO_SPEECH_DEFAULTS.volume, max: 1, min: 0 });
     utterance.voice = selectedVoice;
     return {
@@ -160,6 +168,7 @@ class TextToSpeechEngine {
     autoSpeak = TEXT_TO_SPEECH_DEFAULTS.autoSpeak,
     characterPreset = TEXT_TO_SPEECH_DEFAULTS.characterPreset,
     delayBetweenRepeatsMs = TEXT_TO_SPEECH_DEFAULTS.delayBetweenRepeatsMs,
+    gender = TEXT_TO_SPEECH_DEFAULTS.gender,
     language = TEXT_TO_SPEECH_DEFAULTS.language,
     pitch = TEXT_TO_SPEECH_DEFAULTS.pitch,
     queueMode = TEXT_TO_SPEECH_DEFAULTS.queueMode,
@@ -180,7 +189,7 @@ class TextToSpeechEngine {
       this.clearScheduledRepeats();
       this.speechSynthesis.cancel();
     } else if (queueMode !== "append") {
-      return { message: `Unsupported text2speach-V2 queueMode: ${queueMode}.`, ok: false };
+      return { message: `Unsupported ${TEXT_TO_SPEECH_DISPLAY_NAME} queueMode: ${queueMode}.`, ok: false };
     }
 
     this.loopCanceled = false;
@@ -215,6 +224,7 @@ class TextToSpeechEngine {
       autoSpeak: autoSpeak === true,
       characterPreset,
       delayBetweenRepeatsMs: boundedNumber(delayBetweenRepeatsMs, { fallback: 0, max: 60000, min: 0 }),
+      gender,
       language: firstUtterance.utterance.lang,
       ok: true,
       pitch: firstUtterance.utterance.pitch,