# STT: Google Speech-to-Text v2 / Chirp Evaluation (follow-up)
Status: planned (not implemented). Related: gateway `connectorVoiceGoogle.py` uses Speech v1 `SpeechClient` only.
## Goal
Benchmark STT v2 (e.g. Chirp / Chirp 2) for `de-DE` against the current v1 `latest_short` / `latest_long` models on:
- Latency (time-to-first-token, final latency)
- WER / subjective quality in meeting + coaching scenarios
- Cost and quota
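For the WER criterion, a minimal sketch of a word-level error rate is shown here. The function name `wer` and the normalization (lowercasing and whitespace splitting, no punctuation stripping) are assumptions for illustration; the benchmark may want a stricter normalizer for German text.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                         # deletion
                curr[j - 1] + 1,                     # insertion
                prev[j - 1] + (ref_word != hyp_word) # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Running the same function over v1 and v2 transcripts of identical fixtures keeps the comparison consistent, whatever normalization is finally chosen.
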
## Steps
1. Add optional v2 client path (`google.cloud.speech_v2` or REST) behind a feature flag.
2. Run A/B on CommCoach streaming and Teamsbot batch paths with identical audio fixtures.
3. Document the decision in `wiki/b-reference/`, then either remove the flag or make v2 the default.
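Step 1 could be sketched roughly as follows. This is not the gateway's code: the flag name `STT_USE_V2`, the function names, and the `project_id` / `location` parameters are hypothetical. It covers only the unary `recognize` call and assumes audio with a self-describing header (e.g. WAV or FLAC); the streaming and batch paths would need their own variants. Which regions serve `chirp_2` for `de-DE` should be checked before choosing `location`.

```python
import os

# Hypothetical flag name; the plan only says "behind a feature flag".
STT_V2_FLAG = "STT_USE_V2"


def use_stt_v2(env=None) -> bool:
    """Return True when the (hypothetical) flag enables the v2 path."""
    env = os.environ if env is None else env
    return env.get(STT_V2_FLAG, "").strip().lower() in {"1", "true", "yes"}


def recognize_v1(audio: bytes, language: str = "de-DE",
                 model: str = "latest_long") -> str:
    # Imported lazily so the module loads without the client installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    config = speech.RecognitionConfig(language_code=language, model=model)
    response = client.recognize(
        config=config, audio=speech.RecognitionAudio(content=audio)
    )
    return " ".join(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )


def recognize_v2(audio: bytes, project_id: str, location: str,
                 language: str = "de-DE", model: str = "chirp_2") -> str:
    from google.api_core.client_options import ClientOptions
    from google.cloud.speech_v2 import SpeechClient
    from google.cloud.speech_v2.types import cloud_speech

    # Chirp models are served from regional endpoints.
    client = SpeechClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-speech.googleapis.com"
        )
    )
    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=[language],
        model=model,
    )
    request = cloud_speech.RecognizeRequest(
        # "_" selects the implicit default recognizer.
        recognizer=f"projects/{project_id}/locations/{location}/recognizers/_",
        config=config,
        content=audio,
    )
    response = client.recognize(request=request)
    return " ".join(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )


def transcribe(audio: bytes, project_id: str, location: str) -> str:
    """Dispatch to v1 or v2 depending on the flag."""
    if use_stt_v2():
        return recognize_v2(audio, project_id, location)
    return recognize_v1(audio)
```

Keeping the flag check in one dispatcher means the A/B run in step 2 only has to toggle an environment variable between passes over the same fixtures.
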
## Notes
- Streaming and batch config differ between v1 and v2; keep `VoiceObjects` as the single facade.
- Billing hooks (`calculateSttCostCHF`) must use the measured audio duration (see streaming `result_end_time`), not heuristics based on compressed byte size.
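
The billing note could be illustrated with a small sketch like this one. The signature of `calculateSttCostCHF` is not given here, so `billable_seconds` and `stt_cost` are hypothetical helpers, and `price_per_minute` is a placeholder rather than a real rate. The sketch assumes a single stream and that the client exposes `result_end_time` as a `datetime.timedelta`, as the proto-plus based Python clients do for Duration fields.

```python
from datetime import timedelta
from typing import Iterable


def billable_seconds(result_end_times: Iterable[timedelta]) -> float:
    """Audio duration covered by one stream: the largest end offset seen."""
    # Each offset is relative to the start of the stream, so the maximum
    # is the amount of audio the recognizer actually processed.
    return max((t.total_seconds() for t in result_end_times), default=0.0)


def stt_cost(seconds: float, price_per_minute: float) -> float:
    """Cost for the measured duration; the price is supplied by the caller."""
    if seconds < 0 or price_per_minute < 0:
        raise ValueError("seconds and price_per_minute must be non-negative")
    return seconds / 60.0 * price_per_minute


# Usage: collect result.result_end_time from each streaming result.
offsets = [timedelta(seconds=1.5), timedelta(seconds=4.2), timedelta(seconds=3)]
duration = billable_seconds(offsets)
cost = stt_cost(duration, price_per_minute=0.02)  # placeholder rate
```

If a session is split across several streams, the per-stream durations would need to be summed before pricing.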