
Synthesizes speech audio from text using the specified provider and voice.

POST /studio/speech

Records an OrcaRequest, estimates token usage/cost, persists the generated audio, and returns media metadata.

Processing steps:

1. Validate inputs (text, provider).
2. Resolve the provider adapter from configured speech groups.
3. Create and persist an OrcaRequest in InProgress state.
4. Estimate prompt/completion tokens and cost via provider-specific estimators; save to request.
5. Invoke synthesis (SynthesizeSpeechAsync), receive base64 audio.
6. Persist audio as OrcaAssetType.Speech and return its metadata.
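
The processing steps above can be sketched roughly as follows. This is an illustrative Python model, not the service's actual implementation: only OrcaRequest, the InProgress state, SynthesizeSpeechAsync (modeled here as `synthesize`), and the Speech asset type come from this page. The `FakeAdapter` class, the `Completed` and `Failed` states, the word-count estimator, and the shape of the returned metadata are all assumptions made for the example.

```python
import base64
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RequestState(Enum):
    IN_PROGRESS = "InProgress"   # named in the documentation
    COMPLETED = "Completed"      # hypothetical terminal state
    FAILED = "Failed"            # hypothetical terminal state


@dataclass
class OrcaRequest:
    provider: str
    text: str
    state: RequestState = RequestState.IN_PROGRESS
    prompt_tokens: int = 0
    completion_tokens: int = 0
    estimated_cost: float = 0.0


@dataclass
class SpeechResult:
    error: Optional[str] = None
    asset: Optional[dict] = None


class FakeAdapter:
    """Hypothetical stand-in for a provider adapter."""

    def estimate(self, text):
        # Crude illustrative estimator: one prompt token per word, zero cost.
        return len(text.split()), 0, 0.0

    def synthesize(self, text, voice):
        # Stands in for SynthesizeSpeechAsync; returns base64-encoded bytes.
        return base64.b64encode(b"RIFF" + text.encode("utf-8")).decode("ascii")


def handle_speech(text, provider, voice, adapters, requests, assets):
    # 1. Validate inputs.
    if not text or not provider:
        return SpeechResult(error="text and provider are required")
    # 2. Resolve the provider adapter.
    adapter = adapters.get(provider)
    if adapter is None:
        return SpeechResult(error=f"unknown provider: {provider}")
    # 3. Create and persist the request in InProgress state.
    request = OrcaRequest(provider=provider, text=text)
    requests.append(request)
    # 4. Estimate tokens and cost, and save them on the request.
    (request.prompt_tokens,
     request.completion_tokens,
     request.estimated_cost) = adapter.estimate(text)
    # 5. Invoke synthesis and receive base64 audio.
    try:
        audio_b64 = adapter.synthesize(text, voice)
    except Exception as exc:
        request.state = RequestState.FAILED
        return SpeechResult(error=str(exc))
    # 6. Persist the audio as a Speech asset and return its metadata.
    audio = base64.b64decode(audio_b64)
    asset = {"id": len(assets) + 1, "type": "Speech", "size_bytes": len(audio)}
    assets.append((asset, audio))
    request.state = RequestState.COMPLETED
    return SpeechResult(asset=asset)
```

In this sketch, failures are reported through the `error` field of the result rather than by raising, which mirrors the convention of returning a normal response that carries an Error field.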

Handled outcomes, including handled failures, always return HTTP 200, so clients should inspect the Error field in the response body rather than rely on the status code alone.
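
Because a 200 status does not by itself indicate success, a client needs to check the response body. The following is a minimal client-side sketch in Python. It assumes the response body is JSON with a top-level `Error` key that is null or absent on success; that shape is an assumption, as is the `AssetId` key used in the usage lines. `SpeechError` and `interpret_speech_response` are hypothetical names introduced for this example.

```python
import json
from typing import Any, Dict


class SpeechError(Exception):
    """Raised when the service reports a failure for a speech request."""


def interpret_speech_response(status: int, body: str) -> Dict[str, Any]:
    # 401 and 403 are the non-200 statuses listed for this endpoint.
    if status in (401, 403):
        raise PermissionError(f"request rejected with HTTP {status}")
    if status != 200:
        raise SpeechError(f"unexpected HTTP status {status}")
    payload = json.loads(body)
    # Assumed shape: a top-level "Error" key, null or absent on success.
    error = payload.get("Error")
    if error:
        raise SpeechError(str(error))
    return payload


# Usage: a 200 response with an empty Error field is treated as success.
metadata = interpret_speech_response(200, '{"Error": null, "AssetId": 7}')
print(metadata["AssetId"])
```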

Request

Responses

200 OK
401
403