Synthesizes speech audio from text using the specified provider, voice, and style; records an `OrcaRequest`, estimates token usage/cost, persists the generated audio, and returns media metadata.

POST /studio/speech

Processing steps:

  1. Validate the inputs (text, provider).

  2. Resolve the provider adapter from the configured speech groups.

  3. Create and persist an `OrcaRequest` in the `InProgress` state.

  4. Estimate prompt/completion tokens and cost via provider-specific estimators, and save the estimates to the request.

  5. Invoke synthesis (`SynthesizeSpeechAsync`) and receive the audio as base64.

  6. Persist the audio as `OrcaAssetType.Speech` and return its metadata.

The endpoint always responds with HTTP 200 for handled outcomes, so clients should inspect the `Error` field of the response body instead of relying on the status code.
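Because handled failures still return HTTP 200, a client has to check the response body explicitly. The following is a minimal sketch of such a client, not the service's official SDK. The base URL, the JSON field names in the request body (`text`, `provider`, `voice`, `style`), and the helper names are assumptions made for illustration; only the `/studio/speech` path and the `Error` field come from the description above.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Hypothetical base URL; replace with the real host of your deployment.
BASE_URL = "https://api.example.com"


def build_speech_request(
    text: str,
    provider: str,
    voice: Optional[str] = None,
    style: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body. Field names are assumed, not confirmed by the docs."""
    # Mirror step 1 on the client side: text and provider are required.
    if not text or not text.strip():
        raise ValueError("text must be non-empty")
    if not provider or not provider.strip():
        raise ValueError("provider must be non-empty")
    body: Dict[str, Any] = {"text": text, "provider": provider}
    if voice:
        body["voice"] = voice
    if style:
        body["style"] = style
    return body


def unwrap_response(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Raise if the body carries a non-empty Error field, else return it."""
    error = payload.get("Error")
    if error:
        raise RuntimeError(f"speech synthesis failed: {error}")
    return payload


def synthesize_speech(
    text: str,
    provider: str,
    voice: Optional[str] = None,
    style: Optional[str] = None,
    base_url: str = BASE_URL,
) -> Dict[str, Any]:
    """POST to /studio/speech and return the media metadata on success."""
    body = build_speech_request(text, provider, voice, style)
    request = urllib.request.Request(
        f"{base_url}/studio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An HTTP 200 here does not imply success; the body must be inspected.
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return unwrap_response(payload)
```

Keeping `unwrap_response` separate from the network call makes the error check easy to unit-test and to reuse if other endpoints follow the same always-200 convention.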

Request

Responses

OK