Synthesizes speech audio from text using the specified provider, voice, and style; records an `OrcaRequest`, estimates token usage/cost, persists the generated audio, and returns media metadata.

POST /studio/speech
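
As a rough illustration of calling this route, the sketch below builds (but does not send) a JSON POST request using Python's standard library. The JSON field names (`text`, `provider`, `voice`, `style`), the base URL, and the provider name are assumptions made for the example; the actual request contract is not shown here.

```python
import json
import urllib.request


def build_speech_request(base_url, text, provider, voice=None, style=None):
    """Build, without sending, a POST request for /studio/speech.

    The payload field names are assumed for illustration only.
    """
    payload = {"text": text, "provider": provider}
    # Include optional fields only when the caller supplies them.
    if voice is not None:
        payload["voice"] = voice
    if style is not None:
        payload["style"] = style
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/studio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and provider name.
req = build_speech_request(
    "https://orca.example", "Hello there", "example-provider", voice="narrator"
)
```

The request could then be sent with `urllib.request.urlopen(req)` against a real deployment.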

Processing steps:

1. Validate inputs (text, provider).
2. Resolve the provider adapter from the configured speech groups.
3. Create and persist an `OrcaRequest` in the `InProgress` state.
4. Estimate prompt/completion tokens and cost via provider-specific estimators; save them to the request.
5. Invoke synthesis (`SynthesizeSpeechAsync`) and receive base64-encoded audio.
6. Persist the audio as `OrcaAssetType.Speech` and return its metadata.
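
The steps above can be sketched as a small in-memory pipeline. This is a minimal illustration, not the service's implementation: `SpeechRequest`, `EchoAdapter`, the dictionary-based stores, the word-count token estimate, and the shape of the returned metadata are all hypothetical stand-ins for `OrcaRequest`, a real provider adapter, and real persistence.

```python
import base64
import uuid
from dataclasses import dataclass


@dataclass
class SpeechRequest:
    """Hypothetical stand-in for OrcaRequest."""
    id: str
    state: str = "InProgress"
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cost: float = 0.0


class EchoAdapter:
    """Hypothetical provider adapter: the 'audio' is just the text bytes."""

    def estimate(self, text):
        # Placeholder estimator: one prompt token per whitespace-separated word.
        words = len(text.split())
        return words, 0, words * 0.001

    def synthesize(self, text, voice, style):
        # A real adapter would call the provider; here we return base64 text.
        return base64.b64encode(text.encode("utf-8")).decode("ascii")


def synthesize_speech(text, provider, adapters, requests, assets,
                      voice=None, style=None):
    # 1. Validate inputs.
    if not text or not text.strip():
        return {"error": "text is required", "media": None}
    if not provider:
        return {"error": "provider is required", "media": None}
    # 2. Resolve the provider adapter.
    adapter = adapters.get(provider)
    if adapter is None:
        return {"error": f"unknown provider: {provider}", "media": None}
    # 3. Create and persist the request in the InProgress state.
    req = SpeechRequest(id=str(uuid.uuid4()))
    requests[req.id] = req
    # 4. Estimate tokens and cost, and save them to the request.
    req.prompt_tokens, req.completion_tokens, req.cost = adapter.estimate(text)
    # 5. Invoke synthesis and receive base64 audio.
    audio = base64.b64decode(adapter.synthesize(text, voice, style))
    # 6. Persist the audio as a Speech asset and return its metadata.
    #    Later request state transitions are not described in the source,
    #    so none are modeled here.
    asset_id = str(uuid.uuid4())
    assets[asset_id] = {"type": "Speech", "bytes": audio}
    media = {"asset_id": asset_id, "type": "Speech",
             "size_bytes": len(audio), "request_id": req.id}
    return {"error": None, "media": media}


adapters, requests, assets = {"echo": EchoAdapter()}, {}, {}
result = synthesize_speech("hello world", "echo", adapters, requests, assets)
```

Validation and adapter resolution happen before the request record is created, so in this sketch a rejected call leaves no request behind; whether the real service behaves the same way is not stated.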

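The endpoint responds with HTTP 200 for every handled outcome, including handled failures, so clients should inspect the `Error` field of the response rather than rely on the status code alone.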
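
Because a 200 status does not by itself indicate success, a client might wrap response handling along these lines. This is a sketch under assumptions: the response is taken to be a JSON object, the error is looked up under `Error` with a fallback to `error` in case the serializer uses camelCase, and `SpeechError` and the `Id` field in the usage line are names invented for the example.

```python
import json


class SpeechError(Exception):
    """Hypothetical client-side exception for failed synthesis calls."""


def read_speech_response(status, body):
    """Return the parsed payload, or raise SpeechError on any failure."""
    # Unhandled server failures may still surface as non-200 statuses.
    if status != 200:
        raise SpeechError(f"unexpected HTTP status {status}")
    payload = json.loads(body)
    # A handled failure arrives with status 200 and a populated error field.
    error = payload.get("Error") or payload.get("error")
    if error:
        raise SpeechError(str(error))
    return payload


ok = read_speech_response(200, '{"Error": null, "Id": "a1"}')
```

Raising on a populated error field keeps callers from mistaking a handled failure for usable media metadata.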
