TTS, STT, voice cloning, music generation.
TTS: naturalness & latency
STT: accuracy & languages
Voice cloning: consent, quality, limits
Music generation: style control, licensing