Question 1

What is speech-to-text and how accurate is Lyssna?

Accepted Answer

Speech-to-text (STT) converts recorded or live audio into written text using AI. Lyssna runs OpenAI’s Whisper model, which routinely reaches 95%+ accuracy on clear English audio and performs strongly across other languages. Noisy calls, heavy accents, and cross-talk all reduce accuracy, so use the playground on this page to test a real clip before committing.
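Accuracy figures for transcription are usually reported as one minus the word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The Python sketch below shows that metric under the assumption that the accuracy figure quoted here is measured this way. The function name and sample sentences are illustrative and not part of Lyssna's product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog today"
hypothesis = "the quick brown fox jumped over the lazy dog today"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.2f}, accuracy: {1 - wer:.0%}")  # WER: 0.10, accuracy: 90%
```

One wrong word in a ten-word sentence already costs ten percentage points, which is why short test clips can look worse (or better) than a long recording would.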

Question 2

Is your speech-to-text tool free to use?

Accepted Answer

Yes. The playground on this page transcribes clips of up to 25 MB without signup. Signed-in accounts receive starter credits and unlock history, longer files, and the full re-voice workflow.

Question 3

Which languages do you support?

Accepted Answer

Whisper covers 90+ languages with automatic detection — including English, Hindi, Tamil, Telugu, Bengali, Marathi, Spanish, French, German, Portuguese, Japanese, Korean, and Chinese. We also handle code-mixed speech (for example, Hinglish) without requiring you to pick a language manually.

Question 4

Can I transcribe video files, not just audio?

Accepted Answer

Yes. Drop in MP4, MOV, WebM, or most other common video containers. We extract the audio track and transcribe it, so there is no need to convert first.

Question 5

Does it identify different speakers?

Accepted Answer

Yes. Our backend returns diarization segments whenever they’re available: speaker labels, start/end timestamps, and per-sentence chunks. That makes it a good fit for interviews, panels, and multi-person meetings.
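To make the shape of diarized output concrete, here is a minimal Python sketch that turns segments into a readable transcript. The `Segment` fields and the sample data are hypothetical, chosen to mirror the speaker labels and start/end timestamps described above. Lyssna's actual response schema may differ.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str   # hypothetical label such as "Speaker 1"
    start: float   # seconds from the start of the audio
    end: float
    text: str


def format_transcript(segments: list[Segment]) -> str:
    """Merge consecutive segments from the same speaker into one line per turn."""
    turns: list[tuple[str, float, list[str]]] = []
    for seg in sorted(segments, key=lambda s: s.start):
        if turns and turns[-1][0] == seg.speaker:
            turns[-1][2].append(seg.text)   # same speaker keeps talking
        else:
            turns.append((seg.speaker, seg.start, [seg.text]))
    lines = []
    for speaker, start, texts in turns:
        minutes, seconds = divmod(int(start), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {speaker}: {' '.join(texts)}")
    return "\n".join(lines)


segments = [
    Segment("Speaker 1", 0.0, 3.2, "Welcome to the show."),
    Segment("Speaker 1", 3.2, 6.0, "Today we talk about captions."),
    Segment("Speaker 2", 6.4, 9.1, "Thanks for having me."),
]
print(format_transcript(segments))
# [00:00] Speaker 1: Welcome to the show. Today we talk about captions.
# [00:06] Speaker 2: Thanks for having me.
```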

Question 6

How long can my audio file be?

Accepted Answer

The free playground accepts files up to 25 MB (roughly 25 minutes at standard podcast quality). Paid plans raise this limit and support background batch uploads for longer episodes.
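The 25-minute figure follows from simple bitrate arithmetic. The sketch below assumes "standard podcast quality" means a constant 128 kbps stream, which is an assumption rather than a figure stated by Lyssna. Lower bitrates fit proportionally more audio into the same 25 MB.

```python
def max_minutes(size_mb: float, bitrate_kbps: float) -> float:
    """Approximate audio duration that fits in size_mb at a constant bitrate."""
    size_bits = size_mb * 1_000_000 * 8          # decimal megabytes to bits
    seconds = size_bits / (bitrate_kbps * 1_000)
    return seconds / 60


for kbps in (64, 128, 192):
    print(f"{kbps} kbps: about {max_minutes(25, kbps):.0f} minutes")
# 64 kbps: about 52 minutes
# 128 kbps: about 26 minutes
# 192 kbps: about 17 minutes
```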

Question 7

Is my audio private?

Accepted Answer

Playground uploads are processed in memory and deleted once the transcript is returned. Signed-in accounts get encrypted storage with a clear-at-any-time button. You own every file.

Question 8

Can I convert the transcript back into a different voice?

Accepted Answer

Yes, and that’s the Lyssna difference. A single click sends your transcript to our TTS tool, where you can re-voice it with ElevenLabs, Inworld, or MiniMax. It is ideal for localization, voice swaps, and creator remixing.

Transcribe audio to text with 95%+ accuracy

Transcribe audio in three simple steps

Upload or record audio

Hit Transcribe

Copy, download, or re-voice

One transcriber. Six creator workflows.

Podcast & interview transcripts

Reels, TikTok & YouTube captions

Voice notes to clean text

Meeting & call transcription

Lectures & study audio

Accessibility & closed captions

Everything about our transcriber