The Speech to Text dashboard lives at /{lang}/speech-to-text. It is a single page with five sections: hero, How It Works, Start Transcribing method cards, supported audio formats, and Pro Tips for best results.

Feature specs

100+ Languages & Dialects

Arabic, English, French, Spanish, and 25+ Arabic dialects.

Real-Time Processing

Sub-300ms latency for live transcription use cases.

Diarization & Gender Detection

Identify speakers, separate voices, and detect gender.

Code-Switching Ready

Arabic-English mixed speech handled natively.

How It Works

A compact stepper strip. Five steps, each with a colored icon tile.

1. Upload or Record

Bring audio in by uploading a file, recording from the mic, or speaking live.

2. Detect Language

Faheem auto-detects the language. You can also override it from the language selector.

3. Faheem AI Engine

The Arabic-first ASR engine transcribes the audio with sub-300ms latency.

4. Review & Edit

Open the transcript editor, click any word to fix it, rename speakers, adjust cue times.

5. Export & Share

Quick-export to PDF, DOCX, or TXT, or copy the plain text.

Start Transcribing

Three method cards in the Start Transcribing grid. Each opens a modal.

Upload File

Drag & drop or browse audio files. Supports MP3, WAV, M4A, AAC, OGG, FLAC, and more.
Highlights: Drag & drop, Batch processing, All formats. Badge: No setup required. Button: + Upload Audio File.

Live Transcription

Speak directly and watch your words appear in real-time. Perfect for meetings, interviews, and notes.
Highlights: Sub-300ms latency, Copy transcript, Language detection. Badge: Instant. Button: + Start Live Transcription.

Record Audio

Record from your microphone, review the audio, and transcribe it, all in one seamless flow.
Highlights: Record + transcribe, Audio playback, Auto-save. Badge: No setup required. Button: + Start Recording.

Upload File and Record Audio both open the Live Transcribe modal in record mode. They share the same backing flow.

Supported Audio Formats

Strip card under the method cards. All popular and professional audio formats supported.
Format    Extensions
MP3       .mp3
WAV       .wav
M4A       .m4a
AAC       .aac
OGG       .ogg
FLAC      .flac
WEBM      .webm
The dashboard also accepts MP4 audio (.mp4) when you go through Upload File.
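
If you want to pre-check files before opening the dashboard, the format list above can be expressed as a small lookup. This is only a sketch: the function name `is_supported` and the `via_upload` flag are made up for illustration, and the list mirrors this page rather than any published Wittify API. The Upload File card mentions "and more", so the dashboard itself may accept extensions this check rejects.

```python
from pathlib import Path

# Extensions taken from the Supported Audio Formats table on this page.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac", ".webm"}

# .mp4 is described as accepted only when going through Upload File.
UPLOAD_ONLY_EXTENSIONS = {".mp4"}


def is_supported(filename: str, via_upload: bool = True) -> bool:
    """Return True if the file extension appears in the documented format list.

    Hypothetical helper: it checks the extension only, not the actual codec.
    """
    ext = Path(filename).suffix.lower()
    if ext in SUPPORTED_EXTENSIONS:
        return True
    return via_upload and ext in UPLOAD_ONLY_EXTENSIONS
```

The comparison is case-insensitive, so `interview.MP3` and `interview.mp3` are treated the same way.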

Pro Tips for best results

Seven numbered tips, two-column grid, each with a cyan badge.

1. Verify detected language

Ensure the correct language is selected for better accuracy.

2. Don't upload sensitive content

Be mindful of shared or confidential information.

3. Review audio before submitting

Check for clear audio quality and minimal background noise.

4. Wait for processing to complete

Larger files may take a few moments to transcribe.

5. Review and edit transcribed text

Manually correct any AI inaccuracies for perfect results.

6. Use supported formats (MP3, MP4 or WAV)

For best compatibility, use high-quality audio formats.

7. Avoid background noise & speak clearly

Clearer input leads to significantly better transcription output.

When you are on /speech-to-text/*, the dashboard shell swaps:
  • System Switcher auto-syncs to Speech to Text (Faheem).
  • The sidebar shows Your Files, the live STT file list.
  • + Create Agent in the topbar becomes + Transcribe Audio, with three options: Upload File (cyan), Live Transcription (green), Record Audio (red).
  • The breadcrumb on the editor page reads Speech to Text (Faheem), followed by the file name.

Recent files

Below the formats strip, the dashboard also surfaces a Latest from the library section, pulled from the same source as the sidebar. Each row clicks through to the Transcript Editor and exposes the 3-dot actions menu (export, pin, rename, delete).

Common questions

Upload File is for audio you already have on disk. Live Transcription is for transcribing while you speak (your browser does the work). Record Audio opens the same modal as Live Transcription. The difference is intent: Live Transcription is meant for note-taking, while Record Audio is meant for capturing a session you’ll review later.
Upload File uses the Faheem engine and is generally more accurate, especially for Arabic and dialects. Live Transcription uses your browser’s built-in speech recognition, which varies by browser. For the most accurate transcript, record audio first, then upload it.
Each file can be up to 5 hours long, with a 500 MB file size limit. If your audio is longer, split it into parts before uploading.
Transcribed files are saved to your File Library and appear in the Your Files sidebar list (which only shows on Speech to Text pages). Each file is private to your Wittify account.
The engine was trained Arabic-first, with strong support for the major Arabic dialects (Egyptian, Saudi, Levantine, Maghrebi, Gulf, and others). It also handles code-switching between Arabic and English natively, which most generic engines don’t do well.
Faheem handles code-switched audio without you having to do anything special. Pick Auto Detect in the language dropdown, or pick the dominant language. The engine still picks up the second language wherever it appears.
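
For the per-file limits mentioned above (5 hours and 500 MB), a rough way to work out how many parts to split a long recording into is sketched below. The helper name `parts_needed` is hypothetical, and the sketch makes three assumptions: 1 MB means 1024 * 1024 bytes, a file exactly at a limit is still accepted, and file size grows evenly with duration, so equal-length parts are also roughly equal in size.

```python
import math

MAX_DURATION_SECONDS = 5 * 60 * 60   # 5 hours per file, as stated on this page
MAX_SIZE_BYTES = 500 * 1024 * 1024   # 500 MB; assumes 1 MB = 1024 * 1024 bytes


def parts_needed(duration_seconds: float, size_bytes: int) -> int:
    """Estimate how many equal parts keep each part within both limits.

    Hypothetical helper: it only does the arithmetic, it does not split audio.
    """
    by_duration = math.ceil(duration_seconds / MAX_DURATION_SECONDS)
    by_size = math.ceil(size_bytes / MAX_SIZE_BYTES)
    # Always at least one part, and enough parts to satisfy the stricter limit.
    return max(1, by_duration, by_size)
```

A 6-hour recording at 100 MB would need two parts because of its length, while a 1-hour recording at 1200 MB would need three because of its size.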