The transcript editor is the deepest surface in Speech to Text. You reach it by clicking any row in the File Library or any file in the Your Files sidebar. It shares the standard dashboard shell and uses a two-panel layout: transcript and timeline on the start edge, settings and tabs on the other side.

Page layout

The page is split into two panels. On screens narrower than md they stack with the transcript on top and the settings panel underneath.
| Panel | What it holds |
| --- | --- |
| Header bar (top) | The file name, then a Clock chip showing the live audio length, a Globe chip showing the language, Speakers + Subtitles chips when enabled, the file Status pill, and on the end edge a Transcript download dropdown (TXT / Word / PDF) and an Audio download button. |
| Editor toolbar (between header and transcript) | Merge, Split, Reassign (dropdown of speakers), Undo, Save changes. The buttons enable based on what you have selected. |
| Transcript (left, scrollable) | The list of speaker turn rows. Click any row to select it. Click any word to edit it. The audio timeline sticks to the bottom of this panel. The Save bar slides in below the timeline whenever you have unsaved edits. |
| Settings (right) | Transcribe Settings at the top, then a tab switcher between Summary and Intent, then Cue Properties at the bottom. |
The breadcrumb at the top of the page reads Speech to Text (Faseeh) > file name. Renaming the file from the File Library 3-dot menu updates this breadcrumb instantly.

Default speakers

Two speakers ship by default: Speaker 1 and Speaker 2. Each name is editable in place via the start-edge input on its Timeline Row. The name you type is what appears everywhere: the transcript bubble label, the side-panel Summary rows, and the Reassign dropdown in the toolbar. Names render correctly in mixed Arabic and Latin scripts.

The transcript text itself can be in any language. The dashboard chrome (button labels, dialog copy, toast text) translates between English and Arabic, but the transcript content stays exactly as the audio came in: Arabic, English, a third language, or a mix.

Header — Downloads

The file header has two downloads: a Transcript download dropdown and an Audio download button. The Transcript dropdown offers three formats:
  • Plain text (.txt) — UTF-8 with BOM so Excel can open Arabic correctly.
  • Word (.doc) — opens directly in Microsoft Word.
  • PDF — opens the formatted transcript in a new browser tab. Use File → Print → Save as PDF in your browser to save it.
For very long transcripts (over 5,000 messages) the export is truncated and a yellow notice appears at the bottom of the file.
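The plain-text export rules can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name, the `Speaker: text` line layout, and the placement of the notice are assumptions. The 5,000-message cap comes from the text above, and the 1,000-words-per-message cap and the notice wording come from the Common questions section.

```python
MAX_MESSAGES = 5_000            # export cap on the number of messages
MAX_WORDS_PER_MESSAGE = 1_000   # per-message cap (from Common questions)
TRUNCATION_NOTICE = "… [truncated for export]"


def build_txt_export(messages: list[tuple[str, str]]) -> tuple[bytes, bool]:
    """Build the .txt payload from (speaker, text) pairs.

    Returns the encoded bytes and a flag saying whether anything was cut.
    The "Speaker: text" line layout is an assumption for illustration.
    """
    truncated = False
    lines = []
    for speaker, text in messages[:MAX_MESSAGES]:
        words = text.split()
        if len(words) > MAX_WORDS_PER_MESSAGE:
            # Keep the first 1,000 words and mark where the cut happened.
            words = words[:MAX_WORDS_PER_MESSAGE] + [TRUNCATION_NOTICE]
            truncated = True
        lines.append(f"{speaker}: {' '.join(words)}")
    if len(messages) > MAX_MESSAGES:
        # Messages past the cap are dropped; the notice goes at the end.
        lines.append(TRUNCATION_NOTICE)
        truncated = True
    # "utf-8-sig" prepends the UTF-8 byte order mark (EF BB BF).
    return "\n".join(lines).encode("utf-8-sig"), truncated
```

A caller could use the returned flag to decide whether to show the yellow truncation notice alongside the success message.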

Transcript pane

Speaker turn rows

Each row is one speaker turn. The row carries:
  • A selection checkbox at the start edge. Click anywhere on the row (except a word) to toggle selection.
  • A vertical accent line: blue for Speaker 1, green for Speaker 2, red when the turn has an error.
  • A speaker label showing the current speaker name (whatever you typed in the timeline), with the start time of that turn formatted MM:SS Sec.
  • The transcript text as a list of clickable words separated by spaces.
A small fold-vertical separator icon appears between every two rows.

Editable words

Every word is independently clickable. Click any word to flip it into a small inline input:
  • The input grows or shrinks to fit the text you type.
  • The background tints to a light brand color and the input has a thin brand border so you know it’s active.
  • Press Enter or click anywhere outside the input to commit the change. Press Escape to cancel the edit.
Editing a word makes the editor “dirty” — the Save bar appears at the bottom of the transcript pane with Undo, Discard, and Save changes buttons. Edits aren’t persisted until you click Save changes.

Karaoke highlight

While the audio plays, the word currently being spoken is highlighted in amber. The highlight follows along automatically — no setup needed. The estimate is based on the message’s start time and the length of the current message; over time, as our transcription engine gets more confident, the highlight will land on the exact spoken word.

Top toolbar

A row of actions sits between the file header and the transcript:
| Action | Enables when | What it does |
| --- | --- | --- |
| Merge | 2 or more consecutive lines are selected | Combines the selected lines into a single transcript bubble. Uses the first line’s speaker and start time. The waveforms on the timeline re-bind to the merged bubble automatically. |
| Split | Exactly 1 line is selected and it has at least 2 words | Splits the selected bubble in half (at the midpoint by word count). The left half keeps any waveforms attached; the right half starts unattached, so drag waves onto it from the timeline to assign them. |
| Reassign | At least 1 line is selected | Opens a dropdown of speakers; pick a speaker to reassign every selected line to. The waveforms on the timeline change color in lockstep. |
| Undo | At least one change has been made | Reverts the last change. History is bounded at the last 50 changes. Cmd + Z (macOS) / Ctrl + Z (Windows / Linux) does the same thing, except when you’re editing a word, where it falls through to the browser’s native text-input undo. |
| Save changes | The editor is dirty | Persists your edits and clears the dirty flag. |
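The Merge and Split rules above can be sketched on a simple data model. The `Turn` class and its field names are hypothetical, and two details are assumptions the text does not settle: with an odd word count the extra word goes to the right half, and the right half reuses the original start time as a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    speaker: str
    start_s: float
    words: list[str]
    wave_ids: list[str] = field(default_factory=list)


def merge(turns: list[Turn]) -> Turn:
    """Combine consecutive turns; the first turn's speaker and start win."""
    if len(turns) < 2:
        raise ValueError("Merge needs 2 or more consecutive turns")
    first = turns[0]
    return Turn(
        speaker=first.speaker,
        start_s=first.start_s,
        words=[w for t in turns for w in t.words],
        # Waveforms from every merged turn re-bind to the merged bubble.
        wave_ids=[i for t in turns for i in t.wave_ids],
    )


def split(turn: Turn) -> tuple[Turn, Turn]:
    """Split one turn at the midpoint by word count."""
    if len(turn.words) < 2:
        raise ValueError("Split needs a turn with at least 2 words")
    mid = len(turn.words) // 2  # assumption: an odd extra word goes right
    left = Turn(turn.speaker, turn.start_s, turn.words[:mid], list(turn.wave_ids))
    # The right half starts with no waveforms attached.
    right = Turn(turn.speaker, turn.start_s, turn.words[mid:], [])
    return left, right
```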

Audio Timeline

A sticky strip at the bottom of the transcript pane. The timeline always renders left-to-right, even on Arabic pages, because timestamps and waveforms only make sense in that direction.
| Sub-component | What it does |
| --- | --- |
| Timeline Header | A ruler with ticks. The interval auto-scales: a tick every 5 s for files under 1 minute, every 15 s for files under 2 minutes, every 30 s for longer files. |
| Timeline Row (one per speaker) | An editable speaker name input on the start edge plus the speaker’s waveform segments laid out by time. Each segment’s width is proportional to its duration. |
| Waveform | Animated bars drawn for each segment. Speaker 1 is blue. Speaker 2 is green. Any segment that the engine flagged as “negative” (frustrated / problematic) renders red, regardless of which speaker it belongs to. The colors swap automatically when you toggle dark mode. |
| Timeline Controls | A primary Play / Pause circular button (40 px) and a small speed toggle that cycles 1.0x → 1.5x → 2.0x → 0.5x → 1.0x. |
| Timeline Scrubber | Click anywhere on the timeline to jump the audio to that moment. The position indicator slides to where you clicked. |
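The numeric rules in this table are easy to express in a few lines. The sketch below restates the tick interval, the speed cycle, and the proportional segment width; all function names are hypothetical, and the files of exactly 1 or 2 minutes are assumed to fall into the next bracket because the text says "under".

```python
SPEEDS = (1.0, 1.5, 2.0, 0.5)  # order of the speed toggle cycle


def tick_interval_s(duration_s: float) -> int:
    """Ruler tick spacing in seconds for a file of the given length."""
    if duration_s < 60:
        return 5
    if duration_s < 120:
        return 15
    return 30


def next_speed(current: float) -> float:
    """Advance the speed toggle one step, wrapping back to 1.0x."""
    return SPEEDS[(SPEEDS.index(current) + 1) % len(SPEEDS)]


def segment_width_px(
    segment_duration_s: float, total_duration_s: float, timeline_width_px: float
) -> float:
    """Width of a waveform segment, proportional to its duration."""
    return timeline_width_px * segment_duration_s / total_duration_s
```

For example, a 30 s segment in a 2-minute file takes a quarter of an 800 px timeline, which is 200 px.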

Drag a wave to fix a wrong speaker or wrong time

A single drag does both:
  • Drag down or up to move the wave to a different speaker.
  • Drag left or right to move the wave to a different time.
  • Both at once is also supported — diagonal drags work.
While you drag, the segment dims to 30% opacity and the row underneath the cursor highlights with a dashed outline. Release the mouse / finger to commit. The corresponding transcript bubble updates immediately: its color and timestamp follow the wave.

A tap or click on a wave (no drag) seeks the playhead to that wave’s start. The audio meter slides to that point.

The whole drag gesture counts as a single undo step: pressing Undo once returns the wave to where it was.

While the audio is still loading, the timeline shows a centered spinner with the label Loading timeline… so you don’t see a misaligned ruler.
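The gesture handling described above can be sketched as one function that turns pointer movement into either a seek or a move. Everything here is illustrative: the `Wave` class, the parameter names, the 4 px drag threshold (the docs only say "more than a few pixels"), and the clamping to the timeline bounds are assumptions.

```python
import math
from dataclasses import dataclass, replace

DRAG_THRESHOLD_PX = 4.0  # assumption: stands in for "a few pixels"


@dataclass(frozen=True)
class Wave:
    speaker_index: int  # which Timeline Row the wave sits on
    start_s: float
    duration_s: float


def apply_gesture(
    wave: Wave,
    dx_px: float,
    dy_px: float,
    *,
    px_per_second: float,
    row_height_px: float,
    speaker_count: int,
    total_s: float,
):
    """Return ("seek", time) for a click or ("move", new_wave) for a drag."""
    if math.hypot(dx_px, dy_px) <= DRAG_THRESHOLD_PX:
        # No real movement: treat it as a click and seek to the wave start.
        return ("seek", wave.start_s)
    # Vertical movement picks the speaker row, clamped to existing rows.
    row = wave.speaker_index + round(dy_px / row_height_px)
    row = max(0, min(row, speaker_count - 1))
    # Horizontal movement shifts the start time, clamped to the audio length.
    start = wave.start_s + dx_px / px_per_second
    start = max(0.0, min(start, total_s - wave.duration_s))
    return ("move", replace(wave, speaker_index=row, start_s=start))
```

Because the function returns a single new `Wave` for the whole gesture, recording that one result in the edit history matches the single-undo-step behavior described above.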

Save bar

Whenever the editor has unsaved edits, a sliding bar appears below the audio timeline:
  • A pulsing amber dot + Unsaved changes label so you can’t miss it.
  • Undo — reverts the last change.
  • Discard — opens a confirmation dialog (“All edits since the last save will be lost. This can’t be undone.”). Confirming reverts everything since the last save.
  • Save changes — persists everything and the bar disappears.
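The interplay of Undo, Discard, and Save changes can be modeled with a bounded history. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, the 50-change bound comes from the toolbar description, "dirty" is modeled as "current state differs from the last saved state", and whether saving clears the undo history is not stated, so the sketch leaves it intact.

```python
from collections import deque


class EditHistory:
    """Tracks editor state with a bounded undo stack (immutable states)."""

    def __init__(self, state, limit: int = 50):
        self.state = state
        self.saved = state
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._undo = deque(maxlen=limit)

    @property
    def dirty(self) -> bool:
        return self.state != self.saved

    def apply(self, new_state) -> None:
        """Record one change, such as a word edit or a whole drag gesture."""
        self._undo.append(self.state)
        self.state = new_state

    def undo(self) -> None:
        """Revert the last change, if any history remains."""
        if self._undo:
            self.state = self._undo.pop()

    def discard(self) -> None:
        """Drop everything since the last save."""
        self.state = self.saved
        self._undo.clear()

    def save(self) -> None:
        """Persist the current state; the editor is no longer dirty."""
        self.saved = self.state
```

With this model, the Save bar would be shown whenever `dirty` is true, and after 60 changes only the most recent 50 can be undone.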

Unsaved-changes guard

If you try to navigate away (click another sidebar item, the dashboard logo, your profile menu, etc.) while you have unsaved edits, a dialog appears with three options:
  • Save and continue — saves your edits, then takes you to where you wanted to go.
  • Discard (red) — drops your edits, then takes you to where you wanted to go.
  • Cancel — keeps you on the page; nothing happens.
If you try to refresh the page or close the tab, your browser will show its own “Leave site?” prompt instead — that one is the browser’s native dialog, so we can’t customise the buttons.

Right panel

Transcribe Settings (top of the panel)

| Field | What it does |
| --- | --- |
| Language Detection | An outline button showing the language detection setting you chose pre-processing (defaults to Auto-Detect). The Globe chip in the file header shows the actual detected language separately. |
| Include subtitles switch | Turn on to generate subtitles for export to SRT or VTT. The AI labels each speaker so you can rename them on the fly. |
| Detect words button | An outline button that runs word-level detection over the audio. |
| Run spell check button | A brand-gradient primary button that runs spell check across the transcript. |

Tabs section (middle of the panel)

Two tabs: Summary and Intent. The Summary tab shows:
  • A row per speaker showing Speaker 1 / Speaker 2 label and the editable name.
  • Executive Overview heading with a thin bordered underline, followed by the conversation’s narrative paragraph.

Cue Properties (bottom of the panel)

A static card showing the time bounds of the current audio:
| Field | Format | Example |
| --- | --- | --- |
| Start time | HH:MM:SS.cc | 00:00:00.00 |
| End time | HH:MM:SS.cc | matches the live audio duration |

Delete button: An outline button styled red (red text, red border). Removes the active cue.
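A time in seconds can be rendered in the HH:MM:SS.cc format with a small helper. The sketch assumes that "cc" means centiseconds (hundredths of a second), which matches the 00:00:00.00 example but is not spelled out in the text. The function name is hypothetical.

```python
def format_cue_time(seconds: float) -> str:
    """Format a non-negative time in seconds as HH:MM:SS.cc."""
    # Work in whole centiseconds so rounding carries over correctly.
    total_cs = round(seconds * 100)
    cs = total_cs % 100
    total_s = total_cs // 100
    hours, remainder = divmod(total_s, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{cs:02d}"
```

Rounding before splitting into fields avoids outputs such as 00:00:59.100 when a value sits just below a whole second.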

Common questions

If clicking a wave seeks instead of moving it: a click without movement seeks. To drag, press the mouse / finger down on a wave and move it more than a few pixels. The cursor changes to “grab” while you’re dragging.
When you move a wave, its transcript bubble follows automatically. If it didn’t, your audio timeline and your transcript may be looking at different files. Refresh the page and try again. If it persists, file a bug.
Speaker names are committed to the editor when you click outside the input, but they aren’t persisted yet. Click Save changes in the save bar at the bottom of the transcript pane (or in the top toolbar) to save the change.
If Cmd / Ctrl + Z doesn’t run the editor’s undo while you’re typing in a word’s input, that’s intentional: inside the input it is the browser’s native text-input undo. Click anywhere outside the input first, then Cmd / Ctrl + Z runs the editor’s undo.
The unsaved-changes prompt should always appear when you click a sidebar item or the logo with unsaved edits. If it didn’t appear, you may have used a button-driven shortcut that bypasses the guard, so file a bug. Edits lost this way can’t be recovered.
Three specific errors are surfaced now: “Couldn’t reach the audio file” means a network or CORS issue, “Audio file unavailable (HTTP X)” means the server returned an error code (X), and a generic message means something else went wrong. If it persists, refresh the page; if it still fails, the file may have expired — re-upload it.
PDF export opens the formatted transcript in a new browser tab. Use your browser’s File → Print → Save as PDF to save it. We’ll add direct .pdf download once the server-side export ships.
The current .doc is a Word-readable HTML document. Microsoft Word opens it natively. We’ll switch to a real .docx once the server-side export ships.
Exports are capped at 5,000 messages and 1,000 words per message. A “… [truncated for export]” notice marks where the cap kicked in, and a yellow toast appears alongside the success toast. Server-side export (which removes the cap) is on the roadmap.
Speaker 1 is blue, Speaker 2 is green. If you see one color, the file probably wasn’t transcribed with Detect Speakers Automatically turned on. Re-upload through Upload Audio with that switch on.
Wait until the Loading timeline… spinner clears. The controls don’t engage until the audio’s metadata has loaded. If the spinner never clears, the audio file may not be available yet; refresh the page after a few seconds.
If the transcript was detected in the wrong language, click the Language Detection button at the top of the right panel and pick the right language. Re-running detection with the right language often improves the transcript quality on the next refresh.