Transcript Editor · Wittify Docs

The transcript editor is the deepest surface in Speech to Text. You reach it by clicking any row in the File Library or any file in the Your Files sidebar. It shares the standard dashboard shell and uses a two-panel layout: transcript and timeline on the start edge, settings and tabs on the other side.

Page layout

The page is split into two panels. On screens narrower than the md breakpoint they stack, with the transcript on top and the settings panel underneath.

Panel	What it holds
Header bar (top)	File name + filename, then a Clock chip showing the live audio length, a Globe chip showing the language, Speakers + Subtitles chips when enabled, the file Status pill, and on the end edge a Transcript download dropdown (TXT / Word / PDF) and an Audio download button.
Editor toolbar (between header and transcript)	Merge, Split, Reassign (dropdown of speakers), Undo, Save changes. The buttons enable based on what you have selected.
Transcript (left, scrollable)	The list of speaker turn rows. Click any row to select it. Click any word to edit it. The audio timeline sticks to the bottom of this panel. The Save bar slides in below the timeline whenever you have unsaved edits.
Settings (right)	Transcribe Settings at the top, then a tab switcher between Summary and Intent, then Cue Properties at the bottom.

The breadcrumb at the top of the page reads Speech to Text (Faseeh) › file name. Renaming the file from the File Library 3-dot menu updates this breadcrumb instantly.

Default speakers

Two speakers ship by default: Speaker 1 and Speaker 2. Each name is editable in place via the start-edge input on its Timeline Row. The name you type appears everywhere: the transcript bubble label, the side-panel Summary rows, and the Reassign dropdown in the toolbar. Names render correctly in mixed Arabic and Latin scripts.

The transcript text itself can be in any language. The dashboard chrome (button labels, dialog copy, toast text) translates between English and Arabic, but the transcript content stays exactly as the audio came in: Arabic, English, a third language, or a mix.

Header: Downloads

Two downloads sit in the file header: Transcript and Audio. Audio is a single download button. Transcript is a dropdown with three formats:

Plain text (.txt) — UTF-8 with BOM so Excel can open Arabic correctly.
Word (.doc) — opens directly in Microsoft Word.
PDF — opens the formatted transcript in a new browser tab. Use File → Print → Save as PDF in your browser to save it.

For very long transcripts (over 5,000 messages) the export is truncated and a yellow notice appears at the bottom of the file.
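As a rough illustration of the plain-text export, the sketch below prepends a byte order mark and applies the caps. The 5,000-message cap comes from the paragraph above; the 1,000-words-per-message cap and the notice text come from Common questions. The `ExportMessage` shape, the function names, and the `speaker: text` line layout are assumptions made for this example, not the product's actual implementation.

```typescript
// Hypothetical message shape; the real data model is not documented here.
interface ExportMessage {
  speaker: string;
  text: string;
}

interface ExportResult {
  content: string;
  truncated: boolean; // true when any cap was applied
}

const MAX_MESSAGES = 5000;
const MAX_WORDS_PER_MESSAGE = 1000;
const TRUNCATION_NOTICE = "… [truncated for export]";
// U+FEFF at the start of the text becomes the bytes EF BB BF once encoded as UTF-8.
const UTF8_BOM = "\uFEFF";

// Keep at most MAX_WORDS_PER_MESSAGE words and mark the cut if one was made.
function capWords(text: string): { text: string; truncated: boolean } {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  if (words.length <= MAX_WORDS_PER_MESSAGE) {
    return { text: words.join(" "), truncated: false };
  }
  const keptWords = words.slice(0, MAX_WORDS_PER_MESSAGE).join(" ");
  return { text: keptWords + " " + TRUNCATION_NOTICE, truncated: true };
}

function buildTxtExport(messages: ExportMessage[]): ExportResult {
  const overMessageCap = messages.length > MAX_MESSAGES;
  let truncated = overMessageCap;
  const lines: string[] = [];
  messages.slice(0, MAX_MESSAGES).forEach((m) => {
    const capped = capWords(m.text);
    truncated = truncated || capped.truncated;
    lines.push(m.speaker + ": " + capped.text);
  });
  if (overMessageCap) {
    lines.push(TRUNCATION_NOTICE); // marks where the message cap kicked in
  }
  return { content: UTF8_BOM + lines.join("\n"), truncated };
}
```

The `truncated` flag is what a caller could use to decide whether to show the yellow notice.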

Transcript pane

Speaker turn rows

Each row is one speaker turn. The row carries:

A selection checkbox at the start edge. Click anywhere on the row (except a word) to toggle selection.
A vertical accent line: blue for Speaker 1, green for Speaker 2, red when the turn has an error.
A speaker label showing the current speaker name (whatever you typed in the timeline), with the start time of that turn formatted MM:SS Sec.
The transcript text as a list of clickable words separated by spaces.

A small fold-vertical separator icon appears between every two rows.

Editable words

Every word is independently clickable. Click any word to flip it into a small inline input:

The input grows or shrinks to fit the text you type.
The background tints to a light brand color and the input has a thin brand border so you know it’s active.
Press Enter or click anywhere outside the input to commit the change. Press Escape to cancel the edit.

Editing a word makes the editor “dirty” — the Save bar appears at the bottom of the transcript pane with Undo, Discard, and Save changes buttons. Edits aren’t persisted until you click Save changes.

Karaoke highlight

While the audio plays, the word currently being spoken is highlighted in amber. The highlight follows along automatically, with no setup needed. Its position is an estimate based on the message’s start time and the length of the current message; over time, as our transcription engine gets more confident, the highlight will land on the exact spoken word.

Top toolbar

A row of actions sits between the file header and the transcript:

Action	Enables when	What it does
Merge	2 or more consecutive lines are selected	Combines the selected lines into a single transcript bubble. Uses the first line’s speaker and start time. The waveforms on the timeline re-bind to the merged bubble automatically.
Split	exactly 1 line is selected and it has at least 2 words	Splits the selected bubble in half (at the midpoint by word count). The left half keeps any waveforms attached; the right half starts unattached — drag waves onto it from the timeline to assign them.
Reassign	at least 1 line is selected	Opens a dropdown of speakers; pick a speaker to reassign every selected line to. The waveforms on the timeline change color in lockstep.
Undo	at least one change has been made	Reverts the last change. History is bounded at the last 50 changes. Cmd + Z (macOS) / Ctrl + Z (Windows / Linux) does the same thing — except when you’re editing a word, where it falls through to the browser’s native text-input undo.
Save changes	the editor is dirty	Persists your edits and clears the dirty flag.
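The Merge and Split rules in the table can be sketched as pure functions. The `Bubble` shape and all function names are hypothetical. Two details are assumptions because the table does not state them: odd word counts round down at the midpoint, and the right half keeps the original start time until a wave is assigned to it.

```typescript
// Hypothetical shape for one transcript bubble; field names are illustrative.
interface Bubble {
  speaker: string;
  startSec: number;
  words: string[];
  waveIds: string[]; // waveform segments bound to this bubble
}

// Merge enables when 2 or more consecutive lines are selected.
function canMerge(selectedIndexes: number[]): boolean {
  if (selectedIndexes.length < 2) return false;
  const sorted = selectedIndexes.slice().sort((a, b) => a - b);
  for (let i = 1; i < sorted.length; i++) {
    if (sorted[i] !== sorted[i - 1] + 1) return false;
  }
  return true;
}

// The merged bubble takes the first line's speaker and start time,
// and the waveforms of every merged line re-bind to it.
function mergeBubbles(bubbles: Bubble[]): Bubble {
  if (bubbles.length < 2) throw new Error("merge needs at least 2 bubbles");
  const first = bubbles[0];
  let words: string[] = [];
  let waveIds: string[] = [];
  bubbles.forEach((b) => {
    words = words.concat(b.words);
    waveIds = waveIds.concat(b.waveIds);
  });
  return { speaker: first.speaker, startSec: first.startSec, words, waveIds };
}

// Split enables when exactly 1 line is selected and it has at least 2 words.
function canSplit(bubble: Bubble): boolean {
  return bubble.words.length >= 2;
}

function splitBubble(bubble: Bubble): [Bubble, Bubble] {
  if (!canSplit(bubble)) throw new Error("split needs at least 2 words");
  // Assumption: odd word counts round down, so the left half may be one word shorter.
  const mid = Math.floor(bubble.words.length / 2);
  const left: Bubble = {
    speaker: bubble.speaker,
    startSec: bubble.startSec,
    words: bubble.words.slice(0, mid),
    waveIds: bubble.waveIds.slice(), // the left half keeps any attached waveforms
  };
  const right: Bubble = {
    speaker: bubble.speaker,
    startSec: bubble.startSec, // assumption: unchanged until a wave is dragged onto it
    words: bubble.words.slice(mid),
    waveIds: [], // the right half starts unattached
  };
  return [left, right];
}
```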

Audio Timeline

A sticky strip at the bottom of the transcript pane. The timeline always renders left-to-right, even on Arabic pages, because timestamps and waveforms only make sense in that direction.

Sub-component	What it does
Timeline Header	A ruler with ticks. The interval auto-scales: a tick every 5 s for files under 1 minute, every 15 s for files under 2 minutes, every 30 s for longer files.
Timeline Row (one per speaker)	An editable speaker name input on the start edge plus the speaker’s waveform segments laid out by time. Each segment’s width is proportional to its duration.
Waveform	Animated bars drawn for each segment. Speaker 1 is blue. Speaker 2 is green. Any segment that the engine flagged as “negative” (frustrated / problematic) renders red, regardless of which speaker it belongs to. The colors swap automatically when you toggle dark mode.
Timeline Controls	A primary Play / Pause circular button (40 px) and a small speed toggle that cycles 1.0x → 1.5x → 2.0x → 0.5x → 1.0x.
Timeline Scrubber	Click anywhere on the timeline to jump the audio to that moment. The position indicator slides to where you clicked.
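The numeric rules in this table are simple enough to sketch directly. The function names and the pixel-based parameters below are assumptions for illustration; only the thresholds, the speed order, and the proportional-width rule come from the table.

```typescript
// Ruler tick spacing, following the thresholds in the Timeline Header row.
function tickIntervalSec(durationSec: number): number {
  if (durationSec < 60) return 5;
  if (durationSec < 120) return 15;
  return 30;
}

// Tick positions from 0 up to the end of the audio.
function tickPositions(durationSec: number): number[] {
  const step = tickIntervalSec(durationSec);
  const ticks: number[] = [];
  for (let t = 0; t <= durationSec; t += step) ticks.push(t);
  return ticks;
}

// Speed toggle order: 1.0x, 1.5x, 2.0x, 0.5x, then back to 1.0x.
const SPEED_CYCLE = [1.0, 1.5, 2.0, 0.5];

function nextSpeed(current: number): number {
  const i = SPEED_CYCLE.indexOf(current);
  // An unknown speed gives i === -1, which falls back to the first entry.
  return SPEED_CYCLE[(i + 1) % SPEED_CYCLE.length];
}

// Each segment's width is proportional to its share of the total duration.
function segmentWidthPx(segmentSec: number, totalSec: number, timelineWidthPx: number): number {
  if (totalSec <= 0) return 0;
  return (segmentSec / totalSec) * timelineWidthPx;
}

// Scrubber: map a click's x offset inside the timeline to a time in seconds.
function seekTimeFromClick(clickX: number, timelineWidthPx: number, durationSec: number): number {
  if (timelineWidthPx <= 0) return 0;
  const ratio = Math.min(1, Math.max(0, clickX / timelineWidthPx));
  return ratio * durationSec;
}
```

Because the timeline always renders left-to-right, the click mapping needs no mirroring on Arabic pages.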

Drag a wave to fix a wrong speaker or wrong time

A single drag does both:

Drag down or up to move the wave to a different speaker.
Drag left or right to move the wave to a different time.
Both at once is also supported — diagonal drags work.

While you drag, the segment dims to 30% opacity and the row underneath the cursor highlights with a dashed outline. Release the mouse / finger to commit. The corresponding transcript bubble updates immediately: its color and timestamp follow the wave. The whole drag gesture counts as a single undo step, so pressing Undo once returns the wave to where it was.

A tap or click on a wave (no drag) seeks the playhead to that wave’s start. The audio meter slides to that point.

While the audio is still loading, the timeline shows a centered spinner with the label Loading timeline… so you don’t see a misaligned ruler.
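One way to picture the click-versus-drag behavior of a wave is as a single function that turns a pointer delta into either a seek or a move. Everything here is a sketch under assumptions: the `Wave` shape, the option names, and the 4 px threshold are invented for the example (the docs only say a drag must travel more than a few pixels).

```typescript
// Hypothetical wave model: which speaker row it sits on and when it starts.
interface Wave {
  speakerIndex: number;
  startSec: number;
  durationSec: number;
}

interface TimelineGeometry {
  pxPerSec: number;     // horizontal scale of the timeline
  rowHeightPx: number;  // height of one Timeline Row
  speakerCount: number;
  audioDurationSec: number;
}

type GestureResult =
  | { kind: "seek"; toSec: number }
  | { kind: "move"; wave: Wave };

// Assumption: stands in for "a few pixels".
const DRAG_THRESHOLD_PX = 4;

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

function resolveGesture(wave: Wave, dxPx: number, dyPx: number, geo: TimelineGeometry): GestureResult {
  const distance = Math.sqrt(dxPx * dxPx + dyPx * dyPx);
  if (distance <= DRAG_THRESHOLD_PX) {
    // No real movement: treat it as a click and seek to the wave's start.
    return { kind: "seek", toSec: wave.startSec };
  }
  // Vertical movement picks the speaker row; horizontal movement shifts the time.
  const rowShift = Math.round(dyPx / geo.rowHeightPx);
  const speakerIndex = clamp(wave.speakerIndex + rowShift, 0, geo.speakerCount - 1);
  const maxStart = Math.max(0, geo.audioDurationSec - wave.durationSec);
  const startSec = clamp(wave.startSec + dxPx / geo.pxPerSec, 0, maxStart);
  return { kind: "move", wave: { speakerIndex, startSec, durationSec: wave.durationSec } };
}
```

Returning one result per gesture matches the single-undo-step behavior: a diagonal drag changes speaker and time in one recorded change.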

Save bar

Whenever the editor has unsaved edits, a sliding bar appears below the audio timeline:

A pulsing amber dot + Unsaved changes label so you can’t miss it.
Undo — reverts the last change.
Discard — opens a confirmation dialog (“All edits since the last save will be lost. This can’t be undone.”). Confirming reverts everything since the last save.
Save changes — persists everything and the bar disappears.
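The relationship between Undo, Discard, and Save changes can be modeled with a small history class. This is a sketch only: the class and method names are hypothetical, snapshots are compared by reference, and keeping the undo history after a save is an assumption the docs do not confirm. The 50-change bound comes from the toolbar's Undo description.

```typescript
const MAX_HISTORY = 50; // history is bounded at the last 50 changes

class EditHistory<T> {
  private saved: T;
  private current: T;
  private past: T[] = [];

  constructor(initial: T) {
    this.saved = initial;
    this.current = initial;
  }

  value(): T {
    return this.current;
  }

  // Dirty means the current snapshot differs from the last saved one.
  isDirty(): boolean {
    return this.current !== this.saved;
  }

  // Record a change, dropping the oldest entry once the bound is exceeded.
  apply(next: T): void {
    this.past.push(this.current);
    if (this.past.length > MAX_HISTORY) this.past.shift();
    this.current = next;
  }

  // Undo reverts the last change; returns false when nothing is left to undo.
  undo(): boolean {
    if (this.past.length === 0) return false;
    this.current = this.past.pop() as T;
    return true;
  }

  // Discard reverts everything since the last save.
  discard(): void {
    this.current = this.saved;
    this.past = [];
  }

  // Save persists the current snapshot and clears the dirty flag.
  // Assumption: the undo history is left in place after saving.
  save(): void {
    this.saved = this.current;
  }
}
```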

Unsaved-changes guard

If you try to navigate away (click another sidebar item, the dashboard logo, your profile menu, etc.) while you have unsaved edits, a dialog appears with three options:

Save and continue — saves your edits, then takes you to where you wanted to go.
Discard (red) — drops your edits, then takes you to where you wanted to go.
Cancel — keeps you on the page; nothing happens.

If you try to refresh the page or close the tab, your browser will show its own “Leave site?” prompt instead — that one is the browser’s native dialog, so we can’t customise the buttons.
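The two halves of the guard can be sketched as follows. The native prompt is what browsers show when a `beforeunload` handler calls `preventDefault()` (older browsers also expect `returnValue` to be set), and the page cannot change its wording. The structural `WindowLike` type and the function names are assumptions for this example, chosen so the sketch does not depend on DOM typings; in a browser the `window` object is the intended argument.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface BeforeUnloadEventLike {
  preventDefault(): void;
  returnValue: unknown;
}
type UnloadHandler = (event: BeforeUnloadEventLike) => void;
interface WindowLike {
  addEventListener(type: "beforeunload", handler: UnloadHandler): void;
  removeEventListener(type: "beforeunload", handler: UnloadHandler): void;
}

// Refresh / tab close: ask the browser for its native "Leave site?" prompt
// only while there are unsaved edits. Returns a function that removes the guard.
function installUnloadGuard(win: WindowLike, isDirty: () => boolean): () => void {
  const handler: UnloadHandler = (event) => {
    if (!isDirty()) return;
    event.preventDefault();
    event.returnValue = ""; // legacy trigger for older browsers
  };
  win.addEventListener("beforeunload", handler);
  return () => win.removeEventListener("beforeunload", handler);
}

// In-app navigation: the three dialog options and what each one leads to.
type GuardChoice = "save" | "discard" | "cancel";

function resolveGuardChoice(choice: GuardChoice): { persist: boolean; navigate: boolean } {
  if (choice === "save") return { persist: true, navigate: true };
  if (choice === "discard") return { persist: false, navigate: true };
  return { persist: false, navigate: false }; // cancel: stay on the page
}
```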

Right panel

Transcribe Settings (top of the panel)

Field	What it does
Language Detection	An outline button showing the language detection setting you chose pre-processing (defaults to Auto-Detect). The Globe chip in the file header shows the actual detected language separately.
Include subtitles switch	Turn on to generate subtitles for export to SRT or VTT. The AI labels each speaker so you can rename them on the fly.
Detect words button	An outline button that runs word-level detection over the audio.
Run spell check button	A brand-gradient primary button that runs spell check across the transcript.

Tabs section (middle of the panel)

Two tabs: Summary and Intent. The Summary tab shows:

A row per speaker showing the Speaker 1 / Speaker 2 label and the editable name.
An Executive Overview heading with a thin bordered underline, followed by the conversation’s narrative paragraph.

Cue Properties (bottom of the panel)

A static card showing the time bounds of the current audio:

Field	Format	Example
Start time	`HH:MM:SS.cc`	`00:00:00.00`
End time	`HH:MM:SS.cc`	matches the live audio duration

The card also has a Delete button: an outline button styled red (red text, red border) that removes the active cue.
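For reference, a time in seconds can be rendered in the `HH:MM:SS.cc` format along these lines. The sketch assumes that `cc` means hundredths of a second and that values are rounded to the nearest hundredth; neither detail is stated in the table, and the function names are hypothetical.

```typescript
// Left-pad a non-negative integer to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format seconds as HH:MM:SS.cc, assuming cc is hundredths of a second.
function formatCueTime(seconds: number): string {
  // Work in whole hundredths to avoid fractional drift; negatives clamp to zero.
  const totalHundredths = Math.max(0, Math.round(seconds * 100));
  const hundredths = totalHundredths % 100;
  const totalSeconds = Math.floor(totalHundredths / 100);
  const secs = totalSeconds % 60;
  const mins = Math.floor(totalSeconds / 60) % 60;
  const hours = Math.floor(totalSeconds / 3600);
  return pad2(hours) + ":" + pad2(mins) + ":" + pad2(secs) + "." + pad2(hundredths);
}
```

Under these assumptions, a file that is 83.5 seconds long would show an End time of `00:01:23.50`.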

Common questions

Drag-and-drop isn't working — clicking a wave just seeks the audio.

A click without movement seeks. To drag, press the mouse / finger down on a wave and move it more than a few pixels. The cursor changes to “grab” while you’re dragging.

I dragged a wave to another speaker but the transcript bubble didn't change color.

The bubble follows the wave automatically. If it didn’t, your audio timeline and your transcript may be looking at different files — refresh the page and try again. If it persists, file a bug.

My speaker name change didn't stick.

A speaker name is applied when you click outside the input, but it isn’t persisted yet. Click Save changes in the save bar at the bottom of the transcript pane (or in the top toolbar) to commit the change.

Cmd + Z / Ctrl + Z doesn't undo when I'm editing a word.

That’s intentional. While you’re typing in a word’s input, Cmd / Ctrl + Z is the browser’s native text-input undo. Click anywhere outside the input first, then Cmd / Ctrl + Z runs the editor’s undo.

I lost my edits when I navigated away.

The unsaved-changes prompt should always appear when you click a sidebar item or the logo with unsaved edits. If it didn’t appear, you may have used a button-driven shortcut that bypasses the guard; file a bug. Edits that were never saved can’t be recovered, so the file is back at its last saved state.

The Audio download fails with a generic error.

Three specific errors are surfaced now: “Couldn’t reach the audio file” means a network or CORS issue, “Audio file unavailable (HTTP X)” means the server returned an error code (X), and a generic message means something else went wrong. If it persists, refresh the page; if it still fails, the file may have expired — re-upload it.

My PDF download just opened a new tab — where's the file?

PDF export opens the formatted transcript in a new browser tab. Use your browser’s File → Print → Save as PDF to save it. We’ll add direct .pdf download once the server-side export ships.

The Word file Microsoft Word opens looks like an HTML preview.

The current .doc is a Word-readable HTML document. Microsoft Word opens it natively. We’ll switch to a real .docx once the server-side export ships.

I exported a very long transcript and it's missing the end.

Exports are capped at 5,000 messages and 1,000 words per message. A “… [truncated for export]” notice marks where the cap kicked in, and a yellow toast appears alongside the success toast. Server-side export (which removes the cap) is on the roadmap.

The waveform is showing the same color for both speakers.

Speaker 1 is blue, Speaker 2 is green. If you see one color, the file probably wasn’t transcribed with Detect Speakers Automatically turned on. Re-upload through Upload Audio with that switch on.

Play / Pause doesn't work.

Wait until the Loading timeline… spinner clears; the controls don’t engage until the audio’s metadata has loaded. If it never clears, the audio file may not be available yet. Refresh the page after a few seconds.

My language detection is wrong.

Click the Language Detection button at the top of the right panel and pick the right language. Re-running detection with the right language often improves the transcript quality on the next refresh.
