Documents are the units inside a knowledge base. You upload files by drag-and-drop or by browsing, watch the engine turn raw files into searchable chunks, then open any document to view its scope, tags, description, and chunk inspector. Each document moves through a five-state pipeline, from Pending to Ready, before it is searchable.

The ingestion pipeline

Every uploaded document goes through the same five states, from Pending to Ready. A sixth status, Failed, appears only if ingestion hits a server-side error.

| Status chip | What it means |
| --- | --- |
| Pending | Local placeholder while the file uploads. |
| Parsing | The engine is extracting text. PDFs use OCR for scanned pages, with Arabic optimisation. |
| Chunking | Text is split per the knowledge base’s chunk size and overlap settings. |
| Embedding | Each chunk is converted into a vector (covers 100+ languages). |
| Ready | The document is searchable in chats. |
| Failed | A server-side error occurred. The document detail page shows the exact error message. |

The list refreshes automatically while a document is moving through the pipeline. You can leave the page and come back; the status updates on the next refresh.
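
As a rough illustration of the status lifecycle, the sketch below models the status chips as an enum and polls until a document reaches Ready or Failed. This page does not document a public API, so `fetch_status` is a hypothetical callable you would supply yourself, and the polling interval and poll limit are arbitrary assumptions.

```python
import time
from enum import Enum
from typing import Callable


class DocumentStatus(str, Enum):
    """The status chips listed in the pipeline table."""
    PENDING = "Pending"
    PARSING = "Parsing"
    CHUNKING = "Chunking"
    EMBEDDING = "Embedding"
    READY = "Ready"
    FAILED = "Failed"


# A document stops moving once it is Ready or Failed.
TERMINAL_STATUSES = {DocumentStatus.READY, DocumentStatus.FAILED}


def wait_until_settled(
    fetch_status: Callable[[], DocumentStatus],  # hypothetical: supplied by the caller
    interval_s: float = 2.0,                     # assumption: not a documented value
    max_polls: int = 150,                        # assumption: not a documented value
    sleep: Callable[[float], None] = time.sleep,
) -> DocumentStatus:
    """Poll until the document is Ready or Failed, or the poll limit is reached."""
    status = fetch_status()
    polls = 1
    while status not in TERMINAL_STATUSES and polls < max_polls:
        sleep(interval_s)
        status = fetch_status()
        polls += 1
    return status
```

The function returns the last status it saw, so a caller can tell a Failed document apart from one that is still Parsing when the poll limit runs out.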

Document detail page

Open any document for a tabbed view: Overview and SQL tables.

Overview tab

The default tab. Four cards are stacked vertically: Details, Description, Document scope, and Tags.

Details card

| Field | Notes |
| --- | --- |
| Size | Stored size in MB or KB. |
| Chunks | Number of chunks the document was split into. |
| Uploaded | The upload timestamp. |
| Status | The current pipeline state. |
| Ingestion error | When status is Failed, a red strip surfaces the server message. |
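
The Chunks count depends on the knowledge base’s chunk size and overlap settings. The sketch below shows one common way such splitting works: a sliding window measured in characters. The engine’s actual unit (characters, tokens, or sentences) and its boundary handling are not stated on this page, so treat `split_into_chunks` and its parameters as illustrative assumptions.

```python
def split_into_chunks(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters. Character-based sizing
    is an assumption made for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# A 10-character text with chunk_size=4 and overlap=1 yields three chunks.
print(split_into_chunks("abcdefghij", chunk_size=4, overlap=1))
# → ['abcd', 'defg', 'ghij']
```

A larger overlap produces more chunks for the same text, because each window advances by a smaller step.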

Description card

| Element | What it shows |
| --- | --- |
| Title | Description. |
| Subtitle | Short explanation of what this document contains. The assistant reads it when answering and when matching your questions to the right document. |
| Body | The current description text, or the empty state “No description yet. Add one so the assistant can cite this document accurately.” |
| Edit description button | Opens an inline editor with Save and Cancel buttons. |

Document scope card

A read-only summary of what the engine found in the document.

| Axis | What it shows |
| --- | --- |
| Sheets | XLSX worksheet names. |
| Headings | Top-level headings extracted from the document. |
| Languages | Auto-detected language tags. |
| Contains tables | A Yes or No flag. |

The scope filter in the chat composer uses these axes when you ask a question. For example, a heading like “Section 4 - Refunds” shows up as a one-click filter in the chat composer’s scope panel.

Tags card

Inline metadata editor for the document.

| Element | Notes |
| --- | --- |
| Title | Tags. |
| Subtitle | Attach metadata to this document. The scope filter uses these keys when you ask a question. |
| Tag rows | Each tag is a key plus a value, displayed as small chips. Click any chip to edit, or click the + Add tag button to add a new one. |
| Key rules | Letters, digits, underscore, or Arabic characters. Maximum 64 characters. |
| Value rules | Free-form text up to 1,024 characters. |

Common tag examples: team: hr, year: 2024, region: gulf, confidential: yes.
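
If you prepare tags in bulk before entering them, a local check against the key and value rules can catch problems early. The sketch below is one possible reading of those rules, not the product’s own validator: it assumes “letters” means ASCII letters, that “Arabic characters” means the Unicode block U+0600 to U+06FF, and that an empty key is invalid. The function name `validate_tag` is hypothetical.

```python
import re

# Assumptions: ASCII letters, digits, underscore, plus the Arabic block
# U+0600-U+06FF; between 1 and 64 characters.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_\u0600-\u06FF]{1,64}")
MAX_VALUE_LENGTH = 1024


def validate_tag(key: str, value: str) -> list[str]:
    """Return a list of rule violations; an empty list means the tag looks valid."""
    problems: list[str] = []
    if not KEY_PATTERN.fullmatch(key):
        problems.append(
            "key must be 1-64 letters, digits, underscores, or Arabic characters"
        )
    if len(value) > MAX_VALUE_LENGTH:
        problems.append(f"value must be at most {MAX_VALUE_LENGTH} characters")
    return problems


print(validate_tag("team", "hr"))         # → []
print(validate_tag("region-gulf", "yes")) # one problem: the hyphen is not allowed
```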

SQL tables tab

When the document is an XLSX file, every sheet is materialised into a Postgres table during ingestion so the chat assistant can write SQL against it. This tab lists those tables.

| Element per table card | Notes |
| --- | --- |
| Header | The table’s name. |
| Copy FROM clause button | Copies a snippet like FROM project_xx.table_yy to your clipboard. |
| Description editor | Inline editor with placeholder “Describe what this sheet contains (e.g. Monthly water reuse totals by region, 2017–2024)”. |
| Columns table | One row per column. |

| Column | What it shows |
| --- | --- |
| Original column | The header name as it appears in the XLSX file. |
| Postgres column | The sanitized column name the database actually uses. |
| Type | The inferred SQL type (text, integer, numeric, date, etc.). |
| Rows | The total row count, formatted rows. |

When the document produced no SQL tables, the tab shows the empty state: “This document didn’t produce any SQL tables. Excel sheets are materialised into Postgres during ingestion; other formats are query-ready via RAG only.”
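
To show how a copied FROM clause slots into a query, the sketch below runs an aggregate against a table named `project_xx.table_yy`. It uses Python’s built-in `sqlite3` with an attached in-memory database as a local stand-in for the real Postgres tables, and the columns `region`, `year`, and `total_m3` are invented for illustration. Real table and column names come from the SQL tables tab.

```python
import sqlite3

# Local stand-in only: the product materialises sheets into Postgres, not SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS project_xx")

# Hypothetical sheet with three columns, as the Columns table might list them.
conn.execute(
    "CREATE TABLE project_xx.table_yy (region TEXT, year INTEGER, total_m3 REAL)"
)
conn.executemany(
    "INSERT INTO project_xx.table_yy VALUES (?, ?, ?)",
    [("gulf", 2023, 120.5), ("gulf", 2024, 130.0), ("levant", 2024, 80.0)],
)

# The snippet copied by the "Copy FROM clause" button.
from_clause = "FROM project_xx.table_yy"

query = (
    "SELECT region, SUM(total_m3) AS total "
    f"{from_clause} "
    "GROUP BY region ORDER BY region"
)
rows = conn.execute(query).fetchall()
print(rows)  # → [('gulf', 250.5), ('levant', 80.0)]
```

When you write SQL by hand, use the names from the Postgres column field rather than the original XLSX headers, since the sanitized names are the ones the database actually uses.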

Chunk inspector

A power-user view of how the assistant sees this document. Open any chunk row to enter the inspector.

| Element | What it shows |
| --- | --- |
| Header | Chunks count, Edited count, Pinned count. |
| Search box | Placeholder “Search chunks”. Filters by content. |
| Each chunk card | Chunk index, page number, heading, plus small flags: OCR (came from optical character recognition), pinned, edited, boost (has keyword boosts), table (contains tabular content). |
| Edit chunk button | On every chunk card. Opens the editor. |

Editing a chunk

| Field | Notes |
| --- | --- |
| Chunk text | The raw extracted text. Editing this triggers a fresh embedding. |
| Keyword boost | Free-form list. Help text: “Synonyms the retriever will treat as if they appeared in the chunk.” |
| Pin this chunk | Toggle. Help text: “Pinned chunks are always included in retrieval for this document.” |
| Metadata | Read-only. Shows page number, heading, language, and any other auto-extracted properties. |

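
To make the effect of keyword boosts and pins concrete, here is a deliberately simplified retrieval sketch. It is not the engine’s algorithm: real retrieval works on embeddings, whereas this toy version scores chunks by word overlap. The `Chunk` class, `keyword_score`, and `retrieve` are hypothetical names, and treating pinned chunks as extra results on top of `top_k` is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    index: int
    text: str
    keyword_boosts: list[str] = field(default_factory=list)
    pinned: bool = False


def keyword_score(chunk: Chunk, query: str) -> int:
    """Toy score: count query words found in the chunk text or its boosts."""
    query_terms = set(query.lower().split())
    searchable = set(chunk.text.lower().split())
    # Boost keywords are treated as if they appeared in the chunk text.
    for boost in chunk.keyword_boosts:
        searchable.update(boost.lower().split())
    return len(query_terms & searchable)


def retrieve(chunks: list[Chunk], query: str, top_k: int = 2) -> list[Chunk]:
    """Return pinned chunks first, then the top_k highest-scoring unpinned chunks."""
    pinned = [c for c in chunks if c.pinned]  # pinned chunks skip scoring
    ranked = sorted(
        (c for c in chunks if not c.pinned),
        key=lambda c: keyword_score(c, query),
        reverse=True,
    )
    return pinned + ranked[:top_k]


chunks = [
    Chunk(0, "Shipping takes 3 days"),
    Chunk(1, "Refunds are issued within 14 days", keyword_boosts=["reimbursement"]),
    Chunk(2, "Legal notice", pinned=True),
]
print([c.index for c in retrieve(chunks, "reimbursement policy", top_k=1)])
# → [2, 1]
```

Chunk 1 never contains the word “reimbursement”, yet it wins the ranking because of its boost, and chunk 2 is returned regardless of the query because it is pinned.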
When you click Save on a chunk-text edit, a confirmation dialog appears.

| Element | What it says |
| --- | --- |
| Title | Re-embed this chunk? |
| Body | Saving the edited text triggers a fresh embedding (about 1 to 2 seconds). Citations pointing at this chunk keep working. |
| Cancel button | Closes the dialog without saving. |
| Save and re-embed button | Saves and re-embeds. |

Keyword boost and pin do not trigger a re-embed. Only chunk-text edits do.
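
The rule above can be summarised as a small decision function. This is a sketch of the documented behaviour, not the product’s code; `ChunkState` and `needs_reembed` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkState:
    text: str
    keyword_boosts: tuple[str, ...] = ()
    pinned: bool = False


def needs_reembed(before: ChunkState, after: ChunkState) -> bool:
    """Only a change to the chunk text calls for a fresh embedding."""
    return before.text != after.text


original = ChunkState(text="Refunds are issued within 14 days")
pin_only = ChunkState(text=original.text, pinned=True)
boost_only = ChunkState(text=original.text, keyword_boosts=("reimbursement",))
text_edit = ChunkState(text="Refunds are issued within 30 days")

print(needs_reembed(original, pin_only))    # → False
print(needs_reembed(original, boost_only))  # → False
print(needs_reembed(original, text_edit))   # → True
```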

Bulk delete

In the document list (back on the KB detail page), select one or more rows to reveal the bulk toolbar. Click Delete to open a confirmation:

| Element | What it says |
| --- | --- |
| Title | Delete selected documents? |
| Body | This will remove every selected document, its chunks, and its vectors. This cannot be undone. |
| Cancel button | Closes the dialog. |
| Delete button | Styled with the destructive (red) background and white text. |

Common questions

Files larger than 10 MB are rejected before they leave your browser. Within the cap, a file can stay in Parsing for a minute or two on first upload. Wait, then refresh the page. If it stays stuck, the engine likely failed to read the file (encrypted PDF, exotic format).
OCR quality depends on scan quality. Re-export the PDF at a higher resolution, ideally 300 DPI or above, and re-upload. Avoid handwritten content; the engine is not trained on cursive.
Once you confirm Save and re-embed, the new text is live within 1 to 2 seconds. Older chats already in progress may still show citations to the old text until you ask a fresh question. Open a new chat to confirm.
Pinned chunks are always retrieved for this document. They bypass the scoring step and are always included in the document’s retrieval set. Use pinning sparingly; pinning everything defeats the relevance ranking.
The chat composer reads tag keys to populate the scope filter. After adding tags, refresh the chat page or start a new chat. The tags should appear in the filter dropdown.
Only XLSX is materialised into Postgres tables today. CSV, JSON, TXT, and other text formats are query-ready via the document text only (RAG). To get SQL tables for a CSV, save it as an XLSX file first.
Individual chunks cannot be deleted, because chunks are owned by the document. Delete the document instead, or pin or edit chunks to nudge the assistant.
You do not need to use the chunk inspector. It exists for power users tuning retrieval. Most teams ship without ever opening a chunk. Tags and the description card cover the common cases.
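
If you upload many files, you can screen them against the 10 MB cap mentioned above before opening the upload dialog. The sketch below assumes that 10 MB means 10 × 1024 × 1024 bytes; the page does not say whether the cap is binary or decimal, so files very close to the limit may behave differently. The function names are hypothetical.

```python
import os

# Assumption: the cap is binary megabytes. The docs only say "10 MB".
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def is_within_upload_cap(size_bytes: int) -> bool:
    """Files larger than the cap are rejected, so the cap itself is allowed."""
    return size_bytes <= MAX_UPLOAD_BYTES


def oversized_files(paths: list[str]) -> list[str]:
    """Return the paths whose size on disk exceeds the cap."""
    return [p for p in paths if not is_within_upload_cap(os.path.getsize(p))]
```

Calling `oversized_files` on a list of local paths returns only those that would be rejected, so you can split or compress them first.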

Where to go next

Knowledge Bases

Back to the KB list.

Chats

Ask questions grounded in this document.

SQL Sources

For live database queries alongside documents.

Project Settings

See total storage and retrieval features.