Model Studio
Plain-language guide to VAD, pause detection, and transcription behavior so you can tune Voiscribd without guesswork.
Settings -> Model Studio (Expert Mode)
Model Studio is where you control how Voiscribd "listens" and when it decides speech has started or ended.
If your transcript feels cut off, late, noisy, or inconsistent, the fix is usually here.
What VAD Actually Is
VAD means Voice Activity Detection.
In practice, it is the component that decides whether incoming audio sounds like a voice or not.
Good VAD settings give you:
- fewer accidental starts from keyboard clicks/fan noise
- cleaner stops when you are actually done speaking
- no clipped first words
Output and Language Behavior
Output Format
Output Format is a set of style instructions applied to the transcribed text.
Use it to control punctuation style, compactness, and formatting preferences.
Vocabulary Boost
Add names, acronyms, product terms, and domain jargon that matter in your workflow.
- supports comma-separated or newline-separated entries
- especially helpful for niche names and internal terminology
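As a rough illustration of how comma-separated and newline-separated entries can be treated the same way, here is a minimal Python sketch. The function name `parse_vocabulary` is hypothetical, and the trimming and case-insensitive de-duplication are assumptions for the example, not documented Voiscribd behavior.

```python
import re


def parse_vocabulary(raw: str) -> list[str]:
    """Split a comma- or newline-separated string into clean, unique terms."""
    terms = []
    seen = set()
    for piece in re.split(r"[,\n]", raw):
        term = piece.strip()      # drop surrounding spaces and stray "\r"
        key = term.casefold()     # assumption: "VAD" and "vad" count as one entry
        if term and key not in seen:
            seen.add(key)
            terms.append(term)
    return terms


print(parse_vocabulary("Voiscribd, VAD\nModel Studio, vad"))
# ['Voiscribd', 'VAD', 'Model Studio']
```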
Post-Processing
- Add space after paste: inserts a trailing space after text output
- Automatic text formatting: cleans punctuation/paragraph flow after transcription
Global Transcription Defaults
Default Transcription Mode
Used when no Chain overrides it:
- Batch
- Real Time
Live Transcript Preview
Shows or hides interim (partial) text while you are speaking.
VAD Controls Explained for Humans
Voice Activity Detection (VAD)
Master switch for speech/non-speech detection logic.
Skip likely non-speech audio
Filters out sounds that are probably not voice.
Auto-stop on silence
Stops recording when enough silence is detected.
Natural pause tolerance
Lets you pause briefly without ending the recording prematurely.
VAD Threshold
How confident the detector must be that audio is voice before it is treated as speech.
- higher threshold: stricter, less noise triggers
- lower threshold: more sensitive, easier voice pickup
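To make the trade-off concrete, here is a minimal Python sketch. It assumes the detector produces one voice score between 0 and 1 per short audio frame; that scale, the sample scores, and the name `classify_frames` are assumptions for illustration and may not match how Voiscribd represents the setting internally.

```python
def classify_frames(scores, threshold):
    """Mark each frame as voice (True) when its score reaches the threshold."""
    return [score >= threshold for score in scores]


# Hypothetical per-frame voice scores: fan noise, a soft onset, clear speech.
scores = [0.10, 0.35, 0.55, 0.80, 0.45]

print(classify_frames(scores, 0.5))  # [False, False, True, True, False]
print(classify_frames(scores, 0.3))  # [False, True, True, True, True]
```

With the stricter value only the clearest frames count as voice; with the lower value the soft onset is picked up as well, which is also how quieter background sounds can start to slip through.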
Min Speech
Minimum duration of voice required before it is accepted as real speech.
Min Silence
How long silence must last before Voiscribd decides you are done.
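The interaction between Min Speech and Min Silence can be sketched as a small state machine over per-frame voice decisions. This is a simplified model under stated assumptions: both settings are treated as milliseconds, frames have a fixed length, and `find_segments` is a hypothetical name. The actual Voiscribd logic may differ.

```python
def find_segments(is_voice, frame_ms, min_speech_ms, min_silence_ms):
    """Return (start_ms, end_ms) speech spans from per-frame voice flags."""
    segments = []
    start = None       # frame index where the current candidate span began
    silence_run = 0    # consecutive non-voice frames inside the span

    def close(end):
        # Keep the span only if it is long enough to count as speech.
        if (end - start) * frame_ms >= min_speech_ms:
            segments.append((start * frame_ms, end * frame_ms))

    for i, voiced in enumerate(is_voice):
        if voiced:
            if start is None:
                start = i
            silence_run = 0            # a short pause was bridged
        elif start is not None:
            silence_run += 1
            if silence_run * frame_ms >= min_silence_ms:
                close(i - silence_run + 1)   # span ends at the first silent frame
                start, silence_run = None, 0

    if start is not None:              # audio ended while a span was still open
        close(len(is_voice) - silence_run)
    return segments


frames = [0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0]   # 100 ms per frame
print(find_segments(frames, 100, min_speech_ms=200, min_silence_ms=300))
# [(100, 700)]
print(find_segments(frames, 100, min_speech_ms=200, min_silence_ms=100))
# [(100, 400), (500, 700)]
```

In the first call the one-frame pause is bridged and the lone 100 ms blip near the end is discarded as too short. In the second call the shorter silence requirement splits the same audio into two spans, which is the kind of premature stop a low Min Silence can cause.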
Speech Padding
Extra audio kept before and after each detected speech segment so boundaries are not cut too tightly.
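One way to picture padding is as widening each detected span on both sides and merging spans that end up touching. The sketch below assumes spans are sorted (start, end) pairs in milliseconds; `pad_segments` and the sample values are hypothetical and only illustrate the idea.

```python
def pad_segments(segments, pad_ms, total_ms):
    """Widen each (start_ms, end_ms) span by pad_ms and merge overlaps."""
    padded = []
    for start, end in segments:           # assumes spans are sorted by start
        start = max(0, start - pad_ms)    # never reach before the recording
        end = min(total_ms, end + pad_ms) # never reach past the recording
        if padded and start <= padded[-1][1]:
            # Padding closed the gap to the previous span: merge them.
            padded[-1] = (padded[-1][0], max(padded[-1][1], end))
        else:
            padded.append((start, end))
    return padded


print(pad_segments([(100, 400), (900, 1200)], pad_ms=100, total_ms=1250))
# [(0, 500), (800, 1250)]
print(pad_segments([(100, 400), (500, 700)], pad_ms=100, total_ms=1400))
# [(0, 800)]
```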
Always Listen Pre-roll
Keeps a short audio lead-in so first words are not clipped.
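A common way to implement this kind of lead-in is a fixed-size ring buffer that always holds the most recent audio, so it can be prepended once speech is detected. The sketch below uses Python's `collections.deque` for that purpose; the class name `PreRollBuffer`, the millisecond units, and the frame size are assumptions, and this is not a description of Voiscribd's internals.

```python
from collections import deque


class PreRollBuffer:
    """Keep only the most recent frames so they can be prepended to speech."""

    def __init__(self, preroll_ms, frame_ms):
        # A deque with maxlen silently drops the oldest frame when it is full.
        self.frames = deque(maxlen=preroll_ms // frame_ms)

    def push(self, frame):
        self.frames.append(frame)

    def drain(self):
        """Return the buffered lead-in and empty the buffer."""
        lead_in = list(self.frames)
        self.frames.clear()
        return lead_in


buffer = PreRollBuffer(preroll_ms=300, frame_ms=100)   # holds 3 frames
for frame in ["f0", "f1", "f2", "f3", "f4"]:
    buffer.push(frame)

# Speech detected now: the last 300 ms are still available.
print(buffer.drain())  # ['f2', 'f3', 'f4']
```

A longer pre-roll keeps more lead-in audio, which is why raising it helps when the first word is being chopped.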
Quick Tuning Recipes
Problem: first word gets chopped
Increase Always Listen Pre-roll, then slightly increase Speech Padding.
Problem: recording stops while I am thinking
Increase Min Silence and enable Natural pause tolerance.
Problem: random background sounds trigger recording
Increase VAD Threshold and enable Skip likely non-speech audio.
Problem: speech start feels late
Reduce VAD Threshold slightly and reduce Min Speech.
Safe Baseline
If settings are heavily tuned and behavior is unstable:
- reset VAD values to default
- enable Natural pause tolerance
- tune one parameter at a time while testing the same sentence pattern
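If you want to keep notes while tuning, the one-parameter-at-a-time approach can be sketched as a list of trials that each differ from the baseline in exactly one setting. The setting names and numbers below are placeholders, not Voiscribd's actual defaults, and `one_at_a_time` is a hypothetical helper.

```python
def one_at_a_time(baseline, candidates):
    """Yield (name, value, config) trials that change a single setting each."""
    for name, values in candidates.items():
        for value in values:
            if value != baseline[name]:
                yield name, value, {**baseline, name: value}


# Placeholder values for illustration only.
baseline = {"min_silence_ms": 800, "vad_threshold": 0.5}
candidates = {"min_silence_ms": [800, 1200], "vad_threshold": [0.6]}

for name, value, config in one_at_a_time(baseline, candidates):
    print(f"try {name}={value}: {config}")
```

Because every trial starts from the same baseline, any change in behavior on your test sentence can be attributed to the single setting that moved.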
Note
[Screencast Placeholder] Record a tuning session that starts with premature auto-stop, then fixes it by adjusting Min Silence and pause tolerance.