reels-app

Sebastjan Artič df6011c3cf Detect Scribe hallucinations + filter from SRT + auto-retry		2026-04-29 18:08:35 +00:00

Bug found in Žena ME TEPE third re-test:
- Scribe transcribed only verse 1 (0-33s) properly
- Then returned a single 98s segment [34.7-133.2] with just 1 word 'sam'
- This is a known Scribe hallucination on instrumental sections
- Result: SRT showed 'SAM SAM SAM SAM...' 14 times across the chorus
- Looked completely wrong because the chorus audio was correct but subtitles showed 'SAM' repeatedly

Three-part fix:

1. SRT GENERATOR: skip segments > 15s with < 5 words. These are hallucinations and have no real transcription value.

2. SCRIBE TRANSCRIBE: detect hallucinations in returned segments.
   - Mark segments > 15s with < 5 words as hallucinations
   - Compute true coverage % (excluding hallucinations)
   - Add _hallucination_count and _coverage_pct to result

3. TRANSCRIBE_FULL: auto-retry Scribe if quality is poor.
   - If hallucinations detected OR coverage < 50%, retry once
   - Keep retry result only if it has better stats
   - Otherwise fall back to first attempt (still better than nothing)

This makes the pipeline robust against Scribe's occasional bad transcripts on songs with long instrumental breaks. Most second attempts succeed where the first failed (random Scribe variance).
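
The three-part fix described in the commit message can be sketched roughly as below. This is a minimal illustration, not the code in analyze.py. The 15s / 5-word / 50% thresholds and the `_hallucination_count` / `_coverage_pct` keys come from the commit message. The `Segment` type, the function names, the coverage formula (non-hallucinated segment time divided by audio duration) and the meaning of "better stats" (fewer hallucinations, then higher coverage) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Thresholds taken from the commit message.
MAX_SECONDS = 15.0       # segments longer than this ...
MIN_WORDS = 5            # ... with fewer words than this are hallucinations
MIN_COVERAGE_PCT = 50.0  # retry below this coverage


@dataclass
class Segment:
    """Hypothetical segment shape; the real Scribe result may differ."""
    start: float
    end: float
    text: str


def is_hallucination(seg: Segment) -> bool:
    """Long segment with almost no words, e.g. 98s containing only 'sam'."""
    duration = seg.end - seg.start
    return duration > MAX_SECONDS and len(seg.text.split()) < MIN_WORDS


def filter_for_srt(segments: list[Segment]) -> list[Segment]:
    """Part 1: drop hallucinated segments before writing subtitles."""
    return [s for s in segments if not is_hallucination(s)]


def quality_stats(segments: list[Segment], audio_duration: float) -> dict:
    """Part 2: count hallucinations and compute coverage without them.

    Assumption: coverage = time covered by real segments / audio duration.
    """
    real = filter_for_srt(segments)
    covered = sum(s.end - s.start for s in real)
    coverage = 100.0 * covered / audio_duration if audio_duration > 0 else 0.0
    return {
        "_hallucination_count": len(segments) - len(real),
        "_coverage_pct": coverage,
    }


def needs_retry(stats: dict) -> bool:
    """Part 3: retry once if anything was hallucinated or coverage is low."""
    return (stats["_hallucination_count"] > 0
            or stats["_coverage_pct"] < MIN_COVERAGE_PCT)


def pick_better(first: dict, retry: dict) -> dict:
    """Keep the retry only if strictly better; otherwise keep the first.

    Assumption: 'better' means fewer hallucinations, then higher coverage.
    """
    def rank(stats: dict) -> tuple:
        return (-stats["_hallucination_count"], stats["_coverage_pct"])

    return retry if rank(retry) > rank(first) else first
```

With the segments from the bug report (a properly transcribed verse from 0 to 33s, then one 98s segment containing only 'sam'), `is_hallucination` flags the second segment, `filter_for_srt` drops it, and `needs_retry` returns True because a hallucination was found and coverage falls below 50%.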
acr_recognize.py	MXF/MPG broadcast format support: handle multichannel audio properly	2026-04-29 14:38:48 +00:00
analyze.py	Detect Scribe hallucinations + filter from SRT + auto-retry	2026-04-29 18:08:35 +00:00
clip.py	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
find_chorus.py	Find chorus: weight repetitive short phrases (like 'Ohne dich x5') as strong chorus signal	2026-04-28 16:57:45 +00:00
reframe.py	MXF/MPG broadcast format support: handle multichannel audio properly	2026-04-29 14:38:48 +00:00
subtitle.py	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
yt_download.py	Add cookies support to yt_download.py for YouTube bot detection bypass	2026-04-28 15:47:59 +00:00