reels-app

Sebastjan Artič 91caf957f2 Auto-route NZ folk-pop directly to Gemini (skip Scribe)		2026-04-29 18:58:12 +00:00

User feedback: Scribe consistently produces bad transcripts for Slovenian narodno-zabavna (NZ) folk-pop music:
- 'Saša Avsenik - Žena ME TEPE': hallucinated 'sam sam sam' x14
- 'FEHTARJI - Gorenjska Ljubljena': total hallucination ('finančni moduli')
- 'Ansambel UNIKAT - PA PA': mistranscribed 'mu' as 'vsem'
- 'Ansambel Saša Avsenika - CVETELE SO MALINE': wrong lyrics entirely

Common pattern: all are Slovenian folk-pop with diatonic accordion. Scribe training data has very little of this genre, so it consistently fails.

Solution: auto-detect NZ songs by filename keywords and route directly to Gemini 3 Pro (which handles them correctly), skipping Scribe entirely.

is_likely_folk_pop() detects:
- Slovenian: ansambel, avsenik, slak, fehtar, modrijan, atomik, gadi, vikend, stil, unikat, korenjaki, gorenjski, štajerski, polka, valček
- Croatian: klapa, thompson, mate bulić
- Serbian/Bosnian: lepa brena, ceca, halid bešlić

When detected:
1. Skip Scribe entirely (it would fail anyway)
2. Go directly to Gemini 3 Pro (~100s, $0.20)
3. If Gemini fails, fall back to Scribe (rare)

Cost analysis (10 reels/day, 30% NZ):
- Before: 10x Scribe = $0.13/day, ~30% need re-process
- Hybrid (fallback): 10x Scribe + 3x Gemini retry = $0.79/day
- NZ-routing (now): 7x Scribe + 3x Gemini = $0.69/day, FIRST-TRY success

Saves time AND money for NZ-heavy workloads.
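
The routing described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in analyze.py: the keyword list is copied from the commit message, but matching by case-insensitive substring is an assumption, and `scribe` and `gemini` are hypothetical callables standing in for the real transcription calls.

```python
from typing import Callable, Tuple

# Keywords as listed in the commit message.
FOLK_POP_KEYWORDS = (
    # Slovenian
    "ansambel", "avsenik", "slak", "fehtar", "modrijan", "atomik", "gadi",
    "vikend", "stil", "unikat", "korenjaki", "gorenjski", "štajerski",
    "polka", "valček",
    # Croatian
    "klapa", "thompson", "mate bulić",
    # Serbian/Bosnian
    "lepa brena", "ceca", "halid bešlić",
)


def is_likely_folk_pop(filename: str) -> bool:
    """Return True if any keyword occurs in the filename.

    Assumption: plain case-insensitive substring matching. Short keywords
    such as "stil" can also match unrelated titles.
    """
    name = filename.casefold()
    return any(keyword in name for keyword in FOLK_POP_KEYWORDS)


def transcribe(
    filename: str,
    scribe: Callable[[str], str],   # hypothetical Scribe wrapper
    gemini: Callable[[str], str],   # hypothetical Gemini wrapper
) -> Tuple[str, str]:
    """Return (engine_used, transcript) following the three steps above."""
    if is_likely_folk_pop(filename):
        try:
            # Steps 1 and 2: skip Scribe and go straight to Gemini.
            return "gemini", gemini(filename)
        except Exception:
            # Step 3: fall back to Scribe if Gemini fails.
            return "scribe", scribe(filename)
    # Everything else keeps the default Scribe path.
    return "scribe", scribe(filename)
```

A caller would pass in its own engine wrappers, for example `transcribe("Ansambel UNIKAT - PA PA.mp4", scribe=run_scribe, gemini=run_gemini)`, where `run_scribe` and `run_gemini` are placeholder names.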
acr_recognize.py	MXF/MPG broadcast format support: handle multichannel audio properly	2026-04-29 14:38:48 +00:00
analyze.py	Auto-route NZ folk-pop directly to Gemini (skip Scribe)	2026-04-29 18:58:12 +00:00
clip.py	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
find_chorus.py	Find chorus: weight repetitive short phrases (like 'Ohne dich x5') as strong chorus signal	2026-04-28 16:57:45 +00:00
reframe.py	MXF/MPG broadcast format support: handle multichannel audio properly	2026-04-29 14:38:48 +00:00
subtitle.py	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
yt_download.py	Add cookies support to yt_download.py for YouTube bot detection bypass	2026-04-28 15:47:59 +00:00