reels-app


Sebastjan Artič 157e6b781e Fix 'Žena' word still cut: word-level start extension instead of segment-level	2026-04-29 15:04:18 +00:00

The previous fix used segment boundaries: it required segments <3s for type 1 or <4s for type 2. But 'Žena' was in a 4.3s segment ('saj še doma mi več noč'jo verjet'. Žena me'), so the condition wasn't met and the clip start stayed at 77.7s, exactly at the end of the word 'Žena' (76.88-77.70s).

New approach: scan word-level timestamps directly:
1. If the clip start falls MID-WORD → extend back to that word's start - 0.15s.
2. If a word ends 0-0.5s BEFORE the clip start AND the next word is at the clip start → that word is suspect (it may be the first word of the chorus that Scribe put in the previous segment), so extend back to its start - 0.15s.

Word-level timestamps are always available from Scribe (timestamps_granularity=word). Falls back to segment-level for local Whisper without word timing.

This handles arbitrary segment lengths and is universal: it works for any language where the chorus starts on a word that the STT placed in the previous segment.
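The two word-level rules in the commit message above can be sketched roughly as follows. This is an illustration, not the actual code in analyze.py: the function name `extend_clip_start`, the `Word` dict shape, and the `at_start_tol` tolerance used to decide that the next word is "at" the clip start are all assumptions. Only the 0.15s padding and the 0-0.5s look-back window come from the commit message. The segment-level fallback for Whisper is not shown.

```python
from typing import Sequence, TypedDict


class Word(TypedDict):
    text: str
    start: float  # seconds
    end: float    # seconds


def extend_clip_start(
    clip_start: float,
    words: Sequence[Word],
    pad: float = 0.15,           # padding before the word start (from the commit message)
    lookback: float = 0.5,       # max gap between word end and clip start (from the commit message)
    at_start_tol: float = 0.05,  # ASSUMPTION: how close the next word must start to the clip start
) -> float:
    """Return a possibly earlier clip start. `words` must be sorted by start time."""
    # Rule 1: the clip start falls strictly inside a word.
    for w in words:
        if w["start"] < clip_start < w["end"]:
            return max(0.0, w["start"] - pad)

    # Rule 2: a word ends 0..lookback seconds before the clip start
    # and the following word begins at the clip start.
    for i in range(len(words) - 1):
        w, nxt = words[i], words[i + 1]
        gap = clip_start - w["end"]
        if 0.0 <= gap <= lookback and abs(nxt["start"] - clip_start) <= at_start_tol:
            return max(0.0, w["start"] - pad)

    # Neither rule applies: keep the original start.
    return clip_start
```

For the case described in the commit, a word 'Žena' at 76.88-77.70s with the clip start at 77.7s and the next word beginning at the clip start, rule 2 applies and the start moves back to roughly 76.73s. Checking rule 1 across all words before rule 2 is a design choice of this sketch, so that a mid-word cut always takes priority.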
acr_recognize.py	MXF/MPG broadcast format support: handle multichannel audio properly	2026-04-29 14:38:48 +00:00
analyze.py	Fix 'Žena' word still cut: word-level start extension instead of segment-level	2026-04-29 15:04:18 +00:00
clip.py	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
find_chorus.py	Find chorus: weight repetitive short phrases (like 'Ohne dich x5') as strong chorus signal	2026-04-28 16:57:45 +00:00
reframe.py	MXF/MPG broadcast format support: handle multichannel audio properly	2026-04-29 14:38:48 +00:00
subtitle.py	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
yt_download.py	Add cookies support to yt_download.py for YouTube bot detection bypass	2026-04-28 15:47:59 +00:00