reels-app/scripts
Sebastjan Artič 22bb3cfe02 Trust LLM: remove forced extension, content-driven prompt
User feedback: 'Here the LLM has to think and have a feel for what to put in.'
With the Soniox transcript now accurate, the LLM has all the information it needs to decide based on content.

TWO CHANGES:

1. smart_clip_range() — REMOVED forced extension logic:
   Before: if duration < min_duration (20s):
       - extend to next chorus (40% match) ← WRONG! merged with B-chorus
       - extend symmetrically into VERSE ← WRONG! pulled in the verse (kitica)
       - cap at max_duration
   After: trust LLM completely. Only safety: clamp to video bounds.

2. Prompt rewrite — content-driven instead of number-driven:
   Before: 'Total length: 12-25 seconds (typically)' + conflicting '~30s'
           'Second/third occurrence of the chorus: use the FIRST'
   After: '~30 seconds (BEST option = two consecutive choruses)'
          'Include natural intro shouts (Ajmo Janezi! Hey! Pa-pa!)'
          'BRAJDE example: 41.8-69.8s = 28s (two choruses with the Ajmo Janezi intro)'
          'Do NOT mix 2 DIFFERENT choruses (A + B = error)'
          'Do NOT extend into VERSES/STANZAS'

For BRAJDE this means:
- Old: Claude picked 57.1-69.8s (12.7s, 2nd chorus, no Ajmo)
       Code forced extension to 57.06-82.5s (mixed with B-chorus + verse)
- New: Claude picks 41.8-69.8s (28s, 2 choruses with 'Ajmo Janezi!' intro)
       Code returns exactly that — no forced extension.
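
The "After" behaviour of smart_clip_range() can be sketched roughly as below. This is a minimal illustration, not the code in analyze.py: the signature, the parameter names (llm_start, llm_end, video_duration) and the handling of a reversed range are assumptions, and the 200.0 s video length in the usage lines is made up. Only the clamp-to-bounds idea and the BRAJDE timestamps come from the message above.

```python
from typing import Tuple


def smart_clip_range(llm_start: float, llm_end: float,
                     video_duration: float) -> Tuple[float, float]:
    """Return the LLM-chosen range, clamped to [0, video_duration].

    Hypothetical signature: the real parameters are not shown in the
    commit message.
    """
    # Clamp both ends to the video bounds. There is deliberately no
    # min_duration check and no extension into neighbouring sections.
    start = min(max(llm_start, 0.0), video_duration)
    end = min(max(llm_end, 0.0), video_duration)
    if end < start:
        # Assumed handling of a reversed range: swap instead of raising.
        start, end = end, start
    return start, end


if __name__ == "__main__":
    # BRAJDE pick from the message; the video length is an invented value.
    start, end = smart_clip_range(41.8, 69.8, video_duration=200.0)
    print(start, end, round(end - start, 1))
```

With this shape, a short pick such as 57.1-69.8s would also be returned unchanged, so clip length is governed entirely by the prompt rather than by the code.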
2026-04-30 04:39:26 +00:00
..
acr_recognize.py | MXF/MPG broadcast format support: handle multichannel audio properly | 2026-04-29 14:38:48 +00:00
analyze.py | Trust LLM: remove forced extension, content-driven prompt | 2026-04-30 04:39:26 +00:00
clip.py | Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy | 2026-04-29 08:20:18 +00:00
find_chorus.py | Find chorus: weight repetitive short phrases (like 'Ohne dich x5') as strong chorus signal | 2026-04-28 16:57:45 +00:00
reframe.py | MXF/MPG broadcast format support: handle multichannel audio properly | 2026-04-29 14:38:48 +00:00
subtitle.py | Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy | 2026-04-29 08:20:18 +00:00
yt_download.py | Add cookies support to yt_download.py for YouTube bot detection bypass | 2026-04-28 15:47:59 +00:00