Strict 'chorus only' mode: respect include_prebuild in LLM prompt

Bug: the 'Vključi pre-chorus' ("Include pre-chorus") checkbox in the UI was
sent to the backend but ignored by the Claude/Gemini analysis prompt. Both
modes used the same lenient rules saying 'pre-chorus is optional', so
Claude often included the pre-chorus even when the user wanted just the
chorus.

Real-world failure: for Lady Gaga's 'Abracadabra' the analysis picked
54.7-84.6s, but the actual chorus 'Abracadabra, amor, ooh-na-na' starts at
85.2s. Claude included the entire pre-chorus block ('Hold me in your heart
tonight', 'Like a poem said by a lady in red', 'With a haunting dance') and
missed the actual chorus completely.
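
The miss can be made concrete with a simple overlap check. This is a minimal
sketch, assuming a hypothetical helper `clip_overlap` that is not part of this
commit; the chorus end time (105.0s) is an assumed value for illustration only.

```python
def clip_overlap(clip, chorus_start, chorus_end):
    """Return how many seconds of the clip fall inside the chorus span."""
    start, end = clip
    return max(0.0, min(end, chorus_end) - max(start, chorus_start))


# Clip range and chorus start are the values quoted in this commit message.
# chorus_end=105.0 is an assumption, not a measured value.
picked = (54.7, 84.6)
overlap = clip_overlap(picked, chorus_start=85.2, chorus_end=105.0)
print(overlap)  # 0.0: the picked clip ends before the chorus begins
```

A zero overlap means the clip contains none of the chorus, whatever the real
chorus end time is, because the clip ends at 84.6s and the chorus starts at 85.2s.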

Fix: the include_prebuild parameter now flows all the way to the prompt:
- main.py → analyze.py CLI args → analyze_with_llm() → prompt builder
- Two distinct prompt rule sets:

  CHORUS ONLY (default, include_prebuild=False):
  - Strict: 'clip starts on FIRST WORD of chorus, never before'
  - Length: 12-25s typically
  - Explicit examples for pop songs (Abracadabra, Despacito, Shape of You)
  - List of common mistakes to avoid

  CHORUS + PRE-CHORUS (include_prebuild=True):
  - Optional pre-chorus before chorus, 4-10s
  - Length: 18-35s

This fixes the most common failure mode, where Claude rationalizes
including verse/pre-chorus content even when the user explicitly wants
just the chorus.
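
The two rule sets can be pictured as one boolean choosing between two blocks
of prompt text. This is a simplified sketch, not the code in this commit: the
function names `select_rules` and `build_prompt_section` are hypothetical, and
the rule strings are shortened English paraphrases of the lists above.

```python
def select_rules(include_prebuild=False):
    """Return (mode_name, rule_lines) for the chosen clip mode."""
    if include_prebuild:
        return "CHORUS + PRE-CHORUS", [
            "Optional pre-chorus directly before the chorus, 4-10s",
            "Total length: 18-35s",
        ]
    # Default: strict mode, nothing before the chorus is allowed.
    return "CHORUS ONLY", [
        "Clip starts on the FIRST WORD of the chorus, never before",
        "Total length: 12-25s typically",
    ]


def build_prompt_section(include_prebuild=False):
    """Render the selected rules as a plain-text prompt fragment."""
    mode, rules = select_rules(include_prebuild)
    return "\n".join([f"Mode: {mode}"] + [f"- {rule}" for rule in rules])


print(build_prompt_section())                       # strict default
print(build_prompt_section(include_prebuild=True))  # lenient variant
```

Keeping the two variants mutually exclusive means the strict prompt never
mentions an optional pre-chorus, so the model has nothing to rationalize from.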
Sebastjan Artič 2026-04-29 14:03:40 +00:00
parent 671b512917
commit a30137f1f2


@@ -754,8 +754,12 @@ def detect_audio_fade(clip_range, transcript, video_duration=None):
     }
-def _build_analysis_prompt(transcript, video_duration, target_duration=30, filename_hint=None):
-    """Pripravi enotni prompt za Claude/Gemini analizo."""
+def _build_analysis_prompt(transcript, video_duration, target_duration=30, filename_hint=None, include_prebuild=False):
+    """Pripravi enotni prompt za Claude/Gemini analizo.
+    include_prebuild: če True, lahko vključi pre-chorus pred refrenom.
+    če False (default), MORA biti SAMO refren strogo.
+    """
     lines = []
     for seg in transcript["segments"]:
         start = seg["start"]
@@ -827,36 +831,63 @@ PROSIM:
 - Ohrani timestamp-e nespremenjene
 3. Prepoznaj REFREN: del besedila ki se PONAVLJA (ponavadi 2-4 vrstice, ki se v pesmi večkrat ponovijo). To je **univerzalno za vse jezike** refren je strukturni element pesmi, ne le slovenske/nemške/angleške.
-4. **IZBERI ODSEK REFREN JE GLAVNA STVAR:**
+{"" if include_prebuild else '''4. **🎯 KRITIČNO PRAVILO: SAMO REFREN, NIČ DRUGEGA**
-## 🎯 OBVEZNO: cel **PRVI** refren
-- **Začetek**: prva vrstica refrena (kjer ponavljanje prvič začne)
-- **Konec**: vključno z **vsem naravnim izpevom refrena**:
-- Outro fraze ki so del refrena (slo: "aj ja ja", "ej ej ej"; en: "yeah", "oh oh"; es: "ay ay ay"; ro: "hei hei"; ja: "la la la" univerzalno čez jezike)
-- Pevec drži zadnji ton 1-3s to je **del refrena**, ne reži ga
+Uporabnik je izbral način "**SAMO REFREN**". To pomeni:
+## ⚠️ ABSOLUTNO PRAVILO: clip se ZAČNE na PRVI BESEDI prvega refrena
+- **NE** vključuj kateri koli verz, pre-chorus, build-up, ali intro
+- **NE** začni "tik pred" refrenom
+- **Začetek = točno tam, kjer prva vrstica refrena prvič začne**
+## Identifikacija refrena (univerzalno čez jezike):
+- Najdi del, ki se v pesmi **ponavlja vsaj 2-krat** (običajno 3-4x)
+- Refren ima ponavadi **najvišjo melodičnost**, "catchy" del
+- Verzi pripovedujejo zgodbo (različno besedilo), refren je vedno enako besedilo
+- Pri pop pesmih: refren običajno začne z naslovom pesmi ali znano frazo
+- Lady Gaga "Abracadabra" refren = "Abracadabra, amor, ooh-na-na..."
+- "Despacito" refren = "Despacito, quiero respirar..."
+- "Shape of You" refren = "I'm in love with the shape of you..."
+- Pri narodno-zabavnih (SL/HR/SR): refren je tisti del, ki se ponovi po vsakem verzu
+- Pri Schlager (DE): refren je melodični "hook" del
+## Konec: vključno s celotnim naravnim izpevom
+- **Vse outro fraze** ki so del refrena (slo: "aj ja ja", "ej ej ej"; en: "yeah", "oh oh"; es: "ay ay ay"; ro: "hei hei"; ja: "la la la")
+- Pevec drži zadnji ton 1-3s to je **del refrena**, ne reži ga
+- Refren naj se **naravno izteče**
+## Skupna dolžina: 12-25 sekund (običajno)
+- Če refren traja 18s izberi 18s
+- Če refren traja 25s izberi 25s
+- **NIKOLI ne dodajaj sekund pred refrenom** za "obogatitev"
-## 💡 PRE-CHORUS samo če JE pred refrenom in pasta naravno
-- **PRE-CHORUS = zadnja 1-2 vrstici verza tik pred refrenom** (slišne, povezane z refrenom)
+## 🚫 NAJPOGOSTEJŠE NAPAKE (NE DELAJ TEH):
+- Vključitev pre-chorusa "ker je vsebinsko povezan" NE, samo refren!
+- Začetek 5s pred refrenom za "kontext" NE, točno na refrenu!
+- Kombinacija pre-chorus + refren NE, zgolj refren!
+- Drugi/tretji nastop refrena uporabi PRVI
+- Sekanje sredi besede / izpeta tona
+'''}{'''4. **IZBERI ODSEK REFREN + PRE-CHORUS:**
+Uporabnik je izbral način "**REFREN + PRE-CHORUS**".
+## OBVEZNO: cel **PRVI** refren (kot opisano spodaj)
+## OPCIJSKO: pre-chorus PRED refrenom
+- **Pre-chorus = zadnja 1-2 vrstici verza tik pred refrenom** (slišne, povezane z refrenom)
 - **Dodaj samo če**:
 - Je tik pred refrenom (brez pavze ali instrumental vmes)
-- Vsebinsko vodi v refren (gradnja občutka, "stopnjuje" se)
+- Vsebinsko vodi v refren (gradnja občutka)
 - Je kratek: 4-10 sekund
 - **Ne dodajaj** pre-refrena če:
 - Refren se začne neposredno za prejšnjim refrenom
 - Verz je predaleč (>2s pavze) ali je instrumental vmes
-- Bi presegel skupno dolžino 35s
-- **V dvomu**: bolje SAMO REFREN kot da bi dodal čuden pre-chorus
-## 📏 Skupna dolžina: 12-35 sekund
-- **Sam refren** (najpogostejša izbira): 12-25s
-- **Refren + pre-chorus**: 18-32s
-## 🚫 NIKOLI:
-- **Drugi/tretji refren** vedno PRVI nastop
-- **Instrumentalni medbridge** brez petja
-- **Skok med verzi** (clip mora biti ena neprekinjena celota)
-- **Sekanje sredi besede ali izpeta tona**
+- **Ne dodajaj** če bi presegel skupno dolžino 35s
+## REFREN — kot pri "samo refren":
+- Začetek refrena = prva vrstica refrena
+- Konec refrena = vključno z vsemi outro frazami in zadnjim držečim tonom
+- Naravni izpev (ej-ej-ej, oh oh, la la la, etc.)
+## Skupna dolžina: 18-35 sekund
+''' if include_prebuild else ""}
 5. Če transkript je v večini halucinacija (manj kot 30% smiselnih besed), v "reason" napiši "STT_HALLUCINATION_DETECTED"
@@ -910,7 +941,7 @@ def _parse_llm_response(text, video_duration):
     }
-def analyze_with_claude(transcript, video_duration, target_duration=30, model="claude-sonnet-4-6", filename_hint=None):
+def analyze_with_claude(transcript, video_duration, target_duration=30, model="claude-sonnet-4-6", filename_hint=None, include_prebuild=False):
     """Pošlje transkript Claude API-ju (Anthropic).
     model: claude-sonnet-4-6 (default), claude-haiku-4-5-20251001, claude-opus-4-7
@@ -924,7 +955,7 @@ def analyze_with_claude(transcript, video_duration, target_duration=30, model="c
     if not transcript.get("segments"):
         return None
-    prompt = _build_analysis_prompt(transcript, video_duration, target_duration, filename_hint=filename_hint)
+    prompt = _build_analysis_prompt(transcript, video_duration, target_duration, filename_hint=filename_hint, include_prebuild=include_prebuild)
     try:
         import urllib.request
@@ -1040,7 +1071,7 @@ def analyze_with_claude(transcript, video_duration, target_duration=30, model="c
     return None
-def analyze_with_gemini(transcript, video_duration, target_duration=30, model="gemini-3.1-pro-preview", filename_hint=None):
+def analyze_with_gemini(transcript, video_duration, target_duration=30, model="gemini-3.1-pro-preview", filename_hint=None, include_prebuild=False):
     """Pošlje transkript Gemini API-ju (Google).
     Gemini 3.1 Pro ima najboljši multilingual rezultat (MMMLU 92.6%) odličen za SLO/HR/BS.
@@ -1053,7 +1084,7 @@ def analyze_with_gemini(transcript, video_duration, target_duration=30, model="g
     if not transcript.get("segments"):
         return None
-    prompt = _build_analysis_prompt(transcript, video_duration, target_duration, filename_hint=filename_hint)
+    prompt = _build_analysis_prompt(transcript, video_duration, target_duration, filename_hint=filename_hint, include_prebuild=include_prebuild)
     try:
         import urllib.request
@@ -1145,23 +1176,23 @@ def analyze_with_gemini(transcript, video_duration, target_duration=30, model="g
     return None
-def analyze_with_llm(transcript, video_duration, target_duration=30, provider="claude", llm_model=None, filename_hint=None):
+def analyze_with_llm(transcript, video_duration, target_duration=30, provider="claude", llm_model=None, filename_hint=None, include_prebuild=False):
     """Glavna funkcija — uporabi izbrano LLM (claude/gemini/auto)."""
     if provider == "gemini":
         model = llm_model or "gemini-3.1-pro-preview"
-        return analyze_with_gemini(transcript, video_duration, target_duration, model, filename_hint=filename_hint)
+        return analyze_with_gemini(transcript, video_duration, target_duration, model, filename_hint=filename_hint, include_prebuild=include_prebuild)
     elif provider == "claude":
         model = llm_model or "claude-sonnet-4-6"
-        return analyze_with_claude(transcript, video_duration, target_duration, model, filename_hint=filename_hint)
+        return analyze_with_claude(transcript, video_duration, target_duration, model, filename_hint=filename_hint, include_prebuild=include_prebuild)
     elif provider == "auto":
         # Najprej probaj Claude, fallback na Gemini
         result = analyze_with_claude(transcript, video_duration, target_duration,
-            llm_model or "claude-sonnet-4-6", filename_hint=filename_hint)
+            llm_model or "claude-sonnet-4-6", filename_hint=filename_hint, include_prebuild=include_prebuild)
         if result:
             return result
         print(" 🔄 Claude ni uspel, probam Gemini...", file=sys.stderr)
         return analyze_with_gemini(transcript, video_duration, target_duration,
-            llm_model or "gemini-3.1-pro-preview", filename_hint=filename_hint)
+            llm_model or "gemini-3.1-pro-preview", filename_hint=filename_hint, include_prebuild=include_prebuild)
     else:
         print(f" ⚠️ Neznan LLM provider: {provider}", file=sys.stderr)
         return None
@@ -1253,6 +1284,7 @@ def main():
         transcript, duration, target_duration=args.target_duration,
         provider=provider, llm_model=args.llm_model,
         filename_hint=fname_hint,
+        include_prebuild=args.include_prebuild,
     )
     # 5b. Find chorus lokalno (kot fallback ali za score-jev preview)