Diagnoza:
- analyze.py je zgodovinsko imel samo Claude support
- ko se je dodal Gemini, je clip_range.source ostal hardcoded 'claude'
- prav tako log 'Whisper segmenti zamenjani s Claude' in 'Generated SRT from Claude'
- API rezultat je v jobu kazal source='claude' tudi ko je dejansko bil uporabljen Gemini
- to je samo COSMETIC bug — funkcionalno je vse delovalo pravilno
- Gemini se DEJANSKO klical (potrjeno: '🤖 Gemini (gemini-3.1-pro-preview) izbral: 172.5-201.8s')
in vrnil pravilen rezultat — samo logging je rekel napačno
Popravki:
1. clip_range['source'] = claude_result['source'] (dejansko 'gemini:...' ali 'claude:...')
2. clip_range['reason'] prefix iz hardcoded 'claude_llm:' v dinamičen '{source}:'
3. Log 'Whisper segmenti zamenjani s Claude' → 'z {llm_label}'
4. Log 'Claude je popravil jezik' → 'LLM je popravil'
5. main.py 'Generated SRT from Claude' → 'from {llm_src}'
Test (Zlati Muzikanti - Le prijatelja bodiva, valček, 246s):
✓ Gemini dejansko izbere refren (172.5-201.8s)
✓ Whisper detektira sl (p=0.97 across 3 samples)
✓ Vseh 18 segmentov popravljenih
✓ Pipeline end-to-end deluje
Backward compat:
- transcript['claude_corrected'] in srt_from_claude variable name ohranjena
ker že obstajajo v starih job state fajlih
When Coolify redeploys, the container is killed mid-job.
Now on FastAPI startup:
- Detect status=processing jobs from JOBS_DIR
- If input file exists and resume_attempts < 3, restart pipeline (status=queued)
- After 3 failed attempts, mark as error
- If input is missing, mark error immediately
- Track resume_attempts and last_resume_at for diagnostics
Run actual process_job in asyncio executor (sync function in thread)
so startup completes quickly and resume happens in background.
Resolves: 'Veseli Dolenci stuck' issue
- @app.on_event(startup) marks all status=processing jobs as error after restart
- Process endpoint now clears chorus_error/interrupted_at on retry (retry-friendly)
- GEMINI_API_KEY added to Coolify env (Gemini 3.1 Pro now active)
- User can now choose Gemini in UI dropdown for analysis
- Refactored analyze_with_claude into shared _build_analysis_prompt + _parse_llm_response helpers
- New analyze_with_gemini() using Gemini 3.1 Pro ($2/M in, MMMLU 92.6% — best multilingual)
- Unified analyze_with_llm(provider) dispatcher with auto-fallback (Claude → Gemini)
- API endpoint accepts llm_provider in StartJobIn (claude/gemini/auto)
- Frontend dropdown to pick LLM
- Default model is now Sonnet 4.6 (was Haiku 4.5) — 3x quality at 3x price (~3 cents/video)
- Gemini support is opt-in: needs GEMINI_API_KEY env var to activate
- Claude now corrects transcription errors (Slavic languages, dialects, mixed langs)
- Returns corrected_segments with same timestamps but cleaner text
- Pipeline generates SRT from Claude-corrected transcript and passes to subtitle.py via --srt
- subtitle.py supports --srt to skip Whisper re-transcription on the trimmed clip
- clip.py propagates --srt through to subtitle.py
- Whisper still runs once (in analyze.py); subtitle.py reuses corrected output instead of re-running
- This means: Whisper's mistakes (mixed langs, hallucinations, wrong words) are fixed by Claude before becoming visible subtitles
- 3-sample voting for auto-detect (start/middle/end of song) prevents lang switching mid-song
- Lock detected language for full transcription
- Anti-hallucination: condition_on_previous_text=False, temperature=0.0
- compression_ratio_threshold=2.4 (rejects repetitive hallucinations)
- log_prob_threshold=-1.0 (rejects low-confidence segments)
- no_speech_threshold=0.6 (more aggressive silence detection)
- Default Whisper model changed: small → medium (better for all langs incl. Slavic)