reels-app

Author	SHA1	Message	Date
OpenClaw Agent	e350352883	Fix: Gemini 3.1 Pro thinking model needs 32k maxOutputTokens (was 4096 → MAX_TOKENS truncation) Diagnosis: - Gemini 3.x Pro is a thinking model (it has internal reasoning, thoughtsTokenCount) - With large transcripts (60+ song segments): * thoughts ~ 1500-3000 tokens * output JSON with corrected_segments ~ 3000-7000 tokens * total ~ 4500-10000 tokens - With maxOutputTokens=4096 the response was cut off (finishReason: MAX_TOKENS), the JSON was truncated midway, and _parse_llm_response raised json.JSONDecodeError - Result: 'Gemini vrnil prazen string' ("Gemini returned an empty string") in the logs Fixes: 1. Gemini maxOutputTokens 4096 → 32768 (enough for thinking + long JSON) 2. Diagnostics for finishReason==MAX_TOKENS and usage tokens in the logs 3. Detection of empty text (not just an empty parts array) 4. Claude max_tokens 4096 → 8192 (headroom for long songs) 5. Claude detection of stop_reason==max_tokens Test (60 segments, 5631 char prompt): - 4096 → finishReason=MAX_TOKENS, thoughts=2594, output=1488, JSON truncated ❌ - 16384 → finishReason=STOP, thoughts=1445, output=3040, JSON complete ✅ - 32768 → safe default ✅	2026-04-29 09:03:53 +00:00
Sebastjan Artič	ec71c54570	Upgrade to Sonnet 4.6 + add Gemini 3.1 Pro support - Refactored analyze_with_claude into shared _build_analysis_prompt + _parse_llm_response helpers - New analyze_with_gemini() using Gemini 3.1 Pro ($2/M in, MMMLU 92.6% — best multilingual) - Unified analyze_with_llm(provider) dispatcher with auto-fallback (Claude → Gemini) - API endpoint accepts llm_provider in StartJobIn (claude/gemini/auto) - Frontend dropdown to pick LLM - Default model is now Sonnet 4.6 (was Haiku 4.5) — 3x quality at 3x price (~3 cents/video) - Gemini support is opt-in: needs GEMINI_API_KEY env var to activate	2026-04-29 08:26:27 +00:00
Sebastjan Artič	9faa224885	Upgrade Claude model: Haiku 4.5 → Sonnet 4.6 for better Slavic language transcript correction	2026-04-29 08:22:10 +00:00
Sebastjan Artič	69fb2f5ce8	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
Sebastjan Artič	4bc5ac6756	Major: Claude post-processing of Whisper transcript - Claude now corrects transcription errors (Slavic languages, dialects, mixed langs) - Returns corrected_segments with same timestamps but cleaner text - Pipeline generates SRT from Claude-corrected transcript and passes to subtitle.py via --srt - subtitle.py supports --srt to skip Whisper re-transcription on the trimmed clip - clip.py propagates --srt through to subtitle.py - Whisper still runs once (in analyze.py); subtitle.py reuses corrected output instead of re-running - This means: Whisper's mistakes (mixed langs, hallucinations, wrong words) are fixed by Claude before becoming visible subtitles	2026-04-29 08:13:33 +00:00
Sebastjan Artič	af3c933c78	Robust language detection + anti-hallucination - 3-sample voting for auto-detect (start/middle/end of song) prevents lang switching mid-song - Lock detected language for full transcription - Anti-hallucination: condition_on_previous_text=False, temperature=0.0 - compression_ratio_threshold=2.4 (rejects repetitive hallucinations) - log_prob_threshold=-1.0 (rejects low-confidence segments) - no_speech_threshold=0.6 (more aggressive silence detection) - Default Whisper model changed: small → medium (better for all langs incl. Slavic)	2026-04-29 07:59:20 +00:00
Sebastjan Artič	c870d80726	Fix: extend clip if ends mid-vocal (no chorus cut-off), DejaVu Sans font (supports SLO/HR/BS chars), auto-upgrade to medium Whisper model for Slavic languages	2026-04-29 07:35:00 +00:00
Sebastjan Artič	5d5e169f9d	Disable Whisper VAD filter — was dropping vocal segments in songs creating gaps in subtitles	2026-04-29 07:07:29 +00:00
Sebastjan Artič	a04811bdc9	Add Claude LLM analysis: sends full transcript to Claude API for true song structure understanding (refrain detection across all repetitions, not just local heuristic)	2026-04-29 06:55:41 +00:00
Sebastjan Artič	e072eec362	Fix: handle Whisper transcribe failure for instrumental-only audio (fallback to empty transcript)	2026-04-29 06:33:52 +00:00
Sebastjan Artič	33a138af9e	Fix: force native Python bool/float for JSON serialization (numpy types)	2026-04-29 06:23:41 +00:00
Sebastjan Artič	8512076b91	Major: smart selection pipeline (analyze.py) + audio fade + multi-lang auto-detect - New analyze.py: full transcript + energy + structural analysis - Smart clip range: includes pre-chorus, can exceed 30s up to max_duration (default 45s) - Audio fade in/out: auto-detected from vocal boundaries - Instrumental detection: auto-disables subs if vocals < 10% of duration - Multi-language: auto-detect via Whisper or explicit (DE/SL/HR/BS/SR/EN/IT/ES/FR) - Frontend: cleaner UX, added bs language, smart selection description - reframe.py: --fade-in --fade-out args - clip.py: propagates fade params - app/main.py: replaces find_chorus.py call with analyze.py	2026-04-29 06:21:35 +00:00
Sebastjan Artič	81edd24ca3	Subtitles: smaller font 56px (was 84), higher position MarginV=400, side margins 80px for safe zone	2026-04-29 06:09:26 +00:00
Sebastjan Artič	ba787744a6	Subtitles: cap chunk duration at 2.5s, split long lines into multiple time slices for faster reels pacing	2026-04-29 05:59:36 +00:00
Sebastjan Artič	e001387a89	Subtitles: convert SRT to ASS directly with PlayResY=1920 for predictable scaling instead of unreliable force_style	2026-04-28 18:09:53 +00:00
Sebastjan Artič	28d933c916	Subtitles: UPPERCASE + position lower (MarginV=320 for 1080x1920) + bigger font	2026-04-28 17:40:48 +00:00
Sebastjan Artič	15ef4888a1	Debug: log exact clip.py cmd in job + clip.py logs run_clip args	2026-04-28 17:28:10 +00:00
Sebastjan Artič	bc3fe1f9d4	Add explicit FFmpeg trim command logging + duration verification	2026-04-28 17:17:11 +00:00
Sebastjan Artič	8eaef029e2	Find chorus: weight repetitive short phrases (like 'Ohne dich x5') as strong chorus signal	2026-04-28 16:57:45 +00:00
Sebastjan Artič	c17578521a	Fix find_chorus: RMS energy parser was broken (no pts_time available), now synthesizes timestamps; energy weight x10 (the chorus is louder)	2026-04-28 16:55:51 +00:00
Sebastjan Artič	64e8854cea	Track mode: more sensitive face detection + longer smoothing window	2026-04-28 16:45:13 +00:00
Sebastjan Artič	400f6dbb6d	Fix: limit FFmpeg crop expression to 20 sample points (was overflowing 4KB limit)	2026-04-28 16:32:26 +00:00
Sebastjan Artič	2e337ff079	Fix: shutil import was inside finally block, causing NameError when shutil.move was called	2026-04-28 16:22:39 +00:00
Sebastjan Artič	6e2a13d8a3	Fix cross-device link error: use shutil.move instead of os.replace	2026-04-28 16:15:20 +00:00
Sebastjan Artič	47509b4f06	Add cookies support to yt_download.py for YouTube bot detection bypass	2026-04-28 15:47:59 +00:00
Sebastjan Artič	30b969e4b8	Initial: reels clipper app - FastAPI backend (auth, jobs, SSE, download) - Frontend: drag&drop + YouTube URL + jobs panel - Pipeline: yt_download → find_chorus → reframe → subtitle - Modes: track (face follow), center, blur - Whisper for SI/DE/EN subtitles - Auto-chorus detection via Whisper + RMS energy - Docker + Coolify ready	2026-04-28 15:28:22 +00:00

26 Commits
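
The truncation diagnosis in commit e350352883 can be illustrated with a small, hedged sketch. It is not the repository's code: the function name summarize_gemini_response is hypothetical, and it assumes the response is the JSON body of a Gemini generateContent call already parsed into a dict. It also assumes the "output" figure in the commit corresponds to candidatesTokenCount. The sketch covers the two checks the commit names: finishReason == MAX_TOKENS, and empty text even when the parts array itself is present.

```python
from typing import Any


def summarize_gemini_response(response: dict[str, Any]) -> dict[str, Any]:
    """Pull text, finish reason and token usage out of a parsed response dict.

    Hypothetical helper for illustration; field names follow the Gemini
    generateContent JSON response (candidates, finishReason, usageMetadata).
    """
    candidates = response.get("candidates") or []
    candidate = candidates[0] if candidates else {}
    parts = (candidate.get("content") or {}).get("parts") or []
    # Check the joined text, not just whether the parts array is non-empty:
    # a part can exist and still carry no usable text.
    text = "".join(part.get("text", "") for part in parts).strip()
    usage = response.get("usageMetadata") or {}
    finish_reason = candidate.get("finishReason")
    return {
        "text": text,
        "finish_reason": finish_reason,
        "thoughts_tokens": usage.get("thoughtsTokenCount", 0),
        # Assumption: the commit's "output" count maps to candidatesTokenCount.
        "output_tokens": usage.get("candidatesTokenCount", 0),
        "truncated": finish_reason == "MAX_TOKENS",
        "empty": not text,
    }


# Example shaped like the 4096-token test case from the commit message.
truncated_response = {
    "candidates": [
        {
            "content": {"parts": [{"text": '{"corrected_segments": [{"te'}]},
            "finishReason": "MAX_TOKENS",
        }
    ],
    "usageMetadata": {"thoughtsTokenCount": 2594, "candidatesTokenCount": 1488},
}
summary = summarize_gemini_response(truncated_response)
if summary["truncated"]:
    print(
        "Response hit MAX_TOKENS:",
        f"thoughts={summary['thoughts_tokens']}",
        f"output={summary['output_tokens']}",
    )
```

Checking the truncated flag before handing the text to a JSON parser would let the caller log the token usage instead of surfacing only a json.JSONDecodeError.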