reels-app

Author	SHA1	Message	Date
Sebastjan Artič	69fb2f5ce8	Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy	2026-04-29 08:20:18 +00:00
Sebastjan Artič	4bc5ac6756	Major: Claude post-processing of Whisper transcript - Claude now corrects transcription errors (Slavic languages, dialects, mixed langs) - Returns corrected_segments with same timestamps but cleaner text - Pipeline generates SRT from Claude-corrected transcript and passes to subtitle.py via --srt - subtitle.py supports --srt to skip Whisper re-transcription on the trimmed clip - clip.py propagates --srt through to subtitle.py - Whisper still runs once (in analyze.py); subtitle.py reuses corrected output instead of re-running - This means: Whisper's mistakes (mixed langs, hallucinations, wrong words) are fixed by Claude before becoming visible subtitles	2026-04-29 08:13:33 +00:00
Sebastjan Artič	af3c933c78	Robust language detection + anti-hallucination - 3-sample voting for auto-detect (start/middle/end of song) prevents lang switching mid-song - Lock detected language for full transcription - Anti-hallucination: condition_on_previous_text=False, temperature=0.0 - compression_ratio_threshold=2.4 (rejects repetitive hallucinations) - log_prob_threshold=-1.0 (rejects low-confidence segments) - no_speech_threshold=0.6 (more aggressive silence detection) - Default Whisper model changed: small → medium (better for all langs incl. Slavic)	2026-04-29 07:59:20 +00:00
Sebastjan Artič	c870d80726	Fix: extend clip if ends mid-vocal (no chorus cut-off), DejaVu Sans font (supports SLO/HR/BS chars), auto-upgrade to medium Whisper model for Slavic languages	2026-04-29 07:35:00 +00:00
Sebastjan Artič	5d5e169f9d	Disable Whisper VAD filter — was dropping vocal segments in songs creating gaps in subtitles	2026-04-29 07:07:29 +00:00
Sebastjan Artič	a04811bdc9	Add Claude LLM analysis: sends full transcript to Claude API for true song structure understanding (refrain detection across all repetitions, not just local heuristic)	2026-04-29 06:55:41 +00:00
Sebastjan Artič	e072eec362	Fix: handle Whisper transcribe failure for instrumental-only audio (fallback to empty transcript)	2026-04-29 06:33:52 +00:00
Sebastjan Artič	33a138af9e	Fix: force native Python bool/float for JSON serialization (numpy types)	2026-04-29 06:23:41 +00:00
Sebastjan Artič	8512076b91	Major: smart selection pipeline (analyze.py) + audio fade + multi-lang auto-detect - New analyze.py: full transcript + energy + structural analysis - Smart clip range: includes pre-chorus, can exceed 30s up to max_duration (default 45s) - Audio fade in/out: auto-detected from vocal boundaries - Instrumental detection: auto-disables subs if vocals < 10% of duration - Multi-language: auto-detect via Whisper or explicit (DE/SL/HR/BS/SR/EN/IT/ES/FR) - Frontend: cleaner UX, added bs language, smart selection description - reframe.py: --fade-in --fade-out args - clip.py: propagates fade params - app/main.py: replaces find_chorus.py call with analyze.py	2026-04-29 06:21:35 +00:00

9 Commits