af3c933c78
Robust language detection + anti-hallucination
- 3-sample voting for auto-detect (start/middle/end of song) prevents lang switching mid-song
- Lock detected language for full transcription
- Anti-hallucination: condition_on_previous_text=False, temperature=0.0
- compression_ratio_threshold=2.4 (rejects repetitive hallucinations)
- log_prob_threshold=-1.0 (rejects low-confidence segments)
- no_speech_threshold=0.6 (more aggressive silence detection)
- Default Whisper model changed: small → medium (better for all langs incl. Slavic)
2026-04-29 07:59:20 +00:00
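A minimal sketch of the voting step and decoding settings described in the commit above. The helper name `vote_language`, the `(language, probability)` shape of the per-sample results, and the tie-break by summed probability are assumptions for illustration. The option names and values are the ones listed in the commit message.

```python
from collections import defaultdict

# Decoding options exactly as listed in the commit message. Passing them as
# keyword arguments to the transcribe call is an assumption about the wiring.
ANTI_HALLUCINATION_OPTS = {
    "condition_on_previous_text": False,
    "temperature": 0.0,
    "compression_ratio_threshold": 2.4,
    "log_prob_threshold": -1.0,
    "no_speech_threshold": 0.6,
}


def vote_language(samples):
    """Pick one language from (language, probability) pairs.

    Each pair is assumed to come from detecting the language on one audio
    sample (start, middle, end of the song). The language with the most
    votes wins; ties are broken by summed probability (hypothetical rule).
    """
    if not samples:
        raise ValueError("at least one detection sample is required")
    votes = defaultdict(int)
    confidence = defaultdict(float)
    for lang, prob in samples:
        votes[lang] += 1
        confidence[lang] += prob
    return max(votes, key=lambda lang: (votes[lang], confidence[lang]))


# Example: two of three samples agree, so "sl" is locked for the whole song.
samples = [("sl", 0.81), ("hr", 0.55), ("sl", 0.74)]
locked = vote_language(samples)
transcribe_kwargs = {"language": locked, **ANTI_HALLUCINATION_OPTS}
```

Locking the voted language into the transcribe arguments is what keeps a single noisy passage from flipping the language partway through the song.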
c870d80726
Fix: extend clip if it ends mid-vocal (no chorus cut-off); use DejaVu Sans font (supports SL/HR/BS characters); auto-upgrade to medium Whisper model for Slavic languages
2026-04-29 07:35:00 +00:00
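The clip-extension fix above could look roughly like this. The function name `extend_to_vocal_end`, the `(start, end)` segment tuples in seconds, and the `max_end` cap are assumptions, not the actual code in the repository.

```python
def extend_to_vocal_end(clip_end, segments, max_end):
    """Return a clip end that does not fall inside a vocal segment.

    segments: iterable of (start, end) times in seconds for sung lines.
    If clip_end lands strictly inside a segment, move it to that segment's
    end, but never past max_end (hypothetical hard limit on clip length).
    """
    for start, end in segments:
        if start < clip_end < end:
            return min(end, max_end)
    return clip_end


# A 30 s cut would slice the line sung from 28.5 s to 33.2 s, so the clip
# end moves to 33.2 s.
new_end = extend_to_vocal_end(30.0, [(10.0, 20.0), (28.5, 33.2)], max_end=45.0)
```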
5d5e169f9d
Disable Whisper VAD filter: it was dropping vocal segments in songs, creating gaps in subtitles
2026-04-29 07:07:29 +00:00
a04811bdc9
Add Claude LLM analysis: sends full transcript to Claude API for true song structure understanding (refrain detection across all repetitions, not just local heuristic)
2026-04-29 06:55:41 +00:00
e072eec362
Fix: handle Whisper transcribe failure for instrumental-only audio (fallback to empty transcript)
2026-04-29 06:33:52 +00:00
33a138af9e
Fix: force native Python bool/float for JSON serialization (numpy types)
2026-04-29 06:23:41 +00:00
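One way to get the same effect as the serialization fix above, not necessarily how the commit does it, is a `default` hook for `json.dumps`. The hook name `to_native` is hypothetical. It relies on NumPy scalars and arrays exposing `tolist()` and `item()`, and uses duck typing so the sketch itself does not need NumPy installed.

```python
import json


def to_native(value):
    """`default` hook for json.dumps: unwrap NumPy-style values.

    NumPy scalars and arrays expose tolist(), which returns plain Python
    bool/int/float values or nested lists. item() is tried as a fallback
    for scalar-like objects that lack tolist().
    """
    if hasattr(value, "tolist"):
        return value.tolist()
    if hasattr(value, "item"):
        return value.item()
    raise TypeError(f"not JSON serializable: {type(value).__name__}")


# Usage (assuming `result` may contain values such as numpy.bool_ or
# numpy.float32):
#     json.dumps(result, default=to_native)
```

Explicit casts such as `bool(x)` and `float(x)` at the point where the result is built work equally well; the hook only saves repeating them.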
8512076b91
Major: smart selection pipeline (analyze.py) + audio fade + multi-lang auto-detect
- New analyze.py: full transcript + energy + structural analysis
- Smart clip range: includes pre-chorus, can exceed 30s up to max_duration (default 45s)
- Audio fade in/out: auto-detected from vocal boundaries
- Instrumental detection: auto-disables subs if vocals < 10% of duration
- Multi-language: auto-detect via Whisper or explicit (DE/SL/HR/BS/SR/EN/IT/ES/FR)
- Frontend: cleaner UX, added bs language, smart selection description
- reframe.py: --fade-in --fade-out args
- clip.py: propagates fade params
- app/main.py: replaces find_chorus.py call with analyze.py
2026-04-29 06:21:35 +00:00
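The instrumental check from the commit above (subtitles off when vocals cover less than 10% of the track) can be sketched as follows. The function names and the assumption that transcript segments are non-overlapping `(start, end)` pairs in seconds are illustrative, not taken from analyze.py.

```python
def vocal_ratio(segments, duration):
    """Fraction of the track covered by vocal segments.

    segments: iterable of non-overlapping (start, end) times in seconds.
    """
    if duration <= 0:
        return 0.0
    voiced = sum(max(0.0, end - start) for start, end in segments)
    return min(voiced / duration, 1.0)


def subtitles_enabled(segments, duration, min_ratio=0.10):
    """Disable subtitles when vocals cover less than min_ratio of the track."""
    return vocal_ratio(segments, duration) >= min_ratio


# 5 s of vocals in a 200 s track is 2.5%, so the track is treated as
# instrumental and subtitles are switched off.
mostly_instrumental = subtitles_enabled([(5.0, 10.0)], duration=200.0)
```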