Commit Graph

12 Commits

Author SHA1 Message Date
OpenClaw Agent
c1e00b7b73 Final SAR=1 fix: append setsar=1 to the end of the vfilter chain in reframe.py and to the ass filter in subtitle.py (compensates for rounding errors from the scale/crop filters, which produce SAR 10240:10239 instead of 1:1) 2026-05-02 12:12:38 +00:00
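
Note on c1e00b7b73: a minimal sketch of how a trailing setsar=1 could be appended to an ffmpeg filter chain. The helper name `with_square_pixels` and the example crop/scale values are assumptions for illustration; the actual code in reframe.py and subtitle.py is not shown in this log.

```python
def with_square_pixels(vfilter: str) -> str:
    """Return the -vf chain with setsar=1 as its last filter.

    Hypothetical helper: forces a 1:1 sample aspect ratio after
    scale/crop, and does nothing if the chain already ends with it.
    """
    chain = vfilter.strip().rstrip(",")
    if chain.endswith("setsar=1"):
        return chain
    return f"{chain},setsar=1" if chain else "setsar=1"


# Example values only; the real crop/scale arguments are not in the log.
chain = with_square_pixels("crop=608:1080,scale=1080:1920")
print(chain)
```

Placing setsar last matters because any later scale or crop step could reintroduce a non-square ratio.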
OpenClaw Agent
6279b0ec03 subtitle.py: add -pix_fmt yuv420p to the burn-in encode (the subtitle re-encode was perpetuating the broadcast yuv422p format from the previous stage) 2026-05-02 12:05:47 +00:00
69fb2f5ce8 Upgrade default Whisper model: small/medium → large-v3 for much better Slovenian/Slavic transcription accuracy 2026-04-29 08:20:18 +00:00
4bc5ac6756 Major: Claude post-processing of Whisper transcript
- Claude now corrects transcription errors (Slavic languages, dialects, mixed langs)
- Returns corrected_segments with same timestamps but cleaner text
- Pipeline generates SRT from Claude-corrected transcript and passes to subtitle.py via --srt
- subtitle.py supports --srt to skip Whisper re-transcription on the trimmed clip
- clip.py propagates --srt through to subtitle.py
- Whisper still runs once (in analyze.py); subtitle.py reuses corrected output instead of re-running
- This means: Whisper's mistakes (mixed langs, hallucinations, wrong words) are fixed by Claude before becoming visible subtitles
2026-04-29 08:13:33 +00:00
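
Note on 4bc5ac6756: a rough sketch of turning corrected segments into SRT text that could then be passed via --srt. The segment shape (dicts with `start`, `end`, `text`, times in seconds) and both function names are assumptions; the log only states that `corrected_segments` keep the original timestamps.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments; empty segments are skipped."""
    blocks = []
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue
        index = len(blocks) + 1  # SRT cues are numbered from 1
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


# Hypothetical corrected segments, for illustration only.
corrected_segments = [
    {"start": 0.0, "end": 1.5, "text": " Hello "},
    {"start": 1.5, "end": 3.25, "text": "world"},
]
print(segments_to_srt(corrected_segments))
```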
af3c933c78 Robust language detection + anti-hallucination
- 3-sample voting for auto-detect (start/middle/end of song) prevents lang switching mid-song
- Lock detected language for full transcription
- Anti-hallucination: condition_on_previous_text=False, temperature=0.0
- compression_ratio_threshold=2.4 (rejects repetitive hallucinations)
- log_prob_threshold=-1.0 (rejects low-confidence segments)
- no_speech_threshold=0.6 (more aggressive silence detection)
- Default Whisper model changed: small → medium (better for all langs incl. Slavic)
2026-04-29 07:59:20 +00:00
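
Note on af3c933c78: a sketch of the 3-sample vote and the decoding options listed above. The function `vote_language`, the tie-break rule, and the sample values are assumptions. The option names are copied from the commit message; their spelling appears to match the keyword arguments of faster-whisper's `transcribe`, but the log does not say which Whisper implementation is used.

```python
from collections import Counter


def vote_language(samples):
    """Pick the language detected most often across audio excerpts.

    `samples` is a list of (language_code, probability) pairs, e.g. one
    each from the start, middle and end of the song. Ties are broken by
    the summed probability (an assumed rule, not taken from the log).
    """
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(lang for lang, _ in samples)
    confidence = {}
    for lang, prob in samples:
        confidence[lang] = confidence.get(lang, 0.0) + prob
    return max(counts, key=lambda lang: (counts[lang], confidence[lang]))


# Option values as listed in the commit message.
TRANSCRIBE_OPTIONS = {
    "condition_on_previous_text": False,
    "temperature": 0.0,
    "compression_ratio_threshold": 2.4,
    "log_prob_threshold": -1.0,
    "no_speech_threshold": 0.6,
}

# Hypothetical detection results for start/middle/end excerpts.
locked = vote_language([("sl", 0.9), ("hr", 0.6), ("sl", 0.7)])
print(locked, TRANSCRIBE_OPTIONS)
```

The winning code would then be passed as the fixed language for the full transcription, so a single noisy excerpt cannot switch languages mid-song.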
c870d80726 Fix: extend clip if ends mid-vocal (no chorus cut-off), DejaVu Sans font (supports SLO/HR/BS chars), auto-upgrade to medium Whisper model for Slavic languages 2026-04-29 07:35:00 +00:00
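
Note on c870d80726: one possible way to extend a clip that ends mid-vocal, assuming vocal segments are available as (start, end) pairs in seconds. The function name and the `max_extension` cap are hypothetical; the log does not describe how the extension is actually computed.

```python
def extend_to_vocal_end(clip_end, segments, max_extension=4.0):
    """Move clip_end to the end of the vocal segment it cuts through.

    `segments` is a list of (start, end) times in seconds. The
    extension is capped at `max_extension` seconds (assumed limit).
    """
    for start, end in segments:
        if start < clip_end < end:
            return min(end, clip_end + max_extension)
    return clip_end  # clip already ends between vocal segments


# Example: the clip would end at 30.0 s, inside a line sung 28.0-31.5 s.
print(extend_to_vocal_end(30.0, [(10.0, 15.0), (28.0, 31.5)]))
```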
5d5e169f9d Disable Whisper VAD filter — was dropping vocal segments in songs creating gaps in subtitles 2026-04-29 07:07:29 +00:00
81edd24ca3 Subtitles: smaller font 56px (was 84), higher position MarginV=400, side margins 80px for safe zone 2026-04-29 06:09:26 +00:00
ba787744a6 Subtitles: cap chunk duration at 2.5s, split long lines into multiple time slices for faster reels pacing 2026-04-29 05:59:36 +00:00
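
Note on ba787744a6: a sketch of capping cue duration at 2.5 s by splitting a long cue into equal time slices. The function name and the even word distribution are assumptions; the log gives only the 2.5 s cap.

```python
import math


def split_cue(start, end, text, max_duration=2.5):
    """Split one cue into slices no longer than max_duration seconds.

    Words are spread evenly over equal-length slices. A cue with a
    single word is returned unchanged, and a cue with fewer words than
    needed slices may still exceed the cap.
    """
    words = text.split()
    duration = end - start
    if duration <= max_duration or len(words) < 2:
        return [(start, end, text)]
    n = min(len(words), math.ceil(duration / max_duration))
    slice_len = duration / n
    cues = []
    for i in range(n):
        chunk = words[i * len(words) // n:(i + 1) * len(words) // n]
        cues.append((start + i * slice_len,
                     start + (i + 1) * slice_len,
                     " ".join(chunk)))
    return cues


for cue in split_cue(0.0, 6.0, "one two three four five six"):
    print(cue)
```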
e001387a89 Subtitles: convert SRT to ASS directly with PlayResY=1920 for predictable scaling instead of unreliable force_style 2026-04-28 18:09:53 +00:00
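
Note on e001387a89: a minimal sketch of a direct SRT to ASS conversion with an explicit PlayResY=1920, so that font size and margins are interpreted against a known canvas. Function names, colours and outline settings are assumptions; the default font, size and margins are borrowed from later commits in this log (DejaVu Sans, 56, MarginV=400, 80 side margins) and may not match the code at this commit.

```python
import re

ASS_HEADER = """[Script Info]
ScriptType: v4.00+
PlayResX: {width}
PlayResY: {height}

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,{font},{size},&H00FFFFFF,&H000000FF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,3,0,2,{margin_h},{margin_h},{margin_v},1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
"""

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def srt_time_to_ass(stamp: str) -> str:
    """Convert HH:MM:SS,mmm to the ASS form H:MM:SS.cc (centiseconds)."""
    h, m, s, ms = map(int, TIME_RE.search(stamp).groups())
    return f"{h}:{m:02d}:{s:02d}.{ms // 10:02d}"


def srt_to_ass(srt_text, width=1080, height=1920, font="DejaVu Sans",
               size=56, margin_h=80, margin_v=400) -> str:
    """Build a complete ASS document from SRT text."""
    header = ASS_HEADER.format(width=width, height=height, font=font,
                               size=size, margin_h=margin_h,
                               margin_v=margin_v)
    events = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        rows = block.splitlines()
        timing = next((i for i, r in enumerate(rows) if "-->" in r), None)
        if timing is None:
            continue  # not a cue block
        start, end = (srt_time_to_ass(t) for t in rows[timing].split("-->"))
        text = r"\N".join(rows[timing + 1:])  # \N is the ASS line break
        events.append(f"Dialogue: 0,{start},{end},Default,,0,0,0,,{text}")
    return header + "\n".join(events) + "\n"


print(srt_to_ass("1\n00:00:01,000 --> 00:00:03,500\nHELLO\nWORLD\n"))
```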
28d933c916 Subtitles: UPPERCASE + position lower (MarginV=320 for 1080x1920) + bigger font 2026-04-28 17:40:48 +00:00
30b969e4b8 Initial: reels clipper app
- FastAPI backend (auth, jobs, SSE, download)
- Frontend: drag&drop + YouTube URL + jobs panel
- Pipeline: yt_download → find_chorus → reframe → subtitle
- Modes: track (face follow), center, blur
- Whisper for SI/DE/EN subtitles
- Auto-chorus detection via Whisper + RMS energy
- Docker + Coolify ready
2026-04-28 15:28:22 +00:00