Self-hosted Opus Clip alternative — reels.biba.live
Real-world test confirmed that Gemini 3 Pro can transcribe Slovenian folk-pop songs accurately where ElevenLabs Scribe hallucinates.

Test: FEHTARJI - GORENJSKA LJUBLJENA (120s sample)
- Scribe result: 'finančni moduli...' (total hallucination, wrong content)
- Gemini 3 Pro: 'Zunaj srečo sem iskal, planet prepotoval' (CORRECT lyrics)

Implementation:
1. New transcribe_with_gemini() function:
   - Uploads audio via Gemini Files API (resumable upload)
   - Calls gemini-3-pro-preview with structured prompt
   - Parses JSON response with word-level timestamps
   - Computes coverage_pct and hallucination_count
   - Returns same format as Scribe (compatible)
2. New 'hybrid' provider mode (now the default for 'auto'):
   - Try Scribe first (fast, cheap: 8-10s, $0.013)
   - If quality OK (coverage >= 50%, no hallucinations) → return Scribe
   - Else retry Scribe once
   - If still bad → fallback to Gemini 3 Pro (slow, more expensive: 100s, $0.20)
   - Compare results, return whichever is better
3. Provider modes:
   - 'auto' → hybrid if both keys, else elevenlabs, else local
   - 'hybrid' → explicit Scribe + Gemini fallback
   - 'elevenlabs' → Scribe only (with auto-retry)
   - 'gemini' → Gemini only
   - 'local' → faster-whisper on CPU

Cost analysis (10 reels/day):
- Pure Scribe: $0.13/day, ~5-10% reels unusable
- Hybrid: ~$0.55/day, 100% usable
- Pure Gemini: $2/day

Hybrid is the clear winner: +$0.42/day for 100% reliability.
Repository contents
- app
- scripts
- templates
- .env.example
- .gitignore
- docker-compose.yml
- Dockerfile
- README.md
- requirements.txt
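The hybrid provider mode described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the `Transcript` class, `quality_ok`, `better`, `transcribe_hybrid`, and `resolve_provider` are hypothetical names, and the tie-break rule in `better` is an assumption. Only the 50% coverage threshold, the single Scribe retry, the Gemini fallback, and the 'auto' resolution order come from the commit message.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

MIN_COVERAGE_PCT = 50.0  # threshold quoted in the commit message


@dataclass
class Transcript:
    provider: str
    text: str
    coverage_pct: float       # share of the audio covered by recognised words
    hallucination_count: int  # number of segments flagged as hallucinated


def quality_ok(t: Transcript) -> bool:
    """Quality gate: coverage >= 50% and no hallucinations."""
    return t.coverage_pct >= MIN_COVERAGE_PCT and t.hallucination_count == 0


def _score(t: Transcript) -> Tuple[int, float]:
    # Assumed ranking: fewer hallucinations first, then higher coverage.
    return (-t.hallucination_count, t.coverage_pct)


def better(a: Transcript, b: Transcript) -> Transcript:
    """Return the higher-scoring transcript; `a` wins ties."""
    return a if _score(a) >= _score(b) else b


def transcribe_hybrid(audio_path: str,
                      scribe: Callable[[str], Transcript],
                      gemini: Callable[[str], Transcript]) -> Transcript:
    """Try Scribe, retry once, then fall back to Gemini and keep the better result."""
    first = scribe(audio_path)
    if quality_ok(first):
        return first
    retry = scribe(audio_path)
    if quality_ok(retry):
        return retry
    best_scribe = better(first, retry)
    return better(gemini(audio_path), best_scribe)


def resolve_provider(mode: str, has_elevenlabs_key: bool, has_gemini_key: bool) -> str:
    """'auto' → hybrid if both keys, else elevenlabs, else local."""
    if mode != "auto":
        return mode
    if has_elevenlabs_key and has_gemini_key:
        return "hybrid"
    if has_elevenlabs_key:
        return "elevenlabs"
    return "local"
```

Passing the two providers in as callables keeps the fallback logic testable without network access. The real functions would wrap the Scribe and Gemini API calls.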
Reels Clipper · biba.live
A self-hosted Opus Clip alternative for FOLX TV / PTC. It converts 16:9 video into the 9:16 reels/shorts/TikTok format with automatic face tracking, subtitles (sl/de/en), and automatic chorus detection in songs.
Features
- 📤 Drag & drop upload (up to 2 GB)
- 📺 YouTube URL paste (yt-dlp)
- 🎯 Smart reframe: track (face follow), center, blur (for music)
- 🎵 Auto-chorus detection (Whisper + energy hybrid)
- 📝 Burned-in subtitles (faster-whisper, multilingual)
- 🎨 3 subtitle styles: reels, yellow (MrBeast), minimal
- 🔐 HTTP Basic Auth
- 📊 Real-time progress (Server-Sent Events)
- 📦 Docker / Coolify ready
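To make the reframe modes concrete, here is a small sketch of the crop geometry behind a 16:9 → 9:16 conversion. It is an illustration under assumptions, not the logic in reframe.py: the function name `crop_window`, the `face_cx` parameter, and the even-width rounding are all hypothetical choices. With no face position it yields a centered crop (the "center" mode). With a face position it follows the face horizontally (the idea behind "track").

```python
from typing import Optional, Tuple


def crop_window(src_w: int, src_h: int,
                face_cx: Optional[float] = None) -> Tuple[int, int, int, int]:
    """Return (x, y, w, h) of a full-height 9:16 crop inside a landscape frame."""
    crop_h = src_h
    crop_w = int(src_h * 9 / 16)
    crop_w -= crop_w % 2  # assumed: keep the width even for encoder compatibility
    # Center on the face if one is given, otherwise on the middle of the frame.
    cx = src_w / 2 if face_cx is None else face_cx
    x = int(round(cx - crop_w / 2))
    # Clamp so the window never leaves the source frame.
    x = max(0, min(x, src_w - crop_w))
    return x, 0, crop_w, crop_h


if __name__ == "__main__":
    print(crop_window(1920, 1080))               # centered crop
    print(crop_window(1920, 1080, face_cx=100))  # face near the left edge
```

The resulting tuple maps onto FFmpeg's crop filter, which takes its arguments in the order `crop=w:h:x:y`.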
Quick start (local)
docker compose up --build
# open http://localhost:8000
Default login: user sebastjan; set the password via AUTH_PASS in .env.
Coolify deploy
- In Coolify, create a new project → Docker Compose from this repo
- Domain: reels.biba.live
- Env vars:
  AUTH_USER=sebastjan
  AUTH_PASS=<strong password>
  MAX_UPLOAD_MB=2000
- The reels_data volume is created automatically
- Deploy → Coolify sets up a Traefik reverse proxy + SSL via Let's Encrypt
Pipeline
Upload / YouTube
↓
[ yt_download.py ] ← only for YouTube input
↓
[ find_chorus.py ] ← only if auto_chorus=true (Whisper + RMS analysis)
↓
[ reframe.py ] ← 16:9 → 9:16 (track / center / blur)
↓
[ subtitle.py ] ← Whisper transcription + burn-in
↓
reel.mp4
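The RMS half of the chorus detection step can be illustrated with a short sketch that finds the loudest window in a mono signal. This is not the code in find_chorus.py: the function names, the non-overlapping framing, and the "highest mean RMS" criterion are assumptions. How the energy score is combined with the Whisper transcript is not described here, so it is left out.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(samples: Sequence[float], frame_len: int) -> List[float]:
    """RMS energy of consecutive, non-overlapping frames."""
    out = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        out.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return out


def loudest_window(rms: Sequence[float], window_frames: int) -> Tuple[int, float]:
    """Return (start_frame, mean_rms) of the window with the highest mean RMS."""
    if window_frames <= 0 or window_frames > len(rms):
        raise ValueError("window_frames must be between 1 and len(rms)")
    # Sliding sum: add the frame entering the window, drop the one leaving it.
    total = sum(rms[:window_frames])
    best_start, best_total = 0, total
    for i in range(1, len(rms) - window_frames + 1):
        total += rms[i + window_frames - 1] - rms[i - 1]
        if total > best_total:
            best_start, best_total = i, total
    return best_start, best_total / window_frames
```

Multiplying `start_frame` by the frame duration gives a candidate start time for the clip. A real detector would likely also check for repeated lyrics in the transcript before trusting the energy peak.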
API
- POST /api/upload: multipart file upload, returns job_id
- POST /api/youtube: JSON {url, mode, lang, ...}
- POST /api/process: start processing for an uploaded job
- GET /api/jobs: list all jobs
- GET /api/jobs/{id}: job status
- GET /api/stream/{id}: SSE progress stream
- GET /api/download/{id}: final reel
- DELETE /api/jobs/{id}: delete the job
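A client talking to these endpoints needs two small pieces: an HTTP Basic Auth header and a parser for the Server-Sent Events stream. The sketch below covers both using only the standard library. The helper names `basic_auth_header` and `iter_sse_data` are hypothetical, and the payload format of the progress events is not specified here, so the parser simply yields each event's raw data string.

```python
import base64
from typing import Dict, Iterable, Iterator, List


def basic_auth_header(user: str, password: str) -> Dict[str, str]:
    """Build the Authorization header for HTTP Basic Auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete Server-Sent Event."""
    buf: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":              # a blank line terminates the current event
            if buf:
                yield "\n".join(buf)
                buf = []
        elif line.startswith(":"):  # comment line, often used as a keep-alive
            continue
        elif line.startswith("data:"):
            value = line[5:]
            # A single leading space after the colon is not part of the data.
            buf.append(value[1:] if value.startswith(" ") else value)
    # An event not terminated by a blank line is dropped, as the SSE format requires.
```

In use, the lines would come from a streaming GET on /api/stream/{id} sent with the header from `basic_auth_header`, and each yielded string would then be decoded according to whatever format the server emits.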
Dependencies
- FFmpeg (system)
- faster-whisper (transcription)
- OpenCV (face detection)
- yt-dlp (YouTube)
- FastAPI + uvicorn (server)