Initial: reels clipper app

- FastAPI backend (auth, jobs, SSE, download)
- Frontend: drag&drop + YouTube URL + jobs panel
- Pipeline: yt_download → find_chorus → reframe → subtitle
- Modes: track (face follow), center, blur
- Whisper for SI/DE/EN subtitles
- Auto-chorus detection via Whisper + RMS energy
- Docker + Coolify ready
Sebastjan Artič 2026-04-28 15:28:22 +00:00
commit 30b969e4b8
13 changed files with 2096 additions and 0 deletions

.env.example Normal file

@ -0,0 +1,6 @@
# Auth (basic HTTP)
AUTH_USER=sebastjan
AUTH_PASS=change-me-in-coolify-env
# Upload limit (MB)
MAX_UPLOAD_MB=2000

.gitignore vendored Normal file

@ -0,0 +1,11 @@
__pycache__/
*.pyc
*.pyo
.venv/
venv/
.env
*.mp4
*.wav
*.srt
data/
.DS_Store

Dockerfile Normal file

@ -0,0 +1,39 @@
FROM python:3.11-slim
# System deps: FFmpeg, libraries for OpenCV
RUN apt-get update && apt-get install -y --no-install-recommends \
    ffmpeg \
    libsm6 \
    libxext6 \
    libgl1 \
    curl \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
# Python deps
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# App code
COPY app/ ./app/
COPY scripts/ ./scripts/
COPY templates/ ./templates/
COPY static/ ./static/
# Data volume
RUN mkdir -p /data/uploads /data/outputs /data/jobs
VOLUME /data
ENV DATA_DIR=/data
ENV PYTHONUNBUFFERED=1
EXPOSE 8000
# Pre-download the Whisper "small" model for a faster cold start (optional)
# RUN python -c "from faster_whisper import WhisperModel; WhisperModel('small', device='cpu', compute_type='int8')"
HEALTHCHECK --interval=30s --timeout=10s --start-period=30s --retries=3 \
    CMD curl -fsS http://localhost:8000/healthz || exit 1
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "1"]

README.md Normal file

@ -0,0 +1,73 @@
# Reels Clipper · biba.live
Self-hosted Opus Clip alternative for FOLX TV / PTC.
Converts 16:9 video into the 9:16 reels/shorts/TikTok format with automatic face tracking, subtitles (sl/de/en), and **automatic chorus detection** in songs.
## Features
- 📤 **Drag & drop upload** (up to 2 GB)
- 📺 **YouTube URL paste** (yt-dlp)
- 🎯 **Smart reframe**: track (face follow), center, blur (for music)
- 🎵 **Auto-chorus detection** (Whisper + energy hybrid)
- 📝 **Burned-in subtitles** (faster-whisper, multi-language)
- 🎨 **3 subtitle styles**: reels, yellow (MrBeast), minimal
- 🔐 **HTTP Basic Auth**
- 📊 **Real-time progress** (Server-Sent Events)
- 📦 **Docker / Coolify ready**
## Quick start (local)
```bash
docker compose up --build
# open http://localhost:8000
```
Default login: `sebastjan`; set `AUTH_PASS` in `.env`.
## Coolify deploy
1. In Coolify, create a new project → **Docker Compose** from this repo
2. Domain: `reels.biba.live`
3. Env vars:
   ```
   AUTH_USER=sebastjan
   AUTH_PASS=<strong password>
   MAX_UPLOAD_MB=2000
   ```
4. The `reels_data` volume is created automatically
5. Deploy → Coolify sets up the Traefik reverse proxy and SSL via Let's Encrypt
## Pipeline
```
Upload / YouTube
[ yt_download.py ] ← YouTube sources only
[ find_chorus.py ] ← only if auto_chorus=true (Whisper + RMS analysis)
[ reframe.py ] ← 16:9 → 9:16 (track / center / blur)
[ subtitle.py ] ← Whisper transcription + burn-in
reel.mp4
```
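
Which stages run depends on the job's options. The following is a minimal sketch of that selection, assuming a hypothetical helper `planned_steps` that is not part of the repository; its parameters mirror the job fields `source_type`, `auto_chorus`, and `no_subs`.

```python
def planned_steps(source_type: str, auto_chorus: bool, no_subs: bool) -> list[str]:
    """Return the scripts that would run, in order, for one job (illustrative only)."""
    steps = []
    if source_type == "youtube":
        steps.append("yt_download.py")  # only YouTube sources need a download
    if auto_chorus:
        steps.append("find_chorus.py")  # optional chorus detection
    steps.append("reframe.py")          # always reframe 16:9 to 9:16
    if not no_subs:
        steps.append("subtitle.py")     # subtitles unless disabled
    return steps


if __name__ == "__main__":
    print(planned_steps("youtube", True, False))
    print(planned_steps("upload", False, True))
```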
## API
- `POST /api/upload`: multipart file upload, returns the job (including its `id`)
- `POST /api/youtube`: JSON `{url, mode, lang, ...}`
- `POST /api/process`: start processing an uploaded job (JSON body with `job_id`)
- `GET /api/jobs`: list all jobs
- `GET /api/jobs/{id}`: job status
- `GET /api/stream/{id}`: SSE progress stream
- `GET /api/download/{id}`: final reel
- `GET /api/preview/{id}`: preview video stream
- `DELETE /api/jobs/{id}`: delete the job and its files
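
The progress stream in `app/main.py` emits each update as a `data: <json>` line followed by a blank line. Below is a minimal client-side sketch for decoding such a body, assuming the text has already been read from the response. The helper name `parse_sse_events` is hypothetical and only the `data:` field is handled.

```python
import json


def parse_sse_events(body: str) -> list[dict]:
    """Split an SSE text body into events and decode each data payload as JSON."""
    events = []
    for block in body.split("\n\n"):  # events are separated by a blank line
        data_lines = [
            line[len("data:"):].strip()
            for line in block.splitlines()
            if line.startswith("data:")
        ]
        if data_lines:
            events.append(json.loads("\n".join(data_lines)))
    return events


if __name__ == "__main__":
    sample = 'data: {"status": "queued"}\n\ndata: {"status": "done"}\n\n'
    for event in parse_sse_events(sample):
        print(event["status"])
```

A real client would read the stream incrementally instead of buffering the whole body; the parsing step stays the same.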
## Dependencies
- FFmpeg (system)
- faster-whisper (transcription)
- OpenCV (face detection)
- yt-dlp (YouTube)
- FastAPI + uvicorn (server)

app/main.py Normal file

@ -0,0 +1,454 @@
"""
reels.biba.live FastAPI backend.
Endpoints:
    GET    /                    frontend HTML
    POST   /api/upload          upload a video file
    POST   /api/youtube         submit a YouTube URL
    POST   /api/process         start processing an uploaded job (job_id in the JSON body)
    GET    /api/jobs            list all jobs
    GET    /api/jobs/{id}       job status
    GET    /api/stream/{id}     SSE progress stream
    GET    /api/download/{id}   download the final reel
    GET    /api/preview/{id}    preview video stream
    DELETE /api/jobs/{id}       delete the job and its files
"""
import asyncio
import json
import os
import secrets
import shutil
import subprocess
import time
import uuid
from pathlib import Path
from typing import Optional
from fastapi import (
FastAPI, UploadFile, File, Form, HTTPException, Depends,
BackgroundTasks, Request, status
)
from fastapi.responses import (
FileResponse, HTMLResponse, StreamingResponse, JSONResponse
)
from fastapi.staticfiles import StaticFiles
from fastapi.security import HTTPBasic, HTTPBasicCredentials
from pydantic import BaseModel
# ────────────────────────────────────────────────────────────────
# Config
# ────────────────────────────────────────────────────────────────
DATA_DIR = Path(os.environ.get("DATA_DIR", "/data"))
UPLOAD_DIR = DATA_DIR / "uploads"
OUTPUT_DIR = DATA_DIR / "outputs"
JOBS_DIR = DATA_DIR / "jobs"
SCRIPTS_DIR = Path(__file__).parent.parent / "scripts"
UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
JOBS_DIR.mkdir(parents=True, exist_ok=True)
AUTH_USER = os.environ.get("AUTH_USER", "sebastjan")
AUTH_PASS = os.environ.get("AUTH_PASS", "change-me-in-coolify-env")
MAX_UPLOAD_MB = int(os.environ.get("MAX_UPLOAD_MB", "2000"))
# ────────────────────────────────────────────────────────────────
# Auth
# ────────────────────────────────────────────────────────────────
security = HTTPBasic()
def check_auth(creds: HTTPBasicCredentials = Depends(security)):
    # Compare as bytes: secrets.compare_digest raises TypeError for non-ASCII str input.
    correct_user = secrets.compare_digest(
        creds.username.encode("utf-8"), AUTH_USER.encode("utf-8")
    )
    correct_pass = secrets.compare_digest(
        creds.password.encode("utf-8"), AUTH_PASS.encode("utf-8")
    )
    if not (correct_user and correct_pass):
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Napačno geslo",
            headers={"WWW-Authenticate": "Basic"},
        )
    return creds.username
# ────────────────────────────────────────────────────────────────
# Job state (filesystem-based, persists across restarts)
# ────────────────────────────────────────────────────────────────
def job_path(job_id):
    return JOBS_DIR / f"{job_id}.json"
def load_job(job_id):
    p = job_path(job_id)
    if not p.exists():
        return None
    return json.loads(p.read_text())
def save_job(job):
    job_path(job["id"]).write_text(json.dumps(job, ensure_ascii=False, indent=2))
def update_job(job_id, **kwargs):
    job = load_job(job_id)
    if not job:
        return None
    job.update(kwargs)
    job["updated_at"] = time.time()
    save_job(job)
    return job
def list_jobs():
    out = []
    for f in sorted(JOBS_DIR.glob("*.json"), reverse=True):
        try:
            out.append(json.loads(f.read_text()))
        except Exception:
            pass
    return out
# ────────────────────────────────────────────────────────────────
# Pipeline runner (background task)
# ────────────────────────────────────────────────────────────────
def run_subprocess_logged(cmd, job_id, step_name):
    """Run a subprocess; on failure, store the tail of stderr in the job."""
    update_job(job_id, current_step=step_name, status="processing")
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        update_job(
            job_id,
            status="failed",
            error=f"{step_name}: {proc.stderr[-500:]}",
        )
        return False
    return True
def process_job(job_id):
    """Main pipeline: download (if YouTube) → find_chorus (if auto) → reframe → subtitles."""
    job = load_job(job_id)
    if not job:
        return
    try:
        # ── 1. Source preparation ─────────────────────────────
        if job["source_type"] == "youtube":
            update_job(job_id, status="downloading", current_step="YouTube download")
            input_path = UPLOAD_DIR / f"{job_id}_yt.mp4"
            cmd = [
                "python3", str(SCRIPTS_DIR / "yt_download.py"),
                job["youtube_url"], str(input_path),
            ]
            if not run_subprocess_logged(cmd, job_id, "YouTube download"):
                return
            update_job(job_id, input_path=str(input_path))
        else:
            input_path = Path(job["input_path"])
        # ── 2. Find chorus (if auto) ──────────────────────────
        if job.get("auto_chorus"):
            update_job(job_id, current_step="Iščem refren (Whisper + energy)")
            cmd = [
                "python3", str(SCRIPTS_DIR / "find_chorus.py"),
                str(input_path),
                # Fall back to 30 s when duration is missing or None
                "--duration", str(job.get("duration") or 30),
                "--json",
            ]
            if job.get("lang"):
                cmd += ["--lang", job["lang"]]
            cmd += ["--model", job.get("whisper_model", "small")]
            proc = subprocess.run(cmd, capture_output=True, text=True)
            if proc.returncode == 0:
                try:
                    chorus = json.loads(proc.stdout)
                    if chorus.get("candidates"):
                        best = chorus["candidates"][0]
                        # Rebind `job`: update_job writes to disk, so the dict loaded
                        # above would otherwise miss the detected start/duration in step 3.
                        job = update_job(
                            job_id,
                            chorus_detection=chorus,
                            start=best["start"],
                            duration=best["duration"],
                        ) or job
                except json.JSONDecodeError:
                    update_job(job_id, chorus_error="JSON decode failed")
            else:
                update_job(job_id, chorus_error=proc.stderr[-300:])
        # ── 3. Reframe + subtitles (clip.py orchestrator) ─────
        output_path = OUTPUT_DIR / f"{job_id}.mp4"
        update_job(job_id, current_step="Reframe + subtitles")
        cmd = [
            "python3", str(SCRIPTS_DIR / "clip.py"),
            str(input_path), str(output_path),
            "--mode", job.get("mode", "track"),
            "--quality", job.get("quality", "medium"),
            "--style", job.get("subtitle_style", "reels"),
        ]
        if job.get("start") is not None:
            cmd += ["--start", str(job["start"])]
        if job.get("duration") is not None:
            cmd += ["--duration", str(job["duration"])]
        if job.get("lang"):
            cmd += ["--lang", job["lang"]]
        if job.get("no_subs"):
            cmd += ["--no-subs"]
        cmd += ["--model", job.get("whisper_model", "small")]
        if not run_subprocess_logged(cmd, job_id, "Reframe + subtitles"):
            return
        # ── Done ──────────────────────────────────────────────
        if output_path.exists():
            update_job(
                job_id,
                status="done",
                current_step="Končano",
                output_path=str(output_path),
                output_size_mb=round(output_path.stat().st_size / 1024 / 1024, 2),
            )
        else:
            update_job(
                job_id,
                status="failed",
                error="Output datoteka ne obstaja po obdelavi",
            )
    except Exception as e:
        update_job(job_id, status="failed", error=str(e))
# ────────────────────────────────────────────────────────────────
# FastAPI app
# ────────────────────────────────────────────────────────────────
app = FastAPI(title="Reels Clipper")
app.mount("/static", StaticFiles(directory=Path(__file__).parent.parent / "static"), name="static")
@app.get("/", response_class=HTMLResponse)
async def index(user: str = Depends(check_auth)):
    html = (Path(__file__).parent.parent / "templates" / "index.html").read_text()
    return html
@app.get("/healthz")
async def healthz():
    return {"ok": True}
# ────────────────────────────────────────────────────────────────
# Job models
# ────────────────────────────────────────────────────────────────
class YouTubeJobIn(BaseModel):
    url: str
    mode: str = "track"
    lang: Optional[str] = None
    auto_chorus: bool = True
    start: Optional[float] = None
    duration: Optional[float] = 30
    no_subs: bool = False
    subtitle_style: str = "reels"
    whisper_model: str = "small"
    quality: str = "medium"
class StartJobIn(BaseModel):
    job_id: str
    mode: str = "track"
    lang: Optional[str] = None
    auto_chorus: bool = True
    start: Optional[float] = None
    duration: Optional[float] = 30
    no_subs: bool = False
    subtitle_style: str = "reels"
    whisper_model: str = "small"
    quality: str = "medium"
# ────────────────────────────────────────────────────────────────
# Upload (file)
# ────────────────────────────────────────────────────────────────
@app.post("/api/upload")
async def upload_video(
    file: UploadFile = File(...),
    user: str = Depends(check_auth),
):
    if not file.filename:
        raise HTTPException(400, "Brez imena")
    job_id = uuid.uuid4().hex[:12]
    ext = Path(file.filename).suffix or ".mp4"
    input_path = UPLOAD_DIR / f"{job_id}{ext}"
    size = 0
    with input_path.open("wb") as f:
        # Stream in 1 MB chunks and enforce the size limit while writing
        while chunk := await file.read(1024 * 1024):
            size += len(chunk)
            if size > MAX_UPLOAD_MB * 1024 * 1024:
                f.close()
                input_path.unlink(missing_ok=True)
                raise HTTPException(413, f"Prevelika datoteka (limit {MAX_UPLOAD_MB} MB)")
            f.write(chunk)
    job = {
        "id": job_id,
        "source_type": "upload",
        "filename": file.filename,
        "input_path": str(input_path),
        "size_mb": round(size / 1024 / 1024, 2),
        "status": "uploaded",
        "current_step": "Naloženo, čaka na obdelavo",
        "created_at": time.time(),
        "updated_at": time.time(),
    }
    save_job(job)
    return job
# ────────────────────────────────────────────────────────────────
# YouTube submit
# ────────────────────────────────────────────────────────────────
@app.post("/api/youtube")
async def submit_youtube(
    payload: YouTubeJobIn,
    background: BackgroundTasks,
    user: str = Depends(check_auth),
):
    job_id = uuid.uuid4().hex[:12]
    job = {
        "id": job_id,
        "source_type": "youtube",
        "youtube_url": payload.url,
        "status": "queued",
        "current_step": "V vrsti za YouTube prenos",
        "created_at": time.time(),
        "updated_at": time.time(),
        "mode": payload.mode,
        "lang": payload.lang,
        "auto_chorus": payload.auto_chorus,
        "start": payload.start,
        "duration": payload.duration,
        "no_subs": payload.no_subs,
        "subtitle_style": payload.subtitle_style,
        "whisper_model": payload.whisper_model,
        "quality": payload.quality,
    }
    save_job(job)
    background.add_task(process_job, job_id)
    return job
# ────────────────────────────────────────────────────────────────
# Start processing for uploaded job
# ────────────────────────────────────────────────────────────────
@app.post("/api/process")
async def start_processing(
    payload: StartJobIn,
    background: BackgroundTasks,
    user: str = Depends(check_auth),
):
    job = load_job(payload.job_id)
    if not job:
        raise HTTPException(404, "Job ne obstaja")
    update_job(
        payload.job_id,
        status="queued",
        mode=payload.mode,
        lang=payload.lang,
        auto_chorus=payload.auto_chorus,
        start=payload.start,
        duration=payload.duration,
        no_subs=payload.no_subs,
        subtitle_style=payload.subtitle_style,
        whisper_model=payload.whisper_model,
        quality=payload.quality,
        current_step="V vrsti za obdelavo",
    )
    background.add_task(process_job, payload.job_id)
    return load_job(payload.job_id)
# ────────────────────────────────────────────────────────────────
# Job queries
# ────────────────────────────────────────────────────────────────
@app.get("/api/jobs")
async def get_jobs(user: str = Depends(check_auth)):
    return {"jobs": list_jobs()}
@app.get("/api/jobs/{job_id}")
async def get_job(job_id: str, user: str = Depends(check_auth)):
    job = load_job(job_id)
    if not job:
        raise HTTPException(404, "Ne obstaja")
    return job
@app.get("/api/stream/{job_id}")
async def stream_job(job_id: str, user: str = Depends(check_auth)):
    """Server-Sent Events for real-time status."""
    async def gen():
        last_status = None
        last_step = None
        for _ in range(600):  # polls once per second, so the stream lasts at most 10 min
            job = load_job(job_id)
            if not job:
                yield f"data: {json.dumps({'error': 'not found'})}\n\n"
                return
            # Emit only when the status or the current step has changed
            if job["status"] != last_status or job.get("current_step") != last_step:
                yield f"data: {json.dumps(job, ensure_ascii=False)}\n\n"
                last_status = job["status"]
                last_step = job.get("current_step")
            if job["status"] in ("done", "failed"):
                return
            await asyncio.sleep(1)
    return StreamingResponse(gen(), media_type="text/event-stream")
# ────────────────────────────────────────────────────────────────
# Download / preview
# ────────────────────────────────────────────────────────────────
@app.get("/api/download/{job_id}")
async def download(job_id: str, user: str = Depends(check_auth)):
    job = load_job(job_id)
    if not job or job.get("status") != "done":
        raise HTTPException(404, "Ne pripravljen")
    out = Path(job["output_path"])
    if not out.exists():
        raise HTTPException(404, "Output ne obstaja")
    return FileResponse(
        out,
        media_type="video/mp4",
        filename=f"reel_{job_id}.mp4",
    )
@app.get("/api/preview/{job_id}")
async def preview(job_id: str, user: str = Depends(check_auth)):
    job = load_job(job_id)
    if not job or job.get("status") != "done":
        raise HTTPException(404, "Ne pripravljen")
    out = Path(job["output_path"])
    if not out.exists():
        raise HTTPException(404, "Output ne obstaja")
    return FileResponse(out, media_type="video/mp4")
@app.delete("/api/jobs/{job_id}")
async def delete_job(job_id: str, user: str = Depends(check_auth)):
    job = load_job(job_id)
    if not job:
        raise HTTPException(404, "Ne obstaja")
    for key in ("input_path", "output_path"):
        p = job.get(key)
        if p and Path(p).exists():
            Path(p).unlink(missing_ok=True)
    job_path(job_id).unlink(missing_ok=True)
    return {"deleted": job_id}
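
The job store above rewrites each JSON file in place while the SSE endpoint polls the same file once per second, so a reader could in principle observe a half-written file. One common mitigation is to write to a temporary file and rename it over the target. The following is a sketch under that assumption, not the repository's code: `save_job_atomic` and its `jobs_dir` parameter are hypothetical names.

```python
import json
import os
import tempfile
from pathlib import Path


def save_job_atomic(jobs_dir: Path, job: dict) -> Path:
    """Write the job JSON to a temp file in jobs_dir, then rename it over the target."""
    jobs_dir.mkdir(parents=True, exist_ok=True)
    target = jobs_dir / f"{job['id']}.json"
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=jobs_dir, prefix=".tmp_", suffix=".part")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(job, f, ensure_ascii=False, indent=2)
        os.replace(tmp_name, target)  # readers see either the old or the new file
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = save_job_atomic(Path(d), {"id": "abc123", "status": "queued"})
        print(json.loads(path.read_text(encoding="utf-8"))["status"])
```

The `.part` suffix keeps temporary files out of a `*.json` glob such as the one in `list_jobs`.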

docker-compose.yml Normal file

@ -0,0 +1,21 @@
services:
  reels-app:
    build: .
    ports:
      # Container port only: Docker assigns a free host port. Use "8000:8000" for a fixed local port.
      - "8000"
    volumes:
      - reels_data:/data
    environment:
      - AUTH_USER=${AUTH_USER:-sebastjan}
      - AUTH_PASS=${AUTH_PASS:-change-me}
      - MAX_UPLOAD_MB=${MAX_UPLOAD_MB:-2000}
      - DATA_DIR=/data
    restart: unless-stopped
    labels:
      # Coolify tags (Traefik picks these up for the reverse proxy)
      - "coolify.managed=true"
      - "coolify.name=reels-clipper"
volumes:
  reels_data:
    driver: local

requirements.txt Normal file

@ -0,0 +1,8 @@
fastapi==0.115.0
uvicorn[standard]==0.32.0
python-multipart==0.0.12
pydantic==2.9.2
faster-whisper==1.0.3
opencv-python-headless==4.10.0.84
numpy==1.26.4
yt-dlp==2024.10.7

scripts/clip.py Normal file

@ -0,0 +1,132 @@
#!/usr/bin/env python3
"""
clip.py: all in one. Take a 16:9 video, cut a clip, reframe it to 9:16, add subtitles.
Examples:
    # Whole video → 9:16 with face tracking + Slovenian subtitles
    python3 clip.py input.mp4 reel.mp4 --lang sl
    # 30 s clip starting at 1:20, blurred background, no subtitles
    python3 clip.py input.mp4 reel.mp4 --start 80 --duration 30 --mode blur --no-subs
    # Several clips at once from a timestamp list
    python3 clip.py input.mp4 out_dir/ --clips "0:30-1:00,2:15-2:45,5:00-5:30" --lang sl
"""
import argparse
import shutil
import subprocess
import sys
import os
import tempfile
from pathlib import Path
def parse_ts(s):
    """'1:23' → 83.0, '1:23.5' → 83.5, '90' → 90.0"""
    s = s.strip()
    if ":" in s:
        parts = s.split(":")
        if len(parts) == 2:
            return int(parts[0]) * 60 + float(parts[1])
        if len(parts) == 3:
            return int(parts[0]) * 3600 + int(parts[1]) * 60 + float(parts[2])
    return float(s)
def parse_clips(spec):
    """'0:30-1:00,2:15-2:45' → [(30.0, 60.0), (135.0, 165.0)]"""
    out = []
    for c in spec.split(","):
        a, b = c.split("-")
        out.append((parse_ts(a), parse_ts(b)))
    return out
SCRIPT_DIR = Path(__file__).parent
def run_clip(src, dst, start, duration, mode, lang, model, style, no_subs, quality):
    """Produce one clip src → dst."""
    tmp = tempfile.mkdtemp(prefix="reel_")
    try:
        reframed = Path(tmp) / "reframed.mp4"
        # 1. Reframe (and trim in the same pass)
        cmd = [
            "python3", str(SCRIPT_DIR / "reframe.py"),
            str(src), str(reframed),
            "--mode", mode,
            "--quality", quality,
        ]
        if start is not None:
            cmd += ["--start", str(start)]
        if duration is not None:
            cmd += ["--duration", str(duration)]
        print(f"\n▶ Klip: {dst.name}")
        r = subprocess.run(cmd)
        if r.returncode != 0:
            print(f"❌ Reframe napaka pri {dst.name}", file=sys.stderr)
            return False
        # 2. Subtitles (optional)
        # shutil.move instead of os.replace: the temp dir and the destination may be
        # on different filesystems (e.g. /tmp vs. a mounted volume), where os.replace fails.
        if no_subs:
            shutil.move(str(reframed), str(dst))
        else:
            cmd = [
                "python3", str(SCRIPT_DIR / "subtitle.py"),
                str(reframed), str(dst),
                "--model", model,
                "--style", style,
            ]
            if lang:
                cmd += ["--lang", lang]
            r = subprocess.run(cmd)
            if r.returncode != 0:
                # Subtitle step failed: keep the reframed clip without subtitles
                print(f"❌ Subtitle napaka — shranim brez", file=sys.stderr)
                shutil.move(str(reframed), str(dst))
        return True
    finally:
        shutil.rmtree(tmp, ignore_errors=True)
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("input")
    ap.add_argument("output", help="Datoteka (en klip) ali mapa (več klipov)")
    ap.add_argument("--start", type=str, default=None, help="Začetek (s ali mm:ss)")
    ap.add_argument("--duration", type=float, default=None, help="Trajanje v s")
    ap.add_argument("--clips", type=str, default=None,
                    help="Več klipov: '0:30-1:00,2:15-2:45'")
    ap.add_argument("--mode", default="track", choices=["track", "center", "blur"])
    ap.add_argument("--lang", default=None, help="sl, de, en, ... (privzeto auto)")
    ap.add_argument("--model", default="small",
                    choices=["tiny", "base", "small", "medium", "large-v3"])
    ap.add_argument("--style", default="reels", choices=["reels", "yellow", "minimal"])
    ap.add_argument("--no-subs", action="store_true")
    ap.add_argument("--quality", default="medium", choices=["fast", "medium", "high"])
    args = ap.parse_args()
    src = Path(args.input)
    if not src.exists():
        print(f"{src} ne obstaja", file=sys.stderr)
        sys.exit(1)
    if args.clips:
        clips = parse_clips(args.clips)
        out_dir = Path(args.output)
        out_dir.mkdir(parents=True, exist_ok=True)
        ok = 0
        for i, (s, e) in enumerate(clips, 1):
            dst = out_dir / f"reel_{i:02d}.mp4"
            if run_clip(src, dst, s, e - s, args.mode, args.lang, args.model,
                        args.style, args.no_subs, args.quality):
                ok += 1
        print(f"\n✅ Dokončano: {ok}/{len(clips)} klipov v {out_dir}")
    else:
        start = parse_ts(args.start) if args.start else None
        ok = run_clip(src, Path(args.output), start, args.duration, args.mode,
                      args.lang, args.model, args.style, args.no_subs, args.quality)
        # Propagate failure so callers that check the exit code (e.g. app/main.py) notice it
        sys.exit(0 if ok else 1)
if __name__ == "__main__":
    main()
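
To make the `--clips` format concrete, here is a self-contained sketch that turns a spec string into `(start, duration)` pairs, which is the shape `run_clip` receives. It is a stricter variant written for illustration, not the script's own code: the helper name `clip_windows` is hypothetical, and rejecting ranges whose end is not after their start is an added assumption.

```python
def parse_ts(s: str) -> float:
    """Parse 'ss', 'mm:ss' or 'hh:mm:ss' (seconds may be fractional) into seconds."""
    parts = s.strip().split(":")
    if len(parts) > 3:
        raise ValueError(f"too many ':' in timestamp: {s!r}")
    total = 0.0
    for part in parts[:-1]:  # hours and/or minutes
        total = total * 60 + int(part)
    return total * 60 + float(parts[-1])


def clip_windows(spec: str) -> list[tuple[float, float]]:
    """'0:30-1:00,2:15-2:45' → list of (start_seconds, duration_seconds)."""
    windows = []
    for item in spec.split(","):
        a, b = item.split("-")
        start, end = parse_ts(a), parse_ts(b)
        if end <= start:
            raise ValueError(f"clip end must be after start: {item!r}")
        windows.append((start, end - start))
    return windows


if __name__ == "__main__":
    print(clip_windows("0:30-1:00,2:15-2:45"))
```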

scripts/find_chorus.py Normal file

@ -0,0 +1,289 @@
#!/usr/bin/env python3
"""
find_chorus.py: automatic chorus detection in a music video.
Hybrid approach:
    1. Whisper transcribes the song (word-level timestamps enabled)
    2. Find repeated lines (Jaccard similarity on word bigrams)
    3. Energy analysis via FFmpeg (RMS dB); the chorus is usually louder
    4. Combine: repeated text + high energy = chorus
Output: JSON with the best candidates (ranked).
Examples:
    python3 find_chorus.py pesem.mp4
    python3 find_chorus.py pesem.mp4 --duration 30 --json
"""
import argparse
import json
import subprocess
import sys
import tempfile
from collections import Counter
from pathlib import Path
import re
def extract_audio(video, sample_rate=16000):
    """Extract mono WAV for Whisper and the energy analysis."""
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    tmp.close()
    cmd = [
        "ffmpeg", "-y", "-i", str(video),
        "-vn", "-ac", "1", "-ar", str(sample_rate),
        "-c:a", "pcm_s16le", tmp.name,
    ]
    subprocess.run(cmd, check=True, stderr=subprocess.DEVNULL)
    return tmp.name
def transcribe(audio_path, lang=None, model_size="small"):
    """Whisper transcription with word-level timestamps."""
    from faster_whisper import WhisperModel
    print(f"🧠 Whisper: {model_size}, lang={lang or 'auto'}", file=sys.stderr)
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(
        audio_path,
        language=lang,
        word_timestamps=True,
        vad_filter=True,
    )
    print(f" Detekcija: {info.language} (p={info.language_probability:.2f})", file=sys.stderr)
    # Return a list of line-level segments with timestamps
    lines = []
    for seg in segments:
        text = seg.text.strip()
        if text:
            lines.append({
                "start": seg.start,
                "end": seg.end,
                "text": text,
                "duration": seg.end - seg.start,
            })
    return lines, info.language
def normalize_text(s):
    """Normalize for comparison: lowercase, no punctuation."""
    s = s.lower()
    s = re.sub(r"[^\w\s]", "", s)
    s = re.sub(r"\s+", " ", s).strip()
    return s
def line_similarity(a, b):
    """Jaccard similarity on word bigrams."""
    a_words = normalize_text(a).split()
    b_words = normalize_text(b).split()
    if not a_words or not b_words:
        return 0.0
    def bigrams(words):
        # A single-word line is represented by a 1-tuple so it can still match
        return set(zip(words, words[1:])) if len(words) > 1 else {(words[0],)}
    a_bg = bigrams(a_words)
    b_bg = bigrams(b_words)
    if not a_bg or not b_bg:
        return 0.0
    return len(a_bg & b_bg) / len(a_bg | b_bg)
def find_repeated_lines(lines, similarity_threshold=0.5):
    """
    Find repeated lines. Returns a list of clusters.
    Each cluster is a list of indices into `lines` whose texts are similar.
    """
    n = len(lines)
    visited = [False] * n
    clusters = []
    for i in range(n):
        if visited[i]:
            continue
        cluster = [i]
        visited[i] = True
        for j in range(i + 1, n):
            if visited[j]:
                continue
            sim = line_similarity(lines[i]["text"], lines[j]["text"])
            if sim >= similarity_threshold:
                cluster.append(j)
                visited[j] = True
        if len(cluster) >= 2:  # keep only lines that occur at least twice
            clusters.append(cluster)
    return clusters
def compute_energy(audio_path, window_sec=1.0):
    """
    Return a list of (timestamp, rms_db) using the FFmpeg astats filter.
    """
    import math
    # asetnsamples makes each audio frame one window long; reset=1 resets the
    # astats statistics after every frame, so each value covers a single window.
    cmd = [
        "ffmpeg", "-i", audio_path,
        "-af", f"asetnsamples=n={int(16000 * window_sec)}:p=0,astats=metadata=1:reset=1,"
        "ametadata=print:key=lavfi.astats.Overall.RMS_level",
        "-f", "null", "-",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    output = result.stderr
    energies = []
    current_pts = None
    for line in output.split("\n"):
        # Search within the line: FFmpeg may prefix log lines with a filter tag.
        m = re.search(r"pts_time:(\S+)", line)
        if m:
            # frame:N pts:X pts_time:Y
            try:
                current_pts = float(m.group(1))
            except ValueError:
                current_pts = None
            continue
        m = re.search(r"lavfi\.astats\.Overall\.RMS_level=(\S+)", line)
        if m:
            try:
                rms = float(m.group(1))
            except ValueError:
                continue
            # Skip -inf (digital silence) and nan so averages stay finite
            if current_pts is not None and math.isfinite(rms):
                energies.append((current_pts, rms))
    return energies
def avg_energy_in_range(energies, start, end):
    """Average RMS in [start, end]."""
    in_range = [e for t, e in energies if start <= t <= end]
    if not in_range:
        return -60.0  # default: quiet
    return sum(in_range) / len(in_range)
def find_chorus(video, lang=None, model_size="small", target_duration=30.0):
    """
    Main entry point. Returns ranked chorus candidates.
    """
    audio = extract_audio(video)
    try:
        lines, detected_lang = transcribe(audio, lang=lang, model_size=model_size)
        if not lines:
            return {"error": "Brez transkripcije", "candidates": []}
        print(f"📝 {len(lines)} vrstic transkripta", file=sys.stderr)
        clusters = find_repeated_lines(lines, similarity_threshold=0.5)
        print(f"🔁 {len(clusters)} ponavljajočih se sklopov", file=sys.stderr)
        if not clusters:
            return {
                "error": "Ni najdenih ponavljajočih se vrstic",
                "language": detected_lang,
                "candidates": [],
            }
        print("🔊 Analiza energije...", file=sys.stderr)
        energies = compute_energy(audio)
        avg_overall = sum(e for _, e in energies) / max(1, len(energies))
        print(f" Povprečje RMS: {avg_overall:.1f} dB", file=sys.stderr)
        # End of the video, approximated by the end of the last transcribed line
        video_end = max(l["end"] for l in lines)
        # Compute a score for every cluster
        candidates = []
        for cluster_idx, cluster in enumerate(clusters):
            # Cluster representative = the longest line
            rep = max(cluster, key=lambda i: len(lines[i]["text"]))
            rep_text = lines[rep]["text"]
            # Every instance is a potential reel start
            for inst_idx in cluster:
                line = lines[inst_idx]
                # Window of target_duration starting at this line,
                # shifted back if it would run past the end of the video
                start = line["start"]
                if start + target_duration > video_end:
                    start = max(0, video_end - target_duration)
                avg_e = avg_energy_in_range(energies, start, start + target_duration)
                energy_score = max(0, avg_e - avg_overall)  # dB above the average
                # Score: repetition count + energy + line length
                score = (
                    len(cluster) * 10                   # repetition weight
                    + energy_score * 2                  # energy weight
                    + min(len(rep_text.split()), 10)    # text richness
                )
                candidates.append({
                    "start": round(start, 2),
                    "end": round(start + target_duration, 2),
                    "duration": target_duration,
                    "score": round(score, 2),
                    "repetitions": len(cluster),
                    "avg_rms_db": round(avg_e, 1),
                    "energy_above_avg_db": round(energy_score, 1),
                    "text_sample": rep_text[:80],
                    "cluster_id": cluster_idx,
                })
        # Sort by score, dedupe candidates that start within 5 s of each other
        candidates.sort(key=lambda c: -c["score"])
        deduped = []
        for c in candidates:
            if all(abs(c["start"] - d["start"]) > 5 for d in deduped):
                deduped.append(c)
            if len(deduped) >= 5:
                break
        return {
            "language": detected_lang,
            "total_lines": len(lines),
            "clusters_found": len(clusters),
            "candidates": deduped,
        }
    finally:
        Path(audio).unlink(missing_ok=True)
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("input")
    ap.add_argument("--lang", default=None)
    ap.add_argument("--model", default="small",
                    choices=["tiny", "base", "small", "medium", "large-v3"])
    ap.add_argument("--duration", type=float, default=30.0,
                    help="Ciljna dolžina reel-a v s")
    ap.add_argument("--json", action="store_true", help="JSON output")
    args = ap.parse_args()
    src = Path(args.input)
    if not src.exists():
        print(f"{src} ne obstaja", file=sys.stderr)
        sys.exit(1)
    result = find_chorus(src, lang=args.lang, model_size=args.model,
                         target_duration=args.duration)
    if args.json:
        print(json.dumps(result, ensure_ascii=False, indent=2))
    else:
        if "error" in result and not result.get("candidates"):
            print(f"{result['error']}", file=sys.stderr)
            sys.exit(2)
        print(f"\n🎵 Jezik: {result.get('language', '?')}")
        print(f"📋 {result['total_lines']} vrstic, {result['clusters_found']} ponavljanj\n")
        print("🏆 Najboljši kandidati za refren:\n")
        for i, c in enumerate(result["candidates"], 1):
            mins = int(c["start"] // 60)
            secs = c["start"] - mins * 60
            print(f" {i}. {mins}:{secs:05.2f} → +{c['duration']:.0f}s "
                  f"(score={c['score']}, ponovitev={c['repetitions']}, "
                  f"energija={c['energy_above_avg_db']:+.1f} dB)")
            print(f" '{c['text_sample']}'\n")
if __name__ == "__main__":
    main()
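
The two ingredients of the ranking, text repetition and loudness, can be tried without Whisper or FFmpeg. The sketch below restates the bigram Jaccard similarity and the score formula from `find_chorus.py` as standalone functions. The names `bigram_jaccard` and `chorus_score` are hypothetical, and the input values in the demo are made up for illustration.

```python
import re


def normalize_text(s: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace."""
    s = re.sub(r"[^\w\s]", "", s.lower())
    return re.sub(r"\s+", " ", s).strip()


def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the word-bigram sets of two lines."""
    def bigrams(text: str) -> set:
        words = normalize_text(text).split()
        if not words:
            return set()
        # A single-word line is represented by a 1-tuple
        return set(zip(words, words[1:])) if len(words) > 1 else {(words[0],)}

    a_bg, b_bg = bigrams(a), bigrams(b)
    if not a_bg or not b_bg:
        return 0.0
    return len(a_bg & b_bg) / len(a_bg | b_bg)


def chorus_score(repetitions: int, energy_above_avg_db: float, rep_words: int) -> float:
    """Same weights as the script: 10 per repetition, 2 per dB above average, up to 10 for words."""
    return repetitions * 10 + max(0.0, energy_above_avg_db) * 2 + min(rep_words, 10)


if __name__ == "__main__":
    print(bigram_jaccard("La la la love", "la la la love!"))
    print(bigram_jaccard("hello world", "goodbye world"))
    print(chorus_score(repetitions=3, energy_above_avg_db=2.5, rep_words=6))
```

With these weights, one extra repetition outweighs 5 dB of extra loudness, so repetition dominates the ranking.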

scripts/reframe.py Normal file

@ -0,0 +1,306 @@
#!/usr/bin/env python3
"""
reframe.py: convert 16:9 video to 9:16 (reels/shorts/TikTok format).
Modes:
    --mode track  : follows the largest detected face (OpenCV Haar cascade);
                    the crop window moves smoothly with the subject.
    --mode center : static center crop (fastest)
    --mode blur   : 9:16 canvas with a blurred background and the 16:9 video centered
Examples:
    python3 reframe.py input.mp4 output.mp4 --mode track
    python3 reframe.py input.mp4 output.mp4 --mode track --start 10 --duration 30
"""
import argparse
import subprocess
import sys
import os
import json
import tempfile
from pathlib import Path
import cv2
import numpy as np
def get_video_info(path):
    """Return a dict with width, height, fps, duration."""
    cmd = [
        "ffprobe", "-v", "quiet", "-print_format", "json",
        "-show_streams", "-show_format", str(path)
    ]
    data = json.loads(subprocess.check_output(cmd))
    vstream = next(s for s in data["streams"] if s["codec_type"] == "video")
    # r_frame_rate is a fraction such as "30000/1001"
    fps_str = vstream["r_frame_rate"]
    num, den = fps_str.split("/")
    fps = float(num) / float(den)
    return {
        "width": int(vstream["width"]),
        "height": int(vstream["height"]),
        "fps": fps,
        "duration": float(data["format"]["duration"]),
    }
def detect_face_centers(video_path, sample_fps=5):
    """
    Sample the video at sample_fps and collect (timestamp, x_center_normalized) pairs.
    x_center_normalized is in 0..1 (0 = left edge, 1 = right edge).
    If no face is found, the sample's value is None.
    Uses the OpenCV Haar cascade (frontalface_alt2): robust, no external model needed.
    Returns (samples, width, height, src_fps, total_frames).
    """
    cap = cv2.VideoCapture(str(video_path))
    src_fps = cap.get(cv2.CAP_PROP_FPS)
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    step = max(1, int(src_fps / sample_fps))
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml"
    face_cascade = cv2.CascadeClassifier(cascade_path)
    samples = []
    frame_idx = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if frame_idx % step == 0:
            ts = frame_idx / src_fps
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = face_cascade.detectMultiScale(
                gray, scaleFactor=1.2, minNeighbors=5, minSize=(60, 60)
            )
            if len(faces) > 0:
                # Take the largest face
                x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
                x_center = (x + w / 2) / width
                samples.append((ts, x_center))
            else:
                samples.append((ts, None))
        frame_idx += 1
    cap.release()
    return samples, width, height, src_fps, total_frames
def smooth_track(samples, total_duration, smoothing_window=2.0):
    """
    Turn the list of (ts, x) samples into a smooth curve x(t), returned as a function.
    - None values are filled with the last known value (or the 0.5 default).
    - Moving average over smoothing_window seconds.
    """
    # Fill in missing values
    last = 0.5
    filled = []
    for ts, x in samples:
        if x is None:
            x = last
        else:
            last = x
        filled.append((ts, x))
    if not filled:
        return lambda t: 0.5
    # Moving average
    timestamps = np.array([t for t, _ in filled])
    values = np.array([v for _, v in filled])
    smoothed = np.zeros_like(values)
    for i, t in enumerate(timestamps):
        mask = np.abs(timestamps - t) <= smoothing_window / 2
        smoothed[i] = np.mean(values[mask])
    def x_at(t):
        if t <= timestamps[0]:
            return float(smoothed[0])
        if t >= timestamps[-1]:
            return float(smoothed[-1])
        return float(np.interp(t, timestamps, smoothed))
    return x_at
def build_track_filter(info, x_at, target_w, target_h, fps):
    """
    Build the FFmpeg filter for track mode.
    We generate a crop expression that moves with x(t).
    FFmpeg cannot evaluate an arbitrary function of time, so we sample x(t) and
    build a piecewise-linear function out of nested `if(...)` terms.
    For robustness: pre-scale to the target height, then crop with x = f(t).
    """
    src_w = info["width"]
    src_h = info["height"]
    # Scale first: height = target_h, width proportional
    scale_h = target_h
    scale_w = int(src_w * (target_h / src_h))
    # After scaling, the crop width is target_w
    max_x = scale_w - target_w  # largest allowed top-left x
    # Sample x(t) at ~5 fps (smooth enough after smoothing)
    duration = info["duration"]
    n_samples = max(2, int(duration * 5))
    times = np.linspace(0, duration, n_samples)
    x_centers_norm = [x_at(t) for t in times]
    # Convert the normalized center into the top-left x of the crop in scaled space
    x_lefts = []
    for xc in x_centers_norm:
        x_left = xc * scale_w - target_w / 2
        x_left = max(0, min(max_x, x_left))
        x_lefts.append(x_left)
    # FFmpeg limits expression length. Pragmatic choice: a nested if.
    # At 5 fps, 60 s gives 300 branches, which works.
    # For longer videos, resample down to 2 fps.
    if duration > 120:
        n_samples = int(duration * 2)
        times = np.linspace(0, duration, n_samples)
        x_lefts_resampled = []
        for t in times:
            x_lefts_resampled.append(np.interp(t, np.linspace(0, duration, len(x_lefts)), x_lefts))
        x_lefts = x_lefts_resampled
    # Linear interpolation between samples inside the FFmpeg expression
    # Format: if(t<t_i, lerp(x_{i-1}, x_i, (t-t_{i-1})/(t_i-t_{i-1})), continue)
    expr = f"{x_lefts[-1]:.1f}"
    for i in range(len(times) - 1, 0, -1):
        t0, t1 = times[i - 1], times[i]
        x0, x1 = x_lefts[i - 1], x_lefts[i]
        # lerp = x0 + (x1-x0)*(t-t0)/(t1-t0)
        if abs(t1 - t0) < 1e-6:
            lerp = f"{x0:.1f}"
        else:
            lerp = f"({x0:.1f}+({x1 - x0:.1f})*(t-{t0:.3f})/{t1 - t0:.3f})"
        expr = f"if(lt(t,{t1:.3f}),{lerp},{expr})"
    vfilter = (
        f"scale={scale_w}:{scale_h},"
        f"crop={target_w}:{target_h}:'{expr}':0"
    )
    return vfilter
def build_center_filter(info, target_w, target_h):
    src_w = info["width"]
    src_h = info["height"]
    scale_h = target_h
    scale_w = int(src_w * (target_h / src_h))
    return f"scale={scale_w}:{scale_h},crop={target_w}:{target_h}:(in_w-{target_w})/2:0"


def build_blur_filter(info, target_w, target_h):
    """
    9:16 canvas: a blurred copy fills the top and bottom, the original 16:9 sits in the middle.
    """
    # The foreground is target_w wide on the 9:16 canvas, height proportional
    src_w = info["width"]
    src_h = info["height"]
    fg_h = int(target_w * src_h / src_w)
    return (
        f"[0:v]scale={target_w}:{target_h}:force_original_aspect_ratio=increase,"
        f"crop={target_w}:{target_h},gblur=sigma=30[bg];"
        f"[0:v]scale={target_w}:{fg_h}[fg];"
        f"[bg][fg]overlay=0:(H-h)/2"
    )
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("input")
    ap.add_argument("output")
    ap.add_argument("--mode", choices=["track", "center", "blur"], default="track")
    ap.add_argument("--target-width", type=int, default=1080)
    ap.add_argument("--target-height", type=int, default=1920)
    ap.add_argument("--start", type=float, default=None, help="Start time (s)")
    ap.add_argument("--duration", type=float, default=None, help="Duration (s)")
    ap.add_argument("--quality", default="medium", choices=["fast", "medium", "high"])
    args = ap.parse_args()

    src = Path(args.input)
    dst = Path(args.output)
    if not src.exists():
        print(f"❌ Input does not exist: {src}", file=sys.stderr)
        sys.exit(1)

    work_input = src
    tmp_path = None
    try:
        # With --start/--duration, trim into a temp file first (faster).
        # Stream copy cuts on keyframes, so the clip may start slightly
        # earlier than requested.
        if args.start is not None or args.duration is not None:
            tmp = tempfile.NamedTemporaryFile(suffix=".mp4", delete=False)
            tmp.close()
            tmp_path = tmp.name
            cmd = ["ffmpeg", "-y"]
            if args.start is not None:
                cmd += ["-ss", str(args.start)]
            cmd += ["-i", str(src)]
            if args.duration is not None:
                cmd += ["-t", str(args.duration)]
            cmd += ["-c", "copy", tmp_path]
            trim = subprocess.run(cmd, capture_output=True, text=True)
            if trim.returncode != 0:
                print("❌ FFmpeg trim error:", file=sys.stderr)
                print(trim.stderr[-2000:], file=sys.stderr)
                sys.exit(1)
            work_input = Path(tmp_path)
            print(f"✂ Trim → {work_input}")

        info = get_video_info(work_input)
        print(f"📹 Input: {info['width']}x{info['height']} @ {info['fps']:.2f}fps, {info['duration']:.1f}s")

        if args.mode == "track":
            print("🔍 Detecting faces (OpenCV)...")
            samples, _, _, _, _ = detect_face_centers(work_input, sample_fps=5)
            n_with_face = sum(1 for _, x in samples if x is not None)
            print(f"   {n_with_face}/{len(samples)} samples with a face")
            x_at = smooth_track(samples, info["duration"], smoothing_window=2.0)
            vfilter = build_track_filter(info, x_at, args.target_width, args.target_height, info["fps"])
        elif args.mode == "center":
            vfilter = build_center_filter(info, args.target_width, args.target_height)
        elif args.mode == "blur":
            vfilter = build_blur_filter(info, args.target_width, args.target_height)

        preset = {"fast": "veryfast", "medium": "medium", "high": "slow"}[args.quality]
        crf = {"fast": "26", "medium": "21", "high": "18"}[args.quality]
        # Blur mode builds a multi-stream graph, so it needs -filter_complex
        filter_flag = "-filter_complex" if args.mode == "blur" else "-vf"
        cmd = [
            "ffmpeg", "-y", "-i", str(work_input),
            filter_flag, vfilter,
            "-c:v", "libx264", "-preset", preset, "-crf", crf,
            "-c:a", "aac", "-b:a", "128k",
            "-movflags", "+faststart",
            str(dst),
        ]
        print(f"🎬 Render ({args.mode})...")
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print("❌ FFmpeg error:", file=sys.stderr)
            print(result.stderr[-2000:], file=sys.stderr)
            sys.exit(1)
    finally:
        # Remove the temp file even when FFmpeg fails or we exit early
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)

    out_info = get_video_info(dst)
    out_size = dst.stat().st_size / 1024 / 1024
    print(f"{dst}: {out_info['width']}x{out_info['height']}, {out_size:.1f} MB")


if __name__ == "__main__":
    main()

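The track-mode math in `reframe.py` (gap filling, moving average, and the nested `if(lt(t,...))` crop expression) can be tried out without OpenCV, NumPy, or FFmpeg. The following is a minimal sketch under stated assumptions: `fill_gaps`, `moving_average`, and `piecewise_expr` are hypothetical helper names that do not exist in the repository, and the sample data is made up for illustration.

```python
def fill_gaps(samples, default=0.5):
    """Replace None positions with the last known value (default before the first hit)."""
    last = default
    filled = []
    for ts, x in samples:
        if x is None:
            x = last
        else:
            last = x
        filled.append((ts, x))
    return filled


def moving_average(filled, window=2.0):
    """For each sample, average all values whose timestamp is within window/2."""
    out = []
    for t, _ in filled:
        near = [x for ts, x in filled if abs(ts - t) <= window / 2]
        out.append(sum(near) / len(near))
    return out


def piecewise_expr(times, values):
    """Build a nested FFmpeg expression that linearly interpolates values over times."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    expr = f"{values[-1]:.1f}"
    # Walk backwards so the earliest interval ends up as the outermost if()
    for i in range(len(times) - 1, 0, -1):
        t0, t1 = times[i - 1], times[i]
        x0, x1 = values[i - 1], values[i]
        if t1 <= t0:
            raise ValueError("times must be strictly increasing")
        lerp = f"({x0:.1f}+({x1 - x0:.1f})*(t-{t0:.3f})/{t1 - t0:.3f})"
        expr = f"if(lt(t,{t1:.3f}),{lerp},{expr})"
    return expr


# Hypothetical detections: (timestamp in s, normalized face center or None)
samples = [(0.0, None), (1.0, 0.4), (2.0, None), (3.0, 0.8)]
filled = fill_gaps(samples)
smoothed = moving_average(filled, window=2.0)

# Hypothetical crop offsets in pixels at t = 0, 1, 2 s
expr = piecewise_expr([0.0, 1.0, 2.0], [100.0, 200.0, 150.0])
print(expr)
```

Each sample adds one level of nesting, which is why the script lowers the sampling rate for long videos.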
143
scripts/subtitle.py Normal file

@ -0,0 +1,143 @@
#!/usr/bin/env python3
"""
subtitle.py: generate subtitles from a video and burn them into the output.
Uses faster-whisper for transcription and FFmpeg for the burn-in.
Example:
    python3 subtitle.py video.mp4 video_sub.mp4
    python3 subtitle.py video.mp4 video_sub.mp4 --lang sl --model small
    python3 subtitle.py video.mp4 video_sub.mp4 --style reels  # large white centered text
"""
import argparse
import subprocess
import sys
import tempfile
import os
from pathlib import Path
def transcribe(video, lang=None, model_size="small"):
    """Transcribe the video and return the path to a temporary .srt file."""
    from faster_whisper import WhisperModel

    print(f"🧠 Whisper model: {model_size}, lang={lang or 'auto'}")
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(
        str(video),
        language=lang,
        word_timestamps=True,
        vad_filter=True,
    )
    print(f"   Detected: {info.language} (p={info.language_probability:.2f})")

    def fmt_ts(s):
        # Work in integer milliseconds so rounding can never yield "60,000" seconds
        ms = int(round(s * 1000))
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        sec, ms = divmod(ms, 1000)
        return f"{h:02d}:{m:02d}:{sec:02d},{ms:03d}"

    srt_file = tempfile.NamedTemporaryFile(suffix=".srt", delete=False, mode="w", encoding="utf-8")
    idx = 0

    def write_cue(start, end, text):
        nonlocal idx
        idx += 1
        srt_file.write(f"{idx}\n{fmt_ts(start)} --> {fmt_ts(end)}\n{text}\n\n")

    # Word-level chunked subtitles: up to 4 words per cue, split early at sentence end
    for seg in segments:
        words = seg.words or []
        if not words:
            write_cue(seg.start, seg.end, seg.text.strip())
            continue
        group = []
        for w in words:
            group.append(w)
            if len(group) >= 4 or w.word.strip().endswith((".", "?", "!")):
                write_cue(group[0].start, group[-1].end, "".join(g.word for g in group).strip())
                group = []
        if group:
            write_cue(group[0].start, group[-1].end, "".join(g.word for g in group).strip())

    srt_file.close()
    print(f"📝 SRT: {srt_file.name} ({idx} cues)")
    return srt_file.name
SUBTITLE_STYLES = {
"reels": (
"FontName=Arial,FontSize=18,Bold=1,"
"PrimaryColour=&H00FFFFFF,OutlineColour=&H00000000,BackColour=&H80000000,"
"Outline=2,Shadow=0,Alignment=2,MarginV=180,BorderStyle=1"
),
"yellow": (
"FontName=Arial,FontSize=20,Bold=1,"
"PrimaryColour=&H0000FFFF,OutlineColour=&H00000000,"
"Outline=3,Shadow=0,Alignment=2,MarginV=200,BorderStyle=1"
),
"minimal": (
"FontName=Arial,FontSize=14,"
"PrimaryColour=&H00FFFFFF,OutlineColour=&H80000000,"
"Outline=1,Shadow=0,Alignment=2,MarginV=80,BorderStyle=1"
),
}
def burn_subtitles(video, srt, output, style="reels"):
    style_str = SUBTITLE_STYLES.get(style, SUBTITLE_STYLES["reels"])
    # Escape the .srt path for the FFmpeg subtitles filter
    srt_escaped = srt.replace("\\", "\\\\").replace(":", "\\:").replace("'", r"\'")
    vf = f"subtitles='{srt_escaped}':force_style='{style_str}'"
    cmd = [
        "ffmpeg", "-y", "-i", str(video),
        "-vf", vf,
        "-c:v", "libx264", "-preset", "medium", "-crf", "21",
        "-c:a", "copy",
        "-movflags", "+faststart",
        str(output),
    ]
    print("🔥 Burning in subtitles...")
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print("❌ FFmpeg error:", file=sys.stderr)
        print(result.stderr[-2000:], file=sys.stderr)
        sys.exit(1)
    print(f"Done: {output}")
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("input")
    ap.add_argument("output")
    ap.add_argument("--lang", default=None, help="Language code (sl, de, en, ...); omit to auto-detect")
    ap.add_argument("--model", default="small", choices=["tiny", "base", "small", "medium", "large-v3"])
    ap.add_argument("--style", default="reels", choices=list(SUBTITLE_STYLES.keys()))
    ap.add_argument("--keep-srt", action="store_true", help="Keep the .srt next to the output")
    args = ap.parse_args()

    src = Path(args.input)
    if not src.exists():
        print(f"❌ {src} does not exist", file=sys.stderr)
        sys.exit(1)

    srt = transcribe(src, lang=args.lang, model_size=args.model)
    burn_subtitles(src, srt, args.output, style=args.style)
    if args.keep_srt:
        # os.rename fails when the temp dir and the output are on different
        # filesystems (e.g. /tmp vs. a mounted /data volume); shutil.move copies instead
        import shutil
        keep_path = Path(args.output).with_suffix(".srt")
        shutil.move(srt, keep_path)
        print(f"💾 SRT saved: {keep_path}")
    else:
        os.unlink(srt)


if __name__ == "__main__":
    main()

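The cue-building logic in `subtitle.py` (groups of about four words, cut early at sentence-ending punctuation) can be exercised without running Whisper. This is a minimal sketch, assuming words arrive as plain `(text, start, end)` tuples rather than faster-whisper word objects; `srt_timestamp`, `group_words`, and `to_srt` are hypothetical names used only for illustration.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm using integer milliseconds."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def group_words(words, max_words=4):
    """Group (text, start, end) words into (start, end, text) cues."""
    cues, group = [], []

    def flush():
        text = "".join(w[0] for w in group).strip()
        cues.append((group[0][1], group[-1][2], text))
        group.clear()

    for word in words:
        group.append(word)
        # Close the cue when it is full or the word ends a sentence
        if len(group) >= max_words or word[0].strip().endswith((".", "?", "!")):
            flush()
    if group:
        flush()
    return cues


def to_srt(cues):
    """Render cues as SRT text with 1-based numbering."""
    blocks = []
    for i, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


# Made-up word timings; Whisper-style tokens carry a leading space
words = [
    (" Hello", 0.0, 0.4), (" world.", 0.4, 0.9),
    (" One", 1.0, 1.2), (" two", 1.2, 1.4), (" three", 1.4, 1.6),
    (" four", 1.6, 1.8), (" five", 1.8, 2.0),
]
cues = group_words(words)
srt_text = to_srt(cues)
print(srt_text)
```

Formatting from integer milliseconds avoids a rounding edge case in which a value such as 59.9996 s would otherwise print as 60 seconds within a minute.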
80
scripts/yt_download.py Normal file

@ -0,0 +1,80 @@
#!/usr/bin/env python3
"""
yt_download.py: download a YouTube video in 1080p (16:9) for the reels pipeline.
Example:
    python3 yt_download.py "https://youtu.be/dQw4w9WgXcQ" /data/uploads/video.mp4
"""
import argparse
import subprocess
import sys
from pathlib import Path
import json
def download(url, output, max_height=1080, format_str=None):
    """
    Download a YouTube video. Default: best MP4 up to max_height, with an audio track.
    """
    if format_str is None:
        format_str = (
            f"bestvideo[height<={max_height}][ext=mp4]+bestaudio[ext=m4a]/"
            f"best[height<={max_height}][ext=mp4]/best"
        )
    cmd = [
        "yt-dlp",
        "-f", format_str,
        "--merge-output-format", "mp4",
        "--no-playlist",
        "--write-info-json",
        "--restrict-filenames",
        "-o", str(output),
        url,
    ]
    print(f"⬇ Downloading {url}...", file=sys.stderr)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"❌ yt-dlp error:\n{result.stderr[-1500:]}", file=sys.stderr)
        sys.exit(1)
    print(f"Saved: {output}", file=sys.stderr)
    return output
def get_info(url):
    """Return the metadata without downloading; None if yt-dlp fails."""
    cmd = ["yt-dlp", "--dump-json", "--no-playlist", url]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return json.loads(result.stdout.strip().split("\n")[0])


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("url")
    # Optional so that --info-only can be used without an output path
    ap.add_argument("output", nargs="?")
    ap.add_argument("--max-height", type=int, default=1080)
    ap.add_argument("--info-only", action="store_true",
                    help="Print metadata only, do not download")
    args = ap.parse_args()

    if args.info_only:
        info = get_info(args.url)
        if info:
            print(json.dumps({
                "title": info.get("title"),
                "duration": info.get("duration"),
                "uploader": info.get("uploader"),
                "thumbnail": info.get("thumbnail"),
            }, indent=2))
        else:
            print("❌ Could not fetch video info", file=sys.stderr)
            sys.exit(1)
        return

    if not args.output:
        ap.error("output is required unless --info-only is given")
    download(args.url, args.output, max_height=args.max_height)


if __name__ == "__main__":
    main()

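The format selector passed to `yt-dlp -f` is a fallback chain: separate MP4 video plus M4A audio first, then a single pre-merged MP4, then whatever is best. The sketch below only assembles the selector and the argument list so they can be inspected without network access; `build_format` and `build_command` are hypothetical helpers, not part of the script, and the flags are limited to those the script itself uses.

```python
def build_format(max_height=1080):
    """Return a yt-dlp format selector with three fallbacks separated by '/'."""
    return (
        f"bestvideo[height<={max_height}][ext=mp4]+bestaudio[ext=m4a]/"
        f"best[height<={max_height}][ext=mp4]/best"
    )


def build_command(url, output, max_height=1080):
    """Assemble the yt-dlp argument list as a list (no shell quoting needed)."""
    return [
        "yt-dlp",
        "-f", build_format(max_height),
        "--merge-output-format", "mp4",
        "--no-playlist",
        "-o", str(output),
        url,
    ]


# Example URL taken from the script's docstring; nothing is downloaded here
cmd = build_command("https://youtu.be/dQw4w9WgXcQ", "/data/uploads/video.mp4", max_height=720)
print(" ".join(cmd))
```

Passing the command as a list to `subprocess.run` keeps the URL as a single argument, so characters such as `&` in a URL are not interpreted by a shell.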
534
templates/index.html Normal file

@ -0,0 +1,534 @@
<!DOCTYPE html>
<html lang="sl">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Reels Clipper · biba.live</title>
<style>
:root {
--bg: #0d0e12;
--panel: #1a1c24;
--panel-2: #232631;
--border: #2d3142;
--text: #e6e8ed;
--muted: #8a8fa3;
--accent: #DC1C4C;
--accent-2: #ff3a6e;
--success: #3ec98f;
--warn: #f0b03b;
--error: #ef4444;
}
* { box-sizing: border-box; }
html, body { margin: 0; padding: 0; }
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", system-ui, sans-serif;
background: var(--bg);
color: var(--text);
min-height: 100vh;
line-height: 1.5;
}
header {
padding: 24px 32px;
border-bottom: 1px solid var(--border);
display: flex;
align-items: center;
gap: 16px;
}
header h1 {
margin: 0;
font-size: 22px;
font-weight: 700;
letter-spacing: -0.3px;
}
.accent-mark {
display: inline-block;
background: var(--accent);
padding: 2px 8px;
border-radius: 4px;
font-weight: 800;
color: white;
margin-right: 4px;
}
main {
max-width: 1100px;
margin: 0 auto;
padding: 32px;
display: grid;
grid-template-columns: 1fr 1fr;
gap: 24px;
}
@media (max-width: 800px) {
main { grid-template-columns: 1fr; }
}
.card {
background: var(--panel);
border: 1px solid var(--border);
border-radius: 12px;
padding: 20px;
}
.card h2 {
margin: 0 0 16px;
font-size: 16px;
text-transform: uppercase;
letter-spacing: 0.6px;
color: var(--muted);
}
.dropzone {
border: 2px dashed var(--border);
border-radius: 10px;
padding: 40px 20px;
text-align: center;
cursor: pointer;
transition: all 0.15s ease;
}
.dropzone:hover, .dropzone.drag {
border-color: var(--accent);
background: rgba(220, 28, 76, 0.05);
}
.dropzone svg { width: 48px; height: 48px; opacity: 0.5; margin-bottom: 8px; }
.dropzone .small { color: var(--muted); font-size: 13px; }
input[type="text"], input[type="url"], select, input[type="number"] {
width: 100%;
background: var(--panel-2);
border: 1px solid var(--border);
border-radius: 8px;
padding: 10px 12px;
color: var(--text);
font-size: 14px;
font-family: inherit;
}
input:focus, select:focus { outline: 2px solid var(--accent); outline-offset: -1px; }
label { display: block; font-size: 13px; color: var(--muted); margin-bottom: 6px; margin-top: 12px; }
.row { display: grid; grid-template-columns: 1fr 1fr; gap: 12px; }
button {
background: var(--accent);
color: white;
border: none;
padding: 11px 20px;
border-radius: 8px;
font-weight: 600;
cursor: pointer;
font-size: 14px;
transition: background 0.15s;
}
button:hover { background: var(--accent-2); }
button:disabled { opacity: 0.5; cursor: not-allowed; }
button.ghost { background: transparent; color: var(--muted); border: 1px solid var(--border); }
button.ghost:hover { background: var(--panel-2); color: var(--text); }
button.small { padding: 6px 12px; font-size: 12px; }
.full-width { grid-column: 1 / -1; }
.jobs-list { display: flex; flex-direction: column; gap: 10px; }
.job {
background: var(--panel-2);
border: 1px solid var(--border);
border-radius: 10px;
padding: 14px;
display: flex;
flex-direction: column;
gap: 8px;
}
.job-head { display: flex; justify-content: space-between; align-items: center; gap: 12px; }
.job-title { font-weight: 600; font-size: 14px; flex: 1; min-width: 0; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; }
.badge { padding: 3px 10px; border-radius: 99px; font-size: 11px; font-weight: 600; }
.badge.queued { background: rgba(138, 143, 163, 0.15); color: var(--muted); }
.badge.processing, .badge.downloading { background: rgba(240, 176, 59, 0.15); color: var(--warn); }
.badge.done { background: rgba(62, 201, 143, 0.15); color: var(--success); }
.badge.failed { background: rgba(239, 68, 68, 0.15); color: var(--error); }
.badge.uploaded { background: rgba(220, 28, 76, 0.15); color: var(--accent); }
.progress { height: 4px; background: var(--border); border-radius: 99px; overflow: hidden; }
.progress-bar { height: 100%; background: var(--accent); width: 0%; transition: width 0.3s; }
.progress-bar.indeterminate {
width: 30%;
animation: shimmer 1.5s linear infinite;
}
@keyframes shimmer {
0% { margin-left: -30%; }
100% { margin-left: 100%; }
}
.step { font-size: 12px; color: var(--muted); }
.meta { font-size: 11px; color: var(--muted); display: flex; gap: 12px; flex-wrap: wrap; }
.actions { display: flex; gap: 8px; flex-wrap: wrap; margin-top: 4px; }
.error-text { color: var(--error); font-size: 12px; }
video { width: 100%; max-height: 400px; border-radius: 8px; background: black; }
.empty { color: var(--muted); text-align: center; padding: 40px 20px; font-size: 14px; }
.toggle { display: flex; align-items: center; gap: 8px; cursor: pointer; user-select: none; font-size: 13px; }
.toggle input { width: auto; }
.tabs { display: flex; gap: 4px; margin-bottom: 16px; border-bottom: 1px solid var(--border); }
.tab { padding: 10px 14px; cursor: pointer; color: var(--muted); border-bottom: 2px solid transparent; font-size: 14px; }
.tab.active { color: var(--text); border-bottom-color: var(--accent); }
.hidden { display: none !important; }
code { background: var(--panel-2); padding: 1px 6px; border-radius: 3px; font-family: ui-monospace, monospace; font-size: 12px; }
</style>
</head>
<body>
<header>
<h1><span class="accent-mark">1]</span> reels clipper</h1>
<span style="color: var(--muted); font-size: 13px;">biba.live</span>
</header>
<main>
<!-- ─── INPUT ───────────────────────────────────── -->
<section class="card">
<h2>nov reel</h2>
<div class="tabs">
<div class="tab active" data-tab="upload">Upload</div>
<div class="tab" data-tab="youtube">YouTube</div>
</div>
<div id="tab-upload">
<div class="dropzone" id="dropzone">
<svg viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path d="M21 15v4a2 2 0 0 1-2 2H5a2 2 0 0 1-2-2v-4"/>
<polyline points="17 8 12 3 7 8"/>
<line x1="12" y1="3" x2="12" y2="15"/>
</svg>
<div>Klikni ali povleci video sem</div>
<div class="small">.mp4, .mov, .webm — do 2 GB</div>
<input type="file" id="file-input" accept="video/*" style="display:none">
</div>
</div>
<div id="tab-youtube" class="hidden">
<label>YouTube URL</label>
<input type="url" id="yt-url" placeholder="https://www.youtube.com/watch?v=...">
</div>
<label>Način reframe</label>
<select id="mode">
<option value="track">Track (sledi obrazu — intervjuji, vlogi)</option>
<option value="center">Center (statična kamera)</option>
<option value="blur">Blur (glasba, koncerti)</option>
</select>
<div class="row">
<div>
<label>Jezik podnapisov</label>
<select id="lang">
<option value="">Auto detect</option>
<option value="sl">Slovenščina</option>
<option value="de">Deutsch</option>
<option value="en">English</option>
<option value="hr">Hrvatski</option>
<option value="sr">Српски</option>
</select>
</div>
<div>
<label>Whisper model</label>
<select id="model">
<option value="tiny">tiny (najhitrejši)</option>
<option value="base">base</option>
<option value="small" selected>small (priporočeno)</option>
<option value="medium">medium (zelo dobro)</option>
<option value="large-v3">large-v3 (najboljše)</option>
</select>
</div>
</div>
<label class="toggle" style="margin-top: 16px;">
<input type="checkbox" id="auto-chorus" checked>
Avto-detekcija refrena (priporočeno za glasbo)
</label>
<div id="manual-times" class="row hidden">
<div>
<label>Začetek (sekunde ali mm:ss)</label>
<input type="text" id="start" placeholder="npr. 1:24">
</div>
<div>
<label>Trajanje (s)</label>
<input type="number" id="duration" value="30" min="5" max="180">
</div>
</div>
<div class="row">
<div>
<label>Stil podnapisov</label>
<select id="subtitle-style">
<option value="reels">Reels (TikTok beli)</option>
<option value="yellow">Yellow (MrBeast)</option>
<option value="minimal">Minimal</option>
</select>
</div>
<div>
<label>Kvaliteta</label>
<select id="quality">
<option value="fast">Fast (preview)</option>
<option value="medium" selected>Medium (objava)</option>
<option value="high">High (arhiv)</option>
</select>
</div>
</div>
<label class="toggle" style="margin-top: 12px;">
<input type="checkbox" id="no-subs">
Brez podnapisov
</label>
<button id="submit-btn" class="full-width" style="margin-top: 20px; width: 100%;">
Naredi reel
</button>
<div id="upload-progress" class="hidden" style="margin-top: 12px;">
<div class="step" id="upload-status">Nalaganje...</div>
<div class="progress"><div class="progress-bar" id="upload-bar"></div></div>
</div>
</section>
<!-- ─── JOBS ────────────────────────────────────── -->
<section class="card">
<h2>moji reels</h2>
<div class="jobs-list" id="jobs-list">
<div class="empty">Še ni obdelav</div>
</div>
</section>
</main>
<script>
const $ = (s) => document.querySelector(s);
const $$ = (s) => document.querySelectorAll(s);
// ─── Tabs ───────────────────────────────────────
$$(".tab").forEach(t => {
t.addEventListener("click", () => {
$$(".tab").forEach(x => x.classList.remove("active"));
t.classList.add("active");
const target = t.dataset.tab;
$("#tab-upload").classList.toggle("hidden", target !== "upload");
$("#tab-youtube").classList.toggle("hidden", target !== "youtube");
});
});
// ─── Auto-chorus toggle ─────────────────────────
$("#auto-chorus").addEventListener("change", e => {
$("#manual-times").classList.toggle("hidden", e.target.checked);
});
// ─── Drag & drop ────────────────────────────────
const dz = $("#dropzone");
const fileInput = $("#file-input");
let pendingFile = null;
dz.addEventListener("click", () => fileInput.click());
fileInput.addEventListener("change", () => {
if (fileInput.files[0]) {
pendingFile = fileInput.files[0];
dz.querySelector("div").textContent = `📹 ${pendingFile.name}`;
}
});
["dragover", "dragenter"].forEach(ev =>
dz.addEventListener(ev, e => { e.preventDefault(); dz.classList.add("drag"); }));
["dragleave", "drop"].forEach(ev =>
dz.addEventListener(ev, e => { e.preventDefault(); dz.classList.remove("drag"); }));
dz.addEventListener("drop", e => {
const f = e.dataTransfer.files[0];
if (f) {
pendingFile = f;
dz.querySelector("div").textContent = `📹 ${f.name}`;
}
});
// ─── Settings collector ─────────────────────────
function collectSettings() {
return {
mode: $("#mode").value,
lang: $("#lang").value || null,
whisper_model: $("#model").value,
auto_chorus: $("#auto-chorus").checked,
start: !$("#auto-chorus").checked && $("#start").value ? parseTimestamp($("#start").value) : null,
duration: parseFloat($("#duration").value) || 30,
subtitle_style: $("#subtitle-style").value,
quality: $("#quality").value,
no_subs: $("#no-subs").checked,
};
}
function parseTimestamp(s) {
s = s.trim();
if (s.includes(":")) {
const parts = s.split(":").map(parseFloat);
if (parts.length === 2) return parts[0] * 60 + parts[1];
if (parts.length === 3) return parts[0] * 3600 + parts[1] * 60 + parts[2];
}
return parseFloat(s);
}
// ─── Submit ─────────────────────────────────────
// Upload with progress reporting; resolves with the created job
function uploadFile(file) {
  return new Promise((resolve, reject) => {
    const fd = new FormData();
    fd.append("file", file);
    const xhr = new XMLHttpRequest();
    xhr.upload.onprogress = e => {
      if (e.lengthComputable) {
        const pct = (e.loaded / e.total) * 100;
        $("#upload-bar").style.width = pct + "%";
        $("#upload-status").textContent = `Nalagam... ${pct.toFixed(0)}%`;
      }
    };
    xhr.onload = () => {
      if (xhr.status !== 200) {
        reject(new Error("Upload napaka: " + xhr.responseText));
        return;
      }
      try {
        resolve(JSON.parse(xhr.responseText));
      } catch (err) {
        reject(err);
      }
    };
    xhr.onerror = () => reject(new Error("Upload napaka: napaka omrežja"));
    xhr.open("POST", "/api/upload");
    xhr.send(fd);
  });
}
$("#submit-btn").addEventListener("click", async () => {
  const isYT = !$("#tab-youtube").classList.contains("hidden");
  const settings = collectSettings();
  // Validate before changing any UI state
  const url = $("#yt-url").value.trim();
  if (isYT && !url) { alert("Vpiši YouTube URL"); return; }
  if (!isYT && !pendingFile) { alert("Izberi datoteko"); return; }
  $("#submit-btn").disabled = true;
  $("#upload-progress").classList.remove("hidden");
  try {
    let job;
    if (isYT) {
      $("#upload-status").textContent = "Pošiljam YouTube job...";
      const r = await fetch("/api/youtube", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ url, ...settings }),
      });
      if (!r.ok) throw new Error("YouTube submit napaka");
      job = await r.json();
    } else {
      // Await the upload so the progress bar stays visible until it finishes
      job = await uploadFile(pendingFile);
      $("#upload-status").textContent = "Naloženo, začenjam obdelavo...";
      const proc = await fetch("/api/process", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ job_id: job.id, ...settings }),
      });
      if (!proc.ok) throw new Error("Process start napaka");
    }
    watchJob(job.id);
    refreshJobs();
  } catch (e) {
    alert("Napaka: " + e.message);
  } finally {
    // Reached only after the request or upload has completed or failed
    setTimeout(() => {
      $("#upload-progress").classList.add("hidden");
      $("#upload-bar").style.width = "0%";
      $("#submit-btn").disabled = false;
      pendingFile = null;
      fileInput.value = "";
      dz.querySelector("div").textContent = "Klikni ali povleci video sem";
    }, 2000);
  }
});
// ─── Watch job (SSE) ────────────────────────────
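// One EventSource per job: refreshJobs() runs every 10 s and would otherwise
// open a new stream for the same job on every call
const watchers = new Map(); // jobId → EventSource
function watchJob(jobId) {
  if (watchers.has(jobId)) return;
  const evt = new EventSource(`/api/stream/${jobId}`);
  watchers.set(jobId, evt);
  const stop = () => {
    evt.close();
    watchers.delete(jobId);
  };
  evt.onmessage = (e) => {
    let job;
    try {
      job = JSON.parse(e.data);
    } catch {
      return; // ignore malformed events
    }
    updateJobInList(job);
    if (job.status === "done" || job.status === "failed") {
      stop();
      refreshJobs();
    }
  };
  evt.onerror = stop;
}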
// ─── Jobs list ──────────────────────────────────
async function refreshJobs() {
const r = await fetch("/api/jobs");
if (!r.ok) return;
const data = await r.json();
const list = $("#jobs-list");
if (!data.jobs.length) {
list.innerHTML = '<div class="empty">Še ni obdelav</div>';
return;
}
list.innerHTML = "";
data.jobs.forEach(j => list.appendChild(buildJobEl(j)));
// Watch any in-progress job
data.jobs.forEach(j => {
if (["queued", "processing", "downloading", "uploaded"].includes(j.status)) {
watchJob(j.id);
}
});
}
function updateJobInList(job) {
const existing = document.getElementById(`job-${job.id}`);
const el = buildJobEl(job);
if (existing) {
existing.replaceWith(el);
} else {
const list = $("#jobs-list");
if (list.querySelector(".empty")) list.innerHTML = "";
list.prepend(el);
}
}
// Escape text before interpolating it into innerHTML: filenames, URLs and
// error messages are not trusted input
function esc(s) {
  return String(s).replace(/[&<>"']/g, c => ({
    "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;",
  }[c]));
}
function buildJobEl(job) {
  const el = document.createElement("div");
  el.className = "job";
  el.id = `job-${job.id}`;
  const title = esc(job.source_type === "youtube"
    ? (job.youtube_url || "YouTube")
    : (job.filename || job.id));
  const sizeStr = job.output_size_mb ? `${job.output_size_mb} MB` :
    job.size_mb ? `${job.size_mb} MB` : "";
  const statusLabel = {
    queued: "v vrsti", uploaded: "naloženo", processing: "obdeluje",
    downloading: "prenaša", done: "končano", failed: "napaka",
  }[job.status] || job.status;
  const isProcessing = ["queued", "processing", "downloading"].includes(job.status);
  const showBar = isProcessing ? '<div class="progress"><div class="progress-bar indeterminate"></div></div>' : "";
  const actions = [];
  if (job.status === "done") {
    actions.push(`<button class="small" onclick="window.open('/api/download/${job.id}')">⬇ Download</button>`);
    actions.push(`<button class="small ghost" onclick="previewJob('${job.id}')">▶ Preview</button>`);
  }
  actions.push(`<button class="small ghost" onclick="deleteJob('${job.id}')">Izbriši</button>`);
  el.innerHTML = `
    <div class="job-head">
      <div class="job-title" title="${title}">${title}</div>
      <span class="badge ${esc(job.status)}">${esc(statusLabel)}</span>
    </div>
    ${job.current_step ? `<div class="step">${esc(job.current_step)}</div>` : ""}
    ${showBar}
    ${job.error ? `<div class="error-text">⚠ ${esc(job.error)}</div>` : ""}
    <div class="meta">
      <span>${job.source_type === "youtube" ? "YouTube" : "Upload"}</span>
      ${sizeStr ? `<span>${esc(sizeStr)}</span>` : ""}
      ${job.mode ? `<span>${esc(job.mode)}</span>` : ""}
      ${job.lang ? `<span>${esc(job.lang)}</span>` : ""}
    </div>
    <div class="actions">${actions.join("")}</div>
    ${job.status === "done" ? `<video id="video-${job.id}" class="hidden" controls></video>` : ""}
  `;
  return el;
}
async function deleteJob(id) {
if (!confirm("Izbrišem ta job?")) return;
await fetch(`/api/jobs/${id}`, { method: "DELETE" });
refreshJobs();
}
function previewJob(id) {
const v = document.getElementById(`video-${id}`);
v.src = `/api/preview/${id}`;
v.classList.remove("hidden");
v.play();
}
refreshJobs();
setInterval(refreshJobs, 10000);
</script>
</body>
</html>