reels-app

main.py	Fix SRT subtitle timing: use word-level timestamps for chunk boundaries	2026-04-30 04:02:09 +00:00
telegram.py	Multi-upload batch queue + Telegram notifications	2026-04-29 15:12:38 +00:00

Sebastjan Artič dc1cb1ad27 Fix SRT subtitle timing: use word-level timestamps for chunk boundaries

Bug: BRAJDE reel showed subtitles 2-3 seconds out of sync with audio.

Soniox returned correct word timestamps:
- 'Ajmo,' at 41.82s
- 'Janezi!' at 42.18s
- 'Pejd' greva, ajde,' at 43.44-44.40s

But generate_srt_from_segments() ignored word timestamps and split long
segments into evenly spaced chunks of at most 2.5s, based only on segment
duration:

  chunk_dur = duration / n_parts   # assumes even pacing
  for i in range(n_parts):
      cs = rel_start + i * chunk_dur

This produces wrong timing because singers don't pace their words evenly.
The real audio had 'Ajmo, Janezi!' in 0.9s and 'Pejd' greva, ajde, na
traktorju od Majde' in 6s, so the evenly spaced chunks did not line up with
the vocals.

Fix: when word-level timestamps are available (Soniox/Scribe), group
words into chunks where each chunk's start/end match the actual first/last
word timestamps. Each chunk is at most MAX_CHUNK_DURATION (2.5s) but
respects natural word boundaries.
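
The grouping described above could look roughly like the sketch below. It is
an illustration only, not the code from main.py: the Word dataclass and the
group_words() helper are hypothetical names, and the greedy "close the chunk
when the next word would exceed the limit" rule is an assumption about how
the limit is applied.

```python
from dataclasses import dataclass

MAX_CHUNK_DURATION = 2.5  # seconds, value taken from the commit message


@dataclass
class Word:
    # Hypothetical container for one transcribed word (not from main.py).
    text: str
    start: float  # seconds, relative to the start of the reel
    end: float


def group_words(words, max_dur=MAX_CHUNK_DURATION):
    """Greedily group words into (start, end, text) chunks.

    A chunk starts at its first word's start and ends at its last word's end.
    A new chunk begins when adding the next word would make the current
    chunk longer than max_dur.
    """
    chunks = []
    current = []
    for w in words:
        if current and w.end - current[0].start > max_dur:
            chunks.append(current)
            current = []
        current.append(w)
    if current:
        chunks.append(current)
    return [
        (c[0].start, c[-1].end, " ".join(x.text for x in c))
        for c in chunks
    ]
```

Note that in this sketch a single word longer than max_dur still becomes its
own chunk, so the limit is a target rather than a hard guarantee.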

Before:
  00:00.000 → 01.900  AJMO, JANEZI! PEJD' GREVA, AJDE, NA TRAKTORJU OD
  00:01.900 → 03.800  MAJDE, NOBEN NAJU NE NAJDE, KO PELJEM TE

After:
  00:00.020 → 02.120  AJMO, JANEZI! PEJD' GREVA,
  00:02.360 → 04.820  AJDE, NA TRAKTORJU OD MAJDE, NOBEN
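
The times above are abbreviated. In an SRT file each cue uses the
HH:MM:SS,mmm form, so a conversion along these lines is needed somewhere.
This is a minimal sketch; srt_timestamp() is a hypothetical helper name, not
a function confirmed to exist in main.py.

```python
def srt_timestamp(seconds):
    """Format a non-negative time in seconds as an SRT timestamp."""
    # Work in whole milliseconds to avoid float formatting surprises.
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"
```

With this helper the second "After" cue would be written as
00:00:02,360 --> 00:00:04,820.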

Subtitle chunk boundaries now follow the word timestamps, so the text lines
up with the vocals.

2026-04-30 04:02:09 +00:00
