When Whisper hallucinates (generates fake lyrics that do not match the audio), the LLM can now use the original filename as a hint to recognize the song and override the false transcript with the actual lyrics.

Pipeline:
1. Pass the filename (e.g. 'Ben Zucker - Bonnie und Clyde') as a hint
2. Whisper transcribes (may hallucinate)
3. Claude/Gemini reads filename + transcript:
   - Recognizes the song from the filename hint
   - Compares Whisper output to the known lyrics
   - Replaces hallucinated text with the real lyrics (preserves timestamps)
   - If a segment can't be fixed, removes it (better silent than wrong)

Also added Whisper anti-hallucination params:
- beam_size=5 (more careful decoding vs. greedy)
- hallucination_silence_threshold=2.0 (skip text in long silences)
| File |
|---|
| main.py |
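The correction step described in the commit message could look roughly like the sketch below. It is not the code from main.py: `Segment`, `filename_hint`, `build_prompt`, `apply_corrections`, and the prompt wording are all hypothetical names chosen for illustration, and the actual LLM call is left out. Only the two decoding values (`beam_size=5`, `hallucination_silence_threshold=2.0`) come from the message itself; how they reach Whisper depends on the library in use.

```python
import os
from dataclasses import dataclass
from typing import Dict, List, Optional

# Decoding options named in the commit message. Passing them as keyword
# arguments to a transcribe() call is an assumption about the library used.
WHISPER_OPTIONS = {
    "beam_size": 5,
    "hallucination_silence_threshold": 2.0,
}


@dataclass
class Segment:
    """One transcribed segment (hypothetical structure)."""
    start: float
    end: float
    text: str


def filename_hint(path: str) -> str:
    """Drop the directory and extension so only 'Artist - Title' remains."""
    return os.path.splitext(os.path.basename(path))[0]


def build_prompt(hint: str, segments: List[Segment]) -> str:
    """Combine the filename hint and the raw transcript into one LLM prompt."""
    lines = [
        f"[{i}] {s.start:.2f}-{s.end:.2f}: {s.text}"
        for i, s in enumerate(segments)
    ]
    return (
        f"Filename hint: {hint}\n"
        "If you recognize the song, compare each segment to the known lyrics.\n"
        "Return corrected text per segment index, or null if a segment "
        "cannot be fixed.\n\n" + "\n".join(lines)
    )


def apply_corrections(
    segments: List[Segment],
    corrections: Dict[int, Optional[str]],
) -> List[Segment]:
    """Apply per-index corrections while keeping the original timestamps.

    A missing index means the segment is kept unchanged; a value of None
    means the segment could not be fixed and is dropped.
    """
    result = []
    for i, seg in enumerate(segments):
        if i not in corrections:
            result.append(seg)
        elif corrections[i] is None:
            continue  # better silent than wrong
        else:
            result.append(Segment(seg.start, seg.end, corrections[i]))
    return result
```

In this sketch the LLM is expected to return a mapping from segment index to corrected text, so timestamps never pass through the model and cannot be altered by it. If the openai-whisper package is the one in use (an assumption), its documentation describes `hallucination_silence_threshold` as taking effect only together with `word_timestamps=True`.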