Fix CP1250 encoding bug in sync_qnet.py: È→Č
PROBLEM: Songs.txt on the MB Windows players is encoded in CP1250 (Slovenian/Central European), NOT Windows-1252 (Western European). iconv -f WINDOWS-1252 misread 'Č' (0xC8) as 'È', so 811 records in the Qnet database contained 'È' in place of 'Č' (e.g. 'POSKOÈNI', 'ÈAS ZA ZABAVO', 'STORŽIÈ'). As a result, whenever qnet_match linked a job to one of these mislabeled records, it filled 'parsed_title' with mojibake from the Qnet database (15 jobs affected).

FIX: WINDOWS-1252 → WINDOWS-1250.

Differences between CP1250 and CP1252 (Slavic letters):
Č↔È, č↔è, Ć↔Æ, ć↔æ, Đ↔Ð, đ↔ð, Ń↔Ñ, Ł↔£, ł↔³, Ś↔Œ, ś↔œ, ź↔Ÿ
Ž, š, ž are unaffected (same byte in both code pages).

BACKFILL (separate script, already applied):
- Qnet lookup: 2746 fields corrected across 20860 records
- Qnet songs.json: 2856 fields
- 15 jobs: parsed_artist/title corrected to proper UTF-8
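The byte-level mix-up described above can be reversed in code. The following is a minimal sketch of that idea, not the backfill script that was actually applied (which is not shown in this commit); the function name `repair_cp1252_mojibake` is hypothetical. It assumes the damaged string came from CP1250 bytes that were decoded as CP1252, so re-encoding as CP1252 recovers the original bytes.

```python
def repair_cp1252_mojibake(text: str) -> str:
    """Undo 'CP1250 bytes decoded as CP1252' for a single string.

    Hypothetical helper for illustration only. Text that cannot be
    round-tripped (for example, already-correct text containing 'Č')
    is returned unchanged.
    """
    try:
        # Re-encoding as CP1252 recovers the original bytes;
        # decoding those bytes as CP1250 gives the intended letters.
        return text.encode("cp1252").decode("cp1250")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    # The three damaged titles quoted in the commit message.
    for broken in ("POSKOÈNI", "ÈAS ZA ZABAVO", "STORŽIÈ"):
        print(broken, "->", repair_cp1252_mojibake(broken))
```

A caveat with this approach: a string that legitimately contains 'È' or 'è' (a French title, say) would also be rewritten, so a real backfill would likely need to restrict itself to records known to come from Songs.txt.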
parent 576cc807b5
commit 2abd9daae1
@@ -74,14 +74,18 @@ def ssh_exec(cmd: str, timeout: int = 60) -> dict:
 
 
 def fetch_one(station: str, ip: str, subdir: str) -> str:
-    """Fetch Songs.txt from the Windows player; return it as a UTF-8 string."""
+    """Fetch Songs.txt from the Windows player; return it as a UTF-8 string.
+
+    Songs.txt is encoded in CP1250 (Windows Slovenian/CE), NOT CP1252 (Western).
+    CP1252 would decode 'Č' (0xC8) as 'È' and 'č' (0xE8) as 'è'; 'Š' and 'Ž' are unaffected.
+    """
     # 1) scp from the player to openclaw, iconv to UTF-8, base64 back
     cmd = (
         f"set -e; "
         f"TMP=$(mktemp); "
         f"scp -i {SSH_KEY} -o StrictHostKeyChecking=no "
         f'"folxadmin@{ip}:c:/{subdir}/Data/Songs.txt" "$TMP"; '
-        f'iconv -f WINDOWS-1252 -t UTF-8 "$TMP" | base64 -w 0; '
+        f'iconv -f WINDOWS-1250 -t UTF-8 "$TMP" | base64 -w 0; '
         f'rm -f "$TMP"'
     )
     res = ssh_exec(cmd, timeout=90)
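A regression of this kind is easy to miss because the wrong code page still decodes without error. As a hedged sketch (not part of sync_qnet.py), a small guard could flag lines containing characters that typically appear only when CP1250 bytes are read as CP1252. The names `SUSPECT_CHARS` and `find_suspect_lines` are hypothetical, and the character set assumes the song data contains no legitimate Western European titles.

```python
# Characters produced when CP1250 letters (Č, č, Ć, ć, Đ, đ) are
# decoded as CP1252. Assumption: none of these occur legitimately.
SUSPECT_CHARS = frozenset("ÈèÆæÐð")


def find_suspect_lines(text: str) -> list[str]:
    """Return the lines of text that contain likely CP1252 mojibake."""
    return [line for line in text.splitlines() if SUSPECT_CHARS.intersection(line)]


if __name__ == "__main__":
    # Two titles from the commit message, as raw CP1250 bytes.
    raw = b"\xc8AS ZA ZABAVO\nSTOR\x8eI\xc8\n"

    good = raw.decode("cp1250")  # correct code page
    bad = raw.decode("cp1252")   # the code page used before this fix

    print(find_suspect_lines(good))  # no lines flagged
    print(find_suspect_lines(bad))   # both lines flagged
```

Running such a check on the decoded output of `fetch_one` would surface a wrong `iconv -f` argument before the data reaches the Qnet database.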