El blog de Juan Palómez

13 May 2011

Detecting charsets with recode

Filed under: Uncategorized, by thisisoneball @ 13:07

– source.txt is the file containing unreadable characters
– ‘Ñ’ is a character that we know appears in source.txt but is not displayed correctly
– recode is a free program that converts text between different encodings

This loop tries to recode source.txt from every encoding supported by recode, then checks the recoded output for the special character. If the character is found, it was recoded correctly, so the loop prints the name of the encoding. Several related encodings may match, so expect a list of candidates rather than a single answer.

for a in $(recode --list | cut -f1 -d' '); do
    # Recode from charset $a to the default charset; -q keeps grep silent
    if recode "$a" < source.txt 2> /dev/null | grep -q 'Ñ'; then
        echo "$a"
    fi
done
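If recode is not installed, the same brute-force idea can be sketched with iconv instead. This is only an illustration, not the method from the post: the function name find_encodings and its arguments are made up for this example, and it assumes the script and terminal use UTF-8, so that the character you search for is itself written in UTF-8.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original post): print every candidate
# encoding under which FILE, once converted to UTF-8, contains NEEDLE.
# Usage: find_encodings FILE NEEDLE [ENCODING...]
find_encodings() {
    file=$1
    needle=$2
    shift 2
    # With no candidates given, fall back to everything iconv lists.
    # The format of `iconv -l` varies between implementations, so split on
    # spaces and commas and strip the trailing "//" that glibc prints.
    if [ "$#" -eq 0 ]; then
        set -- $(iconv -l | tr ' ,' '\n\n' | sed 's,//$,,' | grep .)
    fi
    for enc in "$@"; do
        # Conversion errors are discarded; only a successful match counts.
        if iconv -f "$enc" -t UTF-8 < "$file" 2> /dev/null | grep -qF -- "$needle"; then
            echo "$enc"
        fi
    done
}
```

Calling find_encodings source.txt 'Ñ' with no further arguments tries every encoding that iconv reports, which can take a while. Passing a short list of suspects, for example find_encodings source.txt 'Ñ' ISO-8859-1 CP437 UTF-8, is much faster when you already have a guess.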
