El blog de Juan Palómez

13 May 2011

Detecting charsets with recode

Filed under: Uncategorized, by thisisoneball @ 13:07

- source.txt is the file containing unreadable characters
- ‘Ñ’ is a character that we know appears in source.txt but shows up as unreadable
- recode is a free program that translates text between different encodings

The loop below tries to recode source.txt from every encoding supported by recode and checks each result for the special character. If the character is found, it was decoded correctly, so the loop prints the name of that encoding. More than one name may be printed, because several encodings can use the same byte for the same character.

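# take the first name on each line of recode's charset list
for a in $(recode --list | cut -f1 -d' ')
do
    # treat $a as the source encoding; recode's error messages are discarded
    if recode "$a" < source.txt 2> /dev/null | grep -q 'Ñ'
    then
        echo "$a"
    fi
done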
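
The same idea can be sketched with iconv for systems where recode is not installed. This is only an illustration under a few assumptions: iconv is a swapped-in tool rather than the one used above, guess_charset is a hypothetical helper name, the terminal is assumed to use UTF-8, and the candidate encodings are passed by hand instead of being read from a list.

```shell
#!/bin/sh
# Hypothetical helper (not part of recode or iconv): print every candidate
# encoding under which FILE decodes to text that contains CHAR.
# Usage: guess_charset FILE CHAR ENCODING...
guess_charset() {
    file=$1
    char=$2
    shift 2
    for enc in "$@"
    do
        # iconv fails on bytes that are invalid in $enc; those errors are
        # discarded, and grep -F looks for CHAR as a fixed string
        if iconv -f "$enc" -t UTF-8 < "$file" 2> /dev/null | grep -qF -- "$char"
        then
            echo "$enc"
        fi
    done
}
```

It could be called as guess_charset source.txt 'Ñ' ISO-8859-1 CP1252 CP850 UTF-8. Passing the candidates explicitly keeps the sketch independent of the format of the encoding list, which differs between iconv implementations.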