Archive for February 7th, 2010

How to detect a file's encoding and convert files from one encoding to another on GNU/Linux and FreeBSD

Sunday, February 7th, 2010

I wanted to convert an HTML document's character encoding to UTF-8. To do that, of
course, I first needed to determine which character encoding was used when the
file was created.
The first thing I tried was:

hipo@noah:~/Desktop/test$ file File-Whole.htm
File-Whole.htm: HTML document text

As you can see, that's no good, because for some reason the file command does not print
the encoding.
The next thing I tried was:

hipo@noah:~/Desktop/test$ file --mime File-Whole.htm
File-Whole.htm: text/html; charset=unknown-8bit

Here the character encoding is reported as charset=unknown-8bit, which
is of no use: iconv does not know such an encoding and exits with an error if I pass it as the source.
That is why I needed to determine exactly which character set my file uses, so that I could
later convert it with iconv.
After consulting Mr. Google, I found
out about enca, a tool to detect and convert the encoding of text files.
It was obviously my lucky day, because the good guys from Debian have packaged enca, so everything came down to
apt-getting it:
# apt-get install enca

On FreeBSD an enca port is available, so installing it simply comes down to building it from the ports tree.
Here is how:
pcfreak# cd /usr/ports/converters/enca
pcfreak# make install clean

Next I tried running enca directly, without any options, but I was out of luck:

hipo@noah:~/Desktop/test$ enca File-Whole.htm
enca: Cannot determine (or understand) your language preferences.
Please use `-L language', or `-L none' if your language is not supported
(only a few multibyte encodings can be recognized then).
Run `enca --list languages' to get a list of supported languages.

I gave it another try, following the suggested usage, though first I checked which
languages I can pass to enca's -L parameter by running enca --list languages.
Since I knew in advance that my file contains text in Bulgarian, it wasn't a big deal
to pick the right language:

hipo@noah:~/Desktop/test$ enca -L bulgarian File-Whole.htm
transformation format 8 bits; CP1251

Now that I knew the character set, all that was left was to convert the file from CP1251 to UTF-8 to make the text
much more accessible.

hipo@noah:~/Desktop/test$ iconv --from-code=CP1251 --to-code=UTF-8 File-Whole.htm > File-Whole.htm.new
hipo@noah:~/Desktop/test$ mv File-Whole.htm.new File-Whole.htm
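
If you want to check that an iconv conversion like this is lossless before overwriting the original, here is a minimal sketch. It assumes the source encoding is CP1251, as enca reported, and it uses made-up sample file names (sample-cp1251.txt, sample-utf8.txt) instead of my real file, so treat it as an illustration rather than the exact steps I ran.

```shell
# Build a tiny sample file containing the Bulgarian/Russian word "Привет"
# encoded as CP1251. Octal escapes are used so any POSIX printf handles it.
printf '\317\360\350\342\345\362\n' > sample-cp1251.txt

# Convert from the detected encoding (CP1251) to UTF-8.
iconv -f CP1251 -t UTF-8 sample-cp1251.txt > sample-utf8.txt

# Convert back to CP1251 and compare with the original bytes.
# If the round trip matches, nothing was lost in the conversion.
if iconv -f UTF-8 -t CP1251 sample-utf8.txt | cmp -s - sample-cp1251.txt; then
    echo "round trip OK"
else
    echo "round trip FAILED" >&2
fi
```

Only after the round trip matches would I move the converted file over the original.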

Well, here we are: conversion mission accomplished 🙂

What is Sufism (An interesting Muslim Sect, combining teachings from both Islam and Christianity)

Sunday, February 7th, 2010

I decided to blog about Sufism because I find it really interesting. I first heard about Sufism as a religious stream while reading about Rumi Mevlana, after
a Turkish colleague of mine, an ex-student (Burdji), showed me a video about
the great wisdom Rumi attained.
Mevlana (or Mowlana), as he is also known, is famous among Muslims for a couple
of major things:

1. He was a great Persian poet of his time
2. He was a great Persian theologian
3. He was a major figure in Sufism
4. He was a notable musician of his time
5. His life inspired the creation of the Sufi dances (Whirling Dervishes)

Here is a nice video with quotes from various Sufi teachers including
Rumi Mevlana himself:

It's interesting that, even though from a Christian perspective the teaching of Rumi Mevlana
and Sufism in general are heretical, Sufi teachings are
closer to Christianity than Islam is.
Happy watching.

Orthodoxy Around the World (Orthodox Churches around the world)

Sunday, February 7th, 2010

Dear reader, I thought it would be a nice idea to link to this nice video.
It was compiled by an Orthodox brother, with God's mercy.
The scale of Orthodox Christianity is really amazing. More than 200 million
people around the world are Orthodox Christians.
Orthodoxy claims to be a direct descendant of the original Christian
faith and traditions.
Well, enough talk. Enjoy the wonderful video!