Posts Tagged ‘shell scripts’

How to convert html pages to text in console / terminal on GNU / Linux and FreeBSD

Thursday, December 8th, 2011

HTML to Plain Text Convertion on GNU / Linux and FreeBSD

I'm realizing the more I'm converting to a fully functional GUI user, the less I'm doing coding or any interesting stuff…
I remembered of the old glorious times, when I was full time console user and got a memory on a nifty trick I was so used to back in the day.
Back then I was quite often writing shell scripts which were fetching (html) webpages and converting the html content into a plain TEXT (TXT) files

In order to fetch a page back in the days I used lynx (a very simple UNIX text browser, which by the way lacks support for any CSS or Javascipt) in combination with html2text – (an advanced HTML-to-text converter).

Let's say I wanted to fetch a my personal home page http://www.pc-freak.net/, I did that via the command:

$ lynx -source http://www.pc-freak.net/ | html2text > pcfreak_page.txt

The content from www.pc-freak.net got spit by lynx as an html source and passed html2pdf wchich saves it in plain text file pcfreak_page.txt
The bit more advanced elinks – (lynx-like alternative character mode WWW browser) provides better support for HTML and even some CSS and Javascript so to properly save the content of many pages in plain html file its better to use it instead of lynx, the way to produce .txt using elinks files is identical, e.g.:

$ elinks -source http://www.pc-freak.net/blog/ | html2text > pcfreak_blog_page.txt

By the way back in the days I was used more to links , than the superior elinks , nowdays I have both of the text browsers installed and testing to fetch an html like in the upper example and pipe to html2text produced garbaged output.

Here is the time to tell its not even necessery to have a text browser installed in order to fetch a webpage and convert it to a plain text TXT!. wget file downloading tools supports source dump as well, for all those who did not (yet) tried it and want to test it:

$ wget -qO- http://www.pc-freak.net | html2text
Anyways of course, some pages convertion of text inside HTML tags would not properly get saved with neither lynx or elinks cause some texts might be embedded in some elinks or lynx unsupported CSS or JavaScript. In those cases the GUI browser is useful. You can use any browser like Firefox, Epiphany or Opera 's File -> Save As (Text Files) embedded functionality, below is a screenshot showing an html page which I'm about to save as a plain Text File in Mozilla Firefox:

Firefox iceWeasel Opera etc. save html webpage as plain text on GNU / Linux, FreeBSD

Besides being handy in conjunction with text browsers, html2text is also handy for converting .html pages already existing on the computer's hard drive to a plain (.TXT) text format.
One might wonder, why would ever one would like to do that?? Well I personally prefer reading plain text documents instead of htmls ;)
Converting an html files already existing on hard drive with html2text is done with cmd:

$ html2text index.html >index.txt

To convert a whole directory full of .html (documentation) or whatever files to plain text .TXT , cd the directory with HTMLs and issue the one liner bash loop command:

$ cd html/
html$ for i in $(echo *.html); do html2text $i > $(echo $i | sed -e 's#.html#.txt#g'); done

Now lay off your back and enjoy reading the dox like in the good old hacker days when .TXT files were fashionable ;)

Share this on

Monitor General Server / Desktop system health in console on Linux and FreeBSD

Tuesday, October 4th, 2011

saidar is a text based ncurses program to display live statistics about general system health.

It displays in one refreshable screen (similar to top) statistics about server state of:
CPU, Load, Memory, Swap, Network, I/O disk operations
Besides that saidar supports a ncurses console colors, which makes it more funny to look at.
Saidar extracts the statistics for system state based on libgstrap cross platform statistics library about pc system health.

On Debian, Ubuntu, Fedora, CentOS Linuxes saider is available for install straight from distribution repositories.
On Debian and Ubuntu saidar is installed with cmd:

debian:~# apt-get install saidar
...

On CentOS and Fedora saidar is bundled as a part of statgrab-tools rpm package.
Installing it on 64 bit CentOS with yum is with command:

[root@centos ~]# yum install statgrab-tools.x86_64

Saidar is also available on FreeBSD as a part of the /usr/ports/devel/libgstrab, hence to use on my FreeBSD I had to install the libgstrab port:

freebsd# cd /usr/ports/devel/libstatgrab
freebsd# make install clean

Here is saidar running on my Desktop Debian on Thinkpad in color output:

debian:~# saidar -c

Saidar Linux General statistics Screenshot

I've seen many people, who use various shell scripts to output system monitoring information, this scripts however are often written to just run without efficiency in mind and they put some let's say 1% extra load on the system CPU. This is not the case with saidar which is written in C and hence the program is optimized well for what it does.

Share this on

How to configure networking in CentOS, Fedora and other Redhat based distros

Wednesday, June 1st, 2011

On Debian Linux I’m used to configure the networking via /etc/network/interfaces , however on Redhat based distributions to do a manual configuration of network interfaces is a bit different.

In order to configure networking in CentOS there is a special file for each interface and some values one needs to fill in to enable networking.

These network adapters configuration files for Redhat based distributions are located in the files:

/etc/sysconfig/network-scripts/ifcfg-*

Just to give you and idea on the content of this network configuration file, here is how it looks like:

[root@centos:~ ]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Broadcom Corporation NetLink BCM57780 Gigabit Ethernet PCIe
DEVICE=eth0
BOOTPROTO=static
DHCPCLASS=
HWADDR=00:19:99:9C:08:3A
IPADDR=192.168.0.1
NETMASK=255.255.252.0
ONBOOT=yes

This configuration is of course just for eth0 for other network card names and devices, one needs to look up for the proper file name which corresponds to the network interface visible with the ifconfig command.
For instance to list all network interfaces via ifconfig use:

[root@centos:~ ]# /sbin/ifconfig |grep -i 'Link encap'|awk '{ print $1 }'
eth0
eth1
lo

In this case there are only two network cards on my host.
The configuration files for the ethernet network devices eth0 and eth1 from below example are located in files /etc/sysconfig/network-scripts/ifcfg-eth{1,2}

/etc/sysconfig/network-scripts/ directory contains plenty of shell scripts related to Fedora networking.
This directory contains actually the networking boot time load up rules for fedora and CentOS hosts.

The complete list of options available which can be used in /etc/sysconfig/network-scripts/ifcfg-ethx is located in:
/usr/share/doc/initscripts-*/sysconfig.txt

, to quickly observe the documentation:

[root@centos:~ ]# less /usr/share/doc/initscripts-*/sysconfig.txt

One typical example of configuring a CentOS based host to possess a static IP address (192.168.1.5) and a gateway (192.168.1.1), which will be assigned in boot time during the /etc/init.d/network is loaded is:

[root@centos:~ ]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Broadcom Corporation NetLink BCM57780 Gigabit Ethernet PCIe
IPV6INIT=no
BOOTPROTO=static
ONBOOT=yes
USERCTL=yes
TYPE=Ethernet
DEVICE=eth0
IPADDR=192.168.1.5
NETWORK=192.168.1.0
GATEWAY=192.168.1.1
BROADCAST=192.168.1.255
NETMASK=255.255.255.0

After some changes to the network configuration files are made, to load up the new rules a /etc/init.d/network script restart is necessery with the command:

[root@centos:~ ]# /etc/init.d/network restart

Of course one can always use /etc/rc.local script as universal way to configure network rules on a Redhat based host, however using methods like rc.local to load up, ifconfig or route rules in a Fedora would break the distribution logic and therefore is not recommended.

There is also a serious additional reason against using /etc/rc.local post init commands load up script.
If one uses rc.local to load up and configure the networing, the network will get initialized only after all the other scripts in /etc/init.d/ gets started.

Therefore using /etc/rc.local might also be DANGEROUS!, if used remotely via (ssh), supposedly it might completely fail to load the networking, if all bringing the server interfaces relies on it.

Here is an example, imagine that some of the script set in to load up during a CentOS boot up hangs and does continue to load forever (for example after some crucial software upate), as a consequence the /etc/rc.local script will never get executed as it only starts up after all the rest init scripts had succesfully completed execution.

A network eth1 interface configuration for a Fedora host which has to fetch it's network settings automatically via DHCP is as follows:

[root@fedora:/etc/network:]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
# Intel Corporation 82557/8/9 [Ethernet Pro 100]DEVICE=eth1
BOOTPROTO=dhcp
HWADDR=00:0A:E4:C9:7B:51
ONBOOT=yes

To sum it up I think Fedora's /etc/sysconfig/network-scripts methodology to configure ethernet devices is a way inferior if compared to Debian.

In GNU/Debian Linux configuration of all networking is (simpler)!, everything related to networking is in one single file ( /etc/network/interfaces ), moreover getting all the thorough documentation for the network configurations options for the interfaces is available as a system wide manual (e.g. man interfaces).

Partially Debian interfaces configuration is a bit more complicated in terms of syntax if matched against Redhat's network-scripts/ifcfg-*, lest that generally I still find Debian's manual network configuration interface to be easier to configure networking manually vicommand line.

Share this on

How to fix “delivery 1: deferral: Sorry,_message_has_wrong_owner._(#4.3.5)/” qmail mail delivery failure message

Friday, May 20th, 2011

After a failed attempt to enable some wrapper scripts to enable domain keys support in a qmail powered mail server my qmail server suddenly stopped being able to normally send mail.

The exact error message which was logged in /var/log/qmail/current was:

@400000004dd66fcc16a088ac delivery 1: deferral: Sorry,_message_has_wrong_owner._(#4.3.5)/

This qmail messed happened after I substituted /var/qmail/bin/qmail-queue and /var/qmail/bin/qmail-remote with two respective wrapper shell scripts which were calling for the original qmail-queue and qmail-remote binaries under the names qmail-queue.orig and qmail-queue.orig

Restoring back qmail-queue.orig to /var/qmail/bin/qmail-queue and qmail-remote.orig to /var/qmain/bin/qmail-remote and restarting the mail server broke my qmail install.

After a bunch of nerves trying to isolate what is causing the error I found out that by mistake I forgot to copy the qmail-queue and qmail-remote permissions and ownership.

Thus I had to check another qmail working installation’s permissions for both binaries and fix the permissions to be equivalent to the permissions:

debian:~# ls -al /var/qmail/bin/qmail-remote
-rwx–x–x 1 root qmail 50464 2011-05-20 12:56 /var/qmail/bin/qmail-remote*
debian:~# ls -al /var/qmail/bin/qmail-queue
-rws–x–x 1 qmailq qmail 20392 2011-05-20 12:56 /var/qmail/bin/qmail-queue*

The exact chmod and chmod commands I issued to solve the shitty issues were as follows:

First I fixed the qmail-queue and qmail-remote ownership:

debian:~# chown qmailq:qmail /var/qmail/bin/qmail-queue
debian:~# chown root:qmail /var/qmail/bin/qmail-remote

Second I set the proper file permissions:

# make the qmail-queue binary suid
debian:~# chmod u+s /var/qmail/bin/qmail-queue
debian:~# chmod 611 /var/qmail/bin/qmail-queue
debian:~# chmod 611 /var/qmail/bin/qmail-remote

Third and last I did a restart of the qmail server and tested it sends properly

debian:~# /usr/bin/qmailctl stop
Stopping qmail...
qmail-send
qmail-smtpd
debian:~# /usr/bin/qmailctl start
Starting qmail

Finally to test that the qmail server qmail-queue was queing and sending with qmail-remote I used the system mail command like so:

debian:~# mail -s "test email" testuser@pc-freak.net
asdfafdsdf
.
Cc:

Afterwards the mail was properly received on my mail account testuser@pc-freak.net immediately.

In my /var/log/qmail/current log file all seemed fine:

@400000004dd6702a2eb2b064 starting delivery 1: msg 85281596 to remote testuser@pc-freak.net
@400000004dd6702a2eb2b834 status: local 0/20 remote 1/20
@400000004dd6702b34cc809c delivery 1: success: 83.228.93.76_accepted_message./Remote_host_said:_250_ok_
1305899099_qp_65293/
@400000004dd6702b34cc886c status: local 0/20 remote 0/20
@400000004dd6702b34cc8c54 end msg 85281596

The test mail was properly received on my mail account testuser@pc-freak.net immediately.

It took me like half an hour to figure out what exactly is wrong with the permissions in situations like this I really wanted to change all my qmail installs with postfix and forget forever I ever used qmail ...

Share this on