Posts Tagged ‘mess’

How to disable ACPI on productive Linux servers to decrease kernel panics and increase CPU fan lifespan

Tuesday, May 15th, 2012

Reading Time: 5minutes

Linux TUX ACPI logo / Tux Hates ACPI logohttps://pc-freak.net/images/linux_tux_acpi_logo-tux-hates-acpi.png

Why would anyone disable ACPI support on a server machine??
Well  ACPI support kernel loaded code is just another piece of code constantly being present in the memory,  that makes the probability for a fatal memory mess up leading to  a fatal bug resulting in system crash (kernel panic) more likely.

Many computers ship with buggy or out of specifications ACPI firmware which can cause a severe oddities on a brand new bought piece of comp equipment.

One such oddity related to ACPI motherboard support problems is if you notice your machine randomly powering off or failing to boot with a brand new Linux installed on it.

Another reason to switch off ACPI code will would to be prevent the CPU FAN rotation from being kernel controlled.

If the kernel controls the CPU fan on  high CPU heat up it will instruct the fan to rotate quickly and on low system loads it will bring back the fan to loose speed.
 This frequent switch of FAN from high speed to low speed  increases the probability for a short fan damage due to frequent changes of fan speed. Such a fan damage leads often to  system outage due to fan failure to rotate properly.

Therefore in my view it is better ACPI support is switched off completely on  servers. On some servers ACPI is useful as it can be used to track CPU temperature with embedded motherboard sensors with lm_sensors or any piece of hardwre vendor specific software provided. On many machines, however lm_sensors will not properly recognize the integrated CPU temperature sensors and hence ACPI is mostly useless.

There are 3 ways to disable fully or partially ACPI support.

- One is to disable it straight for BIOS (best way IMHO)
- Disable via GRUB or LILO passing a kernel parameter
- Partial ACPI off-ing - /disabling the software that controls the CPU fan/

1. Disable ACPI in BIOS level

Press DEL, F1, F2, F10 or whatever the enter bios key combination is go through all the different menus (depending on the vios BENDOR) and make sure every occurance of ACPI is set to off / disable whatever it is called.

Below is a screenshot of menus with ACPI stuff on a motherboard equipped with Phoenix AwardBIOS:

BIOS ACPI Disable power Off Phoenix BIOS

This is the in my opinon best and safest way to disable ACPI power saving, Unfortunately some newer PCs lack the functionality to disable ACPI; (probably due to the crazy "green" policy the whole world is nowdays mad of).

If that's the case with you, thanksfully there is a "software way" to disable ACPI via passing kernel options via GRUB and LILO boot loaders.

2. Disabling ACPI support on kernel boot level through GRUB boot loader config

There is a tiny difference in command to pass in order to disable  ACPI depending on the Linux installed  GRUB ver. 1.x or GRUB 2.x.

a) In GRUB 0.99 (GRUB version 1)

Edit file /etc/grub/menu.lst or /etc/grub/grub.conf (location differs across Linux distribution). Therein append:

acpi=off

to the end of kernel command line.

Here is an example of a kernel command line with ACPI not disabled (example taken from CentOS server grub.conf):

[root@centos ~]# grep -i title -A 4 /etc/grub/grub.conf
title Red Hat Enterprise Linux Server (2.6.18-36.el5)
root (hd0,0)
kernel /vmlinuz-2.6.18-36.el5 ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200n8
initrd /initrd-2.6.18-36.el5.img

The edited version of the file with acpi=off included should look like so:

title Red Hat Enterprise Linux Server (2.6.18-36.el5)
root (hd0,0)
kernel /vmlinuz-2.6.18-36.el5 ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200n8 acpi=off
initrd /initrd-2.6.18-36.el5.img

The kernel option root=/dev/VolGroup00/LogVol00 means the the server is configured to use LVM (Logical Volume Manager).

b) Disabling ACPI on GRUB version 1.99 +

This version is by default installed on newer Ubuntu and Debian Linux-es.

In grub 1.99 on latest Debian Squeeze, the file to edit is located in /boot/grub/grub.cfg. The file is more messy than with its predecessor menu.lst (grub 0.99).
Thanks God there is no need to directly edit the file (though this is possible), but on newer Linuces (as of time of writting the post), there is another simplied grub config file /etc/grub/config

Hence to add the acpi=off to 1.99 open /etc/grub/config find the line reading:

GRUB_CMDLINE_LINUX_DEFAULT="quiet"

and append the "acpi=off" option, e.g. the line has to change to:

GRUB_CMDLINE_LINUX_DEFAULT="quiet acpi=off"

On some servers it might be better to also disable APIC along with ACPI:

Just in case you don't know what is the difference between ACPI and APIC, here is a short explanation:

ACPI = Advanced Configuration and Power Interface

APIC = Advanced Programmable Interrupt Controllers

ACPIis the system that controls your dynamic speed fans, the power button behavior, sleep states, etc.

APICis the replacement for the old PIC chip that used to come imbedded on motherboards that allowed you to setup interrupts for your soundcard, ide controllers, etc.

Hence on some machines experiencing still problems with even ACPI switched off, it is helpful  to disable the APIC support too, by using:

acpi=off noapic noacpi

Anyways, while doing the changes, be very very cautious or you might end up with un-boot-able server. Don't blame me if this happens :); be sure you have a backup option if server doesn't boot.

To assure faultless kernel boot, GRUB has ability to be configured to automatically load up a second kernel if 1st one fails to boot, if you need that read the grub documentation on that.

To load up the kernel with the new setting, give it a restart:

[root@centos ~]# shutdown -r now
....

3. Disable ACPI support on kernel boot time on Slackware or other Linuxes still booting kernel with LILO

Still, some Linux distros like Slackware, decided to keep the old way and use LILO(LInux LOader) as a default boot loader.

Disabling ACPI support in LILO is done through /etc/lilo.conf

By default in /etc/lilo.conf, there is a line:

append= acpi=on

it should be changed to:

append= acpi=off

Next to load up the new acpi disabled setting, lilo has to be reloaded:

slackware:~# /sbin/lilo -c /etc/lilo.conf
....

Finally a reboot is required:

slackware:~# reboot
....

(If you don't have a physical access or someone near the server you better not 🙂 )

4. Disable ACPI fan control support on a running Linux server without restart

This is the most secure work-around, to disabling the ACPI control over the machine CPU fan, however it has a downside that still the ACPI code will be loaded in the kernel and could cause kernel issues possibly in the long run – lets say the machine has uptime of more than 2 years…

The acpi support on a user level  is controlled by acpid or haldaemon (depending on the Linux distro), hence to disable the fan control on servers this services has to be switched off:

a) disabling ACPI on Debian and deb based Linux-es

As of time of writting on Debian Linux servers acpid (Advanced Configuration and Power Interface event daemon) is there to control how power management will be handled. To disable it stop it as a service (if running):

debian:~# /etc/init.d/acpid stop

To permanently remove acpid from boot up on system boot disable it with update-rc.d:

debian:~# update-rc.d acpid disable 2 3 4 5
update-rc.d: using dependency based boot sequencing
insserv: Script iptables is broken: incomplete LSB comment.
insserv: missing `Required-Start:' entry: please add even if empty.
insserv: warning: current start runlevel(s) (empty) of script `acpid' overwrites defaults (2 3 4 5).
insserv: warning: current stop runlevel(s) (2 3 4 5) of script `acpid' overwrites defaults (empty).
insserv: missing `Required-Start:' entry: please add even if empty.

b) disabling ACPI on RHEL, Fedora and other Redhat-s (also known as RedHacks 🙂 )

I'm not sure if this is safe,as many newer rpm based server system services,  might not work properly with haldaemon disabled.

Anyways you can give it a try if when it is stopped there are issues just bring it up again.

[root@rhel ~]# /etc/init.d/haldaemon stop

If all is fine with the haldaemon switched off (hope so), you can completely disable it to load on start up with:

[root@centos ~]# /sbin/chkconfig --level 2 3 4 5 haldaemon off

Disabling ACPI could increase a bit your server bills, but same time decrease losses from downtimes, so I guess it worths its costs 🙂

 

How to solve “Incorrect key file for table ‘/tmp/#sql_9315.MYI’; try to repair it” mysql start up error

Saturday, April 28th, 2012

Reading Time: 2minutes

When a server hard disk scape gets filled its common that Apache returns empty (no content) pages…
This just happened in one server I administer. To restore the normal server operation I freed some space by deleting old obsolete backups.
Actually the whole reasons for this mess was an enormous backup files, which on the last monthly backup overfilled the disk empty space.

Though, I freed about 400GB of space on the the root filesystem and on a first glimpse the system had plenty of free hard drive space, still restarting the MySQL server refused to start up properly and spit error:

Incorrect key file for table '/tmp/#sql_9315.MYI'; try to repair it" mysql start up error

Besides that there have been corrupted (crashed) tables, which reported next to above error.
Checking in /tmp/#sql_9315.MYI, I couldn't see any MYI – (MyISAM) format file. A quick google look up revealed that this error is caused by not enough disk space. This was puzzling as I can see both /var and / partitions had plenty of space so this shouldn't be a problem. Also manally creating the file /tmp/#sql_9315.MYI with:

server:~# touch /tmp/#sql_9315.MYI

Didn't help it, though the file created fine. Anyways a bit of a closer examination I've noticed a /tmp filesystem mounted besides with the other file system mounts ????
You can guess my great amazement to find this 1 Megabyte only /tmp filesystem hanging on the server mounted on the server.

I didn't mounted this 1 Megabyte filesystem, so it was either an intruder or some kind of "weird" bug…
I digged in Googling to see, if I can find more on the error and found actually the whole mess with this 1 mb mounted /tmp partition is caused by, just recently introduced Debian init script /etc/init.d/mountoverflowtmp.
It seems this script was introduced in Debian newer releases. mountoverflowtmp is some kind of emergency script, which is triggered in case if the root filesystem/ space gets filled.
The script has only two options:

# /etc/init.d/mountoverflowtmp
Usage: mountoverflowtmp [start|stop]

Once started what it does it remounts the /tmp to be 1 megabyte in size and stops its execution like it never run. Well maybe, the developers had something in mind with introducing this script I will not argue. What I should complain though is the script design is completely broken. Once the script gets "activated" and does its job. This 1MB mount stays like this, even if hard disk space is freed on the root partition – / ….

Hence to cope with this unhandy situation, once I had freed disk space on the root partition for some reason mountoverflowtmp stop option was not working,
So I had to initiate "hard" unmount:

server:~# mount -l /tmp

Also as I had a bunch of crashed tables and to fix them, also issued on each of the broken tables reported on /etc/init.d/mysql start start-up.

server:~# mysql -u root -p
mysql> use Database_Name;
mysql> repair table Table_Name extended;
....

Then to finally solve the stupid Incorrect key file for table '/tmp/#sql_XXYYZZ33444.MYI'; try to repair it error, I had to restart once again the SQL server:

Stopping MySQL database server: mysqld.
Starting MySQL database server: mysqld.
Checking for corrupt, not cleanly closed and upgrade needing tables..
root@server:/etc/init.d#

Tadadadadam!, SQL now loads and works back as before!

How to convert file content encoded in windows-cp1251 charset to UTF-8 (with iconv) to be delivered properly encoded to browsing end clients

Wednesday, May 16th, 2012

Reading Time: 3minutes

windows-cp1251 bulgarian to UTF-8 / Encoding Communication Decoding Communication Funny Picture

I have a bunch of old html files all encoded in the historically obsolete Windows-cp1251. Windows-CP1251 used to be common used 7 years ago and therefore still big portions of the web content in Bulgarian / Russian Cyrillic is still transferred to the end users in this encoding.

This was just before the "UTF-8 revolution", where massively people started using UTF-8,
Well it was clear the specific national country text encoding standards will quickly be moved by to UTF-8 – Universal Encoding format which abbreviation stands for (Unicode Transformation Format).

Though UTF-8 was clear to be "the future", many web developers mostly because of their incompetency or using an old sources of learning how to writen in HTML continued to use windows-cp1251 in HTMLs. I'm even convinced, there are still developers out there who are writting websites for Bulgarian / Russian / Macedonian customers using obsolete encodings …

The smarter developers of those accustomed to windows-cp1251, KOI-8R etc. etc., were using the meta tag to specify the type of charset of the web page content with:

<meta http-equiv="content-type" content="text/html;charset=windows-cp1251">

or

<meta http-equiv="content-type" content="text/html;charset=koi-8r">

Anyhow, still many devs even didn't placed the windows-cp1251 in the head of the HTML …

The result for the system administrator is always a mess – a lot of webpages that are showing like unreadable signs and tons of unhappy customers.
As always the system administrator is considered responsible, for the programmer mistakes :). So instead of programmers fix their bad cooking, the admin has to fix it all!

One quick work around me as admin has applied to failing to display pages in Cyrillic using the Windows-cp1251 character encoding was to force windows-cp1251 as a default encoding for the whole virtualhost or Apache directory with Apache directives like:

<VirtualHost *:80>
ServerAdmin some_user@some_host.com
DocumentRoot /var/www/html
AddDefaultCharset windows-cp1251
ServerName the_host_name.com
ServerAlias www.the_host_name.com
....
....
<Directory>
AddDefaultCharset windows-cp1251
>/Directory>
</VirtualHost>

Though this mostly would, work there are some occasions, where only a particular html files from all the content served by Apache is encoded in windows-cp1251, if most of the content is already written in UTF-8, this could be a big issues as you cannot just change the UTF-8 globally to windows-cp1251, just because few pages are written in archaic encoding….
Since most of the content is displayed to the client by Apache (as prior explained) just fine, only particular htmls lets's ay single.html, single2.html etc. etc. are displayed with some question marks or some non-human readable "hieroglyphs".

Below is a screenshot from two pages returned to my browser in wrongly set htmls charset:

Improper Windows CP1251 encoding with Apache set to serve UTF-8 encoding questiomarks

Improper Windows CP1251 delivered page in UTF-8 browser view

Apache returns cp1251 in some non-UTF8 wrong encoding (webserver improperly served cyrillic encoding)

Improperly served encoding CP1251 delivered by Apache in non-utf-8 encoding

When this kind of issues occur, the only solution is to simply login to the server and use iconv command to convert all files returning unreadable content from whatever the non UTF-8 encoding is lets say in my case Bulgarian typeset of cp1251 to UTF-8

Here is how the iconv command to convert between windows-cp1251 to utf-8 the two sample files named single1.html and single2.html

server:/web# /usr/bin/iconv -f WINDOWS-1251 -t UTF-8 single1.html > single1.html.utf8
server:/web# mv single1.html single1.html.bak;
server:/web# mv single1.html.utf8 single1.html
server:/web# /usr/bin/iconv -f WINDOWS-1251 -t UTF-8 single2.html > single2.html.utf8
server:/web# mv single2.html single2.html.bak;
server:/web# mv single2.html.utf8 single2.html

I always, make copies of the original cp1251 encoded files (as you see mv single1.html single1.html.bak), because if something goes wrong with convertion I can easily revert back.

If there are 10 files with consequential numbers naming they can be converted using a short for loop, like so:

server:/web# for i $(seq 1 10); do
/usr/bin/iconv -f WINDOWS-1251 -t UTF-8 single$i.html > single$i.html.utf8;mv single$i.html single$i.html.bak
mv single$i.html.utf8 single$i.html
done

Just as earlier mentioned if single1.html, single2.html … has in the html <head>:

<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">

You should open, each of the files in question and wipe out the line either by hand or use sed to wipe it in one loop if it has to be done for lets say 10 files named (single{1..10})

server:/web# for i in $(seq 1 10); do
sed '/<meta http-equiv="Content-Type" content="text\/html; charset=windows-1251>/d' single$i.txt > single$i.txt.new;
mv single$i.txt single$i.txt.bak;
mv single$i.txt.new single$i.txt

Well now,

Fixing QMAIL mail server SMTP auto-configure issues in Thunderbird and other mail IMAP / POP3 mobile clients

Friday, July 13th, 2012

Reading Time: 3minutes

One of the QMAIL mail servers, setup-uped on a Debian host has been creating some auto configuration issues. Every-time a new mail user tries to use the embedded Thunderbird client auto configuration, the auto config fails leaving the client unable to use his Mailbox through POP3 or IMAP protocols.

Since about 2 years Thunderbird and many other modern pop3 and imap mail desktop and mobile clients are by default using the auto configuration and hence it was unthinkable to manually change settings for new clients with the QMAIl install; Besides that most of the Office users are always confused, whether they have to manually change SMTP or POP3 host for a server.

Below is a screenshot displaying the warning during email auto-configuration:

Thunderbird new Mail account setup auto config warning SMTP not OKThe orange color in the button for the newly auto-detected smtp.mail-domain.com indicates, something is not right with the SMTP host.

Obviously, something was wrong with smtp.mail-domain.com, hence I checked where smtp.mail.domain.com resolves with host command. What I found was actually smtp.mail-domain.comActive ( A ) DNS records was pointing to an IP address, our company previously used for the mail server. At present time the correct mail server host name is mx.mail-domain.com and the QMAIL installation on mx.soccerfame.com is configured to be the actual SMTP server.

By default Thunderbird and many other POP3, IMAP mail clients, however automatically assume the default SMTP host for a mail server is to be configured under a host name smtp.mail-domain.com. This is really strange, especially when the primary MX record for mail-domain.com domain is pointing to mx.mail-domain.com, e.g.:

qmail:~# host -t MX mail-domain.com
soccerfame.com mail is handled by 10 mx.mail-domain.com.
soccerfame.com mail is handled by 20 mail.mail-domain.com.
soccerfame.com mail is handled by 30 mail-domain.com.

The whole warning was caused due to the fact mx.mail-domain.com was resolving to an IP like xxx.xxx.xxx.xxx, whether smtp.mail-domain.com was resolving to yyy.yyy.yyy.yyy

Both xxx.xxx.xxx.xxx and yyy.yyy.yyy.yyy hosts were configured to have a different qmail SMTP host i.e.:

The server under IP xxx.xxx.xxx.xxx – (mx.mail-domain.com) was configured in /var/qmail/control/me to be mx.mail-domain.com and the other old one yyy.yyy.yyy.yyy – (mail.mail-domain.com) had (mail.mail-domain.com) in /var/qmail/control/me

As smtp.mail-domain.com was actually being still resolved to mail.mail-domain.com, the EMAILs were improperly trying to be sent with a configured DNS hostname of smtp.mail-domain.com, where the actual one on the server was mail.mail-domain

It took, me about an hour of pondering what is causing the oddities until I got the here explained issue. As the DNS recors for the domain the sample mail-domain.com were handled by Godaddy, to fix the mess, I logged in to Godaddy and;

a)deleted – DNS record for smtp.mail-domain.com.
b) Created new CNAME record for smtp.mail-domain.com to be a domain alias for mx.soccerfame.com

A few minutes, afterwards I tried configuring once again the same email account in Thunderbird and this time both imap.mail-domain.com and smtp.mail-domain.comturned green; indicating everything is configured fine.

To be 100% sure all is working fine I first fetched, all email via the IMAP protocol without hassles and onwards sent a test email to my Gmail account; thanksfully the sent email was delivered to Gmail indicating both Get Mail and Send Mail functions worked now fine.

Thunderbird icedove new mail account setup auto config Okay
 

Thanks to God I’m in Arnhem the Netherlands! :)

Tuesday, August 26th, 2008

Reading Time: 2minutes
I had 3 days trip with bus with my girl classmate (Ina). We traveled using the union-ivkoni bus lines. As a wholebeing on the road with bus for 3 days in order to reach some destination is pretty killing. We started the historicaltravel from Kavarna to Sofia and then at 2 o’clock we catched the bus Sofia -> Utrecht. There was big delays in the Serbianand the Hungarian borders. On all the other boarders we and our luggage weren’t checked. We had a bunch of stops on a oil stations.And I have to note everything in the oil stations in europe is pretty expensive. For example one sanwdich costs somwehere aroundalmost 4EU!. I and Ina came at Utrecht at around 6 o’clock and went to the Utrecht’s train station where we took two tickets to Arnhem.At 7 we were at Arnhem and went to the bus station. Originally we expected that there are gonna be welcoming students there and HAN university buses traveling from there to Vivare and the other accommodation places, unfortunately this was not the case. We were absolutely alone at an unknown country again I prayed to God in Jesus name to help me find a way to fix this mess. I went to search for Mobile SIM card, at the end after 20 minutes of walk I asked a police officer near the train station and he told me about a bookstore where I can find mobile SIM cards. I took two of them one for Ina and one for me. I took the T-Mobile mobile. I heard that the prices of conversations between the Bulgarian GSM operator Globul and T-Mobile are cheaper so I decided to give it a try. We called Koko (A College Colleague, who is gonna study HRQM just like me and is going to continue as a 3rd year student in Arnhem Business School, he came instantly in 20 minutes or so with another Bulgarian guy who already studied a year in Arnhem (Drago). Drago didn’t helped much with the traveling bags. But Kaloyan helped a lot. Today I feel the grace of God so real. I pray that he keep me and guide me in the same way in the future too. Another thing to note is that the living room that Vivare selected for me or should I say God make it be for me is just perfect. It has a toilet (!big plus!, a terrace, аsink, a nice bed, A Buro, a lamp and a chair. The room number is with ID K. 111. I think 111 stays for the Holy Trinity (The Father, The Son and The Holy Spirit) to Whom is and be the Glory the worship and power now and Forever and Ever, Amen!) I forgot to mention that I blocked my mobile telephone while trying to make the T-mobile SIM card work with my Motorola C115, luckily God has thought for this too. It seems Koko has one Mobile apartus he didn’t need right now so he gave it on a good will to me. Again what I can say Our Lord is an awesome God. Now I’m pretty tired and I’m going to bed. I have to mention Arnhem is excitingly charmful city and I really like it, also I’m impressed by the Dutch guys with which I had any work until now.Well for final I can only say: ” I screamed to the Lord and he heard my prayer and delivered me from evil”! Glory to you Lord of Hosts! Amen !END—–

6 days in sickness

Friday, August 10th, 2007

Reading Time: < 1minute
My physical health was quite not good during the last 6 / 7 days. Today it was a quiet day.I haven’t prayed seriously for few days but I can’t. Since my life looks like going nowhere.There is almost nothing in this town which keeps me still. I went to the Old Dobrich inMino’s coffee. But after a little argue and being a little rude to a girl I leavedthis awful mess. This guys are not a good company/match for me. It seems I don’t have friendsexcept Lily. Well I hope at least I haven’t builded all the time for nothing.Thanks Goodness that at least at work there isn’t a lot of work so I’m in a period of recovery.The world is going mad. I’m starting to scare my self. Seems like, life is created to be livednot to think about it’s purpose.END—–

In Rusalka a.k.a. Marmayed and Shabla Camping

Monday, September 3rd, 2007

Reading Time: 2minutes
I spend the weekend with Megi, Niki and Nomen in Rusalka (we beached there), although there was no sun at allthe water was warm and it was good experience (this happened in the late evening). In 06:00 or 07:00 o’clock.We decided to go to Tulenovo’s caves and stay there and make a wood fire. But the caves were already taken by others.So in the end we went to Shablenska Tuzla. We stretch the 2 tents and fired a firewood on the beach and started having a supper, unfortunately a rain started and we have to gather the 2 tents and the food and go to the car. We waited to see ifthe rain would stop but it was raining and we went to a near family hotel where Mitko, Megi and Niki slept into a room and slept in the car (this is the first time I have to sleep in a car). In the morning we went to the beach I stayed out of the sea because there was wind and I was scared of getting sick again. Around 12:30 we were in Dobrich. So this is how most of the weekend passed in the night we went to my Grandma and Grandpa’s (Peace be upon him) village with my father and we stayed there for 30 minutes or so. During the weekend I successfully made a binary upgrade of my xorg 6.9 -> 7.2 (it was a full mess), it took me 2 days! As usual the upgrades under FBSD are a real nightmare. Speaking about faith I’m not sure what do I believe anymore I still hope that God would fix my health issues, but I’m tired of waiting really :[ The bad thing about the weekend was that one more time I felt like not being on my right place. I realized soon that I can’t hear the voice of God. And currently I’m praying that God would give me this ability. But ofcourse only time will show.END—–

The end of the work week :)

Friday, February 1st, 2008

Reading Time: 2minutes
One more week passed without serious server problems. Yesterday after upgrade to debian 4.0rc2 with

apt-get dist-upgrade and reboot the pc-freak box became unbootable.

I wasn’t able to fix it until today because the machine’s box seemed not to read cds well.The problem was consisted of this that after the boot process of the linux kernel has started the machine the boot up was interrupted with a message saying
/sbin/init is missing

and I was dropped to a busybox without being able to read nothing from my filesystem.Thankfully nomen came to Dobrich for the weekend and today he bring me his cdrom-drive I booted with the debian.

Using Debian’s linux rescue I mounted the partition to check what’s wrong. I suspected something is terribly wrong with the lilo’s conf.

Looking closely to it I saw it’s the lilo conf file it was setupped to load a initrd for the older kernel. changing the line to thenew initrd in /etc/lilo.conf and rereading the lilo; /sbin/lilo -C; /sbin/lilo;

fixed the mess and pc-freak booted succesfully! 🙂

Yesterday I had to do something kinky. It was requested from a client to have access to a mysql service of one of the company servers,the problem was that the client didn’t have static IP so I didn’t have a good way to put into the current firewall.

Everytime the adsl they use got restarted a new absolutely random IP from all the BTC IP ranges was assigned.

The solution was to make a port redirect to a non-standard mysql port (XXXXX) which pointed to the standard 3306 service. I had to tell the firewall not to check the coming IPs on the non-standard port (XXXXX) against the 3306 service fwall rules.

Thanks to the help of a guy inirc.freenode.net #iptables jengelh I figured out the solution.

To complete the requested task it was needed to mark all packagescoming into port (XXXXX) using the iptables mangle option and to add a rule to ACCEPT all marked packages.

The rules looked like this

/sbin/iptables -t mangle -A PREROUTING -p tcp –dport XXXXX -j MARK –set-mark 123456/sbin/iptables -t nat -A PREROUTING -d EXTERNAL_IP -i eth0 -p tcp –dport XXXXX -j DNAT –to-destination EXTERNAL_IP:3306

/sbin/iptables -t filter -A INPUT -p tcp –dport 3306 -m mark –mark 123456 -j ACCEPT .

Something I wondered a bit was should /proc/sys/net/ipv4/ip_forward in order for the above redirect to be working, in case you’re wondering too well it doesn’t 🙂 The working week was a sort of quiteful no serious problems with servers and work no serious problems at school (although I see me and my collegues become more and more unserious) at studying. My grand parentsdecided to make me a gift and give me money to buy a laptop and I’m pretty happy for this 🙂 All that is left is to choose a good machine with hardware supported both by FreeBSD and Linux.

END—–

How to exclude files on copy (cp) on GNU / Linux / Linux copy and exclude files and directories (cp -r) exclusion

Saturday, March 3rd, 2012

Reading Time: 3minutes

I've recently had to make a copy of one /usr/local/nginx directory under /usr/local/nginx-bak, in order to have a working copy of nginx, just in case if during my nginx update to new version from source mess ups.

I did not check the size of /usr/local/nginx , so just run the usual:

nginx:~# cp -rpf /usr/local/nginx /usr/local/nginx-bak
...

Execution took more than 20 seconds, so I check the size and figured out /usr/local/nginx/logs has grown to 120 gigabytes.

I didn't wanted to extra load the production server with copying thousands of gigabytes so I asked myself if this is possible with normal Linux copy (cp) command?. I checked cp manual e.g. man cp, but there is no argument like –exclude or something.

Even though the cp command exclude feature is not implemented by default there are a couple of ways to copy a directory with exclusion of subdirectories of files on G / Linux.

Here are the 3 major ones:

1. Copy directory recursively and exclude sub-directories or files with GNU tar

Maybe the quickest way to copy and exclude directories is through a littke 'hack' with GNU tar nginx:~# mkdir /usr/local/nginx-new;
nginx:~# cd /usr/local/nginx#
nginx:/usr/local/nginx# tar cvf - \. --exclude=/usr/local/nginx/logs/* \
| (cd /usr/local/nginx-new; tar -xvf - )

Copying that way however is slow, in my case it fits me perfectly but for copying large chunks of data it is better not to use pipe and instead use regular tar operation + mv

# cd /source_directory
# tar cvf test.tar --exclude=dir_to_exclude/*\--exclude=dir_to_exclude1/* . \
# mv test.tar /destination_directory
# cd /destination# tar xvf test.tar

2. Copy folder recursively excluding some directories with rsync

P>eople who has experience with rsync , already know how invaluable this tool is. Rsync can completely be used as for substitute=de.a# rsync -av –exclude='path1/to/exclude' –exclude='path2/to/exclude' source destination

This example, can also be used as a solution to my copy nginx and exclude logs directory casus like so:

nginx:~# rsync -av --exclude='/usr/local/nginx/logs/' /usr/local/nginx/ /usr/local/nginx-new

As you can see for yourself, this is a way more readable for the tar, however it will not work on servers, where rsync is not installed and it is unusable if you have to do operations as a regular users on such for that case surely the GNU tar hack is more 'portable' across systems.
rsync has also Windows version and therefore, the same methodology should be working on MS Windows and good for batch scripting.
I've not tested it myself, yet as I've never used rsync on Windows, if someone has tried and it works pls drop me a short msg in comments.
3. Copy directory and exclude sub directories and files with find

Find in collaboration with cp can also be used to exclude certain directories while copying. Actually this method is better than the GNU tar hack and surely more efficient. For machines, where rsync is not installed it is just a perfect way to copy files from location to location, while excluding some directories, here is an example use of find and cp, for the above nginx case:

nginx:~# cd /usr/local/nginx
nginx:~# mkdir /usr/local/nginx
nginx:/usr/local/nginx# find . -type d \( ! -name logs \) -print -exec cp -rpf '{}' /usr/local/nginx-bak \;

This will find all directories inside /usr/local/nginx with find command print them on the screen, then execute recursive copy over each found directory and copy to /usr/local/nginx-bak

This example will work fine in the nginx case because /usr/local/nginx does not contain any files but only sub-directories. In other occwhere the directory does contain some files besides sub-directories the files had to also be copied e.g.:

# for i in $(ls -l | egrep -v '^d'); do\
cp -rpf $i /destination/directory

This will copy the files from source directory (for instance /usr/local/nginx/my_file.txt, /usr/local/nginx/my_file1.txt etc.), which doesn't belong to a subdirectory.

The cmd expression:

# ls -l | egrep -v '^d'

Lists only the files while excluding all the directories and in a for loop each of the files is copied to /destination/directory

If someone has better ideas, please share with me 🙂

How to fix a broken QMAIL queue with queue-repair and qmhandle

Friday, May 27th, 2011

Reading Time: 5minutes
How qmail works, qmail queue picture :)

The aim of this small post is to give just a brief idea of how I fix my qmail server after breaking it or in case it is broken after mail bomb attacks, etc.

Most common cases when I break my qmail queue myself, are after I’m implementing some new patches and reinstall parts of the qmail server with a patched version of default qmail binaries.
On other occasions, I simply used the qmailctl to start or stop the server as a part of some routine tasks necessery for the administration of the qmail server.

Everybody who has already experience with qmail should have experienced, that qmail is very fragile and could break even with a simple changes, though if it works once it’s rock solid piece of mail servant.

Below I explain few ways I used through my days as a qmail sys admin to deal with broken or messed queues.

1. Fixing a broken qmail queue using automatic tools There are few handy tools which in most cases are able to solve issues with the queue, one very popular one isqueue-repair – check http://pyropus.ca/software/queue-repair/.
Installation of qmail-repair is dead easy, but it needs to be installed from source as no official debian package is available:

linux:/usr/local/src# wget http://pyropus.ca/software/queue-repair/queue-repair-0.9.0.tar.gz
linux:/usr/local/src# tar -xzvvf queue-repair-0.9.0.tar.gzdrwxr-xr-x charlesc/qcc 0 2003-10-22 16:54 queue-repair-0.9.0/
-rw-r--r-- charlesc/qcc 268 2003-10-22 16:54 queue-repair-0.9.0/TODO
-rw-r--r-- charlesc/qcc 1700 2003-10-22 16:54 queue-repair-0.9.0/CHANGELOG
-rw-r--r-- charlesc/qcc 18007 2003-10-22 16:54 queue-repair-0.9.0/COPYING
-rw-r--r-- charlesc/qcc 1098 2003-10-22 16:54 queue-repair-0.9.0/BLURB
-rwxr-xr-x charlesc/qcc 26286 2003-10-22 16:54 queue-repair-0.9.0/queue_repair.py

To check if there are issues fixable within the qmail queue it’s as easy as:

linux:/usr/local/src# cd queue-repair-0.9.0
linux:/usr/local/src/queue-repair-0.9.0# ./queue-repair -t
...
checking files...
checking queue/mess files...
checking split locations...

The tool will walk through the mail sub-directories containing mail queued files in /var/qmail/queue and will list any issues found.
It’s recommended that the qmail server is stopped before any queue modify operations are issued on the server:

linux:/usr/local/src# qmailctl stop
...

Further on in order to solve any found issues with the queue, there is the “-r”/repair option:

linux:/usr/local/src/queue-repair-0.9.0# ./queue-repair -r
...

Another tool which comes handy whether a repair of a messed qmail queue is needed is qmhandlehttp://sourceforge.net/projects/qmhandle/

The use of qmhandle is also pretty easy, all one has to do is to follow the usual classical steps of a download the source & compile:

linux:/usr/local/src# wget https://pc-freak.net/files/qmhandle-1.3.2.tar.gz
linux:/usr/local/src# tar -zxvvf qmhandle-1.3.2
...
linux:/usr/local/src# cd qmhandle-1.3.2

Once again it’s necessery that the qmail server is stopped via its init script before qmHandle tool is used, e.g.:

linux:~# qmailctl stop
...

There is a difference between qmail queue repair tool and qmail handle , while qmail queue-repair tool is used to fix improper permissions of queued files with the qmail queue, qmhandle ‘s application is to completely delete the stored mail contents of a broken queue.

Deleting all the qmail queue content is in some cases the only option to fix the queue.
Often such a drastic measure is required after a heavy mail server overload, let’s say a result of spammers or caused by virus infected mail users which send a massive amounts of spam mails.

Thus at many cases when queue-repair was unable to solve a queue mess, I use qmhandble and sacrifice all the queued emails by completely wiping them out like so:

linux:/usr/local/src/qmhandle-1.3.2# ./qmhandle -D
...

Above command would eradicate all queued emails. Hopefully after the qmail server gets launched again with qmailctl start all the mail server operations should be back to normal.

Note that the use of qmhandle’s queue delete capabilities is pretty dangerous, if you forgot to stop the qmail server before issuing the above command!

Note that in order to use both qmHandle and queue-repair tools you will need to install python interpreter as both of the tools are written in python.

To check what is currently in the queue in Qmail, there are also native tools available, as you should probably know if you have dealt with qmail, e.g.:

debian:~# qmail-qstat
debian:~# qmail-qstat
messages in queue: 2
messages in queue but not yet preprocessed: 0

Often when there are problems with Qmail and more specificly with qmail server queue the qmail-qstat command does show messages in queue, however when an attempt to check what kind of messages are in the queue with qmail-qread no messages are shown, for instance below you see an example of that, even though qmail-qstat claims 2 messages are in the queue, qmail-qread is unable to list the messages:

debian:~# qmail-qread
debian:~#

If all is fine with qmail queue above’s qmail-qread command should have returned something similar to:

debian:~# qmail-qread
26 May 2011 07:46:47 GMT #659982 3517 <hipo@pc-freak.net>
remote somemail@gmail.nl
26 May 2011 07:46:47 GMT #659983 3517 <hipo@pc-freak.net>

2. Fixing qmail queue manually This is very dangerous initiative, so before you try anything, make sure that you know what you’re doing, the possibility that you make the situation worst if you attempt to tamper manually the qmail queue is quite high 🙂

However if you’re still convinced to try fixing it manually, take a look at /var/qmail/queue it’s very likely that there are permission issues with some of the queued files, in order to fix the situation it’s necessery that the following directories:

/var/qmail/queue/mess/
/var/qmail/queue/remote/
/var/qmail/queue/bounce
/var/qmail/queue/info

gets explored with midnight commander / mc or some kind of convenient file explorer.

If there are queued files owned by users different from qmailq and user group qmail , for instance if owned by the root user, a simple chown qmailq:qmail to the wrong permissions file, should be able to resolve the issues.

Apart from all I explain above, there are many other ways suggested online on howto clean a qmail queue, one very popular one is using James’s qfixq shell script.

This script as of this very date is not working on Debian based systems, the script is dedicated initially to run on Fedora and Redhat based Linuces

Moreover myy experience with qfixq was never successful.

One very important note which is often a cause of many problems, is always make sure you stop and start the qmail server with an interval of at least of 10 seconds.

I’ve managed many servers which after an immediate (undelayed) qmailctl stop and qmailctl start was unable to run the whole engine of the qmail server (and either email sending or email receiving was not properly working) afterwards.

In that cases many weird behaviours are common, consider this seriously if you deal with the qmail-queue, it might happen that even if you have fixed your qmail queue, after a restart the qmail might breaks up.
I’ve experienced this kind of oddities numerous times, thus when I do changes to qmail I always make sure I restart the server a couple of times (at least 5 times 😉 ) always with a good delay between the HUPs.

And as always with qmail prayer is always needed, this server is complex, you never know what will happen next 🙂