Watchdog Archives - ☩ Walking in Light with Christ - Faith, Computing, Diary ☩ Walking in Light with Christ

Posts Tagged ‘Watchdog’

Fixing insserv: warning: script ‘…’ missing LSB tags and overrides on Debian and Ubuntu Linux

Friday, April 12th, 2013

Some of packages I just tried to install on one of the Debian servers I admin failed during package (set up) configuration stage. Here is little paste with the errors due to it dpkg-reconfigure on each of newly set-up packages failed:

Setting up acct (6.5.4-2.1) ... insserv: warning: script 'K02courier-imap' missing LSB tags and overrides insserv: warning: script 'courier-imapd' missing LSB tags and overrides insserv: warning: script 'iptables' missing LSB tags and overrides insserv: warning: script 'courier-imap' missing LSB tags and overrides

Because of this whole package install failed and the usual

# apt-get -f install
supposed to fix mess with packages end up with same errors:

insserv: warning: script 'K02courier-imap' missing LSB tags and overrides insserv: warning: script 'iptables' missing LSB tags and overrides insserv: warning: script 'courier-imap' missing LSB tags and overrides insserv: There is a loop between service watchdog and iptables if stopped insserv: loop involving service iptables at depth 2 insserv: loop involving service watchdog at depth 1 insserv: Stopping iptables depends on watchdog and therefore on system facility `$all' which can not be true! insserv: exiting now without changing boot order! update-rc.d: error: insserv rejected the script header dpkg: error processing acct (--configure): subprocess installed post-installation script returned error exit status 1
The scripts in question iptables / courier-imap / K02courier-imap were custom created scripts by me earlier and I have completely forgot about it. In Debian 5 and earlier I used the same scripts to make system load custom services not installed through a standard Debian package. After a bit of research, I've noticed in newer Debian / Ubuntu release, new Commented tags are included in all Debian belonging packages init scripts. Thus the reason for failing package configuration, were my custom scripts were missing those tags. To get around the situation I had to open manually each of the scripts missing init script LSB tags i.e. ( iptables / courier-imap / K02courier-imap ) and add after
#! /bin/sh
shebang;

### BEGIN INIT INFO # Provides: skeleton # Required-Start: $remote_fs $syslog # Required-Stop: $remote_fs $syslog # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Example initscript # Description: This file should be used to construct scripts to be # placed in /etc/init.d. ### END INIT INFO

Once those "boilerplate", skele comments are included to solve the mess I had to run again:

# apt-get -f install
This solves it. Enjoy 🙂

Tags: configuration stage, courier imap, exit status, exiting, imap, imapd, installation script, iptables, Linux, overrides, script header, scripts, servers, ubuntu linux, Watchdog
Posted in Linux, System Administration | 1 Comment »

How to automatically reboot (restart) Debian GNU Lenny / Squeeze Linux on kernel panic, some general CPU overload or system crash

Monday, June 21st, 2010

If you are a system administrator, you have probably wondered at least once ohw to configure your Linux server to automatically reboot itself if it crashes, is going through a mass CPU overload, e.g. the server load average “hits the sky”.
I just learned from a nice article found here that there is a kernel variable which when enabled takes care to automatically restart a crashed server with the terrible Kernel Panic message we all know.

The variable I’m taking about is kernel.panic for instance kernel.panic = 20 would instruct your GNU Linux kernel to automatically reboot if it experiences a kernel panic system crash within a time limit of 20 seconds.

To start using the auto-reboot linux capabilities on a kernel panic occurance just set the variable to /etc/sysctl.conf

debian-server:~# echo 'kernel.panic = 20' >> /etc/sysctl.conf

Now we will also have to enable the variable to start being use on the system, so execute:

debian-server:~# sysctl -p There you go automatic system reboots on kernel panics is now on.
Now to further assure yourself the linux server you’re responsible of will automatically restart itself on a emergency situation like a system overload I suggest you check Watchdog

You might consider checking out this auto reboot tutorial which explains in simple words how watchdog is installed and configured.
On Debian installing and maintaining watchdog is really simple and comes to installing and enabling the watchdog system service, right afteryou made two changes in it’s configuration file /etc/watchdog.conf

To do so execute:

debian-server:~# apt-get install watchdog debian-server:~# echo "file = /var/log/messages" >> /etc/watchdog.conf debian-server:~# echo "watchdog-device = /dev/watchdog" >> /etc/watchdog.conf

Well that should be it, you might also need to load some kernel module to monitor your watchdog.
On my system the kernel modules related to watchdog are located in:

/lib/modules/2.6.26-2-amd64/kernel/drivers/watchdog/
If not then you should certainly try the software watchdog linux kernel module called softdog , to do so issue:
debian-server:~# /sbin/modprobe softdog

It’s best if you load the module while the softdog daemon is disabled.
If you consider auto loadig the softdog software watchdog kernel driver you should exec:

debian-server:~# echo 'softdog' >> /etc/modules

Finally a start of the watchdog is necessery:

debian-server:~# /etc/init.d/watchdog start Stopping watchdog keepalive daemon.... Starting watchdog daemon....

That should be all your automatic system reboots should be now on! 🙂

Tags: auto reboot, care, configuration file, configure, crash, debian gnu, debian server, emergency, emergency situation, file, gnu linux, How to automatically reboot (restart) Debian GNU Lenny Linux on kernel panic, init, instance, kernel panic, kernel panics, limit, linux capabilities, linux kernel, linux server, log, log messages, modprobe, nbsp, necessery, ohw, quot, server load, sky, software, software watchdog with kernel module softdog, some general CPU overload or system crash, squeeze, Stopping, system administrator, system crash, system overload, time, time limit, tutorial, Watchdog, watchdog system
Posted in Linux, System Administration | 6 Comments »

How to auto restart CentOS Linux server with software watchdog (softdog) to reduce server downtime

Wednesday, August 10th, 2011

How to auto restart centos with software watchdog daemon to mitigate server downtimes, watchdog linux artistic logo

I’m in charge of dozen of Linux servers these days and therefore am required to restart many of the servers with a support ticket (because many of the Data Centers where the servers are co-located does not have a web interface or IPKVM connected to the server for that purpose). Therefore the server restart requests in case of crash sometimes gets processed in few hours or in best case in at least half an hour.

I’m aware of the existence of Hardware Watchdog devices, which are capable to detect if a server is hanged and auto-restart it, however the servers I administrate does not have Hardware support for Watchdog timer.

Thanksfully there is a free software project called Watchdog which is easily configured and mitigates the terrible downtimes caused every now and then by a server crash and respective delays by tech support in Data Centers.

I’ve recently blogged on the topic of Debian Linux auto-restart in case of kernel panic , however now i had to conifgure watchdog on some dozen of CentOS Linux servers.

It appeared installation & configuration of Watchdog on CentOS is a piece of cake and comes to simply following few easy steps, which I’ll explain quickly in this post:

1. Install with yum watchdog to CentOS

[root@centos:/etc/init.d ]# yum install watchdog ...

2. Add to configuration a log file to log watchdog activities and location of the watchdog device

The quickest way to add this two is to use echo to append it in /etc/watchdog.conf:

[root@centos:/etc/init.d ]# echo 'file = /var/log/messages' >> /etc/watchdog.conf echo 'watchdog-device = /dev/watchdog' >> /etc/watchdog.conf

3. Load the softdog kernel module to initialize the software watchdog via /dev/watchdog

[root@centos:/etc/init.d ]# /sbin/modprobe softdog

Initialization of softdog should be indicated by a line in dmesg kernel log like the one above:

[root@centos:/etc/init.d ]# dmesg |grep -i watchdog Software Watchdog Timer: 0.07 initialized. soft_noboot=0 soft_margin=60 sec (nowayout= 0)

4. Include the softdog kernel module to load on CentOS boot up

This is necessery, because otherwise after reboot the softdog would not be auto initialized and without it being initialized, the watchdog daemon service could not function as it does automatically auto reboots the server if the /dev/watchdog disappears.

It’s better that the softdog module is not loaded via /etc/rc.local but the default CentOS methodology to load module from /etc/rc.module is used:

[root@centos:/etc/init.d ]# echo modprobe softdog >> /etc/rc.modules [root@centos:/etc/init.d ]# chmod +x /etc/rc.modules

5. Start the watchdog daemon service

The succesful intialization of softdog in step 4, should have provided the system with /dev/watchdog, before proceeding with starting up the watchdog daemon it’s wise to first check if /dev/watchdog is existent on the system. Here is how:

[root@centos:/etc/init.d ]# ls -al /dev/watchdogcrw------- 1 root root 10, 130 Aug 10 14:03 /dev/watchdog

Being sure, that /dev/watchdog is there, I’ll start the watchdog service.

[root@centos:/etc/init.d ]# service watchdog restart ...

Very important note to make here is that you should never ever configure watchdog service to run on boot time with chkconfig. In other words the status from chkconfig for watchdog boot on all levels should be off like so:

[root@centos:/etc/init.d ]# chkconfig --list |grep -i watchdog watchdog 0:off 1:off 2:off 3:off 4:off 5:off 6:off

Enabling the watchdog from the chkconfig will cause watchdog to automatically restart the system as it will probably start the watchdog daemon before the softdog module is initialized. As watchdog will be unable to read the /dev/watchdog it will though the system has hanged even though the system might be in a boot process. Therefore it will end up in an endless loops of reboots which can only be fixed in a linux single user mode!!! Once again BEWARE, never ever activate watchdog via chkconfig!

Next step to be absolutely sure that watchdog device is running it can be checked with normal ps command:

[root@centos:/etc/init.d ]# ps aux|grep -i watchdog root@hosting1-fr [~]# ps axu|grep -i watch|grep -v greproot 18692 0.0 0.0 1816 1812 ? SNLs 14:03 0:00 /usr/sbin/watchdog root 25225 0.0 0.0 0 0 ? ZN 17:25 0:00 [watchdog] <defunct>

You have probably noticed the defunct state of watchdog, consider that as absolutely normal, above output indicates that now watchdog is properly running on the host and waiting to auto reboot in case of sudden /dev/watchdog disappearance.

As a last step before, after being sure its initialized properly, it’s necessery to add watchdog to run on boot time via /etc/rc.local post init script, like so:

[root@centos:/etc/init.d ]# echo 'echo /sbin/service watchdog start' >> /etc/rc.local

Now enjoy, watchdog is up and running and will automatically restart the CentOS host 😉

Tags: CentOS, crash, data, dmesg, existence, file, free software project, half an hour, hardware support, host, init, installation, installation configuration, kernel panic, Linux, linux server, linux servers, log, log messages, modprobe, necessery, piece of cake, root, server crash, server downtime, software, support, support ticket, tech support, ticket, time, topic, Watchdog, watchdog timer, web interface, yum
Posted in Linux, System Administration | 1 Comment »

How to enable AUTO fsck (ext3, ext4, reiserfs, LVM filesystems) checking on Linux boot through /etc/fstab

Tuesday, July 12th, 2011

How to auto FSCK manual fsck screenshot

Are you an administrator of servers and it happens a server is DOWN.
You request the Data Center to reboot, however suddenly the server fails to boot properly and you have to request for IPKVM or some web java interface to directly access the server physical terminal …

This is a very normal admin scenario and many people who have worked in the field of remote system administrators (like me), should have experienced that bad times multiple times.

Sadly enough only a insignifant number of administrators try to do their best to reduce this down times to resolve client stuff downtime but prefer spending time playing the ztype! game or watching some porn website 😉

Anyways there are plenty of things like Server Auto Reboot on Crash with software Watchdog etc., that we as sysadmins can do to reduce server downtimes and most of the manual human interactions on server boot time.

In that manner of thougts a very common thing when setting up a new Linux server that many server admins forget or don’t know is to enable all the server partition filesystems to be auto fscked during server boot time.

By not enabling the auto filesystem check options in Linux the server filesystems did not automatically scan and fix hard drive partitions for fs innode inconsistencies.
Even though the filesystems are tuned to automatically get checked on every 38 system reboots, still if some kind of filesystem errors are found that require a manual confirmation the boot process is interrupted and the admin ends up with a server which is not reachable remotely via ssh !

For the remote system administrator, this times are a terrible times of waitings, prayers and hopes that the server hardware is fine 😉 as well as being on hold to get a KVM to get into the server manually and enter the necessery input to fsck prompt.

Many of this bad times can be completely avoided with a very simple fix through /etc/fstab by enabling all server partitions containing any filesystem to be automatically checked and fixed in case if inconsistencies or errors are found by fsck.ext3, fsck.ext4, fsck.reiserfs etc. commands.

A very typical default /etc/fstab file you will find on many servers should look something like:

/dev/sda8 / ext3 errors=remount-ro 0 1 tmpfs /dev/shm tmpfs defaults 0 0 devpts /dev/pts devpts gid=5,mode=620 0 0 sysfs /sys sysfs defaults 0 0 proc /proc proc defaults 0 0 /dev/sda1 /home ext3 defaults 0 0

Notice the line:
/dev/sda1 /home ext3 defaults 0 0

The first column in the example contains the device name, the second one its mount point, third its filesystem type, fourth the mount options, fifth (a number) dump options, and sixth (another number) filesystem check options. Let’s take a closer look at this stuff.

The ones which are interesting to enable auto fsck checking and error resolving is provided usually by the last sixth variable (filesystem check option) which in the above example equals 0 .

When the filesystem check option equals 0 this means the auto fsck and repair for the respective filesystem is disabled.
Some time in the past the dump backup option (5th option in the example) was also used but as far as I can understand today it’s not that important in modern GNU/Linux distributions.

Now having the above sample crontab in order to enable the fsck file checking on Linux boot for /dev/sda1 , we will need to modify the above line’s filesystem check option be 2, e.g. the line would afterwards look like:

/dev/sda1 /home ext3 defaults 0 2

Setting the 2 as an option for filesystem check is necessery for every filesystem which is not mounted as a root filesystem /

In above example /etc/fstab you already see that auto filesystem fsck is enabled for root partition:

/dev/sda8 / ext3 errors=remount-ro 0 1
(notice the 1 in the end of the line)

Finally a modified version of the default sample /etc/fstab which will check the extra /dev/sda1 /home partition would look like so:

Making sure all Linux server partitions has the auto filesystem check option enabled is something absoultely necessery!
Enabling the auto fsck on servers always makes me sleep calmer 😉
Hope it helps your too. 🙂

Tags: auto reboot, boot process, boot time, center, client, crash, data, ext, file, filesystem errors, hard drive partitions, human interactions, inconsistencies, java interface, linux server, multiple times, necessery, number, option, partition, physical terminal, porn website, reiserfs, root, sda, server boot, server downtimes, server hardware, shm, software, something, spending, spending time, system administrators, terminal, terrible times, time, Watchdog, web java, ztype
Posted in Linux, System Administration | 2 Comments »

☩ Walking in Light with Christ – Faith, Computing, Diary

Posts Tagged ‘Watchdog’

Daily Bible quote

GET ARTICLE UPDATES

Useful blog? Help it:

Links to Other Places

Recent Posts

Ads

Categories

About Myself

Recent Comments

Top Post Views

blogtopsites