Posts Tagged ‘Auto’

How to install VirtualBox Virtual Machine to run Windows XP on Ubuntu Linux (11.10)

Tuesday, January 17th, 2012

Enable_VirtualBox_Windows_XP-fullscreen-with-vboxguest-additions-iso
My beloved sister was complaining that her games failed to run properly under the Wine emulator, so I decided to be kind and help her by installing Windows XP inside a virtual machine. My previous experiments with running MS Windows XP on Linux were on Debian, using the QEMU virtual machine emulator.
However, as QEMU is a less interactive and slower virtual machine for running Windows (though I prefer it for being completely free software), this time I decided to install the Windows OS with VirtualBox.

My hope was that using VirtualBox would be way easier, but I was wrong… I faced a few troubles, and I suspect many people who try to install a VirtualBox VM to run Windows on Ubuntu and other Debian-based Linux distros will experience the same problems, so here is how this article was born.

Here is what I did to have a VirtualBox OS emulator to run Windows XP SP2 on Ubuntu 11.10 Linux

1. Install Virtualbox required packages with apt

root@ubuntu:~# apt-get install virtualbox virtualbox-dkms virtualbox-guest-dkms
root@ubuntu:~# apt-get install virtualbox-ose-dkms virtualbox-guest-utils virtualbox-guest-x11
...

If you prefer a GUI, or are too lazy to type commands, the Software Package Manager can also be used to install the same packages.
The virtualbox-dkms and virtualbox-guest-dkms packages are the two which are absolutely necessary to enable VirtualBox to install Microsoft Windows XP. The DKMS modules are also necessary to be able to emulate some other proprietary (non-free) operating systems.
The DKMS packages provide the source for building the VirtualBox guest (OS) additional kernel modules. They also require the kernel headers to be installed, otherwise they fail to compile.

Failing to build the DKMS modules will give you an error every time you try to create a new virtual machine container for installing a fresh Windows XP.
The error happens when the two packages fail to properly build the vboxdrv extra VirtualBox kernel module; it pops up while the Windows XP installer is loaded from a CD or ISO:

Kernel driver not installed (rc=-1908)

The VirtualBox Linux kernel driver (vboxdrv) is either not loaded or there is a permission problem with /dev/vboxdrv. Please reinstall the kernel module by executing

VirtualBox vboxdrv not loaded error Ubuntu Screen

To fix the error:

2. Install latest Kernel source that corresponds to your current kernel version

root@ubuntu:~# apt-get install linux-headers-`uname -r`
...
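To double-check that the headers matching the running kernel really got installed (a quick sanity check, not strictly required), you can compare the running kernel version against the installed headers package:

root@ubuntu:~# uname -r
root@ubuntu:~# dpkg -l | grep linux-headers-`uname -r`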

Next, it is necessary to rebuild the DKMS modules using dpkg-reconfigure:

3. Rebuild VirtualBox DKMS deb packages

root@ubuntu:~# dpkg-reconfigure virtualbox-dkms
...
root@ubuntu:~# dpkg-reconfigure virtualbox-guest-dkms
...
root@ubuntu:~# dpkg-reconfigure virtualbox-ose-dkms
...

Hopefully, the compilation of the vboxdrv kernel module will complete successfully.
To test that all is fine just load the module:

4. Load vboxdrv virtualbox kernel module

root@ubuntu:~# modprobe vboxdrv
root@ubuntu:~#

If you get an error during loading, this means vboxdrv failed to compile properly; read the error message thoroughly and fix it 😉

As a next step the vboxdrv has to be set to load on every system boot.

5. Set vboxdrv to load on every Ubuntu boot

root@ubuntu:~# echo 'vboxdrv' >> /etc/modules

I am not sure if this step is required; it could be that the /etc/init.d/virtualbox init script loads the module automatically. Anyway, setting it to load on boot does no harm, so better do it.
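After the next reboot you can quickly verify the module came up (no output from the command below means it is not loaded):

root@ubuntu:~# lsmod | grep vboxdrv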

That's all. Now you can launch VirtualBox and use the New button to initiate a new virtual machine. I will skip explaining the configuration for a Windows XP guest, as most of the defaults offered simply work without any tampering.

After booting the Windows XP installer, I simply followed the usual steps to install Windows and all went smoothly.
Below you see a screenshot showing the installed Windows XP VirtualBox saved VM session. The screenshot letters are in Bulgarian, as my sister's default language for Ubuntu is Bulgarian 😉

VirtualBox installed MS Windows VM screenshot

I hope this article helps someone out there. Please drop me a comment if you experience any troubles with it. Cya 🙂

Non-free packages to install to make Ubuntu Linux Multimedia ready / Post install packages for new Ubuntu installations

Monday, January 23rd, 2012

non-free-packages-to-install-make-ubuntu-linux-multimedia-ready

1. Add Medibuntu package repository

root@ubuntu:~# wget --output-document=/etc/apt/sources.list.d/medibuntu.list \
http://www.medibuntu.org/sources.list.d/$(lsb_release -cs).list \
&& apt-get --quiet update \
&& apt-get --yes --quiet --allow-unauthenticated install medibuntu-keyring \
&& apt-get --quiet update

2. Enable Ubuntu to play restricted DVDs

root@ubuntu:~# apt-get install --yes libdvdread4
...
root@ubuntu:~# /usr/share/doc/libdvdread4/install-css.sh

After that VLC will be ready to play DVDs. For some programs which were compiled without DVD support, a source rebuild is required.

If DVDs hang you might need to set a Region Code with regionset:

# regionset

3. Install non-free codecs

root@ubuntu:~# apt-get install non-free-codecs

4. Install Chromium ffmpeg nonfree codecs

root@ubuntu:~# apt-get install chromium
root@ubuntu:~# apt-get install chromium-codecs-ffmpeg-nonfree

5. Install w32codecs / w64codecs

Depending on whether the Ubuntu Linux installation architecture is 32-bit or 64-bit, install w32codecs or w64codecs (a small auto-detect sketch follows the two commands below).

For 32 bit (x86) Ubuntu install w32codecs:

root@ubuntu:~# apt-get install w32codecs

For 64 bit arch Ubuntu:

root@ubuntu:~# apt-get install w64codecs
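If you script your post-install steps, a small sketch to pick the right codecs package automatically (dpkg --print-architecture reports amd64 on 64-bit Ubuntu):

if [ "$(dpkg --print-architecture)" = "amd64" ]; then
apt-get install w64codecs
else
apt-get install w32codecs
fi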

6. Install ubuntu-restricted-extras meta package

root@ubuntu:~# apt-get install ubuntu-restricted-extras

7. Install cheese for webcam picture/video snapshotting

root@ubuntu:~# apt-get install cheese

8. Install GIMP, Inkscape, xsane, sane, shotwell etc.

root@ubuntu:~# apt-get --yes install sane xsane gimp inkscape gimp-data-extras gimp-plugin-registry \
blender gcolor2 shotwell bluefish kompozer

9. Install multimedia Sound & Video utilities

Install subtitle editors, video editing, sound editing, an mp3 player, ISO mounters, DVD/CD burners:

root@ubuntu:~# apt-get install rhythmbox banshee smplayer mplayer \
realplayer audacity brasero jokosher istanbul gtk-recordmydesktop \
acetoneiso hexedit furiusisomount winff fala audacious dvdstyler lives hydrogen \
subtitleeditor gnome-subtitles electricsheep k3b

10. Install CD / DVD RIP tools

root@ubuntu:~# apt-get install acidrip sound-juicer ogmrip thoggen

11. Install chat messenger programs, browsers, mail POP3 clients, torrent, emulators, FTP clients etc.

root@ubuntu:~# apt-get install seamonkey thunderbird transmission transmission-gtk gbgoffice kbedic \
pidgin openoffice.org gxine mozilla-plugin-vlc wine dosbox samba filezilla amsn ntp \
epiphany-browser ntpdate desktop-webmail alltray chmsee gftp xchat-gnome ghex \
gnome-genius bleachbit arista

12. Install Non-Free Flash Player

Unfortunately Gnash is not yet production ready and crashes on many websites …

root@ubuntu:~# apt-get install flashplugin-nonfree flashplugin-nonfree-extrasound swfdec-gnome

13. Install Archive / Unarchive management programs

root@ubuntu:~# apt-get install unace unrar zip unzip p7zip-full p7zip-rar sharutils rar uudeview \
mpack lha arj cabextract file-roller

14. Install VirtualBox and QEmu

root@ubuntu:~# apt-get install qemu-launcher qemu-kvm-extras virtualbox virtualbox-ose \
virtualbox-ose-guest-dkms

This should be enough to make Ubuntu a usable multimedia desktop, just like MS Windows, for most daily activities.
Am I missing some important program?

How to search text strings only in hidden files dot (.) files within a directory on Linux and FreeBSD

Saturday, April 28th, 2012

how-to-search-hidden-files-linux-freebsd-logo_grep
If there is a need to look for a string in all hidden files and all sub-level subdirectories (be aware this will be time consuming and CPU stressing), use:
 

hipo@noah:~$ grep -rli 'PATH' .*

./.gftp/gftprc
./.gftp/cache/cache.OOqZVP
….

Sometimes it is necessary to grep only within the first-level hidden files (let's say you would like to grep for a 'PATH' variable string within the $HOME directory); the command is:

hipo@noah:~$ grep PATH .[!.]*

.profile:PATH=/bin:/usr/bin/:${PATH}
.profile:export PATH
.profile:# set PATH so it includes user's private bin if it exists
.profile: PATH="$HOME/bin:$PATH"
.profile.language-env-bak:# set PATH so it includes user's private bin if it exists
.profile.language-env-bak: PATH="$HOME/bin:$PATH"
.viminfo:?/PATH
.xcyrillic: XNLSPATH=/usr/X11R6/lib/X11/nls
.xcyrillic: export XNLSPATH

The shell glob pattern .[!.]* means: exclude any file or directory name starting with '..', i.e. match only names beginning with a single dot.
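To preview which names the glob will actually match before grepping, you can let the shell expand it with echo (the output below is just an example; it depends on what is in your home directory):

hipo@noah:~$ echo .[!.]*
.bash_history .bashrc .gftp .profile .viminfo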

Note that to use grep PATH .[!.]* on FreeBSD you will have to run it under the bash shell; the default BSD csh or tcsh shells will not expand the glob, e.g.:

grep PATH '.[!.]*'
grep: .[!.]*: No such file or directory

Hence on BSD, if you need to look up a string within the home directory hidden files (.profile, .bashrc, .bash_profile, .cshrc), run it under the bash shell:

freebsd# /usr/local/bin/bash
[root@freebsd:/home/hipo]# grep PATH .[!.]*

.bash_profile:# set PATH so it includes user's private bin if it exists
.bash_profile:# PATH=~/bin:"${PATH}"
.bash_profile:# do the same with …

Another, easier to remember, alternative grep command is:

hipo@noah:~$ grep PATH .*
.profile:PATH=/bin:/usr/bin/:${PATH}
.profile:export PATH
.profile:# set PATH so it includes user's private bin if it exists
.profile: PATH="$HOME/bin:$PATH"
….

Note that grep 'string' .* is a bit different in meaning, as it will not prevent grep from matching filenames like ..filename1, ..filename2, etc.
Though grep 'string' .* will work, note that it will sometimes output unwanted matches if filenames beginning with a double dot are present …
That's all folks 🙂

Resolving “nf_conntrack: table full, dropping packet.” flood message in dmesg Linux kernel log

Wednesday, March 28th, 2012

nf_conntrack_table_full_dropping_packet
On many busy servers, you might encounter in /var/log/syslog or the dmesg kernel log messages like

nf_conntrack: table full, dropping packet

appearing repeatedly:

[1737157.057528] nf_conntrack: table full, dropping packet.
[1737157.160357] nf_conntrack: table full, dropping packet.
[1737157.260534] nf_conntrack: table full, dropping packet.
[1737157.361837] nf_conntrack: table full, dropping packet.
[1737157.462305] nf_conntrack: table full, dropping packet.
[1737157.564270] nf_conntrack: table full, dropping packet.
[1737157.666836] nf_conntrack: table full, dropping packet.
[1737157.767348] nf_conntrack: table full, dropping packet.
[1737157.868338] nf_conntrack: table full, dropping packet.
[1737157.969828] nf_conntrack: table full, dropping packet.
[1737157.969928] nf_conntrack: table full, dropping packet
[1737157.989828] nf_conntrack: table full, dropping packet
[1737162.214084] __ratelimit: 83 callbacks suppressed

There are two types of servers I've encountered this message on:

1. Xen / OpenVZ VPS (Virtual Private Servers)
2. ISPs – Internet Providers with heavy-traffic NAT network routers
 

I. What is the meaning of the nf_conntrack: table full, dropping packet error message

In short, this message appears because the kernel's assigned maximum number of tracked connections (nf_conntrack_max) gets reached.
The common reason for that is heavy traffic passing through the server, or very often a DoS or DDoS (Distributed Denial of Service) attack. Sometimes encountering the error is a result of bad server planning (incorrect data about the traffic load expected by a company/companies), or simply a sysadmin error…

– Checking the current maximum nf_conntrack value assigned on host:

linux:~# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
65536

– Alternative way to check the current kernel values for nf_conntrack is through:

linux:~# /sbin/sysctl -a|grep -i nf_conntrack_max
error: permission denied on key 'net.ipv4.route.flush'
net.netfilter.nf_conntrack_max = 65536
error: permission denied on key 'net.ipv6.route.flush'
net.nf_conntrack_max = 65536

– Check the current sysctl nf_conntrack active connections

To check the number of connections currently being tracked on a system:

linux:~# /sbin/sysctl net.netfilter.nf_conntrack_count
net.netfilter.nf_conntrack_count = 12742

The tracked connections are assigned dynamically on each new successful TCP/IP NAT-ted connection. Btw, on systems that work normally, without the dmesg log being flooded with the message, the output of lsmod is:

linux:~# /sbin/lsmod | egrep 'ip_tables|conntrack'
ip_tables 9899 1 iptable_filter
x_tables 14175 1 ip_tables

On servers encountering the nf_conntrack: table full, dropping packet error, issuing lsmod shows extra modules related to nf_conntrack loaded:

linux:~# /sbin/lsmod | egrep 'ip_tables|conntrack'
nf_conntrack_ipv4 10346 3 iptable_nat,nf_nat
nf_conntrack 60975 4 ipt_MASQUERADE,iptable_nat,nf_nat,nf_conntrack_ipv4
nf_defrag_ipv4 1073 1 nf_conntrack_ipv4
ip_tables 9899 2 iptable_nat,iptable_filter
x_tables 14175 3 ipt_MASQUERADE,iptable_nat,ip_tables
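While investigating, it also helps to watch how close the tracked-connections counter gets to the limit in real time; a simple sketch (the watch command is assumed present, as it is on most distros):

linux:~# watch -n 5 'sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max'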

 

II. Completely remove nf_conntrack support if it is not really necessary

It is good practice to limit, or completely omit, the use of iptables NAT rules, to prevent your kernel log from being flooded with these messages and, respectively, to stop your system from dropping connections.

Another option is to completely remove the modules related to nf_conntrack, iptable_nat and nf_nat.
To remove nf_conntrack support from the Linux kernel, if for instance the system is not used for Network Address Translation, use:

/sbin/rmmod iptable_nat
/sbin/rmmod ipt_MASQUERADE
/sbin/rmmod nf_nat
/sbin/rmmod nf_conntrack_ipv4
/sbin/rmmod nf_conntrack
/sbin/rmmod nf_defrag_ipv4

Once the modules are removed, be sure not to use any iptables -t nat .. rules. Even an attempt to list the NAT rules with iptables -t nat -L -n will force the kernel to load the nf_conntrack modules again.

Btw, the nf_conntrack: table full, dropping packet message is observable across all GNU / Linux distributions, so this is not some local distribution bug or Linux kernel (distro) customization.
 

III. Fixing the nf_conntrack … dropping packets error

– One temporary fix, if you need to keep your iptables NAT rules, is:

linux:~# sysctl -w net.netfilter.nf_conntrack_max=131072

I say temporary, because raising nf_conntrack_max doesn't guarantee things will go smoothly from now on.
However, on many not-so-heavily loaded servers, just raising net.netfilter.nf_conntrack_max to a high enough value will be enough to resolve the hassle.

– Increasing the size of nf_conntrack hash-table

The hash table (hashsize) value, which stores the lists of conntrack entries, should be increased proportionally whenever net.netfilter.nf_conntrack_max is raised.

linux:~# echo 32768 > /sys/module/nf_conntrack/parameters/hashsize

The rule of thumb to calculate the right value is:

hashsize = nf_conntrack_max / 4
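Putting the two together, a minimal sketch that raises the limit and sizes the hash table consistently (the 4:1 ratio is just the rule of thumb above, not a hard kernel requirement):

max=131072
/sbin/sysctl -w net.netfilter.nf_conntrack_max=$max
echo $((max / 4)) > /sys/module/nf_conntrack/parameters/hashsize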

– To permanently store the changes: a) put into /etc/sysctl.conf:

linux:~# echo 'net.netfilter.nf_conntrack_max = 131072' >> /etc/sysctl.conf
linux:~# /sbin/sysctl -p

b) put in /etc/rc.local (before the exit 0 line):

echo 32768 > /sys/module/nf_conntrack/parameters/hashsize

Note: Be careful with this variable. In my experience, raising it to too high a value (especially on XEN-patched kernels) can freeze the system.
Raising the value too high can also freeze a regular Linux server running on old hardware.

– For diagnosing nf_conntrack there is the /proc/sys/net/netfilter directory. There you can find dynamically stored values which give info concerning nf_conntrack operations in "real time":

linux:~# cd /proc/sys/net/netfilter
linux:/proc/sys/net/netfilter# ls -al nf_log/

total 0
dr-xr-xr-x 0 root root 0 Mar 23 23:02 ./
dr-xr-xr-x 0 root root 0 Mar 23 23:02 ../
-rw-r--r-- 1 root root 0 Mar 23 23:02 0
-rw-r--r-- 1 root root 0 Mar 23 23:02 1
-rw-r--r-- 1 root root 0 Mar 23 23:02 10
-rw-r--r-- 1 root root 0 Mar 23 23:02 11
-rw-r--r-- 1 root root 0 Mar 23 23:02 12
-rw-r--r-- 1 root root 0 Mar 23 23:02 2
-rw-r--r-- 1 root root 0 Mar 23 23:02 3
-rw-r--r-- 1 root root 0 Mar 23 23:02 4
-rw-r--r-- 1 root root 0 Mar 23 23:02 5
-rw-r--r-- 1 root root 0 Mar 23 23:02 6
-rw-r--r-- 1 root root 0 Mar 23 23:02 7
-rw-r--r-- 1 root root 0 Mar 23 23:02 8
-rw-r--r-- 1 root root 0 Mar 23 23:02 9

 

IV. Decreasing other nf_conntrack NAT time-out values to prevent server against DoS attacks

Generally, the default values for the nf_conntrack_* time-outs are (unnecessarily) large.
Therefore, with large flows of traffic, even if you increase nf_conntrack_max, you can still shortly end up with an overflowing nf_conntrack table, resulting in dropped server connections. To prevent this from happening, check and decrease the other nf_conntrack connection tracking timeout values:

linux:~# sysctl -a | grep conntrack | grep timeout
net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 432000
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_events_retry_timeout = 15
net.ipv4.netfilter.ip_conntrack_generic_timeout = 600
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent2 = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_recv = 60
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 432000
net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait = 60
net.ipv4.netfilter.ip_conntrack_tcp_timeout_last_ack = 30
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_max_retrans = 300
net.ipv4.netfilter.ip_conntrack_udp_timeout = 30
net.ipv4.netfilter.ip_conntrack_udp_timeout_stream = 180
net.ipv4.netfilter.ip_conntrack_icmp_timeout = 30

All the timeouts are in seconds. net.netfilter.nf_conntrack_generic_timeout, as you see, is quite high – 600 secs (10 minutes).
A value like this means any non-responding NAT-ted connection can stay hanging for 10 minutes!

The value of net.netfilter.nf_conntrack_tcp_timeout_established = 432000 is quite high too (5 days!).
If these values are not lowered, the server becomes an easy target for anyone who would like to flood it with excessive connections; once this happens the server will quickly reach even the raised net.nf_conntrack_max value and the initial connection dropping will re-occur again …

With all said, to protect the server from malicious users situated behind the NAT plaguing you with Denial of Service attacks:

Lower net.ipv4.netfilter.ip_conntrack_generic_timeout to 60 – 120 seconds and net.ipv4.netfilter.ip_conntrack_tcp_timeout_established to something like 54000:

linux:~# sysctl -w net.ipv4.netfilter.ip_conntrack_generic_timeout=120
linux:~# sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_established=54000

These timeouts should work fine on the router without creating interruptions for regular NAT users. After changing the values and monitoring for at least a few days, make the changes permanent by adding them to /etc/sysctl.conf:

linux:~# echo 'net.ipv4.netfilter.ip_conntrack_generic_timeout = 120' >> /etc/sysctl.conf
linux:~# echo 'net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 54000' >> /etc/sysctl.conf

How to permanently enable Cookies in Lynx text browser – Disable accept cookies prompt in lynx console browser

Wednesday, April 18th, 2012

lynx-text-browser-logo
The default behaviour of the lynx console text browser on Linuxes, BSD and other free OSes is to always ask with an accept-cookies prompt whenever a web page is opened that requires browser cookies to be enabled.

I should admit, having this "secure by default" (always ask for new cookies) behaviour in lynx is good practice from a security point of view.

Another reason why this cookies prompt is enabled by default is that back in the days when lynx was actively developed, websites with cookie support were not that many, and cookies were mostly required for user/pass authentication (those who still remember those days know the websites requiring authentication were way fewer than today) …
With this said, the browser's continuing security-cautious behaviour, left over from its old days, is understandable.

Screenshot Google Accept cookies Lynx dialog FreeBSD

However, I personally need to use lynx rather frequently, and this behaviour of prompting for a cookie on every newly opened website quickly becomes a big waste of time if you use lynx to browse more than a few sites. Hence I decided to change the default way lynx handles cookies and make them enabled by default instead.
Actually, even in the past, when I was mainly using the internet in console, on every new server or home Linux install I would again make cookies be permanently accepted.
Everyone who has used lynx a few times already knows how "annoying" the constant accept-cookie prompts are … This provoked me to write this short article explaining how to enable permanent cookie acceptance in lynx.

To enable persistent cookies in lynx, one needs to edit lynx.cfg; on different GNU / Linux and BSD* distributions lynx.cfg is located in a different directory.

The usual locations of lynx.cfg are /etc/lynx/lynx.cfg or /etc/lynx.cfg. As of the time of writing this post, in Debian Squeeze GNU / Linux lynx.cfg is located in /etc/lynx-cur/lynx.cfg, whereas for FreeBSD / NetBSD / OpenBSD users the file is located in /usr/local/etc/lynx.cfg.

What I did to allow all cookies is open lynx.cfg in vim and change the following lines:

a)

#FORCE_SSL_COOKIES_SECURE:FALSE

with

FORCE_SSL_COOKIES_SECURE:TRUE

b)

#SET_COOKIES:TRUE

uncomment it to:

SET_COOKIES:TRUE

c) next, change

ACCEPT_ALL_COOKIES:FALSE

to

ACCEPT_ALL_COOKIES:TRUE
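If you prefer to apply all three changes at once, a sketch with GNU sed (assuming the Debian location /etc/lynx-cur/lynx.cfg; adjust the path for your distro or BSD):

sed -i -e 's/^#\?FORCE_SSL_COOKIES_SECURE:FALSE/FORCE_SSL_COOKIES_SECURE:TRUE/' \
-e 's/^#SET_COOKIES:TRUE/SET_COOKIES:TRUE/' \
-e 's/^ACCEPT_ALL_COOKIES:FALSE/ACCEPT_ALL_COOKIES:TRUE/' /etc/lynx-cur/lynx.cfg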

From then on, opening any website with lynx auto-accepts the cookies.

lynx Always allowing from domain cookies Linux screenshot

Google in Bulgarian Lynx browser screenshot

For people who care about their security (who still browse in console, surely not many anymore), permanently allowing cookies is not a good idea. But for those ready to trade a little security for convenience, it's OK.
 

How to make a mirror of website on GNU / Linux with wget / Few tips on wget site mirroring

Wednesday, February 22nd, 2012

how-to-make-mirror-of-website-on-linux-wget

Everyone who has used Linux is probably familiar with wget, or has used this handy console download tool at least a thousand times. Not so many desktop GNU / Linux users, like Ubuntu and Fedora Linux users, have tried using wget for something more than single-file downloads.
Actually wget is not as popular as it used to be in the earlier Linux days. I've noticed the tendency of newer Linux users to prefer curl (I don't know why).

With all said, I'm sure there are plenty of Linux users curious how a website mirror can be made through wget.
This article will briefly suggest a few ways to do website mirroring on Linux / BSD, as wget is available on both free operating systems.

1. Most simple exact mirror copy of a website

The most basic use of wget's mirror capabilities is via wget's -m / --mirror argument:

# wget -m http://website-to-mirror.com/sub-directory/

Creating a mirror like this is not very good practice, as the links of the mirrored pages will still point to the external URLs. In other words, the link URLs will not point to your local copy, so if you are not connected to the internet and try to browse random links of the webpage, you will end up with many links that do not open because you have no internet connection.

2. Mirroring with links rewritten to point locally, and a delay between page downloads

Making a mirror with wget can put a heavy load on the remote server, as it fetches the files as quickly as the bandwidth allows. On busy servers, rapid downloads with wget can significantly slow down the server's response time, and on some highly loaded servers it can even cause the server to hang completely.
Hence, mirroring pages with wget without explicitly setting a delay between each page download could be considered by the remote server as a kind of DoS (denial of service) attack. Some site administrators have even set firewall rules, or configured web server modules like Apache mod_security, to filter out requests from IPs making too-frequent HTTP GET / POST requests.
To make wget wait 10 seconds between mirrored page downloads use:

# wget -mk -w 10 -np --random-wait http://website-to-mirror.com/sub-directory/

The -mk stands for -m / --mirror and -k, a shortcut for --convert-links (make links point locally); --random-wait tells wget to vary the wait between requests from 0.5× to 1.5× the -w value (here roughly 5 to 15 seconds) between each page download request.

3. Mirror / retrieve website sub directory ignoring robots.txt "mirror restrictions"

Some websites have a robots.txt which restricts content download by clients like wget and curl, or even completely prohibits crawlers from downloading their pages.

The /robots.txt restrictions are not a problem, as wget has an option to disable robots.txt checking when downloading.
Getting around the robots.txt restrictions with wget is possible through the -e robots=off option.
For instance, if you want to make a local mirror copy of a whole sub-directory with all links, with a delay of 10 seconds between each consecutive page request, without reading the robots.txt allow/forbid rules at all:

# wget -mk -w 10 -np -e robots=off --random-wait http://website-to-mirror.com/sub-directory/

4. Mirror a website which prohibits download managers like flashget, getright, go!zilla etc.

Sometimes when you try to use wget to mirror an entire site domain sub-directory, or the root site domain, you get an error similar to:

Sorry, but the download manager you are using to view this site is not supported.
We do not support use of such download managers as flashget, go!zilla, or getright

This message is produced by the site's dynamic page generation language (PHP / ASP / JSP etc.), as the website code checks the UserAgent sent by the client browser.
wget's default UserAgent sent to the remote webserver is:

Wget/1.11.4

As this is not a common desktop browser useragent, many webmasters configure their websites to accept only the well-known useragents sent by established desktop browsers.
Here are a few typical user agents which identify a desktop browser:
 

  • Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0
  • Mozilla/5.0 (X11; Linux i686; rv:6.0) Gecko/20100101 Firefox/6.0
  • Mozilla/6.0 (Macintosh; I; Intel Mac OS X 11_7_9; de-LI; rv:1.9b4) Gecko/2012010317 Firefox/10.0a4
  • Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:2.2a1pre) Gecko/20110324 Firefox/4.2a1pre

etc. etc.

If you're trying to mirror a website which has implemented some kind of useragent restriction based on a "valid" useragent, wget has the -U option enabling you to fake the useragent.

If you get the "Sorry, but the download manager you are using to view this site is not supported" message, fake / change wget's UserAgent with a command like:

# wget -mk -w 10 -np -e robots=off \
--random-wait \
--referer="http://www.google.com" \
--user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6" \
--header="Accept:text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5" \
--header="Accept-Language: en-us,en;q=0.5" \
--header="Accept-Encoding: gzip,deflate" \
--header="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" \
--header="Keep-Alive: 300" \
http://website-to-mirror.com/sub-directory/

For the sake of some wget anonymity, to make wget permanently hide its user agent and pretend to be a Mozilla Firefox running on MS Windows XP, use a .wgetrc in your home directory like the sketch below.
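A sketch of such a ~/.wgetrc (user_agent and header are standard wgetrc settings; the exact UA string is just an example):

user_agent = Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
header = Accept-Language: en-us,en;q=0.5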

5. Make a complete mirror of a website under a domain name

To retrieve a complete working copy of a site with wget, a good way is:

# wget -rkpNl5 -w 10 --random-wait www.website-to-mirror.com

Where the arguments meaning is:
-r – Retrieve recursively
-k – Convert the links in documents to make them suitable for local viewing
-p – Download everything (inline images, sounds and referenced stylesheets etc.)
-N – Turn on time-stamping
-l5 – Specify recursion maximum depth level of 5

6. Make a static mirror of a dynamic site by converting CGI, ASP, PHP etc. pages to HTML for offline browsing

It is common for website pages to end in .php / .asp / .cgi extensions. An example of what I mean is, for instance, the URL http://php.net/manual/en/tutorial.php. You see the URL page is tutorial.php; once mirrored with wget, the local copy will also end in .php, and therefore will not be suitable for local browsing, as the local browser does not know how to interpret the .php extension.
Therefore, to copy a website with non-HTML extensions and make it browsable offline as HTML, there is the --html-extension option, e.g.:

# wget -mk -w 10 -np -e robots=off \
--random-wait \
--html-extension http://www.website-to-mirror.com

A good practice in mirror making is to set a download rate limit. Setting such a rate is good for both sides (the local downloading host and the remote server). A download limit is also useful when mirroring websites consisting of many enormous files (documentary movies, music etc.).
To set a download limit, add the --limit-rate= option; passing --limit-rate=200k to wget would limit the download speed to 200KB/s.

Another useful thing, to ensure wget has made an accurate mirror, is wget logging; to use it, pass -o ./my_mirror.log to wget, as combined in the example below.
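Putting the last two tips together, a rate-limited and logged mirror run might look like (same placeholder URL as above):

# wget -mk -w 10 --random-wait --limit-rate=200k -o ./my_mirror.log http://www.website-to-mirror.com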
 

Auto restart Apache on high server load (bash shell script) – Fixing Apache server temporary overload issues

Saturday, March 24th, 2012

auto-restart-apache-on-high-load-bash-shell-script-fixing-apache-temporal-overload-issues

I've written a tiny script to check the server load and restart Apache if the server encounters an extremely high load average, for instance more than 25. Below is an example of a server reaching a very high load average:

server~:# uptime
13:46:59 up 2 days, 18:54, 1 user, load average: 58.09, 59.08, 60.05

Sometimes a high load average is not a problem, as the server might have very powerful hardware. High load numbers are not always an indicator of serious problems. A 16-CPU dual-core (2.18 GHz) machine with 16GB of RAM could probably work normally with a load average like the one in the example. Anyhow, as most servers are not that powerful, such a high load average makes the machine struggle with its routine job.

In my specific case, one of our Debian Linux servers periodically reaches very high load numbers. When this happens, the Apache webserver is often incapable of serving its incoming requests and starts lagging for clients. The only workaround is to stop the Apache server for a couple of seconds (10 or 20 seconds) and then start it again once the load average has dropped below 3.

If this temporary fix is not applied in time, the server load increases exponentially until all the server services (ssh, ftp … whatever) stop responding normally to requests and the server completely hangs …

Often these server overloads occur at night when I'm not logged in on the server, and one such unexpected overload makes the server unreachable for hours.
To get around the sudden periodic load average increases, I've written a tiny bash script to monitor the server load average and initiate an Apache stop and start with a few seconds delay in between.

#!/bin/sh
# script to check server for extremely high load and restart Apache if the condition is matched
# take the integer part of the 1-minute load average
check=`cat /proc/loadavg | sed 's/\./ /' | awk '{print $1}'`
# define max load average that triggers the script
max_load='25'
# log file
high_load_log='/var/log/apache_high_load_restart.log';
# location of index.php to overwrite with temporary message
index_php_loc='/home/site/www/index.php';
# location of Apache init script
apache_init='/etc/init.d/apache2';
# maintenance message shown while Apache is being restarted
site_maintenance_msg="Site Maintenance in progress - We will be back online in a minute";
if [ $check -gt "$max_load" ]; then
# the 1-minute load average is above the threshold
cp -rpf $index_php_loc $index_php_loc.bak_ap
echo "$site_maintenance_msg" > $index_php_loc
sleep 15;
# re-read the load average to be sure the overload was not just a short spike
check=`cat /proc/loadavg | sed 's/\./ /' | awk '{print $1}'`
if [ $check -gt "$max_load" ]; then
$apache_init stop
sleep 5;
$apache_init restart
echo "$(date) : Apache Restart due to excessive load | $check |" >> $high_load_log;
cp -rpf $index_php_loc.bak_ap $index_php_loc
fi
fi

The idea of the script is partially based on a forum thread – Auto Restart Apache on High Load: http://www.webhostingtalk.com/showthread.php?t=971304
Here is a link to my restart_apache_on_high_load.sh script.

The script is written so that it makes two "if" condition check-ups, to be sure there really is a constantly high load average and not just a temporary 5-second spike. Once the first if matches, the script first tries to reduce the server load by overwriting the index.php / index.html of the website with one stating the server is undergoing maintenance operations.
Temporarily disabling the index page often reduces the load within 10 seconds, so the second if case is frequently not needed at all. Sometimes, however, the first "if" cannot decrease the load enough and the server load stays too high; then the script's second if comes into play, stops Apache completely via its init script, waits a few seconds, and launches the Apache server again.

The script also logs the load average encountered while the server was overloaded and the Apache webserver was restarted, so later I can check at what time the server overload occurred.
To make the script run periodically, I've scheduled it to launch every 5 minutes as a cron job:

# restart Apache if load is higher than 25
*/5 * * * * /usr/sbin/restart_apache_on_high_load.sh >/dev/null 2>&1

I also have another system running FreeBSD 7.2 which suffers the same overload problems as the Linux host.
Copying the auto-restart-Apache-on-high-load script to FreeBSD didn't work out of the box, so I rewrote a little chunk of the script to make it run on the FreeBSD host (see the sketch below). Hence, if you would like to auto restart Apache or any other service on a FreeBSD server, get my /usr/sbin/restart_apache_on_high_load_freebsd.sh script and set it on cron on your BSD.
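For reference, a minimal sketch of the part that has to differ on FreeBSD (assuming Apache 2.2 installed from ports, whose rc script is /usr/local/etc/rc.d/apache22; both paths are assumptions, check your system):

# FreeBSD has no /proc/loadavg by default; read the 1-minute load average via sysctl
# vm.loadavg prints something like: { 0.12 0.10 0.08 }
check=`/sbin/sysctl -n vm.loadavg | awk '{print $2}' | sed 's/\..*//'`
# Apache from ports is controlled via its rc.d script instead of /etc/init.d
apache_init='/usr/local/etc/rc.d/apache22';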

This script is just a temporary workaround; it's obvious the frequency of the overloads will rise with time, and we will need to buy new server hardware to solve the issue permanently. Anyway, until this happens the script does a great job 🙂

I'm aware there is also an alternative way to auto restart the Apache webserver on high server load, by using the monit utility for monitoring services on a Unix system. However, as I didn't want to bother running extra services in the background, I decided to use the script presented above instead.

Interesting to know is that an Apache module, mod_overload, exists which can be used for checking the load average. Using this module, once the load average exceeds a certain number, Apache can stop serving the current requests in its preforked processes. I've never tested it myself, so I don't know how usable it is; as of the time of writing it is at the early-stage version 0.2.2.
If someone has tried it and is happy with it on busy hosting servers, please share whether it is stable enough.

How to take multiple screenshots with scrot and ImageMagick import commands in terminal on GNU / Linux and FreeBSD

Friday, January 13th, 2012

scrot and import are two commands which can be used to take screenshots in a terminal on Linux and FreeBSD:

To use the scrot cmd to take screenshots on Ubuntu and Debian, the scrot package has to be installed:

noah:~# apt-get install scrot
...

scrot should also be available in the main repositories of most other Linux distributions; I'll be glad to hear if someone has used it on Fedora, SUSE etc.

On FreeBSD there is a port called scrot; to install it on FreeBSD:

freebsd# cd /usr/ports/graphics/scrot
freebsd# make install clean
...

scrot has plenty of nice arguments one can use when taking a screenshot. Maybe the handiest one in my view is setting a preliminary delay before the screenshot is taken.

To take a screenshot with, let's say, a 5 second delay (-d 5), along with a thumbnail at 20% of the original size (-t 20):

hipo@noah:~/Desktop$ scrot -t 20 -d 5

Screenshot scrot my debian Linux gnome-termina

To name the screenshot with the year, month and day, followed by the screen resolution, with scrot:

hipo@noah:~$ scrot '%Y-%m-%d_$wx$h.png'

Another way to take a screenshot of the screen from the command line is by using ImageMagick's import image manipulation tool.
To take a screenshot of the current screen via terminal using import, type in xterm, gnome-terminal or Gnome's Run Application (ALT+F2):

hipo@noah:~$ import -window root ScreenShot.png

To make the import command save the taken screenshot named with a (minute_hour_day_month_year) time format:

hipo@noah:~$ import -window root screenshot-$(date +%M_%k_%d_%m_%Y|sed -e 's/^ *//').png

Taking a delayed screenshot is also possible in The GIMP through the menus File -> Create -> Screenshot.

GIMP Screenshot 15 seconds delay GIMP window screenshot

Now here is an interesting question: what if I would like to take periodic screenshots of what I do on my desktop, for example to grab random scenes from a movie I watch with totem or vlc?

This task is quite easily achievable with a little bash shell script I wrote:

#!/bin/bash
screenshot_dir='Screenshots';
seconds='60';
# create the output directory if it does not exist yet
if [ ! -d "$screenshot_dir" ]; then
mkdir $screenshot_dir;
fi
# take a screenshot of the whole root window every $seconds seconds
while [ 1 ]; do
sleep $seconds;
(import -window root $screenshot_dir/screenshot-$(date +%M_%k_%d_%m_%Y|sed -e 's/^ *//').png) &
done

This script will automatically take a screenshot every 60 seconds (1 minute) and store it in the Screenshots/ directory.
You can also download my take_screenshot_every_60_secs_import.sh here.

To use take_screenshot_every_60_secs_import.sh, just launch the script inside xterm or gnome-terminal and then simply use your computer as you normally would.
The script will take snapshots every minute and store all taken screenshots in the Screenshots dir.

If you prefer to use scrot to take the screenshots automatically every, let's say, 5 minutes, you can use a script like:

#!/bin/bash
screenshot_dir='Screenshots';
# 300 secs (5 mins)
seconds='300';
if [ ! -d "$screenshot_dir" ]; then
mkdir $screenshot_dir;
fi
while [ 1 ]; do
sleep $seconds;
(scrot $screenshot_dir/'%Y-%m-%d_$wx$h.png') &
done

You can fetch take_screenshot_every_60_secs_scrot.sh here

The script using scrot is better in terms of efficiency; the system load scrot puts on your machine will be lower.
Some of these scripts will be handy if you need screenshots of movies, programs or your favourite free software games.
Hope this is educative to someone 😉