Posts Tagged ‘something’

How to keep your Linux server Healthy for Years: Hard learned lessons

Friday, November 28th, 2025

how-to-keep-your-linux-servers-healthy-every-year-doctor_tux

I’ve been running Linux servers long enough to watch hardware die, kernels panic, filesystems fill up at midnight hours, and network cards slowly burn out like old light bulbs.

Over time, you learn that keeping a server alive is less about “perfect architecture” and more about steady discipline – the small habits built to manage the machines, helps prevent big disasters.

Here are some practical, battle-tested lessons that keep my boxes running for years with minimal downtime. Most of them were learned the hard way.

1. Monitor Before You Fix – and Fix Before It Breaks

Most Linux disasters come from things we should have noticed earlier. The lack of monitoring, there is modern day saying that should become your favourite if you are a sysadmin or Dev Ops engineer.

"Monitoring everything !"

  • The disk that was at 89% yesterday will be at 100% tonight.
  • The log file that grew by 500 MB last week will explode this week.
  • The swap usage creeping from 1% → 5% → 20% means your next heavy task will choke.
  • The unseen failing BIOS CMOS battery
  • The RAID disks degradation etc.

You don’t need enterprise monitoring to prevent this. And even simple tools like monit or a simple zabbix-agent -> zabbix-server or any other simplistic scripted  monitoring gives you a basic issues pre-warning.

Even a simple cronjob shell one liner can save you hours of further sh!t :

#!/bin/bash

df -h / | awk 'NR==2 { if($5+0 > 85) print "Disk Alert: / is at " $5 }' \
| mail -s "Disk Warning on $(hostname)" admin@example.com

2. Treat /etc directory as Sacred – Treat It Like an expensive gem

Every sysadmin eventually faces the nightmare of a broken config overwritten by a package update or a hasty command at 2 AM.

To avoid crying later, archive /etc automatically:

# tar czf /root/etc-$(date +%Y-%m-%d).tar.gz /etc


If you prefer the backup to be more sophisticated you can use my clone of the dirs_backup.sh (an old script I wrote for easifying backup of specific directories on the filesystem ) the etc_backup.sh you can get here.
Run it weekly via cron.
This little trick has saved me more times than I can count — especially when migrating between Debian releases or recovering from accidental edits.

3. Automate everything what you have to repeatevely do

If you find yourself doing something manually more than twice, script it and forget it.

Examples:

  • rotating logs for misbehaving apps
  • restarting services that occasionally get “stuck”
  • syncing backups between machines
  • cleaning temp directories

Here’s a small example I still use today:

#!/bin/bash

# Kill zombie PHP-FPM children that keep leaking memory

ps aux | grep php-fpm | awk '{if($6 > 300000) print $2}' | xargs -r kill -9

Dirty way to get rid of misfunctioning php-fpm ?
Yes. But it works.

4. Backups Don’t Exist Unless You Test Them

It’s easy to feel proud when you write a backup script.
It’s harder – and far more important – to test the restore.

Once a month  or at least once in a few months, try restore a random backup to a dummy VM.
Sometimes backup might fails, or you might get something different from what you originally expected and by doing so
you can guarantee you will not cry later helplessly.

A broken backup doesn’t fail quietly – it fails on the day you need it most.

5. Don’t Ignore Hardware – It Ages like Everything Else

Linux might run forever, but hardware doesn’t.

Signs of impending doom:

  • dmesg spam with I/O errors
  • slow SSD response
  • increasing SMART reallocated sectors
  • random freezes without logs
  • sudden network flakiness

Run this monthly:

6. Document Everything (Future You Will Thank Past You)

There are moments when you ask yourself:

“Why did I configure this machine like this?”

If you don’t document your decisions, you’ll have no idea one year later.

A simple markdown file inside /root/notes.txt or /root/README.md is enough.

Document:

  • installed software
  • custom scripts
  • non-standard configs
  • firewall rules
  • weird hacks you probably forgot already

This turns chaos into something you can actually maintain.

7. Keep Things Simple – Complexity Is the Enemy of Uptime

The longer I work with servers, the more I strip away:

  • fewer moving parts
  • fewer services
  • fewer custom patches
  • fewer “temporary” hacks that become permanent

A simple system is a reliable system.
A complex one dies at the worst possible moment.

8. Accept That Failure Will Still Happen

No matter how careful you are, servers will surely:

  • crash
  • corrupt filesystems
  • lose network connectivity
  • inexplicably freeze
  • reboot after a kernel panic

Don’t aim for perfection.Aim for resilience.

If you can restore the machine in under an hour, you're winning and in the white.

Final Thoughts

Linux is powerful – but it rewards those who treat it with respect and perseverance.
Over many years, I’ve realized that maintaining servers is less about brilliance and more about humble, consistent care and hard work persistence.

I hope this article helps some sysamdmin to rethink and rebundle servers maintenance strategy in a way that will avoid a server meltdown at  night hours like 3 AM.

Cheers ! 

 

How to configure multiple haproxies and frontends to log in separate log files via rsyslog

Monday, September 5th, 2022

log-multiple-haproxy-servers-to-separate-files-log-haproxy-froentend-to-separate-file-haproxy-rsyslog-Logging-diagram
In my last article How to create multiple haproxy instance separate processes for different configuration listeners,  I've shortly explained how to create a multiple instances of haproxies by cloning the systemd default haproxy.service and the haproxy.cfg to haproxyX.cfg.
But what if you need also to configure a separate logging for both haproxy.service and haproxy-customname.service instances how this can be achieved?

The simplest way is to use some system local handler staring from local0 to local6, As local 1,2,3 are usually used by system services a good local handler to start off would be at least 4.
Lets say we already have the 2 running haproxies, e.g.:

[root@haproxy2:/usr/lib/systemd/system ]# ps -ef|grep -i hapro|grep -v grep
root      128464       1  0 Aug11 ?        00:01:19 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock
haproxy   128466  128464  0 Aug11 ?        00:49:29 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock

root      346637       1  0 13:15 ?        00:00:00 /usr/sbin/haproxy-customname-wrapper -Ws -f /etc/haproxy/haproxy_customname_prod.cfg -p /run/haproxy_customname_prod.pid -S /run/haproxy-customname-master.sock
haproxy   346639  346637  0 13:15 ?        00:00:00 /usr/sbin/haproxy-customname-wrapper -Ws -f /etc/haproxy/haproxy_customname_prod.cfg -p /run/haproxy_customname_prod.pid -S /run/haproxy-customname-master.sock


1. Configure local messaging handlers to work via /dev/log inside both haproxy instance config files
 

To congigure the separte logging we need to have in /etc/haproxy/haproxy.cfg and in /etc/haproxy/haproxy_customname_prod.cfg the respective handlers.

To log in separate files you should already configured in /etc/haproxy/haproxy.cfg something like:

 

global
        stats socket /var/run/haproxy/haproxy.sock mode 0600 level admin #Creates Unix-Like socket to fetch stats
        log /dev/log    local0
        log /dev/log    local1 notice

#       nbproc 1
#       nbthread 2
#       cpu-map auto:1/1-2 0-1
        nbproc          1
        nbthread 2
        cpu-map         1 0
        cpu-map         2 1
        chroot /var/lib/haproxy
        user haproxy
        group haproxy
        daemon
        maxconn 99999

defaults
        log     global
        mode    tcp


        timeout connect 5000
        timeout connect 30s
        timeout server 10s

    timeout queue 5s
    timeout tunnel 2m
    timeout client-fin 1s
    timeout server-fin 1s

    option forwardfor
        maxconn 3000
    retries                 15

frontend http-in
        mode tcp

        option tcplog
        log global

 

        option logasap
        option forwardfor
        bind 0.0.0.0:80

default_backend webservers_http
backend webservers_http
    fullconn 20000
        balance source
stick match src
    stick-table type ip size 200k expire 30m

        server server-1 192.168.1.50:80 check send-proxy weight 255 backup
        server server-2 192.168.1.54:80 check send-proxy weight 254
        server server-3 192.168.0.219:80 check send-proxy weight 252 backup
        server server-4 192.168.0.210:80 check send-proxy weight 253 backup
        server server-5 192.168.0.5:80 maxconn 3000 check send-proxy weight 251 backup

For the second /etc/haproxy/haproxy_customname_prod.cfg the logging configuration should be similar to:
 

global
        stats socket /var/run/haproxy/haproxycustname.sock mode 0600 level admin #Creates Unix-Like socket to fetch stats
        log /dev/log    local5
        log /dev/log    local5 notice

#       nbproc 1
#       nbthread 2
#       cpu-map auto:1/1-2 0-1
        nbproc          1
        nbthread 2
        cpu-map         1 0
        cpu-map         2 1
        chroot /var/lib/haproxy
        user haproxy
        group haproxy
        daemon
        maxconn 99999

defaults
        log     global
        mode    tcp

 

2. Configure separate haproxy Frontend logging via local5 inside haproxy.cfg
 

As a minimum you need a configuration for frontend like:

 

frontend http-in
        mode tcp

        option tcplog
        log /dev/log    local5 debug
…..
….

..
.

Of course the mode tcp in my case is conditional you might be using mode http etc. 


3. Optionally but (preferrably) make local5 / local6 handlers to work via rsyslogs UDP imudp protocol

 

In this example /dev/log is straightly read by haproxy instead of sending the messages first to rsyslog, this is a good thing in case if you have doubts that rsyslog might stop working and respectively you might end up with no logging, however if you prefer to use instead rsyslog which most of people usually do you will have instead for /etc/haproxy/haproxy.cfg to use config:

global
    log          127.0.0.1 local6 debug

defaults
        log     global
        mode    tcp

And for /etc/haproxy_customname_prod.cfg config like:

global
    log          127.0.0.1 local5 debug

defaults
        log     global
        mode    tcp

If you're about to send the haproxy logs directly via rsyslog, it should have enabled in /etc/rsyslog.conf the imudp module if you're not going to use directly /dev/log

# provides UDP syslog reception
module(load="imudp")
input(type="imudp" port="514")

 

4. Prepare first and second log file and custom frontend output file and set right permissions
 

Assumably you already have /var/log/haproxy.log and this will be the initial haproxy log if you don't want to change it, normally it is installed on haproxy package install time on Linux and should have some permissions like following:

root@haproxy2:/etc/rsyslog.d# ls -al /var/log/haproxy.log
-rw-r–r– 1 haproxy haproxy 6681522  1 сеп 16:05 /var/log/haproxy.log


To create the second config with exact permissions like haproxy.log run:

root@haproxy2:/etc/rsyslog.d# touch /var/log/haproxy_customname.log
root@haproxy2:/etc/rsyslog.d# chown haproxy:haproxy /var/log/haproxy_customname.log

Create the haproxy_custom_frontend.log file that will only log output of exact frontend or match string from the logs
 

root@haproxy2:/etc/rsyslog.d# touch  /var/log/haproxy_custom_frontend.log
root@haproxy2:/etc/rsyslog.d# chown haproxy:haproxy  /var/log/haproxy_custom_frontend.log


5. Create the rsyslog config for haproxy.service to log via local6 to /var/log/haproxy.log
 

root@haproxy2:/etc/rsyslog.d# cat 49-haproxy.conf
# Create an additional socket in haproxy's chroot in order to allow logging via
# /dev/log to chroot'ed HAProxy processes
$AddUnixListenSocket /var/lib/haproxy/dev/log

# Send HAProxy messages to a dedicated logfile
:programname, startswith, "haproxy" {
  /var/log/haproxy.log
  stop
}

 

Above configs will make anything returned with string haproxy (e.g. proccess /usr/sbin/haproxy) to /dev/log to be written inside /var/log/haproxy.log and trigger a stop (by the way the the stop command works exactly as the tilda '~' discard one, except in some newer versions of haproxy the ~ is no now obsolete and you need to use stop instead (bear in mind that ~ even though obsolete proved to be working for me whether stop not ! but come on this is no strange this is linux mess), for example if you run latest debian Linux 11 as of September 2022 haproxy with package 2.2.9-2+deb11u3.
 

6. Create configuration for rsyslog to log from single Frontend outputting local2 to /var/log/haproxy_customname.log
 

root@haproxy2:/etc/rsyslog.d# cat 48-haproxy.conf
# Create an additional socket in haproxy's chroot in order to allow logging via
# /dev/log to chroot'ed HAProxy processes
$AddUnixListenSocket /var/lib/haproxy/dev/log

# Send HAProxy messages to a dedicated logfile
#:programname, startswith, "haproxy" {
#  /var/log/haproxy.log
#  stop
#}
# GGE/DPA 2022/08/02: HAProxy logs to local2, save the messages
local5.*                                                /var/log/haproxy_customname.log
 


You might also explicitly define the binary that will providing the logs inside the 48-haproxy.conf as we have a separate /usr/sbin/haproxy-customname-wrapper in that way you can log the output from the haproxy instance only based
on its binary command and you can omit writting to local5 to log via it something else 🙂

root@haproxy2:/etc/rsyslog.d# cat 48-haproxy.conf
# Create an additional socket in haproxy's chroot in order to allow logging via
# /dev/log to chroot'ed HAProxy processes
$AddUnixListenSocket /var/lib/haproxy/dev/log

# Send HAProxy messages to a dedicated logfile
#:programname, startswith, "haproxy" {
#  /var/log/haproxy.log
#  stop
#}
# GGE/DPA 2022/08/02: HAProxy logs to local2, save the messages

:programname, startswith, "haproxy-customname-wrapper " {
 
/var/log/haproxy_customname.log
  stop
}

 

7. Create the log file to log the custom frontend of your preference e.g. /var/log/haproxy_custom_frontend.log under local5 /prepare rsyslog config for
 

root@haproxy2:/etc/rsyslog.d# cat 47-haproxy-custom-frontend.conf
$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
#2022/02/02: HAProxy logs to local6, save the messages
local4.*                                                /var/log/haproxy_custom_frontend.log
:msg, contains, "https-in" ~

The 'https-in' is my frontend inside /etc/haproxy/haproxy.cfg it returns the name of it every time in /var/log/haproxy.log therefore I will log the frontend to local5 and to prevent double logging inside /var/log/haproxy.log of connections incoming towards the same frontend inside /var/log/haproxy.log, I have the tilda symbol '~' which instructs rsyslog to discard any message coming to rsyslog with "https-in" string in, immediately after the same frontend as configured inside /etc/haproxy/haproxy.cfg will output the frontend operations inside local5.


!!! Note that for rsyslog it is very important to have the right order of configurations, the configuration order is being considered based on the file numbering. !!!
 

Hence notice that my filter file number 47_* preceeds the other 2 configured rsyslog configs.
 

root@haproxy2:/etc/rsyslog.d# ls -1
47-haproxy-custom-frontend.conf
48-haproxy.conf
49-haproxy.conf

This will make 47-haproxy-custom-frontend.conf to be read and processed first 48-haproxy.conf processed second and 49-haproxy.conf processed third.


8. Reload rsyslog and haproxy and test

 

root@haproxy2: ~# systemctl restart rsyslog
root@haproxy2: ~# systemctl restart haproxy
root@haproxy2: ~# systemctl status rsyslog

● rsyslog.service – System Logging Service
     Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2022-09-01 17:34:51 EEST; 1s ago
TriggeredBy: ● syslog.socket
       Docs: man:rsyslogd(8)
             man:rsyslog.conf(5)
             https://www.rsyslog.com/doc/
   Main PID: 372726 (rsyslogd)
      Tasks: 6 (limit: 4654)
     Memory: 980.0K
        CPU: 8ms
     CGroup: /system.slice/rsyslog.service
             └─372726 /usr/sbin/rsyslogd -n -iNONE

сеп 01 17:34:51 haproxy2 systemd[1]: Stopped System Logging Service.
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: warning: ~ action is deprecated, consider using the 'stop' statement instead [v8.210>
сеп 01 17:34:51 haproxy2 systemd[1]: Starting System Logging Service…
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: [198B blob data]
сеп 01 17:34:51 haproxy2 systemd[1]: Started System Logging Service.
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: [198B blob data]
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: [198B blob data]
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: [198B blob data]
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: imuxsock: Acquired UNIX socket '/run/systemd/journal/syslog' (fd 3) from systemd.  [>
сеп 01 17:34:51 haproxy2 rsyslogd[372726]: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="372726" x-info="https://www.

Do some testing with some tool like curl / wget / lynx / elinks etc. on each of the configured haproxy listeners and frontends and check whether everything ends up in the correct log files.
That's all folks enjoy ! 🙂
 

Install Zabbix Agent client on CentOS 9 Stream Linux, Disable Selinux and Firewalld on CentOS9 to make zabbix-agentd send data to server

Thursday, April 14th, 2022

https://pc-freak.net/images/zabbix_agent_active_passive-zabbix-agent-centos-9-install-howto

Installing Zabbix is usually a trivial stuff, you either use the embedded distribution built packages if such are available this is for example defetch the right zabbix release repository  that configures the Zabbix official repo in the system, configure the Zabbix server or Proxy if such is used inside /etc/zabbix/zabbix_agentd.conf and start the client, i.e. I expected that it will be a simple and straight forward also on the freshly installed CentOS 9 Linux cause placing a zabbix-agent monitroing is a trivial stuff however installing came to error:

Key import failed (code 2). Failing package is: zabbix-agent-6.0.3-1.el8.x86_64

 

This is what I've done

1. Download and install zabbix-release-6.0-1.el8.noarch.rpm directly from zabbix

I've followed the official documentation from zabbix.com and ran:
 

[root@centos9 /root ]# rpm -Uvh https://repo.zabbix.com/zabbix/6.0/rhel/8/x86_64/zabbix-release-6.0-1.el8.noarch.rpm


2. Install  the zabbix-agent RPM package from the repositry

[root@centos9 rpm-gpg]# yum install zabbix-agent -y
Last metadata expiration check: 0:02:46 ago on Tue 12 Apr 2022 08:49:34 AM EDT.
Dependencies resolved.
=============================================
 Package                               Architecture                Version                              Repository                      Size
=============================================
Installing:
 zabbix-agent                          x86_64                      6.0.3-1.el8                          zabbix                         526 k
Installing dependencies:
 compat-openssl11                      x86_64                      1:1.1.1k-3.el9                       appstream                      1.5 M
 openldap-compat                       x86_64                      2.4.59-4.el9                         baseos                          14 k

Transaction Summary
==============================================
Install  3 PackagesTotal size: 2.0 M
Installed size: 6.1 M
Downloading Packages:
[SKIPPED] openldap-compat-2.4.59-4.el9.x86_64.rpm: Already downloaded
[SKIPPED] compat-openssl11-1.1.1k-3.el9.x86_64.rpm: Already downloaded
[SKIPPED] zabbix-agent-6.0.3-1.el8.x86_64.rpm: Already downloaded
Zabbix Official Repository – x86_64                                                                          1.6 MB/s | 1.7 kB     00:00
Importing GPG key 0xA14FE591:
 Userid     : "Zabbix LLC <packager@zabbix.com>"
 Fingerprint: A184 8F53 52D0 22B9 471D 83D0 082A B56B A14F E591
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
Key import failed (code 2). Failing package is: zabbix-agent-6.0.3-1.el8.x86_64
 GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by e
xecuting 'yum clean packages'.
Error: GPG check FAILED


3. Work around to skip GPG to install zabbix-agent 6 on CentOS 9

With Linux everything becomes more and more of a hack …
The logical thing to was to first,  check and it assure that the missing RPM GPG key is at place

[root@centos9 rpm-gpg]# ls -al  /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
-rw-r–r– 1 root root 1719 Feb 11 16:29 /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591

Strangely the key was in place.

Hence to have the key loaded I've tried to import the gpg key manually with gpg command:

[root@centos9 rpm-gpg]# gpg –import /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591


And attempted install again zabbix-agent once again:
 

[root@centos9 rpm-gpg]# yum install zabbix-agent -y
Last metadata expiration check: 0:02:46 ago on Tue 12 Apr 2022 08:49:34 AM EDT.
Dependencies resolved.
==============================================
 Package                               Architecture                Version                              Repository                      Size
==============================================
Installing:
 zabbix-agent                          x86_64                      6.0.3-1.el8                          zabbix                         526 k
Installing dependencies:
 compat-openssl11                      x86_64                      1:1.1.1k-3.el9                       appstream                      1.5 M
 openldap-compat                       x86_64                      2.4.59-4.el9                         baseos                          14 k

Transaction Summary
==============================================
Install  3 Packages

Total size: 2.0 M
Installed size: 6.1 M
Downloading Packages:
[SKIPPED] openldap-compat-2.4.59-4.el9.x86_64.rpm: Already downloaded
[SKIPPED] compat-openssl11-1.1.1k-3.el9.x86_64.rpm: Already downloaded
[SKIPPED] zabbix-agent-6.0.3-1.el8.x86_64.rpm: Already downloaded
Zabbix Official Repository – x86_64                                                                          1.6 MB/s | 1.7 kB     00:00
Importing GPG key 0xA14FE591:
 Userid     : "Zabbix LLC <packager@zabbix.com>"
 Fingerprint: A184 8F53 52D0 22B9 471D 83D0 082A B56B A14F E591
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
Key import failed (code 2). Failing package is: zabbix-agent-6.0.3-1.el8.x86_64
 GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'yum clean packages'.
Error: GPG check FAILED


Unfortunately that was not a go, so totally pissed off I've disabled the gpgcheck for packages completely as a very raw bad and unrecommended work-around to eventually install the zabbix-agentd like that.

Usually the RPM gpg key failures check on RPM packages could be could be workaround with in dnf, so I've tried that one without success.

[root@centos9 rpm-gpg]# dnf update –nogpgcheck
Total                                                                                                        181 kB/s | 526 kB     00:02
Zabbix Official Repository – x86_64                                                                          1.6 MB/s | 1.7 kB     00:00
Importing GPG key 0xA14FE591:
 Userid     : "Zabbix LLC <packager@zabbix.com>"
 Fingerprint: A184 8F53 52D0 22B9 471D 83D0 082A B56B A14F E591
 From       : /etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
Is this ok [y/N]: y
Key import failed (code 2). Failing package is: zabbix-agent-6.0.3-1.el8.x86_64
 GPG Keys are configured as: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-ZABBIX-A14FE591
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'dnf clean packages'.
Error: GPG check FAILED

Further tried to use the –nogpgpcheck 
which according to its man page:


–nogpgpcheck 
Skip checking GPG signatures on packages (if RPM policy allows).


In yum the nogpgcheck option according to its man yum does exactly the same thing


[root@centos9 rpm-gpg]# yum install zabbix-agent –nogpgcheck -y
 

Dependencies resolved.
===============================================
 Package                             Architecture                  Version                               Repository                     Size
===============================================
Installing:
 zabbix-agent                        x86_64                        6.0.3-1.el8                           zabbix                        526 k

Transaction Summary
===============================================

Total size: 526 k
Installed size: 2.3 M
Is this ok [y/N]: y
Downloading Packages:

Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                                                                                     1/1
  Running scriptlet: zabbix-agent-6.0.3-1.el8.x86_64                                                                                     1/2
  Reinstalling     : zabbix-agent-6.0.3-1.el8.x86_64                                                                                     1/2
  Running scriptlet: zabbix-agent-6.0.3-1.el8.x86_64                                                                                     1/2
  Running scriptlet: zabbix-agent-6.0.3-1.el8.x86_64                                                                                     2/2
  Cleanup          : zabbix-agent-6.0.3-1.el8.x86_64                                                                                     2/2
  Running scriptlet: zabbix-agent-6.0.3-1.el8.x86_64                                                                                     2/2
  Verifying        : zabbix-agent-6.0.3-1.el8.x86_64                                                                                     1/2
  Verifying        : zabbix-agent-6.0.3-1.el8.x86_64                                                                                     2/2

Installed:
  zabbix-agent-6.0.3-1.el8.x86_64

Complete!
[root@centos9 ~]#

Voila! zabbix-agentd on CentOS 9 Install succeeded!

Yes I know disabling a GPG check is not really secure and seems to be an ugly solution but since I'm cut of time in the moment and it is just for experimental install of zabbix-agent on CentOS
plus we already trusted the zabbix package repository anyways, I guess it doesn't much matter.

4. Configure Zabbix-agent on the machine

Once you choose how the zabbix-agent should sent the data to the zabbix-server (e.g. Active or Passive) mode the The minimum set of configuration you should
have at place should be something like mine:

[root@centos9 ~]# grep -v '\#' /etc/zabbix/zabbix_agentd.conf | sed /^$/d
PidFile=/var/run/zabbix/zabbix_agentd.pid
LogFile=/var/log/zabbix/zabbix_agentd.log
LogFileSize=0
Server=192.168.1.70,127.0.0.1
ServerActive=192.168.1.70,127.0.0.1
Hostname=centos9
Include=/etc/zabbix/zabbix_agentd.d/*.conf

5. Start and Enable zabbix-agent client

To have it up and running

[root@centos9 ~]# systemct start zabbix-agent
[root@centos9 ~]# systemctl enable zabbix-agent

6. Disable SELinux to prevent it interfere with zabbix-agentd 

Other amazement was that even though I've now had configured Active check and a Server and correct configuration the Zabbix-Server could not reach the zabbix-agent for some weird reason.
I thought that it might be selinux and checked it and seems by default in the fresh installed CentOS 9 Linux selinux is already automatically set to enabled.

After stopping it i made sure, SeLinux would block for security reasons client connectivity to the zabbix-server until you either allow zabbix exception in SeLinux or until completely disable it.
 

[root@centos9 ~]# sestatus

SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   enforcing
Mode from config file:          enforcing
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      31

To temporarily change the mode from its default targeted to permissive mode 

[root@centos9 ~]# setenforce 0

[root@centos9 ~]# sestatus

SELinux status:                 enabled
SELinuxfs mount:                /sys/fs/selinux
SELinux root directory:         /etc/selinux
Loaded policy name:             targeted
Current mode:                   permissive
Mode from config file:          permissive
Policy MLS status:              enabled
Policy deny_unknown status:     allowed
Memory protection checking:     actual (secure)
Max kernel policy version:      31


That would work for current session but won't take affect on next reboot, thus it is much better to disable selinux on next boot:

[root@centos9 ~]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing – SELinux security policy is enforced.
#     permissive – SELinux prints warnings instead of enforcing.
#     disabled – No SELinux policy is loaded.
SELINUX=permissive
# SELINUXTYPE= can take one of these three values:
#     targeted – Targeted processes are protected,
#     minimum – Modification of targeted policy. Only selected processes are protected. 
#     mls – Multi Level Security protection.
SELINUXTYPE=targeted

 

To disable selinux change:

SELINUXTYPE=disabled

[root@centos9 ~]# grep -v \# /etc/selinux/config

SELINUX=disabled
SELINUXTYPE=targeted


To make the OS disable selinux and test it is disabled you will have to reboot 

[root@centos9 ~]# reboot


Check its status again, it should be:

[root@centos9 ~]# sestatus
SELinux status:                 disabled


7. Enable zabbix-agent through firewall or disable firewalld service completely

By default CentOS 9 has the firewalld also enabled and either you have to enable zabbix to communicate to the remote server host.

To enable access for from and to zabbix-agentd in both Active / Passive mode:

#firewall settings:
[root@centos9 rpm-gpg]# firewall-cmd –permanent –add-port=10050/tcp
[root@centos9 rpm-gpg]# firewall-cmd –permanent –add-port=10051/tcp
[root@centos9 rpm-gpg]# firewall-cmd –reload
[root@centos9 rpm-gpg]# systemctl restart firewalld
[root@centos9 rpm-gpg]# systemctl restart zabbix-agent


If the machine is in a local DMZ-ed network with tightly configured firewall router in front of it, you could completely disable firewalld.

[root@centos9 rpm-gpg]# systemctl stop firewalld
[root@centos9 rpm-gpg]# systemctl disable firewalld
Removed /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.

 

Next login to Zabbix-server web interface with administrator and from Configuration -> Hosts -> Create the centos9 hostname and add it a template of choice. The data from the added machine should shortly appear after another zabbix restart:

[root@centos9 rpm-gpg]#  systemctl restart zabbix-agentd


8. Tracking other oddities with the zabbix-agent through log

If anyways still zabbix have issues connectin to remote node, increase the debug log level section
 

[root@centos9 rpm-gpg]# vim /etc/zabbix/zabbix_agentd.conf
DebugLevel 5

### Option: DebugLevel
#       Specifies debug level:
#       0 – basic information about starting and stopping of Zabbix processes
#       1 – critical information
#       2 – error information
#       3 – warnings
#       4 – for debugging (produces lots of information)
#       5 – extended debugging (produces even more information)
#
# Mandatory: no
# Range: 0-5
# Default:
# DebugLevel=3

[root@centos9 rpm-gpg]# systemctl restart zabbix-agent

Keep in mind that debugging will be too verbose, so once you make the machine being seen in zabbix, don't forget to comment out the line and restart agent to turn it off.

9. Testing zabbix-agent, How to send an alert to specific item key

Usually when writting userparameter scripts, data collected from scripts is being sent to zabbix serveria via Item keys.
Thus one way to check the zabbix-agent -> zabbix server data send works fine is to send some simultaneous data via a key
Once zabbix-agent is configured on the machine 

In this case we will use something like ApplicationSupport-Item as an item.
 

[root@centos9 rpm-gpg]# /usr/bin/zabbix_sender -c "/etc/zabbix/zabbix_agentd.conf" -k "ApplicationSupport-Item" -o "here is the message"

Assuming you have created the newly prepared zabbix-agent host into Zabbix Server, you should be shortly able to see the data come in Latest data.

Create Linux High Availability Load Balancer Cluster with Keepalived and Haproxy on Linux

Tuesday, March 15th, 2022

keepalived-logo-linux

Configuring a Linux HA (High Availibiltiy) for an Application with Haproxy is already used across many Websites on the Internet and serious corporations that has a crucial infrastructure has long time
adopted and used keepalived to provide High Availability Application level Clustering.
Usually companies choose to use HA Clusters with Haproxy with Pacemaker and Corosync cluster tools.
However one common used alternative solution if you don't have the oportunity to bring up a High availability cluster with Pacemaker / Corosync / pcs (Pacemaker Configuration System) due to fact machines you need to configure the cluster on are not Physical but VMWare Virtual Machines which couldn't not have configured a separate Admin Lans and Heartbeat Lan as we usually do on a Pacemaker Cluster due to the fact the 5 Ethernet LAN Card Interfaces of the VMWare Hypervisor hosts are configured as a BOND (e.g. all the incoming traffic to the VMWare vSphere  HV is received on one Virtual Bond interface).

I assume you have 2 separate vSphere Hypervisor Physical Machines in separate Racks and separate switches hosting the two VMs.
For the article, I'll call the two brand new brought Virtual Machines with some installation automation software such as Terraform or Ansible – vm-server1 and vm-server2 which would have configured some recent version of Linux.

In that scenario to have a High Avaiability for the VMs on Application level and assure at least one of the two is available at a time if one gets broken due toe malfunction of the HV, a Network connectivity issue, or because the VM OS has crashed.
Then one relatively easily solution is to use keepalived and configurea single High Availability Virtual IP (VIP) Address, i.e. 10.10.10.1, which would float among two VMs using keepalived so at a time at least one of the two VMs would be reachable on the Network.

haproxy_keepalived-vip-ip-diagram-linux

Having a VIP IP is quite a common solution in corporate world, as it makes it pretty easy to add F5 Load Balancer in front of the keepalived cluster setup to have a 3 Level of security isolation, which usually consists of:

1. Physical (access to the hardware or Virtualization hosts)
2. System Access (The mechanism to access the system login credetials users / passes, proxies, entry servers leading to DMZ-ed network)
3. Application Level (access to different programs behind L2 and data based on the specific identity of the individual user,
special Secondary UserID,  Factor authentication, biometrics etc.)

 

1. Install keepalived and haproxy on machines

Depending on the type of Linux OS:

On both machines
 

[root@server1:~]# yum install -y keepalived haproxy

If you have to install keepalived / haproxy on Debian / Ubuntu and other Deb based Linux distros

[root@server1:~]# apt install keepalived haproxy –yes

2. Configure haproxy (haproxy.cfg) on both server1 and server2

 

Create some /etc/haproxy/haproxy.cfg configuration

 

[root@server1:~]vim /etc/haproxy/haproxy.cfg

#———————————————————————
# Global settings
#———————————————————————
global
    log          127.0.0.1 local6 debug
    chroot       /var/lib/haproxy
    pidfile      /run/haproxy.pid
    stats socket /var/lib/haproxy/haproxy.sock mode 0600 level admin 
    maxconn      4000
    user         haproxy
    group        haproxy
    daemon
    #debug
    #quiet

#———————————————————————
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#———————————————————————
defaults
    mode        tcp
    log         global
#    option      dontlognull
#    option      httpclose
#    option      httplog
#    option      forwardfor
    option      redispatch
    option      log-health-checks
    timeout connect 10000 # default 10 second time out if a backend is not found
    timeout client 300000
    timeout server 300000
    maxconn     60000
    retries     3

#———————————————————————
# round robin balancing between the various backends
#———————————————————————

listen FRONTEND_APPNAME1
        bind 10.10.10.1:15000
        mode tcp
        option tcplog
#        #log global
        log-format [%t]\ %ci:%cp\ %bi:%bp\ %b/%s:%sp\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq
        balance roundrobin
        timeout client 350000
        timeout server 350000
        timeout connect 35000
        server app-server1 10.10.10.55:30000 weight 1 check port 68888
        server app-server2 10.10.10.55:30000 weight 2 check port 68888

listen FRONTEND_APPNAME2
        bind 10.10.10.1:15000
        mode tcp
        option tcplog
        #log global
        log-format [%t]\ %ci:%cp\ %bi:%bp\ %b/%s:%sp\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq
        balance roundrobin
        timeout client 350000
        timeout server 350000
        timeout connect 35000
        server app-server1 10.10.10.55:30000 weight 5
        server app-server2 10.10.10.55:30000 weight 5 

 

You can get a copy of above haproxy.cfg configuration here.
Once configured roll it on.

[root@server1:~]#  systemctl start haproxy
 
[root@server1:~]# ps -ef|grep -i hapro
root      285047       1  0 Mar07 ?        00:00:00 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy   285050  285047  0 Mar07 ?        00:00:26 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid

Bring up the haproxy also on server2 machine, by placing same configuration and starting up the proxy.
 

[root@server1:~]vim /etc/haproxy/haproxy.cfg


 

3. Configure keepalived on both servers

We'll be configuring 2 nodes with keepalived even though if necessery this can be easily extended and you can add more nodes.
First we make a copy of the original or existing server configuration keepalived.conf (just in case we need it later on or if you already had something other configured manually by someone – that could be so on inherited servers by other sysadmin)
 

[root@server1:~]# mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig
[root@server2:~]# mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig

a. Configure keepalived to serve as a MASTER Node

 

[root@server1:~]# vim /etc/keepalived/keepalived.conf

Master Node
global_defs {
  router_id server1-fqdn # The hostname of this host.
  
  enable_script_security
  # Synchro of the state of the connections between the LBs on the eth0 interface
   lvs_sync_daemon eth0
 
notification_email {
        linuxadmin@notify-domain.com     # Email address for notifications 
    }
 notification_email_from keepalived@server1-fqdn        # The from address for the notifications
    smtp_server 127.0.0.1                       # SMTP server address
    smtp_connect_timeout 15
}

vrrp_script haproxy {
  script "killall -0 haproxy"
  interval 2
  weight 2
  user root
}

vrrp_instance LB_VIP_QA {
  virtual_router_id 50
  advert_int 1
  priority 51

  state MASTER
  interface eth0
  smtp_alert          # Enable Notifications Via Email
  
  authentication {
              auth_type PASS
              auth_pass testp141

    }
### Commented because running on VM on VMWare
##    unicast_src_ip 10.44.192.134 # Private IP address of master
##    unicast_peer {
##        10.44.192.135           # Private IP address of the backup haproxy
##   }

#        }
# master node with higher priority preferred node for Virtual IP if both keepalived up
###  priority 51
###  state MASTER
###  interface eth0
  virtual_ipaddress {
     10.10.10.1 dev eth0 # The virtual IP address that will be shared between MASTER and BACKUP
  }
  track_script {
      haproxy
  }
}

 

 To dowload a copy of the Master keepalived.conf configuration click here

Below are few interesting configuration variables, worthy to mention few words on, most of them are obvious by their names but for more clarity I'll also give a list here with short description of each:

 

  • vrrp_instance – defines an individual instance of the VRRP protocol running on an interface.
  • state – defines the initial state that the instance should start in (i.e. MASTER / SLAVE )state –
  • interface – defines the interface that VRRP runs on.
  • virtual_router_id – should be unique value per Keepalived Node (otherwise slave master won't function properly)
  • priority – the advertised priority, the higher the priority the more important the respective configured keepalived node is.
  • advert_int – specifies the frequency that advertisements are sent at (1 second, in this case).
  • authentication – specifies the information necessary for servers participating in VRRP to authenticate with each other. In this case, a simple password is defined.
    only the first eight (8) characters will be used as described in  to note is Important thing
    man keepalived.conf – keepalived.conf variables documentation !!! Nota Bene !!! – Password set on each node should match for nodes to be able to authenticate !
  • virtual_ipaddress – defines the IP addresses (there can be multiple) that VRRP is responsible for.
  • notification_email – the notification email to which Alerts will be send in case if keepalived on 1 node is stopped (e.g. the MASTER node switches from host 1 to 2)
  • notification_email_from – email address sender from where email will originte
    ! NB ! In order for notification_email to be working you need to have configured MTA or Mail Relay (set to local MTA) to another SMTP – e.g. have configured something like Postfix, Qmail or Postfix

b. Configure keepalived to serve as a SLAVE Node

[root@server1:~]vim /etc/keepalived/keepalived.conf
 

#Slave keepalived
global_defs {
  router_id server2-fqdn # The hostname of this host!

  enable_script_security
  # Synchro of the state of the connections between the LBs on the eth0 interface
  lvs_sync_daemon eth0
 
notification_email {
        linuxadmin@notify-host.com     # Email address for notifications
    }
 notification_email_from keepalived@server2-fqdn        # The from address for the notifications
    smtp_server 127.0.0.1                       # SMTP server address
    smtp_connect_timeout 15
}

vrrp_script haproxy {
  script "killall -0 haproxy"
  interval 2
  weight 2
  user root
}

vrrp_instance LB_VIP_QA {
  virtual_router_id 50
  advert_int 1
  priority 50

  state BACKUP
  interface eth0
  smtp_alert          # Enable Notifications Via Email

authentication {
              auth_type PASS
              auth_pass testp141
}
### Commented because running on VM on VMWare    
##    unicast_src_ip 10.10.192.135 # Private IP address of master
##    unicast_peer {
##        10.10.192.134         # Private IP address of the backup haproxy
##   }

###  priority 50
###  state BACKUP
###  interface eth0
  virtual_ipaddress {
     10.10.10.1 dev eth0 # The virtual IP address that will be shared betwee MASTER and BACKUP.
  }
  track_script {
    haproxy
  }
}

 

Download the keepalived.conf slave config here

 

c. Set required sysctl parameters for haproxy to work as expected
 

[root@server1:~]vim /etc/sysctl.conf
#Haproxy config
# haproxy
net.core.somaxconn=65535
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_syn_backlog = 10240
net.ipv4.tcp_max_tw_buckets = 400000
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3

4. Test Keepalived keepalived.conf configuration syntax is OK

 

[root@server1:~]keepalived –config-test
(/etc/keepalived/keepalived.conf: Line 7) Unknown keyword 'lvs_sync_daemon_interface'
(/etc/keepalived/keepalived.conf: Line 21) Unable to set default user for vrrp script haproxy – removing
(/etc/keepalived/keepalived.conf: Line 31) (LB_VIP_QA) Specifying lvs_sync_daemon_interface against a vrrp is deprecated.
(/etc/keepalived/keepalived.conf: Line 31)              Please use global lvs_sync_daemon
(/etc/keepalived/keepalived.conf: Line 35) Truncating auth_pass to 8 characters
(/etc/keepalived/keepalived.conf: Line 50) (LB_VIP_QA) track script haproxy not found, ignoring…

I've experienced this error because first time I've configured keepalived, I did not mention the user with which the vrrp script haproxy should run,
in prior versions of keepalived, leaving the field empty did automatically assumed you have the user with which the vrrp script runs to be set to root
as of RHELs keepalived-2.1.5-6.el8.x86_64, i've been using however this is no longer so and thus in prior configuration as you can see I've
set the user in respective section to root.
The error Unknown keyword 'lvs_sync_daemon_interface'
is also easily fixable by just substituting the lvs_sync_daemon_interface and lvs_sync_daemon and reloading
keepalived etc.

Once keepalived is started and you can see the process on both machines running in process list.

[root@server1:~]ps -ef |grep -i keepalived
root     1190884       1  0 18:50 ?        00:00:00 /usr/sbin/keepalived -D
root     1190885 1190884  0 18:50 ?        00:00:00 /usr/sbin/keepalived -D

Next step is to check the keepalived statuses as well as /var/log/keepalived.log

If everything is configured as expected on both keepalived on first node you should see one is master and one is slave either in the status or the log

[root@server1:~]#systemctl restart keepalived

 

[root@server1:~]systemctl status keepalived|grep -i state
Mar 14 18:59:02 server1-fqdn Keepalived_vrrp[1192003]: (LB_VIP_QA) Entering MASTER STATE

[root@server1:~]systemctl status keepalived

● keepalived.service – LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Mon 2022-03-14 18:15:51 CET; 32min ago
  Process: 1187587 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1187589 (code=exited, status=0/SUCCESS)

Mar 14 18:15:04 server1lb-fqdn Keepalived_vrrp[1187590]: Sending gratuitous ARP on eth0 for 10.44.192.142
Mar 14 18:15:50 server1lb-fqdn systemd[1]: Stopping LVS and VRRP High Availability Monitor…
Mar 14 18:15:50 server1lb-fqdn Keepalived[1187589]: Stopping
Mar 14 18:15:50 server1lb-fqdn Keepalived_vrrp[1187590]: (LB_VIP_QA) sent 0 priority
Mar 14 18:15:50 server1lb-fqdn Keepalived_vrrp[1187590]: (LB_VIP_QA) removing VIPs.
Mar 14 18:15:51 server1lb-fqdn Keepalived_vrrp[1187590]: Stopped – used 0.002007 user time, 0.016303 system time
Mar 14 18:15:51 server1lb-fqdn Keepalived[1187589]: CPU usage (self/children) user: 0.000000/0.038715 system: 0.001061/0.166434
Mar 14 18:15:51 server1lb-fqdn Keepalived[1187589]: Stopped Keepalived v2.1.5 (07/13,2020)
Mar 14 18:15:51 server1lb-fqdn systemd[1]: keepalived.service: Succeeded.
Mar 14 18:15:51 server1lb-fqdn systemd[1]: Stopped LVS and VRRP High Availability Monitor

[root@server2:~]systemctl status keepalived|grep -i state
Mar 14 18:59:02 server2-fqdn Keepalived_vrrp[297368]: (LB_VIP_QA) Entering BACKUP STATE

[root@server1:~]# grep -i state /var/log/keepalived.log
Mar 14 18:59:02 server1lb-fqdn Keepalived_vrrp[297368]: (LB_VIP_QA) Entering MASTER STATE
 

a. Fix Keepalived SECURITY VIOLATION – scripts are being executed but script_security not enabled.
 

When configurating keepalived for a first time we have faced the following strange error inside keepalived status inside keepalived.log 
 

Feb 23 14:28:41 server1 Keepalived_vrrp[945478]: SECURITY VIOLATION – scripts are being executed but script_security not enabled.

 

To fix keepalived SECURITY VIOLATION error:

Add to /etc/keepalived/keepalived.conf on the keepalived node hosts
inside 

global_defs {}

After chunk
 

enable_script_security

include

# Synchro of the state of the connections between the LBs on the eth0 interface
  lvs_sync_daemon_interface eth0

 

5. Prepare rsyslog configuration and Inlcude additional keepalived options
to force keepalived log into /var/log/keepalived.log

To force keepalived log into /var/log/keepalived.log on RHEL 8 / CentOS and other Redhat Package Manager (RPM) Linux distributions

[root@server1:~]# vim /etc/rsyslog.d/48_keepalived.conf

#2022/02/02: HAProxy logs to local6, save the messages
local7.*                                                /var/log/keepalived.log
if ($programname == 'Keepalived') then -/var/log/keepalived.log
if ($programname == 'Keepalived_vrrp') then -/var/log/keepalived.log
& stop

[root@server:~]# touch /var/log/keepalived.log

Reload rsyslog to load new config
 

[root@server:~]# systemctl restart rsyslog
[root@server:~]# systemctl status rsyslog

 

rsyslog.service – System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/rsyslog.service.d
           └─rsyslog-service.conf
   Active: active (running) since Mon 2022-03-07 13:34:38 CET; 1 weeks 0 days ago
     Docs: man:rsyslogd(8)

           https://www.rsyslog.com/doc/
 Main PID: 269574 (rsyslogd)
    Tasks: 6 (limit: 100914)
   Memory: 5.1M
   CGroup: /system.slice/rsyslog.service
           └─269574 /usr/sbin/rsyslogd -n

Mar 15 08:15:16 server1lb-fqdn rsyslogd[269574]: — MARK —
Mar 15 08:35:16 server1lb-fqdn rsyslogd[269574]: — MARK —
Mar 15 08:55:16 server1lb-fqdn rsyslogd[269574]: — MARK —

 

If once keepalived is loaded but you still have no log written inside /var/log/keepalived.log

[root@server1:~]# vim /etc/sysconfig/keepalived
 KEEPALIVED_OPTIONS="-D -S 7"

[root@server2:~]# vim /etc/sysconfig/keepalived
 KEEPALIVED_OPTIONS="-D -S 7"

[root@server1:~]# systemctl restart keepalived.service
[root@server1:~]#  systemctl status keepalived

● keepalived.service – LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-02-24 12:12:20 CET; 2 weeks 4 days ago
 Main PID: 1030501 (keepalived)
    Tasks: 2 (limit: 100914)
   Memory: 1.8M
   CGroup: /system.slice/keepalived.service
           ├─1030501 /usr/sbin/keepalived -D
           └─1030502 /usr/sbin/keepalived -D

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

[root@server2:~]# systemctl restart keepalived.service
[root@server2:~]# systemctl status keepalived

6. Monitoring VRRP traffic of the two keepaliveds with tcpdump
 

Once both keepalived are up and running a good thing is to check the VRRP protocol traffic keeps fluently on both machines.
Keepalived VRRP keeps communicating over the TCP / IP Port 112 thus you can simply snoop TCP tracffic on its protocol.
 

[root@server1:~]# tcpdump proto 112

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:08:07.356187 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:08.356297 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:09.356408 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:10.356511 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:11.356655 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20

[root@server2:~]# tcpdump proto 112

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
​listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:08:07.356187 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:08.356297 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:09.356408 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:10.356511 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:11.356655 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20

As you can see the VRRP traffic on the network is originating only from server1lb-fqdn, this is so because host server1lb-fqdn is the keepalived configured master node.

It is possible to spoof the password configured to authenticate between two nodes, thus if you're bringing up keepalived service cluster make sure your security is tight at best the machines should be in a special local LAN DMZ, do not configure DMZ on the internet !!! 🙂 Or if you eventually decide to configure keepalived in between remote hosts, make sure you somehow use encrypted VPN or SSH tunnels to tunnel the VRRP traffic.

[root@server1:~]tcpdump proto 112 -vv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:36:25.530772 IP (tos 0xc0, ttl 255, id 59838, offset 0, flags [none], proto VRRP (112), length 40)
    server1lb-fqdn > vrrp.mcast.net: vrrp server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20, addrs: VIPIP_QA auth "testp431"
11:36:26.530874 IP (tos 0xc0, ttl 255, id 59839, offset 0, flags [none], proto VRRP (112), length 40)
    server1lb-fqdn > vrrp.mcast.net: vrrp server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20, addrs: VIPIP_QA auth "testp431"

Lets also check what floating IP is configured on the machines:

[root@server1:~]# ip -brief address show
lo               UNKNOWN        127.0.0.1/8 
eth0             UP             10.10.10.5/26 10.10.10.1/32 

The 10.10.10.5 IP is the main IP set on LAN interface eth0, 10.10.10.1 is the floating IP which as you can see is currently set by keepalived to listen on first node.

[root@server2:~]# ip -brief address show |grep -i 10.10.10.1

An empty output is returned as floating IP is currently configured on server1

To double assure ourselves the IP is assigned on correct machine, lets ping it and check the IP assigned MAC  currently belongs to which machine.
 

[root@server2:~]# ping 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
64 bytes from 10.10.10.1: icmp_seq=1 ttl=64 time=0.526 ms
^C
— 10.10.10.1 ping statistics —
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.526/0.526/0.526/0.000 ms

[root@server2:~]# arp -an |grep -i 10.44.192.142
? (10.10.10.1) at 00:48:54:91:83:7d [ether] on eth0
[root@server2:~]# ip a s|grep -i 00:48:54:91:83:7d
[root@server2:~]# 

As you can see from below output MAC is not found in configured IPs on server2.
 

[root@server1-fqdn:~]# /sbin/ip a s|grep -i 00:48:54:91:83:7d -B1 -A1
 eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:48:54:91:83:7d brd ff:ff:ff:ff:ff:ff
inet 10.10.10.1/26 brd 10.10.1.191 scope global noprefixroute eth0

Pretty much expected MAC is on keepalived node server1.

 

7. Testing keepalived on server1 and server2 maachines VIP floating IP really works
 

To test the overall configuration just created, you should stop keeaplived on the Master node and in meantime keep an eye on Slave node (server2), whether it can figure out the Master node is gone and switch its
state BACKUP to save MASTER. By changing the secondary (Slave) keepalived to master the floating IP: 10.10.10.1 will be brought up by the scripts on server2.

Lets assume that something went wrong with server1 VM host, for example the machine crashed due to service overload, DDoS or simply a kernel bug or whatever reason.
To simulate that we simply have to stop keepalived, then the broadcasted information on VRRP TCP/IP proto port 112 will be no longer available and keepalived on node server2, once
unable to communicate to server1 should chnage itself to state MASTER.

[root@server1:~]# systemctl stop keepalived
[root@server1:~]# systemctl status keepalived

● keepalived.service – LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Tue 2022-03-15 12:11:33 CET; 3s ago
  Process: 1192001 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1192002 (code=exited, status=0/SUCCESS)

Mar 14 18:59:07 server1lb-fqdn Keepalived_vrrp[1192003]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:32 server1lb-fqdn systemd[1]: Stopping LVS and VRRP High Availability Monitor…
Mar 15 12:11:32 server1lb-fqdn Keepalived[1192002]: Stopping
Mar 15 12:11:32 server1lb-fqdn Keepalived_vrrp[1192003]: (LB_VIP_QA) sent 0 priority
Mar 15 12:11:32 server1lb-fqdn Keepalived_vrrp[1192003]: (LB_VIP_QA) removing VIPs.
Mar 15 12:11:33 server1lb-fqdn Keepalived_vrrp[1192003]: Stopped – used 2.145252 user time, 15.513454 system time
Mar 15 12:11:33 server1lb-fqdn Keepalived[1192002]: CPU usage (self/children) user: 0.000000/44.555362 system: 0.001151/170.118126
Mar 15 12:11:33 server1lb-fqdn Keepalived[1192002]: Stopped Keepalived v2.1.5 (07/13,2020)
Mar 15 12:11:33 server1lb-fqdn systemd[1]: keepalived.service: Succeeded.
Mar 15 12:11:33 server1lb-fqdn systemd[1]: Stopped LVS and VRRP High Availability Monitor.

 

On keepalived off, you will get also a notification Email on the Receipt Email configured from keepalived.conf from the working keepalived node with a simple message like:

=> VRRP Instance is no longer owning VRRP VIPs <=

Once keepalived is back up you will get another notification like:

=> VRRP Instance is now owning VRRP VIPs <=

[root@server2:~]# systemctl status keepalived
● keepalived.service – LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2022-03-14 18:13:52 CET; 17h ago
  Process: 297366 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 297367 (keepalived)
    Tasks: 2 (limit: 100914)
   Memory: 2.1M
   CGroup: /system.slice/keepalived.service
           ├─297367 /usr/sbin/keepalived -D -S 7
           └─297368 /usr/sbin/keepalived -D -S 7

Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: Remote SMTP server [127.0.0.1]:25 connected.
Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: SMTP alert successfully sent.
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: (LB_VIP_QA) Sending/queueing gratuitous ARPs on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1

[root@server2:~]#  ip addr show|grep -i 10.10.10.1
    inet 10.10.10.1/32 scope global eth0
    

As you see the VIP is now set on server2, just like expected – that's OK, everything works as expected. If the IP did not move double check the keepalived.conf on both nodes for errors or misconfigurations.

To recover the initial order of things so server1 is MASTER and server2 SLAVE host, we just have to switch on the keepalived on server1 machine.

[root@server1:~]# systemctl start keepalived

The automatic change of server1 to MASTER node and respective move of the VIP IP is done because of the higher priority (of importance we previously configured on server1 in keepalived.conf).
 

What we learned?
 

So what we learned in  this article?
We have seen how to easily install and configure a High Availability Load balancer with Keepalived with single floating VIP IP address with 1 MASTER and 1 SLAVE host and a Haproxy example config with few frontends / App backends. We have seen how the config can be tested for potential errors and how we can monitor whether the VRRP2 network traffic flows between nodes and how to potentially debug it further if necessery.
Further on rawly explained some of the keepalived configurations but as keepalived can do pretty much more,for anyone seriously willing to deal with keepalived on a daily basis or just fine tune some already existing ones, you better read closely its manual page "man keepalived.conf" as well as the official Redhat Linux documentation page on setting up a Linux cluster with Keepalived (Be prepare for a small nightmare as the documentation of it seems to be a bit chaotic, and even I would say partly missing or opening questions on what does the developers did meant – not strange considering the havoc that is pretty much as everywhere these days.)

Finally once keepalived hosts are prepared, it was shown how to test the keepalived application cluster and Floating IP does move between nodes in case if one of the 2 keepalived nodes is inaccessible.

The same logic can be repeated multiple times and if necessery you can set multiple VIPs to expand the HA reachable IPs solution.

high-availability-with-two-vips-example-diagram

The presented idea is with haproxy forward Proxy server to proxy requests towards Application backend (servince machines), however if you need to set another set of server on the flow to  process HTML / XHTML / PHP / Perl / Python  programming code, with some common Webserver setup ( Nginx / Apache / Tomcat / JBOSS) and enable SSL Secure certificate with lets say Letsencrypt, this can be relatively easily done. If you want to implement letsencrypt and a webserver check this redundant SSL Load Balancing with haproxy & keepalived article.

That's all folks, hope you enjoyed.
If you need to configure keepalived Cluster or a consultancy write your query here 🙂

How to filter dhcp traffic between two networks running separate DHCP servers to prevent IP assignment issues and MAC duplicate addresses

Tuesday, February 8th, 2022

how-to-filter-dhcp-traffic-2-networks-running-2-separate-dhcpd-servers-to-prevent-ip-assignment-conflicts-linux
Tracking the Problem of MAC duplicates on Linux routers
 

If you have two networks that see each other and they're not separated in VLANs but see each other sharing a common netmask lets say 255.255.254.0 or 255.255.252.0, it might happend that there are 2 dhcp servers for example (isc-dhcp-server running on 192.168.1.1 and dhcpd running on 192.168.0.1 can broadcast their services to both LANs 192.168.1.0.1/24 (netmask 255.255.255.0) and Local Net LAN 192.168.1.1/24. The result out of this is that some devices might pick up their IP address via DHCP from the wrong dhcp server.

Normally if you have a fully controlled little or middle class home or office network (10 – 15 electronic devices nodes) connecting to the LAN in a mixed moth some are connected via one of the Networks via connected Wifi to 192.168.1.0/22 others are LANned and using static IP adddresses and traffic is routed among two ISPs and each network can see the other network, there is always a possibility of things to go wrong. This is what happened to me so this is how this post was born.

The best practice from my experience so far is to define each and every computer / phone / laptop host joining the network and hence later easily monitor what is going on the network with something like iptraf-ng / nethogs  / iperf – described in prior  how to check internet spepeed from console and in check server internet connectivity speed with speedtest-cliiftop / nload or for more complex stuff wireshark or even a simple tcpdump. No matter the tools network monitoring is only part on solving network issues. A very must have thing in a controlled network infrastructure is defining every machine part of it to easily monitor later with the monitoring tools. Defining each and every host on the Hybrid computer networks makes administering the network much easier task and  tracking irregularities on time is much more likely. 

Since I have such a hybrid network here hosting a couple of XEN virtual machines with Linux, Windows 7 and Windows 10, together with Mac OS X laptops as well as MacBook Air notebooks, I have followed this route and tried to define each and every host based on its MAC address to pick it up from the correct DHCP1 server  192.168.1.1 (that is distributing IPs for Internet Provider 1 (ISP 1), that is mostly few computers attached UTP LAN cables via LiteWave LS105G Gigabit Switch as well from DHCP2 – used only to assigns IPs to servers and a a single Wi-Fi Access point configured to route incoming clients via 192.168.0.1 Linux NAT gateway server.

To filter out the unwanted IPs from the DHCPD not to propagate I've so far used a little trick to  Deny DHCP MAC Address for unwanted clients and not send IP offer for them.

To give you more understanding,  I have to clear it up I don't want to have automatic IP assignments from DHCP2 / LAN2 to DHCP1 / LAN1 because (i don't want machines on DHCP1 to end up with IP like 192.168.0.50 or DHCP2 (to have 192.168.1.80), as such a wrong IP delegation could potentially lead to MAC duplicates IP conflicts. MAC Duplicate IP wrong assignments for those older or who have been part of administrating large ISP network infrastructures  makes the network communication unstable for no apparent reason and nodes partially unreachable at times or full time …

However it seems in the 21-st century which is the century of strangeness / computer madness in the 2022, technology advanced so much that it has massively started to break up some good old well known sysadmin standards well documented in the RFCs I know of my youth, such as that every electronic equipment manufactured Vendor should have a Vendor Assigned Hardware MAC Address binded to it that will never change (after all that was the idea of MAC addresses wasn't it !). 
Many mobile devices nowadays however, in the developers attempts to make more sophisticated software and Increase Anonimity on the Net and Security, use a technique called  MAC Address randomization (mostly used by hackers / script kiddies of the early days of computers) for their Wi-Fi Net Adapter OS / driver controlled interfaces for the sake of increased security (the so called Private WiFi Addresses). If a sysadmin 10-15 years ago has seen that he might probably resign his profession and turn to farming or agriculture plant growing, but in the age of digitalization and "cloud computing", this break up of common developed network standards starts to become the 'new normal' standard.

I did not suspected there might be a MAC address oddities, since I spare very little time on administering the the network. This was so till recently when I accidently checked the arp table with:

Hypervisor:~# arp -an
192.168.1.99     5c:89:b5:f2:e8:d8      (Unknown)
192.168.1.99    00:15:3e:d3:8f:76       (Unknown)

..


and consequently did a network MAC Address ARP Scan with arp-scan (if you never used this little nifty hacker tool I warmly recommend it !!!)
If you don't have it installed it is available in debian based linuces from default repos to install

Hypervisor:~# apt-get install –yes arp-scan


It is also available on CentOS / Fedora / Redhat and other RPM distros via:

Hypervisor:~# yum install -y arp-scan

 

 

Hypervisor:~# arp-scan –interface=eth1 192.168.1.0/24

192.168.1.19    00:16:3e:0f:48:05       Xensource, Inc.
192.168.1.22    00:16:3e:04:11:1c       Xensource, Inc.
192.168.1.31    00:15:3e:bb:45:45       Xensource, Inc.
192.168.1.38    00:15:3e:59:96:8e       Xensource, Inc.
192.168.1.34    00:15:3e:d3:8f:77       Xensource, Inc.
192.168.1.60    8c:89:b5:f2:e8:d8       Micro-Star INT'L CO., LTD
192.168.1.99     5c:89:b5:f2:e8:d8      (Unknown)
192.168.1.99    00:15:3e:d3:8f:76       (Unknown)

192.168.x.91     02:a0:xx:xx:d6:64        (Unknown)
192.168.x.91     02:a0:xx:xx:d6:64        (Unknown)  (DUP: 2)

N.B. !. I found it helpful to check all available interfaces on my Linux NAT router host.

As you see the scan revealed, a whole bunch of MAC address mess duplicated MAC hanging around, destroying my network topology every now and then 
So far so good, the MAC duplicates and strangely hanging around MAC addresses issue, was solved relatively easily with enabling below set of systctl kernel variables.
 

1. Fixing Linux ARP common well known Problems through disabling arp_announce / arp_ignore / send_redirects kernel variables disablement

 

Linux answers ARP requests on wrong and unassociated interfaces per default. This leads to the following two problems:

ARP requests for the loopback alias address are answered on the HW interfaces (even if NOARP on lo0:1 is set). Since loopback aliases are required for DSR (Direct Server Return) setups this problem is very common (but easy to fix fortunately).

If the machine is connected twice to the same switch (e.g. with eth0 and eth1) eth2 may answer ARP requests for the address on eth1 and vice versa in a race condition manner (confusing almost everything).

This can be prevented by specific arp kernel settings. Take a look here for additional information about the nature of the problem (and other solutions): ARP flux.

To fix that generally (and reboot safe) we  include the following lines into

 

Hypervisor:~# cp -rpf /etc/sysctl.conf /etc/sysctl.conf_bak_07-feb-2022
Hypervisor:~# cat >> /etc/sysctl.conf

# LVS tuning
net.ipv4.conf.lo.arp_ignore=1
net.ipv4.conf.lo.arp_announce=2
net.ipv4.conf.all.arp_ignore=1
net.ipv4.conf.all.arp_announce=2

net.ipv4.conf.all.send_redirects=0
net.ipv4.conf.eth0.send_redirects=0
net.ipv4.conf.eth1.send_redirects=0
net.ipv4.conf.default.send_redirects=0

Press CTRL + D simultaneusly to Write out up-pasted vars.


To read more on Load Balancer using direct routing and on LVS and the arp problem here


2. Digging further the IP conflict / dulicate MAC Problems

Even after this arp tunings (because I do have my Hypervisor 2 LAN interfaces connected to 1 switch) did not resolved the issues and still my Wireless Connected devices via network 192.168.1.1/24 (ISP2) were randomly assigned the wrong range IPs 192.168.0.XXX/24 as well as the wrong gateway 192.168.0.1 (ISP1).
After thinking thoroughfully for hours and checking the network status with various tools and thanks to the fact that my wife has a MacBook Air that was always complaining that the IP it tried to assign from the DHCP was already taken, i"ve realized, something is wrong with DHCP assignment.
Since she owns a IPhone 10 with iOS and this two devices are from the same vendor e.g. Apple Inc. And Apple's products have been having strange DHCP assignment issues from my experience for quite some time, I've thought initially problems are caused by software on Apple's devices.
I turned to be partially right after expecting the logs of DHCP server on the Linux host (ISP1) finding that the phone of my wife takes IP in 192.168.0.XXX, insetad of IP from 192.168.1.1 (which has is a combined Nokia Router with 2.4Ghz and 5Ghz Wi-Fi and LAN router provided by ISP2 in that case Vivacom). That was really puzzling since for me it was completely logical thta the iDevices must check for DHCP address directly on the Network of the router to whom, they're connecting. Guess my suprise when I realized that instead of that the iDevices does listen to the network on a wide network range scan for any DHCPs reachable baesd on the advertised (i assume via broadcast) address traffic and try to connect and take the IP to the IP of the DHCP which responds faster !!!! Of course the Vivacom Chineese produced Nokia router responded DHCP requests and advertised much slower, than my Linux NAT gateway on ISP1 and because of that the Iphone and iOS and even freshest versions of Android devices do take the IP from the DHCP that responds faster, even if that router is not on a C class network (that's invasive isn't it??). What was even more puzzling was the automatic MAC Randomization of Wifi devices trying to connect to my ISP1 configured DHCPD and this of course trespassed any static MAC addresses filtering, I already had established there.

Anyways there was also a good think out of tthat intermixed exercise 🙂 While playing around with the Gigabit network router of vivacom I found a cozy feature SCHEDULEDING TURNING OFF and ON the WIFI ACCESS POINT  – a very useful feature to adopt, to stop wasting extra energy and lower a bit of radiation is to set a swtich off WIFI AP from 12:30 – 06:30 which are the common sleeping hours or something like that.
 

3. What is MAC Randomization and where and how it is configured across different main operating systems as of year 2022?

Depending on the operating system of your device, MAC randomization will be available either by default on most modern mobile OSes or with possibility to have it switched on:

  • Android Q: Enabled by default 
  • Android P: Available as a developer option, disabled by default
  • iOS 14: Available as a user option, disabled by default
  • Windows 10: Available as an option in two ways – random for all networks or random for a specific network

Lately I don't have much time to play around with mobile devices, and I do not my own a luxury mobile phone so, the fact this ne Androids have this MAC randomization was unknown to me just until I ended a small mess, based on my poor configured networks due to my tight time constrains nowadays.

Finding out about the new security feature of MAC Randomization, on all Android based phones (my mother's Nokia smartphone and my dad's phone, disabled the feature ASAP:


4. Disable MAC Wi-Fi Ethernet device Randomization on Android

MAC Randomization creates a random MAC address when joining a Wi-Fi network for the first time or after “forgetting” and rejoining a Wi-Fi network. It Generates a new random MAC address after 24 hours of last connection.

Disabling MAC Randomization on your devices. It is done on a per SSID basis so you can turn off the randomization, but allow it to function for hotspots outside of your home.

  1. Open the Settings app
  2. Select Network and Internet
  3. Select WiFi
  4. Connect to your home wireless network
  5. Tap the gear icon next to the current WiFi connection
  6. Select Advanced
  7. Select Privacy
  8. Select "Use device MAC"
     

5. Disabling MAC Randomization on MAC iOS, iPhone, iPad, iPod

To Disable MAC Randomization on iOS Devices:

Open the Settings on your iPhone, iPad, or iPod, then tap Wi-Fi or WLAN

 

  1. Tap the information button next to your network
  2. Turn off Private Address
  3. Re-join the network


Of course next I've collected their phone Wi-Fi adapters and made sure the included dhcp MAC deny rules in /etc/dhcp/dhcpd.conf are at place.

The effect of the MAC Randomization for my Network was terrible constant and strange issues with my routings and networks, which I always thought are caused by the openxen hypervisor Virtualization VM bugs etc.

That continued for some months now, and the weird thing was the issues always started when I tried to update my Operating system to the latest packetset, do a reboot to load up the new piece of software / libraries etc. and plus it happened very occasionally and their was no obvious reason for it.

 

6. How to completely filter dhcp traffic between two network router hosts
IP 192.168.0.1 / 192.168.1.1 to stop 2 or more configured DHCP servers
on separate networks see each other

To prevent IP mess at DHCP2 server side (which btw is ISC DHCP server, taking care for IP assignment only for the Servers on the network running on Debian 11 Linux), further on I had to filter out any DHCP UDP traffic with iptables completely.
To prevent incorrect route assignments assuming that you have 2 networks and 2 routers that are configurred to do Network Address Translation (NAT)-ing Router 1: 192.168.0.1, Router 2: 192.168.1.1.

You have to filter out UDP Protocol data on Port 67 and 68 from the respective source and destination addresses.

In firewall rules configuration files on your Linux you need to have some rules as:

# filter outgoing dhcp traffic from 192.168.1.1 to 192.168.0.1
-A INPUT -p udp -m udp –dport 67:68 -s 192.168.1.1 -d 192.168.0.1 -j DROP
-A OUTPUT -p udp -m udp –dport 67:68 -s 192.168.1.1 -d 192.168.0.1 -j DROP
-A FORWARD -p udp -m udp –dport 67:68 -s 192.168.1.1 -d 192.168.0.1 -j DROP

-A INPUT -p udp -m udp –dport 67:68 -s 192.168.0.1 -d 192.168.1.1 -j DROP
-A OUTPUT -p udp -m udp –dport 67:68 -s 192.168.0.1 -d 192.168.1.1 -j DROP
-A FORWARD -p udp -m udp –dport 67:68 -s 192.168.0.1 -d 192.168.1.1 -j DROP

-A INPUT -p udp -m udp –sport 67:68 -s 192.168.1.1 -d 192.168.0.1 -j DROP
-A OUTPUT -p udp -m udp –sport 67:68 -s 192.168.1.1 -d 192.168.0.1 -j DROP
-A FORWARD -p udp -m udp –sport 67:68 -s 192.168.1.1 -d 192.168.0.1 -j DROP


You can download also filter_dhcp_traffic.sh with above rules from here


Applying this rules, any traffic of DHCP between 2 routers is prohibited and devices from Net: 192.168.1.1-255 will no longer wrongly get assinged IP addresses from Network range: 192.168.0.1-255 as it happened to me.


7. Filter out DHCP traffic based on MAC completely on Linux with arptables

If even after disabling MAC randomization on all devices on the network, and you know physically all the connecting devices on the Network, if you still see some weird MAC addresses, originating from a wrongly configured ISP traffic router host or whatever, then it is time to just filter them out with arptables.

## drop traffic prevent mac duplicates due to vivacom and bergon placed in same network – 255.255.255.252
dchp1-server:~# arptables -A INPUT –source-mac 70:e2:83:12:44:11 -j DROP


To list arptables configured on Linux host

dchp1-server:~# arptables –list -n


If you want to be paranoid sysadmin you can implement a MAC address protection with arptables by only allowing a single set of MAC Addr / IPs and dropping the rest.

dchp1-server:~# arptables -A INPUT –source-mac 70:e2:84:13:45:11 -j ACCEPT
dchp1-server:~# arptables -A INPUT  –source-mac 70:e2:84:13:45:12 -j ACCEPT


dchp1-server:~# arptables -L –line-numbers
Chain INPUT (policy ACCEPT)
1 -j DROP –src-mac 70:e2:84:13:45:11
2 -j DROP –src-mac 70:e2:84:13:45:12

Once MACs you like are accepted you can set the INPUT chain policy to DROP as so:

dchp1-server:~# arptables -P INPUT DROP


If you later need to temporary, clean up the rules inside arptables on any filtered hosts flush all rules inside INPUT chain, like that
 

dchp1-server:~#  arptables -t INPUT -F

Install and configure rkhunter for improved security on a PCI DSS Linux / BSD servers with no access to Internet

Wednesday, November 10th, 2021

install-and-configure-rkhunter-with-tightened-security-variables-rkhunter-logo

rkhunter or Rootkit Hunter scans systems for known and unknown rootkits. The tool is not new and most system administrators that has to mantain some good security servers perhaps already use it in their daily sysadmin tasks.

It does this by comparing SHA-1 Hashes of important files with known good ones in online databases, searching for default directories (of rootkits), wrong permissions, hidden files, suspicious strings in kernel modules, commmon backdoors, sniffers and exploits as well as other special tests mostly for Linux and FreeBSD though a ports for other UNIX operating systems like Solaris etc. are perhaps available. rkhunter is notable due to its inclusion in popular mainstream FOSS operating systems (CentOS, Fedora,Debian, Ubuntu etc.).

Even though rkhunter is not rapidly improved over the last 3 years (its last Official version release was on 20th of Febuary 2018), it is a good tool that helps to strengthen even further security and it is often a requirement for Unix servers systems that should follow the PCI DSS Standards (Payment Card Industry Data Security Standards).

Configuring rkhunter is a pretty straight forward if you don't have too much requirements but I decided to write this article for the reason there are fwe interesting options that you might want to adopt in configuration to whitelist any files that are reported as Warnings, as well as how to set a configuration that sets a stricter security checks than the installation defaults. 

1. Install rkhunter .deb / .rpm package depending on the Linux distro or BSD

  • If you have to place it on a Redhat based distro CentOS / Redhat / Fedora

[root@Centos ~]# yum install -y rkhunter

 

  • On Debian distros the package name is equevallent to install there exec usual:

root@debian:~# apt install –yes rkhunter

  • On FreeBSD / NetBSD or other BSD forks you can install it from the BSD "World" ports system or install it from a precompiled binary.

freebsd# pkg install rkhunter

One important note to make here is to have a fully functional Alarming from rkhunter, you will have to have a fully functional configured postfix / exim / qmail whatever mail server to relay via official SMTP so you the Warning Alarm emails be able to reach your preferred Alarm email address. If you haven't installed postfix for example and configure it you might do.

– On Deb based distros 

[root@Centos ~]#yum install postfix


– On RPM based distros

root@debian:~# apt-get install –yes postfix


and as minimum, further on configure some functional Email Relay server within /etc/postfix/main.cf
 

# vi /etc/postfix/main.cf
relayhost = [relay.smtp-server.com]

2. Prepare rkhunter.conf initial configuration


Depending on what kind of files are present on the filesystem it could be for some reasons some standard package binaries has to be excluded for verification, because they possess unusual permissions because of manual sys admin monification this is done with the rkhunter variable PKGMGR_NO_VRFY.

If remote logging is configured on the system via something like rsyslog you will want to specificly tell it to rkhunter so this check as a possible security issue is skipped via ALLOW_SYSLOG_REMOTE_LOGGING=1. 

In case if remote root login via SSH protocol is disabled via /etc/ssh/sshd_config
PermitRootLogin no variable, the variable to include is ALLOW_SSH_ROOT_USER=no

It is useful to also increase the hashing check algorithm for security default one SHA256 you might want to change to SHA512, this is done via rkhunter.conf var HASH_CMD=SHA512

Triggering new email Warnings has to be configured so you receive, new mails at a preconfigured mailbox of your choice via variable
MAIL-ON-WARNING=SetMailAddress

 

# vi /etc/rkhunter.conf

PKGMGR_NO_VRFY=/usr/bin/su

PKGMGR_NO_VRFY=/usr/bin/passwd

ALLOW_SYSLOG_REMOTE_LOGGING=1

# Needed for corosync/pacemaker since update 19.11.2020

ALLOWDEVFILE=/dev/shm/qb-*/qb-*

# enabled ssh root access skip

ALLOW_SSH_ROOT_USER=no

HASH_CMD=SHA512

# Email address to sent alert in case of Warnings

MAIL-ON-WARNING=Your-Customer@Your-Email-Server-Destination-Address.com

MAIL-ON-WARNING=Your-Second-Peronsl-Email-Address@SMTP-Server.com

DISABLE_TESTS=os_specific


Optionally if you're using something specific such as corosync / pacemaker High Availability cluster or some specific software that is creating /dev/ files identified as potential Risks you might want to add more rkhunter.conf options like:
 

# Allow PCS/Pacemaker/Corosync
ALLOWDEVFILE=/dev/shm/qb-attrd-*
ALLOWDEVFILE=/dev/shm/qb-cfg-*
ALLOWDEVFILE=/dev/shm/qb-cib_rw-*
ALLOWDEVFILE=/dev/shm/qb-cib_shm-*
ALLOWDEVFILE=/dev/shm/qb-corosync-*
ALLOWDEVFILE=/dev/shm/qb-cpg-*
ALLOWDEVFILE=/dev/shm/qb-lrmd-*
ALLOWDEVFILE=/dev/shm/qb-pengine-*
ALLOWDEVFILE=/dev/shm/qb-quorum-*
ALLOWDEVFILE=/dev/shm/qb-stonith-*
ALLOWDEVFILE=/dev/shm/pulse-shm-*
ALLOWDEVFILE=/dev/md/md-device-map
# Needed for corosync/pacemaker since update 19.11.2020
ALLOWDEVFILE=/dev/shm/qb-*/qb-*

# tomboy creates this one
ALLOWDEVFILE="/dev/shm/mono.*"
# created by libv4l
ALLOWDEVFILE="/dev/shm/libv4l-*"
# created by spice video
ALLOWDEVFILE="/dev/shm/spice.*"
# created by mdadm
ALLOWDEVFILE="/dev/md/autorebuild.pid"
# 389 Directory Server
ALLOWDEVFILE=/dev/shm/sem.slapd-*.stats
# squid proxy
ALLOWDEVFILE=/dev/shm/squid-cf*
# squid ssl cache
ALLOWDEVFILE=/dev/shm/squid-ssl_session_cache.shm
# Allow podman
ALLOWDEVFILE=/dev/shm/libpod*lock*

 

3. Set the proper mirror database URL location to internal network repository

 

Usually  file /var/lib/rkhunter/db/mirrors.dat does contain Internet server address where latest version of mirrors.dat could be fetched, below is how it looks by default on Debian 10 Linux.

root@debian:/var/lib/rkhunter/db# cat mirrors.dat 
Version:2007060601
mirror=http://rkhunter.sourceforge.net
mirror=http://rkhunter.sourceforge.net

As you can guess a machine that doesn't have access to the Internet neither directly, neither via some kind of secure proxy because it is in a Paranoic Demilitarized Zone (DMZ) Network with many firewalls. What you can do then is setup another Mirror server (Apache / Nginx) within the local PCI secured LAN that gets regularly the database from official database on http://rkhunter.sourceforge.net/ (by installing and running rkhunter –update command on the Mirror WebServer and copying data under some directory structure on the remote local LAN accessible server, to keep the DB uptodate you might want to setup a cron to periodically copy latest available rkhunter database towards the http://mirror-url/path-folder/)

# vi /var/lib/rkhunter/db/mirrors.dat

local=http://rkhunter-url-mirror-server-url.com/rkhunter/1.4/


A mirror copy of entire db files from Debian 10.8 ( Buster ) ready for download are here.

Update entire file property db and check for rkhunter db updates

 

# rkhunter –update && rkhunter –propupdate

[ Rootkit Hunter version 1.4.6 ]

Checking rkhunter data files…
  Checking file mirrors.dat                                  [ Skipped ]
  Checking file programs_bad.dat                             [ No update ]
  Checking file backdoorports.dat                            [ No update ]
  Checking file suspscan.dat                                 [ No update ]
  Checking file i18n/cn                                      [ No update ]
  Checking file i18n/de                                      [ No update ]
  Checking file i18n/en                                      [ No update ]
  Checking file i18n/tr                                      [ No update ]
  Checking file i18n/tr.utf8                                 [ No update ]
  Checking file i18n/zh                                      [ No update ]
  Checking file i18n/zh.utf8                                 [ No update ]
  Checking file i18n/ja                                      [ No update ]

 

rkhunter-update-propupdate-screenshot-centos-linux


4. Initiate a first time check and see whether something is not triggering Warnings

# rkhunter –check

rkhunter-checking-for-rootkits-linux-screenshot

As you might have to run the rkhunter multiple times, there is annoying Press Enter prompt, between checks. The idea of it is that you're able to inspect what went on but since usually, inspecting /var/log/rkhunter/rkhunter.log is much more easier, I prefer to skip this with –skip-keypress option.

# rkhunter –check  –skip-keypress


5. Whitelist additional files and dev triggering false warnings alerts


You have to keep in mind many files which are considered to not be officially PCI compatible and potentially dangerous such as lynx browser curl, telnet etc. might trigger Warning, after checking them thoroughfully with some AntiVirus software such as Clamav and checking the MD5 checksum compared to a clean installed .deb / .rpm package on another RootKit, Virus, Spyware etc. Clean system (be it virtual machine or a Testing / Staging) machine you might want to simply whitelist the files which are incorrectly detected as dangerous for the system security.

Again this can be achieved with

PKGMGR_NO_VRFY=

Some Cluster softwares that are preparing their own /dev/ temporary files such as Pacemaker / Corosync might also trigger alarms, so you might want to suppress this as well with ALLOWDEVFILE

ALLOWDEVFILE=/dev/shm/qb-*/qb-*


If Warnings are found check what is the issue and if necessery white list files due to incorrect permissions in /etc/rkhunter.conf .

rkhunter-warnings-found-screenshot

Re-run the check until all appears clean as in below screenshot.

rkhunter-clean-report-linux-screenshot

Fixing Checking for a system logging configuration file [ Warning ]

If you happen to get some message like, message appears when rkhunter -C is done on legacy CentOS release 6.10 (Final) servers:

[13:45:29] Checking for a system logging configuration file [ Warning ]
[13:45:29] Warning: The 'systemd-journald' daemon is running, but no configuration file can be found.
[13:45:29] Checking if syslog remote logging is allowed [ Allowed ]

To fix it, you will have to disable SYSLOG_CONFIG_FILE at all.
 

SYSLOG_CONFIG_FILE=NONE

VIM Project (VI Improvied IDE Editor extension to facilitate web development with vi enhanced editor

Wednesday, August 25th, 2010

I use VIM as an editor of choice for many years already.
Yet it's until recently I use it for a PHP ZF (Zend Framework) web development.

Few days ago I've blogged How to configure vimrc for a php syntax highlightning (A Nicely pre-configured vimrc to imrpove the daily text editing experience

This enhancements significantly improves the overall PHP code editing with VIM. However I felt something is yet missing because I didn't have the power and functunality of a complete IDE like for instance The Eclipse IDE

I was pretty sure that VIM has to have a way to be used in a similar fashion to a fully functional IDE and looked around the net to find for any VIM plugins that will add vim an IDE like coding interface.

I then come accross a vim plugin called VIM Prokject : Organize/Navigate projects of files (like IDE/buffer explorer)

The latest VIM Project as of time of writting is 1.4.1 and I've mirrored it here

The installation of the VIM Project VIM extension is pretty straight forward to install it and start using it on your PC issue commands:

1. Install the project VIM add-on

debian:~$ wget https://www.pc-freak.net/files/project-1.4.1.tar.gz
debian:~$ mv project-1.4.1.tar.gz ~/.vim/
debian:~$ cd ~/.vim/
debian:~$ tar -zxvvf project-1.4.1.tar.gz

2. Load the plugin

Launch your vim editor and type : Project(without the space between : and P)
You will further see a screen like:

vim project entry screen

3. You will have to press C within the Project window to load a new project

Then you will have to type a directory to use to load a project sources files from:

vim project enter file source directory screen

You will be prompted with to type a project name like in the screenshot below:

vim project load test project

4. Next you will have to type a CD (Current Dir) parameter
To see more about the CD parameter consult vim project documentation by typing in main vim pane :help project

The appearing screen will be something like:

vim project extension cd parameter screen

5. Thereafter you will have to type a file filter

File filter is necessary and will instruct the vim project plugin to load all files with the specified extension within vim project pane window

You will experience a screen like:


vim project plugin file filter screen

Following will be a short interval in which all specified files by the filter type will get loaded in VIM project pane and your Zend Framework, PHP or any other source files will be listed in a directory tree structure like in the picture shown below:

vim project successful loaded project screen

6. Saving loaded project hierarchy state

In order to save a state of a loaded project within the VIM project window pane you will have to type in vim, let's say:

:saveas .projects/someproject

Later on to load back the saved project state you will have to type in vim :r .projects/someproject

You will now have almost fully functional development IDE on top of your simple vim text editor.

You can navigate within the Project files loaded with the Project extension pane easily and select a file you would like to open up, whenever a source file is opened and you work on it to switch in between the Project file listing pane and the opened source code file you will have to type twice CTRL+w or in vim language C-w

To even further sophisticate your web development in PHP with vim you can add within your ~/.vimrc file the following two lines:

" run file with PHP CLI (CTRL-M)
:autocmd FileType php noremap <C-M> :w!<CR>:!/usr/bin/php %<CR>
" PHP parser check (CTRL-L)
:autocmd FileType php noremap <C-L> :!/usr/bin/php -l %>CR>

In the above vim configuration directovies the " character is a comment line and the autocmd is actually vim declarations.
The first :autocmd … declaration will instruct vim to execute your current opened php source file with the php cli interpreter whenever a key press of CTRL+M (C-m) occurs.

The second :autocmd … will add to your vim a shortcut, so whenever a CTRL+L (C-l) key combination is pressed VIM editor will check your current edited source file for syntax errors.
Therefore this will enable you to very easily periodically check if your file syntax is correct.

Well this things were really helpful to me, so I hope they will be profitable for you as well.
Cheers 🙂

Vodka! :)

Wednesday, September 12th, 2007

Yesterday I drinked 200 gr. of Vodka yesterday Night, it was pretty refreshing for me but I got drunk a little.I'm smoking again … Things are going bad in my life recently. I have health issues. And I intend to go to doctor today.Yesterday I went to the polyclinic but my personal Dr. Nikolay  was not there (I was angry, I went to doctor once in years and he is not there) so I'll try again today. I had pains somewhere around the stomach. At least at work things are going smoothly at least God hears my prayers about this. I'm very confused and I have completely no idea what to do with my life. Yesterday I was out with Lily and Kiril on the fountain. The previous day Nomen, I, Yavor, Kiro and Bino went to the "Kobaklyka" (a woody place which is close to Dobrich.) Well that's most of what's happening lately with my life. I wrote a little script to make that nautilus to get restarted if it starts burning the cpu. It's a dumb script (the bad thing is that I'm loosing form scripting, Well I don't script much lately). Here is the script http://pcfreak.d-bg.net/bshscr/restart_nautilus.sh https://www.pc-freak.net/bshscr/restart_nautilus.sh. The days before the 4 days weekend, I hat to spend a lot of time on one of the servers fighting with Spammers. Hate spammers really! I ended removing bounce messages at all for one of the domains, which fixed the bounce spam method spammers use (btw qmail's chkuser seems to not work properly for some reason) … Also I started watching Stargate – SG1. First I thought it's a stupid sci-fi serial. But after the first serie I now think it has it's good moments :]. Also I had something like a Mortification Day going on during Monday. The whole day I listened to Mortification (The first Christian Death Metal Band). I Liked much the "Hammer of God" album. In the evening Sabin (Bino) came home and we watched some Mortification videos at Youtube. Right now I listen again to "Ever – Idyll" a pretty great song. And yeah I keep listening to ChristianIndustrial.net a lot, a great radio. Try it if you haven't!END—–

Set Domain multiple alias (Aliases) in IIS on Windows server howto

Saturday, October 24th, 2020

https://www.pc-freak.net/images/microsoft-iis-logo

On Linux as mentioned in my previous article it is pretty easy to use the VirtualHost Apache directive etc. to create ServerName and ServerAlias but on IIS Domain multiple alias add too me a while to google.

<VirtualHost *>
ServerName whatevever-server-host.com
ServerAlias www.whatever-server-host.com whatever-server-host.com
</VirtualHost>


In click and pray environments as Winblows, sometimes something rather easy to be done can be really annoying if you are not sure what to do and where to click and you have not passed some of the many cryptic microsoft offered ceritification programs offer for professional sysadmins, I'll name a few of them as to introduce UNIX guys like me to what you might ask a M$ admin during an interview if you want to check his 31337 Windows Sk!lls 🙂

 

  • Microsoft Certified Professional (MCP)
  • Microsoft Technology Associate (MTA) –
  • Microsoft Certified Solutions Expert (MCSE)-
  • Microsoft Specialist (MS) etc. –

A full list of Microsoft Certifed Professsional program here

Ok enough of  balling.

Here is  how to  create a domain alias in IIS on Windows server.

Login to your server and click on the START button then ‘Run’¦’, and then type ‘inetmgr.exe’.

Certainly you can go and click trough the Administrative tools section to start ISS manager, but for me this is fastest and easiest way.

create-domain-alias-on-windows-server-1a
 

Now expand the (local computer), then ‘Web Sites’ and locate the site for which you want to add alias (here it is called additional web site identification).

Right click on the domain and choose ‘Properties’ option at the bottom.

This will open the properties window where you have to choose ‘Web Site’ and then to locate ‘Website identification‘ section. Click on the ‘Advanced’¦’ button which stands next to the IP of the domain.

create-domain-alias-on-windows-server-2a
Advanced Web site identification window (Microsoft likes to see the word ‘Advanced’ in all of the management menus) will be opened, where we are going to add a new domain alias.

create-domain-alias-on-windows-server-3a.png

Click on the ‘Add’¦’ button and ‘Add/Edit website (alias)identification’ window will appear.

create-domain-alias-on-windows-server-4a.png

Make sure that you will choose the same IP address from the dropdown menu, then set the port number on which your web server is running (the default is 80), write the domain you want, and click ‘OK’ to create the new domain alias.

Actually click ‘OK’ until you have ‘Advanced Web site identification’ and the domain properties windows closed.

Right click on the domain again and ‘Stop’ and ‘Start’ the service.
This will be enough the IIS domain alias to start working.

create-domain-alias-on-windows-server-5a


Another useful thing for novice IIS admins that come from UNIX is a domain1 to domain2 redirect, this is done with writting an IIS rule which is an interesting but long topic for a limited post as like this, but just for the reference of fun to let you know this exist.

Domain 1 to Domain 2 Redirect
This rule comes handy when you change the name of your site or may be when you need to catch and alias and direct it to your main site. If the new and the old URLs share some elements, then you could just use this rule to have the matching pattern together with the redirect target being

domain1-to-domain2-redirect-iis

That's all folks, if you enjoyed the clicking laziness you're ready to retrain yourself to become a successful lazy Windows admin who calls Microsoft Support everyday as many of the errors and problems Windows sysadmins experience as I heard from a friend can only be managed by M$ Support (if they can be managed at all). 

Yes that's it the great and wonderful life of the avarage sysadmin. Long live computing … it's great! Everyday something broken needs to get fixed everyday something to rethink / relearn / reupdate and restructure or migrate a never ending story of weirdness.

A remark to  make, the idea for this post is originated based on a task I had to do long time ago on IIS, the images and the description behind them are taking from a post originally written on Domain Aliasing in IIS originally written by Anthony Gee unfortunately his blog is not available anymore so credits goes to him.