Posts Tagged ‘temp’

Why du and df reporting different on a filesystem / How to fix inconsistency between used space on FS and disk showing full strangeness

Wednesday, July 24th, 2019

linux-why-du-and-df-shows-different-result-inconsincy-explained-filesystem-full-oddity

If you're a sysadmin on a large server environment such as a couple of hundred of Virtual Machines running Linux OS on either physical host or OpenXen / VmWare hosted guest Virtual Machine, you might end up sometimes at an odd case where some mounted partition mount point reports its file use different when checked with
df
cmd than when checked with du command, like for example:
 

root@sqlserver:~# df -hT /var/lib/mysql
Filesystem   Type  Size Used Avail Use% Mounted On
/dev/sdb5      ext4    19G  3,4G    14G  20% /var/lib/mysql

Here the '-T' argument is used to show us the filesystem.

root@sqlserver:~# du -hsc /var/lib/mysql
0K    /var/lib/mysql/
0K    total

1. Simple debug on what might be the root cause for df / du inconsistency reporting

Of course the basic thing to do when in that weird situation is to be totally shocked how this is possible and to investigate a bit what is the biggest first level sub-directories that eat up the space on the mounted location, with du:

# du -hkx –max-depth=1 /var/lib/mysql/|uniq|sort -n
4       /var/lib/mysql/test
8       /var/lib/mysql/ezmlm
8       /var/lib/mysql/micropcfreak
8       /var/lib/mysql/performance_schema
12      /var/lib/mysql/mysqltmp
24      /var/lib/mysql/speedtest
64      /var/lib/mysql/yourls
144     /var/lib/mysql/narf
320     /var/lib/mysql/webchat_plus
424     /var/lib/mysql/goodfaithair
528     /var/lib/mysql/moonman
648     /var/lib/mysql/daniel
852     /var/lib/mysql/lessn
1292    /var/lib/mysql/gallery

The given output is in Kilobytes so it is a little bit hard to read, if you're used to Mbytes instead, do

 # du -hmx –max-depth=1 /var/lib/mysql/|uniq|sort -n|less

I've also investigated on the complete /var directory contents sorted by size with:

 # du -akx ./ | sort -n
5152564    ./cache/rsnapshot/hourly.2/localhost
5255788    ./cache/rsnapshot/hourly.2
5287912    ./cache/rsnapshot
7192152    ./cache


Even after finding out the bottleneck dirs and trying to clear up a bit, continued facing that inconsistently shown in two commands and if you're likely to be stunned like me and try … to move some files to a different filesystem to free up space or assigned inodes with a hope that shown inconsitency output will be fixed as it might be caused  due to some kernel / FS caching ?? and this will eventually make the mounted FS to refresh …

But unfortunately, if you try it you'll figure out clearing up a couple of Megas or Gigas will make no difference in cmd output.

In my exact case /var/lib/mysql is a separate mounted ext4 filesystem, however same issue was present also on a Network Filesystem (NFS) and thus, my first thought that this is caused by a network failure problem or NFS bug turned to be wrong.

After further short investigation on the inodes on the Filesystem, it was clear enough inodes are available:
 

# df -i /var/lib/mysql
Filesystem       Inodes  IUsed   IFree IUse% Mounted on
/dev/sdb5      1221600  2562 1219038   1% /var/lib/mysql

So the filled inodes count assumed issue also has been rejected.
P.S. (if you're not well familiar with them read manual, i.e. – man 7 inode).
 

– Remounting the mounted filesystem

To make sure the filesystem shown inconsistency between du and df is not due to some hanging network mount or bug, first logical thing I did is to remount the filesytem showing different in size, in my case this was done with:
 

# mount -o remount,rw -t ext4 /var/lib/mysql

For machines with NFS remote mounted storage locations, used:

# mount -o remount,rw -t nfs /var/www


FS remount did not solved it so I continued to ponder what oddity and of course I thought of a workaround (in case if this issues are caused by kernel bug or OS lib issue) reboot might be the solution, however unfortunately restarting the VMs was not a wanted easy to do solution, thus I continued investigating what is wrong …

Next check of course was to check, what kind of network connections are opened to the affected hosts with:
 

# netstat -tupanl


Did not found anything that might point me to the reported different Megabytes issue, so next step was to check what is the situation with currently opened files by running processes on the weird df / du reported systems with lsof, and boom there I observed oddity such as multiple files

# lsof -nP | grep '(deleted)'

COMMAND   PID   USER   FD   TYPE DEVICE    SIZE NLINK  NODE NAME
mysqld   2588  mysql    4u   REG 253,17      52     0  1495 /var/lib/mysql/tmp/ibY0cXCd (deleted)
mysqld   2588  mysql    5u   REG 253,17    1048     0  1496 /var/lib/mysql/tmp/ibOrELhG (deleted)
mysqld   2588  mysql    6u   REG 253,17       777884290     0  1497 /var/lib/mysql/tmp/ibmDFAW8 (deleted)
mysqld   2588  mysql    7u   REG 253,17       123667875     0 11387 /var/lib/mysql/tmp/ib2CSACB (deleted)
mysqld   2588  mysql   11u   REG 253,17       123852406     0 11388 /var/lib/mysql/tmp/ibQpoZ94 (deleted)

Notice that There were plenty of '(deleted)' STATE files shown in memory an overall of 438:

# lsof -nP | grep '(deleted)' |wc -l
438


As I've learned a bit online about the problem, I found it is also possible to find deleted unlinked files only without any greps (to list all deleted files in memory files with lsof args only):

# lsof +L1|less


The SIZE field (fourth column)  shows a number of files that are really hard in size and that are kept in open on filesystem and in memory, totally messing up with the filesystem. In my case this is temp files created by MYSQLD daemon but depending on the server provided service this might be apache's www-data, some custom perl / bash script executed via a cron job, stalled rsync jobs etc.
 

2. Check all the list open files with the mysql / root user as part of the the server filesystem inconsistency debugging with:

– Grep opened files on server by user

# lsof |grep mysql
mysqld    1312                       mysql  cwd       DIR               8,21       4096          2 /var/lib/mysql
mysqld    1312                       mysql  rtd       DIR                8,1       4096          2 /
mysqld    1312                       mysql  txt       REG                8,1   20336792   23805048 /usr/sbin/mysqld
mysqld    1312                       mysql  mem       REG               8,21      24576         20 /var/lib/mysql/tc.log
mysqld    1312                       mysql  DEL       REG               0,16                 29467 /[aio]
mysqld    1312                       mysql  mem       REG                8,1      55792   14886933 /lib/x86_64-linux-gnu/libnss_files-2.28.so

# lsof | grep root
COMMAND    PID   TID TASKCMD          USER   FD      TYPE             DEVICE   SIZE/OFF       NODE NAME
systemd      1                        root  cwd       DIR                8,1       4096          2 /
systemd      1                        root  rtd       DIR                8,1       4096          2 /
systemd      1                        root  txt       REG                8,1    1489208   14928891 /lib/systemd/systemd
systemd      1                        root  mem       REG                8,1    1579448   14886924 /lib/x86_64-linux-gnu/libm-2.28.so

Other command that helped to track the discrepancy between df and du different file usage on FS is:
 

# du -hxa  / | egrep '^[[:digit:]]{1,1}G[[:space:]]*'
 

3. Fixing large files kept in memory filesystem problem


What is the real reason for ending up with this file handlers opened by running backgrounded programs on the Linux OS?
It could be multiple  but most likely it is due to exceeded server / client interactions or breaking up RAM or HDD drive with writing plenty of logs on the FS without ending keeping space occupied or Programming library bugs used by hanged service leaving the FH opened on storage.

What is the solution to file system files left in memory problem?

The best solution is to first fix custom script or hanged service and then if possible to simply restart the server to make the kernel / services reload or if this is not possible just restart the problem creation processes.

Once the process is identified like in my case this was MySQL on systemd enabled newer OS distros, just do:

# systemctl restart mysqld.service


or on older init.d system V ones:

# /etc/init.d/service restart


For custom hanged scripts being listed in ps axuwef you can grep the pid and do a kill -HUP (if the script is written in a good way to recognize -HUP and restart the sub-running process properly – BE EXTRA CAREFUL IF YOU'RE RESTARTING BROKEN SCRIPTS as this might cause your running service disruptions …).

# pgrep -l script.sh
7977 script.sh


# kill -HUP PID

Now finally this should either mitigate or at best case completely solve the reported disagreement between df and du, after which the calculated / reported disk space should be back to normal and show up approximately the same (note that size changes a bit as mysql service is writting data) constantly extending the size between the two checks.

# df -hk /var/lib/mysql; du -hskc /var/lib/mysql
Filesystem       Inodes  IUsed   IFree IUse% Mounted on
/dev/sdb5        19097172 3472744 14631296  20% /var/lib/mysql
3427772    /var/lib/mysql
3427772    total

What we learned?

What I've explained in this article is why and how it comes that 'zoombie' files reside on a filesystem
appearing to be eating disk space on a mounted local or network partition, giving strange inconsistent
reports, leading to system service disruptions and impossibility to have correctly shown information on used
disk space on mounted drive.

I went through with some standard logic on debugging service / filesystem / inode issues up explainat, that led me to the finding about deleted files being kept in filesystem and producing the filesystem strange sized / showing not correct / filled even after it was extended with tune2fs and was supposed to have extra 50GBs.

Finally it was explained shortly how to HUP / restart hanging script / service to fix it.

Some few good readings that helped to fix the issue:

What to do when du and df report different usage is here
df in linux not showing correct free space after file removal is here
Why do “df” and “du” commands show different disk usage?
 

Where is Firefox plugin / bookmarks / temp files directory on Windows?

Saturday, February 13th, 2016

windows-appdata-mozilla-plugins-how-to-check-the-extensions-folder-firefox-windows-screenshot

If you want to find out where Firefox downloads and keeps installed extensions in a quick manner, just press together:
KBD Windows flag button + R

This shortcut will open WIndows Run prompt
And paste inside the run prompt

%appdata%\Mozilla\plugins

The %appdata% is Windows internal variable that keeps inside  path to C:\Users\Your-Username\AppData\Roaming

On my workPC this contains:

C:\Users\georgi7>echo %appdata%
C:\Users\georgi7\AppData\Roaming

mozilla-plugins-folder-in-windows-explorer-screenshot

Enjoy 🙂

Maximal protection against SSH attacks. If your server has to stay with open SSH (Secure Shell) port open to the world

Thursday, April 7th, 2011

Brute Force Attack SSH screen, Script kiddie attacking
If you’re a a remote Linux many other Unix based OSes, you have defitenily faced the security threat of many failed ssh logins or as it’s better known a brute force attack

During such attacks your /var/log/messages or /var/log/auth gets filled in with various failed password logs like for example:

Feb 3 20:25:50 linux sshd[32098]: Failed password for invalid user oracle from 95.154.249.193 port 51490 ssh2
Feb 3 20:28:30 linux sshd[32135]: Failed password for invalid user oracle1 from 95.154.249.193 port 42778 ssh2
Feb 3 20:28:55 linux sshd[32141]: Failed password for invalid user test1 from 95.154.249.193 port 51072 ssh2
Feb 3 20:30:15 linux sshd[32163]: Failed password for invalid user test from 95.154.249.193 port 47481 ssh2
Feb 3 20:33:20 linux sshd[32211]: Failed password for invalid user testuser from 95.154.249.193 port 51731 ssh2
Feb 3 20:35:32 linux sshd[32249]: Failed password for invalid user user from 95.154.249.193 port 38966 ssh2
Feb 3 20:35:59 linux sshd[32256]: Failed password for invalid user user1 from 95.154.249.193 port 55850 ssh2
Feb 3 20:36:25 linux sshd[32268]: Failed password for invalid user user3 from 95.154.249.193 port 36610 ssh2
Feb 3 20:36:52 linux sshd[32274]: Failed password for invalid user user4 from 95.154.249.193 port 45514 ssh2
Feb 3 20:37:19 linux sshd[32279]: Failed password for invalid user user5 from 95.154.249.193 port 54262 ssh2
Feb 3 20:37:45 linux sshd[32285]: Failed password for invalid user user2 from 95.154.249.193 port 34755 ssh2
Feb 3 20:38:11 linux sshd[32292]: Failed password for invalid user info from 95.154.249.193 port 43146 ssh2
Feb 3 20:40:50 linux sshd[32340]: Failed password for invalid user peter from 95.154.249.193 port 46411 ssh2
Feb 3 20:43:02 linux sshd[32372]: Failed password for invalid user amanda from 95.154.249.193 port 59414 ssh2
Feb 3 20:43:28 linux sshd[32378]: Failed password for invalid user postgres from 95.154.249.193 port 39228 ssh2
Feb 3 20:43:55 linux sshd[32384]: Failed password for invalid user ftpuser from 95.154.249.193 port 47118 ssh2
Feb 3 20:44:22 linux sshd[32391]: Failed password for invalid user fax from 95.154.249.193 port 54939 ssh2
Feb 3 20:44:48 linux sshd[32397]: Failed password for invalid user cyrus from 95.154.249.193 port 34567 ssh2
Feb 3 20:45:14 linux sshd[32405]: Failed password for invalid user toto from 95.154.249.193 port 42350 ssh2
Feb 3 20:45:42 linux sshd[32410]: Failed password for invalid user sophie from 95.154.249.193 port 50063 ssh2
Feb 3 20:46:08 linux sshd[32415]: Failed password for invalid user yves from 95.154.249.193 port 59818 ssh2
Feb 3 20:46:34 linux sshd[32424]: Failed password for invalid user trac from 95.154.249.193 port 39509 ssh2
Feb 3 20:47:00 linux sshd[32432]: Failed password for invalid user webmaster from 95.154.249.193 port 47424 ssh2
Feb 3 20:47:27 linux sshd[32437]: Failed password for invalid user postfix from 95.154.249.193 port 55615 ssh2
Feb 3 20:47:54 linux sshd[32442]: Failed password for www-data from 95.154.249.193 port 35554 ssh2
Feb 3 20:48:19 linux sshd[32448]: Failed password for invalid user temp from 95.154.249.193 port 43896 ssh2
Feb 3 20:48:46 linux sshd[32453]: Failed password for invalid user service from 95.154.249.193 port 52092 ssh2
Feb 3 20:49:13 linux sshd[32458]: Failed password for invalid user tomcat from 95.154.249.193 port 60261 ssh2
Feb 3 20:49:40 linux sshd[32464]: Failed password for invalid user upload from 95.154.249.193 port 40236 ssh2
Feb 3 20:50:06 linux sshd[32469]: Failed password for invalid user debian from 95.154.249.193 port 48295 ssh2
Feb 3 20:50:32 linux sshd[32479]: Failed password for invalid user apache from 95.154.249.193 port 56437 ssh2
Feb 3 20:51:00 linux sshd[32492]: Failed password for invalid user rds from 95.154.249.193 port 45540 ssh2
Feb 3 20:51:26 linux sshd[32501]: Failed password for invalid user exploit from 95.154.249.193 port 53751 ssh2
Feb 3 20:51:51 linux sshd[32506]: Failed password for invalid user exploit from 95.154.249.193 port 33543 ssh2
Feb 3 20:52:18 linux sshd[32512]: Failed password for invalid user postgres from 95.154.249.193 port 41350 ssh2
Feb 3 21:02:04 linux sshd[32652]: Failed password for invalid user shell from 95.154.249.193 port 54454 ssh2
Feb 3 21:02:30 linux sshd[32657]: Failed password for invalid user radio from 95.154.249.193 port 35462 ssh2
Feb 3 21:02:57 linux sshd[32663]: Failed password for invalid user anonymous from 95.154.249.193 port 44290 ssh2
Feb 3 21:03:23 linux sshd[32668]: Failed password for invalid user mark from 95.154.249.193 port 53285 ssh2
Feb 3 21:03:50 linux sshd[32673]: Failed password for invalid user majordomo from 95.154.249.193 port 34082 ssh2
Feb 3 21:04:43 linux sshd[32684]: Failed password for irc from 95.154.249.193 port 50918 ssh2
Feb 3 21:05:36 linux sshd[32695]: Failed password for root from 95.154.249.193 port 38577 ssh2
Feb 3 21:06:30 linux sshd[32705]: Failed password for bin from 95.154.249.193 port 53564 ssh2
Feb 3 21:06:56 linux sshd[32714]: Failed password for invalid user dev from 95.154.249.193 port 34568 ssh2
Feb 3 21:07:23 linux sshd[32720]: Failed password for root from 95.154.249.193 port 43799 ssh2
Feb 3 21:09:10 linux sshd[32755]: Failed password for invalid user bob from 95.154.249.193 port 50026 ssh2
Feb 3 21:09:36 linux sshd[32761]: Failed password for invalid user r00t from 95.154.249.193 port 58129 ssh2
Feb 3 21:11:50 linux sshd[537]: Failed password for root from 95.154.249.193 port 58358 ssh2

This brute force dictionary attacks often succeed where there is a user with a weak a password, or some old forgotten test user account.
Just recently on one of the servers I administrate I have catched a malicious attacker originating from Romania, who was able to break with my system test account with the weak password tset .

Thanksfully the script kiddie was unable to get root access to my system, so what he did is he just started another ssh brute force scanner to crawl the net and look for some other vulnerable hosts.

As you read in my recent example being immune against SSH brute force attacks is a very essential security step, the administrator needs to take on a newly installed server.

The easiest way to get read of the brute force attacks without using some external brute force filtering software like fail2ban can be done by:

1. By using an iptables filtering rule to filter every IP which has failed in logging in more than 5 times

To use this brute force prevention method you need to use the following iptables rules:
linux-host:~# /sbin/iptables -I INPUT -p tcp --dport 22 -i eth0 -m state -state NEW -m recent -set
linux-host:~# /sbin/iptables -I INPUT -p tcp --dport 22 -i eth0 -m state -state NEW
-m recent -update -seconds 60 -hitcount 5 -j DROP

This iptables rules will filter out the SSH port to an every IP address with more than 5 invalid attempts to login to port 22

2. Getting rid of brute force attacks through use of hosts.deny blacklists

sshbl – The SSH blacklist, updated every few minutes, contains IP addresses of hosts which tried to bruteforce into any of currently 19 hosts (all running OpenBSD, FreeBSD or some Linux) using the SSH protocol. The hosts are located in Germany, the United States, United Kingdom, France, England, Ukraine, China, Australia, Czech Republic and setup to report and log those attempts to a central database. Very similar to all the spam blacklists out there.

To use sshbl you will have to set up in your root crontab the following line:

*/60 * * * * /usr/bin/wget -qO /etc/hosts.deny http://www.sshbl.org/lists/hosts.deny

To set it up from console issue:

linux-host:~# echo '*/60 * * * * /usr/bin/wget -qO /etc/hosts.deny http://www.sshbl.org/lists/hosts.deny' | crontab -u root -

These crontab will download and substitute your system default hosts with the one regularly updated on sshbl.org , thus next time a brute force attacker which has been a reported attacker will be filtered out as your Linux or Unix system finds out the IP matches an ip in /etc/hosts.deny

The /etc/hosts.deny filtering rules are written in a way that only publicly known brute forcer IPs will only be filtered for the SSH service, therefore other system services like Apache or a radio, tv streaming server will be still accessible for the brute forcer IP.

It’s a good practice actually to use both of the methods 😉
Thanks to Static (Multics) a close friend of mine for inspiring this article.