Posts Tagged ‘hard drive’

Controlling fan with Thinkfan on Lenovo Thinkpad R61 on Debian GNU/Linux (adjusting proper fan cycling)

Saturday, August 7th, 2010

Some time ago I blogged about How to properly control your Lenovo Thinkpad R61 fan rotation cycles on Linux with ThinkFan.
In that short article I explained the obstacles I hit while getting my notebook's CPU cooling fan to spin properly and cool down the CPU.

However, just recently I upgraded my Debian Unstable (Squeeze/Sid) through apt-get to the newest available packages.
The upgrade also updated my sid thinkfan package to:

hipo@noah:~$ dpkg -l |grep -i thinkfan
ii thinkfan 0.7.1-1 simple and lightweight fan control program

I was unpleasantly surprised when I tried to restart thinkfan with the init.d script I had used until recently, /etc/init.d/thinkfan, because /etc/init.d/thinkfan no longer existed.

Next I tried to launch the thinkfan daemon directly from the terminal, backgrounding the service like so:

noah:~# thinkfan &
WARNING: Using default temperature inputs in /proc/acpi/ibm/thermal.
WARNING: You have not provided any correction values for any sensor, and your fan will only start at 55 °C. This can be dangerous for your hard drive.

Though this started up the thinkfan daemon, as you can see from the warning above it started with the default settings, in which the fan only starts cooling once the CPU reaches 55 °C.

Preventive measures against hard disk failures with smartd / Installing smartmontools on Linux

Friday, March 15th, 2013

Many admins might not know about the smartmontools Linux package. It provides two useful tools, smartctl and smartd, which use the Self-Monitoring, Analysis and Reporting Technology system, often abbreviated as S.M.A.R.T. SMART support is nowadays available on virtually all modern ATA, SATA and SCSI hard disks. The smartmontools package is installable from the default package repositories of virtually all Linux distributions. Having smartmontools installed on every critical production server is a must, because it serves as an early warning system when a hard disk is on the verge of breaking down (i.e. the physical media of the disk starts getting damaged).

I have worked as a Linux sysadmin for the last 14 years. I've used smartmontools on hundreds of servers, and many times it saved companies hundreds of dollars by simply reporting that a system HDD was dying, so that the server or hard disk could be replaced with an identically configured one. smartmontools supports monitoring of single hard disks as well as disks configured at the hardware level to work in a RAID array. As of the time of writing, the smartmontools project publishes a list of the hardware RAID controllers it supports.

1. Installing smartmontools

a) To install smartmontools on Debian and Ubuntu and other .deb based servers:

debian:~# apt-get install --yes smartmontools
.....

b) On CentOS, Fedora, RHEL and other RPM based distributions install with:

[root@centos ~]# yum --yes install smartmontools
.....

2. Configuring and Enabling smartd hard disk health monitoring

a) on Debian and derivatives

Edit /etc/default/smartmontools:

debian:~# vim /etc/default/smartmontools

By default the file looks something like this:

# Defaults for smartmontools initscript (/etc/init.d/smartmontools)
# This is a POSIX shell fragment

# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
#enable_smart="/dev/hda /dev/hdb"
#enable_smart="/dev/hda"
# uncomment to start smartd on system startup
#start_smartd=yes

# uncomment to pass additional options to smartd on startup
#smartd_opts="--interval=1800"

After editing, the config file should look something like this:

# Defaults for smartmontools initscript (/etc/init.d/smartmontools)
# This is a POSIX shell fragment

# List of devices you want to explicitly enable S.M.A.R.T. for
# Not needed (and not recommended) if the device is monitored by smartd
#enable_smart="/dev/hda /dev/hdb"
enable_smart="/dev/sda"
# uncomment to start smartd on system startup
start_smartd=yes

# uncomment to pass additional options to smartd on startup
#smartd_opts="--interval=1800"

b) on CentOS, RHEL, Fedora: smartd options

By default on RPM based distros no special configuration is needed. For custom cases, however, edit /etc/sysconfig/smartmontools and /etc/smartd.conf.

c) Enabling smartmontools

[root@centos default]# /etc/init.d/smartd start
Starting smartd:           [  OK  ]

3. Checking hard disk failure status with smartctl

The simplest way to check whether a hard disk passes the SMART health check is:

debian:~# /usr/sbin/smartctl -H /dev/sda

smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

SMART Health Status: OK
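
For scripting it helps to know that smartctl also reports problems through its exit status, which is a bitmask documented in the EXIT STATUS section of smartctl(8). The helper below is a minimal sketch that decodes such a status into readable lines. The function name decode_smartctl_status is made up for illustration and is not part of smartmontools.

```shell
#!/bin/sh
# decode_smartctl_status STATUS
# Print one line per bit set in a smartctl exit status (bit meanings as
# listed in smartctl(8)). Returns 0 only when the status itself is 0.
decode_smartctl_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "ok: no problems reported"
        return 0
    fi
    bit=0
    for meaning in \
        "command line did not parse" \
        "device open failed" \
        "a SMART or other ATA command to the disk failed" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold at some time in the past" \
        "device error log contains errors" \
        "device self-test log contains errors"
    do
        # Test bit number $bit of the status.
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
        bit=$((bit + 1))
    done
    return 1
}

# Possible use (not executed here, needs root and a real disk):
#   /usr/sbin/smartctl -H /dev/sda >/dev/null; decode_smartctl_status $?
```

This makes it possible to call smartctl from cron and only send a mail when something other than the ok line is printed.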

 

 

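To print the drive's identity details (model, serial number, firmware) and see whether SMART is enabled:

debian:~# /usr/sbin/smartctl -i /dev/sda1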

smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model:     ST340014AS
Serial Number:    4MQ0LV3B
Firmware Version: 3.43
User Capacity:    40,020,664,320 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Fri Mar 15 15:27:12 2013 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

To print as much information as possible about the hard disk health status:

[root@centos default]# /usr/sbin/smartctl -a /dev/sda1

smartctl version 5.38 [i686-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus family
Device Model:     ST340014AS
Serial Number:    4MQ0LV3B
Firmware Version: 3.43
User Capacity:    40,020,664,320 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Fri Mar 15 15:14:53 2013 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:          ( 423) seconds.
Offline data collection
capabilities:              (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      (  19) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   052   045   006    Pre-fail  Always       -       172137473
  3 Spin_Up_Time            0x0002   098   098   000    Old_age   Always       -       0
  4 Start_Stop_Count        0x0033   096   096   020    Pre-fail  Always       -       4198
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   090   060   030    Pre-fail  Always       -       945095084
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       22769
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0033   099   099   020    Pre-fail  Always       -       1084
194 Temperature_Celsius     0x0022   038   046   000    Old_age   Always       -       38 (0 15 0 0)
195 Hardware_ECC_Recovered  0x001a   052   045   000    Old_age   Always       -       172137473
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 33 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 33 occurred at disk power-on lifetime: 21588 hours (899 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 77 c3 6a e0  Error: UNC at LBA = 0x006ac377 = 6996855

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 77 c3 6a e0 00      14:07:39.385  READ DMA
  ec 00 00 00 00 00 a0 00      14:07:35.553  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      14:07:35.550  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      14:07:35.547  IDENTIFY DEVICE
  c8 00 08 77 c3 6a e0 00      14:07:35.543  READ DMA

Error 32 occurred at disk power-on lifetime: 21588 hours (899 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 77 c3 6a e0  Error: UNC at LBA = 0x006ac377 = 6996855

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 77 c3 6a e0 00      14:07:23.940  READ DMA
  ec 00 00 00 00 00 a0 00      14:07:35.553  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      14:07:35.550  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      14:07:35.547  IDENTIFY DEVICE
  c8 00 08 77 c3 6a e0 00      14:07:35.543  READ DMA

Error 31 occurred at disk power-on lifetime: 21588 hours (899 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 77 c3 6a e0  Error: UNC at LBA = 0x006ac377 = 6996855

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 77 c3 6a e0 00      14:07:23.940  READ DMA
  ec 00 00 00 00 00 a0 00      14:07:23.937  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      14:07:20.071  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      14:07:20.057  IDENTIFY DEVICE
  c8 00 08 77 c3 6a e0 00      14:07:20.044  READ DMA

Error 30 occurred at disk power-on lifetime: 21588 hours (899 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 77 c3 6a e0  Error: UNC at LBA = 0x006ac377 = 6996855

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 77 c3 6a e0 00      14:07:23.940  READ DMA
  ec 00 00 00 00 00 a0 00      14:07:23.937  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      14:07:20.071  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      14:07:20.057  IDENTIFY DEVICE
  c8 00 08 77 c3 6a e0 00      14:07:20.044  READ DMA

Error 29 occurred at disk power-on lifetime: 21588 hours (899 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 77 c3 6a e0  Error: UNC at LBA = 0x006ac377 = 6996855

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 77 c3 6a e0 00      14:07:23.940  READ DMA
  ec 00 00 00 00 00 a0 00      14:07:23.937  IDENTIFY DEVICE
  ef 03 45 00 00 00 a0 00      14:07:20.071  SET FEATURES [Set transfer mode]
  ec 00 00 00 00 00 a0 00      14:07:20.057  IDENTIFY DEVICE
  c8 00 08 77 c3 6a e0 00      14:07:20.044  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         1         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
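
The full report is long, so for a quick look it can help to pull out only a few attributes from the table. The sketch below reads smartctl output on standard input and prints the raw values of three attributes that are commonly watched for media damage. Which attributes to watch is a rule of thumb and my assumption here, not something smartctl prescribes, and the function name smart_watch_attrs is hypothetical.

```shell
#!/bin/sh
# smart_watch_attrs: read `smartctl -A` (or -a) output on stdin and print
# NAME=RAW_VALUE for a few sector related attributes.
# In the attribute table the name is column 2 and RAW_VALUE is column 10.
smart_watch_attrs() {
    awk '
        $2 == "Reallocated_Sector_Ct" ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable" { printf "%s=%s\n", $2, $10 }
    '
}

# Possible use (not executed here, needs root and a real disk):
#   /usr/sbin/smartctl -A /dev/sda | smart_watch_attrs
```

Raw values that keep growing between runs are a reason to look at the disk more closely, even while the overall self-assessment still says PASSED.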

4. Visualizing smartd collected data in GUI with gsmartcontrol

For people who prefer a graphical environment, the hard disk health data can be viewed in a nice graphical interface with the gsmartcontrol tool. Most Linux servers don't have a graphical environment, since running an X server with a desktop on them is a waste of system resources, so installing gsmartcontrol there doesn't make much sense. Still, for monitoring and reporting upcoming hard disk issues gsmartcontrol is a good one to have.

a) To install gsmartcontrol on Debian and Ubuntu Linux:

debian:~# apt-get install --yes gsmartcontrol
....

 

b) Installing gsmartcontrol on CentOS, Fedora, RHEL and SuSE:

gsmartcontrol has binary package builds for all major Linux distributions except Slackware Linux. For any of the RPM based distros, download the gsmartcontrol build and its dependencies matching your distro version and architecture, then install the RPMs one by one with the usual:

[root@centos ~]# rpm -ivh glibmm*
....
[root@centos ~]# rpm -ivh libglademm*
....
[root@centos ~]# rpm -ivh libsigc*
....
[root@centos ~]# rpm -ivh cairomm*
....
[root@centos ~]# rpm -ivh gsmartcontrol*
....

Below are 2 screenshots of GSmartControl:

[Screenshot: gsmartcontrol on Debian stable Linux, monitoring hard disk health in a graphical environment]

[Screenshot: gsmartcontrol on a Lenovo Thinkpad, device information for /dev/sda (ST9160824AS)]

If you get anything other than Overall health self-assessment test PASSED, the hard disk is most likely damaged and should be replaced ASAP. If the HDD hits I/O errors during normal operation and you can't afford a GUI environment just for gsmartcontrol, the errors also get logged by the kernel, so dmesg can give you useful information about a failing hard drive.
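
On a server without a GUI, the dmesg check can be sketched as a small filter. The two patterns below are only examples of what the kernel may log for a failing disk. The exact messages differ between kernel versions and drivers, so treat the pattern list as an assumption to be adjusted. The function name disk_errors is made up for illustration.

```shell
#!/bin/sh
# disk_errors: read kernel log text on stdin and keep only lines that
# typically accompany a failing disk. grep exits non-zero when nothing matches.
disk_errors() {
    grep -E 'I/O error|DID_BAD_TARGET'
}

# Possible use (not executed here):
#   dmesg | disk_errors
```

Because grep returns a non-zero status when nothing matches, the filter can also be used in a condition, for example to send a mail only when at least one such line shows up.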

diskinfo Linux hdparm FreeBSD equivalent command for disk info and benchmarking

Thursday, March 8th, 2012


On Linux there is the hdparm tool for hard disk benchmarking and for extracting information about hard disk operation.
As the Linux manual states: hdparm - get/set SATA/IDE device parameters

Most Linux users should already know it and might wonder whether there is an hdparm port or equivalent for FreeBSD. The aim of this short post is to shed some light on that.

The typical use of hdparm is like this:

linux:~# hdparm -t /dev/sda8

/dev/sda8:
Timing buffered disk reads: 76 MB in 3.03 seconds = 25.12 MB/sec
linux:~# hdparm -T /dev/sda8
/dev/sda8:
Timing cached reads: 1618 MB in 2.00 seconds = 809.49 MB/sec

The above output is from my Lenovo R61i notebook.
If you're looking for an alternative to hdparm, you should know that on FreeBSD / OpenBSD / NetBSD there is no exact hdparm equivalent.
The closest hdparm equivalent on the BSDs (FreeBSD etc.) is:
diskinfo

diskinfo is not as feature rich as Linux's hdparm. It is just a simple command that shows basic information about a hard disk, with no possibility to tune any HDD I/O or seek operations.
All diskinfo does is show statistics on a hard drive's seek times and I/O overheads. The command takes only 3 options.

The most basic and classical use of the command is:

freebsd# diskinfo -t /dev/ad0s1a
/dev/ad0s1a
512 # sectorsize
20971520000 # mediasize in bytes (20G)
40960000 # mediasize in sectors
40634 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
ad:4JV48BXJs0s0 # Disk ident.

Seek times:
Full stroke: 250 iter in 3.272735 sec = 13.091 msec
Half stroke: 250 iter in 3.507849 sec = 14.031 msec
Quarter stroke: 500 iter in 9.705555 sec = 19.411 msec
Short forward: 400 iter in 2.605652 sec = 6.514 msec
Short backward: 400 iter in 4.333490 sec = 10.834 msec
Seq outer: 2048 iter in 1.150611 sec = 0.562 msec
Seq inner: 2048 iter in 0.215104 sec = 0.105 msec

Transfer rates:
outside: 102400 kbytes in 3.056943 sec = 33498 kbytes/sec
middle: 102400 kbytes in 2.696326 sec = 37978 kbytes/sec
inside: 102400 kbytes in 3.178711 sec = 32214 kbytes/sec

Another common use of diskinfo is to measure HDD I/O command overhead with the -c argument:

freebsd# diskinfo -c /dev/ad0s1e
/dev/ad0s1e
512 # sectorsize
39112312320 # mediasize in bytes (36G)
76391235 # mediasize in sectors
75784 # Cylinders according to firmware.
16 # Heads according to firmware.
63 # Sectors according to firmware.
ad:4JV48BXJs0s4 # Disk ident.

I/O command overhead:
time to read 10MB block 1.828021 sec = 0.089 msec/sector
time to read 20480 sectors 4.435214 sec = 0.217 msec/sector
calculated command overhead = 0.127 msec/sector

The above diskinfo output is from my FreeBSD home router.

As you can see, the time to read a 10MB block on my hard drive is 1.828021 sec, which is a very high number (only about 5.5 MB per second).
I take this as a sign that the hard disk has experienced too many reads/writes and therefore needs to be replaced soon with a newer, faster one.
diskinfo is part of the base BSD install (bsd world), so it can be used without installing any BSD ports or binary packages.
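
To make the I/O command overhead figures easier to read, here is a small worked sketch of the arithmetic. It only reproduces the printed numbers from the output above (each time divided by 20480 sectors of 512 bytes, which is 10MB). It is not taken from the diskinfo source, and both function names are made up for illustration.

```shell
#!/bin/sh
# ms_per_sector SECONDS SECTORS
# Average time per sector in milliseconds.
ms_per_sector() {
    awk -v secs="$1" -v sectors="$2" \
        'BEGIN { printf "%.3f\n", secs * 1000 / sectors }'
}

# command_overhead_ms BULK_SECONDS SINGLE_SECONDS SECTORS
# Extra per-sector cost of issuing one read per sector instead of one bulk read.
command_overhead_ms() {
    awk -v bulk="$1" -v single="$2" -v sectors="$3" \
        'BEGIN { printf "%.3f\n", (single - bulk) * 1000 / sectors }'
}

# With the numbers from the diskinfo -c output above:
#   ms_per_sector 1.828021 20480                  prints 0.089
#   ms_per_sector 4.435214 20480                  prints 0.217
#   command_overhead_ms 1.828021 4.435214 20480   prints 0.127
```

So the calculated command overhead line is simply the difference between reading the same amount of data sector by sector and reading it as one block.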

For stress testing an HDD, or just for more detailed benchmarking, FreeBSD has plenty of other tools as well.
Just to name a few:

  • rawio – obsolete in the FreeBSD 7.x branch (not available in FreeBSD 7.2 and higher)
  • iozone, iozone21 – tools to test the speed of sequential I/O to files
  • bonnie++ – benchmark tool capable of performing a number of simple filesystem tests
  • bonnie – filesystem benchmark tool, predecessor of bonnie++
  • raidtest – test performance of storage devices
  • mdtest – software to test metadata performance on filesystems
  • filebench – tool for micro-benchmarking storage subsystems

Linux hdparm also allows changing various HDD ATA and SATA settings. Similarly, to set and change ATA / SATA settings on FreeBSD there is the ataidle tool.

As of the time of writing, ataidle is in the ports tree under /usr/ports/sysutils/ataidle/ and installs as usual from the port location.

FreeBSD also has the spindown port, a small program for handling automated spinning down of SCSI hard drives.
spindown is useful for setting values on SATA drives which have problems with properly controlling HDD power management.

To keep constant track of hard disk operations and get early warning of failing hard disks, FreeBSD also has the smartd service, just like Linux.
smartd enables you to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology System (S.M.A.R.T.) built into most modern ATA and SCSI hard disks.
smartd and smartctl are installable via the port /usr/ports/sysutils/smartmontools.

To install and use smartd, ataidle and spindown run:

freebsd# cd /usr/ports/sysutils/smartmontools
freebsd# make && make install clean
freebsd# cd /usr/ports/sysutils/ataidle/
freebsd# make && make install clean
freebsd# cd /usr/ports/sysutils/spindown/
freebsd# make && make install clean

Check each one's manual for more info.

Drunk

Monday, February 26th, 2007

I drank 100 g of rum and a beer after that, and I got drunk. I have some problems with one of the servers in DBG. I hope the machine’s hard drive didn’t die. If the hdd is dead it would be very, very bad for me. Tomorrow I have a test in Management. As usual I haven’t studied :]] I’m listening to Pantera and I want to drink more .. which is not good at all.

How to remotely reboot a Linux server if the reboot, shutdown and init commands are not working (/sbin/reboot: Input/output error) – Reboot Linux in an emergency using the Magic SysRQ kernel sysctl variable

Saturday, July 23rd, 2011

SysRQ an alternative way to restart unrestartable Linux server

Today I was in a situation where a Linux server’s SCSI hard drive driver or the physical drive itself started to fail, and in the dmesg kernel log I could see a lot of errors like:

[178071.998440] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[178071.998440] end_request: I/O error, dev sda, sector 89615868

I tried a number of things to remount the HDD that was throwing errors in read-only mode, but almost all commands I typed on the server were either reported as missing or returned an error:
Input/output error

Just to give you an idea what I mean, here is a paste from the shell:

linux-server:/# vim /etc/fstab
-bash: vim: command not found
linux-server:/# vi /etc/fstab
-bash: vi: command not found
linux-server:/# mcedit /etc/fstab
-bash: /usr/bin/mcedit: Input/output error
linux-server:/# fdisk -l
-bash: /sbin/fdisk: Input/output error

After I had tried all kinds of things to diagnose the server and everything seemed to fail, I thought a reboot might help, since on boot the filesystems get checked with fsck and fsck might be able to fix the mess (at least temporarily).

I went on and tried to restart the system, and guess what? I got:

/sbin/reboot init Input/output error

I hoped that at least the /sbin/shutdown or /sbin/init commands might work, and since I couldn’t use the reboot command I tried these two as well, just to get once again:

linux-server:/# shutdown -r now
bash: /sbin/shutdown: Input/output error
linux-server:/# init 6
bash: /sbin/init: Input/output error

As you can see, the situation was not rosy; it seemed there was no way to reboot the system …
Moreover, the server is located in a remote data center, and the tech support there carries out assigned tasks at the speed of a turtle.
The server had no remote reboot, web front end or anything similar, and therefore I desperately needed a way to restart the machine.

A bit of research on the issue led me to other people who had experienced the /sbin/reboot init Input/output error, mostly caused by failing hard drives or by HDD controller driver bugs in the Linux kernel.

While looking for an alternative way to reboot my Linux machine, I came across the blog post Rebooting the Magic Way: http://www.linuxjournal.com/content/rebooting-magic-way

As suggested in Cory’s blog post, a nice alternative way to restart a Linux machine without using the reboot, shutdown or init commands is a reboot through the Magic SysRQ key combination.

The only condition for the Magic SysRQ key to work is that SysRQ support (CONFIG_MAGIC_SYSRQ) was enabled at kernel compile time.
Luckily, as of today the Magic SysRQ key is compiled in and enabled by default in almost all modern Linux distributions, including Debian, Fedora and their derivatives.

To use the sysrq kernel capabilities as a means to restart the server, it’s necessary first to activate sysrq through sysctl, like so:

linux-server:~# sysctl -w kernel.sysrq=1
kernel.sysrq = 1

I found that enabling kernel.sysrq = 1 permanently is also quite a good idea; to achieve that I used:

echo 'kernel.sysrq = 1' >> /etc/sysctl.conf
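
Running the echo line above more than once appends the same setting again each time. A minimal sketch of a variant that only appends when the key is not in the file yet is shown below. The function name persist_sysctl is hypothetical, and the check is deliberately simple: the dots in the key are treated as regular expression wildcards, and an existing line with a different value is left untouched.

```shell
#!/bin/sh
# persist_sysctl KEY VALUE [FILE]
# Append "KEY = VALUE" to FILE (default /etc/sysctl.conf) unless a line
# setting KEY is already there.
persist_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        echo "$key is already set in $file, leaving it untouched"
    else
        echo "$key = $value" >> "$file"
    fi
}

# Possible use as root (not executed here):
#   persist_sysctl kernel.sysrq 1
#   sysctl -p    # reload settings from /etc/sysctl.conf
```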

Next it’s wise to stop as many of the server’s running services as possible (MySQL, Apache etc.) and to use the sync command to flush any open files to disk.

linux-server:~# sync

Now to reboot the Linux server, I used the /proc Linux virtual filesystem by issuing:

linux-server:~# echo b > /proc/sysrq-trigger

Using echo b > /proc/sysrq-trigger simulates the corresponding Magic SysRQ key press and instructs the kernel to reboot the system immediately.
However, one should be careful with sysrq-trigger, because it’s not a complete substitute for the /sbin/reboot or /sbin/shutdown -r commands.
One major difference is that the standard /sbin/reboot kills all running processes on the Linux machine and attempts to unmount all filesystems before it sends the kernel the reboot instruction.

Using echo b > /proc/sysrq-trigger, however, neither tries to unmount mounted filesystems nor tries to kill processes and sync the filesystem, so on a heavily loaded (SQL data critical) server its use might create enormous problems and lead to severe data loss!

SO BEWARE: be sure you know what you’re doing before you use /proc/sysrq-trigger as a way to reboot ;).
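
To reduce that risk a little, the SysRQ interface also offers s (emergency sync) and u (remount filesystems read-only), which can be sent before b. The sketch below is my own addition, not the procedure from the post: it walks through s, u, b in that order and only prints what it would do unless APPLY=1 is set. The function name sysrq_reboot and the APPLY variable are hypothetical. On a disk that already returns I/O errors the sync and remount steps may not succeed, so this is still an emergency measure.

```shell
#!/bin/sh
# sysrq_reboot [TRIGGER_FILE]
# Send the SysRQ commands s (sync), u (remount read-only) and b (reboot)
# in that order. Dry run by default; set APPLY=1 to really write them.
sysrq_reboot() {
    trigger=${1:-/proc/sysrq-trigger}
    for key in s u b; do
        if [ "${APPLY:-0}" = "1" ]; then
            # echo is a shell builtin, so it works even if /bin is unreadable.
            echo "$key" > "$trigger"
            # sleep is an external binary and may fail on a broken disk,
            # hence the error is ignored.
            [ "$key" = "b" ] || sleep 2 2>/dev/null || true
        else
            echo "would run: echo $key > $trigger"
        fi
    done
}

# Dry run (safe):          sysrq_reboot
# Real reboot, as root:    APPLY=1 sysrq_reboot
```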

How to install Ubuntu Linux on Acer ASPIRE 5736Z Notebook / Get around the black screen install CD issue

Friday, July 1st, 2011

My sister’s newly bought laptop is an Acer Aspire 5736Z. By default this notebook comes with a Linux distribution called Linpus.
Even though this Linpus (a Linux crafted especially for Acer notebooks) looked really nice, it proved to be a piece of shit Linux distro.
Linpus was unable to even establish a simple WPA2 protected connection with my wireless router, not to mention that the Linux virtual consoles (CTRL+ALT+F1) were disabled …

This Linpus was so bad that I couldn’t even launch any type of terminal on it (I was stuck!), so I decided to kill it and do a decent install of the latest Ubuntu 11.04 on it.

I was surprised to find out that booting the Ubuntu 11.04 installer led me to a black screen (black screen of death).

The Aspire 5736Z’s screen stayed completely blank while the hard drive was continuously reading (indicating that the Ubuntu installer had booted properly but couldn’t light up the notebook screen).

A bit of investigation into issues with this Acer notebook model led me to a thread on the Fedora forums:
http://forums.fedoraforum.org/showthread.php?t=263794
In that thread the same kind of install problem was described as occurring on the Aspire 5736Z during a Fedora install.

I tried the suggested fix and it works like a charm.

The fix goes like this:

1. Bring up the Ubuntu installer boot options screen

Just press any key while the Ubuntu installer CD is loading, and after a few seconds the install options screen should appear, as in the screenshot below:

[Screenshot: Ubuntu installer boot options screen]

2. Select the nomodeset boot option

In the above screenshot you can see F6 Other Options. I had to press F6 and choose the nomodeset boot option to make Ubuntu boot further.

After selecting the nomodeset option and choosing the Install Ubuntu menu entry, the graphical installer launched successfully 😉
I hope this small tip is helpful to some Ubuntu or other Linux user who is trying to install Linux on an Acer Aspire 5736Z.
Cheers 😉