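For hard drives, the acronym “SMART” stands for Self-Monitoring, Analysis and Reporting Technology. It is built into many ATA-3 and later ATA, IDE and SCSI-3 hard drives, so practically any drive made after about 2005 should have it. On Linux, the smartmontools package provides the tools for working with it; install it with your distribution's package manager: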
Ubuntu/Debian:
    sudo apt-get install smartmontools
CentOS/Fedora/RH:
    sudo yum install smartmontools
Gentoo:
    sudo emerge sys-apps/smartmontools
Wiki: http://sourceforge.net/apps/trac/smartmontools/wiki
smartctl
The program smartctl is used to interface with the SMART features in the drive firmware. Here are a couple of easy commands to get started with (note that some versions do not have the --scan option):
    $ smartctl --scan -d ata
    /dev/hda -d ata # /dev/hda, ATA device
    /dev/hdc -d ata # /dev/hdc, ATA device
    $ sudo smartctl --info /dev/hdc
    smartctl 5.42 2011-10-20 r3458 [i686-linux-2.6.33.1-xedvia] (local build)
    Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

    === START OF INFORMATION SECTION ===
    Model Family:     Seagate Barracuda 7200.7 and 7200.7 Plus
    Device Model:     ST3160023A
    Serial Number:    5JS9MDKW
    Firmware Version: 8.01
    User Capacity:    160,041,885,696 bytes [160 GB]
    Sector Size:      512 bytes logical/physical
    Device is:        In smartctl database [for details use: -P show]
    ATA Version is:   6
    ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
    Local Time is:    Thu Feb 7 09:27:18 2013 PST
    SMART support is: Available - device has SMART capability.
    SMART support is: Disabled
Note that the “SMART support” is listed as available but disabled. To enable full diagnostic checking turn it on with something like this:
    $ sudo smartctl --smart=on --offlineauto=on --saveauto=on /dev/hdc
    === START OF ENABLE/DISABLE COMMANDS SECTION ===
    SMART Enabled.
    SMART Attribute Autosave Enabled.
    SMART Automatic Offline Testing Enabled every four hours.
In theory this only needs to be done once, because the drive should remember these settings across power cycles. The saveauto option tells the drive to save its attribute values automatically, and offlineauto causes automatic testing every 4 hours. In theory the drive will wait “nicely” if it is already busy, so performance should not be seriously impacted.
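To confirm that the setting took effect, or to check several drives in a row, the “SMART support is:” lines from the --info output can be reduced to a single word. The following is only a minimal sketch: the function name smart_state is made up for this example, and it assumes the output uses the “Enabled”, “Disabled” and “Unavailable” wording seen in smartctl's --info report.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools): reads `smartctl --info`
# output on stdin and prints enabled, disabled, unavailable or unknown.
# Assumes the "SMART support is: ..." wording shown in the output above.
smart_state() {
    awk -F': *' '
        /^SMART support is:/ {
            # The "Available - ..." line is skipped; the later line wins.
            if ($2 ~ /^Enabled/)     state = "enabled"
            if ($2 ~ /^Disabled/)    state = "disabled"
            if ($2 ~ /^Unavailable/) state = "unavailable"
        }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Example use (needs root and a real drive, so it is left commented out):
#   sudo smartctl --info /dev/hdc | smart_state
```

Reading from standard input keeps the function usable on saved output as well as on a live drive.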
Testing
Here’s a way to run a “short” off-line test. This tests electrical and mechanical performance of the drive and does read testing.
    $ sudo smartctl --test=short /dev/hda
    === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
    Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
    Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
    Testing has begun.
    Please wait 1 minutes for test to complete.
    Test will complete after Thu Feb 7 10:13:19 2013

    Use smartctl -X to abort test.

    $ sudo smartctl --log=selftest /dev/hda
    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed without error        00%            43398  -

    $ sudo smartctl --log=selftest /dev/hdc
    === START OF READ SMART DATA SECTION ===
    SMART Self-test log structure revision number 1
    Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
    # 1  Short offline       Completed: read failure        90%            37994  7234643
The first command starts the test and tells you how long to wait (about a minute here). The second command queries the drive's self-test log to see if anything bad came up. In this case hda was fine (“Completed without error”), but hdc reported a very important “read failure” at LBA 7234643. Replace that drive ASAP!
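Checking the log by eye gets tedious with more than a couple of drives, so the same check can be scripted. This is a rough sketch rather than a complete health check: the function name selftest_failures is invented for this example, and it assumes that a failed test shows up as a log row whose status begins with “Completed:” followed by a reason, as in the hdc output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of smartmontools): reads
# `smartctl --log=selftest` output on stdin, prints every log row whose
# status starts with "Completed:" (assumed to mark a failed test), and
# returns 1 if at least one such row was found, 0 otherwise.
selftest_failures() {
    awk '
        # Log rows look like "# 1  Short offline  <status> ..."
        /^# *[0-9]+ / && /Completed: / { print; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Example use (needs root and a real drive, so it is left commented out):
#   sudo smartctl --log=selftest /dev/hdc | selftest_failures \
#       || echo "self-test failure logged on /dev/hdc"
```

Because the result is reported through the exit status, the function drops straight into a cron job or a loop over several devices.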