SMART Drive Testing in Watchman Monitoring

Failing drive detection is one of the core services Watchman Monitoring provides. Our Disk I/O plugin has saved untold volumes of data from being lost permanently. SMART Reporting is just as valuable.

The underlying engine for most SMART tools is the well-known smartmontools project, which is listed on our Open Source recognitions page. The key is in the interpretation of the SMART data. Rather than reinvent this wheel, we looked at the most popular commercial implementations of SMART reporting and partnered with Volitans Software to bring a well-known, trusted provider to our tool. In our testing, SMART Utility provided clear, honest reporting of which drives were failing and which cannot yet be tested.

SMART Testing pays attention to the various failure states a drive can report, and balances bad sectors, drive errors, and ECC Errors.

Sample Report

In this report, a computer has three internal volumes, and one named Design4HD is failing.

SMART Testing Timing

Watchman Monitoring's agent runs hourly. Checking SMART on a hard drive every hour may reduce system performance. To prevent this, the full SMART test is run only once per day. The SMART test is scheduled for the 1:00 AM run (in the computer's local time).

If a computer's drive(s) have not been tested in over 24 hours, the plugin will be run at the first opportunity.
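The timing rules above can be sketched as a small decision function. This is only an illustration of the described behavior, not the agent's actual code: the function name, the parameters, and the choice to treat "over 24 hours" as strictly greater than 24 hours are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

SCHEDULED_HOUR = 1               # the 1:00 AM local-time run described above
MAX_AGE = timedelta(hours=24)    # a drive untested for longer than this is overdue


def should_run_smart_test(now: datetime, last_tested: Optional[datetime]) -> bool:
    """Decide whether this hourly agent run should include the full SMART test.

    Hypothetical helper: `now` and `last_tested` are local-time datetimes.
    """
    # Never tested, or last tested more than 24 hours ago:
    # run at the first opportunity, whatever the hour.
    if last_tested is None or now - last_tested > MAX_AGE:
        return True
    # Otherwise run only during the 1:00 AM hour, and only once for that day.
    return now.hour == SCHEDULED_HOUR and last_tested.date() < now.date()
```

For example, a computer that was asleep at 1:00 AM and wakes at 9:00 AM with a test more than 24 hours old would be tested on that 9:00 AM run, then return to the 1:00 AM schedule.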

Warnings will be sent for any hourly report where the number of errors increases.

The drive will be considered bad when ten (10) of any given error are reported.
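As a rough sketch of these two rules, the comparison between one report and the next might look like the following. The error names, the dictionary layout, and the returned labels are hypothetical and chosen only for illustration; the plugin's real data model is not documented here.

```python
from typing import Dict

BAD_THRESHOLD = 10  # ten of any given error marks the drive as bad


def classify_report(previous: Dict[str, int], current: Dict[str, int]) -> str:
    """Compare two reports of per-error counts and return a status label.

    Hypothetical labels: "bad", "warning", or "ok".
    """
    # Any single error type at or above the threshold marks the drive bad.
    if any(count >= BAD_THRESHOLD for count in current.values()):
        return "bad"
    # Any error count that increased since the previous report triggers a warning.
    if any(count > previous.get(name, 0) for name, count in current.items()):
        return "warning"
    return "ok"
```

Note that the threshold applies to each error type separately: in this sketch, six errors of one kind and five of another do not add up to a bad drive.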

Categorization of SMART Errors over time

Many drives will record errors only for a short period of time, then "stabilize". Watchman Monitoring will warn when errors have happened within the past 250 drive-hours (hours during which the drive was in use), but will only inform once a drive has run for 250 hours without recording a new issue.

In addition to disk errors, a drive may mark sectors as bad. Disks with bad sectors should be replaced as soon as possible; however, it is not always feasible to do so. Watchman Monitoring will warn each time a new bad sector is detected, and once the drive has recorded 10 bad sectors, it is always marked as a problem drive.
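The categorization described in this section can be illustrated with a short sketch. The field names, the status labels, and the handling of the exact 250-hour boundary are assumptions made for the example, not a description of the plugin's internals.

```python
from dataclasses import dataclass
from typing import Optional

STABLE_AFTER_HOURS = 250  # drive-hours without a new error before a drive has "stabilized"
BAD_SECTOR_LIMIT = 10     # at this many bad sectors the drive is always a problem drive


@dataclass
class DriveState:
    """Hypothetical snapshot of one drive's SMART history."""
    drive_hours: int                # total hours the drive has been in use
    last_error_hour: Optional[int]  # drive-hour of the most recent error, if any
    bad_sectors: int                # bad sectors in the current report
    previous_bad_sectors: int       # bad sectors in the previous report


def categorize(drive: DriveState) -> str:
    """Return "problem", "warning", "info", or "ok" (labels are illustrative)."""
    # Ten or more bad sectors: always a problem drive.
    if drive.bad_sectors >= BAD_SECTOR_LIMIT:
        return "problem"
    # A newly detected bad sector triggers a warning.
    if drive.bad_sectors > drive.previous_bad_sectors:
        return "warning"
    if drive.last_error_hour is not None:
        hours_since_error = drive.drive_hours - drive.last_error_hour
        # Errors within the past 250 drive-hours: warn.
        if hours_since_error < STABLE_AFTER_HOURS:
            return "warning"
        # Assumption: exactly 250 hours without a new issue counts as stabilized.
        return "info"
    return "ok"
```

Because the window is counted in drive-hours rather than wall-clock time, a rarely used drive in this sketch can stay in the warning state for weeks after its last recorded error.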

Given the ratio of drive cost to down time, we recommend the replacement of any drive which displays any error.

SMART Utility by Volitans Software

The SMART Utility based plugin in the monitoring client software provides a base level reading of a drive’s SMART status.

As problems are found, the full version of SMART Utility can be used to get additional details. IT Professionals who use SMART Utility in this fashion are encouraged to purchase a Consultant's License. Details are available here:

https://www.volitans-software.com/smart-utility-consultants-license/

Disable Reporting on Drives with SMART Reporting

Mac

You can disable notifications for disks with failed SMART reports by opening System Preferences. Navigate to Settings, expand Report SMART Errors, and deselect the checkmark next to the appropriate drive's serial number.

External drives

By default, Mac OS does not read SMART data from disks in an external enclosure, so these disks are skipped by Watchman Monitoring.

There is a project referred to as OS X SAT SMART Driver which attempts to surface this data. This package is NOT supported by Watchman Monitoring, and you understand that any missed errors or lost data relating to the use of the OS X SAT SMART Driver is not the responsibility of Watchman Monitoring.

If you are interested in experimenting with this project, a signed installer can be downloaded from Volitans Software.

  • Ensure all data on any external drive is fully backed up.
  • Download a retail copy of SMART Utility from Volitans Software
  • Choose Install SAT SMART Driver on the affected computer.
  • Watchman Monitoring cannot provide support for this driver.

Requesting additional information about SMART Failures

SMART Error reporting, predicting future drive failure, is as much an art as a science. Watchman Monitoring attempts to report situations where one can assume a drive is failing.

When the manual inspection of a drive is not conclusive and more information is needed, the following steps will allow Watchman Monitoring staff to provide support.

  • Use the Plugin Support option in the Watchman Monitoring dashboard to create a support ticket.
  • Download a retail copy of SMART Utility from Volitans Software
  • Launch SMART Utility on the affected computer and choose Save As... from the File Menu.
  • Submit the resulting .sudr file in a reply to the support ticket, or, if more convenient, via our un-branded Notes Form:
