
Gordon F. Hughes

Associate Director, CMRR

University of California, San Diego
Center for Magnetic Recording Research
9500 Gilman Drive, 0401
La Jolla, CA, 92093-0401

gfhughes@ucsd.edu
(858)534-5317 - Phone
(858)254-2600 - Cell
(858)534-8059 - Fax

Office:
Room 102

S.M.A.R.T. Dataset

The links below explain and download a dataset for testing disk drive failure prediction algorithms, such as machine learning or pattern recognition methods.

All hard disk drives currently implement a simple S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology) failure prediction method. Its purpose is to predict the near-term failure of an individual hard disk drive and to issue a backup warning to the user before failure causes data loss. Very low false alarm rates, about 0.2% per year, must be attained; a false alarm is a prediction that a drive will fail when it won't. (With a 1% annual drive failure rate, a false alarm rate of 0.2% of all drives per year means that good drives returned on false alarms would number a large 20% of the drives that actually fail.)
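The false alarm arithmetic above can be checked with a short sketch. It assumes, purely for illustration, that every false alarm and every real failure leads to exactly one returned drive; the function names are hypothetical and not part of the dataset.

```python
def false_alarms_per_failure(false_alarm_rate, failure_rate):
    """Good drives flagged per drive that actually fails, over the same period."""
    return false_alarm_rate / failure_rate


def good_share_of_returns(false_alarm_rate, failure_rate):
    """Fraction of all returned drives that are actually good.

    Assumption: each false alarm and each real failure causes one return.
    """
    return false_alarm_rate / (false_alarm_rate + failure_rate)


FALSE_ALARM_RATE = 0.002  # 0.2% of all drives per year (from the text)
FAILURE_RATE = 0.01       # 1% annual failure rate (from the text)

ratio = false_alarms_per_failure(FALSE_ALARM_RATE, FAILURE_RATE)
share = good_share_of_returns(FALSE_ALARM_RATE, FAILURE_RATE)
print(f"Good drives returned per failed drive: {ratio:.1%}")
print(f"Good drives as a share of all returns: {share:.1%}")
```

Measured against real failures the ratio is 20%; measured against all returns (failures plus false alarms) it is somewhat lower, because the false alarms also count in the denominator.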

This requirement for very low false alarm rates is also typical of medical diagnostic tests for rare diseases (as studied in epidemiology), and it is known to make high failure prediction accuracy difficult to achieve. Improved methods developed by CMRR (see references) show that 50-60% accuracy (the share of failing drives correctly predicted) can be obtained at the same low false alarm rates, on test data from several thousand drives of each of three drive models from two different drive manufacturers. By comparison, the references show that the present S.M.A.R.T. system achieves only a moderate 10-30% accuracy at 0.2% false alarm rates.
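One common way to compare predictors at a fixed false alarm rate is to set the alarm threshold from the good drives alone and then measure how many failed drives exceed it. The sketch below illustrates that idea only; it is not the CMRR method from the references. It assumes each drive has been reduced to a single hypothetical "risk score", and the scores are synthetic, not taken from the dataset.

```python
import math
import random


def threshold_for_false_alarm_rate(good_scores, max_false_alarm_rate):
    """Return a threshold so that at most the allowed fraction of good
    drives has a score strictly above it (i.e. would raise an alarm)."""
    ranked = sorted(good_scores, reverse=True)
    # Number of good drives allowed to alarm; the small epsilon guards
    # against floating-point results such as 1.9999999 instead of 2.
    allowed = math.floor(max_false_alarm_rate * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return float("-inf")
    return ranked[allowed]


def alarm_rate(scores, threshold):
    """Fraction of drives whose score is strictly above the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)


# Synthetic, hypothetical scores: failing drives tend to score higher.
random.seed(0)
good = [random.gauss(0.0, 1.0) for _ in range(5000)]
failed = [random.gauss(2.5, 1.0) for _ in range(200)]

thr = threshold_for_false_alarm_rate(good, 0.002)
print(f"threshold:        {thr:.3f}")
print(f"false alarm rate: {alarm_rate(good, thr):.2%}")
print(f"detection rate:   {alarm_rate(failed, thr):.2%}")
```

Fixing the threshold on good drives first keeps the false alarm rate at or below the target, so any difference between methods shows up only in the detection rate.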

The dataset available for download here is from one of these drive models and is described in:

J. F. Murray, G. F. Hughes, K. Kreutz-Delgado

"Comparison of machine learning methods for predicting failures in hard drives"
Journal of Machine Learning Research, vol. 6, 2005.
(Available online at http://www.jmlr.org)


S.M.A.R.T. Dataset and Explanation (harddrive1.zip, 3.7MB)

About the Dataset
