Speaker
Mr Andras Horvath (CERN)
Description
Building data centers from a very large number of components
of finite reliability increases the probability of hardware
failures, potentially leading to data corruption and unscheduled
downtime. In addition, the wide variation in hardware types
typical of such installations increases the probability of similar
errors caused by software incompatibility.
We report on the testing and verification methods and
software used to check system integrity and decrease service
downtime by early problem detection and prediction.
Primary author
Mr Andras Horvath (CERN)