Incident-11-05-2011

From CT Wiki

The EMC file server beuk/ebro crashed on May 11 at 21:00. Recovery by a remote EMC engineer started on May 12 at 08:00. After two hours the file server was put back into operation. The last step is to recover all other services, such as login, mail, IMAP, web, LDAP, and many more.

Mail arriving during the outage is buffered by the external SURFmail filter and delivered to the Nikhef mail server after recovery, so no mail is lost.

The EMC file server is the heart of the Nikhef network. If it fails, almost all services in the network fail as well. That is why we bought this expensive, redundant, and usually very stable file server. In the past five years we have not had a single incident with this type of file server. We will receive an incident report from EMC third-level support so that we can understand what went wrong and what measures will be taken to prevent this from happening again.
