If you opened up your USF e-mail account over the winter break and discovered your inbox empty, you were not alone.
That’s because on Dec. 22, just after 2 p.m., about 80,000 USF e-mail accounts were wiped clean.
“I’ve worked here since 1994, and I’ve never seen a situation like this before,” said Academic Computing Associate Director Alex Campoe. “It was catastrophic.”
Twelve hard drives hold all the messages for every e-mail account. The system is capable of handling the simultaneous failure of two hard drives. In this case, three hard drives failed at the same time, crashing the system and completely cleaning out every mail.usf.edu e-mail account.
Eric Pierce, an assistant administrator at Academic Computing, called the likelihood of three hard drives crashing at the same time “almost impossible.”
“I’ve never seen three hard drives fail in a system before,” said Pierce, who has worked at USF for five years.
Naturally, the situation caused slight panic, confusion and some frustration.
“The e-mail outage was a definite inconvenience,” graduate student and teaching assistant Larry Porter said. “Students rely on e-mail for a great deal of academic communication, especially graduate students.”
Marina Schramm, a history major, said she thought she lost every report she wrote for her Theory of History class last semester.
“Plus, those reports had the corrections and the opinions of the professor, which were absolutely invaluable to me,” she said.
One student, commenting on Academic Computing’s blog, complained that he lost graduation pictures.
Campoe said 70,000 accounts had been restored within three days of the crash.
“Then we had another 10,000 to deal with that had all sorts of problems,” Campoe said. “Things didn’t restore correctly. When things restored, certain files weren’t placed in the appropriate place.”
That meant Campoe and the Academic Computing staff had to manually fix the remaining 10,000 accounts, physically manipulating files into their correct places.
Pierce said at least 98 percent of everything had been restored as of Wednesday evening.
“I thought the staff did a great job in getting the system back online quickly and restoring the e-mail that was lost,” student Patti Weeks said.
In the days after the crash, the situation was so dire that Pierce said he worked a total of 30 hours on Christmas Eve and Christmas Day to restore everything as quickly as possible.
“I was working at midnight at Christmas Eve,” he said. “I slept about 10 hours in two days; it was a bad weekend.”
Despite the interrupted holiday, Pierce and his co-workers know that even though they sustained some damage, they dodged an even bigger bullet.
“That’s the best time it could have happened,” Pierce said. “If it would have happened just two weeks ago, during the last week of classes, it would have been much, much worse.”
Both Pierce and Campoe admitted the restoration process took too long. That’s why they hope to get increased funding that would allow a switch from the three-year-old tape drive backup system – which works the same way as a VCR tape – to a hard-drive backup system.
“If we ever had another disaster like this, the data would be read from a hard drive, which would be much faster than a tape,” Pierce said.
Campoe concurred and suggested the tape system is outdated.
“The tape technology we use to back this stuff up can’t keep up with the amount of data we have to back up,” he said.
Campoe said that in the future, Academic Computing will work with the hardware’s manufacturer to monitor the hardware more closely. But he also admitted that there’s no way to predict when or if something like this might happen again.
“It could be 10 years from now,” he said, “or it could be tomorrow.”