Tonight, the Chattanooga, TN area will see extremely strong thunderstorms and possibly tornadoes, from storms that have already done damage to other parts of the United States.
Disaster response readiness – for any type of emergency – is important. But one area, IT emergencies, is often glossed over by nonprofit organizations and small businesses because of the perceived high cost of planning and preparing for them. In this blog post, I will discuss some of the concerns nonprofits should consider when thinking about disaster response for their technology infrastructure, some ways to help mitigate the risk (especially in the face of inclement weather like tornadoes), and finally some ways to respond to disasters when they do strike.
Planning for an IT Disaster
Information Technology (IT) outages come in all shapes and sizes. Your server could crash unexpectedly, lightning could strike and fry all of your equipment (even with good surge protection), your building could get flooded, or worse: you could get hit by a tornado.
At the very least, all of your important data should regularly be backed up from the primary server or other device (such as a Network-Attached Storage – NAS) to a secondary device.
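As a minimal sketch of that baseline, assuming the secondary device is mounted as an ordinary folder, a backup can be as simple as copying the data into a new timestamped folder. The function name `back_up` and the paths in the usage note are hypothetical placeholders, not a specific product's API.

```python
import shutil
from datetime import datetime
from pathlib import Path


def back_up(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    destination = backup_root / f"{source.name}-{stamp}"
    # copytree raises an error if the destination already exists,
    # so an older backup is never silently overwritten.
    shutil.copytree(source, destination)
    return destination
```

Calling `back_up(Path("/srv/shared"), Path("/mnt/backup-drive"))` (both paths are made up for illustration) would leave a dated copy of the shared folder on the backup drive. In practice you would run something like this on a schedule, and a dedicated backup tool will handle large data sets more efficiently than a full copy each time.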
Often, though, organizations will rely on a RAID (Redundant Array of Inexpensive Disks) inside the primary computer or server, an external hard drive, or a second NAS.
An extremely important concept that many people don’t understand is that a RAID is not the same thing as a backup. Sure, it will provide additional redundancy in case a single hard drive fails. However, if the single server in which the RAID exists is struck by lightning, and no backups are available, that RAID will do you no good!
If your organization relies on one of the methods I’ve mentioned above for backup purposes, think with me about what would happen if your entire building were affected. What would happen if your building burned down? Or if there was a horrible flood that ruined all of your computer equipment? Or if a tornado came through and demolished your building?
This is why IT professionals, including myself, will always promote off-site backups. The most common way to back up your data off-site is to use a cloud provider. “The cloud” is a vague term, but I’ve tried to provide a good introduction to it in this blog post. You can also learn more about “the cloud” by reading through the slides from a workshop I co-presented last year.
Another way to back up your data off-site is to create your own cloud-like environment. I am in the process of helping one of our clients set up an on-site NAS which will be automatically (and securely) mirrored to my client’s secondary location. This will ensure that the data is stored on not one, but two different devices, each with a RAID configuration, in two different locations!
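A mirror is only useful if it actually matches the original, so it is worth checking from time to time. The following is a rough sketch, assuming both the primary copy and the mirror are reachable as folders from one machine; the function names are hypothetical. It compares SHA-256 checksums and reports files that are missing or different on the mirror.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_differences(primary: Path, mirror: Path) -> list[str]:
    """List files under `primary` that are missing or different in `mirror`."""
    problems = []
    for source_file in sorted(primary.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(primary)
        mirrored_file = mirror / relative
        if not mirrored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(source_file) != file_digest(mirrored_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list means every file on the primary has an identical copy on the mirror. This sketch only checks one direction: it does not flag extra files that exist on the mirror alone.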
Besides backing up your data (preferably to an off-site location), here are some other suggestions (granted, budget concerns put some of them out of reach for some organizations):
- Keeping extra (unused) hard drives on-site to replace failed drives in a RAID array. Note that if you do this, your spare hard drives should be identical in every way to the hard drives currently in use. You should never put a different kind, size, or even model of hard drive into a RAID array alongside the existing disks.
- Keeping other spare hardware on-site (such as a spare server) in case your main server or NAS goes down.
- Creating a disaster-response plan. Let’s face it: Copying and pasting data stored in the cloud is a whole lot easier than physically replacing hard drives or other failed equipment on-site. Regardless, you should have a step-by-step process in place for what to do, and how to do it, if your infrastructure takes a nose-dive.
Responding to an IT Disaster
The response to a specific disaster will depend on the scale and the type of disaster. For example, your response to a failed RAID array would be much different from your response to a full-scale obliteration of your building.
Typically, I recommend that organizations call in a professional (if they do not have one in-house) to help get things back to normal. Regardless, a few things to consider include:
- The age of your failed equipment
Does it make sense to repair it, or should you replace it? If it’s more than a few years old, then depending on what’s wrong, it may be worth replacing the entire device rather than replacing bits and pieces of it.
- The security of the data on your failed equipment
Even if your hard drives crash, or your device no longer works, the data on them may still be recoverable, so you should securely erase or destroy the hard drives before discarding them.