This article is long overdue: it follows up on some of the sessions about DR planning that I presented at the GPUG Summit, as far back as 2014 in St. Louis. The topic is pretty extensive, so it will be divided into several parts.
Some facts to start with...
You read that right! A large chunk of IT disasters are due to infrastructure failures. Of those, 15% are linked to a badly evaluated impact of system updates or application patches, a lack of testing, or other similar software-related issues. 43% are due to power outages, which happen more often than people would assume. Viruses and malware also account for a huge chunk of disasters, especially when it comes to e-mail threats. The chart above shows that natural causes represent only a small part of IT disasters compared to hardware and human factors.
The cost of downtime can be tremendous and can quickly grow into the tens or hundreds of thousands of dollars. Just think about the loss of productivity when an entire building is out of power and no employee can work on their computer for several hours, not to mention the customers who try to reach your sales department to place orders or ask for customer support. Today most companies rely on modern IP-based telephony systems, and when the internet or the power is out, those systems go down as well.
The pictures above show an IT room that burned down in a building fire, and the computer above is another example of what you might be able to recover after a fire. Electronic components do not get along well with smoke, heat and water (once the fire department or the sprinklers have gone into action), and it is very unlikely that you’ll be able to recover any data from those systems.
Yes, as silly as it sounds, disaster recovery planning starts with simple steps like regular backups that are kept on site and off site.
I always suggest going by the 3-2-1 rule of backup:
– keep at least 3 copies of your data
– store 2 backup copies on different devices or storage media
– keep at least 1 copy off site in a remote location (cloud or safe vault)
There will always be someone telling you that this isn’t enough, but it’s simple enough to remember and can be applied with little or no budget by most businesses. Of course you can spend millions of dollars on your DR & Business Continuity (BC) plans, but every plan is useless if it is not tested on a regular basis. The 3-2-1 rule mentioned earlier should have a 0 added to the end: zero failures. That means that after each backup you should make sure the files are checked for errors and validated as readable by the backup software. What are backups worth if the medium they are stored on can’t be accessed?
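The "zero failures" check can be partly automated. Here is a minimal sketch in Python, assuming the backups are plain files that can be read from both locations; the function names `sha256_of` and `copy_is_intact` are made up for this illustration and are not part of any backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so a large backup never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_is_intact(original: Path, copy: Path) -> bool:
    """True only if the copy can be read in full and matches the original."""
    try:
        return sha256_of(original) == sha256_of(copy)
    except OSError:
        # A missing or unreadable file counts as a failed backup.
        return False
```

A matching checksum only proves that the copy is readable and identical to the original file. It does not prove that the backup itself can be restored, so it complements a regular test restore rather than replacing it.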
I’ve seen a case where a company had been backing up their SQL Server to DAT tapes for years, with proper tape rotation and all the regulatory process, but they never thought about testing whether those tapes could still be restored, simply because it would take hours due to the amount of data and the slowness of the technology in use. The day they ran into a major disk crash, they realized that the tapes were worn out and couldn’t be read, and thus they had no valid data going months back.
Another, more recent example involved a server disk crash after which the OS couldn’t start anymore, which took the SQL Server down as well. Fortunately the company had applied some of the 3-2-1 rule, but not to its full extent: the local SQL backups were copied to a network location by a scheduled script that ran nightly on the other server and simply used the Windows XCopy command to copy over the entire folder structure. The bad news was that the scheduled job started by deleting the prior data before copying over the new data. So when the server crashed, the job wiped out the previous backups on the network and then failed to retrieve the new backups from the system that was down. That was another example of a badly planned process, where the person who put it in place clearly lacked practice in DR & BC planning.
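The flaw in that job was the order of operations: delete first, copy second. A safer order is to copy into a staging folder, verify it, and only then rotate the old copy out. The sketch below shows that idea in Python rather than in the original XCopy script; the function name `replicate_backups` and the `.incoming` / `.previous` folder suffixes are assumptions made up for this example.

```python
import filecmp
import shutil
from pathlib import Path


def replicate_backups(source: Path, destination: Path) -> bool:
    """Refresh destination from source without deleting the last good copy first."""
    if not source.is_dir():
        # Source server is down or unreachable: leave the existing copy untouched.
        return False

    # Hypothetical naming convention: sibling folders next to the destination.
    staging = destination.with_name(destination.name + ".incoming")
    previous = destination.with_name(destination.name + ".previous")

    # Clear leftovers from an earlier failed run.
    shutil.rmtree(staging, ignore_errors=True)
    try:
        # 1. Copy the new backups into the staging folder.
        shutil.copytree(source, staging)
        # 2. Compare every file byte by byte, not just by size and date.
        names = [str(p.relative_to(source)) for p in source.rglob("*") if p.is_file()]
        _, mismatch, errors = filecmp.cmpfiles(source, staging, names, shallow=False)
    except OSError:
        shutil.rmtree(staging, ignore_errors=True)
        return False
    if mismatch or errors:
        shutil.rmtree(staging, ignore_errors=True)
        return False

    # 3. Only now rotate: keep the old copy as ".previous", then promote staging.
    shutil.rmtree(previous, ignore_errors=True)
    if destination.exists():
        destination.rename(previous)
    staging.rename(destination)
    return True
```

With this order, a run against a crashed source server returns `False` and leaves last night's copy in place, and even a successful run keeps one older generation around.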
Fortunately, I was able to recover most of the deleted files from the network location with the help of a disk data recovery program, allowing for a quick recovery of operations. It took an external IT company several days (this was a small business that couldn’t afford full-time IT staff) to salvage and recover the data from the other drives on that physical server, which also allowed them to restore their CRM operations, which had been lost in the data wipe-out (one of the few things I couldn’t recover, since some data had been physically overwritten on the network disk).
In the next article of this series, we’re going to look in more detail at a Disaster Recovery (DR) and Business Continuity (BC) plan, and at some concrete steps you can apply to better protect your Dynamics GP against data loss.
Hope that helps. Until the next post, happy reading.