We just received word from the data recovery firm that, based on their initial analysis, we have an 80% probability of full recovery of the data on our drives. Obviously, everyone would like that number to be 100%, but they cannot provide that level of certainty until they actually begin the recovery, which is the next step in the process.
In an effort to keep everyone informed on a timely basis, here is where we currently stand. The failed drives were delivered to the data recovery firm first thing this morning. We have requested their most expedited service, which takes 24-72 hours. We are also in the process of attempting the file-level restore mentioned in the previous update. It is possible that this process could produce positive results more quickly than the data recovery firm. This work is ongoing and we will provide another update as soon as we have additional information.
As with everything we've experienced during this incident, restoring our customer sites from the October 2017 state to the July 10, 2018 state has proven more complicated and time-consuming than anticipated. We are working the problem from two angles. The first is a file-level restore from the original drives that failed. The second is sending some of the drives to a data recovery company in Los Angeles. We are hopeful that we will be able to recover the data with one or both of these methods. We'd also like to answer some questions you've asked over the last few days:
1) Can we add current data to our site? The answer is YES. You can update your content through the CMS. (Remember, you are essentially dealing with your site as it existed in October.) When the recovered data comes online, we will work with you to merge the new data you've added with the data from July 10th. Please help us in this process by keeping notes regarding which areas of your site you've updated. This will reduce the amount of time it takes to merge the data. We would also encourage you to only make updates to the areas of the site that provide critical or time-sensitive information (news updates, calendars, etc.) or areas where the outdated content would be confusing to your customers.
2) Would it be possible to take the site with the old data down and put a notice that the site will be back online soon? YES. If you would like us to do this, please email email@example.com
3) Can we point our domain to another hosting provider temporarily? YES. If you have an old site or would like to point your URL to another provider until this incident is fully resolved, please email us at firstname.lastname@example.org to coordinate.
If you have other questions, feel free to reach out. We will update this space as new information becomes available.
As of this morning we have 95% of our sites back up and running. We are working to restore the data to the most recent version. We will keep you posted as that progresses.
We continue to make slow but steady progress. The data center believes they have found a workaround that should significantly speed up the process. They are testing that now. If that works as planned and we don't encounter any additional unexpected obstacles, we hope to have all clients up and running no later than Monday morning. As I've mentioned, once the issue is fully resolved we will be reaching out to all clients regarding the credit we will be applying to your account and the measures we are taking to prevent something like this from ever happening again. Thank you for your patience.
We have brought approximately 1/3 of our client sites back online and restored them to their Oct. 2017 state. The others are in the process of being brought online now. We will then begin the process of restoring each site to the Jul 10, 2018 state. We will test each site as we do this. We will continue to work around the clock until this issue is fully resolved. We will post updates to this space as they become available. Keep in mind that until your site is restored to its Jul 10 state, any changes you make will be temporary. We appreciate your patience and the encouraging messages many of you have sent our team.
In an effort to keep you posted on the current status, we have begun to bring the first batch of sites online. We have multiple servers that must be brought online individually. As with everything we've experienced in this incident, it's taking longer than we had hoped and expected. We have people in Tulsa, Chicago, Dallas and LA working to get the issue resolved. We are still making steady progress and are working to get everything fully restored as quickly as possible.
Our team and the team in the data center have been working through the night. As a temporary measure, we are restoring sites to their October 2017 state, which should be complete in the next 4-6 hours. This is the most recent date for which we had a FULL/clean backup with no errors or corruption. The data center conducted periodic tests on our backups and did not receive any indication that there were underlying problems. However, when they attempted to do a full restoration to a production environment, there were some issues. The data center technicians seem confident that they will be able to restore all sites back to their July 10 state within the next 24 hours. We chose to restore using the October 2017 data in an effort to get as many customer websites back up and running as soon as possible. PLEASE NOTE: Any changes you make via the Content Management System will either not save or be overwritten when we restore from the July 10 backup. We will update this space as soon as the sites are fully restored to their most recent version and notify you that you can begin updating the sites as normal. Once again, we apologize for the inconvenience. When this issue is resolved we will be reaching out to all clients. We will do everything we can to make this right and regain your trust.
As you know, we have experienced a major downtime event in our primary web hosting environment. This is, by a large margin, the longest unplanned downtime we have experienced in our 18-year history.
I will provide some technical details of what caused the issue. But first, I would like to personally apologize for the inconvenience this downtime has caused. I am keenly aware that many of us rely on our websites to be the digital "face" of our business. We will be applying a credit to the accounts of every customer who has been impacted.
Our client sites are hosted in a private cloud environment within a Class A data center with industry leading security, scalability and redundancy. Like many similar companies, we do all of our planned maintenance in the overnight hours to minimize impact on our customers. On Wednesday, July 11 at 10:00 p.m. Central, we had planned to swap a drive that had shown early signs indicating it needed to be replaced. This is not uncommon and because of the redundancy that is built into our environment, this usually requires less than 5 minutes to complete.
Unfortunately, this set off an extremely unlikely and unfortunate series of events. When the data center technician attempted to reboot the servers after swapping the drive, he informed us that it looked like a boot drive had also failed. This took several hours to diagnose and replace. The environment still would not come back online as expected. The technicians spent the ensuing 20+ hours removing the servers from the racks, testing, and troubleshooting, but they were not able to come to a successful resolution. Ultimately they were forced to replace all of our hardware and rebuild our environment. This included the need to restore from one of our backups. As of 11:00 p.m. Central on Thursday, July 12, this process is still underway.
Again, I apologize for the inconvenience. If you have any additional questions, please feel free to email me at email@example.com
We will post additional updates to this space as we receive more information from our data center.
Brent Lollis, President
Creative State, LLC