The Challenge: A Burdensome DR Process
To support their suite of cloud-based time and attendance solutions for small and mid-size businesses, Workwell originally assembled their disaster recovery (DR) solution in increments, from a collection of manual processes, homegrown scripts, and various backup tools. As a result, as at many modern enterprises, their DR process was complex and an ongoing maintenance burden for the team.
Further, while Workwell’s piecemeal approach was functional, the actual recovery was a highly manual endeavor. A team of engineers was required to follow a lengthy disaster recovery plan and execute operations across multiple systems with many tools. The process was error-prone and difficult to practice, and it resulted in recovery times of several hours.
The engineering leadership at Workwell wanted a solution that would eliminate the burden of maintaining, documenting, and manually executing their DR processes. They reasoned that the engineering time they were spending on DR would generate much greater ROI if invested in more strategic initiatives for their business. When they learned about Arpio’s automated disaster recovery capabilities, they recognized an opportunity to eliminate a workstream while simultaneously improving the resilience of their workloads.
The Solution: Arpio’s DR Capability
Arpio is an industry-leading DR solution for businesses whose critical operations reside in AWS. After seeing a demo, the Workwell team asked Arpio for a proof of concept (POC) to demonstrate and validate the ease of setup and time to service restoration. In the first hour of that POC, Workwell was able to implement cross-region and cross-account disaster recovery for their initial workload.
The Workwell team was in disbelief at how quickly Arpio was able to protect their critical workloads and replace their homegrown DR solutions. Scott Berry, Senior VP of Engineering, recounted, “When my team delivered the initial POC report, I didn’t believe it.
“I had to see it first hand because it seemed too good to be true. The ease of setup is tremendous. The fact that we can spin up a DR environment and have it fully functional in less than 30 minutes, and have it tested… I’d never seen it before.”
Arpio replaced all of the tools in Workwell’s multi-faceted, multi-supplier legacy DR plan and streamlined the majority of their recovery processes into a single solution. “To think that in minutes we had automated a manual process that previously took a team of engineers many hours to execute is absolutely phenomenal.”
Once Workwell’s engineering leadership validated the extent to which Arpio would streamline their disaster recovery operations, they wasted no time in migrating their other critical workloads. In fact, Jim Cavenaugh, Workwell’s Director of Engineering, says that one engineer was able to onboard the rest of their workloads in just over a week. “And most of that time was just testing.”
Indeed, the ease of testing their DR solution has been another unexpected win for the Workwell team. With their previous solution, DR testing was infrequent because of the significant time involved. With Arpio, their DevOps team now executes a push-button test of their DR solution every month. The business is now confident that it can recover quickly from any cyber event or cloud outage that might impact its AWS environment.
The Results: Peace of Mind
When asked about the impact that Arpio has had on their business, Scott said, “More than anything, the best way to sum it up is just peace of mind. We have a rock-solid DR solution in place, and we can act on it on demand. I no longer have to rally my troops in the middle of the night to try to execute against a highly manual process.”
Arpio is now the de facto approach to DR for all net-new Workwell products.
The simplicity of Arpio has made what was once an arduous process – planning and documenting DR processes – pain-free. “Of course we are going to do it with every new product,” said Jim, “because we get it out of the box.”
“The sense of comfort goes a long way. Knowing day in and day out that we can recover our workloads in a matter of minutes means we can focus our engineering efforts on delivering on the strategic initiatives that advance our business,” Scott said. “We don’t have to think about it … or do a bunch of planning around it. It’s a no-brainer.”
Key Metrics
- Implemented and verified DR for 11 workloads on Arpio in under 10 days with a single engineer. Workwell estimates this is one-tenth of the time it would have taken without Arpio.
- Reduced recovery time for an AWS regional outage from 4 hours to 15 minutes.
- Reduced the manual processes required to respond to unexpected downtime by 90%.
- Increased DR testing cadence from annual to monthly.