The Challenge: Resiliency Gaps Existed, but DR Seemed ‘Insurmountable’

NurseGrid is a health technology company that has served the healthcare community since 2013. Every day, NurseGrid’s technology delivers life-critical resource planning capabilities to more than a million healthcare professionals. NurseGrid’s software streamlines the often-convoluted scheduling process in healthcare facilities, and the company is proud to offer the #1-rated mobile nurse scheduling solution in the Apple App Store and on Google Play.

As NurseGrid’s Chief Technology Officer, Lorenzo Ciacci is aware of the role that his service plays in the lives of nurses and patients everywhere. “When you’re a nurse manager, and you’re responsible for organizing the care of a fluctuating population of patients, you need to be able to review and augment staffing schedules in real-time. If a hospital experiences an influx of patients, and you can’t immediately add staff to upcoming shifts, patient care suffers.”

Because of this imperative, resilience to cloud outages and protection from cyberattacks have always been important to the NurseGrid team. Their service runs on Amazon Web Services, so they benefit from the many layers of redundancy and security built into AWS. But Lorenzo was aware of the resiliency gaps that remained.

“Having an effective DR solution was always something we were thinking about, but we couldn’t find the cycles to get it done. It kept getting pushed down the priority list because of its seeming ‘insurmountableness.’”

One day in the fall of 2019, a data loss event reminded them of the importance of disaster recovery. As so often happens, a human error made it necessary to recover databases from a backup. Luckily, the impacted databases were managed in Amazon RDS, which had been taking in-region backups, and the data was restored within a few hours. But the incident raised questions: “What if the loss hadn’t been a mistake? What if a malicious actor had deleted the data and the backups? And how would we recover everything in the event of a catastrophic event?”

The Solution: Arpio’s Disaster Recovery Capabilities

NurseGrid’s board of directors wanted this problem solved well, so they pushed Lorenzo to look broadly for the best solution. “Throughout our evaluation, Arpio was the only solution that made sense. The other options failed to recognize the complexity of our cloud-native workload. Arpio is the only disaster recovery solution that truly understands Amazon Web Services.”

Lorenzo was similarly impressed with Arpio’s ability to test disaster recovery for his application. “The fact that we can test it every other day if we want is amazing. It’s simply 1 click to bring up the entire DR environment for testing. And it won’t break the bank.”

The NurseGrid environment consists of load balancers, EC2 instances, RDS databases, ElastiCache for Redis nodes, and a MongoDB database hosted in MongoDB’s Atlas service. They manage their environment with Chef via the AWS OpsWorks configuration management service. The production environment runs in the AWS Northern Virginia region, and they wanted their DR environment established in the Oregon region. They also wanted the DR environment in a different AWS account, locked down so that it wouldn’t be vulnerable to a cyberattack.

Responsibility for implementing Arpio fell to Taylor Eke, NurseGrid’s lead architect. Taylor was able to quickly configure Arpio to protect the production environment. The process was as simple as deploying a CloudFormation template to configure permissions, and then selecting a few critical resources in the Arpio UI. Taylor selected the load balancer, the database, the ElastiCache node, and some EC2 instances. Arpio automatically discovered the security groups, IAM roles, network configuration, and everything else needed to fully recover the application in the recovery environment. Suddenly, the project that had previously been deemed insurmountable was accomplished in under one hour.
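
To make the idea of discovering dependencies from a handful of selected resources concrete, here is a minimal sketch in Python. It is not Arpio's implementation, which this case study does not describe. It assumes a hypothetical, hand-written dependency map (the resource names and the `DEPENDS_ON` table are invented for illustration) and walks it breadth-first to collect everything the selected resources rely on.

```python
from collections import deque

# Hypothetical dependency map: each resource lists the resources it references.
# All names are illustrative; none come from NurseGrid's actual environment.
DEPENDS_ON = {
    "load-balancer": ["security-group-web", "subnet-public"],
    "ec2-app": ["security-group-app", "iam-role-app", "subnet-private"],
    "rds-database": ["security-group-db", "subnet-private"],
    "elasticache-node": ["security-group-cache", "subnet-private"],
    "subnet-public": ["vpc"],
    "subnet-private": ["vpc"],
    "security-group-web": ["vpc"],
    "security-group-app": ["vpc"],
    "security-group-db": ["vpc"],
    "security-group-cache": ["vpc"],
}


def discover(selected, depends_on):
    """Return the selected resources plus everything they transitively depend on."""
    seen = set(selected)
    queue = deque(selected)
    while queue:
        resource = queue.popleft()
        for dependency in depends_on.get(resource, []):
            if dependency not in seen:  # visit each resource only once
                seen.add(dependency)
                queue.append(dependency)
    return seen


if __name__ == "__main__":
    picked = ["load-balancer", "rds-database", "elasticache-node", "ec2-app"]
    for name in sorted(discover(picked, DEPENDS_ON)):
        print(name)
```

Selecting four resources in this toy map pulls in the security groups, the IAM role, the subnets, and the VPC without anyone listing them by hand, which is the effect described above.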

The Results: NurseGrid Can Focus on Their Solution, Not AWS Disasters

Following the experience, Taylor had some flattering things to say. “Arpio uses AWS the way I would have wanted to build my own solution, but wouldn’t have been able to execute. It’s there, but it’s not there because I don’t have to maintain a second stack.”

“Arpio is magic.”

Today, NurseGrid continues to innovate to provide the best possible staffing and scheduling solution for healthcare workers. And Arpio continues to ensure that their service is protected from any disaster that could befall their production environment. Every time a nurse is assigned a shift, every time new software is deployed, and every time NurseGrid’s AWS environment is updated, Arpio replicates their data and their infrastructure to their protected DR environment.

We all hope they’ll never need it, but NurseGrid’s customers and their patients are glad it’s there if they ever do.