Follow the instructions below to set up an Intermediate Failover Architecture as described in Best Practices for using Elastic IPs and Availability Zones. In this setup, the production deployment runs instances in more than one availability zone, with three front ends using Elastic IPs.
If the primary availability zone (us-east-1a) fails, you will still have a running front end instance serving content from the secondary availability zone (us-east-1b). To recover, promote the Slave-DB to "Master" and launch your backup deployment in a different availability zone (us-east-1c). The performance of the site will suffer proportionally during an availability zone failure, but the site will remain operational until you can launch additional front ends and a new Slave-DB.
Remember that if you have instances running in multiple availability zones, you'll have to pay zone-to-zone transfer fees. Data transferred between EC2 instances in different availability zones costs $0.01 per GB.
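To get a feel for the zone-to-zone transfer fee, the arithmetic can be sketched in a few lines of Python. The rate is the $0.01 per GB figure quoted above and may not match current pricing; the function name and the traffic volume are hypothetical, chosen only for illustration.

```python
# Rate quoted in the text above; treat it as an assumption and check current pricing.
CROSS_ZONE_RATE_USD_PER_GB = 0.01


def cross_zone_transfer_cost(gb_transferred, rate_usd_per_gb=CROSS_ZONE_RATE_USD_PER_GB):
    """Return the estimated zone-to-zone transfer fee in US dollars."""
    if gb_transferred < 0:
        raise ValueError("gb_transferred must be non-negative")
    return gb_transferred * rate_usd_per_gb


if __name__ == "__main__":
    # Hypothetical workload: 20 GB of cross-zone traffic per day for 30 days.
    monthly_gb = 20 * 30
    print(f"{monthly_gb} GB -> ${cross_zone_transfer_cost(monthly_gb):.2f}")
```

In a deployment like this one, the cross-zone traffic would mostly be database replication to the Slave-DB and requests from the front end in the secondary zone to the Master-DB.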
In this example, we will use DNS with multiple A records. One A record is created for each Elastic IP (EIP), all with the same DNS name, and the DNS server rotates the order of the A records it returns. For a setup that uses HAProxy for better load balancing performance, see the Advanced Failover Architecture example.
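The rotation of A records can be illustrated with a toy model. This is only a sketch of the round-robin idea, not how any particular DNS server is implemented; the class name, the host name, and the addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical stand-ins for your DNS name and three EIPs.

```python
from collections import deque


class RoundRobinRecordSet:
    """Toy model of one DNS name backed by several A records."""

    def __init__(self, name, addresses):
        self.name = name
        self._addresses = deque(addresses)

    def resolve(self):
        """Return all A records, then rotate so the next answer starts elsewhere."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # move the first record to the end
        return answer


if __name__ == "__main__":
    # Hypothetical name and addresses standing in for the three Elastic IPs.
    records = RoundRobinRecordSet(
        "www.example.com", ["192.0.2.10", "192.0.2.20", "192.0.2.30"]
    )
    for _ in range(4):
        # Many clients simply connect to the first address in the answer.
        print(records.resolve()[0])
```

Real resolvers and caches may reorder or reuse answers, so the spread of traffic across the front ends is only approximately even.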
If you follow the diagram above as an example, your production deployment should look like the following screenshot. Notice that each front end uses an Elastic IP and that the Slave-DB and FrontEnd-3 are in a different availability zone. Make sure that you select the correct availability zone for each server.
WARNING! The current default for a server's Availability Zone is "-any-."
Now that you have an operational production deployment distributed across multiple availability zones, you should create backup instances that are ready to be launched in a failure scenario. Being proactive when you design your deployment is what makes quick failover and recovery possible. Because you only pay for instances that are launched, it's a good idea to keep a properly configured "backup" for each instance.
For an intermediate setup like the diagram above, we recommend creating backups of each type of instance that are ready to be launched with the appropriate Elastic IP and into the appropriate availability zone. Once your deployment is running, simply clone each type of instance. In this example, you should have the following two backups:
Once you add your backups, your deployment should look like the screenshot below.
WARNING! You should not use the "start all" link once you've added backup instances.
Notice that the backups do not have Elastic IPs or Availability Zones predefined because you don't know where you will launch these instances when problems occur.
You will not have a backup for the Master-DB because you have a redundant MySQL setup. If the Master-DB ever fails, simply promote the Slave-DB to Master-DB and launch a new Slave-DB instance. By configuring these instances ahead of time, you'll be "ready to launch" when problems occur.
In this example, the us-east-1a Availability Zone stops providing service, but your site is still up. By designing your deployment according to Best Practices, you've built a configuration that tolerates a one-zone failure.
The first step is to get back to full performance; the second step is to restore tolerance of a zone failure. Service is never interrupted during this recipe.
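The two-step recovery can be sketched as a small planning function. This is a hypothetical illustration based on the overview at the top of this page, not a RightScale API: the `Server` tuple, the function name, and the action strings are invented, and placing FrontEnd-1, FrontEnd-2, and the Master-DB in us-east-1a is an assumption inferred from the description of the deployment.

```python
from collections import namedtuple

# role is one of "frontend", "master-db", or "slave-db" (labels invented for this sketch).
Server = namedtuple("Server", ["name", "role", "zone"])


def plan_zone_failover(servers, failed_zone, recovery_zone):
    """Return an ordered list of recovery actions after failed_zone stops serving."""
    lost = [s for s in servers if s.zone == failed_zone]
    actions = []
    # Step 1: get back to full performance.
    if any(s.role == "master-db" for s in lost):
        actions.append("promote Slave-DB to Master-DB")
    for s in lost:
        if s.role == "frontend":
            actions.append(
                f"launch backup for {s.name} in {recovery_zone} and assign its Elastic IP"
            )
    # Step 2: restore tolerance of a zone failure with a fresh Slave-DB.
    if any(s.role in ("master-db", "slave-db") for s in lost):
        actions.append(f"launch new Slave-DB in {recovery_zone}")
    return actions


if __name__ == "__main__":
    # Assumed layout of the example deployment.
    deployment = [
        Server("FrontEnd-1", "frontend", "us-east-1a"),
        Server("FrontEnd-2", "frontend", "us-east-1a"),
        Server("Master-DB", "master-db", "us-east-1a"),
        Server("FrontEnd-3", "frontend", "us-east-1b"),
        Server("Slave-DB", "slave-db", "us-east-1b"),
    ]
    for action in plan_zone_failover(deployment, "us-east-1a", "us-east-1c"):
        print(action)
```

Calling the same function with `"us-east-1b"` as the failed zone yields the shorter plan for a secondary zone failure, where no promotion is needed.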
Once again you have a production deployment distributed across multiple availability zones. Remember, performance will be affected proportionately if a zone stops, but the important thing is that your service will keep running.
In this example, only the redundant zone has stopped working. In this recipe, you will replace the failed server to reestablish high reliability. Service is never interrupted during this recipe.
Once again you have a production deployment distributed across multiple availability zones. Remember, performance will be affected proportionately if a zone stops, but the important thing is that your service will keep running.
© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.