
Advanced Failover Architecture

Follow the instructions below to set up an Advanced Failover Architecture as described in Best Practices for using Elastic IPs and Availability Zones.  In this setup, the production deployment runs instances in more than one availability zone: two frontend load balancers using Elastic IPs, with the same number of application servers in each zone.  If one of the availability zones fails, your site's available capacity is therefore temporarily cut in half.  If redundancy is the main goal and budget is no constraint, you can create more complex deployments with instances spread across many zones.

Remember, if you have instances running in multiple availability zones, you'll pay zone-to-zone transfer fees.  Data transferred between EC2 instances in different availability zones costs $0.01 per GB.
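As a back-of-the-envelope check, the sketch below estimates a monthly cross-zone bill from the $0.01/GB rate quoted above; the traffic volume is a hypothetical example, not a measured figure.

```python
# Rough monthly cost of zone-to-zone traffic, using the $0.01/GB
# rate quoted above. The 50 GB/day figure is purely illustrative.
CROSS_ZONE_RATE_PER_GB = 0.01  # USD per GB transferred between zones

def monthly_cross_zone_cost(gb_per_day, days=30):
    """Estimated monthly fee for gb_per_day of cross-zone traffic."""
    return gb_per_day * days * CROSS_ZONE_RATE_PER_GB

# e.g. 50 GB/day of replication traffic:
print(f"${monthly_cross_zone_cost(50):.2f}")  # $15.00
```

For a typical replication workload this fee is small compared to the cost of the redundant instances themselves, but it scales with write volume, so it is worth estimating before adding zones.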

This advanced architecture addresses the weaknesses of DNS multiple A record solutions.  Either load balancer can distribute load to any available application server, and the TTL delays inherent in any DNS scheme are avoided.



Setup Instructions

Production Deployment (us-east-1a)

If you follow the diagram above as an example, your production setup consists of two active deployments, with another deployment configured to launch in a third zone.  Each deployment is dedicated to a particular zone.  Notice that you're using an Elastic IP for each frontend load balancer, with the same number of application servers in each availability zone.  The clone operation makes this duplication a one-click solution (with a few adjustments later).




The next step is to clone the Production (1b) deployment and call the new one "Production (1c)".  Remember, you do not want to clone "Production (1a)" because if the zone with the Master-DB fails, you will simply promote the Slave-DB to Master-DB and launch a new Slave-DB instance.

  • Change the availability zone for each server in Production (1c) to launch into a different availability zone (ex: us-east-1c).
  • Change the Elastic IP for the backup frontend load balancer (FrontEnd-3) to be "-none-" because you don't know which Elastic IP you'll need to assign to this instance. 
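The clone-and-adjust steps above can be sketched as a simple data transformation.  The data model and field names below are illustrative only and do not reflect the RightScale API.

```python
import copy

def clone_deployment(deployment, new_name, new_zone):
    """Clone a deployment, retarget every server to new_zone, and clear
    the Elastic IP on the backup load balancer (which IP it will need
    depends on which zone actually fails)."""
    clone = copy.deepcopy(deployment)
    clone["name"] = new_name
    for server in clone["servers"]:
        server["availability_zone"] = new_zone
        if server["role"] == "load_balancer":
            server["elastic_ip"] = None  # "-none-" until failover time
    return clone

# Hypothetical "Production (1b)" deployment:
prod_1b = {
    "name": "Production (1b)",
    "servers": [
        {"role": "load_balancer", "availability_zone": "us-east-1b",
         "elastic_ip": "203.0.113.10"},
        {"role": "app", "availability_zone": "us-east-1b",
         "elastic_ip": None},
    ],
}
prod_1c = clone_deployment(prod_1b, "Production (1c)", "us-east-1c")
```

The deep copy mirrors the console behavior: the clone is an independent deployment, so adjusting its zones and Elastic IPs never touches the running production servers.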



For advanced architectures, it's better to think at the deployment level instead of at the individual server level.  Instead of creating backups for each type of server, you can simply create backup deployments that are ready to launch in a different availability zone if one of the zones fails.

Once you start thinking of a collection of servers as a "Deployment Unit," the move from zone to zone becomes a simple operation.

Simply click the Clone button for a deployment and change the availability zones for all servers accordingly.

All of the servers in your account with the same network options are on the same local network even when in distinct deployments.

Failover Scenarios

Failure in Availability Zone (us-east-1a)




In this recipe, you do not want to launch the entire deployment at the same time.  Do not use the "launch all" button, because the servers in the backup deployment must be launched in the correct order.

  1. Go to the backup deployment (Production 1c).
  2. If the availability zone with the Master-DB fails, promote the Slave-DB to Master-DB.  If the Master-DB is still operational, proceed to Step 3.
  3. Launch the Slave-DB into the new availability zone (us-east-1c).  Use operational scripts to attach the new Slave-DB to the current Master-DB to restart redundancy and replication.
  4. Launch the application servers (app-5, app-6) into the new availability zone (us-east-1c).
  5. Launch the frontend load balancers (FrontEnd-3) into the new availability zone (us-east-1c).
  6. Execute the LB get HA proxy config operational action on the new load balancer (FrontEnd-3) to get the configuration file from the running load balancer in order to establish communication with the application servers.  If this RightScript is not used in your FrontEnd's ServerTemplate, you can import the latest version of the "Rails FrontEnd" ServerTemplate from the MultiCloud Marketplace to gain access to this script.  You can either add the RightScript as an operational script or run it as an "Any Script" on the server.
  7. Associate the unused Elastic IP to the new load balancer (FrontEnd-3).
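The ordering constraint in the steps above (database first, then application servers, then the load balancer, with the Elastic IP moved last) can be sketched as a small driver.  The callbacks stand in for console actions; the names mirror the steps, not a real RightScale or EC2 API.

```python
# Ordered failover sketch. Each callback represents a console action.
def fail_over(master_db_alive, promote, launch, run_script, associate_eip):
    if not master_db_alive:
        promote()                      # step 2: Slave-DB becomes Master-DB
    launch("Slave-DB")                 # step 3: restore replication
    launch("app-5")                    # step 4: application servers
    launch("app-6")
    launch("FrontEnd-3")               # step 5: load balancer
    run_script("LB get HA proxy config", "FrontEnd-3")  # step 6
    associate_eip("FrontEnd-3")        # step 7: move traffic over last

# Record the order to confirm the Elastic IP is associated only at the end:
actions = []
fail_over(
    master_db_alive=False,
    promote=lambda: actions.append("promote"),
    launch=lambda name: actions.append("launch " + name),
    run_script=lambda script, server: actions.append("run " + script),
    associate_eip=lambda server: actions.append("eip " + server),
)
```

Associating the Elastic IP last matters: it is the step that redirects live traffic, so everything behind it (database, app servers, load balancer configuration) must already be up.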


If your Master-DB is large and takes a long time to restore from backups, one advanced strategy is to keep a "hot" Slave-DB up and running in as many zones as you can afford.

In a failure scenario (or under heavy load), the extra backup Slave-DB will already be connected to the load balancers and application servers, ready to serve a larger percentage of the traffic.  Most application servers launch quickly and can be added as needed.  These "spare slaves" are also a great place to test your backup and restore policy and to QA new software: occasionally restore a Slave-DB and check that the data is correct.


Last modified
21:33, 16 May 2013






© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.