
Set up Autoscaling using Alert Escalations


After you set up your production deployment, you may already be thinking "How do I set up my deployment to autoscale?" 

This tutorial shows you how to change a basic 4-instance setup into a scalable deployment with an application server array that scales up and down based on alert conditions and scaling parameters that you define. In order to create a scalable deployment, you need to define an alert and alert escalation, and then specify the appropriate ServerTemplate and array that you want to scale. An alert (previously called "alert specification") is a notification that a problematic condition occurred in your deployment. An alert escalation performs an action or set of actions in response to the triggered alert condition. Together, they can be used to configure your deployment to automatically take action on your behalf.


For example, if your site is experiencing a large amount of traffic, you can set up an alert and escalation to grow or "scale" your deployment, where more server resources are added to your server array in order to handle the increased bandwidth requirements. In this way, you can set up your website to automatically scale based on particular alerts and escalations that you define.


Basic Four-server Setup

The basic setup has two front ends (load balancer + app server), as well as a master and slave database. 


Scalable 4-server Setup with a Scalable Application Server Array

When additional server resources are needed in a scalable setup, new servers are launched and added to the server array. When those resources are no longer needed, the application servers can be terminated and removed from the server array (shrink) while the basic four-server setup stays intact.  Autoscaling is especially useful for ensuring that your deployment can easily scale up whenever extra server resources are needed, whether that's tomorrow, next week, next year, or perhaps never. At least you know your deployment is set up to take advantage of one of the key benefits of cloud computing - launching additional server resources on-demand.


Scalable 6+ Server Setup with a Scalable Application Server Array

Similarly, you could also transition into a three-tiered, scalable 6-server setup with two servers that act as dedicated load balancers. See the Transition to a Scalable 6+ Server Setup section below.




Create a Server Array

Note: If you ran the E2E Gaming Deployment macro, you can skip this step and proceed to the next step, Create Alert Escalations. (The macro creates the Server Array for you, but does not configure it for alerts and autoscaling.) Tip: If you are new to Server Arrays, read through this step anyway, even if you don't have to perform it.

Go to Manage > Arrays. Click New Array and assign it to the desired cloud and deployment.  For this example, since our deployment is in AWS US-East, we will choose the same region for our server array.  Server arrays are cloud and AWS region-specific.

Define the Array

The next step is to configure the array and define how the array will be scaled.



General  Info
  • Nickname - Provide a Nickname for the server array. (e.g. MyFirstArray)
  • Array type: Alert-based
    There are two types of server arrays: Alert-based or Queue-based.  Queue-based arrays are used for grid applications.
  • Deployment - Each array must be associated with a deployment.  You can attach more than one server array to a deployment.
  • Active - Leave the "Active" checkbox unchecked until you are ready to make the array and the alerts active.
Server Options
  • Cloud - Each array must be associated with a cloud or cloud region. Once you've chosen the cloud and/or region, it cannot be changed.
  • ServerTemplate - Choose the ServerTemplate that you would like to use in order to scale your deployment.  This ServerTemplate will be used to create the new servers that will be added to the server array.  Since you are scaling the number of application servers, you will need to select the latest version of the appropriate (PHP/Rails/TomCat) application ServerTemplate, not a frontend ServerTemplate. (If you need to make changes to the template, you will need to clone it.)
    NOTE: Make sure that the Inputs for the ServerTemplate are properly configured and that the server will boot correctly.
  • SSH Key / Security Group - Specify the appropriate SSH key and Security Group that will be attached to the array.
Server Allocation Policy
  • Default min count: 2
    The 'min_count' is the minimum number of servers that you want running at any given time.  If you have a basic 4-instance setup with two load balancers, then set this value to 2.
  • Default max count: 20
    The 'max_count' is the maximum number of servers that you can run at any given time.  Default = 20.  Amazon will not let you launch more than 20 servers unless you submit a form request to increase this limit.
  • Availability Zone: Any (default)
    Defines the availability zone(s) into which new servers in your array will be launched.  You can select a specific zone, 'Any', or 'Weighted'.  The 'Any' option disperses servers across all availability zones, which prevents all of your array servers from disappearing if an availability zone ever fails.  'Weighted' lets you control the ratio of server dispersion, for example if you want to put more servers into a particular zone.
  • Decision threshold: 51%
    The decision threshold only applies to scaling actions where servers can "vote" for a particular scaling action (e.g. 'vote_grow_array' or 'vote_shrink_array').  Other actions such as 'send_email' or 'run_right_script' are performed at the server level and are not applicable to a server array's configuration.  The decision threshold is the percentage of servers that must vote in agreement before a scaling action is executed. We recommend setting this value to 51% in order to ensure that the majority of your servers are voting for the same action (e.g. 'vote_grow_array') before any action is taken. Our voting system allows you to "scale democratically" because it ensures that you don't accidentally launch a bunch of new servers because one of your servers went "out of control."
  • Resize Up by: 2
    The resize up parameter defines how many servers you want to launch when you scale up.  When a decision is made to "grow an array," we recommend launching at least 2 servers in order to ensure that a significant impact will be made to your setup.  If your deployment needs more server resources, it's better to overcompensate than undercompensate.  Similarly, if you have larger setups or you have predictable scaling patterns (ex: 5 servers at a time), you also have the flexibility to scale in bulk.
  • Resize Down by: 1
    The resize down parameter defines how many servers you want to terminate when you scale down.  For example, you might want to scale down more conservatively (slowly) than how you scale up.  
  • Resize calm time: 15 (minutes)
    The calm time defines how long you want to wait before you repeat another action.  Since it takes a few minutes for a new server to be launched, become fully operational, and start to have an impact, you'll want to give yourself a buffer before allowing another action to occur.  For normal situations, we recommend using a calm time of at least 15 minutes.
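
The interaction between these settings can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RightScale's implementation: the `ArrayPolicy` class, the `next_count` function, and the `voters` parameter (the number of servers allowed to vote) are hypothetical names, and rounding the threshold up to a whole server is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ArrayPolicy:
    """Hypothetical container mirroring the Server Allocation Policy fields."""
    min_count: int = 2
    max_count: int = 20
    decision_threshold_pct: int = 51
    resize_up_by: int = 2
    resize_down_by: int = 1
    calm_time_min: int = 15


def next_count(policy, current, voters, grow_votes, shrink_votes,
               minutes_since_last_resize):
    """Return the array size after one evaluation of the votes."""
    # During the calm time no further resize is attempted.
    if minutes_since_last_resize < policy.calm_time_min:
        return current
    # Votes needed to reach the threshold, rounded up (assumption).
    needed = max(1, (voters * policy.decision_threshold_pct + 99) // 100)
    if grow_votes >= needed:
        return min(current + policy.resize_up_by, policy.max_count)
    if shrink_votes >= needed:
        return max(current - policy.resize_down_by, policy.min_count)
    return current


policy = ArrayPolicy()
# One of four voting servers wants to grow: below 51%, so nothing happens.
print(next_count(policy, 2, 4, 1, 0, 30))
# Three of four vote to grow: the array is resized up by 2.
print(next_count(policy, 2, 4, 3, 0, 30))
```

In this sketch a single overloaded server cannot trigger a launch on its own, which is the point of the 51% recommendation above.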

Click Save.

Enable the array

Be sure to enable (activate) the array that you just created. Even if you attach an array to a Deployment, the alerts and escalations will not be able to cause a scaling action unless the array is active.  If you are adding a server array to a live deployment, you might want to launch a test server in the server array before making it active. See Test a Server Array.


Click Enable


Create Alert Escalations 

The next step is to define the action(s) to be taken when an alert is triggered.  An escalation can have one or more actions.  We will create two escalations that contain the appropriate scaling action:

  • "scale-up" - add new server resources to the application server array.
    • Use the 'vote_grow_array' action
  • "scale-down" - terminate (remove) server resources from the application server array.
    • Use the 'vote_shrink_array' action



Go to Design > Alert Escalations > New.


Enter a Name and a Description for the new Alert Escalation.  Next, you'll need to associate the Alert Escalation with a Deployment.  You can either associate it with "All Deployments" (default) or make it available only to a specific deployment.  If you have followed the other tutorials, attach this Alert Escalation to your "Production" Deployment and click Save.

Note: You will need to remember the name of the new escalation ("scale-up") when you create an alert in the next step.



Now that you've created a new escalation you'll need to define what action should be taken when the alert is triggered.  In this example, we want to add server resources to the application server array. i.e. "grow the array."

Click on the Actions tab.

You'll notice that we've pre-defined several common actions that can be associated with escalations.  See Valid Actions for Alert Escalations for complete descriptions of each action.

Select the “vote_grow_array” action from the dropdown and click Add.

Note: For each Alert Escalation, you can add a single action or define a sequence of multiple actions.  For example, you might want to send an email to your system administrator before or during a scaling action.  To simplify this tutorial we will only use a single action. 



If more than one action is associated with an Alert Escalation, you can control how soon (in minutes) the next action will take effect.  Since we only have one action, you can leave the "Escalate after" field blank.  If the RightScale account has multiple server arrays, you will need to select which server array the alert escalation should take effect on.  Click Save.
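
A multi-action escalation can be pictured as an ordered list of steps. The Python sketch below is hypothetical and is not how RightScale stores or runs escalations. It assumes that each "Escalate after" value is the number of minutes the alert must remain triggered before the next action takes effect, and that a blank field (`None`) ends the sequence. The `actions_fired` function is an invented name.

```python
def actions_fired(actions, minutes_triggered):
    """Return the actions reached after the alert has stayed triggered
    for `minutes_triggered` minutes.

    `actions` is a list of (action_name, escalate_after_minutes) pairs;
    escalate_after_minutes is None when the field is left blank.
    """
    fired = []
    minutes_needed = 0
    for name, escalate_after in actions:
        if minutes_triggered < minutes_needed:
            break  # the alert has not been active long enough
        fired.append(name)
        if escalate_after is None:
            break  # blank "Escalate after": no further escalation
        minutes_needed += escalate_after
    return fired


# The single-action escalation used in this tutorial.
scale_up = [("vote_grow_array", None)]
print(actions_fired(scale_up, 0))

# A two-step variant: email the administrator first, then vote to grow.
notify_then_grow = [("send_email", 10), ("vote_grow_array", None)]
print(actions_fired(notify_then_grow, 5))
print(actions_fired(notify_then_grow, 10))
```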



Now that we've created an Alert Escalation to grow the server array, we need to create another escalation to shrink the server array. 

Click the New button to create another Alert Escalation.

Once again you will need to define the new alert escalation.  Be sure to choose the same Deployment as the 'scale-up' Alert Escalation.  Click Save.



Now define the action(s) that should be taken when the 'scale-down' Alert Escalation is called by a triggered alert.

Click on the Actions tab. 

This time select the “vote_shrink_array” action from the dropdown bar and click Add.  You can leave the "Escalate after" field blank for this action as well.  Click Save.

Now that we've set up Alert Escalations to grow and shrink the server array, let's configure the alerts that will call these escalations when they're triggered.



Connect Alerts to the FrontEnd ServerTemplate

Next you need to define the alerts.  You will need to create two alerts:

  • "Need to Grow Array" - if the monitored conditions are met and the alert is triggered, then call the "scale-up" Alert Escalation.
  • "Need to Shrink Array" - if the monitored conditions are met and the alert is triggered, then call the "scale-down" Alert Escalation.


Alerts can be associated to a ServerTemplate, Server, or Server Array.  You can either create a new alert or import an alert from another source.  Remember, you will need to attach the alert to each server that you want to monitor for your alert's conditions, so that each server can "vote" when one of the alerts is triggered.

In this example, we want to make sure the alerts will be configured on all "application" servers.  Since the FrontEnd server also acts as an application server, we'll need to add the alerts to both ServerTemplates.

  • PHP FrontEnd (clone) - ServerTemplate that's used by both FrontEnd servers
  • PHP App Server (clone) - ServerTemplate that will be used to launch new server instances into the server array

First, let's create and add the grow and shrink alerts to the FrontEnd ServerTemplate.  Go to the Alerts tab of the FrontEnd ServerTemplate and click New.  In this tutorial we will create our own alerts instead of importing existing ones.  (Note: The existing alerts are practical, but during the setup/test phase of your alerts and autoscaling you will often want to change the numbers so the array scales faster.  For example, why wait for 60 minutes of busy CPU during a test?  Waiting just a few minutes before voting to grow will speed up each test cycle.)



Need to Grow Array

When you create an alert you will need to give it a name, description, and then define what condition will be monitored.  First, let's configure an alert that will get triggered when a server is getting overworked.  We'll configure this first alert to trigger when the server's CPU Idle value is less than 30% for 3 minutes.  If a server is at over 70% capacity, it may be a good time to scale-up and launch additional servers to absorb some of the load.


  • Select cpu-0/cpu-idle as the metric that you are monitoring.  This is the cpu-idle value of the server.
  • Select value for the variable.
  • Select less than < as the condition.
  • Type 30 for the threshold since we want to trigger this alert when the computer's idle time is less than 30%.
  • Type 3 for the Duration, since we want the condition to exist for 3 consecutive minutes before triggering an Alert Escalation.
  • Select the Alert Escalation that you want to call when the alert is triggered.
    Note: You will need to use the same name that you used to create the Alert Escalation in the previous step. (e.g. 'scale-up')

Click Save.
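
The condition configured above (cpu-idle below 30 for 3 consecutive minutes) can be expressed as a small check over per-minute readings. This is a minimal sketch for illustration only; the `alert_triggered` function and the one-sample-per-minute assumption are hypothetical and do not describe how the monitoring system actually samples data.

```python
def alert_triggered(samples, threshold, duration, below=True):
    """Check whether the last `duration` readings all violate the threshold.

    `samples` holds one cpu-idle reading per minute, newest last
    (an assumption made for this sketch).
    """
    if len(samples) < duration:
        return False  # not enough history yet
    recent = samples[-duration:]
    if below:
        return all(value < threshold for value in recent)
    return all(value > threshold for value in recent)


# Idle drops below 30% and stays there for three minutes: vote to grow.
print(alert_triggered([50, 25, 20, 10], threshold=30, duration=3))
# A single recovery minute resets the condition.
print(alert_triggered([25, 20, 40], threshold=30, duration=3))
```

The same check with `below=False` and a threshold of 85 would model a "more than" condition.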


Need to Shrink Array

Since you don't want to pay for underutilized servers that are no longer necessary, you'll need to create a converse alert that will indicate when it is time to shrink the server array and scale-down.

Click New.  Once again give it a name, description, and then define what condition will be monitored.



You'll want this alert to trigger when a server is no longer being utilized.  Configure the alert's condition so that if CPU Idle value is more than 85% for 3 consecutive minutes, then call the 'scale-down' Alert Escalation.  If a server is only using 15% of its CPU power, it's often a good indication that it's time to scale-down and safely terminate the unused server(s) from the array.
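
Because the grow threshold (30% idle) and the shrink threshold (85% idle) do not overlap, a server in the middle range votes for nothing. The sketch below illustrates that dead band. It assumes the idle reading has already held for the 3-minute duration, and the `vote_for` function and constant names are hypothetical.

```python
GROW_BELOW_IDLE = 30    # "Need to Grow Array" threshold from this tutorial
SHRINK_ABOVE_IDLE = 85  # "Need to Shrink Array" threshold from this tutorial


def vote_for(idle_pct):
    """Map a sustained cpu-idle percentage to the action a server votes for."""
    if idle_pct < GROW_BELOW_IDLE:
        return "vote_grow_array"
    if idle_pct > SHRINK_ABOVE_IDLE:
        return "vote_shrink_array"
    return None  # between the thresholds: no vote


for idle in (10, 50, 90):
    print(idle, vote_for(idle))
```

Keeping a wide gap between the two thresholds makes it less likely that the array grows and shrinks in quick succession.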


Connect Alerts to the Server Array's ServerTemplate

In the previous step you created two new alerts and added them to the FrontEnd ServerTemplate.  Now you need to add those same alerts to the ServerTemplate that will be used to configure and launch new server instances into the array.  The goal is to monitor the same conditions across all application servers in your setup, including the FrontEnd servers in your deployment and the application servers in the server array, so that each of those servers can vote for scaling actions.  Since you've already created the correct alerts, you can import them from the FrontEnd ServerTemplate.


Go to the ServerTemplate that will be used to launch servers into your array.  Under the Alerts tab, click the Import Alert button.

Find the private FrontEnd ServerTemplate where you just created those alerts and add the "Need to Grow Array" and "Need to Shrink Array" alerts to the array's ServerTemplate. 

Note: When you import an alert, you are essentially creating a clone of the alert.  Unlike Alert Escalations, Alerts are not global.  So if you change the alert's condition on ServerTemplate A, it will not affect the alert's condition on ServerTemplate B even though it has the same name.
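
The clone behavior can be pictured with plain Python dictionaries. This is only an analogy: the dictionary layout and variable names are hypothetical, and `copy.deepcopy` stands in for the import step.

```python
import copy

# Alerts defined on the FrontEnd ServerTemplate (values from this tutorial).
frontend_alerts = {
    "Need to Grow Array": {
        "metric": "cpu-0/cpu-idle",
        "condition": "<",
        "threshold": 30,
        "duration": 3,
        "escalation": "scale-up",
    },
}

# Importing creates an independent copy on the array's ServerTemplate.
app_server_alerts = copy.deepcopy(frontend_alerts)

# Changing the imported copy leaves the FrontEnd alert untouched.
app_server_alerts["Need to Grow Array"]["threshold"] = 20
print(frontend_alerts["Need to Grow Array"]["threshold"])
print(app_server_alerts["Need to Grow Array"]["threshold"])
```

The escalation name, by contrast, is only a reference: both copies still call the same "scale-up" Alert Escalation.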


Both alerts are now attached to both of the ServerTemplates.  Whenever a new server is launched and added to the application server array, it will have the same alerts as the FrontEnds, so all servers will have the ability to vote for a scaling action.

You just set up a scalable 4-server Deployment that is configured to grow and shrink its associated Server Array when the application servers become overworked. 

Transition to a Scalable 6+ Server Setup

As your site continues to grow, you may need to consider transitioning to a 3-tiered architecture with two dedicated load balancers, where the entire bandwidth/CPU/memory of each FrontEnd is used for load balancing only and the FrontEnds no longer serve the application.  If you typically need only around 6 servers to handle your load 90% of the time, you might want to let the FrontEnds continue to serve your application.  However, if you anticipate a large scaling event in the future or if you plan to manage more than these 6 servers on a regular basis, you might want to migrate from the standard four-server setup.  See Transition from a 4-Server to 6+ Server Setup.

Last modified
21:17, 16 May 2013






© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.