This tutorial sets up a scalable, alert-based Server Array that autoscales based on Alert Escalations. This approach is used for horizontal scaling.
Warning: The ability to set up autoscaling using Alert Escalations will eventually be deprecated. Please use the Set up Autoscaling using Voting Tags tutorial for autoscaling purposes.
After you set up your production deployment, you may already be thinking "How do I set up my deployment to autoscale?"
This tutorial shows you how to change a basic 4-instance setup into a scalable deployment with an application server array that scales up and down based on alert conditions and scaling parameters that you define. In order to create a scalable deployment, you need to define an alert and alert escalation, and then specify the appropriate ServerTemplate and array that you want to scale. An alert (previously called "alert specification") is a notification that a problematic condition occurred in your deployment. An alert escalation performs an action or set of actions in response to the triggered alert condition. Together, they can be used to configure your deployment to automatically take action on your behalf.
For example, if your site is experiencing a large amount of traffic, you can set up an alert and escalation to grow or "scale" your deployment, where more server resources are added to your server array in order to handle the increased bandwidth requirements. In this way, you can set up your website to automatically scale based on particular alerts and escalations that you define.
The basic setup has two front ends (load balancer + app server), as well as a master and slave database.
When additional server resources are needed in a scalable setup, new servers are launched and added to the server array. When those resources are no longer needed, the application servers can be terminated and removed from the server array (shrink) while the basic four-server setup stays intact. Autoscaling is especially useful for ensuring that your deployment can easily scale up whenever extra server resources are needed, whether that's tomorrow, next week, next year, or perhaps never. At least you know your deployment is set up to take advantage of one of the key benefits of cloud computing - launching additional server resources on-demand.
Similarly, you could also transition into a three-tiered, scalable 6-server setup with two servers that act as dedicated load balancers. See the Transition to a Scalable 6+ Server Setup section below.
Note: If you ran the E2E Gaming Deployment macro, you can skip this step and proceed to the next step, Create Alert Escalations. (The macro creates the Server Array for you, but does not configure it for Alerts and autoscaling.) Tip: If you are new to Server Arrays, read through this step for educational purposes even if you don't have to perform it.
Go to Manage > Arrays. Click New Array and assign it to the desired cloud and deployment. For this example, since our deployment is in AWS US-East, we will choose the same region for our server array. Server arrays are cloud and AWS region-specific.
The next step is to configure the array and define how the array will be scaled.
Be sure to enable (activate) the array that you just created. Even if you attach an array to a Deployment, the alerts and escalations will not be able to cause a scaling action unless the array is active. If you are adding a server array to a live deployment, you might want to launch a test server in the server array before making it active. See Test a Server Array.
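The effect of the active flag can be pictured with a minimal Python sketch. It is an illustrative model only, not RightScale's implementation: the ServerArray class and the min_count, max_count, and resize_by names are hypothetical, chosen here to show an array that ignores scaling decisions until it is active and that stays within the bounds you configure.

```python
from dataclasses import dataclass


@dataclass
class ServerArray:
    # Hypothetical model of an array; field names are assumptions for illustration.
    active: bool
    min_count: int
    max_count: int
    running: int = 0

    def apply(self, decision: str, resize_by: int = 1) -> int:
        """Apply a "grow" or "shrink" decision and return the new instance count."""
        if not self.active:
            # An inactive array ignores scaling decisions entirely.
            return self.running
        if decision == "grow":
            self.running = min(self.max_count, self.running + resize_by)
        elif decision == "shrink":
            self.running = max(self.min_count, self.running - resize_by)
        return self.running


array = ServerArray(active=False, min_count=0, max_count=4, running=1)
print(array.apply("grow"))      # array is inactive, so the count is unchanged
array.active = True
print(array.apply("grow", 2))   # now the array grows, capped at max_count
```

In this model the basic four-server setup lives outside the array, so shrinking to min_count never touches it.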
The next step is to define the action(s) to be taken when an alert is triggered. An escalation can have one or more actions. We will create two escalations, "scale-up" and "scale-down", each containing the appropriate scaling action.
Go to Design > Alert Escalations > New.
Enter a Name and a Description for the new Alert Escalation. Next, you'll need to associate the Alert Escalation with a Deployment. You can either associate it with "All Deployments" (default) or make it available only to a specific Deployment. If you have followed the other tutorials, attach this Alert Escalation to your "Production" Deployment and click Save.
Note: You will need to remember the name of the new escalation ("scale-up") when you create an alert in the next step.
Now that you've created a new escalation, you'll need to define what action should be taken when the alert is triggered. In this example, we want to add server resources to the application server array (i.e., "grow the array").
Click on the Actions tab.
You'll notice that we've pre-defined several common actions that can be associated with escalations. See Valid Actions for Alert Escalations for complete descriptions of each action.
Select the “vote_grow_array” action from the dropdown and click Add.
Note: For each Alert Escalation, you can add a single action or define a sequence of multiple actions. For example, you might want to send an email to your system administrator before or during a scaling action. To simplify this tutorial we will only use a single action.
If more than one action is associated with an Alert Escalation, you can control how soon (in minutes) the next action will take effect. Since we only have one action, you can leave the "Escalate after" field blank. If the RightScale account has multiple server arrays, you will need to select which server array the alert escalation should take effect on. Click Save.
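As a rough mental model, an escalation is an ordered list of actions, and "Escalate after" is the delay before the next action in the list fires. The Python sketch below illustrates that reading. The Action and Escalation classes, the schedule method, and the "send_email" and "notify-then-grow" names are hypothetical, used only for illustration; only "vote_grow_array" and "scale-up" come from this tutorial.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Action:
    name: str                              # e.g. "vote_grow_array"
    escalate_after: Optional[int] = None   # minutes until the next action; None = left blank


@dataclass
class Escalation:
    name: str
    deployment: str
    actions: List[Action] = field(default_factory=list)

    def schedule(self) -> List[Tuple[int, str]]:
        """Return (minute_offset, action_name) pairs in firing order."""
        offset = 0
        plan = []
        for action in self.actions:
            plan.append((offset, action.name))
            offset += action.escalate_after or 0
        return plan


# The single-action escalation built in this tutorial.
scale_up = Escalation("scale-up", "Production", [Action("vote_grow_array")])

# A hypothetical two-action escalation: notify first, then vote five minutes later.
notify_then_grow = Escalation(
    "notify-then-grow",
    "Production",
    [Action("send_email", escalate_after=5), Action("vote_grow_array")],
)

print(scale_up.schedule())
print(notify_then_grow.schedule())
```

With a single action the delay is never consulted, which is why the field can stay blank in this tutorial.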
Now that we've created an Alert Escalation to grow the server array, we need to create another escalation to shrink the server array.
Click the New button to create another Alert Escalation.
Once again you will need to define the new alert escalation. Be sure to choose the same Deployment as the 'scale-up' Alert Escalation. Click Save.
Now define the action(s) that should be taken when the 'scale-down' Alert Escalation is called by a triggered alert.
Click on the Actions tab.
This time select the “vote_shrink_array” action from the dropdown bar and click Add. You can leave the "Escalate after" field blank for this action as well. Click Save.
Now that we've set up Alert Escalations to grow and shrink the server array, let's configure the alerts that will call these escalations when they're triggered.
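Next you need to define the alerts. You will need to create two alerts: "Need to Grow Array", which calls the 'scale-up' Alert Escalation, and "Need to Shrink Array", which calls the 'scale-down' Alert Escalation.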
Alerts can be associated to a ServerTemplate, Server, or Server Array. You can either create a new alert or import an alert from another source. Remember, you will need to attach the alert to each server that you want to monitor for your alert's conditions, so that each server can "vote" when one of the alerts is triggered.
In this example, we want to make sure the alerts will be configured on all "application" servers. Since the FrontEnd server also acts as an application server, we'll need to add the alerts to both ServerTemplates.
First, let's create and add the grow and shrink alerts to the FrontEnd ServerTemplate. Go to the Alerts tab of the FrontEnd ServerTemplate and click New. In this tutorial we will create our own alerts instead of importing existing ones. (Note: The existing Alerts are practical, but during the setup/test phase of your Alerts and autoscaling, you will often want to change the numbers so that the array scales faster. For example, why wait for 60 minutes of busy CPU during a test? Waiting for just a few minutes before voting to grow will speed up each test cycle.)
When you create an alert you will need to give it a name, description, and then define what condition will be monitored. First, let's configure an alert that will get triggered when a server is getting overworked. We'll configure this first alert to trigger when the server's CPU Idle value is less than 30% for 3 minutes. If a server is at over 70% capacity, it may be a good time to scale-up and launch additional servers to absorb some of the load.
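The "less than 30% for 3 minutes" condition can be read as a sustained-threshold check. The following Python sketch shows one way to evaluate it, assuming one CPU Idle sample per minute; the function name and the sample data are hypothetical, and the actual sampling interval used by the monitoring system is not stated in this tutorial.

```python
from typing import Sequence


def idle_below_for(samples: Sequence[float], threshold: float, minutes: int) -> bool:
    """Return True if the most recent `minutes` samples are all below `threshold`.

    Assumes `samples` holds one CPU Idle percentage per minute, oldest first.
    """
    if len(samples) < minutes:
        return False  # not enough history to say the condition was sustained
    return all(value < threshold for value in samples[-minutes:])


cpu_idle = [55.0, 41.0, 28.0, 22.0, 19.0]  # made-up readings, one per minute
print(idle_below_for(cpu_idle, threshold=30.0, minutes=3))  # True: 28, 22, 19 are all < 30
```

A single reading above the threshold inside the window resets the condition, which is what keeps a brief CPU spike from triggering a scale-up.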
Since you don't want to pay for underutilized servers that are no longer necessary, you'll need to create a converse alert that will indicate when it is time to shrink the server array and scale-down.
Click New. Once again give it a name, description, and then define what condition will be monitored.
You'll want this alert to trigger when a server is no longer being utilized. Configure the alert's condition so that if CPU Idle value is more than 85% for 3 consecutive minutes, then call the 'scale-down' Alert Escalation. If a server is only using 15% of its CPU power, it's often a good indication that it's time to scale-down and safely terminate the unused server(s) from the array.
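Taken together, the two thresholds leave a band between 30% and 85% CPU Idle in which a server votes for neither action. The sketch below makes that band explicit. It is a simplified illustration: the vote_for function is hypothetical, and it assumes the reading passed in has already been sustained for the required 3 minutes.

```python
def vote_for(cpu_idle_percent: float,
             grow_below: float = 30.0,
             shrink_above: float = 85.0) -> str:
    """Map a sustained CPU Idle reading to "grow", "shrink", or "none"."""
    if cpu_idle_percent < grow_below:
        return "grow"      # server is more than 70% busy
    if cpu_idle_percent > shrink_above:
        return "shrink"    # server is using less than 15% of its CPU
    return "none"          # inside the band: no alert is triggered


for reading in (20.0, 50.0, 90.0):
    print(reading, vote_for(reading))
```

Keeping the two thresholds well apart reduces the chance that the array grows and then immediately shrinks again under a steady load.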
In the previous step you created two new alerts and added them to the FrontEnd ServerTemplate. Now you need to add those same alerts to the ServerTemplate that will be used to configure and launch new server instances into the array. You want to monitor the same conditions across all application servers in your setup, including the FrontEnd servers in your deployment and the application servers in the server array, so that each of those servers can vote for scaling actions. Since you've already created the correct alerts, you can import them from the FrontEnd ServerTemplate.
Go to the ServerTemplate that will be used to launch servers into your array. Under the Alerts tab, click the Import Alert button.
Find the private FrontEnd ServerTemplate where you just created those alerts and add the "Need to Grow Array" and "Need to Shrink Array" alerts to the array's ServerTemplate.
Note: When you import an alert, you are essentially creating a clone of the alert. Unlike Alert Escalations, Alerts are not global. So if you change the alert's condition on the ServerTemplate A, it will not affect the alert's condition on Template B even though it has the same name.
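The clone behavior can be illustrated with a short Python sketch. The Alert class and its field names are hypothetical and do not reflect how RightScale stores alerts; the point is only that an imported alert is an independent copy, so editing one copy leaves the other unchanged.

```python
import copy
from dataclasses import dataclass


@dataclass
class Alert:
    # Hypothetical representation of an alert, for illustration only.
    name: str
    metric: str
    comparison: str
    threshold: float
    duration_minutes: int
    escalation: str


frontend_alerts = {
    "Need to Grow Array": Alert("Need to Grow Array", "cpu_idle", "<", 30.0, 3, "scale-up"),
}

# "Importing" creates independent copies on the array's ServerTemplate.
array_alerts = {name: copy.deepcopy(alert) for name, alert in frontend_alerts.items()}

# Changing the imported copy does not touch the original.
array_alerts["Need to Grow Array"].threshold = 20.0
print(frontend_alerts["Need to Grow Array"].threshold)  # 30.0
print(array_alerts["Need to Grow Array"].threshold)     # 20.0
```

If you later tune a threshold, remember to make the same change on every ServerTemplate that carries a copy of the alert.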
Both alerts are now attached to both of the ServerTemplates. Whenever a new server is launched and added to the application server array, it will have the same alerts as the FrontEnds, so all servers will have the ability to vote for a scaling action.
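One way to picture how individual votes turn into a scaling action is a simple tally across all voting servers. The sketch below is an assumption-laden illustration, not RightScale's algorithm: the array_decision function, the server names, and the decision_threshold parameter (here 51% of servers) are all hypothetical stand-ins for the scaling parameters you define on the array.

```python
from typing import Dict


def array_decision(votes: Dict[str, str], decision_threshold: float = 0.51) -> str:
    """Tally per-server votes ("grow", "shrink", or "none") into one decision.

    `decision_threshold` is an assumed parameter: the fraction of servers that
    must agree before the array acts.
    """
    total = len(votes)
    if total == 0:
        return "none"
    grow = sum(1 for vote in votes.values() if vote == "grow")
    shrink = sum(1 for vote in votes.values() if vote == "shrink")
    if grow / total >= decision_threshold:
        return "grow"
    if shrink / total >= decision_threshold:
        return "shrink"
    return "none"


votes = {"frontend-1": "grow", "frontend-2": "grow", "app-1": "none"}
print(array_decision(votes))  # grow: 2 of 3 servers voted to grow
```

Requiring agreement from a share of the servers keeps one overloaded instance from resizing the whole array on its own.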
You just set up a scalable 4-server Deployment that is configured to grow its associated Server Array when the application servers become overworked and to shrink it when they are underutilized.
As your site continues to grow, you may need to consider transitioning to a 3-tiered architecture with two dedicated load balancers. In that architecture, the entire bandwidth, CPU, and memory of each FrontEnd is used for load balancing only, so the FrontEnd is no longer responsible for also serving the application. If you typically need only around 6 servers to handle your load 90% of the time, you might want to let the FrontEnds continue to serve your application. However, if you anticipate a large scaling event in the future, or if you plan to manage more than these 6 servers on a regular basis, you might want to migrate from the standard four-server setup. See Transition from a 4-Server to 6+ Server Setup.
© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.