This section describes the end-to-end flow of a job through the RightScale Grid framework, highlighting the components that RightScale supplies and automates, as well as the components and responsibilities that remain with the end user.
The overall architecture and workflow of a grid processing system implemented with the RightScale Grid framework is shown in Figure 1 below.
Figure 1 – RightScale Grid architecture and workflow
In this figure, everything shown within the box is under the control of the RightScale Cloud Management Platform. This includes managing the AWS infrastructure components, the worker processes, and the elasticity metrics and controls. The user-supplied components (shown in green) are limited to the Job Producer, the input files needed by a job, the worker code to execute the job, and finally an (optional) Job Consumer process that is responsible for any post-processing or output file manipulation that may be required. Although these Job Producer and Job Consumer processes are shown in Figure 1 as external to the RightScale platform, they can be (and often are) run under control of the platform as well. As a result, you can have one setup that serves as an entire end-to-end grid processing solution managed within a single interface.
Before a workflow is created, the four SQS queues shown in Figure 1 (input, output, error, and audit) must exist. The RightScale platform provides a predefined macro, the RightScale Grid macro, that assists with this and other system configuration tasks. Running the macro from the RightScale dashboard creates all four queues automatically.
The workflow is initiated by the Job Producer. This can be an application running within the cloud (and under management of the RightScale platform), or it can run in a customer or hosting provider infrastructure. The Job Producer first creates the input file(s) required by the job. It then constructs the work unit data structure, which holds details about the inputs and the work to be performed. Next, it uploads the input files to S3 and creates a message data structure that contains the work unit information along with the location of the input file(s). Finally, the Job Producer inserts this new message into the SQS Input Queue. This process is repeated for every new job introduced into the workflow.
The Job Producer application can take virtually any form: a PHP script, a compiled application, a Ruby script, etc. The RightScale Grid macro mentioned previously generates a Job Producer Ruby script that can be modified and used in new applications, or used as a model or template for developing a new Job Producer. The macro creates a single Job Producer for simplicity of illustration, but multiple Job Producers can run simultaneously, feeding the same input queue (or even multiple input queues in more complex scenarios).
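The Job Producer steps described above can be sketched in a few lines. This is only an illustration, not the Ruby script generated by the RightScale Grid macro: `InMemoryBucket` and `InMemoryQueue` are hypothetical stand-ins for S3 and the SQS Input Queue, and the message field names (`work_unit`, `input_files`, and so on) are assumptions rather than the framework's actual message format.

```python
import json
import uuid
from dataclasses import dataclass, field


@dataclass
class InMemoryBucket:
    """Hypothetical stand-in for an S3 bucket: maps object keys to bytes."""
    name: str
    objects: dict = field(default_factory=dict)

    def upload(self, key, data):
        self.objects[key] = data
        return f"s3://{self.name}/{key}"


@dataclass
class InMemoryQueue:
    """Hypothetical stand-in for the SQS Input Queue."""
    messages: list = field(default_factory=list)

    def send(self, body):
        self.messages.append(body)


def submit_job(bucket, input_queue, worker_name, input_files, parameters):
    """Upload the input files, then enqueue one message describing the job."""
    job_id = str(uuid.uuid4())

    # Work unit: which worker code to run and with what settings
    # (field names are assumptions made for this sketch).
    work_unit = {"job_id": job_id, "worker": worker_name, "parameters": parameters}

    # Upload every input file and remember where each one landed.
    locations = [
        bucket.upload(f"{job_id}/input/{filename}", data)
        for filename, data in input_files.items()
    ]

    # The message wraps the work unit together with the input file locations.
    message = {"work_unit": work_unit, "input_files": locations}
    input_queue.send(json.dumps(message))
    return job_id
```

Calling `submit_job` once per job mirrors the "repeated for every new job" behavior; several producers could share the same queue object in the same way that multiple Job Producers can feed one input queue.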
The RightScale platform contains an Elasticity Daemon whose function is to monitor the input queues and launch worker instances to process the jobs in them. Different scaling metrics can be used to determine how many worker instances to launch and when to launch them. The most common metric is the number of jobs in the queue: within the RightScale dashboard, the user can specify that a worker instance should be launched for every N jobs in the input queue. The second metric is the length of time that jobs have been in the queue, which is useful when particular Service Level Agreements (SLAs) need to be met. Using this time-based metric, the user may specify that a worker instance should be launched when the average time a job spends in the queue exceeds a certain threshold. Similarly, when the maximum time that any job has been in the queue exceeds a specified value, a worker instance is launched to assist in processing the load. (These averages and maximums are calculated using a random sampling of 10 jobs in the input queue.)
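The two kinds of scaling metric described above can be illustrated with a short sketch. It is not the Elasticity Daemon's actual logic: the function names are hypothetical, wait times are assumed to be in seconds, and rounding down in the queue-depth rule is an assumption, since the text does not say how partial multiples of N are handled.

```python
import random


def workers_for_queue_depth(queue_length, jobs_per_worker):
    """Queue-size metric: one worker per `jobs_per_worker` queued jobs.

    Rounds down (assumption): 25 jobs at N=10 asks for 2 workers.
    """
    if jobs_per_worker <= 0:
        raise ValueError("jobs_per_worker must be positive")
    return queue_length // jobs_per_worker


def needs_worker_for_age(wait_times, max_average=None, max_single=None,
                         sample_size=10, rng=random):
    """Time-based metric: inspect a random sample of job wait times.

    Returns True when the sampled average exceeds `max_average` or the
    sampled maximum exceeds `max_single`. Either threshold may be None.
    """
    # Sample at most `sample_size` jobs, as the text describes (10 by default).
    sample = rng.sample(list(wait_times), min(sample_size, len(wait_times)))
    if not sample:
        return False

    average = sum(sample) / len(sample)
    if max_average is not None and average > max_average:
        return True
    if max_single is not None and max(sample) > max_single:
        return True
    return False
```

Because the time-based check looks only at a sample, a single very old job can be missed on one pass when the queue holds more than `sample_size` jobs; that is a property of sampling in general, not something the source states about the daemon.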
Once the Elasticity Daemon has determined that additional worker instances are required, a call is made within the RightScale platform to allocate these server resources (illustrated by the orange blocks in Figure 1 containing the Worker Daemon and Worker Code). The worker instances are launched in a server array, which is controlled via configuration settings in the RightScale dashboard. These settings specify the minimum and maximum number of worker instances. The minimum ensures there are always one or more instances available to process any jobs inserted into the queue, and the maximum caps spending in the case of unplanned or out-of-control input growth.
Another feature of the RightScale platform that plays a major role in this phase of the workflow is the ServerTemplate. Prior to launching a grid computing application, a ServerTemplate for the worker instances is created. It defines the instance-specific details, such as the size of the instance, the image to use as the base operating system, and the region and availability zone in which the instance should be launched, along with other configuration information. This ServerTemplate can be created manually, or automatically by running the RightScale Grid macro. The ServerTemplate also specifies the technology stack required on the instance (Ruby, Perl, PHP, etc.) and installs these tools as well as the worker code. The worker code is downloaded from an SVN repository and installed on the worker instance by a script run at the end of the instance’s boot cycle. Alternatively, the worker code can be included as an attachment to the ServerTemplate, but this method is less flexible than downloading from a repository and is not considered a best practice.
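The effect of the server array's minimum and maximum settings can be shown with a small sketch. The names `ArrayBounds` and `instances_to_launch` are hypothetical and do not correspond to any RightScale API; the sketch only demonstrates how a desired worker count would be clamped to the configured range before any launch takes place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayBounds:
    """Hypothetical model of a server array's size limits."""
    min_instances: int
    max_instances: int


def instances_to_launch(desired, running, bounds):
    """Return how many new workers to start, honoring the array limits."""
    # Clamp the desired count into [min_instances, max_instances].
    target = max(bounds.min_instances, min(desired, bounds.max_instances))
    # Only launch the shortfall; this sketch never terminates instances.
    return max(0, target - running)
```

With bounds of 1 and 8, a scaling metric that asks for 12 workers while 3 are running results in 5 launches, and an empty queue with nothing running still results in 1 launch so that a worker is always available.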
In addition to the user-specified tools and worker code, the RightScale Grid Worker Daemon is also installed on each worker instance. This worker daemon is responsible for managing the workflow on the instance itself, executing the worker code and uploading output files to S3 to be processed by the Job Consumer. Numerous functions are performed by the RightScale Grid Worker Daemon and the worker code, but they can be generally categorized as follows:
At this stage in the workflow, the job has been processed by the worker code and the resulting data structure has been passed to the RightScale Grid Worker Daemon, which signals to the daemon that the results-processing portion of the workflow can begin. The Worker Daemon then performs the following steps: