Asset Runbook for Grid Engine On Demand Solution

This page describes all of the scripts, inputs, and operational procedures required to run the solution.

Macros

Macro:  Univa - Create UGE Cluster

Description:  Creates a Deployment which is set up for use as a Univa Grid Engine (UGE) cluster. Each independent Deployment created by this Macro can be operated as an independent cluster.

How to Use:

To begin, subscribe to the Macro (Design -> MultiCloud Marketplace -> Macros).  Once you've got a copy of the Macro in your local collection you can run it.

IMPORTANT NOTE! When asked whether you wish to re-use a Security Group and/or SSH Key, we normally recommend leaving both fields blank and proceeding without specifying an existing group or key. The Macro will then prompt you to create a new security group and SSH key with appropriate values.

To keep clusters independent, it is usually NOT a good idea to re-use these items. If you do re-use an existing key or security group, select the proper values and ENSURE that they are correctly defined; otherwise, you may run into problems.

Click on the ‘Run’ button to begin executing the Macro. You will be prompted for several items which define your cluster. You may accept the default values provided, or change them if you wish. You will be asked to specify:

  • Deployment name. This will be the name of the Deployment which is visible from your dashboard (Manage -> Deployments). This Deployment must not already exist, or the Macro will display an error.
  • AWS Security Group name. This security group will be created and set up. If you chose to re-use a security group before running the Macro, this value will be ignored. Otherwise, the security group must not already exist, or the Macro will display an error.
  • SSH keypair name. This keypair will be created. If you chose to re-use a keypair before running the Macro, this value will be ignored. Otherwise, the keypair must not already exist, or the Macro will display an error.
  • Number of execution hosts. Your Deployment will have one Server per execution host.
  • Instance type for the queue master. The supported instance types are displayed and you are asked to make a choice. The meaning of the instance types is available at http://aws.amazon.com/ec2/instance-types. See below for more information.
  • Instance type for the execution hosts. Consider the workload you will be running on your cluster (CPU-intensive, memory-intensive, etc.) when selecting a type.

Instance types affect both the processing power and cost associated with your cluster. In general, your queue master can be a less-powerful instance type, and your execution hosts should be either the same or a more powerful instance type. If you are creating very large clusters (more than 20 nodes), consider using a more powerful instance type for your queue master to compensate.

When the Macro completes, your new Deployment will consist of a Server called “Queue Master” along with individually numbered “Execution Host” Servers. Congratulations! Your new UGE cluster has been created and is ready to launch.

To start your new cluster, go to the Deployment in your RightScale dashboard (Manage -> Deployments).

  1. Go to the Deployment's Servers tab.
  2. Below the “Actions” column, choose the Select: ‘All Shown’ link.
  3. Ensure the action dropdown is set to ‘Launch’ and click ‘Apply to selected’.
  4. Confirm the action by selecting ‘Yes’ in the pop-up dialog.

All inputs needed by the scripts are pre-populated with the proper credentials and information. This information is automatically configured for your Deployment. You can simply accept these values as-is and click “Launch” on the Launch Request screen.

To increase the number of Servers in your cluster, simply clone an existing Execution Host server. To reduce the number of Servers in your cluster, either terminate and delete an Execution Host server, or launch only some of the Execution Hosts.

Note: You should not clone the Queue Master server or change its nickname.

Scripts and Inputs

Boot Scripts

The following RightScripts are used to bring servers into an operational state. These scripts are not meant to be run manually, as they perform software installation and configuration.

RightScript:  FUSE+S3FS Installer for CentOS 5.4

Description:  Installs the S3FS filesystem software on a server. The FUSE software distributed with CentOS is replaced with a newer version (required to support S3FS). S3FS allows you to mount an S3 bucket as a filesystem, which in turn means you don’t have to change applications to store output data on S3 storage.

Required Inputs and Default Settings:

  • AWS_ACCESS_KEY_ID - Amazon credential. See FAQ 13.
  • AWS_SECRET_ACCESS_KEY - Amazon credential. See FAQ 13.

AWS credentials are only used to read files from S3 and are not stored after the script exits.

RightScript:  Tortuga Core Installer

Description:  Installs core Tortuga software on a server. The Tortuga software is part of the UniCloud management software and is required to manage a UGE cluster.

Required Inputs and Default Settings:

  • AWS_ACCESS_KEY_ID - Amazon credential. See FAQ 13.
  • AWS_SECRET_ACCESS_KEY - Amazon credential. See FAQ 13.
  • MASTER_IP – The private IP address of the master node (automatically inserted)

AWS credentials are only used to read files from S3 and are not stored after the script exits.

RightScript:  Univa Grid Engine Kit Installer

Description:  Installs and configures the Univa Grid Engine software on a cluster.

Required Inputs and Default Settings:

  • AWS_ACCESS_KEY_ID - Amazon credential. See FAQ 13.
  • AWS_SECRET_ACCESS_KEY - Amazon credential. See FAQ 13.

AWS credentials are only used to read files from S3 and are not stored after the script exits.

RightScript:  UniCloud Client Setup

Description:  Configures a compute node server for use by the UGE software.

Required Inputs and Default Settings:

  • AWS_ACCESS_KEY_ID - Amazon credential. See FAQ 13.
  • AWS_SECRET_ACCESS_KEY - Amazon credential. See FAQ 13.
  • MASTER_IP – The private IP address of the master node (automatically inserted)

AWS credentials are only used to read files from S3 and are not stored after the script exits.

RightScript:  UniCloud Installer

Description:  Installs and configures the UniCloud management software on the cluster.

Required Inputs and Default Settings:

  • AWS_ACCESS_KEY_ID - Amazon credential. See FAQ 13.
  • AWS_SECRET_ACCESS_KEY - Amazon credential. See FAQ 13.

AWS credentials are only used to read files from S3 and are not stored after the script exits.

Operational Scripts

There are currently no operational scripts.

Decommission Scripts

The following RightScripts are used when servers are decommissioned. They are not designed to be run manually.

RightScript:  UniCloud Client Teardown

Description:  Gracefully removes a cluster resource so it is no longer used by UGE.

Required Inputs and Default Settings:

  • AWS_ACCESS_KEY_ID - Amazon credential. See FAQ 13.
  • AWS_SECRET_ACCESS_KEY - Amazon credential. See FAQ 13.
  • MASTER_IP – The private IP address of the master node (automatically inserted)

AWS credentials are only used to read files from S3 and are not stored after the script exits.

Common Runbook Operations

Once you have created your HPC cluster, you are ready to submit basic jobs immediately, or alternatively, you can install custom applications and load data into the cloud.

Running a sample test job

UGE comes with several 'sample' jobs you can use to test your cluster's operation. This example uses the simple.sh job, which simply sleeps for a brief period before exiting, and requires no additional software or configuration.

  • Open an SSH console to the “Queue Master” server. From the command-line prompt, type: qsub /opt/sge/examples/jobs/simple.sh
  • Repeat the above command multiple times, if desired. You should submit more jobs than you have available compute nodes, so that you can see the jobs queue up and dispatch when resources become available.
  • Type “watch -d qstat -f”. The output shows the available cluster resources and the list of running and waiting jobs, so you can watch jobs being dispatched to your compute nodes for completion.
  • Press ^C (CTRL-C) to stop watching the output.
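
The repeated submission described above can be scripted. The following is a minimal sketch, assuming a POSIX shell on the “Queue Master” server. The function name submit_sample_jobs and the QSUB and JOB variables are illustrative only and are not part of the solution. Setting QSUB=echo prints the job path instead of submitting it.

```shell
#!/bin/sh
# Sketch: submit the bundled sample job several times in a row.
# QSUB and JOB are illustrative variables (assumptions, not solution inputs).
# Override QSUB (for example QSUB=echo) to preview without submitting.
QSUB="${QSUB:-qsub}"
JOB="${JOB:-/opt/sge/examples/jobs/simple.sh}"

submit_sample_jobs() {
    count="$1"
    i=1
    while [ "$i" -le "$count" ]; do
        # Stop at the first failed submission.
        "$QSUB" "$JOB" || return 1
        i=$((i + 1))
    done
}
```

For example, running “submit_sample_jobs 10” on a cluster with fewer than ten execution hosts should leave some jobs waiting in the queue, which makes the dispatching visible in the qstat output.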

Loading a Custom Application

UGE is capable of running any application or script you have created. There is no 'standard' application installed with UGE because it is a framework. You submit a job to run your application the same way you would run it if starting it at a command-line prompt. You are able to run scripts, including shell scripts, or binary executables.

If your application is available in a single-file archive, you may be able to use a RightScript which has been provided by RightScale to install it automatically.

  • Upload the files necessary to run your application to Amazon S3. In the RightScale Dashboard, go to Clouds -> AWS Global -> S3 Browser. From here, you can create an S3 bucket and upload a file to it.
  • Import the “WEB app s3 code checkout” RightScript to your account from the MultiCloud Marketplace (Design -> MultiCloud Marketplace -> RightScripts). The publisher of this script is RightScale.
  • Run the “WEB app s3 code checkout” RightScript under a running Server's Scripts tab as an 'Any Script' to install your application.

Note that your application may not be distributed as a single compressed file, or may have a non-standard format which cannot be installed by the standard RightScript. In this case, you will most likely want to write a custom RightScript to install your application so that it is available to you each time you start a new cluster.
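
The install step of such a custom RightScript might be structured along these lines. This is only a sketch under stated assumptions: the application is a gzip-compressed tar archive that has already been copied from S3 to local disk, and the function name install_app and its two arguments are hypothetical. How the archive is fetched is deliberately left out.

```shell
#!/bin/sh
# Sketch of an install step for a custom RightScript.
# install_app, its arguments, and any paths passed to it are hypothetical;
# the archive is assumed to be a .tar.gz file already present on local disk.
install_app() {
    archive="$1"
    dest="$2"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Create the target directory, then unpack the archive into it.
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest"
}
```

For example, “install_app /tmp/myapp.tar.gz /opt/myapp” would unpack the archive into /opt/myapp; both paths are placeholders.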

Submitting Jobs

  • Open an SSH console to the “Queue Master” server.
  • Use the “qsub” command to submit one or more jobs. This command has a large number of options to customize how your jobs are run. Please see below for documentation on how to submit jobs.
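
As an illustration of qsub options, Grid Engine reads options embedded in a job script on lines beginning with “#$”. The sketch below assumes a POSIX shell job script; the job name example_job and the function report_job are illustrative. JOB_ID and SGE_TASK_ID are environment variables that Grid Engine sets when it runs the job, and the fallback values are used only when the script is run outside the cluster.

```shell
#!/bin/sh
#$ -N example_job
#$ -cwd
#$ -S /bin/sh
#$ -j y
# -N names the job, -cwd runs it in the submission directory,
# -S selects the shell, and -j y merges stderr into stdout.

# Print which job and task this is and where it ran.
# The "none" fallbacks apply only when the variables are unset.
report_job() {
    echo "job=${JOB_ID:-none} task=${SGE_TASK_ID:-none} host=$(uname -n)"
}

report_job
```

Saved as example_job.sh, the script could be submitted with “qsub example_job.sh”, or as a ten-task array job with “qsub -t 1-10 example_job.sh”.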
Last modified
23:41, 16 May 2013

© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.