
Create a Job Producer

Objective

To summarize the steps required to create a RightGrid job producer and to provide sample code that you can use as a starting point.

Overview

The job producer does not have to be running on EC2.  It can be located anywhere on the Internet.  The code for the job producer and job consumer can be written in any programming language provided that you can upload/download data to S3 and send/receive work units from SQS queues.

[Figure: rightgrid_diagram_job_producer.gif]


A job producer performs the following tasks:

  • It breaks down data-sets into individual work units.
  • It uploads a work unit's matching input data files to a specified bucket on S3.
  • It sends a message about a work unit to the SQS Input Queue.  NOTE: Since the RightGrid daemon on a worker instance will automatically grab work units that become available in the input queue, it's important that the associated input data files for the work unit are already available on S3.
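
The ordering constraint in the note above can be sketched as follows. This is a minimal illustration, not RightGrid code: `MemoryBucket` and `MemoryQueue` are hypothetical in-memory stand-ins that only mimic the `put` and `send_message` calls used in the sample code later on this page, and `produce` is an illustrative helper name.

```ruby
require 'yaml'

# Hypothetical in-memory stand-ins for an S3 bucket and an SQS queue.
# They only record calls so that the call order can be inspected.
class MemoryBucket
  attr_reader :objects

  def initialize(log)
    @log = log
    @objects = {}
  end

  def put(key, data)
    @objects[key] = data
    @log << [:put, key]
  end
end

class MemoryQueue
  attr_reader :messages

  def initialize(log)
    @log = log
    @messages = []
  end

  def send_message(body)
    @messages << body
    @log << [:send_message, body]
  end
end

# Illustrative helper: creates one work unit per input record.
def produce(bucket, queue, bucket_name, records)
  records.each_with_index do |data, index|
    id = index + 1
    key = "in/Log#{id}.log"
    work_unit = { :id => id, :s3_download => [File.join(bucket_name, key)] }
    bucket.put(key, data)                  # 1. input data is stored first
    queue.send_message(work_unit.to_yaml)  # 2. only then is the unit announced
  end
end

CALL_LOG = []
DEMO_BUCKET = MemoryBucket.new(CALL_LOG)
DEMO_QUEUE = MemoryQueue.new(CALL_LOG)
produce(DEMO_BUCKET, DEMO_QUEUE, "example-bucket", ["first record", "second record"])
puts CALL_LOG.map { |call| call.first }.inspect
```

Because each upload is recorded before the matching message, a worker that picks up a message as soon as it appears should always find its input file in place.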

 

Steps

To create a job producer, follow the steps below.

Note:  This is considered an advanced tutorial.  The steps are summarized below rather than given as a literal step-by-step procedure, as they are in many of our beginner and intermediate tutorials.  Sample code is also provided, which you will have to adapt to your specific application and situation.

  1. Install the RightScale AWS interface gem (right_aws).
  2. Add code to upload the input data files to the specified bucket on S3.  (See '# Get S3 and SQS handle' and upload_file)
  3. Add code to generate and encode a work unit.  (See '# Generate work units')
  4. Add code to send a work unit's message to the SQS input queue.  (See enqueue_work_unit)

Because the input queue is an SQS queue, its messages are limited to 256KB, like any other SQS message. In most cases, the input queue messages contain only the work unit meta-data, while the actual input data files for the worker application are uploaded to S3.
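
A producer can check the encoded size of a work unit before sending it. The following is a minimal sketch that uses the 256KB figure stated above; `MAX_MESSAGE_BYTES` and `fits_in_message?` are illustrative names, not part of RightGrid or right_aws.

```ruby
require 'yaml'

# Size limit stated in this tutorial for input queue messages.
MAX_MESSAGE_BYTES = 256 * 1024

# Illustrative helper: true when the encoded work unit fits in one message.
def fits_in_message?(encoded_work_unit)
  encoded_work_unit.bytesize <= MAX_MESSAGE_BYTES
end

# Meta-data only: the input file is referenced by its S3 location.
small_unit = { :id => 1, :s3_download => ["example-bucket/in/Log1.log"] }.to_yaml

# Anti-pattern: embedding the input data itself in the message.
large_unit = { :id => 2, :data => "x" * (300 * 1024) }.to_yaml

puts fits_in_message?(small_unit)
puts fits_in_message?(large_unit)
```

A work unit that fails this check is a sign that its payload belongs on S3 rather than in the message.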

Sample Code

The sample code below is written in Ruby.  Use this code as a template for creating your own job producer.

 

jobproducer.rb
require 'yaml'
require 'rubygems'
require 'right_aws'

def upload_file(bucket, key, data)
  bucket.put(key, data)
end

def enqueue_work_unit(queue, work_unit)
  queue.send_message(work_unit)
end

# Load jobspec
jobspec = YAML.load_file("oneshotspec.yml")

# Get S3 and SQS handle
s3 = RightAws::S3.new(jobspec[:access_key_id], jobspec[:secret_access_key])
bucket = s3.bucket(jobspec[:bucket], false)
sqs = RightAws::SqsGen2.new(jobspec[:access_key_id], jobspec[:secret_access_key])
inqueue = sqs.queue(jobspec[:inputqueue], false)

# Generate work units
(1..jobspec[:number_of_units]).each do |id|
  puts "Generating work unit #{id}"
  filename = "in/Log#{id}.log"
  text = "HelloWorld!"

  work_unit = {
    :created_at => Time.now.utc.strftime('%Y-%m-%d %H:%M:%S %Z'),
    :s3_download => [File.join(jobspec[:bucket], filename)],
    :worker_name => jobspec[:worker_name],
    :id => id,
  }

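  # Encode the work unit as YAML for the queue message
  wu_yaml = work_unit.to_yaml
  # Upload the input file first so that it is on S3 before a worker sees the message
  upload_file(bucket, filename, text)
  enqueue_work_unit(inqueue, wu_yaml)
  puts wu_yaml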
end

 

oneshotspec.yml

---
:name: OneshotJob
:worker_name: RGHelloWorld
:number_of_units: 5000
:bucket: dw_rightgrid_demo
:inputqueue: RG-Inputs
:outputqueue: RG-Outputs
:access_key_id: <AWS_ACCESS_KEY>
:secret_access_key: <AWS_SECRET_ACCESS_KEY>
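
Since jobproducer.rb reads several keys from the jobspec, it can help to check that they are present before generating work units. The following is a minimal sketch; `REQUIRED_KEYS` and `missing_keys` are illustrative names, and the inline spec is a shortened example rather than the full file shown above.

```ruby
require 'yaml'

# Keys that jobproducer.rb reads from the jobspec.
REQUIRED_KEYS = [:worker_name, :number_of_units, :bucket, :inputqueue,
                 :access_key_id, :secret_access_key]

# Illustrative helper: returns the required keys that are missing or empty.
def missing_keys(jobspec)
  REQUIRED_KEYS.select { |key| jobspec[key].nil? || jobspec[key].to_s.empty? }
end

# Inline example document; in practice this would come from oneshotspec.yml.
EXAMPLE_SPEC = YAML.load(<<-SPEC)
---
:worker_name: RGHelloWorld
:number_of_units: 5
:bucket: example-bucket
SPEC

puts missing_keys(EXAMPLE_SPEC).inspect
```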
 
Notice that the work unit is encoded in the YAML format (work_unit.to_yaml) before it is sent to the input queue.  We recommend using YAML; if you use a different format, you will have to create an encoder for the message.
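
The YAML round trip can be sketched as follows. This only illustrates that a work unit encoded with `to_yaml` decodes back to the same hash; `decode_work_unit` is a hypothetical helper name, and how the RightGrid daemon actually decodes messages is not shown here.

```ruby
require 'yaml'

# Hypothetical helper: turns a queue message body back into a work unit hash.
def decode_work_unit(message_body)
  YAML.load(message_body)
end

work_unit = {
  :s3_download => ["example-bucket/in/Log1.log"],
  :worker_name => "RGHelloWorld",
  :id => 1,
}

encoded = work_unit.to_yaml
decoded = decode_work_unit(encoded)

puts encoded
puts decoded == work_unit
```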
Last modified
21:18, 16 May 2013

© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.