The building blocks for describing the sequencing are the sub and concurrent expressions. All sub-expressions of a sub expression execute sequentially, while all sub-expressions of a concurrent expression execute concurrently. sub expressions can be nested inside concurrent expressions and vice versa, providing the means for describing what runs concurrently and when to synchronize.
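The nesting described above can be modeled outside RCL. The following is a minimal Python sketch, not RCL and not how the engine is implemented: `run_sub`, `run_concurrent`, and `record` are hypothetical helpers that stand in for a sub expression, a bare concurrent expression, and an arbitrary expression respectively.

```python
import threading


def run_sub(*steps):
    """Run the steps one after another, like the sub-expressions of a sub."""
    for step in steps:
        step()


def run_concurrent(*branches):
    """Run each branch in its own thread and block until all of them finish,
    like a bare concurrent expression."""
    threads = [threading.Thread(target=branch) for branch in branches]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


log = []
lock = threading.Lock()


def record(label):
    """Build a step that appends its label to the shared log."""
    def step():
        with lock:
            log.append(label)
    return step


# A sequence containing a concurrent block that itself contains two sequences.
run_sub(
    record("setup"),
    lambda: run_concurrent(
        lambda: run_sub(record("a1"), record("a2")),
        lambda: run_sub(record("b1"), record("b2")),
    ),
    record("teardown"),
)
```

In this model "setup" always comes first and "teardown" always comes last, "a1" precedes "a2" and "b1" precedes "b2", but the relative order of the a and b steps is not fixed.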
All activities taking place in a process occur in tasks. Each sub-expression of a concurrent expression runs in its own task. Processes start with one task: the main task. The task_name attribute can be used to adorn sub-expressions of the concurrent expression to name tasks. This attribute can be specified on any expression (changing the name of the current task each time). However, the convention is to adorn the outer sub expression if there is one. That name can then be used to pause, resume, cancel, abort, or wait for that task:
The snippet above shows two recurring tasks. The idea is that one of the tasks needs to be controlled by the other. In this example, when an error occurs in the second task, the maintain_application task needs to be paused and later resumed. The controlling task accomplishes this using the pause_task and resume_task keywords respectively.
Tasks can be referred to using two different names. The local name (used in the example above) is the name given with the task_name attribute. It can only be used to refer to a sibling task, that is, a task launched by the same parent task as the task using the name. The other way to address a task is to use its global name: this name is built by recursively joining the names of the parent tasks (excluding the main task) with the current task name, using / as the separator:
Tasks that are not explicitly named using the task_name attribute get assigned a unique name by the engine. The task_name() function (functions are covered in the next section) returns the global name of the current task.
Note how quotes around task names are optional, but strings and variables containing strings can be used wherever task names can:
As mentioned earlier, tasks can be paused, resumed, canceled, or aborted. The respective keywords are pause_task, resume_task, cancel_task and abort_task. Each of these is optionally followed by a task name. If no task name is given, the action applies to the current task.
Note that tasks can only be paused between expressions. If a request to pause a task is made while the task is running an expression, the task keeps running until the expression is done and control returns to the engine proper. So if an expression takes a long time to execute, such as running an action on a collection containing many resources, pausing the task won't suspend the execution of that expression: the action will keep running on all resources. Only when that expression is done will the task be paused.
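The "pause only between expressions" rule can be illustrated with a small model. This is a Python sketch under stated assumptions, not the engine's implementation: the `Task` class, its `request_pause` and `resume` methods, and the server names are all hypothetical.

```python
class Task:
    """Toy task that honors pause requests only between expressions."""

    def __init__(self, expressions):
        self.expressions = list(expressions)
        self.next_index = 0
        self.pause_requested = False
        self.paused = False

    def request_pause(self):
        self.pause_requested = True

    def resume(self):
        self.pause_requested = False
        self.paused = False

    def run(self):
        """Run expressions until none are left or a pause takes effect."""
        while self.next_index < len(self.expressions):
            # The pause flag is only checked here, between expressions.
            if self.pause_requested:
                self.paused = True
                return
            expression = self.expressions[self.next_index]
            self.next_index += 1
            expression()  # runs to completion even if a pause arrives meanwhile


log = []


def long_action():
    for resource in ["server-1", "server-2", "server-3"]:
        if resource == "server-1":
            task.request_pause()  # the pause request arrives mid-expression
        log.append(f"acted on {resource}")


task = Task([long_action, lambda: log.append("next expression")])
task.run()
log_when_paused = list(log)
was_paused = task.paused

task.resume()
task.run()
```

Although the pause is requested while the first resource is being handled, the action still runs on all three resources. The task only stops before "next expression", and continues once resumed.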
A task that is paused prevents the process from completing. All tasks must finish for the whole process to complete successfully.
A process can also be paused, resumed, canceled, or aborted in its entirety using the pause, resume, cancel, or abort keyword respectively. Executing any of these has the same effect as executing the corresponding task version in all currently running tasks. In particular, this means that pausing, canceling, or aborting a process will take effect once all tasks have finished running their current expression. The exact behavior of canceling and aborting is described below in the Ending Cloud Workflow processes section.
Note that (global) task names can be used in any expression; even expressions that are not sub-expressions of the concurrent expression that creates the task:
Canceling a task that has completed is not an error; it just has no effect. However, if a task with the given name does not exist (i.e. there was never a task started with that name), then an error is raised. Similarly, pausing a task that is already paused or resuming a task that is already running has no effect as long as the task name is valid (otherwise an error is raised).
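The rules above amount to "no-op if the request no longer applies, error if the name was never used". A minimal Python sketch of that behavior follows. It is only a model: `TaskRegistry` and `UnknownTaskError` are hypothetical names, and the status strings are assumptions rather than the engine's actual values.

```python
class UnknownTaskError(Exception):
    """Raised when no task was ever started under the given name."""


class TaskRegistry:
    def __init__(self):
        self.status = {}  # task name -> status string

    def _lookup(self, name):
        if name not in self.status:
            raise UnknownTaskError(name)
        return self.status[name]

    def start(self, name):
        self.status[name] = "running"

    def complete(self, name):
        self._lookup(name)
        self.status[name] = "completed"

    def cancel(self, name):
        # Only a task that has not finished yet can still be canceled.
        if self._lookup(name) in ("running", "paused"):
            self.status[name] = "canceled"

    def pause(self, name):
        if self._lookup(name) == "running":
            self.status[name] = "paused"

    def resume(self, name):
        if self._lookup(name) == "paused":
            self.status[name] = "running"


tasks = TaskRegistry()
tasks.start("maintain_application")
tasks.pause("maintain_application")
tasks.pause("maintain_application")    # already paused: no effect
tasks.complete("maintain_application")
tasks.cancel("maintain_application")   # already completed: no effect

try:
    tasks.cancel("never_started")      # unknown name: error
    raised = False
except UnknownTaskError:
    raised = True
```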
When multiple tasks attempt to control the same task concurrently the following priority order applies:
As covered earlier, a task can be waited on using the wait_task keyword. The current task blocks until the given task finishes (i.e. completes, fails, is canceled, or is aborted). Like other task controlling actions (pause_task, resume_task, etc.), wait_task will have no effect if the task has already completed, but will raise an error if there is no task with the given name.
This means that there needs to be a way to wait for a task to start to guarantee that these actions do not result in errors. The expect_task keyword can be used for that purpose. Here is an admittedly contrived example:
expect_task blocks until a task with the corresponding name exists (and thus never raises an error).
Note: The state of the task does not matter: expect_task will not block if a task with the corresponding name has already finished.
wait_task can also be used with a number indicating the number of tasks that should be waited on. The task running the wait_task expression blocks until the given number of tasks complete. Note that this form is mostly useful when used as an attribute on a concurrent expression to indicate how many concurrent tasks should complete before the next expression runs.
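Waiting for a given number of tasks rather than for all of them can be sketched with Python's standard `concurrent.futures` module. This is an analogy, not RCL: `wait_for_count`, `quick`, `slow`, and the task labels are hypothetical, and a `threading.Event` is used only to keep one task running on purpose.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def wait_for_count(futures, count):
    """Block until `count` of the futures have finished and return them."""
    finished = []
    if count <= 0:
        return finished  # a count of 0 does not wait at all
    for future in as_completed(futures):
        finished.append(future)
        if len(finished) >= count:
            break
    return finished


release = threading.Event()


def quick(name):
    return name


def slow(name):
    release.wait()  # held back until the main thread releases it
    return name


with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [
        pool.submit(quick, "a"),
        pool.submit(quick, "b"),
        pool.submit(slow, "c"),
    ]
    finished = wait_for_count(futures, 2)
    finished_names = sorted(future.result() for future in finished)
    slow_still_running = not futures[2].done()
    release.set()  # let the remaining task finish
```

Execution continues as soon as two of the three tasks are done, while the third keeps running in the background.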
Finally, wait_task also accepts an array of task names corresponding to the tasks that should complete prior to the execution resuming. This form can also be used as an attribute:
The most basic synchronization primitive is a bare concurrent expression. This expression will block until all sub-expressions have completed. Sometimes more control is needed. For example, it may suffice for one of the concurrent expressions to finish before proceeding. The wait_task attribute of the concurrent expression can be used in two different ways to provide the additional control, either with a number of tasks or with one or more task names:
In the following example:
The front-ends will be launched as soon as either all servers tagged with app:role=app_server_1 or all servers tagged with app:role=app_server_2 are operational. As stated above, tasks can be waited on using their names:
One interesting application of the wait_task attribute is when used in conjunction with the number 0 as follows:
In this case, the process proceeds past the concurrent expression without waiting for any of the launched tasks. This is the same behavior as wrapping the whole definition extract above in an outer concurrent.
Whenever a task is launched it gets its own copy of the parent task state. This includes all references and all variables currently defined in the parent task.
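The copy semantics can be modeled as follows. This Python sketch assumes the copy behaves like a deep copy taken at launch time, which the text implies but does not spell out; `launch_task`, `child_body`, and the state keys are hypothetical.

```python
import copy
import threading


def launch_task(parent_state, body):
    """Start `body` in a thread with its own copy of the parent state."""
    child_state = copy.deepcopy(parent_state)  # snapshot taken at launch time
    thread = threading.Thread(target=body, args=(child_state,))
    thread.start()
    return thread, child_state


def child_body(state):
    # Changes made here only affect the child's copy.
    state["servers"].append("server-2")
    state["region"] = "eu-west"


parent_state = {"region": "us-east", "servers": ["server-1"]}
thread, child_state = launch_task(parent_state, child_body)
thread.join()
```

After the child finishes, the parent still sees its original region and server list, while the child's copy holds the modified values.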
Once a task finishes, its state is discarded. However, it is sometimes necessary to retrieve state from a different task. RCL provides two mechanisms to share state across tasks:
Here is an example using the return keyword:
Another way to create tasks in a process apart from concurrent is through the concurrent foreach expression. This expression runs all sub-expressions in sequence on all resources in the given resources collection concurrently. In other words a task is created for each resource in the collection:
In the snippet above, the three RightScripts get run sequentially on all servers in the collection at once. If the @servers collection in the example above contained two resources the following would have the same effect:
Sometimes it is necessary to explicitly refer to one of the tasks that was spawned from the concurrent foreach execution. The task_prefix attribute is only valid for the concurrent foreach expression and allows defining a common prefix for all generated tasks. The task names are built from the prefix and the index of the resource in the collection:
In the example above, cassandra_setup0 refers to the task generated to run the concurrent foreach sub-expressions on the first resource in the @servers collection.
Apart from concurrent and concurrent foreach, concurrent map is the only other way to create tasks in a process. A concurrent map works as expected: each iteration runs concurrently and the resulting collections are built from the results of each iteration.
Note: Even though the resulting collections are built concurrently, concurrent map guarantees that the order of elements in the final collection(s) matches the order of the collection being iterated on.
So for example:
In the example above the instances in the @instances collection will be ordered identically to the servers in the @servers collection (that is, the instance at a given index in the @instances collection will correspond to the server at the same index in the @servers collection).
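The same ordering guarantee exists in Python's `Executor.map`, which makes it a convenient analogy. The sketch below is not RCL: the server names, the delays, and the `current_instance` function are hypothetical and only there to make later items likely to finish first.

```python
import time
from concurrent.futures import ThreadPoolExecutor

servers = ["server-a", "server-b", "server-c"]
# The first server is given the longest delay so it tends to finish last.
delays = {"server-a": 0.03, "server-b": 0.02, "server-c": 0.01}


def current_instance(server):
    time.sleep(delays[server])
    return f"instance-of-{server}"


with ThreadPoolExecutor(max_workers=len(servers)) as pool:
    # Results come back in input order, regardless of completion order.
    instances = list(pool.map(current_instance, servers))
```

The element at a given index in `instances` corresponds to the server at the same index in `servers`, whatever order the work finished in.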
A process ends once all its tasks end. There are four conditions that will cause the execution of a task to end: the task completes (there are no more expressions to execute), fails (an expression raises an error that is not handled), is canceled, or is aborted.
Canceling a task can be done at any time in any task using the cancel_task keyword. This provides a way to cleanly finish a task that still has expressions to run. The cloud workflow can define rollback handlers that get triggered when the task is canceled. These handlers behave much like timeout or error handlers: they may take arbitrary arguments and inherit the local variables and references of the caller. Nested rollback handlers are executed in reverse order as shown in this example:
In this snippet, if $has_errors gets initialized then the process is canceled and both the delete_servers and the delete_deployment get run in that order.
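The reverse-order execution of nested rollback handlers behaves like a stack. The following Python sketch models that behavior and is not RCL. The handler names delete_servers and delete_deployment come from the text above. Everything else (`sub`, `cancel_task`, `run_task`, the `Canceled` exception, the log messages) is a hypothetical stand-in.

```python
from contextlib import contextmanager


class Canceled(Exception):
    """Signals that the current task was canceled."""


rollback_stack = []
log = []


@contextmanager
def sub(on_rollback):
    """Register a rollback handler for the duration of the block."""
    rollback_stack.append(on_rollback)
    yield
    rollback_stack.pop()  # only reached when the block finishes normally


def cancel_task():
    raise Canceled()


def run_task(body):
    try:
        body()
    except Canceled:
        # Innermost handler first: handlers run in reverse order of nesting.
        while rollback_stack:
            rollback_stack.pop()()
        return "canceled"
    return "completed"


def body():
    with sub(lambda: log.append("delete_deployment")):
        log.append("create deployment")
        with sub(lambda: log.append("delete_servers")):
            log.append("launch servers")
            has_errors = True
            if has_errors:
                cancel_task()


status = run_task(body)
```

The inner handler (delete_servers) runs before the outer one (delete_deployment), which matches the order described above.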
Note: the cancel_task keyword can also be used as an attribute value for the on_timeout and on_error attributes, meaning that if a timeout or an error occurs respectively then the task is canceled.
Canceling a process is done using the cancel keyword. This causes all the running tasks to be canceled and follow the same logic as above, potentially executing multiple rollback handlers concurrently. Once all rollback handlers finish then the process ends and the status of all its tasks is set to canceled.
Tasks can also be terminated through the abort_task keyword. This causes the task to finish immediately bypassing all rollback handlers. The abort keyword causes all the tasks in the current process to be aborted. The process thus finishes immediately and the status of all its tasks is set to aborted.
As described in Cloud Workflow and Definitions, a definition consists of a sequence of statements. Each statement is in turn made of expressions. The engine makes the following guarantee:
Expressions always run atomically
In particular, if we consider any expression running in a concurrence (inside a concurrent, concurrent foreach, or concurrent map), then the rule above dictates that each concurrent expression runs atomically. So if we consider:
In the definition above, statement (1) is composed of two expressions: the call to the launch() action followed by the assignment to @@instances. Statement (2) is composed of three expressions: the call to get() followed by the call to launch() and finally the append to the @@instances collection. Since expressions run atomically, the definition above guarantees that the @@instances collection will end up with all instances; there is no need to explicitly synchronize the appends to @@instances. There is no guarantee about ordering though, so it could be that the single instance retrieved in statement (2) is first in the collection.
Note that the following could generate inconsistent results:
In the example above, statement (1) is composed of two expressions: the increment and the assignment. If two tasks were to increment concurrently after reading the same value, then one of the increments would get lost (both tasks would write back the same value to $$failed). The concurrent map expression should be used instead to build results concurrently:
The concurrent map expression takes care of building the resulting array from the results returned by each concurrent execution. There is no problem of tasks overwriting each other's values concurrently in this case.
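The lost increment and the map-based alternative can both be shown with a short Python sketch. It is a model rather than RCL: the problematic interleaving is written out step by step instead of relying on real thread timing, and `check`, the server names, and the `failed` dictionary are hypothetical stand-ins for the work done per resource and for $$failed.

```python
from concurrent.futures import ThreadPoolExecutor

# Part 1: an interleaving that loses an update.
# Both tasks read the shared counter before either one writes it back.
failed = {"count": 0}
read_by_task_1 = failed["count"]
read_by_task_2 = failed["count"]
failed["count"] = read_by_task_1 + 1  # task 1 writes 1
failed["count"] = read_by_task_2 + 1  # task 2 also writes 1
lost_update_total = failed["count"]   # one increment was lost


# Part 2: let each concurrent execution return its own result instead,
# and combine the results afterwards.
def check(server):
    """Return 1 when the (hypothetical) check fails for this server."""
    return 1 if server.endswith("bad") else 0


servers = ["web-bad", "db-ok", "cache-bad"]
with ThreadPoolExecutor(max_workers=len(servers)) as pool:
    outcomes = list(pool.map(check, servers))
failed_total = sum(outcomes)
```

In the first part two increments produce a total of 1. In the second part no shared value is written concurrently, so the two failures are both counted.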
A process may run one or more tasks concurrently at any time. RCL allows for describing how these tasks should be synchronized by providing both synchronization primitives and keywords for controlling tasks individually. A process ends once all its tasks end. A task ends when it completes (no more expressions to execute), fails (an expression raises an error that is not handled), is canceled, or is aborted. Canceling a task will cause any rollback handler to trigger and do additional processing before the task ends.
Note: the concepts of tasks and definitions are completely disjoint and should not be confused: a definition always runs in the task that ran the call expression. In other words, simply using call does not create a new task.
© 2006-2014 RightScale, Inc. All rights reserved.
RightScale is a registered trademark of RightScale, Inc. All other products and services may be trademarks or servicemarks of their respective owners.