Use a parallel
step to define a part of your workflow where two or more steps can execute concurrently. A parallel step waits until all the steps defined within it have completed or are interrupted by an unhandled exception; execution then continues. To learn more about the intended use and benefits of parallel steps, see Execute workflow steps in parallel.
YAML
- PARALLEL_STEP_NAME: parallel: exception_policy: POLICY shared: [VARIABLE_A, VARIABLE_B, ...] concurrency_limit: CONCURRENCY_LIMIT BRANCHES_OR_FOR: ...
JSON
[ { "PARALLEL_STEP_NAME": { "parallel": { "exception_policy": "POLICY", "shared": [ "VARIABLE_A", "VARIABLE_B", ... ], "concurrency_limit": "CONCURRENCY_LIMIT", "BRANCHES_OR_FOR": ... } } } ]
Replace the following:
PARALLEL_STEP_NAME
: the name of the parallel step.POLICY
(optional): determines the action other branches will take when an unhandled exception occurs. The default policy,continueAll
, results in no further action, and all other branches will attempt to run. Note thatcontinueAll
is the only policy currently supported.VARIABLE_A
,VARIABLE_B
, and so on: a list of writable variables with parent scope that allow assignments within the parallel step. In this document, see Shared variables.CONCURRENCY_LIMIT
(optional): the maximum number of branches and iterations that can concurrently execute within a single workflow execution before further branches and iterations are queued to wait. This applies to a singleparallel
step only and does not cascade. Must be a positive integer and can be either a literal value or an expression. In this document, see Concurrency limits.BRANCHES_OR_FOR
: use eitherbranches
orfor
to indicate one of the following:- Branches that can run concurrently. In this document, see Parallel branches.
- A loop where iterations can run concurrently. In this document, see Parallel iteration.
Note the following:
- Parallel branches and iterations can run in any order, and might run in a different order with each execution.
- Parallel steps can include other, nested parallel steps up to the depth limit. See Quotas and limits.
Parallel branches
A branch is a named set of steps that execute sequentially. Parallel branches
can execute concurrently (with the steps in each branch executing sequentially).
YAML
- PARALLEL_STEP_NAME: parallel: ... branches: - BRANCH_NAME_A: steps: ... - BRANCH_NAME_B: steps: ...
JSON
[ { "PARALLEL_STEP_NAME": { "parallel": { ... "branches": [ { "BRANCH_NAME_A": { "steps": ... } }, { "BRANCH_NAME_B": { "steps": ... } } ] } } } ]
Example of parallel branches
This workflow retrieves customer and notification information from two different microservices. These operations are performed in parallel to reduce the end-to-end execution time. The user
and notification
variables are shared so that they can be updated in branches and accessed later in the workflow.
YAML
JSON
Parallel iteration
Ordinary for
loops can be executed in parallel by nesting the for
block within a parallel
step. Parallel steps execute iterations nondeterministically and can arrive at an outcome using various paths, with multiple iterations (up to the concurrency limit) executing concurrently. See Quotas and limits.
YAML
- PARALLEL_ITERATION_STEP_NAME: parallel: ... for: value: LOOP_VARIABLE_NAME ... steps: - STEP_NAME_A: ...
JSON
[ { "PARALLEL_ITERATION_STEP_NAME": { "parallel": { ... "for": { "value": "LOOP_VARIABLE_NAME", ... "steps": [ { "STEP_NAME_A": ... } ] } } } } ]
The LOOP_VARIABLE_NAME
refers to the value of the currently iterated element. The variable name must not be used in assignments or expressions outside of the loop. The same name can be used in multiple loops, as long as they are not nested.
Example of parallel iteration
This workflow calculates the total number of comments for a set of posts provided as a runtime argument. Since the comment count for each post must be retrieved using a separate call, the workflow executes the loop iterations in parallel to reduce the end-to-end execution time. The shared variable, total
, is updated in each iteration so that it contains the sum after the countComments
parallel step completes.
YAML
JSON
Shared variables
Branches and iterations support local variable scopes with a special property: variables from the parent scope are read-only unless explicitly marked as shared
for write-access. Assignments in a parallel
step to a non-shared variable from a parent scope will result in a deployment error.
Atomic assignments
All assignments in parallel steps are atomic. The assigned value is determined (evaluating any specified expression), and written without any intervening writes by other branches. Shared variables writes are immediately seen by other branches.
Note that assignments can happen in any order, and expressions should not depend on the order of evaluation and assignment. For example, changing the order of addends in a + b = b + a
does not change the sum.
Local variables and memory limits
Variables assigned only in a parallel branch or loop are local to that branch or iteration unless marked shared
. In for
loops, a local variable is unique to each iteration and cannot be used to pass or accumulate values between iterations. The variable memory limit is applied independently to each branch, and must not exceed the limit when considering variables from both the parent and local scopes. Local variables in a branch do not affect the memory available to other branches.
For example, the following diagram illustrates how if a parent step uses 40% of the available memory, a child step can only use the remaining 60%, and so on:
Variables that are not assigned
To optimize performance, variables should not be marked shared
if they are intended to be read-only and not assigned within a parallel
step.
Variables and nested parallel steps
Marking a variable as shared
only affects the branches or for
loop in the given parallel
step. To write to a shared variable in a nested parallel
step, mark it as shared
in both the parent and child parallel steps.
Example of variable scopes
This workflow demonstrates the scope of a shared variable, my_result
, as well as variables that are local to their respective branch scopes. Assigning to a variable from a parent scope (input
) in a parallel
step will result in a deployment error unless the variable is shared in the parallel
step.
YAML
JSON
Concurrency limits
You can concurrently execute branches or iterations using parallel steps up to the maximum concurrency quota before further branches and iterations are queued to wait. You can't change this global limit; however, you can specify a lower parallel
step concurrency limit by setting a concurrency_limit
value.
The concurrency restrictions for any child are independent from those of its parent and, unlike the global limit, the parallel
step concurrency limit is not cascading: any descendants do not have to adhere as a group to the concurrency limit set for an ancestor. Note that if a parent is waiting for its children to complete executing, it is not considered active, and it is not counted towards the concurrency limit.
Examples of concurrency limits
Example 1
The following diagram illustrates how the concurrency_limit
applies only to the current parallel
step and is not inherited by any nested parallel
steps.
Note the following:
- Child A has a
concurrency_limit
of 1; therefore, only one of GrandChild A, GrandChild B, or GrandChild C can execute at any given time. When one completes, another can execute. - The concurrency restrictions for any child are independent from those of its parent. For example, the concurrency limit set for Child A (1) does not impact the concurrency limits set for GrandChild B (10) or GrandChild C (2). Unlike the global limit, the
parallel
step concurrency limit is not cascading: any descendants do not have to adhere as a group to the concurrency limit set for an ancestor. Note that if a parent is waiting for its children to complete executing, it is not considered active, and it is not counted towards the concurrency limit. - GrandChild A does not have a defined
concurrency_limit
set; however, it still adheres to the global limit.
Example 2
In the following workflow, there are three iterations of the -parent
step. Since only two can be active at a time, two concurrent www.foo.com
calls can occur. However, three iterations can exist at the same time (two active, one inactive), and the inactive branch can complete its call to www.foo.com
while waiting for its five -nestedChild
iterations to complete.
YAML
- parent: parallel: concurrency_limit: 2 for: range: [1, 3] value: i steps: - parentHttpCall: call: http.get args: url: "www.foo.com" - nestedChild: parallel: concurrency_limit: 4 for: range: [1, 5] value: j steps: - childHttpCall: call: http.get args: url: "www.bar.com"
JSON
[ { "parent": { "parallel": { "concurrency_limit": 2, "for": { "range": [ 1, 3 ], "value": "i", "steps": [ { "parentHttpCall": { "call": "http.get", "args": { "url": "www.foo.com" } } }, { "nestedChild": { "parallel": { "concurrency_limit": 4, "for": { "range": [ 1, 5 ], "value": "j", "steps": [ { "childHttpCall": { "call": "http.get", "args": { "url": "www.bar.com" } } } ] } } } } ] } } } } ]
Example 3
The following image illustrates nine concurrent (active) -nestedChild
(GrandChild) iterations. It applies a maximum child concurrency limit multiplied by the maximum number of parent iterations (4 * 3 = 12). Each of the 12 can make a concurrent www.bar.com
request. In total, 15 -nestedChild
iterations are executed (3 parent * 5 child).
Jumps
You can jump to steps only within the same branch or loop. To exit a single iteration or branch early, use next: continue
. For details, see Use break/continue in a loop.
Returns
Although you can use return
in the main workflow to stop a workflow's execution, return
steps are not allowed inside a branch or loop step. To return one or more values from a branch, use a shared variable instead.
Exceptions
Exceptions in branches and iterations can be handled inline by retrying steps and catching errors within the branch or for
loop step. An unhandled exception in a branch terminates that branch. An unhandled exception in a parallel for
loop terminates the single iteration where it is raised. See the maximum number of unhandled exceptions than can be raised during the execution of a workflow.
The exception_policy
determines what action other branches will take when an unhandled exception occurs. The default policy, continueAll
, results in no further action, and all other branches will attempt to run. (Note that continueAll
is the only policy currently supported.)
As shared variables are atomic, any shared variables that are set prior to a branch exception are seen by other branches with access to the variables.
Unhandled exceptions
An UnhandledBranchError
is raised in a parallel
step if there are unhandled exceptions from branches or iterations. This is a runtime exception that can be caught. Branches or iterations that throw an exception are listed as string entries in the branches
field in an exception map. The exception map indicates which branch or iteration raised the error, and the error itself. For example:
{ "branches": [ { "id": "1", "error": { "context": "RuntimeError: \"branch error\"\nin step \"step1\", routine \"main\", line: 9", "payload": { "message": "ZeroDivisionError: division by zero", "tags": [ "ZeroDivisionError", "ArithmeticError" ] } } } ], "message": "UnhandledBranchError: One or more branches or iterations encountered an unhandled runtime error", "tags": [ "UnhandledBranchError", "RuntimeError" ], "truncated": false }
Note the following:
- Exceptions are included in the
branches
field in chronological order, with the earliest error appearing first. - Branch execution order is not guaranteed.
- If including the error payload for a child branch would result in the parent scope exceeding 90% of the remaining memory capacity, the exception is truncated. Specifically, all branch IDs that return an error (up to the maximum number of unhandled exceptions) are included in the
branches
list; however, each branch error might or might not be truncated:- If an error is truncated, it is not included, and
error:null
andtruncated:true
are set. - If an error is not truncated, the size of the error payload is subtracted from the parent scope's remaining memory.
- If an error is truncated, it is not included, and
Nested parallel steps follow the same pattern, recursively. The following workflow is an example of nested parallel steps that results in unhandled exceptions:
YAML
- parallelStep: parallel: for: value: num range: [0,1] steps: - parentLoop: parallel: for: value: num2 range: [0,0] steps: - checkEven: switch: - condition: '${num % 2 != 0}' raise: "how odd!"
JSON
[ { "parallelStep": { "parallel": { "for": { "value": "num", "range": [ 0, 1 ], "steps": [ { "parentLoop": { "parallel": { "for": { "value": "num2", "range": [ 0, 0 ], "steps": [ { "checkEven": { "switch": [ { "condition": "${num % 2 != 0}", "raise": "how odd!" } ] } } ] } } } } ] } } } } ]
{ "branches":[ { "id":"1", "error":{ "context":"RuntimeError: \"branch error\"\nin step \"parentLoop\", routine \"main\", line: 8", "payload":{ "branches":[ { "id":"0", "error":{ "context":"RuntimeError: \"branch error\"\nin step \"checkEven\", routine \"main\", line: 15", "payload":"how odd!" } } ], "message":"UnhandledBranchError: One or more branches or iterations encountered an unhandled runtime error", "tags":[ "UnhandledBranchError", "RuntimeError" ], "truncated":false } } } ], "message":"UnhandledBranchError: One or more branches or iterations encountered an unhandled runtime error", "tags":[ "UnhandledBranchError", "RuntimeError" ], "truncated":false }
Example of error handling
The following example uses a try/except
structure for error handling:
YAML
JSON
An error message similar to the following is logged:
{ "branches":[ { "id":"3", "error":{ "context":"RuntimeError: \"branch error\"\nin step \"checkEven\", routine \"main\", line: 12", "payload":"how odd!" } }, { "id":"5", "error":{ "context":"RuntimeError: \"branch error\"\nin step \"checkEven\", routine \"main\", line: 12", "payload":"how odd!" } }, { "id":"1", "error":{ "context":"RuntimeError: \"branch error\"\nin step \"checkEven\", routine \"main\", line: 12", "payload":"how odd!" } } ], "message":"UnhandledBranchError: One or more branches or iterations encountered an unhandled runtime error", "tags":[ "UnhandledBranchError", "RuntimeError" ], "truncated":false }
What's next
- Workflows overview
- Replace
experimental.executions.map
function with parallel step - Tutorial: Run multiple BigQuery jobs in parallel