Commit 01737cd

doc edits
1 parent 8e8aef9 commit 01737cd

1 file changed

+100
-72
lines changed

docs/Coroutines.rst

Lines changed: 100 additions & 72 deletions
@@ -12,14 +12,17 @@ Status
1212
This document describes a set of experimental extensions to LLVM. Use
1313
with caution. Because the intrinsics have experimental status,
1414
compatibility across LLVM releases is not guaranteed. These intrinsics
15-
are added to support C++ Coroutine TS (P0057), though they are general enough
16-
to be used to implement coroutines in other languages as well.
15+
are added to support C++ Coroutines (P0057), though they are general enough
16+
to be used to implement coroutines in other languages as well as to
17+
experiment with C++ coroutine alternatives other than P0057.
1718

1819
Overview
1920
========
2021

22+
.. _coroutine handle:
23+
2124
LLVM coroutines are functions that have one or more `suspend points`_.
22-
When a suspend point is reached, execution of a coroutine is suspended.
25+
When a suspend point is reached, the execution of a coroutine is suspended.
2326
A suspended coroutine can be resumed to continue execution from the last
2427
suspend point or be destroyed. In the following example function `f` returns
2528
a handle to a suspended coroutine (**coroutine handle**) that can be passed to
@@ -39,27 +42,32 @@ coroutine respectively.
3942
4043
.. _coroutine frame:
4144

42-
In addition to the stack frame which exists when a coroutine is executing,
43-
there is an additional region of storage that contains objects that keeps the
45+
In addition to the function stack frame which exists when a coroutine is executing,
46+
there is an additional region of storage that contains objects that keep the
4447
coroutine state when a coroutine is suspended. This region of storage
45-
is called **coroutine frame**. It is created when a coroutine is invoked.
46-
It is destroyed when a coroutine runs to completion or destroyed.
48+
is called the **coroutine frame**. It is created when a coroutine is called and
49+
destroyed when a coroutine runs to completion or is destroyed by a call to
50+
the `coro.destroy`_ intrinsic.
4751

4852
An LLVM coroutine is represented as an LLVM function that has calls to
4953
`coroutine intrinsics`_ defining the structure of the coroutine.
54+
After the mandatory CoroSplit_ pass, a coroutine is split into several
55+
functions that represent three different ways control can enter the
56+
coroutine:
5057

51-
.. marking up suspend points and coroutine frame
52-
allocation and deallocation code. Marking up allocation and deallocation code
53-
allows an optimization to remove allocation/deallocation when coroutine frame
54-
can be stored on a frame of the caller.
58+
1. a ramp function, which represents an initial invocation of the coroutine that
59+
creates the coroutine frame and executes the coroutine code until it
60+
encounters any suspend point or reaches the end of the function;
5561

56-
After mandatory coroutine processing passes a coroutine is split into several
57-
functions that represent three different ways of how control can enter the
58-
coroutine: an initial invocation that creates the coroutine frame and executes
59-
the coroutine code until it encounters a suspend point or reaches the end
60-
of the coroutine, coroutine resume function that contains the code to be
61-
executed once coroutine is resumed at a particular suspend point, and a
62-
coroutine destroy function that is invoked when the coroutine is destroyed.
62+
2. a coroutine resume function that contains the code to be
63+
executed once the coroutine is resumed at a particular suspend point;
64+
65+
3. a coroutine destroy function that is invoked when the coroutine is destroyed.
66+
67+
..
68+
This is not the only way of lowering the coroutine intrinsics. Another
69+
alternative is to split the coroutine even further into individual functions
70+
for every suspend point.
6371
6472
Coroutines by Example
6573
=====================
@@ -87,7 +95,7 @@ a `main` shown in the previous section. It will call `yield` with values 4, 5
8795
and 6 after which the coroutine will be destroyed.
8896

8997
We will look at individual parts of the LLVM coroutine matching the pseudo-code
90-
above starting with coroutine frame creating and destruction:
98+
above starting with coroutine frame creation and destruction:
9199

92100
.. code-block:: llvm
93101
@@ -115,25 +123,26 @@ above starting with coroutine frame creating and destruction:
115123
}
116124
117125
The first three lines of the `entry` block establish the coroutine frame. The
118-
`coro.size`_ intrinsic expands to represent the size required for the coroutine
119-
frame. The `coro.init`_ intrinsic returns the address to be used as a coroutine
120-
frame pointer (which could be at offset relative to the allocated block of
121-
memory). We will examine the other parameters to `coro.init`_ later.
126+
`coro.size`_ intrinsic is lowered to a constant representing the size required
127+
for the coroutine frame.
128+
The `coro.init`_ intrinsic returns the address to be used as a coroutine
129+
frame pointer (which could be at an offset relative to the allocated block of
130+
memory).
122131

123-
In the cleanup block `coro.delete` intrinsic, given the coroutine frame pointer,
124-
returns a memory address to be freed.
132+
The `coro.delete` intrinsic, given the coroutine frame pointer,
133+
returns a pointer to the memory block to be freed.
125134

126135
Two other intrinsics seen in this fragment are used to mark up the control flow
127-
during the initial and subsequent invocation of the coroutine. The true branch
128-
of the conditional branch following the `coro.fork`_ intrinsic indicates the
129-
block where control flow should transfer on the first suspension of the
130-
coroutine. The `coro.resume.end`_ intrinsic is a no-op during an
131-
initial invocation of the coroutine. When the coroutine resumes, the intrinsic
132-
marks the point when coroutine need to return control back to the caller.
136+
during the initial and subsequent invocations of the coroutine. The true branch
137+
of the conditional branch instruction consuming the result of the `coro.fork`_
138+
intrinsic indicates the block where control should transfer on the first
139+
suspension of the coroutine. The `coro.resume.end`_ intrinsic marks the point
140+
where the coroutine needs to return control to the caller if it is not the initial
141+
invocation of the coroutine. (During the initial coroutine invocation this
142+
intrinsic is a no-op.)
133143

134-
The `coro.return` block returns a pointer to a coroutine frame which happens to
135-
be the same as `coroutine frame`_ expected by `coro.resume`_ and `coro.destroy`_
136-
intrinsics.
144+
This function returns a pointer to a coroutine frame which acts as
145+
a `coroutine handle`_ expected by the `coro.resume`_ and `coro.destroy`_ intrinsics.
137146

138147
.. The `malloc` function is used to allocate memory dynamically for
139148
.. coroutine frame.
@@ -162,20 +171,24 @@ suspend point represents a `final suspend`_ or not.
162171
Coroutine Transformation
163172
------------------------
164173

174+
One of the steps in coroutine transformation is to figure out which objects can
175+
live on the normal function stack frame and which need to go into the coroutine
176+
frame.
177+
165178
In the coroutine shown in the previous section, use of virtual register `%n.val`
166179
is separated from the definition by a suspend point, so it cannot reside
167180
on the stack frame of the coroutine since it will go away once the coroutine is
168-
suspended and therefore need to go into the coroutine frame.
181+
suspended and therefore needs to be part of the coroutine frame.
169182

170-
Other members of the coroutine frame will be an address of a resume and destroy
171-
functions representing the coroutine behavior that needs to happen when coroutine
183+
Other members of the coroutine frame are the addresses of the resume and destroy
184+
functions, which implement the behavior that needs to happen when a coroutine
172185
is resumed and destroyed respectively.
173186

174187
.. code-block:: llvm
175188
176189
%f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
177190
178-
After coroutine transformation function `f` is responsible for creation and
191+
After coroutine transformation, function `f` is responsible for creation and
179192
initialization of the coroutine frame and execution of the coroutine code until
180193
any suspend point is reached or control reaches the end of the function. It will
181194
look like:
@@ -199,7 +212,7 @@ look like:
199212
ret i8* %frame
200213
}
201214
202-
Part of the orginal coroutine `f` that is responsible for executing code after
215+
Part of the original coroutine `f` that is responsible for executing code after
203216
resume will be extracted into `f.resume` function:
204217

205218
.. code-block:: llvm
@@ -225,22 +238,23 @@ Whereas function `f.destroy` will end up simply calling `free` function:
225238
ret void
226239
}
227240
228-
This transformation is performed by `coro-split` LLVM pass.
241+
.. This transformation is performed by `coro-split` LLVM pass.
229242
230243
Avoiding Heap Allocations
231244
-------------------------
232245

233-
A particular coroutine usage pattern which is illustrated by the `main` function
246+
A particular coroutine usage pattern, which is illustrated by the `main` function
234247
in the overview section where a coroutine is created, manipulated and destroyed by
235-
the same calling function is common for generator coroutines and is suitable for
236-
allocation elision optimization which stores coroutine frame in the caller's
237-
frame.
248+
the same calling function, is common for generator coroutines and is suitable for
249+
allocation elision optimization, which avoids dynamic allocation by storing
250+
the coroutine frame on the caller's frame.
238251

239-
To enable heap elision, we need to make frame allocation and deallocation
240-
as follows:
252+
To enable this optimization, we need to mark frame allocation and deallocation
253+
calls to allow bypassing them if not needed.
241254

242-
In the entry block, we will invoke `coro.elide`_ intrinsic that will return
243-
an address of a coroutine frame on the callers if possible and `null` otherwise:
255+
In the entry block, we will call the `coro.elide`_ intrinsic, which returns
256+
the address of a coroutine frame on the caller's frame when possible and
257+
`null` otherwise:
244258

245259
.. code-block:: llvm
246260
@@ -260,7 +274,7 @@ an address of a coroutine frame on the callers if possible and `null` otherwise:
260274
261275
In the cleanup block, we will make freeing the coroutine frame conditional on
262276
the `coro.delete`_ intrinsic. If allocation is elided, `coro.delete`_ returns `null`,
263-
thus avoiding deallocation code:
277+
thus skipping the deallocation code:
264278

265279
.. code-block:: llvm
266280
@@ -277,8 +291,8 @@ thus avoiding deallocation code:
277291
call void @llvm.experimental.coro.resume.end()
278292
br label %coro.return
279293
280-
With allocations and deallocations described as above after inlining and heap
281-
allocation elision optimization the resulting main will end up looking as:
294+
With allocations and deallocations described as above, after inlining and heap
295+
allocation elision optimization, the resulting main will end up looking like:
282296

283297
.. code-block:: llvm
284298
@@ -290,6 +304,7 @@ allocation elision optimization the resulting main will end up looking as:
290304
ret i32 0
291305
}
292306
307+
293308
Multiple Suspend Points
294309
-----------------------
295310

@@ -368,13 +383,13 @@ Distinct Save and Suspend
368383
-------------------------
369384

370385
In the previous example, setting a resume index (or some other state change that
371-
needs to happen to prepare coroutine for resumption) happens at the same time as
372-
suspension of a coroutine. However, in certain cases it is necessary to control
386+
needs to happen to prepare a coroutine for resumption) happens at the same time as
387+
the suspension of the coroutine. However, in certain cases, it is necessary to control
373388
when a coroutine is prepared for resumption and when it is suspended.
374389

375-
In the following example, coroutine represents some activity that is driven
390+
In the following example, a coroutine represents some activity that is driven
376391
by completions of asynchronous operations `async_op1` and `async_op2` which get
377-
a coroutine handle as a parameter and will resume the coroutine once async
392+
a coroutine handle as a parameter and resume the coroutine once the async
378393
operation is finished.
379394

380395
.. code-block:: llvm
@@ -422,11 +437,22 @@ Final Suspend
422437
until explicitly destroyed by the call to `coro.destroy`_. If we consider a case
423438
of a coroutine representing a generator that produces a finite sequence of
424439
440+
.. note::
441+
* reason 1: The final suspend point is known. There is no need for the
442+
user to have extra code to track whether we are at the final suspend point or
443+
not.
444+
* reason 2: Guard against misuse of a coroutine, such as trying to resume a
445+
coroutine that has reached the end. For example, replacing ResumeFnPtr in the
446+
coroutine frame when the final suspend is reached will result in a trap for
447+
free if someone calls `coro.resume` on such a coroutine.
448+
* reason 3: One less case for a switch at the beginning of the resume
449+
function.
450+
425451
One of the common coroutine usage patterns is a generator, where a coroutine
426452
produces a (sometimes finite) sequence of values. To facilitate this pattern, a
427453
frontend can designate a suspend point to be final. A coroutine suspended at
428454
the final suspend point can only be resumed with the `coro.destroy`_ intrinsic.
429-
Resuming such coroutine with `coro.resume`_ results in undefined behavior.
455+
Resuming such a coroutine with `coro.resume`_ leads to undefined behavior.
430456
The `coro.done`_ intrinsic can be used to check whether a suspended coroutine
431457
is at the final suspend point or not.
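
The consumer side of this pattern can be sketched in plain C++. This is only an illustrative model, not what the passes emit: the `GenFrame` layout, the helper names, and the choice to signal the final suspend point by nulling the stored resume pointer (suggested by reason 2 in the note above) are all assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical hand-written frame for a generator producing first..last.
// Field and function names are illustrative only.
struct GenFrame {
  void (*resume)(GenFrame *); // assumed encoding: null at the final suspend point
  int current;                // value currently produced by the generator
  int last;                   // last value of the finite sequence
};

// Models the resume part: advance to the next value, or mark the final
// suspend point once the sequence is exhausted.
static void gen_resume(GenFrame *frame) {
  if (frame->current == frame->last) {
    frame->resume = nullptr; // final suspend reached; resuming again is misuse
    return;
  }
  frame->current += 1;
}

// Stand-in for a `coro.done`-style query under the assumed encoding.
static bool gen_done(const GenFrame *frame) { return frame->resume == nullptr; }

// Consumer loop: read the current value, resume, stop at the final suspend.
static int sum_generator(int first, int last) {
  assert(first <= last && "the sketch expects a non-empty range");
  GenFrame frame{&gen_resume, first, last};
  int sum = 0;
  while (!gen_done(&frame)) {
    sum += frame.current;
    frame.resume(&frame);
  }
  return sum;
}
```

With this encoding the consumer needs no bookkeeping of its own: `sum_generator(1, 3)` visits 1, 2 and 3 and stops once the frame reports the final suspend point.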
432458

@@ -462,7 +488,7 @@ be used to communicate with the coroutine. This distinguished alloca is called
462488
intrinsic.
463489

464490
The following coroutine designates a 32 bit integer `promise` and uses it to
465-
store the current value produces by a coroutine.
491+
store the current value produced by a coroutine.
466492

467493
.. code-block:: llvm
468494
@@ -495,7 +521,7 @@ store the current value produces by a coroutine.
495521
ret i8* %frame
496522
}
497523
498-
Coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
524+
A coroutine consumer can rely on the `coro.promise`_ intrinsic to access the
499525
coroutine promise.
500526

501527
.. code-block:: llvm
@@ -556,10 +582,10 @@ The argument is a coroutine handle to a suspended coroutine.
556582
Semantics:
557583
""""""""""
558584

559-
If coroutine identity is known, the `coro.destroy` intrinsic is replaced with a
585+
When possible, the `coro.destroy` intrinsic is replaced with a
560586
direct call to the coroutine destroy function. Otherwise, it is replaced with an
561587
indirect call based on the function pointer for the destroy function stored
562-
in the coroutine frame. Destroying a coroutine that is not suspended results in
588+
in the coroutine frame. Destroying a coroutine that is not suspended leads to
563589
undefined behavior.
564590

565591
.. _coro.resume:
@@ -585,10 +611,10 @@ The argument is a handle to a suspended coroutine.
585611
Semantics:
586612
""""""""""
587613

588-
If coroutine identity is known, the `coro.resume` intrinsic is replaced with a
614+
When possible, the `coro.resume` intrinsic is replaced with a
589615
direct call to the coroutine resume function. Otherwise, it is replaced with an
590616
indirect call based on the function pointer for the resume function stored
591-
in the coroutine frame. Resuming a coroutine that is not suspended results in
617+
in the coroutine frame. Resuming a coroutine that is not suspended leads to
592618
undefined behavior.
593619

594620
.. _coro.done:
@@ -615,7 +641,7 @@ Semantics:
615641
""""""""""
616642

617643
Using this intrinsic on a coroutine that does not have a `final suspend`_ point
618-
or on a coroutine that is not suspended results in an undefined behavior.
644+
or on a coroutine that is not suspended leads to undefined behavior.
619645

620646
.. _coro.promise:
621647

@@ -629,8 +655,8 @@ or on a coroutine that is not suspended results in an undefined behavior.
629655
Overview:
630656
"""""""""
631657

632-
The '``llvm.experimental.coro.promise``' intrinsic returns an address of
633-
a `coroutine promise`_.
658+
The '``llvm.experimental.coro.promise``' intrinsic returns a pointer to a
659+
`coroutine promise`_.
634660

635661
Arguments:
636662
""""""""""
@@ -891,7 +917,7 @@ Overview:
891917

892918
The '``llvm.experimental.coro.fork``' intrinsic together with the conditional
893919
branch consuming the boolean value returned from this intrinsic is used to
894-
indicates where the control flows should transfer on the first suspension of the
920+
indicate where control should transfer on the first suspension of the
895921
coroutine.
896922

897923
Arguments:
@@ -935,7 +961,7 @@ The `coro.resume.end`_ intrinsic is a no-op during an initial invocation of the
935961
coroutine. When the coroutine resumes, the intrinsic marks the point when
936962
the coroutine needs to return control back to the caller.
937963

938-
This intrinsic is removed by the CoroSplit pass when coroutine is split into
964+
This intrinsic is removed by the CoroSplit pass when a coroutine is split into
939965
the start, resume and destroy parts. In the start part, the intrinsic is removed;
940966
in the resume and destroy parts, it is replaced with a `ret void` instruction and
941967
the rest of the block containing the `coro.resume.end` instruction is discarded.
@@ -992,10 +1018,9 @@ Overview:
9921018
"""""""""
9931019

9941020
The '``llvm.experimental.coro.save``' marks the point where a coroutine
995-
is considered suspened (and thus eligible for resumption) but control
996-
is not yet transferred back to the caller. Its return value should be consumed
997-
by exactly one `coro.suspend` intrinsic that marks the point when control need
998-
to be transferred to the coroutine's caller.
1021+
is considered suspended (and thus eligible for resumption). Its return value
1022+
should be consumed by exactly one `coro.suspend` intrinsic that marks the point
1023+
when control needs to be transferred to the coroutine's caller.
9991024

10001025
Arguments:
10011026
""""""""""
@@ -1010,7 +1035,8 @@ the coroutine from the corresponding suspend point should be done at the point o
10101035
`coro.save` intrinsic.
10111036

10121037
Example:
1013-
========
1038+
""""""""
1039+
10141040
Separate save and suspend points are a necessity when a coroutine is used to
10151041
represent an asynchronous control flow driven by callbacks representing
10161042
completions of asynchronous operations.
@@ -1047,6 +1073,8 @@ to execute coroutine and inliner passes in the following order.
10471073
#. If function `F` is a coroutine, resume and destroy parts are extracted into
10481074
`F.resume` and `F.destroy` functions by the CoroSplit pass.
10491075

1076+
.. _CoroSplit:
1077+
10501078
CoroSplit
10511079
---------
10521080
The pass CoroSplit extracts resume and destroy parts into separate functions.

0 commit comments

Comments
 (0)