@@ -12,14 +12,17 @@ Status
1212This document describes a set of experimental extensions to LLVM. Use
1313with caution. Because the intrinsics have experimental status,
1414compatibility across LLVM releases is not guaranteed. These intrinsics
15- are added to support C++ Coroutine TS (P0057), though they are general enough
16- to be used to implement coroutines in other languages as well.
15+ are added to support C++ Coroutines (P0057), though they are general enough
16+ to be used to implement coroutines in other languages as well as to
17+ experiment with C++ coroutine alternatives other than P0057.
1718
1819Overview
1920========
2021
22+ .. _coroutine handle :
23+
2124LLVM coroutines are functions that have one or more `suspend points `_.
22- When a suspend point is reached, execution of a coroutine is suspended.
25+ When a suspend point is reached, the execution of a coroutine is suspended.
2326A suspended coroutine can be resumed to continue execution from the last
2427suspend point or be destroyed. In the following example function `f ` returns
2528a handle to a suspended coroutine (**coroutine handle **) that can be passed to
@@ -39,27 +42,32 @@ coroutine respectively.
3942
4043 .. _coroutine frame :
4144
42- In addition to the stack frame which exists when a coroutine is executing,
43- there is an additional region of storage that contains objects that keeps the
45+ In addition to the function stack frame which exists when a coroutine is executing,
46+ there is an additional region of storage that contains objects that keep the
4447coroutine state when a coroutine is suspended. This region of storage
45- is called **coroutine frame **. It is created when a coroutine is invoked.
46- It is destroyed when a coroutine runs to completion or destroyed.
48+ is called the **coroutine frame **. It is created when a coroutine is called and
49+ destroyed when the coroutine runs to completion or is destroyed by a call to
50+ the `coro.destroy `_ intrinsic.
4751
4852An LLVM coroutine is represented as an LLVM function that has calls to
4953`coroutine intrinsics `_ defining the structure of the coroutine.
54+ After the mandatory CoroSplit _ pass, a coroutine is split into several
55+ functions that represent three different ways in which control can enter the
56+ coroutine:
5057
51- .. marking up suspend points and coroutine frame
52- allocation and deallocation code. Marking up allocation and deallocation code
53- allows an optimization to remove allocation/deallocation when coroutine frame
54- can be stored on a frame of the caller.
58+ 1. a ramp function, which represents an initial invocation of the coroutine that
59+ creates the coroutine frame and executes the coroutine code until it
60+ encounters any suspend point or reaches the end of the function;
5561
56- After mandatory coroutine processing passes a coroutine is split into several
57- functions that represent three different ways of how control can enter the
58- coroutine: an initial invocation that creates the coroutine frame and executes
59- the coroutine code until it encounters a suspend point or reaches the end
60- of the coroutine, coroutine resume function that contains the code to be
61- executed once coroutine is resumed at a particular suspend point, and a
62- coroutine destroy function that is invoked when the coroutine is destroyed.
62+ 2. a coroutine resume function that contains the code to be
63+ executed once the coroutine is resumed at a particular suspend point;
64+
65+ 3. a coroutine destroy function that is invoked when the coroutine is destroyed.
66+
67+ ..
68+ This is not the only way of lowering the coroutine intrinsics. Another
69+ alternative is to split the coroutine even further into individual functions,
70+ one for every suspend point.
6371
6472Coroutines by Example
6573=====================
@@ -87,7 +95,7 @@ a `main` shown in the previous section. It will call `yield` with values 4, 5
8795and 6 after which the coroutine will be destroyed.
8896
8997We will look at individual parts of the LLVM coroutine matching the pseudo-code
90- above starting with coroutine frame creating and destruction:
98+ above starting with coroutine frame creation and destruction:
9199
92100.. code-block :: llvm
93101
@@ -115,25 +123,26 @@ above starting with coroutine frame creating and destruction:
115123 }
116124
117125 The first three lines of the `entry ` block establish the coroutine frame. The
118- `coro.size `_ intrinsic expands to represent the size required for the coroutine
119- frame. The `coro.init `_ intrinsic returns the address to be used as a coroutine
120- frame pointer (which could be at offset relative to the allocated block of
121- memory). We will examine the other parameters to `coro.init `_ later.
126+ `coro.size `_ intrinsic is lowered to a constant representing the size required
127+ for the coroutine frame.
128+ The `coro.init `_ intrinsic returns the address to be used as a coroutine
129+ frame pointer (which could be at an offset relative to the allocated block of
130+ memory).
122131
123- In the cleanup block `coro.delete ` intrinsic, given the coroutine frame pointer,
124- returns a memory address to be freed.
132+ The `coro.delete ` intrinsic, given the coroutine frame pointer,
133+ returns a pointer to the memory block to be freed.
125134
126135Two other intrinsics seen in this fragment are used to mark up the control flow
127- during the initial and subsequent invocation of the coroutine. The true branch
128- of the conditional branch following the `coro.fork `_ intrinsic indicates the
129- block where control flow should transfer on the first suspension of the
130- coroutine. The `coro.resume.end `_ intrinsic is a no-op during an
131- initial invocation of the coroutine. When the coroutine resumes, the intrinsic
132- marks the point when coroutine need to return control back to the caller.
136+ during the initial and subsequent invocations of the coroutine. The true branch
137+ of the conditional branch instruction consuming the result of the `coro.fork `_
138+ intrinsic indicates the block where control should transfer on the first
139+ suspension of the coroutine. The `coro.resume.end `_ intrinsic marks the point
140+ where the coroutine needs to return control back to the caller if it is not an initial
141+ invocation of the coroutine. (During the initial coroutine invocation, this
142+ intrinsic is a no-op.)
133143
134- The `coro.return ` block returns a pointer to a coroutine frame which happens to
135- be the same as `coroutine frame `_ expected by `coro.resume `_ and `coro.destroy `_
136- intrinsics.
144+ This function returns a pointer to a coroutine frame which acts as
145+ a `coroutine handle `_ expected by `coro.resume `_ and `coro.destroy `_ intrinsics.
137146
138147.. The `malloc` function is used to allocate memory dynamically for
139148.. coroutine frame.
@@ -162,20 +171,24 @@ suspend point represents a `final suspend`_ or not.
162171Coroutine Transformation
163172------------------------
164173
174+ One of the steps in coroutine transformation is to figure out which objects can
175+ live on the normal function stack frame and which need to go into the coroutine
176+ frame.
177+
165178In the coroutine shown in the previous section, the use of virtual register `%n.val `
166179is separated from its definition by a suspend point, so it cannot reside
167180on the stack frame of the coroutine, since the stack frame goes away once the coroutine is
168- suspended and therefore need to go into the coroutine frame.
181+ suspended; it therefore needs to be part of the coroutine frame.
169182
170- Other members of the coroutine frame will be an address of a resume and destroy
171- functions representing the coroutine behavior that needs to happen when coroutine
183+ Other members of the coroutine frame are the addresses of the resume and destroy
184+ functions, which implement the behavior that needs to happen when a coroutine
172185is resumed and destroyed, respectively.
173186
174187.. code-block :: llvm
175188
176189 %f.frame = type { void (%f.frame*)*, void (%f.frame*)*, i32 }
177190
178- After coroutine transformation function `f ` is responsible for creation and
191+ After coroutine transformation, function `f ` is responsible for creation and
179192initialization of the coroutine frame and execution of the coroutine code until
180193any suspend point is reached or control reaches the end of the function. It will
181194look like:
@@ -199,7 +212,7 @@ look like:
199212 ret i8* %frame
200213 }
201214
202- Part of the orginal coroutine `f ` that is responsible for executing code after
215+ Part of the original coroutine `f ` that is responsible for executing code after
203216resume will be extracted into the `f.resume ` function:
204217
205218.. code-block :: llvm
@@ -225,22 +238,23 @@ Whereas function `f.destroy` will end up simply calling `free` function:
225238 ret void
226239 }
227240
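The split into a ramp, a resume and a destroy function can be modeled in plain C++. This is only an illustrative sketch, not what the CoroSplit pass emits: the names `FFrame`, `f_resume`, `f_destroy`, `printed` and `run_example` are hypothetical, and the coroutine body is assumed to yield `n`, `n + 1`, ... with one value per resumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical model of the coroutine frame: two function pointers
// followed by the state that must survive across suspend points.
struct FFrame {
    void (*resume)(FFrame*);   // called when the coroutine is resumed
    void (*destroy)(FFrame*);  // called when the coroutine is destroyed
    int n;                     // value live across the suspend point
};

// Stands in for the values the coroutine produces (assumption).
static std::vector<int> printed;

// Resume part: runs the code that follows the suspend point, then
// "suspends" again by returning to the caller.
static void f_resume(FFrame* frame) {
    frame->n += 1;
    printed.push_back(frame->n);
}

// Destroy part: releases the frame.
static void f_destroy(FFrame* frame) { std::free(frame); }

// Ramp function: creates and initializes the frame, runs the coroutine
// up to the first suspend point, and returns the handle.
static FFrame* f(int n) {
    auto* frame = static_cast<FFrame*>(std::malloc(sizeof(FFrame)));
    if (frame == nullptr) return nullptr;
    frame->resume = &f_resume;
    frame->destroy = &f_destroy;
    frame->n = n;
    printed.push_back(frame->n);
    return frame;
}

// Caller: resume twice through the frame's function pointers, then destroy.
static std::vector<int> run_example() {
    printed.clear();
    FFrame* hdl = f(4);
    if (hdl == nullptr) return printed;
    hdl->resume(hdl);
    hdl->resume(hdl);
    hdl->destroy(hdl);
    return printed;
}
```

Calling through `hdl->resume` and `hdl->destroy` corresponds to the indirect-call lowering of `coro.resume` and `coro.destroy`; when the callee is known, a direct call to `f_resume` or `f_destroy` would do the same work.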
228- This transformation is performed by `coro-split ` LLVM pass.
241+ .. This transformation is performed by `coro-split` LLVM pass.
229242
230243 Avoiding Heap Allocations
231244-------------------------
232245
233- A particular coroutine usage pattern which is illustrated by the `main ` function
246+ A particular coroutine usage pattern, which is illustrated by the `main ` function
234247in the overview section where a coroutine is created, manipulated and destroyed by
235- the same calling function is common for generator coroutines and is suitable for
236- allocation elision optimization which stores coroutine frame in the caller's
237- frame.
248+ the same calling function, is common for generator coroutines and is suitable for
249+ allocation elision optimization, which avoids dynamic allocation by storing
250+ the coroutine frame on the caller's frame.
238251
239- To enable heap elision , we need to make frame allocation and deallocation
240- as follows:
252+ To enable this optimization , we need to mark frame allocation and deallocation
253+ calls to allow bypassing them if not needed.
241254
242- In the entry block, we will invoke `coro.elide `_ intrinsic that will return
243- an address of a coroutine frame on the callers if possible and `null ` otherwise:
255+ In the entry block, we will call the `coro.elide `_ intrinsic, which returns
256+ the address of a coroutine frame on the caller's frame when possible and
257+ `null ` otherwise:
244258
245259.. code-block :: llvm
246260
@@ -260,7 +274,7 @@ an address of a coroutine frame on the callers if possible and `null` otherwise:
260274
261275 In the cleanup block, we will make freeing the coroutine frame conditional on
262276`coro.delete `_ intrinsic. If allocation is elided, `coro.delete `_ returns `null `
263- thus avoiding deallocation code:
277+ thus skipping the deallocation code:
264278
265279.. code-block :: llvm
266280
@@ -277,8 +291,8 @@ thus avoiding deallocation code:
277291 call void @llvm.experimental.coro.resume.end()
278292 br label %coro.return
279293
280- With allocations and deallocations described as above after inlining and heap
281- allocation elision optimization the resulting main will end up looking as :
294+ With allocations and deallocations described as above, after inlining and heap
295+ allocation elision optimization, the resulting main will end up looking like :
282296
283297.. code-block :: llvm
284298
@@ -290,6 +304,7 @@ allocation elision optimization the resulting main will end up looking as:
290304 ret i32 0
291305 }
292306
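The effect of allocation elision can be sketched at run time in C++. This is a hedged model only: in LLVM the decision is made at compile time, whereas here a hypothetical `heap_allocated` flag records it in the frame. The names `Frame`, `ramp`, `memory_to_free` and `heap_allocations` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical frame; `heap_allocated` exists only in this model.
struct Frame {
    int n;
    bool heap_allocated;
};

static int heap_allocations = 0;  // counts dynamic allocations for the demo

// `elided_storage` plays the role of the coro.elide result: the address of
// storage in the caller's frame, or nullptr when elision is not possible.
static Frame* ramp(void* elided_storage, int n) {
    auto* frame = static_cast<Frame*>(elided_storage);
    const bool on_heap = (frame == nullptr);
    if (on_heap) {
        frame = static_cast<Frame*>(std::malloc(sizeof(Frame)));
        if (frame == nullptr) return nullptr;
        ++heap_allocations;
    }
    frame->n = n;
    frame->heap_allocated = on_heap;
    return frame;
}

// Plays the role of coro.delete: the block to free, or nullptr when the
// allocation was elided.
static void* memory_to_free(Frame* frame) {
    return frame->heap_allocated ? frame : nullptr;
}

static void destroy(Frame* frame) {
    if (void* mem = memory_to_free(frame)) std::free(mem);
}

static int allocations_with_elision() {
    heap_allocations = 0;
    Frame local;                   // frame stored in the caller's stack frame
    Frame* hdl = ramp(&local, 4);
    destroy(hdl);                  // nothing is freed
    return heap_allocations;
}

static int allocations_without_elision() {
    heap_allocations = 0;
    Frame* hdl = ramp(nullptr, 4);
    if (hdl != nullptr) destroy(hdl);
    return heap_allocations;
}
```

With caller-provided storage, both the allocation and the matching deallocation are bypassed, which is the pairing that marking up the allocation and deallocation calls is meant to make possible.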
307+
293308 Multiple Suspend Points
294309-----------------------
295310
@@ -368,13 +383,13 @@ Distinct Save and Suspend
368383-------------------------
369384
370385In the previous example, setting a resume index (or some other state change that
371- needs to happen to prepare coroutine for resumption) happens at the same time as
372- suspension of a coroutine. However, in certain cases it is necessary to control
386+ needs to happen to prepare a coroutine for resumption) happens at the same time as
387+ the suspension of the coroutine. However, in certain cases, it is necessary to control
373388separately when the coroutine is prepared for resumption and when it is suspended.
374389
375- In the following example, coroutine represents some activity that is driven
390+ In the following example, a coroutine represents some activity that is driven
376391by completions of asynchronous operations `async_op1 ` and `async_op2 ` which get
377- a coroutine handle as a parameter and will resume the coroutine once async
392+ a coroutine handle as a parameter and resume the coroutine once the async
378393operation is finished.
379394
380395.. code-block :: llvm
@@ -422,11 +437,22 @@ Final Suspend
422437 until explicitly destroyed by the call to `coro.destroy`_. If we consider a case
423438 of a coroutine representing a generator that produces a finite sequence of
424439
440+ .. note ::
441+ * reason 1: The final suspend point is known. There is no need for the
442+ user to have extra code to track whether we are at the final suspend point or
443+ not.
444+ * reason 2: Guard against misuse of a coroutine by trying to resume a
445+ coroutine that has reached the end. For example, replacing ResumeFnPtr in the
446+ coroutine frame when final suspend is reached will result in a trap for
447+ free if someone calls `coro.resume ` on such a coroutine.
448+ * reason 3: One less case for a switch in the beginning of the resume
449+ function.
450+
425451One of the common coroutine usage patterns is a generator, where a coroutine
426452produces a (sometimes finite) sequence of values. To facilitate this pattern, a
427453frontend can designate a suspend point to be final. A coroutine suspended at
428454the final suspend point can only be resumed with the `coro.destroy `_ intrinsic.
429- Resuming such coroutine with `coro.resume `_ results in undefined behavior.
455+ Resuming such a coroutine with `coro.resume `_ leads to undefined behavior.
430456The `coro.done `_ intrinsic can be used to check whether a suspended coroutine
431457is at the final suspend point or not.
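A finite generator with a final suspend point can be modeled as follows. This is a sketch under stated assumptions: the final suspend is represented by clearing the resume pointer in the frame, which is one possible encoding rather than a documented lowering, and `GenFrame`, `gen_resume`, `done` and `collect` are hypothetical names. The sketch assumes `first <= last`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical generator frame. A null `resume` pointer marks a coroutine
// suspended at its final suspend point (an assumption of this sketch).
struct GenFrame {
    void (*resume)(GenFrame*);
    int current;  // value currently exposed to the consumer
    int last;     // last value of the finite sequence
};

static void gen_resume(GenFrame* g) {
    if (g->current >= g->last) {
        g->resume = nullptr;  // reached the final suspend point
        return;
    }
    ++g->current;             // ordinary suspend point: produce the next value
}

// Plays the role of coro.done: is the coroutine at its final suspend point?
static bool done(const GenFrame* g) { return g->resume == nullptr; }

// Consumer loop: checks `done` before every resumption, so the coroutine is
// never resumed after it has reached the final suspend point.
static std::vector<int> collect(int first, int last) {
    GenFrame g{&gen_resume, first, last};
    std::vector<int> out;
    while (!done(&g)) {
        out.push_back(g.current);
        g.resume(&g);
    }
    return out;
}
```

The consumer needs no bookkeeping of its own to know when the sequence has ended; the check is answered from the frame.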
432458
@@ -462,7 +488,7 @@ be used to communicate with the coroutine. This distinguished alloca is called
462488intrinsic.
463489
464490The following coroutine designates a 32 bit integer `promise ` and uses it to
465- store the current value produces by a coroutine.
491+ store the current value produced by a coroutine.
466492
467493.. code-block :: llvm
468494
@@ -495,7 +521,7 @@ store the current value produces by a coroutine.
495521 ret i8* %frame
496522 }
497523
498- Coroutine consumer can rely on the `coro.promise `_ intrinsic to access the
524+ A coroutine consumer can rely on the `coro.promise `_ intrinsic to access the
499525coroutine promise.
500526
501527.. code-block :: llvm
@@ -556,10 +582,10 @@ The argument is a coroutine handle to a suspended coroutine.
556582Semantics:
557583""""""""""
558584
559- If coroutine identity is known , the `coro.destroy ` intrinsic is replaced with a
585+ When possible , the `coro.destroy ` intrinsic is replaced with a
560586direct call to the coroutine destroy function. Otherwise, it is replaced with an
561587indirect call based on the function pointer for the destroy function stored
562- in the coroutine frame. Destroying a coroutine that is not suspended results in
588+ in the coroutine frame. Destroying a coroutine that is not suspended leads to
563589undefined behavior.
564590
565591.. _coro.resume :
@@ -585,10 +611,10 @@ The argument is a handle to a suspended coroutine.
585611Semantics:
586612""""""""""
587613
588- If coroutine identity is known , the `coro.resume ` intrinsic is replaced with a
614+ When possible , the `coro.resume ` intrinsic is replaced with a
589615direct call to the coroutine resume function. Otherwise, it is replaced with an
590616indirect call based on the function pointer for the resume function stored
591- in the coroutine frame. Resuming a coroutine that is not suspended results in
617+ in the coroutine frame. Resuming a coroutine that is not suspended leads to
592618undefined behavior.
593619
594620.. _coro.done :
@@ -615,7 +641,7 @@ Semantics:
615641""""""""""
616642
617643Using this intrinsic on a coroutine that does not have a `final suspend `_ point
618- or on a coroutine that is not suspended results in an undefined behavior.
644+ or on a coroutine that is not suspended leads to undefined behavior.
619645
620646.. _coro.promise :
621647
@@ -629,8 +655,8 @@ or on a coroutine that is not suspended results in an undefined behavior.
629655Overview:
630656"""""""""
631657
632- The '``llvm.experimental.coro.promise ``' intrinsic returns an address of
633- a `coroutine promise `_.
658+ The '``llvm.experimental.coro.promise ``' intrinsic returns a pointer to a
659+ `coroutine promise `_.
634660
635661Arguments:
636662""""""""""
@@ -891,7 +917,7 @@ Overview:
891917
892918The '``llvm.experimental.coro.fork ``' intrinsic together with the conditional
893919branch consuming the boolean value returned from this intrinsic is used to
894- indicates where the control flows should transfer on the first suspension of the
920+ indicate where control should transfer on the first suspension of the
895921coroutine.
896922
897923Arguments:
@@ -935,7 +961,7 @@ The `coro.resume.end`_ intrinsic is a no-op during an initial invocation of the
935961coroutine. When the coroutine resumes, the intrinsic marks the point when
936962the coroutine needs to return control back to the caller.
937963
938- This intrinsic is removed by the CoroSplit pass when coroutine is split into
964+ This intrinsic is removed by the CoroSplit pass when a coroutine is split into
939965the start, resume and destroy parts. In the start part, the intrinsic is removed;
940966in the resume and destroy parts, it is replaced with a `ret void ` instruction and
941967the rest of the block containing the `coro.resume.end ` instruction is discarded.
@@ -992,10 +1018,9 @@ Overview:
9921018"""""""""
9931019
9941020The '``llvm.experimental.coro.save ``' marks the point where a coroutine
995- is considered suspened (and thus eligible for resumption) but control
996- is not yet transferred back to the caller. Its return value should be consumed
997- by exactly one `coro.suspend ` intrinsic that marks the point when control need
998- to be transferred to the coroutine's caller.
1021+ is considered suspended (and thus eligible for resumption). Its return value
1022+ should be consumed by exactly one `coro.suspend ` intrinsic that marks the point
1023+ where control needs to be transferred to the coroutine's caller.
9991024
10001025Arguments:
10011026""""""""""
@@ -1010,7 +1035,8 @@ the coroutine from the corresponding suspend point should be done at the point o
10101035`coro.save ` intrinsic.
10111036
10121037Example:
1013- ========
1038+ """"""""
1039+
10141040Separate save and suspend points are a necessity when a coroutine is used to
10151041represent an asynchronous control flow driven by callbacks representing
10161042completions of asynchronous operations.
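The reason the save must come before the asynchronous operation is started can be illustrated with a small C++ state machine. This is a simplified sketch: the operation completes inline, whereas a real completion would typically arrive later or on another thread, and `Activity`, `async_op`, `start_op` and `run_activity` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical coroutine state: a resume index plus a log of what ran.
struct Activity {
    int index = 0;
    std::vector<std::string> log;
};

using Completion = void (*)(Activity&);

// Hypothetical asynchronous operation. In this sketch it completes
// immediately and invokes the completion before returning.
static void async_op(Activity& a, Completion on_done) { on_done(a); }

static void resume(Activity& a);

static void start_op(Activity& a, int next_index) {
    a.index = next_index;  // "save": the coroutine is now ready to be resumed
    async_op(a, &resume);  // the completion may resume it before this returns
    // "suspend": control goes back to the caller
}

static void resume(Activity& a) {
    switch (a.index) {
    case 0:
        a.log.push_back("start");
        start_op(a, 1);
        break;
    case 1:
        a.log.push_back("after op1");
        start_op(a, 2);
        break;
    default:
        a.log.push_back("after op2");
        break;
    }
}

static std::vector<std::string> run_activity() {
    Activity a;
    resume(a);
    return a.log;
}
```

If `a.index` were updated only after `async_op` returned, the inline completion would re-enter `resume` with the old index and run the same step again.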
@@ -1047,6 +1073,8 @@ to execute coroutine and inliner passes in the following order.
10471073#. If function `F ` is a coroutine, resume and destroy parts are extracted into
10481074 `F.resume ` and `F.destroy ` functions by the CoroSplit pass.
10491075
1076+ .. _CoroSplit :
1077+
10501078CoroSplit
10511079---------
10521080The pass CoroSplit extracts resume and destroy parts into separate functions.