Description
(I feel sure I've reported this before but cannot find it.)
Variables local to different cases of a switch statement have disjoint live ranges. That means it is in principle possible for each case to interpret the same stack slots differently. However, the compiler does not implement this optimization (I guess to make it easier for the GC to know the type of a given stack slot at a given pc?). Consequently, large switch statements lead to large stack frames.
This is particularly pathological for a bytecode interpreter, whose main loop contains a 256-case switch, some of whose cases are quite complex. For example, the Starlark-go interpreter uses a whopping 1.4 KB of stack per recursive Starlark call, which cannot be good for locality.
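To make the shape of the problem concrete, here is a minimal sketch of such an interpreter loop. It is a toy stack machine with hypothetical opcodes (`opPush`, `opAdd`, `opCall`, `opRet`) and a hypothetical `run` function, not Starlark-go's actual code. The point is only that every nested call in the interpreted program re-enters `run`, so whatever frame size the switch ends up with is paid once per level of recursion.

```go
package main

import "fmt"

// Hypothetical opcodes for a toy stack machine (illustration only;
// this is not Starlark-go's instruction set).
const (
	opPush byte = iota // push the next byte as an integer
	opAdd              // pop two values, push their sum
	opCall             // run the program named by the next byte, push its result
	opRet              // return the top of the stack
)

// run interprets progs[id]. Each case declares its own locals. In a real
// interpreter the cases are much larger, and locals that are spilled or
// address-taken each get their own stack slot, so the frame of run grows
// with the number of cases.
func run(progs [][]byte, id int) int {
	code := progs[id]
	var stack []int
	for pc := 0; pc < len(code); pc++ {
		switch code[pc] {
		case opPush:
			pc++
			v := int(code[pc])
			stack = append(stack, v)
		case opAdd:
			n := len(stack)
			x, y := stack[n-2], stack[n-1]
			stack = append(stack[:n-2], x+y)
		case opCall:
			pc++
			callee := int(code[pc])
			// Each nested call in the interpreted program costs one
			// full frame of run on the Go stack.
			result := run(progs, callee)
			stack = append(stack, result)
		case opRet:
			return stack[len(stack)-1]
		}
	}
	return 0
}

func main() {
	progs := [][]byte{
		{opPush, 1, opCall, 1, opAdd, opRet}, // 1 + (result of program 1)
		{opPush, 2, opPush, 3, opAdd, opRet}, // 2 + 3
	}
	fmt.Println(run(progs, 0)) // prints 6
}
```

In this toy the locals are few and will mostly live in registers; the frame growth described above only becomes significant once there are many cases with nontrivial locals.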
The comment on each case in the code below shows the operand of the function's `SUB sp` instruction (that is, its stack frame size) when the switch contains the cases up to and including that one:
```go
func f(b byte) {
	switch b {
	case 1: // 88
		var x int
		var s string
		use(&x)
		use(&s)
	case 2: // 140
		var x int
		var s string
		use(&x)
		use(&s)
	case 3: // 192
		var x int
		var s string
		use(&x)
		use(&s)
	case 4: // 252
		var x int
		var s string
		use(&x)
		use(&s)
	case 5: // 308
		var x int
		var s string
		use(&x)
		use(&s)
	}
}

//go:noinline
func use(any) {}
```

Related: