Goroutine stack and local variable allocation in Go Cherie Hsieh Golang Taiwan Community GDG Hsinchu Women Techmakers Community
Outline 1. The Run-Time Stack 2. User Stack in C Program 3. User Stack in Go goroutine 4. Goroutine Stack allocation 5. System Stack and User Stack 6. Stack Growing Mechanism
The Run-Time Stack
The Run-Time Stack In common Linux systems, each process/thread has two stacks: 1. Kernel stack (system calls, interrupt handlers) 2. User stack (execute user-space applications)
The Run-Time Stack User Stack provides last-in, first out memory management for procedure-calling. source: https://www.mdpi.com/computers/computers-09-00048/article_deploy/html/images/computers-09-00048-g005.png
User Stack in C Program
User Stack in C Program The memory layout of a process with main thread. source: https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/proccontext.jpg
User Stack in C Program The memory layout of a process with multiple threads. source: https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/thdcontext.jpg
Stack Size in C Program Default user stack size of main thread: 8k Use ulimit cmd-tool to check or change the default stack size. execve() causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and data segments. cherie@coscup:~$ ulimit -a stack size (kbytes, -s) 8192 max user processes (-u) 59988
Stack Size in C Program Create a new thread with a customized stack size using system call clone() The stack argument specifies the location of the stack used by the child threads. int clone(int (*fn)(void *), void *stack, int flags, void *arg, ... /* pid_t *parent_tid, void *tls, pid_t *child_tid */);
User Stack in Go goroutine
User Stack in Go goroutine Goroutine is lightweight, costing little more than the allocation of stack space. And the stacks start small, so they are cheap. /* source code: runtime/runtime2.go */ type stack struct { lo uintptr hi uintptr } type g struct { // Stack parameters. // stack describes the actual stack memory: [stack.lo, stack.hi). stack stack // ...more members }
Stack Allocation in Go goroutine
The Free List in stackCache and stackPool Four size classes of free list: 2k, 4k, 8k, and 16k. /* source code: runtime/malloc.go */ /* We want to cache 2KB, 4KB, 8KB, and 16KB stacks. Larger stacks */ /* will be allocated directly. */ /* Since FixedStack is different on different systems, we */ /* must vary NumStackOrders to keep the same maximum cached size. */ /* OS | FixedStack | NumStackOrders */ /* -----------------+------------+--------------- */ /* linux/darwin/bsd | 2KB | 4 */ /* windows/32 | 4KB | 3 */ /* windows/64 | 8KB | 2 */ /* plan9 | 4KB | 3 */ The size classes would be different due to different operating systems and architectures.
The Free List in stackCache and stackPool Considering the cache line of each architecture, a cache line padding is added to item structure for performance optimization. /* Global pool of spans that have free stacks. */ /* Stacks are assigned an order according to size. */ /* There is a free list for each order. */ var stackpool [_NumStackOrders]struct { item stackpoolItem _ [cpu.CacheLinePadSize - unsafe.Sizeof(stackpoolItem{})%cpu.CacheLinePadSize]byte }
System Stack in Go
System Stack in Go There are two stacks in Go 1. User stack Every non-dead G has a user stack associated with it, which is what user Go code executes on. 2. System stack Every M has a system stack associated with it (also known as the M's "g0") . System and signal stacks cannot grow, but are large enough to execute runtime and cgo code (8K). Code running on the system stack is implicitly non-preemptible and the garbage collector does not scan system stacks.
System Stack in Go A main thread with a single goroutine
User Stack Extension in Go goroutine
User Stack Extension in Go goroutine A goroutines stack starts small, so they are cheap, and grow by allocating (and freeing) heap storage as required. const ( StackSmall = 128 StackBig = 4096 ) type g struct { stack stack stackguard0 uintptr // ...more members }
User Stack Extension in Go goroutine Stack extension check: 1. frame size <= StackSmall Comparing the stack pointer with the stack guard, then extend the stack if the stack pointer is below the stack guard. 2. StackSmall < frame size <= StackBig Moving down the stack pointer and comparing the stack pointer with the stack guard. Extending the stack if the stack pointer is below the stack guard. 3. frame size > StackBig Always does stack extension.
User Stack Extension in Go goroutine
The Process of Stack Growth How stack growth works? 1. record the stack map (global) for pointer checking. 2. adjust the pointer value according to adjustInfo.
Experiments
Stack Overflow package main const ( size = 2048 maxDepth = 5000000 ) func bigFrame(args [size]uint64, depth int) { if depth < maxDepth { bigFrame(args, depth+1) } } func main() { var args [size]uint64 bigFrame(args, 0) }
Stack Overflow Stack Overflow Error!
Stack Overflow Default Maxstacksize /* source code: runtime/proc.go */ if sys.PtrSize == 8 { maxstacksize = 1000000000 } else { maxstacksize = 250000000 }
Goroutine Initialization with > 2k function args package main /* change arrSize to 260 and check the result */ const arrSize = 4 var c = make(chan bool) func bigArray(arr [arrSize]uint64) { var sum uint64 = 1 for i := 0; i < arrSize; i++ { sum = sum + arr[i] } c <- true } func main() { var arr [arrSize]uint64 go bigArray(arr) <-c }
Further Study 1. stack frame is not always required 2. stack allocation with large frame size. 3. MemStats.StackInuse memStats := new(runtime.MemStats) runtime.ReadMemStats(memStats)
Thank You. Blog: https://yushuanhsieh.github.io/ Linkedin: https://www.linkedin.com/in/yu-shuan-hsieh-96994491/

Goroutine stack and local variable allocation in Go

  • 1.
    Goroutine stack andlocal variable allocation in Go Cherie Hsieh Golang Taiwan Community GDG Hsinchu Women Techmakers Community
  • 2.
    Outline 1. The Run-TimeStack 2. User Stack in C Program 3. User Stack in Go goroutine 4. Goroutine Stack allocation 5. System Stack and User Stack 6. Stack Growing Mechanism
  • 3.
  • 4.
    The Run-Time Stack Incommon Linux systems, each process/thread has two stacks: 1. Kernel stack (system calls, interrupt handlers) 2. User stack (execute user-space applications)
  • 5.
    The Run-Time Stack UserStack provides last-in, first out memory management for procedure-calling. source: https://www.mdpi.com/computers/computers-09-00048/article_deploy/html/images/computers-09-00048-g005.png
  • 6.
    User Stack inC Program
  • 7.
    User Stack inC Program The memory layout of a process with main thread. source: https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/proccontext.jpg
  • 8.
    User Stack inC Program The memory layout of a process with multiple threads. source: https://cseweb.ucsd.edu/classes/sp16/cse120-a/applications/ln/thdcontext.jpg
  • 9.
    Stack Size inC Program Default user stack size of main thread: 8k Use ulimit cmd-tool to check or change the default stack size. execve() causes the program that is currently being run by the calling process to be replaced with a new program, with newly initialized stack, heap, and data segments. cherie@coscup:~$ ulimit -a stack size (kbytes, -s) 8192 max user processes (-u) 59988
  • 10.
    Stack Size inC Program Create a new thread with a customized stack size using system call clone() The stack argument specifies the location of the stack used by the child threads. int clone(int (*fn)(void *), void *stack, int flags, void *arg, ... /* pid_t *parent_tid, void *tls, pid_t *child_tid */);
  • 11.
    User Stack inGo goroutine
  • 12.
    User Stack inGo goroutine Goroutine is lightweight, costing little more than the allocation of stack space. And the stacks start small, so they are cheap. /* source code: runtime/runtime2.go */ type stack struct { lo uintptr hi uintptr } type g struct { // Stack parameters. // stack describes the actual stack memory: [stack.lo, stack.hi). stack stack // ...more members }
  • 13.
    Stack Allocation inGo goroutine
  • 14.
    The Free Listin stackCache and stackPool Four size classes of free list: 2k, 4k, 8k, and 16k. /* source code: runtime/malloc.go */ /* We want to cache 2KB, 4KB, 8KB, and 16KB stacks. Larger stacks */ /* will be allocated directly. */ /* Since FixedStack is different on different systems, we */ /* must vary NumStackOrders to keep the same maximum cached size. */ /* OS | FixedStack | NumStackOrders */ /* -----------------+------------+--------------- */ /* linux/darwin/bsd | 2KB | 4 */ /* windows/32 | 4KB | 3 */ /* windows/64 | 8KB | 2 */ /* plan9 | 4KB | 3 */ The size classes would be different due to different operating systems and architectures.
  • 15.
    The Free Listin stackCache and stackPool Considering the cache line of each architecture, a cache line padding is added to item structure for performance optimization. /* Global pool of spans that have free stacks. */ /* Stacks are assigned an order according to size. */ /* There is a free list for each order. */ var stackpool [_NumStackOrders]struct { item stackpoolItem _ [cpu.CacheLinePadSize - unsafe.Sizeof(stackpoolItem{})%cpu.CacheLinePadSize]byte }
  • 16.
  • 17.
    System Stack inGo There are two stacks in Go 1. User stack Every non-dead G has a user stack associated with it, which is what user Go code executes on. 2. System stack Every M has a system stack associated with it (also known as the M's "g0") . System and signal stacks cannot grow, but are large enough to execute runtime and cgo code (8K). Code running on the system stack is implicitly non-preemptible and the garbage collector does not scan system stacks.
  • 18.
    System Stack inGo A main thread with a single goroutine
  • 19.
    User Stack Extensionin Go goroutine
  • 20.
    User Stack Extensionin Go goroutine A goroutines stack starts small, so they are cheap, and grow by allocating (and freeing) heap storage as required. const ( StackSmall = 128 StackBig = 4096 ) type g struct { stack stack stackguard0 uintptr // ...more members }
  • 21.
    User Stack Extensionin Go goroutine Stack extension check: 1. frame size <= StackSmall Comparing the stack pointer with the stack guard, then extend the stack if the stack pointer is below the stack guard. 2. StackSmall < frame size <= StackBig Moving down the stack pointer and comparing the stack pointer with the stack guard. Extending the stack if the stack pointer is below the stack guard. 3. frame size > StackBig Always does stack extension.
  • 22.
    User Stack Extensionin Go goroutine
  • 23.
    The Process ofStack Growth How stack growth works? 1. record the stack map (global) for pointer checking. 2. adjust the pointer value according to adjustInfo.
  • 24.
  • 25.
    Stack Overflow package main const( size = 2048 maxDepth = 5000000 ) func bigFrame(args [size]uint64, depth int) { if depth < maxDepth { bigFrame(args, depth+1) } } func main() { var args [size]uint64 bigFrame(args, 0) }
  • 26.
  • 27.
    Stack Overflow Default Maxstacksize /*source code: runtime/proc.go */ if sys.PtrSize == 8 { maxstacksize = 1000000000 } else { maxstacksize = 250000000 }
  • 28.
    Goroutine Initialization with> 2k function args package main /* change arrSize to 260 and check the result */ const arrSize = 4 var c = make(chan bool) func bigArray(arr [arrSize]uint64) { var sum uint64 = 1 for i := 0; i < arrSize; i++ { sum = sum + arr[i] } c <- true } func main() { var arr [arrSize]uint64 go bigArray(arr) <-c }
  • 29.
    Further Study 1. stackframe is not always required 2. stack allocation with large frame size. 3. MemStats.StackInuse memStats := new(runtime.MemStats) runtime.ReadMemStats(memStats)
  • 30.
    Thank You. Blog: https://yushuanhsieh.github.io/ Linkedin:https://www.linkedin.com/in/yu-shuan-hsieh-96994491/