Go Concurrency Model: Source Analysis of G

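Go's threading model is built on three core elements, M, P, and G, which together hold up the framework of the scheduler. G is short for goroutine, commonly called a "coroutine". For how coroutines, threads, and processes compare, see "The differences between processes, threads, and coroutines".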

While the program runs, every goroutine is backed by a g struct object. The g stores the goroutine's stack, its status, and its task function. The g struct is defined in src/runtime/runtime2.go.

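g objects are reusable: when a goroutine exits, its g object is placed in a pool of free g objects so that a later goroutine can pick it up, which reduces memory allocation overhead.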
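The reuse described above can be pictured as a small free list. The sketch below is a simplified model and not the runtime's actual code: the `gobj` type, the `gPool` type, and its `get`/`put` methods are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// gobj is a hypothetical stand-in for the runtime's g struct.
type gobj struct {
	id   int   // illustrative identifier, not the real goid
	next *gobj // link to the next free object
}

// gPool is a hypothetical free list of objects whose goroutines have exited.
type gPool struct {
	head      *gobj
	allocated int // number of objects that had to be freshly allocated
}

// get reuses a free object if one exists, otherwise allocates a new one.
func (p *gPool) get() *gobj {
	if g := p.head; g != nil {
		p.head = g.next
		g.next = nil
		return g
	}
	p.allocated++
	return &gobj{id: p.allocated}
}

// put returns an object to the free list when its goroutine "exits".
func (p *gPool) put(g *gobj) {
	g.next = p.head
	p.head = g
}

func main() {
	var pool gPool
	a := pool.get() // fresh allocation
	pool.put(a)     // the goroutine exits; its object goes back to the pool
	b := pool.get() // the same object is handed out again
	fmt.Println(a == b, pool.allocated)
}
```

In this model the second `get` returns the very object that was just released, so only one allocation ever happens.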

1. Goroutine field annotations

g has a great many fields, so we go through them in groups:

type g struct {
    // Stack parameters.
    // stack describes the actual stack memory: [stack.lo, stack.hi).
    // stackguard0 is the stack pointer compared in the Go stack growth prologue.
    // It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption.
    // stackguard1 is the stack pointer compared in the C stack growth prologue.
    // It is stack.lo+StackGuard on g0 and gsignal stacks.
    // It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash).
    stack       stack   // offset known to runtime/cgo
    // Threshold for the stack-space check; the stack grows when SP drops below it. Used by Go code.
    stackguard0 uintptr // offset known to liblink
    // Threshold for the stack-space check, as above, but used by C code.
    stackguard1 uintptr // offset known to liblink
}

stack describes the stack memory range of the current goroutine, [stack.lo, stack.hi). The stack type itself is defined as:

// Stack describes a Go execution stack.
// The bounds of the stack are exactly [lo, hi),
// with no implicit data structures on either side.
// The range is half-open: lo <= addr < hi.
type stack struct {
    lo uintptr // lowest address of the stack owned by this goroutine (inclusive)
    hi uintptr // upper bound of the stack owned by this goroutine (exclusive)
}

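stackguard0 and stackguard1 are both stack pointers used when deciding whether the stack must grow; the former serves the Go stack, the latter the C stack.

If stackguard0 is set to StackPreempt, a preemption has been requested for the current goroutine.

The stackguard0 field of g is the guard line that is checked before the stack overflows. It sits at an offset of 16 bytes inside g (on 64-bit platforms the preceding stack field occupies two 8-byte words). A function's prologue compares the real SP (stack pointer) against this guard line, normally stack.lo+StackGuard; if SP has crossed the line, the stack has to grow. The prologue first calls runtime·morestack_noctxt() to grow the stack and then jumps back to the start of the function, by which time the stack has been adjusted. The size check then runs again, and if space is still insufficient the stack grows again, until it is large enough.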
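The check-grow-recheck loop above can be sketched in plain Go. This is only a model under stated assumptions: the real check is emitted by the compiler in assembly, the `stackGuard` value below is illustrative (the real one depends on platform and Go version), and `gmodel`, `grow`, and `prologue` are hypothetical names. The model also keeps `hi` fixed and lowers `lo`, whereas the runtime copies the stack to new memory.

```go
package main

import "fmt"

// stackGuard is an illustrative value, not the runtime's actual constant.
const stackGuard = 928

type stack struct{ lo, hi uintptr }

// gmodel is a hypothetical, heavily reduced g.
type gmodel struct {
	stack       stack
	stackguard0 uintptr
}

// grow doubles the modeled stack, standing in for what morestack achieves.
func (g *gmodel) grow() {
	size := g.stack.hi - g.stack.lo
	g.stack.lo = g.stack.hi - 2*size
	g.stackguard0 = g.stack.lo + stackGuard
}

// prologue models a function entry that needs frame bytes below sp.
// It grows the stack until the frame fits above the guard line and
// reports how many times it had to grow.
func (g *gmodel) prologue(sp, frame uintptr) int {
	grows := 0
	for sp-frame <= g.stackguard0 {
		g.grow()
		grows++
	}
	return grows
}

func main() {
	hi := uintptr(1 << 20)
	g := &gmodel{stack: stack{lo: hi - 2048, hi: hi}}
	g.stackguard0 = g.stack.lo + stackGuard

	sp := hi - 512 // 512 bytes of the stack are already in use
	small := g.prologue(sp, 256)
	large := g.prologue(sp, 4096)
	size := g.stack.hi - g.stack.lo
	fmt.Println(small, large, size)
}
```

With a 2 KiB starting stack, the 256-byte frame fits without growing, while the 4096-byte frame forces two doublings, leaving an 8 KiB stack.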

type g struct {
    preempt       bool // preemption signal, duplicates stackguard0 = stackpreempt
    preemptStop   bool // transition to _Gpreempted on preemption; otherwise, just deschedule
    preemptShrink bool // shrink stack at synchronous safe point
}

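preempt: the preemption flag. When it is true, it duplicates the signal stackguard0 = stackpreempt.
preemptStop: on preemption, move the goroutine to the _Gpreempted state; otherwise the goroutine is simply descheduled.
preemptShrink: shrink the stack at a synchronous safe point.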

type g struct {
    _panic       *_panic // innermost panic - offset known to liblink
    _defer       *_defer // innermost defer
}

_panic: the innermost panic in the current goroutine.
_defer: the innermost defer in the current goroutine.

type g struct {
    m            *m      // current m; offset known to arm liblink
    sched        gobuf
    goid         int64
}

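m: the M the current goroutine is bound to.
sched: scheduling data for the current goroutine. On a context switch the current context is saved here and read back when the goroutine runs again.
goid: the unique identifier of the current goroutine. It is not visible to developers and is generally not used; the Go team does not expose access to this field.

The gobuf struct is defined as: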

type gobuf struct {
    // The offsets of sp, pc, and g are known to (hard-coded in) libmach.
    //
    // ctxt is unusual with respect to GC: it may be a
    // heap-allocated funcval, so GC needs to track it, but it
    // needs to be set and cleared from assembly, where it's
    // difficult to have write barriers. However, ctxt is really a
    // saved, live register, and we only ever exchange it between
    // the real register and the gobuf. Hence, we treat it as a
    // root during stack scanning, which means assembly that saves
    // and restores it doesn't need write barriers. It's still
    // typed as a pointer so that any other writes from Go get
    // write barriers.
    sp   uintptr
    pc   uintptr
    g    guintptr
    ctxt unsafe.Pointer
    ret  sys.Uintreg
    lr   uintptr
    bp   uintptr // for GOEXPERIMENT=framepointer
}

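sp: the stack pointer.
pc: the program counter, that is, the position the program has reached.
ctxt: unusual with respect to GC. It may be a heap-allocated funcval, so the GC needs to track it, yet it has to be set and cleared from assembly, where write barriers are difficult to have. Write barriers are worth studying in their own right.
g: the goroutine this gobuf belongs to.
ret: the result of a system call.

When the scheduler moves a G from one state to another, it saves the context into this gobuf struct, and when the G runs again the context is read back from it; the struct exists mainly to hold context temporarily. The stack pointer sp and the program counter pc are used to save or restore the register values and so determine the code that runs next.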
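The save-and-restore role of gobuf can be illustrated with a toy model. This is a sketch under assumptions, since the real switch happens in assembly: `regs`, `gmodel`, `save`, and `resume` are hypothetical names, and only the sp and pc fields are modeled.

```go
package main

import "fmt"

// regs is a hypothetical model of the two CPU registers discussed in the text.
type regs struct{ sp, pc uintptr }

// gobuf here keeps only the sp and pc fields.
type gobuf struct{ sp, pc uintptr }

// gmodel is a hypothetical, heavily reduced g.
type gmodel struct {
	sched gobuf
}

// save stores the live registers into g.sched when g is switched out.
func save(cpu *regs, g *gmodel) {
	g.sched = gobuf{sp: cpu.sp, pc: cpu.pc}
}

// resume loads g.sched back into the registers so g continues where it stopped.
func resume(cpu *regs, g *gmodel) {
	cpu.sp, cpu.pc = g.sched.sp, g.sched.pc
}

func main() {
	cpu := &regs{sp: 0xc000, pc: 0x401000}
	g1 := &gmodel{}
	g2 := &gmodel{sched: gobuf{sp: 0xd000, pc: 0x402000}}

	save(cpu, g1)   // g1 is descheduled
	resume(cpu, g2) // g2 starts running
	cpu.pc += 0x10  // g2 makes some progress
	save(cpu, g2)   // g2 is descheduled in turn
	resume(cpu, g1) // g1 continues exactly where it stopped
	fmt.Printf("sp=%#x pc=%#x\n", cpu.sp, cpu.pc)
}
```

After the round trip the registers hold g1's original values again, while g2's progress is preserved in its own sched field.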

2. Goroutine states

A goroutine can be in one of the following states:

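State	Value	Description
`_Gidle`	0	Just allocated and not yet initialized.
`_Grunnable`	1	Not executing code and does not own the stack; sits on a run queue.
`_Grunning`	2	May execute code and owns the stack; has been assigned a kernel thread M and a processor P.
`_Gsyscall`	3	Executing a system call, not user code; owns the stack; has been assigned a kernel thread M but is not on a run queue.
`_Gwaiting`	4	Blocked in the runtime; not executing user code and not on a run queue, but possibly recorded on a channel's wait queue so that ready() can wake it when needed.
`_Gmoribund_unused`	5	Currently unused, but hard-coded in the gdb scripts; can be ignored.
`_Gdead`	6	Not in use: it may have just exited, be on a free list, or be just being initialized. Not executing code; may or may not have a stack allocated. The G and its stack (if any) are owned by the M that is exiting the G or that obtained the G from the free list.
`_Genqueue_unused`	7	Currently unused; can be ignored.
`_Gcopystack`	8	The stack is being copied; not executing code and not on a run queue.
`_Gpreempted`	9	Blocked because of preemption; not executing user code and not on a run queue, waiting to be woken.
`_Gscan`	0x1000	GC is scanning the stack; not executing code. This is a bit that is combined with the other states.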

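Note that _Gmoribund_unused is not used although it still appears in the gdb scripts, and _Genqueue_unused is likewise unused at present; neither needs attention.

_Gscan combined with one of the states above other than _Grunning indicates that the GC is scanning the stack. The goroutine is not executing user code, and the stack is owned by the goroutine that set the _Gscan bit. _Gscanrunning is a special case: it briefly blocks state transitions while the GC signals the G to scan its own stack.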

State	Definition
`_Gscanrunnable`	= _Gscan + _Grunnable // 0x1001
`_Gscanrunning`	= _Gscan + _Grunning // 0x1002
`_Gscansyscall`	= _Gscan + _Gsyscall // 0x1003
`_Gscanwaiting`	= _Gscan + _Gwaiting // 0x1004
`_Gscanpreempted`	= _Gscan + _Gpreempted // 0x1009
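Because _Gscan is a single bit on top of the base states, testing and stripping it are simple bit operations. The constants below mirror the values listed in the tables; the helper functions `isScanning` and `baseStatus` are hypothetical names added for illustration and are not runtime functions.

```go
package main

import "fmt"

// Status values as listed in the tables above.
const (
	_Gidle = iota // 0
	_Grunnable
	_Grunning
	_Gsyscall
	_Gwaiting
	_Gmoribund_unused
	_Gdead
	_Genqueue_unused
	_Gcopystack
	_Gpreempted

	_Gscan          = 0x1000
	_Gscanrunnable  = _Gscan + _Grunnable
	_Gscanrunning   = _Gscan + _Grunning
	_Gscansyscall   = _Gscan + _Gsyscall
	_Gscanwaiting   = _Gscan + _Gwaiting
	_Gscanpreempted = _Gscan + _Gpreempted
)

// isScanning reports whether the _Gscan bit is set (hypothetical helper).
func isScanning(status uint32) bool { return status&_Gscan != 0 }

// baseStatus strips the _Gscan bit and returns the underlying state (hypothetical helper).
func baseStatus(status uint32) uint32 { return status &^ _Gscan }

func main() {
	var s uint32 = _Gscanwaiting
	fmt.Printf("%#x %v %d\n", s, isScanning(s), baseStatus(s))
}
```

For _Gscanwaiting this prints `0x1004 true 4`: the scan bit is set and the underlying state is _Gwaiting.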

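3. Goroutine state transitions

Excluding the two unused states mentioned above, there are 14 status values in total. Many of these states can change into one another, as the following diagram shows:

[Figure: Goroutine state transition diagram]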

type g struct {
    syscallsp    uintptr        // if status==Gsyscall, syscallsp = sched.sp to use during gc
    syscallpc    uintptr        // if status==Gsyscall, syscallpc = sched.pc to use during gc
    stktopsp     uintptr        // expected sp at top of stack, to check in traceback
    param        unsafe.Pointer // passed parameter on wakeup
    atomicstatus uint32
    stackLock    uint32 // sigprof/scang lock; TODO: fold in to atomicstatus
}

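atomicstatus: the current status of the G; the possible values were described above.
syscallsp: if the G's status is _Gsyscall, this holds sched.sp, mainly for use during GC.
syscallpc: if the G's status is _Gsyscall, this holds sched.pc, mainly for use during GC. These two fields are therefore normally used together.
stktopsp: the expected SP at the top of the stack, checked during traceback.
param: the parameter passed in when the G is woken, for example by a call to ready().
stackLock: the stack lock (sigprof/scang lock).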

type g struct {
    waitsince    int64      // approx time when the g become blocked
    waitreason   waitReason // if status==Gwaiting
}

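waitsince: the approximate time at which the G became blocked.
waitreason: the reason the G is blocked; meaningful when the status is _Gwaiting.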

type g struct {
    // asyncSafePoint is set if g is stopped at an asynchronous
    // safe point. This means there are frames on the stack
    // without precise pointer information.
    asyncSafePoint bool
    paniconfault bool // panic (instead of crash) on unexpected fault address
    gcscandone   bool // g has scanned stack; protected by _Gscan bit in status
    throwsplit   bool // must not split stack
}

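asyncSafePoint: set to true if the g stopped at an asynchronous safe point, which means there are frames on the stack without precise pointer information.
paniconfault: panic (instead of crashing) on an unexpected fault address.
gcscandone: the g has finished scanning its stack; protected by the _Gscan bit in the status.
throwsplit: the stack must not be split.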

type g struct {
    // activeStackChans indicates that there are unlocked channels
    // pointing into this goroutine's stack. If true, stack
    // copying needs to acquire channel locks to protect these
    // areas of the stack.
    activeStackChans bool
    // parkingOnChan indicates that the goroutine is about to
    // park on a chansend or chanrecv. Used to signal an unsafe point
    // for stack shrinking. It's a boolean value, but is updated atomically.
    parkingOnChan uint8
}

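activeStackChans: indicates that unlocked channels point into this g's stack. If true, copying the stack must acquire the channel locks to protect those areas.
parkingOnChan: indicates that the g is about to park on a chansend or chanrecv. It marks an unsafe point for stack shrinking. It is a boolean value, but it is updated atomically.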

type g struct {
    raceignore     int8     // ignore race detection events
    sysblocktraced bool     // StartTrace has emitted EvGoInSyscall about this goroutine
    sysexitticks   int64    // cputicks when syscall has returned (for tracing)
    traceseq       uint64   // trace event sequencer
    tracelastp     puintptr // last P emitted an event for this goroutine
    lockedm        muintptr
    sig            uint32
    writebuf       []byte
    sigcode0       uintptr
    sigcode1       uintptr
    sigpc          uintptr
    gopc           uintptr         // pc of go statement that created this goroutine
    ancestors      *[]ancestorInfo // ancestor information goroutine(s) that created this goroutine (only used if debug.tracebackancestors)
    startpc        uintptr         // pc of goroutine function
    racectx        uintptr
    waiting        *sudog         // sudog structures this g is waiting on (that have a valid elem ptr); in lock order
    cgoCtxt        []uintptr      // cgo traceback context
    labels         unsafe.Pointer // profiler labels
    timer          *timer         // cached timer for time.Sleep
    selectDone     uint32         // are we participating in a select and did someone win the race?
}

gopc: the pc of the go statement that created this G.
startpc: the pc of the goroutine's function (the func in go func).
timer: a timer cached for time.Sleep.

type g struct {
    // Per-G GC state
    // gcAssistBytes is this G's GC assist credit in terms of
    // bytes allocated. If this is positive, then the G has credit
    // to allocate gcAssistBytes bytes without assisting. If this
    // is negative, then the G must correct this by performing
    // scan work. We track this in bytes to make it fast to update
    // and check for debt in the malloc hot path. The assist ratio
    // determines how this corresponds to scan work debt.
    gcAssistBytes int64
}

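gcAssistBytes: this G's GC assist credit, measured in bytes allocated. A positive value means the G may allocate that many bytes without assisting the GC; a negative value means the G must pay off the debt by performing scan work.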
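The credit-and-debt idea in the comment can be sketched as simple bookkeeping. This is a deliberately simplified model, not the runtime's algorithm: `gmodel`, `alloc`, and the fixed `scanWorkPerByte` ratio are assumptions made for illustration, whereas the real assist ratio is computed by the GC pacer.

```go
package main

import "fmt"

// scanWorkPerByte is an illustrative, fixed assist ratio (an assumption).
const scanWorkPerByte = 2

// gmodel is a hypothetical, heavily reduced g.
type gmodel struct {
	gcAssistBytes int64 // positive: credit, negative: debt
}

// alloc charges size bytes against the credit. It returns the amount of
// scan work the goroutine has to perform before continuing, which is zero
// as long as the credit covers the allocation.
func (g *gmodel) alloc(size int64) int64 {
	g.gcAssistBytes -= size
	if g.gcAssistBytes >= 0 {
		return 0 // still in credit, no assist needed
	}
	debt := -g.gcAssistBytes
	g.gcAssistBytes = 0 // the debt is paid off by the scan work below
	return debt * scanWorkPerByte
}

func main() {
	g := &gmodel{gcAssistBytes: 100}
	first := g.alloc(60)  // covered by credit
	second := g.alloc(60) // 20 bytes over, so scan work is due
	fmt.Println(first, second, g.gcAssistBytes)
}
```

Starting with 100 bytes of credit, the first 60-byte allocation is free, and the second leaves a 20-byte debt that translates into 40 units of scan work at the assumed ratio.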

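4. Goroutine summary

Every G has its own status, stored in the atomicstatus field, with more than a dozen possible values.
Whenever a G's status changes, that is, whenever atomicstatus is modified, the G's current context has to be saved. It is stored in the sched field, whose type is gobuf; the fields of that struct show exactly what is kept.
Every G has three fields related to preemption: preempt, preemptStop, and preemptShrink.
Every G has its own unique ID in the goid field, but the Go team does not recommend that developers use it.
Every G is bound to at most one M; when it is not bound, the m field is nil.
Every G has its own defer and panic.
A G can be blocked, and it records when and why in the waitsince and waitreason fields.
A G can be scanned by the GC; the related fields are gcscandone and atomicstatus (_Gscan combined with one of the states above other than _Grunning).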

References: