How “go build” Works

Graham Jenson
Published in Maori Geek
8 min read · Sep 11, 2020

How does go build compile the simplest Golang program? This post is here to answer that question.

The simplest go program (I can think of) is main.go:

package main

func main() {}

If we run go build main.go, it outputs an executable main that is 1.1 MB and does nothing. What did go build do to create such a useful binary?

go build has some args that are useful for seeing how it builds:

  1. -work: go build creates a temporary folder for work files. This arg will print out the location of that folder and not delete it after the build
  2. -a: Golang caches previously built packages. -a makes go build ignore the cache so our build will print all steps
  3. -p 1: This sets the concurrency to a single thread so that the log output is linear
  4. -x: go build is a wrapper around other Golang tools like compile. -x outputs the commands and arguments that are sent to these tools

Running go build -work -a -p 1 -x main.go will output not only the main binary, but a lot of logs describing exactly what build did to create main.

The log starts with:

WORK=/var/folders/rw/gtb29xf92fv23f0zqsg42s840000gn/T/go-build940616988

This is the work directory whose structure looks like:

├── b001
│ ├── _pkg_.a
│ ├── exe
│ ├── importcfg
│ └── importcfg.link
├── b002
│ └── ...
├── b003
│ └── ...
├── b004
│ └── ...
├── b006
│ └── ...
├── b007
│ └── ...
└── b008
└── ...

What are these incrementing directory numbers?

go build defines an action graph of tasks that need to be completed. Each action in this graph gets its own sub-directory (defined in NewObjdir). The first node in the graph, b001, is the root task to compile the main binary. Each dependent action has a higher number, the final being b008. (I don’t know where b005 went; I assume it’s OK.)
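To make the "leaf first" ordering concrete, here is a minimal sketch of walking such a graph in post-order. It is not go build's own scheduler. The edges are taken from the importcfg files that appear in the log, the order of each slice is chosen for illustration, and treating internal/cpu as a leaf is an assumption, because its importcfg is elided in the log.

```go
package main

import "fmt"

// deps lists the direct imports of each package in this build.
// The slice order is illustrative; internal/cpu having no imports
// is an assumption, since the log elides its importcfg.
var deps = map[string][]string{
	"main": {"runtime"},
	"runtime": {
		"runtime/internal/sys",
		"runtime/internal/math",
		"runtime/internal/atomic",
		"internal/cpu",
		"internal/bytealg",
	},
	"internal/bytealg":        {"internal/cpu"},
	"runtime/internal/math":   {"runtime/internal/sys"},
	"runtime/internal/atomic": nil,
	"internal/cpu":            nil,
	"runtime/internal/sys":    nil,
}

// buildOrder returns the packages reachable from root in post-order,
// so every package appears after all of its dependencies.
func buildOrder(root string) []string {
	var order []string
	seen := map[string]bool{}
	var visit func(pkg string)
	visit = func(pkg string) {
		if seen[pkg] {
			return
		}
		seen[pkg] = true
		for _, d := range deps[pkg] {
			visit(d)
		}
		order = append(order, pkg)
	}
	visit(root)
	return order
}

func main() {
	for i, pkg := range buildOrder("main") {
		fmt.Printf("%d. %s\n", i+1, pkg)
	}
}
```

With this particular edge order, the leaf runtime/internal/sys comes first and main comes last. Any order that puts dependencies before their dependents would be equally valid.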

The first action to be executed is the leaf of the graph, b008:

mkdir -p $WORK/b008/
cat >$WORK/b008/importcfg << 'EOF'
# import config
EOF
cd /<..>/src/runtime/internal/sys
/<..>/compile
-o $WORK/b008/_pkg_.a
-trimpath "$WORK/b008=>"
-p runtime/internal/sys
-std
-+
-complete
-buildid gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi
-goversion go1.14.7
-D ""
-importcfg $WORK/b008/importcfg
-pack
-c=16
./arch.go ./arch_amd64.go ./intrinsics.go ./intrinsics_common.go ./stubs.go ./sys.go ./zgoarch_amd64.go ./zgoos_darwin.go ./zversion.go
/<..>/buildid -w $WORK/b008/_pkg_.a
cp $WORK/b008/_pkg_.a /<..>/Caches/go-build/01/01b...60a-d

The b008 action:

  1. creates the action directory (all actions do this so I ignore this later on)
  2. creates the importcfg file to be used by the compile tool (it is empty)
  3. changes the directory to the runtime/internal/sys package’s source folder. This package contains constants used by the runtime
  4. compiles this package
  5. uses buildid to write (-w) metadata to the package and copies the package to the go-build cache (all packages are cached so I ignore this later on)

Let’s break down the arguments sent to the compile tool (also described in go tool compile --help):

  1. -o is the output file
  2. -trimpath removes the prefix $WORK/b008 from the recorded source file paths (probably helps with debugging?)
  3. -p sets the package path used by import
  4. -std compiling standard library (not sure what this does)
  5. -+ compiling runtime (another mystery)
  6. -complete the compiler outputs a complete package (no C or assembly).
  7. -buildid adds build id to the metadata (as defined here)
  8. -goversion required version for compiled package
  9. -D the relative path for local imports is ""
  10. -importcfg import configuration file refers to other packages
  11. -pack create package archive (.a) instead of object file (.o)
  12. -c concurrency of the build
  13. finished with a list of files in the package

Most of these arguments are the same for all compile calls, so I ignore them later.

The output of b008 is the file $WORK/b008/_pkg_.a for runtime/internal/sys

Let’s dive into buildid for a second.

The buildid is in the format <actionid>/<contentid>. It is used as an index to cache packages to improve go build performance. The <actionid> is the hash of the action (all calls, arguments, and input files). The <contentid> is a hash of the output .a file. Before running an action, go build can look in the cache for content created by an earlier action with the same <actionid>. This is implemented in buildid.go.
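As a rough illustration of the two hashes, here is a sketch that derives an action ID from a tool name, its arguments, and its input files, and a content ID from the output bytes. The function names, the use of SHA-256, and the truncation to 20 hex characters are all assumptions made for this example; go build uses its own hashing and encoding.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// actionID hashes everything that determines an action's result:
// the tool, its arguments, and the contents of its input files.
// (Hypothetical helper; not the encoding go build really uses.)
func actionID(tool string, args []string, inputs [][]byte) string {
	h := sha256.New()
	fmt.Fprintf(h, "tool %s\n", tool)
	for _, a := range args {
		fmt.Fprintf(h, "arg %q\n", a)
	}
	for _, in := range inputs {
		fmt.Fprintf(h, "input %x\n", sha256.Sum256(in))
	}
	return fmt.Sprintf("%x", h.Sum(nil))[:20]
}

// contentID hashes the bytes an action produced.
func contentID(output []byte) string {
	return fmt.Sprintf("%x", sha256.Sum256(output))[:20]
}

func main() {
	src := [][]byte{[]byte("package main\n\nfunc main() {}\n")}
	args := []string{"-p", "main", "-complete"}

	id := actionID("compile", args, src)
	fmt.Println("build ID:", id+"/"+contentID([]byte("pretend archive bytes")))

	// The same inputs give the same action ID, so a cached result can be reused.
	fmt.Println("same action:", id == actionID("compile", args, src))
	// Changing any argument gives a different action ID.
	fmt.Println("changed arg:", id == actionID("compile", []string{"-p", "other", "-complete"}, src))
}
```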

The buildid is stored as metadata in the file so that it does not need to be hashed every time to get the <contentid>. You can see this id with go tool buildid <file> (also works on binaries).

In the log of b008 above, the buildid is set by the compile tool to gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi. This is just a placeholder; it is later overwritten by go tool buildid -w with the correct gEtYPexVP43wWYWCxFKi/b-rPboOuD0POrlJWPTEi before the package is cached.
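Here is a small sketch that pulls the two halves out of a build ID and shows how to spot the placeholder, where both halves are equal. splitBuildID is a hypothetical helper written for this post, not a function from the Go toolchain. Taking the first and last slash-separated fields is a choice made for this sketch; it also copes with the longer four-field ID passed to link later in the log.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBuildID returns the first and last slash-separated fields of id.
// ok is false when id contains no slash at all.
func splitBuildID(id string) (action, content string, ok bool) {
	first := strings.Index(id, "/")
	if first < 0 {
		return "", "", false
	}
	last := strings.LastIndex(id, "/")
	return id[:first], id[last+1:], true
}

func main() {
	ids := []string{
		"gEtYPexVP43wWYWCxFKi/gEtYPexVP43wWYWCxFKi", // placeholder set by compile
		"gEtYPexVP43wWYWCxFKi/b-rPboOuD0POrlJWPTEi", // after buildid -w
	}
	for _, id := range ids {
		action, content, ok := splitBuildID(id)
		fmt.Println(action, content, ok, "placeholder:", action == content)
	}
}
```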

The next action to be run is b007:

cat >$WORK/b007/importcfg << 'EOF'
# import config
packagefile runtime/internal/sys=$WORK/b008/_pkg_.a
EOF
cd /<..>/src/runtime/internal/math
/<..>/compile
-o $WORK/b007/_pkg_.a
-p runtime/internal/math
-importcfg $WORK/b007/importcfg
...
./math.go

  1. This writes the importcfg, which now includes the line packagefile runtime/internal/sys=$WORK/b008/_pkg_.a. This means b007 depends on the output of b008
  2. It then compiles the runtime/internal/math package. If you inspect math.go, it imports runtime/internal/sys, which was built by b008

The output of b007 is the file $WORK/b007/_pkg_.a for runtime/internal/math
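The importcfg format seen so far is simple enough to read by hand, but a short parser makes the "import path to archive file" mapping explicit. This is a sketch that handles only comment lines and packagefile lines, which are the only kinds that appear in this log; any other directive is skipped. parseImportcfg is a name invented for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseImportcfg maps each import path to its archive file.
// Only "packagefile path=file" lines are understood; comments,
// blank lines, and any other directives are ignored in this sketch.
func parseImportcfg(text string) (map[string]string, error) {
	pkgs := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "packagefile ") {
			continue
		}
		parts := strings.SplitN(strings.TrimPrefix(line, "packagefile "), "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed line: %q", line)
		}
		pkgs[parts[0]] = parts[1]
	}
	return pkgs, sc.Err()
}

func main() {
	cfg := "# import config\n" +
		"packagefile runtime/internal/sys=$WORK/b008/_pkg_.a\n"
	pkgs, err := parseImportcfg(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Println(pkgs["runtime/internal/sys"])
}
```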

The next action is b006:

cat >$WORK/b006/go_asm.h << 'EOF'
EOF
cd /<..>/src/runtime/internal/atomic
/<..>/asm
-I $WORK/b006/
-I /<..>/go/1.14.7/libexec/pkg/include
-D GOOS_darwin
-D GOARCH_amd64
-gensymabis
-o $WORK/b006/symabis
./asm_amd64.s
/<..>/asm
-I $WORK/b006/
-I /<..>/go/1.14.7/libexec/pkg/include
-D GOOS_darwin
-D GOARCH_amd64
-o $WORK/b006/asm_amd64.o
./asm_amd64.s
cat >$WORK/b006/importcfg << 'EOF'
# import config
EOF
/<..>/compile
-o $WORK/b006/_pkg_.a
-p runtime/internal/atomic
-symabis $WORK/b006/symabis
-asmhdr $WORK/b006/go_asm.h
-importcfg $WORK/b006/importcfg
...
./atomic_amd64.go ./stubs.go
/<..>/pack r $WORK/b006/_pkg_.a $WORK/b006/asm_amd64.o

Here is where we step out of the normal .go files and start dealing with lower-level “Go assembly” .s files. b006:

  1. first creates an empty header file go_asm.h
  2. goes to the runtime/internal/atomic package (a bunch of low-level functions)
  3. runs the go tool asm tool (described with go tool asm --help) to build the symabis “Symbol Application Binary Interfaces (ABI) file” and then the object file asm_amd64.o
  4. uses compile to create the _pkg_.a file, reading the symabis file with -symabis and writing the assembly header go_asm.h with -asmhdr
  5. uses pack to add the asm_amd64.o object file to the _pkg_.a package archive

The asm tool is called with the args:

  1. -I: the include search paths, here the action folder $WORK/b006/ and the pkg/include folder. The include folder has three files, asm_ppc64x.h, funcdata.h, and textflag.h, all containing low-level definitions, e.g. FIXED_FRAME defines the size of the fixed part of a stack frame
  2. -D: Adds a predefined symbol
  3. -gensymabis: flag to generate the symabis file
  4. -o: The output file

The output of b006 is $WORK/b006/_pkg_.a for runtime/internal/atomic

Next is b004:

cd /<..>/src/internal/cpu
/<..>/asm ... -o $WORK/b004/symabis ./cpu_x86.s
/<..>/asm ... -o $WORK/b004/cpu_x86.o ./cpu_x86.s
/<..>/compile ... -o $WORK/b004/_pkg_.a ./cpu.go ./cpu_amd64.go ./cpu_x86.go
/<..>/pack r $WORK/b004/_pkg_.a $WORK/b004/cpu_x86.o

b004 is the same as b006, but for the package internal/cpu. First we assemble the symabis and object files, then compile the .go files and pack the .o files into _pkg_.a.

The output of b004 is $WORK/b004/_pkg_.a for internal/cpu

The next action is b003:

cat >$WORK/b003/go_asm.h << 'EOF'
EOF
cd /<..>/src/internal/bytealg
/<..>/asm ... -o $WORK/b003/symabis ./compare_amd64.s ./count_amd64.s ./equal_amd64.s ./index_amd64.s ./indexbyte_amd64.s
cat >$WORK/b003/importcfg << 'EOF'
# import config
packagefile internal/cpu=$WORK/b004/_pkg_.a
EOF
/<..>/compile ... -o $WORK/b003/_pkg_.a -p internal/bytealg ./bytealg.go ./compare_native.go ./count_native.go ./equal_generic.go ./equal_native.go ./index_amd64.go ./index_native.go ./indexbyte_native.go
/<..>/asm ... -o $WORK/b003/compare_amd64.o ./compare_amd64.s
/<..>/asm ... -o $WORK/b003/count_amd64.o ./count_amd64.s
/<..>/asm ... -o $WORK/b003/equal_amd64.o ./equal_amd64.s
/<..>/asm ... -o $WORK/b003/index_amd64.o ./index_amd64.s
/<..>/asm ... -o $WORK/b003/indexbyte_amd64.o ./indexbyte_amd64.s
/<..>/pack r $WORK/b003/_pkg_.a $WORK/b003/compare_amd64.o $WORK/b003/count_amd64.o $WORK/b003/equal_amd64.o $WORK/b003/index_amd64.o $WORK/b003/indexbyte_amd64.o

b003 is the same as the previous actions b004 and b006, but for the package internal/bytealg. The main complication with this package is that there are multiple .s files, each producing its own .o object file that needs to be added to the _pkg_.a file.

The output of b003 is $WORK/b003/_pkg_.a for internal/bytealg
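The pattern in b003 (symabis, compile, one asm call per .s file, then pack) can be written down as a small function that returns the tool invocations in order. This is only a sketch of the sequence seen in the log, not how go build constructs its commands. buildCommands is a made-up name, and the -I and -D flags, along with everything the log elides with "...", are left out.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// buildCommands lists the tool calls for one package, following the
// order of the b003 log. Flags elided in the log are omitted here too.
func buildCommands(objdir, pkgPath string, goFiles, asmFiles []string) [][]string {
	archive := objdir + "/_pkg_.a"
	var cmds [][]string

	compile := []string{"compile", "-o", archive, "-p", pkgPath}
	if len(asmFiles) > 0 {
		// First pass over the .s files: only collect symbol ABIs.
		symabis := objdir + "/symabis"
		gen := append([]string{"asm", "-gensymabis", "-o", symabis}, asmFiles...)
		cmds = append(cmds, gen)
		compile = append(compile, "-symabis", symabis, "-asmhdr", objdir+"/go_asm.h")
	}
	cmds = append(cmds, append(compile, goFiles...))
	if len(asmFiles) == 0 {
		return cmds // pure Go package: compile is the only step
	}

	// Second pass: assemble each .s file into its own .o, then pack them all.
	pack := []string{"pack", "r", archive}
	for _, s := range asmFiles {
		obj := objdir + "/" + strings.TrimSuffix(path.Base(s), ".s") + ".o"
		cmds = append(cmds, []string{"asm", "-o", obj, s})
		pack = append(pack, obj)
	}
	return append(cmds, pack)
}

func main() {
	cmds := buildCommands("$WORK/b003", "internal/bytealg",
		[]string{"./bytealg.go", "./index_amd64.go"},
		[]string{"./compare_amd64.s", "./count_amd64.s", "./equal_amd64.s", "./index_amd64.s", "./indexbyte_amd64.s"})
	for _, c := range cmds {
		fmt.Println(strings.Join(c, " "))
	}
}
```

Note that b006 in the log assembles its object file before calling compile, so the relative order of those two steps should not be read as a fixed rule.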

The penultimate action, b002:

cat >$WORK/b002/go_asm.h << 'EOF'
EOF
cd /<..>/src/runtime
/<..>/asm
...
-o $WORK/b002/symabis
./asm.s ./asm_amd64.s ./duff_amd64.s ./memclr_amd64.s ./memmove_amd64.s ./preempt_amd64.s ./rt0_darwin_amd64.s ./sys_darwin_amd64.s

cat >$WORK/b002/importcfg << 'EOF'
# import config
packagefile internal/bytealg=$WORK/b003/_pkg_.a
packagefile internal/cpu=$WORK/b004/_pkg_.a
packagefile runtime/internal/atomic=$WORK/b006/_pkg_.a
packagefile runtime/internal/math=$WORK/b007/_pkg_.a
packagefile runtime/internal/sys=$WORK/b008/_pkg_.a
EOF
/<..>/compile
-o $WORK/b002/_pkg_.a
...
-p runtime
./alg.go ./atomic_pointer.go ./cgo.go ./cgocall.go ./cgocallback.go ./cgocheck.go ./chan.go ./checkptr.go ./compiler.go ./complex.go ./cpuflags.go ./cpuflags_amd64.go ./cpuprof.go ./cputicks.go ./debug.go ./debugcall.go ./debuglog.go ./debuglog_off.go ./defs_darwin_amd64.go ./env_posix.go ./error.go ./extern.go ./fastlog2.go ./fastlog2table.go ./float.go ./hash64.go ./heapdump.go ./iface.go ./lfstack.go ./lfstack_64bit.go ./lock_sema.go ./malloc.go ./map.go ./map_fast32.go ./map_fast64.go ./map_faststr.go ./mbarrier.go ./mbitmap.go ./mcache.go ./mcentral.go ./mem_darwin.go ./mfinal.go ./mfixalloc.go ./mgc.go ./mgcmark.go ./mgcscavenge.go ./mgcstack.go ./mgcsweep.go ./mgcsweepbuf.go ./mgcwork.go ./mheap.go ./mpagealloc.go ./mpagealloc_64bit.go ./mpagecache.go ./mpallocbits.go ./mprof.go ./mranges.go ./msan0.go ./msize.go ./mstats.go ./mwbbuf.go ./nbpipe_pipe.go ./netpoll.go ./netpoll_kqueue.go ./os_darwin.go ./os_nonopenbsd.go ./panic.go ./plugin.go ./preempt.go ./preempt_nonwindows.go ./print.go ./proc.go ./profbuf.go ./proflabel.go ./race0.go ./rdebug.go ./relax_stub.go ./runtime.go ./runtime1.go ./runtime2.go ./rwmutex.go ./select.go ./sema.go ./signal_amd64.go ./signal_darwin.go ./signal_darwin_amd64.go ./signal_unix.go ./sigqueue.go ./sizeclasses.go ./slice.go ./softfloat64.go ./stack.go ./string.go ./stubs.go ./stubs_amd64.go ./stubs_nonlinux.go ./symtab.go ./sys_darwin.go ./sys_darwin_64.go ./sys_nonppc64x.go ./sys_x86.go ./time.go ./time_nofake.go ./timestub.go ./trace.go ./traceback.go ./type.go ./typekind.go ./utf8.go ./vdso_in_none.go ./write_err.go

/<..>/asm ... -o $WORK/b002/asm.o ./asm.s
/<..>/asm ... -o $WORK/b002/asm_amd64.o ./asm_amd64.s
/<..>/asm ... -o $WORK/b002/duff_amd64.o ./duff_amd64.s
/<..>/asm ... -o $WORK/b002/memclr_amd64.o ./memclr_amd64.s
/<..>/asm ... -o $WORK/b002/memmove_amd64.o ./memmove_amd64.s
/<..>/asm ... -o $WORK/b002/preempt_amd64.o ./preempt_amd64.s
/<..>/asm ... -o $WORK/b002/rt0_darwin_amd64.o ./rt0_darwin_amd64.s
/<..>/asm ... -o $WORK/b002/sys_darwin_amd64.o ./sys_darwin_amd64.s

/<..>/pack r $WORK/b002/_pkg_.a $WORK/b002/asm.o $WORK/b002/asm_amd64.o $WORK/b002/duff_amd64.o $WORK/b002/memclr_amd64.o $WORK/b002/memmove_amd64.o $WORK/b002/preempt_amd64.o $WORK/b002/rt0_darwin_amd64.o $WORK/b002/sys_darwin_amd64.o

b002 is the reason for all the actions seen so far. It is the runtime package, containing all the operations needed for a go binary to run. For example, it contains mgc.go, the implementation of garbage collection in Go (which imports both internal/cpu from b004 and runtime/internal/atomic from b006).

b002, although probably the most complex package in the core library, is built using the same pattern we have seen before; it just contains more files. It uses asm, compile, and pack to build _pkg_.a.

The output of b002 is $WORK/b002/_pkg_.a for runtime

The final action, the one that pulls everything together, is b001:

cat >$WORK/b001/importcfg << 'EOF'
# import config
packagefile runtime=$WORK/b002/_pkg_.a
EOF
cd /<..>/main
/<..>/compile ... -o $WORK/b001/_pkg_.a -p main ./main.go
cat >$WORK/b001/importcfg.link << 'EOF'
packagefile command-line-arguments=$WORK/b001/_pkg_.a
packagefile runtime=$WORK/b002/_pkg_.a
packagefile internal/bytealg=$WORK/b003/_pkg_.a
packagefile internal/cpu=$WORK/b004/_pkg_.a
packagefile runtime/internal/atomic=$WORK/b006/_pkg_.a
packagefile runtime/internal/math=$WORK/b007/_pkg_.a
packagefile runtime/internal/sys=$WORK/b008/_pkg_.a
EOF
/<..>/link
-o $WORK/b001/exe/a.out
-importcfg $WORK/b001/importcfg.link
-buildmode=exe
-buildid=yC-qrh2sY_qI0zh2-NE7/owNzOBTqPO00FkqK0_lF/HPXqvMz_4PvKsQzqGWgD/yC-qrh2sY_qI0zh2-NE7
-extld=clang
$WORK/b001/_pkg_.a
mv $WORK/b001/exe/a.out main

  1. First it builds an importcfg that includes runtime, built in b002, and then compiles main.go to _pkg_.a
  2. Then it creates importcfg.link, which includes the packages from all previous actions plus command-line-arguments, referencing the main package we just built. It then uses link to create an executable file
  3. Finally it renames and moves the binary to main

link has the new arguments:

  1. -buildmode: set to build an executable
  2. -extld: reference to the external linker

Finally, we have the output we want; the output of b001 is the main binary.

Similarities with Bazel

Building an action graph in order to cache efficiently is the same idea the build tool Bazel uses for fast builds. Golang’s actionid and contentid map neatly to the action cache and the content-addressable store (CAS) Bazel uses in caching. Bazel is a product of Google, and so is Golang. It would make sense that they would have a similar philosophy of how to build software quickly and reliably.
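To show why the two IDs map so neatly onto those two stores, here is a toy two-level cache: one map from action ID to content ID (the action cache) and one from content ID to bytes (the CAS). It is a sketch of the general idea only. The type and method names are invented, SHA-256 is an arbitrary choice, and neither Bazel nor go build stores things in memory like this.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// cache is a toy two-level store (hypothetical, in-memory only).
type cache struct {
	actions map[string]string // action ID -> content ID (action cache)
	blobs   map[string][]byte // content ID -> output bytes (CAS)
}

func newCache() *cache {
	return &cache{actions: map[string]string{}, blobs: map[string][]byte{}}
}

// put stores output under its own hash and records which action made it.
func (c *cache) put(actionID string, output []byte) string {
	contentID := fmt.Sprintf("%x", sha256.Sum256(output))
	c.blobs[contentID] = output
	c.actions[actionID] = contentID
	return contentID
}

// get follows action ID -> content ID -> bytes.
func (c *cache) get(actionID string) ([]byte, bool) {
	contentID, ok := c.actions[actionID]
	if !ok {
		return nil, false
	}
	out, ok := c.blobs[contentID]
	return out, ok
}

func main() {
	c := newCache()
	// Two different actions that happen to produce identical output
	// share a single blob in the content-addressable store.
	c.put("action-1", []byte("same bytes"))
	c.put("action-2", []byte("same bytes"))
	fmt.Println("actions:", len(c.actions), "blobs:", len(c.blobs))

	_, hit := c.get("action-1")
	_, miss := c.get("action-3")
	fmt.Println("hit:", hit, "miss:", miss)
}
```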

In Bazel’s rules_go package you can see how it reimplements go build in its builder code. This is a very clean implementation because the action graph, the folder management, and the caching are handled externally by Bazel.

The Next Steps

go build does a lot to compile a program that does nothing! I didn’t even get into much specific detail about the tools (compile, asm) or their input and output files (.a, .o, .s). Also, we are still only compiling the most basic program. We could add complications like:

  1. importing another package, e.g. using fmt to print Hello World adds another 23 actions to the action graph
  2. having a go.mod file referencing external packages
  3. Setting GOOS and GOARCH to other architectures, e.g. compiling to WASM has entirely different actions and arguments

Running go build and inspecting logs is a very top-down approach to learning how the Golang compiler works. It is a great starting point to dive into more resources like:

  1. Introduction to the Go compiler
  2. Go: Overview of the Compiler
  3. Go at Google: Language Design in the Service of Software Engineering
  4. Source code like build.go, the definition of the go build command, or compile/main.go, the entry point to go tool compile

There is a lot of information out there, so there is still lots to learn about compiling the simplest program.
