Friday, November 13, 2009

On the Compilation of Go Packages

Man, Go sucks, the creators went and invented their own encoding, which Go source code is required to be in.

Compilation environment

I have been playing with the Go programming language. The gc compilation environment is broadly similar to gcc's, but with enough subtle differences that I am seeing a lot of confusion about how it works.

Traditionally, C is thought of as consisting of .c source files and .h header files. Each .c file compiles to a .o object file. The object files are then linked into some manner of final binary, be it a .a static library, a .so dynamic library, or an executable binary.

In this scenario, the binary depends on the object files which are linked into it, the object files each depend on their respective source file, and each source file depends on the headers which it includes (directly or indirectly).

The gc compiler for Go (known variously as 5g, 6g, and 8g, depending on the target architecture) works slightly differently. Here, we have .go source files. Every .go file belongs to exactly one package, and a package may consist of as many .go files as you desire. Compiled packages have a .5, .6, or .8 extension, depending on the architecture you are compiling for (ARM, amd64, and 386 respectively). A package is obtained by compiling all of the .go files which comprise it.
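To make the naming scheme concrete, here is a minimal sketch of the mapping from architecture to tool digit and compiled file name. The helper objectName and the toolPrefix table are names I made up for illustration; they are not part of any Go tool.

```go
package main

import "fmt"

// toolPrefix maps a target architecture to the digit used by the gc
// tools: 5g/5l for arm, 6g/6l for amd64, 8g/8l for 386.
var toolPrefix = map[string]string{
	"arm":   "5",
	"amd64": "6",
	"386":   "8",
}

// objectName returns the file name a compiled package would have,
// e.g. "foo.6" for package foo on amd64.
// Hypothetical helper, for illustration only.
func objectName(pkg, goarch string) (string, error) {
	digit, ok := toolPrefix[goarch]
	if !ok {
		return "", fmt.Errorf("unsupported architecture %q", goarch)
	}
	return pkg + "." + digit, nil
}

func main() {
	for _, arch := range []string{"arm", "amd64", "386"} {
		name, err := objectName("foo", arch)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s: compile with %sg, link with %sl, package file %s\n",
			arch, toolPrefix[arch], toolPrefix[arch], name)
	}
}
```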

Compiled packages fill two roles. They are roughly the equivalent of object files in C (though, as I will explore in a moment, they are not exactly equivalent). They also fill the role that header files fill in C: each package contains metadata describing its public interface.
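As a rough illustration of what "public interface" means here, the sketch below lists the exported (capitalized) top-level names of a source file. It is written against a present-day Go standard library (go/parser and go/ast) and works on source text, so it only approximates the metadata that the compiler actually stores in the package file. The sample source and the exportedNames function are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up source file for a package named foo.
const src = `package foo

func Greet() string { return helper() }

func helper() string { return "hi" }

type Widget struct{}

var Version = "1"
`

// exportedNames returns the exported top-level functions, types,
// variables, and constants declared in the given source text.
func exportedNames(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "foo.go", source, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, decl := range f.Decls {
		switch d := decl.(type) {
		case *ast.FuncDecl:
			// Skip methods; only plain functions are collected here.
			if d.Recv == nil && d.Name.IsExported() {
				names = append(names, d.Name.Name)
			}
		case *ast.GenDecl:
			for _, spec := range d.Specs {
				switch s := spec.(type) {
				case *ast.TypeSpec:
					if s.Name.IsExported() {
						names = append(names, s.Name.Name)
					}
				case *ast.ValueSpec:
					for _, n := range s.Names {
						if n.IsExported() {
							names = append(names, n.Name)
						}
					}
				}
			}
		}
	}
	return names, nil
}

func main() {
	names, err := exportedNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names)
}
```

The unexported helper function is left out of the result, which is the same distinction a package importing foo would see.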

Therefore, in order for one package to use another, the other package must already be compiled, but its source files are not needed. Further, when a package is compiled, it statically links in all of its dependencies.
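One consequence is that packages have to be compiled in dependency order. A minimal sketch of computing such an order with a depth-first traversal follows; the buildOrder function and the dependency table are my own illustration, not something the gc tools provide.

```go
package main

import (
	"fmt"
	"sort"
)

// buildOrder returns the packages in an order where every package
// appears after all of the packages it depends on. It reports an
// error if the dependencies form a cycle.
func buildOrder(deps map[string][]string) ([]string, error) {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var order []string

	var visit func(pkg string) error
	visit = func(pkg string) error {
		switch state[pkg] {
		case done:
			return nil
		case inProgress:
			return fmt.Errorf("import cycle through %q", pkg)
		}
		state[pkg] = inProgress
		for _, d := range deps[pkg] {
			if err := visit(d); err != nil {
				return err
			}
		}
		state[pkg] = done
		order = append(order, pkg)
		return nil
	}

	// Visit packages in sorted order so the result is deterministic.
	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if err := visit(name); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	// Hypothetical dependency table: main imports foo and bar.
	deps := map[string][]string{
		"main": {"foo", "bar"},
		"foo":  nil,
		"bar":  nil,
	}
	order, err := buildOrder(deps)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(order) // prints [bar foo main]
}
```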

Once you have a "main" package, you may then link it into an executable binary. The only file required to do this is the compiled "main" package.

An example

Here is an example Makefile for a simple executable binary. The binary will be named hello. It depends on two custom packages, foo and bar, which each consist of a number of .go files. The "main" package itself contains two .go files: hello.go and goodbye.go.

For the sake of clarity, all of these source files live in the same directory (in a real application, each package's source files might live in their own directory), and I have hardcoded the architecture to amd64.

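# Source files for each package.
MAIN_FILES=hello.go goodbye.go
FOO_FILES=foo1.go foo2.go foo3.go
BAR_FILES=bar1.go bar2.go bar3.go

# Link the main package into the executable.
hello: main.6
	6l -o hello main.6
# Compile the main package; -I. tells 6g where to find foo.6 and bar.6.
main.6: $(MAIN_FILES) foo.6 bar.6
	6g -I. -o main.6 $(MAIN_FILES)
# Compile the foo and bar packages from their own sources.
foo.6: $(FOO_FILES)
	6g -o foo.6 $(FOO_FILES)
bar.6: $(BAR_FILES)
	6g -o bar.6 $(BAR_FILES)
.PHONY: clean
clean:
	rm -f hello main.6 foo.6 bar.6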

I will explain each of these rules one at a time.

hello: main.6
        6l -o hello main.6

The hello executable binary is obtained by linking the main package. This package already (statically) contains the other packages which it depends on, so the only prerequisite is the main package itself.

main.6: $(MAIN_FILES) foo.6 bar.6
        6g -I. -o main.6 $(MAIN_FILES)

The main package is obtained by compiling the source files which comprise the package, and statically linking in the packages which they depend on. Thus, the foo and bar packages are prerequisites of the main package. However, we do not explicitly pass these packages to the compiler. Instead, we pass the compiler the -I option to inform it of where these packages may be found. It is therefore important for a package's filename to match the name it is imported as.
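To illustrate why the filename matters, here is a deliberately simplified model of the lookup: for each directory given with -I, the import name plus the architecture suffix is tried as a file name. This is only a sketch of the idea described above; the real compiler also consults its standard package locations, and candidateFiles is a hypothetical helper.

```go
package main

import (
	"fmt"
	"path"
)

// candidateFiles lists the file names that an import of importPath
// would be matched against, one per include directory, in order.
// Simplified model; not the compiler's actual search logic.
func candidateFiles(importPath, suffix string, includeDirs []string) []string {
	var out []string
	for _, dir := range includeDirs {
		out = append(out, path.Join(dir, importPath+"."+suffix))
	}
	return out
}

func main() {
	// With -I. on amd64, import "foo" can only be satisfied by ./foo.6.
	for _, pkg := range []string{"foo", "bar"} {
		fmt.Println(pkg, "->", candidateFiles(pkg, "6", []string{"."}))
	}
}
```

Under this model, a package compiled to libfoo.6 but imported as "foo" would simply never be found.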

foo.6: $(FOO_FILES)
        6g -o foo.6 $(FOO_FILES)
bar.6: $(BAR_FILES)
        6g -o bar.6 $(BAR_FILES)

The foo and bar packages are simple: each is compiled only from its own source files, and neither depends on any other custom package.

.PHONY: clean
clean:
        rm -f hello main.6 foo.6 bar.6

The clean target removes all of the generated files: the hello executable and the compiled packages.

Simplifying matters

It would be relatively straightforward to write a tool which, given a list of source files, could determine which files belong to which package, what the dependencies between these packages are, and automatically rebuild what is required in a make-like fashion. Writing such a tool may be a project for the future.
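As a starting point, the first half of such a tool (working out which files belong to which package, and what each package imports) might look like the sketch below. It assumes a present-day Go standard library, where go/parser can stop after the import declarations, and it reads sources from an in-memory map rather than from disk to keep the example self-contained. The pkgInfo type, the scan function, and the sample sources are all hypothetical.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"sort"
	"strconv"
)

// pkgInfo records the files that make up a package and what it imports.
type pkgInfo struct {
	Files   []string
	Imports []string
}

// scan reads only the package clause and imports of each source file
// and groups the files by package name.
func scan(sources map[string]string) (map[string]*pkgInfo, error) {
	fset := token.NewFileSet()
	pkgs := make(map[string]*pkgInfo)
	seen := make(map[string]map[string]bool)

	// Process files in sorted order so the result is deterministic.
	names := make([]string, 0, len(sources))
	for name := range sources {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		f, err := parser.ParseFile(fset, name, sources[name], parser.ImportsOnly)
		if err != nil {
			return nil, err
		}
		pkg := f.Name.Name
		info := pkgs[pkg]
		if info == nil {
			info = &pkgInfo{}
			pkgs[pkg] = info
			seen[pkg] = make(map[string]bool)
		}
		info.Files = append(info.Files, name)
		for _, imp := range f.Imports {
			p, err := strconv.Unquote(imp.Path.Value)
			if err != nil {
				return nil, err
			}
			if !seen[pkg][p] {
				seen[pkg][p] = true
				info.Imports = append(info.Imports, p)
			}
		}
	}
	for _, info := range pkgs {
		sort.Strings(info.Imports)
	}
	return pkgs, nil
}

func main() {
	// Made-up sources mirroring the example above.
	sources := map[string]string{
		"hello.go":   "package main\nimport \"foo\"\nfunc main() { foo.Hi() }\n",
		"goodbye.go": "package main\nimport \"bar\"\nfunc bye() { bar.Bye() }\n",
		"foo1.go":    "package foo\nfunc Hi() {}\n",
		"bar1.go":    "package bar\nfunc Bye() {}\n",
	}
	pkgs, err := scan(sources)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, name := range []string{"bar", "foo", "main"} {
		fmt.Printf("%s: files=%v imports=%v\n", name, pkgs[name].Files, pkgs[name].Imports)
	}
}
```

The remaining work would be ordering the packages by their imports and comparing file timestamps to decide what needs rebuilding.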