The Go Blog
The Go Gopher
24 March 2014

The Go gopher is an iconic mascot and one of the most distinctive features of the Go project. In this post we'll talk about his origins, evolution, and behavior.
About 15 years ago—long before the Go project—the gopher first appeared as a promotion for the WFMU radio station in New Jersey. Renee French was commissioned to design a T-shirt for an annual fundraiser and out came the gopher.

He next made an appearance at Bell Labs, as Bob Flandrena's avatar in the Bell Labs mail system. Other Renee drawings became avatars for ken, r, rsc, and others. (Of course, Peter Weinberger's was his own iconic face.)

Another Bell Labs activity led to Renee creating Glenda, the Plan 9 mascot, a distant cousin of the WFMU gopher.

When we started the Go project we needed a logo, and Renee volunteered to draw it. It was featured on the first Go T-shirt and the Google Code site.

For the open source launch in 2009, Renee suggested adapting the WFMU gopher as a mascot. And the Go gopher was born:

(The gopher has no name. He's just the "Go gopher".)
For the launch of the Go App Engine runtime at Google I/O 2011 we engaged Squishable to manufacture the plush gophers. This was the first time the gopher was colored blue and appeared in three dimensions. The first prototype was kinda hairy:

But the second one was just right:

Around the same time, Renee roughed out a gopher in clay. This inspired a refined sculpture that became a vinyl figurine made by Kidrobot. The vinyls were first distributed at OSCON 2011.

The gopher therefore exists in many forms, but he has always been Renee's creation. He stands for the Go project and Go programmers everywhere, and is one of the most popular things in the Go world.
The Go gopher is a character; a unique creation. He's not any old gopher, just as Snoopy is not any old cartoon dog.
The gopher images are Creative Commons Attribution 3.0 licensed. That means you can play with the images but you must give credit to their creator (Renee French) wherever they are used.
Here are a few gopher adaptations that people have used as mascots for user groups and similar organizations.

They're cute and we like them, but by the Creative Commons rules the groups should give Renee credit, perhaps as a mention on the user group web site.
The vinyl and plush gophers are copyrighted designs; accept no substitutes! But how can you get one? Their natural habitat is near high concentrations of Go programmers, and their worldwide population is growing. They may be purchased from the Google Store, although the supply can be irregular. (These elusive creatures have been spotted in all kinds of places.)
Perhaps the best way to get a gopher is to catch one in the wild at a Go conference. There are two big chances this year: GopherCon (Denver, April 24-26) and dotGo (Paris, October 10).

(Photo by Noah Lorang.)
Go Concurrency Patterns: Pipelines and cancellation
13 March 2014
Introduction
Go's concurrency primitives make it easy to construct streaming data pipelines that make efficient use of I/O and multiple CPUs. This article presents examples of such pipelines, highlights subtleties that arise when operations fail, and introduces techniques for dealing with failures cleanly.
What is a pipeline?
There's no formal definition of a pipeline in Go; it's just one of many kinds of concurrent programs. Informally, a pipeline is a series of stages connected by channels, where each stage is a group of goroutines running the same function. In each stage, the goroutines
- receive values from upstream via inbound channels
- perform some function on that data, usually producing new values
- send values downstream via outbound channels
Each stage has any number of inbound and outbound channels, except the first and last stages, which have only outbound or inbound channels, respectively. The first stage is sometimes called the source or producer; the last stage, the sink or consumer.
We'll begin with a simple example pipeline to explain the ideas and techniques. Later, we'll present a more realistic example.
Squaring numbers
Consider a pipeline with three stages.
The first stage, gen, is a function that converts a list of integers to a channel that emits the integers in the list. The gen function starts a goroutine that sends the integers on the channel and closes the channel when all the values have been sent:

    func gen(nums ...int) <-chan int {
        out := make(chan int)
        go func() {
            for _, n := range nums {
                out <- n
            }
            close(out)
        }()
        return out
    }
The second stage, sq, receives integers from a channel and returns a channel that emits the square of each received integer. After the inbound channel is closed and this stage has sent all the values downstream, it closes the outbound channel:

    func sq(in <-chan int) <-chan int {
        out := make(chan int)
        go func() {
            for n := range in {
                out <- n * n
            }
            close(out)
        }()
        return out
    }
The main function sets up the pipeline and runs the final stage: it receives values from the second stage and prints each one, until the channel is closed:

    func main() {
        // Set up the pipeline.
        c := gen(2, 3)
        out := sq(c)

        // Consume the output.
        fmt.Println(<-out) // 4
        fmt.Println(<-out) // 9
    }
Since sq has the same type for its inbound and outbound channels, we can compose it any number of times. We can also rewrite main as a range loop, like the other stages:

    func main() {
        // Set up the pipeline and consume the output.
        for n := range sq(sq(gen(2, 3))) {
            fmt.Println(n) // 16 then 81
        }
    }
Fan-out, fan-in
Multiple functions can read from the same channel until that channel is closed; this is called fan-out. This provides a way to distribute work amongst a group of workers to parallelize CPU use and I/O.
A function can read from multiple inputs and proceed until all are closed by multiplexing the input channels onto a single channel that's closed when all the inputs are closed. This is called fan-in.
We can change our pipeline to run two instances of sq, each reading from the same input channel. We introduce a new function, merge, to fan in the results:

    func main() {
        in := gen(2, 3)

        // Distribute the sq work across two goroutines that both read from in.
        c1 := sq(in)
        c2 := sq(in)

        // Consume the merged output from c1 and c2.
        for n := range merge(c1, c2) {
            fmt.Println(n) // 4 then 9, or 9 then 4
        }
    }
The merge function converts a list of channels to a single channel by starting a goroutine for each inbound channel that copies the values to the sole outbound channel. Once all the output goroutines have been started, merge starts one more goroutine to close the outbound channel after all sends on that channel are done.
Sends on a closed channel panic, so it's important to ensure all sends are done before calling close. The sync.WaitGroup type provides a simple way to arrange this synchronization:

    func merge(cs ...<-chan int) <-chan int {
        var wg sync.WaitGroup
        out := make(chan int)

        // Start an output goroutine for each input channel in cs. output
        // copies values from c to out until c is closed, then calls wg.Done.
        output := func(c <-chan int) {
            for n := range c {
                out <- n
            }
            wg.Done()
        }
        wg.Add(len(cs))
        for _, c := range cs {
            go output(c)
        }

        // Start a goroutine to close out once all the output goroutines are
        // done. This must start after the wg.Add call.
        go func() {
            wg.Wait()
            close(out)
        }()
        return out
    }
Stopping short
There is a pattern to our pipeline functions:
- stages close their outbound channels when all the send operations are done.
- stages keep receiving values from inbound channels until those channels are closed.
This pattern allows each receiving stage to be written as a range loop and ensures that all goroutines exit once all values have been successfully sent downstream.
But in real pipelines, stages don't always receive all the inbound values. Sometimes this is by design: the receiver may only need a subset of values to make progress. More often, a stage exits early because an inbound value represents an error in an earlier stage. In either case the receiver should not have to wait for the remaining values to arrive, and we want earlier stages to stop producing values that later stages don't need.
In our example pipeline, if a stage fails to consume all the inbound values, the goroutines attempting to send those values will block indefinitely:

        // Consume the first value from output.
        out := merge(c1, c2)
        fmt.Println(<-out) // 4 or 9
        return
        // Since we didn't receive the second value from out,
        // one of the output goroutines is hung attempting to send it.
    }
This is a resource leak: goroutines consume memory and runtime resources, and heap references in goroutine stacks keep data from being garbage collected. Goroutines are not garbage collected; they must exit on their own.
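The leak can be observed directly with runtime.NumGoroutine. The following is a minimal sketch, not part of the original example program: the names squares and leakedAfter are hypothetical, and the sender is a simplified stand-in for the sq and merge stages above.

```go
package main

import (
	"fmt"
	"runtime"
)

// squares (hypothetical helper) returns an unbuffered channel and starts one
// goroutine that sends 1*1, 2*2, ..., n*n on it, then closes the channel.
func squares(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i * i // blocks until a receiver is ready
		}
	}()
	return out
}

// leakedAfter starts k senders, receives only the first value from each,
// and reports how many additional goroutines are still alive afterwards.
func leakedAfter(k int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < k; i++ {
		<-squares(3) // abandon the channel after a single value
	}
	return runtime.NumGoroutine() - before
}

func main() {
	// Each abandoned sender still has values to send and no receiver,
	// so it can never exit.
	fmt.Println("leaked goroutines:", leakedAfter(3)) // leaked goroutines: 3
}
```

Each abandoned sender stays alive for the life of the program, which is why the count grows by one per pipeline.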
We need to arrange for the upstream stages of our pipeline to exit even when the downstream stages fail to receive all the inbound values. One way to do this is to change the outbound channels to have a buffer. A buffer can hold a fixed number of values; send operations complete immediately if there's room in the buffer:

    c := make(chan int, 2) // buffer size 2
    c <- 1 // succeeds immediately
    c <- 2 // succeeds immediately
    c <- 3 // blocks until another goroutine does <-c and receives 1
When the number of values to be sent is known at channel creation time, a buffer can simplify the code. For example, we can rewrite gen to copy the list of integers into a buffered channel and avoid creating a new goroutine:

    func gen(nums ...int) <-chan int {
        out := make(chan int, len(nums))
        for _, n := range nums {
            out <- n
        }
        close(out)
        return out
    }
Returning to the blocked goroutines in our pipeline, we might consider adding a buffer to the outbound channel returned by merge:

    func merge(cs ...<-chan int) <-chan int {
        var wg sync.WaitGroup
        out := make(chan int, 1) // enough space for the unread inputs
        // ... the rest is unchanged ...
While this fixes the blocked goroutine in this program, this is bad code. The choice of buffer size of 1 here depends on knowing the number of values merge will receive and the number of values downstream stages will consume. This is fragile: if we pass an additional value to gen, or if the downstream stage reads any fewer values, we will again have blocked goroutines.
Instead, we need to provide a way for downstream stages to indicate to the senders that they will stop accepting input.
Explicit cancellation
When main decides to exit without receiving all the values from out, it must tell the goroutines in the upstream stages to abandon the values they're trying to send. It does so by sending values on a channel called done. It sends two values since there are potentially two blocked senders:

    func main() {
        in := gen(2, 3)

        // Distribute the sq work across two goroutines that both read from in.
        c1 := sq(in)
        c2 := sq(in)

        // Consume the first value from output.
        done := make(chan struct{}, 2)
        out := merge(done, c1, c2)
        fmt.Println(<-out) // 4 or 9

        // Tell the remaining senders we're leaving.
        done <- struct{}{}
        done <- struct{}{}
    }
The sending goroutines replace their send operation with a select statement that proceeds either when the send on out happens or when they receive a value from done. The value type of done is the empty struct because the value doesn't matter: it is the receive event that indicates the send on out should be abandoned. The output goroutines continue looping on their inbound channel, c, so the upstream stages are not blocked. (We'll discuss in a moment how to allow this loop to return early.)

    func merge(done <-chan struct{}, cs ...<-chan int) <-chan int {
        var wg sync.WaitGroup
        out := make(chan int)

        // Start an output goroutine for each input channel in cs. output
        // copies values from c to out until c is closed or it receives a value
        // from done, then output calls wg.Done.
        output := func(c <-chan int) {
            for n := range c {
                select {
                case out <- n:
                case <-done:
                }
            }
            wg.Done()
        }
        // ... the rest is unchanged ...
This approach has a problem: each downstream receiver needs to know the number of potentially blocked upstream senders and arrange to signal those senders on early return. Keeping track of these counts is tedious and error-prone.
We need a way to tell an unknown and unbounded number of goroutines to stop sending their values downstream. In Go, we can do this by closing a channel, because a receive operation on a closed channel can always proceed immediately, yielding the element type's zero value.
This means that main can unblock all the senders simply by closing the done channel. This close is effectively a broadcast signal to the senders. We extend each of our pipeline functions to accept done as a parameter and arrange for the close to happen via a defer statement, so that all return paths from main will signal the pipeline stages to exit.

    func main() {
        // Set up a done channel that's shared by the whole pipeline,
        // and close that channel when this pipeline exits, as a signal
        // for all the goroutines we started to exit.
        done := make(chan struct{})
        defer close(done)

        in := gen(done, 2, 3)

        // Distribute the sq work across two goroutines that both read from in.
        c1 := sq(done, in)
        c2 := sq(done, in)

        // Consume the first value from output.
        out := merge(done, c1, c2)
        fmt.Println(<-out) // 4 or 9

        // done will be closed by the deferred call.
    }
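The broadcast behavior that main relies on can be seen in isolation. The sketch below is an illustration only: the helper name broadcast is hypothetical. It shows that a receive on a closed channel returns the zero value immediately, and that a single close wakes every goroutine waiting on the channel.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// broadcast (hypothetical helper) starts n goroutines that all block
// receiving from done, closes done once, and returns how many of them
// were released by that single close.
func broadcast(n int) int {
	done := make(chan struct{})
	var wg sync.WaitGroup
	var woken int32

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			<-done // proceeds only once done is closed
			atomic.AddInt32(&woken, 1)
		}()
	}

	close(done) // one close releases every waiting receiver
	wg.Wait()
	return int(atomic.LoadInt32(&woken))
}

func main() {
	// A receive on a closed channel never blocks: it yields the zero
	// value, and the second result reports that the channel is closed.
	c := make(chan int)
	close(c)
	v, ok := <-c
	fmt.Println(v, ok) // 0 false

	fmt.Println(broadcast(5)) // 5
}
```

No count of senders is needed here, which is the advantage over sending one value per blocked goroutine.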
Each of our pipeline stages is now free to return as soon as done is closed. The output routine in merge can return without draining its inbound channel, since it knows the upstream sender, sq, will stop attempting to send when done is closed. output ensures wg.Done is called on all return paths via a defer statement:

    func merge(done <-chan struct{}, cs ...<-chan int) <-chan int {
        var wg sync.WaitGroup
        out := make(chan int)

        // Start an output goroutine for each input channel in cs. output
        // copies values from c to out until c or done is closed, then calls
        // wg.Done.
        output := func(c <-chan int) {
            defer wg.Done()
            for n := range c {
                select {
                case out <- n:
                case <-done:
                    return
                }
            }
        }
        // ... the rest is unchanged ...
Similarly, sq can return as soon as done is closed. sq ensures its out channel is closed on all return paths via a defer statement:

    func sq(done <-chan struct{}, in <-chan int) <-chan int {
        out := make(chan int)
        go func() {
            defer close(out)
            for n := range in {
                select {
                case out <- n * n:
                case <-done:
                    return
                }
            }
        }()
        return out
    }
Here are the guidelines for pipeline construction:
- stages close their outbound channels when all the send operations are done.
- stages keep receiving values from inbound channels until those channels are closed or the senders are unblocked.
Pipelines unblock senders either by ensuring there's enough buffer for all the values that are sent or by explicitly signalling senders when the receiver may abandon the channel.
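For reference, here is one way the pieces above might fit together into a single runnable program. It is a sketch under assumptions: the article does not show a gen that accepts done, so the version below is a guess that follows the same select pattern as sq, and firstSquare is a hypothetical helper that plays the role of main.

```go
package main

import (
	"fmt"
	"sync"
)

// gen emits nums on a channel, giving up as soon as done is closed.
// (Assumed implementation; the article only shows the call site.)
func gen(done <-chan struct{}, nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, n := range nums {
			select {
			case out <- n:
			case <-done:
				return
			}
		}
	}()
	return out
}

// sq squares each inbound value until in or done is closed.
func sq(done <-chan struct{}, in <-chan int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			select {
			case out <- n * n:
			case <-done:
				return
			}
		}
	}()
	return out
}

// merge fans in cs onto one channel, closed when all inputs are finished.
func merge(done <-chan struct{}, cs ...<-chan int) <-chan int {
	var wg sync.WaitGroup
	out := make(chan int)
	output := func(c <-chan int) {
		defer wg.Done()
		for n := range c {
			select {
			case out <- n:
			case <-done:
				return
			}
		}
	}
	wg.Add(len(cs))
	for _, c := range cs {
		go output(c)
	}
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// firstSquare (hypothetical) runs the pipeline, takes only the first
// result, and lets the deferred close(done) release every stage.
func firstSquare(nums ...int) int {
	done := make(chan struct{})
	defer close(done)
	in := gen(done, nums...)
	return <-merge(done, sq(done, in), sq(done, in))
}

func main() {
	fmt.Println(firstSquare(2, 3)) // 4 or 9
}
```

Every send in every stage is paired with a receive from done, so no stage can stay blocked once firstSquare returns.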
Digesting a tree
Let's consider a more realistic pipeline.
MD5 is a message-digest algorithm that's useful as a file checksum. The command line utility md5sum prints digest values for a list of files.

    % md5sum *.go
    d47c2bbc28298ca9befdfbc5d3aa4e65  bounded.go
    ee869afd31f83cbb2d10ee81b2b831dc  parallel.go
    b88175e65fdcbc01ac08aaf1fd9b5e96  serial.go
Our example program is like md5sum but instead takes a single directory as an argument and prints the digest values for each regular file under that directory, sorted by path name.

    % go run serial.go .
    d47c2bbc28298ca9befdfbc5d3aa4e65  bounded.go
    ee869afd31f83cbb2d10ee81b2b831dc  parallel.go
    b88175e65fdcbc01ac08aaf1fd9b5e96  serial.go
The main function of our program invokes a helper function MD5All, which returns a map from path name to digest value, then sorts and prints the results:

    func main() {
        // Calculate the MD5 sum of all files under the specified directory,
        // then print the results sorted by path name.
        m, err := MD5All(os.Args[1])
        if err != nil {
            fmt.Println(err)
            return
        }
        var paths []string
        for path := range m {
            paths = append(paths, path)
        }
        sort.Strings(paths)
        for _, path := range paths {
            fmt.Printf("%x %s\n", m[path], path)
        }
    }
The MD5All function is the focus of our discussion. In serial.go, the implementation uses no concurrency and simply reads and sums each file as it walks the tree.

    // MD5All reads all the files in the file tree rooted at root and returns a map
    // from file path to the MD5 sum of the file's contents. If the directory walk
    // fails or any read operation fails, MD5All returns an error.
    func MD5All(root string) (map[string][md5.Size]byte, error) {
        m := make(map[string][md5.Size]byte)
        err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
            if err != nil {
                return err
            }
            if info.IsDir() {
                return nil
            }
            data, err := ioutil.ReadFile(path)
            if err != nil {
                return err
            }
            m[path] = md5.Sum(data)
            return nil
        })
        if err != nil {
            return nil, err
        }
        return m, nil
    }
Parallel digestion
In parallel.go, we split MD5All into a two-stage pipeline. The first stage, sumFiles, walks the tree, digests each file in a new goroutine, and sends the results on a channel with value type result:

    type result struct {
        path string
        sum  [md5.Size]byte
        err  error
    }
sumFiles returns two channels: one for the results and another for the error returned by filepath.Walk. The walk function starts a new goroutine to process each regular file, then checks done. If done is closed, the walk stops immediately:

    func sumFiles(done <-chan struct{}, root string) (<-chan result, <-chan error) {
        // For each regular file, start a goroutine that sums the file and sends
        // the result on c. Send the result of the walk on errc.
        c := make(chan result)
        errc := make(chan error, 1)
        go func() {
            var wg sync.WaitGroup
            err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
                if err != nil {
                    return err
                }
                if info.IsDir() {
                    return nil
                }
                wg.Add(1)
                go func() {
                    data, err := ioutil.ReadFile(path)
                    select {
                    case c <- result{path, md5.Sum(data), err}:
                    case <-done:
                    }
                    wg.Done()
                }()
                // Abort the walk if done is closed.
                select {
                case <-done:
                    return errors.New("walk canceled")
                default:
                    return nil
                }
            })
            // Walk has returned, so all calls to wg.Add are done. Start a
            // goroutine to close c once all the sends are done.
            go func() {
                wg.Wait()
                close(c)
            }()
            // No select needed here, since errc is buffered.
            errc <- err
        }()
        return c, errc
    }
MD5All receives the digest values from c. MD5All returns early on error, closing done via a defer:

    func MD5All(root string) (map[string][md5.Size]byte, error) {
        // MD5All closes the done channel when it returns; it may do so before
        // receiving all the values from c and errc.
        done := make(chan struct{})
        defer close(done)

        c, errc := sumFiles(done, root)

        m := make(map[string][md5.Size]byte)
        for r := range c {
            if r.err != nil {
                return nil, r.err
            }
            m[r.path] = r.sum
        }
        if err := <-errc; err != nil {
            return nil, err
        }
        return m, nil
    }
Bounded parallelism
The MD5All implementation in parallel.go starts a new goroutine for each file. In a directory with many large files, this may allocate more memory than is available on the machine.
We can limit these allocations by bounding the number of files read in parallel. In bounded.go, we do this by creating a fixed number of goroutines for reading files. Our pipeline now has three stages: walk the tree, read and digest the files, and collect the digests.
The first stage, walkFiles, emits the paths of regular files in the tree:

    func walkFiles(done <-chan struct{}, root string) (<-chan string, <-chan error) {
        paths := make(chan string)
        errc := make(chan error, 1)
        go func() {
            // Close the paths channel after Walk returns.
            defer close(paths)
            // No select needed for this send, since errc is buffered.
            errc <- filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
                if err != nil {
                    return err
                }
                if info.IsDir() {
                    return nil
                }
                select {
                case paths <- path:
                case <-done:
                    return errors.New("walk canceled")
                }
                return nil
            })
        }()
        return paths, errc
    }
The middle stage starts a fixed number of digester goroutines that receive file names from paths and send results on channel c:

    func digester(done <-chan struct{}, paths <-chan string, c chan<- result) {
        for path := range paths {
            data, err := ioutil.ReadFile(path)
            select {
            case c <- result{path, md5.Sum(data), err}:
            case <-done:
                return
            }
        }
    }
Unlike our previous examples, digester does not close its output channel, as multiple goroutines are sending on a shared channel. Instead, code in MD5All arranges for the channel to be closed when all the digesters are done:

        // Start a fixed number of goroutines to read and digest files.
        c := make(chan result)
        var wg sync.WaitGroup
        const numDigesters = 20
        wg.Add(numDigesters)
        for i := 0; i < numDigesters; i++ {
            go func() {
                digester(done, paths, c)
                wg.Done()
            }()
        }
        go func() {
            wg.Wait()
            close(c)
        }()
We could instead have each digester create and return its own output channel, but then we would need additional goroutines to fan-in the results.
The final stage receives all the results from c then checks the error from errc. This check cannot happen any earlier, since before this point, walkFiles may block sending values downstream:

        m := make(map[string][md5.Size]byte)
        for r := range c {
            if r.err != nil {
                return nil, r.err
            }
            m[r.path] = r.sum
        }
        // Check whether the Walk failed.
        if err := <-errc; err != nil {
            return nil, err
        }
        return m, nil
    }
Conclusion
This article has presented techniques for constructing streaming data pipelines in Go. Dealing with failures in such pipelines is tricky, since each stage in the pipeline may block attempting to send values downstream, and the downstream stages may no longer care about the incoming data. We showed how closing a channel can broadcast a "done" signal to all the goroutines started by a pipeline and defined guidelines for constructing pipelines correctly.
Further reading:
- Go Concurrency Patterns (video) presents the basics of Go's concurrency primitives and several ways to apply them.
- Advanced Go Concurrency Patterns (video) covers more complex uses of Go's primitives, especially select.
- Douglas McIlroy's paper Squinting at Power Series shows how Go-like concurrency provides elegant support for complex calculations.
Go talks at FOSDEM 2014
24 February 2014
Introduction
At FOSDEM on the 2nd of February 2014 members of the Go community presented a series of talks in the Go Devroom. The day was a huge success, with 13 great talks presented to a consistently jam-packed room.
Video recordings of the talks are now available, and a selection of these videos is presented below.
The complete series of talks is available as a YouTube playlist. (You can also get them directly at the FOSDEM video archive.)
Scaling with Go: YouTube's Vitess
Google Engineer Sugu Sougoumarane described how he and his team built Vitess in Go to help scale YouTube.
Vitess is a set of servers and tools primarily developed in Go. It helps scale MySQL databases for the web, and is currently used as a fundamental component of YouTube's MySQL infrastructure.
The talk covers some history about how and why the team chose Go, and how it paid off. Sugu also talks about tips and techniques used to scale Vitess using Go.
The slides for the talk are available here.
Camlistore
Camlistore is designed to be "your personal storage system for life, putting you in control, and designed to last." It's open source, under nearly 4 years of active development, and extremely flexible. In this talk, Brad Fitzpatrick and Mathieu Lonjaret explain why they built it, what it does, and talk about its design.
Write your own Go compiler
Elliot Stoneham explains the potential for Go as a portable language and reviews the Go tools that make that such an exciting possibility.
He said: "Based on my experiences writing an experimental Go to Haxe translator, I'll talk about the practical issues of code generation and runtime emulation required. I'll compare some of my design decisions with those of two other Go compiler/translators that build on the go.tools library. My aim is to encourage you to try one of these new 'mutant' Go compilers. I hope some of you will be inspired to contribute to one of them or even to write a new one of your own."
More
There were many more great talks, so please check out the complete series as a YouTube playlist. In particular, the lightning talks were a lot of fun.
I would like to give my personal thanks to the excellent speakers, Mathieu Lonjaret for managing the video gear, and to the FOSDEM staff for making all this possible.
Go on App Engine: tools, tests, and concurrency
13 December 2013
Background
When we launched Go for App Engine in May 2011 the SDK was just a modified version of the Python SDK. At the time, there was no canonical way to build or organize Go programs, so it made sense to take the Python approach. Since then Go 1.0 was released, including the go tool and a convention for organizing Go programs.
In January 2013 we announced better integration between the Go App Engine SDK and the go tool, promoting the use of conventional import paths in App Engine apps and making it possible to use "go get" to fetch app dependencies.
With the recent release of App Engine 1.8.8 we are pleased to announce more improvements to the developer experience for Go on App Engine.
The goapp tool
The Go App Engine SDK now includes the "goapp" tool, an App Engine-specific version of the "go" tool. The new name permits users to keep both the regular "go" tool and the "goapp" tool in their system PATH.
In addition to the existing "go" tool commands, the "goapp" tool provides new commands for working with App Engine apps. The "goapp serve" command starts the local development server and the "goapp deploy" command uploads an app to App Engine.
The main advantages offered by the "goapp serve" and "goapp deploy" commands are a simplified user interface and consistency with existing commands like "go get" and "go fmt". For example, to run a local instance of the app in the current directory, run:
$ goapp serve
To upload it to App Engine:
$ goapp deploy
You can also specify the Go import path to serve or deploy:
$ goapp serve github.com/user/myapp
You can even specify a YAML file to serve or deploy a specific module:
$ goapp deploy mymodule.yaml
These commands can replace most uses of dev_appserver.py and appcfg.py, although the Python tools are still available for their less common uses.
Local unit testing
The Go App Engine SDK now supports local unit testing, using Go's native testing package and the "go test" command (provided as "goapp test" by the SDK).
Furthermore, you can now write tests that use App Engine services. The aetest package provides an appengine.Context value that delegates requests to a temporary instance of the development server.
For more information about using "goapp test" and the aetest package, see the Local Unit Testing for Go documentation. Note that the aetest package is still in its early days; we hope to add more features over time.
Better concurrency support
It is now possible to configure the number of concurrent requests served by each of your app's dynamic instances by setting the max_concurrent_requests option (available to Automatic Scaling modules only).
Here's an example app.yaml file:

    application: maxigopher
    version: 1
    runtime: go
    api_version: go1
    automatic_scaling:
      max_concurrent_requests: 100
This configures each instance of the app to serve up to 100 requests concurrently (up from the default of 10). You can configure Go instances to serve up to a maximum of 500 concurrent requests.
This setting allows your instances to handle more simultaneous requests by taking advantage of Go's efficient handling of concurrency, which should yield better instance utilization and ultimately fewer billable instance hours.
Conclusion
With these changes Go on App Engine is more convenient and efficient than ever, and we hope you enjoy the improvements. Please join the google-appengine-go group to raise questions or discuss these changes with the engineering team and the rest of the community.
Inside the Go Playground
12 December 2013
Introduction
In September 2010 we introduced the Go Playground, a web service that compiles and executes arbitrary Go code and returns the program output.
If you're a Go programmer then you have probably already used the playground by using the Go Playground directly, taking the Go Tour, or running executable examples from the Go documentation.
You may also have used it by clicking one of the "Run" buttons in a slide deck on talks.golang.org or a post on this very blog (such as the recent article on Strings).
In this article we will take a look at how the playground is implemented and integrated with these services. The implementation involves a variant operating system environment and runtime and our description here assumes you have some familiarity with systems programming using Go.
Overview

The playground service has three parts:
- A back end that runs on Google's servers. It receives RPC requests, compiles the user program using the gc tool chain, executes the user program, and returns the program output (or compilation errors) as the RPC response.
- A front end that runs on Google App Engine. It receives HTTP requests from the client and makes corresponding RPC requests to the back end. It also does some caching.
- A JavaScript client that implements the user interface and makes HTTP requests to the front end.
The back end
The back end program itself is trivial, so we won't discuss its implementation here. The interesting part is how we safely execute arbitrary user code in a secure environment while still providing core functionality such as time, the network, and the file system.
To isolate user programs from Google's infrastructure, the back end runs them under Native Client (or "NaCl"), a technology developed by Google to permit the safe execution of x86 programs inside web browsers. The back end uses a special version of the gc tool chain that generates NaCl executables.
(This special tool chain will be merged into the core for Go 1.3. To learn more, read the design document. If you want to play with NaCl before then, you can check out a fork that has all the changes.)
NaCl limits the amount of CPU and RAM a program may consume, and it prevents programs from accessing the network or file system. This presents a problem, however. Go's concurrency and networking support are among its key strengths, and access to the file system is vital for many programs. To demonstrate concurrency effectively we need time, and to demonstrate networking and the file system we obviously need a network and a file system.
Although all these things are supported today, the first version of the playground, launched in 2010, had none of them. The current time was fixed at 10 November 2009, time.Sleep had no effect, and most functions of the os and net packages were stubbed out to return an EINVALID error.
A year ago we implemented fake time in the playground, so that programs that sleep would behave correctly. A more recent update to the playground introduced a fake network stack and a fake file system, making the playground's tool chain similar to a normal Go tool chain. These facilities are described in the following sections.
Faking time
Playground programs are limited in the amount of CPU time and memory they can use, but they are also restricted in how much real time they can use. This is because each running program consumes resources on the back end and any stateful infrastructure between it and the client. Limiting the run time of each playground program makes our service more predictable and defends us against denial of service attacks.
But these restrictions become stifling when running code that uses time. The Go Concurrency Patterns talk demonstrates concurrency with examples that use timing functions like time.Sleep and time.After. When run under early versions of the playground, these programs' sleeps would have no effect and their behavior would be strange (and sometimes wrong).
By using a clever trick we can make a Go program think that it is sleeping, when really the sleeps take no time at all. To explain the trick we first need to understand how the scheduler manages sleeping goroutines.
When a goroutine calls time.Sleep (or similar) the scheduler adds a timer to a heap of pending timers and puts the goroutine to sleep. Meanwhile, a special timer goroutine manages that heap. When the timer goroutine starts it tells the scheduler to wake it when the next pending timer is ready to fire and then sleeps. When it wakes up it checks which timers have expired, wakes the appropriate goroutines, and goes back to sleep.
The trick is to change the condition that wakes the timer goroutine. Instead of waking it after a specific time period, we modify the scheduler to wait for a deadlock: the state in which all goroutines are blocked.
The playground version of the runtime maintains its own internal clock. When the modified scheduler detects a deadlock it checks whether any timers are pending. If so, it advances the internal clock to the trigger time of the earliest timer and then wakes the timer goroutine. Execution continues and the program believes that time has passed, when in fact the sleep was nearly instantaneous.
These changes to the scheduler can be found in proc.c and time.goc.
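The deadlock trick can be modelled outside the runtime. The following is a minimal sketch, not the code from proc.c or time.goc: the fakeClock type and its sleep and onDeadlock methods are hypothetical names invented for illustration. It keeps a heap of pending trigger times and, whenever everything is "blocked", jumps the clock straight to the earliest one.

```go
package main

import (
	"container/heap"
	"fmt"
)

// timerHeap is a min-heap of timer trigger times, in nanoseconds.
type timerHeap []int64

func (h timerHeap) Len() int            { return len(h) }
func (h timerHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h timerHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *timerHeap) Push(x interface{}) { *h = append(*h, x.(int64)) }
func (h *timerHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// fakeClock is a hypothetical stand-in for the playground runtime's
// internal clock together with its heap of pending timers.
type fakeClock struct {
	now    int64
	timers timerHeap
}

// sleep registers a timer that fires d nanoseconds after the current fake time.
func (c *fakeClock) sleep(d int64) {
	heap.Push(&c.timers, c.now+d)
}

// onDeadlock models the scheduler's reaction when all goroutines are blocked.
// If a timer is pending, it advances the clock to the earliest trigger time
// and reports true. If none is pending, the deadlock is real and it reports false.
func (c *fakeClock) onDeadlock() bool {
	if c.timers.Len() == 0 {
		return false
	}
	c.now = heap.Pop(&c.timers).(int64)
	return true
}

func main() {
	c := &fakeClock{}
	for _, d := range []int64{3e9, 1e9, 2e9} {
		c.sleep(d)
	}
	// No real time passes here, yet the clock visits each trigger time in order.
	for c.onDeadlock() {
		fmt.Println("fake clock now at", c.now, "ns")
	}
}
```

The real runtime wakes the timer goroutine at this point rather than popping the heap itself; the sketch folds those two steps together to keep the idea visible.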
Fake time fixes the issue of resource exhaustion on the back end, but what about the program output? It would be odd to see a program that sleeps run to completion correctly without taking any time.
The following program prints the current time each second and then exits after three seconds. Try running it.
package main

import (
	"fmt"
	"time"
)

func main() {
	stop := time.After(3 * time.Second)     // fires once, after three seconds
	tick := time.NewTicker(1 * time.Second) // fires every second
	defer tick.Stop()
	for {
		select {
		case <-tick.C:
			fmt.Println(time.Now())
		case <-stop:
			return
		}
	}
}
How does this work? It is a collaboration between the back end, front end, and client.
We capture the timing of each write to standard output and standard error and provide it to the client. Then the client can "play back" the writes with the correct timing, so that the output appears just as if the program were running locally.
The playground's runtime package provides a special write function that includes a small "playback header" before each write. The playback header comprises a magic string, the current time, and the length of the write data. A write with a playback header has this structure:
0 0 P B <8-byte time> <4-byte data length> <data>
The raw output of the program above looks like this:
\x00\x00PB\x11\x74\xef\xed\xe6\xb3\x2a\x00\x00\x00\x00\x1e2009-11-10 23:00:01 +0000 UTC
\x00\x00PB\x11\x74\xef\xee\x22\x4d\xf4\x00\x00\x00\x00\x1e2009-11-10 23:00:02 +0000 UTC
\x00\x00PB\x11\x74\xef\xee\x5d\xe8\xbe\x00\x00\x00\x00\x1e2009-11-10 23:00:03 +0000 UTC
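To make the framing concrete, here is a small sketch that builds one such record in ordinary Go. It is not the runtime's write function: playbackRecord is a hypothetical helper, and the choice of big-endian byte order with the time expressed as Unix nanoseconds is an assumption inferred from the sample bytes above.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// playbackRecord frames data with a playback header: the magic bytes
// 0 0 'P' 'B', an 8-byte time, and a 4-byte data length.
// Assumption: both integers are big-endian and the time is in Unix
// nanoseconds, as the sample output suggests.
func playbackRecord(unixNano int64, data []byte) []byte {
	rec := make([]byte, 16, 16+len(data))
	copy(rec, "\x00\x00PB")
	binary.BigEndian.PutUint64(rec[4:12], uint64(unixNano))
	binary.BigEndian.PutUint32(rec[12:16], uint32(len(data)))
	return append(rec, data...)
}

func main() {
	t := time.Date(2009, time.November, 10, 23, 0, 1, 0, time.UTC)
	rec := playbackRecord(t.UnixNano(), []byte(t.String()+"\n"))
	fmt.Printf("%q\n", rec)
}
```

Under those assumptions the record for 23:00:01 matches the first record of the raw output byte for byte.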
The front end parses this output as a series of events and returns a list of events to the client as a JSON object:
{
	"Errors": "",
	"Events": [
		{
			"Delay": 1000000000,
			"Message": "2009-11-10 23:00:01 +0000 UTC\n"
		},
		{
			"Delay": 1000000000,
			"Message": "2009-11-10 23:00:02 +0000 UTC\n"
		},
		{
			"Delay": 1000000000,
			"Message": "2009-11-10 23:00:03 +0000 UTC\n"
		}
	]
}
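The parsing step might look roughly like the sketch below. It is not the front end's actual code: the Event type, the decode function, and the startNanos constant are hypothetical, and the sketch assumes big-endian header fields, Unix nanosecond times, and a program start time of 23:00:00 UTC on 10 November 2009 (inferred from the first delay being one second).

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
)

// Event mirrors the shape of the events in the JSON response.
type Event struct {
	Delay   int64 // nanoseconds since the previous event
	Message string
}

const headerLen = 16 // 4-byte magic + 8-byte time + 4-byte length

// startNanos is the assumed fake start time, 2009-11-10 23:00:00 UTC.
const startNanos int64 = 1257894000 * 1000000000

// decode splits raw playback output into events, turning absolute
// timestamps into delays relative to the previous write.
func decode(raw []byte) ([]Event, error) {
	var events []Event
	last := startNanos
	for len(raw) > 0 {
		if len(raw) < headerLen || string(raw[:4]) != "\x00\x00PB" {
			return nil, errors.New("malformed playback header")
		}
		t := int64(binary.BigEndian.Uint64(raw[4:12]))
		n := int(binary.BigEndian.Uint32(raw[12:16]))
		if len(raw)-headerLen < n {
			return nil, errors.New("truncated write data")
		}
		msg := raw[headerLen : headerLen+n]
		events = append(events, Event{Delay: t - last, Message: string(msg)})
		last = t
		raw = raw[headerLen+n:]
	}
	return events, nil
}

func main() {
	raw := []byte("\x00\x00PB\x11\x74\xef\xed\xe6\xb3\x2a\x00\x00\x00\x00\x1e2009-11-10 23:00:01 +0000 UTC\n" +
		"\x00\x00PB\x11\x74\xef\xee\x22\x4d\xf4\x00\x00\x00\x00\x1e2009-11-10 23:00:02 +0000 UTC\n")
	events, err := decode(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	out, _ := json.MarshalIndent(map[string]interface{}{"Errors": "", "Events": events}, "", "\t")
	fmt.Println(string(out))
}
```

Sending delays rather than absolute times means the client needs no knowledge of the fake clock; it only has to wait the given interval before printing each message.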
The JavaScript client (running in the user's web browser) then plays back the events using the provided delay intervals. To the user it appears that the program is running in real time.
Faking the file system
Programs built with Go's NaCl tool chain cannot access the local machine's file system. Instead, the syscall package's file-related functions (Open, Read, Write, and so on) operate on an in-memory file system that is implemented by the syscall package itself. Since package syscall is the interface between the Go code and the operating system kernel, user programs see the file system exactly the same way as they would a real one.
The following example program writes data to a file, and then copies its contents to standard output. Try running it. (You can edit it, too!)
package main

import (
	"fmt"
	"io/ioutil"
	"log"
)

func main() {
	const filename = "/tmp/file.txt"

	// Write the file, then read it back and print its contents.
	err := ioutil.WriteFile(filename, []byte("Hello, file system\n"), 0644)
	if err != nil {
		log.Fatal(err)
	}

	b, err := ioutil.ReadFile(filename)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("%s", b)
}
When a process starts, the file system is populated with some devices under /dev and an empty /tmp directory. The program can manipulate the file system as usual, but when the process exits any changes to the file system are lost.
There is also a provision to load a zip file into the file system at init time (see unzip_nacl.go). So far we have only used the unzip facility to provide the data files required to run the standard library tests, but we intend to provide playground programs with a set of files that can be used in documentation examples, blog posts, and the Go Tour.
The implementation can be found in the fs_nacl.go and fd_nacl.go files (which, by virtue of their _nacl suffix, are built into package syscall only when GOOS is set to nacl).
The file system itself is represented by the fsys struct, of which a global instance (named fs) is created during init time. The various file-related functions then operate on fs instead of making the actual system call. For instance, here is the syscall.Open function:
func Open(path string, openmode int, perm uint32) (fd int, err error) {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	f, err := fs.open(path, openmode, perm&0777|S_IFREG)
	if err != nil {
		return -1, err
	}
	return newFD(f), nil
}
File descriptors are tracked by a global slice named files. Each file descriptor corresponds to a file and each file provides a value that implements the fileImpl interface. There are several implementations of the interface:
- regular files and devices (such as /dev/random) are represented by fsysFile,
- standard input, output, and error are instances of naclFile, which uses system calls to interact with the actual files (these are a playground program's only way to interact with the outside world),
- network sockets have their own implementation, discussed in the next section.
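The general shape of this design, a table of descriptors that dispatches to interchangeable implementations, can be sketched as follows. This is a toy model, not the code in fd_nacl.go: fileOps, memFile, and fdTable are hypothetical names, and the interface is far smaller than the real fileImpl.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errBadFD = errors.New("bad file descriptor")

// fileOps is a hypothetical, cut-down analogue of the fileImpl interface.
type fileOps interface {
	read(p []byte) (int, error)
	write(p []byte) (int, error)
}

// memFile keeps its contents in memory. As a simplification, writes append
// to the data and reads consume it from a separate read offset.
type memFile struct {
	data []byte
	off  int
}

func (f *memFile) read(p []byte) (int, error) {
	n := copy(p, f.data[f.off:])
	f.off += n
	return n, nil
}

func (f *memFile) write(p []byte) (int, error) {
	f.data = append(f.data, p...)
	return len(p), nil
}

// fdTable maps small integers to file implementations.
type fdTable struct {
	mu    sync.Mutex
	files []fileOps
}

// newFD stores f in the table and returns its descriptor.
func (t *fdTable) newFD(f fileOps) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.files = append(t.files, f)
	return len(t.files) - 1
}

func (t *fdTable) lookup(fd int) (fileOps, error) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if fd < 0 || fd >= len(t.files) {
		return nil, errBadFD
	}
	return t.files[fd], nil
}

// read and write play the role of the system calls: they resolve the
// descriptor and delegate to whatever implementation sits behind it.
func (t *fdTable) read(fd int, p []byte) (int, error) {
	f, err := t.lookup(fd)
	if err != nil {
		return 0, err
	}
	return f.read(p)
}

func (t *fdTable) write(fd int, p []byte) (int, error) {
	f, err := t.lookup(fd)
	if err != nil {
		return 0, err
	}
	return f.write(p)
}

func main() {
	var tbl fdTable
	fd := tbl.newFD(&memFile{})
	tbl.write(fd, []byte("Hello, fake file\n"))
	buf := make([]byte, 64)
	n, _ := tbl.read(fd, buf)
	fmt.Printf("fd %d: %s", fd, buf[:n])
}
```

Because callers only ever see descriptors, a second implementation (one backed by real system calls, or by a fake socket) can be added without changing read or write.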
Faking the network
Like the file system, the playground's network stack is an in-process fake implemented by the syscall package. It permits playground projects to use the loopback interface (127.0.0.1). Requests to other hosts will fail.
For an executable example, run the following program. It listens on a TCP port, waits for an incoming connection, copies the data from that connection to standard output, and exits. In another goroutine, it makes a connection to the listening port, writes a string to the connection, and closes it.
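package main

import (
	"io"
	"log"
	"net"
	"os"
)

func main() {
	l, err := net.Listen("tcp", "127.0.0.1:4000")
	if err != nil {
		log.Fatal(err)
	}
	defer l.Close()

	go dial()

	c, err := l.Accept()
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Copy everything the client sends to standard output.
	io.Copy(os.Stdout, c)
}

// dial connects to the listener, writes a greeting, and closes the connection.
func dial() {
	c, err := net.Dial("tcp", "127.0.0.1:4000")
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()
	c.Write([]byte("Hello, network\n"))
}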
The interface to the network is more complex than the one for files, so the implementation of the fake network is larger and more complex than the fake file system. It must simulate read and write timeouts, different address types and protocols, and so on.
The implementation can be found in net_nacl.go. A good place to start reading is netFile, the network socket implementation of the fileImpl interface.
The front end
The playground front end is another simple program (shorter than 100 lines). It receives HTTP requests from the client, makes RPC requests to the back end, and does some caching.
The front end serves an HTTP handler at http://golang.org/compile. The handler expects a POST request with a body field (the Go program to run) and an optional version field (for most clients this should be "2").
When the front end receives a compilation request it first checks memcache to see if it has cached the results of a previous compilation of that source. If found, it returns the cached response. The cache prevents popular programs such as those on the Go home page from overloading the back ends. If there is no cached response, the front end makes an RPC request to the back end, stores the response in memcache, parses the playback events, and returns a JSON object to the client as the HTTP response (as described above).
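The check-cache-then-compile flow is simple enough to sketch. The code below is an illustration under stated assumptions, not the front end's source: an in-memory map stands in for memcache, a function value stands in for the RPC to the back end, and keying the cache by a SHA-256 hash of the source is a guess at one reasonable scheme.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// frontEnd is a hypothetical model of the caching layer.
type frontEnd struct {
	mu      sync.Mutex
	cache   map[string]string                // stand-in for memcache
	compile func(src string) (string, error) // stand-in for the back end RPC
}

// cacheKey derives a fixed-length key from the program source (an assumption).
func cacheKey(src string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(src)))
}

// run returns the cached response for src if there is one. Otherwise it
// asks the back end and stores the response for next time.
func (f *frontEnd) run(src string) (string, error) {
	key := cacheKey(src)

	f.mu.Lock()
	resp, ok := f.cache[key]
	f.mu.Unlock()
	if ok {
		return resp, nil
	}

	resp, err := f.compile(src)
	if err != nil {
		return "", err
	}

	f.mu.Lock()
	f.cache[key] = resp
	f.mu.Unlock()
	return resp, nil
}

func main() {
	calls := 0
	fe := &frontEnd{
		cache: make(map[string]string),
		compile: func(src string) (string, error) {
			calls++
			return "output of: " + src, nil
		},
	}
	fe.run("package main")
	fe.run("package main") // served from the cache
	fmt.Println("back end calls:", calls)
}
```

Two concurrent requests for the same uncached program would both reach the back end in this sketch; that is harmless here because the result is the same either way.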
The client
The various sites that use the playground each share some common JavaScript code for setting up the user interface (the code and output boxes, the run button, and so on) and communicating with the playground front end.
This implementation is in the file playground.js in the go.tools repository, which can be imported from the go.tools/godoc/static package. Some of it is clean and some is a bit crufty, as it is the result of consolidating several divergent implementations of the client code.
The playground function takes some HTML elements and turns them into an interactive playground widget. You should use this function if you want to put the playground on your own site (see 'Other clients' below).
The Transport interface (not formally defined, this being JavaScript) abstracts the user interface from the means of talking to the web front end.
HTTPTransport is an implementation of Transport that speaks the HTTP-based protocol described earlier.
SocketTransport is another implementation that speaks WebSocket (see 'Playing offline' below).
To comply with the same-origin policy, the various web servers (godoc, for instance) proxy requests to /compile through to the playground service at http://golang.org/compile. The common go.tools/playground package does this proxying.
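As a rough illustration of such proxying, the sketch below forwards /compile on a local site to a separate service. It uses the standard library's httputil.NewSingleHostReverseProxy as a stand-in; the go.tools/playground package may well do this differently. The newCompileProxy and demo functions are hypothetical, and both servers are local test servers rather than the real playground.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newCompileProxy returns a handler that forwards requests to target.
func newCompileProxy(target string) (http.Handler, error) {
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	return httputil.NewSingleHostReverseProxy(u), nil
}

// demo wires up client -> local site -> fake playground service and
// returns the body that the client receives.
func demo() (string, error) {
	// A fake playground service that echoes the submitted program.
	playground := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "compiled %q", r.FormValue("body"))
	}))
	defer playground.Close()

	proxy, err := newCompileProxy(playground.URL)
	if err != nil {
		return "", err
	}

	// The local site serves /compile from its own origin.
	mux := http.NewServeMux()
	mux.Handle("/compile", proxy)
	site := httptest.NewServer(mux)
	defer site.Close()

	// The client posts the body and version fields to the local site.
	resp, err := http.PostForm(site.URL+"/compile", url.Values{
		"body":    {"package main"},
		"version": {"2"},
	})
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	out, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```

From the browser's point of view the request never leaves the site's own origin, which is what the same-origin policy requires.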
Playing offline
Both the Go Tour and the Present Tool can be run offline. This is great for people with limited internet connectivity or presenters at conferences who cannot (and should not) rely on a working internet connection.
To run offline, the tools run their own version of the playground back end on the local machine. The back end uses a regular Go tool chain with none of the aforementioned modifications and uses a WebSocket to communicate with the client.
The WebSocket back end implementation can be found in the go.tools/playground/socket package. The Inside Present talk discusses this code in detail.
Other clients
The playground service is used by more than just the official Go project (Go by Example is one other instance) and we are happy for you to use it on your own site. All we ask is that you contact us first, that you use a unique user agent in your requests (so we can identify you), and that your service benefits the Go community.
Conclusion
From godoc to the tour to this very blog, the playground has become an essential part of our Go documentation story. With the recent additions of the fake file system and network stack we are excited to expand our learning materials to cover those areas.
But, ultimately, the playground is just the tip of the iceberg. With Native Client support scheduled for Go 1.3, we look forward to seeing what the community can do with it.
This article is part 12 of the Go Advent Calendar, a series of daily blog posts throughout December.
See the index for more articles.