Asynchronous Go for Beginners

By Noam Yadgar
#go #concurrency #asynchronous #goroutines #channels #sync #waitgroup #software-engineering

If you would like to skip to a more advanced article, check out Synchronization Patterns in Go

When it comes to asynchronous programming, Go is one of the strongest languages available. Concurrency is so central to its design that the keyword go itself is what starts a new goroutine.

What is a goroutine?

A goroutine is a lightweight thread of execution, managed and scheduled by the Go runtime rather than by the operating system. Whenever you prefix a function call with the keyword go, the current goroutine spawns a new goroutine that runs that call concurrently, in the background.

In this post, I’ll cover the basics of writing concurrent programs in Go, so that after reading it you will understand important concepts like shared memory, channels, locks, WaitGroups, deadlocks, and more.

Why should I write asynchronous programs?

To understand the power of asynchronous programming, let’s simulate a process that might take up to 5 seconds:

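func doSomething() {
  // Since Go 1.20 the global generator is seeded automatically and
  // rand.Seed is deprecated; the call is kept for older toolchains.
  rand.Seed(time.Now().UnixNano())
  // rand.Intn(5) returns 0 to 4, so this sleeps for less than 5 seconds.
  time.Sleep(time.Second * time.Duration(rand.Intn(5)))
  fmt.Println("writing from a goroutine")
}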

This function writes a message to standard output, but only after a random delay of up to 5 seconds. If we run this main function:

func main() {
  doSomething()
  fmt.Println("writing from main goroutine")
}

Our program will take up to 5 seconds to print writing from a goroutine, and only then print writing from main goroutine.

Think of a scenario where we are running doSomething about a hundred times:

func main() {
  n := 100
  for i := 0; i < n; i++ {
    doSomething()
  }
  fmt.Println("writing from main goroutine")
}

Running this program can take up to about 8 minutes and 20 seconds (100 calls × 5 seconds). That is very inefficient: on every iteration we have to wait for doSomething to finish before starting the next one. This is known as synchronous behavior.

We can do a lot better by making our code asynchronous. By prefixing each doSomething call with the go keyword, we spawn a new goroutine and run every iteration concurrently:

Goroutines

package main

import (
  "fmt"
  "math/rand"
  "time"
)

func doSomething() {
  rand.Seed(time.Now().UnixNano())
  time.Sleep(time.Second * time.Duration(rand.Intn(5)))
  fmt.Println("writing from a goroutine")
}

func main() {
  n := 100
  for i := 0; i < n; i++ {
    go doSomething()
  }
  fmt.Println("writing from main goroutine")
}

I’ve included the entire main.go content because I encourage you to copy and run this on your machine.

Spoiler: when you run this, there’s a pretty good chance that you’ll get:

go run main.go
writing from main goroutine

Where is the output from doSomething? Weird, isn’t it? Well, no. When the main goroutine finishes, the program exits and every goroutine that is still running is terminated with it. If you think about it, our main function can be over before even the first call of doSomething completes (each call can take up to 5 seconds). So the main goroutine should be kept alive, waiting for all the other goroutines to finish. To do this, we should talk about a very common synchronization pattern in Go.

sync.WaitGroup

Goroutines are not aware of each other. You should think of each goroutine as an independent unit of execution. However, they all run inside the same process, under the same Go runtime, and have access to the same resources, including memory.

So maybe we can pass each goroutine a pointer to some counter, starting by setting this counter to the number of planned goroutines, and at the end of every goroutine, invoke an atomic operation that can decrement the counter by 1.

To finish this pattern, we can call a blocking function on the main goroutine that waits for the counter to reach 0. When the counter reaches 0, this function returns and lets the main goroutine continue. This way, the main goroutine waits for all other goroutines to end before moving on.
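The counter idea described above can be sketched by hand with the sync/atomic package. This is only an illustration of the pattern, not how the standard library implements it; runAll, pending, and finished are hypothetical names, and the busy-wait loop is a deliberate simplification.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// runAll is a hypothetical helper: it starts n goroutines, each of which
// records that it ran and then decrements a shared counter. It returns
// only after the counter has dropped to 0.
func runAll(n int) int64 {
	pending := int64(n) // goroutines that have not finished yet
	var finished int64  // goroutines that completed their "work"

	for i := 0; i < n; i++ {
		go func() {
			atomic.AddInt64(&finished, 1) // stand-in for real work
			atomic.AddInt64(&pending, -1) // atomic decrement: "I'm done"
		}()
	}

	// Blocking wait: spin until every goroutine has decremented the counter.
	// runtime.Gosched yields the processor so other goroutines can run.
	for atomic.LoadInt64(&pending) > 0 {
		runtime.Gosched()
	}
	return atomic.LoadInt64(&finished)
}

func main() {
	fmt.Println(runAll(100)) // prints: 100
}
```

Spinning like this keeps a CPU core busy while waiting, which is one reason to prefer a ready-made, properly blocking primitive.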

Although this can be a fun challenge to implement, Go has a dedicated type for this exact pattern. It can be found in the standard library, in a package called sync, and it’s called WaitGroup:

package main

import (
  "fmt"
  "math/rand"
  "sync"
  "time"
)

func doSomething(wg *sync.WaitGroup) {
  // Defer Done first, so the counter is decremented however the function exits.
  defer wg.Done()
  rand.Seed(time.Now().UnixNano())
  time.Sleep(time.Second * time.Duration(rand.Intn(5)))
  fmt.Println("writing from a goroutine")
}

func main() {
  var wg sync.WaitGroup
  n := 100

  wg.Add(n)
  for i := 0; i < n; i++ {
    go doSomething(&wg)
  }

  wg.Wait()
  fmt.Println("writing from main goroutine")
}

Take a moment to appreciate the power of this program. We’ve turned a program that could run for 8 minutes and 20 seconds into one that runs for 5 seconds at most, using asynchronous, concurrent programming.

Channels

Can goroutines exchange data? Yes, they can! The channel type (or chan) is Go’s built-in way to handle communication between goroutines. Remember that goroutines are not aware of each other, but they can send and receive values using channels. A channel has an element type and is used as a medium for moving values of that type (for example chan int or chan string).

There are 2 types of channels:

  • Buffered - A channel with a fixed capacity
  • Unbuffered (the default) - A channel with no capacity

Buffered Channels

We’ll start by creating a channel of type int with a buffer size of 3:

bChan := make(chan int, 3)

This means that the channel can hold up to 3 values of type int at a time. We send values to the channel and receive values from it using the <- operator, and values come out in the order they went in.

bChan <- 5
bChan <- 20
bChan <- 1
fmt.Println(<-bChan)
fmt.Println(<-bChan)
fmt.Println(<-bChan)
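Here is the same snippet as a complete program, as a small sketch. It also uses the built-in len and cap functions to show how many values are currently buffered; drain is a hypothetical helper added only for the example.

```go
package main

import "fmt"

// drain is a hypothetical helper that receives count values from ch
// and returns them in the order they arrived.
func drain(ch chan int, count int) []int {
	out := make([]int, 0, count)
	for i := 0; i < count; i++ {
		out = append(out, <-ch)
	}
	return out
}

func main() {
	bChan := make(chan int, 3)

	bChan <- 5
	bChan <- 20
	bChan <- 1
	fmt.Println(len(bChan), cap(bChan)) // prints: 3 3

	fmt.Println(drain(bChan, 3)) // prints: [5 20 1]
	fmt.Println(len(bChan))      // prints: 0
}
```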

If the channel is full, and we are trying to send data, this operation will block the goroutine. The same goes if we’re trying to read from a channel with an empty buffer.

Suppose there are no longer any running goroutines that send data to a channel, the buffer is empty, and we’re trying to read from this channel in our main goroutine. We will be blocked forever! Since our main goroutine is blocked, it cannot do anything to send data to the channel. The Go runtime will detect this and let you know that you have reached a deadlock.

The same goes for the other direction. If your main goroutine is trying to send to a full buffered channel and there are no running goroutines in the background to read data from this channel, you’ll reach a deadlock.

To better understand this behavior, I’ll send and receive some data on a single goroutine:

bChan <- 1 // {1, nil, nil}
bChan <- 7 // {1, 7, nil}
fmt.Println(<-bChan) // {7, nil, nil}
bChan <- 3 // {7, 3, nil}
bChan <- 9 // {7, 3, 9}
bChan <- 1 // {7, 3, 9} **BLOCKED**
// since we can't make anyone read data from this channel,
// this is also a deadlock
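For contrast, here is a minimal sketch of the same sequence of sends with a second goroutine receiving from the channel. The sender may still pause when the buffer is full, but only until the receiver makes room. sendAll and the result channel are hypothetical names used for illustration.

```go
package main

import "fmt"

// sendAll is a hypothetical helper: it sends every value into a channel with
// a buffer of 3 while a second goroutine receives, and returns what arrived.
func sendAll(values []int) []int {
	bChan := make(chan int, 3)
	result := make(chan []int)

	// Receiver goroutine: takes exactly len(values) items off bChan.
	go func() {
		got := make([]int, 0, len(values))
		for i := 0; i < len(values); i++ {
			got = append(got, <-bChan)
		}
		result <- got
	}()

	for _, v := range values {
		bChan <- v // may block while the buffer is full, but not forever
	}
	return <-result
}

func main() {
	fmt.Println(sendAll([]int{1, 7, 3, 9, 1})) // prints: [1 7 3 9 1]
}
```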

Ok, let’s go back to our code and tweak it.

  • Modify doSomething so that it returns the number of seconds it ran
  • Create a channel with size n
  • Send each result of doSomething to the channel
  • Read the data from the channel and accumulate it
  • Print the total collective time of all the goroutines

package main

import (
	"fmt"
	"math/rand"
	"time"
)

func doSomething() int {
	rand.Seed(time.Now().UnixNano())
	r := rand.Intn(5)
	time.Sleep(time.Second * time.Duration(r))
	return r
}

func main() {
	n := 100
	c := make(chan int, n)

	for i := 0; i < n; i++ {
		go func() {
			c <- doSomething()
		}()
	}

	y := 0
	for i := 0; i < n; i++ {
		y += <-c
		fmt.Printf("current collective time: %d\n", y)
	}

	fmt.Printf("total collective time: %d\n", y)
}

If you run this, it will calculate how long (in seconds) the program would have taken if it had run synchronously (of course, it takes about 5 seconds at most to tell you this).

Also, notice that we don’t need a WaitGroup here, since we’re spawning exactly n goroutines, sending exactly n values to the n-sized channel, and reading exactly n values from it. The main goroutine blocks on each receive until a value arrives, so it cannot exit early, and because the number of sends matches the number of receives, nothing is left waiting forever.

Most of the time you’ll probably not have a nice, predetermined size of values. Your channels will act as “pipes” for data flow.

A Channel (unbuffered)

Unlike a buffered channel, a channel created without a size has no buffer at all (its capacity is 0).

c := make(chan int)

Since there is no buffer to hold values, this type of channel behaves differently from a buffered channel.

Sending to this channel blocks the sending goroutine until another goroutine receives from it, and vice versa. So if you make a channel in the main goroutine and send data to it in the same goroutine, you will reach a deadlock, since the send blocks the main goroutine and no one will ever be able to receive from the channel.

package main

func main() {
  c := make(chan int)
  c <- 5 // deadlock
}
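For contrast with the deadlocking program above, here is a minimal sketch where the send happens in a separate goroutine, so the main goroutine is free to receive. handOff is a hypothetical helper name.

```go
package main

import "fmt"

// handOff is a hypothetical helper: it sends v on an unbuffered channel from
// a new goroutine and receives it in the calling goroutine.
func handOff(v int) int {
	c := make(chan int)
	go func() {
		c <- v // blocks until the caller is ready to receive
	}()
	return <-c // blocks until the goroutine above sends
}

func main() {
	fmt.Println(handOff(5)) // prints: 5
}
```

Neither side can get ahead of the other: the send and the receive complete together, which is what makes an unbuffered channel a synchronization point as well as a way to pass data.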

And the other way around, receiving from a channel that no one will ever send to again:

package main

import "fmt"

func main() {
    c := make(chan int)
    go func() { c <- 5 }()
    go func() { c <- 6 }()

    for x := range c { // constantly trying to receive from c
        fmt.Println(x)
    }

    fmt.Println("this line will never be printed")
}

The output will be (the order of 5 and 6 may vary)

go run main.go
6
5
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan receive]:
main.main()
	/Users/nyadgar/main.go:10 +0x105
exit status 2

We were able to read up to the point where our main goroutine tried to receive from an empty channel while there were no other goroutines left to send to it, essentially blocking the main goroutine forever. To fix this problem we should gracefully close() this channel:

close(c)
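To see what closing does from the receiver’s side, here is a small sketch. It assumes a buffered channel so that a single goroutine can send, close, and then read; sumUntilClosed is a hypothetical helper name. Note that sending on a channel after it has been closed causes a panic.

```go
package main

import "fmt"

// sumUntilClosed receives from c until it is closed and returns the total.
func sumUntilClosed(c chan int) int {
	total := 0
	for x := range c { // the loop ends once c is closed and drained
		total += x
	}
	return total
}

func main() {
	c := make(chan int, 2) // buffered so one goroutine can send, close, then read
	c <- 5
	c <- 6
	close(c) // no more sends are allowed after this point

	fmt.Println(sumUntilClosed(c)) // prints: 11

	// Receiving from a closed, empty channel never blocks: it returns the
	// zero value, and the second result reports that the channel is closed.
	v, ok := <-c
	fmt.Println(v, ok) // prints: 0 false
}
```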

But obviously, we can’t just close the channel at an arbitrary point. We want to make sure that every sender has finished sending, and only then close it. This is where we can say hello again to our good old friend, the WaitGroup.

Our plan goes like this:

  • Every time we spawn a new goroutine, we increment the WaitGroup by 1
  • Whenever a goroutine finishes, it calls Done() (decrementing the WaitGroup by 1)
  • We’ll start one extra goroutine whose job is to Wait() until the WaitGroup reaches 0 and then close(c) the channel. This is important because it keeps a dedicated goroutine alive until all the sending goroutines have terminated. We don’t have to worry about the main goroutine terminating prematurely, because it stays blocked at least until it has read the final value from the channel.
  • Once our channel is closed, the main goroutine will stop iterating through the channel and move on to the next block of code.

package main

import (
  "fmt"
  "sync"
)

func main() {
  var wg sync.WaitGroup
  c := make(chan int)

  wg.Add(1)
  go func() {
    defer wg.Done()
    c <- 5
  }()

  wg.Add(1)
  go func() {
    defer wg.Done()
    c <- 6
  }()

  go func() {
    wg.Wait()
    close(c)
  }()

  for x := range c {
    fmt.Println(x)
  }

  fmt.Println("this line will be printed")
}

Running this (the order of 5 and 6 may vary):

go run main.go
6
5
this line will be printed

Great! Let’s add this pattern to our original code. Instead of a predetermined n, our code will spawn a random number of goroutines (rand.Intn(1000) yields a value from 0 to 999):

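n := rand.Intn(1000) // pick the count once; a call in the loop condition would re-roll on every check
for i := 0; i < n; i++ {
  go func() {
    c <- doSomething()
  }() // the trailing () actually calls the function literal
}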

Let’s write some WaitGroup management around these goroutines

var wg sync.WaitGroup
c := make(chan int)

n := rand.Intn(1000) // pick the count once, not on every loop check
for i := 0; i < n; i++ {
  wg.Add(1)
  go func() {
    defer wg.Done()
    c <- doSomething()
  }()
}

Let’s set up one extra goroutine that will wait for all others and then close the channel

go func() {
  wg.Wait()
  close(c)
}()

Now we can safely iterate through the range of this channel. Here’s the whole thing:

package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

func doSomething() int {
	rand.Seed(time.Now().UnixNano())
	r := rand.Intn(5)
	time.Sleep(time.Second * time.Duration(r))
	return r
}

func main() {
	rand.Seed(time.Now().UnixNano())
	var wg sync.WaitGroup
	c := make(chan int)

	n := rand.Intn(1000) // pick the count once, not on every loop check
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c <- doSomething()
		}()
	}

	go func() {
		wg.Wait()
		close(c)
	}()

	y := 0
	for x := range c {
		y += x
		fmt.Printf("current collective time: %d\n", y)
	}

	fmt.Printf("total collective time: %d\n", y)
}

I hope you enjoyed this post. If you have any questions or suggestions, or you’ve found a mistake, please contact me (email above). Thank you for reading.

If you would like to continue to a more advanced article, check out Synchronization Patterns in Go