Goroutine leak with default http server configuration

I have experienced a goroutine leak once because of a misconfiguration of http server.
In this article, I will explain how it occurred and how to prevent it with a proper configuration by using sample code.

Setup a server

This server serves as an example and performs the following functions:

Accepts a POST request and logs the request body at the /post endpoint.
Provides Prometheus metrics at the /metrics endpoint, enabling the observation of basic metrics.
Exposes runtime profiling data on port 6060, allowing for the profiling of runtime data.

// handler function takes the body and log it
func handler(w http.ResponseWriter, r *http.Request) {
	if r.Method != "POST" {
		http.Error(w, "Method is not supported.", http.StatusMethodNotAllowed)
		return
	}

	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "Error reading request body", http.StatusInternalServerError)
		return
	}

	log.Println(string(body))

	defer r.Body.Close()
	w.WriteHeader(http.StatusOK)
	w.Write([]byte("Post request processed."))
}

func main() {
	s := &http.Server{
		Addr:        ":8080"
	}

    // outputs prometheus metrics
	http.Handle("/metrics", promhttp.Handler())
	http.HandleFunc("/post", handler)

	go func() {
		// outputs runtime profilng data at this port
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()

	if err := s.ListenAndServe(); err != nil {
		log.Fatal("failed to start server", err)
	}
}

Goroutine observation with some scenarios

Let’s send some requests by various scenarios and see how goroutines behave.

Scenarios: 1 Send normal http requests

In the scenario 1, I send some http requests.

The client below sends 30 requests every second for 3 minutes.

func main() {
	url := "http://localhost:8080/post"

	numGoroutines := 30 // Number of parallel requests to send

	for i := 0; i < numGoroutines; i++ {
		go sendRequest(url) // Start a new goroutine for each request
	}

	time.Sleep(3 * time.Minute)
}

func sendRequest(url string) {
	for {
		res, err := http.Post(url, "plain/text", bytes.NewBuffer([]byte("test")))
		if err != nil {
			log.Printf("Error sending request: %v", err)
			continue
		}
		body, err := io.ReadAll(res.Body)
		if err != nil {
			log.Printf("Error reading response body: %v", err)
			res.Body.Close()
			continue
		}
		res.Body.Close() 

		fmt.Printf("Received response: %s\n", body)
		time.Sleep(1 * time.Second) // Throttle the requests (optional)
	}
}

After sending requests, I got this metrics below.

As you can see, the number of goroutines is stable. It’s always around 10.

Scenario 2: Open connections to the server then wait without sending requests

In the scenario 2, I just open connections to the server, but wait for 3 minutes without sending requests. This client below opens 10 connections every second for 3 minutes.

func main() {
	numGoroutines := 10 // Number of parallel requests to send

	for i := 0; i < 180; i++ {
		for j := 0; j < numGoroutines; j++ {
			go openConnection() // Start a new goroutine for each request
		}
		time.Sleep(1 * time.Second)
	}
}

// openConnection opens a connection to the server and wait for 3 minutes
func openConnection() {
	serverAddress := "127.0.0.1:8080"

	conn, err := net.Dial("tcp", serverAddress)
	if err != nil {
		fmt.Println("Error connecting to server:", err)
		return
	}
	defer conn.Close()

	time.Sleep(3 * time.Minute)
}

The below is the metrics after client finished the process.
As you can see, the number of goroutine were growing as long as the client keeps the connection.

Ok, somehow goroutine is leaked.
In order to see where the goroutine is leaked, I dumped the goroutine state by calling the profiling endpoint.

curl http://localhost:6060/debug/pprof/goroutine\?debug=1 > goroutine_dump.txt

At that point, there were 10005 goroutines and almost all goroutine are generated from the http server.

goroutine profile: total 10005
10000 @ 0x43d5ae 0x4363b7 0x46ed25 0x4c8e87 0x4c99da 0x4c99c8 0x577b45 0x581ba5 0x6429eb 0x5eec23 0x5ef709 0x5ef965 0x5f25a5 0x63c80e 0x63c829 0x643f88 0x648279 0x474141
#       0x46ed24        internal/poll.runtime_pollWait+0x84             /usr/local/go/src/runtime/netpoll.go:345
#       0x4c8e86        internal/poll.(*pollDesc).wait+0x26             /usr/local/go/src/internal/poll/fd_poll_runtime.go:84
#       0x4c99d9        internal/poll.(*pollDesc).waitRead+0x279        /usr/local/go/src/internal/poll/fd_poll_runtime.go:89
#       0x4c99c7        internal/poll.(*FD).Read+0x267                  /usr/local/go/src/internal/poll/fd_unix.go:164
#       0x577b44        net.(*netFD).Read+0x24                          /usr/local/go/src/net/fd_posix.go:55
#       0x581ba4        net.(*conn).Read+0x44                           /usr/local/go/src/net/net.go:179
#       0x6429ea        net/http.(*connReader).Read+0x14a               /usr/local/go/src/net/http/server.go:789
#       0x5eec22        bufio.(*Reader).fill+0x102                      /usr/local/go/src/bufio/bufio.go:110
#       0x5ef708        bufio.(*Reader).ReadSlice+0x28                  /usr/local/go/src/bufio/bufio.go:376
#       0x5ef964        bufio.(*Reader).ReadLine+0x24                   /usr/local/go/src/bufio/bufio.go:405
#       0x5f25a4        net/textproto.(*Reader).readLineSlice+0xa4      /usr/local/go/src/net/textproto/reader.go:56
#       0x63c80d        net/textproto.(*Reader).ReadLine+0xad           /usr/local/go/src/net/textproto/reader.go:39
#       0x63c828        net/http.readRequest+0xc8                       /usr/local/go/src/net/http/request.go:1059
#       0x643f87        net/http.(*conn).readRequest+0x247              /usr/local/go/src/net/http/server.go:1004
#       0x648278        net/http.(*conn).serve+0x338                    /usr/local/go/src/net/http/server.go:1964

...

What was actually happening here?

The below image helps finding the reason. So, when the client establishes the connection to the server, the server spawns a goroutine and wait for the request.

Then, what happens if the client doesn’t send a request after established the connection?

There are a couple of timeout configurations in the http server.

ReadTimeout: the maximum duration for reading the entire request, including the body
ReadHeaderTimeout: the maximum duration allowed to read request headers
WriteTimeout: the maximum duration before timing out writes of the response. It is reset whenever a new request’s header is read.

Basically, when the server reaches out the configured timeout, it closes the connection and deletes the goroutine waiting for the request. However, the default timeout is none, which means it waits forever. Therefore, the leaked goroutines stayed until the client stopped the connection.

Scenario 3: Prevent goroutine leaks

Ok, let’s setup a read timeout then.

I just setup 5 seconds read timeout to the http server as below.

func main() {
	s := &http.Server{
		Addr:        ":8080",
		ReadTimeout: 5 * time.Second,
	}

	...
}

As you can see, the number of goroutines was not growing.

Conclusion

As we saw how the default http server configuraion leads to a goroutine leak, which is pretty dangerous. We must set up timeouts for the http server.

In addition, it applies to the http client as well. This article explains about the client timeout.

References

Published Mar 3, 2024

Golang

Backend engineer, living in BerlinHiroki Fujino on Twitter