100 Go Mistakes and How to Avoid Them

关于本书

Spot errors in your Go code you didn’t even know you were making and boost your productivity by avoi...

https://book.douban.com/subject/36084407/

2. Code and Project organization

本节主要讲解代码和项目的结构组织，目的是教会大家：

以符合go语言习惯的组织代码

代码能高效的处理抽象：接口和泛型

提供构建项目的最佳实践

2.1 #1 Unintended variable shadowing（变量遮蔽带来的非意料行为）

Bad Code


// 这种场景在if语句中使用 := 声明了和最外层client同名的变量，
// 这样会出现内部变量覆盖外层变量的情况
var client *http.Client
if tracing {
	// 这里的client是新的临时变量，实际上没有对外层的client赋值
	client, err := createClientWithTracing()
	if err != nil {
		return err
	}
	log.Println(client)
} else {
	client, err := createDefaultClient()
	if err != nil {
		return err
	}
	log.Println(client)
}

Good Code


// 把判断语句中的声明语法换成赋值语法，同样在最外层声明err，可以做到减少代码量的效果
var client *http.Client
var err error
if tracing {
	client, err = createClientWithTracing()
} else {
	client, err = createDefaultClient()
}
if err != nil {
		return err
}
log.Println(client)

2.2 #2 Unnecessary nested code （没有必要的代码嵌套）

代码的嵌套层数可以用来评价代码的可读性

Bad Code


func join(s1, s2 string, max int) (string, error) {
	if s1 == "" {
		return "", errors.New("s1 is empty")
	} else {
		if s2 == "" {
			return "", errors.New("s2 is empty")
		} else {
			concat, err := concatenate(s1, s2)
			if err != nil {
				return "", errors
			} else {
				if len(concat) > max {
					return concat[:max], nil
				} else {
					return concat, nil
				}
			}
		}
	}
}

上面的代码因为夹杂了多种判断导致代码嵌套层数太多，代码可读性大大降低，并且维护性也变差。我们使用下面的代码代替上面的写法。

Good Code


func join(s1, s2 string, max int) (string, error) {
	if s1 == "" {
		return "", errors.New("s1 is empty")
	}
	if s2 == "" {
		return "", errors.New("s2 is empty")
	}
	concat, err := concatenate(s1, s2)
	if err != nil {
		return "", err
	}
	if len(concat) > max {
		return concat[:max], nil
	}
	return concat, nil
}

经过修改后的代码，我们只需要简单的心智负担就能阅读和理解代码。

💡

Align the happy path to the left; you should quickly be able to scan down one column to see the expected execution flow.

把代码逻辑靠最左边实现，能够让代码可读性增强，这里最左边的路径被Gopher成为 Happy Path,阅读代码时候，第一层级的逻辑是预期代码行为的执行流程，第二层级为Error Handler，具体处理边缘情况。

下面列出几个优化代码的规则

💡

1. 如果if块会直接执行return返回，请不要再写else块，这样else里的逻辑会被放在Happy Path，便于后面的阅读


if foo() {
	// ...
	return true
} else {
	// ...
}

Bad Code ⇒


if foo() {
	// ...
	return true
}
// ...

Good Code

💡

2. 遇到错误及时抛出，让主要逻辑放在 Happy Path 上


if s != "" {
	// ...
} else {
	return errors.New("empty string")
}

Bad Code ⇒


if s == "" {
		return errors.New("empty string")
}
// ...

Good Code

💡

个人建议：不要为了路径在第一层而写代码，主要还是要代码可读性，以及在for循环的场景下，要考虑到cpu分支预测的场景

2.3 #3 Misusing init functions（错误使用和理解 init 函数）

💡

一个包内有多个init函数，执行顺序按照包内文件名的字典序执行

init函数有3个缺点：

init函数无法抛出异常，错误处理有限制，极端情况只能panic

增加了测试的复杂性，因为在测试代码执行前，包内的init函数

如果init函数需要设置状态，往往需要一个全局变量存储。

init函数适用的场景：

定义静态配置, go的官方博客使用 init 函数定义静态配置


// Copyright 2013 The Go Authors.  All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Command blog is a web server for the Go blog that can run on App Engine or
// as a stand-alone HTTP server.
package main

import (
	"net/http"
	"strings"
	"time"

	"golang.org/x/tools/blog"
	"golang.org/x/website/content/static"

	_ "golang.org/x/tools/playground"
)

const hostname = "blog.golang.org" // default hostname for blog server

var config = blog.Config{
	Hostname:     hostname,
	BaseURL:      "https://" + hostname,
	GodocURL:     "https://golang.org",
	HomeArticles: 5,  // articles to display on the home page
	FeedArticles: 10, // articles to include in Atom and JSON feeds
	PlayEnabled:  true,
	FeedTitle:    "The Go Programming Language Blog",
}
// here
func init() {
	// Redirect "/blog/" to "/", because the menu bar link is to "/blog/"
	// but we're serving from the root.
	redirect := func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/", http.StatusFound)
	}
	http.HandleFunc("/blog", redirect)
	http.HandleFunc("/blog/", redirect)

	// Keep these static file handlers in sync with app.yaml.
	static := http.FileServer(http.Dir("static"))
	http.Handle("/favicon.ico", static)
	http.Handle("/fonts.css", static)
	http.Handle("/fonts/", static)

	http.Handle("/lib/godoc/", http.StripPrefix("/lib/godoc/", http.HandlerFunc(staticHandler)))
}

func staticHandler(w http.ResponseWriter, r *http.Request) {
	name := r.URL.Path
	b, ok := static.Files[name]
	if !ok {
		http.NotFound(w, r)
		return
	}
	http.ServeContent(w, r, name, time.Time{}, strings.NewReader(b))
}

2.4 #4 Overusing getters and setters（过度使用 getters 和 setters）

在Go中 setter 和 getter 并不是惯用的方式，标准库的代码也没有非得使用这种方式获取结构体字段。


timer := time.NewTimer(time.Second)
<-timer.C

使用 getters 和 setters的好处：

They encapsulate a behavior associated with getting or setting a field, allowing new functionality to be added later (for example, validating a field, returning a computed value, or wrapping the access to a field around a mutex).

They hide the internal representation, giving us more flexibility in what we expose.

They provide a debugging interception point for when the property changes at run time, making debugging easier.

2.5 #5 Interface pollution （接口污染）

接口污染的概念是代码里存在大量的抽象导致代码难以理解

💡

It’s a common mistake made by devel- opers coming from another language with different habits.

在公司经常看到这样的代码，我也是头皮发麻，有些人经常高举着语言不重要的大旗，依赖自己已有编程语言的领域知识，四处污染代码…，看到推上的一个吐槽，只能说我也头皮发麻。（不用杠，杠就是你对）。

接口的抽象力度

越少的接口抽象程度越高，而且接口是要被发现而不是自己创建出来的，没有必要一上来就定义一堆接口，然后再实现。

💡

The bigger the interface, the weaker the abstraction. —Rob Pike

💡

Don’t design with interfaces, discover them. —Rob Pike

此外，我们还可以组合细粒度接口来创建更高级别的抽象


type ReadWriter interface {
    Reader
    Writer 
}

何时使用接口

多种类型有相同的行为，像sort包提供的sort方法只需要类型实现下面的接口就可以被排序

代码解耦，结构体中包含接口成员而不是具体的结构体，能够帮助我们更好的解耦，不需要做大量的代码修改就能切换其他成员。

可以代码限制行为，在这个例子里，调用Foo的代码，只能获取到阈值，却不能修改阈值，这就达到了限制行为的作

2.6 #6 Interface on the producer side （错误的把接口定义在实现端）

默认情况下，我们会把接口定义在接口的实现侧(包)里，但是本文建议把接口定义在客户(调用)侧(包)，这样建议的原则还是：

💡

Don’t design with interfaces, discover them. —Rob Pike

因为接口是被发现，而不是创建出来的，只有在客户端的程序才知道自己需要什么样的接口，这样抽象程度更高，不会出现一个接口出现了很多冗余的方法。如果放在实现侧定义的话，受到开发者自己的主观影响，往往没法定义出一个由较好抽象的接口。PS: 这种规则在标准库里不适用,原因自行体会。

2.7 #7 Returning interfaces (返回接口)

这条规则要配合上一条一起食用，如果在实现端，创建一个返回接口的函数，会因为引用循环的原因导致无法实现。

本文在坚持2.6观点的前提下，提出了2点最佳实践：

返回结构体，而不是接口

函数或者结构体尽量接受接口

当然这2点提出也是引经据典引申出来的。

💡

Be conservative in what you do, be liberal in what you accept from others. - Transmission Control Protocol

当然，规则不是100%准确的，比如 Error 类型经常被函数返回，标准库 io 包里也经常有返回接口的函数。


func LimitReader(r Reader, n int64) Reader {
    return &LimitedReader{r, n}
}

如果我们知道(不是预见)一个抽象将对调用方有帮助，我们可以考虑返回一个接口

2.8 #8 any says nothing （any 类型表达性很弱）

从Go,1.18出现了any类型其实底层代码就是interface{}, 本节提出了不能随意使用any作为类型，不然会丢失重要的类型信息，代码可读性降低。

2.9 #9: Being confused about when to use generics （不知道何时使用泛型）

本节讲解了泛型的知识点，其中有一个例子显示，泛型可以限制类型范围，如下语法表示。


// customConstraint 接口要求实现类型的底层类型是int就可以了，并且可以实现String方法
// 如果 ~int 改为 int，编译就会报错，int类型没有实现String方法
type customConstraint interface {
    ~int
    String() string
}

// customInt 满足 customConstraint 的要求，并且实现了 String 方法
type customInt int
func (i customInt) String() string {
    return strconv.Itoa(int(i))
}

泛型可以用在函数参数中，也可以用在数据结构中。

何时使用泛型

构建数据结构，例如，如果我们实现二叉树、链表或堆，我们可以使用泛型来分解出元素类型。

处理任何类型的slice、map和channel的函数-例如，合并两个channel的函数适用于任何通道类型。


func merge[T any](ch1, ch2 <-chan T) <-chan T {
	// ...
}

众多类型在实现接口方法时候拥有相似的处理逻辑，例如调用sort包方法对数组排序，需要数组元素类型实现Interface接口，没有泛型的时候，int,int32,int64,float都要实现一遍接口，而这些类型的实际方法的逻辑都是一样的，用泛型能够减少大量冗余代码


// sort package
// This interface is used by different functions such as sort.Ints or 
// sort .Float64s. Using type parameters, we could factor out the sorting 
// behavior (for example, by defining a struct holding a slice and a comparison function):
type Interface interface {
     Len() int
     Less(i, j int) bool
     Swap(i, j int)
}


s := SliceFn[int]{
	S: []int{3, 2, 1},
	Compare: func(a, b int) bool {
		return a < b
	}
}
sort.Sort(s)
fmt.Println(s.S)

何时不使用泛型

下面的例子我们调用了接口的方法，使用泛型没有带来任何好处


func foo[T io.Writer](w T) {
	b := getBytes()
	_, _ = w.Write(b)
}

当使用泛型会带来更多复杂度

2.10 #10: Not being aware of the possible problems with type embedding （对类型嵌入的使用场景不熟悉）

简单来说，类型嵌入可以避免编写大量的模板代码，使用嵌入类型的原则是，如果不想暴露成员的访问权，就尽量不使用类型嵌入。

嵌入接口的话，嵌入类型还能够默认实现嵌入接口。

2.11 #11: Not using the functional options pattern（编写构造函数的时候，使用option模式）

在很多场景下下，我们需要对对象的成员进行配置化，往往我们会在NewXXX函数中的参数中罗列配置化的值，这样就出现很多问题。

Bad Code


func NewServer(addr string, port int) (*http.Server, error) { 
	// ...
}

如果不传递port的时候，我们希望使用默认值，但是这里port又必须要填写一个值

额外增加参数的时候，会对函数签名有不兼容的修改。

我们期望配置*http.Server的时候，可以做到下面几点：

如果port没有设置，我们使用默认值

如果port是负数，抛出错误，这很重要

如果port是0，那我们就随机生成端口

其他场景，我们使用默认值

解决方案一：Config struct

第一种解决方案是，使用一个Config对象存储配置。

💡

这里遇到第一个点，配置字段为指针类型，判断用户是否配置了该值


port := 0
// 但是用户使用的时候需要传递指针类型给Config结构体，会给人造成困惑，
// 配置一个端口需要传递一个int指针，所以这种设计并不API友好
config := httplib.Config{
	Port: &port,
}
// 当我们不想传递配置的时候，也不行，必要给一个空的结构提
httplib.NewServer("localhost", httplib.Config{})

解决方案二：Builder pattern

解决方案三：Functional options pattern


// 构造函数，可以传递多个配置项
server, err := httplib.NewServer("localhost",
        httplib.WithPort(8080),
        httplib.WithTimeout(time.Second))
// 因为是可变参数，之前必填的cfg对象，可以省略
server, err := httplib.NewServer("localhost")

2.12 #12: Project misorganization（推荐的项目组织结构）

project-layout/README_zh-CN.md at master · golang-standards/project-layout

Standard Go Project Layout. Contribute to golang-standards/project-layout development by creating an account on GitHub.

https://github.com/golang-standards/project-layout/blob/master/README_zh-CN.md

2.13 #13: Creating utility packages（不推荐使用utility包）

util、common、shared这类包名会抹去很多信息，不如直接使用有意义的包名来替代一些工具包


// bad
set := util.NewStringSet("c", "a", "b")
fmt.Println(util.SortStringSet(set))

// good
set := stringset.New("c", "a", "b")
fmt.Println(stringset.Sort(set))

2.14 #14: Ignoring package name collisions（忽略包命名冲突）

一般IDE都会有冲突提醒


import "xxx/redis"

// bad
redis := redis.NewClient()
v, err := redis.Get("foo")

方案一


redisClient := redis.NewClient()
v, err := redisClient.Get("foo")

方案二


import redisapi "mylib/redis"

redis := redisapi.NewClient()
v, err := redis.Get("foo")

2.15 #15: Missing code documentation（完善代码文档）

省略

2.16 #16: Not using linters（不使用linters）

golangci-lint

golangci • Updated Mar 25, 2023

3 Data types

3.1 #17: Creating confusion with octal literals（混淆八进制）

在GO中以0或者0o开头的数字被认为是8进制数


sum := 100 + 010
fmt.Println(sum)

// output
// 108

在一些特殊场景，比较读取文件的时候使用八进制描述可以方便的指定权限


file, err := os.OpenFile("foo", os.O_RDONLY, 0o644)

二进制-使用0b或者0B作为前缀（0b100代表4）

16进制-使用0x或者0X前缀（0xF代表15）

虚数-使用i作为后缀（3i）

除此之外我们可以在数字中穿插下划线提升可读性：1_000_000_000

3.2 #18 Neglecting integer overflows（忽略整数溢出）

GO有专门处理大数的包：math/big


// 当对整数进行+1操作时，对溢出判断
func Inc32(counter int32) int32 {
	if counter == math.MaxInt32 {
		panic("int32 overflow")
	}
	return counter + 1
}

func IncInt(counter int) int {
	if counter == math.MaxInt {
		panic("int overflow")
	}
	return counter + 1
}

func IncUint(counter uint) uint {
	if counter == math.MaxUint {
		panic("uint overflow")
	}
	return counter + 1
}

// 当两个整数相加时，提前做溢出判断
func AddInt(a, b int) int {
	if a > math.MaxInt-b {
		panic("int overflow")
	}

	return a + b
}

// 两数相乘时提前做溢出判断
func MultiplyInt(a, b int) int {
	if a == 0 || b == 0 {
		return 0
	}

	result := a * b
	if a == 1 || b == 1 {
		return result
	}
	if a == math.MinInt || b == math.MinInt {
		panic("integer overflow")
	}
	if result/b != a {
		panic("integer overflow")
	}
	return result
}

3.3 #19: Not understanding floating points（对浮点数理解欠缺）

浮点数的表示方式，可以阅读下面的参考文章：

浮点数的二进制表示(IEEE 754标准)

浮点数是我们在程序里常用的数据类型，它在内存中到底是怎么样的形式存在，是我了解之前是觉得好神奇，以此记录，作为学习笔记并分享。现代计算机中，一般都以IEEE 754标准存储浮点数，这个标准的在内存中存储的…

https://zhuanlan.zhihu.com/p/144697348

float32和float64存储精度有限，可能会导致存储出现偏差。因此也会对使用方式产生影响：首先就是对2个浮点数进行比较大小，因此对2个浮点数对比的时候，我们保证误差在可接受范围内即可：

testify

stretchr • Updated Mar 25, 2023

仓库 testify 就提供了 InDelta 函数来断言两个值的误差在可接受范围内，因为不同机器的FPU(浮点单元)不一致，导致浮点数计算的精度也不一样，使用 InDelta 来对比2个数值也适用于使用不同机器运行的场景。

浮点运算的误差会随着执行次数累积：f2和f1的执行顺序不一致，导致f2的误差更低


func f1(n int) float64 {
	result := 10_000.
	for i := 0; i < n; i++ {
		result += 1.0001
	}
	return result
}

func f2(n int) float64 {
	result := 0.
	for i := 0; i < n; i++ {
		result += 1.0001
	}
	return result + 10_000.
}

注意，浮点计算的顺序会影响结果的准确性。当执行加减法链时，我们应该将操作分组，以添加或减去具有相似数量级的值，然后再添加或减去那些大小不接近的值。因为f2在最后加了10000，最终它产生的结果比f1更准确


a := 100000.001
b := 1.0001
c := 1.0002

fmt.Println(a * (b + c))
// 方式2精度更高，优先计算乘除，但是执行效率低
fmt.Println(a*b + a*c)

// output
// 200030.00200030004
// 200030.0020003

在进行浮点数操作的时候，记住下面几点：

浮点数比较大小的时候，检查2数误差在可接受范围内即可

为了提升结果的精度，在进行加减操作时，把具有相似数量级的数进行分组操作

为了提高准确性，如果一系列操作需要加法、减法、乘法或除法，请先执行乘法和除法操作

3.4 #20: Not understanding slice length and capacity（不理解切片len和cap）

3.5 #21: Inefficient slice initialization（低效的切片初始化）


func convertEmptySlice(foos []Foo) []Bar {
	bars := make([]Bar, 0)
	// 如果foos过长，导致bars频繁申请内存，拷贝数组对执行效率和GC产生影响
	for _, foo := range foos {
		bars = append(bars, fooToBar(foo))
	}
	return bars
}

func convertGivenCapacity(foos []Foo) []Bar {
	n := len(foos)
  // 预先指定 cap
	bars := make([]Bar, 0, n)
	
	for _, foo := range foos {
		bars = append(bars, fooToBar(foo))
	}
	return bars
}

func convertGivenLength(foos []Foo) []Bar {
	n := len(foos)
	// 预先指定len，通过下标赋值，比使用append内置函数效率更高一些
	bars := make([]Bar, n)

	for i, foo := range foos {
		bars[i] = fooToBar(foo)
	}
	return bars
}

3.6 #22: Being confused about nil vs. empty slices（区分nil和empty切片）

nil 切片和 empty 切片的区别是 nil切片不占用空间，可以依次来判断是否使用nil 切片, 在json解析场景下，nil和empty的切片也不同


func main() {
	var s1 []float32
	customer1 := customer{
		ID:         "foo",
		Operations: s1,
	}
	b, _ := json.Marshal(customer1)
	// {"ID":"foo","Operations":null} nil 的切片被解析为 null
	fmt.Println(string(b))

	s2 := make([]float32, 0)
	customer2 := customer{
		ID:         "bar",
		Operations: s2,
	}
	b, _ = json.Marshal(customer2)
  // {"ID":"bar","Operations":[]} empty 的切片被解析为 []
	fmt.Println(string(b))
}

type customer struct {
	ID         string
	Operations []float32
}

3.7 #23: Not properly checking if a slice is empty（判断切片为空）


len(s) == 0

3.8 #24: Not making slice copies correctly（没有正确的复制切片）

copy函数会复制元素的个数为2个切片长度最小值


func bad() {
	src := []int{0, 1, 2}
	var dst []int
	copy(dst, src)
	fmt.Println(dst)

	_ = src
	_ = dst
}

func correct() {
	src := []int{0, 1, 2}
	dst := make([]int, len(src))
	copy(dst, src)
	fmt.Println(dst)

	_ = src
	_ = dst
}

3.9 #25: Unexpected side effects using slice append（使用切片append时产生的副作用-s[low:high:max]用法）

s2和s1共享底层数组，当s2的cap大于len，对s2 append会影响s1的结果


// 当调用f函数的时候，对s入参做的修改都会反应到上层函数中的s
func listing1() {
	s := []int{1, 2, 3}

	f(s[:2])
	fmt.Println(s)
}

// 解决方案一：对s进行深拷贝
func listing2() {
	s := []int{1, 2, 3}
	sCopy := make([]int, 2)
	copy(sCopy, s)

	f(sCopy)
	result := append(sCopy, s[2])
	fmt.Println(result)
}

// 解决方案二：使用 full slice expression: s[low:high:max]，这样就限制了切片的cap
// cap = max-low = 2-0 = 2
// 这样f函数对切片进行append操作就不用影响其他共享底层存储的切片了
func listing3() {
	s := []int{1, 2, 3}
	f(s[:2:2])
	fmt.Println(s)
}

func f(s []int) {
	_ = append(s, 10)
}

3.10 #26: Slices and memory leaks（切片的内存泄露）


func main() {
	foos := make([]Foo, 1_000)
	printAlloc()

	for i := 0; i < len(foos); i++ {
		foos[i] = Foo{
			v: make([]byte, 1024*1024),
		}
	}
	printAlloc()
	
  // 和上面类似不会回收剩下998个元素
	two := keepFirstTwoElementsOnly(foos)
	runtime.GC()
	printAlloc()
	runtime.KeepAlive(two)
}

func keepFirstTwoElementsOnly(foos []Foo) []Foo {
	return foos[:2]
}

// keepFirstTwoElementsOnlyCopy 和 keepFirstTwoElementsOnlyMarkNil
// 都能保证剩下的资源被回收，但是如果想留下的元素不是2，我们想保留500个元素，
// 使用 keepFirstTwoElementsOnlyMarkNil 会有更高的效率，不用额外拷贝资源
func keepFirstTwoElementsOnlyCopy(foos []Foo) []Foo {
	res := make([]Foo, 2)
	copy(res, foos)
	return res
}

// keepFirstTwoElementsOnlyMarkNil 给元素中的指针成员置为nil，能够让GC自动回收
func keepFirstTwoElementsOnlyMarkNil(foos []Foo) []Foo {
	for i := 2; i < len(foos); i++ {
		foos[i].v = nil
	}
	return foos[:2]
}

func printAlloc() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("%d KB\n", m.Alloc/1024)
}

3.11 #27: Inefficient map initialization（低效的map初始化）

map 的实现原理

什么是 map # 维基百科里这样定义 map： In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears at most once in the collection. 简单说明一下：在计算机科学里，被称为相关数组、map、符号表或者字典，是由一组 <key, value> 对组成的抽象数据结构，，并且同一个 key 只会出现一次。有两个关键点：map 是由 key-value 对组成的；key 只会出现一次。和 map 相关的操作主要是：增加一个 k-v 对 —— Add or insert；删除一个 k-v 对 —— Remove or delete；修改某个 k 对应的 v —— Reassign；查询某个 k 对应的 v —— Lookup；简单说就是最基本的增删查改。

https://golang.design/go-questions/map/principal/

💡

bmap就是我们常说的“桶”，桶里面会最多装 8 个 key，这些 key 之所以会落入同一个桶，是因为它们经过哈希计算后，哈希结果是“一类”的。在桶内，又会根据 key 计算出来的 hash 值的高 8 位来决定 key 到底落入桶内的哪个位置（一个桶内最多有8个位置）

触发 map 扩容的时机：在向 map 插入新 key 的时候，会进行条件检测，符合下面这 2 个条件，就会触发扩容：

装载因子超过阈值，源码里定义的阈值是 6.5。(loadFactor := count / (2^B)) count 就是 map 的元素个数，2^B 表示 bucket 数量。

overflow 的 bucket 数量过多：当 B 小于 15，也就是 bucket 总数 2^B 小于 2^15 时，如果 overflow 的 bucket 数量超过 2^B；当 B >= 15，也就是 bucket 总数 2^B 大于等于 2^15，如果 overflow 的 bucket 数量超过 2^15。

因此，就像切片一样，如果我们预先知道map将包含的元素数量，我们应该通过提供初始大小来创建它。这样做可以避免潜在的map增长，这是相当繁重的计算，因为它需要重新分配足够的空间并重新平衡所有元素。

3.12 #28: Maps and memory leaks（map类型的内存泄露，map中的值建议使用指针类型）

Go map 如何缩容？

快速了解 map 缩容机制

https://mp.weixin.qq.com/s/Slvgl3KZax2jsy2xGDdFKw

💡

GO的缩容操作就是不增长内存，并没有实际上的缩容，所以创建一个较大的map之后，需要自己手动处理缩容的问题


func main() {
	// Init
	n := 1_000_000
	m := make(map[int][128]byte)
	printAlloc()

	// Add elements
	for i := 0; i < n; i++ {
		m[i] = randBytes()
	}
	printAlloc()

	// Remove elements
	for i := 0; i < n; i++ {
		delete(m, i)
	}

	// End
	runtime.GC()
	printAlloc()
	runtime.KeepAlive(m)
}
// output
// 0 MB
// 461 MB
// 293 MB

func randBytes() [128]byte {
	return [128]byte{}
}

func printAlloc() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("%d MB\n", m.Alloc/1024/1024)
}

事例代码创建1000,000个元素的map后调用delete方法删除了每个元素，但是内存占用并没有明显的降低，因为GO的map没有真正意义上的缩容，删除了所有元素之后，降低的内存应该是bmap中overflow指向的bucket，map[int][128]byte中的数组[128]byte空值就占用128byte。

解决方案一是不断的拷贝新的map，但是在下个gc周期前，内存会翻倍；

解决方案二修改map中value的类型为指针，减少value空值占用的空间；

对于方案二,可以看到扩缩容场景的内存占用更小：

💡

如果键或值超过128字节，Go不会将其直接存储在映射存储桶中。相反，Go存储一个指针来引用键或值。

所以当你可以估算value或者key大于128的时候可以不用关心value是采用指针或者值。

3.13 #29: Comparing values incorrectly（错误的对比值）

对与一些不可比较的类型：slice不能直接使用 == 来比较


var cust1 any = customer{id: "x", operations: []float64{1.}}
var cust2 any = customer{id: "x", operations: []float64{1.}}
fmt.Println(cust1 == cust2)

// panic: runtime error: comparing uncomparable type main.customer

使用 reflect.DeepEqual 方法可以根据类型不同进行不同的比较，但是运行时间是 == 的100倍，因此在对性能有需要的场景，建议自行编写对比函数。


cust1 := customer{id: "x", operations: []float64{1.}} 
cust2 := customer{id: "x", operations: []float64{1.}} 
// 1. reflect.DeepEqual 执行效率低
fmt.Println(reflect.DeepEqual(cust1, cust2))
// 2. equal 执行效率是 reflect.DeepEqual 的 96 倍
fmt.Println(cust1.equal(cust2))

func (a customer) equal(b customer) bool {
	if a.id != b.id {
		return false
	}
	if len(a.operations) != len(b.operations) {
		return false
	}
	for i := 0; i < len(a.operations); i++ {
		if a.operations[i] != b.operations[i] {
			return false
		}
	}
	return true
}

4 Control structures

4.1 #30: Ignoring the fact that elements are copied in range loops（range loop中的值为副本）

💡

我们应该记住，range loops 中的 value 元素是一个副本。


func main() {
	accounts := createAccounts()
	for _, a := range accounts {
		a.balance += 1000
	}
  // [{100} {200} {300}]
	fmt.Println(accounts)
	
	// 方案一：直接使用下标访问到特定的值，仍然使用 range loop
	accounts = createAccounts()
	for i := range accounts {
		accounts[i].balance += 1000
	}
  // [{1100} {1200} {1300}]
	fmt.Println(accounts)

	// 方案二：直接使用下标访问到特定的值，自己计算下标值
	accounts = createAccounts()
	for i := 0; i < len(accounts); i++ {
		accounts[i].balance += 1000
	}
  // [{1100} {1200} {1300}]
	fmt.Println(accounts)
	
	// 方案三：迭代指针类型，及时a为副本但是仍然指向同一个内存地址
	accountsPtr := createAccountsPtr()
	for _, a := range accountsPtr {
		a.balance += 1000
	}
}

func createAccounts() []account {
	return []account{
		{balance: 100.},
		{balance: 200.},
		{balance: 300.},
	}
}

func createAccountsPtr() []*account {
	return []*account{
		{balance: 100.},
		{balance: 200.},
		{balance: 300.},
	}
}

4.2 #31: Ignoring how arguments are evaluated in range loops（忽略rang的对象是何时被计算）

slice


// range loop 的对象是一个表达式，这个表达式可以是 string, slice等，当执行
// 循环的时候，exp只会被计算一次，对原始迭代值进行拷贝生成一个副本
for i, v := range exp

// 迭代对象只计算一次，迭代3次就结束，不会无限循环
s := []int{0, 1, 2}
for range s {
    s = append(s, 10)
}

// 但是下面这种写法就会无限循环
s := []int{0, 1, 2}
for i := 0; i < len(s); i++ {
    s = append(s, 10)
}

channel


func main() {
	ch1 := make(chan int, 3)
	go func() {
		ch1 <- 0
		ch1 <- 1
		ch1 <- 2
		close(ch1)
	}()

	ch2 := make(chan int, 3)
	go func() {
		ch2 <- 10
		ch2 <- 11
		ch2 <- 12
		close(ch2)
	}()

	ch := ch1
	// rang loop里的ch一直指向ch1，只会迭代3次
	for v := range ch {
		fmt.Println(v)
		ch = ch2
	}
}
// output
// 0 1 2

4.3 #32: Ignoring the impact of using pointer elements in range loops （错误的引用range loop的生成值）

当在迭代对象的时候，把range loop生成元素的指针赋值给其他对象，会产生问题！


type Customer struct {
	ID      string
	Balance float64
}

type Store struct {
	m map[string]*Customer
}

func main() {
	s := Store{
		m: make(map[string]*Customer),
	}
	s.storeCustomers([]Customer{
		{ID: "1", Balance: 10},
		{ID: "2", Balance: -10},
		{ID: "3", Balance: 0},
	})
	print(s.m)
}

// 问题来源：
func (s *Store) storeCustomers(customers []Customer) {
	for _, customer := range customers {
		// 每次迭代的时候 customer 都是同一个对象，只是 customer 的值会不断被新的值覆盖
		fmt.Printf("%p\n", &customer)
		// 最终map中的每个key都指向了同一个内存地址
		s.m[customer.ID] = &customer
	}
}

// 解决方案一：把每次迭代的值copy到一个新的临时变量
func (s *Store) storeCustomers2(customers []Customer) {
	for _, customer := range customers {
		current := customer
		s.m[current.ID] = &current
	}
}

// 解决方案二：直接引用切片的元素
// 问题：如果slice出现扩容的时候，那原本地址还是被引用，可能出现内存泄露
func (s *Store) storeCustomers3(customers []Customer) {
	for i := range customers {
		customer := &customers[i]
		s.m[customer.ID] = customer
	}
}

func print(m map[string]*Customer) {
	for k, v := range m {
		fmt.Printf("key=%s, value=%#v\n", k, v)
	}
}

4.4 #33: Making wrong assumptions during map iterations（在迭代map的时候做出了错误的假设）

迭代map的时候，GO不保证有序，甚至还估计打乱迭代顺序。如果想要按顺序迭代map，可以使用第三方库

gods

emirpasic • Updated Mar 25, 2023

边迭代map边增加map的值，GO的行为不可预测，有可能迭代到新加入的值有可能迭代不到

4.5 #34: Ignoring how the break statement works（不明确break语句是如何工作-break label的使用）

在switch和select语句中使用break只会跳出switch和select的作用域，如果这2个语句被for循环包裹是不会跳出for循环的,可以在break的时候指定label标签来跳出循环


func listing() {
loop:
	for i := 0; i < 5; i++ {
		fmt.Printf("%d ", i)
		switch i {
		default:
		case 2:
			break loop
		}
	}
}

4.6 #35: Using defer inside a loop（在loop循环中使用defer语句）

在每个for循环的迭代作用域中调用defer是不会在迭代结束的时候执行，只有最内层函数调用返回的时候才会执行defer，如果每次迭代都打开了一个文件描述符，想要及时关闭：

每次迭代之后就主动关闭，不借助于defer的能力

放在一个函数闭包内，当闭包执行完之后执行defer


func readFiles(ch <-chan string) error {
	for path := range ch {
		
		err := func() error {
			file, err := os.Open(path)
			if err != nil {
				return err
			}

			defer file.Close()

			// Do something with file
			return nil
		}() // 放在闭包内并及时调用
		if err != nil {
			return err
		}
	}
	return nil
}

5 Strings

5.1 #36: Not understanding the concept of a rune（不理解rune的概念）

Go源代码使用UTF-8进行编码。因此，所有字符串都是UTF-8字符串。但是，由于字符串可以包含任意字节，如果它来自其他地方（而不是源代码），则不能保证基于UTF-8编码

rune对应着Unicode中的码点的概念，一个rune代表Unicode中的一个字符

使用UTF-8编码，Unicode码点可以编码为1到4个字节

len函数返回的是字节长度，而不是rune的长度


func main() {
	s := "hello"
	// output: 5
	fmt.Println(len(s))

	s = "汉"
  // output: 3
	fmt.Println(len(s))
	
  // 汉 被编码为3个字节
	s = string([]byte{0xE6, 0xB1, 0x89})
	fmt.Printf("%s\n", s)
}

5.2 #37: Inaccurate string iteration（错误的迭代string-当string中的字符并非单字节Utf-8编码时）

如果想要按照rune为单位返回下标和长度,把string转为[]rune，但是这种作为会有O(n)的时间复杂度和内存拷贝开销，一般情况迭代字符串使用 range loop的方式迭代


runes := []rune(s)
for i, r := range runes {
	fmt.Printf("position %d: %c\n", i, r)
}
// output
// position 0: h
// position 1: ê
// position 2: l
// position 3: l
// position 4: o

fmt.Println(utf8.RuneCountInString(s)) // 5 方法可以计算string中rune的个数

5.3 #38: Misusing trim functions（错误使用 trim 函数）


func main() {
	// output: 123
	// TrimRight removes all the trailing runes contained in a given set
	fmt.Println(strings.TrimRight("123oxo", "xo"))
	
	// output: 123o
	// TrimSuffix returns a string without a provided trailing suffix
	fmt.Println(strings.TrimSuffix("123oxo", "xo"))
	
	// output: 123
	// TrimLeft like TrimRight
	fmt.Println(strings.TrimLeft("oxo123", "ox"))

	// output: o123
	// TrimPrefix like TrimSuffix
	fmt.Println(strings.TrimPrefix("oxo123", "ox"))
	
	// output: 123
	fmt.Println(strings.Trim("oxo123oxo", "ox"))
}

5.4 #39: Under-optimized string concatenation（如何高效的字符串拼接-strings.Builder）


func concatV1(values []string) string {
	s := ""
	for _, value := range values {
		s += value
	}
	return s
}

func concatV2(values []string) string {
	sb := strings.Builder{}
	for _, value := range values {
		_, _ = sb.WriteString(value)
	}
	return sb.String()
}

func concatV3(values []string) string {
	total := 0
	for i := 0; i < len(values); i++ {
		total += len(values[i])
	}

	sb := strings.Builder{}
	sb.Grow(total)
	for _, value := range values {
		_, _ = sb.WriteString(value)
	}
	return sb.String()
}

5.5 #40: Useless string conversions（无用的字符串转换）

bytes 包支持很多和strings包类似的方法，很多时候没有必要一定要把[]byte转化为string：比如bytes.TrimSpace 方法

5.6 #41: Substrings and memory leaks（子字符串和内存泄露-Consistent with #26）

和 #26 条类似，子字符串引用了一个较长父字符串的部分内容，导致父字符串不能被GC，导致内存暴增


func main() {
	s1 := "Hello, World!"
	s2 := s1[:5]
	fmt.Println(s2)

	s1 = "Hêllo, World!"
	s2 = string([]rune(s1)[:5])
	fmt.Println(s2)
}

type store struct{}

func (s store) handleLog1(log string) error {
	if len(log) < 36 {
		return errors.New("log is not correctly formatted")
	}
	uuid := log[:36]
	s.store(uuid)
	// Do something
	return nil
}

func (s store) handleLog2(log string) error {
	if len(log) < 36 {
		return errors.New("log is not correctly formatted")
	}
	uuid := string([]byte(log[:36]))
	s.store(uuid)
	// Do something
	return nil
}

func (s store) handleLog3(log string) error {
	if len(log) < 36 {
		return errors.New("log is not correctly formatted")
	}
	uuid := string(strings.Clone(log[:36]))
	s.store(uuid)
	// Do something
	return nil
}

6 Functions and methods

6.1 #42: Not knowing which type of receiver to use（不知道receiver应该使用什么样的类型）

提供几个场景供我们决定选择receiver的类型：

一定要用指针类型的场景

方法需要修改receiver的值

如果receiver的成员不能被拷贝

应该使用指针类型的场景

如果receiver是一个大的对象，使用指针类型可以更高效的处理程序，这里大的具体数值不好界定，需要看实际的场景，这里可以使用benchmark来估计。

一定要使用值类型的场景

如果强制要求receiver是不可变的

如果receiver是一个map、function、channel类型，否则会有编译错误

应该使用值类型的场景

receiver是一个slice，并且一定需要修改

receiver是一个array或者是一个没有可变字段的struct，如time.Time

receiver是基本类型，例如int、float64或string。

结构体组成复杂的场景

这种场景建议receiver的类型设为指针类型，能明确的告诉使用者这个方法会修改成员的值

6.2 #43: Never using named result parameters （“永远”不要使用命名结果参数）

这里永远要加引号，因为这一节都大部分内容都在说使用命名结果参数的好处。

接口定义的函数使用命名结果参数，可以提升可读性，但是实现接口的方法没有强制要求返回值有命名

以标准库的函数为例，返回值使用命名参数，可以减少代码行数 ，但是这种裸返回（不指定返回值）比较适合实现行数比较短的函数，因为代码行数太长会导致可读性大大降低。

6.3 #44: Unintended side effects with named result parameters （使用命名参数可能会出现意向不到的副作用）

按照编程惯性，遇到错误就直接返回err，但是这里err因为初始化为空值，直接返回就会有问题。

6.4 ⚠️ #45: Returning a nil receiver（返回了一个nil的值，该值的指针类型实现了某一接口）


// MultiError 实现了 Error 接口
type MultiError struct {
    errs []string
}
func (m *MultiError) Add(err error) {
    m.errs = append(m.errs, err.Error())
}
func (m *MultiError) Error() string {
    return strings.Join(m.errs, ";")
}

func (c Customer) Validate() error {
    var m *MultiError
    if c.Age < 0 {
        m = &MultiError{}
        m.Add(errors.New("age is negative"))
    }
    if c.Name == "" {
        if m == nil {
            m = &MultiError{}
				}
        m.Add(errors.New("name is nil"))
    }
    // 按照惯性思维，如果校验通过m也就是个nil，直接返回nil表示没有错误
		return m 
}

执行下面的测试代码会发现，及时满足校验条件返回的error也还是非空的


customer := Customer{Age: 33, Name: "John"}
if err := customer.Validate(); err != nil {
    log.Fatalf("customer is invalid: %v", err)
}
// output
// 2021/05/08 13:47:28 customer is invalid: <nil>

这类属于常见的错误，和for循环中的临时变量的错误差不多，当返回值指针实现了一个接口，即使这个指针为nil，函数返回的时候转化为接口也会被当成非nil的值。

上面的例子可以这样改为正确的：


if c.Age < 0 {
    // ...
}
if c.Name == "" {
	// ... 
}
if m != nil {
    return m
}
return nil

6.5 #46: Using a filename as a function input（没有考虑好函数参数抽象）

这一节主要是提醒我们定义函数的时候，要考虑可扩展性，使用一个统一的接口类型作为参数不仅能方便代码测试，也可以提升代码的抽象能力，把所有的数据源(file, http, string) 使用 io.Reader 类型代替，会是一个更好的解决方案。


func countEmptyLines(reader io.Reader) (int, error) {
    scanner := bufio.NewScanner(reader)
    for scanner.Scan() {
			// ... 
		}
}

6.6 ⚠️ #47: Ignoring how defer arguments and receivers are evaluated（认清defer是如何以及何时计算函数参数）

defer调用函数时候常犯的错误

defer调用函数的时候函数参数使用指针类型

defer调用函数的时候，把函数放在闭包内，利用闭包引用环境变量的能力。

defer执行的是一个方法时，执行结果和receiver的类型有关


// 可以把方法转化为下面的形式，会帮助你理解
print(s Struct) vs print(s &Struct)

100 Go Mistakes and How to Avoid Them - 1

关于本书

2. Code and Project organization

2.1 #1 Unintended variable shadowing（变量遮蔽带来的非意料行为 ）

2.2 #2 Unnecessary nested code （没有必要的代码嵌套 ）

2.3 #3 Misusing init functions（错误使用和理解 init 函数）

2.4 #4 Overusing getters and setters（过度使用 getters 和 setters）

2.5 #5 Interface pollution （接口污染）

2.6 #6 Interface on the producer side （错误的把接口定义在实现端）

2.7 #7 Returning interfaces (返回接口)

2.8 #8 any says nothing （any 类型表达性很弱）

2.9 #9: Being confused about when to use generics （不知道何时使用泛型）

2.10 #10: Not being aware of the possible problems with type embedding （对类型嵌入的使用场景不熟悉）

2.11 #11: Not using the functional options pattern（编写构造函数的时候，使用option模式）

2.12 #12: Project misorganization（推荐的项目组织结构）

2.13 #13: Creating utility packages（不推荐使用utility包）

2.14 #14: Ignoring package name collisions（忽略包命名冲突）

2.15 #15: Missing code documentation（完善代码文档）

2.16 #16: Not using linters（不使用linters）

3 Data types

3.1 #17: Creating confusion with octal literals（混淆八进制）

3.2 #18 Neglecting integer overflows（忽略整数溢出）

3.3 #19: Not understanding floating points（对浮点数理解欠缺）

3.4 #20: Not understanding slice length and capacity（不理解切片len和cap）

3.5 #21: Inefficient slice initialization（低效的切片初始化）

3.6 #22: Being confused about nil vs. empty slices（区分nil和empty切片）

3.7 #23: Not properly checking if a slice is empty（判断切片为空）

3.8 #24: Not making slice copies correctly（没有正确的复制切片）

3.9 #25: Unexpected side effects using slice append（使用切片append时产生的副作用-s[low:high:max]用法）

3.10 #26: Slices and memory leaks（切片的内存泄露）

3.11 #27: Inefficient map initialization（低效的map初始化）

3.12 #28: Maps and memory leaks（map类型的内存泄露，map中的值建议使用指针类型）

3.13 #29: Comparing values incorrectly（错误的对比值）

4 Control structures

4.1 #30: Ignoring the fact that elements are copied in range loops（range loop中的值为副本）

4.2 #31: Ignoring how arguments are evaluated in range loops（忽略rang的对象是何时被计算）

4.3 #32: Ignoring the impact of using pointer elements in range loops （错误的引用range loop的生成值）

4.4 #33: Making wrong assumptions during map iterations（在迭代map的时候做出了错误的假设）

4.5 #34: Ignoring how the break statement works（不明确break语句是如何工作-break label的使用）

4.6 #35: Using defer inside a loop（在loop循环中使用defer语句）

5 Strings

5.1 #36: Not understanding the concept of a rune（不理解rune的概念）

5.2 #37: Inaccurate string iteration（错误的迭代string-当string中的字符并非单字节Utf-8编码时）

5.3 #38: Misusing trim functions（错误使用 trim 函数）

5.4 #39: Under-optimized string concatenation（如何高效的字符串拼接-strings.Builder）

5.5 #40: Useless string conversions（无用的字符串转换）

5.6 #41: Substrings and memory leaks（子字符串和内存泄露-Consistent with #26）

6 Functions and methods

6.1 #42: Not knowing which type of receiver to use（不知道receiver应该使用什么样的类型）

6.2 #43: Never using named result parameters （“永远”不要使用命名结果参数）

6.3 #44: Unintended side effects with named result parameters （使用命名参数可能会出现意向不到的副作用）

6.4 ⚠️ #45: Returning a nil receiver（返回了一个nil的值，该值的指针类型实现了某一接口）

6.5 #46: Using a filename as a function input（没有考虑好函数参数抽象）

6.6 ⚠️ #47: Ignoring how defer arguments and receivers are evaluated（认清defer是如何以及何时计算函数参数）

2.1 #1 Unintended variable shadowing（变量遮蔽带来的非意料行为）

2.2 #2 Unnecessary nested code （没有必要的代码嵌套）