Zero-copy techniques are also used extensively in the Go standard library to improve performance. Since many zero-copy techniques are exposed through system calls, the Go standard library wraps these system calls, and the wrappers can be found in internal/poll.
Let's take Linux as an example, since most of our applications run on Linux.
sendfile
The sendfile system call is wrapped in the internal/poll/sendfile_linux.go file. Part of the code has been removed from the listing.
```go
// SendFile wraps the sendfile system call.
func SendFile(dstFD *FD, src int, remain int64) (int64, error) {
	// ...... lock (elided)
	dst := dstFD.Sysfd
	var written int64
	var err error
	for remain > 0 {
		n := maxSendfileSize
		if int64(n) > remain {
			n = int(remain)
		}
		n, err1 := syscall.Sendfile(dst, src, nil, n)
		if n > 0 {
			written += int64(n)
			remain -= int64(n)
		} else if n == 0 && err1 == nil {
			break
		}
		// ...... error handling (elided)
	}
	return written, err
}
```
You can see that SendFile calls sendfile in a loop to write the data in chunks. A single sendfile system call transfers at most 0x7ffff000 (2,147,479,552) bytes; Go caps each call further by setting maxSendfileSize to 4194304 bytes (4 MiB).
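The chunking done by that loop can be looked at in isolation. The following is a minimal sketch, not the standard library's code: chunkSizes and its maxChunk parameter are hypothetical names introduced here, no system call is made, and it assumes every call transfers the full amount requested.

```go
package main

import "fmt"

// chunkSizes is a hypothetical helper (not part of the standard library).
// It returns the request sizes that a loop like the one in SendFile would
// issue for remain bytes, assuming each call transfers everything it was
// asked to and each request is capped at maxChunk.
func chunkSizes(remain int64, maxChunk int) []int {
	if maxChunk <= 0 {
		return nil // a non-positive cap would never make progress
	}
	var sizes []int
	for remain > 0 {
		n := maxChunk
		if int64(n) > remain {
			n = int(remain)
		}
		sizes = append(sizes, n)
		remain -= int64(n)
	}
	return sizes
}

func main() {
	const maxSendfileSize = 4 << 20 // 4194304 bytes, the value quoted above
	// A 10 MiB transfer is split into two full chunks and one 2 MiB remainder.
	fmt.Println(chunkSizes(10<<20, maxSendfileSize))
}
```

In the real function a call may transfer fewer bytes than requested, which is why the loop subtracts the returned count rather than the requested one.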
SendFile is used in the net/sendfile_linux.go file.
```go
func sendFile(c *netFD, r io.Reader) (written int64, err error, handled bool) {
	var remain int64 = 1 << 62 // by default, copy until EOF
	lr, ok := r.(*io.LimitedReader)
	// ......
	f, ok := r.(*os.File)
	if !ok {
		return 0, nil, false
	}
	sc, err := f.SyscallConn()
	if err != nil {
		return 0, nil, false
	}
	var werr error
	err = sc.Read(func(fd uintptr) bool {
		written, werr = poll.SendFile(&c.pfd, int(fd), remain)
		return true
	})
	if err == nil {
		err = werr
	}
	if lr != nil {
		lr.N = remain - written
	}
	return written, wrapSyscallError("sendfile", err), written > 0
}
```
And who calls this function? TCPConn does.
```go
func (c *TCPConn) readFrom(r io.Reader) (int64, error) {
	if n, err, handled := splice(c.fd, r); handled {
		return n, err
	}
	if n, err, handled := sendFile(c.fd, r); handled {
		return n, err
	}
	return genericReadFrom(c, r)
}
```
This method is in turn wrapped by the ReadFrom method. Remember ReadFrom; we'll come back to it later.
```go
func (c *TCPConn) ReadFrom(r io.Reader) (int64, error) {
	if !c.ok() {
		return 0, syscall.EINVAL
	}
	n, err := c.readFrom(r)
	if err != nil && err != io.EOF {
		err = &OpError{Op: "readfrom", Net: c.fd.net, Source: c.fd.laddr, Addr: c.fd.raddr, Err: err}
	}
	return n, err
}
```
The implementation of the TCPConn.readFrom method is interesting. It first checks whether the zero-copy optimization based on the splice system call applies: splice can be used only if the destination is a TCP connection and the source is a TCP or Unix connection.
Otherwise it tries sendfile, which has its own restriction: the source must be an *os.File.
If neither applies, it falls back to a generic copy.
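When is ReadFrom called? More often than you might think: io.Copy calls the destination's ReadFrom when the destination implements io.ReaderFrom and the source does not take over the copy through io.WriterTo. So you may be using zero-copy without realizing it when you write a file to a socket. Of course, this is not the only way it gets called.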
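The dispatch that io.Copy performs can be observed with a small experiment. This is a sketch under stated assumptions: recordingWriter and copyAndReport are hypothetical names introduced here, and the source is wrapped in an anonymous struct so that it exposes only Read, because strings.Reader implements io.WriterTo and io.Copy would otherwise prefer that path.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// recordingWriter is a hypothetical destination type. It implements
// io.ReaderFrom and records whether ReadFrom was invoked.
type recordingWriter struct {
	buf          bytes.Buffer
	usedReadFrom bool
}

func (w *recordingWriter) Write(p []byte) (int, error) {
	return w.buf.Write(p)
}

func (w *recordingWriter) ReadFrom(r io.Reader) (int64, error) {
	w.usedReadFrom = true
	return w.buf.ReadFrom(r)
}

// copyAndReport copies payload into a recordingWriter with io.Copy and
// reports what arrived and whether ReadFrom was used.
func copyAndReport(payload string) (string, bool, error) {
	// The anonymous struct promotes only Read, hiding strings.Reader's WriteTo.
	src := struct{ io.Reader }{strings.NewReader(payload)}
	dst := &recordingWriter{}
	_, err := io.Copy(dst, src)
	return dst.buf.String(), dst.usedReadFrom, err
}

func main() {
	got, used, err := copyAndReport("hello")
	fmt.Println(got, used, err)
}
```

*TCPConn plays the role of recordingWriter in the standard library: io.Copy hands it the source, and its ReadFrom decides whether a zero-copy path applies.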
The call chain makes this clear: io.Copy -> *TCPConn.ReadFrom -> *TCPConn.readFrom -> net.sendFile -> poll.SendFile.
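From the caller's side, that chain starts with nothing more than an io.Copy from a file to a TCP connection. The sketch below is illustrative only: sendFileOverTCP is a hypothetical helper, it uses a loopback listener and a temporary file, and whether sendfile is actually used depends on the platform and Go version. On other systems the same code falls back to an ordinary copy.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
	"os"
)

// sendFileOverTCP is a hypothetical helper. It writes data to a temporary
// file, copies that file to a loopback TCP connection with io.Copy, and
// returns the bytes the peer received.
func sendFileOverTCP(data []byte) ([]byte, error) {
	f, err := os.CreateTemp("", "zerocopy-*.bin")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if _, err := f.Write(data); err != nil {
		return nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return nil, err
	}

	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer ln.Close()

	type result struct {
		b   []byte
		err error
	}
	done := make(chan result, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			done <- result{nil, err}
			return
		}
		defer conn.Close()
		b, err := io.ReadAll(conn)
		done <- result{b, err}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, err
	}
	// The destination is a *net.TCPConn and the source is an *os.File,
	// which is the combination the sendfile path described above requires.
	_, err = io.Copy(conn, f)
	conn.Close() // closing signals EOF to the receiver
	if err != nil {
		return nil, err
	}
	res := <-done
	return res.b, res.err
}

func main() {
	data := bytes.Repeat([]byte("zero-copy "), 1000)
	got, err := sendFileOverTCP(data)
	fmt.Println(len(got), bytes.Equal(got, data), err)
}
```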
splice
As you can see above, *TCPConn.readFrom tries splice first; the scenarios in which it applies and its limitations were mentioned above as well.
The net.splice function is in fact a call to poll.Splice:
```go
func Splice(dst, src *FD, remain int64) (written int64, handled bool, sc string, err error) {
	p, sc, err := getPipe()
	if err != nil {
		return 0, false, sc, err
	}
	defer putPipe(p)
	var inPipe, n int
	for err == nil && remain > 0 {
		max := maxSpliceSize
		if int64(max) > remain {
			max = int(remain)
		}
		inPipe, err = spliceDrain(p.wfd, src, max)
		handled = handled || (err != syscall.EINVAL)
		if err != nil || inPipe == 0 {
			break
		}
		p.data += inPipe
		n, err = splicePump(dst, p.rfd, inPipe)
		if n > 0 {
			written += int64(n)
			remain -= int64(n)
			p.data -= n
		}
	}
	if err != nil {
		return written, handled, "splice", err
	}
	return written, true, "", nil
}
```
So you may well end up using splice or sendfile without noticing.
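A typical place where this happens is a TCP relay, where both ends of io.Copy are TCP connections. The following is a minimal sketch, assuming loopback networking is available; serveOnce and relayOnce are hypothetical helpers, and whether splice is actually used depends on the platform and Go version.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// serveOnce is a hypothetical helper. It starts a loopback listener, runs
// handle on the first accepted connection, and returns the listener's
// address together with a function that closes the listener.
func serveOnce(handle func(net.Conn)) (string, func(), error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		handle(conn)
	}()
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// relayOnce sends payload from an upstream server through a relay to a
// client and returns what the client received.
func relayOnce(payload string) (string, error) {
	// Upstream: write the payload, then close the connection.
	upAddr, closeUp, err := serveOnce(func(c net.Conn) {
		io.WriteString(c, payload)
	})
	if err != nil {
		return "", err
	}
	defer closeUp()

	// Relay: connect to upstream and copy everything to the client.
	relayAddr, closeRelay, err := serveOnce(func(client net.Conn) {
		up, err := net.Dial("tcp", upAddr)
		if err != nil {
			return
		}
		defer up.Close()
		// Both ends are *net.TCPConn, the case readFrom checks for splice.
		io.Copy(client, up)
	})
	if err != nil {
		return "", err
	}
	defer closeRelay()

	c, err := net.Dial("tcp", relayAddr)
	if err != nil {
		return "", err
	}
	defer c.Close()
	b, err := io.ReadAll(c)
	return string(b), err
}

func main() {
	got, err := relayOnce("relayed payload")
	fmt.Println(got, err)
}
```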
CopyFileRange
copy_file_range_linux.go wraps the copy_file_range system call. Because this system call is relatively new, the wrapper first has to check the Linux kernel version to see whether it is supported. We'll skip the version check and the code that drives the copy in chunks, and look at how the system call itself is used.
```go
func copyFileRange(dst, src *FD, max int) (written int64, err error) {
	if err := dst.writeLock(); err != nil {
		return 0, err
	}
	defer dst.writeUnlock()
	if err := src.readLock(); err != nil {
		return 0, err
	}
	defer src.readUnlock()
	var n int
	for {
		n, err = unix.CopyFileRange(src.Sysfd, nil, dst.Sysfd, nil, max, 0)
		if err != syscall.EINTR {
			break
		}
	}
	return int64(n), err
}
```
Where is it used? In the readFrom method of os.File, which reads data from a source into the file.
```go
var pollCopyFileRange = poll.CopyFileRange

func (f *File) readFrom(r io.Reader) (written int64, handled bool, err error) {
	// copy_file_range(2) does not support destinations opened with
	// O_APPEND, so don't even try.
	if f.appendMode {
		return 0, false, nil
	}
	remain := int64(1 << 62)
	lr, ok := r.(*io.LimitedReader)
	if ok {
		remain, r = lr.N, lr.R
		if remain <= 0 {
			return 0, true, nil
		}
	}
	src, ok := r.(*File)
	if !ok {
		return 0, false, nil
	}
	if src.checkValid("ReadFrom") != nil {
		// Avoid returning the error as we report handled as false,
		// leave further error handling as the responsibility of the caller.
		return 0, false, nil
	}
	written, handled, err = pollCopyFileRange(&f.pfd, &src.pfd, remain)
	if lr != nil {
		lr.N -= written
	}
	return written, handled, NewSyscallError("copy_file_range", err)
}
```
As with TCPConn, this method is wrapped by *File.ReadFrom.
```go
func (f *File) ReadFrom(r io.Reader) (n int64, err error) {
	if err := f.checkValid("write"); err != nil {
		return 0, err
	}
	n, handled, e := f.readFrom(r)
	if !handled {
		return genericReadFrom(f, r) // without wrapping
	}
	return n, f.wrapErr("write", e)
}
```
So this optimization is used when copying files. The general call chain is io.Copy -> *File.ReadFrom -> *File.readFrom -> poll.CopyFileRange -> poll.copyFileRange.
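Seen from application code, that chain is triggered by an ordinary file-to-file io.Copy. The following is a minimal sketch: copyFile and copyViaTempFiles are hypothetical helpers, the paths are temporary files, and whether copy_file_range is actually used depends on the kernel, file system and Go version. Elsewhere the same code performs a regular copy.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile is a hypothetical helper that copies srcPath to dstPath with
// io.Copy, so that *File.ReadFrom gets a chance to pick a zero-copy path.
func copyFile(dstPath, srcPath string) (int64, error) {
	src, err := os.Open(srcPath)
	if err != nil {
		return 0, err
	}
	defer src.Close()

	// os.Create does not set O_APPEND; as the comment in readFrom notes,
	// destinations opened with O_APPEND skip copy_file_range.
	dst, err := os.Create(dstPath)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(dst, src)
	if cerr := dst.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// copyViaTempFiles writes content to a temporary file, copies it with
// copyFile, and returns what ended up in the copy.
func copyViaTempFiles(content string) (string, error) {
	dir, err := os.MkdirTemp("", "copyfile")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	srcPath := filepath.Join(dir, "src.txt")
	dstPath := filepath.Join(dir, "dst.txt")
	if err := os.WriteFile(srcPath, []byte(content), 0o644); err != nil {
		return "", err
	}
	if _, err := copyFile(dstPath, srcPath); err != nil {
		return "", err
	}
	b, err := os.ReadFile(dstPath)
	return string(b), err
}

func main() {
	got, err := copyViaTempFiles("hello, copy_file_range")
	fmt.Println(got, err)
}
```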
Zero-copy in the standard library
The Go standard library wraps zero-copy techniques at a low level, so you often use them without being aware of it. Suppose you implement a simple file server.
```go
package main

import "net/http"

func main() {
	http.Handle("/", http.StripPrefix("/static/", http.FileServer(http.Dir("../root.img"))))
	http.ListenAndServe(":8972", nil)
}
```
Call chain: http.FileServer -> *fileHandler.ServeHTTP -> http.serveFile -> http.serveContent -> io.CopyN -> io.Copy -> sendFile.
You can see that sendFile is called when a file is requested.
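A file server like this can be exercised end to end without a fixed port or an existing directory. The sketch below is an illustration only: fetchFromFileServer is a hypothetical helper that serves a temporary directory through net/http/httptest and fetches one file. It shows the http.FileServer entry point of the chain, and whether sendfile is reached underneath depends on the platform and the response conditions.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// fetchFromFileServer is a hypothetical helper. It writes content to a file
// called name in a temporary directory, serves that directory with
// http.FileServer, and returns the body obtained by requesting the file.
func fetchFromFileServer(name, content string) (string, error) {
	dir, err := os.MkdirTemp("", "fileserver")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, name), []byte(content), 0o644); err != nil {
		return "", err
	}

	// httptest.NewServer listens on a free loopback port.
	srv := httptest.NewServer(http.FileServer(http.Dir(dir)))
	defer srv.Close()

	resp, err := srv.Client().Get(srv.URL + "/" + name)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := fetchFromFileServer("hello.txt", "served from disk")
	fmt.Println(body, err)
}
```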
Third-party libraries
There are several libraries that provide wrappers for sendfile/splice.
Because it's easy to make system calls directly, we can often imitate the standard library and implement our own zero-copy methods.
So I personally feel that third-party wrappers add little for these traditional methods. What remains worth doing is wrapping, or building custom code around, newer zero-copy system interfaces.
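To give a sense of how little code a direct call takes, here is a minimal sketch that copies one regular file to another with syscall.Sendfile. It assumes Linux, where sendfile accepts a regular file as the destination; sendfileCopy and copyWithSendfile are hypothetical helpers, and the 4 MiB chunk size simply mirrors the value quoted earlier.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// sendfileCopy is a hypothetical helper (assumes Linux). It copies up to
// size bytes from src to dst by calling syscall.Sendfile in a loop.
func sendfileCopy(dst, src *os.File, size int64) (int64, error) {
	const maxChunk = 4 << 20 // per-call cap, mirroring maxSendfileSize
	var written int64
	for written < size {
		n := maxChunk
		if rem := size - written; int64(n) > rem {
			n = int(rem)
		}
		// A nil offset makes the kernel use and advance src's file offset.
		m, err := syscall.Sendfile(int(dst.Fd()), int(src.Fd()), nil, n)
		if m > 0 {
			written += int64(m)
		}
		if err == syscall.EINTR {
			continue // interrupted by a signal, retry
		}
		if err != nil {
			return written, err
		}
		if m == 0 {
			break // source exhausted
		}
	}
	return written, nil
}

// copyWithSendfile writes content to a temporary file, copies it with
// sendfileCopy, and returns what ended up in the copy.
func copyWithSendfile(content string) (string, error) {
	dir, err := os.MkdirTemp("", "sendfile")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	srcPath := filepath.Join(dir, "src.bin")
	dstPath := filepath.Join(dir, "dst.bin")
	if err := os.WriteFile(srcPath, []byte(content), 0o644); err != nil {
		return "", err
	}
	src, err := os.Open(srcPath)
	if err != nil {
		return "", err
	}
	defer src.Close()
	dst, err := os.Create(dstPath)
	if err != nil {
		return "", err
	}
	defer dst.Close()

	if _, err := sendfileCopy(dst, src, int64(len(content))); err != nil {
		return "", err
	}
	b, err := os.ReadFile(dstPath)
	return string(b), err
}

func main() {
	got, err := copyWithSendfile("copied with a direct sendfile call")
	fmt.Println(got, err)
}
```

The standard library version does more than this sketch, such as locking and fuller error handling, so going through io.Copy remains the simpler choice when it already covers the case.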