Explaining the principle and implementation of KCP protocol

1. Introduction

KCP is a fast and reliable ARQ (Automatic Repeat-reQuest) protocol that uses a different automatic retransmission policy than TCP and has a lower network latency than TCP. In practice, KCP over UDP is often used instead of TCP for online games and audio/video transmission. The implementation of KCP is short and concise, which is good for learning. The ARQ mechanism of KCP is similar to TCP, but some of the policies are different. This article discusses the principles and implementation of KCP, including its ARQ mechanism, congestion control, and so on.

I do not want to post a large code and then analyze it line by line, so each section of this article will first introduce the mechanisms of KCP with graphics, and then show the code and analyze its implementation. The readability of the KCP source code is relatively good, so it is recommended that you read this article in conjunction with the KCP source code.

We first look at the main interfaces of KCP in Section 2; then Section 3 introduces the data structures in KCP, including message segments, KCP objects, and queues; Section 4 introduces the sending, receiving, and retransmission mechanisms of KCP; Section 5 analyzes the congestion control mechanism of KCP; and finally, a brief summary. This article is a bit long, but a lot of code is shown in it, so I think it is not too complicated.

2. Using the interface

Let’s start with the KCP usage interface. Open ikcp.h, and focus on these interfaces:

// 一个 ikcpcb 实例代表一个 KCP 连接
typedef struct IKCPCB ikcpcb;

// 创建一个 KCP 实例
ikcpcb* ikcp_create(IUINT32 conv, void *user);

// 释放一个 KCP 实例
void ikcp_release(ikcpcb *kcp);

// 设置下层协议输出回调函数
void ikcp_setoutput(ikcpcb *kcp, int (*output)(const char *buf, int len,
    ikcpcb *kcp, void *user));

// 接收数据
int ikcp_recv(ikcpcb *kcp, char *buffer, int len);

// 发送数据
int ikcp_send(ikcpcb *kcp, const char *buffer, int len);

// 时钟更新
void ikcp_update(ikcpcb *kcp, IUINT32 current);

// 下层协议输入
int ikcp_input(ikcpcb *kcp, const char *data, long size);

// flush 发送缓冲区, 会在 ikcp_update 中调用
void ikcp_flush(ikcpcb *kcp);

ikcp_create creates a KCP instance. The conv parameter passed in identifies the KCP connection, which means that every message segment sent by the connection will have conv on it, and it will only receive message segments with conv equal to it. The two communicating parties must first negotiate an identical pair of convs. KCP itself does not provide any handshake mechanism, and the negotiation of conv is left to the user, e.g., over an existing TCP connection.

KCP is a purely algorithmic implementation, it is not responsible for lower-level protocols, there are no internal system calls, and even the clock has to be passed in externally. So we need to:

Call ikcp_setoutput to set the lower layer protocol output function. Whenever KCP needs to send data, it will call back this output function. For example, if the lower layer protocol is UDP, we call sendto in the output callback to send the data to the other side. The user argument of the output callback is equal to the user argument passed in by ikcp_create.
When the lower protocol data arrives, ikcp_input is called to pass the data to KCP.
ikcp_update is called at a certain frequency to drive the KCP clock. current indicates the current time in milliseconds.

After setting up the lower layer protocol and clock, you can call ikcp_recv and ikcp_send to send and receive KCP data.

Before we dive into the details, let’s take a brief look at these functions, so we can see what they will probably do:

ikcp_send puts data in the send queue waiting to be sent;
ikcp_recv reads data from the receive queue;
ikcp_input reads the lower protocol input data, parses the message segment; if it is data, puts the data in the receive buffer; if it is ACK, marks the corresponding message segment as delivered in the send buffer;
ikcp_flush calls the output callback to send the data out of the send buffer.

In the next few sections we will explore the details of the KCP implementation in more detail.

3. Data structure

3.1 Message segments

3.1.1 Message Segment Structure

Let’s look at the structure of KCP’s message segments. First, KCP has four types of message segments, or four commands:

Data message IKCP_CMD_PUSH.
Acknowledgement message `IKCP_CMD_ACK
Window probe message IKCP_CMD_WASK , which asks the counterpart the size of the remaining receive window.
Window notification message IKCP_CMD_WINS , notifying the counterpart of the size of the remaining receive window.

The structure of either message segment is the same:

0               4   5   6       8 (BYTE)
+---------------+---+---+-------+
|     conv      |cmd|frg|  wnd  |
+---------------+---+---+-------+   8
|     ts        |     sn        |
+---------------+---------------+  16
|     una       |     len       |
+---------------+---------------+  24
|                               |
|        DATA (optional)        |
|                               |
+-------------------------------+

You can see that there are several fields:

conv 4 bytes: connection identifier, as discussed earlier.
cmd 1 byte: Command.
frg 1 byte: number of fragments. Indicates how many subsequent messages belong to the same packet.
wnd 2 bytes: size of the sender’s remaining receive window.
ts 4 bytes: timestamp.
sn 4 bytes: message number.
una 4 bytes: the number of the smallest unreceived message segment in the sender’s receive buffer. That is, all segments with a number smaller than that have been received.
len 4 bytes: the length of the data segment.
data : Data segment. Only data messages will have this field.

First, each data message and ACK will carry sn, which uniquely identifies a message; the sender sends a data message, the receiver receives an ACK, and the receiver marks the corresponding message as delivered based on sn; at the same time, each message will carry una, and the sender marks the corresponding message as delivered based on una.

ts can be used to estimate the RTT (Round-Trip Time) and thus calculate the RTO (Retransmission TimeOut). We determine the timeout for each message based on the RTO, and if the message is not marked as delivered within the timeout, it will be retransmitted.

The size of the packet may exceed a MSS (Maximum Segment Size). At this point, a segmentation is performed, and frg indicates the number of subsequent segments, i.e., how many more messages belong to the same packet.

Each message is accompanied by wnd, which tells the sender the size of the remaining receive window, which helps the other side control the sending rate. We will discuss this in more detail in Section 5.

3.1.2 Implementation

In the KCP implementation, the following structure is used to represent a KCP message segment:

struct IKCPSEG
{
    struct IQUEUEHEAD node;
    IUINT32 conv;
    IUINT32 cmd;
    IUINT32 frg;
    IUINT32 wnd;
    IUINT32 ts;
    IUINT32 sn;
    IUINT32 una;
    IUINT32 len;
    IUINT32 resendts;
    IUINT32 rto;
    IUINT32 fastack;
    IUINT32 xmit;
    char data[1];
};

In addition to the fields of the message, there are also the following fields:

node : link table node. We will discuss this in detail in Section 3.3.
resendts : retransmission timestamp. After this time, the message has timed out and needs to be retransmitted.
rto : The RTO of the message.
fastack : The number of ACK out-of-order times. This is the number of “skips” referred to in KCP Readme.
xmit : The number of times the message was transmitted.

Table of Contents