Redis provides two persistence mechanisms, RDB and AOF. This chapter first describes how the Redis server saves and loads RDB files, focusing on the implementation of the SAVE and BGSAVE commands. It then describes how the server’s auto-save feature is implemented. Next, it describes the components of an RDB file and explains the structure and meaning of each one. At the end of the chapter, we analyze an actual RDB file and put what we have learned into practice. Some pseudo-code is included to aid understanding. The source of this article is the book Redis Design and Implementation.
Basic Introduction
RDB persistence can be performed either manually or periodically, depending on the server configuration options. It saves the database state at a point in time to an RDB file. The generated RDB file is a compressed binary file that can be used to restore the database state as of the time the file was generated.
RDB file creation and loading
Two Redis commands can be used to generate RDB files: SAVE and BGSAVE.
- The SAVE command blocks the Redis server process until the RDB file is created, and the server cannot process any command requests while the server process is blocked.
- The BGSAVE command spawns a child process, which is then responsible for creating the RDB file, while the server process (the parent process) continues to process the command requests.
The actual work of creating the RDB file is done by the rdb.c/rdbSave function. The SAVE and BGSAVE commands call this function in different ways: SAVE calls it directly in the server process, while BGSAVE forks first and calls it in the child process.
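The difference between the two call paths can be sketched in Python. This is only an illustration, not the Redis implementation: rdb_save is a hypothetical stand-in that pickles a dict instead of writing the real RDB format, and the temp-file naming is an assumption made for the example. It relies on os.fork, so it runs on Unix-like systems only.

```python
import os
import pickle


def rdb_save(db, path):
    """Hypothetical stand-in for rdbSave: dump the whole dataset to disk."""
    tmp_path = f"{path}.{os.getpid()}.tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(db, f)
    os.replace(tmp_path, path)  # swap the finished file into place


def save(db, path):
    """SAVE: the server process does the work itself and is blocked meanwhile."""
    rdb_save(db, path)


def bgsave(db, path):
    """BGSAVE: fork a child to do the work; the parent returns immediately."""
    pid = os.fork()
    if pid == 0:
        # Child process: write the snapshot, then exit without returning.
        try:
            rdb_save(db, path)
            os._exit(0)
        except Exception:
            os._exit(1)
    # Parent process: free to keep serving requests; reap the child later.
    return pid
```

The parent would later call os.waitpid(pid, 0) (or poll with os.WNOHANG) to learn whether the child finished successfully.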
RDB files are loaded automatically at server startup, and the actual work of loading them is done by the rdb.c/rdbLoad function. Redis therefore has no command dedicated to loading RDB files: as long as the server detects an RDB file at startup, it loads that file automatically.
The server prints a log message when the RDB file has been loaded successfully. It is also worth mentioning that, because the AOF file is usually updated more frequently than the RDB file:
- If the server has AOF persistence enabled, it uses the AOF file to restore the database state first.
- The server uses the RDB file to restore the database state only when AOF persistence is turned off.
The following diagram shows the decision flow the server follows when it loads a file.
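The same decision can be written as a small Python function. It is a sketch of the two rules stated above only; the function name, its parameters, and the returned tuple are made up for this example and do not correspond to anything in the Redis source.

```python
import os


def choose_file_to_load(aof_enabled, aof_path, rdb_path):
    """Return (kind, path) for the file used to restore the database state."""
    if aof_enabled:
        # AOF persistence is on: the AOF file takes priority.
        return ("aof", aof_path)
    if os.path.exists(rdb_path):
        # AOF persistence is off: fall back to the RDB file if there is one.
        return ("rdb", rdb_path)
    # Nothing to load: the server starts with an empty database.
    return (None, None)
```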
The different states of the server when the SAVE and BGSAVE commands are executed
SAVE
As mentioned earlier, the Redis server is blocked while the SAVE command is executing, so command requests sent by clients are not served during that time.
Only after the server finishes executing SAVE and starts accepting command requests again are the commands sent by clients processed.
BGSAVE
Because the save performed by BGSAVE is carried out by a child process, the Redis server can continue to process client command requests while the child creates the RDB file. During the execution of BGSAVE, however, the server handles the SAVE, BGSAVE and BGREWRITEAOF commands differently than usual.
First, a SAVE command sent by a client while BGSAVE is executing is rejected by the server. The server forbids SAVE and BGSAVE from running at the same time so that the parent process (the server process) and the child process never make two rdbSave calls concurrently, which would create a race condition.
Second, a BGSAVE command sent by a client while BGSAVE is executing is also rejected, because running two BGSAVE commands at the same time would create a race condition as well.
Finally, the BGREWRITEAOF and BGSAVE commands cannot be executed at the same time.
- If the BGSAVE command is executing, then the BGREWRITEAOF command sent by the client is delayed until the BGSAVE command has finished executing.
- If the BGREWRITEAOF command is executing, then the BGSAVE command sent by the client will be rejected by the server.
Since the actual work of both BGSAVE and BGREWRITEAOF is performed by child processes, there is no operational conflict between the two commands; forbidding them from running simultaneously is purely a performance consideration. Spawning two child processes that both perform heavy disk writes at the same time is not a good idea.
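The rules above can be collected into a small state machine. The following Python sketch models only the cases listed in this section; the class, its attribute names, and the string results are hypothetical and chosen for illustration.

```python
class PersistenceState:
    """Toy model of how the server treats persistence commands."""

    def __init__(self):
        self.rdb_child_running = False      # a BGSAVE child is active
        self.aof_child_running = False      # a BGREWRITEAOF child is active
        self.aof_rewrite_scheduled = False  # BGREWRITEAOF waiting for BGSAVE

    def handle(self, command):
        if command == "SAVE":
            # SAVE may not run alongside a BGSAVE child.
            return "rejected" if self.rdb_child_running else "executed"
        if command == "BGSAVE":
            # Rejected while another BGSAVE or a BGREWRITEAOF is running.
            if self.rdb_child_running or self.aof_child_running:
                return "rejected"
            self.rdb_child_running = True
            return "started"
        if command == "BGREWRITEAOF":
            # Delayed, not rejected, while BGSAVE is running.
            if self.rdb_child_running:
                self.aof_rewrite_scheduled = True
                return "delayed"
            self.aof_child_running = True
            return "started"
        raise ValueError(f"not a persistence command: {command}")

    def bgsave_finished(self):
        """Called when the BGSAVE child exits; start any delayed rewrite."""
        self.rdb_child_running = False
        if self.aof_rewrite_scheduled:
            self.aof_rewrite_scheduled = False
            self.aof_child_running = True
```

Note the asymmetry the model preserves: BGREWRITEAOF is queued behind a running BGSAVE, whereas BGSAVE is simply refused while BGREWRITEAOF is running.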
The server will remain in a blocking state while the RDB file is loaded until the load is complete.
Automatic interval saving
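Because BGSAVE can run without blocking the server process, Redis lets you set save conditions with the save option so that the server executes BGSAVE automatically whenever a condition is met. A typical Redis configuration contains the three lines save 900 1, save 300 10, and save 60 10000.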
With this configuration, BGSAVE is executed as soon as any one of the following conditions is met:
- The server has made at least 1 change to the database within 900 seconds
- The server has made at least 10 changes to the database within 300 seconds
- The server has made at least 10000 changes to the database within 60 seconds
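As a rough illustration of how such conditions could be turned into the saveparams array discussed below, here is a minimal Python sketch. The SaveParam type and parse_save_lines function are hypothetical names for this example, and the parser only understands the simple "save <seconds> <changes>" form.

```python
from typing import NamedTuple


class SaveParam(NamedTuple):
    seconds: int  # length of the time window
    changes: int  # minimum number of modifications within that window


def parse_save_lines(config_text):
    """Collect every 'save <seconds> <changes>' line into a list."""
    params = []
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "save":
            params.append(SaveParam(int(parts[1]), int(parts[2])))
    return params


saveparams = parse_save_lines("save 900 1\nsave 300 10\nsave 60 10000\n")
```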
Autosave pseudocode
Roughly, the diagram looks like this.
In addition to the saveparams array, the server state maintains a dirty counter, and a lastsave attribute.
- The dirty counter records how many changes (including writes, deletes, updates, etc.) the server has made to the database state (all databases on the server) since the last successful execution of a SAVE command or BGSAVE command.
- The lastsave property is a UNIX timestamp that records when the server last successfully executed a SAVE command or a BGSAVE command.
Example.
The above figure shows the dirty counter and the lastsave attribute contained in the server state, illustrated as follows.
- The dirty counter has a value of 123, indicating that the server has made 123 changes to the database state since the last save.
- The lastsave property records the timestamp of the last time the server performed a save operation.
Check if the save condition is met
Redis’s periodic server function serverCron runs every 100 milliseconds by default. One of its jobs is to check whether any save condition has been met and, if so, to execute the BGSAVE command.
On each run, the program iterates through all the save conditions in the saveparams array; as long as any one of them is met, the server executes the BGSAVE command.
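The check can be sketched in Python as follows. This is a simplified model, not the Redis source: ServerState, should_bgsave and server_cron are illustrative names, the bgsave argument is a callback supplied by the caller, and the sketch resets dirty and lastsave immediately, whereas a real server would do so only after the save has completed successfully.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ServerState:
    # (seconds, changes) pairs, as set by the save option
    saveparams: list = field(
        default_factory=lambda: [(900, 1), (300, 10), (60, 10000)]
    )
    dirty: int = 0                                    # changes since last save
    lastsave: float = field(default_factory=time.time)  # UNIX timestamp


def should_bgsave(server, now=None):
    """Return True if any save condition is currently satisfied."""
    now = time.time() if now is None else now
    elapsed = now - server.lastsave
    for seconds, changes in server.saveparams:
        if server.dirty >= changes and elapsed > seconds:
            return True
    return False


def server_cron(server, bgsave, now=None):
    """One periodic tick: trigger bgsave() when a condition is met."""
    now = time.time() if now is None else now
    if should_bgsave(server, now):
        bgsave()
        server.dirty = 0       # simplification: assume the save succeeded
        server.lastsave = now
        return True
    return False
```

With dirty at 123 and the last save 301 seconds ago, the condition (300, 10) is met, so a tick triggers a save and resets both fields.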
RDB file structure
A complete RDB file consists of the following parts, in order: REDIS, db_version, databases, EOF, and check_sum.
REDIS
At the beginning of the RDB file is the REDIS section, which is 5 bytes long and holds the five characters "REDIS". With these five characters, the program can quickly check whether the file being loaded is an RDB file.
db_version
db_version is 4 bytes long and holds an integer written as a string of digits, which records the RDB version number used by the file. The current version of the RDB file is 0009. Since different versions of RDB files are not compatible with each other, the program must choose a reading method that matches the version when it loads the file.
databases
The databases section contains zero or more databases, along with the key-value pairs in each database:
- If the server’s database state is empty (all databases are empty), then this section is also empty and is 0 bytes long.
- If the server’s database state is non-empty (at least one database is non-empty), then this section is also non-empty, and its length varies with the number, type, and content of the key-value pairs stored in the databases.
EOF
The EOF constant is 1 byte long. This constant marks the end of the body of the RDB file; when the reading program encounters this value, it knows that all key-value pairs for all databases have been loaded.
CheckSum
check_sum is an 8-byte unsigned integer that holds a checksum, which is calculated by the program from the contents of REDIS, db_version, databases, and EOF. When the server loads the RDB file, it will compare the checksum calculated from the loaded data with the checksum recorded by check_sum to check whether there is any error or corruption in the RDB file.
Starting from RDB version 5, if rdbchecksum yes is set in the configuration file, a CRC64 checksum of the entire file content is computed and stored in the last 8 bytes of the RDB file.
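To make the five-part layout concrete, here is a minimal Python sketch that splits a byte string into those parts. It follows the layout as described in this section and does not verify the CRC64 value. Two details are assumptions not stated in the text above: that the EOF byte is 0xFF and that the checksum is stored in little-endian order. The function name split_rdb is made up for this example.

```python
def split_rdb(data):
    """Split raw RDB bytes into version, databases body, and checksum."""
    if len(data) < 18:  # 5 (REDIS) + 4 (version) + 1 (EOF) + 8 (checksum)
        raise ValueError("too short to be an RDB file")
    if data[:5] != b"REDIS":
        raise ValueError("missing REDIS magic string")
    version = int(data[5:9].decode("ascii"))  # e.g. b"0009" -> 9
    if data[-9] != 0xFF:  # assumed value of the EOF constant
        raise ValueError("missing EOF marker")
    return {
        "version": version,
        "databases": data[9:-9],  # everything between header and EOF
        "check_sum": int.from_bytes(data[-8:], "little"),  # assumed byte order
    }
```

For a file with an empty database state, the databases part comes back as zero bytes, matching the description above.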
An example
This is an RDB file produced by the latest Redis I pulled, with no data in it. Let’s dump it with od -c rdb.rdb and analyze it field by field.
These fields are the header contents common to AOF and RDB files:
- The first 5 bytes are fixed as REDIS.
- The next 4 bytes (bytes 6 to 9) are the RDB version number.
- Next is redis-ver and its value, i.e. the Redis version.
- Then redis-bits and its value, i.e. whether the Redis build is 32-bit or 64-bit; the value is 32 or 64.
- Next is ctime and its value, the RDB file creation time.
- Then used-mem and its value, the amount of memory in use when the RDB file was created.
- Finally aof-preamble and its value, which is 0 or 1; 1 means the RDB content serves as the preamble of an AOF file.
The RDB file header, however, has three more items before aof-preamble:
- repl-stream-db: the database selected in the server.master client
- repl-id: the replication ID of the current instance
- repl-offset: the replication offset of the current instance
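These header items can be read programmatically. The Python sketch below walks the metadata pairs that follow the 9-byte header. The encoding details are not covered in this article and are stated here as assumptions about the RDB format: each pair is introduced by the byte 0xFA, strings carry a length prefix whose top two bits select a 6-bit, 14-bit, or 32-bit length, and a prefix starting with binary 11 marks a small integer stored in 1, 2, or 4 little-endian bytes. Compressed strings are not handled, and all function names are made up for this example.

```python
import struct

AUX_OPCODE = 0xFA  # assumed marker byte preceding each metadata pair


def read_length(buf, pos):
    """Decode a length prefix; returns (is_special, value, next_pos)."""
    first = buf[pos]
    kind = first >> 6
    if kind == 0:      # 6-bit length in the same byte
        return False, first & 0x3F, pos + 1
    if kind == 1:      # 14-bit length across two bytes
        return False, ((first & 0x3F) << 8) | buf[pos + 1], pos + 2
    if first == 0x80:  # 32-bit big-endian length follows
        return False, struct.unpack_from(">I", buf, pos + 1)[0], pos + 5
    if kind == 3:      # specially encoded value (integer or compressed)
        return True, first & 0x3F, pos + 1
    raise ValueError(f"unsupported length byte {first:#x}")


def read_string(buf, pos):
    """Read one string; integer-encoded values are returned as digits."""
    special, value, pos = read_length(buf, pos)
    if not special:
        return buf[pos:pos + value], pos + value
    fmt = {0: "<b", 1: "<h", 2: "<i"}.get(value)
    if fmt is None:
        raise ValueError("compressed strings are not handled in this sketch")
    (number,) = struct.unpack_from(fmt, buf, pos)
    return str(number).encode("ascii"), pos + struct.calcsize(fmt)


def read_aux_fields(buf):
    """Return the metadata pairs that follow the REDIS + version header."""
    if buf[:5] != b"REDIS":
        raise ValueError("not an RDB file")
    pos = 9
    fields = {}
    while pos < len(buf) and buf[pos] == AUX_OPCODE:
        key, pos = read_string(buf, pos + 1)
        value, pos = read_string(buf, pos)
        fields[key.decode()] = value.decode()
    return fields
```

Calling read_aux_fields on the bytes of a dump file returns a dictionary keyed by names such as redis-ver and ctime, which can be compared against the od -c output by eye.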
Summary
- The RDB file is used to save and restore all key-value pairs in all databases of the Redis server.
- The SAVE command performs the save operation directly in the server process, so this command blocks the server.
- The BGSAVE command performs the save operation in a child process, so the command does not block the server.
- All save conditions set with the save option are stored in the server state, and the server automatically executes the BGSAVE command when any of the save conditions is satisfied.
- An RDB file is a compressed binary file consisting of multiple parts.
- For different types of key-value pairs, the RDB file uses different ways to save them.