Continuous Archival Backups
pg has three basic backup methods.
- sql dump using
pg_dump
, which is a logical backup that cannot be restored to a specified state. - File system based backup, requires file system to provide snapshot function to ensure consistency, otherwise must be shut down first and then backed up.
- Continuous archiving, the preferred high-reliability backup technology.
Continuous archiving of WAL logs is the key to implementing archived backups, combining a file system level backup with an archived WAL file, so that when recovery is required, the file system backup is restored first, and then the archived WAL file is replayed to restore the system to its current (or specified point in time) state.
Advantages of continuous archiving.
- Rather than requiring a fully consistent file system backup, a consistent state is reached by replaying the WAL from a redo point, so a simple tool can be used to make a base backup without stopping the machine.
- Simple continuous archiving of WAL files enables continuous backups in situations where full backups are not possible frequently.
- Use base backups and WAL archive reversion to specified points in time to restore the database to any point in time after the base backup.
- Continuous transfer of WAL archive files to another machine to which the same base backup has been applied enables a master-backup replication system.
Disadvantages.
- Only the entire database set cluster can be restored by backup, no finer granularity is supported.
- Base backups and continuous archive files can take up a lot of space.
Enabling WAL archiving
pg continuously generates 16M WAL segment files in the pg_wal
directory by default. pg will clean up and recycle WAL segment files when archiving mode is not enabled.
The archive process transfers the WAL segment files before the checkpoint from the pg_wal
directory to a specified location (e.g., copied to a user-defined directory), and the original WAL segment files are purged or recycled after a successful transfer.
The pg archiving process is highly scalable, as the transfer process executes a shell command provided by the user, and the exact method and target location of the transfer is completely user-defined, with only the return code of the command determining whether the archiving was successful.
Example configuration to enable WAL archiving.
There are the following notes about the archive command.
- The archive command should never directly overwrite an archive file with the same name, since different set clusters may produce WAL segment files with the same name, and the archive command should return with a non-zero status code when an archive file already exists at the archive location.
- The archive command should preserve the original filename of the WAL segment file.
- The archive command should be designed to resolve or handle potential exceptions that could cause the
pg_wal
directory to grow and cause pg to shut down due to lack of space. - It is necessary to enable the
logging_collector
parameter so that the standard error output of the archive command will be collected in the database logs for easy debugging and monitoring of the archiving process. - For databases with a small workload, a WAL segment file may not be switched for a long time. Setting a small
archive_timeout
can shorten the time interval between when a transaction is executed and when it is reliably archived; ` Manually switch the WAL segment file, which also triggers the archiving of the WAL segment file used before the switch, this command can only be executed on the master. - If the command is terminated by a signal other than SIGTERM or a shell error (e.g., command not found), archiving will be aborted and the server will shut down.
Basic backup
A basic backup is a staged operation consisting of several sequential commands.
-
Execute
pg_start_backup
to start the backup, the database performs a checkpoint, explicitly creates a re-point at the start moment of the backup, writes as many transactions as possible to disk before this point in time, and records the LSN location of this checkpoint. In order not to interfere with online operation, this command performs the checkpoint by default by spreading out the IO, so it may take a long time to return.pg_start_backup
also forces full-page write mode to ensure that writes to the database are fully replayable during the base backup. -
After
pg_start_backup
is successfully executed, the user needs to archive the data directory using the file system tool itself to obtain a file system backup, noting that different files may have been copied at different points in time and therefore the state of the database in the file system backup may not be consistent. pg expects this to happen and brings it to a consistent state by replaying the WAL segment files written throughout the base backup to bring them to a consistent state. -
When
pg_stop_backup
is executed, the database forcibly switches the WAL segment files in use and records the last LSN before the switch, with the ending LSN being the consistent state actually backed up by this base backup. The WAL segment files written during the base backup are archived, and the end operation sends a backup history file with the names of these WAL segment files to the archive location, the file name of which records the first WAL segment file needed to restore using this base backup, and the contents of which record the start and end timestamps of this backup and the required starting and ending WAL segment files. Applying these archived WAL segment files on top of the file system backup will restore to the consistent state represented by the ending LSN. -
After ending the backup you need to create manually a
backup_label
file in the root directory of the file system backup with the output returned by thepg_stop_backup
operation and preferably save the WAL segment files recorded in the backup history file separately, the file system backup plus the WAL segment files written during the backup is a valid base The file system backup plus the WAL segment files written during the backup is a valid base backup that can successfully restore the database to the consistent state it was in when the backup ended.
The details of the pg_start_backup
and pg_stop_backup
commands can be found below.
After securely archiving file system backups and WAL segment files used during backups (specified in the backup history file), WAL segment files with a numeric sequence in the file name smaller than the backup history file name can be deleted.
Recovery requires a base backup plus ongoing WAL archives created after that base backup, and since replaying a large number of WAL archives is time-consuming, the recommended practice is to do a base backup periodically, while cleaning up older WAL archives before the base backup (after cleaning it is impossible to restore to that point in time).
Make a base backup using the underlying API
-
Make sure the archiving function is enabled and working properly.
-
Connect to the database using the superuser and execute the following command (this connection needs to be maintained during the backup).
1
SELECT pg_start_backup('label', false, false);
- The first parameter is a custom descriptive label.
- The second parameter represents whether to turn on fast checkpointing, which when turned on will execute fast checkpointing immediately initiating a large number of IOs that may affect database performance during backups. Turning it off will disperse the IOs to reduce the impact on the database, but will take longer to execute.
- The third parameter represents whether to enable exclusive backups, which is no longer recommended in newer versions.
-
Use any file system backup tool to archive the data directory. This error can be ignored if the backup tool returns with a non-zero status code because the files were changed during the copy. Partial temporary subdirectories, files or files in subdirectories under the data directory (e.g.
pg_wal
) can be ignored during archiving. -
Continue with the following command in the same connection.
1
SELECT * FROM pg_stop_backup(false, true);
This command will terminate the backup mode and automatically switch the WAL segment files so that the WAL segment files written during the backup can be archived, by default these WAL end files are archived successfully before the command returns, the second field output returned needs to be written to the
backup_label
file in the root directory of the file system backup, this file needs to be created manually in non-exclusive backup mode. -
When the end command returns successfully, it means that the WAL segment files written during the backup have been archived and the backup is finished. You can save the file system backup together with these WAL segment files, which form a complete base backup.
Making basic backups with pg_basebackup
pg_basebackup
encapsulates the underlying pg_start_backup
and pg_stop_backup
commands and provides some very convenient features.
- Supports archiving and compressing file system backups of data directories, automatically ignoring unwanted files, and automatically writing to
backup_label
files. - Automatically record checksums of backup files to prevent backups from being changed.
- Support to automatically fetch all WAL segment archive files generated during backup and save them to
pg_wal
or other directories.
Usage examples.
|
|
Extract base.tar.gz
and pg_wal.tar.gz
respectively to get the file system backup and the WAL segment archives written during the backup.
|
|
The backup_label file in the data directory does not record the LSN location of the end of the base backup, but it can be viewed in the corresponding backup history file in the WAL archive directory.
|
|
When recovering with this base backup, the recovery process replayed the 0000000100000000000000000085
WAL segment file before the database entered a consistent state, after which the database was able to accept connections to perform query operations, as shown in the recovery log below.
|
|
Recovering from an archived backup
To restore, proceed as follows.
- Stop the pg server process.
- Back up the current data directory, or at least the WAL segment files in the
pg_wal
directory that have not been archived, if there is not enough space. - Remove all files and subdirectories from the data directory.
- Restore file system backup to the data directory, making sure that the restored files and directories have the correct permissions.
- If the
pg_wal
directory was not ignored when doing the file system backup, you now need to empty the restoredpg_wal
directory and copy the unarchived WAL segment files to it. - Create a
recovery.signal
file in the data directory that instructs pg to enter recovery mode at boot and will be automatically deleted after a successful recovery. - Set recovery-related parameters, such as
restore_command
, in the configuration filepostgresql.conf
, which is described in detail below. - Start the pg server. The server will enter recovery mode and start fetching and processing the WAL segment files needed for recovery. If the recovery process is terminated due to an external error (e.g. host power failure), you can simply restart the server to let it continue the recovery. When recovery is complete, the server deletes the
recovery.signal
file to prevent re-entry into recovery mode, and then begins normal operation.
The recovery process is not complicated, and the key points are set via the configuration file to.
- The way to get the archived WAL segment file when recovering.
- The target state to be reached for recovery.
Restore command
The restore_command
(henceforth called the restore command) must be set to tell pg how to get the archived WAL segment file. Similar to the archive command, this command defines how to transfer the specific WAL segment file (which also contains other types of files) that pg needs from the archive location to a temporary location in the data directory in a user-defined way, and then pg will then reads the temporary file and performs a redo. More complex behavior can be customized in this command by executing a script.
If the archived WAL segment files are stored in the /backup/demo/pg_wal
directory, the sample recovery command is as follows.
|
|
The recovery command is designed with the following caveats.
-
The recovery command must return with a non-zero status code if the transfer fails or the transferred file does not exist. pg will try to fetch some files that are not in the archive location with the recovery command, so the recovery will not be aborted directly because the recovery command failed.
1 2 3 4 5 6
2022-07-10 08:05:24.762 UTC [21181] LOG: restored log file "00000003.history" from archive 2022-07-10 08:05:24.766 UTC [21181] LOG: restored log file "00000004.history" from archive cp: cannot stat '/var/lib/postgresql/archive/14/demo/00000005.history': No such file or directory 2022-07-10 08:05:24.771 UTC [21181] LOG: starting point-in-time recovery to 2022-07-10 08:03:23.157122+00 2022-07-10 08:05:24.776 UTC [21181] LOG: restored log file "00000004.history" from archive 2022-07-10 08:05:24.862 UTC [21181] LOG: restored log file "000000010000000000000076" from archive
-
If pg cannot fetch a file with the recovery command, it will later try to find it from the
pg_wal
directory in the data directory. -
If the command is terminated by a signal other than SIGTERM or a shell error (e.g., command not found), recovery will be aborted and the server will shut down.
Recovery modes
Recovery actually works in two modes.
- Create a file named
standby.signal
in the data directory, which will enter standby mode when the server starts. The server enters the recovery state and does not stop recovering when it reaches the end of the archived WAL, but continues to try to recover by connecting to the primary server specified by theprimary_conninfo
setting or by using therestore_command
to get a new WAL segment. - Create a file called
recovery.signal
in the data directory that will enter the target recovery mode when the server starts. Target recovery mode ends when the archived WAL segment file is fully replayed or whenrecovery_target
is reached.
Typically, standby mode is used to provide high availability and extend read-only libraries, while target recovery is used to recover lost data or clone a new server. If both standby.signal
and recovery.signal
files are created, the standby mode takes precedence.
Recovery Target
By default, recovery mode processes all available WAL segment files, i.e., it increments the sequence number in the WAL segment naming to get the next WAL segment file until it fails, and finally restores the database to a “relatively up-to-date” state. So it is usually normal to see a “file not found” error during the recovery process, especially at the end of the recovery.
If you only need to restore to a point in time between the end of the base backup and the current moment, you need to specify the stopping point, the recovery target. The recovery target can be a timestamp, a named recovery point and a transaction ID, in practice point-in-time based recovery is used mostly.
We can specify a recovery target by choosing one of the following parameters and only one of the following.
recovery_target
: the only available value isimmediate
, which stops recovery when it reaches a consistent state at the end of the base backup.recovery_target_lsn
: Set the recovery target to the specified LSN.recovery_target_name
: set the recovery target to the specified named recovery point (created bypg_create_restore_point()
).recovery_target_time
: set the recovery target to the specified timestamp.recovery_target_xid
: sets the recovery target to the specified transaction ID.
Further set whether to include the recovery target and the behavior upon arrival with the following options.
recovery_target_inclusive
: Specifies whether the recovery target is included on recovery, i.e. whether to replay WAL log entries that contain a target timestamp, LSN, or transaction ID, otherwise it will stop before that recovery target. The default value istrue
.recovery_target_timeline
: Specifies the timeline reached during recovery, usually using the default valuelatest
, which is the last generated timeline in the archive log.recovery_target_action
: specifies the action that the server will perform when the recovery target is reached, which only takes effect when the recovery target is set. The following three actions can be executed.pause
: Suspends the recovery process. When paused, the user can perform the next action based on the current status queried from the database. You can choose to resume the paused recovery process, which will cause the recovery mode to end, or you can choose to restart the database server after changing the recovery target configuration and revert to another target. This option is equivalent toshutdown
if the server is unable to accept queries due to thehot_standby=off
configuration.promot
: Ends the recovery process and creates a new timeline to start accepting database connections.shutdown
: Shut down the server directly. This does not delete therecovery.signal
file and will immediately shut down again if the server is restarted; if it is restarted after changing the recovery target configuration, it will enter recovery mode and continue redirecting to a new recovery target from the previous recovery target.
Recovery termination
In any case, if the recovery process ends due to running out of available WAL segment files before reaching the configured recovery target, the recovery process will exit with a FATAL error causing the server to shut down.
|
|
If the recovery process finds corrupt WAL data, recovery will stop immediately and the server will not start normally. In this case a new recovery target can be specified before the point of corruption and then the recovery process can be rerun.
If the recovery fails for external reasons, such as a system crash or inaccessible WAL archives, the recovery can be continued by simply restarting the server and the recovery process will start over from where it failed. Recovery restarts work much like normal checkpoints: the server periodically flushes all its state to disk, then updates the pg_control
file with the completed WAL data at the record, which no longer needs to be reprocessed at restart.
Timeline
The concept of timeline is introduced in pg. A new timeline is created after each successful recovery based on the recovered timeline, which is mainly reflected in the naming of WAL segment files.
Each time a new timeline is created pg also creates a timeline history file <timeline-ID>.history
, which records the parent timeline of that timeline. When restoring from an archive containing multiple timelines, pg needs the timeline history file to keep going back and selecting the correct sequence of WAL segment files. Therefore the timeline history file is also archived to the WAL archive location.
The advantage of introducing a timeline is that users can restore based on the same backup and then back up again, simulating a tree backup history through the timeline, and then easily locate and switch between forks without worrying about confusion between them due to naming conflicts, etc. The timeline is very useful in some scenarios, and usually we use the most recent timeline in the WAL archive directly.