1. Introduction
rsync is a commonly used Linux application for file synchronization.
It can synchronize files between the local computer and a remote computer, or between two local directories (but it does not support synchronization between two remote computers). It can also be used as a file copying tool, replacing the cp and mv commands.
The r in its name stands for remote: rsync means "remote synchronization". Unlike other file transfer tools (such as FTP or scp), rsync checks the existing files on both the sender and the receiver and transfers only the parts that have changed (by default, a file counts as changed if its size or modification time differs).
2. Installation
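If rsync is not installed on the local or remote computer, you can install it with the distribution's package manager: sudo apt-get install rsync on Debian, sudo yum install rsync on Red Hat, or sudo pacman -S rsync on Arch Linux.
Note that rsync must be installed on both sides of the transfer.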
3. Basic usage
3.1 -r parameter
When used locally, rsync can replace the cp and mv commands to synchronize a source directory to a target directory.
    $ rsync -r source destination
In the above command, -r means recursive, i.e. subdirectories are included. Note that -r is required here; without it, rsync skips the directory and copies nothing. source is the source directory, and destination is the destination directory.
If there are multiple files or directories to be synchronized, list them all before the destination.
    $ rsync -r source1 source2 destination
In the above command, source1 and source2 will both be synchronized to the destination directory.
3.2 -a parameter
The -a parameter can be used instead of -r. In addition to recursive synchronization, it also synchronizes meta-information (such as modification time and permissions). Since rsync uses file size and modification time by default to determine whether a file needs to be updated, -a is more useful than -r. The following is the common way to write it.
    $ rsync -a source destination
If the destination directory destination does not exist, rsync will create it automatically. After executing the above command, the source directory source is copied in its entirety into the destination directory destination, i.e. the directory structure destination/source is formed.
If you want to synchronize only the contents of the source directory source to the destination directory destination, add a slash after the source directory.
    $ rsync -a source/ destination
After the above command is executed, the contents of the source directory are copied into the destination directory, and no source subdirectory is created under destination.
3.3 -n parameter
If you are not sure what the result of running rsync will be, you can first simulate it with the -n or --dry-run parameter.
    $ rsync -anv source/ destination
In the above command, the -n parameter simulates the result of the command without actually executing it. The -v parameter outputs the results to the terminal so that you can see what would be synchronized.
3.4 --delete parameter
By default, rsync only ensures that all the contents of the source directory (except for explicitly excluded files) are copied to the target directory. It does not make the two directories identical, and it does not delete files. If you want the target directory to be a mirror copy of the source directory, you must use the --delete parameter, which deletes files that exist only in the target directory and not in the source directory.
    $ rsync -av --delete source/ destination
The --delete parameter in the above command makes destination a mirror of source.
4. Excluding files
4.1 --exclude parameter
Sometimes we want to exclude certain files or directories from synchronization. The exclusion pattern is specified with the --exclude parameter.
    $ rsync -av --exclude='*.txt' source/ destination
The above command excludes all TXT files.
Note that rsync synchronizes hidden files (those starting with a dot) by default. To exclude hidden files, you can write --exclude=".*".
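If you want to exclude all files inside a directory, but do not want to exclude the directory itself, you can write it like this.
    $ rsync -av --exclude='dir1/*' source/ destination
Multiple exclusion patterns can be given with multiple --exclude parameters.
    $ rsync -av --exclude='file1.txt' --exclude='dir1/*' source/ destination
Multiple exclusion patterns can also take advantage of Bash's brace expansion, so that a single --exclude parameter is enough.
    $ rsync -av --exclude={'file1.txt','dir1/*'} source/ destination
If there are many exclusion patterns, you can write them to a file, one pattern per line, and then specify this file with the --exclude-from parameter.
    $ rsync -av --exclude-from='exclude-file.txt' source/ destination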
4.2 --include parameter
The --include parameter specifies file patterns that must be synchronized, and is often used in combination with --exclude.
    $ rsync -av --include='*.txt' --exclude='*' source/ destination
The above command excludes all files from synchronization except TXT files, which are included.
5. Remote synchronization
5.1 SSH Protocol
rsync supports remote synchronization as well as synchronization between two local directories. It can synchronize local content to a remote server.
    $ rsync -av source/ username@remote_host:destination
It can also synchronize remote content to the local computer.
    $ rsync -av username@remote_host:source/ destination
rsync uses SSH for remote login and data transfer by default.
Early versions of rsync did not use SSH by default, so the protocol had to be specified with the -e parameter. This was changed later, so the -e ssh in the following command can be omitted.
    $ rsync -av -e ssh source/ username@remote_host:destination
However, if the ssh command needs additional parameters, you must specify the SSH command to be executed with the -e parameter.
    $ rsync -av -e 'ssh -p 2234' source/ username@remote_host:destination
In the above command, the -e parameter specifies that SSH uses port 2234.
5.2 rsync protocol
In addition to using SSH, if the other server has the rsync daemon installed and running, you can also transfer using the rsync:// protocol (default port 873). This is written with a double colon (::) separating the server from the target directory.
    $ rsync -av source/ remote_host::module/destination
Note that the module in the above address is not an actual pathname, but a resource name exposed by the rsync daemon and assigned by the administrator.
If you want to see the list of all modules offered by the rsync daemon, you can execute the following command.
    $ rsync rsync://remote_host
In addition to the double colon form, the address can also be given directly as an rsync:// URL.
    $ rsync -av source/ rsync://remote_host/module/destination
6. Incremental backup
The most important feature of rsync is that it can perform incremental backups, i.e. only files that have changed are copied by default.
In addition to direct comparison between the source and target directories, rsync also supports the use of a base directory, i.e. syncing the changes between the source and base directories to the target directory.
The first sync is a full backup, and all files are synchronized to the base directory. Each subsequent sync is an incremental backup, where only the part of the source directory that has changed relative to the base directory is synced to a new target directory. This new target directory also appears to contain all files, but only the changed files are actually stored in it; the unchanged files are hard links to the files in the base directory.
The --link-dest parameter is used to specify the base directory for synchronization.
    $ rsync -a --link-dest /compare/path /source/path /target/path
In the above command, the --link-dest parameter specifies the base directory /compare/path. The source directory /source/path is compared with the base directory, the changed files are found and copied to the target directory /target/path, and the unchanged files are created as hard links. The first backup made this way is a full backup, and all subsequent backups are incremental.
The same approach can be used in a script that backs up the user's home directory. Each run generates a new directory ${BACKUP_DIR}/${DATETIME}, using the directory that the soft link ${BACKUP_DIR}/latest currently points to as the base directory. When the sync is finished, the soft link ${BACKUP_DIR}/latest is pointed at the new backup directory, so that the next run uses it as its base.
7. Configuration items
The -a, --archive parameter indicates archive mode, which preserves all metadata such as modification time, permissions, and owner, and also synchronizes soft links.
The --append parameter specifies that the transfer of a file continues from where it was last interrupted.
The --append-verify parameter is similar to the --append parameter, but it verifies the checksum of the completed file after the transfer. If the check fails, the entire file is resent.
The -b, --backup parameter specifies that when a file that already exists in the target directory is deleted or updated, the existing file is renamed and kept as a backup; the default behavior is to delete or overwrite it. The renaming rule appends the suffix specified by the --suffix parameter, which defaults to ~.
The --backup-dir parameter specifies the directory in which backed-up files are stored, e.g. --backup-dir=/path/to/backups.
The --bwlimit parameter specifies the bandwidth limit. The default unit is KB/s, e.g. --bwlimit=100.
The -c, --checksum parameter changes the way rsync decides whether a file has changed. By default, rsync only checks whether the file size and last modified time have changed, and retransmits the file if they have. With this parameter, it decides whether to retransmit by comparing checksums of the file contents.
The --delete parameter deletes files that exist only in the target directory and not in the source directory, i.e. it ensures that the target directory is a mirror of the source directory.
The -e parameter specifies the remote shell command used to transfer data, e.g. ssh with additional options.
The --exclude parameter specifies files to be excluded from synchronization, e.g. --exclude="*.iso".
The --exclude-from parameter specifies a local file containing the file patterns to be excluded, one pattern per line.
The --existing, --ignore-non-existing parameter indicates that files and directories that do not already exist in the target directory are not synchronized.
The -h parameter indicates output in a human-readable format.
The --help argument (or -h when it is the only argument) returns help information.
The -i parameter outputs details of the file differences between the source and target directories.
The --ignore-existing parameter indicates that files which already exist in the target directory are skipped and not synchronized again.
The --include parameter specifies the files to be included when synchronizing, and is usually used in combination with --exclude.
The --link-dest parameter specifies the base directory for incremental backups.
The -m parameter specifies that empty directories are not synchronized.
The --max-size parameter sets the maximum size of files to be transferred, e.g. no more than 200KB (--max-size='200k').
The --min-size parameter sets the minimum size of files to be transferred, e.g. not less than 10KB (--min-size=10k).
The -n or --dry-run parameter simulates the operation that would be performed without actually performing it. Used with the -v parameter, it shows what would be synchronized.
The -P parameter is a combination of the --progress and --partial parameters.
The --partial parameter allows an interrupted transfer to be resumed. Without this parameter, rsync deletes files whose transfer was interrupted halfway. With it, the partially transferred files are kept in the target directory, and the interrupted transfer is resumed on the next synchronization. It usually needs to be used with --append or --append-verify.
The --partial-dir parameter specifies that partially transferred files are saved to a temporary directory, e.g. --partial-dir=.rsync-partial. It usually needs to be used with --append or --append-verify.
The --progress parameter indicates that progress is displayed.
The -r parameter indicates recursion, i.e. subdirectories are included.
The --remove-source-files parameter indicates that the sender's files are removed after a successful transfer.
The --size-only parameter indicates that only files whose size has changed are synchronized, regardless of differences in modification time.
The --suffix parameter specifies the suffix to be added to the filename when a file is backed up. The default is ~.
The -u, --update parameter indicates that files whose modification time in the target directory is newer than in the source directory are skipped, i.e. files with a newer timestamp on the receiving side are not synchronized.
The -v parameter outputs details. -vv outputs more detailed information, and -vvv outputs the most detailed information.
The --version parameter returns the version of rsync.
The -z parameter specifies that data is compressed during synchronization.
8. Reference Links
Reference http://www.ruanyifeng.com/blog/2020/08/rsync.html