Caching is one of the most effective ways to speed up CI. The main concern when using a cache is keeping its hit rate as high as possible. For example, after caching the entire target directory, when should the cache be updated? The best time is when the dependencies change, which means a change to Cargo.lock for Rust or package-lock.json for Node.
Let’s see how to use the cache action to achieve this. It has three main parameters:
- key: the cache ID; the whole cache space can be seen as a set of key-value pairs
- path: the path to cache
- restore-keys: specifies which cache to select when the key does not hit
The following is an example of caching a Rust project.
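A minimal sketch of such a cache step is shown here. It assumes the `actions/cache@v4` action, a toolchain file named `rust-toolchain`, and a `target` directory at the repository root; these names, along with the job name `build`, are assumptions for illustration and are not taken from the original listing.

```yaml
# Sketch of a workflow; the workflow and job names are placeholders.
name: CI
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Cache target directory
        uses: actions/cache@v4
        with:
          # Directory to save and restore (assumed to be ./target).
          path: target
          # Four parts: fixed value, OS, toolchain file hash, Cargo.lock hash.
          key: debug-${{ runner.os }}-${{ hashFiles('rust-toolchain') }}-${{ hashFiles('Cargo.lock') }}
          # Fallback prefixes, tried from top to bottom when key does not hit.
          restore-keys: |
            debug-${{ runner.os }}-${{ hashFiles('rust-toolchain') }}-
            debug-${{ runner.os }}-

      - name: Build
        run: cargo build
```

The order of the parts in `key` matters because each `restore-keys` entry is matched as a prefix, so the parts that change least often should come first.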
First look at the definition of key, which is divided into four parts:
- a fixed value, debug, to distinguish it from release compilation
- a variable, the operating system
- the hash of the toolchain file
- the hash of the Cargo.lock file
When the key matches an existing cache entry, this is called a cache hit, and there is no need to update the cache at the end of the workflow run. So when designing the key, two things need to be kept in mind:
- the key should reflect cache changes; the four parts above together identify a complete, valid cache
- more than one field may reflect cache changes, and the fields that change most frequently should be placed last; the four parts above follow this order
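Whether a run got an exact hit can be checked from later steps through the `cache-hit` output of `actions/cache`. The fragment below is a sketch of two entries in a job's steps list; the step id `cargo-cache` and the toolchain file name `rust-toolchain` are hypothetical.

```yaml
# Fragment of a job's steps list.
- name: Cache target directory
  id: cargo-cache            # hypothetical id, used to read the output below
  uses: actions/cache@v4
  with:
    path: target
    key: debug-${{ runner.os }}-${{ hashFiles('rust-toolchain') }}-${{ hashFiles('Cargo.lock') }}

- name: Report cache status
  # cache-hit is 'true' only when the full key matched exactly.
  run: echo "exact key hit: ${{ steps.cargo-cache.outputs.cache-hit }}"
```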
The second design point needs to be seen in conjunction with restore-keys. If the key does not match when the workflow runs, the cache contents have changed, either in the lock file or in the toolchain. In a conventional cache design, a key mismatch means the cache cannot be used.
However, the cache can still be useful in this case. For example, if the project has 10 dependencies and only one of them has been updated, the cache is still valid for the remaining 9. In this situation, restore-keys decides which cache to select.
restore-keys specifies a list of candidate key prefixes used to find a fallback cache when the key does not hit.
Since the candidates are tried from top to bottom, the entries in restore-keys generally get shorter going down the list. For the example above, the cache is selected as follows.
- If only Cargo.lock has changed, the cache matched by the first restore-keys entry is used, as it is the most reusable
- If both Cargo.lock and the toolchain file have changed, the second entry is used for the same reason
- If you are compiling a release build, no cache is used regardless of changes to Cargo.lock and the toolchain file; this avoids mixing with the debug cache, which could lead to an oversized cache
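The release case follows from prefix matching: a key that begins with a different fixed value can never match an entry that begins with debug. As a sketch, a hypothetical release job could keep its own separate cache by swapping the fixed value, again assuming `actions/cache@v4` and a toolchain file named `rust-toolchain`:

```yaml
# Fragment of a release job's steps list (hypothetical).
- name: Cache target directory (release)
  uses: actions/cache@v4
  with:
    path: target
    # Same layout as the debug key, with a different fixed value in front.
    key: release-${{ runner.os }}-${{ hashFiles('rust-toolchain') }}-${{ hashFiles('Cargo.lock') }}
    restore-keys: |
      release-${{ runner.os }}-${{ hashFiles('rust-toolchain') }}-
      release-${{ runner.os }}-
```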
Whenever the key is not hit, the cache is saved under the new key after the workflow run, so that the next run gets a direct hit. As you can see, a well-designed restore-keys list ensures that the most useful cache is always the “hot” one.
Caveats
For security and cost reasons, GitHub places the following restrictions on caching.
- If a cache is created on the main branch, all branches derived from main can use it; however, caches created by branches B1 and B2, both derived from main, cannot be shared with each other
- Caches that have not been accessed for 7 days will be deleted automatically.
- The cache space is limited to 10 GB; beyond that, caches are evicted in LRU order by access time