Cache recipes for cache, restore and save actions (#1055)

* Added outline and cache basics

* Update CACHING.md

* Added info about key and restore keys

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Review comments and some snippets

* Updated doc with comments

* Formatted sub headings

* Markdown linting

* Added paths

* Fixed heading

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Update CACHING.md

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>

* Updated paths

* Renamed file and added readme reference

* Fixed heading of a section

* Update README.md

* Moved back section to strategies

* Reverted to older version

* Fixed broken link

Co-authored-by: Bishal Prasad <bishal-pdmsft@github.com>
Sankalp Kotewar 2023-01-12 12:00:47 +05:30 committed by GitHub
parent 87396fe6b4
commit 9183691e97
WARNING! Although there is a key with this ID in the database it does not verify this commit! This commit is SUSPICIOUS.
GPG key ID: 4AEE18F83AFDEB23
2 changed files with 392 additions and 41 deletions

136
README.md

@ -14,9 +14,10 @@ In addition to this `cache` action, other two actions are also available
See ["Caching dependencies to speed up workflows"](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows).
## What's New
### v3
* Added support for caching from GHES 3.5.
* Fixed download issue for files > 2GB during restore.
* Updated the minimum runner version support from node 12 -> node 16.
@ -35,34 +36,39 @@ Refer [here](https://github.com/actions/cache/blob/v2/README.md) for previous ve
## Usage
### Pre-requisites
Create a workflow `.yml` file in your repositories `.github/workflows` directory. An [example workflow](#example-workflow) is available below. For more information, reference the GitHub Help Documentation for [Creating a workflow file](https://help.github.com/en/articles/configuring-a-workflow#creating-a-workflow-file).
Create a workflow `.yml` file in your repositories `.github/workflows` directory. An [example workflow](#example-cache-workflow) is available below. For more information, reference the GitHub Help Documentation for [Creating a workflow file](https://help.github.com/en/articles/configuring-a-workflow#creating-a-workflow-file).
If you are using this inside a container, a POSIX-compliant `tar` needs to be included and accessible in the execution path.
### Inputs
* `path` - A list of files, directories, and wildcard patterns to cache and restore. See [`@actions/glob`](https://github.com/actions/toolkit/tree/main/packages/glob) for supported patterns.
* `key` - An explicit key for restoring and saving the cache
* `key` - An explicit key for restoring and saving the cache. Refer [creating a cache key](#creating-a-cache-key).
* `restore-keys` - An ordered list of prefix-matched keys to use for restoring stale cache if no cache hit occurred for key.
* `enableCrossOsArchive` - An optional boolean when enabled, allows Windows runners to save or restore caches that can be restored or saved respectively on other platforms. Default: false
#### Environment Variables
* `SEGMENT_DOWNLOAD_TIMEOUT_MINS` - Segment download timeout (in minutes, default `60`) to abort download of the segment if not completed in the defined number of minutes. [Read more](https://github.com/actions/cache/blob/main/tips-and-workarounds.md#cache-segment-restore-timeout)
### Outputs
* `cache-hit` - A boolean value to indicate an exact match was found for the key.
* `cache-hit` - A boolean value to indicate an exact match was found for the key.
> Note: `cache-hit` will be set to `true` only when cache hit occurs for the exact `key` match. For a partial key match via `restore-keys` or a cache miss, it will be set to `false`.
See [Skipping steps based on cache-hit](#skipping-steps-based-on-cache-hit) for info on using this output
### Cache scopes
The cache is scoped to the key, [version](#cache-version) and branch. The default branch cache is available to other branches.
See [Matching a cache key](https://help.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key) for more info.
### Example workflow
### Example cache workflow
#### Restoring and saving cache using a single action
```yaml
name: Caching Primes
@ -91,7 +97,49 @@ jobs:
run: /primes.sh -d prime-numbers
```
> Note: You must use the `cache` action in your workflow before you need to use the files that might be restored from the cache. If the provided `key` matches an existing cache, a new cache is not created and if the provided `key` doesn't match an existing cache, a new cache is automatically created provided the job completes successfully.
The `cache` action provides one output `cache-hit` which is set to `true` when cache is restored using primary key and `false` when cache is restored using `restore-keys` or no cache is restored.
#### Using combination of restore and save actions
```yaml
name: Caching Primes

on: push

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v3

    - name: Restore cached Primes
      id: cache-primes-restore
      uses: actions/cache/restore@v3
      with:
        path: |
          path/to/dependencies
          some/other/dependencies
        key: ${{ runner.os }}-primes

    # ... intermediate workflow steps ...

    - name: Save Primes
      id: cache-primes-save
      uses: actions/cache/save@v3
      with:
        path: |
          path/to/dependencies
          some/other/dependencies
        key: ${{ steps.cache-primes-restore.outputs.cache-primary-key }}
```
> **Note**
> You must use the `cache` or `restore` action in your workflow before you need to use the files that might be restored from the cache. If the provided `key` matches an existing cache, a new cache is not created and if the provided `key` doesn't match an existing cache, a new cache is automatically created provided the job completes successfully.
## Caching Strategies
With the introduction of two new actions, `restore` and `save`, many more caching use cases can now be covered. Please refer to the [caching strategies](./caching-strategies.md) document to understand how you can combine the actions to achieve the desired goal.
## Implementation Examples
@ -99,31 +147,31 @@ Every programming language and framework has its own way of caching.
See [Examples](examples.md) for a list of `actions/cache` implementations for use with:
- [C# - NuGet](./examples.md#c---nuget)
- [Clojure - Lein Deps](./examples.md#clojure---lein-deps)
- [D - DUB](./examples.md#d---dub)
- [Deno](./examples.md#deno)
- [Elixir - Mix](./examples.md#elixir---mix)
- [Go - Modules](./examples.md#go---modules)
- [Haskell - Cabal](./examples.md#haskell---cabal)
- [Haskell - Stack](./examples.md#haskell---stack)
- [Java - Gradle](./examples.md#java---gradle)
- [Java - Maven](./examples.md#java---maven)
- [Node - npm](./examples.md#node---npm)
- [Node - Lerna](./examples.md#node---lerna)
- [Node - Yarn](./examples.md#node---yarn)
- [OCaml/Reason - esy](./examples.md#ocamlreason---esy)
- [PHP - Composer](./examples.md#php---composer)
- [Python - pip](./examples.md#python---pip)
- [Python - pipenv](./examples.md#python---pipenv)
- [R - renv](./examples.md#r---renv)
- [Ruby - Bundler](./examples.md#ruby---bundler)
- [Rust - Cargo](./examples.md#rust---cargo)
- [Scala - SBT](./examples.md#scala---sbt)
- [Swift, Objective-C - Carthage](./examples.md#swift-objective-c---carthage)
- [Swift, Objective-C - CocoaPods](./examples.md#swift-objective-c---cocoapods)
- [Swift - Swift Package Manager](./examples.md#swift---swift-package-manager)
- [Swift - Mint](./examples.md#swift---mint)
* [C# - NuGet](./examples.md#c---nuget)
* [Clojure - Lein Deps](./examples.md#clojure---lein-deps)
* [D - DUB](./examples.md#d---dub)
* [Deno](./examples.md#deno)
* [Elixir - Mix](./examples.md#elixir---mix)
* [Go - Modules](./examples.md#go---modules)
* [Haskell - Cabal](./examples.md#haskell---cabal)
* [Haskell - Stack](./examples.md#haskell---stack)
* [Java - Gradle](./examples.md#java---gradle)
* [Java - Maven](./examples.md#java---maven)
* [Node - npm](./examples.md#node---npm)
* [Node - Lerna](./examples.md#node---lerna)
* [Node - Yarn](./examples.md#node---yarn)
* [OCaml/Reason - esy](./examples.md#ocamlreason---esy)
* [PHP - Composer](./examples.md#php---composer)
* [Python - pip](./examples.md#python---pip)
* [Python - pipenv](./examples.md#python---pipenv)
* [R - renv](./examples.md#r---renv)
* [Ruby - Bundler](./examples.md#ruby---bundler)
* [Rust - Cargo](./examples.md#rust---cargo)
* [Scala - SBT](./examples.md#scala---sbt)
* [Swift, Objective-C - Carthage](./examples.md#swift-objective-c---carthage)
* [Swift, Objective-C - CocoaPods](./examples.md#swift-objective-c---cocoapods)
* [Swift - Swift Package Manager](./examples.md#swift---swift-package-manager)
* [Swift - Mint](./examples.md#swift---mint)
## Creating a cache key
@ -167,6 +215,7 @@ A repository can have up to 10GB of caches. Once the 10GB limit is reached, olde
Using the `cache-hit` output, subsequent steps (such as install or build) can be skipped when a cache hit occurs on the key. It is recommended to install the missing/updated dependencies in case of a partial key match when the key is dependent on the `hash` of the package file.
Example:
```yaml
steps:
- uses: actions/checkout@v3
@ -184,11 +233,11 @@ steps:
> Note: The `id` defined in `actions/cache` must match the `id` in the `if` statement (i.e. `steps.[ID].outputs.cache-hit`)
## Cache Version
Cache version is a hash [generated](https://github.com/actions/toolkit/blob/500d0b42fee2552ae9eeb5933091fe2fbf14e72d/packages/cache/src/internal/cacheHttpClient.ts#L73-L90) for a combination of compression tool used (Gzip, Zstd, etc. based on the runner OS) and the `path` of directories being cached. If two caches have different versions, they are identified as unique caches while matching. This for example, means that a cache created on `windows-latest` runner can't be restored on `ubuntu-latest` as cache `Version`s are different.
> Pro tip: [List caches](https://docs.github.com/en/rest/actions/cache#list-github-actions-caches-for-a-repository) API can be used to get the version of a cache. This can be helpful to troubleshoot cache miss due to version.
Cache version is a hash [generated](https://github.com/actions/toolkit/blob/500d0b42fee2552ae9eeb5933091fe2fbf14e72d/packages/cache/src/internal/cacheHttpClient.ts#L73-L90) for a combination of compression tool used (Gzip, Zstd, etc. based on the runner OS) and the `path` of directories being cached. If two caches have different versions, they are identified as unique caches while matching. This for example, means that a cache created on `windows-latest` runner can't be restored on `ubuntu-latest` as cache `Version`s are different.
> Pro tip: [List caches](https://docs.github.com/en/rest/actions/cache#list-github-actions-caches-for-a-repository) API can be used to get the version of a cache. This can be helpful to troubleshoot cache miss due to version.
<details>
<summary>Example</summary>
@ -239,22 +288,27 @@ jobs:
if: steps.cache-primes.outputs.cache-hit != 'true'
run: ./generate-primes -d prime-numbers
```
</details>
## Known practices and workarounds
Following are some of the known practices/workarounds which community has used to fulfill specific requirements. You may choose to use them if suits your use case. Note these are not necessarily the only or the recommended solution.
- [Cache segment restore timeout](./tips-and-workarounds.md#cache-segment-restore-timeout)
- [Update a cache](./tips-and-workarounds.md#update-a-cache)
- [Use cache across feature branches](./tips-and-workarounds.md#use-cache-across-feature-branches)
- [Cross OS cache](./tips-and-workarounds.md#cross-os-cache)
- [Force deletion of caches overriding default cache eviction policy](./tips-and-workarounds.md#force-deletion-of-caches-overriding-default-cache-eviction-policy)
* [Cache segment restore timeout](./tips-and-workarounds.md#cache-segment-restore-timeout)
* [Update a cache](./tips-and-workarounds.md#update-a-cache)
* [Use cache across feature branches](./tips-and-workarounds.md#use-cache-across-feature-branches)
* [Cross OS cache](./tips-and-workarounds.md#cross-os-cache)
* [Force deletion of caches overriding default cache eviction policy](./tips-and-workarounds.md#force-deletion-of-caches-overriding-default-cache-eviction-policy)
#### Windows environment variables
Please note that Windows environment variables (like `%LocalAppData%`) will NOT be expanded by this action. Instead, prefer using `~` in your paths which will expand to HOME directory. For example, instead of `%LocalAppData%`, use `~\AppData\Local`. For a list of supported default environment variables, see [this](https://docs.github.com/en/actions/learn-github-actions/environment-variables) page.
### Windows environment variables
Please note that Windows environment variables (like `%LocalAppData%`) will NOT be expanded by this action. Instead, prefer using `~` in your paths which will expand to HOME directory. For example, instead of `%LocalAppData%`, use `~\AppData\Local`. For a list of supported default environment variables, see [this](https://docs.github.com/en/actions/learn-github-actions/environment-variables) page.
## Contributing
We would love for you to contribute to `actions/cache`, pull requests are welcome! Please see the [CONTRIBUTING.md](CONTRIBUTING.md) for more information.
## License
The scripts and documentation in this project are released under the [MIT License](LICENSE)

297
caching-strategies.md Normal file

@ -0,0 +1,297 @@
# Caching Strategies
This document lists some of the strategies (and example workflows if possible) which can be used
- to use an effective cache key and/or path
- to solve some common use cases around saving and restoring caches
- to leverage the step inputs and outputs more effectively
## Choosing the right key
```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/cache@v3
        with:
          # `some-metadata` is a placeholder for any context value or expression
          key: ${{ some-metadata }}-cache
```
In your workflows, you can use different strategies to name your key depending on your use case, so that the cache is scoped appropriately for your needs. For example, you can have a cache specific to the OS, or one based on the lockfile, the commit SHA, or even the workflow run.
### Updating cache for any change in the dependencies
One of the most common use cases is to use the hash of the lockfile as the key. This way, the same cache is restored for a lockfile until the dependencies listed in it change.
```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    key: cache-${{ hashFiles('**/lockfiles') }}
```
### Using restore keys to download the closest matching cache
If no cache matches the primary key, restore keys can be used to download the closest matching cache that was most recently created. The build/install step then only needs to fetch a handful of newer dependencies, which saves build time.
```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    key: cache-npm-${{ hashFiles('**/lockfiles') }}
    restore-keys: |
      cache-npm-
```
The restore keys can be provided as a complete name or as a prefix. Read more [here](https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key) on how a cache key is matched using restore keys.
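As a minimal sketch of prefix matching, the step below lists two restore keys ordered from most to least specific. The `npm` segment and the `package-lock.json` pattern are assumptions chosen for illustration; substitute the package manager and lockfile your project uses.

```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
    # Primary key: an exact match on OS and lockfile hash
    key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
    # Fallbacks, tried in order when the primary key misses
    restore-keys: |
      ${{ runner.os }}-npm-
      ${{ runner.os }}-
```

Listing the narrower prefix first means a cache from the same package manager is preferred over any other cache created on the same OS. A restore through either fallback is a partial match, so `cache-hit` stays `false` and the install step should still run.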
### Separate caches by Operating System
In workflows with a matrix that runs on multiple operating systems, the caches can be stored separately for each of them. This can be combined with `hashFiles` when multiple caches are generated per OS.
```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    key: ${{ runner.os }}-cache
```
### Creating a short lived cache
Caches scoped to a particular workflow run or run attempt can be stored and referenced by using the run id/attempt in the key. This is an effective way to have a short-lived cache.
```yaml
key: cache-${{ github.run_id }}-${{ github.run_attempt }}
```
Along similar lines, the commit SHA can be used to create a very specialized and short-lived cache.
```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    key: cache-${{ github.sha }}
```
### Using multiple factors while forming a key depending on the need
A cache key can be formed by combining more than one piece of metadata or evaluated information.
```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
```
The [GitHub Context](https://docs.github.com/en/actions/learn-github-actions/contexts#github-context) can be used to create keys using the workflow's metadata.
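For illustration, the sketch below builds a key from two `github` context properties together with the runner OS and a lockfile hash. The particular combination (`github.job` and `github.ref_name`) is an assumption rather than a recommendation; pick whichever metadata matches the scope you want.

```yaml
- uses: actions/cache@v3
  with:
    path: |
      path/to/dependencies
    # One cache per OS, job, and branch, invalidated when the lockfiles change
    key: ${{ runner.os }}-${{ github.job }}-${{ github.ref_name }}-${{ hashFiles('**/lockfiles') }}
    # Fall back to the latest cache for the same OS, job, and branch
    restore-keys: |
      ${{ runner.os }}-${{ github.job }}-${{ github.ref_name }}-
```

Adding more factors makes a key more specific, which lowers the hit rate and increases the number of caches counted against the repository's storage limit.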
## Restoring Cache
### Understanding how to choose path
While setting paths for caching dependencies, it is important to give the correct path for the hosted runner you are using, and to account for whether the action is running in a container job. Using a different `path` for save and restore will result in a cache miss.
Below are the GitHub-hosted runner specific paths to take care of when writing a workflow that saves/restores caches across operating systems.
#### Ubuntu Paths
Home directory (`~/`) = `/home/runner`
`${{ github.workspace }}` = `/home/runner/work/repo/repo`
`process.env['RUNNER_TEMP']` = `/home/runner/work/_temp`
`process.cwd()` = `/home/runner/work/repo/repo`
#### Windows Paths
Home directory (`~/`) = `C:\Users\runneradmin`
`${{ github.workspace }}` = `D:\a\repo\repo`
`process.env['RUNNER_TEMP']` = `D:\a\_temp`
`process.cwd()` = `D:\a\repo\repo`
#### MacOS Paths
Home directory (`~/`) = `/Users/runner`
`${{ github.workspace }}` = `/Users/runner/work/repo/repo`
`process.env['RUNNER_TEMP']` = `/Users/runner/work/_temp`
`process.cwd()` = `/Users/runner/work/repo/repo`
Where:
`cwd()` = Current working directory where the repository code resides.
`RUNNER_TEMP` = Environment variable defined for temporary storage location.
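One way to keep the restore and save paths identical is to define the path once and reference it from both steps. The sketch below assumes a hypothetical `DEPS_DIR` variable and a hypothetical `~/example-deps` directory; `~` is expanded by the action to the home directory of whichever runner the job lands on.

```yaml
jobs:
  build:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    env:
      # Hypothetical location; defined once so restore and save cannot drift apart
      DEPS_DIR: '~/example-deps'
    steps:
      - uses: actions/checkout@v3
      - uses: actions/cache/restore@v3
        id: restore-cache
        with:
          path: ${{ env.DEPS_DIR }}
          key: ${{ runner.os }}-deps
      # ... steps that populate or use the directory ...
      - uses: actions/cache/save@v3
        if: steps.restore-cache.outputs.cache-hit != 'true'
        with:
          path: ${{ env.DEPS_DIR }}
          key: ${{ steps.restore-cache.outputs.cache-primary-key }}
```

Because the key includes `runner.os`, each operating system in the matrix keeps its own cache even though the `path` string is the same everywhere.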
### Make cache read only / Reuse cache from centralized job
If a centralized job creates and saves a cache that other jobs in your repository reuse, those jobs can use the `actions/cache/restore` action. It only restores the cache and never saves it, which makes the cache effectively read-only for them.
```yaml
steps:
  - uses: actions/checkout@v3

  - uses: actions/cache/restore@v3
    id: cache
    with:
      path: path/to/dependencies
      key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}

  - name: Install Dependencies
    if: steps.cache.outputs.cache-hit != 'true'
    run: /install.sh

  - name: Build
    run: /build.sh

  - name: Publish package to public
    run: /publish.sh
```
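The centralized job that produces the cache for the read-only consumers above could look like the following sketch. The workflow name, the `main` branch filter, and `./install.sh` are assumptions for illustration. Saving from the default branch matters because caches created there are available to other branches.

```yaml
name: Warm dependency cache   # hypothetical workflow name

on:
  push:
    branches: [main]          # assumes `main` is the default branch

jobs:
  warm-cache:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Install Dependencies
        run: ./install.sh     # hypothetical script that fills path/to/dependencies

      # Only this job writes the cache; consumers use actions/cache/restore
      - uses: actions/cache/save@v3
        with:
          path: path/to/dependencies
          key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
```

The `path` and `key` here must match the ones used by the restoring jobs, otherwise the consumers will miss.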
### Failing/Exiting the workflow if cache with exact key is not found
You can use the output of this action to exit the workflow on cache miss. This way you can restrict your workflow to only initiate the build when `cache-hit` occurs, in other words, cache with exact key is found.
```yaml
steps:
  - uses: actions/checkout@v3

  - uses: actions/cache/restore@v3
    id: cache
    with:
      path: path/to/dependencies
      key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}

  - name: Check cache hit
    if: steps.cache.outputs.cache-hit != 'true'
    run: exit 1

  - name: Build
    run: /build.sh
```
## Saving cache
### Reusing primary key from restore cache as input to save action
If you want to avoid re-computing the cache key in the `save` action, the outputs from the `restore` action can be used as input to the `save` action.
```yaml
- uses: actions/cache/restore@v3
  id: restore-cache
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}

# ... intermediate workflow steps ...

- uses: actions/cache/save@v3
  with:
    path: |
      path/to/dependencies
      some/other/dependencies
    # The restore action exposes the key it was given as `cache-primary-key`
    key: ${{ steps.restore-cache.outputs.cache-primary-key }}
```
### Re-evaluate cache key while saving cache
On the other hand, the key can also be explicitly re-computed while executing the save action. This helps in cases where the lockfiles are generated during the build.
Let's say we have a restore step that computes the key at runtime:
```yaml
- uses: actions/cache/restore@v3
  id: restore-cache
  with:
    key: cache-${{ hashFiles('**/lockfiles') }}
```
Case 1: Where a user would want to reuse the key as it is
```yaml
- uses: actions/cache/save@v3
  with:
    key: ${{ steps.restore-cache.outputs.cache-primary-key }}
```
Case 2: Where the user would want to re-evaluate the key
```yaml
- uses: actions/cache/save@v3
  with:
    # The pattern passed to hashFiles must be a quoted string
    key: npm-cache-${{ hashFiles('package-lock.json') }}
```
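Putting the two steps together, a job whose build generates or updates the lockfile might look like the sketch below. The `./build.sh` script and the `cache-` prefix are assumptions. The expression in each step's `with` block is evaluated when that step starts, so the `save` step hashes the lockfiles as they exist after the build.

```yaml
steps:
  - uses: actions/checkout@v3

  - uses: actions/cache/restore@v3
    with:
      path: path/to/dependencies
      # Hash of the lockfiles as checked out, before the build touches them
      key: cache-${{ hashFiles('**/lockfiles') }}
      restore-keys: |
        cache-

  - name: Build
    run: ./build.sh   # hypothetical script, assumed to (re)generate the lockfiles

  - uses: actions/cache/save@v3
    with:
      path: path/to/dependencies
      # Re-evaluated here, so it reflects the lockfiles produced by the build
      key: cache-${{ hashFiles('**/lockfiles') }}
```

If the build leaves the lockfiles unchanged, both expressions produce the same key and the sketch behaves like Case 1.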
### Saving cache even if the build fails
There can be cases where a cache should be saved even if the build job fails. For example, a job can fail due to flaky tests but the caches can still be re-used. You can use `actions/cache/save` action to save the cache by using `if: always()` condition.
Similarly, `actions/cache/save` action can be conditionally used based on the output of the previous steps. This way you get more control on when to save the cache.
```yaml
steps:
  - uses: actions/checkout@v3

  # ... restore if need be ...

  - name: Build
    run: /build.sh

  - uses: actions/cache/save@v3
    if: always() # or any other condition to invoke the save action
    with:
      path: path/to/dependencies
      key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
```
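As a sketch of a narrower condition, the steps below save the cache even when the build fails, but only if the restore step did not already find an exact match, so an existing cache entry is not re-saved. The step id `restore-cache` is an assumption introduced for this example.

```yaml
steps:
  - uses: actions/checkout@v3

  - uses: actions/cache/restore@v3
    id: restore-cache
    with:
      path: path/to/dependencies
      key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}

  - name: Build
    run: /build.sh

  - uses: actions/cache/save@v3
    # Run regardless of the build result, but skip on an exact cache hit
    if: always() && steps.restore-cache.outputs.cache-hit != 'true'
    with:
      path: path/to/dependencies
      key: ${{ steps.restore-cache.outputs.cache-primary-key }}
```

Other step outputs or status functions such as `success()` or `failure()` can be combined in the same `if` expression to tighten or loosen the condition.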
### Saving cache once and reusing in multiple workflows
In case of multi-module projects, where the built artifact of one project needs to be reused in subsequent child modules, the need of rebuilding the parent module again and again with every build can be eliminated. The `actions/cache` or `actions/cache/save` action can be used to build and save the parent module artifact once, and restored multiple times while building the child modules.
#### Step 1 - Build the parent module and save it
```yaml
steps:
  - uses: actions/checkout@v3

  - name: Build
    run: ./build-parent-module.sh

  - uses: actions/cache/save@v3
    id: cache
    with:
      path: path/to/dependencies
      key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}
```
#### Step 2 - Restore the built artifact from cache using the same key and path
```yaml
steps:
  - uses: actions/checkout@v3

  - uses: actions/cache/restore@v3
    id: cache
    with:
      path: path/to/dependencies
      key: ${{ runner.os }}-${{ hashFiles('**/lockfiles') }}

  - name: Install Dependencies
    if: steps.cache.outputs.cache-hit != 'true'
    run: ./install.sh

  - name: Build
    run: ./build-child-module.sh

  - name: Publish package to public
    run: ./publish.sh
```