You have created GitHub Actions workflows to automate the building and testing of your applications, saving yourself a ton of work. You can take this one step further and automate the deployment process as well. Not only that, but you can also save test results for debugging.
To achieve this you need a way to share data between jobs — that is, between build and deployment jobs. If you're using GitHub-hosted runners, any outputs your build and test jobs generate are deleted when the workflow finishes executing. You need a way to save these outputs before they are deleted. Let's discuss how this is done using artifacts.
Artifacts
Artifacts allow us to share data between jobs or retain that data after a workflow run is complete. This data could be:
- Production files — binary, compressed, or executable files
- Logs, test reports, and crash dumps
The ability to share data between jobs in a workflow allows you to create efficient continuous integration and continuous delivery (CI/CD) pipelines. To use artifacts, you upload them during a workflow run. Then, when the run is complete, you can view those artifacts in the Summary tab of the GitHub Actions UI. You can click the artifact name to download it as a compressed zip file.
We can manually delete artifacts from the UI by clicking the trashcan icon. But what if there are hundreds of artifacts? Fortunately, artifacts and logs are automatically removed after 90 days. Better yet, you can shorten this retention period, down to as little as one day, either in the repository settings or directly in the workflow file, thus automating the cleanup process. We will see how this is achieved in the next section.
Uploading artifacts
It might take an archeologist several years to find an ancient artifact, but you only need a few lines of code!
```yaml
name: Time artifact
on:
  push:
jobs:
  create:
    runs-on: ubuntu-latest
    steps:
      - name: Create a text file
        run: |
          echo "The time is $(date -u)" > time.txt
      - name: Upload time artifact
        uses: actions/upload-artifact@v3
        with:
          name: time-artifact # optional, default is artifact
          path: time.txt
          if-no-files-found: error # optional, default is warn
          retention-days: 1 # optional, maximum 90 days
```
The first step creates a simple text file and stores the current time in it. Then, we use the upload artifact action from the GitHub marketplace to upload the text file as an artifact. Let's discuss the inputs for the action:
- `path` (required): a file, folder, or wildcard pattern to be uploaded. In our case, it is the file `time.txt`.
- `name` (optional): specifies the name of the artifact. Defaults to `artifact` if no name is provided.
- `if-no-files-found` (optional): the action to perform if no files are found. Options: `warn`, `error`, or `ignore`.
- `retention-days` (optional): the duration an artifact is stored. The value ranges between 1 and 90 days. In our case, GitHub retains the artifact for 1 day.
That's it! We've successfully uploaded an artifact. Now let's discuss the upload path in more detail.
Customizing the upload path
Sometimes you might want to upload multiple files and folders or exclude some files or folders. You can easily achieve this with the help of the path input. You are already familiar with glob patterns (wildcard patterns) and this is a great place to try them out!
```yaml
steps:
  - name: Using wildcards
    uses: actions/upload-artifact@v3
    with:
      name: build-artifacts
      path: |
        ./*
        !./.git/**
        !./*.yml
```
The `./*` pattern uses the asterisk (`*`) to match all files and folders in the working directory. Adding an exclamation mark (`!`) before a pattern excludes whatever it matches: in the snippet above, `!./.git/**` excludes the Git metadata folder and `!./*.yml` excludes files with the `.yml` extension. This is very useful when you need to leave out files such as dependencies or temporary support files.
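For example, here is a minimal sketch that uploads a build folder and its logs while skipping source maps (the `dist/` and `logs/` paths are hypothetical; substitute your project's actual output locations):

```yaml
steps:
  - name: Upload selected outputs
    uses: actions/upload-artifact@v3
    with:
      name: dist-and-logs
      path: |
        dist/
        logs/*.log
        !dist/**/*.map
```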
We can also choose to upload artifacts only if a previous step fails by adding the `if: failure()` conditional. In the following example, a previous step stores its logs in a `failure.log` file in the current working directory; uploading it as an artifact means you can download it later for debugging.
```yaml
steps:
  # previous steps
  - name: Upload on failure
    uses: actions/upload-artifact@v3
    if: failure()
    with:
      name: failure_artifact
      path: failure.log
```
Downloading artifacts
We successfully uploaded the current time as an artifact. Let's go ahead and download it.
```yaml
name: Time artifact
on:
  push:
jobs:
  create: ...
  print:
    runs-on: ubuntu-latest
    needs: create
    steps:
      - name: Download the time artifact
        uses: actions/download-artifact@v3
        with:
          name: time-artifact
      - name: Print the contents of the text file
        run: cat time.txt
```
We use the download artifact action from the GitHub marketplace and execute `cat time.txt` to display the file's contents. The `needs` keyword ensures that the `create` job completes successfully before the `print` job runs and downloads the artifact.
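When you run the workflow, the print job's log shows the contents of the text file, along the lines of the following (the timestamp is illustrative):

```
The time is Tue Jan 2 12:00:00 UTC 2024
```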
Customizing the download path
If no path is specified, the artifact is downloaded to the current working directory. You can also specify a destination path for the artifact:
```yaml
steps:
  - name: Downloading artifact to the home directory
    uses: actions/download-artifact@v3
    with:
      name: time-artifact
      path: ~/ # time.txt lands at /home/runner/time.txt
```
The snippet downloads the artifact to the home directory (`~`). Environment variables and context expressions are also supported here. If the `name` input is not specified, all artifacts in the run are downloaded, and a separate folder, named after each artifact, is created to hold its files.
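For example, here is a minimal sketch that downloads every artifact at once (the `artifacts/` destination folder is a hypothetical choice):

```yaml
steps:
  - name: Download all artifacts
    uses: actions/download-artifact@v3
    with:
      # each artifact lands in its own subfolder,
      # e.g. artifacts/time-artifact/time.txt
      path: artifacts/
```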
To access the destination path of an artifact, we use the download-path step output. Note the id attribute used to reference the step:
```yaml
steps:
  - name: Echo artifact destination path
    uses: actions/download-artifact@v3
    id: download-artifact
    with:
      name: time-artifact
      path: ~/
  - name: Output the download path
    run: echo ${{ steps.download-artifact.outputs.download-path }}
```
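Because the output holds the destination directory, later steps can build paths from it. Here is a sketch that reads the downloaded file through the output, assuming the download step above:

```yaml
steps:
  # ... the download step with id: download-artifact from above
  - name: Print the file via the download path
    run: cat "${{ steps.download-artifact.outputs.download-path }}/time.txt"
```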
Artifacts and dependency caching
Both artifacts and caching allow us to share data between jobs but they aren't the same. We've already seen what artifacts are and how to work with them. Now, let's define caching and then see how it differs from artifacts.
Caching allows us to reuse files that don't change regularly between jobs or workflows, thus speeding up workflow execution. The cache action is used here; a short caching sketch follows the comparison table below. Let's look at some of the key differences between artifacts and caching:
| # | Key difference | Artifacts | Cache |
|---|---|---|---|
| 1 | Purpose | Sharing data between jobs, for instance, build and deployment jobs. | Speeding up workflow execution by reusing files that don't change regularly. |
| 2 | Common files | Build and test outputs: compressed files, binary or executable files, test reports, and logs. | Dependencies: files generated by package managers such as npm, yarn, or pip. For example: node_modules. |
| 3 | Sharing | Artifacts can only be shared between jobs in the same workflow. | Cached files can be shared between jobs and workflows. |
| 4 | Retention period | 90 days in public repositories and up to 400 days in private repositories. | Cached files that have not been accessed in 7 days are automatically removed. Up to 10 GB of cache files per repository. |
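As promised, here is a minimal sketch of the cache action for a Node.js project. The npm cache path and the lockfile-based key are illustrative assumptions, not requirements:

```yaml
steps:
  - name: Cache npm dependencies
    uses: actions/cache@v3
    with:
      # cache npm's local download cache rather than node_modules itself
      path: ~/.npm
      # the key changes whenever the lockfile changes, invalidating the cache
      key: ${{ runner.os }}-npm-${{ hashFiles('**/package-lock.json') }}
      restore-keys: |
        ${{ runner.os }}-npm-
```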
Conclusion
Artifacts allow us to share data between jobs and retain that data after the workflow run is complete. Let's summarize what you've learned:
- To use artifacts, upload them during a workflow run and then download them in subsequent jobs.
- Use the Upload artifact action to upload artifacts and the Download artifact action to download them.
- You can customize both the upload and download paths for greater flexibility when uploading or downloading artifacts.
- You can specify the retention period for artifacts within the workflow to automate the cleanup process.
- Both caching and artifacts allow data storage and sharing, but they are used for different purposes.