
Artifacts


You have created GitHub Actions workflows to automate the building and testing of your applications, saving yourself a ton of work. You can take this one step further and automate the deployment process as well. Not only that, but you can also save test results for debugging.

To achieve this you need a way to share data between jobs — that is, between build and deployment jobs. If you're using GitHub-hosted runners, any outputs your build and test jobs generate are deleted when the workflow finishes executing. You need a way to save these outputs before they are deleted. Let's discuss how this is done using artifacts.

Artifacts

Artifacts allow us to share data between jobs or retain that data after a workflow run is complete. This data could be:

  • Production files — binary, compressed, or executable files
  • Logs, test reports, and crash dumps

The ability to share data between jobs in a workflow allows you to create efficient continuous integration and continuous delivery (CI/CD) pipelines. To use artifacts, you upload them during a workflow run. Then, when the run is complete, you can view those artifacts in the Summary tab of the GitHub Actions UI. You can click the artifact name to download it as a compressed zip file.

Viewing artifacts in the GitHub Actions UI

We can manually delete artifacts from the UI by clicking the trashcan icon. But what if there are hundreds of artifacts? Fortunately, artifacts and logs are automatically removed after 90 days by default. Better yet, you can shorten this retention period, down to a single day, in the repository settings or directly in the workflow file, thus automating the cleanup process. We will see how this is achieved in the next section.

Feel free to explore the other ways of removing unused artifacts — using the delete artifacts action from the marketplace and the REST API.
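As a starting point, the REST API's list and delete endpoints can be combined into a scheduled cleanup job. Here is a minimal sketch using the GitHub CLI, which is preinstalled on GitHub-hosted runners; the endpoints are the real ones, while the workflow name and schedule are illustrative. Note that gh api returns one page of results per call, so very large repositories would need pagination:

```yaml
name: Cleanup artifacts
on:
  schedule:
    - cron: '0 3 * * 0' # every Sunday at 03:00 UTC
jobs:
  cleanup:
    runs-on: ubuntu-latest
    permissions:
      actions: write # required to delete artifacts
    steps:
      - name: Delete artifacts in this repository
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # List artifact IDs, then delete each one via the REST API
          gh api "repos/${{ github.repository }}/actions/artifacts" \
            --jq '.artifacts[].id' |
          while read -r id; do
            gh api -X DELETE "repos/${{ github.repository }}/actions/artifacts/$id"
          done
```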

Uploading artifacts

It might take an archeologist several years to find an ancient artifact, but you only need a few lines of code!

name: Time artifact
on:
  push:
jobs:
  create:
    runs-on: ubuntu-latest
    steps:
      - name: Create a text file
        run: |
          echo "The time is $(date -u)" > time.txt

      - name: Upload time artifact
        uses: actions/upload-artifact@v3
        with:
          name: time-artifact # optional, default is artifact
          path: time.txt
          if-no-files-found: error # optional, default is warn
          retention-days: 1 # optional, maximum 90 days

The first step creates a simple text file and stores the current time in it. Then, we use the upload artifact action from the GitHub marketplace to upload the text file as an artifact. Let's discuss the inputs for the action:

  • path (required): a file, folder, or wildcard pattern to be uploaded. In our case, it is the file time.txt.
  • name (optional): specifies the name of the artifact. Defaults to artifact if no name is provided.
  • if-no-files-found (optional): the action to perform if no files are found. Options: warn, error, or ignore.
  • retention-days (optional): the duration an artifact is stored. The value ranges between 1 and 90 days. In our case, GitHub retains the artifact for 1 day.

That's it! We've successfully uploaded an artifact. Now let's discuss the upload path in more detail.

Customizing the upload path

Sometimes you might want to upload multiple files and folders or exclude some files or folders. You can easily achieve this with the help of the path input. You are already familiar with glob patterns (wildcard patterns) and this is a great place to try them out!

steps:
  - name: Using wildcards
    uses: actions/upload-artifact@v3
    with:
      name: build-artifacts
      path: |    
        ./* 
        !./.git/** 
        !./*.yml

The ./* pattern matches all files and folders using the asterisk (*) character. We can add an exclamation mark (!) before the file/folder name to exclude it. In the snippet above, the !./*.yml pattern excludes files with the .yml extension. This can be very useful when you need to exclude some files, such as dependencies or temporary support files.

Avoid uploading a large number of artifacts in a short period, as you may hit rate limits and have requests blocked. Instead, compress or archive the files into a single upload to reduce overhead and improve performance.
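One common way to follow this advice is to archive the files yourself before the upload step, so the action handles a single file instead of thousands of small ones. A sketch of this pattern, where the dist folder and artifact name are assumptions:

```yaml
steps:
  - name: Archive build output
    run: tar -czf build.tar.gz dist/ # one compressed file instead of many small ones

  - name: Upload the archive
    uses: actions/upload-artifact@v3
    with:
      name: build-archive
      path: build.tar.gz
```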

We can also choose to upload artifacts only if a previous step fails using if: failure(). In the following example, a previous step stores its logs in a failure.log file located in the current working directory. You can use this file for debugging.

steps:
  # previous steps

  - name: Upload on failure
    uses: actions/upload-artifact@v3
    if: failure()
    with:
      name: failure_artifact
      path: failure.log

Screenshot showing an artifact uploaded on failure

Downloading artifacts

We successfully uploaded the current time as an artifact. Let's go ahead and download it.

name: Time artifact
on:
  push:
jobs:
  create: ...
    
  print:
    runs-on: ubuntu-latest
    needs: create
    steps:
      - name: Download the time artifact
        uses: actions/download-artifact@v3
        with:
          name: time-artifact

      - name: Print the contents of the text file
        run: cat time.txt

We use the download artifact action from the GitHub marketplace and execute cat time.txt to display the file's contents. The needs keyword ensures that the create job completes successfully before the print job runs and downloads the artifact. You get the following output when you run the workflow:

Download artifact workflow run results

Artifacts can only be downloaded by jobs in the same workflow.

Customizing the download path

If no path is specified, the artifact is downloaded to the current working directory. You can also specify a destination path for the artifact:

steps:
  - name: Downloading artifact to a custom directory
    uses: actions/download-artifact@v3
    with:
      name: time-artifact
      path: ~/downloads # time.txt ends up in /home/runner/downloads

The path input is treated as a destination directory, so the snippet downloads time.txt into the downloads folder in the runner's home directory (~). Environment variables and context expressions are also supported here. If the name input is not specified, all artifacts are downloaded, and a separate folder, named after the artifact, is created for each one.
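The no-name case can be sketched as follows; the destination folder name is an assumption, and the ls step simply makes the per-artifact folders visible in the logs:

```yaml
steps:
  - name: Download all artifacts
    uses: actions/download-artifact@v3
    with:
      path: all-artifacts # each artifact gets its own subfolder here

  - name: Show the folder structure
    run: ls -R all-artifacts
```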

To access the destination path of an artifact, we use the download-path step output. Note the id attribute used to reference the step:

steps:
  - name: Echo artifact destination path
    uses: actions/download-artifact@v3
    id: download-artifact
    with:
      name: time-artifact
      path: ~/downloads
                  
  - name: Output the download path
    run: echo ${{ steps.download-artifact.outputs.download-path }}

Displaying the download path

Artifacts and dependency caching

Both artifacts and caching allow us to share data between jobs but they aren't the same. We've already seen what artifacts are and how to work with them. Now, let's define caching and then see how it differs from artifacts.

Caching allows us to reuse files that don't change regularly between jobs or workflows, thus speeding up workflow execution. The cache action is used here. Let's look at some of the key differences between artifacts and caching:

  1. Purpose. Artifacts: share data between jobs, for instance build and deployment jobs. Cache: speed up workflow execution by reusing files that don't change regularly.
  2. Common files. Artifacts: build and test outputs such as compressed files, binary or executable files, test reports, and logs. Cache: dependencies generated by package managers such as npm, yarn, or pip, for example node_modules.
  3. Sharing. Artifacts: can only be shared between jobs in the same workflow. Cache: cached files can be shared between jobs and workflows.
  4. Retention period. Artifacts: 90 days in public repositories and up to 400 days in private repositories. Cache: files not accessed in 7 days are automatically removed, with up to 10 GB of cache per repository.
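As a point of comparison, a typical cache step looks like this. This is a sketch assuming a Node.js project that uses npm; keying the cache on the lockfile is the usual pattern, so the key changes only when dependencies change:

```yaml
steps:
  - uses: actions/checkout@v3

  - name: Cache npm dependencies
    uses: actions/cache@v3
    with:
      path: ~/.npm # npm's cache directory on Linux runners
      key: npm-${{ hashFiles('**/package-lock.json') }} # new key when dependencies change
      restore-keys: npm- # fall back to the most recent partial match

  - name: Install dependencies
    run: npm ci
```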

Conclusion

Artifacts allow us to share data between jobs and retain that data after the workflow run is complete. Let's summarize what you've learned:

  • To use artifacts, upload them during a workflow run and then download them in subsequent jobs.
  • Use the Upload artifact action to upload artifacts and the Download artifact action to download them.
  • You can customize both the upload and download paths for greater flexibility when uploading or downloading artifacts.
  • You can specify the retention period for artifacts within the workflow to automate the cleanup process.
  • Both caching and artifacts allow data storage and sharing, but they are used for different purposes.