Utilize Cache in Python App Build with GitHub Actions

Mar 30, 2025

## Overview

In this article, we will explore how to use caching effectively when building Docker images for Python applications with GitHub Actions. Caching can significantly reduce build times by skipping dependency installation when neither the libraries nor the Dockerfile have changed. We’ll cover best practices for writing Dockerfiles, configuring GitHub Actions, and combining different caching strategies to optimize your workflow.

## Steps

### Step 1: Write an Efficient Dockerfile

To maximize caching benefits, structure your Dockerfile correctly. Here’s a simple example:

FROM python:3.12.3-slim

WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY . .
ENV PORT=8080
EXPOSE 8080

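If you need to install additional system packages with apt-get, place that step before copying requirements.txt, so that a change to your Python dependencies does not invalidate the apt layer. Note that Debian-based images delete downloaded packages after each install by default, so that hook has to be removed for the cache mount to be useful. There is also no need to run rm -rf /var/lib/apt/lists/* here: the lists live in the cache mount, not in the image layer.

# Keep downloaded packages so the cache mounts below are actually populated
RUN rm -f /etc/apt/apt.conf.d/docker-clean; \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        git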
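To see why the instruction order matters, here is a minimal sketch of the layer cache rule in Python. It is a simplified model, not how BuildKit is actually implemented: the `Layer` class and `rebuilt_layers` function are hypothetical names, and `app.py` stands in for the application source. The assumption is that a layer is rebuilt as soon as one of its input files changes, and every layer after it is rebuilt as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One Dockerfile instruction and the build-context files it reads (illustrative model)."""
    instruction: str
    inputs: frozenset = frozenset()


def rebuilt_layers(layers, changed_files):
    """Return the instructions that miss the cache.

    Simplified rule: the first layer whose inputs changed is rebuilt,
    and so is every layer after it.
    """
    changed = set(changed_files)
    invalidated = False
    rebuilt = []
    for layer in layers:
        if layer.inputs & changed:
            invalidated = True
        if invalidated:
            rebuilt.append(layer.instruction)
    return rebuilt


# The Dockerfile from this step; app.py is an assumed name for the application source.
DOCKERFILE = [
    Layer("COPY requirements.txt requirements.txt", frozenset({"requirements.txt"})),
    Layer("RUN pip install -r requirements.txt"),
    Layer("COPY . .", frozenset({"requirements.txt", "app.py"})),
]

# Only the application code changed: pip install stays cached.
print(rebuilt_layers(DOCKERFILE, ["app.py"]))
# The dependencies changed: everything from the first COPY onward is rebuilt.
print(rebuilt_layers(DOCKERFILE, ["requirements.txt"]))
```

In this model, editing only the application code rebuilds just the final COPY layer, while editing requirements.txt rebuilds all three. That is the reason for copying requirements.txt and installing dependencies before copying the rest of the source.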

### Step 2: Configure GitHub Actions with Caching

In your GitHub Actions workflow, we can use the docker/build-push-action and specify the caching options. Below is an example configuration:

      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

On GitHub-hosted runners (as opposed to self-hosted runners), you can simply use the gha cache backend, with no separate cache registry to set up. A complete workflow looks like this:

name: build-and-push

on:
  release:
    types:
      - published
  push:
    tags:
      - 'v*'
    branches:
      - main # needed so that pull requests can reuse the cache saved on main
  pull_request:
    paths:
      - .github/workflows/build-and-push.yml
      - docker-layer-cache/*

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-push-image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to the Container registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Build and push Docker image
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

### Step 3: Handle Caching for Pip

To further optimize the build process, utilize caching for Python dependencies. You can mount the pip cache in your Dockerfile:

FROM python:3.12.7-slim

WORKDIR /app

COPY requirements.txt .

RUN --mount=type=cache,target=/root/.cache/pip,sharing=locked \
    pip install -r requirements.txt

COPY . .

ENV PORT=8080
EXPOSE 8080

CMD ["python", "app.py"]

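Cache mounts are not part of the image layers, so the gha layer cache does not persist them between workflow runs. To keep the pip cache across runs, restore it with actions/cache, inject it into the build with buildkit-cache-dance, and save it again after the build: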

... # replace the steps after the metadata step with the following
      - name: Restore pip cache
        uses: actions/cache/restore@d4323d4df104b026a6aa633fdb11d772146be0bf # v4.2.2
        id: pip-cache
        with:
          path: root-dot-cache-pip
          key: pip-cache-${{ hashFiles('requirements.txt') }}
          restore-keys: |
            pip-cache-

      # Use buildkit-cache-dance to inject the restored cache into the Docker build
      - name: Inject cache into Docker build
        uses: reproducible-containers/buildkit-cache-dance@5de31fc1534ed8789e63d41ea933c5df9944a261 # v3.1.0
        with:
          cache-map: |
            {
              "root-dot-cache-pip": "/root/.cache/pip"
            }
          skip-extraction: ${{ steps.pip-cache.outputs.cache-hit }}

      # Build the Docker image and push it when needed
      - name: Build and push Docker image
        uses: docker/build-push-action@471d1dc4e07e5cdedd4c2171150001c434f0b7a4 # v6.15.0
        with:
          context: docker-layer-cache
          file: docker-layer-cache/Dockerfile.cache
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILDKIT_INLINE_CACHE=1

      # Saving requires actions/cache/save; actions/cache/restore would only download
      - name: Save pip cache
        uses: actions/cache/save@d4323d4df104b026a6aa633fdb11d772146be0bf # v4.2.2
        if: github.ref_name == 'main'
        with:
          path: root-dot-cache-pip
          key: ${{ steps.pip-cache.outputs.cache-primary-key }}
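The key and restore-keys settings above decide which cache gets restored. The following Python sketch illustrates the idea under simplifying assumptions: `cache_key` only approximates hashFiles with a single SHA-256 of the file content (it is not GitHub's exact algorithm), and `restore` is a hypothetical helper that ignores branch scoping and models only exact matches followed by prefix matches on restore-keys.

```python
import hashlib


def cache_key(requirements_text: str) -> str:
    """Build a key that changes whenever the content of requirements.txt changes.

    Approximation of pip-cache-${{ hashFiles('requirements.txt') }}.
    """
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()
    return f"pip-cache-{digest}"


def restore(saved_keys, key, restore_keys):
    """Return (matched_key, cache_hit) for a list of saved keys, oldest first.

    An exact match on key is a cache hit. Otherwise the newest saved key
    starting with one of the restore_keys prefixes is used as a fallback,
    and cache_hit stays False.
    """
    if key in saved_keys:
        return key, True
    for prefix in restore_keys:
        for candidate in reversed(saved_keys):
            if candidate.startswith(prefix):
                return candidate, False
    return None, False


old_key = cache_key("requests==2.32.0\n")
new_key = cache_key("requests==2.32.3\n")

# Unchanged requirements.txt: exact match.
print(restore([old_key], old_key, ["pip-cache-"]))
# Changed requirements.txt: falls back to the older cache via the prefix.
print(restore([old_key], new_key, ["pip-cache-"]))
```

With a fallback match, pip can still reuse most downloaded packages from the older cache and only fetches what changed, which is why the restore-keys prefix is worth keeping.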

### Step 4: Understand Cache Behavior

It’s crucial to comprehend how caching works in GitHub Actions:

  • Pull request runs can restore caches saved by earlier runs of the same PR, or caches saved on the main branch.
  • Runs on the main branch can only restore caches saved on the main branch.
  • The cache is invalidated by changes to requirements.txt or to any other file that affects the Docker layers.

On a cache hit, building the image takes only a few seconds.
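The scope rules above can be sketched as a small Python function. This is an illustrative model only: `readable_scopes` is a hypothetical name, and it assumes that a run may read caches from its own ref, from the base branch of a pull request, and from the default branch.

```python
def readable_scopes(ref, default_branch="refs/heads/main", base_ref=None):
    """List the refs whose caches a workflow run may restore, in lookup order.

    ref:            the ref the run was triggered for
    default_branch: the repository's default branch
    base_ref:       the PR's target branch, or None for non-PR runs
    """
    scopes = [ref]
    for extra in (base_ref, default_branch):
        if extra and extra not in scopes:
            scopes.append(extra)
    return scopes


# A pull request targeting main can read its own caches and those saved on main.
print(readable_scopes("refs/pull/12/merge", base_ref="refs/heads/main"))
# A run on main can only read caches saved on main.
print(readable_scopes("refs/heads/main"))
```

This is why the workflow also runs on pushes to main: without a cache saved on main, the first run of every new pull request would start cold.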

### Tips and Best Practices

  • Keep layers that change frequently (like COPY . . for your application source) late in the Dockerfile, and steps that rarely change (like apt-get install) early, to prevent unnecessary cache invalidation.
  • Be cautious with self-hosted runners, as they may experience slower cache access depending on the region.
  • If you find your cache is not being utilized as expected, check the order of your commands in the Dockerfile and ensure that you’re not inadvertently invalidating the cache.

## Summary

By implementing caching strategies in your Docker builds with GitHub Actions, you can significantly reduce build times and improve developer efficiency. Remember to structure your Dockerfile for optimal caching, configure your GitHub Actions workflow correctly, and leverage pip caching to avoid redundant installations.

Written by Masato Naka

An SRE, mainly working on Kubernetes. CKA (Feb 2021). His Interests include Cloud-Native application development, and machine learning.
