A Guide to Setting Up a Safe and Simple Database Schema Migration Workflow for SQLAlchemy with Atlas

Masato Naka
7 min read · Sep 29, 2024


## 1. Overview

As web developers, managing database schemas can often become a complex and time-consuming task, especially as applications grow and evolve. This is where Atlas comes into play. Atlas is a robust tool that simplifies and standardizes database schema migrations, allowing developers to focus more on building features rather than wrestling with database changes. In this article, we’ll explore why you should consider using Atlas for database management and provide a straightforward, step-by-step guide to integrating it with Python’s SQLAlchemy.

## 2. Steps

### Prerequisites

Before diving in, ensure that you have the necessary tools installed. You need the Atlas CLI and the `atlas-provider-sqlalchemy` package, which lets Atlas read your SQLAlchemy models. Install both with the following commands:

```sh
curl -sSf https://atlasgo.sh | sh
pip install atlas-provider-sqlalchemy
```

### Step 1: Define Your Database Schema with SQLAlchemy

First, you will define your database schema using SQLAlchemy. This powerful ORM library enables you to create structured models in Python. Here’s a simple example of how to define a `User` model:

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base  # moved to sqlalchemy.orm in 1.4; the ext.declarative import is deprecated

Base = declarative_base()


class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)


# Create an SQLite database
engine = create_engine('sqlite:///example.db')
# Base.metadata.create_all(engine)  # no longer needed once Atlas manages the schema
```

In this snippet, we define a `User` class that maps to a `users` table with `id`, `name`, and `age` columns.

When Atlas manages the schema, we no longer call `Base.metadata.create_all(engine)`: Atlas generates versioned migration files from the models and applies them instead.
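To see roughly what the provider hands to Atlas, you can print the DDL that SQLAlchemy derives from the model yourself. This is only a hedged illustration of the idea, not part of the workflow; `atlas-provider-sqlalchemy` does the equivalent internally:

```python
# Minimal sketch: print the CREATE TABLE statement SQLAlchemy derives from the
# User model defined above. This is roughly the desired state that
# atlas-provider-sqlalchemy emits for Atlas to diff against.
from sqlalchemy.dialects import postgresql
from sqlalchemy.schema import CreateTable

print(CreateTable(User.__table__).compile(dialect=postgresql.dialect()))
# CREATE TABLE users (
#     id SERIAL NOT NULL,
#     name VARCHAR,
#     age INTEGER,
#     PRIMARY KEY (id)
# )
```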

### Step 2: Configure Atlas

Next, create an `atlas.hcl` configuration file in your project's root directory. This file defines the database connection settings and the directory for migration files:

```hcl
data "external_schema" "sqlalchemy" {
  program = [
    "atlas-provider-sqlalchemy",
    "--path", "./db",
    "--dialect", "postgresql" # Change this to your specific dialect (e.g., mariadb, sqlite, mssql)
  ]
}

env "local" {
  src = data.external_schema.sqlalchemy.url
  dev = "postgresql://postgres:password@localhost:5432/test_db_dev?sslmode=disable&search_path=public"
  url = "postgresql://postgres:password@localhost:5432/test_db?sslmode=disable&search_path=public"
  migration {
    dir = "file://db/migrations"
  }
  format {
    migrate {
      diff = "{{ sql . \" \" }}"
    }
  }
  exclude = ["atlas_schema_revisions"]
}

diff {
  skip {
    // By default, none of the changes are skipped.
    drop_schema = true
    drop_table = true
  }
}
```
- `url`: the target database URL to which the changes are applied.
- `dev`: a database URL that Atlas uses internally for validation (see https://atlasgo.io/concepts/dev-database).
- `env`: you can configure multiple environments and select one with the `--env` argument (e.g. `atlas migrate apply --env local`).
- `diff.skip`: you can skip destructive changes so that schemas or tables are not dropped even when they are removed from the ORM schema definition. You can also use this feature to keep tables that are created outside Atlas.

### Step 3: Generate Migration Files with Atlas

Now that your schema is defined and Atlas is configured, you can generate migration files. Run the following command in your terminal:

```sh
atlas migrate diff <file_name> --config file://atlas.hcl --env local
```

This command creates a migration file in the specified directory (`db/migrations` in this example), capturing any changes made to your schema. An integrity file named `atlas.sum` is also generated; it tracks hashes of the migration files to detect conflicting concurrent changes.
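As an example of what gets captured, suppose you later extend the model from Step 1 (the `email` column below is hypothetical). Rerunning `atlas migrate diff` would then generate a new migration file containing the corresponding `ALTER TABLE` statement:

```python
# Hypothetical change to the Step 1 model: add an email column (reuses Base
# from Step 1). `atlas migrate diff` compares this desired state with the
# migration directory and writes the ALTER TABLE to a new file.
from sqlalchemy import Column, Integer, String

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)
    email = Column(String)  # new column -> captured as a new migration
```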

### Step 4: Apply Migrations to Your Database

After generating the migration files, apply the changes to your actual database with the command:

```sh
atlas migrate apply --config file://atlas.hcl --env local
```

If your database schema already exists, you can skip migrations up to a specific timestamp by using the `--baseline` option:

```sh
atlas migrate apply --config file://atlas.hcl --env local --baseline <timestamp>
```
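To sanity-check the result, you can inspect the target database from Python. This is an optional, hedged sketch; it assumes the local Postgres from the `local` env is running and that a Postgres driver such as `psycopg2-binary` is installed:

```python
# Optional check: list the tables that exist after `atlas migrate apply`.
from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql://postgres:password@localhost:5432/test_db")
print(inspect(engine).get_table_names())
# Expect ['users', 'atlas_schema_revisions'] -- the latter is Atlas's
# revision-tracking table (excluded from diffs via `exclude` in atlas.hcl).
```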

### Step 5: Configuration for Multiple Environments

After confirming the behavior in your local environment, it's time to prepare the configuration for multiple environments: dev, pr, and prod.

Atlas supports AWS Secrets Manager and GCP Secret Manager for retrieving database credentials.

In the example below, AWS Secrets Manager is used. For more details, see the official docs (https://atlasgo.io/guides/deploying/secrets).

data "external_schema" "sqlalchemy" {
program = [
"atlas-provider-sqlalchemy",
"--path", "./db",
"--dialect", "postgresql" # Change this to your specific dialect (e.g., mariadb, sqlite, mssql)
]
}

env "local" {
src = data.external_schema.sqlalchemy.url
dev = "postgresql://postgres:password@localhost:5432/test_db_dev?sslmode=disable&search_path=public"
url = "postgresql://postgres:password@localhost:5432/test_db?sslmode=disable&search_path=public"
migration {
dir = "file://db/migrations"
}
format {
migrate {
diff = "{{ sql . \" \" }}"
}
}
exclude = ["atlas_schema_revisions"]
}

diff {
skip {
// By default, none of the changes are skipped.
drop_schema = true
drop_table = true
}
}

data "runtimevar" "pass" {
url = "awssecretsmanager://database?region=ap-northeast-1"
}

locals {
db_url_dev = jsondecode(data.runtimevar.pass).DB_URL_DEV_FOR_ATLAS
db_url_pr = jsondecode(data.runtimevar.pass).DB_URL_PR_FOR_ATLAS
db_url_prod = jsondecode(data.runtimevar.pass).DB_URL_PROD_FOR_ATLAS
}

env "pr" {
src = data.external_schema.sqlalchemy.url
url = local.db_url_pr
migration {
dir = "file://db/migrations"
}
format {
migrate {
diff = "{{ sql . \" \" }}"
}
}
exclude = ["atlas_schema_revisions"]
}

env "dev" {
src = data.external_schema.sqlalchemy.url
url = local.db_url_dev
migration {
dir = "file://db/migrations"
baseline = "20240929053307"
}
format {
migrate {
diff = "{{ sql . \" \" }}"
}
}
exclude = ["atlas_schema_revisions"]
}

env "prod" {
src = data.external_schema.sqlalchemy.url
url = local.db_url_prod
migration {
dir = "file://db/migrations"
baseline = "20240929053307"
}
format {
migrate {
diff = "{{ sql . \" \" }}"
}
}
exclude = ["atlas_schema_revisions"]
}

### Step 6: Set Up Continuous Integration (CI)

To ensure your migrations are consistently applied, set up CI using GitHub Actions. Create a workflow file in your repository, for example `.github/workflows/python.yml` (the path the workflow's own filters below assume), to automate the migration process:

````yaml
name: python
on:
  pull_request:
  release:
    types: [published]
  push:
    branches:
      - main

env:
  COMMENT_BODY_IDENTIFIER: Pytest Coverage Comment
  AWS_REGION: ap-northeast-1
  AWS_ROLE: arn:aws:iam::<aws account>:role/<aws role>
  ENVIRONMENT: ${{ github.event_name == 'release' && 'prod' || github.event_name == 'push' && 'dev' || github.event_name == 'pull_request' && 'pr' || 'default' }}

permissions:
  contents: write
  pull-requests: write
  id-token: write

jobs:
  path-filter:
    outputs:
      python: ${{ steps.changes.outputs.python }}
      atlas: ${{ steps.changes.outputs.atlas }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: dorny/paths-filter@v3
        id: changes
        with:
          filters: |
            python:
              - "**.py"
              - .github/workflows/python.yml
              - poetry.lock
              - .python-version
            atlas:
              - "src/db/migrations/*.sql"
              - "src/db/migrations/atlas.sum"
              - "src/atlas.hcl"
              - .github/workflows/python.yml

  status-check:
    runs-on: ubuntu-latest
    needs:
      - test
      - atlas
    permissions: {}
    if: failure()
    steps:
      - run: exit 1

  test:
    if: needs.path-filter.outputs.python == 'true' || github.event_name == 'release'
    needs:
      - path-filter
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          persist-credentials: false

      # https://github.com/pgvector/setup-pgvector
      - name: Run postgres
        run: docker compose up -d

      - name: Set up Python & Poetry
        uses: nakamasato/github-actions/setup-poetry@1.6.1

      - uses: ariga/setup-atlas@v0.2
        with:
          version: v0.27.0

      - name: install atlas-provider-sqlalchemy
        run: |
          pip install atlas-provider-sqlalchemy

      - name: atlas migrate lint
        working-directory: src
        run: |
          atlas migrate lint --env local --git-base origin/main

      - name: atlas migrate apply
        working-directory: src
        run: |
          atlas migrate apply --env local

      - name: test
        run: |
          set -o pipefail
          poetry run pytest | tee pytest-coverage.txt

      - name: pytest coverage comment
        if: github.event_name == 'pull_request'
        id: pytest-coverage-comment
        uses: MishaKav/pytest-coverage-comment@v1.1.52
        with:
          pytest-coverage-path: pytest-coverage.txt
          junitxml-path: pytest.xml

  atlas:
    if: needs.path-filter.outputs.atlas == 'true' || github.event_name == 'release'
    needs:
      - path-filter
    runs-on: ubuntu-latest
    steps:
      - name: Git clone the repository
        uses: actions/checkout@v4

      - name: Run postgres
        run: docker compose up -d

      - uses: ariga/setup-atlas@v0.2
        with:
          version: v0.27.0

      - name: Set up Python
        uses: actions/setup-python@v5
        env:
          PIP_ROOT_USER_ACTION: ignore

      - name: install atlas-provider-sqlalchemy
        run: |
          pip install sqlalchemy atlas-provider-sqlalchemy

      - name: atlas-diff
        id: atlas-diff
        working-directory: src
        run: |
          atlas migrate diff test --env local
          if [[ -n "$(git status --porcelain)" ]]; then
            {
              echo "changed=true";
              echo "COMMENT<<EOF";
              echo "Please run the command: \`atlas migrate diff\`";
              echo "EOF";
            } >> "$GITHUB_OUTPUT"
          else
            echo "changed=false" >> "$GITHUB_OUTPUT"
            echo "COMMENT=atlas.sum is up-to-date" >> "$GITHUB_OUTPUT"
          fi

      - name: configure aws credentials
        uses: aws-actions/configure-aws-credentials@v4.0.2
        with:
          role-to-assume: ${{ env.AWS_ROLE }}
          role-session-name: GitHub_to_AWS_via_FederatedOIDC
          aws-region: ${{ env.AWS_REGION }}

      - name: checkout to base branch (base)
        if: github.event_name == 'pull_request'
        uses: actions/checkout@v4
        with:
          ref: ${{ github.base_ref }}

      - name: clean up pr database (base)
        if: github.event_name == 'pull_request'
        working-directory: src
        run: |
          atlas schema clean --env ${{ env.ENVIRONMENT }} --auto-approve

      - name: apply the schema on base branch (base)
        if: github.event_name == 'pull_request'
        working-directory: src
        run: |
          atlas migrate apply --env ${{ env.ENVIRONMENT }}

      - name: checkout back to pr branch (PR)
        if: github.event_name == 'pull_request'
        uses: actions/checkout@v4

      - name: atlas migrate apply (${{ env.ENVIRONMENT }})
        working-directory: src
        run: |
          atlas migrate apply --env ${{ env.ENVIRONMENT }}

      - name: atlas migrate apply dry run against dev
        if: github.event_name == 'pull_request'
        id: migrate-apply-dry-run
        working-directory: src
        run: |
          set -eo pipefail
          echo 'COMMENT<<EOF' >> "$GITHUB_OUTPUT"
          atlas migrate apply --dry-run --env dev | tee -a "$GITHUB_OUTPUT"
          echo 'EOF' >> "$GITHUB_OUTPUT"

      - name: find comment (pull_request)
        if: github.event_name == 'pull_request'
        uses: peter-evans/find-comment@3eae4d37986fb5a8592848f6a574fdf654e61f9e # v3.1.0
        id: migrate-plan-comment
        with:
          issue-number: ${{ github.event.pull_request.number }}
          body-includes: migrate-plan-comment

      - name: create or update comment (pull_request)
        if: github.event_name == 'pull_request'
        uses: peter-evans/create-or-update-comment@71345be0265236311c031f5c7866368bd1eff043 # v4.0.0
        with:
          issue-number: ${{ github.event.pull_request.number }}
          comment-id: ${{ steps.migrate-plan-comment.outputs.comment-id }}
          edit-mode: replace
          body: |
            ## atlas vs DB diff

            ```
            ${{ steps.migrate-apply-dry-run.outputs.COMMENT }}
            ```

            ## SQLAlchemy & atlas diff ${{ steps.atlas-diff.outputs.changed == 'true' && '❌️' || '✅️' }}

            ${{ steps.atlas-diff.outputs.COMMENT }}

            <!-- migrate-plan-comment -->

      - name: exit if atlas.sum is not up-to-date
        if: steps.atlas-diff.outputs.changed == 'true'
        run: |
          exit 1
````

This workflow triggers on pull requests, pushes to the main branch, and release events, with each event type mapped to a target environment (pull request -> pr, push to main -> dev, release -> prod).

For pull requests that contain schema changes, you'll see a comment showing exactly what is going to be applied.

You can also create a dedicated job for applying migrations, for example with Kubernetes Jobs, Cloud Run Jobs, or any other mechanism that runs a custom container or invokes the atlas command directly with the migration files. In the example above, we used the same GitHub Actions workflow for the pr, dev, and prod environments with different triggers.

## 3. Summary

Integrating Atlas into your development workflow significantly simplifies database schema management. By combining Atlas with SQLAlchemy, you can effortlessly track and apply changes to your database. If you’re looking for a reliable solution for managing database migrations, consider giving Atlas a try to enhance your development process!

Happy coding!

## References

  1. Atlas Documentation: https://atlasgo.io/docs
  2. SQLAlchemy with Atlas: https://atlasgo.io/guides/orms/sqlalchemy
