Migrate your Airflow environment from Amazon Managed Workflows for Apache Airflow (MWAA) to Astro.
To complete the migration process, you will:
Before starting the migration, ensure that the following are true:
On your local machine, make sure you have:
On the cloud service from which you’re migrating, ensure that you have:
All source Airflow environments on 1.x need to be upgraded to at least Airflow 2.0 before you can migrate them. Astronomer professional services can help you with the upgrade process.
If you’re migrating to Astro from OSS Airflow or another Astronomer product, and you currently use an older version of Airflow, you can still create Deployments with the corresponding version of Astro Runtime even if it is deprecated according to the Astro Runtime maintenance policy. This allows you to migrate your DAGs to Astro without needing to make any code changes and then immediately upgrade to a new version of Airflow. Note that after you migrate your DAGs, Astronomer recommends upgrading to a supported version of Astro Runtime as soon as you can.
(Optional) You can use the AWS CLI to expedite some of the steps in this guide.
The Astronomer Starship migration utility connects your source Airflow environment to your Astro Deployment and migrates your Airflow connections, Airflow variables, environment variables, and dags.
The Starship migration utility works as a plugin with a user interface, or as an Airflow operator if you are migrating from a more restricted Airflow environment.
If you are migrating from an MWAA instance with a private webserver, you will need to use the StarshipOperator pattern.
See the following table for information on which versions of Starship are available, depending on your source Airflow environment:
Download the requirements.txt file for your source Airflow environment from S3. See AWS documentation.
Add astronomer-starship on a new line to your requirements.txt file.
To complete this setup from the command line:
Run the following commands to set environment variables on your local machine:
Run the following AWS CLI commands to install Starship:
In your Astro Organization, you can create Workspaces, which are a collection of users that have access to the same Deployments. Workspaces are typically owned by a single team.
You can choose to use an existing Workspace, or create a new one. However, you must have at least one Workspace to complete your migration.
Follow the steps in Manage Workspaces to create a Workspace in the Astro UI for your migrated Airflow environments. Astronomer recommends naming your first Workspace after your data team or initial business use case with Airflow. You can update these names in the Astro UI after you finish the migration.
Follow the steps in Manage Astro users to add users from your team to the Workspace. See Astro user permissions for details about each available Workspace user role.
You can add users to a Workspace or an Organization using the Astro CLI. See:
You can also automate adding batches of users to Astro with shell scripts. See Add a group of users to Astro using the Astro CLI.
A Deployment is an Astro Runtime environment that is powered by the core components of Apache Airflow. In a Deployment, you can deploy and run DAGs, configure worker resources, and view metrics.
You can choose to use an existing Deployment, or create a new one. However, you must have at least one Deployment to complete your migration.
Before you create your Deployment, copy the following information from your source Airflow environment:
This setup varies slightly for Astro Hybrid users. See Deployment settings for all configurations related to Astro Hybrid Deployments.
In the Astro UI, select a Workspace.
On the Deployments page, click Deployment.
Complete the following fields:
To configure and use dedicated clusters, see Create a dedicated cluster. If you don’t have the option of choosing between standard or dedicated, that means you are an Astro Hybrid user and must choose a cluster that has been configured for your Organization.
Executor: Choose the same executor as in your source Airflow environment.
Scheduler: Use the following table to determine the Deployment size you need based on the size of your source Airflow environment.
You might have defined Airflow connections and variables in the following places on your source Airflow environment:
If you defined your Airflow variables and connections in the Airflow UI, you can migrate those to Astro with Starship. To check which resources will be migrated, go to Admin > Variables and Admin > Connections in the Airflow UI of your source Airflow environment.
https://<your-organization>.astronomer.run/<id>/home.
Create a new directory for your Astro project:
Open the directory:
Run the Astro CLI command astro dev init to initialize an Astro project in the directory.
This command generates a set of files that will build into a Docker image that you can both run on your local machine and deploy to Astro.
Add astronomer-starship on a new line to your Astro project requirements.txt file.
When you deploy your code, this line installs the Starship migration tool on your Deployment so that you can migrate Airflow resources from your source environment to Astro.
(Optional) Run git init to initialize a new git repository for your Astro project.
Open your Astro project Dockerfile. Update the Runtime version in the first line to the version you selected for your Deployment in Step 3. For example, if your Runtime version was 6.3.0, the first line of your Dockerfile would be FROM quay.io/astronomer/astro-runtime:6.3.0.
The Dockerfile defines the environment that all your Airflow components run in. You can modify it to make certain resources available to your Airflow environment like certificates or keys. For this migration, you only need to update your Runtime version.
Open your Astro project requirements.txt file and add all Python packages from your source Airflow environment’s requirements.txt file. See AWS documentation to find this file in your S3 bucket.
For example, if you're running apache-airflow-providers-snowflake version 3.3.0 on MWAA, you would add apache-airflow-providers-snowflake==3.3.0 to your Astro requirements.txt file.
Open your Astro project dags folder. Add your dag files from either your source control platform or S3.
If you used the plugins folder in your MWAA project, copy the contents of this folder from your source control platform or S3 to the /plugins folder of your Astro project.
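The requirements step above can be sketched in Python. This is only an illustration, not part of the Astro or MWAA tooling: the function names are hypothetical, and it assumes you have already read both requirements.txt files into lists of lines. It appends any source package that the Astro project doesn't list yet and reports packages that lack an exact == pin. Lines that start with - (pip options such as constraint files) are skipped and need manual review.

```python
import re


def package_name(requirement: str) -> str:
    """Return a normalized package name from one requirements.txt line."""
    # Cut at the first version specifier, extras bracket, marker, or space.
    name = re.split(r"[=<>!~\[; ]", requirement.strip(), maxsplit=1)[0]
    # Treat '-', '_' and '.' as equivalent, as pip does when comparing names.
    return re.sub(r"[-_.]+", "-", name).lower()


def is_package_line(line: str) -> bool:
    """True for lines that name a package (not blanks, comments, or pip options)."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(("#", "-"))


def merge_requirements(astro_lines, source_lines):
    """Keep the Astro lines as they are and append source packages not listed yet."""
    merged = [line.rstrip("\n") for line in astro_lines]
    seen = {package_name(line) for line in merged if is_package_line(line)}
    for line in source_lines:
        if is_package_line(line) and package_name(line) not in seen:
            merged.append(line.strip())
            seen.add(package_name(line))
    return merged


def unpinned(lines):
    """Return package lines that lack an exact '==' pin."""
    return [line for line in lines if is_package_line(line) and "==" not in line]


# Hypothetical file contents; "pandas" stands in for any unpinned package.
astro = ["astronomer-starship"]
mwaa = ["# MWAA requirements", "apache-airflow-providers-snowflake==3.3.0", "pandas"]
merged = merge_requirements(astro, mwaa)
```

Here unpinned(merged) lists the lines you might want to pin to the versions running in your source environment before you deploy.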
After you confirm that your Astro project has all necessary dependencies, deploy the project to your Astro Deployment.
Run astro login to authenticate to Astro.
Run astro deploy to deploy your project.
This command returns a list of Deployments available in your Workspace and prompts you to pick one.
The core migration of your project is now complete. Read the following to decide whether you need to set up any additional infrastructure on Astro before you cut over your dags.
If you used CI/CD to deploy code to your source Airflow environment, read the following documentation to learn about setting up a similar CI/CD pipeline for your Astro project:
Similarly to MWAA, you can deploy dags to Astro directly from an S3 bucket. See Deploy dags from an AWS S3 bucket to Astro using AWS Lambda.
If you currently store Airflow variables or connections in a secrets backend, you also need to integrate your secrets backend with Astro to access those objects from your migrated dags. See Configure a Secrets Backend for setup steps.
Depending on how thoroughly you want to test your Airflow environment, you can test your project locally before deploying to Astro.
Run astro dev parse to check for any parsing errors in your dags.
Run astro run <dag-id> to test a specific dag. This command compiles your dag and runs it in a single Airflow worker container based on your Astro project configurations.
Run astro dev start to start a complete Airflow environment on your local machine. After your project starts up, you can access the Airflow UI at localhost:8080. See Troubleshoot your local Airflow environment.
Your migrated Airflow variables and connections are not available locally. You must deploy your project to Astro to test these Airflow objects.
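Before running the Astro CLI checks above, a rough pre-check for plain Python syntax errors can be sketched as follows. This is an assumption-laden illustration with a hypothetical function name. It only compiles each file, does not import Airflow or your providers, and is not a replacement for astro dev parse.

```python
from pathlib import Path


def find_syntax_errors(dags_dir):
    """Map each .py file under dags_dir that fails to compile to a short message."""
    errors = {}
    for path in sorted(Path(dags_dir).rglob("*.py")):
        try:
            # Compile only; the file is never executed or imported.
            compile(path.read_text(encoding="utf-8"), str(path), "exec")
        except SyntaxError as exc:
            errors[str(path)] = f"line {exc.lineno}: {exc.msg}"
    return errors


# "dags" is the default dags folder of an Astro project, relative to the project root.
for file_name, message in find_syntax_errors("dags").items():
    print(f"{file_name}: {message}")
```

An empty result only means the files are syntactically valid Python. Missing packages and DAG-level problems still surface later, when Airflow parses the files.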
Run astro login to authenticate to Astro.
Run astro deploy to deploy your project.
This command returns a list of Deployments available in your Workspace and prompts you to pick one.
In the Astro UI, open your Deployment and click Open Airflow. Confirm that you can see your deployed DAGs in the Airflow UI.
After you successfully deploy your code to Astro, you need to migrate your workloads from your source Airflow environment to Astro on a DAG-by-DAG basis. Depending on how your workloads are set up, Astronomer recommends letting DAG owners determine the order to migrate and test DAGs.
You can complete the following steps in the few days or weeks following your migration setup. Provide updates to your Astronomer Data Engineer as they continue to assist you through the process and solve any difficulties that arise.
Continue to validate and move your DAGs until you have fully cut over your source Airflow instance. After you finish migrating from your source Airflow environment, repeat the complete migration process for any other Airflow instances in your source Airflow environment.
In the Airflow UI for your Deployment, test all connections that you migrated from your source Airflow environment.
Additionally, check Airflow variable values in Admin > Variables.
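If you have many variables, the check above can be sketched as a comparison of two key-to-value mappings. This is a minimal illustration, assuming you have obtained the variables of each environment as JSON objects by whatever means your environments allow. The function name and sample values are hypothetical.

```python
import json


def diff_variables(source, migrated):
    """Report source variables that are missing or have a different value after migration."""
    missing = sorted(key for key in source if key not in migrated)
    changed = sorted(
        key for key in source if key in migrated and source[key] != migrated[key]
    )
    return {"missing": missing, "changed": changed}


# Hypothetical exports from the source environment and the Astro Deployment.
source_vars = json.loads('{"env": "prod", "batch_size": "500", "owner": "data-team"}')
astro_vars = json.loads('{"env": "prod", "batch_size": "50"}')

report = diff_variables(source_vars, astro_vars)
print(report)  # {'missing': ['owner'], 'changed': ['batch_size']}
```

Variables that exist only on Astro are ignored on purpose, because the goal here is to confirm that nothing from the source environment was lost or altered.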
To create a strategy for testing DAGs, determine which DAGs need the most care when running and testing them.
If your DAG workflow is idempotent and can run twice or more without negative effects, you can run and test these DAGs with minimal risk. If your DAG workflow is non-idempotent and can become invalid when you rerun it, you should test the DAG with more caution and downtime.
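The difference between the two cases can be illustrated with a small sketch. It uses an in-memory list as a stand-in for a target table, and both function names are hypothetical. A load that appends rows produces duplicates when rerun, while a load that first clears the rows for its run date leaves the same result no matter how often it runs.

```python
def append_load(table, run_date, rows):
    """Non-idempotent: every rerun appends the same rows again."""
    table.extend({"run_date": run_date, **row} for row in rows)


def replace_partition_load(table, run_date, rows):
    """Idempotent: remove the rows for this run date first, then write them."""
    table[:] = [row for row in table if row["run_date"] != run_date]
    table.extend({"run_date": run_date, **row} for row in rows)


rows = [{"order_id": 1}, {"order_id": 2}]

appended = []
replaced = []
for _ in range(2):  # simulate the same DAG run executing in both environments
    append_load(appended, "2023-01-01", rows)
    replace_partition_load(replaced, "2023-01-01", rows)

print(len(appended), len(replaced))  # 4 2
```

DAGs that behave like the second function can run briefly in both environments during testing. DAGs that behave like the first need a clean handover so that each run executes in only one place.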
Starship includes features for simultaneously pausing DAGs in your source Airflow environment and starting them on Astro. This allows you to cut over your production workflows without downtime.
For each DAG in your Astro Deployment:
Confirm that the DAG ID in your Deployment is the same as the DAG ID in your source Airflow environment.
In the Airflow UI for your source Airflow environment, go to Astronomer > Migration Tool 🚀.
Click DAGs cutover. In the table that appears, click the Pause icon in the Local column for the DAG you’re cutting over.
Click the Start icon in the Remote column for the DAG you’re cutting over.
After completing this cutover, the Start and Pause icons switch. If there’s an issue after cutting over, click the Remote pause button and then the Local start button to move your workflow back to your source Airflow environment.
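The DAG ID check in the steps above can be sketched as follows, assuming you have collected the DAG IDs of both environments into two lists. The function name and sample IDs are hypothetical. Besides exact matches, it flags IDs that differ only by letter case or surrounding whitespace, since those would not line up during cutover.

```python
def check_dag_ids(source_ids, astro_ids):
    """Classify each source DAG ID as an exact match, a near match, or missing on Astro."""
    astro = set(astro_ids)
    # Index Astro IDs by a case- and whitespace-insensitive form to spot near matches.
    normalized = {dag_id.strip().lower(): dag_id for dag_id in astro_ids}
    report = {"match": [], "near_match": [], "missing": []}
    for dag_id in sorted(set(source_ids)):
        key = dag_id.strip().lower()
        if dag_id in astro:
            report["match"].append(dag_id)
        elif key in normalized:
            report["near_match"].append((dag_id, normalized[key]))
        else:
            report["missing"].append(dag_id)
    return report


dag_report = check_dag_ids(
    ["daily_sales", "Hourly_Sync", "cleanup"],
    ["daily_sales", "hourly_sync"],
)
```

In this example, daily_sales is ready to cut over, Hourly_Sync needs its ID aligned with hourly_sync, and cleanup has not been deployed to Astro yet.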
Astro includes several features that enhance the Apache Airflow development experience, from DAG writing to testing. To make the most of these features, you might want to make adjustments to your existing DAG development workflows.
As you get started on Astro, review the list of features and changes that Astro brings to the Airflow development experience and consider how you want to implement these details in your development experience. See Write and run DAGs on Astro.
As you cut over DAGs, view Deployment metrics to get a sense of how many resources your Deployment is using. Use this information to adjust your worker queues and resource usage accordingly, or to tell when a DAG isn’t running as expected.
If your current worker type doesn’t have the right amount of resources for your workflows, see Deployment settings to learn about configuring worker types on your Deployments.
You can additionally configure worker queues to assign each of your tasks to different worker instance types. View your Deployment metrics to help you determine what changes are required.
Deploying to Astro with DAG-only deploys enabled can make deploys faster in cases where you’ve only modified your dags directory. To enable the DAG-only deploy feature, see Deploy DAGs only.