Deploy Rollbacks: Upgrade Airflow and Deploy DAGs with Confidence
What can go wrong when deploying changes to Airflow?
Problems are common when you make changes to an Airflow deployment. A number of scenarios could cause your data pipelines to go down:
- When upgrading Airflow deployments, your DAGs may be incompatible with the new version of Airflow or the providers packaged with Airflow.
- Deploying changes to a running DAG can cause DAGs and tasks to fail. Your DAGs may start running unpredictably, leading to faulty or lost data.
- Making changes to your requirements, packages, and/or Dockerfile can not only cause problems for your DAGs but may also cause your deployment to enter an unhealthy state.
Due to these challenges, Airflow users can find themselves apprehensive about modifying their deployments. They may delay merging DAG changes into production or put off crucial Airflow upgrades. At Astronomer, we have been working to ease these fears with features like Rollbacks, so you can deploy changes with confidence.
What are Deploy Rollbacks?
Deploy Rollbacks let you roll back your Airflow deployments on Astro to any prior code deploy. A code deploy can include changes to your DAGs, Airflow version, and environment files (requirements.txt, Dockerfile, etc.). You can see a history of all your past deploys in the “Deploy History” tab on the Deployment page.
Each record in the table includes the state of DAGs represented by the “Bundle Version,” the environment files represented by the “Docker Image,” and the Airflow version represented by the “Astro Runtime” version. You can roll back to any of these deploys by clicking the “Deploy” button in the “Rollback” column.
When rolling back to an old code deploy, a few scenarios can arise. A downgrade will occur anytime the deployment is rolled back to an older Airflow version. During this process, you will encounter a message warning you about potential data loss. This potential data loss is a result of the data migration from the new schema back to the older schema. The new schema may include tables and columns related to features that only exist in the newer version of Airflow. While data loss is possible, it is unlikely, especially if you perform a rollback soon after upgrading.
Another message you may encounter tells you whether DAG-only deploys will be enabled or disabled. When you roll back to a code deploy, your deployment adopts the DAG-only deploy setting of that deploy. This ensures that your DAGs from the prior deploy remain intact.
How Rollbacks can fix issues caused by code deploys
The exact state of each of your deploys is captured in the “Deploy History” table. This ensures that if you roll back to a prior deploy, your DAGs will run on the same Airflow image and version they used when they were working correctly. The only caveat is that rolling back will not change your deployment’s environment variables or resources. These can be easily changed in other tabs on the Deployment page.
Now you can quickly undo any change you make to your deployment, including upgrades, with Deploy Rollbacks. If your DAGs and tasks start to fail after a code deploy for any reason, you can simply roll back your deployment to a working state. Deploys remain available to roll back to for 30 days. If you notice anomalies or data loss, an increase in failed tasks, or similar issues, you can roll back to any available deploy made before the issues started. You have plenty of time to roll back, so don’t worry if you don’t notice issues caused by a deploy right away.
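To make the selection rule above concrete, here is a minimal Python sketch of choosing a rollback target: the most recent deploy made before the issues started that is still inside the 30-day window. The Deploy record, its field names, and the pick_rollback_target helper are hypothetical and are not part of any Astro API. The sketch also assumes the 30 days are counted from the time of the deploy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# Rollback window stated in this post; assumed to be measured from deploy time.
RETENTION = timedelta(days=30)


@dataclass
class Deploy:
    # Hypothetical record for illustration; not an Astro API object.
    deploy_id: str
    deployed_at: datetime


def pick_rollback_target(
    history: List[Deploy], issues_started: datetime, now: datetime
) -> Optional[Deploy]:
    """Return the newest deploy made before the issues began and still in the window."""
    candidates = [
        d
        for d in history
        if d.deployed_at < issues_started and now - d.deployed_at <= RETENTION
    ]
    # max(..., default=None) yields None when no deploy qualifies.
    return max(candidates, key=lambda d: d.deployed_at, default=None)
```

In practice you would read the candidate deploys from the “Deploy History” table and start the rollback from there. The sketch only illustrates which row to look for.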
Read our documentation for more information.
Upgrade with Confidence
Rollbacks fit well into any Airflow upgrade process. Even with rollbacks, we still suggest that you test changes locally, then promote them through a development or staging deployment, and finally to production. You can test and debug many common upgrade issues with our local Upgrade Test feature. You can also parse and test DAG changes using astro dev parse and astro dev pytest. Local testing can never make you 100% confident, so you can now add a final safeguard to your process: rolling back a development or production deployment that breaks after a deploy.
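As an illustration of the kind of local check that can catch broken DAG files before a deploy, the sketch below imports every Python file in a folder and records any that raise an error. It uses only the standard library and is a rough stand-in, not a description of what astro dev parse or astro dev pytest do internally. The find_import_errors helper and the "dags" folder name are assumptions for this example.

```python
import importlib.util
from pathlib import Path
from typing import Dict


def find_import_errors(dag_dir: str) -> Dict[str, str]:
    """Import each .py file in dag_dir and map failing file names to their error."""
    errors: Dict[str, str] = {}
    for path in sorted(Path(dag_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        try:
            # Executes the file's top-level code, as a DAG parser would.
            spec.loader.exec_module(module)
        except Exception as exc:  # syntax errors, missing packages, and so on
            errors[path.name] = f"{type(exc).__name__}: {exc}"
    return errors


def test_dags_import_cleanly():
    # pytest-style check; "dags" is an assumed project folder name.
    assert find_import_errors("dags") == {}
```

A check like this will flag, for example, a DAG that imports a provider module removed in a newer version. It does not replace testing in a development or staging deployment.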
Conclusion
With the new Deploy Rollbacks feature, a bad code deploy is no longer something to fear. You can test out new DAG features and upgrades in development with the assurance that you’ll be able to undo these changes if needed. You’ll be able to upgrade production without worrying that pipelines will be down for more than a few minutes. Start a free trial of Astro today, deploy a few of your DAGs, and try out Deploy Rollbacks for yourself!