This webinar provides an extensive overview of the options for running Airflow tasks in isolated environments - a Kubernetes pod or a Python virtual environment separate from your Airflow environment. Running tasks in isolation helps avoid common data pipeline problems, such as dependency conflicts and resource contention, and gives DAG authors control over how and where their tasks run. It's useful, for example, when a task requires Python packages that conflict with core Airflow's dependencies.
Questions covered in the webinar include:
- What are the use cases for running Airflow tasks in isolated environments?
- What operators should I use to run tasks in an isolated environment?
- How do I use the suggested operators?
- How do I set up my Airflow infrastructure to support running tasks in isolated environments?
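As a taste of the operator-level approach covered in the webinar, here is a minimal sketch of a DAG that isolates one task's dependencies with Airflow's `@task.virtualenv` decorator (the TaskFlow form of the `PythonVirtualenvOperator`). The DAG name, pinned `pandas` version, and task logic are illustrative assumptions, not code from the webinar:

```python
import pendulum
from airflow.decorators import dag, task


@dag(
    schedule=None,
    start_date=pendulum.datetime(2024, 1, 1, tz="UTC"),
    catchup=False,
)
def isolated_task_example():
    # This task runs in a freshly created virtual environment, so the
    # pinned pandas version below never has to be compatible with the
    # dependencies installed in the core Airflow environment.
    @task.virtualenv(
        requirements=["pandas==2.1.0"],  # illustrative pin
        system_site_packages=False,
    )
    def transform():
        # Imports must live inside the function: it executes in the
        # isolated interpreter, not in the Airflow worker's environment.
        import pandas as pd

        df = pd.DataFrame({"a": [1, 2, 3]})
        return int(df["a"].sum())

    transform()


isolated_task_example()
```

Note that a new virtual environment is built on every task run, which adds startup latency; for heavier isolation needs (system-level dependencies, resource limits), the webinar also covers Kubernetes-based options such as the `KubernetesPodOperator`.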
All of the sample code shown in this webinar can be found in this repo.