This “Live with Astronomer” session covers the best ways to use DuckDB with Airflow. We walk through some basic data processing and machine learning DAGs to show how and where DuckDB can be an optimal fit for your data pipelines.
Questions covered in this session include:
- How can I run DuckDB queries using Airflow?
- How can I use the Astro Python SDK, an open-source framework for easy DAG authoring, to integrate DuckDB into my data pipelines?
- What are some tips and tricks for using DuckDB and Airflow with larger datasets?
All code shown in this session can be found in this repo. You can read more about using DuckDB with Airflow on the Airflow Medium.