We are excited to announce the latest release of the Astro Platform, with new features designed to enhance your data orchestration experience.
This release includes features designed to make Astro the best place to run Airflow by:
- Unifying your data estate and reducing the complexity and fragmentation that come with multiple data tools. Let Astro run dbt for you, and manage your data workflows from a single interface.
- Improving pipeline resilience with advanced monitoring capabilities and automated worker healing.
- Automating infrastructure provisioning and management with new capabilities for customer-managed workload identities and the Astro Terraform integration.
In this blog, we dive into each of these new features and how they can take your data orchestration even further.
Unified Workflows with dbt on Astro
Overview
A core focus of our latest release is simplifying your data estate and eliminating the challenges of managing fragmented workflows. Running dbt and Airflow as separate systems adds complexity, which can slow down workflows and increase the chance of errors.
Our newest addition to the dbt on Astro feature set, dbt Deploys, integrates dbt directly into the Astro Platform, allowing you to handle both tools from a single interface.
This seamless integration helps data engineers reduce context switching, simplifies deployment processes, and delivers a streamlined user experience.
Astronomer Cosmos
Cosmos, an open-source project launched by Astronomer in 2023 and part of the dbt on Astro feature set, simplifies running dbt within your Airflow data pipelines. It has seen widespread adoption among teams managing dbt models and Airflow DAGs, with over 1.3 million downloads per month.
Cosmos renders dbt models as individual Airflow tasks, so users can rerun a single model without rerunning the entire dbt project. This enables end-to-end automation and detailed visibility and monitoring within Airflow, improving data pipeline reliability and efficiency.
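To make the benefit of model-level granularity concrete, here is a minimal sketch in plain Python. It does not use the Cosmos API. It only models the idea that when one dbt model fails, the failed model and its downstream dependents need to run again, while everything upstream can be left alone. The model names and the dependency graph are hypothetical.

```python
from collections import deque

# Hypothetical dbt project: each key is a model, each value lists the
# models that depend on it directly (its downstream children).
DOWNSTREAM = {
    "stg_orders": ["orders"],
    "stg_customers": ["customers"],
    "orders": ["revenue"],
    "customers": ["revenue"],
    "revenue": [],
}


def models_to_rerun(downstream, failed_model):
    """Return the failed model plus all of its downstream dependents.

    Models are returned in breadth-first order starting at the failure.
    """
    selected = [failed_model]
    queue = deque([failed_model])
    while queue:
        current = queue.popleft()
        for child in downstream[current]:
            if child not in selected:
                selected.append(child)
                queue.append(child)
    return selected


if __name__ == "__main__":
    # Only the failed model and what depends on it are selected.
    print(models_to_rerun(DOWNSTREAM, "orders"))
```

In this toy graph, a failure in `orders` selects only `orders` and `revenue`, so the staging models and the `customers` branch are not rerun.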
dbt Deploys
The latest addition to dbt on Astro, dbt Deploys, is designed to simplify the deployment of dbt models. With the astro dbt deploy command, teams can automate the deployment process, ensuring updates are applied quickly and reliably.
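As a rough illustration of automating this step, the sketch below wraps the astro dbt deploy command in a small Python helper that a CI job could call. The helper name, the dry-run switch, and the choice to run the command from the dbt project directory are assumptions made for the example. In a non-interactive pipeline you would likely also need to identify the target Deployment, so check `astro dbt deploy --help` for the options your CLI version supports.

```python
import shutil
import subprocess


def deploy_dbt_project(project_dir, dry_run=True):
    """Run `astro dbt deploy` from the given dbt project directory.

    With dry_run=True the command is only built and returned, which is
    useful for logging or testing without the Astro CLI installed.
    """
    command = ["astro", "dbt", "deploy"]
    if dry_run:
        return command
    if shutil.which("astro") is None:
        raise RuntimeError("Astro CLI not found on PATH")
    # check=True raises CalledProcessError if the deploy fails,
    # so a CI job stops instead of continuing silently.
    subprocess.run(command, cwd=project_dir, check=True)
    return command


if __name__ == "__main__":
    print(" ".join(deploy_dbt_project(".", dry_run=True)))
```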
By bringing deployment capabilities directly into the Astro platform, dbt Deploys reduces the need for context switching, streamlines workflows, and offers a single, unified interface for managing both dbt and Airflow.
By combining Cosmos and dbt Deploys in one platform, dbt on Astro enables teams to observe and deploy their dbt and Airflow code directly from Astro. This centralizes the monitoring of both dbt and Airflow, making it easier to detect and troubleshoot issues quickly. It also simplifies collaboration: data engineers and analysts can work on their respective tasks without interfering with each other, which accelerates project delivery.
By unifying dbt and Airflow in a single platform, users can optimize resources, enhance workflow efficiency, and ensure smooth and dependable operations.
Start your Free Trial of Astro. You'll even get $300 in credits to get started.
Improved Pipeline Resilience
In this release, we are introducing key features to enhance pipeline resilience and enable easy operation of Airflow across your organization.
Universal Metrics Export
Effective monitoring is crucial for managing complex data deployments. Without it, maintaining system reliability and performance can become a significant challenge. Universal Metrics Export allows you to export detailed metrics from your Airflow deployments to third-party monitoring systems using Prometheus remote write. This gives you comprehensive, real-time visibility into your deployment’s performance and health.
With this feature, organizations can achieve centralized observability by monitoring the health, reliability, and performance of their data ecosystem in real time. This allows for proactive management of deployment reliability, as teams can identify and address poorly performing deployments early. Additionally, detailed insights into resource utilization enable teams to make informed decisions on scaling and resource allocation, optimizing operational costs and performance.
This feature includes both Airflow metrics, such as those outlined in the Airflow documentation, and infrastructure utilization metrics, providing a more holistic view of your system's health.
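Once metrics land in a monitoring system, spotting a poorly performing deployment can be as simple as comparing failure counts to totals. The sketch below assumes you have already queried the Airflow counters `ti_failures` and `ti_successes` per deployment from your monitoring backend. The deployment names, the sample values, and the 10% threshold are hypothetical, and the exact metric names exposed by your backend may differ.

```python
# Hypothetical counter values per deployment, shaped like the result of a
# query against a monitoring backend that receives the exported metrics.
SAMPLES = {
    "analytics-prod": {"ti_failures": 4, "ti_successes": 196},
    "ml-staging": {"ti_failures": 30, "ti_successes": 70},
    "idle-dev": {"ti_failures": 0, "ti_successes": 0},
}


def failure_ratio(counters):
    """Share of finished task instances that failed (0.0 if none ran)."""
    total = counters["ti_failures"] + counters["ti_successes"]
    return counters["ti_failures"] / total if total else 0.0


def unhealthy_deployments(samples, threshold=0.1):
    """Return the names of deployments whose failure ratio exceeds the threshold."""
    return sorted(
        name for name, counters in samples.items()
        if failure_ratio(counters) > threshold
    )


if __name__ == "__main__":
    print(unhealthy_deployments(SAMPLES))
```

In practice the same check would usually live in an alerting rule in the monitoring system itself rather than in a script.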
Learning Bytes: Universal Metrics Export — Discover how to quickly connect your observability platform of choice to collect relevant Astro and Airflow metrics.
Self-Healing Workers
System reliability can be compromised by non-functional workers that are online but not processing tasks. These idle workers can lead to inefficiencies and bottlenecks in task execution. Self-Healing Workers automatically identify and terminate these non-functional workers, ensuring they are promptly replaced with healthy ones.
This automation ensures system uptime, prevents cascading failures, and reduces stale data risks by managing common Airflow infrastructure issues. Reducing manual intervention frees up data engineering teams to concentrate on strategic initiatives and optimization, driving greater value for the organization.
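The general idea of detecting a worker that is online but not doing any work can be sketched as follows. This is an illustrative toy, not Astro's actual detection logic: the `WorkerStatus` fields, the idle limit, and the rule that a worker only counts as stuck while tasks are waiting in the queue are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkerStatus:
    """Hypothetical snapshot of one worker's state."""
    name: str
    online: bool
    seconds_since_last_task: float


def find_stuck_workers(workers, queued_tasks, idle_limit_seconds=600):
    """Return names of workers that are online but not picking up work.

    A worker is only considered stuck when tasks are waiting in the queue;
    an idle worker with an empty queue is behaving normally.
    """
    if queued_tasks == 0:
        return []
    return [
        worker.name
        for worker in workers
        if worker.online and worker.seconds_since_last_task > idle_limit_seconds
    ]


if __name__ == "__main__":
    fleet = [
        WorkerStatus("worker-a", online=True, seconds_since_last_task=12),
        WorkerStatus("worker-b", online=True, seconds_since_last_task=1800),
        WorkerStatus("worker-c", online=False, seconds_since_last_task=3600),
    ]
    # Candidates for termination and replacement.
    print(find_stuck_workers(fleet, queued_tasks=25))
```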
Automated Infrastructure Provisioning and Management
We are also introducing features that make managing your Airflow infrastructure more secure and more automated.
Astro Terraform Provider
Infrastructure management can be complex and error-prone. Manual processes increase the risk of configuration errors and inefficiencies. The Astro Terraform Provider simplifies this process by integrating with Terraform, allowing you to manage your Airflow environments as code.
Using the Astro Terraform Provider, you can streamline your infrastructure management practices, reduce manual errors, and enhance operational efficiency. The provider supports functionality exposed by the Astro API, with further enhancements planned.
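To give a feel for managing environments as code, the sketch below uses Python to generate a Terraform JSON configuration (the `.tf.json` syntax Terraform accepts alongside HCL). The provider source `astronomer/astro`, the `astro_workspace` resource type, and its attribute names are assumptions here and should be checked against the provider documentation. The organization ID and workspace details are placeholders.

```python
import json


def build_workspace_config(organization_id, workspace_name, description):
    """Return a Terraform JSON configuration as a Python dict."""
    return {
        "terraform": {
            "required_providers": {
                # Assumed registry source for the Astro provider.
                "astro": {"source": "astronomer/astro"}
            }
        },
        "provider": {"astro": {"organization_id": organization_id}},
        "resource": {
            # Assumed resource type and attribute names; verify them
            # against the provider documentation before use.
            "astro_workspace": {
                "example": {
                    "name": workspace_name,
                    "description": description,
                    "cicd_enforced_default": True,
                }
            }
        },
    }


if __name__ == "__main__":
    config = build_workspace_config(
        "<your-organization-id>", "data-platform", "Managed by Terraform"
    )
    # Save the output as main.tf.json to use it with terraform plan/apply.
    print(json.dumps(config, indent=2))
```

Keeping the definition in version control means changes to a workspace go through the same review process as any other code change.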
Learning Bytes: Astro Terraform Provider — Discover how to automate and simplify Airflow infrastructure management using code.
Customer Managed Workload Identity on AWS
Security and compliance are critical when managing data services. Customer Managed Workload Identity on AWS provides a secure method for authorizing access to data services without relying on static credentials. Customers can assign and manage their own AWS IAM roles, enhancing security and simplifying access management across multiple deployments. Customer Managed Workload Identity is also available for GCP and Azure.
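For readers less familiar with role-based access on AWS, the sketch below builds the generic shape of an IAM trust policy, which is the document that states who may assume a role. It is only an illustration of the concept: the principal ARN is a placeholder, and the exact trust policy Astro requires for a given deployment may differ, so use the one Astro provides during setup.

```python
import json


def build_trust_policy(trusted_principal_arn):
    """Return a generic IAM trust policy allowing one principal to assume a role.

    No access keys or secrets appear anywhere: access is granted to an
    identity, and AWS issues short-lived credentials when the role is assumed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder ARN standing in for the identity of an Astro deployment.
    policy = build_trust_policy(
        "arn:aws:iam::123456789012:role/example-astro-workload-identity"
    )
    print(json.dumps(policy, indent=2))
```

Because the same customer-owned role can trust more than one principal, a single role can be reused across multiple deployments instead of distributing static credentials to each one.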
Learning Bytes: Customer Workload Managed Identity — Learn how to set up a managed identity on Astro for Google Cloud Platform (GCP), ensuring passwordless authentication and seamless integration with Airflow.
Summary
Astronomer’s Platform Release brings transformative improvements across three major themes: Unified Workflows with dbt on Astro, Improved Pipeline Resilience, and Automated Infrastructure Provisioning and Management.
By integrating dbt and Airflow, enhancing monitoring capabilities, and simplifying infrastructure management, we aim to empower organizations to achieve greater efficiency, reliability, and scalability.