Unlocking the Power of Scalable Machine Learning with Anyscale and Astronomer

  • Radhika Gulati
  • Steven Hillion

A huge thank you to the Anyscale team for partnering with us on this blog. Special thanks to Chris Zhang, Julia Martins, and Matthew Connor for their invaluable contributions, and to Kamil Kaczmarek, Marwan Sarieddine, and former Astronomer team member Venkata Jagannath for their dedicated work on the providers.

In the rapidly evolving landscape of machine learning and AI, the demand for scalable, efficient, and manageable workflows has never been greater. As machine learning (ML) workflows grow in complexity and scale, data teams face the challenge of managing vast amounts of data and computational tasks across distributed environments.

To address these critical challenges, Anyscale and Astronomer have joined forces, combining their expertise to offer a comprehensive solution:

Astronomer is a modern data orchestration platform designed to help organizations build, manage, and scale their data workflows. With enterprise-grade capabilities powered by Apache Airflow, Astronomer enables teams to streamline complex processes, integrate diverse tools, and get more from their data infrastructure. Whether orchestrating data pipelines or managing machine learning workflows, Astronomer helps organizations run their operations efficiently and at scale with minimal overhead.

Anyscale, the company behind the popular AI Compute Engine, Ray, offers a fully managed platform for deploying and scaling Ray clusters, enabling effortless distribution of computational tasks for ML and AI workloads. By allowing teams to focus on innovation rather than infrastructure management, Anyscale empowers organizations to accelerate their AI initiatives. The platform offers enterprise-grade security and performance optimizations, ensuring that even the most demanding ML workloads can be handled with ease and efficiency.

Together, Astronomer and Anyscale offer a comprehensive solution for orchestrating and scaling machine learning and AI. The integration combines Airflow’s workflow management with Ray’s distributed computing capabilities, whether teams prefer open-source flexibility or managed scalability.

This collaboration brings the power of Anyscale's distributed computing capabilities directly into the Airflow ecosystem, providing users with enhanced options for scaling their machine learning workflows.

Airflow—running as part of Astronomer's hosted platform, Astro—is instrumental in detangling and orchestrating complex workflows that span many different platforms including our cloud providers, Anyscale, and other pipelines.

Ted Li
Associate Director, AI
Second Dinner

The Technology Behind Scalable Workflows: Apache Airflow and Ray

At the core of managing scalable AI/ML workflows are two key technologies: Apache Airflow and Ray.

Apache Airflow is a widely adopted framework that automates the scheduling and orchestration of complex workflows. It enables data teams to manage dependencies, track task progress, and ensure reliable execution of various data processing tasks, from ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) jobs to AI/ML pipelines. Airflow's flexibility makes it a key tool for teams needing to automate and scale their workflows effectively.
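To make the idea of dependency management concrete, here is a minimal sketch of how an orchestrator decides what runs when. It uses only Python's standard library and is not Airflow code. The task names and the `run_pipeline` helper are hypothetical, chosen for illustration. The point is that each task starts only after everything upstream of it has finished.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "train_model": {"transform"},
    "evaluate": {"train_model"},
    "report": {"transform", "evaluate"},
}

def run_pipeline(deps):
    """Visit each task once, only after all of its upstream tasks."""
    completed = []
    for task in TopologicalSorter(deps).static_order():
        # A real orchestrator would execute the task here and record its state.
        completed.append(task)
    return completed

order = run_pipeline(dependencies)
print(order)
```

In Airflow itself, the same relationships would be declared between tasks inside a DAG, and the scheduler would take care of ordering, retries, and state tracking.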

Ray is the open-source AI Compute Engine, specifically designed for scalable, distributed computing. It allows machine learning teams to distribute computational tasks across multiple nodes, making it particularly useful for training large models such as LLMs (Large Language Models). With Ray’s ability to run parallel tasks efficiently, it has become an increasingly popular alternative to tools like Spark, especially for ML/AI workloads that require high levels of compute.
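The fan-out pattern described above can be illustrated on a single machine with Python's standard library. This is a sketch, not Ray code: a local thread pool stands in for a cluster, and `score_batch` and `run_in_parallel` are hypothetical names. Threads in CPython also do not speed up pure-Python arithmetic, so the example shows the shape of the pattern rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor

def score_batch(batch):
    """Stand-in for a compute-heavy step such as batch inference."""
    return sum(x * x for x in batch)

def run_in_parallel(batches, max_workers=4):
    """Fan batches out to a worker pool and collect results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_batch, batches))

batches = [[1, 2, 3], [4, 5], [6]]
results = run_in_parallel(batches)
print(results)
```

With Ray, work of this shape is typically expressed by decorating a function with `@ray.remote` and gathering results with `ray.get`, which lets the same pattern spread across multiple nodes instead of local threads.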

By combining Airflow’s powerful workflow orchestration with Ray’s distributed computing capabilities, organizations can efficiently scale, integrate, and automate complex machine learning workflows, streamlining the entire data lifecycle from processing to model deployment.

For more information on the provider, visit Astronomer’s Anyscale and Ray Provider page.

The Anyscale and Ray Providers by Astronomer

We’re excited to announce that Anyscale integrates with Astronomer's platform, giving teams that already use Apache Airflow another way to bring Anyscale's capabilities into their existing workflows.

Overview and Key Features

The Anyscale provider integrates RayTurbo directly into Airflow workflows and adds enterprise-grade scalability, reliability, and performance optimizations. RayTurbo is Ray, supercharged, offering up to 5.1x faster node autoscaling, 56% faster QPS serving, and 60% lower costs thanks to spot instance support. Anyscale and RayTurbo enable teams to deploy and scale Ray clusters efficiently without needing to manage the underlying infrastructure. By integrating directly with Airflow, the Anyscale provider enables organizations to leverage distributed computing with minimal operational complexity.

The Ray provider was developed to integrate Ray’s distributed computing capabilities directly into Airflow workflows, providing data teams with a powerful tool for scaling ML workloads. By leveraging Ray’s parallel processing, teams can handle large-scale tasks like model training and inference efficiently, without leaving the familiar Airflow environment.

Both providers offer a powerful set of core features that enable seamless management of distributed computing tasks within Airflow.

  • Integration with Airflow: Seamlessly orchestrate Ray tasks within Airflow, enabling data teams to manage distributed computing workflows in a familiar environment.
  • Automated Cluster Management: Efficiently handle resource allocation and manage the lifecycle of Ray clusters, from spinning them up to scaling them down, based on workload demands.
  • Real-Time Monitoring: Gain visibility into task performance directly from Airflow’s user interface, allowing teams to track progress and make adjustments as needed.

These core features apply to both the Ray and Anyscale providers, with Anyscale offering the added advantage of a managed service for handling infrastructure.
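The cluster lifecycle described above, spin up, run work, then release resources, can be sketched with a context manager. This is an illustrative assumption about the general pattern, not the providers' API: `ephemeral_cluster`, `run_job`, and the `events` log are hypothetical, and nothing here provisions real infrastructure.

```python
from contextlib import contextmanager

events = []  # records lifecycle steps so the order is easy to inspect

@contextmanager
def ephemeral_cluster(name):
    """Hypothetical stand-in for provisioning and releasing a cluster."""
    events.append(f"start {name}")
    try:
        yield name
    finally:
        # Runs even if the job fails, so resources are not left running.
        events.append(f"stop {name}")

def run_job(cluster, fail=False):
    """Hypothetical job submission; optionally simulates a failure."""
    events.append(f"run on {cluster}")
    if fail:
        raise RuntimeError("job failed")

with ephemeral_cluster("training") as cluster:
    run_job(cluster)

try:
    with ephemeral_cluster("inference") as cluster:
        run_job(cluster, fail=True)
except RuntimeError:
    pass  # the failure is expected; the cluster is still released

print(events)
```

The design choice worth noting is the `finally` clause: teardown happens whether the job succeeds or fails, which is the behavior teams generally want from automated cluster management.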

Embrace the Future of Scalable Machine Learning

As the complexity and scale of machine learning operations continue to grow, the integration of Anyscale's distributed computing power with Astronomer's workflow orchestration capabilities offers a compelling solution for modern data teams.

Anyscale is the AI Compute Platform, providing the robust infrastructure needed to efficiently distribute and scale computational tasks. By leveraging Anyscale, organizations can focus on building and deploying models rather than managing the intricacies of distributed systems. The platform's enterprise-grade features ensure that even the most demanding ML workloads can be handled with ease and security.

Astronomer is unmatched in managing and orchestrating complex, interdependent workflows. Its powerful capabilities for task scheduling, dependency handling, and providing clear visibility into pipeline operations make it an essential tool for data engineers and ML teams. By leveraging Airflow, organizations can achieve reliable, reproducible, and efficient execution of machine learning pipelines at scale.

The integration of these two powerful platforms opens up new possibilities for ML teams:

  • Seamless scalability from laptop to cluster for compute-intensive tasks
  • End-to-end visibility and management of complex ML workflows
  • Optimized resource utilization across the entire ML lifecycle
  • Accelerated development and deployment of ML models

We encourage you to explore how this integration can benefit your ML operations:

  • Visit Anyscale.com to learn more about scaling your ML workloads and to book a demo of the Anyscale platform.
  • Get started for free at Astronomer.io and see how Astro, powered by Apache Airflow, can help you orchestrate, manage, and scale your machine learning workflows.

By combining the strengths of Anyscale and Astronomer, organizations can build a robust, scalable, and efficient ML infrastructure that drives innovation and accelerates time-to-value for AI initiatives.
