November 27, 2024

Airflow in Action: ML and LLM Ops Insights from ASAPP — Reducing Workflow Runtimes by 85%

Matthew Keep, Astronomer

ASAPP, founded in 2014, is an AI leader in the contact center industry. The company’s machine learning (ML) based tools enhance customer experiences for enterprise clients including American Airlines, Dish, EY, and JetBlue.

With a focus on deep research and innovation, ASAPP’s AI-powered cloud applications help contact centers elevate every customer interaction—boosting automation, enhancing agent productivity, and uncovering insights that drive better business outcomes. ASAPP was named as a leader in The Forrester Wave™: Digital Customer Interaction Solutions, Q2 2024 report, which went on to describe the company as “this market’s undisputed leader in AI-led innovation.”

At this year’s Airflow Summit, Udit Saxena—ML Engineer at ASAPP—presented ASAPP’s AI/ML journey, detailing how Apache Airflow®, integrated with custom Apache Spark® solutions, streamlines and optimizes MLOps and LLMOps within the company.

Leveraging Airflow: From DataOps to MLOps

ASAPP’s DataOps and MLOps ecosystem is built to support continuous innovation, demanding deep analytics along with frequent retraining and fine-tuning of ML models. Airflow serves three key roles, handling one million tasks and 5,000 pipelines every day:

Data Engineering for Ingestion and Pre-Processing: Real-time and batch ETL pipelines are managed via Airflow, which routes data through Spark applications, then into storage solutions like Athena, Amazon S3, and Redshift for downstream analytics.
DataOps for Data Retention and Sampling: By automating periodic checks across diverse data sources, Airflow ensures retention policies are consistently enforced, reducing the risks and errors associated with manual management. Additionally, scheduled DAGs are used to sample and process production data at regular intervals. This data feeds various downstream applications, such as transcribing audio for MLOps workflows.
ML & LLM Model Training and Evaluation: ASAPP relies on Airflow to evaluate and monitor model performance. This includes automated retraining triggered by data changes and regular model evaluations for quality assurance using techniques such as LLM-as-a-judge along with prompt testing and versioning. Airflow is agnostic to the model provider, handling models hosted by Amazon Bedrock, OpenAI, and Anthropic.
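To make the retention role above more concrete, here is a minimal sketch of the kind of check a scheduled DAG task might wrap. It is not ASAPP's implementation: the `StoredObject` record, the `RETENTION_DAYS` windows, and the `find_expired` function are all hypothetical names chosen for illustration, and the sketch uses only the Python standard library rather than any Airflow API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StoredObject:
    """Hypothetical record describing one stored item (for example, an object key)."""
    key: str
    source: str
    created_at: datetime


# Hypothetical retention windows per data source, in days (assumed values).
RETENTION_DAYS = {"audio": 30, "transcripts": 90}


def find_expired(objects, now, retention_days=RETENTION_DAYS):
    """Return the keys of objects older than their source's retention window.

    Objects from sources without a configured policy are left alone.
    """
    expired = []
    for obj in objects:
        days = retention_days.get(obj.source)
        if days is None:
            continue  # no policy configured for this source
        if now - obj.created_at > timedelta(days=days):
            expired.append(obj.key)
    return expired


now = datetime(2024, 11, 27, tzinfo=timezone.utc)
sample = [
    StoredObject("calls/a.wav", "audio", now - timedelta(days=45)),
    StoredObject("calls/b.wav", "audio", now - timedelta(days=5)),
    StoredObject("text/a.json", "transcripts", now - timedelta(days=45)),
]
print(find_expired(sample, now))  # only calls/a.wav exceeds its 30-day window
```

Keeping the check as a pure function that receives `now` as an argument makes it easy to test and to rerun for a past date, which suits a scheduler that executes the same logic at regular intervals.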

Figure 1: Airflow has been ASAPP’s primary MLOps orchestration platform since 2020. Image source.

Speech Recognition Workflow: A Case Study

The automatic speech recognition (ASR) pipeline is a cornerstone of ASAPP’s MLOps, enabling transcription and downstream analysis of vast amounts of audio data from contact center calls. The workflow involves three primary stages:

Pre-Processing: ASAPP uses Airflow to manage data preparation tasks like language detection and speaker separation.
Transcription: Raw audio files are transcribed, with each task executed as a separate Airflow DAG. ASAPP handles intermediate storage on S3, where hundreds of gigabytes of data are processed for ASR applications.
Post-Processing: Transcripts are refined and anonymized for further analysis, enabling applications like call summarization or intent analysis.
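The three stages above can be pictured as a chain of functions, each of which could be wrapped in its own orchestrated task. The sketch below is illustrative only: the function names, the dictionary fields, the supported-language set, and the digit-redaction rule are assumptions made for this example, and the ASR model is passed in as a plain callable rather than a real speech model.

```python
import re


def pre_process(audio_files):
    """Stage 1 stand-in: keep only files whose detected language is supported."""
    supported = {"en", "es"}  # hypothetical set of supported languages
    return [f for f in audio_files if f["language"] in supported]


def transcribe(audio_files, asr_model):
    """Stage 2: run the supplied ASR callable on each file."""
    return [{"id": f["id"], "text": asr_model(f)} for f in audio_files]


def post_process(transcripts):
    """Stage 3 stand-in: redact runs of four or more digits as a toy form of anonymization."""
    digits = re.compile(r"\d{4,}")
    return [
        {"id": t["id"], "text": digits.sub("[REDACTED]", t["text"])}
        for t in transcripts
    ]


def run_pipeline(audio_files, asr_model):
    """Chain the three stages in order."""
    return post_process(transcribe(pre_process(audio_files), asr_model))


def fake_asr(audio_file):
    """Placeholder model that returns canned text stored on the record."""
    return audio_file["mock_text"]


calls = [
    {"id": "c1", "language": "en", "mock_text": "my account number is 12345678"},
    {"id": "c2", "language": "fr", "mock_text": "bonjour"},
]
print(run_pipeline(calls, fake_asr))
```

Passing the model in as an argument mirrors the idea that the same workflow can run with different models; real anonymization would need far more than a digit pattern.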

Figure 2: Airflow is used to transcribe current and historical audio data for offline downstream pipelines, including extracting customer insights and model training. Image source.

ASAPP's ASR workflow is highly scalable, handling diverse data formats and models (e.g., LLM and IVR-based models), making it adaptable for both real-time and historical data processing.

Integrating Spark for Massive Scaling: From Days to Hours

To scale ASR and LLM tasks, ASAPP leverages Spark in tandem with Airflow, utilizing Spark’s parallel processing architecture. Udit shared insights into the various Spark operators available for Airflow, along with the pros and cons of each.

Rather than maintaining a single Spark cluster, ASAPP dynamically launches Spark clusters for each DAG task, coordinating this with Airflow’s Kubernetes Executor. This approach provides flexible, on-demand scaling, efficiently partitioning and processing data, all orchestrated by Airflow.
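The general idea of partitioning a workload and processing the partitions in parallel can be sketched without Spark at all. The example below uses a thread pool from the Python standard library purely as a stand-in for the on-demand Spark clusters described above; it is not how ASAPP launches clusters, and `partition`, `process_partition`, and `process_in_parallel` are hypothetical helper names.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, num_partitions):
    """Split items into up to num_partitions round-robin chunks, dropping empty ones."""
    chunks = [items[i::num_partitions] for i in range(num_partitions)]
    return [chunk for chunk in chunks if chunk]


def process_partition(chunk):
    """Hypothetical per-partition work; here it only measures each file name's length."""
    return {name: len(name) for name in chunk}


def process_in_parallel(items, num_partitions=4):
    """Process each partition on its own worker and merge the partial results."""
    results = {}
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        for partial in pool.map(process_partition, partition(items, num_partitions)):
            results.update(partial)
    return results


files = ["a.wav", "bb.wav", "ccc.wav", "dddd.wav", "eeeee.wav"]
print(partition(files, 2))
print(process_in_parallel(files, num_partitions=2))
```

In a real deployment the per-partition function would be a Spark job and the fan-out would be handled by the orchestrator, but the shape is the same: split the input, run independent workers, and combine their outputs.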

The result has been substantial performance gains, with ASAPP reducing ASR workflow runtimes from 43 hours to under 5 hours, a reduction of more than 85%.

Key Airflow Learnings from ASAPP

Udit concluded his talk by sharing key learnings from his half-decade of experience with Airflow:

Strong Orchestration Foundation: Airflow has proven to be an effective orchestrator within ASAPP’s infrastructure, particularly following upgrades to Airflow 2.3 and beyond, which enhanced pipeline stability and performance.
Integrated Spark Support: By offloading compute-heavy processing to Spark, ASAPP gained flexibility and scalability in their data-intensive workflows.
Adaptability for LLM Workflows: Airflow’s ability to handle complex, on-demand API tasks makes it suitable for evolving LLM tasks, such as fine-tuning, evaluations, and prompt management that are crucial to ASAPP’s generative applications.

Next Steps

ASAPP’s Airflow implementation showcases how orchestration and scalable infrastructure can streamline GenAI and MLOps processes, delivering efficiency and adaptability in high-stakes, data-driven environments. Watch the Summit session “Airflow, Spark, and LLMs: Turbocharging MLOps at ASAPP” to get all of the details. Udit also documented his company’s journey in the blog post “Airflow at ASAPP: Enhancing AI-Powered Contact Centers.”

ASAPP’s experience shows that success with Generative AI is all about the data, specifically bringing your own data into models. To learn more about the critical role data workflows and pipelines play, download our Guide to Data Orchestration for Generative AI.

The best way to build GenAI applications powered by your data is to use Astro, the industry’s leading managed Airflow service. You can get started for free here.