Airflow in Action: Infrastructure Management Insights from 1 Million Monthly Deploys at LinkedIn
The scale of LinkedIn’s digital ecosystem is huge: 7,000 services—60% of which are updated weekly—with provisioning orchestrated by one of 12,000 unique pipelines that are collectively handling one million deployments per month. The platform serves one billion users who are creating over 13,000 connections every 60 seconds. That’s why LinkedIn relies on Apache Airflow® — it gets new features to customers reliably, while reducing developer toil.
Many think of Airflow as the industry-standard tool for data workflow orchestration. But its extensible, flexible design and active open source community mean it is not limited to data pipelines. At the Airflow Summit 2024, software engineers from LinkedIn’s infrastructure team presented how the company uses Airflow to orchestrate its systems deployment pipelines.
In this blog post, we’ll recap key highlights from the session and point you to further resources to learn more.
Beyond Push-and-Pray: LinkedIn’s Recipe for Reliable Service Deployments
Service deployments are the last mile in getting features into the hands of customers. A deployment is more than engineers pushing code into production and then moving on to the next feature. Each one has to run extensive checks to validate that the new service is running correctly. These checks verify that the right practices and policies are enforced, detect issues quickly, and roll back services if necessary.
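The deploy, validate, and roll back sequence described above can be sketched in a few lines of plain Python. This is an illustrative outline only, not LinkedIn's implementation: the function name `deploy_with_validation` and the check names are hypothetical, and real checks would query live systems rather than return fixed values.

```python
from typing import Callable, Optional

# A named check is a (name, callable) pair; the callable returns True on success.
Check = tuple[str, Callable[[], bool]]


def deploy_with_validation(
    deploy: Callable[[], None],
    checks: list[Check],
    rollback: Callable[[], None],
) -> Optional[str]:
    """Deploy, run each check in order, and roll back on the first failure.

    Returns the name of the failed check, or None if every check passed.
    """
    deploy()
    for name, check in checks:
        if not check():
            rollback()
            return name
    return None


if __name__ == "__main__":
    log: list[str] = []
    failed = deploy_with_validation(
        deploy=lambda: log.append("deploy"),
        checks=[("health", lambda: True), ("policy", lambda: False)],
        rollback=lambda: log.append("rollback"),
    )
    print(failed, log)
```

Stopping at the first failed check keeps the rollback decision simple; a production system would likely also record which checks passed before the failure.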
Services at LinkedIn take many different forms. They can be stateless or stateful applications or ML models, internal or external facing, and they power both online and offline systems. Each has many dependencies on LinkedIn’s infrastructure and data ecosystem. Combine these variables with LinkedIn’s scale, and manual provisioning is out of the question.
Figure 1: High level view of the architecture behind LinkedIn’s Continuous Deployment (LCD) infrastructure platform. Image source.
LinkedIn Continuous Deployment
LinkedIn Continuous Deployment (LCD) was created to improve the deployment experience for all LinkedIn systems and developers by providing a “touchless” experience. LCD delivers a modern, easy-to-use deployment experience in which developers declare their pipelines and dependencies through a low-code/no-code UI. LCD then translates the pipeline intent into Airflow DAGs that orchestrate the deployment pipeline and automate the necessary validation steps.
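To make the idea of translating declared intent into a DAG concrete, here is a minimal sketch using only the Python standard library. The spec format and step names below are assumptions made for illustration; the session did not describe LCD's internal representation. Each declared step lists the steps it depends on, and a topological sort yields an order in which an orchestrator such as Airflow could run them.

```python
from graphlib import TopologicalSorter

# Hypothetical declared pipeline: step name -> steps it depends on.
pipeline_spec = {
    "deploy_canary": [],
    "validate_canary": ["deploy_canary"],
    "deploy_production": ["validate_canary"],
    "validate_production": ["deploy_production"],
}


def execution_order(spec: dict[str, list[str]]) -> list[str]:
    """Return the steps in an order that respects every declared dependency."""
    return list(TopologicalSorter(spec).static_order())


def dependency_edges(spec: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (upstream, downstream) pairs, the edges of the resulting DAG."""
    return sorted((up, step) for step, ups in spec.items() for up in ups)


if __name__ == "__main__":
    print(execution_order(pipeline_spec))
    print(dependency_edges(pipeline_spec))
```

In an Airflow DAG, each step would become a task and each edge a dependency between tasks; `TopologicalSorter` also raises `CycleError` if the declared dependencies form a cycle, which is a useful validation on user-supplied specs.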
App owners write the business logic for the validation steps, which is then packaged into Docker containers and orchestrated as part of the deployment pipeline with Airflow's KubernetesPodOperator. The presenters noted that Airflow 3.0’s upcoming support for building and running tasks in any language will be hugely valuable to LCD, further breaking down barriers to developer adoption.
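As a rough illustration of how a containerized validation step might map onto the operator, the sketch below builds the keyword arguments for `KubernetesPodOperator`. The `ValidationStep` class, the image name, and the `deploy-validation` namespace are hypothetical and not taken from the talk; the keyword names (`task_id`, `name`, `namespace`, `image`, `cmds`, `env_vars`, `get_logs`) are parameters of the operator in the `apache-airflow-providers-cncf-kubernetes` package.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationStep:
    """Hypothetical description of one containerized validation check."""
    name: str
    image: str
    command: list[str]
    env: dict[str, str] = field(default_factory=dict)


def pod_operator_kwargs(step: ValidationStep, namespace: str = "deploy-validation") -> dict:
    """Map a validation step onto keyword arguments for KubernetesPodOperator."""
    return {
        "task_id": f"validate_{step.name}",
        # Pod names cannot contain underscores, so swap them for hyphens.
        "name": f"validate-{step.name}".replace("_", "-"),
        "namespace": namespace,
        "image": step.image,
        "cmds": step.command,
        "env_vars": step.env,
        "get_logs": True,  # stream container logs into the Airflow task log
    }


# With the Kubernetes provider installed, inside a DAG definition:
#   from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
#   task = KubernetesPodOperator(**pod_operator_kwargs(step))
```

Keeping the mapping in a plain function means it can be unit tested without an Airflow installation or a Kubernetes cluster.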
Figure 2: LCD ecosystem orchestrating 7,000 services. Image source.
Results and next steps
At full scale, LCD will have 10,000+ unique DAGs running in parallel. What does all of this mean for LinkedIn? Faster time to ship new services with higher quality and fewer rollbacks, all while reducing developer effort. Learn more from LinkedIn's Continuous Deployment session at the Airflow Summit.
The easiest way to start your Airflow journey is to try it for free in Astro, where you will unlock a suite of features designed to simplify, optimize, and supercharge your pipelines, whether you are orchestrating data workflows or infrastructure provisioning.