
Remote Execution Agents

Airflow 3 only
This feature is only available for Airflow 3.x Deployments.

Overview

Remote Execution relies on Agents running in your environment (the execution plane) to communicate with the API server in the Astro orchestration plane. Each Agent sends heartbeats that report its capabilities and the queue it is listening on, and the API server responds by assigning work accordingly. Worker Agents run synchronous tasks, Triggerer Agents run asynchronous tasks and deferrable operators, and the DAG Processor Agent processes dags and sends their serialized representations up to the orchestration plane.

You can register multiple remote queues for your Remote Execution Agents, for reasons similar to those for registering multiple worker queues in Hosted execution mode.

note

Add https://<clusterId>.external.astronomer.run/ to your organization's network allowlist so the Remote Execution Agents in your environment are able to heartbeat to the API server in the Astro orchestration plane.

Register Remote Execution Agent with a Deployment

Prerequisites

  • Kubernetes 1.30+
  • Helm 3+
  • A valid Astro account with permissions to create an Astro Agent
  • An Agent Token from your Astro Deployment
  • A Deployment API Token with Deployment Admin permissions, used to pull the base Astro Remote Execution Agent image

Step 1: Create Agent Token

  1. In the Astro UI, select a Workspace, click Deployments, and then select the Remote Execution Deployment that you want to register the Remote Execution Agent with.
  2. Select the Remote Agents tab.
  3. Use the toggle to switch to the Tokens view.
  4. Click +Agent Token.
  5. Define a Name, Expiration and optionally a Description for the Agent Token.
  6. Copy the Agent Token. You will not be shown the Agent Token value again.

Step 2: Install the Helm chart

  1. Use the toggle to switch back to the Agents view, and click Register a Remote Agent.
  2. Click the Download button on the modal to download the values.yaml file. Use this file to configure the Remote Execution Agent's Helm chart.
  3. Most config options in values.yaml do not need to be changed to register and activate the Remote Agent. You must update the following values: resourceNamePrefix, namespace, secretBackend, xcomBackend, imagePullSecretName, and agentToken or agentTokenSecretName. A description of each value is provided in the values.yaml file itself.
  4. Install the Helm chart. The following code also adds the Helm repo and applies any updates:
helm repo add astronomer https://helm.astronomer.io
helm repo update
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml

Step 3: Set allowed IP address ranges

Setting allowed IP address ranges for a Remote Execution Deployment is optional, but it lets you restrict the Deployment's incoming traffic to the Remote Agents in your environment.

  1. In the Set Allowed IP Address Ranges step on the modal, click edit this Deployment.
  2. Click +Add IP.
  3. Add the IP address range, then click Add.
  4. Repeat to allowlist more IP address ranges.

Step 4: Check for Agent heartbeat

Close the modal to view the Agent list page and check for a healthy agent.

You should see at least one Agent with a health status of "Healthy" and a "last heartbeat" value no more than one minute old. If your Remote Agent is healthy, configure dagBundleConfigList in values.yaml and run a Helm upgrade. You can then run dags on this Agent.

Remote Execution Agents Helm chart configuration reference

The following Helm chart config values are required to run dags with your Remote Execution Agent. To see all available configuration options with descriptions, see the values.yaml file downloaded in Step 2.

agentToken or agentTokenSecretName

You must specify either agentToken or agentTokenSecretName to inject the Agent token generated in Step 1 into the Helm chart.

  • For agentToken, pass the Agent token value copied in Step 1 into the Helm chart as a string, and Helm will create a secret with the name agent-token-secret in the namespace.

  • For agentTokenSecretName, pass the name of the secret containing the Agent token to connect the Agent to the Astro Deployment. This field should be used when the secret has already been created in the namespace through your own external mechanism. The secret must contain a key named token with the value set to the Agent Token created in Step 1.
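If you create the secret yourself, it only needs to be a regular Kubernetes Secret with a token key. The following Python sketch builds such a manifest; the function name, secret name, and namespace are illustrative assumptions and are not defined by the Helm chart.

```python
import base64
import json


def build_agent_token_secret(name: str, namespace: str, agent_token: str) -> dict:
    """Return a Kubernetes Secret manifest that stores the Agent Token under the key 'token'."""
    # Values under 'data' in a Secret manifest must be base64-encoded.
    encoded = base64.b64encode(agent_token.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {"token": encoded},
    }


if __name__ == "__main__":
    # Hypothetical names: replace with your own secret name, namespace, and token.
    manifest = build_agent_token_secret("my-agent-token", "astro-agent", "<agentToken>")
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests, the printed output can be piped to kubectl apply -f -. The secret name you chose is then the value to set for agentTokenSecretName.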

imagePullSecretName or imagePullSecretData

You must specify either imagePullSecretName or imagePullSecretData to allow the Remote Execution Agent to pull container images from the registry. You can use the Deployment API Token listed in the Prerequisites as the credential.

  • For imagePullSecretName, provide the name of an existing Kubernetes secret in the namespace. This secret must contain a key named .dockerconfigjson with your image pull credentials. Use this option if the secret has already been created. Example Kubernetes command to create the secret:
kubectl create secret docker-registry -n <namespace> <secretName> \
--docker-server=images.astronomer.cloud \
--docker-username=cli \
--docker-password=<astroToken>
  • For imagePullSecretData, provide the Docker config JSON as a string. The Helm chart will use this value to create a secret named image-pull-secret in the namespace. The value must follow the standard Docker config format:
{
"auths": {
"<registry.example.com>": {
"auth": "<auth-token>",
"email": "<email-address>"
}
}
}
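In the standard Docker config format, the auth field is the base64 encoding of username:password. The following Python sketch generates a value for imagePullSecretData on that basis. The function name is hypothetical, the registry and username mirror the kubectl example above, and the email field is left out on the assumption that it is optional.

```python
import base64
import json


def build_docker_config(registry: str, username: str, password: str) -> str:
    """Return a Docker config JSON string for one registry."""
    # 'auth' is base64("username:password") in the standard Docker config format.
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return json.dumps({"auths": {registry: {"auth": auth}}})


if __name__ == "__main__":
    # "<astroToken>" is a placeholder for your Deployment API Token.
    print(build_docker_config("images.astronomer.cloud", "cli", "<astroToken>"))
```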

namespace

Specifies the Kubernetes namespace where the Astro Remote Agent will be deployed.

If createNamespace is set to true, the Helm chart will create the namespace with the name provided in this field.

If createNamespace is set to false, the namespace must be created manually before deploying the chart, and this field should reference the existing namespace.

See Install Remote Execution Agents in a restricted Kubernetes namespace for steps to configure an Agent in a restricted Kubernetes namespace.

note

If you are passing agentTokenSecretName and imagePullSecretName, createNamespace must be set to false and the namespace must be created manually with those secrets already present.
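The rules in this reference about tokens, image pull secrets, and namespace creation can be checked before running Helm. The following Python sketch validates an already-parsed values.yaml mapping. It assumes these keys sit at the top level of the file, which may not match the chart's actual layout, and the function name is hypothetical.

```python
def check_agent_values(values: dict) -> list[str]:
    """Return a list of problems found in a parsed values.yaml mapping (empty if none)."""
    problems = []

    has_token = bool(values.get("agentToken"))
    has_token_secret = bool(values.get("agentTokenSecretName"))
    if not (has_token or has_token_secret):
        problems.append("set either agentToken or agentTokenSecretName")

    has_pull_name = bool(values.get("imagePullSecretName"))
    has_pull_data = bool(values.get("imagePullSecretData"))
    if not (has_pull_name or has_pull_data):
        problems.append("set either imagePullSecretName or imagePullSecretData")

    # Pre-created secrets must already exist in the namespace, so the chart cannot create it.
    if has_token_secret and has_pull_name and values.get("createNamespace") is True:
        problems.append(
            "createNamespace must be false when agentTokenSecretName "
            "and imagePullSecretName are used"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    sample = {
        "agentTokenSecretName": "my-agent-token",
        "imagePullSecretName": "my-pull-secret",
        "createNamespace": True,
    }
    for problem in check_agent_values(sample):
        print(problem)
```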

resourceNamePrefix

Specifies a name prefix for all Kubernetes resources that this chart deploys for the Remote Execution Agent components, such as Deployments, ConfigMaps, and Secrets.

secretBackend

The Airflow secret backend class to use for the Agent. See Secrets backend integration for Remote Execution for external secrets providers configuration instructions.

It is not recommended for production use cases, but you can also use Airflow’s local filesystem backend as a simpler alternative to an external secrets provider by setting:

secretBackend: "airflow.secrets.local_filesystem.LocalFilesystemBackend"
note

The rest of the secrets backend configuration should be passed as environment variables to the Agent components.

xcomBackend

The Airflow XCom backend class to use for the Agent.

See Configure XCOM backend for a Remote Execution Agent for how to set up the XCom Backend in the Remote Execution Agent components.

Failures

When the heartbeat between the API server and a Remote Execution Agent is disrupted, the Astro executor prevents task duplication by marking in-flight tasks from failed Agents as failed. This makes them available for reassignment to healthy Agents. An Agent must receive confirmation from the API server before starting a task, ensuring only actively monitored tasks proceed. If an Agent loses connection to the API server, it will continue executing already assigned tasks but will not start new ones until a successful heartbeat is re-established.

Agent failure

The API Server considers an Agent to have failed when the time since its last heartbeat exceeds three times the expected heartbeat interval. It then determines whether the Agent had any in-flight work (tasks that were assigned to that Agent but not reported back as completed) and marks those tasks as failed. This makes the tasks available for assignment to the next Agent that heartbeats.
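The failure rule described here can be illustrated with a short Python sketch. This is not the API Server's implementation: the function names, the task-to-Agent mapping, and the 10-second interval in the usage example are assumptions made for illustration. Only the threshold of three times the heartbeat interval comes from the text.

```python
from datetime import datetime, timedelta, timezone

FAILURE_MULTIPLIER = 3  # from the text: three times the expected heartbeat interval


def agent_has_failed(last_heartbeat: datetime, heartbeat_interval: timedelta, now: datetime) -> bool:
    """Return True when the time since the last heartbeat exceeds the failure threshold."""
    return (now - last_heartbeat) > FAILURE_MULTIPLIER * heartbeat_interval


def in_flight_tasks_of(assignments: dict[str, str], failed_agent: str) -> list[str]:
    """Return ids of tasks assigned to the failed Agent; these would be marked as failed."""
    return sorted(task for task, agent in assignments.items() if agent == failed_agent)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    interval = timedelta(seconds=10)  # hypothetical interval, not a documented value
    last_seen = now - timedelta(seconds=45)
    if agent_has_failed(last_seen, interval, now):
        print(in_flight_tasks_of({"task-1": "agent-a", "task-2": "agent-b"}, "agent-a"))
```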

API Server failure

If an Agent cannot reach the API Server for its heartbeats, it assumes that other Agents can still contact the API Server. The Agent therefore continues any tasks that are already in flight, since it can be sure that they will not be reassigned to other Agents for execution. However, the Agent does not start any new tasks: until the API Server has marked a task as "running", that task could still be reassigned to another Agent.
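From the Agent's side, this behavior reduces to two checks. The following Python sketch models them with a hypothetical AgentState class. It illustrates the rule as described in this section and is not the Agent's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    """Minimal model of an Agent's view of its connection and assigned work."""

    api_reachable: bool = True
    in_flight: set[str] = field(default_factory=set)

    def may_continue(self, task_id: str) -> bool:
        # Tasks already in flight keep running, even without a heartbeat.
        return task_id in self.in_flight

    def may_start(self, task_id: str) -> bool:
        # New tasks wait until a heartbeat to the API Server succeeds again.
        return self.api_reachable and task_id not in self.in_flight


if __name__ == "__main__":
    state = AgentState(api_reachable=False, in_flight={"task-1"})
    print(state.may_continue("task-1"), state.may_start("task-2"))
```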

Manage Remote Execution Agents

On the Remote Agents tab of the Deployment, click More options for a Remote Execution Agent on the list to access an action menu for that Agent.

You can take the following actions on your registered Remote Execution Agents:

  • Cordon: Cordoning a Remote Execution Agent marks it as unavailable for scheduling new tasks, while allowing it to continue running and complete any tasks already in progress.

This allows you to gracefully remove the Agent from service without interrupting current workloads. For example, you can cordon an Agent before deleting it or before performing maintenance, such as an upgrade, on the Agent or its underlying infrastructure.

A cordoned Agent does not receive new work, but it remains active until all running tasks have finished. When you are ready to return the Agent to the task pool, uncordon it to resume normal operation.

  • Uncordon: Uncordoning a Remote Execution Agent re-enables it to receive new tasks and resume normal scheduling.

  • Delete: Deletes the Remote Execution Agent from the Deployment.
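The cordon and uncordon actions above can be modeled as a flag on the Agent. The following Python sketch uses a hypothetical RemoteAgent class. The is_drained check reflects an assumption that an Agent is safe to maintain or delete once it is cordoned and has no running tasks.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteAgent:
    """Minimal model of an Agent's cordon state and running work."""

    name: str
    cordoned: bool = False
    running_tasks: set[str] = field(default_factory=set)

    def accepts_new_tasks(self) -> bool:
        # A cordoned Agent receives no new work.
        return not self.cordoned

    def is_drained(self) -> bool:
        # Cordoned and idle: maintenance or deletion would interrupt nothing.
        return self.cordoned and not self.running_tasks


if __name__ == "__main__":
    agent = RemoteAgent("agent-a", running_tasks={"task-1"})
    agent.cordoned = True          # Cordon: no new tasks, task-1 keeps running
    agent.running_tasks.clear()    # task-1 finishes
    print(agent.is_drained())
```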

Remote Execution Agent maintenance policy

Each Remote Execution Agent minor version is maintained for 6 months from the release month.

See Maintenance policy for more details about versioning, support, and upgrade recommendations.
