Configure OpenLineage for a Remote Execution Agent
OpenLineage enables you to access data lineage and provenance across your Airflow workflows for your Remote Execution Agent. Features like Observe and Astro Alerts require that you enable OpenLineage for your data pipelines.
When you create your Remote Execution Agent, Astro automatically generates a Helm values.yaml file with OpenLineage configurations pre-filled. To finish setting up OpenLineage, you need to configure an access credential, which can be an Astro Deployment API token used as your OpenLineage API key. There are three methods you can use to add your API key to your Helm values:
- Configure the API key as plain text: This stores your API key in your values.yaml file as plaintext, which is the simplest but least secure option. It is appropriate for development or testing environments.
- Use a pre-created Kubernetes secret: This stores your API key separately from your values.yaml file, which is more secure than plaintext and relies only on standard Kubernetes features.
- Inject your API key with a secrets manager: This approach uses an init container to inject the OpenLineage API key into the Remote Execution Agent component Pods. The example uses the HashiCorp Vault Agent, but you can use your own secrets manager. This option provides enhanced security with the potential for secret rotation.
Prerequisites
- A Kubernetes cluster
- Helm
- The values.yaml file downloaded from the Remote Execution Agent registration modal in the Astro UI
- A Deployment API Token that uses a Custom Deployment role with Observe Ingest permissions
Setup
Step 1: Retrieve your OpenLineage Namespace and URL
- In the Astro UI, go to the Deployment page and choose Agent. Then click Register Remote Agent.
- Click Download values.yaml file.
The downloaded Helm values file comes with key OpenLineage information pre-filled, including the OpenLineage Namespace and URL, so you only need to configure the OpenLineage API key.
Step 2: Configure the OpenLineage API Key
- Configure key as plaintext
- Use pre-created Kubernetes secret
- Use secrets manager
This method stores an Astro Deployment API token, used as your OpenLineage API key, as plain text in your values.yaml file. The Remote Execution Agent Helm chart uses it to create a Kubernetes secret named openlineage-api-key-secret. The API key is base64-encoded in the Kubernetes secret.
All Remote Execution Agent components, such as the Worker, DAG Processor, and Triggerer, use this API key to authenticate with the OpenLineage endpoint.
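Because the plaintext method relies on the base64 encoding of a Kubernetes secret, it helps to remember that base64 is an encoding rather than encryption. The following minimal sketch uses the placeholder key from this page (not a real token) and standard shell tools to show that anyone who can read the encoded value can recover the original:

```shell
# Placeholder value for illustration only; substitute nothing sensitive here.
api_key='insert-your-openlineage-api-key-here'

# Base64-encode the value, stripping any line wrapping added by the tool.
encoded=$(printf '%s' "$api_key" | base64 | tr -d '\n')

# Decoding needs no key or password, so the encoded form is not a protection.
decoded=$(printf '%s' "$encoded" | base64 --decode)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

This is why the plaintext option suits development or testing environments better than production.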
- Add the following OpenLineage configuration to your values.yaml file:
openlineage:
  # Enable OpenLineage integration
  enabled: true
  # Set your OpenLineage API key directly in the values file
  apiKey: "insert-your-openlineage-api-key-here"
  # Do NOT set apiKeySecret when using apiKey
  # apiKeySecret: ~
  # Astro OpenLineage URL endpoint (This should be prefilled in the downloaded values.yaml from Astro UI)
  url: "https://your-openlineage-endpoint.example.com"
  # Astro Deployment's namespace (This should be prefilled in the downloaded values.yaml from Astro UI)
  namespace: "your-astro-deployment-namespace"
- Apply the chart using the values.yaml file with the following command:
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
When you use a pre-created Kubernetes secret, the Remote Execution Agent Helm chart doesn't create a new secret for the OpenLineage API key. Instead, it configures all Agent components, such as the Worker, DAG Processor, and Triggerer, to use the existing secret to authenticate with the OpenLineage endpoint.
The secret must have a key named api-key that contains the OpenLineage API key, and your values.yaml file references the secret by name. You can use an Astro Deployment API token as your OpenLineage API key.
- Create a Kubernetes secret containing your OpenLineage API key:
kubectl create secret generic openlineage-api-key-secret \
  --from-literal=api-key=your-openlineage-api-key-here \
  --namespace <your-namespace>
- Configure your values.yaml file so that OpenLineage uses your pre-created secret:
openlineage:
  # Enable OpenLineage integration
  enabled: true
  # Do NOT set apiKey when using apiKeySecret
  # apiKey: ~
  # Reference the pre-created secret you created in the previous step
  apiKeySecret: "openlineage-api-key-secret"
  # Astro OpenLineage URL endpoint (This should be prefilled in the downloaded values.yaml from Astro UI)
  url: "https://your-openlineage-endpoint.example.com"
  # Astro Deployment's namespace (This should be prefilled in the downloaded values.yaml from Astro UI)
  namespace: "your-astro-deployment-namespace"
- Apply the chart using the values.yaml file with the following command:
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
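The plaintext and pre-created secret methods each note that apiKey and apiKeySecret should not be set together. As a rough pre-install check, the sketch below counts uncommented occurrences of each setting with a plain text match. It is not a YAML parser and not part of the Helm chart; the sample file and the count_key helper are hypothetical, written only so the example runs on its own. Point values_file at your real values.yaml to use it.

```shell
# Write a throwaway sample so the sketch is self-contained.
values_file=$(mktemp)
cat > "$values_file" <<'EOF'
openlineage:
  enabled: true
  # apiKey: ~
  apiKeySecret: "openlineage-api-key-secret"
  url: "https://your-openlineage-endpoint.example.com"
  namespace: "your-astro-deployment-namespace"
EOF

# Hypothetical helper: count lines where the given setting is uncommented
# and has a value other than empty or ~ (a text match, not a YAML parse).
count_key() {
  grep -cE "^[[:space:]]*$1:[[:space:]]*[^[:space:]~#]" "$values_file" || true
}

n_key=$(count_key apiKey)
n_secret=$(count_key apiKeySecret)

if [ "$n_key" -gt 0 ] && [ "$n_secret" -gt 0 ]; then
  result="conflict: both apiKey and apiKeySecret are set"
else
  result="ok"
fi

echo "$result"
rm -f "$values_file"
```

A text match like this can miss unusual formatting, so treat it as a quick sanity check rather than validation.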
You can also use a secrets manager to securely store your API keys. The following procedure specifically uses the HashiCorp Vault Agent.
The Vault Agent init container runs before the main Remote Execution Agent container. The Vault Agent authenticates with Vault to retrieve the OpenLineage API key and then writes the API key to a file in the shared volume. The Remote Execution Agent container can then read the OpenLineage API key from the file through the OPENLINEAGE_API_KEY environment variable. Use your Astro Deployment API token as your OpenLineage API key.
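The file handoff described above can be pictured with a small local simulation. This is only a conceptual sketch: a temporary directory stands in for the shared volume mounted at /vault/secrets, the key value is a made-up placeholder, and the way the Helm chart actually wires the file into the environment variable is not reproduced here.

```shell
# Temporary directory standing in for the shared volume (assumption for the demo).
shared_dir=$(mktemp -d)

# Step 1, init container: the secrets agent writes the key to a file.
printf '%s' 'example-openlineage-api-key' > "$shared_dir/openlineage-api-key"

# Step 2, main container: the key is read from the file into the variable.
OPENLINEAGE_API_KEY=$(cat "$shared_dir/openlineage-api-key")
export OPENLINEAGE_API_KEY

echo "key length: ${#OPENLINEAGE_API_KEY}"
rm -rf "$shared_dir"
```

Because the key only ever lives in the shared file and the process environment, it never needs to appear in values.yaml or in a Kubernetes secret.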
- Configure OpenLineage and add init containers for the dagProcessor, workers, and triggerer components:
openlineage:
  # Enable OpenLineage integration
  enabled: true
  # Don't set apiKey or apiKeySecret
  # apiKey: ~
  # apiKeySecret: ~
  # Astro OpenLineage URL endpoint (This should be prefilled in the downloaded values.yaml from Astro UI)
  url: "https://your-openlineage-endpoint.example.com"
  # Astro Deployment's namespace (This should be prefilled in the downloaded values.yaml from Astro UI)
  namespace: "your-astro-deployment-namespace"
# Configure each component to use the Vault init container
dagProcessor:
  initContainers:
    - name: vault-openlineage
      image: hashicorp/vault:1.13.1
      command: ["/bin/sh", "-c"]
      args:
        - |
          export VAULT_ADDR=https://vault.example.com
          vault agent -config=/vault/config/agent.hcl
      volumeMounts:
        - name: vault-config
          mountPath: /vault/config
        - name: openlineage-volume
          mountPath: /vault/secrets
  volumes:
    - name: vault-config
      configMap:
        name: vault-agent-config
    - name: openlineage-volume
      emptyDir:
        medium: Memory
  volumeMounts:
    - name: openlineage-volume
      mountPath: /vault/secrets
  env:
    - name: OPENLINEAGE_API_KEY
      valueFrom:
        fileRef:
          path: /vault/secrets/openlineage-api-key
triggerer:
  initContainers:
    - name: vault-openlineage
      image: hashicorp/vault:1.13.1
      command: ["/bin/sh", "-c"]
      args:
        - |
          export VAULT_ADDR=https://vault.example.com
          vault agent -config=/vault/config/agent.hcl
      volumeMounts:
        - name: vault-config
          mountPath: /vault/config
        - name: openlineage-volume
          mountPath: /vault/secrets
  volumes:
    - name: vault-config
      configMap:
        name: vault-agent-config
    - name: openlineage-volume
      emptyDir:
        medium: Memory
  volumeMounts:
    - name: openlineage-volume
      mountPath: /vault/secrets
  env:
    - name: OPENLINEAGE_API_KEY
      valueFrom:
        fileRef:
          path: /vault/secrets/openlineage-api-key
workers:
  initContainers:
    - name: vault-openlineage
      image: hashicorp/vault:1.13.1
      command: ["/bin/sh", "-c"]
      args:
        - |
          export VAULT_ADDR=https://vault.example.com
          vault agent -config=/vault/config/agent.hcl
      volumeMounts:
        - name: vault-config
          mountPath: /vault/config
        - name: openlineage-volume
          mountPath: /vault/secrets
  volumes:
    - name: vault-config
      configMap:
        name: vault-agent-config
    - name: openlineage-volume
      emptyDir:
        medium: Memory
  volumeMounts:
    - name: openlineage-volume
      mountPath: /vault/secrets
  env:
    - name: OPENLINEAGE_API_KEY
      valueFrom:
        fileRef:
          path: /vault/secrets/openlineage-api-key
- Create a ConfigMap for your Vault Agent:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: vault-agent-config
  namespace: <your-namespace>
data:
  agent.hcl: |
    auto_auth {
      method "kubernetes" {
        mount_path = "auth/kubernetes"
        config = {
          role = "astro-agent"
        }
      }
    }
    template {
      destination = "/vault/secrets/openlineage-api-key"
      contents = "{{ with secret \"secret/data/openlineage/api-key\" }}{{ .Data.data.key }}{{ end }}"
    }
EOF
- Apply the chart with the values.yaml file:
helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
Read more about Secrets backends on Astro.