Create and use data products with Astro Observe
A data product is a pipeline-driven resource that captures the data lifecycle of a pipeline. Tasks, datasets, warehouse tables, and local files can all be assets of data products. Data products are abstractions that give you observability into the health and performance of your data pipelines.
In the Data Products page of the Observe menu, you can view any of your existing data products or choose to create a new one. You can also create a Service Level Agreement (SLA), which you can configure to assess the on-time delivery of a data product (data timeliness) or how recently the data product has refreshed data (data freshness).
At a glance, the Data Products landing page shows you a list of your available data products. When you select a specific data product, you can see a summary of its performance, including a graph of the SLA Hit Rate, visualizing how frequently the data product meets the terms of an SLA. It also shows other details, like the owner and how recently the data product was updated.
The following procedures describe how to create data products and use them to gain insight across pipelines and Deployments. After you create a data product, you can view its details.
You can follow a comprehensive walkthrough of setting up a Data Product and testing an alert in the Get started with Observe quickstart.
Prerequisites
- Organization Billing Admin or higher user permissions
- An asset. You need at least one DAG in either an OSS Airflow deployment or an Astro Deployment.
- Version 1.12.1 or higher of OpenLineage. See Upgrade OpenLineage for Astro Deployments.
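Before you start, you might want to confirm that an installed OpenLineage version satisfies the minimum above. The following is a minimal sketch, not an Astro utility: the helper names `parse_version` and `meets_minimum` are hypothetical, and the code assumes plain numeric `major.minor.patch` version strings.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.12.1' into (1, 12, 1). Assumes numeric, dot-separated parts only."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed: str, minimum: str = "1.12.1") -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ("1.9.0", "1.12.1", "1.20.0"):
        print(candidate, meets_minimum(candidate))
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9.0" sorts after "1.12.1", which would wrongly pass the check.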
Create a Data Product
You can create a data product in the Data Products summary page.
- Click + Data Product.
- Enter a Name for your data product and an optional description.
- Choose whether the data product belongs to a User or a Team, and select the specific user or team that serves as its point of contact.
- Select the final downstream Assets in your pipeline to create your data product.
- Click Create Data Product.
After you create a data product, the page refreshes to show the Data Product Details, where you can configure SLAs and alerts for the data product.
Create an SLA
If you have existing data products, you can create an SLA for them. Astro allows you to create Freshness and Timeliness SLAs. Based on the criteria you define in your SLA, Astro uses the rate at which your data product meets the SLA to generate insights into your data pipelines and to recommend proactive alerts.
- In the Data Products page, click the specific data product for which you want to create an SLA.
- Click the SLA Evaluations tab and click + Add SLA.
- Add a Name for your SLA.
- Choose the SLA Type:
- Timeliness: Create an SLA that is a window of time. During this time, if your asset completes its defined actions, Astro considers it a success. Define the window of time by choosing Days of the week, Verification time, and Lookback period.
Info: Timeliness SLAs only support Standard Time. If you want Local Time support, you must adjust the SLA's UTC time whenever the clock changes between Standard Time and Daylight Saving Time.
- Freshness: Create an SLA that defines how recently you want the data product to update. Configure this SLA by defining a Freshness Policy by the number of minutes, hours, or days.
- Click Create SLA.
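The Standard Time note in the Timeliness step can be illustrated with a short calculation. This is a sketch under stated assumptions, not part of Astro: the helper `utc_time_for_local` is hypothetical, the time zone and dates are example values, and `zoneinfo` needs a time zone database on the system (or the `tzdata` package).

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def utc_time_for_local(day: date, local_time: time, tz_name: str) -> time:
    """Return the UTC clock time that matches a local wall-clock time on a given day."""
    local_dt = datetime.combine(day, local_time, tzinfo=ZoneInfo(tz_name))
    return local_dt.astimezone(timezone.utc).time()


if __name__ == "__main__":
    nine_am = time(9, 0)
    # Example values only: the same 09:00 local deadline on a winter and a summer day.
    print(utc_time_for_local(date(2025, 1, 15), nine_am, "America/New_York"))
    print(utc_time_for_local(date(2025, 7, 15), nine_am, "America/New_York"))
```

The two results differ by one hour, which is the adjustment you would apply to the SLA's UTC verification time at each clock change if you want it to track a fixed local time.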
After you create an SLA, you can configure alerts and proactive alerts.
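As a mental model of the two SLA types, the following sketch shows one plausible way to evaluate them. It is an illustration only, not how Astro evaluates SLAs: the function names are hypothetical, and it assumes that the timeliness window ends at the Verification time and starts one Lookback period earlier. Day-of-week selection is omitted for brevity.

```python
from datetime import datetime, timedelta, timezone


def timeliness_met(completions: list, verification_time: datetime, lookback: timedelta) -> bool:
    """True if any completion falls inside [verification_time - lookback, verification_time]."""
    window_start = verification_time - lookback
    return any(window_start <= done <= verification_time for done in completions)


def freshness_met(last_update: datetime, now: datetime, policy: timedelta) -> bool:
    """True if the most recent update is no older than the freshness policy."""
    return now - last_update <= policy


if __name__ == "__main__":
    check_at = datetime(2025, 3, 10, 14, 0, tzinfo=timezone.utc)
    runs = [datetime(2025, 3, 10, 12, 30, tzinfo=timezone.utc)]
    # The run at 12:30 is inside the 12:00 to 14:00 window.
    print(timeliness_met(runs, check_at, timedelta(hours=2)))
    # The last update is 90 minutes old, which exceeds a 1-hour policy.
    print(freshness_met(runs[0], check_at, timedelta(hours=1)))
```

The distinction the sketch highlights is that timeliness is checked against a fixed window on the clock, while freshness is checked against the age of the latest update at evaluation time.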
Create an alert
After you create an SLA, Astro keeps a record of the rate at which your data product hits or misses the SLA. You must configure an Alert or a Proactive Alert to receive notifications when your pipeline experiences an SLA miss or when an upstream process might cause an SLA miss or a failure.
- In the Data Products page, click the specific data product you want to create an alert for.
- Click the Alerts tab and click + Add Alert.
- Choose the Type of alert and Severity. The following alert types are available:
- Data Product SLA Violation: Send an alert when a data product asset has violated its SLA definition.
- Data Product Proactive SLA: Astro monitors the upstream dependencies of the data product assets and proactively sends an alert if delays in those dependencies might eventually cause SLA misses.
- Data Product Proactive Failure: Send an alert when a dependent asset upstream of your data product has failed.
- Define the conditions that the alert applies to. These conditions vary depending on the type of alert you want to set up.
- Select or add a Notification Channel where you want to send your alert. For more information about configuring notification channels, see Alert Notification Channels.
- (Optional) Customize the alert name.
- Click Create alerts.
View your Data Product details
After you create a data product, you can view in-depth details about the performance of your assets by selecting your data product for a closer look.
- Click Observe in the Astro UI, and then click Data Products.
- Choose the data product that you want to view details about.
When you open a specific data product, you see summary performance data. This includes general statistics, like the average SLA success rate and information about when the data product was created and last updated. Additional details can be found in the following tabs.
Overview
The overview tab provides summary information about the assets in your data product and the overall SLA success rates, consolidated into daily, weekly, and monthly rates. This view allows you to quickly identify trends in historical data for your business-critical pipelines.
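To make the daily, weekly, and monthly rates concrete, here is a minimal sketch of how such a success rate could be computed from individual SLA evaluations. The function `hit_rate` and the sample data are hypothetical, and treating the rates as trailing windows of 1, 7, and 30 days is an assumption, not a documented Observe behavior.

```python
from datetime import date, timedelta
from typing import Optional


def hit_rate(evaluations: list, as_of: date, days: int) -> Optional[float]:
    """Share of SLA evaluations that succeeded in the trailing window of `days` days.

    `evaluations` is a list of (day, met) pairs. Returns None if the window is empty.
    """
    start = as_of - timedelta(days=days - 1)
    in_window = [met for day, met in evaluations if start <= day <= as_of]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


if __name__ == "__main__":
    # Hypothetical evaluation history: (day, whether the SLA was met).
    history = [
        (date(2025, 3, 10), True),
        (date(2025, 3, 9), False),
        (date(2025, 3, 8), True),
        (date(2025, 3, 7), True),
        (date(2025, 2, 20), False),
    ]
    today = date(2025, 3, 10)
    for label, days in (("daily", 1), ("weekly", 7), ("monthly", 30)):
        print(label, hit_rate(history, today, days))
```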
Observe also provides Insights into the performance of the assets you include in your data products. The platform bases its recommendations on how frequently your assets meet the definitions of your SLAs. Some recommendations include:
- If an asset has a high variability in meeting or missing the SLA due to constant upstream delays, Astro might display a recommendation in your Data Product Details page to create a Proactive SLA Alert on the asset.
- If an asset frequently fails to meet an SLA due to consistent upstream failures, Astro might recommend in your Data Product Details page that you create a Proactive Failure Alert on the upstream asset.
Event timeline
Each Data Product in Observe has an Event Timeline that reflects activity from the assets, like DAG and task completion, and key issues, like breached SLAs and triggered alerts, in that data product. The event timeline supports filtering by event status along with more fine-grained filters. Selecting any particular event displays metadata about the event, including the notification history of a triggered alert.
The event timeline view allows you to see a record of events associated with your data product. These are categorized into Success, Neutral, and Failure events.
Success events
- SLA Success
- Task Success
- Task Start
Neutral events
- Airflow dataset write
- OpenLineage dataset write
Failure events
- Alert notification
- SLA Breach
- Task failure
Root-cause analysis (RCA)
For failure events, Observe detects the upstream root cause by scanning upstream dependencies and surfacing any anomalies or failures. For failed upstream tasks, Observe can summarize the task failure logs into a human-readable description of the issue directly in the timeline. Failure events also show downstream dependencies to help you understand the potential impact of the issue.
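The idea of scanning upstream dependencies can be sketched as a graph traversal. This is an illustration of the general technique only and does not describe how Observe implements root-cause analysis: the function names, the asset names, and the `"failed"` status string are all hypothetical.

```python
from collections import deque


def upstream_failures(asset: str, upstream: dict, status: dict) -> list:
    """Breadth-first walk over upstream dependencies, collecting failed assets."""
    seen = {asset}
    queue = deque([asset])
    failed = []
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent in seen:
                continue
            seen.add(parent)
            if status.get(parent) == "failed":
                failed.append(parent)
            queue.append(parent)
    return failed


def root_causes(asset: str, upstream: dict, status: dict) -> list:
    """Failed upstream assets that have no failed parents of their own."""
    return [
        candidate
        for candidate in upstream_failures(asset, upstream, status)
        if not any(status.get(p) == "failed" for p in upstream.get(candidate, []))
    ]


if __name__ == "__main__":
    # Hypothetical lineage: each asset maps to the assets directly upstream of it.
    lineage = {
        "report": ["orders_clean", "customers"],
        "orders_clean": ["orders_raw"],
        "customers": [],
        "orders_raw": [],
    }
    states = {"report": "failed", "orders_clean": "failed",
              "orders_raw": "failed", "customers": "success"}
    print(upstream_failures("report", lineage, states))
    print(root_causes("report", lineage, states))
```

Separating "all failed upstream assets" from "failed assets with no failed parents" mirrors the difference between symptoms along the path and the earliest point of failure.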
Assets
The Assets tab shows the assets included in the specific data product, in a list or a graph format.
If you select an asset from the list to examine in-depth, you can see additional asset-specific details. See Access asset information for descriptions.
Metrics
Data products that include Airflow task assets report the following metrics by default:
- Task retries
- Task failures
- Task duration
Alerts
The Alerts tab consolidates all alerts and SLAs that are specific to your selected data product. From this page you can Add an alert, find an existing alert, and view the notification history for an alert.