Use a MySQL or PostgreSQL database for metadata or storage
You can create Astronomer Software Deployments with the Houston API that use pre-created databases, external to the Airflow Deployment, as both a metadata storage and result storage backend.
Prerequisites
- Workspace Admin user privileges and a Workspace ID
- (Optional) A MySQL or PostgreSQL database
- (Optional) An existing Deployment
Step 1: Add configuration
- Open your
values.yamlfile. - Add the following to your
values.yamlfile. These configurations enable the use of manual connection strings and provide an example of the database connection string format, which you can later update for your specific connection.
PostgreSQL
MySQL
- Push the configuration change. See See Apply a Config Change.
Step 2: (Optional) Create your database
Substitute astro-db-name with your own database name, if you need to create a new database.
Step 3: Add a user account to your database for the connection
Substitute astro-user-name and astro-user-password with your information. You can use an existing database for this step.
PostgreSQL
MySQL
- Create a user with a password for Astronomer Software to use to access the database.
- Grant all privileges on the database to the user.
- Grant
USAGEandCREATEprivileges on thepublicschema toastro-user-name:
Now, go into the database you created, which is astro-db-name in this example, and run the following queries
- Grant all privileges on all tables, sequences, and functions to the user.
- Set default privileges for the user, so any new tables, sequences, or functions automatically have the user’s access.
Step 4: Retrieve database host information
Retrieve the connection information for your external database. For example, with AWS, you can retrieve your endpoint information by Finding the connection information for an RDS for MySQL DB instance.
Step 5: Compose a connection strings for your database
You need connection strings that define how Astronomer Software configures the connection to your external databases from your Airflow Deployment. The values of these strings are used when you define your metadataConnection or resultBackendConnection when you create, update, or upsert your Deployment.
Use the values for your astro-user-name, astro-user-password, astro-db-name, and the host information you retrieved to compose the connection strings in the following format, depending on whether you want to define a result backend connection or a metadata database connection.
PostgreSQL
MySQL
With PGBouncer disabled
-
metadataConnection: -
resultBackendConnection:
:::warning Celery Executor
The connection string format validation regex don’t cover the resultbackend connection string format, which includes db+. This is specifically required for the Celery executor worker. If the connection string doesn’t include db+, then Celery worker pod fails. The regex validation is not implemented because it adds the complications on format validation logic in different scenarios.
:::
With PGBouncer enabled
If you have PGBouncer enabled, and are using Postgres, you must configure metadataConnectionJson and resultBackendConnectionJson instead.
Use the values for your astro-user-name, astro-user-password, astro-db-name, and the host information you retrieved to compose the connection strings in the following format, depending on whether you want to define a result backend connection or a metadata database connection.
-
metadataConnectionJson: -
resultBackendConnectionJson:
Step 6: Add to Deployment configuration
Use the Houston API to create your Deployment configuration.
The following example shows the mutation and queries for using createDeployment. See Houston API code examples for examples on how to use the update and upsert options for configuring your Deployment.