Argo Workflows
Each OGDC recipe is executed as one or more Argo workflows. The OGDC Service API translates user recipes into Argo workflows, which are then submitted to and executed by Argo.
This document describes how Argo is set up and used by the OGDC.
argo-workflows server configuration
The configuration for argo-workflows is defined in the ogdc-helm repository. Refer to the configuration there for the specifics of how Argo is set up for deployment.
Argo Python API (Hera)
The OGDC uses the hera Python package
to interact with the argo-workflows API. Boilerplate setup and configuration of
Argo via hera can be found in ogdc_runner.argo.
Workflow artifacts
Argo artifacts are configured to use an S3-compatible artifact repository.
Artifacts are used to store intermediate workflow outputs and the final output
of recipes with a temporary output type.
Artifacts are garbage collected automatically when their workflow is deleted. See the Workflow persistence section below for details on automatic workflow deletion.
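To illustrate what garbage collection on deletion looks like at the Argo level, the sketch below adds Argo's `artifactGC` setting, with the `OnWorkflowDeletion` strategy, to a workflow spec. Whether the OGDC sets this per workflow or through workflow defaults in ogdc-helm is not stated here, so the helper name `with_artifact_gc` and the example spec are assumptions made for illustration only.

```python
import json


def with_artifact_gc(spec: dict) -> dict:
    """Return a copy of a workflow spec with artifact GC on deletion.

    Hypothetical helper for illustration; it is not part of ogdc_runner.
    """
    updated = dict(spec)
    # Argo deletes the workflow's artifacts when the workflow is deleted.
    updated["artifactGC"] = {"strategy": "OnWorkflowDeletion"}
    return updated


# Minimal example spec; "main" is a placeholder entrypoint name.
spec = with_artifact_gc({"entrypoint": "main"})
print(json.dumps(spec, indent=2))
```

With this strategy, artifact lifetime follows workflow lifetime, so the retention periods for workflows also bound how long artifacts stay available.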
Workflow persistence
Successful Argo workflows are retained for 1 day. Workflows for recipes that
use the temporary output type are retained for 7 days, to allow sufficient
time to retrieve final outputs.
Successful workflows with the ogdc/persist-workflow-in-archive: true label
are archived in the OGDC’s PostgreSQL database for long-term storage. These
archived workflows can be used for metrics and data provenance purposes.
Non-successful workflows are not automatically cleaned up or archived. They are retained for inspection/debugging and should be cleaned up manually once the issue leading to failure is resolved.
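The retention rules above can be sketched as a small function that produces the relevant parts of a workflow manifest. This is an illustration under assumptions, not the OGDC implementation: the function names are hypothetical, and the literal string `"temporary"` is assumed to identify the temporary output type. Only `secondsAfterSuccess` is set in Argo's `ttlStrategy`, so workflows that do not succeed are never deleted by the TTL.

```python
ONE_DAY_SECONDS = 24 * 60 * 60
PERSIST_LABEL = "ogdc/persist-workflow-in-archive"


def success_ttl_seconds(output_type: str) -> int:
    """Retention after success: 7 days for temporary outputs, else 1 day."""
    days = 7 if output_type == "temporary" else 1
    return days * ONE_DAY_SECONDS


def retention_manifest(output_type: str, archive: bool = True) -> dict:
    """Build the labels and TTL portions of a workflow manifest.

    Hypothetical helper for illustration; it is not part of ogdc_runner.
    """
    labels = {PERSIST_LABEL: "true"} if archive else {}
    return {
        "metadata": {"labels": labels},
        "spec": {
            # No secondsAfterFailure: failed workflows stay for debugging.
            "ttlStrategy": {
                "secondsAfterSuccess": success_ttl_seconds(output_type)
            }
        },
    }


manifest = retention_manifest("temporary")
print(manifest["spec"]["ttlStrategy"])
```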
To ensure consistent behavior across OGDC-submitted Argo workflows (e.g., setting
the archival label and the TTL for successful workflows), the
ogdc_runner.argo.OgdcWorkflow context manager wraps hera’s
Workflow.
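The general pattern of a context manager that applies shared defaults can be sketched without hera or a cluster. The code below is not the OgdcWorkflow implementation and does not reflect its signature: `StandInWorkflow` and `defaulted_workflow` are hypothetical names, and the stand-in class only mimics the idea of a workflow object that collects steps.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class StandInWorkflow:
    """Stand-in for a workflow object so the sketch runs on its own."""

    name: str
    labels: dict = field(default_factory=dict)
    ttl_seconds_after_success: Optional[int] = None
    steps: list = field(default_factory=list)


@contextmanager
def defaulted_workflow(
    name: str, *, ttl_seconds: int = 86400, archive: bool = True
) -> Iterator[StandInWorkflow]:
    """Yield a workflow with the archival label and success TTL preset."""
    labels = {"ogdc/persist-workflow-in-archive": "true"} if archive else {}
    yield StandInWorkflow(
        name=name, labels=labels, ttl_seconds_after_success=ttl_seconds
    )


# Callers only describe their steps; the shared defaults are already set.
with defaulted_workflow("example-recipe") as wf:
    wf.steps.append("fetch-data")

print(wf)
```

Centralizing the defaults in one wrapper means individual recipes cannot forget the label or the TTL, which is the consistency benefit described above.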