ogdc_runner.parallel module

Orchestration for parallel execution of workflow tasks.

This module provides the abstract ParallelExecutionOrchestrator base class which defines the interface for managing parallel Argo workflow tasks. It handles:

  1. Creating execution templates (Container templates or Hera @script functions)

  2. Partitioning input data into parallel chunks

  3. Creating DAG tasks with proper dependencies and parameters

The maximum parallelism is controlled at the workflow level, allowing the Argo workflow engine to automatically schedule tasks as resources become available.

Example

orchestrator = ShellParallelExecutionOrchestrator(

recipe_config=recipe_config, execution_function=ExecutionFunction(name=”cmd-0”, command=”process.sh”),

) template = orchestrator.create_execution_template() tasks = orchestrator.create_parallel_tasks(template)

class ogdc_runner.parallel.ParallelExecutionOrchestrator(recipe_config: RecipeConfig, execution_function: ExecutionFunction) None

Bases: ABC

Abstract base class for orchestrating parallel execution of workflow tasks.

This class defines the interface for creating Argo DAG tasks for parallel execution of a single execution function. Each execution function should have its own orchestrator instance, and the DAG dependencies between different execution functions should be managed by the workflow implementation.

Subclasses must implement: - create_execution_template(): Create workflow-specific execution templates - _create_tasks_from_partitions(): Create tasks from partitions with specific parameters

Parameters:
abstractmethod create_execution_template() Any

Create workflow-specific execution template.

Return type:

Any

Returns:

Execution template (Container, Hera @script function, etc.)

Raises:

ValueError – If execution function configuration is invalid

create_parallel_tasks(template: Any) list[hera.workflows.Task]

Create Argo DAG tasks for parallel execution.

Must be called after create_execution_template() and can be called within DAG context.

Parameters:

template (Any) – Execution template from create_execution_template()

Return type:

list[Task]

Returns:

List of Argo Task objects for parallel execution