ogdc_runner.partitioning module
Partitioning strategy for dividing work across parallel tasks.
This module creates independent chunks of work (partitions) by grouping files based on a configured partition size. Each partition represents a unit of work that will be executed in parallel.
- ogdc_runner.partitioning.create_partitions(inputs: Sequence[DataOneInput | UrlInput] | Sequence[Path], execution_function: ExecutionFunction, parallel_config: ParallelConfig | None = None) → list[FilePartition]
Create file partitions for parallel execution.
Groups input files into partitions based on the configured partition size. Each partition will be processed independently in parallel.
- Parameters:
inputs (Sequence[DataOneInput | UrlInput] | Sequence[Path]) – List of input parameters or file paths to partition
execution_function (ExecutionFunction) – Execution function to create partitions for
parallel_config (ParallelConfig | None) – Optional parallel execution configuration
- Return type:
list[FilePartition]
- Returns:
List of FilePartition objects, each containing a subset of files
- Raises:
ValueError – If no files are provided or extracted from inputs
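The grouping described above can be illustrated with a minimal sketch. It is not the ogdc_runner implementation: `chunk_files` and `partition_size` are hypothetical names chosen for illustration, and the sketch assumes that partitions are consecutive slices of the input with at most `partition_size` files each. It returns plain lists rather than FilePartition objects and does not use ExecutionFunction or ParallelConfig.

```python
from pathlib import Path
from typing import Sequence


def chunk_files(files: Sequence[Path], partition_size: int) -> list[list[Path]]:
    """Group files into consecutive chunks of at most partition_size items.

    Hypothetical helper for illustration only; not part of ogdc_runner.
    """
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    if not files:
        # Mirrors the documented behavior of raising ValueError on empty input.
        raise ValueError("No files provided to partition")
    return [
        list(files[start : start + partition_size])
        for start in range(0, len(files), partition_size)
    ]


# Five example paths split into chunks of two: sizes 2, 2, and 1.
files = [Path(f"data/file_{n}.tif") for n in range(5)]
chunks = chunk_files(files, partition_size=2)
```

Under these assumptions the last chunk may be smaller than the others, and each chunk can be handed to a separate parallel task without sharing state with the rest.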