ogdc_runner.partitioning module

Partitioning strategy for dividing work across parallel tasks.

This module creates independent chunks of work (partitions) by grouping files according to a configured partition size. Each partition is a unit of work that runs independently of, and in parallel with, the other partitions.

ogdc_runner.partitioning.create_partitions(inputs: Sequence[DataOneInput | UrlInput] | Sequence[Path], execution_function: ExecutionFunction, parallel_config: ParallelConfig | None = None) → list[FilePartition]

Create file partitions for parallel execution.

Groups input files into partitions based on the configured partition size. Each partition will be processed independently in parallel.

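Parameters:

inputs (Sequence[DataOneInput | UrlInput] | Sequence[Path])
execution_function (ExecutionFunction)
parallel_config (ParallelConfig | None), defaults to None

Return type: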

list[FilePartition]

Returns:

List of FilePartition objects, each containing a subset of files

Raises:

ValueError – If no files are provided or extracted from inputs
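
Example:

The grouping described above can be illustrated with a minimal, standalone sketch. It is not the ogdc_runner implementation: the helper name chunk_files, its partition_size argument, and the sample file paths are hypothetical, and plain lists of Path objects stand in for FilePartition. It only shows the general idea of splitting files into fixed-size groups and raising ValueError when there is nothing to partition.

```python
from pathlib import Path
from typing import Sequence


def chunk_files(files: Sequence[Path], partition_size: int) -> list[list[Path]]:
    """Split files into consecutive groups of at most partition_size items.

    Hypothetical helper for illustration; not part of ogdc_runner.
    """
    if partition_size < 1:
        raise ValueError("partition_size must be a positive integer")
    if not files:
        # Mirrors the documented behaviour: no files is an error.
        raise ValueError("No files provided")
    # Step through the sequence in strides of partition_size; the final
    # group may be smaller when the file count is not an exact multiple.
    return [
        list(files[start : start + partition_size])
        for start in range(0, len(files), partition_size)
    ]


# Made-up paths; nothing is read from disk.
files = [Path(f"data/file_{i}.tif") for i in range(5)]
partitions = chunk_files(files, partition_size=2)
print([len(p) for p in partitions])  # prints [2, 2, 1]
```

Under these assumptions, five files with a partition size of two yield three groups, and each group could then be handed to a separate parallel task.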