
How to use parsl parallelization

Many QIIME 2 actions can be parallelized by executing them on smaller subsets of the data called partitions. To make use of parallelization, you will need a parsl config file that defines the resources available for parallel execution. You can read more about how parallelization works in QIIME 2 by heading to one of the following resources:

The table below lists the resource requirements of this example job, the parsl config parameter that controls each one, and the value to set:

| requirement  | parsl config parameter | value |
|--------------|------------------------|-------|
| worker CPUs  | cores_per_node         | 12    |
| worker RAM   | mem_per_node           | 96 GB |
| worker count | max_blocks             | 25    |
| runtime      | walltime               | 24 h  |

The final config file would then look like this:

[parsl]

[[parsl.executors]]
class = "HighThroughputExecutor"
label = "default"

[parsl.executors.provider]
class = "SlurmProvider"
worker_init = "source ~/.bashrc && conda activate q2-moshpit-2025.10"
walltime = "24:00:00"   # runtime
nodes_per_block = 1
cores_per_node = 12     # worker CPUs
mem_per_node = 96       # worker RAM in GB
max_blocks = 25         # worker count
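
Once saved, e.g., as parsl.toml (the file name is just an assumption), the config can be passed to a QIIME 2 pipeline action on the command line. The sketch below assumes a recent QIIME 2 release in which pipeline actions expose the --parallel-config option; the plugin and action names are placeholders, not a real command:

# <plugin> and <action> are placeholders - substitute the pipeline you want to run
qiime <plugin> <action> \
  [inputs and parameters of that action] \
  --parallel-config parsl.toml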

Some HPCs may not allow certain flags to be passed to the batch system; for example, it may not be possible to request the total amount of memory per node, only the amount per CPU. In those cases you can use the scheduler_options directive to pass any flag that your HPC supports, e.g.:

[parsl]

[[parsl.executors]]
class = "HighThroughputExecutor"
label = "default"

[parsl.executors.provider]
class = "SlurmProvider"
scheduler_options = "#SBATCH --mem-per-cpu=8G"
worker_init = "source ~/.bashrc && conda activate q2-moshpit-2025.10"
walltime = "24:00:00"
nodes_per_block = 1
cores_per_node = 12
max_blocks = 25

Notice the worker_init directive, which is used in both configs to activate the same conda environment in each worker.

This is, of course, only one scenario: depending on your exact use case, resource requirements, and HPC configuration, the final config file may look different. The full list of available resource providers can be found in the parsl documentation.
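
As an illustration of a different provider (a minimal sketch, not taken from any specific HPC setup), the config below uses parsl's LocalProvider to run all workers directly on the node where the QIIME 2 command itself is executed, e.g., inside an interactive job, instead of submitting additional batch jobs:

[parsl]

[[parsl.executors]]
class = "HighThroughputExecutor"
label = "default"

[parsl.executors.provider]
class = "LocalProvider"
# a single block of workers started on the current node
init_blocks = 1
max_blocks = 1

Other providers from the parsl documentation should be configurable in the same way, though the exact set of supported parameters depends on the provider.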