YamboCalculator
This module manages parallel calculations with Yambo. Either a scheduler such as slurm or the python multiprocessing package can be used.
- class mppi.Calculators.YamboCalculator.YamboCalculator(omp=1, mpi=2, mpi_run='mpirun -np', executable='yambo', scheduler='direct', multiTask=True, skip=True, verbose=True, IO_time=5, clean_restart=True, **kwargs)[source]
Bases: mppi.Calculators.Runner.Runner
Manage (multiple) Yambo calculations performed in parallel. Computations are managed by a scheduler that, in the current implementation of the class, can be direct or slurm.
- Parameters
omp (int) – value of the OMP_NUM_THREADS variable
mpi (int) – number of mpi processes
mpi_run (string) – command for the execution of mpirun, e.g. ‘mpirun -np’ or ‘mpiexec -np’
executable (string) – set the executable (yambo, ypp, yambo_rt, …) of the Yambo package
scheduler (string) – choose the scheduler used to submit the job. The choices currently implemented are ‘direct’, which runs the computation using the python multiprocessing package, and ‘slurm’, which creates a slurm script
multiTask (bool) – if True, a single run_script is built and all the computations are performed in parallel; otherwise an independent script is built for each element of inputs and the computations are performed sequentially
skip (bool) – if True, evaluate whether one (or many) computations can be skipped. This is done by checking that the folder where yambo writes the results contains at least one file ‘o-*’, for each name in names
verbose (bool) – set the amount of information provided on terminal
IO_time (int) – time step (in seconds) used by the wait method to check that the job is completed
kwargs – other parameters that are stored in the _global_options dictionary
clean_restart (bool) – if True, delete the folder(s) with the output files and database before running the computation
Example
>>> code = YamboCalculator(omp=1,mpi=4,mpi_run='mpirun -np',executable='yambo',skip=True,verbose=True,scheduler='direct')
>>> code.run(inputs = ..., run_dir = ...,names = ...,jobnames = ...)
where the arguments of the run method are:
- Parameters
run_dir (string) – the folder in which the simulation is performed
inputs (list) – list with the instances of the YamboInput class that define the input objects
names (list) – list with the names associated with the input files (without extension), given in the same order as the inputs list. These strings are also used as the stems of the folders in which results are written and as part of the names of the output files.
jobnames (list) – list with the values of the jobname. If this variable is not specified, the value of name is assigned to jobname by process_run.
kwargs – other parameters that are stored in the run_options dictionary
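The jobname fallback described above can be sketched as follows. This is only an illustration, not the mppi code: the helper name resolve_jobnames is hypothetical, and the length check is an added assumption. It mirrors the documented rule that a missing jobname takes the value of the corresponding name.

```python
def resolve_jobnames(names, jobnames=None):
    """Return one jobname per name (hypothetical helper, not part of mppi)."""
    if jobnames is None:
        # Documented rule: with no jobnames given, each name doubles as its jobname.
        return list(names)
    if len(jobnames) != len(names):
        # Assumption: the two lists are expected to be aligned element by element.
        raise ValueError("names and jobnames must have the same length")
    return list(jobnames)


if __name__ == "__main__":
    print(resolve_jobnames(["hf_k4", "hf_k6"]))
    print(resolve_jobnames(["hf_k4", "hf_k6"], ["job_a", "job_b"]))
```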
- When the run method is called the class runs the command:
executable_name -F name.in -J jobname -C name
The calculator looks for the following variables in the run_options dictionary. These options may be useful for asynchronous computations managed by the slurm scheduler.
dry_run=True with this option the calculator sets up the calculations and writes the script for submitting the jobs, but the computations are not run.
wait_end_run=False with this option the calculator does not wait for the end of the run.
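How these two options could gate the stages of a run is sketched below. The function plan_steps is hypothetical and only returns the names of the stages; the defaults (dry_run False, wait_end_run True) are assumptions inferred from the way the options are described, not taken from the mppi source.

```python
def plan_steps(run_options):
    """List the stages a run would go through (hypothetical sketch, not mppi code)."""
    steps = ["build_run_script"]
    if run_options.get("dry_run", False):
        # The scripts are written, but nothing is submitted.
        return steps
    steps.append("submit_job")
    if run_options.get("wait_end_run", True):
        # Block until the submitted jobs are completed.
        steps.append("wait")
    return steps


if __name__ == "__main__":
    print(plan_steps({"dry_run": True}))
    print(plan_steps({"wait_end_run": False}))
    print(plan_steps({}))
```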
- build_run_script(to_run)[source]
Create the run script(s) that are executed by the submit_job() method. The scripts depend on the scheduler adopted, and specific methods for the direct and slurm schedulers are implemented.
- Parameters
to_run (list) – list with the cardinal numbers of the runs to be performed
- Returns
list with jobs to run. The type of the object in the list depends on the chosen scheduler
- Return type
list
- direct_scheduler(to_run)[source]
Define the list of Process objects (from the multiprocessing package) associated with the runs specified in the list to_run.
- Parameters
to_run (list) – list with the cardinal numbers of the runs to be performed
- Returns
list of the multiprocessing objects associated with the runs of the job
- Return type
list
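A minimal sketch of how such a list of Process objects might be built is given below. The names run_shell and direct_scheduler_sketch, and the choice of one Process per selected run, are assumptions made for illustration; the actual mppi implementation may organise the processes differently.

```python
import multiprocessing
import subprocess


def run_shell(command, cwd):
    """Target executed in the child process: run one shell command in cwd."""
    subprocess.run(command, shell=True, cwd=cwd, check=False)


def direct_scheduler_sketch(to_run, commands, run_dir):
    """Build (but do not start) one Process per selected run (hypothetical)."""
    return [
        multiprocessing.Process(target=run_shell, args=(commands[index], run_dir))
        for index in to_run
    ]


if __name__ == "__main__":
    # Placeholder commands stand in for the real yambo command lines.
    jobs = direct_scheduler_sketch([0, 2], ["echo a", "echo b", "echo c"], ".")
    print(len(jobs))
```

The processes are only created here; starting them is left to a separate step, in line with the split between building the jobs and submitting them.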
- post_processing()[source]
Return a dictionary with the names of the o- file(s) and the name of the folder that contains the databases. The construction of the lists for the output key is managed by the _get_output method. For the folders that contain the databases, the method returns None if the corresponding folder does not exist.
- Returns
- the dictionary
{‘output’ : [[o-1,o-2,…],[o-1,o-2,…],…], ‘dbs’ : [ndb_folder1,ndb_folder2,…]}
- Return type
dict
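The shape of the returned dictionary can be illustrated with the sketch below. It is not the mppi implementation: the function name is hypothetical, and the folder layout (o- files in run_dir/name, databases in run_dir/jobname) is an assumption based on the -C name and -J jobname options of the run command.

```python
import glob
import os


def post_processing_sketch(run_dir, names, jobnames):
    """Collect o- files and database folders for each run (hypothetical sketch)."""
    output, dbs = [], []
    for name, jobname in zip(names, jobnames):
        # Assumption: the o- files of a run are written in run_dir/name.
        output.append(sorted(glob.glob(os.path.join(run_dir, name, "o-*"))))
        # Assumption: the databases of a run are stored in run_dir/jobname.
        ndb_folder = os.path.join(run_dir, jobname)
        dbs.append(ndb_folder if os.path.isdir(ndb_folder) else None)
    return {"output": output, "dbs": dbs}
```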
- pre_processing()[source]
Process the local run dictionary. Check that the run_dir exists and that it contains the SAVE folder. Check that the input objects have been provided in the run parameters and write the input files on disk. If skip = False, clean the run_dir.
Note
If the run_dir and/or the SAVE folder do not exist, an alert is written but the execution of the run method proceeds.
- process_run()[source]
Method associated with the running of the executable. The method prepares the job script(s), then submits the jobs and waits for the end of the computation before passing to the post_processing() method. Computations are performed in parallel or serially according to the value of the multiTask option.
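One way to picture the effect of multiTask is as a grouping of the selected runs into scripts. The helper below is a hypothetical sketch, not mppi code: with multi_task=True all runs share a single group (one script, runs executed together), otherwise each run gets its own group (one script per run, executed one after the other).

```python
def group_runs(to_run, multi_task=True):
    """Group run indices into scripts (hypothetical helper, not part of mppi)."""
    if multi_task:
        # A single script that contains all the selected runs.
        return [list(to_run)] if to_run else []
    # One independent script for each selected run.
    return [[index] for index in to_run]


if __name__ == "__main__":
    print(group_runs([0, 1, 2], multi_task=True))
    print(group_runs([0, 1, 2], multi_task=False))
```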
- run_command(index)[source]
Define the run command used to run the computation associated to the input file $names[index]. The value of the command depends on the chosen scheduler.
- Parameters
index (int) – index of the computation to be performed
- Returns
command that runs the computation associated to the $names[index] input file
- Return type
string
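As an illustration, the command string could be assembled as in the sketch below. The function is hypothetical; the option pattern -F name.in -J jobname -C name comes from the description of the run method, while prefixing it with mpi_run and the number of mpi processes is an assumption. How OMP_NUM_THREADS is exported is left out of this sketch.

```python
def run_command_sketch(names, index, jobnames=None, mpi=2,
                       mpi_run="mpirun -np", executable="yambo"):
    """Assemble the command line for one input file (hypothetical sketch)."""
    name = names[index]
    # With no jobnames given, the name is used as the jobname.
    jobname = jobnames[index] if jobnames else name
    return f"{mpi_run} {mpi} {executable} -F {name}.in -J {jobname} -C {name}"


if __name__ == "__main__":
    print(run_command_sketch(["hf_k4", "hf_k6"], 1, mpi=4))
```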
- select_to_run()[source]
If the skip attribute of run_options is True, the method evaluates which computations can be skipped. This is done by checking whether the folder where yambo writes the results contains at least one file ‘o-*’.
- Returns
list with numbers of the computations that have to be performed, in the same order provided in the run method
- Return type
list
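The skip check can be sketched with a glob on each result folder. The function below is hypothetical and assumes that the results of the run called name are written in run_dir/name; it is meant to show the logic, not to reproduce the mppi code.

```python
import glob
import os


def select_to_run_sketch(run_dir, names, skip=True):
    """Return the indices of the runs that still have to be performed."""
    to_run = []
    for index, name in enumerate(names):
        # A run counts as done if its folder holds at least one 'o-*' file.
        has_output = bool(glob.glob(os.path.join(run_dir, name, "o-*")))
        if skip and has_output:
            continue
        to_run.append(index)
    return to_run
```

With skip=False every index is returned, so all the computations are performed again.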
- slurm_scheduler(to_run)[source]
Create the slurm script(s) associated to the runs specified in the list to_run.
- Parameters
to_run (list) – list with the cardinal numbers of the runs to be performed
- Returns
list with the names of the slurm scripts associated to the computations that are not skipped
- Return type
list
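For orientation, a slurm script for one run might look like the text produced by the sketch below. Everything here is an assumption made for illustration: the function name, the mapping of mpi to --ntasks and of omp to --cpus-per-task, and the layout of the script. The script actually written by mppi may contain different directives.

```python
def slurm_script_sketch(jobname, command, omp=1, mpi=2):
    """Return the text of a minimal slurm script (hypothetical sketch)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={jobname}",
        f"#SBATCH --ntasks={mpi}",          # assumption: one task per mpi process
        f"#SBATCH --cpus-per-task={omp}",   # assumption: one cpu per OpenMP thread
        "",
        f"export OMP_NUM_THREADS={omp}",
        command,
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(slurm_script_sketch("hf", "mpirun -np 4 yambo -F hf.in -J hf -C hf", mpi=4))
```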
- submit_job(jobs)[source]
Submit the job.
- Parameters
jobs – The reference to the jobs to be executed. If the scheduler is direct, jobs is a list of multiprocessing Process instances. If the scheduler is slurm, jobs is a list with the names of the slurm scripts
- wait(jobs, to_run)[source]
Wait for the end of the jobs.
- Parameters
jobs – The reference to the jobs to be executed. If the scheduler is direct, jobs is a list of multiprocessing Process instances. If the scheduler is slurm, jobs is a list with the names of the slurm scripts
to_run (list) – list with the cardinal numbers of the runs to be performed
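The polling behaviour controlled by IO_time can be sketched as a simple loop. The function below is hypothetical: it takes a callable is_finished, so that the same loop can serve both schedulers. For the direct scheduler, a possible choice is a callable that returns True when no Process in jobs is alive.

```python
import time


def wait_sketch(is_finished, io_time=5, verbose=True):
    """Poll is_finished every io_time seconds until it returns True."""
    while not is_finished():
        if verbose:
            print("computation still running...")
        time.sleep(io_time)


if __name__ == "__main__":
    # Toy condition that becomes True at the third check.
    checks = {"count": 0}

    def done():
        checks["count"] += 1
        return checks["count"] >= 3

    wait_sketch(done, io_time=0.1)
```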