[1]:
# useful to autoreload the module without restarting the kernel
%load_ext autoreload
%autoreload 2
[2]:
from mppi import InputFiles as I, Calculators as C
Tutorial for the QeCalculator class
This tutorial describes the usage of the QeCalculator class, which manages the execution of (many) QuantumESPRESSO calculations in parallel.
[3]:
run_dir = 'QeCalculator_test'
Perform (many) scf computations for silicon
We initialize the PwInput object from an existing input file. Then we define 4 inputs, with their associated names, by considering different values of the energy cutoff of the wave-functions.
[4]:
enegy_cutoffs = [40,50,60,70]
[5]:
from copy import deepcopy
inp = I.PwInput(file='IO_files/si_scf.in')
inp.set_kpoints(points = [6,6,6])
inputs = []
names = []
for e in enegy_cutoffs:
    prefix = 'ecut_%s'%e
    inp.set_prefix(prefix)
    inp.set_energy_cutoff(e)
    inputs.append(deepcopy(inp))
    names.append(prefix)
[6]:
names
[6]:
['ecut_40', 'ecut_50', 'ecut_60', 'ecut_70']
Note that we have chosen the prefix of the input object as the name of the file. In this way the input (.in), log and xml files share their name with the prefix folder created by QuantumESPRESSO.
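The deepcopy call in the loop matters because the same inp object is modified at every iteration. The following is a minimal sketch of this point, in which a plain nested dictionary stands in for the PwInput object (an assumption made only so that the example runs without mppi; the key names are illustrative):

```python
from copy import deepcopy

# Plain nested dict standing in for a PwInput object (illustrative assumption)
template = {'control': {'prefix': None}, 'system': {'ecutwfc': None}}

aliased, copied = [], []
for e in [40, 50, 60, 70]:
    template['control']['prefix'] = 'ecut_%s' % e
    template['system']['ecutwfc'] = e
    aliased.append(template)           # every entry refers to the same object
    copied.append(deepcopy(template))  # every entry is an independent snapshot

print([d['system']['ecutwfc'] for d in aliased])  # [70, 70, 70, 70]
print([d['system']['ecutwfc'] for d in copied])   # [40, 50, 60, 70]
```

Without the copy, all the elements of the list would end up describing the last cutoff only.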
Now we define an instance of the QeCalculator. For this example we use the direct scheduler, so the computations are run in parallel using the Python multiprocessing module.
[7]:
C.QeCalculator?
Init signature: C.QeCalculator(omp=1, mpi=2, mpi_run='mpirun -np', executable='pw.x', scheduler='direct', multiTask=True, skip=True, verbose=True, IO_time=5, **kwargs)
Docstring:
Manage (multiple) QuantumESPRESSO calculations performed in parallel. Computations
are managed by a scheduler that, in the actual implementation of the class, can
be `direct` or `slurm`.
Parameters:
omp (:py:class:`int`) : value of the OMP_NUM_THREADS variable
mpi (:py:class:`int`) : number of mpi processes
mpi_run (:py:class:`string`) : command for the execution of mpirun, e.g. 'mpirun -np' or 'mpiexec -np'
executable (:py:class:`string`) : set the executable (pw.x, ph.x, ..) of the QuantumESPRESSO package
scheduler (:py:class:`string`) : choose the scheduler used to submit the job, actually the choices implemented are
'direct' that runs the computation using the python multiprocessing package and 'slurm' that creates a slurm script
multiTask (:py:class:`bool`) : if true a single run_script is built and all the computations are performed in parallel,
otherwise an independent script is built for each elements of inputs and the computations are performed sequentially
skip (:py:class:`bool`) : if True evaluate if one (or many) computations can be skipped.
This is done by checking if the file $name.xml is present in the prefix folder,
for each name in names
verbose (:py:class:`bool`) : set the amount of information provided on terminal
IO_time (int) : time step (in second) used by the wait method to check that the job is completed
kwargs : other parameters that are stored in the _global_options dictionary. For instance the variable
sbatch_options = [option1,option2,....] allows the user to include further options in the slurm script
Example:
>>> code = calculator(omp=1,mpi=4,mpi_run='mpirun -np',skip=True,verbose=True,scheduler='direct')
>>> code.run(inputs = [...], run_dir = ...,names = [...], source_dir = ..., **kwargs)
where the arguments of the run method are:
Args:
run_dir (:py:class:`string`) : the folder in which the simulation is performed
inputs (:py:class:`list`) : list with the instances of the :class:`PwInput` class
that define the input objects
names (:py:class:`list`) : list with the names associated to the input files,
given in the same order of the inputs list.
Usually you can set the name equal to the prefix of the input object so
the name of the input file and the prefix folder built by QuantumESPRESSO
are equal
source_dir (:py:class:`string`) : location of the scf source folder for a nscf computation.
If present the class copies this folder in the run_dir with the name $prefix.save
kwargs : other parameters that are stored in the run_options dictionary
File: ~/Applications/MPPI/mppi/Calculators/QeCalculator.py
Type: type
[8]:
code = C.QeCalculator(mpi=2)
code.global_options()
Initialize a parallel QuantumESPRESSO calculator with scheduler direct
[8]:
{'omp': 1,
'mpi': 2,
'mpi_run': 'mpirun -np',
'executable': 'pw.x',
'scheduler': 'direct',
'multiTask': True,
'skip': True,
'verbose': True,
'IO_time': 5}
We run the computation(s) by passing the list of input objects and the associated names to the run method of the calculator.
[11]:
results = code.run(run_dir=run_dir,inputs=inputs,names=names,other_variable = 1,skip=False)
results
delete log file: QeCalculator_test/ecut_40.log
delete xml file: QeCalculator_test/ecut_40.xml
delete folder: QeCalculator_test/ecut_40.save
delete log file: QeCalculator_test/ecut_50.log
delete xml file: QeCalculator_test/ecut_50.xml
delete folder: QeCalculator_test/ecut_50.save
delete log file: QeCalculator_test/ecut_60.log
delete xml file: QeCalculator_test/ecut_60.xml
delete folder: QeCalculator_test/ecut_60.save
delete log file: QeCalculator_test/ecut_70.log
delete xml file: QeCalculator_test/ecut_70.xml
delete folder: QeCalculator_test/ecut_70.save
run 0 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log
run 1 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_50.in > ecut_50.log
run 2 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_60.in > ecut_60.log
run 3 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_70.in > ecut_70.log
run0_is_running: True run1_is_running: True run2_is_running: True run3_is_running: True
run0_is_running: True run1_is_running: True run2_is_running: True run3_is_running: True
run0_is_running: False run1_is_running: False run2_is_running: False run3_is_running: True
Job completed
[11]:
{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
'QeCalculator_test/ecut_50.save/data-file-schema.xml',
'QeCalculator_test/ecut_60.save/data-file-schema.xml',
'QeCalculator_test/ecut_70.save/data-file-schema.xml']}
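The runN_is_running lines printed above suggest a loop that starts all the jobs and then checks them at regular intervals (the IO_time parameter). The following is only a rough sketch of that pattern, not the code of the class: it uses subprocess.Popen instead of the multiprocessing module, short Python sleeps instead of pw.x, and the name run_and_wait is hypothetical.

```python
import subprocess
import sys
import time

def run_and_wait(commands, io_time=0.1):
    """Start all the commands at once and poll them every io_time seconds."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    while True:
        # poll() returns None while the process is still alive
        running = [p.poll() is None for p in procs]
        print(' '.join('run%s_is_running: %s' % (i, r) for i, r in enumerate(running)))
        if not any(running):
            break
        time.sleep(io_time)
    print('Job completed')
    return [p.returncode for p in procs]

# Stand-in jobs: short sleeps instead of 'mpirun -np 2 pw.x -inp ...'
commands = [[sys.executable, '-c', 'import time; time.sleep(%s)' % t] for t in (0.2, 0.4)]
return_codes = run_and_wait(commands)
```

A larger polling interval reduces the number of checks, at the price of a longer delay between the end of the last job and the return of the function.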
After the run, all the parameters passed to the calculator are stored in the run_options attribute.
[14]:
#code.run_options
We observe that, if the simulation does not crash, the run method returns a dictionary whose 'output' key contains the list of the data-file-schema.xml files (including their relative paths) for subsequent parsing. The elements of the list follow the order of the input objects in the inputs list.
Let us now see what happens if the simulation fails, for instance if we provide an empty input to code.
[15]:
inp2 = I.PwInput()
[16]:
prefix = 'si_scf_test2'
inp2.set_prefix(prefix)
inp2
[16]:
{'control': {'prefix': "'si_scf_test2'"},
'system': {},
'electrons': {},
'ions': {},
'cell': {},
'atomic_species': {},
'atomic_positions': {},
'kpoints': {},
'cell_parameters': {}}
[17]:
result2 = code.run(inputs = [inp2], run_dir = run_dir,names=[prefix])
result2
run 0 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp si_scf_test2.in > si_scf_test2.log
run0_is_running: True
Job completed
[17]:
{'output': [None]}
In this case the corresponding element of the output list is None.
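Since a failed run appears as None in the output list, it can be convenient to separate completed and failed computations before parsing. A possible sketch follows; split_results is a hypothetical helper, not part of mppi, and it only relies on the dictionary structure shown in the outputs above.

```python
def split_results(names, results):
    """Pair each name with its output path and collect the failed runs (None)."""
    completed, failed = {}, []
    for name, path in zip(names, results['output']):
        if path is None:
            failed.append(name)
        else:
            completed[name] = path
    return completed, failed

# Dictionary shaped like the ones returned by the run method above
example = {'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml', None]}
completed, failed = split_results(['ecut_40', 'si_scf_test2'], example)
print(completed)  # {'ecut_40': 'QeCalculator_test/ecut_40.save/data-file-schema.xml'}
print(failed)     # ['si_scf_test2']
```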
Usage of the skip parameter
If we repeat a calculation that has already been performed and skip = True, the class skips it. For instance:
[15]:
results = code.run(run_dir=run_dir,inputs=inputs,names=names, skip = True)
results
Skip the computation for input ecut_40
Skip the computation for input ecut_50
Skip the computation for input ecut_60
Skip the computation for input ecut_70
Job completed
[15]:
{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
'QeCalculator_test/ecut_50.save/data-file-schema.xml',
'QeCalculator_test/ecut_60.save/data-file-schema.xml',
'QeCalculator_test/ecut_70.save/data-file-schema.xml']}
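If we add one element to inputs and run again, only the new element is computed. (In the output recorded below ecut_80 is skipped as well, presumably because its results were already present in run_dir from an earlier execution of the notebook.)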
[16]:
e = 80
prefix = 'ecut_%s'%e
inp.set_prefix(prefix)
inp.set_energy_cutoff(e)
inputs.append(deepcopy(inp))
names.append(prefix)
[17]:
results = code.run(run_dir=run_dir,inputs=inputs,names=names, skip = True)
results
Skip the computation for input ecut_40
Skip the computation for input ecut_50
Skip the computation for input ecut_60
Skip the computation for input ecut_70
Skip the computation for input ecut_80
Job completed
[17]:
{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
'QeCalculator_test/ecut_50.save/data-file-schema.xml',
'QeCalculator_test/ecut_60.save/data-file-schema.xml',
'QeCalculator_test/ecut_70.save/data-file-schema.xml',
'QeCalculator_test/ecut_80.save/data-file-schema.xml']}
Instead, if skip = False the class removes the previous results of the selected computations from the run_dir before running them again. For instance:
[18]:
results = code.run(run_dir=run_dir,inputs=inputs[0:1],names=names[0:1], skip = False)
results
delete log file: QeCalculator_test/ecut_40.log
delete xml file: QeCalculator_test/ecut_40.xml
delete folder: QeCalculator_test/ecut_40.save
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log
run0_is_running:True
Job completed
[18]:
{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml']}
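The behaviour of the skip parameter can be mimicked with a few file-system checks. The sketch below is written under assumptions: the names can_skip and clean_run are hypothetical, and the skip criterion used here (presence of data-file-schema.xml in the $name.save folder) is a guess that may differ from the one implemented in the class. The files removed by clean_run are those listed in the "delete" messages above.

```python
import os
import shutil
import tempfile

def can_skip(run_dir, name):
    """Assumed criterion: the run is done if its data-file-schema.xml exists."""
    return os.path.isfile(os.path.join(run_dir, name + '.save', 'data-file-schema.xml'))

def clean_run(run_dir, name):
    """Remove the log file, the xml file and the .save folder of a previous run."""
    for ext in ('.log', '.xml'):
        path = os.path.join(run_dir, name + ext)
        if os.path.isfile(path):
            os.remove(path)
    save_dir = os.path.join(run_dir, name + '.save')
    if os.path.isdir(save_dir):
        shutil.rmtree(save_dir)

# Demonstration in a temporary folder with a fake, empty output file
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, 'ecut_40.save'))
open(os.path.join(demo_dir, 'ecut_40.save', 'data-file-schema.xml'), 'w').close()

skip_before = can_skip(demo_dir, 'ecut_40')  # the fake output is present
clean_run(demo_dir, 'ecut_40')
skip_after = can_skip(demo_dir, 'ecut_40')   # the output has been removed
print(skip_before, skip_after)               # True False
shutil.rmtree(demo_dir)
```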
Usage of the multiTask feature
By default the calculator runs all the computations in parallel. However, if the option multiTask = False is used, the computations are performed in sequence.
[19]:
results = code.run(run_dir=run_dir,inputs=inputs[0:4],names=names[0:4], skip = False, multiTask = False)
results
delete log file: QeCalculator_test/ecut_40.log
delete xml file: QeCalculator_test/ecut_40.xml
delete folder: QeCalculator_test/ecut_40.save
delete log file: QeCalculator_test/ecut_50.log
delete xml file: QeCalculator_test/ecut_50.xml
delete folder: QeCalculator_test/ecut_50.save
delete log file: QeCalculator_test/ecut_60.log
delete xml file: QeCalculator_test/ecut_60.xml
delete folder: QeCalculator_test/ecut_60.save
delete log file: QeCalculator_test/ecut_70.log
delete xml file: QeCalculator_test/ecut_70.xml
delete folder: QeCalculator_test/ecut_70.save
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log
run0_is_running:True
Job completed
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_50.in > ecut_50.log
run0_is_running:True
Job completed
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_60.in > ecut_60.log
run0_is_running:True
Job completed
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_70.in > ecut_70.log
run0_is_running:True
Job completed
[19]:
{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
'QeCalculator_test/ecut_50.save/data-file-schema.xml',
'QeCalculator_test/ecut_60.save/data-file-schema.xml',
'QeCalculator_test/ecut_70.save/data-file-schema.xml']}
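The difference between the two modes can be illustrated with a small stand-alone sketch. It is not the implementation of the class: threads are used here as a stand-in for the actual scheduler, and fake_run and run_all are hypothetical names. The point it shows is that the order of the results follows the order of the inputs in both modes.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fake_run(name):
    """Stand-in for a computation: wait a little and return the output path."""
    time.sleep(0.05)
    return name + '.save/data-file-schema.xml'

def run_all(names, multi_task=True):
    """Run all the tasks together (multi_task=True) or one at a time."""
    workers = max(1, len(names)) if multi_task else 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map returns the results in the order of the inputs
        return list(pool.map(fake_run, names))

task_names = ['ecut_40', 'ecut_50', 'ecut_60', 'ecut_70']
parallel = run_all(task_names, multi_task=True)
sequential = run_all(task_names, multi_task=False)
print(parallel == sequential)  # True
```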
Test of the slurm scheduler
If the slurm scheduler is chosen, the calculator prepares the slurm script and submits it. The effects of the skip and multiTask parameters can be tested with this scheduler as well.
[ ]:
results = code.run(run_dir=run_dir,inputs=inputs,names=names, scheduler = 'slurm', skip = False, multiTask = True)
results
The slurm script is written in the run_dir. The execution of the run requires that the slurm scheduler is installed.
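To give an idea of what such a script may contain, the sketch below assembles a batch script as a string. The layout, the function name build_slurm_script and the choice of the #SBATCH directives are assumptions made for illustration; the script actually written by the class may look different. The sbatch_options argument mirrors the variable of the same name mentioned in the docstring.

```python
def build_slurm_script(commands, mpi=2, omp=1, sbatch_options=(), multi_task=True):
    """Assemble the text of a slurm batch script (illustrative layout only)."""
    # Assumption: in multiTask mode all the runs share the allocation at once
    ntasks = mpi * len(commands) if multi_task else mpi
    lines = ['#!/bin/bash',
             '#SBATCH --ntasks=%s' % ntasks,
             '#SBATCH --cpus-per-task=%s' % omp]
    lines += ['#SBATCH %s' % opt for opt in sbatch_options]
    lines.append('export OMP_NUM_THREADS=%s' % omp)
    for cmd in commands:
        # In multiTask mode each command is sent to the background
        lines.append((cmd + ' &') if multi_task else cmd)
    if multi_task:
        lines.append('wait')  # wait for all the background commands
    return '\n'.join(lines) + '\n'

commands = ['mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log',
            'mpirun -np 2 pw.x -inp ecut_50.in > ecut_50.log']
script = build_slurm_script(commands, sbatch_options=['--partition=debug'])
print(script)
```

The partition name passed in sbatch_options is a placeholder and has to be replaced with a value valid on the target cluster.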
Perform a nscf computation for silicon. Usage of the source_dir option
We show how to perform a pw nscf calculation that uses the results of the first scf run as input.
Note that a single source_dir is passed to each call of run, so only nscf computations that start from the same scf folder can be run in parallel.
For instance, we consider two nscf computations.
[12]:
run_dir
[12]:
'QeCalculator_test'
[9]:
num_bands = [8,12]
[10]:
inputs = []
names = []
for n in num_bands:
    inp.set_nscf(n,force_symmorphic=True)
    prefix = 'bands_%s'%n
    inp.set_prefix(prefix)
    inp.set_energy_cutoff(40)
    inputs.append(deepcopy(inp))
    names.append(prefix)
[11]:
results = code.run(inputs=inputs,run_dir=run_dir,names=names,source_dir='QeCalculator_test/ecut_40.save')
results
The folder QeCalculator_test/bands_8.save already exsists. Source folder QeCalculator_test/ecut_40.save not copied
The folder QeCalculator_test/bands_12.save already exsists. Source folder QeCalculator_test/ecut_40.save not copied
Skip the run of bands_8
Skip the run of bands_12
Job completed
[11]:
{'output': ['QeCalculator_test/bands_8.save/data-file-schema.xml',
'QeCalculator_test/bands_12.save/data-file-schema.xml']}
Instead, if skip = False the class deletes the existing output files before running the computation again.
[23]:
results = code.run(inputs=inputs,run_dir=run_dir,names=names,source_dir='QeCalculator_test/ecut_40.save',skip=False)
results
delete log file: QeCalculator_test/bands_8.log
delete xml file: QeCalculator_test/bands_8.xml
delete folder: QeCalculator_test/bands_8.save
delete log file: QeCalculator_test/bands_12.log
delete xml file: QeCalculator_test/bands_12.xml
delete folder: QeCalculator_test/bands_12.save
Copy source_dir QeCalculator_test/ecut_40.save in the QeCalculator_test/bands_8.save
Copy source_dir QeCalculator_test/ecut_40.save in the QeCalculator_test/bands_12.save
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp bands_8.in > bands_8.log
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp bands_12.in > bands_12.log
run0_is_running:True run1_is_running:True
Job completed
[23]:
{'output': ['QeCalculator_test/bands_8.save/data-file-schema.xml',
'QeCalculator_test/bands_12.save/data-file-schema.xml']}
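The handling of source_dir seen in the messages above (the scf folder is copied to $prefix.save unless that folder already exists) can be sketched with the standard library. The function copy_source is a hypothetical helper written for illustration, not the method used by the class, and the demonstration works in a temporary folder with a fake scf output.

```python
import os
import shutil
import tempfile

def copy_source(source_dir, run_dir, prefix):
    """Copy source_dir to run_dir/prefix.save unless the destination exists."""
    dest = os.path.join(run_dir, prefix + '.save')
    if os.path.isdir(dest):
        print('The folder %s already exists. Source folder %s not copied' % (dest, source_dir))
        return False
    shutil.copytree(source_dir, dest)
    return True

# Demonstration with a fake scf .save folder
demo_dir = tempfile.mkdtemp()
source = os.path.join(demo_dir, 'ecut_40.save')
os.makedirs(source)
open(os.path.join(source, 'data-file-schema.xml'), 'w').close()

first = copy_source(source, demo_dir, 'bands_8')   # the folder is copied
second = copy_source(source, demo_dir, 'bands_8')  # the folder is already there
copied_file = os.path.isfile(os.path.join(demo_dir, 'bands_8.save', 'data-file-schema.xml'))
print(first, second, copied_file)  # True False True
shutil.rmtree(demo_dir)
```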