[1]:

# useful to autoreload the module without restarting the kernel
%load_ext autoreload
%autoreload 2

[2]:

from mppi import InputFiles as I, Calculators as C

Tutorial for the QeCalculator class

This tutorial describes the usage of the QeCalculator class, which manages the execution of one or more calculations, possibly in parallel, with the QuantumESPRESSO package.

[3]:

run_dir = 'QeCalculator_test'

Perform (many) scf computations for silicon

We initialize the PwInput object from an existing input file. Then we define four inputs, each with an associated name, that differ in the value of the energy cutoff of the wave functions.

[4]:

energy_cutoffs = [40,50,60,70]

[5]:

from copy import deepcopy

inp = I.PwInput(file='IO_files/si_scf.in')
inp.set_kpoints(points = [6,6,6])

inputs = []
names = []

for e in energy_cutoffs:
    prefix = 'ecut_%s'%e
    inp.set_prefix(prefix)
    inp.set_energy_cutoff(e)
    # store an independent copy, since inp is modified at each iteration
    inputs.append(deepcopy(inp))
    names.append(prefix)

[6]:

names

[6]:

['ecut_40', 'ecut_50', 'ecut_60', 'ecut_70']
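The deepcopy call in the loop matters because the same inp object is modified at every iteration. The sketch below uses a plain nested dictionary as a stand-in for the PwInput object (an assumption made only for illustration, the real class is not used here) and compares appending the object itself with appending a deep copy.

```python
from copy import deepcopy

# Hypothetical stand-in for a PwInput object: a nested dictionary
template = {'control': {'prefix': None}, 'system': {'ecutwfc': None}}

shared, copied = [], []
for e in [40, 50, 60, 70]:
    template['control']['prefix'] = 'ecut_%s' % e
    template['system']['ecutwfc'] = e
    shared.append(template)            # every entry refers to the same object
    copied.append(deepcopy(template))  # every entry is an independent snapshot

shared_cutoffs = [inp['system']['ecutwfc'] for inp in shared]
copied_cutoffs = [inp['system']['ecutwfc'] for inp in copied]
print(shared_cutoffs)  # [70, 70, 70, 70]
print(copied_cutoffs)  # [40, 50, 60, 70]
```

Without the deep copy all the entries of the list would describe the last cutoff only.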

Note that we use the prefix of each input object as the name of the corresponding run. In this way the input, log and xml files associated with the run have the same base name as the prefix.save folder built by QuantumESPRESSO.
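As an illustration of this naming convention, the following sketch lists the paths that appear in the outputs of this tutorial for a given run. The helper expected_files is hypothetical and is not part of the mppi package.

```python
def expected_files(run_dir, name):
    """Paths associated with a run called `name` inside `run_dir`.

    Hypothetical helper: the layout is inferred from the log messages
    and results shown in this tutorial, not taken from the mppi source.
    """
    return {
        'input': '%s/%s.in' % (run_dir, name),
        'log': '%s/%s.log' % (run_dir, name),
        'xml': '%s/%s.xml' % (run_dir, name),
        'save': '%s/%s.save' % (run_dir, name),
        'data': '%s/%s.save/data-file-schema.xml' % (run_dir, name),
    }

files = expected_files('QeCalculator_test', 'ecut_40')
print(files['data'])  # QeCalculator_test/ecut_40.save/data-file-schema.xml
```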

Now we define an instance of QeCalculator. For this example we use the direct scheduler, so the computations run in parallel through the Python multiprocessing module.

[7]:

C.QeCalculator?

Init signature: C.QeCalculator(omp=1, mpi=2, mpi_run='mpirun -np', executable='pw.x', scheduler='direct', multiTask=True, skip=True, verbose=True, IO_time=5, **kwargs)
Docstring:
Manage (multiple) QuantumESPRESSO calculations performed in parallel. Computations
are managed by a scheduler that, in the actual implementation of the class, can
be `direct` or `slurm`.

Parameters:
   omp (:py:class:`int`) : value of the OMP_NUM_THREADS variable
   mpi (:py:class:`int`) : number of mpi processes
   mpi_run (:py:class:`string`) : command for the execution of mpirun, e.g. 'mpirun -np' or 'mpiexec -np'
   executable (:py:class:`string`) : set the executable (pw.x, ph.x, ..) of the QuantumESPRESSO package
   scheduler (:py:class:`string`) : choose the scheduler used to submit the job, actually the choices implemented are
        'direct' that runs the computation using the python multiprocessing package and 'slurm' that creates a slurm script
   multiTask  (:py:class:`bool`) : if true a single run_script is built and all the computations are performed in parallel,
        otherwise an independent script is built for each elements of inputs and the computations are performed sequentially
   skip (:py:class:`bool`) : if True evaluate if one (or many) computations can be skipped.
       This is done by checking if the file $name.xml is present in the prefix folder,
       for each name in names
   verbose (:py:class:`bool`) : set the amount of information provided on terminal
   IO_time (int) : time step (in second) used by the wait method to check that the job is completed
   kwargs : other parameters that are stored in the _global_options dictionary. For instance the variable
       sbatch_options = [option1,option2,....] allows the user to include further options in the slurm script

Example:
 >>> code = calculator(omp=1,mpi=4,mpi_run='mpirun -np',skip=True,verbose=True,scheduler='direct')
 >>> code.run(inputs = [...], run_dir = ...,names = [...], source_dir = ..., **kwargs)

 where the arguments of the run method are:

Args:
    run_dir (:py:class:`string`) : the folder in which the simulation is performed
    inputs (:py:class:`list`) : list with the instances of the :class:`PwInput` class
        that define the input objects
    names (:py:class:`list`) : list with the names associated to the input files,
        given in the same order of the inputs list.
        Usually you can set the name equal to the prefix of the input object so
        the name of the input file and the prefix folder built by QuantumESPRESSO
        are equal
    source_dir (:py:class:`string`) : location of the scf source folder for a nscf computation.
        If present the class copies this folder in the run_dir with the name $prefix.save
    kwargs : other parameters that are stored in the run_options dictionary
File:           ~/Applications/MPPI/mppi/Calculators/QeCalculator.py
Type:           type

[8]:

code = C.QeCalculator(mpi=2)
code.global_options()

Initialize a parallel QuantumESPRESSO calculator with scheduler direct

[8]:

{'omp': 1,
 'mpi': 2,
 'mpi_run': 'mpirun -np',
 'executable': 'pw.x',
 'scheduler': 'direct',
 'multiTask': True,
 'skip': True,
 'verbose': True,
 'IO_time': 5}

We run the computation(s) by passing the list of input objects and the associated names to the run method of the calculator.

[11]:

results = code.run(run_dir=run_dir,inputs=inputs,names=names,other_variable = 1,skip=False)
results

delete log file: QeCalculator_test/ecut_40.log
delete xml file: QeCalculator_test/ecut_40.xml
delete folder: QeCalculator_test/ecut_40.save
delete log file: QeCalculator_test/ecut_50.log
delete xml file: QeCalculator_test/ecut_50.xml
delete folder: QeCalculator_test/ecut_50.save
delete log file: QeCalculator_test/ecut_60.log
delete xml file: QeCalculator_test/ecut_60.xml
delete folder: QeCalculator_test/ecut_60.save
delete log file: QeCalculator_test/ecut_70.log
delete xml file: QeCalculator_test/ecut_70.xml
delete folder: QeCalculator_test/ecut_70.save
run 0 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log
run 1 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_50.in > ecut_50.log
run 2 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_60.in > ecut_60.log
run 3 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_70.in > ecut_70.log
run0_is_running: True run1_is_running: True run2_is_running: True run3_is_running: True
run0_is_running: True run1_is_running: True run2_is_running: True run3_is_running: True
run0_is_running: False run1_is_running: False run2_is_running: False run3_is_running: True
Job completed

[11]:

{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
  'QeCalculator_test/ecut_50.save/data-file-schema.xml',
  'QeCalculator_test/ecut_60.save/data-file-schema.xml',
  'QeCalculator_test/ecut_70.save/data-file-schema.xml']}
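The "run N command" lines in the log above show how the options of the calculator are combined into a shell command. A minimal sketch that reproduces those strings is given below. The function build_command is a hypothetical helper, not the actual method of the class, and the handling of OMP_NUM_THREADS is omitted because it does not appear in the log.

```python
def build_command(run_dir, name, mpi=2, mpi_run='mpirun -np', executable='pw.x'):
    """Compose the shell command for a single run (hypothetical helper)."""
    return 'cd %s; %s %s %s -inp %s.in > %s.log' % (
        run_dir, mpi_run, mpi, executable, name, name)

commands = [build_command('QeCalculator_test', n) for n in ['ecut_40', 'ecut_50']]
for c in commands:
    print(c)
```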

After the run, all the parameters passed to the calculator are stored in the run_options attribute.

[14]:

#code.run_options

We observe that, if the simulation does not crash, the run method returns a dictionary whose 'output' key contains the list of the data-file-schema.xml files (with their relative paths) for subsequent parsing. The elements of the list follow the order of the input objects in the inputs list.

Let us now see what happens if the simulation fails, for instance when we provide an empty input to code.

[15]:

inp2 = I.PwInput()

[16]:

prefix = 'si_scf_test2'
inp2.set_prefix(prefix)
inp2

[16]:

{'control': {'prefix': "'si_scf_test2'"},
 'system': {},
 'electrons': {},
 'ions': {},
 'cell': {},
 'atomic_species': {},
 'atomic_positions': {},
 'kpoints': {},
 'cell_parameters': {}}

[17]:

result2 = code.run(inputs = [inp2], run_dir = run_dir,names=[prefix])
result2

run 0 command: cd QeCalculator_test; mpirun -np 2 pw.x -inp si_scf_test2.in > si_scf_test2.log
run0_is_running: True
Job completed

[17]:

{'output': [None]}

In this case the corresponding element of the 'output' list is None.
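Since a failed run is reported as None, it can be convenient to separate the completed runs from the failed ones before parsing. The sketch below does this on a dictionary with the same structure as the one returned by run. The function split_results is a hypothetical helper written for this example.

```python
def split_results(names, results):
    """Separate completed runs from runs that produced no output file."""
    done, failed = {}, []
    for name, path in zip(names, results['output']):
        if path is None:
            failed.append(name)
        else:
            done[name] = path
    return done, failed

# Dictionary with the same structure as the output of the run method
results = {'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml', None]}
done, failed = split_results(['ecut_40', 'si_scf_test2'], results)
print(failed)  # ['si_scf_test2']
```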

Usage of the skip parameter

If we repeat a calculation that has already been performed and skip = True, the class skips it. For instance:

[15]:

results = code.run(run_dir=run_dir,inputs=inputs,names=names, skip = True)
results

Skip the computation for input ecut_40
Skip the computation for input ecut_50
Skip the computation for input ecut_60
Skip the computation for input ecut_70
Job completed

[15]:

{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
  'QeCalculator_test/ecut_50.save/data-file-schema.xml',
  'QeCalculator_test/ecut_60.save/data-file-schema.xml',
  'QeCalculator_test/ecut_70.save/data-file-schema.xml']}
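According to the docstring, a computation is skipped when the file $name.xml is already present. A minimal sketch of such a check, assuming that the file is looked for directly inside run_dir (as the "delete xml file" messages suggest), could read as follows. The function can_skip is hypothetical.

```python
import os
import tempfile

def can_skip(run_dir, name):
    """Return True if run_dir/name.xml exists (assumed skip criterion)."""
    return os.path.isfile(os.path.join(run_dir, name + '.xml'))

with tempfile.TemporaryDirectory() as run_dir:
    # Simulate a completed run by creating an empty ecut_40.xml
    open(os.path.join(run_dir, 'ecut_40.xml'), 'w').close()
    to_skip = [n for n in ['ecut_40', 'ecut_80'] if can_skip(run_dir, n)]
    to_run = [n for n in ['ecut_40', 'ecut_80'] if not can_skip(run_dir, n)]

print(to_skip, to_run)  # ['ecut_40'] ['ecut_80']
```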

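If we add one element to inputs and run again, only the new element is computed. In the output shown below ecut_80 is skipped as well, which indicates that its results were already present in run_dir from an earlier execution.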

[16]:

e = 80
prefix = 'ecut_%s'%e
inp.set_prefix(prefix)
inp.set_energy_cutoff(e)
inputs.append(deepcopy(inp))
names.append(prefix)

[17]:

results = code.run(run_dir=run_dir,inputs=inputs,names=names, skip = True)
results

Skip the computation for input ecut_40
Skip the computation for input ecut_50
Skip the computation for input ecut_60
Skip the computation for input ecut_70
Skip the computation for input ecut_80
Job completed

[17]:

{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
  'QeCalculator_test/ecut_50.save/data-file-schema.xml',
  'QeCalculator_test/ecut_60.save/data-file-schema.xml',
  'QeCalculator_test/ecut_70.save/data-file-schema.xml',
  'QeCalculator_test/ecut_80.save/data-file-schema.xml']}

Instead, if skip = False the class removes the existing log file, xml file and .save folder of each run from run_dir before performing the computation. For instance:

[18]:

results = code.run(run_dir=run_dir,inputs=inputs[0:1],names=names[0:1], skip = False)
results

delete log file: QeCalculator_test/ecut_40.log
delete xml file: QeCalculator_test/ecut_40.xml
delete folder: QeCalculator_test/ecut_40.save
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log
run0_is_running:True
Job completed

[18]:

{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml']}
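The three "delete" messages above correspond to the removal of the log file, the xml file and the .save folder of the run. The cleaning step can be sketched as follows, where clean_run is a hypothetical helper and not the actual implementation of the class.

```python
import os
import shutil
import tempfile

def clean_run(run_dir, name):
    """Delete name.log, name.xml and name.save from run_dir, if present."""
    removed = []
    for ext in ('.log', '.xml'):
        path = os.path.join(run_dir, name + ext)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(name + ext)
    save_dir = os.path.join(run_dir, name + '.save')
    if os.path.isdir(save_dir):
        shutil.rmtree(save_dir)
        removed.append(name + '.save')
    return removed

with tempfile.TemporaryDirectory() as run_dir:
    # Simulate the leftovers of a previous run
    open(os.path.join(run_dir, 'ecut_40.log'), 'w').close()
    os.mkdir(os.path.join(run_dir, 'ecut_40.save'))
    removed = clean_run(run_dir, 'ecut_40')
    remaining = os.listdir(run_dir)

print(removed)  # ['ecut_40.log', 'ecut_40.save']
```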

Usage of the multiTask feature

By default the calculator runs all the computations in parallel. However, if the option multiTask = False is used, the computations are performed in sequence.

[19]:

results = code.run(run_dir=run_dir,inputs=inputs[0:4],names=names[0:4], skip = False, multiTask = False)
results

delete log file: QeCalculator_test/ecut_40.log
delete xml file: QeCalculator_test/ecut_40.xml
delete folder: QeCalculator_test/ecut_40.save
delete log file: QeCalculator_test/ecut_50.log
delete xml file: QeCalculator_test/ecut_50.xml
delete folder: QeCalculator_test/ecut_50.save
delete log file: QeCalculator_test/ecut_60.log
delete xml file: QeCalculator_test/ecut_60.xml
delete folder: QeCalculator_test/ecut_60.save
delete log file: QeCalculator_test/ecut_70.log
delete xml file: QeCalculator_test/ecut_70.xml
delete folder: QeCalculator_test/ecut_70.save
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log
run0_is_running:True
Job completed
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_50.in > ecut_50.log
run0_is_running:True
Job completed
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_60.in > ecut_60.log
run0_is_running:True
Job completed
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_70.in > ecut_70.log
run0_is_running:True
Job completed

[19]:

{'output': ['QeCalculator_test/ecut_40.save/data-file-schema.xml',
  'QeCalculator_test/ecut_50.save/data-file-schema.xml',
  'QeCalculator_test/ecut_60.save/data-file-schema.xml',
  'QeCalculator_test/ecut_70.save/data-file-schema.xml']}
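The difference between the two modes can be illustrated with a short sketch. Note that it uses subprocess rather than the multiprocessing module adopted by the class, and that run_commands, multi_task and io_time are names chosen for this example only. The commands are harmless placeholders instead of real pw.x runs.

```python
import subprocess
import sys
import time

def run_commands(commands, multi_task=True, io_time=0.1):
    """Run shell commands in parallel (multi_task=True) or one after another."""
    if not multi_task:
        # Sequential mode: each command must finish before the next one starts
        return [subprocess.run(c, shell=True).returncode for c in commands]
    # Parallel mode: start every command, then poll until all have finished
    procs = [subprocess.Popen(c, shell=True) for c in commands]
    while any(p.poll() is None for p in procs):
        time.sleep(io_time)
    return [p.returncode for p in procs]

# Placeholder commands that simply start and stop the Python interpreter
commands = ['"%s" -c "pass"' % sys.executable] * 3
parallel_codes = run_commands(commands, multi_task=True)
sequential_codes = run_commands(commands, multi_task=False)
print(parallel_codes, sequential_codes)
```

The polling loop plays the role of the periodic is_running messages shown in the logs, with io_time corresponding to the IO_time parameter of the calculator.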

Test of the slurm scheduler

If the slurm scheduler is chosen, the calculator prepares the slurm script and submits it. The effects of the skip and multiTask parameters can be tested in the same way.

[ ]:

results = code.run(run_dir=run_dir,inputs=inputs,names=names, scheduler = 'slurm', skip = False, multiTask = True)
results

The slurm script is written in the run_dir. The execution of the run requires that the slurm scheduler is installed.
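The content of the script generated by the class is not shown in this tutorial. Purely as an illustration of what a batch script for several concurrent runs may look like, the sketch below builds one as a string. The function slurm_script, the job name and the layout of the script are all assumptions, and the script actually written by QeCalculator may differ.

```python
def slurm_script(commands, job_name='qe_run', omp=1, sbatch_options=()):
    """Build the text of a batch script (assumed layout, for illustration)."""
    lines = ['#!/bin/bash', '#SBATCH --job-name=%s' % job_name]
    # Further directives provided by the user, e.g. through sbatch_options
    lines += ['#SBATCH %s' % opt for opt in sbatch_options]
    lines.append('export OMP_NUM_THREADS=%s' % omp)
    # Start each run in a background subshell, then wait for all of them
    lines += ['(%s) &' % c for c in commands]
    lines.append('wait')
    return '\n'.join(lines) + '\n'

script = slurm_script(
    ['cd QeCalculator_test; mpirun -np 2 pw.x -inp ecut_40.in > ecut_40.log'],
    sbatch_options=['--time=00:30:00'])
print(script)
```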

Perform a nscf computation for silicon. Usage of the source_dir option

We show how to perform a pw nscf calculation using the results of the first scf run as input.

Note that a single source_dir is passed to each call of run, so only computations that start from the same scf folder can be run in parallel.

For instance, we consider two nscf computations.

[12]:

run_dir

[12]:

'QeCalculator_test'

[9]:

num_bands = [8,12]

[10]:

inputs = []
names = []

for n in num_bands:
    inp.set_nscf(n,force_symmorphic=True)
    prefix = 'bands_%s'%n
    inp.set_prefix(prefix)
    inp.set_energy_cutoff(40)
    inputs.append(deepcopy(inp))
    names.append(prefix)

[11]:

results = code.run(inputs=inputs,run_dir=run_dir,names=names,source_dir='QeCalculator_test/ecut_40.save')
results

The folder QeCalculator_test/bands_8.save already exsists. Source folder QeCalculator_test/ecut_40.save not copied
The folder QeCalculator_test/bands_12.save already exsists. Source folder QeCalculator_test/ecut_40.save not copied
Skip the run of bands_8
Skip the run of bands_12
Job completed

[11]:

{'output': ['QeCalculator_test/bands_8.save/data-file-schema.xml',
  'QeCalculator_test/bands_12.save/data-file-schema.xml']}

Instead, if skip = False the class deletes the existing output files before running the computation again.

[23]:

results = code.run(inputs=inputs,run_dir=run_dir,names=names,source_dir='QeCalculator_test/ecut_40.save',skip=False)
results

delete log file: QeCalculator_test/bands_8.log
delete xml file: QeCalculator_test/bands_8.xml
delete folder: QeCalculator_test/bands_8.save
delete log file: QeCalculator_test/bands_12.log
delete xml file: QeCalculator_test/bands_12.xml
delete folder: QeCalculator_test/bands_12.save
Copy source_dir QeCalculator_test/ecut_40.save in the QeCalculator_test/bands_8.save
Copy source_dir QeCalculator_test/ecut_40.save in the QeCalculator_test/bands_12.save
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp bands_8.in > bands_8.log
Executing command: cd QeCalculator_test; mpirun -np 2 pw.x -inp bands_12.in > bands_12.log
run0_is_running:True  run1_is_running:True
Job completed

[23]:

{'output': ['QeCalculator_test/bands_8.save/data-file-schema.xml',
  'QeCalculator_test/bands_12.save/data-file-schema.xml']}
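The handling of source_dir seen in the two logs above (the folder is copied to $prefix.save, unless the destination already exists) can be sketched as follows. The function copy_source is a hypothetical helper written for this example.

```python
import os
import shutil
import tempfile

def copy_source(source_dir, run_dir, prefix):
    """Copy source_dir to run_dir/prefix.save unless the target exists."""
    dest = os.path.join(run_dir, prefix + '.save')
    if os.path.isdir(dest):
        return False  # the existing folder is left untouched
    shutil.copytree(source_dir, dest)
    return True

with tempfile.TemporaryDirectory() as run_dir:
    # Simulate the .save folder produced by a previous scf run
    source = os.path.join(run_dir, 'ecut_40.save')
    os.mkdir(source)
    open(os.path.join(source, 'data-file-schema.xml'), 'w').close()
    first = copy_source(source, run_dir, 'bands_8')   # the folder is created
    second = copy_source(source, run_dir, 'bands_8')  # already there, no copy
    copied = os.path.isfile(
        os.path.join(run_dir, 'bands_8.save', 'data-file-schema.xml'))

print(first, second, copied)  # True False True
```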
