[1]:
# useful to autoreload the module without restarting the kernel
%load_ext autoreload
%autoreload 2
[2]:
from mppi import InputFiles as I, Calculators as C, Utilities as U
from mppi.Calculators import Tools
[3]:
omp = 1
mpi = 8
Tutorial for the YamboCalculator class
This tutorial describes the usage of the YamboCalculator class, which manages the run of (many) calculations in parallel with the Yambo package.
Perform a Hartree-Fock computation for silicon
Follow the tutorial for the QeCalculator to produce the .save folder needed to generate the Yambo SAVE folder.
The first step of a Yambo computation is to generate the SAVE folder from a QuantumESPRESSO computation and to initialize the Yambo run_dir folder.
The mppi.Calculators.Tools module provides a function that performs these tasks:
[4]:
input_dir = 'QeCalculator_test/outdir_nscf/bands_8.save'
[6]:
run_dir = 'YamboCalculator_test'
Tools.init_yambo_dir(yambo_dir=run_dir,input_dir=input_dir,overwrite_if_found=False)
SAVE folder YamboCalculator_test/SAVE already present. No operations performed.
Now the YamboInput class can create the yambo input object.
We consider an HF computation:
[7]:
exx = 3.0 # Hartree
inp = I.YamboInput(args='yambo -x -V rl',folder=run_dir)
inp.set_kRange(1,2) #restrict the analysis to the first two kpoints
inp['variables']['EXXRLvcs'] = [exx*1e3,'mHa']
name = 'hf_exx'+str(exx)
jobname = 'hf_job_exx'+str(exx)
inp
[7]:
{'args': 'yambo -x -V rl',
'folder': 'YamboCalculator_test',
'filename': 'yambo.in',
'arguments': ['HF_and_locXC'],
'variables': {'FFTGvecs': [2133.0, 'RL'],
'SE_Threads': [0.0, ''],
'EXXRLvcs': [3000.0, 'mHa'],
'VXCRLvcs': [13107.0, 'RL'],
'QPkrange': [[1, 2, 1, 8], '']}}
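The input object printed above reads like a nested dictionary in which each numerical variable is stored as a `[value, unit]` pair. The following minimal sketch reproduces the Hartree-to-mHa conversion and the naming convention of this cell with a plain Python dict. It does not use the actual YamboInput class, and `make_hf_variables` is a hypothetical helper introduced only for illustration.

```python
def make_hf_variables(exx_hartree):
    """Return a plain dict mimicking part of the 'variables' entry shown above.

    Hypothetical helper: the real YamboInput class fills in many more keys.
    """
    # The cutoff is given in Hartree and stored in mHa, together with its unit.
    return {'EXXRLvcs': [exx_hartree * 1e3, 'mHa']}


exx = 3.0  # Hartree
variables = make_hf_variables(exx)
name = 'hf_exx' + str(exx)         # input file name and o- output folder
jobname = 'hf_job_exx' + str(exx)  # folder for the ndb databases
print(variables, name, jobname)
```

Encoding the parameter value in `name` and `jobname` keeps runs with different cutoffs in separate folders, which is convenient for convergence studies.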
To run the computation we create an instance of YamboCalculator. Its user interface is almost identical to that of the QeCalculator.
The first step is to create an instance of the RunRules class, which contains the options of the calculator:
[8]:
rr = C.RunRules(mpi=mpi,omp_num_threads=omp)
rr
[8]:
{'scheduler': 'direct', 'mpi': 8, 'omp_num_threads': 1}
[9]:
C.YamboCalculator?
Init signature:
C.YamboCalculator(
runRules,
executable='yambo',
skip=True,
clean_restart=True,
dry_run=False,
wait_end_run=True,
activate_BeeOND=False,
verbose=True,
fatlog=False,
**kwargs,
)
Docstring:
Perform a Yambo calculation. The parameters used to define the parellelization
strategy are provided in the `runRules` object.
Parameters:
runRulues (:class:`RunRules`) : instance of the :class:`RunRules` class
executable (:py:class:`string`) : set the executable (yambo, ypp, yambo_rt, ...) of the Yambo package
skip (:py:class:`bool`) : if True evaluate if the computation can be skipped. This is done by checking that the
report file built by yambo exists and contains the string `game_over`, defined as a data member of this class
clean_restart (:py:class:`bool`) : if True delete the folder with the output files and the database before running the computation
dry_run (:py:class:`bool`) : with this option enabled the calculator setup the calculations and write the script
for submitting the job, but the computations are not run
wait_end_run (:py:class:`bool`) : with this option disabled the run method does not wait the end of the run.
This option may be useful for interacting with the code in particular in _asincronous_ computation managed
by the slurm scheduler
activate_BeeOND (:py:class:`bool`) : if True set I/O of the run in the BeeOND_dir created by the slurm scheduler.
The value of the ``BeeOND_dir`` is written as a data member of the class and can be modified if needed
verbose (:py:class:`bool`) : set the amount of information provided on terminal
fatlog (:py:class:`bool`) : if True set the `-fatlog` key to provide more information in the report file
kwargs : other parameters that are stored in the _global_options dictionary
Computations are performed in the folder specified by the ``run_dir`` parameter. The ``name`` parameter is
used as name of the yambo input and as the name of the folder where yambo writes the o- `output` files.
The ``jobname`` parameter is the name of the folder where yambo writes the .ndb databases. If this parameter
is not provided in the run method the assumption jobname=name is made by the calculator.
Example:
>>> rr = RunRules(scheduler='slurm',ntasks_per_node=4,memory='124GB')
>>> code = YamboCalculator(rr,executable='yambo',skip=True,verbose=True)
>>> code.run(input = ..., run_dir = ...,name = ...,jobname = ..., **kwargs)
When the run method is called the class runs the command:
cd run_dir ; `mpirun command` executable_name -F name.in -J jobname -C name - O out_dir
where the arguments of the run method are:
Args:
run_dir (:py:class:`string`) : the folder in which the simulation is performed
input (:py:class:`string`) : instance of the :class:`YamboInput` class
that define the input objects
name (:py:class:`string`) : string with the names associated to the input file (without extension).
This string is used also as the name of the folder in which results are written (argument of the -C option of yambo) as
well as a part of the name of the output files
jobname (:py:class:`list` or :py:class:`string`) : string (or list of strings) with the value(s) of the jobname folders
(argument of the -J option of yambo). The first element is the folder name, where yambo writes the database.
The other values (if provided) are the folders where yambo seeks for pre existing databases. All the elements of the
list are assumed to be located in the ``run_dir`` of the calculator. If this variable is not specified the value of
name is attributed to jobname
out_dir (:py:class:`string`) : position of the folder in which the $jobname folder is located. This parameter
is automatically set by the calculator the value of ``BeeOND_dir`` if the option `activate_BeeOND` is enabled.
Otherwise all the folders are written in the ``run_dir``
kwargs : other parameters that are stored in the run_options dictionary
File: ~/Applications/MPPI/mppi/Calculators/YamboCalculator.py
Type: type
Subclasses:
[10]:
code = C.YamboCalculator(rr)
code.global_options()
Initialize a Yambo calculator with scheduler direct
[10]:
{'scheduler': 'direct',
'mpi': 8,
'omp_num_threads': 1,
'executable': 'yambo',
'skip': True,
'clean_restart': True,
'dry_run': False,
'wait_end_run': True,
'activate_BeeOND': False,
'verbose': True,
'fatlog': False}
The structure of the folders in which yambo writes its results is governed by the name and jobname variables. It is also possible to provide only the name variable.
The effect of this choice can be seen in the command string executed by the calculator.
[11]:
result = code.run(run_dir=run_dir,input=inp,name=name,jobname=jobname)
result
run command: mpirun -np 8 yambo -F hf_exx3.0.in -J hf_job_exx3.0 -C hf_exx3.0
computation hf_exx3.0 is running...
computation hf_exx3.0 ended
Run performed in 13s
[11]:
{'output': {'hf': 'YamboCalculator_test/hf_exx3.0/o-hf_job_exx3.0.hf'},
'report': 'YamboCalculator_test/hf_exx3.0/r-hf_job_exx3.0_HF_and_locXC',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'HF_and_locXC': 'YamboCalculator_test/hf_job_exx3.0/ndb.HF_and_locXC'}
In this case yambo creates in the run_dir the hf_exx3.0 folder, which contains the o- output files, and the hf_job_exx3.0 folder, which contains the ndb databases. The result is a dictionary that contains the paths of the o- files, of the report and of the databases created by yambo.
If instead we provide only the name parameter, yambo writes all the files in the name folder:
[12]:
result = code.run(run_dir=run_dir,input=inp,name=name+'_only')
result
run command: mpirun -np 8 yambo -F hf_exx3.0_only.in -J hf_exx3.0_only -C hf_exx3.0_only
computation hf_exx3.0_only is running...
computation hf_exx3.0_only ended
Run performed in 13s
[12]:
{'output': {'hf': 'YamboCalculator_test/hf_exx3.0_only/o-hf_exx3.0_only.hf'},
'report': 'YamboCalculator_test/hf_exx3.0_only/r-hf_exx3.0_only_HF_and_locXC',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'HF_and_locXC': 'YamboCalculator_test/hf_exx3.0_only/ndb.HF_and_locXC'}
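The two run commands printed above differ only in the value passed to `-J`. The sketch below rebuilds those command strings from name and jobname, using the fallback jobname=name described in the docstring. `build_command` is a hypothetical function written for this illustration, not the calculator's internal code, and it handles only a single string jobname (not the list form).

```python
def build_command(name, jobname=None, mpi=8, executable='yambo'):
    """Rebuild the command string printed by the calculator (illustrative only)."""
    if jobname is None:
        # Same convention as the calculator: a missing jobname defaults to name.
        jobname = name
    return 'mpirun -np %d %s -F %s.in -J %s -C %s' % (
        mpi, executable, name, jobname, name)


with_jobname = build_command('hf_exx3.0', jobname='hf_job_exx3.0')
name_only = build_command('hf_exx3.0_only')
print(with_jobname)
print(name_only)
```

With a separate jobname the databases can be kept apart from the o- files, while the name-only form collects everything in a single folder.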
Usage of the skip parameter
If we repeat a calculation that has already been performed and skip=True, the class skips the computation. For instance:
[13]:
result = code.run(run_dir=run_dir,input=inp,name=name,jobname=jobname,skip=True)
result
Skip the run of hf_exx3.0
[13]:
{'output': {'hf': 'YamboCalculator_test/hf_exx3.0/o-hf_job_exx3.0.hf'},
'report': 'YamboCalculator_test/hf_exx3.0/r-hf_job_exx3.0_HF_and_locXC',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'HF_and_locXC': 'YamboCalculator_test/hf_job_exx3.0/ndb.HF_and_locXC'}
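According to the docstring, the skip test checks that the report file exists and contains the `game_over` string stored in the class. A minimal sketch of such a check follows. The function `can_skip` is hypothetical, and the default marker `'Game Over'` is only a placeholder assumption, since the actual value of the class member is not shown in this tutorial.

```python
import os
import tempfile


def can_skip(report_path, game_over='Game Over'):
    """Return True if the report exists and contains the completion marker.

    The default marker is an assumed placeholder, not the value used by mppi.
    """
    if not os.path.isfile(report_path):
        return False
    with open(report_path) as report:
        return game_over in report.read()


# Demonstration on temporary files standing in for yambo reports.
with tempfile.TemporaryDirectory() as tmp:
    finished = os.path.join(tmp, 'r-finished')
    interrupted = os.path.join(tmp, 'r-interrupted')
    with open(finished, 'w') as f:
        f.write('some report content\nGame Over\n')
    with open(interrupted, 'w') as f:
        f.write('some report content\n')
    checks = {
        'finished': can_skip(finished),
        'interrupted': can_skip(interrupted),
        'missing': can_skip(os.path.join(tmp, 'r-missing')),
    }
print(checks)
```

Testing for a completion marker, rather than for the mere existence of the report, means that an interrupted run is not mistaken for a finished one.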
If instead skip is False, the folders with the results are erased and the computation is run again:
[14]:
result = code.run(run_dir=run_dir,input=inp,name=name+'_only',skip=False)
result
delete folder: YamboCalculator_test/hf_exx3.0_only
run command: mpirun -np 8 yambo -F hf_exx3.0_only.in -J hf_exx3.0_only -C hf_exx3.0_only
computation hf_exx3.0_only is running...
computation hf_exx3.0_only ended
Run performed in 13s
[14]:
{'output': {'hf': 'YamboCalculator_test/hf_exx3.0_only/o-hf_exx3.0_only.hf'},
'report': 'YamboCalculator_test/hf_exx3.0_only/r-hf_exx3.0_only_HF_and_locXC',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'HF_and_locXC': 'YamboCalculator_test/hf_exx3.0_only/ndb.HF_and_locXC'}
The cleaning of the results folder can be suppressed with the option clean_restart=False:
[15]:
result = code.run(run_dir=run_dir,input=inp,name=name+'_only',skip=False,clean_restart=False)
result
run performed starting from existing results
run command: mpirun -np 8 yambo -F hf_exx3.0_only.in -J hf_exx3.0_only -C hf_exx3.0_only
computation hf_exx3.0_only is running...
computation hf_exx3.0_only ended
[15]:
{'output': {'hf': 'YamboCalculator_test/hf_exx3.0_only/o-hf_exx3.0_only.hf_01'},
'report': 'YamboCalculator_test/hf_exx3.0_only/r-hf_exx3.0_only_HF_and_locXC_01',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'HF_and_locXC': 'YamboCalculator_test/hf_exx3.0_only/ndb.HF_and_locXC'}
In this case the output folder contains several replicas of the report and of the output files.
The class ensures that the files associated with the last run are used to build the results dictionary.
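In the output above the replicas carry a numeric suffix such as `_01`. One possible way to pick the file of the last run is to sort by that suffix, as in the sketch below. The helper names are hypothetical and this is not necessarily how the class selects the files. It also assumes that the base file name does not itself end with an underscore followed by digits.

```python
import re


def replica_index(path):
    """Return the numeric replica suffix of a file name (0 if there is none)."""
    match = re.search(r'_(\d{2,})$', path)
    return int(match.group(1)) if match else 0


def latest_replica(paths):
    """Pick the path with the highest replica suffix."""
    return max(paths, key=replica_index)


reports = [
    'YamboCalculator_test/hf_exx3.0_only/r-hf_exx3.0_only_HF_and_locXC',
    'YamboCalculator_test/hf_exx3.0_only/r-hf_exx3.0_only_HF_and_locXC_01',
]
latest = latest_replica(reports)
print(latest)
```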
Test of the slurm scheduler
If the slurm scheduler is chosen, the calculator prepares the slurm script and submits it.
In this case the mpi variable is not used and the calculator is configured through the ntasks_per_node, nodes and cpus_per_task variables, in addition to omp_num_threads.
[16]:
rr = C.RunRules(scheduler='slurm',ntasks_per_node=4,omp_num_threads=2)
rr
[16]:
{'scheduler': 'slurm',
'nodes': 1,
'ntasks_per_node': 4,
'cpus_per_task': 1,
'omp_num_threads': 2,
'gpus_per_node': None,
'memory': None,
'time': None,
'partition': None,
'account': None,
'qos': None,
'map_by': None,
'pe': 1,
'rank_by': None}
[17]:
code = C.YamboCalculator(rr)
code.global_options()
Initialize a Yambo calculator with scheduler slurm
[17]:
{'scheduler': 'slurm',
'nodes': 1,
'ntasks_per_node': 4,
'cpus_per_task': 1,
'omp_num_threads': 2,
'gpus_per_node': None,
'memory': None,
'time': None,
'partition': None,
'account': None,
'qos': None,
'map_by': None,
'pe': 1,
'rank_by': None,
'executable': 'yambo',
'skip': True,
'clean_restart': True,
'dry_run': False,
'wait_end_run': True,
'activate_BeeOND': False,
'verbose': True,
'fatlog': False}
[18]:
results = code.run(run_dir=run_dir,input=inp,name=name,jobname=jobname,dry_run=True,skip=False,clean_restart=False)
results
run performed starting from existing results
run command: mpirun -np 4 yambo -F hf_exx3.0.in -J hf_job_exx3.0 -C hf_exx3.0
Dry_run option active. Script not submitted
The wait_end_run is False or the dry_run option is active. The calculator proceedes to the postprocessing
Run performed in 13s
[18]:
{'output': {'hf': 'YamboCalculator_test/hf_exx3.0/o-hf_job_exx3.0.hf'},
'report': 'YamboCalculator_test/hf_exx3.0/r-hf_job_exx3.0_HF_and_locXC',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'HF_and_locXC': 'YamboCalculator_test/hf_job_exx3.0/ndb.HF_and_locXC'}
The slurm script is written in the run_dir. Executing the run requires that the slurm scheduler is installed.
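To give an idea of what such a script may look like, the sketch below assembles a submission script from the RunRules values used in this section. The layout and the function `slurm_script` are assumptions made for illustration, and the script actually written by the calculator may differ. Only standard `sbatch` directives are used. The number of MPI tasks is taken as nodes times ntasks_per_node, which is consistent with the `mpirun -np 4` printed above.

```python
def slurm_script(run_command, nodes=1, ntasks_per_node=4,
                 cpus_per_task=1, omp_num_threads=2, job_name='yambo_job'):
    """Return the text of a minimal slurm submission script (illustrative only)."""
    lines = [
        '#!/bin/bash',
        '#SBATCH --job-name=%s' % job_name,
        '#SBATCH --nodes=%d' % nodes,
        '#SBATCH --ntasks-per-node=%d' % ntasks_per_node,
        '#SBATCH --cpus-per-task=%d' % cpus_per_task,
        '',
        'export OMP_NUM_THREADS=%d' % omp_num_threads,
        run_command,
    ]
    return '\n'.join(lines) + '\n'


nodes, ntasks_per_node = 1, 4
mpi_tasks = nodes * ntasks_per_node
command = ('mpirun -np %d yambo -F hf_exx3.0.in -J hf_job_exx3.0 -C hf_exx3.0'
           % mpi_tasks)
script = slurm_script(command, nodes=nodes, ntasks_per_node=ntasks_per_node)
print(script)
```

A script of this kind would be submitted with `sbatch`. With the dry_run option the calculator stops after writing its own script, so the file can be inspected before submission.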
Perform a GW computation for silicon
We use the YamboCalculator to perform a different yambo computation. In this way we check how the class manages the output files and the ndb databases in various cases.
[19]:
rr = C.RunRules(mpi=mpi,omp_num_threads=omp)
code = C.YamboCalculator(rr)
Initialize a Yambo calculator with scheduler direct
[20]:
inp = I.YamboInput(args='yambo -d -k hartee -g n -p p -V qp',folder=run_dir)
inp.set_kRange(1,2)
#inp
[21]:
result = code.run(input=inp,run_dir=run_dir,name='qp_test1')
result
run command: mpirun -np 8 yambo -F qp_test1.in -J qp_test1 -C qp_test1
computation qp_test1 is running...
computation qp_test1 ended
Run performed in 18s
[21]:
{'output': {'qp': 'YamboCalculator_test/qp_test1/o-qp_test1.qp'},
'report': 'YamboCalculator_test/qp_test1/r-qp_test1_HF_and_locXC_gw0_dyson_em1d_ppa_el_el_corr',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'QP': 'YamboCalculator_test/qp_test1/ndb.QP',
'HF_and_locXC': 'YamboCalculator_test/qp_test1/ndb.HF_and_locXC',
'dipoles': 'YamboCalculator_test/qp_test1/ndb.dipoles',
'pp': 'YamboCalculator_test/qp_test1/ndb.pp'}
Perform the same computation, but also specify a jobname:
[22]:
result = code.run(input = inp, run_dir = run_dir, name='qp_test2', jobname = 'qp_job_test2')
result
run command: mpirun -np 8 yambo -F qp_test2.in -J qp_job_test2 -C qp_test2
computation qp_test2 is running...
computation qp_test2 ended
Run performed in 17s
[22]:
{'output': {'qp': 'YamboCalculator_test/qp_test2/o-qp_job_test2.qp'},
'report': 'YamboCalculator_test/qp_test2/r-qp_job_test2_HF_and_locXC_gw0_dyson_em1d_ppa_el_el_corr',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'QP': 'YamboCalculator_test/qp_job_test2/ndb.QP',
'HF_and_locXC': 'YamboCalculator_test/qp_job_test2/ndb.HF_and_locXC',
'dipoles': 'YamboCalculator_test/qp_job_test2/ndb.dipoles',
'pp': 'YamboCalculator_test/qp_job_test2/ndb.pp'}
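The result dictionary mixes a nested `output` entry with flat entries for the report and the databases. A small helper can flatten it and list the files that are not found on disk, which is a quick sanity check before parsing. The functions below are hypothetical utilities written for this example and are not part of the package.

```python
import os


def iter_paths(result):
    """Yield (label, path) pairs from a result dictionary, flattening 'output'."""
    for key, value in result.items():
        if isinstance(value, dict):
            for sub_key, path in value.items():
                yield key + '.' + sub_key, path
        else:
            yield key, value


def missing_files(result):
    """Return the labels whose path does not point to an existing file."""
    return [label for label, path in iter_paths(result)
            if not os.path.isfile(path)]


# Part of the dictionary returned by the run above.
result = {
    'output': {'qp': 'YamboCalculator_test/qp_test2/o-qp_job_test2.qp'},
    'dft': 'YamboCalculator_test/SAVE/ns.db1',
    'QP': 'YamboCalculator_test/qp_job_test2/ndb.QP',
}
labels = [label for label, _ in iter_paths(result)]
print(labels)
print(missing_files(result))  # empty only if the run folders exist locally
```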
Test of the ExtendOut option in the input file
The ExtendOut option enables the writing of all the variables in the output files of Yambo. This feature has no effect on an HF computation, so we test it on a QP one:
[23]:
inp = I.YamboInput(args='yambo -d -k hartee -g n -p p -V qp',folder=run_dir)
inp.set_kRange(1,2)
inp.set_extendOut()
#inp
[24]:
result = code.run(input = inp, run_dir = run_dir,name='qp_test_ExtendOut')
result
run command: mpirun -np 8 yambo -F qp_test_ExtendOut.in -J qp_test_ExtendOut -C qp_test_ExtendOut
computation qp_test_ExtendOut is running...
computation qp_test_ExtendOut ended
Run performed in 16s
[24]:
{'output': {'qp': 'YamboCalculator_test/qp_test_ExtendOut/o-qp_test_ExtendOut.qp'},
'report': 'YamboCalculator_test/qp_test_ExtendOut/r-qp_test_ExtendOut_HF_and_locXC_gw0_dyson_em1d_ppa_el_el_corr',
'dft': 'YamboCalculator_test/SAVE/ns.db1',
'QP': 'YamboCalculator_test/qp_test_ExtendOut/ndb.QP',
'HF_and_locXC': 'YamboCalculator_test/qp_test_ExtendOut/ndb.HF_and_locXC',
'dipoles': 'YamboCalculator_test/qp_test_ExtendOut/ndb.dipoles',
'pp': 'YamboCalculator_test/qp_test_ExtendOut/ndb.pp'}
You can check that the o- files contain more information. This feature is managed by the YamboParser class of the package.
Perform a ypp computation
The YamboCalculator class can also manage ypp computations.
Let's see an example by performing a band calculation along a path:
[25]:
inp = I.YamboInput(args='ypp -s b',folder=run_dir,filename='ypp.in')
inp
[25]:
{'args': 'ypp -s b',
'folder': 'YamboCalculator_test',
'filename': 'ypp.in',
'arguments': [],
'variables': {'INTERP_Shell_Fac': [20.0, ''],
'INTERP_NofNN': [1.0, ''],
'OutputAlat': [0.0, ''],
'BANDS_steps': [10.0, ''],
'PROJECT_mode': 'none',
'INTERP_mode': 'NN',
'cooIn': 'rlu',
'cooOut': 'rlu',
'CIRCUIT_E_DB_path': 'none',
'BANDS_bands': [[1, 8], '']}}
We define a calculator for ypp. This calculation requires a single MPI process (see the yambo documentation for further information):
[26]:
rr['mpi']=1
code = C.YamboCalculator(rr,executable='ypp')
code.global_options()
Initialize a Yambo calculator with scheduler direct
[26]:
{'scheduler': 'direct',
'mpi': 1,
'omp_num_threads': 1,
'executable': 'ypp',
'skip': True,
'clean_restart': True,
'dry_run': False,
'wait_end_run': True,
'activate_BeeOND': False,
'verbose': True,
'fatlog': False}
Set the input parameters to perform the band computation along a path:
[27]:
# in alat
G = [0.,0.,0.]
X = [1.,0.,0.]
L = [0.5,0.5,0.5]
K = [1.0,0.5,0.]
path = [L,G,X,K,G]
band_range = [2,5]
bands_step = 30
[28]:
# scissor
# inp['variables']['GfnQP_E'] = [1.0,1.0,1.0]
# band structure
# Some methods that perform these operations can be added in the YamboInput class
inp['variables']['BANDS_steps'] = [bands_step,'']
inp['variables']['BANDS_bands'] = [band_range,'']
inp['variables']['BANDS_kpts'] = [path,'']
inp['variables']['cooIn'] = 'alat'
inp['variables']['cooOut'] = 'alat'
inp
[28]:
{'args': 'ypp -s b',
'folder': 'YamboCalculator_test',
'filename': 'ypp.in',
'arguments': [],
'variables': {'INTERP_Shell_Fac': [20.0, ''],
'INTERP_NofNN': [1.0, ''],
'OutputAlat': [0.0, ''],
'BANDS_steps': [30, ''],
'PROJECT_mode': 'none',
'INTERP_mode': 'NN',
'cooIn': 'alat',
'cooOut': 'alat',
'CIRCUIT_E_DB_path': 'none',
'BANDS_bands': [[2, 5], ''],
'BANDS_kpts': [[[0.5, 0.5, 0.5],
[0.0, 0.0, 0.0],
[1.0, 0.0, 0.0],
[1.0, 0.5, 0.0],
[0.0, 0.0, 0.0]],
'']}}
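When the interpolated bands are plotted, it is often useful to know where the high-symmetry points fall along the horizontal axis. The sketch below computes the cumulative length of the path L, G, X, K, G defined above, in the same alat units. It is a generic geometric helper and does not reproduce how ypp distributes the BANDS_steps points along the path. The name `cumulative_distances` is hypothetical.

```python
import math


def cumulative_distances(path):
    """Return the cumulative Euclidean length at each vertex of the path."""
    ticks = [0.0]
    for start, end in zip(path[:-1], path[1:]):
        segment = math.sqrt(sum((e - s) ** 2 for s, e in zip(start, end)))
        ticks.append(ticks[-1] + segment)
    return ticks


# High-symmetry points in alat units, as in the cell above.
G = [0., 0., 0.]
X = [1., 0., 0.]
L = [0.5, 0.5, 0.5]
K = [1.0, 0.5, 0.]
path = [L, G, X, K, G]

ticks = cumulative_distances(path)
for label, position in zip(['L', 'G', 'X', 'K', 'G'], ticks):
    print(label, round(position, 4))
```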
For this kind of computation too, we can use the skip and clean_restart options:
[29]:
result = code.run(run_dir=run_dir,input=inp,name='bands_test1',skip=False,clean_restart=False)
result
run performed starting from existing results
run command: mpirun -np 1 ypp -F bands_test1.in -J bands_test1 -C bands_test1
computation bands_test1 is running...
computation bands_test1 ended
[29]:
{'output': {'bands_interpolated': 'YamboCalculator_test/bands_test1/o-bands_test1.bands_interpolated'},
'report': 'YamboCalculator_test/bands_test1/r-bands_test1_electrons_bnds',
'dft': 'YamboCalculator_test/SAVE/ns.db1'}
In this case the report does not contain the time_profile string, so the simulation time is not provided.
The results can be parsed using the YamboParser class of this package.