Next: 8 Analysis, Properties and Up: 2 Detailed description of Previous: 6 Initialization Contents

Subsections

7 Programs for running an SCF cycle

In sections 7.1-7.9 we describe the main programs to run an SCF cycle as illustrated in figure 4.1.

1 LAPW0 (generates potential)

lapw0 computes the total potential $V_{tot}$ as the sum of the Coulomb and the exchange-correlation potential $V_{xc}$ using the total electron (spin) density as input. It generates the spherical part (l=0) as case.vsp and the non-spherical part as case.vns. For spin-polarized systems, the spin-densities case.clmup and case.clmdn lead to two pairs of potential files. These files are called: case.vspup, case.vnsup and case.vspdn, case.vnsdn.

The Coulomb potential is calculated by the multipolar Fourier expansion introduced by Weinert (81). Utilizing the spatial partitioning of the unit cell and the dual representation of the charge density [equ. 2.10], firstly the multipole moments inside the spheres are calculated (Q-sp). The Fourier series of the charge density in the interstitial also represent SOME density inside the spheres, but certainly NOT the correct density there. Nevertheless, the multipole moments of this artificial plane-wave density inside each sphere are also calculated (Q-pw). By subtracting Q-pw from Q-sp one obtains pseudo-multipole moments Q. Next a new plane-wave series is generated which has two properties, namely zero density in the interstitial region and a charge distribution inside the spheres that reproduces the pseudo-multipole moments Q. This series is added to the original interstitial Fourier series for the density to form a new series which has two desirable properties: it simultaneously represents the interstitial charge density AND it has the same multipole moments inside the spheres as the actual density. Using this Fourier series the interstitial Coulomb potential follows immediately by dividing the Fourier coefficients by (up to a constant).

Inside the spheres the Coulomb potential is obtained by a straightforward classical Green's function method for the solution of the boundary value problem.

The exchange-correlation potential is computed numerically on a grid. Inside the atomic spheres a least squares procedure is used to reproduce the potential using a lattice harmonics representation (the linear equations are solved with modified LINPACK routines). In the interstitial region a 3-dimensional fast Fourier transformation (FFT) is used.

The total potential is obtained by summation of the Coulomb and exchange-correlation potentials $V_{xc}$ .

In order to find the contribution from the plane wave representation to the Hamilton matrix elements we reanalyze the Fourier series in such a way that the new series represents a potential which is zero inside the spheres but keeps the original value in the interstitial region and this series is put into case.vns.

The contribution to the total energy which involves integrals of the form $\rho*V$ is calculated according to the formalism of Weinert et al (82).

The Hellmann-Feynman force contribution to the total force is also calculated (Yu et al 91).

Finally, the electric field gradient (EFG) is calculated in case you have an L=2 term in the density expansion. The EFG tensor is given in both, the ``local-rotation-matrix'' coordinate system, and then diagonalized. The resulting eigenvectors of this rotation are given by columns.

For surface calculations the total and electrostatic potential at z=0 and z=0.5 is calculated and can be used as energy-zero for the determination of the workfunction. (It is assumed that the middle of your vacuum region is either at z=0 or z=0.5).

1 Execution

The program lapw0 is executed by invoking the command:

lapw0 lapw0.def or x lapw0 [ -p -eece -grr]

2 Dimensioning parameters

The following parameters are used (they are collected in file param.inc, but usually need not to be changed:

NCOM number of lm components in charge density and potential representation; it must satisfy the following condition: NCOM+3 .gt. $\{$ [number of with ] + [2 * number of with ] $\}$

NRAD number of radial mesh points

LMAX2 highest L in the LM expansion of charge and potential

LMAX2X highest L for the gpoint-grid in the xcpot generation (may need large values for ``-eece'')

restrict_output for mpi-jobs, limits the number of case.output0xxx files to ``restrict_output''

3 Input

The input is very simple. It is generated automatically by init_lapw, and needs to be changed only if a different exchange-correlation potential should be used:

------------------ top of file: case.in0 --------------------
TOT   13             ! MULT/COUL/EXCH/POT /TOT ;  VXC-SWITCH
NR2V      IFFT  8    ! R2V  EECE/HYBR IFFT  LUSE
 30  30 108  4.00  1 ! min IFFT-parameters, enhancement factor, iprint
0 0.0     (#of FK in E-field expansion, EFELD (Ry)

------------------- bottom of file ---------------------------

Interpretive comments follow:

line 1:

free format
switch, indxc

switch
	TOT	total energy contributions and total potential calculated
	KXC	total energy contributions and total potential calculated. In addition the kinetic energy contribution as well as the XC-energy will be printed.
	POT	total potential is calculated, but not the total energy
	MULT	multipole moments calculated only
	COUL	Coulomb potential calculated only
	EXCH	exchange correlation potential calculated only
		NOTE: MULT, COUL, and EXCH are for testing only, whereas POT, saves some CPU time if total energy is not needed
indxc		index to specify type of exchange and correlation potential. Supported options (for more options see the SRC_lapw0/vxclm2.f subroutine) include:
	5	Perdew and Wang 92, reparameterization of Ceperly-Alder data, the recommended LDA option
	13	Generalized Gradient approximation (PBE, Perdew-Burke-Ernzerhof 96)
	11	Generalized Gradient approximation (Wu-Cohen 2006, Tran et al. 2007)
	19	Generalized Gradient approximation (PBEsol, Perdew 2008)
	20	Generalized Gradient approximation (AM05, Mattsson 2008)
	12	Meta-GGA PKZB (Perdew et al. 1999). In order to generate the requiered `case.vresp*` files, you need `case.inm_vresp` (`cp $WIENROOT/SRC_templates/case.inm_vresp case.inm_vresp` and run one scf cycle with PBE (indxc=13) after creation of `case.inm_vresp`. Only afterwards change indxc to 12. In addition you must use very large IFFT parameters, otherwise it might be numerically unstable.
	27	Meta-GGA TPSS (Tao et al. 2003). At present the ``best'' meta-GGA. (See also the option above.) Note: Only $E_{XC}$ is obtained that way, $V_{XC}$ of standard PBE is used.
	28	modified Becke-Johnson (mBJ-LDA) potential $V_{XC}$ (Tran and Blaha 2009). Uses a mBJ exchange potential + LDA correlation potential and yields gaps in very good agreement with experiment. The xc-energy $E_{XC}$ is from LDA. For detailed usage see chapter about mBJ calculations (4.5.8).
	50	calculates the average of over the unit cell. This is used for the mBJ potential mentioned above.

line 2:

free format (only blanks are allowed as separator)
RPRINT, H-mod, FFTopt, LUSE

RPRINT	NR2V	no additional output
	R2V	Exchange-correlation (`case.r2v`), Coulomb (`case.vcoul`) and total potentials (`case.vtotal`) are written as () to a file for plotting with `lapw5` (cp `case.vtotal` `case.clmval`; use ``VAL'' for normalization in `case.in5`)
H-mod	EECE	On-site Hartree-Fock (inside spheres) for selected electrons (see 4.5.7)
	HYBR	On-site Hybrid functionals (inside spheres) (see 4.5.7)
FFTopt	IFFT	optional keyword, which lets you define the IFFTx mesh and an enhancement factor in the next line (necessary for `runeece_lapw`)
LUSE		optional l-max value for the angular grid used in xcpot1. For standard LDA/GGA the recommended value is max L value of LM-list in case.in2 + 2; for EECE one should use a better, antialiased grid, thus a large negative LUSE-value is recommended (and set automatically by `runeece_lapw`)

line 3:

free format (must be omitted when IFFT is not specified above)
IFFTx, IFFTy, IFFTz, IFFTfactor, iprint

IFFTx,y,z		FFT-mesh parameters in x,y,z directions for the calculation of the XC-potential in the interstitial region. Usually set automatically in `init_lapw` (dstart). The ratio of the 3 numbers should be indirect proportional to the lattice parameters. (-1 -1 -1 determines these numbers automatically and takes only IFFTfactor into account)
IFFTfactor		Multiplicative factor to the IFFT grid specified above. It needs to be enlarged for highly accurate GGA or meta-GGA calculations as well as for systems with H atoms with small spheres.
iprint		optional print switch. iprint=0 will greatly reduce case.output0 (in particular for `lapw0_mpi`).

The following line is optional and can be omitted. It is used to introduce an electric field via a zig-zag potential (see J.Stahn et al. 2001):

line 4:

free format
IFIELD, EFIELD, WFIELD

	IFIELD	number of Fourier coefficients to model the zig-zag potential. Typically use IEFIELD=30; -999 lists available modes (form) of fields, and these modes can be specified by mode=IEFIELD/1000. (default: mode=0)
	EFIELD	value (amplitude) of the electric field. The electric field (in Ry/bohr) corresponds to EFIELD/c, where c is your c lattice parameter.
	WFIELD	optional value for lambda (see output of IEFIELD=-999).

2 ORB (Calculate orbital dependent potentials)

This program was contributed by:

$\framebox{ \parbox[c]{12cm}{ P.Nov\'ak \\ Inst. of Physics, Acad.Science, P... ...ilinglist. If necessary, we will communicate the problem to the authors.} } }$

orb calculates the orbital dependent potentials, i.e. potentials which are nonzero in the atomic spheres only and depend on the orbital state numbers . In the present version the potential is assumed to be independent of the radius vector and needs the density matrix calculated in lapwdm. Four different potentials are implemented in this package:

LDA+U. There are three variants of this method, two of them are discussed in Novák et al. 2001
1. LDA+U(SIC) - introduced by Anisimov et al. 1993, with an approximate correction for the self-interaction correction. This is probably best suited for strongly correlated systems and for a full potential method we recommend to use an ``effective'' $U_{eff} = U - J$ ; setting .
2. LDA+U(AMF) - introduced by Czyzyk and Sawatzky 1994 as 'Around the Mean Field' method. (In Novák et al. 2001 it is denoted as LDA+U(DFT)). This version is (probably) more suitable for metallic or less strongly correlated systems.
3. LDA+U(HMF) - in addition the Hubbard model in the mean field approximation, as introduced by Anisimov et al. 1991 is also implemented. Note, however, that it is to be used with the LDA (not LSDA) exchange-correlation potential in spin polarized calculations!
All variants are implemented in the rotationally invariant way (Liechtenstein et al. 1995). If LDA+U is used in an unrestricted, general way, it introduces an orbital field in the calculation (in analogy to the exchange field in spin-polarized calculations, but it interacts with the orbital, instead of spin momentum). The presence of such an orbital field may lower the symmetry. In particular the complex version of LAPW1 must be used. Care is needed when dealing with the LDA+U orbital field. It may be quite large, and without specifying its direction it may fluctuate, leading to oscillations of scf procedure or/and to false solutions. It is therefore necessary to use it in combination with the spin-orbit coupling, preferably running first LSDA+(s-o) and then slowly switching on the LDA+U orbital field. If the LDA+U orbital polarization is not needed, it is sufficient to run real version of LAPW1, which then automatically puts the orbital field equal to zero. For systems without the center of inversion, when LAPW1 must be complex, an extra averaging of the LDA+U potential is necessary.
Orbital polarization. The additional potential has the form (Brooks 1985, Eriksson et al. 1989):

$\begin{displaymath} V_{OP}= c_{OP}<L_z>l_z \end{displaymath}$ (11)

where $c_{OP}$ is the orbital polarization parameter, is projection of the orbital momentum on the magnetization direction and is single electron orbital momentum component parallel to $\vec{M}$ .
Exact exchange and Hybrid methods: see Tran et al. 2006 and 4.5.7
Interaction with the external magnetic field. In this case the additional potential has a simple form:

$\begin{displaymath} V_{Bext}= \mu_B \vec{B}_{ext} (\vec{l} + 2\vec{s}) . \end{displaymath}$ (12)

The interaction with the electronic spin is taken into account by shifting the spin up and spin down exchange correlation potentials in LAPW0 by the energy $+\mu_B B_{ext} -\mu_B B_{ext}$ , respectively. The interaction of $B_{ext}$ with spin could be as well calculated using the 'Fixed spin moment' method. For an interaction with the orbital momentum it is necessary to specify the atoms and angular momentum numbers for which this interaction will be considered. Caution is needed when considering interaction of the orbital momentum with $B_{ext}$ in metallic or metallic-like systems. For the analysis see the paper by Hirst 1997

In all cases the resulting potential for a given atom and orbital number is a Hermitian, x matrix. In general this matrix is complex, but in special cases it may be real.

For more information see also section 4.5.6.

1 Execution

The program orb is executed by invoking the command:

x orb [ -up/-dn/-du ] or orb up/dnorb.def

2 Dimensioning parameters

The following parameters are used (collected in file param.inc):

LABC highest l+1 value of orbital dependent potentials

NRAD number of radial mesh points

3 Input

Since this program can handle three different cases, examples and descriptions of case.inorb for all cases are given below:

1 Input for all potentials

line 1:

free format
nmod,natorb,ipr

	nmod	defines the type of potential 1...LDA+U, 2...OP, 3... $B_{ext}$
	natorb	number of atoms for which orbital potential $V_{orb}$ is calculated
	ipr	printing option, the larger ipr, the longer the output

line 2:

(A5,f8.2)
mixmod,amix

	mixmod	PRATT or BROYD (should not be changed, see MIXER for more information)
	amix	coefficient for the Pratt mixing of $V_{orb}$
		This option is now only used for testing. The mixing should be set to PRATT, 1.0

line 3:

free format
iatom(i),nlorb(i),(lorb(li,i),li=1,nlorb(i))

	iatom	index of atom in struct file
	nlorb	number of orbital moments for which Vorb shall be applied
	lorb	orbital numbers (repeated nlorb-times)

3rd line repeated natorb-times

2 Input for LDA+U (nmod=1)

line 4:

free format

nsic		defines 'double counting correction'
	nsic=0	'AMF method' (Czyzyk et al. 1994)
	nsic=1	'SIC method' (Anisimov et al. 1993, Liechtenstein et al. 1995)
	nsic=2	'HMF method' (Anisimov et al. 1991)

line 5:

free format

U(li,i), J(li,i)

Coulomb and exchange parameters, U and J, for LDA+U in Ry for atom type i and orbital number li. We recommend to use $U_{eff}$ only.

5th line repeated natorb-times, for each natorb repeated nlorb-times

Example of the input file for NiO (LDA+U included for two inequivalent Ni atoms that have indexes 1 and 2 in the structure file):

---------------- top of file: case.inorb --------------------
1  2  0                        nmod, natorb, ipr
PRATT,1.0                      mixmod, amix
1 1 2                          iatom nlorb, lorb
2 1 2                          iatom nlorb, lorb
1                              nsic (LDA+U(SIC) used)
0.52 0.0                       U J
0.52 0.0                       U J
---------------- bottom of file:  --------------------

3 Input for Orbital Polarization (nmod=2)

line 4:

(free format)

nmodop		defines mode of 'OP'
	1	average taken separately for spin up, spin down
	0	average is the sum for spin up and spin down

line 5:

(free format)

Ncalc(i)
	1	Orb.pol. parameters are calculated ab-initio
	0	Orb.pol. parameters are read from input

this line is repeated natorb-times

line 6:

(free format) (only if Ncalc=0, then repeated nlorb-times)

pop(li,i)

OP parameter in Ry

line 7:

(free format)
xms(1), xms(2), xms(3)

direction of magnetization expressed in terms of lattice vectors

Example of the input file for NiO (total used in (1), OP parameters calculated ab-initio, $\vec{M}$ along [001]):

---------------- top of file: case.inorb --------------------
2  2  0                        nmod, natorb, ipr
PRATT, 1.0                     mixmod, amix
1 1 2                          iatom nlorb, lorb
2 1 2                          iatom nlorb, lorb
0                              nmodop
1                              Ncalc
1                              Ncalc
0. 0. 1.                       direction of M in terms of lattice vectors
---------------- bottom of file --------------------

4 Input for interaction with $B_{ext}$ (nmod=3)

line 4:

(free format)

$B_{ext}$

external field in Tesla

line 5:

(free format)
xms(1), xms(2), xms(3)

direction of magnetization expressed in terms of lattice vectors

Example of the input file for NiO, ( $B_{ext}$ = 4 T, along [001]):

---------------- top of file: case.inorb --------------------
3  2  0                        nmod, natorb, ipr
PRATT, 1.0                     mixmod, amix
1 1 2                          iatom nlorb, lorb
2 1 2                          iatom nlorb, lorb
4.                             Bext in T
0. 0. 1.                       direction of Bext in terms of lattice vectors
---------------- bottom of file --------------------

3 LAPW1 (generates eigenvalues and eigenvectors)

lapw1 sets up the Hamiltonian and the overlap matrix (Koelling and Arbman 75) and finds by diagonalization eigenvalues and eigenvectors which are written to case.vector. Besides the standard LAPW basis set, also the APW+lo method (see Sjöstedt et al 2000, Madsen et al. 2001) is supported and the basis sets can be mixed for maximal efficiency. If the file case.vns exists (i.e. non-spherical terms in the potential), a full-potential calculation is performed.

For structures without inversion symmetry, where the hamilton and overlap matrix elements are complex numbers, the corresponding program version lapw1c must be used in connection with lapw2c.

Since usually the diagonalization is the most time consuming part of the calculations, two options exist here. In WIEN2k we include highly optimized modifications of LAPACK routines. We call all these routines ``full diagonalization'', but we also provide an option to do an ``iterative diagonalization'' using a new preconditioning of a block-Davidson method (see Singh 89 and Blaha et al. 09). The scheme uses an old eigenvector from the previous scf-iteration, and produces approximate (but usually still highly accurate) eigenvalues/vectors. The preconditioner (inverse of can be calculated at the first iterative step (which will therefore take longer than subsequent iterative steps), stored on disk (case.storeHinv) and reused in all subsequent scf-iterations (until the next ``full'' diagonalization or when it is recreated (x lapw1 -it -noHinv0)). Usually this is the fastest scheme, but storage of case.storeHinv can be large (and slow when you have a slow network) and when the Hamiltonian changes too much, performance may degrade. Alternatively, the preconditioner can be recalculated all the time (x lapw1 -it -noHinv). Depending on the ratio of matrix size to number of eigenvalues (cpu time increases linearly with the number of eigenvalues, but a sufficiently large number is necessary to ensure convergence) a significant speedup compared to ``full'' diagonalization (LAPACK) can be reached. Iterative diagonalization is activated with the -it switch in x lapw1 -it or run_lapw -it. Often the preconditioner is so efficient, that it does not need to be recalculated, even within a structural optimization and one can use min_lapw -j ``run_lapw -I -fc 1 -it''. In some cases it is preferable to use min_lapw -j ``run_lapw -I -fc 1 -it1'', which will recreate case.storeHinv, or do not store these files at all using min_lapw -j ``run_lapw -I -fc 1 -it -noHinv ''

Parallel execution (fine grain and on the k-point level) is also possible and is described in detail in Sec. 5.5.

The switch -nohns skips the calculation of the nonspherical matrix elements inside the sphere. This may be used to save computer time during the first scf cycles.

1 Execution

The program lapw1 is executed by invoking the command:

x lapw1 [-c -up|dn -it -noHinv|-noHinv0 -p -nohns -orb -band -nmat_only] or

lapw1 lapw1.def or lapw1c lapw1.def

In cases without inversion symmetry, the default input filename is case.in1c. For 2-window (not recommended) semi-core calculations the lapw1s.def file uses a case.in1s file and creates the files case.output1s and case.vectors. For the spin-polarized case lapw1 is called twice with uplapw1.def and dnlapw1.def. To all relevant files the keywords ``up`` or ``dn`` are appended (see the fcc Ni test case in the WIEN2k package).

2 Dimensioning parameters

The following parameters (collected in file param.inc_r or param.inc_c) are used:

LMAX highest l+1 in basis function inside sphere (consistent with input in case.in1)

LMMX number of LM terms in potential (should be at least NCOM-1)

LOMAX highest l for local orbital basis (consistent with input in case.in1)

NGAU number of Gaunt coefficients for the non-spherical contributions to the matrix elements

NMATMAX maximum size of H,S-matrix (defines size of program, should be chosen according to the memory of your hardware, see chapter 11.2.2!)

NRAD number of radial mesh points

NSLMAX highest l+1 in basis functions for non-muffin-tin matrix elements (consistent with input in case.in1).If set larger than 5, parameter MAXDIM (modules.F) and LOMAX=8, P(10,10) (gaunt2.f) must also be increased.

NSYM order of point group

NUME maximum number of energy eigenvalues per k-point

NVEC1 defines the largest triple of integers which define reciprocal

NVEC2 K-vectors when multiplied with the reciprocal Bravais matrix

NVEC3

NLOAT max number of LOs for one -quantum number

restrict_output for mpi-jobs, limits the number of case.output1_X_proc_XXX files to ``restrict_output''

3 Input

Below a sample input is shown for (rutile), one of the test cases provided in the WIEN2k package. The input file is written automatically by LSTART, but was modified to set APW only for Ti-3d and O-2p orbitals.

------------------ top of file: case.in1 --------------------
WFFIL  EF=0.5000   (WFPRI,WFFIL,SUPWF ; wave fct. print,file,suppress
  7.500   10    4  (R-mt*K-max; MAX l, max l for hns )
  0.30    5 0 (global energy parameter E(l), with 5 other choices,   LAPW)
 0   -3.00      0.020 CONT 0   ENERGY PARAMETER for s,               LAPW
 0    0.30      0.000 CONT 0   ENERGY PARAMETER for s-local orbital, LAPW-LO
 1   -1.90      0.020 CONT 0   ENERGY PARAMETER for p                LAPW
 1    0.30      0.000 CONT 0   ENERGY PARAMETER for p-local orbitals LAPW-LO
 2    0.20      0.020 CONT 1                                         APW   
  0.20    3 0 (global energy parameter E(l), with 1 other choice,    LAPW)
 0   -0.90      0.020 STOP 0                                         LAPW
 0    0.30      0.000 CONT 0                                         LAPW-LO
 1    0.30      0.000 CONT 1                                         APW
K-VECTORS FROM UNIT:4   -9.0       2.0    69   emin/emax/nband
1.d-15    0.0                    spro_limit for it.diag., lambda for it.diag
------------------- bottom of file ------------------------

Interpretive comments follow:

line 1:

free format
switch, EF

switch	WFFIL	standard option, writes wave functions to file `case.vector` (needed in `lapw2`)
	SUPWF	suppresses wave function calculation (faster for testing eigenvalues only)
	WFPRI	prints eigenvectors to `case.output1` and writes `case.vector` (produces long outputs!)
EF		optional input. If ``EF='' key is present, lapw1 reads EF and sets all default energy parameters (0.3) to ``EF-0.2'' Ry.

line 2:

free format
rkmax, lmax, lnsmax

rkmax		$R_{mt} * K_{max}$ determines matrix size (convergence), where Kmax is the plane wave cut-off, Rmt is the smallest of all atomic sphere radii. Usually this value should be between 5 and 9 (APW+lo) or 6 - 10. (LAPW-basis) ( $K_{max}^2$ would be the plane wave cut-off parameter in Ry used in pseudopotential calculations.) Note that d (f) wavefunctions converge slower than s and p. For systems including hydrogen with short bondlength and thus a very small $R_{mt}$ (e.g. 0.7 a.u.), RKmax = 3 might already be reasonable, but convergence must be checked for a new type of system.
		Note, that the actual matrix size is written on case.scf1. It is determined by whatever is smaller, the plane wave cut-off (specified with RKmax) or the maximum matrix dimension NMATMAX, (see previous section).
lmax		maximum l value for partial waves used inside atomic spheres (should be between 8 and 12)
lnsmax		maximum l value for partial waves used in the computation of non-muffin-tin matrix elements (lnsmax=4 is quite good)

line 3:

free format
Etrial, ndiff, Napw

Etrial		default energy used for all to obtain as regular solution of the radial Schrödinger equation [used in equ.2.4,2.7] (see figure 7.1).
ndiff		number of exceptions (specified in the next ndiff lines)
Napw		0 ... use LAPW basis, 1 ... use APW-basis for all ``global'' values of this atom. We recommend to use LAPW here.

line 4:

format(I2,2F10.5,A4)
l, El, de, switch, NAPWL

l		l of partial wave
El		for L=l
de		energy increment
		de=0: this E(l) overwrites the default energy (from line 3)
		de $\ne$ 0: a search for a resonance energy using this increment is done. The radial function up to the muffin-tin radius RMT varies with the energy. A typical case is schematically shown in Fig. 7.1.
		At the bottom of the energy bands u has a zero slope (bonding state), but it has a zero value (antibonding state) at the top of the bands. One can search up and down in energy starting with using the increment de to find where $u_l(R_{MT},E)$ changes sign in value to determine $E_{top}$ and in slope to specify $E_{bottom}$ . If both are found is taken as the arithmetic mean and replaces the trial energy. Otherwise keeps the specified value. For $E_{top}$ and $E_{bottom}$ bounds of +1 and -10 Ry are defined respectively, and if they are not found, they remain at the initial value set to -200.
switch		used only if de.ne.0
	CONT	calculation continues, even if either $E_{top}$ or $E_{bottom}$ are not found
	STOP	calculation stops if not both $E_{top}$ and $E_{bottom}$ are found (especially useful for semi-core states)
NAPWL		0 ... use LAPW basis, 1 ... use APW-basis for this value of this atom. We recommend to use APW+lo when the corresponding wavefunction is ``localized'' and thus difficult to converge with standard LAPW (like 3d functions) and/or when the respective atomic sphere is small compared to the other spheres in the unit cell.

**Figure 7.1:** Schematic dependence of DOS and on the energy
$\begin{figure}\begin{center} \leavevmode \rotatebox{0}{\epsfig{figure=figs/dos_e, width=12cm}} \end{center} \end{figure}$

:line 4

is repeated ndiff times (see line 3) for each exception. If the same l value is specified twice, local orbitals are added to the (L)APW basis. The first energy (

) is used for the usual LAPW's and the second energy (

) for the LOs, which are formed according to (see equ. 2.7): $u_{E_1} + \dot u_{E_1} + u_{E_2}$ .
Note: The default energy parameters (0.30) are replaced by an energy ``'' if the EF-switch was read before. Please read also the comments about run_lapw in section 5.1.3. In addition, you may want to change the automatically created input and add d- or f-local orbitals to reduce the linearization error (e.g. in late transition metals you could put $E_{3d}$ at 0.0 and 1.0 Ry) or s, p, d, and/or f-LOs at very high energy (e.g. 2.0 - 3.0 Ry) to better describe unoccupied states.

:lines 3 and 4

are repeated for each non equivalent atom

line 5:

format (20x,i1,2f10.1,i6)
unit-number, Emin, Emax nband

unit-number		file number from which the k-vectors in the irreducible wedge of the Brillouin zone are read. Default is 4, for which the corresponding information is contained in `case.klist` (generated by KGEN). Should not be changed.
EMIN, EMAX		energy window in which eigenvalues shall be searched (overrides setting in `case.klist`. A small window saves computer time, but it also limits the energy range for the DOS calculation of unoccupied states.
nband		number of eigenvalues calculated with iterative diagonalization. Set automatically to in `lstart`. Larger values will lead to more cpu-time. (Optional input)

line 6:

free format; optional input line, but necessary if k-vectors are read from unit 5
spro_limit, lambda_iter

spro_limit		limit for detection of linear dependency for iterative diagonalization (optional input), typical around 1.d-15)
lambda_iter		optional value for preconditioner of iterative diagonalization (see above). By default we use , but in some cases convergence can be improved by a small (around 1.0) positive or negative

line 7:

format (A10,4I10,3F5.2); (only when unit-number=5, not recommended, use unit 4 and case.klist)
name, ix,iy,iz, idv, weight

name		name of k-vector (optional)
		: the last line must be END !!
ix,iy,iz, idv		defines the k-vector, where x= ix/idv etc. We use carthesian coordinates in units of $2\pi/a$ , $2\pi/b$ , $2\pi/c$ for P,C,F and B cubic, tetragonal and orthorhombic lattices, but internal coordinates for H and monoclinic/triclinic lattices
weight		of k-vector (order of group of k)

: line 7

is repeated for each k-vector in the IBZ. The utility program kgen (see section 6.5) provides a list of such vectors (on a tetrahedral mesh) in case.klist.

: the last line

must be END

4 LAPWSO (adds spin orbit coupling)

lapwso includes spin-orbit (SO) coupling in a second-variational procedure and computes eigenvalues and eigenvectors (stored in case.vectorso) using the scalar-relativistic wavefunctions from lapw1. For reference see Singh 94 and Novák 97. The SO coupling must be small, as it is diagonalized in the space of the scalar relativistic eigenstates. For large spin orbit effects it might be necessary to include many more eigenstates from lapw1 by increasing EMAX in case.in1 (up to 10 Ry!). We also provide an additional basisfunction, namely an LO with a $p_{1/2}$ radial wavefunction, which improves the basis and removes to a large degree the dependency of the results on EMAX and RMT (see Kuneš et al. 2001). SO is considered only within the atomic spheres and thus the results may depend to some extent on the choice of atomic spheres radii. The nonspherical potential is neglected when calculating $dV\over{dr}$ . Orbital dependend potentials (LDA+U, EECE or OP) can be added to the hamiltonian in a cheap and simple way.

In spin-polarized calculations the presence of spin-orbit coupling may reduce symmetry and even split equivalent atoms into non-equivalent ones. It is then necessary to consider a larger part of the Brillouin zone and the input for lapw2 should also be modified since the potential has lower symmetry than in the non-relativistic case. The following inputs may change:

case.struct
case.klist
case.kgen
case.in2c
case.in1

We recommend to use initso (see Sec.5.2.17) which helps you together with symmetso (see Sec.9.1) to setup spinorbit calculations.

Note: SO eigenvectors are complex and thus lapw2c must be used in a selfconsistent calculation.

1 Execution

The program lapwso is executed by invoking the command:

x lapwso [ -up -p -c -orb] or
lapwso lapwso.def

where here -up indicates a spin-polarized calculation (no ``-dn'' is needed, since spin-orbit will mix spin-up and dn states in one calculation).

2 Dimensioning parameters

The following parameters are used (collected in file param.inc):

FLMAX constant = 3

LMAX highest l of wave function inside sphere (consistent with lapw1)

LABC highest l of wave function inside sphere where SO is considered

LOMAX max l for local orbital basis

NRAD number of radial mesh points

NLOAT number of local orbitals

3 Input

A sample input for lapwso is given below. It will be generated automatically by initso

------------------ top of file: case.inso --------------------
WFFIL
 4  0  0                      llmax,ipr,kpot 
 -10.0000   1.5000           Emin, Emax
     0 0 1                       h,k,l (direction of magnetization)
   2                             number of atoms with RLO
 1    -3.5      0.005 STOP       atom-number, E-param for RLO
 3    -4.5      0.005 STOP       atom-number, E-param for RLO
 1   2                           number of atoms without SO, atomnumbers
------------------- bottom of file ------------------------

Interpretive comments on this file are as follows:

line 1:

format(A5)
switch

WFFIL

wavefunctions will also be calculated for scf-calculation. Otherwise only eigenvalues are calculated.

line 2:

free format
LLMAX, IPR, KPOT

LLMAX		maximum l for wavefunctions
IPR		print switch, larger numbers give additional output.
KPOT	0	V(dn) potential is used for $<dn\vert V\vert dn>$ elements, V(up) for $\mbox{$<up\vert V\vert up>$}$ and [V(dn)+V(up)]/2 for $<up\vert V\vert dn>$ .
	1	averaged potential used for all matrix elements.

line 3:

free format
Emin, Emax

Emin		minimum energy for which the output eigenvectors and eigenenergies will be printed (Ry)
Emax		maximum energy

line 4:

free format

h,k,l

vector describing the direction of magnetization. For R lattice use h,k,l in rhombohedral coordinates (not in hexagonal)

line 5:

free format

nlr

number of atoms for which a $p_{1/2}$ LO should be added

line 6:

free format
nlri, El, de, switch

nlri		atom-number for which RLO should be added
El		for L=l
de		energy increment (see `lapw1`)
switch		used only if de.ne.0
	CONT	calculation continues, even if either $E_{top}$ or $E_{bottom}$ are not found
	STOP	calculation stops if not both $E_{top}$ and $E_{bottom}$ are found (especially useful for semi-core states)

: line 6

must be repeated ``nlr'' times (or should be omitted if nlr=0).

line 7:

free format
noff, (iatoff(i),i=1,noff)

noff		number of atoms for which SO is switched off (for ``light'' elements, saves time)
iatoff		atom-numbers

5 LAPW2 (generates valence charge density expansions)

lapw2 uses the files case.energy and case.vector and computes the Fermi-energy (for a semiconductor is set to the valence band maximum) and the expansions of the electronic charge densities in a representation according to eqn. 2.10 for each occupied state and each k-vector; then the corresponding (partial) charges inside the atomic spheres are obtained by integration. In addition ``Pulay-corrections`` to the forces at the nuclei are calculated here. For systems without inversion symmetry you have to use the program lapw2c (in connection with lapw1c).

The partial charges for each state (energy eigenvalue) and each k-vector can be written to files case.help031, case.help032 etc., where the last digit gives the atomic index of inequivalent atoms (switch -help_files). Optionally these partial charges are also written to case.qtl (switch -qtl). For meta-GGA calculations energy densities are written to case.vrepval(switch -vresp).

In order to get partial charges for bandstructure plots, use -band, which sets the ``QTL option and uses ``ROOT'' in case.in2. Several other switches change the input file case.in2 temporarely and are described there.

1 Execution

The program lapw2 is executed by invoking the command:

x lapw2 [-c -up|dn -p -so -qtl -fermi -efg -band -eece -vresp -help_files -emin X -all X Y] or
lapw2 lapw2.def [proc#] or lapw2c lapw2.def [proc#]

where proc# is the i-th processor number in case of parallel execution (see Fig. 5.2). The -so switch sets -c automatically.

For complex calculations case.in2c is used. For a spin-polarized case see the fcc Ni test case in the WIEN2k package.

2 Dimensioning parameters

The following parameters are used (collected in file modules.F):

IBLOCK Blocking parameter (32-255) in l2main.F, optimize for best performance

LMAX2 highest l in wave function inside sphere (smaller than in lapw1, at present must be .le. 8)

LOMAX max l for local orbital basis

NCOM number of LM terms in density

NGAU max. number of Gaunt numbers

NRAD number of radial mesh points

restrict_output for mpi-jobs, limits the number of case.output2_X_proc_XXX files to ``restrict_output''

3 Input

A sample input for lapw2 is listed below, it is generated automatically by the programs lstart and symmetry.

------------------ top of file: case.in2 --------------------
TOT                         (TOT,FOR,QTL,EFG)
-1.2       32.000   0.5  0.05  (EMIN, # of electrons,ESEPERMIN, ESEPER0 )
TETRA       0.0      (EF-method (ROOT,TEMP,GAUSS,TETRA,ALL),value)
  0 0  2 0  2 2  4 0  4 2  4 4 
  0 0  1 0  2 0  2 2  3 0  3 2  4 0  4 2  4 4
14.0      (GMAX)
FILE      (NOFILE, optional)
------------------- bottom of file ------------------------

Interpretive comments on this file are as follows:

line 1:

format(2A5)
switch, EECE

switch	TOT	total valence charge density expansion inside and outside spheres
	FOR	same as TOT, but in addition a ``Pulay'' force contribution is calculated (this option costs extra computing time and thus should be performed only at the final scf cycles, see `run_lapw` script in sec. 5.1.3)
	QTL	partial charges only (generates file `case.qtl` for DOS calculations), set automatically by switch `-qtl`
	EFG	computes decomposition of electric field gradient (EFG), contributions from inside spheres (the total EFG is computed in `lapw0`), set automatically by switch `-efg`.
	ALM	this generates two files, `case.radwf` and `case.almblm`, where the radial wavefunctions and the coefficients of the wavefunction inside spheres are listed. The file case.almblm can get very big.
	CLM	CLM charge density coefficients only
	FERMI	Fermi energy only, this produces weight files for parallel execution and for the `optics` and `lapwdm` package, set automatically by switch `-fermi`.
	:	TOT and FOR are the standard options, QTL is used for density of states (or energy bandstructure) calculations, EFG for analysis, while FOURI, CLM are for testing only.
EECE		if set to ``EECE'', calculates the density for specified atoms and angular momentum only. Used for exact-exchange or hybrid-calculations, usually set automatically by `runsp_lapw -eece`

line 2:

free format
emin, ne, esepermin, eseper0

emin		lower energy cut-off for defining the range of occupied states, can be set termporarely to ``X'' by switch `-emin X` or `-all X Y`
ne		number of electrons (per unit cell) in that energy range
esepermin		LAPW2 tries to find the ``mean'' energies for each channel, for both the valence and the semicore states. To define ``valence'' and ``semicore'' it starts at (EF - ``esepermin'') and searches for a ``gap'' with a width of at least ``eseper0'' and defines this as separation energy of valence and semicore
eseper0		minimum gap width (see above). The values esepermin and eseper0 will only influence results if the option -in1new is used

line 3:

format(a5,f10.5)
efmod, eval

efmod		determines how E is determined
	ROOT	E is calculated and k space integration is done by root sampling (this can be used for insulators, but for metals poor convergence is found)
	TEMP	E is calculated where each eigenvalue is temperature broadened using a Fermi function with a broadening parameter of eval Ry. The total energy is corrected corresponding to T=0K. (e.g. eval=0.002 Ry gives good total energy convergence, but has no ``physical`` justification)
	TEMPS	E is calculated where each eigenvalue is temperature broadened using a Fermi function with a broadening parameter of eval Ry. The total energy is corrected by -TS corresponding to the temperature specified by eval (e.g. eval=0.002 Ry corresponds to about 40 C)
	GAUSS	E is calculated as above but a Gaussian smearing method is used with a width of eval Ry. (e.g. eval=0.002 gives good total energy convergence, but has no ``physical`` justification).
	TETRA	E is calculated and k space integration is done by the modified (if eval is .eq. 0) or standard (eval .ge. 100) tetrahedron-method (Blöchl 94). This ``standard'' scheme is recommended for `optic`. In this case the file `case.kgen`, consistent with the k-mesh used in `lapw1`, must be provided (see Sec. 7.3). This is the recommended option although convergence may be slower than with Gauss- or temperature-smearing.
	ALL	All states up to eval are used. This can be used to generate charge densities in a specified energy interval, can be set termporarely by switch `-all X Y`.
eval		when efmod is set to TEMP(S) (eval=0 will lead to room temperature broadening, 0.0018 Ry) or GAUSS, eval specifies the width of the broadening (in Ry), if efmod is set to ALL, eval specifies the upper limit of the energy window (in Ry; can be set termporarely by switch `-all X Y`), if efmod is set to TETRA, eval .ge. 100 specifies the use of the standard tetrahedron method instead of the modified one (see above). By default, TETRA will average over partially occupied degenerate states at EF with a degeneracy criterium D = 1.d-6. You can modify this by setting eval equal to your desired D (or 100+D).

optional line 3a:

free format (ONLY when EECE is set)

nat_rho

number of atoms for which a specific density should be calculated

optional line 3b:

free format (ONLY when EECE is set)
jatom_rho, l_rho

jatom_rho		index of atom for which a specific density should be calculated
l_rho		angular momentum l-value for which a specific density should be calculated

line 3b:

must be repeated nat_rho times

line 4:

format (121(I3,I2))

L,M LM values of lattice harmonics expansion (equ. 2.10), defined according to the point symmetry of the corresponding atom; generated in SYMMETRY, MUST be consistent with the local rotation matrix defined in case.struct (details can be found in Kara and Kurki-Suonio 81). CAUTION: additional LM terms which do not belong to the lattice harmonics will in general not vanish and thus such terms must be omitted. Automatic termination of the series occurs when a second 0,0 pair appears within the list. When you change the list during an SCF calculation the Broyden-Mixing is restarted in MIXER.

line 4:

must be repeated for each inequivalent atom

Table 7.41: LM combinations of ``Cubic groups'' (3 $\parallel$ (111)) direction, requires ``positive atomic index'' in case.struct. Terms that should be combined (Kara and Kurki-Suonio 81) must follow one another.

Symmetry	LM combinations
23	0 0, 4 0, 4 4, 6 0, 6 4,-3 2, 6 2, 6 6,-7 2,-7 6, 8 0, 8 4, 8 8,-9 2,-9 6,-9 4,-9 8,10 0, 10 4,10 8, 10 2, 10 6, 10 10
M3	0 0, 4 0, 4 4, 6 0, 6 4, 6 2, 6 6, 8 0, 8 4, 8 8,10 0, 10 4,10 8, 10 2, 10 6, 10 10
432	0 0, 4 0, 4 4, 6 0, 6 4, 8 0, 8 4, 8 8,-9 4,-9 8,10 0, 10 4,10 8
-43M	0 0, 4 0, 4 4, 6 0, 6 4,-3 2,-7 2,-7 6, 8 0, 8 4, 8 8,-9 2,-9 6,10 0, 10 4,10 8
M3M	0 0, 4 0, 4 4, 6 0, 6 4, 8 0, 8 4, 8 8,10 0, 10 4,10 8

Table 7.42: LM combination and local coordinate system of ``non-cubic groups'' (requires ``negative atomic index'' in case.struct)

Symmetry	Coordinate axes	Indices of $Y_{\pm LM}$	crystal system
1	any	ALL ( $\pm$ l,m)	triclinic
-1	any	( $\pm$ 2l,m)
2	2 $\parallel$ z	( $\pm$ l,2m)	monoclinic
M	m $\perp$ z	( $\pm$ l,l-2m)
2/M	2 $\parallel$ z, m $\perp$ z	( $\pm$ 2l,2m)
222	2 $\parallel$ z, 2 $\parallel$ y, (2 $\parallel$ x)	(+2l,2m), (-2l+1,2m)	orthorhombic
MM2	2 $\parallel$ z, m $\perp$ y, (2 $\perp$ x)	(+l,2m)
MMM	2 $\perp$ z, m $\perp$ y, 2 $\perp$ x	(+2l,2m)
4	4 $\parallel$ z	( $\pm$ l,4m)	tetragonal
-4	-4 $\parallel$ z	( $\pm$ 2l,4m), ( $\pm$ 2l+1,4m+2)
4/M	4 $\parallel$ z, m $\perp$ z	( $\pm$ 2l,4m)
422	4 $\parallel$ z, 2 $\parallel$ y, (2 $\parallel$ x)	(+2l,4m), (-2l+1,4m)
4MM	4 $\parallel$ z, m $\perp$ y, (2 $\perp$ x)	(+l,4m)
-42M	-4 $\parallel$ z, 2 $\parallel$ x (m=xy $\rightarrow$ yx)	(+2l,4m), (-2l+1,4m+2)
4MMM	4 $\parallel$ z, m $\perp$ z, m $\perp$ x	(+2l,4m)
3	3 $\parallel$ z	( $\pm$ l,3m)	rhombohedral
-3	-3 $\parallel$ z	( $\pm$ 2l,3m)
32	3 $\parallel$ z, 2 $\parallel$ y	(+2l,3m), (-2l+1,3m)
3M	3 $\parallel$ z, m $\perp$ y	(+l,3m)
-3M	-3 $\parallel$ z, m $\perp$ y	(+2l,3m)
6	6 $\parallel$ z	( $\pm$ l,6m)	hexagonal
-6	-6 $\parallel$ z	(+2l,6m), ( $\pm$ 2l+1,6m+3)
6/M	6 $\parallel$ z, m $\perp$ z	( $\pm$ 2l,6m)
622	6 $\parallel$ z, 2 $\parallel$ y, (2 $\parallel$ x)	(+2l,6m), (-2l+1,6m)
6MM	6 $\parallel$ z, m $\parallel$ y, (m $\perp$ x)	(+l,6m)
-62M	-6 $\parallel$ z, m $\perp$ y, (2 $\parallel$ x)	(+2l,6m), (+2l+1,6m+3)
6MMM	6 $\parallel$ z, m $\perp$ z, m $\perp$ y, (m $\perp$ x)	(+2l,6m)

line 5:

free format

GMAX max. G (magnitude of largest vector) in charge density Fourier expansion. For systems with short H bonds larger values (e.g. GMAX up to 25) could be necessary. Calculations using GGA (potential option 13 or 14 in case.in0) should also employ a larger GMAX value (e.g. 14), since the gradients are calculated numerically on a mesh determined by GMAX. When you change GMAX during an scf calculation the Broyden-Mixing is restarted in mixer.

line 6:

reclist FILE writes list of K-vectors into file case.recprlist or reuses this list if the file exists. The saved list will be recalculated whenever GMAX, or a lattice parameter has been changed.

NOFILE always calculate new list of K-vectors

6 SUMPARA (summation of files from parallel execution)

sumpara is a small program which (in parallel execution of WIEN2k) sums up the densities (case.clmval_*) and quantities from the case.scf2_* files of the different parallel runs.

1 Execution

The program sumpara is executed by invoking the 2 commands as follows:

x sumpara -d [-up/-dn/-du] and then
sumpara sumpara.def #_of_proc

where #_of_proc is the numbers of parallel processors used. It is usually called automatically from lapw2para or x lapw2 -p.

2 Dimensioning parameters

The following parameters are listend in file param.inc, but usually they need not to be modified:

NCOM number of LM terms in density

NRAD number of radial mesh points

NSYM order of point group

7 LAPWDM (calculate density matrix)

This program was contributed by:

$\framebox{ \parbox[c]{12cm}{ J.Kune\v{s} and P.Nov\'ak \\ Inst. of Physics,... ...ilinglist. If necessary, we will communicate the problem to the authors.} } }$

lapwdm calculates the density matrix needed for the orbital dependent potentials generated in orb. Optionally it also provides orbital moments, orbital and dipolar contributions to the hyperfine field (only for the specified atoms AND orbitals). It calculates the average value of the operator X which behaves in the same way as the spin-orbit coupling operator: it must be nonzero only within the atomic spheres and can be written as a product of two operators - radial and angular:

$X =X_r(r)*X_{ls}(\vec{l},\vec{s})$

and $X_{ls}(\vec{l},\vec{s})$ are determined by RINDEX and LSINDEX in the input as described below:

RINDEX=0 LSINDEX=0: the density matrix is calculated (this is needed for LDA+U calculations)
RINDEX=1 LSINDEX=1: X is number of electrons inside the atomic sphere (for test)
RINDEX=2 LSINDEX=1: X is the expectation value inside the atomic sphere
RINDEX=1 LSINDEX=2: X is the projection of the electronic spin inside the atomic sphere (must be multiplied by g=2 to get the spin moment)
RINDEX=1 LSINDEX=3: X is the projection of the orbital moment inside the atomic sphere (in case of SO-calculations WITHOUT LDA+U)
RINDEX=3 LSINDEX=3: X is the orbital part of the hyperfine field at the nucleus (for a converged calculation at the very end)
RINDEX=3 LSINDEX=5: X is the spin dipolar part of the hyperfine field at the nucleus (for a converged calculation at the very end)

To use the different operators, set the appropriate input. More information and extentions to operators of similar behavior may be obtained directly from P. Novák (2006). (RINDEX=3 includes now an approximation to the relativistic mass enhancement and LSINDEX=5 includes nondiagonal terms)

lapwdm needs the occupation numbers, which are calculated in lapw2. Note: You must not use ROOT in case.in2 for that purpose.

1 Execution

The program lapwdm is executed by invoking the command:

x lapwdm [ -up/dn -p -c -so ] or
lapwdm lapwdm.def

2 Dimensioning parameters

The following parameters are used (collected in file param.inc):

3 Input

A sample input for lapwdm is given below.

------------------ top of file: case.indm --------------------
-9.                      Emin cutoff energy
 1                       number of atoms for which density matrix is calculated
 1  1  2      index of 1st atom, number of L's, L1
 0 0          r-index, (l,s)-index
------------------- bottom of file ------------------------

Interpretive comments on this file are as follows:

line 1:

free format

emin

lower energy cutoff (usually set to very low number).

line 2:

free format

natom

number of atoms for which the density matrix is calculated

line 3:

free format
iatom, nl, l

	iatom	index of atom for which the density matrix should be calculated
	nl	number of l-values for which the density matrix should be calculated
	l	l-values for which the density matrix should be calculated

line 3 is repeated natom times

line 4:

free format, optional
RINDEX, LSINDEX

	RINDEX	0-3, as described in the introduction to `lapwdm`
	LSINDEX	0-5, as described in the introduction to `lapwdm`

8 LCORE (generates core states)

lcore is a modified version of the Desclaux (69, 75) relativistic LSDA atomic code. It computes the core states (relativistically including SO, or non-relativistically if ``NREL'' is set in case.struct) for the current spherical part of the potential (case.vsp). It yields core eigenvalues, the file case.clmcor with the corresponding core densities, and the core contribution to the atomic forces.

1 Execution

The program lcore is executed by invoking the command:

lcore lcore.def or x lcore [-up|-dn]

For the spin-polarized case see fcc Ni on the distribution tape. If case.incup and case.incdn are present for spin-polarized calculations, different core-occupation (``open core'' approximation for 4f states or spin-polarized core-holes) for both spins are possible.

2 Dimensioning parameters

The following parameter is listend in file param.inc:

NRAD number of radial mesh points

3 Input

Below is a sample input (written automatically by lstart)

for (rutile), one of the test cases provided with the WIEN2k

package.

In case of a "open core" calculation (eg. for 4f states) you may need "spin-polarized" case.inc files in order to define different configurations for spin-up and dn. Create two files case.incup and case.incdn with the corresponding occupations. The runsp_lapw script will automatically copy the corresponding files to case.inc.

------------------ top of file: case.inc --------------------
 4 0.0 0     # of orbitals,  shift of potential, print switch
 1,-1,2      n (principal quantum number), kappa, occup. number
 2,-1,2      2s
 2,-2,4      2p
 2, 1,2      2p*
 1   0.0          # of orbital of second atom
 1,-1,2      1s
 0           end switch
------------------- bottom of file ------------------------

Interpretive comments on this file are as follows:

line 1:

free format
nrorb, shift, iprint

nrorb		number of core orbitals
shift		shift of potential for ``positive'' eigenvalues (e.g. for 4f states as core states in lanthanides)
iprint		optional print switch to reduce (0) or increase (1) printing to `case.outputc`

line 2:

free format
n, kappa, occup

n		principle quantum number
kappa		relativistic quantum number (see Table 6.6)
occup		occupation number (including spin), fractial occupations supported

: line 2

is repeated for each orbital (nrorb times; see line 1)

: line 1 and 2

are repeated for each inequivalent atom. Atoms without core states (e.g. H or Li) must still include a 1s orbital, but with occupation zero.

line 3:

free format

zero indicating end of job

9 MIXER (adding and mixing of charge densities)

In mixer the electron densities of core, semi-core, and valence states are added to yield the total new (output) density (in some calculations only one or two types will exist). Proper normalization of the densities is checked and enforced (by adding a constant charge density in the interstitial). As it is well known, simply taking the new densities leads to instabilities in the iterative SCF process. Therefore it is necessary to stabilize the SCF cycle. In WIEN2k this is done by mixing the output density with the (old) input density to obtain the new density to be used in the next iteration. Several mixing schemes are implemented, but we mention only:

straight mixing as originally proposed by Pratt (52) with a mixing factor Q

$\begin{displaymath}\rho_{new}(r)=(1-Q)\rho_{old}(r)+Q\rho_{output}(r)\end{displaymath}$
a Multi-Secant mixing scheme contributed by L. Marks (see Marks and Luke 2008), in which all the expansion coefficients of the density from several preceding iterations (usually 6-10) are utilized to calculate an optimal mixing fraction for each coefficient in each iteration. It is very robust and stable (works nicely also for magnetic systems with 3d or 4f states at EF, only for ill-conditioned single-atom calculations you can break it) and usually converges at least 30 % faster than the old BROYD scheme.
Two new variants on the Multi-Secant method including a rank-one update (see Marks 2011) which appear to be 5-10% faster and equally robust.

At the outset of a new calculation (for any changed computational parameter such as k-mesh, matrix size, lattice constant etc.), any existing case.broydX files must be deleted (since the iterative history which they contain refers to a ``different`` incompatible calculation).

If the file case.clmsum_old can not be found by mixer, a ``PRATT-mixing`` with mixing factor (greed) 1.0 is done.

Note: a case.clmval file must always be present, since the LM values and the K-vectors are read from this file.

The total energy and the atomic forces are computed in mixer by reading the case.scf file and adding the various contributions computed in preceding steps of the last iteration. Therefore case.scf must not contain a certain ``iteration-number'' more than once and the number of iterations in the scf file must not be greater than 999.

For LDA+U calculations case.dmatup/dn and for hybrid-DFT (switch -eece) case.vorbup/dn files will be included in the mixing procedure.

With the new mode MSR1a (or MSECa) (contributed by L. Marks) atomic positions will also be mixed and thus optimized. This scheme can (unfortunately not in all cases) be a facter or 2-3 faster then the traditional optimization using min_lapw.

1 Execution

The program mixer is executed by invoking the command:

mixer mixer.def or x mixer [-eece]

A spin-polarized case will be detected automatically by x due to the presence of a case.clmvalup file. For an example see fccNi (sec. 10.2) in the WIEN2k package.

2 Dimensioning parameters

The following parameters are collected in file param.inc, :

NCOM number of LM terms in density

NRAD number of radial mesh points

NSYM order of point group

traptouch minimum acceptable distance between atoms in full optimization model

3 Input

Below a sample input (written automatically by lstart) is provided for (rutile), one of the test cases provided with the WIEN2k package.

------------------ top of file: case.inm --------------------
MSEC3 0.d0  YES  (PRATT/MSEC1/3/MSR1/a  bg charge (+1 for additional e), NORM
  0.2            MIXING GREED
1.0 1.0          Not used, retained for compatibility only
999 8            nbroyd nuse
------------------- bottom of file ------------------------

Interpretive comments on this file are as follows:

line 1:

(A5,*)
switch, bgch, norm

switch	MSEC1	Multi-Secant scheme (Marks and Luke 2008)
	MSEC2	similar to MSEC1 (above), but mixes the higher LM values inside spheres by an adaptive PRATT scheme. This leads to a significant reduction of programsize and filesize (`case.broyd*`) for unitcells with many atoms and low symmetry (factor 10-50) with only slighly worse mixing performance.
	MSEC3	Similar to MSEC1, but with updated scaling, regularization and other improvements.
	MSEC4	similar to MSEC3 (above), but mixes only the L=0 LM value
	MSR1	Recommended. A Rank-One Multisecant that is slightly faster than MSEC3 in most cases. For MSR1a see later.
	MSR2	similar to MSR1 (above), but mixes only the L=0 LM value
	MSR1a	Similar to MSR1, but in addition it optimizes the atomic positions simultaneously (see Sect. 5.3.2)
	PRATT	Pratt scheme with a fixed greed
	PRAT0	Pratt scheme with a greed restrained by previous improvement, similar to MSEC3

bgch

Background charge for charged cells (+1 for additional electron, -1 for core hole, if not neutralized by additional valence electron)

norm	YES	Charge densities are normalized to sum of Z
	NO	Charge densities are not normalized

line 2:

free format

greed

mixing greed Q. Essential for Pratt, rather less important for MSEC1. In the first iteration using Broyden's scheme: Q is automatically reduced by the program depending on the average charge distance :DIS andthe relative improvement in the last cycle. In case that the scf cycle fails due to large charge fluctuations, this can be further reduced but this can lead to stagnation. One should rarely reduce this below 0.05.)

line 3 (optional):

(free format)
f_pw, f_clm

f_pw

Not used, retained for input compatibility.

f_clm

Not used, retained for input compatibility.

line 4 (optional):

(free format)
nbroyd, nuse

nbroyd

Not used, retained for input compatibility.

nuse

For all Multisecant methods: Only nuse steps are used during modified broyden (this value has some influence on the optimal convergence. Usually 6-10 seems reasonable and 8 is recommended).

Next: 8 Analysis, Properties and Up: 2 Detailed description of Previous: 6 Initialization Contents

pblaha 2011-03-22

NCOM	number of lm components in charge density and potential representation; it must satisfy the following condition: NCOM+3 .gt. $\{$ [number of with ] + [2 * number of with ] $\}$
NRAD	number of radial mesh points
LMAX2	highest L in the LM expansion of charge and potential
LMAX2X	highest L for the gpoint-grid in the xcpot generation (may need large values for ``-eece'')
restrict_output	for mpi-jobs, limits the number of case.output0xxx files to ``restrict_output''

LABC	highest l+1 value of orbital dependent potentials
NRAD	number of radial mesh points

LMAX	highest l+1 in basis function inside sphere (consistent with input in case.in1)
LMMX	number of LM terms in potential (should be at least NCOM-1)
LOMAX	highest l for local orbital basis (consistent with input in case.in1)
NGAU	number of Gaunt coefficients for the non-spherical contributions to the matrix elements
NMATMAX	maximum size of H,S-matrix (defines size of program, should be chosen according to the memory of your hardware, see chapter 11.2.2!)
NRAD	number of radial mesh points
NSLMAX	highest l+1 in basis functions for non-muffin-tin matrix elements (consistent with input in case.in1).If set larger than 5, parameter MAXDIM (modules.F) and LOMAX=8, P(10,10) (gaunt2.f) must also be increased.
NSYM	order of point group
NUME	maximum number of energy eigenvalues per k-point
NVEC1	defines the largest triple of integers which define reciprocal
NVEC2	K-vectors when multiplied with the reciprocal Bravais matrix
NVEC3
NLOAT	max number of LOs for one -quantum number
restrict_output	for mpi-jobs, limits the number of case.output1_X_proc_XXX files to ``restrict_output''

FLMAX	constant = 3
LMAX	highest l of wave function inside sphere (consistent with `lapw1`)
LABC	highest l of wave function inside sphere where SO is considered
LOMAX	max l for local orbital basis
NRAD	number of radial mesh points
NLOAT	number of local orbitals

IBLOCK	Blocking parameter (32-255) in `l2main.F`, optimize for best performance
LMAX2	highest l in wave function inside sphere (smaller than in `lapw1`, at present must be .le. 8)
LOMAX	max l for local orbital basis
NCOM	number of LM terms in density
NGAU	max. number of Gaunt numbers
NRAD	number of radial mesh points
restrict_output	for mpi-jobs, limits the number of case.output2_X_proc_XXX files to ``restrict_output''

7 Programs for running an SCF cycle

1 LAPW0 (generates potential)

1 Execution

2 Dimensioning parameters

3 Input

2 ORB (Calculate orbital dependent potentials)

1 Execution

2 Dimensioning parameters

3 Input

1 Input for all potentials

2 Input for LDA+U (nmod=1)

3 Input for Orbital Polarization (nmod=2)

4 Input for interaction with (nmod=3)

3 LAPW1 (generates eigenvalues and eigenvectors)

1 Execution

2 Dimensioning parameters

3 Input

4 LAPWSO (adds spin orbit coupling)

1 Execution

2 Dimensioning parameters

3 Input

5 LAPW2 (generates valence charge density expansions)

1 Execution

2 Dimensioning parameters

3 Input

6 SUMPARA (summation of files from parallel execution)

1 Execution

2 Dimensioning parameters

7 LAPWDM (calculate density matrix)

1 Execution

2 Dimensioning parameters

3 Input

8 LCORE (generates core states)

1 Execution

2 Dimensioning parameters

3 Input

9 MIXER (adding and mixing of charge densities)

1 Execution

2 Dimensioning parameters

3 Input

4 Input for interaction with $B_{ext}$ (nmod=3)