\input genpctex: %\input /taupin/texinput/genpctex.tex
\input twlpoint:
\def\tenpointÅ\TwlpointÌ\def\eightpointÅ\TenpointÌ\tenpoint
\def\hpkplÅ$h^2+k^2+\ell^2$Ì
\def\lftÅ$<$\bgroup\itÌ\def\rgtÅ\egroup$>$Ì
\def\wparÅ\parÌ
%
\def\midinzertÅ\midinsertÌ
\def\topinzertÅ\vfil\ejectÌ%
\def\vec#1ÅÅ\bf #1ÌÌ
\def\pageinzertÅ\pageinsertÌ%
\def\endinzertÅ\endinsertÌ
%
\def\accol#1Å\Å#1\ÌÌ
\def\zbullÅÅ\eightsy\char'017ÌÌ%
\def\zdiamÅÅ\eightsy\char'005ÌÌ%
\def\zstarÅÅ\eightsy\char'003ÌÌ%
\def\zcircÅÅ\eightsy\char'016ÌÌ%
\def\zuarrÅÅ\eightsy\char'042ÌÌ%
\def\zdarrÅÅ\eightsy\char'043ÌÌ%
%\check
\newdimen\hpos\newdimen\vpos\newdimen\locunit
\def\plotbull#1#2Å\hccharÅ#1\hunitÌÅ#2\vunitÌÅ\zbullÌÌ%
\def\plotstar#1#2Å\hccharÅ#1\hunitÌÅ#2\vunitÌÅ\zstarÌÌ%
\def\plotdiam#1#2Å\hccharÅ#1\hunitÌÅ#2\vunitÌÅ\zdiamÌÌ%
\def\plotcirc#1#2Å\hccharÅ#1\hunitÌÅ#2\vunitÌÅ\zcircÌÌ%
%
\newdimen\bullwidth\newdimen\bullheight
\bullwidth=4pt\bullheight=4pt
\def\hcchar#1#2#3Å\rlapÅ\locunit=#2\relax\advance\locunit by -0.5\bullheight
\relax
\kern#1\kern-0.5\bullwidth\raise\locunit\hboxÅ#3ÌÌÌ
\def\hpoint#1#2#3Å\rlapÅ\kern#1\raise#2\hboxÅ#3ÌÌÌ
%\check
\def\figbotruleÅ\hruleÌ
%\check
\newdimen\hunit\newdimen\vunit\newdimen\rulelength
\newcount\figcount\newcount\figtemp
%
\def\hscalep#1Å\hpointÅ#1\hunitÌÅ-\vposÌÅ\vrule height \vposÅ\eightpoint\ #1ÌÌÌ%
\def\vscalep#1Å\hpointÅ-9mmÌÅ#1\vunitÌÅÅ\eightpoint
\hbox to 9mmÅ\hss#1 \ \vrule height \rthk width \hposÌÌÌÌ%
%
%\check
\def\hscalesÅ%
\vpos=8pt\relax
\hscalepÅ0Ì\hscalepÅ10Ì\hscalepÅ20Ì\hscalepÅ30Ì\hscalepÅ40Ì\hscalepÅ50Ì%
\hpointÅ55\hunitÌÅ-\vposÌÅ$N\,\to$Ì
Ì
%
\def\vscalesÅ%
\hpos=8pt\relax
\vscalepÅ0Ì\vscalepÅ50Ì\vscalepÅ100Ì\vscalepÅ150Ì\vscalepÅ200Ì\vscalepÅ250Ì%
\hpointÅ-8mmÌÅ260\vunitÌÅÅ$M$ÌÌ%
Ì
%
\def\figlegend#1Å\medskip\centerlineÅ\newfigÌ\smallskip\centerlineÅ\advance
\hsize by -2\parindent\vboxÅ\
\eightpoint\ \unskip #1\wparÌÌÌ
%
\def\newfigÅ\global\advance\figcount by 1\relax\figrefÌ
\def\forefig#1ÅÅ\figtemp=\figcount\advance\figtemp by #1\relax\fignum\figtempÌÌ
\def\nextfigÅ\forefig 1Ì%
\def\figrefÅ\fignum\figcount Ì%
%
\def\fignum#1ÅFigure \number#1Ì%
%
\def\rthkÅ0.4ptÌ
%
\def\plafond#1ÅÅ\vpos=\figtopline\vunit
\advance\vpos by -\bullheight\relax
\advance\hpos by -0.5\bullwidth\relax
\hpointÅ\hposÌÅ\vposÌÅ\vtopÅ\hboxÅ#1Ì\hboxÅ\zuarrÌÌÌÌÌ%
\def\plancher#1ÅÅ\advance\hpos by -0.5\bullwidth\relax
\hpointÅ\hposÌÅ\figbotline\vunitÌÅ\vboxÅ\hboxÅ\zdarrÌ\hboxÅ#1ÌÌÌÌÌ%
%
%\check
\def\boundplot#1Å%
\ifdim\vpos < \figbotline\vunit\plancherÅ#1Ì%
\else\ifdim\vpos > \figtopline\vunit \plafondÅ#1Ì%
\else \hccharÅ\hposÌÅ\vposÌÅ#1Ì%
\fi\fiÌ
%
%\check
%
\def\figplot#1#2#3#4#5#6#7#8Å\midinzertÅ\medskip\lineÅ\hunit=#1cm\relax
\vunit=#2cm\relax
% calcul de la position du zero
\def\figbotlineÅ#5Ì\def\figleftlineÅ#3Ì%
\def\figtoplineÅ#6Ì\def\figrightlineÅ#4Ì%
\rulelength=\hsize\relax \advance\rulelength by -#3\hunit\relax \advance
\rulelength by -#4\hunit\relax
\divide\rulelength by 2\relax\hskip\rulelength\relax
% lignes verticales
\rulelength=#6\vunit\relax\advance\rulelength by -#5\vunit
\hboxÅ\hpointÅ#3\hunitÌÅ#5\vunitÌÅ\vrule height \rulelength\relaxÌ%
\hpointÅ#4\hunitÌÅ#5\vunitÌÅ\vrule height \rulelength\relaxÌ%
% lignes horizontales
\rulelength=#4\hunit\relax\advance\rulelength by -#3\hunit
\hpointÅ#3\hunitÌÅ#5\vunitÌÅ\vrule height\rthk width \rulelength\relaxÌ%
\hpointÅ#3\hunitÌÅ#6\vunitÌÅ\vrule height\rthk width \rulelength\relaxÌ%
#7Ì\hssÌ\medskip\figlegendÅ#8Ì\medskip\figbotruleÌ\endinzertÌ
%
\font\gros=\fonthdg mbx10 scaled \magstep1
\tolerance=10000\relax
\parindent 0.7cm
\parskip=2\smallskipamount
\advance\medskipamount by 2pt
\vsize 22cm\hsize 14cm
\baselineskip 15pt
\lineskiplimit 1pt
\normallineskip 15pt
%
%
\def\expalÅexperimentalÌ
\def\vmaxÅV_Å\textÅ\eightpoint maxÌÌÌ
\def\mminÅM_Å\textÅ\eightpoint minÌÌÌ
\def\chirmÅ\chi_År\,\textÅ\eightpoint maxÌÌÌ
\def\chirmqÅ\chi^2_År\,\textÅ\eightpoint maxÌÌÌ
\def\eiÅ\epsilon_iÌ
\def\prtyÅprobabilityÌ
\def\indptÅindependentÌ
\def\prtiesÅprobabilitiesÌ
%
\def\ital#1ÅÅ\sl #1\/ÌÌ
\def\crycÅcrystallographicÌ
%
\ \vfil
\centerlineÅ\gros Enhancements in Powder Pattern IndexingÌ
\medskip
\vskip 2cm
\centerlineÅBy Daniel TAUPINÌ\centerlineÅLaboratoire de Physique des Solides,
associ\'e au C.N.R.S.Ì
\centerlineÅB\^atiment 510, Centre Universitaire, 91405 Orsay, FranceÌ
\vfil
\centerlineÅ\bf AbstractÌ
\medskip
The indexing strategy of our sixteen-year-old powder pattern indexing programme
has recently been revised.
Several enhancements lead to an increased reliability and much shorter
computing times in indexing difficult
patterns, especially monoclinic.
\vfil
\centerlineÅ\sl \todayÌ
\eject
\newcount\eqnum
\def\nweq#1Å\global\advance\eqnum by 1\relax\eqno(\number\eqnum#1)Ì
\def\ideq#1Å \eqno(\number\eqnum#1)Ì
\def\refeq#1Å(\number#1)Ì
%
\def\section#1Å\removelastskip\bigskip\penalty -4000\noindentÅ\bf #1Ì\medskip
\penalty +4000Ì%
\def\subsect#1Å\removelastskip\medskip\penalty -2000Å\bf #1Ì\medskip\penalty
+2000Ì%
\sectionÅI. Introduction.Ì
Half a dozen programmes at least are now available to assign indices to X-ray
powder patterns
of unknown lattice cell:
Werner (1964),
Taupin (1968, 1973), Visser (1969), Lou\"er \& Lou\"er (1972), Kohlbeck \&
H\"orl (1976),
Werner \italÅet al.Ì (1985), etc.
These programmes and others have been discussed by Shirley (1978) who gives
some other references.
%\check
\par
Since the problem of retrieving line indices in a powder pattern is not at all
trivial,
each of these programmes has a distinct strategy. Some of them take account of
special relationships
between line position (Visser, 1969) and quickly find the solution, when they
succeed.
Others prefer a systematic strategy, using dichotomy (Lou\"er \& Lou\"er, 1972)
or testing
a tentatively exhaustive set of index combinations (Taupin, 1973).
\par
Although generally successful, our own programme (Taupin, 1973) sometimes
proved defective: it could waste hours of computing time trying hopeless
index combinations
(e.g. index combinations which would necessarily lead to cell volumes beyond
the user's authorized limit).
\sectionÅII. Recalling the bases of our indexing strategy.Ì
Let $n$ be the number of unknown parameters defining the crystal cell ($n$=1,
2, 2, 3, 4, 6 for cubic, hexagonal,
tetragonal, orthorhombic, monoclinic and triclinic systems, respectively);
then all possible combinations of $n$ index triplets $\accolÅhk\ellÌ$ are
tentatively assigned to a given set
of $n$ lines of the \expal\ diagram (usually the $n$ lowest-angle lines) and,
for each combination,
the following linear system is solved:
%\check
%\bye
$$h_i^2A+k_i^2B+\ell_i^2C+2h_ik_iD+2k_i\ell_iE+2\ell_ih_iF=Q_i\quad;\quad
i=1,n\nweqÅÌ$$
\xdef\glosysÅ\refeq\eqnumÌ
where
$$Q_i=1/d^2_i\nweqÅÌ$$
$$A=\vecÅa^*Ì^Å2Ì\quad;\quad B=\vecÅb^*Ì^Å2Ì\quad;\quad C=\vecÅc^*Ì^Å2Ì\nweqÅÌ$$
$$D=\vecÅa^*Ì\cdot\vecÅb^*Ì\quad;\quad E=\vecÅb^*Ì\cdot\vecÅc^*Ì\quad;\quad
F=\vecÅc^*Ì\cdot\vecÅa^*Ì\nweqÅÌ$$
with additional constraints for all lattices but triclinic, e.g.: $A=B=C$,
$D=E=F=0$ for cubic, $A=B$, $D=E=F=0$ for
tetragonal, etc.\FootnoteÅAlthough common usage is now to call them $\beta$
and $b$,
we prefer to denote as $\gamma$ and $c$ the particular angle and the particular
axis in
the monoclinic lattices.Ì
\par For each set of $n$ triplets $\accolÅhk\ellÌ$ there is a solution, i.e. a
possible lattice, which
obviously indexes the $n$ lines (the \italÅbaseÌ lines) taken, but seldom all
the other lines of the given
diagram. If it does, one has found one of the possible solutions of the problem.
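
As an illustration of the base-line system above, the following minimal sketch (written in Python, not the Fortran 77 of the actual programme) solves the orthorhombic case, where $D=E=F=0$ and only $A$, $B$, $C$ remain. The function names \verb|solve_linear| and \verb|orthorhombic_cell| are hypothetical and chosen for this example only; exact fractions are used merely to keep the arithmetic transparent.

```python
from fractions import Fraction


def solve_linear(rows, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination.

    Returns the list of unknowns, or None when the rows are not independent.
    """
    n = len(rows)
    m = [[Fraction(x) for x in row] + [Fraction(q)] for row, q in zip(rows, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None  # dependent triplets: no unique lattice
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def orthorhombic_cell(triplets, q_values):
    """Return [A, B, C] from three tentative (h, k, l) labels and their Q values."""
    rows = [[h * h, k * k, l * l] for h, k, l in triplets]
    return solve_linear(rows, q_values)
```

With the labels 0~1~1, 1~0~1 and 1~1~0 the three unknowns follow directly; a dependent set such as 1~0~0, 2~0~0, 0~1~0 yields no solution, which corresponds to the case of a dropped equation discussed in section III.3.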
\par
If the problem were handled as crudely as described above, one would have to
perform a huge
number of trials, and even now -- in the late eighties -- this would lead to
unreasonable computing
times.
It is therefore necessary to introduce constraints and checks which cut short
useless, impossible or
redundant combinations as soon as possible.
\sectionÅIII. The combinatorial strategy and the constraints.Ì
Since the optimization of higher symmetry
(i.e. cubic, hexagonal and tetragonal) lattice seeking is not crucial owing to
the small number of
trials to be performed,
while these lattices require special handling of the indices $h$, $k$ and $\ell
$, we shall only discuss below
the strategy implemented in the case of orthorhombic, monoclinic and triclinic
lattices.
\parÅ\bf III.1 Varying $h$, $k$, $\ell$ for the base lines.Ì\par
Since it is definitely impossible to make the indices $\accolÅhk\ellÌ$ run from
$-\infty$ to $+\infty$, we have been
compelled to assign limits to the $\accolÅhk\ellÌ$ of the base lines.
Rather than assigning separate limits to $h$, $k$ and $\ell$
it is obviously wiser to limit the value of
\hpkpl\ and to make this arbitrary limit proportional to
the $Q$ of the Å\sl base\/Ì lines.
In practice, $h$, $k$ and $\ell$ do not vary separately for each base line, but
we prefer to use a pre-computed
table of triplets $hk\ell$ Å\sl ordered againstÌ \hpkpl, that is, ordered
against decreasing
likelihood. Thus, simple lattices with low indices are sought first, and
unlikely ones with large cell
volumes are examined later, if needed.
However, this way of limiting the indices, which we have used for years,
exhibits some drawbacks when
several crystal cell symmetries are tested; in fact, the number of lines
satisfying a condition of the type
$$ h^2+k^2+\ell^2\le K\nweqÅÌ$$
in a triclinic lattice is twice the number in a monoclinic, which in turn is
twice the number in an orthorhombic,
and so on.
Conversely, there is no obvious reason why the probability that the low index
lines have intensities
below the \expal\ threshold should be different in any of these systems.
Therefore, we think it is wiser not to state a limit for the actual value of
\hpkpl\ of the base lines,
but to state -- for example -- that the first line should be labelled with a
triplet $hk\ell$
chosen within the $N$ first ones (ordered by increasing \hpkpl).
\par
This latter number has a rather obvious probabilistic meaning: for a cell whose
parameters are of the
same order of magnitude, assigning the label of the first line to be within the
$N$ first ones
in the table means that we expect that the structural extinction rules or
damping effects
will not remove all the $N$ first possible labels from the \expal\ data.
Thus a value of $N$ ranging from 15 to 25 is quite reasonable, especially when
I, F,
A, B and C-centered lattices are handled separately
-- i.e. with separate $hk\ell$ tables -- as we do now.
Concerning the limit of the next base lines, it must be pointed out that, while
the \hpkpl\
have to be extrapolated proportionally to $Q$, the
value of $N$ (which is akin to a number of reciprocal unit cells in a given
ellipsoid)
should be extrapolated proportionally to $Q^Å3/2Ì$, in other words, to $1/d^3$.
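
The ordered table and the $Q^{3/2}$ extrapolation can be sketched as follows. This is an illustrative Python fragment under simplifying assumptions: only non-negative indices are enumerated (as would suit an orthorhombic P-lattice), ties in $h^2+k^2+\ell^2$ are broken arbitrarily, and the names \verb|triplet_table| and \verb|first_n_for_line| are hypothetical.

```python
def triplet_table(max_index):
    """List (h, k, l) triplets, excluding 0 0 0, by increasing h^2 + k^2 + l^2.

    Assumption: only non-negative indices up to max_index are enumerated.
    """
    triplets = [
        (h, k, l)
        for h in range(max_index + 1)
        for k in range(max_index + 1)
        for l in range(max_index + 1)
        if (h, k, l) != (0, 0, 0)
    ]
    return sorted(triplets, key=lambda t: (t[0] ** 2 + t[1] ** 2 + t[2] ** 2, t))


def first_n_for_line(n_first, q_first, q_line):
    """Scale the table limit N of the first line to a later base line.

    N counts reciprocal lattice points in a volume, so it grows as Q^(3/2).
    """
    return round(n_first * (q_line / q_first) ** 1.5)
```

For instance, with $N=20$ for the first line, a base line whose $Q$ is four times larger would be searched within the first 160 entries of the table.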
\parÅ\bf III.2 The constraints on the reciprocal cell parameters.Ì\par
%\check
In addition to the normal constraints resulting from the symmetry of the cell,
except for triclinic lattices,
the generality of the programme is not restricted by imposing the following
constraints:
$$A\ge B \ge C > 0\quad;\quad (orthorhombic)\nweqÅaÌ$$
$$A\ge B \ge 2D \ge 0\quad;\quad (monoclinic)\ideqÅbÌ$$
$$A\ge B\ge C > 2E\ge 0\quad;\quad B\ge 2D\ge 0\quad;\quad C\ge F\ge 0\quad
;\quad(triclinic)\ideqÅcÌ$$
\xdef\redconÅ\refeq\eqnumÌ
%\check
The positivity of all the unknowns $A$, $B$, $C$, $D$, $E$ and $F$
leads to a first simple elimination rule:
if a line $Q_i$ is tentatively indexed by $h_ik_i\ell_i$ and
a line $Q_j$ by $h_jk_j\ell_j$ and if $Q_j>Q_i$, then
it is impossible to have:
$$\bunch h_i^2\ge h_j^2\textÅ and Ì
k_i^2\ge k_j^2\textÅ and Ì
\ell_i^2\ge \ell_j^2\textÅ and Ì
\quad\quad\cr\quad\quad
h_ik_i\ge h_jk_j\textÅ and Ì
k_i\ell_i\ge k_j\ell_j\textÅ and Ì
\ell_ih_i\ge \ell_jh_j\quad.\quad\endbunch\nweqÅÌ$$
\xdef\elimhklÅ\refeq\eqnumÌ%
%\check
Therefore, any triplet which violates this condition with respect to the
indices assigned to
lines of smaller $Q$ must be immediately rejected.
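
The elimination rule can be expressed as a short test. The Python sketch below is only an illustration of the inequality stated above; the function names are hypothetical. The rule holds because, when all six unknowns are non-negative, a triplet whose six coefficients are all at least as large as those of another triplet cannot correspond to a smaller $Q$.

```python
def coefficients(triplet):
    """Return (h^2, k^2, l^2, hk, kl, lh) for one (h, k, l) triplet."""
    h, k, l = triplet
    return (h * h, k * k, l * l, h * k, k * l, l * h)


def violates_ordering(lower_triplet, higher_triplet):
    """True when the pair must be rejected.

    lower_triplet labels the line of smaller Q, higher_triplet the line of
    larger Q. If every coefficient of the lower line is >= the matching
    coefficient of the higher line, the smaller Q could not be the smaller one.
    """
    pairs = zip(coefficients(lower_triplet), coefficients(higher_triplet))
    return all(a >= b for a, b in pairs)
```

Thus labelling the lower line 2~0~0 and the higher line 1~0~0 is rejected at once, whereas 1~0~0 followed by 1~1~0 is retained for further checks.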
\par
Besides -- and this is an important novelty -- one does not need to have
assigned tentative
index triplets to all the first $n$ \expal\ lines before beginning to solve the
system \glosys.
Rather, it is advisable to begin eliminating unknowns between tentative
equations, as soon as
two lines have been assigned index triplets.
Then, due to the high frequency of zeroes in the indices of the low-angle
lines, the value of
one or several unknowns is often available once only 2 or 3 lines have been
tentatively labelled.
If this results -- for example -- in $B$ being greater than $A$ in a monoclinic
system,
then all trials using these tentative labels have to be rejected (owing to
redundancy elimination
constraints \redcon) and thousands of useless trials can thus be short cut.
\par
Another constraint is of major importance, namely the maximum volume of the
unit cell, i.e.
the minimum volume of the reciprocal unit cell.
A first value of this limit is normally given by the programme user, but it can
often be
dynamically restricted by information considerations (see Taupin, 1988),
especially when at least
one good lattice has been found.
%\check
Since the volume of the reciprocal unit cell is readily deduced from the
parameters $A$ to $F$ by
$$V^Å*2Ì=ABC+2DEF-CD^2-AE^2-BF^2\nweqÅÌ$$
\xdef\volumeqÅ\refeq\eqnumÌ%
%\check
it is sufficient to locate $A$ to $F$ within intervals to get an estimation of
the interval
within which $V^*$ lies, and reciprocally when taking account of the conditions
\redcon.
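
A minimal sketch of how the volume relation can be combined with intervals is given below, in Python. It is not the programme's actual procedure: the enclosure used here is a deliberately crude one, valid when all unknowns are non-negative, and the function names are hypothetical.

```python
import math


def reciprocal_volume_squared(A, B, C, D, E, F):
    """V*^2 = ABC + 2DEF - CD^2 - AE^2 - BF^2."""
    return A * B * C + 2 * D * E * F - C * D * D - A * E * E - B * F * F


def volume_squared_bounds(lo, hi):
    """Crude enclosure of V*^2 from bounds on (A, B, C, D, E, F).

    Assumption: all bounds are >= 0, so the positive part ABC + 2DEF and the
    negative part CD^2 + AE^2 + BF^2 both increase with every unknown.
    """
    def positive(A, B, C, D, E, F):
        return A * B * C + 2 * D * E * F

    def negative(A, B, C, D, E, F):
        return C * D * D + A * E * E + B * F * F

    return positive(*lo) - negative(*hi), positive(*hi) - negative(*lo)


def exceeds_max_volume(lo, hi, v_max):
    """True when the direct cell volume is necessarily larger than v_max."""
    _, upper = volume_squared_bounds(lo, hi)
    return upper < 1.0 / v_max ** 2
```

For an orthorhombic cell with $A=0.04$, $B=0.01$, $C=0.0025$ the formula gives $V^*=10^{-3}$, i.e. a direct volume of 1000, as expected for $a=5$, $b=10$, $c=20$.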
\par
Additional efficiency can be also attained by introducing reasonable
constraints on the values
of the cell parameters $a$, $b$ and $c$.
They can obviously be stated by the programme user
(practically, we ask for constraints on $1/a^*$, $1/b^*$ and $1/c^*$) but,
even if he does not want to give them, these upper and lower bounds may be
derived from the given set
of \expal\ lines with a good reliability.
Practically, one can take account of the fact that indices greater than 2 are
very unlikely for the
first \italÅrecordedÌ line in orthorhombic, monoclinic and triclinic\FootnoteÅ%
The lowest line of a triclinic lattice can always be assigned the label 0~0~1, but
not necessarily the
first recorded one: if 0~0~1 is missing (too weak or out of range for the
detector) and one assigns the
label 0~0~1 to the true 0~0~2 line, then it will be impossible to index the
0~0~3 line if it is present.Ì
lattices, and set the lower bound of $A$, $B$, $C$
to approximately $Q_1$ divided by the maximum value allowed for
$h_1^2+k_1^2+\ell_1^2$,
which can be deduced from $N$,
unless explicitly specified.
\par
%\check
Conversely, extra-flat tile-shaped cells are also very infrequent.
This can be expressed by setting the upper bounds of $A$, $B$, $C$ to some
value between $Q_Å10Ì$ and
$Q_Å20Ì$.
\par
%\check
Thus, most of the strategic combinatorial work has to deal with inequalities.
\parÅ\bf III.3 Tactical considerations.Ì\par
Each time a tentative triplet is assigned to a new \expal\ line (the $i$-th
line), the following operations are made:
\itemÅa)Ì if $i>1$ the tentative triplet is compared to those previously
assigned to the lower angle lines, to see whether
it should be rejected according to \elimhkl.
\itemÅb)Ì if this triplet is decided to be eligible at level $i$ (namely
with $i-1$ previous triplets already stored), a Å\sl
formal\/Ì equation is built and stored in memory in the form:
$$\alpha_i A+\beta_i B+\gamma_i C ... = p_ÅiiÌQ_i\nweqÅÌ$$
where only $\alpha_i$, $\beta_i$, $\gamma_i$ and $p_ÅiiÌ=1$ are numerically
computed.
\itemÅc)Ìthe $i-1$ previously stored equations are Å\sl formally\/Ì
substituted in the new $i$-th equation \refeq\eqnum\ which therefore takes a
more general form:
%\check
$$\alpha_i A+\beta_i B+\gamma_i C ... = p_Åi1ÌQ_1 + p_Åi2ÌQ_2 ... + p_ÅiiÌQ_i
\nweqÅÌ$$
\xdef\formeqÅ\refeq\eqnumÌ%
thus, the system of equations stored in memory is independent of the choice
and of the exact values
of the lines $Q_1$ to $Q_i$, and the same system can be used to estimate the
derivatives of $A$, $B$, $C$, etc.,
against the $Q$-values, i.e. to perform error computation.
\itemÅd)Ì if $\alpha_i$, $\beta_i$, $\gamma_i$, ... are all equal to zero --
this means that the selected
triplet was not \indpt\ of the previous ones -- the minimum and maximum value
of the right hand member
of this linear condition are computed, according to the allowed discrepancies
stated by the user for
these lines, namely $\epsilon_1$, $\epsilon_2$,
... $\epsilon_i$.
If these values do not enclose zero, the triplet is rejected and another one is
tentatively assigned at this level $i$.
If they do enclose zero, the triplet is considered as eligible for this $i$-th
\expal\ line, but the equation is
dropped since it was not \indpt\ of the previous ones; thus other triplets have
to be assigned at a higher
level to the next higher angle line (note that all this process is recursive).
This test is in fact of major importance: if it was not correctly done, the
programme would loop
indefinitely (or crash) in falling into the trap consisting in assigning -- for
example --
all the $\accolÅhk0Ì$ triplets to all the given lines of an orthorhombic
material, while the 3rd or the 4th one would
have been rejected by this test.
\itemÅe)Ì if at least one of the $\alpha_i$, $\beta_i$, $\gamma_i$, ... is not
zero,
then the maximum and the minimum possible values of the left hand member of
\formeq\ are computed,
using the present state of knowledge of the possible ranges of $A$, $B$, ...,
$F$ and they are compared
to the maximum and minimum possible values of the right hand member computed
as above.
In case of incompatibility, another triplet is tentatively assigned to the same
\expal\ line and
the same checking process is performed.
In case of compatibility, the resulting equation is kept but further
computations are done
to try to restrict the intervals where the unknowns are supposed to be located.
\itemÅf)Ì
if, at this stage, the number of equations kept is less than the required
number $n$, then
using maximum or minimum of the other unknowns depending upon the sign of
$\alpha_i$, $\beta_i$, $\gamma_i$, ...,
the formal relationship \formeq\ is used to cyclically update (i.e. restrict)
the interval of each unknown.
If some update happens, the new intervals are matched to equation \volumeq\ and
possibly updated once more.
Of course, updating intervals may result in contradictions which in turn lead
to the rejection
of the tentative triplet as above.
%\check
\itemÅg)Ì if the tentative triplet appears to be eligible, but the number of
equations kept
is less than the required number $n$, then the whole process is recursively
started (step~a)), one level higher ($i \to i+1$),
in order to assign tentative triplets to the next higher \expal\ line.
The number of levels of this recursive process is, however, limited in order to
avoid some
pathological cases where -- for example -- all the \expal\ lines could be
indexed by $hk0$ labels
yielded by some high volume cell. In practice, we limit the number of
\italÅdropped equationsÌ,
also called the \italÅdegenerescence indexÌ to a certain number $n_ÅdegÌ$, so
that the total
number of base lines considered simultaneously never exceeds $n+n_ÅdegÌ$.
Thus, if the tentative triplet would lead the number of base lines to exceed
this limit,
this triplet is merely rejected and a new one is tried at the same level
(step~a)).
\itemÅh)Ì conversely, if the number of \indpt\ equations is equal to $n$,
then a complete indexing work is done, which may or may not succeed in a
satisfactory lattice.
By ``satisfactory'' we mean two conditions:
\itemitemÅ---Ì all given lines have been indexed within their stated accuracy,
unless the user authorizes the
programme to ignore them by means of a special code in the data;
\itemitemÅ---Ì the computed Å\sl information merit\/Ì (see Taupin, 1988) is
higher than the threshold chosen
by the user and indicated in the input data.
\itemÅi)Ì if this succeeds, then the minimum \italÅinformation meritÌ is
updated to its best
overall value \italÅminus 10Ì (this means that any further lattice found whose
\prty\ is less than $2^Å-10Ì\approx 0.001$
times the \prty\ of the best one
will be rejected). In turn, this often leads to a restriction of
the
maximum direct cell volume, i.e. an increase of the minimum reciprocal cell
volume with all its consequences
in shrinking the allowed intervals for the unknowns.
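
The interval tests of steps d) and e) can be sketched as follows. This Python fragment is an illustration only, assuming that the allowed discrepancies $\epsilon_j$ are expressed on the $Q$ scale; the function names are hypothetical and the programme's own bookkeeping may differ.

```python
def rhs_range(p, q, eps):
    """Range of p_1 Q_1 + ... + p_i Q_i when each Q_j may deviate by eps_j."""
    centre = sum(pj * qj for pj, qj in zip(p, q))
    spread = sum(abs(pj) * ej for pj, ej in zip(p, eps))
    return centre - spread, centre + spread


def lhs_range(coeffs, lo, hi):
    """Range of alpha A + beta B + ... given an interval [lo, hi] per unknown."""
    low = sum(c * (l if c >= 0 else h) for c, l, h in zip(coeffs, lo, hi))
    high = sum(c * (h if c >= 0 else l) for c, l, h in zip(coeffs, lo, hi))
    return low, high


def ranges_overlap(first, second):
    """Step e): both members must be able to take a common value."""
    return first[0] <= second[1] and second[0] <= first[1]


def dependent_line_consistent(p, q, eps):
    """Step d): with all left hand coefficients zero, the right hand range must enclose zero."""
    low, high = rhs_range(p, q, eps)
    return low <= 0.0 <= high
```

As an example, labelling two lines 1~0~0 and 2~0~0 leaves the condition $Q_2-4Q_1=0$: with $Q_1=0.0400$, $Q_2=0.1601$ and discrepancies of 0.0002 the second label is eligible (and its equation dropped), whereas with $Q_2=0.17$ it is rejected.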
\sectionÅIV. Checking and application.Ì
The new version of the programme has been tested on a set of 112 randomly
generated synthetic diagrams,
with artificial erratic deviations supposed to simulate \expal\ imperfections.
These diagrams consisted of 16 specimens each of cubic, hexagonal and tetragonal
lattices, and 32 each of
orthorhombic and monoclinic lattices.
\par
All the 48 higher symmetry lattices but 3 were indexed at the first trial with
a parameter $N=15$,
which corresponds for instance to a first line having indices not higher than
3~0~0 in the P-tetragonal.
Two of the generated patterns exhibited abnormally high indices for the first
line, and they needed an increase
of $N$ (up to 40) to be correctly indexed\FootnoteÅWhether a
hexagonal pattern starting with the 2~0~2 and 1~1~3 lines is a realistic
situation is another question...Ì.
For another hexagonal lattice, indexing happened to be possible only by
increasing the
allowed discrepancies by 40~\%\FootnoteÅOne of the new features of our programme
allows the user to increase or reduce the maximum deviations by a common factor,
without having to
change them individually in the record corresponding to each given line.Ì.
\par
All the 32 orthorhombic lattices but 5 were indexed at the first trial with
$N=15$, which corresponds to the
assumption that the indices of the first line are not higher than 2~0~1.
Two of them were rather ill-conditioned (bad precision) and spent too much
computing time, so that
we had to restrict the range of the volume and parameters to achieve the
indexing in reasonable times.
Two others exhibited first lines with very high indices, so that the parameter
$N$ had to be increased.
Finally, one orthorhombic lattice needed an increase of the allowed
discrepancies to be correctly indexed.
\medskip
Of the 32 monoclinic patterns, 21 were indexed, although most of them required
the maximum volume
to be tightly restricted in order to make the programme run within
reasonable c.p.u.\ times.
At the present time (our tests ran almost continuously for 6 months) 5 synthetic
patterns
are not yet finished. The other 6 patterns definitely failed to be indexed.
\medskip
The reasons for these failures have been investigated: they are mainly due to
excessive imprecision
of the generated first lines; this appears to be much more harmful in the
monoclinic case than in
the higher symmetries.
In fact, determining the cell of an orthorhombic lattice requires finding three
independent lines and especially
at least one with $h\ne 0$, another with $k\ne 0$ and another with $\ell\ne 0$;
such a set of lines is usually found in the 4 or 5 first lines. If many lower
labels have zero values,
the determination of the orthorhombic cell becomes even easier. Conversely, the
determination of a monoclinic cell
necessarily requires at least one line with $h\ne 0$ Å\bf andÌ $k\ne 0$;
this condition often requires looking at higher lines due to the large number of
zeroes in the first ones.
Even worse, the determination of $\cos\gamma$ always results from differences
between lines, so that it
is often imprecise.
\medskip
For monoclinic cells, the computing times ranged from approximately one hour
to 245 hours on a HP1000/A900 computer
(comparable to a Microvax~II or a VAX~750), with a typical c.p.u.\ time of 10
to 20 hours.
\medskip
The programme also works and gives sensible results when trying to index
patterns with triclinic lattices,
but we have not yet done systematic trials to assess that it always finds the
correct cell.
\medskip Besides, the VAX version of the programme was sent to G. Johnson who
used it for trial indexing
of the materials listed in the Powder Diffraction File as
``unindexed''.
In fact, according to G.\ Johnson (1988), 25~\% of the unindexed materials were
indexed
by our programme, namely,
all the 20 first lines were indexed within estimated \expal\ accuracy with an
overall \italÅinformation meritÌ
greater than the standard threshold of 80.
\sectionÅV. Parameters and features.Ì
Although the programme is intended to work with all default values for its
parameters, the user may
change many of them to fit his special purposes:
\itemÅ--Ì cubic, hexagonal, tetragonal, orthorhombic, monoclinic and triclinic
lattices may be separately selected;
\itemÅ--Ì the user can either search F, I, A, B, C together with P-lattices or
only the P-lattices;
\itemÅ--Ì input data may be $d$, $1/d$, $1/d^2$, $\theta$, $2\theta$ or
$4\theta$;
\itemÅ--Ì each given line may be given its own precision;
\itemÅ--Ì each given line may be allowed to be ignored by the programme instead
of rejecting the whole trial;
\itemÅ--Ì the total number of skipped lines can be chosen by the user;
\itemÅ--Ì the minimum \italÅinformation meritÌ can be optionally changed;
\itemÅ--Ì a global error scaling factor can be introduced to relax or to
tighten the estimated precision
of all lines;
\itemÅ--Ì a zero angle shift can be introduced to take account of possible
systematic errors;
\itemÅ--Ì a general wavelength is given, but individual lines may have been
recorded with different radiations;
\itemÅ--Ì the maximum volume of the unit cell can be modified;
\itemÅ--Ì the maximum values of $a$, $b$ and $c$ can be specified;
\itemÅ--Ì the information about specific weight and molecular mass can be
introduced;
\itemÅ--Ì for long trials, the user is provided with a restart facility and
output file swapping
which allows him to look at previous results without stopping the process.
\sectionÅVI. Availability.Ì
The indexing programme is written in standard Fortran 77, with a few assembler
or
specific supervisor calls for timing controls. It runs presently on VAX/VMS,
HP1000/RTE-A,
IBM/MVS.
If needed, specific versions can be derived for UNISYS/1100, IBM/VM/CMS and PC.
\sectionÅVII. References.Ì
\def\jacÅ\italÅ\unskip\ J. Appl. Cryst., ÌÌ
\def\actaÅ\italÅ\unskip\ Acta Cryst., ÌÌ
\def\reference#1ÅÅ\par\hangindent=\parindent\hangafter 1\noindentÅ\rm
#1Ì\medskipÌÌ
%
%\check
\referenceÅKohlbeck, F. \& H\"orl, E.M. (1976). \jac Å\bf 9Ì, 28-33.Ì
\referenceÅLou\"er, D. \& Lou\"er, M. (1972). \jac Å\bf 5Ì, 271-275.Ì
\referenceÅShirley, R. (1978). \italÅIndexing Powder DiagramsÌ in \italÅProc.
of 1978 Summer SchoolÌ,
Delft Univ. Press and Oosthoeks, 1978.Ì
\referenceÅTaupin, D. (1968). \jac Å\bf 1Ì, 178-181.Ì
\referenceÅTaupin, D. (1973). \jac Å\bf 6Ì, 380-385.Ì
\referenceÅTaupin, D. (1988). \jac Å\bf 21Ì, \italÅin the pressÌ.Ì
\referenceÅVisser, J.W. (1969). \italÅJ. Appl. Cryst.,Ì Å\bf 2Ì, 89-95.Ì
\referenceÅWerner, P.E. (1964). \italÅZ. Kryst.,Ì Å\bf 120Ì, 375-387.Ì
\referenceÅWerner, P.E., Eriksson, L. \& Westdahl, M. (1985). \jac Å\bf 18Ì,
367-370.Ì
\vfill\eject\hrule
\bigskip\penalty -5000
\centerlineÅ\bf Appendix\FootnoteÅ%
This appendix is only given here for the referees' need of information.ÌÌ
\centerlineÅ\bf NOT to be publishedÌ
\medskip
\centerlineÅPractical use of the Powder Diagram Indexing ProgrammeÌ
\sectionÅA.1 Invoking the programme.Ì
Typical invocation of the programme is of the form:
\medskip
\par\leftline Å\tt PWDCDS \lft input file\rgt,\lft output file\rgt,\lft
breakpoint file\rgt,\lft log file\rgtÌ
\rightlineÅ (VMS, RTE-A).Ì
\par
\par\leftline Å\tt // EXEC PGM=PWDCDS,PARM='\lft input file\rgt,\lft output
file\rgt,\lft breakpoint file\rgt,\lft log file\rgt'Ì
\rightlineÅ (IBM-MVS).Ì
\par
\lft input file\rgt\ is the name of the file containing the input data (this is
a DDNAME in the case of IBM-MVS).
If omitted, the user is either prompted to give it, or standard input is used,
depending on the implementation.
\lft output file\rgt\ contains the printout after execution (standard output is
used when omitted).
On VAX-VMS, RTE-A, and other executive systems which allow dynamic allocation
of new files (i.e. not
on IBM-MVS) \lft output file\rgt\ is always a \italÅnewÌ file: if the name
already exists, a new version or
a new derived file name is created.
\par
If specified, \lft breakpoint file\rgt\ is used by the programme to store
restart information which
enables it to resume its operations at the point where it has been stopped
because of exceeded c.p.u.
time conditions (see parameters below).
If it does not exist, it is created (with restrictions for IBM-MVS); if it
exists, it is re-used for input
and for output. Therefore, it must be deleted or erased when performing a new
trial, i.e. performing a trial which
is not the continuation of a previous one.
\par
If specified,
\lft log file\rgt\ contains a very short summary of the execution, namely the
title of each indexed
lattice, its cell parameters and its information merit, and the c.p.u.\ time
already consumed for that pattern.
If it does not exist, it is created (with restrictions for IBM-MVS); if it
exists, it is re-used, but
the new information is \italÅappendedÌ to the existing file which, therefore,
contains the summary
of all trials referring to the same \lft log file\rgt.
%\check
\sectionÅA.2 Input data (the \lft input file\rgt\ structure).Ì
In order to avoid problems with editors (e.g. TSO) which use columns 73:80 for
record numbering,
significant data are supposed to be stored only in columns 1:72 of the records
of the \lft input file\rgt.
\subsectÅA.2.1 First recordÌ
Columns 1:12
contain a set of \italÅoption charactersÌ which specify which operations are to
be done,
and which kind of data are input below.
\def\otem#1Å\itemÅÅ\tt'#1'ÌÌÌ
\otemÅAÌ means that input data will be $2\theta$ angles, rather than $d$ values.
\otemÅA1Ì means that input data will be $\theta$ angles.
\otemÅA4Ì means that input data will be $4\theta$ angles.
\otemÅCÌ means that cubic lattices have to be sought.
\otemÅHÌ means that hexagonal lattices have to be sought.
\otemÅLÌ disables seeking of S, I, F lattices. If specified, only P-lattices
are looked for.
\otemÅMÌ means that monoclinic lattices have to be sought.
\otemÅOÌ (letter O, not zero) means that orthorhombic lattices have to be
sought.
\otemÅPÌ causes special ``punch'' output to happen on Fortran unit 7,
which has to be specially allocated before execution (this is a rather obsolete
feature).
\otemÅQÌ means that input data will be $1/d^2=Q$ rather than $d$.
\otemÅSÌ suppresses the normal redundancy test which eliminates lattices which
index
lines whose angles are lower than some base lines.
\otemÅTÌ means that tetragonal lattices have to be sought.
\otemÅUÌ means that input data will be $1/d=q$ rather than $d$.
%\check
\otemÅVÌ enables a check that the cell volume is approximately a multiple of the
volume of the elementary molecule. This option implies that the user gives the
density of the sample and its molecular weight.
\otemÅXÌ suppresses the updating of the minimum information merit to the best
one minus 10.
\otemÅZÌ produces an extensive printout. To be used for maintenance or checking
only.
\otemÅ3Ì means that triclinic lattices have to be sought.
\par
\noindent\italÅRemark: Ì if none of the options Å\tt CÌ, Å\tt HÌ, Å\tt MÌ, Å\tt
OÌ, Å\tt TÌ and Å\tt 3Ì
are specified, then all lattice symmetries are tried.
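
The input kinds selected by the options Å\tt AÌ, Å\tt A1Ì, Å\tt A4Ì, Å\tt QÌ and Å\tt UÌ all reduce to $Q=1/d^2$ through Bragg's law, $\lambda=2d\sin\theta$. The following Python sketch shows this reduction; it assumes angles given in degrees, and the function name and the string codes for the input kinds are hypothetical, not the programme's own.

```python
import math


def to_q(value, kind, wavelength=None):
    """Convert one input datum to Q = 1/d^2.

    kind is one of "d", "1/d", "Q", "theta", "2theta", "4theta";
    angles are assumed to be in degrees and need a wavelength.
    """
    if kind == "d":
        return 1.0 / value ** 2
    if kind == "1/d":
        return value ** 2
    if kind == "Q":
        return value
    divisor = {"theta": 1.0, "2theta": 2.0, "4theta": 4.0}[kind]
    theta = math.radians(value / divisor)
    # Bragg's law: 1/d = 2 sin(theta) / wavelength.
    return (2.0 * math.sin(theta) / wavelength) ** 2
```

For example, a line recorded at $2\theta=60^\circ$ with a wavelength of 1 gives $Q=1$, i.e. $d=1$ in the same unit.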
\par
Columns 13:72 contain a title which will be reproduced as a heading at the
beginning of each
apparently successful indexing, and in the \lft log file\rgt. It is usual to
indicate the
name of the sample in this field.
%\check
\subsectÅA.2.2 Second recordÌ
While the first record mainly contained \italÅbooleanÌ parameters, the second
one contains
numerical parameters.
These parameters are in free format, separated by commas, and blanks are
meaningless.
The user can choose to give these parameters as \italÅpositional parametersÌ, in
which case he
must take care not to miscount the commas when specifying, for example, the 16th
parameter.
He can also use \italÅkeywordsÌ like Å\tt WLTH=1.54018,Ì to specify the
wavelength.
\par
Keyword-specified parameters take precedence over positional ones.
\par
These parameters are specified as follows:
\par\penalty -2000\relax\removelastskip
%\check
\def\pardesc#1#2Å\itemÅÅ\hbox to 7mmÅ\hss #1Ì\hbox to 20mmÅ\tt
\ \ #2\hssÌÌÌÌ
%\check
Å\parindent 27mm
\pardescÅNo.ÌÅ\italÅkeywordÌÌ\italÅÅDescription.ÌÌ
\smallskip
\pardescÅ1ÌÅNLIN=Ì Number of \expal\ lines given. May be omitted, in which case
the given lines are counted.
\pardescÅ2ÌÅNEXT=Ì Number of lines the programme is allowed to skip if it
cannot index them within the
stated maximum deviations.
\pardescÅ3ÌÅNDEG=Ì Degenerescence parameter: this is the maximum number
$n_ÅdegÌ$ of \italÅbase linesÌ
which are allowed to be dependent on the previous ones. For instance, if this
parameter is equal to 5,
the programme will not try to introduce the 9th line as a base
line when the first 8
(8=3+5) are not sufficient to determine a trial cell in an orthorhombic
lattice.
The default value is 2; the maximum value is 20-$n$, but it is not wise to use
values greater than 5.
\pardescÅ4ÌÅINDX=Ì
If not negative, this parameter is the maximum value of
\hpkpl\ for the first line.
If negative, its absolute value gives the number $N$ of missing lines before
the first \expal\ one,
when line triplets are ordered by \hpkpl.
Default value is $-20$, i.e. $N=20$.
\pardescÅ5ÌÅCPUX=Ì Maximum c.p.u.\ time, in minutes.
If positive, the programme will stop when this limit is reached. Restart
information will be stored in the
\lft breakpoint file\rgt\ if this file name is present in the command.
If negative, the programme will continue after this limit is reached, but the
\lft output file\rgt\
will be closed, and the output will continue in another file whose name is
derived from the previous
one. Thus, the user will be able to look at the recently closed files in order
to
see whether something is going wrong and decide whether the programme should
continue running.
\pardescÅ6ÌÅVMAX=Ì Maximum volume of the unit cell. Default value is 1000 \AA
$^3$.
\pardescÅ7ÌÅINFM=Ì Minimum information merit. Only the lattices whose
information merit is greater than
this parameter are printed and recorded in the \lft log file\rgt.
This minimum is eventually increased to the best merit found, minus 10. This
updating is disabled by option Å\tt XÌ.
Default value is 80.
It is not wise to set it below 40, except for checking purposes.
Usual good lattices have a merit ranging from 60 to 300.
\pardescÅ8ÌÅMOLM=Ì Molecular mass of the sample. Useful only with Å\tt VÌ
option.
\pardescÅ9ÌÅDENS=Ì Density of the sample in g/cm$^3$. Useful only with Å\tt VÌ
option.
\pardescÅ10ÌÅDERR=Ì Relative error on that density. Useful only with Å\tt VÌ
option.
\pardescÅ11ÌÅWLTH=Ì Wavelength used when recording the pattern. Necessary if
input data are angles.
Useful to extrapolate deviations when not given in the specific record of any
line.
\pardescÅ12ÌÅERSC=Ì Error scale factor. If not equal to 1, this parameter is
used to increase the
given maximum deviations of all lines, without having to modify each record.
Thus, if an indexation seems to fail and no other obvious reasons appear, it
may be wise to
re-run the programme with Å\tt ERSC=1.5Ì or more.
Conversely, if a great number of bad merit cells appear in the output, it might
be wise to try indexing
with Å\tt ERSC=0.7Ì or less.
Note that this factor is internally restricted when trying large cell volumes
with high minimum merits.
\pardescÅ13ÌÅCHIM=Ì The minimum value of the expected reduced $\chi^2$. Default
value is 0.1.
It is not wise to increase it; decreasing it could help finding weird cells,
but it could cost much
c.p.u.\ time.
\pardescÅ14ÌÅAMIN=Ì The minimum value of $1/a^*$, $1/b^*$ and of $1/c^*$.
\pardescÅ15ÌÅAMAX=Ì The maximum value of $1/a^*$, $1/b^*$ and of $1/c^*$.
\pardescÅ16ÌÅBMMX=Ì The number of ``bad merit'' cells to be accepted before the
programme
stops as if the maximum c.p.u.\ time had been reached. The default value
is 128.
This parameter is a safeguard meant to warn the user that his data,
or the
quality he claims for them through his allowed deviations, are so poor that a
great number of
cells fit them, without reliability.
The corresponding counter is reset at each partial execution, just like
c.p.u.\ consumption
registers.
\pardescÅ17ÌÅAOFF=Ì An offset to be added to the angular values of all the
lines.
The purpose of this feature is to permit shifting all the input angles by a
constant offset
in order to account for a possible offset of the zero angle of the
diffractometer.
This feature is effective only if the Å\tt AÌ option has been specified.
\par
Ì
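\noindent\italÅExample: Ì the volume check enabled by option Å\tt VÌ can be
pictured as follows. This is only a sketch of the usual crystallographic
relation, not a description of the programme's internal test. Assuming that the
molecular mass $M$ (Å\tt MOLM=Ì) is in g/mol and the density $\rho$
(Å\tt DENS=Ì) in g/cm$^3$, and writing $N_A$ for Avogadro's number, the volume
occupied by one molecule is
$$V_Å\rm molÌ=ÅM\over\rho N_AÌ\approx 1.6605\,ÅM\over\rhoÌ\ \hboxÅ\AA$^3$Ì,$$
and an acceptable cell volume is close to $Z\,V_Å\rm molÌ$ for some integer $Z$.
With the hypothetical values $M=180$ and $\rho=1.5$, one finds
$V_Å\rm molÌ\approx 199$~\AA$^3$, so that cell volumes near 199, 399 or
797~\AA$^3$ ($Z=1$, 2 or 4) would be consistent with the sample.
\par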
\noindent\italÅRemark: Ì using keywords instead of positional values usually
leads to a second
record exceeding the 72 column limit. This is circumvented by terminating this
record by a Å\tt '-'Ì;
this last character is then dropped and the next record is considered as a
continuation of the previous one.
The total limit is 144 characters.
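\par
\noindent\italÅExample: Ì a hypothetical second record written with keywords and
continued on a second line could look as follows. The values are arbitrary, and
placing the hyphen immediately after the last comma is an assumption:
\par\smallskip
\noindentÅ\tt NEXT=1,NDEG=3,VMAX=1500,INFM=60,WLTH=1.54018,-Ì
\par
\noindentÅ\tt ERSC=1.2,BMMX=64Ì
\par\smallskip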
\subsectÅA.2.3 Experimental line records.Ì
There is one record per \expal\ line.
Each record is in free format, with positional values separated by commas.
The values are as follows:
\itemÅ1)Ì The line position, namely $d$. If options Å\tt AÌ, Å\tt A1Ì, Å\tt A2Ì
are present, this position
is the corresponding angle, if option Å\tt QÌ is present, the value given is
$1/d^2$, if option Å\tt UÌ is
present the value given is $1/d$.
\itemÅ2)Ì The maximum deviation of this line, expressed in the same units as
the first value.
\itemÅ4,5,6)Ì The values of $h$, $k$, $\ell$ of that line, if they are known.
One should give
the complete triplet, or nothing.
\itemÅ7)Ì The particular wavelength with which this line was recorded.
\par
The maximum deviation is necessary only for the first line. If further maximum
deviations are omitted,
they are extrapolated from the most recently stated, assuming that deviations
are constant in the
$\theta$ scale.
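\par
\noindent\italÅExample: Ì one way to picture this extrapolation, given here as a
sketch rather than as the programme's actual formula, is to differentiate
Bragg's law $\lambda=2d\sin\theta$ at constant $\lambda$. A fixed angular
deviation $\Delta\theta$ (in radians) then corresponds, in absolute value, to
$$\Delta d=d\,\cot\theta\,\Delta\theta,\qquad
\Delta Q=2Q\,\cot\theta\,\Delta\theta,$$
so that a deviation $\Delta d_1$ stated for the first line would give, for a
later line $j$,
$\Delta d_j=\Delta d_1\,(d_j\cot\theta_j)/(d_1\cot\theta_1)$.
The angles $\theta_j$ follow from the $d_j$ only through the wavelength.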
\par The presence of the letter Å\tt 'P'Ì or of the question mark Å\tt '?'Ì
anywhere in the record
means that the programme is allowed to drop this line, provided that the number
of dropped lines
does not exceed the value of the second parameter of the second record (Å\tt
NEXT=Ì).
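\par
\noindent\italÅExample: Ì three hypothetical line records, using only the first
two positional values. The first gives $d$ and its maximum deviation, the second
omits the deviation, and the third is flagged as a line that may be dropped
(placing the Å\tt ?Ì after a comma is an assumption):
\par\smallskip
\noindentÅ\tt 4.435,0.005Ì
\par
\noindentÅ\tt 3.872Ì
\par
\noindentÅ\tt 3.120,?Ì
\par\smallskip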
%\check
\sectionÅA.3 Some advice.Ì
The default values suit the most common trials. It is wise to change
them only when
something is known with certainty (e.g.: cell volume, maximum and minimum
parameters) or when
the trial with standard parameters
actually failed.
\par
The user should remember that multiplying $N$ (Å\tt INDX=Ì) by 2 results in
multiplying the c.p.u.\ time
by approximately 16 in the orthorhombic case, and by approximately 50 in the
monoclinic case...
The precision (actual and stated) of the input data is also a very critical
parameter:
a factor of 2 in the precision usually results in factors of 30 to 100 in
c.p.u.\ times.
\par
Therefore, the user is strongly advised to make the first trials with rather
restrictive conditions,
and to increase progressively the error scale factor (Å\tt ERSC=Ì), the
degenerescence parameter
(Å\tt NDEG=Ì) and the value of $N$ (Å\tt INDX=Ì)
in case of failure.
\bye