LTE, Karp
[math]
\documentclass[12pt]{article}
\usepackage{amsmath}
\hbadness 10000
\topmargin -27pt
\oddsidemargin 0in
\evensidemargin 0in
\textheight 23cm
\textwidth 16.3cm
\def\ave#1{\langle #1 \rangle}
\def\pd{\partial}
%\def\sp(#1){\noalign{\vskip #1pt}}
\title{A Globally Convergent Method for computing the electron density
of a partially ionized plasma}
\author{Alan H. Karp\\
IBM Palo Alto\\
CA 94304, USA}
\date{Received 30 April 1979}
\begin{document}
\begin{titlepage}
\maketitle
\thispagestyle{empty}
\vspace{0.2cm}
\begin{abstract}
\end{abstract}
A stable Newton Raphson method for Saha and NLTE.\\
{\em JQSRT vol.23 (1980), No.3-D, p. 285-289}
\end{titlepage}
\section{Introduction}
\label{intr}
In LTE Saha:
\begin{equation}
\label{saha}
$$ \frac{N_{i,j+1}}{N_{ij}}=\frac{D_{ij}(T)}{N_e} $$
\end{equation}
where $N_{ij}$ is the number density of atoms of atomic number $i$ ionized $j$
times and
\begin{equation*}
\label{DT}
$$ D_{ij}(T) = \left(\frac{2\pi m_e kT}{h^2} \right)^{3/2}
2 \frac{u_{i,j+1}(T)}{u_{ij}(T)}\mathrm{e}^{-\chi_{ij}/kT} \; .$$
\end{equation*}
Since the partition functions $u_{ij}$ depend on the electron density $N_e$
through the depression of the continuum, $D_{ij}$ is also a function of
$N_e$. See Sec.2 why we can ignore this.
The fraction of atoms of atomic number $i$ ionized $j$
times is just
\begin{equation}
\label{Nij}
$$
\frac{N_{ij}}{N_{i}}=\frac{N_{ij}}{N_{i0}+N_{i1}+\dots+N_{ii}}=
\frac{N_{ij}/N_{i0}}{1+N_{i1}/N_{i0}+\ldots+N_{ii}/N_{i0}} \; ,
$$
\end{equation}
where
$$
\frac{N_{ij}}{N_{i0}}=\prod_{k=0}^{j-1}\frac{N_{i,k+1}}{N_{ik}}=
N_e^{-j}\prod_{k=0}^{j-1}D_{ik}(T)=A_{ij}(T)N_e^{-j} \; .
$$
If we define
$$
\alpha_i=1+\sum_{j=1}^i A_{ij}(T) N_e^{-j}
$$
and
$$
x_i=\sum_{j=1}^i jA_{ij}(T) N_e^{-j} \; ,
$$
it is clear that each atom of element $i$ gives up an average of $x_i/\alpha_i$
free electrons to the plasma.
The number density of atoms is $N_A=\rho/\mu_0 m_H$, where $\mu_0$ is mean
molecular weight (in hydrogen scale) of the completely recombined gas.
If $\nu_i$ is the fractional number abundance of element $i$ ($\sum_i \nu_i =
1$), then
\begin{equation}
\label{Ne}
$$
N_e = N_A \sum_i \nu_i x_i/\alpha_i \; .
$$
\end{equation}
Since Eq.(3) is nonlinear in $N_e$, an iterative procedure is needed.
One approach is to solve simultaneously for all the $N_{ij}$ and $N_e$ using
Eqs. (2), (3).
This method involves solving a system of equations for each iteration and may
become unstable.
Since simpler and more stable methods are available, the continued use of this
approach is surprising.
Another approach, which avoids these problems [1,2]: (1) Guess an $N_e$ ;
(2) compute $x_i$ and $\alpha_i$ for all $i$;
(3) use Eq.(3) to compute $N_e$; repeat. *** It seems Ron Eastman uses this,
so we have to change this in {\tt vladsf/ } (Elka?).
In nether method is the rate of convergence known.
Even though the change from one iteration to the next is small the result may
be far from the solution.
The radius of convergence is unknown.
There is no {\em a priori} way of knowing wether the first guess is close
enough to guarantee convergence.
The procedure derived in Sec.2 avoids these problems.
\section{Iteration for LTE}
\label{LTE}
We wish to find the root of the nonlinear equation
\begin{equation}
\label{gNe}
$$
g(N_e) = N_A \sum_i \nu_i x_i/\alpha_i -N_e
$$
\end{equation}
using the Newton-Raphson method, namely,
$$
N_e^{(n+1)}=N_e^{(n)}-g[N_e^{(n)}]/g'[N_e^{(n)}] ,
$$
where $n$ is the iteration index. Diff. Eq.(4) gives
\begin{equation}
\label{gprim}
$$
g'[N_e^{(n)}]=-\frac{N_A}{N_e} \sum_i \frac{\nu_i}{\alpha_i^2}( \alpha_i y_i
-x_i^2)-1 \equiv -\frac{N_A}{N_e}V -1,
$$
\end{equation}
where
$$
y_i=\sum_{j=0}^i j^2A_{ij}(T) N_e^{-j} \; .
$$
(Actually, summation starts from $j=1$, since $j^2=0$ for $j=0$, {\it P.Baklanov}.)
\noindent Letting
$$
U=\sum_i \nu_i x_i/\alpha_i
$$
gives
$$
N_e^{(n+1)}=N_e^{(n)} + \frac{N_AU-N_e^{(n)}}{N_AV/N_e^{(n)}+1},
$$
which can be rearranged to give
\begin{equation}
\label{nenp1}
$$
N_e^{(n+1)}=N_e^{(n)}\left\{\frac{U+V}{N_e^{(n)}/N_A+V}\right\}.
$$
\end{equation}
as $n\to\infty$, $N_AU \to N_e^{(n)}$ and the quantity in brackets goes to unity,
as shown in Sec.3.
Frequently one knows the total gas pressure $P_g$ instead of the mass density
used to derive Eq.(3).
Since the electrons contribute to $P_g$, the situation is more complicated.
Eq.(5) (*** probably must be 4) becomes
$$
g(N_e) = (N-N_e) \sum_i \nu_i x_i/\alpha_i -N_e ,
$$
where $N=P_g/kT$.
Using the same procedure as before gives
$$
N_e^{(n+1)}=N_e^{(n)}\left\{\frac{NU+[N-N_e^{(n)}]V}{N_e^{(n)}(1+U)+[N-N_e^{(n)}]V}\right\}.
$$
Again the quantity in brackets goes to unity
as $n$ increases.
It is now possible to justify ignoring the dependence of the partition
function on $N_e$. The depression of the continuum by nearby charged particles
only affects the highest bound levels [3], i.e.
$$
u_{ij}=\sum_{k=1}^\infty W_{ijk}(T,N_e)g_{ijk}\mathrm{e}^{-E_{ijk}/kT},
$$
where $k$ refers to the atomic levels, $g_{ijk}$ is the statistical weight, and
$E_{ijk}$ is the excitation potential.
$W_{ijk}$ is the probability that level $k$ exists and deviates from unity only
when $E_{ijk}$ is near the ionization potential.
Therefore, $u_{ij}$ has an appreciable dependence on $N_e$ only when
$kT\simeq \chi_{ij}$.
However, in this situation, the relative abundance of ion $j$ is low and its
contribution to the sums in Eq.(3) is small.
While in principle the depression of the continuum [3],
$\Delta\chi=1.8\times10^{-8}N_e^{1/3}$ eV can be significant, the dependence
is weak and is never important in the temperature-density range of interest.
*** Then a table is given from [4] showing that convergence goes as
$$
|\epsilon^{(n+1)}| \leq C [\epsilon^{(n)}]^2
$$
with $ \epsilon^{(n)}$ a relative error in $N_e$ on $n$-th iteration
and $C$ of order unity [5], i.e. the convergence is quadratic.
\section{Convergence proofs}
\label{proofs}
Clearly, $D_{ij}(T)$ can be treated as independent of $N_e$ as expected. This
assumption has been made implicitly in the proofs given below.
Some insight to the convergence properties can be gained by examining
$x_i/\alpha_i$ as a function of $N_e$.
When $N_e$ is low (perhaps because the total density is low),
$x_i \to iA_{ii}N_e^{-1}$ and $\alpha_i \to A_{ii}N_e^{-1}$ and, therefore,
the ratio approaches $i$.
In other words, every atom gives up all its electrons.
On the other hand, when $N_e$ gets large, $x_i \to A_{i1}/N_e$ and
$\alpha_i \to 1+A_{i1}/N_e$ and so the ratio goes to zero.
We expect, therefore, that the first derivative of $g(N_e)$ with respect to $N_e$
will be negative.
If, in addition, $g''(N_e)>0$ for all $N_e$, i.e. $g(N_e)$ is convex, then the
convergence is guaranteed for any first guess [5].
Convergence is proven in the following two theorems.
{\em Theorem 1:} $g'(N_e)<0$ for all $N_e>0$.
{\em Proof:} Since all quantities in Eq.(5) are positive for $N_e>0$, it
suffices to show that $D_i=N_{i0}^2( \alpha_i y_i -x_i^2)>0$ for all $N_e>0$.
Using
$$
\alpha_i=1+\sum_{j=1}^i N_{ij}/N_{i0},
$$
$$
x_i=\sum_{j=1}^i j N_{ij}/N_{i0} \; ,
$$
and
$$
y_i=\sum_{j=1}^i j^2 N_{ij}/N_{i0} \; ,
$$
gives
$$
D_i=\left[\sum_{j=0}^i N_{ij} \right] \left[\sum_{j=0}^i j^2N_{ij}\right]
-\left[\sum_{j=0}^i j N_{ij}\right]^2.
$$
But
$$
\left[\sum_{j=0}^i j N_{ij}\right]^2 =
\left[\sum_{j=0}^i j N_{ij}^{1/2} N_{ij}^{1/2} \right]^2 \leq
\left[\sum_{j=0}^i N_{ij} \right] \left[\sum_{j=0}^i j^2N_{ij}\right]
$$
by the Schwartz (Cauchy) inequality [6]. An alternate proof proceeds by
induction. We note that
$$
D_1=N_{11}(N_{10}+N_{11})-N_{11}^2=N_{11}N_{10}>0
$$
and assume $D_{k-1}>0$.
Then
$$
D_k=\left[N_{kk}+\sum_{j=0}^{k-1} N_{kj} \right]
\left[k^2N_{kk}+\sum_{j=0}^{k-1} j^2N_{kj}\right]
-\left[kN_{kk}+\sum_{j=0}^{k-1} j N_{kj}\right]^2.
$$
and
$$
D_k=D_{k-1}+N_{kk}\sum_{j=0}^{k-1}(k-j)^2N_{j}>0 \mathrm{***check if} N_{kj}
$$
since both term on RHS are positive.
{\em Theorem 2:} $g''(N_e)>0$ for all $N_e>0$.
{\em Proof:} Differentiating Eq.(5) with respect to $N_e$ gives
$$
g''[N_e^{(n)}]=\frac{N_A}{N_e^2} \sum_i \frac{\nu_i}{\alpha_i^3}
[ \alpha_i(\alpha_i y_i-x_i^2) + \alpha_i^2 z_i + \alpha_i x_i y_i+2x_i^3],
$$
where
$$
z_i=\sum_{j=1}^i j^3 N_{ij}/N_{i0} \; .
$$
From theorem 1, $\alpha_i y_i > x_i^2$ so that $ g''(N_e)>0$.
These theorems prove that the iteration always converges as long as
$N_e^{(n)}>0$ and that the convergence is quadratic.
Since $ g'(N_e)<0$ and $ g''(N_e)>0$ for $ N_e>0$, the iteration approaches
the solution from below.
Therefore, if $ N_e^{(1)}<0$, it must be replaced by a small positive
number.
This, and the obvious need to avoid overflows and underflows, are the only
precautions needed to ensure convergence.
\section{Iteration procedure for NLTE}
\label{NLTE}
The extension to non-LTE is straightforward. Let a starred quantity
represent the LTE value and unstarred the NLTE. Then,
$$
\frac{N_{i,j+1}}{N_{ij}}=b_{ij}\left(\frac{N_{i,j+1}}{N_{ij}}\right)^* ,
$$
where $b_{ij}$ is the departure coefficient.
Defining $\alpha_i$, $x_i$ and $y_i$ by using $N_{ij}$ in place of $N_{ij}^*$
gives, for example,
$$
\alpha_i=1+\sum_{j=1}^i\left(\frac{N_{ij}}{N_{i0}}\right)^*
\prod_{k=0}^{j-1} b_{ik},
$$
If the departure coefficient is not a function of $N_e$ the iteration converges
as before.
Since, in general, $b_{ij}$ depends on $N_e$ the derivative $g'(N_e)$ has not
been computed exactly. In spite of this problem the iteration may still
converge but more slowly than before.
Some plausibility arguments can be made.
We consider an atom with one bound level and a continuum for which [7]
$$
b_{i\kappa}=\frac{R_{i\kappa}+N_e C_{i\kappa}}{R_{\kappa i}+N_e C_{\kappa i}}
$$
There are three cases of interest.
(1) Collisions dominate, LTE holds, and $b_{i\kappa}=1$.
(2) The radiation is Planckian at the local temperature which implies
$b_{i\kappa}=1$.
(3) Radiation processes dominate and the radiation is non-Planckian.
Only in this case does $b_{ij}$ depend on $N_e$, but only through the
radiation field.
If an estimate of the derivative of $b_{ij}$ with respect to $N_e$ is
available, then the iteration of Eq.(6) can be used with the definitions
$$
U=N_A \sum_i \nu_i x_i/\alpha_i
$$
$$
V=N_A \sum_i \nu_i [\alpha_i y_i - x_i^2 + q_i\alpha_i-p_ix_i]/\alpha_i^2,
$$
where
$$
p_i=N_e\sum_{j=1}^i A_{ij}N_e^{-j}
\frac{\mathrm{d}}{\mathrm{d}N_e}\prod_{k=0}^{j-1}b_{ik},
$$
$$
q_i=N_e\sum_{j=1}^i jA_{ij}N_e^{-j}
\frac{\mathrm{d}}{\mathrm{d}N_e}\prod_{k=0}^{j-1}b_{ik}.
$$
The more accurately the derivative of $b_{ij}$ with respect to $N_e$ is known,
the more nearly quadratic the convergence.
\bigskip
{\em Acknowledgement} ... David Fischel provided the proof for theorem 1.
\begin{thebibliography}{}
\bibitem{1} C.A.Rouse, ApJ 135, 599 (1962).
\bibitem{2} R.L.Kurucz, ATLAS, a computer program for calculating model stellar
atmospheres, SAO Special Rep. No. 309 (1970).
\bibitem{3} D.Fischel, W.M. Sparks, ApJ 164, 359 (1971).
\bibitem{4} J.W.Fowler, Line-Blank model stellar atm. Sirius, unpub. PhD
thesis, Univ. maryland, 1972; also NASA-GSFC doc. No. X-670-72-303
\bibitem{5} J.M.Ortega, Numerical Analysis, Academic Press, NY (1972).
\bibitem{6} R.R.Goldberg, Methods of real analysis, p.93, Blaisdell, NY (1964).
\bibitem{7} D.Mihalas, Stellar atmospheres. Freeman, SF (1970).
\end{thebibliography}
\end{document}
[/math]