Errata for `Pattern Recognition and Neural Networks'

by B. D. Ripley

Cambridge University Press (1996) ISBN 0-521-48086-7

----------------------------------------------------------------------------

First and second printings:

p. 22 I confused origins for the vectors of probabilities here. So:

The optimal rule is to allocate to class 1 if X <= 12, class 2 if 
13 <= X <= 17 and to class 3 if X >= 18.

With two measurements the rule is to allocate to class 1 if Xbar <= 12,
class 2 if 12.5 <= Xbar <= 17 and to class 3 if Xbar >= 17.5.

The numbers given for class-wise success rates and overall error
rates are correct.
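
As a minimal sketch, the two corrected rules can be written as small Python
functions. The function names are hypothetical, and the code assumes that each
measurement is integer-valued (suggested by the gaps between 12 and 13 and
between 17 and 18, but not stated here), so that Xbar is a multiple of 0.5.

```python
def allocate_single(x: int) -> int:
    """Class (1, 2 or 3) for one measurement x, assumed integer-valued."""
    if x <= 12:
        return 1
    if x <= 17:          # covers 13 <= x <= 17
        return 2
    return 3             # x >= 18


def allocate_mean(x1: int, x2: int) -> int:
    """Class for the mean of two measurements, each assumed integer-valued."""
    xbar = (x1 + x2) / 2  # a multiple of 0.5 under the integer assumption
    if xbar <= 12:
        return 1
    if xbar <= 17:        # covers 12.5 <= xbar <= 17
        return 2
    return 3              # xbar >= 17.5
```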

p. 100 There are typos in the displays (and an error in the original
reference). In the first display n-1 should be n-K and there is a sign error
in the second line. In LaTeX:
\begin{eqnarray*}
\Delta^2_{jc} &\leftarrow& \Delta^2_{jc} \times \frac{n-K-1}{n-K}
\left(\frac{n_c}{n_c-1}\right)^2 \Bigm/ \left[1 -
\frac{n_c}{(n_c-1)(n-K)}\Delta^2_{jc} \right]\\
\Delta^2_{jk} &\leftarrow& \Delta^2_{jk} \times \frac{n-K-1}{n-K}
\left[ 1 + \frac{\{(\bx_j - \bmu_c)^T\hat\Sigma^{-1}(\bx_j - \bmu_k)\}^2}
{\{(n-K)(n_c-1)/n_c - \Delta^2_{jc}\}\Delta^2_{jk}}\right]
\end{eqnarray*}

The second display gives the updated value of \hat\Sigma_c. In this and the
final display n_k should be n_c.
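
For readers who want to check the arithmetic, the two corrected updates in the
first display for p. 100 can be transcribed literally into Python. This is
only a sketch. The function and argument names are hypothetical, `cross`
stands for the bilinear form (x_j - mu_c)^T Sigma^{-1} (x_j - mu_k), and the
code assumes that the Delta^2_{jc} on the right-hand side of the second line
is the value before the first update is applied.

```python
def update_own_class(d2_jc: float, n: int, K: int, n_c: int) -> float:
    """First line of the display: updated Delta^2_{jc}."""
    scale = (n - K - 1) / (n - K)
    ratio = (n_c / (n_c - 1)) ** 2
    denom = 1.0 - n_c / ((n_c - 1) * (n - K)) * d2_jc
    return d2_jc * scale * ratio / denom


def update_other_class(d2_jk: float, d2_jc: float, cross: float,
                       n: int, K: int, n_c: int) -> float:
    """Second line of the display: updated Delta^2_{jk}.

    d2_jc is assumed to be the value before updating; cross is
    (x_j - mu_c)^T Sigma^{-1} (x_j - mu_k).
    """
    scale = (n - K - 1) / (n - K)
    denom = ((n - K) * (n_c - 1) / n_c - d2_jc) * d2_jk
    return d2_jk * scale * (1.0 + cross ** 2 / denom)
```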

p. 225 l.20 the >largest< value of alpha ... (and hence the smallest tree)

p. 362 Ciampi et al. (1987): Recursive >partition<.

p. 377 [Second printing] Mathieson (1996) has page numbers 523-536.


Typos:

p. 22 l.5 delete small ( after large (.

p. 45 l.2
The formula for \hat\beta should be \hat\Sigma^{-1} (\hat\mu_k - \hat\mu_1)

p. 288 caption Cushing's

p. 360 Bridle (1990a): an editor's name is spelled Fogelman.

----------------------------------------------------------------------------

The following apply only to the first printing:


p. iv The copyright is owned by the author, not the publisher.

pp. 7, 357, 391 For Angulin read Angluin.

p. 13 For Freemantle read Fremantle.

p. 44 (2.31) log has been omitted before the two probabilities 
      p(2 | x_i) and [1 - p(2 | x_i)].

p. 59 l.10 \ell\pi_d n_d /\pi_n n_d (interchange d and n).

p. 63 For Evans & Swantz read Evans & Swartz.

p. 83 (2.53) The constant is 4 e^{4 epsilon + 4 epsilon^2}.

p. 87 second display The factor 1/4 should be on the first not second term.

pp. 91, 121 Dietterich & Bakiri (1991, 1995), not 1994.

p. 104 l.-11 (n - g)/n_t + 2 log \pi_t.

p. 114 Examples line 2 on the >seven< explanatory inputs.

p. 118 The convergence time stated for Mansfield's method applies to
       binary inputs.

p. 149 l.-9 Minus the log-likelihood ....

p. 163 line -9 `standard deviance' should be `standard deviation'.

p. 165 second para ... by simulating w.

p. 166 display The exponent is -n_w/2.

p. 192 Proposition 6.1   A clearer statement is:
`Then the error rate of the nearest neighbour rule averaged over training
sets converges as the size of the training set increases ...'

p. 193 line 1 This line should read:
Thus E_1 = E e_1(X) where e_1(x) = \sum_{i \ne j} p(i | x) p(j | x) = 1 -
\sum_i p(i | x)^2.
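
The equality of the two forms of e_1(x) follows because the posterior
probabilities sum to one, so \sum_{i \ne j} p(i | x) p(j | x) =
(\sum_i p(i | x))^2 - \sum_i p(i | x)^2. A short numerical check in Python
(hypothetical function names; `p` is a list of posterior probabilities at a
fixed x):

```python
def e1_pairwise(p):
    """Sum over ordered pairs i != j of p(i|x) p(j|x)."""
    k = len(p)
    return sum(p[i] * p[j] for i in range(k) for j in range(k) if i != j)


def e1_complement(p):
    """1 - sum_i p(i|x)^2, valid when the entries of p sum to one."""
    return 1.0 - sum(pi * pi for pi in p)


p = [0.5, 0.3, 0.2]  # illustrative posterior vector
assert abs(e1_pairwise(p) - e1_complement(p)) < 1e-12
```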

p. 196 lines 11 and 12 ... for odd $k$ $E_{k-1, \lceil k/2 \rceil} =
E'_{k-1} \le E^*$, which suggests bounding the Bayes risk by the achieved
performance of the $(k-1, \lceil k/2\rceil)$-nn rule.
(Thus we require a strict majority for the rule with even k-1.)

p. 196 Proposition 6.3 It is not made clear how ties are to be handled in
the 3-nn rule: if the neighbours are of three different classes (only
possible for K > 2) `doubt' is declared, so this is a (3,2)-nn rule.

p. 270 Figure 8.8. The link from d to f has not shown up; please reinstate
it.

pp. 292, 359, 391 For Bourland read Bourlard.

p. 294 line 8 The suffix of lambda should be j in both sums.

p. 307 line -6 The bold x's are the rows of X viewed as _row_ vectors, hence
x_r x_s^T is a scalar.
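
To illustrate the convention, here is a small Python sketch (hypothetical
names, with rows stored as plain lists): with rows treated as row vectors,
x_r x_s^T is an inner product and hence a single number, whereas the product
in the other order, x_r^T x_s, would be a square matrix.

```python
def row_inner(x_r, x_s):
    """x_r x_s^T for row vectors: a scalar."""
    return sum(a * b for a, b in zip(x_r, x_s))


def row_outer(x_r, x_s):
    """x_r^T x_s for row vectors: a matrix, shown only for contrast."""
    return [[a * b for b in x_s] for a in x_r]


X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # two rows, three columns
s = row_inner(X[0], X[1])      # 1*4 + 2*5 + 3*6
```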

p. 320 line -10 Macnaughton-Smith et al. (1964) not 1984.

p. 336 line 22 Q(phi, phi') = E[log p(phi, psi | X) | phi', X]: that is phi
was omitted from the right-hand side.

p. 350 In the definition of Mahalanobis distance there should be a transpose
on the first (x - y).
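
A minimal sketch of the corrected quadratic form (x - y)^T Sigma^{-1} (x - y)
in plain Python follows. The names are hypothetical, the code returns the
squared form only (whether the book's definition takes a square root is not
restated here), and it solves Sigma z = (x - y) rather than inverting Sigma
explicitly, which is a design choice of this sketch.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def mahalanobis_sq(x, y, sigma):
    """(x - y)^T Sigma^{-1} (x - y); the transpose is on the first factor."""
    d = [xi - yi for xi, yi in zip(x, y)]
    z = solve(sigma, d)                              # z = Sigma^{-1} (x - y)
    return sum(di * zi for di, zi in zip(d, z))      # (x - y)^T z, a scalar
```

With Sigma equal to the identity this reduces to the squared Euclidean
distance.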

References

(probably) means that I have been unable to check the original source.

p. 356 In Akaike (1985) the second editor is Fienberg.

p. 358 The first page of Baum (1988) is 193.

p. 361 Carroll & Dickinson (1989) ... using the Radon transform.

p. 362 Clark & Niblett (1989)'s last page is (probably) 283.

p. 364 Dietterich, T. G. & Bakiri, G. (1994) appeared in 1995 in Journal of
Artificial Intelligence Research 2, 263-286.

p. 365 The paper by Evans & Swartz (not Swantz) appeared in volume 10,
254-272, in August 1996 but with a 1995 date. The discussion appeared in
volume 11, 54-64 in February 1997 with a 1996 date.

p. 365 Fagin (1977)'s title is Multivalued dependencies and a new normal
form for relational databases.

p. 369 Hastie, T. & Tibshirani, R. (1996) appeared in volume 58, 155-176,
with title ending as `Gaussian mixtures'.

p. 370 Hjort, N. L. & Glad, I. K. (1995) appeared in volume 23, 882-904.

p. 370 Hjort, N. L. & Jones, M. C. (1995) appeared in 1996 in volume
24, 1619-1647.

p. 371 My copy of Hwang et al. (1994b) is undated, but it seems the report
was first distributed in 1993. A version appeared as `The cascaded
correlation learning: A projection pursuit perspective.' IEEE Transactions
on Neural Networks 7(2), 278-289, March 1996.

p. 371 Intrator & Gold (1993). The correct title is: Three-dimensional
object recognition using an unsupervised BCM network: the usefulness of
distinguishing features.

p. 374 In Kung & Diamantaras, Albuquerque is misspelt.

p. 375 Maass (1995). This is a Jan 1995 technical report, but the collection
in which it appeared is dated Oct 1994! The page numbers are 153-172.

p. 375 Macintyre & Sontag (1993). First page is (probably) 325.

p. 376 Macnaughton-Smith et al. (1964) not 1984.

p. 376 Mathieson (1995) appeared (very belatedly) in August 1996 with page
numbers 523-536 and editors A.-P. N. Refenes, Y. Abu-Mostafa, J. Moody & A.
Weigend.

p. 377 McCulloch & Pitts (1943). Replace `neural' by `nervous'.

p. 380 Quinlan (1979). Replace `classes' by `collections'.

pp. 383, 397 D. W. Scott (not D. F.)

p. 386 The details for Tarassenko et al. (1995) are pages 442-447 of IEE
Conference Publication 409.

p. 388 Weigend, Rumelhart & Huberman (1990) is probably Weigend, Huberman &
Rumelhart.

p. 389 Widrow & Hoff (1960). First page is 96.

p. 390 The precise reference is: Young, T. Y. & Calvert, T. W. (1974)
Classification, Estimation and Pattern Recognition. New York: American
Elsevier.

Typos

p. ix l.-14  ... or `clone' (delete `to')

p. 3 l.-4 delete `which'

p. 5 l.-5 airliner, not airline.

p. 27 l.-1 parameters >than<

p. 58 l.-16 Delete second of `the the'.

p. 62 l.-8 ... seems not >to< be widely ....

p. 202 l.-15 `minimize'

p. 228 Examples l.1 ... Pima Indians have ...

p. 228 Examples l.8 worse >than< the ...

p. 335 l.6 ... missing observations of a few ... (insert space).

----------------------------------------------------------------------------
Last edited on Weds 18 March 1998 by Brian Ripley <ripley@stats.ox.ac.uk>