| Foreword |
|
xv | |
|
Prerequisites in probability calculus |
|
|
1 | (28) |
|
|
|
1 | (1) |
|
|
|
1 | (11) |
|
|
|
1 | (2) |
|
Random Variables and their Distributions |
|
|
3 | (2) |
|
Joint Probability Distributions |
|
|
5 | (1) |
|
Conditional Probability Distributions |
|
|
6 | (1) |
|
|
|
6 | (1) |
|
|
|
7 | (1) |
|
|
|
7 | (1) |
|
Probability Models with Independence |
|
|
7 | (1) |
|
Multinomial Probability Distribution |
|
|
8 | (2) |
|
A Weight Matrix Model for a Family of Sequences |
|
|
10 | (1) |
|
|
|
11 | (1) |
|
|
|
12 | (1) |
|
|
|
12 | (1) |
|
A Missing Information Principle and Inference |
|
|
13 | (1) |
|
Some Distributions for DNA Analysis |
|
|
13 | (2) |
|
|
|
13 | (1) |
|
The Distribution of the Number of Fragments |
|
|
14 | (1) |
|
|
|
15 | (1) |
|
|
|
16 | (1) |
|
|
|
17 | (1) |
|
|
|
18 | (2) |
|
|
|
20 | (6) |
|
References and Further Reading: |
|
|
26 | (3) |
|
Information and the Kullback Distance |
|
|
29 | (22) |
|
|
|
29 | (1) |
|
|
|
30 | (1) |
|
Properties of Mutual Information |
|
|
31 | (2) |
|
|
|
31 | (1) |
|
|
|
32 | (1) |
|
Shannon's Source Coding Theorems |
|
|
33 | (3) |
|
|
|
33 | (1) |
|
The Source Coding Theorem |
|
|
34 | (1) |
|
Lossless Compression Codes and Entropy |
|
|
35 | (1) |
|
|
|
36 | (4) |
|
|
|
36 | (2) |
|
|
|
38 | (1) |
|
|
|
38 | (2) |
|
The Score and the Fisher Information |
|
|
40 | (2) |
|
Exercises on Mutual Information and Codelengths |
|
|
42 | (3) |
|
Kullback Distance and Fisher Information |
|
|
45 | (4) |
|
References and Further Reading: |
|
|
49 | (2) |
|
Probabilistic Models and Learning |
|
|
51 | (32) |
|
|
|
51 | (1) |
|
|
|
52 | (3) |
|
|
|
52 | (2) |
|
|
|
54 | (1) |
|
Models with Conditional Independence |
|
|
55 | (10) |
|
Modelling and Learning for Tosses of a Thumb tack |
|
|
55 | (3) |
|
Learning of the Multinomial Process |
|
|
58 | (6) |
|
|
|
64 | (1) |
|
Comparison of Model Families |
|
|
65 | (1) |
|
|
|
65 | (1) |
|
Inductive Learning, Updates |
|
|
65 | (1) |
|
Some Asymptotics for Evidence |
|
|
66 | (3) |
|
Evidence and Bayesian Codelengths |
|
|
69 | (3) |
|
|
|
72 | (4) |
|
Appendix: Dirichlet Densities |
|
|
76 | (2) |
|
|
|
76 | (1) |
|
|
|
77 | (1) |
|
|
|
78 | (1) |
|
References and Further Reading: |
|
|
78 | (5) |
|
|
|
83 | (22) |
|
Finite Mixtures of Distributions |
|
|
83 | (3) |
|
|
|
83 | (1) |
|
|
|
84 | (1) |
|
Likelihood and Missing Information |
|
|
85 | (1) |
|
An Expectation Maximization Algorithm |
|
|
86 | (3) |
|
|
|
86 | (1) |
|
|
|
87 | (1) |
|
An Explicit Form of Step M |
|
|
88 | (1) |
|
Exponential Type of Family of Distributions |
|
|
89 | (3) |
|
Definitions and Notations: |
|
|
89 | (1) |
|
EM for Exponential Type of Family of Distributions |
|
|
90 | (2) |
|
Summary of EM for Exponential Type of Families |
|
|
92 | (1) |
|
Iterations and Convergence |
|
|
92 | (1) |
|
A Model for Fragments with Motifs |
|
|
93 | (3) |
|
|
|
93 | (1) |
|
|
|
93 | (2) |
|
Generating Sequences with a Motif |
|
|
95 | (1) |
|
Learning the Model for One Motif |
|
|
96 | (1) |
|
|
|
96 | (6) |
|
References and Further Reading: |
|
|
102 | (3) |
|
|
|
105 | (20) |
|
|
|
105 | (1) |
|
|
|
105 | (1) |
|
|
|
106 | (2) |
|
|
|
106 | (1) |
|
|
|
107 | (1) |
|
|
|
108 | (3) |
|
|
|
108 | (1) |
|
|
|
109 | (1) |
|
The Sequence Metric and the Evolutionary Distance |
|
|
110 | (1) |
|
|
|
111 | (5) |
|
|
|
111 | (2) |
|
|
|
113 | (2) |
|
The Traceback: Optimal Alignment |
|
|
115 | (1) |
|
Global Similarity Alignment, Gap Penalities |
|
|
116 | (3) |
|
|
|
119 | (1) |
|
Appendix: On the Evolutionary Distance |
|
|
120 | (3) |
|
|
|
120 | (2) |
|
Edge Lengths of a Tree and the Related Metrics |
|
|
122 | (1) |
|
References and Further Reading: |
|
|
123 | (2) |
|
Mixture Models and Profiles |
|
|
125 | (26) |
|
|
|
125 | (3) |
|
Prior Information for a Weight Matrix |
|
|
125 | (1) |
|
Family and Superfamily Representations |
|
|
126 | (1) |
|
Multiple Alignment and Priors of Mixtures of Dirichlet Densities |
|
|
127 | (1) |
|
On a General Theory of Mixtures of Dirichlet Densities |
|
|
127 | (1) |
|
A Finite Mixture of Prior Dirichlets |
|
|
128 | (6) |
|
Multiple Alignment and Prior Information |
|
|
128 | (2) |
|
A Likelihood Function for the Frequency Counts |
|
|
130 | (2) |
|
The Mixture Estimation Algorithm |
|
|
132 | (2) |
|
Estimates of the Symbol Probability Distributions |
|
|
134 | (1) |
|
Aligning a Sequence to a Profile |
|
|
135 | (1) |
|
|
|
135 | (3) |
|
Optimal Fit of a String to a Profile |
|
|
136 | (2) |
|
|
|
138 | (3) |
|
Appendix A: Dirichlet Mixtures |
|
|
141 | (1) |
|
Definition of Dirichlet Mixtures |
|
|
141 | (1) |
|
Appendix B: The Predictive Multinomial Mixture |
|
|
142 | (3) |
|
The Predictive Multinomial Process |
|
|
142 | (1) |
|
|
|
143 | (1) |
|
The Posterior Mixture Density |
|
|
144 | (1) |
|
Mean Posterior Probability of Symbols |
|
|
144 | (1) |
|
Appendix C: The Lemmas for Mixture Estimation |
|
|
145 | (3) |
|
References and Further Reading: |
|
|
148 | (3) |
|
|
|
151 | (24) |
|
|
|
151 | (6) |
|
|
|
151 | (1) |
|
Definitions and Notations |
|
|
151 | (1) |
|
|
|
152 | (3) |
|
Markov Chains of kth Order |
|
|
155 | (1) |
|
Joint Probability Distribution of an MC |
|
|
156 | (1) |
|
Chapman--Kolmogorov Equations |
|
|
157 | (2) |
|
Stationarity and Invariant Distributions |
|
|
159 | (2) |
|
|
|
161 | (7) |
|
Definition and Properties |
|
|
161 | (1) |
|
A Sufficient Condition for Ergodicity |
|
|
162 | (5) |
|
Classification of the State Space |
|
|
167 | (1) |
|
|
|
168 | (5) |
|
On the Markov Property and the Invariant Distribution |
|
|
168 | (2) |
|
Stationary Markov Processes |
|
|
170 | (1) |
|
Entropy Rate of a Stationary Markov Chain |
|
|
170 | (1) |
|
|
|
171 | (2) |
|
References and Further Reading: |
|
|
173 | (2) |
|
Learning of Markov Chains |
|
|
175 | (16) |
|
|
|
175 | (1) |
|
|
|
175 | (1) |
|
|
|
176 | (3) |
|
|
|
176 | (1) |
|
ML of the Transition Matrix |
|
|
177 | (1) |
|
An Example of Full Likelihood |
|
|
178 | (1) |
|
|
|
179 | (2) |
|
|
|
181 | (2) |
|
Posterior Distributions for Rows in the Transition Matrix |
|
|
181 | (1) |
|
|
|
181 | (2) |
|
MC Order Comparison Using the Bayes Ratio |
|
|
183 | (1) |
|
|
|
184 | (2) |
|
|
|
186 | (3) |
|
References and Further Reading: |
|
|
189 | (2) |
|
Markovian Models for DNA sequences |
|
|
191 | (20) |
|
|
|
191 | (2) |
|
Markov Chains for DNA sequences |
|
|
193 | (2) |
|
Frame Dependent Markov Chains |
|
|
195 | (2) |
|
MTD and Interpolated Markov Models |
|
|
197 | (1) |
|
Probability Computations for Words |
|
|
198 | (6) |
|
First Occurrence of a Word |
|
|
199 | (3) |
|
Conditional Expected Frequency of a Word |
|
|
202 | (2) |
|
|
|
204 | (3) |
|
References and Further Reading: |
|
|
207 | (4) |
|
Hidden Markov Models: an Overview |
|
|
211 | (20) |
|
|
|
211 | (1) |
|
|
|
211 | (1) |
|
|
|
212 | (1) |
|
|
|
212 | (4) |
|
|
|
212 | (4) |
|
Generative Modelling of Sequences |
|
|
216 | (1) |
|
Examples and Applications |
|
|
216 | (3) |
|
Influence Diagrams, Nonstandard HMM |
|
|
219 | (5) |
|
|
|
224 | (1) |
|
Finite State Stochastic Machines |
|
|
224 | (2) |
|
Notations and Terminology |
|
|
224 | (1) |
|
|
|
225 | (1) |
|
|
|
226 | (3) |
|
References and Further Reading: |
|
|
229 | (2) |
|
|
|
231 | (14) |
|
|
|
231 | (1) |
|
Heterogeneity and Segmentation of Natural DNA sequences |
|
|
232 | (1) |
|
|
|
233 | (4) |
|
|
|
237 | (2) |
|
|
|
239 | (2) |
|
References and Further Reading: |
|
|
241 | (4) |
|
Left to Right HMM for Sequences |
|
|
245 | (26) |
|
|
|
245 | (1) |
|
|
|
245 | (1) |
|
|
|
246 | (1) |
|
Motif-Based Hidden Markov Models |
|
|
246 | (5) |
|
The Approximate Common Substring Problem |
|
|
246 | (3) |
|
|
|
249 | (2) |
|
|
|
251 | (1) |
|
Multiple Alignment by profile HMM |
|
|
251 | (2) |
|
|
|
253 | (2) |
|
HMM Alignment to a Profile |
|
|
253 | (1) |
|
|
|
253 | (1) |
|
|
|
254 | (1) |
|
|
|
255 | (7) |
|
|
|
255 | (2) |
|
The Probability Distribution for the Length of Generated Sequence |
|
|
257 | (2) |
|
Expected Length of a Generated Sequence |
|
|
259 | (2) |
|
|
|
261 | (1) |
|
|
|
262 | (5) |
|
|
|
262 | (1) |
|
Duration of Stay and Absorption |
|
|
262 | (2) |
|
A Project on the Yaglom Limit |
|
|
264 | (2) |
|
|
|
266 | (1) |
|
References and Further Reading: |
|
|
267 | (4) |
|
|
|
271 | (22) |
|
|
|
271 | (1) |
|
|
|
272 | (1) |
|
Various Sufficiency Properties |
|
|
273 | (4) |
|
|
|
277 | (2) |
|
|
|
279 | (6) |
|
|
|
285 | (1) |
|
Proof of Derin's Backwards Recursion |
|
|
286 | (2) |
|
|
|
288 | (3) |
|
References and Further Reading: |
|
|
291 | (2) |
|
Forward--Backward Algorithm |
|
|
293 | (24) |
|
|
|
293 | (1) |
|
Forward--Backward Algorithm and Evaluation |
|
|
294 | (5) |
|
|
|
294 | (1) |
|
|
|
295 | (2) |
|
|
|
297 | (1) |
|
The Scoring (Evaluation) Problem |
|
|
298 | (1) |
|
Recursions for Posterior Probabilities |
|
|
299 | (5) |
|
Filtering, Smoothing, Prediction |
|
|
299 | (2) |
|
|
|
301 | (2) |
|
|
|
303 | (1) |
|
The Alignment Problem and the Viterbi Algorithm |
|
|
304 | (5) |
|
|
|
304 | (1) |
|
The Bellman Optimality Principle |
|
|
305 | (2) |
|
Dynamic Programming for the Alignment (Decoding) Problem |
|
|
307 | (1) |
|
|
|
308 | (1) |
|
|
|
309 | (6) |
|
References and Further Reading: |
|
|
315 | (2) |
|
Baum--Welch Learning Algorithm |
|
|
317 | (28) |
|
|
|
317 | (2) |
|
Statement of the Learning Problem |
|
|
317 | (1) |
|
Plan for the Study of the Learning Problem |
|
|
318 | (1) |
|
The Quasi-log likelihood for HMM |
|
|
319 | (6) |
|
A Lower Bound for the Log Likelihood Ratio |
|
|
319 | (2) |
|
Expressions for the Quasi-log likelihood Function |
|
|
321 | (2) |
|
Maximization of the Quasi-log likelihood Function |
|
|
323 | (1) |
|
An Intermediate Summary of Re-estimation |
|
|
324 | (1) |
|
Forward and Backward Probabilities |
|
|
325 | (8) |
|
|
|
326 | (5) |
|
Re-estimates in Forward and Backward Variables |
|
|
331 | (2) |
|
Baum--Welch Learning for MAP |
|
|
333 | (3) |
|
|
|
333 | (1) |
|
|
|
334 | (2) |
|
|
|
336 | (7) |
|
References and Further Reading: |
|
|
343 | (2) |
|
Limit Points of Baum--Welch |
|
|
345 | (22) |
|
|
|
345 | (2) |
|
A Summary of Re-estimates |
|
|
345 | (1) |
|
Plan for the Further Study |
|
|
346 | (1) |
|
The Theorem on Limit Points of Re-estimates |
|
|
347 | (1) |
|
On the Stationary Points of the Likelihood |
|
|
347 | (6) |
|
A Formal Optimization Calculus |
|
|
347 | (1) |
|
Some Computational Details |
|
|
348 | (4) |
|
Re-estimation Formulas Rediscovered |
|
|
352 | (1) |
|
The Growth Transformation |
|
|
353 | (5) |
|
|
|
353 | (1) |
|
Polynomials Homogeneous in Degree n |
|
|
354 | (1) |
|
|
|
355 | (3) |
|
Baum--Welch Learning and Maximum Likelihood |
|
|
358 | (2) |
|
The Global Convergence Theorem |
|
|
360 | (1) |
|
Zangwill's Theory of Algorithms |
|
|
360 | (1) |
|
|
|
361 | (3) |
|
Appendix: Results in Mathematical Analysis |
|
|
364 | (1) |
|
Geometric Mean--Arithmetic Mean Inequality |
|
|
364 | (1) |
|
|
|
365 | (1) |
|
Euler's Theorem on Homogeneous Functions |
|
|
365 | (1) |
|
References and Further Reading: |
|
|
365 | (2) |
|
|
|
367 | (18) |
|
|
|
367 | (1) |
|
|
|
368 | (2) |
|
Steps of the Proof: An Outline |
|
|
370 | (1) |
|
Some Auxiliary Statements |
|
|
371 | (8) |
|
|
|
379 | (3) |
|
|
|
382 | (1) |
|
References and Further Reading: |
|
|
383 | (2) |
|
|
|
385 | (4) |
|
|
|
385 | (1) |
|
Revision of Local Alignment |
|
|
385 | (1) |
|
Probability Defined by Alignment |
|
|
386 | (1) |
|
|
|
387 | (1) |
|
References and Further Reading: |
|
|
388 | (1) |
| Index |
|
389 | |