(I will try to post notes here right before lecture.)

Notes 1, Introduction

**I. Vector Spaces and Linear Representations**

Notes 2, intro to bases for representing functions

Notes 3, linear vector spaces

Notes 4, norms and inner products

Notes 5, linear approximation

Notes 6, orthobases; see also technical details about convergence

Notes 7, nonorthogonal (Riesz) bases

Notes 8, linear functionals and reproducing kernel spaces

**II. Linear Estimation using Least Squares**

Notes 9, linear regression, basis regression, and linear inverse problems

Notes 10, symmetric systems of linear equations

Notes 11, the singular value decomposition

Notes 12, the least-squares problem

Notes 13, stable least-squares

Notes 14, least-squares in Hilbert space

Notes 15, kernel regression, Mercer’s theorem

Notes 16, matrix factorization

Notes 17, iterative methods for solving least squares (steepest descent and conjugate gradients)

see also Shewchuk’s excellent paper, “An Introduction to the Conjugate Gradient Method Without the Agonizing Pain”

**III. Statistical Estimation and Classification**

Notes 18, a concise review of probability; the MMSE estimator is the conditional mean

Notes 19, Gaussian estimation

Notes 20, conditional independence and Gaussian graphical models

Notes 21, maximum likelihood estimation

Notes 22, bias, consistency, and efficiency

Notes 23, Stein’s paradox

see also the excellent papers by Samworth and by Efron and Morris

Notes 24, Bayesian estimation

Notes 25, classification using Bayes’ rule and nearest neighbors

Notes 26, empirical risk minimization

**Interlude**: Notes 27, basics of (unconstrained and constrained) gradient descent

**IV. Modeling**

Notes 28, principal components analysis

Notes 29, Gaussian mixture models, EM algorithm

Notes 30, hidden Markov models

see also the excellent review paper by Rabiner