Search results for [\Learning]

Using Machine Learning to Predict Student Performance.pdf, a pretty good …

2022-04-29 13:00:58 232KB machine-learning document-materials artificial-intelligence documents

Optimizing Extreme Learning Machines with Kernel Functions.zip, this …

2022-04-29 13:00:44 2KB documents

adabnn: Code related to the paper "AdaBnn: Binarized Neural Networks trained with adaptive structural learning"

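AdaBnn. Code related to the paper "AdaBnn: Binarized Neural Networks trained with adaptive structural learning". The repository currently contains two Colaboratory notebooks: an experimental Keras-based implementation of the AdaNet algorithm, presented in the paper “…”, for learning the structure of a neural network as an ensemble of subnetworks. In addition, AdaBnn is presented as a modification of AdaNet that imposes binary constraints at run time to try to improve time performance, and as a form of regularization based on “…”. Separate code containing the AdaNet and AdaBnn implementations, with documentation, is also included. Some findings from the experiments in the notebooks: in adaptive structure learning, binarizing the network weights has an effect similar to a high mutation rate in genetic algorithms; it is hard to follow a learning pattern between iterations, and performance does not improve incrementally over the T iterations. Adam optimization suits this kind of AdaBnn structure better in most cases, and with fewer iterations (the T parameter in the paper). For now, binarizing AdaNet does not bring much improvement, but it could be a starting point for adding constraints on weights/activations as a regularization method for adaptive structure learning. Further work: this might include the binarization process as part of the convolutional subnetworks, which was the original proposal of (M Courbariaux, 2016). Example: importing dependencies …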

2022-04-29 11:23:47 4.24MB deep-learning tensorflow scikit-learn keras


saliency: TensorFlow implementation of SmoothGrad, Grad-CAM, Guided Backprop, Integrated Gradients, and other saliency techniques

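Saliency Methods. Introduction: this repository contains code for the following saliency techniques: XRAI*, SmoothGrad*, Vanilla Gradients, Guided Backpropagation, Integrated Gradients, Occlusion, Grad-CAM, Blur IG*. (*Developed by PAIR.) This list is by no means comprehensive. We welcome requests to add new methods! Download: pip install saliency, or for the development version: git clone https://github.com/pair-code/saliency cd saliency. Usage: each saliency mask class extends the SaliencyMask base class. This class contains the following methods: __init__(graph, sessio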

2022-04-29 11:10:45 2.52MB machine-learning deep-neural-networks deep-learning tensorflow


Python deep learning for ID document recognition: deepLearning_OCR-master.zip

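The system has two parts: a mobile (Android) client and a server. The mobile client has two modules: an input module and an output module. The server has three modules: a model loading module, a model processing module, and a result mapping module.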

2022-04-29 09:11:44 104.71MB python deep-learning source-code-software programming-language

bindsnet: Simulation of spiking neural networks (SNNs) using PyTorch

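A Python package for simulating spiking neural networks (SNNs) on CPUs or GPUs using Tensor functionality. BindsNET is a spiking neural network simulation library geared towards developing biologically inspired algorithms for machine learning. The package is used as part of ongoing research applying SNNs to machine learning (ML) and reinforcement learning (RL) problems. The project also provides a collection of experiments, functions for analyzing results, plots of experimental outcomes, and more, along with documentation for the package. Requirements: Python 3.6, requirements.txt. Setting things up, using pip: BindsNET is available through its git repository. Run pip install git+https://github.com/BindsNET/bi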

2022-04-28 22:44:30 23.52MB machine-learning reinforcement-learning simulation dynamic


An Introduction to Statistical Learning

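This book introduces the basic ideas, principles, and methods of statistics, giving readers an overall understanding of statistics and the statistical way of thinking. Its main topics are the development and application areas of statistics, probability theory, concepts and methods of data collection, the description of population information, and commonly used methods for parameter estimation and hypothesis testing. The book emphasizes using probability theory to explain the principles behind common statistical methods, and uses computer simulation to help readers understand statistical ideas and principles, so that statistics is not misread as a set of simple arithmetic formulas. This strengthens students' ability to use statistical ideas and methods to pose, analyze, and solve problems. The book is suitable as an introductory statistics textbook for undergraduates at colleges and universities.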

2022-04-28 20:56:38 14.19MB R-language


Machine-Learning-with-Python: Predicting rainfall in Australia using machine learning

2022-04-28 18:10:24 1.1MB JupyterNotebook


Learning From Data

Data analysis, big data applications; very good

2022-04-28 16:42:48 4.99MB data-analysis


Pattern Recognition and Machine Learning

A classic textbook on pattern recognition. Contents:
1 Introduction: 1.1 Example: Polynomial Curve Fitting; 1.2 Probability Theory (1.2.1 Probability densities, 1.2.2 Expectations and covariances, 1.2.3 Bayesian probabilities, 1.2.4 The Gaussian distribution, 1.2.5 Curve fitting re-visited, 1.2.6 Bayesian curve fitting); 1.3 Model Selection; 1.4 The Curse of Dimensionality; 1.5 Decision Theory (1.5.1 Minimizing the misclassification rate, 1.5.2 Minimizing the expected loss, 1.5.3 The reject option, 1.5.4 Inference and decision, 1.5.5 Loss functions for regression); 1.6 Information Theory (1.6.1 Relative entropy and mutual information); Exercises
2 Probability Distributions: 2.1 Binary Variables (2.1.1 The beta distribution); 2.2 Multinomial Variables (2.2.1 The Dirichlet distribution); 2.3 The Gaussian Distribution (2.3.1 Conditional Gaussian distributions, 2.3.2 Marginal Gaussian distributions, 2.3.3 Bayes' theorem for Gaussian variables, 2.3.4 Maximum likelihood for the Gaussian, 2.3.5 Sequential estimation, 2.3.6 Bayesian inference for the Gaussian, 2.3.7 Student's t-distribution, 2.3.8 Periodic variables, 2.3.9 Mixtures of Gaussians); 2.4 The Exponential Family (2.4.1 Maximum likelihood and sufficient statistics, 2.4.2 Conjugate priors, 2.4.3 Noninformative priors); 2.5 Nonparametric Methods (2.5.1 Kernel density estimators, 2.5.2 Nearest-neighbour methods); Exercises
3 Linear Models for Regression: 3.1 Linear Basis Function Models (3.1.1 Maximum likelihood and least squares, 3.1.2 Geometry of least squares, 3.1.3 Sequential learning, 3.1.4 Regularized least squares, 3.1.5 Multiple outputs); 3.2 The Bias-Variance Decomposition; 3.3 Bayesian Linear Regression (3.3.1 Parameter distribution, 3.3.2 Predictive distribution, 3.3.3 Equivalent kernel); 3.4 Bayesian Model Comparison; 3.5 The Evidence Approximation (3.5.1 Evaluation of the evidence function, 3.5.2 Maximizing the evidence function, 3.5.3 Effective number of parameters); 3.6 Limitations of Fixed Basis Functions; Exercises
4 Linear Models for Classification: 4.1 Discriminant Functions (4.1.1 Two classes, 4.1.2 Multiple classes, 4.1.3 Least squares for classification, 4.1.4 Fisher's linear discriminant, 4.1.5 Relation to least squares, 4.1.6 Fisher's discriminant for multiple classes, 4.1.7 The perceptron algorithm); 4.2 Probabilistic Generative Models (4.2.1 Continuous inputs, 4.2.2 Maximum likelihood solution, 4.2.3 Discrete features, 4.2.4 Exponential family); 4.3 Probabilistic Discriminative Models (4.3.1 Fixed basis functions, 4.3.2 Logistic regression, 4.3.3 Iterative reweighted least squares, 4.3.4 Multiclass logistic regression, 4.3.5 Probit regression, 4.3.6 Canonical link functions); 4.4 The Laplace Approximation (4.4.1 Model comparison and BIC); 4.5 Bayesian Logistic Regression (4.5.1 Laplace approximation, 4.5.2 Predictive distribution); Exercises
5 Neural Networks: 5.1 Feed-forward Network Functions (5.1.1 Weight-space symmetries); 5.2 Network Training (5.2.1 Parameter optimization, 5.2.2 Local quadratic approximation, 5.2.3 Use of gradient information, 5.2.4 Gradient descent optimization); 5.3 Error Backpropagation (5.3.1 Evaluation of error-function derivatives, 5.3.2 A simple example, 5.3.3 Efficiency of backpropagation, 5.3.4 The Jacobian matrix); 5.4 The Hessian Matrix (5.4.1 Diagonal approximation, 5.4.2 Outer product approximation, 5.4.3 Inverse Hessian, 5.4.4 Finite differences, 5.4.5 Exact evaluation of the Hessian, 5.4.6 Fast multiplication by the Hessian); 5.5 Regularization in Neural Networks (5.5.1 Consistent Gaussian priors, 5.5.2 Early stopping, 5.5.3 Invariances, 5.5.4 Tangent propagation, 5.5.5 Training with transformed data, 5.5.6 Convolutional networks, 5.5.7 Soft weight sharing); 5.6 Mixture Density Networks; 5.7 Bayesian Neural Networks (5.7.1 Posterior parameter distribution, 5.7.2 Hyperparameter optimization, 5.7.3 Bayesian neural networks for classification); Exercises
6 Kernel Methods: 6.1 Dual Representations; 6.2 Constructing Kernels; 6.3 Radial Basis Function Networks (6.3.1 Nadaraya-Watson model); 6.4 Gaussian Processes (6.4.1 Linear regression revisited, 6.4.2 Gaussian processes for regression, 6.4.3 Learning the hyperparameters, 6.4.4 Automatic relevance determination, 6.4.5 Gaussian processes for classification, 6.4.6 Laplace approximation, 6.4.7 Connection to neural networks); Exercises
7 Sparse Kernel Machines: 7.1 Maximum Margin Classifiers (7.1.1 Overlapping class distributions, 7.1.2 Relation to logistic regression, 7.1.3 Multiclass SVMs, 7.1.4 SVMs for regression, 7.1.5 Computational learning theory); 7.2 Relevance Vector Machines (7.2.1 RVM for regression, 7.2.2 Analysis of sparsity, 7.2.3 RVM for classification); Exercises
8 Graphical Models: 8.1 Bayesian Networks (8.1.1 Example: Polynomial regression, 8.1.2 Generative models, 8.1.3 Discrete variables, 8.1.4 Linear-Gaussian models); 8.2 Conditional Independence (8.2.1 Three example graphs, 8.2.2 D-separation); 8.3 Markov Random Fields (8.3.1 Conditional independence properties, 8.3.2 Factorization properties, 8.3.3 Illustration: Image de-noising, 8.3.4 Relation to directed graphs); 8.4 Inference in Graphical Models (8.4.1 Inference on a chain, 8.4.2 Trees, 8.4.3 Factor graphs, 8.4.4 The sum-product algorithm, 8.4.5 The max-sum algorithm, 8.4.6 Exact inference in general graphs, 8.4.7 Loopy belief propagation, 8.4.8 Learning the graph structure); Exercises
9 Mixture Models and EM: 9.1 K-means Clustering (9.1.1 Image segmentation and compression); 9.2 Mixtures of Gaussians (9.2.1 Maximum likelihood, 9.2.2 EM for Gaussian mixtures); 9.3 An Alternative View of EM (9.3.1 Gaussian mixtures revisited, 9.3.2 Relation to K-means, 9.3.3 Mixtures of Bernoulli distributions, 9.3.4 EM for Bayesian linear regression); 9.4 The EM Algorithm in General; Exercises
10 Approximate Inference: 10.1 Variational Inference (10.1.1 Factorized distributions, 10.1.2 Properties of factorized approximations, 10.1.3 Example: The univariate Gaussian, 10.1.4 Model comparison); 10.2 Illustration: Variational Mixture of Gaussians (10.2.1 Variational distribution, 10.2.2 Variational lower bound, 10.2.3 Predictive density, 10.2.4 Determining the number of components, 10.2.5 Induced factorizations); 10.3 Variational Linear Regression (10.3.1 Variational distribution, 10.3.2 Predictive distribution, 10.3.3 Lower bound); 10.4 Exponential Family Distributions (10.4.1 Variational message passing); 10.5 Local Variational Methods; 10.6 Variational Logistic Regression (10.6.1 Variational posterior distribution, 10.6.2 Optimizing the variational parameters, 10.6.3 Inference of hyperparameters); 10.7 Expectation Propagation (10.7.1 Example: The clutter problem, 10.7.2 Expectation propagation on graphs); Exercises
11 Sampling Methods: 11.1 Basic Sampling Algorithms (11.1.1 Standard distributions, 11.1.2 Rejection sampling, 11.1.3 Adaptive rejection sampling, 11.1.4 Importance sampling, 11.1.5 Sampling-importance-resampling, 11.1.6 Sampling and the EM algorithm); 11.2 Markov Chain Monte Carlo (11.2.1 Markov chains, 11.2.2 The Metropolis-Hastings algorithm); 11.3 Gibbs Sampling; 11.4 Slice Sampling; 11.5 The Hybrid Monte Carlo Algorithm (11.5.1 Dynamical systems, 11.5.2 Hybrid Monte Carlo); 11.6 Estimating the Partition Function; Exercises
12 Continuous Latent Variables: 12.1 Principal Component Analysis (12.1.1 Maximum variance formulation, 12.1.2 Minimum-error formulation, 12.1.3 Applications of PCA, 12.1.4 PCA for high-dimensional data); 12.2 Probabilistic PCA (12.2.1 Maximum likelihood PCA, 12.2.2 EM algorithm for PCA, 12.2.3 Bayesian PCA, 12.2.4 Factor analysis); 12.3 Kernel PCA; 12.4 Nonlinear Latent Variable Models (12.4.1 Independent component analysis, 12.4.2 Autoassociative neural networks, 12.4.3 Modelling nonlinear manifolds); Exercises
13 Sequential Data: 13.1 Markov Models; 13.2 Hidden Markov Models (13.2.1 Maximum likelihood for the HMM, 13.2.2 The forward-backward algorithm, 13.2.3 The sum-product algorithm for the HMM, 13.2.4 Scaling factors, 13.2.5 The Viterbi algorithm, 13.2.6 Extensions of the hidden Markov model); 13.3 Linear Dynamical Systems (13.3.1 Inference in LDS, 13.3.2 Learning in LDS, 13.3.3 Extensions of LDS, 13.3.4 Particle filters); Exercises
14 Combining Models: 14.1 Bayesian Model Averaging; 14.2 Committees; 14.3 Boosting (14.3.1 Minimizing exponential error, 14.3.2 Error functions for boosting); 14.4 Tree-based Models; 14.5 Conditional Mixture Models (14.5.1 Mixtures of linear regression models, 14.5.2 Mixtures of logistic models, 14.5.3 Mixtures of experts); Exercises
Appendix A Data Sets; Appendix B Probability Distributions; Appendix C Properties of Matrices; Appendix D Calculus of Variations; Appendix E Lagrange Multipliers; References

2022-04-28 16:33:51 8.06MB pattern-recognition machine-learning

