1.1. Example: Polynomial Curve Fitting

  1. This example motivates a number of concepts:

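    (1) linear models: Functions that are linear in the unknown parameters. A polynomial is a linear model. For the polynomial curve fitting problem, the model is:

        $$y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \dots + w_M x^M = \sum_{j=0}^{M} w_j x^j$$

    which is nonlinear in the input x but linear in the coefficients w, and is therefore a linear model.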

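    (2) error function: The error function measures the misfit between the model's predictions and the training set points. One simple and widely used choice is the sum of the squares of the errors between the prediction y(x_n, w) for each input x_n and the corresponding target value t_n:

        $$E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2$$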

    (3) model comparison or model selection

    (4) over-fitting: The model obtains an excellent fit to the training data but gives very poor performance on test data. This behavior is known as over-fitting.

    (5) regularization: A technique often used to control over-fitting. It adds a penalty term to the error function in order to discourage the coefficients from reaching large values. The simplest such penalty term is the sum of squares of all of the coefficients, leading to a modified error function of the form:

        $$\tilde{E}(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 + \frac{\lambda}{2} \|\mathbf{w}\|^2$$

    where \|\mathbf{w}\|^2 = w_0^2 + w_1^2 + \dots + w_M^2 and the coefficient λ governs the relative importance of the penalty term compared with the sum-of-squares error. This particular case of a quadratic regularizer is called ridge regression (Hoerl and Kennard, 1970). In the context of neural networks, this approach is known as weight decay.

    (6) validation set, also called a hold-out set: If we were trying to solve a practical application using this approach of minimizing an error function, we would have to find a way to determine a suitable value for the model complexity. A simple way of achieving this is to partition the available data into a training set, used to determine the coefficients w, and a separate validation set, also called a hold-out set, used to optimize the model complexity (for example, the polynomial order or the regularization strength).
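The concepts in this list can be tied together in a short NumPy sketch. It is only an illustration, not code from the book: the synthetic sin(2πx) data, the noise level, the 20/10 split, the candidate values of M and λ, and the helper names `design_matrix`, `fit_polynomial` and `rms_error` are all assumptions made for this example. It fits polynomials by minimizing the regularized sum-of-squares error and uses a hold-out set to pick the model complexity.

```python
import numpy as np


def design_matrix(x, M):
    """Return the N x (M+1) matrix whose columns are x**0, x**1, ..., x**M."""
    return np.vander(x, M + 1, increasing=True)


def fit_polynomial(x, t, M, lam=0.0):
    """Minimize 0.5 * sum((Phi @ w - t)**2) + 0.5 * lam * ||w||**2 over w.

    The penalty is folded into an augmented least-squares problem, which has
    the same minimizer and avoids forming Phi.T @ Phi explicitly.
    """
    Phi = design_matrix(x, M)
    A = np.vstack([Phi, np.sqrt(lam) * np.eye(M + 1)])
    b = np.concatenate([t, np.zeros(M + 1)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


def rms_error(x, t, w):
    """Root-mean-square misfit between predictions and targets."""
    y = design_matrix(x, len(w) - 1) @ w
    return float(np.sqrt(np.mean((y - t) ** 2)))


# Synthetic data (assumed for illustration): noisy samples of sin(2*pi*x).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 30)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, x.size)

# Hold-out split: the first 20 points train w, the last 10 validate the model.
x_train, t_train = x[:20], t[:20]
x_val, t_val = x[20:], t[20:]

# Try a few (M, lambda) pairs and record training and validation error.
results = {}
for M in (1, 3, 9):
    for lam in (0.0, 1e-3):
        w = fit_polynomial(x_train, t_train, M, lam)
        results[(M, lam)] = (
            rms_error(x_train, t_train, w),
            rms_error(x_val, t_val, w),
        )

# Model selection: keep the pair with the lowest validation error.
best_M, best_lam = min(results, key=lambda k: results[k][1])

for (M, lam), (e_train, e_val) in results.items():
    print(f"M={M} lambda={lam:g} train={e_train:.3f} val={e_val:.3f}")
print("selected:", best_M, best_lam)
```

A low training error paired with a much higher validation error is the signature of over-fitting; increasing λ or lowering M trades some training fit for smaller coefficients.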

1.2. Probability Theory

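1. The rules of probability: the sum rule and the product rule.

     $$p(X) = \sum_{Y} p(X, Y) \qquad \text{(sum rule)}$$

     $$p(X, Y) = p(Y \mid X)\, p(X) \qquad \text{(product rule)}$$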

2. Bayes' theorem.

  $$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}, \qquad p(X) = \sum_{Y} p(X \mid Y)\, p(Y)$$

3. Probability densities

4. Expectations and covariances

5. Bayesian probabilities.

  Bayes' theorem is used to convert a prior probability into a posterior probability by incorporating the evidence provided by the observed data.

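6. Gaussian distribution

  $$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$$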

7. With Gaussian noise on the targets and a zero-mean Gaussian prior over the coefficients w, maximizing the posterior distribution is equivalent to minimizing the regularized sum-of-squares error function.
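The equivalence in the last item can be checked numerically. The sketch below is not from the book's text. It assumes a noise precision `beta` and a prior precision `alpha` (both values chosen arbitrarily here), so that the negative log posterior is, up to a constant, (beta/2) Σ (y(x_n, w) − t_n)² + (alpha/2) wᵀw. The ridge solution with λ = alpha/beta should then be a stationary point of that function. The data and the function names are hypothetical.

```python
import numpy as np

# Synthetic data (assumed for illustration): noisy samples of sin(2*pi*x).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 15)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)

M = 5          # polynomial order (assumed)
alpha = 5e-3   # precision of the Gaussian prior over w (assumed)
beta = 11.1    # precision of the Gaussian noise on t (assumed)

Phi = np.vander(x, M + 1, increasing=True)  # columns x**0 ... x**M


def neg_log_posterior(w):
    """Negative log posterior over w, dropping terms that do not depend on w."""
    r = Phi @ w - t
    return 0.5 * beta * (r @ r) + 0.5 * alpha * (w @ w)


def neg_log_posterior_grad(w):
    """Gradient of neg_log_posterior with respect to w."""
    return beta * (Phi.T @ (Phi @ w - t)) + alpha * w


# Ridge (regularized least-squares) solution with lambda = alpha / beta.
lam = alpha / beta
w_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)

# If the two problems share a minimizer, this gradient is numerically zero.
grad = neg_log_posterior_grad(w_ridge)
print("max |gradient| at ridge solution:", np.max(np.abs(grad)))
```

Because the negative log posterior is a convex quadratic in w, a vanishing gradient is enough to identify its minimizer, which is the MAP estimate.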

1.3. Model Selection

1.6. Information Theory

1. Entropy
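As a small illustration, the entropy of a discrete distribution p is H = −Σ p(x) log₂ p(x), measured in bits, with terms where p(x) = 0 contributing nothing. The sketch below uses only the standard library; the function name `entropy_bits` and the example distributions are choices made for this example.

```python
import math


def entropy_bits(p):
    """Entropy in bits of a discrete distribution given as probabilities."""
    if any(pi < 0 for pi in p) or not math.isclose(sum(p), 1.0):
        raise ValueError("p must be non-negative and sum to 1")
    # Terms with p(x) = 0 are skipped, following the convention 0 * log 0 = 0.
    return sum(-pi * math.log2(pi) for pi in p if pi > 0)


uniform = [1 / 8] * 8
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 64, 1 / 64, 1 / 64, 1 / 64]

print(entropy_bits(uniform))  # uniform over 8 states
print(entropy_bits(skewed))   # a more sharply peaked distribution
```

The uniform distribution has the larger entropy of the two, and a distribution that puts all of its mass on one state has entropy zero.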
