Solving the problem of overfitting: regularization

  • Overfitting in linear regression

  • Overfitting in logistic regression

  • How to address overfitting

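  • Regularization: cost function of linear regression

    • When the parameters are small, the hypothesis is simpler and therefore less prone to overfitting
    • By convention, θ0 is not regularized
    • The regularized cost function for linear regression is J(θ) = (1/(2m)) [ Σ_{i=1..m} (h_θ(x^(i)) - y^(i))^2 + λ Σ_{j=1..n} θ_j^2 ]; the goal is to choose θ so that this value is minimized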
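  • Analysis of this cost function

    • It is made up of two terms (two objectives): the first term fits the training data well, and the second term keeps the parameters small to avoid overfitting
    • The second term is called the regularization term, and λ is called the regularization parameter; λ controls the trade-off between the two objectives
    • If λ is very large, θ1, ..., θn are driven to nearly 0 and the hypothesis collapses to roughly the constant θ0; the result is underfitting, with a poor fit on both the training data and new data
    • So regularization does not help in every case: when λ is very large, the model fits neither the training data nor new data well
    • The regularization term puts a penalty on the cost J: the larger the parameters are in magnitude (and the more nonzero parameters the model has), the larger the penalty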
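
The points above can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up toy dataset; the function names `regularized_cost` and `fit_regularized` are hypothetical and chosen only for illustration. It computes the regularized cost with θ0 left out of the penalty, and compares a fit with λ = 0 against a fit with a very large λ.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """J(theta) = (1/(2m)) * [sum of squared errors + lam * sum_{j>=1} theta_j^2]."""
    m = len(y)
    residuals = X @ theta - y
    fit_term = residuals @ residuals          # first objective: fit the training data
    penalty = lam * np.sum(theta[1:] ** 2)    # second objective: theta[0] is not penalized
    return (fit_term + penalty) / (2 * m)

def fit_regularized(X, y, lam):
    """Closed-form minimizer of the cost above (regularized normal equation)."""
    n = X.shape[1]
    L = np.eye(n)
    L[0, 0] = 0.0                             # do not regularize the intercept theta_0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

# Toy data (an assumption for this sketch): y = 1 + 2x exactly, no noise.
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x
X = np.column_stack([np.ones_like(x), x])     # first column of ones multiplies theta_0

theta_small = fit_regularized(X, y, lam=0.0)  # no regularization
theta_large = fit_regularized(X, y, lam=1e6)  # very large lambda

# Training error alone (lam=0 in the cost) for each fitted theta.
train_err_small = regularized_cost(theta_small, X, y, lam=0.0)
train_err_large = regularized_cost(theta_large, X, y, lam=0.0)

print("theta with lambda = 0:  ", theta_small)
print("theta with lambda = 1e6:", theta_large)
print("training error:", train_err_small, "vs", train_err_large)
```

With the very large λ, the slope is pushed toward 0 and the hypothesis is close to the constant mean of y, so the training error rises: this is the underfitting case described in the list.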
