Train/Dev/Test set

Bias/Variance

  

    

Regularization 

有下面一些regularization的方法.
  1. L2 regularation
  2. drop out
  3. data augmentation(翻转图片得到一个新的example), early stopping(画出J_train 和J_dev 对应于iteration的图像)

L2 regularization:

  

Forbenius Norm.

上面这张图提到了weight decay 的概念

Weight Decay: A regularization technique (such as L2 regularization) that results in gradient descent shrinking the weights on every iteration.

why regulation works(intuition)?

  

Dropout regularization:

下面的图只显示了forward propagation过程中使用dropout, back propagation 同样也需要drop out.

  

在对 test set 做预测的时候,不需要 drop out.

  

  

Early stopping: 缺点是违反了正交原则(Orthoganalization, 不同角度互不影响计算), 因为early stopping 同时关注Optimize cost func J, 和 Not overfit 两个任务,不是分开解决。一般建议用L2 regularization, 但是缺点是迭代次数多.

  

Normalizing input

就是把input x 转化成方差,公式如下

  

Vanishing/Exploding gradients

deep neural network suffer from these issues. they are huge barrier to training deep neural network.

There is a partial solution to solve the above problem but help a lot which is careful choice how you initialize the weights. 主要目的是使得weight W[l]不要比1太大或者太小,这样最后在算W的指数级的时候就很大程度改善vanishing 和 exploding的问题.

如果用的是Relu activation, 就用中下部的蓝框的内容(He Initialization),如果是tanh activation 就用右边的蓝框的内容(Xavier initialization),也有些人对tanh用右边第二种

Weight Initialization for Deep Networks

Xavier initialization

Gradient Checking

  

Ref:

1. Coursera

Coursera, Deep Learning 2, Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Course的更多相关文章

  1. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Initialization)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. Initialization Welcome to the first assignment of "Improving D ...

  2. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Gradient Checking)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. Gradient Checking Welcome to the final assignment for this week! In ...

  3. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Regularization)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. Regularization Welcome to the second assignment of this week. Deep ...

  4. 《Improving Deep Neural Networks:Hyperparameter tuning, Regularization and Optimization》课堂笔记

    Lesson 2 Improving Deep Neural Networks:Hyperparameter tuning, Regularization and Optimization 这篇文章其 ...

  5. [C4] Andrew Ng - Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

    About this Course This course will teach you the "magic" of getting deep learning to work ...

  6. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week2, Assignment(Optimization Methods)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. 请不要ctrl+c/ctrl+v作业. Optimization Methods Until now, you've always u ...

  7. 课程二(Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization),第一周(Practical aspects of Deep Learning) —— 4.Programming assignments:Gradient Checking

    Gradient Checking Welcome to this week's third programming assignment! You will be implementing grad ...

  8. 吴恩达《深度学习》-课后测验-第二门课 (Improving Deep Neural Networks:Hyperparameter tuning, Regularization and Optimization)-Week 1 - Practical aspects of deep learning(第一周测验 - 深度学习的实践)

    Week 1 Quiz - Practical aspects of deep learning(第一周测验 - 深度学习的实践) \1. If you have 10,000,000 example ...

  9. 吴恩达《深度学习》-第二门课 (Improving Deep Neural Networks:Hyperparameter tuning, Regularization and Optimization)-第一周:深度学习的实践层面 (Practical aspects of Deep Learning) -课程笔记

    第一周:深度学习的实践层面 (Practical aspects of Deep Learning) 1.1 训练,验证,测试集(Train / Dev / Test sets) 创建新应用的过程中, ...

随机推荐

  1. (转)java 序列化ID的作用

    序列化ID的作用: 其实,这个序列化ID起着关键的作用,它决定着是否能够成功反序列化!简单来说,java的序列化机制是通过在运行时判断类的serialVersionUID来验证版本一致性的.在进行反序 ...

  2. Spring 学习笔记一

    1.IOC,DI. 2.装配bean基于xml(实例化,声明周期,后处理bean,属性注入).3.装配bean基于注解 1       spring框架概述 1.1   什么是spring l  Sp ...

  3. HTML学习笔记Day6

    一.元素类型 1.元素类型分类依据和元素类型分类 根据css显示分类,XHTML元素被分为三种类型:块状元素.内联元素.行内块元素.可变元素 2.块状元素 1)块状元素在网页中就是以块的形式显示,所谓 ...

  4. (二叉树 BFS) leetcode103. Binary Tree Zigzag Level Order Traversal

    Given a binary tree, return the zigzag level order traversal of its nodes' values. (ie, from left to ...

  5. 关键字(5):cursor游标:(循环操作批量数据)

    declare   cursor stus_cur is select * from students;  --定义游标并且赋值(is 不能和cursor分开使用)    cur_stu studen ...

  6. 关于Tcpdump抓包总结

    一.简介 tcpdump是一个用于截取网络分组,并输出分组内容的工具.凭借强大的功能和灵活的截取策略,使其成为类UNIX系统下用于网络分析和问题排查的首选工具 tcpdump提供了源代码,公开了接口, ...

  7. 苹果笔记本适合什么人 中国Mac电脑用户的8个事实

    报告由腾讯 ISUX 研究中心收集了全国 7946 名 Mac 电脑用户的问卷整理而成.并且,参考了苹果公司的历年财报,以及百度.StatCounter 等第三方市场统计数据. 你是 iPhone 用 ...

  8. go官方的http.request + context样例

    go官方的http.request + context样例 https://github.com/DavadDi/go_study/blob/master/src/httpreq_context/ma ...

  9. MySQL数据库优化_索引

    1.添加索引后减少查询需要的行数,提高查询性能 (1) 建表 CREATE TABLE `site_user` ( `id` ) NOT NULL AUTO_INCREMENT COMMENT '自增 ...

  10. Spring Boot笔记二:快速创建以及yml文件自动注入

    上个笔记写了如何自己去创建Spring boot,以及如何去打jar包,其实,还是有些麻烦的,我们还自己新建了几个文件夹不是. Idea可以让我们快速的去创建Spring boot应用,来看 一.快速 ...