Train/Dev/Test Sets

Bias/Variance


Regularization 

There are several regularization methods:
  1. L2 regularization
  2. Dropout
  3. Data augmentation (e.g., flip an image to get a new training example; see the sketch after this list)
  4. Early stopping (plot J_train and J_dev against the iteration number and stop where the dev cost starts to rise)
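For instance, flipping an image horizontally produces an extra training example at no labeling cost. A minimal numpy sketch (the 64x64x3 image here is a hypothetical stand-in for a real training image):

  import numpy as np

  image = np.random.rand(64, 64, 3)   # hypothetical stand-in for one RGB training image
  flipped = image[:, ::-1, :]         # horizontal flip: reverse the width axis
  # flipped can now be added to the training set as a new example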

L2 regularization:

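The regularized cost adds a penalty on the size of the weights to the original cost (reconstructed from the lecture; \lambda is the regularization hyperparameter):

  J(W^{[1]}, b^{[1]}, \ldots, W^{[L]}, b^{[L]}) = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(\hat{y}^{(i)}, y^{(i)}) + \frac{\lambda}{2m} \sum_{l=1}^{L} \|W^{[l]}\|_F^2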

Frobenius norm.
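The Frobenius norm is simply the element-wise squared L2 norm of a weight matrix W^{[l]} of shape (n^{[l]}, n^{[l-1]}):

  \|W^{[l]}\|_F^2 = \sum_{i=1}^{n^{[l]}} \sum_{j=1}^{n^{[l-1]}} \left( w_{ij}^{[l]} \right)^2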

The lecture also introduces the concept of weight decay here.

Weight Decay: A regularization technique (such as L2 regularization) that results in gradient descent shrinking the weights on every iteration.
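Writing out the gradient-descent update with the L2 term included shows where the name comes from:

  W^{[l]} := W^{[l]} - \alpha \left( dW^{[l]}_{\text{backprop}} + \frac{\lambda}{m} W^{[l]} \right) = \left( 1 - \frac{\alpha \lambda}{m} \right) W^{[l]} - \alpha \, dW^{[l]}_{\text{backprop}}

Since (1 - \alpha\lambda/m) < 1, every iteration multiplies W^{[l]} by a factor slightly less than 1, i.e. the weights "decay".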

Why does regularization work (intuition)?

With a large λ, gradient descent pushes the weights toward zero, so many hidden units have only a small effect and the network behaves like a simpler, smaller one. With tanh activations, small weights also keep z^{[l]} in the nearly linear region of tanh, so each layer computes something close to a linear function and cannot fit an overly complicated, overfit decision boundary.

Dropout regularization:

The lecture's figure showed dropout only during forward propagation, but dropout must also be applied during back propagation, using the same mask to zero out the corresponding gradients.
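A minimal numpy sketch of inverted dropout for one layer (my own illustration rather than the course's assignment code; a3 and da3 are dummy stand-ins for the activations and upstream gradient of layer 3):

  import numpy as np

  np.random.seed(1)
  keep_prob = 0.8                  # probability of keeping each unit
  a3 = np.random.randn(4, 5)       # dummy activations of layer 3
  da3 = np.random.randn(4, 5)      # dummy upstream gradient dJ/da3

  # Forward propagation with inverted dropout.
  d3 = np.random.rand(*a3.shape) < keep_prob   # boolean dropout mask
  a3 = a3 * d3                     # shut down about 20% of the units
  a3 = a3 / keep_prob              # scale up so the expected activation is unchanged

  # Back propagation: apply the SAME mask and the same scaling to the gradients.
  da3 = da3 * d3
  da3 = da3 / keep_prob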

When making predictions on the test set, do not use dropout. Because inverted dropout already rescaled the activations by keep_prob during training, no extra scaling is needed at test time.


Early stopping: its drawback is that it violates orthogonalization (the principle that different concerns should be handled by independent knobs that do not affect each other), because stopping early tackles two tasks at once, optimizing the cost function J and not overfitting, instead of solving them separately. L2 regularization is generally recommended instead; its drawback is that searching over values of λ means training for many more iterations.
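A minimal sketch of the early-stopping loop (train_one_epoch, compute_cost, params, X_dev, and Y_dev are hypothetical stand-ins for your own training and evaluation code):

  import copy

  # Hypothetical helpers: train_one_epoch(params) runs one pass of gradient
  # descent over the training set; compute_cost(params, X, Y) returns J.
  best_dev_cost = float("inf")
  best_params = None
  bad_epochs, patience = 0, 5

  for epoch in range(1000):
      train_one_epoch(params)
      dev_cost = compute_cost(params, X_dev, Y_dev)    # J_dev for this epoch
      if dev_cost < best_dev_cost:
          best_dev_cost = dev_cost
          best_params = copy.deepcopy(params)          # remember the best weights
          bad_epochs = 0
      else:
          bad_epochs += 1
          if bad_epochs >= patience:                   # J_dev stopped improving
              break                                    # stop early, keep best_params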


Normalizing input

This means transforming the input x so that each feature has zero mean and unit variance; the formulas are below.
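The standard zero-mean, unit-variance formulas (μ and σ² are computed per feature on the training set and must be reused unchanged on the dev and test sets):

  \mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)}, \qquad \sigma^2 = \frac{1}{m} \sum_{i=1}^{m} \left( x^{(i)} - \mu \right)^2, \qquad x := \frac{x - \mu}{\sigma}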

Vanishing/Exploding gradients

Deep neural networks suffer from these issues, and they are a huge barrier to training deep networks.
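A rough illustration of why depth causes this, following the lecture's linear example (all biases zero, identical weight matrices): if every W^{[l]} is slightly larger than the identity, say 1.5 I, then

  \hat{y} \approx W^{[L]} \left( 1.5 \, I \right)^{L-1} x,

so activations and gradients grow geometrically like 1.5^{L}; if instead every W^{[l]} is slightly smaller than the identity, say 0.5 I, they shrink like 0.5^{L} and vanish.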

There is a partial solution that does not fully solve the problem but helps a lot: a careful choice of how you initialize the weights. The main goal is to keep the entries of each W^{[l]} from being much larger or much smaller than 1, so that their product over many layers does not grow or shrink exponentially; this greatly alleviates both vanishing and exploding gradients.

If the activation is ReLU, use He initialization, scaling the random weights by sqrt(2 / n^{[l-1]}); if it is tanh, use Xavier initialization, scaling by sqrt(1 / n^{[l-1]}). Some people instead use the variant sqrt(2 / (n^{[l-1]} + n^{[l]})) for tanh.

Weight Initialization for Deep Networks

Xavier initialization
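A minimal numpy sketch of both schemes (my own illustration; layers_dims is a hypothetical list of layer sizes, e.g. [5, 4, 3, 1]):

  import numpy as np

  def initialize_parameters(layers_dims, activation="relu"):
      """He initialization for ReLU, Xavier initialization for tanh."""
      np.random.seed(3)
      parameters = {}
      L = len(layers_dims)                     # number of layers, counting the input
      for l in range(1, L):
          n_prev, n_curr = layers_dims[l - 1], layers_dims[l]
          if activation == "relu":
              scale = np.sqrt(2.0 / n_prev)    # He initialization
          else:
              scale = np.sqrt(1.0 / n_prev)    # Xavier initialization (tanh)
          parameters["W" + str(l)] = np.random.randn(n_curr, n_prev) * scale
          parameters["b" + str(l)] = np.zeros((n_curr, 1))
      return parameters

  params = initialize_parameters([5, 4, 3, 1], activation="relu")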

Gradient Checking

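The key formulas from the lecture: approximate each partial derivative with a two-sided difference, then compare the result with the analytic gradient from backprop:

  d\theta_{\text{approx}}[i] = \frac{J(\theta_1, \ldots, \theta_i + \varepsilon, \ldots) - J(\theta_1, \ldots, \theta_i - \varepsilon, \ldots)}{2 \varepsilon}

  \text{difference} = \frac{\| d\theta_{\text{approx}} - d\theta \|_2}{\| d\theta_{\text{approx}} \|_2 + \| d\theta \|_2}

With ε = 10^{-7}, a difference around 10^{-7} is great, around 10^{-5} deserves a closer look, and around 10^{-3} or larger probably indicates a bug. Use gradient checking only for debugging, not during training, and note that it does not work together with dropout.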

Ref:

1. Coursera, Deep Learning Specialization, Course 2: Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization, Week 1 (Practical aspects of Deep Learning).
