Coursera, Deep Learning 2, Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Course
Train/Dev/Test set
Bias/Variance
Regularization
- L2 regularization
- dropout
- data augmentation (e.g., flipping an image to get a new training example), early stopping (plot J_train and J_dev against the number of iterations)
L2 regularization:
Frobenius norm: the regularization term sums the squared Frobenius norm of each weight matrix W[l].
The figure above also introduces the concept of weight decay:
Weight Decay: A regularization technique (such as L2 regularization) that results in gradient descent shrinking the weights on every iteration.
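For reference, the regularized cost and the resulting gradient-descent update can be written as follows (standard course notation; λ is the regularization hyperparameter, α the learning rate, and dW_backprop denotes the unregularized gradient from backpropagation):

```latex
J = \frac{1}{m}\sum_{i=1}^{m}\mathcal{L}\!\left(\hat{y}^{(i)}, y^{(i)}\right)
  + \frac{\lambda}{2m}\sum_{l=1}^{L}\left\lVert W^{[l]}\right\rVert_F^{2},
\qquad
\left\lVert W^{[l]}\right\rVert_F^{2} = \sum_{i}\sum_{j}\left(W^{[l]}_{ij}\right)^{2}

W^{[l]} := W^{[l]} - \alpha\left(dW^{[l]}_{\text{backprop}} + \frac{\lambda}{m}W^{[l]}\right)
         = \left(1 - \frac{\alpha\lambda}{m}\right)W^{[l]} - \alpha\, dW^{[l]}_{\text{backprop}}
```

The factor (1 − αλ/m) < 1 multiplying W[l] on every iteration is exactly why this is called weight decay.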
Why regularization works (intuition)?
Dropout regularization:
The figure below only shows dropout applied during forward propagation; the same dropout mask must also be applied during backpropagation.
When making predictions on the test set, do not use dropout.
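A minimal NumPy sketch of inverted dropout for a single layer (the activation a3 and its shape are made up for illustration; keep_prob follows the course's naming):

```python
import numpy as np

keep_prob = 0.8  # probability of keeping a unit

# Forward propagation for one layer with inverted dropout
a3 = np.random.randn(50, 10)                  # stand-in activation, shape (units, examples)
d3 = np.random.rand(*a3.shape) < keep_prob    # dropout mask
a3 = a3 * d3                                  # zero out the dropped units
a3 = a3 / keep_prob                           # scale up so the expected value is unchanged

# Backward propagation must reuse the SAME mask d3:
# da3 = da3 * d3
# da3 = da3 / keep_prob

# At test time: no dropout and no scaling -- use the full network as-is.
```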
Early stopping: its drawback is that it violates orthogonalization (the principle that different concerns should be handled independently so that tuning one does not affect the other), because early stopping tries to optimize the cost function J and avoid overfitting at the same time, instead of treating them as separate tasks. L2 regularization is generally recommended instead, though its drawback is that it requires many more iterations (e.g., searching over values of λ).
Normalizing input
This means normalizing the input x to zero mean and unit variance; the formulas are as follows.
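The formulas referred to here are the standard ones: compute the mean and variance on the training set, subtract the mean, divide by the standard deviation (element-wise over the features), and reuse the same μ and σ on the dev/test sets:

```latex
\mu = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \qquad x := x - \mu, \qquad
\sigma^{2} = \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}\right)^{2}, \qquad x := x / \sigma
```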
Vanishing/Exploding gradients
Deep neural networks suffer from these issues; they are a huge barrier to training deep networks.
There is a partial solution that nonetheless helps a lot: a careful choice of how you initialize the weights. The main goal is to keep each weight matrix W[l] from being much larger or much smaller than 1, so that when the weights are effectively multiplied together across many layers the result neither vanishes nor explodes.
If the activation is ReLU, use the scheme in the lower blue box (He initialization, variance 2/n^[l-1]); if it is tanh, use the one in the blue box on the right (Xavier initialization, variance 1/n^[l-1]); some people also use the second variant on the right for tanh, 2/(n^[l-1]+n^[l]).
Weight Initialization for Deep Networks
Xavier initialization
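A hedged NumPy sketch of the two schemes above (the helper name initialize_parameters and the layer sizes are illustrative, not taken from the course assignment):

```python
import numpy as np

def initialize_parameters(layer_dims, activation="relu"):
    """Scale each W[l] by its fan-in n^[l-1].

    activation="relu" -> He initialization:     sqrt(2 / n[l-1])
    activation="tanh" -> Xavier initialization: sqrt(1 / n[l-1])
    (Some practitioners use sqrt(2 / (n[l-1] + n[l])) for tanh instead.)
    """
    params = {}
    for l in range(1, len(layer_dims)):
        fan_in = layer_dims[l - 1]
        if activation == "relu":
            scale = np.sqrt(2.0 / fan_in)   # He initialization
        else:
            scale = np.sqrt(1.0 / fan_in)   # Xavier initialization
        params["W" + str(l)] = np.random.randn(layer_dims[l], fan_in) * scale
        params["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return params

# Example: a 3-layer network with 5 input features (illustrative sizes)
params = initialize_parameters([5, 10, 5, 1], activation="relu")
```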
Gradient Checking
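Gradient checking compares the analytic (backprop) gradient with a two-sided numerical approximation and reports their relative difference. A minimal sketch, assuming J is a cost function of a flattened parameter vector theta (all names here are illustrative):

```python
import numpy as np

def gradient_check(J, theta, grad, epsilon=1e-7):
    """Compare the analytic gradient `grad` of J at `theta`
    with a two-sided finite-difference approximation."""
    grad_approx = np.zeros_like(theta)
    for i in range(theta.size):
        theta_plus = theta.copy();  theta_plus[i]  += epsilon
        theta_minus = theta.copy(); theta_minus[i] -= epsilon
        grad_approx[i] = (J(theta_plus) - J(theta_minus)) / (2 * epsilon)

    # Relative difference; roughly < 1e-7 is great, > 1e-3 usually signals a bug.
    return (np.linalg.norm(grad - grad_approx)
            / (np.linalg.norm(grad) + np.linalg.norm(grad_approx)))

# Example with J(theta) = sum(theta**2), whose gradient is 2*theta:
theta = np.array([1.0, -2.0, 3.0])
print(gradient_check(lambda t: np.sum(t**2), theta, 2 * theta))
```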
Ref:
1. Coursera