machine learning (5)---learning rate

degugging:make sure gradient descent is working correctly

cost function(J(θ)) of Number of iteration ：cost function随着迭代次数增加的变化函数
运行错误的图象是什么样子的：cost function(J(θ)) of Number of iteration随着迭代次数增加而上升(如以下两种图像的情况)，应使用较小的learning rate
运行正确的图象是什么样子的：cost function(J(θ)) of Number of iteration应该是递减的并且随着迭代次数增加它趋于一条平缓的曲线（即收敛于一个固定的值）

how to choose learning rate(∂)
1. 若learning rate太小: 收敛速度会很慢
2. 若learning rate太大： gradient descent不会收敛，会出现随着迭代次数的增加，cost function反而变大的情况，这时我们要选择较小的learning rate去尝试。
3. 可供选择的一些learning rate值: 0.3, 0.1, 0.03, 0.01 and so on(3倍)
4. 在进行gradient drscent时，我们会尝试一些不同的learning rate,然后绘制出不同的ost function(J(θ)) of Number of iteration曲线，然后选择一个使cost function 快速下降的learning rate.
5. 如何选择最佳的learning rate

尝试这些不同的learning rate找到一个最大的learning rate（若再大则不会收敛）或者比最大稍小一点的learning rate

machine learning (5)---learning rate的更多相关文章

Machine and Deep Learning with Python
Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...
Machine Learning—Online Learning
印象笔记同步分享:Machine Learning-Online Learning
What are some good books/papers for learning deep learning?
What's the most effective way to get started with deep learning? 29 Answers Yoshua Bengio, ...
（转）Paper list of Meta Learning/ Learning to Learn/ One Shot Learning/ Lifelong Learning
Meta Learning/ Learning to Learn/ One Shot Learning/ Lifelong Learning 2018-08-03 19:16:56 本文转自:http ...
(转) Learning Deep Learning with Keras
Learning Deep Learning with Keras Piotr Migdał - blog Projects Articles Publications Resume About Ph ...
增强学习（五）----- 时间差分学习(Q learning, Sarsa learning)
接下来我们回顾一下动态规划算法(DP)和蒙特卡罗方法(MC)的特点,对于动态规划算法有如下特性: 需要环境模型,即状态转移概率$P_{sa}$ 状态值函数的估计是自举的(bootstrapping ...
Zero-shot Learning / One-shot Learning / Few-shot Learning
Zero-shot Learning / One-shot Learning / Few-shot Learning Learning类型:Zero-shot Learning.One-shot Le ...
[Machine Learning] Active Learning
1. 写在前面在机器学习(Machine learning)领域,监督学习(Supervised learning).非监督学习(Unsupervised learning)以及半监督学习(Semi ...
Machine Learning——Supervised Learning（机器学习之监督学习）
监督学习是指:利用一组已知类别的样本调整分类器的参数,使其达到所要求性能的过程. 我们来看一个例子:预测房价(注:本文例子取自业界大牛吴恩达老师的机器学习课程) 如下图所示:横轴表示房子的面积,单位是 ...

随机推荐

超简单的react和typescript和引入scss项目搭建流程
1.首先我们先创建一个react项目,react官网也有react项目搭建的命令 npx create-react-app my-app cd my-app 2.安装我们项目需要的样式依赖,这个项目我 ...
gorm 实现 mysql for update 排他锁
关于 MySQL 的排他锁网上已经有很多资料进行了介绍,这里主要是记录一下 gorm 如果使用排他锁. 排他锁是需要对索引进行锁操作,同时需要在事务中才能生效.具体操作如下: 假设有如下数据库表结构: ...
vue中$router与$route的区别
$.router是VueRouter的实例,相当于一个全局的路由器对象.包含很多属性和子对象,例如history对象 $.route表示当前正在跳转的路由对象.可以通过$.route获取到name,p ...
python3与Excel的完美结合
https://segmentfault.com/a/1190000016256490 Excel 是 Windows 环境下流行的.强大的电子表格应用.openpyxl 模块让 Python 程序能 ...
【1】BIO与NIO、AIO的区别
一.BIO 在JDK1.4出来之前,我们建立网络连接的时候采用BIO模式,需要先在服务端启动一个ServerSocket,然后在客户端启动Socket来对服务端进行通信,默认情况下服务端需要对每个请求 ...
Eclipse 安装反编译插件 Eclipse Class Decompiler
Eclipse Class Decompiler在线安装方法 https://blog.csdn.net/tangjinquan1157/article/details/77506015 Eclips ...
(二) Windows 进行 Docker CE 安装(Docker Desktop)
参考并感谢官方文档: https://docs.docker.com/docker-for-windows/install/ 下载地址 https://download.docker.com/win ...
14 Scroll 滚动搜索
Scroll的用法: 第一次搜的时候,要指定快照保留时间1min,分页的大小:2条/页: 对于第一次搜索,ES会返回一个这个scroll的id: 下次再搜的时候,就带着这个scrollid去搜就 ...
IOS - UDID IDFA IDFV MAC keychain
在开发过程中,我们经常会被要求获取每个设备的唯一标示,以便后台做相应的处理.我们来看看有哪些方法来获取设备的唯一标示,然后再分析下这些方法的利弊. 具体可以分为如下几种: UDID IDFA IDFV ...
【转载】 C#使用Math.PI常量来表示圆周率
在C#中计算圆形面积的时候,我们时常会用到圆周率这个变量,圆周率我们一般定义为十进制decimal类型变量,圆周率的值为3.1415926535等一个近似值,其实在C#的数值计算类Math类中,有专门 ...

machine learning (5)---learning rate

machine learning (5)---learning rate的更多相关文章

随机推荐

热门专题