Optimization Algorithms
1. Stochastic Gradient Descent
In stochastic gradient descent, the true gradient of the objective $Q(w) = \frac{1}{n}\sum_{i=1}^{n} Q_i(w)$ is approximated by the gradient at a single training example, and the parameters are updated for each example in turn:

$$w := w - \eta\, \nabla Q_i(w)$$

where $\eta$ is the learning rate.
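As an illustration (a minimal sketch, not part of the original article; the toy objective and function name are made up here), one SGD step is a single line in NumPy:

```python
import numpy as np

def sgd_step(w, grad, lr=0.01):
    # Plain SGD: move against the gradient, scaled by the learning rate eta.
    return w - lr * grad

# Example: a few steps on the toy quadratic Q(w) = 0.5 * ||w||^2,
# whose gradient at w is simply w.
w = np.array([1.0, -2.0])
for _ in range(10):
    w = sgd_step(w, grad=w, lr=0.1)
```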
2. SGD With Momentum
Stochastic gradient descent with momentum remembers the update Δw at each iteration, and determines the next update as a linear combination of the gradient and the previous update:
$$\Delta w := \alpha\, \Delta w - \eta\, \nabla Q_i(w)$$
$$w := w + \Delta w$$

where $\alpha$ is the momentum factor (an exponential decay factor between 0 and 1) and $\eta$ is the learning rate. This leads to:

$$w := w - \eta\, \nabla Q_i(w) + \alpha\, \Delta w$$
Unlike in classical stochastic gradient descent, it tends to keep traveling in the same direction, preventing oscillations.
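A minimal sketch of this update (illustrative only; the velocity buffer must persist across iterations, and the toy gradient reuses the quadratic example above):

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.01, alpha=0.9):
    # Blend the previous update (scaled by alpha) with the new gradient step.
    velocity = alpha * velocity - lr * grad
    # Apply the remembered update.
    return w + velocity, velocity

w = np.array([1.0, -2.0])
velocity = np.zeros_like(w)
for _ in range(10):
    w, velocity = momentum_step(w, grad=w, velocity=velocity, lr=0.1)
```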
3. RMSProp
RMSProp (for Root Mean Square Propagation) is also a method in which the learning rate is adapted for each of the parameters. The idea is to divide the learning rate for a weight by a running average of the magnitudes of recent gradients for that weight. So, first the running average is calculated in terms of the mean square:
$$v(w, t) := \gamma\, v(w, t-1) + (1 - \gamma)\, (\nabla Q_i(w))^2$$

where $\gamma$ is the forgetting factor.
And the parameters are updated as:

$$w := w - \frac{\eta}{\sqrt{v(w, t)}}\, \nabla Q_i(w)$$
RMSProp has shown excellent adaptation of the learning rate in different applications. RMSProp can be seen as a generalization of Rprop and is capable of working with mini-batches as well, as opposed to only full batches.
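A sketch of one RMSProp step under the formulas above (illustrative; the small constant eps is a common practical safeguard, not part of the formula in the text):

```python
import numpy as np

def rmsprop_step(w, grad, v, lr=0.001, gamma=0.9, eps=1e-8):
    # Running average of squared gradients (the "mean square").
    v = gamma * v + (1.0 - gamma) * grad**2
    # Divide the learning rate by the root of that running average;
    # eps guards against division by zero.
    w = w - lr * grad / (np.sqrt(v) + eps)
    return w, v
```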
4. The Adam Algorithm
Adam (short for Adaptive Moment Estimation) is an update to the RMSProp optimizer. In this optimization algorithm, running averages of both the gradients and the second moments of the gradients are used. Given parameters $w^{(t)}$ and a loss function $L^{(t)}$, where $t$ indexes the current training iteration (indexed at $0$), Adam's parameter update is given by:
$$m_w^{(t+1)} := \beta_1\, m_w^{(t)} + (1 - \beta_1)\, \nabla_w L^{(t)}$$
$$v_w^{(t+1)} := \beta_2\, v_w^{(t)} + (1 - \beta_2)\, (\nabla_w L^{(t)})^2$$
$$\hat{m}_w = \frac{m_w^{(t+1)}}{1 - \beta_1^{t+1}}$$
$$\hat{v}_w = \frac{v_w^{(t+1)}}{1 - \beta_2^{t+1}}$$
$$w^{(t+1)} := w^{(t)} - \eta\, \frac{\hat{m}_w}{\sqrt{\hat{v}_w} + \epsilon}$$
where $\epsilon$ is a small number used to prevent division by 0, and $\beta_1$ and $\beta_2$ are the forgetting factors for gradients and second moments of gradients, respectively.
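A minimal sketch of one Adam step following the update above; the default values for beta1, beta2, and eps are the common choices from the Adam paper, an assumption here rather than something stated in the text:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Running averages of the gradient and of its elementwise square.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad**2
    # Bias correction compensates for the zero initialization of m and v;
    # t is the iteration index starting at 0, matching the text.
    m_hat = m / (1.0 - beta1**(t + 1))
    v_hat = v / (1.0 - beta2**(t + 1))
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```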
Reference: Wikipedia.