optim.py cs231n
If there are any errors, please point them out; I'd be very grateful.
import numpy as np """
This file implements various first-order update rules that are commonly used for
training neural networks. Each update rule accepts current weights and the
gradient of the loss with respect to those weights and produces the next set of
weights. Each update rule has the same interface: def update(w, dw, config=None): Inputs:
- w: A numpy array giving the current weights.
- dw: A numpy array of the same shape as w giving the gradient of the
loss with respect to w.
- config: A dictionary containing hyperparameter values such as learning rate,
momentum, etc. If the update rule requires caching values over many
iterations, then config will also hold these cached values. Returns:
- next_w: The next point after the update.
- config: The config dictionary to be passed to the next iteration of the
update rule. NOTE: For most update rules, the default learning rate will probably not perform
well; however the default values of the other hyperparameters should work well
for a variety of different problems. For efficiency, update rules may perform in-place updates, mutating w and
setting next_w equal to w.
""" def sgd(w, dw, config=None):
"""
Performs vanilla stochastic gradient descent. config format:
- learning_rate: Scalar learning rate.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-2)
w -= config['learning_rate'] * dw
return w, config def sgd_momentum(w, dw, config=None):
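Note that, as the module docstring warns, this rule updates w in place; a tiny made-up check:

w = np.array([1.0, 2.0])
dw = np.array([10.0, 10.0])
next_w, _ = sgd(w, dw, {'learning_rate': 1e-2})
print(next_w is w)   # True: w itself was mutated by `w -= ...`
print(w)             # [0.9 1.9]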
"""
Performs stochastic gradient descent with momentum. config format:
- learning_rate: Scalar learning rate.
- momentum: Scalar between 0 and 1 giving the momentum value.
Setting momentum = 0 reduces to sgd.
- velocity: A numpy array of the same shape as w and dw used to store a moving
average of the gradients.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-2)
config.setdefault('momentum', 0.9)
v = config.get('velocity', np.zeros_like(w)) next_w = None
v=v*config['momentum']-config['learning_rate']*dw
next_w=w+v
config['velocity'] = v return next_w, config def rmsprop(x, dx, config=None):
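A quick check of the docstring's claim that momentum = 0 reduces to plain sgd (the numbers are made up for illustration; w.copy() is used because sgd mutates its input):

w = np.ones(3)
dw = np.array([0.5, -1.0, 2.0])
w_sgd, _ = sgd(w.copy(), dw, {'learning_rate': 1e-2})
w_mom, _ = sgd_momentum(w.copy(), dw, {'learning_rate': 1e-2, 'momentum': 0.0})
print(np.allclose(w_sgd, w_mom))  # True: with zero momentum the two rules agree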
"""
Uses the RMSProp update rule, which uses a moving average of squared gradient
values to set adaptive per-parameter learning rates. config format:
- learning_rate: Scalar learning rate.
- decay_rate: Scalar between 0 and 1 giving the decay rate for the squared
gradient cache.
- epsilon: Small scalar used for smoothing to avoid dividing by zero.
- cache: Moving average of second moments of gradients.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-2)
config.setdefault('decay_rate', 0.99)
config.setdefault('epsilon', 1e-8)
config.setdefault('cache', np.zeros_like(x)) next_x = None cache=config['cache']*config['decay_rate']+(1-config['decay_rate'])*dx**2
next_x=x-config['learning_rate']*dx/np.sqrt(cache+config['epsilon'])
config['cache']=cache return next_x, config def adam(x, dx, config=None):
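To see the "per-parameter" part concretely, here is a small made-up example (not from the original file): the coordinate with much larger gradients accumulates a much larger cache, so its raw gradient is divided by a larger number and both coordinates end up moving by roughly the same amount.

x = np.zeros(2)
dx = np.array([0.1, 10.0])   # second coordinate has 100x larger gradients
cfg = None
for _ in range(5):
    x, cfg = rmsprop(x, dx, cfg)
print(cfg['cache'])          # much larger cache for the second coordinate
print(np.abs(x))             # yet both coordinates have moved a similar distance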
"""
Uses the Adam update rule, which incorporates moving averages of both the
gradient and its square and a bias correction term. config format:
- learning_rate: Scalar learning rate.
- beta1: Decay rate for moving average of first moment of gradient.
- beta2: Decay rate for moving average of second moment of gradient.
- epsilon: Small scalar used for smoothing to avoid dividing by zero.
- m: Moving average of gradient.
- v: Moving average of squared gradient.
- t: Iteration number.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-3)
config.setdefault('beta1', 0.9)
config.setdefault('beta2', 0.999)
config.setdefault('epsilon', 1e-8)
config.setdefault('m', np.zeros_like(x))
config.setdefault('v', np.zeros_like(x))
config.setdefault('t', 0)
config['t']+=1
这个方法比较综合,各种方法的好处吧
m=config['beta1']*config['m']+(1-config['beta1'])*dx # now to change by acc
v=config['beta2']*config['v']+(1-config['beta2'])*dx**2
config['m']=m
config['v']=v
m=m/(1-config['beta1']**config['t'])
v=v/(1-config['beta2']**config['t']) next_x=x-config['learning_rate']*m/np.sqrt(v+config['epsilon']) return next_x, config
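One consequence of the bias correction that is easy to check numerically: on the very first iteration the step size is roughly learning_rate for every parameter, regardless of the gradient's scale. A small made-up check (not part of the original file):

x = np.zeros(3)
dx = np.array([1e-3, 1.0, 1e4])  # wildly different gradient magnitudes
next_x, cfg = adam(x, dx)
print(np.abs(next_x - x))        # each entry is approximately 1e-3, the default learning_rate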