n如果有错误,欢迎指出,不胜感激

import numpy as np

"""
This file implements various first-order update rules that are commonly used for
training neural networks. Each update rule accepts current weights and the
gradient of the loss with respect to those weights and produces the next set of
weights. Each update rule has the same interface: def update(w, dw, config=None): Inputs:
- w: A numpy array giving the current weights.
- dw: A numpy array of the same shape as w giving the gradient of the
loss with respect to w.
- config: A dictionary containing hyperparameter values such as learning rate,
momentum, etc. If the update rule requires caching values over many
iterations, then config will also hold these cached values. Returns:
- next_w: The next point after the update.
- config: The config dictionary to be passed to the next iteration of the
update rule. NOTE: For most update rules, the default learning rate will probably not perform
well; however the default values of the other hyperparameters should work well
for a variety of different problems. For efficiency, update rules may perform in-place updates, mutating w and
setting next_w equal to w.
""" def sgd(w, dw, config=None):
"""
Performs vanilla stochastic gradient descent. config format:
- learning_rate: Scalar learning rate.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-2)
w -= config['learning_rate'] * dw
return w, config def sgd_momentum(w, dw, config=None):
"""
Performs stochastic gradient descent with momentum. config format:
- learning_rate: Scalar learning rate.
- momentum: Scalar between 0 and 1 giving the momentum value.
Setting momentum = 0 reduces to sgd.
- velocity: A numpy array of the same shape as w and dw used to store a moving
average of the gradients.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-2)
config.setdefault('momentum', 0.9)
v = config.get('velocity', np.zeros_like(w)) next_w = None
v=v*config['momentum']-config['learning_rate']*dw
next_w=w+v
config['velocity'] = v return next_w, config def rmsprop(x, dx, config=None):
"""
Uses the RMSProp update rule, which uses a moving average of squared gradient
values to set adaptive per-parameter learning rates. config format:
- learning_rate: Scalar learning rate.
- decay_rate: Scalar between 0 and 1 giving the decay rate for the squared
gradient cache.
- epsilon: Small scalar used for smoothing to avoid dividing by zero.
- cache: Moving average of second moments of gradients.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-2)
config.setdefault('decay_rate', 0.99)
config.setdefault('epsilon', 1e-8)
config.setdefault('cache', np.zeros_like(x)) next_x = None cache=config['cache']*config['decay_rate']+(1-config['decay_rate'])*dx**2
next_x=x-config['learning_rate']*dx/np.sqrt(cache+config['epsilon'])
config['cache']=cache return next_x, config def adam(x, dx, config=None):
"""
Uses the Adam update rule, which incorporates moving averages of both the
gradient and its square and a bias correction term. config format:
- learning_rate: Scalar learning rate.
- beta1: Decay rate for moving average of first moment of gradient.
- beta2: Decay rate for moving average of second moment of gradient.
- epsilon: Small scalar used for smoothing to avoid dividing by zero.
- m: Moving average of gradient.
- v: Moving average of squared gradient.
- t: Iteration number.
"""
if config is None: config = {}
config.setdefault('learning_rate', 1e-3)
config.setdefault('beta1', 0.9)
config.setdefault('beta2', 0.999)
config.setdefault('epsilon', 1e-8)
config.setdefault('m', np.zeros_like(x))
config.setdefault('v', np.zeros_like(x))
config.setdefault('t', 0)
config['t']+=1
这个方法比较综合,各种方法的好处吧
m=config['beta1']*config['m']+(1-config['beta1'])*dx # now to change by acc
v=config['beta2']*config['v']+(1-config['beta2'])*dx**2
config['m']=m
config['v']=v
m=m/(1-config['beta1']**config['t'])
v=v/(1-config['beta2']**config['t']) next_x=x-config['learning_rate']*m/np.sqrt(v+config['epsilon']) return next_x, config

  

n

optim.py cs231n的更多相关文章

  1. cnn.py cs231n

    n import numpy as np from cs231n.layers import * from cs231n.fast_layers import * from cs231n.layer_ ...

  2. fc_net.py cs231n

    n如果有错误,欢迎指出,不胜感激 import numpy as np from cs231n.layers import * from cs231n.layer_utils import * cla ...

  3. layers.py cs231n

    如果有错误,欢迎指出,不胜感激. import numpy as np def affine_forward(x, w, b): 第一个最简单的 affine_forward简单的前向传递,返回 ou ...

  4. 笔记:CS231n+assignment2(作业二)(一)

    第二个作业难度很高,但做(抄)完之后收获还是很大的.... 一.Fully-Connected Neural Nets 首先是对之前的神经网络的程序进行重构,目的是可以构建任意大小的全连接的neura ...

  5. 深度学习原理与框架-神经网络-cifar10分类(代码) 1.np.concatenate(进行数据串接) 2.np.hstack(将数据横着排列) 3.hasattr(判断.py文件的函数是否存在) 4.reshape(维度重构) 5.tanspose(维度位置变化) 6.pickle.load(f文件读入) 7.np.argmax(获得最大值索引) 8.np.maximum(阈值比较)

    横1. np.concatenate(list, axis=0) 将数据进行串接,这里主要是可以将列表进行x轴获得y轴的串接 参数说明:list表示需要串接的列表,axis=0,表示从上到下进行串接 ...

  6. optim.py-使用tensorflow实现一般优化算法

    optim.py Project URL:https://github.com/Codsir/optim.git Based on: tensorflow, numpy, copy, inspect ...

  7. 深度学习之卷积神经网络(CNN)详解与代码实现(一)

    卷积神经网络(CNN)详解与代码实现 本文系作者原创,转载请注明出处:https://www.cnblogs.com/further-further-further/p/10430073.html 目 ...

  8. 深度学习原理与框架-卷积神经网络-cifar10分类(图片分类代码) 1.数据读入 2.模型构建 3.模型参数训练

    卷积神经网络:下面要说的这个网络,由下面三层所组成 卷积网络:卷积层 + 激活层relu+ 池化层max_pool组成 神经网络:线性变化 + 激活层relu 神经网络: 线性变化(获得得分值) 代码 ...

  9. Pytorch1.3源码解析-第一篇

    pytorch$ tree -L 1 . ├── android ├── aten ├── benchmarks ├── binaries ├── c10 ├── caffe2 ├── CITATIO ...

随机推荐

  1. ElasticSearch入门之彼行我释(四)

    散仙在上篇文章中,介绍了关于ElasticSearch基本的增删改查的基本粒子,本篇呢,我们来学下稍微高级一点的知识: (1)如何在ElasticSearch中批量提交索引 ? (2)如何使用高级查询 ...

  2. mysql-connector-java-8.0.12使用时报错

    配置url加 &useSSL=false&serverTimezone=UTC 就可以了

  3. [转]深入WPF--Style

    Style 用来在类型的不同实例之间共享属性.资源和事件处理程序,您可以将 Style 看作是将一组属性值应用到多个元素的捷径. 这是MSDN上对Style的描述,翻译的还算中规中矩.Style(样式 ...

  4. python验证码识别PIL+pytesseract

    1.需要模块安装 在python安装目录scripts即: 执行pip install pillow 下载tesseract-ocr-setup-4.00.00dev.exe 安装,我的目录在C盘默认 ...

  5. 工控安全入门(六)——逆向角度看Vxworks

    上一篇文章中我们对于固件进行了简单的分析,这一篇我们将会补充一些Vxworks的知识,同时继续升入研究固件内容. 由于涉及到操作系统的内容,建议大家在阅读本篇前有一定操作系统知识的基础,或者是阅读我的 ...

  6. Asp.Net 应用程序在IIS发布后无法连接oracle数据库问题的解决方法

    asp.net程序编写完成后,发布到IIS,经常出现的一个问题是连接不上Oracle数据库,具体表现为Oracle的本地NET服务配置成功:用 pl/sql 等工具也可以连接上数据库,但是通过浏览器中 ...

  7. 数据交换格式之 - XML

    XML简介 XML是一种可扩展的标记语言,被设计用来传输和存储数据.传输数据. 需要自定义标签,自我描述性,XML是W3C的推荐标准: XML的特点与作用 特点: xml与操作系统.编程语言的开发平台 ...

  8. Linux下ps -ef和ps aux的区别及格式详解-转

    原文:https://www.linuxidc.com/Linux/2016-07/133515.htm Linux下显示系统进程的命令ps,最常用的有ps -ef 和ps aux.这两个到底有什么区 ...

  9. POJ 2533 最小上升子序列

    D - POJ 2533 经典DP-最长上升子序列 A numeric sequence of ai is ordered if a1 < a2 < ... < aN. Let th ...

  10. (大概是最全的解决方法)使用bandicam录制视频导入pr后音画不同步问题

    遇到这个问题大部分都是使用了VBR来录制视频导致的, 搜集了各种能够找到的方法,并没有每个尝试过 一 Handbrake转码 Audio out of sync AFTER importing 解决方 ...