Notes on CS231N Assignment 1: two_layer_net
two_layer_net.ipynb
I had long misunderstood what x.reshape(x.shape[0], -1) outputs. For example:
x = [[1,4,7,2],[2,5,7,4]]
x = np.array(x)
x0 = x.reshape(x.shape[0], -1)
x1 = x.reshape(x.shape[1], -1)
print(x0)
print(x1)
the actual output is:
[[1 4 7 2]
[2 5 7 4]]
[[1 4]
[7 2]
[2 5]
[7 4]]
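The point I had missed: reshape always walks the elements in row-major (C) order and refills the new shape in that same order, so it re-slices the flat sequence rather than transposing. A quick check:

import numpy as np

x = np.array([[1, 4, 7, 2],
              [2, 5, 7, 4]])
print(x.ravel())         # [1 4 7 2 2 5 7 4]  (row-major flattening)
print(x.reshape(4, -1))  # the same flat sequence cut into 4 rows of 2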
Affine layer: forward
# Test the affine_forward function
num_inputs = 2
input_shape = (4, 5, 6)
output_dim = 3
input_size = num_inputs * np.prod(input_shape)
#print(np.prod(input_shape)) # 120
weight_size = output_dim * np.prod(input_shape)
x = np.linspace(-0.1, 0.5, num=input_size).reshape(num_inputs, *input_shape)
#print(*input_shape) # 4 5 6
print(np.shape(x))  # (2, 4, 5, 6)
w = np.linspace(-0.2, 0.3, num=weight_size).reshape(np.prod(input_shape), output_dim)
b = np.linspace(-0.3, 0.1, num=output_dim)
out, _ = affine_forward(x, w, b)
correct_out = np.array([[ 1.49834967, 1.70660132, 1.91485297],
                        [ 3.25553199, 3.5141327, 3.77273342]])

# Compare your output with ours. The error should be around e-9 or less.
print('Testing affine_forward function:')
print('difference: ', rel_error(out, correct_out))
The function to fill in is:
def affine_forward(x, w, b):
    out = None
    x_vector = x.reshape(x.shape[0], -1)  # flatten each sample to a row: (N, D)
    out = x_vector.dot(w) + b             # (N, D) dot (D, M) + (M,) -> (N, M)
    cache = (x, w, b)                     # keep the inputs for the backward pass
    return out, cache
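As a quick shape check (my own sketch, not part of the assignment), the forward pass flattens each (4, 5, 6) sample down to D = 120 before the matrix multiply:

x = np.random.randn(2, 4, 5, 6)    # N = 2 samples, each of shape (4, 5, 6)
w = np.random.randn(4 * 5 * 6, 3)  # D = 120, M = 3
b = np.random.randn(3)
out, cache = affine_forward(x, w, b)
print(out.shape)  # (2, 3)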
Affine layer: backward
The function to fill in is:
def affine_backward(dout, cache):
    x, w, b = cache
    dx, dw, db = None, None, None
    # shapes in the test: x (10, 2, 3), w (6, 5), b (5,), dout (10, 5)
    dx = np.dot(dout, w.T)      # (N, M) dot (D, M).T -> (N, D): (10, 6)
    dx = dx.reshape(x.shape)    # restore dx to the same shape as x: (10, 2, 3)
    dw = np.dot(x.reshape(x.shape[0], -1).T, dout)  # (D, N) dot (N, M) -> (D, M): (6, 10) dot (10, 5) = (6, 5)
    db = np.sum(dout, axis=0)   # sum over the sample axis, giving a gradient of shape (M,)
    return dx, dw, db
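To convince myself the formulas are right, here is a small self-contained numeric check (my own sketch; the notebook uses eval_numerical_gradient_array for this instead). It compares db against a centered finite difference of f(b) = sum(affine_forward(x, w, b)[0] * dout):

np.random.seed(0)
x = np.random.randn(10, 2, 3)
w = np.random.randn(6, 5)
b = np.random.randn(5)
dout = np.random.randn(10, 5)

out, cache = affine_forward(x, w, b)
dx, dw, db = affine_backward(dout, cache)

h = 1e-5
db_num = np.zeros_like(b)
for i in range(b.size):
    bp, bm = b.copy(), b.copy()
    bp[i] += h
    bm[i] -= h
    fp = np.sum(affine_forward(x, w, bp)[0] * dout)
    fm = np.sum(affine_forward(x, w, bm)[0] * dout)
    db_num[i] = (fp - fm) / (2 * h)
print(np.max(np.abs(db - db_num)))  # should be around e-9 or smaller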
ReLU activation
Forward:
out = np.maximum(x,0)
For the backward pass I got it wrong at first and wrote
dx = np.maximum(dx, 0)
which is obviously wrong: the condition should test whether x is non-positive, not dx. The correct version is
dx = np.copy(dout)
dx[x <= 0] = 0
and that is all it takes.
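Putting the two pieces together in the signatures layers.py uses (relu_forward returns a cache, relu_backward consumes it; the cache is just the input x):

def relu_forward(x):
    out = np.maximum(x, 0)
    cache = x
    return out, cache

def relu_backward(dout, cache):
    x = cache
    dx = np.copy(dout)
    dx[x <= 0] = 0  # gradient flows only where the input was positive
    return dx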
Sandwich layers
Looking at layer_utils.py, it simply wraps an affine layer and a ReLU together; the reported errors come out around e-11 to e-12.
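Roughly, the chaining in layer_utils.py looks like this (written out from memory, so treat it as a sketch rather than a verbatim copy):

def affine_relu_forward(x, w, b):
    a, fc_cache = affine_forward(x, w, b)
    out, relu_cache = relu_forward(a)
    cache = (fc_cache, relu_cache)
    return out, cache

def affine_relu_backward(dout, cache):
    fc_cache, relu_cache = cache
    da = relu_backward(dout, relu_cache)
    dx, dw, db = affine_backward(da, fc_cache)
    return dx, dw, db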
Loss layers: SVM & Softmax
svm
def svm_loss(x, y):
    loss, dx = None, None
    num_train = x.shape[0]
    scores = x - np.max(x, axis=1, keepdims=True)
    correct_class_scores = scores[np.arange(num_train), y]
    margins = np.maximum(0, scores - correct_class_scores[:, np.newaxis] + 1)
    margins[np.arange(num_train), y] = 0
    loss = np.sum(margins) / num_train
    num_pos = np.sum(margins > 0, axis=1)
    dx = np.zeros_like(x)
    dx[margins > 0] = 1
    dx[np.arange(num_train), y] -= num_pos
    dx /= num_train
    return loss, dx
I built a small example with 4 samples and 3 classes to trace the intermediate values:
x:
[[ 4.17943411e-04  1.39710028e-03 -1.78590431e-03]
 [-7.08827734e-04 -7.47253161e-05 -7.75016769e-04]
 [-1.49797903e-04  1.86172902e-03 -1.42552930e-03]
 [-3.76356699e-04 -3.42275390e-04  2.94907637e-04]]
y:
[2 1 1 0]
scores:
[[-0.00097916  0.         -0.003183  ]
 [-0.0006341   0.         -0.00070029]
 [-0.00201153  0.         -0.00328726]
 [-0.00067126 -0.00063718  0.        ]]
correct class scores:
[-0.003183    0.          0.         -0.00067126]
margins (before zeroing the correct class):
[[1.00220385 1.003183   1.        ]
 [0.9993659  1.         0.99929971]
 [0.99798847 1.         0.99671274]
 [1.         1.00003408 1.00067126]]
margins (after zeroing the correct class):
[[1.00220385 1.003183   0.        ]
 [0.9993659  0.         0.99929971]
 [0.99798847 0.         0.99671274]
 [0.         1.00003408 1.00067126]]
num_pos:
[2 2 2 2]
dx (the margin-violation mask):
[[1. 1. 0.]
 [1. 0. 1.]
 [1. 0. 1.]
 [0. 1. 1.]]
dx (after subtracting num_pos at each correct class):
[[ 1.  1. -2.]
 [ 1. -2.  1.]
 [ 1. -2.  1.]
 [-2.  1.  1.]]
So each margin violation contributes +1 in its own column, while the correct class collects -num_pos; note this printout is taken before the final dx /= num_train.
softmax
Compared with the linear-classifier version from earlier in the assignment, dx here drops the x.T.dot(dscores) step: this loss layer's gradient is taken with respect to the scores x themselves, so dx is just dscores.
def softmax_loss(x, y):
    loss, dx = None, None
    num_train = x.shape[0]
    scores = x - np.max(x, axis=1, keepdims=True)  # shift for numerical stability
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    loss = np.sum(-np.log(probs[np.arange(num_train), y])) / num_train
    # Compute the gradient
    dscores = probs
    dscores[np.arange(num_train), y] -= 1
    dscores /= num_train
    dx = dscores
    return loss, dx
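A quick sanity check (my own sketch): with tiny random scores the probabilities are close to uniform, so the loss should sit near log(C):

np.random.seed(0)
x = 0.001 * np.random.randn(5, 3)
y = np.random.randint(3, size=5)
loss, dx = softmax_loss(x, y)
print(loss, np.log(3))  # both come out around 1.0986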
Two-layer network
For this part, look at the TwoLayerNet class in fc_net.py.
Assume the input dimension is D, the hidden dimension is H, and the number of classes is C.
from builtins import range
from builtins import object
import numpy as np

from ..layers import *
from ..layer_utils import *


class TwoLayerNet(object):
    def __init__(
        self,
        input_dim=3 * 32 * 32,
        hidden_dim=100,
        num_classes=10,
        weight_scale=1e-3,
        reg=0.0,
    ):
        self.params = {}
        self.reg = reg
        self.params['W1'] = weight_scale * np.random.randn(input_dim, hidden_dim)
        self.params['b1'] = np.zeros(hidden_dim)
        self.params['W2'] = weight_scale * np.random.randn(hidden_dim, num_classes)
        self.params['b2'] = np.zeros(num_classes)

    def loss(self, X, y=None):
        scores = None
        # NOTE: this version expects X already flattened to (N, D); the layer
        # helpers (affine_forward etc.) would reshape internally instead.
        hidden_layer = np.maximum(0, np.dot(X, self.params['W1']) + self.params['b1'])  # ReLU activation
        scores = np.dot(hidden_layer, self.params['W2']) + self.params['b2']

        # If y is None then we are in test mode so just return scores
        if y is None:
            return scores

        loss, grads = 0, {}
        num_train = X.shape[0]
        scores -= np.max(scores, axis=1, keepdims=True)  # for numerical stability
        softmax_scores = np.exp(scores) / np.sum(np.exp(scores), axis=1, keepdims=True)
        correct_class_scores = softmax_scores[range(num_train), y]
        data_loss = -np.log(correct_class_scores).mean()
        reg_loss = 0.5 * self.reg * (np.sum(self.params['W1'] ** 2) + np.sum(self.params['W2'] ** 2))
        loss = data_loss + reg_loss

        # Backward pass
        dscores = softmax_scores.copy()
        dscores[range(num_train), y] -= 1
        dscores /= num_train
        grads['W2'] = np.dot(hidden_layer.T, dscores) + self.reg * self.params['W2']
        grads['b2'] = np.sum(dscores, axis=0)
        dhidden = np.dot(dscores, self.params['W2'].T)
        dhidden[hidden_layer <= 0] = 0  # backpropagate through ReLU
        grads['W1'] = np.dot(X.T, dhidden) + self.reg * self.params['W1']
        grads['b1'] = np.sum(dhidden, axis=0)

        return loss, grads
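A minimal smoke test for the class (my own sketch with made-up sizes, not from the assignment):

np.random.seed(0)
model = TwoLayerNet(input_dim=5, hidden_dim=4, num_classes=3, weight_scale=1e-2, reg=0.1)
X = np.random.randn(2, 5)
y = np.array([0, 2])
scores = model.loss(X)          # no labels: test mode, returns (2, 3) scores
loss, grads = model.loss(X, y)  # with labels: returns loss and the gradient dict
print(scores.shape, loss, sorted(grads.keys()))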