目录

学习网址:https://www.youtube.com/watch?v=ogZi5oIo4fI
有道云笔记:http://note.youdao.com/noteshare?id=d86bd8fc60cb4fe87005a2d2e2d5b70d&sub=6911732F9FA44C68AD53A09072155ED3

Pytorch Leture 05: Linear Rregression in the Pytorch Way

第一部分,使用一个类来构建你的模型,需要写forward函数

import torch
from torch.autograd import Variable
import matplotlib.pyplot as plt x_data = Variable(torch.Tensor([[1.0], [2.0], [3.0]]))
y_data = Variable(torch.Tensor([[2.0], [4.0], [6.0]])) class Model(torch.nn.Module): def __init__(self):
"""
In the constructor we instantiate two nn.Linear module
"""
super(Model, self).__init__()
self.linear = torch.nn.Linear(1, 1) # One in and one out def forward(self, x):
"""
In the forward function we accept a Variable of input data and we must return
a Variable of output data. We can use Modules defined in the constructor as
well as arbitrary operators on Variables.
"""
y_pred = self.linear(x)
return y_pred # our model
model = Model()

第二部分,构建loss和优化器来进行参数计算


# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
# criterion 标准准则 主要用来计算loss
criterion = torch.nn.MSELoss(size_average=False)
# 优化器
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

第三部分,进行训练,forward -> backward -> update parameters

# Training loop
for epoch in range(1000):
# Forward pass: Compute predicted y by passing x to the model
y_pred = model(x_data) # Compute and print loss
loss = criterion(y_pred, y_data)
print(epoch, loss.data[0]) # Zero gradients, perform a backward pass, and update the weights.
# initialize the gradients
optimizer.zero_grad()
# 反向传递
loss.backward()
# 更新优化器中的权重,即model.parrameters
optimizer.step()

第四部分,测试

# After training
hour_var = Variable(torch.Tensor([[4.0]]))
y_pred = model(hour_var)
print("predict (after training)", 4, model(hour_var).data[0][0])

总结一下基本的训练框架:

  1. 通过写一个类,来构造你的模型
  2. 构建loss和优化器
  3. 开始训练 Forward -> compute loss -> backward -> update
  • Forward: y_pred = model(x_data)
  • Compute loss: loss = criterion(y_pred,y_data)
  • Backward: optimizer.zero_grad() && loss.backward()
  • Update: optimizer.step()

作业测试其他optimizers:

  • torch.optim.Adagrad
  • torch.optim.Adam
  • torch.optim.Adamax
  • torch.optim.ASGD
  • torch.optim.LBFGS
  • torch.optim.RRRMSprop
  • torch.optim.Rprop
  • torch.optim.SGD

Logistic Regression 逻辑回归 - 二分类

原来的:

graph LR

x-->Linear
Linear-->y
\hat{y} = x * w + b

loss = \frac{1}{N}\sum_{n=1}^{N}(\hat{y_n}-y_n)^2

激活函数:

using sigmoid functions:

graph LR

x --> Linear
Linear --> Sigmoid
Sigmoid --> y

Y 介于 [0,1] 之间, 这样做可以用来压缩计算量,让计算更加容易

\sigma(z) = \frac{1}{1+e^{-z}}

\hat{y} = \sigma(x*w+b)

loss=-\frac{1}{N}\sum_{n=1}^{N}y_nlog\hat{y_n} + (1-y_n)log(1-\hat{y_n})

代码:

import torch
from torch.autograd import Variable
import torch.nn.functional as F x_data = Variable(torch.Tensor([[1.0], [2.0], [3.0], [4.0],[5.0]]))
y_data = Variable(torch.Tensor([[0.], [0.], [1.], [1.],[1.]])) class Model(torch.nn.Module): def __init__(self):
"""
In the constructor we instantiate nn.Linear module
"""
super(Model, self).__init__()
self.linear = torch.nn.Linear(1, 1) # One in and one out def forward(self, x):
"""
In the forward function we accept a Variable of input data and we must return
a Variable of output data.
"""
y_pred = F.sigmoid(self.linear(x))
return y_pred # our model
model = Model() # Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.BCELoss(size_average=True)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01) # Training loop
for epoch in range(400):
# Forward pass: Compute predicted y by passing x to the model
y_pred = model(x_data) # Compute and print loss
loss = criterion(y_pred, y_data)
print(epoch, loss.data[0]) # Zero gradients, perform a backward pass, and update the weights.
optimizer.zero_grad()
loss.backward()
optimizer.step() # After training
hour_var = Variable(torch.Tensor([[0.0]]))
print("predict 1 hour ", 0.0, model(hour_var).data[0][0] > 0.5)
hour_var = Variable(torch.Tensor([[7.0]]))
print("predict 7 hours", 7.0, model(hour_var).data[0][0] > 0.5)

新增激活函数:

  1. Design your model using class

y_Pred = F.sigmoid(self.linear(x))

  1. Construct loss and optimizer

change loss into:

criterion = torch.nn.BCELoss(size_average=True)

  1. Training cycle (forward,backward,update)

作业:尝试其他激活函数:

  • ReLu

ReLU是修正线性单元(The Rectified Linear Unit)的简称,近些年来在深度学习中使用得很多,可以解决梯度弥散问题,因为它的导数等于1或者就是0。相对于sigmoid和tanh激励函数,对ReLU求梯度非常简单,计算也很简单,可以非常大程度地提升随机梯度下降的收敛速度。(因为ReLU是线性的,而sigmoid和tanh是非线性的)。但ReLU的缺点是比较脆弱,随着训练的进行,可能会出现神经元死亡的情况,例如有一个很大的梯度流经ReLU单元后,那权重的更新结果可能是,在此之后任何的数据点都没有办法再激活它了。如果发生这种情况,那么流经神经元的梯度从这一点开始将永远是0。也就是说,ReLU神经元在训练中不可逆地死亡了。

  • ReLu6
  • ELU

ELU在正值区间的值为x本身,这样减轻了梯度弥散问题(x>0区间导数处处为1),这点跟ReLU、Leaky ReLU相似。而在负值区间,ELU在输入取较小值时具有软饱和的特性,提升了对噪声的鲁棒性

  • SELU
  • PReLU
  • LeakyReLu

Leaky ReLU主要是为了避免梯度消失,当神经元处于非激活状态时,允许一个非0的梯度存在,这样不会出现梯度消失,收敛速度快。它的优缺点跟ReLU类似。

  • Threshold
  • Hardtanh

tanh函数将输入值压缩至-1到1之间。该函数与Sigmoid类似,也存在着梯度弥散或梯度饱和的缺点。

  • Sigmoid

这应该是神经网络中使用最频繁的激励函数了,它把一个实数压缩至0到1之间,当输入的数字非常大的时候,结果会接近1,当输入非常大的负数时,则会得到接近0的结果。在早期的神经网络中使用得非常多,因为它很好地解释了神经元受到刺激后是否被激活和向后传递的场景(0:几乎没有被激活,1:完全被激活),不过近几年在深度学习的应用中比较少见到它的身影,因为使用sigmoid函数容易出现梯度弥散或者梯度饱和。当神经网络的层数很多时,如果每一层的激励函数都采用sigmoid函数的话,就会产生梯度弥散的问题,因为利用反向传播更新参数时,会乘以它的导数,所以会一直减小。如果输入的是比较大或者比较小的数(例如输入100,经Sigmoid函数后结果接近于1,梯度接近于0),会产生饱和效应,导致神经元类似于死亡状态。

  • Tanh

Lecture07: How to make netural network wide and deep ?

graph LR
a-->Linear
b-->Linear
Linear-->Sigmoid
Sigmoid-->y

多维度,更层次的网络,主要在Design your model using class 中进行的改变

import torch
from torch.autograd import Variable
import numpy as np xy = np.loadtxt('./data/diabetes.csv.gz', delimiter=',', dtype=np.float32)
x_data = Variable(torch.from_numpy(xy[:, 0:-1]))
y_data = Variable(torch.from_numpy(xy[:, [-1]])) print(x_data.data.shape)
print(y_data.data.shape) class Model(torch.nn.Module): def __init__(self):
"""
In the constructor we instantiate two nn.Linear module
"""
super(Model, self).__init__()
self.l1 = torch.nn.Linear(8, 6)
self.l2 = torch.nn.Linear(6, 4)
self.l3 = torch.nn.Linear(4, 1) self.sigmoid = torch.nn.Sigmoid() def forward(self, x):
"""
In the forward function we accept a Variable of input data and we must return
a Variable of output data. We can use Modules defined in the constructor as
well as arbitrary operators on Variables.
"""
out1 = self.sigmoid(self.l1(x))
out2 = self.sigmoid(self.l2(out1))
y_pred = self.sigmoid(self.l3(out2))
return y_pred # our model
model = Model() # Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
#criterion = torch.nn.BCELoss(size_average=True)
criterion = torch.nn.BCELoss(reduction='elementwise_mean')
optimizer = torch.optim.SGD(model.parameters(), lr=0.1) # Training loop
for epoch in range(1200000):
# Forward pass: Compute predicted y by passing x to the model
y_pred = model(x_data) # Compute and print loss
loss = criterion(y_pred, y_data)
print(epoch, loss.item()) # Zero gradients, perform a backward pass, and update the weights.
optimizer.zero_grad()
loss.backward()
optimizer.step()

作业:

  1. 10层以上的更深层测的网络进行训练
    发现并没有因为更深,效果变好
  2. 更改激励函数

Lecture 08: Pytorch DataLoader

构造Datasets主要分为三个过程:

继承自Dataset

  1. download, rerad data etc
  2. return one item on the index
  3. return the data length

实例化一个dataset,在Dataloader中使用:

train_loader = DataLoader(dataset=dataset,
batch_size=1,
shuffle=True,
num_workers=1)

Code:

# References
# https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/01-basics/pytorch_basics/main.py
# http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#dataset-class
import torch
import numpy as np
from torch.autograd import Variable
from torch.utils.data import Dataset, DataLoader class DiabetesDataset(Dataset):
""" Diabetes dataset.""" # Initialize your data, download, etc.
def __init__(self):
xy = np.loadtxt('./data/diabetes.csv.gz',
delimiter=',', dtype=np.float32)
self.len = xy.shape[0]
self.x_data = torch.from_numpy(xy[:, 0:-1])
self.y_data = torch.from_numpy(xy[:, [-1]]) def __getitem__(self, index):
return self.x_data[index], self.y_data[index] def __len__(self):
return self.len dataset = DiabetesDataset()
train_loader = DataLoader(dataset=dataset,
batch_size=1,
shuffle=True,
num_workers=1) for epoch in range(2):
for i, data in enumerate(train_loader, 0):
# get the inputs
inputs, labels = data # wrap them in Variable
inputs, labels = Variable(inputs), Variable(labels) # Run your training process
print(epoch, i, "inputs", inputs.data, "labels", labels.data)

课后作业:
使用其他数据集,MNIST,参考了官网的代码:

总结一下训练的思路:

  1. 构造继承自Dataset的自己的datasets类
  • [ ] 读取数据集,np.loadtxt("datas.csv") , 构建trainset testset
  • [ ] 构建DataLoader: 得到trainLoader , testLoader
  • [ ] 从DataLoader中获取数据: dataiter = iter(trainloader)  images, labels = dataiter.next()
  • [ ] 训练
  • [ ] 测试

Lecture 09: softmax Classifier

part one

MNist softmax

before:

graph LR

x{x} --> Linear
Linear --> Activation
Activation --> ...
... --> Linear2
Linear2-->Activation2
Activation2-->h{y}

now:

graph LR

x{x} --> Linear
Linear --> Activation
Activation --> ...
... --> Linear2
Linear2-->Activation2
Activation2-->P_y=0
Activation2-->P_y=1
Activation2-->....
Activation2-->P_y=10

what is softmax?


\sigma(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K}e^{z_k}} for j=1,2,...,k

using softmax to get probabilities.

what is corss entropy?

loss = \frac{1}{N}\sum_i D(Softmax(wx_i+b),Y_i)

D(\hat{Y},Y) = -Ylog\hat{Y}

整个过程:

graph LR
x--LinearModel-->Z
Z--Softmax-->y'
y'--Cross_Entropy-->Y

Pytorch中的实现:

loss = torch.nn.CrossEntropyLoss()
这个中既包括了Softmax也包括了Cross_Entropy

graph LR
X--Softmax-->y'
y'--Cross_Entropy-->Y

Code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable # Cross entropy example
import numpy as np
# One hot
# 0: 1 0 0
# 1: 0 1 0
# 2: 0 0 1
Y = np.array([1, 0, 0]) Y_pred1 = np.array([0.7, 0.2, 0.1])
Y_pred2 = np.array([0.1, 0.3, 0.6])
print("loss1 = ", np.sum(-Y * np.log(Y_pred1)))
print("loss2 = ", np.sum(-Y * np.log(Y_pred2))) ################################################################################ # Softmax + CrossEntropy (logSoftmax + NLLLoss)
loss = nn.CrossEntropyLoss() # target is of size nBatch
# each element in target has to have 0 <= value < nClasses (0-2)
# Input is class, not one-hot
Y = Variable(torch.LongTensor([0]), requires_grad=False) # input is of size nBatch x nClasses = 1 x 4
# Y_pred are logits (not softmax)
Y_pred1 = Variable(torch.Tensor([[2.0, 1.0, 0.1]]))
Y_pred2 = Variable(torch.Tensor([[0.5, 2.0, 0.3]])) l1 = loss(Y_pred1, Y)
l2 = loss(Y_pred2, Y) print("PyTorch Loss1 = ", l1.data, "\nPyTorch Loss2=", l2.data) print("Y_pred1=", torch.max(Y_pred1.data, 1)[1])
print("Y_pred2=", torch.max(Y_pred2.data, 1)[1]) ################################################################################
"""Batch loss"""
# target is of size nBatch
# each element in target has to have 0 <= value < nClasses (0-2)
# Input is class, not one-hot
Y = Variable(torch.LongTensor([2, 0, 1]), requires_grad=False) # input is of size nBatch x nClasses = 2 x 4
# Y_pred are logits (not softmax)
Y_pred1 = Variable(torch.Tensor([[0.1, 0.2, 0.9],
[1.1, 0.1, 0.2],
[0.2, 2.1, 0.1]])) Y_pred2 = Variable(torch.Tensor([[0.8, 0.2, 0.3],
[0.2, 0.3, 0.5],
[0.2, 0.2, 0.5]])) l1 = loss(Y_pred1, Y)
l2 = loss(Y_pred2, Y) print("Batch Loss1 = ", l1.data, "\nBatch Loss2=", l2.data)

作业:CrossEntropyLoss VS NLLLoss ?

part two : real problem - MNIST input

MNIST Network

graph LR
inputLayer -.-> HiddenLayer
HiddenLayer -.-> OutputLayer

Code:

# https://github.com/pytorch/examples/blob/master/mnist/main.py
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable # Training settings
batch_size = 16 # MNIST Dataset
train_dataset = datasets.MNIST(root='./mnist_data/',
train=True,
transform=transforms.ToTensor(),
download=True) test_dataset = datasets.MNIST(root='./mnist_data/',
train=False,
transform=transforms.ToTensor()) # Data Loader (Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
batch_size=batch_size,
shuffle=True) test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
batch_size=batch_size,
shuffle=False) class Net(nn.Module): def __init__(self):
super(Net, self).__init__()
self.l1 = nn.Linear(784, 520)
self.l2 = nn.Linear(520, 320)
self.l3 = nn.Linear(320, 240)
self.l4 = nn.Linear(240, 120)
self.l5 = nn.Linear(120, 10) def forward(self, x):
x = x.view(-1, 784) # Flatten the data (n, 1, 28, 28)-> (n, 784)
x = F.relu(self.l1(x))
x = F.relu(self.l2(x))
x = F.relu(self.l3(x))
x = F.relu(self.l4(x))
return self.l5(x) model = Net() criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5) def train(epoch):
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
data, target = Variable(data), Variable(target)
optimizer.zero_grad()
output = model(data)
loss = criterion(output, target)
loss.backward()
optimizer.step()
if batch_idx % 10 == 0:
print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
epoch, batch_idx * len(data), len(train_loader.dataset),
100. * batch_idx / len(train_loader), loss.data[0])) def test():
model.eval()
test_loss = 0
correct = 0
for data, target in test_loader:
data, target = Variable(data, volatile=True), Variable(target)
output = model(data)
# sum up batch loss
test_loss += criterion(output, target).data[0]
# get the index of the max
pred = output.data.max(1, keepdim=True)[1]
correct += pred.eq(target.data.view_as(pred)).cpu().sum() test_loss /= len(test_loader.dataset)
print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
test_loss, correct, len(test_loader.dataset),
100. * correct / len(test_loader.dataset))) for epoch in range(1, 10):
train(epoch)
test()

作业:

Use DataLoader

Lecture 10 : basic CNN

Simple convolution layer

for Example:

graph LR
3*3*1_image-->2*2*1_filter_W
3*3*1_image-->1*1_Stride
3*3*1_image-->NoPadding
NoPadding-->2*2_featureMap
2*2*1_filter_W-->2*2_featureMap
1*1_Stride-->2*2_featureMap

How to compute multi-dimension pictures ?

  • 32 * 32 * 3 image
  • 5 * 5 * 3 filter W
w^T + b

Get: 28 * 28 * 1 feature map * N (how many filters you used)

计算公式


OutputSize = \frac{(InputSize+PaddingSize*2-FilterSize)}{Stride} + 1

几个需要解释的参数:

CONV

卷积层,需要配合激活函数使用
filter and padding and filterSize using function above to calculate

torch.nn.Conv2d(in_channels,out_channels,kernel_size)

self.conv1=nn.Conv2d(1,10,kernel_size=5)

激活函数

activate functions

Max Pooling

选取一个n*m的Filter中最大的值作为pooling的结果
还有类似的avg Pooling

nn.MaxPool2d(kernel_size)

self.mp = nn.MaxPool2d(2)

全连接层

self.fc = nn.Linear(320,10)

CNN & Fully Connected network 区别

CNN中的神经元不是跟每个像素都相连

Fully Connected network中的神经元是跟每个像素都相连。

implement of Simple CNN

graph TB

ConvolutionalLayer1 --> PoolingLayer1
PoolingLayer1 --> ConvolutionalLayer2
ConvolutionalLayer2 --> PoolingLayer2
PoolingLayer2 --> Fully-ConnectedLayer

Model:

class Net(nn.Module):
def __init__(self):
super(Net,self).__init__()
self.conv1 = nn.Conv2d(1,10,kernel_size=5)
self.conv2 = nn.Conv2d(10,20,kernel_size=5)
self.mp = nn.MaxPool2d(2)
self.fc = nn.Linear(???,10)
def forward(self,x):
in_size = x.size(0)
x = F.relu(self.mp(self.conv1(x)))
x = F.relu(self.mp(self.conv2(x)))
x = x.view(in_size,-1) # flatten the tensor
x = self.fc(x)
return F.log_softmax(x)

??? 处如何填写

  • ??? 处可以随意先填一个数值,然后通过程序的报错来填写
  • 还可以在forward函数中print(x.size())得到tensor的维度

作业:

尝试更深层次的网络,更深的全连接层

Lecture 11 Advanced CNN

Why 1*1 convolution ?

using 32 1*1 filters to turn 64-dimension pic into 32-dimension pic.

using 1*1 filters can significantly save our computations.

Inception Module

graph LR

Filter_concat_in --> 1*1Conv0_16
Filter_concat_in --> 1*1Conv1_16
Filter_concat_in --> 1*1Conv2_16
Filter_concat_in --> AvgPooling
AvgPooling --> 1*1Conv3_16
1*1Conv0_16 --> 3*3Conv0_24
3*3Conv0_24 --> 3*3Conv1_24
3*3Conv1_24 --> Filter_Concat_out
1*1Conv1_16 --> 5*5Conv_24
5*5Conv_24 --> Filter_Concat_out
1*1Conv3_16 --> Filter_Concat_out
1*1Conv2_16 --> Filter_Concat_out

Implement

  1. 最下边的实现(第四道)
self.brach1x1 = nn.Conv2d(in_channels,16,kernel_size=1)

branch1x1 = self.branch1x1(x)
  1. 倒数第二道
self.branch_pool = nn.Conv2d(in_channels,24,kernel_size=1)

branch_pool = F.avg_pool2d(x,kernel_size=3,stride=1,padding=1)
branch_pool = self.branch_pool(branch_pool)
  1. 正数第二道
self.branch5x5_1 = nn.Conv2d(in_channels,16,kernel_size=1)
self.branch5x5_2 = nn.Conv2d(16,24,kernel_size=1,padding=2) branch5x5 = self.branch5x5_1(x)
branch5x5 = self.branch5x5_2(branch5x5)
  1. 第一道
self.branch3x3_1=nn.Conv2d(in_channels,16,kernel_size=1)
self.branch3x3_2=nn.Conv2d(16,24,kernel_size=3,padding=1)
self.branch3x3_3=nn.Conv2d(24,24,kernel_size=3,padding=1) branch3x3 = self.branch3x3_1(x)
branch3x3 = self.branch3x3_2(branch3x3)
branch3x3 = self.branch3x3_3(branch3x3)
  1. output
outputs = [branch1x1,branch_pool,branch5x5,branch3x3]

ALL CODE:

# https://github.com/pytorch/examples/blob/master/mnist/main.py
from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable # Training settings
batch_size = 64 # MNIST Dataset
train_dataset = datasets.MNIST(root='./data/',
train=True,
transform=transforms.ToTensor(),
download=True) test_dataset = datasets.MNIST(root='./data/',
train=False,
transform=transforms.ToTensor()) # Data Loader (Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
batch_size=batch_size,
shuffle=True) test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
batch_size=batch_size,
shuffle=False) class InceptionA(nn.Module): def __init__(self, in_channels):
super(InceptionA, self).__init__()
self.branch1x1 = nn.Conv2d(in_channels, 16, kernel_size=1) self.branch5x5_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
self.branch5x5_2 = nn.Conv2d(16, 24, kernel_size=5, padding=2) self.branch3x3dbl_1 = nn.Conv2d(in_channels, 16, kernel_size=1)
self.branch3x3dbl_2 = nn.Conv2d(16, 24, kernel_size=3, padding=1)
self.branch3x3dbl_3 = nn.Conv2d(24, 24, kernel_size=3, padding=1) self.branch_pool = nn.Conv2d(in_channels, 24, kernel_size=1) def forward(self, x):
branch1x1 = self.branch1x1(x) branch5x5 = self.branch5x5_1(x)
branch5x5 = self.branch5x5_2(branch5x5) branch3x3dbl = self.branch3x3dbl_1(x)
branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl) branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
branch_pool = self.branch_pool(branch_pool) outputs = [branch1x1, branch5x5, branch3x3dbl, branch_pool]
return torch.cat(outputs, 1) class Net(nn.Module): def __init__(self):
super(Net, self).__init__()
self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
self.conv2 = nn.Conv2d(88, 20, kernel_size=5) self.incept1 = InceptionA(in_channels=10)
self.incept2 = InceptionA(in_channels=20) self.mp = nn.MaxPool2d(2)
self.fc = nn.Linear(1408, 10) def forward(self, x):
in_size = x.size(0)
x = F.relu(self.mp(self.conv1(x)))
x = self.incept1(x)
x = F.relu(self.mp(self.conv2(x)))
x = self.incept2(x)
x = x.view(in_size, -1) # flatten the tensor
x = self.fc(x)
return F.log_softmax(x) model = Net() optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5) def train(epoch):
model.train()
for batch_idx, (data, target) in enumerate(train_loader):
data, target = Variable(data), Variable(target)
optimizer.zero_grad()
output = model(data)
loss = F.nll_loss(output, target)
loss.backward()
optimizer.step()
if batch_idx % 10 == 0:
print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
epoch, batch_idx * len(data), len(train_loader.dataset),
100. * batch_idx / len(train_loader), loss.data[0])) def test():
model.eval()
test_loss = 0
correct = 0
for data, target in test_loader:
data, target = Variable(data, volatile=True), Variable(target)
output = model(data)
# sum up batch loss
test_loss += F.nll_loss(output, target, size_average=False).data[0]
# get the index of the max log-probability
pred = output.data.max(1, keepdim=True)[1]
correct += pred.eq(target.data.view_as(pred)).cpu().sum() test_loss /= len(test_loader.dataset)
print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
test_loss, correct, len(test_loader.dataset),
100. * correct / len(test_loader.dataset))) for epoch in range(1, 10):
train(epoch)
test()

Lecture 12: RNN

Recurrrent NN

graph LR

X1 --> A1
A1 --> h1 X2 --> A2
A2 --> h2 X3 --> A3
A3 --> h3 X4 --> A4
A4 --> h4 A1 --> A2
A2 --> A3
A3 --> A4

Pytorch提供了RNN函数,可以直接使用

different RNN implementations

cell = nn.RNN(input_size=4,hidden_size=2,batch_first=True)
cell = nn.GRU(input_size=4,hidden_size=2,batch_first=True)
cell = nn.LSTM(input_size=4,hidden_size=2,batch_first=True)

How to use RNN?

cell = nn.RNN(input_size=4,hidden_size=2,batch_first=True)

inputs = ... # batch_size, seq_len,inputSize
hidden = (...) # numLayers,batch_size, hidden_size out, hidden = cell(inputs,hidden)

有两个输出,一个是output, 一个是hidden layer的output

# Lab 12 RNN
import sys
import torch
import torch.nn as nn
from torch.autograd import Variable torch.manual_seed(777) # reproducibility
# 0 1 2 3 4
idx2char = ['h', 'i', 'e', 'l', 'o'] # Teach hihell -> ihello
x_data = [0, 1, 0, 2, 3, 3] # hihell
one_hot_lookup = [[1, 0, 0, 0, 0], # 0
[0, 1, 0, 0, 0], # 1
[0, 0, 1, 0, 0], # 2
[0, 0, 0, 1, 0], # 3
[0, 0, 0, 0, 1]] # 4 y_data = [1, 0, 2, 3, 3, 4] # ihello
x_one_hot = [one_hot_lookup[x] for x in x_data] # As we have one batch of samples, we will change them to variables only once
inputs = Variable(torch.Tensor(x_one_hot))
labels = Variable(torch.LongTensor(y_data)) num_classes = 5
input_size = 5 # one-hot size
hidden_size = 5 # output from the RNN. 5 to directly predict one-hot
batch_size = 1 # one sentence
sequence_length = 1 # One by one
num_layers = 1 # one-layer rnn class Model(nn.Module): def __init__(self):
super(Model, self).__init__()
self.rnn = nn.RNN(input_size=input_size,
hidden_size=hidden_size, batch_first=True) def forward(self, hidden, x):
# Reshape input (batch first)
x = x.view(batch_size, sequence_length, input_size) # Propagate input through RNN
# Input: (batch, seq_len, input_size)
# hidden: (num_layers * num_directions, batch, hidden_size)
out, hidden = self.rnn(x, hidden)
return hidden, out.view(-1, num_classes) def init_hidden(self):
# Initialize hidden and cell states
# (num_layers * num_directions, batch, hidden_size)
return Variable(torch.zeros(num_layers, batch_size, hidden_size)) # Instantiate RNN model
model = Model()
print(model) # Set loss and optimizer function
# CrossEntropyLoss = LogSoftmax + NLLLoss
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1) # Train the model
for epoch in range(100):
optimizer.zero_grad()
loss = 0
hidden = model.init_hidden() sys.stdout.write("predicted string: ")
for input, label in zip(inputs, labels):
# print(input.size(), label.size())
hidden, output = model(hidden, input)
val, idx = output.max(1)
sys.stdout.write(idx2char[idx.data[0]])
loss += criterion(output, label) print(", epoch: %d, loss: %1.3f" % (epoch + 1, loss.data[0])) loss.backward()
optimizer.step() print("Learning finished!")

torch基础学习的更多相关文章

  1. salesforce 零基础学习(五十二)Trigger使用篇(二)

    第十七篇的Trigger用法为通过Handler方式实现Trigger的封装,此种好处是一个Handler对应一个sObject,使本该在Trigger中写的代码分到Handler中,代码更加清晰. ...

  2. 如何从零基础学习VR

    转载请声明转载地址:http://www.cnblogs.com/Rodolfo/,违者必究. 近期很多搞技术的朋友问我,如何步入VR的圈子?如何从零基础系统性的学习VR技术? 本人将于2017年1月 ...

  3. IOS基础学习-2: UIButton

    IOS基础学习-2: UIButton   UIButton是一个标准的UIControl控件,UIKit提供了一组控件:UISwitch开关.UIButton按钮.UISegmentedContro ...

  4. HTML5零基础学习Web前端需要知道哪些?

    HTML零基础学习Web前端网页制作,首先是要掌握一些常用标签的使用和他们的各个属性,常用的标签我总结了一下有以下这些: html:页面的根元素. head:页面的头部标签,是所有头部元素的容器. b ...

  5. python入门到精通[三]:基础学习(2)

    摘要:Python基础学习:列表.元组.字典.函数.序列化.正则.模块. 上一节学习了字符串.流程控制.文件及目录操作,这节介绍下列表.元组.字典.函数.序列化.正则.模块. 1.列表 python中 ...

  6. python入门到精通[二]:基础学习(1)

    摘要:Python基础学习: 注释.字符串操作.用户交互.流程控制.导入模块.文件操作.目录操作. 上一节讲了分别在windows下和linux下的环境配置,这节以linux为例学习基本语法.代码部分 ...

  7. CSS零基础学习笔记.

    酸菜记 之 CSS的零基础. 这篇是我自己从零基础学习CSS的笔记加理解总结归纳的,如有不对的地方,请留言指教, 学前了解: CSS中字母是不分大小写的; CSS文件可以使用在各种程序文件中(如:PH ...

  8. Yaf零基础学习总结5-Yaf类的自动加载

    Yaf零基础学习总结5-Yaf类的自动加载 框架的一个重要功能就是类的自动加载了,在第一个demo的时候我们就约定自己的项目的目录结构,框架就基于这个目录结构来自动加载需要的类文件. Yaf在自启动的 ...

  9. Yaf零基础学习总结4-Yaf的配置文件

    在上一节的hello yaf当中我们已经接触过了yaf的配置文件了, Yaf和用户共用一个配置空间, 也就是在Yaf_Application初始化时刻给出的配置文件中的配置. 作为区别, Yaf的配置 ...

随机推荐

  1. SDRAM调试总结

    SDRAM的调试总结 1 说明 实验平台: JZ2440 CPU: S3C2440 SDRAM型号: EM63A165TS-6G   2 SDRAM的一些基本概念 2.1 引脚分配   2.2 引脚描 ...

  2. 如何为 .NET Core 安装本地化的 IntelliSense 文件

    在.Net Core 2.x 版本,Microsoft 官方没有提供 .Net Core 正式版的多语言安装包.因此,我们在用.Net Core 2.x 版本作为框架目标编写代码时,智能提成是英文的. ...

  3. flink初识及安装flink standalone集群

    flink architecture 1.可以看出,flink可以运行在本地,也可以类似spark一样on yarn或者standalone模式(与spark standalone也很相似),此外fl ...

  4. 吴裕雄--天生自然JAVA SPRING框架开发学习笔记:Spring通知类型及使用ProxyFactoryBean创建AOP代理

    通知(Advice)其实就是对目标切入点进行增强的内容,Spring AOP 为通知(Advice)提供了 org.aopalliance.aop.Advice 接口. Spring 通知按照在目标类 ...

  5. Bean 注解(Annotation)配置(2)- Bean作用域与生命周期回调方法配置

    Spring 系列教程 Spring 框架介绍 Spring 框架模块 Spring开发环境搭建(Eclipse) 创建一个简单的Spring应用 Spring 控制反转容器(Inversion of ...

  6. FTP故障排除

    1,ping 检查 IP是否通 禁PING可以使用TCPING 2,服务器端被动模式设置,可设置固定端口号,保证防火墙上该端口畅通 浏览器默认是主动模式 3,使用FLASHFXP软件可以监测到数据端口 ...

  7. 大二暑假第七周总结--开始学习Hadoop基础(六)

    复习关于Hadoop的操作语句以及重点 Shell版 跳转目录到Hadoop: cd /usr/local/hadoop 启动Hadoop: ./sbin/start-dfs.sh 注意:Hadoop ...

  8. Linux - 安装 dotnet core 环境

    Linux -  安装 dotnet core 环境 系统环境:CentOS7 官方安装指导 https://www.microsoft.com/net/learn/get-started/linux ...

  9. zuul网关配置

    静态路由:通过url匹配映射地址进行静态路由(只会把到达zuul网关的请求按照发送,并把匹配请求地址 /common-service/ ->http://localhost:9001/) zuu ...

  10. 吴裕雄--天生自然 PHP开发学习:字符串变量

    <?php $txt="Hello world!"; echo $txt; ?> <?php $txt1="Hello world!"; $t ...