一个理解基本RCNN的简单例子

对于一个最简单的RNN网络https://github.com/Teaonly/beginlearning/

"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
配套七月视频https://www.bilibili.com/video/av17261517?from=search&seid=1299105350833546417
"""
import numpy as np

# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
print 'data has %d characters, %d unique.' % (data_size, vocab_size)
char_to_ix = { ch:i for i,ch in enumerate(chars) }
ix_to_char = { i:ch for i,ch in enumerate(chars) }

# hyperparameters
hidden_size = 100 # size of hidden layer of neurons
seq_length = 25 # number of steps to unroll the RNN for
learning_rate = 1e-1

# model parameters
Wxh = np.random.randn(hidden_size, vocab_size)*0.01 # input to hidden
Whh = np.random.randn(hidden_size, hidden_size)*0.01 # hidden to hidden
Why = np.random.randn(vocab_size, hidden_size)*0.01 # hidden to output
bh = np.zeros((hidden_size, 1)) # hidden bias
by = np.zeros((vocab_size, 1)) # output bias

def lossFun(inputs, targets, hprev):
  """
  inputs,targets are both list of integers.
  hprev is Hx1 array of initial hidden state
  returns the loss, gradients on model parameters, and last hidden state
  """
  xs, hs, ys, ps = {}, {}, {}, {}
  hs[-1] = np.copy(hprev)
  loss = 0
  # forward pass
  for t in xrange(len(inputs)):
    xs[t] = np.zeros((vocab_size,1)) # encode in 1-of-k representation
    xs[t][inputs[t]] = 1
    hs[t] = np.tanh(np.dot(Wxh, xs[t]) + np.dot(Whh, hs[t-1]) + bh) # hidden state
    ys[t] = np.dot(Why, hs[t]) + by # unnormalized log probabilities for next chars
    ps[t] = np.exp(ys[t]) / np.sum(np.exp(ys[t])) # probabilities for next chars
    loss += -np.log(ps[t][targets[t],0]) # softmax (cross-entropy loss)
  # backward pass: compute gradients going backwards
  dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
  dbh, dby = np.zeros_like(bh), np.zeros_like(by)
  dhnext = np.zeros_like(hs[0])
  for t in reversed(xrange(len(inputs))):
    dy = np.copy(ps[t])
    dy[targets[t]] -= 1 # backprop into y. see http://cs231n.github.io/neural-networks-case-study/#grad if confused here
    dWhy += np.dot(dy, hs[t].T)
    dby += dy
    dh = np.dot(Why.T, dy) + dhnext # backprop into h
    dhraw = (1 - hs[t] * hs[t]) * dh # backprop through tanh nonlinearity
    dbh += dhraw
    dWxh += np.dot(dhraw, xs[t].T)
    dWhh += np.dot(dhraw, hs[t-1].T)
    dhnext = np.dot(Whh.T, dhraw)
  for dparam in [dWxh, dWhh, dWhy, dbh, dby]:
    np.clip(dparam, -5, 5, out=dparam) # clip to mitigate exploding gradients
  return loss, dWxh, dWhh, dWhy, dbh, dby, hs[len(inputs)-1]

def sample(h, seed_ix, n):
  """ 
  sample a sequence of integers from the model 
  h is memory state, seed_ix is seed letter for first time step
  """
  x = np.zeros((vocab_size, 1))
  x[seed_ix] = 1
  ixes = []
  for t in xrange(n):
    h = np.tanh(np.dot(Wxh, x) + np.dot(Whh, h) + bh)
    y = np.dot(Why, h) + by
    p = np.exp(y) / np.sum(np.exp(y))
    ix = np.random.choice(range(vocab_size), p=p.ravel())
    x = np.zeros((vocab_size, 1))
    x[ix] = 1
    ixes.append(ix)
  return ixes

n, p = 0, 0
mWxh, mWhh, mWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
mbh, mby = np.zeros_like(bh), np.zeros_like(by) # memory variables for Adagrad
smooth_loss = -np.log(1.0/vocab_size)*seq_length # loss at iteration 0
while True:
  # prepare inputs (we're sweeping from left to right in steps seq_length long)
  if p+seq_length+1 >= len(data) or n == 0:     hprev = np.zeros((hidden_size,1)) # reset RNN memory    p = 0 # go from start of data  inputs = [char_to_ix[ch] for ch in data[p:p+seq_length]]  targets = [char_to_ix[ch] for ch in data[p+1:p+seq_length+1]]  # sample from the model now and then  if n % 100 == 0:    sample_ix = sample(hprev, inputs[0], 200)    txt = ''.join(ix_to_char[ix] for ix in sample_ix)    print '----\n %s \n----' % (txt, )  # forward seq_length characters through the net and fetch gradient  loss, dWxh, dWhh, dWhy, dbh, dby, hprev = lossFun(inputs, targets, hprev)  smooth_loss = smooth_loss * 0.999 + loss * 0.001  if n % 100 == 0: print 'iter %d, loss: %f' % (n, smooth_loss) # print progress    # perform parameter update with Adagrad  for param, dparam, mem in zip([Wxh, Whh, Why, bh, by],                                 [dWxh, dWhh, dWhy, dbh, dby],                                 [mWxh, mWhh, mWhy, mbh, mby]):    mem += dparam * dparam    param += -learning_rate * dparam / np.sqrt(mem + 1e-8) # adagrad update  p += seq_length # move data pointer  n += 1 # iteration counter

一个理解基本RCNN的简单例子的更多相关文章

一个java解析xml的简单例子
java解析xml,主要是通过Dom4j实现的,很多场合都会用到此功能,需要解析XML文件. 下面是一个简单的解析XML文件的例子: import java.util.Iterator; import ...
一个Self Taught Learning的简单例子
idea: Concretely, for each example in the the labeled training dataset xl, we forward propagate the ...
c# spring aop的简单例子
刚刚完成了一个c#的spring aop简单例子,是在mac下用Xamarin Studio开发的.代码如下: 接口 using System; using System.Collections.Ge ...
一个简单例子：贫血模型or领域模型
转:一个简单例子:贫血模型or领域模型贫血模型我们首先用贫血模型来实现.所谓贫血模型就是模型对象之间存在完整的关联(可能存在多余的关联),但是对象除了get和set方外外几乎就没有其它的方法,整个 ...
Iterator 其实很简单(最好理解的工厂模式的例子)
我们都知道Iterator是一个典型的工厂模式的例子.那么我们可能会被这两个名词搞晕.首先,我们会奇怪,为什么iterator可以遍历不同类型的结合,其次,出入程序猿的我们根本不知道工厂模式是什么. ...
[Machine-Learning] 一个线性回归的简单例子
这篇博客中做一个使用最小二乘法实现线性回归的简单例子. 代码来自<图解机器学习> 图3-2,使用MATLAB实现. 代码link 用到的matlab函数由于以前对MATLAB也不是非常熟 ...
（转）Java中使用正则表达式的一个简单例子及常用正则分享
转自:http://www.jb51.net/article/67724.htm 这篇文章主要介绍了Java中使用正则表达式的一个简单例子及常用正则分享,本文用一个验证Email的例子讲解JAVA中如 ...
C语言多线程的一个简单例子
多线程的一个简单例子: #include <stdio.h> #include <stdlib.h> #include <string.h> #include &l ...
quartz---的一个简单例子
quartz---的一个简单例子首先建立一个maven项目.jar工程即可.(提示:我前面有如何建立一个maven工程的总结以及maven环境的配置.) 1.建立好后点击到app中运行,--> ...

随机推荐

吴裕雄--天生自然C++语言学习笔记：C++ 指针
每一个变量都有一个内存位置,每一个内存位置都定义了可使用连字号(&)运算符访问的地址,它表示了在内存中的一个地址. #include <iostream> using namesp ...
NumPy 数组创建
章节 Numpy 介绍 Numpy 安装 NumPy ndarray NumPy 数据类型 NumPy 数组创建 NumPy 基于已有数据创建数组 NumPy 基于数值区间创建数组 NumPy 数组切 ...
sql ,类型转换，日期截取格式
字符型转换成整型 CONVERT(int ,字段) 只取年月日格式 CONVERT(varchar(10), ZB.drive_time, 120 ) SELECT CONVERT(VARCHAR, ...
main：处理命令行选项
#include<iostream> #include<stdlib.h> using namespace std; int main(int argc, char** arg ...
md5sum|zip|
##move## ;i<=;i++));do cp combine_all.split_$i split_$i;done ##gzip## mkdir gzip/workshell ;i< ...
c\c++ 中字符串分割，并且转换为整形数据
在项目开发中,经常使用到字符串分割, 并且将其转换为整形(比如IP的分割获取,MAC地址的分割获取等),代码如下: #ifndef _UNICODE void StrToIntData( char * ...
关于jquery js读取excel文件内容 xls xlsx格式纯前端
附带参考:http://blog.csdn.net/gongzhongnian/article/details/76438555 更详细导入导出:https://www.jianshu.com/p/7 ...
UVA 11235 RMQ算法
上次的湘潭赛的C题,用线段树敲了下还是WA,不知道为何,我已经注意了处理相同数据,然后他们当时用的RMQ. 所以学了下RMQ,感觉算法思想是一样的,RMQ用了DP或者是递推,由单个数到2^k往上推,虽 ...
equals与hashcode分析
我们经常在面经中看到这样的问题,为什么重写equals方法就一定要重写hashcode方法.本文就是分析这个问题.  在阿里巴巴java开发手册中就给出了这样的规则. ...
CF1217A Creating a Character
You play your favourite game yet another time. You chose the character you didn't play before. It ha ...

一个理解基本RCNN的简单例子

一个理解基本RCNN的简单例子的更多相关文章

随机推荐

热门专题