Long Short-Term Memory (LSTM)公式简介

姜楠 2024-10-19 05:52:36 原文

Long short-term memory:

make that short-term memory last for a long time.

Paper Reference：

A Critical Review of Recurrent Neural Networks for Sequence Learning

Three Types of Gate

Input Gate:

Controls how much of the current input \(x_t\) and the previous output \(h_{t-1}\) will enter into the new cell.
\[i_t=\sigma(W^i x_t+U^i h_{t-1}+b^i)\]

Forget Gate:

Decide whether to erase (set to zero) or keep individual components of the memory.
\[f_t=\sigma(W^f x_t+U^f h_{t-1}+b^f)\]

Cell Update:

Transforms the input and previous state to be taken into account into the current state.
\[g_t=\phi(W^g x_t+U^g h_{t-1}+b^g)\]

Output Gate:

Scales the output from the cell.
\[o_t=\sigma(W_o x_t+U^o h^{t-1}+b^o)\]

Internal State update:

Computes the current timestep's state using the gated previous state and the gated input.
\[s_t=g_t\cdot i_t+s_{t-1}\cdot f_t\]

Hidden Layer:

Output of the LSTM scaled by a \(\tanh\) (squashed) transformations of the current state.
\[h_t=s_t\cdot \phi(o_t)\]

其中\(\cdot\) 代表"element-wise matrix multiplication"(对应元素相乘)，\(\phi(x)=\tanh(x),\sigma(x)=sigmoid(x)\)
\[\phi(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}},\sigma(x)=\frac{1}{1+e^{-x}}\]

Parallel Computing

input gate, forget gate, cell update, output gate can be computed in parallel.

\[\begin{bmatrix} i^t\\ f^t\\g^t\\o^t \end{bmatrix} =\begin{bmatrix}\sigma\\ \sigma\\\phi\\\sigma\end{bmatrix}\times W\times[x^t,h^{t-1}]\]

LSTM network for Semantic Analysis

Model Architecture
Model: LSTM layer --> Averaging Pooling --> Logistic Regession

Input sequence:

\[x_0,x_1,x_2,\cdots,x_n\]

representation sequence:

\[h_0,h_1,h_2,\cdots,h_n\]

This representation sequence is then averaged over all timesteps resulting in representation h:
\[h=\sum\limits_i^n{h_i}\]

Bidirectional LSTM

貌似只能用于 fixed-length sequence. 还有一点就是在传统的机器学习中我们实际上无法获取到 future infromation

Long Short-Term Memory (LSTM)公式简介的更多相关文章

LSTM学习—Long Short Term Memory networks
原文链接:https://colah.github.io/posts/2015-08-Understanding-LSTMs/ Understanding LSTM Networks Recurren ...
LSTM(Long Short Term Memory)
长时依赖是这样的一个问题,当预测点与依赖的相关信息距离比较远的时候,就难以学到该相关信息.例如在句子”我出生在法国,……,我会说法语“中,若要预测末尾”法语“,我们需要用到上下文”法国“.理论上,递归 ...
[译] 理解 LSTM(Long Short-Term Memory, LSTM) 网络
本文译自 Christopher Olah 的博文 Recurrent Neural Networks 人类并不是每时每刻都从一片空白的大脑开始他们的思考.在你阅读这篇文章时候,你都是基于自己已经拥有 ...
Long Short-Term Memory (LSTM)
Long Short-Term Memory (LSTM) Outline Background LSTM Network Extended LSTM LST ...
Mathjax与LaTex公式简介
MathJax与LaTex公式简介 (转载) PS: 原文链接写的非常好!!! 博主写这篇文章,一是为了防止原链接失效,二是在cnblogs上测试MathJax; 本文从math.stackexcha ...
(转).NET Memory Profiler 使用简介
1 简介 .Net Memory Profiler(以下简称Profiler):专门针对于.NET程序,功能最全的内存分析工具,最大的特点是具有内存动态分析(Automatic Mem ...
Gated Recurrent Unit (GRU)公式简介
update gate $z_t$: defines how much of the previous memory to keep around. \[z_t = \sigma ( W^z x_t+ ...
[Android Memory] Android Lint简介（转载）
英文原文:http://tools.android.com/tips/lint 参照文章:http://blog.csdn.net/thl789/article/details/8037473 转载 ...
DMA（Direct Memory Access）简介
什么是DMA(Direct Memory Access) DMA绕过CPU,在内存和外设之间开辟了一条 "隧道" ,直接控制内存与外设之间的操作,并完全由硬件控制. 这样数据传送不 ...

随机推荐

CGLib动态代理原理及实现
JDK实现动态代理需要实现类通过接口定义业务方法,对于没有接口的类,如何实现动态代理呢,这就需要CGLib了.CGLib采用了非常底层的字节码技术,其原理是通过字节码技术为一个类创建子类,并在子类中采 ...
特定IP访问远程桌面
1. 新建IP安全策略 (远程端口没有修改情况下的设置) WIN+R打开运行对话框,输入“gpedit.msc”进入组策略编辑器. 依次打开:本地计算机策略--计算机配置--Windows设置--安全 ...
shell中export理解误区
一直以来,以为shell脚本中经过export后的变量会影响到执行这个shell的终端中的环境变量.环境变量这个概念不是shell所独有的,而是linux里面进程所拥有的,shell解释器运行起来就是 ...
[转]什么鬼，又不知道怎么命名class了
(本文作者Mrcxt,原文链接:http://blog.csdn.net/mrcxt/article/details/52038884) 相信写css的人都会遇到下面的问题: 糟糕,怎么命名这个cla ...
GCC 中零长数组与变长数组
前两天看程序,发现在某个函数中有下面这段程序: int n; //define a variable n int array[n]; //define an array with length n 在 ...
【2016-10-24】【坚持学习】【Day11】【WPF】【MVVM】
今天学习wpf的mvvm 人家说,APS.NET ===>MVC WPF===>MVVM 用WPF不用mvvm的话,不如用winform... 哈哈,题外话. 定义: MVVM: WPF的 ...
【读书笔记《Bootstrap 实战》】1.初识Bootstrap
作为Web前端开发框架,Bootstrap为大多数标准的UI设计常见提供了用户友好.扩浏览器的解决方案. 1.下载Bootstrap 打开官方网址 http://getbootstrap.com/ 进 ...
并查集补集作法 codevs 1069 关押罪犯
1069 关押罪犯 2010年NOIP全国联赛提高组时间限制: 1 s 空间限制: 128000 KB 题目等级 : 钻石 Diamond 题解题目描述 Description ...
第19章集合框架(3)-Map接口
第19章集合框架(3)-Map接口 1.Map接口概述 Map是一种映射关系,那么什么是映射关系呢? 映射的数学解释设A,B是两个非空集合,如果存在一个法则,使得对A中的每一个元素a,按法则f,在 ...
AutoCAD2006的安装及CASS7.1的配置破解
-----本文只限于学习交流,商业用途请支持正版! -----转载请注明:http://www.cnblogs.com/mxbs/p/6193190.html 准备: AutoCAD2006.CASS ...