Coursera, Deep Learning 5, Sequence Models, week3, Sequence models & Attention mechanism

mashuai_191 2024-08-26 19:34:35

Sequence to Sequence models

[Figure: basic sequence-to-sequence model]

[Figure: basic image-to-sequence model, also called an image captioning model]

There are some differences, though, between how you use a model like this to generate a sequence and how you synthesized novel text using a language model. One of the key differences is that you don't want a randomly chosen translation; you want the most likely translation. Likewise, you usually don't want a randomly chosen caption; you want the best, most likely caption. The next video shows how to go about generating that.
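
To make the contrast concrete, here is a minimal sketch. It assumes a hypothetical, hand-written distribution over whole translations of one input sentence; the candidate sentences and their probabilities are invented for illustration and do not come from the course or from a trained model. Sampling, as when synthesizing text with a language model, can return any candidate, while translation wants the single most likely one.

```python
import random

# Hypothetical P(translation | x) for one input sentence; the numbers are made up.
candidates = {
    "Jane is visiting Africa in September.": 0.5,
    "Jane is going to be visiting Africa in September.": 0.3,
    "In September, Jane will visit Africa.": 0.2,
}

def sample_translation(probs, rng):
    """Draw one translation at random, in proportion to its probability."""
    sentences = list(probs)
    weights = [probs[s] for s in sentences]
    return rng.choices(sentences, weights=weights, k=1)[0]

def most_likely_translation(probs):
    """Return the translation with the highest probability."""
    return max(probs, key=probs.get)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_translation(candidates, rng) for _ in range(5)]
best = most_likely_translation(candidates)

print("sampled:", samples)
print("most likely:", best)
```

In a real system the candidates cannot be listed like this, which is why a search procedure is needed.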

Picking the most likely sentence

The goal is to find the output y that maximizes P(y|x); the most commonly used algorithm for this is beam search.
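
Written out, the objective looks roughly as follows. The superscript notation y^{<t>} for the t-th output word and T_y for the output length are assumptions borrowed from common sequence-model notation; the notes themselves only write P(y|x).

```latex
\hat{y} = \arg\max_{y^{<1>}, \dots, y^{<T_y>}} P\left(y^{<1>}, \dots, y^{<T_y>} \mid x\right),
\qquad
P\left(y^{<1>}, \dots, y^{<T_y>} \mid x\right)
  = \prod_{t=1}^{T_y} P\left(y^{<t>} \mid x, y^{<1>}, \dots, y^{<t-1>}\right)
```

The factorization on the right is the chain rule of probability: the decoder outputs one conditional factor per time step.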

Before introducing beam search, let's first look at greedy search and at why greedy search is not used.

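Greedy search means: given the word chosen so far, pick the single most likely next word, then repeat for the word after that, and so on. The figure below gives a good example of a case where greedy search is not suitable. It is better to look at the joint probability p(y^1, y^2, ...|x) of each possible combination of y^ values and pick the combination with the highest probability. This has its own problem, though: for an output of 10 words and a vocabulary of 10,000 words there are 10,000¹⁰ possible combinations, so a better algorithm is needed. Read on.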
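
A minimal sketch of both points, assuming a hypothetical toy model with a two-word vocabulary and two output steps. The tables `FIRST` and `SECOND` and all probabilities in them are invented for illustration. Greedy search commits to the most likely first word and ends up with a lower joint probability than the sequence found by checking every combination.

```python
from itertools import product

# Hypothetical toy model: P(y1 | x) and P(y2 | x, y1); the numbers are made up.
VOCAB = ["a", "b"]
FIRST = {"a": 0.6, "b": 0.4}
SECOND = {
    "a": {"a": 0.55, "b": 0.45},
    "b": {"a": 0.9, "b": 0.1},
}

def joint_prob(seq):
    """Joint probability p(y1, y2 | x) = P(y1 | x) * P(y2 | x, y1)."""
    y1, y2 = seq
    return FIRST[y1] * SECOND[y1][y2]

def greedy_search():
    """Pick the most likely word at each step, one step at a time."""
    y1 = max(VOCAB, key=lambda w: FIRST[w])
    y2 = max(VOCAB, key=lambda w: SECOND[y1][w])
    return (y1, y2)

def exhaustive_search():
    """Score every possible sequence and keep the one with the highest joint probability."""
    return max(product(VOCAB, repeat=2), key=joint_prob)

greedy = greedy_search()
best = exhaustive_search()
print("greedy:", greedy, joint_prob(greedy))
print("exhaustive:", best, joint_prob(best))

# Size of the search space for 10 output words over a 10,000-word vocabulary.
search_space = 10_000 ** 10
print("combinations:", search_space)
```

Exhaustive search works here only because there are 2² = 4 sequences; at 10,000¹⁰ = 10⁴⁰ it is out of reach.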

[Figure: example where greedy search fails]
