Sequence to Sequence models

basic sequence-to-sequence model:

  

basic image-to-sequence or called image captioning model:

  

but there are some differences between how you write a model like this to generate a sequence, compared to how you were synthesizing novel text  using a language model. One of the key differences is,you don't want a randomly chosen translation,you maybe want the most likely translation,or you don't want a randomly chosen caption, maybe not,but you might want the best caption and most likely caption.So let's see in the next video how you go about generating that.

Picking the most likely sentence

  

找出最大可能性的P(y|x),最常用的算法是beam search.

  

在介绍 beam search 之前,先了解一下 greedy search 已经为什么不用 greedy search?

greedy search 的意思是,在已知一个值word的情况下,求下一个值word的最可能的情况,以此类推。。。 下图是一个很好的例子说明 greedy search 不适用的情况, 就不如求核能的 y^ 的组合的概率 p(y^1, y^2, ...|x) 然后找出最大概率,当然这样也有问题,就是比如说 10 个word 的输出,在一个 10,000 大的corpus 里就有 10,000 10 种组合情况,需要诉诸于更好的算法,且继续往下看

  

Coursera, Deep Learning 5, Sequence Models, week3, Sequence models & Attention mechanism的更多相关文章

  1. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Regularization)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. Regularization Welcome to the second assignment of this week. Deep ...

  2. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week2, Optimization algorithms

    Gradient descent Batch Gradient Decent, Mini-batch gradient descent, Stochastic gradient descent 还有很 ...

  3. Coursera Deep Learning 2 Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization - week1, Assignment(Gradient Checking)

    声明:所有内容来自coursera,作为个人学习笔记记录在这里. Gradient Checking Welcome to the final assignment for this week! In ...

  4. Coursera, Deep Learning 4, Convolutional Neural Networks - week4,

    Face recognition One Shot Learning 只看一次图片,就能以后识别, 传统deep learning 很难做到这个. 而且如果要加一个人到数据库里面,就要重新train ...

  5. Coursera, Deep Learning 1, Neural Networks and Deep Learning - week1, Introduction to deep learning

    整个deep learing 系列课程主要包括哪些内容 Intro to Deep learning

  6. Coursera, Deep Learning 4, Convolutional Neural Networks - week1

    CNN 主要解决 computer vision 问题,同时解决input X 维度太大的问题. Edge detection 下面演示了convolution 的概念 下图的 vertical ed ...

  7. Coursera Deep Learning笔记 逻辑回归典型的训练过程

    Deep Learning 用逻辑回归训练图片的典型步骤. 笔记摘自:https://xienaoban.github.io/posts/59595.html 1. 处理数据 1.1 向量化(Vect ...

  8. Deep Learning基础--理解LSTM/RNN中的Attention机制

    导读 目前采用编码器-解码器 (Encode-Decode) 结构的模型非常热门,是因为它在许多领域较其他的传统模型方法都取得了更好的结果.这种结构的模型通常将输入序列编码成一个固定长度的向量表示,对 ...

  9. Coursera, Deep Learning 5, Sequence Models, week1 Recurrent Neural Networks

    有哪些sequence model Notation: RNN - Recurrent Neural Network 传统NN 在解决sequence input 时有什么问题? RNN就没有上面的问 ...

随机推荐

  1. Spring Cloud Vault介绍

    https://mp.weixin.qq.com/s?__biz=MzU0MDEwMjgwNA==&mid=2247484838&idx=1&sn=6439ed96133dde ...

  2. goaccess nginx 日志分析

    用法介绍 GoAccess的基本语法如下: goaccess [ -b ][ -s ][ -e IP_ADDRESS][ - a ] <-f log_file > 参数说明: -f – 日 ...

  3. tyvj/joyoi 1374 火车进出栈问题(水水版)

    我受不了了. Catalan数第100项,30000项,50000项,cnm 这tm哪里是在考数学,分明是在考高精度,FFT...... 有剧毒! 我只得写高精度,只能过100的那个题,两个进化版超时 ...

  4. BZOJ3512 DZY Loves Math IV

    解:这又是什么神仙毒瘤题...... 我直接把后面那个phi用phi * I = id反演一波,得到个式子,然后推不动了...... 实际上第一步我就大错特错了.考虑到n很小,我们有 然后计算S,我们 ...

  5. YSLOW(一款实用的网站性能检测工具)

     概述 YSlow是Yahoo发布的一款基于FireFox的插件,这个插件可以分析网站的页面,并告诉你为了提高网站性能,如何基于某些规则而进行优化.  安装  官网:http://yslow.org/ ...

  6. Solr7.1--- 单机Linux环境搭建

    应网友的要求,写个关于Linux单机的 首先,把MySQL驱动包和solr7.1安装包上传到服务器,我上传到了自定义的目录/mysoft 执行服务安装脚本 1. 先切换到root用户 2. 解压出脚本 ...

  7. argparse模块的应用

    主要参照博客https://www.cnblogs.com/lindaxin/p/7975697.html http://wiki.jikexueyuan.com/project/explore-py ...

  8. secureCRT恶意终止下次无法启动

    secureCRT使用中恶意中断后会在C:\Documents and Settings\wuyipeng\Application Data目录下产生一个secureCRT.dmp文件 下次正常运行s ...

  9. mysql 将一张表的数据更新到另外一张表中

    update 更新表 set 字段 = (select 参考数据 from 参考表 where  参考表.id = 更新表.id); update table_2 m  set m.column = ...

  10. Storm 消息分发策略

    1.Shuffle Grouping:随机分组,随机派发stream里面的tuple,保证每个bolt接收到的tuple数目相同.2.Fields Grouping:按字段分组,比如按userid来分 ...