Homework 7 INF 552,
1. Generative Models for Text
(a) In this problem, we are trying to build a generative model to mimic the writing
style of the prominent British mathematician, philosopher, prolific writer, and
political activist Bertrand Russell.
(b) Download the following books from Project Gutenberg in text format:
i. The Problems of Philosophy
ii. The Analysis of Mind
iii. Mysticism and Logic and Other Essays
iv. Our Knowledge of the External World as a Field for Scientific Method in
Philosophy
Project Gutenberg adds a standard header and footer to each book, which are
not part of the original text. Open each file in a text editor and delete the header
and footer.
The header is obvious and ends with the text:
*** START OF THIS PROJECT GUTENBERG EBOOK AN INQUIRY INTO
MEANING AND TRUTH ***
The footer is all of the text after the line of text that says:
THE END
To have a better model, it is strongly recommended that you download the following
books from the Internet Archive (https://archive.org) and convert
them to text files:
i. The History of Western Philosophy
https://archive.org/details/westernphilosophy4
ii. The Analysis of Matter
https://archive.org/details/in.ernet.dli.2015.221533
iii. An Inquiry into Meaning and Truth
https://archive.org/details/BertrandRussell-AnInquaryIntoMeaningAndTruth
Try to use only the text of the books and throw away unwanted text before and
after it, although in a large corpus such leftovers are considered noise and should
not cause big problems.1
(c) LSTM: Train an LSTM to mimic Russell’s style and thoughts:
i. Concatenate your text files to create a corpus of Russell’s writings.
ii. Use a character-level representation for this model by using extended ASCII,
which has N = 256 characters. Each character will be encoded into an integer
using its ASCII code. Rescale the integers to the range [0, 1], because LSTM
uses a sigmoid activation function. LSTM will receive the rescaled integers
as its input.2
1 If this corpus is too large for your computer's power and makes training the LSTM hard, use as many of
the books as possible.
iii. Choose a window size, e.g., W = 100.
iv. Inputs to the network will be the first W − 1 = 99 characters of each sequence,
and the output of the network will be the Wth character of the sequence.
Basically, we are training the network to predict each character using the 99
characters that precede it. Slide the window in strides of S = 1 on the text.
For example, if W = 5 and S = 1 and we want to train the network with the
sequence ABRACADABRA, the first input to the network will be ABRA
and the corresponding output will be C. The second input will be BRAC and
the second output will be A, etc.
v. Note that the output has to be encoded using a one-hot encoding scheme with
N = 256 (or less) elements. This means that the network reads integers, but
outputs a vector of N = 256 (or less) elements.
vi. Use a single hidden layer for the LSTM with N = 256 (or less) memory units.
vii. Use a Softmax output layer to yield a probability prediction for each of the
characters between 0 and 1. This is actually a character classification problem
with N classes. Choose log loss (cross entropy) as the objective function for
the network (research what it means).3
viii. We do not use a test dataset. We are using the whole training dataset to
learn the probability of each character in a sequence. We are not seeking
a very accurate model. Instead, we are interested in a generalization of the
dataset that can mimic the gist of the text.
ix. Choose a reasonable number of epochs4 for training, considering your computational
power (e.g., 30, although the network will need more epochs to yield
a better model).
x. Use model checkpointing to save the network weights each time
an improvement in loss is observed at the end of an epoch. Find the best set
of weights in terms of loss.
xi. Use the network with the best weights to generate 1000 characters, using the
following text as initialization of the network:
There are those who take mental phenomena naively, just as they
would physical phenomena. This school of psychologists tends not to
emphasize the object.
2 A smarter way is to parse the whole corpus to figure out how many distinct characters you have in the
corpus (the number may be less than 256, e.g., 53). One can also disregard the distinction between lowercase
and uppercase letters or even remove punctuation characters such as !.
3 In Keras, you can use the Adam optimization algorithm for speed.
4 one epoch = one forward pass and one backward pass of all the training examples.
batch size = the number of training examples in one forward/backward pass. The higher the batch size,
the more memory space you’ll need.
number of iterations = number of passes, each pass using [batch size] number of examples. To be clear,
one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as
two different passes).
See https://stats.stackexchange.com/questions/153531/what-is-batch-size-in-neural-network
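The definitions in footnote 4 can be checked with a little arithmetic. The sketch below is only an illustration: the number of training examples and the batch size are hypothetical values, not figures from the assignment, and it assumes that a final, smaller batch still counts as one iteration.

```python
import math


def iterations_per_epoch(num_examples: int, batch_size: int) -> int:
    """Passes (forward + backward) needed to see every example once.

    Assumption: a last, partially filled batch still counts as one pass.
    """
    return math.ceil(num_examples / batch_size)


# Hypothetical numbers, chosen only for illustration.
num_examples = 1_000_000   # e.g. number of sliding windows in the corpus
batch_size = 128
epochs = 30

per_epoch = iterations_per_epoch(num_examples, batch_size)
total_iterations = per_epoch * epochs
print(per_epoch, total_iterations)
```

Doubling the batch size roughly halves the number of iterations per epoch, at the cost of more memory per pass.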
Homework 7 INF 552, Instructor: Mohammad Reza Rajati
xii. Extra Practice: Use one-hot encoding for the input sequence. Use a large
number of epochs, e.g., 150. Add dropout to the network, and use a deeper
LSTM (e.g. with 3 or more layers). Generate 3000 characters using the above
initialization and report if you get more meaningful text.
xiii. Extra Practice (HMM): Train a Hidden Markov Model with V hidden states
and V possible outputs using the Baum-Welch algorithm (or any other modern
algorithm that is available) on the Russell corpus, where V is the number
of distinct words in the corpus. Note that for the HMM, you should NOT use
character-level encoding, because it may yield totally meaningless results, although the
transition matrices associated with it will be much smaller (you are welcome
to try it). Generate 200 words using the model and comment on its meaningfulness.
Extra extra practice: can you train a higher-order HMM (i.e., an
HMM that assumes dependency on more than one previous state) to get a
better model?
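Steps ii to v of part (c) (integer encoding, rescaling, sliding windows, and one-hot targets) can be sketched in plain Python as follows. This is a minimal illustration, not a prescribed implementation: the function names are hypothetical, the rescaling divides by N − 1 = 255 (one possible way to reach the range [0, 1]), and the text is assumed to contain only characters with codes below 256.

```python
def make_windows(text, window=100, stride=1):
    """Split text into (input, target) pairs.

    Each input is the first window-1 characters of a window and the
    target is the last (Wth) character of that window.
    """
    inputs, targets = [], []
    for start in range(0, len(text) - window + 1, stride):
        chunk = text[start:start + window]
        inputs.append(chunk[:-1])
        targets.append(chunk[-1])
    return inputs, targets


def rescale(chars, n_symbols=256):
    """Encode characters as integers and rescale them to [0, 1].

    Assumption: dividing by n_symbols - 1 maps code 255 to exactly 1.0.
    """
    return [ord(c) / (n_symbols - 1) for c in chars]


def one_hot(char, n_symbols=256):
    """One-hot vector of length n_symbols for the target character."""
    vec = [0] * n_symbols
    vec[ord(char)] = 1
    return vec


# The ABRACADABRA example from step iv, with W = 5 and S = 1.
inputs, targets = make_windows("ABRACADABRA", window=5, stride=1)
x0 = rescale(inputs[0])     # network input for the first window
y0 = one_hot(targets[0])    # network target for the first window
print(inputs[:2], targets[:2])
```

With a reduced alphabet, as suggested in footnote 2, `ord` would be replaced by a lookup table built from the distinct characters of the corpus.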
2. (Deep) CNNs for Image Colorization
(a) This assignment uses a convolutional neural network for image colorization, which
turns a grayscale image into a colored image.5 By converting an image to grayscale,
we lose color information, so converting a grayscale image back to a colored
version is not an easy job. We will use the CIFAR-10 dataset. Download the
dataset from http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz.
(b) From the train and test dataset, extract the class birds. We will focus on this
class, which has 6000 members.
(c) Those 6000 images have 6000 × 32 × 32 pixels. Choose at least 10% of the pixels
randomly. It is strongly recommended that you choose a large number or all of
the pixels. You will have between P = 614400 and P = 6144000 pixels. Each
pixel is an RGB vector with three elements.
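The pixel counts in part (c) follow directly from the image dimensions. The sketch below checks the arithmetic and shows one possible way to draw pixel positions at random without replacement; the helper names and the fixed seed are assumptions made for illustration.

```python
import random

NUM_IMAGES, HEIGHT, WIDTH = 6000, 32, 32

total_pixels = NUM_IMAGES * HEIGHT * WIDTH   # all pixels of the bird class
min_pixels = total_pixels // 10              # the 10% lower bound


def sample_pixel_indices(total, count, seed=0):
    """Draw `count` distinct flat pixel indices from range(total)."""
    rng = random.Random(seed)
    return rng.sample(range(total), count)


def locate(flat_index):
    """Convert a flat index into (image, row, column), all 0-based."""
    image, rest = divmod(flat_index, HEIGHT * WIDTH)
    row, col = divmod(rest, WIDTH)
    return image, row, col


# A small sample for demonstration; the assignment asks for at least min_pixels.
indices = sample_pixel_indices(total_pixels, 1000)
print(total_pixels, min_pixels, locate(indices[0]))
```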
(d) Run k-means clustering on the P vectors using k = 4. The centers of the clusters
will be your main colors. Convert the colored images to k-color images by converting
each pixel’s value to the closest main color in terms of Euclidean distance.
These are the outputs of your network, in which each pixel falls into one of those k
classes.6
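The conversion to k-color images in part (d) is a nearest-center assignment. The sketch below illustrates only that step. As an assumption, it uses the four hand-picked colors suggested in footnote 6 in place of cluster centers; in practice the centers would come from a k-means run (for example scikit-learn's `KMeans`). The function names are hypothetical.

```python
# Stand-in main colors from footnote 6: Navy, Red, Mint, White.
MAIN_COLORS = [(0, 0, 128), (230, 25, 75), (170, 255, 195), (255, 255, 255)]


def squared_distance(p, q):
    """Squared Euclidean distance between two RGB vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_color_index(pixel, centers=MAIN_COLORS):
    """Index of the main color closest to the pixel."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(pixel, centers[i]))


def quantize(image, centers=MAIN_COLORS):
    """Return (class_map, k_color_image) for an image given as rows of RGB tuples."""
    class_map = [[nearest_color_index(px, centers) for px in row] for row in image]
    recolored = [[centers[c] for c in row] for row in class_map]
    return class_map, recolored


# A hypothetical 2 x 2 image, just to exercise the functions.
tiny = [[(10, 10, 120), (250, 250, 250)],
        [(200, 30, 60), (160, 240, 200)]]
class_map, recolored = quantize(tiny)
print(class_map)
```

The class map is what the network is trained to predict; the recolored image is what is shown when comparing results visually. Minimizing the squared distance gives the same nearest center as minimizing the Euclidean distance itself.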
(e) Use any tool (e.g., OpenCV or scikit-image) to obtain grayscale 32 × 32 × 1 images
from the original 32 × 32 × 3 images. The grayscale images are inputs of your
network.
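To make the shapes in part (e) concrete, the conversion can also be written by hand. The sketch below assumes the common luminance weights 0.299, 0.587, and 0.114 (ITU-R BT.601); a library routine may use a slightly different formula or rounding, so this is an illustration rather than a replacement for the tools named above.

```python
def to_gray(pixel):
    """Weighted sum of R, G, B, rounded to an integer in 0..255.

    Assumption: BT.601 luminance weights.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def rgb_to_grayscale(image):
    """Map an H x W x 3 image (rows of RGB tuples) to an H x W x 1 image."""
    return [[[to_gray(px)] for px in row] for row in image]


# A hypothetical 2 x 2 RGB image.
tiny = [[(255, 255, 255), (0, 0, 0)],
        [(255, 0, 0), (0, 0, 128)]]
gray = rgb_to_grayscale(tiny)
print(gray)
```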
5 MATLAB seems to have an easy-to-use CNN library.
https://www.mathworks.com/help/nnet/examples/train-a-convolutional-neural-network-for-regression.html
6 Cluster centers have previously been reported to be too close to one another, so the resulting
tetrachrome images will be very close to grayscale. In case you would like to see colorful images,
repeat the exercise with colors you select from https://sashat.me/2017/01/11/list-of-20-simple-distinct-colors/ or
https://www.rapidtables.com/web/color/RGB_Color.html. A suggestion would be Navy = (0, 0, 128),
Red = (230, 25, 75), Mint = (170, 255, 195), and White = (255, 255, 255).
(f) Set up a deep convolutional neural network with two convolution layers (or more)
and two (or more) MLP layers. Use 5 × 5 filters and a softmax output layer.
Determine the number of filters, strides, and whether or not to use padding yourself.
Use a minimum of one max pooling layer. Use a classification scheme, which
means your output must determine one of the k = 4 color classes for each pixel in
your grayscale image. Your input is a grayscale version of an image (32 × 32 × 1)
and the output is 32 × 32 × 4. The output assigns one of the k = 4 colors to
each of the 32 × 32 pixels; therefore, each of the pixels is classified into one of the
classes [1 0 0 0], [0 1 0 0], [0 0 1 0], [0 0 0 1]. After each pixel is classified into one
of the main colors, the RGB code of that color can be assigned to the pixel. For
example, if the third main color7 is [255 255 255] and pixel (32,32) of an image
has the one-hot encoded class [0 0 1 0], i.e., it was classified as the third color, the
(32,32) place in the output can be associated with [255 255 255]. The size of the
output of the convolutional part, c1 × c2, depends on the size of the convolutional
layers you choose; this output is a feature map, which is a matrix. That matrix must be
flattened or reshaped, i.e., turned into a vector of size c1c2 × 1, before it is
fed to the MLP part. Choose the number of neurons in the first layer of the MLP
(and any other hidden layers, if you are willing to have more than one hidden
layer) yourself, but the last layer must have 32 × 32 × 4 = 4096 neurons, each of
which represents a pixel being in one of the k = 4 classes. Add a softmax layer8
over the k = 4 values of each of the 1024 pixels and choose the class with the
highest value; therefore, the output of the MLP has to be reshaped into a 32 × 32 × 4
matrix, and to get the colored image, the class chosen for each pixel
has to be converted to the RGB vector of the corresponding main color, so an output
image will be 32 × 32 × 3.
Train at least for 5 epochs (30 epochs is strongly recommended). Plot training,
(validation), and test errors in each epoch. Report the train and test errors and
visually compare the artificially colored versions of the first 10 images in the test
set with the original images.9
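The last step of part (f), turning the 4096 outputs back into a 32 × 32 × 3 image, can be sketched in plain Python as follows. The sketch makes several assumptions: the flat vector is laid out row by row with the k = 4 class scores of each pixel stored next to each other, the main colors are the ones suggested in footnote 6, and the scores are hypothetical values rather than the output of a trained network. The function names are illustrative.

```python
import math

# Stand-in main colors from footnote 6: Navy, Red, Mint, White.
MAIN_COLORS = [(0, 0, 128), (230, 25, 75), (170, 255, 195), (255, 255, 255)]


def softmax(values):
    """Numerically stable softmax over one pixel's k scores."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_class):
    """Log loss of one pixel: minus the log-probability of the true class."""
    return -math.log(probs[true_class])


def decode(flat_scores, height=32, width=32, colors=MAIN_COLORS):
    """Reshape a flat score vector to H x W x k and map each pixel to an RGB color."""
    k = len(colors)
    assert len(flat_scores) == height * width * k
    image = []
    for r in range(height):
        row = []
        for c in range(width):
            start = (r * width + c) * k          # assumed row-major layout
            probs = softmax(flat_scores[start:start + k])
            best = max(range(k), key=probs.__getitem__)
            row.append(colors[best])
        image.append(row)
    return image


# Hypothetical scores: pixel p favors class p mod 4.
scores = [0.0] * (32 * 32 * 4)
for p in range(32 * 32):
    scores[p * 4 + p % 4] = 3.0
image = decode(scores)
print(image[0][:4])
```

Because softmax preserves the ordering of its inputs, taking the largest probability picks the same class as taking the largest raw score; the probabilities matter for the cross-entropy loss during training.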
(g) Extra Practice: Repeat the whole exercise with k = 16, 24, 32 colors if your
computer can handle the computations.
7 Do not use the original CIFAR-10 images as the output. You must use the tetrachrome images you
created as your output.
8 Compile the network with loss = cross entropy.
9 If you are using matplotlib, you may get a floating-point error, because to display an image matplotlib
expects either ints in the range 0-255 or floats in the range 0-1. Your array might contain, for example,
153.0 as the representation of 153, which makes matplotlib think that you are sending floats.
Wrap your array in np.uint8(); it will convert 153.0 into 153. NOTE that you cannot use np.round
or np.int etc., because matplotlib requires 8-bit unsigned ints (think 0-255).
