STAT UN2102 Homework 4 [100 pts]
Due 11:59pm Monday, May 6th on Canvas
Your homework should be submitted on Canvas as an R Markdown file. Please submit
the knitted .pdf or .html file along with the .Rmd file. We will not (and cannot) accept
any other formats. Please clearly label the questions in your responses and support your
answers with textual explanations and the code you use to produce the result. We may print
out your homework. Please do not waste paper by printing the dataset or any vector longer
than, say, 20 elements.
Goals: Simulating probability distributions using the accept-reject method, simulating a
sampling distribution related to the linear regression model.
1 Accept-Reject Method
Let random variable X denote the temperature at which a certain chemical reaction takes
place. Suppose that X has probability density function
Perform the following tasks:
1. Determine the maximum of f(x). Find an envelope function e(x) by using a uniform
distribution for g(x) and setting $e(x) = \max_x \{f(x)\}$.
2. Using the Accept-Reject Algorithm, write a program that simulates 1000 draws
from the probability density function f(x) from Equation 1.
3. Plot a histogram of your simulated data with the density function f overlaid on the
graph. Label your plot appropriately.
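The three tasks above can be sketched in R as follows. The formula for f(x) is not reproduced in this text, so the sketch uses a stand-in density, f(x) = 6x(1 - x) on [0, 1], purely as an assumption for illustration. The names `f`, `f.max`, `n.samps` and `samps` are also illustrative. Swap in the density and support from Equation 1 before using it.

```r
# Stand-in density (an assumption, NOT the f(x) of the assignment):
# f(x) = 6 x (1 - x) on [0, 1], zero elsewhere.
f <- function(x) ifelse(x >= 0 & x <= 1, 6 * x * (1 - x), 0)

# Task 1: maximum of f over its support. With a uniform g(x) the
# envelope is the constant e(x) = f.max.
f.max <- optimize(f, interval = c(0, 1), maximum = TRUE)$objective

# Task 2: accept-reject with a Uniform(0, 1) proposal.
set.seed(1)
n.samps <- 1000
samps <- numeric(n.samps)
n.accepted <- 0
while (n.accepted < n.samps) {
  y <- runif(1, min = 0, max = 1)  # candidate drawn from g
  u <- runif(1)                    # uniform used for the acceptance test
  if (u < f(y) / f.max) {          # accept with probability f(y) / e(y)
    n.accepted <- n.accepted + 1
    samps[n.accepted] <- y
  }
}

# Task 3: histogram on the density scale with f overlaid.
hist(samps, freq = FALSE, breaks = 20,
     main = "Accept-reject draws (stand-in density)", xlab = "x")
curve(f, from = 0, to = 1, add = TRUE, col = "blue", lwd = 2)
```

Because the envelope is constant, the acceptance probability is 1 / (f.max * (b - a)) for a support [a, b], so a density with a tall, narrow peak will reject most candidates.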
2 Regression and Empirical Size
2.1 Regression
We work with the grocery retailer dataset from Canvas. The description follows:
A large national grocery retailer tracks productivity and costs of its facilities closely. Consider
a data set obtained from a single distribution center for a one-year period. Each data
point for each variable represents one week of activity. The variables included are number
of cases shipped in thousands (X1), the indirect costs of labor as a percentage of total
costs (X2), a qualitative predictor called holiday that is coded 1 if the week has a holiday
and 0 otherwise (X3), and total labor hours (Y). Consider the multiple linear regression
model
(2) $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i, \quad i = 1, 2, \ldots, 52,$
and $\epsilon_i \overset{iid}{\sim} N(0, \sigma^2)$.
Perform the following tasks:
4. Read in the grocery retailer dataset. Name the dataset grocery.
5. Use the least squares equation $\hat{\beta} = (X^T X)^{-1} X^T Y$ to estimate regression model (2).
To estimate the model, use the linear model function in R, i.e., use lm().
6. Use R to estimate $\sigma^2$, i.e., compute $MSE = \frac{1}{n-4} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$. To perform this task,
use the residuals() function.
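A minimal sketch of tasks 4 to 6 follows. The grocery file is not available here, so the block builds a hypothetical stand-in data frame. The column names `Y`, `X1`, `X2`, `X3`, the covariate ranges and the coefficient 0.8 on X1 are all assumptions made only so the code runs on its own. With the real file, replace the simulated data frame with a call such as `read.table()` on the file from Canvas.

```r
set.seed(2)
n <- 52

# Hypothetical stand-in for the grocery data (made-up values).
grocery <- data.frame(
  X1 = runif(n, 200, 500),             # cases shipped, in thousands
  X2 = runif(n, 4, 10),                # indirect labor cost, percent
  X3 = rep(c(1, 0), times = c(6, 46))  # holiday indicator
)
grocery$Y <- 4200 + 0.8 * grocery$X1 - 15 * grocery$X2 +
  620 * grocery$X3 + rnorm(n, mean = 0, sd = sqrt(20500))

# Task 5: fit model (2) with lm().
model <- lm(Y ~ X1 + X2 + X3, data = grocery)

# The same estimate from the least squares equation, solving the
# normal equations (X'X) b = X'Y instead of inverting X'X explicitly.
X <- model.matrix(model)
beta.hat <- solve(t(X) %*% X, t(X) %*% grocery$Y)

# Task 6: MSE = SSE / (n - 4), using the residuals() function.
MSE <- sum(residuals(model)^2) / (n - 4)

coef(model)
MSE
```

The value of `MSE` should agree with `summary(model)$sigma^2`, which gives a quick check on the degrees of freedom.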
2.2 Test for Slope

Now consider investigating if the number of cases shipped (X1) is statistically related to
total labor hours (Y). To investigate the research question, we run a t-test on the coefficient
corresponding to X1, i.e., we test the null/alternative pair
(3) $H_0 : \beta_1 = 0$ versus $H_A : \beta_1 \neq 0$.
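To run the hypothesis testing procedure, we use the t-statistic
(4) $t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)},$
where $\hat{\beta}_1$ is the second element of the least squares estimator $\hat{\beta} = (X^T X)^{-1} X^T Y$ and
$SE(\hat{\beta}_1)$ is the standard error of $\hat{\beta}_1$. The least squares estimates, estimated standard errors,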
t-statistics and p-values for all coefficients β0, β1, β2, β3 are nicely organized in the standard
linear regression output displayed in Table 1. To get this output in R, use the summary()
function on your model.
Test the manager’s claim in (3) using the R functions lm() and summary().
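As a rough illustration of the test, the sketch below pulls the X1 row out of the summary() coefficient table and rebuilds the t-statistic and two-sided p-value by hand. The data frame is again a hypothetical stand-in with made-up values and assumed column names (`Y`, `X1`, `X2`, `X3`). The level `alpha = 0.05` is also an assumption, since the text does not fix one for this test.

```r
set.seed(3)
n <- 52

# Hypothetical stand-in for the grocery data (made-up values).
grocery <- data.frame(
  X1 = runif(n, 200, 500),
  X2 = runif(n, 4, 10),
  X3 = rep(c(1, 0), times = c(6, 46))
)
grocery$Y <- 4200 + 0.8 * grocery$X1 - 15 * grocery$X2 +
  620 * grocery$X3 + rnorm(n, mean = 0, sd = sqrt(20500))

model <- lm(Y ~ X1 + X2 + X3, data = grocery)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|).
coef.table <- summary(model)$coefficients

# t-statistic for H0: beta1 = 0, rebuilt from estimate and standard error.
t.X1 <- coef.table["X1", "Estimate"] / coef.table["X1", "Std. Error"]

# Two-sided p-value from the t distribution with n - 4 degrees of freedom.
p.X1 <- 2 * pt(abs(t.X1), df = n - 4, lower.tail = FALSE)

alpha <- 0.05              # assumed significance level
reject.H0 <- p.X1 < alpha  # TRUE means reject H0 at level alpha

coef.table["X1", ]
reject.H0
```

The hand-computed `t.X1` and `p.X1` should match the "t value" and "Pr(>|t|)" entries in the X1 row of the table.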
Table 1: Standard Multiple Linear Regression Output
Estimate Std. Error t value Pr(> |t|) or Sig
(Intercept) β
2.3 Sampling Distribution
Under model (2) and under the null hypothesis H0 : β1 = 0, the test statistic (4) has a
Student's t-distribution with n − 4 degrees of freedom, i.e., $\hat{\beta}_1 / SE(\hat{\beta}_1) \sim t(df = n - 4)$.
The goal of this section is to simulate the sampling distribution of the t-statistic.
Perform the following tasks:
5. Write a loop that simulates the sampling distribution of the t-statistic under null
hypothesis (3) with the multiple linear regression model (2). To accomplish this task:
i. Assume the true model relating Y with X1, X2, X3 is
(5) $Y_i = 4200 + \beta_1 X_{i1} - 15 X_{i2} + 620 X_{i3} + \epsilon_i, \quad i = 1, 2, \ldots, 52, \quad \epsilon_i \overset{iid}{\sim} N(0, 20500).$
ii. Assuming H0 : β1 = 0 is true, simulate 10,000 draws from model (5) using the
fixed covariates X2, X3.
iii. For each iteration of the loop, fit the full model
$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i$
using the simulated Y and fixed covariates X1, X2, X3.
iv. For each iteration of the loop, also compute the t-statistic from equation (4).
Store these values in a vector t.stat. Hint: Use the summary function in R and
extract the actual summary table using the code summary(model)[[4]]. Then
extract the relevant t-statistic from the table.
v. Display the first six elements of your simulated t-values.
7. Plot a histogram of the simulated sampling distribution. Overlay the correct t-density
on this histogram, i.e., overlay the density t(df = 52 − 4). Plot the density in green
and set breaks=40 in the histogram. Make sure to label the plot appropriately. You
can use base R or ggplot.
8. Recall that the significance level of a testing procedure is defined as
P(Type I error) = P(Rejecting H0 when H0 is true) = α.
The significance level is often called the size of the testing procedure. Based on
significance levels α = 0.10, 0.05, 0.01, compute the sample proportion of simulated
t-values that fell in the rejection region. The proportion of simulated rejected t-values
under the null is called the empirical size of a test. The three values should be close
to the actual α levels.
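The simulation, histogram and empirical size computation in this section could be sketched as below. The covariates are hypothetical stand-ins with made-up values, since the grocery file is not available here. With the real data, use the X1, X2, X3 columns of grocery instead. To keep the sketch quick, `n.sims` is set to 2000. The assignment asks for 10,000 draws, so change that value accordingly.

```r
set.seed(4)
n <- 52

# Hypothetical fixed covariates (made-up values).
X1 <- runif(n, 200, 500)
X2 <- runif(n, 4, 10)
X3 <- rep(c(1, 0), times = c(6, 46))

n.sims <- 2000  # the assignment asks for 10,000
t.stat <- numeric(n.sims)

for (b in seq_len(n.sims)) {
  # Model (5) with beta1 = 0, so X1 drops out of the mean of Y.
  Y.sim <- 4200 - 15 * X2 + 620 * X3 +
    rnorm(n, mean = 0, sd = sqrt(20500))
  # Fit the full model, including X1, to the simulated response.
  fit <- lm(Y.sim ~ X1 + X2 + X3)
  # Row "X1", column "t value" of the coefficient table.
  t.stat[b] <- summary(fit)[[4]]["X1", "t value"]
}

head(t.stat)  # first six simulated t-values

# Histogram of the simulated t-values with the t(df = n - 4) density.
hist(t.stat, breaks = 40, freq = FALSE,
     main = "Simulated sampling distribution of the t-statistic",
     xlab = "t-statistic")
curve(dt(x, df = n - 4), add = TRUE, col = "green", lwd = 2)

# Empirical size: share of |t| beyond the two-sided critical value.
alpha <- c(0.10, 0.05, 0.01)
crit <- qt(1 - alpha / 2, df = n - 4)
emp.size <- sapply(crit, function(cv) mean(abs(t.stat) > cv))
names(emp.size) <- alpha
emp.size
```

The rejection region is two-sided because the alternative in (3) is β1 ≠ 0, which is why the critical values use 1 − α/2 rather than 1 − α.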
