
Sampling Distribution + Central Limit Theorem

By Zach Bobbitt. Posted on October 8, 2018

Imagine there exists a population of 10,000 dolphins and the mean weight of a dolphin in this population is 300 pounds.

If we take a simple random sample of 50 dolphins from this population, we might find that the mean weight of dolphins in this sample is 305 pounds.

Then if we take another simple random sample of 50 dolphins, we might find that the mean weight of dolphins in that sample is 295 pounds.

Each time we take a simple random sample of 50 dolphins, it’s likely that the mean weight of the dolphins in the sample will be close to the population mean of 300 pounds, but not exactly 300 pounds.

Imagine that we take 200 simple random samples of 50 dolphins from this population and make a histogram of the mean weight in each sample:

  • In most of the samples, the mean weight will be close to 300 pounds.
  • In rare scenarios,
    • we may happen to pick a sample full of small dolphins where the mean weight is only 250 pounds.
    • Or we may happen to pick a sample full of large dolphins where the mean weight is 350 pounds.

In general, the distribution of the sample means will be approximately normal with the center of the distribution located at the true center of the population.
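The repeated-sampling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is hypothetical, generated from a normal distribution with a mean of 300 pounds, and the 18-pound spread is an assumed value chosen for the simulation.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 dolphin weights in pounds.
# The normal shape and the 18-pound spread are assumptions for illustration.
population = [random.gauss(300, 18) for _ in range(10_000)]

# Draw 200 simple random samples of 50 dolphins (without replacement)
# and record the mean weight of each sample.
sample_means = [
    statistics.mean(random.sample(population, 50))
    for _ in range(200)
]

population_mean = statistics.mean(population)
mean_of_sample_means = statistics.mean(sample_means)

print(f"Population mean:      {population_mean:.2f}")
print(f"Mean of sample means: {mean_of_sample_means:.2f}")
print(f"Smallest sample mean: {min(sample_means):.2f}")
print(f"Largest sample mean:  {max(sample_means):.2f}")
```

The individual sample means vary from draw to draw, but their average should land very close to the population mean.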

This distribution of sample means is known as the sampling distribution of the mean and has the following properties:

\(\large \begin{array}{lrl} \\
& \mu_{\overline{x}} = & \mu \\
where, & & \\
& \mu_{\overline{x}} : & is\ the\ \bm{ mean\ of\ the\ sampling\ distribution } \\
& \mu : & is\ the\ \bm{ population\ mean } \\
\\
& \sigma_{\overline{x}} = & \frac{\sigma}{\sqrt{n}} \\
where, & & \\
& \sigma_{\overline{x}} : & is\ the\ \bm{ standard\ deviation\ of\ the\ sampling\ distribution\ (standard\ error) } \\
& \sigma : & is\ the\ \bm{ population\ standard\ deviation } \\
& n : & is\ the\ \bm{ sample\ size } \\
\end{array}\)

For example, in this population of dolphins we know that the mean weight is $\large \mu $ = 300. So the mean of the sampling distribution is $\large \mu_{\overline{x}} $ = 300.

Suppose we also know that the standard deviation of the population is 18 pounds.

So the standard deviation of the sampling distribution (the standard error of the mean) is $\large \sigma_{\overline{x}} = \frac{18}{\sqrt{50}} \approx 2.546 $.
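As a quick check of this arithmetic, here is a minimal sketch in Python. The function name `standard_error_of_mean` is a hypothetical helper introduced for illustration, not part of any library.

```python
import math


def standard_error_of_mean(sigma: float, n: int) -> float:
    """Standard deviation of the sampling distribution of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# Dolphin example: population standard deviation 18 pounds, samples of 50.
se_50 = standard_error_of_mean(sigma=18, n=50)
print(round(se_50, 3))  # 2.546

# Quadrupling the sample size halves the standard error.
se_200 = standard_error_of_mean(sigma=18, n=200)
print(round(se_50 / se_200, 3))  # 2.0
```

Because n sits under a square root, larger samples shrink the spread of the sample means, but with diminishing returns.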

Sampling Distribution of the Proportion

Consider the same population of 10,000 dolphins. Suppose 10% of the dolphins are black and the rest are gray. Suppose we take a simple random sample of 50 dolphins and find that 14% of the dolphins in that sample are black. Then we take another simple random sample of 50 dolphins and find that 8% of the dolphins in that sample are black.

Imagine that we take 200 simple random samples of 50 dolphins from this population and make a histogram of the proportion of dolphins that are black in each sample:

In most of the samples, the proportion of dolphins that are black will be close to the true population proportion of 10%. The distribution of the sample proportions will be centered at the true population proportion, and for sufficiently large samples it will be approximately normal.

This distribution of sample proportions is known as the sampling distribution of the proportion and has the following properties:

$\large \mu_{p} = P $

where p is the sample proportion and P is the population proportion.

$\large \sigma_{p} = \sqrt{\frac{P(1-P)}{n}} $

where P is the population proportion and n is the sample size.

For example, in this population of dolphins we know that the true proportion of dolphins that are black is 10% = 0.1. So the mean of the sampling distribution of the proportion is $\large \mu_{p} $ = 0.1.

The standard deviation of the sampling distribution of the proportion is $\large \sigma_{p} = \sqrt{\frac{P(1-P)}{n}} = \sqrt{\frac{(0.1)(1-0.1)}{50}} \approx 0.042 $.
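The proportion case can be simulated the same way. The sketch below builds the population described above (1,000 black and 9,000 gray dolphins out of 10,000), draws 200 simple random samples of 50, and compares the simulated mean and spread of the sample proportions with the two formulas. The seed and variable names are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# 10,000 dolphins: 1 = black (10%), 0 = gray (90%).
population = [1] * 1_000 + [0] * 9_000

n = 50
# Proportion of black dolphins in each of 200 simple random samples.
sample_proportions = [
    sum(random.sample(population, n)) / n
    for _ in range(200)
]

P = sum(population) / len(population)        # population proportion
theoretical_sd = math.sqrt(P * (1 - P) / n)  # sqrt(P(1-P)/n)

simulated_mean = statistics.mean(sample_proportions)
simulated_sd = statistics.stdev(sample_proportions)

print(f"Mean: theory {P:.3f}, simulated {simulated_mean:.3f}")
print(f"SD:   theory {theoretical_sd:.3f}, simulated {simulated_sd:.3f}")
```

With only 200 samples, the simulated values will not match the formulas exactly, but they should be close.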

Establishing Normality

To use the formulas above to calculate probabilities with the normal distribution, the sampling distribution needs to be approximately normal.

According to the central limit theorem, the sampling distribution of a sample mean is approximately normal if the sample size is large enough, even if the population distribution is not normal. In most cases, we consider a sample size of 30 or larger to be sufficiently large.

The sampling distribution of a sample proportion is approximately normal if the expected number of successes and the expected number of failures are both at least 10, that is, if nP ≥ 10 and n(1-P) ≥ 10.
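These two rules of thumb are easy to encode. The sketch below is a minimal version; the function names and the default thresholds (30 and 10, taken from the text) are illustrative choices, not a standard API.

```python
def mean_is_approx_normal(n: int, min_n: int = 30) -> bool:
    """Rule of thumb for sample means: n of at least 30 is treated as large enough."""
    return n >= min_n


def proportion_is_approx_normal(n: int, p: float, min_count: float = 10) -> bool:
    """Rule of thumb for sample proportions: expected successes and failures both >= 10."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return expected_successes >= min_count and expected_failures >= min_count


print(mean_is_approx_normal(100))              # True
print(proportion_is_approx_normal(200, 0.87))  # True: 174 and 26 expected
print(proportion_is_approx_normal(50, 0.10))   # False: only 5 expected successes
```

The last call shows that a sample of 50 with P = 0.1 does not meet the condition, so the normal approximation is only rough for a sample that small.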

Examples

We can use sampling distributions to calculate probabilities.

Example 1: A certain machine creates cookies. The distribution of the weight of these cookies is skewed to the right with a mean of 10 ounces and a standard deviation of 2 ounces. If we take a simple random sample of 100 cookies produced by this machine, what is the probability that the mean weight of the cookies in this sample is less than 9.8 ounces?

Step 1: Establish normality.

We need to make sure that the sampling distribution of the sample mean is approximately normal. Since our sample size of 100 is at least 30, the central limit theorem lets us assume that the sampling distribution of the sample mean is approximately normal, even though the population distribution is skewed.

Step 2: Find the mean and standard deviation of the sampling distribution.

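$\large \mu_{\overline{x}} = \mu $

$\large \sigma_{\overline{x}} = \frac{\sigma}{\sqrt{n}} $

$\large \mu_{\overline{x}} = 10 $ ounces

$\large \sigma_{\overline{x}} = \frac{2}{\sqrt{100}} = \frac{2}{10} = 0.2 $ ounces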

Step 3: Use the Z Score Area Calculator to find the probability that the mean weight of the cookies in this sample is less than 9.8 ounces.

Enter the following numbers into the Z Score Area Calculator: a mean of 10, a standard deviation of 0.2, and a raw score of 9.8. You can leave “Raw Score 2” blank since we’re only finding one number in this example.

Since we want to know the probability that the mean weight of the cookies in this sample is less than 9.8 ounces, we are interested in the area to the left of 9.8. The calculator tells us that this probability is 0.15866.
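The same probability can be reproduced without the calculator. The sketch below uses `NormalDist` from Python's standard `statistics` module; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 10, 2, 100  # population mean, population SD, sample size

# Sampling distribution of the mean: Normal(mu, sigma / sqrt(n)).
sampling_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# z-score of 9.8 ounces and the area to its left.
z = (9.8 - sampling_dist.mean) / sampling_dist.stdev
prob = sampling_dist.cdf(9.8)

print(round(z, 2))     # -1.0
print(round(prob, 5))  # 0.15866
```

A sample mean of 9.8 ounces is one standard error below the mean, which is why the answer equals the familiar area to the left of z = -1.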

Example 2: According to a school-wide study, 87% of students in a particular school prefer pizza over ice cream. Suppose we take a simple random sample of 200 students. What is the probability that the proportion of students who prefer pizza is less than 85%?

Step 1: Establish normality.

Recall that the sampling distribution of a sample proportion is approximately normal if the expected number of “successes” and “failures” are both at least 10.

In this case the expected number of students who will prefer pizza is 87% * 200 students = 174 students. The expected number of students who will not prefer pizza is 13% * 200 students = 26 students. Since both of these numbers are at least 10, we can assume that the sampling distribution of the sample proportion of students who will prefer pizza is approximately normal.

Step 2: Find the mean and standard deviation of the sampling distribution.

$\large \mu_{p} = P $

$\large \sigma_{p} = \sqrt{\frac{P(1-P)}{n}} $

$\large \mu_{p} = 0.87 $

$\large \sigma_{p} = \sqrt{\frac{(0.87)(1-0.87)}{200}} \approx 0.024 $

Step 3: Use the Z Score Area Calculator to find the probability that the proportion of students who prefer pizza is less than 85%.

Enter the following numbers into the Z Score Area Calculator: a mean of 0.87, a standard deviation of 0.024, and a raw score of 0.85. You can leave “Raw Score 2” blank since we’re only finding one number in this example.

Since we want to know the probability that the proportion of students who prefer pizza is less than 85%, we are interested in the area to the left of 0.85. The calculator tells us that this probability is 0.20233.
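This result can also be checked in Python with `NormalDist` from the standard library. The sketch computes the probability twice: once with the standard deviation rounded to 0.024, as in the steps above, and once with the unrounded value, to show how much the rounding matters. The variable names are illustrative.

```python
import math
from statistics import NormalDist

P, n = 0.87, 200  # population proportion and sample size

sd_exact = math.sqrt(P * (1 - P) / n)  # unrounded standard deviation
sd_rounded = round(sd_exact, 3)        # 0.024, the value used in Step 2

# Area to the left of 0.85 under each version of the sampling distribution.
prob_rounded = NormalDist(mu=P, sigma=sd_rounded).cdf(0.85)
prob_exact = NormalDist(mu=P, sigma=sd_exact).cdf(0.85)

print(f"SD rounded to 0.024: {prob_rounded:.5f}")
print(f"SD unrounded:        {prob_exact:.5f}")
```

The rounded standard deviation reproduces the calculator's answer of about 0.2023. The unrounded value gives a slightly smaller probability, close to 0.200, so the two agree to about two decimal places.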

