SciTech-Mathematics-Probability+Statistics-Population Vs. Sampling: Representative Samples + How to obtain Samples
Difference: Population vs. Sample
By Zach Bobbitt, posted on November 27, 2020




Often in statistics we're interested in collecting data so that we can answer some research question.
For example, we might want to answer the following questions:
- What is the median household income in Miami, Florida?
- What is the mean weight of a certain population of turtles?
- What percentage of residents in a certain county support a certain law?
In each scenario, we are interested in answering some question about a population, which represents every possible individual element that we're interested in measuring.
However, rather than collecting data on every individual in a population, we usually collect data on a sample, which is a portion of the population.
Population: Every possible individual element that we are interested in measuring.
Sample: A portion of the population.
Here is how the population and the sample differ in each of the three introductory examples.
Three Examples
- What is the median household income in Miami, Florida?
The entire population might include 500,000 households, but we might only collect data on a sample of 2,000 households.
- What is the mean weight of a certain population of turtles?
The entire population might include 800 turtles, but we might only collect data on a sample of 30 turtles.
- What percentage of residents in a certain county support a certain law?
The entire population might include 50,000 residents, but we might only collect data on a sample of 1,000 residents.
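The turtle example can be sketched in a few lines of Python. This is only an illustration: the weights are invented (drawn from a normal distribution with an assumed mean of 150 and standard deviation of 20), and the seed is fixed so the run is repeatable. It shows that the sample is a small subset of the population and that a statistic computed on the sample serves as an estimate of the corresponding population value.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: the weights of all 800 turtles (invented values).
population = [rng.gauss(150, 20) for _ in range(800)]

# Sample: 30 turtles drawn at random, without replacement.
sample = rng.sample(population, 30)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # what we can actually compute

print(f"population size: {len(population)}, sample size: {len(sample)}")
print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
```

In a real study only the sample mean would be available; the population mean is computed here solely because the population is simulated.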
Why Use Samples?
There are several reasons that we typically collect data on samples instead of entire populations, including:
- It is too time-consuming to collect data on an entire population. For example, if we want to know the median household income in Miami, Florida, it might take months or even years to go around and gather the income of each household. By the time we collect all of this data, the population may have changed or the research question might no longer be relevant.
- It is too costly to collect data on an entire population. It is often too expensive to go around and collect data for every individual in a population, which is why we choose to collect data on a sample instead.
- It is unfeasible to collect data on an entire population. In many cases it's simply not possible to collect data for every individual in a population. For example, it may be extraordinarily difficult to track down and weigh every turtle in a certain population that we're interested in.
By collecting data on samples, we're able to gather information about a given population much more quickly and cheaply.
And if our sample is representative of the population, then we can generalize the findings from a sample to the larger population with a high level of confidence.
The Importance of Representative Samples
When we collect a sample from a population, we ideally want the sample to be like a "mini version" of our population.
For example, suppose we want to understand the movie preferences of students in a certain school district that has a population of 5,000 total students. Since it would take too long to survey every individual student, we might instead take a sample of 100 students and ask them about their preferences.
If the overall student population is composed of 50% girls and 50% boys, our sample would not be representative if it included 90% boys and only 10% girls.

Or if the overall population is composed of equal parts freshmen, sophomores, juniors, and seniors, then our sample would not be representative if it only included freshmen.

A sample is representative of a population if the characteristics of the individuals in the sample closely match the characteristics of the individuals in the overall population.
When this occurs, we can generalize the findings from the sample to the overall population with confidence.
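One rough way to check this on a single characteristic is to compare category proportions in the sample against those in the population. The sketch below uses the 50% girls / 50% boys example from above; the helper names `proportions` and `largest_gap` are hypothetical, and the "largest gap" measure is just one simple choice for illustration, not a standard test.

```python
from collections import Counter

def proportions(values):
    """Return the share of each category in a list of labels."""
    counts = Counter(values)
    total = len(values)
    return {label: count / total for label, count in counts.items()}

def largest_gap(population, sample):
    """Largest absolute difference between population and sample shares,
    taken over the categories that appear in the population."""
    pop = proportions(population)
    samp = proportions(sample)
    return max(abs(pop[label] - samp.get(label, 0.0)) for label in pop)

# School district of 5,000 students: 50% girls and 50% boys.
population = ["girl"] * 2500 + ["boy"] * 2500

skewed_sample = ["boy"] * 90 + ["girl"] * 10    # 90% boys, 10% girls
balanced_sample = ["boy"] * 50 + ["girl"] * 50  # mirrors the population

print(largest_gap(population, skewed_sample))
print(largest_gap(population, balanced_sample))
```

A gap near zero on one characteristic does not prove that the sample is representative overall; it only shows that the sample matches the population on that characteristic.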
How to Obtain Samples
There are many different methods we can use to obtain samples from populations.
To maximize the chances that we obtain a representative sample, we can use one of the three following methods:
Simple random sampling: Randomly select individuals through the use of a random number generator or some other means of random selection.
Stratified random sampling: Split a population into groups. Randomly select some members from each group to be in the sample.
Systematic random sampling: Put every member of a population into some order. Choose a random starting point and select every \(n\)th member to be in the sample.
In each of these methods, every individual in the population has an equal probability of being included in the sample (for stratified sampling, this holds when each group is sampled in proportion to its size). This maximizes the chances that we obtain a sample that is a “mini version” of the population.
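The three methods can be sketched with Python's standard `random` module. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names, the 5,000-student population with four equally sized grades, the 2% sampling fraction, and the step of 50 are all hypothetical. The stratified version draws the same fraction from every group (proportional allocation), which is one common choice; the description above only says "some members from each group".

```python
import random
from collections import Counter

def simple_random_sample(population, k, rng):
    """Draw k members without replacement; every member is equally likely."""
    return rng.sample(population, k)

def stratified_sample(population, group_of, fraction, rng):
    """Split the population into groups, then randomly draw the same
    fraction of members from each group (proportional allocation)."""
    groups = {}
    for member in population:
        groups.setdefault(group_of(member), []).append(member)
    sample = []
    for members in groups.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

def systematic_sample(population, step, rng):
    """Pick a random start among the first `step` members of the ordered
    population, then take every `step`-th member from there on."""
    start = rng.randrange(step)
    return population[start::step]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 5,000 students as (id, grade), 1,250 per grade.
grades = ["freshman", "sophomore", "junior", "senior"]
students = [(i, grades[i // 1250]) for i in range(5000)]

srs = simple_random_sample(students, 100, rng)
strat = stratified_sample(students, lambda s: s[1], 0.02, rng)
syst = systematic_sample(students, 50, rng)

print(len(srs), len(strat), len(syst))
print(Counter(grade for _, grade in strat))
```

Each call returns 100 students here. Note that systematic sampling depends on the ordering: if the list had a repeating pattern whose period lined up with the step, the sample could over-represent some groups.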