PRESIDENTIAL COLUMN: Bayes for Beginners: Probability and Likelihood

C. Randy Gallistel, August 31, 2015; TAGS: C. RANDY GALLISTEL COLUMNS | DATA | EXPERIMENTAL PSYCHOLOGY | METHODOLOGY | STATISTICAL ANALYSIS

Distinguishing Likelihood From Probability

The distinction between probability and likelihood is fundamentally important:

  • Probability attaches to possible outcomes;

    the possible outcomes to which probabilities attach are MECE (mutually exclusive and collectively exhaustive).
  • Likelihood attaches to hypotheses;

    the hypotheses to which likelihoods attach are often neither mutually exclusive nor collectively exhaustive.

Explaining this distinction is the purpose of this first column.

Possible results are MECE

Possible outcomes are mutually exclusive and collectively exhaustive (MECE).

Suppose we ask a subject to predict the outcome of each of 10 tosses of a coin.

There are only 11 possible outcomes (0 to 10 correct predictions).

The actual result will always be one and only one of the possible outcomes.

Thus, the probabilities that attach to the possible outcomes MUST sum to 1.
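To make this concrete, here is a minimal sketch in Python using scipy.stats (my assumption; the column itself works with a spreadsheet's BINOM.DIST or MATLAB's binopdf) that enumerates the 11 possible outcomes and confirms their probabilities sum to 1:

```python
from scipy.stats import binom

n_tries = 10    # ten coin-toss predictions
p_chance = 0.5  # probability of a correct prediction under pure guessing

# The 11 mutually exclusive, collectively exhaustive outcomes: 0..10 correct.
outcome_probs = [binom.pmf(k, n_tries, p_chance) for k in range(n_tries + 1)]

print(sum(outcome_probs))  # 1.0 (up to floating-point error)
```

The sum is 1 whatever value we give p_chance; that is what "collectively exhaustive" guarantees.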

Hypotheses are often neither

Hypotheses, unlike outcomes, are neither mutually exclusive nor collectively exhaustive.

Suppose that the first subject we test predicts 7 of the 10 outcomes correctly.

  • I might hypothesize that the subject just guessed, and you might hypothesize that the subject may be somewhat clairvoyant, by which you mean that the subject may be expected to correctly predict the results at slightly greater than chance rates over the long run. These are different hypotheses, but they are not mutually exclusive, because you hedged when you said "may be." You thereby allowed your hypothesis to include mine; in technical terminology, my hypothesis is nested within yours.
  • Someone else might hypothesize that the subject is strongly clairvoyant and that the observed result underestimates the probability that her next prediction will be correct. Another person could hypothesize something else altogether. There is no limit to the hypotheses one might entertain.

The set of hypotheses to which we attach likelihoods is limited by our capacity to dream them up. In practice, we can rarely be confident that we have imagined all the possible hypotheses. Our concern is to estimate the extent to which the experimental results affect the relative likelihood of the hypotheses we and others currently entertain. Because we generally do not entertain the full set of alternative hypotheses and because some are nested within others, the likelihoods that we attach to our hypotheses do not have any meaning in and of themselves; only the relative likelihoods — that is, the ratios of two likelihoods — have meaning.

"Forwards" and "Backwards"

The difference between probability and likelihood becomes clear when one uses the probability distribution function in a general-purpose programming language.

  • In the present case, the function we want is the binomial distribution function.

    It is called BINOM.DIST in the most common spreadsheet software and binopdf in the language I use. It has three input arguments: the number of successes, the number of tries, and the probability of a success. When one uses it to compute probabilities, one assumes that the latter two arguments (number of tries and probability of success) are given. They are the parameters of the distribution. One varies the first argument (the different possible numbers of successes) in order to find the probabilities that attach to those different possible results (top panel of Figure 1). Regardless of the given parameter values, the probabilities always sum to 1.
  • By contrast, in computing a likelihood function, one is given the number of successes (7 in our example) and the number of tries (10). In other words, the given results are now treated as parameters of the function one is using. Instead of varying the possible results, one varies the probability of success (the third argument, not the first argument) in order to get the binomial likelihood function (bottom panel of Figure 1). One is running the function backwards, so to speak, which is why likelihood is sometimes called reverse probability (both directions are run in the sketch after this list).
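Here is a rough Python sketch of the two directions (again an assumption on my part; the column names BINOM.DIST and binopdf, and scipy.stats.binom plays the same role):

```python
import numpy as np
from scipy.stats import binom

# "Forwards" (probability): the parameters n = 10 tries and p = 0.5 are given;
# we vary the possible result k. This traces the top panel of Figure 1.
probs = [binom.pmf(k, 10, 0.5) for k in range(11)]
print(sum(probs))  # 1.0: probabilities over MECE outcomes must sum to 1

# "Backwards" (likelihood): the result (7 successes in 10 tries) is given;
# we vary the probability-of-success parameter p. Bottom panel of Figure 1.
ps = np.linspace(0.0, 1.0, 101)
likelihoods = binom.pmf(7, 10, ps)
print(likelihoods.sum())  # not 1: a likelihood function is not a distribution
```

The likelihood values need not sum (or integrate) to 1, which is one more reason only ratios of likelihoods carry meaning.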

The information that the binomial likelihood function conveys is extremely intuitive. It says that given that we have observed 7 successes in 10 tries, the probability parameter of the binomial distribution from which we are drawing (the distribution of successful predictions from this subject) is very unlikely to be 0.1; it is much more likely to be 0.7, but a value of 0.5 is by no means unlikely. The ratio of the likelihood at p = .7, which is .27, to the likelihood at p = .5, which is .12, is only 2.28. In other words, given these experimental results (7 successes in 10 tries), the hypothesis that the subject's long-term success rate is 0.7 is only a little more than twice as likely as the hypothesis that the subject's long-term success rate is 0.5.
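Those numbers are easy to check with the same assumed Python setup:

```python
from scipy.stats import binom

L_07 = binom.pmf(7, 10, 0.7)  # ~0.2668: likelihood of p = 0.7 given 7/10
L_05 = binom.pmf(7, 10, 0.5)  # ~0.1172: likelihood of p = 0.5 given 7/10
L_01 = binom.pmf(7, 10, 0.1)  # ~0.0000087: p = 0.1 is very unlikely

print(L_07 / L_05)  # ~2.28: the likelihood ratio quoted in the text
```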

In summary, the likelihood function is a Bayesian basic.

To understand likelihood, you must be clear about the differences between probability and likelihood:

Probabilities attach to results; likelihoods attach to hypotheses.

In data analysis, the "hypotheses" are most often a possible value or a range of possible values for the mean of a distribution, as in our example.

  • The results to which probabilities attach are MECE (mutually exclusive and collectively exhaustive);
  • the hypotheses to which likelihoods attach are often neither;

    the range in one hypothesis may include the point in another, as in our example.

    To decide which of two hypotheses is more likely given an experimental result, we consider the ratio of their likelihoods. This ratio, the relative likelihood, is called the Bayes factor (written out in the equation below).
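In symbols (the notation here is mine, not the column's), for observed data D and two point hypotheses about the success probability, p = 0.7 and p = 0.5 as in our example:

\[
\mathrm{BF}_{12} = \frac{L(H_1 \mid D)}{L(H_2 \mid D)} = \frac{P(D \mid H_1)}{P(D \mid H_2)}
= \frac{\binom{10}{7}(0.7)^7(0.3)^3}{\binom{10}{7}(0.5)^7(0.5)^3} \approx \frac{0.267}{0.117} \approx 2.28
\]

The binomial coefficients cancel, so the Bayes factor depends only on the hypothesized success probabilities.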
