Study notes for Discrete Probability Distribution
The Basics of Probability
- Probability measures the amount of uncertainty of an event, i.e., a fact whose occurrence is uncertain.
- Sample space refers to the set of all possible outcomes (elementary events), denoted as $\Omega$.
- Some properties:
- Sum rule: $p(A \cup B) = p(A) + p(B) - p(A, B)$
- Union bound: $p\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} p(A_i)$
- Conditional probability: $p(B|A) = \frac{p(A, B)}{p(A)}$, defined for p(A)>0. To emphasize that p(A) is unconditional, p(A) is called the "marginal probability", and p(A, B) is called the "joint probability". The identity p(A, B)=p(B|A) p(A) is called the "multiplication rule" or "factorization rule".
- Total probability theorem: p(B) = p(B|A)p(A) + p(B|~A)p(~A)
- Bayes' Theorem: $p(A|B) = \frac{p(B|A)\, p(A)}{p(B)}$
Bayes' Theorem can be regarded as a rule to update a prior probability p(A) into a posterior probability p(A|B), taking into account the occurrence of the evidence B.
- Conditional independence: Two events A and B, with p(A)>0 and p(B)>0, are conditionally independent given C if p(A, B|C)=p(A|C) p(B|C).
- Probability mass function (p.m.f) of a random variable X is a function $p(x) = \Pr[X = x]$
- Joint probability mass function of X and Y is a function $p(x, y) = \Pr[X = x, Y = y]$
- Cumulative distribution function (c.d.f) of a random variable X is a function: $F(x) = \Pr[X \le x]$
- The c.d.f gives the accumulated probability that X takes a value up to x (and hence, by differences, the probability of an interval), whereas the p.m.f gives the probability of a single value x.
- Expectation: the expectation of a random variable X is: $E[X] = \sum_{x} x\, p(x)$
- linearity: E[aX+bY]=aE[X]+bE[Y]
- if X and Y are independent: E[XY]=E[X]*E[Y]
- Markov's inequality: let X be a nonnegative random variable with $E[X] < \infty$, then for all $t > 0$: $\Pr[X \ge t] \le \frac{E[X]}{t}$
- Variance: the variance of a random variable X is: $\mathrm{Var}[X] = \sigma_X^2 = E[(X - E[X])^2]$, where $\sigma_X$ is called the standard deviation of the random variable X.
- Var[aX] = a^2 Var[X]
- if X and Y are independent, Var[X+Y]=Var[X]+Var[Y]
- Chebyshev's inequality: let X be a random variable with $\mathrm{Var}[X] < \infty$, then for all $t > 0$: $\Pr[|X - E[X]| \ge t\,\sigma_X] \le \frac{1}{t^2}$
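As a minimal sketch of the definitions above, the following Python snippet first applies the total probability theorem and Bayes' Theorem, and then computes the c.d.f, expectation and variance of a small p.m.f and checks Markov's and Chebyshev's inequalities on it. The numbers in the Bayes example and the fair six-sided die are illustrative assumptions, not taken from these notes, and the helper names (`cdf`, `expectation`, `tail`) are hypothetical.

```python
from fractions import Fraction

# --- Total probability and Bayes' Theorem (illustrative numbers) ---
p_A = Fraction(1, 100)              # prior p(A)
p_B_given_A = Fraction(95, 100)     # p(B|A)
p_B_given_notA = Fraction(5, 100)   # p(B|~A)
p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)   # total probability
p_A_given_B = p_B_given_A * p_A / p_B                  # posterior p(A|B)

# --- p.m.f of a fair six-sided die (assumed example) ---
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F(x) = Pr[X <= x]."""
    return sum(p for v, p in pmf.items() if v <= x)

def expectation(f=lambda v: v):
    """E[f(X)] = sum over x of f(x) * p(x)."""
    return sum(f(v) * p for v, p in pmf.items())

def tail(pred):
    """Probability of the event {x : pred(x) is true}."""
    return sum(p for v, p in pmf.items() if pred(v))

mean = expectation()                          # E[X]
var = expectation(lambda v: (v - mean) ** 2)  # Var[X] = E[(X - E[X])^2]

# Markov: Pr[X >= t] <= E[X] / t, here with t = 5
markov_lhs, markov_rhs = tail(lambda v: v >= 5), mean / 5

# Chebyshev in the equivalent form Pr[|X - E[X]| >= a] <= Var[X] / a^2
a = Fraction(5, 2)
cheb_lhs, cheb_rhs = tail(lambda v: abs(v - mean) >= a), var / a**2

print(p_B, p_A_given_B, mean, var, cdf(3))
```

Exact fractions are used instead of floats so that the inequalities can be compared without rounding concerns.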
Bernoulli Distribution
- A (single) Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure", or "yes" and "no". Examples of Bernoulli trials include: flipping a coin, a political opinion poll, etc.
- The Bernoulli distribution is the discrete probability distribution of a discrete random variable X, which takes value 1 with success probability p: Pr(X=1)=p, and value 0 with failure probability Pr(X=0)=q=1-p. More formally, the Bernoulli distribution is summarized as follows:
- notation: Bern(p), where 0<p<1 is the probability of success.
- support: X ∈ {0, 1}
- p.m.f: Pr[X=0]=q=1-p, Pr[X=1]=p
- mean: E[X]=p
- variance: Var[X]=p(1-p)
- It is a special case of the Binomial distribution B(n, p): the Bernoulli distribution is B(1, p).
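The summary above can be checked empirically with a small simulation. This is only a sketch: the helper `bernoulli`, the value p = 0.3, the seed and the sample size are all assumptions chosen for illustration.

```python
import random

def bernoulli(p, rng):
    """Return 1 ("success") with probability p and 0 ("failure") otherwise."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)   # fixed seed so the run is repeatable
p = 0.3
samples = [bernoulli(p, rng) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)

# The sample mean should be close to p, the sample variance close to p(1-p).
print(sample_mean, sample_var, p * (1 - p))
```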
Binomial Distribution
- The Binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent Bernoulli trials with success probability p, denoted as X~B(n, p).
- The Binomial distribution is often used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one.
- The Binomial distribution is summarized as follows:
- notation: B(n, p), where n is the number of trials and p is the success probability in each trial
- support: k ∈ {0, 1, ..., n}, the number of successes
- p.m.f: $\Pr[X = k] = \binom{n}{k} p^k (1-p)^{n-k}$
- mean: np
- variance: np(1-p)
- If n is large enough, then the skew of the distribution is not too great. In this case, a reasonable approximation to B(n, p) is given by the normal distribution $\mathcal{N}(np,\ np(1-p))$. This is useful because the p.m.f of the Binomial distribution becomes difficult to compute for large n.
- One rule to determine whether such an approximation is reasonable, i.e., whether n is large enough, is that both np and n(1-p) must be greater than 5. If both are greater than 15 then the approximation should be good.
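- A second rule is that for n>5, the normal approximation is adequate if: $\left| \frac{1}{\sqrt{n}} \left( \sqrt{\frac{1-p}{p}} - \sqrt{\frac{p}{1-p}} \right) \right| < 0.3$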
- Another commonly used rule holds that the normal approximation is appropriate only if everything within 3 standard deviations of the mean is within the range of possible values, that is if: $\mu \pm 3\sigma = np \pm 3\sqrt{np(1-p)} \in [0, n]$
- To improve the accuracy of the approximation, we usually apply a continuity correction to take into account that the binomial random variable is discrete while the normal random variable is continuous. The basic idea is to treat the discrete value k as the continuous interval from k-0.5 to k+0.5.
- In addition, the Poisson distribution can be used to approximate the Binomial distribution when n is very large and p is small. A rule of thumb states that the Poisson distribution is a good approximation of the binomial distribution if n is at least 20 and p is smaller than or equal to 0.05, and an excellent approximation if n>=100 and np<=10: $B(n, p) \approx \mathrm{Pois}(\lambda = np)$
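The approximations above can be compared numerically. The sketch below computes the exact binomial c.d.f, the normal approximation with and without the continuity correction, and the Poisson approximation to the p.m.f. The function names and the parameter values (n = 100, p = 0.5, k = 55, and n = 1000, p = 0.005) are assumptions chosen for illustration only.

```python
import math

def binom_pmf(k, n, p):
    """Pr[X = k] for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """Pr[X <= k] for X ~ B(n, p), by direct summation of the p.m.f."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """C.d.f of the normal distribution with mean mu and std. deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def poisson_pmf(k, lam):
    """Pr[X = k] for X ~ Pois(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Normal approximation to Pr[X <= k], with and without continuity correction.
n, p, k = 100, 0.5, 55
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
exact = binom_cdf(k, n, p)
approx_plain = normal_cdf(k, mu, sigma)       # no correction
approx_cc = normal_cdf(k + 0.5, mu, sigma)    # k treated as [k-0.5, k+0.5]

# Poisson approximation with lambda = np (n large, p small).
n2, p2 = 1000, 0.005
max_pmf_gap = max(
    abs(binom_pmf(j, n2, p2) - poisson_pmf(j, n2 * p2)) for j in range(21)
)

print(exact, approx_plain, approx_cc, max_pmf_gap)
```

In this example the continuity-corrected value lands much closer to the exact probability than the uncorrected one, which is the effect the correction is meant to achieve.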
Poisson Distribution
- Poisson distribution: Let X be a discrete random variable taking values in the set of nonnegative integers {0, 1, 2, ...} with probability: $\Pr[X = k] = \frac{\lambda^k e^{-\lambda}}{k!}$
My understanding: under a Poisson distribution, the integers are not equally likely to be drawn. As a loose analogy (an unverified intuition of mine, not a property of the Poisson distribution itself): if someone is asked to pick a random integer from 1 to 10, some integers are picked with greater probability than others, even though all integers seem to have an equal chance. This may be due to personal preference, i.e., some people prefer larger values while others tend to pick smaller ones.
- The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space, if these events occur with a known average rate and independently of the time since the last event.
- The Poisson distribution is summarized as follows.
- notation: $\mathrm{Pois}(\lambda)$, where $\lambda > 0$ is a real number, indicating the average number of events observed in the given time interval.
- support: k ∈ {0, 1, 2, 3, ...}
- mean: $\lambda$
- variance: $\lambda$
- Applications of Poisson distribution
- Telecommunication: telephone calls arriving in a system
- Management: customers arriving at a counter or call center
- Civil engineering: cars arriving at a traffic light
- Generating Poisson random variables
algorithm poisson_random_number:
    init:
        Let L = e^(-λ), k = 0, and p = 1.
    do:
        k = k + 1.
        Generate uniform random number u in [0, 1], and let p = p × u.
    while p > L.
    return k - 1.
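The loop above is the multiplication method usually attributed to Knuth. A minimal Python sketch of it follows; the `rng` parameter, the seed, λ = 4 and the sample size are assumptions added for illustration. The method is only practical for small λ, since the expected number of iterations grows linearly with λ and e^(-λ) underflows for very large λ.

```python
import math
import random

def poisson_random_number(lam, rng=random):
    """Draw one Poisson(lam) variate by multiplying uniform random numbers."""
    L = math.exp(-lam)   # stopping threshold e^(-lambda)
    k = 0
    p = 1.0
    while True:          # do ... while p > L
        k += 1
        p *= rng.random()    # multiply by a uniform number in [0, 1)
        if p <= L:
            return k - 1

rng = random.Random(42)  # fixed seed so the run is repeatable
lam = 4.0
samples = [poisson_random_number(lam, rng) for _ in range(50_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)

# Both the sample mean and the sample variance should be close to lambda.
print(sample_mean, sample_var)
```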
References
- Paola Sebastiani, A tutorial on probability theory
- Mehryar Mohri, Introduction to Machine Learning - Basic Probability Notations.