Common Probability Distributions

Probability Distribution

A probability distribution describes the probabilities of all the possible outcomes for a random variable.

A discrete random variable is one for which the number of possible outcomes can be counted, and for each possible outcome, there is a measurable and positive probability.

A continuous random variable is one for which the number of possible outcomes is infinite, even if lower and upper bounds exist.

A cumulative distribution function (CDF) defines the probability that a random variable, X, takes on a value equal to or less than a specific value, x.

F(x)=P(X<=x)

A discrete uniform random variable is one for which the probabilities for all possible outcomes for a discrete random variable are equal.
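
As a small illustration of the definitions above, the sketch below builds a discrete uniform distribution and evaluates F(x) = P(X <= x) from it. The fair six-sided die and the helper names `discrete_uniform_pmf` and `cdf_from_pmf` are assumptions chosen for the example, not part of the original notes.

```python
from fractions import Fraction

def discrete_uniform_pmf(outcomes):
    """Assign the same probability 1/n to each of the n distinct outcomes."""
    n = len(outcomes)
    return {outcome: Fraction(1, n) for outcome in outcomes}

def cdf_from_pmf(pmf, x):
    """F(x) = P(X <= x): add up the probabilities of all outcomes not exceeding x."""
    return sum((p for outcome, p in pmf.items() if outcome <= x), Fraction(0))

# Hypothetical example: a fair six-sided die.
die = discrete_uniform_pmf([1, 2, 3, 4, 5, 6])
print(die[3])                # P(X = 3)
print(cdf_from_pmf(die, 4))  # F(4) = P(X <= 4)
```

Exact fractions are used so that the probabilities sum to exactly 1 without floating-point rounding.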

Binomial Distribution

A binomial random variable may be defined as the number of "successes" in a given number of trials, whereby the outcome of each trial can be either "success" or "failure". The probability of success, p, is constant for each trial, and the trials are independent.

Note: the binomial distribution is a discrete distribution.

A binomial random variable for which the number of trials is 1 is called a Bernoulli random variable.

Expected value

For a given series of n trials, the expected number of successes, or E(X), is given by the following formula:

expected value of X = E(X) = np

The intuition is straightforward: if we perform n trials and the probability of success on each trial is p, we expect np successes.

Variance

The variance of a binomial random variable is given by:

variance of X = np(1-p)
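
The two formulas above can be checked numerically. The sketch below assumes the standard binomial probability function P(X = k) = C(n, k) p^k (1-p)^(n-k), which the notes do not state explicitly, and compares the mean and variance computed from it against np and np(1-p). The values n = 10 and p = 0.3 are arbitrary illustrative choices.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values, not from the notes

# Probability of every possible number of successes, 0 through n.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed directly from the definition.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                # both should be close to np
print(variance, n * p * (1 - p))  # both should be close to np(1-p)
```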

Tracking Error

Tracking error is the difference between the total return on a portfolio and the total return on the benchmark against which its performance is measured.

Note: The expression "tracking error" is sometimes used interchangeably with "tracking risk", which refers to the standard deviation of the differences between a portfolio's return and its benchmark return.
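
To make the distinction concrete, the sketch below computes per-period tracking error (the return differences) and tracking risk (their standard deviation). The return series are made-up illustrative numbers, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; the notes do not say which version is intended.

```python
from statistics import stdev

# Hypothetical periodic returns, invented purely for illustration.
portfolio = [0.021, -0.004, 0.013, 0.008, -0.011, 0.017]
benchmark = [0.018, -0.002, 0.010, 0.011, -0.009, 0.012]

# Tracking error per period: portfolio return minus benchmark return.
tracking_errors = [rp - rb for rp, rb in zip(portfolio, benchmark)]

# Tracking risk: standard deviation of those differences (sample version assumed).
tracking_risk = stdev(tracking_errors)

print(tracking_errors)
print(tracking_risk)
```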

Continuous Uniform Distribution

The continuous uniform distribution is defined over a range that spans between some lower limit, a, and some upper limit, b, which serve as the parameters of the distribution. Outcomes can only occur between a and b, and since we are dealing with a continuous distribution, even if a < x < b, P(X=x)=0.

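  • PDF (probability density function): f(x) = 1/(b-a) for a <= x <= b, and f(x) = 0 otherwise

  • CDF (cumulative distribution function): F(x) = 0 for x < a, F(x) = (x-a)/(b-a) for a <= x <= b, and F(x) = 1 for x > b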
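
A minimal sketch of the two functions for the continuous uniform distribution follows, assuming the standard forms f(x) = 1/(b-a) on [a, b] and F(x) = (x-a)/(b-a). The function names and the limits a = 2, b = 12 are illustrative assumptions.

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """P(X <= x) for the continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2.0, 12.0  # illustrative lower and upper limits

# Probability of an interval: P(4 <= X <= 7) = F(7) - F(4).
print(uniform_cdf(7.0, a, b) - uniform_cdf(4.0, a, b))

# The density is a height, not a probability: P(X = x) is still 0 for any single x.
print(uniform_pdf(5.0, a, b))
```

Probabilities come from differences of the CDF over an interval, which is why a single point carries zero probability even though the density there is positive.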

Normal Distribution

Note: some properties of the normal distribution:

  1. Skewness=0
  2. Kurtosis=3
  3. A linear combination of normally distributed random variables is also normally distributed.

Standard Normal Distribution and Z-value

The standard normal distribution is a normal distribution that has been standardized so that it has a mean of zero and a standard deviation of 1.

To standardize an observation from a given normal distribution, the z-value of the observation must be calculated. The z-value represents the number of standard deviations a given observation is from the population mean. Standardization is the process of converting an observed value for a random variable to its z-value.

z = [observation-population mean]/[standard deviation] = [x-μ]/σ
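
The standardization formula translates directly into code. In the sketch below the helper name `z_value` and the example numbers (mean 100, standard deviation 15, observation 130) are assumptions for illustration. `statistics.NormalDist` from the Python standard library stands in for a z-table lookup.

```python
from statistics import NormalDist

def z_value(x, mu, sigma):
    """Number of standard deviations that x lies from the population mean mu."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma

# Illustrative numbers: an observation of 130 from a normal population
# with mean 100 and standard deviation 15.
z = z_value(130, 100, 15)
print(z)

# P(Z <= z) under the standard normal distribution (what a z-table reports).
print(NormalDist().cdf(z))
```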

Confidence Interval

A confidence interval is a range of values around the expected outcome within which we expect the actual outcome to fall some specified percentage of the time. A 95% confidence interval is a range that we expect the random variable to be in 95% of the time. For a normal distribution, this interval is based on the expected value (sometimes called a point estimate) of the random variable and on its variability, which we measure with the standard deviation.

Note: 1 - confidence level = α (which is called the significance level)

For any normally distributed random variable, approximately 68% of outcomes are within one standard deviation of the expected value (mean), and approximately 95% of the outcomes are within two standard deviations of the expected value.
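
These percentages can be reproduced with the standard library. The sketch below is one possible way to do it: `prob_within` and `normal_interval` are hypothetical helper names, and the interval is built symmetrically around the mean, which is an assumption consistent with the description above.

```python
from statistics import NormalDist

def prob_within(k):
    """Probability that a normal random variable lies within k standard deviations of its mean."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)

def normal_interval(mu, sigma, confidence):
    """Symmetric interval around mu that contains the given share of outcomes."""
    alpha = 1 - confidence                    # significance level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical z-value for a two-sided interval
    return mu - z * sigma, mu + z * sigma

print(prob_within(1))                    # roughly 0.68
print(prob_within(2))                    # roughly 0.95
print(normal_interval(0.0, 1.0, 0.95))   # roughly (-1.96, 1.96)
```

The exact 95% interval uses about 1.96 standard deviations; "two standard deviations" is the usual rounded rule of thumb.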

Shortfall Risk and Safety-First Ratio

Shortfall risk is the probability that a portfolio value or return will fall below a particular (target) value or return over a given time period.

Roy's safety-first criterion states that the optimal portfolio minimizes the probability that the return of the portfolio falls below some minimum acceptable level. This minimum acceptable level is called the threshold level.

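If portfolio returns are normally distributed, then Roy's safety-first criterion can be stated as: maximize the safety-first ratio (SFR), where

SFR = [E(Rp) - RL]/σp

Here E(Rp) is the expected portfolio return, RL is the threshold level, and σp is the standard deviation of portfolio returns.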

Note that the SFR is the number of standard deviations that the threshold return lies below the expected (mean) return. Thus the portfolio with the larger SFR has the lower probability of returns below the threshold return.
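
The sketch below ranks two portfolios by Roy's criterion, assuming the usual definition SFR = [E(Rp) - RL]/σp and normally distributed returns. The portfolio figures and the 3% threshold are invented for illustration only.

```python
from statistics import NormalDist

def safety_first_ratio(expected_return, threshold, std_dev):
    """Standard deviations between the expected return and the threshold return."""
    return (expected_return - threshold) / std_dev

def shortfall_probability(expected_return, threshold, std_dev):
    """P(return < threshold), assuming normally distributed returns."""
    return NormalDist(expected_return, std_dev).cdf(threshold)

# Hypothetical portfolios: name -> (expected return, standard deviation).
portfolios = {"A": (0.09, 0.12), "B": (0.11, 0.20)}
threshold = 0.03  # hypothetical minimum acceptable return

for name, (mean, sd) in portfolios.items():
    sfr = safety_first_ratio(mean, threshold, sd)
    risk = shortfall_probability(mean, threshold, sd)
    print(name, sfr, risk)

# Roy's criterion selects the portfolio with the largest SFR.
best = max(portfolios, key=lambda name: safety_first_ratio(*portfolios[name][:1], threshold, portfolios[name][1]))
print(best)
```

Under normality the shortfall probability equals the standard normal CDF evaluated at -SFR, so the portfolio with the larger SFR also has the smaller shortfall risk.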

Lognormal Distribution

The lognormal distribution is generated by the function e^x, where x is normally distributed. Since the natural logarithm, ln, of e^x is x, the logarithms of lognormally distributed random variables are normally distributed.

  • The lognormal distribution is skewed to the right.
  • The lognormal distribution is bounded from below by zero, so it is useful for modeling asset prices, which never take negative values.
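
The properties listed above can be observed in a quick simulation. The sketch below draws normal values, exponentiates them, and checks the lower bound and the right skew. The parameters (mean 0, standard deviation 0.5), the sample size, and the seed are arbitrary assumptions.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 0.0, 0.5     # hypothetical parameters of the underlying normal

# x is normally distributed, so e^x is lognormally distributed.
normal_draws = [rng.gauss(mu, sigma) for _ in range(10_000)]
lognormal_draws = [math.exp(x) for x in normal_draws]

# Bounded below by zero: every draw is strictly positive.
print(min(lognormal_draws) > 0)

# Skewed to the right: the mean is pulled above the median.
print(statistics.fmean(lognormal_draws) > statistics.median(lognormal_draws))

# Taking logs recovers the normally distributed values.
recovered = [math.log(y) for y in lognormal_draws]
```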

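PMF (Probability Mass Function)

A PMF (probability mass function) maps each value to its probability.

A PMF works well when there is not much data to handle. As the amount of data grows, however, the probability of each individual value drops and the effect of random noise increases.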
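
A minimal sketch of an empirical PMF follows, built by counting how often each value occurs in a sample. The function name `make_pmf` and the sample values are assumptions for illustration.

```python
from collections import Counter

def make_pmf(t):
    """Map each distinct value in the sample t to its relative frequency."""
    counts = Counter(t)
    n = len(t)
    return {value: count / n for value, count in counts.items()}

sample = [1, 2, 2, 3, 5]  # illustrative sample
pmf = make_pmf(sample)
print(pmf[2])             # share of the sample equal to 2
print(sum(pmf.values()))  # the probabilities add up to 1 (up to rounding)
```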

CDF (Cumulative Distribution Function)

A CDF (cumulative distribution function) maps each value to its percentile rank in the distribution.

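    def Cdf(t, x):
        """Return the fraction of values in the sample t that are <= x."""
        count = 0.0
        for value in t:
            if value <= x:
                count += 1.0
        # Divide by the sample size to turn the count into a probability.
        prob = count / len(t)
        return prob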

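We can evaluate the CDF for any value x, not just for values that appear in the sample. If x is smaller than the smallest value in the sample, CDF(x) is 0. If x is larger than the largest value in the sample, CDF(x) is 1.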
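
When the CDF has to be evaluated many times for the same sample, scanning the whole list on every call becomes wasteful. The sketch below is one alternative, not the original author's method: it sorts the sample once and uses binary search from the standard `bisect` module. The name `make_cdf` and the sample values are assumptions.

```python
from bisect import bisect_right

def make_cdf(t):
    """Sort the sample once and return a function that evaluates the empirical CDF."""
    ordered = sorted(t)
    n = len(ordered)

    def cdf(x):
        # bisect_right returns how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return cdf

cdf = make_cdf([1, 2, 2, 3, 5])  # illustrative sample
print(cdf(0))    # below the smallest value
print(cdf(2))    # three of the five values are <= 2
print(cdf(10))   # above the largest value
```

Each lookup then costs O(log n) instead of O(n), at the price of one O(n log n) sort up front.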

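Pareto Distribution

See Baidu Baike and Wikipedia.

The Pareto distribution is named after the Italian economist Vilfredo Pareto. It is a power-law distribution found in a large number of real-world phenomena. Outside economics, it is also known as the Bradford distribution.

Pareto is known for his observation that 20% of the population of Italy owned 80% of the wealth. Joseph Juran and others later generalized this into the Pareto principle (the 80/20 rule), which was in turn generalized further into the concept of the Pareto distribution.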

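If X is a random variable with a Pareto distribution, its probability distribution is given by the following formula:

P(X > x) = (x/xmin)^(-k)

where x is any number greater than xmin, xmin is the smallest possible value of X (a positive number), and k is a positive parameter. The family of Pareto distribution curves is parameterized by two quantities: xmin and k. The density is

p(x) = k * xmin^k / x^(k+1) for x >= xmin, and p(x) = 0 for x < xmin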
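
A short sketch of the Pareto distribution in code follows, assuming the standard forms P(X > x) = (x/xmin)^(-k) and p(x) = k * xmin^k / x^(k+1) for x >= xmin. The function names, the parameters xmin = 1 and k = 2, and the use of inverse-transform sampling are assumptions made for this example.

```python
import random

def pareto_survival(x, x_min, k):
    """P(X > x) for a Pareto distribution with minimum x_min and shape k."""
    return (x / x_min) ** (-k) if x >= x_min else 1.0

def pareto_pdf(x, x_min, k):
    """Density of the Pareto distribution; zero below x_min."""
    return k * x_min**k / x ** (k + 1) if x >= x_min else 0.0

def pareto_sample(x_min, k, rng):
    """Inverse-transform sampling: solve u = (x / x_min)^(-k) for x."""
    u = 1.0 - rng.random()  # uniform on (0, 1], which avoids raising 0 to a negative power
    return x_min * u ** (-1.0 / k)

x_min, k = 1.0, 2.0        # illustrative parameters
rng = random.Random(1)     # fixed seed so the run is repeatable
samples = [pareto_sample(x_min, k, rng) for _ in range(20_000)]

# The empirical share of samples above 2 should be near the theoretical value.
share_above_2 = sum(s > 2.0 for s in samples) / len(samples)
print(share_above_2, pareto_survival(2.0, x_min, k))
```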
