Hypergeometric distribution
How TermFinder calculates P-values
GoTermFinder attempts to determine whether the observed level of annotation for a group of genes is significant in the context of annotation for all genes in the genome. Suppose we have a total population of N genes, of which M have a particular annotation. If we observe x genes with that annotation in a sample of n genes, we can calculate the probability of that observation using the hypergeometric distribution (e.g., see http://mathworld.wolfram.com/HypergeometricDistribution.html ) as:

$$P(X = x) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$$
The P-value is the probability of seeing at least x genes, out of the n genes in the list, annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO term. That is, the GO terms shared by the genes in the user's list are compared to the background distribution of annotation. The closer the P-value is to zero, the more significant the association of that GO term with the group of genes (i.e., the less likely it is that the observed annotation of the group to that GO term occurred by chance).
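As a rough illustration of the calculation described above, the following Python sketch sums the hypergeometric probabilities for x or more annotated genes. The function names `hypergeom_pmf` and `enrichment_p_value` are hypothetical and chosen only for this example; this is not TermFinder's own code, and it applies no multiple-testing correction.

```python
from math import comb


def hypergeom_pmf(x, N, M, n):
    """Probability of exactly x annotated genes in a sample of n genes,
    drawn from N genes of which M carry the annotation."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def enrichment_p_value(x, N, M, n):
    """Probability of at least x annotated genes in the sample (upper tail)."""
    # The sample cannot contain more annotated genes than min(n, M).
    upper = min(n, M)
    # Sum exact integer counts first, then divide once to limit rounding error.
    favourable = sum(comb(M, k) * comb(N - M, n - k) for k in range(x, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Toy numbers (illustrative only): 10 genes in total, 4 annotated,
    # a list of 3 genes, 2 of which carry the annotation.
    print(enrichment_p_value(2, 10, 4, 3))
```

For the toy numbers, the favourable count is C(4,2)·C(6,1) + C(4,3)·C(6,0) = 40 out of C(10,3) = 120 possible lists, so the tail probability is 1/3.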
For example, when searching the process ontology, if all of the genes in a group were associated with "DNA repair", this term would be significant. However, all genes in the genome (with GO annotations) are indirectly associated with the top-level term "biological_process", so finding that every gene in a group is associated with this very high-level term would not be significant.
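The top-level case can be checked directly from the hypergeometric formula. Assuming every gene in the population carries the annotation (M = N, in the notation used above), any list of n genes contains exactly n annotated genes, so the tail probability for any x ≤ n is:

$$P(X \ge x) = \sum_{k=x}^{n} \frac{\binom{N}{k}\binom{0}{n-k}}{\binom{N}{n}} = \frac{\binom{N}{n}\binom{0}{0}}{\binom{N}{n}} = 1$$

Only the k = n term survives, because $\binom{0}{n-k} = 0$ whenever k < n. A P-value of 1 means the observation is certain under the background distribution, which is why such a term is never significant.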
Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing frameworks.

