How TermFinder calculates P-values

Readme: MGI GO Term Finder

The GoTermFinder attempts to determine whether an observed level of annotation for a group of genes is significant within the context of annotation for all genes within the genome.  Suppose that we have a total population of N genes, in which M have a particular annotation.   If we observe x genes with that annotation, in a sample of n genes, then we can calculate the probability of that observation, using the hypergeometric distribution (eg, see http://mathworld.wolfram.com/HypergeometricDistribution.html ) as:

P-value is the probability or chance of seeing at least x number of genes out of the total n genes in the list annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO Term. That is, the GO terms shared by the genes in the user's list are compared to the background distribution of annotation. The closer the p-value is to zero, the more significant the particular GO term associated with the group of genes is (i.e. the less likely the observed annotation of the particular GO term to a group of genes occurs by chance).

In other words, when searching the process ontology, if all of the genes in a group were associated with "DNA repair", this term would be significant. However, since all genes in the genome (with GO annotations) are indirectly associated with the top level term "biological_process", this would not be significant if all the genes in a group were associated with this very high level term.

Multiple comparisons problem

Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing frameworks.

Hypergeometric distribution的更多相关文章

  1. Study notes for Discrete Probability Distribution

    The Basics of Probability Probability measures the amount of uncertainty of an event: a fact whose o ...

  2. 常见的概率分布类型(二)(Probability Distribution II)

    以下是几种常见的离散型概率分布和连续型概率分布类型: 伯努利分布(Bernoulli Distribution):常称为0-1分布,即它的随机变量只取值0或者1. 伯努利试验是单次随机试验,只有&qu ...

  3. NLP&数据挖掘基础知识

    Basis(基础): SSE(Sum of Squared Error, 平方误差和) SAE(Sum of Absolute Error, 绝对误差和) SRE(Sum of Relative Er ...

  4. 《量化投资:以MATLAB为工具》连载(2)基础篇-N分钟学会MATLAB(中)

    http://www.matlabsky.com/thread-43937-1-1.html   <量化投资:以MATLAB为工具>连载(3)基础篇-N分钟学会MATLAB(下)     ...

  5. 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Final

    Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  6. 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Midterm

    Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  7. 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Section 2 Random sampling with and without replacement

    Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  8. 加州大学伯克利分校Stat2.3x Inference 统计推断学习笔记: Section 2 Testing Statistical Hypotheses

    Stat2.3x Inference(统计推断)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  9. R代码展示各种统计学分布 | 生物信息学举例

    二项分布 | Binomial distribution 泊松分布 | Poisson Distribution 正态分布 | Normal Distribution | Gaussian distr ...

随机推荐

  1. 判断一个点是否在RotatedRect中

    openCV函数pointPolygonTest(): C++: double pointPolygonTest(InputArray contour, Point2f pt, bool measur ...

  2. css 元素居中

    css 4种常见实现元素居中的办法: 1.通过 margin 属性调整 : { position: absolute; top: 50%; left: 50%; margin-left: 盒子的一半: ...

  3. Codeforces Round #439 (Div. 2) Problem E (Codeforces 869E) - 暴力 - 随机化 - 二维树状数组 - 差分

    Adieu l'ami. Koyomi is helping Oshino, an acquaintance of his, to take care of an open space around ...

  4. Job for php-fpm.service failed because the control process exited with error code. See "systemctl status php-fpm.service" and "journalctl -xe" for details.

    [root@web01 ~]#  systemctl start php-fpm Job for php-fpm.service failed because the control process ...

  5. xlrd与xlwt的下载

    读excel表格: xlrd 地址:https://pypi.org/project/xlrd/ 下载语句:pip install xlrd 写excel表格: xlwt https://pypi.o ...

  6. topcoder srm 702 div1 -3

    1.给定一个$n*m$的矩阵,里面的数字是1到$n*m$的一个排列.一个$n*m$矩阵$A$对应一个$n*m$ 的01字符串,字符串的位置$i*m+j$为1当且仅当$A_{i,j}=i*m+j+1$. ...

  7. Bootstrap3基础 dropdown divider 下拉列表中的分割线

      内容 参数   OS   Windows 10 x64   browser   Firefox 65.0.2   framework     Bootstrap 3.3.7   editor    ...

  8. 学习dart从这里开始

    void main() { ; i < ; i++) { print('hello ${i + 1}'); } } // 定义个方法. printNumber(num aNumber) { pr ...

  9. CentOS 安装 MongoDB

    一.安装mongodb 本文介绍的安装方式是以二进制方式离线安装,相当于windows"绿色"安装版本的概念. 下载mongodb: # https://www.mongodb.c ...

  10. 【Dalston】【第六章】API服务网关(Zuul) 下

    Zuul给我们的第一印象通常是这样:它包含了对请求的路由和过滤两个功能,其中路由功能负责将外部请求转发到具体的微服务实例上,是实现外部访问统一入口的基础.过滤器功能则负责对请求的处理过程进行干预,是实 ...