Hypergeometric distribution
How TermFinder calculates P-values
The GoTermFinder attempts to determine whether an observed level of annotation for a group of genes is significant within the context of annotation for all genes within the genome. Suppose that we have a total population of N genes, in which M have a particular annotation. If we observe x genes with that annotation, in a sample of n genes, then we can calculate the probability of that observation, using the hypergeometric distribution (eg, see http://mathworld.wolfram.com/HypergeometricDistribution.html ) as:
P-value is the probability or chance of seeing at least x number of genes out of the total n genes in the list annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO Term. That is, the GO terms shared by the genes in the user's list are compared to the background distribution of annotation. The closer the p-value is to zero, the more significant the particular GO term associated with the group of genes is (i.e. the less likely the observed annotation of the particular GO term to a group of genes occurs by chance).
In other words, when searching the process ontology, if all of the genes in a group were associated with "DNA repair", this term would be significant. However, since all genes in the genome (with GO annotations) are indirectly associated with the top level term "biological_process", this would not be significant if all the genes in a group were associated with this very high level term.
Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing frameworks.


Hypergeometric distribution的更多相关文章
- Study notes for Discrete Probability Distribution
The Basics of Probability Probability measures the amount of uncertainty of an event: a fact whose o ...
- 常见的概率分布类型(二)(Probability Distribution II)
以下是几种常见的离散型概率分布和连续型概率分布类型: 伯努利分布(Bernoulli Distribution):常称为0-1分布,即它的随机变量只取值0或者1. 伯努利试验是单次随机试验,只有&qu ...
- NLP&数据挖掘基础知识
Basis(基础): SSE(Sum of Squared Error, 平方误差和) SAE(Sum of Absolute Error, 绝对误差和) SRE(Sum of Relative Er ...
- 《量化投资:以MATLAB为工具》连载(2)基础篇-N分钟学会MATLAB(中)
http://www.matlabsky.com/thread-43937-1-1.html <量化投资:以MATLAB为工具>连载(3)基础篇-N分钟学会MATLAB(下) ...
- 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Final
Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Midterm
Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Section 2 Random sampling with and without replacement
Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- 加州大学伯克利分校Stat2.3x Inference 统计推断学习笔记: Section 2 Testing Statistical Hypotheses
Stat2.3x Inference(统计推断)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- R代码展示各种统计学分布 | 生物信息学举例
二项分布 | Binomial distribution 泊松分布 | Poisson Distribution 正态分布 | Normal Distribution | Gaussian distr ...
随机推荐
- 20155201 网络攻防技术 实验九 Web安全基础
20155201 网络攻防技术 实验九 Web安全基础 一.实践内容 本实践的目标理解常用网络攻击技术的基本原理.Webgoat实践下相关实验. 二.报告内容: 1. 基础问题回答 1)SQL注入攻击 ...
- 20145315何佳蕾《网络对抗》Web安全基础
20145315何佳蕾<网络对抗>Web安全基础 1.实验后回答问题 (1)SQL注入攻击原理,如何防御 SQL Injection:就是通过把SQL命令插入到Web表单递交或输入域名或页 ...
- bzoj 2216 Lightning Conductor - 二分法 - 动态规划
题目传送门 需要root权限的传送门 题目大意 给定一个长度为$n$的数组,要求对每个$1 \leqslant i \leqslant n$找到最小整数的$p$,对于任意$j$满足使得$a_{i} + ...
- 网络存储结构简明分析—DAS、NAS和SAN 三者区别
存储的总体分类 主流存储结构 网络存储结构大致分为三种:直连式存储(DAS:Direct Attached Storage).存储区域网络(SAN:Storage Area Network ...
- macOS搭建开发环境
1.包管理器Homebrew使用下面的命令安装: ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/insta ...
- git删除远程分支文件,不改变本地文件
git提交项目时候踩的Git的坑 特别 由于准备春招,所以希望各位看客方便的话,能去github上面帮我Star一下项目 https://github.com/Draymonders/Campus-S ...
- nohup 让进程在后台可靠运行的几种方法
1. nohup nohup 无疑是我们首先想到的办法.顾名思义,nohup 的用途就是让提交的命令忽略 hangup 信号. nohup 的使用是十分方便的,只需在要处理的命令前加上 nohup 即 ...
- 浅谈IIS 和 asp.net的应用之间的关系
IIS可以理解为一个web服务器. 用于提供web相关的各种服务. IIS6.0中添加了一个新的功能, application pool. application pool的作用是将运行在同一个ser ...
- Latex 算法过长 分页显示方法
参考: Algorithm tag and page break Latex 算法过长 分页显示方法 1.引用algorithm包: 2.在\begin{document}前加上以下Latex代码: ...
- NET Core 指令启动
ASP.NET Core 是新一代的 ASP.NET,早期称为 ASP.NET vNext,并且在推出初期命名为ASP.NET 5,但随着 .NET Core 的成熟,以及 ASP.NET 5的命名会 ...