Pigeonhole principle: hash table collisions are unavoidable
https://en.wikipedia.org/wiki/Pigeonhole_principle
Sock-picking
Assume a drawer contains a mixture of black socks and blue socks, each of which can be worn on either foot, and that you are pulling a number of socks from the drawer without looking. What is the minimum number of pulled socks required to guarantee a pair of the same color? By the pigeonhole principle, with one pigeonhole per color (m = 2 holes), pulling only three socks from the drawer (n = 3 items) guarantees at least one pair of the same color. Either you have three of one color, or two of one color and one of the other.
Hand-shaking
If there are n people who can shake hands with one another (where n > 1), the pigeonhole principle shows that there is always a pair of people who shake hands with the same number of people. Here the 'holes' correspond to the number of hands shaken. Each person can shake hands with anywhere from 0 to n − 1 other people, which gives n possible values, but only n − 1 of them can be occupied: either the '0' hole or the 'n − 1' hole must be empty (if one person shakes hands with everybody, there cannot be another person who shakes hands with nobody; likewise, if one person shakes hands with no one, there cannot be a person who shakes hands with everybody). This leaves n people to be placed in at most n − 1 non-empty holes, guaranteeing duplication.
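The hand-shaking argument can be checked by brute force for small groups. The sketch below is only an illustration, not part of the quoted article: the function names `handshake_counts` and `always_has_duplicate` are made up here, and the check enumerates every possible set of handshakes, so it is only practical for very small n.

```python
from itertools import combinations


def handshake_counts(n, handshakes):
    """Return how many hands each of the n people shook."""
    counts = [0] * n
    for a, b in handshakes:
        counts[a] += 1
        counts[b] += 1
    return counts


def always_has_duplicate(n):
    """Check every possible set of handshakes among n people.

    Returns True if, in every configuration, at least two people
    shook the same number of hands.
    """
    pairs = list(combinations(range(n), 2))
    # Each bitmask selects one subset of the possible handshakes.
    for mask in range(1 << len(pairs)):
        chosen = [p for i, p in enumerate(pairs) if (mask >> i) & 1]
        counts = handshake_counts(n, chosen)
        if len(set(counts)) == n:  # all counts distinct: a counterexample
            return False
    return True


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, always_has_duplicate(n))
```

For n = 5 there are 10 possible handshakes and therefore 1,024 configurations; the search finds no counterexample, as the argument predicts.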
Hair-counting
We can demonstrate that there must be at least two people in London with the same number of hairs on their heads as follows.[4] Since a typical human head has an average of around 150,000 hairs, it is reasonable to assume (as an upper bound) that no one has more than 1,000,000 hairs on their head (m = 1 million holes). There are more than 1,000,000 people in London (n is bigger than 1 million items). If we assign a pigeonhole to each number of hairs on a person's head, and assign people to pigeonholes according to the number of hairs on their heads, then by the 1,000,001st assignment at least two people must have been assigned to the same pigeonhole, because they have the same number of hairs on their heads (that is, n > m). For the average case (m = 150,000), if we insist on the fewest possible overlaps, every pigeonhole receives at most one person until the 150,001st person, who must then share a pigeonhole with someone else. Without this constraint, there may be empty pigeonholes, because the "collision" happens before we reach the 150,001st person. The principle only proves the existence of an overlap; it says nothing about the number of overlaps (which falls under the subject of probability distribution).
There is a passing, satirical allusion in English to this version of the principle in A History of the Athenian Society, prefixed to "A Supplement to the Athenian Oracle: Being a Collection of the Remaining Questions and Answers in the Old Athenian Mercuries" (Printed for Andrew Bell, London, 1710).[5] It seems that the question "whether there were any two persons in the World that have an equal number of hairs on their head?" had been raised in The Athenian Mercury before 1704.[6][7]
Perhaps the first written reference to the pigeonhole principle appears in 1622 in a short sentence of the Latin work Selectæ Propositiones, by the French Jesuit Jean Leurechon,[8] where he wrote "It is necessary that two men have the same number of hairs, écus, or other things, as each other."[9]
The birthday problem
The birthday problem asks, for a set of n randomly chosen people, what is the probability that some pair of them will have the same birthday? By the pigeonhole principle, if there are 367 people in the room, we know that there is at least one pair who share the same birthday, as there are only 366 possible birthdays to choose from (including February 29, if present). The birthday "paradox" refers to the result that even in a group as small as 23 individuals, the probability that some pair shares a birthday is already slightly above 50%. While at first glance this may seem surprising, it makes intuitive sense once we consider that a comparison is made between every possible pair of people, rather than fixing one individual and comparing them solely to the rest of the group.
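The 23-person figure can be reproduced with a short calculation. The sketch below assumes birthdays are independent and uniformly distributed over the available days (365 by default), which is the usual simplification and not a claim about real birth data; the function name `birthday_collision_probability` is chosen here for illustration.

```python
def birthday_collision_probability(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming independent, uniformly distributed birthdays."""
    if n > days:
        # Pigeonhole principle: more people than days forces a shared birthday.
        return 1.0
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    print(birthday_collision_probability(22))  # just under 0.5
    print(birthday_collision_probability(23))  # roughly 0.507
    print(birthday_collision_probability(367, days=366))  # 1.0
```

The last call is the pigeonhole case: with 367 people and 366 possible birthdays the probability is exactly 1, whereas the 23-person result is a statement about probability only.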
https://zh.wikipedia.org/wiki/鸽巢原理
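Although the pigeonhole principle looks easy to understand, applying it sometimes yields interesting conclusions:
- For example: at least two people in Beijing have the same number of hairs on their heads.
- Proof: a typical person has around 150,000 hairs, so we can assume that nobody has more than 1,000,000 hairs, while the population of Beijing is greater than 1,000,000. If each pigeonhole corresponds to a hair count and each pigeon to a person, then more than 1,000,000 pigeons must go into at most 1,000,000 holes. Hence "at least two people in Beijing have the same number of hairs on their heads".
Another example:
- A box contains 10 black socks and 12 blue socks, and you need to take out a matching pair. Suppose you can only draw once: taking just 3 socks makes it impossible to avoid getting at least two of the same color, because there are only two colors (two pigeonholes) but three socks (three pigeons). Hence "taking out 3 socks guarantees a same-colored pair".
A less intuitive example:
- n people (at least 2) shake hands with one another (each picking partners freely); there must be two people who have shaken hands with the same number of people.
- Here the pigeonholes correspond to the number of people shaken hands with, and the pigeons to the people. Each person may have shaken hands with anywhere from 0 to n − 1 people, but 0 and n − 1 cannot both occur (if someone shakes hands with nobody, then nobody can have shaken hands with everyone else), so there are only n − 1 pigeonholes. With n people (n pigeons), the claim follows.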
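The pigeonhole principle has real applications in computing. For example, hash table collisions are unavoidable because the number of possible keys always exceeds the number of indices, and no algorithm, however clever, can get around this. The principle also proves that any lossless compression algorithm that makes some file smaller must make some other file larger; otherwise some information would necessarily be lost.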
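Both claims can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: `first_collision` and `strings_shorter_than` are hypothetical helper names, the "table" is just a dictionary from bucket index to key, and Python's built-in `hash` stands in for whatever hash function a real table would use.

```python
def first_collision(keys, num_buckets, hash_fn=hash):
    """Place keys into num_buckets slots and return the first collision
    as (earlier_key, later_key, bucket_index), or None if there is none."""
    occupied = {}
    for key in keys:
        index = hash_fn(key) % num_buckets
        if index in occupied:
            return occupied[index], key, index
        occupied[index] = key
    return None


def strings_shorter_than(n):
    """Number of distinct bit strings with length 0 .. n-1 (equals 2**n - 1)."""
    return sum(2 ** k for k in range(n))


if __name__ == "__main__":
    # 9 distinct keys, 8 buckets: a collision is guaranteed for ANY hash_fn.
    print(first_collision(range(9), 8))
    print(first_collision([f"key{i}" for i in range(9)], 8) is not None)

    # There are 2**8 = 256 bit strings of length 8 but only 255 shorter ones,
    # so no lossless scheme can map every 8-bit input to a shorter output.
    print(2 ** 8, strings_shorter_than(8))
```

With more keys than buckets the function can never return `None`, regardless of which hash function is passed in; a better hash function only changes where the collision occurs and how early. The second function is the counting argument behind the compression claim: there are fewer short strings than long ones, so some input must fail to shrink.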
