Python collections系列之计数器
计数器(counter)
Counter是对字典(无序)类型的补充,用于追踪值的出现次数。
使用counter需要导入 collections 类
ps:具备字典的所有功能 + 自己的功能
1、创建一个计数器
>>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
2、查看计数器变量
>>> print(obj)
Counter({'d': 5, 'f': 5, 's': 4, 'a': 3, 'c': 2, 'b': 2})
3、查看计数器可使用的方法
>>> dir(obj)
['__add__', '__and__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__iand__', '__init__', '__ior__', '__isub__', '__iter__', '__le__', '__len__', '__lt__', '__missing__', '__module__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__weakref__', '_keep_positive', 'clear', 'copy', 'elements', 'fromkeys', 'get', 'items', 'keys', 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update', 'values']
########################################################################
### Counter
######################################################################## class Counter(dict):
'''Dict subclass for counting hashable items. Sometimes called a bag
or multiset. Elements are stored as dictionary keys and their counts
are stored as dictionary values. >>> c = Counter('abcdeabcdabcaba') # count elements from a string >>> c.most_common(3) # three most common elements
[('a', 5), ('b', 4), ('c', 3)]
>>> sorted(c) # list all unique elements
['a', 'b', 'c', 'd', 'e']
>>> ''.join(sorted(c.elements())) # list elements with repetitions
'aaaaabbbbcccdde'
>>> sum(c.values()) # total of all counts >>> c['a'] # count of letter 'a'
>>> for elem in 'shazam': # update counts from an iterable
... c[elem] += 1 # by adding 1 to each element's count
>>> c['a'] # now there are seven 'a'
>>> del c['b'] # remove all 'b'
>>> c['b'] # now there are zero 'b' >>> d = Counter('simsalabim') # make another counter
>>> c.update(d) # add in the second counter
>>> c['a'] # now there are nine 'a' >>> c.clear() # empty the counter
>>> c
Counter() Note: If a count is set to zero or reduced to zero, it will remain
in the counter until the entry is deleted or the counter is cleared: >>> c = Counter('aaabbc')
>>> c['b'] -= 2 # reduce the count of 'b' by two
>>> c.most_common() # 'b' is still in, but its count is zero
[('a', 3), ('c', 1), ('b', 0)] '''
# References:
# http://en.wikipedia.org/wiki/Multiset
# http://www.gnu.org/software/smalltalk/manual-base/html_node/Bag.html
# http://www.demo2s.com/Tutorial/Cpp/0380__set-multiset/Catalog0380__set-multiset.htm
# http://code.activestate.com/recipes/259174/
# Knuth, TAOCP Vol. II section 4.6.3 def __init__(self, iterable=None, **kwds):
'''Create a new, empty Counter object. And if given, count elements
from an input iterable. Or, initialize the count from another mapping
of elements to their counts. >>> c = Counter() # a new, empty counter
>>> c = Counter('gallahad') # a new counter from an iterable
>>> c = Counter({'a': 4, 'b': 2}) # a new counter from a mapping
>>> c = Counter(a=4, b=2) # a new counter from keyword args '''
super(Counter, self).__init__()
self.update(iterable, **kwds) def __missing__(self, key):
""" 对于不存在的元素,返回计数器为0 """
'The count of elements not in the Counter is zero.'
# Needed so that self[missing_item] does not raise KeyError
return 0 def most_common(self, n=None):
""" 数量大于等n的所有元素和计数器 """
'''List the n most common elements and their counts from the most
common to the least. If n is None, then list all element counts. >>> Counter('abcdeabcdabcaba').most_common(3)
[('a', 5), ('b', 4), ('c', 3)] '''
# Emulate Bag.sortedByCount from Smalltalk
if n is None:
return sorted(self.iteritems(), key=_itemgetter(1), reverse=True)
return _heapq.nlargest(n, self.iteritems(), key=_itemgetter(1)) def elements(self):
""" 计数器中的所有元素,注:此处非所有元素集合,而是包含所有元素集合的迭代器 """
'''Iterator over elements repeating each as many times as its count. >>> c = Counter('ABCABC')
>>> sorted(c.elements())
['A', 'A', 'B', 'B', 'C', 'C'] # Knuth's example for prime factors of 1836: 2**2 * 3**3 * 17**1
>>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
>>> product = 1
>>> for factor in prime_factors.elements(): # loop over factors
... product *= factor # and multiply them
>>> product Note, if an element's count has been set to zero or is a negative
number, elements() will ignore it. '''
# Emulate Bag.do from Smalltalk and Multiset.begin from C++.
return _chain.from_iterable(_starmap(_repeat, self.iteritems())) # Override dict methods where necessary @classmethod
def fromkeys(cls, iterable, v=None):
# There is no equivalent method for counters because setting v=1
# means that no element can have a count greater than one.
raise NotImplementedError(
'Counter.fromkeys() is undefined. Use Counter(iterable) instead.') def update(self, iterable=None, **kwds):
""" 更新计数器,其实就是增加;如果原来没有,则新建,如果有则加一 """
'''Like dict.update() but add counts instead of replacing them. Source can be an iterable, a dictionary, or another Counter instance. >>> c = Counter('which')
>>> c.update('witch') # add elements from another iterable
>>> d = Counter('watch')
>>> c.update(d) # add elements from another counter
>>> c['h'] # four 'h' in which, witch, and watch '''
# The regular dict.update() operation makes no sense here because the
# replace behavior results in the some of original untouched counts
# being mixed-in with all of the other counts for a mismash that
# doesn't have a straight-forward interpretation in most counting
# contexts. Instead, we implement straight-addition. Both the inputs
# and outputs are allowed to contain zero and negative counts. if iterable is not None:
if isinstance(iterable, Mapping):
if self:
self_get = self.get
for elem, count in iterable.iteritems():
self[elem] = self_get(elem, 0) + count
else:
super(Counter, self).update(iterable) # fast path when counter is empty
else:
self_get = self.get
for elem in iterable:
self[elem] = self_get(elem, 0) + 1
if kwds:
self.update(kwds) def subtract(self, iterable=None, **kwds):
""" 相减,原来的计数器中的每一个元素的数量减去后添加的元素的数量 """
'''Like dict.update() but subtracts counts instead of replacing them.
Counts can be reduced below zero. Both the inputs and outputs are
allowed to contain zero and negative counts. Source can be an iterable, a dictionary, or another Counter instance. >>> c = Counter('which')
>>> c.subtract('witch') # subtract elements from another iterable
>>> c.subtract(Counter('watch')) # subtract elements from another counter
>>> c['h'] # 2 in which, minus 1 in witch, minus 1 in watch
>>> c['w'] # 1 in which, minus 1 in witch, minus 1 in watch
-1 '''
if iterable is not None:
self_get = self.get
if isinstance(iterable, Mapping):
for elem, count in iterable.items():
self[elem] = self_get(elem, 0) - count
else:
for elem in iterable:
self[elem] = self_get(elem, 0) - 1
if kwds:
self.subtract(kwds) def copy(self):
""" 拷贝 """
'Return a shallow copy.'
return self.__class__(self) def __reduce__(self):
""" 返回一个元组(类型,元组) """
return self.__class__, (dict(self),) def __delitem__(self, elem):
""" 删除元素 """
'Like dict.__delitem__() but does not raise KeyError for missing values.'
if elem in self:
super(Counter, self).__delitem__(elem) def __repr__(self):
if not self:
return '%s()' % self.__class__.__name__
items = ', '.join(map('%r: %r'.__mod__, self.most_common()))
return '%s({%s})' % (self.__class__.__name__, items) # Multiset-style mathematical operations discussed in:
# Knuth TAOCP Volume II section 4.6.3 exercise 19
# and at http://en.wikipedia.org/wiki/Multiset
#
# Outputs guaranteed to only include positive counts.
#
# To strip negative and zero counts, add-in an empty counter:
# c += Counter() def __add__(self, other):
'''Add counts from two counters. >>> Counter('abbb') + Counter('bcc')
Counter({'b': 4, 'c': 2, 'a': 1}) '''
if not isinstance(other, Counter):
return NotImplemented
result = Counter()
for elem, count in self.items():
newcount = count + other[elem]
if newcount > 0:
result[elem] = newcount
for elem, count in other.items():
if elem not in self and count > 0:
result[elem] = count
return result def __sub__(self, other):
''' Subtract count, but keep only results with positive counts. >>> Counter('abbbc') - Counter('bccd')
Counter({'b': 2, 'a': 1}) '''
if not isinstance(other, Counter):
return NotImplemented
result = Counter()
for elem, count in self.items():
newcount = count - other[elem]
if newcount > 0:
result[elem] = newcount
for elem, count in other.items():
if elem not in self and count < 0:
result[elem] = 0 - count
return result def __or__(self, other):
'''Union is the maximum of value in either of the input counters. >>> Counter('abbb') | Counter('bcc')
Counter({'b': 3, 'c': 2, 'a': 1}) '''
if not isinstance(other, Counter):
return NotImplemented
result = Counter()
for elem, count in self.items():
other_count = other[elem]
newcount = other_count if count < other_count else count
if newcount > 0:
result[elem] = newcount
for elem, count in other.items():
if elem not in self and count > 0:
result[elem] = count
return result def __and__(self, other):
''' Intersection is the minimum of corresponding counts. >>> Counter('abbb') & Counter('bcc')
Counter({'b': 1}) '''
if not isinstance(other, Counter):
return NotImplemented
result = Counter()
for elem, count in self.items():
other_count = other[elem]
newcount = count if count < other_count else other_count
if newcount > 0:
result[elem] = newcount
return result
Counter
4、常用的计数器操作
# most_common 输出出现次数最多的字符 >>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
>>> print(obj.most_common(2))
[('d', 5), ('f', 5)]
# items 计数器以k/v方式输出统计结果 import collections obj = collections.Counter('aaabbccsdfsdfdfsdfsdf') for k,v in obj.items():
print(k,v) 输出结果:
f 5
s 4
d 5
a 3
c 2
b 2
# elements 把所有的元素拿到,并进行统计 >>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
>>> print(obj.elements)
<bound method Counter.elements of Counter({'d': 5, 'f': 5, 's': 4, 'a': 3, 'c': 2, 'b': 2})>
# update 更新计数器中的字符串或字符 import collections obj = collections.Counter(['', '', '', ''])
print(obj)
obj.update(['eric', '', ''])
print(obj) 输出结果:
Counter({'': 2, '': 1, '': 1})
Counter({'': 3, '': 2, '': 1, 'eric': 1})
# subtract 删除计数器中的字符串 import collections obj = collections.Counter(['', '', '', ''])
print(obj)
obj.update(['eric', '', ''])
print(obj) obj.subtract(['eric', '', '', 'root'])
print(obj) 输出结果:
Counter({'': 2, '': 1, '': 1})
Counter({'': 3, '': 2, '': 1, 'eric': 1})
Counter({'': 2, '': 1, '': 1, 'eric': 0, 'root': -1})
Python collections系列之计数器的更多相关文章
- Python collections系列之有序字典
有序字典(orderedDict ) orderdDict是对字典类型的补充,他记住了字典元素添加的顺序 1.创建一个有序字典 import collections dic = collections ...
- Python collections系列之单向队列
单向队列(deque) 单项队列(先进先出 FIFO ) 1.创建单向队列 import queue q = queue.Queue() q.put(') q.put('evescn') 2.查看单向 ...
- Python collections系列之双向队列
双向队列(deque) 一个线程安全的双向队列 1.创建一个双向队列 import collections d = collections.deque() d.append(') d.appendle ...
- Python collections系列之可命名元组
可命名元组(namedtuple) 根据nametuple可以创建一个包含tuple所有功能以及其他功能的类 1.创建一个坐标类 import collections # 创建类, defaultd ...
- Python collections系列之默认字典
默认字典(defaultdict) defaultdict是对字典的类型的补充,它默认给字典的值设置了一个类型. 1.创建默认字典 import collections dic = collecti ...
- python递归、collections系列以及文件操作进阶
global log 127.0.0.1 local2 daemon maxconn log 127.0.0.1 local2 info defaults log global mode http t ...
- Python 第三篇(下):collections系列、集合(set)、单双队列、深浅copy、内置函数
一.collections系列: collections其实是python的标准库,也就是python的一个内置模块,因此使用之前导入一下collections模块即可,collections在py ...
- Python之set集合与collections系列
1>set集合:是一个无序且不重复的元素集合:访问速度快,解决了重复的问题: s2 = set(["che","liu","haha" ...
- Python collections模块总结
Python collections模块总结 除了我们使用的那些基础的数据结构,还有包括其它的一些模块提供的数据结构,有时甚至比基础的数据结构还要好用. collections ChainMap 这是 ...
随机推荐
- Android Opencv NativeCameraView error in 5.0 lollipop versions (Bug #4185)
https://github.com/opencv/opencv/wiki http://code.opencv.org/issues/4185 Hello, I finally get a ride ...
- iMX6 yocto平台QT交叉编译环境搭建
转:https://blog.csdn.net/morixinguan/article/details/79351909 . /opt/fsl-imx-fb/4.9.11-1.0.0/environm ...
- class_alias--为一个类创建别名
class_alias--为一个类创建别名 bool class_alias ( string $original , string $alias [, bool $autoload = TRUE ] ...
- Django 详解<二> 之url和view
Django URL(路由系统) RL配置(URLconf)就像Django 所支撑网站的目录.它的本质是URL模式以及要为该URL模式调用的视图函数之间的映射表:你就是以这种方式告诉Django,对 ...
- INSPIRED启示录 读书笔记 - 第5章 产品管理与软件开发
保持融洽的合作关系 形成合作关系的关键是双方承认彼此平等——任何一方不从属于另一方,产品经理负责定义正确的产品,开发团队负责正确地开发产品,双方相互依赖 产品经理要求开发团队完成任务,必须先取得他们的 ...
- 关于v4l2的一点变更
先打个连接 http://linuxtv.org/downloads/presentations/media_ws_2013/v4l2-multi-format.pdf 2013年linux 多媒体构 ...
- js 元素高度宽度整理
1.1只读属性 所谓的只读属性指的是DOM节点的固有属性,该属性只能通过js去获取而不能通过js去设置,而且获取的值是只有数字并不带单位的(px,em等),如下: 1)clientWidth和clie ...
- 【arc101】比赛记录
这场还好切出了D,rt应该能涨,然而这场的题有点毒瘤,700分的D没多少人切,更别说EF了.(暴打出题人)既然这样,干脆就水一篇博客,做个简单的比赛记录. C - Candles 这题是一道一眼题,花 ...
- java基础(2)-面向对象(2)
构造方法 构造方法特点 方法名与类名相同 方法名前没有返回值类型的声明(void也没有) 方法中不能使用return语句返回一个值 创建对象时自动调用并执行 如果类中没有自定义构造方法,则java调用 ...
- UOJ104 【APIO2014】Split the sequence
本文版权归ljh2000和博客园共有,欢迎转载,但须保留此声明,并给出原文链接,谢谢合作. 本文作者:ljh2000 作者博客:http://www.cnblogs.com/ljh2000-jump/ ...