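Python collections Series: Counter
Counter
Counter is a dict subclass that supplements the dictionary type; it tracks how many times each value occurs.
To use Counter, import the collections module.
P.S. It has all the features of a dict, plus features of its own.
1. Creating a counter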
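The "dict features plus its own" point can be sketched as follows. The sample words and variable names are made up for illustration and are not from the original article.

```python
from collections import Counter

# Hypothetical sample data, used only for illustration.
word_counts = Counter(['spam', 'egg', 'spam'])

# Counter is a dict subclass, so the usual dict operations work.
is_dict = isinstance(word_counts, dict)
spam_count = word_counts['spam']
keys = sorted(word_counts.keys())

# Unlike a plain dict, looking up a missing key returns 0 instead of
# raising KeyError, and the lookup itself does not add the key.
missing_count = word_counts['ham']
has_ham = 'ham' in word_counts

print(is_dict, spam_count, keys, missing_count, has_ham)
```

The zero for a missing key comes from the `__missing__` hook, which is why counting code can use `c[x] += 1` without checking for the key first.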
>>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
2. Viewing the counter
>>> print(obj)
Counter({'d': 5, 'f': 5, 's': 4, 'a': 3, 'c': 2, 'b': 2})
3. Listing the methods available on a counter
>>> dir(obj)
['__add__', '__and__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__iand__', '__init__', '__ior__', '__isub__', '__iter__', '__le__', '__len__', '__lt__', '__missing__', '__module__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__weakref__', '_keep_positive', 'clear', 'copy', 'elements', 'fromkeys', 'get', 'items', 'keys', 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update', 'values']
########################################################################
###  Counter
########################################################################

class Counter(dict):
    '''Dict subclass for counting hashable items.  Sometimes called a bag
    or multiset.  Elements are stored as dictionary keys and their counts
    are stored as dictionary values.

    >>> c = Counter('abcdeabcdabcaba')  # count elements from a string

    >>> c.most_common(3)                # three most common elements
    [('a', 5), ('b', 4), ('c', 3)]
    >>> sorted(c)                       # list all unique elements
    ['a', 'b', 'c', 'd', 'e']
    >>> ''.join(sorted(c.elements()))   # list elements with repetitions
    'aaaaabbbbcccdde'
    >>> sum(c.values())                 # total of all counts
    15

    >>> c['a']                          # count of letter 'a'
    5
    >>> for elem in 'shazam':           # update counts from an iterable
    ...     c[elem] += 1                # by adding 1 to each element's count
    >>> c['a']                          # now there are seven 'a'
    7
    >>> del c['b']                      # remove all 'b'
    >>> c['b']                          # now there are zero 'b'
    0

    >>> d = Counter('simsalabim')       # make another counter
    >>> c.update(d)                     # add in the second counter
    >>> c['a']                          # now there are nine 'a'
    9

    >>> c.clear()                       # empty the counter
    >>> c
    Counter()

    Note:  If a count is set to zero or reduced to zero, it will remain
    in the counter until the entry is deleted or the counter is cleared:

    >>> c = Counter('aaabbc')
    >>> c['b'] -= 2                     # reduce the count of 'b' by two
    >>> c.most_common()                 # 'b' is still in, but its count is zero
    [('a', 3), ('c', 1), ('b', 0)]

    '''
    # References:
    #   http://en.wikipedia.org/wiki/Multiset
    #   http://www.gnu.org/software/smalltalk/manual-base/html_node/Bag.html
    #   http://www.demo2s.com/Tutorial/Cpp/0380__set-multiset/Catalog0380__set-multiset.htm
    #   http://code.activestate.com/recipes/259174/
    #   Knuth, TAOCP Vol. II section 4.6.3

    def __init__(self, iterable=None, **kwds):
        '''Create a new, empty Counter object.  And if given, count elements
        from an input iterable.  Or, initialize the count from another mapping
        of elements to their counts.

        >>> c = Counter()                           # a new, empty counter
        >>> c = Counter('gallahad')                 # a new counter from an iterable
        >>> c = Counter({'a': 4, 'b': 2})           # a new counter from a mapping
        >>> c = Counter(a=4, b=2)                   # a new counter from keyword args

        '''
        super(Counter, self).__init__()
        self.update(iterable, **kwds)

    def __missing__(self, key):
        """ For an element that is not present, return a count of 0 """
        'The count of elements not in the Counter is zero.'
        # Needed so that self[missing_item] does not raise KeyError
        return 0

    def most_common(self, n=None):
        """ The n most common elements and their counts """
        '''List the n most common elements and their counts from the most
        common to the least.  If n is None, then list all element counts.

        >>> Counter('abcdeabcdabcaba').most_common(3)
        [('a', 5), ('b', 4), ('c', 3)]

        '''
        # Emulate Bag.sortedByCount from Smalltalk
        if n is None:
            return sorted(self.iteritems(), key=_itemgetter(1), reverse=True)
        return _heapq.nlargest(n, self.iteritems(), key=_itemgetter(1))

    def elements(self):
        """ All elements in the counter. Note: this is not a collection of the elements but an iterator over them """
        '''Iterator over elements repeating each as many times as its count.

        >>> c = Counter('ABCABC')
        >>> sorted(c.elements())
        ['A', 'A', 'B', 'B', 'C', 'C']

        # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
        >>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
        >>> product = 1
        >>> for factor in prime_factors.elements():     # loop over factors
        ...     product *= factor                       # and multiply them
        >>> product
        1836

        Note, if an element's count has been set to zero or is a negative
        number, elements() will ignore it.

        '''
        # Emulate Bag.do from Smalltalk and Multiset.begin from C++.
        return _chain.from_iterable(_starmap(_repeat, self.iteritems()))

    # Override dict methods where necessary

    @classmethod
    def fromkeys(cls, iterable, v=None):
        # There is no equivalent method for counters because setting v=1
        # means that no element can have a count greater than one.
        raise NotImplementedError(
            'Counter.fromkeys() is undefined.  Use Counter(iterable) instead.')

    def update(self, iterable=None, **kwds):
        """ Update the counter by adding counts: a missing element is created, an existing one is incremented """
        '''Like dict.update() but add counts instead of replacing them.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.update('witch')           # add elements from another iterable
        >>> d = Counter('watch')
        >>> c.update(d)                 # add elements from another counter
        >>> c['h']                      # four 'h' in which, witch, and watch
        4

        '''
        # The regular dict.update() operation makes no sense here because the
        # replace behavior results in the some of original untouched counts
        # being mixed-in with all of the other counts for a mismash that
        # doesn't have a straight-forward interpretation in most counting
        # contexts.  Instead, we implement straight-addition.  Both the inputs
        # and outputs are allowed to contain zero and negative counts.

        if iterable is not None:
            if isinstance(iterable, Mapping):
                if self:
                    self_get = self.get
                    for elem, count in iterable.iteritems():
                        self[elem] = self_get(elem, 0) + count
                else:
                    super(Counter, self).update(iterable) # fast path when counter is empty
            else:
                self_get = self.get
                for elem in iterable:
                    self[elem] = self_get(elem, 0) + 1
        if kwds:
            self.update(kwds)

    def subtract(self, iterable=None, **kwds):
        """ Subtract: reduce the count of each element in the counter by its count in the given input """
        '''Like dict.update() but subtracts counts instead of replacing them.
        Counts can be reduced below zero.  Both the inputs and outputs are
        allowed to contain zero and negative counts.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.subtract('witch')             # subtract elements from another iterable
        >>> c.subtract(Counter('watch'))    # subtract elements from another counter
        >>> c['h']                          # 2 in which, minus 1 in witch, minus 1 in watch
        0
        >>> c['w']                          # 1 in which, minus 1 in witch, minus 1 in watch
        -1

        '''
        if iterable is not None:
            self_get = self.get
            if isinstance(iterable, Mapping):
                for elem, count in iterable.items():
                    self[elem] = self_get(elem, 0) - count
            else:
                for elem in iterable:
                    self[elem] = self_get(elem, 0) - 1
        if kwds:
            self.subtract(kwds)

    def copy(self):
        """ Copy """
        'Return a shallow copy.'
        return self.__class__(self)

    def __reduce__(self):
        """ Return a tuple (class, tuple of constructor arguments) """
        return self.__class__, (dict(self),)

    def __delitem__(self, elem):
        """ Delete an element """
        'Like dict.__delitem__() but does not raise KeyError for missing values.'
        if elem in self:
            super(Counter, self).__delitem__(elem)

    def __repr__(self):
        if not self:
            return '%s()' % self.__class__.__name__
        items = ', '.join(map('%r: %r'.__mod__, self.most_common()))
        return '%s({%s})' % (self.__class__.__name__, items)

    # Multiset-style mathematical operations discussed in:
    #       Knuth TAOCP Volume II section 4.6.3 exercise 19
    #       and at http://en.wikipedia.org/wiki/Multiset
    #
    # Outputs guaranteed to only include positive counts.
    #
    # To strip negative and zero counts, add-in an empty counter:
    #       c += Counter()

    def __add__(self, other):
        '''Add counts from two counters.

        >>> Counter('abbb') + Counter('bcc')
        Counter({'b': 4, 'c': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            newcount = count + other[elem]
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count > 0:
                result[elem] = count
        return result

    def __sub__(self, other):
        ''' Subtract count, but keep only results with positive counts.

        >>> Counter('abbbc') - Counter('bccd')
        Counter({'b': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            newcount = count - other[elem]
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count < 0:
                result[elem] = 0 - count
        return result

    def __or__(self, other):
        '''Union is the maximum of value in either of the input counters.

        >>> Counter('abbb') | Counter('bcc')
        Counter({'b': 3, 'c': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            other_count = other[elem]
            newcount = other_count if count < other_count else count
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count > 0:
                result[elem] = count
        return result

    def __and__(self, other):
        ''' Intersection is the minimum of corresponding counts.

        >>> Counter('abbb') & Counter('bcc')
        Counter({'b': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            other_count = other[elem]
            newcount = count if count < other_count else other_count
            if newcount > 0:
                result[elem] = newcount
        return result
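The multiset operators defined in the source above (`+`, `-`, `|`, `&`) can be tried directly. The following is a minimal sketch that reuses the inputs from the docstrings; the variable names are illustrative only.

```python
from collections import Counter

left = Counter('abbb')    # a: 1, b: 3
right = Counter('bcc')    # b: 1, c: 2

added = left + right          # counts are summed
subtracted = left - right     # counts are subtracted; only positive results are kept
union = left | right          # maximum of the two counts per element
intersection = left & right   # minimum of the two counts per element

# A counter may hold zero or negative counts (for example after subtract()).
# Adding an empty counter strips them, as the source comment suggests.
mixed = Counter(a=2, b=0, c=-1)
positive_only = mixed + Counter()

print(added, subtracted, union, intersection, positive_only)
```

All four operators return a new Counter and leave their operands unchanged, unlike `update()` and `subtract()`, which modify the counter in place.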
4. Common counter operations
# most_common: output the most frequent characters
>>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
>>> print(obj.most_common(2))
[('d', 5), ('f', 5)]
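A short sketch of two related `most_common` idioms, using the same input string as above: calling it without an argument returns every element sorted by count, and slicing that list from the end gives the least common elements. The variable names are illustrative.

```python
from collections import Counter

obj = Counter('aaabbccsdfsdfdfsdfsdf')

# The two most frequent characters, as in the example above.
top_two = obj.most_common(2)

# Without an argument, every (element, count) pair is returned,
# ordered from the highest count to the lowest.
all_sorted = obj.most_common()

# Reverse slice: the two least frequent characters, least common first.
least_two = obj.most_common()[:-3:-1]

print(top_two)
print(all_sorted)
print(least_two)
```

How elements with equal counts are ordered relative to each other depends on the Python version, so code should not rely on the order of ties.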
# items: output the counts as key/value pairs
import collections
obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
for k, v in obj.items():
    print(k, v)
Output:
f 5
s 4
d 5
a 3
c 2
b 2
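# elements: returns an iterator that yields each element as many times as its count
>>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
>>> print(obj.elements)    # without parentheses this only prints the bound method
<bound method Counter.elements of Counter({'d': 5, 'f': 5, 's': 4, 'a': 3, 'c': 2, 'b': 2})>
>>> print(sorted(obj.elements()))    # call the method to get the elements themselves
['a', 'a', 'a', 'b', 'b', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'f', 'f', 'f', 'f', 'f', 's', 's', 's', 's']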
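As a small sketch of how `elements()` behaves when it is actually called: it returns a lazy iterator, the number of items it yields equals the sum of the counts, and elements whose count is zero or negative are skipped. The variable names below are illustrative.

```python
from collections import Counter

obj = Counter('aaabbccsdfsdfdfsdfsdf')

# elements() is an iterator, so materialize it to inspect the contents.
expanded = sorted(obj.elements())
total = sum(obj.values())

# Elements with a zero or negative count are ignored by elements(),
# even though their keys are still stored in the counter.
obj['a'] = 0
obj['b'] = -1
remaining = set(obj.elements())
still_has_a = 'a' in obj

print(len(expanded), total, sorted(remaining), still_has_a)
```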
# update: add the counts of the given strings or characters to the counter
import collections
obj = collections.Counter(['', '', '', ''])
print(obj)
obj.update(['eric', '', ''])
print(obj)
Output:
Counter({'': 2, '': 1, '': 1})
Counter({'': 3, '': 2, '': 1, 'eric': 1})
# subtract: decrement the counts of the given strings; entries are not removed, and counts can reach zero or go negative
import collections
obj = collections.Counter(['', '', '', ''])
print(obj)
obj.update(['eric', '', ''])
print(obj)
obj.subtract(['eric', '', '', 'root'])
print(obj)
Output:
Counter({'': 2, '': 1, '': 1})
Counter({'': 3, '': 2, '': 1, 'eric': 1})
Counter({'': 2, '': 1, '': 1, 'eric': 0, 'root': -1})
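A self-contained sketch of the same update/subtract sequence follows. Apart from 'eric' and 'root', the names here ('alice', 'bob', 'carol') are hypothetical placeholders chosen for this illustration, not the values used in the original example.

```python
from collections import Counter

# Hypothetical names; 'bob' appears twice in the initial list.
obj = Counter(['alice', 'bob', 'carol', 'bob'])

# update() adds one to the count of every item it is given.
obj.update(['eric', 'alice', 'bob'])
after_update = dict(obj)

# subtract() takes one away per item; it never deletes a key,
# so 'eric' stays with a count of 0 and 'root' appears with -1.
obj.subtract(['eric', 'alice', 'bob', 'root'])
after_subtract = dict(obj)
eric_still_present = 'eric' in obj

# To really drop an entry, delete it explicitly.
del obj['eric']
eric_after_delete = 'eric' in obj

print(after_update)
print(after_subtract)
print(eric_still_present, eric_after_delete)
```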