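Python collections series: Counter
Counter
Counter supplements the dict type (unordered before Python 3.7) and tracks how many times each value occurs.
To use Counter, import the collections module.
PS: it has all the features of a dict plus its own.
1. Create a counter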
>>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
2. Inspect the counter variable
>>> print(obj)
Counter({'d': 5, 'f': 5, 's': 4, 'a': 3, 'c': 2, 'b': 2})
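The note that a counter has every dict feature plus its own can be checked directly. A minimal sketch, reusing the same input string; the variable names are only illustrative:

```python
import collections

obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')

# Counter is a dict subclass, so ordinary dict operations work.
is_dict = isinstance(obj, dict)
count_a = obj['a']             # normal key lookup
keys = sorted(obj.keys())      # the distinct elements, like dict keys

# Counter-specific behaviour: a missing key counts as 0
# instead of raising KeyError, and the lookup does not add the key.
count_missing = obj['z']

print(is_dict, count_a, keys, count_missing)
```

Looking up a missing key does not insert it, so `'z' in obj` is still False afterwards.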
3. List the methods available on a counter
>>> dir(obj)
['__add__', '__and__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__iand__', '__init__', '__ior__', '__isub__', '__iter__', '__le__', '__len__', '__lt__', '__missing__', '__module__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__weakref__', '_keep_positive', 'clear', 'copy', 'elements', 'fromkeys', 'get', 'items', 'keys', 'most_common', 'pop', 'popitem', 'setdefault', 'subtract', 'update', 'values']
########################################################################
###  Counter
########################################################################

class Counter(dict):
    '''Dict subclass for counting hashable items.  Sometimes called a bag
    or multiset.  Elements are stored as dictionary keys and their counts
    are stored as dictionary values.

    >>> c = Counter('abcdeabcdabcaba')  # count elements from a string

    >>> c.most_common(3)                # three most common elements
    [('a', 5), ('b', 4), ('c', 3)]
    >>> sorted(c)                       # list all unique elements
    ['a', 'b', 'c', 'd', 'e']
    >>> ''.join(sorted(c.elements()))   # list elements with repetitions
    'aaaaabbbbcccdde'
    >>> sum(c.values())                 # total of all counts
    15

    >>> c['a']                          # count of letter 'a'
    5
    >>> for elem in 'shazam':           # update counts from an iterable
    ...     c[elem] += 1                # by adding 1 to each element's count
    >>> c['a']                          # now there are seven 'a'
    7
    >>> del c['b']                      # remove all 'b'
    >>> c['b']                          # now there are zero 'b'
    0

    >>> d = Counter('simsalabim')       # make another counter
    >>> c.update(d)                     # add in the second counter
    >>> c['a']                          # now there are nine 'a'
    9

    >>> c.clear()                       # empty the counter
    >>> c
    Counter()

    Note:  If a count is set to zero or reduced to zero, it will remain
    in the counter until the entry is deleted or the counter is cleared:

    >>> c = Counter('aaabbc')
    >>> c['b'] -= 2                     # reduce the count of 'b' by two
    >>> c.most_common()                 # 'b' is still in, but its count is zero
    [('a', 3), ('c', 1), ('b', 0)]

    '''
    # References:
    #   http://en.wikipedia.org/wiki/Multiset
    #   http://www.gnu.org/software/smalltalk/manual-base/html_node/Bag.html
    #   http://www.demo2s.com/Tutorial/Cpp/0380__set-multiset/Catalog0380__set-multiset.htm
    #   http://code.activestate.com/recipes/259174/
    #   Knuth, TAOCP Vol. II section 4.6.3

    def __init__(self, iterable=None, **kwds):
        '''Create a new, empty Counter object.  And if given, count elements
        from an input iterable.  Or, initialize the count from another mapping
        of elements to their counts.

        >>> c = Counter()                           # a new, empty counter
        >>> c = Counter('gallahad')                 # a new counter from an iterable
        >>> c = Counter({'a': 4, 'b': 2})           # a new counter from a mapping
        >>> c = Counter(a=4, b=2)                   # a new counter from keyword args

        '''
        super(Counter, self).__init__()
        self.update(iterable, **kwds)

    def __missing__(self, key):
        """ For an element that does not exist, return a count of 0 """
        'The count of elements not in the Counter is zero.'
        # Needed so that self[missing_item] does not raise KeyError
        return 0

    def most_common(self, n=None):
        """ The n most common elements and their counts """
        '''List the n most common elements and their counts from the most
        common to the least.  If n is None, then list all element counts.

        >>> Counter('abcdeabcdabcaba').most_common(3)
        [('a', 5), ('b', 4), ('c', 3)]

        '''
        # Emulate Bag.sortedByCount from Smalltalk
        if n is None:
            return sorted(self.iteritems(), key=_itemgetter(1), reverse=True)
        return _heapq.nlargest(n, self.iteritems(), key=_itemgetter(1))

    def elements(self):
        """ All elements in the counter. Note: this is not a collection of the elements but an iterator over them """
        '''Iterator over elements repeating each as many times as its count.

        >>> c = Counter('ABCABC')
        >>> sorted(c.elements())
        ['A', 'A', 'B', 'B', 'C', 'C']

        # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
        >>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
        >>> product = 1
        >>> for factor in prime_factors.elements():     # loop over factors
        ...     product *= factor                       # and multiply them
        >>> product
        1836

        Note, if an element's count has been set to zero or is a negative
        number, elements() will ignore it.

        '''
        # Emulate Bag.do from Smalltalk and Multiset.begin from C++.
        return _chain.from_iterable(_starmap(_repeat, self.iteritems()))

    # Override dict methods where necessary

    @classmethod
    def fromkeys(cls, iterable, v=None):
        # There is no equivalent method for counters because setting v=1
        # means that no element can have a count greater than one.
        raise NotImplementedError(
            'Counter.fromkeys() is undefined.  Use Counter(iterable) instead.')

    def update(self, iterable=None, **kwds):
        """ Update the counter by adding counts: a new element is created, an existing element has its count increased """
        '''Like dict.update() but add counts instead of replacing them.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.update('witch')           # add elements from another iterable
        >>> d = Counter('watch')
        >>> c.update(d)                 # add elements from another counter
        >>> c['h']                      # four 'h' in which, witch, and watch
        4

        '''
        # The regular dict.update() operation makes no sense here because the
        # replace behavior results in the some of original untouched counts
        # being mixed-in with all of the other counts for a mismash that
        # doesn't have a straight-forward interpretation in most counting
        # contexts.  Instead, we implement straight-addition.  Both the inputs
        # and outputs are allowed to contain zero and negative counts.

        if iterable is not None:
            if isinstance(iterable, Mapping):
                if self:
                    self_get = self.get
                    for elem, count in iterable.iteritems():
                        self[elem] = self_get(elem, 0) + count
                else:
                    super(Counter, self).update(iterable) # fast path when counter is empty
            else:
                self_get = self.get
                for elem in iterable:
                    self[elem] = self_get(elem, 0) + 1
        if kwds:
            self.update(kwds)

    def subtract(self, iterable=None, **kwds):
        """ Subtract: each element's count in the original counter is reduced by its count in the supplied elements """
        '''Like dict.update() but subtracts counts instead of replacing them.
        Counts can be reduced below zero.  Both the inputs and outputs are
        allowed to contain zero and negative counts.

        Source can be an iterable, a dictionary, or another Counter instance.

        >>> c = Counter('which')
        >>> c.subtract('witch')             # subtract elements from another iterable
        >>> c.subtract(Counter('watch'))    # subtract elements from another counter
        >>> c['h']                          # 2 in which, minus 1 in witch, minus 1 in watch
        0
        >>> c['w']                          # 1 in which, minus 1 in witch, minus 1 in watch
        -1

        '''
        if iterable is not None:
            self_get = self.get
            if isinstance(iterable, Mapping):
                for elem, count in iterable.items():
                    self[elem] = self_get(elem, 0) - count
            else:
                for elem in iterable:
                    self[elem] = self_get(elem, 0) - 1
        if kwds:
            self.subtract(kwds)

    def copy(self):
        """ Copy """
        'Return a shallow copy.'
        return self.__class__(self)

    def __reduce__(self):
        """ Return a tuple (class, tuple) """
        return self.__class__, (dict(self),)

    def __delitem__(self, elem):
        """ Delete an element """
        'Like dict.__delitem__() but does not raise KeyError for missing values.'
        if elem in self:
            super(Counter, self).__delitem__(elem)

    def __repr__(self):
        if not self:
            return '%s()' % self.__class__.__name__
        items = ', '.join(map('%r: %r'.__mod__, self.most_common()))
        return '%s({%s})' % (self.__class__.__name__, items)

    # Multiset-style mathematical operations discussed in:
    #       Knuth TAOCP Volume II section 4.6.3 exercise 19
    #       and at http://en.wikipedia.org/wiki/Multiset
    #
    # Outputs guaranteed to only include positive counts.
    #
    # To strip negative and zero counts, add-in an empty counter:
    #       c += Counter()

    def __add__(self, other):
        '''Add counts from two counters.

        >>> Counter('abbb') + Counter('bcc')
        Counter({'b': 4, 'c': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            newcount = count + other[elem]
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count > 0:
                result[elem] = count
        return result

    def __sub__(self, other):
        ''' Subtract count, but keep only results with positive counts.

        >>> Counter('abbbc') - Counter('bccd')
        Counter({'b': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            newcount = count - other[elem]
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count < 0:
                result[elem] = 0 - count
        return result

    def __or__(self, other):
        '''Union is the maximum of value in either of the input counters.

        >>> Counter('abbb') | Counter('bcc')
        Counter({'b': 3, 'c': 2, 'a': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            other_count = other[elem]
            newcount = other_count if count < other_count else count
            if newcount > 0:
                result[elem] = newcount
        for elem, count in other.items():
            if elem not in self and count > 0:
                result[elem] = count
        return result

    def __and__(self, other):
        ''' Intersection is the minimum of corresponding counts.

        >>> Counter('abbb') & Counter('bcc')
        Counter({'b': 1})

        '''
        if not isinstance(other, Counter):
            return NotImplemented
        result = Counter()
        for elem, count in self.items():
            other_count = other[elem]
            newcount = count if count < other_count else other_count
            if newcount > 0:
                result[elem] = newcount
        return result
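The multiset-style operators defined at the end of the Counter source (+, -, |, &) can be tried directly. A small sketch reusing the operands from the docstrings; the variable names are only illustrative:

```python
from collections import Counter

left = Counter('abbb')     # a: 1, b: 3
right = Counter('bcc')     # b: 1, c: 2

added = left + right          # counts are summed
subtracted = left - right     # only results with positive counts are kept
union = left | right          # maximum of the two counts per element
intersection = left & right   # minimum of the two counts per element

for name, result in [('+', added), ('-', subtracted),
                     ('|', union), ('&', intersection)]:
    print(name, dict(result))
```

All four operators return a new Counter and leave both operands unchanged.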
4. Common counter operations
# most_common: return the elements that occur most often
>>> import collections
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
>>> print(obj.most_common(2))
[('d', 5), ('f', 5)]
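most_common is not limited to single characters. A minimal sketch that counts words instead; the sample sentence is made up for illustration:

```python
from collections import Counter

# Hypothetical sample text, split into words.
words = 'the quick brown fox jumps over the lazy dog the end'.split()
word_counts = Counter(words)

top_one = word_counts.most_common(1)     # the single most frequent word
everything = word_counts.most_common()   # no argument: all items, highest count first

print(top_one)
print(len(everything))
```

Calling most_common() without an argument returns every element, sorted from the highest count to the lowest.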
# items: output the counts as key/value pairs
import collections
obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
for k, v in obj.items():
    print(k, v)
Output:
f 5
s 4
d 5
a 3
c 2
b 2
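# elements: get every element, repeated as many times as its count (printing obj.elements without calling it only shows the bound method)
>>> import collections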
>>> obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')
>>> print(obj.elements)
<bound method Counter.elements of Counter({'d': 5, 'f': 5, 's': 4, 'a': 3, 'c': 2, 'b': 2})>
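To see the elements themselves, the method has to be called. A minimal sketch with the same input string; sorted() is used here only to get a stable order for display:

```python
import collections

obj = collections.Counter('aaabbccsdfsdfdfsdfsdf')

# elements() returns an iterator; sorted() turns it into a list.
all_elements = sorted(obj.elements())
joined = ''.join(all_elements)

print(joined)
```

The number of elements produced equals the sum of all the counts.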
# update: add to the counts of the strings or characters in the counter
import collections
obj = collections.Counter(['', '', '', ''])
print(obj)
obj.update(['eric', '', ''])
print(obj)
Output:
Counter({'': 2, '': 1, '': 1})
Counter({'': 3, '': 2, '': 1, 'eric': 1})
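update accepts an iterable, a mapping, or keyword arguments. A minimal sketch of all three forms; the names 'alice', 'bob' and 'carol' are placeholder values assumed for this example, not the values used above:

```python
import collections

# 'alice', 'bob' and 'carol' are placeholder values chosen for this sketch.
obj = collections.Counter(['alice', 'bob', 'bob', 'carol'])

obj.update(['eric', 'alice', 'alice'])   # iterable: each occurrence adds 1
obj.update({'bob': 3})                   # mapping: the given counts are added
obj.update(carol=2)                      # keyword arguments work like a mapping

print(obj)
```

Unlike dict.update, existing counts are added to rather than replaced.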
# subtract: decrease the counts of the given strings (counts can drop to zero or below; the entries are not removed)
import collections
obj = collections.Counter(['', '', '', ''])
print(obj)
obj.update(['eric', '', ''])
print(obj)
obj.subtract(['eric', '', '', 'root'])
print(obj)
Output:
Counter({'': 2, '': 1, '': 1})
Counter({'': 3, '': 2, '': 1, 'eric': 1})
Counter({'': 2, '': 1, '': 1, 'eric': 0, 'root': -1})
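Entries with zero or negative counts stay in the counter after subtract. A minimal sketch of that behaviour and of one way to drop them, again with placeholder names assumed for this example:

```python
import collections

# 'alice', 'bob', 'carol' and 'root' are placeholder values chosen for this sketch.
obj = collections.Counter(['alice', 'bob', 'bob', 'carol'])
obj.subtract(['alice', 'bob', 'root'])

# 'alice' drops to 0 and 'root' to -1, but both keys remain.
after_subtract = dict(obj)

# Adding an empty Counter keeps only the entries with positive counts.
positive_only = obj + collections.Counter()

print(after_subtract)
print(dict(positive_only))
```

Adding an empty counter is the approach suggested in the comments of the Counter source; del obj[key] removes a single entry instead.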