A basic tutorial on Python's collections module

Published: 2025/4/5 python 22 豆豆

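Preface

Having covered Python's basic data types and data structures, let's look at something more advanced: collections. To find out what a module is for and which classes it offers, just read its __init__.py: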

'''This module implements specialized container datatypes providing
alternatives to Python's general purpose built-in containers, dict,
list, set, and tuple.

* namedtuple   factory function for creating tuple subclasses with named fields
* deque        list-like container with fast appends and pops on either end
* ChainMap     dict-like class for creating a single view of multiple mappings
* Counter      dict subclass for counting hashable objects
* OrderedDict  dict subclass that remembers the order entries were added
* defaultdict  dict subclass that calls a factory function to supply missing values
* UserDict     wrapper around dictionary objects for easier dict subclassing
* UserList     wrapper around list objects for easier list subclassing
* UserString   wrapper around string objects for easier string subclassing

'''

__all__ = ['deque', 'defaultdict', 'namedtuple', 'UserDict', 'UserList',
           'UserString', 'Counter', 'OrderedDict', 'ChainMap']

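The collections module implements specialized container types that can stand in for Python's common built-in types such as dict, list, set and tuple. Put simply, it adds a further layer of functionality on top of the basic data types.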

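1. deque

Purpose: a double-ended queue. Elements can be inserted and removed at both the head and the tail in O(1) time. It is a list-like container.

"Double-ended" means both ends can be operated on. The difference from Python's built-in list is that inserting and removing at the head also takes O(1) time. Here is an example to get a feel for it: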

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = 'liao gao xiang'

"""
Keep the last n elements
"""

from collections import deque


def search(file, pattern, history=5):
    # A bounded deque drops its oldest entry automatically once it is full
    previous_lines = deque(maxlen=history)
    for l in file:
        if pattern in l:
            # A generator function (yield) decouples the search logic
            # from the code that consumes the search results
            yield l, previous_lines
        previous_lines.append(l)


with open('file.txt', mode='r', encoding='utf-8') as f:
    for line, prevlines in search(f, 'Python', 5):
        for pline in prevlines:
            print(pline, end='')
        print(line, end='')

d = deque()
d.append(1)
d.append("2")
print(len(d))
print(d[0], d[1])
d.extendleft([0])
print(d)
d.extend([6, 7, 8])
print(d)

d2 = deque('12345')
print(len(d2))
d2.popleft()
print(d2)
d2.pop()
print(d2)

# Inserting or removing at either end of a deque is O(1), unlike a list,
# where inserting or removing at the front is O(N)
d3 = deque(maxlen=2)
d3.append(1)
d3.append(2)
print(d3)
d3.append(3)
print(d3)

The output is as follows (the first two lines assume file.txt contains exactly these two lines):

人生苦短
我用Python
2
1 2
deque([0, 1, '2'])
deque([0, 1, '2', 6, 7, 8])
5
deque(['2', '3', '4', '5'])
deque(['2', '3', '4'])
deque([1, 2], maxlen=2)
deque([2, 3], maxlen=2)

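So whenever you frequently operate on the head of a list, deque is the better choice. The full set of deque methods is listed below; trying each one yourself is the quickest way to learn them.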

class deque(object):
    """
    deque([iterable[, maxlen]]) --> deque object

    A list-like sequence optimized for data accesses near its endpoints.
    """

    def append(self, *args, **kwargs):  # real signature unknown
        """ Add an element to the right side of the deque. """
        pass

    def appendleft(self, *args, **kwargs):  # real signature unknown
        """ Add an element to the left side of the deque. """
        pass

    def clear(self, *args, **kwargs):  # real signature unknown
        """ Remove all elements from the deque. """
        pass

    def copy(self, *args, **kwargs):  # real signature unknown
        """ Return a shallow copy of a deque. """
        pass

    def count(self, value):  # real signature unknown; restored from __doc__
        """ D.count(value) -> integer -- return number of occurrences of value """
        return 0

    def extend(self, *args, **kwargs):  # real signature unknown
        """ Extend the right side of the deque with elements from the iterable """
        pass

    def extendleft(self, *args, **kwargs):  # real signature unknown
        """ Extend the left side of the deque with elements from the iterable """
        pass

    def index(self, value, start=None, stop=None):  # real signature unknown; restored from __doc__
        """
        D.index(value, [start, [stop]]) -> integer -- return first index of value.
        Raises ValueError if the value is not present.
        """
        return 0

    def insert(self, index, p_object):  # real signature unknown; restored from __doc__
        """ D.insert(index, object) -- insert object before index """
        pass

    def pop(self, *args, **kwargs):  # real signature unknown
        """ Remove and return the rightmost element. """
        pass

    def popleft(self, *args, **kwargs):  # real signature unknown
        """ Remove and return the leftmost element. """
        pass

    def remove(self, value):  # real signature unknown; restored from __doc__
        """ D.remove(value) -- remove first occurrence of value. """
        pass

    def reverse(self):  # real signature unknown; restored from __doc__
        """ D.reverse() -- reverse *IN PLACE* """
        pass

    def rotate(self, *args, **kwargs):  # real signature unknown
        """ Rotate the deque n steps to the right (default n=1). If n is negative, rotates left. """
        pass

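One thing to watch out for: some methods modify the deque in place and return None. For example, reverse() reverses the deque, and rotate(1) shifts every element one position to the right, moving the tail element to the head.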
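To make the point about in-place methods concrete, here is a small sketch. The variable names are only for illustration and are not from the original article.

```python
from collections import deque

d = deque([1, 2, 3, 4, 5])

# rotate() and reverse() modify the deque in place and return None
result = d.rotate(1)       # the tail element moves to the head
print(result)              # None
print(d)                   # deque([5, 1, 2, 3, 4])

d.rotate(-2)               # a negative n rotates to the left
print(d)                   # deque([2, 3, 4, 5, 1])

d.reverse()
print(d)                   # deque([1, 5, 4, 3, 2])

# extendleft() pushes items one at a time, so they end up in reverse order
d.extendleft(['a', 'b'])
print(d)                   # deque(['b', 'a', 1, 5, 4, 3, 2])
```

Because these calls return None, writing something like d = d.reverse() would throw the deque away, which is a common slip.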

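2. defaultdict

Purpose: a dictionary with default values. Its parent class is Python's built-in dict.

What is the benefit of a dictionary with default values? Take an example. Building a multi-valued mapping (one key mapped to several values) is easy in principle, but if you implement it yourself, initializing the value for each new key gets a little tedious. You might write something like this: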

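# pairs is any iterable of (key, value) tuples; sample data for illustration
pairs = [('a', 1), ('a', 2), ('b', 3)]

d = {}
for key, value in pairs:
    if key not in d:
        d[key] = []
    d[key].append(value)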

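With defaultdict the code becomes more concise:

from collections import defaultdict

# pairs is any iterable of (key, value) tuples; sample data for illustration
pairs = [('a', 1), ('a', 2), ('b', 3)]

d = defaultdict(list)
for key, value in pairs:
    d[key].append(value)

A defining feature of defaultdict is that it automatically initializes the value for each key on first access, so you only need to care about adding elements. For example: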

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = 'liao gao xiang'

# Mapping a key to multiple values in a dictionary
from collections import defaultdict

d = defaultdict(list)
print(d)
d['a'].append([1, 2, 3])
d['b'].append(2)
d['c'].append(3)
print(d)

d = defaultdict(set)
print(d)
d['a'].add(1)
d['a'].add(2)
d['b'].add(4)
print(d)

The output is as follows:

defaultdict(<class 'list'>, {})
defaultdict(<class 'list'>, {'a': [[1, 2, 3]], 'b': [2], 'c': [3]})
defaultdict(<class 'set'>, {})
defaultdict(<class 'set'>, {'a': {1, 2}, 'b': {4}})
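The factory does not have to be list or set; any zero-argument callable works. The sketch below (sample data and names are made up for illustration) also shows one caveat: reading a missing key with square brackets inserts it, while get() does not.

```python
from collections import defaultdict

# int() returns 0, so every missing key starts counting from zero
counts = defaultdict(int)
for ch in 'banana':
    counts[ch] += 1
print(dict(counts))         # {'b': 1, 'a': 3, 'n': 2}

# A lambda can supply any default value you like
nested = defaultdict(lambda: {'visits': 0})
nested['home']['visits'] += 1
print(nested['home'])       # {'visits': 1}

# Caveat: merely reading a missing key with [] creates the entry
print('x' in counts)        # False
counts['x']
print('x' in counts)        # True

# get() does not call the factory and leaves the dict unchanged
print(counts.get('y'))      # None
print('y' in counts)        # False
```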

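3. namedtuple()

Purpose: create tuples with named fields. It is a factory function.

namedtuple produces data objects whose elements can be accessed by name. It is typically used to make code more readable, and it is especially handy when working with tuple-shaped data.

Suppose we have a data structure in which every object is a tuple of three elements. With namedtuple we can easily turn those tuples into a structure that is more readable and more convenient to use.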

from collections import namedtuple

websites = [
    ('Sohu', 'http://www.sohu.com/', '張朝陽'),
    ('Sina', 'http://www.sina.com.cn/', '王志東'),
    ('163', 'http://www.163.com/', '丁磊')
]

Website = namedtuple('Website', ['name', 'url', 'founder'])

for website in websites:
    # _make() builds a Website from an existing iterable
    website = Website._make(website)
    print(website)

# Output (Python 3):
Website(name='Sohu', url='http://www.sohu.com/', founder='張朝陽')
Website(name='Sina', url='http://www.sina.com.cn/', founder='王志東')
Website(name='163', url='http://www.163.com/', founder='丁磊')

Note that namedtuple itself is a function, not a class: calling it returns a new tuple subclass.
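As a rough illustration of what the generated class offers, the sketch below reuses the Website example and shows access by name, access by index, and the helpers _replace, _asdict and _fields.

```python
from collections import namedtuple

# namedtuple() is a function; what it returns is a class
Website = namedtuple('Website', ['name', 'url', 'founder'])
print(type(Website))               # <class 'type'>

site = Website('Sohu', 'http://www.sohu.com/', '張朝陽')

# Fields are reachable by name and by position
print(site.name)                   # Sohu
print(site[1])                     # http://www.sohu.com/

# It is still a real tuple, so unpacking works
name, url, founder = site
print(isinstance(site, tuple))     # True

# Instances are immutable; _replace() returns a modified copy
renamed = site._replace(name='SOHU')
print(renamed.name, site.name)     # SOHU Sohu

# _fields lists the field names, _asdict() converts to a mapping
print(Website._fields)             # ('name', 'url', 'founder')
print(dict(site._asdict())['founder'])
```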

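4. Counter

Purpose: count hashable objects. Its parent class is Python's built-in dict.

A typical use is finding the most frequent elements in a sequence. Suppose you have a list of words and want to know which word appears most often: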

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = 'liao gao xiang'

from collections import Counter

words = [
    'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
    'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
    'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
    'my', 'eyes', "you're", 'under'
]

word_counts = Counter(words)

# The three most frequent words
top_three = word_counts.most_common(3)
print(top_three)
# Outputs [('eyes', 8), ('the', 5), ('look', 4)]

print(word_counts['eyes'])

morewords = ['why', 'are', 'you', 'not', 'looking', 'in', 'my', 'eyes']

# To increase the counts by hand, simply use addition:
for word in morewords:
    print(word)
    word_counts[word] += 1

print(word_counts['eyes'])

The result is as follows:

[('eyes', 8), ('the', 5), ('look', 4)]
8
why
are
you
not
looking
in
my
eyes
9

Because Counter inherits from dict, it has every method that dict has (the same is true of defaultdict and OrderedDict). Counter itself implements or overrides six methods:

most_common(self, n=None)

elements(self)

fromkeys(cls, iterable, v=None)

update(*args, **kwds)

subtract(*args, **kwds)

copy(self)
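A short sketch of how several of these methods behave; the sample data is invented for illustration. Note that update() adds counts rather than replacing them as dict.update() would, and that subtract() may leave zero or negative counts behind.

```python
from collections import Counter

c = Counter('abracadabra')
print(c.most_common(2))        # [('a', 5), ('b', 2)]

# A missing key reports 0 instead of raising KeyError
print(c['z'])                  # 0

inventory = Counter(a=3, b=1)
inventory.update(['a', 'c'])   # adds to existing counts: a=4, b=1, c=1
inventory.subtract({'b': 2})   # counts may drop below zero: b=-1
print(inventory['a'], inventory['b'], inventory['c'])   # 4 -1 1

# elements() repeats each key by its count and skips counts below 1
print(sorted(inventory.elements()))   # ['a', 'a', 'a', 'a', 'c']

# The + operator adds counts and keeps only positive results
total = Counter(a=1) + Counter(a=2, b=3)
print(total['a'], total['b'])  # 3 3
```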

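5. OrderedDict

Purpose: an ordered dictionary. Its parent class is Python's built-in dict.

An OrderedDict preserves the order in which elements were inserted when you iterate over it. Internally it maintains a doubly linked list ordered by key insertion. Each newly inserted element is placed at the tail of the list. Assigning again to an existing key does not change that key's position. Since Python 3.7 the built-in dict also preserves insertion order, so OrderedDict is mainly useful for its extra methods and its order-sensitive equality.

Be aware that an OrderedDict can be about twice the size of a regular dictionary, because it maintains the additional linked list internally. So if you are building a data structure that needs a large number of OrderedDict instances (for example, reading 100,000 lines of CSV data into a list of OrderedDict objects), weigh carefully whether the benefits of OrderedDict outweigh the extra memory cost.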

#!/usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = 'liao gao xiang'

import json
from collections import OrderedDict

d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
d['spam'] = 3
d['grok'] = 4

# d['bar'] = 22  # re-assigning an existing key does not change its position

for key in d:
    print(key, d[key])

print(d)

print(json.dumps(d))

The result is as follows:

foo 1

bar 2

spam 3

grok 4

OrderedDict([('foo', 1), ('bar', 2), ('spam', 3), ('grok', 4)])

{"foo": 1, "bar": 2, "spam": 3, "grok": 4}

OrderedDict implements or overrides the methods below. What does each one do? That is left as homework ^_^

clear(self)

popitem(self, last=True)

move_to_end(self, key, last=True)

keys(self)

items(self)

values(self)

pop(self, key, default=__marker)

setdefault(self, key, default=None)

copy(self)

fromkeys(cls, iterable, value=None)
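As a head start on that homework, here is a hedged sketch of the two methods that a plain dict does not offer in the same form, move_to_end() and popitem(last=...), plus the order-sensitive comparison. The data mirrors the example above.

```python
from collections import OrderedDict

d = OrderedDict([('foo', 1), ('bar', 2), ('spam', 3), ('grok', 4)])

# move_to_end() relocates an existing key to the tail...
d.move_to_end('foo')
print(list(d))                  # ['bar', 'spam', 'grok', 'foo']

# ...or to the head when last=False
d.move_to_end('grok', last=False)
print(list(d))                  # ['grok', 'bar', 'spam', 'foo']

# popitem() removes from the tail by default (LIFO)
print(d.popitem())              # ('foo', 1)

# popitem(last=False) removes from the head (FIFO)
print(d.popitem(last=False))    # ('grok', 4)
print(list(d))                  # ['bar', 'spam']

# Two OrderedDicts compare equal only if their order matches too
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))   # False
# Against a regular dict, order is ignored
print(OrderedDict(a=1, b=2) == dict(b=2, a=1))          # True
```

These two methods are what make OrderedDict a convenient base for things like a small least-recently-used cache.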

6. ChainMap

Purpose: group several mappings into a single, updatable view. It is a dict-like class.

It is simple to use, as follows:

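#!/usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = 'liao gao xiang'

from collections import ChainMap
from itertools import chain

# Iterating over the elements of several different containers
a = [1, 2, 3, 4]
b = ('x', 'y', 'z')
c = {1, 'a'}

# Method 1: itertools.chain walks each iterable in turn
for i in chain(a, b, c):
    print(i)

print('--------------')

# Method 2: ChainMap, which is meant for mappings (dicts) rather than
# lists, tuples or sets; iterating over it yields each distinct key once
m1 = {'x': 1, 'z': 3}
m2 = {'y': 2, 'z': 4}
for j in ChainMap(m1, m2):
    print(j)

# Both approaches save memory because no merged copy of the data is built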

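A ChainMap takes several dictionaries and makes them behave logically as one. The dictionaries are not actually merged, however. The ChainMap class simply keeps an internal list holding these dictionaries and redefines the common dictionary operations so that they walk through this list. Most dictionary operations work as usual, for example: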

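#!/usr/bin/env python
# -*- coding:utf-8 -*-
# __author__ = 'liao gao xiang'

# Combining multiple dictionaries and mappings
from collections import ChainMap

a = {'x': 1, 'z': 3}
b = {'y': 2, 'z': 4}

# Suppose you have to perform lookups across both dictionaries
# (for example, look in a first and fall back to b if the key is missing).
# A very simple solution is the ChainMap class from the collections module.
c = ChainMap(a, b)
print(c)

a['x'] = 11  # updates to an underlying dict are visible through the ChainMap
print(c)     # the two dicts are chained in order, not copied

print(c['x'])
print(c['y'])
print(c['z'])

# Updates and deletions always affect the first dictionary in the list.
c['z'] = 10
c['w'] = 40
del c['x']
print(a)

# del c['y'] would raise KeyError, because 'y' is not in the first dictionary

# ChainMap is very useful for scoped variables in programming languages
# (such as globals, locals and so on). A few methods make this easy:
values = ChainMap()  # creates one empty dict by default
print('\t', values)

values['x'] = 1
values = values.new_child()  # push a new empty dict in front
values['x'] = 2
values = values.new_child()
values['x'] = 30
# values = values.new_child()

print(values, values['x'])  # values['x'] gives the most recently added value

values = values.parents  # drop the most recently added dict
print(values['x'])

values = values.parents
print(values)

a = {'x': 1, 'y': 2}
b = {'y': 2, 'z': 3}

merge = dict(b)
merge.update(a)
print(merge['x'], merge['y'], merge['z'])

a['x'] = 11
print(merge['x'])  # still 1: the merged copy does not follow a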

The output is as follows:

ChainMap({'x': 1, 'z': 3}, {'y': 2, 'z': 4})
ChainMap({'x': 11, 'z': 3}, {'y': 2, 'z': 4})
11
2
3
{'z': 10, 'w': 40}
	 ChainMap({})
ChainMap({'x': 30}, {'x': 2}, {'x': 1}) 30
2
ChainMap({'x': 1})
1 2 3
1

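As an alternative to ChainMap, you might consider merging two dictionaries with the update() method. That works too, but it requires you to create a completely separate dictionary object (or to destructively alter one of the existing dictionaries). Moreover, if the original dictionaries are updated afterwards, the change is not reflected in the merged dictionary.

ChainMap implements or overrides the following methods: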

get(self, key, default=None)

fromkeys(cls, iterable, *args)

copy(self)

new_child(self, m=None)

parents(self)

popitem(self)

pop(self, key, *args)

clear(self)
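One common way to use these methods is layered configuration, where earlier mappings override later ones. The sketch below is only an illustration: the three dictionaries and their keys are hypothetical and not part of the original article.

```python
from collections import ChainMap

# Hypothetical configuration layers, highest priority first
command_line = {'color': 'blue'}
environment = {'user': 'alice'}
defaults = {'color': 'red', 'user': 'guest'}

settings = ChainMap(command_line, environment, defaults)

# Lookups search the mappings from left to right
print(settings['color'])               # blue
print(settings['user'])                # alice
print(settings.get('debug', False))    # False

# The underlying dicts are referenced, not copied
print(settings.maps[0] is command_line)   # True

# new_child() pushes a temporary layer without touching the others
scoped = settings.new_child({'user': 'root'})
print(scoped['user'])                  # root

# parents is a view of everything except the first mapping
print(scoped.parents['user'])          # alice
print(settings['user'])                # alice

print(sorted(settings.items()))        # [('color', 'blue'), ('user', 'alice')]
```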

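7. UserDict, UserList, UserString

These three classes wrap the dict, list and str data types respectively. Their main purpose is to make it easier for users to implement their own data types. In Python 2 the three classes lived in the UserDict, UserList and UserString modules and had to be imported with something like from UserDict import UserDict. In Python 3 they were moved into the collections module. All three are base classes: to extend any of the three types, simply inherit from the corresponding class.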
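A minimal sketch of subclassing UserDict follows. The LowerCaseDict class is a hypothetical example, not something from the article. UserDict routes its constructor and update() through __setitem__, whereas a direct dict subclass would bypass the override in both cases.

```python
from collections import UserDict


class LowerCaseDict(UserDict):
    """Hypothetical example: a dict that lower-cases its string keys."""

    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = key.lower()
        super().__setitem__(key, value)

    def __getitem__(self, key):
        if isinstance(key, str):
            key = key.lower()
        return super().__getitem__(key)


# The constructor and update() both go through __setitem__
headers = LowerCaseDict({'Content-Type': 'text/html'})
headers['ACCEPT'] = '*/*'
headers.update({'Host': 'example.com'})

print(headers['content-type'])   # text/html
print(headers['Accept'])         # */*

# The real storage is the plain dict in the .data attribute
print(sorted(headers.data))      # ['accept', 'content-type', 'host']
```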

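Summary

That is all for this article. I hope it serves as a useful reference for your study or work. If you have questions, feel free to leave a comment. Thank you for supporting 腳本之家.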
