• Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Skip to footer

In Out Code

Your comprehensive guide to optimized code

  • Data Structures
  • Python
    • Control Flow
    • HackerRank
    • Input and Output
    • Modules
  • AWS

Python Collections Module – High Performance Containers

You are here: Home / Python / Modules / Python Collections Module – High Performance Containers

June 3, 2019 by Daniel Andrews

What is the Python Collections Module?

The Python collections module provides access to six main specialized container datatypes; namedtuple(), deque, ChainMap, Counter, OrderedDict and defaultdict.

Syntax: import collections

This guide walks you through everything you need to work with the Python collections module, with clear, commented code samples.

  • namedtuple()
  • deque
  • ChainMap
  • Counter
  • OrderedDict
  • defaultdict

namedtuple() Container

A named tuple allows individual elements of a tuple to be referenced by name as well as by index. They can be used to assign meaning to each field, require no more memory than a regular tuple and are immutable.

Syntax: namedtuple(typename, field_names)

NamedTuple Example 1: Creation and reference by field name
from collections import namedtuple

# Create a namedtuple
Car = namedtuple('Car', ['make', 'model', 'doors'])     # Method 1
Car = namedtuple('Car', 'make, model, doors')           # Method 2

cars_list = [
        Car('BMW','X6','5'),
        Car('Mercedes-Benz','GLC-Class','5'),
        Car('Tesla','Model X','5')
        ]

print("Cars named tuples using index references:")
print("-----------------------------------------")
for car in cars_list:
    print(car[0], car[1], car[2])

print("\nCars named tuples using name references:")
print("-----------------------------------------") 
for car in cars_list:
    print(car.make, car.model, car.doors)
Output:
Cars named tuples using index references:
-----------------------------------------
BMW X6 5
Mercedes-Benz GLC-Class 5
Tesla Model X 5

Cars named tuples using name references:
-----------------------------------------
BMW X6 5
Mercedes-Benz GLC-Class 5
Tesla Model X 5
NamedTuple Example 2: Using _replace to replace the value in a field
from collections import namedtuple

# Create a namedtuple
Car = namedtuple('Car', ['make', 'model', 'doors'])     # Method 1
Car = namedtuple('Car', 'make, model, doors')           # Method 2

cars_list = [
        Car('BMW','X6','5'),
        Car('Mercedes-Benz','GLC-Class','5'),
        Car('Tesla','Model X','5')
        ]

example = cars_list[0]
print(example)
example = example._replace(doors='6')
print(example)
Output:
Car(make='BMW', model='X6', doors='5')
Car(make='BMW', model='X6', doors='6')

deque Container

A deque is a double-ended queue that’s optimized for fast appends and pops from each end. This is a significant advantage over lists, which are generally optimized for fast fixed-length operations and random access to elements.

If a value is specified for the maxlen parameter, the deque is limited to this length. If new items are added, the same number of items are removed from the opposite end.

Most deque methods are the same as you get for lists, such as append(x), clear() and count(). But you also have methods unique to deques, which are:

  • appendleft(x)
  • extendleft(iterable)
  • popleft()
  • rotate(n=1)

A deque also supports slicing.

Deque Syntax: deque([iterable[, maxlen]])

A full list of deque methods can be found in the official Python 3 documentation. You can also find examples of standard list methods in our Python lists guide.

Below, you’ll find examples of the methods that are specific to Python deques.

Example 1: Create a new deque
from collections import deque

print("Using deque iteration:")
deque1 = deque('abc')
for letter in deque1:
    print(letter.upper())

print("\n---------------------\n")
print("Using list iteration:")
list1 = list('abc')
for letter in list1:
    print(letter.upper())
Output:
Using deque iteration:
A
B
C

---------------------

Using list iteration:
A
B
C
Example 2: Attempting to add items to a full deque
from collections import deque

# Create a full deque (maxlen=3, already has 3 elements)
deque1 = deque('abc',maxlen=3)
print("deque1 before append: \t{}".format(deque1))

# Add a new item 'x' to deque1
deque1.append('x')
print("deque1 after append: \t{}".format(deque1))
Output:
deque1 before append:   deque(['a', 'b', 'c'], maxlen=3)
deque1 after append:    deque(['b', 'c', 'x'], maxlen=3)
Example 3: Using appendleft(x), extendleft(x), popleft() and rotate(n=1)
from collections import deque

# Testing appendleft(), extendleft(), popleft() and rotate()
deque1 = deque('abc')
print("deque1: \t\t\t{}".format(deque1))

deque1.appendleft('x')
print("deque1 after appendleft('x'): \t{}".format(deque1))

# Right rotation (1)
deque1.rotate(1)
print("deque1 after rotate(1): \t{}".format(deque1))

# Left rotation (-1)
deque1.rotate(-1)
print("deque1 after rotate(-1): \t{}".format(deque1))

deque1.popleft()
print("deque1 after popleft(): \t{}".format(deque1))

# extendleft() reverses the input order
deque1.extendleft('def')
print("deque1 after extendleft(): \t{}".format(deque1))
Output:
deque1:                         deque(['a', 'b', 'c'])
deque1 after appendleft('x'):   deque(['x', 'a', 'b', 'c'])
deque1 after rotate(1):         deque(['c', 'x', 'a', 'b'])
deque1 after rotate(-1):        deque(['x', 'a', 'b', 'c'])
deque1 after popleft():         deque(['a', 'b', 'c'])
deque1 after extendleft():      deque(['f', 'e', 'd', 'a', 'b', 'c'])

ChainMap objects

The ChainMap class groups multiple mappings objects together in a single view. A ChainMap holds references to the underlying mapping objects, so if one of them changes, the change is reflected in the ChainMap view.

Syntax: ChainMap(*maps)

Example 1: Creating a ChainMap()
from collections import ChainMap

map1 = {'a':1, 'b':2, 'c':3}
map2 = {'c':4, 'd':4, 'e':5}

# Perform a ChainMap() between two dicts
chain = ChainMap(map1, map2)

print("Keys and Values in ChainMap:")
for k, v in chain.items():
    print("Key: {}, Value: {}".format(k, v))
Output:
Keys and Values in ChainMap:
Key: a, Value: 1
Key: c, Value: 3
Key: e, Value: 5
Key: d, Value: 4
Key: b, Value: 2

The child mappings are searched for in the order they are passed to the constructor. This is why, although map2 assigns the value of 4 to key c, because it appears after map1 in ChainMap(map1, map2), it’s the value of 3 from map1 that’s returned.

We can prove this by reversing the order in which we pass map1 and map2 to ChainMap.

from collections import ChainMap

map1 = {'a':1, 'b':2, 'c':3}
map2 = {'c':4, 'd':4, 'e':5}

# Perform a ChainMap() between two dicts
chain = ChainMap(map2, map1)

print("Keys and Values in ChainMap:")
for k, v in chain.items():
    print("Key: {}, Value: {}".format(k, v))
Output:
Keys and Values in ChainMap:
Key: a, Value: 1
Key: c, Value: 4
Key: e, Value: 5
Key: d, Value: 4
Key: b, Value: 2

In the output above, we now have the value of 4 assigned to key c. It’s possible to achieve the same result without manually changing the order map1 and map2 are passed to ChainMap, using reversed().

from collections import ChainMap

map1 = {'a':1, 'b':2, 'c':3}
map2 = {'c':4, 'd':4, 'e':5}

# Perform a ChainMap() between two dicts
chain = ChainMap(map1, map2)

print("Maps before reversal:\n{}\n".format(chain.maps))
chain.maps = list(reversed(chain.maps))
print("Maps after reversal:\n{}\n".format(chain.maps))

print("Keys and Values in ChainMap:")
for k, v in chain.items():
    print("Key: {}, Value: {}".format(k, v))
Output:
Maps before reversal:
[{'a': 1, 'b': 2, 'c': 3}, {'c': 4, 'd': 4, 'e': 5}]

Maps after reversal:
[{'c': 4, 'd': 4, 'e': 5}, {'a': 1, 'b': 2, 'c': 3}]

Keys and Values in ChainMap:
Key: a, Value: 1
Key: c, Value: 4
Key: e, Value: 5
Key: d, Value: 4
Key: b, Value: 2

Counter Container

One way to count items in a tuple, string or list is to use their count() method. But this only returns the count of a single value. A faster alternative to returning counts of multiple items in a single iterable or mapping is to create a Counter.

Counter objects in the Python collections module have the following properties:

  • dict subclass
  • Counters have the same methods as a dict, except the fromkeys() method, which isn’t available. The update() method of a Counter also combines counts, rather than replacing the values
  • Counters are an unordered collection
  • Values are stored as dictionary keys, counts as dictionary values
  • Counter returns a zero count if item doesn’t exist. No KeyError
Creating a Counter
c1 = Counter()                  # Create an empty counter
c2 = Counter("test_string")     # Create a counter from an iterable
c3 = Counter({"a": 3, "b": 4})  # Create a counter from a mapping
c4 = Counter(first=3, second=2) # Create a counter from keyword args

# Display the contents of each Counter object
print("c1: {0}\nc2: {1}\nc3: {2}\nc4: {3}".format(c1, c2, c3, c4))
Output:
c1: Counter()
c2: Counter({'t': 3, 's': 2, 'e': 1, '_': 1, 'r': 1, 'i': 1, 'n': 1, 'g': 1})
c3: Counter({'b': 4, 'a': 3})
c4: Counter({'first': 3, 'second': 2})
Counter Example 1: Count occurrences of letters in a list
# Count occurrences of letters in a list
cnt = Counter()
letters = ["a", "a", "b", "b", "b", "c", "d"]

# Loop through the list and count each letter
for letter in letters:
  cnt[letter] += 1

# Display the Counter
print("cnt: {0}".format(cnt))
Output:
Counter({'b': 3, 'a': 2, 'c': 1, 'd': 1})
Counter Example 2: most_common()

The most_common() method returns a list of (value, count) tuples, ordered from the most common to the least common. Only values with a count greater than zero will be returned.

Syntax: most_common([n])

# Create a list with multiple occurrences of letters
letters = ["a", "a", "b", "b", "b", "c", "d"]
print("letters = {0}".format(letters))

# Find the three most common letters in the list
Counter(letters).most_common(3)

# Check most_common() returns all counts
Counter(letters).most_common()
Output:
[('b', 3), ('a', 2), ('c', 1), ('d', 1)]
Counter Example 3: subtract()

The subtract([iterable]) method subtracts counts of elements in one Counter from another. For example, if a = Counter(a=3) and b = Counter(a=1), a.subtract(b) would update a to be Counter({'a': 2}).

Syntax: subtract([iterable-or-mapping])

# Create two counters
c1 = Counter(a=3, b=4, c=2, d=7, e=0)
c2 = Counter(a=3, b=1, c=1, d=3, e=5)
print("c1: {0}".format(c1))
print("c2: {0}\n".format(c2))

# Subtract c2 from c1
c1.subtract(c2)
print("c1.subtract(c2)\n{0}\n{1}".format('-'*15, c1))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})

c1.subtract(c2)
---------------
Counter({'d': 4, 'b': 3, 'c': 1, 'a': 0, 'e': -5})

The subtract operator - can also used to subtract counts. Important: Only positive counts will be retained. This means we won’t get the zero count for ‘a’ or the -5 count for ‘e’ that we saw when using the subtract() method in the code above.

# Create two counters
c1 = Counter(a=3, b=4, c=2, d=7, e=0)
c2 = Counter(a=3, b=1, c=1, d=3, e=5)
print("c1: {0}".format(c1))
print("c2: {0}\n".format(c2))

# Subtract c2 from c1 using the '-' operator
c1 - c2
print("c1 - c2\n{0}\n{1}".format('-'*15, c1 - c2))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})

c1 - c2
---------------
Counter({'d': 4, 'b': 3, 'c': 1})
Counter Example 4: update()

The update() method adds counts of elements in one Counter to another. For example, if a = Counter(a=3) and b = Counter(a=1), a.update(b) would update a to be Counter({'a': 4}).

Syntax: update([iterable-or-mapping])

# Create two counters
c1 = Counter(a=3, b=4, c=2, d=7, e=0)
c2 = Counter(a=3, b=1, c=1, d=3, e=5)
print("c1: {0}".format(c1))
print("c2: {0}\n".format(c2))

# Add c2 to c1
c1.update(c2)
print("c1.update(c2)\n{0}\n{1}".format('-'*15, c1))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})

c1.update(c2)
---------------
Counter({'d': 10, 'a': 6, 'b': 5, 'e': 5, 'c': 3})

The addition operator + can also used to add multiple counters together.

# Create two counters
c1 = Counter(a=3, b=4, c=2, d=7, e=0)
c2 = Counter(a=3, b=1, c=1, d=3, e=5)
print("c1: {0}".format(c1))
print("c2: {0}\n".format(c2))

# Add c2 to c1 using the '+' operator
c1 + c2
print("c1 + c2:\n{0}\n{1}".format('-'*15, c1 + c2))
Output:
c1: Counter({'d': 7, 'b': 4, 'a': 3, 'c': 2, 'e': 0})
c2: Counter({'e': 5, 'a': 3, 'd': 3, 'b': 1, 'c': 1})

c1 + c2
---------------
Counter({'d': 10, 'a': 6, 'b': 5, 'e': 5, 'c': 3})

OrderedDict Container

An OrderedDict remembers its insertion order, which means you can create a sorted dictionary. However, as of Python 3.7, regular Python dictionaries also keep their insertion order.

So, when should you use an OrderedDict?

If you need an ordered dictionary in code that runs using a Python version older than 3.7, you’ll need to use OrderedDict. Also, OrderedDict offers several useful methods that standard Python dictionaries don’t have, such as move_to_end.

Another reason to use the OrderedDict is if you want to test equality. When comparing multiple instances of OrderedDict, the order of (key, value) pairs must be the same to be considered equal. But with standard Python dictionaries, as long as the instances contain the same (key, value) pairs, they will be considered equal, irrespective of their order.

Syntax: OrderedDict(mapping)

OrderedDict Example 1: Create an OrderedDict
from collections import OrderedDict

# OrderedDict creation
dict1 = OrderedDict({'a':1, 'b':2, 'd':4, 'c':3})

# Create a sorted OrderedDict, sorted by key
dict1 = OrderedDict(sorted(dict1.items(), key= lambda x: x[0]))
print("Sorted OrderedDict\n{0}\n{1}".format("-"*18,dict1))
Output:
Sorted OrderedDict
------------------
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
OrderedDict Example 2: OrderedDict equality vs. dict equality
from collections import OrderedDict

# Create two standard dictionaries
dict1 = {'a':1, 'c':3, 'b':2}
dict2 = {'a':1, 'b':2, 'c':3}

# Now test their equality, when (key, value) pair order doesn't match
print("Using dict, dict1 == dict2: {0}".format(dict1 == dict2))

# Perform the equality check between two OrderedDict instances
dict1 = OrderedDict({'a':1, 'c':3, 'b':2})
dict2 = OrderedDict({'a':1, 'b':2, 'c':3})
print("Using OrderedDict, dict1 == dict2: {0}".format(dict1 == dict2))
Output:
Using dict, dict1 == dict2: True
Using OrderedDict, dict1 == dict2: False

defaultdict Container

Using a defaultdict, if a key is not found in the dictionary, a new entry is created. This is in contrast to a standard Python dictionary, which would throw a KeyError.

A defaultdict is a subclass of the built-in dict class, and supports the _missing_ method, in addition to the standard dict operations.

The default_factory function is useful for specifying what type of dictionary values you want to create if the key does not exist. Some examples are a list, int or set.

Syntax: defaultdict([default_factory])

Example 1: Creating a defaultdict
from collections import defaultdict

# Create a list defaultdict
list_dict = defaultdict(list)

# Attempt to access a key that doesn't exist
val = list_dict[1]

# Check this key has now been created
print("defaultdict(list) = {}".format(list_dict))


# Create a set defaultdict
set_dict = defaultdict(set)

# Attempt to access a key that doesn't exist
val = set_dict[1]

# Check this key has now been created
print("defaultdict(set) = {}".format(set_dict))


# Create an integer defaultdict
int_dict = defaultdict(int)

# Attempt to access a key that doesn't exist
val = int_dict[1]

# Check this key has now been created
print("defaultdict(int) = {}".format(int_dict))
Output:
defaultdict(list) = defaultdict(, {1: []})
defaultdict(set) = defaultdict(, {1: set()})
defaultdict(int) = defaultdict(, {1: 0})
Example 2: Using a defaultdict in counting
from collections import defaultdict

count = defaultdict(int)  
letters = "a b b c a b a c c d d b a".split()  
for letter in letters:  
    count[letter] +=1
print(count) 
Output:
defaultdict(, {'a': 4, 'b': 4, 'c': 3, 'd': 2})

Category iconModules Tag iconPython Counter,  Python deque

About Daniel Andrews

Passionate about all things data and cloud. Specializing in Python, AWS and DevOps, with a Masters degree in Data Science from City University, London, and a BSc in Computer Science.

Primary Sidebar

48-Hour Flash Sale. Online courses as low as $12.99

Categories

  • AWS (4)
  • Concepts (1)
  • Control Flow (1)
  • Data Structures (9)
  • HackerRank (1)
  • Input and Output (1)
  • LeetCode (1)
  • Modules (1)
  • Operators (1)
  • Python (2)
Udemy.com Homepage 300x250

Footer

Recent Posts

  • How to Setup Neo4j on AWS ECS (EC2)
  • How to Setup Neo4j on AWS EC2
  • How to List AWS S3 Bucket Names and Prefixes
  • Amazon Redshift Tutorial (AWS)
  • Big O: How to Calculate Time and Space Complexity

.