Defaultdict in Python

In Python, we have several built-in data structures like lists, tuples, dictionaries, sets, etc., which make it easy to manipulate data. One of the most widely used data structures in Python is the dictionary. It allows us to store data in a key-value pair format, with keys being unique identifiers for their associated values.

Python’s default dictionary, or defaultdict, is a subclass of the built-in dict class and lives in the collections module. With defaultdict, we can supply a default value for any key that hasn’t been explicitly set yet. This makes it easier to avoid key errors when working with dictionaries.

What is Defaultdict?

A defaultdict in Python is a dictionary that automatically assigns a default value to any non-existing key when that key is looked up, rather than raising a KeyError. This is achieved by specifying a default factory function that is used to create new values. The default factory can be any callable object that takes no parameters and returns a default value.
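As a minimal sketch of what "any callable that takes no parameters" means, the example below uses a named function and a lambda as factories. The names unknown_author, authors, and ratings are made up for illustration.

```python
from collections import defaultdict


def unknown_author():
    """Hypothetical factory: takes no arguments and returns the default value."""
    return "unknown"


authors = defaultdict(unknown_author)
authors["mystery"] = "Agatha Christie"

known = authors["mystery"]    # existing key: the factory is not called
missing = authors["romance"]  # missing key: the factory is called and its result stored

# A lambda with no parameters works the same way.
ratings = defaultdict(lambda: 0.0)
first_rating = ratings["Foundation"]

print(known, missing, first_rating)
```

Note that the factory is passed without parentheses: defaultdict receives the function itself and calls it only when a missing key is looked up.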

If no factory is given, default_factory is None, and the defaultdict behaves like a regular dictionary: looking up a missing key raises a KeyError. To get default values, we pass a callable as the first argument of the constructor, which becomes the default_factory. The default_factory can be a function, a lambda function, or even a built-in type like int, str, or list.

Creating a Defaultdict

Creating a defaultdict is similar to creating a regular dictionary. We just need to import defaultdict from the collections module and pass the default factory as the first argument to its constructor.

For example, let’s create a defaultdict that stores authors’ names by genre:

from collections import defaultdict
books = defaultdict(str)
books['mystery'] = 'Agatha Christie'
books['science fiction'] = 'Isaac Asimov'
books['horror'] = 'Stephen King'
print(books)

Output:

defaultdict(<class 'str'>, {'mystery': 'Agatha Christie', 'science fiction': 'Isaac Asimov', 'horror': 'Stephen King'})

In this example, we specified str as the default_factory parameter, which means that looking up any non-existing key returns an empty string. Keys that we set explicitly, such as 'mystery', 'science fiction', and 'horror', keep the values we assigned.
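The example above never actually looks up a missing key, so here is a small sketch that does. The 'romance' key is an assumption added for illustration.

```python
from collections import defaultdict

books = defaultdict(str)
books["mystery"] = "Agatha Christie"

# 'romance' was never set, so str() is called and returns ''.
value = books["romance"]

print(repr(value))   # ''
print(dict(books))   # {'mystery': 'Agatha Christie', 'romance': ''}
```

The lookup also stores the new key, which is why 'romance' shows up in the dictionary afterwards.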

We can also specify list as the default_factory to create a defaultdict that stores book titles by genre.

books = defaultdict(list)
books['mystery'].append('Murder on the Orient Express')
books['mystery'].append('Death on the Nile')
books['science fiction'].append('Foundation')
books['horror'].append('It')
print(books)

Output:

defaultdict(<class 'list'>, {'mystery': ['Murder on the Orient Express', 'Death on the Nile'], 'science fiction': ['Foundation'], 'horror': ['It']})

In this example, we specified list as the default_factory parameter, so the first lookup of a non-existing key creates an empty list for it. That is why we could call append on 'mystery', 'science fiction', and 'horror' without creating the lists ourselves.

Benefits of Defaultdict

Defaultdict can be very useful when working with dictionaries because it saves us the pain of checking whether a key exists in a dictionary or not. For instance, suppose we have a program that needs to count the number of times a particular word appears in an input text. In that case, we could use a defaultdict with an int factory to achieve this goal.

from collections import defaultdict
words_count = defaultdict(int)

text = 'the quick brown fox jumps over the lazy dog'

for word in text.split():
    words_count[word] += 1

print(words_count)

Output:

defaultdict(<class 'int'>, {'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1})

In this example, we initialized words_count as a defaultdict with an int factory, so every new word starts at int(), which is 0. Then, we split the input text into individual words and looped over them. For each word, we updated its count in the defaultdict using the += operator. As a result, we were able to count the number of times each word appeared in the text without having to check whether a key exists or not.
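For comparison, here is a sketch of the same count written with a regular dictionary, which needs the explicit membership check that defaultdict removes. The name plain_counts is made up for this comparison.

```python
from collections import defaultdict

text = 'the quick brown fox jumps over the lazy dog'

# Regular dict: we must handle the first occurrence of each word ourselves.
plain_counts = {}
for word in text.split():
    if word in plain_counts:
        plain_counts[word] += 1
    else:
        plain_counts[word] = 1

# defaultdict(int): a missing word starts at 0 automatically.
words_count = defaultdict(int)
for word in text.split():
    words_count[word] += 1

print(plain_counts == dict(words_count))  # True
```

Both loops produce the same counts. The defaultdict version simply has no branch for the first occurrence.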

FAQs

Q: What happens if we try to access a non-existing key in a defaultdict?

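A: If we look up a non-existing key with square brackets, the defaultdict calls the default_factory, stores the result under that key, and returns it. If the default_factory hasn’t been set (it is None), a KeyError is raised, just as with a regular dictionary. The get method and the in operator do not call the factory.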
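The behaviour on a missing key depends on how the key is accessed and on whether a factory was supplied. The sketch below, with made-up variable names, can be run to check each case.

```python
from collections import defaultdict

counts = defaultdict(int)

via_get = counts.get("missing")   # get() does not call the factory: None
has_key = "missing" in counts     # membership test does not add the key: False
via_index = counts["missing"]     # square brackets call int() and store 0
stored = "missing" in counts      # the key exists now: True

# Without a factory, default_factory is None and a lookup raises KeyError.
no_factory = defaultdict()
try:
    no_factory["missing"]
    raised = False
except KeyError:
    raised = True

print(via_get, has_key, via_index, stored, raised)
```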

Q: Can we modify the default_factory of a defaultdict after creating it?

A: Yes, default_factory is a writable attribute of each defaultdict instance, so we can assign a different callable (or None) to it at any time. The change only affects keys created afterwards; values already stored are left as they are.
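A small sketch of changing the factory on an existing instance, using a throwaway dictionary named d:

```python
from collections import defaultdict

d = defaultdict(int)
first = d["a"]             # created with int(): 0

d.default_factory = list   # swap the factory on this instance
second = d["b"]            # created with list(): []

d.default_factory = None   # from now on, missing keys raise KeyError
try:
    d["c"]
    raised = False
except KeyError:
    raised = True

print(dict(d), raised)
```

Keys that already exist keep their values; only later lookups of missing keys see the new factory.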

Q: Is it possible to have a nested defaultdict?

A: Yes. Because the default_factory must be a callable, we pass a function or lambda that returns a new defaultdict, such as lambda: defaultdict(list), rather than a defaultdict instance.
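One way to sketch a nested defaultdict is a genre-to-author-to-titles mapping. The name library and the two-level layout are assumptions chosen for illustration.

```python
from collections import defaultdict

# Outer keys are genres, inner keys are authors, values are lists of titles.
library = defaultdict(lambda: defaultdict(list))

library["mystery"]["Agatha Christie"].append("Murder on the Orient Express")
library["mystery"]["Agatha Christie"].append("Death on the Nile")
library["horror"]["Stephen King"].append("It")

# Convert to plain dicts for a readable printout.
as_plain = {genre: dict(authors) for genre, authors in library.items()}
print(as_plain)
```

The lambda is needed so that each missing genre gets its own fresh inner defaultdict.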

Q: How is a defaultdict different from a regular dictionary?

A: A defaultdict is a subclass of the built-in dict class that automatically creates a default value when a non-existing key is looked up with square brackets. In contrast, a regular dictionary raises a KeyError when trying to access a non-existing key. In every other respect the two behave the same.

Conclusion

The Python defaultdict is a powerful data structure that can make programming with dictionaries much easier. By providing a default value for any non-existing keys, it allows us to avoid key errors and concentrate on the logic of the program. We can specify any callable object as the default_factory parameter, enabling us to create complex data structures that suit our needs.
