Lecture 12: Dictionaries, Sets, and Tuples
Welcome! Today we'll explore three more Python data structures that expand our toolkit for organizing and managing data:
- Tuples: immutable sequences
- Sets: collections of unique elements
- Dictionaries: key-value mappings
By the end of this lecture, you'll understand when and how to use each of these structures effectively.
# This is a function returning multiple values
def get_dimensions():
    width = 40
    height = 30
    return width, height # This creates a tuple!
# And we unpacked them
w, h = get_dimensions() # This unpacks the tuple
print(f"Width: {w}, Height: {h}")
# What's actually happening?
result = get_dimensions()
print(f"Result is: {result}")
print(f"Type: {type(result)}")
Width: 40, Height: 30
Result is: (40, 30)
Type: <class 'tuple'>
Tuple Creation Syntax
Tuples are created using commas with parentheses () or just commas:
# Creating tuples
point = (3, 5) # commas with parentheses
color = 255, 128, 0 # without parentheses (commas are enough!)
print(f"Point: {point}")
print(f"Color: {color}")
print(f"Both are tuples: {type(point)}, {type(color)}")
Point: (3, 5)
Color: (255, 128, 0)
Both are tuples: <class 'tuple'>, <class 'tuple'>
You can create a single element tuple!
# Single element tuple, but needs a trailing comma!
single = (42,) # note the comma
print(f"Single element tuple: {single}, type: {type(single)}")
Single element tuple: (42,), type: <class 'tuple'>
# Without comma, it's just a number in parentheses
not_tuple = (42)
print(f"Not a tuple: {not_tuple}, type: {type(not_tuple)}")
Not a tuple: 42, type: <class 'int'>
Tuples vs Lists
The key difference: tuples are immutable (can't be changed after creation).
# Lists can be modified
my_list = [1, 2, 3]
my_list[0] = 99
print(f"Modified list: {my_list}")
Modified list: [99, 2, 3]
# Tuples cannot be modified
my_tuple = (1, 2, 3)
my_tuple[0] = 99
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 3
      1 # Tuples cannot be modified
      2 my_tuple = (1, 2, 3)
----> 3 my_tuple[0] = 99
TypeError: 'tuple' object does not support item assignment
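Because a tuple cannot be changed in place, the usual workaround is to build a new tuple from pieces of the old one. The following is a minimal sketch of that idea; the helper name `replace_first` is hypothetical and introduced only for illustration.

```python
def replace_first(values, new_value):
    """Return a new tuple with the first element replaced; the original is untouched."""
    # (new_value,) is a single-element tuple; + concatenates it with the rest
    return (new_value,) + values[1:]


my_tuple = (1, 2, 3)
updated = replace_first(my_tuple, 99)

print(f"Original: {my_tuple}")  # Original: (1, 2, 3)
print(f"Updated: {updated}")    # Updated: (99, 2, 3)
```

Note that `my_tuple` still holds its original values afterwards; only the new name `updated` refers to the changed data.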
Tuple Operations
Even though tuples are immutable, you can still perform all the sequence operations that don't modify them:
- Access elements by index
- Slice them
- Iterate through them
- Get their length
coordinates = (10, 20, 30, 40)
# First element
coordinates[0]
10
# Last element
coordinates[-1]
40
# Slicing out the first two items (this creates a new tuple!)
coordinates[:2]
(10, 20)
# Length of the tuple
len(coordinates)
4
# Iterate over the tuple
for coord in coordinates:
    print(coord, end=" ")
print()
10 20 30 40
# Or the index based iteration
for i in range(len(coordinates)):
    print(coordinates[i], end=" ")
print()
10 20 30 40
More with Tuple Unpacking
Besides receiving multiple return values from a function, tuple unpacking lets you swap two values in a single line:
# Swap variables using tuple unpacking
a = 5
b = 10
print(f"Before swap: a = {a}, b = {b}")
a, b = b, a
print(f"After swap: a = {a}, b = {b}")
Before swap: a = 5, b = 10
After swap: a = 10, b = 5
You have also seen tuple unpacking when iterating with enumerate, which yields (index, value) tuples:
for i, coord in enumerate(coordinates):
    print(f"The coordinate at index {i} is {coord}")
The coordinate at index 0 is 10
The coordinate at index 1 is 20
The coordinate at index 2 is 30
The coordinate at index 3 is 40
When to Use Tuples
Use tuples when:
- Data shouldn't change (coordinates, RGB colors, dates)
- Returning multiple values from a function
- Using as dictionary keys (keys must be hashable, which in practice means immutable values such as numbers, strings, and tuples of those; we'll see this very soon)
- You want slightly better performance than lists
- You want to make it clear that this data is fixed (Python programmers know that tuples are immutable so this is an implicit norm)
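As a preview of the dictionary-key use case, here is a small sketch. The dictionary syntax is covered later in this lecture, and the grid data below is hypothetical, made up only to illustrate the point that a tuple works as a key while a list does not.

```python
# Hypothetical grid data: map (row, column) positions to stitch names
stitch_at = {
    (0, 0): "knit",
    (0, 1): "purl",
}
print(stitch_at[(0, 1)])  # purl

# A list cannot be used the same way because lists are mutable (unhashable)
try:
    stitch_at[[1, 0]] = "knit"
except TypeError as error:
    print(f"TypeError: {error}")
```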
Sets: Collections of Unique Elements
A set is an unordered collection of unique elements. No duplicates allowed, unlike in other collection structures like lists and tuples.
Creating Sets
You can create a set with curly braces (as opposed to square brackets for lists), with items separated by commas:
colors = {"red", "blue", "green"}
print(f"Colors set: {colors}")
Colors set: {'green', 'blue', 'red'}
To create an empty set, call the set() constructor:
empty_set = set() # {} would create an empty dict!
print(f"Empty set: {empty_set}")
Empty set: set()
You can also call the set() constructor on a list, which is a quick way to remove duplicates from the list:
my_list = [1, 1, 1, 2, 3, 3, 4, 4, 4, 4]
my_set = set(my_list)
print(f"The set that contains deduplicated list: {my_set}")
The set that contains deduplicated list: {1, 2, 3, 4}
Fast Membership Testing
As with lists and tuples, in can be used to check membership in a set.
valid_carriers_set = {1, 2, 3, 4, 5}
carrier = 3
print(f"Is {carrier} valid?")
carrier in valid_carriers_set
Is 3 valid?
True
However, sets are much faster than lists for checking membership. A set is stored as a hash table, so Python can jump almost directly to where an item would be, while a list has to be scanned one item at a time until a match is found. The difference is not obvious with small collections, but it becomes significant when you work with thousands of items. You will learn more about why that is the case if you go on to take a course in data structures.
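One way to see the difference in practice is to time the same lookup on a list and on a set. This is a rough sketch using the standard-library timeit module; the collection size and repeat count are arbitrary choices, and exact timings vary by machine, so no expected output is shown.

```python
import timeit

# Hypothetical data: the same 100,000 integers stored as a list and as a set
numbers_list = list(range(100_000))
numbers_set = set(numbers_list)
target = 99_999  # the last element, so the list lookup scans every item

# Run each membership check 100 times and record the total time in seconds
list_time = timeit.timeit(lambda: target in numbers_list, number=100)
set_time = timeit.timeit(lambda: target in numbers_set, number=100)

print(f"List lookup: {list_time:.6f} seconds")
print(f"Set lookup:  {set_time:.6f} seconds")
```

On most machines the set lookup should come out far smaller, and the gap typically widens as the collection grows.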
Set Operations
Sets support mathematical set operations:
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
print(f"Set A: {set_a}")
print(f"Set B: {set_b}")
Set A: {1, 2, 3, 4}
Set B: {3, 4, 5, 6}
# Union: combine both sets
set_a | set_b
{1, 2, 3, 4, 5, 6}
# Intersection: only keep elements in both sets
set_a & set_b
{3, 4}
# Difference: keep elements in A but not in B
set_a - set_b
{1, 2}
# Symmetric difference: keep elements in either set but not both
set_a ^ set_b
{1, 2, 5, 6}
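Each of the four operators above also has a method form on the built-in set type. The sketch below checks that the two spellings agree, and shows one practical difference: the methods accept any iterable, not just another set.

```python
set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}

# Each operator has an equivalent method
print(set_a.union(set_b) == (set_a | set_b))                 # True
print(set_a.intersection(set_b) == (set_a & set_b))          # True
print(set_a.difference(set_b) == (set_a - set_b))            # True
print(set_a.symmetric_difference(set_b) == (set_a ^ set_b))  # True

# Unlike the operators, the methods also accept other iterables such as lists
common = set_a.intersection([3, 4, 5, 6])
print(sorted(common))  # [3, 4]
```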
Adding and Removing from Sets
Sets have their own methods for adding and removing items, which differ slightly from the list methods (append, insert, pop, remove, extend):
carriers_used = {1, 2}
print(f"Initial: {carriers_used}")
# Add an element
carriers_used.add(3)
print(f"After add(3): {carriers_used}")
# Add an element that already exists (no error, just ignored)
carriers_used.add(2)
print(f"After add(2) again: {carriers_used}")
# Remove an element
carriers_used.remove(1)
print(f"After remove(1): {carriers_used}")
# Discard (like remove, but doesn't error if element doesn't exist)
carriers_used.discard(5) # 5 not in set, but no error
print(f"After discard(5): {carriers_used}")
carriers_used.remove(5) # errors
Initial: {1, 2}
After add(3): {1, 2, 3}
After add(2) again: {1, 2, 3}
After remove(1): {2, 3}
After discard(5): {2, 3}
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[28], line 20
     18 carriers_used.discard(5) # 5 not in set, but no error
     19 print(f"After discard(5): {carriers_used}")
---> 20 carriers_used.remove(5) # errors
KeyError: 5
Iterate Over Sets
Because sets are unordered, there are no indices associated with each item so we cannot use the [] indexing operator any more. We can only use the for ... in ... type of value-only for loop.
for c in valid_carriers_set:
    print(c, end=" ")
1 2 3 4 5
If you do want index-based iteration, you can convert a set to a sequence such as a list or tuple (the resulting order is arbitrary):
list(valid_carriers_set)
[1, 2, 3, 4, 5]
tuple(valid_carriers_set)
(1, 2, 3, 4, 5)
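Since a set makes no promise about the order of its items, one option when a predictable order matters is the built-in sorted(), which returns a new sorted list from any iterable. A minimal sketch, using a set written in a deliberately scrambled order:

```python
valid_carriers_set = {5, 3, 1, 4, 2}

# sorted() always returns a new list, so indexing works on the result
ordered_carriers = sorted(valid_carriers_set)

for i in range(len(ordered_carriers)):
    print(f"Index {i}: carrier {ordered_carriers[i]}")
```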
When to Use Sets
Use sets when:
- You need to ensure uniqueness of the items
- You need fast membership testing
- You want to perform set operations (union, intersection, etc.)
- Order doesn't matter
Don't use sets when:
- You need to maintain order
- You need to access elements by index
- You need duplicate values
Dictionaries: Key-Value Mappings
A dictionary lets you look up values using keys instead of numeric indices. Think of it like a real dictionary where you look up the word itself and then see what it means: the word is the key, and the meaning of that word is the value.
Dictionaries are very useful when you have data that map to some other data. For example, if we want to know the colors of the yarn carriers on the machine, we can use two lists, one to keep track of the carrier number, and the other to keep track of the color.
# Keeping track of yarn carrier numbers and colors, they should correspond to each other
carrier_numbers = [1, 2, 5, 4, 3]
carrier_colors = ["white", "black", "pink", "purple", "beige"]
# To find the color of carrier 3, we need to search
carrier = 3
index = carrier_numbers.index(carrier)
# Then index into the color list
color = carrier_colors[index]
print(f"Carrier {carrier} is {color}")
Carrier 3 is beige
Unless we always keep the two lists in sync, it's very easy to make a mistake. For example, you might change the yarn on the machine and update the colors list without updating the carrier numbers list to match.
But with dictionaries, we can create a mapping between the carrier number and the yarn color:
carrier_colors = {
    1: "white",
    2: "black",
    3: "beige",
    4: "purple",
    5: "pink"
}
This enables direct lookup:
carrier = 3
color = carrier_colors[carrier]
print(f"Carrier {carrier} is {color}")
Carrier 3 is beige
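If the data already lives in two parallel lists, the dictionary does not have to be retyped by hand. One possible approach is to pair the lists up with the built-in zip() and pass the pairs to dict(). In this sketch the list name `carrier_color_names` is an assumption, chosen to avoid reusing the name `carrier_colors` for both the list and the dictionary.

```python
carrier_numbers = [1, 2, 5, 4, 3]
carrier_color_names = ["white", "black", "pink", "purple", "beige"]

# zip() pairs items by position: (1, "white"), (2, "black"), ...
carrier_colors = dict(zip(carrier_numbers, carrier_color_names))

print(carrier_colors[3])  # beige
```

This only gives a correct mapping if the two lists are in sync at the moment of conversion; after that, each color stays attached to its carrier number.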
Creating and Using Dictionaries
Dictionaries use curly braces {} with key: value pairs:
# Create an empty dictionary (this is why empty set is not created with {} but set())
empty_dict = {}
print(f"Empty: {empty_dict}")
Empty: {}
# Create a dictionary with initial data
stitch_symbols = {
    "knit": "K",
    "purl": "P",
    "yarn over": "YO",
    "knit two together": "K2tog"
}
print("Stitch abbreviations:")
print(stitch_symbols)
Stitch abbreviations:
{'knit': 'K', 'purl': 'P', 'yarn over': 'YO', 'knit two together': 'K2tog'}
You can access the keys of a dictionary with .keys() and values with .values():
stitch_symbols.keys()
dict_keys(['knit', 'purl', 'yarn over', 'knit two together'])
stitch_symbols.values()
dict_values(['K', 'P', 'YO', 'K2tog'])
Note that these are of type dict_keys and dict_values. They are view objects of the keys and values, which means they update automatically as the dictionary changes. You can think of dict_keys as roughly set-like (keys are unique) and dict_values as roughly a list that you can iterate over but cannot index, and that may contain duplicates.
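To make the "view" behavior concrete, the sketch below keeps a reference to a keys view, changes the dictionary afterwards, and looks at the view again. It uses a shortened two-entry version of the stitch dictionary.

```python
stitch_symbols = {"knit": "K", "purl": "P"}

keys_view = stitch_symbols.keys()
print(list(keys_view))  # ['knit', 'purl']

# Add an entry AFTER the view was created
stitch_symbols["yarn over"] = "YO"
print(list(keys_view))  # ['knit', 'purl', 'yarn over']

# Keys views also support set operations such as intersection
print(keys_view & {"knit", "slip"})  # {'knit'}
```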
Dictionary Lookup
The [] operator is used to index into the dictionary. The syntax looks like dict[key] which means to return the value that corresponds to the key from the dictionary dict.
# Look up knit
print("Knit abbreviation:")
stitch_symbols['knit']
Knit abbreviation:
'K'
# Look up purl
print("Purl abbreviation:")
stitch_symbols['purl']
Purl abbreviation:
'P'
What if you try to access a key that doesn't exist in the dictionary?
stitch_symbols['purl two together']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[42], line 1
----> 1 stitch_symbols['purl two together']
KeyError: 'purl two together'
Similar to most data types we have seen (strings, lists, tuples, sets), you can use in for membership check:
'purl two together' in stitch_symbols
False
Note that this only checks for keys, not values, because implicitly, this is doing:
'purl two together' in stitch_symbols.keys()
False
Modifying a Dictionary
You can add a new entry to the dictionary by using dict[key] as the left-hand side of the assignment operator:
stitch_symbols['purl two together'] = 'P2tog'
stitch_symbols
{'knit': 'K',
'purl': 'P',
'yarn over': 'YO',
'knit two together': 'K2tog',
'purl two together': 'P2tog'}
And if the key already exists, this will be a modification of the dictionary:
stitch_symbols['purl two together'] = 'P2TOG'
stitch_symbols
{'knit': 'K',
'purl': 'P',
'yarn over': 'YO',
'knit two together': 'K2tog',
'purl two together': 'P2TOG'}
If you don't want to check for membership first but also don't want an error for a missing key, you can use get as a safe lookup:
stitch_symbols.get('nonexistent') # returns None for a key that doesn't exist in the dictionary
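The get method also takes an optional second argument, a default value to return instead of None when the key is missing. A short sketch with a two-entry version of the stitch dictionary (the "?" placeholder is an arbitrary choice):

```python
stitch_symbols = {"knit": "K", "purl": "P"}

missing = stitch_symbols.get("nonexistent")        # no default given
fallback = stitch_symbols.get("nonexistent", "?")  # default used
found = stitch_symbols.get("knit", "?")            # key exists, default ignored

print(missing)   # None
print(fallback)  # ?
print(found)     # K
```

Note that get never adds the missing key to the dictionary; it only affects what is returned.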
You can remove an entry with pop. The syntax is dict.pop(key), and similar to list.pop() it returns the value that is being popped.
popped_abbreviation = stitch_symbols.pop('purl two together')
print(f"The value popped is: {popped_abbreviation}")
stitch_symbols
The value popped is: P2TOG
{'knit': 'K', 'purl': 'P', 'yarn over': 'YO', 'knit two together': 'K2tog'}
Iterating Through Dictionaries
The usual ways of iterating still work, starting with for loops:
# This goes over the keys because iterating over stitch_symbols is the same as iterating over stitch_symbols.keys()
print("Keys:")
for stitch in stitch_symbols:
    print(stitch)
Keys:
knit
purl
yarn over
knit two together
# Iterate over values
print("\nValues:")
for value in stitch_symbols.values():
    print(value)
Values:
K
P
YO
K2tog
In addition to the keys view and values view, there's also a key-value pair view that is accessible with items(), which returns an object of dict_items type:
# Iterate over key-value pairs
print("\nKey-value pairs:")
for key, value in stitch_symbols.items():
    print(f"Abbreviation of {key} is {value}")
Key-value pairs:
Abbreviation of knit is K
Abbreviation of purl is P
Abbreviation of yarn over is YO
Abbreviation of knit two together is K2tog
Note that you can call len on dictionaries to find the number of key-value pairs in it, but you cannot use index-based iteration on dict_keys, dict_values, and dict_items.
for i in range(len(stitch_symbols)):
    print(stitch_symbols.keys()[i])
    # print(stitch_symbols.values()[i]) # Does not work, dict_values is also not subscriptable
    # print(stitch_symbols.items()[i]) # Does not work, dict_items is also not subscriptable
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[58], line 2
      1 for i in range(len(stitch_symbols)):
----> 2     print(stitch_symbols.keys()[i])
      3     # print(stitch_symbols.values()[i])
      4     # print(stitch_symbols.items()[i])
TypeError: 'dict_keys' object is not subscriptable
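If positional access is really needed, one workaround is to copy a view into a list first, since lists are subscriptable. A minimal sketch with a two-entry version of the stitch dictionary:

```python
stitch_symbols = {"knit": "K", "purl": "P"}

# list() makes a snapshot copy of the view that supports indexing
pairs = list(stitch_symbols.items())

first_key, first_value = pairs[0]  # each item is a (key, value) tuple
print(f"First entry: {first_key} -> {first_value}")  # First entry: knit -> K
```

Unlike the view, the list is a snapshot and will not reflect later changes to the dictionary.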
Poll Time!
What would the following loop print?
for i, item in enumerate(stitch_symbols):
    print(i, item)
Common Pattern: Counting with Dictionaries
Dictionaries are perfect for counting occurrences:
def count_values(values):
    """
    Count how many times each value appears.
    Parameters:
        values: list of values
    Returns:
        dictionary mapping value to count
    """
    counts = {}
    for value in values:
        if value in counts:
            counts[value] += 1
        else:
            counts[value] = 1
    return counts
# Test it
row = [0, 1, 0, 2, 1, 0, 1, 2, 0]
result = count_values(row)
print(f"Counts: {result}")
for value, count in result.items():
    print(f"Value {value} appears {count} times")
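The if/else inside the counting loop can be shortened with the get method and a default of 0. The sketch below shows that variant under a hypothetical name, `count_values_with_get`, and compares its result with `collections.Counter` from the standard library, which implements the same counting pattern.

```python
from collections import Counter


def count_values_with_get(values):
    """Count occurrences using dict.get with a default of 0."""
    counts = {}
    for value in values:
        # get returns 0 the first time a value is seen
        counts[value] = counts.get(value, 0) + 1
    return counts


row = [0, 1, 0, 2, 1, 0, 1, 2, 0]
print(count_values_with_get(row))                  # {0: 4, 1: 3, 2: 2}
print(Counter(row) == count_values_with_get(row))  # True
```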
Choosing the Right Data Structure
| Structure | Ordered? | Mutable? | Duplicates? | Use When... |
|---|---|---|---|---|
| List | Yes | Yes | Yes | You need an ordered, changeable sequence |
| Tuple | Yes | No | Yes | You need an ordered, fixed sequence |
| Dictionary | No* | Yes | No (keys) | You need key-value lookups |
| Set | No | Yes | No | You need unique elements, fast membership checks |
*As of Python 3.7, dictionaries maintain the insertion order of entries, but entries still cannot be accessed by numeric position.
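The footnote on dictionary ordering can be checked with a short experiment: insert keys in a non-sorted order and see what order they come back in. This sketch assumes Python 3.7 or later.

```python
carrier_colors = {}
carrier_colors[5] = "pink"
carrier_colors[1] = "white"
carrier_colors[3] = "beige"

# Keys come back in the order they were inserted, not in sorted order
print(list(carrier_colors))  # [5, 1, 3]
```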
Practice Exercises
Exercise 1: Create a Color Palette Dictionary
Create a dictionary mapping color names to RGB tuples. You can use this site to look for names of colors of your choice: https://rgb.utils.com/
# TODO: Create a color palette dictionary
# Example: {"red": (255, 0, 0), "green": (0, 255, 0), ...}
color_palette = {
    # Add at least 3 colors here
}
# Test: print each color and its RGB values
for name, rgb in color_palette.items():
    r, g, b = rgb # Unpack the tuple!
    print(f"{name}: R={r}, G={g}, B={b}")
Exercise 2: Count Unique Values in a 2D Array
def count_unique_values(array_2d):
    """
    Count how many unique values appear in a 2D array.
    Parameters:
        array_2d: 2D list
    Returns:
        integer count of unique values
    """
    # TODO: implement using a set
    pass
# Test
test_array = [
    [1, 2, 1],
    [2, 3, 2],
    [1, 3, 3]
]
count = count_unique_values(test_array)
print(f"Unique values: {count}") # Should be 3
Summary
Today we learned about three important data structures:
Tuples:
- Immutable sequences (can't be changed)
- Created with parentheses or commas
- Great for returning multiple values
- Support unpacking: x, y = point
Sets:
- Collections of unique elements
- Very fast membership testing
- Support mathematical set operations
- Created with curly braces or set()
Dictionaries:
- Key-value mappings
- Fast lookups by key
- Perfect for counting, mapping, and organizing data
- Created with curly braces: {key: value}
Choosing the right data structure for your problem is important because each has its strengths!