Lists and looping
Now things start to get really interesting! Lists are collections of things stored in a specific order. They can be defined literally by wrapping things in square brackets [], separating items with commas.
[1]:
import math
[2]:
a = [4, 2, 9, 3]
[3]:
a
[3]:
[4, 2, 9, 3]
Python lists can contain collections of whatever you like.
[4]:
excellent = [41, 'Hello', math.sin]
Each item in the list can be accessed by its index, its position in the list, which starts at zero for the first item. Indexing by negative numbers starts from the last item of the list.
[5]:
a[0]
[5]:
4
[6]:
a[2]
[6]:
9
[7]:
a[-1]
[7]:
3
We’ll get an error if we try to access an index that doesn’t exist:
[8]:
a[99]
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[8], line 1
----> 1 a[99]
IndexError: list index out of range
Like strings, lists have a length, which can be found with the len function.
[9]:
len(a)
[9]:
4
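As a small sketch of how negative indices relate to the length, the example below recreates the list a from above and checks that index -1 refers to the same item as index len(a) - 1. The variable names are only illustrative.

```python
a = [4, 2, 9, 3]

# A negative index -k counts back from the end,
# so it points at the same item as index len(a) - k.
last = a[-1]
same_last = a[len(a) - 1]
second_to_last = a[-2]

print(last, same_last, second_to_last)  # 3 3 9
```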
Unlike strings, lists are mutable, which means we can modify lists in-place, without creating a new one.
[10]:
a.append(45)
[11]:
a
[11]:
[4, 2, 9, 3, 45]
[12]:
len(a)
[12]:
5
We can see that the list was modified in-place because using the append method didn’t return a new list (nothing was printed), and our variable a now has a different value.
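To make the in-place behaviour explicit, this short sketch stores whatever append returns. The name returned is just an illustration; the point is that append changes the existing list and hands back None rather than a new list.

```python
a = [4, 2, 9, 3]

# append modifies the list itself and returns None
returned = a.append(45)

print(returned)  # None
print(a)         # [4, 2, 9, 3, 45]
```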
Because lists are mutable, we can use the special del keyword to remove the items at specific indices from the list.
[13]:
del a[-2]
[14]:
a
[14]:
[4, 2, 9, 45]
Good to know: del is a language keyword representing an action, and not a function. The syntactic difference is that functions take their arguments between parentheses, such as my_function(1, 2, 3), whereas del does not.
You can retrieve sub-lists by using slice notation whilst indexing.
[15]:
a[1:-1]
[15]:
[2, 9]
This retrieves the part of list a starting from index 1 until just before index -1. The indexing is ‘exclusive’ in that it excludes the item at the end index. This is the convention for slicing in Python.
You can omit the number in the first or second position, and Python will assume you mean the start of the list (index zero) or the end of the list (index len(a), one past the last element), respectively.
[16]:
a[:-2]
[16]:
[4, 2]
[17]:
a[1:]
[17]:
[2, 9, 45]
[18]:
a[:]
[18]:
[4, 2, 9, 45]
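The defaults for omitted slice positions can be checked directly. This sketch compares each shorthand slice with its fully written-out form, using the same values as the list a above.

```python
a = [4, 2, 9, 45]

# Omitting the start means "from index 0"
head_implicit = a[:2]
head_explicit = a[0:2]

# Omitting the end means "up to index len(a)"
tail_implicit = a[1:]
tail_explicit = a[1:len(a)]

print(head_implicit == head_explicit)  # True
print(tail_implicit == tail_explicit)  # True
```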
Slicing returns a copy of the list, so modifying the return value doesn’t affect the original list.
[19]:
b = a[1:]
print(b)
[2, 9, 45]
[20]:
b[0] = 3
[21]:
b
[21]:
[3, 9, 45]
[22]:
a
[22]:
[4, 2, 9, 45]
We did something cool there by assigning a value to a specific index, b[0] = 3. The same trick works with slices.
[23]:
b[:2] = [99, 2, 78]
[24]:
b
[24]:
[99, 2, 78, 45]
This is equivalent to replacing a certain range (:2, or the items at positions 0 and 1) of the list b with the items from another list. Note that in our example we replace 2 elements with 3. The same syntax can be used for inserting elements at an arbitrary position in the list. If we want to insert the number 6 between the 2 and the 78 in the list above, we would use:
[25]:
b[2:0] = [6]
meaning: take out no elements from the list (the slice 2:0 is empty, because its end does not come after its start) and insert the contents of the list [6] at position 2.
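A slice with the same start and end, such as 2:2, is also empty, and is perhaps the more familiar way to write an insertion. The sketch below applies both spellings to separate copies of the list to show that they give the same result; the name c is just an illustrative copy.

```python
b = [99, 2, 78, 45]
c = b[:]  # independent copy, so the two insertions don't interfere

# Both slices select zero elements starting at position 2,
# so the assignment only inserts and removes nothing.
b[2:0] = [6]
c[2:2] = [6]

print(b)       # [99, 2, 6, 78, 45]
print(b == c)  # True
```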
Exercise
Slicing creates a copy, so what notation could you use to copy the full list?
Solution You need to slice from the very beginning to the very end of the list.
[26]:
a[:]
[26]:
[4, 2, 9, 45]
This is equivalent to specifying the indices explicitly.
[27]:
a[0:len(a)]
[27]:
[4, 2, 9, 45]
[28]:
a == a[:]
[28]:
True
[29]:
a is a
[29]:
True
[30]:
a is a[:] # creates a copy!
[30]:
False
Looping
When you’ve got a collection of things, it’s pretty common to want to access each one sequentially. This is called looping, or iterating, and is super easy.
[31]:
for item in a:
    print(item)
4
2
9
45
We have to indent the code inside the for loop to tell Python that these lines should be run for every iteration.
Indentation
The for loop is a block, and every Python block requires indentation, unlike other “free-form” languages such as C++ or Java. This means that Python will throw an error if you don’t indent the body of the loop. With the body indented, the loop runs as expected:
[32]:
for i in b:
    print(i)
99
2
6
78
45
Indentation must be consistent within the same block, so if you indent two lines in the same for loop using a different number of spaces, Python will complain once again. Both lines of the body below use the same indentation:
[33]:
for i in b:
    print("I am in a loop")
    print(i)
I am in a loop
99
I am in a loop
2
I am in a loop
6
I am in a loop
78
I am in a loop
45
Indentation is necessary as Python does not use any keyword or symbol to determine the end of a block (e.g. there is no endfor). As a side effect, indentation forces you to make your code more readable!
Note that it does not matter how many spaces you use for indentation. As a convention, we are using four spaces.
The variable name item can be whatever we want, but its value is changed by Python to be the element of the list we’re currently on, starting from the first.
Because lists are mutable, we can try to modify the length of the list whilst we’re iterating.
[34]:
a_copy = a[:]
for item in a_copy:
    del a_copy[0]
[35]:
a_copy
[35]:
[9, 45]
Intuitively, you might expect a_copy to be empty, but it’s not! The technical reasons aren’t important, but this highlights an important rule: never modify the length of a list whilst iterating over it! You won’t end up with what you expect.
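If you do need to shrink a list inside a loop, one possible workaround is to iterate over a snapshot made with a slice, so that the list being looped over never changes length. This is only a sketch; the name emptied is illustrative.

```python
a = [4, 2, 9, 45]
emptied = a[:]

# emptied[:] is a separate copy, so the loop always runs four times
# even though emptied itself shrinks on every iteration.
for item in emptied[:]:
    del emptied[0]

print(emptied)  # []
print(a)        # [4, 2, 9, 45]
```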
You can, however, freely modify the values of each item in the list whilst looping. This is a very common use case.
[36]:
a_copy = a[:]
i = 0
for item in a_copy:
    a_copy[i] = 2*item
    i += 1
[37]:
a_copy
[37]:
[8, 4, 18, 90]
Keeping track of the current index ourselves, with i, is annoying, but luckily Python gives us a nicer way of doing that.
[38]:
a_doubled = a[:]
for index, item in enumerate(a_doubled):
    a_doubled[index] = 2 * item
[39]:
a_doubled
[39]:
[8, 4, 18, 90]
There’s a lot going on here. Firstly, note that Python lets you assign values to multiple variables at the same time.
[40]:
one, two = [34, 43]
[41]:
print(f"one: {one}, two: {two}")
one: 34, two: 43
That’s already pretty cool! But then think about what happens if you had a list where each item was another list, each containing two numbers.
[42]:
nested = [[20, 29], [30, 34]]
for item in nested:
    print(item)
[20, 29]
[30, 34]
So, we can just assign each item in the sublist to separate variables in the for statement.
[43]:
for one, two in nested:
    print(two, one)
29 20
34 30
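Unpacking in the for statement is shorthand for indexing each sublist by hand. The sketch below computes the same quantity both ways, using the nested list from above; the names unpacked, indexed and pair are only illustrative.

```python
nested = [[20, 29], [30, 34]]

# Unpack each two-item sublist directly in the for statement
unpacked = []
for one, two in nested:
    unpacked.append(two - one)

# The same loop written with explicit indexing
indexed = []
for pair in nested:
    indexed.append(pair[1] - pair[0])

print(unpacked)  # [9, 4]
print(indexed)   # [9, 4]
```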
Now we can understand a little better what enumerate does: for each item in the list, it produces a pair (a tuple) containing the current index and the item.
[44]:
enumerate(a)
[44]:
<enumerate at 0x7f547c84b6f0>
[45]:
list(enumerate(a))
[45]:
[(0, 4), (1, 2), (2, 9), (3, 45)]
For more advanced reasons enumerate doesn’t return a list directly, but instead something that the for statement knows how to iterate over (this is called an iterator, and for the moment you don’t need to know how it works). We can convert it to a list with the list function when we want to see what it’s doing.
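One practical consequence of enumerate not returning a list is that the object it returns can only be walked through once. The sketch below converts the same enumerate object to a list twice; the variable names are illustrative.

```python
a = [4, 2, 9, 45]
pairs = enumerate(a)

# The first conversion consumes every (index, item) pair...
first_pass = list(pairs)
# ...so nothing is left for the second one.
second_pass = list(pairs)

print(first_pass)   # [(0, 4), (1, 2), (2, 9), (3, 45)]
print(second_pass)  # []
```

Calling enumerate(a) again gives a fresh object that starts from the beginning.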
This technique of looping over lists of lists lets us loop over two lists simultaneously, using the zip function.
[46]:
for item, item2 in zip(a, a_doubled):
    print(item2, item)
8 4
4 2
18 9
90 45
Neat! As before, we can see what zip is doing explicitly by using list.
[47]:
list(zip(a, a_doubled))
[47]:
[(4, 8), (2, 4), (9, 18), (45, 90)]
You can see that the structure of the list that’s iterated over, the output of zip, is identical to that for enumerate.
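The two lists above happen to have the same length. As a hedged aside, when the inputs differ in length, zip stops as soon as the shorter one runs out. The list short below is made up for illustration.

```python
a = [4, 2, 9, 45]
short = ['x', 'y']  # hypothetical second list with fewer items

# zip pairs items up position by position and stops at the shorter input
pairs = list(zip(a, short))

print(pairs)  # [(4, 'x'), (2, 'y')]
```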
Finally, we’ll take a quick look at the range function.
[48]:
for i in range(0, 10):
    print(i)
0
1
2
3
4
5
6
7
8
9
The arguments to range work just like slicing: the second argument is exclusive, so its value is excluded from the output. Again like slicing, we can specify a third argument as the step size for the iteration.
[49]:
for i in range(0, 10, 2):
    print(i)
0
2
4
6
8
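The step in a slice follows the same start, stop, step pattern. This sketch takes every other item of a list with a stepped slice, then picks out the same items by looping over the matching range; the names every_other and picked are illustrative.

```python
a = [4, 2, 9, 45]

# A slice can take a third number, the step, just like range
every_other = a[0:4:2]

# range(0, 4, 2) produces the same indices that the slice visits
picked = []
for i in range(0, 4, 2):
    picked.append(a[i])

print(every_other)  # [4, 9]
print(picked)       # [4, 9]
```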
If you only give a single argument to range, it assumes you’ve given the end value, and want a starting value of zero.
[50]:
for i in range(5):
    print(i)
0
1
2
3
4
This reads “give me a sequence of 5 numbers, in steps of 1, starting from zero”.
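Wrapping range in list makes the different ways of calling it easy to compare. The variable names in this sketch are illustrative.

```python
# One argument: the end value, starting from zero in steps of one
first_five = list(range(5))

# The same sequence with start, end and step all written out
explicit = list(range(0, 5, 1))

# Two arguments: start and end
offset = list(range(2, 5))

print(first_five)  # [0, 1, 2, 3, 4]
print(explicit)    # [0, 1, 2, 3, 4]
print(offset)      # [2, 3, 4]
```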
Now that we know how to easily generate sequences of numbers, we can write enumerate by hand!
[51]:
for index, item in zip(range(len(a)), a):
    print(index, item)
0 4
1 2
2 9
3 45
Just like before! When you see something cool like enumerate, it can be fun trying to see how you’d accomplish something similar with different building blocks.
List comprehension (Sugar, can be skipped on first read)
We’ve already made a new list from an existing one when we created a_doubled.
[52]:
a_doubled = a[:]
for index, item in enumerate(a_doubled):
    a_doubled[index] = 2*item
Creating a new list from an existing one is a common operation, so Python has a shorthand syntax called list comprehension.
[53]:
a_doubled = [2*item for item in a]
[54]:
a_doubled
[54]:
[8, 4, 18, 90]
Isn’t that beautiful?
We can use the same multi-variable stuff we learnt whilst looping.
[55]:
[index*item for index, item in enumerate(a)]
[55]:
[0, 2, 18, 135]
We’re not restricted to creating new lists with the same structure as the original.
[56]:
[[item, item*item] for item in a]
[56]:
[[4, 16], [2, 4], [9, 81], [45, 2025]]
We can even filter out items from the original list using if.
[57]:
[[item, item*item] for item in a if item % 2 == 0]
[57]:
[[4, 16], [2, 4]]
List comprehensions are a powerful way of succinctly creating new lists. But be responsible; if you find you’re doing something complicated, it’s probably better to write a full for loop.
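For comparison, here is a sketch of the filtered comprehension from above next to the full for loop it stands in for. Both build the same list; the names from_comprehension and from_loop are illustrative.

```python
a = [4, 2, 9, 45]

# The filtered comprehension from above
from_comprehension = [[item, item*item] for item in a if item % 2 == 0]

# The same logic written as a full for loop
from_loop = []
for item in a:
    if item % 2 == 0:
        from_loop.append([item, item*item])

print(from_comprehension)               # [[4, 16], [2, 4]]
print(from_loop == from_comprehension)  # True
```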
Exercise Write a list comprehension yourself
Compute the square of the magnitude of the sum of the following two three-vectors, using a single list comprehension and the built-in sum function.
It might help to first think about how you’d compute the quantity for a single vector.
Not sure what the sum function does? Ask for help!
[58]:
help(sum)
Help on built-in function sum in module builtins:
sum(iterable, /, start=0)
Return the sum of a 'start' value (default: 0) plus an iterable of numbers
When the iterable is empty, return the start value.
This function is intended specifically for use with numeric values and may
reject non-numeric types.
[59]:
kaon = [3.4, 4.3, 20.0]
pion = [1.4, 0.9, 19.8]
Solution
The square magnitude is the sum of the squares of the components, where each component is the sum of the corresponding components of the two input vectors.
[60]:
magsq = sum([(k + pi)**2 for k, pi in zip(kaon, pion)])
The square root of this is around 40.42.
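The quoted value can be checked with math.sqrt, as in this short sketch. It repeats the definitions from the exercise so that it runs on its own; the name magnitude is illustrative.

```python
import math

kaon = [3.4, 4.3, 20.0]
pion = [1.4, 0.9, 19.8]

# Sum the vectors component by component, square each component, add them up
magsq = sum([(k + pi)**2 for k, pi in zip(kaon, pion)])
magnitude = math.sqrt(magsq)

print(round(magnitude, 2))  # 40.42
```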
Tuples
A close relative of the list is the tuple, which differs in that it cannot be mutated after creation. You can create tuples literally using parentheses, or convert things to tuples using the tuple function.
[61]:
a = (3, 4)
[62]:
del a[0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[62], line 1
----> 1 del a[0]
TypeError: 'tuple' object doesn't support item deletion
[63]:
a.append(5)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[63], line 1
----> 1 a.append(5)
AttributeError: 'tuple' object has no attribute 'append'
Tuples are usually used to describe data whose length is meaningful in and of itself. For example, you could express coordinates as a tuple.
[64]:
coords = (3.2, 0.1)
x, y = coords
This is nice because it doesn’t make sense to append to an $(x, y)$ coordinate, nor to ‘delete’ a dimension. Generally, it can be useful if the data structure you’re using respects the meaning of the data you’re storing.
If you can’t think of a use for tuples yourself, it’s worth keeping in mind that Python creates tuples for groups of things by default. We saw that earlier when we used enumerate.
[65]:
list(enumerate([4, 9]))
[65]:
[(0, 4), (1, 9)]
Each element of the list is a tuple.
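As a final sketch, the tuple function mentioned above converts a list into a tuple, and indexing and unpacking then work just as they do for lists. The list components is made up for illustration.

```python
components = [3.2, 0.1]  # hypothetical list of coordinate values

# Convert the list to a tuple; the original list is left untouched
coords = tuple(components)
x, y = coords

print(coords)     # (3.2, 0.1)
print(coords[0])  # 3.2
print(x, y)       # 3.2 0.1

# Each element produced by enumerate is a tuple too
first_pair = list(enumerate([4, 9]))[0]
print(type(first_pair) is tuple)  # True
```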