Lists

In the page on Tuples and Sets, we described that a tuple can store an ordered list of objects. A tuple is immutable, meaning that once it is created it can't be changed. In this case, you can't do things such as add objects to the list, remove objects, or reorder the objects. There are various reasons why a programmer might want to work with immutable objects such as tuples instead of mutable objects which can be changed, but a complete discussion of this is beyond the scope of the course. (Wikipedia has a discussion of this in their article on immutability.)

In any case, there are times when you want to be able to change the contents of a list. Python has a List type for this purpose. You create a list with square brackets.

In [1]:
my_list = [1,2,3,"infinity"]
type(my_list)
Out[1]:
list

Lists can be manipulated in much the same way as tuples:

In [2]:
print("The fourth item in my list is \"", my_list[3], "\".", sep="")
print("My list has", len(my_list), "items. They are:")
for x in my_list:
    print(x)
The fourth item in my list is "infinity".
My list has 4 items. They are:
1
2
3
infinity

However unlike tuples, you can add and remove objects from a list.

In [3]:
my_list.append("something after infinity")
my_list.remove(2)
print("My list is now", my_list)
My list is now [1, 3, 'infinity', 'something after infinity']

The append method above adds something to the end of the list. You can also add something to the beginning of a list:

In [4]:
my_list.insert(0,"zero")
print("My list is now", my_list)
My list is now ['zero', 1, 3, 'infinity', 'something after infinity']

The method insert(index,item) adds item to the list at the provided index moving everything at that point or later to one index later in the list. So we can return 2 to its proper location in the list by using:

In [5]:
my_list.insert(2,2)
print("My list is now", my_list)
My list is now ['zero', 1, 2, 3, 'infinity', 'something after infinity']

Example: List of Fibonnacci numbers

Suppose we want to create a function which lists the first \(n \geq 2\) Fibonnacci numbers starting at (1,1). The following does this:

In [6]:
def fibonacci_list(n):
    flist = [1,1]
    while len(flist)<n:
        flist.append(flist[len(flist)-1]+flist[len(flist)-2])
    return flist

Here we test it out for some values.

In [7]:
fibonacci_list(10)
Out[7]:
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
In [8]:
fibonacci_list(20)
Out[8]:
[1,
 1,
 2,
 3,
 5,
 8,
 13,
 21,
 34,
 55,
 89,
 144,
 233,
 377,
 610,
 987,
 1597,
 2584,
 4181,
 6765]

Example: Using a list to represent a polynomial

We could think of a polynomial in the variable \(x\) as represented by its list of coefficients so that if \(l\) is a list of coefficients, the represented polynomial is \[\sum_{i=0}^{len(l)} l[i] x^i.\] Here is a function which prints the polynomial represented by a list.

In [9]:
def polynomial_str(coefficient_list):
    string = ""+str(coefficient_list[0])
    for i in range(1, len(coefficient_list)):
        string = string + " + " + str(coefficient_list[i]) + "x^" + str(i)
    return string

Note that the str(obj) function converts an object to a string (see documentation of str()), and + concatenates two strings (see the tutorial). Here are some examples of using this function.

In [10]:
polynomial_str([3.2])
Out[10]:
'3.2'
In [11]:
polynomial_str([3.2,1,2,-3])
Out[11]:
'3.2 + 1x^1 + 2x^2 + -3x^3'

We can then ask if we can add two polynomials viewed as coefficient lists:

In [12]:
def add_polynomials(c1, c2):
    """Generate the coefficient list for the sum of two polynomials (each represented by a coefficient list)."""
    sum_list = []
    m = min(len(c1), len(c2))
    # First add the coefficients as long as the coefficient is defined for both polynomials
    for i in range(m):
        sum_list.append(c1[i] + c2[i])
        
    # Now deal with the possibility that one of the original polynomials might have more coefficients:
    if len(c1) > m:
        for i in range(m, len(c1)):
            sum_list.append(c1[i])
    elif len(c2) > m:
        for i in range(m, len(c2)):
            sum_list.append(c2[i])
    return sum_list

Examples:

In [13]:
c1=[1,0,2]
c2=[0,1,1,1]
print("Adding", polynomial_str(c1), "to", polynomial_str(c2), "yields", polynomial_str(add_polynomials(c1,c2)), end=".")
Adding 1 + 0x^1 + 2x^2 to 0 + 1x^1 + 1x^2 + 1x^3 yields 1 + 1x^1 + 3x^2 + 1x^3.

Aliasing (a warning)

When we write an expression such as x = 3 in Python, your should really think that this says "Give 3 the name x."

Consider the following code:

In [14]:
list1=[1,2,3]
list2=list1
print("First, list1 is",list1,"and list2 is", list2)
list1.remove(2)
First, list1 is [1, 2, 3] and list2 is [1, 2, 3]

It appears that in the code above the line list1.remove(2) removed \(2\) from both list1 and list2!

Actually, there is only one list object in this code block. That same list has been given two names: list1 and list2. The print statements are printing the same list twice!

Copying a list

You can make a copy of a list with the copy() method. This partially resolves the above problem.

In [15]:
list1=[1,2,3]
list2=list1.copy()
print("First, list1 is",list1,"and list2 is", list2)
list1.remove(2)
print("Second, list1 is",list1,"and list2 is", list2)
First, list1 is [1, 2, 3] and list2 is [1, 2, 3]
Second, list1 is [1, 3] and list2 is [1, 2, 3]

This type of copy is called a shallow copy. The problem manifests itself again for lists containing lists:

In [16]:
matrix1=[ [1,2], [4,5] ]
matrix2=matrix1.copy()
print("First, matrix1 is",matrix1,"and matrix2 is", matrix2)
matrix1[1][1]=17
print("Second, matrix1 is",matrix1,"and matrix2 is", matrix2)
First, matrix1 is [[1, 2], [4, 5]] and matrix2 is [[1, 2], [4, 5]]
Second, matrix1 is [[1, 2], [4, 17]] and matrix2 is [[1, 2], [4, 17]]

The copy() method makes a "shallow copy". It is essentially the same as the following code:

In [17]:
matrix1=[ [1,2], [4,5] ]
matrix2=[]
for row in matrix1:
    matrix2.append(row)
print("First, matrix1 is",matrix1,"and matrix2 is", matrix2)
matrix1[1][1]=17
print("Second, matrix1 is",matrix1,"and matrix2 is", matrix2)
First, matrix1 is [[1, 2], [4, 5]] and matrix2 is [[1, 2], [4, 5]]
Second, matrix1 is [[1, 2], [4, 17]] and matrix2 is [[1, 2], [4, 17]]

In this case matrix1 and matrix2 are different lists, but the elements in the list are all equal. In otherwords matrix2[i] is just a different name for matrix1[i].

To fix the last code block, instead of appending the rows of matrix1, we should append copies of the rows.

In [18]:
matrix1=[ [1,2], [4,5] ]
matrix2=[]
for row in matrix1:
    matrix2.append(row.copy())
print("First, matrix1 is",matrix1,"and matrix2 is", matrix2)
matrix1[1][1]=17
print("Second, matrix1 is",matrix1,"and matrix2 is", matrix2)
First, matrix1 is [[1, 2], [4, 5]] and matrix2 is [[1, 2], [4, 5]]
Second, matrix1 is [[1, 2], [4, 17]] and matrix2 is [[1, 2], [4, 5]]

That worked as expected. But what if our lists were even more nested? Python has a module called copy which has a deepcopy() method which deals with this.

In [19]:
import copy
matrix1=[ [1,2], [4,5] ]
matrix2=copy.deepcopy(matrix1)
print("First, matrix1 is",matrix1,"and matrix2 is", matrix2)
matrix1[1][1]=17
print("Second, matrix1 is",matrix1,"and matrix2 is", matrix2)
First, matrix1 is [[1, 2], [4, 5]] and matrix2 is [[1, 2], [4, 5]]
Second, matrix1 is [[1, 2], [4, 17]] and matrix2 is [[1, 2], [4, 5]]

To learn more about this method refer to the the python library's documentation of the copy module.

Remark: Issues like this explain why programmers prefer immutable objects (objects that can't be changed after they are created).

Table of methods applicable to Lists:

In the following lst denotes a list. The following are functions which work just like those for tuples.

len(lst) Returns the length of lst.
lst[n] Returns the \((n+1)\)st item in the list. Here \(0 \leq n < len(lst)\).
3 in lst Tests if 3 appears somewhere in lst, returning a boolean value.
3 not in lst Returns the boolean value of the statement "3 does not appear in lst."
lst1+lst2 Returns the concatenation of the two listslst1 and lst2 without changing them.
lst.count(3) Returns the number of times 3 appears in lst.
lst.index(3) Returns the first index of 3 in lst.
for x in lst: Iterates through the objects appearing in lst, running the indented block of code below once with x equal to each element of v.
list(z) Attempts to convert the object z to a tuple and returns the tuple. This works for sets, tuples, and strings.

Recall that tuples are immutable and lists are mutable. Here are some methods to change a list.

lst[i] = x Replaces the \(i\)th item in the list with \(x\). Results in an error if there is no \(i\)-th item.
lst.append(item) Adds item to the end of the list.
lst.insert(k,item) Adds item to the list in position \(k\) and increases the index of items previously with indices in \(\{k,k+1, \ldots, len(lst)-1\}\) by one.
lst.pop() Removes the last element of the list and returns it.
lst.pop(i) Removes the \(i\)th element of the list and returns it.
del lst[i] Removes the \(i\)th element of the list. Decreases the index of later items by one.
lst.remove(obj) Removes the first occurence of obj from the list. Causes an error if the item does not appear.
lst.copy() Returns a shallow copy of lst. (See the discussion above of shallow and deep copies.)
lst.clear() Removes all items from the list.
lst.extend(lst2)
or
lst += lst2
Adds the entries of lst2 to the end of lst.

Example: The Sieve of Eratosthenes

The Sieve of Eratosthenes is a method to list the prime numbers \(p\) with \(2 \leq p < n\) for some integer \(n \geq 2\).

The basic idea is to maintain a list of "possible primes." We will call this list lst starting with lst=[2,3,4,...,n-1]. We will construct a list of primes as we shorten our list of possible primes. As long as lst has at least one element, we will update our list of primes by adding the first element of lst. Then we remove all numbers in lst which are divisible by \(p\). By induction, it will follow that the first number in our list is always prime because it has no smaller prime divisors. Also, aside from the first element of lst, all the other numbers in the list that we remove are composite (i.e., not prime) because they are divisible by a prime distinct from themselves. It follows that when the list lst becomes empty, we have found all primes in the original list.

Here is a simple implementation of the Sieve. Whenever we update lst, we print it out. Hopefully this lets you figure out what it is doing. See the example below.

In [20]:
def list_of_primes(n):
    """returns a list of primes $p$ with 2<=p<n."""
    lst = list(range(2,n))
    print("Initially lst is ", lst, sep="")
    primes = []
    while len(lst) > 0:
        p = lst.pop(0)
        primes.append(p)
        temporary_list = []
        for x in lst:
            if x%p != 0:
                temporary_list.append(x)
        lst=temporary_list
        print("We found the prime ", p, ". At this point our lst is ", lst, sep="")
    return primes

The following prints the list of primes less than 30.

In [21]:
primes=list_of_primes(30)
print("Our list of primes is", primes)
Initially lst is [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
We found the prime 2. At this point our lst is [3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]
We found the prime 3. At this point our lst is [5, 7, 11, 13, 17, 19, 23, 25, 29]
We found the prime 5. At this point our lst is [7, 11, 13, 17, 19, 23, 29]
We found the prime 7. At this point our lst is [11, 13, 17, 19, 23, 29]
We found the prime 11. At this point our lst is [13, 17, 19, 23, 29]
We found the prime 13. At this point our lst is [17, 19, 23, 29]
We found the prime 17. At this point our lst is [19, 23, 29]
We found the prime 19. At this point our lst is [23, 29]
We found the prime 23. At this point our lst is [29]
We found the prime 29. At this point our lst is []
Our list of primes is [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Remarks: I did not attempt to write the most efficient or the shortest program which implements the Sieve of Eratosthenes. There is a good Wikipedia article on the Sieve of Eratosthenes.

Exercises:

  1. The tribonacci sequence is a sequence of integers defined inductively by \(a_0=a_1=0\), \(a_2=1\), and \(a_{n+3}=a_n+a_{n+1}+a_{n+2}\) for integers \(n \geq 0\). Write a function tribonacci(m) which takes as input an number \(m \geq 1\) and returns the list \([a_0, a_1, a_2, \ldots, a_k]\) where \(a_k\) is the largest number in the sequence with \(a_k<m\).
  2. A list is a palindrome if it is the same when reversed. Write a function is_palindrome(lst) which takes as input a list lst and returns the truth value of the statement "lst is a palindrome."
  3. Write a function derivative(c) which takes as input a list c coefficients of a polynomial and returns the coefficient list of the derivative of the polynomial.
  4. Write a function multiply_polynomials(c1,c2) which takes as input two lists of coefficients of two polynomials and returns the coefficient list of the product of these polynomials.
  5. A polygon in the plane can be stored as a list of coordinates of vertices, where each vertex is a pair of numbers. For instance [(0,0),(3,0),(0,4)] represents a \(3\)-\(4\)-\(5\) right triangle.
    Write a function perimeter(p) which computes the perimeter of the polygon \(p\), where \(p\) is represented as a list of coordinates as above. For example, perimeter([(0,0),(3,0),(0,4)]) should return \(12\) (which is \(3+4+5\)).
  6. Viewing addition as a binary operation, the Catalan number \(C_k\) is the number of ways to write \(k+1\) as a sum of \(k+1\) ones. Here \(k \geq 0\) is an integer. For example \(C_0=1\) because \(1\) can only be expressed as \(1\), and \(C_1=1\) because \(2=1+1\) is the only way to write \(2\) as a sum of ones. But, \(C_2=2\) because \[3=(1+1)+1=1+(1+1),\] and \(C_3=5\) because \[4=1+\big(1+(1+1)\big)=1+\big((1+1)+1\big)=(1+1)+(1+1)=\big((1+1)+1\big)+1=\big(1+(1+1)\big)+1.\] Suppose we have an expression for \(k+1\) as a sum of ones. Then there is an outermost addition, and we can simplify the left and right sides. For example, the sum \(\big(1+(1+1)\big)+1\) simplifies to \(3+1\). Every sum representing \(k+1\) simplifies to a sum of the form \(a+b\) with \(a \geq 1\), \(b \geq 1\) and \(a+b=k+1\). Furthermore in such a sum \(a\) and \(b\) are each represented as a sum of ones. It follows that for \(k \geq 1\) we have \[C_k=\sum_{i=0}^{k-1} C_i C_{k-i-1}.\] (Here each term \(C_i C_{k-i-1}\) represents number of ways to write \(k+1\) as a sum of ones which simplifies to \((i+1) + (k-i)\).)
    Write a function catalan_numbers(n) which returns the list of the first \(n\) Catalan numbers: \([C_0, C_1, \ldots, C_{n-1}]\). (The Catalan numbers show up in a lot of counting problems. Wikipedia has a nice article on the Catalan numbers.)

References:

  • The book How to Think Like a Computer Scientist: Learning with Python 3 has a chapter on lists.
  • The Python tutorial has a sections entitled Lists and More on Lists which describe how to use lists.