Student's Name: PLEASE INSERT YOUR NAME HERE
Directions: Add work to this notebook to solve the problems below. Some other cells have been included which are important for testing your work and should give you some feeback.
Check your work. I give some tests that should be passed if you get it correct.
Your notebook should run without printing errors and without user input. If this is not the case, points will be deducted from your grade.
Problem Sources:
# Standard imports
import numpy as np
import matplotlib.pyplot as plt
import math as m
from mpmath import mp, iv
Data Source:
Monthly $CO_2$ measurements made at the Mauna Lao Observatory are available from the co2.earth website. I have included the important information from the table of that data for years begining with 2010 and placed it in the following dictionary. (I extracted this data from this text file.)
data = { 2010.042: 388.7,
2010.125: 390.2,
2010.208: 391.17,
2010.292: 392.46,
2010.375: 393.0,
2010.458: 392.15,
2010.542: 390.2,
2010.625: 388.35,
2010.708: 386.85,
2010.792: 387.24,
2010.875: 388.67,
2010.958: 389.79,
2011.042: 391.33,
2011.125: 391.86,
2011.208: 392.6,
2011.292: 393.25,
2011.375: 394.19,
2011.458: 393.74,
2011.542: 392.51,
2011.625: 390.13,
2011.708: 389.08,
2011.792: 389.0,
2011.875: 390.28,
2011.958: 391.86,
2012.042: 393.12,
2012.125: 393.86,
2012.208: 394.4,
2012.292: 396.18,
2012.375: 396.74,
2012.458: 395.71,
2012.542: 394.36,
2012.625: 392.39,
2012.708: 391.11,
2012.792: 391.05,
2012.875: 392.98,
2012.958: 394.34,
2013.042: 395.55,
2013.125: 396.8,
2013.208: 397.43,
2013.292: 398.41,
2013.375: 399.78,
2013.458: 398.61,
2013.542: 397.32,
2013.625: 395.2,
2013.708: 393.45,
2013.792: 393.7,
2013.875: 395.16,
2013.958: 396.84,
2014.042: 397.85,
2014.125: 398.01,
2014.208: 399.77,
2014.292: 401.38,
2014.375: 401.78,
2014.458: 401.25,
2014.542: 399.1,
2014.625: 397.03,
2014.708: 395.38,
2014.792: 396.03,
2014.875: 397.28,
2014.958: 398.91,
2015.042: 399.98,
2015.125: 400.28,
2015.208: 401.54,
2015.292: 403.28,
2015.375: 403.96,
2015.458: 402.8,
2015.542: 401.31,
2015.625: 398.93,
2015.708: 397.63,
2015.792: 398.29,
2015.875: 400.16,
2015.958: 401.85,
2016.042: 402.56,
2016.125: 404.12,
2016.208: 404.87,
2016.292: 407.45,
2016.375: 407.72,
2016.458: 406.83,
2016.542: 404.41,
2016.625: 402.27,
2016.708: 401.05,
2016.792: 401.59,
2016.875: 403.55,
2016.958: 404.45,
2017.042: 406.17,
2017.125: 406.46,
2017.208: 407.22,
2017.292: 409.04,
2017.375: 409.69,
2017.458: 408.88,
2017.542: 407.12,
2017.625: 405.13,
2017.708: 403.37,
2017.792: 403.63,
2017.875: 405.12,
2017.958: 406.81,
2018.042: 407.96,
2018.125: 408.32,
2018.208: 409.41,
2018.292: 410.24,
2018.375: 411.24,
2018.458: 410.79,
2018.542: 408.71,
2018.625: 406.99,
2018.708: 405.51,
2018.792: 406.0,
2018.875: 408.02,
2018.958: 409.07,
2019.042: 410.83,
2019.125: 411.75,
2019.208: 411.97,
2019.292: 413.32,
2019.375: 414.66,
2019.458: 413.92,
2019.542: 411.77,
2019.625: 409.94,
2019.708: 408.54,
2019.792: 408.53,
2019.875: 410.27,
2019.958: 411.76,
2020.042: 413.4,
2020.125: 414.11,
2020.208: 414.5}
A dictionary represents a map. Here our map sends a month to the average ppm of $CO_2$ measured that month. Each month is represented by the point in time, the middle of the month measured in years. For example, since 2020.042
is the first entry larger than 2020
, this represents January 2020 and the average of $CO_2$ measurements that month was 413.4 ppm
.
We haven't discussed dictionaries much. But, you can learn about them in L § 6.2. Here we convert the data to numpy arrays:
months = []
measurements = []
for month in data:
months.append(month)
measurements.append(data[month])
months = np.array(months)
measurements = np.array(measurements)
Below we plot the data.
plt.plot(months,measurements,".")
plt.xlabel("time in years")
plt.ylabel("average $C0_2$ measurement in ppm")
plt.title("Monthly C02 measurements")
plt.show()
The goal of this homework assignment is to try to fit a curve to this data. Looking at the graph, we see two dominant features:
This suggests we might be able to fit a curve to this data of the form $$f(t) = c_0 + c_1 t + c_2 \sin(2 \pi t) + c_3 \cos(2 \pi t),$$ where $c_0$, $c_1$, $c_2$, and $c_3$ are real constants. The collection of all functions of this form is a vector space, we will denote by $W$.
The space $W$ has a natural coordinate system in ${\mathbb R}^4$ given by coefficients. That is, $c \in {\mathbb R}^4$ corresponds to the function $$f(t) = c_0 + c_1 t + c_2 \sin(2 \pi t) + c_3 \cos(2 \pi t).$$
Write a function to_function(c)
which takes as input a numpy array c
consisting of four coefficients and returns the function $f$ above. Use currying to return a function. (Currying was discussed in the last example in the Plotting Graphs notebook.) Be sure to use cos
and sin
from numpy
so that the functions work with numpy
arrays.
# Test 1: This should produce a line.
x = np.linspace(2010,2021,1000)
f = to_function(np.array([1,1,0,0]))
plt.plot(x,f(x))
plt.show()
# Test 2: This should produce a wave with a period of 1 year.
x = np.linspace(2010,2021,1000)
f = to_function(np.array([0,0,1,-1]))
plt.plot(x,f(x))
plt.show()
A standard basis for $W$ given as coefficients consists of the following four vectors.
coefs_0 = np.array([1,0,0,0])
coefs_1 = np.array([0,1,0,0])
coefs_2 = np.array([0,0,1,0])
coefs_3 = np.array([0,0,0,1])
Each coefficient vector represents a function, which can be evaluated on times in our data set. This gives alternate coordinates for $W$ in ${\mathbb R}^{123}$. We can "pull back" the dot product on ${\mathbb R}^{123}$ to an inner product on $W$. The following is this pull back inner product:
def ip(coefs_A, coefs_B):
return to_function(coefs_A)(months) @ to_function(coefs_B)(months)
# This is another test of the to_function from problem 1.
assert abs(ip(coefs_0, coefs_1)-247860.375)<10**-8, "There is something wrong with to_function."
Use the Gram-Schmidt process to find an othogonal basis for $W$ with respect to the inner product given in the ip
function above. Store the four elements of the orthogonal basis in the variables orth_0
, orth_1
, orth_2
, and orth_3
.
orth_0 =
orth_1 =
orth_2 =
orth_3 =
# Test: The vectors should should each be coefficient vectors, with 4 entries:
assert len(orth_0)==4, "The length of orth_0 should be 4."
assert len(orth_1)==4, "The length of orth_1 should be 4."
assert len(orth_2)==4, "The length of orth_2 should be 4."
assert len(orth_3)==4, "The length of orth_3 should be 4."
# Test: The vectors should be orthogonal. Here we compute the square of the
# cosine of the angle between two vectors.
assert ip(orth_0,orth_1)**2 / ip(orth_0,orth_0) / ip(orth_1,orth_1) < 10**-15, \
"Vectors orth_0 and orth_1 are not orthogonal."
assert ip(orth_0,orth_2)**2 / ip(orth_0,orth_0) / ip(orth_2,orth_2) < 10**-15, \
"Vectors orth_0 and orth_2 are not orthogonal."
assert ip(orth_0,orth_3)**2 / ip(orth_0,orth_0) / ip(orth_3,orth_3) < 10**-15, \
"Vectors orth_0 and orth_3 are not orthogonal."
assert ip(orth_1,orth_2)**2 / ip(orth_1,orth_1) / ip(orth_2,orth_2) < 10**-15, \
"Vectors orth_1 and orth_2 are not orthogonal."
assert ip(orth_1,orth_3)**2 / ip(orth_1,orth_1) / ip(orth_3,orth_3) < 10**-15, \
"Vectors orth_1 and orth_3 are not orthogonal."
assert ip(orth_2,orth_3)**2 / ip(orth_2,orth_2) / ip(orth_3,orth_3) < 10**-15, \
"Vectors orth_2 and orth_3 are not orthogonal."
Recall ${\mathbb R}^{123}$ represents the space of all possible measurements that could have been taken at the times in our data set. Our data set measurements
of $CO_2$ measurements represents a single point in this space. Find the orthogonal projection of measurements
to $W$ as a list of four coefficients. Store the coefficients as a numpy array in the variable proj
.
We remark that the elements of ${\mathbb R}^{123}$ associated to your orthogonal basis can be computed with:
orth_img_0 = to_function(orth_0)(months)
orth_img_1 = to_function(orth_1)(months)
orth_img_2 = to_function(orth_2)(months)
orth_img_3 = to_function(orth_3)(months)
Also, in ${\mathbb R}^{123}$, we are using the usual dot product. Compute proj
below.
proj =
Instead of a test, you can visually check if your coefficients track the data. Here is a plot that draws both the data points in blue and the function corresponding to proj
in red. This gives a least squares approximation.
proj_f = to_function(proj)
t = np.linspace(2010,2021,1000)
plt.plot(t, proj_f(t), "r")
plt.plot(months, measurements, ".")
plt.xlabel("time in years")
plt.ylabel("average $C0_2$ measurement in ppm")
plt.title("Monthly C02 measurements")
plt.show()
As a remark, the linear growth term is the most troubling. Your computation suggests that the $CO_2$ is increasing at the following rate (measured in ppm/year):
proj[1]
Also, you might guess that we will hit the milestone
of 500 ppm of $CO_2$ in the atmosphere around the following year:
(500 - proj[0]) / proj[1]
If you are interested in learning more, a more serious discussion can be read here: https://e360.yale.edu/features/how-the-world-passed-a-carbon-threshold-400ppm-and-why-it-matters.