# Machine Learning

Notes on Linear Algebra for Polynomials

Python Reporter

A comprehensive exploration of polynomials as vector spaces, covering linear independence, basis, inner products, and orthogonality with concrete examples and proofs.

We'll be working with the set $\mathbb{P}_n$, real polynomials of degree $\le n$. Such polynomials can be expressed using scalar coefficients as follows:

$$p(x) = a_0 + a_1x + a_2x^2 + \dots + a_nx^n$$
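Since a polynomial in $\mathbb{P}_n$ is fully determined by its coefficients, it's convenient to represent it in code as a coefficient vector $(a_0, a_1, \dots, a_n)$. A minimal sketch (the helper name is my own), using Horner's rule for evaluation:

```python
# Represent p(x) = a_0 + a_1*x + ... + a_n*x^n by its coefficient
# vector [a_0, a_1, ..., a_n], and evaluate it with Horner's rule.

def eval_poly(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

# p(x) = 1 + 2x + 3x^2, so p(2) = 1 + 4 + 12 = 17
print(eval_poly([1, 2, 3], 2))  # 17.0
```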

Vector Space

The set $\mathbb{P}_n$, along with addition of polynomials and scalar multiplication, forms a vector space. As a proof, let's review how the vector space axioms are satisfied. We'll use $p$, $q$, and $r$ for arbitrary polynomials from the set, and $\alpha$ and $\beta$ for arbitrary scalars in $\mathbb{R}$.

Associativity of vector addition: This is trivial because addition of polynomials is associative [1]. Commutativity is similarly trivial, for the same reason:

$$p + q = q + p$$

Identity element of vector addition: The zero polynomial $0$ serves as an identity element. For any $p$, we have $p + 0 = p$.

Inverse element of vector addition: For each $p$, we can use $-p$ as the additive inverse, because $p + (-p) = 0$.

Identity element of scalar multiplication: The scalar $1$ serves as an identity element for scalar multiplication. For each $p$, it's true that $1 \cdot p = p$.

Associativity of scalar multiplication: For any two scalars $\alpha$ and $\beta$:

$$\alpha(\beta p) = (\alpha\beta)p$$

Distributivity of scalar multiplication over vector addition: For any $p$, $q$, and scalar $\alpha$:

$$\alpha(p + q) = \alpha p + \alpha q$$

Distributivity of scalar multiplication over scalar addition: For any scalars $\alpha$ and $\beta$ and polynomial $p$:

$$(\alpha + \beta)p = \alpha p + \beta p$$
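In the coefficient-vector representation, polynomial addition and scalar multiplication are just componentwise operations, so the axioms can be spot-checked numerically. This is an illustration, not a proof; the helper names are my own:

```python
# Polynomial addition and scalar multiplication on coefficient vectors
# (assumes both vectors have the same length, i.e. the same n).

def add(p, q):
    return [a + b for a, b in zip(p, q)]

def scale(alpha, p):
    return [alpha * a for a in p]

p = [1.0, -2.0, 3.0]   # 1 - 2x + 3x^2
q = [0.0, 4.0, -1.0]   # 4x - x^2
alpha, beta = 2.0, -3.0

# Distributivity over vector addition: alpha(p + q) == alpha*p + alpha*q
assert scale(alpha, add(p, q)) == add(scale(alpha, p), scale(alpha, q))
# Distributivity over scalar addition: (alpha + beta)p == alpha*p + beta*p
assert scale(alpha + beta, p) == add(scale(alpha, p), scale(beta, p))
print("both distributivity axioms hold on this example")
```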

Linear Independence, Span and Basis

Since we've shown that polynomials in $\mathbb{P}_n$ form a vector space, we can now build additional linear-algebraic definitions on top of it. A set of polynomials $\{p_1, \dots, p_k\}$ is said to be linearly independent if $\sum_{i=1}^{k} \alpha_i p_i = 0$ implies $\alpha_i = 0$ for all $i$. In words: the only linear combination resulting in the zero vector is the one in which all coefficients are 0.

As an example, let's discuss the fundamental building blocks of polynomials in $\mathbb{P}_n$: the set $\mathcal{M} = \{1, x, x^2, \dots, x^n\}$. These are linearly independent because:

$$\alpha_0 + \alpha_1x + \alpha_2x^2 + \dots + \alpha_nx^n = 0$$

is true only for the zero polynomial, in which all the coefficients $\alpha_i = 0$. This comes from the very definition of polynomials.

Moreover, this set spans the entire $\mathbb{P}_n$ because every polynomial can be (by definition) expressed as a linear combination of the monomials in $\mathcal{M}$.

Since we've shown these basic polynomials are linearly independent and span the entire vector space, they are a basis for the space. In fact, this set has a special name: the monomial basis (because a monomial is a polynomial with a single term).

Checking if an Arbitrary Set of Polynomials is a Basis

Suppose we have some set of polynomials, and we want to know whether it forms a basis for $\mathbb{P}_n$. How do we go about it? The idea is to use linear algebra the same way we do for any other vector space.

Let's use a concrete example to demonstrate: Is the set $\mathcal{B} = \{1, x+1, (x+1)^2, \dots, (x+1)^n\}$ a basis for $\mathbb{P}_n$?

We'll start by checking whether the members of $\mathcal{B}$ are linearly independent. Write:

$$\alpha_0 \cdot 1 + \alpha_1(x+1) + \alpha_2(x+1)^2 + \dots + \alpha_n(x+1)^n = 0$$

By regrouping, we can turn this into:

$$(\alpha_0 + \alpha_1 + \alpha_2 + \dots + \alpha_n) + (\alpha_1 + 2\alpha_2 + \dots + n\alpha_n)x + \dots + \alpha_nx^n = 0$$

For this to be true, the coefficient of each monomial has to be zero; mathematically:

$$\begin{align*} \alpha_0 + \alpha_1 + \alpha_2 + \dots + \alpha_n &= 0 \\ \alpha_1 + 2\alpha_2 + \dots + n\alpha_n &= 0 \\ &\vdots \\ \alpha_n &= 0 \end{align*}$$

In matrix form:

$$\begin{bmatrix} 1 & 1 & 1 & \dots & 1 \\ 0 & 1 & 2 & \dots & n \\ 0 & 0 & 1 & \dots & \binom{n}{2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & 1 \end{bmatrix} \begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

We know how to solve this, by reducing the matrix into row-echelon form. It's easy to see that the reduced row-echelon form of this specific matrix is $\mathcal{I}$, the identity matrix. Therefore, this set of equations has a single solution: $\alpha_0 = \alpha_1 = \dots = \alpha_n = 0$ [2].
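The matrix above can also be built programmatically: its entry in row $i$, column $j$ is $\binom{j}{i}$, the coefficient of $x^i$ in $(x+1)^j$. A quick sketch (helper names are my own) that confirms it's upper triangular with ones on the diagonal, so its determinant is 1 and the homogeneous system has only the trivial solution:

```python
from math import comb

# Column j holds the monomial coefficients of (x+1)^j, so M[i][j] = C(j, i).
# C(j, i) = 0 whenever i > j, hence the matrix is upper triangular.

def basis_matrix(n):
    return [[comb(j, i) for j in range(n + 1)] for i in range(n + 1)]

def det_upper_triangular(M):
    # For an upper-triangular matrix, the determinant is the
    # product of the diagonal entries.
    d = 1
    for i in range(len(M)):
        d *= M[i][i]
    return d

M = basis_matrix(2)
print(M)                          # [[1, 1, 1], [0, 1, 2], [0, 0, 1]]
print(det_upper_triangular(M))    # 1
```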

We've shown that the set $\mathcal{B}$ is linearly independent. Now let's show that it spans the space $\mathbb{P}_n$. Given an arbitrary polynomial $\beta_0 + \beta_1x + \dots + \beta_nx^n$, we want to analyze:

$$\alpha_0 \cdot 1 + \alpha_1(x+1) + \alpha_2(x+1)^2 + \dots + \alpha_n(x+1)^n = \beta_0 + \beta_1x + \beta_2x^2 + \dots + \beta_nx^n$$

and find coefficients $\alpha_i$ that satisfy it. We proceed just as before, by regrouping the left-hand side:

$$(\alpha_0 + \alpha_1 + \dots + \alpha_n) + (\alpha_1 + 2\alpha_2 + \dots + n\alpha_n)x + \dots + \alpha_nx^n = \beta_0 + \beta_1x + \beta_2x^2 + \dots + \beta_nx^n$$

and equating the coefficient of each power of $x$ separately. In matrix form, the matrix of coefficients is exactly the same as before, only now with the $\beta_i$ on the right-hand side. Since that matrix row-reduces to $\mathcal{I}$, the system has a unique solution, which appears on the right-hand side after the reduction. It doesn't matter for the moment what the actual solution is, as long as it exists and is unique.

We've shown that $\mathcal{B}$ spans the space! Since the set is linearly independent and spans $\mathbb{P}_n$, it is a basis for the space.
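Because the matrix is upper triangular with unit diagonal, the coordinates of any polynomial in $\mathcal{B}$ can be found by simple back-substitution. A sketch under the same setup as above (the function name is my own):

```python
from math import comb

# Solve M @ alpha = beta by back-substitution, where M[i][j] = C(j, i)
# is the upper-triangular matrix from the text. The result alpha gives
# the coordinates of beta_0 + beta_1*x + ... + beta_n*x^n in the
# basis {1, x+1, (x+1)^2, ..., (x+1)^n}.

def coords_in_B(beta):
    n = len(beta) - 1
    alpha = [0.0] * (n + 1)
    for i in range(n, -1, -1):
        # Row i reads: sum over j >= i of C(j, i) * alpha_j = beta_i.
        s = sum(comb(j, i) * alpha[j] for j in range(i + 1, n + 1))
        alpha[i] = beta[i] - s   # the diagonal entry C(i, i) is 1
    return alpha

# Example: x^2 = (x+1)^2 - 2(x+1) + 1,
# so beta = [0, 0, 1] should give alpha = [1, -2, 1].
print(coords_in_B([0.0, 0.0, 1.0]))  # [1.0, -2.0, 1.0]
```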

Inner Product

I've discussed inner products for functions in the post about Hilbert space. Well, polynomials are functions, so we can define an inner product using integrals as follows [3]:

$$\langle p, q \rangle = \int_a^b p(x)q(x)w(x)dx$$

Where the bounds $a$ and $b$ are arbitrary, and may be infinite. Whenever we deal with integrals, we should worry about convergence; in my post on Hilbert spaces, we only talked about $L^2$ - the square-integrable functions. No nonzero polynomial is square integrable over the whole real line, however. Therefore, we can ensure convergence in one of two ways:

  • Use a special weight function $w(x)$ that makes the inner product integral converge
  • Set finite bounds on the integral, in which case we can just take $w(x) = 1$

Let's use the latter, restricting the bounds to the interval $[-1, 1]$ and setting $w(x) = 1$. We have the following inner product:

$$\langle p, q \rangle = \int_{-1}^1 p(x)q(x)dx$$
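For polynomials given by coefficient vectors, this inner product can be computed exactly without numerical quadrature: multiply the two polynomials, then use $\int_{-1}^1 x^k\,dx = 2/(k+1)$ for even $k$ and $0$ for odd $k$. A sketch (helper names are my own):

```python
# Exact inner product <p, q> = integral of p(x)q(x) over [-1, 1],
# computed from the monomial coefficients of p and q.

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient vectors."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    # Integral of x^k over [-1, 1] is 2/(k+1) for even k, 0 for odd k.
    prod = poly_mul(p, q)
    return sum(c * 2.0 / (k + 1) for k, c in enumerate(prod) if k % 2 == 0)

# <1, x^2> = integral of x^2 over [-1, 1] = 2/3
print(inner([1.0], [0.0, 0.0, 1.0]))  # 0.6666666666666666
```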

Let's check that this satisfies the inner product space conditions.

Conjugate symmetry: Since real multiplication is commutative, we can write:

$$\langle p, q \rangle = \int_{-1}^1 p(x)q(x)dx = \int_{-1}^1 q(x)p(x)dx = \langle q, p \rangle$$

We deal in the reals here, so we can safely ignore complex conjugation.

Linearity in the first argument: Let $p$, $q$, and $r$ be polynomials, and $\alpha$ a scalar. We want to show that:

$$\langle \alpha p + q, r \rangle = \alpha \langle p, r \rangle + \langle q, r \rangle$$

Expand the left-hand side using our definition of inner product:

$$\int_{-1}^1 (\alpha p(x) + q(x))r(x)dx = \alpha \int_{-1}^1 p(x)r(x)dx + \int_{-1}^1 q(x)r(x)dx$$

The result is equivalent to $\alpha \langle p, r \rangle + \langle q, r \rangle$.

Positive-definiteness: We want to show that for nonzero $p$, we have $\langle p, p \rangle > 0$. First of all, since $p^2(x) \ge 0$ for all $x$, it's true that:

$$\langle p, p \rangle = \int_{-1}^1 p^2(x)dx \ge 0$$

What about the case $\langle p, p \rangle = 0$, though? Suppose $\langle p, p \rangle = 0$. Since $p^2(x)$ is a non-negative function, this means the integral of a non-negative function ends up being 0. But $p^2(x)$ is a polynomial, so it's continuous, and so is $p(x)$. If the integral of a continuous non-negative function is 0, the function itself must be identically 0: had it been nonzero at any point, continuity would make it positive on a small interval around that point, forcing the integral to be positive as well.

We've proven that $\langle p, p \rangle = 0$ only when $p$ is the zero polynomial. The positive-definiteness condition is satisfied.

In conclusion, $\mathbb{P}_n$ along with the inner product we've defined forms an inner product space. In fact, since $\mathbb{P}_n$ is finite-dimensional, it's also a Hilbert space [3].

Orthogonality

Now that we have an inner product, we can define orthogonality on polynomials: two polynomials are orthogonal (w.r.t. our inner product) iff $\langle p, q \rangle = 0$.

Perhaps contrary to expectation [4], the monomial basis polynomials are not all mutually orthogonal under our inner product. Some pairs are orthogonal; for example, for $1$ and $x$:

$$\langle 1, x \rangle = \int_{-1}^1 1 \cdot x\, dx = \left[\frac{x^2}{2}\right]_{-1}^1 = \frac{1}{2} - \frac{1}{2} = 0$$

This vanishes because $x$ is an odd function and the interval is symmetric around 0. Now try $\langle 1, x^2 \rangle$:

$$\langle 1, x^2 \rangle = \int_{-1}^1 x^2 dx = \left[\frac{x^3}{3}\right]_{-1}^1 = \frac{1}{3} - \left(-\frac{1}{3}\right) = \frac{2}{3} \ne 0$$

So $1$ and $x^2$ are not orthogonal. There are other sets of polynomials that are orthogonal using our inner product. For example, the Legendre polynomials; but this is a topic for another post.
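As a quick check of these claims, and a small teaser for that future post, we can compute the inner products above numerically and verify that the Legendre polynomial $P_2(x) = (3x^2 - 1)/2$ is orthogonal to both $1$ and $x$. A self-contained sketch (the `inner` helper is my own, using the exact even/odd-power integral):

```python
from math import isclose

# <p, q> over [-1, 1] from monomial coefficients:
# integral of x^k is 2/(k+1) for even k, 0 for odd k.

def inner(p, q):
    prod = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += a * b
    return sum(c * 2.0 / (k + 1) for k, c in enumerate(prod) if k % 2 == 0)

one, x, x2 = [1.0], [0.0, 1.0], [0.0, 0.0, 1.0]
P2 = [-0.5, 0.0, 1.5]          # Legendre P_2(x) = (3x^2 - 1)/2

print(inner(one, x))            # 0.0 -- 1 and x are orthogonal
print(inner(one, x2))           # 0.666... -- 1 and x^2 are not
assert isclose(inner(one, P2), 0.0, abs_tol=1e-12)   # P_2 orthogonal to 1
assert isclose(inner(x, P2), 0.0, abs_tol=1e-12)     # P_2 orthogonal to x
```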

[1] There's a level of basic algebra below which we won't descend in these notes. We could break this statement further down by saying that something like $a_0 + a_1x$ can be added to $b_0 + b_1x$ by adding each power of $x$ separately for any $a_i$ and $b_i$, but let's just take it for granted.

[2] Obviously, this specific set of equations is quite trivial to solve without matrices; I just want to demonstrate the more general approach. Once we have a system of linear equations, the whole toolbox of linear algebra is at our disposal. For example, we could also have checked the determinant and seen it's non-zero, which means the (square) matrix is invertible and thus the homogeneous system has only the all-zeroes solution.

[3] And actually with this (or any valid) inner product, indeed $\mathbb{P}_n$ forms a Hilbert space, because it's finite-dimensional, and every finite-dimensional inner product space is complete.

[4] Because of how naturally the monomial basis spans $\mathbb{P}_n$. And indeed, we can define alternative inner products using which monomials are orthogonal.

For comments, please send me an email.
