Essential Matrix Algebra for Data Science and Quantitative Finance:

Eigenvalues, Eigenvectors and Matrix Diagonalization

Posted by Jim Range on April 21, 2023

Introduction

In the world of data science and quantitative finance, the landscape is constantly shifting and evolving, demanding that professionals be well-versed in the bedrock of mathematics to truly excel. We will explore fascinating mathematical concepts such as:

  • Eigenvalues and Eigenvectors
  • The Eigenvalue Problem
  • Matrix Diagonalization
  • Powers of a Matrix
  • Matrix Exponential

I created this blog post while studying the MITx MicroMasters Program in Finance. I return to this blog post from time to time to keep these topics fresh in my mind. This post is not meant to be a complete "course" on the topic. It is more in the form of my notes that I (and you) can review to stay sharp on these topics, or as a guide for further study.

If you are interested in taking a course on these topics, I recommend:

Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors are fundamental concepts in linear algebra and have numerous applications in various fields, including data science, quantitative finance, quantum mechanics, and control systems. They help us understand the underlying structure of a linear transformation, and they play a crucial role in tasks such as dimensionality reduction, spectral clustering, and stability analysis.

The Eigenvalue Problem

Given a square matrix A of size nxn, the eigenvalue problem is to find a scalar λ (the eigenvalue) and a nonzero vector x (the eigenvector) such that:

\[Ax = \lambda x\]

In other words, the action of the matrix A on the eigenvector x only stretches or compresses it by a factor of λ. Note that the zero vector is excluded from being an eigenvector because it would trivially satisfy the equation for any λ.

To find the eigenvalues and eigenvectors, we can rewrite the eigenvalue problem as follows by inserting the identity matrix \(\mathbb{I}\) of size nxn:

\[Ax = \lambda \mathbb{I} x \]

And then rearrange the equation to:

\[(A - \lambda \mathbb{I})x = 0\]

For a nontrivial solution (x ≠ 0), the matrix (A - λ \(\mathbb{I}\)) must be singular, which means its determinant must be zero:

\[det(A - \lambda \mathbb{I}) = 0\]

This equation is known as the characteristic equation of the matrix A. Solving it yields the eigenvalues λ. Once the eigenvalues are found, we can substitute each one back into the equation (A - λ\(\mathbb{I}\))x = 0 to find the corresponding eigenvectors.

Finding Eigenvalues and Eigenvectors

To find the eigenvalues and eigenvectors of a square matrix A, follow these steps:

  1. Form the characteristic equation by setting the determinant of (A - λ\(\mathbb{I}\)) equal to zero: \(det(A - \lambda \mathbb{I}) = 0\).
  2. Solve the characteristic equation for λ to find the eigenvalues.
  3. For each eigenvalue, substitute it back into the equation (A - λ\(\mathbb{I}\))x = 0 and solve for the eigenvector x.

Note that the eigenvectors are not unique; any nonzero scalar multiple of an eigenvector is also an eigenvector. Therefore, eigenvectors are usually normalized to have unit length for consistency.
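In practice these steps are rarely carried out by hand. A minimal sketch using NumPy's `numpy.linalg.eig` (assuming NumPy is available; the matrix is chosen for illustration):

```python
import numpy as np

# A small symmetric matrix whose eigenpairs are easy to check by hand.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns the eigenvalues and unit-norm eigenvectors (as columns).
eigenvalues, eigenvectors = np.linalg.eig(A)

# Verify the defining property A x = lambda x for each eigenpair.
for lam, x in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ x, lam * x)

print(sorted(eigenvalues))  # approximately [1.0, 3.0]
```

Note that `eig` returns eigenvectors already normalized to unit length, matching the convention described above.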

Characteristic Equation for a 2x2 Matrix

Let's consider a 2x2 matrix A:

\[A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\]

The characteristic equation for this matrix can be derived by computing the determinant of (A - λ\(\mathbb{I}\)):

\[\begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{vmatrix} = 0\]

\[= (a_{11} - \lambda)(a_{22} - \lambda) - a_{12}a_{21} = \lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{12}a_{21}) = 0\]

\[= \lambda^2 - \text{tr}(A)\,\lambda + \det(A) = 0\]

where tr(A) denotes the trace of matrix A, which is the sum of its diagonal elements. Solving this quadratic equation yields the eigenvalues λ of the 2x2 matrix A. Once the eigenvalues are found, we can substitute each one back into the equation (A - λ\(\mathbb{I}\))x = 0 to find the corresponding eigenvectors.

Solve the quadratic equation for λ to find the eigenvalues:

\[\lambda_{1,2} = \frac{(a_{11} + a_{22}) \pm \sqrt{(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}a_{21})}}{2}\]

For each eigenvalue, substitute it back into the equation (A - λ\(\mathbb{I}\))x = 0 and solve for the eigenvector x.
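The closed-form quadratic solution above translates directly to code. A minimal sketch in pure Python, assuming real eigenvalues (a nonnegative discriminant); the helper name `eigenvalues_2x2` is illustrative:

```python
import math

def eigenvalues_2x2(a11, a12, a21, a22):
    """Eigenvalues of a 2x2 matrix via the quadratic formula,
    assuming the discriminant is nonnegative (real eigenvalues)."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace * trace - 4.0 * det
    root = math.sqrt(disc)
    return (trace + root) / 2.0, (trace - root) / 2.0

# The matrix [[4, 1], [1, 4]] used in the next section has eigenvalues 5 and 3.
lam1, lam2 = eigenvalues_2x2(4.0, 1.0, 1.0, 4.0)
print(lam1, lam2)  # 5.0 3.0
```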

Eigenvalues and Eigenvectors of a 2x2 Matrix

Let's consider a 2x2 matrix A:

\[A = \begin{bmatrix} 4 & 1 \\ 1 & 4 \end{bmatrix}\]

Setup the eigenvalue problem:

\[\begin{vmatrix} 4-\lambda & 1 \\ 1 & 4-\lambda \end{vmatrix} = 0\]

Expand the determinant and solve for lambda:

\[(4-\lambda)(4-\lambda) - 1 = 0 \] \[\lambda^2 - 8\lambda + 15 = 0 \] \[(\lambda-3)(\lambda-5)= 0 \] \[\{\lambda = 3, \lambda=5\}\]

We will label the eigenvector components with a double subscript \(x_{ij}\), where \(i\) indexes the eigenvalue the eigenvector corresponds to and \(j\) indexes the component within that eigenvector.

Substitute each \(\lambda\) back into the equation (A - λ\(\mathbb{I}\))x = 0 and solve for the eigenvector x. First we will use \(\lambda=3\):

\[(A - \lambda \mathbb{I})x = 0\]

\[ \left(\begin{bmatrix}4&1\\1&4\end{bmatrix} - \begin{bmatrix}3&0\\0&3\end{bmatrix}\right) \begin{bmatrix}x_{11}\\x_{12}\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix} \] \[ \begin{bmatrix}1&1\\1&1\end{bmatrix} \begin{bmatrix}x_{11}\\x_{12}\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} \] \[ x_{11} = - x_{12} \]

Now we will solve for \(\lambda=5\):

\[(A - \lambda \mathbb{I})x = 0\]

\[ \left(\begin{bmatrix}4&1\\1&4\end{bmatrix} - \begin{bmatrix}5&0\\0&5\end{bmatrix}\right) \begin{bmatrix}x_{21}\\x_{22}\end{bmatrix}=\begin{bmatrix}0\\0\end{bmatrix} \] \[ \begin{bmatrix}-1&1\\1&-1\end{bmatrix} \begin{bmatrix}x_{21}\\x_{22}\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} \] \[ x_{21} = x_{22} \]

Note that scaling \(x\) by any nonzero constant has no effect on the equation below:

\[Ax = \lambda \mathbb{I} x \]

So we are free to scale \(x\) by a constant, or to normalize its values. For each eigenvalue we then have the corresponding eigenvector:

\[ \lambda_1 = 3 : x_1 = \begin{bmatrix}-1\\1\end{bmatrix} \] \[ \lambda_2 = 5 : x_2 = \begin{bmatrix}1\\1\end{bmatrix} \]

Or we can normalize the vectors to:

\[ \lambda_1 = 3 : x_1 = \frac{1}{\sqrt{2}}\begin{bmatrix}-1\\1\end{bmatrix} \] \[ \lambda_2 = 5 : x_2 = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\1\end{bmatrix} \]
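The hand computation above is easy to cross-check numerically. A sketch using NumPy's `numpy.linalg.eigh`, which is specialized for symmetric matrices (an assumed dependency):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 4.0]])

# eigh returns eigenvalues in ascending order with orthonormal
# eigenvectors as the columns of V.
w, V = np.linalg.eigh(A)
print(w)  # [3. 5.]

# The columns of V match the normalized eigenvectors found above,
# up to an overall sign (eigenvectors are only unique up to scale).
x1 = np.array([-1.0, 1.0]) / np.sqrt(2.0)
x2 = np.array([1.0, 1.0]) / np.sqrt(2.0)
assert np.isclose(abs(V[:, 0] @ x1), 1.0)
assert np.isclose(abs(V[:, 1] @ x2), 1.0)
```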

Complex Eigenvalues

In some cases, when solving the characteristic equation of a matrix A, the resulting eigenvalues may be complex. Complex eigenvalues are often encountered in systems with oscillatory or rotational behavior, such as damped harmonic oscillators or rotating systems.

When a matrix A has complex eigenvalues, its eigenvectors will also generally be complex. However, if the matrix A is real and symmetric (i.e., \(A = A^T\)), then all its eigenvalues are guaranteed to be real, and its eigenvectors can be chosen to be real as well.

Let's consider an example of a 2x2 matrix with complex eigenvalues:

\[A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\]

Form the characteristic equation:

\[det\begin{bmatrix} -\lambda & -1 \\ 1 & -\lambda \end{bmatrix} = \lambda^2 + 1 = 0\]

Solve the quadratic equation for λ to find the complex eigenvalues:

\[\lambda_{1,2} = \pm i\]

For each complex eigenvalue, substitute it back into the equation (A - λ\(\mathbb{I}\))x = 0 and solve for the complex eigenvector x. For example, with λ = i:

\[\begin{bmatrix} -i & -1 \\ 1 & -i \end{bmatrix}x = 0\]

An eigenvector corresponding to λ = i is \(x = \begin{bmatrix} 1 \\ -i \end{bmatrix}\).
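This rotation matrix's complex eigenpairs can be confirmed numerically as well; a sketch, again assuming NumPy:

```python
import numpy as np

# A 90-degree rotation matrix: no real direction is left unchanged,
# so its eigenvalues are purely imaginary.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

w, V = np.linalg.eig(A)
print(np.sort_complex(w))  # the eigenvalues are +/- i

# Check the defining equation A x = lambda x for each complex eigenpair.
for lam, x in zip(w, V.T):
    assert np.allclose(A @ x, lam * x)
```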

Complex eigenvalues and eigenvectors play an important role in understanding the behavior of many systems in physics and engineering, as well as in advanced topics in linear algebra, such as the Jordan canonical form and matrix exponentiation.

Matrix Diagonalization

Matrix diagonalization is the process of finding, for a given square matrix A, a diagonal matrix D (similar to A) and an invertible matrix S such that:

\[A = SDS^{-1}\]

Matrix diagonalization is a technique of factoring a square matrix that simplifies many computations involving matrices, as diagonal matrices are easier to work with (e.g., raising a diagonal matrix to a power or computing its exponential). Diagonalization is closely related to eigenvalues and eigenvectors, and it is possible if and only if A has n linearly independent eigenvectors, which then form a basis for \(\mathbb{R}^n\) (or \(\mathbb{C}^n\) in the complex case).

\[ D = S^{-1}AS = \begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix} \]

Linearly Independent Eigenvectors

For a matrix to be diagonalizable, it must have n linearly independent eigenvectors (where n is the size of the square matrix). When this condition is satisfied, we can form a matrix \(S\) using the linearly independent eigenvectors as its columns:

\[S = \begin{bmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{bmatrix}\]

Invertibility of the Eigenvector Matrix

When a matrix A has n linearly independent eigenvectors, the matrix \(S\) formed by arranging these eigenvectors in columns is guaranteed to be invertible. With an invertible \(S\), we can find the diagonal matrix D by multiplying both sides of the equation \(A = SDS^{-1}\) by \(S\) and \(S^{-1}\):

\[D = S^{-1}AS\]

The diagonal matrix D will have the eigenvalues of A on its diagonal. The order of the eigenvalues in D should match the order of the corresponding eigenvectors in \(S\).

Example: Matrix Diagonalization of a 3x3 Matrix

Let's consider a 3x3 matrix A:

\[A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}\]

Assume we have found the eigenvalues λ₁ = 1, λ₂ = 1, and λ₃ = 4 (the eigenvalue 1 has multiplicity two), and the corresponding eigenvectors \(v_1 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}\), \(v_2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}\), and \(v_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\).

Form the matrix S using the eigenvectors as columns:

\[S = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 1 \\ -2 & 0 & 1 \end{bmatrix}\]

Compute the inverse of S:

\[S^{-1} = \frac{1}{6}\begin{bmatrix} 1 & 1 & -2 \\ 3 & -3 & 0 \\ 2 & 2 & 2 \end{bmatrix}\]

Form the diagonal matrix \(D\) with the eigenvalues on its diagonal. The position of each eigenvalue along the diagonal must match the column of its eigenvector in \(S\):

\[D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{bmatrix} \]

Verify that A = \(SDS^{-1}\):

\[SDS^{-1} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 1 \\ -2 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{bmatrix}\frac{1}{6}\begin{bmatrix} 1 & 1 & -2 \\ 3 & -3 & 0 \\ 2 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}\]

As shown, the matrix \(A\) is diagonalizable, with the diagonal matrix \(D\) containing its eigenvalues and the invertible matrix \(S\) consisting of its eigenvectors as columns.
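The verification above is easy to reproduce numerically. A sketch with NumPy, using the same \(S\) and \(D\) as in the worked example:

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

# Eigenvectors as columns, in the same order as the eigenvalues in D.
S = np.array([[ 1.0,  1.0, 1.0],
              [ 1.0, -1.0, 1.0],
              [-2.0,  0.0, 1.0]])
D = np.diag([1.0, 1.0, 4.0])

S_inv = np.linalg.inv(S)

# A = S D S^{-1} holds, confirming the diagonalization...
assert np.allclose(S @ D @ S_inv, A)

# ...and conversely D = S^{-1} A S.
assert np.allclose(S_inv @ A @ S, D)
print("diagonalization verified")
```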

Powers of a Matrix

Matrix powers are an essential concept in linear algebra that can help simplify complex calculations involving matrices. Given a square matrix A, the n-th power of A (denoted \(A^n\)) is the product of n copies of A. Here's the general formula for matrix powers:

\[ A^n = \underbrace{A \times A \times \cdots \times A}_{n \text{ times}} \]

Matrix powers have several applications in data science and quantitative finance, such as:

  • Markov chains and state transition matrices
  • Computing the variance-covariance matrix in finance
  • Matrix factorization techniques in machine learning

Example: Powers of a 2x2 Matrix

Let's consider the following 2x2 matrix A:

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

Here's how we can compute \(A^2\):

\[ A^2 = A \times A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a^2 + bc & ab + bd \\ ac + cd & bc + d^2 \end{bmatrix} \]

Power Calculation Shortcut

A diagonal matrix raised to a power equals the matrix in which each diagonal element is raised to that power.

\[ B^2 = B \times B = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} = \begin{bmatrix} a^2 & 0 \\ 0 & b^2 \end{bmatrix} \]

It can be proven that:

\[ B^n = B \times \dots \times B = \begin{bmatrix} a^n & 0 \\ 0 & b^n \end{bmatrix} \]

If a matrix can be factored into a diagonal matrix as discussed above, then it is much easier to calculate powers of that matrix.

\[ A^2 = (SDS^{-1})^2 = SDS^{-1}SDS^{-1} = SD\mathbb{I}DS^{-1} = SD^2S^{-1} \]

Through induction it can also be proven that:

\[ A^n = (SDS^{-1})^n = SD^nS^{-1} \]

So the calculation of \(A^n\) reduces to factoring the matrix into \(SDS^{-1}\), raising each element of the diagonal matrix \(D\) to the power of \(n\), and then multiplying those three matrices together:

\[ A^n = SD^nS^{-1} \]
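This shortcut is straightforward to check in code. A sketch comparing it against direct repeated multiplication, assuming NumPy:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 4.0]])

# Diagonalize A: columns of S are eigenvectors, w holds the eigenvalues.
w, S = np.linalg.eig(A)

n = 10
# Shortcut: raise only the diagonal entries (eigenvalues) to the n-th power.
A_n_fast = S @ np.diag(w ** n) @ np.linalg.inv(S)

# Reference: direct repeated multiplication.
A_n_direct = np.linalg.matrix_power(A, n)

assert np.allclose(A_n_fast, A_n_direct)
print(A_n_direct[0, 0])  # equals (5**10 + 3**10) / 2
```

For large n the diagonalized form does O(1) matrix multiplications plus n scalar powers, instead of repeated full matrix products.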

Matrix Exponential

The matrix exponential is an extension of the exponential function to matrices. Given a square matrix A, the matrix exponential (denoted \(e^A\)) is defined by a convergent infinite series:

\[ e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \]

Here, I is the identity matrix of the same size as A. The matrix exponential satisfies several important properties:

  • \(e^0 = I\)
  • \(e^Ae^B = e^{A+B} \quad\text{(if } AB = BA \text{)}\)
  • \(e^{tA}e^{sA} = e^{(t+s)A}\)

The matrix exponential can be used to solve certain types of differential equations involving matrices. For example, the solution to the linear differential equation x'(t) = Ax(t), where x(t) is a vector-valued function of t, can be written as:

\[x(t) = e^{At}x(0)\]

where x(0) is the initial condition of the system.

Computing the matrix exponential can be challenging in general, but for some special types of matrices (e.g., diagonalizable matrices), it can be simplified using eigenvalues and eigenvectors.

The matrix exponential has numerous applications, including:

  • Solving systems of linear differential equations
  • Network analysis and graph theory
  • Quantitative finance for modeling interest rates and option pricing

Example: Matrix Exponential of a 2x2 Matrix

Let's consider the same 2x2 matrix A as before:

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

Computing the matrix exponential \(e^A\) involves calculating the infinite series:

\[ e^A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots \]

For a 2x2 matrix, we can approximate the matrix exponential by taking the first few terms of the series. Let's compute the first three terms:

\[ I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad , \quad A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad , \quad \frac{A^2}{2!} = \frac{1}{2} \begin{bmatrix} a^2 + bc & ab + bd \\ ac + cd & bc + d^2 \end{bmatrix} \]

Summing these terms yields an approximation for \(e^A\):

\[ e^A \approx \begin{bmatrix} 1 + a + \frac{a^2 + bc}{2} & b + \frac{ab + bd}{2} \\ c + \frac{ac + cd}{2} & 1 + d + \frac{bc + d^2}{2} \end{bmatrix} \]

Although this is only an approximation, it demonstrates the process of calculating the matrix exponential for a 2x2 matrix. In practice, you would use specialized numerical algorithms to compute the matrix exponential more accurately and efficiently.
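The truncated-series idea can be sketched in a few lines. A simple, not production-grade implementation assuming NumPy (the helper name `expm_series` is illustrative; real libraries use scaling-and-squaring instead):

```python
import numpy as np

def expm_series(A, terms=20):
    """Approximate e^A by summing the first `terms` terms of its power series.
    Adequate for small, well-scaled matrices only."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n          # builds A^n / n! incrementally
        result = result + term
    return result

# A nilpotent matrix: A @ A = 0, so the series terminates after two terms
# and e^A = I + A exactly.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

assert np.allclose(expm_series(A), np.eye(2) + A)
print(expm_series(A))
```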

Matrix exponential can help you tackle complex problems and develop more sophisticated mathematical models.

Calculating the Exponential of a 2x2 Diagonalized Matrix

Just as raising a diagonal matrix to a power has a shortcut, so does taking the exponential of a diagonal matrix.

Matrix exponentiation is an important operation in various fields, including data science and quantitative finance. In this section, we will discuss how to calculate the exponential of a 2x2 matrix, given that the matrix has already been diagonalized. Diagonalization is the process of transforming a matrix into a diagonal matrix by using a similarity transformation.

Let's assume we have a diagonalized 2x2 matrix \(D\):

\[ D = \begin{bmatrix}\lambda_1&0\\0&\lambda_2\end{bmatrix} \]

where \(\lambda_1\) and \(\lambda_2\) are the eigenvalues of the original matrix A.

The exponential of the diagonalized matrix D, denoted as \(e^D\), can be computed directly:

\[ e^{D} = \begin{bmatrix}e^{\lambda_1}&0\\0&e^{\lambda_2}\end{bmatrix} \]

Since the original matrix A was diagonalized using a similarity transformation, we can express \(A = SDS^{-1}\), where \(S\) is the matrix formed by the eigenvectors of \(A\), and \(S^{-1}\) is the inverse of \(S\). To find the exponential of \(A\), denoted as \(e^A\), we use the following equation:

\[e^{A} = Se^DS^{-1}\]

Now, simply multiply the matrices \(S\), \(e^D\), and \(S^{-1}\) to obtain the exponential of the original matrix \(A\), \(e^A\).
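Putting the pieces together, here is a sketch that computes \(e^A\) through the eigendecomposition and checks it against a truncated power series (NumPy assumed):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 4.0]])

# For this symmetric matrix, eigh gives orthonormal eigenvectors,
# so S is orthogonal and S^{-1} = S.T.
w, S = np.linalg.eigh(A)

# e^A = S e^D S^{-1}: exponentiate only the eigenvalues.
expA = S @ np.diag(np.exp(w)) @ S.T

# Independent check: sum enough terms of the defining power series.
series = np.zeros_like(A)
term = np.eye(2)
for n in range(40):
    series = series + term       # adds A^n / n!
    term = term @ A / (n + 1)

assert np.allclose(expA, series)
print(expA[0, 0])  # equals (e^5 + e^3) / 2
```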

Conclusion: Embracing the Power of Math in Data Science and Quantitative Finance

As we conclude this blog post, we have examined the world of key mathematical concepts that form the cornerstone of data science and quantitative finance. We have explored:

  • Eigenvalues and Eigenvectors
  • The Eigenvalue Problem
  • Matrix Diagonalization
  • Powers of a Matrix
  • Matrix Exponential

By immersing yourself in these quantitative math concepts, you have now equipped yourself with the essential knowledge required to navigate the ever-evolving landscapes of data science and quantitative finance. This newfound understanding will not only sharpen your intuition about the potential and constraints of various tools and methods but also help you tackle complex challenges, design innovative solutions, and stand out in the competitive landscape of your field.

Remember that mastery is not an overnight get-it-done activity. Stay curious, keep learning, and embrace the opportunities that come with being a well-rounded professional in these exciting and dynamic fields. As you continue to grow and excel, you'll be able to harness the power of quantitative mathematics to drive progress and innovation in data science and quantitative finance.
