mscroggs.co.uk

Blog

 2020-02-06 
This is the third post in a series of posts about matrix methods.
Yet again, we want to solve \(\mathbf{A}\mathbf{x}=\mathbf{b}\), where \(\mathbf{A}\) is a (known) matrix, \(\mathbf{b}\) is a (known) vector, and \(\mathbf{x}\) is an unknown vector.
In the previous post in this series, we used Gaussian elimination to invert a matrix. You may, however, have been taught an alternative method for calculating the inverse of a matrix. This method has four steps:
  1. Find the determinants of smaller blocks of the matrix to find the "matrix of minors".
  2. Multiply some of the entries by -1 to get the "matrix of cofactors".
  3. Transpose the matrix.
  4. Divide by the determinant of the matrix you started with.

An example

As an example, we will find the inverse of the following matrix.
$$\begin{pmatrix} 1&-2&4\\ -2&3&-2\\ -2&2&2 \end{pmatrix}.$$
The result of the four steps above is the calculation
$$\frac1{\det\begin{pmatrix} 1&-2&4\\ -2&3&-2\\ -2&2&2 \end{pmatrix} }\begin{pmatrix} \det\begin{pmatrix}3&-2\\2&2\end{pmatrix}& -\det\begin{pmatrix}-2&4\\2&2\end{pmatrix}& \det\begin{pmatrix}-2&4\\3&-2\end{pmatrix}\\ -\det\begin{pmatrix}-2&-2\\-2&2\end{pmatrix}& \det\begin{pmatrix}1&4\\-2&2\end{pmatrix}& -\det\begin{pmatrix}1&4\\-2&-2\end{pmatrix}\\ \det\begin{pmatrix}-2&3\\-2&2\end{pmatrix}& -\det\begin{pmatrix}1&-2\\-2&2\end{pmatrix}& \det\begin{pmatrix}1&-2\\-2&3\end{pmatrix} \end{pmatrix}.$$
Calculating the determinants gives $$\frac12 \begin{pmatrix} 10&12&-8\\ 8&10&-6\\ 2&2&-1 \end{pmatrix},$$ which simplifies to
$$ \begin{pmatrix} 5&6&-4\\ 4&5&-3\\ 1&1&-\tfrac12 \end{pmatrix}.$$
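The four steps can be sketched in a few lines of Python. This is only an illustration: the function names `det` and `inverse_by_cofactors` are made up for this example, exact fractions are used to avoid rounding, and the sketch assumes the matrix is at least 2×2 and has a non-zero determinant.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def inverse_by_cofactors(m):
    """Invert a square matrix using the four-step method described above."""
    n = len(m)
    # Steps 1 and 2: the matrix of minors, with the checkerboard of signs applied
    cofactors = [
        [
            (-1) ** (i + j)
            * det([row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i])
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Step 3: transpose
    transposed = [[cofactors[j][i] for j in range(n)] for i in range(n)]
    # Step 4: divide by the determinant of the original matrix
    d = det(m)
    return [[Fraction(entry, d) for entry in row] for row in transposed]


A = [[1, -2, 4], [-2, 3, -2], [-2, 2, 2]]
A_inv = inverse_by_cofactors(A)
for row in A_inv:
    print([str(entry) for entry in row])
```

The printed rows should match the matrix found above.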

How many operations

This method can be used to find the inverse of a matrix of any size. Using this method on an \(n\times n\) matrix will require:
  1. Finding the determinant of \(n^2\) different \((n-1)\times(n-1)\) matrices.
  2. Multiplying \(\left\lfloor\tfrac{n^2}2\right\rfloor\) of these determinants by -1.
  3. Calculating the determinant of an \(n\times n\) matrix.
  4. Dividing \(n^2\) numbers by this determinant.
If \(d_n\) is the number of operations needed to find the determinant of an \(n\times n\) matrix, the total number of operations for this method is
$$n^2d_{n-1} + \left\lfloor\tfrac{n^2}2\right\rfloor + d_n + n^2.$$

How many operations to find a determinant

If you work through the usual method of calculating the determinant by calculating determinants of smaller blocks then combining them, you can work out that the number of operations needed to calculate a determinant in this way is \(\mathcal{O}(n!)\). For large values of \(n\), this is significantly larger than any power of \(n\).
There are other methods of calculating determinants: the fastest known of these is \(\mathcal{O}(n^{2.373})\). For large \(n\), this is significantly smaller than \(\mathcal{O}(n!)\).
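To get a feel for the \(\mathcal{O}(n!)\) growth, here is a rough sketch that counts the operations used by the expand-along-the-first-row method. The counting convention is an assumption made for this example: each multiplication of an entry by a smaller determinant counts as one operation, as does each addition or subtraction used to combine the terms.

```python
from math import factorial


def det_laplace(m, counter):
    """Determinant by first-row expansion; counter[0] tallies the +, - and * used."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        term = m[0][j] * det_laplace(minor, counter)
        counter[0] += 1  # one multiplication
        if j == 0:
            total = term
        else:
            # Signs alternate along the row: subtract for odd j, add for even j
            total = total - term if j % 2 else total + term
            counter[0] += 1  # one addition or subtraction
    return total


def count_ops(n):
    """Number of operations used on an n by n matrix (the identity is used here)."""
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    counter = [0]
    det_laplace(identity, counter)
    return counter[0]


counter = [0]
A = [[1, -2, 4], [-2, 3, -2], [-2, 2, 2]]
d = det_laplace(A, counter)
print("determinant:", d, "operations:", counter[0])

for n in range(2, 8):
    print(n, count_ops(n), factorial(n))
```

With this way of counting, the number of operations stays above \(n!\) for every size printed.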

How many operations

Even if the quick \(\mathcal{O}(n^{2.373})\) method for calculating determinants is used, the number of operations required to invert a matrix will be of the order of
$$n^2(n-1)^{2.373} + \left\lfloor\tfrac{n^2}2\right\rfloor + n^{2.373} + n^2.$$
This is \(\mathcal{O}(n^{4.373})\), and so for large matrices this will be slower than Gaussian elimination, which was \(\mathcal{O}(n^3)\).
In fact, this method could only be faster than Gaussian elimination if you discovered a method of finding a determinant faster than \(\mathcal{O}(n)\). This seems highly unlikely to be possible, as an \(n\times n\) matrix has \(n^2\) entries and we should expect to operate on each of these at least once.
So, for large matrices, Gaussian elimination looks like it will always be faster, so you can safely forget this four-step method.
 2020-02-04 
This is the second post in a series of posts about my PhD thesis.
During my PhD, I spent a lot of time working on the open source boundary element method Python library Bempp. The second chapter of my thesis looks in more detail at this software, and at some of the work we did to improve its performance and to make solving problems with it simpler.

Discrete spaces

We begin by looking at the definitions of the discrete function spaces that we will use when performing discretisation. Imagine that the boundary of our region has been split into a mesh of triangles. (The pictures in this post show a flat mesh of triangles, although in reality this mesh will usually be curved.)
We define the discrete spaces by defining a basis function of the space. The discrete space will have one of these basis functions for each triangle, for each edge, or for each vertex (or a combination of these) and the space is defined to contain all the sums of multiples of these basis functions.
The first space we define is DP0 (discontinuous polynomials of degree 0). A basis function of this space has the value 1 inside one triangle, and has the value 0 elsewhere; it looks like this:
Next we define the P1 (continuous polynomials of degree 1) space. A basis function of this space has the value 1 at one vertex in the mesh, 0 at every other vertex, and is linear inside each triangle; it looks like this:
Higher degree polynomial spaces can be defined, but we do not use them here.
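As a small illustration of these definitions, here is a sketch of the DP0 and P1 basis functions restricted to a single triangle. It assumes the reference triangle with vertices (0, 0), (1, 0) and (0, 1); the function names are made up for this example and are not Bempp's API.

```python
def dp0_basis(cell, own_cell):
    """DP0 basis function attached to own_cell: 1 inside that triangle, 0 elsewhere."""
    return 1 if cell == own_cell else 0


def p1_basis(x, y):
    """Values at (x, y) of the three P1 basis functions on the reference triangle.

    Each function is linear, equals 1 at one vertex and 0 at the other two.
    """
    return (1 - x - y, x, y)


vertices = [(0, 0), (1, 0), (0, 1)]
for v in vertices:
    print(v, p1_basis(*v))

# At any point of the triangle the three P1 functions add up to 1
print(sum(p1_basis(0.25, 0.5)))
```

On a full mesh, the P1 basis function for a vertex is built by gluing together pieces like these on every triangle that touches that vertex.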
For Maxwell's equations, we need different basis functions, as the unknowns are vector functions. The two most commonly used spaces are RT (Raviart–Thomas) and NC (Nédélec) spaces. Example basis functions of these spaces look like this:
RT (left) and NC (right) basis functions.

Preconditioning

Suppose we are trying to solve \(\mathbf{A}\mathbf{x}=\mathbf{b}\), where \(\mathbf{A}\) is a matrix, \(\mathbf{b}\) is a (known) vector, and \(\mathbf{x}\) is the vector we are trying to find. When \(\mathbf{A}\) is a very large matrix, it is common to only solve this approximately, and many methods are known that can achieve good approximations of the solution. To get a good idea of how quickly these methods will work, we can calculate the condition number of the matrix: the condition number is a value that is big when the matrix will be slow to solve (we call the matrix ill-conditioned); and is small when the matrix will be fast to solve (we call the matrix well-conditioned).
The matrices we get when using the boundary element method are often ill-conditioned. To speed up the solving process, it is common to use preconditioning: instead of solving \(\mathbf{A}\mathbf{x}=\mathbf{b}\), we can instead pick a matrix \(\mathbf{P}\) and solve $$\mathbf{P}\mathbf{A}\mathbf{x}=\mathbf{P}\mathbf{b}.$$ If we choose the matrix \(\mathbf{P}\) carefully, we can obtain a matrix \(\mathbf{P}\mathbf{A}\) that has a lower condition number than \(\mathbf{A}\), so this new system could be quicker to solve.
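Here is a toy example of the idea. Everything in it is an assumption chosen for illustration: the 2×2 matrix is made up, and the preconditioner is a simple diagonal scaling rather than anything specific to the boundary element method. The helper `cond_2x2` computes the 2-norm condition number (the ratio of the largest to the smallest singular value) using a closed formula that only works for 2×2 matrices.

```python
from math import sqrt


def cond_2x2(m):
    """2-norm condition number of an invertible 2x2 matrix."""
    (a, b), (c, d) = m
    s = a * a + b * b + c * c + d * d  # sum of the squared singular values
    det = abs(a * d - b * c)           # product of the singular values
    sigma_max_sq = (s + sqrt(s * s - 4 * det * det)) / 2
    # sigma_max / sigma_min = sigma_max^2 / (sigma_max * sigma_min)
    return sigma_max_sq / det


def matmul_2x2(p, m):
    """Product of two 2x2 matrices."""
    return [
        [sum(p[i][k] * m[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


A = [[1, 2], [3000, 4000]]   # the second row is badly scaled
P = [[1, 0], [0, 0.001]]     # diagonal preconditioner that rescales that row
PA = matmul_2x2(P, A)

cond_A = cond_2x2(A)
cond_PA = cond_2x2(PA)
print("condition number of A: ", cond_A)
print("condition number of PA:", cond_PA)
```

In this example the condition number drops from over ten thousand to about 15.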
When using the boundary element method, it is common to use properties of the Calderón projector to work out some good preconditioners. For example, the single layer operator \(\mathsf{V}\) when discretised is often ill-conditioned, but the product of it and the hypersingular operator \(\mathsf{W}\mathsf{V}\) is often better conditioned. This type of preconditioning is called operator preconditioning or Calderón preconditioning.
If the product \(\mathsf{W}\mathsf{V}\) is discretised, the result is $$\mathbf{W}\mathbf{M}^{-1}\mathbf{V},$$ where \(\mathbf{W}\) and \(\mathbf{V}\) are discretisations of \(\mathsf{W}\) and \(\mathsf{V}\), and \(\mathbf{M}\) is a matrix called the mass matrix that depends on the discretisation spaces used to discretise \(\mathsf{W}\) and \(\mathsf{V}\).
In our software Bempp, the mass matrices \(\mathbf{M}\) are automatically included in products like this, which makes using preconditioning like this easier to program.
As an alternative to operator preconditioning, a method called mass matrix preconditioning is often used: this method uses the inverse mass matrix \(\mathbf{M}^{-1}\) as a preconditioner (so is like the operator preconditioning example without the \(\mathbf{W}\)).

More discrete spaces

As the inverse mass matrix \(\mathbf{M}^{-1}\) appears everywhere in the preconditioning methods we would like to use, it would be great if the mass matrix was well-conditioned: if it is, its inverse can be very quickly and accurately approximated.
There is a condition called the inf-sup condition: if the inf-sup condition holds for the discretisation spaces used, then the mass matrix will be well-conditioned. Unfortunately, the inf-sup condition does not hold when using a combination of DP0 and P1 spaces.
All is not lost, however, as there are spaces we can use that do satisfy the inf-sup condition. We call these DUAL0 and DUAL1, and they form inf-sup stable pairs with P1 and DP0 (respectively). They are defined using the barycentric dual mesh: this mesh is defined by joining each vertex of a triangle with the midpoint of the opposite side, then making polygons with all the small triangles that touch a vertex in the original mesh:
The mesh (left), the barycentric refinement (centre), and the dual grid (right)
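The refinement step can be sketched for a single flat triangle as follows. This is only an illustration with made-up function names: it splits one triangle into six and does not do the second step of grouping the small triangles around each vertex of a full mesh.

```python
def barycentric_refinement(triangle):
    """Split a triangle (three (x, y) vertices) into six smaller triangles.

    Joining each vertex to the midpoint of the opposite side gives three lines
    that cross at the centroid, cutting the triangle into six pieces.
    """
    v0, v1, v2 = triangle
    centroid = ((v0[0] + v1[0] + v2[0]) / 3, (v0[1] + v1[1] + v2[1]) / 3)
    vertices = [v0, v1, v2]
    small = []
    for i in range(3):
        a, b = vertices[i], vertices[(i + 1) % 3]
        mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        # Each side of the triangle contributes two small triangles
        small.append((a, mid, centroid))
        small.append((mid, b, centroid))
    return small


def area(t):
    """Area of a triangle given as three (x, y) vertices."""
    (x0, y0), (x1, y1), (x2, y2) = t
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) / 2


triangle = ((0, 0), (3, 0), (0, 3))
pieces = barycentric_refinement(triangle)
print(len(pieces), [area(p) for p in pieces])
```

The six pieces all have the same area, one sixth of the area of the original triangle.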
Example DUAL1 and DUAL0 basis functions look like this:
DUAL1 (left) and DUAL0 (right) basis functions.
For Maxwell's equations, we define BC (Buffa–Christiansen) and RBC (rotated BC) functions to make inf-sup stable space pairs. Example BC and RBC basis functions look like this:
Example BC (left) and RBC (right) basis functions.

My thesis then gives some example Python scripts that show how these spaces can be used in Bempp to solve some example problems, concluding chapter 2 of my thesis. Why not take a break and have a slice of the following figure before reading on.
An electromagnetic wave scattering off a perfectly conducting metal cake. This solution was found using a Calderón preconditioned boundary element method.
 2020-01-31 
This is the first post in a series of posts about my PhD thesis.
Yesterday, I handed in the final version of my PhD thesis. This is the first in a series of blog posts in which I will attempt to explain what my thesis says with minimal mathematical terminology and notation.
The aim of these posts is to give a more general audience some idea of what my research is about. If you are looking to build on my work, I recommend that you read my thesis, or my papers based on parts of it for the full mathematical detail. You may find these posts helpful with developing a more intuitive understanding of some of the results in these.
In this post, we start at the beginning and look at what is contained in chapter 1 of my thesis.

Introduction

In general, my work looks at using discretisation to approximately solve partial differential equations (PDEs). We primarily focus on three PDEs:

If \(u\) represents the temperature in a region, the region contains no heat sources, and the heat distribution in the region is not changing, then \(u\) is a solution of Laplace's equation. Because of this application, Laplace's equation is sometimes called the steady-state heat equation.

The Helmholtz equation models acoustic waves travelling through a medium. The unknown \(u\) represents the amplitude of the wave at each point. The equation features the wavenumber \(k\), which gives the number of waves per unit distance.

Maxwell's equations describe the behaviour of electromagnetic waves. The unknown \(e\) in Maxwell's equations is an unknown vector, whereas the unknown \(u\) in Laplace and Helmholtz is a scalar. This adds some difficulty when working with Maxwell's equations.

Discretisation

In many situations, no method for finding the exact solution to these equations is known. It is therefore very important to be able to get accurate approximations of the solutions of these equations. My PhD thesis focusses on how these can be approximately solved using discretisation.
Each of the PDEs that we want to solve describes a function that has a value at every point in a 3D space: there are an (uncountably) infinite number of points, so these equations effectively have an infinite number of unknown values that we want to know. The aim of discretisation is to reduce the number of unknowns to a finite number in such a way that solving the finite problem gives a good approximation of the true solution. The finite problem can then be solved using your favourite matrix method.
For example, we could split a 2D shape into lots of small triangles, or split a 3D shape into lots of small tetrahedra. We could then try to approximate our solution with a constant inside each triangle or tetrahedron. Or we could approximate by taking a value at each vertex of the triangles or tetrahedra, and a linear function inside each shape. Or we could use quadratic functions or higher polynomials to approximate the solution. This is the approach taken by a method called the finite element method.
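A one-dimensional version of this idea is easy to try out. The sketch below is an illustration under assumptions that are not from my thesis: it approximates the function \(x^2\) on the interval from 0 to 1 (rather than solving a PDE), splits the interval into \(n\) equal cells, and compares a constant on each cell with a linear function on each cell.

```python
def f(x):
    return x * x


def piecewise_constant(f, n):
    """Approximate f on [0, 1] by its midpoint value on each of n equal cells."""
    def approx(x):
        cell = min(int(x * n), n - 1)
        return f((cell + 0.5) / n)
    return approx


def piecewise_linear(f, n):
    """Approximate f on [0, 1] by joining its values at the cell endpoints with lines."""
    def approx(x):
        cell = min(int(x * n), n - 1)
        left, right = cell / n, (cell + 1) / n
        t = (x - left) * n  # position inside the cell, between 0 and 1
        return (1 - t) * f(left) + t * f(right)
    return approx


def max_error(f, approx, samples=1000):
    """Largest difference between f and its approximation at evenly spaced points."""
    return max(abs(f(i / samples) - approx(i / samples)) for i in range(samples + 1))


for n in (10, 20, 40):
    print(n, max_error(f, piecewise_constant(f, n)), max_error(f, piecewise_linear(f, n)))
```

Both errors shrink as the number of cells grows, and the piecewise linear approximation is the more accurate of the two for each \(n\) tried here.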
My thesis, however, focusses on the boundary element method, a method that is closely related to the finite element method. Imagine that we want to solve a PDE problem inside a sphere. For some PDEs (such as the three above), it is possible to represent the solution in terms of some functions on the surface of the sphere. We can then approximate the solution to the problem by approximating these functions on the surface: we then only have to discretise the surface and not the whole of the inside of the sphere. This method will also work for other 3D shapes, and isn't limited to spheres.
One big advantage of this method is that it can also be used if we want to solve a problem in the (infinite) region outside an object, as even if the region is infinite, the surface of the object will be finite. The finite element method has difficulties here, as splitting the infinite region into tetrahedra would require an infinite number of tetrahedra, which defeats the point of discretisation (although this kind of problem can be solved using finite elements if you split into tetrahedra far enough outwards, then use some tricks where your tetrahedra stop). The finite element method, on the other hand, works for a much wider range of PDEs, so in many situations it would be the only one of these two methods that can be used.
We focus on the boundary element method as when it can be used, it can be a very powerful method.

Sobolev spaces

Much of the first chapter of my thesis looks at the definitions of Sobolev function spaces: these spaces describe the types of functions that we will look for to solve our PDEs. It is important to frame the problems using Sobolev spaces, as properties of these spaces are used heavily when proving results about our approximations.
The definitions of these spaces are too technical to include here, but the spaces can be thought of as a way to require certain things of the solutions to our problems. We start by demanding that the functions be square integrable: they can be squared then integrated to give a finite answer. Other spaces are then defined by demanding that a function can be differentiated to give a square integrable derivative; or it can be differentiated twice or more times.
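As a simple one-dimensional illustration of square integrability (a standard example, not one taken from my thesis), consider two functions on the interval from 0 to 1. The function \(x^{-1/4}\) is square integrable, even though it is infinite at 0, because
$$\int_0^1 \left(x^{-1/4}\right)^2\,\mathrm{d}x = \int_0^1 x^{-1/2}\,\mathrm{d}x = \left[2x^{1/2}\right]_0^1 = 2.$$
The function \(x^{-1/2}\), on the other hand, is not square integrable, because
$$\int_0^1 \left(x^{-1/2}\right)^2\,\mathrm{d}x = \int_0^1 x^{-1}\,\mathrm{d}x = \lim_{\varepsilon\to0}\left(-\ln\varepsilon\right) = \infty.$$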

Boundary element method

We now look at how the boundary element method works in more detail. The "boundary" in the name refers to the surface of the object or region that we are solving problems in.
In order to be able to use the boundary element method, we need to know the Green's function of the PDE in question. (This is what limits the number of problems that the boundary element method can be used to solve.) The Green's function is a solution of the PDE at every point in 3D (or 2D if you prefer) except for one point, where it has an infinite value, but a very special type of infinite value. The Green's function gives the field due to a unit point source at the chosen point.
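For Laplace's equation in 3D, the usual choice of Green's function for a source at the point \(\mathbf{y}\) is \(1/(4\pi|\mathbf{x}-\mathbf{y}|)\). The sketch below assumes that formula and uses a simple finite difference approximation (an approximation chosen for this example only) to check that it does satisfy Laplace's equation at a point away from the source.

```python
from math import pi, sqrt


def green_laplace_3d(x, y):
    """Green's function of Laplace's equation in 3D: 1 / (4 pi |x - y|)."""
    return 1 / (4 * pi * sqrt(sum((a - b) ** 2 for a, b in zip(x, y))))


def laplacian(u, x, h=1e-3):
    """Central finite difference approximation of the Laplacian of u at the point x."""
    total = 0.0
    for axis in range(3):
        forward = list(x)
        forward[axis] += h
        backward = list(x)
        backward[axis] -= h
        total += (u(forward) - 2 * u(x) + u(backward)) / h ** 2
    return total


source = (0.0, 0.0, 0.0)
point = (1.0, 0.5, 0.2)

# Away from the source, the Laplacian should be (approximately) zero
value = laplacian(lambda x: green_laplace_3d(x, source), point)
print("Laplacian away from the source:", value)

# Close to the source, the Green's function blows up
print(green_laplace_3d((1e-3, 0.0, 0.0), source), green_laplace_3d((1e-6, 0.0, 0.0), source))
```

The first number printed is tiny (it is only non-zero because of the approximation), while the values close to the source grow without bound as the point approaches it.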
If a region contains no sources, then the solution inside that region can be represented by adding up point sources at every point on the boundary (or in other words, integrating the Green's function times a weighting function, as the limit of adding up these at lots of points is an integral). The solution could instead be represented by doing the same thing with the derivative of the Green's function.
We define two potential operators—the single layer potential \(\mathcal{V}\) and the double layer potential \(\mathcal{K}\)—to be the two operators representing the two boundary integrals: these operators can be applied to weight functions on the boundary to give solutions to the PDE in the region. Using these operators, we write a representation formula that tells us how to compute the solution inside the region once we know the weight functions on the boundary.
By looking at the values of these potential operators on the surface, and the derivatives of these values, (and integrating these over the boundary) we can define four boundary operators: the single layer operator \(\mathsf{V}\), the double layer operator \(\mathsf{K}\), the adjoint double layer operator \(\mathsf{K}'\), and the hypersingular operator \(\mathsf{W}\). We use these operators to write the boundary integral equation that we derive from the representation formula. It is this boundary integral equation that we will discretise to find an approximation of the weight functions; we can then use the representation formula to find the solution to the PDE.

The Calderón projector

Using the four boundary operators, we can define the Calderón projector, \(\mathsf{C}\):
$$\mathsf{C}=\begin{bmatrix} \tfrac12-\mathsf{K}&\mathsf{V}\\ \mathsf{W}&\tfrac12+\mathsf{K}' \end{bmatrix}$$
This is the Calderón projector for problems in a finite interior region. For exterior problems, the exterior Calderón projector is used. This is given by
$$\mathsf{C}=\begin{bmatrix} \tfrac12+\mathsf{K}&-\mathsf{V}\\ -\mathsf{W}&\tfrac12-\mathsf{K}' \end{bmatrix}.$$
We treat the Calderón projector like a matrix, even though its entries are operators not numbers. The Calderón projector has some very useful properties that are commonly used when using the boundary element method. These include:

These properties conclude chapter 1 of my thesis. Why not take a break and fill the following diagram with hot liquid before reading on.
The results of applying the Calderón projector to two functions. By the properties above, these are (the boundary traces of) a solution of the PDE.
 2020-01-24 
This is the second post in a series of posts about matrix methods.
We want to solve \(\mathbf{A}\mathbf{x}=\mathbf{b}\), where \(\mathbf{A}\) is a (known) matrix, \(\mathbf{b}\) is a (known) vector, and \(\mathbf{x}\) is an unknown vector.
This matrix system can be thought of as a way of representing simultaneous equations. For example, the following matrix problem and system of simultaneous equations are equivalent.
\begin{align*} \begin{pmatrix}2&2\\3&2\end{pmatrix}\mathbf{x}&=\begin{pmatrix}3\\4\end{pmatrix} &&\quad&& \begin{array}{r} 2x+2y=3\\ 3x+2y=4 \end{array} \end{align*}
The simultaneous equations here would usually be solved by adding or subtracting the equations together. In this example, subtracting the first equation from the second gives \(x=1\). From there, it is not hard to find that \(y=\frac12\).
One approach to solving \(\mathbf{A}\mathbf{x}=\mathbf{b}\) is to find the inverse matrix \(\mathbf{A}^{-1}\), and use \(\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}\). In this post, we use Gaussian elimination—a method that closely resembles the simultaneous equation method—to find \(\mathbf{A}^{-1}\).
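The simultaneous equation steps for the small example above (\(2x+2y=3\) and \(3x+2y=4\)) can be written out as a short sketch. Storing each equation as a list of two coefficients and a right-hand side is just a convenient choice for this illustration.

```python
from fractions import Fraction

# Each equation is stored as [coefficient of x, coefficient of y, right-hand side]
eq1 = [Fraction(2), Fraction(2), Fraction(3)]  # 2x + 2y = 3
eq2 = [Fraction(3), Fraction(2), Fraction(4)]  # 3x + 2y = 4

# Subtract the first equation from the second: the y terms cancel
difference = [b - a for a, b in zip(eq1, eq2)]
x = difference[2] / difference[0]

# Substitute x back into the first equation to find y
y = (eq1[2] - eq1[0] * x) / eq1[1]

print("x =", x, "and y =", y)
```

Gaussian elimination carries out the same kind of step, but on the rows of a matrix.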

Gaussian elimination

As an example, we will use Gaussian elimination to find the inverse of the matrix
$$\begin{pmatrix} 1&-2&4\\ -2&3&-2\\ -2&2&2 \end{pmatrix}.$$
First, write the matrix with an identity matrix next to it.
$$\left(\begin{array}{ccc|ccc} 1&-2&4&1&0&0\\ -2&3&-2&0&1&0\\ -2&2&2&0&0&1 \end{array}\right)$$
Our aim is then to use row operations to change the matrix on the left of the vertical line into the identity matrix, as the matrix on the right will then be the inverse. We are allowed to use two row operations: we can multiply a row by a scalar; or we can add a multiple of a row to another row. These operations closely resemble the steps used to solve simultaneous equations.
We will get the matrix to the left of the vertical line to be the identity in a systematic manner: our first aim is to get the first column to read 1, 0, 0. We already have the 1; to get the 0s, add 2 times the first row to both the second and third rows.
$$\left(\begin{array}{ccc|ccc} 1&-2&4&1&0&0\\ 0&-1&6&2&1&0\\ 0&-2&10&2&0&1 \end{array}\right)$$
Our next aim is to get the second column to read 0, 1, 0. To get the 1, we multiply the second row by -1.
$$\left(\begin{array}{ccc|ccc} 1&-2&4&1&0&0\\ 0&1&-6&-2&-1&0\\ 0&-2&10&2&0&1 \end{array}\right)$$
To get the 0s, we add 2 times the second row to both the first and third rows.
$$\left(\begin{array}{ccc|ccc} 1&0&-8&-3&-2&0\\ 0&1&-6&-2&-1&0\\ 0&0&-2&-2&-2&1 \end{array}\right)$$
Our final aim is to get the third column to read 0, 0, 1. To get the 1, we multiply the third row by -½.
$$\left(\begin{array}{ccc|ccc} 1&0&-8&-3&-2&0\\ 0&1&-6&-2&-1&0\\ 0&0&1&1&1&-\tfrac{1}{2} \end{array}\right)$$
To get the 0s, we add 8 and 6 times the third row to the first and second rows (respectively).
$$\left(\begin{array}{ccc|ccc} 1&0&0&5&6&-4\\ 0&1&0&4&5&-3\\ 0&0&1&1&1&-\tfrac{1}{2} \end{array}\right)$$
We have the identity on the left of the vertical bar, so we can conclude that
$$\begin{pmatrix} 1&-2&4\\ -2&3&-2\\ -2&2&2 \end{pmatrix}^{-1} = \begin{pmatrix} 5&6&-4\\ 4&5&-3\\ 1&1&-\tfrac{1}{2} \end{pmatrix}.$$
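The steps above can be sketched in Python. This is a minimal illustration, not a robust implementation: the function name `invert` is made up, exact fractions are used to avoid rounding, and (like the worked example) it only uses the two row operations described above, so it gives up if it meets a zero on the diagonal rather than swapping rows.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by Gaussian elimination on the matrix next to an identity."""
    n = len(matrix)
    # Write the matrix with an identity matrix next to it
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for i in range(n):
        pivot = aug[i][i]
        if pivot == 0:
            raise ZeroDivisionError("zero on the diagonal: this sketch does not swap rows")
        # Multiply row i by a scalar to get the 1 in column i
        aug[i] = [v / pivot for v in aug[i]]
        # Add a multiple of row i to every other row to get the 0s in column i
        for j in range(n):
            if j != i:
                factor = aug[j][i]
                aug[j] = [vj - factor * vi for vj, vi in zip(aug[j], aug[i])]
    # The right-hand half is now the inverse
    return [row[n:] for row in aug]


A = [[1, -2, 4], [-2, 3, -2], [-2, 2, 2]]
A_inv = invert(A)
for row in A_inv:
    print([str(entry) for entry in row])
```

The printed rows should agree with the inverse found by hand above.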

How many operations

This method can be used on matrices of any size. We can imagine doing this with an \(n\times n\) matrix and look at how many operations the method will require, as this will give us an idea of how long this method would take for very large matrices. Here, we count each use of \(+\), \(-\), \(\times\) and \(\div\) as a (floating point) operation (often called a flop).
Let's think about what needs to be done to get the \(i\)th column of the matrix equal to 0, ..., 0, 1, 0, ..., 0.
First, we need to divide everything in the \(i\)th row by the value in the \(i\)th row and \(i\)th column. The first \(i-1\) entries in the column will already be 0s though, so there is no need to divide these. This leaves \(n-(i-1)\) entries that need to be divided, so this step takes \(n-(i-1)\), or \(n+1-i\) operations.
Next, for each other row (let's call this the \(j\)th row), we add or subtract a multiple of the \(i\)th row from the \(j\)th row. (Again the first \(i-1\) entries can be ignored as they are 0.) Multiplying the \(i\)th row takes \(n+1-i\) operations, then adding/subtracting takes another \(n+1-i\) operations. This needs to be done for \(n-1\) rows, so takes a total of \(2(n-1)(n+1-i)\) operations.
After these two steps, we have finished with the \(i\)th column, in a total of \((2n-1)(n+1-i)\) operations.
We have to do this for each \(i\) from 1 to \(n\), so the total number of operations to complete Gaussian elimination is
$$ (2n-1)(n+1-1) + (2n-1)(n+1-2) +...+ (2n-1)(n+1-n) $$
This simplifies to $$\tfrac12n(2n-1)(n+1)$$ or $$n^3+\tfrac12n^2-\tfrac12n.$$
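The simplification can be checked with a short sketch that adds up the operations column by column, using the same counting as above, and compares the total with the formula. The function names are made up for this example.

```python
def gaussian_elimination_ops(n):
    """Add up the operation counts described above for an n by n matrix."""
    total = 0
    for i in range(1, n + 1):
        width = n + 1 - i               # entries of row i that are not already 0
        total += width                  # dividing row i by the value on the diagonal
        total += 2 * (n - 1) * width    # multiply, then add/subtract, for the other n-1 rows
    return total


def closed_form(n):
    """The simplified formula n(2n-1)(n+1)/2."""
    return n * (2 * n - 1) * (n + 1) // 2


for n in (1, 2, 3, 10, 100):
    print(n, gaussian_elimination_ops(n), closed_form(n))
```

For every \(n\), the two columns printed should agree.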
The highest power of \(n\) is \(n^3\), so we say that this algorithm is an order \(n^3\) algorithm, often written \(\mathcal{O}(n^3)\). We focus on the highest power of \(n\) as if \(n\) is very large, \(n^3\) will be by far the largest number in the expression, so gives us an idea of how fast/slow this algorithm will be for large matrices.
\(n^3\) is not a bad start—it's far better than \(n^4\), \(n^5\), or \(2^n\)—but there are methods out there that will take less than \(n^3\) operations. We'll see some of these later in this series.
 2020-01-23 
This is the first post in a series of posts about matrix methods.
When you first learn about matrices, you learn that in order to multiply two matrices, you use this strange-looking method involving the rows of the left matrix and the columns of the right.
It doesn't immediately seem clear why this should be the way to multiply matrices. In this blog post, we look at why this is the definition of matrix multiplication.

Simultaneous equations

Matrices can be thought of as representing a system of simultaneous equations. For example, solving the matrix problem
$$ \begin{bmatrix}2&5&2\\1&0&-2\\3&1&1\end{bmatrix} \begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}14\\-16\\-4\end{pmatrix} $$
is equivalent to solving the following simultaneous equations.
\begin{align*} 2x+5y+2z&=14\\ 1x+0y-2z&=-16\\ 3x+1y+1z&=-4 \end{align*}
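This equivalence can be checked numerically. The sketch below uses the values \(x=-4\), \(y=2\), \(z=6\), which were worked out separately for this illustration, and confirms that they satisfy both the matrix problem and the three equations.

```python
def mat_vec(m, v):
    """Multiply a matrix by a vector: each entry is a row of m times v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


M = [[2, 5, 2], [1, 0, -2], [3, 1, 1]]
x, y, z = -4, 2, 6

# The matrix form
print(mat_vec(M, [x, y, z]))

# The three simultaneous equations, written out one at a time
print(2 * x + 5 * y + 2 * z, 1 * x + 0 * y - 2 * z, 3 * x + 1 * y + 1 * z)
```

Both lines should show the right-hand side values 14, -16 and -4.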

Two matrices

Now, let \(\mathbf{A}\) and \(\mathbf{C}\) be two 3×3 matrices, let \(\mathbf{b}\) be a vector with three elements, and let \(\mathbf{x}=(x_1,x_2,x_3)\). We consider the equation
$$\mathbf{A}\mathbf{C}\mathbf{x}=\mathbf{b}.$$
In order to understand what this equation means, we let \(\mathbf{y}=\mathbf{C}\mathbf{x}\) and think about solving the two simultaneous matrix equations,
\begin{align*} \mathbf{A}\mathbf{y}&=\mathbf{b}\\ \mathbf{C}\mathbf{x}&=\mathbf{y}. \end{align*}
We can write the entries of \(\mathbf{A}\), \(\mathbf{C}\), \(\mathbf{x}\), \(\mathbf{y}\) and \(\mathbf{b}\) as
\begin{align*} \mathbf{A}&=\begin{bmatrix} a_{11}&a_{12}&a_{13}\\ a_{21}&a_{22}&a_{23}\\ a_{31}&a_{32}&a_{33} \end{bmatrix} & \mathbf{C}&=\begin{bmatrix} c_{11}&c_{12}&c_{13}\\ c_{21}&c_{22}&c_{23}\\ c_{31}&c_{32}&c_{33} \end{bmatrix} \end{align*} \begin{align*} \mathbf{x}&=\begin{pmatrix}x_1\\x_2\\x_3\end{pmatrix} & \mathbf{y}&=\begin{pmatrix}y_1\\y_2\\y_3\end{pmatrix} & \mathbf{b}&=\begin{pmatrix}b_1\\b_2\\b_3\end{pmatrix} \end{align*}
We can then write out the simultaneous equations that \(\mathbf{A}\mathbf{y}=\mathbf{b}\) and \(\mathbf{C}\mathbf{x}=\mathbf{y}\) represent:
\begin{align} a_{11}y_1+a_{12}y_2+a_{13}y_3&=b_1& c_{11}x_1+c_{12}x_2+c_{13}x_3&=y_1\\ a_{21}y_1+a_{22}y_2+a_{23}y_3&=b_2& c_{21}x_1+c_{22}x_2+c_{23}x_3&=y_2\\ a_{31}y_1+a_{32}y_2+a_{33}y_3&=b_3& c_{31}x_1+c_{32}x_2+c_{33}x_3&=y_3\\ \end{align}
Substituting the equations on the right into those on the left gives:
\begin{align} a_{11}(c_{11}x_1+c_{12}x_2+c_{13}x_3)+a_{12}(c_{21}x_1+c_{22}x_2+c_{23}x_3)+a_{13}(c_{31}x_1+c_{32}x_2+c_{33}x_3)&=b_1\\ a_{21}(c_{11}x_1+c_{12}x_2+c_{13}x_3)+a_{22}(c_{21}x_1+c_{22}x_2+c_{23}x_3)+a_{23}(c_{31}x_1+c_{32}x_2+c_{33}x_3)&=b_2\\ a_{31}(c_{11}x_1+c_{12}x_2+c_{13}x_3)+a_{32}(c_{21}x_1+c_{22}x_2+c_{23}x_3)+a_{33}(c_{31}x_1+c_{32}x_2+c_{33}x_3)&=b_3\\ \end{align}
Gathering the terms containing \(x_1\), \(x_2\) and \(x_3\) leads to:
\begin{align} (a_{11}c_{11}+a_{12}c_{21}+a_{13}c_{31})x_1 +(a_{11}c_{12}+a_{12}c_{22}+a_{13}c_{32})x_2 +(a_{11}c_{13}+a_{12}c_{23}+a_{13}c_{33})x_3&=b_1\\ (a_{21}c_{11}+a_{22}c_{21}+a_{23}c_{31})x_1 +(a_{21}c_{12}+a_{22}c_{22}+a_{23}c_{32})x_2 +(a_{21}c_{13}+a_{22}c_{23}+a_{23}c_{33})x_3&=b_2\\ (a_{31}c_{11}+a_{32}c_{21}+a_{33}c_{31})x_1 +(a_{31}c_{12}+a_{32}c_{22}+a_{33}c_{32})x_2 +(a_{31}c_{13}+a_{32}c_{23}+a_{33}c_{33})x_3&=b_3 \end{align}
We can write this as a matrix:
$$ \begin{bmatrix} a_{11}c_{11}+a_{12}c_{21}+a_{13}c_{31}& a_{11}c_{12}+a_{12}c_{22}+a_{13}c_{32}& a_{11}c_{13}+a_{12}c_{23}+a_{13}c_{33}\\ a_{21}c_{11}+a_{22}c_{21}+a_{23}c_{31}& a_{21}c_{12}+a_{22}c_{22}+a_{23}c_{32}& a_{21}c_{13}+a_{22}c_{23}+a_{23}c_{33}\\ a_{31}c_{11}+a_{32}c_{21}+a_{33}c_{31}& a_{31}c_{12}+a_{32}c_{22}+a_{33}c_{32}& a_{31}c_{13}+a_{32}c_{23}+a_{33}c_{33} \end{bmatrix} \mathbf{x}=\mathbf{b} $$
This equation is equivalent to \(\mathbf{A}\mathbf{C}\mathbf{x}=\mathbf{b}\), so the matrix above is equal to \(\mathbf{A}\mathbf{C}\). But this matrix is what you get if you follow the row-and-column matrix multiplication method, and so we can see why this definition makes sense.
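The argument can be checked on a concrete example. In the sketch below, the matrices \(\mathbf{A}\) and \(\mathbf{C}\) and the vector \(\mathbf{x}\) are arbitrary values chosen for illustration: applying \(\mathbf{C}\) then \(\mathbf{A}\) to \(\mathbf{x}\) gives the same result as applying the row-and-column product \(\mathbf{A}\mathbf{C}\) in one go.

```python
def mat_vec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mat_mul(a, c):
    """Row-and-column rule: entry (i, j) is row i of a times column j of c."""
    return [
        [sum(a[i][k] * c[k][j] for k in range(len(c))) for j in range(len(c[0]))]
        for i in range(len(a))
    ]


A = [[2, 5, 2], [1, 0, -2], [3, 1, 1]]
C = [[1, 2, 0], [0, 1, -1], [4, 0, 3]]
x = [1, -2, 3]

y = mat_vec(C, x)              # first solve step in reverse: y = Cx
two_steps = mat_vec(A, y)      # then b = Ay
one_step = mat_vec(mat_mul(A, C), x)

print(two_steps, one_step)
```

The two vectors printed should be identical.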

© Matthew Scroggs 2012–2024