This post was translated from Korean by LLM (Kimi). The translation may contain errors or awkward sentences. The Korean original is the source of truth.

In this post we define matrices and the operations between them.

Basic Definitions

Simply put, an \(m\times n\) matrix is a rectangular array of \(mn\) scalars arranged in \(m\) rows and \(n\) columns. If we denote the entry of \(A\) in the \(i\)th row and \(j\)th column by \(A_{ij}\), then \(A\) can be written as follows.

\[A=\begin{pmatrix}A_{11}&A_{12}&\cdots&A_{1n}\\A_{21}&A_{22}&\cdots&A_{2n}\\ \vdots&\vdots&\ddots&\vdots\\A_{m1}&A_{m2}&\cdots&A_{mn}\end{pmatrix}\]

In the matrix \(A\) above, the \(m\) vectors

\[(A_{11},A_{12},\cdots, A_{1n}),\;\ldots,\;(A_{m1},A_{m2},\cdots,A_{mn})\]

are called the rows of \(A\), and the \(n\) vectors

\[(A_{11},A_{21},\cdots,A_{m1}),\; \ldots,\; (A_{1n}, A_{2n},\cdots, A_{mn})\]

are called the columns of \(A\). Here the row vectors are all elements of the \(n\)-dimensional space \(\mathbb{K}^n\), and the column vectors are elements of the \(m\)-dimensional space \(\mathbb{K}^m\).
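As a small illustration of rows and columns, here is a minimal sketch in Python. It assumes a matrix is stored as a list of rows, which is a representation chosen here for illustration and not something the definitions above depend on.

```python
# A 2x3 matrix stored as a list of rows (an assumed representation).
A = [
    [1, 2, 3],
    [4, 5, 6],
]

m = len(A)      # number of rows
n = len(A[0])   # number of columns

# The m rows are vectors with n components each.
rows = [tuple(row) for row in A]

# The n columns are vectors with m components each.
columns = [tuple(A[i][j] for i in range(m)) for j in range(n)]

print(rows)     # [(1, 2, 3), (4, 5, 6)]
print(columns)  # [(1, 4), (2, 5), (3, 6)]
```

Each of the 2 rows has 3 components and each of the 3 columns has 2 components, matching the statement that rows lie in \(\mathbb{K}^n\) and columns in \(\mathbb{K}^m\).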

It is often convenient to group certain blocks of a matrix and write it in block form. For example, the following matrix

\[\begin{pmatrix}0&0&1\\ 0&0&3\\ 2&4&0\end{pmatrix}\]

can be conveniently written as in the figure below

[Figure: block_matrix (the matrix above written in block form)]

Similarly, using the column vectors \(A_1\), \(A_2\), \(\ldots\), \(A_n\) of the above \(m\times n\) matrix \(A\), we can also write

\[A=(A_1|A_2|\cdots|A_n)\]

The set of all \(m\times n\) matrices whose entries are elements of \(\mathbb{K}\) is denoted by \(\Mat_{m\times n}(\mathbb{K})\). In particular, \(\Mat_{m\times m}(\mathbb{K})\) is abbreviated as \(\Mat_m(\mathbb{K})\).

Operations on Matrices

Addition and multiplication are defined for matrices, but not for arbitrary pairs of matrices. For example, addition is defined only for matrices of the same shape, as the entrywise sum. Thus \(\Mat_{m\times n}(\mathbb{K})\) has a well-defined addition. Furthermore, if we define scalar multiplication by the following formula

\[\alpha\begin{pmatrix}A_{11}&A_{12}&\cdots&A_{1n}\\A_{21}&A_{22}&\cdots&A_{2n}\\ \vdots&\vdots&\ddots&\vdots\\A_{m1}&A_{m2}&\cdots&A_{mn}\end{pmatrix}=\begin{pmatrix}\alpha A_{11}&\alpha A_{12}&\cdots&\alpha A_{1n}\\\alpha A_{21}&\alpha A_{22}&\cdots&\alpha A_{2n}\\ \vdots&\vdots&\ddots&\vdots\\\alpha A_{m1}&\alpha A_{m2}&\cdots&\alpha A_{mn}\end{pmatrix}\]

one can easily verify that \(\Mat_{m\times n}(\mathbb{K})\) has a \(\mathbb{K}\)-vector space structure. This vector space has dimension \(mn\). The simplest basis consists of the following \(mn\) matrices, each with a single entry equal to \(1\) and all other entries equal to \(0\)

\[\begin{pmatrix}1&0&\cdots&0\\0&0&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&0\end{pmatrix},\quad\begin{pmatrix}0&1&\cdots&0\\0&0&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&0\end{pmatrix},\quad\cdots,\quad\begin{pmatrix}0&0&\cdots&0\\0&0&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&1\end{pmatrix}\]
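The claim that these matrices form a basis can be illustrated numerically. The following is a minimal sketch, assuming matrices are stored as lists of rows; the helper names `unit_matrix`, `add`, and `scale` are hypothetical and introduced only for this example. It rebuilds a \(2\times 3\) matrix as a linear combination of the \(6\) basis matrices, using its own entries as coefficients.

```python
def unit_matrix(m, n, i, j):
    """m x n matrix with a 1 in position (i, j) and 0 elsewhere (0-based indices)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(m)]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def scale(alpha, A):
    """Scalar multiple alpha * A."""
    return [[alpha * a for a in row] for row in A]

A = [[1, 2, 3], [4, 5, 6]]
m, n = 2, 3

# One basis matrix per position (i, j), so m * n matrices in total.
basis = {(i, j): unit_matrix(m, n, i, j) for i in range(m) for j in range(n)}

# Sum of A[i][j] * E_ij over all positions, starting from the zero matrix.
total = [[0] * n for _ in range(m)]
for (i, j), E in basis.items():
    total = add(total, scale(A[i][j], E))

print(len(basis))   # 6
print(total == A)   # True
```

This only checks that the matrices span one particular \(A\); linear independence follows because each basis matrix controls exactly one entry.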

Matrix multiplication is somewhat more complicated, so we first need to define the product of a matrix and a vector.

Definition 2 Let \(A\in\Mat_{m\times n}(\mathbb{K})\) and \(x\in\mathbb{K}^n\). Then the product \(Ax\) of the matrix \(A\) and the vector \(x\) is defined by the following formula

\[Ax=\sum_{j=1}^nx_j A_j\]

Here \(x_j\) is the \(j\)th component of \(x\), and \(A_j\) is the \(j\)th column vector of \(A\).

By the above definition, \(Ax\) is a linear combination of the column vectors \(A_j\) for every vector \(x\), and conversely every linear combination of the \(A_j\) is of the form \(Ax\) for a suitable \(x\). Therefore, the set of all \(Ax\) for \(x\in\mathbb{K}^n\) is \(\span\left\{A_1, A_2,\ldots,A_n\right\}\), which we call the column space of \(A\) and denote briefly by \(\col A\). Of course, the row space of \(A\) can be defined similarly, but it is rarely used.

The above formula can be written out in more detail. Let \(A_{ij}\) denote the \(i\)th component of the column vector \(A_j\). Then the first component of the vector \(Ax\) is expressed as the sum of the first components of the vectors \(x_jA_j\), so it can be written as

\[x_1A_{11}+x_2A_{12}+\cdots+x_nA_{1n}\]

and more generally, the \(i\)th component of \(Ax\) is expressed as

\[x_1A_{i1}+x_2A_{i2}+\cdots+x_nA_{in}\]

Thus we explicitly obtain the following formula

\[\begin{pmatrix}A_{11}&A_{12}&\cdots&A_{1n}\\A_{21}&A_{22}&\cdots&A_{2n}\\\vdots&\vdots&\ddots&\vdots\\A_{m1}&A_{m2}&\cdots&A_{mn}\end{pmatrix}\begin{pmatrix}x_1\\x_2\\\vdots\\x_n\end{pmatrix}=\begin{pmatrix}\sum_{j=1}^nA_{1j}x_j\\\sum_{j=1}^nA_{2j}x_j\\\vdots\\\sum_{j=1}^nA_{mj}x_j\end{pmatrix}\]
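The two descriptions of \(Ax\), as a linear combination of columns and as a vector of row-by-row sums, can be compared directly. Below is a minimal sketch, assuming matrices are lists of rows; the function names `matvec_by_columns` and `matvec_by_entries` are hypothetical.

```python
def matvec_by_columns(A, x):
    """Ax as the linear combination sum_j x_j * A_j of the columns of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        column_j = [A[i][j] for i in range(m)]
        result = [r + x[j] * a for r, a in zip(result, column_j)]
    return result

def matvec_by_entries(A, x):
    """Ax computed entry by entry: the i-th component is sum_j A_ij * x_j."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[1, 2, 3], [4, 5, 6]]
x = [1, 0, -1]

print(matvec_by_columns(A, x))  # [-2, -2]
print(matvec_by_entries(A, x))  # [-2, -2]
```

Here \(Ax=1\cdot(1,4)+0\cdot(2,5)-1\cdot(3,6)=(-2,-2)\), and the entrywise formula gives the same vector.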

Matrix Multiplication

Having defined the product of a matrix and a vector, we can now define the product of two matrices.

Definition 3 For two matrices \(A\in \Mat_{m\times n}(\mathbb{K})\) and \(B\in\Mat_{p\times q}(\mathbb{K})\), a necessary and sufficient condition for the matrix product \(BA\) to be defined is that \(q=m\). In this case, the matrix product \(BA\) is given by the following formula

\[BA=(BA_1|BA_2|\cdots|BA_n)\]

(\(A_i\) is the \(i\)th column of \(A\), and \(BA_i\) is the product of the matrix \(B\) and the column vector \(A_i\); see footnote 1.)

That is, if we denote the \(i\)th component of \(A_j\) by \(A_{ij}\), then we can write

\[BA_j=\sum_{k=1}^m A_{kj}B_k\]

Then the \(i\)th component of the column vector \(BA_j\), i.e., the \((i,j)\)-entry \((BA)_{ij}\) of the matrix \(BA\), is given by

\[(BA)_{ij}=\sum_{k=1}^m B_{ik}A_{kj}\]
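The column-by-column definition and the entrywise formula can be checked against each other. The sketch below assumes matrices are lists of rows, with \(B\) of shape \(p\times m\) and \(A\) of shape \(m\times n\), so the inner sum runs over the \(m\) shared indices. The function names are hypothetical.

```python
def matvec(B, x):
    """Product of a matrix B and a vector x."""
    return [sum(b * xk for b, xk in zip(row, x)) for row in B]

def matmul_by_columns(B, A):
    """BA built column by column: the j-th column of BA is B times the j-th column of A."""
    m, n, p = len(A), len(A[0]), len(B)
    columns = [matvec(B, [A[k][j] for k in range(m)]) for j in range(n)]
    # Reassemble the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(p)]

def matmul_by_entries(B, A):
    """BA from the entry formula (BA)_ij = sum_k B_ik * A_kj, with k over the m shared indices."""
    p, m, n = len(B), len(A), len(A[0])
    return [[sum(B[i][k] * A[k][j] for k in range(m)) for j in range(n)]
            for i in range(p)]

B = [[1, 0, 2], [0, 1, 1]]      # 2x3
A = [[1, 2], [3, 4], [5, 6]]    # 3x2

print(matmul_by_columns(B, A))  # [[11, 14], [8, 10]]
print(matmul_by_entries(B, A))  # [[11, 14], [8, 10]]
```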

Matrix multiplication is not commutative: \(AB=BA\) does not hold in general. First, \(AB\) need not be defined even when \(BA\) is. Second, even when both are defined, \(AB\) and \(BA\) may have different shapes. Finally, even if \(m=n=p=q\), so that \(AB\), \(BA\in\Mat_{m\times m}(\mathbb{K})\), the two products may still differ.
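Two concrete cases may make this clearer. The following sketch assumes matrices are lists of rows and uses a hypothetical helper `matmul`; the specific matrices are chosen only for illustration.

```python
def matmul(X, Y):
    """Product XY of matrices stored as lists of rows; requires len(X[0]) == len(Y)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Both products are defined, but their shapes differ: 1x1 versus 2x2.
R = [[1, 2]]      # 1x2
C = [[3], [4]]    # 2x1
print(matmul(R, C))  # [[11]]
print(matmul(C, R))  # [[3, 6], [4, 8]]

# Square matrices of the same size whose two products still differ.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
print(matmul(A, B))  # [[1, 0], [0, 0]]
print(matmul(B, A))  # [[0, 0], [0, 1]]
```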

On the other hand, matrix multiplication satisfies the associative law. If \(A\), \(B\), and \(C\) are matrices such that \(AB\) and \(BC\) are both defined, then the products \(A(BC)\) and \((AB)C\) are also well-defined, and

\[A(BC)=(AB)C\]

holds.
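A numerical check is not a proof, but it shows how the shapes fit together. The sketch below assumes matrices are lists of rows and uses a hypothetical helper `matmul`, with \(A\) of shape \(2\times 3\), \(B\) of shape \(3\times 2\), and \(C\) of shape \(2\times 1\).

```python
def matmul(X, Y):
    """Product XY of matrices stored as lists of rows; requires len(X[0]) == len(Y)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[1, 2, 3], [4, 5, 6]]      # 2x3
B = [[1, 0], [0, 1], [1, 1]]    # 3x2
C = [[2], [3]]                  # 2x1

left = matmul(A, matmul(B, C))    # A(BC)
right = matmul(matmul(A, B), C)   # (AB)C

print(left)   # [[23], [53]]
print(right)  # [[23], [53]]
```

Both groupings produce the same \(2\times 1\) matrix, even though the intermediate products \(BC\) (shape \(3\times 1\)) and \(AB\) (shape \(2\times 2\)) have different shapes.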

Definition 4 The matrix \(I=(e_1\mid e_2\mid\cdots\mid e_n)\) is called the \(n\times n\) identity matrix.

That is,

\[I=\begin{pmatrix}1&0&\cdots&0\\0&1&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&1\end{pmatrix}\]

More generally, a square matrix whose entries off the main diagonal are all zero is called a diagonal matrix.

True to its name, if \(I\) is the \(n\times n\) identity matrix, then \(AI=A\) for any \(m\times n\) matrix \(A\), and \(IB=B\) for any \(n\times m\) matrix \(B\).
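The identity property can be checked on small examples. This is a minimal sketch, assuming matrices are lists of rows; `identity` and `matmul` are hypothetical helper names.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Product XY of matrices stored as lists of rows; requires len(X[0]) == len(Y)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

I = identity(3)
A = [[1, 2, 3], [4, 5, 6]]      # 2x3, so AI is defined
B = [[1, 2], [3, 4], [5, 6]]    # 3x2, so IB is defined

print(matmul(A, I) == A)  # True
print(matmul(I, B) == B)  # True
```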

Definition 5 Let \(A\) be a matrix in \(\Mat_n(\mathbb{K})\). \(A\) is called invertible if there exists some \(B\in\Mat_n(\mathbb{K})\) such that \(AB=BA=I\). A matrix that is not invertible is called a singular matrix.

Not every element of \(\Mat_n(\mathbb{K})\) is invertible. For example, the zero matrix \(O\) is not invertible. Thus \(\Mat_n(\mathbb{K})\) is not a group under matrix multiplication. If we collect only the invertible \(n\times n\) matrices, however, this set does have a group structure.

Definition 6 The group whose underlying set is the collection of \(n\times n\) invertible matrices and whose operation is matrix multiplication is written as \(\GL(n,\mathbb{K})\) and is called the general linear group.

As mentioned above, matrix multiplication satisfies the associative law. That is, the operation of \(\GL(n,\mathbb{K})\) is associative. Moreover, the identity element of \(\GL(n,\mathbb{K})\) is the identity matrix \(I\), and the inverse of \(A\in\GL(n,\mathbb{K})\) is the matrix \(B\) appearing in Definition 5. From the uniqueness in §Abelian Groups and Fields, ⁋Proposition 2, we know that the inverse of \(A\) is unique. This is called the inverse matrix of \(A\) and is denoted by \(A^{-1}\). From the observation

\[(B^{-1}A^{-1})AB=B^{-1}IB=B^{-1}B=I\]

together with the analogous computation \(AB(B^{-1}A^{-1})=AIA^{-1}=AA^{-1}=I\), we see that the product \(AB\) of two invertible matrices is also invertible, and its inverse is \(B^{-1}A^{-1}\).
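The order reversal in \((AB)^{-1}=B^{-1}A^{-1}\) is easy to get wrong, so here is a small check. The sketch assumes matrices are lists of rows; the matrices and their inverses are written down by hand for illustration, and `matmul` is a hypothetical helper.

```python
def matmul(X, Y):
    """Product XY of matrices stored as lists of rows; requires len(X[0]) == len(Y)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

I = [[1, 0], [0, 1]]

A = [[1, 1], [0, 1]]
A_inv = [[1, -1], [0, 1]]
B = [[1, 0], [2, 1]]
B_inv = [[1, 0], [-2, 1]]

# Confirm that the hand-written inverses really are inverses.
assert matmul(A, A_inv) == I and matmul(A_inv, A) == I
assert matmul(B, B_inv) == I and matmul(B_inv, B) == I

AB = matmul(A, B)
candidate = matmul(B_inv, A_inv)     # B^{-1} A^{-1}
wrong_order = matmul(A_inv, B_inv)   # A^{-1} B^{-1}

print(matmul(AB, candidate) == I)    # True
print(matmul(candidate, AB) == I)    # True
print(matmul(AB, wrong_order) == I)  # False
```

In this example \(A^{-1}B^{-1}\) is not an inverse of \(AB\), which reflects the fact that matrix multiplication is not commutative.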

Let us consider why inverses are defined only for square matrices. If an \(m\times n\) matrix \(A\) had an inverse \(B\), then since both \(AB\) and \(BA\) must be defined, \(B\) would necessarily be an \(n\times m\) matrix, and it would have to satisfy

\[AB=I_m,\qquad BA=I_n\]

When \(m\neq n\), no such matrix \(B\) exists. This is an obvious consequence of the §Fundamental Theorem of Linear Algebra, but it can also be proved using only the language of matrices.

Definition 7 Let \(A\in\Mat_n(\mathbb{K})\). Then the trace of \(A\), denoted \(\tr(A)\), is defined as the sum of the diagonal entries of \(A\), i.e. \(\tr(A)=\sum_{i=1}^n A_{ii}\).

The map \(\tr\) defined in this way is a linear map from \(\Mat_n(\mathbb{K})\) to \(\mathbb{K}\). That is, for any \(A,B\in\Mat_n(\mathbb{K})\) and \(\alpha\in\mathbb{K}\)

\[\tr(A+B)=\tr(A)+\tr(B),\qquad \tr(\alpha A)=\alpha\tr(A)\]

holds.

Now suppose two matrices \(A\in\Mat_{m\times n}(\mathbb{K})\) and \(B\in\Mat_{n\times m}(\mathbb{K})\) are given. Then one can verify that

\[\tr(AB)=\sum_{i=1}^m(AB)_{ii}=\sum_{i=1}^m\sum_{j=1}^nA_{ij}B_{ji}=\sum_{j=1}^n\sum_{i=1}^m B_{ji}A_{ij}=\sum_{j=1}^n(BA)_{jj}=\tr(BA)\]

Therefore, if \(AB=I_m\) and \(BA=I_n\), then from the above formula

\[m=\tr(I_m)=\tr(AB)=\tr(BA)=\tr(I_n)=n\]

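must hold, so if \(m\neq n\), there is no matrix \(B\) satisfying this. (The equality \(m=n\) here is an equality in \(\mathbb{K}\), so the argument as stated assumes that \(\mathbb{K}\) has characteristic zero, as \(\mathbb{R}\) and \(\mathbb{C}\) do.)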
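The identity \(\tr(AB)=\tr(BA)\) holds even though \(AB\) and \(BA\) have different sizes. The sketch below illustrates this, assuming matrices are lists of rows; `matmul` and `trace` are hypothetical helper names.

```python
def matmul(X, Y):
    """Product XY of matrices stored as lists of rows; requires len(X[0]) == len(Y)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the diagonal entries of a square matrix M."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 3], [4, 5, 6]]      # 2x3
B = [[1, 0], [0, 1], [1, 1]]    # 3x2

AB = matmul(A, B)   # 2x2
BA = matmul(B, A)   # 3x3

print(AB)                    # [[4, 5], [10, 11]]
print(BA)                    # [[1, 2, 3], [4, 5, 6], [5, 7, 9]]
print(trace(AB), trace(BA))  # 15 15
```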

Definition 8 For a given matrix \(A\in\Mat_{m\times n}(\mathbb{K})\), the transpose \(A^t\) of \(A\) is the \(n\times m\) matrix defined by the formula

\[(A^t)_{ij}=A_{ji}\]

Proposition 9 For two matrices \(A\in\Mat_{m\times n}(\mathbb{K}),B\in\Mat_{n\times k}(\mathbb{K})\), the following formula

\[(AB)^t=B^tA^t\]

holds.

This can be verified by a simple calculation.
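For completeness, the calculation can be sketched as follows. The summation index \(l\) is chosen here to avoid a clash with the dimension \(k\) in the statement. For \(1\leq i\leq k\) and \(1\leq j\leq m\),

\[((AB)^t)_{ij}=(AB)_{ji}=\sum_{l=1}^nA_{jl}B_{li}=\sum_{l=1}^n(B^t)_{il}(A^t)_{lj}=(B^tA^t)_{ij}\]

Note that \(B^t\) is a \(k\times n\) matrix and \(A^t\) is an \(n\times m\) matrix, so the product \(B^tA^t\) is defined and has the same shape \(k\times m\) as \((AB)^t\).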


References

[Goc] M.S. Gockenbach, Finite-Dimensional Linear Algebra, Discrete Mathematics and Its Applications, Taylor & Francis, 2011.
[Lee] 이인석, 선형대수와 군 (Linear Algebra and Groups), 서울대학교 출판문화원 (Seoul National University Press), 2005.


  1. By the above definition, the \(j\)th column of the matrix \(BA\) is equal to the product of the matrix \(B\) and the column vector \(A_j\); therefore, there is no risk of confusion whether one thinks of \(BA_j\) as the \(j\)th column of \(BA\) or as the product of \(B\) and \(A_j\).
