This is a post I’ve been wanting to write for a while – quadratic forms and definite matrices are everywhere in linear algebra and they have nice properties which are useful in data analysis. As a result, it’s important to get a grounding in how these very basic building blocks work. It wasn’t until I saw these ideas geometrically that they clicked for me, so I include some diagrams in the post.

A quadratic form on $R^n$ is a real-valued function of the form
$$Q(x_1,…,x_n) = \sum_{i \leq j} a_{ij}x_ix_j$$

Each quadratic form $Q$ can be represented by a symmetric matrix $A$ ($A = A^T$) so that
$$Q(\mathbf{x}) = \mathbf{x}^TA\mathbf{x}$$

For instance, the general 2 dimensional quadratic form
$$a_{11}x^2_1 + a_{12}x_1x_2 + a_{22}x^2_2$$

can be written as
$$\begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} a_{11} & \frac{1}{2}a_{12} \\ \frac{1}{2}a_{12} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$
The general 3 dimensional quadratic form
$$a_{11}x^2_1 + a_{22}x^2_2 + a_{33}x^2_3 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{23}x_2x_3$$
can be written as
$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} a_{11} & \frac{1}{2}a_{12} & \frac{1}{2}a_{13}\\ \frac{1}{2}a_{12} & a_{22} & \frac{1}{2}a_{23} \\ \frac{1}{2}a_{13} & \frac{1}{2}a_{23} & a_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
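As a sanity check that the matrix representation reproduces the polynomial, here is a minimal numpy sketch of the 2-dimensional case (the coefficient values are arbitrary):

```python
import numpy as np

# Arbitrary coefficients for Q(x) = a11*x1^2 + a12*x1*x2 + a22*x2^2
a11, a12, a22 = 2.0, 4.0, 3.0

# Symmetric matrix representation: each off-diagonal entry takes half
# the cross-term coefficient, so the two symmetric entries sum to a12.
A = np.array([[a11,     a12 / 2],
              [a12 / 2, a22    ]])

x = np.array([1.0, -2.0])
via_matrix = x @ A @ x                                        # x^T A x
via_formula = a11 * x[0]**2 + a12 * x[0] * x[1] + a22 * x[1]**2

print(via_matrix, via_formula)  # both equal 6.0
```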

A quadratic form always takes the value zero at $\mathbf{x} = \mathbf{0}$. The general quadratic form of one variable is $y=ax^2$. If $a > 0$, then $ax^2$ is always $\geq 0$ and equals 0 only when $x = 0$. Such a form is positive definite; $x = 0$ is its global minimiser. If $a < 0$, then $ax^2$ is always $\leq 0$ and equals $0$ only when $x = 0$. Such a quadratic form is called negative definite; $x = 0$ is its global maximiser.

In 2 dimensions, the quadratic form
$$Q_1(x_1, x_2) = x^2_1 + x^2_2$$
is strictly positive at every $(x_1, x_2) \neq (0,0)$ – so we call $Q_1$ positive definite.

$$Q_2(x_1, x_2) = -x^2_1 - x^2_2$$
is strictly negative except at the origin – we call such forms negative definite. Quadratic forms like
$$Q_3(x_1,x_2) = x^2_1 - x^2_2$$
take on both positive and negative values – for instance, $Q_3(1, 0) = 1$ and $Q_3(0,1) = -1$.

These forms are called indefinite. A quadratic form which is always greater than or equal to 0 but may equal zero at some nonzero $\mathbf{x}$ (as opposed to being zero only at $\mathbf{x} = \mathbf{0}$) is called positive semidefinite. An example of this form is
$$Q_4(x_1, x_2) = (x_1 + x_2)^2 = x^2_1 + 2x_1x_2 + x^2_2$$
which is never negative but equals 0 at nonzero points like $(x_1, x_2) = (1,-1)$ or $(x_1, x_2) = (-2,2)$. A quadratic form which is always $\leq$ 0 but may equal 0 at some nonzero $\mathbf{x}$ (as opposed to only at $\mathbf{x} = \mathbf{0}$) is called negative semidefinite. An example of this form is
$$Q_5(x_1, x_2) = -(x_1 + x_2)^2$$
Some (admittedly terrible) sketches of the types of curves are:

Definite Symmetric Matrices

Let $A$ be a $n \times n$ symmetric matrix, then $A$ can be:

1. Positive Definite: $\mathbf{x}^TA\mathbf{x} > 0$ for all $\mathbf{x} \neq \mathbf{0}$ in $R^n$
2. Positive Semidefinite: $\mathbf{x}^TA\mathbf{x} \geq 0$ for all $\mathbf{x}$ in $R^n$
3. Negative Definite: $\mathbf{x}^TA\mathbf{x} < 0$ for all $\mathbf{x} \neq \mathbf{0}$ in $R^n$
4. Negative Semidefinite: $\mathbf{x}^TA\mathbf{x} \leq 0$ for all $\mathbf{x}$ in $R^n$
5. Indefinite: $\mathbf{x}^TA\mathbf{x} > 0$ for some $\mathbf{x}$ in $R^n$ and $\mathbf{x}^TA\mathbf{x} < 0$ for some $\mathbf{x}$ in $R^n$
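These five cases can equivalently be characterised by the signs of the eigenvalues of $A$ – a standard result for symmetric matrices. A small numpy sketch, with an assumed tolerance for treating floating-point values as zero:

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify a symmetric matrix by the signs of its eigenvalues
    (equivalent to the quadratic-form definitions)."""
    w = np.linalg.eigvalsh(A)   # real eigenvalues of a symmetric matrix
    if np.all(w > tol):
        return "positive definite"
    if np.all(w < -tol):
        return "negative definite"
    if np.all(w >= -tol):
        return "positive semidefinite"
    if np.all(w <= tol):
        return "negative semidefinite"
    return "indefinite"

print(classify(np.array([[1.0, 0.0], [0.0, 1.0]])))   # Q1: positive definite
print(classify(np.array([[1.0, 0.0], [0.0, -1.0]])))  # Q3: indefinite
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))   # Q4: positive semidefinite
```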

Definiteness of Diagonal Matrices

The simplest $n \times n$ symmetric matrices are the diagonal matrices. They correspond to the simplest quadratic forms
$$\begin{bmatrix} x_1 & … & x_n \end{bmatrix} \begin{bmatrix} a_{11} & 0 & … & 0 \\ 0 & a_{22} & … & 0 \\ … \\ 0 & 0 & … & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ … \\ x_n \end{bmatrix} = a_{11}x^2_1 + a_{22}x^2_2 + …+ a_{nn}x^2_n$$

The quadratic form will be positive definite iff all the diagonal entries $a_{ii},\ i=1,…,n$ are positive and negative definite iff all the $a_{ii}$ are negative. It will be positive semidefinite iff all the $a_{ii}$ are greater than or equal to 0 and negative semidefinite iff all the $a_{ii}$ are less than or equal to 0. If two diagonal entries have opposite signs, the quadratic form is indefinite.
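A minimal numpy sketch of the diagonal case, with arbitrarily chosen diagonal entries (one of them zero, so the matrix is semidefinite but not definite):

```python
import numpy as np

d = np.array([2.0, 0.0, 5.0])   # diagonal entries a_ii
A = np.diag(d)

# For a diagonal matrix, definiteness reads straight off the signs of the a_ii.
print(np.all(d > 0))    # positive definite?     False (one entry is 0)
print(np.all(d >= 0))   # positive semidefinite? True
```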

The principal submatrices of a diagonal matrix are themselves diagonal matrices, and their determinants – the principal minors – are products of their diagonal entries. If all the $a_{ii}$ are positive (so the form is positive definite), then all such products are positive and so all the leading principal minors are positive. If all the $a_{ii}$ are negative (so the form is negative definite), then the product of an odd number of the $a_{ii}$ is negative and the product of an even number of them is positive – the leading principal minors alternate in sign. If $a_{11}$ is zero, then every leading principal submatrix contains $a_{11}$ and every leading principal minor is zero, so the leading principal minors alone tell us nothing – the form may be semidefinite or indefinite depending on the remaining entries.

Definiteness of $2 \times 2$ matrix

Consider the general quadratic form on $R^2$
$$Q(x_1, x_2) = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = ax^2_1 + 2bx_1x_2 + cx^2_2$$

If $a = 0$, then $Q$ cannot be positive or negative definite because $Q(1,0) = 0$. Assume $a \neq 0$ and complete the square by adding and subtracting $\frac{b^2}{a}x^2_2$:
$$Q(x_1, x_2) = ax^2_1 + 2bx_1x_2 + cx^2_2 + \frac{b^2}{a}x^2_2 - \frac{b^2}{a}x^2_2$$
$$Q(x_1, x_2) = a(x^2_1 + \frac{2b}{a}x_1x_2 + \frac{b^2}{a^2}x^2_2) - \frac{b^2}{a}x^2_2 + cx^2_2$$
$$Q(x_1, x_2) = a(x_1 + \frac{b}{a}x_2)^2 + \frac{ac - b^2}{a}x^2_2$$

If $a > 0$ and $ac - b^2 > 0$, both terms are nonnegative and cannot both vanish unless $x_1 = x_2 = 0$, so $Q$ is positive definite. If $a < 0$ and $ac - b^2 > 0$, the same argument gives $Q$ negative definite. If $ac - b^2 < 0$, the two squared terms are weighted by coefficients of opposite sign, so $Q$ takes both positive and negative values and is indefinite. Note that $a = |A_1|$ and $ac - b^2 = |A_2|$ – this is the leading principal minor test of the next section in miniature.
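The completed-square form gives a quick test for a $2 \times 2$ symmetric matrix: $Q$ is positive definite exactly when $a > 0$ and $ac - b^2 > 0$. A numpy sketch with arbitrarily chosen $a, b, c$, cross-checked against the eigenvalue criterion:

```python
import numpy as np

a, b, c = 2.0, 1.0, 3.0
A = np.array([[a, b], [b, c]])

# Completed-square form: Q = a(x1 + (b/a)x2)^2 + ((ac - b^2)/a) x2^2,
# so Q is positive definite exactly when a > 0 and ac - b^2 > 0.
print(a > 0 and a * c - b**2 > 0)          # True
print(np.all(np.linalg.eigvalsh(A) > 0))   # True: agrees with the eigenvalue test
```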

Principal Minors of a Matrix

1. Principal Submatrix: Let $A$ be an $n \times n$ matrix. A $k \times k$ submatrix of $A$ formed by deleting $n-k$ columns and the same $n-k$ rows from $A$ is called a $k$-th order principal submatrix of $A$
2. Principal Minor: The determinant of a $k \times k$ principal submatrix is called a $k$-th order principal minor of $A$
3. Leading Principal Submatrix: Let $A$ be an $n \times n$ matrix. The $k$-th order principal submatrix of $A$ obtained by deleting the last $n-k$ rows and the last $n-k$ columns from $A$ is called the $k$-th order leading principal submatrix of $A$ – denoted $A_k$.
4. Leading Principal Minor: The determinant of the $k$-th order leading principal submatrix – denoted $|A_k|$.

A $n \times n$ matrix has $n$ leading principal submatrices – the top leftmost $1 \times 1$ submatrix, top leftmost $2 \times 2$ submatrix, and so on.

Example 1: Consider a general $3 \times 3$ matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
$k = 1,\ n-k = 3 - 1 = 2$ → Remove the last 2 rows and columns: $A_1 = \begin{bmatrix} a_{11} \end{bmatrix}$, $|A_1| = a_{11}$
$k = 2,\ n-k = 3 - 2 = 1$ → Remove the last row and column: $A_2 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, $|A_2| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$
$k = 3,\ n-k = 3 - 3 = 0$ → Remove nothing (so $A_3 = A$): $A_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$, $|A_3| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$
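The leading principal minors are easy to compute programmatically – a numpy sketch using an arbitrarily chosen $3 \times 3$ symmetric matrix:

```python
import numpy as np

def leading_principal_minors(A):
    """|A_1|, |A_2|, ..., |A_n|: determinants of the top-left k x k blocks."""
    n = A.shape[0]
    return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

# Approximately [2, 3, 4] -- all positive, so A is positive definite.
print(np.round(leading_principal_minors(A), 6))
```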

We can use leading principal minors to determine the definiteness of a given matrix.
Theorem 1: Let $A$ be a $n \times n$ symmetric matrix. Then,
a. $A$ is positive definite iff all its $n$ leading principal minors are strictly positive
b. $A$ is negative definite iff its $n$ leading principal minors alternate in sign as follows:

• $|A_1| < 0,\ |A_2| > 0,\ |A_3| < 0,\ …$
• The k-th order leading principal minor should have the same sign as $(-1)^k$

c. If some k-th order leading principal minor of $A$ (or some pair of them) is nonzero but does not fit either of the above 2 sign patterns, then $A$ is indefinite. This case occurs when $A$ has a negative k-th order leading principal minor for an even integer k or when $A$ has a negative k-th order leading principal minor and a positive l-th order leading principal minor for 2 distinct odd integers k and l.

One way that the leading minor test of theorem 1 can fail for a given symmetric matrix $A$ is that some leading principal minor of $A$ is 0 while the nonzero ones fit the sign pattern in a) or b). When this occurs, the matrix $A$ is not definite and it may or may not be semidefinite. In this case, to check for semidefiniteness, we must check the sign of every principal minor of $A$.
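Checking every principal minor means taking determinants over all subsets of row/column indices, not just the leading ones. A numpy sketch, using the positive semidefinite form $Q_4$ from earlier as the example:

```python
import numpy as np
from itertools import combinations

def principal_minors(A):
    """All principal minors: determinants of submatrices formed by keeping
    the same subset of rows and columns."""
    n = A.shape[0]
    minors = []
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            sub = A[np.ix_(idx, idx)]   # keep rows and columns in idx
            minors.append(np.linalg.det(sub))
    return minors

# Q4 = (x1 + x2)^2: the leading minors are 1, 0, so the test in Theorem 1
# fails, but every principal minor is >= 0 -- positive semidefinite.
A = np.array([[1.0, 1.0], [1.0, 1.0]])
print(np.round(principal_minors(A), 6))   # [1. 1. 0.]
```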

Relationship Between Definiteness and Principal Minors

There are some notable relationships between a matrix’s definiteness and its principal minors:

Lemma 1: If $A$ is a positive or negative definite matrix, then $A$ is nonsingular, i.e. invertible ($det(A) \neq 0$)

Suppose $A$ is singular (non-invertible, $det(A) = 0$). Then there exists a nonzero vector $\mathbf{x}$ such that $A\mathbf{x} = \mathbf{0}$, and so $\mathbf{x}^TA\mathbf{x} = \mathbf{x}^T(A\mathbf{x}) = \mathbf{x}^T \mathbf{0} = 0$. This contradicts the fact that $A$ is positive or negative definite ($\mathbf{x}^TA\mathbf{x} > 0$ or $\mathbf{x}^TA\mathbf{x} < 0$ for every nonzero $\mathbf{x}$). So $A$ cannot be both singular and positive or negative definite; since a matrix is either singular or nonsingular, a positive or negative definite $A$ must be nonsingular.

Lemma 2: Suppose $A$ is a symmetric matrix, $A = A^T$, and $Q$ is a nonsingular matrix, $det(Q) \neq 0$. Then:

i) $Q^TAQ$ is a symmetric matrix

ii) $A$ is positive/negative definite iff $Q^TAQ$ is positive/negative definite.

Proof
i) $(Q^TAQ)^T = Q^TA^T(Q^T)^T = Q^T A^T Q = Q^T A Q$, using $A^T = A$ in the last step
ii) Suppose $Q^TAQ$ is positive definite. Let $\mathbf{x} \neq \mathbf{0}$ be an arbitrary nonzero vector in $R^n$. Since $Q$ is nonsingular, $\mathbf{y} = Q^{-1}\mathbf{x}$ is well defined, nonzero, and satisfies $\mathbf{x} = Q\mathbf{y}$.

Then
$$\mathbf{x}^TA\mathbf{x} = (Q\mathbf{y})^T A Q\mathbf{y}$$
$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T Q^T A Q\mathbf{y}$$
$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T (Q^TAQ) \mathbf{y}$$

Since we supposed $Q^TAQ$ is positive definite and $\mathbf{y} \neq \mathbf{0}$, we have
$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T (Q^TAQ) \mathbf{y} > 0$

Which implies that $A$ is positive definite whenever $Q^TAQ$ is. Conversely, suppose $A$ is positive definite and let $\mathbf{z}$ be an arbitrary nonzero vector. Since $Q$ is nonsingular, $\mathbf{x} = Q\mathbf{z}$ is nonzero, and

$$\mathbf{z}^T (Q^T A Q) \mathbf{z} = (Q \mathbf{z})^T A (Q\mathbf{z}) = \mathbf{x}^TA\mathbf{x} > 0$$

this implies that $Q^TAQ$ is positive definite if we suppose $A$ is positive definite.
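Lemma 2 is easy to check numerically for a particular pair of matrices – a sketch with an arbitrarily chosen positive definite $A$ and nonsingular $Q$:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])   # positive definite (eigenvalues 1, 3)
Q = np.array([[1.0, 3.0], [0.0, 2.0]])   # nonsingular (det = 2)

B = Q.T @ A @ Q
print(np.allclose(B, B.T))                 # True: symmetric, as in part (i)
print(np.all(np.linalg.eigvalsh(B) > 0))   # True: still positive definite (ii)
```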

Theorem 2: Let $A$ be a symmetric matrix ($A=A^T$). Then, $A$ is positive definite ($\mathbf{x}^TA\mathbf{x} > 0$) iff all its leading principal minors are positive.

Proof map: This proof is quite long, so here is a map to help keep track. We have 2 objectives to establish with this theorem:

i) Objective 1: $|A_j|>0,\ j=1,…,n$ → $A$ is positive definite

• Suppose all leading principal submatrices are positive definite.
• Since all leading principal submatrices are positive definite, they are nonsingular and invertible (by Lemma 1)
• Partition the matrix $A$ in a certain way.
• Express $A = Q^TBQ$, partitioning $Q$ and $B$ in a similar way to $A$
• Obtain an identity for the determinant of $A$ via $A = Q^TBQ$ and the partition. This gives us a set of facts $f_1$
• Form the quadratic form $\mathbf{x}^TB\mathbf{x}$ from the $B$ in $A = Q^TBQ$
• Use the set of facts $f_1$ to show $B$ is positive definite
• By Lemma 2, if $B$ in $A = Q^TBQ$ is positive definite, then $A$ is positive definite

ii) Objective 2: $A$ is positive definite → all $|A_j|$ are positive

• Suppose $A$ is positive definite: $\mathbf{x}^TA\mathbf{x} > 0$ for all nonzero $\mathbf{x}$
• Write $\mathbf{x}^TA\mathbf{x}$ in terms of the leading principal submatrices $A_j$ – this shows each $A_j$ is positive definite
• Conclude that each $det(A_j)$ is positive, since the determinant of a positive definite matrix is the product of its (positive) eigenvalues

Proof
To prove this theorem, we must establish 2 relationships:

i) If $A$ is positive definite, then all its leading principal minors are positive

ii) If all the leading principal minors are positive, then $A$ is positive definite

i): Let $A$ be a $(n+1) \times (n+1)$ positive definite matrix. To show that all $A_j$, $j = 1,…, n+1$ are positive definite, let $\mathbf{x}_j$ be a nonzero $j$-vector and let $\mathbf{0}^*$ be the zero $(n + 1) - j$ vector. Then

(1) $\mathbf{x}^TA\mathbf{x}> 0$
$$\mathbf{x}^TA\mathbf{x} = \begin{bmatrix} \mathbf{x}^T_j & \mathbf{0}^{*T} \end{bmatrix}_{1 \times (n+1)} A_{(n + 1) \times (n+1)} \begin{bmatrix} \mathbf{x}_j \\ \mathbf{0}^{*} \end{bmatrix}_{(n + 1) \times 1} > 0$$
(1) can be written as

(2) $\mathbf{x}^TA\mathbf{x} = \begin{bmatrix} \mathbf{x}^T_j & \mathbf{0}^{*T} \end{bmatrix} A \begin{bmatrix} \mathbf{x}_j \\ \mathbf{0}^{*} \end{bmatrix} = \mathbf{x}^T_j A_j \mathbf{x}_j > 0$

This is because the zero elements in $\begin{bmatrix} \mathbf{x}^T_j & \mathbf{0}^{*T} \end{bmatrix}$ eliminate the last $(n+1) - j$ rows and $(n+1) - j$ columns of $A$ from the product, leaving exactly the $j$-th leading principal submatrix. (2) shows that each $A_j$ is positive definite.

If $A_j$ is positive definite, then it is symmetric and all of its eigenvalues are positive. Its determinant – the $j$-th leading principal minor – is the product of these eigenvalues, and is therefore positive.

ii): Let $A$ be an $(n + 1) \times (n + 1)$ symmetric matrix and let $A_j$ be the $j \times j$ leading principal submatrix of $A$ for $j = 1,…,n+1$ (so $A_{n+1} = A$). Suppose all the leading principal minors of $A$ are positive.

We argue by induction on the size of the matrix. The $1 \times 1$ case is immediate: $|A_1| = a_{11} > 0$. So suppose the result holds for $n \times n$ matrices. The leading principal minors of $A_n$ are $|A_1|,…,|A_n|$, all positive, so $A_n$ is positive definite, and by Lemma 1 it is nonsingular and invertible.

Partition $A$ as
(3) $A = \begin{bmatrix} a_{11} & a_{12} & … & a_{1n} & a_{1(n+1)} \\ a_{21} & a_{22} & … & a_{2n} & a_{2(n+1)} \\ …\\ a_{n1} & a_{n2} & … & a_{nn} & a_{n(n+1)} \\ a_{(n+1)1} & a_{(n+1)2} & … & a_{(n+1)n} & a_{(n+1)(n+1)} \end{bmatrix} = \begin{bmatrix} A_n & \mathbf{a} \\ \mathbf{a}^T & a_{(n+1)(n+1)} \end{bmatrix}$

where $\mathbf{a} = \begin{bmatrix} a_{1(n+1)} \\ a_{2(n+1)} \\ … \\ a_{n(n+1)} \end{bmatrix}$ is a $n \times 1$ column matrix.

Let $d = a_{(n+1)(n+1)} - \mathbf{a}^T(A_n)^{-1}\mathbf{a}$ and let $\mathbf{0}_n$ be a $n \times 1$ column vector of all 0s.

The identity (3) now can be written as
(4) $A = Q^TBQ = \begin{bmatrix} I_n & \mathbf{0}_n \\ (A_n^{-1}\mathbf{a})^T & 1 \end{bmatrix} \begin{bmatrix} A_n & \mathbf{0}_n \\ \mathbf{0}^T_n & d \end{bmatrix} \begin{bmatrix} I_n & A^{-1}_n \mathbf{a} \\ \mathbf{0}^T_n & 1 \end{bmatrix}$

where the leftmost matrix is lower triangular and the rightmost matrix is upper triangular.

Take determinants in (4), using two properties of determinants:

• $Q$ is triangular with every diagonal entry equal to 1, so $det(Q) = det(Q^T) = 1$ (the determinant of a triangular matrix is the product of its diagonal entries)
• $det(B) = d \times det(A_n)$, obtained by cofactor expansion along the bottom row of $B$ – the only nonzero entry in that row is $d$, and its cofactor is $det(A_n)$

(5) $det(A) = det(Q^T) det(B) det(Q) = d \times det(A_n)$

Since $det(A) > 0$ and $det(A_n) > 0$ by supposition, (5) gives $d = det(A)/det(A_n) > 0$.

Let $\mathbf{x}$ be an arbitrary $(n + 1) \times 1$ vector. We can write $\mathbf{x}$ as
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ … \\ x_n \\ x_{(n+1)} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_n \\ x_{(n+1)} \end{bmatrix}$$

where $\mathbf{x}_n$ is a $n \times 1$ vector.

Apply this partition to the quadratic form of $B$:

(6) $\mathbf{x}^TB\mathbf{x} = \begin{bmatrix} \mathbf{x}^T_n & x_{(n+1)} \end{bmatrix} \begin{bmatrix} A_n & \mathbf{0}_n \\ \mathbf{0}^T_n & d \end{bmatrix} \begin{bmatrix} \mathbf{x}_n \\ x_{(n+1)} \end{bmatrix}$

$\mathbf{x}^T B \mathbf{x} = \begin{bmatrix} \mathbf{x}^T_n A_n & dx_{(n+1)} \end{bmatrix}\begin{bmatrix} \mathbf{x}_n \\ x_{(n+1)} \end{bmatrix}$

$\mathbf{x}^T B \mathbf{x} = \mathbf{x}^T_n A_n \mathbf{x}_n + dx^2_{(n+1)}$

Since $A_n$ is positive definite and $d > 0$, both terms are nonnegative. For nonzero $\mathbf{x}$, either $\mathbf{x}_n \neq \mathbf{0}$ (so $\mathbf{x}^T_n A_n \mathbf{x}_n > 0$) or $\mathbf{x}_n = \mathbf{0}$ and $x_{(n+1)} \neq 0$ (so $dx^2_{(n+1)} > 0$). Either way, $\mathbf{x}^T_n A_n \mathbf{x}_n + dx^2_{(n+1)} > 0$.

Therefore $B$ is positive definite. Using Lemma 2, if $B$ is positive definite, then $Q^TBQ = A$ is also positive definite.
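The decomposition in the proof – and the determinant identity (5) – can be checked numerically. A numpy sketch with an arbitrarily chosen $3 \times 3$ symmetric matrix, where $d$ is computed exactly as defined above:

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 0.0],
              [1.0, 0.0, 4.0]])

An = A[:2, :2]                               # leading n x n block A_n
a = A[:2, 2]                                 # last column, without the corner
d = A[2, 2] - a @ np.linalg.inv(An) @ a      # d = a_{(n+1)(n+1)} - a^T A_n^{-1} a

# Identity (5): det(A) = d * det(A_n)
print(np.isclose(np.linalg.det(A), d * np.linalg.det(An)))   # True
```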