This is a post I’ve been wanting to write for a while – quadratic forms and definite matrices are everywhere in linear algebra, and they have nice properties which are useful in data analysis. As a result, it’s important to get a grounding in how these very basic building blocks work. It wasn’t until I saw these ideas geometrically that they clicked for me, so I include some diagrams in the post.

## Quadratic Forms

A quadratic form on R^n is a real-valued function of the form

$$ Q(x_1,…,x_n) = \sum_{i \leq j} a_{ij}x_ix_j$$

Each quadratic form Q can be represented by a symmetric matrix A (A = A^T) so that

$$ Q(\mathbf{x}) = \mathbf{x}^TA\mathbf{x}$$

For instance, the general 2 dimensional quadratic form

$$a_{11}x^2_1 + a_{12}x_1x_2 + a_{22}x^2_2$$

can be written as

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix}\begin{bmatrix} a_{11} & \frac{1}{2}a_{12} \\ \frac{1}{2}a_{12} & a_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

The general 3 dimensional quadratic form

$$a_{11}x^2_1 + a_{22}x^2_2 + a_{33}x^2_3 + a_{12}x_1x_2 + a_{13}x_1x_3 + a_{23}x_2x_3$$

can be written as

$$\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}\begin{bmatrix} a_{11} & \frac{1}{2}a_{12} & \frac{1}{2}a_{13}\\ \frac{1}{2}a_{12} & a_{22} & \frac{1}{2}a_{23} \\ \frac{1}{2}a_{13} & \frac{1}{2}a_{23} & a_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
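As a quick sanity check, we can build the symmetric matrix for a small quadratic form and confirm that \mathbf{x}^TA\mathbf{x} matches the expanded polynomial. The coefficients below are arbitrary, chosen only for illustration – a minimal sketch using NumPy:

```python
import numpy as np

# Hypothetical example: Q(x1, x2, x3) = 2*x1^2 + x2^2 + 3*x3^2 + 4*x1*x2 + 2*x2*x3
# Each off-diagonal entry of A carries HALF of the corresponding cross-term coefficient.
A = np.array([
    [2.0, 2.0, 0.0],
    [2.0, 1.0, 1.0],
    [0.0, 1.0, 3.0],
])

def quadratic_form(A, x):
    """Evaluate Q(x) = x^T A x."""
    x = np.asarray(x, dtype=float)
    return float(x @ A @ x)

x1, x2, x3 = 1.0, -1.0, 2.0
expanded = 2*x1**2 + x2**2 + 3*x3**2 + 4*x1*x2 + 2*x2*x3
print(quadratic_form(A, [x1, x2, x3]), expanded)  # both 7.0
```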

## Definite Quadratic Forms

A quadratic form always takes the value zero at \mathbf{x} = \mathbf{0}. The general quadratic form of one variable is y = ax^2. If a > 0, then ax^2 is always \geq 0 and equals 0 only when x = 0. Such a form is **positive definite**; x = 0 is its **global minimiser**. If a < 0, then ax^2 is always \leq 0 and equals 0 only when x = 0. Such a quadratic form is called **negative definite**; x = 0 is its **global maximiser**.

In 2 dimensions, the quadratic form

$$Q_1(x_1, x_2) = x^2_1 + x^2_2$$

is always greater than 0 at (x_1, x_2) \neq (0,0) – so we call Q_1 **positive definite**.

Quadratic forms like

$$Q_2(x_1, x_2) = -x^2_1 - x^2_2$$

are strictly negative except at the origin – we call these forms **negative definite**. Quadratic forms like

$$Q_3(x_1,x_2) = x^2_1 - x^2_2$$

take on both positive and negative values – $$Q_3(1, 0) = 1$$ and $$Q_3(0,1) = -1$$

These forms are called **indefinite**. A quadratic form which is always **greater than or equal to 0** but may equal zero at some nonzero \mathbf{x} (as opposed to being zero only at \mathbf{x} = \mathbf{0}) is called **positive semidefinite**. An example of this form is

$$Q_4(x_1, x_2) = (x_1 + x_2)^2 = x^2_1 + 2x_1x_2 + x^2_2$$

which is never negative but equals 0 at nonzero points like (x_1, x_2) = (1,-1) or (x_1, x_2) = (-2,2). A quadratic form which is always \leq 0 but may equal 0 at some nonzero \mathbf{x} (as opposed to only at \mathbf{x} = \mathbf{0}) is called **negative semidefinite**. An example of this form is

$$Q_5(x_1, x_2) = -(x_1 + x_2)^2$$

Some (admittedly terrible) sketches of the types of curves are:

## Definite Symmetric Matrices

Let A be a n \times n symmetric matrix, then A can be:

- Positive Definite: \mathbf{x}^TA\mathbf{x} > 0 for all \mathbf{x} \neq \mathbf{0} in R^n
- Positive Semidefinite: \mathbf{x}^TA\mathbf{x} \geq 0 for all \mathbf{x} in R^n
- Negative Definite: \mathbf{x}^TA\mathbf{x} < 0 for all \mathbf{x} \neq \mathbf{0} in R^n
- Negative Semidefinite: \mathbf{x}^TA\mathbf{x} \leq 0 for all \mathbf{x} in R^n
- Indefinite: \mathbf{x}^TA\mathbf{x} > 0 for some \mathbf{x} in R^n and \mathbf{x}^TA\mathbf{x} < 0 for some other \mathbf{x} in R^n

## Definiteness of Diagonal Matrices

The simplest n \times n symmetric matrices are the diagonal matrices. They correspond to the simplest quadratic forms

$$\begin{bmatrix} x_1 & … & x_n \end{bmatrix} \begin{bmatrix} a_{11} & 0 & … & 0 \\ 0 & a_{22} & … & 0 \\ … \\ 0 & 0 & … & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ … \\ x_n \end{bmatrix} = a_{11}x^2_1 + a_{22}x^2_2 + …+ a_{nn}x^2_n$$

The quadratic form will be positive definite iff all the a_{ii},\ i=1,…,n are positive and negative definite iff all the a_{ii} are negative. The quadratic form will be positive semidefinite iff all the a_{ii} are greater than or equal to 0 and negative semidefinite iff all the a_{ii} are less than or equal to 0. If two of the a_{ii} have opposite signs, the quadratic form will be indefinite.
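The diagonal test translates directly into code – a small sketch that just reads the signs off the diagonal entries:

```python
import numpy as np

def diagonal_definiteness(diag_entries):
    """Definiteness of a diagonal matrix, read directly off its diagonal."""
    d = np.asarray(diag_entries, dtype=float)
    if np.all(d > 0):
        return "positive definite"
    if np.all(d < 0):
        return "negative definite"
    if np.all(d >= 0):
        return "positive semidefinite"
    if np.all(d <= 0):
        return "negative semidefinite"
    return "indefinite"

print(diagonal_definiteness([2.0, 1.0, 3.0]))   # all positive -> positive definite
print(diagonal_definiteness([1.0, 0.0, 2.0]))   # a zero entry -> positive semidefinite
print(diagonal_definiteness([1.0, -1.0, 2.0]))  # mixed signs  -> indefinite
```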

The principal submatrices of a diagonal matrix are themselves diagonal matrices, and their determinants – the principal minors – are products of their diagonal entries. If all the a_{ii} are positive (so the form is positive definite), then all these products are positive and so all the leading principal minors are positive. If all the a_{ii} are negative (so the form is negative definite), then the product of an odd number of the a_{ii} will be negative and the product of an even number of the a_{ii} will be positive – the leading principal minors alternate in sign. If a_{11} is zero in the above quadratic expression, then every leading submatrix contains a_{11}, so every leading principal minor is 0 and the leading minor test is inconclusive: the form cannot be definite, but it may still be semidefinite (the diagonal form with entries 0 and 1 is positive semidefinite, for example).

## Definiteness of 2 \times 2 matrix

Consider the general quadratic form on R^2

$$Q(x_1, x_2) = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} a & b \\ b & c \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = ax^2_1 + 2bx_1x_2 + cx^2_2$$

If a = 0, then Q cannot be positive or negative definite because Q(1,0) = 0. Assume a \neq 0 and complete the square in the previous expression by adding and subtracting \frac{b^2}{a}x^2_2:

$$Q(x_1, x_2) = ax^2_1 + 2bx_1x_2 + cx^2_2 + \frac{b^2}{a}x^2_2 - \frac{b^2}{a}x^2_2$$

$$Q(x_1, x_2) = a(x^2_1 + \frac{2b}{a}x_1x_2 + \frac{b^2}{a^2}x^2_2) - \frac{b^2}{a}x^2_2 + cx^2_2$$

$$Q(x_1, x_2) = a(x_1 + \frac{b}{a}x_2)^2 + \frac{ac - b^2}{a} x^2_2$$

Both squared terms are nonnegative, so Q is positive definite iff a > 0 and ac - b^2 = det(A) > 0, and negative definite iff a < 0 and ac - b^2 > 0.
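The completed-square identity can be spot-checked numerically at random points. The coefficients below are arbitrary (with a \neq 0):

```python
import numpy as np

def Q(a, b, c, x1, x2):
    """The original quadratic form a*x1^2 + 2*b*x1*x2 + c*x2^2."""
    return a*x1**2 + 2*b*x1*x2 + c*x2**2

def Q_completed(a, b, c, x1, x2):
    """The completed-square form a*(x1 + (b/a)*x2)^2 + ((a*c - b^2)/a)*x2^2."""
    return a*(x1 + (b/a)*x2)**2 + ((a*c - b**2)/a)*x2**2

a, b, c = 2.0, 1.0, 3.0  # arbitrary coefficients, a != 0
rng = np.random.default_rng(0)
for _ in range(100):
    x1, x2 = rng.standard_normal(2)
    assert abs(Q(a, b, c, x1, x2) - Q_completed(a, b, c, x1, x2)) < 1e-9
print("completed-square identity holds at 100 random points")
```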

## Principal Minors of a Matrix

- Principal Submatrix: Let A be a n \times n matrix. A k \times k submatrix of A formed by deleting n-k columns and the **same** n-k rows from A is called a k-th order principal submatrix of A
- Principal Minor: The **determinant** of a k \times k **principal submatrix** is called a k-th order **principal minor** of A
- Leading Principal Submatrix: Let A be a n \times n matrix. The k-th order principal submatrix of A obtained by deleting the **last** n-k rows and the **last** n-k columns from A is called the k-th order leading principal submatrix of A – denoted A_k
- Leading Principal Minor: The determinant of the k-th order leading principal submatrix – denoted |A_k|

A n \times n matrix has n leading principal submatrices – the top leftmost 1 \times 1 submatrix, top leftmost 2 \times 2 submatrix, and so on.

**Example 1**: Consider a general 3 \times 3 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

k = 1, n-k = 3 - 1 = 2 → remove the last 2 rows and columns:

$$A_1 = \begin{bmatrix} a_{11} \end{bmatrix},\quad |A_1| = a_{11}$$

k = 2, n-k = 3 - 2 = 1 → remove the last 1 row and column:

$$A_2 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix},\quad |A_2| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$$

k = 3, n-k = 3 - 3 = 0 → remove the last 0 rows and columns:

$$A_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},\quad |A_3| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$
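Leading principal minors are straightforward to compute numerically – take determinants of the growing top-left blocks. The matrix here is a made-up symmetric example:

```python
import numpy as np

def leading_principal_minors(A):
    """Return [|A_1|, ..., |A_n|]: determinants of the top-left k x k submatrices."""
    n = A.shape[0]
    return [float(np.linalg.det(A[:k, :k])) for k in range(1, n + 1)]

# Hypothetical symmetric matrix (not from the post)
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
print([round(m, 10) for m in leading_principal_minors(A)])  # [2.0, 3.0, 4.0]
```

All three minors are positive, which (per the theorem below) means this A is positive definite.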

We can use leading principal minors to determine the definiteness of a given matrix.

**Theorem 1**: Let A be a n \times n symmetric matrix. Then,

a. A is positive definite iff all its n leading principal minors are strictly positive

b. A is negative definite iff its n leading principal minors alternate in sign as follows:

- |A_1| < 0,\ |A_2| > 0,\ |A_3| < 0,\ …

- The k-th order leading principal minor should have the same sign as (-1)^k

c. If some k-th order leading principal minor of A (or some pair of them) is nonzero but **does not** fit either of the above 2 sign patterns, then A is **indefinite**. This case occurs when A has a **negative** k-th order leading principal minor for an **even integer k**, or when A has a negative k-th order leading principal minor and a positive l-th order leading principal minor for 2 distinct odd integers k and l.

One way that the leading minor test of Theorem 1 can fail for a given symmetric matrix A is that some leading principal minor of A is 0 while the nonzero ones fit the sign pattern in a) or b). When this occurs, the matrix A is **not** definite and it may or may not be semidefinite. In this case, to check for semidefiniteness, we must check the sign of **every** principal minor of A.
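Theorem 1, plus the inconclusive zero-minor case just described, can be sketched as a small test function. The tolerance handling is a practical choice for floating point, not part of the theorem:

```python
import numpy as np

def leading_minor_test(A, tol=1e-12):
    """Definiteness test for a symmetric matrix via its leading principal minors."""
    n = A.shape[0]
    minors = [float(np.linalg.det(A[:k, :k])) for k in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "positive definite"
    if all((-1)**k * m > tol for k, m in enumerate(minors, start=1)):
        return "negative definite"  # signs alternate starting negative
    if any(abs(m) <= tol for m in minors):
        # A zero leading minor: A is not definite; every principal minor
        # must be checked to decide semidefiniteness.
        return "inconclusive"
    return "indefinite"

print(leading_minor_test(np.array([[2.0, 1.0], [1.0, 3.0]])))    # minors 2, 5
print(leading_minor_test(np.array([[-2.0, 1.0], [1.0, -3.0]])))  # minors -2, 5
print(leading_minor_test(np.array([[1.0, 2.0], [2.0, 1.0]])))    # minors 1, -3
```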

## Relationship Between Definiteness and Principal Minors

There are some notable relationships between a definite matrix and its principal minors:

**Lemma 1**: If A is a positive or negative definite matrix, then A is nonsingular (invertible, det(A) \neq 0)

**Proof**: By Contradiction

Suppose A is singular (non-invertible, det(A) = 0). Then there exists a nonzero vector \mathbf{x} such that A\mathbf{x} = \mathbf{0}. If this is true, we obtain \mathbf{x}^TA\mathbf{x} = \mathbf{x}^T(A\mathbf{x}) = \mathbf{x}^T \mathbf{0} = 0. This contradicts the fact that A is positive or negative definite, i.e. \mathbf{x}^TA\mathbf{x} > 0 or \mathbf{x}^TA\mathbf{x} < 0 for all nonzero \mathbf{x}. So A cannot be both singular and positive or negative definite; if A is positive or negative definite, then A must be nonsingular (there are only 2 options for A – singular or nonsingular).

**Lemma 2**: Suppose A is a symmetric matrix (A = A^T) and Q is a nonsingular matrix (det(Q) \neq 0). Then:

i) Q^TAQ is a symmetric matrix

ii) A is positive/ negative definite iff Q^TAQ is positive/negative definite.

**Proof**:

i) (Q^TAQ)^T = (Q)^TA^T(Q^T)^T = Q^T A^T Q = Q^T A Q

ii) Suppose Q^TAQ is positive definite. Let \mathbf{x} \neq \mathbf{0} be an arbitrary nonzero vector in R^n. Since Q is nonsingular, there exists a nonzero vector \mathbf{y} such that \mathbf{x} = Q\mathbf{y} – namely \mathbf{y} = Q^{-1}\mathbf{x}, which is nonzero because \mathbf{y} = \mathbf{0} would force \mathbf{x} = Q\mathbf{y} = \mathbf{0}.

Then

$$\mathbf{x}^TA\mathbf{x} = (Q\mathbf{y})^T A Q\mathbf{y}$$

$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T Q^T A Q\mathbf{y}$$

$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T (Q^TAQ) \mathbf{y}$$

Since we supposed Q^TAQ is positive definite, we have

$$\mathbf{x}^TA\mathbf{x} = \mathbf{y}^T (Q^TAQ) \mathbf{y} > 0$$

which implies that A is positive definite whenever Q^TAQ is. Conversely, suppose A is positive definite and let \mathbf{z} be an arbitrary nonzero vector. Since Q is nonsingular, we can write

$$Q\mathbf{z} = \mathbf{x}$$

$$\mathbf{x}^TA\mathbf{x} = (Q \mathbf{z})^T A Q\mathbf{z} > 0$$

$$\mathbf{x}^TA\mathbf{x} = \mathbf{z}^T Q^T A Q\mathbf{z} > 0$$

$$\mathbf{x}^TA\mathbf{x} = \mathbf{z}^T (Q^T A Q) \mathbf{z} > 0$$

This implies that Q^TAQ is positive definite if we suppose A is positive definite. The negative definite case is identical with the inequalities reversed.
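Lemma 2 lends itself to a quick numerical spot check: congruence by a random nonsingular Q should preserve symmetry and positive definiteness. A sketch with an arbitrary positive definite A:

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[3.0, 1.0], [1.0, 2.0]])  # a hypothetical positive definite matrix
Q = rng.standard_normal((2, 2))
while abs(np.linalg.det(Q)) < 1e-6:     # resample until Q is clearly nonsingular
    Q = rng.standard_normal((2, 2))

M = Q.T @ A @ Q
sym_ok = bool(np.allclose(M, M.T))                # part i): Q^T A Q is symmetric
pd_ok = bool(np.all(np.linalg.eigvalsh(M) > 0))   # part ii): still positive definite
print(sym_ok, pd_ok)  # True True
```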

**Theorem 2**: Let A be a symmetric matrix (A=A^T). Then, A is positive definite (\mathbf{x}^TA\mathbf{x} > 0 for all nonzero \mathbf{x}) iff **all its leading principal minors are positive**.

Proof map: This proof is quite long, so consider a proof map to help keep track. We have 2 objectives to establish with this theorem:

i) Objective 1: |A_j|>0,\ j=1,…,n → A is positive definite

- Suppose all leading principal submatrices are positive definite.
- Since all leading principal submatrices are positive definite, they are nonsingular and invertible (by Lemma 1)
- Partition the matrix A in a certain way.
- Express A = Q^TBQ and partition Q and B in a similar way to A
- Obtain an identity for the determinant of A via A = Q^TBQ and the partition. This gives us a set of facts f_1
- Multiply B in A = Q^TBQ by \mathbf{x}^T on the left and \mathbf{x} on the right so we have \mathbf{x}^TB\mathbf{x}
- Use the set of facts f_1 to show B is positive definite
- By Lemma 2, if B in A = Q^TBQ is positive definite, then A is positive definite

ii) Objective 2: If A is positive definite → all |A_j| are positive

- Suppose A is positive definite, i.e. \mathbf{x}^TA\mathbf{x} > 0 for all nonzero \mathbf{x}
- Write \mathbf{x}^TA\mathbf{x} in terms of the principal submatrices A_j – this shows each A_j is positive definite
- Conclude that each |A_j| = det(A_j) is positive

**Proof**:

To prove this theorem, we must establish 2 relationships:

i) If A is positive definite, then all its leading principal minors are positive

ii) If all the leading principal minors are positive, then A is positive definite

i): Let A be a (n+1) \times (n+1) positive definite matrix. To show that all A_j, j = 1,…, n+1 are positive definite, let \mathbf{x}_j be a nonzero j \times 1 vector and let \mathbf{0}^* be the zero ((n + 1) - j) \times 1 vector. Then

(1) \mathbf{x}^TA\mathbf{x} > 0

$$\mathbf{x}^TA\mathbf{x} = \begin{bmatrix} \mathbf{x}^T_j & \mathbf{0}^{*T} \end{bmatrix}_{1 \times (n+1)} A_{(n + 1) \times (n+1)} \begin{bmatrix} \mathbf{x}_j \\ \mathbf{0}^{*} \end{bmatrix}_{(n+1) \times 1} > 0$$

(1) can be written as

(2) \mathbf{x}^TA\mathbf{x} = \begin{bmatrix} \mathbf{x}^T_j & \mathbf{0}^{*T} \end{bmatrix}_{1 \times (n+1)} A_{(n + 1) \times (n+1)} \begin{bmatrix} \mathbf{x}_j \\ \mathbf{0}^{*} \end{bmatrix}_{(n+1) \times 1} = \mathbf{x}^T_j A_j \mathbf{x}_j > 0

This is because the zero elements in \begin{bmatrix} \mathbf{x}^T_j & \mathbf{0}^{*T} \end{bmatrix} eliminate the last (n+1) - j rows and (n+1) - j columns of A – what remains is exactly the j-th leading principal submatrix. (2) shows that each A_j is positive definite.

If A_j is positive definite, then it is symmetric and all its eigenvalues are positive (for a unit eigenvector \mathbf{v} with A_j\mathbf{v} = \lambda\mathbf{v}, we get \lambda = \mathbf{v}^TA_j\mathbf{v} > 0). Since the determinant is the product of the eigenvalues, the determinant of A_j – the leading principal minor |A_j| – is positive.

ii): Let A be an (n + 1) \times (n + 1) symmetric matrix. Let A_j be the j \times j leading principal submatrix of A for j = 1,…,n+1. I.e. if j = n+1, then A_j = A.

Suppose the leading principal submatrices of A, the A_j, are **positive definite**. By Lemma 1, the A_j are nonsingular and invertible.

Partition A as

(3) A = \begin{bmatrix} a_{11} & a_{12} & … & a_{1n} & a_{1(n+1)} \\ a_{21} & a_{22} & … & a_{2n} & a_{2(n+1)} \\ …\\ a_{n1} & a_{n2} & … & a_{nn} & a_{n(n+1)} \\ a_{(n+1)1} & a_{(n+1)2} & … & a_{(n+1)n} & a_{(n+1)(n+1)} \end{bmatrix} = \begin{bmatrix} A_n & \mathbf{a} \\ \mathbf{a}^T & a_{(n+1)(n+1)} \end{bmatrix}

where \mathbf{a} = \begin{bmatrix} a_{1(n+1)} \\ a_{2(n+1)} \\ … \\ a_{n(n+1)} \end{bmatrix} is a n \times 1 column matrix.

Let d = a_{(n+1)(n+1)} - \mathbf{a}^T(A_n)^{-1}\mathbf{a} and let \mathbf{0}_n be a n \times 1 column vector of all 0s.

The identity (3) now can be written as

(4) A = Q^TBQ = \begin{bmatrix} I_n & \mathbf{0}_n \\ (A_n^{-1}\mathbf{a})^T & 1 \end{bmatrix} \begin{bmatrix} A_n & \mathbf{0}_n \\ \mathbf{0}^T_n & d \end{bmatrix} \begin{bmatrix} I_n & A^{-1}_n \mathbf{a} \\ \mathbf{0}^T_n & 1 \end{bmatrix}

where the leftmost matrix is lower triangular and the rightmost matrix is upper triangular.

Express det(A) using the properties of determinants:

- det(Q^T) = det(Q) = 1 → Q is triangular with diagonal entries of 1, so its determinant is the product of the diagonal entries, 1^{n+1} = 1
- det(B) = d \times det(A_n) → obtained by cofactor expansion along the bottom row – the only nonzero entry there is d, and its cofactor is det(A_n)

(5) det(A) = det(Q^T) det(B) det(Q) = d \times det(A_n)

Since A and A_n are positive definite by supposition, det(A) > 0 and det(A_n) > 0, so (5) gives d = det(A)/det(A_n) > 0.
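Identities (4) and (5) can be verified numerically for a concrete matrix. The matrix below is a made-up symmetric example with positive leading principal minors:

```python
import numpy as np

# Hypothetical symmetric matrix, partitioned as [[A_n, a], [a^T, a_{(n+1)(n+1)}]]
A = np.array([[2.0, -1.0, 0.5],
              [-1.0, 2.0, -0.3],
              [0.5, -0.3, 1.5]])
n = A.shape[0] - 1
A_n = A[:n, :n]
a = A[:n, n:]      # the column vector a from partition (3)
a_last = A[n, n]   # the scalar a_{(n+1)(n+1)}

d = (a_last - a.T @ np.linalg.inv(A_n) @ a).item()  # the scalar d
Q = np.block([[np.eye(n), np.linalg.inv(A_n) @ a],
              [np.zeros((1, n)), np.ones((1, 1))]])
B = np.block([[A_n, np.zeros((n, 1))],
              [np.zeros((1, n)), np.array([[d]])]])

print(np.allclose(Q.T @ B @ Q, A))                           # identity (4): True
print(np.isclose(np.linalg.det(A), d * np.linalg.det(A_n)))  # identity (5): True
print(d > 0)                                                 # True
```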

Let \mathbf{x} be an arbitrary (n + 1) \times 1 vector. We can write \mathbf{x} as

$$ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ … \\ x_n \\ x_{(n+1)} \end{bmatrix} = \begin{bmatrix} \mathbf{x}_n \\ x_{(n+1)} \end{bmatrix}$$

where \mathbf{x}_n is a n \times 1 vector

Apply this partition to the matrix B from (4):

(6) \mathbf{x}^TB\mathbf{x} = \begin{bmatrix} \mathbf{x}^T_n & x_{(n+1)} \end{bmatrix} \begin{bmatrix} A_n & \mathbf{0}_n \\ \mathbf{0}^T_n & d \end{bmatrix} \begin{bmatrix} \mathbf{x}_n \\ x_{(n+1)} \end{bmatrix}

\mathbf{x}^T B \mathbf{x} = \begin{bmatrix} \mathbf{x}^T_n A_n & dx_{(n+1)} \end{bmatrix}\begin{bmatrix} \mathbf{x}_n \\ x_{(n+1)} \end{bmatrix}

\mathbf{x}^T B \mathbf{x} = \mathbf{x}^T_n A_n \mathbf{x}_n + dx^2_{(n+1)}

Since A_n is positive definite by supposition (\mathbf{x}_n^T A_n \mathbf{x}_n > 0 for nonzero \mathbf{x}_n) and we established that d > 0, it follows that \mathbf{x}^T_n A_n \mathbf{x}_n + dx^2_{(n+1)} > 0 for every nonzero \mathbf{x} (at least one of \mathbf{x}_n and x_{(n+1)} must be nonzero).

Therefore B is positive definite. Using Lemma 2, if B is positive definite then Q^TBQ = A is also positive definite.
