Wedderburn's Theorem: A Finite Division Ring Is a Field

Wedderburn's Theorem of 1905 is a beauty: a finite division ring is a field. The result itself is lovely, but so is the proof, which surely is enrolled in The Book, Paul Erdős's whimsical imaginary list of the finest proofs in all mathematics.[1] A division ring is a ring with a 1 in which every non-zero element has a multiplicative inverse. Add multiplicative commutativity and you have a field. The quaternions over the reals are an example of an infinite division ring that is not a field, but Wedderburn says that any such example must be infinite; there is no finite division ring that is not a field. What is striking about this proof is the range of techniques it employs, including group theory, linear algebra, the cyclotomic polynomials, Euclidean geometry, and basic facts about integers and complex numbers.[2]

Centralizers and Conjugacy Classes

Given a group \(G\) and an element \( a \in G, \) define the centralizer of \(a\) in \(G\) as:

\[ C_G(a) := \{x \in G \,| \, xa = ax\}. \]

Simple calculations show that \( C_G(a) \) is a subgroup of \(G\) and also that the center of \( G \) is contained in every centralizer, where the center is the set of elements commuting with every element of \(G.\) Centralizers are of no interest in abelian groups, where every centralizer is the entire group, but they are typically not trivial in non-commutative groups. Take the quaternion group \(Q_8,\) for example, where \(C_{Q_8}(i) = \{1, -1, i, -i\},\) which in turn contains the center of \(Q_8,\) namely \( Z(Q_8) = \{1, -1\}.\)

A related notion is that of conjugacy. Two elements \(a\) and \(b\) of a group \(G\) are said to be conjugates if there is some \(g \in G\) such that \(b = gag^{-1}.\) Conjugacy is an equivalence relation, so the conjugacy classes partition a group. A direct calculation shows that \(\{i, -i\}\) is a conjugacy class of \(Q_8,\) for example, and \(\{j, -j\},\) \(\{k, -k\},\) \(\{1\},\) and \(\{-1\}\) are the others. It's perfectly possible for a conjugacy class to contain a single element; in fact, this is the case exactly for elements in the center, the elements commuting with all elements of the group.
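The claims about \(Q_8\) are easy to check by brute force. The following is a minimal Python sketch, assuming quaternions are stored as integer 4-tuples \((a, b, c, d) = a + bi + cj + dk\); the helper names (`qmul`, `qinv`, `centralizer`, `conjugacy_class`) are illustrative choices, not notation from the text.

```python
def qmul(x, y):
    """Multiply quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(x):
    """The inverse of a unit quaternion is its conjugate."""
    a, b, c, d = x
    return (a, -b, -c, -d)

def neg(x):
    return tuple(-t for t in x)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
Q8 = [s for u in (ONE, I, J, K) for s in (u, neg(u))]

def centralizer(a):
    """All x in Q8 with xa = ax."""
    return {x for x in Q8 if qmul(x, a) == qmul(a, x)}

def conjugacy_class(a):
    """All elements g a g^-1 with g in Q8."""
    return {qmul(qmul(g, a), qinv(g)) for g in Q8}

center = {z for z in Q8 if centralizer(z) == set(Q8)}
classes = {frozenset(conjugacy_class(a)) for a in Q8}

print(len(centralizer(I)), len(center), len(classes))  # 4 2 5
```

The centralizer of \(i\) comes out as \(\{1, -1, i, -i\}\), the center as \(\{1, -1\},\) and there are five conjugacy classes, in agreement with the direct calculation.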

Given an element \(a \in G,\) the left cosets of \(C_G(a)\) are in one-to-one correspondence with the conjugacy class containing \(a\) according to the map:
\[ \varphi: B = d \cdot C_G(a) \mapsto BaB^{-1}. \]
It's not obvious that \(BaB^{-1} = \{bab^{-1} \, | \, b \in B\}\) is a single element, much less that that element is conjugate to \(a,\) so let's show that first. Consider \(bab^{-1},\) where \(b \in B\) — that is, \(b = dk,\) where \( d\) is a fixed value representing the coset and \( k \in C_G(a).\) Then:
\begin{align}
bab^{-1} &= (dk) \cdot a \cdot (dk)^{-1}\\
&= dk \cdot a \cdot k^{-1} d^{-1}\\
&= dad^{-1},
\end{align}
considering that \(k\) commutes with \(a\) and therefore can be slid to the right of the \(a\) in the middle expression to cancel out the \(k^{-1}.\) Since \(d\) is fixed, this shows that \(BaB^{-1}\) is always a single element and indeed that that element is a conjugate of \(a,\) so \(\varphi\) is well defined.

To show that \(\varphi\) is one-to-one, suppose \(bab^{-1} = cac^{-1},\) where \(b\) and \(c\) are elements of two cosets of \(C_G(a).\) Multiplying on the left by \(c^{-1}\) and on the right by \(b\) results in:
\begin{align}
(c^{-1}b) \cdot a &= a \cdot (c^{-1}b)\\
\end{align}
That is, \(c^{-1}b\) commutes with \(a,\) so \(c^{-1}b \in C_G(a).\) This shows that \(c \cdot C_G(a) = b \cdot C_G(a),\) so \(b\) and \(c\) are in the same coset and \(\varphi\) is one-to-one.

Finally, \(\varphi\) is clearly onto, since any conjugate \(gag^{-1}\) is the image of the coset \(g \cdot C_G(a).\)

It follows that \(\varphi\) provides a one-to-one correspondence between the left cosets of \(C_G(a)\) and the conjugacy class containing \(a.\)
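The correspondence can be watched in action on a small non-abelian group. Here is a hedged Python sketch using the symmetric group \(S_3\) (my choice of example, not the text's), with permutations stored as tuples; `compose`, `inverse`, `conj`, and `coset_map` are hypothetical helper names.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3; p[i] is the image of i under p

def compose(p, q):
    """Return the permutation p∘q, that is, i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conj(b, a):
    """Return b a b^-1."""
    return compose(compose(b, a), inverse(b))

def coset_map(a):
    """Send each left coset B of C_G(a) to the set {b a b^-1 : b in B}."""
    C = [x for x in G if compose(x, a) == compose(a, x)]
    cosets = {frozenset(compose(d, k) for k in C) for d in G}
    return {B: {conj(b, a) for b in B} for B in cosets}

for a in G:
    images = coset_map(a)
    # Well defined: every coset yields exactly one conjugate.
    assert all(len(v) == 1 for v in images.values())
    targets = [next(iter(v)) for v in images.values()]
    # One-to-one: distinct cosets give distinct conjugates.
    assert len(set(targets)) == len(targets)
    # Onto: every conjugate of a is hit.
    assert set(targets) == {conj(g, a) for g in G}

print("coset/conjugate correspondence verified for S_3")
```

The three assertions mirror the three steps of the argument: well defined, one-to-one, and onto.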

The Class Equation

Any subgroup leads to another partition of a group into cosets. In this case the equivalence classes are the cosets and they are of equal size. The number of cosets is called the index of the subgroup and is denoted by \(|G : H|,\) where \(H\) is the subgroup in question. It follows that the size of a subgroup divides the size of the group and that:
\[ |G| = |H| \cdot |G : H|.\]
Since the conjugacy class containing \(a\) has the same size as the set of cosets of \(C_G(a),\) the conjugacy class containing \(a\) has \(|G : C_G(a)|\) elements. And since the conjugacy classes partition \(G:\)
\begin{align}
|G| &= \sum (\text{sizes of the conjugacy classes})\\
&= |Z(G)| + \sum |G : C_G(a)|,
\end{align}
where \(Z(G)\) is the center of \(G,\) which consists of exactly those elements whose conjugacy class has a single element, and the sum is taken over one element \(a\) from each non-central conjugacy class. Note that \(|Z(G)|\) and each summand on the right divide the order of \(G.\) This is the Class Equation of a finite group.

For example, the conjugacy classes of \(Q_8\) are \( \{1\}, \{-1\}, \{i, -i\}, \{j, -j\},\) and \(\{k, -k\},\) and the centralizers of the last three are \(\{1, -1, i, -i\}, \{1, -1, j, -j\},\) and \(\{1, -1, k, -k\},\) so the class equation reads:
\begin{align}
|Q_8| &= |Z(Q_8)| + \sum |Q_8 : C_{Q_8}(a)|\\
&= |Z(Q_8)| + |Q_8 : C_{Q_8}(i)| + |Q_8 : C_{Q_8}(j)| + |Q_8 : C_{Q_8}(k)|\\
&= 2 + 2 + 2 + 2.
\end{align}
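For a second example with a less uniform class equation, here is a short Python sketch that computes it for the symmetric group \(S_4\) (an example chosen here, not taken from the text). The function name `class_equation` and the tuple representation of permutations are assumptions of the sketch.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q, that is, i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def class_equation(G):
    """Return (|Z(G)|, sorted class sizes of the non-central classes)."""
    seen, center_size, indices = set(), 0, []
    for a in G:
        if a in seen:
            continue
        cls = {compose(compose(g, a), inverse(g)) for g in G}
        seen |= cls
        centralizer_size = sum(1 for x in G if compose(x, a) == compose(a, x))
        # The class of a has |G : C_G(a)| elements.
        assert len(cls) * centralizer_size == len(G)
        if len(cls) == 1:
            center_size += 1
        else:
            indices.append(len(cls))
    return center_size, sorted(indices)

S4 = list(permutations(range(4)))
z, idx = class_equation(S4)
print(len(S4), "=", z, "+", " + ".join(map(str, idx)))  # 24 = 1 + 3 + 6 + 6 + 8
```

Every summand divides 24, as the class equation requires, and the center of \(S_4\) is trivial.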

The class equation reduces questions about groups to questions about integers and unlocks many a problem. For example, the center of a \(p\)-group is non-trivial (a \(p\)-group is a group whose order is a power of a prime \(p\)). To see this, note that the left side of the class equation is a multiple of \(p\) for a \(p\)-group \(G,\) and so is every summand of the sum on the right, since each is an index greater than 1 dividing \(|G|.\) So \(|Z(G)|\) is a multiple of \(p\) as well. But \(1 \in Z(G),\) so \(|Z(G)| \neq 0.\) It follows that \(|Z(G)| \geq p \geq 2.\) QED!

On To Wedderburn

Suppose \(K\) is a finite division ring and \(K^* = K \backslash \{0\}\) is its multiplicative group. The object is to show that \(K^*\) must be commutative. Straightforward calculations show that \(Z = Z(K^*) \cup \{0\}\), the multiplicative center of \(K^*\) together with zero, constitutes a sub-division ring of \(K\) and (because it is commutative) a field. As in finite field theory, \(K\) is a vector space over \(Z.\) Therefore if \(Z\) has \(q\) elements and \(K\) has dimension \(n\) over \(Z,\) then \(K\) has \(q^n\) elements and so \(K^*\) has \(q^n-1\) elements.

Given any \(a \in K^*,\) its centralizer \(C_{K^*}(a)\) in \(K^*\) together with zero is a sub-division ring of \(K,\) so it too is a vector space over \(Z\) with cardinality a power of \(q;\) that is, \(|C_{K^*}(a)| = q^{n_a}-1\) for some integer \(n_a,\) where \(1 \leq n_a \leq n.\) This all leads to the class equation:
\begin{align}
|K^*| &= |Z(K^*)| + \sum |K^* : C_{K^*}(a)|\\
&= |Z(K^*)| + \sum {{|K^*|}\over{|C_{K^*}(a)|}}\\
q^n - 1 &= q - 1 + \sum {{q^n-1}\over{q^{n_a}-1}},\\
\end{align}
where the third line substitutes the values just derived. Note that all those summands on the right are integers, since they are indices of a subgroup in a larger group. It is a fact about integers that:
\[(q^{n_a}-1) \, | \; (q^n - 1) \implies n_a \, | \, n,\]
where \( 2 \leq q\) and \( 1 \leq n_a \leq n.\) To see this, put \( n = kn_a + r,\) where \( 0 \leq r < n_a.\) Then
\begin{align}
(q^{n_a}-1) \, &| \, \left( (q^{kn_a+r}-1) - (q^{n_a}-1)\right)\\
&= q^{kn_a+r}-q^{n_a}\\
&= q^{n_a}(q^{(k-1)n_a+r}-1).
\end{align}
Since \(q^{n_a}\) and \(q^{n_a}-1\) are relatively prime, this implies:
\[(q^{n_a}-1) \, | \, (q^{(k-1)n_a+r}-1).\]
Continuing in this fashion results in:
\[(q^{n_a}-1) \, | \, (q^r-1),\]
but \(q^r - 1 < q^{n_a}-1,\) so it must be the case that \(q^r-1 = 0\) and \(r=0;\) that is, \(n_a \; | \; n\) for each \(n_a\) as claimed. Putting this together:
\begin{align}
q^n - 1 &= q - 1 + \sum {{q^n-1}\over{q^{n_a}-1}}, \hspace{20pt} n_a \, | n, \hspace{5pt} n_a \neq n.\tag{1}\\
\end{align}
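Both the integer fact and equation (1) lend themselves to numerical exploration. The Python sketch below first searches a small range for counterexamples to \((q^{m}-1) \, | \, (q^n-1) \implies m \, | \, n,\) then asks whether \(q^n - q\) can be written as a sum, repetitions allowed, of terms \((q^n-1)/(q^d-1)\) with \(d\) a proper divisor of \(n,\) which is what (1) demands. The function names and search bounds are arbitrary choices of mine, and a finite search is evidence rather than proof.

```python
def divisibility_counterexamples(max_q=10, max_n=24):
    """Triples (q, m, n) with (q^m - 1) | (q^n - 1) but m not dividing n."""
    found = []
    for q in range(2, max_q + 1):
        for n in range(1, max_n + 1):
            for m in range(1, n + 1):
                if (q**n - 1) % (q**m - 1) == 0 and n % m != 0:
                    found.append((q, m, n))
    return found

def proper_divisors(n):
    return [d for d in range(1, n) if n % d == 0]

def equation_1_has_solution(q, n):
    """Is q^n - q a sum (repeats allowed) of terms (q^n - 1)/(q^d - 1)?"""
    target = q**n - q
    terms = [(q**n - 1) // (q**d - 1) for d in proper_divisors(n)]
    # reachable[t] is True when t is a sum of terms (coin-change style).
    reachable = [True] + [False] * target
    for total in range(1, target + 1):
        reachable[total] = any(
            t <= total and reachable[total - t] for t in terms
        )
    return reachable[target]

print(divisibility_counterexamples())    # []
print(equation_1_has_solution(2, 1))     # True: n = 1, empty sum
print(any(equation_1_has_solution(q, n)
          for q in range(2, 6) for n in range(2, 7)))  # False
```

In the range tested, (1) has a solution only when \(n = 1,\) which is exactly what the rest of the proof establishes in general.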

Wedderburn Finishes the Proof

The plan is to prove that (1) cannot hold unless \(n = 1\) (and the sum is empty). To this end, Wedderburn cited a result of Birkhoff and Vandiver[3] conditioning sums like the one appearing in (1):

For \(n > 1\) there is a prime number dividing \(q^n -1\) but which does not divide any \(q^m -1\) where \(m\) is a proper divisor of \(n\) except when:
• \(q=2,\, n=6,\)
• \(q=2^k-1\) is prime, \(n=2.\)
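The statement can be probed by brute force over a small range. The sketch below is an illustration under assumptions of my own: it restricts \(q\) to prime powers (the possible sizes of a finite field), limits the search to \(q \leq 32\) and \(2 \leq n \leq 8,\) and factors by trial division. The function names are hypothetical.

```python
def prime_factors(m):
    """Set of prime divisors of m, by trial division."""
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors

def has_primitive_prime(q, n):
    """True if a prime divides q^n - 1 but no q^m - 1, m a proper divisor of n."""
    proper = [m for m in range(1, n) if n % m == 0]
    return any(
        all((q**m - 1) % p != 0 for m in proper)
        for p in prime_factors(q**n - 1)
    )

def is_prime_power(q):
    return q >= 2 and len(prime_factors(q)) == 1

# Pairs (q, n) in the search range for which no such prime exists.
exceptions = [
    (q, n)
    for q in range(2, 33) if is_prime_power(q)
    for n in range(2, 9)
    if not has_primitive_prime(q, n)
]
print(exceptions)  # [(2, 6), (3, 2), (7, 2), (31, 2)]
```

The only failures in this range are \(q=2,\, n=6\) and \(n=2\) with \(q = 3, 7, 31,\) each of the form \(2^k-1,\) matching the two listed exceptions.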

To consider the first exception, plug \(q=2,\,n=6\) into (1):
\begin{align}
2^6 - 1 &= 2 - 1 + \sum {{2^6-1}\over{2^m-1}}, \hspace{10pt} m \in \{1, 2, 3\}\\
62 &= \sum {63\over s}, \hspace{10pt} s \in \{1,3,7\},
\end{align}
taking this to mean that the sum can be any combination of the three quotients, namely, \(63, 21, 9.\) The \(63 \) is out as too large and no combination of the other two sums to \(62,\) since \(21\) and \(9\) are both multiples of \(3\) and \(62\) is not. This rules out the first exception.

To consider the second exception, note that the only possible proper divisor of \(n=2\) is \(m=1,\) so (1) reduces to:
\begin{align}
q^2 &= q + \text{ maybe } {{q^2-1} \over {q-1}}\\
q^2 &= q + \text{ maybe } q+1.
\end{align}
Not including \(q+1\) leads to \(q^2=q,\) so \(q=1,\) which can't be since \(q \geq 2.\) Including \(q+1\) once leads to \(q^2-2q-1 =0,\) which has no rational roots, so that is disallowed as well. Including it \(c \geq 1\) times fares no better: then \(q(q-1) = c(q+1),\) and since \(q+1\) is relatively prime to \(q\) it would have to divide \(q-1,\) which is impossible. The upshot is that neither of the exceptions is consistent with equation (1), so if \(n > 1\) there is a prime number dividing \(q^n-1\) but not dividing any of the \(q^{n_a}-1\) in (1). Since that prime number divides the numerator but not the denominator in each of the fractions in the sum, it divides the fraction itself (which is actually an integer). This prime also divides the left side of (1), so it must divide \(q-1=q^1-1,\) which it expressly does not. This contradiction forces \(n=1,\) and so \(K=Z\) and \(K\) is commutative. QED.

Witt's Alternative — The Cyclotomic Polynomials

The last step above is dissatisfying in not being self-contained and depending on the seemingly arcane result of Birkhoff and Vandiver.[4] In 1931 Ernst Witt suggested an elegant alternative approach to showing that (1) implies \(n=1\).[5]

[Figure: the fifth roots of unity]

Witt employed the cyclotomic polynomials, where for a positive integer \(n,\) the \(n\)th cyclotomic polynomial is defined as:

\[ \Phi_n(x) := \prod_\lambda (x - \lambda), \]

where the \(\lambda\) range over the primitive \(n\)th roots of unity, that is, the \(n\)th roots of unity generating the cyclic group of all the \(n\)th roots of unity. Number the \(n\)th roots of unity as \(\lambda_0 = 1, \lambda_1 = \cos{(2\pi/n)} + i\sin{(2\pi/n)},\) and so on, proceeding counter-clockwise around the unit circle. Then \(\lambda_k\) is a primitive \(n\)th root of unity if and only if \(k\) and \(n\) are relatively prime. In the case \(n=5,\) shown here, each of \(\lambda_1, \lambda_2, \lambda_3,\) and \(\lambda_4\) is a primitive \(5\)th root of unity, so:

\begin{align}
\Phi_5(x) &= (x-\lambda_1)(x-\lambda_2)(x-\lambda_3)(x-\lambda_4)\\
&= [(x-\lambda_1)(x-\lambda_4)] \cdot [(x-\lambda_2)(x-\lambda_3)]\\
&= [x^2 -(\lambda_1 + \lambda_4)x + \lambda_1 \lambda_4] \cdot [x^2 -(\lambda_2 + \lambda_3)x + \lambda_2 \lambda_3]\\
&= [x^2 -2 \cos{(2 \pi/5)}x + 1] \cdot [x^2 -2 \cos{(4 \pi/5)}x + 1]\\
&= [x^2 -2 (-\varphi'/2)x + 1] \cdot [x^2 -2 (-\varphi/2)x + 1]\\
&= [x^2 + \varphi' x + 1] \cdot [x^2 + \varphi x + 1],
\end{align}

where \(\varphi, \varphi' = (1 \pm \sqrt{5})/2\) are the roots of \(x^2-x-1=0\), so \(\varphi \varphi' = -1\) and \(\varphi + \varphi' = 1\). Multiplying the quadratic expressions results in:

\begin{align}
\Phi_5(x) &= x^4 + (\varphi + \varphi')x^3 + (2 + \varphi \varphi')x^2 + (\varphi + \varphi')x + 1\\
&= x^4 + x^3 + x^2 + x + 1.
\end{align}

In this case, there is an easier way, considering that every fifth root of unity is primitive except \(\lambda_0 = 1,\) so:

\begin{align}
\Phi_5(x) &= {{x^5-1} \over {x-1}}\\[5pt]
&= x^4 + x^3 + x^2 + x + 1.\tag{2}\\
\end{align}

Expressions like (2) obtain when (and only when) \(n\) is prime:

\[\Phi_p(x) = \sum_{k=0}^{p-1} x^k, \text{ when } p \text{ is prime.}\]

[Figure: the sixth roots of unity]

For composite values of \(n,\) not every \(n\)th root of unity is primitive (and \(\lambda_0 = 1\) never counts), so the degree of \( \Phi_n(x) \) is less than \(n-1.\) Take \(n = 6,\) for example, where there are only two primitive roots of unity, \(\lambda_1, \lambda_5 = \cos{(\pi/3)} \pm i\sin{(\pi/3)},\) so \(\Phi_6(x)\) is a quadratic:
\begin{align}
\Phi_6(x) &= (x-\lambda_1)(x-\lambda_5)\\
&= x^2 -(\lambda_1 + \lambda_5)x + \lambda_1 \lambda_5\\
&= x^2 -2 \cos{(\pi/3)}x + 1\\
&= x^2 - x + 1.
\end{align}
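The definition can be turned directly into a numerical experiment: multiply out \((x - \lambda)\) over the primitive \(n\)th roots in floating-point complex arithmetic and round the result. This is a sketch only; the function name `cyclotomic_from_roots` is mine, and rounding to the nearest integer assumes the coefficients really are integers, which is argued separately in the text.

```python
import cmath
from math import gcd

def cyclotomic_from_roots(n):
    """Coefficients of the n-th cyclotomic polynomial, highest degree first,
    from the product of (x - root) over the primitive n-th roots of unity."""
    coeffs = [1 + 0j]
    for k in range(n):
        if gcd(k, n) == 1:  # the k-th root is primitive iff gcd(k, n) = 1
            root = cmath.exp(2j * cmath.pi * k / n)
            # Multiply the current polynomial by (x - root).
            times_x = coeffs + [0j]
            times_root = [0j] + [-root * c for c in coeffs]
            coeffs = [s + t for s, t in zip(times_x, times_root)]
    # Imaginary parts cancel up to rounding error; keep rounded real parts.
    return [round(c.real) for c in coeffs]

print(cyclotomic_from_roots(5))  # [1, 1, 1, 1, 1]
print(cyclotomic_from_roots(6))  # [1, -1, 1]
```

The two outputs reproduce \(x^4 + x^3 + x^2 + x + 1\) and \(x^2 - x + 1.\)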
Consider the group of the \(n\)th roots of unity. Every element has an order \(k\) dividing \(n\) and that element is a primitive \(k\)th root of unity, primitive because no lower power of the element equals one and therefore the element generates the group of \(k\)th roots of unity. It follows that every \(n\)th root of unity is a root of exactly one cyclotomic polynomial \(\Phi_d(x),\) where \(d \, | \, n,\) so that:
\[ x^n - 1 = \prod_{d \, | \, n} \Phi_d(x).\tag{3}\]
In the case \(n = 6,\) for example:
\begin{align}
x^6 - 1 &= [\underbrace{(x-\lambda_1)(x-\lambda_5)}_{\large \text{prim. }6 \text{th roots}}][\underbrace{(x-\lambda_2)(x-\lambda_4)}_{\large \text{prim. }3 \text{rd roots}}](\underbrace{x-\lambda_3}_{\large 2 \text{nd}})(\underbrace{x-\lambda_0}_{\large 1 \text{st}})\\[10pt]
&= \Phi_6(x)\Phi_3(x)\Phi_2(x)\Phi_1(x)\\
&= (x^2-x+1)(x^2+x+1)(x+1)(x-1).
\end{align}
It follows from (3) that the cyclotomic polynomials have integral coefficients and, after the first one \(\Phi_1(x) = x - 1,\) constant term 1, which is certainly not to be expected considering that they are defined over the complex numbers. To see this, rewrite (3) as:
\[ x^n - 1 = \Phi_n(x) \cdot \prod_{d \, | \, n, \, d \neq n} \Phi_d(x).\tag{4}\]
By induction, the product on the right is a polynomial with integral coefficients and comparing coefficients on each side of (4) establishes that \(\Phi_n(x)\) has integral coefficients with constant term 1. To spell this out, suppose:
\begin{align}
\Phi_n(x) &= x^j + a_{j-1} x^{j-1} + \cdots + a_1 x + a_0,\\
\prod_{d \, | \, n, \, d \neq n} \Phi_d(x) &= x^k + b_{k-1} x^{k-1} + \cdots + b_1 x - 1, \hspace{20pt} b_i \in \mathbb{Z}.
\end{align}
Both these polynomials are monic, being either a cyclotomic polynomial or a product of them. The constant term of the polynomial on the second line is \(-1\) because \((x-1)\) is one of the factors of the product and, by induction, the other polynomial factors have constant term 1. The object is to show that \(a_0 = 1\) and that \(a_i \in \mathbb{Z}\) for \(i \geq 1.\) To this end, compare the constants on each side of (4), then the coefficients of \(x,\) the coefficients of \(x^2,\) and so on:
\begin{alignat}{2}
\text{constants } &: -1 = a_0 \cdot (-1) &&\implies a_0 = 1,\\
\text{coefficients of } x &: a_0 b_1 - a_1 = 0 &&\implies a_1 \in \mathbb{Z}, \text{ since } a_0, b_1 \in \mathbb{Z},\\
\text{coefficients of } x^2 &: a_0 b_2 + a_1 b_1 - a_2 = 0 &&\implies a_2 \in \mathbb{Z}, \text{ since } a_0, a_1, b_i \in \mathbb{Z},
\end{alignat}
and so on. This establishes the induction, that \(\Phi_n(x)\) has integral coefficients and a constant term 1.
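The induction is effectively an algorithm: divide \(x^n - 1\) by the cyclotomic polynomials of the proper divisors of \(n,\) using integer arithmetic throughout. Here is a minimal Python sketch of that idea, with coefficient lists written highest degree first; `poly_divide` and `cyclotomic` are names I have made up for the illustration.

```python
from functools import lru_cache

def poly_divide(num, den):
    """Exact quotient of integer polynomials (highest degree first).
    Assumes den is monic and divides num; the assert checks the latter."""
    num = list(num)
    quotient = []
    for i in range(len(num) - len(den) + 1):
        c = num[i]
        quotient.append(c)
        for j, d in enumerate(den):
            num[i + j] -= c * d
    assert not any(num), "division left a remainder"
    return quotient

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n as a tuple of integers: x^n - 1 divided by Phi_d, d | n, d < n."""
    poly = [1] + [0] * (n - 1) + [-1]  # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divide(poly, cyclotomic(d))
    return tuple(poly)

print(cyclotomic(5))   # (1, 1, 1, 1, 1)
print(cyclotomic(6))   # (1, -1, 1)
print(cyclotomic(12))  # (1, 0, -1, 0, 1)
```

Because every divisor is monic, each division stays within the integers, which is the computational counterpart of the coefficient comparison above.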

Let \(d \, | \, n, d \neq n\) and consider the expression:
\[Q_{n,d}(x) = {{x^n - 1} \over {x^d - 1}}.\]
Both numerator and denominator are products of cyclotomic polynomials and since \(d \, | \, n,\) all the cyclotomic polynomials appearing in the denominator also appear in the numerator. It follows that \(Q_{n,d}(x)\) is itself a product of cyclotomic polynomials. Since \(\Phi_n(x)\) is a factor of the numerator but not the denominator:
\[\Phi_n(x) \, | \, Q_{n,d}(x).\]
Putting \(x=q\) and \(d=n_a:\)
\[\Phi_n(q) \, \bigg| \, {{q^n - 1} \over {q^{n_a} - 1}}.\]
Since \(\Phi_n(q) \, | \, (q^n-1)\) as well, equation (1) above implies that \(\Phi_n(q) \, | \, (q-1).\)
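These divisibility statements can be spot-checked with integers alone, since evaluating (3) at \(x = q\) gives \(q^n - 1 = \prod_{d \, | \, n} \Phi_d(q).\) The following Python sketch computes \(\Phi_n(q)\) recursively from that identity; the name `phi_at` and the ranges tested are my own choices.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def phi_at(n, q):
    """Integer value of Phi_n(q), using q^n - 1 = product of Phi_d(q), d | n."""
    value = q**n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= phi_at(d, q)  # exact: Phi_d(q) divides what remains
    return value

for q in range(2, 8):
    for n in range(2, 13):
        # Phi_n(q) divides q^n - 1 ...
        assert (q**n - 1) % phi_at(n, q) == 0
        for d in range(1, n):
            if n % d == 0:
                # ... and also every quotient (q^n - 1)/(q^d - 1), d | n, d < n.
                assert ((q**n - 1) // (q**d - 1)) % phi_at(n, q) == 0

print(phi_at(6, 2), phi_at(4, 3), phi_at(5, 2))  # 3 10 31
```

With every term of (1) other than \(q-1\) divisible by \(\Phi_n(q),\) the remaining term \(q - 1\) must be divisible by it too.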

Bringing It Together
[Figure: sixth roots of unity]

The last step of the proof is to show that \(\Phi_n(q)\) cannot divide \(q-1\) if \(n > 1.\) Recall that \(q\) is the size of the center of the division ring, so \(q\) is an integer greater than or equal to 2. The point \((q,0)\) is closest to the half-plane to the left of the line \(x=1\) at the point \((1,0),\) where its distance from the half-plane is \(q-1.\) For all other points in the half-plane, the distance from \((q,0)\) is strictly greater than \(q-1.\) As illustrated here, for example, \(|q-\mu| > q-1\) and \(|q-\lambda| > q-1.\) Since \(\Phi_n(q) = \prod(q-\lambda),\) with all \(\lambda\) interior to that half-plane if \(n > 1\) (and there being at least one of them):
\[|\Phi_n(q)| = \prod|q-\lambda| > (q-1)^m \geq q-1 \text{ when } n > 1,\]
where \(m = \text{ deg } \Phi_n(x) \geq 1.\) So \(\Phi_n(q)\) cannot divide \(q-1\) in that case. It follows that \(n=1,\) that is, that the dimension of the division ring over its center is 1. Put otherwise, the center is the entire division ring and therefore the division ring is commutative. QED.
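The geometric inequality is simple to test numerically. The sketch below, a floating-point illustration rather than a proof, computes each distance \(|q - \lambda|\) for the primitive \(n\)th roots \(\lambda\) and confirms that every factor, and hence the product, exceeds \(q - 1\); the function name `abs_phi` and the ranges are assumptions of mine.

```python
import cmath
from math import gcd, prod

def abs_phi(n, q):
    """Return (|Phi_n(q)|, roots): the product of |q - root| over the
    primitive n-th roots of unity, together with those roots."""
    roots = [cmath.exp(2j * cmath.pi * k / n)
             for k in range(n) if gcd(k, n) == 1]
    return prod(abs(q - lam) for lam in roots), roots

for q in range(2, 8):
    for n in range(2, 31):
        value, roots = abs_phi(n, q)
        # Each primitive root lies left of the line x = 1, so it is
        # farther from (q, 0) than the point (1, 0) is.
        assert all(abs(q - lam) > q - 1 for lam in roots)
        # Hence the product exceeds q - 1 as well.
        assert value > q - 1

print(round(abs_phi(6, 2)[0], 9))  # 3.0
```

For \(n = 6\) and \(q = 2\) the product is \(|\Phi_6(2)| = 3,\) comfortably larger than \(q - 1 = 1.\)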

Mike Bertrand

March 3, 2022


1. Proofs from THE BOOK, by Martin Aigner and Günter M. Ziegler, Springer (1998), ISBN 3-540-63698-6. This is the first edition, where Wedderburn's Theorem is the subject of Chapter 5 (pp. 23-26). Erdős impishly maintained that God (in whom he didn't believe) kept a jealously guarded book of the most elegant theorems of mathematics and it was the duty of mathematicians to pry them from His hands. Proofs from THE BOOK has had great success, running to six editions and having its own Wikipedia page.

2. Wedderburn's original proof is in "A Theorem on Finite Algebras", by J. H. Maclagan Wedderburn, Transactions of the American Mathematical Society, Vol. 6, No. 3 (July 1905), pp. 349-352. I've also drawn heavily on Proofs from THE BOOK and Topics in Algebra, by I. N. Herstein, Blaisdell Publishing Company (1964), pp. 318-321.

3. "On the Integral Divisors of \(a^n - b^n\)", by Geo. D. Birkhoff and H. S. Vandiver, Annals of Mathematics, July 1904, Second Series, Vol. 5, No. 4, pp. 173-180.

4. The theorem of Birkhoff and Vandiver was first proven by Karl Zsigmondy in 1892 and is now known as Zsigmondy's Theorem; see "Zur Theorie der Potenzreste", by Karl Zsigmondy, Monatshefte für Mathematik und Physik 3 (1892), pp. 265-284.

5. "Über die Kommutativität endlicher Schiefkörper", by Ernst Witt, Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg 8 (1931), p. 413.