The Hilbert Basis Theorem

 David Hilbert in 1912
David Hilbert, 1862-1943 (photo 1912)

Hilbert first proved a form of the basis theorem in 1890.[1] It was so revolutionary at the time that Paul Gordan reportedly exclaimed, “This is not mathematics, it is theology!”.[2] van der Waerden gave an updated and generalized proof in Moderne Algebra in 1931, crediting Hilbert for the basic idea and Emil Artin for the specifics.[3] The proof here is updated still more, though still retaining van der Waerden's degree reduction strategy.

Hilbert Basis Theorem. Let \( R \) be a commutative ring with \( 1 \). If \( R \) is Noetherian, then \( R[x] \) is Noetherian as well.

Proof. Recall that a ring is Noetherian if its ideals satisfy the Ascending Chain Condition, or equivalently, if every ideal is finitely generated.

Suppose \( R \) is Noetherian and \( U \) is an ideal in \( R[x]. \) The objective is to prove that \( U \) is finitely generated. For \( m = 0, 1, 2, \cdots, \) let

\begin{align}
L_m &= \text{ set of leading coefficients of polynomials of degree } \leq m \text{ in } U, \text{ together with }0
\end{align}

Every nonzero \( \alpha \in L_m \) is associated with some polynomial \( a(x) = \alpha x^s + \cdots \in U \) with \( \deg{a} = s \leq m. \)

Each \( L_m \) is an ideal in \( R. \) For suppose \( \alpha, \beta \in L_m. \) Then \( \alpha \) and \( \beta \) are the leading coefficients of some polynomials \( a(x), b(x) \in U \) of degree less than or equal to \( m: \)

\begin{align}
a(x) &= \alpha x^s + \cdots \in U, \;\; \text{ where } \deg{a} = s \leq m,\\
b(x) &= \beta x^t + \cdots \in U, \;\; \text{ where } \deg{b} = t \leq m.
\end{align}

Assume \( s \geq t \) and consider \( c(x) = a(x) - x^{s-t} b(x). \) Multiplying \( b(x) \) by \( x^{s-t} \) or indeed by any other polynomial keeps the result in \( U \) since \( U \) is an ideal in \( R[x], \) and subtracting that from another element of \( U, a(x), \) keeps the final result \( c(x) \) in \( U \) also, so

\begin{align}
c(x) &= a(x) - x^{s-t} b(x) \in U\\
&= (\alpha x^s + \cdots ) - (\beta x^s + \cdots)\\
&= (\alpha - \beta) x^s + \cdots.
\end{align}

Since \( c(x) \in U, \) this implies that either \( \alpha - \beta = 0 \) or \( \alpha - \beta \) is the coefficient of the highest power of a polynomial of degree \( s \leq m \) in \( U. \) In any event, \( \alpha - \beta \in L_m. \) Similarly \( \lambda \alpha \in L_m \) for any \( \lambda \in R \) and so \( L_m \) is an ideal in \( R \). Also:

\begin{align}
L_0 \subseteq L_1 \subseteq L_2 \subseteq \cdots,
\end{align}

so by the ACC in \( R, \) the chain stabilizes; that is, there is some \( n \geq 0 \) such that

\begin{align}
L_n = L_{n+1} = L_{n+2} = \cdots.
\end{align}

Assume that \( n \) is the smallest such value. Since \( R \) is Noetherian, all its ideals are finitely generated, and in particular, each \( L_m \) has a finite set of generating values \( \alpha \in R \):

\begin{align}
L_m = (\alpha_{m1}, \alpha_{m2}, \cdots, \alpha_{mk_m}), \hspace{20pt} m \leq n.
\end{align}

By the definition of the \( L_m, \) each \( \alpha_{mj} \) is associated with a corresponding polynomial \( a_{mj}(x) \in U \) of degree less than or equal to \( m. \) The goal is to show that all the \( a_{mj}(x) \in U \) taken together generate \( U \) in \( R[x]. \) To this end, build ideals in \( R[x] \) paralleling the \( L_m: \)

\begin{align}
A_m = (a_{m1}(x), a_{m2}(x), \cdots, a_{mk_m}(x)), \hspace{30pt} m \leq n.
\end{align}

First we establish the following:

For \( g(x) \in U \) with \( \deg{g} = m \geq n\) there is some \( g_1(x) \in U \) such that \( g(x) - g_1(x) \in A_n, \) where \( \deg{g_1} < \deg{g} = m. \)(1)

For the sake of illustration, suppose \( g(x) = \beta x^{n+3} + \cdots, \) where \( \beta \neq 0, \) so \( \beta \in L_{n+3} = L_n. \) Assume further that:

\begin{align}
L_n &= \{\alpha_1, \alpha_2, \alpha_3\}, \text{ putting } \alpha_{j} = \alpha_{n,j},\\
A_n &= \{a_1(x), a_2(x), a_3(x)\}, \text{ putting } a_j(x) = a_{n,j}(x),\\
&= \{\alpha_1 x^{n-1} + \cdots, \;\; \alpha_2 x^n + \cdots, \;\; \alpha_3 x ^{n-2} + \cdots\},\\
\end{align}

Note in the above that \( \alpha_1 \in L_n \) means that there is a polynomial \( a_1(x) \in A_n \) of degree less than or equal to \( n \) with \( \alpha_1 \) as the leading coefficient, say \( a_1(x) = \alpha_1 x^{n-1} + \cdots, \) where \( \alpha_1 \neq 0 \) and the ellipsis indicates terms in lower powers of \( x. \) In fact, the degree of \( a_1(x) \) could have been anything less than or equal to \( n \) and \( a_2(x) \) and \( a_3(x) \) exemplify that possibility.

Since \( \beta \in L_n \):

\begin{align}
\beta = \lambda_1 \alpha_1 + \lambda_2 \alpha_2 + \lambda_3 \alpha_3, \hspace{20pt} \lambda_i \in R.
\end{align}

Put:

\begin{align}
g_1(x) = g(x) - \left(\lambda_1 \color{red}{x^4} a_1(x) + \lambda_2 \color{red}{x^3} a_2(x) + \lambda_3 \color{red}{x^5} a_2(x)\right),\\
\end{align}

where the highlighted weighting powers of \( x \) are chosen to scale the \( a_i (x) \) in question to have the same degree as \( g(x). \) Note that the parenthesized expression is in \( A_n, \) since it consists of sums of polynomials in \( A_n \) multiplied by elements of \( R[x]. \) Continuing the calculation:

\begin{align}
g_1(x) &= g(x) - \bigl(\lambda_1 \color{red}{x^4} \underbrace{(\alpha_1 x^{n-1} + \cdots)}_{\Large a_1(x)} + \lambda_2 \color{red}{x^3} \underbrace{(\alpha_2 x^n + \cdots)}_{\Large a_2(x)} + \lambda_3 \color{red}{x^5} \underbrace{(\alpha_3 x^{n-2} + \cdots)}_{\Large a_3(x)}\bigl)\\
&= \beta x^{n+3} + \cdots - \left((\lambda_1 \alpha_1 x^{n+3} + \cdots) + (\lambda_2 \alpha_2 x^{n+3} + \cdots) + (\lambda_3 \alpha_3 x^{n+3} + \cdots)\right)\\
&= \bigl(\underbrace{\beta - (\lambda_1 \alpha_1 + \lambda_2 \alpha_2 + \lambda_3 \alpha_3)}_{\Large 0}\bigl)x^{n+3} + \cdots\\
&= \text{ a polynomial of degree } \leq n+2 = m-1,
\end{align}

so \( g_1(x) \) is the polynomial promised in (1) in this example, and a general proof of (1) proceeds accordingly.

Keeping in mind that \( n \) is the point at which the ideals \( L_m \) in \( R \) stabilize (that is, \( L_m = L_n \) for \( m \geq n \)), repeated applications of (1) produce a series of polynomials \( f_i(x) \in U \) for any given \( f(x) \in U \) with \( \deg{f} = p > n: \)

\begin{alignat}{3}
f(x) &= f_1(x) + b'(x), \hspace{5pt} \quad && b'(x) \in A_n, \quad &&&\deg{f_1} \leq p-1,\\
f_1(x) &= f_2(x) + b''(x), \hspace{5pt} \quad && b''(x) \in A_n, \quad &&&\deg{f_2} \leq p-2,\\
&\vdots\\
f_{r-1}(x) &= f_r(x) + b^{(r)}(x), \hspace{5pt} \quad && b^{(r)}(x) \in A_n, \quad &&&\deg{f_r} \leq n,\\
\end{alignat}

where \( r \leq p - n, \) the inequality being strict if the degree is reduced by more than one at any step. Unrolling this sequence:

\begin{equation}{f(x) = f_r(x) + b(x), \quad b(x) \in A_n, \quad \deg{f_r} \leq n.}\tag{2} \end{equation}

That is, for every \( f(x) \in U \) of degree greater than \( n, \) there is an associated \( f_r(x) \in U \) of degree less than or equal to \( n \) such that:

\begin{equation}{f(x) \in \left(f_r(x), A_n\right), \quad \deg{f_r} \leq n.}\tag{3} \end{equation}

A similar degree reduction strategy is used on \( f_r(x) \) by applying this principle:

For \( g(x) \in U \) with \( \deg{g} = m < n\) there is some \( g_1(x) \in U \) such that \( g(x) - g_1(x) \in A_m, \) where \( \deg{g_1} < \deg{g} = m. \)(1')

(1') is proven just like (1) except that \( g(x) - g_1(x) \) in (1') is in \( A_{m-1}, \) the next ideal down, instead of \( A_n. \) Applying (1') to \( f_r(x) \) produces polynomials \( h(x) \) and \( c(x) \) in \( U \) such that:

\begin{equation}{f_r(x) = h(x) + c(x), \quad c(x) \in A_{n-1}, \quad \deg{h} \leq n-1.}\tag{4} \end{equation}

Combining (3) and (4):

\begin{align}
f(x) \in \left(h(x), A_n, A_{n-1}\right), \quad \deg{h} \leq n-1.
\end{align}

Continuing in this fashion:

\begin{align}
f(x) \in \left(A_n, A_{n-1}, A_{n-2}, \cdots, A_0\right).
\end{align}

\( f(x) \) was assumed to be a polynomial in \( U \) of degree greater than \( n. \) If \( \deg{f} \leq n, \) the latter part of the argument shows that \( f(x) \) is in an ideal generated by some subset of the \( A_i, \) so in either case:

\begin{align}
U = \left(A_n, A_{n-1}, A_{n-2}, \cdots, A_0\right).
\end{align}

Since each of the \( A_i \) is finitely generated in \( R[x] \) and \( n \) is a fixed value depending on \( U, \) it follows that \( U \) is finitely generated; and since this is true of every ideal in \( R[x], R[x] \) is Noetherian.QED.

Mike Bertrand

March 1, 2020


^ 1. "Ueber die Theorie der algebraischen Formen", by David Hilbert, Mathematische Annalen, 36 (4) (1890), pp. 473–534.

^ 2. A History of Abstract Algebra: From Algebraic Equations to Modern Algebra, by Jeremy Gray, Springer (2018), ISBN 978-3-319-94772-3. “This is not mathematics": p. 266. Gray translated the pertinent part of Hilbert's paper into English and discusses it helpfully in Chapter 25, pp. 263-273.

^ 3. See "On the Sources of My Book Moderne Algebra", by B. L. van der Waerden, Historia Mathematica, 2 (1975), pp. 31-40. "Emil Artin for the specifics" : p. 35.