A system of three congruences is shown on the right, but start with the simpler system:

\[ \begin{align*}
x &\equiv 1 \hspace{-.6em} {\pmod{2}}\\
x &\equiv 2 \hspace{-.6em} {\pmod{3}}.
\end{align*} \]

Values congruent \( \text{mod} \; 6 \) are certainly congruent \( \text{mod} \; 2 \) **and** \( \text{mod} \; 3, \) so in looking for an \( x \) solving both congruences simultaneously, it suffices to consider congruence classes \( \text{mod} \; 6 \) and in particular their least nonnegative residues, namely \( 0, 1, 2, 3, 4, 5. \) We're seeking an odd number among those \( 6 \) since \( x \equiv 1{\pmod{2}}, \) one that is also congruent to \( 2 \; \text{mod} \; 3. \) \( x = 1 \) won't do, since \( 1 \equiv 1{\pmod{3}}, \) and neither will \( x = 3, \) since \( 3 \equiv 0{\pmod{3}}. \) \( x = 5 \) is the solution, since it satisfies both congruences, and it is the **only** solution \( \text{mod} \; 6. \)
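The search just described is easy to mechanize. Here is a minimal sketch, assuming a hypothetical helper `solve_congruences` (my name, not from any library) that simply tries every residue modulo the product of the moduli. Uniqueness of the answer is only to be expected when the moduli are pairwise coprime, as \( 2 \) and \( 3 \) are.

```python
def solve_congruences(congruences):
    """Brute-force search for x satisfying every x ≡ r (mod m).

    `congruences` is a list of (r, m) pairs. The search runs over the
    residues 0, 1, ..., M - 1, where M is the product of the moduli.
    Returns (list_of_solutions, M).
    """
    modulus = 1
    for _, m in congruences:
        modulus *= m
    solutions = [
        x for x in range(modulus)
        if all(x % m == r % m for r, m in congruences)
    ]
    return solutions, modulus


# The simpler system from the text: x ≡ 1 (mod 2), x ≡ 2 (mod 3).
solutions, modulus = solve_congruences([(1, 2), (2, 3)])
print(solutions, modulus)  # [5] 6
```

Brute force is fine for moduli this small. For large moduli one would construct the solution directly instead of searching.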

The theorem in question is:

If \( p \) is an odd prime with \( p \equiv 1 \; (\text{mod} \; 4), \) then \( p \) is the sum of two squares. \( (1) \)

The converse (*only if*) is easy, because for all natural numbers \( n, \; n^2 \equiv 0, 1 \; (\text{mod } 4), \) so \( n^2 + m^2 \equiv 0, 1, 2 \; (\text{mod} \; 4) \) and a sum of two squares cannot be congruent to \( 3 \; (\text{mod} \; 4). \) Obviously \( 2 = 1^2 + 1^2 \) as well. The Brahmagupta-Fibonacci identity assures that a *product* of sums of two squares is itself a sum of two squares:

\[ \begin{equation}{(a^{2}+b^{2})(c^{2}+d^{2})=(ac+bd)^{2}+(ad-bc)^{2}.}\tag{2} \end{equation} \]
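Both facts can be spot-checked numerically. The sketch below is only a sanity check over small integers, not a proof; the helper names `square_residues` and `compose` are my own.

```python
from itertools import product


def square_residues(modulus):
    """Sorted list of the residues that squares can take mod `modulus`."""
    return sorted({(n * n) % modulus for n in range(modulus)})


def compose(a, b, c, d):
    """Return (u, v) with (a^2 + b^2)(c^2 + d^2) = u^2 + v^2, as in (2)."""
    return a * c + b * d, a * d - b * c


# Squares are 0 or 1 mod 4, so a sum of two squares is 0, 1 or 2 mod 4.
residues = square_residues(4)
print(residues)  # [0, 1]
print(sorted({(r + s) % 4 for r in residues for s in residues}))  # [0, 1, 2]

# Check identity (2) for every a, b, c, d between -5 and 5.
for a, b, c, d in product(range(-5, 6), repeat=4):
    u, v = compose(a, b, c, d)
    assert (a * a + b * b) * (c * c + d * d) == u * u + v * v
```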

\( 5 = 1^2 + 2^2 \) is the sum of two squares; \( 3 \) is not. Dealing with whole numbers only, including \( 0, \) it's a bit of a riddle coming up with the criterion distinguishing the two situations. Based on empirical investigations, mathematicians in the \( 17^\text{th} \) century found the key. According to Leonard Dickson^{[1]}:

A. Girard (Dec 9, 1632) had already made a determination of the numbers expressible as a sum of two integral squares: every square, every prime \( 4n + 1, \) a product formed of such numbers, and the double of one of the foregoing.

The part about primes \( p \equiv 1 \; (\text{mod} \; 4) \) is central, because a product of two numbers each of which is the sum of two squares is itself the sum of two squares. Since \( 5 = 1^2 + 2^2 \) and \( 13 = 2^2 + 3^2, \) for example, \( 65 = 5 \cdot 13 \) is also the sum of two squares: \( 65 = 4^2 + 7^2. \) In fact there is a second representation: \( 65 = 1^2 + 8^2, \) and the *number* of representations is of interest too (this exact example is from Diophantus).
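For small numbers the representations can simply be enumerated. The following is a minimal sketch, assuming a hypothetical helper `two_square_representations` of my own that lists the pairs \( (a, b) \) with \( 0 \le a \le b \) and \( a^2 + b^2 = n \).

```python
from math import isqrt


def two_square_representations(n):
    """All pairs (a, b) with 0 <= a <= b and a*a + b*b == n."""
    reps = []
    # a <= b forces a*a <= n / 2, which bounds the search.
    for a in range(isqrt(n // 2) + 1):
        b_squared = n - a * a
        b = isqrt(b_squared)
        if b * b == b_squared:
            reps.append((a, b))
    return reps


print(two_square_representations(5))   # [(1, 2)]
print(two_square_representations(3))   # []
print(two_square_representations(13))  # [(2, 3)]
print(two_square_representations(65))  # [(1, 8), (4, 7)]
```

Counting only pairs with \( 0 \le a \le b \) is a convention chosen here for the listing. Other conventions count signs and order separately and so give larger totals.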

Progressive differences of the first few cubes.

Write down the first few cubes, then put their differences \( \Delta \) in the second column, the differences of those differences \( \Delta^2 \) in the third column, and so on. Remarkably, \( \Delta^3 = 6 \), and that is true for any contiguous sequence of cubes (obviously \( \Delta^4 = 0 \)). Do that with the fourth powers and you find that \( \Delta^4 = 24, \) and in general for contiguous \( n^\text{th} \) powers, \( \Delta^n = n!. \) The key to unlocking this mystery is the Calculus of Finite Differences, out of vogue now apparently, but with a hallowed history going back to Newton and before and studied in depth by George Boole in 1860.^{[1]} His book can still be read with profit, as can C. H. Richardson's little text from 1954.^{[2]}
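The table is quick to reproduce. Here is a minimal sketch, assuming a hypothetical helper `difference_table` of my own that returns the successive difference rows (rows here play the part of the columns described above).

```python
from math import factorial


def difference_table(values):
    """Return [values, Δ values, Δ² values, ...] down to a one-entry row."""
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows


# The first seven cubes and their progressive differences.
cubes = [k ** 3 for k in range(1, 8)]
for order, row in enumerate(difference_table(cubes)):
    print(order, row)

# Spot-check Δ^n = n! for contiguous n-th powers, starting at 10.
for n in range(1, 7):
    powers = [k ** n for k in range(10, 10 + n + 3)]
    assert set(difference_table(powers)[n]) == {factorial(n)}
```

The starting point \( 10 \) in the spot-check is arbitrary, and the check is evidence for small \( n \) only, not a proof.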

Euler used \( \Delta^n x^n = n! \) in 1755 to prove the two squares theorem. Boole and those following him employed the term "calculus" advisedly, since many theorems in the finite case match similar ones in the familiar infinitesimal calculus. That stands to reason considering all the two have in common; the difference is that now \( \Delta x = 1. \)

De l'évanouissement des inconnues

(On the Vanishing of Unknowns)

Appendix to *Introduction à l'analyse des lignes courbes algébriques* (1750)

(Introduction to the Analysis of Algebraic Curves)

by *Gabriel Cramer*

When a problem contains several unknowns whose relationships are so complicated that one is obliged to form several equations; then, to discover the values of the unknowns, one makes all of them vanish, except one, which combined only with known quantities, gives, if the problem is determined, a *final Equation*, whose resolution reveals this first unknown, and then by this means all the others.

Robert E. Lee Moore (1882-1974)

It's a statement when someone names their child after Robert E. Lee, a man who did his best to destroy the United States in order to preserve slavery. Robert E. Lee was lionized more in death than in life, a paragon of the Lost Cause, the glorious if doomed rebellion of a brave people who wanted nothing but to be left alone, crushed by the soulless and brutal industrial juggernaut (Sherman's March to the Sea!). It's the big lie, forwarded for 150 years to defend the indefensible. What a wretched history of oppression, assiduously rebuilt over the generations by people like Moore, Sr. and his illustrious and vicious son Robert E. Lee Moore. The Compromise of 1877, peonage, disenfranchisement, lynching, Plessy v. Ferguson, Jim Crow, the Dunning School false flag on Reconstruction. Read the old classics by W. E. B. Du Bois, Eric Foner, and C. Vann Woodward (himself a son of the South), among others, if you still doubt the long-standing construction and reconstruction of anti-black racism in this country down through the generations since 1865.

Frigyes Riesz.

The Riesz Representation Theorem is a foundation stone of \( 20^\text{th} \) century functional analysis. It has since been generalized almost beyond recognition, but Frigyes (Frédéric) Riesz originally proved the theorem in 1909 for \( C[0,1] \), the continuous real-valued functions on \( [0,1] \):

If \( \mathcal{A} \) is a bounded linear functional on \( C[0,1] \), then there is a function \( \alpha \) of bounded variation on \( [0,1] \) such that for all \( f \in C[0,1] \): \[ \mathcal{A}[f(x)] = {\int_0^1 f(x)d\alpha(x).} \hskip{60pt} (1) \]
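To get a feel for the Stieltjes integral on the right of \( (1) \), it can be approximated by a finite sum. The sketch below is an illustration of my own and not part of Riesz's argument: the helper `stieltjes_sum` and the two sample choices of \( \alpha \) are assumptions made for the example. Taking \( \alpha(x) = x \) recovers the ordinary integral, while a unit step at \( 1/2 \) (a function of bounded variation) yields the evaluation functional \( f \mapsto f(1/2). \)

```python
import math


def stieltjes_sum(f, alpha, n=1000):
    """Riemann-Stieltjes sum of f d(alpha) over [0, 1].

    Uses n equal subintervals, each tagged at its midpoint.
    """
    total = 0.0
    for i in range(n):
        left, right = i / n, (i + 1) / n
        total += f((left + right) / 2) * (alpha(right) - alpha(left))
    return total


def identity(x):
    return x


def unit_step(x):
    # Jumps from 0 to 1 at x = 1/2.
    return 0.0 if x < 0.5 else 1.0


print(stieltjes_sum(lambda x: x * x, identity))  # close to 1/3
print(stieltjes_sum(math.cos, unit_step))        # close to cos(1/2)
```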

In this article, I propose to retrace Riesz's original proof in *Sur les opérations fonctionnelles linéaires*^{[1]} in 1909, augmenting with his discussion in *Sur certains systèmes singuliers d'équations intégrales*^{[2]} in 1911 where appropriate.

Sur certains systèmes singuliers d'équations intégrales

(On some noteworthy systems of integral equations)

by *Frédéric Riesz, à Budapest.*

In what follows, the functions of bounded variation will play a leading role. We know the importance of this class of functions defined by M. Jordan, whose most remarkable properties become almost obvious after only one statement: that every real function of bounded variation is the difference of two bounded, never decreasing functions.

Sur les opérations fonctionnelles linéaires

(On linear functional operations)

by *Frédéric Riesz*

To define what is meant by a linear operation, it is necessary to specify the *domain of the functional*. We consider the totality of all real continuous functions \( \Omega \) between two fixed numbers, for example between \( 0 \) and \( 1 \); for this class, we define the *limit function* based on the assumption of uniform convergence. The functional operation \( \text{A}[f(x)] \), which associates to each element of \( \Omega \) a corresponding real number, will be called *continuous* if when \( f(x) \) is the limit of \( f_i(x) \), then \( \text{A}(f_i) \) tends to \( \text{A}(f) \). Such a distributive and continuous operation is said to be *linear*. It is easy to show that *this operation is bounded, that is to say, there is a constant \( M_A \) such that for every element \( f(x) \) we have*

\[ \begin{equation}{|\text{A}[f(x)]| \leq M_A \times max. |f(x)|.} \tag{1} \end{equation} \]

Sergei Bernstein.

In 1912 Sergei Bernstein introduced his famous polynomials to prove the Weierstrass Approximation theorem:

If \( F(x) \) is any continuous function in the interval \( [0,1] \), it is always possible, regardless of how small \( \varepsilon \) is, to determine a polynomial \( E_n(x) = {a_0 x^n + a_1 x^{n-1} + \cdots + a_n} \) of degree \( n \) high enough such that we have \[ {|F(x) - E_n(x)|} < \varepsilon \] for every point in the interval under consideration.

The theorem was originally proved in 1885^{[1]} by Weierstrass, the very man who had earlier shown how wild a continuous function can be and, in particular, how far from being smooth and subject to a Taylor expansion. Bernstein's proof was simple and based on probability theory. Maven Philip J. Davis says that "while [Bernstein's proof] is not the simplest conceptually, it is easily the most elegant".^{[2]}
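The polynomials themselves are easy to compute. The sketch below uses the standard definition \( B_n(F; x) = \sum_{k=0}^{n} F(k/n) \binom{n}{k} x^k (1-x)^{n-k} \) and measures the error on a grid. The test function \( |x - 1/2| \), the grid size, and the helper names `bernstein` and `max_error` are choices of mine for illustration.

```python
from math import comb


def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(
        f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def max_error(f, n, samples=201):
    """Largest |f(x) - B_n(f; x)| over an evenly spaced grid on [0, 1]."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


def kink(x):
    # Continuous on [0, 1] but not differentiable at x = 1/2.
    return abs(x - 0.5)


for n in (10, 50, 200):
    print(n, max_error(kink, n))
```

The printed errors shrink as \( n \) grows, though slowly. Bernstein polynomials are valued for the proof more than for the speed of convergence.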