\( 5 = 1^2 + 2^2 \) is the sum of two squares, \( 3 \) is not. Dealing with whole numbers only, including \( 0, \) it's a bit of a riddle coming up with the criterion distinguishing the two situations. Based on empirical investigations, mathematicians in the \( 17^\text{th} \) century found the key. According to Leonard Dickson[1]:
A. Girard (Dec 9, 1632) had already made a determination of the numbers expressible as a sum of two integral squares: every square, every prime \( 4n + 1, \) a product formed of such numbers, and the double of one of the foregoing.
The part about primes \( p \equiv 1 \; (\text{mod} \; 4) \) is central, because a product of two numbers each of which is the sum of two squares is itself the sum of two squares. Since \( 5 = 1^2 + 2^2 \) and \( 13 = 2^2 + 3^2, \) for example, \( 65 = 5 \cdot 13 \) is also the sum of two squares: \( 65 = 4^2 + 7^2. \) In fact there is a second representation: \( 65 = 1^2 + 8^2, \) and the number of representations is of interest too (this exact example is from Diophantus).
Fermat claimed to have a proof for primes \( p \equiv 1 \; (\text{mod} \; 4) \) and Euler proved it conclusively in 1760[2]. Given an integer \( n < 2^{31} = 2,147,483,648, \) this calculator produces all ways of representing it as a sum of two squares. For primes \( p \equiv 1 \; (\text{mod} \; 4), \) there is exactly one solution, but there can be more than one solution (or none) for other values of \( n. \) My approach for a given integer \( n \) is:
1) Determine the prime power representation of \( n. \)
2) Find the one solution for each prime \( p \; | \; n \) with \( p \equiv 1 \; (\text{mod} \; 4). \)
3) Use the Brahmagupta-Fibonacci identity to find all solutions for the highest power of each prime in (2).
4) Use the Brahmagupta-Fibonacci identity to find all solutions for \( n. \)
The Brahmagupta-Fibonacci identity (Brahmagupta henceforth) assures that a product of sums of two squares is itself a sum of two squares:
\[ \begin{align*}
(a^{2}+b^{2})(c^{2}+d^{2})&=(ac+bd)^{2}+(ad-bc)^{2}\\
&=(ac-bd)^{2}+(ad+bc)^{2}.
\end{align*} \]
In general, this method produces the same solution multiple times in steps (3) and (4) considering that Brahmagupta produces two not necessarily new solutions at each step. The calculator algorithm culls duplicate solutions. I don't have a definitive proof that the algorithm generates all solutions, but can testify that it does for all \( n \leq 2000, \) all \( n \) such that \( 2,000,000,000 \leq n \leq 2,000,002,000, \) and thousands of random values in between, never a failure.
The fundamental theorem on sums of two squares is:
Let \( n = 2^k p_1^{a_1} \cdots p_r ^{a_r} q_1^{b_1} \cdots q_s ^{b_s} \), where the \( p_i \) are distinct primes with \( p_i \equiv 1 \; (\text{mod} \; 4) \) and the \( q_i \) are distinct primes with \( q_j \equiv 3 \; (\text{mod} \; 4). \) Then \( n \) is the sum of two squares if and only if all the \( b_j \) are even. In that case, the number of distinct solutions is \( \left \lceil{{1 \over 2}(a_1+1)(a_2+1) \cdots (a_r+1)}\right \rceil, \) where \( \left \lceil{x}\right \rceil \) is the ceiling function, the smallest integer greater than or equal to \( x. \)
See Dummit & Foote, 3rd ed., p 291, for a proof using the Gaussian integers. They distinguish \( 5 = 1^2 + 2^2 \) and \( 5 = (-2)^2 + 1^2 \) as different solutions, and six others too, counting all ways of ordering the two summands and applying plus and minus signs, so they give \( 4 (a_1+1)(a_2+1) \cdots (a_r+1) \) as the total number of solutions. That makes sense in the Gaussian integers, where the eight solutions truly are different, ranging around a circle of radius \( 5, \) two solutions in each quadrant. In my formulation, the ceiling function only comes into play when \( n = m^2 \) is a perfect square, so \( n = 0^2 + m^2 \) can be written in only four ways, not eight.
In programming the calculator, I divided integers into four Fermat types (ft) based on the nature of their prime divisors \( p: \)
ft0: all \( p \equiv 1 \; (\text{mod} \; 4) \)
ft1: \( 2 \) a factor, all other \( p \equiv 1 \; (\text{mod} \; 4) \)
ft2: at least one \( p \equiv 3 \; (\text{mod} \; 4), \) all such to an even power
ft3: at least one \( p \equiv 3 \; (\text{mod} \; 4), \) one or more to an odd power
The first three types have one or more solutions, the fourth type none. Here are the numbers of each type among the first \( 100,000 \) integers:
Total | ft0 | ft1 | ft2 | ft3 |
---|---|---|---|---|
\( 100,000 \) | \( 9,623 \) | \( 10,292 \) | \( 4,113 \) | \( 75,972 \) |
It seems that the proportion of type \( 3 \) integers increases relative to the other types as \( n \) increases (about \( 83\% \) of integers are type \( 3 \) in the vicinity of \( 2,000,000,000 \)). According to the fundamental theorem, powers of \( 2 \) in the prime power factorization of \( n \) are irrelevant for the number of solutions of \( n \) as a sum of two squares. So are powers of prime divisors \( p \equiv 3 \; (\text{mod} \; 4), \) provided those powers are all even, so there are solutions. So suppose \( n = m^2r, \) where \( r \) is square free and has only \( 2 \) and primes \( p \equiv 3 \; (\text{mod} \; 4) \) as divisors. It follows from the foregoing that \( n \) and \( r \) have the same number of solutions. Assuming that \( r = a^2 + b^2: \)
\[ n = m^2r = m^2(a^2 + b^2) = (ma)^2 + (mb)^2. \]
That is, the solutions for \( n \) are exactly \( m \) times the solutions for \( r. \) It's conceivable that the highest power of \( 2 \) dividing \( n \) is not square, so after extracting the square part from the powers of two, one copy of \( 2 \) is left as a divisor of \( r. \) Regarding the number of solutions, then, it suffices to consider \( n = 2s, \) where all the primes \( p \) in the factorization of \( s \) are such that \( p \equiv 1 \; (\text{mod} \; 4) \) and \( s = a^2 + b^2. \) Applying brahmagupta:
\[ \begin{align*}
(a^{2}+b^{2})(1^{2}+1^{2})&=(a \cdot 1 + b \cdot 1)^{2}+(a \cdot 1 - b \cdot 1)^{2}\\
&=(a \cdot 1 - b \cdot 1)^{2}+(a \cdot 1 + b \cdot 1)^{2}\\
&=(a + b)^2 + (a - b)^2
\end{align*} \]
Both versions of brahmagupta reduce to that third line. For example:
\[ \begin{align*}
{5 \cdot 13} &= {1^2 + 8^2} = {4^2 + 7^2}.\\
\therefore {2 \cdot 5 \cdot 13} &= {9^2 + 7^2} = {11^2 + 3^2}.
\end{align*} \]
In both cases, those are the only solutions.
Professor Stan Wagon gives and proves the algorithm for determining the (one and only) decomposition of a prime \( p \equiv 1 \; (\text{mod} \; 4) \) as a sum of two squares \( p = \alpha^2 + \beta^2, \)[3] introduced by the observation that "Fortunately, this problem can be solved by a very fast and easy-to-program algorithm." Only 750 lines of PHP code in my case, a few hundred more for debugging and diagnostics. He describes the algorithm as follows:
Now, let \( p \) be a prime of the form \( 4k + 1. \) The algorithm has two steps:
A. Find \( x \) such that \( x^2 \equiv -1 \; (\text{mod} \; p). \)
B. Apply the Euclidean algorithm to \( p \) and \( x; \) the first two remainders that are less than \( \sqrt{p} \) may be taken as \( \alpha \) and \( \beta.\)
Step A is easy in practice: First find \( c, \) a quadratic nonresidue of \( p; \) recalling Euler's criterion (which states that for \( a \) not divisible by \( p, a \) is (resp., is not) a quadratic residue \( \text{mod} \; p \) iff \( a^{{(p-1)/2}} \equiv +1 \) (resp., \( -1 \)) mod \( p \)), we have that \( c^{2k} \equiv -1 (\text{mod} \; p), \) whence \( x \) may be taken to be \( c^k \; \text{mod} \; p. \)
An integer \( r \) is called a quadratic residue \( \text{mod} \; n \) (or a quadratic residue of \( n \)) if it is a perfect square \( \text{mod} \; n; \) i.e., if there is an integer \( x \) such that \( x^{2}\equiv r{\pmod {n}}. \) If there is no such \( x, \) then \( r \) is a quadratic non-residue \( \text{mod} \; n. \) A prime \( p \) has exactly \( {p-1} \over 2 \) residues and \( {p-1} \over 2 \) non-residues, excluding \( 0. \) The law of quadratic reciprocity says that if at least one of two primes \( p \) and \( q \) is congruent to \( 1{\pmod {4}}, \) then \( p \) is a quadratic residue of \( q \) if and only if \( q \) is a quadratic residue of \( p; \) that is, \( x^{2}\equiv p{\pmod {q}} \) is solvable if and only if \( x^{2}\equiv q{\pmod {p}} \) is.
Let's work through the algorithm for \( p = 2,000,000,089, \) a prime congruent to \( 1 \; \text{mod} \; 4 \) and therefore subject to quadratic reciprocity (large primes shown on request here). We want a quadratic non-residue of \( p. \) Could \( 5 \) be one? Well, quadratic reciprocity says \( 5 \) is a a quadratic residue of \( p \) if and only if \( p \) is a quadratic residue of \( 5; \) that is, if and only if there is an \( x \) such that:
\[ p \equiv x^{2}{\pmod {5}}. \]
This becomes simple when \( p \) is reduced \( \text{mod} \; 5 \) to the equivalent congruence:
\[ 4 \equiv x^{2}{\pmod {5}}. \]
This last is obviously solvable with \( x = 2, \) from which it follows that \( 4 \) and therefore \( p \) is a quadratic residue of \( 5 \) and so \( 5 \) is a quadratic residue of \( p \) and therefore not suitable as a value of \( c. \) Similarly, \( 2 \) and \( 3 \) are quadratic residues of \( p. \) Let's try \( 7. \) The same calculations as for \( 5 \) show that \( 7 \) is a quadratic residue of \( p \) if and only if this equation has a solution:
\[ \begin{equation}{3 \equiv x^{2}{\pmod {7}}.} \tag{1} \end{equation} \]
But going through all the possibilities for \( x \) (there are only \( 7 \)) shows that (1) has no solution and therefore \( 7 \) is a quadratic non-residue of \( p \) and we can put \( c = 7. \) Also \( k = {{n-1} \over 4} = 500,000,022, \) so it is simply (!) a matter of calculating:
\[ x = c^k = 7^k {\pmod {p}}. \]
So a nine digit exponent, all calculations mod a ten digit value to keep things manageable. That would be a bit of a task in 32-bit PHP, where the integer limit is just a bit higher than \( p. \) PHP's BCMath library for arbitrary precision calculations comes to the rescue, built into PHP since PHP 4.0.4. A near-instantaneous BCMath calculation shows that \( x = 377,944,462 \) (double-check at WolframAlpha). Next is to execute the Euclidean algorithm, stopping when turning up the first two values less than \( \sqrt{p} = \sqrt{2,000,000,089} = 44,721.36 : \)
\[ \begin{align*}
p &= 5 \cdot x + 110,277,779\\
x &= 3 \cdot 110,277,779 + 47,111,125\\
110,277,779 &= 2 \cdot 47,111,125 + 16,055,529\\
47,111,125 &= 2 \cdot 16,055,529 + 15,000,067\\
16,055,529 &= 1 \cdot 15,000,067 + 1,055,462\\
15,000,067 &= 14 \cdot 1,055,462 + 223,599\\\
1,055,462 &= 4 \cdot 223,599 + 161,066\\
223,599 &= 1 \cdot 161,066 + 62,533\\
161,066 &= 2 \cdot 62,533 + \color{red}{36,000}\\
62,533 &= 1 \cdot 36,000 + \color{red}{26,533}
\end{align*} \]
The colored values in the last two lines are what we're seeking: \( p = 2,000,000,089 = \) \( \color{red}{26,533}^2 + \color{red}{36,000}^2, \) and that is the only way of representing \( p \) as the sum of two squares.
This subject shows how important the prime numbers are when a multiplicative property like the Brahmagupta-Fibonacci identity is in play; proving a property for primes can be the key step in extending it to all integers. In this case, the property is whether an integer is representable as a sum of two squares. The property holds for \( 2, \) for primes congruent to \( 1 {\pmod {4}}, \) and for squares — therefore it holds for products of all of these. The property does not hold for primes congruent to \( 3 {\pmod {4}}, \) because for all natural numbers \( n, \; n^2 \equiv 0, 1 \; (\text{mod } 4), \) so \( n^2 + m^2 \equiv 0, 1, 2 \; (\text{mod} \; 4) \) and a sum of two squares cannot be congruent to \( 3 \; (\text{mod} \; 4). \) Two propositions from Euler show how knowing the situation for primes leads to broader understanding. Euler demonstrated these propositions in 1758[4] on his way to proving that primes congruent to \( 1 \; (\text{mod} \; 4) \) can be written as the sum of two squares:
Proposition 1. If the product \( pq \) is a sum of two squares and one factor \( p \) is a prime number and itself a sum of two squares, then the other factor \( q \) will also be a sum of two squares.
Proposition 4. If \( a \) and \( b \) are relatively prime, then every factor of \( a^2 + b^2 \) is itself the sum of two squares
Write the prime power factorization of positive integer as \( n = 2^s uvw, \) where \( u \) contains all the primes congruent to \( 1 \; (\text{mod} \; 4), v \) contains primes congruent to \( 3 \; (\text{mod} \; 4) \) to the highest even power possible (so \( v \) is a square), and \( w \) contains left-over copies of primes congruent to \( 3 \; (\text{mod} \; 4) \) to the first power. From the foregoing, \( n \) is a sum of two squares if \( w = 1, \) that is, if \( n = 2^s uv \) is a product of \( 2 \)s, primes congruent to \( 1 {\pmod {4}}, \) and squares. We want to show that contrarily, if \( w \neq 1, \) then \( n \) is not a sum of two squares, proving the fundamental theorem.
So suppose \( n = 2^s uvw \) as suggested, where \( w = q' q'' q''' \cdots, \) the \( q \)'s are one or more different primes congruent to \( 3 \; (\text{mod} \; 4), \) and \( n \) is a sum of two squares. By stripping off \( 2 \)s and primes in \( u \) one at a time, Proposition 1 implies that \( vw \) is a sum of two squares:
\[ \begin{equation}{vw = a^2 + b^2.} \tag{2} \end{equation} \]
If \( a \) and \( b \) have a common factor \( t, \) then \( t^2 \; | \; a^2 + b^2 = vw = vq'q''q''' \cdots, \) where \( v \) contains primes congruent to \( 3 \; \text{mod} \; 4, \) all to an even power, possibly but not necessarily including some of \( q', q'', \) and the others listed to the first power. When \( t^2 \) is divided into \( vq'q''q''' \cdots, \) there will be cancellation with primes in \( v, \) but \( q', q'', \cdots \) will be left intact, so dividing \( t^2 \) out of (2) results in:
\[ \begin{equation}{v'w = a'^2 + b'^2,} \tag{3} \end{equation} \]
where \( a' \) and \( b' \) will be relatively prime if \( t = \text{gcd}(a, b). \) Then Proposition 4 says that the primes \( q \) dividing \( w \) are each a sum of two squares. But that's impossible, because all those primes are congruent to \( 3 {\pmod {4}}. \) Therefore the original \( n \) cannot be a sum of two squares. That finishes the proof of the fundamental theorem, at least the part determining whether an integer is representable as the sum of two squares. QED.
Programming a problem like this is a good way to become immersed in it, sharpening programming skills at the same time. Unit testing forces you to address hundreds of specific situations. Plus, what better way to make the problem your own than implementing an algorithm to solve it. For example, here is the header of the function called for \( 2 \) and other primes congruent to \( 1 {\pmod {4}} \) dividing a proposed \( n: \)
function getAllSumsOfPowers(&$q, $highPower) { // USE: Loads pairs whose sum of squares equals a power of a prime p. // IN: $q front-loaded with prime p ($q[0]) and two values the sum // of whose squares equals p ($q[1]). // $highPower = highest power of p for which to load values // RET: $q loaded with pairs of values the sum of whose squares equals powers of p // NOTE: Assuming p = 2 OR p = 1 (mod 4), so it is sum of two squares in a unique way. // Loads *all* the powers up to (and including) $highPower. . . }
Using function brahmagupta($a, $b, $c, $d) repeatedly, getAllSumsOfPowers() builds up solutions for all powers of a given prime up to the highest power dividing \( n, \) output like this if the prime is \( 5 \) and \( 5^4 \) is the highest power of \( 5 \) dividing \( n: \)
Array ( [0] => 5 [1] => Array ( [0] => Array(1, 2) ) [2] => Array ( [0] => Array(0, 5) [1] => Array(3, 4) ) [3] => Array ( [0] => Array(2, 11) [1] => Array(5, 10) ) [4] => Array ( [0] => Array( 0, 25) [1] => Array( 7, 24) [2] => Array(15, 20) ) )
In this case, only the last subarray is needed, showing that \( 5^4 = 625 = {0^2 + 25^2} = \) \( {7^2 + 24^2} = \) \( {15^2 + 20^2}, \) but the progressive brahmagupta approach requires generating each step to get to the next. Note that for these powers of \( 5, \) the number of solutions is as expected from the fundamental theorem, namely \( \left \lceil{{1 \over 2}(power+1)}\right \rceil. \)
My hosting service provides 64-bit PHP, but the calculator as written bogs down in the vicinity of 14 decimal digits (dump PHP variable PHP_INT_SIZE to see how many bytes are used to represent integers — PHP_INT_SIZE = 4, for example, means that integers are stored in four bytes = 32 bits). So I've disallowed input higher than \( 2^{31} - 1, \) about \( 2,000,000,000. \) Some thought on the maximum number of solutions in this range is in order; too many solutions would overwhelm the popup colorbox. The formula for the number of solutions shows that the best way to maximize the number of solutions is by tacking on primes congruent to \( 1 {\pmod {4}} \) to the first power, so \( n = 48,612,265 = \) \( 5 \cdot 13 \cdot 17 \cdot 29 \cdot 37 \cdot 41, \) for example, has \( 32 \) solutions, and no lower value has that many. Multiply by \( 25 \) to get \( 1,215,306,625 \) for \( 64 \) solutions, and that is the maximum number of solutions for \( n < 2^{31}. \)
The calculator's bottleneck is factoring, which depends on this naive routine:
function isPrime($n) { // USE: Says whether $n is prime. // IN: $n = number to check // RET: true if $n is prime, false if not // Make sure $n is whole number greater than 1. if (!ctype_digit((string)$n) or ($n < 2)) return false; // Done if $n = 2. if($n == 2) return true; // If get here, $n is whole number greater than 2. // Check for even. if($n % 2 == 0) return false; // If get here, $n is odd whole number greater than 2. // If $n divisible by odd number less than sqrt($n), not prime. $lim = ceil(sqrt($n)); for ($i = 3; $i <= $lim; $i += 2) if($n % $i == 0) return false; // If get this far, we have a prime! return true; }
My host executes isPrime() in about \( 0.08 \) seconds for \( n \sim 2 \times 10^9, \) consistently so no matter how many factors \( n \) has. Around \( 2 \times 10^{11}, \) the time is about \( 1.4 \) seconds, which wouldn't be too bad for a web app, but above that, times increase even faster than by the expected factor of \( \sqrt{10} \) every time \( n \) increases ten-fold (the prime check goes up to \( \sqrt{n} \)). Around \( 2 \times 10^{12}, \) about \( 8 \) seconds is required, and around \( 2 \times 10^{13}, \) the thirty second timeout often kicks in. There are some good tips here for improved prime-finding algorithms, lists of large primes too.
It's a long way to go from \( 5 = 1^2 + 2^2 \) to \( 20,000,000,000,021 = 2,873,161^2 + 3,427,090^2, \) but Euler proved that the latter is as true as the former and I have it on good authority that there are still larger primes, some of them no doubt congruent to \( 1 {\pmod {4}}. \)
Mike Bertrand
August 25, 2016
^ 1. History of the Theory of Numbers, Vol. II, by Leonard Dickson (Carnegie Institute of Washington, 1920). Chapter VI takes up the sum of two squares problem in detail.
^ 2. De numeris qui sunt aggregata duorum quadratorum (On Numbers which are the Sum of Two Squares), by Leonhard Euler, Opera Omnia, 2, p 295-327. Originally published in 1758.
^ 3. The Euclidean Algorithm Strikes Again, by Stan Wagon, The American Mathematical Monthly, 97(2), (Feb 1990), p. 125-129.
^ 4. Demonstratio theorematis Fermatiani omnem numerum primum formae \( 4n + 1 \) esse summam duorum quadratorum (Demonstration of a Theorem of Fermat That Every Prime Number of the Form \( 4n + 1 \) is the Sum of Two Squares), by Leonhard Euler, Opera Omnia, 2, p 328-337. Originally published in 1760.