The Erdös-Ginzburg-Ziv Theorem

I've tried my hand at a few of Peter Winkler's stumpers in Mathematical Puzzles^[1] with limited success when I lighted on this innocuous-looking one on page 43:

Even Split. Prove that from every subset of \(2n\) integers, you can choose a subset of size \(n\) whose sum is divisible by \(n\).

Aha! One I understand and can no doubt address with some application. Sometimes you can get an idea of the general solution by working through the problem for small \(n\). \(n=1\) and \(n=2\) are easy and \(n=3\) isn't too bad, but \(n=4\) leads to a rat's nest of special cases. I knew trouble was imminent after trying Gemini ("Prove that from every set of 8 integers, you can always choose a subset of size 4 whose sum is divisible by 4"). Several things jumped out:

• It's possible to feel sorry for Gemini as it engages in some world class wheel-spinning.
• Examining cases for small \(n\) brings zero illumination.
• The problem is well known as the Erdös-Ginzburg-Ziv Theorem (EGZ).^[2]

Ok, let's check out the hint on page 66, which simply says, "Prove it first when \(n\) is prime." This didn't help, except to suggest that there is a proof to the effect:

• The theorem holds for prime numbers.
• If the theorem holds for \(n=a\) and \(n=b\), then it holds for \(n=ab\).

The theorem follows for all positive integers if these two statements are true, considering that every positive integer is a product of primes. I recommend reading Winkler's abbreviated proof for the strategy, then studying the following fleshed-out account for the details.
.

EGZ For \(a\) And \(b\) Implies EGZ For \(ab\)

Define a set of integers to be “flat\(_n\)” if the elements of the set sum to \(0 \pmod{n}\). Then these three propositions are equivalent (the first is the original proposition, the third the Erdös-Ginzburg-Ziv Theorem):

\(\Gamma_n: \;\;\) Every set of \(2n\) integers contains a flat\(_n\) subset of size \(n\).
\(\Delta_n: \;\; \) A flat\(_n\) set of \(2n\) integers can be split into two disjoint flat\(_n\) subsets of size \(n\).
\(\Omega_n: \;\;\) Every set of \(2n-1\) integers contains a flat\(_n\) subset of size \(n\).

\(\Gamma_n \Rightarrow \Delta_n: \;\;\) Suppose \(\Gamma_n\) holds and \(S\) is a flat\(_n\) set of size \(2n\). Then \(\Gamma_n\) implies that \(S\) has a flat\(_n\) subset \(A\) of size \(n\). Suppose \(S = \{a_1, \cdots a_n, a_{n+1}, \cdots a_{2n}\}\), where \(A = \{a_1, \cdots a_n\}\) is flat\(_n\) and \(B = \{a_{n+1}, \cdots a_{2n}\}\) Then:
\begin{align*}
\sum_{a_i \in S} a_i &= \sum_{a_i \in A} a_i + \sum_{a_i \in B} a_i.
\end{align*}
The first and second sums in this equation are flat\(_n\), that is, their terms sum to \(0 \pmod{n}\). It follows that the same holds for the third sum, showing that \(B\) is flat\(_n\) and that \(\Delta_n\) holds.

\(\Delta_n \Rightarrow \Omega_n: \;\;\) Suppose \(\Delta_n\) holds and let \(T\) be any set of size \(2n-1\). Let \(S = T \cup \{a\}\), where \(a\) is an integer which, when added to the elements of \(T\), results in the final sum equaling \(0 \pmod{n}\). \(S\) is a flat\(_n\) set of size \(2n\), so \(\Delta_n\) implies that \(S\) can be split into two disjoint flat\(_n\) sets \(A\) and \(B\) of size \(n\). One of them does not contain \(a\) and that one is a flat\(_n\) subset of \(T\) of size \(n\), showing that \(\Omega_n\) holds.

\(\Omega_n \Rightarrow \Gamma_n:\;\;\) Obvious.

Suppose \(a, b \geq 2\) and \(\Delta_a\) and \(\Delta_b\) hold. The goal is to prove \(\Delta_{ab}\). I found it helpful to go through the proof with \(a=2, b=3\) to appreciate the logic. Suppose \(S\) is a flat\(_{ab}\) set of integers of size \(2ab\). A flat\(_{ab}\) set is flat\(_a\), since if a sum is a multiple of \(ab\), then it is certainly a multiple of \(a\). Let \(T\) be any subset of \(S\) of size \(2a\) and \(T'\) its complement in \(S\), so:
\begin{align*}
S = T \cup T', \text{ where } |T| = 2a.
\end{align*}
By \(\Gamma_a\), \(T\) has a flat\(_a\) subset \(S_1\) of size \(a\); let \(S_1'\) be the complement of \(S_1\) in \(S\). Then:
\begin{align*}
S = S_1 \cup S_1', \text{ where } |S_1| = a, \;\; S_1 \text{ is flat}_a, \;\; |S_1'| = 2ab - a = (2b-1)a.
\end{align*}
Because \(S\) and \(S_1\) are flat\(_a\), \(S_1'\) is flat\(_a\) as well, so the same analysis can be applied to \(S_1'\), leading to:
\begin{align*}
S = S_1 \cup S_2 \cup S_2', \text{ where } |S_1| = |S_2| = a, \;\; S_1, S_2 \text{ are flat}_a, \;\; |S_2'| = (2b-2)a.
\end{align*}
This can continue until \(S\) is exhausted, resulting in:
\begin{align*}
S = S_1 \cup S_2 \cup \cdots \cup S_{2b}, \text{ where } |S_i| = a, \;\; S_i \text{ are flat}_a, \;\; S_i \text{ disjoint}.
\end{align*}
\(\sum_{s \in S_i} s = a \cdot b_i\) for integers \(b_i\) since \(S_i\) is flat\(_a\) (which means that the sum of its elements is a multiple of \(a\)). Let \(B = \{b_i\}\). Then \(a \sum b_i = \sum ab_i = \sum_{s \in S} s = 0 \pmod{ab}\) because \(S\) is flat\(_{ab}\). That is:
\begin{align*}
a \sum b_i &= ab \cdot k\\[0.3em]
\sum b_i &= bk\\[0.3em]
\therefore \sum b_i &\equiv 0 \!\!\! \pmod{b}.
\end{align*}
\(B\) is a flat\(_b\) set of \(2b\) integers, so by \(\Delta_b\), \(B\) can be partitioned into two flat\(_b\) sets of size \(b\):
\begin{align*}
B &= B_1 \cup B_2, \text{ where } |B_1| = |B_2| = b, \;\; B_1, B_2 \text{ flat}_b.
\end{align*}
Let \(T_1\) be the union of the \(S_i\) such that \(b_i \in B_1\) and similarly for \(T_2\):
\begin{align*}
T_1 = \bigcup_{b_i \in B_1} S_i, \hspace{20pt} T_2 = \bigcup_{b_i \in B_2} S_i.
\end{align*}
Then each \(T_i\) is a union of \(b\) disjoint sets, each of size \(a\), so \(|T_i| = ab = \tfrac{1}{2}(2ab)\;\). \(T_1\) and \(T_2\) are disjoint because \(B_1\) and \(B_2\) are. Furthermore:
\begin{align*}
\sum_{t \in T_1} t &= \sum_{b_i \in B_1} ab_i = a\sum_{b_i \in B_1} b_i \equiv a \cdot b \equiv 0 \!\!\!\pmod{ab}.
\end{align*}
The same holds for \(T_2\), so both \(T_1\) and \(T_2\) are flat\(_{ab}\). The desired partition of the original set \(S\) has been found, so proposition \(\Delta_{ab}\) is established and the first part of the proof is done.

A moment's reflection might be in order to acknowledge the accomplishment. Since the cases \(n=2\) and \(n=3\) are straightforward, the theorem follows for \(n=4, 6, 8, 9, 12\) and any other integer divisible by only 2 or 3. Gemini provided a proof for \(n=4\), but it wasn't pretty. Even \(n=5\) was too much for them, so \(n=6\) is out of the question. Brute force approaches with Python also start to fail at about that point.

EGZ Holds for Primes

The next task is to show that \(\Delta_p\) holds for prime \(p\).

Given a flat\(_p\) set \(S\) of size \(2p\), where \(p\) is prime, the object is to find a flat\(_p\) subset of size \(p\). Order the elements of \(S\) in ascending order mod \(p\) and consider pairs of residues separated by \(p\) units:
\begin{align*}
(\color{red}{\boldsymbol{x_1}}, \boldsymbol{x_2}, \color{forestgreen}{\boldsymbol{x_3}}, \cdots, x_p, \color{red}{\boldsymbol{x_{p+1}}}, \boldsymbol{x_{p+2}}, \color{forestgreen}{\boldsymbol{x_{p+3}}}, \cdots, x_{2p}).
\end{align*}
There are \(p\) pairs:
\begin{align*}
(\color{red}{\boldsymbol{x_1}}, \color{red}{\boldsymbol{x_{p+1}}}), \; (\boldsymbol{x_2}, \boldsymbol{x_{p+2}}), \; (\color{forestgreen}{\boldsymbol{x_3}}, \color{forestgreen}{\boldsymbol{x_{p+3}}}), \cdots,
(x_p, x_{2p}).
\end{align*}
If any pair has the same value in each slot, then there is a run of \(p+1\) identical values in the original list. \(p\) of the integers with that residue constitute a flat\(_p\) subset of \(S\) because the sum of \(p\) copies of the same residue is congruent to \(0\) mod \(p\). So we can assume that each pair contains different residues.

For \(k \geq 1\), let \(A_k\) be all possible sums mod \(p\) obtained by taking one element from each of the first \(k\) pairs. For \(k=1\), it's a matter of choosing one of the elements from \((\color{red}{\boldsymbol{x_1}}, \color{red}{\boldsymbol{x_{p+1}}})\), so \(A_1 = \{x_1,x_{p+1}\}\). For \(k=2\), there are four ways to compose sums, so
\begin{align*}
A_2 = \{\color{red}{\boldsymbol{x_1}} + \boldsymbol{x_2}, \; \color{red}{\boldsymbol{x_1}} + \boldsymbol{x_{p+2}}, \; \color{red}{\boldsymbol{x_{p+1}}} + \boldsymbol{x_2}, \; \color{red}{\boldsymbol{x_{p+1}}} + \boldsymbol{x_{p+2}}\}.
\end{align*}
Generally there won't be four different values in \(A_2\) due to duplication. In like manner, there are \(2^k\) ways to compose the sums making up \(A_k\), but of course \(A_k\) has many fewer elements than that because of duplication. The task is to show that there isn't too much duplication by showing that there is a \(k\) such that \(|A_k| = p\). In fact, \(|A_k| < |A_{k+1}|\) as long as \(|A_k| < p\). To see this, note that by construction:
\begin{align*}
A_{k+1} = (A_k + x_{k+1}) \cup (A_k + x_{k+1+p}),\tag{1}
\end{align*}
where the parenthesized expressions are set translates mod \(p\):
\begin{align*}
(A_k + x_{k+1}) = \{(a + x_{k+1}) \!\!\! \pmod{p}\}_{a \in A_k}.
\end{align*}
The elements of \(\{A_k\}\) are different, so the elements of their translates are different as well by the nature of arithmetic in \(\mathbb{Z}_p\). It follows that each of the translates in (1) has \(|A_k|\) elements and that \(|A_k| \leq |A_{k+1}|\); it remains to show that this inequality is strict when \(|A_k| < p\). If either of the translates in (1) has an element not in the other, then (1) implies that \(|A_k| < |A_{k+1}|\), as desired. Otherwise:
\begin{align*}
A_k + x_{k+1} &= A_k + x_{k+1+p}\\
\therefore A_k &= A_k + (x_{k+1+p} - x_{k+1})\\
A_k &= A_k + t, \hspace{8pt} \text{where } 0 \neq t = x_{k+1+p} - x_{k+1}.\tag{2}
\end{align*}
Assuming that \(A_k \subseteq \mathbb{Z}_p\) is not empty, (2) implies that \(A_k = \mathbb{Z}_p\)! I didn't see this at first but Gemini helped me see it, inadequate to the general task though Gemini was.

Lemma. If \(A \subseteq \mathbb{Z}_p\) where \(p\) is prime, and \(A = A + t\) where \(A\) is not empty and \(t \neq 0\), then \(A = \mathbb{Z}_p\).

Proof. Let \(a\) be any element of \(A\). Then \(a + t \in (A + t) = A\), and so \((a+t) + t \in A + t = A\). Continuing in this fashion:
\begin{align*}
a & \in A\\
a + t & \in A\\
a +2t & \in A\\
a +3t & \in A\\
& \vdots\\
a +(p-1)t & \in A\\
\end{align*}
This lists \(p\) elements in \(A\) and they are all different. Were there any doubt about this, proceed as follows:
\begin{align*}
a + jt & \equiv (a + kt) \!\!\! \pmod{p}, \hspace{10pt} \text{where } 0 \leq j, k \leq p-1\\
\therefore jt & \equiv kt \!\!\! \pmod{p}\\
j & \equiv k \!\!\! \pmod{p}
\end{align*}
The last step is where \(p\) prime comes into play. \(\mathbb{Z}_p\) is a field and \(t \neq 0\) in \(\mathbb{Z}_p\), so \(t\) can be cancelled. QED.

So \(2 = |A_2| < |A_3| < |A_4| < \cdots < |A_s| = p\) for some \(s\). Therefore all residues mod \(p\) are in \(A_s\) and in particular, \(0 \in A_s\), so the sum of those residues is congruent to \(0\) mod \(p\) — that is, the integers represented by those residues sum to a multiple of \(p\) and that subset of \(S\) is flat\(_p\). This completes the proof of the Erdös-Ginzburg-Ziv Theorem.

Mike Bertrand
July 1, 2025

^ 1. Mathematical Puzzles, Revised Edition, by Peter Winkler, CRC Press (2024), ISBN 978-1-03270-848-5. Original question on p. 43 (Even Split), hint on p. 66, solution on pp. 141-142.

^ 2. "Theorem in the additive number theory", by P. Erdös, A. Ginzburg, A. Ziv, Israel Research and Development Nat. Council Bull., Sect. F, 10 (1961), pp. 41–43.

Main menu

Navigation

The Erdös-Ginzburg-Ziv Theorem

Tags:

Main menu

You are here

Navigation

The Erdös-Ginzburg-Ziv Theorem

Tags: