Bertrand's ballot theorem
Bertrand's ballot theorem
In combinatorics, Bertrand's ballot problem is the question: "In an election where candidate A receives p votes and candidate B receives q votes with p > q, what is the probability that A will be strictly ahead of B throughout the count?" The answer is
In Bertrand's original paper, he sketches a proof based on a general formula for the number of favourable sequences using a recursion relation. He remarks that it seems probable that such a simple result could be proved by a more direct method. Such a proof was given by Désiré André,[3] based on the observation that the unfavourable sequences can be divided into two equally probable cases, one of which (the case where B receives the first vote) is easily computed; he proves the equality by an explicit bijection. A variation of his method is popularly known as André's reflection method, although André did not use any reflections.[4]
Example
Suppose there are 5 voters, of whom 3 vote for candidate A and 2 vote for candidate B (so p = 3 and q = 2). There are ten possibilities for the order of the votes cast:
AAABB
AABAB
ABAAB
BAAAB
AABBA
ABABA
BAABA
ABBAA
BABAA
BBAAA
For the order AABAB, the tally of the votes as the election progresses is:
Candidate | A | A | B | A | B |
A | 1 | 2 | 2 | 3 | 3 |
B | 0 | 0 | 1 | 1 | 2 |
For each column the tally for A is always larger than the tally for B so the A is always strictly ahead of B. For the order AABBA the tally of the votes as the election progresses is:
Candidate | A | A | B | B | A |
A | 1 | 2 | 2 | 2 | 3 |
B | 0 | 0 | 1 | 2 | 2 |
For this order, B is tied with A after the fourth vote, so A is not always strictly ahead of B. Of the 10 possible orders, A is always ahead of B only for AAABB and AABAB. So the probability that A will always be strictly ahead is
Equivalent problems
Another equivalent problem is to calculate the number of random walks on the integers that consist of n steps of unit length, beginning at the origin and ending at the point m, that never become negative. Assuming n and m have the same parity and n ≥ m ≥ 0, this number is
Proof by reflection
- the probability of sequences that tie at some pointthe probability of sequences that tie at some point and begin with A or B
Proof by induction
Another method of proof is by mathematical induction:
We loosen the condition to . Clearly, the theorem is correct when , since in this case the first candidate will not be strictly ahead after all the votes have been counted (so the probability is 0).
Clearly the theorem is true if p > 0 and q = 0 when the probability is 1, given that the first candidate receives all the votes; it is also true when p = q > 0 as we have just seen.
Assume it is true both when p = a − 1 and q = b, and when p = a and q = b − 1, with a > b > 0. (We don't need to consider the case here, since we have already disposed of it before.) Then considering the case with p = a and q = b, the last vote counted is either for the first candidate with probability a/(a + b), or for the second with probability b/(a + b). So the probability of the first being ahead throughout the count to the penultimate vote counted (and also after the final vote) is:
And so it is true for all p and q with p > q > 0.
Proof by permutation
Bertrand's and André's proofs
Bertrand expressed the solution as
and thus the required probability is
as expected.
Variant: ties allowed
The original problem is to find the probability that the first candidate is always strictly ahead in the vote count. One may instead consider the problem of finding the probability that the second candidate is never ahead (that is, with ties are allowed). In this case, the answer is
Represent a voting sequence as a lattice path on the Cartesian plane as follows:
Start the path at (0, 0)
Each time a vote for the first candidate is received move right 1 unit.
Each time a vote for the second candidate is received move up 1 unit.
Each such path corresponds to a unique sequence of votes and will end at (p, q). A sequence is 'good' exactly when the corresponding path never goes above the diagonal line y = x; equivalently, a sequence is 'bad' exactly when the corresponding path touches the line y = x + 1.
In fact, the solutions to the original problem and the variant problem are easily related. For candidate A to be strictly ahead throughout the vote count, they must receive the first vote and for the remaining votes (ignoring the first) they must be either strictly ahead or tied throughout the count. Hence the solution to the original problem is
as required.