andrewt.net

Carnival of Mathematics 98

Hello!

First of all, regular readers confused by suddenly arriving at #98 in a series you’ve never seen before should read this page to find out what’s going on. In short, it’s a monthly collection of user-submitted maths blogging, hosted by a different blog each month.

Everyone else, welcome to the 98th Carnival of Mathematics! As per tradition, here are some facts about the number 98:

  • 98 is 77 in base 13, and also $7^2+7^2$. (This sounds impressive but all numbers of the form $2n^2$ do that in base $2n-1$.)
  • 98 is a Wedderburn-Etherington number, which is to say the number of weakly binary trees you can draw with a given number of nodes. The first three such numbers are 1. Then it grows. Fast.
  • 98 is a nontotient. (Don’t Google that. Trust me.)
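If you would rather check two of those facts than take them on trust, here is a minimal Python sketch. The helper names (`to_base`, `totients_up_to`, `is_nontotient`) are my own, and the nontotient search relies on the standard bound φ(x) ≥ √(x/2), which means any x with φ(x) = n can be at most 2n².

```python
def to_base(n, base):
    """Digits of n in the given base, most significant first."""
    digits = []
    while n:
        n, remainder = divmod(n, base)
        digits.append(remainder)
    return digits[::-1] or [0]


def totients_up_to(limit):
    """Sieve Euler's totient phi(x) for every x from 0 to limit."""
    phi = list(range(limit + 1))
    for p in range(2, limit + 1):
        if phi[p] == p:  # untouched so far, so p is prime
            for multiple in range(p, limit + 1, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def is_nontotient(n):
    """True if no x has phi(x) == n (searching up to the 2*n*n bound)."""
    phi = totients_up_to(2 * n * n)
    return n not in phi[1:]


print(to_base(98, 13))       # the two digits of 98 in base 13
print(is_nontotient(98))     # whether 98 is missing from the totient values
```

The same `to_base` call confirms the pattern in the first bullet: $2n^2$ written in base $2n-1$ is the digit n twice, for any n of 2 or more.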

Maths and things

Patrick Honner has been given a speeding ticket and has used maths to prove that he definitely did the crime for at least a moment. It seems unlikely that either the cops or a judge has thought it through this far, though, so it might be worth arguing the toss. It’s worked before.

Tallys Yunes has devised an alternative set of times for microwaving food such that the ten digits are used roughly equally. I suppose in theory it could help protect your microwave from wear and tear, but really it’s delightfully pointless. My old microwave would accept times like 2:90, so maybe we could take this work even further…

Niles Johnson has blogged a gallant attempt to draw some shapes that presumably live in 8-dimensional space. Since my grounding in such things extends basically to having nearly completed Antichamber, I’m afraid I don’t fully understand it.

Oluwasanya Awe has submitted a nice proof for a puzzle in the Maths Olympiad.

Stijn Oomes submitted a post on ‘rational trigonometry’ which is rather nice maths. I can’t see the use for it, but then, when has that ever made maths worse?

Richard Elwes has blogged about totients, handily explaining what “nontotients” are for anyone who obediently didn’t Google them earlier. The formula for finding them quickly is neat — and sort of obvious in retrospect so maybe try to find it before he tells you.

Wolfram’s blog has a good writeup of the Ramanujan Gap recently filled using their software. The Gap is a handful of missing solutions from Ramanujan’s old notebook. It has some nice 2D colour graphs in it too.

Lastly, here is a post by Mohammed Ladak about how intuitive and obvious the rules of Set are, which makes our attempts to explain it at MathsJam all the more pitiful.

Resources, books and so on

Colleen Young has found a nice set of stats resources, and Justin Lanier has found some more general-interest ones including a website for playing the Four Fours game (and variants thereupon). Frederick Koh has sent a quiz about vectors and has some others on the site so they might be useful for anyone learning such things.

John Hunter has written an obituary of George Box, and Shecky Riemann has reviewed a book about P and NP. We also have a list of maths ladies you should follow on Twitter.

Colin Beveridge tells us about the Mathematical Ninja’s ten favourite numbers, and Christian Perfect sent in a blog by Adriana Salerno who has found x in a museum.

Lastly, Katie sent me a post by Mr Reddy with photos of all the things in his Maths Cupboard. It’s best played as a sort of maths-nerd version of the Conveyor Belt round from the Generation Game.

I’m glad I’m not the only one with too much maths to take this seriously.

I saw this on xkcd the other day:

What’s always bothered me about this claim is that everyone knows bacteria grow exponentially. If you kill 99.99% of them, they need only double about 13.3 times before the population returns to its original size. Under optimal conditions, that’s roughly 133 minutes. That’s only as long as it takes to watch Donnie Darko or have a perfect Christmas. If you wash your hands, kitchen or whatever less often than that, the bacteria will win.

And I’m pretty sure Dettol only claims to kill 99.9% of germs. That might not sound like much, a difference of 0.09%, but it means the bacteria need only about ten doublings, or roughly 100 minutes, to claw their way back. That’s over half an hour you’ve lost.

But I can see how their marketing people thought “kills 99.99% of all germs” sounded better than “buys you an hour and a half in your doomed battle against the unstoppable onslaught of disease”.
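The arithmetic behind those figures can be sketched in a few lines of Python. The ten-minute doubling time is an assumption, implied by 13.3 doublings taking about 133 minutes, and the function names are mine.

```python
import math

DOUBLING_MINUTES = 10  # assumption: implied by 13.3 doublings in about 133 minutes


def doublings_to_recover(kill_percent):
    """Doublings needed for the survivors to regrow to the original population."""
    survivors = 1 - kill_percent / 100
    return math.log2(1 / survivors)


def minutes_to_recover(kill_percent, doubling_minutes=DOUBLING_MINUTES):
    """Recovery time if every doubling takes doubling_minutes."""
    return doublings_to_recover(kill_percent) * doubling_minutes


for kill in (99.99, 99.9):
    print(f"{kill}% killed: {doublings_to_recover(kill):.1f} doublings, "
          f"about {minutes_to_recover(kill):.0f} minutes")
```

Each extra 9 on the label buys log2(10), or about 3.3, more doublings, which at this assumed rate is a little over half an hour.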

How to beat “Blue Monday”

So, another Blue Monday is upon us. It’s not clear exactly when Blue Monday is, but we do know it’s upon us. The Daily Mail has gone with last Monday; The Training Room, one of the four billion companies who have used it for marketing, have gone with next Monday. But we can work it out, because we have the equation:

$Misery = \frac{(W + (D-d)) \times TQ}{M \times NA}$

…unless it’s this one

$D = N + M (T+1) C R \frac{B-S}{J}$

but nobody uses that one. Anyway, it’s definitely a Monday, and it’s definitely in January because $T$ is time, which increases with time and so is highest in December, and $Q$ is how many new year’s resolutions you’ve given up on which is also highest in December look don’t think about it shut up.

Anyway, we know these formulæ are real because they make predictions. One intrepid PR man compiled a formula for the perfect marriage which results in thirty-some pre-specified romantic gestures you have to make every month. Since months are of different lengths, that means you need be less romantic in the longer ones.

This formula about a marriage clearly makes the unrelated prediction that February should be the most romantic month; that there should be some kind of festival of romance smack in the middle of it. Since every restaurant I’ve been in lately has taken great pains to confirm that this is indeed the case, the only rational, skeptical or scientific conclusion is that these formulæ represent the very real cutting edge of science.

While it is obviously bad news that we are all scientifically depressed either last or next Monday, it does offer a ray of hope. Here, you see, is the formula for the perfect Christmas:

I’m assuming that the second “divided by 3D” on the bottom is a typo. Firstly because this formula otherwise defines some kind of festive acceleration, which even a cursory analysis of the Twelve Days Of Christmas song teaches us is dangerous, but mostly because if it’s a real fraction then the numerator has an equals in it.

The point of this equation is—

Actually, first I should point out that $W$ appears twice: once to represent a walk, and once to represent a glass of wine. This might be an error, but I prefer to assume it is a deliberate simplification introduced because walking and wine are somehow equal. I say this because this is the Times’ “Offset Your Carbs” wallchart:

It quite clearly shows that going for a walk is equal to garlic bread:

It also shows (although I haven’t got a close-up photo of this) that a glass of wine is equal to some sex. Taking the new result that wine equals walking,

$Garlic Bread = Walking = Wine = Sex$

The strangest thing here is that mathematically, Peter Kay is now the filthiest comedian in Britain.

Anyway, the point is that the $D$ in the formula for misery is Christmas debt. If we can reduce that value we can all-but eliminate Blue Monday. So how do we do that? Well, the formula for the perfect Christmas is:

$ P_\chi = \frac{8F \times (4P + £23)}{3D} + \frac{3G}{3D} + \left(1+\frac{1}{4C}\right)\frac{2W}{3D} + \frac{5T}{3D \times 1NR}$

Christmas debt is $8F \times 4P \times £23 = £736$. Quite a whack. But we can reduce it without affecting $P_\chi$ — I’m generously assuming what is probably an $X$ is a $\chi$ — because it’s in a fraction. Dividing top and bottom by 4, we get

$ P_\chi = \frac{8F \times (1P + £5.75)}{18 hours} + \frac{3G}{3D} + \left(1+\frac{1}{4C}\right)\frac{2W}{3D} + \frac{5T}{3D \times 1NR}$

That gets us below £50, but that’s still a lot. Let’s dispense with seven family members and divide through by eight.

$ P_\chi = \frac{1F \times (1P + £5.75)}{135 minutes} + \frac{3G}{3D} + \left(1+\frac{1}{4C}\right)\frac{2W}{3D} + \frac{5T}{3D \times 1NR}$

Now our only problem is that the times are inconsistent: 135 minutes in the first term, and three days thereafter. Never mind, just divide all the fractions by 32, top and bottom:

$P_\chi = \frac{1F \times (1P + £5.75) + \frac{3}{32}G}{135 minutes} + \left(1+\frac{1}{4C}\right)\frac{\frac{1}{16}W \times 32NR + 5T}{135 minutes \times 32NR}$

This new, abbreviated Christmas is interesting. $G$, for example, represents a nice family game, of which we must play three thirty-second fractions, and while the formula for the perfect family game (obviously there is one) doesn’t readily divide by 32, there is a game that does.

Three Chess Over Thirty Two features six squares with three pieces on them, and luckily for anyone wanting to knock out Christmas in a little over two hours, requires only one move for checkmate.

But however dull a game Three Chess Over Thirty Two can be, however unsatisfactory one sixteenth of a glass of wine (or walk), and however overfilling you find the 32 portions of nut roast you are forced to consume because it was inexplicably on the denominator, the cold, mathematical reality is that $P_\chi$ has not changed and therefore Christmas has been totally and objectively perfect, while costing less than six pounds. This, in turn, means that our formula for misery loses much of its sting, and that is how you beat Blue Monday.
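For anyone who wants to audit the budget, here is a small Python sketch of the numbers used above. It only checks the arithmetic as the post applies it (it makes no claim that the algebra is legitimate), and the variable names are my own.

```python
from fractions import Fraction

MINUTES_PER_DAY = 24 * 60
three_days = 3 * MINUTES_PER_DAY      # the "3D" on the bottom, in minutes

# The post's reading of the debt: 8F x 4P x £23.
original_debt = 8 * 4 * 23

# Divide top and bottom by 4: one present each at a quarter of the price.
price_after_quartering = Fraction(23, 4)
debt_after_quartering = 8 * 1 * price_after_quartering
time_after_quartering = Fraction(three_days, 4)

# Dispense with seven family members and divide through by eight.
final_debt = 1 * 1 * price_after_quartering
final_time = time_after_quartering / 8

# Divide the other fractions by 32 so every term shares one denominator.
other_terms_time = Fraction(three_days, 32)

print(original_debt, float(debt_after_quartering), float(final_debt))
print(time_after_quartering / 60, final_time, other_terms_time)
```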


This is adapted from part of my Stupid Formulæ Talk which can come to your local Skeptics in the Pub or similar event if you think your audience would enjoy hearing me be approximately this silly for the better (or at least longer) part of an hour.

On reflection, perhaps they shouldn’t go this far. It’s a bit sad, isn’t it?

I saw this post about the graphs of National Novel Writing Month on my brother’s blog ages ago, but an RSS malfunction showed it to me again today, and I got to thinking about his final graph: words written on any given day, plotted against how far behind he was that day (or, more precisely, words left to write / days left, normalised to the same value on day one). He correctly notices “a hint of a positive correlation”, and notes that that may not be causation, but could be an external factor, such as his determination to show his doubting wife what for.

I have my own theory, arguably more prosaic but also more interesting, so I created a simulation to test it. I assigned a random number to each of the 30 days of November, using the formula RAND()*RAND() to create a nice distribution. I assigned a number of words to each day by multiplying this random number by 50,000 and dividing by the total of all 30 random numbers. This created a month of simulated writing with random ups and downs, but a guaranteed total output of exactly 50,000. (For these purposes I allowed non-integer numbers of words.) I worked out the cumulative wordcount and behindness index for each day, and plotted them, with a trendline (in red) and $R^2$ value, and ran simulation after simulation.

My theory was proven: every single one had a positive correlation.

Of course it did: any deviation from the 1666.7-word daily target will be reflected in your behindness score on every subsequent day, and if you write your 50,000th word on day 30, then it will be balanced by an equal and opposite deviation spread unevenly across those same days. Any novel of exactly 50,000 words will have this hint of positive correlation between behindness and words written. I suspect this would hold even if we allowed some days to have a net deletion of words. Novels of just over 50,000 (such as almost all of them) will get a slightly reduced version of the same effect. There are only two ways to avoid it. One is to write a wildly different number of words, say 40,000 or 60,000. The other is to write exactly one thousand, six hundred and sixty-six and two thirds of a word every single day, although that involves using a lot of three- and six-letter words.
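The spreadsheet experiment can be sketched in Python. This is a reconstruction under assumptions, not the original spreadsheet: `random.random()` stands in for RAND(), behindness is taken as (words left to write / days left) divided by the daily target so that day one scores exactly 1, and the function names are mine.

```python
import random

DAYS = 30
TARGET = 50_000


def simulate_month(rng):
    """Thirty random daily word counts, rescaled to total exactly TARGET."""
    raw = [rng.random() * rng.random() for _ in range(DAYS)]
    scale = TARGET / sum(raw)
    return [value * scale for value in raw]


def behindness(words):
    """Words left / days left at the start of each day, normalised to 1 on day one."""
    daily_target = TARGET / DAYS
    written = 0.0
    scores = []
    for day, todays_words in enumerate(words):
        days_left = DAYS - day
        scores.append((TARGET - written) / days_left / daily_target)
        written += todays_words
    return scores


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def run(trials=1000, seed=0):
    """Correlation between behindness and words written, for many simulated months."""
    rng = random.Random(seed)
    months = (simulate_month(rng) for _ in range(trials))
    return [pearson(behindness(month), month) for month in months]


correlations = run()
print(min(correlations), sum(correlations) / len(correlations))
```

With the total pinned at exactly 50,000, each day’s behindness works out to the average of that day’s and all later days’ word counts divided by the daily target, so every day’s output is baked into its own score. That is why the smallest correlation printed stays above zero.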

I’m not sure that someone who writes 60,000 words really worked to the 50,000-word target at all, so if you want to know if ‘behindness’ affects performance, you’ll have to examine the stats from people who failed to complete the novel, between day one and whenever they eventually gave up. And to be honest, how useful a sample are they to a study of productivity?

Anyway, I suppose I hope this can be a nice example of how something plausible and supported-by-the-data-looking can turn out to just be randomness viewed from a funny angle.

A round-up of a mathsy week

Maths!

First of all, I’ve written an article for the Aperiodical, an irregular maths blogging collective kind of a thing. It’s about voting systems, crazy dice, paper, lizards and Spock.

“Grime Dice” are a set of five coloured dice with unusual combinations of numbers on them. The red die, for example, has five fours and a nine. The blue one has three twos and three sevens, so it loses to the red die about 58% of the time. The green die has five fives and a zero, and will lose to the blue one in 58% of rolls. What makes them interesting is that the green die will beat the red one in 69% of rolls. These three dice behave rather like rock-paper-scissors; in mathematical terms, they are ‘non-transitive’. The full set of Grime Dice also has a purple and a yellow die, so a better analogy would be rock-paper-scissors-lizard-Spock.
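Those three percentages can be checked by brute force over all 36 pairs of faces. A minimal Python sketch, using only the faces described above (the purple and yellow dice are left out because their faces aren’t given here); the function name is mine.

```python
from fractions import Fraction
from itertools import product

RED = (4, 4, 4, 4, 4, 9)
BLUE = (2, 2, 2, 7, 7, 7)
GREEN = (0, 5, 5, 5, 5, 5)


def win_probability(first, second):
    """Exact probability that a roll of `first` shows more than a roll of `second`."""
    wins = sum(1 for a, b in product(first, second) if a > b)
    return Fraction(wins, len(first) * len(second))


for name, first, second in [("red beats blue", RED, BLUE),
                            ("blue beats green", BLUE, GREEN),
                            ("green beats red", GREEN, RED)]:
    p = win_probability(first, second)
    print(f"{name}: {p} = {float(p):.1%}")
```

The exact values are 7/12, 7/12 and 25/36, so each die in the cycle beats the next more often than not.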

You might ask which is the best Grime die, and the obvious solution is just to roll all five dice at once, a hundred times, and see which one wins the most times. This is why you should never believe something simply because it is obvious.

Read the rest at Aperiodical.com

I’ve been quite reasonably asked what Ranked Pairs did to deserve omission, to which my only answer is “be a bit inelegant and have a boring name” (not that I have any particular room to criticise it for either of those). If you’re interested, in Ranked Pairs, you start with the strongest preference the voters expressed (say, Labour are better than UKIP), “lock it in”, then keep locking in preferences, in decreasing order of strength and skipping any that would create Grime-dice-style cycles, until you’ve figured out an ordering on all the candidates. The winner is the one who beats every other candidate in the final, locked-in pairings. It’s not so very different from Beatpath (which I did include) but even after Paul’s recasting of it in more elegant terms it still feels a bit clunky to me. I appreciate that “a bit clunky” is not a completely valid mathematical criticism.
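For the curious, the locking-in procedure can be sketched in Python. This is a minimal illustration under assumptions: the ballots are hypothetical, every voter ranks every candidate, ties between equally strong preferences are broken arbitrarily, and the function names are mine.

```python
from itertools import permutations


def ranked_pairs(ballots):
    """ballots: list of (count, ranking), each ranking listing all candidates best first."""
    candidates = ballots[0][1]
    # Net margin for each ordered pair: voters preferring a over b, minus the reverse.
    margin = {}
    for a, b in permutations(candidates, 2):
        margin[a, b] = sum(n if r.index(a) < r.index(b) else -n for n, r in ballots)
    # Strongest preferences first.
    victories = sorted((pair for pair in margin if margin[pair] > 0),
                       key=lambda pair: margin[pair], reverse=True)
    locked = {c: set() for c in candidates}  # locked[a] = candidates locked in below a

    def reaches(start, goal):
        """Is there already a locked-in chain of preferences from start to goal?"""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(locked[node])
        return False

    for winner, loser in victories:
        if not reaches(loser, winner):  # skip anything that would close a cycle
            locked[winner].add(loser)
    beaten = set().union(*locked.values())
    return [c for c in candidates if c not in beaten]


# Hypothetical ballots with a Grime-dice-style cycle: A > B > C > A.
print(ranked_pairs([(4, "ABC"), (3, "BCA"), (2, "CAB")]))
```

In this made-up election the weakest link in the cycle (C over A, by a single vote) is the one that gets skipped.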

Secondly, my maths-based comedy rant thing about stupid formulæ in PR newsfodder will be on at the new Kingston Skeptics in the Pub group on Thursday 2 August. It’s at the Ram Jam Club, which is here:

Come down if that sounds like somewhere near where you live. You’ll learn how to derive the existence of tiny shoe-horns using quantum physics and a cheap advert for some guy’s book. (Connection fans may note that the golf club to the east shares its name with a popular electoral system.)

Lastly, here is some maths I found scattered around the web, which I may bring out at Manchester MathsJam tonight (spoiler alert). It’s about counting, binary, puzzles and hypercubes (as is all maths).

Picture a counter, like the ones used to count people into clubs or on the obnoxious Lynx adverts a few years ago. When a digit ticks over, all the digits to the right of it also tick over to zero. If you had a really big club (or a really noxious deodorant) you might end up changing hundreds of digits at a time. Binary is no different — but sometimes having to change all those bits at precisely the same time causes problems, so you’d much rather only change one digit at a time. To this end, Frank Gray invented a revised binary counting system where that is exactly what happens. In 3-bit, you start as normal at 000, then 001, and then you go to 011 — the nearest unused number. Next, you go to 010 which you missed out, then skip to 110. Next, comes 111, then 101, and then 100 — the only number left. To get back to zero, you flip that single bit at the front back to 000. To get to eight, you need four bits, and it’s 1100 — again only one bit has changed.

So how do you know which bit to flip? Every second flip, starting with the first, is the right-most bit. Every fourth, starting with the second, is the next rightmost bit. Every eighth, starting with the fourth, is the third rightmost. And so on. I think this is interesting, because it’s also the solution to the Towers of Hanoi game: in that, you move the smallest disc first, and every second move after that. You move the second smallest second, and every fourth move after that. You move the third smallest fourth, and every eighth move after that. And so on. This is a neat way to visualise why the time needed to solve Hanoi increases as $2^n$. (It’s also the solution to the Chinese Ring Puzzle which, generalising from myself, I shall assume you haven’t heard of and ignore.)

Counting in Gray Code is also equivalent to visiting every corner of a hypercube without visiting any twice: if you think of a binary number like 101 as a set of co-ordinates (1, 0, 1), then the numbers 000 to 111 denote the corners of a cube. 00 to 11 give you a square, and 0000 to 1111 give you a four-dimensional hypercube. Obviously you could visit them in standard binary order (0000, 0001, 0010… 1111) but that means taking diagonal tracks through the cube. Gray Code’s system of only changing one bit at a time means you can do it just tracking along the edges — and because you can always change that last bit to get back to zero, you can do a full Hamiltonian cycle, just along the edges, by doing nothing more taxing than counting.
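All of this fits in a few lines of Python. A minimal sketch: `n ^ (n >> 1)` is the standard way to get the n-th Gray code from ordinary binary, and the other two helper names are my own.

```python
def gray(n):
    """Return the n-th Gray code as an integer."""
    return n ^ (n >> 1)


def flipped_bit(step):
    """Index (0 = rightmost) of the bit that changes on a given step, counting from 1."""
    return (step & -step).bit_length() - 1


def is_hamiltonian_cycle(bits):
    """Check that Gray-code counting visits every hypercube corner once, along edges."""
    codes = [gray(i) for i in range(2 ** bits)]
    visits_every_corner = sorted(codes) == list(range(2 ** bits))
    steps = zip(codes, codes[1:] + codes[:1])  # includes the last step back to zero
    return visits_every_corner and all(bin(a ^ b).count("1") == 1 for a, b in steps)


print([format(gray(i), "03b") for i in range(8)])
print([flipped_bit(step) for step in range(1, 8)])  # also the Hanoi disc to move (0 = smallest)
print(all(is_hamiltonian_cycle(bits) for bits in (2, 3, 4)))
```

The second line printed is 0, 1, 0, 2, 0, 1, 0: the rightmost bit (or smallest disc) on every second step, the next one on every fourth, and so on.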

I honestly don’t remember what possessed me to look up Gray Code. It had been sitting in the back of my mind for about two years, and suddenly I thought maybe it was interesting enough to mention at MathsJam. It hadn’t occurred to me that it might have any relevance to traditional wooden puzzles, hyperdimensional navigation, or indeed anything else. But it did. And that is why maths is awesome.

Towards a Mathematically Optimal Scrabble Bag

There aren’t a lot of Ses in a Scrabble bag. Despite being the seventh most common letter in English, it is the joint-eighth rarest in Scrabble.

Excluding the two blank tiles, there are 98 letters in the bag, and for the most part, the number of each reflects how often it appears in English text. Z, Q, X and J are over-represented, but only because they wouldn’t appear at all otherwise. The other end of the scale is a bit more haphazard, though: S and T are roughly one third less common in Scrabble than in English text, while H, for some reason, is two thirds less likely: 6% of any given book is the letter H, but Scrabble players get but two of them.

The original document that led to Scrabble letter bags

It’s not really clear why that should be the case. The original letter distribution was devised in 1938, based on the frequencies of each letter in different length words in newspapers and a dictionary, so perhaps that methodology shafted H, or perhaps it was the texts chosen. Maybe H just wasn’t used much in 1938. Who knows.

This is interesting, because it suggests the dearth of Ses is not an artificial limitation to counter their obvious usefulness: you can stick an S on the end of 77,392 different words, making them instantly valuable, but the similarly underrepresented T only goes on the end of 1,393 words. The second most useful letter for this is D, which can be appended to 8,441 words. (The least useful letter for this is Q, which goes on the end of TALA and TZADDI. J goes on HA, TA, BEN, HAD and HAJ, although there’s only one in the bag so HAJJ doesn’t come up much. There’s no particular pattern to what you can put on the start of a word.) That means that we can ignore usefulness of letters, and apply the same model as Butts did in 1938 to create an optimal bag: two blanks, a Z, a Q, a J, an X, and then 94 other tiles in proportion to their frequency in English text. Right?

No. It’s not, in general, possible to do that. Consider the case of three letters, A, B and C, which occur equally, and you want to put them in a two-tile Scrabble game. You can’t. The errors get smaller as you get more tiles, but they don’t go away. It’s the same problem faced by electoral systems that use proportional representation: you have to agree on an algorithm before you start or else you’ll end up in an impossible situation — if the vote is 60/40, and there are four seats, do you split them 50/50, or 75/25? But you can get it near enough for Scrabble.
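To make the rounding problem concrete, here is a minimal Python sketch of one such algorithm. The choice of the largest-remainder method is mine, purely as an illustration (nothing above says which rule to use), and the function name is hypothetical.

```python
def largest_remainder(weights, seats):
    """Share out `seats` in proportion to integer `weights`, largest remainders first."""
    total = sum(weights.values())
    allocation, remainders = {}, {}
    for name, weight in weights.items():
        # Whole seats earned outright, plus the fractional part left over.
        allocation[name], remainders[name] = divmod(weight * seats, total)
    leftover = seats - sum(allocation.values())
    # Hand spare seats to the largest remainders; ties go to whoever is listed first.
    ranked = sorted(remainders, key=lambda name: remainders[name], reverse=True)
    for name in ranked[:leftover]:
        allocation[name] += 1
    return allocation


print(largest_remainder({"A": 1, "B": 1, "C": 1}, 2))    # three equal letters, two tiles
print(largest_remainder({"sixty": 60, "forty": 40}, 4))  # the 60/40 vote with four seats
```

In the three-letter case one letter always ends up with nothing, and which one depends entirely on the tie-break. In the 60/40 case this particular rule gives the 50/50 split, and a different rule could reasonably give 75/25.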

Is that a good idea, though? After all, while the word “at” appears a lot in newspapers, it only comes up once in the Scrabble dictionary. And in fact, these things don’t correlate well at all: there are 5 times more Zs in CSW12 than English, 13 times as many Js, and 7 times as many Ks and Bs; whereas English has 15 times as many Es as CSW12, and 13 times as many Is. Should we be playing with a tile bag containing 10 Ses, 8 Cs and 8 Ps — but only 4 Es? It sounds unthinkable, although at least you’d be less likely to get a rack full of Old McDonald lyrics.

Alternative letter distributions, based on CSW12, normal English, and the current Scrabble bag:

Letter  CSW12  English  Scrabble
A       6      7        9
B       5      2        2
C       8      3        2
D       6      4        4
E       4      11       12
F       4      2        2
G       3      2        3
H       4      6        2
I       3      6        9
J       1      1        1
K       2      1        1
L       3      4        4
M       5      3        2
N       2      6        6
O       3      7        8
P       8      2        2
Q       1      1        1
R       5      6        6
S       10     6        4
T       5      8        6
U       3      3        4
V       2      1        2
W       2      2        2
X       1      1        1
Y       1      2        2
Z       1      1        1
Blank   2      2        2

Of course, players don’t know all 270,163 words in CSW12, and it’s likely that the obscurity of many of them will shift the letter distribution back towards that of normal English. In that respect, the ideal bag of tiles probably depends to some extent on the skill of the players. In the hardcore CSW12 bag, there are 19 vowels rather than 42, so it would change the game a lot.

This would probably also require different scores for each letter: X and Y currently score 8 and 4 respectively, but both are rarer in CSW12 than Z or Q which score 10. You get 1 for an N, but it’s rarer than G and D (2 each), B, M, P and C (3 each), and H and F (4 each). Thing is, scaling between 0 and 10, then rounding up, this is the closest fit you can get:

Letter  CSW12  English  Scrabble
A       1      1        1
B       1      1        3
C       1      1        3
D       1      1        2
E       1      1        1
F       1      1        4
G       1      1        2
H       1      1        4
I       1      1        1
J       2      5        8
K       1      1        5
L       1      1        1
M       1      1        3
N       1      1        1
O       1      1        1
P       1      1        3
Q       3      8        10
R       1      1        1
S       1      1        1
T       1      1        1
U       1      1        1
V       1      1        4
W       1      1        4
X       10     5        8
Y       4      1        4
Z       3      10       10
Blank   0      0        0

I think that looks a bit boring, to be honest. I think the game will be more fun if the scores are more varied. Maybe some kind of gamma function would help increase the contrast on the low scores and reduce the power of X. Still, though, I’d like to have a game with the alternative distribution and see if it works at all. I feel sure it can’t, with so few vowels, but then, maybe that’s just because I’m so used to the current, vowel-heavy version that I don’t use the vowel-light words much.

Of course, if we really wanted to optimise the game, we would invent a new set of words to play with that were designed to be fun rather than similar to English. But that would be taking things a bit far, don’t you think?

Apple Have Updated The iTunes Terms And Conditions

Every so often, my iPhone tells me that Apple have changed their terms and conditions again, and I have to agree to the new ones if I want to install an app ever again. I worry that this is becoming a time sink.

According to a website I found, the current version of the T&Cs is 17,581 words long. It’s been in force since October 12. Archive.org have a copy from July, which had been in force for a little over a year. That is 16,836 words long. The one before that ran from April 1 to at least May 19, and is a staggering 25,115 words. Another 19,273 words served from September 2009 to February 2010, and there was a version in June that was 18,248 words. Lastly, the version that was online between November 2008 and April 2009 was 17,870 words. There are gaps in this, because Archive.org doesn’t scan every website every day, so it’s possible that there are several more iterations that I’ve been unable to count. But let’s go with this for now: I have found 114,923 words of terms and conditions on the iTunes website.

Let’s put that number in context.

iTunes has 200 million users, and the average reading speed for comprehension is 200-400 words per minute. I’m going to take the lower estimate, since this shit is really boring. Using that figure, if, as Apple presumably intend, all of those people read all of those words carefully enough to understand them, it would consume 218.7 man-millennia. Wolfram|Alpha chooses to express that figure in terms of the total age of the universe, and doesn’t bother with scientific notation.
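The sums are easy to reproduce. A minimal Python sketch, using only the figures quoted above; the 365-day year is my assumption (it is the one that reproduces the 218.7 figure) and the variable names are mine.

```python
# Word counts of the six versions found, in the order listed above.
word_counts = [17_581, 16_836, 25_115, 19_273, 18_248, 17_870]

USERS = 200_000_000
WORDS_PER_MINUTE = 200            # the lower reading-speed estimate
MINUTES_PER_YEAR = 60 * 24 * 365  # assumption: a 365-day year

total_words = sum(word_counts)
minutes_per_user = total_words / WORDS_PER_MINUTE
man_millennia = USERS * minutes_per_user / MINUTES_PER_YEAR / 1000

print(f"{total_words:,} words")
print(f"{minutes_per_user / 60:.1f} hours per user")
print(f"{man_millennia:.1f} man-millennia in total")
```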

But what’s that figure like for one user? I made you this graph, using figures from this website:

[Graph: lotr]

Anyway, I just clicked ‘I agree’ and got on with my day.

Judge Bans Thinking

Hello!

I just saw this news story on singingbanana’s Tumblr:

The footwear expert made what the judge believed were poor calculations about the likelihood of the match, compounded by a bad explanation of how he reached his opinion. The conviction was quashed. But more importantly, as far as mathematicians are concerned, the judge also ruled against using [Bayes' theorem] in the courts in future.

What?

In case you don’t know, Bayes’ Theorem is a formula from the days when mathematicians could get away with just naming bits of common sense after themselves, and states:

P(A|B) = P(B|A) × P(A) ÷ P(B)

That is,

(probability A is true given evidence B) = (probability of finding B if A is true) × (probability of A being true) ÷ (probability of finding B)

Obviously that’s pretty useful in court, and it’s also pretty easy to understand: clearly, (probability of finding B) = (probability of finding B if A is true) × (probability of A being true) + (probability of finding B if A is false) × (probability of A being false). That’s all the options: A must be true or false. All Bayes’ theorem does is work out in what proportion of the cases where B is found A is also true. It’s almost obvious, in an informal kind of way, if you have the right kind of brain.
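As a worked example, here is the same calculation as a short Python sketch. The numbers are entirely hypothetical (they are not from the case in question) and the function name is mine.

```python
def posterior(prior, p_b_given_a, p_b_given_not_a):
    """P(A|B) from P(A), P(B|A) and P(B|not A), via Bayes' theorem."""
    # Total probability of finding B: A is either true or false.
    p_b = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / p_b


# Hypothetical figures: a 1% prior, evidence found 90% of the time if A is true
# and 5% of the time if A is false.
print(posterior(prior=0.01, p_b_given_a=0.9, p_b_given_not_a=0.05))
```

With these made-up inputs the evidence lifts the probability of A from 1% to roughly 15%: a big shift, but a long way short of certainty.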

Point is, you’re now not allowed to use that reasoning in court, which means if your defence depends on any evidence more subtle than “fifteen people saw me across town” then you’d better find another way of phrasing it than this. It’s basically maths, so you ought to be able to work round it and find a more circuitous way of showing the same thing.

Strictly, the ruling was that you can’t use Bayes’ theorem “unless the underlying statistics are firm” but that doesn’t help a lot. In qualitative terms, all the theory says is that something is more likely if you have evidence for it, so I think we can safely assume it’s the use of maths that the judge objected to. It’s the same reasoning behind the Drake equation: people are bad at guessing big, complicated things like “how many aliens might there be” or “how likely is this guy to be guilty given we found this footprint”, but pretty good at estimating simple things like “how long might an alien race beam signals into space” or “how likely is it there’d be a footprint like this just here”. You can do the same thing if you get to the tie-break in a pub quiz: come up with a way to work out the answer from easier to estimate quantities and crunch the numbers. You’ll almost certainly do better than the team that guesses the final answer directly.

And that is why this ruling is worrying: because not only has a judge fallen foul of the natural but wrong tendency for humans to overestimate their own judgement and distrust logic and reason, but they’ve ruled that all other judges and lawyers have to make the same error. It’s a shame, because while any large group tends to be a bit rubbish at thinking, judges are usually pretty good — as, I think, is anyone impartial with the time and inclination to look into things.

I didn’t know they could do that. (I actually suspect they can’t and the story is overblown, but I wouldn’t know enough to decide.) What’s next — a judge commits the prosecutor’s fallacy and then rules that everyone else has to do it as well? Idunno, seems a bit dangerous to me.

It just seems like it’d be fairly easy to set someone up if there’s a big list of thought processes their defence lawyer isn’t allowed to invoke. Or design a crime that could be easily proven but not with the limited methods of thinking allowed in the courtroom.

There are some interesting mental exercises there — coming up with crimes or frames for differently handicapped justice systems — but not ones actual lawyers should have to bother with.

Can AV elect a second- or third-placed candidate?

Under First Past The Post (FPTP), everyone picks their favourite option, and the one with most votes wins. Simple, right? Under the Alternative Vote (AV), voters rank the options, the one with fewest votes is eliminated and its votes pass to those voters’ next choices, and this process repeats until one candidate has a majority (or until only one candidate is left; it makes no difference). No2AV will tell you that this can hand victory to the second- or third-placed candidate. But what does that mean? The winner is by definition the first-placed candidate; what they mean is that AV sometimes returns a winner who would not have won under FPTP. The Yes campaign will tell you that that is sort of the point: if AV always returned the same winner as FPTP, nobody would care which we used.

The thing is, elections are so engrained in the public consciousness that it’s hard to dislodge the idea that after one there’s a very clear “best candidate”, and any system that doesn’t elect that person is fundamentally broken. But is that true?

And what if there was more than just FPTP and AV on the ballot this May?

What if we added Runoff Voting? This is the system used in France, whereby the two most popular candidates after round one face a second round, the winner of which is duly elected. We could also add Approval Voting. This is essentially exactly the same as FPTP, but you’re allowed to vote for as many options as you like. Lastly, we could add Range Voting. With Range Voting, you can score each option, usually out of 10 or 100, and the one with the maximum score is the winner. There are still more options, but let’s stick to these five for now.

Let’s ask an imaginary population to vote on which system they want. The population is made up of:

  • 40 “Reds”. The reds like FPTP, but would settle for Approval. They’ll have no truck with numbers or multiple rounds.
  • 21 “Yellows”. The yellows would like Runoff voting, and would be happy enough with AV. They would settle for Approval, but not Range or FPTP.
  • 20 “Greens”. The greens would like AV, and would be happy enough with Runoff. They too would settle for Approval, but not Range or FPTP.
  • 19 “Blues”. The blues are desperate to have Range voting, but otherwise have identical preferences to the greens.

Let’s assume nobody votes tactically.

[Image: FPTP ballot]

Clearly, FPTP has most votes. So that should win, right?

[Image: FPTP]

Well, let’s not get ahead of ourselves. “Having the most votes” is only how you win under FPTP. If we used Runoff voting, the red and yellow vote would get FPTP and Runoff into the second round, where the blues and greens would side with Runoff, granting it a 60:40 win.

[Image: Runoff ballot 2]
[Image: FPTP and Runoff graph]

This graph shows an interesting fact: even though FPTP got the most votes in the general ballot, the population as a whole prefers Runoff. But what would happen under AV?

[Image: AV ballot]

In round one, Approval voting would be eliminated, as nobody voted for it. In round two, Range voting, with only the 19-strong blue vote to its name, would also be eliminated. In round three, with Range gone, the blues vote like the greens, taking AV up to 39 votes. This puts Runoff into last place, and it is eliminated. Round four, therefore, is between AV and FPTP. The reds stay loyal to FPTP, but the yellows, greens and blues all prefer AV, which wins with a 60:40 majority.

fptpav

Again, it is clear that the population as a whole prefers AV to FPTP. In fact, the population as a whole also prefers Approval to FPTP. This is a serious flaw with FPTP: it is possible for the “Condorcet Loser” to win. The Condorcet Loser is any candidate who would lose a two-horse race against each of the other candidates. (In our hypothetical example, however, Range Voting is the Condorcet Loser and so FPTP has not messed up as badly as it might.) AV and Runoff voting are immune from electing the Condorcet Loser, because such a candidate would by definition be defeated in the final round (if they got there).
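The two-horse races are easy to check by brute force. In this sketch (the function names are hypothetical, and I’m assuming a group is indifferent between any two systems it wouldn’t accept), each pair of systems is compared head to head and the Condorcet Loser is whichever one loses every contest.

```python
# Group name -> (size, ranking from favourite downwards).
POPULATION = {
    "reds":    (40, ["FPTP", "Approval"]),
    "yellows": (21, ["Runoff", "AV", "Approval"]),
    "greens":  (20, ["AV", "Runoff", "Approval"]),
    "blues":   (19, ["Range", "AV", "Runoff", "Approval"]),
}
ALL_SYSTEMS = ["FPTP", "AV", "Runoff", "Approval", "Range"]


def prefers(ranking, a, b):
    """True if this ranking puts a strictly above b; unlisted options tie for last."""
    position = {option: i for i, option in enumerate(ranking)}
    last = len(ranking)
    return position.get(a, last) < position.get(b, last)


def head_to_head(population, a, b):
    """Return (votes for a, votes for b) in a two-horse race."""
    for_a = sum(size for size, r in population.values() if prefers(r, a, b))
    for_b = sum(size for size, r in population.values() if prefers(r, b, a))
    return for_a, for_b


def condorcet_loser(population, candidates):
    """Return the option that loses every two-horse race, or None if there isn't one."""
    for c in candidates:
        results = [head_to_head(population, c, other) for other in candidates if other != c]
        if all(mine < theirs for mine, theirs in results):
            return c
    return None


print(head_to_head(POPULATION, "FPTP", "AV"))
print(head_to_head(POPULATION, "FPTP", "Range"))
print(condorcet_loser(POPULATION, ALL_SYSTEMS))
```

FPTP loses to AV, Runoff and Approval but beats Range, which is what saves it from the wooden spoon.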

Let’s have a look at the less well-known systems now. Range Voting is the only system to measure strength of feeling. I mentioned earlier that the blues are desperate for Range to win, so they rate it a 10. They then rate AV 3, Runoff 2 and Approval 1, as do the greens. The greens have opted not to give the full 10 points to any system. That is their right. The yellows vote in a similar way, giving Runoff 3 points, AV 2 and Approval 1. The reds give 3 points to FPTP, and 2 to Approval. Anything I haven’t mentioned gets zero points.

range ballot

In total FPTP has 120 points, all from the reds. Approval has 140 points, from everyone. Runoff slightly beats it with 141, from the yellows, greens and blues. AV does a little better with 159, from the same ballot papers. But the sheer strength of feeling from the blues is enough to hand Range Voting the win, with 190 points.
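Those totals are just group size times score, summed over the groups. A quick Python sketch, using the scores described above (the name `SCORES` is mine):

```python
from collections import Counter

# Group name -> (size, score given to each system; anything unlisted scores 0).
SCORES = {
    "reds":    (40, {"FPTP": 3, "Approval": 2}),
    "yellows": (21, {"Runoff": 3, "AV": 2, "Approval": 1}),
    "greens":  (20, {"AV": 3, "Runoff": 2, "Approval": 1}),
    "blues":   (19, {"Range": 10, "AV": 3, "Runoff": 2, "Approval": 1}),
}

range_totals = Counter()
for size, ballot in SCORES.values():
    for system, score in ballot.items():
        range_totals[system] += size * score

# Print from highest total to lowest.
for system, points in range_totals.most_common():
    print(system, points)
```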

range

Even though the majority would rather have anything but Range Voting (i.e., it is the Condorcet Loser), the theory goes that Range would make the most total happiness, albeit concentrated in a small minority of voters.

Lastly, what would happen under Approval Voting?

approval ballot

Only the reds approve of FPTP. Everyone else approves of AV and Runoff voting. Range gets only 19 votes, from the blues, but since everybody would be happy with Approval voting, it scores an easy win.
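The approval count works the same way, except that each ballot is a set of ticks rather than a ranking or a score. In this sketch I’m assuming each group ticks every system it likes or would settle for, and nothing else; the name `APPROVALS` is made up.

```python
from collections import Counter

# Group name -> (size, set of systems that group approves of).
APPROVALS = {
    "reds":    (40, {"FPTP", "Approval"}),
    "yellows": (21, {"Runoff", "AV", "Approval"}),
    "greens":  (20, {"AV", "Runoff", "Approval"}),
    "blues":   (19, {"Range", "AV", "Runoff", "Approval"}),
}

approval_totals = Counter()
for size, approved in APPROVALS.values():
    for system in approved:
        approval_totals[system] += size  # one vote per approved system

approval_winner = max(approval_totals, key=approval_totals.get)
print(dict(approval_totals))
print("Winner:", approval_winner)
```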

approval

All five systems have elected themselves — and yet the voters’ preferences are the same in each case. Obviously this population was engineered so this would happen, and it’s very unlikely in real life. But it should illustrate some of the problems with the current system, and indeed with all systems. It is mathematically impossible to design a system without tactical voting.

Here is the issue: looking at the voters’ preferences, I don’t think there’s a single system that clearly ought to win. No2AV would tell you FPTP should win because it “came first” in terms of first-preference votes, but 60% of the population would rather have Alternative, Runoff or Approval Voting, and it’s hard to argue that democracy dictates that they mustn’t.

This makes designing a fair system almost impossible. For example, a criterion mathematicians use to judge voting systems is “independence of irrelevant alternatives”. Range and Approval Voting satisfy this criterion: if any losing system were removed from the ballot, it would not affect the scores for the remaining ones and the result would be the same. Under FPTP, removing Runoff Voting from the ballot would hand the victory to AV. In this example, AV would elect itself with any combination of opponents, but this is not generally the case.
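To see the FPTP failure concretely, here is a small sketch that runs the FPTP count twice, once with all five systems and once with Runoff struck off. As before, the names are hypothetical and I’m assuming each group votes for its favourite option that is actually on the ballot.

```python
# Group name -> (size, ranking from favourite downwards).
POPULATION = {
    "reds":    (40, ["FPTP", "Approval"]),
    "yellows": (21, ["Runoff", "AV", "Approval"]),
    "greens":  (20, ["AV", "Runoff", "Approval"]),
    "blues":   (19, ["Range", "AV", "Runoff", "Approval"]),
}
ALL_SYSTEMS = ["FPTP", "AV", "Runoff", "Approval", "Range"]


def fptp(population, candidates):
    """Return (winner, tally) when every group votes for its top available option."""
    tally = {c: 0 for c in candidates}
    for size, ranking in population.values():
        for option in ranking:
            if option in tally:
                tally[option] += size
                break
    return max(tally, key=tally.get), tally


winner_full, tally_full = fptp(POPULATION, ALL_SYSTEMS)

# Strike a losing option (Runoff) off the ballot and count again.
without_runoff = [c for c in ALL_SYSTEMS if c != "Runoff"]
winner_reduced, tally_reduced = fptp(POPULATION, without_runoff)

print(winner_full, tally_full)
print(winner_reduced, tally_reduced)
```

With Runoff gone, the yellows fall back on AV, which edges past FPTP by 41 votes to 40.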

Even though all five systems elect different winners, none of them is inherently biased. Nor do any of them give any voters more than their fair share of sway over the result, whatever No2AV might tell you. Nor, whatever Yes To Fairer Votes may have you believe, are any of them a panacea that will automatically usher in a “new politics”. We simply have to agree a system beforehand, so that going into the election everyone knows the rules. That way, while the result may not be popular, at least everybody can accept that it is fair, and knows that they have a chance to do something about it the next time around.

Can AV rob a rightful winner of victory? Yes. But so can FPTP, and at least AV will never elect the single least popular candidate.

How to Choose This May

Which is better, First Past The Post or Alternative Vote? A lot of people are arguing about exactly that, and as well as spouting all the usual bullshit, I think they’re getting quite a long way ahead of themselves. First, we need to define “better”. I think it boils down to this:

Your best strategy should always be to stand for election, and failing that, to vote for candidates in the order that you like them.

It turns out, though, that it’s much more complicated than that, to the point that there’s a whole branch of mathematics just for this. So, there’s already a list of criteria that good electoral systems should satisfy. They have fancy names and, because I think this stuff is interesting, I’ve paraphrased a few from Wikipedia:

Now you have to decide which you think are important. You can’t have all of them. But decide now.

I think consistency is nice, but not a deal-breaker. Independence of clones and of irrelevant alternatives are pretty important. Everybody loses when a crap but distinctive candidate wins because several competent but similar ones split the sensible vote. I think the others, though, are really important. You can’t have someone lose an election despite being more popular than every other candidate. You can’t have someone win an election by being less popular than every other candidate. And you sure as hell can’t have someone lose an election who, had you not voted for them, would have won.

You may have different opinions about the above. Again, I encourage you to decide now, because in a moment I’m going to tell you which voting systems satisfy which criteria. No cheating, now.

Here are the ‘answers’:

Criterion                  First Past The Post    Alternative Vote
Condorcet Winner           No                     No
Condorcet Loser            No                     Yes
Consistency                Yes                    No
Clones                     No                     Yes
Irrelevant Alternatives    No                     No
Monotonicity               Yes                    No
Participation              Yes                    No

I don’t know about you, but this table surprised me. I’d have got the left column pretty much bang on, but I honestly thought that Alternative Vote would get a tick for “independence of irrelevant alternatives”, and I was utterly convinced it would pass the Condorcet winner test. I was dead wrong. That Alternative Vote failed on the last two criteria was less surprising, since I’d never considered them before, but still a major shock: under the new system, you might be better off not voting! Sure, a vote for a rank outsider is a bit of a wasted trip under First Past The Post, but at least it won’t do any harm. Under AV, it just might.

Looking outside the two options we’ll be given in May, is there any system that won’t punish me for voting or for standing for election? Well, no. Range Voting seems pretty good — it fails the Condorcet criteria, but only when the preferences are so close it arguably shouldn’t elect the Condorcet winner — but nobody’s invented a voting system that ticks all the boxes above. It may well be impossible. I said this stuff was interesting, not uplifting.

I really thought this post would be strongly defending Alternative Vote, but now I don’t know what to think. I still think AV is better than First Past The Post, but I’m not thrilled with either. I think Alex Foster’s case on the Pod Delusion, that a vote for AV sends the right message to politicians, and that anyway AV is a useful first step towards genuinely better, multi-member or proportional systems, is the clincher. If we have any form of Proportional Representation, it’s much less important if the local MP is arguably not the most popular one on the ballot.

The main point, though, is that it matters how we ask the question. There have been four polls about AV. In three, the actual referendum question was used:

At present, the UK uses the “first past the post” system to elect MPs to the House of Commons. Should the “alternative vote” system be used instead?

All three said that voters would slightly prefer AV. The fourth poll, by YouGov, used this question:

The Conservative-Liberal Democrat government are committed to holding a referendum on changing the electoral system from first-past-the-post (FPTP) to the alternative (AV). At the moment, under first-past-the-post (FPTP), voters select ONE candidate, and the candidate with the most votes wins.

It has been suggested that this system should be replaced by the Alternative Vote (AV). Voters would RANK a number of candidates from a list. If a candidate wins more than half of the ‘1st’ votes, a winner is declared. If not, the least popular candidates are eliminated from the contest, and their supporters’ subsequent preferences counted and shared accordingly between the remaining candidates. This process continues until an outright winner is declared.

If a referendum were held tomorrow on whether to stick with first-past-the-post or switch to the Alternative Vote for electing MPs, how would you vote?

In this poll, “no” won. The anti-AV Labour MP Tom Harris summarised this as

The more AV is explained to voters, the less they like it.

Not meaning to patronise the general public, but I think Harris has massively overestimated their grasp of game theory. You may notice that support for AV was not much lower in YouGov’s poll than in the others. Certainly not enough to account for the increase in opposition to the change. Harris presumably thinks that around a fifth of the people who hadn’t previously thought about AV have read this précis, usefully evaluated the proposals, and decided, on balance, that monotonicity and participation are too valuable to lose. I think they’re just put off by long-winded explanations. It sounds complicated and offers no obvious benefits. It isn’t even obvious that it works at all. Nobody would support AV based on that synopsis except logicians and the insane. You might say I’m underestimating the public, but I include myself — I totally misjudged AV when given roughly the information in YouGov’s question.

In short, having a referendum on a question of algorithm design is a terrible idea and nobody would do it if they honestly thought it mattered a damn whether we use Alternative Vote or First Past The Post, but luckily, it doesn’t. They’re both total pants. Neither reliably elects the most popular candidate, both can be manipulated, and both promote unhealthy two-party dominance. Ignore the maths. Ignore the pros and cons of what you’re actually being asked about. None of that matters. For once in your electoral life, you’re free! Vote for ideology. Vote for vague emotional ideals of “hope” and “change”. Vote with your heart.

Because that’s how this result is going to be treated in Westminster and in the media, and because fuck it, you might as well.