What does ‘random’ look like?

Which plot is random?

The exhibit shows two seemingly similar patterns – but one of them is random and one isn’t. Can you tell which is which? More in a moment.

I’ve taken these plots from Steven Pinker’s new book, The better angels of our nature: why violence has declined. They’re in a chapter titled The statistics of deadly quarrels, where he discusses the statistical patterning of wars and takes a small detour into “a paradox of utility”, specifically our tendency to expect randomness to look regular, with little clustering.

This cognitive illusion is relevant to all disciplines, but it matters especially to anyone who works on spatial issues, as a couple of these examples show.

Professor Pinker, who’s a psychologist at Harvard, cites the example of the London Blitz, when Londoners noticed that a few sections of the city were hit by German V-2 rockets many times, while other parts were not hit at all:

They were convinced that the rockets were targeting particular kinds of neighborhoods. But when statisticians divided a map of London into small squares and counted the bomb strikes, they found that the strikes followed the distribution of a Poisson process—the bombs, in other words, were falling at random. The episode is depicted in Thomas Pynchon’s 1973 novel Gravity’s Rainbow, in which statistician Roger Mexico has correctly predicted the distribution of bomb strikes, though not their exact locations. Mexico has to deny that he is a psychic and fend off desperate demands for advice on where to hide
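To get a feel for what the statisticians did, here’s a rough sketch in Python. It scatters strikes uniformly at random over a square map, counts the hits in each grid cell, and compares the tally with what a Poisson model predicts. The function names, the 24 x 24 grid and the 537 strikes are my own illustrative choices, not figures from Pinker’s account.

```python
import math
import random
from collections import Counter


def simulate_strike_counts(n_strikes, grid_size, seed=0):
    """Drop strikes uniformly at random on a unit square; tally hits per cell."""
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(n_strikes):
        col = int(rng.random() * grid_size)
        row = int(rng.random() * grid_size)
        hits[(row, col)] += 1
    # tally[k] = number of cells that received exactly k strikes.
    tally = Counter(hits.values())
    tally[0] = grid_size * grid_size - len(hits)
    return tally


def poisson_expected_cells(n_strikes, grid_size, k):
    """Expected number of cells with exactly k strikes under a Poisson model."""
    n_cells = grid_size * grid_size
    lam = n_strikes / n_cells  # average strikes per cell
    return n_cells * math.exp(-lam) * lam ** k / math.factorial(k)


if __name__ == "__main__":
    tally = simulate_strike_counts(537, 24)
    print("strikes  simulated cells  Poisson expectation")
    for k in range(max(tally) + 1):
        print(k, tally.get(k, 0), round(poisson_expected_cells(537, 24, k), 1))
```

Every cell is equally likely to be hit, yet the simulation typically leaves a large share of cells untouched while a handful collect several strikes. That unevenness is what the Poisson column predicts, so it is no evidence of targeting.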

Another example is the gambler’s fallacy – the belief that after a long run of (say) heads, the next toss is more likely to be tails:

Tversky and Kahneman showed that people think that genuine sequences of coin flips (like TTHHTHTTTT) are fixed, because they have more long runs of heads or of tails than their intuitions allow, and they think that sequences that were jiggered to avoid long runs (like HTHTTHTHHT) are fair.
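The two sequences in that passage can be checked directly. This small Python sketch measures the longest run and the number of switches between heads and tails in each, then estimates the average longest run in ten fair flips by simulation. The helper names are mine, and the simulation is only a rough check, not Tversky and Kahneman’s method.

```python
import random
from itertools import groupby


def longest_run(flips):
    """Length of the longest streak of identical outcomes."""
    return max(len(list(group)) for _, group in groupby(flips))


def count_switches(flips):
    """Number of adjacent pairs where the outcome changes."""
    return sum(a != b for a, b in zip(flips, flips[1:]))


def mean_longest_run(n_flips, n_trials=10_000, seed=0):
    """Average longest run over many simulated fair sequences."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        flips = "".join(rng.choice("HT") for _ in range(n_flips))
        total += longest_run(flips)
    return total / n_trials


if __name__ == "__main__":
    genuine = "TTHHTHTTTT"
    jiggered = "HTHTTHTHHT"
    print(longest_run(genuine), count_switches(genuine))    # 4 4
    print(longest_run(jiggered), count_switches(jiggered))  # 2 7
    print(mean_longest_run(10))
```

A fair coin switches on half of the nine adjacent pairs on average, i.e. 4.5 times, so the genuine sequence (4 switches) is typical and the jiggered one (7 switches) alternates too often. The simulated average longest run in ten flips comes out between 3 and 4, so a streak of four is nothing unusual.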

The exhibit above shows a simulated plot of the stars on the left. On the right it shows the pattern made by glow worms on the ceiling of the famous Waitomo caves, New Zealand. The stars show constellation-like forms but the virtual planetarium produced by the glow worms is relatively uniform.

That’s because glow worms are gluttonous and inclined to eat anything that comes within snatching distance, so they keep their distance from each other and end up relatively evenly spaced, i.e. non-randomly. Says Pinker:

The one on the left, with the clumps, strands, voids, and filaments (and perhaps, depending on your obsessions, animals, nudes, or Virgin Marys) is the array that was plotted at random, like stars. The one on the right, which seems to be haphazard, is the array whose positions were nudged apart, like glowworms.

Thus random events will occur in clusters, because “it would take a non-random process to space them out. The human mind has great difficulty appreciating this law of probability”.
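The difference between the two plots can be reproduced with a few lines of Python. This is a sketch under my own assumptions, not Pinker’s procedure: the “stars” are points placed uniformly at random, and the “glow worms” are produced by a simple rejection rule that discards any candidate point closer than a hypothetical minimum distance to an existing one.

```python
import math
import random


def random_points(n, rng):
    """'Stars': n points placed uniformly at random in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def spaced_points(n, min_dist, rng, max_tries=100_000):
    """'Glow worms': reject any candidate closer than min_dist to a neighbour."""
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points


def nearest_neighbour_distances(points):
    """Distance from each point to its closest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    stars = random_points(150, rng)
    worms = spaced_points(150, 0.04, rng)
    print("closest pair, stars:", min(nearest_neighbour_distances(stars)))
    print("closest pair, worms:", min(nearest_neighbour_distances(worms)))
```

The random points include pairs that almost touch, which the eye reads as clumps, while the spaced points never come closer than the chosen minimum. It takes the extra rejection rule, a non-random process, to get the even-looking pattern.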


Can selection bias shoot down an argument?

Manhattan in Motion

Urban policy is rich in opportunities for fallacious thinking – for example, surveys that purport to show huge latent demand for a particular mode of transport, but sample only users of that mode. So I’m always interested in new examples of where we can so easily go wrong.

Alex Tabarrok at Marginal Revolution recently provided a pointer to this great historical example by John D. Cook of the importance of selection bias. It isn’t specifically related to urban policy, but nevertheless provides a valuable lesson. Cook says:

During WWII, statistician Abraham Wald was asked to help the British decide where to add armour to their bombers. After analysing the records, he recommended adding more armour to the places where there was no damage!

According to Cook, while it seems backward at first, Wald realised his data came from bombers that hadn’t been shot down:

That is, the British were only able to analyse the bombers that returned to England; those that were shot down over enemy territory were not part of their sample. These (surviving) bombers’ wounds showed where they could afford to be hit. Said another way, the undamaged areas on the survivors showed where the lost planes must have been hit because the planes hit in those areas did not return from their missions.

Wald assumed that the bullets were fired randomly, that no one could accurately aim for a particular part of the bomber. Instead they aimed in the general direction of the plane and sometimes got lucky. So, for example, if Wald saw that more bombers in his sample had bullet holes in the middle of the wings, he did not conclude that Nazis liked to aim for the middle of wings. He assumed that there must have been about as many bombers with bullet holes in every other part of the plane but that those with holes elsewhere were not part of his sample because they had been shot down.
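Wald’s reasoning is easy to reproduce in a toy simulation. In the Python sketch below, every section of the plane is equally likely to be hit, but hits in some sections are more likely to bring the plane down. The section names, the lethality figures and the three hits per plane are all hypothetical numbers I’ve made up for illustration; they are not from Wald’s analysis.

```python
import random
from collections import Counter

# Hypothetical chance that a single hit in each section downs the plane.
LETHALITY = {"engine": 0.6, "cockpit": 0.4, "fuselage": 0.1, "wings": 0.05}


def fly_missions(n_planes, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits on surviving planes) by section."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits, survivor_hits = Counter(), Counter()
    for _ in range(n_planes):
        # Bullets land at random: every section is equally likely to be hit.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every one of its hits.
        survived = all(rng.random() >= LETHALITY[s] for s in hits)
        all_hits.update(hits)
        if survived:
            survivor_hits.update(hits)
    return all_hits, survivor_hits


if __name__ == "__main__":
    all_hits, survivor_hits = fly_missions(10_000)
    for section in LETHALITY:
        print(section, all_hits[section], survivor_hits[section])
```

Across all planes the hits are spread roughly evenly, but among the survivors the engine and cockpit show far fewer holes than the wings. Anyone who sees only the returning planes would conclude the wings are the weak spot, which is exactly backwards.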

Here’s another example via Marc Gawley, who noted a BBC story implying that three quarters of those committing crimes during the London riots had previous convictions or cautions. Is that really true, he wonders, or:

is it that those with previous convictions have their details on a police database and it was therefore possible to identify (and find) those people based on images from photographs and recordings made last month?