Tikalon Blog by Dev Gualtieri

The Wisdom of Composite Crowds

April 27, 2017

Physical scientists are able to control their experiments quite closely. Chemists will choose pure materials for their reactions, and physicists will undertake measurements under precise temperature, pressure, and other conditions. Experiments in many of the life sciences are seldom as precise, since they're conducted on organisms that are not precisely equivalent from one individual to another. It's only by doing experiments on many organisms, and analyzing the data statistically, that those disciplines can transcend what Rutherford called "stamp collecting."[1]

Statistics are important when we venture away from science and into the realm of opinion; but, as the performance of some opinion polls for the last US Presidential election reminds us, the methodology can be tricky. However, it appears that the composite opinion of large numbers of properly selected people will give us a better answer to a question than that of a few individuals.

While Aristotle (384 BC-322 BC) alludes to the idea of the "wisdom of the crowd" in his Politics (see figure),[2] the Victorian polymath Francis Galton (1822-1911) was the first to apply this concept statistically. In 1906, Galton observed a contest at a livestock fair in which 800 visitors guessed the weight of meat that would be produced by a particular ox.

"For it is possible that the many, though not individually good men, yet when they come together may be better, not individually but collectively, than those who are so..."(Aristotle, Politics, Book III, Section 1281a41-1281b1, via the Tufts University Perseus Digital Library.[2])

Although people who attended such a livestock fair a hundred years ago would have had a reasonable knowledge of what a pound weight of meat would look like "on the hoof," no individual guessed the exact weight, and there was quite a range of estimates. Galton found that the mean of the 800 guesses was within a pound of the actual weight of 1,197 pounds; that is, the mean was within 0.1% of the actual value.

Photograph of Francis Galton (1822-1911), with signature.

Galton was a Victorian polymath best known to scientists as the originator of the statistical concept of correlation.

He is known to most others for coining the term eugenics for a concept that goes back to Plato.

(image source and signature source, via Wikimedia Commons.)

Fast-forward a little more than a hundred years for an experiment similar to Galton's wisdom of crowds observation. Stefan Krause of the Lübeck University of Applied Sciences in Schleswig-Holstein, Germany and his colleagues asked visitors to a museum to guess the number of marbles in a jar.[3-6] This research team did a statistical analysis of the results that went beyond Galton's average.[3]

Since there were more than 2,000 responses, they were able to make random groups from the total and determine the average for these groups. The result was that groups of forty or more had a better average guess than the top quartile of individual guesses. In other words, groups of forty ranked at the 99th percentile of performers, a result consistent with Galton's.[3]
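The random-grouping idea can be tried on synthetic data. What follows is only a minimal sketch, not the analysis of the cited study: the true count of 500, the normally distributed guesses, and the function name fraction_outperformed are all assumptions made for illustration. It draws random groups from a pool of 2,000 guesses, averages each group, and reports what share of individuals had a larger error than the group average.

```python
import bisect
import random
import statistics


def fraction_outperformed(guesses, truth, group_size, trials=2000, seed=0):
    """Average fraction of individuals whose error is larger than the error
    of the mean guess of a randomly drawn group of the given size."""
    rng = random.Random(seed)
    errors = sorted(abs(g - truth) for g in guesses)
    total = 0.0
    for _ in range(trials):
        group = rng.sample(guesses, group_size)
        group_error = abs(statistics.fmean(group) - truth)
        # Count individuals whose error is strictly larger than the group's.
        worse = len(errors) - bisect.bisect_right(errors, group_error)
        total += worse / len(errors)
    return total / trials


# Synthetic stand-in for the museum data (an assumption, not real data):
# unbiased but noisy guesses around a hypothetical true count.
TRUE_COUNT = 500
data_rng = random.Random(1)
guesses = [data_rng.gauss(TRUE_COUNT, 150) for _ in range(2000)]

for size in (1, 5, 40):
    share = fraction_outperformed(guesses, TRUE_COUNT, size)
    print(f"group of {size:2d} beats {share:.0%} of individuals on average")
```

Real guesses are often skewed rather than symmetric about the true value, so the percentages from this toy model should not be read as a reproduction of the published numbers. The qualitative trend, that larger groups rank higher among individuals, is the point.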

Galton's initial crowd wisdom observation and the marble experiment have the common feature that the guesses are made in isolation. Recent research by scientists at University College London (London, UK), the Universidad Torcuato Di Tella (Buenos Aires, Argentina), and TED (Buenos Aires, Argentina) has examined how the wisdom of crowds can be increased by discussion among members of a group.[7] The group members still get their own vote, but the vote is informed by the discussions.[7]

Statisticians would likely expect the best results in a collective wisdom study to arise from independent voting, principally because the accuracy of a statistical estimate increases with the number of independent observations. This study, however, demonstrated that deliberation and discussion improved the collective wisdom.[7] The researchers had a crowd of 5,180 people answer some simple questions, such as the height of the Eiffel Tower, either independently or after discussions in groups of five.[7] It was found that the estimates given after discussions were more diverse, and their average more accurate.[7]

Educated guessing.

Scientists make educated guesses, so their approach to guessing the number of objects in a jar differs from that of most people.

A random packing of spheres will occupy about 2/3 of the volume of a space, while ellipsoidal objects, such as the olives shown, will pack about 3/4 of the volume.

I discussed packing in a previous article (Packing, November 30, 2010).

(Wikimedia Commons image.)
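As a rough sketch of such an educated guess, the count can be estimated as the packing fraction times the container volume divided by the volume of one object. The jar and marble dimensions below are hypothetical, chosen only to make the arithmetic concrete, and the helper names are my own.

```python
import math


def sphere_volume(diameter):
    """Volume of a sphere from its diameter."""
    return math.pi * diameter**3 / 6


def cylinder_volume(diameter, height):
    """Volume of a cylindrical jar from its inner diameter and height."""
    return math.pi * diameter**2 / 4 * height


def estimate_count(container_volume, object_volume, packing_fraction):
    """Estimated number of objects: the filled volume divided by the
    volume of a single object, rounded to the nearest integer."""
    return round(packing_fraction * container_volume / object_volume)


# Hypothetical dimensions in centimeters (assumptions, not measured values).
jar = cylinder_volume(diameter=10.0, height=15.0)
marble = sphere_volume(diameter=1.6)

# Random packing of spheres fills about 2/3 of the volume.
print(estimate_count(jar, marble, packing_fraction=2 / 3))
```

The estimate ignores the looser packing near the jar wall and any empty space at the top, so it is a starting point for a guess rather than a precise count.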

A different way to improve crowd wisdom was discovered by researchers from the Massachusetts Institute of Technology (Cambridge, Massachusetts) and Princeton University (Princeton, New Jersey).[8-10] The technique, which they call the "surprisingly popular" algorithm, is based on asking each participant two questions.

The first question is what the participant thinks is the right answer, while the second asks what the participant thinks the popular answer will be. Comparing how often an answer is actually given with how often it was predicted to be given allows choice of the better answer.[10] Says Drazen Prelec, a paper co-author and MIT professor, "In situations where there is enough information in the crowd to determine the correct answer to a question, that answer will be the one [that] most outperforms expectations."[10]

The team polled the participants with simple yes/no questions, some examples of which appear below.[9]

1.	Japan has the world's highest life expectancy.
2.	The Nile River is more than double the length of the Volga.
3.	Portuguese is the official language of Mozambique.
4.	Avogadro's constant is greater than Planck's constant.
5.	The currency of Switzerland is the Euro.
6.	Abkhazia is a disputed territory in Georgia.
7.	The chemical symbol for Tin is Sn.
8.	The Iron Age comes after the Bronze Age.
9.	Schuyler Colfax was Abraham Lincoln's Vice President.
10.	The longest bone in the human body is the femur.

Readers of this blog will certainly have no difficulty answering questions 4, 7, 8, and probably 10. I had convictions about some of the others, but no direct knowledge of any of those answers. I did have an objection to question 4, since those constants do not have the same units. The reason why the "surprisingly popular" method works is that some people in the crowd have specialized knowledge, they are sure of their answer, and they are confident of what the typical wrong answer would be.

As a simple example, consider one question that the research team posed; namely, whether Philadelphia is the capital of Pennsylvania.[10] While Philadelphia is not Pennsylvania's capital, most people think that it is (the same can be said of New York City and New York State, whose capital is actually Albany). People who thought that Philadelphia was the capital also thought that others would give the same mistaken answer.[10]

Those who answered Harrisburg, the actual capital, also thought that others would choose Philadelphia, so there was a divergence of answers to these questions between the groups.[10] The predicted popularity of the Philadelphia answer exceeded its actual share of the answers, which means that the other answer was more popular than expected, and this directs us to the correct answer.[10]
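For a yes/no question, the selection rule described above can be sketched in a few lines. This is my own minimal reading of the idea, not the authors' code, and the vote counts and predicted percentages are invented for illustration: 65 of 100 people answer "yes" to "Philadelphia is the capital of Pennsylvania," the "yes" voters predict that 90% will say "yes," and the "no" voters predict 70%.

```python
def majority(votes):
    """Simple majority vote; votes is a list of booleans (True = yes)."""
    return sum(votes) > len(votes) / 2


def surprisingly_popular(votes, predicted_yes_shares):
    """Pick the answer whose actual share exceeds its predicted share.

    votes: each participant's own answer (True = yes).
    predicted_yes_shares: each participant's predicted fraction of
    'yes' answers in the crowd, between 0 and 1.
    """
    actual_yes = sum(votes) / len(votes)
    predicted_yes = sum(predicted_yes_shares) / len(predicted_yes_shares)
    if actual_yes == predicted_yes:
        # Neither answer is surprising; fall back to majority (an assumption).
        return majority(votes)
    return actual_yes > predicted_yes


# Invented illustration data, not taken from the study.
votes = [True] * 65 + [False] * 35
predicted = [0.90] * 65 + [0.70] * 35

print("majority says yes:", majority(votes))
print("surprisingly popular says yes:", surprisingly_popular(votes, predicted))
```

With these made-up numbers, "yes" gets 65% of the votes but was predicted to get about 83%, so "no" outperforms expectations and is selected, even though the majority voted "yes."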

It was found that the "surprisingly popular" algorithm reduced errors by 21.3 percent compared to simple majority voting.[10] As John McCoy, a Ph.D. candidate in the MIT Department of Brain and Cognitive Sciences, explains, "A lot of crowd wisdom weights people equally... But some people have more specialized knowledge."[10]

References: