Posts Tagged statistics

A simple statistic reveals amazing wisdom from crowds

My good friend Rich Burnham, knowing my interest in off-beat science and stats, drew my attention to this video by YouTuber Michael Stevens (aka “Vsauce”) on an experiment that failed to confirm a phenomenon called the “wisdom of the crowds.”

Normally, as demonstrated by Sir Francis Galton in 1906 from data collected at a country fair on 787 guesses at the weight of an ox,* groups of people exhibit a high level of collective intelligence via a simple median (the “middlemost estimate”)—being off by only 9 pounds for the 1,198 pound ox. This amazes me—blowing away my mindset that the wisdom of a crowd degrades to the ‘lowest common denominator,’ that is, the people with the least knowledge.

Experts agree with Vsauce’s hypothesis that the complete failure of his crowd to correctly guess the number of jelly beans in his jar stemmed from the estimates being shared, rather than gathered with no cross-talk.

“The wisdom of crowds requires that people’s estimates be independent. Studies have found that when people can observe the estimates of others, the accuracy of the crowd typically goes down. People’s errors become correlated or dependent, and are less likely to cancel each other out. We follow our peers, to the detriment of the performance of the group.”

  – Psychology professor Tania Lombrozo, No Man Is An Island: The Wisdom Of Deliberating Crowds, posted 3/12/18 by WGCU, a National Public Radio-member station on Florida’s Gulf Coast
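To see why independence matters, here is a minimal simulation sketch in Python. Everything in it is made up for illustration: the true count (borrowed from the weight of Galton’s ox), the crowd size, the spread of the private estimates, and the rule that each guesser leans 80% toward the average of the guesses already posted. It is not the model used in any of the studies cited here.

```python
import random
import statistics

TRUE_COUNT = 1198   # hypothetical true value (borrowed from the ox's weight)
CROWD_SIZE = 101    # hypothetical number of guessers
NOISE_SD = 100      # hypothetical spread of each person's private estimate
HERD_WEIGHT = 0.8   # hypothetical weight a guesser puts on the posted guesses


def independent_guesses(rng):
    """Everyone guesses in private, so the errors are independent."""
    return [rng.gauss(TRUE_COUNT, NOISE_SD) for _ in range(CROWD_SIZE)]


def herded_guesses(rng):
    """Each guesser sees the clipboard and leans toward the running mean."""
    guesses = []
    for _ in range(CROWD_SIZE):
        private = rng.gauss(TRUE_COUNT, NOISE_SD)
        if guesses:
            posted = statistics.fmean(guesses)
            guess = HERD_WEIGHT * posted + (1 - HERD_WEIGHT) * private
        else:
            guess = private  # the first guesser has nothing to copy
        guesses.append(guess)
    return guesses


def mean_abs_median_error(make_guesses, trials=200, seed=1):
    """Average distance between the crowd median and the truth."""
    rng = random.Random(seed)
    errors = [abs(statistics.median(make_guesses(rng)) - TRUE_COUNT)
              for _ in range(trials)]
    return statistics.fmean(errors)


independent_error = mean_abs_median_error(independent_guesses)
herded_error = mean_abs_median_error(herded_guesses)
print(f"independent crowd: median off by {independent_error:.1f} on average")
print(f"herded crowd:      median off by {herded_error:.1f} on average")
```

Under these assumptions the herded crowd’s median lands farther from the truth on average, because an early bad guess gets copied rather than cancelled out.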

I made the same mistake in a 2019 contest for my Anderson clan. While vacationing together at a lakeside resort, I gathered individuals’ estimates on the number of aluminum-can pull-tabs I’d collected for donation to the Ronald McDonald House in Minneapolis. See the picture below of my wife Karen (holding Bertie) working with our oldest grandchild Archer to do the count. I asked the participants to write down their guesses on a clipboard by the jar, which created more fun via the gaming aspects of going just above or below a competitor, but violated the statistical requirement for independence.

An interesting workaround that allows collaboration for tapping the “wisdom of the crowds” is to first break the group into a number of teams and then average out their consensus estimates. See the research, based on results from a group of 5,180 people asked to estimate the height of the Eiffel Tower or the like, in this 2018 Letter in Nature Human Behaviour on “Aggregated knowledge from a small number of debates outperforms the wisdom of large crowds.”

To keep things simple, the next time my bottle of pull-tabs fills up for another contest to guess the total, I will go with the simpler approach for crowd wisdom by banning cross talk and then seeing if the median estimate wins. If it doesn’t work, I will blame it on our family group being too small (though it does exceed 20—all in one cabin!).

* “Vox Populi,” Nature, 1907


Talking turkey on extrapolation—do not stick your neck out!




“Extrapolating patterns beyond their natural range can lead to false conclusions,” warns PhD statistician Christine Anderson-Cook in the October issue of Quality Progress.* That should be obvious to everyone. Unfortunately, though, one naturally “eye fits” the final leg of every graph straight out into the beyond of the X axis, thus overlooking the possibility of an imminent bend such as the one shown here by Gapminder. A classic case of this occurred a decade ago when a consensus of forecasters predicted that all of the world economies would experience continued expansion in 2009, none foreseeing the Great Recession.**

“Avoid Linear extrapolation … The turkey’s first 1000 days are a seemingly unending succession of gradually improving circumstances confirmed by daily experience. What happens on Day 1001? Thanksgiving.”

– John E. Sener (Source: Consortium for the Advancement of Undergraduate Statistics Education)
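The turkey’s predicament can be put into numbers with a small Python sketch. The weights below are invented (a perfectly steady gain of 0.02 pounds per day), and the day-1001 outcome of zero is my own stand-in for Thanksgiving, so treat this as an illustration of the trap rather than real poultry data.

```python
# Hypothetical turkey: gains weight steadily for 1000 days, then Thanksgiving.
days = list(range(1, 1001))
weights = [0.5 + 0.02 * d for d in days]   # made-up pounds, strictly linear

# Ordinary least-squares line fitted to the first 1000 days only.
n = len(days)
mean_x = sum(days) / n
mean_y = sum(weights) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(days, weights))
         / sum((x - mean_x) ** 2 for x in days))
intercept = mean_y - slope * mean_x

predicted_day_1001 = intercept + slope * 1001
actual_day_1001 = 0.0   # assumption: the turkey is no longer on the scale

print(f"straight-line forecast for day 1001: {predicted_day_1001:.2f} lb")
print(f"what actually happens on day 1001:   {actual_day_1001:.2f} lb")
```

The line fits the first 1000 days perfectly, which is exactly why it inspires false confidence one day beyond the data.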

Be careful out there!

* “Straight Line or Not?” p45.

** Financial Times, “An astonishing record – of complete failure”, Tim Harford, May 30, 2014


Marketers trick math-challenged consumers with ploys on percentages




WSJ’s “The Numbers” columnist, Jo Craven McGinty, advised readers “To Shop Smart, Mind the Percentages” in this weekend’s issue. It turns out that, as I blogged back in 2007, percentages are puzzling to many people. Put yourself to this test from McGinty: You can buy a regular container of ice cream at 33% off (option 1) or pay the usual price for a container with 33% more in it as a free bonus (option 2). If you picked the first option without any hesitation, you go to the head of the class. Those of you who went for option 2, likely the majority of the general population, are the target for the marketers.

“People always go for the bonus.” – Quote in WSJ from Akshay Rao, marketing professor, University of Minnesota and co-author of When Two and Two is Not Equal to Four: Errors in Processing Multiple Percentage Changes

The remainder who withheld judgement until they did the calculation (33% more ice cream for the same money cuts the unit price by only about 25%) get full credit for knowing that percentages require thinking to work out their effect. Kudos to you for being math-savvy.
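For those who like to check the arithmetic, here is a short Python sketch of McGinty’s ice cream test. The regular price of $1.00 and size of 1.00 unit are placeholders of my own choosing; only the 33% figures come from the column.

```python
def unit_price(price, quantity):
    """Price paid per unit of ice cream."""
    return price / quantity


# Placeholder baseline: $1.00 buys 1.00 unit of ice cream.
regular = unit_price(1.00, 1.00)

# Option 1: same container at 33% off.
option_1 = unit_price(1.00 * (1 - 0.33), 1.00)

# Option 2: usual price for a container holding 33% more.
option_2 = unit_price(1.00, 1.00 * (1 + 0.33))

# Express the bonus as the discount it is really worth.
equivalent_discount = 1 - option_2 / regular

print(f"option 1 unit price: {option_1:.3f}")
print(f"option 2 unit price: {option_2:.3f}")
print(f"a 33% bonus equals a {equivalent_discount:.1%} discount")
```

The bonus divides the price by 1.33 rather than multiplying it by 0.67, which is why it is worth less than the discount with the same headline percentage.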

“To be statistically literate, one must be able to form arithmetic comparisons of any two numbers.”

– Milo Schield, Department of Business, Accounting and MIS, Augsburg College, Minneapolis, “Common Errors In Forming Arithmetic Comparisons”, Sept 1999, Association of Public Data Users, Volume 1.51 Journal Of Significance


Fireworks that do not go Fourth deserve a resounding fizgig




My word for today is “fizgig”, a type of firework that makes a loud hiss. I consider this an onomatopoeic word, given that the “fiz” characterizes the sound.

Residents of Saint Paul, where I grew up, must be fizzing their mayor today after he canceled the city’s fireworks this year due to budget concerns. Boo, hiss!

More commonly, fireworks are frowned upon due to safety concerns. For example, a Florida television station broadcast a warning yesterday that Independence Day revelers should be careful not to “be a statistic” by shooting off fireworks. I don’t get this. Isn’t being a statistic a good thing?

An enterprising fireworks vendor turned the statistics around in a very creative way by touting a long-term trend toward fewer injuries per pound of pyrotechnics, citing a decrease of more than 50% since 1994. I find this fellow’s numerical and sentimental arguments in favor of fireworks very compelling: Check it out here.

“It ought to be solemnized with Pomp and Parade, with Shews, Games, Sports, Guns, Bells, Bonfires and Illuminations from one End of this Continent to the other from this Time forward forever more.”

– John Adams, July 3, 1776


2017—A prime year for statistics




To cap off the year, I present half a dozen wacky new statistics:

  • 2017 was a “sexy” prime, that is, 6 years beyond the last one in 2011 (six in Latin is “sex”).
  • By 2050 the plastic trash floating in the oceans will outweigh the fish. (Source: Robert Samuelson, “The Top 10 Stats of 2017”, Washington Post, 12/27/17.)
  • University of Warwick statistician Nathan Cunningham debunked the “i-before-e except after c” rule based on evaluating 350,000 English words: The ratio of “ie” to “ei” is exactly the same for the after-c words as it is for all words in general. Weird science!
  • After digging into data compiled by the National UFO Reporting Center (NUFORC), Sam Monfort, a doctoral student in Human Factors and Applied Cognition at George Mason University, concluded that UFOs are visiting at all-time highs. Americans sight UFOs at a rate that exceeds the worldwide median by 300 times. Far out!
  • In May, an Australian cat named Omar was confirmed by the BBC as the world’s biggest at nearly 4 feet long and over 30 pounds. My oh meow!
  • Nearly a thousand people dressed up like penguins in Youngstown, Ohio, this October to break the world record. Coincidentally, National Geographic reported on December 13 that the fossilized remains of a giant, man-sized penguin were found in New Zealand. Eerie!
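The first item on this list is easy to verify. Here is a small Python sketch using plain trial division; the function names are my own, and this approach is only sensible for small numbers such as calendar years.

```python
def is_prime(n):
    """Trial division: fine for small numbers such as years."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    divisor = 3
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 2
    return True


def is_sexy_pair(p, q):
    """True when p and q are both prime and exactly 6 apart."""
    return abs(p - q) == 6 and is_prime(p) and is_prime(q)


print(is_sexy_pair(2011, 2017))
print([year for year in range(2011, 2018) if is_prime(year)])
```

The second line of output lists the prime years from 2011 through 2017, confirming that nothing prime falls between the two.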


‘Roid rage




Let’s not get caught off guard by an Earth-killing asteroid. As Dylan Thomas said: “Do not go gentle into that good night, …rage against the dying of the light.” 

That is the mission of NASA.  If you are reading this, chances are that Asteroid 2012 TC4 whizzed by today at 30,000 miles per hour—closely monitored by a network of observatories. Check out the details at this NASA website. They take asteroid defense very seriously.  Their defense plans for redirecting asteroids will be tested out in 2022 on a double asteroid Didymos B as explained here.

Keep in mind that asteroid 1950DA, about three-quarters of a mile wide and big enough to destroy our planet, has a 0.1% chance of hitting the Earth in 2880. In case NASA does not succeed in their defense efforts, start digging now and you might get hunkered down enough to survive for a short while after that.


Errors, blunders & lies




David S. Salsburg, author of “The Lady Tasting Tea”*, which I enjoyed greatly, hits the spot again with his new book on Errors, Blunders & Lies: How to Tell the Difference. It’s all about a fundamental statistical equation: Observation = model + error. The errors, of course, are normal and must be expected. But blunders and lies cannot be tolerated.

The section on errors concludes with my favorite chapter: “Regression and Big Data”. There Salsburg endorses my favorite way to avoid over-fitting of happenstance results—hold back at random 10 percent of the data and see how well these outcomes are predicted by the 90 percent you regress.** Whenever I tried this on manufacturing data it became very clear that our high-powered statistical models worked very well for predicting what happened last month. 😉 They were worthless for seeing into the future.
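The hold-back check is simple enough to sketch in a few lines of Python. The data here are simulated (a straight line plus random error, with a seed I picked arbitrarily), not the manufacturing data I mentioned, and a one-factor least-squares fit stands in for whatever model you actually regress.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(xs, ys, slope, intercept):
    """Root-mean-square prediction error of the fitted line."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Made-up process data: response = 3 + 2*x plus random error.
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1.0) for x in xs]

# Hold back 10 percent of the rows at random.
rows = list(range(len(xs)))
rng.shuffle(rows)
n_holdout = len(rows) // 10
holdout_rows, training_rows = rows[:n_holdout], rows[n_holdout:]

train_x = [xs[i] for i in training_rows]
train_y = [ys[i] for i in training_rows]
hold_x = [xs[i] for i in holdout_rows]
hold_y = [ys[i] for i in holdout_rows]

# Regress on the 90 percent, then predict the 10 percent held back.
slope, intercept = fit_line(train_x, train_y)
training_rmse = rmse(train_x, train_y, slope, intercept)
holdout_rmse = rmse(hold_x, hold_y, slope, intercept)

print(f"error on the 90 percent used for fitting: {training_rmse:.2f}")
print(f"error on the 10 percent held back:        {holdout_rmse:.2f}")
```

With well-behaved simulated data the two errors come out similar. When the hold-back error is far larger than the fitting error, the model is probably chasing happenstance.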

Another personal favorite is the bit on spurious correlations that Italian statistician Carlo Bonferroni*** guarded against, also known as the “will of the wisps” per the founder of Yale’s statistics school—Francis Anscombe.

If you are looking for statistical insights that come without all the dreary mathematical details, this book on “Errors, Blunders & Lies” will be just the ticket. Salsburg concludes with a timely heads-up on the statistical lies caused by “curbstoning” (reported here by the New York Post), which may soon combine with gerrymandering (see my previous post) to create a perfect storm of data tampering in the upcoming census. We’d all do well to sharpen up our savvy on stats!

The old saying is that “figures will not lie,” but a new saying is “liars will figure.” It is our duty, as practical statisticians, to prevent the liar from figuring; in other words, to prevent him from perverting the truth, in the interest of some theory he wishes to establish.

– Carroll D. Wright, U.S. government statistician, speaking to 1889 Convention of Commissioners of Bureaus of Statistics of Labor.

*Based on the story told here.

**An idea attributed to the inventor of modern day statistics—R. A. Fisher, and endorsed by famed mathematician John Tukey, who suggested the hold-back be 10 percent.

***See my blog on Bonferroni of Bergamo.


Statistics to make distracted drivers more aware this month




April is now Mathematics and Statistics Awareness Month (formerly it was just math, no stats). It also is Distracted Driving Awareness Month.

Putting these two themes together brings us to data published this month by Zendrive, a San Francisco-based startup that uses smartphone sensors to measure drivers’ behavior. They claim that 90% of collisions are due to human error, of which 1 in 4 stem from phone use while driving.

These statistics are very worrying to start off with. But, according to this blog, it gets far worse when you drill down on Zendrive’s 3-month analysis of 3 million anonymous drivers, who made 570 million trips and covered 5.6 billion miles:

  • Drivers used their phones on 88 percent of the trips
  • They spent 3.5 minutes per hour on calls (an enormous amount of time considering that even a few seconds of distraction can create dire consequences)

About a third of US states prohibit use of hand-held phones while driving. Does this reduce distraction? The stats posted by Zendrive are not definitive.

It seems to me that hands-free must be far safer. However, this ranking of driving distractions* (benchmarked to plain driving, which rates 1) does not provide much support for what is seemingly obvious:

  1. Listening to the radio — 1.21
  2. Listening to a book on tape — 1.75
  3. Talking on a hands-free cellphone — 2.27
  4. Talking with a passenger in the front seat — 2.33
  5. Talking on a hand-held cellphone — 2.45
  6. Interacting with a speech recognition e-mail or text system — 3.06

For all the fuss about talking on the phone, whether hands-free or not, it causes about the same distraction as chatting with a passenger.

This list does not include texting, which Consumer Reports figures is 23 times more distracting than talking on your cell phone while driving.**

Please avoid any distractions when you drive, especially texting.

*Source: This 10/16/15 Boston Globe OpEd

**Posted here


“Bright line” rules are simple but not very bright




Just the other day a new term came to light for me—a “bright line” rule.  Evidently this is commonplace legal jargon that traces back to at least 1946 according to this language log.  It refers to “a clear, simple, and objective standard which can be applied to judge a situation” by this USLegal.com definition.

I came across the term in this statement* on p-values and statistical significance from the American Statistical Association (ASA):

“Practices that reduce data analysis or scientific inference to mechanical ‘bright-line’ rules (such as ‘p < 0.05’) for justifying scientific claims or conclusions can lead to erroneous beliefs and poor decision-making.”

The ASA goes on to say:

“Researchers should bring many contextual factors into play to derive scientific inferences, including the design of the study, the quality of the measurements, the external evidence for the phenomenon under study, and the validity of assumptions that underlie the data analysis.”

It is hard to argue with the adage that if the p-value is high, the null will fly, that is, results cannot be deemed statistically significant. However, I’ve never bought into 0.05 being the bright-line rule. It is good to see ASA dulling down this overly simplistic statistical standard.

I can see the value for “bright line rules” in legal processes, a case in point being the requirement for the Miranda warning being given to advise US citizens of their rights when being arrested.  However, it is ludicrous to apply such dogmatism to statistics.

*(The American Statistician, v70, #2, May 2016, p131)


Men who have children make more money and live longer–correlation or causation?




Hey guys, if you want to make more money and live longer, have kids. Anyway, that seems to be the gist of two studies reported this month, at least from my perspective as a father of five. Here is the scoop:

  • “Men in the top 1 percent distribution level live about 15 years longer than men in the bottom 1 percent on the income distribution in the United States.” – Raj Chetty, professor of economics at Stanford University, quoted in this report by NPR on an article he lead-authored in The Journal of the American Medical Association on “The Association Between Income and Life Expectancy in the United States, 2001-2014”.
  • Working fathers enjoy 21% ‘wage bonus’ over childless colleagues according to a study by United Kingdom’s Trades Union Congress reported here

Before you run off madly making babies, keep in mind that correlation may not be causation. For example, as reported in this exposé by Slate, statistics indicate that eating ice cream turns people into killers. Could that really be the scoop?
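Here is a minimal Python sketch of how such a correlation can arise with no causation at all. I am assuming, purely for illustration, that a lurking variable (daily temperature) drives both ice cream sales and homicides. All of the numbers are simulated and none come from the Slate piece.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical lurking variable: daily temperature.
temperature = [rng.gauss(70, 15) for _ in range(1000)]
# Both outcomes depend on temperature, and not on each other.
ice_cream = [2.0 * t + rng.gauss(0, 20) for t in temperature]
homicides = [0.1 * t + rng.gauss(0, 1.0) for t in temperature]

raw = pearson(ice_cream, homicides)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(homicides, temperature))

print(f"ice cream vs. homicides:            r = {raw:.2f}")
print(f"same, after removing temperature:   r = {adjusted:.2f}")
```

In this simulation the strong raw correlation all but vanishes once the temperature effect is taken out of both series.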
