Archive for category Basic stats & math
Coin-flip hack: How to call it—heads or tails—to improve your odds
Posted by mark in Basic stats & math on June 12, 2024
As I reported in this 2009 StatsMadeEasy blog, math and stats experts Persi Diaconis, Susan Holmes and Richard Montgomery long ago worked out that “vigorously flipped coins tend to come up the same way they started.”* Based on principles of physics, the “DHM” model predicts about a 0.51 chance that a coin will come up as started. That is not a big difference from 0.50, but it is worth knowing: the cumulative impact over many flips provides an appreciable winning edge.
Now, in a publication revised on June 2nd, the DHM model gains support from evidence, based on 350,757 flips, that fair coins tend to land on the same side they started. All but three of the 50 (!) co-authors—researchers at the University of Amsterdam—flipped coins in 46 different currencies and finally settled on 0.508 as the “same-side bias,” thus providing compelling statistical confirmation of the DHM physics model of coin tossing.
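To see how such a slim edge compounds, here is a minimal simulation sketch in Python. The 0.508 same-side probability comes from the study above; the million flips and the dollar-a-flip even-money wager are purely illustrative assumptions of mine.

```python
import random

# Simulate calling the starting side on every flip, where the chance the coin
# lands the same way it started is 0.508 (the Amsterdam group's estimate).
SAME_SIDE_PROB = 0.508
N_FLIPS = 1_000_000

wins = sum(random.random() < SAME_SIDE_PROB for _ in range(N_FLIPS))
print(f"Won {wins} of {N_FLIPS:,} calls ({wins / N_FLIPS:.4f})")

# On even-money $1 bets the expected profit per flip is 2*0.508 - 1 = $0.016,
# which adds up to roughly $16,000 over a million flips.
print(f"Expected profit: ${(2 * SAME_SIDE_PROB - 1) * N_FLIPS:,.0f}")
```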
This finding has many potential repercussions, for example for NFL football games going into overtime, particularly under the old rules, when a team that won the coin toss could immediately win with a touchdown. (The current rules give the opposing team one chance to tie under these circumstances.) Nevertheless, it seems to me that referees should pull out the coin without looking at which side faces up, keep it hidden from the caller's sight, and then flip it.
Let’s keep things totally fair at 50/50. (But do sneak a peek at the coin if you can!)
*Dynamical Bias in the Coin Toss, SIAM (Society for Industrial and Applied Mathematics) Review, Vol. 49, No. 2, pp. 211–235, 2007.
Common confusion about probability can be a life or death matter
Posted by mark in Basic stats & math on June 3, 2024
As a certified quality engineer (CQE), I often focused on the defect rates of manufactured products. They either passed or failed—a binary outcome.
I learned quickly that even a small probability of failure builds up quickly when applied across a series of operations. For example, I worked as chief CQE on a chemical plant startup that involved several unit operations in the process line—all at a scale never attempted before. It did not go well. By my reckoning afterwards, each of the steps probably had about an 80/20 chance of succeeding. That bred optimism among the engineers in the company who designed our plant. Unfortunately, though, multiplying 0.8 by itself repeatedly is not a winning strategy for process improvement (or gambling!).
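Here is a back-of-the-envelope sketch in Python of how those 80/20 odds compound; the step counts are illustrative, not an exact tally of the plant's unit operations.

```python
# Chance that every unit operation succeeds when each independently
# has an 80% chance of working (step counts are illustrative only).
p_step = 0.8
for n_steps in range(1, 9):
    print(f"{n_steps} steps: P(all succeed) = {p_step ** n_steps:.2f}")
# By 5 steps the overall chance of success has already fallen to about 0.33.
```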
As we approach the 80th anniversary of D-Day, this diabolical nature of binary outcomes takes on a deadly aspect when you consider how many times our warriors were sent into harm's way. The odds continually shift as technology ratchets forward on offense versus defense. This can be assessed statistically with specialized software such as that provided by Stat-Ease with its logistic regression tools. For example, see this harrowing tutorial on surface-to-air missile (SAM) antiaircraft firing.
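For readers unfamiliar with the method, here is a rough sketch, not the Stat-Ease tutorial itself, of fitting a one-factor logistic model to binary hit/miss data with Python's statsmodels; the “range to target” factor and all the numbers are invented for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical binary data: hit (1) or miss (0) versus range to target.
# The "true" relationship below is assumed purely for illustration.
rng = np.random.default_rng(1)
range_km = rng.uniform(1, 20, 200)
p_hit = 1 / (1 + np.exp(-(4.0 - 0.4 * range_km)))
hit = rng.binomial(1, p_hit)

X = sm.add_constant(range_km)              # intercept plus slope term
fit = sm.Logit(hit, X).fit(disp=False)     # maximum-likelihood logistic fit
print(fit.params)

# Predicted probability of a hit at a range of 10 km under the fitted model
print(fit.predict([[1, 10]]))
```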
Thanks to a heads up from statistician Nathan Yau in one of his daily Flowing Data newsletters, I became aware that many people, even highly educated scientists, get confused about a series of unfortunate or fortunate events (to borrow a phrase from Lemony Snicket).
Yau reports that a noted podcaster with a PhD in neuroscience suggested that chances could be summed: if your chance of getting pregnant is 20%, you should see a doctor if not successful after 5 tries. It seems that this should add up to 100% (5 x 20), but not so. By my more productive math (lame pun—taking the product, not the summation), the probability of pregnancy comes to 67%. The trick is to multiply the chance of not getting pregnant (0.8) by itself 5 times, subtract that from 1, and then multiply by 100: 1 - 0.8^5 = 0.67.
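A few lines of Python make the point; the 20% chance per try is the figure from Yau's example.

```python
# Probability of at least one success in n independent tries at 20% each.
# Summing the chances (5 x 20% = 100%) overstates it.
p = 0.20
for n in range(1, 11):
    print(f"{n:2d} tries: {1 - (1 - p) ** n:.0%}")
# After 5 tries: 1 - 0.8**5 = 0.67, i.e. 67%, not 100%.
```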
If you remain unconvinced, check out the odds via Yau’s entertaining and enlightening simulation for probability of success for repeated attempts at a binomial process.
Enjoy!
Variation in eggs presents perplexing problems for preparation
Posted by mark in Basic stats & math, food science on October 13, 2023
Today is World Egg Day.
I’m a big fan of eggs—my favorite being ones perfectly poached in an Endurance Stainless Steel Pan. However, the eggs that come from my daughters’ hens vary in size far more per container than store-bought, graded ones. I work around this by adding or subtracting time based on my experience. I really should weigh the eggs and design an experiment to optimize the time.
Coincidentally, I just received the new issue of Chance, published by the American Statistical Association. An article titled “A Physicist and a Statistician Walk into a Bar” caught my eye because one of my Stat-Ease consulting colleagues is a physicist and another is a statistician. I was hoping for a good joke at both of their expense. However, the authors (John Durso and Howard Wainer) go in a completely different direction with an amusing, but educational, story about a hypothetical optimization of soft-boiled eggs.
The problem is that recipes suffer from the “flaw of averages”—smaller eggs get undercooked and bigger ones end up overcooked unless the time gets adjusted (as I well know!).
While the physicist sits over a pint of beer and a pad of paper scratching out possible solutions based on partial differential equations related to spheroidal geometry, the statistician assesses data collected on weights versus cooking time. Things get a bit mathematical at this point* (this is an ASA publication, after all), but in the end the statistician determines that weight versus cooking time can be approximated by a quadratic model, which makes sense to the physicist based on the geometry and makeup of an egg.
I took some liberties with the data to simplify things by reducing the number of experimental runs from 41 to 8. Also, based on my experience cooking eggs of varying weights, I increased the variation to a more realistic level. See my hypothetical quadratic fit below in a confidence-banded graph produced by Stat-Ease software.
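For anyone who wants to tinker, here is a rough sketch of the same kind of quadratic fit with a confidence band, done in Python rather than Stat-Ease; the eight weight/time pairs below are invented stand-ins, not the Durso and Wainer data or my actual runs.

```python
import numpy as np
import statsmodels.api as sm

# Invented egg data (grams, minutes) for illustration only
weight = np.array([45.0, 50, 53, 57, 60, 63, 68, 72])
minutes = np.array([4.0, 4.4, 4.6, 5.0, 5.2, 5.5, 6.0, 6.4])

# Quadratic model: time = b0 + b1*weight + b2*weight^2
X = sm.add_constant(np.column_stack([weight, weight**2]))
fit = sm.OLS(minutes, X).fit()
print(fit.params)

# 95% confidence band for the mean cooking time across the weight range
grid = np.linspace(weight.min(), weight.max(), 50)
Xg = sm.add_constant(np.column_stack([grid, grid**2]))
band = fit.get_prediction(Xg).conf_int(alpha=0.05)
print(band[:5])   # lower and upper limits at the first few grid points
```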
Perhaps someday I may build up enough steam to weigh every egg, time the poaching and measure the runniness of the resulting yolks. However, for now I just eat them as they are after being cooked by my assessment of the individual egg-size relative to others in the carton. With some pepper and salt and a piece of toast to soak up any leftover yolk, my poached eggs always hit the spot.
*For example, they apply Tukey’s ladder of variable transformations – a method that works well on single-factor fits and can be related to the shape of the curve being concave or convex, going up or down the powers, respectively. It relates closely to the more versatile Box-Cox plot provided by Stat-Ease software. Using the same data as Durso and Wainer presented, I found that the Box-Cox plot recommended the same transformation as Tukey’s ladder.
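As a footnote to the footnote, here is a minimal sketch of how scipy estimates a Box-Cox power for a strictly positive response; it is not the regression-based Box-Cox plot in Stat-Ease software, and the skewed data are simulated just to show the mechanics.

```python
import numpy as np
from scipy import stats

# Simulated, strictly positive, right-skewed response
rng = np.random.default_rng(2)
y = np.exp(rng.normal(2.0, 0.3, 40))

# Box-Cox estimates the power (lambda) that best normalizes the response:
# lambda near 0 suggests a log transform, near 0.5 a square root, near 1 none.
y_transformed, lam = stats.boxcox(y)
print(f"Estimated lambda: {lam:.2f}")
```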
Industrial statisticians keeping calm and carrying on with p-values
Posted by mark in Basic stats & math on February 11, 2022
Last week I attended a special webinar on “Statistical Significance and p-values” presented by the European Network of Business and Industrial Statistics (ENBIS). To my relief, none of the speakers called for abandoning the use of p-values. Though I feel that p-values should not be the sole statistic relied upon for deeming results significant or not, when used properly they certainly reduce the risk of pressing ahead with spurious outcomes. It was great to get varying perspectives on this issue.
Here are a couple of fun quotes that I gleaned from this ENBIS event:
- “Surely, God loves the .06 nearly as much as the .05. Can there be any doubt that God views the strength of evidence for or against the null as a fairly continuous function of the magnitude of p?” – Rosnow, R.L. & Rosenthal, R. “Statistical procedures and the justification of knowledge in psychological science”, American Psychologist, 44 (1989), 1276-1284.
- “My definition of a statistician is ‘one who prefers true doubts to false certainty’.” – Stephen Senn (Statistical Consultant, Edinburgh, Scotland, UK)
If you have a strong stomach for stats, see this Royal Society review article: The reign of the p-value is over: what alternative analyses could we employ to fill the power vacuum? It includes discussion of an alternative to p values called the “Akaike information criterion” (AIC). This interested me, because, as a measure for goodness of model-fit, Stat-Ease software provides AICc—a version of this statistic that corrects (hence the appendage “c”) for the small sample sizes of industrial experiments (relative to large retrospective scientific studies).
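For the formula-curious, the small-sample correction is simple enough to spell out in a few lines of Python; the two log-likelihoods in the example are hypothetical.

```python
def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Corrected Akaike information criterion (lower is better).

    k = number of estimated parameters, n = number of observations.
    The correction term grows as n shrinks toward k, penalizing
    over-parameterized models fit to small experiments.
    """
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical comparison of two models fit to the same 12-run experiment
print(aicc(log_likelihood=-8.4, k=3, n=12))   # simpler model
print(aicc(log_likelihood=-7.9, k=6, n=12))   # more complex model
```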
Engineer detects “soul crushing” patterns in “A Million Random Digits”
Posted by mark in Basic stats & math, Uncategorized on September 27, 2020
Randomization provides an essential hedge against time-related lurking variables, such as increasing temperature and humidity. It made all the difference in the success of my first designed experiment, which was run on a high-pressure reactor placed outdoors for safety reasons.
Back then I made use of several methods for randomization:
- Flipping open a telephone directory and reading off the last four digits of listings
- Pulling numbers written on slips of paper out of my hard hat (the easiest approach)
- Using a table of random numbers.
All of these methods seem quaint given the ubiquity of random-number generators.* However, this past spring, at the height of the pandemic quarantine, a software engineer at Rand, Gary Briggs, combatted boredom by bearing down on his company's landmark 1955 compilation, “A Million Random Digits with 100,000 Normal Deviates.”**
“Rand legend has it that a submarine commander used the book to set unpredictable courses to dodge enemy ships.”
– Wall Street Journal
As reported here by the Wall Street Journal (9/24/20), Briggs discovered “soul crushing” flaws.
No worries, though: Rand promises to remedy the mistakes in their online edition of the book — worth a look if only for the enlightening foreword.
* Design-Expert® software generates random run orders via code based on the Mersenne Twister. For a view of leading edge technology, see the report last week (9/21/20) by HPC Wire on IBM, CQC Enable Cloud-based Quantum Random Number Generation.
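As a small aside, Python's standard random module also happens to be built on the Mersenne Twister, so randomizing a run order takes only a few lines; this is just a sketch, not Design-Expert's implementation.

```python
import random  # CPython's random module uses the Mersenne Twister

# Randomize the run order of a hypothetical 8-run experiment
runs = list(range(1, 9))
random.seed(42)        # set a seed only if you need a reproducible order
random.shuffle(runs)
print("Randomized run order:", runs)
```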
**For a few good laughs, see these Amazon customer reviews.
Business people taking notice of pushback on p-value
Posted by mark in Basic stats & math on December 15, 2019
As the headline article for their November 17 Business section, my hometown newspaper, the St. Paul Pioneer Press, picked up an alarming report on p-values by Associated Press (AP). That week I gave a talk to the Minnesota Reliability Consortium*, after which one of the engineers told me that he also read this article and lost some of his faith in the value of statistics.
“One investment analyst reacted by reducing his forecast for peak sales of the drug — by $1 billion. What happened? The number that caused the gasps was 0.059. The audience was looking for something under 0.05.”
– Malcolm Ritter, AP, relaying the reaction to results from a “huge” heart-drug study presented this fall by Dr. Scott Solomon of Harvard’s Brigham and Women’s Hospital.
As I noted in this May 1st blog, rather than abandoning p-values, it would pay to simply be far more conservative by reducing the critical value for significance from 0.05 to 0.005. Furthermore, as pointed out by Solomon (the scientist noted in the quote), failing to meet whatever p-value threshold one sets a priori may not refute a real benefit—perhaps more data would generate sufficient power to achieve statistical significance.
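To see what “more data” buys, here is a quick power sketch using statsmodels; the 0.3-standard-deviation effect size and the 80% power target are assumed purely for illustration.

```python
from statsmodels.stats.power import TTestIndPower

# Sample size per group needed to detect a modest effect (0.3 standard
# deviations, assumed for illustration) with 80% power in a two-sample t-test.
analysis = TTestIndPower()
for alpha in (0.05, 0.005):
    n = analysis.solve_power(effect_size=0.3, power=0.8, alpha=alpha)
    print(f"alpha = {alpha}: about {n:.0f} subjects per group")
# Tightening the threshold from 0.05 to 0.005 demands a substantially
# larger sample to keep the same power.
```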
Rather than using p-values to arbitrarily make a binary pass/fail decision, analysts should use this statistic as a continuous measure of calculated risk for investment. Of course, the amount of risk that can be accepted depends on the rewards that will come if the experimental results turn out to be true.
It is a huge mistake to abandon statistics because of p being hacked to come out below 0.05, or p being used to kill projects due to it coming out barely above 0.05. Come on people, we can be smarter than that.
* “Know the SCOR for Multifactor Strategy of Experimentation”
ASA calls for abandoning the declaration of results being “statistically significant”
Posted by mark in Basic stats & math, Uncategorized on May 1, 2019
On March 21 the American Statistical Association (ASA) sent out a shocking email to all members: the lead editorial in a special, open-access issue of The American Statistician calls for abandoning the use of “statistically significant”. With irony evidently intended by their italicization, they proclaimed it “a significant day in the history of the ASA and statistics.”
I think the probability of experimenters ignoring ASA’s advice and continuing to say “statistically significant” approaches 100 percent. Out of the myriad suggestions in the 43 articles of The American Statistician special issue, the ones I like best come from statisticians Daniel J. Benjamin and James O. Berger. They propose that, because “p-values are often misinterpreted in ways that lead to overstating the evidence against the null hypothesis”, the threshold for declaring novel discoveries “statistically significant” be tightened to 0.005. By their reckoning, a p-value between 0.05 and 0.005 should then be downgraded to “suggestive,” rather than “significant.”*
It’s a shame that p-hackers, skewered in this xkcd cartoon, undermined the sound application of statistics for filtering out findings unsupported by the data.
*The American Statistician, 2019, Vol. 73, No. S1, 186–191: Statistical Inference in the 21st Century, “Three Recommendations for Improving the Use of p-Values”.
“Data are profoundly dumb”
Posted by mark in Basic stats & math, Uncategorized on October 22, 2018
This is the controversial view of Judea Pearl and Dana Mackenzie expressed in “Mind over Data”—the lead article in the August issue of Significance. In this excerpt from The Book of Why these co-authors explain “how the founders of modern statistics ‘squandered’ the chance to establish the science of causal inference”. They warn against “falsely believing the answers to all scientific questions reside in the data, to be unveiled through clever data-mining tricks.”
“Lucky is he who has been able to understand the cause of things.”
– Virgil (29 BC)
Pearl and Mackenzie are optimistic that the current “Causal Revolution” will lead to far greater understanding of underlying mechanisms. However, by my reckoning, randomized controlled trials remain the gold standard for establishing cause and effect relationships. Only then can the data speak loud and clear.
The hero of zero
Posted by mark in Basic stats & math, history on October 9, 2017
Breaking news about nothing: Dating done with the Oxford Radiocarbon Accelerator Unit now puts the invention of the number zero 500 years earlier than previously believed. As explained in this post by The Guardian, the hero of zero is Indian mathematician Brahmagupta who worked out this pivotal number in 628 AD. Isn’t that something?
The development of zero in mathematics underpins an incredible range of further work, including the notion of infinity, the modern notion of the vacuum in quantum physics, and some of the deepest questions in cosmology of how the Universe arose – and how it might disappear from existence in some unimaginable future scenario.
– Hannah Devlin, The Guardian
Errors, blunders & lies
Posted by mark in Basic stats & math, pop on June 27, 2017
David S. Salsburg, author of “The Lady Tasting Tea”*, which I enjoyed greatly, hits the spot again with his new book, Errors, Blunders & Lies: How to Tell the Difference. It’s all about a fundamental statistical equation: Observation = model + error. The errors, of course, are normal and must be expected. But blunders and lies cannot be tolerated.
The section on errors concludes with my favorite chapter: “Regression and Big Data”. There Salsburg endorses my favorite way to avoid over-fitting of happenstance results—hold back at random 10 percent of the data and see how well these outcomes are predicted by the 90 percent you regress.** Whenever I tried this on manufacturing data it became very clear that our high-powered statistical models worked very well for predicting what happened last month. 😉 They were worthless for seeing into the future.
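Here is a bare-bones sketch of that 90/10 hold-back check using scikit-learn; the happenstance-style data are simulated just to show the mechanics.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Simulated "happenstance" data: 200 observations, 5 candidate predictors
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=1.0, size=200)

# Hold back 10% at random, fit on the other 90%, then check the held-back points
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=0)
model = LinearRegression().fit(X_train, y_train)
print(f"R^2 on the 90% used to fit: {model.score(X_train, y_train):.2f}")
print(f"R^2 on the 10% held back:   {model.score(X_test, y_test):.2f}")
# A big drop between the two is a warning that the model is over-fit.
```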
Another personal favorite is the bit on spurious correlations that Italian statistician Carlo Bonferroni*** guarded against, also known as the “will of the wisps” per the founder of Yale’s statistics school—Francis Anscombe.
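For readers who have not run into it, the Bonferroni guard is easy to sketch: when screening m correlations, demand p below alpha divided by m rather than alpha itself. The four p-values below are hypothetical.

```python
# Bonferroni guard against spurious correlations: with m tests,
# require p < alpha / m rather than p < alpha.
alpha = 0.05
p_values = [0.001, 0.012, 0.030, 0.049]   # hypothetical p-values from 4 tests
threshold = alpha / len(p_values)
for p in p_values:
    verdict = "significant" if p < threshold else "not significant"
    print(f"p = {p:.3f}: {verdict} at the adjusted threshold {threshold:.4f}")
```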
If you are looking for statistical insights that come without all the dreary mathematical details, this book on “Errors, Blunders & Lies” will be just the ticket. Salsburg concludes with a timely heads-up on the statistical lies caused by “curbstoning” (reported here by the New York Post), which may soon combine with gerrymandering (see my previous post) to create a perfect storm of data tampering in the upcoming census. We’d all do well to sharpen up our savvy on stats!
The old saying is that “figures will not lie,” but a new saying is “liars will figure.” It is our duty, as practical statisticians, to prevent the liar from figuring; in other words, to prevent him from perverting the truth, in the interest of some theory he wishes to establish.
– Carroll D. Wright, U.S. government statistician, speaking to 1889 Convention of Commissioners of Bureaus of Statistics of Labor.
*Based on the story told here.
**An idea attributed to the inventor of modern day statistics—R. A. Fisher, and endorsed by famed mathematician John Tukey, who suggested the hold-back be 10 percent.
***See my blog on Bonferroni of Bergamo.