Being ‘bird-brained’ merits reconsideration
First off, writing this blog from my winter home in Florida, I appreciate the sensibility of snowbirds who abandon their northern climes every fall. Smart! Furthermore, studies show that the brains of crows and ravens (collectively known as “corvids”) can accommodate statistical thinking, a skill that many humans lack, based on my experience as an educator. Researchers from the University of Tübingen worked this out via a clever experiment that required crows to assess the probability of getting a treat based on prior experience pecking at differing images.
“True statistical inference requires subjects use relative rather than absolute frequency of previously experienced events. Here, we show that crows can relate memorized reward probabilities to infer reward-maximizing decisions.”
Johnston et al., Crows flexibly apply statistical inferences based on previous experience, Current Biology, Volume 33, Issue 15, 7 August 2023, Pages 3238-3243
This gives new meaning to the saying that “if the p-value is high, the null must fly.”
Chance discovery on random walk in Utrecht
Posted by mark in science, Uncategorized on December 11, 2023
Last week I taught a class on design of experiments to a biotech company in Leiden, Netherlands. Afterwards I spent a few days in Utrecht with some friends from Germany. Imagine my excitement (nerd alert!) when on my first walk from our hotel to the city center just a few hundred feet down the sidewalk I encountered this mural featuring a differential equation.
Not being a physicist, I did not immediately grasp the formula’s importance, nor the clue provided by the fellow high-stepping down a street. It turns out this fellow is a drunk whose walk has become random. The mural, as explained by Utrecht University, pays homage to their famous professor Leonard Ornstein who, in the early 1900s along with another physicist—George Uhlenbeck—developed an important variant of the “random walk”—a term introduced by pioneering statistician Karl Pearson. The Ornstein-Uhlenbeck process is used to derive models from “big” financial data, including inflation rates, commodity prices and stock values.
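For readers who want to see what such a mean-reverting random walk looks like, here is a minimal sketch in Python. It assumes the textbook form of the Ornstein-Uhlenbeck equation, dX = theta*(mu - X)*dt + sigma*dW, which may not match the notation on the mural, and it uses a simple Euler-Maruyama time step. The function name and all parameter values are made up for illustration.

```python
import math
import random


def simulate_ou(x0, theta, mu, sigma, dt, n_steps, seed=None):
    """Simulate one path of dX = theta*(mu - X)*dt + sigma*dW (Euler-Maruyama)."""
    rng = random.Random(seed)
    x = x0
    path = [x0]
    for _ in range(n_steps):
        # Drift pulls x back toward the long-run mean mu at rate theta.
        drift = theta * (mu - x) * dt
        # Diffusion adds a random kick whose size scales with sqrt(dt).
        kick = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift + kick
        path.append(x)
    return path


# Hypothetical settings: start far from the mean and watch the walk drift home.
path = simulate_ou(x0=5.0, theta=1.5, mu=0.0, sigma=0.3, dt=0.01, n_steps=2000, seed=42)
print(f"start = {path[0]:.2f}, end = {path[-1]:.2f}")
```

Unlike the drunkard's plain random walk, which can wander off indefinitely, this version keeps getting tugged back toward mu. That is why it suits quantities such as interest or inflation rates that tend to hover around a long-run level.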
I did not expect to gain an education on a vacation expedition.
Very cool!
PS: I thought about asking my colleague Martin Bezener, a PhD statistician, for his opinion on the chances of coming across something so relevant to our mission at Stat-Ease while on a random walk. But I will not bother, because I already know what he would say: “One-hundred percent: It already happened”.
British system of messed up measures hilariously skewered
Upon graduation as a chemical engineer in 1975 I took a job as a process developer at a California oil company. There I learned that a barrel amounted to 42 gallons—not the 55 in the drums at my previous employer—a specialty chemical company. In the wacky British system of volumes, the number of gallons in a barrel depends on the material—31 for beer, 53 for rum (yo ho ho!), 60 for wine, etc. Their weights and distances are just as unfathomable (pun intended).
That same year of 1975 that I first became employed as a degreed engineer, President Gerald Ford signed the Metric Conversion Act, which went nowhere before being abolished in 1982 by President Reagan. Having endured all this measurement mess throughout my career, I thoroughly enjoyed this October 28 Saturday Night Live skit:
I raise my US pint (16 fluid ounces) of beer to SNL’s clever comedy writers, though a British pint (20 fluid ounces) would be more filling and a liter even sweeter. Let’s not get into US versus Imperial ounces (or gallons)—that would get us over our head by at least a fathom (equivalent to 4 cubits, by the way).
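To put numbers on that toast, here is a quick arithmetic sketch in Python. The milliliter values are the standard definitions of the US and Imperial fluid ounce. The 18-inch cubit is an assumption (the common English cubit), since cubits have varied over history.

```python
# Standard definitions of the two fluid ounces, in milliliters.
US_FL_OZ_ML = 29.5735295625
IMP_FL_OZ_ML = 28.4130625

us_pint_ml = 16 * US_FL_OZ_ML         # US pint = 16 US fluid ounces
imperial_pint_ml = 20 * IMP_FL_OZ_ML  # British pint = 20 Imperial fluid ounces
liter_ml = 1000.0

# A fathom is 6 feet (72 inches); assume the common 18-inch English cubit.
INCHES_PER_FATHOM = 72
INCHES_PER_CUBIT = 18
cubits_per_fathom = INCHES_PER_FATHOM / INCHES_PER_CUBIT

print(f"US pint:       {us_pint_ml:.0f} mL")
print(f"Imperial pint: {imperial_pint_ml:.0f} mL")
print(f"Liter:         {liter_ml:.0f} mL")
print(f"Cubits per fathom: {cubits_per_fathom:g}")
```

Note that the Imperial fluid ounce is slightly smaller than the US one, yet the British pint still comes out roughly 20 percent bigger because it holds 20 ounces rather than 16.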
PS. This rant about measures brings me full circle to an outburst at the outset of this year involving a whimsical unit of distance called the ‘smoot’.
Swedish sleep researchers torture subjects with math problems
This is alarming news, literally: Researchers from Stockholm University discovered via studies involving over 1700 subjects* that over two-thirds of them, especially younger individuals, habitually hit the snooze button.
I am appalled at this lack of discipline and ambition! However, I must confess that in my younger days, I got in the habit of putting my alarm on temporary pause repeatedly, which often caused me to run late for class. That would not do! Therefore, I purchased a cleverly built clock called the Clocky that rolls away when ringing, thus forcing you to jump out of bed to hunt it down. Highly recommended!
Putting aside my negative attitude about snoozers, I do feel bad for those subjected to the sleep study because as reported by the New York Times: “Immediately after the participants woke up, the researchers flipped on the lights and presented them with math problems and other cognitive tests — a challenge even more grating than a shrieking alarm, and one the participants had to complete before having a cup of coffee.”** Oof!
The good news for you slackers who do not leap out of bed like I do is that this new study provides a pass for delaying the inevitable: “Snoozing [for 30 minutes] does not lead to cognitive impairments upon waking.” Just do not sleep through your final exam on math. That would be a nightmare!
*Is snoozing losing? Why intermittent morning alarms are used and how they affect sleep, cognition, cortisol, and mood, Journal of Sleep Research, October 17, 2023.
**“You Snooze, You … Win?”, Dani Blum, Oct. 18, 2023.
Variation in eggs presents perplexing problems for preparation
Posted by mark in Basic stats & math, food science on October 13, 2023
Today is World Egg Day.
I’m a big fan of eggs—my favorite being ones perfectly poached in an Endurance Stainless Steel Pan. However, the eggs that come from my daughters’ hens vary in size far more per container than store-bought, graded ones. I work around this by adding or subtracting time based on my experience. I really should weigh the eggs and design an experiment to optimize the time.
Coincidentally, I just received the new issue of Chance, published by the American Statistical Association. An article titled “A Physicist and a Statistician Walk into a Bar” caught my eye because one of my Stat-Ease consulting colleagues is a physicist and another is a statistician. I was hoping for a good joke at both of their expense. However, the authors (John Durso and Howard Wainer) go in a completely different direction with an amusing, but educational, story about a hypothetical optimization of soft-boiled eggs.
The problem is that recipes suffer from the “flaw of averages”: at a fixed cooking time, smaller eggs get overcooked and bigger ones end up undercooked unless the time gets adjusted (as I well know!).
While the physicist sits over a pint of beer and pad of paper scratching out possible solutions based on partial differential equations related to spheroidal geometry, the statistician assesses data collected on weights versus cooking time. Things get a bit mathematical at this point* (this is an ASA publication, after all) but in the end the statistician determines that weight versus cooking time can be approximated by a quadratic model, which makes sense to the physicist based on the geometry and makeup of an egg.
I took some liberties with the data to simplify things by reducing the number of experimental runs from 41 to 8. Also, based on my experience cooking eggs of varying weights, I increased the variation to a more realistic level. See my hypothetical quadratic fit below in a confidence-banded graph produced by Stat-Ease software.
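For anyone without fitting software at hand, the idea of a quadratic fit can be sketched in plain Python. To be clear, the eight weight and time values below are invented for illustration; they are neither Durso and Wainer's data nor the numbers behind my graph. The helper names are hypothetical, and the sketch produces only the fitted curve, not confidence bands.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Sums of powers of x fill the 3x3 normal-equation matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    # Back substitution recovers b2, then b1, then b0.
    b = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        b[i] = (a[i][3] - sum(a[i][j] * b[j] for j in range(i + 1, 3))) / a[i][i]
    return b


# Made-up data: egg weight in grams, cooking time in seconds.
weights_g = [48, 52, 55, 58, 61, 64, 67, 71]
times_s = [165, 172, 180, 185, 196, 204, 215, 230]

# Centering the weights near their middle keeps the arithmetic well behaved.
center = 60
b0, b1, b2 = fit_quadratic([w - center for w in weights_g], times_s)


def predict_time(weight_g):
    """Predicted cooking time (seconds) for an egg of the given weight."""
    d = weight_g - center
    return b0 + b1 * d + b2 * d * d


print(f"Predicted time for a 60 g egg: {predict_time(60):.0f} s")
```

With real measurements in place of the invented ones, the same function would give a first-pass timing chart to tape inside the cupboard door.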
Perhaps someday I may build up enough steam to weigh every egg, time the poaching and measure the runniness of the resulting yolks. However, for now I just eat them as they are after being cooked by my assessment of the individual egg-size relative to others in the carton. With some pepper and salt and a piece of toast to soak up any leftover yolk, my poached eggs always hit the spot.
*For example, they apply Tukey’s ladder of variable transformations – a method that works well on single-factor fits and can be related to the shape of the curve being concave or convex, going up or down the powers, respectively. It relates closely to the more versatile Box-Cox plot provided by Stat-Ease software. Using the same data as Durso and Wainer presented, I found that the Box-Cox plot recommended the same transformation as Tukey’s ladder.
Data detectives keep science honest
Posted by mark in science, Uncategorized on October 6, 2023
An article in the Wall Street Journal last week* drew my attention to a growing number of scientists who moonlight as data detectives sleuthing out fraudulent studies. Thanks to their work, the number of faulty papers retracted increased from 119 in 2002 to 5,500 last year. These statistics come from Retraction Watch, which provides a better, graphical perspective on the increase based on percent retractions per annual science and engineering (SE) publication: not nearly as dramatic given the explosion in publications over the last 20 years, but still very alarming.
“If you take the sleuths out of the equation it’s very difficult to see how most of these retractions would have happened.”
Ivan Oransky, co-founder of Retraction Watch.
Coincidentally, I just received this new cartoon from Professor Nadeem Irfan Bukhari. (See my all-time favorite from him in the April 27, 2007 StatsMadeEasy blog Cartoon quantifies commitment issue.)
It depicts statistics as the proverbial camel allowed to put its nose in the tent occupied by science disciplines until it becomes completely entrenched.
Thank goodness for scientists like Nadeem who embrace statistical tools for design and analysis of experiments. And kudos to those who guard against faulty or outright fraudulent scientific publications.
*The Band of Debunkers Busting Bad Scientists, Nidhi Subbaraman, 9/24/23
Temperature combines badly with humidity to maximize misery
The Twin Cities tied its record high temperature yesterday at 97 degrees Fahrenheit. However, the winds blew strong with air at a dew point in the low 60s, which made the heat relatively tolerable. After spending most of August at our second home in southwest Florida (leaving there just as Hurricane Idalia hit), my wife and I got acclimated to a far more uncomfortable daily combination of heat and humidity.
Before departing for Minnesota, I set up a SensorPush to monitor temperature, humidity and dew point—the temperature at which air becomes saturated with water vapor. I want to be on guard for the air conditioning going out. If that happens in Florida homes, mold can grow. After experiencing this once (due to renters not running the A/C) and dealing with an expensive remediation, I am keen to prevent another episode.
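For the curious, dew point can be estimated from temperature and relative humidity with the Magnus approximation. The sketch below is only that: I do not know what formula the SensorPush uses internally, the function names are my own, and the constants (17.62 and 243.12 °C) are one commonly published pair among several.

```python
import math

# Magnus constants for vapor pressure over water (one commonly published pair).
MAGNUS_B = 17.62
MAGNUS_C = 243.12  # degrees Celsius


def dew_point_c(temp_c, rel_humidity_pct):
    """Approximate dew point (deg C) from air temperature and relative humidity."""
    gamma = math.log(rel_humidity_pct / 100.0) + MAGNUS_B * temp_c / (MAGNUS_C + temp_c)
    return MAGNUS_C * gamma / (MAGNUS_B - gamma)


def dew_point_f(temp_f, rel_humidity_pct):
    """Same estimate with Fahrenheit in and Fahrenheit out."""
    temp_c = (temp_f - 32.0) * 5.0 / 9.0
    return dew_point_c(temp_c, rel_humidity_pct) * 9.0 / 5.0 + 32.0


# Hypothetical readings: a warm room at two humidity levels.
for rh in (50, 80):
    print(f"77 F at {rh}% RH -> dew point about {dew_point_f(77, rh):.0f} F")
```

At 100% relative humidity the formula returns the air temperature itself, which matches the definition of dew point as the temperature at which air becomes saturated.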
Closely related to dew point is the wet-bulb temperature, which, as a chemical engineer, I learned how to measure with a sling psychrometer. The wet-bulb temperature can then be converted to relative humidity.
To prevent heat-related deaths in training camps, the US military developed a more sophisticated measure called the wet bulb globe temperature (WBGT). It provides a good measure for the advisability of being active in full sun. The Texas University Interscholastic League requires that outdoor practices be shut down if WBGT exceeds 92.
“As with all indices that integrate elements of the thermal environment, interpretation of the observed levels of WBGT requires careful evaluation of people’s activity, clothing, and many other factors, all of which can introduce large errors into any predictions of adverse effects.”
– Grahame M Budd, Wet-bulb globe temperature (WBGT)–its history and its limitations
Other measures used to gauge comfort are Heat Index and Feels Like Temperature (FLT). I like the FLT because it accounts for the benefits of evaporative cooling. For example, as I write this, the actual temperature is 95 degrees and the FLT is only slightly higher at 96.
I’m getting too hot and bothered with all these measurements to continue much longer, but here’s yet another approach used by AccuWeather—the RealFeel Temperature.
What really matters is how you feel and what can be done to avoid discomfort. For example, earlier this summer I went to Minnesota’s Washington County Fair on a very hot day and stopped in at a beer garden for a cold brew. However, I soon realized that its hot tin roof radiated heat down to the picnic tables, overcoming any advantage to being in the shade.
Sometimes you can find no relief other than hunkering down in an air-conditioned area. How did we ever get by without it?
Experimenting to make spirits more enticing
Posted by mark in food science on August 15, 2023
Spirits are distilled alcoholic drinks that typically contain 40% alcohol by volume (ABV) or 80 “proof”. Until the Pandemic, I avoided spirits—preferring to imbibe less intoxicating beers and wines. However, during the Quarantine, I made it my mission to drink up a stock of tequila that my Mexican exchange student’s father Pepe sent me when his daughter told him about the terrible cold in Minnesota.
Down the hatch went Don Julio and the like over some months…and yet the quarantine dragged on. Tiring of tequila I pivoted to bourbon, starting with top-shelf Woodford Reserve and settling after serial pairwise testing on bottom-shelf Evan Williams. Why pay more when you cannot discern a difference?
Last week my research on spirits expanded to rye whiskey purchased after a tour of the Chattanooga Whiskey Experimental Distillery. See my guide Sam pictured with a measurement guide for a key variable—the degree of charring in the storage barrels.
The mash bill for my bottle is malted rye, yellow corn, caramel malted rye (providing a smoother taste) and chocolate malted rye (not sure what that is, but it sounds tasty).
It seems to me that multifactor design of experiments would be an ideal tool for contending with the many process, mixture and categorical inputs to the optimization of whiskey. Once upon a time I toured Dewar’s Aberfeldy distillery in central Scotland. It concluded with my first taste of whiskey, which was shockingly strong. However, what interested me most was a simulator that allowed visitors to vary inputs and see how the output rated for taste. Unfortunately, I only had time to do one-factor-at-a-time (OFAT) testing and desperate stabs at changing multiple inputs.
If the spirit moves you (pun intended), please contact me for help designing your experiments and tasting the results.
Diabolical werewolves test trust in team
Posted by mark in Communication, leadership on August 1, 2023
Tonight, there’s a full moon. Then on August 30 comes the second full one of the month—a rare blue moon. Thus, it’s especially appropriate to consider werewolves and, in particular, an online game where these lycanthropes (secretly designated) undermine trust and security within a group.
I played Werewolf a few times but never got very far due to the cutthroat “kill the newbie” strategy deployed by more experienced (and vicious!) players. My interest in the Werewolf game stems from it providing a good laboratory for studying social dynamics and teamwork. See, for example, this blog by a relationship expert about What Werewolf teaches us about Trust & Security. For scientists studying such interactions, the Idiap Wolf Corpus (sounds creepy!) offers a wealth of data in the form of audio-visual recordings of 15 games played by 4 groups of people.
A newly published study by a trio of industrial engineers* delves into the impact of playing Werewolf at a distance and what this revealed about teamwork when members participate only on a virtual basis. The researchers divided 30 students into 3 teams of ten, each comprising two werewolves, seven villagers, and one seer. Their experiment varied the groups by leadership experience.
The sample size of this study was far too small to support any conclusions, in my opinion. I just thought it would be fun to put teams, such as a group of researchers tasked with developing a new product, to the test of Werewolf.
Devious!
Cue the howls as the full moon rises…
PS I do wonder how well teams do at a distance versus in person. My feeling based on a lot of experience as a chemical engineer leading plant-process-improvement projects is that it pays to get together in one room every several meetings. It would be interesting to see well-designed research on all virtual, all in-person or a mix of the two.
*Vera Setyanitami, Hilya Mudrika Arini and Nurul Lathifah, People’s Trust in a Virtual Project Team: Results of a Game Experiment, Jurnal Teknik Industri, Vol. 25, No. 1, June 2023.
Selecting the most readable font for maximum impact
Posted by mark in Communication, Graphics on July 7, 2023
It’s Comic Sans Day today. If not so widely mocked, this font would be favored for its legibility across all ages and abilities, according to my daughter Emily—an expert in graphic design.
My early knowledge of writing options consisted of printing or cursive. As I progressed through college my preference became printing, which though slower to produce than cursive, resulted in a far more legible output and appealed to my engineering sensibilities.
I kept on handwriting through my early career—relying on secretaries to do the typing. However, it wasn’t long before I went DIY by becoming an early adopter of computers—a Radio Shack TRS-80. Its word processing capabilities made it far easier to write—a huge breakthrough by enabling editing.
Eventually, after a lot of hunting and pecking, I upgraded to MS-DOS (Microsoft’s disk operating system) and invested in Mavis Beacon Teaches Typing to gain the ‘touch’ of my keyboard.
Things got really interesting with the advent of graphical user interfaces and widely available TrueType fonts. After some wild and wacky times making bad blends of too many fonts, I settled in on the Microsoft Word defaults of the classic (invented 1931) Times New Roman (serif, featuring tails and feet) for text and more modern (1982) Arial (sans serif) for headlines. A big issue then, but far less so now that “e” rules, was whether a document would be read in print or electronically (on screen).
Around these times, Stat-Ease shifted its training materials from transparencies for ‘overheads’ to PowerPoint for projection from a personal computer (PC). Unfortunately, projectors in those early days put out a very weak light. However, being well equipped with Stat-Ease software, I rose to the challenge by deploying an experiment design in-class to maximize screen readability via adjustments to fonts and other factors.
Nowadays, figuring that nearly all my writing will be read on screen, I go exclusively with the current Word default of Calibri, a sans serif font that provides a “small, but significant, advantage in response times” according to this study in the Journal of Cognitive Psychology.
It turns out that, not surprisingly, studies now show different fonts increase reading speed for different individuals.
Participants’ reading speeds (measured in words-per-minute (WPM)) increased by 35% when comparing fastest and slowest fonts without affecting reading comprehension.
Adobe scientists and others who authored “Towards Individuated Reading Experience”, ACM Transactions on Computer-Human Interaction, Volume 29, Issue 4, March 2022, Article No.: 38, pp 1–56.
Therefore, I envision that, aided by developments in artificial intelligence, our devices will keep track of how fast we read and adapt the fonts accordingly. Watch out: Like it or not, you may be subjected to Comic Sans.
PS Calibri fared well on average in Adobe’s experiment, so that remains my favored font.