Archive for June, 2017
Errors, blunders & lies
Posted by mark in Basic stats & math, pop on June 27, 2017
David S. Salsburg, author of “The Lady Tasting Tea”*, which I enjoyed greatly, hits the spot again with his new book, “Errors, Blunders & Lies: How to Tell the Difference”. It’s all about a fundamental statistical equation: Observation = model + error. The errors, of course, are normal and must be expected. But blunders and lies cannot be tolerated.
The section on errors concludes with my favorite chapter: “Regression and Big Data”. There Salsburg endorses my favorite way to avoid over-fitting of happenstance results—hold back at random 10 percent of the data and see how well these outcomes are predicted by the 90 percent you regress.** Whenever I tried this on manufacturing data it became very clear that our high-powered statistical models worked very well for predicting what happened last month. 😉 They were worthless for seeing into the future.
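For anyone who wants to try the hold-back check, here is a minimal sketch in Python. It is not Salsburg's code: the straight-line model, the synthetic data, and the names `fit_line`, `rmse` and `holdout_check` are all assumptions made for illustration. It sets aside a random 10 percent of the rows, fits on the remaining 90 percent, and compares the prediction error on each part.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def rmse(a, b, xs, ys):
    """Root-mean-square error of the line a + b*x on the given points."""
    squared = [(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(squared)) ** 0.5


def holdout_check(xs, ys, holdback=0.10, seed=0):
    """Fit on a random 90 percent, then score on the held-back 10 percent."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(1, round(holdback * len(xs)))
    test, train = idx[:n_test], idx[n_test:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return {
        "n_train": len(train),
        "n_test": len(test),
        "train_rmse": rmse(a, b, [xs[i] for i in train], [ys[i] for i in train]),
        "test_rmse": rmse(a, b, [xs[i] for i in test], [ys[i] for i in test]),
    }


# Synthetic data (an assumption for the demo): a true line plus normal error.
noise = random.Random(1)
xs = [i / 10 for i in range(200)]
ys = [2.0 + 0.5 * x + noise.gauss(0, 1) for x in xs]

result = holdout_check(xs, ys)
print(result)
```

A held-back error far above the training error is the warning sign of over-fitting. Note that a random hold-back only tests the model against data from the same period; it does not by itself show how the model will do on next month's results.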
Another personal favorite is the bit on spurious correlations that Italian statistician Carlo Bonferroni*** guarded against, also known as the “will of the wisps” per the founder of Yale’s statistics school—Francis Anscombe.
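To see why such a guard is needed, here is a small sketch of the arithmetic behind the Bonferroni correction. It assumes the tests are independent, and the function names are made up for illustration. With 20 tests each run at the 5 percent level, the chance of at least one spurious "significant" result is roughly 64 percent; dividing the threshold by the number of tests pulls that chance back under 5 percent.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive in m independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test significance level after the Bonferroni correction."""
    return alpha / m


alpha, m = 0.05, 20
uncorrected = familywise_error(alpha, m)
corrected = familywise_error(bonferroni_threshold(alpha, m), m)

print(f"uncorrected chance of a spurious hit: {uncorrected:.3f}")
print(f"per-test threshold after correction:  {bonferroni_threshold(alpha, m):.4f}")
print(f"corrected chance of a spurious hit:   {corrected:.3f}")
```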
If you are looking for statistical insights that come without all the dreary mathematical details, this book on “Errors, Blunders & Lies” will be just the ticket. Salsburg concludes with a timely heads-up on the statistical lies caused by “curbstoning” (reported here by the New York Post), which may soon combine with gerrymandering (see my previous post) to create a perfect storm of data tampering in the upcoming census. We’d all do well to sharpen up our savvy on stats!
The old saying is that “figures will not lie,” but a new saying is “liars will figure.” It is our duty, as practical statisticians, to prevent the liar from figuring; in other words, to prevent him from perverting the truth, in the interest of some theory he wishes to establish.
– Carroll D. Wright, U.S. government statistician, speaking to 1889 Convention of Commissioners of Bureaus of Statistics of Labor.
*Based on the story told here.
**An idea attributed to R. A. Fisher, the inventor of modern-day statistics, and endorsed by famed mathematician John Tukey, who suggested the hold-back be 10 percent.
***See my blog on Bonferroni of Bergamo.
Gerrymanderers may soon be sent packing for doing too much cracking
Wisconsin Governor Scott Walker and his cohort of Republicans might have gone too far in redrawing their state’s political boundaries to their advantage. Last November, a federal district court declared these maneuvers, called gerrymandering,* unconstitutional. However, as discussed in this Chicago Tribune article, the Supreme Court might consider overturning the ruling, since these gerrymanders are partisan rather than racially discriminatory.
One of the most infamous of all gerrymandered districts, 1992’s 12th Congressional District in North Carolina, is pictured here. It became known as the “I-85 district” because, for stretches, it was no wider than the freeway that connected the desired populations of voters.
North Carolina’s 12th was a kind of in vitro offspring of an unromantic union: Father was the 1980s/1990s judicial and administrative decisions under the Voting Rights Act, and Mother was the partisan and personal politics that have traditionally been at redistricting’s core. The laboratory that made this birth possible was the computer technology that became available for the 1990s redistricting cycle. The progeny won no Beautiful Baby contests.
— North Carolina Redistricting Cases: the 1990s, posted at Minnesota Legislature Web Site
You may wonder, as I did, how gerrymandering works. The latest issue of Nature explains it with their graphic on “packing and cracking” here. Also, see the figures on measuring compactness. Mathematicians approach this in various ways, e.g., by comparing the area of the district with that of the smallest convex polygon that surrounds it (called the convex hull). Quantifying the fairness of boundaries creates a great deal of contention, with the choice of measure often made for the greatest advantage of whoever is wielding the figures.
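As a rough illustration of the convex-hull measure, the sketch below computes the ratio of a district's area to the area of its convex hull. The two "districts" are made-up shapes, not real boundaries, and the function names are my own. A ratio near 1 means a compact shape; a sprawling, dented shape scores lower.

```python
def polygon_area(points):
    """Shoelace formula; vertices must be listed in order around the boundary."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_ratio(district):
    """Area of the district divided by the area of its convex hull (1.0 = convex)."""
    return polygon_area(district) / polygon_area(convex_hull(district))


# Hypothetical districts for illustration only.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]

print(f"square:  {hull_ratio(square):.3f}")
print(f"L-shape: {hull_ratio(l_shape):.3f}")
```

The square scores 1.0, while the L-shape covers 7 of the 11.5 square units inside its hull, for a ratio of about 0.61. Other compactness measures can rank the same districts differently, which is part of why the choice of measure is contested.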
Partisan gerrymandering, if not outlawed, will be catalyzed by the 2020 census. Keep an eye on this.
*A word coined in 1812 when Massachusetts’s Governor Gerry redrew a district north of Boston into the shape of a salamander.