Archive for category: design of experiments
Brain-bending thoughts on a coffee experiment
Posted by mark in design of experiments, Uncategorized, Wellness on October 24, 2010
The Stat-Ease training center here at our world headquarters in Minneapolis features a wonderful single-cup brewing system that you can see demoed here. When we are not holding a workshop, I sometimes sneak in to steal a cup late in the day. By then I am reaching my limit, so I brew a “half-caf” at the half-cup setting. Being a chemical engineer, I calculate that, in this case, half of half makes a whole: half the caffeine in half the volume of water yields coffee with the normal concentration of caffeine. Does that make sense?
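The back-of-the-envelope reasoning can be checked in a couple of lines of arithmetic (a sketch; the 50% figures are the nominal assumptions, not measurements of the actual brewer):

```python
# Relative caffeine concentration of a half-caf brew at the half-cup setting.
# Assumptions: the brewer uses a full measure of grounds regardless of cup
# size, the half-caf blend extracts half the normal caffeine, and the
# half-cup setting pushes half the normal water through those grounds.
caffeine_fraction = 0.5   # half-caf blend: half the usual caffeine
volume_fraction = 0.5     # half-cup setting: half the usual water

relative_concentration = caffeine_fraction / volume_fraction
print(relative_concentration)  # 1.0 – back to full-strength caffeine per sip
```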
Making a tasty and effective cup of coffee is a huge deal for knowledge workers who need to keep their heads in gear from start to finish of every single day. One of our workshop students, a PhD, has been picking my brain about testing coffee blends on her staff of scientists. She proposes to do a mixture design such as I did on varying types of beers (see Mixture Design Brews Up New Beer Cocktail—Black & Blue Moon).
Obviously, overall liking on a sensory basis should come first and foremost in such an experiment on coffee – a 5- to 9-point scale works well for this.* However, the tricky part is assessing the impact of coffee for accelerating information processing and general problem-solving, which I hypothesize depends on the level of caffeine. I wonder if an online “brain training” service, such as this one developed by neuroscientists at Stanford and UCSF, might provide a valid measure.
The downside of doing a proper test of whether coffee improves cognitive skills will be the necessity of reverting to the baseline, that is, getting up every morning and trying to function without that first cup.
“A mathematician is a machine for turning coffee into theorems.”
— Alfréd Rényi
*Turn your volume down (to avoid the advert) and see this primer on sensory evaluation by S-Cool – a UK educational site for teenagers.
Blah, blah, blah…”quadratic”
Posted by mark in design of experiments, Uncategorized on August 15, 2010
This ad by Target got my attention. It reminded me of my futile attempt to get my oldest daughter interested in math. For her, the last straw was my overly enthusiastic reaction when she questioned why anyone would care about quadratic equations. Perhaps I overreacted and lectured a bit too long about this being a very useful approximating function for response surface methods, blah, blah, blah…
Priming R&D managers to allow sufficient runs for a well-designed experiment
Posted by mark in design of experiments on June 2, 2010
I am learning a lot this week at the Third European DOE User Meeting in Lucerne, which features many excellent applications of DOE to industrial problems. Here’s an interesting observation from Pavel Nesladek, a technical expert from the Advanced Mask Technology Center of Dresden, Germany. He encounters severe pressure to find answers in minimal time at the least possible cost. Pavel found that whatever number of runs he proposed for a given designed experiment, his manager would press for fewer. However, he learned that asking for a prime number preempted these questions, presumably because the request seemed so precise that it must not be tampered with! For example, Pavel really needed 20 runs for adequate power and resolution in a troubleshooting experiment, so he asked for 23 and got it. Tricky! Perhaps you stats-savvy readers who need a certain sample size to accomplish your objective might try this approach. Prima!
PB&J please, but hold the jelly (and margarine) and put it on toast – a mixture design combined with a categorical factor
Posted by mark in design of experiments, Uncategorized, Wellness on May 27, 2010
My colleague Pat Whitcomb just completed the first teach of Advanced Formulations: Combining Mixture & Process Variables. It inspired me to develop a virtual experiment for optimizing my perfect peanut butter and jelly (PB&J) sandwich. This was a staple for me and my six siblings when we were growing up. Unfortunately, so far as I was concerned, my mother generously slathered margarine on the bread (always white in those days – no whole grains) and then thick layers of peanut butter and jelly (always grape). As you see* in the response surfaces for overall liking [ 🙁 1-9 🙂 ], I prefer that none of the mixture ingredients (A: Peanut butter, B: Margarine, C: Jelly) be mixed, and I like the bread toasted. This analysis was produced using the Combined design tab of Design-Expert® software version 8, released by Stat-Ease earlier this year. I’d be happy to provide the data set, especially for anyone who may be hosting me for a PB&J dinner party. 😉
*Click to enlarge the plots so you can see the legend, etc.
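For the curious, the structure of such a combined design can be sketched by crossing a simplex-centroid mixture design (the three spreads, with proportions summing to one) with the categorical process factor (bread type). This is only an illustration of the candidate run list, not the actual design Design-Expert laid out:

```python
from itertools import product

# Simplex-centroid points for A: peanut butter, B: margarine, C: jelly.
mixture_points = [
    (1, 0, 0), (0, 1, 0), (0, 0, 1),                # vertices: single spreads
    (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5),    # edge midpoints: 50/50 blends
    (1 / 3, 1 / 3, 1 / 3),                          # centroid: equal thirds
]
bread = ["plain", "toasted"]  # categorical process factor

# Cross every blend with every bread type: 7 x 2 = 14 candidate runs.
runs = [(a, b, c, brd) for (a, b, c), brd in product(mixture_points, bread)]
print(len(runs))  # 14
```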
Two-level factorial experimentation might make music for my ears
Posted by mark in design of experiments, Uncategorized, Wellness on May 9, 2010
I am a fan of classical music – it soothes my mind and lifts my spirits. Maybe I’m deluded, but I swear there’s a Mozart effect* on my brain. However, a big monkey wrench comes flying in on my blissful state when my stereo speaker (always only one of the two) suddenly goes into a hissy fit. I’ve tried a number of things on a hit-or-miss basis and failed to find the culprit. At this point I think it’s most likely the receiver itself – a Yamaha RX496. However, before spending the money to replace it, I’d like to rule out a number of other factors:
- Speaker set: A vs. B
- Speaker wire: thin vs. thick
- Source: CD vs. FM radio
- Speaker: left vs. right
It’s very possible that an interaction of two or more factors may be causing the problem, so to cover all bases I need to do all 16 possible combinations (2^4). But, aside from the work this involves for all the switching around of parts and settings, I am stymied by the failure being so sporadic.
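Laying out those 16 combinations is straightforward; a minimal sketch, using the four factors listed above:

```python
from itertools import product

# All 2^4 = 16 combinations of the four two-level factors suspected
# in the speaker-hiss problem.
factors = {
    "Speaker set": ["A", "B"],
    "Speaker wire": ["thin", "thick"],
    "Source": ["CD", "FM radio"],
    "Speaker": ["left", "right"],
}

runs = list(product(*factors.values()))
for i, run in enumerate(runs, start=1):
    print(i, dict(zip(factors, run)))
print(len(runs))  # 16
```

In practice the run order would be randomized, which is exactly what a sporadic failure makes so hard to manage.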
Anyway, I feel better now, having vented this to my blog while listening to some soothing Sunday choir music by the Dale Warland Singers on the local classical radio station. I’m taking no chances: it’s playing on my backup Panasonic SA-EN25 bookshelf system.
*Vastly over-rated according to this report by the Skeptic’s Dictionary.
Creativity defeats sensibility for paper helicopter fly-off
Posted by mark in design of experiments on April 9, 2010
Twice a year I teach a day on design of experiments (DOE) at Ohio State University’s Fisher College of Business. The students are top-flight executives seeking Six Sigma black belt certification. To demonstrate their proficiency at DOE, I ask them to break into teams of three or four and, within a two-hour period, complete a two-level factorial on paper helicopters.*
It’s always interesting to see how intensely these teams from industry compete to develop the ’copter that flies longest while landing most accurately. However, this year one group stood out as being less competitive than the others, so I was very surprised when they handily won our final fly-off. It turns out that one of their factors was dropping the helicopter either wings-up or wings-down – the latter configuration being completely non-intuitive. Going upside down makes the helicopter easier to drop, flight time suffers only slightly, and the flight becomes far more accurate – a premium in my overall scoring.
“The chief enemy of creativity is ‘good’ sense.”
– Pablo Picasso
Ironically, another team, which benefited from having an expert in aeronautical engineering and a very impressive work ethic all around – they did far more runs than anyone else – never thought of flying the ’copters upside down. In fact, their team leader objected vigorously that this orientation must not be allowed, it being clearly unfair. Fortunately, the other executives in this black-belt class hooted that down.
I thought this provided a good lesson for process and product improvement – never assume that something cannot work when it can be easily tested. That’s the beauty of DOE – it enables one to screen unknown (and summarily dismissed) factors to uncover a vital few that often prove to be the key for beating the competition.
*I also do this experiment for a class on DOE that I teach every Spring at South Dakota School of Mines and Technology. In fact, I am writing this blog from their campus in Rapid City where I’ll be teaching class tonight. For details, pictures and results of prior experiments here and at OSU, see this 2004 Stat-Teaser article on “Playing with Paper Helicopters”.
Evolutionary operation
Posted by mark in design of experiments, Uncategorized on March 7, 2010
Last December, after an outing by the Florida sea, I put out an alert about monster lobsters. This reminded me of an illustration by statistical gurus Box and Draper* of a manufacturing improvement method called evolutionary operation (EVOP), which calls for an ongoing series of two-level factorial designs that illuminate a path to more desirable conditions.
With the aid of Design-Expert® software, I reproduced in color the contour plot in Figure 1.3 from the book on EVOP by Box and Draper (see figure at the right). To illustrate the basic principle of evolution, Box and Draper supposed that a series of mutations induced variation in the length of lobster claws as well as in the pressure the creatures could apply. The contours display the percentage of lobsters at any given combination of length and pressure that survive long enough to reproduce. Naturally, this species then evolves toward the optimum of these two attributes, as I’ve shown in the middle graph (black and white contours with lobsters crawling all over them).
In this way, Box and Draper present the two key components of natural selection:
- Variation
- An environment that favors select variants.
The strategy of EVOP mimics this process for improvement, but in a controlled fashion. As illustrated here in the left-most plot, a two-level factorial,** with ranges restricted so as not to upset manufacturing, is run repeatedly – often enough to detect a significant improvement. In this case, three cycles suffice to power up the signal-to-noise ratio, revealing a big manufacturing-yield improvement over the course of the EVOP. However, any number of system attributes can be accounted for via the multiple-response optimization tools provided by Design-Expert or the like. This ensures that an EVOP will produce operating conditions that are more desirable overall for process efficiency and product quality.
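Why repeated cycles help can be seen from the standard error of a factorial effect, which shrinks with the square root of the accumulated runs. A sketch with made-up numbers (a true effect of 1.2 yield points against run-to-run noise of 1.0 – purely illustrative, not from Box and Draper):

```python
import math

# For a 2^2 factorial, the variance of an effect estimate is 4*sigma^2/N,
# where N is the total number of factorial runs accumulated over EVOP cycles.
sigma = 1.0    # assumed run-to-run standard deviation (yield points)
effect = 1.2   # assumed true effect of one factor (yield points)

for cycle in range(1, 4):
    n_runs = 4 * cycle                       # four factorial runs per cycle
    std_err = 2 * sigma / math.sqrt(n_runs)  # standard error of the effect
    print(cycle, round(effect / std_err, 2)) # signal-to-noise ratio
```

The ratio climbs from 1.2 after one cycle to about 2.1 after three – roughly where an effect starts to stand out from the noise.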
It pays to pay attention to nature!
*Box, G. E. P., and N. R. Draper, Evolutionary Operation, Wiley, New York, 1969. (Wiley Classics Library paperback edition, 1998.)
**(We show designs with center points as a check for curvature.)
Management Blog Carnival, Review 2 – “Hexawise” by Justin Hunter
Posted by hank in design of experiments, Uncategorized on January 1, 2010
(Editor’s note: This blog is contributed by my son Hank – a programmer by profession. It’s the second of three in a carnival organized by John Hunter. -Mark)
Justin Hunter is the founder of Hexawise, a SaaS tool that aids in setting up tests for software using statistical methods. This also happens to be the subject of his blog – no doubt influenced in part by his father, William Hunter, author of the classic text Statistics for Experimenters. Justin started the blog midway through ’09, so the pickings are a little slim, but there is still plenty of good stuff.
Some highlights from 2009:
- 10/6 The Stackoverflow.com for Software Testers marks the release of a beta version of testing.stackexchange.com. This is a community-driven Q&A site that uses the same technology as Stack Overflow, a popular site for coders looking for help. Hunter’s version is aimed at testers and already has an impressive database of answers and discussion.
- 8/25 What Else Can Software Development and Testing Learn from Manufacturing? Don’t Forget Design of Experiments (DoE) links to a Tony Baer post comparing software development to the manufacturing industry. Hunter further focuses on the application of design of experiments, pointing out the extensive use of DoE in quality improvement initiatives at Toyota and in Six Sigma. These initiatives have yet to really penetrate the software development industry, despite some high-profile successes (Google’s Website Optimizer and YouTube are mentioned).
- 12/9 Defect Seen >10 Million Times and Still not Corrected has some interesting trivia about the grammatical error in the Lands’ End name – something I hadn’t even noticed, and apparently the company hadn’t either until it was too late. The real point of the post, however, is to point out another, much more fixable grammatical error in Google’s Blogger software. If there is only 1 comment on a post, it still says “1 comments” instead of dropping the s. A trivial defect, perhaps, but a very visible and easily fixed one. It reminds me of something Mark always says about taking a break from work to sweep the dirt off the shop floor. That is, you shouldn’t let the little inconsequential bugs pile up while you’re focused on the big ones.
On a lighter note, in Famous Quotes that Make Just as Much Sense When You Substitute PowerPoint for Power Justin linked to a post by Jerry Brito about substituting PowerPoint for Power in famous quotes, adding a few of his own. I’d also like to add:
Kirk: “Spock, where the hell’s the PowerPoint you promised?”
Spock: “One damn minute, Admiral.” –Star Trek IV
Gambling with the devil
Posted by mark in Basic stats & math, design of experiments on November 15, 2009
In today’s “Ask Marilyn” column by Marilyn vos Savant for Parade magazine, she addresses a question about the game of Scrabble: Is it fair at the outset for one player to pick all seven letter tiles rather than awaiting his turn to take one at a time? The fellow’s mother doesn’t like this; she claims that he might grab the valuable “X” before others have the chance. Follow the link for Marilyn’s answer to this issue of random (or not) sampling.
This week I did my day on DOE (design of experiments) for a biannual workshop on Lean Six Sigma sponsored by Ohio State University’s Fisher College of Business (blended with training by www.MoreSteam.com). Early on I present a case study* on a training experiment done by a software publisher. The goal is to increase the productivity of programmers by sending them to a workshop. The manager asks for volunteers from his staff of 30; half agree to go. Upon their return from the class, his annual performance ratings, done subjectively on a ten-point scale, reveal a statistically significant increase due to the training. I ask you (the same as I ask my Lean Six Sigma students): Is this fair?
“Designing an experiment is like gambling with the devil: only a random strategy can defeat all his betting systems.”
— R. A. Fisher
PS. I put my class to the test of whether they really “get” how to design and analyze a two-level factorial experiment by asking them to develop a long-flying and accurate paper helicopter. They use Design-Ease software, which lays out a randomized plan. However, the student tasked with dropping the ’copters for one of the teams just grabbed all eight of their designs and jumped up on a chair. I asked her whether she planned to drop them all at once, or what. She told me that only one at a time would be flown – selected by intuition as the trials progressed. What an interesting sampling strategy!
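The randomization the student short-circuited is trivial to honor. A sketch of shuffling eight builds from a 2^3 factorial into a random drop order (the factor names are illustrative, not the actual class worksheet):

```python
import random

# Eight helicopter builds from a 2^3 factorial in coded -1/+1 levels for,
# say, wing length, body width, and paper-clip ballast.
builds = [(a, b, c) for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)]

run_order = builds[:]
random.shuffle(run_order)  # fly in random order to guard against drift and bias
for i, build in enumerate(run_order, start=1):
    print(i, build)
```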
PPS. Check out this paper “hella copter” developed for another statistics class (not mine).
*(Source: “Design of Experiments, A Powerful Analytical Tool” by Christopher Nachtsheim and Bradley Jones, Six Sigma Forum Magazine, August 2003.)
Small sample sizes produce yawning results from sleep studies
Posted by mark in Basic stats & math, design of experiments on July 15, 2009
“Too little attention has been paid to the statistical challenges in estimating small effects.”
— Andrew Gelman and David Weakliem, “Of Beauty, Sex and Power,” American Scientist, Volume 97, July–August 2009.
In last week’s “In the Lab” column of the Wall Street Journal (WSJ)*, Sarah Rubinstein reported an intriguing study by the “light and health” program of Rensselaer Polytechnic Institute (RPI). The director, Mariana Figueiro, is trying to establish a lighting scheme for older people that will facilitate their natural rhythms of wakefulness and sleep. In one 2002 experiment (according to the WSJ), Dr. Figueiro subjected four Alzheimer’s patients to two hours of blue, red or no light-emitting diodes (LEDs). After the individuals were put to bed, their nurses made observations every two hours and found that the “blue-light special” outdid the red, 66% versus 54%, on how often they caught patients napping.
Over the years we’ve accumulated many electrical devices in our bedroom – television, cable box, clocks, smoke and carbon monoxide monitors, etc. – all of which feature red lights. They don’t bother me, but they keep my wife awake. So it would be interesting, I think, if blues would promote snooze. Unfortunately, the WSJ report does not provide confidence intervals on the two percentages – nor does it detail the sample size, so one cannot determine statistical significance for the difference of 0.12 (0.66 minus 0.54). (I assume that each of the four subjects was repeatedly tested some number of times.) According to this simple calculator posted by the Southwest Oncology Group (a national clinical research group), it would take a sample size of 554 to provide 80% power for achieving statistical significance at 0.05 for this difference!
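That figure can be reproduced with the textbook sample-size formula for comparing two proportions, continuity-corrected per Fleiss; a sketch, plugging in the 0.66 and 0.54 napping rates from the article (I can’t vouch for exactly which formula the online calculator uses):

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Sample size per group for detecting a difference between two
    proportions (normal approximation with Fleiss continuity correction)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_b = NormalDist().inv_cdf(power)          # power quantile
    p_bar = (p1 + p2) / 2
    d = abs(p1 - p2)
    n = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
         + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / d ** 2
    n_cc = n / 4 * (1 + math.sqrt(1 + 4 / (n * d))) ** 2  # continuity correction
    return math.ceil(n_cc)

n = n_per_group(0.66, 0.54)
print(n, 2 * n)  # 277 per group, 554 observations in total
```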
So, although whether blue light really does facilitate sleep remains questionable, I am comforted by the testimonial of one of the study participants (100 years old!): “It’s a beautiful light,” she says.
PS. FYI, for more sophisticated multifactor experimentation (such as screening studies), Stat-Ease posted a power calculator for binomial responses and provided an explanation in its June 2009 Stat-Teaser newsletter.
* “Seeking a Light Approach to Elderly Sleep Troubles,” p. D2, 7/7/09