The Non-Paradox of Choice
January 18th, 2011 | Posted in General Insights
Washington D.C. – Note: In this month’s guest entry, APT’s founder and current chairman, Jim Manzi, revisits the famous jam experiment and the (non) paradox of choice.
The jam experiment
Over the past decade, some academics have claimed to show scientifically that humans tend to become paralyzed by too many choices. This is often called the “paradox of choice.” Probably the best-known piece of evidence is the “jam experiment,” in which shoppers bought more jam when presented with fewer flavors than when confronted with many flavors.
But what if one of the crucial experiments at the foundation of this mountain of inference showed no such thing?
Libertarian writer Virginia Postrel opens her recent New York Times review of a new book on the topic of the paradox of choice with this:
“Sheena Iyengar is the psychologist responsible for the famous jam experiment. You may have heard about it: At a luxury food store in Menlo Park, researchers set up a table offering samples of jam. Sometimes, there were six different flavors to choose from. At other times, there were 24. (In both cases, popular flavors like strawberry were left out.) Shoppers were more likely to stop by the table with more flavors. But after the taste test, those who chose from the smaller number were 10 times more likely to actually buy jam: 30 percent versus 3 percent. Having too many options, it seems, made it harder to settle on a single selection.
Wherever she goes, people tell Iyengar about her own experiment. The head of Fidelity Research explained it to her, as did a McKinsey & Company executive and a random woman sitting next to her on a plane. A colleague told her he had heard Rush Limbaugh denounce it on the radio. That rant was probably a reaction to Barry Schwartz, the author of ‘The Paradox of Choice’ (2004), who often cites the jam study in antimarket polemics lamenting the abundance of consumer choice. In Schwartz’s ideal world, stores wouldn’t offer such ridiculous, brain-taxing plenitude. Who needs two dozen types of jam?”
It turns out that I was also told the story of the jam experiment – for the umpteenth time – at a business conference a few months ago. But what piqued my interest was Postrel’s characteristic highlighting of a telling detail that I had never heard before: those who chose from the smaller number were ten times more likely to buy jam. I’ve designed and analyzed a lot of retail experiments, and causing a 10X increase in sales by changing a shelf assortment would be a truly astounding result.
The real paradox
Before getting into the detailed analysis, stop to notice that if this result were valid and applicable with the kind of generality required to be relevant as the basis for social policy, it would imply that lots of retailers could simultaneously eliminate 75 percent of their inventory and increase sales by 900 percent. I don’t believe in purely efficient markets, but that doesn’t seem very plausible to me.
As I dug into the experiment, I became pretty sure that this is not what happened, and I’ll try to describe why.
Some detail on what the researchers actually did is important. On two consecutive Saturdays, they operated a tasting booth inside a specific grocery store in Menlo Park, California for five hours each day. Here is how the original academic paper describes the procedure:
Two research assistants, dressed as store employees, invited passing customers to “come try our Wilkin and Sons jams.” Shoppers encountered one of two displays. On the table were either 6 (limited-choice condition) or 24 (extensive-choice condition) different jams. On each of two Saturdays, the displays were rotated hourly; the hours of the displays were counterbalanced across days to minimize any day or time-of-day effects.
Consumers were allowed to taste as many jams as they wished. All consumers who approached the table received a coupon for a $1-discount off the purchase of any Wilkin & Sons jam. Afterwards, any shoppers who wished to purchase the jam needed to go to the relevant jam shelf, select the jam of their choice, and then purchase the item at the store’s main cash registers.
Across the ten-hour experimental period:
- 145 people stopped at the extensive assortment booth, and 4 of them bought jam with the coupon (a 3% redemption rate);
- 104 people stopped at the limited assortment booth, and 31 of them bought jam with the coupon (a 30% redemption rate).
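As a quick check on the arithmetic, the reported counts can be turned into redemption rates directly. The short Python sketch below uses only the four counts quoted above; the dictionary keys and variable names are illustrative.

```python
# Counts reported for the ten-hour experiment.
stopped = {"extensive (24 jams)": 145, "limited (6 jams)": 104}
bought = {"extensive (24 jams)": 4, "limited (6 jams)": 31}

# Redemption rate = coupon purchases / people who stopped at the booth.
rates = {name: bought[name] / stopped[name] for name in stopped}
for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")

# Ratio behind the "10 times more likely" headline, and total jars sold.
ratio = rates["limited (6 jams)"] / rates["extensive (24 jams)"]
total_jars = sum(bought.values())
print(f"ratio: {ratio:.1f}x on {total_jars} jars in total")
```

Unrounded, the rates come to about 2.8% and 29.8%, a ratio of roughly 10.8, and the whole comparison rests on 35 jars.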
The signal-to-noise problem
The fundamental problem that confronts all retail store experiments is “signal-to-noise”: the background variation in day-to-day store performance is typically very large compared to the actual causal effect of the business program being tested. These researchers are careful scholars who worked hard to correct for this by using alternating hours across two days, but the design is unlikely to be sufficiently robust.
What they were really testing was not the effect of changing assortment breadth on sales, but rather the effect of changing the assortment breadth of an in-store display on the redemption rate of a store-distributed coupon. While it seems intuitively unlikely that a smaller display could produce a 10X improvement in redemption over a larger one, it also seems implausible that so large a difference in response rate could be due to random chance – right? Not necessarily.
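To see why the difference looks so convincing at first glance, here is a minimal sketch of a textbook pooled two-proportion z-test applied to the reported counts (31 of 104 versus 4 of 145). This is an illustration only, not an analysis taken from the original paper, and the function name `two_proportion_z` is made up for the example. The key assumption, flagged in the code, is that every shopper is an independent draw.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic.

    Assumption: every shopper is an independent observation.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Limited assortment: 31 of 104. Extensive assortment: 4 of 145.
z = two_proportion_z(31, 104, 4, 145)

# Two-sided p-value under the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))
print(f"z = {z:.2f}, naive two-sided p = {p_value:.1e}")
```

Under the independence assumption the result looks overwhelming. The test says nothing about whether that assumption holds for shoppers who arrive in hourly blocks at a single store.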
The crucial issue for this experiment is that, in combination with this background variation, shoppers were not randomly assigned to the two groups. Because the experiment ran for a total of ten hours in only one store, and shoppers were grouped in hourly chunks, there could be all kinds of reasons why the people who happened to show up during the five hours of limited assortment had a different propensity to respond to $1 off a specific line of jams than those who arrived during the other five hours. Perhaps a soccer game finished at some specific time, and several parents whose propensities resemble one another's but differ from the average shopper's came in nearly together. Perhaps a bad traffic jam in a part of town whose residents have a non-average propensity to respond to the coupon dissuaded several people from going to the store at one time rather than another. Remember, all of the inference is built on the purchase of a grand total of 35 jars of jam. This is one reason why rigorous retail experiments, when a lot of money is at stake, are typically executed across dozens of randomly assigned stores for a period of weeks.
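The mechanism can be illustrated with a small simulation. In the sketch below the two conditions have exactly the same underlying redemption rate, so any gap between them is pure noise. The only thing that changes is whether each hour's shoppers share an hour-specific propensity. Every parameter (base rate, hourly spread, shoppers per hour) is a hypothetical value chosen for illustration, not an estimate from the jam study, and `simulate_gap_spread` is an invented name.

```python
import random
import statistics

def simulate_gap_spread(hourly_sd, trials=2000, hours=5,
                        shoppers_per_hour=25, base_rate=0.15, seed=0):
    """Std. dev. of the redemption-rate gap between two identical conditions.

    hourly_sd is the (hypothetical) spread of each hour's shared propensity
    around base_rate; 0 means every shopper is an independent draw.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        rates = []
        for _condition in range(2):
            buyers = 0
            for _hour in range(hours):
                # Everyone arriving in the same hour shares one propensity,
                # clipped to the valid probability range [0, 1].
                p = min(1.0, max(0.0, rng.gauss(base_rate, hourly_sd)))
                buyers += sum(rng.random() < p
                              for _ in range(shoppers_per_hour))
            rates.append(buyers / (hours * shoppers_per_hour))
        gaps.append(rates[0] - rates[1])
    return statistics.pstdev(gaps)

independent = simulate_gap_spread(hourly_sd=0.0)
clustered = simulate_gap_spread(hourly_sd=0.10)
print(f"gap spread, independent shoppers: {independent:.3f}")
print(f"gap spread, hourly clusters:      {clustered:.3f}")
```

With hour-level clustering, gaps between two identical conditions are more spread out than a shopper-by-shopper calculation would suggest. How much wider depends entirely on the assumed hourly spread, which a ten-hour, one-store design gives little ability to estimate.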
But the result is at least interesting, and the right way to figure out whether or not it is valid and generalizable is replication: try the same kind of experiment in other stores, geographies and product categories. Over the past ten years, academics in multiple countries have run a number of such experiments to evaluate the asserted paradox of choice for product categories ranging from mp3 players to mutual funds, and a paper published in February (Scheibehenne et al.) conducted a meta-analysis of 50 of them (h/t Tim Harford). Across all of these experiments, the average effect of increasing choice on consumption or satisfaction was “virtually zero.” Further, this meta-analysis showed a positive average effect of increasing choices for those experiments that, like the jam experiment, used consumption quantity, rather than some measure of satisfaction, as the outcome. That is, when it comes to sales, more choice is better.
This is consistent with the unpublished assortment experiments that I’ve seen and should not be surprising. As a store adds more and more products to a given product line assortment (say, canned soup), sales will rise sub-linearly with product count. That is, the first product stocked in a category will generally be one of those with the highest sales (say, Campbell’s tomato soup), and the 1,000th one added will generally have a small market. Further, people will not indefinitely add consumption of canned soup as a category just because more choices are available. Costs, on the other hand, continue to rise as the store adds more and more kinds of canned soup. At some point, incremental products in the assortment will add some small positive revenue but will also add enough cost that they will be unprofitable. So the most profitable assortment will still avoid adding some products that would drive positive revenue growth for the category. Since most assortment experiments are designed to try to find the profit optimum, adding products in this range will almost always drive some gain in revenue.
There are exceptions, such as a store that has grossly misestimated demand in some category, or a business change that combines a reduced assortment with massive investment in improving the overall merchandising of the department, and so on; but these are rare. Further, obviously at some point an assortment would get so large that sales would actually decline for practical reasons like consumers just not being able to get to products.
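The revenue-versus-cost argument can be made concrete with a toy model. The sketch below assumes a square-root revenue curve and a constant carrying cost per product. Both functional forms and all of the numbers are hypothetical, chosen only so that revenue rises sub-linearly while cost rises linearly.

```python
import math

MAX_PRODUCTS = 400  # hypothetical upper bound on assortment size

def revenue(n):
    # Assumed concave (sub-linear) curve: each added product sells less.
    return 1000 * math.sqrt(n)

def cost(n):
    # Assumed constant carrying cost per product in the assortment.
    return 50 * n

def profit(n):
    return revenue(n) - cost(n)

# Search every assortment size for the profit-maximizing one.
best_n = max(range(1, MAX_PRODUCTS + 1), key=profit)

extra_revenue = revenue(best_n + 1) - revenue(best_n)
extra_profit = profit(best_n + 1) - profit(best_n)
print(f"profit-maximizing assortment: {best_n} products")
print(f"one more product changes revenue by {extra_revenue:+.2f}")
print(f"one more product changes profit by {extra_profit:+.2f}")
```

In this toy model the product just past the profit optimum still adds revenue while reducing profit, which is the range in which most assortment experiments operate.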
The paradox of choice will surely occur in some contexts – it’s just that markets don’t seem to produce this outcome very often.