Lies, damned lies, and statistics: How to avoid the first two when using the third - Caperton Gillett

Lying? Why, I have no idea what you’re talking about. *(Credit Sabin Paul Croce)*

Someone, and I’m not going to say who, but I am going to say he shares a bed with me at night and isn’t a flatulent rat terrier, likes to joke that I’m a professional liar. Because I’m in advertising, get it? And we in advertising lie all the time.

Unfortunately, that’s not an uncommon perspective among people who aren’t actually joking. (Thinking about it, I suddenly find myself hoping my boyfriend really is joking…) A 2015 study performed by brand expert network Experticity revealed that only 47 percent of consumers extremely or even somewhat trust advertising. I personally don’t like that even a little, because I recognize that as an advertising professional, I have a great big platform and an opportunity to use my powers for good or evil. I take that responsibility seriously, and you should, too.

One of the easiest and most tempting ways to get hinky in advertising is with statistics. They seem so solid and objective. Four out of five moms prefer Mouth Glue brand peanut butter! Ninety-five percent of consumers prefer mattresses without rocks in them! It’s easy to just say your horsehair overalls are a favorite among C-level executives, but if you have a survey to back it up, that means it’s legit, right?

… Eh.

There are plenty of ways to game statistics and get around people’s natural inclination to be skeptical about advertising claims. That, in my opinion, makes it a particularly dirty practice. We, as advertisers, keep the public’s trust by being honest, and getting as close to the edge as possible without stepping over doesn’t qualify. If all you’re doing is following the letter of the law, you’re doing your audience, your clients, your industry, and yourself a disservice. That’s why I’m providing a rundown of common traps to stumble into when using statistics, how to avoid them, and why it’s so very, very important to do so.

Not gonna kid you or myself: This might get kind of dry in places, because statistics. But I encourage you to soldier through, because it’s good information that can help you be a smarter consumer of statistics, a more accurate writer, and a more ethical advertising professional.

(This blog post, incidentally, is dedicated to/blamed on my dad, who played math games with me when I was a kid and tricked me into thinking it was fun. Thanks a heap, Dad.)

Sampling

The statistical sample is the mine from which data is… mined, and if you don’t start with a good mine, you’re not going to get good… ore? (That’s what comes out of mines, right? I might should have picked a different metaphor there.) Because it’s usually unreasonable to survey an entire huge population about their views, researchers and pollsters choose a subset of that population that’s sufficiently representative of the whole, such that we can safely assume that their opinions are pretty applicable to the rest of that population.

The representative part is the important part. If you’re whipping out figures about media usage among millennials, you’re going to get nowhere with a survey of boomers. If you’re touting a product’s high standing among dentists but you’re relying on a survey of all healthcare professionals, you’re not giving your audience the complete picture. Another important factor of sampling is sample size. I’m not going to go into the math of how to know how big a sample is needed for a given population, because yikes, but know that it matters: if you’re surveying a couple dozen people to represent the entire country, you’re likely to get inaccurate results, whereas a properly randomized sample of around 1,000 is a common size for national polls.

Why does this matter? It matters because sometimes, we want to say that four out of five dentists recommend Trident gum. If 80 percent of a large number of dentists surveyed really do recommend Trident, that’s one thing. If all the dentists at Five Brothers Dentistry like Trident except for Brad, it’s no longer a commentary on the benefits of Trident but on Brad’s personal taste. And you know what? Brad tends to have pretty good taste. I mean, it’s Brad.
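No heavy math required, but for the curious, here’s a rough sketch of why Brad carries so much weight. It uses the Wilson score interval, which is one standard way (my pick here, not the only option) to estimate how far a survey percentage might sit from the truth at roughly 95 percent confidence. The function name and both surveys are made up for illustration, and the whole thing assumes a properly random sample.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a survey proportion.

    Assumes a simple random sample; z=1.96 corresponds to ~95% confidence.
    """
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical surveys: both can be reported as "four out of five dentists."
for yes, total in [(4, 5), (800, 1000)]:
    low, high = wilson_interval(yes, total)
    print(f"{yes} of {total}: plausibly anywhere from {low:.0%} to {high:.0%}")
```

With five dentists, the plausible range runs from well under half to nearly everyone. With a thousand, it tightens to within a few points of 80 percent. Same headline, very different amounts of evidence behind it.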

Methodology

The method by which a survey is performed can make a huge difference in the outcome. If a sample isn’t randomized, and is loaded down with people who are likely to have a specific opinion, it’s not going to be representative of everyone in the population. If it isn’t properly weighted — and again, not going to go into the specifics, but know that weighting is actually a good thing and not a bad thing — you aren’t likely to get good results.
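Since weighting sounds shadier than it is, here’s a minimal sketch of the basic idea. Every number and group name below is invented for illustration: a hypothetical survey that reached too many boomers and too few millennials, compared against a population that’s split evenly. Real pollsters weight on many more factors than this.

```python
# Hypothetical survey: each group's share of the sample vs. the real population,
# and the fraction of each group that likes the product. All numbers are made up.
sample_share = {"boomers": 0.80, "millennials": 0.20}
population_share = {"boomers": 0.50, "millennials": 0.50}
approval = {"boomers": 0.90, "millennials": 0.40}

# Unweighted: every respondent counts equally, so the over-sampled group dominates.
unweighted = sum(sample_share[g] * approval[g] for g in approval)

# Weighted: each group counts in proportion to its share of the population.
weighted = sum(population_share[g] * approval[g] for g in approval)

print(f"Unweighted approval: {unweighted:.0%}")
print(f"Weighted approval:   {weighted:.0%}")
```

In this made-up case, the raw survey says 80 percent approval, while the weighted figure is 65 percent. Bragging about the first number when the second is closer to reality is exactly the kind of thing to avoid.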

Even the phrasing of survey questions can make a difference. A leading question like “What do you like most about [X brand]?” will get very different results from “How likely are you to recommend [X brand] to your friends?” A question about “KPIs” could get different results from one about “metrics” or “goals.” A question inviting people to rate your product on a scale from “Good” to “Awesome” isn’t going to give you accurate results, and if you then brag about the percentage of consumers who rate your product “Very Good,” you’re not telling the whole truth.

Before you use statistical data, make sure you understand how it was obtained, and be transparent with your audience about where your claims come from.

Averages

One of my pettest of peeves here.

California is 40th in the nation in terms of average annual precipitation, beating out (among other states) Colorado, Idaho, and North Dakota. So why was California on freaking fire and those other states weren’t? Because we’re talking averages, and averages don’t mean jack. See, nearly 90 percent of California’s annual precipitation falls between November and April, leaving the summer and fall nearly rainless, the vegetation crispy, and the ground dry, cracked, and just waiting to turn into a mudslide with the first heavy rain. But on average, it all looks pretty good.
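To see how an average can paper over a pattern like that, here’s a small sketch. The monthly figures are hypothetical, not real weather data. I’ve only shaped them so that about 90 percent of the year’s rain lands between November and April, as described above.

```python
from statistics import mean

# Hypothetical monthly rainfall in inches, January through December.
# Invented numbers, arranged so about 90 percent falls November through April.
rain = [3.5, 3.3, 2.8, 1.4, 0.4, 0.1, 0.0, 0.0, 0.2, 1.1, 2.2, 3.0]

wet_months = rain[:4] + rain[10:]   # January-April plus November-December
dry_months = rain[4:10]             # May-October

print(f"Average per month, whole year: {mean(rain):.1f} in")
print(f"Average per month, May-Oct:    {mean(dry_months):.1f} in")
print(f"Share of rain falling Nov-Apr: {sum(wet_months) / sum(rain):.0%}")
```

The yearly average works out to an inch and a half a month, which sounds fine. The May-through-October average is a small fraction of that, and that’s the number the vegetation actually has to live on.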

When people casually throw around the word average, they’re usually talking about one of three statistical measures of central tendency.

Mean. Technically, when we talk averages, we’re talking about the mean: add up all the numbers, and then divide by the number of numbers. When there aren’t significant outliers in the group, this can provide a reasonably reliable view of the mathematical situation. When there are, you can look at the math, see California getting an inch and a half of rain every month, and wonder where all that smoke is coming from. The mean can be mathematically correct and still not be the most accurate way to convey information.

Median. The median is the middle value in an ordered list of numbers, and it’s one you frequently hear about for groups that tend to be more skewed. That’s why you hear references to “median income”: household income stops at zero on the low end and goes up to Jeff Bezos on the high end. Put Jeff in an Uber pool with three other people, and it may be accurate to say that everyone in the car is worth an average of $11 billion, but both Jeff and his fellow riders would take a whole lot of offense at the characterization. Using the median gives you an idea of what the middle looks like without it being thrown off by that one huge, bald outlier.

Mode. The mode is the most frequently occurring number. Asking what kind of doughnuts everyone in the office likes, and jelly-filled gets the most votes? Jelly-filled is the mode, and your office has lousy taste in doughnuts. (The correct answer, of course, is lemon-filled.) If your service could be rated one to five stars and you get mostly four, the mode is four, and you’d probably get that fifth star if you had better doughnuts.
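If you’d rather see the three side by side, here’s a quick sketch using Python’s built-in statistics module. The net worths and star ratings are hypothetical, chosen only so the Uber pool averages out to roughly $11 billion a head.

```python
from statistics import mean, median, mode

# Hypothetical net worths for the Uber pool: three regular riders and one billionaire.
net_worths = [50_000, 80_000, 120_000, 44_000_000_000]
print(f"Mean:   ${mean(net_worths):,.0f}")
print(f"Median: ${median(net_worths):,.0f}")

# Hypothetical star ratings for a service.
ratings = [5, 4, 4, 3, 4, 5, 4, 2]
print(f"Mode:   {mode(ratings)} stars")
```

The mean comes out to about $11 billion, the median to $100,000, and the mode of the ratings to four stars. All three are “averages,” and each tells a very different story about the same kind of data.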

And don’t forget that when you’re talking about averages, average doesn’t mean most. It just means somewhere in the middle. The average height of American women is 5’4”, but that doesn’t mean that most women in the U.S. are that height and a few of us monsters are up here banging our heads on doorways. Because height is distributed fairly symmetrically, it means that roughly half of women are taller than 5’4” and roughly half are shorter. So if you find yourself talking about the “average woman,” the “average consumer,” the “average viewer,” make sure you aren’t accidentally (or “accidentally”) applying a mathematical property to imply some greater trend or tendency.

(Who can give you a better, more thorough overview of measures of central tendency? Jim, that’s who. Jim, of Statistics by Jim.)

Sources

Statistics can come from a variety of different sources, and as you might imagine, some parties are a little more interested than others. Because of that, you’ll want to make sure that, as with everything else, you yourself understand where you’re getting your information, and you’re honest with your audience about where you got it. A survey performed by the National Association of Widget Enthusiasts may (or may not) have different results from one sponsored by WidgetCo. That doesn’t mean that WidgetCo’s survey is automatically inaccurate or biased, but you owe it to your audience to disclose that potential conflict of interest when you’re using results from that survey.

So when I say I was voted best copywriter in Birmingham by a panel of my mom, I’m not lying, and you should definitely trust her judgment — but I’d also understand if you felt you needed some input from other, more objective sources.

One out of one Capertons thinks proper, transparent use of statistics is crucial to ethical advertising. And she’s not alone. The Institute for Advertising Ethics, administered by the American Advertising Federation, has established eight principles and practices for being an ethical advertiser, among them exercising “the highest personal ethics in the creation and dissemination of commercial information to consumers.” (Full disclosure: As a former president of the Birmingham chapter of the AAF, I naturally think the AAF is awesome.) I encourage you to learn more about the Institute and even take their ethics certificate program to make yourself a better, more honest person and advertising professional.

And if you want to learn more about how great I am, I’m happy to hook you up with my mom’s contact information. She’s a fully objective source of information about me. Cross my heart.
