*Welcome to Article II in this 5-part article series! You can find the other articles in this series by clicking the links below!*

- Size & Power Series I: How Stats Logic is Human Logic
- Size & Power Series III: Living in a Simulation
- Size & Power Series IV: A Confidence (Interval) Scheme
- Size & Power Series V: The Power and the Glory (and the Sample Size)

Ok! In Article I, we discussed how stats logic isn’t *really* as foreign as we may have first thought. Now that we’ve cleared that up, we’re about a third of the way, conceptually, to having a conversation about power and ideal sample size, I promise! But we *still* have a few more concepts to unpack with respect to how a statistician actually *uses* that logic. To do that, we’re going to have to broaden our thought experiment from the first article about stomach aches to something a bit more realistic and interesting.

Picture this: Your stomach ache investigation has sparked your curiosity. You recognize that *some* stomach aches *are* almost certainly caused by hunger (i.e., the **true** % of *all* stomach aches caused by hunger is *not* 0%), but also that *some* stomach aches are almost certainly *not* caused by hunger (i.e., the **true** % of *all* stomach aches caused by hunger is *not* 100%). So, that realization leads you to wonder: *“What % of all stomach aches are caused by hunger?”*

One key difference between this question and the simpler one from our first article is that we are no longer interested in a *single* stomach ache but rather in **all** stomach aches. In Statistics Land, we’d call this group of all stomach aches the **population**–it’s *every* “potential subject” that exists that is relevant to our question, given its *stated* scope. That is, we can’t answer the question “What % of all stomach aches are hunger-related?” without considering **all** stomach aches that have **ever** occurred, right?

What quickly becomes obvious, with a question like this one (or with so many of the questions we ecologists might ask), is that **trying to answer an important question about a population at the level of the population is utterly intractable.** Imagine trying to even *document* all the stomach aches occurring in your town on a single day, let alone *all* stomach aches occurring *everywhere* in the **world**! You’d never be done, you’d have to annoy an awful lot of people just to see if they *might* have a stomach ache, you wouldn’t get to the bottom of the causes of most of those stomach aches, you’d never be totally sure how many stomach aches you were missing, etc., etc.

*The answer to your question will never come!*

What *can* we do, then, to explore this question? After all, scientists do explore questions of a similar scale to this one all the time! Answer: We shrink (and/or abstract) the problem. *We collect meaningful data from a small and carefully selected number of relevant subjects, and we get an “answer” to our question at the level of this group*.

For example, we might track down 30 stomach ache cases and document their causes. We might find out, in doing this, that 6 of them were caused by hunger, so our answer to the question of “what percentage of stomach aches are hunger-related” for this group was 6/30 = 20%. In Statistics Land, we’d call this small group our **sample** and this *sample-level* answer to our question our **sample statistic** (hence the name “statistics!”).
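If you like seeing ideas as code (a small taste of the simulation spirit coming in Article III!), here’s a minimal sketch of computing that sample statistic, with the records invented to match the 6-out-of-30 example above:

```python
# Hypothetical records for 30 stomach ache cases: True = hunger-related.
# (These data are invented to match the 6-out-of-30 example.)
sample = [True] * 6 + [False] * 24

# The sample statistic here is just the proportion of hunger-related cases.
statistic = sum(sample) / len(sample)
print(statistic)  # 0.2, i.e., 20%
```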

Then, we have a pivotal next question to ask: *How similar to the entire population was our sample?* If we think our sample was just like the population in every important way–just smaller–we might then use our *statistic* to make an “informed prediction” as to what the answer to our question might be at the level of the *population*. We can call our population-level answer the **parameter of interest**, so we can rephrase all of this by saying “*What is the most likely value for the parameter, given the statistic we got?*”

As we try to answer that question, there’s an **incredibly** important logical assumption we will make that much of statistics (as we will discuss them, anyway) rests upon: **The more similar our sample is to the population of interest, the more alike the statistic and the parameter should be.** This should make a lot of sense, actually! If we take this logic to its extreme, the merits of this assumption become brutally obvious–**a sample *exactly* the same size as the population (i.e., the sample and the population are the *same thing*) would produce a statistic *exactly equal to* the parameter without fail!** I mean, for example, the average of the entire group must be equal to the average of the entire group, right? Right.

However, what happens if we move to the opposite extreme? A sample of 1 subject is **much** less capable of being *exactly* like the population (if it even can be at all)–in our stomach ache example, we’d *either* get a statistic of 0% (the one stomach ache we looked at was *not* hunger-related) or 100% (it *was* hunger-related), and those are the *only* two options. If the true parameter were really around 50% (half of all tummy aches are hunger-related), our statistic and our parameter would be *really* far off from each other no matter which sample of 1 we took! But even just doubling the sample to two subjects would *significantly* improve things–it’d be *possible*, at least, to get a statistic of 50% (and it’d happen in 50% of all such random samples)!
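To make this concrete, here’s a quick simulation sketch (assuming, purely hypothetically, that the true parameter is 50%) showing that a sample of 1 can *never* land on 50%, while a sample of 2 lands there about half the time:

```python
import random

random.seed(42)  # just for reproducibility of this sketch

TRUE_P = 0.5      # hypothetical true parameter: 50% hunger-related
TRIALS = 100_000  # number of random samples to draw per sample size

def sample_statistic(n):
    """Draw n subjects at random and return the sample % hunger-related."""
    return sum(random.random() < TRUE_P for _ in range(n)) / n

# With n = 1, the statistic is always 0% or 100% -- never 50%.
hits_n1 = sum(sample_statistic(1) == 0.5 for _ in range(TRIALS))
print(hits_n1 / TRIALS)  # 0.0

# With n = 2, the statistic is exactly 50% whenever we happen to draw
# one hunger-related case and one non-hunger-related case.
hits_n2 = sum(sample_statistic(2) == 0.5 for _ in range(TRIALS))
print(hits_n2 / TRIALS)  # ~0.5
```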

We just established another important concept there! **As we increase the size of our sample, we become more and more likely to get a sample that is a (more or less) perfect microcosm of the population we’re interested in, and thus to get a statistic similar to the parameter.** I hope this “rule” is just intuitive for you because, if not, the reasons behind it are a little hard to explain. The way I like to think about it is this: “Oddities” and random chance–things that can cause a sample to become dissimilar to the population–tend to get increasingly overwhelmed by “predictable processes” and “normalcy” as we make a sample larger, so we will necessarily gravitate towards the “truth.”

What I just described is *so* fundamental and predictable a process that it’s actually enshrined in **theorems**–most directly the Law of Large Numbers, alongside its famous cousin, the Central Limit Theorem (CLT). As it turns out, in many cases, even a sample of just ~30-50 subjects drawn thoughtfully from a population of *millions* can give you a surprisingly accurate guess at the parameter! That’s the “miracle of probability” for you!
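The next article is all about simulation, but as a teaser, here’s a sketch (again assuming a hypothetical true parameter of 50%) showing how the average gap between the statistic and the parameter shrinks as the sample grows:

```python
import random

random.seed(1)  # just for reproducibility of this sketch

TRUE_P = 0.5    # hypothetical true parameter: 50% hunger-related
TRIALS = 2_000  # random samples to draw per sample size

def avg_error(n):
    """Average |statistic - parameter| across many random samples of size n."""
    total = 0.0
    for _ in range(TRIALS):
        stat = sum(random.random() < TRUE_P for _ in range(n)) / n
        total += abs(stat - TRUE_P)
    return total / TRIALS

# The average "miss" shrinks steadily as the sample gets bigger.
for n in [1, 5, 30, 200]:
    print(n, round(avg_error(n), 3))
```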

However, the key word in the previous paragraph is “thoughtfully.” The CLT has an important caveat–it *only* holds true so long as our sample is **representative**. The word “representative” is doing some *seriously* heavy lifting here, but to simplify, this is a fancy way of saying “*the sample isn’t systematically unlike the population in some way*.” This caveat also makes intuitive sense, I hope! If, in reality, 99% of stomach aches are hunger-related, but we purposefully chose a sample that *only* included non-hunger-related stomach aches, even if that sample were huge, we would get an “answer” really far from the “truth!” That’s a nugget of wisdom worth remembering: The CLT holds *immense* power, but it has *limited* power to protect us from ourselves and our capacity to collect bad samples.

Here’s the important question we just begged, so to speak: How could we get an **unrepresentative** sample? To simplify, there are **three** common ways:

1. **Our sample is just too small (and the world is *very* quirky/random).** We kind of already established this! The smaller our sample, the more “random chance” and “quirky cases” can push our sample away from the population as a whole. You can think of this as the logic of the CLT but in reverse: If a sample gets *more* like the population the *bigger* it gets, it makes sense that it may also get *less* like the population the *smaller* it gets. There’s no *guarantee* a small sample will be unrepresentative, but it increases the odds.
2. **Our sample is *biased*.** Gasp! I said a bad word! *Bias, intentional or not, is the antithesis of good science, and even powers as great as the CLT can’t save us from it*! Here, I’ll call a **biased sample** one collected in *any* way, intentional or not, that favors inclusion of some subjects from the population over others in a way that runs parallel to our hypothesis (either for or against). That’s a bit clunky, so here’s an example: If we collect stomach-ache data from only people suffering stomach aches while inside restaurants, where hunger *should* be a less common cause than in most other locations, we are *probably* going to get a sample that suggests hunger-related stomach aches are a lot less common than they really are, right? **If you want to perform bad science, intentionally collecting a biased sample favorable to your hypothesis is one of the easiest ways to do it!**
3. **Our sample didn’t cut *across* subgroups in the population.** This is sort of related to bias, but I think of it as the same problem wearing a different hat. What if we only collect a sample of stomach ache cases from here in the US–what problems might this cause? Well, for various reasons, hunger is less of an issue here in the US than in other parts of the world. Americans are a distinct subgroup of all humans; by ourselves, on this issue at least, we cannot be a **representative** sample of *all* humans!
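Here’s one more hypothetical simulation sketch (all numbers invented for illustration) of that second failure mode: a huge but biased “restaurants-only” sample misses the truth badly, while a far smaller *random* sample lands close to it:

```python
import random

random.seed(7)  # just for reproducibility of this sketch

# A made-up population of 100,000 stomach aches. Overall, 50% are
# hunger-related, but only ~5% of the 10,000 restaurant cases are.
population = (
    [("restaurant", random.random() < 0.05) for _ in range(10_000)]
    + [("elsewhere", random.random() < 0.55) for _ in range(90_000)]
)

def pct_hunger(cases):
    """The sample statistic: proportion of cases that are hunger-related."""
    return sum(hungry for _, hungry in cases) / len(cases)

# A huge but biased sample: all 10,000 restaurant cases.
biased = [case for case in population if case[0] == "restaurant"]
print(round(pct_hunger(biased), 2))      # ~0.05 -- way off the truth

# A modest *random* sample of 500 cases drawn from everyone.
randomized = random.sample(population, 500)
print(round(pct_hunger(randomized), 2))  # close to the true ~0.5
```

Note that the biased sample is twenty times larger than the random one and still does far worse: size cannot rescue a sample from bias.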

We get around these problems by: 1) Taking large(-ish) samples, 2) Being aware of biases (unconscious or not) and working to eliminate them with our study design (e.g., by “blinding” our study so we don’t know which subjects are getting which treatments), 3) Identifying key subgroups in the population *beforehand* and sampling *across* them (e.g., “blocking” in experiments), and, most importantly, 4) Using **randomness** to construct our samples so that bias and subgroups are less of a problem (or temptation!).

So, across the two articles in this series so far, we’ve seen that statisticians use age-old logic to think about which explanations are best while acknowledging we can never be certain that any explanation is definitely correct. We’ve also seen that, to find explanations to large problems, statisticians try to make those problems smaller, which generally works quite well so long as they do it *carefully*.

In the next article, we’ll apply all these ideas to our question about the true % of hunger-related stomach aches, and we’ll take advantage of one of my favorite tools to do it: simulation! You can get to that third article here.
