This is Post 1 in a five-part series on statistical logic, statistical inference, confidence intervals, and, finally, using equations to make guesses about “ideal sample sizes” for studies. Click the links below to check out the other articles in the Series!
- Size & Power Series II: Walk Like a Statistician
- Size & Power Series III: Living in a Simulation
- Size & Power Series IV: A Confidence (Interval) Scheme
- Size & Power Series V: The Power and the Glory (and the Sample Size)
We dive into the sometimes-harrowing world of “Capital S Statistics” for this article series! I’ll profess that I would not consider myself a statistician, even though applied stats has always interested me. I’d prefer the title “ecological data scientist”; I take the more holistic view that all things we do with and to our data matter, not just how we analyze them!
So, while this article series is about statistics, it’s also about much more–it demonstrates the deep connections between how we gather data, analyze them, and communicate what we’ve learned from them. In other words, this series will hopefully prove the point that data and research are linked by much more than just the statistical tools we use!
This article series is also not written by a “true statistician,” which is probably good and bad! Some of what I tell you could be somewhat wrong (though I’m fairly sure it isn’t), but a lot of what I tell you should be more accessible as well!
What spurred this article series: In my role as MAISRC’s staff Quantitative Ecologist, I have received several requests lately for support/guidance in re: “ideal sample sizes.” Our researchers are curious: How many subjects does their study need for it to have a decent chance of finding whatever result it is they are expecting?
In ecological research, especially, wanting formal and objective guidance on when a study is “big enough” is a really understandable desire! I dunno if you’ve noticed, but the natural world is a bit…chaotic. The forces at work in nature are much harder to control than the forces at work in a lab. These forces of nature (e.g., the weather) are also large and unpredictable. Plus, the subjects we’re studying may be quite large themselves (think moose, an entire lake, or even a whole country’s climate!).
All these things combine to make ecological research a bit…nightmarish, even at the best of times! It can be difficult, expensive, or downright impractical to make a study much larger (YOU try tranquilizing more than 8 bears in a single, weeklong field study, why don’t you?!). It can also mean that the real trends, patterns, and relationships of the natural world, the ones we are trying to observe, are often transient, inconsistent, or hidden beneath piles of “noise.”
Thus, finding a “sweet spot,” where our study is big enough but not too big, is often crucial to our success! Luckily, statistics offers some useful tools for helping us find this sweet spot–even if, like other statistical tools, they are easily misunderstood, fallible, and susceptible to spawning blind faith when they shouldn’t.
So, in the first two articles in this series, I’ll briefly review all the key statistical concepts we’ll need to get comfy with to understand these tools and how they work (articles I intend to “reuse” a lot when discussing further topics, as they outline some foundational ideas about how statistics works)! Then, in the third and fourth articles, I’ll explain how we can apply those concepts to investigate a question that might be of interest to us. Finally, in the fifth article, we’ll leverage these tools in a different way to get what we’re really after here: Finding an “ideal” sample size that will help to ensure our test has the statistical power we’re after. I’ll close that article with thoughts on the implications of what we’ve discussed, including how use of these tools in this way can “go wrong.”
The Logic of a Statistician–a Quick Review
When I taught Biometry here at UMN in Fall 2021, while subbing for the course’s regular instructor, Dr. Fieberg, I taught a lot of the concepts contained in the first two articles in this series over ~8 weeks, with two 75-minute lectures and one two-hour lab per week in which to do that. Soooooo, trying to explain ALL THAT STUFF in two brief articles gives me a bit of a sinking feeling! But let’s try anyway!
Let’s start with the single most important idea of this first article–the one in the title! Stats logic is human logic. Statistical inference–which I’ll define in a moment–is not an esoteric set of tools cooked up a few years back by some cruel academics looking to punish undergraduates (though it often feels that way)! Instead, it’s based on the same logical steps humans have used for literally thousands of years to make informed, rational decisions–it has just since been buoyed by advances in mathematical theory.
I’ll try to prove that assertion, over the course of this article, using a silly thought exercise, one that we will expand in later articles. Suppose your stomach hurts, and you notice that it hurts. Congrats: You’ve made an observation about the world! That’s step one.
Now imagine that that observation sparks a question in your mind–“why does my stomach hurt?” Congrats: That’s step two! Being curious about the “whys” of the universe is not just integral to statistical inference, it’s integral to the human experience.
Now, if you were really living this scenario, what would you do next? Consciously or not, you’d probably begin searching through your past experiences and knowledge for potential explanations–for things that can cause one’s stomach to hurt. If this were a more complex or unfamiliar question, maybe you’d also do some reading or ask your friends to buoy your baseline understanding a bit more, but, here, we don’t need to. We’ve all had many stomach aches and likely have many explanations for them at the ready!
Let’s say we suspect our stomach hurts because we’re hungry. After all, it’s been 6 hours since we sat down to crunch our data, and we didn’t pack enough Flaming Hot Cheetos to get us through the process. Congrats–we’ve selected a hypothesis, an educated (sometimes more educated than other times!) explanation for why we have observed what we’ve observed. To put a finer point on it, we’ve IDed a process (hunger) capable of causing the pattern (stomach pain) we’ve observed.
What’s next? Well, if we’re motivated enough by our stomach pain, we will probably next devise a test–a set of circumstances that, if our hypothesis is right, will produce outcomes we might predict successfully before we even actually do the test. In this case, my test might be “I’ll go buy a bag of chips, eat them like the depraved monstrosity I am (test), and, if my stomach hurt because I was hungry (hypothesis), I’ll expect my tummy ache will go away (prediction).” That last bit is my prediction–the pattern I expect to see during my test if my hypothesis is correct.
So far so good! From here, imagine we’ve performed our test, and our stomach ache went away, just as we predicted! Our bag of chips “cured” our hunger…right? I mean…right?? Our stomach hurt + stomachs can hurt because of hunger + we ate + our stomach no longer hurts = we were hungry…seems like flawless logic to me! I mean…right?!
…But let’s indulge my over-active imagination for a second: What if your stomach really hurt because you had been chewing on your pen tip absent-mindedly for the last four hours and you had ingested so much ink that you had mildly poisoned yourself?? Your stomach was hurting, in reality, as an attempt by your body to get you to stop eating ink, not to get you to eat something else instead! However, by eating something else anyway, you diluted the ink in your stomach enough that it could turn off the “alarm,” so to speak.
Ok ok…feels implausible, but it’s just one potential alternative hypothesis that could exist, right? If we’re reasonably clever, it might not be hard to think of tens or even hundreds of these, even if many feel downright implausible. And there is always another big one, lurking beneath every test we do: That there really was no connection at all between our stomach ache and the bag of chips we ate; the universe is just a random place sometimes, and the results we get are just one big coincidence. We’d call this “Everything is boringness and chaos and random chance” explanation our typical “null hypothesis.”
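To make that null hypothesis a little more concrete, here is a minimal Python sketch of what “just a coincidence” could look like. Everything in it is an assumption for illustration: it supposes, purely for the sake of argument, that a stomach ache fades on its own within the hour 20% of the time, chips or no chips, and the names `CHANCE_OF_FADING_ANYWAY` and `simulate_null` are hypothetical.

```python
import random

# Assumption for illustration only: under the null hypothesis, a stomach ache
# fades on its own within the hour 20% of the time, whether or not we eat.
CHANCE_OF_FADING_ANYWAY = 0.20


def simulate_null(n_trials, chance=CHANCE_OF_FADING_ANYWAY, seed=42):
    """Return the fraction of simulated stomach aches that fade by chance alone."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    faded = sum(rng.random() < chance for _ in range(n_trials))
    return faded / n_trials


fraction_faded = simulate_null(n_trials=10_000)
print(f"Stomach ache faded by coincidence in {fraction_faded:.1%} of trials")
```

Under this made-up assumption, the printed fraction should land near 20%. In other words, a single “success” with the bag of chips is something random chance alone could hand us about one time in five, which is hardly a rare event.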
The key concept I have been painstakingly building toward with this thought exercise can be revealed by asking two questions at this point:
- What was our test trying to demonstrate, and
- Can it demonstrate that thing unequivocally (i.e., without any doubt)?
The answer to the first question is (probably) that our test was trying to demonstrate that our stomach hurt because we were hungry. That is, it was trying to show that eating was the cause, that the stomach ache going away was the effect, and that those two things were fundamentally linked.
Did it actually do this, though? It may depend on your personal philosophy, I guess, but the “right” answer is probably much closer to “no” than to “yes.” On the one hand, we did get precisely the results we were expecting, and decisively so! That’s a good sign, and we shouldn’t discount that! On the other hand, though, there are other explanations out there, ones that could have yielded the exact same results as those we observed at least some of the time (recall the pen ink example from earlier!).
So, while we certainly got results supportive of our hypothesis, did we get unequivocal results? No. Can we imagine a test we could have done that could have got us unequivocal results, i.e., one that could have simultaneously ruled out every other possible explanation and supported our hypothesis? If this is your hope, good luck to you, friend! To rule out every other possible explanation for why my stomach hurt that one time would be an exhaustingly impossible task! I could never be quite sure there wasn’t another explanation, however unlikely, out there I just hadn’t ruled out yet.
At this point, I’ve probably reduced you to a pretty nihilistic place: How do we ever feel certain about anything we think we know?! Certainly, we do feel certain about things sometimes! And in at least a good number of those instances, we ought to! But how??
Here are some cool truths lying at the heart of statistical logic:
- We are very rarely interested in determining the likelihood that an explanation is right in isolation because, well, that just isn’t possible! As we saw above, considering only one possible answer for a question like “Why does my stomach hurt?” at a time means blinding ourselves to all other potential explanations–at best, we’d only be seeking evidence to support what we think we already know rather than trying to find the “truth.”
- Instead, what we are really interested in is an explanation’s relative likelihood. So: “If [this] is what I think is going on (hypothesis), [this] is what I did to explore that possibility (test), [this] is what I expected (predictions), and [this] is what I saw (results), how likely was I to see those results if my explanation was right?” Also: “How likely were those results if my explanation was actually wrong?” These are really the questions we should be asking (and are asking, behind the scenes!) whenever we’re trying to explain anything!
- If there is more than one potential explanation for something (which there almost always is), and if there is any chance you could get the results you got even when your “preferred” explanation is actually wrong (which there almost always is), and you didn’t do enough tests to be 100% sure your explanation is the only right one (which you definitely didn’t because you couldn’t), then it quickly becomes clear that “proof” from any (one) test is a fantasy. We can find results that support a hypothesis, or we might find a lack of support for one, or, in some cases, we might even be lucky enough to reject a hypothesis as clearly wrong, but that’s the best we can do!
- This means we are always operating in a context of uncertainty and subjectivity! The truth can never be known with absolute certainty. Without the potential to prove an explanation, we are instead left with a decision: When should I feel I have enough support for a hypothesis, and enough “anti-support” for enough other hypotheses, to conclude that my explanation is “right enough?” You may not think about it this way very much, but every time you decide you are confident enough in a conclusion to act upon it, you’re weighing this decision at some level.
- That decision above can’t ever be fully objective, but if we use probability theory, we can at least put some reasonable numbers on the likelihoods discussed above and then use some defensible thresholds to make the decision more objectively and defensibly!
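The idea of relative likelihood in the bullets above can be sketched in a few lines of Python. This is only an illustration under invented assumptions: the three probabilities are made up, and the function name `relative_likelihoods` is hypothetical. It asks how likely our result (the stomach ache going away after the chips) was under each explanation, then compares those answers to the null rather than judging any one explanation in isolation.

```python
# All numbers below are invented for illustration only.
# Each value is P(stomach ache goes away after eating chips | explanation is true).
likelihoods = {
    "hunger": 0.90,
    "ink poisoning": 0.30,
    "coincidence (null)": 0.10,
}


def relative_likelihoods(likelihoods, reference):
    """Divide each explanation's likelihood by the reference explanation's."""
    base = likelihoods[reference]
    return {name: value / base for name, value in likelihoods.items()}


ratios = relative_likelihoods(likelihoods, reference="coincidence (null)")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x as likely as the null to produce what we saw")
```

With these made-up numbers, hunger comes out roughly nine times as likely as pure coincidence to have produced our result, and ink poisoning roughly three times as likely. Neither ratio proves anything; they simply put numbers on the comparison so that we can decide, with a threshold we’re willing to defend, when an explanation is “right enough.”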
While we may not need that kind of numerical “heft” with a question like “why did my stomach hurt?”, we absolutely do when the question involves environmental health, sustainability, and millions of dollars in funding, as many ecological questions do!
Now that we better appreciate the logic that lies beneath statistical reasoning–and we appreciate that this reasoning is neither as “foreign” nor as “arcane” as we might have initially thought–we are ready to consider how a statistician actually applies this logic to assign the likelihoods we just discussed. That’s the topic of Article 2 in this series, which you can get to by clicking here!