What makes for a good scientific protocol?

This week, my duties largely revolved around writing protocols for some upcoming data collections and analyses. The process has been eye-opening for me for a couple of reasons. First, the research being planned is social science research, to which, I think it is fair to say, I am less accustomed. Second, it spawned a thought process that has stuck with me: Am I doing this right?

Let me explain. My biological education involved many things, but one of the things it did not involve much of was analytical laboratory training (at least, not at first!). It’s that kind of training that, I assume, has prepared many of my peers to compose and use scientific protocols well. I don’t have that same background, so I have had to pick up protocol skills as I’ve gone along. Now, before I conduct any kind of data gathering process, especially one that will occur in a lab, I take the time to write up a protocol for that process. This is especially true if I intend for someone besides me to conduct some or all of the data gathering (like undergraduate assistants), but I usually write one up even for processes that will be done entirely by me.

I assume (and hope) that this is pretty standard practice. What dawned on me this week was that writing a protocol and writing a good protocol needn’t be the same thing, and no one ever formally taught me how to write a good protocol. It was just something I recognized I needed to do to ensure I did good research, but who or what is to say that the protocols I’ve been writing are as good as they could be? Maybe, just maybe, there is room for a how-to guide on writing a protocol that would stand up to scrutiny in the same way our other scientific documents must.

I did some Googling, and I’m not convinced that there is a definitive guide for what makes for a good scientific protocol out there yet (if you know of one, please forward it along!). Interestingly (or sadly, depending on your perspective), I found the Wikipedia page on the subject helpful. It at least got me thinking about the elements that I think should be essential for every protocol that’s written. Maybe that’s a good enough starting point for now.

Here’s what I’ve come up with so far in terms of essential protocol elements:

  1. At a minimum, the protocol should include (1) the most recent author/editor’s name and the date the protocol was last updated; (2) a purpose statement (what is the point of the process being described?); (3) a list of materials that should be assembled prior to beginning the analysis; (4) a list of any safety hazards involved and means of reducing those hazards; and (5) step-by-step instructions for performing the process in question. Beyond these things, a good protocol should also…
  2. …Be in a form that you would be proud (or at least not reluctant) to submit along with your manuscript for review. I don’t even just mean here that it should be scientifically rigorous (hopefully that’s a given!); I mean that it should look polished and professional. It should have been proofread, steps should be complete sentences, acronyms should be spelled out on first mention, and so forth. These should be documents that could have “lives” outside of your lab. I’d actually love to see more protocols get published—sometimes, it’s surprisingly hard to find a robust means of obtaining a specific kind of data, and I feel I’ve had to “re-invent the wheel” many times because of that. Maybe that would be less the case if more protocols were public.
  3. The more detail, the better. This is where I suspect some/many will disagree with me. I’ve certainly seen many very short protocols in the past, with no more than quick bullet points for steps, no equipment list, etc. I guess this point relates to my last one; I see the purpose of a protocol to be a complete how-to guide on how to conduct the data gathering process exactly as I would have ideally conducted it. Someone should, in theory at least, be able to recreate exactly what I have done with only the protocol as a guide. That would be the ideal, no? Protocols should enable replicability across labs, not just within them, right? We’ve all been frustrated by those instructions that come with assembly-required furniture—it always feels like they’re leaving important stuff out. Maybe we need to work harder to avoid writing instructions like those in our professional lives.
  4. The protocol should be read (and even contributed to) by everyone who will eventually use it. This is my personal feeling, anyway. If I didn’t have my undergraduate students write the protocol they would be using (which I did for several analyses), I at least made them edit it after walking them through the process a few times. It’s surprising how many times this procedure revealed gaps in the protocol that would have prevented anyone not intimately familiar with the process from repeating it.
  5. The protocol should include mention of any readily anticipatable red flags/hurdles/challenges/etc. I feel like many protocols I’ve seen only describe what things should look like if everything is going right. That doesn’t seem like enough to me. I think a protocol is better when it also gives me a sense of what to look out for, so that I can tell when the process is not working. Like I said, I think a protocol is at its best if it could be followed to a T without me there to fill in the gaps. Not having troubleshooting information and warning signs in the protocol means one of three things will probably happen every time things don’t go exactly as planned: (1) the person doing the research will have to come to you to ask you what they should do; (2) the person doing the research will ignore the issue; or (3) the person will guess how to resolve the issue. None of those outcomes is good, in my opinion. On that note…
  6. The protocol should explicitly state that any unusual outcomes and occurrences should be recorded in the data sheet/log book. In my experience, people (especially undergrads!) need to be reminded that mistakes happen and that they can and usually will be forgiven. After all, mistakes don’t ruin projects; mistakes that go unreported ruin projects! I would always tell my students that a problematic data point could maybe be thrown out if I knew something strange had happened during the process that generated it. However, if I don’t know of any such event, I can’t remove that point. In this way, not reporting an issue causes more problems for me than reporting it would have. I think (and hope) that framing the issue in this light may make reporting issues less scary.
  7. Any part of the protocol that could introduce bias should be discussed in excruciating detail. The best example I can point to here is when a protocol asks for something to be chosen “at random.” I’d like to point out that humans are bad at “random”! Several times, my protocols would ask the researcher to choose a “random” blueberry out of a bag. Some ways of choosing a blueberry easily favor larger blueberries. Other ways favor smaller ones. Either is a great way to introduce bias (if every one of my students only pulls out big berries, which will have different chemistries than smaller ones) or at least noise (if one student favors big berries and another favors small ones). So, I mandated a single, specific way to draw blueberries out of a bag “at random” in all of the protocols for my doctoral research. Call that petty if you must!
  8. The protocol should describe how to calibrate machines, prep reagents, analyze room conditions, dispose of wastes, clean up after the procedure is done, and so forth. In other words, describing the “meat” of the analysis isn’t enough; I want the reader to also know how to “set up” and “take down” the analysis properly.
  9. The Wikipedia page I linked to above made me realize there is one thing missing from my past protocols (but which will always be included from now on): a description of the analytical approach for assessing the importance/meaning of the data gathered. It’s not much help if someone can replicate your data but can’t replicate what you did with it once you got it, is it?
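
To make point 7 a little more concrete, here is a minimal sketch of one way to take human judgment out of an “at random” choice. It is only an illustration, not the blueberry procedure referred to in that point (which isn’t spelled out in this post). It assumes the items can be laid out and numbered before the draw, and the function name `pick_sample_index`, the count of 30, and the seed value are all hypothetical.

```python
import random


def pick_sample_index(n_items, seed=None):
    """Return a 1-based position chosen uniformly from n_items numbered items."""
    if n_items < 1:
        raise ValueError("need at least one item to choose from")
    # A dedicated, optionally seeded generator so the draw can be reproduced later.
    rng = random.Random(seed)
    # randint includes both endpoints, so every position 1..n_items is equally likely.
    return rng.randint(1, n_items)


# Hypothetical usage: lay 30 berries out in a numbered row, then ask which one to take.
position = pick_sample_index(30, seed=20240101)
print(f"Take berry number {position}")
```

Because the person following the protocol never decides which item to pick, size preferences have less room to creep in, and writing the seed into the data sheet would let someone else check exactly which item was drawn.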

I’m sure this still isn’t even close to an exhaustive guide to crafting a perfect protocol, but perhaps it’s a start. I’ll keep reflecting on this and update this post as I think of new elements to add. If you have thought of some I’ve missed, please don’t hesitate to share. I’m attaching one of my own protocols here; feel free to critique it in light of what’s been discussed above.