Editor’s note: The following is Part 2 of a two-part report on selected presentations from the 36th Annual Scientific Meeting of the American Pain Society (APS), held May 17-20, 2017, in Pittsburgh, Pennsylvania, US. Also see Part 1.
At the 2017 APS annual meeting, members of the APS Basic Science Shared Interest Group (SIG) gathered over dinner to discuss the touchy subject of rigor, bias, and transparency in scientific research and publishing. A distinguished panel of experts delivered presentations, followed by an interactive discussion with the audience.
Researchers have long understood that biases influence how they do experiments and interpret their findings. Although scientists have developed practices to minimize sources of error, scientific reproducibility and translation of basic science into clinical successes remain low. While there were differences of opinion among attendees, most agreed that by increasing transparency—in reporting experimental techniques, analysis, and even raw data—they could improve scientific rigor and reproducibility.
Rigor, not glitter
The first presentation came from Shai Silberberg of the National Institute of Neurological Disorders and Stroke (NINDS) at the National Institutes of Health (NIH), Bethesda, US, who talked about the various sources of potential bias in research. The best way to reduce bias, Silberberg said, is to increase transparency, which happens when scientists reveal the details of how they perform experiments and analyze their data.
Two sources of experimental error, Silberberg said, boil down to human nature and chance. He suggested the audience consider the advice of nineteenth-century geologist Thomas Chamberlin, who urged scientists to test multiple hypotheses in parallel, because “if we test just one hypothesis, we tend to fall in love with it.” The choices scientists make determine whether they introduce bias, which is unintentional and unconscious. To combat bias, researchers must treat all experimental groups equally and be transparent.
To illustrate the need for transparency, Silberberg cited a meta-analysis of 1,117 papers on multiple sclerosis research: only 16 percent reported whether analysis had been conducted blinded, just 9 percent reported randomization of subjects or animals, and only two papers reported how the authors calculated sample size (Vesterinen et al., 2010). He mentioned another paper in which the authors called for basic science research on stroke to meet a specific set of minimal standards, in an effort to improve translation of preclinical research into the clinic (Sena et al., 2007).
A lack of reporting on rigorous research methods, Silberberg said, points to a lack of practicing those methods, and that lack of rigor ultimately shows up as a lack of efficacy when therapeutic agents are developed. The lack of rigor, he added, is largely driven by the rush to publish, particularly in high-impact journals, which in turn is tied to securing grant money.
The requirement to publish “important” papers in order to earn grants also leads researchers to publish positive data the overwhelming majority of the time, in an estimated 98 percent of papers. As a result, negative results never see the light of day, and the research community does not benefit from those findings. The highest-impact journals also require “a really good story,” Silberberg said, which skews publications toward confirmatory rather than exploratory studies.
Silberberg quoted physicist Richard Feynman, who said that researchers should report any factor that might possibly confound or bias their results. “We should stop talking about glitter and focus on rigor,” he concluded.
Increasing transparency in grant proposals
The next presentation came from panelist and NIH grant reviewer Cheryl Stucky, Medical College of Wisconsin, Milwaukee, US. Stucky’s talk, “When the Rigor Hits the Road,” covered changes to NIH grant-writing policy aimed at increasing transparency as a means to achieving more rigorous science. She introduced the concept of scientific premise: “It’s not the significance, which describes the work’s impact on the field,” she said. Rather, the premise is the foundation of published or unpublished data upon which a grant application is built. The rigor and quality of those studies are key to the strength of the new work, and researchers should be able to discuss the strengths and weaknesses of those previous investigations. Grant applications, Stucky said, should clearly state the scientific premise and how the current hypothesis grew out of it.
Stucky recommended that applicants include a grant section titled “Scientific Rigor” that clearly spells out how the researchers will avoid bias and adhere to rigorous research practices, for instance, by describing the rationale they will use to determine sample sizes, by providing descriptions of control groups and how blinding will be used, and so forth. Grant applications should also describe how biological variables are to be handled, including sex, age, health, genetic strain, and other factors that can affect the results.
The sex of animals is a key biological variable; NIH guidelines now require that animals of both sexes be used in preclinical studies unless there is strong justification for using only one sex, such as when studying conditions that occur only in males or only in females (see PRF related story covering the 2015 APS Basic Science SIG dinner discussion of the issue). Details of how animals of both sexes will be studied and analyzed should be spelled out in grant applications, she said. In grant renewal applications, an NIH-required section on authentication gives researchers the opportunity to address whether any resources or reagents actually used in the experiments differed from what was proposed in the original grant. Finally, the NIH now expects progress reports to address issues of scientific rigor.
What are journals doing?
Next, Journal of Pain editor in chief Mark Jensen, University of Washington, Seattle, US, spoke about the steps journals are taking to increase transparency and rigor in publications. In a recent meeting, leaders in journal publishing produced eight basic standards in this regard. Individual journal editors can decide to comply with the standards at one of three levels, ranging from “suggestions” to authors for increased rigor and transparency to hard requirements that journals may check. Jensen predicted that the highest-impact journals will demand the highest standards. For example, the new publishing standards include requests (or requirements, depending on the level a journal adopts) to make available the data generated in a study and the sources of all materials and resources, and journals will increasingly encourage researchers to publish negative results, Jensen said.
One new idea that journals are considering is to ask researchers to “pre-register,” or publish their paper at least in part before the data have been analyzed or even generated. That would address the tendency to make the data fit a preconceived story rather than telling the story revealed by the data, Jensen said. An existing standard requires that clinical trials be registered online, which might in the future extend to animal studies.
An animated discussion
Following the presentations, audience members engaged in a lively discussion of the issues.
When a mess is a good thing
- Frank Rice, Integrated Tissue Dynamics, Rensselaer, New York, encouraged his colleagues to “embrace the mess” of variability in research. He pointed out that the tendency of researchers to throw out data from “outliers” reduces the rigor of their data, even though it might make the data look more consistent. When researchers adhere to more rigorous standards, he said, the data may appear less reproducible, but “when you see something, you feel much more secure that you've got something very strong.”
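Rice’s point can be made concrete with a small simulation (a hypothetical illustration added for this article, not something presented at the dinner). In the Python sketch below, two groups are drawn from the same distribution, so any “significant” difference is a false positive; culling points that fall far from each sample’s mean makes the data look cleaner but pushes the false positive rate above the nominal 5 percent:

```python
# Hypothetical illustration: post-hoc "outlier" removal on null data
# makes results look more consistent but inflates false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, trials, alpha = 12, 10_000, 0.05

def trim(x):
    # Drop points more than 1.5 standard deviations from the sample mean.
    return x[np.abs(x - x.mean()) <= 1.5 * x.std()]

raw_hits = trimmed_hits = 0
for _ in range(trials):
    a, b = rng.normal(0, 1, n), rng.normal(0, 1, n)  # no true effect
    raw_hits += stats.ttest_ind(a, b).pvalue < alpha
    trimmed_hits += stats.ttest_ind(trim(a), trim(b)).pvalue < alpha

print(f"false positive rate, raw data:        {raw_hits / trials:.3f}")
print(f"false positive rate, outliers culled: {trimmed_hits / trials:.3f}")
```

The culled data look tighter, but the error rate climbs well above the nominal 5 percent; the “mess” carries real information about variability.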
False positives, false negatives
- Jeffrey Mogil, McGill University, Montreal, Canada, pointed out that different stakeholders have different priorities, and that some standards have been forced upon researchers, for example, through pressure from pharmaceutical companies to reduce false positives in preclinical studies; these companies once performed basic science themselves but now rely largely on academic preclinical work. The tradeoff for reducing false positives has been more false negatives, Mogil said, which may lead researchers to miss findings that the world will then never know about. Silberberg disagreed, saying that exploratory studies are important for discovering those new insights and differ from the confirmatory studies that are key to identifying a promising drug target. Type I (false positive) and type II (false negative) errors by definition cannot both arise in the same experiment, because one is possible only when an effect is absent and the other only when an effect is present; if researchers prefer large effects and use appropriately large sample sizes, both errors will be minimized, he said.
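The tradeoff Mogil described can be seen numerically. In this hypothetical Python sketch (again, an illustration added for this article, assuming a simple two-group comparison), tightening the significance threshold at a fixed sample size cuts false positives but raises the false negative rate for a true effect:

```python
# Hypothetical sketch: at fixed sample size, a stricter significance
# threshold trades false positives for false negatives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, effect, trials = 10, 1.0, 10_000  # effect in standard-deviation units

for alpha in (0.05, 0.01, 0.001):
    false_pos = false_neg = 0
    for _ in range(trials):
        # Null comparison: both groups from the same distribution.
        a, b = rng.normal(0, 1, n), rng.normal(0, 1, n)
        false_pos += stats.ttest_ind(a, b).pvalue < alpha
        # Real effect: the second group is shifted by `effect`.
        a, b = rng.normal(0, 1, n), rng.normal(effect, 1, n)
        false_neg += stats.ttest_ind(a, b).pvalue >= alpha
    print(f"alpha={alpha}: false positive rate {false_pos / trials:.3f}, "
          f"false negative rate {false_neg / trials:.3f}")
```

Larger true effects and larger samples shrink both error rates at once, which is the escape Silberberg pointed to.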
The fate of low-rigor studies?
- Benedict Kolber, Duquesne University, Pittsburgh, US, posed the question, What should happen to low-rigor studies? Aside from the decision by an editor or reviewer about whether to publish them in a given journal, is there a place they can be accessed? Jensen said that there might be such a place at the Journal of Pain. Kolber also suggested that the Journal of Pain require that abstracts contain a statement about blinding and randomization. Silberberg again pointed out the difference between exploratory and confirmatory studies—blinding or randomization might not always be required in an exploratory study, for example—but researchers should simply be more transparent and report exactly what procedures they follow.
The possible versus the actual
- Michael Gold, University of Pittsburgh, US, said that truly rigorous research requires far more funding than NIH grants can provide, so he sees a tension between what is feasible for generating novel ideas and what would be the most rigorous science. “There is this disconnect that simply is not bridgeable,” he said. As for the sex issue, the original NIH recommendation instructed researchers to “consider” the sex of animals and indicated that the number of animals required would not be affected. In reality, he said, he has submitted three proposals, all requiring that he double his sample size. He fears that the standards and requirements being discussed “are giving reviewers another club to beat whichever grants they don’t want to move forward with, and this issue needs more consideration before [any NIH recommendation] becomes policy,” a comment that drew wide applause. Silberberg responded that the NIH works by peer review, which means that researchers, not the government, are setting the rules.
The impact factor
- Science journalist (and author of this piece) Stephani Sutherland offered the perspective that although scientists may be able to read a paper and determine whether it meets their standards for rigor, journalists and members of the public assume that prestigious peer-reviewed journals contain rigorous science. I also made the point that high-impact journals sometimes publish subpar papers, and that excellent papers often appear in publications not considered “top” journals. My advice to young researchers was to do good science and worry less about the impact factor, to which Mogil replied, “That all works perfectly well until you don’t get tenure,” again raising the problem of the emphasis on publishing to win grant money and faculty positions.
Collective wisdom or rigorous analysis?
- Mogil then raised the issue that researchers often make certain assumptions, such as what animal sample size is appropriate in pain studies, based not on rigorous statistical analysis but on some vague consensus. In an unpublished review of the literature, Mogil said, he found that “the average number of rats or mice used is nine. Where did that number come from? Nobody did a power analysis.” The number, he said, simply emerged over time from researchers’ collective wisdom.
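For readers unfamiliar with the term, a power analysis works backward from a desired sensitivity to a required sample size. A minimal sketch using Python’s statsmodels library (the parameters here are illustrative choices for this article, not taken from Mogil’s review):

```python
# Minimal power-analysis sketch for a two-sided, two-sample t-test
# (illustrative parameters, not from Mogil's literature review).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Animals per group needed to detect a large effect (Cohen's d = 1.0)
# with 80% power at alpha = 0.05: roughly 17.
n_per_group = analysis.solve_power(effect_size=1.0, alpha=0.05, power=0.8)
print(f"animals per group for d = 1.0: {n_per_group:.1f}")

# Conversely, with nine animals per group, only quite large effects
# (Cohen's d around 1.4) are detectable at 80% power.
d = analysis.solve_power(nobs1=9, alpha=0.05, power=0.8)
print(f"detectable effect size with n = 9: {d:.2f}")
```

On these assumed numbers, nine animals per group buys adequate power only for very large effects.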
Speed or accuracy?
- Mogil also said there is a tradeoff in research. “You can have accurate, or you can have cheap, or you can have fast, but not all at once.” More rigorous science requires more resources. He went further, asking the basic question, “Why is the public better served by higher accuracy and lower volume rather than lower accuracy and higher volume?” Stucky answered that a big question for taxpayers is, What are the ultimate benefits of this preclinical research in translating to the clinic? Mogil argued that overall, science works, that mistakes self-correct over time, and that the system is likely working at maximal efficiency. Silberberg disputed this point. “I would argue that if it takes 25 years to self-correct, it is not good.” Poorly done studies have human consequences, and if everything were working so well, there would be more successful clinical outcomes, he said. “If things aren’t done carefully, we could be chasing our tails.”
Too many regulations?
- Michael Morgan, Washington State University, Vancouver, US, voiced his dissent over adding more requirements. Over his 20 years at WSU, he said, regulations have steadily increased, and multiple professional trainings now take up significant time. Silberberg agreed that there are too many regulations but said that no one would argue greater transparency is bad. He was trained, he said, to report in the Methods section of a paper everything he did in an experiment, because the paper is a surviving document that tells how the work was done.
Getting scooped
- Stucky wondered if the rush to publish is compounded by researchers’ fear of being scooped by other labs and the pressure to publish first in a high-impact journal.
Pre-registration
- Gold said he largely agreed with many of the points made during the discussion, and that “the attempts by journals to address them are reasonable, but they don’t come close to addressing the myriad factors affecting outcomes” of studies. Further, he said, regulations requiring advance notice before performing experiments could significantly hamper research, and some already are. As for pre-registering papers for publication, would that process be fluid enough to allow for the constant adjustments and fluctuations in research plans? Silberberg agreed that preclinical science “advances in a more nimble way” than clinical trials, which are sometimes planned out years in advance. Michael Iadarola, NIH, Bethesda, US, agreed with Gold that researchers have to maintain fluidity. “Science thrives where there’s freedom,” he said, “and a lot of this stuff seems anti-freedom, and [is] forcing us into a box that’s pre-registered and pre-sized, and it doesn't work that way.”
Explore and confirm
- Stucky asked, If one conducts an exploratory pilot study and later a confirmatory study, would the researcher publish both—and together or in separate publications? Mogil answered that he and Malcolm Macleod recently published a comment, “No Publication Without Confirmation,” in Nature that contains a rubric for that question (Mogil and Macleod, 2017). “The idea is that it’s a compromise—everyone gets what they want.” In the exploratory phase, researchers do whatever they want and don’t report statistics—they just get to the point where they believe a confirmatory study is in order. The catch is that another lab or consortium does the confirmatory study with a high degree of rigor. Then, all the studies are published together, Mogil said.
Replication: how high?
- Steve Davidson, University of Cincinnati, US, said he has heard that the findings of only 20 percent of papers can be replicated, and asked how high the goal should be. Silberberg said he does not have the answer, but that the rate surely must rise from where it is now.
Whom does it hurt?
- Several audience members concurred that new regulations will be particularly hard on new investigators.
Taking control
- Ted Price, University of Texas, Dallas, US, said that researchers could easily be more transparent by publishing more details about their methods, and particularly their raw data. “That’s the most transparent you can be.” Journals controlled by researchers also impose unnecessary word limits that inhibit them from disclosing details of their work, Price said. He closed the discussion: “We should all take a look in the mirror and try to fix the problem ourselves before people who are not scientists come and try to place more restrictions on us.”
Stephani Sutherland, PhD, is a neuroscientist, yogi, and freelance writer in Southern California.