Measuring the impact of science research

Few activities in human history have had more impact on our lives than science. And yet, the impact of new scientific research is constantly being measured, often in ways that seem to cast doubt on whether this research is even worth the expense. Why?

The answer has more to do with the performance of research than the idea of research. Research (not just in the natural sciences but in the social sciences and humanities as well) is massively more expensive than a generation ago—up at least 10-fold in constant dollar terms over the last 50 years—and the subject matter of research is also increasingly complex and siloed. Keeping track of who is spending what and for what reason is hard work. Doing so in a way that is actually helpful and meaningful—at least for researchers—is also difficult. What we find more often than not in today’s world of research impact assessment is that impact measurements aren’t really measuring research at all but factors that maybe, at best, are tangentially related to research. These impact measurements also often create impacts of their own.

Consider the case of research publishers. Different stakeholders in the research ecosystem focus on different dimensions of impact and use different metrics. Research publishers (mostly commercial scholarly journal publishers) generate volumes of statistics on the millions of articles published every year in order to better understand industry trends and customer needs, develop new titles as needed, and adjust pricing. Their best-known tool for assessing and labeling “impact” is the Journal Impact Factor (JIF), which measures the citation performance of journals over a two-year period—roughly stated, how many other researchers are reading this work and citing it in their own work. Publishers aren’t simply content to just measure impact using this tool, however. They also actively promote the impact factors of their journals because they can make more money from journals with the highest JIFs. By making certain journals appear more prestigious and impactful, publishers can levy higher subscription fees and author publishing charges for these products.

Researchers are attracted to publisher claims about impact like moths to a flame because, as survey after survey has shown over the years, they care more about making an impact than just about anything. They want their work to be useful and to make a difference. Researchers also want visible credit for their work so they can advance their careers in terms of retention, promotion, tenure and grant funding. Publishers create an illusion of prestige and impact, and researchers believe it—universities, funders, and governments as well, who all value prestige publishing more than other kinds of publishing because they believe it’s a proxy for higher impact research work. Since journals are the most important way that researchers share information—especially university-based researchers, who author around 80% of the articles published in research journals—they are particularly vulnerable to prestige baiting and particularly reliant on using impact factors to judge the impact of research.

This isn’t the only impact measure researchers care about, of course. Researchers are also judged (by their colleagues, as well as by publishers, universities, and funders) by how many articles they author or coauthor—fueling the well-known “publish or perish” syndrome—and by the dollar value of the research grants they receive, number of patents awarded, books published, dissertations directed, the reputation of their institutions, opinions of colleagues, notoriety, tenure status, position, awards, and so on. Academics have plenty of status symbols, and like the rest of society, there’s a certain presumption that collecting enough of these symbols means you’re important. There’s also a presumption (often an inaccurate one) that once you are so labeled, the work you do must be meaningful and impactful. Like the rest of society, it can take time—barring a remarkable discovery—for an early career researcher to gather enough symbols for their work be considered impactful enough to merit tenure or large grant awards.

Publishers employ other impact assessment techniques as well, such as desk evaluations by editors to determine whether a research paper will be of interest to their readers, whether it should be distributed for peer review, and whether it is worth the additional investment of time and money to transform into a published article. Most papers get rejected at least once during this process, but most eventually get published in some journal somewhere (there are tens of thousands—no one is sure exactly how many, in fact). Meanwhile, even though prestige journals reject around 90% of all submissions, and publish less than 1% of the world’s research, the attention the articles in these journals receive in research and media circles exerts an outsize influence on public perceptions and researcher incentives about which research is and is not “impactful” and worth funding.

For top-tier research universities (known as R1 universities), the tens of billions of dollars in research funding awarded annually by government agencies like the National Institutes of Health and the National Science Foundation—particularly grants in the life sciences sector—has an enormous impact on school employment, reputation, and budgets, accounting for over half of total R&D spending by US universities in recent years. In addition, school rankings like those published by US News & World Report are calculated in part using research-related inputs and outputs—the number of researchers employed, amount of research funding received, number of journal articles published, and so on. These rankings create additional impacts on school reputations, which can translate into larger endowments, higher enrollment demand, and better recognition, retention, and funding for researchers.

Nonprofit funders also play a role in impact evaluation. Most research funding comes from industry and government, but nonprofits like the Gates Foundation, Wellcome Trust, and Max Planck Institute wield a lot of influence when it comes to measuring impact. Funders of all stripes want to know if the money they spend on research makes a difference, but nonprofits have been global leaders in the push for open access and other mechanisms to help improve the impact of research through more robust transparency and reliability, and also translating research into societal impact. Today, most of the research funding awarded by nonprofits comes with the string attached that published work and data must be made freely and immediately available to the public in order to improve the accessibility of this work to other researchers and the general public. More and more governments in recent years have been following this lead and instituting similar requirements.

Businesses funders take a different approach. Most of the world’s applied and experimental research (which dwarfs the basic research conducted primarily by universities) is funded and conducted by the business sector. In addition, most patents are awarded to business-based researchers and their companies. So rather than guessing which research is likely to have the most impact, or worrying about impact factors and citations, or making research free to read (it isn’t; industry research is often completely secret and not published in journals at all), businesses are ultimately concerned with whether research translates into patents or profits. This approach might seem incongruous with a university’s approach, but it is actually a continuum. The basic research done by universities fuels the experimental and applied research done by businesses. All R1 universities have technology transfer centers that try to push research out of their settings and into the business sector via patents, licensing agreements, and spin-offs. University researchers also work closely with industry when it comes to publishing, and often make a leap to industry so they can focus more intently on developing their high-impact ideas to fruition (especially in high demand areas like artificial intelligence).

Government impact evaluation systems are a relatively new development in the history of science, starting in the post-World War II period as government spending ballooned and spending oversight ballooned along with this growth. But the complex government spending oversight mechanisms and rituals we see today really only took hold globally in a big way starting in the 1980s. As these evaluation needs evolved, they sprouted metrics for trying to measure impact and ensure their increasingly large budgets remained accountable to taxpayers. For example, all major U.S. government research funding agencies today use expert review panels to assess why the research being proposed matters, how the research is new, what the return on investment will be, and much more. These agencies must also suffer the sometimes witheringly skeptical oversight of Congressional budget appropriators. To weigh the impact of completed research, governments also collect mountains of economic statistics on the R&D sector, which get fed back into decisions about which areas of research provide a better return on investment for society. Like universities, government funders are also closely tied to universities because university researchers conduct most of the world’s basic research, and most of the funding for this work comes from governments.

These evaluation systems in the US and elsewhere around the world share three main traits in common. First, they aren’t necessarily objective, as hinted above. These processes are guided by, and sometimes even hijacked by, politicians who approve research budgets and manage oversight. In the 1970s and 80s, US Senator William Proxmire regularly scuttled research by handing out “Golden Fleece” awards for work that he thought lacked adequate public benefit. More recently, the Trump administration created a policy (later reversed) that would have made it harder for the Environmental Protection Agency to use science that—in the estimation of the agency’s politically appointed leadership—was too “secretive” to be trusted.

Second, most research funding and publishing happens in STEM (science, technology, engineering and math), so most evaluation systems and metrics are STEM-focused. These systems are often a bad fit for social science and humanities researchers, who might struggle to quantify in this evaluation framework why it’s important to study Civil War documents, or learn more about by why voter turnout is low (fields like political science routinely come under attack from government funders).

The third general truth is that we can only guess at the impact research will actually have, except for tallying up the outcomes that are quantifiable, like how much money is spent or how many people are directly employed. We especially have no idea how to gauge the impact of research across time—whether there might be a causal relationship between this work and future discovery or invention, and if so, how much. Einstein’s papers were not cited during the early years of his career, and he was widely mocked as being a misguided fool. If experts at the time in research, publishing, funding and government had assigned Einstein’s work an impact score, it would have been zero.

Still, impact evaluation is here to stay, for better and for worse. When billions of dollars are allocated from public funds every year, we obviously need systems of accountability to ensure this money is spent wisely. At the moment, though, researchers generally aren’t sold on whether the impact evaluation systems we have in place are fit for purpose. Grant funding is increasingly scarce and competitive, impact metrics aren’t actually measuring impact, and options might exist that haven’t been widely tested yet, like simply awarding grants through some kind of lottery system. In the meantime, many reform efforts are underway, often grounded in a philosophy that “open” and accountable science practices are the best way to protect the integrity of science and accelerate the progress of discovery.

Probably the most damning arguments against our current approach to impact evaluation fall into four categories:

Publishing distortion. The widespread use of the journal impact factor, combined with the professional, funding, and academic pressure authors feel to publish in expensive high impact journals, has created a situation where considerations of prestige drive publishing decisions to an unhealthy degree. The unintended consequences of this dynamic are almost too numerous to count. For one, the Matthew Effect is worsened—the rich get richer and the poor get poorer. Only wealthy authors (almost exclusively authors from well-funded US or EU institutions) can afford to publish in these journals, and since these are the journals that jack up reputation and funding, getting published in these journals is like winning the lotto, except at the expense of your colleagues who may also have meritorious research that deserves funding and attention. In addition, overhyping the impact factor results in fewer “boring” papers getting published (like needed replication studies), and fewer papers getting published in free-to-read open access format (because most high impact journals outside of a few prestige journals are still subscription based). This focus has also fueled the rise of an entire industry of fake journals that promote their faked high impacts, researchers who are paid cash rewards by their governments to publish in high impact journals (leading to more fake research), and increased forgery, plagiarism, and “citation rings” to boost citation counts. Impact factors aren’t the only culprit here. The focus on publishing volume has led to a huge increase in “salami-slicing” and “fractional-publishing”—respectively, where a researcher publishes their results across multiple papers instead of a single paper in order to pad their publishing resume, and where papers will sometimes include thousands of authors, again, in order to pad resumes. In short, the entire system of research publishing has been distorted by how we use publishing acumen as a proxy for research impact.
Bias. Impact evaluation bias has been documented by numerous studies over the years, including underfunding researchers from less prestigious institutions, cultural and gender bias, and favoring some fields over others. Indeed, the biases of a few individuals can end up affecting not only individual research careers but the direction of entire fields of research.
Wealth gap distortion. Most R&D is conducted in major industrialized countries, so naturally, most R&D deals with issues that are important to these countries—which crops to develop, which diseases to eradicate, and which social issues to address. For the rest of the world, meaning most of the world’s countries and people, researchers publish a more limited amount, mostly in regional and specialty journals that are more affordable and accessible, often in local languages, and less in the highest-profile, highest-impact journals of the world. As a result, the prestige journals that are so highly prized by researchers and valued by governments, funders and the media, serve the needs and perspectives of the world’s wealthiest countries to a greater extent than may be healthy for research. Meanwhile, less visible (but no less important) research and research journals serve the rest of the world.
Business influence. Motivated by the need to produce business-friendly impacts, and at the same escaping accountability to the public, business research is the “dark matter” of the research universe. The US Department of Defense depends on a vast, well-funded network of companies throughout the country to build the latest fighter jets and munitions; drug companies depend on collaboration networks with university researchers around the world (often co-funding their drug trials). Business money is essential to applied and experimental research, and it far exceeds government spending totals. At the same time, business-financed research focuses on ideas that make money (like high-priced popular medications), not on curing tropical diseases where the financial return on investment might be nil. At the same time, business research has no obligation (other than a moral one) to objectively assess the impacts of their products on public health. For decades, “research” funded by big tobacco insisted there was no link between smoking and lung cancer, and research funded by big oil insisted that global warming was a hoax. Even when seemingly benevolent, there is also no public direction or accountability behind where private money is spent. Private foundations like Gates, Walton, and Broad have together spent billions of dollars in recent year to research and reform the US education system, but the government has no control over what kinds of reforms are implemented and what kinds of impacts come about as a result.

What about science communication? Does it have a role to play in this debate? If science communication does become involved in the research impact evaluation world—and to-date it has not been involved—then our goal should be to improve not only research impact evaluation but research impact itself. Why? Because one will empower the other: As we do a better job of understanding and communicating the real impact of research as opposed to pretending that contrived metrics are reflecting this impact, research will have greater impact and at the same time will be able to separate itself from the negative effects of our current evaluation practices. The ultimate goal of research, after all, is to make an impact, not to be assigned a score. Somewhere along the way we’ve lost sight of this and today treat research funding more like an allowance than an investment. Finding the right balance between oversight and overreach is what’s needed now so science can be better protected from the negative feedback of our current impact evaluation systems, and in a real sense, be given more freedom to better serve society today and into the future.

Additional reading:

Adam, D. 2019. Science funders gamble on grant lotteries. Nature news vol. 575, no. 7784, pp. 574-575 https://go.nature.com/3svELIa. doi: 10.1038/d41586-019-03572-7
Archambault, E. 2018. Universalisation of OA scientific dissemination. www.slideshare.net/scielo/ 2-eric-archambault
Ari, MD, J Iskander, and J Araujo, et al. 2020. A science impact framework to measure impact beyond journal metrics. PLoS ONE vol. 15, no. 12, p. e0244407, doi: 10.1371/journal.pone.0244407
Baker, M. 2015. Over half of psychology studies fail reproducibility test. Nature, https://doi.org/10.1038/ nature.2015.18248
Baldwin, M. 2018. Scientific Autonomy, Public Accountability, and the Rise of “Peer Review” in the Cold War United States. Isis vol. 109, no. 3, pp. 538-558, doi: 10.1086/700070
Berg, J. 2020. Modeling Research Project Grant Success Rates from NIH Appropriation History: Extension to 2020. bioRxiv, doi: 10.1101/2020.11.25.398339
Bollen, J, H Van de Sompel, and A Hagberg, et al. 2009. A Principal Component Analysis of 39 Scientific Impact Measures. PLoS ONE vol. 4, no. 6, p. e6022, doi: 10.1371/journal.pone.0006022
Bornmann, L. 2012. Measuring the societal impact of research: research is less and less assessed on scientific impact alone – we should aim to quantify the increasingly important contributions of science to society. EMBO Rep, vol. 13, no. 8, pp. 673-676, doi:10.1038/embor.2012.99
Dolgin, E. 2017. Why we left academia: Corporate scientists reveal their motives. Nature Careers, http:// go.nature.com/3kqt6Y6
Etzioni, O. 2019. AI academy under siege. Inside Higher Education Blog Post, http://bit.ly/3q3gS91
Fabbri, A, A Lai, and Q Grundy, et al. 2018. The Influence of Industry Sponsorship on the Research Agenda: A Scoping Review. American Journal of Public Health, vol. 108, no. 11, pp. e9-e16, doi:10.2105/ AJPH.2018.304677
Falk-Krzesinski, HJ, and SC Tobin. 2015. How Do I Review Thee? Let Me Count the Ways: A Comparison of Research Grant Proposal Review Criteria Across US Federal Funding Agencies. The Journal of Research Administration, vol. 46, no. 2, pp. 79-94
Fang, FC, and A Casadevall. 2016. Research Funding: The Case for a Modified Lottery, mBio editorial vol. 7, no. 2, p. e00422-16, doi: 10.1128/mBio.00422-16
Gordon, M, D Viganola, and M Bishop, et al. 2020. Are replication rates the same across academic fields? Community forecasts from the DARPA SCORE programme. Royal Society Open Science, vol. 7, no. 7, p. 200566, doi: 10.1098/rsos.200566
Gugerty, MK, and D Karlan. 2018. Ten reasons not to measure impact – and what to do instead. Stanford Social Innovation Review, https://ssir.org/articles/entry/ten_reasons_not_to_measure_impact_ and_what_to_do_instead#
Hampson, G, M DeSart, and L Kamerlin, et al. 2021. OSI Policy Perspective 4: Open Solutions: Unifying the meaning of open and designing a new global open solutions policy framework. Open Scholarship Initiative, January 2021 edition, doi: 10.13021/osi2020.2930
Holbrook, JB, and R Frodeman. 2011. Peer review and the ex ante assessment of societal impacts.. Research Evaluation vol. 20, no. 3, pp. 239-246, doi: 10.3152/095820211X12941371876788
Larivière, V, B Macaluso, and P Mongeon, et al. 2018. Vanishing industries and the rising monopoly of universities in published research. PLoS ONE vol. 13, no. 8, p. e0202120, doi: 10.1371/journal. pone.0202120
Lok, C. 2010. Science funding: Science for the masses’, Nature vol. 465, pp. 416-418, doi:10.1038/465416a
Mallapaty, S. 2020. China bans cash rewards for publishing papers’, Nature, vol. 579, p. 18 https://go.nature.com/3qPbB66
National Institutes of Health (NIH). 2013. Additional scoring guidance for research applications’, https://bit.ly/3uJpsO3
National Research Council (NRC). 2014. Furthering America’s Research Enterprise, Washington, DC: The National Academies Press, doi: 10.17226/18804.
National Science Foundation (NSF). 2013. Chapter II – proposal preparation instructions’, http://bit. ly/2O0C2Yx
2018. Comment on proposed rule, “Strengthening Transparency in Regulatory Science.” EPA- HQ-OA-2018-0259. Open Scholarship Initiative, https://bit.ly/3bHgN5O
2021. How do researchers decide where to publish? OSI infographic 3. Open Scholarship Initiative, http://bit.ly/37VVvk1
Packalen, M, and J Bhattacharya. 2020. NIH funding and the pursuit of edge science. Proceedings of the National Academy of Sciences May 2020, 201910160, vol. 117, no. 22, pp. 12011-12016, doi:10.1073/pnas. 1910160117
Plume, A, and D van Weijen. 2014. Publish or perish? The rise of the fractional author’, Research Trends vol. 38, no. 3, pp. 16-18
PricewaterhouseCoopers (PWC). 2020. Pharma 2020: Challenging business models. White Paper, http://pwc.to/38codwT
Priyadarshini, S. 2018. India targets universities in predatory-journal crackdown. Nature vol. 560, pp. 537-538, doi: 10.1038/d41586-018-06048-2
Ravenscroft, J, M Liakata, and A Clare, et al. 2017. Measuring scientific impact beyond academia: An assessment of existing impact metrics and proposed improvements. PLoS ONE vol. 12, no. 3, p. e0173152, doi: 10.1371/journal.pone.0173152
Reale, E, D Avramov, and K Canhial, et al. 2018. A review of literature on evaluating the scientific, social and political impact of social sciences and humanities research. Research Evaluation, vol. 27, no. 4, pp. 298-308.
Research Excellence Framework (REF). 202. www.ref.ac.uk
Sage Publishing. 2019. The latest thinking about metrics for research impact in the social sciences (White paper) Thousand Oaks, CA: Author. doi: 10.4135/wp190522
Schimanski, LA, and JP Alperin. 2018. The evaluation of scholarship in academic promotion and tenure processes: Past, present, and future. F1000Res vol. 7, p. 1605, doi:10.12688/f1000research.16493.1
Science & Engineering Indicators (SEI). 2020. US national science board. https://www.nsf.gov/statistics/seind/
Sinatra, R, D Wang, and P Deville, et al. 2016. Quantifying the evolution of individual scientific impact. Science, vol. 354, no. 6312, p. aaf5239, doi: 10.1126/science.aaf5239
Taylor & Francis. 2019. Taylor & Francis researcher survey. https://bit.ly/3koHgrX
2021. UNESCO institute for statistics (UIS) dataset. http://data.uis.unesco.org
Wahls, W. 2018. High cost of bias: Diminishing marginal returns on NIH grant funding to institutions. bioRxiv 367847, doi: 10.1101/367847
Wang, D, and AL Barabási. 2021. The science of science. Cambridge: Cambridge University Press.
Weisshaar, K. 2017. Publish and Perish? An Assessment of Gender Gaps in Promotion to Tenure in Academia. Social Forces vol. 96, no. 2, pp. 529-560, doi: 10.1093/sf/sox052
Wootton, D. 2015. The invention of science: A new history of the scientific revolution, New York: HarperCollins.
Wu, J. 2018. Why U.S. business R&D is not as strong as it appears: Information technology & innovation foundation’, White Paper, http://www2.itif.org/2018-us-business-rd.pdf

Glenn Hampson

Glenn is Executive Director of the Science Communication Institute and Program Director for SCI’s global Open Scholarship Initiative. You can reach him at [email protected].

PrevNext

Glenn Hampson

Recent Posts

Consent form