AI and scientific productivity: Considering policy and governance challenges

K. Flanagan
The University of Manchester
United Kingdom
B. Ribeiro
Université Côte D’Azur
P. Ferri
The University of Manchester
United Kingdom

Increased application of artificial intelligence (AI) is often touted as the solution to the problem of scientific productivity. This essay explores the science policy and governance implications of AI within the broader debate about scientific productivity. It reviews lessons from previous waves of automation in science and their impact on the practice of science. Since the public sector science base is also the environment in which advanced skills in science and technology are developed, the paper considers possible implications of AI use on scientific human capital. It then examines a range of policy and governance implications, including how AI tools might be used in funding and governance practices.

Scientific productivity is not necessarily related to research because not all science is research. The Frascati Manual (OECD, 2015) defines research as “creative and systematic work undertaken in order to increase the stock of knowledge – including knowledge of humankind, culture and society – and to devise new applications of available knowledge.” Many scientific workers, as defined in the Canberra Manual (OECD, 1995), work in monitoring and testing roles, whether in the private or public sector. In addition, most research is not investigator-driven science (or “basic research” in the Frascati definition). Finally, most research is carried out in the private sector (almost three-quarters of all research and development for OECD countries). Distinguishing between public and private research, and between applied and basic research, is necessary since they have different underpinnings, dynamics and motivations.

Assuming that routine scientific work, such as testing water, air or food quality, is not relevant to the debate, research productivity could be conceived in a number of ways: the efficiency with which scientists generate outputs (and the most appropriate outputs to consider e.g. publications and research grants); the rate of important and possibly high-impact discoveries; or the generation of successful innovations, or perhaps only radical or transformational ones.

Recent debates about scientific productivity seem to switch back and forth between these different understandings of “science” and of “productivity”, and between an interest in the quantity vs. quality of outputs or impacts. This is not one but several distinct debates (EC, 2022). The relationships between investigator-driven basic science and industrial (i.e. corporate) innovation are non-linear and indirect (e.g. Salter and Martin, 2001). Consequently, a change in the quality or quantity of basic science will not necessarily drive a change in the quality or quantity of industrial innovation. In a recent expert survey, only 15% of respondents felt that research productivity had declined in the past decade, while more than half felt it had increased (EC, 2022). This lack of consensus illustrates the subjective nature of research productivity as a concept.

A recurring framing in debates about scientific productivity is that science is a process in which inputs are turned into outputs, i.e. a production process. A different way to make sense of scientific productivity is to consider the policy goals behind publicly funded science, and the kind(s) of outputs and outcomes that governments are looking to realise by funding research.

The conventional wisdom that science policy is about funding research to generate knowledge that (hopefully) has positive societal and economic impacts retains a powerful hold in academic and policy circles. It is a key component of rhetoric around the public value of science and innovation (Ribeiro and Shapira, 2020). However, research funding also plays an important role in developing and maintaining the scientific workforce, as can be observed in the past century of science policy in Western countries.

The supply of “advanced scientific human capital” was a prime concern behind the emergence of science policy in the 20th century. In the United Kingdom, for example, increasing the supply of skilled researchers in the national interest was a key aim behind the 1916 creation of the Department of Scientific and Industrial Research, the precursor of UK research councils (Clarke, 2019). In the United States, the issue found expression through President Roosevelt’s letter tasking Vannevar Bush with his 1945 report Science: The Endless Frontier (Zachary, 1997; Dennis, 2006).

This recurring focus on the human and institutional capacity to do research, and the economic and social roles played by this capacity, is reflected in the more modern conception of the “science base”. One could argue that governments fund investigator-driven science to build and sustain the research ecosystem as a key social and economic resource. If so, then the impacts of AI should be considered not just on scientific output but on the science base more broadly.

To map an agenda around these broader impacts on the science base, this paper focuses on everyday scientific labour (as opposed to an idealised, heroic view focused on moments of discovery; or proxies to assess scientific work, such as publications), on careers, and on governance issues in relation to publicly funded science. This will involve looking at previous waves of automation in science. It will consider how AI research tools might be funded in the public science base, and how that might impact on research processes and practices. It will also briefly reflect on the uses of AI in improving research governance and integrity.

The history and sociology of science show clearly that scientific research is not a homogeneous, unchanging activity. Pickstone (2000) gives the example of how embryology moved from description through analysis to experimentation. At a more micro level, specific practices also change over time. The early embryologists observed with very different instruments and techniques from those of contemporary scientists.

Practices around the recording, communication and analysis of data have also varied from discipline to discipline and over time, as have standards of evidence. Such changes are often entangled with the adoption of new technologies. High-throughput automated sequencing technologies developed during and after the Human Genome Project created a demand for new scientific skills. They even created new disciplines such as bioinformatics (Bartlett, Lewis and Williams, 2016).

The “data deluge” in high energy physics and astronomy, but also in biomedical research, has led those scientific communities to develop new practices for managing, sharing and analysing data. In extreme cases, new practices have also emerged for verifying observations. These include, for example, multiple detectors based on different concepts and managed by different multinational teams at the Large Hadron Collider; see Junk and Lyons (2020) for a detailed discussion of replication efforts in particle physics. Funding and governance processes must often adapt to the adoption of new scientific tools. This occurred, for example, with the introduction of the genetically modified ‘knockout mouse’ in the biomedical sciences (Flanagan et al., 2003; Valli et al., 2007).

As with any other profession, scientific research careers and everyday work practices are affected by labour market dynamics, and workplace and organisational cultures. They are also affected by disciplinary cultures and national funding and evaluation practices. The content of scientific work is varied. Most researchers in the public sector science base are not full-time researchers but also play roles as teachers or managers. Research itself involves an unusual mixture of routine, often mundane, work and exploratory creative work. This can occur in conditions of high uncertainty and, often, competitive pressure, as well as sometimes extensive collaboration and sharing. Practices of collaboration, co-ordination and sometimes competition will depend on and/or be affected by automated systems.

Few scholars have studied the impact of automation on the content of scientific work. However, many studies have examined the impacts of automation on the content of other kinds of work. Studies in other domains show that automation can have a wide range of impacts on everyday work routines and interactions, depending on context. Unexpected effects may even include reduced productivity, for instance, through information overload (Azad and King, 2008).

How the adoption of new AI tools will affect science is heavily entangled with the features of scientific work. Some labour-intensive, routine and mundane practices may be replaceable by automated tools, as others have been replaced in the past. Think of how statistical analysis by hand was replaced by computers. However, the adoption of new tools can also introduce a demand for new routine and mundane tasks, which have to be incorporated into the practice of science.

In their study of scientific labour in the field of synthetic biology, Ribeiro et al. (2023) show how automation and digitalisation can lead to the amplification and diversification of tasks. New protocols and methodologies demand new skills from researchers. They also require a good deal of translational labour for interdisciplinary collaborations (e.g. between computer science and biology).

A digitalisation paradox emerges. Robotics and advanced data analytics are aimed at simplifying scientific work by automating some repetitive tasks such as pipetting. Yet they also contribute to increasing the complexity of scientific work in terms of the number and diversity of tasks that cannot be automated (Ribeiro et al., 2023).

These tasks often involve mundane work with laboratory robots and with large volumes of data – from preparing and supervising robots to checking and standardising data. This is because automated and “intelligent” systems affect the number and types of hypotheses that can be tested and of scientific experiments that can be performed.

Importantly, the tasks created by adoption of new AI tools are likely to be the preserve of early career researchers lower down the scientific hierarchy. This is because the application of AI tools often involves labour-intensive, time-consuming activities. Data curation, cleaning and labelling, for example, are usually performed by these researchers.

Therefore, further automation and digitalisation of scientific work – from robotics to AI models focused on discovery – might pose employment-related risks to scientific workers occupying lower positions in the scientific hierarchy. Mundane work with data and robots has little value for promotion or job applications in scientific organisations, which value publication in prestigious journals. As argued by Ribeiro et al. (2023), the performance of scientists dealing with data and machines is intertwined with the performance of these technical systems. Should an experiment fail or be delayed due to equipment problems, the burden would mostly fall on those scientists.

Maintaining and enhancing the human and infrastructural capacity of the science base is a key, if often implicit, aim of research policy. In other words, the research environment is also the research training environment. Graduate students and post-docs are learning by doing and learning by observing. They learn not only lab and analytical skills and practices but – like apprentices – they also learn the assumptions and cultures of the communities they are embedded in. This research training experience is a key public good motivating the public funding of research.

Because automation will change the content of scientific work in the ways outlined above, it can affect the quantity and quality of those training opportunities in the research base. If fewer research post-docs are required, or if such roles become primarily focused on work with automated systems, this could limit exposure to a wider set of scientific practices.

When automating manual or cognitive practices, there is always a risk that understanding of, and skills relating to, key procedures may be lost. Mindell (2015) notes it is critical that pilots periodically practise flying in manual mode to understand what is required should the autopilot system fail. In the scientific context, as critical techniques and processes become “black-boxed” in this way, students, as well as early career and other researchers, may not get the opportunity to fully learn or understand them.

The earlier black-boxing of statistical analysis in software packages has been argued to have contributed to the misapplication of statistical tests. This could be a contributory factor in a crisis of reproducibility in research (Nickerson, 2000). Future generations of scientists may be able to accomplish their tasks yet be unable to perform an experiment without automated system support or to fully understand the outcomes produced by algorithms. Paradoxically, the adoption of new tools could leave scientists less well equipped to understand and critique their application.

As new techniques and technologies emerge in scientific research, they become fashionable, creating new demands for funding. A scientific community’s shared understanding of what constitutes “leading-edge” research can change. This happened in the biomedical sciences, for example, with the increasing popularity of new technologies like high-throughput sequencing.

Researchers may change trajectories, selecting topics and problems amenable to the application of a new tool, in order to remain at the leading edge, publish in the most prestigious outlets and attract funding. Survey evidence suggests that the cost of meeting the performance level demanded at the leading edge of scientific research tends to grow faster than the rate at which technological innovation lowers cost (Georghiou and Halfpenny, 1996). This inevitably creates pressures on funding. These pressures have the potential to strengthen or create new structural inequalities that discriminate against less well-resourced groups or researchers in lower-income countries. Helmy, Awad and Mosa (2016), for example, examine challenges faced by developing countries in establishing themselves in genomics research. Wagner (2008) provides a more general discussion of entry barriers to the leading edge of global science.

There are also questions about how future automation in the public research base will be funded. Some commentators argue that AI tools will transform the productivity of research at little or no cost. However, AI tools have to be embedded in wider systems of data collection, curation, storage and validation. In particular, automated systems involving both AI and robotics tools are unlikely to come cheaply.

It is helpful here to consider how other items of scientific research infrastructure are funded. Small items of equipment may be funded through competitive grants. However, bigger ticket items are more likely to be funded through capital spending streams.

The cost effects of the adoption of new tools may be difficult to predict. Adoption of proven models from libraries may involve no direct costs and may reduce the direct labour costs of doing research. However, in other cases, especially at the research frontier, both capital and labour costs may rise.

Major items of research equipment typically require complementary assets. This can include refurbished or purpose-built accommodation, skilled technical and user support staff, preparation and analysis facilities. It may also require additional items of generic, supporting equipment.

There is some evidence that competitive project-based grant funding systems struggle to fund mid-range and generic research equipment that may be used across many projects and grants. They may also lack the necessary ongoing technical support and maintenance to make that equipment productive (see Flanagan et al., 2003).

This struggle might provide a challenge to adoption of new automated tools involving AI and other forms of automation. At the very least, it could affect situations where funding is primarily competitive and research organisations lack their own private resources to complement competitively won grants. It may also be an issue for the introduction of novel, unproven AI tools. Thus, research policies need to consider not only how to fund new tools but also how to ensure support for complementary assets.

Current AI tools automate routine experimental, observational and classificational tasks in scientific research, typically within laboratories and offices in research organisations (Royal Society, 2018; Raghu and Schmidt, 2020). Future AI tools that aim to verify or even identify causal relationships might not necessarily be treated as “equipment” by funders. Instead, they could come to be considered like a “member” of the research team. In competitive funding systems, researchers are evaluated based on their track record of publications, an indicator of high performance. Research funders may need to consider how future AI tools that automate or partially automate the identification of causal relationships will be evaluated in the competition for funding.

There has already been some experimentation with the application of machine-learning tools to funding body processes. These include identification of appropriate peer reviewers for grant proposals (e.g. at the National Natural Science Foundation of China, see Cyranoski, 2019). Such tools hold the promise of speeding up the slow processes of matching reviewers with applications. They have also been lauded as a means of avoiding old boy networks or lobbying.

However, these uses of AI have also been criticised for their potential to introduce new biases into review processes. For example, they might select reviewers who have conflicts of interest or are not appropriately qualified to assess the proposal (Cyranoski, 2019).

There has also been much interest in tools to partially automate aspects of the funding or journal peer review process (Heaven, 2018; Checco et al., 2021). This has raised similar concerns about the unintended consequences of hidden biases within black-boxed processes.

A question raised less often concerns acceptance of the use of such tools by scientists themselves. Many researchers have resisted both the use of metrics and proposals to replace peer review of grant applications with funding lotteries. This gives a sense of the possible response to adoption of automated processes by funders (Wilsdon, 2021).

Potential applications of AI have also been touted in tackling fraud, plagiarism and poor practice in research. These aim to improve replicability and weed out poor quality or fraudulent findings from the scientific literature. At the same time, application of AI techniques may increase, rather than resolve, problems of reproducibility. Various issues related to data leakage, such as duplicates and sampling bias, have been found in research using machine-learning methods (Kapoor and Narayanan, 2022).

Scholars have emphasised the role of expectations in legitimating actions, justifying decisions, guiding activities and attracting the interest of governments, industries and research communities towards emerging technologies to make desired technological futures a reality (e.g. Borup et al., 2006). Proponents of specific nascent technologies tend to identify pressing problems that can only be solved by that technology. Many narratives of AI in science portray it as radically transforming scientific research and heralding a new era of scientific productivity. Some narratives argue that AI will free up time for researchers, augment scientific reasoning, boost diversity and promote the decentralisation of science. While often speculative, these narratives might affect the development, use and impacts of AI (Ferri, 2022). This dynamic is an ever-present aspect of the uptake of new technologies, and AI is no different. However, rhetoric around emerging technologies always risks drawing attention from alternative possible trajectories of development, and from negative or unintended consequences.

AI technologies will reconfigure the organisation and conditions of the science base. This will have many positive consequences and potentially negative and unintended ones, such as the deskilling of researchers or problems arising from the black-boxing of key processes and practices. Such negative and unintended consequences should be considered ahead of promissory statements about AI and scientific productivity. When proponents of AI in science speak, they should be asked: are they speaking about routine monitoring or leading-edge research? When they talk about productivity, what is the product produced, and why is it important? Of course, the terms “science” and “productivity” must be used consistently.

A key implicit goal of science policy is the generation of the human, organisational and infrastructural capacity to do research. This underpins problem-driven (applied) research, scientific entrepreneurship and industrial innovation. Statements about AI in science also need to be judged in light of their effects on this research capacity. Their effects on everyday scientific tasks, as well as the opportunities for (and content of) research training, should also be assessed.

It may well be that adoption of AI tools can remove some boring routine tasks from research work, giving more space for exploratory, creative and social elements of research practice. However, as discussed above, it might equally be the case that demands for new routine tasks stemming from the adoption of AI tools will create new pressures on everyday scientific work.

Our understanding of how AI might change science cannot be detached from the human capital dimension of scientific practice. Different kinds of scientists and engineers will emerge from a scientific enterprise in which AI tools are widely used. They will have different sets of skills and be accustomed to different routines from their predecessors. Fields such as synthetic biology rely heavily on interdisciplinary collaborations (e.g. between computer scientists and biologists). These scientists are slowly becoming more equipped to conduct technical work when troubleshooting machines. They are also getting used to interacting with technicians from supplier companies. Their relationship with the lab is also changing as large robotics platforms take the space of the bench. More experimental work is done in office spaces away from lab benches.

These new configurations affect the way scientists collaborate and how they co-ordinate their everyday tasks. The consequences may vary according to position in the scientific hierarchy. Innovation and adoption of new AI tools in science will also create demands on research funding and governance practices. Questions will arise about how such tools will be funded and evaluated, and how they will be used in funding and governance practices.

AI tools have been framed as an answer not just to the problem of scientific productivity but also to problems of reproducibility and poor practice in science. However, some argue the crisis of replicability and related problems stem from the ever-more intense “publish or perish” nature of modern scientific competition. If anything, these critical voices say, science needs to slow down (Stengers, 2018; Frith, 2020).

Perhaps the key question is not how AI tools can accelerate scientific productivity in terms of the quanta of discoveries or their direct social and economic impacts. After all, individual parcels of new knowledge are not the key mechanism through which science has such impacts. Instead, perhaps the question should be: how can AI tools help build a slower but more sustainable, more responsible and socially productive science base?


Azad, B. and N. King (2008), “Enacting computer workaround practices within a medication dispensing system”, European Journal of Information Systems, Vol. 17/3, pp. 264-278.

Bartlett, A., J. Lewis and M. Williams (2016), “Generations of interdisciplinarity in bioinformatics”, New Genetics and Society, Vol. 35/2, pp. 186-209.

Borup, M. et al. (2006), “The sociology of expectations in science and technology”, Technology Analysis & Strategic Management, Vol. 18/3-4, pp. 285-298.

Checco, A. et al. (2021), “AI-assisted peer review”, Humanities and Social Sciences Communications, Vol. 8/25.

Clarke, S. (2019), “What can be learned from government industrial development and research policy in the United Kingdom, 1914-1965”, in Lessons from the History of UK Science Policy, The British Academy, London.

Cyranoski, D. (2019), “Artificial intelligence is selecting grant reviewers in China”, Nature, Vol. 569, pp. 316-317.

Dennis, M.A. (2006), “Reconstructing sociotechnical order: Vannevar Bush and US science policy”, in Jasanoff, S. (ed.), States of Knowledge: The Co-production of Science and Social Order, Routledge, New York.

EC (2022), “Study on factors impeding the productivity of research and the prospects for open science policies to improve the ability of the research and innovation system – final report”, European Commission, Brussels.

Ferri, P. (2022), “The impact of artificial intelligence on scientific collaboration: Setting the scene for a future research agenda”, presented at Eu-SPRI 2022, 1-3 June, Utrecht, Netherlands.

Flanagan, K. et al. (2003), “Chasing the leading edge: Some lessons for research infrastructure policy”, presented at ASEAT Conference, Manchester.

Frith, U. (2020), “Fast lane to slow science”, Trends in Cognitive Sciences, Vol. 24/1, pp. 1-2.

Georghiou, L. and P. Halfpenny (1996), “Equipping researchers for the future”, Nature, Vol. 383, pp. 663-664.

Heaven, D. (2018), “AI peer reviewers unleashed to ease publishing grind”, Nature, Vol. 563, pp. 609-610.

Helmy, M., M. Awad and K.A. Mosa (2016), “Limited resources of genome sequencing in developing countries: Challenges and solutions”, Applied & Translational Genomics, Vol. 9, pp. 15-19.

Junk, T.R. and L. Lyons (2020), “Reproducibility and replication of experimental particle physics results”, Harvard Data Science Review, Vol. 2/4.

Kapoor, S. and A. Narayanan (2022), “Leakage and the reproducibility crisis in ML-based science”, arXiv, preprint arXiv:2207.07048.

Mindell, D.A. (2015), Our Robots, Ourselves: Robotics and the Myths of Autonomy, Viking, New York.

Nickerson, R.S. (2000), “Null hypothesis significance testing: A review of an old and continuing controversy”, Psychological Methods, Vol. 5/2, p. 241.

OECD (2015), Frascati Manual 2015: Guidelines for Collecting and Reporting Data on Research and Experimental Development, The Measurement of Scientific, Technological and Innovation Activities, OECD Publishing, Paris.

OECD (1995), Measurement of Scientific and Technological Activities: Manual on the Measurement of Human Resources Devoted to S&T – Canberra Manual, The Measurement of Scientific and Technological Activities, OECD Publishing, Paris.

Pickstone, J.V. (2000), Ways of Knowing: A New History of Science, Technology and Medicine, University of Chicago Press.

Raghu, M. and E. Schmidt (2020), “A survey of deep learning for scientific discovery”, arXiv, preprint arXiv:2003.11755.

Ribeiro, B. and P. Shapira (2020), “Private and public values of innovation: A patent analysis of synthetic biology”, Research Policy, Vol. 49/1, p. 103875.

Ribeiro, B. et al. (2023), “The digitalisation paradox of everyday scientific labour: How mundane knowledge work is amplified and diversified in the biosciences”, Research Policy, Vol. 52/1, p. 104607.

Royal Society (2018), “The AI revolution in scientific research”, Royal Society/Alan Turing Institute, London.

Salter, A.J. and B. Martin (2001), “The economic benefits of publicly funded basic research: A critical review”, Research Policy, Vol. 30/3, pp. 509-532.

Stengers, I. (2018), Another Science is Possible: A Manifesto for Slow Science, Polity Press, New York.

Valli, T. et al. (2007), “Over 60% of NIH extramural funding involves animal-related research”, Veterinary Pathology, Vol. 44/6, pp. 962-963.

Vicsek, L. (2021), “Artificial intelligence and the future of work – Lessons from the sociology of expectations”, International Journal of Sociology and Social Policy, Vol. 41/7/8, pp. 842-861.

Wagner, C. (2008), The New Invisible College: Science for Development, Brookings, Washington, DC.

Wilsdon, J. (2021), “AI & machine learning in research assessment: Can we draw lessons from debates over responsible metrics?”, presentation to Research on Research Institute & Research Council of Norway workshop, January.

Zachary, G.P. (1997), Endless Frontier: Vannevar Bush, Engineer of the American Century, MIT Press, Cambridge, MA.

© OECD 2023