Robot scientists: From Adam to Eve to Genesis

R. King
Chalmers University
O. Peter
P. Courtney

This essay addresses the concept of the robot scientist, a technology that combines robotics with artificial intelligence (AI) to automate the scientific process. It traces the origins, enabling technologies and possible future directions for robot scientists, as well as the potential impact in advancing science and health care via their use in the biopharmaceutical industry. In identifying key trends, it makes recommendations for continued investment in the development of both AI and robotics and their interface across the medium to long term; incentives to develop and adopt interoperability standards and ontologies to support exchange and collaboration via open science; increased opportunities for collaboration between disciplines, including skills development; and international support for broad initiatives such as the Nobel Turing Challenge that can galvanise and inspire researchers, and even the general public.

A robot scientist is an artefact built to perform basic scientific research autonomously by combining AI and machine learning (ML) algorithms with laboratory robots. A robot here is defined as an entity that interacts with the world. Once beyond the simplest mechanical devices, such as a pump, the possibilities of combining existing and new technologies in the service of robotics quickly become interesting. With suitable sensors such as cameras, a robot can perceive the world and make decisions. With suitable manipulators, it can interact with objects. With suitable locomotion, it can navigate around the world. Finally, with suitable cognitive or reasoning ability, it can adapt and learn.

For AI to have a significant impact on science, it needs to get into the laboratories where experimental research is actually done. Science is not just thinking about the world – it is about testing hypotheses by experiment.

A robot scientist – as distinct from other sorts of robot – is provided with background knowledge about an area of research. This knowledge is best represented using established tools from logic and probability theory. The robot scientist can automatically form a novel hypothesis about its area of science (Figure 1). That is, it considers existing knowledge available in databases and annotated datasets. It also considers published literature in the form of papers and patents, although to a lesser extent given limitations of natural language processing technology. With this knowledge, it formulates a hypothesis (step 1 in Figure 1); it devises experiments to test the hypothesis (step 2); physically runs the experiments using laboratory robotics (step 3); interprets the results (step 4) to change the probability of different hypotheses (step 5); and then repeats the cycle (King et al., 2009, 2004). It also has an automated way of selecting efficient experiments (in terms of time and money) to decide between alternative hypotheses.
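The cycle described above can be sketched in a few lines of code. The example below is a minimal illustration, not the software used in Adam or Eve: it assumes three competing hypotheses, experiments with yes/no results, and a selection rule that maximises expected information gain per unit cost. The gene names, likelihood values, costs and the simulated laboratory are all hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def bayes_update(belief, likelihoods):
    """Step 5: revise hypothesis probabilities given P(result | hypothesis)."""
    unnormalised = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]

def expected_gain(belief, p_positive):
    """Expected drop in entropy from one experiment with a yes/no result.

    p_positive[i] is P(positive result | hypothesis i is true).
    """
    gain = entropy(belief)
    for likelihoods in (p_positive, [1 - p for p in p_positive]):
        p_result = sum(b * l for b, l in zip(belief, likelihoods))
        if p_result > 0:
            gain -= p_result * entropy(bayes_update(belief, likelihoods))
    return gain

def choose_experiment(belief, experiments):
    """Step 2: pick the experiment with the most expected information per unit cost."""
    return max(experiments,
               key=lambda e: expected_gain(belief, e["p_positive"]) / e["cost"])

def run_cycles(prior, experiments, run_in_lab, rounds):
    """Repeat steps 2 to 5 for a fixed number of rounds."""
    belief = list(prior)
    for _ in range(rounds):
        experiment = choose_experiment(belief, experiments)
        positive = run_in_lab(experiment)                     # step 3: run it
        p = experiment["p_positive"]
        likelihoods = p if positive else [1 - x for x in p]   # step 4: interpret
        belief = bayes_update(belief, likelihoods)            # step 5: update
    return belief

# Toy setup: every name and number below is hypothetical.
HYPOTHESES = ["gene_A", "gene_B", "gene_C"]   # step 1: candidate hypotheses
EXPERIMENTS = [
    {"name": "knock out gene_A", "p_positive": [0.9, 0.1, 0.1], "cost": 1.0},
    {"name": "knock out gene_B", "p_positive": [0.1, 0.9, 0.1], "cost": 1.0},
    {"name": "knock out gene_C", "p_positive": [0.1, 0.1, 0.9], "cost": 1.0},
    {"name": "uninformative assay", "p_positive": [0.5, 0.5, 0.5], "cost": 0.1},
]
TRUE_INDEX = 1  # the simulated "truth": gene_B encodes the enzyme

def simulated_lab(experiment):
    """Stand-in for the robot: returns the likelier result under the true hypothesis."""
    return experiment["p_positive"][TRUE_INDEX] > 0.5

final_belief = run_cycles([1 / 3] * 3, EXPERIMENTS, simulated_lab, rounds=5)
best_hypothesis = HYPOTHESES[final_belief.index(max(final_belief))]
```

The cheap but uninformative assay is never selected, because an experiment whose result is equally likely under every hypothesis carries no expected information however little it costs. Dividing by cost is one simple way to trade information against expense; other trade-offs are possible.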

Several advanced features distinguish robot scientists from other complex laboratory systems (such as high-throughput drug-screening platforms). These are their integral AI software, their many complex internal cycles (such as hypothesis generation, selection, evaluation and refinement) and their ability to execute individually planned cycles of experiments at high throughput.

Materials scientists, chemists and drug designers have increasingly taken up integration of AI and laboratory automation. Robot scientists now go by different names, such as closed-loop platform, AI scientist, high-throughput experimentation platform and self-driving labs. The different names reflect developments in the different scientific communities (see later for more examples).

The original robot scientist was Adam (Figure 2), the first machine to discover novel scientific knowledge autonomously (King et al., 2009). It was designed to identify the relationship between genes and enzymes in yeast metabolism. Adam discovered the function of locally orphan enzymes; i.e. enzymes known to exist because of the functions they perform but for which the gene(s) encoding them are unknown (King et al., 2009).

Adam was designed to study how different strains of yeast grow in a wide range of conditions. It efficiently generated the growth data needed to compare against the predictions made by the system. In scientific terms, a growth curve maps the yeast strain (genotype) to its behaviour (phenotype). Being fully automated, Adam required no technicians, except to periodically add laboratory consumables and remove waste. Each experiment ran for many days and around the clock. Adam designed and initiated over 100 new experiments a day from a selection of thousands of yeast strains and growth conditions. Every 20 minutes, accurate optical measurements were made for each experiment. This resulted in more than 10 000 reliable measurements per day of the growth of the different yeast strains. Adam also automatically recorded the metadata for the experiments.
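The throughput figures above can be checked with simple arithmetic. The sketch below uses only the numbers quoted in the text (one reading every 20 minutes, over 100 new experiments a day); the run length of each experiment is not stated beyond "many days", so the two-day value used here is an assumption chosen as a conservative lower bound.

```python
import math

MINUTES_PER_DAY = 24 * 60
READING_INTERVAL_MIN = 20        # from the text: one optical reading every 20 minutes
NEW_EXPERIMENTS_PER_DAY = 100    # from the text ("over 100", so a lower bound)

# Readings taken per experiment in one day of round-the-clock operation.
READINGS_PER_EXPERIMENT_PER_DAY = MINUTES_PER_DAY // READING_INTERVAL_MIN

def daily_readings(concurrent_experiments):
    """Total optical readings per day for a given number of running experiments."""
    return concurrent_experiments * READINGS_PER_EXPERIMENT_PER_DAY

def min_concurrent_for(target_readings):
    """Smallest number of running experiments that reaches the target per day."""
    return math.ceil(target_readings / READINGS_PER_EXPERIMENT_PER_DAY)

def steady_state_concurrent(run_days):
    """Experiments running at once if each lasts run_days (assumed, not from the text)."""
    return NEW_EXPERIMENTS_PER_DAY * run_days

needed = min_concurrent_for(10_000)                    # experiments needed for 10 000 readings
with_two_day_runs = daily_readings(steady_state_concurrent(2))
```

Each experiment yields 72 readings a day, so about 139 experiments running at once are enough to pass 10 000 readings. With 100 new experiments started daily and runs of at least two days, the quoted figure is comfortably reached.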

A second robot scientist, Eve (Figure 3), was designed to automate early-stage drug development (Williams et al., 2015). The design of Eve was motivated by the need to make drug discovery cheaper and faster. This was meant to promote development of treatments for diseases neglected for economic reasons, such as tropical and orphan diseases, and more generally to increase the supply of new drugs.

Eve integrates two advances in drug discovery automation. First, a laboratory automation system uses AI techniques to discover scientific knowledge through cycles of experimentation. Second, synthetic biology – an area of research designed to create new biological parts and systems – constructs cellular analogue computers (i.e. computers within living cells based on continuously changing biological processes rather than binary 0s and 1s).

In a novel development, Eve has three integrated modes, each corresponding to a stage in discovering what is known as a “lead” drug (i.e. a candidate drug molecule that shows useful activity). This integration aims to save time, avoiding the need to switch between different equipment:

  1. In its library-screening mode, Eve systematically tests (assays) each compound in a library of compounds in the standard brute force way of conventional mass screening. While simple to automate, such mass screening is slow. It also wastes resources since every compound in the library is tested. It is also unintelligent, making no use of what is learnt during screening.

  2. In confirmation mode, Eve then re-tests the promising compounds found by library screening. To minimise false positives, it uses multiple repeats and iterations.

  3. In its final step, it executes cycles of statistics and ML to hypothesise so-called quantitative structure activity relationships (QSARs) and to test these QSARs on new compounds. A QSAR is a mathematical/computational function that predicts activity in an assay from a compound’s structure. Eve’s “QSAR-mode” is designed to execute such cycles of QSAR learning and testing, thus checking and improving performance.

The authors demonstrated that, under most circumstances, such intelligent library screening out-performs standard mass screening economically as it saves on time and compound use.
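A toy comparison can make the difference between brute-force and model-guided screening concrete. The sketch below is purely illustrative and does not reproduce Eve's methods: the 64-compound library of 6-bit "fingerprints", the hidden assay function, the hit threshold and the crude additive model (which stands in for a real QSAR) are all assumptions. The seed batch is simply the first eight compounds in the library.

```python
N_BITS = 6
# Toy library: every 6-bit fingerprint, in index order (hypothetical compounds).
LIBRARY = [tuple((i >> k) & 1 for k in range(N_BITS)) for i in range(2 ** N_BITS)]
HIT_THRESHOLD = 6

def assay(fingerprint):
    """Hidden 'wet-lab' result (toy assumption): only bits 0 to 2 affect activity."""
    return 4 * fingerprint[0] + 2 * fingerprint[1] + 1 * fingerprint[2]

def fit_qsar(tested):
    """Crude additive model: per-bit effect is the mean activity of tested
    compounds with the bit set minus the mean of those with it clear."""
    effects = []
    for k in range(N_BITS):
        on = [a for fp, a in tested.items() if fp[k]]
        off = [a for fp, a in tested.items() if not fp[k]]
        effects.append(sum(on) / len(on) - sum(off) / len(off) if on and off else 0.0)
    return effects

def predict(effects, fingerprint):
    """Predicted activity: sum of the learnt effects of the bits that are set."""
    return sum(e for e, bit in zip(effects, fingerprint) if bit)

def brute_force(budget):
    """Mass screening under a budget: test compounds in library order."""
    return {fp: assay(fp) for fp in LIBRARY[:budget]}

def qsar_guided(budget, batch=8):
    """Cycles of learning and testing: fit the model, test the top predictions, repeat."""
    tested = {fp: assay(fp) for fp in LIBRARY[:batch]}        # seed batch
    while len(tested) < budget:
        effects = fit_qsar(tested)
        untested = [fp for fp in LIBRARY if fp not in tested]
        untested.sort(key=lambda fp: predict(effects, fp), reverse=True)
        for fp in untested[:min(batch, budget - len(tested))]:
            tested[fp] = assay(fp)
    return tested

def count_hits(tested):
    """Number of tested compounds whose activity reaches the hit threshold."""
    return sum(activity >= HIT_THRESHOLD for activity in tested.values())

brute_hits = count_hits(brute_force(24))
guided_hits = count_hits(qsar_guided(24))
```

With a budget of 24 assays, testing in library order finds 6 of the 16 hits in this toy library, while the guided loop finds more with the same number of assays. Real assays are noisy and real structure-activity relationships are far less tidy, so the size of the advantage here should not be read as representative.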

To validate Eve’s performance on these processes, the authors used the system to quickly and cheaply find drugs that are known to be safe and that target multiple human and parasite enzymes. For several of these drugs, Eve helped provide new insight into their mode of action and helped indicate new uses for safe drugs. Furthermore, using econometric modelling, they demonstrated that Eve’s use of AI to select compounds outperformed standard drug screening. Eve’s most significant discovery is that triclosan (an anti-microbial compound commonly used in toothpaste) inhibits an essential mechanism in the malaria-causing parasites Plasmodium falciparum and P. vivax (Bilsland et al., 2018).

The general motivation for using robot scientists is to increase the productivity of science. AI systems and robots can work more cheaply, faster, more accurately and longer than human beings (i.e. 24/7). More specifically, robot scientists can do the following:

  • flawlessly collect, record and consider vast numbers of facts

  • systematically extract data from millions of scientific papers

  • perform unbiased, near-optimal probabilistic reasoning

  • generate and compare a vast number of hypotheses in parallel

  • select near-optimal (in time and money) experiments to test hypotheses

  • systematically describe experiments in semantic detail, automatically recording and storing results along with the associated metadata and procedures employed, in accordance with accepted standards, at no additional cost (see note 1), to help reproduce work in other labs, increasing knowledge transfer and improving the quality of science

  • increase transparency of research (fraudulent research is more difficult), standardisation and exchangeability (by reducing undocumented laboratory bias).

Furthermore, once a working robot scientist is built, it can be easily multiplied and scaled. Robot scientists are also immune to a range of hazards, including pandemic infections. Importantly, all these remarkable capabilities remain complementary to the creative power of human scientists.

Laboratory automation is already a multi-billion-US-dollar industry with strong contributions from Germany and Switzerland, as well as Japan, the United Kingdom and the United States. Laboratory robotics technology is steadily advancing. Today, many (but not all) tasks that a human can do in the laboratory can be automated.

Robotics has been the subject of considerable public investment. This is notably the case in Europe, where public investments have amounted to around EUR 100 million per year over the last 20 years (EC, 2008). Many countries have robotics programmes, such as the United Kingdom’s Robotics and Autonomous Systems initiative. These are set to grow as more investment is committed to development of AI.

AI-inside: AI is generally thought of as the abstract analysis of off-line data, at some distance from actual physical action. Conversely, robotics can be considered as “embodied AI”, directly linked to action in the world. AI technology and robotics support one another at several levels – from sensing and actuation to reacting and planning. Classical engineering increasingly adopts and adapts ideas and elements from AI. The relationship between AI and robotics is not simple, but dialogue between the disciplines is ongoing, especially in Europe.

Robotic systems are already widely installed in labs, in particular for handling liquids. Many larger labs in the pharmaceutical industry use these routinely. Most clinical analysis, such as that of blood, is fully automated. Recently, some interesting robot assistants have also appeared (Burger et al., 2020). Developments in robotics in other sectors – such as robot chefs – also have clear implications for laboratory operations.

Robots still have significant limitations. Many tasks in most labs remain largely un-automated. Robots today operate in protective boxes and can be hard for scientists to program. Often, logistics tasks still fall to lab technicians and scientists, who provide the robots with consumables, such as plates and chemicals, and remove waste. The average lab is still a long way from the digitalisation familiar in homes, exemplified by smartphone apps and robot vacuum cleaners.

Some developments can be observed in important components of robotic systems: robotic arms have become cheaper, easier to use and safer thanks to the development of collaborative robots with force-sensing capability. However, it is not uncommon to see industrial robots designed to lift 5 kg metal payloads moving plastic tubes weighing 50 g. Physical manipulators remain clumsy and not well suited to grasp the range of tubes and other devices commonly used in the lab; new ones are needed. Mobile platforms are the subject of intense interest and experimentation. However, they cannot be used to the extent they are in industrial warehouse operations without adaptation. All of the above indicate opportunities that are increasingly being recognised and addressed.

Given the importance of robotics and AI, there are potentially significant benefits in sketching out future needs. There have been considerable efforts to co-ordinate priorities and funding between researchers and firms working in AI and robotics through public-private partnerships such as euRobotics (n.d.). Through consultations, interviews and brainstorming meetings, for example, the laboratory robotics community has identified a range of use cases and challenges. These span diverse applications and capabilities, as well as a range of integrating (or platform) and modular approaches. Automated discovery platforms, for example, combine advances in AI and robotics to build on knowledge across many disciplines. Indeed, some groups have started to assemble such systems as open platforms (see below).

Road-mapping exercises identified interoperability as an important barrier. Mutually agreed ontologies of concepts are needed to feed and train the AI algorithms so that semantic information can be shared and understood. Laboratories have benefited greatly from the widespread adoption of the SBS well plate format (Wikipedia, n.d.) as a standard carrier of biological samples. Adopting the SBS format has produced between a hundred- and thousandfold increase in productivity, a transformative impact comparable to the huge efficiency benefits in global trade and logistics from adoption of the 40-foot shipping container. A similar advance in the digital laboratory could come from standardised human- and machine-readable data formats. Initiatives to do this are well underway (SiLA, n.d.). One further consideration is how to share data in a findable, accessible, interoperable and reusable (FAIR) manner to support open science (Wilkinson et al., 2016).

The innovative core of any new therapeutic drug is a molecule with intricately tuned properties: the active pharmaceutical ingredient (API). This may be a small molecule, such as an oral antibiotic, or a larger biological entity, such as an antibody therapeutic against cancer or an mRNA vaccine.

Novel drug molecules are created in two stages. First, a large substance collection is screened to identify chemical structures with some initial activity (“hits”). This process is called (ultra) high-throughput screening (HTS). The demand by the pharmaceutical industry to make HTS technically possible was a key driver of biological laboratory automation, beginning in the 1980s. HTS also contributed to enabling full genome sequencing, starting in the 2000s. In contrast, the second stage of drug discovery is much less automated. It typically takes several years to optimise the structures of candidate drug molecules. It also takes countless cycles of structure design, manual chemical synthesis and biological property testing before a promising drug molecule is found.

Automating medicinal chemistry was long considered technically impossible, largely because of the complexity of the task and human insight needed. With the emergence of ML/AI, this has started to change. Computer-aided drug design is increasing and might become widespread. However, actually making the molecules is still limited by cost and the capability and capacity of traditional chemistry labs. The pharmaceutical industry has already outsourced much of the work of making molecules to lower-cost countries. Expensive lab space is only occupied 25% of the time during a typical work week, considering eight-hour days and a five-day week. Manual optimisation of all the necessary work on promising candidate molecules inevitably consists mostly of unproductive waiting times, even when the work is outsourced across suppliers in different time zones.
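The occupancy figure quoted above follows from simple arithmetic, sketched below. The inputs for the manual lab are the eight-hour day and five-day week given in the text; the round-the-clock case is an idealised upper bound that ignores maintenance and restocking, which is an assumption.

```python
HOURS_PER_WEEK = 7 * 24   # 168 hours in a calendar week

def occupancy(hours_per_day, days_per_week):
    """Fraction of the calendar week during which the lab is in use."""
    return hours_per_day * days_per_week / HOURS_PER_WEEK

manual_lab = occupancy(8, 5)       # 40 of 168 hours
automated_lab = occupancy(24, 7)   # idealised upper bound (assumes no downtime)
idle_hours_manual = HOURS_PER_WEEK - 8 * 5
```

A manually operated lab is therefore in use for 40 of 168 hours, a little under a quarter of the week, leaving 128 hours in which expensive space and equipment stand idle.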

In future, competitive biopharmaceutical drug discovery and development will involve novel, fully automated, closed-loop design-make-test (DMT) platforms. These will integrate iterative design of the structure of molecules by ML/AI algorithms and the synthesis and testing of physical molecules. The goal is to eventually push down the time for optimisation of good candidate molecules from weeks to hours, producing valuable preclinical development candidates in months rather than years.

The “cloud lab” concept is also emerging within the biopharmaceutical industry. This development recognises that laboratory automation systems are still expensive to build and difficult to use. It also reflects how providing automation at scale as a service in the cloud can address these challenges. In this way, customers access automated labs through a user interface or an application programming interface, designing and executing their experiments remotely. A few companies have started to offer such services, including Strateos (n.d.) and Emerald Cloud Lab (n.d.), both in California (see note 2). Since 2021, and building on Eli Lilly’s decade-long experience of automating medicinal chemistry, Strateos has built and operated the Lilly Life Science Studio in San Diego. This is the most ambitious generalised medicinal chemistry platform publicly available today (Mullin, 2021).
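To give a sense of what designing an experiment remotely might involve, the sketch below builds a machine-readable experiment request and serialises it to JSON. It does not represent the interface of Strateos, Emerald Cloud Lab or any other provider: the class names, field names, operations and parameters are all hypothetical, and a real service would define its own schema.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Step:
    """One instruction for the remote lab (hypothetical schema)."""
    operation: str       # e.g. "dispense", "incubate", "read_absorbance"
    parameters: dict     # operation-specific settings, with units in the key names

@dataclass
class ExperimentRequest:
    """A complete experiment description a customer could submit remotely."""
    protocol_name: str
    plate_format: str
    steps: list = field(default_factory=list)

    def validate(self):
        """Reject requests the lab could not execute."""
        if not self.steps:
            raise ValueError("an experiment needs at least one step")
        for step in self.steps:
            if not step.operation:
                raise ValueError("every step needs an operation name")

    def to_json(self):
        """Serialise to JSON, the form in which the request would be transmitted."""
        self.validate()
        return json.dumps(asdict(self), sort_keys=True)

request = ExperimentRequest(
    protocol_name="toy-growth-assay",
    plate_format="96-well",
    steps=[
        Step("dispense", {"volume_ul": 50}),
        Step("incubate", {"temperature_c": 30, "duration_min": 120}),
        Step("read_absorbance", {"wavelength_nm": 600}),
    ],
)
payload = request.to_json()
```

Describing the experiment as structured data rather than free text is what allows it to be checked before any reagent is used and, given agreed standards, to be moved between providers.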

Automated lab infrastructure might be built into standard shipping containers to be scalable and relocatable. By analogy with virtualised computing infrastructure like Amazon Web Services, such remote experimentation services could enable the emergence of “virtual” biopharmaceutical enterprises. This means individual companies would not need to own a laboratory. However, to avoid the “balkanisation” of the programming interfaces to such services, global cross-platform standards must be adopted.

Most of the physical and computational elements required to build closed-loop platforms exist today. One challenge is to automate milligram-scale transfer of thousands of solid chemicals (the building blocks of molecular synthesis), which are often sticky or viscous. Companies such as Chemspeed offer solutions and are working to improve them. However, systems must be modular to make their inherent complexity manageable and their process control efficient. Modularisation must itself happen in iterative cycles of platform improvement.

A stepwise approach towards fully integrated closed-loop DMT platforms could be based on independent islands of automation with manageable yet limited functionality. These would be connected by autonomous lab robots to transport samples. Such robots are just emerging (but manually transferring samples might suffice initially).

These loosely coupled functional modules would be used in a workpiece-centric way, much like a modern digitally connected factory. Rather than centrally orchestrating every move, the system responsible for each job (such as transferring a rack of samples from one station to another) would request whatever processing step is required next as the overall process moves to completion.
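The workpiece-centric idea can be sketched as follows. In this minimal illustration, each job carries its own route and asks for its next processing step, while a simple round-robin loop merely gives each job its turn; no central plan of every move exists. The station names, capabilities and routes are hypothetical, and a real platform would also need queues, error handling and physical transport.

```python
from collections import deque

class Station:
    """An island of automation offering one capability."""
    def __init__(self, name, capability):
        self.name = name
        self.capability = capability
        self.log = []                      # job ids processed, in order

    def process(self, job_id):
        self.log.append(job_id)
        return f"{self.capability} done at {self.name}"

class Job:
    """A workpiece (e.g. a rack of samples) that knows its own remaining route."""
    def __init__(self, job_id, route):
        self.job_id = job_id
        self.remaining = deque(route)
        self.history = []

    def request_next(self, stations):
        """Ask for the next required step; return False once the route is complete."""
        if not self.remaining:
            return False
        needed = self.remaining.popleft()
        station = next((s for s in stations if s.capability == needed), None)
        if station is None:
            raise LookupError(f"no station offers {needed!r}")
        self.history.append(station.process(self.job_id))
        return True

def run(jobs, stations):
    """Give each unfinished job a turn until all routes are complete."""
    pending = deque(jobs)
    while pending:
        job = pending.popleft()
        if job.request_next(stations):
            pending.append(job)

stations = [
    Station("synth-1", "synthesise"),
    Station("purify-1", "purify"),
    Station("assay-1", "assay"),
]
rack_a = Job("rack-A", ["synthesise", "purify", "assay"])
rack_b = Job("rack-B", ["assay"])
run([rack_a, rack_b], stations)
```

Because the route lives with the job, a new station or a new kind of job can be added without rewriting a central schedule, which is one reason the approach suits platforms that are improved in iterative cycles.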

The authors’ experience of building large high-throughput systems has shown that technical standards for physical and logical interfaces between laboratory devices are crucial to constantly improve and adapt such platforms to changing requirements. Such standards relate to the physical dimensions of items consumed by the platforms during operation, reagent packaging, device commands, method descriptions and scientific data formats. Open standards will foster a healthy ecosystem of co-operating and productively competing suppliers and consumers of future DMT platform modules and services.

In summary, better design algorithms will have limited impact on experimentation in the biopharmaceuticals sector without similarly efficient, robotically automated, physical synthesis and testing. They also need interfaces to engage humans, who bring their unique ingenuity.

The robot scientist concept is increasingly recognised as a general platform for accelerating science. Beyond Adam and Eve, a number of other automated discovery platforms operate in a range of disciplines. Each has specific tools and configurations, and each reveals the next bottlenecks to resolve in automation. These include, for example, initiatives in the following:

  • chemistry in Canada (Kebotix, n.d.), in the United Kingdom (Imperial College London, n.d.), in the United States (The Cernak Lab, n.d.; MIT, n.d.) and in Switzerland (IBM Research, n.d.), and materials science, such as the material genome initiatives in Canada and Cambridge in the United Kingdom (BIG-MAP, n.d.) and in the United States (MGI, n.d.), which call upon large databases of known reactions to create desired molecular forms

  • catalysis in France (Realcat, n.d.) and Switzerland (Swiss CAT+, n.d.), which assess the functional performance of the new materials in specific tasks

  • metallurgy, which explores new alloys by combining existing metals.

In each of these cases, the lack of data on failed experiments, which human scientists are not incentivised to record, is revealed to be a knowledge gap, one that such platforms are well placed to address. One should also mention initiatives in cell culture and bioprocessing, such as the KIWI-biolab (n.d.) from TU Berlin, which leverages developments in genomics. Meanwhile, with the labdroids at RIKEN in Japan, humanoid robots automate operations using tools designed for human hands to improve reproducibility.

Chalmers University in Sweden aims to take the robot scientist to a new level in terms of the number of experiments per step, and the generation and use of more and better quality data. The Chalmers robot, known as Genesis, aims to achieve a detailed and complete understanding of the functioning of complex cells such as yeast. This goal remains a fundamental challenge for 21st century science, and a solution could help answer many questions in the life sciences, biotechnology and medicine.

The new hardware for Genesis will be equipped with 10 000 miniature fermenters (technically, chemostats that control the culture of microorganisms). They will be able to carry out detailed analysis of how biological function relates to metabolism and active genes. The original technology was developed through government-funded research at Vanderbilt University in the United States. Genesis represents a thousandfold scale-up on a traditional manual laboratory that has around ten chemostats. Only an AI system can control so many different experiments, where every day each chemostat will run a separately designed experiment to test a hypothesis.

Science requires experiments involving physical actions, which creates a critical role for robots. This will require support for development of both AI and robotics, and the interface between them (see above section on advances in robotics technology). The newly formed AI, Data and Robotics Association (Adra), which focuses on this AI-robotics interface, is a recent example of work in this direction (Adra, n.d.).

The Nobel Turing Challenge underlines the importance of interdisciplinary collaboration at the international level. It challenges researchers to build a system by 2050 that can perform scientific discoveries at a level that merits a Nobel Prize (Kitano, 2021; see note 3). This bold and inspiring challenge is gaining support in the United States, Japan and Europe.

A few suggestions for the role of public support are noted below.

Robots are developing fast for industrial applications but not always in ways that meet the needs of laboratories. Since laboratory users are often highly skilled and collaborative, it may not be the most productive path to replace them directly with existing robots. There are deep intellectual challenges in developing the necessary technologies in partnership with laboratory users. As a consequence, more interaction is required between the roboticists and the domain experts, perhaps in collaborative research programmes and centres. Such programmes could, for example, bring together materials scientists, chemists, AI experts and roboticists to help develop next-generation battery materials (BATT4EU, n.d.; Stein and Gregoire, 2019). Collaborative programmes could also facilitate road-mapping activities across disciplines to identify future gaps and opportunities, and thus guide funding priorities (euRobotics, n.d.). Governments are best placed to create such programmes because of the broad reach required. They can bring together players that otherwise rarely co-ordinate their activities.

Ontologies are necessary for AI/ML, but the fragmented ones must be consolidated and aligned. Laboratory instruments need to become interoperable via standardised interfaces. Laboratory users, suppliers and technology developers could be brought together and incentivised by funders and publishers to co-operate from the moment the data are generated. This might take place under open science initiatives, for example, that support data curation and sharing through the FAIR principles, as well as appropriate data governance processes, including ethics.

Ongoing long-term collaboration across scientific disciplines is essential but is still too weak. The development of closed-loop research and development centres is encouraging and can serve as a focus for such collaboration, setting medium-term goals and providing formal training that combines engineering (robotics, AI, data, etc.) and science. When linked together, such centres (often national in reach) can also support common interests such as training and evolving research practice. For example, biologists are increasingly exposed to applied mathematics and statistics. However, engineers are still seldom exposed to modern, data-rich life science. All need greater exposure to the many issues relating to data governance, as noted above. Again, governments have a role here, one unlikely to be pursued by the private sector alone.

Initiatives such as the Nobel Turing Challenge can galvanise and inspire collaboration and co-ordination in science and should be supported at an international level. This support can help focus efforts on addressing long-term global challenges such as climate change and cancer. At the same time, it can drive agreement on standards and, not least, attract the young talent needed to make this ambition a reality.

These suggestions align well with those made in a recent US report regarding research practices, educational gaps, long-term support and data governance (National Academies of Sciences, Engineering and Medicine, 2022).


Adra (n.d.), The AI Data Robotics Association website (accessed 10 January 2023).

BATT4EU (n.d.), Batteries European Partnership website (accessed 10 January 2023).

BIG-MAP (n.d.), Batteries Interface Genome – Material Applications Platform website (accessed 10 January 2023).

Bilsland, E. et al. (2018), “Plasmodium dihydrofolate reductase is a second enzyme target for the antimalarial action of triclosan”, Scientific Reports, Vol. 8/1, pp. 1-8.

Bilsland, E. et al. (2011), “Functional expression of parasite drug targets and their human orthologs in yeast”, PLoS Neglected Tropical Diseases, Vol. 5/10, e1320.

Burger, B. et al. (2020), “A mobile robotic chemist”, Nature, Vol. 583/7815, pp. 237-241.

EC (2008), “EU doubles investment in robotics”, European Commission CORDIS Research Results, Brussels.

Emerald Cloud Lab (n.d.), Emerald Cloud Lab website (accessed 10 January 2023).

euRobotics (n.d.), euRobotics website (accessed 10 January 2023).

Gromski, P. et al. (2020), “Universal chemical synthesis and discovery with ‘the chemputer’”, Trends in Chemistry, Vol. 2/1, pp. 4-12.

IBM Research (n.d.), “IBM RoboRXN”, webpage (accessed 10 January 2023).

Imperial College London (n.d.), “Centre for Rapid Online Analysis of Reactions (ROAR)”, webpage (accessed 10 January 2023).

Kebotix (n.d.), Kebotix website (accessed 11 January 2023).

KIWI-biolab (n.d.), KIWI-biolab website (accessed 11 January 2023).

Kitano, H. (2021), “Nobel Turing Challenge: Creating the engine for scientific discovery”, npj Systems Biology and Applications, Vol. 7/1, pp. 1-12.

King, R.D. et al. (2009), “The automation of science”, Science, Vol. 324/5923, pp. 85-89.

King, R.D. et al. (2004), “Functional genomic hypothesis generation and experimentation by a robot scientist”, Nature, Vol. 427/6971, pp. 247-252.

MGI (n.d.), Materials Genome Initiative website (accessed 11 January 2023).

MIT (n.d.), Jensen Research Group website (accessed 11 January 2023).

Mullin, R. (2021), “The lab of the future is now”, Chemical & Engineering News, Vol. 99/11, p. 28.

National Academies of Sciences, Engineering, and Medicine (2022), Automated Research Workflows for Accelerated Discovery: Closing the Knowledge Discovery Loop, The National Academies Press, Washington, DC.

Realcat (n.d.), Realcat website (accessed 11 January 2023).

SiLA (n.d.), SiLA website (accessed 11 January 2023).

Stein, H.S. and J.M. Gregoire (2019), “Progress and prospects for accelerating materials science with automated and autonomous workflows”, Chemical Science, Vol. 10/42, pp. 9640-9649.

Strateos (n.d.), Strateos website (accessed 11 January 2023).

Swiss CAT+ (n.d.), Swiss CAT+ website (accessed 11 January 2023).

The Cernak Lab (n.d.), The Cernak Lab website (accessed 11 January 2023).

Wikipedia (n.d.), “Microplate”, webpage (accessed 11 January 2023).

Wilkinson, M.D. et al. (2016), “The FAIR Guiding Principles for scientific data management and stewardship”, Scientific Data, Vol. 3/1, pp. 1-9.

Williams, K. et al. (2015), “Cheaper faster drug development validated by the repositioning of drugs against neglected tropical diseases”, Journal of the Royal Society Interface, Vol. 12/104, 20141289.


1. For human scientists, by contrast, recording data, metadata and procedures adds up to 15% to the total costs of experimentation. Moreover, although recording experimental data is widespread, it is still uncommon to document fully the procedures used, the errors and all the metadata.

2. Private initiatives outside the United States include Arctoris and LabGenius in the United Kingdom.

3. For further details, see


© OECD 2023
