Democratising artificial intelligence to accelerate scientific discovery

J. Vanschoren
Eindhoven University of Technology

In recent years, artificial intelligence (AI) has gone from strength to strength. This has led to major scientific advances such as models that predict how proteins fold, how DNA determines gene expression, how to control plasma in nuclear fusion reactors, and many more. These advances depend on AI models – programs trained on data to recognise certain types of patterns and make predictions. The development of such models often requires large interdisciplinary teams of excellent scientists and engineers, large datasets and significant computational resources. Creating these conditions is hard, which hinders a more widespread acceleration of AI-enabled scientific discovery. This essay explores how automating a key task – the design of machine-learning models – can help democratise AI and allow many more and smaller teams to use it effectively in breakthrough scientific research.

AI models usually need to be complex to solve real-world scientific problems. They require a large amount of design and tuning based on thorough insight and intuition from both AI experts and scientists working in the domain in question. For instance, models such as AlphaFold (Jumper et al., 2021) combine deep learning (one of a broad family of AI models centred around neural networks) with built-in constraints derived from knowledge of biological and physical systems. The result is a hybrid model.

Deep-learning models have proven to be well suited for scientific problems with a massive combinatorial search space (e.g. there are 10^300 possible conformations for an average protein), a clear metric to optimise against (e.g. how well the predicted protein structure matches the experimental observations) and lots of data to learn from (Institute for Ethics in AI Oxford YouTube Channel, 13 July 2022).

However, somewhat ironically, designing these models also requires navigating a tremendously vast search space of possible conformations of neural network architectures. For instance, some of these architectures (the structure of layers of artificial neurons, their configurations and the connections between them) can easily have 10^18 possible configurations (Tu et al., 2022).
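To make this scale concrete, a back-of-the-envelope sketch shows how architecture search spaces explode combinatorially. The layer and operation counts below are illustrative, not taken from any specific published search space:

```python
# Illustrative: the number of candidate architectures grows exponentially
# with the number of design decisions. Suppose an architecture consists of
# 18 layers, and each layer independently chooses one of 10 candidate
# operations (convolution variants, pooling, skip connections, ...).
num_layers = 18
ops_per_layer = 10

num_architectures = ops_per_layer ** num_layers
print(num_architectures)  # 1000000000000000000, i.e. 10^18 configurations
```

Exhaustively training and evaluating even a tiny fraction of such a space is infeasible, which is why the search itself must be automated and guided.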

Discovering well-performing models is a science in itself, requiring non-trivial insight and technical expertise. A large team of research engineers can solve this problem by manual trial-and-error. However, in light of today’s endemic shortage of (and intense competition for) highly trained AI experts, it is hard to scale this approach to thousands of other labs.

Imagine if such AI expertise could be harnessed and made universally available through easy-to-use tools that largely automate the design of machine-learning systems. This would empower all of society to apply machine learning much more easily, in smaller teams and with fewer resources. In so doing, it would accelerate science more widely and effectively.

As Figure 1 depicts, automated machine learning (AutoML) can help democratise AI, enabling more and smaller teams to solve hard scientific problems. AutoML tools could augment effective intelligence, creating mixed teams of human scientists and AI assistants. Human scientists can make hypotheses, gather the right data and define goals. For their part, AI assistants can automatically optimise models and explore various ideas that humans could then analyse and quickly improve upon. These AI assistants could also learn across tasks and across teams, rapidly spreading effective solutions and best practices.

Moreover, automation is only one aspect of democratising AI. Efficiency and safety are equally important in enabling widespread access (Talwalkar, 2018). Training many deep-learning models from scratch can be prohibitively expensive for many scientists, or require too much training data. Further advances are needed to train models much more efficiently (e.g. using continual or transfer learning where, in the latter case, knowledge gained while solving one problem would be stored and applied to different but related problems).
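The transfer-learning idea mentioned above can be sketched in a toy form: features learnt on a large source problem are frozen and reused, and only a lightweight head is fitted on the small target problem. Everything here is hypothetical; a random projection stands in for genuinely pretrained weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "pretrained" feature extractor: a fixed random projection
# standing in for layers learnt on a large, data-rich source problem.
W_pretrained = rng.normal(size=(20, 8))

def extract_features(X):
    # Frozen: these weights are reused on the new problem, not retrained.
    return np.maximum(X @ W_pretrained, 0.0)  # ReLU features

# Small target problem: only 30 labelled examples, far too few to train
# a full model from scratch.
X_target = rng.normal(size=(30, 20))
y_target = rng.normal(size=30)

# Transfer learning here = fit only a linear head on top of the frozen
# features, drastically reducing both compute and data requirements.
F = extract_features(X_target)
head, *_ = np.linalg.lstsq(F, y_target, rcond=None)

predictions = extract_features(X_target) @ head
```

The cost saving comes from the fact that only the small head (8 parameters here) is estimated on the target data, while the expensive representation is inherited.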

Widespread access to specialised computational hardware and optimised AI software is also important. In addition, it is critical to understand and audit the behaviour of AI models to verify whether they are scientifically plausible and safe to use. This may include providing interpretable explanations for predictions and evaluating any ethical ramifications.

While some of these requirements may be automated, human experts must often be kept in the loop, which calls for effective collaboration between humans and automated tools. For instance, interpretable AutoML methods have emerged that explain what health-care models have learnt (AI Pursuit by TAIR YouTube Channel, 2021) as semantically meaningful formulas. These formulas are often complex, capturing, for example, how 20 different medical factors jointly influence the risk of breast cancer.

AutoML systems often depend on hard-coded assumptions about what the models should look like. These assumptions make the search for good models faster, as long as they work well for the data at hand. As such, there is always a continuum between AutoML systems that are very general but slow, and those that are extremely specific but efficient.

AutoML-Zero (Real et al., 2020) aims to create deep-learning algorithms from scratch, evolving basic mathematical operations into entire algorithms. While it can (re)discover several known modern machine-learning techniques, the process is expensive because it takes a long time to evolve such algorithms. On the other hand, by encoding elements of human knowledge, AutoML systems can become very efficient (Liu, Simonyan and Yang, 2019) but also less able to think outside the box of expert knowledge. Consequently, they are less likely to generalise to new scientific problems. As such, AutoML systems need to be imbued with assumptions that are right for each scientific problem. This can be done by embedding prior knowledge expressed by human experts (Souza et al., 2021). However, the systems could also learn directly from empirical data on the performance of AI models gathered across many scientific problems, as discussed next.
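The idea of embedding expert prior knowledge into the search (as in Souza et al., 2021) can be illustrated with a deliberately simplified sketch: instead of full Bayesian optimisation, candidates are drawn from an expert's prior distribution rather than uniformly. The objective function and all numbers are toy stand-ins:

```python
import random

random.seed(0)

# Toy objective: validation error as a function of log10(learning rate).
# A stand-in for actually training a model; the optimum sits near -2.
def validation_error(log_lr):
    return (log_lr + 2.0) ** 2

# Search = evaluate a fixed budget of sampled candidates, keep the best.
def search(sampler, budget=20):
    candidates = [sampler() for _ in range(budget)]
    return min(candidates, key=validation_error)

# Fully general search: uniform over the whole range [-6, 0].
uniform_best = search(lambda: random.uniform(-6.0, 0.0))

# Expert-guided search: a Gaussian prior encoding the belief that
# "learning rates around 1e-2 usually work well".
prior_best = search(lambda: random.gauss(-2.0, 0.5))
```

The prior-guided search concentrates its budget near promising regions and typically finds a better configuration with the same number of evaluations; the flip side, as noted above, is that a wrong prior would steer the search away from the true optimum.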

Such advances in self-learning AutoML are accelerated by the emergence of open AI data platforms, such as OpenML (Vanschoren et al., 2013). As illustrated in Figure 2, such platforms host or index many datasets representing different scientific problems. For each dataset, one can look up the best models trained on it, together with the neural architectures they use and the best ways to pre-process the data.

Such platforms also make these data easily available through graphical, as well as programmatic, interfaces. In this way, both AutoML systems and human experts alike can use the information. For instance, researchers can look up which models work well on similar tasks and use them as a starting point.
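The core mechanism can be sketched as a minimal in-memory "model memory": every time a team finds a good model for a dataset, it is shared, and anyone can later look up the top performers as starting points. The class and method names below are hypothetical illustrations, not the OpenML API:

```python
from collections import defaultdict

class ModelMemory:
    """Minimal sketch of a shared registry of results per dataset."""

    def __init__(self):
        self._results = defaultdict(list)  # dataset -> [(score, model description)]

    def share(self, dataset, model_desc, score):
        # Called whenever a team publishes a new result for a dataset.
        self._results[dataset].append((score, model_desc))

    def best_models(self, dataset, k=3):
        # Look up the top-k known models for a dataset, best first.
        return sorted(self._results[dataset], reverse=True)[:k]

memory = ModelMemory()
memory.share("protein-folding", "transformer + physics constraints", 0.92)
memory.share("protein-folding", "plain CNN", 0.71)
memory.share("gene-expression", "gradient-boosted trees", 0.88)

print(memory.best_models("protein-folding", k=1))
# -> [(0.92, 'transformer + physics constraints')]
```

A real platform adds exactly what this sketch omits: persistent storage, standardised evaluation so scores are comparable, and programmatic access for both humans and AutoML systems.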

When new models are found for new tasks, they can also be shared on the platform, thus creating a collective AI memory. Much like global databases of genetic sequences or astronomical observations, information on how to build AI models can (and should) be collected, placed online and structured with tools that help accelerate AI-driven science (Nielsen, 2012).

How can this global memory be leveraged effectively to find the best AI models for a given new problem? This is called “learning how to learn” or “meta-learning”. Meta-learning allows learning across scientific problems, transferring that learning to similar problems and using all this to design models for new problems (semi-automatically). As illustrated in Figure 3, different scenarios call for distinct approaches.

First, the left panel shows a set of related scientific problems (e.g. various medical image analysis problems, here simply shown as dots). For each of these problems, the best-performing models can be identified. This, in turn, suggests which kinds of models should be considered first for future problems. Learning which neural network models work best for various medical image segmentation problems, for example, would enable solving future problems faster (He et al., 2021). This can be automated further by parameterising some aspect of the model (e.g. which neural layers to use) and then learning what works well across all problems (Elsken et al., 2020).
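This first approach amounts to learning a simple prior over model families from past outcomes. A minimal sketch, with entirely hypothetical problem names and winners:

```python
from collections import Counter

# Hypothetical record of which model family performed best on each prior
# problem (e.g. medical image analysis tasks), as in the left panel of
# Figure 3. Names and outcomes are illustrative.
best_model_per_problem = {
    "ct-liver": "u-net",
    "mri-brain": "u-net",
    "xray-lung": "resnet",
    "ct-kidney": "u-net",
    "ultrasound-heart": "transformer",
}

# The learnt prior: on a future, similar problem, try the historically
# most successful model families first.
priority = [family for family, _ in
            Counter(best_model_per_problem.values()).most_common()]
print(priority)  # -> ['u-net', 'resnet', 'transformer']
```

Even this crude tally shortens the search on a new problem, because the first candidates tried are the ones most likely to work.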

Second, as shown in the centre panel, examining the problems themselves and extracting their key properties can help construct a (metric) space in which similar problems are close to each other and dissimilar problems are far away from each other. For instance, for problems concerning image data, the similarity of the images in two tasks could be measured (Alvarez-Melis and Fusi, 2020). Problems with similar kinds of images will then also be deemed similar. Given a new problem, information can be transferred, e.g. by recommending the best models known for the most similar prior problems (Feurer et al., 2015). Figure 3 shows how, when given a new problem, the most similar prior problems (i.e. the green, yellow and light blue dots) can be identified. The best models for old problems are likely to work well on the new problem as well.
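The second, similarity-based approach can be sketched as nearest-neighbour lookup in a meta-feature space. The meta-features, problem names and recommended models below are hypothetical placeholders for real dataset characteristics:

```python
import math

# Hypothetical meta-features describing each prior problem (e.g. number
# of classes, image resolution, dataset size -- normalised to [0, 1]),
# paired with the best model known for that problem.
prior_problems = {
    "mri-brain": ([0.2, 0.9, 0.4], "u-net"),
    "xray-lung": ([0.8, 0.3, 0.7], "resnet"),
    "ct-liver":  ([0.3, 0.8, 0.5], "u-net"),
}

def distance(a, b):
    # Euclidean distance in meta-feature space: small distance = similar problems.
    return math.dist(a, b)

def recommend(new_problem_features):
    # Find the most similar prior problem and recommend its best model.
    nearest = min(prior_problems,
                  key=lambda name: distance(prior_problems[name][0],
                                            new_problem_features))
    return prior_problems[nearest][1]

print(recommend([0.25, 0.85, 0.45]))  # -> 'u-net'
```

In practice the distance function is the hard part; measuring it directly on the data, as in optimal-transport dataset distances (Alvarez-Melis and Fusi, 2020), is one principled option.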

Finally, as shown in the rightmost panel, a meta-model can be trained to predict which models to try on a given new problem. Most such meta-models go through multiple cycles, iteratively refining the model architecture to work optimally for the new problem (Robles and Vanschoren, 2019; Chen et al., 2022). Other meta-models learn how to transform the data to make them easier to model (Olier et al., 2021).
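The third approach can be sketched as a surrogate meta-model: a regressor trained on (task meta-features, model configuration) → observed performance, then used to rank candidate configurations for a new task. The data here are synthetic and the set-up is a simplification of the cited methods; it uses scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic meta-dataset: each row pairs a task description with a model
# configuration; each target is the performance that pairing achieved.
# (In practice these records would come from a platform such as OpenML.)
n_records = 200
task_features = rng.uniform(size=(n_records, 3))
model_configs = rng.uniform(size=(n_records, 2))  # e.g. depth, learning rate
X_meta = np.hstack([task_features, model_configs])
# Hypothetical ground truth: performance depends on the interaction
# between task properties and configuration, plus noise.
y_meta = -((model_configs[:, 0] - task_features[:, 0]) ** 2) \
         + rng.normal(0.0, 0.01, n_records)

# The meta-model learns to predict performance from (task, config) pairs.
meta_model = RandomForestRegressor(n_estimators=50, random_state=0)
meta_model.fit(X_meta, y_meta)

# For a new task, score candidate configurations and try the best first.
new_task = np.array([0.7, 0.4, 0.9])
candidates = rng.uniform(size=(100, 2))
scores = meta_model.predict(np.hstack([np.tile(new_task, (100, 1)),
                                       candidates]))
best_config = candidates[np.argmax(scores)]
```

An iterative variant would evaluate `best_config` on the real task, add the result to the meta-dataset and repeat, mirroring the refinement cycles described above.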

Automating AI has significant potential to accelerate scientific progress, but so far it has only scratched the surface of what is possible. Fully realising this potential will require co-operation between AI experts, domain scientists and policy makers.

The AutoML community and scientific communities should work more closely together. While it is generally known which families of models work well for certain types of data, redesigning and tuning models for new scientific problems still requires massive human effort. AutoML can help reduce this effort significantly. However, most AutoML researchers only evaluate their methods on specific performance benchmarks (Gijsbers et al., 2022), instead of on scientific problems where they could have much more impact. To address this issue, challenges around AutoML for science could be organised, or research that directly applies AutoML research in AI-driven sciences could be funded.

On a larger scale, support should be given for the development of open AI data platforms – such as OpenML – that track which AI models work best for a wide range of problems. While these platforms are already having an impact in AI research, public support is needed to make them easier to use across many scientific fields, and to ensure long-term availability and reliability. For instance, interlinking scientific data infrastructure would connect the latest scientific datasets to the best AI models known for that data in an easily accessible way. Moreover, AutoML could help find these models, and AutoML systems could even be trained on all these data to obtain better models that help solve new scientific problems. In the past, agreements around rapid public sharing of genome data – the Bermuda principles – led to the creation of global genome databases that now play a critical role in research. Doing the same for AI models, building databases of the best AI models for all kinds of scientific problems, could dramatically facilitate their use to accelerate science.

Moreover, to create new incentives for scientists, such platforms could track dataset and model re-use, much like existing paper citation tracking services. That way, people would get proper credit for sharing datasets and AI models. Setting this up requires public funding. The investment entailed would be both quite small overall and well worthwhile.

AutoML methods need to become more holistic. To become true AI assistants, they need to be better at verifying and explaining the models they find to scientists. They also need to interact efficiently with domain scientists. For instance, it should be easy to define multiobjective metrics, add scientifically inspired constraints, perform safety checks and generally allow scientists to track what kind of models the system is coming up with. This would allow scientists to adjust the AutoML system’s trajectory at any time.
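The interaction described above, in which scientists define multi-objective metrics and safety constraints that steer the search, can be sketched concretely. The candidate models, metric names and threshold below are all hypothetical:

```python
# Hypothetical candidate models proposed by an AutoML system, each scored
# on several metrics a domain scientist might care about.
candidates = [
    {"name": "model-a", "accuracy": 0.94, "interpretability": 0.2, "max_error": 0.30},
    {"name": "model-b", "accuracy": 0.91, "interpretability": 0.8, "max_error": 0.12},
    {"name": "model-c", "accuracy": 0.89, "interpretability": 0.9, "max_error": 0.45},
]

# Scientist-defined safety constraint: reject any model whose worst-case
# error exceeds a domain-specific threshold.
SAFE_MAX_ERROR = 0.35
safe = [m for m in candidates if m["max_error"] <= SAFE_MAX_ERROR]

# Multi-objective preference: a weighted trade-off between accuracy and
# interpretability, with weights chosen by the scientist.
def score(m, w_acc=0.6, w_interp=0.4):
    return w_acc * m["accuracy"] + w_interp * m["interpretability"]

best = max(safe, key=score)
print(best["name"])  # -> 'model-b'
```

Note that the most accurate model is not selected: the constraint and the interpretability weight redirect the search, which is exactly the kind of steering a scientist should be able to apply at any time.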

Better incentives are needed for brilliant AI scientists and engineers to focus on solving large scientific challenges. Talent is scarce, and much of it is focused on problems that bring little long-term societal benefit. High-profile AI-driven labs could be created or supported to offer better career perspectives and sufficient compute resources. At the same time, the open release of datasets, models and infrastructure would surely help accelerate AI-driven scientific research. They may play a pivotal role in democratising AI itself.

AI clearly benefits science, but its true potential has not yet been reached. Since AI still relies largely on manually designed models, it requires extensive expertise and resources that many labs cannot easily obtain. Employing AI itself to solve this bottleneck can truly accelerate scientific discovery. This requires a data-driven approach to AI model discovery. Such an approach would collect data on which models work best for a large range of scientific problems. It would organise these data in online platforms that make them easily accessible. Finally, it would leverage AutoML techniques that learn from this experience and help scientists discover better models more quickly. Novel incentives for collecting data and sharing AI models can truly democratise AI and solve problems that benefit society, with machines and humans working together.


AI Pursuit by TAIR YouTube channel (19 October 2021), “Interpretable AutoML – powering the machine learning revolution in healthcare in the era of COVID-19”.

Alvarez-Melis, D. and N. Fusi (2020), “Geometric dataset distances via optimal transport”, arXiv, arXiv:2002.02923 [cs.LG].

Chen, Y. et al. (2022), “Towards learning universal hyperparameter optimizers with transformers”, arXiv, arXiv:2205.13320 [cs.LG].

Elsken, T. et al. (2020), “Meta-learning of neural architectures for few-shot learning”, arXiv, arXiv:1911.11090 [cs.LG].

Feurer, M. et al. (2015), “Efficient and robust automated machine learning”, in Advances in Neural Information Processing Systems 28, pp. 2962-2970.

Gijsbers, P. et al. (2022), “AMLB: An AutoML benchmark”, arXiv, arXiv:2207.12560 [cs.LG].

He, Y. et al. (2021), “DiNTS: Differentiable neural network topology search for 3D medical image segmentation”, arXiv, arXiv:2103.15954 [cs.CV].

Institute for Ethics in AI Oxford YouTube channel (13 July 2022), “Using AI to accelerate scientific discovery”.

Jumper, J.M. et al. (2021), “Highly accurate protein structure prediction with AlphaFold”, Nature, Vol. 596, pp. 583-589.

Liu, H., K. Simonyan and Y. Yang (2019), “DARTS: Differentiable architecture search”, arXiv, arXiv:1806.09055 [cs.LG].

Nielsen, M. (2012), Reinventing Discovery: The New Era of Networked Science, Princeton University Press.

Olier, I. et al. (2021), “Transformational machine learning: Learning how to learn from many related scientific problems”, Proceedings of the National Academy of Sciences, Vol. 118/49.

Real, E. et al. (2020), “AutoML-Zero: Evolving machine learning algorithms from scratch”, arXiv, arXiv:2003.03384 [cs.LG].

Robles, J.G. and J. Vanschoren (2019), “Learning to reinforcement learn for neural architecture search”, arXiv, arXiv:1911.03769 [cs.LG].

Souza, A.L.F. et al. (2021), “Bayesian optimization with a prior for the optimum”, arXiv, arXiv:2006.14608 [cs.LG].

Talwalkar, A. (2018), “Toward the jet age of machine learning”, 25 April, O’Reilly Media.

Tu, R. et al. (2022), “NAS-Bench-360: Benchmarking neural architecture search on diverse tasks”, arXiv, arXiv:2110.05668 [cs.CV].

Vanschoren, J. et al. (2013), “OpenML: Networked science in machine learning”, SIGKDD Explorations, Vol. 15/2, pp. 49-60.


© OECD 2023
