The present and future of AI

Finale Doshi-Velez on how AI is shaping our lives and how we can shape AI.


Finale Doshi-Velez, the John L. Loeb Professor of Engineering and Applied Sciences. (Photo courtesy of Eliza Grinnell/Harvard SEAS)

How has artificial intelligence changed and shaped our world over the last five years? How will AI continue to impact our lives in the coming years? Those were the questions addressed in the most recent report from the One Hundred Year Study on Artificial Intelligence (AI100), an ongoing project hosted at Stanford University that will study the status of AI technology and its impacts on the world over the next 100 years.

The 2021 report is the second in a series that will be released every five years until 2116. Titled “Gathering Strength, Gathering Storms,” the report explores the various ways AI is increasingly touching people’s lives in settings that range from movie recommendations and voice assistants to autonomous driving and automated medical diagnoses.

Barbara Grosz, the Higgins Research Professor of Natural Sciences at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS), is a member of the standing committee overseeing the AI100 project, and Finale Doshi-Velez, Gordon McKay Professor of Computer Science, is part of the panel of interdisciplinary researchers who wrote this year’s report.

We spoke with Doshi-Velez about the report, what it says about the role AI is currently playing in our lives, and how it will change in the future.  

Q: Let's start with a snapshot: What is the current state of AI and its potential?

Doshi-Velez: Some of the biggest changes in the last five years have been how well AIs now perform in large data regimes on specific types of tasks.  We've seen [DeepMind’s] AlphaZero become the best Go player entirely through self-play, and everyday uses of AI such as grammar checks and autocomplete, automatic personal photo organization and search, and speech recognition become commonplace for large numbers of people.  

In terms of potential, I'm most excited about AIs that might augment and assist people.  They can be used to drive insights in drug discovery, help with decision making such as identifying a menu of likely treatment options for patients, and provide basic assistance, such as lane keeping while driving or text-to-speech based on images from a phone for the visually impaired.  In many situations, people and AIs have complementary strengths. I think we're getting closer to unlocking the potential of people and AI teams.


Q: Over the course of 100 years, these reports will tell the story of AI and its evolving role in society. Even though there have only been two reports, what's the story so far?

There's actually a lot of change even in five years. The first report is fairly rosy. For example, it mentions how algorithmic risk assessments may mitigate the human biases of judges. The second has a much more mixed view. I think this comes from the fact that as AI tools have come into the mainstream — both in higher-stakes and everyday settings — we are appropriately much less willing to tolerate flaws, especially discriminatory ones. There have also been questions of information and disinformation control as people get their news, social media, and entertainment via searches and rankings personalized to them. So, there's a much greater recognition that we should not be waiting for AI tools to become mainstream before making sure they are ethical.

Q: What is the responsibility of institutes of higher education in preparing students and the next generation of computer scientists for the future of AI and its impact on society?

First, I'll say that the need to understand the basics of AI and data science starts much earlier than higher education!  Children are being exposed to AIs as soon as they click on videos on YouTube or browse photo albums. They need to understand aspects of AI such as how their actions affect future recommendations.

But for computer science students in college, I think a key thing that future engineers need to realize is when to demand input and how to talk across disciplinary boundaries to get at often difficult-to-quantify notions of safety, equity, fairness, etc.  I'm really excited that Harvard has the Embedded EthiCS program to provide some of this education.  Of course, this is an addition to standard good engineering practices like building robust models, validating them, and so forth, which is all a bit harder with AI.


Q: Your work focuses on machine learning with applications to healthcare, which is also an area of focus of this report. What is the state of AI in healthcare? 

A lot of AI in healthcare has been on the business end, used for optimizing billing, scheduling surgeries, that sort of thing.  When it comes to AI for better patient care, which is what we usually think about, there are few legal, regulatory, and financial incentives to do so, and many disincentives. Still, there's been slow but steady integration of AI-based tools, often in the form of risk scoring and alert systems.

In the near future, two applications that I'm really excited about are triage in low-resource settings — having AIs do initial reads of pathology slides, for example, if there are not enough pathologists, or provide an initial check of whether a mole looks suspicious — and ways in which AIs can help identify promising treatment options for discussion with a clinician team and patient.

Q: Any predictions for the next report?

I'll be keen to see where currently nascent AI regulation initiatives have gotten to. Accountability is such a difficult question in AI; it's tricky to nurture both innovation and basic protections. Perhaps the most important innovation will be in approaches for AI accountability.




Artificial intelligence in Finance: a comprehensive review through bibliometric and content analysis

  • Open access
  • Published: 20 January 2024
  • Volume 4, article number 23 (2024)


  • Salman Bahoo
  • Marco Cucculelli
  • Xhoana Goga
  • Jasmine Mondolo


Abstract

Over the past two decades, artificial intelligence (AI) has experienced rapid development and is being used in a wide range of sectors and activities, including finance. In the meantime, a growing and heterogeneous strand of literature has explored the use of AI in finance. The aim of this study is to provide a comprehensive overview of the existing research on this topic and to identify which research directions need further investigation. Accordingly, using the tools of bibliometric analysis and content analysis, we examined a large number of articles published between 1992 and March 2021. We find that the literature on this topic has expanded considerably since the beginning of the XXI century, covering a variety of countries and different AI applications in finance, amongst which Predictive/forecasting systems, Classification/detection/early warning systems and Big data Analytics/Data mining/Text mining stand out. Furthermore, we show that the selected articles fall into ten main research streams, in which AI is applied to the stock market, trading models, volatility forecasting, portfolio management, performance, risk and default evaluation, cryptocurrencies, derivatives, credit risk in banks, investor sentiment analysis and foreign exchange management, respectively. Future research should seek to address the partially unanswered research questions and improve our understanding of the impact of recent disruptive technological developments on finance.


Introduction

The first two decades of the twenty-first century have witnessed an unprecedented wave of technological progress, driven by advances in the development of cutting-edge digital technologies and applications in Artificial Intelligence (AI). Artificial intelligence is a field of computer science that creates intelligent machines capable of performing cognitive tasks, such as reasoning, learning, taking action and speech recognition, which have traditionally been regarded as human tasks (Frankenfield 2021). AI comprises a broad and rapidly growing number of technologies and fields, and is often regarded as a general-purpose technology, namely a technology that becomes pervasive, improves over time and generates complementary innovation (Bresnahan and Trajtenberg 1995). As a result, it is not surprising that there is no consensus on the way AI is defined (Van Roy et al. 2020). An exhaustive definition has recently been proposed by Acemoglu and Restrepo (2020, p. 1), who assert that Artificial Intelligence is “(…) the study and development of intelligent (machine) agents, which are machines, software or algorithms that act intelligently by recognising and responding to their environment.” Even though it is often difficult to draw precise boundaries, this promising and rapidly evolving field mainly comprises machine learning, deep learning, NLP (natural language processing) platforms, predictive APIs (application programming interfaces), image recognition and speech recognition (Martinelli et al. 2021).

The term “Artificial Intelligence” was first coined by John McCarthy in 1956 during a conference at Dartmouth College to describe “thinking machines” (Buchanan 2019). However, until 2000, the lack of storage capability and low computing power prevented any progress in the field. Accordingly, governments and investors lost interest, and AI fell short of financial support and funding in 1974–1980 and again in 1987–1993. These periods of funding shortage are also known as “AI winters”.

However, the most significant development and spread of AI-related technologies is much more recent, and has been prompted by the availability of large unstructured databases, the explosion of computing power, and the rise in venture capital intended to support innovative, technological projects (Ernst et al. 2018). One of the most distinctive characteristics of AI technologies is that, unlike industrial robots, which need to receive specific instructions, generally provided by software, before they perform any action, they can learn for themselves how to map information about the environment, such as visual and tactile data from a robot’s sensors, into instructions sent to the robot’s actuators (Raj and Seamans 2019). Additionally, as remarked by Ernst et al. (2018), whilst industrial robots mostly perform manual tasks, AI technologies are able to carry out activities that, until some years ago, were still regarded as typically human, i.e. what Ernst and co-authors label as “mental tasks”.

The adoption of AI is likely to have remarkable implications for the subjects adopting it and, more generally, for the economy and society. In particular, it is expected to contribute to the growth of global GDP, which, according to a study conducted by PricewaterhouseCoopers (PwC) and published in 2017, is likely to increase by up to 14% by 2030. Moreover, companies adopting AI technologies sometimes report better performance (Van Roy et al. 2020). Concerning the geographic dimension of this field, North America and China are the leading investors and are expected to benefit the most from AI-driven economic returns. Europe and emerging markets in Asia and South America will follow, with moderate profits owing to fewer and later investments (PwC 2017). AI is going to affect labour markets as well. The demand for high-skilled employees is expected to increase, whilst the demand for low-skilled jobs is likely to shrink because of automation; the resulting higher unemployment rate, however, is going to be offset by the new job opportunities offered by AI (Ernst et al. 2018; Acemoglu and Restrepo 2020).

AI solutions have been introduced in every major sector of the economy; a sector that is witnessing a profound transformation led by the ongoing technological revolution is the financial one. Financial institutions, which rely heavily on Big Data and process automation, are indeed in a “unique position to lead the adoption of AI” (PwC 2020), which generates several benefits: for instance, it encourages automation of manufacturing processes which in turn enhances efficiency and productivity. Next, since machines are immune to human errors and psychological factors, it ensures accurate and unbiased predictive analytics and trading strategies. AI also fosters business model innovation and radically changes customer relationships by promoting customised digital finance, which, together with the automation of processes, results in better service efficiency and cost-saving (Cucculelli and Recanatini 2022). Furthermore, AI is likely to have substantial implications for financial conduct and prudential supervisors, and it also has the potential to help supervisors identify potential violations and help regulators better anticipate the impact of changes in regulation (Wall 2018). Additionally, complex AI/machine learning algorithms allow Fintech lenders to make fast (almost instantaneous) credit decisions, with benefits for both the lenders and the consumers (Jagtiani and John 2018). Intelligent devices in Finance are used in a number of areas and activities, including fraud detection, algorithmic trading and high-frequency trading, portfolio management, credit decisions based on credit scoring or credit approval models, bankruptcy prediction, risk management, behavioural analyses through sentiment analysis and regulatory compliance.

In recent years, the adoption of AI technologies in a broad range of financial applications has received increasing attention from scholars; however, the extant literature, which is reviewed in the next section, is quite broad and heterogeneous in terms of research questions, country and industry under scrutiny, level of analysis and method, making it difficult to draw robust conclusions and to understand which research areas require further investigation. In the light of these considerations, we conduct an extensive review of the research on the use of AI in Finance, through which we aim to provide a comprehensive account of the current state of the art and, importantly, to identify a number of research questions that are still (partly) unanswered. This survey may serve as a useful roadmap for researchers who are not experts on this topic and could find it challenging to navigate the extensive and composite research on this subject. In particular, it may represent a useful starting point for future empirical contributions, as it provides an account of the state of the art and of the issues that deserve further investigation. In doing so, this study complements some previous systematic reviews on the topic, such as the ones recently conducted by Hentzen et al. (2022b) and Biju et al. (2020), which differ from our work in the following main respects: Hentzen and co-authors’ study focuses on customer-facing financial services, whilst the valuable contribution of Biju et al. pays particular attention to relevant technical aspects and the assessment of the effectiveness and the predictive capability of machine learning, AI and deep learning mechanisms within the financial sphere; in doing so, it covers an important issue which, however, is outside the scope of our work.

From our review, it emerges that, from the beginning of the XXI century, the literature on this topic has significantly expanded, and has covered a broad variety of countries, as well as several AI applications in finance, amongst which Predictive/forecasting systems, Classification/detection/early warning systems and Big data Analytics/Data mining/Text mining stand out. Additionally, we show that the selected articles can be grouped into ten main research streams, in which AI is applied to the stock market, trading models, volatility forecasting, portfolio management, performance, risk & default evaluation, cryptocurrencies, derivatives, credit risk in banks, investor sentiment analysis and foreign exchange management, respectively.

The balance of this paper is organised as follows. Sect. “Methodology” briefly presents the methodology. Sect. “A detailed account of the literature on AI in Finance” illustrates the main results of the bibliometric analysis and the content analysis. Sect. “Issues that deserve further investigation” draws upon the research streams described in the previous section to pinpoint several potential research avenues. Sect. “Conclusions” concludes. Finally, Appendix 1 clarifies some AI-related terms and definitions that appear several times throughout the paper, whilst Appendix 2 provides more information on some of the articles under scrutiny.

Methodology

To conduct a sound review of the literature on the selected topic, we resort to two well-known and extensively used approaches, namely bibliometric analysis and content analysis. Bibliometric analysis is a popular and rigorous method for exploring and analysing large volumes of scientific data which allows us to unpack the evolutionary nuances of a specific field whilst shedding light on the emerging areas in that field (Donthu et al. 2021). In this study, we perform bibliometric analysis using HistCite, a popular software package developed to support researchers in elaborating and visualising the results of literature searches in the Web of Science platform. Specifically, we employ HistCite to recover the annual number of publications, the number of forward citations (which we use to identify the most influential journals and articles) and the network of co-citations, namely all the citations received and given by journals belonging to a certain field, which help us identify the major research streams described in Sect. “Identification of the major research streams”. After that, to delve into the contents of the most pertinent studies on AI in finance, we resort to traditional content analysis, a research method that provides a systematic and objective means to make valid inferences from verbal, visual, or written data and, in turn, to describe and quantify specific phenomena (Downe-Wambolt 1992).
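The co-citation counts that feed such a network can be computed in a few lines. The sketch below is purely illustrative, with hypothetical reference lists; HistCite's actual processing of Web of Science records is, of course, more involved.

```python
from collections import Counter
from itertools import combinations

# Two papers are "co-cited" when they appear together in a third paper's
# reference list; counting such pairs yields the co-citation network.
reference_lists = [
    ["A", "B", "C"],  # paper 1 cites A, B and C
    ["A", "B"],       # paper 2
    ["B", "C", "D"],  # paper 3
]

co_citations = Counter()
for refs in reference_lists:
    # sorted() gives each unordered pair a canonical key
    for pair in combinations(sorted(set(refs)), 2):
        co_citations[pair] += 1

# A and B appear together in two reference lists, so their co-citation
# count is 2; heavily co-cited pairs cluster into research streams.
```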

In order to identify the sample of studies on which bibliometric and content analysis were performed, we proceeded as follows. First, we searched for pertinent articles published in English between 1950 and March 2021. Specifically, we scrutinised the “Finance”, “Economics”, “Business Finance” and “Business” sections of the “Web of Science” (WoS) database using the keyword “finance” together with an array of keywords concerning Artificial Intelligence (i.e. “Finance” AND (“Artificial Intelligence” OR “Machine Learning” OR “Deep Learning” OR “Neural Networks*” OR “Natural Language Processing*” OR “Algorithmic Trading*” OR “Artificial Neural Network” OR “Robot*” OR “Automation” OR “Text Mining” OR “Data Mining” OR “Soft Computing” OR “Fuzzy Logic Analysis” OR “Biometrics*” OR “Geotagging” OR “Wearable*” OR “IoT” OR “Internet of Thing*” OR “digitalization” OR “Artificial Neutral Networks” OR “Big Data” OR “Industry 4.0” OR “Smart products*” OR “Cloud Computing” OR “Digital Technologies*”)). In doing so, we ended up with 1,218 articles. Next, two researchers independently analysed the title, abstract and content of these papers and kept only those that address the topic under scrutiny in a non-marginal and non-trivial way. This second step reduced the number of eligible papers to 892, which were used to perform the first part of the bibliometric analysis. Finally, we delved into the contents of the previously selected articles and identified 110 contributions which specifically address the adoption and implications in Finance of AI tools, focussing on the economic dimension of the topic; these are employed in the second part of the bibliometric analysis and in the content analysis.
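The Boolean screening step can be illustrated with a small sketch. The keyword list below is a simplified subset of the full query quoted above, the titles are invented, and the WoS trailing wildcard `*` is rendered as an optional regex suffix; the real search ran on the Web of Science platform, not on raw strings.

```python
import re

# Simplified subset of the AI-related keywords from the query above;
# "\w*" stands in for the WoS trailing wildcard.
AI_TERMS = [
    r"artificial intelligence", r"machine learning", r"deep learning",
    r"neural network\w*", r"natural language processing",
    r"algorithmic trading\w*", r"data mining", r"big data",
]
AI_PATTERN = re.compile("|".join(AI_TERMS), re.IGNORECASE)

def matches_query(text: str) -> bool:
    """True if the text mentions finance AND at least one AI-related term."""
    has_finance = re.search(r"finance", text, re.IGNORECASE) is not None
    return has_finance and AI_PATTERN.search(text) is not None

# Hypothetical titles, for illustration only
titles = [
    "Deep learning for volatility forecasting in finance",
    "Corporate governance and firm performance",
    "Machine learning credit scoring in consumer finance",
]
selected = [t for t in titles if matches_query(t)]  # keeps the 1st and 3rd
```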

A detailed account of the literature on AI in Finance

In this section, we explore the patterns and trends in the literature on AI in Finance in order to obtain a compact but exhaustive account of the state of the art. Specifically, we identify some relevant bibliographic characteristics using the tools of bibliometric analysis. After that, focussing on a sub-sample of papers, we conduct a preliminary assessment of the selected studies through a content analysis and detect the main AI applications in Finance. Finally, we identify and briefly describe ten major research streams.

Main results of the bibliometric analysis

First, using HistCite and considering the sample of 892 studies, we computed, for each year, the number of publications related to the topic “AI in Finance”. The corresponding publication trend is shown in Fig. 1, which plots both the annual absolute number of sampled papers (bar graph in blue) and the ratio between the latter and the annual overall number of publications (indexed in Scopus) in the finance area (line graph in orange). We also computed relative numbers to check that the emerging trend is not significantly attributable to a “common trend” (i.e. to the fact that, in the meantime, the total number of publications in the financial area has also significantly increased). It can be noted that both graphs exhibit a strong upward trend from 2015 onwards; during the most recent years, the pace of growth and the degree of pervasiveness of AI adoption in the financial sphere have indeed remarkably strengthened, and have become the subject of a rapidly growing number of research articles.

Figure 1: Publication trend, 1992–2021
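The normalisation behind the relative series can be shown in miniature: annual counts of sampled papers are divided by the annual totals for the whole finance area, so that growth in the topic is not mistaken for field-wide growth. The counts below are invented for illustration.

```python
# Hypothetical annual counts: papers on "AI in Finance" vs all finance papers
sampled = {2017: 40, 2018: 55, 2019: 80, 2020: 120}
all_finance = {2017: 8000, 2018: 8500, 2019: 9000, 2020: 9500}

# Relative publication share per year (the ratio plotted as a line graph)
relative = {year: sampled[year] / all_finance[year] for year in sampled}

# In this toy example both the absolute counts and the shares rise,
# so the upward trend is not merely a "common trend" of the field.
```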

After that, focussing on the more pertinent (110) articles, we checked the journals in which these studies were published. Table 1 presents the top-ten list of journals reported in the Academic Journal Guide-ABS List 2020 and ranked on the basis of the total global citation score (TGCS), which captures the number of times an article is cited by other articles that deal with the same topic and are indexed in the WoS database. For each journal, we also report the total number of studies published in that journal. We can notice that the most influential journals in terms of TGCS are the Journal of Finance (with a TGCS equal to 1283) and the Journal of Banking and Finance (with a TGCS of 1253), whilst the journals containing the highest number of articles on the topic are Quantitative Finance (68 articles) and Intelligent Systems in Accounting, Finance and Management (43).

Finally, Fig. 2 provides a visual representation of the citation-based relationships amongst papers, starting from the most-cited papers, which we obtained using the Java application CiteSpace.

Figure 2: Citation mapping and identification of the research streams. (Source: authors’ elaboration of data from Web of Science; visualisation produced using CiteSpace)

Preliminary results of the content analysis

In this paragraph, we briefly illustrate some relevant characteristics of our sub-sample made up of 110 studies, including country and industry coverage, method and underpinning theoretical background. Table 2 comprises the list of countries under scrutiny and, for each of them, the papers that perform their analysis on that country. We can see that our sample exhibits significant geographical heterogeneity, as it covers 74 countries across all continents; however, the most investigated areas are three: Europe, the US and China. These results corroborate the fact that the above-mentioned regions are the leaders of the AI-driven financial industry, as suggested by PwC (2017). The United States, in particular, is considered the “early adopter” of AI and is likely to benefit the most from this source of competitive advantage. More recently, emerging countries in Southeast Asia and the Middle East have received growing interest. Finally, a smaller number of papers address underdeveloped regions in Africa and various economies in South America.

The most investigated sectors are reported in Table  3 . We can notice that, although it primarily deals with banking and financial services, the extant research has addressed the topic in a vast array of industries. This confirms that the application potential of AI is very broad, and that any industry may benefit from it.

Through our analysis, we also detected the key theories and frameworks applied by researchers in the prior literature. As shown in Table 4, 73 (out of 110) papers explicitly refer to some theoretical framework. Specifically, ten of them (14%) resort to computational learning theory; this theory, which is an extension of statistical learning, provides researchers with a theoretical guide for finding the most suitable learning model for a given problem, and is regarded as one of the most important and most used theories in the field. Specific theories concerning types of neural networks and learning methods are used too, such as fuzzy set theory, which is mentioned in 8% of the sample, and, to a lesser extent, the Naive Bayes theorem, the theory of neural networks, the theory of genetic programming and the TOPSIS analytical framework. Finance theories (e.g. Arbitrage Pricing Theory; Black and Scholes 1973) are jointly employed with portfolio management theories (e.g. modern portfolio theory), and together they account for 21% (15) of the total number of papers. Finally, bankruptcy theories support business failure forecasts, whilst other theoretical underpinnings concern mathematical and probability concepts.

The content analysis also provides information on the main types of companies under scrutiny. Table 5 indicates that 30 articles (out of 110) focus on large companies listed on stock exchanges, whilst only 16 studies cover small and medium enterprises. Similarly, trading and digital platforms are examined in 16 papers that deal with derivatives and cryptocurrencies.

Furthermore, Table 6 summarises the key methods applied in the literature, divided by category (note that all the papers employ more than one method). Looking at the table, we see that machine learning and artificial neural networks are the most popular ones (employed in 41 and 51 articles, respectively). The majority of the papers resort to different approaches to compare their results with those obtained through autoregressive and regression models or conventional statistics, which are used as the benchmark; therefore, there may be some overlaps. Nevertheless, we notice that support vector machines and random forests are the most widespread machine learning methods. On the other hand, the use of artificial neural networks (ANNs) is highly fragmented. Backpropagation, recurrent and feed-forward NNs are considered basic neural nets and are commonly employed. Advanced NNs, such as the Higher-Order Neural Network (HONN) and Long Short-Term Memory (LSTM) networks, perform better than their standard versions but are also much more complicated to apply. These methods are usually compared to autoregressive models and regressions, such as ARMA, ARIMA and GARCH. Finally, we observe that almost all the sampled papers are quantitative, whilst only three of them are qualitative and four consist of literature reviews.
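The benchmarking protocol these comparisons rely on can be sketched in a few lines: fit a model on a training window, produce one-step-ahead forecasts on held-out data, and compare errors against a simple baseline. The sketch below uses a least-squares AR(1) as a stand-in for the learned model, a 5-period moving average as the baseline, and a synthetic series; real studies fit ARMA/ARIMA/GARCH benchmarks and far richer models.

```python
# Minimal walk-forward benchmarking sketch with a synthetic series.

def fit_ar1(series):
    """Least-squares fit of x[t] = a + b * x[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    b = cov / var
    return my - b * mx, b

def mse(preds, actual):
    return sum((p - t) ** 2 for p, t in zip(preds, actual)) / len(preds)

# Synthetic near-AR(1) series with a small deterministic oscillation
series = [1.0]
for t in range(1, 200):
    series.append(0.5 + 0.8 * series[-1] + 0.05 * ((-1) ** t))

train, test = series[:150], series[150:]
a, b = fit_ar1(train)

ar1_preds = [a + b * prev for prev in series[149:199]]          # one step ahead
ma_preds = [sum(series[i - 5:i]) / 5 for i in range(150, 200)]  # baseline

mse_ar1 = mse(ar1_preds, test)
mse_ma = mse(ma_preds, test)
```

The same walk-forward loop applies unchanged if the AR(1) is swapped for any of the learned models mentioned above.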

A taxonomy of AI applications in Finance

After scrutinising some relevant features of the papers, we take a step forward and outline a taxonomy of the AI applications used in Finance and tackled by previous literature. The main uses of AI in Finance and the papers that address each of them are summarised in Table 7.

Many research papers (39 out of 110) employ AI as a predictive instrument for forecasting stock prices, performance and volatility. In 23 papers, AI is employed in classification problems and warning systems to detect credit risk and frauds, as well as to monitor firm or bank performance. The former use of AI permits researchers to classify firms into two categories based on qualitative and quantitative data: for example, distressed or non-distressed, viable or non-viable, bankrupt or non-bankrupt, financially healthy or not healthy, good or bad, and fraud or not fraud. Warning systems follow a similar principle: after analysing customers’ financial behaviour and classifying potential fraud issues in bank accounts, alert models flag unusual transactions to the bank. Additionally, we see that 14 articles employ text mining, data mining and language recognition (i.e. natural language processing), as well as sentiment analysis. This may be the starting point of AI-driven behavioural analysis in Finance. Amongst others, trading models and algorithmic trading are further popular aspects of AI widely analysed in the literature. Moreover, interest in robo-advisory is growing in the asset investment field. Finally, less studied AI applications concern the modelling capability of algorithms and traditional machine learning and neural networks.
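The binary setup described above can be made concrete with a minimal sketch. The two ratios, the firm data and the nearest-centroid rule are all hypothetical stand-ins for the richer feature sets and classifiers (SVMs, neural networks) used in the literature.

```python
# Toy bankrupt / non-bankrupt classification on two hypothetical ratios:
# (leverage, profitability). A nearest-centroid rule stands in for the
# ML classifiers applied in the surveyed papers.

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))

def classify(firm, centroids):
    """Assign the label of the nearest class centroid (squared distance)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(firm, centroids[label]))

# Hypothetical training data: (leverage, profitability)
bankrupt = [(0.90, -0.10), (0.80, -0.05), (0.95, -0.20)]
healthy = [(0.30, 0.12), (0.40, 0.08), (0.20, 0.15)]
centroids = {"bankrupt": centroid(bankrupt), "healthy": centroid(healthy)}

classify((0.85, -0.07), centroids)  # high leverage, losses -> "bankrupt"
classify((0.35, 0.10), centroids)   # low leverage, profits -> "healthy"
```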

Identification of the major research streams

Drawing upon the co-citation analysis mentioned in Sect. “Methodology”, we detected ten main research streams: (1) AI and the Stock Market; (2) AI and Trading Models; (3) AI and Volatility Forecasting; (4) AI and Portfolio Management; (5) AI and Performance, Risk, and Default Valuation; (6) AI and Bitcoin/Cryptocurrencies; (7) AI and Derivatives; (8) AI and Credit Risk in Banks; (9) AI and Investor Sentiment Analysis; (10) AI and Foreign Exchange Management. Some research streams can be further divided into sub-streams as they deal with various aspects of the same main topic. In this section, we provide a compact account of each of the aforementioned research streams. More detailed information on some of the papers fuelling them is provided in Appendix 2.

Stream 01: AI and the stock market

The stream “AI and the Stock Market” comprises two sub-streams, namely algorithmic trading and the stock market, and AI and stock price prediction. The first sub-stream deals with the impact of algorithmic trading (AT) on financial markets. In this regard, Hendershott et al. (2011) argue that AT increases market liquidity by reducing spreads, adverse selection, and trade-related price discovery. This results in a lower cost of equity for listed firms in the medium to long term, especially in emerging markets (Litzenberger et al. 2012). As opposed to human traders, algorithmic trading adjusts faster to information and generates higher profits around news announcements thanks to better market timing ability and rapid execution (Frino et al. 2017). Even though high-frequency trading (a subset of algorithmic trading) has sometimes increased volatility related to news or fundamentals, and transmitted it within and across industries, AT has overall reduced return volatility and improved market efficiency (Kelejian and Mukerji 2016; Litzenberger et al. 2012).

The second sub-stream investigates the use of neural networks and traditional methods to forecast stock prices and asset performance. ANNs are preferred to linear models because they capture the non-linear relationships between stock returns and fundamentals and are more sensitive to changes in the relationships between variables (Kanas 2001; Qi 1999). Dixon et al. (2017) argue that deep neural networks have strong predictive power, with an accuracy rate of 68%. Also, Zhang et al. (2021) propose a Long Short-Term Memory (LSTM) network that outperforms all classical ANNs in terms of prediction accuracy and reasonable time cost, especially when various proxies of online investor attention (such as internet search volume) are considered.
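The appeal of ANNs over linear models can be sketched with a deliberately small example: a single-hidden-layer network in NumPy (far simpler than the LSTM of Zhang et al.) trained on a synthetic, mildly non-linear return series. The data-generating process, network size, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns with a mild non-linear autoregressive signal
n = 600
r = np.zeros(n)
for t in range(1, n):
    r[t] = 0.3 * np.tanh(2.0 * r[t - 1]) + 0.1 * rng.standard_normal()

# Supervised pairs: predict r[t] from r[t-1]
X, y = r[:-1].reshape(-1, 1), r[1:]

# One hidden layer (tanh), trained by full-batch gradient descent
W1 = rng.standard_normal((1, 8)) * 0.5
b1 = np.zeros(8)
W2 = rng.standard_normal((8, 1)) * 0.5
b2 = np.zeros(1)

def forward(X):
    H = np.tanh(X @ W1 + b1)
    return H, (H @ W2 + b2).ravel()

_, pred0 = forward(X)
mse0 = np.mean((pred0 - y) ** 2)  # error before training

lr = 0.05
for _ in range(2000):
    H, pred = forward(X)
    err = (pred - y).reshape(-1, 1) / len(y)  # dMSE/dpred (factor 2 absorbed in lr)
    gW2, gb2 = H.T @ err, err.sum(0)
    dH = err @ W2.T * (1 - H ** 2)            # backprop through tanh
    gW1, gb1 = X.T @ dH, dH.sum(0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred1 = forward(X)
mse1 = np.mean((pred1 - y) ** 2)  # error after training
print(mse0, mse1)
```

The network's tanh units let it bend the fitted curve where the return dynamics are non-linear, which is exactly the property the surveyed papers credit ANNs with.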

Stream 02: AI and trading models

From the review of the literature represented by this stream, it emerges that neural networks and machine learning algorithms are used to build intelligent automated trading systems. To give some examples, Creamer and Freund (2010) create a machine learning-based model that analyses stock price series and then selects the best-performing assets by suggesting a short or long position. The model is also equipped with a risk management overlay that prevents the transaction when the trading strategy is not profitable. Similarly, Creamer (2012) applies the same logic to high-frequency futures trading: the model selects the most profitable and least risky futures by sending a long or short recommendation. To construct an efficient trading model, Trippi and DeSieno (1992) combine several neural networks into a single decision rule system that outperforms the individual neural networks; Kercheval and Zhang (2015) use a supervised learning method (i.e. a multi-class SVM) that automatically predicts mid-price movements in high-frequency limit order books by classifying them as downward, stationary, or upward; these predictions are embedded in trading strategies and yield positive payoffs with controlled risk.
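The long/short recommendations issued by such systems can be caricatured with a deliberately simple rule, a moving-average crossover, which is far cruder than the boosting and SVM models cited above; the window lengths and the synthetic price path are illustrative assumptions.

```python
import numpy as np

def long_short_signals(prices, fast=5, slow=20):
    """Toy trading model: go long when the fast moving average of past
    prices is above the slow one, short when below, flat while the
    history is too short. Only past prices are used (no lookahead)."""
    prices = np.asarray(prices, dtype=float)
    signals = np.zeros(len(prices), dtype=int)
    for t in range(slow, len(prices)):
        ma_fast = prices[t - fast:t].mean()
        ma_slow = prices[t - slow:t].mean()
        signals[t] = 1 if ma_fast > ma_slow else -1
    return signals

# Rising then falling synthetic price path
prices = np.concatenate([np.linspace(100, 120, 60), np.linspace(120, 90, 60)])
sig = long_short_signals(prices)
print(sig[55], sig[115])  # long in the uptrend, short in the downtrend
```

A risk overlay of the kind Creamer and Freund describe would sit on top of such signals, vetoing trades whose expected profitability is negative.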

Stream 03: AI and volatility forecasting

The third stream deals with AI and the forecasting of volatility. The volatility index (VIX) from the Chicago Board Options Exchange (CBOE) is a measure of market sentiment and expectations. Forecasting volatility is not a simple task because of its very persistent nature (Fernandes et al. 2014). According to Fernandes and co-authors, the VIX is negatively related to the S&P 500 index return and positively related to its volume. The heterogeneous autoregressive (HAR) model yields the best predictive results, as opposed to classical neural networks (Fernandes et al. 2014; Vortelinos 2017). Modern neural networks, such as LSTM and NARX (nonlinear autoregressive exogenous network), also qualify as valid alternatives (Bucci 2020). Another promising class of neural networks is the higher-order neural network (HONN), used to forecast the 21-day-ahead realised volatility of FTSE 100 futures. Thanks to its ability to capture higher-order correlations within the dataset, HONN shows remarkable performance in terms of statistical accuracy and trading efficiency over the multi-layer perceptron (MLP) and the recurrent neural network (RNN) (Sermpinis et al. 2013).
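The HAR model mentioned above regresses next-day realised volatility on daily, weekly, and monthly averages of past realised volatility. A self-contained sketch in NumPy, where the data are simulated from an assumed HAR process with illustrative coefficients and then re-estimated by OLS:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate realised volatility from a HAR process:
# RV_{t+1} = c + bd*RV_t + bw*mean(RV_{t-4..t}) + bm*mean(RV_{t-21..t}) + noise
c, bd, bw, bm = 0.02, 0.35, 0.30, 0.25   # illustrative coefficients
rv = np.full(1500, 0.1)
for t in range(22, 1499):
    rv[t + 1] = (c + bd * rv[t] + bw * rv[t - 4:t + 1].mean()
                 + bm * rv[t - 21:t + 1].mean() + 0.005 * rng.standard_normal())

# Build the HAR regressors (daily, weekly, monthly components) and fit by OLS
rows, y = [], []
for t in range(22, 1499):
    rows.append([1.0, rv[t], rv[t - 4:t + 1].mean(), rv[t - 21:t + 1].mean()])
    y.append(rv[t + 1])
X, y = np.array(rows), np.array(y)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # intercept and daily/weekly/monthly coefficients
```

The sum of the three slope coefficients measures the persistence that makes volatility hard to forecast; here it is 0.90 by construction, and OLS recovers it closely.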

Stream 04: AI and portfolio management

This research stream analyses the use of AI in portfolio selection. As an illustration, Soleymani and Vasighi (2020) consider a clustering approach paired with VaR analysis to improve asset allocation: they group the least risky and most profitable stocks and allocate them to the portfolio. More elaborate asset allocation designs incorporate a bankruptcy detection model and an advanced utility performance system: before adding a stock to the portfolio, a sophisticated neural network estimates the default probability of the company and the asset’s contribution to the optimal portfolio (Loukeris and Eleftheriadis 2015). Index tracking powered by deep learning technology minimises tracking error and generates positive performance (Kim and Kim 2020). Estimating return dependence with the asymmetric copula method further promotes the portfolio optimisation process (Zhao et al. 2018). To sum up, all papers show that AI-based prediction models improve the portfolio selection process by accurately forecasting stock returns (Zhao et al. 2018).
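The "screen by risk, then allocate" idea behind such designs can be sketched without any clustering machinery: rank synthetic assets by historical VaR and keep the least risky ones. The returns, VaR level, and equal-weighting rule are illustrative assumptions, not Soleymani and Vasighi's procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic daily returns for 6 assets with increasing volatility
vols = np.array([0.01, 0.015, 0.02, 0.03, 0.04, 0.05])
rets = rng.standard_normal((1000, 6)) * vols + 0.0003

def historical_var(returns, level=0.95):
    """Historical 95% Value-at-Risk: the loss exceeded 5% of the time."""
    return -np.quantile(returns, 1 - level, axis=0)

var95 = historical_var(rets)

# Screen: keep the three least risky assets, then weight them equally
keep = np.argsort(var95)[:3]
weights = np.zeros(6)
weights[keep] = 1 / 3
print(sorted(keep.tolist()), weights)
```

A real implementation would replace the equal weights with an optimiser and the simple ranking with the clustering step the authors describe; the screening logic is the same.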

Stream 05: AI and performance, risk, default valuation

This research stream comprises three sub-streams, namely AI and Corporate Performance, Risk and Default Valuation; AI and Real Estate Investment Performance, Risk, and Default Valuation; AI and Banks Performance, Risk and Default Valuation.

The first sub-stream examines corporate financial conditions to predict financially distressed companies (Altman et al. 1994). As an illustration, Jones et al. (2017) and Gepp et al. (2010) determine the probability of corporate default. Sabău Popa et al. (2021) predict business performance based on a composite financial index. The findings of the aforementioned papers confirm that AI-powered classifiers are extremely accurate and easy to interpret, and hence superior to classic linear models. An interesting paper surveys the relationship between facial masculinity traits in CEOs and firm riskiness through image processing (Kamiya et al. 2018). The results reveal that firms led by masculine-faced CEOs have higher risk and leverage ratios and are more frequent acquirers in M&A operations.

The second sub-stream focuses on mortgage and loan default prediction (Feldman and Gross 2005; Episcopos et al. 1998). For instance, Chen et al. (2013) evaluate real estate investment returns by forecasting the REIT index; they show that the industrial production index, the lending rate, the dividend yield and the stock index influence real estate investments. All the forecasting techniques adopted (i.e. supervised machine learning and ANNs) outperform linear models in terms of efficiency and precision.

The third sub-stream deals with banks’ performance. In contrast with past research, a text mining study argues that the most important risk factors in banking are non-financial, i.e. regulation, strategy and management operations; however, the findings from text analysis are limited to what is disclosed in the documents examined (Wei et al. 2019). A highly performing NN-based study of the Malaysian and Islamic banking sectors asserts that a negative cost structure, cultural aspects and regulatory barriers (i.e. low competition) lead to inefficient banks compared with their U.S. counterparts, which, on the contrary, are more resilient, healthier and better regulated (Wanke et al. 2016a, b, c, d; Papadimitriou et al. 2020).

Stream 06: AI and cryptocurrencies

Although algorithms and AI advisors are gaining ground, human traders still dominate the cryptocurrency market (Petukhina et al. 2021 ). For this reason, substantial arbitrage opportunities are available in the Bitcoin market, especially for USD–CNY and EUR–CNY currency pairs (Pichl and Kaizoji 2017 ). Concerning daily realised volatility, the HAR model delivers good results. Likewise, the feed-forward neural network effectively approximates the daily logarithmic returns of BTCUSD and the shape of their distribution (Pichl and Kaizoji 2017 ).

Additionally, the Hierarchical Risk Parity (HRP) approach, an asset allocation method based on machine learning, represents a powerful risk management tool able to manage the high volatility characterising Bitcoin prices, thereby helping cryptocurrency investors (Burggraf 2021 ).
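The idea behind HRP, cluster assets by correlation and then spread risk inversely to variance both within and across clusters, can be sketched in a drastically simplified form. This is not Burggraf's (or the canonical) HRP algorithm, which uses hierarchical clustering, quasi-diagonalisation, and recursive bisection; the crude pairing step, the synthetic two-factor "coin" returns, and all parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two correlated groups of synthetic coins: {0,1} share factor f1, {2,3} share f2
n = 2000
f1, f2 = rng.standard_normal(n), rng.standard_normal(n)
rets = np.column_stack([
    0.03 * f1 + 0.01 * rng.standard_normal(n),
    0.04 * f1 + 0.01 * rng.standard_normal(n),
    0.05 * f2 + 0.01 * rng.standard_normal(n),
    0.08 * f2 + 0.01 * rng.standard_normal(n),
])

corr = np.corrcoef(rets.T)
var = rets.var(axis=0)

# Step 1: crude clustering - pair asset 0 with its most correlated peer
# (full HRP uses hierarchical clustering + quasi-diagonalisation instead)
partner = corr[0, 1:].argmax() + 1
cluster_a = [0, partner]
cluster_b = [i for i in range(4) if i not in cluster_a]

def inverse_variance_weights(idx):
    w = 1 / var[np.array(idx)]
    return w / w.sum()

# Step 2: inverse-variance weights within each cluster
wa, wb = inverse_variance_weights(cluster_a), inverse_variance_weights(cluster_b)

# Step 3: split capital between clusters inversely to cluster variance
va = (wa @ rets[:, cluster_a].T).var()
vb = (wb @ rets[:, cluster_b].T).var()
alloc_a = (1 / va) / (1 / va + 1 / vb)

weights = np.zeros(4)
weights[cluster_a] = alloc_a * wa
weights[cluster_b] = (1 - alloc_a) * wb
print(weights.round(3), weights.sum())
```

Even in this toy form, capital flows away from the most volatile coin, which is the risk-management property the stream attributes to HRP.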

Stream 07: AI and derivatives

ANNs and machine learning models are accurate predictors for pricing financial derivatives. Jang and Lee (2019) propose a machine learning model that outperforms traditional American option pricing models: the generative Bayesian NN; Culkin and Das (2017) use a feed-forward deep NN to reproduce the Black–Scholes option pricing formula with a high accuracy rate. Similarly, Chen and Wan (2021) suggest a deep NN for pricing American options and deltas in high dimensions. Funahashi (2020), on the contrary, rejects deep learning for option pricing due to the instability of the prices, and introduces a new hybrid method that combines ANNs and asymptotic expansion (AE). This model does not directly predict the option price but instead measures the difference between the target (i.e. the derivative price) and its approximation. As a result, the ANN becomes faster, more accurate and “lighter” in terms of layers and training data volume. This innovative method mimics a human learning process, in which one learns about a new object by recognising its differences from a similar, familiar item (Funahashi 2020).
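In the Culkin and Das setup, the Black–Scholes closed form labels the training data that the surrogate network then learns to reproduce. The formula itself fits in a few lines (standard-normal CDF via the error function); the sample parameters below are illustrative.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

# One labelled sample of the kind used to train the surrogate network
price = bs_call(S=100, K=100, T=1.0, r=0.05, sigma=0.2)
print(round(price, 4))  # ~10.4506
```

Sampling (S, K, T, r, sigma) over a grid and pricing each combination with this function yields the input-output pairs on which the deep network is trained.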

Stream 08: AI and credit risk in banks

The research stream labelled “AI and Credit Risk in Banks” Footnote 2 includes the following sub-streams: AI and Bank Credit Risk; AI and Consumer Credit Risk and Default; AI and Financial Fraud Detection/Early Warning Systems; AI and Credit Scoring Models.

The first sub-stream addresses bank failure prediction. Machine learning and ANNs significantly outperform statistical approaches, although they lack transparency (Le and Viviani 2018). To overcome this limitation, Durango‐Gutiérrez et al. (2021) combine traditional methods (i.e. logistic regression) with AI (i.e. the multilayer perceptron, MLP), thus gaining valuable insights on explanatory variables. With the aim of preventing further global financial crises, the banking industry relies on financial decision support systems (FDSSs), which are strongly improved by AI-based models (Abedin et al. 2019).

The second sub-stream compares classic and advanced consumer credit risk models. Supervised learning tools, such as SVMs, random forests, and advanced decision tree architectures, are powerful predictors of credit card delinquency: some of them can predict credit events up to 12 months in advance (Lahmiri 2016; Khandani et al. 2010; Butaru et al. 2016). Jagric et al. (2011) propose a learning vector quantization (LVQ) NN that deals better with categorical variables, achieving an excellent classification rate (i.e. default, non-default). Such methods outperform logit-based approaches and result in cost savings ranging from 6% up to 25% of total losses (Khandani et al. 2010).

The third group discusses the role of AI in early warning systems. At the retail level, advanced random forests accurately detect credit card fraud based on customers’ financial behaviour and spending patterns, and then flag it for investigation (Kumar et al. 2019). Similarly, Coats and Fant (1993) build an NN alert model for distressed firms that outperforms linear techniques. At the macroeconomic level, systemic risk monitoring models enhanced by AI technologies, i.e. k-nearest neighbours and sophisticated NNs, support macroprudential strategies and send alerts in case of globally unusual financial activities (Holopainen and Sarlin 2017; Huang and Guo 2021). However, these methods are still work in progress.
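The "flag unusual transactions" step that such early warning systems perform can be illustrated with a one-function baseline: a robust z-score over a customer's spending history. The threshold, the MAD-based scale, and the sample amounts are illustrative assumptions, far simpler than the random forests cited above.

```python
import numpy as np

def flag_unusual(amounts, threshold=3.0):
    """Flag transactions whose amount deviates more than `threshold`
    robust z-scores from the customer's typical spending level."""
    amounts = np.asarray(amounts, dtype=float)
    med = np.median(amounts)
    mad = np.median(np.abs(amounts - med)) or 1.0  # robust scale, guard against 0
    z = 0.6745 * (amounts - med) / mad             # 0.6745 makes MAD ~ std
    return np.where(np.abs(z) > threshold)[0]      # indices to investigate

# Typical card spending with one injected anomaly
history = [23.5, 41.0, 18.9, 35.2, 27.8, 4999.0, 31.4, 22.1]
print(flag_unusual(history))  # index of the suspicious transaction
```

A production system would condition on merchant, time, and behavioural features rather than raw amounts, but the alert-on-deviation principle is the same.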

The last group studies intelligent credit scoring models, with machine learning systems such as AdaBoost and random forest delivering the best forecasts for credit rating changes. These models are robust to outliers, missing values and overfitting, and require minimal data intervention (Jones et al. 2015). As an illustration, combining data mining and machine learning, Xu et al. (2019) build a highly sophisticated model that selects the most important predictors and eliminates noisy variables before performing the classification task.

Stream 09: AI and investor sentiment analysis

Investor sentiment has become increasingly important in stock prediction. For this purpose, sentiment analysis extracts investor sentiment from social media platforms (e.g. StockTwits, Yahoo Finance, eastmoney.com) through natural language processing and data mining techniques, and classifies it as negative or positive (Yin et al. 2020). The resulting sentiment is regarded either as a risk factor in asset pricing models, an input to forecast asset price direction, or an intraday stock index return (Houlihan and Creamer 2021; Renault 2017). In this respect, Yin et al. (2020) find that investor sentiment has a positive correlation with stock liquidity, especially in slowing markets; additionally, sensitivity to liquidity conditions tends to be higher for firms with larger size and a higher book-to-market ratio, especially those operating in weakly regulated markets. As for predictions, daily news usually predicts stock returns for a few days, whereas weekly news predicts returns for longer periods, from one month to one quarter. This generates a return effect on stock prices, as much of the delayed response to news occurs around major events in company life, specifically earnings announcements, making investor sentiment a very important variable in assessing the impact of AI on financial markets (Heston and Sinha 2017).
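The positive/negative classification step can be sketched with a minimal lexicon-based scorer. The word lists below are illustrative stand-ins for a real financial sentiment lexicon, and the scoring rule (positive minus negative word counts) is the simplest possible baseline, not the NLP pipelines of the papers cited above.

```python
# Illustrative mini-lexicon; real systems use curated financial word lists
POSITIVE = {"beat", "bullish", "growth", "upgrade", "strong", "buy"}
NEGATIVE = {"miss", "bearish", "loss", "downgrade", "weak", "sell"}

def sentiment(message):
    """Classify an investor message by counting lexicon hits."""
    words = message.lower().replace(",", " ").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("Strong quarter, earnings beat estimates"))  # positive
print(sentiment("Analyst downgrade after weak guidance"))    # negative
```

Aggregating such per-message labels over a day or a week yields the sentiment index that the surveyed studies then feed into return or liquidity regressions.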

Stream 10: AI and foreign exchange management

The last stream addresses AI and the management of foreign exchange. Cost-effective trading or hedging activities in this market require accurate exchange rate forecasts (Galeshchuk and Mukherjee 2017). In this regard, the HONN model significantly outperforms traditional neural networks (i.e. multi-layer perceptrons, recurrent NNs, and Psi-sigma models) in forecasting and trading the EUR/USD currency pair using the ECB daily fixing series as input data (Dunis et al. 2010). On the contrary, Galeshchuk and Mukherjee (2017) consider these methods unable to predict the direction of change in forex rates and, therefore, ineffective at supporting profitable trading. For this reason, they apply deep convolutional NNs to forecast three main exchange rates (i.e. EUR/USD, GBP/USD, and JPY/USD). The model performs remarkably better than time series models (e.g. ARIMA: autoregressive integrated moving average) and machine learning classifiers. To sum up, from this research stream it emerges that AI-based models, such as NARX and the above-mentioned techniques, achieve better prediction performance than statistical or time series models, as remarked by Amelot et al. (2021).

Issues that deserve further investigation

As shown in Sect. "A detailed account of the literature on AI in Finance", the literature on Artificial Intelligence in Finance is vast and rapidly growing as technological progress advances. There are, however, some aspects of this subject that remain unexplored or that require further investigation. In this section, we further scrutinise, through content analysis, the papers published between 2015 and 2021 (as we want to focus on the most recent research directions) in order to define a potential research agenda. Hence, for each of the ten research streams presented in Sect. "Identification of the major research streams", we report a number of research questions that were put forward over time and are still at least partly unaddressed. The complete list of research questions is reported in Table 8.

AI and the stock market

This research stream focuses on algorithmic trading (AT) and stock price prediction. Future research in the field could analyse alternative AI-based market predictors (e.g. clustering algorithms and similar learning methods) in more depth and draw up a regime clustering algorithm in order to get a clearer view of the potential applications and benefits of clustering methodologies (Law and Shawe-Taylor 2017). In this regard, Litzenberger et al. (2012) and Booth et al. (2015) recommend broadening the study to market cycles and regulation policies that may affect AI models’ performance in stock prediction and algorithmic trading, respectively. Footnote 3 Furthermore, forecasting models should be evaluated with deeper order book information, which may lead to a higher prediction accuracy for stock prices (Tashiro et al. 2019).

AI and trading models

This research stream builds on the application of AI in trading models. Robo-advisors are the evolution of basic trading models: they are easily accessible, cost-effective, profitable for investors and, unlike human traders, immune to behavioural biases. Robo-advisory, however, is a recent phenomenon and needs further performance evaluation, especially in periods of financial distress, such as the post-COVID-19 one (Tao et al. 2021), or in the case of so-called “black swan” events. Meanwhile, trading models based on spatial neural networks (an advanced ANN) outperform all statistical techniques in modelling limit order books and suggest an extensive interpretation of the joint distribution of the best bid and best ask. Given the versatility of such a method, forthcoming research should resort to it with the aim of understanding whether neural networks with more order book information (i.e. order flow history) lead to better trading performance (Sirignano 2018).

AI and volatility forecasting

As previously mentioned, volatility forecasting is a challenging task. Although recent studies report solid results in the field (see Sermpinis et al. 2013; Vortelinos 2017), future work could deploy more elaborate recurrent NNs by modifying the activation function of the processing units composing the ANNs, or by adding hidden layers, and then evaluate their performance (Bucci 2020). Since univariate time series are commonly used for realised volatility prediction, it would also be interesting to examine the performance of multivariate time series.

AI and portfolio management

This research stream examines the use of AI in portfolio selection strategies. Past studies have developed AI models that are capable of replicating the performance of stock indexes (known as index-tracking strategies) and constructing efficient portfolios with no human intervention. In this regard, Kim and Kim (2020) suggest focussing on optimising AI algorithms to boost index-tracking performance. Soleymani and Vasighi (2020) recognise the importance of clustering algorithms in portfolio management and propose a clustering approach powered by a membership function, also known as fuzzy clustering, to further improve the selection of the least risky and most profitable assets. In addition, analysis of asset volatility through deep learning should be embedded in portfolio selection models (Chen and Ge 2021).

AI and performance, risk, default valuation

Bankruptcy and performance prediction models rely on binary classifiers that only provide two outcomes, e.g. risky–not risky, default–not default, good–bad performance. These methods may be restrictive as sometimes there is not a clear distinction between the two categories (Jones et al. 2017 ). Therefore, prospective research might focus on multiple outcome domains and extend the research area to other contexts, such as bond default prediction, corporate mergers, reconstructions, takeovers, and credit rating changes (Jones et al. 2017 ). Corporate credit ratings and social media data should be included as independent predictors in credit risk forecasts to evaluate their impact on the accuracy of risk-predicting models (Uddin et al. 2020 ). Moreover, it is worth evaluating the benefits of a combined human–machine approach, where analysts contribute to variables’ selection alongside data mining techniques (Jones et al. 2017 ). Forthcoming studies should also address black box and over-fitting biases (Sariev and Germano 2020 ), as well as provide solutions for the manipulation and transformation of missing input data relevant to the model (Jones et al. 2017 ).

AI and cryptocurrencies

The use of AI in the cryptocurrency market is in its infancy, and so are the policies regulating it. As the digital currency industry has become increasingly important in the financial world, future research should study the impact of regulations and blockchain progress on the performance of AI techniques applied in this field (Petukhina et al. 2021). Cryptocurrencies, and especially Bitcoin, are extensively used in financial portfolios. Hence, new AI approaches should be developed in order to optimise cryptocurrency portfolios (Burggraf 2021).

AI and derivatives

This research stream examines derivative pricing models based on AI. A valuable research area that should be further explored concerns the incorporation of text-based input data, such as tweets, blogs, and comments, for option price prediction (Jang and Lee 2019). Since derivative pricing is a highly complex task, Chen and Wan (2021) suggest studying advanced AI designs that minimise computational costs. Funahashi (2020) identifies a typical human learning process (i.e. recognition by differences) and applies it to the model, significantly simplifying the pricing problem. In the light of these considerations, prospective research may also investigate other human learning and reasoning paths that can improve AI reasoning skills.

AI and credit risk in banks

Bank default prediction models often rely solely on accounting information from banks’ financial statements. To enhance default forecasts, future work should consider market data as well (Le and Viviani 2018). Credit risk includes bank account fraud and financial systemic risk. Fraud detection based on AI needs further experimentation in terms of training speed and classification accuracy (Kumar et al. 2019). Early warning models, on the other hand, should be more sensitive to systemic risk. For this reason, subsequent studies ought to provide a common platform for modelling systemic risk, along with visualisation techniques enabling interaction with both model parameters and visual interfaces (Holopainen and Sarlin 2017).

AI and investor sentiment analysis

Sentiment analysis builds on text-based data from social networks and news to identify investor sentiment and use it as a predictor of asset prices. Forthcoming research may analyse the effect of investor sentiment on specific sectors (Houlihan and Creamer 2021 ), as well as the impact of diverse types of news on financial markets (Heston and Sinha 2017 ). This is important for understanding how markets process information. In this respect, Xu and Zhao ( 2022 ) propose a deeper analysis of how social networks’ sentiment affects individual stock returns. They also believe that the activity of financial influencers, such as financial analysts or investment advisors, potentially affects market returns and needs to be considered in financial forecasts or portfolio management.

AI and foreign exchange management

This research stream investigates the application of AI models to the forex market. Deep networks, in particular, efficiently predict the direction of change in forex rates thanks to their ability to “learn” abstract features (e.g. moving averages) through hidden layers. Future work should study whether these abstract features can be inferred from the model and used as valid input data to simplify the deep network structure (Galeshchuk and Mukherjee 2017). Moreover, the performance of foreign exchange trading models should be assessed in times of financial distress. Further research may also compare the predictive performance of advanced models, such as genetic algorithms and hybrid NNs, for forex trading purposes (Amelot et al. 2021).

Conclusions

Despite its recent advent, Artificial Intelligence has revolutionised the entire financial system, thanks to advanced computer science, Big Data analytics, and the growing flow of data generated by the activities of consumers, investors, businesses, and governments. It is therefore not surprising that a growing strand of literature has examined the uses, benefits and potential of AI applications in Finance. This paper aims to provide an accurate account of the state of the art and, in doing so, to serve as a useful guide for readers interested in this topic and, above all, as a starting point for future research. To this purpose, we collected a large number of articles published in journals indexed in Web of Science (WoS), and then resorted to both bibliometric analysis and content analysis. In particular, we inspected several features of the papers under study, identified the main AI applications in Finance and highlighted ten major research streams. From this extensive review, it emerges that AI can be regarded as an excellent market predictor and contributes to market stability by minimising information asymmetry and volatility; this results in profitable investing systems and accurate performance evaluations. Additionally, in the risk management area, AI aids bankruptcy and credit risk prediction in both corporations and financial institutions; fraud detection and early warning models monitor the whole financial system and raise expectations for future artificial market surveillance. This suggests that global financial crises or unexpected financial turmoil will be more likely to be anticipated and prevented.

All in all, judging from the rapid spread of AI applications in the financial sphere across a large variety of countries and, more generally, from the growth rate exhibited by technological progress over time, we expect the use of AI tools to expand further, geographically, across sectors, and across financial areas. Hence, firms that still struggle to cope with the latest wave of technological change should be aware of this, and try to overcome this burden in order to reap the potential benefits associated with the adoption of AI and remain competitive. In the light of these considerations, policymakers should motivate companies, especially those that have not yet adopted AI applications, or have just begun to introduce them, to catch up, for instance by providing funding or training courses aimed at strengthening the complex skills required of employees dealing with these sophisticated systems and languages.

This study presents some limitations. For instance, it tackles a significant range of interrelated topics (in particular, the main financial areas affected by AI that have been the main object of past research) and then presents a concise description of each of them; other studies may decide to focus on only one or two subjects and provide a more in-depth account of the chosen one(s). Also, technological change has been progressing at an unprecedented pace; even though we considered a significantly long time frame, and a relevant number of studies were released in the first two decades of the twenty-first century, we are aware that further advancements have been made since 2021 (the last year included in the time frame used to select our sample). For instance, in the last few years, AI experts, policymakers, and a growing number of scholars have been debating the potential and risks of AI-related tools, such as ChatGPT and the broader and more elusive “metaverse” (see for instance Mondal et al. 2023 and Calzada 2023 for an overview). Hence, future contributions may advance our understanding of the implications of these latest developments for finance and other important fields, such as education and health.

Data availability

Full data are available from authors upon request.

Footnote 1: The term AI winter first appeared in 1984 as the topic of a public debate at the annual meeting of the American Association of Artificial Intelligence (AAAI). It referred to the hype generated by over-promising from developers, unrealistically high expectations from end users, and extensive media promotion.

Footnote 2: Since credit risk in the banking industry differs remarkably from credit risk in firms, the two are treated separately.

Footnote 3: As this issue has not been addressed in the latest papers, we include these two papers even though their year of publication lies outside the established time range.

Abdou HA, Ellelly NN, Elamer AA, Hussainey K, Yazdifar H (2021) Corporate governance and earnings management Nexus: evidence from the UK and Egypt using neural networks. Int J Financ Econ 26(4):6281–6311. https://doi.org/10.1002/ijfe.2120


Abedin MZ, Guotai C, Moula F, Azad AS, Khan MS (2019) Topological applications of multilayer perceptrons and support vector machines in financial decision support systems. Int J Financ Econ 24(1):474–507. https://doi.org/10.1002/ijfe.1675

Acemoglu D, Restrepo P (2020) The wrong kind of AI? Artificial intelligence and the future of labor demand. Cambr J Reg Econ Soc, Cambr Pol Econ Soc 13(1):25–35

Altman EI, Marco G, Varetto F (1994) Corporate distress diagnosis: comparisons using linear discriminant analysis and neural networks (the Italian experience). J Bank Finance 18(3):505–529. https://doi.org/10.1016/0378-4266(94)90007-8

Amelot LM, Subadar Agathee U, Sunecher Y (2021) Time series modelling, NARX neural network and hybrid KPCA–SVR approach to forecast the foreign exchange market in Mauritius. Afr J Econ Manag Stud 12(1):18–54. https://doi.org/10.1108/ajems-04-2019-0161

Bekiros SD, Georgoutsos DA (2008) Non-linear dynamics in financial asset returns: The predictive power of the CBOE volatility index. Eur J Fin 14(5):397–408. https://doi.org/10.1080/13518470802042203

Biju AKVN, Thomas AS, Thasneem J (2020) Examining the research taxonomy of artificial intelligence, deep learning & machine learning in the financial sphere—a bibliometric analysis. Qual Quant Online First. https://doi.org/10.1007/s11135-023-01673-0

Black F, Scholes M (1973) The pricing of Options and corporate liabilities. J Pol Econ 81(3):637–654


Booth A, Gerding E, McGroarty F (2015) Performance-weighted ensembles of random forests for predicting price impact. Quant Finance 15(11):1823–1835. https://doi.org/10.1080/14697688.2014.983539

Bresnahan TF, Trajtenberg M (1995) General purpose technologies ‘Engines of growth’? J Econom 65(1):83–108. https://doi.org/10.1016/0304-4076(94)01598-T

Bucci A (2020) Realized volatility forecasting with neural networks. J Financ Econom 3:502–531. https://doi.org/10.1093/jjfinec/nbaa008

Buchanan BG (2019) Artificial intelligence in finance. The Alan Turing Institute. https://www.turing.ac.uk/sites/default/files/2019-04/artificial_intelligence_in_finance_-_turing_report_0.pdf

Burggraf T (2021) Beyond risk parity – a machine learning-based hierarchical risk parity approach on cryptocurrencies. Finance Res Lett 38:101523. https://doi.org/10.1016/j.frl.2020.101523

Butaru F, Chen Q, Clark B, Das S, Lo AW, Siddique A (2016) Risk and risk management in the credit card industry. J Bank Finance 72:218–239. https://doi.org/10.1016/j.jbankfin.2016.07.015

Caglayan M, Pham T, Talavera O, Xiong X (2020) Asset mispricing in peer-to-peer loan secondary markets. J Corp Finan 65:101769. https://doi.org/10.1016/j.jcorpfin.2020.101769

Calomiris CW, Mamaysky H (2019) How news and its context drive risk and returns around the world. J Financ Econ 133(2):299–336. https://doi.org/10.1016/j.jfineco.2018.11.009

Calzada I (2023) Disruptive technologies for e-diasporas: blockchain, DAOs, data cooperatives, metaverse, and ChatGPT. Futures 154:103258. https://doi.org/10.1016/j.futures.2023.103258

Cao Y, Liu X, Zhai J, Hua S (2022) A Two-stage Bayesian network model for corporate bankruptcy prediction. Int J Financ Econ 27(1):455–472. https://doi.org/10.1002/ijfe.2162

Chaboud AP, Chiquoine B, Hjalmarsson E, Vega C (2014) Rise of the machines: Algorithmic trading in the foreign exchange market. J Financ 69(5):2045–2084. https://doi.org/10.1111/jofi.12186

Chen S, Ge L (2021) A learning-based strategy for portfolio selection. Int Rev Econ Financ 71:936–942. https://doi.org/10.1016/j.iref.2020.07.010

Chen Y, Wan JW (2021) Deep neural network framework based on backward stochastic differential equations for pricing and hedging American options in high dimensions. Quant Finance 21(1):45–67. https://doi.org/10.1080/14697688.2020.1788219

Article   MathSciNet   CAS   Google Scholar  

Chen J, Chang T, Ho C, Diaz JF (2013) Grey relational analysis and neural Network forecasting of reit returns. Quantitative Finance 14(11):2033–2044. https://doi.org/10.1080/14697688.2013.816765

Coats PK, Fant LF (1993) Recognizing financial distress patterns using a neural network tool. Financ Manage 22(3):142. https://doi.org/10.2307/3665934

Corazza M, De March D, Di Tollo G (2021) Design of adaptive Elman networks for credit risk assessment. Quantitative Finance 21(2):323–340. https://doi.org/10.1080/14697688.2020.1778175

Cortés EA, Martínez MG, Rubio NG (2008) FIAMM return persistence analysis and the determinants of the fees charged. Span J Finance Account Revis Esp De Financ Y Contab 37(137):13–32. https://doi.org/10.1080/02102412.2008.10779637

Creamer G (2012) Model calibration and automated trading agent for euro futures. Quant Finance 12(4):531–545. https://doi.org/10.1080/14697688.2012.664921

Creamer G, Freund Y (2010) Automated trading with boosting and expert weighting. Quant Finance 10(4):401–420. https://doi.org/10.1080/14697680903104113

Cucculelli M, Recanatini M (2022) Distributed Ledger technology systems in securities post-trading services. Evid Eur Global Syst Banks Eur J Finance 28(2):195–218. https://doi.org/10.1080/1351847X.2021.1921002

Culkin R, Das SR (2017) Machine learning in finance: The case of deep learning for option pricing. J Invest Management 15(4):92–100

Google Scholar  

D’Hondt C, De Winne R, Ghysels E, Raymond S (2020) Artificial intelligence alter egos: Who might benefit from robo-investing? J Empir Financ 59:278–299. https://doi.org/10.1016/j.jempfin.2020.10.002

Deku SY, Kara A, Semeyutin A (2020) The predictive strength of mbs yield spreads during asset bubbles. Rev Quant Financ Acc 56(1):111–142. https://doi.org/10.1007/s11156-020-00888-8

Dixon M, Klabjan D, Bang JH (2017) Classification-based financial markets prediction using deep neural networks. Algorithmic Finance 6(3–4):67–77. https://doi.org/10.3233/af-170176

Donthu N, Kumar S, Mukherjee D, Pandey N, Lim WM (2021) How to conduct a bibliometric analysis: an overview and guidelines. J Bus Res 133:285–296. https://doi.org/10.1016/j.jbusres.2021.04.070

Downe-Wamboldt B (1992) Content analysis: method, applications, and issues. Health Care Women Int 13(3):313–321. https://doi.org/10.1080/07399339209516006

Article   CAS   PubMed   Google Scholar  

Dubey RK, Chauhan Y, Syamala SR (2017) Evidence of algorithmic trading from Indian equity Market: Interpreting the transaction velocity element of financialization. Res Int Bus Financ 42:31–38. https://doi.org/10.1016/j.ribaf.2017.05.014

Dunis CL, Laws J, Sermpinis G (2010) Modelling and trading the EUR/USD exchange rate at the ECB fixing. Eur J Finance 16(6):541–560. https://doi.org/10.1080/13518470903037771

Dunis CL, Laws J, Karathanasopoulos A (2013) Gp algorithm versus hybrid and mixed neural networks. Eur J Finance 19(3):180–205. https://doi.org/10.1080/1351847x.2012.679740

Durango-Gutiérrez MP, Lara-Rubio J, Navarro-Galera A (2021) Analysis of default risk in microfinance institutions under the Basel Iii framework. Int J Financ Econ. https://doi.org/10.1002/ijfe.2475

Episcopos A, Pericli A, Hu J (1998) Commercial mortgage default: A comparison of logit with radial basis function networks. J Real Estate Finance Econ 17(2):163–178

Ernst, E., Merola, R., and Samaan, D. (2018). The economics of artificial intelligence: Implications for the future of work. ILO Futur Work Res Paper Ser No. 5.

Feldman D, Gross S (2005) Mortgage default: classification trees analysis. J Real Estate Finance Econ 30(4):369–396. https://doi.org/10.1007/s11146-005-7013-7

Fernandes M, Medeiros MC, Scharth M (2014) Modeling and predicting the CBOE market volatility index. J Bank Finance 40:1–10. https://doi.org/10.1016/j.jbankfin.2013.11.004

Frankenfield, J. (2021). How Artificial Intelligence Works. Retrieved June 11, 2021, from https://www.investopedia.com/terms/a/artificial-intelligence-ai.asp

Frino A, Prodromou T, Wang GH, Westerholm PJ, Zheng H (2017) An empirical analysis of algorithmic trading around earnings announcements. Pac Basin Financ J 45:34–51. https://doi.org/10.1016/j.pacfin.2016.05.008

Frino A, Garcia M, Zhou Z (2020) Impact of algorithmic trading on speed of adjustment to new information: Evidence from interest rate derivatives. J Futur Mark 40(5):749–760. https://doi.org/10.1002/fut.22104

Funahashi H (2020) Artificial neural network for option pricing with and without asymptotic correction. Quant Finance 21(4):575–592. https://doi.org/10.1080/14697688.2020.1812702

Galeshchuk S, Mukherjee S (2017) Deep networks for predicting direction of change in foreign exchange rates. Intell Syst Account Finance Manage 24(4):100–110. https://doi.org/10.1002/isaf.1404

Gao M, Liu Y, Wu W (2016) Fat-finger trade and market quality: the first evidence from China. J Futur Mark 36(10):1014–1025. https://doi.org/10.1002/fut.21771

Gepp A, Kumar K, Bhattacharya S (2010) Business failure prediction using decision trees. J Forecast 29(6):536–555. https://doi.org/10.1002/for.1153

Guotai C, Abedin MZ (2017) Modeling credit approval data with neural networks: an experimental investigation and optimization. J Bus Econ Manag 18(2):224–240. https://doi.org/10.3846/16111699.2017.1280844

Hamdi M, Aloui C (2015) Forecasting crude oil price using artificial neural networks: a literature survey. Econ Bull 35(2):1339–1359

Hendershott T, Jones CM, Menkveld AJ (2011) Does algorithmic trading improve liquidity? J Financ 66(1):1–33. https://doi.org/10.1111/j.1540-6261.2010.01624.x

Hentzen JK, Hoffmann A, Dolan R, Pala E (2022a) Artificial intelligence in customer-facing financial services: a systematic literature review and agenda for future research. Int J Bank Market 40(6):1299–1336. https://doi.org/10.1108/IJBM-09-2021-0417

Hentzen JK, Hoffmann AOI, Dolan RM (2022b) Which consumers are more likely to adopt a retirement app and how does it explain mobile technology-enabled retirement engagement? Int J Consum Stud 46:368–390. https://doi.org/10.1111/ijcs.12685

Heston SL, Sinha NR (2017) News vs sentiment: predicting stock returns from news stories. Financial Anal J 73(3):67–83. https://doi.org/10.2469/faj.v73.n3.3

Holopainen M, Sarlin P (2017) Toward robust early-warning models: a horse race, ensembles and model uncertainty. Quant Finance 17(12):1933–1963. https://doi.org/10.1080/14697688.2017.1357972

Houlihan P, Creamer GG (2021) Leveraging social media to predict continuation and reversal in asset prices. Comput Econ 57(2):433–453. https://doi.org/10.1007/s10614-019-09932-9

Huang X, Guo F (2021) A kernel fuzzy twin SVM model for early warning systems of extreme financial risks. Int J Financ Econ 26(1):1459–1468. https://doi.org/10.1002/ijfe.1858

Huang Y, Kuan C (2021) Economic prediction with the fomc minutes: an application of text mining. Int Rev Econ Financ 71:751–761. https://doi.org/10.1016/j.iref.2020.09.020

IBM Cloud Education. (2020). What are Neural Networks? Retrieved May 10, 2021, from https://www.ibm.com/cloud/learn/neural-networks

Jagric T, Jagric V, Kracun D (2011) Does non-linearity matter in retail credit risk modeling? Czech J Econ Finance Faculty Soc Sci 61(4):384–402

Jagtiani J, Kose J (2018) Fintech: the impact on consumers and regulatory responses. J Econ Bus 100:1–6. https://doi.org/10.1016/j.jeconbus.2018.11.002

Jain A, Jain C, Khanapure RB (2021) Do algorithmic traders improve liquidity when information asymmetry is high? Q J Financ 11(01):1–32. https://doi.org/10.1142/s2010139220500159

Article   CAS   Google Scholar  

Jang H, Lee J (2019) Generative Bayesian neural network model for risk-neutral pricing of American index options. Quant Finance 19(4):587–603. https://doi.org/10.1080/14697688.2018.1490807

Jiang Y, Jones S (2018) Corporate distress prediction in China: a machine learning approach. Account Finance 58(4):1063–1109. https://doi.org/10.1111/acfi.12432

Jones S, Wang T (2019) Predicting private company failure: a multi-class analysis. J Int Finan Markets Inst Money 61:161–188. https://doi.org/10.1016/j.intfin.2019.03.004

Jones S, Johnstone D, Wilson R (2015) An empirical evaluation of the performance of binary classifiers in the prediction of credit ratings changes. J Bank Finance 56:72–85. https://doi.org/10.1016/j.jbankfin.2015.02.006

Jones S, Johnstone D, Wilson R (2017) Predicting corporate bankruptcy: an evaluation of alternative statistical frameworks. J Bus Financ Acc 44(1–2):3–34. https://doi.org/10.1111/jbfa.12218

Kamiya S, Kim YH, Park S (2018) The face of risk: Ceo facial masculinity and firm risk. Eur Financ Manag 25(2):239–270. https://doi.org/10.1111/eufm.12175

Kanas A (2001) Neural network linear forecasts for stock returns. Int J Financ Econ 6(3):245–254. https://doi.org/10.1002/ijfe.156

Kelejian HH, Mukerji P (2016) Does high frequency algorithmic trading matter for non-at investors? Res Int Bus Financ 37:78–92. https://doi.org/10.1016/j.ribaf.2015.10.014

Kercheval AN, Zhang Y (2015) Modelling high-frequency limit order book dynamics with support vector machines. Quant Finance 15(8):1315–1329. https://doi.org/10.1080/14697688.2015.1032546

Khandani AE, Kim AJ, Lo AW (2010) Consumer credit-risk models via machine-learning algorithms. J Bank Finance 34(11):2767–2787. https://doi.org/10.1016/j.jbankfin.2010.06.001

Kim S, Kim D (2014) Investor sentiment from internet message postings and the predictability of stock returns. J Econ Behav Organ 107:708–729. https://doi.org/10.1016/j.jebo.2014.04.015

Kim S, Kim S (2020) Index tracking through deep latent representation learning. Quant Finance 20(4):639–652. https://doi.org/10.1080/14697688.2019.1683599

Kumar G, Muckley CB, Pham L, Ryan D (2019) Can alert models for fraud protect the elderly clients of a financial institution? Eur J Finance 25(17):1683–1707. https://doi.org/10.1080/1351847x.2018.1552603

Lahmiri S (2016) Features selection, data mining and financial risk classification: a comparative study. Intell Syst Account Finance Managed 23(4):265–275. https://doi.org/10.1002/isaf.1395

Lahmiri S, Bekiros S (2019) Can machine learning approaches predict corporate bankruptcy? evidence from a qualitative experimental design. Quant Finance 19(9):1569–1577. https://doi.org/10.1080/14697688.2019.1588468

Law T, Shawe-Taylor J (2017) Practical Bayesian support vector regression for financial time series prediction and market condition change detection. Quant Finance 17(9):1403–1416. https://doi.org/10.1080/14697688.2016.1267868

Le HH, Viviani J (2018) Predicting bank failure: An improvement by implementing a machine-learning approach to classical financial ratios. Res Int Bus Financ 44:16–25. https://doi.org/10.1016/j.ribaf.2017.07.104

Li J, Li G, Zhu X, Yao Y (2020) Identifying the influential factors of commodity futures prices through a new text mining approach. Quant Finance 20(12):1967–1981. https://doi.org/10.1080/14697688.2020.1814008

Litzenberger R, Castura J, Gorelick R (2012) The impacts of automation and high frequency trading on market quality. Annu Rev Financ Econ 4(1):59–98. https://doi.org/10.1146/annurev-financial-110311-101744

Loukeris N, Eleftheriadis I (2015) Further higher moments in portfolio Selection and a priori detection of bankruptcy, under multi-layer perceptron neural Networks, HYBRID Neuro-genetic MLPs, and the voted perceptron. Int J Financ Econ 20(4):341–361. https://doi.org/10.1002/ijfe.1521

Lu J, Ohta H (2003) A data and digital-contracts driven method for pricing complex derivatives. Quant Finance 3(3):212–219. https://doi.org/10.1088/1469-7688/3/3/307

Lu Y, Shen C, Wei Y (2013) Revisiting early warning signals of corporate credit default using linguistic analysis. Pac Basin Financ J 24:1–21. https://doi.org/10.1016/j.pacfin.2013.02.002

Martinelli A, Mina A, Moggi M (2021) The enabling technologies of industry 4.0: examining the seeds of the fourth industrial revolution. Ind Corp Chang 2021:1–28. https://doi.org/10.1093/icc/dtaa060

Mondal S, Das S, Vrana VG (2023) How to bell the cat? a theoretical review of generative artificial intelligence towards digital disruption in all walks of life. Technologies 11(2):44. https://doi.org/10.3390/technologies11020044

Moshiri S, Cameron N (2000) Neural network versus econometric models in forecasting inflation. J Forecast 19(3):201–217. https://doi.org/10.1002/(sici)1099-131x(200004)19:33.0.co;2-4

Mselmi N, Lahiani A, Hamza T (2017) Financial distress prediction: the case of French small and medium-sized firms. Int Rev Financ Anal 50:67–80. https://doi.org/10.1016/j.irfa.2017.02.004

Nag AK, Mitra A (2002) Forecasting daily foreign exchange rates using genetically optimized neural networks. J Forecast 21(7):501–511. https://doi.org/10.1002/for.838

Papadimitriou T, Goga P, Agrapetidou A (2020) The resilience of the US banking system. Int J Finance Econ. https://doi.org/10.1002/ijfe.2300

Parot A, Michell K, Kristjanpoller WD (2019) Using artificial neural networks to forecast exchange rate, including Var-vecm residual analysis and prediction linear combination. Intell Syst Account Finance Manage 26(1):3–15. https://doi.org/10.1002/isaf.1440

Petukhina AA, Reule RC, Härdle WK (2020) Rise of the machines? intraday high-frequency trading patterns of cryptocurrencies. Eur J Finance 27(1–2):8–30. https://doi.org/10.1080/1351847x.2020.1789684

Petukhina A, Trimborn S, Härdle WK, Elendner H (2021) Investing with cryptocurrencies – evaluating their potential for portfolio allocation strategies. Quant Finance 21(11):1825–1853. https://doi.org/10.1080/14697688.2021.1880023

Pichl L, Kaizoji T (2017) Volatility analysis of bitcoin price time series. Quant Finance Econ 1(4):474–485. https://doi.org/10.3934/qfe.2017.4.474

Pompe PP, Bilderbeek J (2005) The prediction of bankruptcy of small- and medium-sized industrial firms. J Bus Ventur 20(6):847–868. https://doi.org/10.1016/j.jbusvent.2004.07.003

PricewaterhouseCoopers-PwC (2017). PwC‘s global Artificial Intelligence Study: Sizing the prize. Retrieved May 10, 2021, from https://www.PwC.com/gx/en/issues/data-and-analytics/publications/artificial-intelligence-study.html .

PricewaterhouseCoopers- PwC (2018). The macroeconomic impact of artificial intelligence. Retrieved May 17, 2021, from https://www.PwC.co.uk/economic-services/assets/macroeconomic-impact-of-ai-technical-report-feb-18.pdf .

PricewaterhouseCoopers- PwC (2020). How mature is AI adoption in financial services? Retrieved May 15, 2021, from https://www.PwC.de/de/future-of-finance/how-mature-is-ai-adoption-in-financial-services.pdf .

Qi M (1999) Nonlinear predictability of stock returns using financial and economic variables. J Bus Econ Stat 17(4):419. https://doi.org/10.2307/1392399

Qi M, Maddala GS (1999) Economic factors and the stock market: a new perspective. J Forecast 18(3):151–166. https://doi.org/10.1002/(sici)1099-131x(199905)18:33.0.co;2-v

Raj M, Seamans R (2019) Primer on artificial intelligence and robotics. J Organ Des 8(1):1–14. https://doi.org/10.1186/s41469-019-0050-0

Rasekhschaffe KC, Jones RC (2019) Machine learning for stock selection. Financ Anal J 75(3):70–88. https://doi.org/10.1080/0015198x.2019.1596678

Reber B (2014) Estimating the risk–return profile of new venture investments using a risk-neutral framework and ‘thick’ models. Eur J Finance 20(4):341–360. https://doi.org/10.1080/1351847x.2012.708471

Reboredo JC, Matías JM, Garcia-Rubio R (2012) Nonlinearity in forecasting of high-frequency stock returns. Comput Econ 40(3):245–264. https://doi.org/10.1007/s10614-011-9288-5

Renault T (2017) Intraday online investor sentiment and return patterns in the U.S. stock market. J Bank Finance 84:25–40. https://doi.org/10.1016/j.jbankfin.2017.07.002

Rodrigues BD, Stevenson MJ (2013) Takeover prediction using forecast combinations. Int J Forecast 29(4):628–641. https://doi.org/10.1016/j.ijforecast.2013.01.008

Van Roy V, Vertesy D, Damioli G (2020). AI and robotics innovation. In K. F., Zimmermann (ed.), Handbook of Labor, Human Resources and Population Economics (pp. 1–35) Springer Nature

Sabău Popa DC, Popa DN, Bogdan V, Simut R (2021) Composite financial performance index prediction – a neural networks approach. J Bus Econ Manag 22(2):277–296. https://doi.org/10.3846/jbem.2021.14000

Sariev E, Germano G (2020) Bayesian regularized artificial neural networks for the estimation of the probability of default. Quant Finance 20(2):311–328. https://doi.org/10.1080/14697688.2019.1633014

Scholtus M, Van Dijk D, Frijns B (2014) Speed, algorithmic trading, and market quality around macroeconomic news announcements. J Bank Finance 38:89–105. https://doi.org/10.1016/j.jbankfin.2013.09.016

Sermpinis G, Laws J, Dunis CL (2013) Modelling and trading the realised volatility of the ftse100 futures with higher order neural networks. Eur J Finance 19(3):165–179. https://doi.org/10.1080/1351847x.2011.606990

Sirignano JA (2018) Deep learning for limit order books. Quant Finance 19(4):549–570. https://doi.org/10.1080/14697688.2018.1546053

Soleymani F, Vasighi M (2020) Efficient portfolio construction by means OF CVaR and K -means++ CLUSTERING analysis: evidence from the NYSE. Int J Financ Econ. https://doi.org/10.1002/ijfe.2344

Sun T, Vasarhelyi MA (2018) Predicting credit card delinquencies: an application of deep neural networks. Intell Syst Account Finance Manage 25(4):174–189. https://doi.org/10.1002/isaf.1437

Szczepański, M. (2019). Economic impacts of artificial intelligence. Retrieved May 10, 2021, from https://www.europarl.europa.eu/RegData/etudes/BRIE/2019/637967/EPRS_BRI(2019)637967_EN.pdf

Tao R, Su C, Xiao Y, Dai K, Khalid F (2021) Robo advisors, algorithmic trading and investment management: Wonders of fourth industrial revolution in financial markets. Technol Forecast Soc Chang 163:120421. https://doi.org/10.1016/j.techfore.2020.120421

Tashiro D, Matsushima H, Izumi K, Sakaji H (2019) Encoding of high-frequency order information and prediction of short-term stock price by deep learning. Quant Finance 19(9):1499–1506. https://doi.org/10.1080/14697688.2019.1622314

Trinkle BS, Baldwin AA (2016) Research opportunities for neural networks: the case for credit. Intell Syst Account Finance Manage 23(3):240–254. https://doi.org/10.1002/isaf.1394

Trippi RR, DeSieno D (1992) Trading equity index futures with a neural network. J Portf Manage 19(1):27–33. https://doi.org/10.3905/jpm.1992.409432

Uddin MS, Chi G, Al Janabi MA, Habib T (2020) Leveraging random forest in micro-enterprises credit risk modelling for accuracy and interpretability. Int J Financ Econ. https://doi.org/10.1002/ijfe.2346

Varetto F (1998) Genetic algorithms applications in the analysis of insolvency risk. J Bank Finance 22(10–11):1421–1439. https://doi.org/10.1016/s0378-4266(98)00059-4

Vortelinos DI (2017) Forecasting realized Volatility: HAR against principal components combining, neural networks and GARCH. Res Int Bus Financ 39:824–839. https://doi.org/10.1016/j.ribaf.2015.01.004

Wall LD (2018) Some financial regulatory implications of artificial intelligence. J Econ Bus 100:55–63. https://doi.org/10.1016/j.jeconbus.2018.05.003

Wanke P, Azad MA, Barros C (2016a) Predicting efficiency in Malaysian islamic banks: a two-stage TOPSIS and neural networks approach. Res Int Bus Financ 36:485–498. https://doi.org/10.1016/j.ribaf.2015.10.002

Wanke P, Azad MA, Barros CP, Hassan MK (2016c) Predicting efficiency in Islamic banks: an integrated multicriteria decision Making (MCDM) Approach. J Int Finan Markets Inst Money 45:126–141. https://doi.org/10.1016/j.intfin.2016.07.004

Wei L, Li G, Zhu X, Li J (2019) Discovering bank risk factors from financial statements based on a new semi-supervised text mining algorithm. Account Finance 59(3):1519–1552. https://doi.org/10.1111/acfi.12453

Xu Y, Zhao J (2022) Can sentiments on macroeconomic news explain stock returns? evidence from social network data. Int J Financ Econ 27(2):2073–2088. https://doi.org/10.1002/ijfe.2260

Xu D, Zhang X, Feng H (2019) Generalized fuzzy soft sets theory-based novel hybrid ensemble credit scoring model. Int J Financ Econ 24(2):903–921. https://doi.org/10.1002/ijfe.1698

Yang Z, Platt MB, Platt HD (1999) Probabilistic neural networks in bankruptcy prediction. J Bus Res 44(2):67–74. https://doi.org/10.1016/s0148-2963(97)00242-7

Yin H, Wu X, Kong SX (2020) Daily investor sentiment, order flow imbalance and stock liquidity: Evidence from the Chinese stock market. Int J Financ Econ. https://doi.org/10.1002/ijfe.2402

Zhang Y, Chu G, Shen D (2021) The role of investor attention in predicting stock prices: the long short-term memory networks perspective. Financ Res Lett 38:101484. https://doi.org/10.1016/j.frl.2020.101484

Zhao Y, Stasinakis C, Sermpinis G, Shi Y (2018) Neural network copula portfolio optimization for exchange traded funds. Quant Finance 18(5):761–775. https://doi.org/10.1080/14697688.2017.1414505

Zheng X, Zhu M, Li Q, Chen C, Tan Y (2019) Finbrain: When finance meets ai 2.0. Front Inform Technol Electr Eng 20(7):914–924. https://doi.org/10.1631/fitee.1700822

Download references

Funding

Open access funding provided by Università Politecnica delle Marche within the CRUI-CARE Agreement. This study did not receive specific funding; our institution provides us with research funds that cover the publication costs.

Author information

Authors and affiliations

Department of Strategy and Management, EDC Paris Business School, 10074m, Puteaux Cedex, La Défense, 92807, Paris, France

Salman Bahoo

Department of Economics and Social Sciences, Marche Polytechnic University, Piazzale Martelli 8, 60100, Ancona, Italy

Marco Cucculelli, Xhoana Goga & Jasmine Mondolo


Contributions

Conceptualization: MC and SB. Methodology: SB. Investigation: MC, XG, SB. Writing: MC, XG, SB and JM. Writing – review and editing: JM. Supervision: MC. Project administration: MC.

Corresponding author

Correspondence to Marco Cucculelli .

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 50 kb)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Bahoo, S., Cucculelli, M., Goga, X. et al. Artificial intelligence in Finance: a comprehensive review through bibliometric and content analysis. SN Bus Econ 4, 23 (2024). https://doi.org/10.1007/s43546-023-00618-x

Received: 25 April 2023

Accepted: 13 December 2023

Published: 20 January 2024

Keywords

  • Artificial intelligence
  • Machine learning
  • Bibliometric analysis
  • Content analysis


MIT News | Massachusetts Institute of Technology


Artificial intelligence


School of Engineering welcomes new faculty
Fifteen new faculty members join six of the school’s academic departments. (May 23, 2024)

A community collaboration for progress
Graduate student Nolen Scruggs works with a local tenant association to address housing inequality as part of the MIT Initiative on Combatting Systemic Racism. (May 22, 2024)

2024 MAD Design Fellows announced
The 10 Design Fellows are MIT graduate students working at the intersection of design and multiple disciplines across the Institute. (May 21, 2024)

Scientists use generative AI to answer complex questions in physics
A new technique that can automatically classify phases of physical systems could help scientists investigate novel materials. (May 16, 2024)

Using ideas from game theory to improve the reliability of language models
A new “consensus game,” developed by MIT CSAIL researchers, elevates AI’s text comprehension and generation skills. (May 14, 2024)

The power of App Inventor: Democratizing possibilities for mobile applications
More than a decade since its launch, App Inventor recently hosted its 100 millionth project and registered its 20 millionth user. Now hosted by MIT, the app also supports experimenting with AI. (May 10, 2024)

A better way to control shape-shifting soft robots
A new algorithm learns to squish, bend, or stretch a robot’s entire body to accomplish diverse tasks like avoiding obstacles or retrieving items.

From steel engineering to ovarian tumor research
Ashutosh Kumar, a materials science and engineering PhD student and MathWorks Fellow, applies his eclectic skills to studying the relationship between bacteria and cancer.

President Sally Kornbluth and OpenAI CEO Sam Altman discuss the future of AI
The conversation in Kresge Auditorium touched on the promise and perils of the rapidly evolving technology. (May 6, 2024)

Creating bespoke programming languages for efficient visual AI systems
Associate Professor Jonathan Ragan-Kelley optimizes how computer graphics and images are processed for the hardware of today and tomorrow. (May 3, 2024)

HPI-MIT design research collaboration creates powerful teams
Together, the Hasso Plattner Institute and MIT are working toward novel solutions to the world’s problems as part of the Designing for Sustainability research program.

Exploring frontiers of mechanical engineering
MIT Department of Mechanical Engineering grad students are undertaking a broad range of innovative research projects.

Natural language boosts LLM performance in coding, planning, and robotics
Three neurosymbolic methods help language models find better abstractions within natural language, then use those representations to execute complex tasks. (May 1, 2024)

An AI dataset carves new paths to tornado detection
TorNet, a public artificial intelligence dataset, could help models reveal when and why tornadoes form, improving forecasters’ ability to issue warnings. (April 29, 2024)

MIT faculty, instructors, students experiment with generative AI in teaching and learning
At MIT’s Festival of Learning 2024, panelists stressed the importance of developing critical thinking skills while leveraging technologies like generative AI.


What is artificial general intelligence, and is it a useful concept?

The world's biggest AI companies have made artificial general intelligence, or AGI, their goal. But it isn't always clear what AGI means, and there is debate about whether it is a valuable idea.

By Alex Wilkins

21 May 2024


If you take even a passing interest in artificial intelligence, you will inevitably have come across the notion of artificial general intelligence. AGI, as it is often known, has ascended to buzzword status over the past few years as AI has exploded into the public consciousness on the back of the success of large language models (LLMs), a form of AI that powers chatbots such as ChatGPT.

That is largely because AGI has become a lodestar for the companies at the vanguard of this type of technology. ChatGPT creator OpenAI, for example, states that its mission is “to ensure that artificial general intelligence benefits all of humanity”. Governments, too, have become obsessed with the opportunities AGI might present, as well as possible existential threats, while the media (including this magazine, naturally) report on claims that we have already seen “sparks of AGI” in LLM systems.

Despite all this, it isn’t always clear what AGI really means. Indeed, that is the subject of heated debate in the AI community, with some insisting it is a useful goal and others that it is a meaningless figment that betrays a misunderstanding of the nature of intelligence – and our prospects for replicating it in machines. “It’s not really a scientific concept,” says Melanie Mitchell at the Santa Fe Institute in New Mexico.

Artificial human-like intelligence and superintelligent AI have been staples of science fiction for centuries. But the term AGI took off around 20 years ago when it was used by the computer scientist Ben Goertzel and Shane Legg, cofounder of…


Senators Propose $32 Billion in Annual A.I. Spending but Defer Regulation

Their plan is the culmination of a yearlong listening tour on the dangers of the new technology.

Martin Heinrich, Todd Young, Chuck Schumer and Mike Rounds sit facing one another in separate chairs in a Senate office.

By Cecilia Kang and David McCabe

Cecilia Kang and David McCabe cover technology policy.

A bipartisan group of senators released a long-awaited legislative plan for artificial intelligence on Wednesday, calling for billions in funding to propel American leadership in the technology while offering few details on regulations to address its risks.

In a 20-page document titled “Driving U.S. Innovation in Artificial Intelligence,” the Senate leader, Chuck Schumer, and three colleagues called for spending $32 billion annually by 2026 for government and private-sector research and development of the technology.

The lawmakers recommended creating a federal data privacy law and said they supported legislation, planned for introduction on Wednesday, that would prevent the use of realistic misleading technology known as deepfakes in election campaigns. But they said congressional committees and agencies should come up with regulations on A.I., including protections against health and financial discrimination, the elimination of jobs, and copyright violations caused by the technology.

“It’s very hard to do regulations because A.I. is changing too quickly,” Mr. Schumer, a New York Democrat, said in an interview. “We didn’t want to rush this.”

He designed the road map with two Republican senators, Mike Rounds of South Dakota and Todd Young of Indiana, and a fellow Democrat, Senator Martin Heinrich of New Mexico, after their yearlong listening tour to hear concerns about new generative A.I. technologies. Those tools, like OpenAI’s ChatGPT, can generate realistic and convincing images, videos, audio and text. Tech leaders have warned about the potential harms of A.I., including the obliteration of entire job categories, election interference, discrimination in housing and finance, and even the replacement of humankind.

The senators’ decision to delay A.I. regulation widens a gap between the United States and the European Union, which this year adopted a law that prohibits A.I.’s riskiest uses, including some facial recognition applications and tools that can manipulate behavior or discriminate. The European law requires transparency around how systems operate and what data they collect. Dozens of U.S. states have also proposed privacy and A.I. laws that would prohibit certain uses of the technology.

Outside of recent legislation mandating the sale or ban of the social media app TikTok, Congress hasn’t passed major tech legislation in years, despite multiple proposals.

“It’s disappointing because at this point we’ve missed several windows of opportunity to act while the rest of the world has,” said Amba Kak, a co-executive director of the nonprofit AI Now Institute and a former adviser on A.I. to the Federal Trade Commission.

Mr. Schumer’s efforts on A.I. legislation began in June with a series of high-profile forums that brought together tech leaders including Elon Musk of Tesla, Sundar Pichai of Google and Sam Altman of OpenAI.

(The New York Times has sued OpenAI and its partner, Microsoft, over use of the publication’s copyrighted works in A.I. development.)

Mr. Schumer said in the interview that through the forums, lawmakers had begun to understand the complexity of A.I. technologies and how expert agencies and congressional committees were best equipped to create regulations.

The legislative road map encourages greater federal investment in the growth of domestic research and development.

“This is sort of the American way — we are more entrepreneurial,” Mr. Schumer said in the interview, adding that the lawmakers hoped to make “innovation the North Star.”

In a separate briefing with reporters, he said the Senate was more likely to consider A.I. proposals piecemeal instead of in one large legislative package.

“What we’d expect is that we would have some bills that certainly pass the Senate and hopefully pass the House by the end of the year,” Mr. Schumer said. “It won’t cover the whole waterfront. There’s too much waterfront to cover, and things are changing so rapidly.”

He added that his staff had spoken with Speaker Mike Johnson’s office.

Maya Wiley, president of the Leadership Conference on Civil and Human Rights, participated in the first forum. She said that the closed-door meetings were “tech industry heavy” and that the report’s focus on promoting innovation overshadowed the real-world harms that could result from A.I. systems, noting that health and financial tools had already shown signs of discrimination against certain ethnic and racial groups.

Ms. Wiley has called for greater focus on the vetting of new products to make sure they are safe and operate without biases that can target certain communities.

“We should not assume that we don’t need additional rights,” she said.

Cecilia Kang reports on technology and regulatory policy and is based in Washington, D.C. She has written about technology for over two decades.

David McCabe covers tech policy. He joined The Times from Axios in 2019.


Innovation (Camb), v.2(4); 2021 Nov 28

Artificial intelligence: A powerful paradigm for scientific research

1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China

35 University of Chinese Academy of Sciences, Beijing 100049, China

5 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China

10 Zhongshan Hospital Institute of Clinical Science, Fudan University, Shanghai 200032, China

Changping Huang

18 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

11 Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China

37 Songshan Lake Materials Laboratory, Dongguan, Guangdong 523808, China

26 Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China

Xingchen Liu

28 Institute of Coal Chemistry, Chinese Academy of Sciences, Taiyuan 030001, China

2 Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

Fengliang Dong

3 National Center for Nanoscience and Technology, Beijing 100190, China

Cheng-Wei Qiu

4 Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117583, Singapore

6 Department of Gynaecology, Obstetrics and Gynaecology Hospital, Fudan University, Shanghai 200011, China

36 Shanghai Key Laboratory of Female Reproductive Endocrine-Related Diseases, Shanghai 200011, China

7 School of Food Science and Technology, Dalian Polytechnic University, Dalian 116034, China

41 Second Affiliated Hospital School of Medicine, and School of Public Health, Zhejiang University, Hangzhou 310058, China

8 Department of Obstetrics and Gynecology, Peking University Third Hospital, Beijing 100191, China

9 Zhejiang Provincial People’s Hospital, Hangzhou 310014, China

Chenguang Fu

12 School of Materials Science and Engineering, Zhejiang University, Hangzhou 310027, China

Zhigang Yin

13 Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Fuzhou 350002, China

Ronald Roepman

14 Medical Center, Radboud University, 6500 Nijmegen, the Netherlands

Sabine Dietmann

15 Institute for Informatics, Washington University School of Medicine, St. Louis, MO 63110, USA

Marko Virta

16 Department of Microbiology, University of Helsinki, 00014 Helsinki, Finland

Fredrick Kengara

17 School of Pure and Applied Sciences, Bomet University College, Bomet 20400, Kenya

19 Agriculture College of Shihezi University, Xinjiang 832000, China

Taolan Zhao

20 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China

21 The Brain Cognition and Brain Disease Institute, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China

38 Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen 518055, China

Jialiang Yang

22 Geneis (Beijing) Co., Ltd, Beijing 100102, China

23 Department of Communication Studies, Hong Kong Baptist University, Hong Kong, China

24 South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China

39 Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Guangzhou 510650, China

Zhaofeng Liu

27 Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai 200030, China

29 Suzhou Institute of Nano-Tech and Nano-Bionics, Chinese Academy of Sciences, Suzhou 215123, China

Xiaohong Liu

30 Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China

James P. Lewis

James M. Tiedje

34 Center for Microbial Ecology, Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, USA

40 Zhejiang Lab, Hangzhou 311121, China

25 Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai 200031, China

31 Department of Computer Science, Aberystwyth University, Aberystwyth, Ceredigion SY23 3FL, UK

Zhipeng Cai

32 Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA

33 Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China

Jiabao Zhang

Artificial intelligence (AI), coupled with promising machine learning (ML) techniques from computer science, is broadly affecting many aspects of various fields, including science and technology, industry, and even our day-to-day lives. ML techniques have been developed to analyze high-throughput data and thereby obtain useful insights, categorize, predict, and make evidence-based decisions in novel ways, which will promote the growth of novel applications and fuel the sustained development of AI. This paper undertakes a comprehensive survey of the development and application of AI across different aspects of fundamental sciences, including information science, mathematics, medical science, materials science, geoscience, life science, physics, and chemistry. The challenges that each discipline of science meets, and the potential of AI techniques to handle these challenges, are discussed in detail. Moreover, we shed light on new research trends entailing the integration of AI into each scientific discipline. The aim of this paper is to provide a broad research guideline on fundamental sciences with potential infusion of AI, to help motivate researchers to deeply understand the state-of-the-art applications of AI-based fundamental sciences, and thereby to help promote the continuous development of these fundamental sciences.

Graphical abstract


Public summary

  • “Can machines think?” The goal of artificial intelligence (AI) is to enable machines to mimic human thoughts and behaviors, including learning, reasoning, predicting, and so on.
  • “Can AI do fundamental research?” AI coupled with machine learning techniques is impacting a wide range of fundamental sciences, including mathematics, medical science, physics, etc.
  • “How does AI accelerate fundamental research?” New research and applications are emerging rapidly with the support of AI infrastructure, including data storage, computing power, AI algorithms, and frameworks.

Introduction

“Can machines think?” Alan Turing posed this question in his famous paper “Computing Machinery and Intelligence.” 1 He believed that to answer it, we first need to define what thinking is, yet thinking is difficult to define clearly because it is a subjective behavior. Turing therefore introduced an indirect method to verify whether a machine can think: the Turing test, which examines a machine's ability to show intelligence indistinguishable from that of human beings. A machine that succeeds in the test qualifies to be labeled an artificial intelligence (AI).

AI refers to the simulation of human intelligence by a system or a machine. The goal of AI is to develop machines that can think like humans and mimic human behaviors, including perceiving, reasoning, learning, planning, predicting, and so on. Intelligence is one of the main characteristics that distinguishes human beings from animals. Through successive industrial revolutions, machines of ever more kinds have replaced human labor in all walks of life, and the imminent replacement of human intellectual work by machine intelligence is the next big challenge to be faced. Numerous scientists are focusing on the field of AI, which makes research in the field rich and diverse. AI research fields include search algorithms, knowledge graphs, natural language processing, expert systems, evolutionary algorithms, machine learning (ML), deep learning (DL), and so on.

The general framework of AI is illustrated in Figure 1 . The development of AI proceeds through perceptual intelligence, cognitive intelligence, and decision-making intelligence. Perceptual intelligence means that a machine has the basic abilities of vision, hearing, touch, etc., which are familiar to humans. Cognitive intelligence is the higher-level ability of induction, reasoning, and acquisition of knowledge. It is inspired by cognitive science, brain science, and brain-like intelligence to endow machines with thinking logic and cognitive abilities similar to those of human beings. Once a machine has the abilities of perception and cognition, it is expected to make optimal decisions, as human beings do, to improve people's lives, industrial manufacturing, and so on. Decision intelligence requires the use of applied data science, social science, decision theory, and managerial science to expand data science, so as to make optimal decisions. Achieving perceptual, cognitive, and decision-making intelligence requires an AI infrastructure layer supported by data, storage and computing power, ML algorithms, and AI frameworks. By training models on this infrastructure, a system can learn the internal regularities of data to support and realize AI applications. The application layer of AI is becoming ever more extensive and is deeply integrated with fundamental sciences, industrial manufacturing, human life, social governance, and cyberspace, with a profound impact on our work and lifestyles.


The general framework of AI

History of AI

The beginning of modern AI research can be traced back to John McCarthy, who coined the term “artificial intelligence” (AI) at a conference at Dartmouth College in 1956. This symbolized the birth of the AI scientific field. Progress in the following years was astonishing. Many scientists and researchers focused on automated reasoning and applied AI to proving mathematical theorems and solving algebraic problems. One famous example is Logic Theorist, a computer program written by Allen Newell, Herbert A. Simon, and Cliff Shaw, which proved 38 of the first 52 theorems in “Principia Mathematica” and provided more elegant proofs for some. 2 These successes made many AI pioneers wildly optimistic and underpinned the belief that fully intelligent machines would be built in the near future. However, they soon realized that there was still a long way to go before the end goal of human-equivalent intelligence in machines could come true. Many nontrivial problems could not be handled by logic-based programs. Another challenge was the lack of computational resources for increasingly complicated problems. As a result, organizations and funders stopped supporting these under-delivering AI projects.

AI returned to popularity in the 1980s, as several research institutions and universities invented a type of AI system that summarizes a series of basic rules from expert knowledge to help non-experts make specific decisions. These systems are called “expert systems.” Examples are XCON, designed at Carnegie Mellon University, and MYCIN, designed at Stanford University. For the first time, expert systems derived logic rules from expert knowledge to solve problems in the real world. The core of AI research during this period was the knowledge that made machines “smarter.” However, expert systems gradually revealed several disadvantages, such as privacy concerns, lack of flexibility, poor versatility, expensive maintenance costs, and so on. At the same time, the Fifth Generation Computer Project, heavily funded by the Japanese government, failed to meet most of its original goals. Once again, funding for AI research ceased, and AI entered the second low point of its life.

In 2006, Geoffrey Hinton and coworkers 3 , 4 made a breakthrough in AI by proposing an approach for building deeper neural networks, as well as a way to avoid vanishing gradients during training. This reignited AI research, and DL algorithms have become one of the most active fields of AI research. DL is a subset of ML based on multiple layers of neural networks with representation learning, 5 while ML is the part of AI in which a computer or program learns and acquires intelligence without explicit human intervention. Thus, “learn” is the keyword of this era of AI research. Big data technologies and the improvement of computing power have made deriving features and information from massive data samples more efficient. An increasing number of new neural network structures and training methods have been proposed to improve the representation learning ability of DL and to further expand it into general applications. Current DL algorithms match and exceed human capabilities on specific datasets in the areas of computer vision (CV) and natural language processing (NLP). AI technologies have achieved remarkable successes in all walks of life and continue to show their value as backbones in scientific research and real-world applications.

Within AI, ML is having a substantial and broad effect across many aspects of technology and science: from computer science to geoscience to materials science, from life science to medical science to chemistry to mathematics and physics, from management science to economics to psychology, and other data-intensive empirical sciences. ML methods have been developed to analyze high-throughput data to obtain useful insights, categorize, predict, and make evidence-based decisions in novel ways. Training a system by presenting it with examples of desired input-output behavior can be far easier than programming it manually by anticipating the desired response for all possible inputs. The following sections survey eight fundamental sciences, including information science (informatics), mathematics, medical science, materials science, geoscience, life science, physics, and chemistry, which develop or exploit AI techniques to promote the development of sciences and accelerate their applications to benefit human beings, society, and the world.
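That training-from-examples idea can be made concrete with a deliberately tiny sketch (ours, not the authors'): instead of hand-coding the rule y = 2x + 1, we let gradient descent recover it from input-output pairs.

```python
# Toy illustration: learn y = 2x + 1 from examples of desired
# input-output behavior, rather than programming the rule by hand.

def fit_linear(examples, lr=0.05, epochs=2000):
    """Fit y = w*x + b to (x, y) pairs by gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in examples:
            err = (w * x + b) - y
            gw += 2 * err * x / n
            gb += 2 * err / n
        w -= lr * gw
        b -= lr * gb
    return w, b

examples = [(x, 2 * x + 1) for x in range(-3, 4)]  # the behavior we want
w, b = fit_linear(examples)
print(round(w, 2), round(b, 2))  # converges to roughly 2 and 1
```

The "program" is never written explicitly; it is distilled from the examples, which is exactly what makes ML attractive when the desired responses are too numerous or too subtle to enumerate by hand.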

AI in information science

AI aims to provide machines with the abilities of perception, cognition, and decision-making. At present, new research and applications in information science are emerging at an unprecedented rate, which is inseparable from the support of AI infrastructure. As shown in Figure 2 , the AI infrastructure layer includes data, storage and computing power, ML algorithms, and the AI framework. The perception layer gives machines the basic abilities of vision, hearing, etc. For instance, CV enables machines to “see” and identify objects, while speech recognition and synthesis help machines to “hear” and recognize speech elements. The cognitive layer provides the higher-level abilities of induction, reasoning, and knowledge acquisition, with the help of NLP, 6 knowledge graphs, 7 and continual learning. 8 In the decision-making layer, AI is capable of making optimal decisions through automatic planning, expert systems, and decision-support systems. Numerous applications of AI have had a profound impact on fundamental sciences, industrial manufacturing, human life, social governance, and cyberspace. The following subsections provide an overview of the AI framework, automatic machine learning (AutoML) technology, and several state-of-the-art AI/ML applications in the information field.


The knowledge graph of the AI framework

The AI framework provides basic tools for AI algorithm implementation

In the past 10 years, applications based on AI algorithms have played a significant role in various fields and subjects, and on this basis the prosperity of DL frameworks and platforms has been founded. AI frameworks and platforms lower the barrier to using AI technology by integrating the overall process of algorithm development, which enables researchers from different areas to apply it in their own fields, allowing them to focus on designing the structure of neural networks and thus providing better solutions to problems in their fields. At the beginning of the 21st century, only a few tools, such as MATLAB, OpenNN, and Torch, were capable of describing and developing neural networks. However, these tools were not originally designed for AI models and thus faced problems such as complicated user APIs and a lack of GPU support. During this period, using these frameworks demanded professional computer science knowledge and tedious work on model construction. As a solution, early DL frameworks such as Caffe, Chainer, and Theano emerged, allowing users to conveniently construct complex deep neural networks (DNNs), such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and LSTMs, which significantly reduced the cost of applying AI models. Tech giants then joined the march in researching AI frameworks. 9 Google developed the famous open-source framework TensorFlow, while Facebook's AI research team released another popular platform, PyTorch, which is based on Torch; Microsoft Research published CNTK, and Amazon announced MXNet. Among them, TensorFlow, the most representative framework, adopted Theano's declarative programming style, offering a larger space for graph-based optimization, while PyTorch inherited the imperative programming style of Torch, which is intuitive, user friendly, more flexible, and easier to trace.
As modern AI frameworks and platforms are widely applied, practitioners can now assemble models swiftly and conveniently by adopting various building-block sets and languages specifically suited to given fields. Polished over time, these platforms have gradually developed clearly defined user APIs, support for multi-GPU and distributed training, and a variety of model zoos and toolkits for specific tasks. 10 Looking forward, a few trends may become the mainstream of next-generation framework development. (1) Capability of super-scale model training. With the emergence of models derived from the Transformer, such as BERT and GPT-3, the ability to train large models has become an essential feature of a DL framework. It requires AI frameworks to train effectively at the scale of hundreds or even thousands of devices. (2) A unified API standard. The APIs of many frameworks are generally similar but differ slightly at certain points. This leads to difficulties and unnecessary learning effort when a user attempts to shift from one framework to another. The APIs of some frameworks, such as JAX, have already become compatible with the NumPy standard, which is familiar to most practitioners. Therefore, a unified API standard for AI frameworks may gradually emerge in the future. (3) Universal operator optimization. At present, kernels of DL operators are implemented either manually or on top of third-party libraries. Most third-party libraries are developed for particular hardware platforms, incurring large unnecessary costs when models are trained or deployed on different hardware platforms. Moreover, new DL algorithms are usually developed much faster than libraries are updated, which often leaves new algorithms beyond the range of library support. 11
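The declarative/imperative distinction above can be illustrated with a toy define-by-run sketch (our own construction, not PyTorch or TensorFlow code): in the imperative style, the computation graph is recorded as ordinary code executes, and gradients are then propagated backward through the recorded graph.

```python
# Minimal toy of "define-by-run" automatic differentiation: each operation
# records its inputs and a local backward rule; calling backprop() replays
# the graph in reverse to accumulate gradients.

class Value:
    def __init__(self, data, parents=()):
        self.data, self.grad = data, 0.0
        self._parents = parents
        self._backward = lambda: None

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = backward
        return out

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = backward
        return out

    def backprop(self):
        # topological order of the recorded graph, then a reverse sweep
        order, seen = [], set()
        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b        # the graph is built as this line runs
y.backprop()
print(y.data, w.grad, x.grad)  # 7.0 3.0 2.0
```

A declarative framework would instead require the whole graph to be described up front before any data flows through it, which is what opens the door to global graph-level optimization.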

To improve the implementation speed of AI algorithms, much research focuses on how to use hardware for acceleration. The DianNao family is one of the earliest research innovations on AI hardware accelerators. 12 It includes DianNao, DaDianNao, ShiDianNao, and PuDianNao, which can be used to accelerate the inference of neural networks and other ML algorithms. Of these, a 64-chip DaDianNao system at best achieves a 450.65× speedup over a GPU and reduces energy consumption by 150.31×. Prof. Chen and his team at the Institute of Computing Technology also designed an instruction set architecture for a broad range of neural network accelerators, called Cambricon, which developed into a series of DL accelerators. After Cambricon, many AI-related companies, such as Apple, Google, HUAWEI, etc., developed their own DL accelerators, and AI accelerators became an important research field of AI.

AI for AI—AutoML

AutoML aims to study how to use evolutionary computing, reinforcement learning (RL), and other AI algorithms to automatically generate specified AI algorithms. Research on the automatic generation of neural networks existed before the emergence of DL, e.g., neuroevolution. 13 The main purpose of neuroevolution is to allow neural networks to evolve according to the principle of survival of the fittest in the biological world. Through selection, crossover, mutation, and other evolutionary operators, the quality of individuals in a population is continuously improved and, finally, the individual with the greatest fitness represents the best neural network. The biological inspiration in this field lies in the evolution of human brain neurons: the brain's highly developed learning and memory functions rest on its complex neural network system, which is itself the product of a long evolutionary process rather than of gradient descent and back-propagation. In the era of DL, the application of AI algorithms to automatically generate DNNs has attracted more attention and has gradually developed into an important direction of AutoML research: neural architecture search. Implementations of neural architecture search are usually divided into RL-based methods and evolutionary-algorithm-based methods. In the RL-based method, an RNN is used as a controller to generate a neural network structure layer by layer; the network is then trained, and the accuracy on the validation set is used as the reward signal of the RNN to calculate the policy gradient. During the iteration, the controller gives neural networks with higher accuracy a higher probability value, so as to ensure that the policy function can output the optimal network structure. 14 Neural architecture search through evolution is similar to the neuroevolution method: it is based on a population and iterates continuously according to the principle of survival of the fittest, so as to obtain a high-quality neural network. 15 Through the application of neural architecture search technology, the design of neural networks is more efficient and automated, and the accuracy of the resulting networks gradually outperforms that of networks designed by AI experts. For example, Google's SOTA network EfficientNet was realized through a baseline network found by neural architecture search. 16
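The evolutionary variant can be sketched at a deliberately small scale (our illustration, not the methods cited above): an architecture is a tuple of hidden-layer widths, and a cheap stand-in fitness replaces the validation accuracy that a real system would obtain by training each candidate network.

```python
import random

# Toy evolutionary architecture search. Real systems train every candidate
# and use validation accuracy as fitness; here a stand-in fitness rewards a
# parameter budget near 64 total units with as few layers as possible.

def fitness(arch):
    return -abs(sum(arch) - 64) - len(arch)

def mutate(arch, rng):
    arch = list(arch)
    op = rng.random()
    if op < 0.6:                            # perturb one layer width
        i = rng.randrange(len(arch))
        arch[i] = max(1, arch[i] + rng.choice([-8, -4, 4, 8]))
    elif op < 0.8 and len(arch) < 4:        # add a layer
        arch.append(rng.choice([8, 16, 32]))
    elif len(arch) > 1:                     # remove a layer
        arch.pop(rng.randrange(len(arch)))
    return tuple(arch)

def evolve(generations=40, pop_size=16, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.choice([8, 16, 32]) for _ in range(2))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 4]       # survival of the fittest
        pop = parents + [mutate(rng.choice(parents), rng)
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=fitness)

best = evolve()
print(best, fitness(best))
```

Swapping the stand-in fitness for "train this architecture and measure validation accuracy" turns the same loop into a (very slow) real architecture search, which is why so much NAS research concerns cheapening that inner evaluation.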

AI enabling networking design adaptive to complex network conditions

The application of DL in the networking field has received strong interest. Network design often relies on initial network conditions and/or theoretical assumptions to characterize real network environments. However, traditional network modeling and design, governed by mathematical models, is unlikely to cope with complex scenarios involving imperfect and highly dynamic network environments. Integrating DL into network research allows for a better representation of complex network environments. Furthermore, DL can be combined with the Markov decision process and evolve into the deep reinforcement learning (DRL) model, which finds an optimal policy based on the reward function and the states of the system. Taken together, these techniques can be used to make better decisions to guide proper network design, thereby improving the network's quality of service and quality of experience. Across the different layers of the network protocol stack, DL/DRL can be adopted for network feature extraction, decision-making, etc. In the physical layer, DL can be used for interference alignment; it can also be used to classify modulation modes, design efficient network coding 17 and error-correction codes, etc. In the data link layer, DL can be used for resource (e.g., channel) allocation, medium access control, traffic prediction, 18 link-quality evaluation, and so on. In the network (routing) layer, DL-based routing establishment and routing optimization 19 can help to obtain an optimal routing path. In higher layers (such as the application layer), enhanced data compression and task allocation are used. Beyond the protocol stack, one critical area for DL is network security: DL can be used to classify packets as benign or malicious, and it can be integrated with other ML schemes, such as unsupervised clustering, to achieve better anomaly detection.
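The reward-driven decision-making described above can be reduced to a bandit-style toy (our construction, not a protocol from the text): an agent learns which of three channels to transmit on, each with an unknown success probability. Deep RL replaces this lookup table with a neural network over rich network state, but the learn-from-reward loop is the same.

```python
import random

# Toy RL-style channel allocation: estimate each channel's success rate
# from observed rewards and exploit the best one (epsilon-greedy).

def learn_channel_policy(steps=5000, eps=0.1, seed=1):
    rng = random.Random(seed)
    success_prob = [0.2, 0.8, 0.5]      # hidden environment dynamics
    q = [0.0, 0.0, 0.0]                 # estimated success rate per channel
    pulls = [0, 0, 0]
    for _ in range(steps):
        if rng.random() < eps:          # explore a random channel
            a = rng.randrange(3)
        else:                           # exploit the best-looking channel
            a = max(range(3), key=lambda i: q[i])
        reward = 1.0 if rng.random() < success_prob[a] else 0.0
        pulls[a] += 1
        q[a] += (reward - q[a]) / pulls[a]   # running-average update
    return q

q = learn_channel_policy()
best = max(range(3), key=lambda i: q[i])
print([round(v, 2) for v in q], best)   # best settles on channel 1
```

The epsilon-greedy split is the exploration/exploitation trade-off at the heart of RL; a full DRL agent for, say, routing would condition the value estimates on observed network state instead of keeping one number per action.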

AI enabling more powerful and intelligent nanophotonics

Nanophotonic components have recently revolutionized the field of optics via metamaterials/metasurfaces by enabling the arbitrary manipulation of light-matter interactions with subwavelength meta-atoms or meta-molecules. 20 , 21 , 22 The conventional design of such components generally involves forward modeling, i.e., solving Maxwell's equations based on empirical and intuitive nanostructures to find the corresponding optical properties, as well as the inverse design of nanophotonic devices given an on-demand optical response. The trans-dimensional feature of macro-optical components consisting of complex nano-antennas makes the design process very time consuming, computationally expensive, and even numerically prohibitive as device size and complexity increase. DL offers an efficient and automatic platform, enabling novel approaches to designing nanophotonic devices with high performance and versatile functions. Here, we present briefly the recent progress of DL-based nanophotonics and its wide-ranging applications. DL was first exploited for forward modeling using a DNN. 23 The transmission or reflection coefficients can be well predicted after training on huge datasets. To improve the prediction accuracy of DNNs in the case of small datasets, transfer learning was introduced to migrate knowledge between different physical scenarios, which greatly reduced the relative error. Furthermore, a CNN and an RNN were developed for the prediction of optical properties from arbitrary structures using images. 24 The CNN-RNN combination successfully predicted absorption spectra from given input structural images. In the inverse design of nanophotonic devices, there are three different paradigms of DL methods, i.e., supervised, unsupervised, and RL. 25 Supervised learning has been utilized to design structural parameters for pre-defined geometries, such as tandem DNNs and bidirectional DNNs. 
Unsupervised learning methods learn by themselves without a specific target, and are thus better suited than supervised learning to discovering new and arbitrary patterns 26 in completely new data. A generative adversarial network (GAN)-based approach, combining conditional GANs and Wasserstein GANs, was proposed to design freeform all-dielectric multifunctional metasurfaces. RL, especially double-deep Q-learning, powers the inverse design of high-performance nanophotonic devices. 27 DL has endowed nanophotonic devices with better performance and more emerging applications. 28 , 29 For instance, an intelligent microwave cloak driven by DL exhibits a millisecond, self-adaptive response to an ever-changing incident wave and background. 28 Another example is a DL-augmented infrared nanoplasmonic metasurface developed for monitoring dynamics between four major classes of bio-molecules, which could impact the fields of biology, bioanalytics, and pharmacology, from fundamental research to disease diagnostics to drug development. 29 The potential of DL in the wide arena of nanophotonics is still unfolding. Even end-users without an optics and photonics background could exploit DL as a black-box toolkit to design powerful optical devices. Nevertheless, how to interpret/mediate the intermediate DL process and determine the most dominant factors in the search for optimal solutions is worth investigating in depth. We optimistically envisage that advancements in DL algorithms and computation/optimization infrastructure will enable us to realize more efficient and reliable training approaches, more complex nanostructures with unprecedented shapes and sizes, and more intelligent and reconfigurable optic/optoelectronic systems.
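To make the forward/inverse distinction concrete, the toy sketch below replaces the Maxwell solver with an analytic stand-in and the DNN with a polynomial least-squares surrogate; all numbers and the structure-to-resonance relation are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "simulator": resonance wavelength (nm) of a nanostructure as a
# function of one geometric parameter (e.g., rod length in nm). This
# analytic stand-in plays the role of an expensive Maxwell solver.
def simulate(length):
    return 400.0 + 1.5 * length + 0.002 * length**2

# Forward modeling: fit a cheap surrogate to noisy "simulation" data.
lengths = rng.uniform(50, 300, size=200)
wavelengths = simulate(lengths) + rng.normal(0, 1.0, size=200)
surrogate = np.poly1d(np.polyfit(lengths, wavelengths, deg=2))

# Inverse design: brute-force search over the surrogate for the
# structure whose predicted response is closest to a target wavelength.
target = 800.0
grid = np.linspace(50, 300, 1000)
best_len = grid[np.argmin(np.abs(surrogate(grid) - target))]
print(f"structure for {target} nm target: length ~ {best_len:.1f} nm")
```

A tandem DNN follows the same pattern: a trained forward network scores candidate structures, and a second network (rather than a grid search) proposes them.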

AI in other fields of information science

We believe that AI has great potential in the following directions:

  • AI-based risk control and management in utilities can prevent costly or hazardous equipment failures by using sensors that detect and send information regarding the machine's health to the manufacturer, predicting possible issues so as to ensure timely maintenance or automated shutdown.
  • AI could be used to produce simulations of real-world objects, called digital twins. When applied to the field of engineering, digital twins allow engineers and technicians to analyze the performance of a piece of equipment virtually, thus avoiding the safety and budget issues associated with traditional testing methods.
  • Combined with AI, intelligent robots are playing an important role in industry and human life. Unlike traditional robots, which work according to procedures specified by humans, intelligent robots have the ability of perception, recognition, and even automatic planning and decision-making based on changes in environmental conditions.
  • AI of things (AIoT), or AI-empowered IoT applications, 30 has become a promising development trend. AI can empower connected IoT devices, embedded in various physical infrastructures, to perceive, recognize, learn, and act. For instance, smart cities constantly collect data regarding quality-of-life factors, such as the status of power supply, public transportation, air pollution, and water use, to manage and optimize systems in cities. Because these data, especially personal data, are collected from informed or uninformed participants, data security and privacy 31 require protection.
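The predictive-maintenance idea in the first bullet can be sketched with a minimal anomaly detector; the sensor readings and the 3-sigma rule below are illustrative assumptions, and production systems use far richer models:

```python
import statistics

# Hypothetical vibration readings from a machine-health sensor: mostly
# normal operation, with a drift near the end signalling imminent failure.
readings = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 1.05,
            1.0, 0.95, 1.1, 1.0, 0.9, 2.4, 2.6, 2.8]

baseline = readings[:10]                 # assumed-healthy history
mu = statistics.mean(baseline)
sigma = statistics.stdev(baseline)

def is_anomalous(x, k=3.0):
    """Flag readings more than k standard deviations from the baseline."""
    return abs(x - mu) > k * sigma

alerts = [i for i, x in enumerate(readings) if is_anomalous(x)]
print("anomalous indices:", alerts)
```

When an alert fires, the system would schedule maintenance or trigger an automated shutdown, as described above.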

AI in mathematics

Mathematics always plays a crucial and indispensable role in AI. Decades ago, quite a few classical AI-related approaches, such as k-nearest neighbor, 32 support vector machine, 33 and AdaBoost, 34 were proposed and developed after their rigorous mathematical formulations had been established. In recent years, with the rapid development of DL, 35 AI has been gaining more and more attention in the mathematical community. Equipped with the Markov process, minimax optimization, and Bayesian statistics, RL, 36 GANs, 37 and Bayesian learning 38 became the most favorable tools in many AI applications. Nevertheless, there still exist plenty of open problems in mathematics for ML, including the interpretability of neural networks, the optimization problems of parameter estimation, and the generalization ability of learning models. In the rest of this section, we discuss these three questions in turn.

The interpretability of neural networks

From a mathematical perspective, ML usually constructs nonlinear models, with neural networks as a typical case, to approximate certain functions. The well-known Universal Approximation Theorem suggests that, under very mild conditions, any continuous function can be uniformly approximated on compact domains by neural networks, 39 which serves a vital function in the interpretability of neural networks. However, in real applications, ML models seem to admit accurate approximations of many extremely complicated functions, sometimes even black boxes, which are far beyond the scope of continuous functions. To understand the effectiveness of ML models, many researchers have investigated the function spaces that can be well approximated by them, and the corresponding quantitative measures. This issue is closely related to the classical approximation theory, but the approximation scheme is distinct. For example, Bach 40 finds that the random feature model is naturally associated with the corresponding reproducing kernel Hilbert space. In the same way, the Barron space is identified as the natural function space associated with two-layer neural networks, and the approximation error is measured using the Barron norm. 41 The corresponding quantities of residual networks (ResNets) are defined for the flow-induced spaces. For multi-layer networks, the natural function spaces for the purposes of approximation theory are the tree-like function spaces introduced in Wojtowytsch. 42 There are several works revealing the relationship between neural networks and numerical algorithms for solving partial differential equations. For example, He and Xu 43 discovered that CNNs for image classification have a strong connection with multi-grid (MG) methods. In fact, the pooling operation and feature extraction in CNNs correspond directly to restriction operation and iterative smoothers in MG, respectively. 
Hence, various convolution and pooling operations used in CNNs can be better understood.
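One common formulation of the theorem invoked above, for a one-hidden-layer network with a continuous, nonpolynomial activation \(\sigma\), can be written as follows:

```latex
% For any continuous f on a compact K \subset \mathbb{R}^d and any
% \varepsilon > 0, there exist a width N, weights w_i \in \mathbb{R},
% a_i \in \mathbb{R}^d, and biases b_i \in \mathbb{R} such that
\sup_{x \in K} \left| f(x) - \sum_{i=1}^{N} w_i \,\sigma\!\left(a_i^{\top} x + b_i\right) \right| < \varepsilon .
```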

The optimization problems of parameter estimation

In general, the optimization problem of estimating parameters of certain DNNs is in practice highly nonconvex and often nonsmooth. Can the global minimizers be expected? What is the landscape of local minimizers? How does one handle the nonsmoothness? All these questions are nontrivial from an optimization perspective. Indeed, numerous works and experiments demonstrate that the optimization for parameter estimation in DL is itself a much nicer problem than once thought; see, e.g., Goodfellow et al. 44 As a consequence, the study on the solution landscape ( Figure 3 ), also known as loss surface of neural networks, is no longer supposed to be inaccessible and can even in turn provide guidance for global optimization. Interested readers can refer to the survey paper (Sun et al. 45 ) for recent progress in this aspect.

Figure 3. Solution landscape (loss surface) of neural networks.

Recent studies indicate that nonsmooth activation functions, e.g., rectified linear units, are better than smooth ones in finding sparse solutions. However, the chain rule does not apply when the activation functions are nonsmooth, which makes the widely used stochastic gradient (SG)-based approaches infeasible in theory. As a remedy, taking approximate gradients at nonsmooth iterates keeps SG-type methods in extensive use, but numerical evidence has also exposed their limitations. Also, the penalty-based approaches proposed by Cui et al. 46 and Liu et al. 47 provide a new direction for solving nonsmooth optimization problems efficiently.
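A one-dimensional sketch of the subgradient idea behind these remedies (the objective and the diminishing step-size rule are chosen purely for illustration):

```python
# Minimize the nonsmooth f(x) = |x - 3|, which is nondifferentiable at
# x = 3. Where the gradient does not exist we pick any valid subgradient
# (here 0) and run subgradient descent with steps 1/t.
def subgrad(x):
    if x > 3:
        return 1.0
    if x < 3:
        return -1.0
    return 0.0          # any value in [-1, 1] is a valid subgradient

x = 0.0
for t in range(1, 2001):
    x -= (1.0 / t) * subgrad(x)   # diminishing step size ensures convergence

print(f"x converged near {x:.3f}")  # approaches the minimizer x = 3
```

The same principle, applied coordinate-wise to ReLU networks, is what keeps SG-type training usable despite the nonsmoothness.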

The generalization ability of learning models

A small training error does not always lead to a small test error. This gap is caused by the generalization ability of learning models. A key finding in statistical learning theory states that the generalization error is bounded by a quantity that grows with the increase of the model capacity, but shrinks as the number of training examples increases. 48 A common conjecture relating generalization to the solution landscape is that flat and wide minima generalize better than sharp ones. Thus, regularization techniques, including the dropout approach, 49 have emerged to force the algorithms to bypass sharp minima. However, the mechanism behind this has not been fully explored. Recently, some researchers have focused on the ResNet-type architecture, with dropout being inserted after the last convolutional layer of each building block. They thus managed to explain the stochastic dropout training process and the ensuing dropout regularization effect from the perspective of optimal control. 50
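Bounds of the kind mentioned above are often stated via the Rademacher complexity \(\mathcal{R}_n(\mathcal{H})\) of the hypothesis class \(\mathcal{H}\), a standard capacity measure; the exact constants vary by formulation:

```latex
% With probability at least 1 - \delta over n i.i.d. training samples,
% uniformly for all h \in \mathcal{H}:
R(h) \;\le\; \widehat{R}_n(h) \;+\; 2\,\mathcal{R}_n(\mathcal{H}) \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}},
% where R is the true risk and \widehat{R}_n the empirical risk: the gap
% grows with model capacity and shrinks as n increases.
```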

AI in medical science

AI technology is becoming more and more significant in daily operations, including in medical fields. With the growing healthcare needs of patients, hospitals are evolving from informatized networking to the Internet Hospital and eventually to the Smart Hospital. At the same time, AI tools and hardware performance are growing rapidly with each passing day. Eventually, common AI algorithms, such as CV, NLP, and data mining, will be embedded in the medical equipment market ( Figure 4 ).

Figure 4. AI algorithms embedded in the medical equipment market.

AI doctor based on electronic medical records

For medical history data, it is worth mentioning Doctor Watson, developed on IBM's Watson platform, and Modernizing Medicine, which targets oncology; these are now adopted by CVS & Walgreens in the US and by various medical organizations in China as well. Doctor Watson takes advantage of the NLP capability of the IBM Watson platform, which has already collected vast medical history data as well as prior knowledge in the literature for reference. After a patient's case is input, Doctor Watson searches the medical history reserve and forms an elementary treatment proposal, which is further ranked against the prior knowledge reserves. With the multiple models stored, Doctor Watson gives a final proposal together with its confidence. However, there are still problems for such AI doctors: 51 because they rely on prior experience from US hospitals, their proposals may not be suitable for other regions with different medical insurance policies. Besides, updating the Watson platform's knowledge relies heavily on updates to the knowledge reserve, which still requires manual work.

AI for public health: Outbreak detection and health QR code for COVID-19

AI can be used for public health purposes in many ways. One classical usage is to detect disease outbreaks using search engine query data or social media data, as Google did for prediction of influenza epidemics 52 and the Chinese Academy of Sciences did for modeling the COVID-19 outbreak through multi-source information fusion. 53 After the COVID-19 outbreak, a digital health Quick Response (QR) code system was developed in China, first to detect potential contact with confirmed COVID-19 cases and second to indicate the person's health status using mobile big data. 54 Different colors indicate different health statuses: green means healthy and OK for daily life, orange means at risk and requiring quarantine, and red means a confirmed COVID-19 patient. The system is easy for the general public to use, and has been adopted by many other countries. The health QR code has made great contributions to the worldwide prevention and control of the COVID-19 pandemic.
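The color assignment described above can be caricatured as a simple decision rule; the real system fuses mobile big data, so the two boolean inputs here are a deliberate, hypothetical oversimplification:

```python
# Hypothetical simplification of the health-code logic: map two toy risk
# signals to the three status colors described in the text.
def health_code(confirmed_case: bool, close_contact: bool) -> str:
    if confirmed_case:
        return "red"      # confirmed COVID-19 patient
    if close_contact:
        return "orange"   # at risk: quarantine required
    return "green"        # healthy: OK for daily life

print(health_code(False, False))  # green
print(health_code(False, True))   # orange
print(health_code(True, True))    # red
```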

Biomarker discovery with AI

High-dimensional data, including multi-omics data, patient characteristics, medical laboratory test data, etc., are often used for generating various predictive or prognostic models through DL or statistical modeling methods. For instance, the COVID-19 severity evaluation model was built through ML using proteomic and metabolomic profiling data of sera 55 ; using integrated genetic, clinical, and demographic data, Taliaz et al. built an ML model to predict patient response to antidepressant medications 56 ; prognostic models for multiple cancer types (such as liver cancer, lung cancer, breast cancer, gastric cancer, colorectal cancer, pancreatic cancer, prostate cancer, ovarian cancer, lymphoma, leukemia, sarcoma, melanoma, bladder cancer, renal cancer, thyroid cancer, head and neck cancer, etc.) were constructed through DL or statistical methods, such as least absolute shrinkage and selection operator (LASSO), combined with Cox proportional hazards regression model using genomic data. 57
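To illustrate how a LASSO-type penalty performs the feature selection described above, the sketch below runs proximal gradient descent (ISTA) on synthetic data; the plain squared-error loss stands in for the Cox partial likelihood used in actual survival models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for high-dimensional omics data: 100 samples, 20 features,
# only 3 of which truly drive the outcome (sparse ground truth).
n, p = 100, 20
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[[0, 5, 12]] = [2.0, -1.5, 1.0]
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# LASSO via ISTA: the soft-thresholding step zeroes out weak coefficients,
# mimicking biomarker selection from many candidate features.
lam = 0.1
L = np.linalg.norm(X, 2) ** 2 / n     # Lipschitz constant of the gradient
step = 1.0 / L
beta = np.zeros(p)
for _ in range(500):
    grad = X.T @ (X @ beta - y) / n
    z = beta - step * grad
    beta = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)

selected = np.flatnonzero(np.abs(beta) > 1e-3)
print("selected features:", selected)
```

The nonzero coefficients that survive the penalty are the candidate "biomarkers"; in practice the same idea is combined with Cox regression, as in the cited prognostic models.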

Image-based medical AI

Medical image AI is one of the most mature areas, as there are numerous models for classification, detection, and segmentation tasks in CV. In the clinical area, CV algorithms can also be used for computer-aided diagnosis and treatment with ECG, CT, eye fundus imaging, etc. Whereas human doctors may tire and become prone to mistakes after viewing hundreds of images for diagnosis, AI systems can outperform a human reader of medical images owing to their ability to perform repetitive work without fatigue. The first medical AI product approved by the FDA is IDx-DR, which uses an AI model to make predictions of diabetic retinopathy. The smartphone app SkinVision can accurately detect melanomas. 58 It uses “fractal analysis” to identify moles and their surrounding skin, based on size, diameter, and many other parameters, and to detect abnormal growth trends. AI-ECG of LEPU Medical can automatically detect heart disease from ECG images. Lianying Medical leverages its hardware to deliver real-time, high-definition, image-guided all-round radiotherapy technology, which successfully achieves precise treatment.

Wearable devices for surveillance and early warning

For wearable devices, AliveCor has developed an algorithm to automatically detect atrial fibrillation, an early warning sign of stroke and heart failure. The 23andMe company can also test saliva samples at a small cost, providing customers with information based on their genes, including who their ancestors were or diseases they may be prone to later in life. It provides accurate health management solutions based on individual and family genetic data. Over the next 20–30 years, we believe there are several directions for further research: (1) causal inference for real-time in-hospital risk prediction. Clinical doctors usually require reasonable explanations for certain medical decisions, but current AI models are usually black boxes. Causal inference will help doctors explain certain AI decisions and even discover novel ground truths. (2) Devices, including wearable instruments, for multi-dimensional health monitoring. Multi-modality modeling is now a trend in AI research. With various devices collecting multi-modality data and a central processor fusing them, a model can monitor the user's overall real-time health condition and give precautions more precisely. (3) Automatic discovery of clinical markers for diseases that are difficult to diagnose. Diseases such as ALS are still difficult for clinical doctors to diagnose because they lack any effective general marker. AI may be able to discover phenomena common to these patients and find an effective marker for early diagnosis.

AI-aided drug discovery

Today we have entered the precision medicine era, and new targeted drugs are the cornerstones of precision therapy. However, over the past decades, it has taken an average of over one billion dollars and 10 years to bring a new drug to market. How to accelerate the drug discovery process and avoid late-stage failure are key concerns for all the big, fiercely competitive pharmaceutical companies. The emerging role of AI, including ML, DL, expert systems, and artificial neural networks (ANNs), has brought new insights and high efficiency into new drug discovery processes. AI has been adopted in many aspects of drug discovery, including de novo molecule design, structure-based modeling for proteins and ligands, quantitative structure-activity relationship research, and druggability judgments. DL-based AI approaches demonstrate superior merits in addressing some challenging problems in drug discovery. Of course, prediction of chemical synthesis routes and chemical process optimization are also valuable in accelerating new drug discovery, as well as lowering production costs.

There has been notable progress in AI-aided new drug discovery in recent years, in both new chemical entity discovery and the related business area. Based on DNNs, DeepMind built the AlphaFold platform to predict 3D protein structures, outperforming other algorithms. As an illustration of this achievement, AlphaFold successfully and accurately predicted 25 protein structures from scratch out of a 43-protein panel without using previously solved protein models. Accordingly, AlphaFold won the CASP13 protein-folding competition in December 2018. 59 Based on GANs and other ML methods, Insilico constructed the modular drug design platform GENTRL. In September 2019, they reported the discovery of the first de novo active DDR1 kinase inhibitor developed by the GENTRL system. It took the team only 46 days from target selection to an active drug candidate using in vivo data. 60 Exscientia and Sumitomo Dainippon Pharma developed a new drug candidate, DSP-1181, for the treatment of obsessive-compulsive disorder on the Centaur Chemist AI platform. In January 2020, DSP-1181 entered phase I clinical trials, meaning that the comprehensive exploration from program initiation to phase I study took less than 12 months. In contrast, comparable drug discovery usually needs 4–5 years with traditional methods.

How AI transforms medical practice: A case study of cervical cancer

As the most common malignant tumor in women, cervical cancer is a disease that has a clear cause and can be prevented, and even treated, if detected early. Conventionally, the screening strategy for cervical cancer mainly adopts the “three-step” model of “cervical cytology-colposcopy-histopathology.” 61 However, limited by the available testing methods, the efficiency of cervical cancer screening is not high. In addition, owing to gaps in doctors' knowledge in some primary hospitals, patients cannot always be provided with the best diagnosis and treatment decisions. In recent years, with the advent of the era of computer science and big data, AI has gradually begun to extend into and blend with various fields. In particular, AI has been widely used in a variety of cancers as a new tool for data mining. For cervical cancer, a clinical database with millions of medical records and pathological data has been built, and an AI medical tool set has been developed. 62 This AI analysis algorithm gives doctors access to rapid, iterative AI model training. In addition, a prognostic prediction model established by ML and a web-based prognostic result calculator have been developed, which can accurately predict the risk of postoperative recurrence and death in cervical cancer patients, and thereby better guide decision-making in postoperative adjuvant treatment. 63

AI in materials science

As the cornerstone of modern industry, materials have played a crucial role in the design of revolutionary forms of matter, with targeted properties for broad applications in energy, information, biomedicine, construction, transportation, national security, spaceflight, and so forth. Traditional strategies rely on the empirical trial and error experimental approaches as well as the theoretical simulation methods, e.g., density functional theory, thermodynamics, or molecular dynamics, to discover novel materials. 64 These methods often face the challenges of long research cycles, high costs, and low success rates, and thus cannot meet the increasingly growing demands of current materials science. Accelerating the speed of discovery and deployment of advanced materials will therefore be essential in the coming era.

With the rapid development of data processing and powerful algorithms, AI-based methods, such as ML and DL, are emerging with good potential in the search for and design of new materials prior to actually manufacturing them. 65 , 66 By integrating material property data, such as the constituent element, lattice symmetry, atomic radius, valence, binding energy, electronegativity, magnetism, polarization, energy band, structure-property relation, and functionalities, the machine can be trained to “think” about how to improve material design and even predict the properties of new materials in a cost-effective manner ( Figure 5 ).

Figure 5. AI is expected to power the development of materials science.

AI in discovery and design of new materials

Recently, AI techniques have made significant advances in rational design and accelerated discovery of various materials, such as piezoelectric materials with large electrostrains, 67 organic-inorganic perovskites for photovoltaics, 68 molecular emitters for efficient light-emitting diodes, 69 inorganic solid materials for thermoelectrics, 70 and organic electronic materials for renewable-energy applications. 66 , 71 The power of data-driven computing and algorithmic optimization can promote comprehensive applications of simulation and ML (i.e., high-throughput virtual screening, inverse molecular design, Bayesian optimization, and supervised learning, etc.), in material discovery and property prediction in various fields. 72 For instance, using a DL Bayesian framework, the attribute-driven inverse materials design has been demonstrated for efficient and accurate prediction of functional molecular materials, with desired semiconducting properties or redox stability for applications in organic thin-film transistors, organic solar cells, or lithium-ion batteries. 73 It is meaningful to adopt automation tools for quick experimental testing of potential materials and utilize high-performance computing to calculate their bulk, interface, and defect-related properties. 74 The effective convergence of automation, computing, and ML can greatly speed up the discovery of materials. In the future, with the aid of AI techniques, it will be possible to accomplish the design of superconductors, metallic glasses, solder alloys, high-entropy alloys, high-temperature superalloys, thermoelectric materials, two-dimensional materials, magnetocaloric materials, polymeric bio-inspired materials, sensitive composite materials, and topological (electronic and phonon) materials, and so on. 
In the past decade, topological materials have ignited the research enthusiasm of condensed matter physicists, materials scientists, and chemists, as they exhibit exotic physical properties with potential applications in electronics, thermoelectrics, optics, catalysis, and energy-related fields. From the most recent predictions, more than a quarter of all inorganic materials in nature are topologically nontrivial. The establishment of topological electronic materials databases 75 , 76 , 77 and topological phononic materials databases 78 using high-throughput methods will help to accelerate the screening and experimental discovery of new topological materials for functional applications. It is recognized that large-scale high-quality datasets are required to practice AI. Great efforts have also been expended in building high-quality materials science databases. As one of the top-ranking databases of its kind, the “atomly.net” materials data infrastructure, 79 has calculated the properties of more than 180,000 inorganic compounds, including their equilibrium structures, electron energy bands, dielectric properties, simulated diffraction patterns, elasticity tensors, etc. As such, the atomly.net database has set a solid foundation for extending AI into the area of materials science research. The X-ray diffraction (XRD)-matcher model of atomly.net uses ML to match and classify the experimental XRD to the simulated patterns. Very recently, by using the dataset from atomly.net, an accurate AI model was built to rapidly predict the formation energy of almost any given compound to yield a fairly good predictive ability. 80
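As a schematic of such property-prediction models (the descriptors, the generating relation, and the dataset below are entirely fictitious, not taken from atomly.net), even a k-nearest-neighbor regressor captures the idea of predicting formation energy from composition features:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fictitious dataset: each "compound" is described by three toy descriptors
# (mean electronegativity, mean atomic radius, valence-electron count) and
# a formation energy to predict (eV/atom).
X = rng.uniform(0, 1, size=(300, 3))
# Hypothetical ground-truth relation used only to generate labels.
y = -2.0 * X[:, 0] + X[:, 1] ** 2 - 0.5 * X[:, 2] + rng.normal(0, 0.05, 300)

X_train, y_train = X[:250], y[:250]
X_test, y_test = X[250:], y[250:]

def knn_predict(x, k=5):
    """Predict by averaging the k nearest training compounds."""
    d = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argsort(d)[:k]].mean()

pred = np.array([knn_predict(x) for x in X_test])
mae = np.abs(pred - y_test).mean()
print(f"test MAE: {mae:.3f} eV/atom")
```

Production models such as the one cited replace the nearest-neighbor rule with deep networks and use physically meaningful descriptors computed from the database entries.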

AI-powered Materials Genome Initiative

The Materials Genome Initiative (MGI) is a grand plan for the rational realization of new materials and related functions; it aims to discover, manufacture, and deploy advanced materials efficiently, cost-effectively, and intelligently. The initiative creates policy, resources, and infrastructure for accelerating materials development at a high level. This is a new paradigm for the discovery and design of next-generation materials: it runs from the viewpoint of fundamental building blocks toward general materials development, and accelerates materials development through efforts in theory, computation, and experiment in a highly integrated, high-throughput manner. MGI sets an ambitious goal and standard for materials development and materials science in the future. The spirit of MGI is to design novel materials by using data pools and powerful computation once the requirements or aspirations of functional usage appear. Theory, computation, and algorithms are the primary and substantial factors in the establishment and implementation of MGI. Advances in theory, computation, and experiment in materials science and engineering provide the foundation not only to accelerate the speed at which new materials are realized but also to shorten the time needed to push new products to market. AI techniques bring great promise to the developing MGI. The application of new technologies, such as ML and DL, directly accelerates materials research and the establishment of MGI. Model construction and application to science and engineering, as well as the data infrastructure, are of central importance. When AI-powered MGI approaches are coupled with the ongoing autonomy of manufacturing methods, the potential impact on society and the economy in the future is profound. 
We are now beginning to see that the AI-aided MGI, among other things, integrates experiments, computation, and theory; facilitates access to materials data; equips the next generation of the materials workforce; and enables a paradigm shift in materials development. Furthermore, the AI-powered MGI could also design operational procedures and control equipment to execute experiments, further realizing autonomous experimentation in future materials research.

Advanced functional materials for generation upgrade of AI

The realization and application of AI techniques depend on computational capability and computer hardware, whose physical functionality rests on the performance of computers or supercomputers. In current technology, the electric currents or carriers driving chips and devices consist of electrons with ordinary characteristics, such as heavy mass and low mobility. All chips and devices therefore emit considerable heat, consuming too much energy and lowering the efficiency of information transmission. Benefiting from the rapid development of modern physics, a series of advanced materials with exotic functional effects have been discovered or designed, including superconductors, quantum anomalous Hall insulators, and topological fermions. In particular, the superconducting state or topologically nontrivial electrons will promote next-generation AI techniques once (near) room temperature applications of these states are realized and implanted in integrated circuits. 81 In this case, the central processing units, signal circuits, and power channels will be driven by electronic carriers that are massless, energy-diffusionless, of ultra-high mobility, or chirality-protected. Ordinary electrons will be removed from the physical circuits of future-generation chips and devices, leaving superconducting and topological chiral electrons running in future AI chips and supercomputers. The efficiency of transmission for information and logic computing will be improved on a vast scale and at a very low cost.

AI for materials and materials for AI

The coming decade will continue to witness the development of advanced ML algorithms, newly emerging data-driven AI methodologies, and integrated technologies for facilitating structure design and property prediction, as well as to accelerate the discovery, design, development, and deployment of advanced materials into existing and emerging industrial sectors. At this moment, we are facing challenges in achieving accelerated materials research through the integration of experiment, computation, and theory. The great MGI, proposed for high-level materials research, helps to promote this process, especially when it is assisted by AI techniques. Still, there is a long way to go for the usage of these advanced functional materials in future-generation electric chips and devices to be realized. More materials and functional effects need to be discovered or improved by the developing AI techniques. Meanwhile, it is worth noting that materials are the core components of devices and chips that are used for construction of computers or machines for advanced AI systems. The rapid development of new materials, especially the emergence of flexible, sensitive, and smart materials, is of great importance for a broad range of attractive technologies, such as flexible circuits, stretchable tactile sensors, multifunctional actuators, transistor-based artificial synapses, integrated networks of semiconductor/quantum devices, intelligent robotics, human-machine interactions, simulated muscles, biomimetic prostheses, etc. These promising materials, devices, and integrated technologies will greatly promote the advancement of AI systems toward wide applications in human life. Once the physical circuits are upgraded by advanced functional or smart materials, AI techniques will largely promote the developments and applications of all disciplines.

AI in geoscience

AI technologies are involved in a wide range of geoscience fields.

Momentous challenges threatening society today require solutions to problems that belong to geoscience: evaluating the effects of climate change, assessing air quality, forecasting the effects of disasters on infrastructure, calculating the future consumption and availability of food, water, and soil resources, and identifying indicators of potential volcanic eruptions, tsunamis, floods, and earthquakes. 82 , 83 With the emergence of advanced technology products (e.g., deep-sea drilling vessels and remote sensing satellites), enhancements in computational infrastructure that allow large-scale, wide-range simulations of multiple geoscience models, and internet-based data analysis that facilitates collection, processing, and storage of data in distributed and crowd-sourced environments, it has become possible to address these problems at scale. 84 The growing availability of massive geoscience data offers unlimited possibilities for AI—which has popularized all aspects of our daily life (e.g., entertainment, transportation, and commerce)—to contribute significantly to geoscience problems of great societal relevance. As geoscience enters the era of massive data, AI, already extensively successful in other fields, offers immense opportunities for settling a series of problems in Earth systems. 85 , 86 Accompanied by diversified data, AI-enabled technologies, such as smart sensors, image visualization, and intelligent inversion, are being actively examined across geoscience fields, including marine geoscience, rock physics, geology, ecology, seismology, environmental science, hydrology, remote sensing, ArcGIS, and planetary science. 87

Multiple challenges in the development of geoscience

Several traits of geoscience restrict the applicability of standard algorithms for knowledge discovery: (1) the inherent complexity of geoscience processes, (2) limitations of geoscience data collection, and (3) uncertainty in samples and ground truth. 88 , 89 , 90 Geoscience objects generally have amorphous boundaries in space and time and are not as well defined as objects in other fields. Geoscience phenomena are also significantly multivariate, obey nonlinear relationships, and exhibit spatiotemporal structure and non-stationary characteristics. Beyond the inherent challenges of geoscience observations, massive data spanning multiple dimensions of time and space, with different levels of incompleteness, noise, and uncertainty, complicate geoscience analysis. For supervised learning approaches, further difficulties arise from the lack of gold-standard ground truth and the small sample sizes (e.g., limited historical data with sufficient observations) in geoscience applications.

Usage of AI technologies as efficient approaches to promote the geoscience processes

Geoscientists continually strive to develop better techniques for simulating the present state of the Earth system (e.g., how much greenhouse gas is released into the atmosphere) and the connections between and within its subsystems (e.g., how elevated temperature influences the ocean ecosystem). From the perspective of geoscience, newly emerging AI-aided approaches are well matched to these tasks: (1) characterizing objects and events 91 ; (2) estimating geoscience variables from observations 92 ; (3) forecasting geoscience variables from long-term observations 85 ; (4) exploring relationships in geoscience data 93 ; and (5) causal discovery and causal attribution. 94 While traditional characterization of geoscience objects and events is rooted primarily in hand-coded features, pattern-mining algorithms can detect structure in the data automatically and improve performance. However, because spatiotemporal targets have vague boundaries and associated uncertainties, pattern-mining methods may need to be advanced so that they can explain the temporal and spatial characteristics of geoscience data when characterizing different events and objects. To address the non-stationarity of geoscience data, AI-aided algorithms have been extended to integrate the outputs of multiple expert predictors and produce robust estimates of climate variables (e.g., humidity and temperature). Furthermore, AI-enabled forecasting of long-term trends in the Earth system can simulate future scenarios and inform early resource planning and adaptation policies. Mining relationships in geoscience data can help us capture vital signs of the Earth system and deepen our understanding of geoscience processes.
Of particular interest is the advancement of AI decision methodologies that handle uncertain prediction probabilities, including poorly resolved distribution tails signifying the most extreme, transient, and rare events across ensembles of models, supporting a variety of use cases with improved accuracy and effectiveness.
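The ensemble integration of expert predictors described above can be sketched minimally in Python. The numbers, weights, and the function name `ensemble_estimate` are hypothetical illustrations, not any published system:

```python
def ensemble_estimate(predictions, weights=None):
    """Combine estimates of a climate variable from several imperfect
    predictors into one robust value via a (weighted) average."""
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    return sum(p * w for p, w in zip(predictions, weights)) / total

# Three hypothetical temperature estimates (deg C) for the same site
estimates = [21.8, 22.4, 22.0]
combined = ensemble_estimate(estimates)

# Down-weighting a less reliable predictor shifts the combined estimate
weighted = ensemble_estimate([1.0, 3.0], weights=[3.0, 1.0])
```

Real climate-variable estimation weights predictors by skill (e.g., inverse error variance) and accounts for spatial structure; this sketch shows only the combination step.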

AI technologies for optimizing the resource management in geoscience

Currently, AI can outperform humans in some well-defined tasks. For example, AI techniques have been used in urban water resource planning, mainly owing to their remarkable capacity for modeling, flexibility, reasoning, and forecasting of water demand and capacity. The design and application of an Adaptive Intelligent Dynamic Water Resource Planning system, an application of AI to sustainable urban water resource management, has greatly advanced the optimization of water resource allocation and will ultimately minimize operating costs and improve the sustainability of environmental management 95 ( Figure 6 ). Meteorology likewise requires collecting tremendous amounts of data on many variables, such as humidity, altitude, and temperature, and handling such huge datasets is a major challenge. 96 AI-based techniques are being used to analyze shallow-water reef images and recognize coral color—to track the effects of climate change—and to collect humidity, temperature, and CO 2 data—to gauge the health of our ecological environment. 97 Beyond meteorology, AI can also play a critical role in decreasing greenhouse gas emissions from the electric-power sector. Across the production, transportation, allocation, and consumption of electricity, many opportunities exist for AI applications, including speeding up the development of new clean energy, enhancing system optimization and management, improving electricity-demand forecasting and distribution, and advancing system monitoring. 98 With the aid of AI, new materials may even be found for batteries that store energy or that absorb CO 2 from the atmosphere. 99 And although fossil fuels have been used for thousands of years, AI techniques are now helping to explore more sustainable potential energy sources (e.g., fusion technology). 100
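Water-demand forecasting of the kind described above can be caricatured as a least-squares trend extrapolation. This is a deliberately simplified sketch; real planning systems use far richer models, and the demand figures below are hypothetical:

```python
def fit_linear_trend(series):
    """Ordinary least-squares fit of y = a + b*t to a demand series,
    where t is the time index 0, 1, 2, ..."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    b = num / den
    a = y_mean - b * t_mean
    return a, b

def forecast(series, steps):
    """Extrapolate the fitted trend for the next `steps` periods."""
    a, b = fit_linear_trend(series)
    n = len(series)
    return [a + b * (n + k) for k in range(steps)]

# Hypothetical monthly urban water demand (megalitres)
demand = [100, 103, 107, 110, 114, 118]
next_two = forecast(demand, 2)
```

A production system would add seasonality, weather covariates, and uncertainty estimates; the point here is only the demand-forecasting step the text refers to.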

Figure 6. Applications of AI in hydraulic resource management

In addition to adjusting energy structures in response to climate change (a core part of geoscience systems), a second, less obvious step can be taken to reduce greenhouse gas emissions: using AI to target inefficiencies. A statistical report by the Lawrence Livermore National Laboratory pointed out that around 68% of energy produced in the US is rejected as waste rather than put to purposeful use, such as electricity generation or transportation, and instead contributes to environmental burdens. 101 AI is primed to reduce these inefficiencies in current nuclear power plants and fossil fuel operations, as well as to improve the efficiency of renewable grid resources. 102 For example, AI can be instrumental in the operation and optimization of solar and wind farms, making these utility-scale renewable-energy systems far more efficient at producing electricity. 103 AI can also help reduce energy losses in electricity transmission and allocation. 104 A distribution system operator in Europe used AI to analyze load, voltage, and network distribution data, helping “operators assess available capacity on the system and plan for future needs.” 105 AI allowed the operator to deploy existing and new resources so that the distribution of energy assets became more readily available and flexible. The International Energy Agency has proposed that energy efficiency is core to the reform of energy systems and will play a key role in reducing the growth of global energy demand to one-third of the current level by 2040.

AI as a building block to promote development in geoscience

The Earth system is of significant scientific interest and affects all aspects of life. 106 The challenges, problems, and promising directions outlined here are by no means exhaustive; rather, they illustrate the great potential for future AI research in this important field. Progress and adoption of AI approaches in the geosciences is usually driven by a posed scientific question, and the best way to succeed is for AI researchers to work closely with geoscientists at all stages of research. Geoscientists are best placed to judge which scientific questions are important and novel, which sample collection processes can reasonably exhibit the relevant phenomena, which datasets and parameters can be used to answer a question, and which pre-processing operations should be conducted, such as removing seasonal cycles or smoothing. AI researchers, in turn, are better positioned to decide which data analysis approaches are appropriate and available for the data, their advantages and disadvantages, and what those approaches actually capture. Interpretability is also an important goal in geoscience because, if we can understand the reasoning behind the models, patterns, or relationships extracted from the data, they can serve as building blocks in scientific knowledge discovery. Frequent communication between the two communities therefore avoids long detours and ensures that analysis results are genuinely beneficial to both geoscientists and AI researchers.

AI in the life sciences

The development of AI and the life sciences are intertwined. The ultimate goal of AI is to achieve human-like intelligence: the human brain is capable of multi-tasking, learning with minimal supervision, and generalizing learned skills, all accomplished with high efficiency and low energy cost. 107

Mutual inspiration between AI and neuroscience

In the past decades, neuroscience concepts have been introduced into ML algorithms and have played critical roles in triggering several important advances in AI. For example, the origins of DL methods lie directly in neuroscience, 5 which further stimulated the emergence of the field of RL. 108 The current state-of-the-art CNNs incorporate several hallmarks of neural computation, including nonlinear transduction, divisive normalization, and maximum-based pooling of inputs, 109 which were directly inspired by the unique processing of visual input in the mammalian visual cortex. 110 By introducing the brain's attentional mechanisms, a novel network has been shown to achieve greater accuracy and computational efficiency on difficult multi-object recognition tasks than conventional CNNs. 111 Other neuroscience findings, including the mechanisms underlying working memory, episodic memory, and neural plasticity, have inspired the development of AI algorithms that address several challenges in deep networks. 108 These algorithms can be directly implemented in the design and refinement of brain-machine interfaces and neuroprostheses.
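Two of the cortex-inspired computations mentioned above, divisive normalization and max-based pooling, can be sketched in a few lines of Python. This is a schematic of the operations themselves, not the implementation of any specific CNN, and the activation values are hypothetical:

```python
def divisive_normalization(responses, sigma=1.0):
    """Scale each unit's response by the pooled activity of the population,
    a canonical cortical computation adopted in some CNNs."""
    pool = sum(r * r for r in responses)
    return [r / (sigma ** 2 + pool) ** 0.5 for r in responses]

def max_pool(values, window):
    """Max-based pooling: keep only the strongest response in each window."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]

acts = [0.2, 3.0, 0.1, 0.5, 2.0, 0.3]   # hypothetical unit activations
pooled = max_pool(acts, 2)
normalized = divisive_normalization(pooled)
```

In real CNNs these operations act over spatial windows and channels of feature maps; the one-dimensional version above shows only the arithmetic.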

On the other hand, insights from AI research have the potential to offer new perspectives on the basics of intelligence in the brains of humans and other species. Unlike traditional neuroscientists, AI researchers can formalize the concepts of neural mechanisms in a quantitative language to extract their necessity and sufficiency for intelligent behavior. An important illustration of such exchange is the development of temporal-difference (TD) methods in RL models and the resemblance of TD-form learning in the brain. 112 Reflecting this mutual inspiration, the China Brain Project covers both basic research on cognition and translational research for brain disease and brain-inspired intelligence technology. 113

AI for omics big data analysis

Currently, AI can outperform humans in some well-defined tasks, such as omics data analysis and smart agriculture. In the big data era, 114 data come in many types (variety), in large volumes (volume), and at high generation speed (velocity). This high variety, large volume, and fast velocity make the data extremely valuable, but also difficult to analyze. Unlike traditional statistics-based methods, AI can readily handle big data and reveal hidden associations.

In genetics studies, there are many successful applications of AI. 115 One key question is whether a single amino acid polymorphism is deleterious. 116 Tools such as the sequence conservation-based SIFT 117 and the network-based SySAP 118 have been developed, but these methods have hit bottlenecks and cannot be improved much further. Sundaram et al. developed PrimateAI, which predicts the clinical impact of mutations using a deep neural network. 119 Another problem is how to call copy-number variations, which play important roles in various cancers. 120 , 121 Glessner et al. proposed a DL-based tool, DeepCNV, whose area under the receiver operating characteristic (ROC) curve reached 0.909, much higher than that of other ML methods. 122 In epigenetic studies, m6A modification is one of the most important mechanisms. 123 Zhang et al. developed an ensemble DL predictor (EDLm6APred) for mRNA m6A site prediction. 124 The area under the ROC curve of EDLm6APred was 86.6%, higher than that of existing m6A methylation site prediction models. There are many other DL-based omics tools, such as DeepCpG 125 for methylation, DeepPep 126 for proteomics, AtacWorks 127 for assay for transposase-accessible chromatin with high-throughput sequencing, and DeepTCR 128 for T cell receptor sequencing.
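The area-under-the-ROC-curve figures quoted for DeepCNV and EDLm6APred can be computed directly from labels and scores. A minimal sketch using the Mann-Whitney formulation follows; the toy labels and scores are hypothetical:

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive example outscores a randomly chosen
    negative one (ties count 0.5) - the Mann-Whitney U formulation."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]   # hypothetical classifier outputs
auc = roc_auc(labels, scores)
```

Libraries compute the same quantity from the ranked scores in O(n log n); the quadratic version above is only for clarity.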

Another emerging application is DL for single-cell sequencing data. Unlike bulk data, in which the sample size is usually much smaller than the number of features, the number of cells in single-cell data can be large compared with the number of genes, which makes DL algorithms applicable to most single-cell datasets. Because single-cell data are sparse and contain many unmeasured missing values, DeepImpute can accurately impute these missing values in the big gene × cell matrix. 129 During quality control of single-cell data, it is important to remove doublets; Solo embeds cells using an autoencoder and then builds a feedforward neural network to identify doublets. 130 PRESCIENT (potential energy underlying single-cell gradients) uses generative modeling to learn the underlying differentiation landscape from time-series single-cell RNA sequencing data. 131
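As a much-simplified stand-in for what a tool like DeepImpute does with a neural network, per-gene mean imputation of a small gene × cell matrix can be sketched as follows (the expression values are hypothetical, and `None` marks an unmeasured entry):

```python
def impute_missing(matrix):
    """Fill missing entries (None) in a genes x cells expression matrix
    with the mean of the observed values for that gene (row)."""
    result = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        result.append([mean if v is None else v for v in row])
    return result

expr = [
    [5.0, None, 7.0],   # gene A across three cells
    [0.0, 2.0, None],   # gene B
]
filled = impute_missing(expr)
```

DeepImpute instead learns each gene's missing values from the expression of correlated genes; mean imputation shows only the shape of the problem.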

In protein structure prediction, the DL-based AlphaFold2 can accurately predict the 3D structures of 98.5% of human proteins and is set to predict the structures of 130 million proteins of other organisms in the coming months. 132 It is even considered the biggest breakthrough in the life sciences since the Human Genome Project 133 and will facilitate drug development, among other things.

AI makes modern agriculture smart

Agriculture is entering its fourth revolution, termed agriculture 4.0 or smart agriculture, benefiting from the arrival of the big data era as well as rapid progress in many advanced technologies, in particular ML and modern information and communication technologies. 134 , 135 Applications of DL, information, and sensing technologies in agriculture cover all stages of agricultural production, including breeding, cultivation, and harvesting.

Traditional breeding usually exploits genetic variation by searching natural variation or inducing artificial mutagenesis. However, neither method can expose the whole mutation spectrum. Using DL models trained on existing variants, predictions can be made for multiple unidentified gene loci. 136 For example, an ML method, the multi-criteria rice reproductive gene predictor, was developed and applied to predict coding and lincRNA genes associated with reproductive processes in rice. 137 Moreover, models trained on species with well-studied genomic data (such as Arabidopsis and rice) can be applied to species with limited genome information (such as wild strawberry and soybean). 138 In most cases, the links between genotypes and phenotypes are more complicated than expected: one gene often influences multiple phenotypes, and one trait is generally the product of synergy among multiple genes and developmental processes. For this reason, multi-trait DL models have been developed and have enabled genome editing in plant breeding. 139 , 140

It is well known that dynamic and accurate monitoring of crops throughout the whole growth period is vitally important to precision agriculture, and in this new stage of agriculture both remote sensing and DL play indispensable roles. Remote sensing (including proximal sensing) can produce agricultural big data from ground, airborne, and space-borne platforms, offering an economical approach to non-destructive, timely, objective, synoptic, long-term, and multi-scale information for crop monitoring and management, thereby greatly assisting precision decisions regarding irrigation, nutrients, disease, pests, and yield. 141 , 142 DL makes it possible to discover knowledge simply, efficiently, and accurately from massive and complicated data, especially remote sensing big data characterized by multiple spatial-temporal-spectral dimensions, owing to its strong capability for feature representation and its superiority in capturing the essential relations between observational data and agronomic parameters or crop traits. 135 , 143 The integration of DL and big data is proving a disruptive force in agriculture, potentially as big as the green revolution. As shown in Figure 7 , in a possible smart-agriculture scenario, multi-source satellite remote sensing data with various geometric and radiometric information, as well as abundant spectral information from the UV, visible, and shortwave infrared to microwave regions, can be collected. In addition, advanced aircraft systems, such as unmanned aerial vehicles carrying multi/hyper-spectral cameras, and smartphone-based portable devices can be used to obtain multi/hyper-spectral data in specific fields. All of these data can be integrated by DL-based fusion techniques for different purposes and then shared with all users via cloud computing.
On the cloud computing platform, agricultural remote sensing models developed by combining data-driven ML methods with physical models will be deployed to retrieve a range of biophysical and biochemical crop parameters, which will then be analyzed by a decision-making and prediction system to assess current water and nutrient stress and growth status and to predict future development. As a result, an automatic or interactive user service platform can make the correct decisions and trigger appropriate actions through an integrated irrigation and fertilization system.
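One of the simplest crop-status quantities retrievable from the multispectral data described above is the normalized difference vegetation index (NDVI). A sketch with hypothetical reflectance values and a hypothetical classification threshold:

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from near-infrared and red
    reflectance; higher values indicate denser, healthier vegetation."""
    return (nir - red) / (nir + red)

def classify_pixel(nir, red, threshold=0.4):
    """Crude illustrative rule: flag a pixel as vegetated above the
    threshold (the 0.4 cutoff is hypothetical, not a standard)."""
    return "vegetated" if ndvi(nir, red) > threshold else "sparse"

# A dense canopy reflects NIR strongly and absorbs red light
value = ndvi(0.6, 0.1)
```

Operational pipelines compute such indices per pixel across whole scenes and feed them, together with many other bands and model outputs, into the DL-based fusion and decision systems the text describes.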

Figure 7. Integration of AI and remote sensing in smart agriculture

Furthermore, DL presents unique advantages in specific agricultural applications, such as dense scenes that increase the difficulty of manual planting and harvesting. CNNs and autoencoder models trained on image data are increasingly used for phenotyping and yield estimation, 144 including counting fruits in orchards, grain recognition and classification, and disease diagnosis. 145 , 146 , 147 Consequently, DL may greatly reduce the demand for manual labor.

The application of DL in agriculture is just beginning, and many problems and challenges remain for its future development. We believe that, with the continuous acquisition of massive data and the optimization of algorithms, DL will enjoy bright prospects in agricultural production.

AI in physics

The scale of modern physics ranges from the size of a neutron to the size of the Universe ( Figure 8 ). By scale, physics can be divided into four categories: particle physics at the scale of neutrons, nuclear physics at the scale of atoms, condensed matter physics at the scale of molecules, and cosmic physics at the scale of the Universe. AI, in particular ML, plays an important role at all of these scales, since AI algorithms are becoming the main trend in data analysis, such as the reconstruction and analysis of images.

Figure 8. The scale of physics

Speeding up simulations and identifications of particles with AI

There are many applications, and explorations of applications, of AI in particle physics. We cannot cover all of them here, and instead use lattice quantum chromodynamics (LQCD) and the experiments at the Beijing Spectrometer (BES) and the Large Hadron Collider (LHC) to illustrate the power of ML in both theoretical and experimental particle physics.

LQCD studies the nonperturbative properties of QCD using Monte Carlo simulations on supercomputers, helping us understand the strong interaction that binds quarks together to form nucleons. The Markov chain Monte Carlo simulations commonly used in LQCD suffer from topological freezing and critical slowing down as the simulations approach the physical conditions of the actual world. New algorithms aided by DL are being proposed and tested to overcome these difficulties. 148 , 149 Physical observables are extracted from LQCD data, whose signal-to-noise ratio deteriorates exponentially. For non-Abelian gauge theories such as QCD, complicated contour deformations can be optimized using ML to reduce the variance of LQCD data; proof-of-principle applications in two dimensions have been studied. 150 ML can also be used to reduce the time cost of generating LQCD data. 151
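The Metropolis-style Markov chain Monte Carlo at the heart of these LQCD simulations can be illustrated on a toy one-variable "field" with a quadratic action. The real algorithm updates 4D gauge-field configurations, so this is only a schematic of the accept/reject step:

```python
import math
import random

def metropolis_sample(action, n_steps, step_size=1.5, seed=0):
    """Metropolis update for a single-site toy field: propose a local
    change and accept it with probability min(1, exp(-dS))."""
    rng = random.Random(seed)
    phi = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = phi + rng.uniform(-step_size, step_size)
        d_action = action(proposal) - action(phi)
        if d_action <= 0 or rng.random() < math.exp(-d_action):
            phi = proposal
        samples.append(phi)
    return samples

# A quadratic action S = phi^2 / 2 gives a standard normal target
samples = metropolis_sample(lambda x: 0.5 * x * x, 20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Topological freezing and critical slowing down show up in such chains as rapidly growing autocorrelation times; the DL-based samplers cited above aim to propose configurations that decorrelate much faster.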

On the experimental side, particle identification (PID) plays an important role. Recently, several PID algorithms were developed for BES-III, the ANN 152 being one of them. Extreme gradient boosting has also been used for multi-dimensional distribution reweighting, muon identification, and cluster reconstruction, and can improve muon identification. U-Net, a convolutional network for pixel-level semantic segmentation widely used in computer vision, has been applied at BES-III to solve the problem of multi-turn curling track finding in the main drift chamber; the average efficiency and purity for the first turn's hits are about 91% at a threshold of 0.85. Current and future particle physics experiments produce huge amounts of data, and machine learning can be used to discriminate signal from overwhelming background events. Examples of LHC data analyses using supervised ML can be found in a 2018 collaboration. 153 To exploit the potential advantages of quantum computers, quantum ML methods are also being investigated; see, for example, Wu et al. 154 and references therein for proof-of-concept studies.

AI makes nuclear physics powerful

Cosmic ray muon tomography (muography) 155 is an imaging technology that uses natural cosmic-ray muon radiation rather than artificial radiation, reducing radiological hazards. This technology can detect high-Z materials non-destructively, as muons are sensitive to high-Z materials. The Classification Model Algorithm (CMA) is based on supervised classification and gray system theory: it builds a binary classifier and decision function whose input is the muon track and whose output indicates whether material is present at a given location. AI thus helps users shorten the muon scanning time and improve efficiency.

Also, for nuclear detection, the Cs 2 LiYCl 6 :Ce (CLYC) scintillator responds to both electrons and neutrons, creating a pulse signal, and can therefore be applied to detect both neutrons and electrons, 156 but the two particle types must be identified by analyzing the pulse shapes, i.e., n-γ identification. The traditional approach has been pulse shape discrimination (PSD), which separates the pulses of the two particles by analyzing the distribution of pulse information (amplitude, width, rise time, and fall time), so that the two particles can be separated when the distribution shows two separated Gaussian peaks. Traditional PSD can only analyze single-pulse waveforms, not the multi-pulse waveforms produced when two particles interact with CLYC close together in time. This can be solved by using an ANN to classify six categories (n, γ, n + n, n + γ, γ + n, γ + γ). Moreover, several parameters could be used by AI to improve the reconstruction algorithm with high efficiency and less error.
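A common charge-comparison form of PSD can be sketched directly: integrate the slow tail of a digitized pulse relative to its total, and threshold that ratio to separate the two Gaussian populations. The pulse samples and the threshold below are hypothetical:

```python
def psd_ratio(pulse, tail_start):
    """Charge-comparison pulse-shape discrimination: neutrons deposit a
    larger fraction of their light in the slow tail of the pulse, so the
    tail-to-total integral ratio separates n from gamma events."""
    total = sum(pulse)
    tail = sum(pulse[tail_start:])
    return tail / total

# Hypothetical digitized pulses (arbitrary units, same sampling)
gamma_pulse = [0, 8, 10, 4, 1, 0.5, 0.2, 0.1]    # fast decay
neutron_pulse = [0, 8, 10, 5, 3, 2.0, 1.5, 1.0]  # pronounced slow tail

g = psd_ratio(gamma_pulse, 4)
n = psd_ratio(neutron_pulse, 4)
is_neutron = n > 0.1   # hypothetical cut between the two Gaussian peaks
```

This single scalar works only for isolated pulses; the ANN classifier described above is needed precisely because overlapping multi-pulse waveforms cannot be summarized by one ratio.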

AI-aided condensed matter physics

AI opens up a new avenue for physical science, especially when a trove of data is available. Recent works demonstrate that ML provides useful insights for improving density functional theory (DFT), in which the single-electron picture of the Kohn-Sham scheme struggles to capture the exchange and correlation effects of many-body systems. Yu et al. proposed a Bayesian optimization algorithm to fit the Hubbard U parameter; compared with the linear response method, the new approach finds the optimal Hubbard U through a self-consistent process with good efficiency, 157 boosting accuracy to near the hybrid-functional level. Snyder et al. developed an ML density functional for a 1D non-interacting, non-spin-polarized fermion system to obtain significantly improved kinetic energy estimates. This method enables a direct approximation of the kinetic energy of a quantum system and can be used in orbital-free DFT modeling; it can even bypass solving the Kohn-Sham equation while maintaining quantum-chemical precision when a strong correlation term is included. Recently, FermiNet showed that many-body quantum mechanical equations can be solved via AI. AI models also show advantages in capturing interatomic force fields. In 2010, the Gaussian approximation potential (GAP) 158 was introduced as a powerful interatomic force field describing the interactions between atoms. GAP uses kernel regression with invariant many-body representations and performs quite well; for instance, it can simulate the crystallization of amorphous solids under high pressure fairly accurately. By employing the smooth overlap of atomic positions (SOAP) kernel, 159 the accuracy of the potential can be further enhanced, so SOAP-GAP can be viewed as a field-leading method for AI molecular dynamics simulation.
There are also several other well-developed AI interatomic potentials: crystal graph CNNs provide a widely applicable way of vectorizing crystalline materials; SchNet embeds continuous-filter convolutional layers in its DNNs, easing molecular dynamics simulations because the potentials are spatially continuous; and DimeNet constructs a directional message-passing neural network that incorporates not only bond lengths but also bond angles, dihedral angles, and interactions between unconnected atoms, achieving good accuracy.
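The kernel-regression core of GAP-style potentials can be illustrated with a one-dimensional toy: kernel ridge regression on hypothetical energy-versus-distance data. Real GAP regresses on invariant many-body descriptors such as SOAP rather than a bare distance, so this is a sketch of the regression machinery only:

```python
import math

def rbf_kernel(x, y, length_scale=0.3):
    return math.exp(-((x - y) ** 2) / (2 * length_scale ** 2))

def fit_krr(xs, ys, lam=1e-8, length_scale=0.3):
    """Kernel ridge regression: solve (K + lam*I) alpha = y by Gaussian
    elimination with partial pivoting."""
    n = len(xs)
    A = [[rbf_kernel(xs[i], xs[j], length_scale) + (lam if i == j else 0.0)
          for j in range(n)] + [ys[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    alpha = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = A[r][n] - sum(A[r][c] * alpha[c] for c in range(r + 1, n))
        alpha[r] = s / A[r][r]
    return alpha

def predict(xs, alpha, x, length_scale=0.3):
    return sum(a * rbf_kernel(xi, x, length_scale) for a, xi in zip(alpha, xs))

# Hypothetical pair-potential data: energy vs. interatomic distance
dists = [0.8, 1.0, 1.2, 1.5, 2.0]
energies = [2.0, 0.0, -0.5, -0.3, -0.1]
alpha = fit_krr(dists, energies)
e_at_min = predict(dists, alpha, 1.2)
```

With small regularization the fit interpolates the training energies and gives smooth predictions in between, which is why such potentials can drive molecular dynamics far more cheaply than DFT.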

AI helps explore the Universe

AI is one of the newest technologies, while astronomy is one of the oldest sciences. When the two meet, new opportunities for scientific breakthroughs are often triggered. Observation and data analysis play a central role in astronomy. The amount of data collected by modern telescopes has reached unprecedented levels, such that even the most basic task of constructing a catalog has become challenging with traditional source-finding tools. 160 Astronomers have therefore developed automated, intelligent source-finding tools based on DL, which not only offer significant advantages in operational speed but also facilitate a more comprehensive understanding of the Universe by identifying objects that cannot be detected by traditional software and visual inspection. 160 , 161

More than a decade ago, a citizen science project called “Galaxy Zoo” was launched to label one million images of galaxies collected by the Sloan Digital Sky Survey (SDSS) by posting the images online and recruiting volunteers. 162 Larger optical telescopes, in operation or under construction, produce data volumes several orders of magnitude larger than SDSS; even with volunteers involved, there is no way to analyze the vast amount of data received. The advantages of ML are not limited to source finding and galaxy classification; it has a much wider range of applications. For example, CNNs play an important role in detecting and decoding gravitational wave signals in real time, reconstructing all parameters within 2 ms, whereas traditional algorithms take several days to accomplish the same task. 163 Such DL systems have also been used to automatically generate alerts for transients and to track asteroids and other fast-moving near-Earth objects, improving detection efficiency by several orders of magnitude. In addition, astrophysicists are exploring the use of neural networks to measure galaxy clusters and study the evolution of the Universe.

In addition to their amazing speed, neural networks seem to develop a deeper understanding of the data than expected and can recognize more complex patterns, suggesting that the “machine” is evolving rather than merely learning the characteristics of the input data.

AI in chemistry

Chemistry plays an important “central” role among the sciences 164 because it investigates the structure and properties of matter and identifies the chemical reactions that convert substances into other substances. Accordingly, chemistry is a data-rich branch of science containing complex information that results from centuries of experiments and, more recently, decades of computational analysis. This vast treasure trove of data is most apparent within the Chemical Abstracts Service, which has collected more than 183 million unique organic and inorganic substances, including alloys, coordination compounds, minerals, mixtures, polymers, and salts, and is expanding by thousands of new substances daily. 165 The unlimited complexity and variety of material compounds explain why chemistry research remains a labor-intensive task. This level of complexity, together with the vast amounts of data within chemistry, provides a prime opportunity for significant breakthroughs through the application of AI. First, the types of molecules that can be constructed from atoms are almost unlimited, which leads to an unlimited chemical space 166 ; the interconnections of these molecules with all possible combinations of factors, such as temperature, substrates, and solvents, are overwhelmingly large, giving rise to an unlimited reaction space. 167 Exploring the unlimited chemical and reaction spaces, and navigating to the optimal candidates with the desired properties, is thus practically impossible by human effort alone. Second, the huge assortment of molecules and their interplay with external environments bring a new level of complexity, which cannot simply be predicted from physical laws.
While many concepts, rules, and theories have been generalized from centuries of experience studying trivial (i.e., single-component) systems, nontrivial complexities become more likely as we discover that “more is different,” in the words of Philip Warren Anderson, the American physicist and Nobel laureate. 168 Nontrivial complexities arise when the scale changes and symmetry breaks in larger, increasingly complex systems, and the governing rules shift from quantitative to qualitative. Lacking a systematic and analytical theory of the structures, properties, and transformations of macroscopic substances, chemistry research has largely been guided by heuristics and fragmentary rules accumulated over the previous centuries, yielding progress that often proceeds through trial and error. ML can recognize patterns in large amounts of data, thereby offering an unprecedented way of dealing with this complexity and reshaping chemistry research by revolutionizing the way data are used. Nearly every sub-field of chemistry now employs some form of AI, including tools for chemistry research and data generation, such as analytical chemistry and computational chemistry, as well as applications in organic chemistry, catalysis, and medicinal chemistry, which we discuss herein.

AI breaks the limitations of manual feature selection methods

In analytical chemistry, information extraction has traditionally relied heavily on feature selection techniques based on prior human experience. Unfortunately, this approach is inefficient, incomplete, and often biased. Automated data analysis based on AI can break the limitations of manual variable selection by learning from large amounts of data. Feature selection through DL algorithms enables information extraction from datasets in NMR, chromatography, spectroscopy, and other analytical tools, 169 thereby improving model prediction accuracy. These ML approaches will greatly accelerate the analysis of materials, leading to the rapid discovery of new molecules or materials. Raman scattering, for instance, has been widely employed since its discovery in the 1920s as a powerful vibrational spectroscopy technology, capable of providing vibrational fingerprints intrinsic to analytes and thus enabling the identification of molecules. 170 Recently, ML methods have been trained to recognize features in Raman (or SERS) spectra and identify analytes by applying DL networks, including ANNs, CNNs, and fully convolutional networks for feature engineering. 171 For example, Leong et al. designed a machine-learning-driven “SERS taster” that simultaneously harnesses useful vibrational information from multiple receptors for enhanced multiplex profiling of five wine flavor molecules at ppm levels. Principal-component analysis is employed to discriminate alcohols with varying degrees of substitution, and support vector machine discriminant analysis is used to quantitatively classify all flavors with 100% accuracy. 172 Overall, AI techniques provide the first glimmer of hope for a universal method of spectral data analysis that is fast, accurate, objective, and definitive, with attractive advantages in a wide range of applications.
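The PCA-plus-classifier pipeline described above can be sketched on synthetic data. The snippet below is a minimal illustration, not the published “SERS taster”: it generates toy Gaussian-peak “spectra” for two hypothetical analytes, projects them onto two principal components via SVD, and classifies with a nearest-centroid rule as a simplified stand-in for support vector machine discriminant analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_spectra(peak_centers, n=40, length=200, noise=0.05):
    """Generate synthetic 'spectra' as sums of Gaussian peaks plus noise."""
    x = np.arange(length)
    base = sum(np.exp(-0.5 * ((x - c) / 4.0) ** 2) for c in peak_centers)
    return base + noise * rng.standard_normal((n, length))

# Two analyte classes distinguished only by their peak positions.
class_a = make_spectra([50, 120])
class_b = make_spectra([70, 150])
X = np.vstack([class_a, class_b])
y = np.array([0] * 40 + [1] * 40)

# PCA via SVD: project onto the top two principal components.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:2].T

# Nearest-centroid classification in PCA space (simplified stand-in for SVM-DA).
centroids = np.array([scores[y == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(
    np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2), axis=1
)
accuracy = (pred == y).mean()
print(f"classification accuracy in PCA space: {accuracy:.2f}")
```

With peaks this well separated the two classes form distinct clusters in the PCA plane, which is why even a nearest-centroid rule suffices for the toy case.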

AI improves the accuracy and efficiency for various levels of computational theory

Complementary to analytical tools, computational chemistry has proven a powerful approach for understanding chemical properties through simulation; however, it faces an accuracy-versus-efficiency dilemma that greatly limits its application to real-world chemistry problems. To overcome this dilemma, ML and other AI methods are being applied to improve the accuracy and efficiency of the various levels of theory used to describe effects arising at different time and length scales in the multi-scale modeling of chemical reactions. 173 Many open challenges in computational chemistry can be addressed by ML approaches, for example, solving Schrödinger's equation, 174 developing atomistic 175 or coarse-grained 176 potentials, constructing reaction coordinates, 177 developing reaction kinetics models, 178 and identifying key descriptors for computable properties. 179 Beyond analytical and computational chemistry, several other disciplines of chemistry have applied AI technology to chemical problems. We discuss organic chemistry, catalysis, and medicinal chemistry as examples of where ML has made a significant impact; many examples exist in the literature for other subfields, and AI will continue to deliver breakthroughs across a wide range of chemical applications.
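The accuracy-versus-efficiency trade-off behind ML potentials can be illustrated with a toy surrogate model. In the sketch below, a Morse-like curve stands in for an expensive level of theory, and a cheap polynomial regression (a minimal "ML potential") is fitted to a handful of reference points; the functional form and all parameters are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

def expensive_energy(r):
    """Stand-in for a costly ab initio calculation: a Morse-like potential."""
    return (1 - np.exp(-1.5 * (r - 1.0))) ** 2

# 'High level of theory': a handful of expensive reference calculations.
r_train = rng.uniform(0.6, 3.0, 25)
e_train = expensive_energy(r_train)

# Surrogate: least-squares fit on polynomial features of the bond length.
def features(r, degree=8):
    return np.vander(r, degree + 1, increasing=True)

coef, *_ = np.linalg.lstsq(features(r_train), e_train, rcond=None)

# Cheap predictions over the whole coordinate range.
r_test = np.linspace(0.6, 3.0, 200)
e_pred = features(r_test) @ coef
rmse = np.sqrt(np.mean((e_pred - expensive_energy(r_test)) ** 2))
print(f"surrogate RMSE over the scan: {rmse:.4f}")
```

Once fitted, evaluating the surrogate costs a single matrix product per geometry, which is the efficiency gain real ML potentials exploit at far larger scale.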

AI enables robotics capable of automating the synthesis of molecules

Organic chemistry studies the structure, properties, and reactions of carbon-based molecules. The complexity of chemical and reaction space means that, for a given property, there is an essentially unlimited number of potential molecules that chemists could synthesize. Further complications arise in deciding how to synthesize a particular molecule, a process that relies heavily on heuristics and laborious testing. Researchers have begun to address these challenges with AI. Given enough data, any property of interest of a molecule can be predicted by mapping the molecular structure to the corresponding property using supervised learning, without resorting to physical laws. In addition to known molecules, new molecules can be designed by sampling chemical space 180 using methods such as autoencoders and CNNs, with molecules encoded as sequences or graphs. Retrosynthesis, the planning of synthetic routes, once considered an art, has become much simpler with the help of ML algorithms. The Chematica system, 181 for instance, is now capable of autonomously planning synthetic routes that have subsequently been proven to work in the laboratory. Once the target molecules and synthetic route are determined, suitable reaction conditions can be predicted or optimized using ML techniques. 182
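The structure-to-property mapping described above can be sketched with a toy supervised-learning setup. Everything here is invented for illustration: the "molecules" are fragment-count vectors over hypothetical substructures (real work would use fingerprints such as ECFP or learned representations), and the hidden structure-property rule is a simple additive model that the fit must recover.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 'molecules': count vectors over 6 hypothetical fragments
# (e.g. -OH, -CH3, aromatic ring, ...).
n_mol, n_frag = 200, 6
X = rng.integers(0, 5, size=(n_mol, n_frag)).astype(float)

# Hidden structure-property relationship the model must recover:
# each fragment contributes additively, plus measurement noise.
true_w = np.array([1.2, -0.7, 0.4, 2.0, -1.5, 0.3])
y = X @ true_w + 0.1 * rng.standard_normal(n_mol)

# Supervised learning: least squares maps structure -> property
# directly from data, without invoking any physical law.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predict the property of an unseen 'molecule'.
new_mol = np.array([2.0, 0.0, 1.0, 3.0, 0.0, 1.0])
print("predicted property:", float(new_mol @ w_hat))
print("max coefficient error:", float(np.abs(w_hat - true_w).max()))
```

The same pattern, features in, property out, underlies the far richer graph- and sequence-based models cited in the text.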

The integration of these AI-based approaches with robotics has enabled fully AI-guided robotic platforms capable of automating the synthesis of small organic molecules without human intervention (Figure 9). 183 , 184


Figure 9. A closed-loop workflow enabling automatic and intelligent design, synthesis, and assay of molecules in organic chemistry by AI.

AI helps to search through vast catalyst design spaces

Catalytic chemistry originates from catalyst technologies in the chemical industry for the efficient and sustainable production of chemicals and fuels. It remains a challenging endeavor to create novel heterogeneous catalysts with good performance (i.e., stable, active, and selective), because a catalyst's performance depends on many properties: composition, support, surface termination, particle size, particle morphology, atomic coordination environment, porous structure, and the reactor environment during the reaction. This inherent complexity makes discovering and developing catalysts with desired properties heavily dependent on intuition and experiment, which is costly and time consuming. AI technologies such as ML, when combined with experimental and in silico high-throughput screening of combinatorial catalyst libraries, can aid catalyst discovery by helping to search through vast design spaces. With well-defined structures and standardized data, including reaction results and in situ characterization results, the complex association between catalytic structure and catalytic performance can be revealed by AI. 185 , 186 Accurate descriptors of the effects of molecules, molecular aggregation states, and molecular transport on catalysts could also be predicted. With this approach, researchers can build virtual laboratories to develop new catalysts and catalytic processes.
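The surrogate-assisted search idea can be illustrated on a toy combinatorial library. Everything below, from the candidate metals and supports to the hidden "activity" model, is invented for illustration: a linear surrogate is fitted to a small tested subset and then used to rank the untested candidates, so that only the most promising ones would need to go back to the lab.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)

# A small combinatorial catalyst library: metal x support x particle size.
metals = ["Pt", "Pd", "Ni", "Cu"]
supports = ["Al2O3", "TiO2", "CeO2"]
sizes_nm = [2, 5, 10]
library = list(itertools.product(metals, supports, sizes_nm))

def encode(cat):
    """One-hot metal and support, plus numeric size, as a feature vector."""
    m, s, d = cat
    return np.array(
        [m == x for x in metals] + [s == x for x in supports] + [d], float
    )

def measure_activity(cat):
    """Stand-in for an expensive experiment (hidden ground-truth model)."""
    w = np.array([3.0, 2.5, 1.0, 0.5, 0.8, 1.5, 0.3, -0.1])
    return float(encode(cat) @ w) + 0.05 * rng.standard_normal()

# Measure only a random subset, fit a linear surrogate, rank the rest.
X_all = np.array([encode(c) for c in library])
tested = rng.choice(len(library), size=15, replace=False)
y_tested = np.array([measure_activity(library[i]) for i in tested])
w_hat, *_ = np.linalg.lstsq(X_all[tested], y_tested, rcond=None)

ranked = sorted(range(len(library)), key=lambda i: -(X_all[i] @ w_hat))
print("surrogate's top candidate:", library[ranked[0]])
```

Here 15 "experiments" prioritize a library of 36 candidates; in real campaigns the same logic lets a few thousand measurements triage design spaces of millions.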

AI enables screening of chemicals in toxicology with minimum ethical concerns

A more complicated sub-field of chemistry is medicinal chemistry, which is challenging due to the complex interactions between exotic substances and the inherent chemistry of living systems. Toxicology, for instance, is a broad field that seeks to predict and eliminate substances (e.g., pharmaceuticals, natural products, food products, and environmental substances) that may cause harm to a living organism. Because living organisms are inherently complex, nearly any known substance can cause toxicity at a high enough exposure. Moreover, toxicity depends on an array of other factors, including organism size, species, age, sex, genetics, diet, combination with other chemicals, overall health, and environmental context. Given the scale and complexity of toxicity problems, AI is likely the only realistic approach to meet regulatory requirements for the screening, prioritization, and risk assessment of chemicals (including mixtures), and it is therefore poised to revolutionize the landscape of toxicology. 187 In summary, AI is turning chemistry from a labor-intensive branch of science into a highly intelligent, standardized, and automated field, in which far more can be achieved than within the limits of human labor alone. Underlying knowledge, with new concepts, rules, and theories, is expected to advance with the application of AI algorithms, and a large portion of the new chemistry knowledge leading to significant breakthroughs is expected to come from AI-based chemistry research in the decades ahead.

Conclusions

This paper carries out a comprehensive survey of the development and application of AI across a broad range of fundamental sciences, including information science, mathematics, medical science, materials science, geoscience, life science, physics, and chemistry. Despite the fact that AI has been used pervasively in a wide range of applications, ML security risks remain, with both data and ML models serving as attack targets during the training and execution phases. First, since the performance of an ML system depends strongly on the data used to train it, these input data are crucial to the security of the ML system. For instance, adversarial example attacks 188 supply malicious inputs that often lead the ML system into making false judgments (predictions or categorizations) through small perturbations that are imperceptible to humans, while data poisoning, by intentionally manipulating raw, training, or testing data, can reduce model accuracy or serve other error-specific attack purposes. Second, ML model attacks include backdoor attacks on DL, CNN, and federated learning systems that manipulate the model's parameters directly, as well as model stealing, model inversion, and membership inference attacks, which can steal the model parameters or leak sensitive training data. While a number of defense techniques against these security threats have been proposed, new attack models targeting ML systems are constantly emerging. It is therefore necessary to address the problem of ML security and to develop robust ML systems that remain effective under malicious attack.
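The adversarial-example attack mentioned above can be made concrete with a toy linear classifier. The sketch assumes a random weight vector standing in for a trained model; for a linear score the gradient with respect to the input is just the weight vector, so an FGSM-style step of size epsilon per feature shifts the score by epsilon times the L1 norm of the weights, enough to flip most predictions while each individual feature barely changes.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy linear classifier: score = w.x + b, predict class 1 if score > 0.
w = rng.standard_normal(100)
w /= np.linalg.norm(w)
b = 0.0

x = rng.standard_normal(100)
score = float(w @ x + b)
label = int(score > 0)

# FGSM-style attack: for a linear model the input gradient is w itself,
# so step each feature by epsilon against the predicted class.
epsilon = 0.2
direction = 1.0 if label == 1 else -1.0
x_adv = x - direction * epsilon * np.sign(w)

flipped = int(w @ x_adv + b > 0) != label
print("max per-feature perturbation:", epsilon)
print("prediction flipped:", flipped)
```

The prediction flips whenever the original score's magnitude is below epsilon times the L1 norm of w, which for a 100-dimensional unit-norm w is several times the typical score, hence the attack's effectiveness.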

Due to the data-driven character of ML methods, the features of the training and testing data must be drawn from the same distribution, which is difficult to guarantee in practice: in real applications, the data source may differ from that of the training dataset, and the data feature distribution may drift over time, leading to a decline in model performance. Moreover, if the model is retrained on new data alone, it suffers catastrophic “forgetting,” meaning it remembers only the new features and forgets those previously learned. To solve this problem, a growing number of researchers are studying how to endow models with the ability of lifelong learning, that is, a change in the computing paradigm from “offline learning + online reasoning” to “online continuous learning,” so that the model, like a human being, can keep learning throughout its lifetime.
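Catastrophic forgetting and one simple remedy can be demonstrated on a toy problem. The two linearly separable tasks below are invented so that their decision boundaries conflict: training on task B alone degrades task A accuracy, while rehearsal (replaying stored task A data alongside task B) preserves it. This is only a sketch of the phenomenon, not of any specific lifelong-learning method.

```python
import numpy as np

rng = np.random.default_rng(5)

def make_task(center):
    """Binary task: points around +center (class 1) vs -center (class 0)."""
    X = np.vstack([center + 0.3 * rng.standard_normal((100, 2)),
                   -center + 0.3 * rng.standard_normal((100, 2))])
    y = np.array([1] * 100 + [0] * 100)
    return X, y

def train(w, X, y, epochs=200, lr=0.5):
    """Plain logistic-regression gradient descent."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w)))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return ((X @ w > 0).astype(int) == y).mean()

task_a = make_task(np.array([2.0, 0.0]))    # boundary along the y-axis
task_b = make_task(np.array([-2.0, 2.0]))   # boundary conflicting with A's

# Sequential training: A, then B alone -> the model drifts off task A.
w = train(np.zeros(2), *task_a)
acc_a_before = accuracy(w, *task_a)
w_seq = train(w, *task_b)
acc_a_after = accuracy(w_seq, *task_a)

# Simple rehearsal: replay stored task-A data while learning task B.
X_mix = np.vstack([task_a[0], task_b[0]])
y_mix = np.concatenate([task_a[1], task_b[1]])
w_replay = train(w, X_mix, y_mix)

print(f"task A accuracy: before {acc_a_before:.2f}, "
      f"after B-only {acc_a_after:.2f}, "
      f"with rehearsal {accuracy(w_replay, *task_a):.2f}")
```

Rehearsal is the crudest continual-learning strategy; the "online continuous learning" paradigm discussed above aims for the same retention without having to store all past data.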

Acknowledgments

This work was partially supported by the National Key R&D Program of China (2018YFA0404603, 2019YFA0704900, 2020YFC1807000, and 2020YFB1313700), the Youth Innovation Promotion Association CAS (2011225, 2012006, 2013002, 2015316, 2016275, 2017017, 2017086, 2017120, 2017204, 2017300, 2017399, 2018356, 2020111, 2020179, Y201664, Y201822, and Y201911), NSFC (nos. 11971466, 12075253, 52173241, and 61902376), the Foundation of State Key Laboratory of Particle Detection and Electronics (SKLPDE-ZZ-201902), the Program of Science & Technology Service Network of CAS (KFJ-STS-QYZX-050), the Fundamental Science Center of the National Nature Science Foundation of China (nos. 52088101 and 11971466), the Scientific Instrument Developing Project of CAS (ZDKYYQ20210003), the Strategic Priority Research Program (B) of CAS (XDB33000000), the National Science Foundation of Fujian Province for Distinguished Young Scholars (2019J06023), the Key Research Program of Frontier Sciences, CAS (nos. ZDBS-LY-7022 and ZDBS-LY-DQC012), the CAS Project for Young Scientists in Basic Research (no. YSBR-005). The study is dedicated to the 10th anniversary of the Youth Innovation Promotion Association of the Chinese Academy of Sciences.

Author contributions

Y.X., Q.W., Z.A., Fei W., C.L., Z.C., J.M.T., and J.Z. conceived and designed the research. Z.A., Q.W., Fei W., Libo.Z., Y.W., F.D., and C.W.-Q. wrote the “AI in information science” section. Xin.L. wrote the “AI in mathematics” section. J.Q., K.H., W.S., J.W., H.X., Y.H., and X.C. wrote the “AI in medical science” section. E.L., C.F., Z.Y., and M.L. wrote the “AI in materials science” section. Fang W., R.R., S.D., M.V., and F.K. wrote the “AI in geoscience” section. C.H., Z.Z., L.Z., T.Z., J.D., J.Y., L.L., M.L., and T.H. wrote the “AI in life sciences” section. Z.L., S.Q., and T.A. wrote the “AI in physics” section. X.L., B.Z., X.H., S.C., X.L., W.Z., and J.P.L. wrote the “AI in chemistry” section. Y.X., Q.W., and Z.A. wrote the “Abstract,” “Introduction,” “History of AI,” and “Conclusions” sections.

Declaration of interests

The authors declare no competing interests.

Published Online: October 28, 2021


Review Article | Published: 16 May 2024

A guide to artificial intelligence for cancer researchers

Raquel Perez-Lopez, Narmin Ghaffari Laleh, Faisal Mahmood & Jakob Nikolas Kather

Nature Reviews Cancer (2024)


  • Cancer imaging
  • Mathematics and computing
  • Tumour biomarkers

Artificial intelligence (AI) has been commoditized. It has evolved from a specialty resource to a readily accessible tool for cancer researchers. AI-based tools can boost research productivity in daily workflows, but can also extract hidden information from existing data, thereby enabling new scientific discoveries. Building a basic literacy in these tools is useful for every cancer researcher. Researchers with a traditional biological science focus can use AI-based tools through off-the-shelf software, whereas those who are more computationally inclined can develop their own AI-based software pipelines. In this article, we provide a practical guide for non-computational cancer researchers to understand how AI-based tools can benefit them. We convey general principles of AI for applications in image analysis, natural language processing and drug discovery. In addition, we give examples of how non-computational researchers can get started on the journey to productively use AI in their own work.




Vorontsov, E. et al. Virchow: a million-slide digital pathology foundation model. Preprint at https://doi.org/10.48550/arXiv.2309.07778 (2023).

Clusmann, J. et al. The future landscape of large language models in medicine. Commun. Med. 3 , 141 (2023).

Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with GPT-4. Preprint at https://doi.org/10.48550/arXiv.2303.12712 (2023).

Truhn, D., Reis-Filho, J. S. & Kather, J. N. Large language models should be used as scientific reasoning engines, not knowledge databases. Nat. Med. 29 , 2983–2984 (2023).

Adams, L. C. et al. Leveraging GPT-4 for post hoc transformation of free-text radiology reports into structured reporting: a multilingual feasibility study. Radiology 307 , e230725 (2023).

Truhn, D. et al. Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4). J. Pathol. 262 , 310–319 (2023).

Wiest, I. C. et al. From text to tables: a local privacy preserving large language model for structured information retrieval from medical documents. Preprint at bioRxiv https://doi.org/10.1101/2023.12.07.23299648 (2023).

Singhal, K. et al. Large language models encode clinical knowledge. Nature 620 , 172–180 (2023).

Truhn, D. et al. A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports. Sci. Rep. 13 , 20159 (2023).

Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620 , 47–60 (2023).

Derraz, B. et al. New regulatory thinking is needed for AI-based personalised drug and cell therapies in precision oncology. NPJ Precis. Oncol. https://doi.org/10.1038/s41698-024-00517-w (2024).

Extance, A. ChatGPT has entered the classroom: how LLMs could transform education. Nature 623 , 474–477 (2023).

Thirunavukarasu, A. J. et al. Large language models in medicine. Nat. Med. 29 , 1930–1940 (2023).

Webster, P. Six ways large language models are changing healthcare. Nat. Med. 29 , 2969–2971 (2023).

Krishnan, R., Rajpurkar, P. & Topol, E. J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 6 , 1346–1352 (2022).

Meskó, B. Prompt engineering as an important emerging skill for medical professionals: tutorial. J. Med. Internet Res. 25 , e50638 (2023).

Sushil, M. et al. CORAL: expert-curated oncology reports to advance language model inference. NEJM AI 1 , 4 (2024).

Brown, T. B. et al. Language models are few-shot learners. Preprint at https://doi.org/10.48550/arXiv.2005.14165 (2020).

Ferber, D. & Kather, J. N. Large language models in uro-oncology. Eur. Urol. Oncol. 7 , 157–159 (2023).

Jiang, L. Y. et al. Health system-scale language models are all-purpose prediction engines. Nature 619 , 357–362 (2023).

Nori, H. et al. Can generalist foundation models outcompete special-purpose tuning? Case study in medicine. Preprint at https://doi.org/10.48550/arXiv.2311.16452 (2023).

Balaguer, A. et al. RAG vs fine-tuning: pipelines, tradeoffs, and a case study on agriculture. Preprint at https://doi.org/10.48550/arXiv.2401.08406 (2024).

Gemini Team et al. Gemini: a family of highly capable multimodal models. Preprint at https://doi.org/10.48550/arXiv.2312.11805 (2023).

Tisman, G. & Seetharam, R. OpenAI’s ChatGPT-4, BARD and YOU.Com (AI) and the cancer patient, for now, caveat emptor, but stay tuned. Digit. Med. Healthc. Technol. https://doi.org/10.5772/dmht.19 (2023).

Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://doi.org/10.48550/arXiv.2302.13971 (2023).

Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 40 , 1095–1110 (2022).

Niehues, J. M. et al. Generalizable biomarker prediction from cancer pathology slides with self-supervised deep learning: a retrospective multi-centric study. Cell Rep. Med. 4 , 100980 (2023).

Foersch, S. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat. Med. 29 , 430–439 (2023).

Boehm, K. M. et al. Multimodal data integration using machine learning improves risk stratification of high-grade serous ovarian cancer. Nat. Cancer 3 , 723–733 (2022).

Vanguri, R. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3 , 1151–1164 (2022).

Shifai, N., van Doorn, R., Malvehy, J. & Sangers, T. E. Can ChatGPT vision diagnose melanoma? An exploratory diagnostic accuracy study. J. Am. Acad. Dermatol. 90, 1057–1059 (2024).

Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. Preprint at https://doi.org/10.48550/arXiv.2304.08485 (2023).

Li, C. et al. LLaVA-med: training a large language-and-vision assistant for biomedicine in one day. Preprint at https://doi.org/10.48550/arXiv.2306.00890 (2023).

Lu, M. Y. et al. A foundational multimodal vision language AI assistant for human pathology. Preprint at https://doi.org/10.48550/arXiv.2312.07814 (2023).

Adalsteinsson, V. A. et al. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors. Nat. Commun. 8 , 1324 (2017).

Zhang, Z. et al. Uniform genomic data analysis in the NCI Genomic Data Commons. Nat. Commun. 12 , 1226 (2021).

Vega, D. M. et al. Aligning tumor mutational burden (TMB) quantification across diagnostic platforms: phase II of the Friends of Cancer Research TMB Harmonization Project. Ann. Oncol. 32 , 1626–1636 (2021).

Anaya, J., Sidhom, J.-W., Mahmood, F. & Baras, A. S. Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status. Nat. Biomed. Eng. 8 , 57–67 (2023).

Chen, B. et al. Predicting HLA class II antigen presentation through integrated deep learning. Nat. Biotechnol. 37 , 1332–1343 (2019).

Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596 , 583–589 (2021).

Callaway, E. What’s next for AlphaFold and the AI protein-folding revolution. Nature 604 , 234–238 (2022).

Cheng, J. et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science 381 , eadg7492 (2023).

Barrio-Hernandez, I. et al. Clustering predicted structures at the scale of the known protein universe. Nature 622 , 637–645 (2023).

Yang, X., Wang, Y., Byrne, R., Schneider, G. & Yang, S. Concepts of artificial intelligence for computer-assisted drug discovery. Chem. Rev. 119 , 10520–10594 (2019).

Mullowney, M. W. et al. Artificial intelligence for natural product drug discovery. Nat. Rev. Drug Discov. 22 , 895–916 (2023).

Jayatunga, M. K. P., Xie, W., Ruder, L., Schulze, U. & Meier, C. AI in small-molecule drug discovery: a coming wave? Nat. Rev. Drug Discov. 21 , 175–176 (2022).

Vert, J.-P. How will generative AI disrupt data science in drug discovery? Nat. Biotechnol. 41 , 750–751 (2023).

Wong, F. et al. Discovery of a structural class of antibiotics with explainable deep learning. Nature 626 , 177–185 (2023).

Swanson, K. et al. Generative AI for designing and validating easily synthesizable and structurally novel antibiotics. Nat. Mach. Intell. 6 , 338–353 (2024).

Janizek, J. D. et al. Uncovering expression signatures of synergistic drug responses via ensembles of explainable machine-learning models. Nat. Biomed. Eng. 7 , 811–829 (2023).

Savage, N. Drug discovery companies are customizing ChatGPT: here’s how. Nat. Biotechnol. 41 , 585–586 (2023).

Boiko, D. A., MacKnight, R., Kline, B. & Gomes, G. Autonomous chemical research with large language models. Nature 624 , 570–578 (2023).

Arnold, C. AlphaFold touted as next big thing for drug discovery—but is it? Nature 622 , 15–17 (2023).

Mock, M., Edavettal, S., Langmead, C. & Russell, A. AI can help to speed up drug discovery—but only if we give it the right data. Nature 621 , 467–470 (2023).

AI’s potential to accelerate drug discovery needs a reality check. Nature 622 , 217 (2023).

Upswing in AI drug-discovery deals. Nat. Biotechnol. 41, 1361 (2023).

Hutson, M. AI for drug discovery is booming, but who owns the patents? Nat. Biotechnol. 41 , 1494–1496 (2023).

Wong, C. H., Siah, K. W. & Lo, A. W. Estimation of clinical trial success rates and related parameters. Biostatistics 20 , 273–286 (2019).

Subbiah, V. The next generation of evidence-based medicine. Nat. Med. 29 , 49–58 (2023).

Yuan, C. et al. Criteria2Query: a natural language interface to clinical databases for cohort definition. J. Am. Med. Inform. Assoc. 26 , 294–305 (2019).

Lu, L., Dercle, L., Zhao, B. & Schwartz, L. H. Deep learning for the prediction of early on-treatment response in metastatic colorectal cancer from serial medical imaging. Nat. Commun. 12 , 6654 (2021).

Trebeschi, S. et al. Prognostic value of deep learning-mediated treatment monitoring in lung cancer patients receiving immunotherapy. Front. Oncol. 11 , 609054 (2021).

Castelo-Branco, L. et al. ESMO guidance for reporting oncology real-world evidence (GROW). Ann. Oncol. 34 , 1097–1112 (2023).

Morin, O. et al. An artificial intelligence framework integrating longitudinal electronic health records with real-world data enables continuous pan-cancer prognostication. Nat. Cancer 2 , 709–722 (2021).

Yang, X. et al. A large language model for electronic health records. NPJ Digit. Med. 5 , 194 (2022).

Huang, X., Rymbekova, A., Dolgova, O., Lao, O. & Kuhlwilm, M. Harnessing deep learning for population genetic inference. Nat. Rev. Genet. 25 , 61–78 (2024).

Pawlicki, T. F., Lee, D.-S., Hull, J. J. & Srihari, S. N. Neural network models and their application to handwritten digit recognition. In Proc. IEEE 1988 International Conference on Neural Networks 63–70 (IEEE, 1988).

Chui, M. et al. The economic potential of generative AI: the next productivity frontier. McKinsey https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier (2023).

Dell’Acqua, F. et al. Navigating the jagged technological frontier: field experimental evidence of the effects of AI on knowledge worker productivity and quality. Harvard Business School https://www.hbs.edu/ris/Publication%20Files/24-013_d9b45b68-9e74-42d6-a1c6-c72fb70c7282.pdf (2023).

Boehm, K. M., Khosravi, P., Vanguri, R., Gao, J. & Shah, S. P. Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22 , 114–126 (2022).

Gilbert, S., Harvey, H., Melvin, T., Vollebregt, E. & Wicks, P. Large language model AI chatbots require approval as medical devices. Nat. Med. 29 , 2396–2398 (2023).

Mobadersany, P. et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl Acad. Sci. USA 115 , E2970–E2979 (2018).

Chang, Y. et al. A survey on evaluation of large language models. ACM Trans. Intell. Syst. Technol. 15 , 1–45 (2024).

Lin, T., Wang, Y., Liu, X. & Qiu, X. A survey of transformers. AI Open 3 , 111–132 (2022).

Acknowledgements

R.P.-L. is supported by LaCaixa Foundation, a CRIS Foundation Talent Award (TALENT19-05), the FERO Foundation, the Instituto de Salud Carlos III-Investigacion en Salud (PI18/01395 and PI21/01019), the Prostate Cancer Foundation (18YOUN19) and the Asociación Española Contra el Cancer (AECC) (PRYCO211023SERR). J.N.K. is supported by the German Cancer Aid (DECADE, 70115166), the German Federal Ministry of Education and Research (PEARL, 01KD2104C; CAMINO, 01EO2101; SWAG, 01KD2215A; TRANSFORM LIVER, 031L0312A; and TANGERINE, 01KT2302 through ERA-NET Transcan), the German Academic Exchange Service (SECAI, 57616814), the German Federal Joint Committee (TransplantKI, 01VSF21048), the European Union’s Horizon Europe and innovation programme (ODELIA, 101057091; and GENIAL, 101096312), the European Research Council (ERC; NADIR, 101114631) and the National Institute for Health and Care Research (NIHR; NIHR203331) Leeds Biomedical Research Centre. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care. This work was funded by the European Union. Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.

Author information

Authors and affiliations

Radiomics Group, Vall d’Hebron Institute of Oncology, Vall d’Hebron Barcelona Hospital Campus, Barcelona, Spain

Raquel Perez-Lopez

Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany

Narmin Ghaffari Laleh & Jakob Nikolas Kather

Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA

Faisal Mahmood

Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA

Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA

Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA

Department of Medicine I, University Hospital Dresden, Dresden, Germany

Jakob Nikolas Kather

Medical Oncology, National Center for Tumour Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany

Contributions

All authors contributed substantially to discussion of the content and reviewed and/or edited the manuscript before submission. R.P.-L., N.G.L. and J.N.K. researched data for the article and wrote the article.

Corresponding author

Correspondence to Jakob Nikolas Kather .

Ethics declarations

Competing interests

J.N.K. declares consulting services for Owkin, DoMore Diagnostics, Panakeia, Scailyte, Mindpeak and MultiplexDx; holds shares in StratifAI GmbH; has received a research grant from GSK; and has received honoraria from AstraZeneca, Bayer, Eisai, Janssen, MSD, BMS, Roche, Pfizer and Fresenius. R.P.-L. declares research funding by AstraZeneca and Roche, and participates in the steering committee of a clinical trial sponsored by Roche, not related to this work. All other authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Cancer thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Hugging Face: https://huggingface.co/

You.com: https://you.com

Supplementary information

Glossary

Application programming interface (API). A set of tools and protocols for building software and applications, enabling software to communicate with AI models.
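As an illustration of the API entry, the sketch below assembles a request payload for a hypothetical chat-style model API. The model name, prompts and field names are invented for this example; real providers define their own schemas, so check their documentation.

```python
import json


def build_chat_request(model, system_prompt, user_prompt, temperature=0.0):
    """Assemble a JSON payload for a chat-style model API.

    The field names follow a common chat-completion convention,
    but the exact schema is provider-specific.
    """
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }


payload = build_chat_request(
    model="example-model",  # hypothetical model name
    system_prompt="You extract tumour stage from pathology reports.",
    user_prompt="Report: pT2 N0 M0 adenocarcinoma of the colon.",
)
# This string is what would be POSTed to the provider's endpoint.
body = json.dumps(payload)
```

Keeping payload construction separate from the network call makes the request easy to inspect and test before any data leaves the machine.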

Artificial neural networks (ANNs). Computational models loosely inspired by the structure and function of the human brain, consisting of interconnected layers of nodes, called neurons, that process input data and learn to recognize patterns and make decisions.

Computational pathology. The use of algorithms, machine learning and image analysis techniques to extract information from digital pathology images.

Computer vision. A field of AI that focuses on enabling computers to analyse and interpret visual data, such as images and videos.

Convolutional neural networks (CNNs). A type of deep neural network that is especially effective for analysing visual imagery and is widely used in image analysis.
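The core operation of a CNN is the convolution, in which a small learned filter is slid across an image. The pure-Python sketch below (no deep-learning library; the image and kernel values are illustrative) applies a 2 × 2 vertical-edge filter and responds only where the pixel intensities change:

```python
def conv2d(image, kernel):
    """Valid 2D convolution (strictly, cross-correlation, as in most
    deep-learning libraries) of a single-channel image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            # Sum of element-wise products over the kernel window.
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out


# A tiny image with a step edge between its left and right halves.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A vertical-edge detector: responds where intensity increases left-to-right.
kernel = [
    [-1, 1],
    [-1, 1],
]
response = conv2d(image, kernel)  # -> [[0, 2, 0], [0, 2, 0]]
```

In a trained CNN the kernel values are not hand-chosen as here but learned from data, and many such filters are stacked in layers.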

Deep learning. A subfield of machine learning that uses artificial neural networks with multiple layers, called deep neural networks, to learn and extract highly complex features and patterns from raw input data.

Digital images. Visual representations captured and stored in a digital format, consisting of a grid of pixels, with each pixel representing a colour intensity value.

Digital pathology. The practice of converting glass slides into digital slides that can be viewed, managed and analysed on a computer.

Explainable AI. Techniques in AI that provide insights and explanations on how the AI model arrived at its conclusions, thus making the decision-making process of the AI more transparent.

Generative AI. AI systems that can generate new content (text, images or music) that is similar to the content on which they were trained, often creating novel and coherent outputs.

Gigapixel images. Extremely high-resolution digital images consisting of around 1 billion pixels, obtained by scanning tissue slides with a slide scanner.

Graphics processing units (GPUs). Specialized hardware used to rapidly process large blocks of data simultaneously, used in computer gaming and AI.

Large language models (LLMs). Advanced AI models trained on vast amounts of text data, capable of analysing, generating and manipulating human language, often at the human level (ref. 174).

Long short-term memory (LSTM). A type of neural network particularly good at processing sequences of data (such as time series or language), with a capability to remember information for a certain time.

Machine learning. A subset of AI focusing on the development of algorithms and models that enable computers to learn and improve their performance on a specific task without being explicitly instructed how to achieve this.

Natural language processing (NLP). A branch of AI that helps computers to analyse, interpret and respond to human language in a useful way.

Prompt engineering. Crafting inputs or questions in a way that guides AI models, particularly LLMs, to provide the most effective and accurate responses.
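A common prompt-engineering pattern is to state a role, the task, and a strict output format before the input text. The sketch below builds such a prompt; the helper name, wording and example report are invented for illustration, and real prompts need iteration against real data:

```python
def build_extraction_prompt(report_text):
    """Structured prompt: role, task, output format, then the input.

    The wording is illustrative; effective prompts are refined
    empirically on the target model and data.
    """
    return (
        "You are an oncology data curator.\n"
        "Task: extract the TNM stage from the report below.\n"
        'Answer with JSON like {"T": "...", "N": "...", "M": "..."} '
        "and nothing else. If a field is absent, use null.\n\n"
        f"Report:\n{report_text}"
    )


prompt = build_extraction_prompt("Invasive ductal carcinoma, pT2 pN1 M0.")
```

Pinning the output format ("JSON ... and nothing else") makes the model's answer machine-parseable, which matters when prompts are run over thousands of reports.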

Transformers. A type of neural network model that excels at processing sequences of data, such as sentences in text, by focusing on different parts of the sequence to make predictions (ref. 175).
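The mechanism by which transformers "focus on different parts of the sequence" is scaled dot-product attention: each position scores every other position, the scores are normalized with a softmax, and the output is the weighted mix of the values. A minimal pure-Python sketch with illustrative two-dimensional toy vectors:

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention: each query attends over all keys,
    and its output is the attention-weighted mix of the values."""
    d = len(K[0])  # key dimension, used for the 1/sqrt(d) scaling
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out


# Two token positions with 2-dimensional queries, keys and values.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
mixed = attention(Q, K, V)
```

Because the first query aligns with the first key, the first output row is pulled towards the first value vector, and symmetrically for the second; a full transformer learns the projections that produce Q, K and V and runs many such attention heads in parallel.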

Voxel. The three-dimensional equivalent of a pixel in images, representing a value on a regular grid in three-dimensional space, commonly used in medical imaging such as MRI and CT scans.
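Because each voxel has a physical size given by the scan's spacing, counting voxels in a segmentation converts directly into a physical volume. A small sketch (the mask and spacing values are illustrative; the nesting order is assumed to be z, y, x):

```python
def voxel_volume_mm3(spacing):
    """Physical volume of one voxel from the scan's (x, y, z) spacing in mm."""
    sx, sy, sz = spacing
    return sx * sy * sz


def lesion_volume_ml(mask, spacing):
    """Volume of a segmented lesion: count foreground voxels in a binary
    mask (nested z/y/x lists) and multiply by the per-voxel volume."""
    n = sum(voxel for plane in mask for row in plane for voxel in row)
    return n * voxel_volume_mm3(spacing) / 1000.0  # mm^3 -> ml


# A 2x2x2 volume in which 4 voxels are labelled as lesion,
# with an illustrative CT spacing of 0.5 x 0.5 x 1.0 mm.
mask = [
    [[1, 1], [0, 0]],
    [[1, 1], [0, 0]],
]
volume = lesion_volume_ml(mask, spacing=(0.5, 0.5, 1.0))  # 4 voxels * 0.25 mm^3
```

Anisotropic spacing (slice thickness larger than in-plane resolution, as here) is the norm in CT and MRI, which is why volume estimates must use the per-axis spacing rather than assuming cubic voxels.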

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Perez-Lopez, R., Ghaffari Laleh, N., Mahmood, F. et al. A guide to artificial intelligence for cancer researchers. Nat Rev Cancer (2024). https://doi.org/10.1038/s41568-024-00694-7

Accepted : 09 April 2024

Published : 16 May 2024

DOI : https://doi.org/10.1038/s41568-024-00694-7
