Machine Learning - CMU

PhD Dissertations

[all are .pdf files].

Neural processes underlying cognitive control during language production (unavailable) Tara Pirnia, 2024

The Neurodynamic Basis of Real World Face Perception Arish Alreja, 2024

Towards More Powerful Graph Representation Learning Lingxiao Zhao, 2024

Robust Machine Learning: Detection, Evaluation and Adaptation Under Distribution Shift Saurabh Garg, 2024

Understanding, Formally Characterizing, and Robustly Handling Real-World Distribution Shift Elan Rosenfeld, 2024

Representing Time: Towards Pragmatic Multivariate Time Series Modeling Cristian Ignacio Challu, 2024

Foundations of Multisensory Artificial Intelligence Paul Pu Liang, 2024

Advancing Model-Based Reinforcement Learning with Applications in Nuclear Fusion Ian Char, 2024

Learning Models that Match Jacob Tyo, 2024

Improving Human Integration across the Machine Learning Pipeline Charvi Rastogi, 2024

Reliable and Practical Machine Learning for Dynamic Healthcare Settings Helen Zhou, 2023

Automatic customization of large-scale spiking network models to neuronal population activity (unavailable) Shenghao Wu, 2023

Estimation of BVk functions from scattered data (unavailable) Addison J. Hu, 2023

Rethinking object categorization in computer vision (unavailable) Jayanth Koushik, 2023

Advances in Statistical Gene Networks Jinjin Tian, 2023

Post-hoc calibration without distributional assumptions Chirag Gupta, 2023

The Role of Noise, Proxies, and Dynamics in Algorithmic Fairness Nil-Jana Akpinar, 2023

Collaborative learning by leveraging siloed data Sebastian Caldas, 2023

Modeling Epidemiological Time Series Aaron Rumack, 2023

Human-Centered Machine Learning: A Statistical and Algorithmic Perspective Leqi Liu, 2023

Uncertainty Quantification under Distribution Shifts Aleksandr Podkopaev, 2023

Probabilistic Reinforcement Learning: Using Data to Define Desired Outcomes, and Inferring How to Get There Benjamin Eysenbach, 2023

Comparing Forecasters and Abstaining Classifiers Yo Joong Choe, 2023

Using Task Driven Methods to Uncover Representations of Human Vision and Semantics Aria Yuan Wang, 2023

Data-driven Decisions - An Anomaly Detection Perspective Shubhranshu Shekhar, 2023

Applied Mathematics of the Future Kin G. Olivares, 2023

Methods and Applications of Explainable Machine Learning Joon Sik Kim, 2023

Neural Reasoning for Question Answering Haitian Sun, 2023

Principled Machine Learning for Societally Consequential Decision Making Amanda Coston, 2023

Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Maxwell B. Wang, 2023

Long term brain dynamics extend cognitive neuroscience to timescales relevant for health and physiology Darby M. Losey, 2023

Calibrated Conditional Density Models and Predictive Inference via Local Diagnostics David Zhao, 2023

Towards an Application-based Pipeline for Explainability Gregory Plumb, 2022

Objective Criteria for Explainable Machine Learning Chih-Kuan Yeh, 2022

Making Scientific Peer Review Scientific Ivan Stelmakh, 2022

Facets of regularization in high-dimensional learning: Cross-validation, risk monotonization, and model complexity Pratik Patil, 2022

Active Robot Perception using Programmable Light Curtains Siddharth Ancha, 2022

Strategies for Black-Box and Multi-Objective Optimization Biswajit Paria, 2022

Unifying State and Policy-Level Explanations for Reinforcement Learning Nicholay Topin, 2022

Sensor Fusion Frameworks for Nowcasting Maria Jahja, 2022

Equilibrium Approaches to Modern Deep Learning Shaojie Bai, 2022

Towards General Natural Language Understanding with Probabilistic Worldbuilding Abulhair Saparov, 2022

Applications of Point Process Modeling to Spiking Neurons (Unavailable) Yu Chen, 2021

Neural variability: structure, sources, control, and data augmentation Akash Umakantha, 2021

Structure and time course of neural population activity during learning Jay Hennig, 2021

Cross-view Learning with Limited Supervision Yao-Hung Hubert Tsai, 2021

Meta Reinforcement Learning through Memory Emilio Parisotto, 2021

Learning Embodied Agents with Scalably-Supervised Reinforcement Learning Lisa Lee, 2021

Learning to Predict and Make Decisions under Distribution Shift Yifan Wu, 2021

Statistical Game Theory Arun Sai Suggala, 2021

Towards Knowledge-capable AI: Agents that See, Speak, Act and Know Kenneth Marino, 2021

Learning and Reasoning with Fast Semidefinite Programming and Mixing Methods Po-Wei Wang, 2021

Bridging Language in Machines with Language in the Brain Mariya Toneva, 2021

Curriculum Learning Otilia Stretcu, 2021

Principles of Learning in Multitask Settings: A Probabilistic Perspective Maruan Al-Shedivat, 2021

Towards Robust and Resilient Machine Learning Adarsh Prasad, 2021

Towards Training AI Agents with All Types of Experiences: A Unified ML Formalism Zhiting Hu, 2021

Building Intelligent Autonomous Navigation Agents Devendra Chaplot, 2021

Learning to See by Moving: Self-supervising 3D Scene Representations for Perception, Control, and Visual Reasoning Hsiao-Yu Fish Tung, 2021

Statistical Astrophysics: From Extrasolar Planets to the Large-scale Structure of the Universe Collin Politsch, 2020

Causal Inference with Complex Data Structures and Non-Standard Effects Kwhangho Kim, 2020

Networks, Point Processes, and Networks of Point Processes Neil Spencer, 2020

Dissecting neural variability using population recordings, network models, and neurofeedback (Unavailable) Ryan Williamson, 2020

Predicting Health and Safety: Essays in Machine Learning for Decision Support in the Public Sector Dylan Fitzpatrick, 2020

Towards a Unified Framework for Learning and Reasoning Han Zhao, 2020

Learning DAGs with Continuous Optimization Xun Zheng, 2020

Machine Learning and Multiagent Preferences Ritesh Noothigattu, 2020

Learning and Decision Making from Diverse Forms of Information Yichong Xu, 2020

Towards Data-Efficient Machine Learning Qizhe Xie, 2020

Change modeling for understanding our world and the counterfactual one(s) William Herlands, 2020

Machine Learning in High-Stakes Settings: Risks and Opportunities Maria De-Arteaga, 2020

Data Decomposition for Constrained Visual Learning Calvin Murdock, 2020

Structured Sparse Regression Methods for Learning from High-Dimensional Genomic Data Micol Marchetti-Bowick, 2020

Towards Efficient Automated Machine Learning Liam Li, 2020

Learning Collections of Functions Emmanouil Antonios Platanios, 2020

Provable, structured, and efficient methods for robustness of deep networks to adversarial examples Eric Wong, 2020

Reconstructing and Mining Signals: Algorithms and Applications Hyun Ah Song, 2020

Probabilistic Single Cell Lineage Tracing Chieh Lin, 2020

Graphical network modeling of phase coupling in brain activity (unavailable) Josue Orellana, 2019

Strategic Exploration in Reinforcement Learning - New Algorithms and Learning Guarantees Christoph Dann, 2019

Learning Generative Models using Transformations Chun-Liang Li, 2019

Estimating Probability Distributions and their Properties Shashank Singh, 2019

Post-Inference Methods for Scalable Probabilistic Modeling and Sequential Decision Making Willie Neiswanger, 2019

Accelerating Text-as-Data Research in Computational Social Science Dallas Card, 2019

Multi-view Relationships for Analytics and Inference Eric Lei, 2019

Information flow in networks based on nonstationary multivariate neural recordings Natalie Klein, 2019

Competitive Analysis for Machine Learning & Data Science Michael Spece, 2019

The When, Where and Why of Human Memory Retrieval Qiong Zhang, 2019

Towards Effective and Efficient Learning at Scale Adams Wei Yu, 2019

Towards Literate Artificial Intelligence Mrinmaya Sachan, 2019

Learning Gene Networks Underlying Clinical Phenotypes Under SNP Perturbations From Genome-Wide Data Calvin McCarter, 2019

Unified Models for Dynamical Systems Carlton Downey, 2019

Anytime Prediction and Learning for the Balance between Computation and Accuracy Hanzhang Hu, 2019

Statistical and Computational Properties of Some "User-Friendly" Methods for High-Dimensional Estimation Alnur Ali, 2019

Nonparametric Methods with Total Variation Type Regularization Veeranjaneyulu Sadhanala, 2019

New Advances in Sparse Learning, Deep Networks, and Adversarial Learning: Theory and Applications Hongyang Zhang, 2019

Gradient Descent for Non-convex Problems in Modern Machine Learning Simon Shaolei Du, 2019

Selective Data Acquisition in Learning and Decision Making Problems Yining Wang, 2019

Anomaly Detection in Graphs and Time Series: Algorithms and Applications Bryan Hooi, 2019

Neural dynamics and interactions in the human ventral visual pathway Yuanning Li, 2018

Tuning Hyperparameters without Grad Students: Scaling up Bandit Optimisation Kirthevasan Kandasamy, 2018

Teaching Machines to Classify from Natural Language Interactions Shashank Srivastava, 2018

Statistical Inference for Geometric Data Jisu Kim, 2018

Representation Learning @ Scale Manzil Zaheer, 2018

Diversity-promoting and Large-scale Machine Learning for Healthcare Pengtao Xie, 2018

Distribution and Histogram (DIsH) Learning Junier Oliva, 2018

Stress Detection for Keystroke Dynamics Shing-Hon Lau, 2018

Sublinear-Time Learning and Inference for High-Dimensional Models Enxu Yan, 2018

Neural population activity in the visual cortex: Statistical methods and application Benjamin Cowley, 2018

Efficient Methods for Prediction and Control in Partially Observable Environments Ahmed Hefny, 2018

Learning with Staleness Wei Dai, 2018

Statistical Approach for Functionally Validating Transcription Factor Bindings Using Population SNP and Gene Expression Data Jing Xiang, 2017

New Paradigms and Optimality Guarantees in Statistical Learning and Estimation Yu-Xiang Wang, 2017

Dynamic Question Ordering: Obtaining Useful Information While Reducing User Burden Kirstin Early, 2017

New Optimization Methods for Modern Machine Learning Sashank J. Reddi, 2017

Active Search with Complex Actions and Rewards Yifei Ma, 2017

Why Machine Learning Works George D. Montañez, 2017

Source-Space Analyses in MEG/EEG and Applications to Explore Spatio-temporal Neural Dynamics in Human Vision Ying Yang, 2017

Computational Tools for Identification and Analysis of Neuronal Population Activity Pengcheng Zhou, 2016

Expressive Collaborative Music Performance via Machine Learning Gus (Guangyu) Xia, 2016

Supervision Beyond Manual Annotations for Learning Visual Representations Carl Doersch, 2016

Exploring Weakly Labeled Data Across the Noise-Bias Spectrum Robert W. H. Fisher, 2016

Optimizing Optimization: Scalable Convex Programming with Proximal Operators Matt Wytock, 2016

Combining Neural Population Recordings: Theory and Application William Bishop, 2015

Discovering Compact and Informative Structures through Data Partitioning Madalina Fiterau-Brostean, 2015

Machine Learning in Space and Time Seth R. Flaxman, 2015

The Time and Location of Natural Reading Processes in the Brain Leila Wehbe, 2015

Shape-Constrained Estimation in High Dimensions Min Xu, 2015

Spectral Probabilistic Modeling and Applications to Natural Language Processing Ankur Parikh, 2015

Computational and Statistical Advances in Testing and Learning Aaditya Kumar Ramdas, 2015

Corpora and Cognition: The Semantic Composition of Adjectives and Nouns in the Human Brain Alona Fyshe, 2015

Learning Statistical Features of Scene Images Wooyoung Lee, 2014

Towards Scalable Analysis of Images and Videos Bin Zhao, 2014

Statistical Text Analysis for Social Science Brendan T. O'Connor, 2014

Modeling Large Social Networks in Context Qirong Ho, 2014

Semi-Cooperative Learning in Smart Grid Agents Prashant P. Reddy, 2013

On Learning from Collective Data Liang Xiong, 2013

Exploiting Non-sequence Data in Dynamic Model Learning Tzu-Kuo Huang, 2013

Mathematical Theories of Interaction with Oracles Liu Yang, 2013

Short-Sighted Probabilistic Planning Felipe W. Trevizan, 2013

Statistical Models and Algorithms for Studying Hand and Finger Kinematics and their Neural Mechanisms Lucia Castellanos, 2013

Approximation Algorithms and New Models for Clustering and Learning Pranjal Awasthi, 2013

Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems Mladen Kolar, 2013

Learning with Sparsity: Structures, Optimization and Applications Xi Chen, 2013

GraphLab: A Distributed Abstraction for Large Scale Machine Learning Yucheng Low, 2013

Graph Structured Normal Means Inference James Sharpnack, 2013 (Joint Statistics & ML PhD)

Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data Hai-Son Phuoc Le, 2013

Learning Large-Scale Conditional Random Fields Joseph K. Bradley, 2013

New Statistical Applications for Differential Privacy Rob Hall, 2013 (Joint Statistics & ML PhD)

Parallel and Distributed Systems for Probabilistic Reasoning Joseph Gonzalez, 2012

Spectral Approaches to Learning Predictive Representations Byron Boots, 2012

Attribute Learning using Joint Human and Machine Computation Edith L. M. Law, 2012

Statistical Methods for Studying Genetic Variation in Populations Suyash Shringarpure, 2012

Data Mining Meets HCI: Making Sense of Large Graphs Duen Horng (Polo) Chau, 2012

Learning with Limited Supervision by Input and Output Coding Yi Zhang, 2012

Target Sequence Clustering Benjamin Shih, 2011

Nonparametric Learning in High Dimensions Han Liu, 2010 (Joint Statistics & ML PhD)

Structural Analysis of Large Networks: Observations and Applications Mary McGlohon, 2010

Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy Brian D. Ziebart, 2010

Tractable Algorithms for Proximity Search on Large Graphs Purnamrita Sarkar, 2010

Rare Category Analysis Jingrui He, 2010

Coupled Semi-Supervised Learning Andrew Carlson, 2010

Fast Algorithms for Querying and Mining Large Graphs Hanghang Tong, 2009

Efficient Matrix Models for Relational Learning Ajit Paul Singh, 2009

Exploiting Domain and Task Regularities for Robust Named Entity Recognition Andrew O. Arnold, 2009

Theoretical Foundations of Active Learning Steve Hanneke, 2009

Generalized Learning Factors Analysis: Improving Cognitive Models with Machine Learning Hao Cen, 2009

Detecting Patterns of Anomalies Kaustav Das, 2009

Dynamics of Large Networks Jurij Leskovec, 2008

Computational Methods for Analyzing and Modeling Gene Regulation Dynamics Jason Ernst, 2008

Stacked Graphical Learning Zhenzhen Kou, 2007

Actively Learning Specific Function Properties with Applications to Statistical Inference Brent Bryan, 2007

Approximate Inference, Structure Learning and Feature Estimation in Markov Random Fields Pradeep Ravikumar, 2007

Scalable Graphical Models for Social Networks Anna Goldenberg, 2007

Measure Concentration of Strongly Mixing Processes with Applications Leonid Kontorovich, 2007

Tools for Graph Mining Deepayan Chakrabarti, 2005

Automatic Discovery of Latent Variable Models Ricardo Silva, 2005

Available Master's thesis topics in machine learning

Here we list topics that are available. You may also be interested in our list of completed Master's theses.

Learning and inference with large Bayesian networks

Most learning and inference tasks with Bayesian networks are NP-hard. Therefore, one often resorts to heuristics that give no quality guarantees.

Task: Evaluate quality of large-scale learning or inference algorithms empirically.

Advisor: Pekka Parviainen

Sum-product networks

Traditionally, probabilistic graphical models use a graph structure to represent dependencies and independencies between random variables. Sum-product networks are a relatively new type of graphical model in which the graph structure represents computations rather than relationships between variables. The benefit of this representation is that inference (computing conditional probabilities) can be done in linear time with respect to the size of the network.

Potential thesis topics in this area: a) Compare inference speed with sum-product networks and Bayesian networks, and characterize situations when one model is better than the other. b) Sum-product networks are learned using heuristic algorithms; what is the effect of this approximation in practice?
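As a toy illustration of the linear-time inference claim, the sketch below (all weights and leaf distributions are invented for illustration) evaluates a small sum-product network over two binary variables with a single bottom-up pass:

```python
# A toy sum-product network over two binary variables X1 and X2.
# Internal nodes are sums (weighted mixtures) or products (factors over
# disjoint variable sets); leaves are univariate distributions. One
# bottom-up pass answers a marginal query, so inference is linear in
# the number of edges.

def leaf(var, probs):
    # probs[v] = P(var = v); a variable absent from the evidence is
    # marginalized out (contributes factor 1)
    return lambda ev: 1.0 if ev.get(var) is None else probs[ev[var]]

def product_node(children):
    def f(ev):
        r = 1.0
        for c in children:
            r *= c(ev)
        return r
    return f

def sum_node(weights, children):
    return lambda ev: sum(w * c(ev) for w, c in zip(weights, children))

# mixture of two independent components (invented parameters)
spn = sum_node([0.5, 0.5], [
    product_node([leaf("X1", {1: 0.9, 0: 0.1}), leaf("X2", {1: 0.2, 0: 0.8})]),
    product_node([leaf("X1", {1: 0.1, 0: 0.9}), leaf("X2", {1: 0.8, 0: 0.2})]),
])

joint = spn({"X1": 1, "X2": 1})   # P(X1=1, X2=1)
marginal = spn({"X1": 1})         # P(X1=1), X2 marginalized out
conditional = joint / marginal    # P(X2=1 | X1=1)
```

Since evidence and marginalization use the same pass, a conditional probability is just the ratio of two linear-time evaluations.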

Bayesian Bayesian networks

The naming of Bayesian networks is somewhat misleading, because there is nothing Bayesian in them per se: a Bayesian network is just a representation of a joint probability distribution. One can, of course, use a Bayesian network while doing Bayesian inference. One can also learn Bayesian networks in a Bayesian way; that is, instead of finding an optimal network, one computes the posterior distribution over networks.

Task: Develop algorithms for Bayesian learning of Bayesian networks (e.g., MCMC, variational inference, EM)

Large-scale (probabilistic) matrix factorization

The idea behind matrix factorization is to represent a large data matrix as a product of two or more smaller matrices. Such factorizations are often used in, for example, dimensionality reduction and recommendation systems. Probabilistic matrix factorization methods can be used to quantify uncertainty in recommendations. However, large-scale (probabilistic) matrix factorization is computationally challenging.

Potential thesis topics in this area: a) Develop scalable methods for large-scale matrix factorization (non-probabilistic or probabilistic), b) Develop probabilistic methods for implicit feedback (e.g., a recommendation engine when there are no ratings, only knowledge of whether a customer has bought an item)
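As a minimal, non-probabilistic starting point (toy ratings, arbitrarily chosen hyperparameters, not the scalable methods the project would develop), a rank-k factorization of a sparse ratings matrix can be fitted with stochastic gradient descent:

```python
import random

# Minimal SGD matrix factorization: approximate a ratings matrix R
# (with missing entries) by U @ V^T with rank-k factors.

def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i, r) in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                uu, vv = U[u][f], V[i][f]
                # gradient step with L2 regularization
                U[u][f] += lr * (err * vv - reg * uu)
                V[i][f] += lr * (err * uu - reg * vv)
    return U, V

# (user, item, rating) triples; entry (2, 0) is unobserved
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 1, 2), (2, 1, 1)]
U, V = factorize(ratings, n_users=3, n_items=2)
missing = sum(U[2][f] * V[0][f] for f in range(2))  # predict unseen entry
```

A probabilistic variant would place priors on U and V and infer their posterior instead of point estimates.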

Bayesian deep learning

Standard deep neural networks do not quantify uncertainty in predictions. On the other hand, Bayesian methods provide a principled way to handle uncertainty. Combining these approaches leads to Bayesian neural networks. The challenge is that Bayesian neural networks can be cumbersome to use and difficult to learn.

The task is to analyze Bayesian neural networks and different inference algorithms in some simple setting.

Deep learning for combinatorial problems

Deep learning is usually applied in regression or classification problems. However, there has been some recent work on using deep learning to develop heuristics for combinatorial optimization problems; see, e.g., [1] and [2].

Task: Choose a combinatorial problem (or several related problems) and develop deep learning methods to solve them.

References: [1] Vinyals, Fortunato and Jaitly: Pointer networks. NIPS 2015. [2] Dai, Khalil, Zhang, Dilkina and Song: Learning Combinatorial Optimization Algorithms over Graphs. NIPS 2017.

Advisors: Pekka Parviainen, Ahmad Hemmati

Estimating the number of modes of an unknown function

Mode seeking considers estimating the number of local maxima of a function f. Sometimes one can find modes by, e.g., looking for points where the derivative of the function is zero. However, often the function is unknown and we only have access to some (possibly noisy) values of the function.

In topological data analysis, we can analyze topological structures using persistent homology. For 1-dimensional signals, this translates into looking at the birth/death persistence diagram, i.e. the birth and death of connected topological components as we expand the space around each point where we have observed our function. These observations turn out to be closely related to the modes (local maxima) of the function. A recent paper [1] proposed an efficient method for mode seeking.

In this project, the task is to extend the ideas from [1] to get a probabilistic estimate on the number of modes. To this end, one has to use probabilistic methods such as Gaussian processes.

[1] U. Bauer, A. Munk, H. Sieling, and M. Wardetzky. Persistence barcodes versus Kolmogorov signatures: Detecting modes of one-dimensional signals. Foundations of Computational Mathematics, 17:1-33, 2017.
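The connection between birth/death pairs and modes can be sketched with a basic union-find "elder rule" sweep for superlevel-set persistence of a sampled signal (a simplification for intuition, far from the method of [1]): values are visited from highest to lowest, each new component records a birth, and each merge records a death for the younger peak.

```python
# Superlevel-set persistence of a sampled 1-D signal via union-find.
# Peaks whose persistence (birth - death) exceeds a threshold count as modes.

def persistence_pairs(y):
    n = len(y)
    order = sorted(range(n), key=lambda i: -y[i])
    parent = [-1] * n          # -1 marks indices not yet activated
    birth = [None] * n
    pairs = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in order:            # sweep from the highest value downwards
        parent[i] = i
        birth[i] = y[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # elder rule: the younger peak (lower birth) dies here
                    if birth[ri] < birth[rj]:
                        ri, rj = rj, ri
                    pairs.append((birth[rj], y[i]))
                    parent[rj] = ri
    # the surviving component belongs to the global maximum
    pairs.append((max(y), min(y)))
    return pairs

def count_modes(y, threshold=0.0):
    return sum(1 for b, d in persistence_pairs(y) if b - d > threshold)

signal = [0, 2, 1, 3, 0.5, 2.5, 0]   # three clear peaks
n_modes = count_modes(signal, threshold=0.6)
```

A probabilistic extension, as in the project, would replace the observed values with a Gaussian-process posterior and propagate that uncertainty into the mode count.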

Advisors: Pekka Parviainen, Nello Blaser

Causal Abstraction Learning

We naturally make sense of the world around us by working out causal relationships between objects and by representing in our minds these objects with different degrees of approximation and detail. Both processes are essential to our understanding of reality, and likely to be fundamental for developing artificial intelligence. The first process may be expressed using the formalism of structural causal models, while the second can be grounded in the theory of causal abstraction [1].

This project will consider the problem of learning an abstraction between two given structural causal models. The primary goal will be the development of efficient algorithms able to learn a meaningful abstraction between the given causal models.

[1] Rubenstein, Paul K., et al. "Causal consistency of structural equation models." arXiv preprint arXiv:1707.00819 (2017).

Advisor: Fabio Massimo Zennaro

Causal Bandits

"Multi-armed bandit" is an informal name for slot machines, and the formal name of a large class of problems where an agent has to choose an action among a range of possibilities without knowing the ensuing rewards. Multi-armed bandit problems are one of the most essential reinforcement learning problems where an agent is directly faced with an exploitation-exploration trade-off.       This project will consider a class of multi-armed bandits where an agent, upon taking an action, interacts with a causal system [1]. The primary goal will be the development of learning strategies that takes advantage of the underlying causal system in order to learn optimal policies in a shortest amount of time.      [1] Lattimore, Finnian, Tor Lattimore, and Mark D. Reid. "Causal bandits: Learning good interventions via causal inference." Advances in neural information processing systems 29 (2016).

Causal Modelling for Battery Manufacturing

Lithium-ion batteries are poised to be one of the most important sources of energy in the near future. Yet the process of manufacturing these batteries is very hard to model and control. Optimizing the different phases of production to maximize the lifetime of the batteries is a non-trivial challenge, since physical models are limited in scope and collecting experimental data is extremely expensive and time-consuming [1].

This project will consider the problem of aggregating and analyzing data regarding a few stages in the process of battery manufacturing. The primary goal will be the development of algorithms for transporting and integrating data collected in different contexts, as well as the use of explainable algorithms to interpret them.

[1] Niri, Mona Faraji, et al. "Quantifying key factors for optimised manufacturing of Li-ion battery anode and cathode via artificial intelligence." Energy and AI 7 (2022): 100129.

Advisors: Fabio Massimo Zennaro, Mona Faraji Niri

Reinforcement Learning for Computer Security

The field of computer security presents a wide variety of challenging problems for artificial intelligence and autonomous agents. Guaranteeing the security of a system against attacks and penetrations by malicious hackers has always been a central concern of this field, and machine learning could now offer a substantial contribution. Security capture-the-flag simulations are particularly well-suited as a testbed for the application and development of reinforcement learning algorithms [1].

This project will consider the use of reinforcement learning for the preventive purpose of testing systems and discovering vulnerabilities before they can be exploited. The primary goal will be the modelling of capture-the-flag challenges of interest and the development of reinforcement learning algorithms that can solve them.

[1] Erdodi, Laszlo, and Fabio Massimo Zennaro. "The Agent Web Model--Modelling web hacking for reinforcement learning." arXiv preprint arXiv:2009.11274 (2020).

Advisors: Fabio Massimo Zennaro, Laszlo Tibor Erdodi

Approaches to AI Safety

The world and the Internet are more and more populated by artificial autonomous agents carrying out tasks on our behalf. Many of these agents are provided with an objective and learn their behaviour by trying to achieve that objective as well as they can. However, this approach cannot guarantee that an agent, while learning its behaviour, will not undertake actions that have unforeseen and undesirable effects. Research in AI safety tries to design autonomous agents that will behave in a predictable and safe way [1].

This project will consider specific problems and novel solutions in the domain of AI safety and reinforcement learning. The primary goal will be the development of innovative algorithms and their implementation within established frameworks.

[1] Amodei, Dario, et al. "Concrete problems in AI safety." arXiv preprint arXiv:1606.06565 (2016).

Reinforcement Learning for Super-modelling

Super-modelling [1] is a technique designed for combining complex dynamical models: pre-trained models are aggregated, with messages and information being exchanged in order to synchronize the behavior of the different models and produce more accurate and reliable predictions. Super-models are used, for instance, in weather or climate science, where pre-existing models are ensembled together and their states dynamically aggregated to generate more realistic simulations.

This project will consider how reinforcement learning algorithms may be used to solve the coordination problem among the individual models forming a super-model. The primary goal will be the formulation of the super-modelling problem within the reinforcement learning framework and the study of custom RL algorithms to improve the overall performance of super-models.

[1] Schevenhoven, Francine, et al. "Supermodeling: improving predictions with an ensemble of interacting models." Bulletin of the American Meteorological Society 104.9 (2023): E1670-E1686.

Advisors: Fabio Massimo Zennaro, Francine Janneke Schevenhoven

Multilevel Causal Discovery

Modelling causal relationships between variables of interest is a crucial step in understanding and controlling a system. A common approach is to represent such relations using graphs with directed arrows discriminating causes from effects.

While causal graphs are often built relying on expert knowledge, a more interesting challenge is to learn them from data. In particular, we want to consider the case where data might have been collected at multiple levels, for instance with sensors of different resolutions. In this project we want to explore how such heterogeneous data can help the process of inferring causal structures.

[1] Anand, Tara V., et al. "Effect identification in cluster causal diagrams." Proceedings of the 37th AAAI Conference on Artificial Intelligence. Vol. 82. 2023.

Advisors: Fabio Massimo Zennaro, Pekka Parviainen

Manifolds of Causal Models

Modelling causal relationships is fundamental in order to understand real-world systems. A common formalism is offered by structural causal models (SCMs), which represent these relationships graphically. However, SCMs are complex mathematical objects entailing collections of different probability distributions.

In this project we want to explore a differential geometric perspective on structural causal models [1]. We will model an SCM and the probability distributions it generates in terms of manifolds, and we will study how this modelling encodes causal properties of interest and how relevant quantities may be computed in this framework.

[1] Dominguez-Olmedo, Ricardo, et al. "On data manifolds entailed by structural causal models." International Conference on Machine Learning. PMLR, 2023.

Advisors: Fabio Massimo Zennaro, Nello Blaser

Abstraction for Epistemic Logic

Weighted Kripke models constitute a powerful formalism for expressing the evolving knowledge of an agent; they allow one to express known facts and beliefs, and to recursively model the knowledge of an agent about another agent. Moreover, such relations of knowledge can be given a graphical expression using suitable diagrams on which to perform reasoning. Unfortunately, such graphs can quickly become very large and inefficient to process.

This project considers the reduction of epistemic logic graphs using ideas from causal abstraction [1]. The primary goal will be the development of ML models that can learn to output small epistemic logic graphs that still satisfy logical and consistency constraints.

[1] Zennaro, Fabio Massimo, et al. "Jointly learning consistent causal abstractions over multiple interventional distributions." Conference on Causal Learning and Reasoning. PMLR, 2023.

Advisors: Fabio Massimo Zennaro, Rustam Galimullin

Optimal Transport for Public Transportation

Modelling public transportation across cities is critical in order to improve viability, provide reliable services and increase reliance on greener forms of mass transport. Yet cities and transportation networks are complex systems, and modelling often has to rely on incomplete and uncertain data.

This project will start by considering a concrete challenge in modelling commuter flows across the city of Bergen. In particular, it will consider the application of the mathematical framework of optimal transport [1] to recover statistical patterns in the usage of the main transportation lines across different periods.

[1] Peyré, Gabriel, and Marco Cuturi. "Computational optimal transport: With applications to data science." Foundations and Trends in Machine Learning 11.5-6 (2019): 355-607.
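A standard entry point to computational optimal transport [1] is the Sinkhorn algorithm for entropy-regularized transport. The sketch below (toy marginals and cost matrix, invented for illustration) alternately rescales rows and columns until the transport plan matches both marginals:

```python
import math

# Entropy-regularized optimal transport via Sinkhorn iterations.
# a, b: source/target distributions; cost: pairwise cost matrix.

def sinkhorn(a, b, cost, reg=0.1, iters=500):
    # Gibbs kernel: small regularization -> plan close to unregularized OT
    K = [[math.exp(-c / reg) for c in row] for row in cost]
    n, m = len(a), len(b)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # alternately enforce the row and column marginal constraints
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # transport plan P_ij = u_i K_ij v_j
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

a = [0.5, 0.5]                      # e.g., departures per origin
b = [0.5, 0.5]                      # e.g., arrivals per destination
cost = [[0.0, 1.0], [1.0, 0.0]]     # cheap to stay, costly to swap
P = sinkhorn(a, b, cost)
```

With this cost matrix the plan concentrates nearly all mass on the diagonal, i.e. the cheap assignments.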

Finalistic Models

The behavior of an agent may be explained either in causal terms (what has caused a certain behavior) or in finalistic terms (what aim justifies a certain behaviour). While causal reasoning is well captured by established mathematical formalisms (e.g., structural causal models), finalistic reasoning is still an object of research.

In this project we want to explore how a recently-proposed framework for finalistic reasoning [1] may be used to model intentions and counterfactuals in a causal bandit setting, or how it could be used to enhance inverse reinforcement learning.

[1] Compagno, Dario. "Final models: A finalistic interpretation of statistical correlation." arXiv preprint arXiv:2310.02272 (2023).

Automatic hyperparameter selection for isomap

Isomap is a non-linear dimensionality reduction method with two free hyperparameters (number of nearest neighbors and neighborhood radius). Different hyperparameters result in dramatically different embeddings. Previous methods for selecting hyperparameters focused on choosing one optimal hyperparameter. In this project, you will explore the use of persistent homology to find parameter ranges that result in stable embeddings. The project has theoretic and computational aspects.
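One elementary ingredient of such a hyperparameter scan (a much weaker criterion than persistence-based stability, shown only to illustrate the setup) is checking, for each number of neighbors k, whether the k-nearest-neighbor graph is even connected, since isomap's geodesic distances require connectivity:

```python
# Scan the number-of-neighbors hyperparameter k and count connected
# components of the k-NN graph with union-find. Disconnected graphs
# (k too small) and spurious cross-cluster edges (k too large) both
# change the embedding qualitatively.

def knn_components(points, k):
    n = len(points)

    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        nbrs = sorted((j for j in range(n) if j != i),
                      key=lambda j: d2(points[i], points[j]))[:k]
        for j in nbrs:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

# two well-separated clusters of three points each (toy data)
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
components = {k: knn_components(pts, k) for k in (1, 2, 3)}
```

On this toy example, k=1 and k=2 leave the two clusters disconnected, while k=3 forces a cross-cluster edge; a persistent-homology analysis would quantify over which ranges of k the resulting embedding is stable.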

Advisor: Nello Blaser

Topological Anscombe's quartet

This topic is based on the classical Anscombe's quartet and families of point sets with identical 1D persistence (https://arxiv.org/abs/2202.00577). The goal is to generate more interesting datasets using the simulated annealing methods presented in (http://library.usc.edu.ph/ACM/CHI%202017/1proc/p1290.pdf). This project is mostly computational.
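The core of the same-statistics idea can be sketched as a rejection-based random search (a simplification of the simulated annealing procedure in the linked paper; the point set, step size and tolerance here are invented): perturb points randomly and reject any move that lets the summary statistics drift beyond a tolerance.

```python
import random

# Perturb a 2-D point set while keeping means and variances (the
# "summary statistics") within a tolerance of the original values,
# so datasets with identical statistics but different shapes emerge.

def stats(pts):
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    vx = sum((p[0] - mx) ** 2 for p in pts) / n
    vy = sum((p[1] - my) ** 2 for p in pts) / n
    return mx, my, vx, vy

def perturb(pts, steps=2000, step_size=0.1, tol=0.05, seed=0):
    rng = random.Random(seed)
    target = stats(pts)
    pts = [list(p) for p in pts]
    for _ in range(steps):
        i = rng.randrange(len(pts))
        old = pts[i][:]
        pts[i][0] += rng.gauss(0, step_size)
        pts[i][1] += rng.gauss(0, step_size)
        if any(abs(a - b) > tol for a, b in zip(stats(pts), target)):
            pts[i] = old        # reject moves that break the statistics
    return [tuple(p) for p in pts]

start = [(float(i), float(i % 3)) for i in range(20)]
end = perturb(start)
```

The paper's full method additionally anneals toward a target shape; for this topic one would instead track a topological summary (the 1D persistence diagram) as the invariant.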

Persistent homology vectorization with cycle location

There are many methods for vectorizing persistence diagrams, such as persistence landscapes, persistence images, PersLay and statistical summaries. Recently we have designed algorithms that can, in some cases, efficiently detect the location of persistence cycles. In this project, you will vectorize not just the persistence diagram, but also additional information such as the location of these cycles. This project is mostly computational with some theoretic aspects.
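For reference, the simplest of the mentioned vectorizations, the persistence landscape, can be sampled on a grid in a few lines (toy diagram invented for illustration); the project would augment such a vector with cycle-location information:

```python
# Persistence landscape: each (birth, death) pair contributes a "tent"
# function; the k-th landscape at t is the (k+1)-th largest tent value.
# Sampling the landscapes on a grid yields a fixed-length vector.

def tent(b, d, t):
    return max(0.0, min(t - b, d - t))

def landscape(pairs, k, t):
    vals = sorted((tent(b, d, t) for b, d in pairs), reverse=True)
    return vals[k] if k < len(vals) else 0.0

def vectorize(pairs, ks=(0, 1), grid=None):
    if grid is None:
        grid = [i / 10 for i in range(11)]   # sample t in [0, 1]
    return [landscape(pairs, k, t) for k in ks for t in grid]

pairs = [(0.0, 1.0), (0.2, 0.8)]   # a toy persistence diagram
vec = vectorize(pairs)
```

Unlike raw diagrams, such vectors have fixed length, so they can be fed directly to standard classifiers; the same grid idea extends to cycle locations.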

Divisive covers

Divisive covers are a divisive technique for generating filtered simplicial complexes. They originally used a naive way of dividing data into a cover. In this project, you will explore different methods of dividing space, based on principal component analysis, support vector machines and k-means clustering. In addition, you will explore methods of using divisive covers for classification. This project will be mostly computational.

Learning Acquisition Functions for Cost-aware Bayesian Optimization

This is a follow-up project of an earlier Master thesis that developed a novel method for learning Acquisition Functions in Bayesian Optimization through the use of Reinforcement Learning. The goal of this project is to further generalize this method (more general input, learned cost-functions) and apply it to hyperparameter optimization for neural networks.

Advisors: Nello Blaser , Audun Ljone Henriksen

Stable updates

This is a follow-up project of an earlier Master thesis that introduced and studied empirical stability in the context of tree-based models. The goal of this project is to develop stable update methods for deep learning models. You will design several stable methods and empirically compare them (in terms of loss and stability) with a baseline and with one another.

Advisors:  Morten Blørstad , Nello Blaser

Multimodality in Bayesian neural network ensembles

One method to assess uncertainty in neural network predictions is to use dropout or noise generators at prediction time and run every prediction many times. This leads to a distribution of predictions. Informatively summarizing such probability distributions is a non-trivial task and the commonly used means and standard deviations result in the loss of crucial information, especially in the case of multimodal distributions with distinct likely outcomes. In this project, you will analyze such multimodal distributions with mixture models and develop ways to exploit such multimodality to improve training. This project can have theoretical, computational and applied aspects.
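To make the multimodality point concrete, the sketch below summarizes a set of repeated stochastic predictions with a two-component Gaussian mixture (assuming scikit-learn's GaussianMixture; the prediction samples are synthetic stand-ins for, e.g., dropout forward passes). The mixture recovers two distinct likely outcomes that a plain mean would average away.

```python
# Summarize repeated (e.g. dropout) predictions for one input with a
# mixture model instead of a single mean/std.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# pretend these came from 500 stochastic forward passes of one input
preds = np.concatenate([rng.normal(-2.0, 0.3, 250),
                        rng.normal(3.0, 0.3, 250)])[:, None]

gm = GaussianMixture(n_components=2, random_state=0).fit(preds)
means = sorted(gm.means_.ravel())
print(f"overall mean {preds.mean():.2f} vs modes {means[0]:.2f}, {means[1]:.2f}")
```

The number of components would in practice be selected per prediction, e.g. by BIC, rather than fixed at two.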

Wet area segmentation for rivers

NORCE LFI is working on digitizing wetted areas in rivers. You will apply different machine learning techniques for distinguishing water bodies (rivers) from land based on aerial (RGB) drone pictures. This is important for water management and for assessing effects of hydropower on river ecosystems (residual flow, stranding of fish, and spawning areas). We have a database of approximately 100 rivers (aerial pictures created from ca. 120,000 single pictures in total with Structure from Motion; single pictures available as well), and several of these rivers are flown at 2-4 different discharges, in different seasons, and under different weather patterns. For ca. 50 % of the pictures the wetted area is digitized for training (GIS shapefile); most single pictures (>90 %) cover both water surface and land. Possible challenges include shading, reflectance from the water surface, different water/ground colours, and wet surfaces on land. This is an applied topic, where you will try many different machine learning techniques to find the best solution for the mapping tasks by NORCE LFI.

Advisors: Nello Blaser , Sebastian Franz Stranzl

Learning a hierarchical metric

Often, labels have defined relationships to each other, for instance in a hierarchical taxonomy: e.g., ImageNet labels are derived from the WordNet graph, and biological species are taxonomically related and can have further similarities depending on life stage, sex, or other properties.

ArcFace is an alternative loss function that aims for an embedding that is more generally useful than one trained with a plain softmax classifier. It is commonly used in metric learning and few-shot learning.

Here, we will develop a metric learning method that learns from data with hierarchical labels. Using multiple ArcFace heads, we will simultaneously learn to place representations to optimize the leaf label as well as intermediate labels on the path from leaf to root of the label tree. Using taxonomically classified plankton image data, we will measure performance as a function of ArcFace parameters (sharpness/temperature and margins -- class-wise or level-wise), and compare the results to existing methods.
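As a concrete starting point, the ArcFace margin mechanism can be sketched in NumPy: the target-class logit is s·cos(θ + m) instead of s·cos(θ). The shapes, scale s, and margin m below are illustrative placeholders, not the project's chosen values; a hierarchical variant could attach one such head per level of the label tree, with level-wise margins.

```python
# ArcFace-style logits: cosine similarity to class weight vectors,
# with an additive angular margin m on the target class.
import numpy as np

def arcface_logits(emb, W, labels, s=16.0, m=0.2):
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    W = W / np.linalg.norm(W, axis=0, keepdims=True)
    cos = np.clip(emb @ W, -1.0, 1.0)       # cosine to each class center
    theta = np.arccos(cos)
    target = theta[np.arange(len(labels)), labels] + m  # add angular margin
    logits = s * cos
    logits[np.arange(len(labels)), labels] = s * np.cos(target)
    return logits

rng = np.random.default_rng(0)
logits = arcface_logits(rng.normal(size=(4, 8)), rng.normal(size=(8, 5)),
                        labels=np.array([0, 1, 2, 3]))
print(logits.shape)  # (4, 5)
```

In training these logits would feed a standard cross-entropy loss; the margin forces same-class embeddings to cluster more tightly than softmax alone.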

Advisor: Ketil Malde ( [email protected] )

Self-supervised object detection in video

One challenge with learning object detection is that in many scenes that stretch off into the distance, annotating small, far-off, or blurred objects is difficult. It is therefore desirable to learn from incompletely annotated scenes, and one-shot object detectors may suffer from incompletely annotated training data.

To address this, we will use a region-proposal algorithm (e.g. Selective Search) to extract potential crops from each frame. Classification will be based on two approaches: a) training on annotated fish vs. random similarly-sized crops without annotations, and b) using a self-supervised method to build a representation for crops and building a classifier for the extracted regions. The method will be evaluated against one-shot detectors and other training regimes.

If successful, the method will be applied to fish detection and tracking in videos from baited and unbaited underwater traps, and used to estimate abundance of various fish species.

See also: Bertinetto et al. (2016): https://link.springer.com/chapter/10.1007/978-3-319-48881-3_56

Representation learning for object detection

While traditional classifiers work well with data that is labeled with disjoint classes and reasonably balanced class abundances, reality is often less clean. An alternative is to learn a vector space embedding that reflects semantic relationships between objects, and to derive classes from this representation. This is especially useful for few-shot classification (i.e., very few examples in the training data).

The task here is to extend a modern object detector (e.g. YOLOv8) to output an embedding of the identified object. Instead of a softmax classifier, we can learn the embedding in a supervised manner (using annotations on frames) by attaching an ArcFace or other supervised metric-learning head. Alternatively, the representation can be learned from tracked detections over time, using e.g. a contrastive loss function to keep the representation of an object (approximately) constant over time. The performance of the resulting object detector will be measured on underwater videos, targeting species detection and/or individual recognition (re-ID).

Time-domain object detection

Object detectors for video are normally trained on still frames, but it is evident (from human experience) that using time domain information is more effective. I.e., it can be hard to identify far-off or occluded objects in still images, but movement in time often reveals them.

Here we will extend a state-of-the-art object detector (e.g. YOLOv8) with time-domain data. Instead of using a single frame as input, the model will be modified to take a set of frames surrounding the annotated frame as input. Performance will be compared to single-frame detection.

Large-scale visualization of acoustic data

The Institute of Marine Research has decades of acoustic data collected in various surveys. These data are in the process of being converted to data formats that can be processed and analyzed more easily using packages like Xarray and Dask.

The objective is to make these data more accessible to regular users by providing a visual front end. The user should be able to quickly zoom in and out, perform selection, export subsets, apply various filters and classifiers, and overlay annotations and other relevant auxiliary data.

Learning acoustic target classification from simulation

Broadband echosounders emit a complex signal that spans a large frequency band. Different targets will reflect, absorb, and generate resonance at different amplitudes and frequencies, and it is therefore possible to classify targets at much higher resolution and accuracy than before. Due to the complexity of the received signals, deriving effective profiles that can be used to identify targets is difficult.

Here we will use simulated frequency spectra from geometric objects with various shapes, orientations, and other properties. We will train ML models to estimate (recover) the geometric and material properties of objects based on these spectra. The resulting model will be applied to real broadband data and compared to traditional classification methods.

Online learning in real-time systems

Build a model for the drilling process by using the Virtual simulator OpenLab ( https://openlab.app/ ) for real-time data generation and online learning techniques. The student will also do a short survey of existing online learning techniques and learn how to cope with errors and delays in the data.

Advisor: Rodica Mihai

Building a finite state automaton for the drilling process by using queries and counterexamples

Datasets will be generated using the virtual simulator OpenLab ( https://openlab.app/ ). The student will study the datasets and decide upon a good setting for extracting a finite state automaton for the drilling process. The student will also do a short survey of existing techniques for extracting finite state automata from process data. A relevant starting point is an algorithm that uses exact learning and abstraction to extract a deterministic finite automaton describing the state dynamics of a trained RNN, using Angluin's L* algorithm as the learner and the trained RNN as the oracle; this technique extracts accurate automata even when the state vectors are large and require fine differentiation.

Machine Learning for Drug Repositioning in Parkinson’s Disease

Background : Parkinson’s Disease (PD) is a major neurological condition with a complex etiology that tends to affect the elderly population. Understanding the risk factors associated with PD, including drug usage patterns across different demographics, can provide insights into its management and prevention. The Norwegian Prescribed Drug Registry (NorPD) provides comprehensive data on prescriptions dispensed from 2004, making it an excellent resource for such an analysis.

Objective : This project seeks to investigate how well machine learning techniques can predict PD risk, using the individual histories of drug usage along with demographic variables like gender and age.

Methodology :

  • Exploratory Data Analysis and Data Preprocessing: Although the dataset is clean and structured, specific preprocessing steps will be required to tailor the data for the chosen methods.
  • Predictive Modeling: Apply standard machine learning models, such as Random Forests, that can handle large, imbalanced, sparse datasets, and find the best single model or ensemble of models for robust prediction. The predictive model will be employed to discern patterns in drug usage and demographic factors that correlate with PD risk.
  • Feature Analysis: Conduct a detailed analysis to understand the importance of different features, such as specific drugs, gender, and age, in predicting PD risk and explore complex dependencies between features.
  • Evaluation Metrics: Explore different metrics, such as F1-score and AUC-ROC, to evaluate the performance of the predictive models.
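The modeling and evaluation steps above can be sketched as follows, assuming scikit-learn; the data is a synthetic imbalanced stand-in, not NorPD.

```python
# Random Forest on an imbalanced binary problem, evaluated with
# F1 and AUC-ROC (the metrics named in the project description).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
# ~5% positive rate, loosely dependent on two features (imbalanced)
y = (X[:, 0] + X[:, 1] + rng.normal(0, 1, 2000) > 2.8).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(class_weight="balanced",
                             random_state=0).fit(Xtr, ytr)
proba = clf.predict_proba(Xte)[:, 1]
print(f"F1  = {f1_score(yte, clf.predict(Xte)):.2f}")
print(f"AUC = {roc_auc_score(yte, proba):.2f}")
```

With strong class imbalance, `class_weight="balanced"` and stratified splitting are simple first defenses; threshold tuning on the predicted probabilities is usually the next step.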

Expected Outcomes : The project aims to study and develop predictive models that can accurately identify individuals at increased risk of developing PD based on their prescription history and demographic data.

Ethical Considerations : Data privacy and confidentiality will be strictly maintained by conducting all analyses on the SAFE server, following ethical guidelines for handling sensitive health data. Approval from the regional ethics committee (REK) is already in place, as the project will be part of DRONE ( https://www.uib.no/en/epistat/139849/drone-drug-repurposing-neurological-diseases ).

Project Benefits :

  • The student practices working with a large and rich set of real data, together with experts from the epidemiology group at the Faculty of Medicine.
  • Utilizing different ML methods on real data.
  • The possibility of publication if the results are promising.

Advisors :  Asieh Abolpour Mofrad , Samaneh Abolpour Mofrad , Julia Romanowska , Jannicke Igland

Exploring Graph Neural Networks for Analyzing Prescription Data to Predict Parkinson’s Disease Risk

Background : Parkinson’s Disease (PD) significantly impacts the elderly, necessitating advanced computational approaches to better predict and understand its risk factors. The Norwegian Prescribed Drug Registry (NorPD), which provides comprehensive data on prescriptions dispensed since 2004, presents an excellent opportunity to employ graph neural networks (GNNs), especially for analyzing the temporal dynamics of prescription data.

Objective : The project aims to investigate the effectiveness of GNNs in analyzing time-dependent prescription data, focusing on various graph structures to understand how drug interactions and patient demographics influence PD risk over time.

  • Exploratory Data Analysis and Data Preprocessing: Prepare the prescription data for GNN analysis by investigating different structures for representing the data as a graph. This is a challenging step: we must investigate which graph structure is best suited to existing GNN and temporal GNN methods. For instance, one might assign a graph to each individual and consider classification approaches, or define a single graph over all participants and investigate GNN methods for clustering or for predicting nodes and edges.

Incorporate demographic features, such as age, gender, and education, into the graph. Additionally, explore how to integrate time-dependent features to reflect the dynamic nature of the prescription data effectively.

  • Graph Neural Network Implementation: Apply graph neural network models such as Graph Convolutional Networks (GCNs) or Graph Attention Networks (GATs) that can process temporal graph data, based on the structure of our defined graph.
  • Feature Analysis: Perform an in-depth analysis of the learned embeddings and node features to identify significant patterns and influential factors related to increased or decreased PD risk.
  • Evaluation Metrics: Explore different metrics to evaluate the performance of the predictive models.
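A single graph-convolution layer, the building block of the GCNs mentioned above, can be sketched in plain NumPy. The toy graph, features, and weights are illustrative; real experiments would use a GNN library such as PyTorch Geometric.

```python
# One GCN layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W),
# i.e. normalized neighborhood averaging followed by a linear map.
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(len(A))             # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d)) # symmetric degree normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)     # 3-node path graph
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                # node features (e.g. drug-usage summaries)
W = rng.normal(size=(4, 2))                # learnable weights
out = gcn_layer(A, H, W)
print(out.shape)  # (3, 2)
```

Stacking such layers lets each node's representation aggregate information from progressively larger graph neighborhoods; temporal GNNs additionally condition on time-stamped edges.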

Expected Outcomes :

The project aims to study how graph neural networks (GNNs) can be utilized to analyze complex, time-dependent prescription data.

Ethical Considerations : All analyses will adhere to strict privacy protocols by conducting research on the SAFE server, ensuring that all individual data remains confidential and secure in compliance with ethical healthcare data management practices. Approval from the regional ethics committee (REK) is already in place, as the project will be part of DRONE ( https://www.uib.no/en/epistat/139849/drone-drug-repurposing-neurological-diseases ).

Project Benefits :

  • Get familiar with GNNs as advanced ML methods and apply them to real data.

Advisors :  Samaneh Abolpour Mofrad , Asieh Abolpour Mofrad , Julia Romanowska , Jannicke Igland

Scaling Laws for Language Models in Generative AI

Large Language Models (LLMs) power today's most prominent language technologies in Generative AI, like ChatGPT, which, in turn, are changing the way that people access information and solve tasks of many kinds.

Recent interest in scaling laws for LLMs has revealed trends in how well they perform as a function of factors such as how much training data is used, how large the models are, or how much computational budget is allocated. (See, for example, Kaplan et al., "Scaling Laws for Neural Language Models", 2020.)

In this project, the task is to study scaling laws for different language models with respect to one or more of these modeling factors.
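As a minimal illustration of the kind of analysis involved, a power-law scaling curve can be fitted by linear regression in log-log space. The data points below are synthetic; a real study would fit curves of this form to measured losses, in the spirit of Kaplan et al. (2020).

```python
# Fit L(N) = a * N^-b to (model size, loss) pairs via log-log regression.
import numpy as np

N = np.array([1e6, 1e7, 1e8, 1e9])   # e.g. parameter counts
L = 2.0 * N ** -0.07                 # synthetic losses with a=2.0, b=0.07

slope, log_a = np.polyfit(np.log(N), np.log(L), 1)
a, b = np.exp(log_a), -slope
print(f"fitted a = {a:.2f}, b = {b:.3f}")  # recovers a = 2.00, b = 0.070
```

The same fit applied separately along data, parameters, and compute gives the per-factor exponents that scaling-law studies compare.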

Advisor: Dario Garigliotti

Applications of causal inference methods to omics data

Many hard problems in machine learning are directly linked to causality [1]. The graphical causal inference framework developed by Judea Pearl can be traced back to pioneering work by Sewall Wright on path analysis in genetics and has inspired research in artificial intelligence (AI) [1].

The Michoel group has developed the open-source tool Findr [2] which provides efficient implementations of mediation and instrumental variable methods for applications to large sets of omics data (genomics, transcriptomics, etc.). Findr works well on a recent data set for yeast [3].

We encourage students to explore promising connections between the fields of causal inference and machine learning. Feel free to contact us to discuss projects related to causal inference. Possible topics include: a) improving methods based on structural causal models, b) evaluating causal inference methods on data for model organisms, c) comparing methods based on causal models and neural network approaches.
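To illustrate the instrumental-variable idea that underlies tools like Findr, the toy NumPy sketch below (not Findr's method or API) shows how an instrument recovers a causal effect that naive regression gets wrong under hidden confounding.

```python
# Toy instrumental-variable (Wald/2SLS) estimate: a genotype-like
# instrument Z affects expression X, which affects trait Y; a hidden
# confounder U biases the naive regression of Y on X, while the IV
# estimate recovers the true causal effect (1.5 here).
import numpy as np

rng = np.random.default_rng(0)
n = 20000
Z = rng.integers(0, 2, n).astype(float)  # instrument (e.g. genotype)
U = rng.normal(size=n)                   # hidden confounder
X = Z + U + rng.normal(size=n)           # exposure (e.g. expression)
Y = 1.5 * X + 2.0 * U + rng.normal(size=n)

naive = np.cov(X, Y)[0, 1] / np.var(X)           # biased by U
iv = np.cov(Z, Y)[0, 1] / np.cov(Z, X)[0, 1]     # instrumental-variable estimate
print(f"naive {naive:.2f} vs IV {iv:.2f} (true 1.5)")
```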

References:

1. Schölkopf B, Causality for Machine Learning, arXiv (2019):  https://arxiv.org/abs/1911.10500

2. Wang L and Michoel T. Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data. PLoS Computational Biology 13:e1005703 (2017).  https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005703

3. Ludl A and Michoel T. Comparison between instrumental variable and mediation-based methods for reconstructing causal gene networks in yeast. arXiv:2010.07417  https://arxiv.org/abs/2010.07417

Advisors: Adriaan Ludl ,  Tom Michoel

Space-Time Linkage of Fish Distribution to Environmental Conditions

Conditions in the marine environment, such as temperature and currents, influence the spatial distribution and migration patterns of marine species. Hence, understanding the link between environmental factors and fish behavior is crucial for predicting, e.g., how fish populations may respond to climate change. Deriving this link is challenging because it requires joint analysis of two types of datasets: (i) large environmental datasets (currents, temperature) that vary in space and time, and (ii) sparse and sporadic spatial observations of fish populations.

Project goal   

The primary goal of the project is to develop a methodology that helps predict how the spatial distribution of two fish stocks (capelin and mackerel) changes in response to variability in the physical marine environment (ocean currents and temperature). The information can also be used to optimize data collection by minimizing time spent in spatial sampling of the populations.

The project will focus on the use of machine learning and/or causal inference algorithms.  As a first step, we use synthetic (fish and environmental) data from analytic models that couple the two data sources.  Because the ‘truth’ is known, we can judge the efficiency and error margins of the methodologies. We then apply the methodologies to real world (empirical) observations.

Advisors:  Tom Michoel , Sam Subbey . 

Towards precision medicine for cancer patient stratification

On average, a drug or a treatment is effective in only about half of patients who take it. This means patients need to try several until they find one that is effective at the cost of side effects associated with every treatment. The ultimate goal of precision medicine is to provide a treatment best suited for every individual. Sequencing technologies have now made genomics data available in abundance to be used towards this goal.

In this project we will specifically focus on cancer. Most cancer patients get a particular treatment based on the cancer type and stage, though different individuals react differently to a treatment. It is now well established that genetic mutations cause cancer growth and spreading, and importantly, these mutations differ between individual patients. The aim of this project is to use genomic data to enable better stratification of cancer patients and to predict the treatment most likely to work. Specifically, the project will use machine learning approaches to integrate genomic data and build a classifier for stratification of cancer patients.

Advisor: Anagha Joshi

Unraveling gene regulation from single cell data

Multi-cellularity is achieved by precise control of gene expression during development and differentiation, and aberrations of this process lead to disease. A key regulatory process in gene regulation is at the transcriptional level, where epigenetic and transcriptional regulators control the spatial and temporal expression of target genes in response to environmental, developmental, and physiological cues obtained from a signalling cascade. Rapid advances in sequencing technology have now made it feasible to study this process by characterizing genome-wide patterns of diverse epigenetic and transcription factors, including at the single-cell level.

Single-cell RNA sequencing is highly important, particularly in cancer, as it allows exploration of heterogeneous tumor samples; tumor heterogeneity obstructs therapeutic targeting and leads to poor survival. Despite huge clinical relevance and potential, analysis of single-cell RNA-seq data is challenging. In this project, we will develop strategies to infer gene regulatory networks using network inference approaches (both supervised and unsupervised), tested primarily on single-cell datasets in the context of cancer.

Developing a Stress Granule Classifier

To carry out the multitude of functions 'expected' from a human cell, the cell employs a strategy of division of labour, whereby sub-cellular organelles carry out distinct functions. Thus we traditionally understand organelles as distinct units, defined both functionally and physically, with a distinct shape and size range. More recently, a new class of organelles has been discovered that are assembled and dissolved on demand and are composed of liquid droplets or 'granules'. Granules show many properties characteristic of liquids, such as flow and wetting, but they can also assume many shapes and indeed fluctuate in shape. One such liquid organelle is a stress granule (SG).

Stress granules are pro-survival organelles that assemble in response to cellular stress and important in cancer and neurodegenerative diseases like Alzheimer's. They are liquid or gel-like and can assume varying sizes and shapes depending on their cellular composition. 

In a given experiment we are able to image the entire cell over a time series of 1000 frames; from which we extract a rough estimation of the size and shape of each granule. Our current method is susceptible to noise and a granule may be falsely rejected if the boundary is drawn poorly in a small majority of frames. Ideally, we would also like to identify potentially interesting features, such as voids, in the accepted granules.

We are interested in applying a machine learning approach to develop a descriptor for a 'classic' granule and furthermore classify them into different functional groups based on disease status of the cell. This method would be applied across thousands of granules imaged from control and disease cells. We are a multi-disciplinary group consisting of biologists, computational scientists and physicists. 

Advisors: Sushma Grellscheid , Carl Jones

Machine Learning based Hyperheuristic algorithm

Develop a Machine Learning based hyper-heuristic algorithm to solve a pickup and delivery problem. A hyper-heuristic is a heuristic that chooses heuristics automatically. Hyper-heuristics seek to automate the process of selecting, combining, generating or adapting several simpler heuristics to efficiently solve computational search problems [Handbook of Metaheuristics]. There may be multiple heuristics for solving a problem, each with its own strengths and weaknesses. In this project, we want to use machine-learning techniques to learn the strengths and weaknesses of each heuristic while using them in an iterative search for high-quality solutions, and then apply them intelligently for the rest of the search. As new information is gathered during the search, the hyper-heuristic algorithm automatically adjusts the heuristics.
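The learning component can be sketched as a simple bandit-style selection loop: each low-level heuristic is an arm, selected epsilon-greedily based on its observed average improvement. The "heuristics" below are stand-in functions perturbing a single number, not actual pickup-and-delivery operators.

```python
# Epsilon-greedy heuristic selection for a hyper-heuristic search loop.
import random

random.seed(0)
heuristics = {
    "small_step": lambda x: x - random.uniform(0, 1),
    "big_step":   lambda x: x - random.uniform(0, 5),
    "noise":      lambda x: x + random.uniform(-1, 1),
}
score = {h: 0.0 for h in heuristics}  # running average improvement per arm
count = {h: 0 for h in heuristics}

cost = 100.0
for it in range(200):
    if random.random() < 0.2:                 # explore a random heuristic
        name = random.choice(list(heuristics))
    else:                                     # exploit the best so far
        name = max(score, key=score.get)
    new_cost = heuristics[name](cost)
    reward = cost - new_cost                  # improvement (may be negative)
    count[name] += 1
    score[name] += (reward - score[name]) / count[name]
    if new_cost < cost:                       # accept only improving moves
        cost = new_cost

print("best heuristic:", max(score, key=score.get))
```

A real hyper-heuristic would replace the bandit with a richer learned model (e.g. conditioning on search state) and the scalar cost with the routing objective.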

Advisor: Ahmad Hemmati

Machine learning for solving satisfiability problems and applications in cryptanalysis

Advisor: Igor Semaev

Hybrid modeling approaches for well drilling with Sintef

Several topics are available.

"Flow models" are first-principles models simulating the flow, temperature and pressure in a well being drilled. Our project is exploring "hybrid approaches" where these models are combined with machine learning models that either learn from time series data from flow model runs or from real-world measurements during drilling. The goal is to better detect drilling problems such as hole cleaning, make more accurate predictions and correctly learn from and interpret real-word data.

The "surrogate model" refers to  a ML model which learns to mimic the flow model by learning from the model inputs and outputs. Use cases for surrogate models include model predictions where speed is favoured over accuracy and exploration of parameter space.

Surrogate models with active learning

While it is possible to produce a nearly unlimited amount of training data by running the flow model, the surrogate model may still perform poorly if it lacks training data in the part of the parameter space it operates in or if it "forgets" areas of the parameter space by being fed too much data from a narrow range of parameters.

The goal of this thesis is to build a surrogate model (with any architecture) for some restricted parameter range and implement an active learning approach where the ML model requests more runs from the flow model in the parts of the parameter space where they are needed most. The end result should be a surrogate model that is quick and performs acceptably well over the whole defined parameter range.
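The intended loop might be sketched as follows; the flow model is stood in for by a cheap analytic function, and the "ensemble surrogate" is a bootstrap of polynomial fits, both purely illustrative stand-ins.

```python
# Active learning loop: request new "flow model" runs where an
# ensemble surrogate disagrees the most (highest predictive variance).
import numpy as np

rng = np.random.default_rng(0)
flow_model = lambda x: np.sin(3 * x) + 0.1 * x     # stand-in simulator

X_train = rng.uniform(0, 2, 5)                     # few initial runs
y_train = flow_model(X_train)
candidates = np.linspace(0, 10, 200)               # wider parameter range

for step in range(10):
    # "ensemble" of simple surrogates via bootstrap resampling
    preds = []
    for _ in range(10):
        idx = rng.integers(0, len(X_train), len(X_train))
        coef = np.polyfit(X_train[idx], y_train[idx], deg=2)
        preds.append(np.polyval(coef, candidates))
    var = np.var(preds, axis=0)
    x_new = candidates[np.argmax(var)]             # most uncertain candidate
    X_train = np.append(X_train, x_new)
    y_train = np.append(y_train, flow_model(x_new))

print(f"training set grew from 5 to {len(X_train)} points")
```

In the thesis, the polynomial ensemble would be replaced by the chosen surrogate architecture and the analytic function by actual flow model runs.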

Surrogate models trained via adversarial learning

How best to train surrogate models from runs of the flow model is an open question. This master thesis would use an adversarial learning approach to build a surrogate model whose output becomes indistinguishable, to its "adversary", from the output of an actual flow model run.

GPU-based Surrogate models for parameter search

While CPU clock speeds on single cores largely stalled 20 years ago, multi-core CPUs and especially GPUs took off, delivering increases in computational power by parallelizing computations.

Modern machine learning, such as deep learning, takes advantage of this boom in computing power by running on GPUs.

The SINTEF flow models, in contrast, are software programs that run on a CPU and do not utilize multi-core functionality. A model run advances time-step by time-step, and each time step relies on the results of the previous one. The flow models are therefore fundamentally sequential and not well suited to massive parallelization.

It is however of interest to run different model runs in parallel, to explore parameter spaces. Use cases for this include model calibration, problem detection, and hypothesis generation and testing.

The task of this thesis is to implement an ML-based surrogate model in such a way that many surrogate model outputs can be produced at the same time using a single GPU. This will likely entail some trade-off with model size and perhaps some coding tricks.

Uncertainty estimates of hybrid predictions

When using predictions from an ML model trained on time series data, it is useful to know whether they are accurate and should be trusted. The student is challenged to develop hybrid approaches that incorporate estimates of uncertainty. Components could include reporting the variance of ML ensembles trained on a diversity of time series data, implementing conformal prediction, analyzing training-data parameter ranges versus the current input, etc. The output should be a "traffic light" signal roughly indicating the accuracy of the predictions.
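One of the listed components, conformal prediction, can be sketched in a few lines: split conformal uses held-out calibration residuals to form an interval with guaranteed marginal coverage. The predictor and data below are stand-ins.

```python
# Split conformal prediction: calibrate an interval width from
# held-out residuals of a trained (here stand-in) predictor.
import numpy as np

rng = np.random.default_rng(0)
x_cal = rng.uniform(0, 10, 500)
y_cal = 2 * x_cal + rng.normal(0, 1, 500)
model = lambda x: 2 * x                  # stand-in trained predictor

scores = np.abs(y_cal - model(x_cal))    # calibration residuals
n = len(scores)
q = np.quantile(scores, np.ceil(0.9 * (n + 1)) / n)  # conformal quantile

x_new = 4.2
lo, hi = model(x_new) - q, model(x_new) + q
print(f"~90% prediction interval at x={x_new}: [{lo:.2f}, {hi:.2f}]")
```

The interval width q (or its ratio to typical residuals) is one natural input to the proposed "traffic light" signal.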

Transfer learning approaches

We assume an ML model is to be used for time series prediction.

It is possible to train an ML model on a wide range of scenarios in the flow models, but we expect that to perform well, the model also needs to see model runs representative of the type of well and drilling operation it will be used in. In this thesis the student implements a transfer learning approach, where the model is first trained on general model runs and then fine-tuned on a representative data set.

(Bonus1: implementing one-shot learning, Bonus2: Using real-world data in the fine-tuning stage)

ML capable of reframing situations

When a human oversees an operation like well drilling, she has a mental model of the situation and new data such as pressure readings from the well is interpreted in light of this model. This is referred to as "framing" and is the normal mode of work. However, when a problem occurs, it becomes harder to reconcile the data with the mental model. The human then goes into "reframing", building a new mental model that includes the ongoing problem. This can be seen as a process of hypothesis generation and testing.

A computer model, however, lacks reframing. A flow model will keep making predictions under the assumption of no problems, and a separate alarm system will use the deviation between the model predictions and reality to raise an alarm. This is in a sense how all alarm systems work, but it means that the human must discard the computer model as a tool at the same time as she is handling a crisis.

The student is given access to a flow model and a surrogate model that can learn from model runs both with and without hole-cleaning problems, and is challenged to develop a hybrid approach where the ML + flow model continuously performs hypothesis generation and testing, and is able to "switch" into predicting a hole-cleaning problem and different remediations of it.

Advisor: Philippe Nivlet at Sintef together with advisor from UiB

Explainable AI at Equinor

The project Machine Teaching for XAI (see  https://xai.w.uib.no ) offers a master thesis in collaboration between UiB and Equinor.

Advisor: One of Pekka Parviainen/Jan Arne Telle/Emmanuel Arrighi + Bjarte Johansen from Equinor.

Explainable AI at Eviny

The project Machine Teaching for XAI (see  https://xai.w.uib.no ) offers a master thesis in collaboration between UiB and Eviny.

Advisor: One of Pekka Parviainen/Jan Arne Telle/Emmanuel Arrighi + Kristian Flikka from Eviny.

If you want to suggest your own topic, please contact Pekka Parviainen ,  Fabio Massimo Zennaro or Nello Blaser .



The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!

A comprehensive guide for crafting an original and innovative thesis in the field of AI.

By Aarafat Islam on 2023-01-11

“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng

This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an  introduction , which presents a brief overview of the topic and the  research objectives . The ideas provided are related to different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and provide a clear understanding of the potential contributions and outcomes of the proposed research. The article also emphasizes the importance of originality and the need for proper citation in order to avoid plagiarism.

1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging:  A deep learning approach to improve the accuracy of medical diagnoses.

Introduction:  Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.

2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.

Introduction:  Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.

3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.

Introduction:  Robotic navigation and control are challenging tasks, which require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop and evaluate a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.

4. Investigating the use of deep learning for drug discovery and development.

Introduction:  Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.

5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.

Introduction:  Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.


6. Use of deep transfer learning in speech recognition and synthesis.

Introduction:  Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.

7. The use of deep learning for financial prediction.

Introduction:  Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.

8. Investigating the use of deep learning for computer vision in agriculture.

Introduction:  Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.

9. Development and evaluation of deep learning models for generative design in engineering and architecture.

Introduction:  Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.

10. Investigating the use of deep learning for natural language understanding.

Introduction:  Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.


11. Comparing deep learning and traditional machine learning methods for image compression.

Introduction:  Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.
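As one concrete non-deep baseline such a comparison might include, truncated SVD gives the optimal low-rank approximation of an image matrix, and the kept rank controls the rate-distortion trade-off. A minimal sketch (the synthetic "image" and rank choices are illustrative):

```python
import numpy as np

def svd_compress(img, rank):
    """Keep only the top `rank` singular components of the image matrix:
    the classical optimal low-rank approximation (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# A synthetic grayscale "image": smooth low-rank structure plus mild noise.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 64)
img = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x))
img += 0.01 * rng.standard_normal((64, 64))

# Reconstruction error shrinks monotonically as the kept rank grows.
err_low = np.linalg.norm(img - svd_compress(img, 2))
err_high = np.linalg.norm(img - svd_compress(img, 32))
```

Learned codecs are evaluated on exactly this trade-off, measuring distortion at a given bit budget against classical transforms.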

12. Using deep learning for sentiment analysis in social media.

Introduction:  Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.
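For context on what deep models are measured against, a deliberately naive traditional baseline is lexicon counting: score a text by how many positive versus negative words it contains. The sketch below is illustrative only (the tiny word lists are made up for the example):

```python
# Tiny illustrative lexicons; a real study would use a curated resource.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def lexicon_sentiment(text):
    """Label = sign of (#positive words - #negative words)."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

label = lexicon_sentiment("I love this great product")
```

Such a baseline fails on negation and sarcasm, which is precisely where neural models trained on large social-media corpora are expected to show their advantage.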

13. Investigating the use of deep learning for image generation.

Introduction:  Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.

14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.

Introduction:  Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.

15. Investigating the use of deep learning for natural language summarization.

Introduction:  Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.


16. Development and evaluation of deep learning models for facial expression recognition.

Introduction:  Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.

17. Investigating the use of deep learning for generative models in music and audio.

Introduction:  Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.

18. Study the comparison of deep learning models with traditional algorithms for anomaly detection in network traffic.

Introduction:  Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.

19. Investigating the use of deep learning for improving recommender systems.

Introduction:  Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.
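A classical collaborative-filtering baseline that deep approaches are typically compared with is matrix factorization: learn low-dimensional user and item factors whose inner products approximate the observed ratings. A minimal SGD sketch (the hyperparameters and the toy rating matrix are illustrative):

```python
import numpy as np

def factorize(R, mask, k=2, epochs=2000, lr=0.01, reg=0.01, seed=0):
    """Fit user/item factor matrices to the observed entries of R by SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    observed = list(zip(*np.nonzero(mask)))
    for _ in range(epochs):
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V

# Toy ratings: rows are users, columns are items; 0 marks "unobserved".
R = np.array([[5.0, 4.0, 0.0],
              [4.0, 5.0, 1.0],
              [1.0, 0.0, 5.0]])
mask = np.array([[1, 1, 0],
                 [1, 1, 1],
                 [1, 0, 1]], dtype=bool)
U, V = factorize(R, mask)
pred = U @ V.T  # predictions for every user-item pair, observed or not
```

Deep recommenders replace the inner product with a learned nonlinear scoring function, but the held-out-entry evaluation protocol stays the same.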

20. Development and evaluation of deep learning models for multi-modal data analysis.

Introduction:  Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.

I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!



Research Topics & Ideas

Artificial Intelligence (AI) and Machine Learning (ML)

Research topics and ideas about AI and machine learning

If you’re just starting out exploring AI-related research topics for your dissertation, thesis or research project, you’ve come to the right place. In this post, we’ll help kickstart your research topic ideation process by providing a hearty list of research topics and ideas, including examples from past studies.

PS – This is just the start…

We know it’s exciting to run through a list of research topics, but please keep in mind that this list is just a starting point. To develop a suitable research topic, you’ll need to identify a clear and convincing research gap, and a viable plan to fill that gap.

If this sounds foreign to you, check out our free research topic webinar that explores how to find and refine a high-quality research topic, from scratch. Alternatively, if you’d like hands-on help, consider our 1-on-1 coaching service.


AI-Related Research Topics & Ideas

Below you’ll find a list of AI and machine learning-related research topic ideas. These are intentionally broad and generic, so keep in mind that you will need to refine them a little. Nevertheless, they should inspire some ideas for your project.

  • Developing AI algorithms for early detection of chronic diseases using patient data.
  • The use of deep learning in enhancing the accuracy of weather prediction models.
  • Machine learning techniques for real-time language translation in social media platforms.
  • AI-driven approaches to improve cybersecurity in financial transactions.
  • The role of AI in optimizing supply chain logistics for e-commerce.
  • Investigating the impact of machine learning in personalized education systems.
  • The use of AI in predictive maintenance for industrial machinery.
  • Developing ethical frameworks for AI decision-making in healthcare.
  • The application of ML algorithms in autonomous vehicle navigation systems.
  • AI in agricultural technology: Optimizing crop yield predictions.
  • Machine learning techniques for enhancing image recognition in security systems.
  • AI-powered chatbots: Improving customer service efficiency in retail.
  • The impact of AI on enhancing energy efficiency in smart buildings.
  • Deep learning in drug discovery and pharmaceutical research.
  • The use of AI in detecting and combating online misinformation.
  • Machine learning models for real-time traffic prediction and management.
  • AI applications in facial recognition: Privacy and ethical considerations.
  • The effectiveness of ML in financial market prediction and analysis.
  • Developing AI tools for real-time monitoring of environmental pollution.
  • Machine learning for automated content moderation on social platforms.
  • The role of AI in enhancing the accuracy of medical diagnostics.
  • AI in space exploration: Automated data analysis and interpretation.
  • Machine learning techniques in identifying genetic markers for diseases.
  • AI-driven personal finance management tools.
  • The use of AI in developing adaptive learning technologies for disabled students.


AI & ML Research Topic Ideas (Continued)

  • Machine learning in cybersecurity threat detection and response.
  • AI applications in virtual reality and augmented reality experiences.
  • Developing ethical AI systems for recruitment and hiring processes.
  • Machine learning for sentiment analysis in customer feedback.
  • AI in sports analytics for performance enhancement and injury prevention.
  • The role of AI in improving urban planning and smart city initiatives.
  • Machine learning models for predicting consumer behaviour trends.
  • AI and ML in artistic creation: Music, visual arts, and literature.
  • The use of AI in automated drone navigation for delivery services.
  • Developing AI algorithms for effective waste management and recycling.
  • Machine learning in seismology for earthquake prediction.
  • AI-powered tools for enhancing online privacy and data protection.
  • The application of ML in enhancing speech recognition technologies.
  • Investigating the role of AI in mental health assessment and therapy.
  • Machine learning for optimization of renewable energy systems.
  • AI in fashion: Predicting trends and personalizing customer experiences.
  • The impact of AI on legal research and case analysis.
  • Developing AI systems for real-time language interpretation for the deaf and hard of hearing.
  • Machine learning in genomic data analysis for personalized medicine.
  • AI-driven algorithms for credit scoring in microfinance.
  • The use of AI in enhancing public safety and emergency response systems.
  • Machine learning for improving water quality monitoring and management.
  • AI applications in wildlife conservation and habitat monitoring.
  • The role of AI in streamlining manufacturing processes.
  • Investigating the use of AI in enhancing the accessibility of digital content for visually impaired users.

Recent AI & ML-Related Studies

While the ideas we’ve presented above are a decent starting point for finding a research topic in AI, they are fairly generic and non-specific. So, it helps to look at actual studies in the AI and machine learning space to see how this all comes together in practice.

Below, we’ve included a selection of AI-related studies to help refine your thinking. These are actual studies, so they can provide some useful insight as to what a research topic looks like in practice.

  • An overview of artificial intelligence in diabetic retinopathy and other ocular diseases (Sheng et al., 2022)
  • How Does Artificial Intelligence Help Astronomy? A Review (Patel, 2022)
  • Editorial: Artificial Intelligence in Bioinformatics and Drug Repurposing: Methods and Applications (Zheng et al., 2022)
  • Review of Artificial Intelligence and Machine Learning Technologies: Classification, Restrictions, Opportunities, and Challenges (Mukhamediev et al., 2022)
  • Will digitization, big data, and artificial intelligence – and deep learning–based algorithm govern the practice of medicine? (Goh, 2022)
  • Flower Classifier Web App Using Ml & Flask Web Framework (Singh et al., 2022)
  • Object-based Classification of Natural Scenes Using Machine Learning Methods (Jasim & Younis, 2023)
  • Automated Training Data Construction using Measurements for High-Level Learning-Based FPGA Power Modeling (Richa et al., 2022)
  • Artificial Intelligence (AI) and Internet of Medical Things (IoMT) Assisted Biomedical Systems for Intelligent Healthcare (Manickam et al., 2022)
  • Critical Review of Air Quality Prediction using Machine Learning Techniques (Sharma et al., 2022)
  • Artificial Intelligence: New Frontiers in Real–Time Inverse Scattering and Electromagnetic Imaging (Salucci et al., 2022)
  • Machine learning alternative to systems biology should not solely depend on data (Yeo & Selvarajoo, 2022)
  • Measurement-While-Drilling Based Estimation of Dynamic Penetrometer Values Using Decision Trees and Random Forests (García et al., 2022)
  • Artificial Intelligence in the Diagnosis of Oral Diseases: Applications and Pitfalls (Patil et al., 2022)
  • Automated Machine Learning on High Dimensional Big Data for Prediction Tasks (Jayanthi & Devi, 2022)
  • Breakdown of Machine Learning Algorithms (Meena & Sehrawat, 2022)
  • Technology-Enabled, Evidence-Driven, and Patient-Centered: The Way Forward for Regulating Software as a Medical Device (Carolan et al., 2021)
  • Machine Learning in Tourism (Rugge, 2022)
  • Towards a training data model for artificial intelligence in earth observation (Yue et al., 2022)
  • Classification of Music Generality using ANN, CNN and RNN-LSTM (Tripathy & Patel, 2022)

As you can see, these research topics are a lot more focused than the generic topic ideas we presented earlier. So, in order to develop a high-quality research topic, you’ll need to get laser-focused on a specific context with clearly defined variables of interest.

Get 1-On-1 Help

If you’re still unsure about how to find a quality research topic, check out our Research Topic Kickstarter service, which is the perfect starting point for developing a unique, well-justified research topic.

Research Topic Kickstarter - Need Help Finding A Research Topic?



17 Compelling Machine Learning Ph.D. Dissertations


Posted by Daniel Gutierrez, ODSC, on August 12, 2021

Working in the field of data science, I’m always seeking ways to keep current in the field, and there are a number of important resources available for this purpose: new book titles, blog articles, conference sessions, Meetups, webinars/podcasts, not to mention the gems floating around in social media. But to dig even deeper, I routinely look at what’s coming out of the world’s research labs. One great way to keep a pulse on what the research community is working on is to monitor the flow of new machine learning Ph.D. dissertations. Admittedly, many such theses are laser-focused and narrow, but from previous experience reading these documents, you can learn an awful lot about new ways to solve difficult problems over a vast range of problem domains.

In this article, I present a number of hand-picked machine learning dissertations that I found compelling in terms of my own areas of interest and aligned with problems that I’m working on. I hope you’ll find a number of them that match your own interests. Each dissertation may be challenging to consume but the process will result in hours of satisfying summer reading. Enjoy!

Please check out my previous data science dissertation round-up article.

1. Fitting Convex Sets to Data: Algorithms and Applications

This machine learning dissertation concerns the geometric problem of finding a convex set that best fits a given data set. The overarching question serves as an abstraction for data-analytical tasks arising in a range of scientific and engineering applications, with a focus on two specific instances. (i) A key challenge that arises in solving inverse problems is ill-posedness due to a lack of measurements. A prominent family of methods for addressing such issues augments optimization-based approaches with a convex penalty function so as to induce a desired structure in the solution; these functions are typically chosen using prior knowledge about the data. The thesis studies the problem of learning convex penalty functions directly from data for settings in which we lack the domain expertise to choose a penalty function, by suitably transforming the problem of learning a penalty function into a fitting task. (ii) The second instance is the problem of fitting tractably described convex sets given the optimal values of linear functionals evaluated in different directions.

2. Structured Tensors and the Geometry of Data

This machine learning dissertation analyzes data to build a quantitative understanding of the world. Linear algebra is the foundation of algorithms, dating back one hundred years, for extracting structure from data. Modern technologies provide an abundance of multi-dimensional data, in which multiple variables or factors can be compared simultaneously. To organize and analyze such data sets we can use a tensor, the higher-order analogue of a matrix. However, many theoretical and practical challenges arise in extending linear algebra to the setting of tensors. The first part of the thesis studies and develops the algebraic theory of tensors. The second part of the thesis presents three algorithms for tensor data. The algorithms use algebraic and geometric structure to give guarantees of optimality.
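A basic operation in this setting, which any algebraic treatment of tensors builds on, is mode-n matricization: unfolding a tensor into a matrix so that ordinary linear-algebra tools apply. A minimal NumPy sketch (the helper name is mine, not the dissertation's):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization: move `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

T = np.arange(24).reshape(2, 3, 4)  # a small order-3 tensor
M0 = unfold(T, 0)  # 2 x 12 matrix
M1 = unfold(T, 1)  # 3 x 8 matrix
```

Tensor decompositions such as HOSVD are then computed by applying matrix SVD to each unfolding in turn.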

3. Statistical approaches for spatial prediction and anomaly detection

This machine learning dissertation is primarily a description of three projects. It starts with a method for spatial prediction and parameter estimation for irregularly spaced, non-Gaussian data. It is shown that by judiciously replacing the likelihood with an empirical likelihood in the Bayesian hierarchical model, approximate posterior distributions for the mean and covariance parameters can be obtained. Due to the complex nature of the hierarchical model, standard Markov chain Monte Carlo methods cannot be applied to sample from the posterior distributions. To overcome this issue, a generalized sequential Monte Carlo algorithm is used. Finally, this method is applied to iron concentrations in California. The second project focuses on anomaly detection for functional data, specifically functional data where the observed functions may lie over different domains. By approximating each function as a low-rank sum of spline basis functions, the coefficients can be compared for each basis across each function. The idea is that if two functions are similar, then their respective coefficients should not be significantly different. This project concludes with an application of the proposed method to detect anomalous behavior of users of a supercomputer at NREL. The final project is an extension of the second project to two-dimensional data. This project aims to detect location and temporal anomalies from ground motion data from a fiber-optic cable using distributed acoustic sensing (DAS).
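The second project's coefficient-comparison idea can be illustrated in a few lines. The sketch below substitutes a polynomial basis for the spline basis used in the dissertation, and flags curves whose coefficient vectors sit far from the coefficient-wise mean; the threshold and synthetic curves are illustrative:

```python
import numpy as np

def basis_coefficients(curves, x, degree=3):
    """Represent each observed function by its least-squares coefficients
    in a fixed basis (polynomials stand in for splines here)."""
    return np.array([np.polyfit(x, y, degree) for y in curves])

def flag_anomalies(coeffs, threshold=3.0):
    """Flag functions whose coefficient vector is far, in z-score units,
    from the coefficient-wise mean across all functions."""
    mu = coeffs.mean(axis=0)
    sd = coeffs.std(axis=0) + 1e-12
    max_z = np.abs((coeffs - mu) / sd).max(axis=1)
    return max_z > threshold

x = np.linspace(0, 1, 50)
rng = np.random.default_rng(2)
curves = [np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(50) for _ in range(20)]
curves.append(3 * x**2)  # one structurally different curve
flags = flag_anomalies(basis_coefficients(np.array(curves), x))
```

Working in coefficient space is what lets the method compare functions observed over different domains: only the basis representation, not the raw samples, has to be commensurable.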

4. Sampling for Streaming Data

Advances in data acquisition technology pose challenges in analyzing large volumes of streaming data. Sampling is a natural yet powerful tool for analyzing such data sets due to its competitive estimation accuracy and low computational cost. Unfortunately, sampling methods and their statistical properties for streaming data, especially streaming time series data, are not well studied in the literature. Meanwhile, estimating the dependence structure of multidimensional streaming time-series data in real-time is challenging. With large volumes of streaming data, the problem becomes more difficult when the multidimensional data are collected asynchronously across distributed nodes, which motivates sampling representative data points from streams. This machine learning dissertation proposes a series of leverage score-based sampling methods for streaming time series data. Simulation studies and real data analysis are conducted to validate the proposed methods. The theoretical analysis of the asymptotic behaviors of the least-squares estimator is developed based on the subsamples.
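To illustrate the core ingredient, the leverage score of row i is the i-th diagonal entry of the hat matrix, computable from a thin QR decomposition, and rows can be sampled proportionally to it. This is a static (non-streaming) sketch of the sampling step only, with illustrative sizes, not the dissertation's streaming algorithm:

```python
import numpy as np

def leverage_scores(X):
    """h_i = x_i' (X'X)^{-1} x_i: the diagonal of the hat matrix,
    computed stably from a thin QR decomposition."""
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)

def leverage_sample(X, y, m, seed=0):
    """Sample m rows with probability proportional to their leverage."""
    h = leverage_scores(X)
    idx = np.random.default_rng(seed).choice(len(X), size=m, replace=False,
                                             p=h / h.sum())
    return X[idx], y[idx]

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(500)
Xs, ys = leverage_sample(X, y, m=100)
beta = np.linalg.lstsq(Xs, ys, rcond=None)[0]  # fit on the subsample only
```

The point of leverage-based sampling is that the subsample least-squares estimate stays close to the full-data estimate at a fraction of the cost; the streaming setting adds the difficulty of maintaining the scores as data arrive.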

5.  Statistical Machine Learning Methods for Complex, Heterogeneous Data

This machine learning dissertation develops statistical machine learning methodology for three distinct tasks. Each method blends classical statistical approaches with machine learning methods to provide principled solutions to problems with complex, heterogeneous data sets. The first framework proposes two methods for high-dimensional shape-constrained regression and classification. These methods reshape pre-trained prediction rules to satisfy shape constraints like monotonicity and convexity. The second method provides a nonparametric approach to the econometric analysis of discrete choice. This method provides a scalable algorithm for estimating utility functions with random forests, and combines this with random effects to properly model preference heterogeneity. The final method draws inspiration from early work in statistical machine translation to construct embeddings for variable-length objects like mathematical equations.
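The idea of reshaping predictions to satisfy a shape constraint can be illustrated with the classical pool-adjacent-violators algorithm, which projects a 1-D sequence onto the set of nondecreasing sequences. A minimal sketch (this is the textbook algorithm, not the dissertation's own high-dimensional method):

```python
import numpy as np

def isotonic_fit(y):
    """Pool-adjacent-violators: project a sequence onto the set of
    nondecreasing sequences (weighted-mean pooling of violating blocks)."""
    blocks = []  # each block is [mean value, number of points pooled]
    for v in map(float, y):
        blocks.append([v, 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2 = blocks.pop()
            v1, w1 = blocks[-1]
            blocks[-1] = [(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2]
    out = []
    for v, w in blocks:
        out.extend([v] * w)
    return np.array(out)

fitted = isotonic_fit([1.0, 3.0, 2.0, 4.0])  # pools the 3, 2 violation
```

Applied to the outputs of a pre-trained model sorted along one covariate, this is the simplest instance of "reshaping a prediction rule to be monotone".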

6. Topics in Multivariate Statistics with Dependent Data

This machine learning dissertation comprises four chapters. The first is an introduction to the topics of the dissertation and the remaining chapters contain the main results. Chapter 2 gives new results for consistency of maximum likelihood estimators with a focus on multivariate mixed models. The presented theory builds on the idea of using subsets of the full data to establish consistency of estimators based on the full data. The theory is applied to two multivariate mixed models for which it was unknown whether maximum likelihood estimators are consistent. In Chapter 3 an algorithm is proposed for maximum likelihood estimation of a covariance matrix when the corresponding correlation matrix can be written as the Kronecker product of two lower-dimensional correlation matrices. The proposed method is fully likelihood-based. Some desirable properties of separable correlation in comparison to separable covariance are also discussed. Chapter 4 is concerned with Bayesian vector auto-regressions (VARs). A collapsed Gibbs sampler is proposed for Bayesian VARs with predictors and the convergence properties of the algorithm are studied. 
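The separable correlation structure of Chapter 3 is easy to demonstrate: the Kronecker product of two correlation matrices is itself a valid correlation matrix (unit diagonal, positive definite). A small NumPy check with illustrative matrices:

```python
import numpy as np

# Two small correlation matrices (illustrative values).
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
B = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.3],
              [0.1, 0.3, 1.0]])

# Their Kronecker product is a valid 6x6 correlation matrix:
# every entry is A[i1, j1] * B[i2, j2].
R = np.kron(A, B)
```

The appeal of separability is parsimony: the 6x6 matrix above is described by the parameters of a 2x2 and a 3x3 factor, and the same economy scales to much larger dimensions.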

7.  Model Selection and Estimation for High-dimensional Data Analysis

In the era of big data, uncovering useful information and hidden patterns in data is prevalent across different fields. However, it is challenging to effectively select input variables in data and estimate their effects. The goal of this machine learning dissertation is to develop reproducible statistical approaches that provide mechanistic explanations of the phenomena observed in big data analysis. The research contains two parts: variable selection and model estimation. The first part investigates how to measure and interpret the usefulness of an input variable using an approach called “variable importance learning” and builds tools (methodology and software) that can be widely applied. Two variable importance measures are proposed: a parametric measure, SOIL, and a non-parametric measure, CVIL, based on the ideas of model combination and cross-validation, respectively. The SOIL method is theoretically shown to have the inclusion/exclusion property: when the model weights are properly concentrated around the true model, the SOIL importance can well separate the variables in the true model from the rest. The CVIL method possesses desirable theoretical properties and enhances the interpretability of many mysterious but effective machine learning methods. The second part focuses on how to estimate the effect of a useful input variable when an interaction between two input variables exists. The dissertation investigates the minimax rate of convergence for regression estimation in high-dimensional sparse linear models with two-way interactions, and constructs an adaptive estimator that achieves the minimax rate of convergence regardless of the true heredity condition and the sparsity indices.
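SOIL and CVIL are the dissertation's own proposals, but the general notion of variable importance they refine can be illustrated with the much older permutation-importance idea: measure how much a model's score drops when one input column is shuffled. A minimal sketch with a plain least-squares model (all names and data are illustrative, not from the dissertation):

```python
import numpy as np

def permutation_importance(X, y, predict, score, n_repeats=10, seed=0):
    """Mean drop in score when each column is shuffled in turn."""
    rng = np.random.default_rng(seed)
    base = score(y, predict(X))
    importances = []
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break column j's link to y
            drops.append(base - score(y, predict(Xp)))
        importances.append(np.mean(drops))
    return np.array(importances)

rng = np.random.default_rng(4)
X = rng.standard_normal((300, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(300)  # only column 0 matters
beta = np.linalg.lstsq(X, y, rcond=None)[0]

def r2(y_true, y_pred):
    return 1 - np.sum((y_true - y_pred)**2) / np.sum((y_true - y_true.mean())**2)

imp = permutation_importance(X, y, lambda A: A @ beta, r2)
```

Measures like SOIL and CVIL aim for the same ranking behavior but with theoretical guarantees about separating true-model variables from the rest.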


8.  High-Dimensional Structured Regression Using Convex Optimization

While the term “Big Data” can have multiple meanings, this dissertation considers the type of data in which the number of features can be much greater than the number of observations (also known as high-dimensional data). High-dimensional data is abundant in contemporary scientific research due to the rapid advances in new data-measurement technologies and computing power. Recent advances in statistics have witnessed great development in the field of high-dimensional data analysis. This machine learning dissertation proposes three methods that study three different components of a general framework of the high-dimensional structured regression problem. A general theme of the proposed methods is that they cast a certain structured regression as a convex optimization problem. In so doing, the theoretical properties of each method can be well studied, and efficient computation is facilitated. Each method is accompanied by a thorough theoretical analysis of its performance, and also by an R package containing its practical implementation. It is shown that the proposed methods perform favorably (both theoretically and practically) compared with pre-existing methods.
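A canonical instance of casting structured regression as a convex optimization problem is the lasso, solved here by coordinate descent with the soft-thresholding proximal operator. This generic sketch (not one of the dissertation's three methods, which come with their own R packages) shows the pattern:

```python
import numpy as np

def soft_threshold(z, lam):
    """Proximal operator of the l1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimal coordinate-descent lasso:
    minimize 0.5 * ||y - X beta||^2 + lam * n * ||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = np.sum(X**2, axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]  # partial residual
            beta[j] = soft_threshold(X[:, j] @ r, lam * n) / col_sq[j]
    return beta

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 10))
true = np.zeros(10)
true[:2] = [3.0, -2.0]  # sparse ground truth
y = X @ true + 0.1 * rng.standard_normal(200)
beta = lasso_cd(X, y, lam=0.1)  # noise coefficients shrink to exactly zero
```

Convexity is what makes this tractable to analyze: the objective has no spurious local minima, so the theoretical properties of the estimator can be studied cleanly, which is the dissertation's general theme.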

9. Asymptotics and Interpretability of Decision Trees and Decision Tree Ensembles

Decision trees and decision tree ensembles are widely used nonparametric statistical models. A decision tree is a binary tree that recursively segments the covariate space along the coordinate directions, creating hyperrectangles as basic prediction units and fitting a constant value within each of them. A decision tree ensemble combines multiple decision trees, either in parallel or in sequence, to increase model flexibility and accuracy and to reduce prediction variance. Although tree models are extensively used in practice, results on their asymptotic behavior are scarce. This machine learning dissertation analyzes tree asymptotics from the perspectives of tree terminal nodes, tree ensembles, and models incorporating tree ensembles. The study introduces several new tree-related learning frameworks that provide provable statistical guarantees and interpretations. A study of the Gini index used in the greedy tree-building algorithm reveals its limiting distribution, leading to a test of better splitting that measures the uncertainty in the optimality of a decision tree split. This test is combined with decision tree distillation, in which a decision tree is fit to mimic the behavior of a black-box model, to generate stable interpretations by guaranteeing a unique distillation tree structure given sufficiently many random sample points. The dissertation also applies mild modification and regularization to standard tree boosting to create a new boosting framework named Boulevard, which integrates two new mechanisms: honest trees, which isolate the tree terminal values from the tree structure, and adaptive shrinkage, which scales the boosting history to create an equally weighted ensemble. This theoretical development provides the prerequisites for statistical inference with boosted trees.
Lastly, the thesis investigates the feasibility of incorporating existing semi-parametric models with tree boosting.
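The Gini index mentioned above drives greedy split selection in CART-style trees. A minimal sketch of that standard mechanism (the textbook construction, not the dissertation's limiting-distribution analysis):

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    proportions = counts / counts.sum()
    return 1.0 - np.sum(proportions ** 2)

def best_split(x, y):
    """Exhaustively search thresholds on one covariate for the split that
    minimizes the weighted Gini impurity of the two child nodes."""
    best_t, best_score = None, np.inf
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 0, 1, 1, 1])
t, score = best_split(x, y)
print(t, score)  # 3.0 0.0 — a perfect split separates the classes
```

The dissertation's test of better splitting quantifies how confident one should be that the empirically best split (the minimizer above) is truly optimal rather than an artifact of sampling noise.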

10. Bayesian Models for Imputing Missing Data and Editing Erroneous Responses in Surveys

This dissertation develops Bayesian methods for handling unit nonresponse, item nonresponse, and erroneous responses in large-scale surveys and censuses containing categorical data. The focus is on applications to household data, where individuals are nested within households and certain combinations of the variables are not allowed, such as the U.S. Decennial Census, as well as on surveys subject to both unit and item nonresponse, such as the Current Population Survey.

11. Localized Variable Selection with Random Forest  

Due to recent advances in computer technology, the cost of collecting and storing data has dropped drastically, making it feasible to collect large amounts of information for each data point. This increasing trend in feature dimensionality motivates research on variable selection. Random forest (RF) has demonstrated the ability to select important variables and model complex data. However, simulations confirm that in some cases it fails to detect less influential features in the presence of variables with large impacts. This dissertation proposes two algorithms for localized variable selection: clustering-based feature selection (CBFS) and locally adjusted feature importance (LAFI). Both methods aim to find regions where the effects of weaker features can be isolated and measured. CBFS combines RF variable selection with a two-stage clustering method to detect variables whose effects are present only in certain regions. LAFI, on the other hand, uses a binary tree approach to split the data into bins based on response variable rankings, and applies RF to find important variables in each bin. Larger LAFI values are assigned to variables that are selected in more bins. Simulations and real data sets are used to evaluate these variable selection methods.
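CBFS and LAFI are specific to the dissertation; the underlying notion of scoring a variable by the damage done when its values are scrambled can be sketched with generic permutation importance (here with an OLS stand-in for the forest, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: only variable 0 drives the response.
n = 300
X = rng.normal(size=(n, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Stand-in model: OLS fit (a random forest would be used in practice).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
base_mse = np.mean((y - X @ beta) ** 2)

# Permutation importance: error inflation when a single column is shuffled,
# breaking its association with the response while keeping its distribution.
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importance.append(np.mean((y - Xp @ beta) ** 2) - base_mse)
print(np.round(importance, 3))
```

The localized methods in the dissertation refine this global idea by restricting the importance computation to regions (clusters or bins) where weaker effects are not masked by dominant variables.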

12. Functional Principal Component Analysis and Sparse Functional Regression

The focus of this dissertation is on functional data which are sparsely and irregularly observed. Such data require special consideration, as classical functional data methods and theory were developed for densely observed data. As is the case in much of functional data analysis, the functional principal components (FPCs) play a key role in current sparse functional data methods via the Karhunen–Loève expansion. Thus, after a review of relevant background material, this dissertation is divided roughly into two parts, the first focusing specifically on theoretical properties of FPCs, and the second on regression for sparsely observed functional data.
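For densely observed curves on a common grid, the FPCs in the Karhunen–Loève expansion reduce to eigenvectors of the sample covariance; a minimal sketch of that baseline case (the dissertation's harder sparse, irregular setting requires smoothing-based covariance estimation instead):

```python
import numpy as np

rng = np.random.default_rng(2)

# Densely observed toy curves on a common grid: each curve is a random
# combination of two known basis functions plus observation noise.
grid = np.linspace(0, 1, 50)
n = 200
scores = rng.normal(size=(n, 2)) * np.array([2.0, 0.5])  # decreasing variance
basis = np.vstack([np.sin(2 * np.pi * grid), np.cos(2 * np.pi * grid)])
curves = scores @ basis + 0.05 * rng.normal(size=(n, len(grid)))

# Estimate FPCs: eigenvectors of the sample covariance of the centered curves.
centered = curves - curves.mean(axis=0)
cov = centered.T @ centered / n
eigvals, eigvecs = np.linalg.eigh(cov)
fpcs = eigvecs[:, ::-1]                 # leading components first
explained = eigvals[::-1] / eigvals.sum()
print(np.round(explained[:3], 3))       # first two components dominate
```

With only two sources of variation in the toy curves, nearly all variance concentrates in the first two estimated FPCs, mirroring the truncated Karhunen–Loève representation.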

13. Essays In Causal Inference: Addressing Bias In Observational And Randomized Studies Through Analysis And Design

In observational studies, identifying assumptions may fail, often quietly and without notice, leading to biased causal estimates. Although less of a concern in randomized trials where treatment is assigned at random, bias may still enter the equation through other means. This dissertation has three parts, each developing new methods to address a particular pattern or source of bias in the setting being studied. The first part extends the conventional sensitivity analysis methods for observational studies to better address patterns of heterogeneous confounding in matched-pair designs. The second part develops a modified difference-in-differences design for comparative interrupted time-series studies. The method permits partial identification of causal effects when the parallel trends assumption is violated by an interaction between group and history. The method is applied to a study of the repeal of Missouri’s permit-to-purchase handgun law and its effect on firearm homicide rates. The final part presents a study design to identify vaccine efficacy in randomized controlled trials when there is no gold standard case definition. The approach augments a two-arm randomized trial with natural variation of a genetic trait to produce a factorial experiment.

14. Bayesian Shrinkage: Computation, Methods, and Theory

Sparsity is a standard structural assumption made while modeling high-dimensional statistical parameters. This assumption essentially entails a lower-dimensional embedding of the high-dimensional parameter, thus enabling sound statistical inference. Apart from this statistical motivation, in many modern applications of statistics, such as genomics and neuroscience, parameters of interest are indeed of this nature. For almost two decades, spike-and-slab priors have been the Bayesian gold standard for modeling sparsity. However, due to their computational bottlenecks, shrinkage priors have emerged as a powerful alternative. This family of priors can almost exclusively be represented as scale mixtures of Gaussian distributions, and posterior Markov chain Monte Carlo (MCMC) updates of the related parameters are then relatively easy to design. Although shrinkage priors were tipped as computationally scalable in high dimensions, when the number of parameters is in the thousands or more they come with their own computational challenges. Standard MCMC algorithms implementing shrinkage priors generally scale cubically in the dimension of the parameter, severely limiting real-life application of these priors.

The first chapter of this dissertation addresses this computational issue and proposes an alternative exact posterior sampling algorithm whose complexity scales linearly in the ambient dimension. The algorithm developed in the first chapter is specifically designed for regression problems. The second chapter develops a Bayesian method based on shrinkage priors for high-dimensional multiple-response regression. The third chapter chooses a specific member of the shrinkage family, the horseshoe prior, and studies its convergence rates in several high-dimensional models.

15.  Topics in Measurement Error Analysis and High-Dimensional Binary Classification

This dissertation proposes novel methods to tackle two problems: misspecified models with measurement error, and high-dimensional binary classification, both of which have a crucial impact on applications in public health. The first problem arises in epidemiological practice. Epidemiologists often categorize a continuous risk predictor, since categorization is thought to be more robust and interpretable, even when the true risk model is not categorical. Thus, their goal is to fit the categorical model and interpret the categorical parameters. The second project considers the problem of high-dimensional classification between two groups with unequal covariance matrices. Rather than estimating the full quadratic discriminant rule, it is proposed to perform simultaneous variable selection and linear dimension reduction on the original data, with subsequent application of quadratic discriminant analysis on the reduced space. Further, to support the proposed methodology, two R packages, CCP and DAP, were developed, along with two vignettes as long-format illustrations of their usage.
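The dissertation's DAP method performs variable selection and linear dimension reduction before the discriminant step; the quadratic discriminant analysis step itself can be sketched as follows (plain QDA on toy two-dimensional data with unequal class covariances, not the DAP procedure):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two Gaussian classes with unequal covariance matrices.
n = 200
X0 = rng.normal(size=(n, 2)) @ np.diag([1.0, 0.3])
X1 = rng.normal(size=(n, 2)) @ np.diag([0.3, 1.0]) + np.array([2.0, 2.0])
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

def qda_fit(X, y):
    """Estimate a separate mean and covariance per class."""
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c], rowvar=False))
            for c in np.unique(y)}

def qda_predict(params, X):
    """Assign each point to the class with the highest Gaussian log-density."""
    scores = []
    for c, (mu, sigma) in params.items():
        diff = X - mu
        inv = np.linalg.inv(sigma)
        logdens = (-0.5 * np.einsum('ij,jk,ik->i', diff, inv, diff)
                   - 0.5 * np.log(np.linalg.det(sigma)))
        scores.append(logdens)
    return np.array(list(params))[np.argmax(scores, axis=0)]

params = qda_fit(X, y)
accuracy = np.mean(qda_predict(params, X) == y)
print(round(accuracy, 3))  # well-separated classes give high accuracy
```

Because each class keeps its own covariance, the decision boundary is quadratic; reducing dimension first, as DAP does, makes the per-class covariance estimates feasible when the ambient dimension is high.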

16. Model-Based Penalized Regression

This dissertation contains three chapters that consider penalized regression from a model-based perspective, interpreting penalties as assumed prior distributions for unknown regression coefficients. The first chapter shows that treating a lasso penalty as a prior can facilitate the choice of tuning parameters when standard methods for choosing the tuning parameters are not available, and when it is necessary to choose multiple tuning parameters simultaneously. The second chapter considers a possible drawback of treating penalties as models, specifically possible misspecification. The third chapter introduces structured shrinkage priors for dependent regression coefficients which generalize popular independent shrinkage priors. These can be useful in various applied settings where many regression coefficients are not only expected to be nearly or exactly equal to zero, but also structured.
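The lasso-as-prior view rests on the fact that the lasso estimate is the posterior mode under independent Laplace priors on the coefficients. A minimal numpy sketch of the lasso itself, solved by proximal gradient descent (ISTA) on synthetic sparse data:

```python
import numpy as np

rng = np.random.default_rng(4)

# Sparse truth: only 2 of 10 coefficients are nonzero.
n, p = 100, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [3.0, -2.0]
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Lasso via proximal gradient (ISTA); equivalent to MAP estimation
    under independent Laplace priors on the coefficients."""
    beta = np.zeros(X.shape[1])
    step = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y)
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

beta_hat = lasso_ista(X, y, lam=20.0)
print(np.round(beta_hat, 2))  # large in the first two slots, near zero elsewhere
```

The soft-thresholding step is exactly where the Laplace prior exerts itself: coefficients whose gradient signal stays below the threshold are set to exactly zero, which is the sparsity the chapter's structured generalizations build on.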

17. Topics on Least Squares Estimation

This dissertation revisits and makes progress on some old but challenging problems concerning least squares estimation, the workhorse of supervised machine learning. Two major problems are addressed: (i) least squares estimation with heavy-tailed errors, and (ii) least squares estimation in non-Donsker classes. For (i), the problem is studied both from a worst-case perspective and from a more refined envelope perspective. For (ii), two case studies are performed in the context of (a) estimation involving sets and (b) estimation of multivariate isotonic functions. Understanding these particular aspects of least squares estimation requires several new tools in empirical process theory, including a sharp multiplier inequality controlling the size of the multiplier empirical process, and matching upper and lower bounds for empirical processes indexed by non-Donsker classes.

How to Learn More about Machine Learning

At our upcoming event this November 16th-18th in San Francisco, ODSC West 2021 will feature a plethora of talks, workshops, and training sessions on machine learning and machine learning research. You can register now for 50% off all ticket types before the discount drops to 40% in a few weeks. Some highlighted sessions on machine learning include:

  • Towards More Energy-Efficient Neural Networks? Use Your Brain!: Olaf de Leeuw | Data Scientist | Dataworkz
  • Practical MLOps: Automation Journey: Evgenii Vinogradov, PhD | Head of DHW Development | YooMoney
  • Applications of Modern Survival Modeling with Python: Brian Kent, PhD | Data Scientist | Founder The Crosstab Kite
  • Using Change Detection Algorithms for Detecting Anomalous Behavior in Large Systems: Veena Mendiratta, PhD | Adjunct Faculty, Network Reliability and Analytics Researcher | Northwestern University

Sessions on MLOps:

  • Tuning Hyperparameters with Reproducible Experiments: Milecia McGregor | Senior Software Engineer | Iterative
  • MLOps… From Model to Production: Filipa Peleja, PhD | Lead Data Scientist | Levi Strauss & Co
  • Operationalization of Models Developed and Deployed in Heterogeneous Platforms: Sourav Mazumder | Data Scientist, Thought Leader, AI & ML Operationalization Leader | IBM
  • Develop and Deploy a Machine Learning Pipeline in 45 Minutes with Ploomber: Eduardo Blancas | Data Scientist | Fidelity Investments

Sessions on Deep Learning:

  • GANs: Theory and Practice, Image Synthesis With GANs Using TensorFlow: Ajay Baranwal | Center Director | Center for Deep Learning in Electronic Manufacturing, Inc
  • Machine Learning With Graphs: Going Beyond Tabular Data: Dr. Clair J. Sullivan | Data Science Advocate | Neo4j
  • Deep Dive into Reinforcement Learning with PPO using TF-Agents & TensorFlow 2.0: Oliver Zeigermann | Software Developer | embarc Software Consulting GmbH
  • Get Started with Time-Series Forecasting using the Google Cloud AI Platform: Karl Weinmeister | Developer Relations Engineering Manager | Google


Daniel Gutierrez, ODSC

Daniel D. Gutierrez is a practicing data scientist who’s been working with data since long before the field came into vogue. As a technology journalist, he enjoys keeping a pulse on this fast-paced industry. Daniel is also an educator, having taught data science, machine learning, and R classes at the university level. He has authored four computer industry books on database and data science technology, including his most recent title, “Machine Learning and Data Science: An Introduction to Statistical Learning Methods with R.” Daniel holds a BS in Mathematics and Computer Science from UCLA.


Thesis on Machine Learning Methods and Its Applications

IJRASET Publication

2021, IJRASET

In the 1950s, the concept of machine learning was introduced as a subfield of artificial intelligence. For decades afterward, however, there were few significant developments; the field has chiefly developed and expanded since the 1990s. It will continue to grow because analyzing and processing data becomes more difficult as the number of records and documents increases. With increasing data, machine learning focuses on finding the best model for new data while taking all previous data into account, so machine learning research will continue in step with this growth. This research focuses on the history of machine learning, its methods and applications, and the research that has been conducted on the topic. Our study aims to give researchers a deeper understanding of machine learning, an area of research that is becoming much more popular today, and its applications.

Related Papers

Manisha More

Machine learning is one of the fastest growing areas of computer science. It gives computers the ability to learn programs from data. A subset of artificial intelligence (AI), it consists of advanced techniques and models that enable computers to figure things out from data and deliver results. The field is broadly divided into supervised learning, unsupervised learning, and reinforcement learning, and its algorithms are used in many fields. The objective of this paper is to present the objectives of ML and to explore the various ML techniques and algorithms with their applications in various fields, drawing on published papers, workshop materials, books, and material available online on the World Wide Web.



pankaj verma

The field of machine learning is introduced at a conceptual level. The main goal of machine learning is for computers to learn automatically, without human intervention or assistance, and adjust their actions accordingly. We discuss mainly three types of machine learning algorithms, along with ML's features and applications in detail. In supervised ML, the machine applies what it has learned in the past to new data, using labeled examples to predict future events. Unsupervised ML studies how systems can infer a function that describes a hidden structure from unlabeled data. Reinforcement ML is a learning method that interacts with its environment, produces actions, and discovers errors and rewards.

Journal of Advances in Mathematical & Computational Science. Vol 10, No.3. Pp 1 – 14.

Jerry Sarumi

Machine learning and its associated algorithms occupy a place of pride in the execution of automation in the field of computing and in its application to contemporary, human-centred problems such as prediction, evaluation, deduction, analytics, and analysis. This paper presents types of data and machine learning algorithms in a broader sense. We briefly discuss and explain different machine learning algorithms and real-world application areas based on machine learning. We highlight several research issues and potential future directions.

IJESRT Journal

Machine learning [1], a branch of artificial intelligence, gives computers the ability to learn without being explicitly programmed, meaning it gives systems the ability to learn from data. There are two types of learning techniques: supervised learning and unsupervised learning [2]. This paper summarizes recent trends in machine learning research.

International Journal for Research in Applied Science & Engineering Technology (IJRASET)

Dr. Manish Kumar Singh

Machine learning has become one of the most active areas of research and development in modern times, but research related to machine learning is not new. The term machine learning was coined by Arthur Samuel in 1959, and since then many developments have been made in this field. Data scientists and machine learning enthusiasts have developed myriad algorithms over time to let the benefits of machine learning reach each and every field of human endeavor. This paper is an effort to shed light on some of the most prominent algorithms that have been used frequently in the machine learning field since its inception, and to analyze their areas of application.


International Journal of Engineering Applied Sciences and Technology

vishal bari

Today, huge amounts of data are available everywhere. Therefore, analyzing this data is very important to derive useful information from it and develop an algorithm based on this analysis. This can be achieved through data mining and machine learning. Machine learning is an essential part of artificial intelligence used to design algorithms based on data trends and past relationships between data. Machine learning is used in a variety of areas such as bioinformatics, intrusion detection, information retrieval, games, marketing, malware detection, and image decoding. This paper shows the work of various authors in the field of machine learning in various application areas.


IJRASET Publication

This paper describes the essential points of machine learning and its applications, covering the pros and cons of ML, real-life applications where machine learning is used, and the different types of machine learning and their algorithms. The paper gives detailed knowledge about the different algorithms used in machine learning along with their applications. There is a brief explanation of a weather prediction application using machine learning, as well as a comparison of the various machine learning algorithms used by researchers for weather prediction.


PHD PRIME

Thesis Topics for Machine Learning

Machine learning is one of the fastest-growing fields of research for the classification, clustering, and prediction of input data. Ensemble and hybridization techniques have contributed to the improvement of machine learning models, enhancing their speed of computation, accuracy, and robustness. Through this article, you can get an overview of novel machine learning models, their design, performance, merits, and uses, explained via a new taxonomic approach. You can also get all the essential details regarding any thesis topics for machine learning from this page.

At present, many new ensemble and hybridized machine learning models are being introduced and developed. Here, the essentials of thesis writing are presented to you by our world-class certified writers and developers. What are the essential elements of a thesis statement?

  • First of all, understand that writing the thesis statement is the most crucial process, involving a lot of time and thinking
  • Enough research data and evidence have to be gathered before writing a thesis statement
  • The main idea or objective has to be presented clearly, with supporting evidence
  • Also remember that the thesis statement should be in accordance with the argument, where adjustments are allowed

Usually, research scholars interact with our writers and experts on all aspects of thesis writing in machine learning. So we insist that you contact us well before you start your thesis, so that you can have a clear-cut vision and a well-strategized approach to writing the best thesis.

Top 5 Research Thesis Topics for Machine Learning

Let us now have an idea about various headings to be included in any thesis topics for machine learning.

  • Introduction – overview of the thesis
  • Related / existing works – presentation of existing research
  • Problem definition/statement – identify and highlight the problems
  • Research methodology – convey the proposed concepts
  • Results and discussion – compare the results of the proposed work with previous works
  • Conclusion and future work – summarize the results of the proposed work and outline future directions

The introduction is the very first part of your thesis. It is the way by which you tend to create the first impression in the minds of the readers. What are the parts of the introduction in the thesis?

  • The issue under examination forms the core of the overview
  • The main idea and assertion have to be stated clearly
  • The thesis statement and argument form the fundamental aspect here
  • Address the audience to show them that they are in the right place
  • The scope of your paper should be stated satisfactorily
  • The planning-based approach you used to conduct the research

In general, the choice of words, tone, approach, and language determines the quality of a thesis, and likewise of the introduction. Our technical team and expert writers have gained ample experience writing on thesis topics in machine learning, and the field knowledge and expertise we have gathered will be of great use to you. Let us now talk about the next most important part of a thesis: the issue.

What are the guidelines for thesis writing? 

Under the heading of the issue, the following aspects of research are to be included

  • The background and history of the issue or concern whose solution is stated as your objective
  • The impact of the issue on this field
  • Important characteristic features that affect the issue
  • Potential research solutions that are undertaken for research

With the massive amount of reliable and authentic research material that we provide, you can surely get all the necessary information to include in the issues part of your thesis. Also, our engineers and technical team are here to solve any kind of technical query you may have. Let us now talk about the literature review.

LITERATURE REVIEW 

  • With important references and constructs from standard textbooks, journals, and relevant publications, you need to describe:
    • the relevant theory
    • the issue explanation
    • the potential solution
    • theoretical constructs
    • explanations of major theories
  • Empirical literature from journal articles is considered for the following aspects:
    • explanation of the latest empirical studies
    • summary of the methodology adopted
    • important findings of the study
    • constraints associated with your findings
  • The pathway of your research study has to be organized in line with the literature review to make keynotes on:
    • the referred definitions and concepts
    • unique aspects of the issues under examination
    • a suitable method for your research

If you are searching for the best and most reliable online research guide for all kinds of thesis topics in machine learning, then you are at the right place. You can get professional and customized research support, aligned with your institutional format, from our experts. Let us now look into the methods section in detail.

The following are the different aspects that you need to incorporate in the methods section of your thesis

  • The research questions and issues under your examination
  • Description of the proposed work, such as data collection
  • Rationale and justification for your chosen method

In addition to these aspects, you need to provide a clear description of all the research methods you adopt in your study. For this purpose, our research experts are here to provide you with details on novel and innovative approaches useful for your research. You can also get concise and precise quantitative research data from us. Let us now look into the results section.

RESULTS AND DISCUSSION

On the page of results and discussion you need to incorporate the following aspects

  • Description of major findings
  • Visualization tools like charts, graphs, and tables to present the findings
  • Relevant previous studies and results
  • Creative and new results that you obtained
  • Scope to expand previous studies with your findings
  • Constraints of your study

The support of technical experts can help you do the best research work in machine learning. An interested researcher plus reliable, experienced research support makes the best PhD work possible. With our guidance, you get access to the best combination needed to carry out your research. Let us now discuss the conclusion.

Conclusion and recommendation

In the part of conclusion, you need to include the following aspects

  • Recap of issues being discussed
  • Methods used and major findings
  • Comparison between the original objective and accomplished results
  • Scope for future expansion of your research

For each and every aspect of your machine learning PhD thesis, you can get complete support from our experts. In this respect, let us now look at the topmost machine learning thesis topics below.

Top 5 Thesis Topics for Machine Learning

  • Machine learning is of great importance to physicians in the following respects:
    • chatbots for speech recognition
    • pattern recognition for disease detection
    • treatment recommendation
    • detecting cancerous cells
    • body fluid analysis
    • identification of phenotypes in the case of rare diseases
  • Classifying data into groups for fault detection is possible using machine learning. The following are some real-time examples of predictive analysis:
    • fraudulent versus legitimate transactions
    • improved prediction mechanisms for detecting faults
    • From the basics of developing products to predicting stock market and real estate prices, predictive analytics is of great importance
  • Statistical arbitrage uses a trading algorithm with a proper strategy for financing huge volumes of securities. Real-time examples of statistical arbitrage include:
    • analysis of huge data sets
    • algorithm-based trading for market microstructure analysis
    • real-time arbitrage opportunities
    • Machine learning is used to enhance statistical arbitrage strategies, as a result of which advanced results can be obtained
  • Feature extraction using machine learning plays a significant role in helping predictive analytics mechanisms achieve increased accuracy. Dataset annotation can be performed with greater significance using machine learning extraction methods, where structured data is extracted from unstructured information. Real-time examples of machine learning-based feature extraction include:
    • vocal cord disorder prediction
    • mechanisms for the prevention, diagnosis, and treatment of many disorders
    • detecting and solving many physiological problems in a swift manner
    • Extraction of critical information becomes easy with machine learning, even when large volumes of data are being processed
  • Machine learning methodologies can be used to translate speech into text. Recorded speech and real-time voice can be converted into text using machine learning systems designed for this purpose, and speech can also be classified by intensity, time, and frequency. Voice search, appliance control, and voice dialing are the main real-time examples of speech recognition

In order to get confidential research guidance from world-class experts on all these thesis topics for machine learning, feel free to contact us. With more than 15 years of customer satisfaction, we provide in-depth research and advanced project support for all thesis topics for machine learning. Our thesis writing support also includes the following aspects:

  • Multiple revisions
  • Complete grammatical check
  • Formatting and editing
  • Benchmark reference and citations from topmost journals
  • Work privacy
  • Internal review

We ensure all these criteria are delivered to you by world-class certified engineers, developers, and writers, so you can avail yourself of our services with confidence. We are here to support you fully. Let us now see some important machine learning methods.

Machine learning methods

Machine learning techniques are most often used to make automatic decisions for any kind of input they are trained and implemented for. Therefore, machine learning approaches are expected to support the following aspects of decision making.

  • Maximum accuracy of recommendations
  • In-depth understanding and analysis before deciding, in order to increase trustworthiness

The decision-making approach using machine learning methods provides higher prediction accuracy and more comprehensible models in implicit and explicit learning, respectively. For all your doubts and queries regarding the above-mentioned machine learning and decision-making approaches, you may feel free to contact us at any time of your convenience. Our technical team is highly experienced and skilled in resolving any kind of query. Let us now see the important machine learning algorithms.

Machine learning algorithms

Machine learning algorithms are diverse and can be oriented toward the various objectives and goals for which machine learning methods are frequently adopted:

  • One rule, zero rule, and cubist
  • RIPPER or Repeated Incremental Pruning to Produce Error Reduction
  • Random forest, boosting, and AdaBoost
  • Gradient Boosted Regression Trees and the Stacked Generalization
  • Gradient Boosting Machines and Bootstrapped Aggregation
  • Convolutional Neural Networks and Stacked Autoencoders
  • Deep Boltzmann Machine and Deep Belief Networks
  • Projection Pursuit and Sammon Mapping
  • Principal Component Analysis and Partial Least Square Discriminant Analysis
  • Quadratic Discriminant Analysis and Flexible Discriminant Analysis
  • Partial Least Squares Regression and Multidimensional Scaling
  • Principal Component Regression and Mixture Discriminant Analysis
  • Regularized Discriminant Analysis and Linear Discriminant Analysis
  • K means and K medians
  • Expectation Maximization and Hierarchical Clustering
  • Ridge Regression and Elastic Net
  • Least Angle Regression and the LASSO or Least Absolute Shrinkage and Selection Operator
  • Hopfield Network and Perceptron
  • Back-Propagation and Radial Basis Function Network
  • Naive Bayes and Bayesian Network
  • Averaged One-Dependence Estimators and Gaussian Naive Bayes
  • Bayesian Belief Networks and Multinomial Naive Bayes
  • Logistic, stepwise, and linear regression
  • Locally Estimated Scatterplot Smoothing and Ordinary Least Squares Regression
  • Multivariate Adaptive Regression Splines
  • M5, C4.5, C5.0, and Decision Stump
  • Conditional Decision Trees and Iterative Dichotomiser 3
  • Chi-squared Automatic Interaction Detection
  • Classification and regression tree
  • K Nearest Neighbour and Self Organising Map
  • Locally Weighted Learning and Learning Vector Quantization

You can get a complete technical explanation and tips associated with the usage of these algorithms from our website. The selection of your thesis topic for machine learning becomes easier than before when you look into the various aspects of these algorithms and get to choose the best one based on your interests and needs. For this purpose, you can connect with us. We are here to assist you by giving proper expert consultation support for topic selection and allocating a highly qualified team of engineers to carry out your project successfully. Let us now talk about linear regression in detail

What is the process of linear regression?

The following are the three important stages in the process of linear regression analysis

  • Data correlation and directionality analysis
  • Model estimation based on linear fitting
  • Estimation of validity and assessing the merits of the model
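The three stages above can be sketched in a few lines of Python. The data, true coefficients, and noise level below are invented for illustration, and the fit uses plain ordinary least squares via NumPy:

```python
import numpy as np

# Synthetic data (hypothetical): y depends linearly on x plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=50)

# Stage 1: correlation and directionality analysis
r = np.corrcoef(x, y)[0, 1]

# Stage 2: model estimation by ordinary least squares (linear fitting)
X = np.column_stack([np.ones_like(x), x])     # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # beta[0] = intercept, beta[1] = slope

# Stage 3: estimation of validity, here via the coefficient of determination R^2
residuals = y - X @ beta
r2 = 1 - residuals.var() / y.var()

print(f"r={r:.3f}, intercept={beta[0]:.2f}, slope={beta[1]:.2f}, R^2={r2:.3f}")
```

A strong correlation, a slope close to the true value, and an R² near 1 are how the three stages together confirm that a linear model is appropriate for this toy data.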

It is important that certain characteristic features are inherent in a model for the proper working of an algorithm. Feature engineering is the process by which essential features are obtained from raw data for the better functioning of an algorithm. With the most appropriate features extracted, algorithms become simpler, and accurate results can be obtained even from non-ideal algorithms. What are the objectives of feature engineering?

  • Preparation of input data for better compatibility with the chosen machine learning algorithm
  • Enhancement of the efficiency and working of machine learning models

With these goals, feature engineering becomes one of the important aspects of a machine learning research project. Talk to our engineers for more details on the methods and algorithms used in extracting the necessary features. What are the techniques used in feature engineering?

  • Imputation and binning
  • Log transform and feature split
  • Outliers handling and grouping functions
  • One hot encoding and scaling
  • Data extraction
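Several of these techniques can be sketched in a few lines of pandas; the column names and values below are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with a missing value and a skewed income column
df = pd.DataFrame({
    "age":    [22, 35, np.nan, 58, 41],
    "income": [1_800, 2_500, 3_200, 120_000, 4_100],
    "city":   ["Paris", "Lyon", "Paris", "Nice", "Lyon"],
})

df["age"] = df["age"].fillna(df["age"].median())              # imputation
df["age_bin"] = pd.cut(df["age"], bins=[0, 30, 50, 100],
                       labels=["young", "mid", "senior"])     # binning
df["log_income"] = np.log1p(df["income"])                     # log transform (tames the outlier)
df = pd.get_dummies(df, columns=["city"])                     # one-hot encoding
df["age_scaled"] = (df["age"] - df["age"].mean()) / df["age"].std()  # scaling

print(df.head())
```

Each line corresponds to one technique from the list above; outlier handling is implicit in the log transform, which compresses the extreme income value.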

Usually, we provide practical explanations in easy-to-understand words to our customers so that all their doubts are cleared even before they start their research. For this purpose, we make use of real-time implemented models and our successful projects. Check out our website for all our machine learning project details. Let us now talk about hybrid machine learning models.

HYBRID MACHINE LEARNING MODELS

  • When machine learning methods are integrated with other methods such as optimization approaches and soft computing, drastic improvements can be observed in the machine learning model.
  • Ensemble methods combine multiple machine learning classifiers using grouping techniques such as boosting and bagging.

Our experts claim that the success of machine learning depends on advancements in ensemble and hybrid methods. In this regard, let us have a look at some of the hybrid methods below

  • NBTree and functional tree
  • Hybrid fuzzy with decision tree
  • Logistic model tree and hybrid Hoeffding tree
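To illustrate the ensemble idea in general terms (this is not an implementation of any specific hybrid model above), the sketch below combines three hand-written decision stumps by majority vote on an invented dataset:

```python
import numpy as np

# Toy binary task (invented): the label is 1 when the feature sum is positive
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X.sum(axis=1) > 0).astype(int)

def stump(feature, threshold):
    """A decision stump: a one-feature, one-threshold weak classifier."""
    return lambda X: (X[:, feature] > threshold).astype(int)

# An ensemble of weak learners combined by majority vote
learners = [stump(0, 0.0), stump(1, 0.0), stump(2, 0.0)]
votes = np.stack([clf(X) for clf in learners])            # shape (3, 200)
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)   # majority vote

acc_individual = [(clf(X) == y).mean() for clf in learners]
acc_ensemble = (ensemble_pred == y).mean()
print(acc_individual, acc_ensemble)
```

Each stump alone is only weakly correlated with the label; combining their votes typically yields a noticeably stronger classifier, which is the core intuition behind bagging- and boosting-style ensembles.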

Most importantly, these hybrid and ensemble-based approaches in machine learning are on the rise, and our technical team always stays updated on such novelties. So we are highly capable of providing you with the best support on all thesis topics for machine learning. Let us now look into the metrics used in analyzing the performance of machine learning models

Performance analysis of machine learning

The confusion matrix is prominently used for analyzing machine learning models. The following are the fundamental terms associated with the machine learning confusion matrix

  • False positives and false negatives, where the actual and predicted classes contradict each other
  • True negatives: correct prediction of negative values, with ‘no’ results in both the actual and predicted classes
  • True positives: correct prediction of positive values, with ‘yes’ results in both the actual and predicted classes

Using these fundamental parameters the essential values for calculation of efficiency and performance of the machine learning models are obtained as follows.

  • Precision is the ratio between the number of correctly predicted positives and the total number of predicted positives
  • Recall is the ratio of correctly predicted positives to all actual positives (the true positive rate)
  • F1 Score is the harmonic mean of recall and precision, and hence takes into account both false positives and false negatives
  • With an uneven distribution of classes, the F1 Score is a better measure to evaluate than accuracy, which we discuss below
  • Accuracy can be considered in cases where false positives and false negatives have similar costs
  • For different cost values of false positives and false negatives, it is recommended to use recall and precision for performance evaluation
  • Accuracy is the ratio between correct predictions and the total number of observations
  • Accuracy is also considered one of the most important and intuitive measures for analyzing machine learning system performance
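These quantities follow directly from the four confusion-matrix counts; the counts below are hypothetical:

```python
# Hypothetical confusion-matrix counts for a binary classifier
tp, fp, fn, tn = 40, 10, 5, 45

precision = tp / (tp + fp)                          # correct positives / predicted positives
recall = tp / (tp + fn)                             # correct positives / actual positives
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
accuracy = (tp + tn) / (tp + fp + fn + tn)          # correct predictions / all observations

print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f} accuracy={accuracy:.3f}")
```

With heavily skewed classes, a large tn count can inflate accuracy even when tp is poor, which is why F1 is preferred for uneven class distributions.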

It is significant to note here that, for thesis topics for machine learning, our experts have produced excellent results on all these performance metrics. Contact our experts’ team for more details on the approaches used to produce such outcomes. We work 24/7 to assist you.



master-thesis

Here are 95 public repositories matching this topic.

johnmartinsson / bird-species-classification

Using convolutional neural networks to build and train a bird species classifier on bird song data with corresponding species labels.

  • Updated Oct 11, 2023

Tim-HW / HW-BlueRov2-Sonar-based-SLAM

This project will evaluate simultaneous localisation and mapping (SLAM) algorithms for fusing sonar with DVL and IMU to produce maps for autonomous underwater vehicle (AUV) navigation for underwater ROV

  • Updated Jan 24, 2024

meurissemax / autonomous-drone

Master's thesis about autonomous navigation of a drone in indoor environments carried out to obtain the degree of Master of Science in Computer Science Engineering (University of Liège, academic year 2020-2021).

  • Updated Jul 13, 2021

dpuljic01 / financial-dashboard

Masters Thesis - Fintech Dashboard

  • Updated Sep 10, 2023

LasseRegin / master-thesis-deep-learning

Code for my master thesis in Deep Learning: "Generating answers to medical questions using recurrent neural networks"

  • Updated Jul 16, 2017

harshildarji / thesis

Master's thesis, Uni Passau

  • Updated Mar 21, 2022

RealityNet / McAFuse

Toolset to analyze disks encrypted with McAFee FDE technology

  • Updated Mar 11, 2021

Boren / MasterThesis

Deep Convolutional Neural Networks for Semantic Segmentation of Multi-Band Satellite Images

  • Updated May 30, 2018

thomasSve / Msc_Multi_label_ZeroShot

Code for master thesis on Zero-Shot Learning in multi-label scenarios

  • Updated Mar 28, 2018

kdevo / chaos-rrs

Chaos - a first of its kind framework for researching Reciprocal Recommender Systems (RRS).

  • Updated Nov 7, 2021

Josef-Djarf / sEMG-Sim

Source code for multiple parameter modelling of synthetic electromyography data.

  • Updated Feb 21, 2024

JanPokorny / speed-climbing-mapping

Mapping videos of speed climbers onto a virtual wall using ML, OpenCV, and maths. Implementation of my master's thesis.

  • Updated Jul 20, 2023

danielathome19 / Form-NN

Master thesis project - a hybrid Neural Network-Decision Tree system and dataset for classical music form recognition and analysis.

  • Updated Jul 29, 2024

KyleOng / starreco

State-of-The-Art Rating-based RECOmmendation system: pytorch lightning implementation

  • Updated Oct 10, 2023

lukaselmer / hierarchical-paragraph-vectors

Hierarchical Paragraph Vectors

  • Updated Sep 21, 2015

rand-asswad / muallef

Study of Music Information Retrieval (MIR) methods for multi-pitch estimation and onset detection.

  • Updated Dec 8, 2022

lcebear / memoryDialogueBot

Master Thesis Project: A memory based dialogue agent

  • Updated Dec 20, 2020

develooper1994 / MasterThesis

My Master Thesis experimentation source codes

  • Updated Apr 6, 2021

EivindArvesen / master_code

Various code from my master's project

  • Updated Jan 9, 2019

jrmak / FNNR-ABM-Primate

An agent-based model (with a web simulation) for Guizhou "golden" monkey population and movements using the Mesa Python framework; thesis project + human/GTGP expansion; Summer 2018-Winter 2019

  • Updated Mar 22, 2020



Machine Learning

The broad goal of machine learning is to automate the decision-making process, so that computer-automated predictions can make a task more efficient, accurate, or cost-effective than it would be using only human decision making.

Carnegie Mellon is widely regarded as one of the world’s leading centers for machine learning research, and the scope of our machine learning research is broad. Our current research addresses learning in games, where there are multiple learners with different interests; semi-supervised learning; astrostatistics; intrusion detection; and structured prediction.

Our research is distinguished by its serious focus on applications and real systems. A notable example from machine learning is research that has led to a system for early detection of disease outbreaks. Carnegie Mellon has also received ongoing recognition for its robotic soccer research program, which provides a rich environment for machine learning that “improves with experience,” involving problem solving in complex domains with multiple agents, dynamic environments, the need for learning from feedback, real-time planning, and many other artificial intelligence issues.


Doctoral Thesis: From Data, to Models, and Back: Making ML “Predictably Reliable”

Kiva (32-G449)

By: Andrew Ilyas

Thesis Supervisors: Costis Daskalakis, Aleksander Madry

  • Date: Friday, August 23
  • Time: 2:30 pm - 4:00 pm
  • Category: Thesis Defense
  • Location: Kiva (32-G449)


Abstract: Despite their impressive performance, training and deploying ML models is currently a somewhat messy affair. But does it have to be? In this defense, I’ll discuss some of my research on making ML “predictably reliable”—enabling developers to know when their models will work, when they will fail, and why. To begin, we use a case study of adversarial examples to show that human intuition can be a poor predictor of how ML models operate. Motivated by this, we present a few lines of work that aim to develop a precise understanding of the entire ML pipeline: from how we source data, to the datasets we train on, to the learning algorithms to use.


Title: Dilated Convolution with Learnable Spacings

Abstract: This thesis presents and evaluates the Dilated Convolution with Learnable Spacings (DCLS) method. Through various supervised learning experiments in the fields of computer vision, audio, and speech processing, the DCLS method proves to outperform both standard and advanced convolution techniques. The research is organized into several steps, starting with an analysis of the literature and existing convolution techniques that preceded the development of the DCLS method. We were particularly interested in the methods that are closely related to our own and that remain essential to capture the nuances and uniqueness of our approach. The cornerstone of our study is the introduction and application of the DCLS method to convolutional neural networks (CNNs), as well as to hybrid architectures that rely on both convolutional and visual attention approaches. DCLS is shown to be particularly effective in tasks such as classification, semantic segmentation, and object detection. Initially using bilinear interpolation, the study also explores other interpolation methods, finding that Gaussian interpolation slightly improves performance. The DCLS method is further applied to spiking neural networks (SNNs) to enable synaptic delay learning within a neural network that could eventually be transferred to so-called neuromorphic chips. The results show that the DCLS method stands out as a new state-of-the-art technique in SNN audio classification for certain benchmark tasks in this field. These tasks involve datasets with a high temporal component. In addition, we show that DCLS can significantly improve the accuracy of artificial neural networks for the multi-label audio classification task. We conclude with a discussion of the chosen experimental setup, its limitations, the limitations of our method, and our results.
Comments: PhD Thesis
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)




Machine Learning Methods in Weather and Climate Applications: A Survey


1. Introduction

  • Limited Scope: Existing surveys predominantly focus either on short-term weather forecasting or medium-to-long-term climate predictions. There is a notable absence of comprehensive surveys that endeavour to bridge these two time scales. In addition, current investigations tend to focus narrowly on specific methods, such as simple neural networks, thereby neglecting combinations of methods.
  • Lack of Model Details: Many existing studies offer only generalized viewpoints and lack a systematic analysis of the specific models employed in weather and climate prediction. This absence creates a barrier for researchers aiming to understand the intricacies and efficacy of individual methods.
  • Neglect of Recent Advances: Despite rapid developments in machine learning and computational techniques, existing surveys have not kept pace with these advancements. The paucity of information on cutting-edge technologies stymies the progression of research in this interdisciplinary field.
  • Comprehensive scope: Unlike research endeavors that restrict their inquiry to a singular temporal scale, our survey provides a comprehensive analysis that amalgamates short-term weather forecasting with medium- and long-term climate predictions. In total, 20 models were surveyed, of which a select subset of eight were chosen for in-depth scrutiny. These models are discerned as the industry’s avant-garde, thereby serving as invaluable references for researchers. For instance, the PanGu model exhibits remarkable congruence with actual observational results, thereby illustrating the caliber of the models included in our analysis.
  • In-Depth Analysis: Breaking new ground, this study delves into the intricate operational mechanisms of the eight focal models. We have dissected the operating mechanisms of these eight models, distinguishing the differences in their approaches and summarizing the commonalities in their methods through comparison. This comparison helps readers gain a deeper understanding of the efficacy and applicability of each model and provides a reference for choosing the most appropriate model for a given scenario.
  • Identification of Contemporary Challenges and Future Work: The survey identifies pressing challenges currently facing the field, such as the limited dataset of chronological seasons and complex climate change effects, and suggests directions for future work, including simulating datasets and physics-based constraint models. These recommendations not only add a forward-looking dimension to our research but also act as a catalyst for further research and development in climate prediction.

2. Background

3. Related Work

3.1. Statistical Methods

3.2. Physical Models

4. Taxonomy of Climate Prediction Applications

4.1. Climate Prediction Milestones Based on Machine Learning

4.2. Classification of Climate Prediction Methods

5. Short-Term Weather Forecast

5.1. Model Design

  • The Navier–Stokes Equations [73]: Serving as the quintessential descriptors of fluid motion, these equations delineate the fundamental mechanics underlying atmospheric flow.

    $\nabla \cdot \mathbf{v} = 0$ (3)

    $\rho \left( \dfrac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v} \right) = -\nabla p + \mu \nabla^2 \mathbf{v} + \rho \mathbf{g}$ (4)

  • The Thermodynamic Equations [74]: These equations intricately interrelate the temperature, pressure, and humidity within the atmospheric matrix, offering insights into the state and transitions of atmospheric energy.

    $\dfrac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0$ (Continuity equation) (5)

    $\dfrac{\partial T}{\partial t} + \mathbf{v} \cdot \nabla T = \dfrac{q}{c_p}$ (Energy equation) (6)

    $\dfrac{Dp}{Dt} = -\rho c_p \, \nabla \cdot \mathbf{v}$ (Pressure equation) (7)
  • The Cloud Microphysics Parameterization Scheme is instrumental for simulating the life cycles of cloud droplets and ice crystals, thereby affecting precipitation [75,76] and the atmospheric energy balance.
  • Shortwave and Longwave Radiation Transfer Equations elucidate the absorption, scattering, and emission of both solar and terrestrial radiation, which in turn influence atmospheric temperature and dynamics.
  • Empirical or Semi-Empirical Convection Parameterization Schemes simulate vertical atmospheric motions initiated by local instabilities, facilitating the capture of weather phenomena like thunderstorms.
  • Boundary-Layer Dynamics concentrates on the exchanges of momentum, energy, and matter between the Earth’s surface and the atmosphere which are crucial for the accurate representation of surface conditions in the model.
  • Land Surface and Soil/Ocean Interaction Modules simulate the exchange of energy, moisture, and momentum between the surface and the atmosphere, while also accounting for terrestrial and aquatic influences on atmospheric conditions.
  • Encoder: The encoder component maps the local region of the input data (on the original latitude-longitude grid) onto the nodes of the multigrid graphical representation. It maps two consecutive input frames of the latitude-longitude input grid, with numerous variables per grid point, into a multi-scale internal mesh representation. This mapping process helps the model better capture and understand spatial dependencies in the data, allowing for more accurate predictions of future weather conditions.
  • Processor: This part performs several rounds of message-passing on the multi-mesh, where the edges can span short or long ranges, facilitating efficient communication without necessitating an explicit hierarchy. More specifically, the section uses a multi-mesh graph representation. It refers to a special graph structure that is able to represent the spatial structure of the Earth’s surface in an efficient way. In a multi-mesh graph representation, nodes may represent specific regions of the Earth’s surface, while edges may represent spatial relationships between these regions. In this way, models can capture spatial dependencies on a global scale and are able to utilize the power of GNNs to analyze and predict weather changes.
  • Decoder: It then maps the multi-mesh representation back to the latitude-longitude grid as a prediction for the next time step.
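The encode-process-decode pattern described above can be caricatured in a few lines of NumPy; the tiny graph, the grid-to-mesh assignment, and the tanh message function below are simplified stand-ins, not the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

n_grid, n_mesh, d = 6, 3, 4
grid_feats = rng.normal(size=(n_grid, d))   # hypothetical per-grid-point features

# Encoder: each mesh node aggregates the grid points assigned to it
assign = np.array([0, 0, 1, 1, 2, 2])       # grid point -> mesh node (2 points per node)
mesh = np.zeros((n_mesh, d))
for g, m in enumerate(assign):
    mesh[m] += grid_feats[g]
mesh /= 2                                   # mean over the 2 grid points per node

# Processor: one round of message passing over mesh edges (a small ring here)
edges = [(0, 1), (1, 2), (2, 0), (1, 0), (2, 1), (0, 2)]
messages = np.zeros_like(mesh)
for src, dst in edges:
    messages[dst] += np.tanh(mesh[src])     # message = nonlinearity of the sender state
mesh = mesh + messages                      # residual update of node states

# Decoder: map the mesh state back to the grid as the next-step "prediction"
pred = mesh[assign]
print(pred.shape)
```

In the real models the encoder, message functions, and decoder are learned neural networks and the mesh is a multi-scale icosahedral graph; this sketch only shows how information flows grid → mesh → grid.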

5.2. Result Analysis

6. Medium-to-Long-Term Climate Prediction

6.1. Model Design

  • Problem Definition: The goal is to approximate $p(Y \mid X, M)$, a task challenged by high-dimensional geospatial data, data inhomogeneity, and a large dataset.
  • Random Variable $z$: A latent variable with a fixed standard Gaussian distribution.
  • Parametric Functions $p_\theta$, $q_\phi$, $p_\psi$: Neural networks for transforming $z$ and approximating the target and posterior distributions.
  • Objective Function: Maximization of the Evidence Lower Bound (ELBO).
  • Initialize: Define the random variable $z \sim \mathcal{N}(0, 1)$ [96,97] and the parametric functions $p_\theta(z, X, M)$, $q_\phi(z \mid X, Y, M)$, $p_\psi(Y \mid X, M, z)$.
  • Training Objective (Maximize ELBO) [98]: The ELBO is defined as $\mathrm{ELBO} = \mathbb{E}_{z \sim q_\phi} \log p_\psi(Y \mid X, M, z) - D_{\mathrm{KL}}(q_\phi \,\|\, p(z \mid X, M)) - D_{\mathrm{KL}}(q_\phi \,\|\, p(z \mid X, Y, M))$ (8), with terms for reconstruction, regularization, and residual error.
  • Optimization: Utilize variational inference, Monte Carlo reparameterization, and Gaussian assumptions.
  • Forecasting: Generate forecasts by sampling $p(z \mid X, M)$ and the likelihood $p_\psi$, using the mean of $p_\psi$ for an average estimate.
  • Two Generators : The CycleGAN model includes two generators. Generator G learns the mapping from the simulated domain to the real domain, and generator F learns the mapping from the real domain to the simulated domain [ 100 ].
  • Two Discriminators : There are two discriminators, one for the real domain and one for the simulated domain. Discriminator D x encourages generator G to generate samples that look similar to samples in the real domain, and discriminator D y encourages generator F to generate samples that look similar to samples in the simulated domain.
  • Cycle Consistency Loss: To ensure that the mappings are consistent, the model enforces the following condition through a cycle consistency loss: a sample mapped from the simulated domain to the real domain and then back to the simulated domain should resemble the original simulated sample, and likewise for samples starting in the real domain. $\mathcal{L}_{\mathrm{cyc}}(G, F) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \| F(G(x)) - x \|_1 + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)} \| G(F(y)) - y \|_1$ (10)
  • Training Process: The model is trained to learn the mapping between the two domains by minimizing the adversarial loss and the cycle consistency loss between the generators and discriminators. $\mathcal{L}_{\mathrm{Gen}}(G, F) = \mathcal{L}_{\mathrm{GAN}}(G, D_y, X, Y) + \mathcal{L}_{\mathrm{GAN}}(F, D_x, Y, X) + \lambda \mathcal{L}_{\mathrm{cyc}}(G, F)$ (11)
  • Application to Prediction : Once trained, these mappings can be used for various tasks, such as transforming simulated precipitation data into forecasts that resemble observed data.
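The cycle consistency term in Eq. (10) is simple enough to sketch directly. In this toy NumPy illustration, G and F are stand-ins for trained generators (here a linear map and its exact inverse, so the loss vanishes):

```python
import numpy as np

def l1(a, b):
    # Expected L1 norm || a - b ||_1, averaged over the batch dimension
    return np.mean(np.sum(np.abs(a - b), axis=-1))

def cycle_consistency_loss(G, F, x_batch, y_batch):
    # Eq. (10): reconstruct each sample after a round trip through both generators
    return l1(F(G(x_batch)), x_batch) + l1(G(F(y_batch)), y_batch)

# Toy check: if F exactly inverts G, the cycle loss is (numerically) zero
G = lambda v: 2.0 * v + 1.0    # simulated -> real (toy map, not a network)
F = lambda v: (v - 1.0) / 2.0  # real -> simulated (its inverse)
x = np.random.default_rng(1).standard_normal((8, 5))
y = np.random.default_rng(2).standard_normal((8, 5))
loss = cycle_consistency_loss(G, F, x, y)
```

In the full model this term is added to the adversarial losses with weight λ, as in Eq. (11).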
  • Reference Model: SPCAM. SPCAM serves as the foundational GCM and is embedded with Cloud-Resolving Models (CRMs) to simulate microscale atmospheric processes like cloud formation and convection. SPCAM is employed to generate “target simulation data”, which serves as the training baseline for the neural networks. The use of CRMs is inspired by recent advancements in data science, demonstrating that machine learning parameterizations can potentially outperform traditional methods in simulating convective and cloud processes.
  • Neural Networks: ResDNNs, a specialized form of deep neural networks, are employed for their ability to approximate complex, nonlinear relationships. The network comprises multiple residual blocks, each containing two fully connected layers with Rectified Linear Unit (ReLU) activations. ResDNNs are designed to address the vanishing and exploding gradient problems in deep networks through residual connections, offering a stable and effective gradient propagation mechanism. This makes them well-suited for capturing the complex and nonlinear nature of atmospheric processes.
  • Subgrid-Scale Physical Simulator. Traditional parameterizations often employ simplified equations to model subgrid-scale processes, which might lack accuracy. In contrast, the ResDNNs are organized into a subgrid-scale physical simulator that operates independently within each model grid cell. This simulator takes atmospheric states as inputs and outputs physical quantities at the subgrid scale, such as cloud fraction and precipitation rate.
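A single residual block of the kind described can be sketched as follows; the layer sizes and the plain identity shortcut are illustrative assumptions, not the exact NNCAM architecture:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, b1, W2, b2):
    # Two fully connected layers with a ReLU activation plus an identity
    # shortcut; the skip connection keeps gradients flowing in deep stacks.
    h = relu(x @ W1 + b1)
    return x + (h @ W2 + b2)

# With all-zero weights the block reduces to the identity mapping
d = 6
x = np.linspace(-1.0, 1.0, d)
zero_W, zero_b = np.zeros((d, d)), np.zeros(d)
out = residual_block(x, zero_W, zero_b, zero_W, zero_b)
```

Stacking such blocks yields a ResDNN whose output can serve as the subgrid-scale tendency estimate for each grid cell.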

6.2. Result Analysis

7. Discussion

7.1. Overall Comparison

7.2. Challenge

7.3. Future Work

  • Simulate datasets using statistical or physical methods.
  • Combine statistical knowledge with machine learning methods to enhance the interpretability of learned patterns.
  • Introduce physics-based constraints into deep learning models to produce more accurate and reliable results.
  • Accelerate physical model prediction with machine learning.

8. Conclusions

Abbreviations

v: velocity vector
t: time
ρ: fluid density
p: pressure
μ: dynamic viscosity
g: gravitational acceleration vector
E_{q_ϕ}: expectation under the variational distribution
z: latent variable
X, Y: observed data
p(X, Y, z): joint distribution of observed and latent variables
q_ϕ: variational distribution
G, F: generators for the mappings from the simulated to the real domain and vice versa
D_X, D_Y: discriminators for the real and simulated domains
L_cyc, L_GAN: cycle consistency loss and generative adversarial network loss
X, Y: data distributions for the simulated and real domains
λ: weighting factor for the cycle consistency loss
  • Abbe, C. The physical basis of long-range weather. Mon. Weather Rev. 1901 , 29 , 551–561. [ Google Scholar ] [ CrossRef ]
  • Zheng, Y.; Capra, L.; Wolfson, O.; Yang, H. Urban computing: Concepts, methodologies, and applications. Acm Trans. Intell. Syst. Technol. TIST 2014 , 5 , 1–55. [ Google Scholar ]
  • Gneiting, T.; Raftery, A.E. Weather forecasting with ensemble methods. Science 2005 , 310 , 248–249. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Agapiou, A. Remote sensing heritage in a petabyte-scale: Satellite data and heritage Earth Engine applications. Int. J. Digit. Earth 2017 , 10 , 85–102. [ Google Scholar ] [ CrossRef ]
  • Bendre, M.R.; Thool, R.C.; Thool, V.R. Big data in precision agriculture: Weather forecasting for future farming. In Proceedings of the 2015 1st International Conference on Next Generation Computing Technologies (NGCT), Dehradun, India, 4–5 September 2015; pp. 744–750. [ Google Scholar ]
  • Zavala, V.M.; Constantinescu, E.M.; Krause, T. On-line economic optimization of energy systems using weather forecast information. J. Process Control 2009 , 19 , 1725–1736. [ Google Scholar ] [ CrossRef ]
  • Nurmi, V.; Perrels, A.; Nurmi, P.; Michaelides, S.; Athanasatos, S.; Papadakis, M. Economic value of weather forecasts on transportation–Impacts of weather forecast quality developments to the economic effects of severe weather. EWENT FP7 Project . 2012, Volume 490. Available online: http://virtual.vtt.fi/virtual/ewent/Deliverables/D5/D5_2_16_02_2012_revised_final.pdf (accessed on 8 September 2023).
  • Russo, J.A., Jr. The economic impact of weather on the construction industry of the United States. Bull. Am. Meteorol. Soc. 1966 , 47 , 967–972. [ Google Scholar ] [ CrossRef ]
  • Badorf, F.; Hoberg, K. The impact of daily weather on retail sales: An empirical study in brick-and-mortar stores. J. Retail. Consum. Serv. 2020 , 52 , 101921. [ Google Scholar ] [ CrossRef ]
  • De Freitas, C.R. Tourism climatology: Evaluating environmental information for decision making and business planning in the recreation and tourism sector. Int. J. Biometeorol. 2003 , 48 , 45–54. [ Google Scholar ] [ CrossRef ]
  • Smith, K. Environmental Hazards: Assessing Risk and Reducing Disaster ; Routledge: London, UK, 2013. [ Google Scholar ]
  • Hammer, G.L.; Hansen, J.W.; Phillips, J.G.; Mjelde, J.W.; Hill, H.; Love, A.; Potgieter, A. Advances in application of climate prediction in agriculture. Agric. Syst. 2001 , 70 , 515–553. [ Google Scholar ] [ CrossRef ]
  • Guedes, G.; Raad, R.; Raad, L. Welfare consequences of persistent climate prediction errors on insurance markets against natural hazards. Estud. Econ. Sao Paulo 2019 , 49 , 235–264. [ Google Scholar ] [ CrossRef ]
  • McNamara, D.E.; Keeler, A. A coupled physical and economic model of the response of coastal real estate to climate risk. Nat. Clim. Chang. 2013 , 3 , 559–562. [ Google Scholar ] [ CrossRef ]
  • Kleerekoper, L.; Esch, M.V.; Salcedo, T.B. How to make a city climate-proof, addressing the urban heat island effect. Resour. Conserv. Recycl. 2012 , 64 , 30–38. [ Google Scholar ] [ CrossRef ]
  • Kaján, E.; Saarinen, J. Tourism, climate change and adaptation: A review. Curr. Issues Tour. 2013 , 16 , 167–195. [ Google Scholar ]
  • Dessai, S.; Hulme, M.; Lempert, R.; Pielke, R., Jr. Climate prediction: A limit to adaptation. Adapt. Clim. Chang. Threshold. Values Gov. 2009 , 64 , 78. [ Google Scholar ]
  • Ham, Y.-G.; Kim, J.-H.; Luo, J.-J. Deep Learning for Multi-Year ENSO Forecasts. Nature 2019 , 573 , 568–572. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Howe, L.; Wain, A. Predicting the Future ; Cambridge University Press: Cambridge, UK, 1993; Volume V, pp. 1–195. [ Google Scholar ]
  • Hantson, S.; Arneth, A.; Harrison, S.P.; Kelley, D.I.; Prentice, I.C.; Rabin, S.S.; Archibald, S.; Mouillot, F.; Arnold, S.R.; Artaxo, P.; et al. The status and challenge of global fire modelling. Biogeosciences 2016 , 13 , 3359–3375. [ Google Scholar ]
  • Racah, E.; Beckham, C.; Maharaj, T.; Ebrahimi Kahou, S.; Prabhat, M.; Pal, C. ExtremeWeather: A large-scale climate dataset for semi-supervised detection, localization, and understanding of extreme weather events. Adv. Neural Inf. Process. Syst. 2017 , 30 , 3402–3413. [ Google Scholar ]
  • Gao, S.; Zhao, P.; Pan, B.; Li, Y.; Zhou, M.; Xu, J.; Zhong, S.; Shi, Z. A nowcasting model for the prediction of typhoon tracks based on a long short term memory neural network. Acta Oceanol. Sin. 2018 , 37 , 8–12. [ Google Scholar ]
  • Ren, X.; Li, X.; Ren, K.; Song, J.; Xu, Z.; Deng, K.; Wang, X. Deep Learning-Based Weather Prediction: A Survey. Big Data Res. 2021 , 23 , 100178. [ Google Scholar ]
  • Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep learning and process understanding for data-driven Earth system science. Nature 2019 , 566 , 195–204. [ Google Scholar ] [ CrossRef ]
  • Stockhause, M.; Lautenschlager, M. CMIP6 data citation of evolving data. Data Sci. J. 2017 , 16 , 30. [ Google Scholar ] [ CrossRef ]
  • Hsieh, W.W. Machine Learning Methods in the Environmental Sciences: Neural Networks and Kernels ; Cambridge University Press: Cambridge, UK, 2009. [ Google Scholar ]
  • Krasnopolsky, V.M.; Fox-Rabinovitz, M.S.; Chalikov, D.V. New Approach to Calculation of Atmospheric Model Physics: Accurate and Fast Neural Network Emulation of Longwave Radiation in a Climate Model. Mon. Weather Rev. 2005 , 133 , 1370–1383. [ Google Scholar ] [ CrossRef ]
  • Krasnopolsky, V.M.; Fox-Rabinovitz, M.S.; Belochitski, A.A. Using ensemble of neural networks to learn stochastic convection parameterizations for climate and numerical weather prediction models from data simulated by a cloud resolving model. Adv. Artif. Neural Syst. 2013 , 2013 , 485913. [ Google Scholar ] [ CrossRef ]
  • Chevallier, F.; Morcrette, J.-J.; Chéruy, F.; Scott, N.A. Use of a neural-network-based long-wave radiative-transfer scheme in the ECMWF atmospheric model. Q. J. R. Meteorol. Soc. 2000 , 126 , 761–776. [ Google Scholar ]
  • Krasnopolsky, V.M.; Fox-Rabinovitz, M.S.; Hou, Y.T.; Lord, S.J.; Belochitski, A.A. Accurate and fast neural network emulations of model radiation for the NCEP coupled climate forecast system: Climate simulations and seasonal predictions. Mon. Weather Rev. 2010 , 138 , 1822–1842. [ Google Scholar ] [ CrossRef ]
  • Tolman, H.L.; Krasnopolsky, V.M.; Chalikov, D.V. Neural network approximations for nonlinear interactions in wind wave spectra: Direct mapping for wind seas in deep water. Ocean. Model. 2005 , 8 , 253–278. [ Google Scholar ] [ CrossRef ]
  • Markakis, E.; Papadopoulos, A.; Perakakis, P. Spatiotemporal Forecasting: A Survey. arXiv 2018 , arXiv:1808.06571. [ Google Scholar ]
  • Box, G.E.; Jenkins, G.M.; Reinsel, G.C.; Ljung, G.M. Time Series Analysis: Forecasting and Control ; John Wiley & Sons: Hoboken, NJ, USA, 2015. [ Google Scholar ]
  • He, Y.; Kolovos, A. Spatial and Spatio-Temporal Geostatistical Modeling and Kriging. In Wiley StatsRef: Statistics Reference Online ; John Wiley & Sons: Hoboken, NJ, USA, 2015. [ Google Scholar ]
  • Lu, H.; Fan, Z.; Zhu, H. Spatiotemporal Analysis of Air Quality and Its Application in LASG/IAP Climate System Model. Atmos. Ocean. Sci. Lett. 2011 , 4 , 204–210. [ Google Scholar ]
  • Chatfield, C. The Analysis of Time Series: An Introduction , 7th ed.; CRC Press: Boca Raton, FL, USA, 2016. [ Google Scholar ]
  • Stull, R. Meteorology for Scientists and Engineers , 3rd ed.; Brooks/Cole: Pacific Grove, CA, USA, 2015. [ Google Scholar ]
  • Yuval, J.; O’Gorman, P.A. Machine Learning for Parameterization of Moist Convection in the Community Atmosphere Model. Proc. Natl. Acad. Sci. USA 2020 , 117 , 12–20. [ Google Scholar ]
  • Gagne, D.J.; Haupt, S.E.; Nychka, D.W. Machine Learning for Spatial Environmental Data. Meteorol. Monogr. 2020 , 59 , 9.1–9.36. [ Google Scholar ]
  • Xu, Z.; Li, Y.; Guo, Q.; Shi, X.; Zhu, Y. A Multi-Model Deep Learning Ensemble Method for Rainfall Prediction. J. Hydrol. 2020 , 584 , 124579. [ Google Scholar ]
  • Kuligowski, R.J.; Barros, A.P. Localized precipitation forecasts from a numerical weather prediction model using artificial neural networks. Weather. Forecast. 1998 , 13 , 1194–1204. [ Google Scholar ] [ CrossRef ]
  • Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. arXiv 2015 , arXiv:1506.04214. [ Google Scholar ]
  • Qiu, M.; Zhao, P.; Zhang, K.; Huang, J.; Shi, X.; Wang, X.; Chu, W. A short-term rainfall prediction model using multi-task convolutional neural networks. In Proceedings of the 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017; IEEE: New York, NY, USA, 2017; pp. 395–404. [ Google Scholar ]
  • Karevan, Z.; Suykens, J.A. Spatio-temporal stacked lstm for temperature prediction in weather forecasting. arXiv 2018 , arXiv:1811.06341. [ Google Scholar ]
  • Chattopadhyay, A.; Nabizadeh, E.; Hassanzadeh, P. Analog Forecasting of extreme-causing weather patterns using deep learning. J. Adv. Model. Earth Syst. 2020 , 12 , e2019MS001958. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Sønderby, C.K.; Espeholt, L.; Heek, J.; Dehghani, M.; Oliver, A.; Salimans, T.; Kalchbrenner, N. MetNet: A Neural Weather Model for Precipitation Forecasting. arXiv 2020 , arXiv:2003.12140. [ Google Scholar ]
  • Pathak, J.; Subramanian, S.; Harrington, P.; Raja, S.; Chattopadhyay, A.; Mardani, M.; Anandkumar, A. FourCastNet: A Global Data-Driven High-Resolution Weather Model Using Adaptive Fourier Neural Operators. arXiv 2022 , arXiv:2202.11214. [ Google Scholar ]
  • Lam, R.; Sanchez-Gonzalez, A.; Willson, M.; Wirnsberger, P.; Fortunato, M.; Pritzel, A.; Battaglia, P. GraphCast: Learning skillful medium-range global weather forecasting. arXiv 2022 , arXiv:2212.12794. [ Google Scholar ]
  • Bi, K.; Xie, L.; Zhang, H.; Chen, X.; Gu, X.; Tian, Q. Accurate Medium-Range Global Weather Forecasting with 3D Neural Networks. Nature 2023 , 619 , 533–538. [ Google Scholar ] [ CrossRef ]
  • Nguyen, T.; Brandstetter, J.; Kapoor, A.; Gupta, J.K.; Grover, A. ClimaX: A foundation model for weather and climate. arXiv 2023 , arXiv:2301.10343. [ Google Scholar ]
  • Gangopadhyay, S.; Clark, M.; Rajagopalan, B. Statistical Down-scaling using K-nearest neighbors. In Water Resources Research ; Wiley Online Library: Hoboken, NJ, USA, 2005; Volume 41. [ Google Scholar ]
  • Tripathi, S.; Srinivas, V.V.; Nanjundiah, R.S. Down-scaling of precipitation for climate change scenarios: A support vector machine approach. J. Hydrol. 2006 , 330 , 621–640. [ Google Scholar ] [ CrossRef ]
  • Krasnopolsky, V.M.; Fox-Rabinovitz, M.S. Complex hybrid models combining deterministic and machine learning components for numerical climate modeling and weather prediction. Neural Netw. 2006 , 19 , 122–134. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Raje, D.; Mujumdar, P.P. A conditional random field–based Down-scaling method for assessment of climate change impact on multisite daily precipitation in the Mahanadi basin. In Water Resources Research ; Wiley Online Library: Hoboken, NJ, USA, 2009; Volume 45. [ Google Scholar ]
  • Zarei, M.; Najarchi, M.; Mastouri, R. Bias correction of global ensemble precipitation forecasts by Random Forest method. Earth Sci. Inform. 2021 , 14 , 677–689. [ Google Scholar ] [ CrossRef ]
  • Andersson, T.R.; Hosking, J.S.; Pérez-Ortiz, M.; Paige, B.; Elliott, A.; Russell, C.; Law, S.; Jones, D.C.; Wilkinson, J.; Phillips, T.; et al. Seasonal Arctic Sea Ice Forecasting with Probabilistic Deep Learning. Nat. Commun. 2021 , 12 , 5124. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Wang, X.; Han, Y.; Xue, W.; Yang, G.; Zhang, G. Stable climate simulations using a realistic general circulation model with neural network parameterizations for atmospheric moist physics and radiation processes. Geosci. Model Dev. 2022 , 15 , 3923–3940. [ Google Scholar ] [ CrossRef ]
  • Baño-Medina, J.; Manzanas, R.; Cimadevilla, E.; Fernández, J.; González-Abad, J.; Cofiño, A.S.; Gutiérrez, J.M. Down-scaling Multi-Model Climate Projection Ensembles with Deep Learning (DeepESD): Contribution to CORDEX EUR-44. Geosci. Model Dev. 2022 , 15 , 6747–6758. [ Google Scholar ] [ CrossRef ]
  • Hess, P.; Lange, S.; Boers, N. Deep Learning for bias-correcting comprehensive high-resolution Earth system models. arXiv 2022 , arXiv:2301.01253. [ Google Scholar ]
  • Wang, F.; Tian, D. On deep learning-based bias correction and Down-scaling of multiple climate models simulations. Clim. Dyn. 2022 , 59 , 3451–3468. [ Google Scholar ] [ CrossRef ]
  • Pan, B.; Anderson, G.J.; Goncalves, A.; Lucas, D.D.; Bonfils, C.J.W.; Lee, J. Improving Seasonal Forecast Using Probabilistic Deep Learning. J. Adv. Model. Earth Syst. 2022 , 14 , e2021MS002766. [ Google Scholar ] [ CrossRef ]
  • Hu, Y.; Chen, L.; Wang, Z.; Li, H. SwinVRNN: A Data-Driven Ensemble Forecasting Model via Learned Distribution Perturbation. J. Adv. Model. Earth Syst. 2023 , 15 , e2022MS003211. [ Google Scholar ] [ CrossRef ]
  • Chen, L.; Zhong, X.; Zhang, F.; Cheng, Y.; Xu, Y.; Qi, Y.; Li, H. FuXi: A cascade machine learning forecasting system for 15-day global weather forecast. arXiv 2023 , arXiv:2306.12873. [ Google Scholar ]
  • Lin, H.; Gao, Z.; Xu, Y.; Wu, L.; Li, L.; Li, S.Z. Conditional local convolution for spatio-temporal meteorological forecasting. Proc. Aaai Conf. Artif. Intell. 2022 , 36 , 7470–7478. [ Google Scholar ] [ CrossRef ]
  • Chen, K.; Han, T.; Gong, J.; Bai, L.; Ling, F.; Luo, J.J.; Chen, X.; Ma, L.; Zhang, T.; Su, R.; et al. FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead. arXiv 2023 , arXiv:2304.02948. [ Google Scholar ]
  • De Burgh-Day, C.O.; Leeuwenburg, T. Machine Learning for numerical weather and climate modelling: A review. EGUsphere 2023 , 2023 , 1–48. [ Google Scholar ]
  • LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998 , 86 , 2278–2324. [ Google Scholar ] [ CrossRef ]
  • Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012 , 25 , 1097–1105. [ Google Scholar ] [ CrossRef ]
  • Scherer, D.; Müller, A.; Behnke, S. Evaluation of pooling operations in convolutional architectures for object recognition. In Proceedings of the International Conference on Artificial Neural Networks 2010, Thessaloniki, Greece, 15–18 September 2010; pp. 92–101. [ Google Scholar ]
  • LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015 , 521 , 436–444. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Liu, Y.; Racah, E.; Correa, J.; Khosrowshahi, A.; Lavers, D.; Kunkel, K.; Wehner, M.; Collins, W. Application of deep convolutional neural networks for detecting extreme weather in climate datasets. arXiv 2016 , arXiv:1605.01156. [ Google Scholar ]
  • Goodfellow, I.; Warde-Farley, D.; Mirza, M.; Courville, A.; Bengio, Y. Maxout networks. In Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; pp. 1319–1327. [ Google Scholar ]
  • Marion, M.; Roger, T. Navier-Stokes equations: Theory and approximation. Handb. Numer. Anal. 1998 , 6 , 503–689. [ Google Scholar ]
  • Iacono, M.J.; Mlawer, E.J.; Clough, S.A.; Morcrette, J.-J. Impact of an improved longwave radiation model, RRTM, on the energy budget and thermodynamic properties of the NCAR community climate model, CCM3. J. Geophys. Res. Atmos. 2000 , 105 , 14873–14890. [ Google Scholar ] [ CrossRef ]
  • Guo, Y.; Shao, C.; Su, A. Comparative Evaluation of Rainfall Forecasts during the Summer of 2020 over Central East China. Atmosphere 2023 , 14 , 992. [ Google Scholar ] [ CrossRef ]
  • Guo, Y.; Shao, C.; Su, A. Investigation of Land–Atmosphere Coupling during the Extreme Rainstorm of 20 July 2021 over Central East China. Atmosphere 2023 , 14 , 1474. [ Google Scholar ] [ CrossRef ]
  • Bauer, P.; Thorpe, A.; Brunet, G. The Quiet Revolution of Numerical Weather Prediction. Nature 2015 , 525 , 47–55. [ Google Scholar ] [ CrossRef ] [ PubMed ]
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All You Need. In Proceedings of the NeurIPS, Long Beach, CA, USA, 4–9 December 2017. [ Google Scholar ]
  • Wang, H.; Zhu, Y.; Green, B.; Adam, H.; Yuille, A.; Chen, L.C. Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation. arXiv 2019 , arXiv:2003.07853. [ Google Scholar ]
  • Schmit, T.J.; Griffith, P.; Gunshor, M.M.; Daniels, J.M.; Goodman, S.J.; Lebair, W.J. A closer look at the ABI on the GOES-R series. Bull. Am. Meteorol. Soc. 2017 , 98 , 681–698. [ Google Scholar ] [ CrossRef ]
  • Li, Z.; Kovachki, N.; Azizzadenesheli, K.; Liu, B.; Bhattacharya, K.; Stuart, A.; Anandkumar, A. Fourier Neural Operator for Parametric Partial Differential Equations. In Proceedings of the International Conference on Learning Representations (ICLR), Virtual Event. 3–7 May 2021. [ Google Scholar ]
  • Guibas, J.; Mardani, M.; Li, Z.; Tao, A.; Anandkumar, A.; Catanzaro, B. Adaptive Fourier Neural Operators: Efficient token mixers for transformers. In Proceedings of the International Conference on Representation Learning, Virtual Event. 25–29 April 2022. [ Google Scholar ]
  • Rasp, S.; Thuerey, N. Purely data-driven medium-range weather forecasting achieves comparable skill to physical models at similar resolution. arXiv 2020 , arXiv:2008.08626. [ Google Scholar ]
  • Weyn, J.A.; Durran, D.R.; Caruana, R.; Cresswell-Clay, N. Sub-seasonal forecasting with a large ensemble of deep-learning weather prediction models. arXiv 2021 , arXiv:2102.05107. [ Google Scholar ] [ CrossRef ]
  • Rasp, S.; Dueben, P.D.; Scher, S.; Weyn, J.A.; Mouatadid, S.; Thuerey, N. Weatherbench: A benchmark data set for data-driven weather forecasting. J. Adv. Model. Earth Syst. 2020 , 12 , e2020MS002203. [ Google Scholar ] [ CrossRef ]
  • Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the International Conference on Computer Vision, Virtual. 11–17 October 2021; IEEE: New York, NY, USA, 2021; pp. 10012–10022. [ Google Scholar ]
  • Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020 , arXiv:2010.11929. [ Google Scholar ]
  • Váňa, F.; Düben, P.; Lang, S.; Palmer, T.; Leutbecher, M.; Salmond, D.; Carver, G. Single precision in weather forecasting models: An evaluation with the IFS. Mon. Weather Rev. 2017 , 145 , 495–502. [ Google Scholar ] [ CrossRef ]
  • IPCC. Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change ; Cambridge University Press: Cambridge, UK; New York, NY, USA, 2013. [ Google Scholar ]
  • Flato, G.; Marotzke, J.; Abiodun, B.; Braconnot, P.; Chou, S.C.; Collins, W.; Cox, P.; Driouech, F.; Emori, S.; Eyring, V.; et al. Evaluation of Climate Models. In Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change ; Cambridge University Press: Cambridge, UK; New York, NY, USA, 2013. [ Google Scholar ]
  • Washington, W.M.; Parkinson, C.L. An Introduction to Three-Dimensional Climate Modeling ; University Science Books: Beijing, China, 2005. [ Google Scholar ]
  • Giorgi, F.; Gutowski, W.J. Regional Dynamical Down-scaling and the CORDEX Initiative. Annu. Rev. Environ. Resour. 2015 , 40 , 467–490. [ Google Scholar ] [ CrossRef ]
  • Randall, D.A.; Wood, R.A.; Bony, S.; Colman, R.; Fichefet, T.; Fyfe, J.; Kattsov, V.; Pitman, A.; Shukla, J.; Srinivasan, J.; et al. Climate Models and Their Evaluation. In Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change ; Cambridge University Press: Cambridge, UK; New York, NY, USA, 2007. [ Google Scholar ]
  • Taylor, K.E.; Stouffer, R.J.; Meehl, G.A. An overview of CMIP5 and the experiment design. Bull. Am. Meteorol. Soc. 2012 , 93 , 485–498. [ Google Scholar ] [ CrossRef ]
  • Miao, C.; Shen, Y.; Sun, J. Spatial–temporal ensemble forecasting (STEFS) of high-resolution temperature using machine learning models. J. Adv. Model. Earth Syst. 2019 , 11 , 2961–2973. [ Google Scholar ]
  • Mukkavilli, S.; Perone, C.S.; Rangapuram, S.S.; Müller, K.R. Distribution regression forests for probabilistic spatio-temporal forecasting. In Proceedings of the International Conference on Machine Learning (ICML), Vienna, Austria, 12–18 July 2020. [ Google Scholar ]
  • Walker, G.; Charlton-Perez, A.; Lee, R.; Inness, P. Challenges and progress in probabilistic forecasting of convective phenomena: The 2016 GFE/EUMETSAT/NCEP/SPC severe convective weather workshop. Bull. Am. Meteorol. Soc. 2016 , 97 , 1829–1835. [ Google Scholar ]
  • Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013 , arXiv:1312.6114. [ Google Scholar ]
  • Krasting, J.P.; John, J.G.; Blanton, C.; McHugh, C.; Nikonov, S.; Radhakrishnan, A.; Zhao, M. NOAA-GFDL GFDL-ESM4 model output prepared for CMIP6 CMIP. Earth Syst. Grid Fed. 2018 , 10 . [ Google Scholar ] [ CrossRef ]
  • Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [ Google Scholar ]
  • Brands, S.; Herrera, S.; Fernández, J.; Gutiérrez, J.M. How well do CMIP5 Earth System Models simulate present climate conditions in Europe and Africa? Clim. Dynam. 2013 , 41 , 803–817. [ Google Scholar ] [ CrossRef ]
  • Vautard, R.; Kadygrov, N.; Iles, C. Evaluation of the large EURO-CORDEX regional climate model ensemble. J. Geophys. Res.-Atmos. 2021 , 126 , e2019JD032344. [ Google Scholar ] [ CrossRef ]
  • Boé, J.; Somot, S.; Corre, L.; Nabat, P. Large discrepancies in summer climate change over Europe as projected by global and regional climate models: Causes and consequences. Clim. Dynam. 2020 , 54 , 2981–3002. [ Google Scholar ] [ CrossRef ]
  • Baño-Medina, J.; Manzanas, R.; Gutiérrez, J.M. Configuration and intercomparison of deep learning neural models for statistical Down-scaling. Geosci. Model Dev. 2020 , 13 , 2109–2124. [ Google Scholar ] [ CrossRef ]
  • Lecun, Y.; Bengio, Y. Convolutional Networks for Images, Speech, and Time-Series. Handb. Brain Theory Neural Netw. 1995 , 336 , 1995. [ Google Scholar ]
  • Dee, D.P.; Uppala, S.M.; Simmons, A.J.; Berrisford, P.; Poli, P.; Kobayashi, S.; Andrae, U.; Balmaseda, M.A.; Balsamo, G.; Bauer, D.P.; et al. The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Q. J. Roy Meteor. Soc. 2011 , 137 , 553–597. [ Google Scholar ] [ CrossRef ]
  • Cornes, R.C.; van der Schrier, G.; van den Besselaar, E.J.M.; Jones, P.D. An Ensemble Version of the E-OBS Temperature and Precipitation Data Sets. J. Geophys. Res.-Atmos. 2018 , 123 , 9391–9409. [ Google Scholar ] [ CrossRef ]
  • Baño-Medina, J.; Manzanas, R.; Gutiérrez, J.M. On the suitability of deep convolutional neural networks for continentalwide Down-scaling of climate change projections. Clim. Dynam. 2021 , 57 , 1–11. [ Google Scholar ] [ CrossRef ]
  • Maraun, D.; Widmann, M.; Gutiérrez, J.M.; Kotlarski, S.; Chandler, R.E.; Hertig, E.; Wibig, J.; Huth, R.; Wilcke, R.A. VALUE: A framework to validate Down-scaling approaches for climate change studies. Earths Future 2015 , 3 , 1–14. [ Google Scholar ] [ CrossRef ]
  • Vrac, M.; Ayar, P. Influence of Bias Correcting Predictors on Statistical Down-scaling Models. J. Appl. Meteorol. Clim. 2016 , 56 , 5–26. [ Google Scholar ] [ CrossRef ]
  • Williams, P.M. Modelling Seasonality and Trends in Daily Rainfall Data. In Advances in Neural Information Processing Systems 10, Proceedings of the Neural Information Processing Systems (NIPS): Denver, Colorado, USA, 1997 ; MIT Press: Cambridge, MA, USA, 1998; pp. 985–991. ISBN 0-262-10076-2. [ Google Scholar ]
  • Cannon, A.J. Probabilistic Multisite Precipitation Down-scaling by an Expanded Bernoulli–Gamma Density Network. J. Hydrometeorol. 2008 , 9 , 1284–1300. [ Google Scholar ] [ CrossRef ]
  • Schoof, J.T.; Pryor, S.C. Down-scaling temperature and precipitation: A comparison of regression-based methods and artificial neural networks. Int. J. Climatol. 2001 , 21 , 773–790. [ Google Scholar ] [ CrossRef ]
  • Maraun, D.; Widmann, M. Statistical Down-Scaling and Bias Correction for Climate Research ; Cambridge University Press: Cambridge, UK, 2018; ISBN 9781107588783. [ Google Scholar ]
  • Vrac, M.; Stein, M.; Hayhoe, K.; Liang, X.-Z. A general method for validating statistical Down-scaling methods under future climate change. Geophys. Res. Lett. 2007 , 34 , L18701. [ Google Scholar ] [ CrossRef ]
  • San-Martín, D.; Manzanas, R.; Brands, S.; Herrera, S.; Gutiérrez, J.M. Reassessing Model Uncertainty for Regional Projections of Precipitation with an Ensemble of Statistical Down-scaling Methods. J. Clim. 2017 , 30 , 203–223. [ Google Scholar ] [ CrossRef ]
  • Quesada-Chacón, D.; Barfus, K.; Bernhofer, C. Climate change projections and extremes for Costa Rica using tailored predictors from CORDEX model output through statistical Down-scaling with artificial neural networks. Int. J. Climatol. 2021 , 41 , 211–232. [ Google Scholar ] [ CrossRef ]


Time Scale: Short Term
  • Agriculture: The timing for sowing and harvesting; irrigation and fertilization plans [ ].
  • Energy: Predicting output for wind and solar energy [ ].
  • Transportation: Road traffic safety; rail transport; aviation and maritime industries [ ].
  • Construction: Project plans and timelines; safe operations [ ].
  • Retail and Sales: Adjusting inventory based on weather forecasts [ ].
  • Tourism and Entertainment: Operations of outdoor activities and tourist attractions [ ].
  • Environment and Disaster Management: Early warnings for floods, fires, and other natural disasters [ ].

Time Scale: Medium to Long Term
  • Agriculture: Long-term land management and planning [ ].
  • Insurance: Preparations for future increases in types of disasters, such as floods and droughts [ ].
  • Real Estate: Assessment of future sea-level rise and other climate-related factors [ ].
  • Urban Planning: Water resource management [ ].
  • Tourism: Long-term investments and planning, such as deciding which regions may become popular tourist destinations in the future [ ].
  • Public Health: Long-term climate changes may impact the spread of diseases [ ].
Models by time scale, spatial scale, type, technology, and target event:

Short-term weather prediction
  • Global, ML, special DNN models: AFNO: FourCastNet [ ] (extreme events); 3D neural network: PanGu [ ]; Vision Transformer: ClimaX [ ] (temperature and extreme events); Swin Transformer: SwinVRNN [ ] (temperature and precipitation); U-Transformer: FuXi [ ].
  • Global, ML, single DNN models: GNN: CLCRN [ ] (temperature), GraphCast [ ]; Transformer: FengWu [ ] (extreme events).
  • Regional: CapsNet [ ]; CNN: precipitation convolution prediction [ ] (precipitation); ANN: precipitation neural network prediction [ ]; LSTM: stacked LSTM model [ ] (temperature).
  • Regional, hybrid DNN models: LSTM + CNN: ConvLSTM [ ] (precipitation), MetNet [ ].

Medium-to-long-term climate prediction
  • Global, single DNN models: probabilistic deep learning: Conditional Generative Forecasting [ ] (temperature and precipitation).
  • Global, ML-enhanced: CNN: CNN bias-correction model [ ] (temperature and extreme events); GAN: CycleGAN [ ] (precipitation); NN: Hybrid-GCM-Emulation [ ]; ResDNN: NNCAM emulation [ ].
  • Regional: CNN: DeepESD down-scaling model [ ] (temperature).
  • Regional, non-deep-learning models: random forest (RF): RF bias-correction model [ ] (precipitation); support vector machine (SVM): SVM down-scaling model [ ]; K-nearest neighbor (KNN): KNN down-scaling model [ ]; conditional random field (CRF): CRF down-scaling model [ ].
Model | Forecast timeliness | Z500 RMSE (7 days) | Z500 ACC (7 days) | Training complexity | Forecasting speed
MetNet [ ] | 8 h | – | – | 256 Google TPU accelerators (16 days of training) | a few seconds
FourCastNet [ ] | 7 days | 595 | 0.76 | 24 A100 GPUs | 24 h forecast for 100 members in 7 s
GraphCast [ ] | 9.75 days | 460 | 0.825 | 32 Cloud TPU v4 (21 days of training) | 10-day prediction within 1 min
PanGu [ ] | 7 days | 510 | 0.87 | 2192 V100 GPUs (16 days of training) | 24 h global prediction in 1.4 s per GPU
IFS [ ] | 8.5 days | 439 | 0.85 | – | –
Name | Category | Metric | ESM | This model
CycleGAN [ ] | Bias correction | MAE | 0.241 | 0.068
DeepESD [ ] | Down-scaling | Euclidean distance to observations in PDF | 0.5 | 0.03
CGF [ ] | Prediction | ACC | 0.31 | 0.4
NNCAM [ ] | Emulation | Speed | 130-times speed-up over the ESM | –

Share and Cite

Chen, L.; Han, B.; Wang, X.; Zhao, J.; Yang, W.; Yang, Z. Machine Learning Methods in Weather and Climate Applications: A Survey. Appl. Sci. 2023 , 13 , 12019. https://doi.org/10.3390/app132112019



Duke Electrical & Computer Engineering

Master of Science in ECE

Uniquely interdisciplinary and flexible: coursework-only, project and thesis options.


Program Benefits

The 30-credit Duke Master of Science in Electrical & Computer Engineering degree provides a unique combination of opportunities:

  • World-class research integrated into a project-based learning environment
  • Flexible, individualized curriculum: choose the thesis, project, or coursework-only option
  • Professional development opportunities: take an internship or teaching assistantship
  • Excellent graduate outcomes: enter an elite PhD program or launch an industry career
  • Project MS option: 3 credits of ungraded research may substitute for standard coursework
  • Thesis MS option: up to 6 credits of ungraded research may substitute for standard coursework
  • Responsible Conduct in Research (RCR): 3 training forums
  • ECE Master’s Success Seminar (ECE 701): 0 credits, weekly seminar (tuition-free); required for students entering Fall 2024 or later
"I was looking for that strong university-industry connection. That, along with the flexibility of the coursework, which gave me a lot more bandwidth for research, made Duke the best fit for me, in the end." (Aniket Dalvi ’21, PhD candidate at Duke University)

Choose Your Study Track


Degree Options & Requirements

  • Only graduate-level courses (500 and above) satisfy MS degree requirements.
  • No more than two ECE 899: Independent Study courses may be taken.
  • English for International Students (EIS) courses (EGR 505, 506, 705, 706) do not count toward the 30 total units required for the MS degree.
  • Students must maintain a 3.0 cumulative GPA to remain in good standing and to graduate.
  • Course selection is formally approved by submitting a Program of Study form.
  • MS students (except Duke 4+1) are required to take at least three full-time semesters to graduate.

Coursework Only

Requirements.

  • 30 units of graduate-level coursework as determined by the curricular track course requirements
  • ECE 701—ECE Master’s Success Seminar (0 credit, tuition-free) Required for students entering Fall 2024 or later.
  • 3 Responsible Conduct in Research (RCR) training forums in order to graduate.

Coursework MS Final Exam

The Graduate School requires a final exam approved by a committee made up of three Graduate Faculty members. The committee must be approved by the Director of Graduate Studies and the Dean of the Graduate School at least one month prior to the examination date. The student is not required to generate a written document for the ECE department, and the format of the exam is determined by the department.

  • 3 units of ungraded research (if desired, to substitute for standard coursework)

Project MS Final Exam

For the project option, a written research report and oral presentation are required to be presented to a committee made up of the student’s advisor and two other members of the graduate faculty, one of whom must be from a department other than ECE or outside the student’s main curricular area. The committee must be approved by the Director of Graduate Studies and the Dean of the Graduate School at least one month prior to the examination date. The formats of the written and oral project reports are determined by the student’s advisor. The project report is not submitted to the Graduate School; however, a final copy must be submitted to the ECE Department.

  • Up to 6 units of ungraded research (if desired, to substitute for standard coursework)

Thesis MS Final Exam

A written thesis must be uploaded by the guidelines presented in the Graduate School’s Guide for the Electronic Submission of Thesis and Dissertation , and the thesis must be defended orally before a committee composed of the faculty member under whose direction the work was done and at least two other members of the graduate faculty, one of whom must be from a department other than ECE or outside the student’s main curricular area. The committee must be approved by the Director of Graduate Studies and the Dean of the Graduate School at least one month prior to the examination date.

Additional Information

  • Complete Degree Requirements (PDF)
  • Admissions Requirements
  • Application Deadlines
  • Tuition & Financial Aid
  • Career Services

aerial view of Duke Chapel with fall trees

Take the Next Step

Want more information? Ready to join our community?

Master’s Contacts


Kevyn Light

Senior Program Coordinator

Matt Novik

Graduate Program Coordinator


Miroslav Pajic

Director of Master’s Studies, Professor in the Department of ECE


Krista Turner

Master’s Program Coordinator

More Options

  • MEng in Electrical & Computer Engineering
  • MEng in Photonics & Optical Sciences
  • Introductory C Programming Specialization (online)

Deep Learning in Asset Pricing


Cited By

  • Wang X Tang Y Wang W (2024) Securities Quantitative Trading Strategy Based on Deep Learning of Industrial Internet of Things International Journal of Information Technology and Web Engineering 10.4018/IJITWE.347880 19 :1 (1-16) Online publication date: 17-Jan-2024 https://dl.acm.org/doi/10.4018/IJITWE.347880
  • Hao J Yuan J Li J (2024) HCEG Information Sciences: an International Journal 10.1016/j.ins.2024.121082 678 :C Online publication date: 1-Sep-2024 https://dl.acm.org/doi/10.1016/j.ins.2024.121082

Index Terms

Applied computing

Law, social and behavioral sciences

Computing methodologies

Machine learning

Machine learning approaches

Neural networks

Mathematics of computing

Theory of computation

Recommendations

Macroeconomic risks and asset pricing: evidence from a dynamic stochastic general equilibrium model.

We study the relation between macroeconomic fundamentals and asset pricing through the lens of a dynamic stochastic general equilibrium (DSGE) model. We provide full-information Bayesian estimation of the DSGE model using macroeconomic variables and ...

Essays in Empirical Asset Pricing

Asset pricing with downside liquidity risks.

We develop a parsimonious liquidity-adjusted downside capital asset pricing model to investigate whether phenomena such as downward liquidity spirals and flights to liquidity impact expected asset returns. We find strong empirical support for the model. ...

Published in

Linthicum, MD, United States

Author Tags

  • conditional asset pricing model
  • no arbitrage
  • stock returns
  • nonlinear factor model
  • cross-section of expected returns
  • machine learning
  • deep learning
  • hidden states
  • Research-article

Article Metrics

  • 1 Total Citations View Citations
  • 0 Total Downloads
  • Downloads (Last 12 months) 0
  • Downloads (Last 6 weeks) 0


MIT Theses

This collection of MIT Theses in DSpace contains selected theses and dissertations from all MIT departments. Please note that this is NOT a complete collection of MIT theses. To search all MIT theses, use MIT Libraries' catalog .

MIT's DSpace contains more than 58,000 theses completed at MIT, dating as far back as the mid-1800s. Theses in this collection have been scanned by the MIT Libraries or submitted in electronic format by thesis authors. Since 2004, all new Master's and Ph.D. theses have been scanned and added to this collection after degrees are awarded.

MIT Theses are openly available to all readers. Please share how this access affects or benefits you. Your story matters.

If you have questions about MIT theses in DSpace, contact [email protected] . See also Access & Availability Questions or About MIT Theses in DSpace .

If you are a recent MIT graduate, your thesis will be added to DSpace within 3-6 months after your graduation date. Please email [email protected] with any questions.

Permissions

MIT Theses may be protected by copyright. Please refer to the MIT Libraries Permissions Policy for permission information. Note that the copyright holder for most MIT theses is identified on the title page of the thesis.

Theses by Department

  • Comparative Media Studies
  • Computation for Design and Optimization
  • Computational and Systems Biology
  • Department of Aeronautics and Astronautics
  • Department of Architecture
  • Department of Biological Engineering
  • Department of Biology
  • Department of Brain and Cognitive Sciences
  • Department of Chemical Engineering
  • Department of Chemistry
  • Department of Civil and Environmental Engineering
  • Department of Earth, Atmospheric, and Planetary Sciences
  • Department of Economics
  • Department of Electrical Engineering and Computer Science
  • Department of Humanities
  • Department of Linguistics and Philosophy
  • Department of Materials Science and Engineering
  • Department of Mathematics
  • Department of Mechanical Engineering
  • Department of Nuclear Science and Engineering
  • Department of Ocean Engineering
  • Department of Physics
  • Department of Political Science
  • Department of Urban Studies and Planning
  • Engineering Systems Division
  • Harvard-MIT Program of Health Sciences and Technology
  • Institute for Data, Systems, and Society
  • Media Arts & Sciences
  • Operations Research Center
  • Program in Real Estate Development
  • Program in Writing and Humanistic Studies
  • Science, Technology & Society
  • Science Writing
  • Sloan School of Management
  • Supply Chain Management
  • System Design & Management
  • Technology and Policy Program

Collections in this community

Doctoral theses, graduate theses, undergraduate theses.

Recent Submissions

  • Alkyl guanidines and nitroguanidines
  • Classroom model of an information and computing system
  • Geology of the Snake Mountain region

Events at UC Santa Cruz

Guo, Y. (ECE) - Integrating Machine Learning with Green Communication Systems

Monday, August 26, 2024 10am


About this Event

Engineering 2, 1156 High Street, Santa Cruz, California 95064

In recent years, two significant developments have emerged that cannot be overlooked. The first is the need for and prospects of sustainable development, including the increasing adoption of green energy. The second major change is the advent of the AI era, highlighted by advancements such as OpenAI and Waymo's self-driving cars. Exploring how communication technologies can adapt to these two changes to seize opportunities for significant development is an intriguing prospect. Here is an overview of the opportunities and technical challenges this thesis seeks to address: (1) Green energy sources are not as stable as traditional energy sources, so accurate prediction of energy harvesting and efficient allocation for communication systems are crucial for the adoption of green energy. Traditional methods for predicting energy are often complicated and less accurate, but machine learning (ML) techniques have proven to achieve high accuracy in predicting solar energy. (2) ML techniques typically require significant power, which can limit their deployment in green communication scenarios; pruning ML models to achieve low-power inference is therefore crucial. Visible light communication (VLC) is a new communication method that addresses the issue of insufficient bandwidth in traditional communications and can serve as a valuable supplement to existing communication systems. However, current VLC systems either fail to attract users due to insufficient communication quality or are too complex and expensive for widespread deployment. Therefore, new modulation and demodulation methods are needed to simplify the system, reduce costs, increase throughput, and improve robustness to noise. Further, ML holds promise in enhancing VLC systems.
By applying ML techniques like Long Short-Term Memory (LSTM) networks to predict renewable energy income, we can develop intelligent power allocation and demodulation algorithms, reducing the barriers to green communication and unlocking the potential of new approaches such as VLC.
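The thesis itself is not reproduced here; as a rough sketch of the kind of LSTM-based predictor the abstract mentions for forecasting harvested solar energy, here is a minimal NumPy forward pass. The feature names, dimensions, and linear read-out head are illustrative assumptions, not taken from the thesis:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x_seq, params, hidden):
    """Run a single-layer LSTM over a sequence x_seq of shape (T, d_in)."""
    Wx, Wh, b = params            # Wx: (d_in, 4H), Wh: (H, 4H), b: (4H,)
    H = hidden
    h = np.zeros(H)
    c = np.zeros(H)
    for x in x_seq:
        z = x @ Wx + h @ Wh + b   # all four gate pre-activations at once
        i = sigmoid(z[:H])        # input gate
        f = sigmoid(z[H:2*H])     # forget gate
        o = sigmoid(z[2*H:3*H])   # output gate
        g = np.tanh(z[3*H:])      # candidate cell update
        c = f * c + i * g
        h = o * np.tanh(c)
    return h                      # final hidden state summarises the window

# Toy usage: estimate next-hour solar harvest from 24 hourly readings of
# (irradiance, temperature, cloud cover) -- all values here are random.
rng = np.random.default_rng(0)
d_in, H, T = 3, 8, 24
params = (rng.normal(0.0, 0.1, (d_in, 4 * H)),
          rng.normal(0.0, 0.1, (H, 4 * H)),
          np.zeros(4 * H))
window = rng.random((T, d_in))
h_T = lstm_forward(window, params, H)
w_out = rng.normal(0.0, 0.1, H)   # hypothetical linear read-out head
prediction = float(h_T @ w_out)   # scalar next-step energy estimate
```

In practice the weights would be trained (for example by backpropagation through time) on historical harvest data rather than drawn at random, and the trained predictor would feed the power-allocation logic the abstract describes.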

Event Host: Yawen Guo, Ph.D. Student, Electrical & Computer Engineering

Advisor: Colleen Josephson


Dial-In Information

Zoom Link:  https://ucsc.zoom.us/j/98357303208?pwd=1j3tV23WVQpxApnkaFRPTr6mVpBFJA.1 Meeting ID: 983 5730 3208 Passcode: 144509



COMMENTS

  1. PhD Dissertations

    The Machine Learning Department at Carnegie Mellon University is ranked #1 in the world for AI and machine learning; it offers undergraduate, Master's, and PhD programs. ... Essays in Machine Learning for Decision Support in the Public Sector, Dylan Fitzpatrick, 2020. Towards a Unified Framework for Learning and Reasoning, Han Zhao, 2020.

  2. Available Master's thesis topics in machine learning

    Develop a machine-learning-based hyper-heuristic algorithm to solve a pickup and delivery problem. A hyper-heuristic is a heuristic that chooses heuristics automatically. Hyper-heuristics seek to automate the process of selecting, combining, generating or adapting several simpler heuristics to efficiently solve computational search problems ...

  3. The Future of AI Research: 20 Thesis Ideas for Undergraduate ...

    "The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples." — Andrew Ng. This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023.

  4. AI & Machine Learning Research Topics (+ Free Webinar)

    Get 1-On-1 Help. If you're still unsure about how to find a quality research topic, check out our Research Topic Kickstarter service, which is the perfect starting point for developing a unique, well-justified research topic. A comprehensive list of research topics ideas in the AI and machine learning area. Includes access to a free webinar ...

  5. PDF Artificial Intelligence and Machine Learning Capabilities and

    that a machine can be made to simulate it." [3] In the AI field, there are several terms. Artificial intelligence is the largest collection, machine learning is a subset of artificial intelligence, and deep learning is a subset of machine learning, as shown in Exhibit 2.3 [4]. This thesis mainly

  6. PDF ADVERSARIALLY ROBUST MACHINE LEARNING WITH ...

    Machine learning (ML) systems are remarkably successful on a variety of benchmarks across several domains. In these benchmarks, the test data points, though not identical, are very similar to ... This thesis focuses on an extreme version of this brittleness, adversarial examples, where even imperceptible (but carefully constructed) changes ...

  7. 10 Compelling Machine Learning Ph.D. Dissertations for 2020

    10 Compelling Machine Learning Ph.D. Dissertations for 2020. Machine Learning Modeling Research posted by Daniel Gutierrez, ODSC August 19, 2020. As a data scientist, an integral part of my work in the field revolves around keeping current with research coming out of academia. I frequently scour arXiv.org for late-breaking papers that show ...

  8. A machine learning approach to modeling and predicting training

    However, traditional analysis techniques and human intuition are of limited use on so-called "big-data" environments, and one of the most promising areas to prepare for this influx of complex training data is the field of machine learning. Thus, the objective of this thesis was to lay the foundations for the use of machine learning algorithms ...

  9. 17 Compelling Machine Learning Ph.D. Dissertations

    This machine learning dissertation presents analyses on tree asymptotics in the perspectives of tree terminal nodes, tree ensembles, and models incorporating tree ensembles respectively. The study introduces a few new tree-related learning frameworks which provides provable statistical guarantees and interpretations.

  10. PDF New Theoretical Frameworks for Machine Learning

    Machine Learning, a natural outgrowth at the intersection of Computer Science and Statistics, has evolved into a broad, highly successful, and extremely dynamic discipline. ... In this thesis, we develop theoretical foundations and new algorithms for several important emerging learning paradigms of significant practical importance, including ...

  11. Brown Digital Repository

    Advancements in machine learning techniques have encouraged scholars to focus on convolutional neural network (CNN) based solutions for object detection and pose estimation tasks. Most … Year: 2020 Contributor: Derman, Can Eren (creator) Bahar, Iris (thesis advisor) Taubin, Gabriel (reader) Brown University. School of Engineering (sponsor ...

  12. PDF Integrating Machine Learning into Data Analysis and Plant Performance

    This thesis shows, drawing from a recent project at Nissan's Canton, ... Machine learning was not initially a part of our project scope. Two things led us to it. The first was the general frustration we heard from operators, engineers, and managers about the challenges they had dealing with data. With the number

  13. PDF Master Thesis Using Machine Learning Methods for Evaluating the ...

    Based on this background, the aim of this thesis is to select and implement a machine learning process that produces an algorithm, which is able to detect whether documents have been translated by humans or computerized systems. This algorithm builds the basic structure for an approach to evaluate these documents. 1.2 Related Work

  14. Thesis on Machine Learning Methods and Its Applications

    Machine learning is an essential part of artificial intelligence used to design algorithms based on data trends and past relationships between data. Machine learning is used in a variety of areas such as bioinformatics, intrusion detection, information retrieval, games, marketing, malware detection, and image decoding.

  15. PDF Machine Learning for Decision Making

    Machine learning applications to both decision-making and decision-support are growing. Further, with each successful application, learning algorithms are gaining increased autonomy and control over decision-making. As a result, research into intelligent decision-making algorithms continues to improve. For example, the Stanford Research Institute ...

  16. How to write a great data science thesis

    Glancing through past dissertations helped me understand how a typical machine learning research paper is structured and led to numerous ideas about interesting statistics and visualizations that I could include in my thesis. Below, I've compiled a list of great sources and databases containing previous theses.

  17. Thesis Topics for Machine Learning

    Let us now have an idea about various headings to be included in any thesis topics for machine learning. Introduction - overview of the thesis; Related / Existing works - presents of existing research; Problems definition/statements - identify and highlight the problems; Research methodology - convey the proposed concepts; Results and Discussion - discuss the results of the proposed ...

  18. master-thesis · GitHub Topics · GitHub

    Add this topic to your repo. To associate your repository with the master-thesis topic, visit your repo's landing page and select "manage topics." GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 420 million projects.

  19. UofT Machine Learning

    2010. Andriy Mnih Learning Distributed Representations for Statistical Language Modelling and Collaborative Filtering (Ph. D. Thesis) Renqiang Min Machine Learning Approaches to Biological Sequence and Phenotype Data Analysis (Ph. D. Thesis) Vinod Nair Visual Object Recognition Using Generative Models of Images (Ph. D. Thesis)

  20. PDF Using Machine Learning to Predict Student Performance

    Student Performance. M.Sc. Thesis, 35 pages, June 2017. This thesis examines the application of machine learning algorithms to predict whether a student will be successful or not. The specific focus of the thesis is the comparison of machine learning methods and feature engineering techniques in terms of how much they improve the prediction ...

  21. PDF Solving Machine Learning Problems

    homework, and quiz questions from MIT's 6.036 Introduction to Machine Learning course and train a machine learning model to answer these questions. Our system demonstrates an overall accuracy of 96% for open-response questions and 97% for multiple-choice questions, compared with MIT students' average of 93%, achieving

  22. Machine Learning

    The broad goal of machine learning is to automate the decision-making process, so that computer-automated predictions can make a task more efficient, accurate, or cost-effective than it would be using only human decision making. Carnegie Mellon is widely regarded as one of the world's leading centers for machine learning research, and the scope of our machine learning research is broad.

  23. Doctoral Thesis: From Data, to Models, and Back: Making ML "Predictably

    Artificial Intelligence and Decision-making combines intellectual traditions from across computer science and electrical engineering to develop techniques for the analysis and synthesis of systems that interact with an external world via perception, communication, and action; while also learning, making decisions and adapting to a changing environment.

  24. [2408.06383] Dilated Convolution with Learnable Spacings

    This thesis presents and evaluates the Dilated Convolution with Learnable Spacings (DCLS) method. Through various supervised learning experiments in the fields of computer vision, audio, and speech processing, the DCLS method proves to outperform both standard and advanced convolution techniques. The research is organized into several steps, starting with an analysis of the literature and ...

  25. Applied Sciences

    With the rapid development of artificial intelligence, machine learning is gradually becoming popular for predictions in all walks of life. In meteorology, it is gradually competing with traditional climate predictions dominated by physical models. This survey aims to consolidate the current understanding of Machine Learning (ML) applications in weather and climate prediction—a field of ...

  26. PDF Artificial Intelligence and Machine Learning: Current Applications in

    intelligence and machine learning. This thesis will define machine learning and artificial intelligence for the investor and real estate audience, examine the ways in which these new analytic, predictive, and automating technologies are being used in the real estate industry, and postulate potential

  27. Master of Science in ECE

    Machine Learning & Big Data. Quantum Software & Hardware. Semiconductor Technology. Degree Options & Requirements. Only graduate-level courses (500 and above) satisfy MS degree requirements. ... A written thesis must be uploaded by the guidelines presented in the Graduate School's Guide for the Electronic Submission of Thesis and Dissertation

  28. Deep Learning in Asset Pricing

    Bianchi D, Tamoni A, Buchner M (2021) Bond risk premiums with machine learning. Rev. Financial Stud. 34(2):1046-1089. Google Scholar [8] Blanchet J, Kang Y, Murthy K (2019) Robust Wasserstein profile inference and applications to machine learning. J. Appl. ... Essays in Empirical Asset Pricing.

  29. MIT Theses

    Theses in this collection have been scanned by the MIT Libraries or submitted in electronic format by thesis authors. Since 2004 all new Masters and Ph.D. theses are scanned and added to this collection after degrees are awarded. MIT Theses are openly available to all readers. Please share how this access affects or benefits you.

  30. Guo, Y. (ECE)

    Here is an overview of the opportunities and technical challenges this thesis seeks to address: (1) Green energy sources are not as stable as traditional energy sources. ... However, machine learning (ML) techniques have proven to achieve high accuracy in predicting solar energy. (2) In addition, ML techniques typically require significant ...