Open Access

Peer-reviewed

Research Article

An improved genetic algorithm and its application in neural network adversarial attack

Contributed equally to this work with: Dingming Yang, Zeyu Yu, Hongqiang Yuan

Roles Conceptualization, Data curation, Formal analysis, Methodology, Software, Validation, Writing – original draft, Writing – review & editing

Affiliation School of Computer Science, Yangtze University, Jingzhou, China

Roles Funding acquisition, Supervision, Validation, Writing – review & editing

Affiliation School of Electronic & Information, Yangtze University, Jingzhou, China

Roles Funding acquisition, Resources, Writing – review & editing

Affiliation School of Urban Construction, Yangtze University, Jingzhou, China

Roles Conceptualization, Project administration, Resources, Supervision, Writing – review & editing

* E-mail: [email protected]

  • Dingming Yang, 
  • Zeyu Yu, 
  • Hongqiang Yuan, 
  • Yanrong Cui

PLOS

  • Published: May 5, 2022
  • https://doi.org/10.1371/journal.pone.0267970

Abstract

The choice of crossover and mutation strategies plays a crucial role in the search ability, convergence efficiency and precision of genetic algorithms. In this paper, a novel improved genetic algorithm is proposed by improving the crossover and mutation operations of the simple genetic algorithm, and it is verified on 15 test functions. The qualitative results show that, compared with three other mainstream swarm intelligence optimization algorithms, the algorithm not only improves the global search ability, convergence efficiency and precision, but also increases the success rate of convergence to the optimal value under the same experimental conditions. The quantitative results show that the algorithm performs best on 13 of the 15 test functions. The Wilcoxon rank-sum test was used for statistical evaluation and shows a significant advantage of the algorithm at the 95% confidence level. Finally, the algorithm is applied to neural network adversarial attacks. The results show that the method does not need the structure or parameter information of the neural network model, and that it can obtain adversarial samples with high confidence in a short time using only the classification and confidence information output by the neural network.

Citation: Yang D, Yu Z, Yuan H, Cui Y (2022) An improved genetic algorithm and its application in neural network adversarial attack. PLoS ONE 17(5): e0267970. https://doi.org/10.1371/journal.pone.0267970

Editor: Mohd Nadhir Ab Wahab, Universiti Sains Malaysia, MALAYSIA

Received: November 24, 2021; Accepted: April 19, 2022; Published: May 5, 2022

Copyright: © 2022 Yang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting information files.

Funding: D.Y., Z.Y., H.Y. and Y.C.; This work was supported by the Major Technology Innovation of Hubei Province [2019AAA011]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

In real life, optimization problems such as shortest path, path planning, task scheduling and parameter tuning are becoming more and more complex, with features such as nonlinearity, multiple constraints, high dimensionality and discontinuity [ 1 ]. Although artificial intelligence algorithms represented by deep learning can solve some optimization problems, they lack mathematical interpretability because their models contain a large number of nonlinear functions and parameters, so they are difficult to apply widely in the field of information security. Traditional optimization algorithms and artificial intelligence algorithms can hardly solve the complex, high-dimensional, nonlinear optimization problems that arise in information security.

Therefore, it is necessary to find an effective optimization algorithm to solve such problems. Against this background, various swarm intelligence optimization algorithms have been proposed, such as Particle Swarm Optimization (PSO) [ 2 , 3 ] and the Grey Wolf Optimizer (GWO) [ 4 ]. Subsequently, a variety of improved optimization algorithms have also been proposed, for example the improved genetic algorithm for cloud environment task scheduling [ 5 ], the improved genetic algorithm for flexible job shop scheduling [ 6 ], and the improved genetic algorithm for green fresh food logistics [ 7 ].

However, these improved optimization algorithms target domain-specific optimization problems and do not improve the accuracy, convergence efficiency or generality of the algorithms themselves. In this paper, the crossover operator and mutation operator of the genetic algorithm are improved to raise the convergence efficiency and precision of the algorithm while keeping it effective on most optimization problems. The effectiveness of the improved genetic algorithm is verified through comparison experiments and through an application to neural network adversarial attacks. The main contributions are as follows:

  • By improving the single-point crossover step of the SGA, the fitness function is used as the evaluation index for selecting children after crossover, which reduces the number of iterations and accelerates convergence.
  • By improving the basic bitwise mutation of the SGA, each gene of the offspring is traversed and selectively mutated, with different mutation rates for the two halves of a chromosome, which improves global search once the population has settled at a local optimum.
  • The improved genetic algorithm is applied to neural network adversarial attacks, where it speeds up the generation of adversarial samples, which can in turn be used to improve the robustness of the neural network model.

2 Related works

2.1 Genetic algorithm

The Genetic Algorithm is a family of evolutionary simulation algorithms proposed by Holland et al. [ 8 ] and later summarized by DeJong, Goldberg and others. The general flowchart of the Genetic Algorithm is shown in Fig 1 . The Genetic Algorithm first encodes the problem, then calculates the fitness, then selects the father and the mother by roulette selection, and then generates children by crossover and mutation. After many iterations it produces individuals with high fitness, which are the satisfactory or optimal solutions of the problem. The Simple Genetic Algorithm (SGA) uses single-point crossover and simple mutation to realize information exchange between individuals and local search. It does not rely on gradient information, so the SGA is able to search for the global optimal solution.

Fig 1. General flowchart of the Genetic Algorithm.

https://doi.org/10.1371/journal.pone.0267970.g001
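The roulette selection step in the flow above can be sketched as follows. This is only an illustration under stated assumptions: the function name `roulette_select`, the example fitness values, and the requirement that fitness values be non-negative are not taken from the paper.

```python
import numpy as np

def roulette_select(fitness, rng):
    """Return one index, drawn with probability proportional to fitness.

    Assumes all fitness values are non-negative (an assumption of this sketch).
    """
    fitness = np.asarray(fitness, dtype=float)
    total = fitness.sum()
    if total == 0:
        # Every individual has zero fitness: fall back to a uniform draw.
        return int(rng.integers(len(fitness)))
    return int(rng.choice(len(fitness), p=fitness / total))

rng = np.random.default_rng(0)
example_fitness = [1.0, 2.0, 7.0]  # hypothetical values
picks = [roulette_select(example_fitness, rng) for _ in range(2000)]
counts = np.bincount(picks, minlength=3)  # fitter individuals are picked more often
```

Individuals with higher fitness are chosen as parents more often, but weaker individuals still have a nonzero chance, which keeps some diversity in the population.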

2.2 Other meta-heuristic algorithms

A meta-heuristic algorithm is problem-independent: it does not exploit the specific structure of a problem and serves as a general solution method. In general, it is not greedy, can explore more of the search space, and tends to reach the global optimum. One of the most important ideas in meta-heuristics is a dynamic balance between diversification and intensification.

The PSO algorithm [ 2 , 3 ] is a swarm intelligence-based global stochastic search algorithm that grew out of artificial life research; its basic idea comes from modeling and simulating the migration and flocking behavior of birds during foraging. The GWO algorithm is a swarm intelligence optimization algorithm proposed by Mirjalili et al. [ 4 ]. It is inspired by the prey-hunting behavior of grey wolves and offers strong convergence performance, few parameters, and easy implementation. The Marine Predator Algorithm (MPA) [ 9 ] is mainly inspired by foraging strategies widely found in marine predators, namely Lévy and Brownian motion, and by the optimal encounter rate policy in biological interactions between predators and prey. The Artificial Gorilla Troops Optimizer (GTO) [ 10 ] is inspired by the group life of gorillas and is characterized by fast search speed and high solution accuracy. The African Vulture Optimization Algorithm (AVOA) [ 11 ] is inspired by the foraging and navigation behavior of African vultures; it is fast, has high solution accuracy, and is widely used in single-objective optimization. The Remora Optimization Algorithm (ROA) [ 12 ] is an intelligent optimization algorithm inspired by the biological habits of the remora; it has good solution accuracy and high practical engineering value both for finding function extrema and for typical engineering optimization problems.

2.3 Neural network adversarial attack

Szegedy et al. [ 13 ] first demonstrated that a highly accurate deep neural network can be misled into a misclassification by adding to an image a slight perturbation that is imperceptible to the human eye, and also found that the robustness of deep neural networks can be improved by adversarial training. These findings are far-reaching and have attracted many researchers to adversarial attacks and deep learning security. Akhtar and Mian [ 14 ] surveyed 12 attack methods and 15 defense methods for neural network adversarial attacks. The main attack methods include finding the minimum loss function additive term [ 13 ], increasing the loss function of the classifier [ 15 ], limiting the $l_0$ norm [ 16 ], and changing only one pixel value [ 17 ].

Nguyen et al. [ 18 ] continued to explore the question of “what differences remain between computer and human vision” based on Szegedy et al. [ 13 ]. They used the Evolutionary Algorithm to generate high-confidence adversarial images by iterating over direct-encoded images and CPPN (Compositional Pattern-Producing Network) encoded images, respectively. They obtained high-confidence adversarial samples (fooling images) using the Evolutionary Algorithm on a LeNet model pre-trained on the MNIST dataset [ 19 ] and an AlexNet model pre-trained on the ILSVRC 2012 ImageNet dataset [ 20 , 21 ], respectively.

Neural network adversarial attacks are divided into black-box attacks and white-box attacks. Black-box attacks do not require the internal structure and parameters of the neural network, and the adversarial samples can be generated with optimization algorithms as long as the output classification and confidence information is known. The study of neural network adversarial attacks not only helps to understand the working principle of neural networks but also increases the robustness of neural networks by training with adversarial samples.
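To make the black-box setting concrete, the sketch below wraps a classifier so that an optimizer sees only the confidence of a chosen label. `ToyClassifier` is a hypothetical stand-in (a random linear layer with softmax), not the network used in the paper, and `make_fitness` is an illustrative helper name.

```python
import numpy as np

class ToyClassifier:
    """Hypothetical stand-in for a trained network.

    The attacker may call predict_proba but never reads the weights.
    """
    def __init__(self, n_inputs, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self._w = rng.normal(size=(n_inputs, n_classes))  # hidden parameters

    def predict_proba(self, x):
        logits = np.asarray(x, dtype=float) @ self._w
        e = np.exp(logits - logits.max())  # numerically stable softmax
        return e / e.sum()

def make_fitness(model, target_label):
    """Build a fitness function that uses only the model's output confidences."""
    def fitness(image_bits):
        return float(model.predict_proba(image_bits)[target_label])
    return fitness

model = ToyClassifier(n_inputs=16, n_classes=3)
fitness = make_fitness(model, target_label=2)
score_blank = fitness(np.zeros(16))  # all logits are zero, so confidence is uniform
score_full = fitness(np.ones(16))
```

Any optimizer that only needs a scalar score per candidate, such as a genetic algorithm, can then be pointed at `fitness` without access to gradients or weights.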

3 Approaches

This section improves the single-point crossover and simple mutation of the SGA. The fitness function is used as the evaluation index of the crossover step, and every crossover point along the chromosome is tried in order to make the search for the best child more efficient. A selective mutation is performed on each gene of the child’s chromosome, and the mutation rate of the second half of the chromosome is set to twice that of the first half to improve global search once the population has settled at a local optimum.

3.1 Improved crossover operation

Algorithm 1 gives Python-style pseudocode for the improved crossover. The single-point crossover of the SGA generates a random number within the length of the parental chromosome and then joins the part of the father’s chromosome before that point to the part of the mother’s chromosome after it to produce the child. In this paper, the algorithm is improved by trying each crossover point within the chromosome length in turn, calculating the fitness of each resulting child, and keeping the child with the highest fitness. Experimental data show that this improvement reduces the number of iterations and speeds up the convergence of fitness.

Algorithm 1 Crossover with fitness as evaluation.

Input : Father’s gene, mother’s gene, fitness function;

Output : Child’s gene;

    import numpy as np

    def crossover(father, mother, fitness):
        # Try every crossover point and keep the fittest child.
        best_fitness = -np.inf
        best_child = np.zeros(father.size)
        for i in range(father.size):
            # Genes before point i come from the father, the rest from the mother.
            current_child = np.append(father[0:i], mother[i:])
            current_fitness = fitness(current_child)
            if current_fitness > best_fitness:
                best_fitness = current_fitness
                best_child = current_child.copy()
        return best_child
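The difference between the SGA's random single-point crossover and the fitness-guided crossover of Algorithm 1 can be seen on a toy problem. The sketch below is illustrative only: the "count the ones" fitness and the two example parents are assumptions chosen to make the effect visible, and the function names are hypothetical.

```python
import numpy as np

def single_point_crossover(father, mother, rng):
    """SGA-style crossover: one random cut point, no fitness evaluation."""
    cut = int(rng.integers(father.size))
    return np.concatenate((father[:cut], mother[cut:]))

def best_point_crossover(father, mother, fitness):
    """Fitness-guided crossover: try every cut point and keep the fittest child."""
    children = [np.concatenate((father[:i], mother[i:])) for i in range(father.size)]
    return max(children, key=fitness)

# Toy fitness (an assumption of this sketch): the number of ones in the chromosome.
fitness = lambda bits: int(bits.sum())

father = np.array([1, 1, 1, 1, 0, 0, 0, 0])
mother = np.array([0, 0, 0, 0, 1, 1, 1, 1])

rng = np.random.default_rng(0)
random_child = single_point_crossover(father, mother, rng)
best_child = best_point_crossover(father, mother, fitness)
```

Here the cut at position 4 joins the father's ones to the mother's ones, so the guided crossover returns the all-ones chromosome, while the random cut only finds it by chance. The price is one fitness evaluation per candidate cut point for every child, which matters when the fitness function is a neural network forward pass.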

3.2 Improved mutation operation

Algorithm 2 gives the pseudocode of the improved mutation. The simple mutation of the SGA sets a relatively large mutation rate and mutates one arbitrary gene of the incoming child’s chromosome when the generated random number is smaller than the mutation rate. In this paper, we improve the algorithm by setting a small mutation rate and then selectively mutating each gene of the incoming child’s chromosome. That is, a gene is mutated when the random number generated for it is smaller than the mutation rate, and once the traversed gene position passes half of the chromosome length the mutation rate is set to twice the original one (the second half of the chromosome has relatively less influence on the result). In this way both the first half and the second half of the chromosome get their own chance to mutate, and both can mutate in the same pass. When the gene length is 784, the probability that at least one gene of the chromosome mutates is $1 - (1 - 0.025)^{392} \times (1 - 0.05)^{392}$, which greatly improves species diversity while preserving the stability of the species (that is, it improves the global search ability once the population has settled at a local optimum). Experimental data show that it improves the search capability.
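The expression for a 784-gene chromosome can be evaluated directly. The short calculation below is a worked check rather than part of the method; it assumes the genes mutate independently, with 392 genes at rate 0.025 and 392 genes at rate 0.05, and the variable names are illustrative.

```python
n_genes = 784
half = n_genes // 2                 # 392 genes in each half
p_first, p_second = 0.025, 0.05     # per-gene mutation rates from the text

# Probability that no gene at all is flipped, assuming independent genes.
p_no_mutation = (1 - p_first) ** half * (1 - p_second) ** half

# Probability that at least one gene is flipped.
p_any_mutation = 1 - p_no_mutation

# Expected number of flipped genes per child: 392 * 0.025 + 392 * 0.05 = 29.4.
expected_flips = half * p_first + half * p_second
```

The probability of no mutation is on the order of 1e-13, so practically every child is changed, by about 29.4 genes on average, roughly two thirds of them in the second half of the chromosome.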

Algorithm 2 Mutate the child by altering each gene whose random number is less than the mutation rate.

Input : Child’s gene;

Output : Mutated child’s gene;

    import random

    def mutate(child):
        mutate_rate = 0.025
        for i in range(child.size):
            # Double the rate over the second half of the chromosome.
            # ">=" gives exactly half of the genes (392 of 784) the doubled rate.
            if i >= child.size // 2:
                mutate_rate = 0.05
            if random.random() < mutate_rate:
                child[i] = 1 - child[i]  # child[i] equals 0 or 1
        return child
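The same mutation can be written without an explicit loop. The following is a sketch under the assumption that the chromosome is a NumPy array of 0/1 values; the signature with an explicit `rng` and `base_rate` argument is an illustrative choice, not the paper's interface, and unlike Algorithm 2 it returns a new array instead of modifying the child in place.

```python
import numpy as np

def mutate(child, rng, base_rate=0.025):
    """Flip each gene independently; the second half uses twice the base rate."""
    n = child.size
    rates = np.full(n, base_rate)
    rates[n // 2:] = 2 * base_rate          # doubled rate for the second half
    flip = rng.random(n) < rates            # one random draw per gene
    return np.where(flip, 1 - child, child)

rng = np.random.default_rng(1)
child = np.zeros(784, dtype=int)
mutated = mutate(child, rng)

# Count flips per half over many trials to see the 1:2 ratio empirically.
trials = np.array([mutate(child, rng) for _ in range(200)])
first_half_flips = int(trials[:, :392].sum())
second_half_flips = int(trials[:, 392:].sum())
```

Because the input here is all zeros, every 1 in the output marks a flipped gene, so the two counts compare the mutation pressure on the two halves directly.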

4 Numerical experiments and analysis

4.1 Test functions

To evaluate the optimization performance of the proposed improved genetic algorithm, 15 representative test functions are selected from the AVOA paper of Abdollahzadeh et al. [ 11 ] and from Wikipedia [ 22 ]. Since the proposed improved genetic algorithm is mainly intended for the neural network adversarial attack problem, and the neural network has multi-dimensional parameters, the test functions are evaluated in 30, 50, and 100 dimensions. The formula, dimensions, range, and minimum of the 15 test functions are given in Tables 1 – 3 , where Table 1 lists the multi-dimensional unimodal test functions, Table 2 the multi-dimensional multimodal test functions, and Table 3 the fixed-dimensional test functions.

Table 1. https://doi.org/10.1371/journal.pone.0267970.t001

Table 2. https://doi.org/10.1371/journal.pone.0267970.t002

Table 3. https://doi.org/10.1371/journal.pone.0267970.t003
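As an illustration of what unimodal and multimodal benchmarks look like, the sketch below implements the sphere and Rastrigin functions, two standard entries on the cited Wikipedia list. Whether they correspond to particular functions among F1-F15 is not stated in the text available here, so treat them purely as examples of the two categories.

```python
import numpy as np

def sphere(x):
    """Unimodal example: a single minimum of 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))

def rastrigin(x):
    """Multimodal example: many local minima, global minimum of 0 at the origin."""
    x = np.asarray(x, dtype=float)
    return float(10 * x.size + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x)))

# Evaluate both functions at the origin in the three dimensions used in the paper.
values_at_origin = {
    dim: (sphere(np.zeros(dim)), rastrigin(np.zeros(dim)))
    for dim in (30, 50, 100)
}
```

Both functions accept a vector of any length, which is what allows the same benchmark to be run in 30, 50, and 100 dimensions.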

4.2 Experimental environment

The hardware environment of the experiment consists of 8 GB of RAM and an i7-4700MQ CPU; the software environment consists of Windows 10 and Python 3.8.8. To evaluate the optimization performance of the improved genetic algorithm (IGA), the SGA, PSO and GWO are selected for the comparison experiments in this paper.

Fig 2. (a) Mutation rate. (b) Population size. (c) Max iteration.

https://doi.org/10.1371/journal.pone.0267970.g002

Table 4. https://doi.org/10.1371/journal.pone.0267970.t004

4.3 Experimental results and analysis

4.3.1 Qualitative result analysis.

Fig 3. (a) Parameter space. (b) Population distribution. (c) Best record. (d) Convergence curve.

https://doi.org/10.1371/journal.pone.0267970.g003

Fig 4. https://doi.org/10.1371/journal.pone.0267970.g004

Fig 5. https://doi.org/10.1371/journal.pone.0267970.g005

Fig 6. https://doi.org/10.1371/journal.pone.0267970.g006

4.3.2 Quantitative result analysis.

To make a quantitative comparison with the other three mainstream optimization algorithms, each of the four optimization algorithms is run independently 10 times on the test functions F1-F11 in 30, 50, and 100 dimensions. The purpose of the high-dimensional tests is to check how well IGA converges in high-dimensional spaces, with a view to its application in neural network adversarial attacks. Tables 5 – 7 give the results for the test functions F1-F11 in 30, 50, and 100 dimensions, respectively. Table 8 gives the results of the four optimization algorithms on the test functions F12-F15. The best result, worst result, mean, median, standard deviation, and P-value over the 10 runs are compared. The P-value is the result of the Wilcoxon rank-sum test, and a P-value below 0.05 is considered significant.
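For readers unfamiliar with the test, the sketch below computes a two-sided Wilcoxon rank-sum P-value with the normal approximation, using only the standard library. It is not the authors' evaluation code: the two samples are made-up numbers standing in for the final objective values of 10 runs of two algorithms, and no tie correction is applied to the variance.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction)."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1      # tied values share their average rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Synthetic results of 10 runs each (hypothetical numbers, lower is better).
runs_a = [0.011, 0.009, 0.012, 0.010, 0.008, 0.013, 0.007, 0.010, 0.009, 0.011]
runs_b = [0.31, 0.28, 0.35, 0.30, 0.29, 0.33, 0.27, 0.32, 0.34, 0.36]

z, p_value = rank_sum_test(runs_a, runs_b)
significant = p_value < 0.05
```

With two samples of 10 runs that do not overlap at all, the rank sum of the first sample takes its smallest possible value and the P-value falls well below 0.05.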

Table 5. https://doi.org/10.1371/journal.pone.0267970.t005

Table 6. https://doi.org/10.1371/journal.pone.0267970.t006

Table 7. https://doi.org/10.1371/journal.pone.0267970.t007

Table 8. https://doi.org/10.1371/journal.pone.0267970.t008

In Table 5 , IGA achieves significantly superior performance on 9 test functions, PSO is better on F3, and SGA is slightly better on F8. In Tables 6 and 7 , IGA achieves significantly superior performance on 10 test functions, and PSO performs better on F3. The performance loss of IGA with increasing dimensionality is thus smaller than that of the other three optimization algorithms. In Table 8 , IGA achieves significantly superior performance on 3 test functions, and PSO performs slightly better on F14.

In general, IGA has better iteration efficiency, global search capability, and convergence success rate than the other three optimization algorithms.

5 Application in neural network adversarial attack

5.1 MNIST dataset

The MNIST dataset (Modified National Institute of Standards and Technology database) [ 19 ] is one of the best-known datasets in machine learning and is used in applications ranging from simple experiments to published research. It consists of images of the handwritten digits 0–9. Each MNIST image is a single-channel grayscale image of 28 × 28 pixels, with each pixel taking values between 0 and 255; there are 60,000 samples in the training set and 10,000 samples in the test set. The usual way to use the MNIST dataset is to train on the training set and then measure how well the learned model classifies the test set [ 23 ].

5.2 Implementation

As shown in Fig 7(a) , a Deep Convolutional Neural Network (DCNN) pre-trained on the MNIST dataset [ 19 ] is used as the experimental object in this paper; the accuracy of the model is 99.35% with a loss value of 0.9632. Fig 7(b) shows the model of the network adversarial attack. A population of a specific size (set to 100 in this paper) is first generated and then input to the neural network to obtain the confidence for the specified label. To reduce the computational expense, the input is reduced to a binary image of 28 × 28, and the randomly generated binary images are iterated using the IGA proposed in this paper. Among the 100 individuals, fathers and mothers with relatively high confidence are selected by roulette selection, children are generated with the improved crossover step, and the children form a new population after the improved mutation step; this repeats until the specified number of iterations is reached. Finally, the individual with the highest confidence is picked from the 100 individuals; this is the binary image that receives the highest confidence from the neural network.

Fig 7. (a) The structure of DCNN for experiment. (b) The model of network adversarial attack.

https://doi.org/10.1371/journal.pone.0267970.g007
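The attack loop described above can be sketched end to end on a toy problem. Everything concrete here is an assumption made to keep the example small and fast: the "network" is a random linear layer with softmax rather than the DCNN, the images have 64 binary pixels instead of 784, and the population size, iteration count and target label are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PIXELS, N_CLASSES, POP_SIZE, N_ITER, TARGET = 64, 10, 20, 5, 3

# Hypothetical stand-in for the trained network (weights unseen by the attacker).
W = rng.normal(size=(N_PIXELS, N_CLASSES))

def confidences(image_bits):
    logits = image_bits @ W
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

def fitness(image_bits):
    """Black-box fitness: confidence the model assigns to the target label."""
    return float(confidences(image_bits)[TARGET])

def roulette(population, scores):
    """Pick one parent with probability proportional to its confidence."""
    return population[rng.choice(len(population), p=scores / scores.sum())]

def crossover(father, mother):
    """Try every cut point and keep the child with the highest fitness."""
    children = [np.concatenate((father[:i], mother[i:])) for i in range(father.size)]
    return max(children, key=fitness)

def mutate(child, base_rate=0.025):
    """Flip genes independently; the second half uses twice the base rate."""
    rates = np.full(child.size, base_rate)
    rates[child.size // 2:] = 2 * base_rate
    return np.where(rng.random(child.size) < rates, 1 - child, child)

# Random binary images as the initial population.
population = rng.integers(0, 2, size=(POP_SIZE, N_PIXELS))
initial_best = max(fitness(ind) for ind in population)

for _ in range(N_ITER):
    scores = np.array([fitness(ind) for ind in population])
    population = np.array([
        mutate(crossover(roulette(population, scores), roulette(population, scores)))
        for _ in range(POP_SIZE)
    ])

final_best = max(fitness(ind) for ind in population)
best_image = max(population, key=fitness)
```

The loop has no elitism, so the best confidence is not guaranteed to rise in every generation; comparing `initial_best` with `final_best` shows how far the search moved on a given run.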

As shown in Fig 8 , the confidence that the DCNN assigns to sample “2” after 99 iterations is 99.98%. Sample “6” and sample “4” converge most slowly: the confidence of sample “6” is 78.84% after 99 iterations, and the confidence of sample “4” is 78.84% after 99 iterations.

Fig 8. https://doi.org/10.1371/journal.pone.0267970.g008

The statistics of the experimental results are shown in Fig 9 . The binary image of sample “1” generated after 999 iterations has a confidence of 99.94% after passing through the DCNN, which is much higher than the confidence of sample “1” from the MNIST test set in the DCNN control group. When the population is instead initialized with the MNIST test set, the overall confidence of the initial population is higher, so the increase in confidence during iteration is smaller. The confidence of the sample selected from the MNIST test set is 99.56%; after 10 iterations the confidence is 99.80% and the digit “1” becomes vertical; after 89 iterations the confidence is 99.98% and the digit “1” tends to gradually “decompose”.

Fig 9. https://doi.org/10.1371/journal.pone.0267970.g009

As shown in Fig 10 , the reason for this is probably that the confidence, as a function of the input image, is a multi-peak function, and the region in which the test set images lie is not the highest peak of the confidence function. This causes some pixels in the images generated by the IGA to “stray” from the initial population drawn from the test set.

Fig 10. https://doi.org/10.1371/journal.pone.0267970.g010

6 Conclusion

The comparison and simulation experiments show that the improved method proposed in this paper is effective and greatly improves convergence efficiency, global search capability and convergence success rate. Applying IGA to neural network adversarial attacks also quickly yields adversarial samples with high confidence, which is meaningful for improving the robustness and security of neural network models.

Although this paper improves the performance of the genetic algorithm, the proposed method is still based on the genetic algorithm, so it cannot be completely separated from the general framework of the genetic algorithm, and the problem that a single iteration of the genetic algorithm is relatively slow remains unsolved. We hope to explore a new nature-inspired optimization algorithm in future work. In addition, we believe that the reason why neural network models have so many adversarial samples is a design flaw in their architecture. In future work, we will also try to explore a completely new basic architecture for neural networks so as to compress the space of adversarial samples.

With the wide application of artificial intelligence and deep learning in computer vision, face recognition performs outstandingly in access control systems and payment systems, which require a fast response to the input face image, but this fast response has become a weakness that attackers can exploit. For face recognition systems without liveness detection, the method in this paper needs only the output labels and confidence information to obtain high-confidence images quickly. In summary, neural networks have many pitfalls due to their lack of interpretability, and their use in important areas still needs careful consideration.

Supporting information

https://doi.org/10.1371/journal.pone.0267970.s001

References
  • 2. Eberhart R. and Kennedy J. (1995). A new optimizer using particle swarm theory. In MHS’95. Proceedings of the Sixth International Symposium on Micro Machine and Human Science, pages 39–43. IEEE.
  • 3. Kennedy J. and Eberhart R. (1995). Particle swarm optimization. In Proceedings of ICNN’95-international conference on neural networks, volume 4, pages 1942–1948. IEEE.
  • 8. Holland J. H. et al. (1975). Adaptation in natural and artificial systems.
  • 13. Szegedy C., Zaremba W., Sutskever I., Bruna J., Erhan D., Goodfellow I., et al. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.
  • 15. Kurakin A., Goodfellow I., Bengio S., et al. (2016). Adversarial examples in the physical world.
  • 16. Papernot N., McDaniel P., Jha S., Fredrikson M., Celik Z. B., and Swami A. (2016). The limitations of deep learning in adversarial settings. In 2016 IEEE European symposium on security and privacy (EuroS&P), pages 372–387. IEEE.
  • 18. Nguyen A., Yosinski J., and Clune J. (2015). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 427–436.
  • 19. LeCun Y. (1998). The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/.
  • 20. Deng J., Dong W., Socher R., Li L.-J., Li K., and Fei-Fei L. (2009). Imagenet: A large-scale hierarchical image database. In 2009 IEEE conference on computer vision and pattern recognition, pages 248–255. IEEE.
  • 22. Wikipedia (2021). Test functions for optimization. Website. https://en.wikipedia.org/wiki/Test_functions_for_optimization.
  • 23. Yasue S. (2018). Deep Learning from Scratch. Beijing: Posts and Telecom Press.

Oman Medical Journal, v.30(6); 2015 Nov

The Applications of Genetic Algorithms in Medicine

Ali Ghaheri

1 Department of Management and Economy, Science and Research Branch, Azad University, Tehran, Iran

Saeed Shoar

2 Department of Surgery, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran

Mohammad Naderan

3 School of Medicine Tehran University of Medical Sciences, Tehran, Iran

Sayed Shahabuddin Hoseini

4 Hannover Medical School, Germany

A great wealth of information is hidden amid medical research data that in some cases cannot be easily analyzed, if at all, using classical statistical methods. Inspired by nature, metaheuristic algorithms have been developed to offer optimal or near-optimal solutions to complex data analysis and decision-making tasks in a reasonable time. Due to their powerful features, metaheuristic algorithms have frequently been used in other fields of science. In medicine, however, the use of these algorithms is not well known to physicians, who may well benefit by applying them to solve complex medical problems. Therefore, in this paper, we introduce the genetic algorithm and its applications in medicine. The use of the genetic algorithm has promising implications in various medical specialties including radiology, radiotherapy, oncology, pediatrics, cardiology, endocrinology, surgery, obstetrics and gynecology, pulmonology, infectious diseases, orthopedics, rehabilitation medicine, neurology, pharmacotherapy, and health care management. This review introduces the applications of the genetic algorithm in disease screening, diagnosis, treatment planning, pharmacovigilance, prognosis, and health care management, and enables physicians to envision possible applications of this metaheuristic method in their medical career.

Introduction

There is no doubt that computers have revolutionized our everyday life. They are vastly used and have benefited nearly all fields of science from aerospace and astronomy to biology, chemistry, physics, mathematics, geography, archeology, engineering, and social sciences.

In medicine, electronic chips and computers are the backbones of a lot of imaging, diagnostic, monitoring, and therapeutic devices. These devices, which are composed of several different hardware components, are managed and controlled by software, which in turn is based on algorithms. An algorithm is a set of well-described rules and instructions that define a sequence of operations. Metaheuristic methods are algorithms that can more quickly solve complex problems, or they can find an approximate solution when classical methods are not able to find an exact one. [1]

Several metaheuristic algorithms for finding an optimal or near-optimal solution exist. These include the ant colony (inspired by ants’ behavior), [2] artificial bee colony (based on bees’ behavior), [3] Grey Wolf Optimizer (inspired by grey wolves’ behavior), [4] artificial neural networks (derived from the neural systems), [5] simulated annealing, [6] river formation dynamics (based on the process of river formation), [7] artificial immune systems (based on immune system function), [8] and genetic algorithm (inspired by genetic mechanisms). [9] Metaheuristic approaches have been frequently used in other fields of science where complex problems need to be solved, or optimal decisions should be made. In medicine, although valuable work has been done, the power of these potent algorithms for offering solutions to the countless complex problems physicians encounter every day has not been fully exploited.

In this paper, we introduce the genetic algorithm (GA) as one of these metaheuristics and review some of its applications in medicine.

The genetic algorithm

A GA is a metaheuristic method, inspired by the laws of genetics, trying to find useful solutions to complex problems. In this method, first some random solutions (individuals) are generated each containing several properties (chromosomes). Based on the laws of genetics, cross-over and mutations occur in chromosomes to produce a second generation of individuals with more diverse properties.

Crossover and mutation are the two most central methods for diversifying individuals. In crossover, two chromosomes are chosen. Then a crossover point along each chromosome is chosen, followed by the exchange of the values up to the crossover point between the two chromosomes [Figure 1]. These two newly-generated chromosomes produce new offspring. The process of crossover is iterated over and over until the desired diversity of individuals (i.e. solutions) is reached. Mutation also generates new configurations by applying random changes to different chromosomes. [10] One of the simplest mutation methods is depicted in Figure 1.

Figure 1: Methods to induce diversity in the population of individuals (candidate solutions). (a) During crossover, one part of a chromosome is exchanged by another fragment of another chromosome. (b) During mutations, one or more datasets on a chromosome are converted to different ones. These alterations will generate new individuals whose fittest (more optimal solutions) will survive.
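The two operations in the figure can be sketched in a few lines. This is a minimal illustration assuming chromosomes are lists of 0/1 values; the function names and the example parents are hypothetical, and real applications usually encode richer properties than single bits.

```python
import random

def crossover_pair(parent_a, parent_b, point):
    """Exchange the values up to the crossover point, giving two new chromosomes."""
    child_a = parent_b[:point] + parent_a[point:]
    child_b = parent_a[:point] + parent_b[point:]
    return child_a, child_b

def mutate(chromosome, rate, rng):
    """Flip each value independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

parent_a = [1, 1, 1, 1, 1, 1]
parent_b = [0, 0, 0, 0, 0, 0]

# The first two values are swapped between the parents.
child_a, child_b = crossover_pair(parent_a, parent_b, point=2)

rng = random.Random(0)
mutated = mutate(child_a, rate=0.1, rng=rng)
```

Crossover only recombines values that already exist in the two parents, whereas mutation can introduce a value at a position where neither parent had it.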

In a GA, the possibility of reproduction depends on the fitness of individuals. The better chromosomes they have (i.e., those with better characteristics), the more likely they are to be selected for breeding the next generation. There are several selection methods; however, the aim of all is to assign fitness values to individuals based on a fitness function and to select the fittest. Genetic alterations in chromosomes will happen via crossover and mutations to produce another generation. This iterative process will continue until the fittest individual (the optimal solution) is formed or the maximum number of generations is reached. 9 , 11
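The iterative process described above can be sketched as a short program. This is only an illustration under stated assumptions: the fitness function is a toy problem (counting the 1-genes in a binary chromosome), selection is fitness-proportional, and the best individual is copied unchanged into the next generation (elitism), which is a common addition but is not described in the text. All names and parameter values are hypothetical.

```python
import random

def fitness(chromosome):
    """Toy fitness function: the number of 1-genes in the chromosome."""
    return sum(chromosome)

def select_parent(population, rng):
    """Fitness-proportional selection: fitter individuals are picked more often."""
    weights = [fitness(ind) + 1 for ind in population]  # +1 avoids all-zero weights
    return rng.choices(population, weights=weights, k=1)[0]

def run_ga(n_genes=20, pop_size=30, max_generations=200, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # First generation: random candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        if fitness(best) == n_genes:   # optimal solution found: stop early
            break
        next_population = [best[:]]    # elitism (assumption): keep the best individual
        while len(next_population) < pop_size:
            mother = select_parent(population, rng)
            father = select_parent(population, rng)
            point = rng.randint(1, n_genes - 1)       # random crossover point
            child = mother[:point] + father[point:]   # single-point crossover
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_population.append(child)
        population = next_population
        best = max(population, key=fitness)
    return best

best = run_ga()
print(fitness(best), best)
```

The loop ends either when the optimal solution is found or when the maximum number of generations is reached, mirroring the two stopping conditions mentioned above.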

It is worth noting that GAs are different from derivative-based optimization algorithms. First of all, GAs search a population of points in the solution space in each iteration, while classical derivative-based methods search only a single point. Moreover, GAs select the next population using probabilistic transition rules and random number generators, while derivative-based algorithms use deterministic transition rules for selecting the next point in the sequence. 11 , 12

In the following, we introduce some of the applications of GAs in a variety of medical disciplines.

Imaging techniques in radiology generate a large amount of data that needs to be analyzed and interpreted by radiologists in a relatively short time. Computer-aided detection and diagnosis are rapidly growing interdisciplinary technologies that aim to assist radiologists in faster and more accurate image analysis by detection, segmentation, and classification of normal and pathological patterns found on various imaging modalities. These include X-rays, magnetic resonance imaging (MRI), computed tomography (CT) scans, and ultrasound. 13

In machine vision, an image of scenery (such as organs of the human body in radiology images) is acquired, processed, and interpreted. The boundaries (shape) and sizes of objects within the images need to be determined to assess the objects in more detail. Therefore, the process of edge detection becomes one of the integral parts of automatic image processing techniques. 14 Several researchers have used the GAs for edge detection of images acquired using different imaging modalities including MRI, CT, and ultrasound. 14 - 16

Screening mammography is the gold standard for detection of breast cancer; however, due to its failure rate, 17 , 18 researchers have tried to apply computational tools to improve the sensitivity of the system. In fact, the majority of the applications of GAs in radiology were performed on breast cancer screening primarily using mammography.

Karnan and Thangavel 19 applied the GA to detect microcalcifications in mammograms suggestive of breast cancer. In their method, after enhancement and normalization of the mammograms, the border of the breast and the nipple position were detected by the GA. Using the border and the nipple position of the right and left breasts as a reference, the mammogram images were aligned and subtracted from each other to find the asymmetry image suggestive of breast cancer. The Az value, which is the area under the receiver operating characteristic (ROC) curve, has been used as a useful measure for assessing the diagnostic performance of a system. 20 The Az value for their proposed algorithm was about 0.9. 19
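To make the Az value more concrete, the following sketch computes the area under the ROC curve from classifier scores. It uses the pairwise formulation, in which the area equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The scores are hypothetical and are not data from the cited study.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison of scores.

    Each (positive, negative) pair counts 1 if the positive case scores
    higher, 0.5 for a tie, and 0 otherwise.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))

# Hypothetical classifier outputs for illustration only.
cancer_scores = [0.9, 0.8, 0.7, 0.4]
normal_scores = [0.6, 0.3, 0.2, 0.1]

az = roc_auc(cancer_scores, normal_scores)
print(az)  # 0.9375
```

An Az of 1.0 would mean every cancer case is ranked above every normal case, while 0.5 corresponds to random guessing.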

In another study, Pereira et al, 21 applied a set of computational tools for mammogram segmentation to improve the detection of breast cancer. An algorithm was first designed to eliminate artifacts, followed by denoising and image enhancement. Subsequently, combining wavelet analysis and the GA allowed detection and segmentation of suspicious areas with 95% sensitivity. GAs have also been successfully used for classification and detection of clustered microcalcifications in digital mammograms. 22 - 24

In machine learning, feature selection is the process of selecting a subset of relevant features to construct a model by removing variables with little or no analytical value. Feature selection is important since choosing irrelevant features would increase the time, cost, and complexity of computation and reduce the accuracy of the model. 25 Besides, reducing the number of features would avoid the problem of over-fitting, reduce the chance of failure upon missing data, and allow for a better explanation and generalization of the model. 26

GAs have been applied for feature selection in studies aiming to identify a region of interest in mammograms as normal or containing a mass, 27 and to differentiate benign and malignant breast tumors in ultrasound images. 25
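One common way to use a GA for feature selection is to encode each candidate subset as a binary chromosome, with one gene per feature. The sketch below shows this encoding and a fitness function that rewards model quality while penalizing the number of selected features. The feature names, the penalty weight, and the `toy_score` stand-in for model accuracy are all hypothetical; the cited studies may have used different encodings and fitness functions.

```python
def selected_features(chromosome, feature_names):
    """Decode a binary chromosome: gene i == 1 means feature i is kept."""
    return [name for gene, name in zip(chromosome, feature_names) if gene == 1]

def subset_fitness(chromosome, score_subset, penalty=0.01):
    """Reward model quality and penalize the number of selected features."""
    n_selected = sum(chromosome)
    if n_selected == 0:
        return 0.0  # an empty subset cannot be used to build a model
    return score_subset(chromosome) - penalty * n_selected

def toy_score(chromosome):
    """Hypothetical stand-in for cross-validated accuracy.

    In this toy setting only features 0 and 2 carry information.
    """
    return 0.5 + 0.2 * chromosome[0] + 0.2 * chromosome[2]

names = ["area", "contrast", "entropy", "perimeter"]  # hypothetical features

compact = [1, 0, 1, 0]
everything = [1, 1, 1, 1]
print(selected_features(compact, names))  # ['area', 'entropy']
print(subset_fitness(compact, toy_score) > subset_fitness(everything, toy_score))  # True
```

With this fitness function, the GA is steered toward small subsets that retain predictive value, which matches the motivation for feature selection given above.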

de Carvalho Filho et al, 28 developed a GA for automatic detection and classification of solitary lung nodules. The designed algorithm could detect lung nodules with about 86% sensitivity, 98% specificity, and 98% accuracy.

Image registration or fusion is the process of optimal aligning of two or more images into one coordinate system. Precise integration of images becomes crucial when valuable information is embedded within several images acquired under different conditions (viewpoint, sensor, or time). 29 GAs have successfully been used to align MRI and CT scan images in several studies. 30 , 31 In another study, positron emission tomography (PET) images were fused with MRI images by a GA to generate colored breast cancer images. 32

Precise tumor staging is an important part of designing a treatment plan. Accurate tumor size and volume determination using non-invasive imaging studies becomes essential for tumor staging. Zhou et al, 33 developed a system for extraction of tongue carcinoma from head and neck MRIs. A GA was applied for segmentation of images followed by an artificial neural network (ANN)-based symmetry-detection algorithm to reduce the number of false positive results. This approach was able to extract tongue carcinoma from an MRI with high accuracy and minimal user-dependency.

Screening tests offer a valuable opportunity for early cancer detection, which if followed by proper treatment could improve the survival rate of patients.

To develop a non-invasive technique for cervical cancer detection, Duraipandian et al, 34 acquired Raman spectra from the cervical area via colposcopy. The biomolecular information generated via the Raman spectroscopy was analyzed by a GA-partial least square-discriminant analysis system to differentiate between a normal and dysplastic cervix. Partial least square (PLS) is a statistical method aiming to find a linear regression model between a dependent variable and some predictor variables. 35 This system was able to differentiate dysplasia from a normal cervix with 72% sensitivity and 90% specificity. 34

The advent of DNA microarrays has paved the way for massive gene expression profiling that could revolutionize the field of molecular diagnostics and prognosis. However, generation of large sets of data poses statistical and analytical challenges necessitating the need to find key predictive genes. 36 Due to the inherent capability of GAs to search and find the optimal solution among large and complex possible solutions with multiple simultaneous interactions, they have been applied to analyze microarray data from several cancer cell lines. 36 Dolled-Filhart et al, 37 generated microarray data by staining breast cancer tissues with several antibodies specific for various markers to find a minimum set of biomarkers with maximum classification and prognostication values in breast cancer patients. The data analyzed using GAs showed that three markers with available antibodies could define a population of patients with more than a 95% five-year survival rate.

Tan et al, 38 conducted a study to investigate the relationship between soil trace elements and cervical cancer mortality in China. A combination of GA and PLS was used to choose five out of 25 trace elements. Then a least square support vector machine (LSSVM) model was developed. LSSVM is a method used in machine learning to infer a function from or find a pattern in training data. 39 The results showed that a combination of GA-PLS and LSSVM could predict the mortality of cervical cancer based on trace elements. 38

One of the important and informative factors influencing the choice of an appropriate therapeutic approach for cancer patients is determination of the disease prognosis. In a retrospective study on more than 200 patients, Bozcuk et al, 40 compared the performance of four different data mining methods to determine the outcome of cancer patients not being in terminal stages after hospitalization. In comparison to other methods, GA selected the least number of explanatory variables (lactate dehydrogenase and the reason for admission) to predict the outcome of patients.

GAs have been used in different fields of cardiovascular medicine. Atherosclerotic plaques are hallmarks of most myocardial infarctions and strokes. Determination of plaque mechanical properties such as elasticity would enable physicians to better locate and map vulnerable or unstable plaques. Khalil et al, 41 used a system involving GAs for the parameter estimation necessary for accurate quantification of tissue elasticity. This system is superior to gradient-based methods used for parameter estimation, because the inefficiency of gradient-based techniques in inhomogeneous solution spaces containing several local minima and their requirement for substantial computational time limit their application. 41

The field of biomarker discovery and clinical proteomics is rapidly growing in medical diagnosis, prognosis, and disease follow-up. Advanced technologies such as mass spectrometry can generate readouts of thousands of proteins from patient samples; however, the cost and complexity of such techniques on the one hand and of the computational and statistical methods for analysis on the other necessitate the selection of a few relevant markers for clinical assay development. Zhou et al, 42 employed an improved version of the GA supported by a recursive local floating enhancement technique to predict the risk of a major adverse cardiac event (MACE). This technique was able to select a panel of seven proteins including myeloperoxidase to predict the risk of MACE with 77% accuracy, which outperformed several current methods.

Logistic regression models have been frequently used in diagnosing diseases. Due to its outstanding performance, a GA has been used to select the best variables for a logistic regression system aiming to model the presence of myocardial infarction in patients with chest pain. The GA-based method was superior in variable selection to other traditional methods. 26

One of the key elements in the automatic interpretation of the electrocardiogram (ECG) is the detection of QRS complexes that would allow assessment of heart rate variability and other relevant diagnostic parameters. Tu et al, 43 introduced a simple and effective GA to detect QRS complexes. Then, p-waves and f-waves, which occur in normal ECGs and in atrial fibrillation, respectively, were successfully extracted from patient databases. Such algorithms could allow comprehensive research into ECG details.

Endocrinology

Hypoglycemia is the most common complication of insulin therapy in patients with type 1 diabetes mellitus (T1DM). Hypoglycemia can induce alterations in the patterns of electroencephalograms (EEGs). Nguyen et al, 44 combined ANNs, GAs, and Levenberg-Marquardt (LM) training techniques to detect hypoglycemia based on EEG signals. An ANN was used to model the relationship between blood glucose and EEG signals. For training the ANN, the global search ability of the GA and the local search capability of LM were combined. Data from four EEG parameters derived from two EEG channels were used by the analyzing system to detect hypoglycemia with 75% sensitivity and 60% specificity. In another paper, a GA-based multiple regression with fuzzy inference system was developed to non-invasively detect episodes of nocturnal hypoglycemia in children with T1DM. Using heart rate and corrected QT interval, hypoglycemia was detected with a sensitivity of 75% and specificity of over 50%. 45

Obstetrics and gynecology

The differentiation between normal and prolonged delivery allows obstetricians to determine the optimal timing for interventions, if necessary, during childbirth. One of the parameters that can help to forecast the delivery time and segregate normal versus prolonged labor is the time to reach full cervical dilation. Hoh et al, 46 applied a three-parameter logistic model using GA or the Newton-Raphson (NR) method to predict the time to reach full cervical dilation. The GA-based algorithm outperformed the NR method by more accurately predicting the time to full cervical dilation.

A Pap smear is a cytology test for detection of precancerous and cancerous cervical changes. In this method, 20 features of cells are assessed to describe them as normal or abnormal or, more specifically, categorize them into seven classes. Marinakis et al, 47 generated a hybrid model that took advantage of the feature-selection capability of GAs to reduce the complexity of features necessary for a nearest neighbor algorithm for classification of Pap smear results. The new method outperformed several other previously used approaches by accurately classifying the Pap smear results.

GAs have also been applied in prenatal diagnosis. One of the fetal features that can complicate delivery is fetal macrosomia. In an attempt to differentiate the large-for-gestational-age (LGA) from the appropriate-for-gestational-age (AGA) infants, amniotic fluid from the second trimester was evaluated by capillary electrophoresis. Bayesian statistics was applied for data analysis. A GA was used to select the suitable wavelets (variables) of the electropherogram to minimize the computation time required for the Bayesian computation. This system was able to differentiate LGA from AGA using only two wavelets, one of albumin and the other of a negatively-charged unknown small molecule with 100% sensitivity and 98% specificity. 48

The prediction of fetal weight before delivery can reduce the potential problems associated with low-birth-weight infants. Yu et al, 49 introduced fuzzy logic into the support vector regression (FSVR) to estimate the fetal weight. GAs were used to generate an evolutionary FSVR to select the optimal features for the FSVR system. This outperformed a back-propagation neural network by achieving the lowest mean absolute percent error (6.6%) and the highest correlation coefficient (0.902) between the estimated and the actual fetal birth weight.
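The two evaluation measures mentioned above, mean absolute percent error and the correlation coefficient, can be computed as in the following sketch. The birth weights are hypothetical values in grams chosen for illustration, and the correlation coefficient is assumed to be Pearson's r.

```python
import math

def mean_absolute_percent_error(actual, estimated):
    """Average of |actual - estimated| / actual, expressed as a percentage."""
    total = sum(abs(a - e) / a for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical actual and estimated birth weights in grams.
actual_weights = [3000, 3500, 2500, 4000]
estimated_weights = [3150, 3325, 2625, 3800]

mape = mean_absolute_percent_error(actual_weights, estimated_weights)
r = pearson_r(actual_weights, estimated_weights)
print(round(mape, 2))  # 5.0
print(round(r, 3))
```

A lower mean absolute percent error and a correlation coefficient closer to 1 both indicate estimates that track the actual birth weight more closely.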

Cardiotocography is a cheap and non-invasive technique to assess the fetal heart rate and uterine contractions to determine fetal well-being. Ocak 50 applied a GA to select the optimal features of cardiotocogram recordings for a support vector machine (SVM) classifier. The results showed that the new system classified fetal health status as normal or abnormal with 99.3% and 100% accuracy, which was superior to an ANN algorithm designed for the same purpose.

Autism is a neurodevelopmental disease that appears in early childhood and is characterized by impaired social functioning and verbal and non-verbal communications and repetitive behavior. To recognize autism based on the microarray gene expression data, Latkowski and Osowski 51 used GAs to select the most relevant genes associated with the disease. Frequently selected genes include RMI1, NRIP1, TOP1, ZFHX3, CEP350, NFYA, PSENEN, ANP32A, SEMA4C, and SP1. These genes provided an input for an ensemble of classifiers including SVM and random forest classifiers. The introduced system recognized autism with 96% sensitivity and 83% specificity. 51

Acute lymphoblastic leukemia (ALL) is the most common type of leukemia in children and has many subtypes. Analysis of gene expression data derived from tumor cells can help classify cancers. Due to the enormous size of information generated from microarray gene expression profiling, Lin et al, 52 used a GA to select the most relevant genes needed for ALL classification. Silhouette statistics was applied as a discriminant function to differentiate between six ALL subtypes. The proposed technique reached a 100% classification accuracy and used fewer discriminating genes compared to other methods.

Aneuploidy is a condition where one or a few chromosomes in the nucleus of a cell are above or below the normal chromosomal number of a species. Conventional chromosomal studies on amniocentesis samples are performed for definite diagnosis of fetal aneuploidy yet the rather long required time for these techniques necessitates the development of faster diagnostic tests. To this end, the proteomic profile of the amniotic fluid specimens was identified via mass spectrometry and the generated data was assessed by a GA. The proposed method could detect aneuploidy with 100% sensitivity, 72%–96% specificity, 11%–50% positive predictive value and 100% negative predictive value. 53
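The four measures reported above follow directly from the counts of true and false positives and negatives. The sketch below uses hypothetical counts, not data from the cited study, to show how a test can reach 100% sensitivity and 100% negative predictive value while its positive predictive value stays low when the condition is rare in the tested population.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # affected cases correctly detected
        "specificity": tn / (tn + fp),  # unaffected cases correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are truly affected
        "npv": tn / (tn + fn),          # negative results that are truly unaffected
    }

# Hypothetical cohort: 5 affected and 95 unaffected pregnancies.
metrics = diagnostic_metrics(tp=5, fp=19, tn=76, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts, sensitivity and negative predictive value are both 100% and specificity is 80%, yet only 5 of the 24 positive results are true positives, so the positive predictive value is about 21%.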

ANNs are powerful mathematical algorithms capable of predicting the behavior of systems. Due to the predictive value of ANNs, a GA-based ANN (GANN) was developed to predict the outcomes after surgery for patients with non-small cell lung cancer (NSCLC). The GA was applied to help the optimization avoid falling into local minima. The GANN model could predict the outcome of NSCLC patients more accurately and significantly better than logistic regression. Besides, the inclusion of tumor size in calculations significantly improved prediction outcomes. 54

As populations age, the number of geriatric patients needing cardiac surgeries increases. Due to the high prevalence of comorbid conditions in elderly, proper prognostication of postoperative morbidity and mortality would be informative, precluding overestimation of risk and denial of surgery for patients deserving it, which could happen with some prediction models. Applying a GA, Lee et al, 55 showed that a short length of stay after cardiac surgery was correlated with younger age, no preoperative use of beta blockers, shorter cross-clamp time, and absence of congestive heart failure.

Pulmonology

In pulmonology, auscultation is the most common diagnostic method that can differentiate lung diseases and guide the diagnostic approach toward more specific techniques. To automate lung sound diagnosis, a hybrid GANN was designed. The GA was applied to optimize the ANN training parameters and reduce the computation time. The new system could classify the lung sounds into normal, wheeze, and crackle. 56

Assessment of the partial pressure of carbon dioxide in the arterial blood (PaCO2) is important in the management of critically ill patients. To avoid difficulties associated with arterial blood sampling, non-invasive methods for predicting PaCO2 such as assessment of exhaled carbon dioxide at end-expiration (PetCO2) could be applied in normal individuals; however, their use in sicker persons might be biased and less helpful. Engoren et al, 57 designed a GA to predict the PaCO2 using 11 variables from capnography of non-intubated patients in the emergency department. The proposed system could improve the precision and bias of PaCO2 prediction.

Infectious diseases

Tuberculosis is a potentially lethal infectious disease not only in developing countries but also in developed nations after the emergence of human immunodeficiency virus (HIV). To predict the diagnosis (tuberculosis vs. non-tuberculosis patients), 38 parameters composed of examination parameters and laboratory data were used to design an ANN trained by a GA. The classification accuracy of the system was about 95%, which was higher than the results obtained by other algorithms. 58

Highly active antiretroviral therapy (HAART), an integral part of the treatment modalities against HIV, is composed of a combination of several antiretroviral medications aiming to decrease the replication of the virus. Since long-term HAART treatment needs patient compliance and might be associated with some side effects, structured treatment interruption has been proposed to reduce not only side effects, but also the selection pressure on the virus that could lead to the emergence of resistant particles. Therefore, Castiglione et al, 59 devised a GA-based system to choose the best HAART treatment schedule to control HIV and help the immune system to reconstitute. A virtual model of the immune system was used to assess the effects of anti-HIV drugs on virtual patients. 59 , 60 The new structured interruption schedule could achieve therapeutic results and protection against an opportunistic infection comparable to a full-length treatment. 61

Radiotherapy

Intensity modulated radiotherapy (IMRT) was developed to transfer an accurate dose of radiation to a target such as the brain, prostate, or head and neck. Planning IMRT involves selection of 5–10 angles for wavelet projection and determining the radiation dose. The application of GA could improve the selection of gantry angles in a reasonable time frame. 62 Similar GA-based irradiation planning has been applied for patients with other types of cancer including pancreatic, 63 rhabdomyosarcoma, and brain tumors. 64 GAs have also been successfully used to optimize the design of stereotactic radiosurgery, and radiotherapy treatment plans. 65

Rehabilitation medicine

As the need for physical rehabilitation increases, novel treatment equipment and techniques have to be developed and tested. Refinement of these new methods needs changing various parameters and testing of the resultant techniques on individuals, which is time-consuming and costly. Development of musculoskeletal models enables computer simulation of movements to assess the effect of new modifications on the efficiency of training. Pei et al, 66 developed a robotic technique for physiotherapy of the lower limb. A GA was applied to generate custom-made treatment plans for each patient.

In another paper, a therapeutic robot was designed for lower limb exercise. The system that consisted of an ANN and a GA was capable of learning the actions of a physiotherapist for each patient and mimicked its behavior in the absence of a therapist. 67

Orthopedics

Biomedical engineering has offered great solutions to the field of orthopedic surgery. Total hip arthroplasty (THA) has improved the management of various disabling hip joint diseases. Yet, failure of the femoral stem of a THA can compromise the success of treatment. Ishida et al, 68 reported the use of a GA in designing an optimized geometry of the femoral stem component. GAs have also been exploited to select the best design of tibial locking screws to reduce the probability of screw breakage or loosening. 69 In another report, a combination of ANNs and GAs was applied to design spinal pedicle screws used for fixation of spinal fractures. The hybrid algorithm was able to design screws with a higher fatigue life and ideal pullout and bending characteristics. 70

Scoliosis is a three-dimensional deformity of spinal axis curves. The progression of the disease, which only happens in a small percentage of patients, is monitored by serial X-rays over time. Since frequent exposure to X-rays might increase the chance of cancer, it is desirable to assess the disease development using harmless methods. Jaremko et al, 71 developed a GA-based ANN algorithm to estimate the angle of spinal axis deformity from indices of trunk surface deformity. The hybrid system was able to determine the angle deformity within 5% accuracy in more than two thirds of patients.

Multiple sclerosis (MS) is a debilitating inflammatory disease of the neural system characterized by the formation of white matter scars otherwise known as plaques. Computer-assisted diagnosis has been applied for detection of pathologic features in these patients. In one study, a GA was developed to detect the MS lesions of brain MRIs. The similarity index of lesions determined by the GA and by a radiologist was 87%. 72
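The text does not define the similarity index. In lesion segmentation studies it often refers to the Dice coefficient, and the sketch below assumes that definition. The lesion masks are hypothetical sets of pixel coordinates, not data from the cited study.

```python
def dice_similarity(mask_a, mask_b):
    """Dice coefficient 2|A & B| / (|A| + |B|) for two sets of pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical lesion masks given as (row, column) pixel coordinates.
ga_lesion = {(10, 12), (10, 13), (11, 12), (11, 13)}
expert_lesion = {(10, 13), (11, 12), (11, 13), (12, 13)}

print(dice_similarity(ga_lesion, expert_lesion))  # 0.75
```

A value of 1.0 means the automatically detected lesion and the radiologist's outline overlap completely, and 0.0 means they do not overlap at all.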

The EEG is a useful diagnostic method to detect the abnormal brain electrical discharges occurring during a seizure. To design an automated system for detection of abnormal EEG signals, several learning algorithms (LM, Quickprop, Delta-bar delta, and Momentum and Conjugate gradient) were used to train an ANN for EEG-based classification of epileptic versus healthy individuals. A GA was used to find the optimal parameters for and architecture of the ANN. The results demonstrated that the LM method combined with the GA was the best algorithm for training the ANN, which reached a general success of 96.5% in its performance. 73

Several reports have suggested that mitochondrial dysfunction plays an important role in Parkinson’s disease. Since mitochondrial genetics has its idiosyncrasies, a simple comparison of mitochondrial mutations between healthy and disease conditions might not be so informative. Therefore, Smigrodzki et al, 74 devised a GA to detect biologically important patterns of mitochondrial mutations in Parkinson’s patients. The proposed system was able to diagnose Parkinson’s disease with 100% accuracy based on mutational patterns in mitochondrial DNA.

Pharmacotherapy

Pharmacovigilance, the study of safety and adverse effects of drugs, is not only an integral part of currently-used drug assessment; it is also a crucial element in the evaluation of novel investigational medicines. The clinical judgment of a pharmacotherapist to attribute an observed adverse effect to a drug is valuable yet implicit while algorithms can make a less arbitrary and more objective evaluation. Koh et al, 75 developed a GA-based quantitative system for the evaluation of adverse drug reactions. The new scoring system was able to determine a probability of the causality of an adverse drug reaction to a suspected drug with about 84% sensitivity and 71% specificity.

Tacrolimus is an immunosuppressive agent used to prevent rejection after organ transplantation. The drug has highly variable pharmacokinetics and a narrow therapeutic window making its blood level control an essential and difficult task. In an attempt to predict the blood concentration of tacrolimus in liver-transplanted patients, an ANN algorithm was developed. A GA was used to choose the best set of clinically significant candidate variables. For validation, predicted results were compared to observed figures. The ANN was able to predict the blood level of tacrolimus, with 84% of data sets being within a clinically acceptable range of 3 ng/ml of the observed data. 76

Studies have shown that poor pharmacokinetics and lack of efficiency account for more than 50% of failures in the process of drug development. The traditional assessment of the efficacy and pharmacokinetics of novel investigational agents in animal models is a costly and time-consuming process. Therefore, computational methods have evolved to generate quantitative structure-pharmacokinetic relationship (QSPKR) models for rapid in silico screening of novel potential drugs.

Zandkarimi et al, 77 applied a GA to select the most suitable characteristics out of more than 1480 descriptors of alkaloid drugs. These sets of characteristics were then extracted from known drugs for training an ANN to generate QSPKR prediction models. The new system was able to predict the volume of distribution, clearance, and plasma protein binding of alkaloid drugs with an acceptable efficiency.

Health care management

Proper management of monetary resources and personnel is an integral part of health systems all over the world. One of the important elements of hospital management that can improve patient servicing, satisfaction, and cost-effectiveness ratios is efficient scheduling of patient admissions. A mathematical model was developed and optimized using a GA to improve the patient scheduling in an ophthalmology hospital. The new algorithm was superior to the traditional "first come, first served" model in that it shortened the waiting list, lowered the vacancy rate of hospital beds, reduced the preoperative waiting time for patients, and increased the number of patients discharged from the hospital. 78 Another report showed that a combination of GA and particle swarm optimization, another powerful metaheuristic algorithm, was able to improve patient scheduling, reduce time wastage, and increase patient satisfaction. 79

In clinical laboratories, regular rotation of staff based on their skills through different facilities is fundamental for maintaining job skills and competence. GAs have been applied to improve staff rotation scheduling in a clinical laboratory. In one report, the GA-based software was capable of planning the rotation of staff effectively, ensuring maintenance of techniques and skills, saving time and the cost necessary for the scheduling process, and it was associated with the satisfaction of responsible supervisory personnel. 80

In this paper, we introduced GAs and some of their applications in various fields of medicine. Although GAs and some other metaheuristics are inspired by biology, experts in other fields of science are more aware of them and use these methods frequently to solve complex problems. Due to the inherent complexity of medicine, optimization methods could be of great value for physicians and medical researchers. The lack of an efficient interaction between computer scientists and physicians on the one hand and the unfamiliarity of the medical professions with complex mathematical formulas on the other are responsible for this situation. Therefore, improving the interaction and understanding between physicians, computer scientists, and engineers, which could happen via joint journal clubs or attendance at physicians' grand rounds and case report presentations, could solve the problem. Besides, improvement of interdisciplinary courses and efficient involvement of engineering researchers in health care environments and hospitals could offer new solutions for medical problems and new ideas for non-medical researchers.

The authors declared no conflicts of interest. No funding was received for this study.

Computer Science > Other Computer Science

Title: Genetic Algorithm: Reviews, Implementations, and Applications

Abstract: Nowadays, the genetic algorithm (GA) is widely used in engineering pedagogy as an adaptive technique to learn and solve complex problems and issues. It is a meta-heuristic approach that is used to solve hybrid computation challenges. GA utilizes selection, crossover, and mutation operators to effectively manage the search strategy. This algorithm is derived from natural selection and genetics concepts. GA is an intelligent use of random search, supported with historical data, to direct the search toward an area of improved outcome within a coverage framework. Such algorithms are widely used for maintaining high-quality solutions when investigating optimization issues and problems. These techniques are recognized to be somewhat of a statistical investigation process to search for a suitable solution or an approximate strategy for challenges in optimization or search. These techniques have been produced from natural selection or genetics principles. For random testing, historical information is provided with intelligent enslavement to continue moving the search out from the area of improved features for processing of the outcomes. It is a category of heuristics of evolutionary history using behavioral science-influenced methods like an annuity, gene, preference, or combination (sometimes referred to as hybridization). This method has proved to be a valuable tool for finding solutions to optimization problems. In this paper, the author has explored GAs, their role in engineering pedagogies, the emerging areas where they are being used, and their implementation.


  • Open access
  • Published: 05 September 2023

A bio-inspired convolution neural network architecture for automatic breast cancer detection and classification using RNA-Seq gene expression data

  • Tehnan I. A. Mohamed 1 ,
  • Absalom E. Ezugwu 2 ,
  • Jean Vincent Fonou-Dombeu 1 ,
  • Abiodun M. Ikotun 1 &
  • Mohanad Mohammed 1  

Scientific Reports volume  13 , Article number:  14644 ( 2023 ) Cite this article


  • Computational models
  • Machine learning

Breast cancer is considered one of the significant health challenges and ranks among the most prevalent and dangerous cancer types affecting women globally. Early breast cancer detection and diagnosis are crucial for effective treatment and personalized therapy. Early detection and diagnosis can help patients and physicians discover new treatment options, provide a more suitable quality of life, and ensure increased survival rates. Breast cancer detection using gene expression involves many complexities, such as the issue of dimensionality and the complicatedness of the gene expression data. This paper proposes a bio-inspired CNN model for breast cancer detection using gene expression data downloaded from the cancer genome atlas (TCGA). The data contains 1208 clinical samples of 19,948 genes with 113 normal and 1095 cancerous samples. In the proposed model, Array-Array Intensity Correlation (AAIC) is used at the pre-processing stage for outlier removal, followed by a normalization process to avoid biases in the expression measures. Filtration is used for gene reduction using a threshold value of 0.25. Thereafter the pre-processed gene expression dataset was converted into images which were later converted to grayscale to meet the requirements of the model. The model also uses a hybrid model of CNN architecture with a metaheuristic algorithm, namely the Ebola Optimization Search Algorithm (EOSA), to enhance the detection of breast cancer. The traditional CNN and five hybrid algorithms were compared with the classification result of the proposed model. The competing hybrid algorithms include the Whale Optimization Algorithm (WOA-CNN), the Genetic Algorithm (GA-CNN), the Satin Bowerbird Optimization (SBO-CNN), the Life Choice-Based Optimization (LCBO-CNN), and the Multi-Verse Optimizer (MVO-CNN). 
The results show that the proposed model determined the classes with high-performance measurements with an accuracy of 98.3%, a precision of 99%, a recall of 99%, an f1-score of 99%, a kappa of 90.3%, a specificity of 92.8%, and a sensitivity of 98.9% for the cancerous class. The results suggest that the proposed method has the potential to be a reliable and precise approach to breast cancer detection, which is crucial for early diagnosis and personalized therapy.


Introduction

Breast cancer (BRCA) is the most prevalent cancer in women, and it is characterized by the uncontrolled division and expansion of breast cells 1 , 2 . Industrialized and developing nations are experiencing increased cancer incidence and prevalence 3 . Breast cancer incidence and death rates are serious public health concerns 4 . The World Health Organization (WHO) estimates that in 2023 there will be more than 2.3 million new instances of breast cancer globally and 685,000 deaths from the disease 5 . Early detection and accurate diagnosis of BRCA are crucial for effective treatment and personalized therapy. Morphological characteristics play an important role in detecting and diagnosing breast cancer. When a sample of breast tissue is obtained through a biopsy or surgical procedure, a pathologist examines the tissue under a microscope and looks for specific morphological features that are associated with breast cancer, such as abnormal cell growth, changes in cell shape or size, and the presence of cancerous cells. These morphological characteristics can provide important information about the type, stage, and aggressiveness of the cancer, which can help guide treatment decisions and predict patient outcomes. While morphological examination remains a crucial tool in the detection and diagnosis of breast cancer, it has some limitations 6 , 7 , 8 , 9 .

The limitations of morphological characteristics in detecting and diagnosing breast cancer can lead to bias and difficulty in identification by physicians 10 . Advancements in microarray technology and the more recent Next Generation Sequencing (NGS) have made gene expression profiling of patients widely available, resulting in the collection of gene expression datasets corresponding to various diseases. This shift has marked a significant transformation in personalized medicine, departing from traditional descriptive "morphological" classification approaches towards a more comprehensive strategy that considers clinical characteristics and immunohistochemical biomarkers. Today, gene expression profiling has become well-integrated into routine clinical practice 11 , 12 . Breast cancer researchers have examined gene expression profiling in depth, and clinical oncologists are starting to use the findings of these studies in their daily practices. Also, the early detection and treatment of different cancer types have benefited from mining gene expression level data 13 . Many methods are designed to accurately predict breast cancer based on gene expression data 14 , 15 , 16 . Computational techniques are becoming increasingly crucial in detecting breast cancer due to the rapid growth of computer technology. However, the use of computational techniques is affected by gene expression dataset characteristics such as small dataset sizes, excessive dimensionality, and unbalanced data 17 . Several machine learning, deep learning, and metaheuristic techniques have been created and applied to detect and classify cancer using gene expression data.

Khalsan et al. 18 presented an extensive overview of recent cancer research works that utilize gene expression data from various types of cancer, including kidney, breast, ovarian, lung, liver, gallbladder and central nervous system. The review encompasses several facets of machine learning in cancer research, including cancer classification, cancer prediction, identification of biomarker genes, and using microarray and RNA-Seq data. Yuan et al. 19 applied different methods of machine learning for the detection of lung cancer through the use of gene expression data. A novel computational method for detecting breast cancer was proposed by Wang et al. 20 based on incorporating random forest (RF), Monte Carlo feature selection (MCFS), rough set-based rule learning, SVM, and dagging. A deep learning method that uses Stacked Denoising Autoencoder (SDAE) to identify genes that can effectively differentiate between tumor and healthy cases of breast cancer was proposed by Danaee et al. 21 . BRCA gene expression data from TCGA and gene expression omnibus (GEO) was analyzed by Jia et al. 22 . They used differentially expressed genes (DEG) and weighted gene co-expression network analysis (WGCNA) to select the most significant genes. A deep learning model combined with an artificial intelligence-based feature selection method (AIFSDL-PCD) using gene expression data was proposed by Alshareef et al. 23 for detecting prostate cancer.

The field of cancer prediction using machine and deep learning methods based on gene expression data has seen significant progress in recent years. Despite this progress, the existing models have some issues affecting their performance. Choosing the feature representation, the optimal architecture (including the number of layers and nodes), suitable model parameters, and the best values for weights and biases are all critical steps in improving performance 24 , 25 , 26 . Moreover, selecting the most suitable learning rates and regularization parameters can affect the model's ability to generalize to unseen data. Therefore, this paper aims to resolve these issues by finding a precise prediction model and advancing the state-of-the-art use of CNN to classify gene expression data using metaheuristic methods to optimize the CNN model.

Metaheuristic algorithms are optimization algorithms that search for solutions by exploring a large search space and iteratively improving candidate solutions. They have the ability to handle NP-hard problems, which are computationally intractable problems that cannot be solved using exact methods, by providing near-optimal solutions within a reasonable amount of time 27 , 28 , 29 . Metaheuristic optimization algorithms have been identified as an effective tool for solving large-scale optimization problems in bioinformatics. Many of these problems can be classified as NP-hard; thus, researchers have relied heavily on metaheuristic methods to address them. The metaheuristic methods allow for the efficient solution of large-scale samples while minimizing the use of computational resources. Despite the availability of various optimization methods, metaheuristic optimization algorithms are instrumental in solving optimization problems due to their flexibility in providing high-quality optimization solutions in a relatively short amount of computing time 30 . The use of metaheuristics models assists in solving the problems of high dimensionality, the complexity of variable relationships and noisy data peculiar to gene expression data. In addition, metaheuristics models can handle noisy and non-linear data by incorporating techniques such as randomization and simulated annealing to escape from local optima 31 . Chakraborty et al. 32 presented a metaheuristic method for skin disease classification based on an artificial neural network. In MotieGhader et al. 33 , metaheuristic methods, including GA, WCC, PSO, CUK, ICA, LA, HTS, ACO, FOA, DSOS, and LCA, with an SVM classifier were used for the detection of breast cancer based on mRNA and micro-RNA expression data.

This paper proposes using the metaheuristic model EOSA-CNN for breast cancer detection using gene expression data 34 . EOSA is a new optimization algorithm with excellent performance track records in different application domains 35 , 36 , 37 , 38 , 39 . It is population-based and bio-inspired, developed by taking cues from the Ebola virus's effective propagation. The algorithm's framework was designed based on the spread of Ebola virus disease (EVD) 34 , 40 . This research makes significant contributions by introducing a bio-inspired CNN model for detecting breast cancer using gene expression data from the TCGA repository. The AAIC method is used for pre-processing to remove outlier samples, after which normalization and filtration are applied. Furthermore, we converted the pre-processed data into 2D images that can be utilized in the CNN architecture. The study also proposes a hybrid of the proposed CNN architecture that employs the EOSA to enhance the classification performance. The proposed model showed its ability to classify the tumor and normal samples with high accuracy and reliability. In our proposed model, the best combination of weights required for the feature extraction is obtained using the EOSA algorithm to handle the classification problem. Therefore, this study presents a hybrid model that combines the proposed CNN and EOSA for the process of classification based on BRCA gene expression data. Consequently, in this study, the main contributions are as follows:

Applying various pre-processing techniques (such as removing outliers, normalizing, and filtering) to prepare the gene expression data.

Transforming the gene expression data into two-dimensional images.

Proposal of a novel bio-inspired CNN architecture for the detection of breast cancer.

Introducing a hybrid model that combines the proposed CNN and EOSA for the classification process.

Assessing and comparing the proposed model with other metaheuristic algorithms combined with the proposed CNN.

The rest of the paper is structured as follows: a detailed account of the related work is given in Section “Related work”, while Section “Model Methodology” describes the model technology discussing the CNN Architecture and the Ebola Optimization Algorithm CNN Model (EOSA-CNN) along with the associated algorithms. Section “Experimentation, results and discussion” presents the experimental results with a discussion of the results. Comparison with results from the literature, the strengths and limitations of the model are also enumerated. Finally, the conclusion and the recommendations for future work are presented in Section “Conclusion and future work”.

Related work

As noted earlier, several machine learning, deep learning, and metaheuristic techniques have been created and applied to detect and classify cancer using gene expression data. Yuan et al. 19 applied different machine-learning methods for detecting lung cancer through gene expression data. The Monte Carlo and incremental feature selection methods were used to identify the most important genes. Then, SVM and random forest (RF) were implemented, and their performances were compared. The results indicated that SVM achieved an accuracy, sensitivity, specificity, precision, and F1-measure of 100%, 93.2%, 96.7%, 93.9%, and 96.9%, respectively. These results are higher than those obtained using RF. Wang et al. 20 proposed a novel computational method called Patient-derived tumor xenograft (PDX) for breast cancer detection by incorporating Monte Carlo feature selection, RF, rough set-based rule learning, SVM, and dagging. Danaee et al. 21 proposed a deep learning approach that uses a Stacked Denoising Autoencoder (SDAE) to identify genes that can effectively differentiate between tumor and healthy cases of breast cancer. They tested the efficacy of the extracted features using an artificial neural network (ANN), SVM, and SVM-RBF. The results showed that using the SDAE method with SVM-RBF achieved the highest accuracy of 98.26%.

Jia et al. 22 analyzed BRCA gene expression data from TCGA and GEO using differentially expressed genes (DEG) and weighted gene co-expression network analysis (WGCNA) to select the most significant genes. Twenty-three hub genes were then identified using a protein–protein interaction (PPI) network. They applied SVM, decision tree (DT), Bayesian network (BN), ANN, and convolutional neural networks (CNN-LeNet and CNN-AlexNet), and the results showed that ANN has the best performance with an average accuracy of 97.36%. Elbashir et al. 41 developed a lightweight CNN model for detecting breast cancer using RNA-Seq gene expression data. They first pre-processed the data by removing outliers, followed by normalization and filtration. Then they converted the gene expression profiles into 2-D images and applied a lightweight CNN model for the classification. Their model achieved an accuracy of 98.76%. Alshareef et al. 23 proposed a deep learning model with an artificial intelligence-based feature selection method for prostate cancer detection (AIFSDL-PCD) using gene expression data. The novelty of their approach lies in a feature selection (FS) method based on chaotic invasive weed optimization (CIWO) to select the optimal genes. Their results showed sensitivity, specificity, precision, F1-measure, and accuracy of 97.25%, 97.25%, 0.967%, 97.14%, 97.28%, and 97.19%, respectively. Chakraborty et al. 32 presented a metaheuristic method for skin disease classification based on an artificial neural network. Their proposed method used the non-dominated sorting genetic algorithm II (NNNSGAII) to train an ANN. The proposed method obtained 87.92% accuracy, 94.2% precision, 87.5% recall, and 90.73% F-measure.

MotieGhader et al. 33 used metaheuristic methods, including world competitive contest (WCC), league championship algorithm( LCA), GA, particle swarm optimization (PSO), ant colony optimization (ACO), imperialist competitive algorithm (ICA), learning automata (LA), heat transfer optimization algorithm (HTS), Forest optimization algorithm (FOA), discrete symbiotic organisms search (DSOS), and cuckoo optimization (CUK), with an SVM classifier for breast cancer detection using mRNA and micro-RNA expression data. The proposed algorithm selected 186 mRNAs out of 9,692 and 116 miRNAs out of 489 and obtained an accuracy above 90% for the miRNAs dataset and 100% for the mRNA dataset. Wei et al. 42 proposed a generative adversarial model based on cancer genetic data (GANs). They used 12 different gene expression data from the TCGA, including lung, breast, prostate, colon, gastric, liver, rectal, esophageal, thyroid, clear cell renal cell carcinoma (CCRCC), uterine, and head and neck squamous cell carcinomas (HNSCC). They further used a reconstruction loss to enhance stability during model training. From their results, an accuracy of 92.6% was achieved by their proposed model. Deng et al. 43 proposed a gene selection model in a two-stage format for cancer classification in microarray datasets. Their approach combined a multi-objective optimization genetic algorithm (XGBoost-MOGA) with gradient boosting (XGBoost). During the first stage, the XGBoost-based feature selection is used in ranking the genes to eliminate genes that are not relevant effectively, thereby leaving a group of genes that are most relevant to the class. In the second stage, a subset of optimal genes from the group of the most relevant genes is identified using XGBoost-MOGA through multi-objective optimization. 
The proposed method was compared with other state-of-the-art feature selection methods using two widely used learning classifiers on 14 publicly available microarray datasets. The results demonstrated that XGBoost-MOGA outperformed previous methods in terms of accuracy, F-score, precision, and recall.

Houssein et al. 44 combined the Barnacles Mating Optimizer (BMO) algorithm with SVM, in an approach called BMO-SVM, to select the genes that contribute to the most accurate prediction of cancer from microarray gene expression datasets. They evaluated the proposed model using four benchmark microarray datasets, including leukemia1, lymphoma, a small-round-blue-cell tumor (SRBCT), and leukemia2. From their results, the proposed BMO-SVM approach performed better than the other well-known methods, such as Particle Swarm Optimization (PSO), the Tunicate Swarm Algorithm (TSA), Artificial Bee Colony (ABC), and Genetic Algorithm (GA). Devi et al. 45 proposed an Improved Whale Optimization Algorithm (IWOA) for gene selection. The proposed solution used a multi-objective fitness function that balances error rate minimization and feature selection. The results show that the proposed IWOA obtained a minimal subset of genes used for the BRCA classification using Gradient Boost Classifier (GBC) and achieved an accuracy of 97.7%. The related studies are summarised and presented in Table 1 .

From the existing literature, various shortcomings were discovered regarding utilizing deep learning models for the given task. Deep learning models necessitate substantial data, and acquiring sizable, high-quality datasets for analyzing breast cancer gene expression can be challenging. Consequently, this can cause overfitting of the model to the training data, thereby resulting in inadequate performance on fresh, unobserved data. The computational complexity and time required for developing and training deep learning models can pose a significant hurdle to their widespread implementation in clinical practice. The complexity of breast cancer, which entails numerous biological processes such as cell proliferation, invasion, and angiogenesis, may not be captured entirely by deep learning models, thereby restricting their capacity to forecast outcomes or recognize potential therapeutic targets precisely. To resolve this challenge, optimizing the CNN model becomes necessary using suitable approximate optimization methods. Metaheuristic optimization algorithms have been applied to solve these problems. Nevertheless, the critical challenge of using deep learning models for effectively and efficiently classifying breast cancer remains unresolved. Therefore, this paper aims to enhance the efficacy of DL models on breast cancer detection and classification using gene expression data by leveraging a new optimization algorithm inspired by the biological mechanism of the Ebola disease.

Model methodology

Dataset and pre-processing.

The BRCA gene expression data were obtained from the Cancer Genome Atlas (TCGA) repository using the R software. The query was built with the GDCquery function from the TCGAbiolinks library 41 , 46 . The BRCA dataset contains 1208 clinical samples and 19,948 genes (features), with 113 normal and 1095 tumor samples. The data were found to be noisy and high-dimensional. Therefore, several pre-processing steps were applied to obtain clean data with genes that contribute positively to BRCA detection. To identify outlier samples, the array-array intensity correlation (AAIC), which defines a symmetric matrix of Spearman correlations between samples, was calculated 47 . A cut-off value of 0.6 was used to define and remove the outlier samples. Normalization was applied to the gene expression data to ensure the validity of the expression levels and avoid biases in the analysis 48 , using the TCGAanalyze_Normalization function from the TCGAbiolinks library. Filtration was then performed with a cut-off value of 0.25 to reduce the number of genes by selecting those whose mean expression values are higher than the cut-off 41 , 49 . The pre-processing yielded a dataset of 1208 clinical samples with 14,895 genes.
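The outlier-removal and filtration steps described above can be illustrated with a minimal Python sketch. This is not the TCGAbiolinks implementation: the function names are hypothetical, the rule "drop a sample whose mean Spearman correlation with the other samples falls below the cut-off" is an assumption about how the AAIC matrix is turned into an outlier decision, and ties in expression values are ranked arbitrarily for brevity.

```python
import numpy as np

def spearman_matrix(expr):
    """Spearman correlation between samples; `expr` is genes x samples."""
    # Rank every sample's expression values (ties get an arbitrary order here).
    ranks = np.argsort(np.argsort(expr, axis=0), axis=0).astype(float)
    # Pearson correlation of the ranks is the Spearman correlation.
    return np.corrcoef(ranks, rowvar=False)

def remove_outlier_samples(expr, cutoff=0.6):
    """Assumed rule: keep samples whose mean correlation with the others >= cutoff."""
    corr = spearman_matrix(expr)
    n = corr.shape[0]
    mean_corr = (corr.sum(axis=1) - 1.0) / (n - 1)  # exclude the diagonal
    keep = mean_corr >= cutoff
    return expr[:, keep], keep

def filter_genes(expr, cutoff=0.25):
    """Keep genes whose mean expression across samples exceeds the cut-off."""
    keep = expr.mean(axis=1) > cutoff
    return expr[keep, :], keep
```

With a genes-by-samples matrix `expr`, calling `remove_outlier_samples(expr)` and then `filter_genes(...)` on the result mirrors the order of the steps in the text, with normalization omitted.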

Each sample's gene expression vector was reshaped from 1D into a 2D image with a dimension of 122 × 123 to be appropriate for our metaheuristic models. Because the number of gene columns does not match the desired dimension exactly, 112 columns of zeros were appended at the end to adjust the image size 41 , 50 . Moreover, we transformed the images into grayscale using the cvtColor() function from the OpenCV library in Python. This was done to ensure that the images met the requirements of the classification model and to improve image quality. Once the images were converted, they were prepared as input for the hybrid model. Figure  1 shows the proposed methodology.
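The zero-padding and reshaping step can be sketched as follows. This is an illustrative helper, not the authors' code: the name `vector_to_image` is hypothetical, and the padding length is computed from the target image size rather than fixed in advance.

```python
import numpy as np

def vector_to_image(features, rows, cols):
    """Pad a 1D feature vector with trailing zeros and reshape it to rows x cols."""
    features = np.asarray(features, dtype=float)
    n_pad = rows * cols - features.size
    if n_pad < 0:
        raise ValueError("image is too small for the feature vector")
    padded = np.concatenate([features, np.zeros(n_pad)])
    return padded.reshape(rows, cols)
```

For instance, a toy vector of 10 values placed into a 3 × 4 image receives two trailing zeros; the same call with the paper's image dimensions would apply to each sample's gene vector.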

Figure 1: The proposed methodology.

The CNN architecture

After the pre-processing step, the resulting images were used as input to the model. A specially designed CNN was used for the optimization model. The architecture of the proposed CNN model is a deep neural network designed to analyze and classify gene expression images with dimensions of \(150 \times 150\) pixels and a single colour channel (grayscale). The model consists of multiple convolutional layers with increasing filter sizes, followed by max pooling layers to reduce the spatial dimensions of the feature maps. The architecture is designed to extract and learn high-level features from the input images, gradually increasing the number of filters to capture more complex patterns. The final output of the convolutional layers is flattened and passed through a Dropout layer, which randomly drops out some of the neurons to prevent overfitting. The final output layer is a Dense layer with ReLU activation that is fully connected. The CNN model architecture designed in this study is shown in Fig.  2 . The proposed CNN model for breast cancer detection has a specific architecture that utilizes filters (denoted by "F"), kernels (denoted by "K"), and strides (denoted by "S").

Figure 2: The proposed CNN architecture for the detection of breast cancer.

Ebola optimization search algorithm CNN model (EOSA-CNN)

Ebola is a viral hemorrhagic fever that affects humans and primates, also called Ebola hemorrhagic fever or Ebola virus disease. The Ebola viruses cause this disease, which can cause individuals to transition between susceptible, quarantined, infected, recovered, hospitalized, and deceased subpopulations in a seemingly random manner. Drawing inspiration from the Ebola virus's ability to spread effectively, a novel optimization algorithm that is both bio-inspired and population-based was developed. The method of the propagation of Ebola disease (EVD) 34 was adopted in the design of the algorithm. To update the propagation, the EOSA model used a dynamic mechanism for propagation via susceptible, infection, quarantine, recovered, and hospitalized operations to gain a better fit. It helped to find the best or worst solution and provided an intuitive outcome. In this paper, the EOSA metaheuristic algorithm was hybridized with CNN to improve the performance of the CNN model. This was accomplished in all the iterations when the metaheuristic algorithm was trained to achieve the solution vector and update the CNN model. The weights and biases for the CNN were updated, and the loss function was subsequently calculated. Thereafter, the results obtained were compared with different hybrid models. The following steps describe the EOSA-CNN Model:

1. Set up the initial scalar and vector quantities for parameters and individuals, respectively. Assign initial values to individuals categorized as Susceptible (S), Infected (I), Recovered (R), Dead (D), Vaccinated (V), Hospitalized (H), and Quarantine (Q).
2. Randomly select an individual from the susceptible individuals as the index case ( \({\mathrm{I}}_{1}\) ).
3. Designate the index case as the global and current best, then compute its fitness value.
4. While there is at least one infected individual and the number of iterations is not complete:
   a. Update the position of each susceptible individual based on their displacement, and generate newly infected individuals (nI) accordingly. Note that the greater the displacement of an infected case, the higher the infection rate, with shorter displacement representing exploitation and longer displacement signifying exploration.
   b. Based on (a), create individuals that are newly infected.
   c. The newly generated cases are then added to the newly infected individuals created in I.
   d. Evaluate the number of individuals to be added to R, H, D, Q, V, and B determined by the size of I, based on their rates, respectively.
   e. Use nI to update I and S.
   f. Choose the current best from I and compare it with the global best.
5. While stopping criteria are not satisfied, return to step 4.
6. Return all solutions and the global best solution.
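The steps above can be sketched as a heavily simplified search loop. This is not the published EOSA: the function name `eosa_sketch`, the parameter defaults, the 50/50 choice between short (`srate`) and long (`lrate`) displacements, and the rule of keeping only the fittest infected individuals are all assumptions made for illustration, and the compartment bookkeeping for R, H, D, Q, and V is omitted.

```python
import numpy as np

def eosa_sketch(fitness, lower, upper, dim, pop_size=20, iters=50,
                rho=0.1, srate=0.5, lrate=2.0, seed=0):
    """Simplified EOSA-style loop that minimises `fitness` (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Step 1: susceptible population sampled uniformly inside the bounds.
    susceptible = rng.uniform(lower, upper, size=(pop_size, dim))
    # Step 2: a random susceptible individual becomes the index case.
    infected = [susceptible[rng.integers(pop_size)].copy()]
    # Step 3: the index case is both the current and the global best.
    g_best = infected[0].copy()
    g_fit = fitness(g_best)
    for _ in range(iters):  # Step 4: main loop
        if not infected:
            break
        new_infected = []
        for pos in infected:
            # (a) short displacement = exploitation, long = exploration (assumed 50/50).
            rate = srate if rng.random() < 0.5 else lrate
            move = rate * rng.uniform(-1.0, 1.0, size=dim)
            new_infected.append(np.clip(pos + rho * move, lower, upper))
        # (b)-(c) one more susceptible individual joins the newly infected.
        new_infected.append(susceptible[rng.integers(pop_size)].copy())
        # (d)-(e) simplified update: keep only the fittest pop_size infected.
        infected = sorted(infected + new_infected, key=fitness)[:pop_size]
        # (f) compare the current best with the global best.
        c_fit = fitness(infected[0])
        if c_fit < g_fit:
            g_best, g_fit = infected[0].copy(), c_fit
    return g_best, g_fit
```

In the hybrid model the fitness function would be the CNN loss evaluated for a candidate weight vector; here any callable that maps a NumPy vector to a float can be used, for example a sphere function.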

figure a

The pseudocode in Algorithm 1 presents the algorithm, which uses mathematical models and evolutionary optimization techniques to optimize a CNN model. The algorithm starts by initializing variables such as the CNN model's objective function, lower and upper bounds, batch size, number of epochs, population size, and the incubation period. It also creates empty sets for the groups of individuals (Quarantine (Q), Susceptible (S), Exposed (E), Recovered (R), Hospitalized (H), Vaccinated (V), Infected (I)) and for the solutions. The set of susceptible individuals is then generated, the time is set to 0, and an index case is randomly generated. The current best and global best solutions are set to the index case. The positions of the exposed individuals are then updated using the mathematical model in Eq. ( 1 ): \(m{I}_{i}^{t+1}=m{I}_{i}^{t}+\rho M\left(I\right)\) .

The displacement scale factor of individuals is represented by \(\rho\) , while \(m{I}_{i}^{t+1}\) and \(m{I}_{i}^{t}\) indicate the updated position at time \(t+1\) and the original position at time \(t\) , respectively. The movement rate of each individual, represented as \(M\left(I\right)\) , is calculated using Eqs. ( 2 ) and ( 3 ).

The exploration stage of the EOSA involves the infected individual moving beyond the normal neighbourhood range, \(lrate\) . In contrast, during the algorithm's exploitation phase, it is assumed that the infected individual is either displaced within a limit of \(srate\) relative to its previous position or remains in place, that is, has a displacement of zero (0).

The algorithm also uses Eq. ( 4 ) to generate the susceptible population, where \({U}_{i}\) and \({L}_{i}\) denote the upper and lower bounds for the \({i}^{th}\) individual, \(i=1,2,\ldots ,N\) . Eq. ( 5 ) computes the global best solution, and Eqs. ( 6 ), ( 7 ), ( 8 ), ( 9 ), ( 10 ), ( 11 ) and ( 12 ) update the populations of the dead, infected, susceptible, hospitalized, recovered, vaccinated, quarantined, funeral, and exposed groups. These equations are scalar functions that represent each population's rate of change.

To determine the current best ( \(cBest\) ), the individuals infected in time t are evaluated, and the global best ( \(gBest\) ) is calculated using Eq. ( 5 ):

At time t, the terms \(cBest,\) \(bestS\) , and \(gBest\) represent the current best solution, best solution, and global best solution, respectively. The objective function used for the problem is denoted by the term \(fitness\) .

The differential equations used by the algorithm to update the populations of Quarantine (Q), Susceptible (S), Infected (I), Recovered (R), Vaccinated (V), Dead (D), Funeral (F), Exposed (E), and Hospitalized (H) individuals are given in Eqs. ( 6 ), ( 7 ), ( 8 ), ( 9 ), ( 10 ), ( 11 ) and ( 12 ).

Equations ( 6 ), ( 7 ), ( 8 ), ( 9 ), ( 10 ), ( 11 ) and ( 12 ) are scalar functions; the rate of change of the infected population, for example, is \(\frac{\partial I\left(t\right)}{\partial t}=\left({\beta }_{1}I+{\beta }_{3}D+{\beta }_{4}R+{\beta }_{2}\left(PE\right)\lambda \right)S-\left(\Gamma +\gamma \right)I-\left(\tau \right)S\) . For each function, a single float value is assigned. The rate at which the susceptible population changes is specified, and it is used to determine the number of susceptible individuals at time \(t\) by applying it to the susceptible vector's current size. The sets of individuals in vectors I, H, R, V, D, and Q are calculated using the same procedure. The initial conditions are assumed to be \(S\left(0\right)={S}_{0}\) , \(I\left(0\right)={I}_{0}\) , \(R\left(0\right)={R}_{0}\) , \(D\left(0\right)={D}_{0}\) , \(P\left(0\right)={P}_{0}\) , and \(Q\left(0\right)={Q}_{0}\) , where \(t\) follows the epoch, and the term \(\delta\) in Eq. ( 11 ) represents the burial rate. The quarantine rate for infected Ebola cases is given by Eq. ( 12 ).
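As a worked illustration of how such a scalar rate can be evaluated, the sketch below computes the right-hand side of the infected-population equation quoted in the text and applies one explicit Euler step. The function names, the argument names, and the use of an Euler step are assumptions for illustration; the source does not state how the rates are integrated.

```python
def infected_rate(S, I, D, R, PE, beta1, beta2, beta3, beta4,
                  lam, Gamma, gamma, tau):
    """Right-hand side of dI/dt as quoted in the text (scalar inputs)."""
    # Contact term: infected, dead, recovered, and environment contributions.
    force = beta1 * I + beta3 * D + beta4 * R + beta2 * PE * lam
    return force * S - (Gamma + gamma) * I - tau * S

def euler_step(value, rate, dt=1.0):
    """One explicit Euler update of a compartment size (assumed scheme)."""
    return value + dt * rate
```

For example, with S = 10, I = 2, D = R = PE = 1, all rate parameters equal to 0.1 and \(\lambda = 1\), the contact term is 0.5, so the rate is 0.5 × 10 − 0.2 × 2 − 0.1 × 10 = 3.6.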

Experimentation, results and discussion

System configuration and algorithm parameter settings

The experiments were conducted on a Dell OptiPlex 5050 computer with the following configuration: a 7th-generation Intel Core i5 processor, a 500 GB hard disk, and 16 GB of memory. All the models were developed using Python. The EOSA-CNN model's performance was compared to that of a standalone CNN and five other metaheuristic algorithms, namely MVO-CNN (physics-based), GA-CNN (evolutionary-based), LCBO-CNN (human-based), WOA-CNN (swarm-based), and SBO-CNN (biology-based). The same batch size and number of epochs were used for all algorithms. The input images to the hybrid algorithms were of size 150 × 150, corresponding to the pre-processed images. The configuration of the metaheuristic algorithms and the EOSA-CNN algorithm for optimizing the proposed CNN model is given in Table 2. Table 3 presents the CNN hyperparameter configuration.

Model performance measuring metrics

To evaluate the efficacy of the model, balanced accuracy, accuracy, precision, recall, f1-score, Cohen's kappa, sensitivity, and specificity are calculated. The true positive (TP) count is the number of accurately classified cancerous images, while the false positive (FP) count is the number of images incorrectly predicted as cancerous when they are not. The false negative (FN) count is the number of cancerous images that were misclassified as non-cancerous, and the true negative (TN) count is the number of accurately classified non-cancerous images. The performance metrics are calculated using the formulas involving TP, FP, FN, and TN presented in Eqs. ( 13 ), ( 14 ), ( 15 ), ( 16 ), ( 17 ), ( 18 ), ( 19 ) and ( 20 ).
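For reference, the usual textbook definitions of these metrics in terms of TP, FP, FN, and TN are sketched below. They are given as an assumption: the paper's own Eqs. ( 13 ) to ( 20 ) are not reproduced in this text and may differ in notation or ordering. The symbols \(N\), \(p_o\), and \(p_e\) are introduced here only for illustration.

\[
\begin{aligned}
\text{Accuracy} &= \frac{TP+TN}{N}, \qquad N = TP+TN+FP+FN,\\
\text{Precision} &= \frac{TP}{TP+FP}, \qquad \text{Recall} = \text{Sensitivity} = \frac{TP}{TP+FN},\\
\text{Specificity} &= \frac{TN}{TN+FP}, \qquad \text{Balanced accuracy} = \frac{\text{Sensitivity}+\text{Specificity}}{2},\\
\text{F1} &= \frac{2\cdot \text{Precision}\cdot \text{Recall}}{\text{Precision}+\text{Recall}},\\
\kappa &= \frac{p_o-p_e}{1-p_e}, \qquad p_o=\text{Accuracy}, \qquad p_e=\frac{(TP+FP)(TP+FN)+(TN+FN)(TN+FP)}{N^{2}}.
\end{aligned}
\]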

Results and discussions

Table 4 presents the overall performance of the competing algorithms. It shows that the hybrid algorithms performed better than the traditional CNN and that the proposed EOSA-CNN model performed better than the other hybrid algorithms. We calculated the balanced accuracy, accuracy, precision, recall, f1-score, Cohen's kappa, sensitivity, and specificity. In terms of balanced accuracy, WOA-CNN, GA-CNN, MVO-CNN, SBO-CNN, CNN, and LCBO-CNN achieved 0.956, 0.942, 0.923, 0.942, 0.924, and 0.940, respectively, whereas EOSA-CNN achieved the best value of 0.958. With reference to accuracy, GA-CNN, SBO-CNN, and EOSA-CNN all achieved 0.983. For recall, EOSA-CNN and WOA-CNN both attained 0.928. In terms of the f1-score, EOSA-CNN achieved 0.912.

The comparative study of the proposed method with five metaheuristic algorithms and CNN is reported in Fig. 3. The proposed model performs better than the other models with respect to validation accuracy over 100 epochs.

Figure 3. Comparative performance of the proposed EOSA-CNN model against other models.

Figure 4 presents the precision, f1-score, and recall of all models for the normal class. The precision of GA-CNN and SBO-CNN is the same, 0.93, while that of CNN is 0.92. Because the gene expression dataset was imbalanced, additional metrics such as the f1-score, balanced accuracy, and recall were calculated to confirm the results. EOSA-CNN achieved a high f1-score of 0.91 for the normal class, and GA-CNN and SBO-CNN again have identical results. EOSA-CNN also achieved a high recall of 0.93 compared with the other methods. All the methods correctly identified the tumor class with a high performance of 99% in terms of recall, precision, and f1-score. Overall, the experiments indicated that the hybrid models benefited from pre-processing the gene expression data and performed almost equivalently in detecting BRCA.

Figure 4. Comparative results of precision, f1-score, and recall for EOSA-CNN model and other models for normal class.

Figure 5 shows the confusion matrices for CNN and the hybrid algorithms, considering all the class labels in the dataset. Each confusion matrix shows the classification results for all classes, providing a performance report for each one. Taking EOSA-CNN (top left of Fig. 5), for instance, the hybrid algorithm proposed in this study correctly identified 26 of 28 samples as the normal class and 270 of 273 samples as tumor. CNN also correctly identified the tumor class but misclassified 3 of 29 samples for the normal class. This result highlights the significance of the proposed hybrid algorithm, as it enhanced the classification accuracy.
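The EOSA-CNN counts quoted above are enough to recompute the headline metrics as a consistency check. The sketch below uses standard definitions and assumes that tumor is the positive class, which gives TP = 270, FN = 3, TN = 26, and FP = 2. The function name `binary_metrics` is introduced here for illustration only.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "kappa": kappa,
    }


# Counts taken from the EOSA-CNN confusion matrix as described in the text,
# with tumor treated as the positive class (an assumption).
metrics = binary_metrics(tp=270, fp=2, fn=3, tn=26)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the recomputed values agree, up to rounding in the third decimal place, with the accuracy, sensitivity, specificity, balanced accuracy, and kappa reported for EOSA-CNN for the cancerous class.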

Figure 5. Confusion matrix (Hybrid algorithms and CNN).

Figures 6, 7, 8, 9, 10 and 11 display the training and validation accuracy for all hybrid algorithms in each epoch. In all the hybrid models, the validation accuracy is higher than the training accuracy at the beginning of training. This suggests that the models generalize well to new, unseen data, which is a positive indication. As training proceeds, the training accuracy improves, while the validation accuracy improves more slowly. Both training and validation accuracies stabilize at a level higher than 97%. Figure 12 depicts CNN's performance in training and validation. Although the training accuracy improves and reaches 100%, the validation accuracy remains lower. This implies that the model is overfitting to the training data, effectively memorizing it, and may therefore generalize poorly to new and unseen data.

Figure 6. Training and validation accuracy curve for EOSA-CNN.

Figure 7. Training and validation accuracy curve for GA-CNN.

Figure 8. Training and validation accuracy curve for LCBO-CNN.

Figure 9. Training and validation accuracy curve for MVO-CNN.

Figure 10. Training and validation accuracy curve for SBO-CNN.

Figure 11. Training and validation accuracy curve for WOA-CNN.

Figure 12. Training and validation accuracy curve for CNN.

Comparison with related studies

Table 5 compares the performance of our proposed model with that of different studies. The proposed model achieved higher classification accuracy than the results reported in previous works by Danaee et al. 21 , Jia et al. 22 , and MotieGhader et al. 33 . While Elbashir et al. 41 achieved higher classification accuracy than our study using a CNN model, our approach showed a sensitivity of 0.989 and an f1-score of 0.99 for both the tumor and normal classes. A sensitivity of 0.989 means that the EOSA-CNN model missed only a few of the positive cases. Sensitivity is a crucial metric as it assesses the model's ability to detect positive cases correctly. Our models must identify all positive cases to ensure accurate predictions. Thus, this study highlights the significance of employing a metaheuristic algorithm to optimize CNN model hyperparameters, which is crucial in selecting the optimal combination of biases and weights required to train a CNN model effectively. Furthermore, the results indicate that integrating these methods can significantly enhance overall performance and classification accuracy on gene expression data.

Strength and limitations of the EOSA-CNN model

In this section, the limitations of the study are discussed in more detail. The first is the small sample size of the gene expression data compared to the very high number of genes. The sample size used for the study may not be sufficient to capture the full complexity of the gene expression data, leading to potential biases and limitations in the analysis. The second is that the study did not address the problem of imbalanced data using approaches such as random over- and under-sampling or cluster-based over-sampling, which is a serious limitation. Imbalanced data can significantly impact the model's performance, as the algorithm may be biased towards the majority class and struggle to predict the minority class accurately. While the EOSA-CNN model outperformed traditional CNN models and other hybrid algorithms despite these constraints, there is still room for improvement. Future research should concentrate on more experiments using large genomics datasets while handling class imbalance to enhance the model's effectiveness. Furthermore, evaluating the EOSA algorithm's performance on diverse diseases and medical conditions would be crucial to assess its generalizability and applicability to a broader range of healthcare problems. By addressing these limitations and exploring the model's performance in various contexts, the EOSA-CNN model could become a promising tool for accurate and reliable disease diagnosis and classification based on gene expression data.

Conclusion and future work

Breast cancer is the most commonly diagnosed cancer in women. The study of breast cancer has aided its diagnosis and the development of new treatments. Gene expression profiling is helping researchers and doctors to comprehend the heterogeneous nature of breast cancer at the genomic level. In this study, we developed a hybrid model that combines the Ebola optimization search algorithm (EOSA) with a CNN architecture for the detection and diagnosis of breast cancer using gene expression data. We prepared the data using different pre-processing methods, including removing outliers using Array-Array Intensity Correlation (AAIC). To avoid biases in the expression measures, we applied normalization. The final pre-processing step was filtration. After that, we converted the gene expression data into two-dimensional images, which were then converted into grayscale images. For the classification, we used the EOSA-CNN model. The findings of this study demonstrate that the proposed model achieved high performance, with accuracy (98.3%), precision (99%), recall (99%), f1-score (99%), kappa (90.3%), specificity (92.8%), and sensitivity (98.9%) for the cancerous class. These results suggest that the model has the potential to be an effective and reliable method for breast cancer detection using gene expression data. In future work, we plan to address the problem of imbalanced data and hybridize the model with various state-of-the-art optimization algorithms.

Data availability

The dataset is publicly available on The Cancer Genome Atlas (TCGA) repository.

Abbreviations

Convolutional neural network

The cancer genome atlas

Array-array intensity correlation

Ebola optimization search algorithm

Genetic algorithm

Life choice-based optimization

Multi-verse optimizer

Satin bowerbird optimization

Whale optimization algorithm

Breast cancer

World health organization

Next generation sequencing

Ribonucleic acid sequencing

Monte Carlo feature selection

Random forest

Support vector machine

Stacked denoising autoencoder

Gene expression omnibus

Weighted gene co-expression network analysis

Intelligence-based feature selection method with a deep learning model for prostate cancer detection

Nondeterministic polynomial time hard

World competitive contest

League championship algorithm

Particle swarm optimization

Ant colony optimization

Imperialist competitive algorithm

Learning automata

Heat transfer optimization algorithm

Forest optimization algorithm

Discrete symbiotic organisms search

Cuckoo optimization

Ebola virus disease

Patient-derived tumor xenograft

Artificial neural network

Protein–protein interaction

Decision tree

Bayesian network

Chaotic invasive weed optimization

Feature selection

Non-dominated sorting genetic algorithm—II

Generative adversarial model based on cancer genetic data

Clear cell renal cell carcinoma

Head and neck squamous cell carcinomas

Gradient boosting

Multi-objective optimization genetic algorithm

Barnacles mating optimizer

Small-round-blue-cell tumor

Tunicate swarm algorithm

Artificial bee colony

Improved whale optimization algorithm

Gradient boost classifier

One-dimensional

Two-dimensional

Infected individuals

True positive

False positive

False negative

True negative

Alam, M. S. et al. Statistics and network-based approaches to identify molecular mechanisms that drive the progression of breast cancer. Comput. Biol. Med. 145 , 105508 (2022).

Wilkinson, L. & Gathani, T. Understanding breast cancer as a global health concern. Br. J. Radiol. 95 (1130), 20211033 (2022).

Morhason-Bello, I. O. et al. Challenges and opportunities in cancer control in Africa: A perspective from the African Organisation for Research and Training in Cancer. Lancet Oncol. 14 (4), e142–e151 (2013).

Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71 (3), 209–249 (2021).

WHO. Breast cancer . 2021; https://www.who.int/news-room/fact-sheets/detail/breast-cancer .

Meirovitz, A. et al. Thyroid hormones and morphological features of primary breast cancer. Anticancer Res. 42 (1), 253–261 (2022).

do Nascimento, R. G. & Otoni, K. M. Histological and molecular classification of breast cancer: What do we know. Mastology 30 , e20200024 (2020).

Gamble, P. et al. Determining breast cancer biomarker status and associated morphological features using deep learning. Commun. Med. 1 (1), 14 (2021).

Oyelade, O. N. & Ezugwu, A. E. A novel wavelet decomposition and transformation convolutional neural network with data augmentation for breast cancer detection using digital mammogram. Sci. Rep. 12 (1), 5913 (2022).

Mohammed, M. et al. A stacking ensemble deep learning approach to cancer type classification based on TCGA data. Sci. Rep. 11 (1), 1–22 (2021).

Triantafyllou, A. et al. Circulating miRNA expression profiling in breast cancer molecular subtypes: Applying machine learning analysis in bioinformatics. Cancer Diagn. Progn. 2 (6), 739 (2022).

Majumder, S. et al. Performance analysis of deep learning models for binary classification of cancer gene expression data. J. Healthc. Eng. 2022 , 1–11 (2022).

Aziz, R. M. Nature-inspired metaheuristics model for gene selection and classification of biomedical microarray data. Med. Biol. Eng. Comput. 60 (6), 1627–1646 (2022).

Ogundokun, R. O. et al. Medical internet-of-things based breast cancer diagnosis using hyperparameter-optimized neural networks. Fut. Internet 14 (5), 153 (2022).

Chowdhary, C. L. et al. Past, present and future of gene feature selection for breast cancer classification–A survey. Int. J. Eng. Syst. Modell. Simul. 13 (2), 140–153 (2022).

Amethiya, Y. et al. Comparative analysis of breast cancer detection using machine learning and biosensors. Intell. Med. 2 (2), 69–81 (2022).

Shukla, A. K., Singh, P. & Vardhan, M. A new hybrid wrapper TLBO and SA with SVM approach for gene expression data. Inf. Sci. 503 , 238–254 (2019).

Khalsan, M. et al. A survey of machine learning approaches applied to gene expression analysis for cancer prediction. IEEE Access 10 , 27522–27534 (2022).

Yuan, F., Lu, L. & Zou, Q. Analysis of gene expression profiles of lung cancer subtypes with machine learning algorithms. Biochimica et Biophysica Acta (BBA)-Mol. Basis Dis. 1866 (8), 165822 (2020).

Wang, D. et al. Identification of differentially expressed genes between original breast cancer and xenograft using machine learning algorithms. Genes 9 (3), 155 (2018).

Danaee, P., Ghaeini, R. & Hendrix, D. A. A deep learning approach for cancer detection and relevant gene identification. In Pacific Symposium on Biocomputing 2017 (World Scientific, 2017).

Jia, D. et al. Breast cancer case identification based on deep learning and bioinformatics analysis. Front. Genet. 12 , 628136 (2021).

Alshareef, A. M. et al. Optimal deep learning enabled prostate cancer detection using microarray gene expression. J. Healthc. Eng. 2022 , 1–12 (2022).

Ma, Q. & Xu, D. Deep learning shapes single-cell data analysis. Nat. Rev. Mol. Cell Biol. 23 (5), 303–304 (2022).

Kaveh, M. & Mesgari, M. S. Application of meta-heuristic algorithms for training neural networks and deep learning architectures: A comprehensive review. Neural Process. Lett. https://doi.org/10.1007/s11063-022-11055-6 (2022).

Zhang, W. et al. Application of machine learning, deep learning and optimization algorithms in geoengineering and geoscience: Comprehensive review and future challenge. Gondwana Res. https://doi.org/10.1016/j.gr.2022.03.015 (2022).

Rahman, M. A. et al. Nature-inspired metaheuristic techniques for combinatorial optimization problems: Overview and recent advances. Mathematics 9 (20), 2633 (2021).

Tkatek, S. et al. Artificial intelligence for improving the optimization of NP-hard problems: A review. Int. J. Adv. Trends Comput. Sci. Appl. 9 (5), 7411 (2020).

Mandal, A.K. and S. Dehuri. A survey on ant colony optimization for solving some of the selected np-hard problem . in Biologically Inspired Techniques in Many-Criteria Decision Making: International Conference on Biologically Inspired Techniques in Many-Criteria Decision Making (BITMDM-2019) . 2020. Springer.

Calvet, L. et al. On the role of metaheuristic optimization in bioinformatics. Int. Trans. Oper. Res. https://doi.org/10.1111/itor.13164 (2022).

Shukla, A. K. et al. A study on metaheuristics approaches for gene selection in microarray data: Algorithms, applications and open challenges. Evol. Intel. 13 , 309–329 (2020).

Chakraborty, S., et al. Detection of skin disease using metaheuristic supported artificial neural networks . in 2017 8th Annual Industrial Automation and Electromechanical Engineering Conference (IEMECON) . 2017. IEEE.

MotieGhader, H. et al. mRNA and microRNA selection for breast cancer molecular subtype stratification using meta-heuristic based algorithms. Genomics 112 (5), 3207–3217 (2020).

Oyelade, O.N. and A.E. Ezugwu, Ebola Optimization Search Algorithm (EOSA): A new metaheuristic algorithm based on the propagation model of Ebola virus disease. Preprint at https://arXiv.org/quant-ph/2106.01416 (2021).

Oyelade, O. N. & Ezugwu, A. E. Immunity-based Ebola optimization search algorithm for minimization of feature extraction with reduction in digital mammography using CNN models. Sci. Rep. 12 (1), 17916 (2022).

Oyelade, O. N., Agushaka, J. O. & Ezugwu, A. E. Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets. PLoS ONE 18 (3), e0282812 (2023).

Oyelade, O. N. & Ezugwu, A. E. EOSA-GAN: Feature enriched latent space optimized adversarial networks for synthesization of histopathology images using Ebola optimization search algorithm. Biomed. Signal Process. Control 84 , 104734 (2023).

Akinola, O., Oyelade, O. N. & Ezugwu, A. E. Binary ebola optimization search algorithm for feature selection and classification problems. Appl. Sci. 12 (22), 11787 (2022).

Ashwini, C. & Sellam, V. EOS-3D-DCNN: Ebola optimization search-based 3D-dense convolutional neural network for corn leaf disease prediction. Neural Comput. Appl. https://doi.org/10.1007/s00521-023-08289-3 (2023).

Oyelade, O. N. et al. Ebola optimization search algorithm: A new nature-inspired metaheuristic optimization algorithm. IEEE Access 10 , 16150–16177 (2022).

Elbashir, M. K. et al. Lightweight convolutional neural network for breast cancer classification using RNA-seq gene expression data. IEEE Access 7 , 185338–185348 (2019).

Wei, K. et al. Cancer classification with data augmentation based on generative adversarial networks. Front. Comp. Sci. 16 , 1–11 (2022).

Deng, X. et al. Hybrid gene selection approach using XGBoost and multi-objective genetic algorithm for cancer classification. Med. Biol. Eng. Comput. 60 (3), 663–681 (2022).

Houssein, E. H. et al. A hybrid barnacles mating optimizer algorithm with support vector machines for gene selection of microarray cancer classification. IEEE Access 9 , 64895–64905 (2021).

Devi, S. S. & Prithiviraj, K. Breast cancer classification with microarray gene expression data based on improved whale optimization algorithm. Int. J. Swarm Intell. Res. 14 (1), 1–21 (2023).

Cancer Genome Atlas Research N et al. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45 (10), 1113–1120 (2013).

Yang, S. et al. Detecting outlier microarray arrays by correlation and percentage of outliers spots. Cancer Inform. 2 , 117693510600200020 (2006).

Lovén, J. et al. Revisiting global gene expression analysis. Cell 151 (3), 476–482 (2012).

Sha, Y., J.H. Phan, and M.D. Wang. Effect of low-expression gene filtering on detection of differentially expressed genes in RNA-seq data . in 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) . 2015. IEEE.

de Guia, J.M., M. Devaraj, and C.K. Leung. DeepGx: deep learning using gene expression for cancer classification . in Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining . 2019.

Acknowledgements

The authors would like to thank Dr Murtada K. Elbashir for his help in language editing.

Author information

Authors and affiliations.

School of Mathematics, Statistics, and Computer Science, University of KwaZulu-Natal, King Edward Avenue, Pietermaritzburg Campus, Pietermaritzburg, 3201, KwaZulu-Natal, South Africa

Tehnan I. A. Mohamed, Jean Vincent Fonou-Dombeu, Abiodun M. Ikotun & Mohanad Mohammed

Unit for Data Science and Computing, North-West University, Potchefstroom, South Africa

Absalom E. Ezugwu

Contributions

T.I.A.M. conceived the study, performed all the analyses, and drafted the manuscript. All authors proof-read, discussed, and approved the final manuscript.

Corresponding authors

Correspondence to Tehnan I. A. Mohamed or Absalom E. Ezugwu .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Mohamed, T.I.A., Ezugwu, A.E., Fonou-Dombeu, J.V. et al. A bio-inspired convolution neural network architecture for automatic breast cancer detection and classification using RNA-Seq gene expression data. Sci Rep 13 , 14644 (2023). https://doi.org/10.1038/s41598-023-41731-z

Received : 19 March 2023

Accepted : 30 August 2023

Published : 05 September 2023

DOI : https://doi.org/10.1038/s41598-023-41731-z

Genetic Algorithms are search algorithms that mimic Darwinian biological evolution in order to select and propagate better solutions.

Optimized Design of Fixed-Sun Mirror Field Based on Genetic Algorithm and Monte Carlo Fusion

Empirical Enhancement of Intrusion Detection Systems: A Comprehensive Approach with Genetic Algorithm-based Hyperparameter Tuning and Hybrid Feature Selection

  • Research Article-Computer Engineering and Computer Science
  • Open access
  • Published: 12 April 2024

  • Halit Bakır   ORCID: orcid.org/0000-0003-3327-2822 1 &
  • Özlem Ceviz   ORCID: orcid.org/0000-0002-8610-4008 2  

Machine learning-based IDSs have demonstrated promising outcomes in identifying and mitigating security threats within IoT networks. However, the efficacy of such systems is contingent on various hyperparameters, necessitating optimization to elevate their performance. This paper introduces a comprehensive empirical and quantitative exploration aimed at enhancing intrusion detection systems (IDSs). The study capitalizes on a genetic algorithm-based hyperparameter tuning mechanism and a pioneering hybrid feature selection approach to systematically investigate incremental performance improvements in IDS. Specifically, our work proposes a machine learning-based IDS approach tailored for detecting attacks in IoT environments. To achieve this, we introduce a hybrid feature selection method designed to identify the most salient features for the task. Additionally, we employed the genetic algorithm (GA) to fine-tune hyperparameters of multiple machine learning models, ensuring their accuracy in detecting attacks. We commence by evaluating the default hyperparameters of these models on the CICIDS2017 dataset, followed by rigorous testing of the same algorithms post-optimization through GA. Through a series of experiments, we scrutinize the impact of combining feature selection methods with hyperparameter tuning approaches. The outcomes unequivocally demonstrate the potential of hyperparameter optimization in enhancing the accuracy and efficiency of machine learning-based IDS systems for IoT networks. The empirical nature of our research method provides a meticulous analysis of the efficacy of the proposed techniques through systematic experimentation and quantitative evaluation. Consolidated in a unified manner, the results underscore the step-by-step enhancement of IDS performance, especially in terms of detection time, substantiating the efficacy of our approach in real-world scenarios.

1 Introduction

Due to advances in network technology, the Internet is witnessing a significant surge in connected devices and applications. The number of Internet of Things (IoT) devices grew by 18%, reaching 14.4 billion in 2022. Forecasts from the State of IoT - Spring 2022 report indicated that an additional 27 billion connected devices are expected to join the internet by 2025 [ 1 ]. The proliferation of wireless connections and the diverse distribution of internet-connected devices, ranging from sensors and smartphones to autonomous systems and critical applications, has led to the emergence of numerous cyber threats [ 2 , 3 ]. Exploiting wireless connection vulnerabilities can compromise the CIA triad principles of confidentiality, integrity, and availability. McAfee Labs' survey revealed a 118% surge in ransomware attacks during the first quarter of 2019. Moreover, the use of PowerShell in attacks on vulnerable devices rose by 460% [ 4 ].

Securing both IoT devices and their associated networks is paramount to safeguarding against cyber threats. Organizations allocate substantial funds to enhance security, with the most significant challenges lying in sectors such as healthcare, banking, telecommunications, energy, and government. Numerous cryptographic methods have been suggested to thwart attacks targeting these environments [ 5 ]. However, the effectiveness of current methods is limited by the dynamic network structure and the heterogeneous distribution of IoT devices. As a result, there is a pressing demand for systems capable of detecting attacks in computer networks, with a specific focus on the IoT environment. These systems are known as intrusion detection systems (IDSs). IDSs are designed to monitor, recognize, and assess events within a computer system or local domain, aiming to identify malicious activities. They provide a range of options for effectively managing threat and vulnerability risks [ 6 ]. IDSs can be categorized into three main types: signature-based, anomaly-based, and specification-based. A signature-based IDS compares current traffic patterns with known attack signatures to identify matches, effectively detecting known attack types. However, it may struggle to detect unknown or zero-day attacks. An anomaly-based IDS, on the other hand, compares observed activity with profiles of normal system behavior, allowing it to identify deviations from expected patterns. In a specification-based IDS, deviations from system standards are flagged as potential attacks. This type combines the advantages of both signature-based and anomaly-based IDSs, offering a more comprehensive approach to intrusion detection. Artificial intelligence-based anomaly detection systems are widely favored in the literature due to their numerous advantages [ 7 , 8 , 9 ].
These methods excel at early detection of new or mutated attacks through models trained with existing samples. However, determining the optimal hyperparameters during model construction and training poses a challenge, as these systems rely on various algorithms. To enhance the intrusion detection performance of AI-based IDSs, it is crucial to fine-tune the hyperparameters of the machine learning algorithms and deep learning models they use [ 10 ]. Traditionally, hyperparameter tuning involves manually testing various hyperparameter values and assessing the model's performance for each. However, this method is time-consuming and subjective, as it relies on human observation, which raises questions about its reliability [ 11 ]. To address these limitations, automating the hyperparameter optimization process becomes essential, saving time and reducing human effort. This automation can be achieved through different algorithms such as grid search, random search, Bayesian optimization, and genetic algorithms.

In this work, we developed an anomaly-based IDS by leveraging multiple machine learning models and employing the genetic algorithm for hyperparameter tuning. The CICIDS2017 dataset was chosen for training and testing our model. Initially, the dataset underwent pre-processing, and various machine learning models with their default parameters were used to detect and classify attacks and their variants. Next, we introduced a hybrid feature selection method to identify the most relevant features for the task. The ML models that performed best in the initial stage were then retrained using the feature subset chosen by the proposed feature selection method. Finally, we optimized the detection accuracy of the selected models by employing the genetic algorithm to select the best-performing hyperparameters.

2 Motivation

The growing prevalence of cyber threats has intensified the need for robust and efficient intrusion detection systems (IDSs) to safeguard critical computer networks. In this study, we present a novel and comprehensive approach for developing an anomaly-based IDS system. By harnessing the power of multiple machine learning models and integrating the genetic algorithm for hyperparameter tuning, we aim to significantly improve the system's detection accuracy and adaptability. Our research utilizes the CICIDS2017 dataset, a widely recognized benchmark dataset in the field, for training and testing the proposed model. To address the challenges posed by dynamic network structures and the diverse distribution of Internet of Things (IoT) devices, we meticulously preprocess the dataset. We systematically evaluate various machine learning models, exploring their default parameters to detect and classify attacks and their variants accurately. Recognizing that feature selection plays a crucial role in enhancing performance, we introduce a hybrid feature selection method to identify the most relevant features for our task. Taking the analysis one step further, we retrain the algorithms that exhibit optimal performance in the initial stage, utilizing the selected sub-feature group from our proposed feature selection method. This step ensures that our IDS system is fine-tuned to focus on the most discriminative features, enhancing its precision in detecting and classifying anomalous activities. The second side of our contribution lies in the application of the genetic algorithm for hyperparameter tuning for more polishing of the proposed IDS model’s performance. By automating this process, we reduce human intervention and ensure that our IDS system is optimized to deliver the best possible results. The genetic algorithm helps fine-tune the machine learning models' hyperparameters, leading to enhanced accuracy, adaptability, and robustness in the proposed IDS system. 
Our research is expected to significantly advance the field of intrusion detection by introducing a comprehensive IDS system that outperforms existing solutions. The combination of multiple machine learning models, hybrid feature selection, and genetic algorithm-based hyperparameter tuning contributes to a versatile and efficient system capable of detecting a wide range of cyber threats, ultimately fortifying the security of critical computer networks and IoT environments.

The rest of the paper is organized as follows: Section 2 reviews the most important previous work in this domain. Section 3 elaborates on our research methodology, detailing the approaches and techniques employed in our investigation. Section 4 introduces our proposed approach. Section 5 presents the model evaluation metrics used to assess its performance. Section 6 describes the hardware and software platform used for our experiments, offering transparency on the computational environment. Section 7 presents our experimental results. Section 8 is dedicated to a comprehensive comparison study. In Sect. 9, we test the proposed method on different datasets to examine its robustness and generalizability. Finally, Sect. 10 summarizes the key contributions of our work and discusses avenues for future research.

3 Related Work

3.1 Machine Learning-Based IDSs

A good number of studies propose using artificial intelligence, deep learning, and machine learning approaches for attack detection on different datasets [ 12 , 13 ]. For example, in [ 12 ], the authors apply popular supervised and unsupervised algorithms to detect attacks in the CICIDS2017 dataset. In particular, both machine learning and deep learning methods are used, and their results are compared. The hyperparameters in that work were tuned manually to choose the values that give the best results. The kNN, DT, and NB algorithms were reported to perform better than the other algorithms used. In [ 14 ], a model based on machine learning algorithms was proposed. Specifically, the ensemble margin technique was used to conduct a voting process among multiple algorithms, and the algorithms with the most votes were selected to provide the highest accuracy. Deep learning methods were used for feature extraction, and the SVM and kNN algorithms were adopted for classification. The method was tested on multiple datasets, including UNSW-NB15, CICIDS2017, and NSL-KDD, and the results were compared. Furthermore, the outputs of the kNN and SVM methods were integrated using the Dempster–Shafer method. This integration increased the accuracy, and a success rate of 99.84% was achieved in detecting the U2R attack. In [ 15 ], the authors focused on machine learning-based IDS using the CICIDS2017 dataset. First, three machine learning approaches, namely decision tree, random forest, and SVM, were implemented to detect the attacks in the dataset. Then, a machine learning model called the voting classifier (VC) determined the output class with the highest probability among the multiple decisions. This method improved the detection accuracy to 96.25%.
Similarly, [ 16 ] employed different classification algorithms to detect the attacks in the CICIDS2017 dataset and compared their results. Random forest was reported to give the best results among the adopted algorithms. The paper in [ 17 ] begins by outlining the importance of IDS in the field of network security and the limitations of traditional IDS techniques. Its proposed hybrid algorithm aims to overcome these limitations by improving detection accuracy while minimizing false alarms. The authors then explain the principles of tabu search and genetic algorithms and describe how they can be combined to form a hybrid algorithm. The tabu search algorithm is used to explore the search space and generate candidate solutions, while the genetic algorithm is used to optimize these solutions. The proposed IDS operates in two main phases: training and detection. In the training phase, the system learns the normal behavior of the network and creates a profile for each network user. In the detection phase, the system monitors the network traffic in real time and compares it to the learned profiles to detect any anomalies. The paper presents the results of experiments conducted on the KDD Cup 99 dataset, a standard dataset for evaluating IDSs. The results show that the proposed hybrid algorithm outperforms traditional IDS techniques in terms of detection accuracy and false alarm rate.

3.2 Hyperparameter Tuning for IDSs

The use of artificial intelligence methods in attack detection is increasing due to features such as scalability, computational ability, and increasingly accurate detection. However, these methods have a large number of parameters, and optimizing these parameters is important for increasing the accuracy rate. As a result, hyperparameter tuning research has gained popularity in the literature [ 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 ]. Different hyperparameter tuning approaches, such as grid search, random search, and Bayesian optimization, have been adopted in the literature.

In [ 18 ], the goal of the study is to find the best accuracy for network attack detection by fine-tuning various LSTM hyperparameters, including optimizers, loss functions, learning rates, and activation functions, and comparing their performance on the CICIDS2017 dataset. The best accuracy obtained after the hyperparameter tuning process was 99.54%. Similarly, [ 19 ] demonstrated that tuning hyperparameters had an impact on the performance of machine learning algorithms. Hyperparameters such as the learning rate, the number of iterations, and the optimizer were tuned to select the best value for each. The study focused on DDoS attacks and employed simple neural networks and LSTM algorithms to detect them. The simple neural network obtained an accuracy of 100% when trained on the CAIDA and DARPA datasets.

In [ 20 ], machine learning approaches were proposed for DDoS attack detection based on CICDDoS2019, which contains 12 different DDoS attacks. First, data pre-processing is performed, which includes organizing, cleaning, and scaling the data samples. Second, a hybrid feature selection method is presented for extracting the best features. Next, grid search is used to tune the hyperparameters by selecting the values that improve detection performance, and the model is trained using the selected best features. The GB model was reported to obtain the highest accuracy, 99.97%, among all the applied algorithms. In [ 21 ], the NSL-KDD and CICIDS2017 datasets were used to train a proposed neural network IDS. The authors focused on the effect of hyperparameter tuning on model accuracy. The grid search algorithm was used to tune hyperparameters including the number of hidden layers, the number of neurons, the activation function, the optimizer, the batch size, and the number of epochs. The study set out to show experimentally that even the smallest change in hyperparameters affects the accuracy of machine learning models. After hyperparameter optimization, the accuracy with the best hyperparameters reached 99%. Similarly, machine learning-based detection of DoS and DDoS attacks was carried out in [ 22 ] using various datasets, including ISCXIDS2012, CICIDS2017, CSE-CIC-IDS2018, and CICDDoS2019. The hyperparameters of various algorithms were optimized by the grid search algorithm. After parameter optimization, the RF and DT algorithms obtained an accuracy of over 98% on all four datasets.

In [ 11 ], a hyperparameter tuning method was proposed that combines grid search and random search to improve the classification performance of a deep learning model. Data preparation and pre-processing operations were performed on the NSL-KDD and CSE-CIC-IDS2018 datasets. Grid search was conducted over all hyperparameters to find the best value for each. To reduce the running time of grid search, it was combined with random search, in which hyperparameter settings are tried without a specific order or criterion. In [ 23 ], the Bayesian optimization algorithm was adopted for hyperparameter tuning. The authors used an unsupervised learning algorithm for feature extraction and deep learning techniques for intrusion detection. The activation function and weight hyperparameters were tuned to increase the performance of the adopted deep learning model. In the evaluation on the BoT-IoT dataset, the accuracy increased to 99.99% thanks to the hyperparameter tuning.

The paper in [ 24 ] addresses the need for effective network intrusion detection systems for cloud IoT devices. It highlights the use of a CNN architecture and transfer learning, in which the knowledge learned from a base dataset is transferred to the IDS on cloud IoT devices. Five CNN models are trained, and two datasets are used: CICIDS2017 and CSE-CIC-IDS2018. The paper discusses the accuracy, precision, recall, and F1 score achieved by the system. After hyperparameter tuning, an accuracy of 0.9999 was obtained on both datasets. In [ 25 ], the authors emphasize the importance of anomaly-based detection, where abnormal patterns or behaviors are identified as potential attacks. The model is a deep neural network (DNN), and hyperparameter tuning and a filter-based feature selection approach are used to get the highest performance on the UNSW-NB15 dataset. Without data balancing, the proposed model has an accuracy of 84%; after data balancing, the accuracy is 91%. Similarly, the paper in [ 26 ] presents a novel approach for intrusion detection in IoT environments by combining deep reinforcement learning, feature selection, and optimal hyperparameters. The proposed system combines filter-based, wrapper-based, and embedded feature selection methods, which are applied to the NSL-KDD dataset to select relevant features. The optimal hyperparameters for the deep reinforcement learning (DRL) algorithm are determined using a swarm-based metaheuristic optimization algorithm called the whale optimization algorithm (WOA). The proposed method increases accuracy through feature selection, and the study also notes that selecting appropriate hyperparameters is critical to IDS performance.

In [ 27 ], attack detection performance was evaluated with stacked LSTM and bi-directional LSTM techniques applied to the UNSW-NB15 and BoT-IoT datasets. With hyperparameter optimization, the LSTM method gives an accuracy of 96.60% on the UNSW-NB15 dataset and 99.99% on the BoT-IoT dataset. The bi-directional LSTM gives accuracies of 96.41% and 99.99% on these datasets, respectively. In a recent paper [ 28 ], the authors propose a model for detecting attacks in the BoT-IoT dataset. The model uses a kNN classifier and feature selection techniques to identify and classify network intrusions. The authors claim that their model is more efficient and accurate than existing models. In addition, to enhance data quality and choose the best-performing features, principal component analysis (PCA), a univariate statistical test, and a genetic algorithm (GA) are used for feature selection. The best parameters of the kNN algorithm were selected with the GridSearchCV hyperparameter tuning method. In [ 29 ], the authors propose an advanced and optimized light gradient boosting machine (LGBM) technique to identify intrusive activities in Internet of Things (IoT) networks. The dataset used in that study, known as the distributed smart space orchestration system (DS2OS) dataset, was originally designed for anomaly detection in IoT service accesses. The approach uses a genetic algorithm (GA) to optimize the hyperparameters, with the aim of improving the accuracy of intrusion detection in IoT networks. The paper compares the proposed model with other machine learning techniques such as support vector machine (SVM), random forest (RF), and decision tree (DT). The results show that the proposed model outperforms these techniques in terms of accuracy, precision, recall, and F1 score.
In [ 30 ], the authors suggest that traditional intrusion detection systems (IDSs) are not effective in detecting attacks in IoT networks due to the unique characteristics of these networks, such as the large number of devices, their heterogeneity, and their limited resources. The proposed framework consists of anomaly detection and multi-class classification. The authors use the UNSW BoT-IoT dataset to evaluate the performance of the framework. The random forest algorithm is used for multi-class classification, and the SVM algorithm is applied for anomaly detection. Hyperparameters are tuned for both scenarios. The results show that the proposed framework outperforms traditional IDSs in terms of accuracy, precision, and recall. Table 1 briefly summarizes some of the important previous works in this domain.

Our proposed approach stands out from surveyed methodologies by integrating a robust optimization technique—the genetic algorithm. This method is particularly valuable for addressing complex, nonlinear relationships within the algorithm. Mirroring the process of natural selection, the genetic algorithm ensures the survival of the fittest set of parameters, thereby optimizing the algorithm for the specific task at hand. This sophisticated optimization sets our approach apart, providing a more nuanced and effective solution compared to conventional tuning methods. The result is the identification of optimal parameters, enhancing the adaptability and overall performance of our approach. Additionally, our approach introduces a novel hybrid feature selection method, a critical factor in improving the accuracy and detection time of intrusion detection systems. This method strategically combines various feature selection techniques, leveraging their respective strengths to identify and retain the most relevant features for the detection process. This contributes not only to increased detection accuracy and reduced overhead time but also ensures the model's robustness by focusing on the most influential features within the dataset. By synergizing the power of a genetic algorithm for algorithmic tuning with the effectiveness of a hybrid feature selection method, our proposed approach offers a comprehensive and innovative solution that surpasses conventional methods found in the literature.

4 Research Method

4.1 Machine Learning Algorithms

Artificial Intelligence (AI) is a broad field encompassing various technologies and methodologies aimed at endowing machines with human-like intelligence. Within AI, machine learning (ML) and deep learning (DL) stand out as key subfields. Machine learning involves algorithms and statistical models that enable systems to improve their performance on a specific task over time, learning from data without explicit programming. Deep learning, a subset of machine learning, focuses on neural networks with multiple layers (deep neural networks) to simulate the intricate processing patterns of the human brain. These technologies find applications across diverse domains, including natural language processing, computer vision, speech recognition, cyber security, and autonomous systems [ 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 ], showcasing their versatility and impact on various facets of modern technology.

4.1.1 Decision Tree

The decision tree (DT) was introduced by Quinlan [ 46 ] as a supervised learning algorithm that expresses groupings and divisions using a binary tree flow chart. A DT typically begins with a single node that branches out into possible outcomes. Each of these outcomes generates new nodes that branch out into further outcomes [ 47 ]. To classify a data point, one therefore starts at the root of the decision tree and works down the tree, at each node following the branch indicated by the result of that node's test, until a leaf node is reached. The resulting classification is the name of the class at that leaf node.
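
The root-to-leaf walk described above can be sketched in a few lines. This is a minimal illustration, not the trained model from this work: the tree is hand-built, and the feature names, thresholds, and the helper `classify` are hypothetical.

```python
# Hypothetical hand-built tree: internal nodes test one feature against a
# threshold, leaves carry a class name. Thresholds are illustrative only,
# not values learned from CICIDS2017.
TREE = {
    "feature": "flow_duration", "threshold": 1000.0,
    "left": {"label": "benign"},
    "right": {
        "feature": "bwd_packet_length_max", "threshold": 500.0,
        "left": {"label": "attack"},
        "right": {"label": "benign"},
    },
}

def classify(node, sample):
    """Walk from the root to a leaf, following the branch chosen by each test."""
    while "label" not in node:
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

print(classify(TREE, {"flow_duration": 2500.0, "bwd_packet_length_max": 120.0}))
```

Representing nodes as nested dictionaries keeps the sketch short; a learned tree would choose the features and thresholds from the training data.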

4.1.2 Random Forest (RF)

Random forest (RF) [ 48 ], developed by Breiman, aims to provide low classification error by creating a large number of decision trees from randomly selected data [ 49 ]. Because randomly built trees are less correlated and tend not to make the same kinds of prediction errors, combining them can reduce overfitting. Each tree casts a vote for the classification, and the class with the most votes is taken as the final prediction.
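
The two ingredients mentioned above, random resampling of the training data and majority voting, can be sketched as follows. The helper names `bootstrap_sample` and `majority_vote` are hypothetical, and the votes are made-up tree outputs rather than results from this study.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as the training set for one tree."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(tree_predictions):
    """Return the class that received the most votes from the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)   # some rows repeat, some are left out
votes = ["DoS", "benign", "DoS", "PortScan", "DoS"]  # hypothetical per-tree outputs
print(majority_vote(votes))
```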

4.1.3 Naïve Bayes (NB)

Naïve Bayes (NB) is a probabilistic classification method based on Bayes' theorem. It assumes that each feature is independent of the others. Because it does not exploit the interactions and relationships between features, which can be important for distinguishing between classes, it may not perform well on complex tasks. However, it has advantages in terms of ease of use, simplicity, and the ability to work with few training examples [ 9 ].
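
To make the independence assumption concrete, the following is a minimal Gaussian variant of NB written from scratch. The choice of Gaussian likelihoods, the function names, and the toy data are assumptions for illustration; the source does not state which NB variant was used.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus a per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict(model, row):
    """Pick the class with the highest log-posterior, treating features as independent."""
    def log_posterior(prior, means, variances):
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda label: log_posterior(*model[label]))

X = [[1.0, 2.0], [1.2, 1.8], [5.0, 6.0], [5.2, 6.1]]  # toy feature vectors
y = ["benign", "benign", "attack", "attack"]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.1]))
```

The per-feature terms are simply added in log space, which is exactly where the independence assumption enters.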

4.1.4 XGBoost

Presented by the Distributed Machine Learning Community (DMLC), XGBoost is built on gradient-boosted decision trees and was designed for speed and performance [ 50 ]. XGBoost builds its decision trees sequentially, and each data point is assigned a weight that is initially constant and then changes as the analysis proceeds [ 51 ]. The algorithm has a high tolerance for missing values, and classification is done by strengthening an already trained model with new data. It is an effective method for making optimum use of resources and reducing computation time.

4.1.5 Stochastic Gradient Descent Classifier (SGD)

The stochastic gradient descent (SGD) [ 52 ] method, which is effective and simple, uses approximate gradients computed from subsets of the training data and updates the parameters as each subset is processed. Because it can handle large datasets and process training instances individually, the SGD classifier offers several benefits.
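
A minimal sketch of per-instance updates is given below. It assumes a logistic loss and a fixed learning rate purely for illustration; the source does not state the loss or schedule used, and the function names are hypothetical.

```python
import math
import random

def sgd_logistic(X, y, lr=0.1, epochs=50, seed=0):
    """Fit a linear classifier by updating the weights after every single sample."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            z = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y[i]  # gradient of the log-loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b

def predict(w, b, row):
    """Return 1 when the linear score is positive, otherwise 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b > 0 else 0

X = [[-2.0], [-1.0], [1.0], [2.0]]  # toy one-dimensional data
y = [0, 0, 1, 1]
w, b = sgd_logistic(X, y)
print(predict(w, b, [1.5]), predict(w, b, [-1.5]))
```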

4.2 Optimization Methods and Techniques

4.2.1 Feature Selection

Feature selection is the process of identifying the most relevant features or variables for predicting the target variable in a dataset. Multiple algorithms can be used for feature selection. In this work, we chose to combine two feature selection approaches in a hybrid manner; each of them is described briefly in this section:

Mutual Information-based Feature Selection (MIFS) : MIFS [ 53 ] is a feature selection method that is based on the mutual information measure between the input features and the target variable in a dataset. It selects a subset of features that maximize the mutual information with the target variable while minimizing the mutual information between the selected features. The MIFS algorithm works in a sequential manner, where it starts by selecting the feature with the highest mutual information with the target variable. In each subsequent step, it selects the feature that has the highest mutual information with the target variable, conditioned on the previously selected features. The algorithm stops when a predefined number of features have been selected or when the mutual information between the selected features is above a certain threshold.
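
The greedy trade-off between relevance and redundancy can be sketched as follows. This is a simplified illustration for discrete features: the redundancy weight `beta`, the function names, and the toy data are assumptions, not values taken from this work.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())

def mifs(features, target, k, beta=0.5):
    """Greedily pick k features: relevance to the target minus beta * redundancy."""
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], target)
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected)
            return relevance - beta * redundancy
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        selected.append(best)
        remaining.remove(best)
    return selected

a = [0, 0, 1, 1, 0, 0, 1, 1]
b = [0, 1, 0, 1, 0, 1, 0, 1]
target = [x + 2 * y for x, y in zip(a, b)]      # class depends on both bits
features = {"a": a, "a_copy": list(a), "b": b}  # "a_copy" is fully redundant with "a"
print(mifs(features, target, k=2))
```

In this toy example the duplicated feature is as relevant as the original, but the redundancy penalty makes the second pick fall on the independent feature instead.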

Sequential Feature Selection (SFS) : SFS [ 54 ] is a wrapper feature selection method that searches for the best subset of input features by evaluating the performance of a machine learning algorithm on candidate subsets. SFS works by iteratively adding features to, or removing features from, the current subset until the best-performing subset is found. In its forward form, the algorithm starts with an empty subset and adds one feature at a time based on a selection criterion, such as maximum accuracy or minimum error. At each step, it evaluates the machine learning algorithm on each candidate subset and keeps the feature that improves performance as much as possible. The algorithm continues until a predefined number of features is reached or another stopping criterion is met. SFS can be performed in a forward or backward manner. Forward selection starts with an empty subset and adds features one at a time, while backward selection starts with the full set of features and removes one feature at a time. Both variants are greedy and avoid evaluating all possible feature subsets; forward selection tends to be cheaper when only a few features are kept, and backward selection when most features are kept.
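
Forward selection can be sketched with a generic scoring callback. In the sketch below, `toy_score` is a hypothetical stand-in for the cross-validated accuracy of a classifier, and the usefulness values are invented for illustration only.

```python
def forward_selection(candidates, score, k):
    """Greedily add the feature that most improves score(subset) until k are chosen."""
    selected = []
    while len(selected) < k:
        remaining = [f for f in candidates if f not in selected]
        if not remaining:
            break
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
    return selected

# Invented usefulness values; a real wrapper would train and evaluate a model here.
USEFULNESS = {
    "Destination Port": 0.30,
    "Flow Duration": 0.25,
    "Flow Bytes/s": 0.20,
    "Fwd IAT Std": 0.05,
}

def toy_score(subset):
    """Hypothetical stand-in for validation accuracy on a feature subset."""
    return sum(USEFULNESS[f] for f in subset)

print(forward_selection(list(USEFULNESS), toy_score, k=2))
```

Each round costs one model evaluation per remaining feature, which is why the number of candidates passed to the wrapper matters in practice.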

4.2.2 Hyperparameter Tuning with the Genetic Algorithm (GA)

The genetic algorithm [ 55 ] was developed and introduced in the 1960s by John Holland together with his students and colleagues at the University of Michigan. The algorithm is inspired by natural evolutionary processes [ 56 ]. In particular, it simulates natural selection, known in evolutionary theory as survival of the fittest. To solve an optimization problem, the suitability of each parameter set is checked, and the search tries to determine the most suitable parameters. The search space is represented as a population of individuals known as chromosomes. Genes are the characteristics that identify an individual. To select the most suitable parameters, the goodness of each chromosome [ 57 ] is evaluated with a "fitness function". Selection, crossover, and mutation operators are applied to imitate natural selection: the best individuals are chosen and then recombined or mutated [ 58 ] until a new population is formed. After the final generation, the fittest member of the population is taken as the solution to the optimization problem.
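
The loop of selection, crossover, and mutation can be sketched as follows. This is a minimal illustration under stated assumptions: the search space, the operator choices (truncation selection, uniform crossover), and `toy_fitness` are hypothetical, and in a real run the fitness would be the validation score of a model trained with the chromosome's hyperparameters.

```python
import random

# Hypothetical search space for a tree-ensemble model.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [4, 8, 16, 32],
    "min_samples_split": [2, 5, 10],
}

def toy_fitness(chromosome):
    """Stand-in for validation accuracy: counts genes matching a made-up optimum."""
    optimum = {"n_estimators": 200, "max_depth": 16, "min_samples_split": 2}
    return sum(chromosome[g] == optimum[g] for g in optimum)

def random_chromosome(rng):
    return {gene: rng.choice(values) for gene, values in SEARCH_SPACE.items()}

def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {gene: rng.choice((a[gene], b[gene])) for gene in SEARCH_SPACE}

def mutate(chromosome, rng, rate=0.2):
    """With probability `rate`, resample a gene from its allowed values."""
    return {gene: rng.choice(SEARCH_SPACE[gene]) if rng.random() < rate else value
            for gene, value in chromosome.items()}

def genetic_search(fitness, pop_size=12, generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]   # selection: keep the fitter half
        children = [mutate(crossover(*rng.sample(parents, 2), rng), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children         # parents survive unchanged (elitism)
    return max(population, key=fitness)

best = genetic_search(toy_fitness)
print(best, toy_fitness(best))
```

Keeping the parents unchanged means the best fitness in the population never decreases from one generation to the next, at the cost of some diversity.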

4.3 Dataset

The performance of the model is evaluated using the CICIDS2017 [ 59 ] dataset published by the Canadian Institute for Cybersecurity. The dataset was collected in a simulation environment over the course of five days, including attack and normal traffic scenarios, which produced data that is very close to reality. The dataset covers the abstract behavior of 25 users based on the HTTP, HTTPS, FTP, SSH, and email protocols. Guided by the 2016 McAfee report, the dataset includes a variety of attacks, including brute force FTP, brute force SSH, DoS, Heartbleed, web, infiltration, botnet, and DDoS attacks, which were not present in any of the previous datasets [ 12 ]. Each data sample in the dataset has about 80 features. Although the dataset has distinct advantages, it also has some drawbacks, such as containing NaN values and having its attacks spread across eight CSV files [ 15 ]. However, these drawbacks can be addressed by pre-processing the dataset. Figure 1 shows information about the classes of the original dataset.

Fig. 1. The data distribution in the classes of the CICIDS2017 dataset

5 Proposed Approach

The proposed method comprises four major phases: pre-processing, feature selection, classification, and hyperparameter tuning. The first phase is pre-processing phase which will be described in the next section.

5.1 Pre-Processing

Pre-processing is a crucial phase in the development of any machine learning framework. It is mostly used to organize and clean raw data so that the data are suitable for building and training a machine learning model. In the pre-processing phase, we performed exploratory data analysis to understand the dataset distribution using visualization techniques. Subsequently, data cleaning and feature scaling were applied to normalize the range of features. Categorical data were converted into numerical data using label encoding. This pre-processing phase significantly enhances classification results. The following pre-processing steps were conducted in this work:

5.1.1 Exploratory Data Analysis

Exploratory data analysis (EDA) is conducted to investigate, summarize, and visualize the data distribution in the dataset. This step aids researchers in comprehending and interpreting the dataset through various techniques. EDA provides valuable insights into data types, category distributions, the presence of NaN data and duplicates, and the identification of outlier points that require cleaning. Additionally, correlation values between features are observed after completing the data analysis.

5.1.2 Data Cleaning

Following the completion of EDA, the data cleaning process is initiated to prepare the dataset for training machine learning models effectively. Empty and infinite values, along with duplicate rows, are removed to enhance model performance and reduce computation time. Additionally, certain attacks (classes) with limited samples and minimal impact on model performance are excluded from the dataset. The final dataset includes Brute Force, DoS, PortScan, and DDoS attacks.
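
The cleaning rules described above can be sketched on plain Python rows. This is an illustrative sketch only: the row layout (features followed by the class label), the helper names, and the `min_count` threshold are assumptions, since the source does not state the threshold used to exclude small classes.

```python
import math
from collections import Counter

def clean_rows(rows):
    """Drop rows with empty, NaN, or infinite values and exact duplicates (order kept)."""
    seen, cleaned = set(), []
    for row in rows:
        has_bad_value = any(
            v is None or (isinstance(v, float) and not math.isfinite(v)) for v in row
        )
        if has_bad_value or row in seen:
            continue
        seen.add(row)
        cleaned.append(row)
    return cleaned

def drop_rare_classes(rows, min_count):
    """Remove classes with fewer than min_count samples; the label is the last field."""
    counts = Counter(row[-1] for row in rows)
    return [row for row in rows if counts[row[-1]] >= min_count]

rows = [
    (1.0, 2.0, "DoS"), (1.0, 2.0, "DoS"),      # exact duplicate
    (float("nan"), 1.0, "BENIGN"),             # NaN value
    (float("inf"), 0.0, "DDoS"),               # infinite value
    (3.0, 4.0, "BENIGN"), (5.0, 6.0, "BENIGN"),
]
print(drop_rare_classes(clean_rows(rows), min_count=2))
```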

5.1.3 Handling Imbalanced Data

In this work, the attack detection process was carried out in two distinct scenarios. In the first scenario, the dataset's data were classified as either attack or benign, representing a binary classification task. In the second scenario, the data were categorized as benign or attack, and the detected attacks were further classified into four different attack families, forming a multi-class classification problem. So, we handled the imbalanced data in two different scenarios based on the type of classification process to be conducted:

5.1.3.1 Binary Classification

Initially, we merged the attack data with different labels in the dataset (Brute Force, DoS, PortScan, and DDoS attacks) and assigned them a common "attack" label, thus transforming the dataset into a binary classification format. Subsequently, we observed that the number of "Benign" data points exceeded the number of "attack" data points. To address this class imbalance and improve performance, we employed the SMOTE algorithm. By utilizing this technique, we increased the number of samples in the "attack" class from 421,603 to 2,072,444, thereby equalizing its data samples with the "Benign" class.
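
The core idea of SMOTE, creating synthetic minority samples by interpolating between a sample and one of its nearest minority-class neighbours, can be sketched as follows. This is a simplified from-scratch illustration, not the implementation used in this work; the function name, the value of `k`, and the toy points are assumptions.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # toy "attack" samples
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies between two existing minority samples, the new data stay inside the region already occupied by the minority class.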

5.1.3.2 Multi-Class Classification

Following pre-processing, the dataset consists of 2,072,444 benign samples and 421,603 attack samples, distributed among four different classes. To address the class imbalance, we developed a hybrid data-balancing technique, comprising both downsampling and upsampling processes. Initially, we performed downsampling to equalize the number of benign data samples with that of the attack data classes, resulting in 421,603 benign samples. Subsequently, SMOTE was applied to increase the data samples in each attack class to reach 421,603 samples each. This approach effectively balanced the dataset, enabling more reliable and accurate model training.
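
The bookkeeping behind this hybrid balancing can be sketched as follows. The per-class counts below are toy values chosen for illustration, because the source reports only the benign total and the combined attack total; the helper names are hypothetical.

```python
import random

def balancing_plan(class_counts, target):
    """Per class: negative = samples to drop, positive = samples to synthesize."""
    return {label: target - count for label, count in class_counts.items()}

def downsample(rows, target, seed=0):
    """Randomly keep `target` rows without replacement."""
    return random.Random(seed).sample(rows, target)

# Toy counts (hypothetical); the target is the combined size of the attack classes.
toy_counts = {"Benign": 1000, "Brute Force": 10, "DoS": 120, "PortScan": 40, "DDoS": 30}
attack_total = sum(n for label, n in toy_counts.items() if label != "Benign")
plan = balancing_plan(toy_counts, attack_total)
print(plan)

# With the figures reported above, five classes of 421,603 samples each give:
balanced_total = 5 * 421_603
print(balanced_total)
```

Downsampling the benign class first keeps the amount of synthetic data needed per attack class bounded by the combined attack total rather than by the much larger benign count.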

5.1.4 Feature Scaling and Label Encoding

When training a model with features of varying scales, the process can become complex and time-consuming, and the model may occasionally fail [ 20 ]. To circumvent these challenges, scaling techniques are employed. In this study, we used the StandardScaler, which standardizes each numerical column in the dataset to zero mean and unit variance while preserving the relative differences between values. The StandardScaler is represented by the following equation:

$$z = \frac{x - \mu}{\sigma}$$

where z is the standardized value, x is the original value of the feature, μ is the mean of the feature, and σ is the standard deviation of the feature.
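
As a small worked example of the standardization z = (x − μ)/σ, the sketch below scales one toy column using the population standard deviation. The function name and the sample values are illustrative assumptions.

```python
import statistics

def standardize(column):
    """Return z = (x - mean) / std for each value, using the population std."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]

flow_duration = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # toy values: mean 5, std 2
z = standardize(flow_duration)
print(z)
```

After scaling, the column has zero mean and unit standard deviation, so features measured in very different units contribute on a comparable scale.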

Additionally, since the CICIDS2017 dataset contains multi-class data with non-numeric values, we associated these values numerically using the Label encoder. Each class was assigned a numeric value starting from 0, and the machine learning algorithms leveraged these numerical representations for performing multi-class classification. This approach ensures compatibility and efficiency in the model training process.
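
Label encoding amounts to a lookup table from class names to integers starting at 0. The sketch below assigns codes in sorted order, which is an assumption for illustration; the helper names and the label strings are hypothetical.

```python
def fit_label_encoder(labels):
    """Map each distinct class name to an integer code, starting from 0."""
    return {name: code for code, name in enumerate(sorted(set(labels)))}

def encode(labels, mapping):
    """Replace every class name by its integer code."""
    return [mapping[name] for name in labels]

labels = ["BENIGN", "DoS", "DDoS", "PortScan", "Brute Force", "DoS"]
mapping = fit_label_encoder(labels)
print(mapping)
print(encode(labels, mapping))
```

The integer codes carry no ordering between classes; they only give the classifier a numeric target to work with.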

5.2 Proposed Model Optimization Method

After pre-processing the dataset, we utilized our proposed data-balancing technique to achieve an even distribution of samples across classes. This process led to the creation of two datasets: one with two classes (benign and malicious) and another with five classes (one benign and four types of attacks), allowing the application of our method to both binary and multi-class classification. Following this, we devised a multi-stage experimental process to pinpoint the optimal machine learning algorithm for enhancing performance in attack detection. Initially, we evaluated machine learning algorithms with their default hyperparameters using the balanced dataset. From this assessment, we identified the top three algorithms for subsequent stages. In the following stage, we introduced a hybrid feature selection method to optimize the dataset, enhancing the performance of machine learning algorithms in attack detection. The feature selection process involved a novel hybrid approach combining mutual information, a well-established feature selection method, with sequential feature selection. Initially, mutual information was employed to identify the best 35 features from the original 80. Subsequently, sequential forward feature selection further refined the feature set, ensuring the most effective features for accurate attack detection. This feature selection method was applied to both binary and multi-class classification datasets. For the binary classification dataset, the proposed feature selection yielded six selected features: Destination Port, Flow Duration, Bwd Packet Length Max, Flow Bytes/s, Bwd IAT Std, and Bwd URG Flags. Meanwhile, the multi-class classification dataset featured six selected features: Destination Port, Flow Bytes/s, Flow IAT Mean, Fwd IAT Std, Bwd IAT Max, and Bwd IAT Min. 
Subsequently, the top three machine learning algorithms, identified in the previous stage, were retested using the reduced feature dataset for both binary and multi-class datasets. This process led to the selection of two algorithms—RF and XGBoost—that demonstrated the best results. In the final stage, a hyperparameter tuning process was proposed for the selected algorithms, RF and XGBoost, to further refine their performance. Employing a genetic algorithm during hyperparameter tuning optimized the algorithms, resulting in enhanced performance. Experimental results underscore the varied outcomes of different machine learning algorithms with default parameters when hybrid feature selection methods are employed. Furthermore, the effectiveness of our approach is evident in the improved results achieved after hyperparameter optimization. The methodology is visually summarized in the flow diagram (Fig. 2 ), and a detailed analysis of our findings is presented in the subsequent sections.
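The two-step feature selection (a mutual-information filter followed by sequential forward selection) can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: features are treated as discrete, the wrapper scores subsets with a simple majority-vote lookup table instead of a real classifier, and the names `hybrid_select` and `table_accuracy` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def table_accuracy(rows, labels, subset):
    """Training accuracy of a majority-vote lookup table on the chosen columns."""
    votes = defaultdict(Counter)
    for row, label in zip(rows, labels):
        votes[tuple(row[j] for j in subset)][label] += 1
    return sum(max(c.values()) for c in votes.values()) / len(rows)


def hybrid_select(rows, labels, n_filter, n_final):
    n_features = len(rows[0])
    # Stage 1 (filter): keep the n_filter columns with the highest mutual information.
    mi = [mutual_information([r[j] for r in rows], labels) for j in range(n_features)]
    candidates = sorted(range(n_features), key=lambda j: mi[j], reverse=True)[:n_filter]
    # Stage 2 (wrapper): greedily add the column that most improves the score.
    selected = []
    while len(selected) < n_final:
        best = max(
            (j for j in candidates if j not in selected),
            key=lambda j: table_accuracy(rows, labels, selected + [j]),
        )
        selected.append(best)
    return selected


# Toy data: the label depends on columns 0 and 1; column 2 is noise, column 3 is constant.
rows = [(a, b, c, 0) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a | b for a, b, c, _ in rows]

selected = hybrid_select(rows, labels, n_filter=3, n_final=2)
print(sorted(selected))
```

In the study the corresponding sizes are 35 features after the filter and six after the forward selection; the toy example uses 3 and 2 only to keep the data small.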

Fig. 2 Proposed model

6 Model Evaluation Metrics

In this section, we provide a comprehensive overview of the machine learning metrics employed in the study to assess the performance and efficacy of the proposed model. These metrics play a crucial role in quantifying the model's ability to generalize and make accurate predictions.

6.1 False Positive (FP), False Negative (FN), True Positive (TP), True Negative (TN)

These metrics provide insights into the specific types of errors made by the model.

FP: The number of instances predicted as positive that are actually negative.

FN: The number of instances predicted as negative that are actually positive.

TP: The number of instances predicted as positive that are actually positive.

TN: The number of instances predicted as negative that are actually negative.
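As a small illustration of these four definitions, the counts can be obtained by comparing each prediction with its true label. The function name `confusion_counts` and the toy label vectors are assumptions made for this sketch.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN and FN for a binary problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Toy labels (hypothetical): 1 = attack, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts)
```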

6.2 Accuracy (ACC)

Accuracy represents the ratio of correctly predicted instances to the total instances in the dataset. Accuracy provides an overall measure of the model's correctness in predictions.

6.3 Precision

Precision is the ratio of correctly predicted positive observations to the total predicted positives. Precision focuses on the accuracy of positive predictions.

6.4 Recall (Sensitivity)

Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class. Recall emphasizes the model's ability to capture all relevant instances of the positive class.

6.5 F1 Score

F1 score is the harmonic mean of precision and recall, providing a balanced measure between the two. F1 score is particularly useful in imbalanced datasets, where the class distribution is skewed.
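For reference, the four metrics defined in Sections 6.2 to 6.5 can be written in terms of the counts from Section 6.1. These are the standard textbook definitions, stated here as a convenience:

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} $$

$$ \text{Precision} = \frac{TP}{TP + FP} \qquad \text{Recall} = \frac{TP}{TP + FN} $$

$$ F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2\,TP}{2\,TP + FP + FN} $$

As a hypothetical worked case, TP = 3, FP = 1, TN = 3, FN = 1 gives Accuracy = 6/8 = 0.75, Precision = 3/4 = 0.75, Recall = 3/4 = 0.75, and F1 = 6/8 = 0.75.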

6.6 Confusion Matrix

A confusion matrix is a tabular representation that summarizes the performance of a classification algorithm. It compares the predicted classes against the actual classes and is especially useful for evaluating the performance of a model on a dataset with known class labels.
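For the multi-class case, a confusion matrix can be built by counting (actual, predicted) pairs. The sketch below follows the common convention of rows for actual classes and columns for predicted classes; the class names and label vectors are toy values chosen for illustration.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


classes = ["Benign", "DoS", "PortScan"]
y_true = ["Benign", "Benign", "DoS", "DoS", "PortScan", "PortScan"]
y_pred = ["Benign", "DoS", "DoS", "DoS", "PortScan", "Benign"]

matrix = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>8}: {row}")

# Correct predictions sit on the diagonal.
correct = sum(matrix[i][i] for i in range(len(classes)))
```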

7 Hardware and Software Platform

The proposed model was implemented using Python, with various libraries, including Scikit-learn, Optuna, skopt, and others, employed throughout the research. Experiments were conducted on a computer with a 13th Gen Intel(R) Core(TM) i9-13980HX 2.20 GHz processor, 32 GB RAM, an NVIDIA GeForce RTX 4090 16 GB GPU, and the Windows 11 operating system.

8 Experimental Results

Initially, the CICIDS2017 dataset underwent pre-processing to enhance its suitability for training machine learning algorithms. Subsequently, a data-balancing technique was employed to address class imbalance. Five distinct machine learning classification algorithms were applied using default hyperparameter values, and results were obtained for both binary and multi-classification scenarios. From this initial exploration, the top three performing machine learning algorithms were identified. Following the algorithm selection, a hybrid feature selection approach was implemented on the dataset. The previously chosen three machine learning algorithms were then retrained and tested to assess their detection performance. From this phase, the two algorithms demonstrating the highest accuracy values were chosen for further refinement through hyperparameter tuning. The hyperparameter tuning process utilized the genetic algorithm, leveraging the Optuna library. This algorithmic approach incorporates mutation, crossover, and selection processes to iteratively discover the optimal algorithm and parameters. The results obtained from hyperparameter tuning were compared based on detection accuracy, F1 score, and computational time, providing insights into the impact of this tuning process on the intrusion detection system's performance. Given that the research covered both binary classification and multi-class classification scenarios, the outcomes were categorized accordingly. The experimental findings and methodologies employed in this study are visually represented in Fig.  3 .
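The selection, crossover, and mutation loop mentioned above can be sketched in plain Python. This is not the Optuna-based tuning used in the study: the search space, the placeholder `fitness` function (which stands in for a cross-validated model score), and all parameter values are assumptions chosen only to keep the example self-contained.

```python
import random

# Hypothetical search space: two integer hyperparameters with (low, high) bounds.
SPACE = {"n_estimators": (10, 200), "max_depth": (1, 20)}


def fitness(params):
    # Placeholder objective (an assumption): best at n_estimators=120, max_depth=8.
    # In practice this would be a validation score of the tuned model.
    return -((params["n_estimators"] - 120) ** 2) / 100.0 - (params["max_depth"] - 8) ** 2


def random_individual(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent at random.
    return {k: rng.choice((a[k], b[k])) for k in SPACE}


def mutate(individual, rng, rate=0.2):
    # With probability `rate`, resample a gene uniformly from its range.
    return {
        k: rng.randint(*SPACE[k]) if rng.random() < rate else v
        for k, v in individual.items()
    }


def tournament(population, rng, size=3):
    # Selection: the fittest of a small random sample becomes a parent.
    return max(rng.sample(population, size), key=fitness)


def genetic_search(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(population, key=fitness)  # elitism: the best survives unchanged
        children = [
            mutate(crossover(tournament(population, rng), tournament(population, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [elite] + children
    return max(population, key=fitness)


best = genetic_search()
print(best, fitness(best))
```

Keeping the elite individual in each generation means the best score found never decreases, at the cost of slightly less diversity in the population.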

Fig. 3 The Experimental Framework

We initially explored five machine learning algorithms: XGBoost, random forest, decision tree, bagging, and extra tree. Through the evaluation process, we identified the best-performing classifier for the task. We enhanced the intrusion detection system's detection performance through multiple stages. In the following sections, we describe the results obtained from the multi-class classification and binary classification experiments separately and in detail.

8.1 Multi-Class Classification Results

In the first scenario, we proposed classifying attacks in the dataset into their respective families. Initially, we addressed the dataset's imbalance using our hybridized balancing approach. Subsequently, our proposed three stages of optimization were applied. The first stage involved training five different machine learning algorithms on the balanced dataset, from which the top three algorithms were selected. In the second stage, we applied the hybridized feature selection method, refining the dataset's best features. The chosen three machine learning algorithms were then tested after the feature selection process, and the top two algorithms were selected. Finally, we employed the genetic algorithm to tune hyperparameters for the chosen two algorithms. The optimized hyperparameter values were utilized, and these two algorithms were trained and tested to select the best optimized model. Results are presented under three subtitles: first stage results, second stage results, and third stage results.

8.1.1 First Stage Results

In the first stage, we trained five ML algorithms—XGBoost, random forest, decision tree, bagging, and extra tree—using their default hyperparameters. The dataset underwent hybrid data balancing before training. Table 2 illustrates the obtained results. Notably, the XGBoost classifier demonstrated superior performance with a detection accuracy and F1 score of 99.98%. Despite the extra tree algorithm's comparatively lower performance, it still achieved a high detection accuracy of 99.96% and an F1 score of 99.96%. Additionally, the XGBoost algorithm outperformed others in computational time, completing the task in 70.15 seconds. We proceeded to the next step with XGBoost, RF, and bagging algorithms.

8.1.2 Second Stage Results

In this stage, our proposed hybrid feature selection approach was applied to identify the optimal features from the dataset. Subsequently, the three ML algorithms were trained using these selected features to optimize both detection accuracy and training time. The multi-class classification dataset featured six selected features: Destination Port, Flow Bytes/s, Flow IAT Mean, Fwd IAT Std, Bwd IAT Max, and Bwd IAT Min. The results, showcased in Table 3, reveal a slight decrease in detection accuracy across all algorithms. However, this is deemed acceptable given the significant reduction in computational time for all ML algorithms. Notably, XGBoost maintained its superior performance, achieving 99.95% accuracy and an F1 score of 99.95%. Furthermore, the computational time of XGBoost decreased from 70 seconds to 23 seconds. At the end of this stage, we selected the best two ML algorithms in this scenario: XGBoost and RF.

8.1.3 Third Stage Results

We conducted hyperparameter tuning to identify optimal parameters for both the RF and XGBoost algorithms, aiming to achieve the highest accuracy values. Table 4 outlines the default parameter values of these algorithms prior to optimization and their respective values after the tuning process. Subsequently, applying the optimized parameters for attack detection, the RF algorithm demonstrated an accuracy of 99.96%, and the XGBoost algorithm achieved 99.97% accuracy. Thus, there are no notable changes in the accuracy of the proposed ML algorithms after the feature selection and hyperparameter tuning processes. However, performance comparable to that of the algorithms with their default hyperparameters can be achieved with reduced computational time. While the XGBoost algorithm initially required 70 s, it now completes the task in approximately 42 s. The reduction is far larger for the random forest classifier: initially requiring 891 s, it now completes the task in just 6.5 s following the feature selection and hyperparameter tuning processes. Table 5 illustrates the results obtained in this stage. Furthermore, Fig. 4 illustrates the confusion matrices for the optimized models.
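The relative savings implied by these timings can be checked with a one-line formula. The sketch below uses only the rounded figures quoted in the paragraph above (70 s and 42 s for XGBoost, 891 s and 6.5 s for random forest); `percent_reduction` is an illustrative helper name.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# (first-stage time, third-stage time) in seconds, as quoted in the text.
timings = {"XGBoost": (70.0, 42.0), "Random forest": (891.0, 6.5)}
for name, (before, after) in timings.items():
    print(f"{name}: {percent_reduction(before, after):.2f}% less time")
```

With these inputs the reduction is 40% for XGBoost and about 99.27% for random forest.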

Fig. 4 The confusion matrices of the XGBoost and RF algorithms after hyperparameter optimization for Multi-Class Classification

8.2 Binary Classification Results

In the second scenario, we proposed aggregating all attack classes in the dataset into a single class labeled 'attacks'. To tackle data imbalance, we employed the SMOTE oversampling algorithm, which increased the number of attack data samples. Subsequently, our three-stage optimization process was applied to the modified dataset. In the initial stage, we conducted binary classification using five machine learning algorithms with their default hyperparameter values. The top three performing algorithms were then selected for the subsequent stage. In the second stage, our hybridized feature selection method was applied to identify the most effective feature group for optimal results with the machine learning algorithms. Within this stage, the best two ML algorithms were chosen to advance to the final stage. In the last stage, the genetic algorithm was employed to optimize the performance of the two selected machine learning algorithms. Ultimately, the best ML algorithm was chosen as an optimized IDS capable of detecting potential attacks with high performance and minimal computational time. The findings of this case study are discussed in the following subsections.

8.2.1 First Stage Results

Initially, we utilized the XGBoost, random forest, bagging, decision tree, and extra tree machine learning algorithms with their default hyperparameters, training them on the balanced dataset. Table 6 showcases the results, indicating accuracies of 99.96%, 99.95%, 99.94%, 99.92%, and 99.91%, respectively. Similar to the multi-classification scenario, the XGBoost algorithm exhibited the highest accuracy and the lowest false-negative value for binary classification. Additionally, it was noted that the XGBoost algorithm achieved the task with the shortest computational time. At this stage's conclusion, we selected the top three ML algorithms for the subsequent stage: XGBClassifier, RandomForestClassifier, and BaggingClassifier.

8.2.2 Second Stage Results

In this stage, the hybrid feature selection method was applied to the balanced dataset. First, 35 features were selected using mutual information, and the sequential feature selector was then employed to further narrow down the selection to just six features. For the binary classification dataset, the proposed feature selection yielded the following selected features: Destination Port, Flow Duration, Bwd Packet Length Max, Flow Bytes/s, Bwd IAT Std, and Bwd URG Flags. After determining the optimal features, the new dataset was tested using the default parameters of the XGBClassifier, RandomForestClassifier, and BaggingClassifier algorithms. In this evaluation, XGBoost achieved an accuracy of 99.88%, RF achieved 99.88%, and Bagging achieved 99.86%. Thus, by selecting only six features, we achieved nearly the same performance using the XGBoost classifier in only 3.36 s, compared to 32.94 s in the first stage. As a result of this stage, we selected the best two ML algorithms, namely XGBoost and random forest, to be used in the final stage. The results obtained in this stage are illustrated in Table 7.

8.2.3 Third Stage Results

In the concluding phase, we fine-tuned the hyperparameters of the RF and XGBoost algorithms utilizing a genetic algorithm and the feature-reduced dataset. Table 8 provides a succinct comparison between the default hyperparameters and the values refined through genetic algorithm tuning. Additionally, Table 9 presents the results derived from this case study, indicating an accuracy improvement for both algorithms, from 99.88% in the preceding stage to 99.93% for both RF and XGBoost. Furthermore, the computational time after optimization has been halved compared to the initial stage, as demonstrated in Table 9. In Fig. 5, the confusion matrices of the optimized RF and XGBoost algorithms visually depict the positive impact of hyperparameter tuning, showing reduced false negative (FN) values and higher accuracy. This outcome underscores the performance gains achieved through hyperparameter optimization.

Fig. 5 Results of XGBoost and RF algorithms after applying the hybrid feature selection and GA algorithm for Binary Classification

9 Comparison Study

This section presents a comparative study to clearly illustrate the impact of feature selection and hyperparameter tuning on the accuracy of machine learning-based IDS detection. Our findings indicate that, for the multi-class classification case study, these processes did not yield notable improvements in the adopted IDS performance. However, a significant reduction in computational time was observed for all employed ML algorithms—a crucial metric for IDS systems operating in real-time scenarios. It has been observed that the proposed hybrid feature selection approach, coupled with the genetic algorithm for hyperparameter tuning, demonstrated nearly identical performance to using the original dataset with complete features but with a significantly reduced computational time. In conclusion, the proposed feature selection method with the genetic algorithm enhances the time efficiency of the IDS systems. Table 10 presents a comparative analysis of the F1 scores obtained during the experimental studies. Additionally, Table  11 provides a statistical representation highlighting a significant enhancement in the time efficiency of the IDS system subsequent to the application of the proposed optimization procedure.

10 Testing the Proposed Method Using a Different Dataset

To validate the effectiveness of the proposed method, we applied it to an additional dataset, specifically the CSE-CIC 2018 dataset. Our focus was on training the optimized ML algorithm, namely XGBoost, with hyperparameter values selected using the genetic algorithm. We opted for a binary classification scenario over this dataset, initially labeling samples as either "attack" or "benign." Subsequently, we decoded, rescaled feature values, and pre-processed the data in the dataset. The proposed hybrid feature selection method was then employed to identify the most relevant features, resulting in the selection of three features to optimize the IDS system's detection accuracy. The final step involved using this selected feature dataset to train the XGBoost classifier with hyperparameter values chosen through the genetic algorithm. The proposed approach achieved 100% accuracy and F1 score within a highly competitive computational time. The results obtained by training the IDS system using this dataset are illustrated in Table 12. Moreover, the confusion matrix of the improved XGBoost is illustrated in Fig. 6.

Fig. 6 The confusion matrix of the XGBoost trained using the CSE-CIC 2018 dataset

11 Conclusion

In conclusion, network attacks have detrimental effects on network performance and resource utilization, prompting the development of various methods for efficient attack detection. Among these methods, anomaly-based detection systems play a crucial role. This study focused on evaluating the performance of machine learning-based intrusion detection systems (IDSs) using the CICIDS2017 dataset. Our proposed hybridized IDS system comprises multiple stages. Initially, data pre-processing steps were implemented to cleanse the dataset and eliminate outlier points. Simultaneously, a hybridized data-balancing approach was introduced to address dataset imbalance. In the first stage, we assessed the performance of multiple machine learning algorithms with their default hyperparameters to evaluate their efficiency in attack detection. The subsequent stages involved proposing a hybridized feature selection method, integrating various feature selection techniques, and employing a genetic algorithm to fine-tune hyperparameters. These stages aimed to enhance IDS performance in both binary and multi-class classification tasks. The experimental results demonstrated that our hybrid feature selection method, coupled with hyperparameter optimization, notably improved the efficiency of XGBoost and RF algorithms, particularly in terms of computational time.

XGBoost consistently exhibited superior detection accuracy in both binary and multi-class classification applications. The hyperparameter tuning process, applied after feature selection, significantly reduced both false negatives and false positives, showcasing improvements of up to 61% and 62.5% for XGBoost, and 40% and 72.5% for random forest in binary classification.

The most significant achievement was the substantial reduction in overhead time, a critical metric for IDS systems. The proposed hybrid feature selection method, combined with genetic algorithm-based hyperparameter tuning, resulted in over 40% and 98% reduction in training time for XGBoost and RF-based IDS, respectively, in both binary and multi-class detection processes.

To validate the efficiency of our approach across diverse datasets, we tested it on the CSE-CIC 2018 dataset, achieving a 100% F1 score in detecting attacks. These findings have crucial implications for the development of effective IDS systems, enabling the identification of optimal hyperparameters and a reduction in feature dimensions for enhanced model efficiency and performance.

Looking ahead, future research could explore alternative hyperparameter optimization techniques and feature selection methods, along with assessing the performance of machine learning-based IDSs on different datasets. Additionally, experiments could be conducted to evaluate the impact of hyperparameter tuning and feature selection on the performance of deep learning models.

Availability of data and materials

The dataset will be available on request.

References

Smith, D.: “IoT 2022: Connected devices growing 18% to 14.4 Billion globally,” IOT For All (2020)

Díaz López, D., et al.: Shielding IoT against cyber-attacks: an event-based approach using SIEM. Wirel. Commun. Mob. Comput. (2018). https://doi.org/10.1155/2018/3029638

Sicari, S.; Rizzardi, A.; Miorandi, D.; Coen-Porisini, A.: REATO: REActing TO denial of service attacks in the internet of things. Comput. Netw. 137 , 37–48 (2018). https://doi.org/10.1016/j.comnet.2018.03.020

Irvine, D.: “Report shows 118 percent increase in ransomware attacks in 2019,” Sep. (2019)

Pawar, A.B.; Ghumbre, S.: A survey on IoT applications, security challenges and counter measures. Int. Conf. Comput. Anal. Secur. Trends CAST 2016 , 294–299 (2017). https://doi.org/10.1109/CAST.2016.7914983

Modi, C., Patel, D., Borisaniya, B., Patel, H., Patel, A., Rajarajan, M.: A survey of intrusion detection techniques in cloud. J. Netw. Comput. Appl. 36 (1), 42–57 (2013). https://doi.org/10.1016/j.jnca.2012.05.003

Mishra, P.; Varadharajan, V.; Tupakula, U.; Pilli, E.S.: A detailed investigation and analysis of using machine learning techniques for intrusion detection. IEEE Commun. Sur. Tutor. 21 (1), 686–728 (2019). https://doi.org/10.1109/COMST.2018.2847722

Masduki, B. W.; Ramli, K.; Saputra, F. A.; Sugiarto, D.: Study on implementation of machine learning methods combination for improving attacks detection accuracy on intrusion detection system (IDS), in 2015 International Conference on Quality in Research (QiR), IEEE, pp. 56–64 (2015)

Al-Garadi, M.A.; Mohamed, A.; Al-Ali, A.K.; Du, X.; Ali, I.; Guizani, M.: A Survey of machine and deep learning methods for internet of things (IoT) security. IEEE Commun. Surv. Tutor. 22 (3), 1646–1685 (2020). https://doi.org/10.1109/COMST.2020.2988293

Feurer, M.; Hutter, F.: “Hyperparameter optimization,” Automated machine learning: Methods, systems, challenges, pp. 3–33, (2019)

Kunang, Y.N.; Nurmaini, S.; Stiawan, D.; Suprapto, B.Y.: Attack classification of an intrusion detection system using deep learning and hyperparameter optimization. J. Inf. Secur. Appl. 58 , 102804 (2021)

Maseer, Z.K.; Yusof, R.; Bahaman, N.; Mostafa, S.A.; Foozy, C.F.M.: Benchmarking of machine learning for anomaly based intrusion detection systems in the CICIDS2017 dataset. IEEE Access 9 , 22351–22370 (2021). https://doi.org/10.1109/ACCESS.2021.3056614

Doğan, E.; Bakir, H.: “Hiperparemetreleri Ayarlanmış Makine Öğrenmesi Yöntemleri Kullanılarak Ağdaki Saldırıların Tespiti,” in International Conference on Pioneer and Innovative Studies, pp. 274–286 (2023)

Yousefnezhad, M.; Hamidzadeh, J.; Aliannejadi, M.: Ensemble classification for intrusion detection via feature extraction based on deep learning. Soft comput 25 (20), 12667–12683 (2021). https://doi.org/10.1007/s00500-021-06067-8

Sharma, D.K.; Mishra, J.; Singh, A.; Govil, R.; Srivastava, G.; Lin, J.C.W.: Explainable artificial intelligence for cybersecurity. Comput. Electr. Eng. (2022). https://doi.org/10.1016/j.compeleceng.2022.108356

Priyanka, V.; Gireesh Kumar, T.: Performance assessment of IDS based on CICIDS-2017 dataset. Lect. Notes Net. Syst. 191 , 611–621 (2020). https://doi.org/10.1007/978-981-16-0739-4_58

Bakour, K.; Daş, G.S.; Ünver, H.M.: An intrusion detection system based on a hybrid Tabu-genetic algorithm. Int. Conf. Comput. Sci. Eng. (UBMK) (2017). https://doi.org/10.1109/UBMK.2017.8093378

Hossain, M. D.; Ochiai, H.; Fall, D.; Kadobayashi, Y.: “LSTM-based network attack detection: performance comparison by hyper-parameter values tuning,” in 2020 7th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud)/2020 6th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom), IEEE pp. 62–69 (2020)

Kim, M.: Supervised learning-based DDoS attacks detection: Tuning hyperparameters. ETRI J. 41 (5), 560–573 (2019). https://doi.org/10.4218/etrij.2019-0156

Batchu, R.K.; Seetha, H.: A generalized machine learning model for DDoS attacks detection using hybrid feature selection and hyperparameter tuning. Comput. Net. 200 , 108498 (2021). https://doi.org/10.1016/j.comnet.2021.108498

Choraś, M.; Pawlicki, M.: Intrusion detection approach based on optimised artificial neural network. Neurocomputing 452 , 705–715 (2021). https://doi.org/10.1016/j.neucom.2020.07.138

Sanchez, O. R.; Repetto, M.; Carrega, A.; Bolla, R.: “Evaluating ML-based DDoS detection with grid search hyperparameter optimization,” in 2021 IEEE 7th International Conference on Network Softwarization (NetSoft), IEEE, pp. 402–408 (2021)

Kunang, Y. N.; Nurmaini, S.; Stiawan, D.; Suprapto, B. Y.: “Improving Classification attacks in IOT intrusion detection system using bayesian hyperparameter optimization,” 2020 3rd international seminar on research of information technology and intelligent systems, ISRITI pp. 146–151 (2020) https://doi.org/10.1109/ISRITI51436.2020.9315360

Okey, O.D.; Melgarejo, D.C.; Saadi, M.; Rosa, R.L.; Kleinschmidt, J.H.; Rodríguez, D.Z.: Transfer learning approach to IDS on cloud IoT devices using optimized CNN. IEEE Access 11 , 1023–1038 (2023)

Sharma, B.; Sharma, L.; Lal, C.; Roy, S.: Anomaly based network intrusion detection for IoT attacks using deep learning technique. Comput. Electr. Eng. 107 , 108626 (2023). https://doi.org/10.1016/j.compeleceng.2023.108626

Bakhshad, S.; Ponnusamy, V.; Annur, R.; Waqasyz, M.; Alasmary, H.; Tux, S.: “Deep Reinforcement learning based intrusion detection system with feature selections method and optimal hyper-parameter in IoT environment,” International Conference on Computer, Information and Telecommunication Systems (CITS), 2022, pp. 1–7. doi: https://doi.org/10.1109/CITS55221.2022.9832976 .

Saurabh, K., et al.: “Lbdmids: LSTM based deep learning model for intrusion detection systems for IOT networks,” in IEEE World AI IoT Congress (AIIoT) 2022, pp. 753–759 (2022)

Mohy-eddine, M.; Guezzaz, A.; Benkirane, S.; Azrour, M.: An efficient network intrusion detection model for IoT security using K-NN classifier and feature selection. Multimed Tools Appl 82 (15), 1–19 (2023)

Mishra, D.; Naik, B.; Nayak, J.; Souri, A.; Dash, P.B.; Vimal, S.: Light gradient boosting machine with optimized hyperparameters for identification of malicious access in IoT network. Digit. Commun. Net. 9 (1), 125–137 (2023)

Manzano, R.; Goel, N.; Zaman, M.; Joshi, R.; Naik, K.: “Design of a machine learning based intrusion detection framework and methodology for iot networks,” in 2022 IEEE 12th Annual computing and communication workshop and conference (CCWC), pp. 191–198 (2022)

Hossain, M. D.; Ochiai, H.; Fall, D.; Kadobayashi, Y.: “LSTM-based network attack detection: performance comparison by hyper-parameter values tuning,” Proceedings–2020 7th IEEE International conference on cyber security and cloud computing and 2020 6th IEEE International conference on edge computing and scalable cloud, CSCloud-EdgeCom 62–69 (2020) https://doi.org/10.1109/CSCloud-EdgeCom49738.2020.00020 .

Sanchez, O. R.; Repetto, M.; Carrega, A.; Bolla, R.: “Evaluating ML-based DDoS detection with grid search hyperparameter optimization,” Proceedings of the 2021 IEEE conference on network softwarization: accelerating network softwarization in the cognitive age, NetSoft, pp. 402–408 (2021). https://doi.org/10.1109/NetSoft51509.2021.9492633

Bakır, H., Bakır, R.: DroidEncoder: malware detection using auto-encoder based feature extractor and machine learning algorithms. Comput. Electr. Eng. 110 , 108804 (2023)

Bakır, H., Elmabruk, K.: Deep learning-based approach for detection of turbulence-induced distortions in free-space optical communication links. Phys. Scr. 98 (6), 065521 (2023)

Demircioğlu, U.; Bakır, H.: Deep learning-based prediction of delamination growth in composite structures: bayesian optimization and hyperparameter refinement. Phys. Scr. 98 (10), 106004 (2023)

Bakir, H.; Yilmaz, Ş: Using Transfer learning technique as a feature extraction phase for diagnosis of cataract disease in the eye. Int. J. Sivas Univ. Sci. Technol. 1 (1), 17–33 (2022)

Yilmaz, E. K.; Bakir, H.: “Hyperparameter Tunning and feature selection methods for malware detection,” Politeknik Dergisi, p. 1, (2023)

Bakir, H.; Oktay, S.; Tabaru, E.: Detection of pneumonia from x-ray images using deep learning techniques. J. Sci. Rep.-A 052 , 419–440 (2023)

Bakır, H.; Çayır, A. N.; Navruz, T. S.: “A comprehensive experimental study for analyzing the effects of data augmentation techniques on voice classification,” Multimed Tools Appl, pp. 1–28, (2023)

Bakir, H.; Bakir, R.: Evaluating the robustness of yolo object detection algorithm in terms of detecting objects in noisy environment. J. Sci. Rep.-A 054 , 1–25 (2023)

Bakır, H.: “Evaluating the impact of tuned pre-trained architectures’ feature maps on deep learning model performance for tomato disease detection,” Multimed Tools Appl, pp. 1–22 (2023)

Bakir, H.; Eker, S. B.: “A comprehensive experimental study for evaluating the performance of well-known cnn pre-trained models in noisy environments,” Politeknik Dergisi, p. 1 (2023)

Ghanem, R.; Erbay, H.: Context-dependent model for spam detection on social networks. SN Appl Sci 2 , 1–8 (2020)

Ghanem, R.; Erbay, H.: Spam detection on social networks using deep contextualized word representation. Multimed Tools Appl 82 (3), 3697–3712 (2023)

Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1 (1), 81–106 (1986). https://doi.org/10.1007/bf00116251

Hasan, M.; Islam, M.M.; Zarif, M.I.I.; Hashem, M.M.A.: Attack and anomaly detection in IoT sensors in IoT sites using machine learning approaches. Int. Things (Netherlands) 7 , 100059 (2019). https://doi.org/10.1016/j.iot.2019.100059

Breiman, L.: Random forests. Mach. Learn. 45 , 5–32 (2001)

Ariyaluran Habeeb, R.A.; Nasaruddin, F.; Gani, A.; Targio Hashem, I.A.; Ahmed, E.; Imran, M.: Real-time big data processing for anomaly detection: a Survey. Int. J. Inf. Manage. 45 , 289–307 (2019). https://doi.org/10.1016/j.ijinfomgt.2018.08.006

Dhaliwal, S.S.; Al Nahid, A.; Abbas, R.: Effective intrusion detection system using XGBoost. Information (Switzerland) (2018). https://doi.org/10.3390/info9070149

Bhati, B.S.; Chugh, G.; Al-Turjman, F.; Bhati, N.S.: An improved ensemble based intrusion detection technique using XGBoost. Trans. Emerg. Telecommun. Technol. 32 (6), 1–15 (2021). https://doi.org/10.1002/ett.4076

Tsuruoka, Y.; Tsujii, J.; Ananiadou, S.: “Stochastic gradient descent training for L1-regularized log-linear models with cumulative penalty,” ACL-IJCNLP 2009 - Joint Conf. of the 47th annual meeting of the association for computational linguistics and 4th Int. Joint Conf. on natural language processing of the AFNLP, Proceedings of the Conf., pp. 477–485 (2009). https://doi.org/10.3115/1687878.1687946

Sulaiman, M. A.; Labadin, J.: “Feature selection based on mutual information,” in 2015 9th International conference on IT in Asia (CITA), IEEE, (2015) pp. 1–6

Li, J., et al.: Feature selection: a data perspective. ACM Comput. Surv. (2017). https://doi.org/10.1145/3136625

Holland, J.H.: Genetic algorithms. Sci. Am. 267 (1), 66–73 (1992)

Singh, T.; Verma, S.; Kulshrestha, V.; Katiyar, S.: Intrusion detection system using genetic algorithm for cloud. ACM Int. Conf. Proc. Ser. 04 , 564–568 (2016). https://doi.org/10.1145/2905055.2905175

Sazzadul Hoque, M.: An implementation of intrusion detection system using genetic algorithm. Int. J. Net. Secur. Appl. 4 (2), 109–120 (2012). https://doi.org/10.5121/ijnsa.2012.4208

Alibrahim, H.; Ludwig, S. A.: “Hyperparameter optimization: comparing genetic algorithm against grid search and bayesian optimization,” 2021 IEEE Congress on evolutionary computation, CEC 2021–Proceedings, pp. 1551–1559, (2021) https://doi.org/10.1109/CEC45853.2021.9504761 .

Sharafaldin, I.; Lashkari, A. H.; Ghorbani, A. A.: “Toward generating a new intrusion detection dataset and intrusion traffic characterization,” ICISSP 2018–Proceedings of the 4th International Conference on Information Systems Security and Privacy, Cic, pp. 108–116, (2018) https://doi.org/10.5220/0006639801080116 .

Acknowledgements

The authors would like to thank Sharafaldin et al. [59] for sharing their datasets.

Open access funding provided by the Scientific and Technological Research Council of Türkiye (TÜBİTAK).

Author information

Authors and affiliations.

Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Sivas University of Science and Technology, Sivas, Turkey

Halit Bakır

WISE Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey

Özlem Ceviz

Contributions

Halit Bakır executed the experiment applications, visualizations, graphics, and article revisions. Özlem Ceviz authored the initial manuscript version and conducted the first set of experiments. Additionally, Halit Bakır provided supervision throughout the process.

Corresponding author

Correspondence to Halit Bakır .

Ethics declarations

Conflict of interest.

Not applicable.

Ethical Approval

This research was carried out during the 'Python for Artificial Intelligence (Python ile Yapay Zeka)' course at Sivas University of Science and Technology during the fall semester of 2022–2023.

Additional information

Khaled Bakour or Halit Bakır: Due to the author’s dual citizenship, his name can be written in two different ways.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Bakır, H., Ceviz, Ö. Empirical Enhancement of Intrusion Detection Systems: A Comprehensive Approach with Genetic Algorithm-based Hyperparameter Tuning and Hybrid Feature Selection. Arab J Sci Eng (2024). https://doi.org/10.1007/s13369-024-08949-z

Received : 27 August 2023

Accepted : 10 March 2024

Published : 12 April 2024

DOI : https://doi.org/10.1007/s13369-024-08949-z

  • Intrusion detection systems (IDSs)
  • Machine learning (ML)
  • Hyperparameters tuning
  • Genetic algorithm

COMMENTS

  1. A review on genetic algorithm: past, present, and future

    In this paper, the analysis of recent advances in genetic algorithms is discussed. The genetic algorithms of great interest in research community are selected for analysis. This review will help the new and demanding researchers to provide the wider vision of genetic algorithms. The well-known algorithms and their implementation are presented with their pros and cons. The genetic operators and ...

  2. A review on genetic algorithm: past, present, and future

    The research work related to genetic algorithm for multimedia applications were also included. During the screening of research papers, all the duplicate papers and papers published before 2007 were discarded. 4340 research papers were selected based on 2007 and duplicate entries.

  3. An improved genetic algorithm and its application in neural ...

    The choice of crossover and mutation strategies plays a crucial role in the searchability, convergence efficiency and precision of genetic algorithms. In this paper, a novel improved genetic algorithm is proposed by improving the crossover and mutation operation of the simple genetic algorithm, and it is verified by 15 test functions. The qualitative results show that, compared with three ...

  4. Genetic Algorithm: Reviews, Implementations, and Applications

    Keywords: Genetic Algorithm, Search Techniques, Random Tests, Evolution, Applications. Introduction: The GA is a meta-heuristic motivated by the evolution process and belongs to the large class of evolutionary algorithms in informatics and computational mathematics.

  5. The Applications of Genetic Algorithms in Medicine

    A great wealth of information is hidden amid medical research data that in some cases cannot be easily analyzed, if at all, using classical statistical methods. ... In this paper, we introduce the genetic algorithm (GA) as one of these metaheuristics and review some of its applications in medicine. ... Genetic-algorithm-based multiple ...

  6. Genetic Algorithm: Reviews, Implementations, and Applications

    Nowadays genetic algorithm (GA) is greatly used in engineering pedagogy as an adaptive technique to learn and solve complex problems and issues. It is a meta-heuristic approach that is used to solve hybrid computation challenges. GA utilizes selection, crossover, and mutation operators to effectively manage the searching system strategy. This algorithm is derived from natural selection and ...

  7. Genetic algorithms for modelling and optimisation

    Abstract. Genetic algorithms (GAs) are a heuristic search and optimisation technique inspired by natural evolution. They have been successfully applied to a wide range of real-world problems of significant complexity. This paper is intended as an introduction to GAs aimed at immunologists and mathematicians interested in immunology.

  8. Genetic Algorithm- A Literature Review

    The Genetic Algorithm (GA) may be described as a method for optimizing the search for solutions to difficult problems, based on the principle of genetic selection. In addition to optimization, it also serves machine learning and research and development. It is analogous to biology for chromosome generation, with variables such as selection, crossover and mutation together constituting genetic ...

  9. An Efficient Genetic Algorithm based Auto ML Approach for

    In recent years, AutoML is booming as the time-consuming and iterative tasks involved in developing a machine learning model can be automated using AutoML. It aims to lessen the requirement for skilled individuals to create the ML model. Additionally, it helps to increase productivity and advance machine learning research. Hence, this paper focusses on developing an AutoML model using genetic ...

  10. [PDF] Genetic algorithms

    2023. TLDR: An intelligent genetic crossover algorithm (IGCA) is introduced that assists PF by applying crossover schemes employed in genetic algorithms (GAs) to reshape the approximated posterior PDF and shows that the proposed algorithm improved the accuracy and performance in nonlinear state estimation.

  11. Genetic Algorithm: An Approach on Optimization

    Solutions for both constrained and unconstrained problems of optimization pose a challenge from the past till date. The genetic algorithm is a technique for solving such optimization problems based on biological laws of evolution particularly natural selection. In simple terms, a genetic algorithm is a successor to the traditional evolutionary algorithm where at each step it will select random ...

  12. (PDF) A Genetic Algorithm-Based Feature Selection

    A genetic algorithm was proposed by Babatunde et al. [98] which used combinatorial set of 100 extracted features from leaf datasets. Babatunde in his enhanced version of research [99] added 12 ...

  13. (PDF) Genetic Algorithms

    In this paper, we propose a Genetic Algorithm based method that optimizes heterogeneous sensor node clustering. Compared with five state-of-the-art methods, our proposed method greatly extends the ...

  14. Design and optimization of wall-climbing robot impeller by genetic

    Genetic algorithm (GA) is an optimization algorithm based on the evolutionary theory of "natural selection by nature, survival of the fittest." It mainly involves operations such as selection ...

  15. A bio-inspired convolution neural network architecture for automatic

    This paper proposes a bio-inspired CNN model for breast cancer detection using gene expression data downloaded from the cancer genome atlas (TCGA). ... (WOA-CNN), the Genetic Algorithm (GA-CNN ...

  16. A Study on Genetic Algorithm and its Applications

    Genetic algorithms (GA) are search algorithms based on the principles of natural selection and genetics, introduced by J. Holland in the 1970s and inspired by the biological evolution of ...

  17. Genetic Algorithms: Brief review on Genetic Algorithms for Global

    The foundation of genetic algorithms, which is based on Darwin's "survival of the fittest" principle, is explained, then outlining the algorithm's primary features and briefly discussing its drawbacks. An intelligent bionic algorithm with great global optimization potential, the genetic algorithm evolved in a manner analogous to the natural process of genetic evolution in living creatures.

  18. GA Explained

    Introduced by Scholz in "Genetic Algorithms and the Traveling Salesman Problem a historical Review". Genetic Algorithms are search algorithms that mimic Darwinian biological evolution in order to select and propagate better solutions. Source: Genetic Algorithms and the Traveling Salesman Problem a historical Review.

  19. Summary of genetic algorithms research

    2015. TLDR: This paper presents a self-adaptive ant colony algorithm with an aberrance gene for optimizing PID parameters. It overcomes the genetic algorithm's repeated iteration and slow solving efficiency, as well as the ordinary ant colony algorithm's slow convergence, tendency to stagnate, and weak global search.

  20. Optimized Design of Fixed-Sun Mirror Field Based on Genetic Algorithm

    This paper investigates the optimization design problem in a heliostat mirror field based on genetic algorithm and Monte Carlo fusion. Firstly, the annual average optical efficiency, annual average output thermal power, and annual average output thermal power per unit mirror area of the heliostat mirror field are calculated, and the ray tracing model is established. Firstly, the altitude angle ...

  21. PDF Optimized decoder for low-density parity check codes based on genetic

    papers involved the genetic algorithm (GA) in coding theory, in particular, Low-density parity check (LDPC) codes, are a family of error-correcting codes, their performances close to the Shannon ...

  22. Coatings

    Feature papers represent the most advanced research with significant potential for high impact in the field. ... The NSGA-II algorithm is developed on the basis of genetic algorithms, ... "Optimization of Milling Process Parameters for Fe45 Laser-Clad Molded Parts Based on the Nondominated Sorting Genetic Algorithm II" Coatings 14, no. 4: 449 ...

  23. Hierarchical non-dominated sort: analysis and improvement

    Pareto dominance-based multiobjective evolutionary algorithms use non-dominated sorting to rank their solutions. In the last few decades, various approaches have been proposed for non-dominated sorting. However, the running time analysis of some of the approaches has some issues and they are imprecise. In this paper, we focus on one such algorithm namely hierarchical non-dominated sort (HNDS ...

  24. Rainfall Prediction using Hybridized Genetic Algorithm-Based Artificial

    Semantic Scholar extracted view of "Rainfall Prediction using Hybridized Genetic Algorithm-Based Artificial Neural Network (GAANN) and Genetic Algorithm-Based Support Vector Machine (GA-SVM) Models."

  25. Matching area selection for arctic gravity matching navigation based on

    The purpose is to verify that the method of selecting the suitable matching area based on the AT-AEE algorithm proposed in this paper is better than the traditional algorithms. The selected suitable matching area has strong suitability, which helps improve the navigation performance of the gravity matching-aided navigation system in the Arctic ...

  26. Empirical Enhancement of Intrusion Detection Systems: A ...

    This paper introduces a comprehensive empirical and quantitative exploration aimed at enhancing intrusion detection systems (IDSs). The study capitalizes on a genetic algorithm-based hyperparameter tuning mechanism and a pioneering hybrid feature selection approach to systematically investigate incremental performance improvements in IDS.
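
Several of the summaries above describe a genetic algorithm as a loop of selection, crossover, and mutation. The following is a minimal Python sketch of that loop, not the method of any listed paper. Everything in it is an assumption made for illustration: the toy OneMax objective (count the 1-bits), tournament selection, single-point crossover, bit-flip mutation, one-individual elitism, and all function names and parameter values.

```python
import random


def fitness(bits):
    # OneMax: count the 1-bits. A toy objective assumed for illustration.
    return sum(bits)


def tournament(pop, rng, k=3):
    # Selection: return the fittest of k randomly drawn individuals.
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    # Single-point crossover: head of one parent, tail of the other.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate=0.05):
    # Bit-flip mutation: flip each gene independently with probability `rate`.
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(n_bits=20, pop_size=30, generations=50, seed=0):
    # All parameter defaults are hypothetical values picked for the sketch.
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # best fitness observed at the start of each generation
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        # Elitism: carry the best individual over unchanged, so the best
        # fitness can never decrease from one generation to the next.
        pop = [elite] + children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = run_ga()
    print("best individual:", best)
    print("best fitness per generation:", history)
```

Because the elite individual is copied into each new population, the recorded best fitness is non-decreasing. Removing the elitism line would let it fluctuate, which is one reason many practical variants keep some form of elite preservation.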