
Sensors (Basel)

Intelligent Techniques for Detecting Network Attacks: Review and Research Directions

Malak Aljabri, Sumayh S. Aljameel, Rami Mustafa A. Mohammad, Sultan H. Almotiri, Samiha Mirza, Fatima M. Anis, Menna Aboulnour, Dorieh M. Alomari, Dina H. Alhamed, and Hanan S. Altamimi

1. Computer Science Department, College of Computer and Information Systems, Umm Al-Qura University, Makkah 21955, Saudi Arabia
2. SAUDI ARAMCO Cybersecurity Chair, Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia
3. Department of Computer Science, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia
4. Department of Computer Information Systems, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia
5. SAUDI ARAMCO Cybersecurity Chair, Department of Computer Engineering, College of Computer Science and Information Technology, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, Dammam 31441, Saudi Arabia

Abstract

The significant growth in Internet use and the rapid development of network technologies are associated with an increased risk of network attacks. Network attacks refer to all types of unauthorized access to a network, including any attempts to damage and disrupt it, often with serious consequences. Network attack detection is an active area of research in the cybersecurity community. The literature describes a variety of network attack detection systems built on intelligent techniques, including machine learning (ML) and deep learning (DL) models. However, although such techniques have proved useful within specific domains, no technique has proved effective against all kinds of network attacks. This is because some intelligent-based approaches lack the essential capabilities needed to make them reliable against different types of network attacks. This was the main motivation behind this research, which evaluates contemporary intelligent-based research directions to address the gap that still exists in the field. The main components of any intelligent-based system are the training datasets, the algorithms, and the evaluation metrics; these were the main benchmark criteria used to assess the intelligent-based systems included in this article. This research provides a rich source of references for scholars seeking to determine their scope of research in this field. Furthermore, although the paper presents a set of suggestions about future research directions, it leaves the reader free to derive additional insights about how to develop intelligent-based systems to counter current and future network attacks.

1. Introduction and Background

Rapid advancements in technology have made the Internet easily accessible, and it is now actively used by the majority of people for a plethora of professional and personal tasks. Various sensitive activities, including communication, information exchange, and business transactions, are carried out over the Internet. The Internet helps foster connection and communication, but the integrity and confidentiality of these connections and exchanges can be compromised by attackers who seek to damage and disrupt network connections and network security. The number of attacks targeting networks is increasing over time, creating a need to analyze and understand them and to develop more robust security protection tools. Every organization, industry, and government requires network security solutions to protect it from the ever-growing threat of cyber-attacks. Since no network is immune to attacks, the need for more effective and stable network security systems to protect business and client data continues to rise.

Several techniques have been proposed over the years to handle and classify network traffic attacks. One is the port-based technique, which identifies applications by their port numbers as registered with the Internet Assigned Numbers Authority (IANA) [ 1 ]. However, as the number of applications has grown, the number of unpredictable ports has increased, and this technique has proven ineffective. Furthermore, it does not account for applications that do not register their ports with the IANA or that use dynamic port numbers. Another proposed technique is the payload-based technique, also known as deep packet inspection (DPI), in which network packet contents are inspected and matched against an existing set of signatures stored in a database [ 1 ]. This method provides more accurate results than the port-based technique but does not work on network applications using encrypted data. Furthermore, it has proven complex, with high computational costs and a heavy processing load [ 1 ]. Behavioral classification techniques analyze the entire network traffic received at the host in order to identify the type of application [ 2 ]. Traffic patterns can be analyzed graphically as well as by examining heuristic information, for example, transport layer protocols and the number of distinct ports contacted. Although behavioral techniques yield good results, as they are able to detect unknown threats, they are resource-intensive and prone to false positives. Another technique, called the rationale-based or statistical technique [ 2 ], examines the statistical characteristics of traffic flow, namely the number of packets and the maximum, mean, and minimum packet sizes. These statistical characteristics can identify different applications because the measurements are unique to each application.
However, there is a growing need to combine this approach with techniques that could improve the accuracy and speed of classifying the statistical patterns. Correlation-based classification [ 2 ] accumulates packets into flows; that is, it collects data packets sharing the same source and destination IP, port, and protocol, and classifies them according to the correlation between network flows. Multiple flows are usually aggregated further into a Bag of Flows (BoF). Although this technique performs better than statistical techniques, as it overcomes the problem of feature redundancy, it carries a high computational overhead for feature matching. Therefore, the need to create techniques that can overcome these rising challenges persists.
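The statistical and correlation-based techniques described above can be illustrated with a small sketch: packets are grouped into flows by their 5-tuple, and per-flow statistics (packet count; minimum, mean, and maximum packet size) are computed. The packet tuples below are invented for illustration, not taken from any real capture:

```python
from collections import defaultdict

def flow_stats(packets):
    """Group packets by 5-tuple and compute simple per-flow statistics.

    Each packet is (src_ip, dst_ip, src_port, dst_port, proto, size).
    Returns {flow_key: (packet_count, min_size, mean_size, max_size)}.
    """
    flows = defaultdict(list)
    for src, dst, sport, dport, proto, size in packets:
        flows[(src, dst, sport, dport, proto)].append(size)
    return {
        key: (len(sizes), min(sizes), sum(sizes) / len(sizes), max(sizes))
        for key, sizes in flows.items()
    }

# Hypothetical packets: two flows with different size profiles.
packets = [
    ("10.0.0.1", "10.0.0.2", 5555, 80, "TCP", 60),
    ("10.0.0.1", "10.0.0.2", 5555, 80, "TCP", 1500),
    ("10.0.0.1", "10.0.0.2", 5555, 80, "TCP", 1500),
    ("10.0.0.3", "10.0.0.2", 4444, 53, "UDP", 80),
]
stats = flow_stats(packets)
print(stats[("10.0.0.1", "10.0.0.2", 5555, 80, "TCP")])  # (3, 60, 1020.0, 1500)
```

A correlation-based classifier would then compare such flow records against one another rather than treating each flow in isolation.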

At the onset of the 21st century, intelligent techniques, namely machine learning (ML) and deep learning (DL), became widespread. Researchers widely acknowledged that these techniques could greatly increase computational potential, since they use statistical methods and data to let computers learn in a human-like way. Hence, computer scientists began applying these intelligent techniques to network security, as they addressed the limitations of the non-intelligent techniques. In the field of network security, ML or DL algorithms can be trained on network data to recognize traffic as normal or malicious and thus protect the network from intruders. Furthermore, the algorithms can be trained to identify the attack type when the traffic is malicious and to trigger appropriate action to prevent the attack. By analyzing past cyber-attacks, a model can be taught to prepare tailored defensive reactions. These applications of intelligent methods in network security, which are the focal point of this research paper, can be useful in big businesses, organizations, law enforcement agencies, and banks that store sensitive information, as well as in personal networks.

In the past, most of the developed network attack detection techniques actively depended on a set of pre-defined signature-based attacks. This was a major setback since the database of the attacks needed to be constantly updated as the attackers found new ways to exploit network security. However, with the evolution of intelligent-based techniques such as ML and DL, the predictive accuracy of identifying and classifying network attacks has been greatly improved. Therefore, using intelligent-based techniques in network security is a thriving field for research that needs to be explored.

Although several review articles exploring how intelligent-based systems have been applied to detect network attacks have been published in the last few years, none have been found that are as comprehensive as this article. This article covers almost one hundred research articles produced from 2010 to 2021 on a range of network attacks. It will provide clear insights into the race between developing intelligent systems to counter network attacks and how these attacks have evolved to circumvent intelligent systems, thus highlighting gaps in the research and indicating potential future research areas. This research also applied a different taxonomy that, to the best of our knowledge, has not been used in any previous research. It sets up several criteria against which the articles being reviewed could be assessed and compared including:

  • (i) Which classification algorithm(s) were implemented?
  • (ii) Which dataset(s) were employed to develop the intelligent systems?
  • (iii) How do the reported results compare across different evaluation metrics?

It then discusses the answers to the following main questions:

  • (i) Which algorithm(s) were commonly implemented, and for which kinds of attacks?
  • (ii) Which dataset(s) are considered more reliable based on the results obtained?

The resulting comparisons and discussions will help future researchers to identify the directions to take in their research, that is, to either improve the intelligent-based algorithms or consider other algorithms, to identify the features that should be added or removed when building the training dataset, and to indicate the evaluation metrics that should be adopted to evaluate the created intelligent systems.

The outcomes of this paper provide valuable directions for further research and applications in the field of applying effective and efficient intelligent techniques in network analytics.

This article is organized into four sections. The first section provides an introduction and background to the research area. A brief overview of network attacks is presented in Section 2 . Section 3 discusses intelligent network attack mitigation techniques where all the reviewed research papers, the network attacks they address using ML and DL techniques, and their findings are presented. Finally, the last section provides a discussion of the findings and the ideas presented in the papers reviewed and sets out promising research directions.

2. Network Attacks

For decades, networking technologies have been used to improve data transfer and circulation. Their continuous improvements have facilitated a wide range of new services.

The Internet of Things (IoT) is a powerful tool for improving communication by connecting different devices to the Internet and collecting data. The information gathered helps firms analyze and forecast consumer behavior in order to enhance the quality of their products. Nowadays, ML and DL are being used to construct network systems that can conduct advanced analytics and automation. This technology is transforming users' networking experiences by combining simulated human intelligence and gathered data with built-in algorithms [ 3 ].

The emerging cloud computing technologies have brought about remarkable evolutions in network technology where different applications, services, and computing and storage resources are offered on demand to a large number of users via the Internet, thus offering tremendous advantages including flexibility, minimal administrative efforts, cost effective resource utilization, high accessibility, efficiency, and reliability [ 4 ].

A new global wireless standard is the fifth-generation (5G) mobile network, a logical network type that can connect essentially anything, including machines, objects, and gadgets. Not only does 5G offer faster speeds and a greater number of linked devices, it also enables network slicing: partitioning a single shared network infrastructure into several virtual subnetworks that meet the demands of different applications. From entertainment and gaming to education and community safety, 5G network technology has the potential to transform a wide range of domains. 5G promises higher download rates, real-time responses, and improved connectivity over time, allowing companies and consumers to explore new innovations [ 5 ].

Such an exponential growth in network technologies has offered many advantages and has greatly improved communications. However, each emerging network technology presents new security challenges and triggers the need for the development of detection tools and countermeasures to meet the new demands. The following subsections briefly discuss the main types of network attacks.

2.1. Types of Network Attacks

A network attack is an attempt to damage, expose, alter, destroy, steal, or gain unauthorized access to a network system resource. The attack can originate from inside the network (an internal attack) or from outside (an external attack). Table 1 lists and describes a number of network attack types that disrupt communication, classifying them as active attacks, passive attacks, bitcoin attacks, account attacks, or security breaches [ 6 ].

Table 1. Types of network attacks.

Active Attacks

  • Jamming Attack — prevents other nodes from connecting by occupying the channel on which they communicate. (Via: radio-frequency noise.)
  • Flooding — a DoS attack in which a server receives many connection requests that are never completed into a full handshake (ICMP flood, SYN flood, HTTP flood). (Via: an unbounded number of requests with no acknowledgment of received packets.)
  • Smurf Attack — a network-layer DDoS attack caused by network-tool misconfiguration. (Via: a source IP spoofing the victim's IP.)
  • Teardrop Attack — a DoS attack that bombards a network with many Internet Protocol (IP) data fragments that the network cannot reassemble into the original packets. (Via: sending fragmented packets to the target machine.)
  • Ransomware — a form of malware that infiltrates and encrypts important files and systems, preventing a person from accessing their own data. (Via: B0r0nt0k (encryption ransomware), Mado (malicious program).)
  • Session Hijacking — disrupts the session token by stealing or guessing a valid session token (e.g., a predictable session token) to gain unauthorized access to the web server. (Via: malicious JavaScript code, XSS, session sniffing.)
  • Active Reconnaissance — an intruder engages with the target system to acquire information about vulnerabilities (e.g., port scanning). (Via: Nmap, Metasploit.)
  • Passive Reconnaissance — gathering information about computers and networks without actively engaging with them (e.g., eavesdropping, OS fingerprinting). (Via: Wireshark, Shodan.)
  • Traffic Analysis — gathering and monitoring wireless frames, packets, or messages to derive information about communication patterns. (Via: sniffing tools.)
  • War Driving — mapping wireless access points with vulnerable wireless networks from moving cars. (Via: iStumbler, Global Positioning System (GPS), antenna, Wifiphisher.)
  • Zero Access — an attack with an unknown pattern, or one that exploits a potentially serious software vulnerability of which developers and security personnel are not yet aware. (Via: undiscovered vulnerabilities; the hardest to detect.)
  • Credential Stuffing — attackers break into a system using lists of compromised user credentials (e.g., dictionary attack). (Via: bots for automation, fake IP addresses.)
  • Account Takeover — akin to identity theft: a criminal gains unauthorized access to another person's account (e.g., phishing, call-center fraud). (Via: obtaining compromised credentials.)
  • Account Lockout — an attacker who lacks genuine users' credentials nevertheless harms them by abusing security mechanisms (e.g., brute-force attempts). (Via: locking a huge number of user accounts.)
  • Vulnerability Scanning — a continuous automated process of finding security flaws in websites on a network in order to exploit, threaten, and attack them. (Via: bots that look for security issues and match them to known vulnerabilities in a database.)
  • API Abuse — unauthorized or unlawful access to a server's API via mobile or desktop applications. (Via: stealing application code containing valuable intellectual property.)
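Several of the flooding-style attacks in Table 1 share a detectable signature: a single source issuing far more connection requests per unit time than normal traffic would. A minimal sliding-window monitor along those lines might look like the sketch below; the window length, threshold, and traffic are all hypothetical:

```python
from collections import deque, defaultdict

def flood_monitor(events, window=1.0, threshold=100):
    """Flag sources exceeding `threshold` connection attempts per `window` seconds.

    `events` is a time-ordered iterable of (timestamp, src_ip) connection requests.
    Returns the set of flagged source IPs.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps inside the window
    flagged = set()
    for ts, src in events:
        q = recent[src]
        q.append(ts)
        while q and ts - q[0] > window:  # drop timestamps outside the window
            q.popleft()
        if len(q) > threshold:
            flagged.add(src)
    return flagged

# Hypothetical traffic: one source sending 500 requests/second, one normal source.
events = [(i * 0.001, "203.0.113.9") for i in range(500)]
events += [(i * 0.5, "198.51.100.7") for i in range(5)]
print(flood_monitor(sorted(events)))  # {'203.0.113.9'}
```

Real detectors are more elaborate (and must cope with spoofed sources, as in the smurf attack above), but the per-source rate counter is the core idea.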

2.2. Network Attack Detection and Prevention Techniques

Security and defense systems are designed to identify, defend against, and recover from network attacks. Confidentiality, availability, and integrity are the three primary aims of network security systems. Network intrusion detection and prevention techniques can be classified by whether they detect network threats, prevent them, or do both. These techniques are implemented as software, hardware, or a combination of the two, and fall into two classes: intrusion detection systems (IDS) and intrusion prevention systems (IPS) [ 6 , 7 ].

  • Intrusion Detection System (IDS): Also referred to as network-based IDS (NIDS). This system closely monitors the network for malicious activity and notifies administrators if an attack is detected, but has no prevention capabilities. Signature-based and anomaly-based detection are the two most prevalent approaches used by IDS to identify threats. Signature-based procedures detect only known threats, relying on a database of pre-existing characteristics of known attacks (attack signatures) to identify suspicious events; the database must be continuously updated to include emerging attacks. Anomaly-based procedures, on the other hand, attempt to differentiate malicious traffic from legitimate traffic based on changes in network behavior, and thus can detect unknown threats. Inconsistencies such as unusually high traffic volume, network latency, traffic from uncommon ports, and abnormal system performance all represent departures from the system's normal behavior and can indicate the presence of network attacks.
  • Intrusion Prevention System (IPS): Known also as intrusion detection and prevention systems (IDPS). It scans the network continuously for the presence of illegal or rogue control points that are detected on the basis of changes in behavior. The system automatically takes countermeasures to tackle the threats and defend the system. The primary objective of an IDPS is to keep malicious or undesired packets and attacks from causing any harm. IDPS is more effective than IDS as it not only detects threats, but is able to take action against them. There are two types of IDPS: network-based intrusion detection and prevention systems (NIDPS) that analyze the network protocol to identify any suspicious activities and host-based intrusion detection and prevention systems (HIDPS) that are used to monitor host activities for any suspicious events within the host.
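The anomaly-based detection described above can be reduced to a simple baseline model for illustration: learn the mean and standard deviation of some traffic metric (here, bytes per minute) from normal traffic, then flag observations whose z-score exceeds a threshold. The sample values and the 3-sigma threshold below are invented for the sketch:

```python
import statistics

def fit_baseline(normal_samples):
    """Learn a simple baseline (mean, std) from normal traffic volumes."""
    return statistics.mean(normal_samples), statistics.stdev(normal_samples)

def is_anomalous(value, mean, std, z_threshold=3.0):
    """Flag a traffic observation whose z-score exceeds the threshold."""
    return abs(value - mean) / std > z_threshold

# Hypothetical bytes-per-minute counts observed during a quiet period.
normal = [980, 1020, 1005, 995, 1010, 990, 1000, 1015]
mean, std = fit_baseline(normal)
print(is_anomalous(1002, mean, std))    # False: ordinary volume
print(is_anomalous(250000, mean, std))  # True: flood-like spike
```

Production anomaly detectors model many metrics jointly, which is precisely where the ML and DL techniques surveyed in the next section come in.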

To identify attacks effectively and efficiently, a variety of detection approaches are constantly being developed based on intelligent techniques including ML and DL, which have recently gained immense popularity in the network security field.

3. Intelligent Network Attack Mitigation Techniques

In this section, research studies that used intelligent models to detect different cyber-attack types are reviewed and their findings summarized. Several ML algorithms have been used in these studies, including classification, regression, and clustering techniques such as logistic regression (LR) and decision trees (DT), which visually represent the sequence of decisions in the form of a tree. Some studies used random forest (RF), an ensemble of DTs. Support vector machine (SVM) was widely used in classification due to its ability to distinctly separate data points by building a hyperplane in an n-dimensional space, where n is the number of features. Another widely used ML classifier is naïve Bayes (NB), a supervised learning model based on Bayes' theorem of probability. Finally, some researchers used K-nearest neighbor (KNN) for classification and K-means clustering, an unsupervised approach. Further details about these algorithms can be found in [ 8 ].
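Of the classical algorithms listed above, KNN is simple enough to sketch from scratch: a query point takes the majority label of its k nearest training points. The two-feature flow records below (packets per second, mean packet size) are invented purely for illustration:

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled flows: (packets/s, mean packet size) -> label.
train = [
    ((5, 900), "benign"), ((8, 1100), "benign"), ((6, 1000), "benign"),
    ((900, 60), "ddos"), ((1200, 64), "ddos"), ((800, 58), "ddos"),
]
print(knn_predict(train, (1000, 62)))  # ddos
print(knn_predict(train, (7, 950)))    # benign
```

In practice, features would be scaled before computing distances, since KNN is sensitive to feature magnitudes.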

DL is a subset of ML, which in turn is a subset of artificial intelligence (AI). A number of DL techniques have been used to build detection models, primarily the artificial neural network (ANN), an information-processing system consisting of several layers that works best with non-linear dependence, and the recurrent neural network (RNN), a type of ANN with a memory function that retains previous content. Another commonly used DL technique is the convolutional neural network (CNN), a type of ANN inspired by human vision. Furthermore, the deep neural network (DNN), a supervised ANN that learns the mathematical mapping from input to output, has been used by some authors. Long short-term memory (LSTM), a type of RNN designed to model temporal sequences more accurately, and the multi-layer perceptron (MLP), a feed-forward ANN with multiple layers arranged as a directed graph, have also been widely used. Finally, the gated recurrent unit (GRU), a variant of LSTM considered more efficient because it uses less memory and executes faster, has also been used. More information about these algorithms can be found in [ 9 ].

3.1. Problem Domains of the Reviewed Articles

The papers were classified according to the cyber-attack type on which they focused. The attack types covered in this section are insider threats, DDoS attacks, zero-day attacks, phishing attacks, malware attacks, and botnet attacks. We then review articles that did not target specific attacks but aimed to identify attacks on IoT networks, classify malicious traffic into different attack types, or identify attacks at the DNS level. Finally, we also mention papers targeting network intrusion detection in general.

3.1.1. Insider Threat

Cybersecurity measures have tended to focus on threats outside an organization rather than insider threats, which can be just as harmful. Therefore, researchers have started to look at different techniques to identify insider threats. Tuor et al. [ 10 ] built a model using principal component analysis (PCA) for feature selection and unsupervised DL models, namely DNN, RNN, SVM, isolation forest, DNN-Ident, DNN-Diagnosis, LSTM-Ident, and LSTM-Diagnosis, among others, that use system logs to detect anomalous activities in the network. The dataset used was the synthetic CERT insider threat v6.2 dataset [ 11 ], taken from the event log lines of a simulated organization's computer network. The researchers targeted two prediction approaches: the “next time step” and the “same time step”. The experiments showed that the “same time step” approach yielded higher performance and that the isolation forest model was the strongest. Recall was used to evaluate the proposed models, and DNN-Diagnosis, LSTM-Diagnosis, and isolation forest all obtained 100% recall. In future work, the researchers may apply the proposed model to a wider range of streaming tasks and explore different granularities of time.

Similarly, LSTM and CNN techniques were used by Yuan et al. [ 12 ] to build a model to detect insider threats. They applied the model on the CERT insider threat v4.2 dataset [ 13 ], which contained 32 M log lines among which 7323 were anomalous activities. The advantage of this version of the CERT dataset was that it contained more samples of insider threats than other versions. The train–test split was 70–30%. The researchers first used LSTM to extract the user behavior, abstracted temporal features, and produced the feature vectors. After that, the researchers transformed the feature vectors into fixed-size matrices. Finally, CNN was used to classify the feature matrices into anomaly or normal. The proposed model resulted in an area under the curve (AUC) of 94.49%.

Hu et al. [ 14 ] used DL methods to build a user authentication model based on characteristics of mouse behaviors that could be used to monitor and detect insider authentications. They used an open-source dataset called the Balabit Mouse Dynamics Challenge dataset [ 15 ], and CNN algorithm. CNN showed high performance in user authentication based on mouse features with a false acceptance rate (FAR) of 2.94% and a false rejection rate (FRR) of 2.28%.
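The FAR and FRR figures reported by Hu et al. are straightforward to compute from labeled authentication attempts; a reference sketch, using invented mouse-dynamics scores rather than the Balabit data, is:

```python
def far_frr(scores, labels, threshold):
    """Compute false acceptance rate and false rejection rate.

    `scores` are authentication scores (higher = more likely the genuine user);
    `labels` are True for the genuine user, False for impostors.
    FAR = impostors accepted / impostor attempts;
    FRR = genuine attempts rejected / genuine attempts.
    """
    impostors = [s for s, g in zip(scores, labels) if not g]
    genuine = [s for s, g in zip(scores, labels) if g]
    far = sum(s >= threshold for s in impostors) / len(impostors)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

# Hypothetical per-session authentication scores.
scores = [0.9, 0.8, 0.75, 0.4, 0.3, 0.85, 0.2, 0.55]
labels = [True, True, True, False, False, True, False, False]
print(far_frr(scores, labels, threshold=0.6))  # (0.0, 0.0)
```

Raising the threshold trades FAR for FRR, which is why both rates are reported together.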

3.1.2. DDoS Attacks

One of the most harmful threats in network security is distributed denial of service (DDoS) attacks that attempt to disrupt the availability of services. Since DDoS is easy to launch but not easy to detect, as in most cases the attack traffic is very similar to legitimate traffic, some researchers have focused solely on detecting them using different ML approaches.

Yuan et al. [ 16 ] proposed DeepDefense, which is a DL-based DDoS attack detection approach that can study network traffic sequence patterns and trace the network attack activities. They used the UNB ISCX intrusion detection evaluation 2012 (ISCX2012) dataset [ 17 ], and the RNN algorithm to build the model. From ISCX2012, the team extracted 20 network traffic fields to generate a 3-D feature map using a sliding time window. Data14 and data15 were extracted from ISCX2012, which contained 9.6 M packets and 34.9 M packets, respectively. The total number of training samples in data14 and data15 were 15,176 and 233,450, respectively. The experiment results showed that the DL models reduced the error rate by 39.69% compared to ML methods in a small dataset. For large datasets, the reduction in the error rate ranged from 7.517% to 2.103%. For future work, they suggested increasing the diversity of DDoS vectors and system settings to test the DeepDefense model as well as compare DeepDefense against other ML algorithms.
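DeepDefense's sliding-window construction can be sketched as follows: successive per-packet feature vectors are stacked into fixed-length overlapping windows that an RNN can consume as training sequences. The field choices and window length below are illustrative, not the paper's exact settings:

```python
def sliding_windows(records, window=4, stride=1):
    """Turn a packet-feature sequence into overlapping fixed-length windows.

    `records` is a list of per-packet feature vectors; each window is a
    (window x n_features) matrix usable as one sequence-model training sample.
    """
    return [records[i:i + window]
            for i in range(0, len(records) - window + 1, stride)]

# Hypothetical per-packet features: (size, inter-arrival time, TCP flag bits).
records = [(60, 0.01, 2), (1500, 0.02, 24), (1500, 0.01, 16),
           (60, 0.30, 17), (60, 0.01, 2), (1500, 0.02, 24)]
wins = sliding_windows(records, window=4)
print(len(wins), len(wins[0]))  # 3 4
```

The paper's 3-D feature map is the stack of such windows: samples x time steps x features.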

A model for analyzing and detecting DDoS attacks at the network and service levels of the bitcoin ecosystem was proposed by Baek et al. [ 18 ]. The dataset consisted of real DDoS attacks [ 19 ] and contained the service affected, date of the attack, category of service, number of posts, etc. From the bitcoin block data, the researchers extracted statistical features such as the maximum, minimum, sum, and standard deviation. They used PCA to perform feature extraction. MLP was used to detect DDoS attacks, with the training, validation, and testing sets divided according to a 6:2:2 ratio. The results showed that the accuracy of DDoS attack detection was about 50%, and the accuracy for classifying normal block data was about 70%, with the number of epochs set to 100. In future work, the researchers aim to determine how to extract features that capture the characteristics of blocks created while a DDoS attack is underway.
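The 6:2:2 split used by Baek et al. can be reproduced in a few lines; the shuffle seed and sample data here are of course arbitrary:

```python
import random

def split_622(samples, seed=0):
    """Shuffle and split samples into 60% train, 20% validation, 20% test."""
    data = list(samples)
    random.Random(seed).shuffle(data)  # deterministic shuffle for reproducibility
    n = len(data)
    n_train, n_val = int(n * 0.6), int(n * 0.2)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

train, val, test = split_622(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

The validation portion is what allows hyperparameters (such as the number of epochs) to be tuned without touching the test set.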

Sabeel et al. [ 20 ] used DNN and LSTM for binary prediction of unknown DoS and DDoS attacks. To train the models, they used the CICIDS2017 dataset (283 MB) [ 17 ]. For testing, a new dataset called ANTS2019 (330 MB), which mimics real-life attacks, was generated in a simulated environment to measure performance. In feature engineering, 78 features were used for the training set and 77 for testing (the ‘Fwd Header length’ feature was dropped). The train–test split was 75–25%. When the model was trained using CICIDS2017 and part of ANTS2019, the highest evaluation accuracy, 99.68%, was obtained for DNN. When the researchers retrained the models on a dataset with new unknown attacks, the true positive rate (TPR) obtained was 99.8% and 99.9% for DNN and LSTM, respectively. They concluded that, to maintain performance, the models should be updated with new attacks at regular intervals.

An intrusion detection system (IDS) against DDoS attacks called DDoSNet, a combination of an autoencoder (AE) with an RNN, was built by Elsayed et al. [ 21 ]. The researchers evaluated their classifier using the newly released CICDDoS2019 dataset [ 22 ], which contains 80 flow features. For feature engineering, PCA was applied, leaving 77 input features. The total numbers of samples in the training, validation, and testing sets were 161,523, 46,150, and 23,000, respectively. When the model was evaluated, the results indicated an accuracy of 99%, outperforming all compared ML methods (SVM, DT, NB, RF, Booster, and LR). In future work, the researchers intend to test the performance of their model on different datasets and extend the work to multiclass classification, since a binary classification framework was applied in this research.

A model that exploited the characteristics of CNNs to classify traffic flows as either benign or malicious was proposed by Doriguzzi-Corin et al. [ 23 ]. The researchers used the CICIDS2018, CICIDS2017, and ISCX2012 datasets, which can be obtained through the Canadian Institute for Cybersecurity of the University of New Brunswick (UNB). They extracted 37,378 DDoS flows and 37,378 randomly selected benign flows from ISCX2012, then repeated the process for CICIDS2017 (97,718 benign and 97,718 DDoS flows) and for CICIDS2018 [ 17 ] (360,832 benign and 360,832 DDoS flows). Following the pre-processing phase, each dataset was split into 90–10% train–test sets. The results showed accuracies of 99.87%, 99.67%, and 98.88% for the three datasets, respectively. The UNB201X dataset was then constructed by combining splits from every year, and the model’s accuracy on UNB201X was 99.46%. In future work, the researchers would like to optimize the pre-processing tool rather than the detection model, and also extend the dataset’s labels.

Ahuja et al. [ 24 ] used various DL algorithms to detect DDoS attacks: CNN, RNN, LSTM, CNN-LSTM, support vector classifier-self-organizing map (SVC-SOM), and stacked autoencoder-multi-layer perceptron (SAE-MLP). The team used the dataset provided by leading India Project Mentor [ 25 ], which consists of 22 features. Two different optimizers were used: stochastic gradient descent (SGD) for the first 10 epochs and Adam for the next 150 epochs. For unencrypted networks, a CNN can extract traffic features automatically. Finally, they evaluated the model using the following metrics: accuracy, precision, recall, F-score, false positive rate (FPR), and false negative rate (FNR). The highest classification accuracy of 99.75% was achieved with the SAE-MLP algorithm.

A study conducted by Shi et al. [ 26 ] focused on using DL for both packet-wise and period-wise DDoS attack detection in network traffic. They proposed DeepDDoS, a model that leveraged a DL approach for DDoS detection and used Spark as a big data processing framework. Additionally, the maximal information coefficient and mutual information were used for feature selection. The LSTM model was used for the training phase due to its better performance on longer sequences. The proposed work tried to filter out abnormal flows with the least computational cost. The dataset used was CICIDS2017 (size 283 MB). The results showed that the model achieved over 99% accuracy after receiving five packets of a continuous flow.

A model that used DL for the detection of multi-vector DDoS attacks on a software-defined network was constructed by Quamar Niyaz et al. [ 27 ]. An SAE-based DL approach was applied, and the team collected traffic from a real network (packets for normal traffic were captured from a network connected to the Internet) and a private network (packets with DDoS attacks were captured from a private lab network) to evaluate the model. They divided the dataset files into training and testing sets, and then normalized them using max–min normalization. For comparison, models with soft-max and neural networks (NN) were also developed. The results showed that SAE performed better than the soft-max and NN models, achieving 95.65% accuracy. The researchers intend to develop a NIDS in the future to detect DDoS along with other attacks, as well as to use DL for feature extraction from raw bytes.
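Max–min normalization, as used here, rescales each feature column to the [0, 1] range; a minimal sketch:

```python
def max_min_normalize(column):
    """Rescale a numeric column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant feature: map everything to 0.0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

normalized = max_min_normalize([10, 20, 15, 30])
# [0.0, 0.5, 0.25, 1.0]
```

In practice, the minimum and maximum are taken from the training set and reused on the test set, so that test data do not leak into the scaling.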

Pande et al. [ 28 ] aimed to build an ML model to detect DDoS attacks. To build the proposed model, a DDoS attack was performed using the ping-of-death technique and detected using the random forest (RF) algorithm. The dataset used by the researchers was NSL-KDD [ 29 ], containing a training set of 125,973 records and a testing set of 22,544 instances, with 41 attributes. The building time of the model was 8.71 s and the testing time was 1.28 s. The proposed RF model resulted in 99.76% accuracy. For future work, the researchers will implement DL techniques to classify the instances.

Radivilova et al.’s [ 30 ] goal was to analyze the main methods of identifying DDoS attacks in network traffic using the SNMP-MIB dataset [ 31 ]. They used RF as the classification method. The experiments began with the training and evaluation of a time-series classifier. Recurrence analysis was used to extract features, and the Hurst exponent was set at 10 intervals during the experiment. The main evaluation metrics were accuracy, FNR, and TPR. A numerical experiment showed that early detection is plausible when the average attack ratio represents 15–20% of the average traffic.

Likewise, Filho et al. [ 32 ] presented a smart detection system for DoS attacks using ML. The goal was to detect both high- and low-volume DDoS attacks. The researchers used RF, perceptron, AdaBoost, DT, SGD, and LR. Since RF achieved higher precision while using 28 variables, it was used for classifying the network traffic. The evaluation of the proposed system was based on four intrusion detection benchmark datasets, namely, CICIDS2017, CICDoS2017 [ 33 ], CICIDS2018, and customized datasets. To evaluate the proposed model, recall, precision, and F-measure (F1) were used. On the CICIDS2018 and CICDoS2017 datasets, the proposed system achieved precision and a detection rate (DR) of more than 93% with a false alarm rate (FAR) of less than 1%. The researchers intend to include an analysis of Heartbleed DDoS attacks and brute-force attacks in their future work and to evolve methods for correlating triggered alarms.
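The recall, precision, and F-measure used in these evaluations derive from the confusion counts; a minimal stdlib-only sketch with toy labels:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive (attack) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f = precision_recall_f1([1, 0, 1, 1, 0], [1, 1, 1, 0, 0])
# precision = recall = F1 = 2/3 on this toy example
```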

Correspondingly, Vijayanand et al. [ 34 ] proposed a detection system for novel DoS attacks using multi-layer deep algorithms arranged in hierarchical order to detect attacks accurately by analyzing smart meter network traffic. The suggested technique addresses issues arising from the large amount of input data and the complexity of the input features. To evaluate the designed model, 9919 records from the CICIDS2017 dataset were used. The performance of the proposed system was analyzed by comparing it with simple multi-layer DL algorithms and hierarchical SVM algorithms, obtaining efficiency values of 39.78% and 99.99%, respectively.

An improved rule induction (IRI)-based model was put forth by Mohammed et al. [ 35 ] for detecting DDoS attacks. The UNSW-NB15 [ 36 ] dataset was used and, following the application of under-sampling without replacement, further pre-processing, and correlation-based feature selection, the final dataset ended up with eight attributes. The suggested algorithm, called IRI for detecting DDoS attacks (IRIDOS), eliminates all insignificant items during model creation and reduces the search space for creating the classification rules. Furthermore, the algorithm stops learning a rule after reaching a ‘rule-power’ threshold. The proposed technique was also evaluated on 13 datasets from the UCI repository. IRI obtained an F1 score of 93.90% on UNSW-NB15. The model attained promising results, especially when compared to other data mining algorithms such as PRISM (a divide-and-conquer knowledge-based approach), PART (a rule-based classification algorithm), and OneRule (OR).

An evaluation and comparison of the performance of different supervised ML algorithms on the CAIDA DoS attack dataset [ 37 ] was carried out by Robinson and Thomas [ 38 ]. Other datasets used were CAIDA Conficker and KDD-99 [ 39 ]. The ML algorithms included NB, RF, MLP, BayesNet, J48, IBK, and Voting. Since the CAIDA Conficker dataset contained DDoS attacks generated from large botnets with easily distinguishable flooding-attack vectors, all ML algorithms except NB achieved an accuracy rate of more than 99% on this dataset.

Research that used the same CAIDA data was conducted by Barati et al. [ 40 ], who developed a hybrid ML technique to detect DDoS attacks. The CAIDA UCSD 2007 dataset was used for the attack traffic, as it contained an hour of anonymized traces from a DDoS attack on 4 August 2007. For normal traffic, the CAIDA Anonymized 2013 dataset was used, as it contained passive traces from CAIDA passive monitors in 2013. A genetic algorithm (GA) and an ANN were used for feature selection and attack detection, respectively: a wrapper method based on GA was applied to select the most efficient features, and attack detection was improved by deploying the MLP form of ANN. While building the model, 10-fold cross-validation was used. The results showed that the proposed method obtained an excellent AUC of 99.91%. The researchers’ future work will include performing more experiments to test the robustness of the model on different datasets.
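The 10-fold cross-validation used while building the model partitions the samples into ten disjoint test folds, training on the rest each time; a minimal index-generation sketch (contiguous folds without shuffling, purely illustrative):

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # distribute the remainder so fold sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(25, k=10))
# 10 folds; every sample appears in exactly one test fold
```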

Kim et al. [ 41 ] developed a model based on a convolutional neural network (CNN) for detecting DoS attacks. They used two different datasets, KDD-99 and CICIDS2018, and generated two types of intrusion images, RGB and grayscale. They considered the number of convolutional layers and the kernel size when designing their CNN model, and performed both binary and multiclass classification. Moreover, the performance of the proposed model was evaluated by comparing it to a recurrent neural network (RNN) model. The best results were achieved on the KDD dataset by the CNN model, which showed 99% or higher accuracy in both binary and multiclass classification; the RNN showed 99% accuracy in binary classification. The proposed CNN model was better able than the RNN model to identify specific DoS attacks with similar characteristics.

Finally, an approach to detect DDoS attacks using GRU was presented by Rehman et al. [ 42 ]. The team produced a high-efficiency approach called DIDDOS to detect real-world DDoS attacks using GRU, a form of RNN. Different classification models, namely GRU, RNN, NB, and SMO, were applied to the CICDDoS2019 dataset. For DDoS classification in the case of reflection attacks, the highest accuracy of 99.69% was achieved, while for exploitation attacks, the highest accuracy of 99.94% was achieved using GRU.

3.1.3. Phishing Attacks

Some studies have focused on training models and testing them to detect phishing attacks. For instance, the main goal of Alam et al. [ 43 ] was to defend against phishing attacks by developing an attack detection model using the RF and DT ML algorithms. For ML processing, a traditional phishing attack dataset from Kaggle that contained 32 features was used. To analyze the dataset characteristics, the intended model used PCA, a dimensionality reduction technique. An accuracy of 97% was reached with RF. With less change and variance in RF, the over-fitting obstacle was controlled. Future studies will include the prediction of phishing attacks from the registered attacks in a dataset by applying CNN and implementing an IDS.

To identify phishing website attacks, a self-structuring neural network based on ANN was developed by Mohammad et al. [ 44 ]. Phishing-related features are crucial in detecting this kind of web page, which is extremely dynamic, so the structure of the network should be continually improved. The proposed approach addresses this issue by automating the network structuring process and demonstrating high acceptance of noisy input, fault tolerance, and significant prediction accuracy. This was accomplished by increasing the learning rate and expanding the hidden layer with additional neurons. The goal of the developed model was to obtain generalization ability, meaning that the training and testing classification accuracies should be as similar as possible. The dataset included 600 legitimate and 800 phishing websites, with 17 characteristics retrieved using the authors’ own tool [ 45 , 46 ]. The accuracies of the training, validation, and testing sets were 94.07%, 91.31%, and 92.18%, respectively, for 1000 epochs. The model relied on an adaptive scheme with four processes: structural simplicity, learning rate adaptation, structural design adaptation, and an early stopping approach based on validation faults.

Trial and error is one of the most popular techniques used to train a neural network, but it has a significant drawback: it takes a very long time to set the parameters and might even require the assistance of a domain expert. Rather than trial and error, a better self-structuring neural network anti-phishing model, which makes it simpler to structure NN classifiers, was proposed by Thabtah et al. [ 47 ]. The goal of the technique was to derive a sufficiently large structure from the training dataset to develop models that generalize to the testing dataset. During the training phase, the algorithm dynamically modifies the structural parameters in order to generate accurate, non-overfitting classifiers. With a dataset of over 11,000 websites from UCI, the neural network characteristics were updated as the classification model was being built, depending largely on the computed error rate, the intended error rate, and the underlying technologies. Compared to Bayesian networks and DT, the findings indicated that the dynamic neural network anti-phishing model had higher prediction accuracy. The highest average accuracy achieved was 93.06% when information gain was used for pre-processing.
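The early-stopping component shared by these self-structuring approaches halts training once validation error stops improving; a generic sketch (a hypothetical training loop, not the authors’ algorithm; `train_step` and `val_error` are placeholder callables):

```python
def train_with_early_stopping(train_step, val_error, max_epochs=1000, patience=5):
    """Stop once validation error fails to improve for `patience` epochs.

    train_step(epoch) runs one epoch of training (side effects only);
    val_error(epoch) returns the current validation error.
    """
    best_err, best_epoch, bad_epochs = float("inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        train_step(epoch)
        err = val_error(epoch)
        if err < best_err:
            best_err, best_epoch, bad_epochs = err, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_epoch, best_err

# Toy validation curve: error falls until epoch 7, then rises again.
errors = {e: abs(e - 7) / 10 + 0.1 for e in range(1, 101)}
best_epoch, best_err = train_with_early_stopping(lambda e: None, errors.get)
# training halts shortly after epoch 7, the validation minimum
```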

A two-layered detection framework to identify phishing web attacks using features derived from domain and DNS packet-level data was built by Rendall et al. [ 48 ] using four ML models, namely MLP, SVM, NB, and DT. The team investigated an approach in which a phishing domain is classified multiple times, with additional classification carried out only when it scores below a predefined confidence level set by the system’s owner. The model was evaluated on a dataset created by the team containing 5995 phishing records and 7053 benign records. After applying the models in the two-layered architecture, the highest accuracy of 86% was achieved by MLP and DT.
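The two-layered, confidence-thresholded flow can be sketched as follows (toy stand-in models; the threshold and URL features are hypothetical, not Rendall et al.’s configuration):

```python
def two_layer_classify(sample, layer1, layer2, threshold=0.9):
    """Classify with layer1; defer to layer2 when layer1's confidence
    falls below the operator-chosen threshold."""
    label, confidence = layer1(sample)
    if confidence >= threshold:
        return label, "layer1"
    label, _ = layer2(sample)
    return label, "layer2"

# Toy stand-in models, each returning a (label, confidence) pair.
layer1 = lambda s: ("phishing", 0.95) if "login" in s else ("benign", 0.6)
layer2 = lambda s: ("phishing", 0.8) if "verify" in s else ("benign", 0.8)

result1 = two_layer_classify("login-update.example", layer1, layer2)
result2 = two_layer_classify("verify-account.example", layer1, layer2)
# result1 is decided by layer1; result2 is deferred to layer2
```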

Li et al. [ 49 ] built a stacking model using URL and HTML features to detect phishing web pages. They used lightweight HTML and URL features as well as HTML string embeddings to make real-time phishing detection possible. The 50K-PD dataset, which contained around 49,947 samples, and the 50K-IPD dataset, which contained 53,103 web page samples, were created and used. The stacking model was built by combining GBDT, XGBoost, and LightGBM in multiple layers. The model achieved an accuracy of 97.30% on the 50K-PD dataset and 98.60% on the 50K-IPD dataset.

Phishpedia, an ensemble deep learning model described in [ 50 ], addresses major technological difficulties in phishing detection by identifying and matching brand logo variations. Three different datasets were used for this experiment: the researchers collected the first by subscribing to a service, the second from a top-ranked Alexa list, and the third, used to evaluate the detection model, from a benign dataset. A Siamese neural network, which converts images to vectors and thereby helps estimate the correlation between two visuals, was chosen by the researchers for their project. Phishpedia achieved better accuracy and a lower runtime cost. Unlike many other approaches, phishing data are not required for training. With an accuracy of 99.2%, Phishpedia outperformed state-of-the-art approaches such as LogoSENSE, EMD, and PhishZoo by a large margin. In the future, the researchers plan to extend Phishpedia with a system to monitor phishing online.

Supervised machine learning models were used by Batnaru et al. [ 51 ] to detect phishing attacks based on novel combination features extracted from the URL. For training, the researchers used a dataset of about 100,000 URLs consisting of 40,000 benign URLs from Kaggle [ 52 ] and 60,315 phishing URLs from PhishTank [ 53 ]. They used five ML models, namely MLP, RF, SVM, NB, and DT. In terms of model selection, RF was found to be the best candidate based on F1 scores. The evaluation was performed on an unbalanced dataset of 305,737 benign URLs and 74,436 phishing URLs to assess the selected model in a realistic scenario; the achieved accuracy was 99.29%. The results were compared with the performance of Google Safe Browsing (GSB), the default protection available in popular web browsers, and the model outperformed GSB. In future work, the researchers aim to explore the effectiveness of their model on other datasets and to experiment with more features. They also plan to assess the robustness of the methodology against the adversarial attacks commonly used by malicious parties.

PhishDump, a new mobile app based on a mix of LSTM and SVM algorithms, was suggested by Rao et al. [ 54 ] to distinguish genuine and fake websites on mobile platforms. Because PhishDump concentrates on extracting characteristics of URLs, it offers important benefits compared with previous efforts, including quick computation, class independence, and resistance to unintentional malware installation. The data were gathered from three separate sources: Alexa, OpenPhish, and PhishTank. A positive aspect of the application is that it is free of external code and databases, allowing the identification of malicious websites in as little as 621 ms. The characteristics extracted by the LSTM model are supplied as input to the SVM for URL classification using Python code. Using several datasets, the application was compared against current baseline classifiers. PhishDump surpassed all previous studies with an accuracy of 97.30%. The approach has limitations, such as the chance that an intruder might circumvent it by making structural modifications to the URL, and the system could miss phishing websites with shortened URLs.

Marchal et al. [ 55 ] reviewed phishing attack problems. The researchers provided guidelines for designing and evaluating phishing webpage detection techniques. They also presented the strengths and weaknesses of various design and implementation alternatives with regard to deployability and ease of use. Moreover, they provided a list of guidelines to evaluate the proposed solutions following the selection of representative ground truth, appropriate use of the dataset, and the relevant metrics. These recommendations can also enable comparison of the accuracy of different phishing detection technologies. The researchers state that academic research in phishing detection should adopt design and evaluation methods that are relevant to real-world publication.

Similarly, Das et al. [ 56 ] also reexamined the existing research on phishing and spear phishing from the perspective of different security domains such as real-time detection, dataset quality, active attacker, and base rate fallacy. They elucidated on the challenges faced and surveyed the existing solutions to phishing and spear phishing. Their work helps guide the development of more robust solutions by examining all the existing research on phishing.

3.1.4. Zero-Day Attacks

Interestingly, some researchers have focused on identifying zero-day attacks. One such study was conducted by Beaver et al. [ 57 ], who used ML methods able to distinguish between normal and malicious traffic. In their study, they used the adaptive boosting (AdaBoost) ensemble learner with DT to distinguish and classify the type of traffic on the KDD-99 dataset. The implementation tested in this study had four levels: (1) the top-level model that puts a cap on the FPR; (2) the first internal model that includes the AdaBoost ensemble; (3) the second internal model that implements the DT; and (4) the lowest model that judges whether the traffic was normal, relying on an anomaly detection algorithm. The system was able to detect 82% of the attacks previously missed by the signature-based sensor, detected 89% of the attacks that it had not been trained to detect, and had a DR of 94% with a 1.8% false alarm rate. The researchers’ future goals are to scale the performance, which will require more parallelism in the architecture, and to modify the training to accommodate larger datasets.

Ahmed et al. [ 58 ] proposed a DL model for identifying zero-day botnet attacks in real time using a feed-forward backpropagation ANN technique and a DNN. A reliable dataset is an important factor for obtaining high performance, and hence the CTU-13 dataset [ 59 ] from the Botnet Capture Facility was used. There were nine input-layer features and the dataset consisted of 10,000 randomly chosen flows. The first step was to normalize the whole dataset, followed by the application of the Adam optimizer in the model. The train–test split was 80–20%. The results showed that accuracies of over 99.6% were achieved after 300 epochs and that the model outperformed the NB, SVM, and backpropagation algorithms. In future work, the researchers suggest examining the efficiency of the proposed model on various other datasets.

3.1.5. Malware Attacks

Barut et al. [ 60 ] aimed to compare ML algorithms, namely SVM, RF, and MLP, to determine the most accurate and fastest method for detecting malware in encrypted traffic. Two datasets were generated: dataset1, which was produced using Stratosphere IPS [ 61 ], extracting 20 types of malware classes (Adload, Ransom, Trickbot, etc.), and dataset2, which used CICIDS2017. In feature engineering, 200 flow features were extracted and chi-square feature selection was used. The researchers concluded that RF was the best-performing algorithm, as its results showed a DR of 99.996% and a FAR of 2.97%. Generally, the results showed that the SVM, RF, and MLP models were the most accurate, with some trade-offs: for dataset1, the RF model was the best performing across all evaluation metrics except prediction speed, which was higher with the SVM model; for dataset2, the SVM model was the most accurate.
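Chi-square feature selection, as applied here, scores each feature by its statistical dependence on the class label and keeps the highest-scoring ones; a minimal sketch for a binary feature and binary label (toy data, not the authors’ flow features):

```python
def chi_square(feature, labels):
    """Chi-square statistic between a binary feature and binary labels.

    Higher scores indicate stronger dependence on the class, making the
    feature a better candidate to keep during selection.
    """
    n = len(labels)
    observed = {}
    for f, y in zip(feature, labels):
        observed[(f, y)] = observed.get((f, y), 0) + 1
    score = 0.0
    for f in (0, 1):
        for y in (0, 1):
            row = sum(observed.get((f, c), 0) for c in (0, 1))
            col = sum(observed.get((r, y), 0) for r in (0, 1))
            expected = row * col / n
            if expected:
                score += (observed.get((f, y), 0) - expected) ** 2 / expected
    return score

labels  = [0, 0, 1, 1, 0, 1, 0, 1]
aligned = [0, 0, 1, 1, 0, 1, 0, 1]   # perfectly tracks the label
noisy   = [0, 1, 0, 1, 0, 1, 0, 1]   # weakly related to the label
# chi_square(aligned, labels) > chi_square(noisy, labels)
```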

Marin et al. [ 62 ] developed a model for malware traffic detection in encrypted networks using DL. The specific DL model proposed in this study, DeepMAL, automatically discovers the best features/data representation from raw data. The dataset used was USTC-TFC2016 [ 63 ], which comprises two sections labelled malicious or normal traffic and 10 types of malware traffic. Two representations were used for the raw data: packets and flows. It was concluded that using the raw-flow representation as input for the DL models achieved better results. The results showed that DeepMAL detected the Rbot botnet with an accuracy of 99.9%, while Neris and Virut were detected with accuracies of 63.5% and 54.7%, respectively. Despite the lower rates, these still outperformed RF.

Park et al. [ 64 ] evaluated the recognition performance of various types of attacks, including IDS, malware, and shellcode, using the RF algorithm and the Kyoto 2006+ [ 65 ] dataset (total size 19.8 GB). The dataset consists of three class types: attack, shellcode, and normal; the first two cover three attack types: IDS, malware, and shellcode. This dataset contains traffic data collected from November 2006 to December 2015. In the data preparation step, the researchers selected one month of data (May 2014) to train the model and another month (April 2014) to test it. In the experiment, Park et al. considered 17 features and normalized the data. The overall performance was an F-score of 99%. However, it was observed that detection performance differed across attack types. They propose to further evaluate the detection of various attack types using the same dataset while varying the training conditions.

In order to classify new malware variants accurately, David et al. [ 66 ] used DL to build a model with a deep belief network (DBN) algorithm that could generate and classify malware signatures automatically. The dataset used to build the proposed model was collected by the authors and contained 1800 instances across six malware categories (Zeus, Carberp, Spy-Eye, Cidox, Andromeda, and DarkCome), with 300 variants for each category. The DBN had eight layers, with the output layer containing 30 neurons. The training process was unsupervised, with 1200 vectors for training and 600 for testing. For the denoising autoencoders, the noise ratio was 0.2 and the number of training epochs was 1000. The model achieved an accuracy of 98.6% when evaluated.

Reinforcement learning continuously mimics attackers to produce new malware samples, thereby providing viable attack models for defenders, as Wu et al. [ 67 ] explained. They suggested the gym-plus model, in which gym-malware is improved by adding additional activities to the action space, allowing it to modify harmful portable executable files. Additionally, they retrained the detection algorithm on the public EMBER [ 68 ] dataset to substantially increase the DR. In gym-plus, the DQN, SARSA, and Double DQN algorithms were used, and DQN learned better policies than the other algorithms. Through retraining on the adversarial instances generated by the DQN agent, malware detection accuracy increased from 15.75% to 93.5%.

Another dataset, called MTA KDD 19 [ 69 ], was explored by Letteri et al. [ 70 ], who applied dataset optimization strategies to detect malware traffic. Two dataset optimization strategies were used with a sensibility-enhanced MLP algorithm: a dimensionality reduction technique based on autoencoders (AE-optimized) and a feature selection technique based on rank relevance weight (RRw-optimized). In RRw, feature selection consisted of two steps: dataset tampering, where 5-fold cross-validation was applied, and backward feature elimination. In the AE-optimized technique, 33 input and output neurons were used and the train–validation split was 85–15%, with the training set further split to reserve 15% for testing. The highest accuracy of 99.60% was achieved on the RRw-optimized MTA KDD 19 dataset.

3.1.6. Malware Botnet Attacks

A novel scheme using supervised learning algorithms and an improved dataset to detect botnet traffic was presented by Ramos et al. [ 71 ]. Five ML classifiers, namely DT, RF, SVM, NB, and KNN, were evaluated on two datasets: CICIDS2018 and ISOT HTTP [ 72 ] Botnet (total size 420 GB). Network flow metrics analysis and feature selection were carried out on both datasets, after which the ISOT dataset had 20 attributes, including source and destination port numbers and transfer protocols among the selected features, and CICIDS2018 had 19 similar attributes. Five-fold cross-validation was applied, with 80% of botnet instances used for training and the remainder for testing. For the CICIDS2018 dataset, RF and DT achieved the highest accuracy of 99.99%. For ISOT HTTP, again, RF and DT achieved high accuracies of 99.94% and 99.90%, respectively.

Using a similar dataset, Pektas and Akerman [ 73 ] utilized DL techniques and flow-based botnet discovery methods to identify botnets using two datasets, CTU-13 and ISOT HTTP, containing both normal and botnet data. They combined two DL algorithms, namely MLP and LSTM. In feature extraction, a flow graph was constructed and all flow data were processed to extract the features. The ISOT dataset contained two botnet families, Waledac and Zeus, whereas CTU-13 contained seven botnet families. The approach achieved an F-score of 98.8% on the ISOT dataset and 99.1% on CTU-13.

3.1.7. Detecting Attacks over IoT Networks

As the Internet of Things (IoT) has become an important aspect of our lives, concerns about its security have increased, motivating researchers to focus their efforts on identifying new techniques to detect different attacks and increase the security of IoT. One such study was conducted by Abu Al-Haija et al. [ 74 ], who developed an intelligent DL-based detection and classification system for cyber-attacks in IoT communication networks by leveraging the power of CNNs. For evaluation, the NSL-KDD dataset, which includes all the key IoT computing attacks, was employed. The system was validated using K-fold cross-validation and evaluated using confusion matrix parameters. The outcome was an efficient and intelligent deep-learning-based system that can detect mutations of IoT cyberattacks with accuracies greater than 99.3% and 98.2% for the binary-class and multiclass cases, respectively. Future work includes developing new software that captures and investigates data packets communicating through the IoT environment and updating the existing dataset with more attacks.

By utilizing unique computing resources in a regular IoT space and applying an instance of an extreme learning machine (ELM), a blockchain-based efficient solution for safe and secure IoT was proposed by Khan et al. [ 75 ]. This approach analyzes the credibility of a blockchain-based smart home in terms of the fundamental security objectives of confidentiality, accessibility, and integrity. Simulation outputs showed that ELM’s overheads were minor in comparison to the cybersecurity advantages it brings. The ELM architecture is made up of an input layer, numerous hidden layers, and a final output layer, with hidden layers consisting of fixed neurons to boost the network’s efficiency. To minimize the error rate, the backpropagation approach is combined with a feed-forward mechanism to adjust the network weights. After pre-processing the data to remove abnormalities and lessen the risk of faults, the NSL-KDD input data were split into 85% training and 15% validation. The presented ELM surpassed previous ML algorithms and achieved an accuracy of 93.91%, and the researchers aim to investigate more datasets and architectures in the future.

Ullah et al. [ 76 ] aimed to detect malware-infected files and pirated software across IoT networks using a DL approach. The dataset used was collected from Google Code Jam (GCJ) [ 77 ]. The combined DL-based approach comprised two steps. First, to detect pirated features, a TensorFlow neural network was proposed; unwanted details were removed using tokenization, and additional features were mined using stemming, root words, and frequency constraints. Second, to detect malware, a new methodology based on CNN was proposed: the raw binary files were converted to color images, casting malware detection as an image classification problem. Grayscale visualizations were obtained by transforming the color images, which were then used to classify malware types. The results showed that this method performed better than modern methods when measuring cybersecurity threats in IoT. In future work, the researchers intend to put forward an algorithm that can detect unknown malware families.
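The binary-to-image transformation behind this style of malware visualization treats each byte as a pixel intensity; a minimal grayscale sketch (the row width and sample bytes are arbitrary, and the authors’ pipeline produced color images first):

```python
def bytes_to_grayscale(raw, width=16):
    """Reshape a raw binary into rows of 0-255 grayscale pixel values,
    zero-padding the final row so every row has the same width."""
    pixels = list(raw)                 # each byte becomes one pixel intensity
    remainder = len(pixels) % width
    if remainder:
        pixels += [0] * (width - remainder)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

image = bytes_to_grayscale(b"MZ\x90\x00" * 10, width=8)
# 5 rows of 8 pixels; byte values such as 0x4D ('M') map directly to intensities
```

The resulting matrix can then be fed to any image classifier, such as the CNN used in the paper.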

A model for the classification of attacks and anomaly detection in IoT networks was created by Tama and Rhee [ 78 ] using a DNN. The team used the CIDDS-001 [ 79 ], UNSW-NB15, GPRS-WEP [ 80 ], and GPRS-WPA2 [ 80 ] datasets and compared the results, which showed good performance in attack detection. The average performance of the DNN was validated using 10-fold cross-validation on the UNSW-NB15, CIDDS-001, GPRS-WEP, and GPRS-WPA2 datasets, resulting in 94.17%, 99.99%, 82.89%, and 94% accuracy, respectively. In future work, the researchers want to investigate a larger number of trial repetitions, given the unaffected performance of the different validation methods.

To mitigate IoT cybersecurity threats in a smart city, Alrashdi et al. [ 81 ] proposed an anomaly detection IoT system using the RF model of ML. The UNSW-NB15 dataset was selected for this project, which includes 49 features and nine attack categories characterizing normal and abnormal behaviors. The resulting model could detect cyber-attacks at fog nodes in a smart city by monitoring the network traffic passing through each node. After detection, it alerted the security cloud services to analyze and update their systems. This solution achieved the highest classification accuracy of 99.34% with the lowest FPR while detecting compromised IoT devices at distributed fog nodes. The researchers’ future goals include using open-source distributed computing to deploy the model on fog nodes to detect IoT network attacks and using n-fold cross-validation to evaluate the design’s performance metrics.
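Accuracy together with FPR is the metric pair this study (and several below) reports. A minimal sketch of that evaluation, using a random forest on synthetic flow features rather than UNSW-NB15:

```python
# Train a random forest to separate "normal" from "attack" flows and report
# accuracy and false-positive rate. Synthetic stand-in data; illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # stand-in for attack labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
clf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)

# FPR = fraction of normal traffic wrongly flagged as an attack.
tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
accuracy = (tp + tn) / (tn + fp + fn + tp)
fpr = fp / (fp + tn)
print(f"accuracy={accuracy:.2f}  FPR={fpr:.2f}")
```

Reporting FPR alongside accuracy matters in intrusion detection because the normal class usually dominates: a high-accuracy model can still drown analysts in false alarms.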

3.1.8. Malicious Traffic Classification

In order to protect organizations and individuals against cyber-attacks, network traffic first needs to be analyzed and classified so that anomalous and malicious traffic can be detected. As malicious traffic classification plays a very important role, many researchers have sought to improve classification techniques using the power of AI. Some studies have focused on anomalous and abnormal traffic. Yang et al. [ 82 ] built a model that found hidden abnormal traffic in the network to detect attacks using DL techniques. The dataset used was NetFlow campus information, a collection of data gathered by campus routers. In the pre-processing stage, the authors transformed the data into a standardized format, and then the RNN algorithm was applied. The proposed model resulted in an accuracy of 98%. For future work, the authors propose searching for more critical features that could help in detecting further cyber-attacks.

Chou et al. [ 83 ] used AI algorithms through TensorFlow to train the system by providing it with rules and signatures to distinguish between normal and abnormal traffic behavior. The researchers developed a DL model framework on TensorFlow by combining multiple layers of non-linear features and training the system to learn normal behavior using a forward propagation algorithm on the NSL-KDD dataset. The results were promising, showing high accuracy during testing of up to 97.65% in the detection of probing attacks and 98.99% in the detection of DDoS attacks. In future work, improvements need to be made to the training characteristics in TensorFlow, as the present model could not predict user-to-root (U2R, where an attacker tries to gain unauthorized access by posing as a normal user) or remote-to-local (R2L, where an attacker tries to gain unauthorized access by exploiting network vulnerabilities) attacks because the dataset sample was too monotonous, leading to over-learning.

An ensemble deep model to detect and classify anomalies at both the network and host levels was presented by Dutta et al. [ 84 ]. The datasets used were IoT-23 [ 61 ], LITNET-2020 [ 85 ], and NetML-2020 [ 86 ] and the DL techniques applied were DNN, long short-term memory (LSTM), and a meta-classifier (i.e., LR). A deep sparse autoencoder (DSAE) was used as the feature engineering technique and a stacking ensemble learning approach was used for classification. After testing on three heterogenous datasets, the researchers concluded that the suggested approach outperformed individual and meta-classifiers such as RF and SVM. In future work, the researchers suggest conducting experiments on more sophisticated datasets and using advanced computational methods to boost processing speed.
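The stacking idea described above (base learners whose predictions feed a meta-classifier) can be sketched with off-the-shelf components. This is an illustrative sketch with sklearn stand-ins, not the authors' DNN/LSTM + deep sparse autoencoder setup:

```python
# Stacking ensemble: two base learners plus a logistic-regression
# meta-classifier that learns from their cross-validated predictions.
# Synthetic data; illustrative sketch only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(),  # the meta-classifier
    cv=5,  # base-learner predictions are generated out-of-fold
)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
print(f"stacked accuracy: {acc:.2f}")
```

The `cv=5` setting is the key detail: the meta-classifier is trained on out-of-fold predictions of the base learners, which is what prevents it from simply memorizing their training-set outputs.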

Sun et al. [ 87 ] built a traffic classification model using DL techniques, focusing on web and peer-to-peer (P2P) traffic. The dataset used to train the proposed model was collected by the authors by capturing traffic from the network using a distributed host-based traffic collection platform (DHTCP). In the training process, the dataset was divided 5:5, 7:3, and by 10-fold cross-validation for the first, second, and third experiments, respectively, and radial basis function neural network (RBFNN), SVM, and probabilistic neural network (PNN) models were applied. The results showed that the highest accuracy, 88.18%, was obtained when using PNN with a 7:3 train–test split.

Some researchers have focused on investigating the effects of network data representation on intelligent models. Millar et al. [ 88 ] devised and compared three ways of representing network data to deep learners for malicious traffic classification: payload data, flow images, and flow statistics. They showed that malicious classes can be predicted using just 50 bytes of a packet’s payload. Since DL benefits from an extensive and large dataset, the UNSW-NB15 dataset was selected for the experiment. The payload-based method was found to have the best performance. However, all methods failed to accurately identify DDoS attacks. Since different malicious attacks exhibit different defining characteristics, there is no ‘one size fits all’ solution for identifying all attacks. Hence, in future work, the researchers propose to research the combination of payload-based and statistical inputs to identify malicious traffic.

Yang et al. [ 89 ] aimed to develop a model for malicious traffic detection in encrypted networks using DL. The proposed model was developed based on a residual neural network (ResNet), which can automatically identify features and effectively isolate the contextual information of the encrypted traffic. The CTU-13 dataset was used to train the model; in the pre-processing stage, the data were converted into the IDX format through traffic refinement, traffic purification, data length unification, and IDX file generation. Then, deep Q-network (DQN) reinforcement learning and deep convolutional generative adversarial networks (DCGAN) were used to generate encrypted-traffic adversarial samples, resolving the issue of unbalanced and insufficient samples. The model achieved a high accuracy of 99.94%. In future work, the researchers will focus on integrating advanced genetic algorithms into DCGAN to enhance generator efficiency.

A new framework using ML for hardware-assisted malware detection through monitoring and memory access pattern classification was introduced by Xu et al. [ 90 ]. They proposed in-processor monitoring to obtain the virtual address trace, dividing accesses into epochs and summarizing the memory access patterns of each epoch into features, which were then fed to ML classifiers, namely RF and LR. It was concluded that RF was the best-performing classifier for both kernel rootkits and memory corruption attacks. Its accuracy in kernel rootkit detection reached a 100% TPR with less than 1% FPR. For user-level memory corruption attacks, the algorithm demonstrated a 99.0% DR with less than 5% FPR.
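The epoch-summarization step can be illustrated with a toy trace. The three features below (mean address, working-set size, mean stride) are hypothetical examples chosen for illustration, not the feature set of the cited paper:

```python
# Divide a virtual-address trace into fixed-size epochs and summarize each
# epoch into simple statistics that could feed an ML classifier.
# Hypothetical feature set; illustrative sketch only.
import numpy as np

def epoch_features(trace: np.ndarray, epoch_len: int) -> np.ndarray:
    n = len(trace) // epoch_len
    feats = []
    for e in range(n):
        epoch = trace[e * epoch_len:(e + 1) * epoch_len]
        feats.append([
            epoch.mean(),                  # average address touched
            np.unique(epoch).size,         # distinct addresses (working set)
            np.abs(np.diff(epoch)).mean(), # average stride between accesses
        ])
    return np.array(feats)

trace = np.arange(0, 100, dtype=float)  # perfectly sequential accesses
feats = epoch_features(trace, epoch_len=20)
print(feats.shape)  # (5, 3); every epoch here has stride 1.0
```

A rootkit or memory-corruption exploit perturbs these per-epoch statistics (e.g. sudden stride jumps or working-set growth), which is the signal the downstream RF/LR classifiers pick up.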

De Lucia et al. [ 91 ] proposed a malicious network traffic detection mechanism of encrypted traffic using two techniques—SVM and CNN. To conduct the experiments, the team leveraged a public dataset [ 92 ], which consisted of malicious and normal TLS network traffic packets. In data pre-processing, the desired TLS features were extracted from the packet captures using a custom program written in the PcapPlusPlus framework [ 93 ]. The train–test split was 70–30%. Both methods successfully achieved a high F-score and accuracy and a low FPR. However, SVM outperformed CNN by achieving a lower FPR and a slightly higher F-score, precision, accuracy, and recall.

While building ML models for the detection of normal or malicious traffic, questions arise regarding the selection of the right features. With this in mind, Shafiq et al. [ 94 ] proposed a hybrid feature selection algorithm called weighted mutual information area under the curve (WMI_AUC), which helps select the effective features in a traffic flow. The datasets used in the study were HIT Trace 1, captured by the authors from WeChat messenger using Wireshark, and the NIMS dataset, collected by the authors from their research testbed network. To build the final model, the researchers used 11 different ML algorithms. The model built using the partial decision tree (PART) algorithm resulted in an accuracy of 97.88% on the HIT Trace 1 dataset. For the NIMS dataset, RF resulted in an accuracy of 100%.
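Mutual information between each feature and the class label is one half of the hybrid filter named above. The sketch below shows only that generic ranking step (the paper's specific AUC weighting is not reproduced):

```python
# Rank flow features by mutual information with the class label.
# Synthetic data; illustrative sketch of the generic filter step only.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 5))
y = (X[:, 2] > 0).astype(int)  # only feature 2 carries label information

scores = mutual_info_classif(X, y, random_state=2)
best = int(np.argmax(scores))
print(f"feature scores: {np.round(scores, 3)}; most informative: {best}")
```

Filters like this are model-agnostic and cheap, which is why hybrid schemes combine them with a second criterion (here, AUC) before handing the reduced feature set to the final classifiers.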

Another field covered by researchers was the detection of malicious virtual private network (VPN) traffic. Miller et al. [ 95 ] proposed a computational model to address the current limitations in detecting VPN traffic and aid in the detection of VPN technologies that are being used to hide an attacker’s identity. A model was built to detect VPN usage using an MLP neural network trained on flow statistics found in the captured network packets’ TCP headers. The experiment was able to identify OpenVPN traffic with an accuracy of 93.71% and Stunnel OpenVPN traffic with an accuracy of 97.82% when using 10-fold cross-validation. Future studies could be carried out to detect unauthorized user access and research organizational security, which is essential for a business.

With the spread of malicious websites, research emphasis has been placed on factor analysis of site categories and the correct identification of unlabeled data in order to distinguish dangerous websites from benign ones and mitigate their risk. Wang et al. [ 96 ] demonstrated the use of the NB model to classify malicious websites. A self-learning system was developed to categorize websites based on their features, with NB used to divide the websites into two categories: malicious or benign. The dataset used was the ISCX2016 [ 97 ] dataset, which contains over 100,000 URLs and 50 features for each URL. An accuracy of up to 90% was achieved after applying factor identification to the dataset and performing website classification with the NB classifier, demonstrating that the NB classifier can perform well at website classification.

Finally, Ongun et al. [ 98 ] used the CTU-13 dataset to build ensemble models for malicious traffic detection. The algorithms used to build the models were LR, RF, and gradient boosting (GB). The first representation was connection-level representation, where the features were extracted from the raw connection logs. The second representation was aggregated traffic statistics, where the authors compared the raw features of the first representation with features obtained by time aggregation. The last representation was temporal features, where the authors considered the time interval together with the features obtained by time aggregation in the second representation. The best performance was achieved by the models built using RF and GB, which resulted in a high AUC of 99% when applied to the features of the third representation.

Malicious Traffic in a Cloud Environment

Using a dataset constructed from a real cloud environment, Alshammari and Aldribi [ 99 ] built ML models to detect malicious traffic in cloud computing. The dataset used was the new ISOT CID [ 100 ], a publicly available cloud-specific dataset containing 89,364 instances, of which 44,569 were malicious and 44,795 were normal; the training data contained 17,296 instances and the testing data 7411. Their aim was to add some significant features, prepare the training data, and test the dataset against different ML models, namely DT, KNN, NNet, SVM, NB, and RF. They performed both cross-validation (5-, 10-, and 15-fold) and split validation (90–10%, 80–20%, 70–30%). For cross-validation (all of 5-, 10-, and 15-fold), DT, RF, and KNN all obtained an accuracy of 100%. For split validation (all of the 90%, 80%, and 70% splits), both DT and RF achieved an accuracy of 100%.
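The two validation protocols this study compares can be sketched side by side. Illustrative sketch on synthetic data, not the ISOT CID experiments:

```python
# k-fold cross-validation (5/10/15 folds) versus single train-test splits
# (90-10, 80-20, 70-30) for the same classifier. Illustrative sketch only.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=15, random_state=3)

for k in (5, 10, 15):
    scores = cross_val_score(DecisionTreeClassifier(random_state=3), X, y, cv=k)
    print(f"{k:>2}-fold CV mean accuracy: {scores.mean():.3f}")

for test_size in (0.10, 0.20, 0.30):  # 90-10, 80-20, 70-30 splits
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=3)
    acc = DecisionTreeClassifier(random_state=3).fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{1 - test_size:.0%}-{test_size:.0%} split accuracy: {acc:.3f}")
```

Cross-validation averages over every partition and so gives a lower-variance estimate, whereas a single split is cheaper but sensitive to which instances land in the test set; agreement across both protocols (as reported above) strengthens the result.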

Using the same cloud dataset, Sethi et al. [ 101 ] proposed an IDS to protect cloud networks from cyber-attacks. The algorithm applied was double deep Q-learning (DDQN). The datasets used were the ISOT CID dataset, and the standard NSL-KDD dataset. The total size of ISOT is 8 TB, but for the purposes of the experiment, only the network traffic data portion was used. For the feature selection phase, the team applied a chi-square feature selection algorithm. The selected features were 164 and 36 for ISOT CID and NSL-KDD, respectively. The accuracy for the proposed model tested for NSL-KDD was 83.40%, whereas for ISOT CID, it was 96.87%. After measuring the robustness of their model against an adversarial attack, the accuracy obtained was 79.77% for NSL-KDD and 92.17% for ISOT CID.
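The chi-square feature selection phase mentioned above is a standard filter; a minimal sketch (synthetic count features, not the ISOT CID or NSL-KDD feature sets):

```python
# Chi-square feature selection: keep the k features most statistically
# dependent on the class label. chi2 requires non-negative inputs, so
# count-like features are used. Illustrative sketch only.
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(4)
X = rng.integers(0, 10, size=(400, 8)).astype(float)  # non-negative "counts"
y = (X[:, 0] > 4).astype(int)                         # feature 0 drives the label

selector = SelectKBest(chi2, k=3).fit(X, y)
kept = selector.get_support(indices=True)
print(f"kept feature indices: {kept}")
```

Pruning from the full feature set down to the selected subset (164 of the ISOT CID features, 36 of NSL-KDD's in the study) shrinks the input the reinforcement-learning agent must cope with.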

Xie et al. [ 102 ] used a one-class SVM technique based on a short sequence model. They used the Australian Defence Force Academy (ADFA) dataset [ 103 ], which contains thousands of normal traces taken from a host set up to simulate a modern Linux server as well as hundreds of anomalous traces caused by six different types of cyber-attacks. As the sequences were short, duplicate entries were removed, leading to improved separability between normal and abnormal traces. The k values chosen for this experiment were k = 3, 5, 8, and 10, with k = 5 providing the best results: an accuracy of 70% attained at an FPR of roughly 20%. Although the experimental results showed a significant reduction in computing cost, the recognition rate for individual kinds of attack was low.

Vanhoenshoven et al. [ 104 ] addressed a variety of ML approaches to the challenge of detecting malicious URLs as a binary classification problem, including the multi-layer perceptron, DT, RF, and KNN. The researchers used Ma et al.’s dataset [ 105 ], called the Malicious URLs Dataset, which consists of 121 sets gathered over 121 days. There are 2.3 million URLs and 3.2 million features in the overall dataset. The researchers divided the URLs into three groups based on their characteristics, and each of the methods was used to classify these sets. The models were assessed on accuracy, precision, and recall, with features such as blacklists and WHOIS information taken into account. The article reports that all of these approaches achieved high accuracy, with RF being the most convenient approach to use, obtaining an accuracy of roughly 97% in the experiments. The method also had high precision and recall, demonstrating its reliability.

For the purpose of detecting harmful URLs, Yuan et al. [ 106 ] introduced a parallel neural joint model. Semantic and text features were incorporated by integrating a parallel joint neural network combining a capsule network (CapsNet) and an independent RNN (IndRNN) to improve detection accuracy. The malicious URL data were gathered from two sources: an anti-phishing website called PhishTank and a malware domain list that collects a blacklist of harmful websites. The 5-fold cross-validation technique was applied and unified performance metrics were used to evaluate the model’s performance. According to the experimental results, the model performed best when the feature dimension was 185 and the number of IndRNN layers was 2. The accuracy and recall rates reached 99.78% and 99.98%, respectively, a performance that exceeded traditional models.

By utilizing ML on a newer and more advanced dataset for IoT networks called IoTID 20 [ 107 ], Maniriho et al. [ 108 ] proposed an approach for anomaly-based intrusion detection in IoT networks. The ML algorithm applied was RF. The dataset had three subsets: subset 1 contained normal and DoS instances; subset 2 contained normal and man-in-the-middle (MITM) instances; and subset 3 contained normal and scan traffic. A 10-fold cross-validation and a train–test split of 70–30% were applied. The overall accuracy for each subset attack was DoS, 99.95%; Scan, 99.96%; and MITM, 99.9761% using cross-validation, and DoS, 99.94%; Scan, 99.93%; and MITM, 99.9647% using the percentage split.

Since the security of IoT networks is a major concern for researchers and decision-makers, other researchers have used the same IoTID 20 dataset to build an IDS for in-home devices. A three-stage strategy comprising clustering with oversampling, reduction, and classification using a single-hidden-layer feed-forward neural network (SLFN) was provided by Qaddoura et al. [ 109 ]. The paper’s significance lies in the data reduction and oversampling techniques used to provide relevant and balanced training data, as well as the hybrid combination of supervised and unsupervised techniques for identifying intrusion activities. With a ratio of 0.9 and a k value of 3 for the k-means++ clustering technique, the results showed that the SLFN classification technique combined with SVM-based synthetic minority oversampling (SVM-SMOTE) yielded more accurate results than other parameter values and classification techniques. Similarly, a deep multi-layer classification strategy consisting of two detection phases was suggested by Qaddoura et al. [ 110 ]. The first phase detects the presence of an intrusion, and the second phase identifies the kind of intrusion. In preprocessing, oversampling was carried out to enhance classification results. The most optimal model contained 150 neurons for the single-hidden-layer feed-forward neural network (SLFN) in phase 1, and 150 neurons and two layers for the LSTM in phase 2. When the findings were compared to well-known classification approaches, the suggested model outperformed them, achieving 78% with regard to the G-mean.
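The oversampling step both studies rely on can be illustrated with a SMOTE-style interpolation: synthetic minority samples are placed on the line segment between a minority point and one of its nearest minority neighbours. This is a generic sketch, not the SVM-SMOTE variant used in the papers:

```python
# SMOTE-style oversampling: create synthetic minority samples by
# interpolating between a minority point and one of its k nearest
# minority neighbours. Illustrative sketch only.
import numpy as np

def smote_like(X_min: np.ndarray, n_new: int, k: int = 3, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    new = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from sample i to every other minority sample.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()                   # interpolation factor in [0, 1)
        new.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(new)

minority = np.random.default_rng(1).normal(size=(20, 4))
synthetic = smote_like(minority, n_new=30)
print(synthetic.shape)  # (30, 4)
```

Because each synthetic point is a convex combination of two real minority points, the new samples stay inside the minority region instead of merely duplicating existing instances, which is why SMOTE-style balancing tends to help classifiers more than naive replication.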

3.1.9. Attacks at DNS Level

In order to improve user privacy, a new protocol called DNS over HTTPS (DoH) was recently created. This protocol can be used instead of traditional DNS for domain name translation, with the benefit of encryption. However, security tools depend on readable information from DNS to detect attacks such as malware and botnets. Hence, Singh and Roy [ 111 ] aimed to use ML algorithms to detect malicious DoH traffic. The five ML algorithms used were GB, NB, RF, KNN, and LR. The team conducted the experiment on the benchmark DoH dataset CIRA-CIC-DoHBrw-2020, which was recently developed and shared publicly [ 112 ]. It contained a benign file with 19,807 instances and a malicious file with 249,836 instances. The DoHMeter tool [ 113 ], which was developed in Python and is freely available, was used to extract important features from the PCAP files. To build the model, the data were split into a train–test ratio of 70–30%. The experimental results showed that RF and GB attained the maximum accuracy of 100%.

3.1.10. Intrusion Detection

NIDS analyzes and monitors the whole network to detect malicious traffic. The following studies used the NSL-KDD dataset. Al-Qatf et al. [ 114 ] proposed self-taught learning (STL)-IDS, using the DL approach in an unsupervised manner as a feature learning technique to reduce testing and training time and effectively enhance the prediction accuracy of the SVM model. In the pre-processing phase, a 1-n encoding system was applied before STL, and max–min normalization was used to map all features into a specific range. The results obtained through the proposed model showed improved SVM classification accuracy compared with algorithms such as J.48, NB, and RF. Moreover, it performed well in both five-category (normal and four types of attacks) and two-category (attack and normal traffic) classification.

Similarly, to develop a flexible and efficient NIDS, Niyaz et al. [ 115 ] proposed a self-taught learning (STL) based on sparse autoencoder (AE) and soft-max regression (SMR) on the NSL-KDD dataset. The authors applied 10-fold cross validation on the training data for STL and applied the dataset directly for SMR. The results showed a high-performance accuracy rate of 98% for STL.

Following the same principle of using DL for intrusion detection, Zhang et al. [ 116 ] proposed an approach using the NSL-KDD dataset, which consists of normal traffic and different forms of abnormal traffic. Feature selection was first applied to remove unrelated features and noise; an autoencoder was then implemented to learn the features of the input data and extract the key features, after which soft-max regression classification was applied. The evaluation measures used were accuracy, precision, recall, and F-score. The model achieved F-score and recall values of 76.47% and 79.47%, respectively.
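The autoencoder-then-softmax pattern shared by the last three studies can be sketched with sklearn stand-ins: a small MLP trained to reconstruct its input acts as the autoencoder, its hidden activations become the learned features, and multinomial logistic regression plays the softmax classifier. Illustrative sketch under those substitutions, not any paper's implementation:

```python
# Autoencoder feature extraction followed by softmax-regression
# classification. Synthetic data; sklearn stand-ins; illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=3, random_state=5)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=5)

# Single-hidden-layer autoencoder: reconstruct the 20 inputs from an
# 8-dimensional bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(8,), activation="relu",
                  max_iter=2000, random_state=5).fit(X_tr, X_tr)

def encode(X):
    # Hidden-layer activations (ReLU of the first layer) are the features.
    return np.maximum(0, X @ ae.coefs_[0] + ae.intercepts_[0])

# Softmax (multinomial logistic) regression on the encoded features.
clf = LogisticRegression(max_iter=1000).fit(encode(X_tr), y_tr)
acc = clf.score(encode(X_te), y_te)
print(f"accuracy on encoded features: {acc:.2f}")
```

The bottleneck forces the network to keep only the directions of the input that matter for reconstruction, so the downstream classifier works in a compressed, denoised space, which is the stated motivation in these IDS papers.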

Some studies have focused on multi-layer DL algorithms. Wu and Guo [ 117 ] proposed the LuNet model, a hierarchical CNN and RNN neural network, applied to the NSL-KDD and UNSW-NB15 datasets. They started by converting the categorical features using the ‘get dummies’ function in Pandas, then applied standardization to scale the input data, and concluded by employing K-fold cross-validation. To evaluate LuNet, the following criteria were used: accuracy, FPR, and DR. In binary classification, LuNet achieved on average 99.24% accuracy on the NSL-KDD dataset and 97.40% on the UNSW-NB15 dataset; in multiclass classification, it averaged 99.05% accuracy on NSL-KDD and 84.98% on UNSW-NB15. In future work, the researchers intend to investigate worms and backdoors, as these were wrongly classified by the model.
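The two preprocessing steps named above (pandas `get_dummies` one-hot encoding, then standardization) look like this on a toy flow table. Illustrative sketch only; the column names are invented:

```python
# One-hot encode a categorical flow feature with pandas get_dummies, then
# standardize all columns to zero mean and unit variance. Illustrative only.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "duration": [1.0, 4.0, 2.0],
    "protocol": ["tcp", "udp", "tcp"],  # categorical feature
})

encoded = pd.get_dummies(df, columns=["protocol"])
scaled = StandardScaler().fit_transform(encoded.astype(float))
print(encoded.columns.tolist())
# ['duration', 'protocol_tcp', 'protocol_udp']
```

One-hot encoding is necessary because neural networks cannot consume string-valued protocol fields directly, and standardization keeps the dummy columns and the numeric columns on comparable scales during gradient descent.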

To detect network intrusions efficiently, Hasan et al. [ 118 ] used an ANN. Different backpropagation training approaches were employed to detect attack and non-attack connections. The DARPA 1998 [ 119 ] intrusion detection dataset was used for training and testing. The researchers trained the model with the backpropagation learning algorithm in three modes: batch gradient descent (BGD), batch gradient descent with momentum (BGDM), and resilient backpropagation (RP). Finally, they used the DR and the FPR to determine intrusion detection performance. The overall attack detection performance and efficiency measures favored the RP training method, which obtained an accuracy of 92%. Further changes in the network architecture could enable the efficient use of the network with other approaches.

Likewise, Devikrishna et al. [ 120 ] proposed an approach that used an ANN as a pattern recognition technique to classify normal and attack patterns. The dataset used was the KDD-99 dataset. The feature extraction process consisted of feature selection and feature construction. An MLP, a layered feed-forward ANN typically trained with backpropagation, was used for intrusion detection. Improving accuracy was a central goal, as it largely determines the overall effectiveness of an IDS. A possible future research direction could be to incorporate more attack scenarios in the dataset.

Abuadlla et al. [ 121 ] also proposed an IDS based on flow data built in two stages. The first stage involved the detection of abnormal traffic on the network. The second stage involved detecting and classifying the attack types in the network traffic. The NetFlow dataset made by network captures was employed to train the proposed system. To build the proposed model, a multilayer feedforward neural network and the radial basis function network (RBFN) were used. The proposed model resulted in a higher accuracy of 94.2% for the abnormal traffic detection stage, and 99.4% for the attack detection and classification stage. Although the multilayer feedforward neural network resulted in higher accuracy, it consumed more time and memory in comparison with RBFN, which makes RBFN a better choice for real-time detection. In future work, the researchers aim to build a faster and more accurate model for real-time detection with a smaller number of features.

Utilizing the KDD-99 dataset, Alrawashdeh et al. [ 122 ] aimed to build a DL model for anomaly detection in real time. The researchers began by transforming categorical features into numerical features for convenience. Then, they removed duplicated records to reduce computational time and improve performance. Three models were built: the first using a restricted Boltzmann machine (RBM), the second using a deep belief network (DBN), and the third using a DBN with LR. The model built using the DBN and LR resulted in the best performance, with an accuracy of 97.9% and an FN rate of 2.47%.

In addition, Al-Janabi et al. [ 123 ] proposed a model based on an ANN using the KDD-99 dataset and incorporated three scenarios: detection mode, detection and classification mode, and detailed classification mode. The researchers performed the experiment for each scenario by training the models with a different number of features in each. The best results achieved were a 91% DR and a 3% FPR using 44 features in the detection-only scenario. The results showed that performance decreased as a higher level of classification was performed.

Belavagi et al. [ 124 ] evaluated the different ML algorithms used to classify the network data traffic as normal traffic or intrusive (malicious) traffic. By using the NSL-KDD dataset consisting of internet traffic record data, supervised ML classifiers, namely LR, SVM, Gaussian NB, and RF were applied to identify four simulated attacks. After converting all the categorical data to numerical form in the pre-processing stage, the predicted labels from these models were compared with the actual labels, and TPR and FPR were computed. From the observed results, it was concluded that the RF classifier outperformed other classifiers for the considered dataset, with an accuracy of 99%. The researchers suggested that the work can be further extended by considering the classifiers for multiclass classification and considering only the important attributes for intrusion detection.
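Comparing predicted labels against actual labels and computing TPR and FPR, as this study does, reduces to reading off a confusion matrix. A minimal sketch with toy labels (1 = intrusive, 0 = normal):

```python
# Compute TPR and FPR from predicted vs actual labels via the confusion
# matrix. Toy labels; illustrative sketch only.
from sklearn.metrics import confusion_matrix

actual    = [0, 0, 1, 1, 1, 0, 1, 0]
predicted = [0, 1, 1, 1, 0, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(actual, predicted).ravel()
tpr = tp / (tp + fn)  # true-positive rate: attacks correctly flagged
fpr = fp / (fp + tn)  # false-positive rate: normal traffic mis-flagged
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}")
```

The (TPR, FPR) pair, rather than raw accuracy, is what allows a fair comparison across classifiers on imbalanced intrusion data, since a model that labels everything "normal" can still score high accuracy.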

Additionally, Almseidin et al. [ 125 ] evaluated different ML algorithms, focusing on the FNR (identifying an attack as normal traffic) and FPR (identifying normal traffic as an attack) performance metrics to improve the DR of the IDS. They used several algorithms, namely J.48, RF, random tree, decision table, multi-layer perceptron (MLP), NB, and Bayes network. The KDD-99 dataset was imported into SQL Server 2008 to compute statistical measures such as attack types and occurrence ratios. Then, 148,753 record instances were extracted as training data. A wide range of results obtained using Weka tools demonstrated that RF achieved the highest average accuracy and the decision table achieved the lowest FNR.

Choudhury et al. [ 126 ] implemented ML algorithms to categorize network traffic as normal or anomalous. Algorithms such as BayesNet, LR, the instance-based learner (IBk), J.48, PART, JRip, random tree, RF, REPTree, boosting, bagging, and blending were incorporated and compared. The researchers used the NSL-KDD dataset and Weka tools to model and compare the algorithms. The results showed that RF achieved the highest accuracy of 91.523%, and the lowest accuracy of 84.96% resulted from LR.

Similarly, the objective of the system proposed by Thaseen et al. [ 127 ] was to detect any intrusions in the network using ML by classifying different packets without decrypting their content. For intrusion detection analysis, packets were generated and transmitted over a network and were captured by Wireshark. This captured data was organized into a dataset. By implementing ML algorithms such as NB, SVM, RF, and KNN, the data were classified with an accuracy of 83.63%, 98.23%, 99.81%, and 95.13%, respectively. Future work to this study includes the plan to use DL algorithms to enhance the performance and accuracy of recognition and classifying different types of packets transmitted over a network.

Likewise, Churcher et al. [ 128 ] proposed several ML models to cope with the increase in the number of network attacks. The researchers highlighted several ML methods that were used in IDS such as DT, SVM, NB, RF, KNN, LR, and ANN. The Bot-IoT dataset [ 129 ] containing ten CSV files that have records of IoT network attacks and 35 features was used. In the pre-processing stage, the undesirable features were removed. The results of the model showed that in RF, the accuracy for DDoS attacks was 99% in binary classification and its performance was superior in the context of all types of attacks. However, KNN achieved 99% accuracy and outperformed other ML algorithms in the multiclass classification. In conclusion, KNN and ANN are more accurate when used in weighted and non-weighted datasets, respectively, for multiclass classification.

A comparative analysis of two commonly used classification methods, SVM and NB, evaluating accuracy and misclassification rate, was conducted by Halimaa et al. [ 130 ] using the NSL-KDD dataset. For the comparative analysis, the Weka tool’s randomized filter was used to ensure a random selection of 19,000 cases. The results showed that SVM attained an accuracy of 93.95% while NB achieved an accuracy of 56.54%. The researchers plan to work with larger amounts of data and to construct a cross multistage model able to categorize additional attacks with better accuracy and performance.

Ghanem et al. [ 131 ] assessed the performance of their existing IDS against 1- and 2-class SVMs, applying both linear and non-linear forms. For data collection, they collected five datasets from an IEEE 802.11 network testbed, and another dataset was collected at Loughborough University from an Ethernet local area network office. All this traffic was collected in the PCAP format using tcpdump. The results demonstrated that the linear 2-class SVM produced generally highly accurate findings, reaching a 100% success rate on four out of five of the metrics, although it required labelled training datasets. Meanwhile, the linear 1-class SVM performed nearly as well as the best technique while not requiring labelled training data. Overall, it was concluded that the existing unsupervised anomaly-based IDS can benefit from either of the two ML techniques to improve detection accuracy and traffic analysis, especially when the traffic comprises non-homogeneous features.

Mehmood et al. [ 132 ] focused on supervised learning, comparing four ML algorithms, namely SVM, J.48, NB, and decision table, for anomaly-based detection. These algorithms were trained using a short version of the KDD-99 dataset, as the full dataset has a very large number of records. The performance measures used in this comparison were FPR, TPR, and precision. The results highlighted a limitation regarding DR, as not a single algorithm had a high DR for all the tested attacks in the KDD-99 dataset. However, J.48 had a low misclassification rate; hence, it was concluded that this algorithm performed best out of all those compared.

An approach that boosts the capacities of wireless network IDS was introduced by AlSubaie et al. [ 133 ]. The dataset used was WSN-DS [ 134 ], which includes 23 attributes and five possible outputs: four DoS attacks (flooding, grayhole, blackhole, and scheduling) and one normal state (no attack). The ML algorithms used were ANN and J.48. Additionally, the data noise was measured, as it affects the accuracy of ML algorithms, and the amount of noise permissible for the ML model to be deemed trustworthy was determined. The results showed that J.48 performed better than the ANN when noise was not considered, obtaining the highest accuracy rate of 99.66%; with noisier datasets, the ANN was more tolerant.

In order to determine which of the models could handle large amounts of data and still produce accurate predictions, Ahmad et al. [ 135 ] compared SVM with linear and radial basis function (RBF) kernels, RF, and ELM on the NSL-KDD dataset. The results demonstrated that when using the full dataset, ELM outperformed the other algorithms on all tested metrics in all experiments, including accuracy, which reached 99.5%. On the other hand, when using half and a quarter of the dataset, SVM performed better overall, with an accuracy of around 98.5%. Hence, it was concluded that ELM is best suited for intrusion detection when dealing with large amounts of data. The researchers plan to further explore ELM and experiment with different feature selection and transformation techniques and their impact on its performance.
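ELM's speed on large datasets comes from its training procedure: the hidden-layer weights are set randomly and never updated, and only the output weights are solved in closed form via a pseudo-inverse, so there is no iterative back-propagation. A minimal sketch on synthetic two-class data; the data, layer sizes, and activation are illustrative assumptions, not taken from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, well-separated two-class "traffic" data (illustrative only).
X = np.vstack([rng.normal(-2, 1, (40, 5)), rng.normal(2, 1, (40, 5))])
y = np.array([0] * 40 + [1] * 40)
Y = np.eye(2)[y]  # one-hot targets

# 1. Random hidden layer (never trained).
n_hidden = 50
W = rng.normal(size=(5, n_hidden))
b = rng.normal(size=n_hidden)
H = np.tanh(X @ W + b)

# 2. Output weights via the Moore-Penrose pseudo-inverse (the only "training" step).
beta = np.linalg.pinv(H) @ Y

# 3. Predict by taking the highest-scoring class.
pred = np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
print("training accuracy:", (pred == y).mean())
```

Because step 2 is a single linear-algebra solve, training scales far better than gradient-based methods, which is consistent with the finding that ELM shines as the dataset grows.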

Amira et al. [ 136 ] found MLP to be the most effective and appropriate classifier for increasing detection accuracy. Data pre-processing was carried out using the equal width binning algorithm, and the sequential floating forward selection (SFFS) feature selection technique was applied, resulting in the selection of 26 features. Using the NSL-KDD dataset, Amira et al. then applied a multi-agent, 2-layer classification algorithm. The classifiers tested and compared were NB and DT variants, namely NBTree, BFTree, J.48, and RF. NBTree and BFTree gave better results than RF and J.48. MLP performed well at classifying normal traffic and DoS attacks but less well at identifying R2L and U2R attacks. Overall, it was concluded that a single classifier is not sufficient to classify the attack class; to increase classification accuracy, multiple classifiers must be involved.

Rather than comparing different techniques, Gogoi et al. [ 137 ] focused on evaluating a clustering approach for detecting network traffic anomalies on different datasets. The proposed method was evaluated using the TUIDS [ 138 ], NSL-KDD, and KDD-99 datasets. The real-life TUIDS intrusion data consist of three datasets: flow level, packet level, and port scan. After the pre-processing stage, they applied a combination of supervised and unsupervised incremental clustering, which labelled the training data into different profiles (or rules); prediction was then undertaken using a supervised classification algorithm. On the TUIDS dataset, the packet level achieved the highest accuracy at 99.42%; on KDD-99, the accuracy was 92.39%; and on NSL-KDD, it was 98.34%.

Wattanapongsakorn et al. [ 139 ] aimed to classify real-time traffic, using 12 network traffic features to distinguish 17 attack types of DoS and probing, as well as normal traffic. Supervised ML techniques—DT, ripple rule, back-propagation neural network, and Bayesian network—were applied. In the pre-processing stage, the team used a packet sniffer built on the Jpcap library to collect and store network records over a period of time. In the classification part, training and testing were performed using the Weka tool and the results observed. The DT approach achieved the highest DR of 85.7%. In a second experiment, some attack types were grouped together, and the training data consisted of 9000 records with 600 records per attack type (600 × 15). In this case, the DR was much higher, with DT reaching 95.5%.

Further research on enhancing an existing algorithm for intrusion detection was done by Cui et al. [ 140 ], who worked on improving the Bayes classifier (BC). The proposed method integrates the spatiotemporal patterns of measurements into a flexible BC to detect cyber-attacks, with the spatiotemporal patterns captured by the graph Laplacian matrix of the system measurements. After evaluating the developed method's performance, it was concluded that the flexible BC showed the highest TPR compared with the naïve BC, SVM, and DT methods, which verified the effectiveness of the developed method. For future work, DL techniques will be involved by mapping the spatiotemporal patterns to a linear space using an LSTM network for better detection accuracy of cyber-attacks.

Moreover, Kumar et al. [ 141 ] focused on enhancing detection efficiency by combining three algorithms—RF, JRIP, and PART—to identify threats to mobile devices. The dataset used contained around 600 samples captured by the researchers from a virtual machine using Wireshark. For feature extraction, they used bidirectional flow export via the IP flow information export method (RFC-5103 BiFlow). The challenges the researchers faced were overfitting and concept drift, caused by selecting features with low predictive power. The ensemble model achieved an accuracy of 98.2% and was able to identify benign traffic. For future work, the researchers aim to integrate ML with conventional NIDS and to reduce the chance of concept drift by introducing innovative methods.

Similarly, Tahir et al. [ 142 ] constructed a hybrid ML technique for classifying network traffic as normal or intrusive, combining K-means clustering and SVM classification to improve the DR and reduce the false positive and false negative alarm rates. The technique was applied to the NSL-KDD dataset. Pre-processing was performed on the dataset to reduce ambiguity and supply accurate information to the detection engine. After applying the classifier subset evaluator and best-first search algorithms, both classifiers—K-means and SVM—were tested and their performance evaluated. The hybrid technique attained a DR of 96.26% and an FNR of 3.7%, and showed comparatively higher detection for DoS, PROBE, and R2L attacks.
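The general shape of such a hybrid is two-stage: an unsupervised clustering step first partitions the traffic into behavioural groups, then a supervised stage labels each group. A dependency-free sketch on synthetic data, with a simple majority-label rule standing in for the SVM refinement stage (the data, cluster count, and labelling rule are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic traffic: two well-separated behavioural groups with known labels.
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(6, 1, (50, 3))])
y = np.array([0] * 50 + [1] * 50)  # 0 = normal, 1 = intrusive

# Stage 1: k-means (a few Lloyd iterations) partitions the traffic.
centroids = X[[0, -1]].copy()  # seed with two distant records
for _ in range(10):
    assign = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)
    centroids = np.array([X[assign == k].mean(0) for k in range(2)])

# Stage 2: supervised labelling per cluster. Here each cluster takes the
# majority label of its members; an SVM would refine this decision boundary.
cluster_label = {k: int(round(y[assign == k].mean())) for k in range(2)}
pred = np.array([cluster_label[k] for k in assign])
print("detection rate:", (pred[y == 1] == 1).mean())
```

The clustering stage cheaply narrows each record down to a small, homogeneous region, which is what lets the supervised stage achieve a higher DR with fewer false alarms than either technique alone.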

One more enhanced technique was proposed by Sharma et al. [ 143 ], applying efficient data mining algorithms to classify network traffic as normal or anomalous. The team used KDD-99, which contains 4.9 M data instances and four attack class types. For feature selection, they collected basic features such as protocol type, duration, flags, etc. The data was normalized, and classification was carried out using k-means clustering combined with an NB classifier. The target variable was classified as normal, DoS, U2R, R2L, or probing. The DR achieved by the proposed method was 99%.

Following the same ideology, Lehnert et al. [ 144 ] built their system in steps, with more complexity added at each level. They used the KDD-99 dataset and the Shogun ML toolbox to train and test the data, focusing mainly on the SVM implementation provided by the toolbox. The key step was the training phase, performed using labelled data, with the goal of choosing the most appropriate kernel and minimizing the number of features. The results showed that two of the four kernels available in Shogun tied for the best accuracy: Gaussian and sigmoid, each producing an error of only 2.79%. It was concluded that identifying both the kernel with the lowest error rate and the subset of the most relevant features leads to an improved version of the algorithm, which can ultimately enhance the accuracy and efficiency of the SVM applied for intrusion detection, enabling faster and more accurate prediction.
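The Gaussian and sigmoid kernels that tied in that study are simply different similarity functions plugged into the same SVM machinery; the choice of kernel decides the shape of the decision boundary. A direct sketch of the two functions and the resulting kernel (Gram) matrix; the parameter values and sample points are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(x, z, gamma=0.5):
    """RBF kernel: k(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def sigmoid_kernel(x, z, alpha=0.01, c=0.0):
    """Sigmoid kernel: k(x, z) = tanh(alpha * <x, z> + c)."""
    return np.tanh(alpha * np.dot(x, z) + c)

X = np.array([[0.0, 1.0], [1.0, 1.0], [3.0, 0.0]])

# Kernel (Gram) matrix: the pairwise similarities the SVM optimises over.
K = np.array([[gaussian_kernel(a, b) for b in X] for a in X])
assert np.allclose(K, K.T)           # kernel matrices are symmetric
assert np.allclose(np.diag(K), 1.0)  # RBF self-similarity is always 1
```

Kernel selection, as in the study above, then amounts to training the same SVM with each candidate kernel and keeping the one with the lowest validation error.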

An innovative feature selection algorithm called 'highest wins (HW)' was proposed by Mohammad and Alsmadi [ 145 ] to enhance intrusion detection. The HW algorithm was applied with NB on 10 benchmark datasets from the UCI repository to evaluate its performance. The results showed that HW could successfully reduce dimensionality for most of these datasets compared to other feature selection methods such as chi-square and IG. The team conducted another set of experiments where NB and DT (C4.5) classifiers were built using the HW technique on both the binary and multiclass versions of the NSL-KDD dataset. For the binary version, HW reduced the features from 41 to eight, giving an accuracy of 99.33% with the reduced features (a 0.23% decrease compared to using all features). For the multiclass version, HW reduced the features from 41 to 11 and shortened model-building time by 2.3%. The results demonstrated that, instead of using all 41 features of this dataset, using only the eight selected by HW could produce classifiers with the same classification performance.
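Filter methods such as IG, against which HW was benchmarked, rank each feature by how much knowing its value reduces uncertainty about the class, then keep only the top-ranked features. A small stdlib-only sketch over categorical features; the toy records and feature names are illustrative assumptions, not drawn from NSL-KDD:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature_idx):
    """Class entropy minus the expected entropy after splitting on one feature."""
    base = entropy(labels)
    n = len(labels)
    remainder = 0.0
    for value in set(r[feature_idx] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[feature_idx] == value]
        remainder += len(subset) / n * entropy(subset)
    return base - remainder

# Toy records: (protocol, flag); only the flag correlates with the label.
rows = [("tcp", "S0"), ("udp", "S0"), ("tcp", "SF"), ("udp", "SF")]
labels = ["attack", "attack", "normal", "normal"]

ranked = sorted(range(2), key=lambda i: information_gain(rows, labels, i), reverse=True)
print(ranked)  # [1, 0] — the flag feature ranks first, the protocol carries no information
```

Selecting the top-k features of such a ranking is what shrinks a 41-feature dataset down to eight or eleven, cutting training time while keeping the informative signal.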

Furthermore, Chawla et al. [ 146 ] proposed a computationally efficient anomaly-based IDS combining CNNs and RNNs. To detect malicious system calls, they merged stacked CNNs with GRUs. Using the ADFA dataset of system call traces, they obtained comparable findings with shorter training periods when using GRUs. The CNN extracts the local features of system call sequences and feeds them into the RNN layer, which is followed by a fully connected SoftMax layer that generates a probability distribution over the system calls. Trained on normal system calls to predict the likelihood of the next system call, the model flags a testing sequence as malicious based on a pre-defined threshold. The researchers reported reduced training time compared with an RNN based on LSTM.

In addition, Nguyen et al. [ 147 ] used a DL approach for detecting cyber-attacks in a mobile cloud environment. The datasets used were KDD-99, NSL-KDD, and UNSW-NB15 (training = 173,340 records, testing = 82,331 records). The researchers adopted principal component analysis (PCA) to reduce the dimensionality of the datasets, and the learning process comprised three layers: an input layer, hidden layers, and an output layer. The input layer used a Gaussian restricted Boltzmann machine (GRBM) to transform real values into binary code, while the hidden layers used restricted Boltzmann machines (RBMs) to perform the learning. The output of the hidden layers was used as the input to the output layer (a SoftMax regression step). Accuracy, recall, and precision were used to measure performance. The results showed accuracies of 90.99%, 95.84%, and 97.11% for the NSL-KDD, UNSW-NB15, and KDD-99 datasets, respectively. For future work, Nguyen et al. propose implementing the model on real devices to measure accuracy in real time and to evaluate the energy and time consumed in detection.

An improved IDS was proposed by Tama et al. [ 148 ], who used two datasets to evaluate the performance of the model: NSL-KDD and UNSW-NB15. To minimize the feature size, a hybrid feature selection technique was used, consisting of three methods: the ant colony algorithm, particle swarm optimization, and a genetic algorithm. The researchers then proposed a two-stage classifier ensemble of rotation forest and bagging. The proposed model achieved an accuracy of 85.8% on the NSL-KDD dataset and 91.27% on the UNSW-NB15 dataset. For future work, the researchers intend to apply the proposed model to the multiclass classification problem.

A novel intrusion detection system that takes advantage of both statistical features and payload features was proposed by Min et al. [ 149 ]. They used the ISCX2012 dataset, which is more up to date and closer to reality, and utilized word embedding and a text-CNN to extract additional features from the payloads. The RF algorithm was then applied to the combination of payload and statistical features, and the resulting model was named TR-IDS. The effectiveness of TR-IDS was compared against five ML models: SVM, NN, CNN, and two RF variants (RF-1, and RF-2, which used statistical features only). The highest result was achieved by TR-IDS, with an accuracy of 99.13%.
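The fusion step in such designs is conceptually simple: the learned payload representation and the hand-crafted statistical features are concatenated into a single vector per flow before the final classifier. A shape-level sketch; the dimensions (a 64-dimensional payload embedding, 12 flow statistics) are illustrative assumptions, not the paper's actual sizes:

```python
import numpy as np

rng = np.random.default_rng(3)
n_flows = 100

# Stand-ins for the two feature views of each flow.
payload_features = rng.normal(size=(n_flows, 64))  # e.g. text-CNN payload embedding
stat_features = rng.normal(size=(n_flows, 12))     # e.g. hand-crafted flow statistics

# One row per flow with both views side by side, ready for an RF-style classifier.
X = np.hstack([payload_features, stat_features])
print(X.shape)  # (100, 76)
```

Feeding the classifier both views at once is what lets it outperform the single-view baselines (such as an RF on statistical features only) in the comparison above.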

Finally, more information on intrusion detection using unsupervised and hybrid methods can be found in a survey by Nisioti et al. [ 150 ]. They presented and highlighted important issues such as feature engineering methods for IDSs. Furthermore, the paper addressed using IDS data to construct and correlate attacks in order to identify attackers, as well as extending current IDSs to identify modern attacks.

Table 2 below summarizes the details discussed in this section, giving an overview of all reviewed articles in terms of the problem domain targeted, the dataset used, and the intelligent techniques applied, as well as the results achieved.

Brief summaries of the reviewed papers.

Authors | Year | Problem Domain | Dataset | Techniques | Results (Evaluation Metrics)
Churcher et al. [ ] | 2021 | IDS | Bot-IoT | KNN, SVM, DT, NB, RF, LR, ANN | Binary class: Accuracy (RF-99%); Multi-class: Accuracy (KNN-99%)
Yang et al. [ ] | 2021 | Malicious Traffic | CTU-13 | ResNet + DQN + DCGAN | Accuracy-99.94%
Tuor et al. [ ] | 2021 | Insider Threat | CERT v6.2 | SVM, isolation forest, DNN, RNN | Recall (DNN, RNN, isolation forest-100%)
Marin et al. [ ] | 2021 | Malware Attack | USTCTFC2016 | DeepMAL-using CNN layers | Accuracy (Rbot-99.9%, Neris-63.5%, Virut-54.7%)
Ahuja et al. [ ] | 2021 | DDoS | Private Dataset | CNN, RNN, LSTM, CNN-LSTM, SVC-SOM, SAE-MLP | Accuracy (SAE-MLP-99.75%)
Yuan et al. [ ] | 2021 | Malicious Traffic | Private Dataset | Neural Network, RNN | Accuracy (CapsNet, IndRNN-99.78%)
Alshammari et al. [ ] | 2021 | Malicious Traffic | ISOT CID | DT, KNN, RF, NB, SVM, NNet | Cross val: Accuracy (RF, DT, KNN-100%); Split val: Accuracy (RF, DT-100%)
Mohammad and Alsmadi [ ] | 2021 | IDS | NSL-KDD, 10 UCI benchmark datasets | NB and C4.5 using HW | Reduced features give similar results; Accuracy (C4.5-93.90%)
Qaddoura et al. [ ] | 2021 | Common IoT attacks | IoT 20 | SLFN | SLFN + SVM-SMOTE: ratio-0.9, k value-3 for k-means++
Qaddoura et al. [ ] | 2021 | Common IoT attacks | IoT 20 | LSTM, SLFN | G-mean (LSTM + SLFN-78%)
Maniriho et al. [ ] | 2021 | Common IoT attacks | IoT 20 | RF | DoS: Accuracy-99.95%; MITM: Accuracy-99.9761%; Scan: Accuracy-99.96%
Butnaru et al. [ ] | 2021 | Phishing Attacks | Public Dataset from Kaggle & PhishTank | RF, MLP, SVM, NB, DT | Accuracy (RF-99.29%)
Lin et al. [ ] | 2021 | Phishing Attacks | Private Dataset | Neural Network (Phishpedia) | Accuracy (Phishpedia-99.2%)
Rehman et al. [ ] | 2021 | DDoS | CICDDoS2019 | GRU, RNN, NB, SMO | Accuracy (GRU-99.94%)
Wang et al. [ ] | 2020 | Malicious Traffic | ISCX 2016 | NB | Accuracy (NB-90%)
Miller et al. [ ] | 2020 | Malicious Traffic | Wireshark Network Captures | Neural Network | Accuracy (NNet-93.71%)
Thaseen et al. [ ] | 2020 | IDS | Wireshark Network Captures | NB, SVM, RF, KNN | Accuracy (RF-99.81%)
Alam et al. [ ] | 2020 | Phishing Attacks | Phishing dataset from Kaggle | RF, DT | Accuracy (RF-97%)
Barut et al. [ ] | 2020 | Malware Traffic | Dataset from Stratosphere IPS, CICIDS2017 | NB, C4.5, DT, RF, SVM, AdaBoost | Accuracy, DR (RF-99.996%); FAR (RF-2.97%)
Pande et al. [ ] | 2020 | DDoS | NSL-KDD | RF, SVM, Clustering, Neural Networks | Accuracy (RF-99.76%)
Cui et al. [ ] | 2020 | IDS | Network Captures | BC | TPR (BC-98.75%)
Alsubaie et al. [ ] | 2020 | IDS | WSN-DS | J.48 form of DT, ANN | Accuracy (J.48-99.66%)
Dutta et al. [ ] | 2020 | Malicious Traffic | IoT-23, LITNET-2020, and NetML-2020 | ensemble of DNN, LSTM, DSAE | Accuracy-99.7%
Al-Haija et al. [ ] | 2020 | Common IoT attacks | NSL-KDD | CNN | Binary class: Accuracy-99.3%; Multiclass: Accuracy-98.2%
Khan et al. [ ] | 2020 | Common IoT attacks | NSL-KDD | ELM | Accuracy-93.91%
Elsayed et al. [ ] | 2020 | DDoS | CICDDoS2019 | AE with RNN | Accuracy-99%
Yuan et al. [ ] | 2020 | Insider Threat | CERT v4.2 | LSTM + CNN | AUC-0.9449
Ahmed et al. [ ] | 2020 | Zero-day attacks | CTU-13 | ANN | Accuracy (ANN-99.6%)
Doriguzzi-Corin et al. [ ] | 2020 | DDoS | ISCX2012, CICIDS2017, CICIDS2018, UNB201X | CNN | CSECIC2018: Accuracy-98.88%; ISCX2012: Accuracy-99.87%; CIC2017: Accuracy-99.67%; UNB201X: Accuracy-99.46%
Yang et al. [ ] | 2020 | Malicious Traffic | Network Captures | RNN | Accuracy (RNN-98%)
Ramos et al. [ ] | 2020 | Botnet Attacks | ISOT-HTTP, CSE-CICIDS2018 | RF, DT, SVM, NB, KNN | CIC-IDS2018: Accuracy (RF, DT-99.99%); ISOT-HTTP: Accuracy (DT-99.90%)
Sethi et al. [ ] | 2020 | Malicious Traffic | ISOT CID, NSL-KDD | DDQN | ISOT CID: Accuracy-96.87%; NSL-KDD: Accuracy-83.40%
Singh et al. [ ] | 2020 | Malicious DoH Traffic (at DNS level) | CIRA-CIC-DoHBrw-2020 | GB, NB, RF, KNN, LR | Accuracy (RF, GB-100%)
Mohammad et al. [ ] | 2020 | DDoS | UNSW-NB15, UCI datasets | Improved Rule Induction (IRI) | F Score (IRI-93.90%)
Letteri et al. [ ] | 2020 | Malware Attack | MTA KDD 19 | MLP using AE optimization or RRw optimization | Accuracy (MLP with RRw opt.-99.60%)
Rendall et al. [ ] | 2020 | Phishing Attack | Private Dataset | SVM, NB, DT, MLP | Accuracy (MLP, DT-86%)
Kim et al. [ ] | 2020 | DDoS | KDD-99, CICIDS2018 | CNN, RNN | Accuracy (CNN-99% or more)
Alrashdi et al. [ ] | 2019 | Common IoT attacks | UNSW-NB15 | RF | Accuracy (ML-99.34%)
Chawla et al. [ ] | 2019 | IDS | ADFA | RNN, CNN | Time Taken (CNN-GRU 10× faster than LSTM)
Halimaa et al. [ ] | 2019 | IDS | NSL-KDD | SVM and NB | Accuracy (SVM-93.95%)
Ongun et al. [ ] | 2019 | Malicious Traffic | CTU-13 | LR, RF, and GB | AUC (RF-99%)
De Lucia et al. [ ] | 2019 | Malicious Traffic | Datasets from Stratosphereips.org | SVM and CNN | F-Score (SVM-0.9997)
Filho et al. [ ] | 2019 | DDoS | CICDoS2017, CICIDS2017, CICIDS2018 | RF, LR, AdaBoost, Stochastic Gradient Descent, DT, and Perceptron | Accuracy (RF-96%)
Radivilova et al. [ ] | 2019 | DDoS | SNMP-MIB | RF | Accuracy (RF-0.9)
Zhang et al. [ ] | 2019 | IDS | NSL-KDD | AE | F-Score-76.47%; Recall-79.47%
Vijayanand et al. [ ] | 2019 | DDoS | CICIDS2017 | SVM, Multi-Layer Deep Networks | Accuracy (MLDN-99.99%)
Hu et al. [ ] | 2019 | Insider Threat | Private Dataset | CNN | FAR-2.94%; FRR-2.28%
Ullah et al. [ ] | 2019 | Common IoT attacks | Private Dataset | CNN | Accuracy (CNN-97.46%)
Baek et al. [ ] | 2019 | DDoS | Private Dataset | MLP | Accuracy (MLP-50%)
Shi et al. [ ] | 2019 | DDoS | CICIDS2017 | LSTM | Accuracy (LSTM-99%)
Sabeel et al. [ ] | 2019 | DDoS | CICIDS2017 | DNN, LSTM | TPR (DNN-99.8%); TPR (LSTM-99.9%)
Wu et al. [ ] | 2019 | IDS | UNSW-NB15, NSL-KDD | CNN, RNN | Binary class: Accuracy-99.24%; Multiclass: Accuracy-99.05%
Tama et al. [ ] | 2019 | IDS | NSL-KDD, UNSW-NB15 | rotation forest + bagging | UNSW-NB15: Accuracy-91.27%; NSL-KDD: Accuracy-85.8%
Rao et al. [ ] | 2019 | Phishing Attacks | Private Dataset | LSTM + SVM | Accuracy (LSTM + SVM-97.3%)
Min et al. [ ] | 2018 | IDS | ISCX2012 | RF, SVM, NN, CNN | Accuracy (RF-99.13%)
Pektas et al. [ ] | 2018 | Botnet Attacks | ISOT HTTP, CTU-13 | MLP + LSTM | ISOT: F score-98.8%; CTU: F score-99.1%
Ahmad et al. [ ] | 2018 | IDS | NSL-KDD | SVM, RF, ELM | Accuracy (ELM-99.5%)
Shafiq et al. [ ] | 2018 | Malicious Traffic | HIT Trace 1 captures, NIMS dataset | BayesNet, NB, AdaBoost, Bagging, PART, C4.5, RF, Random Tree, Sequential Minimal Optimization, oneR, Hoeffding | HIT: Accuracy (PART-97.88%); NIMS: Accuracy (RF-100%)
Park et al. [ ] | 2018 | Malware Traffic | Kyoto 2006+ | RF | F-Score (RF-99%)
Chou et al. [ ] | 2018 | Malicious Traffic | NSL-KDD | NNET | Accuracy (NNet-97.65%)
Nguyen et al. [ ] | 2018 | IDS | UNSW-NB15, KDD-99, NSL-KDD | NNET | Accuracy (KDD-99-97.11%)
Al-Qatf et al. [ ] | 2018 | IDS | NSL-KDD | SVM, STL | Binary: Accuracy-84.96%; Multiclass: Accuracy-80.48%
Millar et al. [ ] | 2018 | Malicious Traffic | UNSW-NB15 | NNET | F-Score (Flow image-94.2%)
Wu et al. [ ] | 2018 | Malware Traffic | EMBER | DQN, SARSA, Double DQN | Accuracy (DQN-93.5%)
Li et al. [ ] | 2018 | Phishing Attacks | 50K-PD, 50K-IPD | GBDT + XGBoost + LightGBM | 50K-PD: Accuracy-97.3%; 50K-IPD: Accuracy-98.6%
Vanhoenshoven et al. [ ] | 2017 | Malicious Traffic | Malicious URLs | KNN, RF, SVM, DT, NB, MLP | Accuracy (RF-97%)
Kumar et al. [ ] | 2017 | IDS | Wireshark Network Captures | ensemble of RF, PART and JRIP | Accuracy-98.2%
Anderson et al. [ ] | 2017 | Malware Traffic | Captured TLS encrypted sessions | Linear Regression, l1/l2-LR, DT, RF ensemble, SVM, MLP | Accuracy (LR-99.92%)
Almseidin et al. [ ] | 2017 | IDS | KDD-99 | J.48, RF, Random Tree, Decision Table, NB, Bayes Network, MLP | Accuracy (RF-93.77%)
Ghanem et al. [ ] | 2017 | IDS | Five datasets gathered from an IEEE 802.11 testbed and a private dataset | SVM | DR, OSR (on all datasets-100%)
Xu et al. [ ] | 2017 | Malicious Traffic | Network Capture | RF, LR | Kernel: DR (RF-100%); User-level: DR (RF-99%)
Tama et al. [ ] | 2017 | Common IoT attacks | CIDDS-001, UNSW-NB15, GPRS-WEP, GPRS-WPA2 | DNN | CIDDS-001: Accuracy-94.17%; UNSW-NB15: Accuracy-99.99%; GPRS-WEP: Accuracy-82.89%; GPRS-WPA2: Accuracy-94%
Yuan et al. [ ] | 2017 | DDoS | ISCX 2012 | RNN | Error Rate (RNN-2.103%)
Amira et al. [ ] | 2017 | IDS | NSL-KDD | NB, DT, NBTree, BFTree, J.48, RFT, MLP | Accuracy (MLP-98.54%)
Niyaz et al. [ ] | 2017 | DDoS | Network Capture | SAE | Accuracy (SAE-95.65%)
Belavagi et al. [ ] | 2016 | IDS | NSL-KDD | LR, SVM, NB, RF | Accuracy (RF-99%)
Mehmood et al. [ ] | 2016 | IDS | KDD-99 | SVM, NB, J.48, Decision Table | Accuracy (J.48-99%)
Alrawashdeh et al. [ ] | 2016 | IDS | KDD-99 | RBM, DBN, DBN + LR | Accuracy (DBN + LR-97.9%)
Robinson et al. [ ] | 2016 | DDoS | CAIDA conficker, CAIDA DoS, KDD-99 | NB, RF, MLP, voting, BayesNet, IBK, J.48 | Accuracy (RF-100%)
Thabtah et al. [ ] | 2016 | Phishing | Datasets from UCI | NNet | Accuracy-93.06%
Tahir et al. [ ] | 2015 | IDS | NSL-KDD | hybrid of K-means Clustering and SVM | DR-96.26%
Choudhury et al. [ ] | 2015 | IDS | NSL-KDD | BayesNet, LR, IBK, J.48, PART, JRip, Random Tree, RF, REPTree, boosting, bagging, and blending | Accuracy (RF-91.523%)
Niyaz et al. [ ] | 2015 | IDS | NSL-KDD | STL with AE | Accuracy (STL-98%)
David et al. [ ] | 2015 | Malware Attacks | Private Dataset | DBN | Accuracy (DBN-98.6%)
Barati et al. [ ] | 2015 | DDoS | CAIDA UCSD 2007 | GA + MLP | AUC-0.9991
Abuadlla et al. [ ] | 2014 | IDS | Network Capture | NNET, RBFN | Accuracy-99.4%
Xie et al. [ ] | 2014 | Malicious Traffic | ADFA | SVM | Accuracy (70%), FPR (20% when k = 5)
Mohammad et al. [ ] | 2014 | Phishing Attacks | Private Dataset | ANN | Accuracy (testing set-92.18%)
Beaver et al. [ ] | 2013 | Zero-day Attacks | KDD-99 | AdaBoost | Accuracy (AdaBoost-94%)
Devikrishna et al. [ ] | 2013 | IDS | KDD-99 | ANN | Successfully detected and classified attacks
Lehnert et al. [ ] | 2012 | IDS | KDD-99 | SVM, Clustering, NNET | Error Rate (SVM-2.79%)
Sharma et al. [ ] | 2012 | IDS | KDD-99 | K-means clustering via NB | DR-99%
Gogoi et al. [ ] | 2012 | IDS | TUIDS, NSL-KDD, KDD-99 | Clustering | TUIDS packet level: Accuracy-99.42%; KDD-99: Accuracy-92.39%; NSL-KDD: Accuracy-98.34%
Hasan et al. [ ] | 2012 | IDS | DARPA 1998 | NNET | Accuracy (NNet-92%)
Wattanapongsakorn et al. [ ] | 2011 | IDS | Network Capture | DT, Bayesian, ripple rule, back-propagation neural network | DR (DT-95.5%)
Al-Janabi et al. [ ] | 2011 | IDS | KDD-99 | ANN | DR (ANN-91%)
Sun et al. [ ] | 2010 | Malicious Traffic | Network Capture | SVM, RBFNN, PNN | Accuracy (PNN-88.18%)

3.2. Common Intelligent Algorithms Applied

In this literature review, a number of papers published between 2010 and 2021 were studied, and a plethora of ML and DL techniques were utilized in these papers to build or compare models for detecting and classifying network attacks. Table 3 lists the papers that utilized each algorithm, highlighting the problem domains in which each algorithm was used as well as the highest performance achieved, while Figure 1 presents the number of articles that utilized each algorithm. As seen from the figure and table, RF and SVM were the most widely used algorithms, and ELM was the least applied. Among the ML algorithms, the best performers were DT, RF, and KNN, with accuracies reaching up to 100%, while the least utilized were J.48 and KNN. Among the DL algorithms, the best performer was RNN, with a highest accuracy of 100%, and the least utilized and least popular was ELM. ELM is considered fast to train, as it consists of a single hidden layer, so it is usually applied to simple applications; however, it has recently been extended to a hierarchical form to handle more complex problems with higher accuracy [ 152 ].


ML and DL algorithms used in the reviewed papers.

ML and DL algorithms evaluated in the reviewed papers.

Algorithm (Papers That Applied It) | No. of Articles | Problem Domains | Performance (Highest Accuracy)
SVM [ , , , , , , , , , , , , , , , , , , , , , , , , ] | 26 | Insider Threat, DDoS, Malware, Botnet, Malicious Traffic, IDS, Phishing | 93.95% (IDS)
DT [ , , , , , , , , , , , , ] | 13 | Insider Threat, DDoS, Phishing, Malware, Botnet, Malicious Traffic, IDS | 100% (Malicious Traffic)
RF [ , , , , , , , , , , , , , , , , , , , , , , , , , , ] | 27 | DDoS, Phishing, Malware, Botnet, IoT Network, Malicious Traffic, DNS Level Attack, IDS | 100% (Malicious Traffic, DDoS)
NB [ , , , , , , , , , , , , , , , , , , ] | 19 | DDoS, Malware, Botnet, Malicious Traffic, DNS Level Attack, IDS, Phishing | 90% (Malicious Traffic)
KNN [ , , , , , ] | 6 | Botnet, Malicious Traffic, DNS Level Attack, IDS | 100% (Malicious Traffic)
MLP [ , , , , , , , , , ] | 11 | DDoS, Malware, Botnet, Malicious Traffic, IDS, Phishing | 99.60% (Malware)
ELM [ , ] | 2 | IDS | 99.5% (IDS)
LR [ , , , , , , , ] | 8 | DDoS, Malware, Malicious Traffic, DNS Level Attack, IDS | 99.92% (Malware)
J.48 [ , , , , , ] | 6 | DDoS, IDS | 99.66% (IDS)
ANN [ , , , , , ] | 6 | Phishing, Zero-Day, IDS | 99.6% (Zero-Day)
RNN [ , , , , , , , , , ] | 10 | Insider Threat, DDoS, Malicious Traffic, IDS | 100% (Insider Threat)
CNN [ , , , , , , , , , ] | 10 | Insider Threat, DDoS, Malware, IoT Network, Malicious Traffic, IDS | 99% (DDoS)
DNN [ , , , ] | 4 | Insider Threat, DDoS, IoT Network, Malicious Traffic | 99.99% (IoT Network)
LSTM [ , , , , , , ] | 7 | DDoS, Botnet, IoT Network, Malicious Traffic, Phishing | 99% (DDoS)
CNN-LSTM [ , ] | 2 | Insider Threat, DDoS | 99.48% (DDoS)
AE [ , , ] | 3 | DDoS, IDS | 99% (DDoS)

3.3. Common Datasets Used

There are several datasets used in the reviewed papers to evaluate network detection and classification models. The most widely used is NSL-KDD, owing to the reasonable size of its training and testing sets and its public availability. The NSL-KDD dataset contains 41 features; it is an enhanced version of the KDD dataset in which duplicate records were removed to eliminate classifier bias. KDD-99 and CICIDS2017 follow NSL-KDD in popularity. The KDD-99 dataset was first used in a competition and is an improved version of DARPA98. The CICIDS2017 dataset contains normal traffic and new attacks and was published in 2017 by the Canadian Institute for Cybersecurity (CIC).

The UNSW-NB15 dataset is the next most frequently used. It was created using the IXIA tool and covers nine types of attacks.

There are many other datasets, and a few researchers have created their own. The CTU-13 dataset, captured at CTU University in the Czech Republic, contains real botnet traffic mixed with normal traffic across thirteen scenarios, including legitimate traffic and attacks such as DoS. The SNMP-MIB dataset consists of about 4998 records with 34 variables; the attacks recorded include six DoS attacks (TCP-SYN, ICMP-ECHO, HTTP flood, UDP flood, Slowloris, Slowpost) and web brute force attacks. The Kyoto 2006+ dataset was built from real traffic captured by Kyoto University's honeypots over almost three years, from November 2006 to August 2009. It consists of 24 features, 14 derived from the KDD-99 dataset and 10 additional features that can be used to analyze and evaluate a network IDS. Honeypots, an email server, darknet sensors, and a web crawler were used to construct Kyoto 2006+.

ADFA is an IDS dataset that includes three data types: (1) normal training data with 4373 traces; (2) normal validation data with 833 traces; and (3) attack data with 10 attacks per vector. As the web became a significant platform for internet criminal activity, the security community put effort into blacklisting malicious URLs. Ma et al.'s dataset [ 153 ] consists of 121 sets with 2.3 million URLs overall and 3.2 million features. The researchers divided the URLs into three groups based on their characteristics, with features identified as binary, non-binary, numerical, or discrete.

Table 4 lists the papers that utilized each dataset, highlighting the main references for all datasets as well as the last year in which each dataset was used. Figure 2 presents the number of articles that utilized each dataset.


Datasets used in the reviewed papers.

Network traffic datasets used in the reviewed papers.

Dataset (Articles) | No. of Articles | Last Time Dataset Used
DARPA-1998 [ ] | 1 | 2012 [ ]
KDD-99 [ , , , , , , , , , , , ] | 12 | 2018 [ ]
NSL-KDD [ , , , , , , , , , , , , , , , , , , ] | 19 | 2021 [ ]
UNSW-NB15 [ , , , , , , ] | 7 | 2020 [ ]
CICIDS-2017 or 2018 [ , , , , , , , ] | 8 | 2020 [ ]
CTU-13 [ , , , ] | 4 | 2021 [ ]
IoTID 20 [ , , ] | 3 | 2021 [ ]
Kyoto 2006+ [ ] | 1 | 2018 [ ]
CERT v6 or v4 [ , ] | 2 | 2021 [ , ]
SNMP-MIB [ ] | 1 | 2019 [ ]
ISCX 2012 or 2016 [ , , , ] | 4 | 2020 [ , ]
ADFA [ , ] | 2 | 2019 [ ]
CAIDA [ , ] | 2 | 2016 [ ]
ISOT CID [ , ] | 2 | 2021 [ ]
ISOT HTTP [ , ] | 2 | 2020 [ ]
Malicious URLs Dataset [ ] | 1 | 2021 [ ]
EMBER [ ] | 1 | 2018 [ ]
CICDDoS2019 or CICDoS2017 [ , , ] | 3 | 2020 [ , ]
USTCTFC2016 [ ] | 1 | 2016 [ ]
GPRS WPA2/WEP [ ] | 1 | 2017 [ ]
MTA KDD 19 [ ] | 1 | 2020 [ ]
LITNET-2020 [ ] | 1 | 2020 [ ]
CIRA-CIC-DoHBrw-2020 [ ] | 1 | 2020 [ ]
Bot-IoT [ ] | 1 | 2019 [ ]
Kaggle Datasets [ , ] | 2 | 2021 [ , ]
UCI Datasets [ , , ] | 3 | 2021 [ ]

4. Discussion and Conclusions

Network security is a major concern for individuals, for-profit and non-profit organizations, and governments. With the digital explosion that we are witnessing in the present era, ensuring network security is an urgent necessity in order to safeguard society's trust in the thousands of services that rely on the network, the backbone of digital life. Network security is therefore an urgent requirement, not a luxury. Although many protection methods have been introduced, vulnerabilities remain that are exploited by hackers, leaving network security administrators in a continuous race against attackers. Techniques built around intelligent methods, namely machine learning (ML) and deep learning (DL), have proved their merits in several domains, including health care systems, financial analysis, higher education, and the energy industry. This has motivated those responsible for network security to further explore the ability of these techniques to provide the required level of security, and consequently several intelligent security techniques have been offered in the past few years. Although these techniques have shown exceptional performance, the problem has not been resolved entirely, which leaves us in a position to critically evaluate the currently offered solutions and recognize possible research directions that might lead to more secure network environments.

Choosing the right dataset and features, and the right ML or DL algorithms, to identify the different attack types has proven to be an arduous decision for experts. Hence, among the reviewed papers, some researchers focused on comparing different algorithms to determine which to use when building an intelligent model from a training dataset. As no algorithm has been found to be a silver bullet for identifying and classifying all attacks with high accuracy, it was widely noted that it is not reasonable to accept a single algorithm as a universal model.

When building any intelligent system, the designer should take into account which algorithm(s) best fit the domain, and also which dataset comprises a set of features that best represent the classification area. Considering network attacks, this article found RF to be the most commonly used algorithm, which can be justified by the fact that it uses an ensemble learning technique; this, to some extent, might ensure a long-lived system thanks to its exceptional capability to continuously learn new knowledge on the fly. Producing models with reduced overfitting is another motivation for using RF. Moreover, RF can be effectively applied to both categorical and continuous features, and thus to a wide range of datasets. In addition, its exceptional ability to handle missing data makes RF a first option when building network attack mitigation models, given that most datasets are susceptible to missing values. However, since RF produces complex trees, building a real-life system based on RF could be challenging, as it might require more computational power and resources, whereas the main success factor for a system detecting network attacks is a quick, instant reaction. SVM is the second most widely used algorithm, although it is applied to fewer network attack types than RF; this can be justified by the fact that SVM produces complex models that are difficult to apply in real life. Nevertheless, SVM is the main competitor to RF, as it shares several of its advantages, such as the exceptional capability to deal with missing values and the remarkable capability to reduce overfitting.
NB ranks third, but does not achieve the same predictive performance as RF and SVM because it assumes that the dataset features are independent, which is not true in most training datasets. DT was employed roughly half as often as RF and SVM. DT has proven its merits in several domains, but it has not been used much in network security. This can be explained by the fact that it produces a set of rules that, if exposed, would let attackers adapt their attacks to evade the rules derived from the DT models.
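
The independence assumption that limits NB can be seen directly in its scoring rule, which multiplies per-feature likelihoods as if features were unrelated. A minimal sketch on hypothetical categorical traffic features (the feature names and records are illustrative, not drawn from any reviewed dataset):

```python
from collections import Counter, defaultdict

# Tiny hypothetical training set: (protocol, flag) -> label.
train = [
    (("tcp", "syn"), "attack"), (("tcp", "syn"), "attack"),
    (("udp", "ack"), "benign"), (("tcp", "ack"), "benign"),
    (("udp", "syn"), "attack"), (("udp", "ack"), "benign"),
]

labels = Counter(y for _, y in train)
# counts[label][i][value] = how often feature i took `value` under `label`
counts = defaultdict(lambda: defaultdict(Counter))
for x, y in train:
    for i, v in enumerate(x):
        counts[y][i][v] += 1

def nb_score(x, label):
    """P(label) * prod_i P(x_i | label) -- the naive independence step."""
    score = labels[label] / len(train)
    for i, v in enumerate(x):
        # Laplace smoothing (2 possible values per feature here)
        score *= (counts[label][i][v] + 1) / (labels[label] + 2)
    return score

def classify(x):
    return max(labels, key=lambda y: nb_score(x, y))

print(classify(("tcp", "syn")))
print(classify(("udp", "ack")))
```

The product in `nb_score` is exactly where correlated features (common in network traffic, e.g. protocol and port) violate the model's assumption.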

Among the algorithms that delivered excellent results were the DL models DNN and RNN, as well as the ML models RF and DT, with accuracies reaching up to 100%. A promising research direction is applying hybrid or ensemble models to improve attack detection accuracy; for instance, augmenting DL techniques such as CNN with long short-term memory (LSTM) to automate feature engineering and improve network attack detection accuracy. Furthermore, the gated recurrent unit (GRU), initially proposed in 2014, can be applied to various problem domains in network security, as it is considered more efficient than LSTM: it uses comparatively less memory and executes faster. GRUs can solve complex problems quickly if trained well, and are therefore worth trying in network attack detection, notably for DDoS or in IoT networks.
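
The memory claim can be made concrete: an LSTM layer has four gate blocks and a GRU three, so under the standard parameterization a GRU layer needs roughly 25% fewer weights. A small sketch with hypothetical layer sizes (bias conventions vary slightly by implementation, e.g. some frameworks use two bias vectors per gate):

```python
def lstm_params(input_size, hidden_size):
    # 4 gate blocks (input, forget, cell, output), each with
    # input weights, recurrent weights, and one bias vector.
    return 4 * (hidden_size * input_size + hidden_size * hidden_size + hidden_size)

def gru_params(input_size, hidden_size):
    # 3 gate blocks (update, reset, candidate) -- one fewer than LSTM.
    return 3 * (hidden_size * input_size + hidden_size * hidden_size + hidden_size)

# Hypothetical sizes for a flow-feature sequence model
x, h = 64, 128
print("LSTM:", lstm_params(x, h))
print("GRU: ", gru_params(x, h))
print("GRU / LSTM:", gru_params(x, h) / lstm_params(x, h))  # 0.75
```

The 3:4 ratio holds for any input and hidden size, which is why GRUs train and run faster at the same hidden width.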

Since the performance of intelligent models depends largely on the datasets used to train them, it is important to analyze and evaluate which dataset to use for which type of attack. Large datasets with a good distribution of each class type are recommended to increase detection and classification accuracy. Moreover, the limited availability of such datasets is a challenge for developing more robust intelligent models and highlights the need to produce and publish new datasets for different network attack problem domains. Most of the reviewed articles used the KDD-99 dataset or its later revision, the NSL-KDD dataset. Some also used the ADFA dataset, which was proposed as a replacement for KDD-99, as well as ISOT HTTP for botnets, ISOT CID for cloud environments, and IoT20 for IoT environments; these can be explored further to build different ML and DL models.

Identifying malicious and benign URLs was also a fundamental research direction, where URL-related features strongly affected model accuracy. It was found that further improvements in classifying malicious and benign URLs can be achieved by deploying a lexical approach, which uses static lexical features extracted from the URL itself, in addition to analyzing the URL contents, yielding fast and reliable results. Hence, using a lexical approach to classify URLs is an important direction to explore.
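
A lexical approach never fetches the page: it scores the URL string alone. The following sketch extracts a handful of commonly used static lexical features with the standard library; the feature choice is illustrative, not the exact feature set of the reviewed papers.

```python
from urllib.parse import urlparse

def lexical_features(url):
    """Static lexical features often used to flag suspicious URLs."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_digits": sum(c.isdigit() for c in url),
        "num_special": sum(c in "-_@?=&%" for c in url),
        "num_subdomains": max(host.count(".") - 1, 0),
        "has_ip_host": host.replace(".", "").isdigit(),  # raw-IP hosts are a red flag
        "uses_https": parsed.scheme == "https",
    }

print(lexical_features("http://192.168.0.1/login.php?acc=update"))
print(lexical_features("https://example.com/docs"))
```

Vectors like these can feed any of the classifiers discussed above (RF, SVM, NB), which is what makes the approach both instantaneous and model-agnostic.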

Several other problem domains are worth exploring as valuable directions for enhancing network security in the modern world. With the growing prevalence of encrypted network traffic and virtual private networks, more research is needed on detecting malicious traffic in these settings using intelligent techniques, as little research has focused on this area. Furthermore, with the rising number of interconnected devices and the establishment of Internet of Things (IoT) networks, more investigation is needed to assess different intelligent techniques on new datasets such as IoT20, and to develop software that can detect and analyze the data packets exchanged in IoT environments so that existing datasets can be updated with more attacks. Additionally, a new protocol, DNS over HTTPS (DoH), has recently been created, and more research is needed on detecting malicious DoH traffic at the DNS level.

Finally, multiple researchers intend in future work to convert the models they built into real-time systems, so that they can be used in real-life scenarios such as attack detection and prevention. There are two levels of real-time ML: online prediction and online learning. Online prediction means making predictions in real time, while online learning allows the system to incorporate new data and update the model in real time. Hence, converting intelligent models into real-time systems is a fundamental direction for more researchers to probe.
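
Online learning can be sketched as folding one observation at a time into the model. The toy incremental logistic regression below (synthetic stream, illustrative learning rate, not any surveyed system) shows the idea: each `update` call is one real-time step, with no retraining from scratch.

```python
import math, random

random.seed(1)
w, b, lr = [0.0, 0.0], 0.0, 0.1

def predict_proba(x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

def update(x, y):
    """One online SGD step: fold a single new observation into the model."""
    global b
    err = predict_proba(x) - y          # gradient of log-loss w.r.t. z
    for i in range(len(w)):
        w[i] -= lr * err * x[i]
    b -= lr * err

# Simulate a stream: label 1 when the feature sum is large.
for _ in range(2000):
    x = [random.uniform(0, 1), random.uniform(0, 1)]
    update(x, 1 if x[0] + x[1] > 1 else 0)

print(round(predict_proba([0.9, 0.9]), 2), round(predict_proba([0.1, 0.1]), 2))
```

Online prediction alone would freeze `w` and `b` after deployment; online learning keeps calling `update` on labeled traffic as it arrives.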

Author Contributions

Conceptualization, M.A. (Malak Aljabri), S.S.A., R.M.A.M. and S.H.A.; methodology, M.A. (Malak Aljabri), S.S.A., R.M.A.M. and S.H.A.; software, S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; validation, M.A. (Malak Aljabri), S.M. and F.M.A.; formal analysis, M.A. (Malak Aljabri), S.M. and F.M.A.; investigation, M.A. (Malak Aljabri), S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; resources, M.A. (Malak Aljabri), S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; data curation, S.M. and F.M.A.; writing—original draft preparation, M.A. (Malak Aljabri), S.M., F.M.A., M.A. (Mennah Aboulnour), D.M.A., D.H.A. and H.S.A.; writing—review and editing, M.A. (Malak Aljabri), S.M., F.M.A., S.S.A., R.M.A.M. and S.H.A.; visualization, S.M. and F.M.A.; supervision, M.A. (Malak Aljabri); project administration, M.A. (Malak Aljabri); funding acquisition, M.A. (Malak Aljabri) and S.S.A. All authors have read and agreed to the published version of the manuscript.

We would like to thank SAUDI ARAMCO Cybersecurity Chair for funding this project.

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


  • Open access
  • Published: 30 August 2024

A novel classification algorithm for customer churn prediction based on hybrid Ensemble-Fusion model

  • Chenggang He 1 , 3 &
  • Chris H. Q. Ding 2 , 3  

Scientific Reports volume 14, Article number: 20179 (2024)


  • Computer science
  • Information technology

Nowadays, customer churn is becoming increasingly important: it is one of the key metrics for evaluating the health of a business, and it is difficult to measure success without measuring it. It remains a challenge for industry to predict when customers are churning, or preparing to churn, and to take the necessary action at the critical time before they do. Through deep research on customer churn, we propose an Ensemble-Fusion model based on machine learning and introduce an intelligent system to help reduce actual customer churn in production. Popular predictive models, including the support vector machine, random forest, k-nearest-neighbor, gradient boosting, logistic regression, Bayesian, decision tree, and neural network algorithms, are applied to compare accuracy, AUC, and F1-score. Compared against 17 classic machine learning algorithms in 9 categories, the Ensemble-Fusion model reaches a prediction accuracy of 95.35%, an AUC of 91%, and an F1-score of 96.96%. The experimental results show that the prediction accuracy of the Ensemble-Fusion model outperforms that of the other benchmark algorithms.


Introduction

Customer churn is one of the key factors affecting the healthy development of industries and enterprises, and it is a very challenging research topic in both academia and industry 1,2,3. For information industries that rely on subscription and order-purchase business models in particular, customer churn, especially the churn of key customers, can be fatal. Reducing the customer loss rate by 5% can increase profits by 25–125% 2. Unfortunately, churn analysis usually requires substantial manual effort, and it is often too late to take action to retain customers. To retain existing customers, especially key ones, many companies have made attempts to distinguish churned from non-churned customers, but the actual effect has been very poor. As is well known, the loss of old customers not only affects revenue but also affects the attraction of new customers; moreover, the cost of acquiring a new customer is often 5–6 times higher than the cost of retaining an old one 4,5. Is it therefore possible to develop efficient customer churn prediction models using machine learning algorithms, in conjunction with the actual needs of the industry? And, to help decision makers without an algorithmic background make decisions quickly and efficiently, is it possible to develop an intelligent, convenient, and efficient early warning system that detects or predicts churn of existing customers in time, so that enterprises can take action to retain at-risk key customers and minimize their losses?
The related work covers the theoretical basis of the gradient boosting algorithm 6,7, Bayesian algorithm 8,9, support vector machine algorithm 10,11,12,13,14,15, random forest algorithm 16, k-nearest-neighbor algorithm 17,18, logistic regression algorithm 19,20, decision tree algorithm 21,22,23,24, and neural network algorithms 25,26,27,28,29, and discusses their application to customer churn prediction. The literature on these algorithms restates the superiority of whichever single algorithm is used; analysis shows that these algorithms are strongly affected by the characteristics of the dataset, with a strong dependency between algorithm and data, so no single algorithm can solve all problems in every practical application scenario. Based on these shortcomings of traditional algorithms, this paper proposes an Ensemble-Fusion (integrated learning fusion) model intended to generalize across complex scenarios and to provide academia and industry with a broadly applicable and efficient customer churn prediction solution. This paper therefore first proposes a customer churn prediction algorithm based on the Ensemble-Fusion model, and then an efficient churn solution built on it.
Finally, to help the information industry make efficient customer churn decisions, a real-time intelligent early warning system for customer churn was developed through theory-guided practice. It monitors customer dynamics in real time, helps enterprises identify potentially lost customers in advance, and provides early warning at the first moment, reminding the sales team or the customer success management (CSM) team to take proactive action to retain at-risk customers, thus reducing the risk of a fatal blow to the enterprise from customer churn.

Given the above purposes, this paper conducts research on customer churn prediction using machine learning theories and algorithms. It first gives a solution for handling the huge and complex datasets found in industry, then proposes the Ensemble-Fusion (integrated learning fusion) prediction model for customer churn, and finally, to carry the theory into practice and help enterprises act quickly and efficiently to retain customers, especially key customers, presents an end-to-end real-time intelligent early warning system developed from many years of industry experience. The system not only predicts customer churn in an organization's production environment but also sends out early warnings to alert relevant personnel, such as the sales and customer success teams, so that they can take immediate action to retain customers who are about to be lost. Two particularly difficult problems were encountered in this research and development work. First, the structure of real production data is very complex: the relevant data are distributed across departments in different regions of the world and across databases with different structures, and because of sensitive information and related agreements, collecting all the relevant data is very difficult. The customer churn data collection problem therefore becomes one of constructing an effective model with a limited dataset. Second, the collected data still contain a lot of noise and are highly imbalanced 30,31,32,33,34,35,36,37,38 due to business complexity, and there are no labels marking whether a customer has churned, so considerable prior work and business knowledge are required before data collection and processing can proceed. To address these issues in customer churn prediction, the main contributions of this paper are as follows:

This paper proposes a novel model named Ensemble-Fusion, based on machine learning (ML) theories and algorithms, to predict customer churn in SaaS 36 (Software-as-a-Service, a cloud-based software delivery model in which the cloud provider develops and maintains the application software) production environments. The work focuses on the exceptionally complex data collection, processing, and application on an actual production line, organizes a detailed data processing architecture for customer churn prediction (detailed in Sect. " Customer churn prediction solution based on Ensemble-Fusion model "), and finally applies the proposed solution in an actual production environment with good results.

This paper combines machine learning theories and algorithms, including support vector machine, random forest, k-nearest-neighbor, gradient boosting, logistic regression, Bayesian, decision tree, and neural network algorithms — 17 machine learning algorithms in 9 categories in all — as baseline classifiers, to propose the Ensemble-Fusion customer churn prediction model and its data processing architecture. The high accuracy and effectiveness of the churn prediction model are verified through the key evaluation metrics of machine learning models: precision, recall, accuracy, AUC 37 (Area Under the ROC 38 Curve, which measures the entire two-dimensional area underneath the ROC curve and provides an aggregate measure of performance across all possible classification thresholds), and F1-score 39,40 (a common evaluation metric for classification tasks that combines the precision and recall of the model, defined as their harmonic mean).

To further improve industry productivity by linking theory to practice, this paper also designs and develops an intelligent early warning system based on the Ensemble-Fusion model to help enterprises predict customer churn, especially the churn of important customers, quickly and effectively, so as to retain churned customers and minimize the fatal blow that churn can deal to a company. The intelligent system can not only surface important customers with a high probability of churn, but also automatically provide relevant information based on the prediction results, reminding the relevant personnel to take proactive action to retain important customers who are about to churn, thereby reducing losses.

This paper not only provides specific theoretical solutions to the important problem of customer churn, but also translates the theory into a concrete intelligent early warning system. The system can efficiently help enterprises, and especially decision makers without a background in machine learning, make effective decisions about customer churn, enabling them to retain key customers and increase the competitiveness of the organization.

The rest of this paper is organized as follows. Section " A research approach to customer churn prediction based on Ensemble-Fusion model " introduces the theory, methodology, solution, and overall architectural design of the machine-learning-based intelligent customer churn system, and presents the customer churn prediction algorithm based on the proposed Ensemble-Fusion model. In Section " Experiment and result ", the proposed algorithm is validated, and the high accuracy and effectiveness of the churn prediction model are verified via the key metrics of machine learning model evaluation: precision, recall, accuracy, AUC, and F1-score 37,38,39,40. Section " Intelligent early warning system for customer churn prediction based on Ensemble-Fusion model " describes the main functions of the intelligent early warning system for customer churn prediction and details the user cases associated with it. A review of relevant customer churn research is presented in Section " Related work ". Finally, conclusions and outlook are summarized in Section " Conclusions and future work ".

A research approach to customer churn prediction based on Ensemble-Fusion model

This part proposes a solution for customer churn prediction based on the Ensemble-Fusion model. It first outlines the specific customer churn scenarios to be solved and gives top-down ideas and feasible solutions. It then presents the design and implementation of an end-to-end intelligent customer churn prediction system in three parts — the collection and processing of complex datasets, the construction of prediction models, and the intelligent system platform — each with a detailed process. Finally, it provides an in-depth analysis of the machine learning model for customer churn prediction, proposes a new customer churn prediction model, and gives a specific implementation algorithm.

Customer churn prediction solution based on Ensemble-Fusion model

This part proposes a solution based on the Ensemble-Fusion model to predict customer churn and help organizations reduce it. The detailed process is depicted in Fig. 1. As shown there, the solution consists of two main parts: offline training and online inference. During offline training, data preprocessing 30,31,32,33 is first required to clean and label the input data; annotation means labeling each record as churn or non-churn. Then, relevant features are extracted based on business knowledge. For example, the feature "Trend of meetings compared to last year" describes the number of meetings booked by a customer in the current year compared to the previous year, and a declining booking count reflects a trend toward imminent churn. Likewise, the feature "Trend in meeting duration compared to last year" characterizes the total meeting duration in the current year compared to the previous year and can be used to predict churn. These extracted features can effectively reflect trends toward imminent or significant customer churn. The specific model features are described in Table 1; the model training data come from actual production line usage data.
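
Year-over-year trend features of this kind reduce to a simple ratio change. A hedged sketch (the field names and numbers are illustrative, not the paper's schema):

```python
def yoy_trend(current, previous):
    """Year-over-year relative change; positive = growth, negative = decline.
    Returns None when there is no prior-year activity to compare against."""
    if previous == 0:
        return None
    return (current - previous) / previous

# Hypothetical per-customer aggregates
customer = {"meetings_2023": 120, "meetings_2024": 45,
            "minutes_2023": 5400, "minutes_2024": 1500}

features = {
    "meeting_count_trend": yoy_trend(customer["meetings_2024"], customer["meetings_2023"]),
    "meeting_duration_trend": yoy_trend(customer["minutes_2024"], customer["minutes_2023"]),
}
print(features)  # sharply negative trends hint at imminent churn
```

Guarding the zero-denominator case matters in practice: new customers with no prior-year history should be flagged as "no trend", not as extreme growth or decline.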

figure 1

Customer Churn Solution Flowchart.

The process of customer churn prediction and the logical relationships between data transfers are detailed in Fig. 2. In addition, since there are only a few churned (and noisy) records, data balancing must be performed before training. The features can then be used to iteratively train and validate the machine learning model until it validates well enough to be deployed directly to a production environment. Finally, the rigorously validated model can be deployed in production to predict the likelihood of customer churn in real time.
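
The paper does not specify its balancing method, so as a deliberately simple illustration of the balancing step, the sketch below duplicates minority-class rows (random oversampling) until both classes are equally represented:

```python
import random

random.seed(7)

def oversample_minority(rows, label_key="churn"):
    """Duplicate minority-class rows until both classes are equally sized.
    A deliberately simple alternative to SMOTE-style synthetic sampling."""
    pos = [r for r in rows if r[label_key] == 1]
    neg = [r for r in rows if r[label_key] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [random.choice(minority) for _ in range(len(majority) - len(minority))]
    return rows + extra

# Hypothetical 95/5 imbalance, typical of churn labels
data = [{"f": i, "churn": 0} for i in range(95)] + [{"f": i, "churn": 1} for i in range(5)]
balanced = oversample_minority(data)
print(sum(r["churn"] for r in balanced), len(balanced))
```

In production pipelines, balancing must be applied only to the training split, never to the held-out test data, or the evaluation becomes optimistic.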

figure 2

Architecture diagram of customer churn prediction data processing.

For the online inference component, data cleaning and feature engineering 35,36,37 are also required to construct the inference dataset. This dataset contains no labeled data, mainly because the goal is to predict whether customers will churn in the following months, which has not yet occurred at inference time. After the trained model is obtained, the test data are fed into it to infer the final prediction. Finally, information about the high-churn customers predicted by the validated machine learning model is displayed on the intelligent churn prediction system, and churn prediction information is sent to project stakeholders in real time via email, instant messaging, and other channels so that they can proactively take action to minimize the risk of churn losses.

Customer churn data prediction algorithm based on Ensemble-Fusion model

To better ground the research on customer churn, Section " Related work " covers the theoretical basis of the support vector machine, random forest, k-nearest-neighbor, gradient boosting, logistic regression, Bayesian, decision tree, and neural network algorithms, and discusses their application to customer churn prediction. The literature on these algorithms restates the superiority of whichever single algorithm is used; analysis shows that each algorithm is strongly dependent on the characteristics of its dataset, so no single algorithm can solve all problems in every practical application scenario. Based on these shortcomings of traditional algorithms, this paper proposes the Ensemble-Fusion (integrated learning fusion) model, intended to generalize across complex scenarios and to provide academia and industry with a broadly applicable, efficient customer churn prediction solution.

This subsection focuses on the detailed construction of the customer churn prediction method based on the Ensemble-Fusion model, described in Algorithm 1. In the experimental part of Section " Experiment and result ", the model is compared with 17 machine learning algorithms, validating its high accuracy, strong robustness, and ease of scalability.

End-to-end customer churn prediction real-time intelligent early warning system design

To further help organizations reduce customer churn, this subsection designs and develops an intelligent customer churn prediction system. The system consists of three main parts. The first part is the collection and detailed processing of different business-related datasets, which comprises four major processes. The first process is the ingestion of heterogeneous data; data sources in a real production environment are unusually complex and mainly include system application data, billing (financial) customer data, product transaction data, product discount data, product sales data, cross-departmental transaction data, reconciliation data, and posting data.

figure a

Customer Churn Prediction Algorithm Based on the Ensemble-Fusion Model

In a large multinational group of companies, each system has a different technical architecture, so the data formats differ, generally JSON, XML, plain text files, and other formats. To process the data, the formats must be unified: ETL (Extract, Transform, Load) pulls the data from different heterogeneous databases (e.g., MySQL, Oracle, MongoDB, and Redis) and finally stores it uniformly in a MySQL database. The second process structures the data by managing the database to construct training and testing datasets for the subsequent machine learning models. The third process builds the machine learning model for customer churn prediction from the formatted, unified dataset acquired in the previous step (elaborated in Sect. " AUC results and analysis "). The fourth process transfers business logic through a standardized API interface (RESTful API) and ultimately displays relevant information on the front-end page, mainly including customer churn information, a customer churn heat map, the customer churn management platform, and 360-degree customer churn analysis, elaborated in Fig. 2 (Customer Churn Prediction Data Processing Architecture Diagram). The second main part is the ML (machine learning) modeling system, which includes data acquisition, feature engineering, and model training, elaborated in subsection 2.3. The third main part is the visualization and presentation platform, which displays information related to customer churn and is described in Section " Experiment and result ". The system architecture is detailed in Fig. 3. As shown in Fig. 3, the system mainly consists of the following parts. The first is data collection; for Fortune 500 multinational corporations whose businesses are spread all over the world, data collection is a very complex and time-consuming task. The second is data processing, such as feature engineering on the collected data, followed by training and validation of the machine learning model, finally obtaining the model with the highest accuracy for use in the customer churn prediction system. The third is the platform display, which mainly shows multi-dimensional warning information and real-time forecasts of specific customer churn; the related information and functions are elaborated in Section " Experiment and result ".

Figure 3: Architecture diagram of the customer churn intelligent early warning system.

Typical usage of the intelligent system is described in detail in Fig. 4. As shown in Fig. 4, the platform targets two key roles: the sales layer and the leadership layer. At the sales level, the system displays customers with high churn risk on the platform and provides the relevant details. The platform also sends regular alert emails, timely messages, and other early-warning information so that project stakeholders can take proactive action before the churn occurs. Additionally, salespeople can send feedback about the forecasts, which helps to continuously improve and optimize the proposed machine learning model. For leadership, tracking global customer churn matters more than tracking individual customers, so the system provides a dashboard module for leadership managers that shows the overall churn trend from a global perspective, enabling decision-makers to act efficiently and quickly.

Figure 4: Use case diagram for the customer churn platform.

Experiment and result

This section compares the experimental results of the proposed Ensemble-Fusion model against 17 classical machine learning algorithms from 9 categories on customer churn prediction. A private dataset from the Company's customer production-line system covering 2015 to 2022 is used; 80% of the data is used for training and 20% for testing, and K-fold cross-validation is used to estimate the accuracy of each model.
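The K-fold protocol can be sketched as follows; this is a minimal illustration of how the fold index sets are generated, not the paper's experiment code:

```python
def k_fold_indices(n_samples: int, k: int = 5):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.
    Each sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, k=5))
```

Averaging a model's score over the k test folds gives a less optimistic accuracy estimate than a single 80/20 split.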

Model evaluation indicators

To evaluate the performance of machine learning models, the metrics recognized in the field are used, namely precision, recall, accuracy, and F1-score [38,39,40,41]. These metrics characterize the performance of predictive models for customer churn prediction. With true positives and false positives denoted TP and FP [42], and true negatives and false negatives denoted TN and FN [43]: TP is the number of customers whose actual and predicted labels are both churn; FP is the number whose actual label is non-churn but whose predicted label is churn; FN is the number whose actual label is churn but whose predicted label is non-churn; and TN is the number whose actual and predicted labels are both non-churn. Precision, recall, accuracy, and F1-score are then:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

Accuracy = (TP + TN) / (TP + FP + TN + FN)

F1-score = 2 × Precision × Recall / (Precision + Recall)
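The four metrics follow directly from the confusion-matrix counts; a small self-contained sketch (the counts are illustrative numbers only):

```python
def churn_metrics(tp: int, fp: int, fn: int, tn: int):
    """Compute precision, recall, accuracy, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)          # of predicted churners, how many really churn
    recall = tp / (tp + fn)             # of real churners, how many we caught
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, accuracy, f1

p, r, a, f1 = churn_metrics(tp=80, fp=10, fn=20, tn=90)
```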

Results of model indicators related to customer churn prediction

To evaluate the performance of the Ensemble-Fusion-based customer churn prediction algorithm proposed in this paper, churn prediction is performed with the proposed model and with the 17 classical machine learning algorithms from 9 major categories. Precision, recall, accuracy, and F1-score [38,39,40,41] are compared; the detailed results are given in Table 2. Among the 17 baseline algorithms, the gradient boosting classifier and random forest achieve accuracies of 95.32% and 94.29% respectively, and the gradient boosting classifier reaches an F1-score of up to 96.3%, better than the other classical classifiers. The Ensemble-Fusion model proposed in this paper achieves an accuracy of 95.35% and an F1-score of 96.96%, significantly better than the classical benchmark classifier algorithms. Figures 5, 6, 7 and 8 compare the precision, recall, accuracy, and F1-score of all algorithms in detail.

Figure 5: Comparison of algorithm precision.

Figure 6: Comparison of algorithm recall.

Figure 7: Algorithm accuracy comparison chart.

Figure 8: Algorithm F1-score comparison chart.

AUC results and analysis

To further evaluate model performance, this section also uses the AUC [13] of the ROC curve; a higher AUC score represents better model performance. Fivefold cross-validation [14] is used to compute the ROC, and the ensemble-learning-based fusion model proposed in this paper obtains the highest AUC. The detailed comparison results can be found in Table 3, and the ROC [15] results for the individual algorithms are shown in Figs. 9 to 27.
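AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one (ties counting one half). A minimal pairwise-comparison sketch of that definition, with illustrative data rather than the paper's evaluation code:

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as 0.5. O(P*N), fine for small illustrations."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc = auc_score([1, 1, 0, 0], [0.9, 0.6, 0.7, 0.2])
```

An AUC of 0.5 corresponds to random ranking; 1.0 means every churner is scored above every non-churner.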

Figure 9: SVM (RBF) algorithm ROC and AUC.

Figure 10: SVM (Poly) algorithm AUC.

Figure 12: SVM (Sigmoid) algorithm AUC.

Figure 13: Random Forest algorithm AUC.

Figure 14: KNN algorithm AUC.

Figure 15: LR algorithm AUC.

Figure 17: MLP (Algorithm 16) AUC.

Figure 18: MLP (Algorithm 17) AUC.

Figure 19: MultinomialNB algorithm AUC.

Figure 20: BernoulliNB algorithm AUC.

Figure 21: GaussianNB algorithm AUC.

Figure 22: DT (CART) algorithm AUC.

Figure 23: ID3 algorithm AUC.

Figure 24: ExtraTrees algorithm AUC.

Figure 25: AdaBoost algorithm AUC.

Figure 26: Comparison of K-fold AUC for each algorithm.

Figure 27: Comparison of average AUC by algorithm.

Intelligent early warning system for customer churn prediction based on Ensemble-Fusion model

This section elaborates the main functions of the real-time intelligent early warning system for customer churn prediction based on the Ensemble-Fusion model.

Information relevant to predicting customer churn

Figure 28 shows the top five of the "Top 100" accounts with high churn risk; detailed information (e.g., account name, account ID) is displayed in the table. If a prediction is incorrect, the user can submit feedback through the system by clicking the relevant action. The user can also click an Account ID to open the detailed prediction page, which is analyzed in Section "Demonstration of the intelligent system of customer churn prediction".

Figure 28: Example display of customer churn information.

Demonstration of the intelligent system of customer churn prediction

Figures 29 and 30 show the detail page of the real-time intelligent churn prediction system, which consists of two parts. The upper half displays the basic information of the current churn prediction, including the user's ID, name, and platform type. The lower half provides the reasons for the predicted churn together with a multi-dimensional analysis of those reasons, helping stakeholders and the relevant departments analyze the account's current billing and usage trends, identify churn trends in time, and take effective action.

Figure 29: Example display of lost customer details.

Figure 30: Example display of user and account trends.

Dashboard for an intelligent system for customer churn prediction

For the dashboard designed for leadership decision-makers, the results of the churn prediction analysis are presented in Figs. 31, 32, 33 and 34. The real-time intelligent alerts dashboard consists of five sections. The first is the overall customer churn trend, which includes three parts: average churn rate, fully renewed accounts, and new onboarding contracts. The second presents customer churn as a key driver for the leadership decision-making team. The third is the churn heatmap, which displays churn rates for selected regions and also provides a top-correlation analysis and a top-correlation forecast for the next six months.

Figure 31: Leadership decision panel design (generalized information).

Figure 32: Leadership decision panel design (churn heat map) [44] (the customer churn intelligent early warning system was developed using the open-source pyecharts library, https://github.com/pyecharts/pyecharts).

Figure 33: Leadership decision panel design (correlation coefficient analysis).

Figure 34: Leadership decision panel design (360-degree information analysis presentation).

Customer churn prediction intelligent system evaluation module

To evaluate the performance of the model inside the Ensemble-Fusion-based intelligent early warning system, this subsection tests it on 2018 production-line data. Figure 35 shows the evaluation results: the accuracy obtained through testing and validation is above 95.8%, a high level of predictive accuracy. Higher accuracy means that more of the customers predicted to churn are indeed likely to churn, which helps reduce the churn rate, improve retention, and lower the serious risk that customer churn poses to the organization.

Figure 35: Customer churn prediction model evaluation page.

Related work

To obtain the best model for customer churn prediction, this section gives a theoretical analysis of the related machine learning algorithms and models. First, the 17 machine learning algorithms from 9 categories are expounded; a customer churn prediction model based on the Ensemble-Fusion model is then proposed, and 17 sets of experiments verify that the model is high-performing, robust, and easy to extend.

Support vector machines

Support vector machines (SVM) [10,11] are a family of supervised learning methods used for classification, regression, and outlier detection [12]. Their advantage is effectiveness in high-dimensional spaces; they remain effective even when the number of dimensions exceeds the number of samples. The soft-margin objective function is:

min over w, b, ξ of (1/2)‖w‖² + C Σᵢ ξᵢ, subject to yᵢ(w·xᵢ + b) ≥ 1 − ξᵢ and ξᵢ ≥ 0.

In customer churn prediction, SVM divides the prediction into two classes: positive for churn and negative for non-churn. Commonly used kernels include linear, polynomial (poly), and RBF.
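The kernels named above replace the inner product in the SVM formulation; a minimal sketch of the three kernel functions (the `degree`, `coef0`, and `gamma` defaults are illustrative choices, not the paper's settings):

```python
import math

def linear_kernel(x, z):
    """Plain inner product <x, z>."""
    return sum(a * b for a, b in zip(x, z))

def poly_kernel(x, z, degree=3, coef0=1.0):
    """Polynomial kernel (<x, z> + coef0)^degree."""
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

The RBF kernel equals 1 exactly when the two points coincide and decays toward 0 as they move apart, which is why it can carve out non-linear churn boundaries.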

Random forests

Random forests are ensembles of decision trees [16], each trained on a random sample of the data. A random forest constructs a large number of randomized decision trees at training time and combines their outputs; each tree depends on an independently sampled random vector, and all trees in the forest share the same distribution. The basic concept is that a group of "weak learners" can come together to build a "strong learner". The forest's predictions are combined using bootstrap aggregation and random feature selection: majority vote for classification, averaging for regression. Random forests can classify large datasets accurately and have been demonstrated to be robust predictors for both small sample sizes and high-dimensional data, making them among the best algorithms for classification.
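The two ingredients described above, bootstrap sampling and majority voting, can be sketched as follows (toy data; not the paper's implementation, which would fit a decision tree on each bootstrap sample):

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw a sample of the same size with replacement (the 'bagging' step)."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Combine the individual trees' class votes into one strong prediction."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)
data = [("x1", 1), ("x2", 0), ("x3", 1)]   # (customer features, churn label)
sample = bootstrap_sample(data, rng)        # training set for one tree
vote = majority_vote([1, 0, 1, 1, 0])       # five hypothetical trees vote on one customer
```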

K-nearest-neighbors

The k-nearest-neighbors algorithm (KNN) is a non-parametric classification method first developed by Evelyn Fix and Joseph Hodges in 1951 [17]; it is used for both classification and regression. In both cases, the input consists of the k closest training examples in the dataset [18]. The training examples are vectors in a multidimensional feature space, each with a class label, and the training phase consists only of storing these feature vectors and labels. To predict, the method finds a predefined number of training samples closest in distance to the new point and derives the label from them; the number of neighbors can be a user-defined constant (k-nearest-neighbor learning) or vary with the local density of points (radius-based neighbor learning). The distance can in general be any metric, with standard Euclidean distance the most common choice. KNN is a non-generalizing, "lazy learner" algorithm: it makes no assumptions about the underlying data and does not learn from the training set immediately; it simply stores the data and, at classification time, assigns a new point to the category most similar to it.
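A minimal from-scratch sketch of the procedure just described, using Euclidean distance and toy churn data (illustrative only):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority label among its k nearest training points."""
    by_distance = sorted(train, key=lambda xy: math.dist(xy[0], query))
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

# Toy feature space: two clusters of customers.
train = [((0.0, 0.0), "stay"), ((0.1, 0.2), "stay"),
         ((5.0, 5.0), "churn"), ((5.2, 4.8), "churn"), ((4.9, 5.1), "churn")]
label = knn_predict(train, (5.0, 4.9), k=3)
```

Note that all the work happens at query time: `train` is stored as-is, matching the "lazy learner" behavior described above.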

Gradient boosting classifier

Gradient boosting [34,35] produces a model in the form of an ensemble of prediction models, typically decision trees. The gradient boosting classifier has many advantages, such as a high prediction rate, the ability to deal with non-linear data, and flexible handling of various data types. Predictions combine the weak learners' outputs, weighted by their individual accuracy. Gradient boosting re-frames boosting as a numerical optimization problem: the objective is to minimize the model's loss function by adding weak learners via gradient descent, a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. A simple GBM model contains two categories of hyper-parameters: boosting hyper-parameters and tree-specific hyper-parameters. Because gradient boosting only requires minimizing a loss function, different loss functions can be substituted, yielding a flexible technique applicable to regression and multi-class classification.
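The residual-fitting loop behind gradient boosting can be sketched with one-split regression stumps as the weak learners. This is a toy one-dimensional regression with squared loss (whose negative gradient is exactly the residual), not the paper's classifier:

```python
def fit_stump(xs, residuals):
    """Weak learner: a one-split regression stump minimizing squared error."""
    best = None
    for t in xs:  # candidate thresholds
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lmean = sum(left) / len(left) if left else 0.0
        rmean = sum(right) / len(right) if right else 0.0
        err = sum((r - (lmean if x <= t else rmean)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lm, rm = best
    return lambda x, t=t, lm=lm, rm=rm: lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds=10, lr=0.5):
    """Repeatedly fit a weak learner to the residuals and add it, scaled by lr."""
    stumps, preds = [], [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient of squared loss
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

model = gradient_boost([0, 1, 2, 3], [0.0, 0.0, 1.0, 1.0])
```

Each round shrinks the remaining residual, so the ensemble's fit tightens geometrically with the number of rounds.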

Theoretical analysis of customer churn rate prediction based on logistic regression

Logistic regression [19,20] is a generalized linear regression model and, within machine learning, a supervised classification algorithm. Because of its good performance [19], it is often used for binary or multi-class classification problems. For customer churn prediction, the problem can be abstracted as binary classification, e.g., labeling churned customers as 0 and non-churned customers as 1. For each set of input data, the sigmoid function [20]

σ(z) = 1 / (1 + e^(−z))

used in logistic regression maps the predicted value into [0, 1]. Following the labeling above, if y ≥ 0.5 the customer is assigned to class 0 (churn); otherwise to class 1 (non-churn).
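A minimal sketch of the sigmoid mapping and the 0/1 decision rule above (the churn-as-class-0 convention follows the text):

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z: float) -> int:
    """Paper's convention: class 0 (churn) when sigmoid(z) >= 0.5, else class 1."""
    return 0 if sigmoid(z) >= 0.5 else 1
```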

Theoretical analysis of customer churn rate prediction based on Bayesian theory

Research on customer churn prediction has so far applied Naive Bayes [8] mostly at the application level. The basic idea of the Naive Bayes algorithm [9] is: for a given item to be classified, compute the probability of each category conditioned on that item; the category with the highest probability is taken as the category to which the item belongs.
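The posterior-comparison idea can be sketched for categorical features with Laplace smoothing (toy data; the smoothing denominator assumes two possible values per feature, an illustrative choice):

```python
from collections import Counter, defaultdict

def naive_bayes_predict(train, query):
    """Pick the class maximizing P(class) * prod_i P(feature_i = value | class),
    under the 'naive' feature-independence assumption, with Laplace smoothing."""
    classes = Counter(label for _, label in train)
    counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for features, label in train:
        for i, v in enumerate(features):
            counts[(label, i)][v] += 1
    best_class, best_score = None, -1.0
    for c, n_c in classes.items():
        score = n_c / len(train)  # prior P(class)
        for i, v in enumerate(query):
            score *= (counts[(c, i)][v] + 1) / (n_c + 2)  # Laplace smoothing
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# (usage level, complained?) -> churn label
train = [(("low", "yes"), "churn"), (("low", "yes"), "churn"),
         (("high", "no"), "stay"), (("high", "yes"), "stay")]
pred = naive_bayes_predict(train, ("low", "yes"))
```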

Theoretical analysis of customer churn rate prediction based on decision tree

A few pieces of the customer churn prediction literature use decision tree algorithms [21,22]. Decision trees belong to supervised learning and can be used to solve both classification and regression problems. The decision tree algorithm follows a top-down divide-and-conquer strategy, recursing from the root node to the leaf nodes; nodes are split according to criteria such as information gain, gain ratio, and the Gini index [23]. The main decision tree algorithms are ID3, C4.5, and CART [24].
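The Gini index mentioned above measures node impurity; a minimal sketch of the node criterion and the size-weighted split criterion used by CART:

```python
def gini(labels):
    """Gini impurity of one node: 1 - sum over classes of p_k^2."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_of_split(left, right):
    """Size-weighted impurity of a candidate split; CART picks the minimum."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

pure = gini(["churn", "churn"])                                # perfectly pure node
mixed = gini(["churn", "stay"])                                # maximally impure (2 classes)
split = gini_of_split(["churn", "churn"], ["stay", "stay"])    # a perfect split
```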

Theoretical analysis of customer churn rate prediction based on neural network

In recent years, deep learning has been widely used to solve complex problems, including the prediction of customer churn rate [25]. The BP neural network was proposed by a group of scientists led by Rumelhart and McClelland in the 1986 book "Parallel Distributed Processing", which detailed the error back-propagation algorithm for multilayer perceptrons with nonlinear continuous transfer functions, realizing Minsky's vision of multi-layer networks [26]. A BP neural network is a back-propagation neural network; the standard BP network has three layers, namely the input layer, the hidden layer, and the output layer, as shown in Fig. 36.

Figure 36: Structure of a three-layer BP neural network.

The neural network algorithm consists of two main stages. (1) FP (forward propagation): data enters at the input layer, passes through the hidden layer under the mapping of the activation functions, and reaches the output layer; the error between the expected and actual output is used to construct the cost (loss) function for the second stage. (2) BP (back propagation): starting from the output layer, the weights and biases of each hidden layer are corrected layer by layer back to the input layer, finally yielding the trained neural network model. Neural networks can approximate any nonlinear function arbitrarily well, and because of their simple structure and easy implementation they are widely used in time-series analysis and nonlinear function regression estimation. Their development is limited, however, by the difficulty of determining the network structure, the existence of over-learning, and the tendency to fall into local extrema. This paper expects to use them to good effect in customer churn prediction research.
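The two FP/BP stages can be sketched for a tiny one-hidden-layer network trained on a single example with squared error; all layer sizes, initial weights, and the learning rate are illustrative, not the paper's configuration:

```python
import math

def forward(x, w1, b1, w2, b2):
    """FP stage: input -> sigmoid hidden layer -> sigmoid output."""
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    hidden = [sig(sum(wi * xi for wi, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sig(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output

def backprop_step(x, target, w1, b1, w2, b2, lr=0.5):
    """One FP/BP round: forward pass, then gradient-descent weight updates."""
    hidden, out = forward(x, w1, b1, w2, b2)
    delta_out = (out - target) * out * (1 - out)  # dE/d(output pre-activation)
    w2_new = [w - lr * delta_out * h for w, h in zip(w2, hidden)]
    b2_new = b2 - lr * delta_out
    # Error propagated back through the old output weights to each hidden unit.
    w1_new = [[w - lr * delta_out * w2j * h * (1 - h) * xi
               for w, xi in zip(row, x)]
              for row, w2j, h in zip(w1, w2, hidden)]
    b1_new = [b - lr * delta_out * w2j * h * (1 - h)
              for b, w2j, h in zip(b1, w2, hidden)]
    return w1_new, b1_new, w2_new, b2_new

w1, b1, w2, b2 = [[0.5, -0.5], [0.3, 0.8]], [0.0, 0.0], [0.7, -0.2], 0.1
for _ in range(200):
    w1, b1, w2, b2 = backprop_step([1.0, 0.0], 1.0, w1, b1, w2, b2)
_, out = forward([1.0, 0.0], w1, b1, w2, b2)
```

After repeated FP/BP rounds the output is driven toward the target, illustrating the error-correction loop described above.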

Conclusions and future work

In this paper, we proposed a novel model named Ensemble-Fusion that uses 17 machine learning algorithms from 9 categories as baseline classifiers. Experiments show that the Ensemble-Fusion model reaches an accuracy of 95.35%, an AUC score of 91%, and an F1-score of 96.96%, outperforming the other benchmark algorithms. The paper first elaborated the important role of churn research in today's information industry and stated its contributions. It then presented the Ensemble-Fusion-based customer churn prediction solution, the design of the real-time intelligent early warning system for customer churn, the machine learning algorithms for churn prediction, the comparison with the newly proposed model, and the concrete implementation of the Ensemble-Fusion algorithm. The proposed churn prediction algorithm was validated experimentally, with robustness evaluated through precision, recall, accuracy, F1-score, and AUC. Finally, the paper described in detail the main functions of the customer churn intelligent early warning system developed from this work, aiming to help the information industry improve productivity and excel in today's globally competitive environment.

The study presented in this paper is not free of limitations. First, it is challenging to gather all relevant data on customer churn owing to sensitive information and related protocol issues, so constructing an effective model from a limited dataset becomes a bottleneck in churn prediction research. Second, the collected data still contain considerable noise and lack churn labels, which requires substantial time for organizing the data and learning the relevant business knowledge before data collection and processing. Finally, customer churn is a multidisciplinary issue involving fields such as psychology, sociology, and economics, and current research may lack an interdisciplinary perspective and approach.

Concerning future research, we intend to develop a similar ensemble-fusion classification algorithm that substitutes the baseline classifiers with reinforcement-learning-based algorithms. The primary aim is to construct an ensemble classifier better suited to complex data structures such as multisource heterogeneous data. Several directions remain for deeper study of customer churn: the first is to obtain more data from industry, e.g., combining different feature data; another interesting direction is to relax strict algorithmic constraints to support compact and dense feature representations, which can be explored in areas such as fast symmetric decomposition techniques.

Data availability

Availability of data and materials: Data is available on request from the author (Chenggang He).

Fujo, S. W. et al. Customer churn prediction in telecommunication industry using deep learning. Inf. Sci. Lett. 11 (1), 24 (2022).


Xie, Y., Li, X., Ngai, E. & Ying, W. Customer churn prediction using improved balanced random forests. Expert Syst. Appl. 36 (3), 5445–5449 (2009).


De Caigny, A., Coussement, K. & De Bock, K. W. A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees. Eur. J. Oper. Res. 269 (2), 760–772 (2018).


Ahmad, A. K., Jafar, A. & Aljoumaa, K. Customer churn prediction in telecom using machine learning in big data platform. J. Big Data 6 (1), 1–24 (2019).

He, C., Ding, C.H., Chen, S., Luo, B. Intelligent machine learning system for predicting customer churn. In: 2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI) , pp. 522–527 (2021). IEEE.

Estran, R., Souchaud, A. & Abitbol, D. Using a genetic algorithm to optimize an expert credit rating model. Expert Syst. Appl. 203 , 117506 (2022).

Deng, Z., Huang, Z.-H. & Miao, X. Sufficient conditions for judging quasi-strictly diagonally dominant tensors. Comput. Appl. Math. 42 (1), 63 (2023).

Chen, Y., Matsubara, T., Yaguchi, T. Kam theory meets statistical learning theory: Hamiltonian neural networks with non-zero training loss. In: Proceedings of the AAAI Conference on Artificial Intelligence , vol. 36, pp 6322–6332 (2022).

Bhavan, A. et al. Bagged support vector machines for emotion recognition from speech. Knowl.-Based Syst. 184 , 104886 (2019).

Sadohara, R. et al. Seed coat color genetics and genotype × environment effects in yellow beans via machine-learning and genome-wide association. Plant Genom 15 (1), 20173 (2022).

Varshney, R. K. et al. Resequencing of 429 chickpea accessions from 45 countries provides insights into genome diversity, domestication and agronomic traits. Nat. Genet. 51 (5), 857–864 (2019).


Bergström, A. et al. Insights into human genetic variation and population history from 929 diverse genomes. Science 367 (6484), 5012 (2020).

Keramati, A., Ghaneei, H. & Mirmohammadi, S. M. Developing a prediction model for customer churn from electronic banking services using data mining. Financ. Innov. 2 , 1–13 (2016).

Hudaib, A. et al. Hybrid data mining models for predicting customer churn. Int. J. Commun. Netw. Syst. Sci. 8 (05), 91 (2015).

Li, H., Yang, D., Yang, L., Lu, Y., Lin, X. Supervised massive data analysis for telecommunication customer churn prediction. In: 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom), pp 163–169 (2016). IEEE.

Deng, Q. & Söffker, D. A review of hmm-based approaches of driving behaviors recognition and prediction. IEEE Trans. Intell. Vehic. 7 (1), 21–31 (2021).

Graham, S. E. et al. The power of genetic diversity in genome-wide association studies of lipids. Nature 600 (7890), 675–679 (2021).


Shen, J. et al. Identification of a novel gene signature for the prediction of recurrence in hcc patients by machine learning of genome-wide databases. Sci. Rep. 10 (1), 4435 (2020).

Devriendt, F., Berrevoets, J. & Verbeke, W. Why you should stop predicting customer churn and start using uplift models. Inf. Sci. 548 , 497–515 (2021).

Wang, Q.-F., Xu, M. & Hussain, A. Large-scale ensemble model for customer churn prediction in search ads. Cognit. Comput. 11 , 262–270 (2019).

Alboukaey, N., Joukhadar, A. & Ghneim, N. Dynamic behavior based churn prediction in mobile telecom. Expert Syst. Appl. 162 , 113779 (2020).

Wang, S., Cao, J. & Philip, S. Y. Deep learning for spatio-temporal data mining: A survey. IEEE Trans. Knowl. Data Eng. 34 (8), 3681–3700 (2020).

Zdravevski, E., Lameski, P., Apanowicz, C. & Ślȩzak, D. From big data to business analytics: The case study of churn prediction. Appl. Soft Comput. 90 , 106164 (2020).

Vo, N. N., Liu, S., Li, X. & Xu, G. Leveraging unstructured call log data for customer churn prediction. Knowl.-Based Syst. 212 , 106586 (2021).

Goecks, J., Jalili, V., Heiser, L. M. & Gray, J. W. How machine learning will transform biomedicine. Cell 181 (1), 92–101 (2020).


Aria, M., Cuccurullo, C. & Gnasso, A. A comparison among interpretative proposals for random forests. Mach. Learn. Appl. 6 , 100094 (2021).

Alotaibi, M. Z. & Haq, M. A. Customer churn prediction for telecommunication companies using machine learning and ensemble methods. Eng. Technol. Appl. Sci. Res. 14 , 14572–14578 (2024).

Alabdulwahab, A., Haq, M. A. & Alshehri, M. Cyberbullying detection using machine learning and deep learning. Int. J. Adv. Comput. Sci. Appl. 14 , 10 (2023).

Haq, M. A., Khan, M. A. & Alshehri, M. Insider threat detection based on NLP word embedding and machine learning. Intell. Autom. Soft Comput. 33 , 619–635 (2022).

Oksuz, K., Cam, B. C., Kalkan, S. & Akbas, E. Imbalance problems in object detection: A review. IEEE Trans. Pattern Anal. Mach. Intell. 43 (10), 3388–3415 (2020).

Zidan, M. A. et al. A general memristor-based partial differential equation solver. Nat. Electron. 1 (7), 411–420 (2018).

Devriendt, F., Berrevoets, J. & Verbeke, W. Why you should stop predicting customer churn and start using uplift models. Inf. Sci. 548 , 497–515 (2021).

Shirazi, F. & Mohammadi, M. A big data analytics model for customer churn prediction in the retiree segment. Int. J. Inf. Manag. 48 , 238–253 (2019).

Amin, A. et al. Cross-company customer churn prediction in telecommunication: A comparison of data transformation methods. Int. J. Inf. Manag. 46 , 304–319 (2019).

Stripling, E., Broucke, S., Antonio, K., Baesens, B. & Snoeck, M. Profit maximizing logistic model for customer churn prediction using genetic algorithms. Swarm Evolut. Comput. 40 , 116–130 (2018).

Liu, Z. et al. Extreme gradient boosting trees with efficient Bayesian optimization for profit-driven customer churn prediction. Technol. Forecast. Soc. Change 198 , 122945 (2024).

Chicco, D. & Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 21 (1), 1–13 (2020).

Carvalho, D. V., Pereira, E. M. & Cardoso, J. S. Machine learning interpretability: A survey on methods and metrics. Electronics 8 (8), 832 (2019).

Chou, J.-S. & Nguyen, T.-K. Forward forecast of stock price using sliding-window metaheuristic-optimized machine-learning regression. IEEE Trans. Ind. Inform. 14 (7), 3132–3142 (2018).

Vafeiadis, T., Diamantaras, K. I., Sarigiannidis, G. & Chatzisavvas, K. C. A comparison of machine learning techniques for customer churn prediction. Simul. Model. Pract. Theory 55 , 1–9 (2015).

Ismail, M. R., Awang, M. K., Rahman, M. N. A. & Makhtar, M. A multi-layer perceptron approach for customer churn prediction. Int. J. Multimed. Ubiquitous Eng. 10 (7), 213–222 (2015).

Riedmiller, M., Lernen, A. Multilayer perceptron. Machine learning lab special lecture, University of Freiburg 24 (2014).

Jain, H., Khunteta, A. & Srivastava, S. Churn prediction in telecommunication using logistic regression and logit boost. Procedia Comput. Sci. 167 , 101–112 (2020).

Pyecharts, https://github.com/pyecharts/pyecharts .


Acknowledgements


This research was funded by the Scientific Research Foundation for High- level Talents of Anhui University of Science and Technology(2023yjrc120), Anhui Quality Engineering Project(2023cyts013), NSFC Key Project of International (Regional) Cooperation and Exchanges (61860206004), Natural Science Foundation of China (61976004,61572030).

Author information

Authors and affiliations.

School of Public Safety and Emergency Management, Anhui University of Science and Technology, No.15 Fengxia Road, Hefei, 230041, Anhui, China

Chenggang He

Department of Computer Science and Engineering, University of Texas at Arlington, 701 S. Nedderman Drive, Arlington, TX, 76019, USA

Chris H. Q. Ding

School of Computer Science and Technology, Anhui University, 111 Jiulong Road, Hefei, 230039, Anhui, China

Chenggang He & Chris H. Q. Ding


Contributions

Conceptualization, C.H.; methodology, C.H.; validation, C.H.; investigation, C.H.; writing—original draft preparation, C.H.; writing—review and editing, C.H., C. H. Q. D.; supervision, C. H. Q. D.. All authors reviewed the manuscript.

Corresponding author

Correspondence to Chenggang He .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .

Reprints and permissions

About this article

Cite this article.

He, C., Ding, C.H.Q. A novel classification algorithm for customer churn prediction based on hybrid Ensemble-Fusion model. Sci Rep 14 , 20179 (2024). https://doi.org/10.1038/s41598-024-71168-x


Received: 27 January 2024

Accepted: 26 August 2024

Published: 30 August 2024

DOI: https://doi.org/10.1038/s41598-024-71168-x


  • Customer churn
  • Machine learning
  • Ensemble-Fusion model
  • Smart intelligent system



A Novel Approach for Accurate Identification in Masked and Unmasked Scenarios using Glowworm Swarm Optimization and Neural Networks

  • Published: 31 August 2024


  • Kosuri Naresh Babu, ORCID: orcid.org/0009-0001-2215-345X
  • Suneetha Manne, ORCID: orcid.org/0000-0002-8917-276X

Recognition and classification are among the most important applications of machine learning. This recognition process is used to identify objects and humans. In particular, it plays a major role in authentication processes by identifying features such as human eyes, fingerprints, and facial patterns. Among these features, facial recognition is an evolving technology used in smartphones, attendance systems in offices, and healthcare centers. Several research efforts have been conducted to perform facial recognition using machine learning and deep learning algorithms. These algorithms have performed well on faces without masks, but they have struggled with masked faces, as most facial features are hidden by the mask. Therefore, an improved algorithm is needed for performing facial recognition on faces with and without masks. Since the COVID-19 outbreak, research has been focused on using deep learning algorithms to identify masked faces. However, these algorithms were typically trained on faces both with and without masks. In this paper, we propose a facial recognition approach for recognizing faces with and without masks. The common regions of the face in both scenarios are identified by cropping the image. These cropped regions are then subjected to feature extraction using histogram properties, SURF, and SIFT features. The dominant features are identified using a swarm intelligence approach called Glowworm Swarm Optimization. These dominant features are then trained using a neural network with a regression function. Finally, the performance of the proposed method will be evaluated based on accuracy, sensitivity, and specificity and compared to existing approaches, such as SURF, with different variations for facial recognition with and without masks.
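The glowworm-based selection step described above can be illustrated with a small runnable sketch. The following toy one-dimensional optimizer shows the core GSO mechanics (luciferin decay plus fitness-proportional reinforcement, and movement toward a brighter neighbour within a sensing radius); it is an illustrative simplification under assumed parameter values, not the authors' HGSO variant, and all names are ours.

```python
import random

def gso_maximize(f, bounds, n=30, iters=60, rho=0.4, gamma=0.6,
                 radius=1.0, step=0.03):
    """Toy 1-D Glowworm Swarm Optimization: luciferin decays at rate rho and
    grows with fitness (gamma); each glowworm takes a fixed step toward the
    brightest neighbour within its sensing radius."""
    random.seed(0)
    lo, hi = bounds
    x = [random.uniform(lo, hi) for _ in range(n)]
    luciferin = [5.0] * n
    for _ in range(iters):
        # Luciferin update: decay plus fitness-proportional reinforcement
        luciferin = [(1 - rho) * l + gamma * f(xi) for l, xi in zip(luciferin, x)]
        new_x = []
        for i in range(n):
            nbrs = [j for j in range(n)
                    if j != i and abs(x[j] - x[i]) < radius
                    and luciferin[j] > luciferin[i]]
            if nbrs:
                # Greedy stand-in for GSO's probabilistic neighbour selection
                j = max(nbrs, key=lambda k: luciferin[k])
                direction = 1.0 if x[j] > x[i] else -1.0
                new_x.append(x[i] + step * direction)
            else:
                new_x.append(x[i])  # locally brightest glowworm stays put
        x = new_x
    return max(x, key=f)

# Toy "feature dominance" score peaking at 2.0
best = gso_maximize(lambda v: -(v - 2.0) ** 2, bounds=(0.0, 4.0))
```

In the paper's setting the fitness function would score candidate feature subsets rather than a scalar position, but the luciferin/movement loop is the same.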


Data Availability

The data used to support the findings of this study (a newly created dataset) are available from the corresponding author upon request.

Abbreviations

GRNN: Generalized Regression Neural Network

AHE: Adaptive Histogram Equalization

HMF: Histogram Modification Function

SIFT: Scale-Invariant Feature Transform

SURF: Speeded-Up Robust Features

U-SURF: Unsupervised Speeded-Up Robust Features

HGSO: Hybrid Glowworm Swarm Optimization

GSO: Glowworm Swarm Optimization



Acknowledgements

Not Applicable.

The authors declare that no funding was received for this research and publication.

Author information

Authors and affiliations.

Dept of CSE-AIML, Geethanjali College of Engineering and Technology, JNTUK, Kakinada, Hyderabad, 501301, India

Kosuri Naresh Babu

Dept of IT, Velagapudi Ramakrishna Siddhartha Engineering College, Vijayawada, 520007, India

Suneetha Manne


Corresponding author

Correspondence to Kosuri Naresh Babu .

Ethics declarations

Ethical approval.

This article does not contain any studies with human participants or animals performed by the authors.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Babu, K.N., Manne, S. A Novel Approach for Accurate Identification in Masked and Unmasked Scenarios using Glowworm Swarm Optimization and Neural Networks. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-20093-2


Received: 15 December 2023

Revised: 09 August 2024

Accepted: 14 August 2024

Published: 31 August 2024

DOI: https://doi.org/10.1007/s11042-024-20093-2


  • Facial recognition
  • Pandemic environment
  • Without masks
  • Neural network



Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross-Professional and Cross-Departmental Operations Collaboration


1. Introduction

2. Grid Engineering Survey Data and Its Characteristics

3. Research on Grid Survey Data Sharing Algorithm

3.1. Survey Data Sharing Methods Combining Differential Privacy

3.1.1. Overview of the Methodology

3.1.2. Discriminator Feedback Construction Combining Differential Privacy

Discriminator Weight Updates Combined with Differential Privacy
  Input: discriminator weights T; discriminator loss function L_D; survey data data_x; generator-synthesized shared data data_g; learning rate l_r; differential-privacy budget ep; differential-privacy sensitivity delta; first-order momentum estimate m; second-order momentum estimate v; budget threshold C; Gaussian noise standard deviation S.
  m = 0; v = 0; S = sqrt(delta / (2 × ep))
 for each iteration t in training: // each iterative step of the training process
 // Compute the loss on real and on generated data
 loss_real = L_D(data_x, T)
 loss_fake = L_D(data_g, T)
 // Gradient calculation: the gradient function computes the gradient of each loss with respect to the weights
 grad_real = gradient(loss_real, T)
 grad_fake = gradient(loss_fake, T)
 // Merge the gradients and compute the average gradient
 grad = (grad_real + grad_fake) / 2
 // Update the first- and second-order momentum estimates; beta1 and beta2 are the momentum parameters of the Adam optimizer
 m = beta1 × m + (1 − beta1) × grad
 v = beta2 × v + (1 − beta2) × grad²
 // Compute the bias-corrected adaptive learning rate; t denotes the current iteration number
 adaptive_lr = l_r / sqrt(v / (1 − beta2^t))
 // Sample Gaussian noise according to the differential-privacy requirement
 noise = normal_noise(mean = 0, std = S)
 // Update the discriminator weights using the noise-perturbed momentum
 T = T − adaptive_lr × (m + noise)
 // Spend part of the privacy budget
 ep = ep − delta
 // Stop updating once the privacy budget is depleted below the threshold C
 if ep < C:
 break
return T
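To make the update rule concrete, here is a minimal self-contained Python sketch of the same loop, assuming a scalar toy weight and a caller-supplied gradient function (the function name `dp_discriminator_update` and the quadratic toy loss are ours, not the paper's):

```python
import math
import random

def dp_discriminator_update(T, grad_fn, l_r, ep, delta, C,
                            beta1=0.9, beta2=0.999, max_iters=100):
    """Adam-style weight update with Gaussian noise and a privacy budget,
    following the pseudocode above (scalar weight for simplicity)."""
    m, v = 0.0, 0.0
    S = math.sqrt(delta / (2 * ep))     # noise std from budget and sensitivity
    for t in range(1, max_iters + 1):
        grad = grad_fn(T)               # stands in for the averaged real/fake gradient
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        # Bias-corrected adaptive step size (small constant avoids division by zero)
        adaptive_lr = l_r / (math.sqrt(v / (1 - beta2 ** t)) + 1e-8)
        noise = random.gauss(0.0, S)    # DP perturbation of the update
        T = T - adaptive_lr * (m + noise)
        ep -= delta                     # spend privacy budget
        if ep < C:                      # stop once the budget is exhausted
            break
    return T, ep

random.seed(0)
# Toy discriminator "loss" (T - 3)^2 with gradient 2 * (T - 3)
w, remaining = dp_discriminator_update(T=0.0, grad_fn=lambda T: 2 * (T - 3),
                                       l_r=0.1, ep=1.0, delta=0.01, C=0.1)
```

With `ep = 1.0`, `delta = 0.01`, and `C = 0.1`, the loop stops after the budget drops below the threshold, regardless of whether the weight has converged — the privacy budget, not the loss, terminates training.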

3.1.3. Dynamic Noise Regulation

Dynamic Noise Conditioning Algorithm
  Input: attenuation rate decay; initial noise scale init_noise; survey data sources data_sources.
  noise_scales = {source: init_noise for source in data_sources}
 for source in data_sources: // iterate through each data source
 // Sample data from the current data source
 batch_data = sample_data(source)
 // Calculate the loss and gradient of the model on the current data
 loss = calculate_loss(model, batch_data)
 grad = calculate_gradient(loss, model.params)
 // Dynamically shrink the noise scale of the current data source by the attenuation rate
 noise_scales[source] = noise_scales[source] × decay
 // Add noise for differential privacy; normal_noise generates Gaussian-distributed noise
 noise = normal_noise(mean = 0, std = noise_scales[source])
 noisy_grads[source] = grad + noise
 return noisy_grads // output the adaptively perturbed gradient for each source
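As a runnable counterpart, the per-source noise attenuation can be sketched as follows; the gradient computation is replaced by a caller-supplied stub, and the function and parameter names (`adaptive_noisy_gradients`, `init_noise`, `decay`) are ours:

```python
import random

def adaptive_noisy_gradients(data_sources, grad_fn, init_noise, decay):
    """Per-source noise decay matching the pseudocode: every source starts at
    init_noise, and its scale shrinks by the attenuation rate on each visit."""
    noise_scales = {s: init_noise for s in data_sources}
    noisy_grads = {}
    for source in data_sources:
        grad = grad_fn(source)            # stand-in for the loss/gradient step
        noise_scales[source] *= decay     # attenuate this source's noise scale
        noise = random.gauss(0.0, noise_scales[source])
        noisy_grads[source] = grad + noise
    return noisy_grads, noise_scales

random.seed(1)
grads, scales = adaptive_noisy_gradients(["gis", "sensor"], lambda s: 1.0,
                                         init_noise=0.5, decay=0.9)
```

Repeated passes over the same source keep shrinking its scale geometrically, so early updates are strongly perturbed while later ones approach the raw gradient.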

3.2. Attribute Encryption Based Permission Change Method

4. Experiment

4.1. Experimental Configuration and Data Sources

4.2. Experimental Situation

4.2.1. Parameter Settings

4.2.2. Evaluation Index

4.2.3. Experimental Results of the Survey Data Sharing Method Combining Differential Privacy

Comparison of Algorithm Performance with Different Numbers of Sharers

Comparison of Algorithm Performance Under Sharing Between Different Professions

Comparison of Algorithm Performance Under Sharing Between Different Departments

4.2.4. Experimental Results of the Attribute-Encryption-Based Permission Change Method

5. Conclusions

Author Contributions

Data Availability Statement

Conflicts of Interest

  • Wang, M.; Rui, L.; Xu, S.; Gao, Z.; Liu, H.; Guo, S. A multi-keyword searchable encryption sensitive data trusted sharing scheme in multi-user scenario. Comput. Netw. 2023 , 237 , 110045. [ Google Scholar ] [ CrossRef ]
  • Liu, Z.; Li, T.; Li, P.; Jia, C.; Li, J. Verifiable searchable encryption with aggregate keys for data sharing system. Futur. Gener. Comput. Syst. 2018 , 78 , 778–788. [ Google Scholar ] [ CrossRef ]
  • Niu, S.; Yang, P.; Xie, Y.; Du, X. Cloud-assisted ciphertext policy attribute-based data sharing encryption scheme on blockchain. J. Electron. Inf. 2021 , 43 , 1864–1871. [ Google Scholar ]
  • Jiang, L.; Qin, Z. An efficient decentralized mobile groupwise data sharing scheme based on attribute hiding. J. Univ. Electron. Sci. Technol. 2023 , 52 , 915–924. [ Google Scholar ]
  • Tian, G.; Hu, Y.; Wei, J.; Liu, Z.; Huang, X.; Chen, X.; Susilo, W. Blockchain-based secure deduplication and shared auditing in decentralized storage. IEEE Trans. Dependable Secur. Comput. 2021 , 19 , 3941–3954. [ Google Scholar ] [ CrossRef ]
  • Xu, Y.; Mao, Y.; Li, S.; Li, J.; Chen, X. Privacy-Preserving Federal Learning Chain for Internet of Things. IEEE Internet Things J. 2023 , 10 , 18364–18374. [ Google Scholar ] [ CrossRef ]
  • Yin, L.; Feng, J.; Xun, H.; Sun, Z.; Cheng, X. A privacy-preserving federated learning for multiparty data sharing in social IoTs. IEEE Trans. Netw. Sci. Eng. 2021 , 8 , 2706–2718. [ Google Scholar ] [ CrossRef ]
  • Huang, L.; Yi, W.; Wang, Y.; Cha, D. Research on secure data sharing method for sea-rail transportation based on federated learning and multi-party secure computing. Railw. Transp. Econ. 2024 , 46 , 58–67. [ Google Scholar ] [ CrossRef ]
  • Chen, J.; Peng, C.; Tan, W. A design scheme for user profiling based on federated learning with multi-source data. J. Nanjing Univ. Posts Telecommun. (Nat. Sci. Ed.) 2023 , 43 , 83–91. [ Google Scholar ] [ CrossRef ]
  • Chen, L.; Xiao, D.; Yu, Z.; Huang, H.; Li, M. Efficient federated learning for communication based on secret sharing and compressed sensing. Comput. Res. Dev. 2022 , 59 , 2395–2407. [ Google Scholar ]
  • Ren, Z.; Yan, E.; Chen, T.; Yu, Y. Blockchain-based CP-ABE data sharing and privacy-preserving scheme using distributed KMS and zero-knowledge proof. J. King Saud Univ.—Comput. Inf. Sci. 2024 , 36 , 101969. [ Google Scholar ] [ CrossRef ]
  • Zhang, X.; Yao, Y.; Fu, J.; Xie, H. Policy-hiding efficient multi-authorized organization CP-ABE data sharing scheme for Internet of Things. Comput. Res. Dev. 2023 , 60 , 2193–2202. [ Google Scholar ]
  • Zhao, K.; Kang, P.; Liu, B.; Guo, Z.; Feng, C.; Qing, Y. A CP-ABE scheme supporting cloud proxy re-encryption. J. Electron. 2023 , 51 , 728–735. [ Google Scholar ]
  • Liu, C.; Zhang, Q.; Li, Y.; Zhang, H. Efficient storage and sharing algorithm for power information based on fog computing. J. Shenyang Univ. Technol. 2024 , 46 , 1–6. [ Google Scholar ]
  • Guo, F.; Liu, S.; Wu, X.; Chen, B.; Zhang, W.; Ge, Q. Fault diagnosis of power transformer with unbalanced sample data based on federated learning. Power Syst. Autom. 2023 , 47 , 145–152. [ Google Scholar ]
  • Qin, S.; Dai, W.; Zeng, H.; Gu, X. Research on secure data sharing of electric power application based on blockchain. Inf. Netw. Secur. 2023 , 23 , 52–65. [ Google Scholar ]
  • Deng, S.; Hu, Q.; Wu, D.; He, Y. BCTC-KSM: A blockchain-assisted threshold cryptography for key security management in power IoT data sharing. Comput. Electr. Eng. 2023 , 108 , 108666. [ Google Scholar ] [ CrossRef ]
  • Yang, X.; Liao, Z.; Liu, L.; Wang, C. Power data sharing scheme based on blockchain and attribute-based encryption. Power Syst. Prot. Control 2023 , 51 , 169–176. [ Google Scholar ] [ CrossRef ]
  • Zhang, H.; Ding, P.; Peng, Y.; Sun, C. State Grid Electricity Data Sharing Program Based on CKKS and CP-ABE. Inf. Secur. Res. 2023 , 9 , 262–270. [ Google Scholar ]
  • Xiang, Y.; Yang, L.; Chen, B.; Li, G. Research on power line loss data sharing based on differential privacy protection. Comput. Appl. Softw. 2023 , 40 , 333–336+341. [ Google Scholar ]
  • Wang, B.; Guo, Q.; Yu, Y. Mechanism design for data sharing: An electricity retail perspective. Appl. Energy 2022 , 314 , 118871. [ Google Scholar ] [ CrossRef ]
  • Song, J.; Yang, Y.; Mei, J.; Zhou, G.; Qiu, W.; Wang, Y.; Xu, L.; Liu, Y.; Jiang, J.; Chu, Z.; et al. Proxy re-encryption-based traceability and sharing mechanism of the power material data in blockchain environment. Energies 2022 , 15 , 2570. [ Google Scholar ] [ CrossRef ]
  • Erlingsson, Ú.; Pihur, V.; Korolova, A. Rappor: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, Scottsdale, AR, USA, 3–7 November 2014; pp. 1054–1067. [ Google Scholar ]
  • Jiang, W.; Chen, Y.; Han, Y.; Wu, Y.; Zhou, W.; Wang, H. A privacy-preserving approach for mix-and-shuffle differentials during K-Modes clustering data collection and distribution. J. Commun. 2024 , 45 , 201–213. [ Google Scholar ]
  • Fan, H.; Xu, W.; Fan, X.; Wang, Y. Analysis and outlook of the application of privacy computing in new power systems. Power Syst. Autom. 2023 , 47 , 187–199. [ Google Scholar ]
  • Yu, H.; Liang, Y.; Song, J.; Li, h.; Xi, X.; Yuan, J. Overview of the development of data security sharing technology and its application in the field of energy and electric power. Inf. Secur. Res. 2023 , 9 , 208–219. [ Google Scholar ]
  • Sadeghi, P.; Korki, M. Offset-symmetric Gaussians for differential privacy. IEEE Trans. Inf. Forensics Secur. 2022 , 17 , 2394–2409. [ Google Scholar ] [ CrossRef ]
  • Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of wasserstein gans. In Advances in Neural Information Processing Systems ; NeurIPS: La Jolla, CA, USA, 2017; p. 30. [ Google Scholar ]
  • Wang, Y.; Ren, T.; Fan, Z. UAV air combat maneuver decision making based on bootstrap Minimax-DDQN. Comput. Appl. 2023 , 43 , 2636–2643. [ Google Scholar ]
  • Zhao, Y.; Yang, M. A review of progress in differential privacy research. Comput. Sci. 2023 , 50 , 65–276. [ Google Scholar ]
  • Xie, L.; Lin, K.; Wang, S.; Wang, F.; Zhou, J. Differentially private generative adversarial network. arXiv 2018 , arXiv:1802.06739. [ Google Scholar ]
  • Xu, L.; Skoularidou, M.; Cuesta-Infante, A.; Veeramachaneni, K. Modeling Tabular data using Conditional GAN. arXiv 2019 , arXiv:1907.00503. [ Google Scholar ]
  • Wang, Z.; Cheng, X.; Su, S.; Liang, J.; Yang, H. ATLAS: GAN-Based Differentially Private Multi-Party Data Sharing. IEEE Trans. Big Data 2023 , 9 , 1225–1237. [ Google Scholar ] [ CrossRef ]
  • Wang, Z.; Cheng, X.; Su, S.; Wang, G. Differentially private generative decomposed adversarial network for vertically partitioned data sharing. Inf. Sci. 2023 , 619 , 722–744. [ Google Scholar ] [ CrossRef ]
  • Sun, C.; van Soest, J.; Dumontier, M. Generating synthetic personal health data using conditional generative adversarial networks combining with differential privacy. J. Biomed. Inform. 2023 , 143 , 104404. [ Google Scholar ] [ CrossRef ] [ PubMed ]


Name | Content | Format | Structured vs. Unstructured | Real-Time vs. Non-Real-Time
Image data | Remote sensing data, aerial data, laser point cloud data, etc. | TIFF, PNG, JPG, GeoTIFF, IMG, GIF, BMP | Unstructured | Real-time
Sensor data | Pressure sensor data, radar sensor data, humidity sensor data, etc. | TXT, DAT, BRN, CSV | Structured and unstructured | Real-time
Basic control measurement data | Attribute information of basic control measurement elements | XML, HTML, JSON, YAML, CSV | Structured | Real-time
Geotechnical data | Attribute information of exploration data elements of exploration points, etc. | XML, HTML, JSON, YAML, CSV | Structured | Real-time
3D modeling data | Three-dimensional modeling data of power grid engineering facilities and the surrounding environment | CGR, DWG, DXF, DWF, DGN, PLN, RVT | Unstructured | Non-real-time
Model | ATLAS [ ] | DP-CGANS [ ] | DPGDAN [ ] | Ours
LR | 0.7888 | 0.7303 | 0.7262 | 0.8547
SVM | 0.7762 | 0.7235 | 0.7061 | 0.8426
RF | 0.7748 | 0.7312 | 0.7133 | 0.8219
AVG | 0.7799 | 0.7283 | 0.7152 | 0.8397

Share and Cite

Zhang, J.; He, B.; Lv, J.; Zhao, C.; Yu, G.; Liu, D. Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross-Professional and Cross-Departmental Operations Collaboration. Energies 2024 , 17 , 4380. https://doi.org/10.3390/en17174380

Zhang J, He B, Lv J, Zhao C, Yu G, Liu D. Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross-Professional and Cross-Departmental Operations Collaboration. Energies . 2024; 17(17):4380. https://doi.org/10.3390/en17174380

Zhang, Jiyong, Bangzheng He, Jingguo Lv, Chunhui Zhao, Gao Yu, and Donghui Liu. 2024. "Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross-Professional and Cross-Departmental Operations Collaboration" Energies 17, no. 17: 4380. https://doi.org/10.3390/en17174380



Underwater SONAR Image Classification and Analysis using LIME-based Explainable Artificial Intelligence

  • Natarajan, Purushothaman
  • Nambiar, Athira

Deep learning techniques have revolutionized image classification by mimicking human cognition and automating complex decision-making processes. However, the deployment of AI systems in the wild, especially in high-security domains such as defence, is curbed by the lack of explainability of the models. To this end, eXplainable AI (XAI) is an emerging area of research that explores the hidden black-box nature of deep neural networks. This paper applies XAI tools to interpret underwater image classification results, one of the first works in the domain to the best of our knowledge. Our study delves into SONAR image classification using a custom dataset derived from diverse sources, including the Seabed Objects KLSG dataset, the camera SONAR dataset, the mine SONAR images dataset, and the SCTD dataset. An extensive analysis of transfer learning techniques for image classification using benchmark Convolutional Neural Network (CNN) architectures such as VGG16, ResNet50, InceptionV3, and DenseNet121 is carried out. On top of this classification model, a post-hoc XAI technique, Local Interpretable Model-Agnostic Explanations (LIME), is incorporated to provide transparent justifications for the model's decisions by perturbing the input data locally and observing how the predictions change. Furthermore, Submodular Picks LIME (SP-LIME), a version of LIME specific to images that perturbs the image based on submodular picks, is also studied extensively. To this end, two superpixel segmentation algorithms, Quickshift and Simple Linear Iterative Clustering (SLIC), are leveraged for the submodular picks. The extensive analysis of XAI techniques presents the results in a more human-interpretable way, thus boosting confidence in the models and their reliability.

  • Computer Science - Computer Vision and Pattern Recognition;
  • Computer Science - Artificial Intelligence;
  • Computer Science - Human-Computer Interaction;
  • Computer Science - Machine Learning;
  • 68T07 (Primary); 68T45, 68U10 (Secondary)
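The perturbation idea behind LIME described in the abstract above can be sketched in a few lines: switch regions of the image on and off, query the classifier, and fit a linear surrogate whose coefficients rank the regions. The sketch below uses a square grid and a toy scoring function as hypothetical stand-ins for the paper's SLIC/Quickshift superpixels and trained CNN; it is an illustration of the technique, not the paper's implementation.

```python
import numpy as np

def grid_segments(h, w, cell=8):
    """Assign each pixel to a square cell -- a stand-in for the SLIC /
    Quickshift superpixels used in the paper."""
    rows = np.arange(h) // cell
    cols = np.arange(w) // cell
    return rows[:, None] * ((w + cell - 1) // cell) + cols[None, :]

def lime_weights(image, segments, predict, n_samples=200, seed=0):
    """Estimate per-segment importance: randomly switch segments on/off,
    query the classifier, and fit a linear surrogate to its responses."""
    rng = np.random.default_rng(seed)
    seg_ids = np.unique(segments)
    masks = rng.integers(0, 2, size=(n_samples, seg_ids.size))
    responses = []
    for m in masks:
        keep = np.isin(segments, seg_ids[m.astype(bool)])
        responses.append(predict(image * keep))  # hidden segments zeroed out
    coef, *_ = np.linalg.lstsq(masks.astype(float), np.array(responses), rcond=None)
    return coef  # one weight per segment

# Toy "classifier" (hypothetical): score = brightness of the top-left corner.
def toy_predict(img):
    return img[:8, :8].mean()

img = np.zeros((16, 16))
img[:8, :8] = 1.0              # the signal lives entirely in segment 0
segs = grid_segments(16, 16)   # four 8x8 segments
weights = lime_weights(img, segs, toy_predict)
print(weights.argmax())        # segment 0 gets the highest weight
```

The full method replaces the grid with real superpixel segmentation and fits a locally weighted sparse model, but the on/off perturbation loop is the same.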

Study on Network Security Algorithm

  • February 2020

Chandrashekhar Himmatrao Patil, Dr. Vishwanath Karad MIT World Peace University, Pune


IMAGES

  1. (PDF) A review of cryptographic algorithms in network security

  2. (PDF) CLOUD SECURITY ALGORITHMS

  3. (PDF) Review on Network Security and Cryptography

  4. Research Network Security Thesis Writing Guidance [Professional Writers]

  5. (PDF) Study on Network Security Algorithm

  6. (PDF) Network Security Research paper

VIDEO

  1. "Secure Hashing Algorithm", Cryptography and Network Security, Lecture 03, by Ms. Mrignainy Kansal, A

  2. Quiz: WiFi Encryption Algorithm

  3. Top 10 Cloud Computing Security Algorithms

  4. CS3401 Algorithm

  5. network security

  6. Innovations in Network Security Policy

COMMENTS

  1. A review on graph-based approaches for network security monitoring and

    This survey paper provides a comprehensive overview of recent research and development in network security that uses graphs and graph-based data representation and analytics. The paper focuses on the graph-based representation of network traffic records and the application of graph-based analytics in intrusion detection and botnet detection. The paper aims to answer several questions related ...
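The graph-based representation this survey describes can be sketched without any graph library: hosts become nodes, flow records become weighted directed edges, and simple analytics such as in-degree flag unusual rally points. The flow tuples and threshold below are hypothetical toy values; real NetFlow/IPFIX records carry ports, timestamps, and protocols.

```python
from collections import defaultdict

# Toy flow records: (source host, destination host, bytes transferred).
flows = [
    ("10.0.0.1", "10.0.0.9", 1200),
    ("10.0.0.2", "10.0.0.9", 800),
    ("10.0.0.3", "10.0.0.9", 950),
    ("10.0.0.9", "8.8.8.8", 60),
]

# Build the communication graph: adjacency map with byte-count weights.
graph = defaultdict(lambda: defaultdict(int))
for src, dst, nbytes in flows:
    graph[src][dst] += nbytes

# Simple graph analytic: in-degree. A host contacted by unusually many
# distinct peers is a candidate botnet rally point.
in_degree = defaultdict(int)
for src in graph:
    for dst in graph[src]:
        in_degree[dst] += 1

suspect = max(in_degree, key=in_degree.get)
print(suspect)  # -> 10.0.0.9
```

The surveyed systems apply far richer analytics (centrality, community detection, graph neural networks) to the same underlying representation.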

  2. Network Security and Cryptography Challenges and Trends on Recent

    Cryptography and network security should be framed broadly enough to include knowledge of safeguarding digital information and providing security services.

  3. Present and Future of Network Security Monitoring

    Network Security Monitoring (NSM) is a popular term to refer to the detection of security incidents by monitoring the network events. An NSM system is central for the security of current networks, given the escalation in sophistication of cyberwarfare. In this paper, we review the state-of-the-art in NSM, and derive a new taxonomy of the functionalities and modules in an NSM system. This ...

  4. A Comparative Analysis Of DES, AES and RSA Crypt Algorithms For Network

    On Mar 26, 2019, Priya Chittibabu published A Comparative Analysis of DES, AES and RSA Crypt Algorithms for Network Security in Cloud Computing.

  5. Featured Papers on Network Security and Privacy

    Security-by-design is a way to build a network where security is considered holistically in the whole network from its first concept, through the design, development, installation, configuration and maintenance of the network and to the finalisation of the useful life of the network.

  6. The Current Research Status of AI-Based Network Security ...

    Network security situational awareness is based on the extraction and analysis of big data; by understanding these data, it evaluates the current network security status, predicts future development trends, and provides feedback so that decision-makers can take corresponding countermeasures, achieving security protection for the network environment. This article focuses on artificial intelligence ...

  7. Exploring the landscape of network security: a comparative analysis of

    Reliability in the context of network security refers to the ability of a system to consistently and dependably provide a high level of security across the network. A reliable system ensures that security measures are consistently applied and maintained, minimizing the risk of unauthorized access, data breaches, and other security incidents.

  8. Applied Cryptography in Network Systems Security for Cyberattack

    The paper explores applied cryptography concepts in information and network systems security to prevent cyberattacks and improve secure communications. The contribution of the paper is threefold: First, we consider the various cyberattacks on the different cryptography algorithms in symmetric, asymmetric, and hashing functions.
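Two of the three primitive families named above, hashing and symmetric keyed authentication, can be sketched with Python's standard library alone (asymmetric primitives need a third-party library such as `cryptography`, so they are omitted here). The message and key are hypothetical sketch values.

```python
import hashlib
import hmac

msg = b"transfer 100 to alice"

# Hashing: integrity only -- anyone can recompute an unkeyed digest.
digest = hashlib.sha256(msg).hexdigest()

# Symmetric keyed MAC: integrity plus authenticity under a shared key.
key = b"shared-secret"  # hypothetical key, for illustration only
tag = hmac.new(key, msg, hashlib.sha256).hexdigest()

# Verify with a constant-time comparison to resist timing attacks.
ok = hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).hexdigest())
print(ok)  # -> True
```

The constant-time `compare_digest` call is exactly the kind of detail the paper's cyberattack taxonomy motivates: a naive `==` comparison leaks timing information.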

  9. AI-powered Network Security: Approaches and Research Directions

    In this paper, we discuss AI-based protection techniques, according to a security life-cycle consisting of several phases: (i) Prepare; (ii) Monitor and Diagnose; and (iii) React, Recovery and Fix. For each phase, we discuss relevant AI techniques, initial approaches, and research directions.

  10. Research on Network Information Security Applications Based on Deep

    According to the principles of security management, intuitive scientificity, and scalability, an information security system architecture based on representation and metric deep learning algorithms was designed. Key algorithm technologies, including interactive interfaces, neural network algorithms, and convolutional operation algorithms, were used for information security prediction analysis and ...

  11. Network intrusion detection system: A systematic study of machine

    With the recent interest in and progress of internet and communication technologies over the last decade, network security has emerged as a vital research domain. It employs tools like firewalls, antivirus software, and intrusion detection systems (IDS) to ensure the security of the network and all its associated assets within a cyberspace. Among these, network ...
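The anomaly-based flavour of intrusion detection mentioned above reduces, at its simplest, to flagging hosts whose behaviour deviates strongly from a baseline. The sketch below uses a z-score over connection rates with toy host names and an arbitrary threshold; the ML-based NIDS this study surveys learn far richer models from labelled traffic.

```python
import statistics

# Hypothetical per-host connection rates (connections per minute).
conns_per_min = {"h1": 12, "h2": 15, "h3": 11, "h4": 14, "h5": 240}

rates = list(conns_per_min.values())
mu = statistics.mean(rates)
sigma = statistics.stdev(rates)

# Flag hosts more than 1.5 standard deviations above the mean; real
# systems tune such thresholds on labelled traffic.
alerts = [h for h, r in conns_per_min.items() if (r - mu) / sigma > 1.5]
print(alerts)  # -> ['h5']
```

Signature-based detection, by contrast, matches traffic against known attack patterns; most production IDS deployments combine both approaches.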

  12. Wireless sensor network security: A recent review based on state-of-the

    Consequently, there is a lack of an in-depth review of WSN security. We find that there are two aspects to the survey research: studies with minimal information about attacks, and other studies that explore network security and its impact on energy dissipation, drawing on our understanding of the security difficulties in WSNs.

  13. Cryptography Algorithms for Enhancing IoT Security

    This paper discusses lightweight block ciphers, stream ciphers, and hybrid ciphers. The report evaluates security algorithms, comparing performance and robustness with the computational complexity of these techniques. Finally, the survey presents IoT security challenges, threats, and attacks with their mitigation techniques.

  14. Artificial intelligence for cybersecurity: Literature review and future

    To answer the fourth and final research question of this paper (RQ4), the literature relevant to our research questions was scrutinized to highlight potential research gaps and identify opportunities for future AI for cybersecurity research.

  15. A review of cryptographic algorithms in network security

    The first step in benefiting from an algorithm is to study it and compare its parameters. This paper presents a review that comparatively studies the algorithms adopted by many authors.

  16. Deep Learning Algorithms for Cybersecurity Applications: A

    Effectively, we have chosen research papers from 2011 to 2020 that address cybersecurity issues with deep learning concepts. Ultimately, we analyzed 80 research papers from different kinds of journals; the detailed survey findings are presented in the section below.

  17. Machine learning in cybersecurity: a comprehensive survey

    Today's world is highly network-interconnected owing to the pervasiveness of small personal devices (e.g., smartphones) as well as large computing devices or services (e.g., cloud computing or online banking), and thereby, with each passing minute, millions of data bytes are generated, processed, exchanged, shared, and utilized to yield outcomes in specific applications. Thus ...

  18. PDF A Hybrid Algorithm to Enhance Wireless

    However, the algorithm proposed in this research paper will lower the level of possible attacks and enhance security without compromising the performance of the network, while also improving power consumption within WSNs in IoT.

  19. Intelligent Techniques for Detecting Network Attacks: Review and

    The outcomes of this paper provide valuable directions for further research and applications in the field of applying effective and efficient intelligent techniques in network analytics. This article is organized into four sections. The first section provides an introduction and background to the research area.

  20. Research on Network Security Filtering Model and Key Algorithms Based

    With the rapid development of network communication technology, the continuous deepening of Internet applications, and the increasing richness of information, the Internet has become an important piece of infrastructure for human society. As network technology develops and network topologies grow more complex, the supervision of networks faces great challenges. Among them ...

  21. The Vulnerability Relationship Prediction Research for Network Risk

    Network risk assessment should include the impact of the relationships between vulnerabilities, in order to conduct a more in-depth and comprehensive assessment of vulnerabilities and network-related risks. However, extracting the relationships between vulnerabilities mainly relies on manual processes, which are subjective and inefficient. To address these issues, this paper ...

  22. GSAAN with War Strategy Optimization Algorithm for Cyber Security in a

    In this paper, the graph sample and aggregate-attention network with war strategy optimization algorithm for cyber security in the 5G wireless communication network (CS-5GWCN-GSAAN-WSOA) is proposed in 5G mobile networks to identify cyber threats. Initially, the input data are amassed from the 5G-NIDD dataset.

  23. SEI Digital Library

    The SEI Digital Library provides access to more than 6,000 documents from three decades of research into best practices in software engineering. These documents include technical reports, presentations, webcasts, podcasts and other materials searchable by user-supplied keywords and organized by topic, publication type, publication year, and author.

  24. A novel classification algorithm for customer churn prediction ...

    In order to better carry out the research on customer churn rate, this paper focuses on the theoretical basis of the Support Vector Machine algorithm, Random Forest algorithm, K-neighborhood ...

  25. 349293 PDFs

    Network security consists of the provisions and policies adopted by a network administrator to prevent and monitor unauthorized access, misuse, ...

  26. A Novel Approach for Accurate Identification in Masked and ...

    In [] research, researchers suggested a mask-detection technique that relies on a HOG (Histogram of Oriented Gradients) attribute classifier and an SVM (Support Vector Machine) to establish whether a face is masked or not. The proposed method was tested using over 10,000 randomly selected images from the Masked Face-Net database, and it correctly classified 98.73% of the tested images.
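The HOG-plus-classifier pipeline described above can be sketched in miniature: compute a histogram of gradient orientations as the feature vector, then classify by similarity to class prototypes. The stripe images and the nearest-centroid rule below are hypothetical stand-ins for real face crops and the SVM; a drastically simplified illustration, not the cited method.

```python
import numpy as np

def tiny_hog(img, bins=8):
    """Whole-image histogram of gradient orientations, weighted by
    gradient magnitude -- a drastically simplified HOG descriptor."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi  # unsigned orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

# Toy stand-ins for face crops: one class dominated by horizontal
# structure, the other by vertical structure.
stripes = ((np.arange(16) // 2) % 2).astype(float)
horiz = np.tile(stripes, (16, 1)).T   # horizontal stripes
vert = horiz.T                        # vertical stripes

# Nearest-centroid classifier standing in for the SVM.
centroids = {"masked": tiny_hog(horiz), "unmasked": tiny_hog(vert)}

def classify(img):
    feat = tiny_hog(img)
    return min(centroids, key=lambda c: np.linalg.norm(feat - centroids[c]))

print(classify(horiz), classify(vert))  # -> masked unmasked
```

The real descriptor computes block-normalized cell histograms and feeds them to a margin-based SVM, but the feature-then-classify structure is the same.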

  27. Research on Grid Multi-Source Survey Data Sharing Algorithm for Cross

    This paper addresses the problem of multi-source survey data sharing in power system engineering by proposing two improved methods: a survey data sharing method combined with differential privacy and a permission change method based on attribute encryption. The survey data sharing method integrated with differential privacy achieves effective cross-professional and cross-departmental data ...
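The differential-privacy component mentioned above typically rests on the Laplace mechanism: noise scaled to sensitivity/epsilon is added to a statistic before it leaves the owning department. The query, count, and parameter values below are hypothetical; the paper's actual mechanism and parameters are not reproduced here.

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release true_value plus Laplace(sensitivity / epsilon) noise,
    sampled via inverse-CDF from a uniform draw."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5
    return true_value - scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

# Hypothetical query: how many survey records match some condition.
# A counting query has sensitivity 1 (one person changes the count by 1).
rng = random.Random(42)
count = 1280
released = laplace_mechanism(count, sensitivity=1.0, epsilon=0.5, rng=rng)
print(round(released, 1))  # close to 1280; any single record stays deniable
```

Smaller epsilon means more noise and stronger privacy; the sharing scheme's job is to pick an epsilon that keeps cross-departmental results useful.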

  28. An Enhanced Security Algorithm for Wireless Sensor Networks

    This paper addresses the emerging problems and challenges related to security in WSNs. The proposed solution, namely the DSD-RSA algorithm, integrates existing models such as the S-MAC protocol, AES algorithm, and RSA algorithm. DSD-RSA aims to minimize power consumption and end-to-end delay while maximizing network throughput in WSNs.

  29. Underwater SONAR Image Classification and Analysis using LIME-based

    Deep learning techniques have revolutionized image classification by mimicking human cognition and automating complex decision-making processes. However, the deployment of AI systems in the wild, especially in high-security domains such as defence, is curbed by the lack of explainability of the model. To this end, eXplainable AI (XAI) is an emerging area of research that is intended to explore ...

  30. Study on Network Security Algorithm

    Cryptography is the encryption and decryption of data with secret keys using various algorithms. In this paper, network security is described on the basis of security services.
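The symmetric principle stated above, that one secret key both encrypts and decrypts, can be shown with a dependency-free one-time pad (XOR with a random key as long as the message). This is a toy illustration only; production systems use vetted algorithms such as AES-GCM from an audited library, and a one-time-pad key must never be reused.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each message byte with the corresponding key byte."""
    return bytes(d ^ k for d, k in zip(data, key))

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh random key, as long as the message

ciphertext = xor_bytes(message, key)     # encrypt
plaintext = xor_bytes(ciphertext, key)   # decrypt with the SAME secret key

print(plaintext == message)  # -> True
```

Asymmetric schemes split this single key into a public/private pair; the security services the paper surveys (confidentiality, integrity, authenticity) are built by combining both families.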