∗p-value < 0.05 is statistically significant.
The ML-based heart disease prediction system was built in Python 3.7. Python libraries including NumPy, Pandas, Seaborn, statsmodels.api, SciPy and scikit-learn (sklearn) were used for exploratory analysis of the data 17 and for implementing five ML algorithms: k-Nearest Neighbours (k-NN), Naïve Bayes (NB), Logistic Regression (LR), AdaBoost (AB) and Random Forest (RF). The study is a typical binary classification problem in which 13 input attributes are used to determine whether a patient has a high risk of heart disease (risk of CVD = high) or not (risk of CVD = low). Fig. 1 shows the workflow diagram of the complete project.
Workflow diagram of the study. This figure depicts the complete workflow of the study. A medical data set of 1670 records was gathered (in random fashion). Seventy percent of the data samples were used to train the models; the test subset comprised the remaining 30% of the medical records. Five machine learning algorithms were trained on the training subset. The prediction system was hosted on the public cloud for easy accessibility.
It was observed that there were no missing values or outliers in the data.
Since the ML algorithms can process only numerical data, the categorical attributes were label encoded. Gender female was encoded as 1 while male as 0. For all the other categorical variables like diabetes, stress, exercise etc., the presence (yes) was encoded as 1 while absence (no) was encoded as 0. High risk of CVD was encoded as 1 while low risk of CVD was encoded as 0.
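The encoding scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the field names (`gender`, `diabetes`, `stress`, `exercise`, `risk`) and the sample record are hypothetical, and only the 0/1 mappings are taken from the text.

```python
# Label-encoding sketch. Field names and the sample record are hypothetical;
# only the 0/1 mappings follow the scheme described in the text.
GENDER = {"female": 1, "male": 0}
YES_NO = {"yes": 1, "no": 0}
RISK = {"high": 1, "low": 0}

# Which mapping applies to which (assumed) categorical column.
ENCODERS = {
    "gender": GENDER,
    "diabetes": YES_NO,
    "stress": YES_NO,
    "exercise": YES_NO,
    "risk": RISK,
}


def encode_record(record):
    """Return a copy of `record` with categorical values replaced by 0/1."""
    encoded = {}
    for name, value in record.items():
        mapping = ENCODERS.get(name)
        # Numerical attributes (no mapping registered) pass through unchanged.
        encoded[name] = mapping[value.lower()] if mapping else value
    return encoded


sample = {"age": 54, "gender": "Female", "diabetes": "yes",
          "stress": "no", "exercise": "yes", "risk": "high"}
print(encode_record(sample))
```

Keeping the mappings in explicit dictionaries makes the direction of each code (for example, female = 1) visible when the model coefficients are interpreted later.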
Using the train_test_split function of the scikit-learn library, the complete medical data set was randomly split in the ratio 70:30 into a training subset and a test/validation subset, respectively. Of the 1670 records in total, the training subset had 1169 records and the test subset had 501 records. Detailed information about the training and test subsets is provided in Table 3. Of the 1169 training records, 656 corresponded to CVDs while 513 belonged to healthy people not diagnosed with CVDs.
Details of training and test subsets.
Class | Training subset (70%) | Test subset (30%) | Total records |
---|---|---|---|
High-risk cardiovascular disease (CVD) | 656 | 237 | 893 |
Low-risk CVD | 513 | 264 | 777 |
Total records | 1169 | 501 | 1670 |
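As a rough sketch of the 70:30 split, the snippet below applies `train_test_split` to a synthetic array with the same number of records and attributes as the study data. The random data and the `random_state` value are assumptions made for illustration; the class counts in Table 3 will not be reproduced by random labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 1670-record data set: 13 input attributes
# and a binary label (1 = high risk, 0 = low risk). Values are random.
rng = np.random.default_rng(0)
X = rng.normal(size=(1670, 13))
y = rng.integers(0, 2, size=1670)

# 70:30 random split; random_state is an assumption made for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

print(X_train.shape, X_test.shape)
```

No `stratify` argument is passed here because the text describes a plain random split; the differing class proportions between the two subsets in Table 3 are consistent with that.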
ML algorithms with well-demonstrated classification performance, namely NB, LR and k-NN, were applied to build the prediction model.
Research has shown that the performance of an ML-based prediction system can be improved using ensembling techniques. 18 Ensembling combines individual classification algorithms. A bagging ensemble algorithm (RF) and a boosting ensemble algorithm (adaptive boosting, AB) were also implemented for enhanced performance.
The performance of the prediction models was analysed using the validation subset of 501 records, as shown in Table 4. Of these records, 237 were confirmed cases of CVDs while the remaining 264 records corresponded to healthy people not diagnosed with CVDs. Prevalence of disease in the validation subset was 237/501 = 47.3%.
Performance of machine learning algorithms on validation set of 501 records.
Algorithm | True positive | True negative | False negative | False positive | Sensitivity | Specificity | Positive predictive value | Negative predictive value | Accuracy |
---|---|---|---|---|---|---|---|---|---|
k-Nearest Neighbours | 211 | 230 | 26 | 34 | 89% | 87.1% | 86.1% | 89.8% | 88% |
Naïve Bayes | 210 | 232 | 27 | 32 | 88.6% | 87.8% | 86.7% | 89.5% | 88.2% |
Logistic Regression | 215 | 240 | 22 | 24 | 90.7% | 90.9% | 89.9% | 91.6% | 90.8% |
AdaBoost | 218 | 246 | 19 | 18 | 91.9% | 93.1% | 92.3% | 92.8% | 92.6% |
Random Forest | 220 | 250 | 17 | 14 | 92.8% | 94.6% | 94% | 93.6% | 93.8% |
Analysis of the confusion matrix is a standard way to check the performance of an ML-based prediction system. The confusion matrix has four components: true positives (TPs), true negatives (TNs), false positives (FPs) and false negatives (FNs).
TPs: Heart patients who are predicted correctly to have heart diseases.
TNs: Healthy persons who are predicted correctly to be healthy.
FPs: Healthy persons predicted incorrectly to have heart diseases (Type 1 error).
FNs: Heart patients predicted incorrectly to be healthy (Type 2 error).
These values are used to calculate accuracy, specificity, sensitivity, positive predictive value (PPV) and negative predictive value (NPV). PPV and NPV depend on the prevalence of disease.
A brief description of these parameters is given below.
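The standard definitions of these five metrics in terms of the confusion-matrix counts can be written as a short function. This is a sketch using the textbook formulas; the function name `confusion_metrics` is hypothetical, and the counts are those reported for Logistic Regression in Table 4.

```python
def confusion_metrics(tp, tn, fn, fp):
    """Compute the five standard metrics (as fractions) from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "ppv": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                  # negative predictive value
        "accuracy": (tp + tn) / (tp + tn + fn + fp),
    }


# Counts reported for Logistic Regression on the 501-record validation set.
lr = confusion_metrics(tp=215, tn=240, fn=22, fp=24)
for name, value in lr.items():
    print(f"{name}: {100 * value:.1f}%")
```

Printed values may differ from Table 4 in the last digit, depending on whether percentages are rounded or truncated.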
Grid search with cross-validation was used to identify the best hyperparameters for the learning algorithms. The GridSearchCV class from the sklearn library was used for this purpose.
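A minimal sketch of such a search for k-NN is shown below. The synthetic data, the candidate values of `n_neighbors`, the 5-fold setting and the accuracy scoring are all assumptions made for illustration; the paper does not state the grid or the number of folds it used.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data standing in for the training subset.
X, y = make_classification(n_samples=300, n_features=13, random_state=0)

# Candidate values are illustrative; the paper reports n_neighbors = 12 as best
# for its own data, which this synthetic example is not expected to reproduce.
param_grid = {"n_neighbors": [3, 5, 8, 12, 15]}

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    cv=5,                 # 5-fold cross-validation (assumed)
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

The same pattern applies to the other classifiers by swapping the estimator and the parameter grid, for example `C` and `penalty` for logistic regression or `n_estimators` for the ensemble models.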
The best-performing prediction system, built using the RF model, was deployed on the Microsoft Azure cloud for better accessibility. 19 The ‘Pickle’ and ‘Flask’ libraries of Python were used for this purpose. 20 Hosting the prediction system on the cloud enables it to be accessed from anywhere in the world via the Internet. This is a highly useful feature for the healthcare sector of India, which faces a major shortage of medical facilities, especially in rural areas. Accessing this prediction system is as easy as accessing e-mail via the Internet.
The CVD prediction system was developed by applying five well-established ML algorithms to the training data set. The performance was tested on the validation set of 501 records. Prevalence of disease in the validation subset was 237/501 = 47.3%. Performance metrics, namely accuracy, sensitivity, specificity, PPV and NPV, were calculated for each algorithm. The performance results of all classifiers are given in Table 4.
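Because PPV and NPV depend on prevalence, the values obtained at the 47.3% prevalence of the validation subset would not carry over unchanged to a population with a different disease rate. The sketch below illustrates this with Bayes' rule, using the Random Forest counts from Table 4; the 10% prevalence is a hypothetical screening scenario, not a figure from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Random Forest sensitivity and specificity from the Table 4 counts.
sens, spec = 220 / 237, 250 / 264

# Validation-set prevalence, then a hypothetical lower-prevalence setting.
for prevalence in (237 / 501, 0.10):
    ppv, npv = predictive_values(sens, spec, prevalence)
    print(f"prevalence={prevalence:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

At the validation prevalence the function recovers the PPV and NPV implied by the confusion matrix; at lower prevalence the PPV falls and the NPV rises, which matters if the model is used for population screening.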
The best hyperparameters for k-NN (n_neighbors = 12) resulted in sensitivity 89%, specificity 87.1%, PPV 86.1% and NPV 89.8%. NB performed marginally better than k-NN in overall accuracy (88.2% vs 88%), achieving sensitivity 88.6%, specificity 87.8%, PPV 86.7% and NPV 89.5%.
LR with hyperparameters (C = 1, penalty = l2) performed well in classifying people with low risk or high risk of CVDs. LR correctly classified 455 out of 501 records, thus attaining a classification accuracy of 90.8%. Sensitivity and specificity were 90.7% and 90.9%, respectively. PPV was observed to be 89.9% while NPV was 91.6%.
Models built using ensemble techniques (RF and AB) performed better than LR. The AB model was trained with Stagewise Additive Modelling using a Multi-class Exponential loss function (SAMME; n_estimators = 30), while RF based on the ‘gini index’ with n_estimators = 150 resulted in the best performance. Sensitivity and specificity of the AB model were 91.9% and 93.1%, respectively, while RF reported 92.8% sensitivity and 94.6% specificity. A PPV of 94% and an NPV of 93.6% were achieved by the RF-based prediction model.
Interpretation of ML-based models is not easy, and these are usually considered ‘black boxes’. However, logistic regression-based models are quite interpretable. Logistic regression was implemented using the Logit function (Binomial family), based on the maximum likelihood estimation method, to predict CVD risk using the statsmodels.api library of Python. Fig. 2 shows the summary of results obtained.
Study population characteristics: mean (standard deviation) of numerical attributes, along with p-values of the t-test indicating statistical significance between the two groups (high risk/low risk of cardiovascular disease, CVD), and count (%) of categorical attributes in the two groups.
Male gender, diabetes, hypertension, high cholesterol level, smoking and alcohol were significantly associated with CVD. Lack of exercise and stress were observed to be more prevalent in CVD group (p value < 0.05).
The Estimate column in the summary reflects the natural logarithm of the odds ratio of being diagnosed with a high risk of heart disease, keeping all other features constant. From the negative value of log(odds ratio), it is inferred that females had a lower risk of CVDs than males. Regular exercise and a healthy diet were observed to be associated with a low risk of CVDs; on the other hand, diabetes, hypertension, stress, smoking and family history tend to result in a high risk of CVDs.
The odds ratio column in the summary indicates how the odds of being detected with a high risk of CVD change with each attribute when all other attributes are kept constant. Hypertension multiplies the odds of high CVD risk by a factor of 1.573, while regular physical exercise multiplies them by 0.328, a substantial reduction. The odds ratio of high CVD risk for females compared with males is 0.788.
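The relationship between the Estimate column (log-odds) and the odds ratio column is simply exponentiation. The sketch below converts the three odds ratios quoted in the text back to log-odds and shows how an odds ratio shifts a probability; the helper `apply_odds_ratio` and the 50% baseline risk are hypothetical and used only for illustration.

```python
import math

# Odds ratios quoted in the text (all other attributes held constant).
odds_ratios = {"hypertension": 1.573, "exercise": 0.328, "female": 0.788}

for name, odds_ratio in odds_ratios.items():
    coef = math.log(odds_ratio)          # log-odds coefficient (the "Estimate")
    direction = "raises" if coef > 0 else "lowers"
    print(f"{name}: coef={coef:+.3f}, OR={math.exp(coef):.3f} -> {direction} the odds")


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability after multiplying the baseline odds by `odds_ratio`."""
    odds = baseline_prob / (1 - baseline_prob) * odds_ratio
    return odds / (1 + odds)


# Hypothetical baseline risk of 50%, used only to illustrate the conversion.
print(round(apply_odds_ratio(0.5, odds_ratios["exercise"]), 3))
```

An odds ratio below 1 corresponds to a negative coefficient, which is why the female and exercise attributes appear with negative estimates.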
Ensemble algorithms (RF and AB) are based on decision trees, and attribute importance is graded according to how frequently an attribute is selected as a decision node, with the selection based on information gain and entropy. Variable importance for the boosting algorithm was determined from the impurity-based scores exposed as feature_importances_ in the sklearn library of Python. Exercise, weight, total cholesterol, hypertension and age were the top five attributes for the AB algorithm. For the RF prediction system, the variable importance scores were highest for weight, exercise, total cholesterol, hypertension and gender. Variable importance for the AB and RF algorithms is represented graphically in Fig. 3 (a) and (b), respectively.
Variable importance. (a) Variable importance for AdaBoost-based prediction model. (b) Variable importance for Random Forest–based prediction model.
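A minimal sketch of reading impurity-based importances from a fitted Random Forest is given below. The data are synthetic and the attribute names are placeholders, so the ranking it prints has no relation to Fig. 3; only the `criterion` and `n_estimators` settings follow the values reported in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 13 features; the names below are placeholders.
X, y = make_classification(n_samples=500, n_features=13, n_informative=5,
                           random_state=0)
feature_names = [f"attr_{i}" for i in range(13)]

# Settings reported in the text for RF: gini criterion, 150 trees.
forest = RandomForestClassifier(n_estimators=150, criterion="gini",
                                random_state=0)
forest.fit(X, y)

# Impurity-based importances are normalised to sum to 1.
ranked = sorted(zip(feature_names, forest.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```

Impurity-based scores tend to favour high-cardinality numerical attributes, so a ranking of this kind is best read as a rough guide rather than a causal statement.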
RF-based CVD prediction model (trained on 1169 records and tested on 501 records) is hosted on cloud and can be easily accessed at das.southeastasia.cloudapp.azure.com/predict/
The input attributes of the patient are entered into the system. The system predicts if the patient has low risk of CVDs or high risk. Sample screenshots of the result obtained using the prediction system are shown in Fig. 4 .
Using cardiovascular disease (CVD) prediction model to test the risk of CVDs. The medical practitioner enters the patient's clinical parameters as well as attributes related to his lifestyle to predict the risk of CVD.
In recent years, substantial research has been carried out to build methods for diagnosing heart diseases in their early stages. Various feature selection techniques were applied by Takci (2018), 21 and the resulting prediction system attained an accuracy of 84.81%. A similar study was carried out by Kausar et al., and an accuracy of 88.41% was obtained. 22 The prediction system developed by Khalid Raza using an ensembling technique (2019) attained an accuracy of 88.88%. 23 A similar accuracy of 89% was achieved by the prediction system developed by Haq et al. in 2019. 24 Using an artificial neural network to design a prediction system, Alic et al. achieved an accuracy of 91%. 25 Importantly, however, the prediction systems developed in these studies do not work effectively for the Indian population, as the models are based on data collected from Western countries and do not take into consideration lifestyle-related risk factors responsible for CVDs (lack of physical activity, family history, alcohol etc.). Moreover, these systems rely on the results of medical tests such as ECG, treadmill tests and fluoroscopy, which are not feasible in Indian primary health centres in the existing scenario.
The accuracy attained in the present study is 93.8%. The prediction system developed in this research uses 13 clinical parameters to identify a person's risk of having heart disease. Unlike the studies quoted previously, this study was carried out on an Indian population and considers potential risk factors such as high body weight, lack of exercise, psychological stress, family history, smoking and alcohol consumption. The system developed in this study is also highly cost-effective compared with earlier studies, as expensive tests like fluoroscopy and treadmill tests are not required. Easy accessibility of the prediction system via the Internet is a further notable feature of this study that was not reported by earlier studies. It is worth mentioning that the prediction model developed in this pilot study predicts its output according to the attribute trends of the study population it was trained on. Once the ML models are trained and tested on voluminous data sets, they can be used as a screening tool in rural India and can help in the prevention of CVDs.
Cost-effectiveness, excellent performance and easy accessibility of the prediction system via Internet defend the use of ML-based prediction system as a screening tool for CVD detection in India.
To the best of our knowledge, this study is the first of its kind in the Indian context. Developed countries like the United Kingdom and the United States are investing resources in research to develop ML-based prediction models for diagnosing heart diseases in primary healthcare centres. 26 , 27 It is recommended that similar studies be promoted in India. The current national health policy (2017) of our government, which lays stress on preventive health, will be more meaningful and fruitful if advances in this field are made as early as possible. 28 We propose larger multicentric studies for the development of AI prediction systems for CVD screening in our country, which faces an ever-increasing load of morbidity and mortality due to CVD being detected in late, advanced stages. Premier institutes of medicine and technology can collaborate in this regard to diagnose other lifestyle diseases and non-communicable diseases like malignancies. The Cardiological Society of India (CSI) can help in this regard. Other modern techniques like artificial neural networks can be applied to further improve the performance of the system.
This study used a data set of 1670 patients reporting to a tertiary care private setup in a south Indian metropolitan city, where largely the higher income group seeks medical care. This may appear biased to the reader, but the study was aimed only at assessing the robustness of a prediction model based on ML. The results obtained from the prediction system developed in this study are based on the attribute trends of the study population on which the model was trained. In future, the model needs to be trained on large data sets collected from diverse regions before it is used as a screening tool.
The study portrays the capability of ML algorithms to predict CVDs in the Indian population. Issues of affordability and accessibility in the healthcare sector of India can be addressed using ML-based models, which can be easily accessed via the Internet even in the rural parts of the country. It is proposed to build and test the performance of similar systems using voluminous cardiac data sets covering all economic sections of society and collected from various regions of India. We recommend similar multicentric studies across the entire country. To achieve the sustainable development goals laid down by the World Health Organization, it is high time that we as a country take advantage of ML-based prediction systems to improve the preventive care aspect of the public healthcare system. 29
ML-based tools have shown remarkable performance in diagnosing various serious diseases in initial stages in healthcare centres of developed countries.
An indigenous high-performance ML-based CVD prediction system easily accessible via Internet is proposed for existing Indian healthcare system. Healthcare in India can be made more affordable and accessible using ML-based prediction systems.
The authors have none to declare.
The authors express their heartfelt gratitude to Sagar Hospitals, Jayanagar, Bengaluru, for providing anonymized information of patients' health parameters for carrying out this study. No funding was received for this project.
2 role and impact of temperature, humidity and leaf wetness on grape plant, 3 literature survey.
Reference | Plant | Data analysis | Technique used | Diseases covered | Accuracy |
---|---|---|---|---|---|
6 | Grape | HMM | Zigbee | Downy, Powdery | Downy 90.9%, Powdery 90.9% |
9 | Tomato | SVM, KNN, Random forest | IoT | Disorder detection | 99.6% using Random forest |
42 | Tea | Multiple Linear Regression | IoT | Blister disease detection | 91% |
15 | Grape | N/A | IoT | Downy, Powdery | Downy 94.4%, Powdery 96% |
8 | Multiple | KNN, RF, LR | IoT | Plant disease | 91% using KNN |
29 | Grapevine | ANN | IP | Downy, Powdery, Black Rot | Downy 90.47%, Powdery 92.85% |
32 | Grape | SVM, RF, AdaBoost | | Leaf disease | 93% using SVM |
33 | Crop | Fuzzy Logic | Wi-Fi | Climate prediction | |
5 methodology.
Sr. No | Component Description | Specification |
---|---|---|
1 | Power Supply: Battery | 40A, 14.8 V - 16.8 V |
2 | Temperature Sensor | DHT11 |
3 | Leaf wetness Sensor | - |
4 | GSM Module | SIM800L GSM/GPRS, 4 V |
5 | Node MCU | ESP8266, 16 Digital Pins, Analog −1 Pin |
6 | LCD Display | 16X2 |
Calibration, 6.1 experimental setup.
7.1 dataset generated by system.
Date | Time | Temperature | Humidity | Leaf Wetness |
---|---|---|---|---|
7/14/2023 | 9:11:01 | 28.1 | 67 | 0 |
7/14/2023 | 9:11:05 | 28.1 | 67 | 0 |
7/14/2023 | 9:11:10 | 28.1 | 67 | 0 |
7/14/2023 | 9:11:17 | 28.1 | 66 | 0 |
7/14/2023 | 9:11:22 | 28.1 | 66 | 0 |
7/14/2023 | 9:11:26 | 28.1 | 66 | 0 |
7/14/2023 | 9:11:31 | 28.2 | 66 | 0 |
7/14/2023 | 9:11:35 | 28.2 | 66 | 0 |
7/14/2023 | 9:11:39 | 28.2 | 66 | 0 |
7/14/2023 | 9:11:44 | 28.2 | 66 | 0 |
7/14/2023 | 9:11:48 | 28.2 | 66 | 0 |
7/14/2023 | 9:11:52 | 28.2 | 66 | 0 |
7/14/2023 | 9:11:57 | 28.3 | 65 | 0 |
7/14/2023 | 9:12:02 | 28.3 | 65 | 0 |
7/14/2023 | 9:12:02 | 28.3 | 65 | 0 |
7/14/2023 | 9:12:06 | 28.3 | 65 | 0 |
7/14/2023 | 9:12:11 | 28.3 | 65 | 0 |
7/14/2023 | 9:12:16 | 28.3 | 65 | 0 |
Measure | Percentage % |
---|---|
Accuracy | 98.25 |
Recall | 98.3 |
Precision | 98.3 |
Measure | Percentage % |
---|---|
Accuracy | 98.85 |
Recall | 98.9 |
Precision | 97.7 |
Measure | Percentage % |
---|---|
Accuracy | 93.95 |
Recall | 94.0 |
Precision | 94.4 |
Parameters/Authors | Patil & Thorat | K. Sanghavi | Proposed |
---|---|---|---|
Powdery Mildew | Yes | Yes | Yes |
Downy Mildew | Yes | Yes | Yes |
Bacterial Leaf Spot | Yes | No | Yes |
Technique Used | IoT | IoT | IoT |
Cloud Based | No | Yes | Yes |
Accuracy | Downy 90.9%, Powdery 90.9% | Downy 94.4%, Powdery 96% | Downy 98.85%, Powdery 98.25%, Bacterial Leaf Spot 93.95% |
Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1191)
Agriculture ensures that everyone has enough to eat even if the global population suddenly doubles. Prediction of plant diseases at an early stage is important in agriculture, since it is crucial to providing food for the general population. Unfortunately, early disease forecasting is not possible for crops. The purpose of this study is to inform agriculturalists of recent advances in the fight against plant leaf diseases. To identify leaf diseases in tomato plants, an accurate methodology was developed using machine learning and image processing approaches. The authors propose a method for detecting disease in plants and crops that employs digital image processing and machine learning. The system distinguishes between healthy and unhealthy plant photos using a machine learning model known as the Support Vector Machine (SVM). The informational properties of leaf samples are extracted using different descriptors, including the Discrete Wavelet Transform (DWT), Principal Component Analysis (PCA) and the Grey-Level Co-occurrence Matrix (GLCM). The suggested technique, which integrates DWT, PCA, GLCM and Convolutional Neural Networks (CNN), gives the best performance, with an F1 score of 99%, 99% accuracy, 98% precision and 99% recall.
Authors and affiliations.
Department of Computer Engineering, Vishwakarma Institute of Information Technology, SPPU Pune, Pune, India
Dattatray G. Takale, Chitrakant B. Banchhor, Piyush P. Gawali, Vajid Khan & Vikas B. Maral
Department of AI & DS, Vishwakarma Institute of Information Technology, SPPU Pune, Pune, India
Parishit N. Mahalle
Vishwakarma Institute of Information Technology, SPPU Pune, Pune, India
Vivek Deshpande
Department of Computer Engineering, KJ College of Engineering and Management Research, SPPU Pune, Pune, India
Gopal Deshmukh
Correspondence to Dattatray G. Takale .
Editors and affiliations.
Department of Computer Science, University of South Dakota, Vermillion, SD, USA
K. C. Santosh
Department of Computer Applications, National Institute of Technology Kuruks, Kurukshetra, Haryana, India
Sandeep Kumar Sood
School of Science and Technology, Bournemouth University, Poole, UK
Hari Mohan Pandey
Manav Rachna International Institute of Research and Studies, Faridabad, Haryana, India
Charu Virmani
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper.
Takale, D.G. et al. (2024). Image Processing and Machine Learning for Plant Disease Detection. In: Santosh, K.C., Sood, S.K., Pandey, H.M., Virmani, C. (eds) Advances in Artificial-Business Analytics and Quantum Machine Learning. COMITCON 2023. Lecture Notes in Electrical Engineering, vol 1191. Springer, Singapore. https://doi.org/10.1007/978-981-97-2508-3_45
DOI : https://doi.org/10.1007/978-981-97-2508-3_45
Published : 19 September 2024
Publisher Name : Springer, Singapore
Print ISBN : 978-981-97-2507-6
Online ISBN : 978-981-97-2508-3
eBook Packages : Computer Science Computer Science (R0)
Based on the symptoms, age, and gender of an individual, the diagnosis system gives the output as the disease that the individual might be suffering from. The weighted KNN algorithm gave the best results as compared to the other algorithms. The accuracy of the weighted KNN algorithm for the prediction was 93.5 %.
the disease is omitted mistakenly from the consideration. Machine learning (ML) is used practically everywhere, from cutting-edge technology (such as mobile phones, computers, and robotics) to health care (i.e., disease diagnosis, safety). ML is gaining popularity in various fields, including disease diagnosis in health care.
Disease Prediction Using Machine Learning. * Research Gate Link: Marouane Fethi Ferjani. Computing Department. Bournemouth University. Bournemouth, England. [email protected]. Abstract ...
The numbers of disease prediction papers using XGBoost with medical data have increased recently 33,34,35,36. XGBoost is an algorithm that overcomes the shortcomings of GBM (gradient boosting ...
Study characteristics. Table 2 shows the basic characteristics of the included studies. In total, our meta-analysis of ML and cardiovascular diseases included 103 cohorts (55 studies) with a total ...
Machine learning models are used to create and enhance various disease prediction frameworks. Ensemble learning is a machine learning technique that combines multiple classifiers to improve performance by making more accurate predictions than a single classifier. Although numerous studies have employed ensemble approaches for disease prediction, there is a lack of thorough assessment of ...
Background Supervised machine learning algorithms have been a dominant method in the data mining field. Disease prediction using health data has recently shown a potential application area for these methods. This study aims to identify the key trends among different types of supervised machine learning algorithms, and their performance and usage for disease risk prediction. Methods In this ...
Section 2 of this paper will introduce the theories, development and disease application cases of two kinds of structured data algorithms, ANN and FM-Deep Learning. Section 3 will introduce the theories, development and disease application cases of CNN and RNN. Section 4 will respectively introduce the current defects in the field of disease prediction algorithms and the coping strategies.
Abstract: Machine learning (ML) refers to the science and engineering of artificially intelligent systems, providing them with the capability to learn without being explicitly programmed. In recent years, ML in the healthcare domain has made great advancements in the early predictions of many critical illnesses. While there have been significant contributions to single disease prediction ...
Here we propose an approach that leverages deep generative models to predict variant pathogenicity without relying on labels. By modelling the distribution of sequence variation across organisms ...
In this paper we propose a complete multiple disease prediction system that makes accurate predictions of diabetes, cancer, and heart disease using machine learning algorithms. The system's ...
Cardiovascular disease refers to any critical condition that impacts the heart. Because heart diseases can be life-threatening, researchers are focusing on designing smart systems to accurately diagnose them based on electronic health data, with the aid of machine learning algorithms. This work presents several machine learning approaches for predicting heart diseases, using data of major ...
The aim of the study is to show whether it is possible to predict infectious disease outbreaks early, by using machine learning. This study was carried out following the guidelines of the Cochrane Collaboration and the meta-analysis of observational studies in epidemiology and the preferred reporting items for systematic reviews and meta-analyses. The suitable bibliography on PubMed/Medline ...
For the analysis, a sample of 4920 patient records with 41 disorders was chosen. A total of 41 diseases made up the dependent variable. We enhanced 95 of the 132 independent variables (symptoms) that are closely related to illnesses. This paper illustrates a disease prediction system constructed using the Random Forest Machine Learning algorithm.
Our paper is part of the research on the detection and prediction of heart disease. It is based on the application of machine learning algorithms, of which we have chosen the 3 most used ...
Heart diseases are consistently ranked among the top causes of mortality on a global scale. Early detection and accurate heart disease prediction can help effectively manage and prevent the disease. However, the traditional methods have failed to improve heart disease classification performance. So, this article proposes a machine learning approach for heart disease prediction (HDP) using a ...
The identification and prediction of such diseases at their early stages is very important in order to prevent them from becoming severe. It is often difficult for doctors to identify these diseases accurately by manual means. The goal of this paper is to identify and predict patients with the more common chronic illnesses.
1. "Multiple Disease Prediction Using Machine Learning Algorithms" by Chauhan et al. (2021): This paper investigates using various ML algorithms, including SVM and Decision Trees, for multiple disease prediction, focusing on symptoms as input. It examines the performance of these algorithms on four diseases, including heart disease and diabetes.
Globally, cardiovascular diseases (CVDs) are the primary cause of morbidity and mortality, accounting for more than 70% of all fatalities. According to the 2017 Global Burden of Disease study, cardiovascular disease is responsible for about 43% of all fatalities [1,2]. Common risk factors for heart disease in high-income nations include poor diet, cigarette use, excessive sugar consumption ...
Chao Tan et al. [1] explored the feasibility of using decision stumps as weak classifiers, combined through AdaBoost (a machine learning ensemble method) and trace element analysis, for the early prediction of lung cancer. For illustration, a cancer dataset was used in which 9 trace elements were identified in 122 urine samples.
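How AdaBoost turns decision stumps into a stronger classifier can be sketched as follows. This is a generic, minimal AdaBoost implementation in plain Python under stated assumptions, not the implementation used by the cited authors: labels are taken to be +1 or -1, and the one-feature toy data set is made up for illustration and is unrelated to the urine-sample data.

```python
import math

def stump_predict(row, j, thr, sign):
    """Decision stump: predict `sign` if feature j >= thr, otherwise -sign."""
    return sign if row[j] >= thr else -sign

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, thr, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def adaboost_fit(X, y, n_rounds):
    """Fit up to n_rounds stumps, reweighting the samples after each round."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []
    for _ in range(n_rounds):
        err, j, thr, sign = train_stump(X, y, w)
        if err >= 0.5:                     # no better than chance: stop
            break
        err = max(err, 1e-10)              # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        # Increase weights of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    """Sign of the alpha-weighted sum of the stump votes."""
    score = sum(a * stump_predict(row, j, thr, s) for a, j, thr, s in model)
    return 1 if score >= 0 else -1

# Toy data: one feature, labels that no single threshold can separate.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single_stump_error = train_stump(X, y, [0.1] * 10)[0]
model = adaboost_fit(X, y, n_rounds=3)
predictions = [adaboost_predict(model, row) for row in X]
print("best single stump weighted error:", round(single_stump_error, 3))
print("boosted training accuracy:",
      sum(p == t for p, t in zip(predictions, y)) / len(y))
```

On this toy set the best single stump misclassifies 3 of the 10 points, whereas the weighted vote of three stumps fits all 10 training points. Training accuracy on a constructed example says nothing about performance on real clinical data, which would need a held-out test set.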
This research work demonstrates a disease prediction system developed using machine learning algorithms such as the Decision Tree classifier, the Random Forest classifier, and the Naïve Bayes classifier. The paper presents a comparative study of the results of these algorithms. Published in: 2020 ...
Today, heart disease is the leading cause of death. The annual death rate from coronary heart disease decreased by 31.8% between 2006 and 2016. Age-adjusted death rate from coronary heart disease per 100,000 people. Thus, a high-precision system that can be used as an analytical tool to find hidden patterns of heart problems in medical data ...
6 Conclusion. This paper reviews the deep learning algorithms in the field of disease prediction. According to the type of data processed, the algorithms are divided into structured data algorithms and unstructured data algorithms. Structured data algorithms include ANN and FM-Deep Learning algorithms.
In their research paper, Anuja Kumari et al. concluded that, using a support vector machine and the Pima Indian Diabetes Dataset with Matlab R2010a, the classifier can predict diabetes with the best possible cost and effectiveness. ... Diabetes disease prediction using machine learning on big data of healthcare. In ...
Introduction. Cardiovascular diseases (CVDs) are the leading cause of disease burden and mortality all over the world. Approximately 30% of total deaths (17.9 million) occurred due to CVDs globally in 2016. 1 The situation is critically serious in low- and middle-income countries like India. During the past three decades, the number of deaths due to CVDs has increased significantly from 15.2 ...
Machine learning with IoT practices in the agriculture sector has the potential to address numerous challenges encountered by farmers, including disease prediction and estimation of soil profile. This paper extensively explores the classification of diseases in grape plants and provides detailed information about the conducted experiments. It is important to keep track of each crop's current ...
The dataset in question is referred to as Plant Village, and it is a publicly accessible dataset that was curated for the purpose of identifying plant leaf diseases by Sharada P. Mohanty et al. (Abbas et al. 2017). There are a total of 38 separate plant disease categories included in this collection, which is comprised of 87,000 RGB photographs of healthy and sick plant leaves.