  • Review Article
  • Published: 19 April 2022

The future of early cancer detection

  • Rebecca C. Fitzgerald   ORCID: orcid.org/0000-0002-3434-3568 1 ,
  • Antonis C. Antoniou 2 ,
  • Ljiljana Fruk 3 &
  • Nitzan Rosenfeld   ORCID: orcid.org/0000-0002-2825-4788 4  

Nature Medicine volume 28, pages 666–677 (2022)

A proactive approach to detecting cancer at an early stage can make treatments more effective, with fewer side effects and improved long-term survival. However, as detection methods become increasingly sensitive, it can be difficult to distinguish inconsequential changes from lesions that will lead to life-threatening cancer. Progress relies on a detailed understanding of individualized risk, clear delineation of cancer development stages, a range of testing methods with optimal performance characteristics, and robust evaluation of the implications for individuals and society. In the future, advances in sensors, contrast agents, molecular methods, and artificial intelligence will help detect cancer-specific signals in real time. To reduce the burden of cancer on society, risk-based detection and prevention needs to be cost effective and widely accessible.

Martincorena, I. et al. Somatic mutant clones colonize the human esophagus with age. Science 362 , 911–917 (2018).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Tomasetti, C. & Vogelstein, B. Cancer etiology. Variation in cancer risk among tissues can be explained by the number of stem cell divisions. Science 347 , 78–81 (2015).

Yokoyama, A. et al. Age-related remodelling of oesophageal epithelia by mutated cancer drivers. Nature 565 , 312–317 (2019).

Article   CAS   PubMed   Google Scholar  

Krimmel, J. D. et al. Ultra-deep sequencing detects ovarian cancer cells in peritoneal fluid and reveals somatic TP53 mutations in noncancerous tissues. Proc. Natl Acad Sci. USA 113 , 6005–6010 (2016).

Hu, Z. et al. Quantitative evidence for early metastatic seeding in colorectal cancer. Nat. Genet. 51 , 1113–1122 (2019).

Turajlic, S. et al. Tracking cancer evolution reveals constrained routes to metastases: TRACERx Renal. Cell 173 , 581–594.e512 (2018).

Turajlic, S., Sottoriva, A., Graham, T. & Swanton, C. Resolving genetic heterogeneity in cancer. Nat. Rev. Genet. 20 , 404–416 (2019).

Dobrow, M. J., Hagens, V., Chafe, R., Sullivan, T. & Rabeneck, L. Consolidated principles for screening based on a systematic review and consensus process. Can. Med. Assoc. J. 190 , E422–E429 (2018).

Article   Google Scholar  

Welch, H. G., Kramer, B. S. & Black, W. C. Epidemiologic signatures in cancer. N. Engl. J. Med. 381 , 1378–1386 (2019).

Article   PubMed   Google Scholar  

Pashayan, N., Morris, S., Gilbert, F. J. & Pharoah, P. D. P. Cost-effectiveness and benefit-to-harm ratio of risk-stratified screening for breast cancer: a life-table model. JAMA Oncol. 4 , 1504–1510 (2018).

Article   PubMed   PubMed Central   Google Scholar  

UK National Screening Committee. Adult screening programme: bowel cancer. https://view-health-screening-recommendations.service.gov.uk/bowel-cancer

US Preventive Services Task Force. Screening for prostate cancer: US Preventive Services Task Force recommendation statement. J. Am. Med. Assoc. 319 , 1901–1913 (2018).

Welch, H. G., Prorok, P. C., O’Malley, A. J. & Kramer, B. S. Breast-cancer tumor size, overdiagnosis, and mammography screening effectiveness. N. Engl. J. Med. 375 , 1438–1447 (2016).

UK National Screening Committee. Adult screening programme: prostate cancer. https://view-health-screening-recommendations.service.gov.uk/prostate-cancer/

Marmot, M. G. et al. The benefits and harms of breast cancer screening: an independent review. Br. J. Cancer 108 , 2205–2240 (2013).

Kuchenbaecker, K. B. et al. Risks of breast, ovarian, and contralateral breast cancer for BRCA1 and BRCA2 mutation carriers. J. Am. Med. Assoc . 317 , 2402–2416 (2017).

Pharoah, P. D. et al. Polygenic susceptibility to breast cancer and implications for prevention. Nat. Genet. 31 , 33–36 (2002).

Amos, C. I. et al. The OncoArray Consortium: a network for understanding the genetic architecture of common cancers. Cancer Epidemiol. Biomark. Prev. 26 , 126–135 (2017).

Bahcall, O. G. iCOGS collection provides a collaborative model. Foreword. Nat. Genet. 45 , 343 (2013).

Wand, H. et al. Improving reporting standards for polygenic scores in risk prediction studies. Nature 591 , 211–219 (2021).

Adeyemo, A. et al. Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. Nat. Med. 27 , 1876–1884 (2021).

Article   CAS   Google Scholar  

Pashayan, N. et al. Personalized early detection and prevention of breast cancer: ENVISION consensus statement. Nat. Rev. Clin. Oncol. 17 , 687–705 (2020).

McGeoch, L. et al. Risk prediction models for colorectal cancer incorporating common genetic variants: a systematic review. Cancer Epidemiol. Biomark. Prev. 28 , 1580–1593 (2019).

Harrison, H. et al. Risk prediction models for kidney cancer: a systematic review. Eur. Urol. Focus 7 , 1380–1390 (2020).

Lee, A. et al. BOADICEA: a comprehensive breast cancer risk prediction model incorporating genetic and nongenetic risk factors. Genet. Med. 21 , 1708–1718 (2019).

Lee, A. et al. Comprehensive epithelial tubo-ovarian cancer risk prediction model incorporating genetic and epidemiological risk factors. J. Med. Genet. https://doi.org/10.1136/jmedgenet-2021-107904 (2021).

Maas, P. et al. Breast cancer risk from modifiable and nonmodifiable risk factors among white women in the United States. JAMA Oncol. 2 , 1295–1302 (2016).

Tyrer, J., Duffy, S. W. & Cuzick, J. A breast cancer prediction model incorporating familial and personal risk factors. Stat. Med. 23 , 1111–1130 (2004).

Hurson, A. N. et al. Prospective evaluation of a breast-cancer risk model integrating classical risk factors and polygenic risk in 15 cohorts from six countries. Int. J. Epidemiol. dyab036 (2021).

Clift, A. K. et al. The current status of risk-stratified breast screening. Br. J. Cancer 126 , 533–550 (2021).

Aleshin-Guendel, S., Lange, J., Goodman, P., Weiss, N. S. & Etzioni, R. A latent disease model to reduce detection bias in cancer risk prediction studies. Evaluation Health Prof. 44 , 42–49 (2021).

Shen, Y., Dong, W., Gulati, R., Ryser, M. D. & Etzioni, R. Estimating the frequency of indolent breast cancer in screening trials. Stat. Methods Med. Res. 28 , 1261–1271 (2019).

Trentham-Dietz, A. et al. Reflecting on 20 years of breast cancer modeling in CISNET: recommendations for future cancer systems modeling efforts. PLoS Comput. Biol. 17 , e1009020 (2021).

Shieh, Y. et al. Breast cancer screening in the precision medicine era: risk-based screening in a population-based trial. J. Natl Cancer Inst. https://doi.org/10.1093/jnci/djw290 (2017).

Ghanouni, A. et al. Attitudes towards risk-stratified breast cancer screening among women in England: a cross-sectional survey. J. Med. Screen 27 , 138–145 (2020).

Pashayan, N. et al. Should age-dependent absolute risk thresholds be used for risk stratification in risk-stratified breast cancer screening? J. Pers. Med 11 , 916 (2021).

Falcaro, M. et al. The effects of the national HPV vaccination programme in England, UK, on cervical cancer and grade 3 cervical intraepithelial neoplasia incidence: a register-based observational study. Lancet 398 , 2084–2092 (2021).

Arbyn, M. et al. 2020 list of human papillomavirus assays suitable for primary cervical cancer screening. Clin. Microbiol. Infect. 27 , 1083–1095 (2021).

WHO. A cervical cancer-free future: First-ever global commitment to eliminate a cancer. https://www.who.int/news/item/17-11-2020-a-cervical-cancer-free-future-first-ever-global-commitment-to-eliminate-a-cancer (2020).

Mazzone, P. J. et al. Early candidate nasal swab classifiers developed using machine learning and whole transcriptome sequencing may improve early lung cancer detection. J. Clin. Oncol. 39 , 8551–8551 (2021).

Sarkeala, T. et al. Piloting gender-oriented colorectal cancer screening with a faecal immunochemical test: population-based registry study from Finland. BMJ Open 11 , e046667 (2021).

Baldacchini, F. et al. Results of compliant participation in five rounds of fecal immunochemical test screening for colorectal cancer. Clin. Gastroenterol. Hepatol. 19 , 2361–2369 (2021).

Imperiale, T. F. et al. Multitarget stool DNA testing for colorectal-cancer screening. N. Engl. J. Med. 370 , 1287–1297 (2014).

Nieuwenburg, S. A. V. et al. Accuracy of H. pylori fecal antigen test using fecal immunochemical test (FIT). Gastric Cancer https://doi.org/10.1007/s10120-021-01264-8 (2021).

Fitzgerald, R. C. et al. Cytosponge-trefoil factor 3 versus usual care to identify Barrett’s oesophagus in a primary care setting: a multicentre, pragmatic, randomised controlled trial. Lancet 396 , 333–344 (2020).

Gehrung, M. et al. Triage-driven diagnosis of Barrett’s esophagus for early detection of esophageal adenocarcinoma using deep learning. Nat. Med. 27 , 833–841 (2021).

Menon, U. et al. Ovarian cancer population screening and mortality after long-term follow-up in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial. Lancet 397 , 2182–2193 (2021).

Eklund, M. et al. MRI-targeted or standard biopsy in prostate cancer screening. N. Engl. J. Med. 385 , 908–920 (2021).

Kasivisvanathan, V. et al. MRI-targeted or standard biopsy for prostate-cancer diagnosis. N. Engl. J. Med. 378 , 1767–1777 (2018).

Hugosson, J. et al. A 16-yr follow-up of the European randomized study of screening for prostate cancer. Eur. Urol. 76 , 43–51 (2019).

Martin, R. M. et al. Effect of a low-intensity PSA-based screening intervention on prostate cancer mortality: the CAP randomized clinical trial. J. Am. Med. Assoc. 319 , 883–895 (2018).

Van Poppel, H. et al. A European model for an organised risk-stratified early detection programme for prostate cancer. Eur. Urol. Oncol. 4 , 731–739 (2021).

Lenaerts, L. et al. Comprehensive genome-wide analysis of routine non-invasive test data allows cancer prediction: a single-center retrospective analysis of over 85,000 pregnancies. EClinicalMedicine 35 , 100856 (2021).

Abbosh, C., Swanton, C. & Birkbak, N. J. Clonal haematopoiesis: a source of biological noise in cell-free DNA analyses. Ann. Oncol. 30 , 358–359 (2019).

Cohen, J. D. et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 359 , 926–930 (2018).

Newman, A. M. et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat. Med. 20 , 548–554 (2014).

Chan, K. C. A. et al. Analysis of plasma Epstein–Barr virus DNA to screen for nasopharyngeal cancer. N. Engl. J. Med. 377 , 513–522 (2017).

Moss, J. et al. Comprehensive human cell-type methylation atlas reveals origins of circulating cell-free DNA in health and disease. Nat. Commun. 9 , 5068 (2018).

Shen, S. Y. et al. Sensitive tumour detection and classification using plasma cell-free DNA methylomes. Nature 563 , 579–583 (2018).

Liu, M. C., Oxnard, G. R., Klein, E. A., Swanton, C. & Seiden, M. V. Sensitive and specific multi-cancer detection and localization using methylation signatures in cell-free DNA. Ann. Oncol. 31 , 745–759 (2020).

Lo, Y. M. D., Han, D. S. C., Jiang, P. & Chiu, R. W. K. Epigenetics, fragmentomics, and topology of cell-free DNA in liquid biopsies. Science 372 , eaaw3616 (2021).

Mathios, D. et al. Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat. Commun. 12 , 5060 (2021).

Mouliere, F. et al. Enhanced detection of circulating tumor DNA by fragment size analysis. Sci. Transl. Med. 10 , eaat4921 (2018).

Peneder, P. et al. Multimodal analysis of cell-free DNA whole-genome sequencing for pediatric cancers with low mutational burden. Nat. Commun. 12 , 3230 (2021).

Lennon, A. M. et al. Feasibility of blood testing combined with PET-CT to screen for cancer and guide intervention. Science 369 , eabb9601 (2020).

Oren, O., Blankstein, R. & Bhatt, D. L. Incidental imaging findings in clinical trials. JAMA 323 , 603–604 (2020).

Augustine, R. et al. Imaging cancer cells with nanostructures: prospects of nanotechnology driven non-invasive cancer diagnosis. Adv. Colloid Interface Sci. 294 , 102457 (2021).

Ehlerding, E. B., Grodzinski, P., Cai, W. & Liu, C. H. Big potential from small agents: nanoparticles for imaging-based companion diagnostics. ACS Nano 12 , 2106–2121 (2018).

Liu, M., Anderson, R.-C., Lan, X., Conti, P. S. & Chen, K. Recent advances in the development of nanoparticles for multimodality imaging and therapy of cancer. Medicinal Res. Rev. 40 , 909–930 (2020).

Wang, W. et al. Spiky Fe 3 O 4 @Au supraparticles for multimodal in vivo imaging. Adv. Funct. Mater. 28 , 1800310 (2018).

Hao, L. et al. Microenvironment-triggered multimodal precision diagnostics. Nat. Mater. 20 , 1440–1448 (2021).

Koudrina, A. & DeRosa, M. C. Advances in medical imaging: aptamer- and peptide-targeted MRI and CT contrast agents. ACS Omega 5 , 22691–22701 (2020).

Yuan, Y. et al. Furin-mediated self-assembly of olsalazine nanoparticles for targeted raman imaging of tumors. Angew. Chem. Int. Ed. 60 , 3923–3927 (2021).

Sood, R. et al. Ultrasound for breast cancer detection globally: a systematic review and meta-analysis. J. Glob. Oncol. 5 , 1–17 (2019).

PubMed   Google Scholar  

Abou-Elkacem, L., Bachawal, S. V. & Willmann, J. K. Ultrasound molecular imaging: moving toward clinical translation. Eur. J. Radiol. 84 , 1685–1693 (2015).

Willmann, J. K. et al. Ultrasound molecular imaging with BR55 in patients with breast and ovarian lesions: first-in-human results. J. Clin. Oncol. 35 , 2133–2140 (2017).

Wang, Y. et al. Molecular imaging of orthotopic prostate cancer with nanobubble ultrasound contrast agents targeted to PSMA. Sci. Rep. 11 , 4726 (2021).

Zhang, T. et al. One-pot synthesis of hollow PDA@DOX nanoparticles for ultrasound imaging and chemo-thermal therapy in breast cancer. Nanoscale 11 , 21759–21766 (2019).

Duran-Sierra, E. et al. Clinical label-free biochemical and metabolic fluorescence lifetime endoscopic imaging of precancerous and cancerous oral lesions. Oral. Oncol. 105 , 104635 (2020).

Pence, I. & Mahadevan-Jansen, A. Clinical instrumentation and applications of Raman spectroscopy. Chem. Soc. Rev. 45 , 1958–1979 (2016).

Nicolson, F., Kircher, M. F., Stone, N. & Matousek, P. Spatially offset Raman spectroscopy for biomedical applications. Chem. Soc. Rev. 50 , 556–568 (2021).

Nicolson, F. et al. Non-invasive in vivo imaging of cancer using surface-enhanced spatially offset Raman spectroscopy (SESORS). Theranostics 9 , 5899–5913 (2019).

Wang, C. et al. Cathespin B-initiated cypate nanoparticle formation for tumor photoacoustic imaging. Angew. Chem. Int. Ed. Engl. 61 , e202114766 (2022).

Brown, E., Brunker, J. & Bohndiek, S. E. Photoacoustic imaging as a tool to probe the tumour microenvironment. Dis. Models Mech . 12 , dmm039636 (2019).

Weber, J., Beard, P. C. & Bohndiek, S. E. Contrast agents for molecular photoacoustic imaging. Nat. Methods 13 , 639–650 (2016).

Vukajlović, J. M. & Panić-Janković, T. in Mass Spectrometry in Life Sciences and Clinical Laboratory (ed. Mitulović, G.) Ch. 5 (IntechOpen, 2021).

Banerjee, S. et al. Diagnosis of prostate cancer by desorption electrospray ionization mass spectrometric imaging of small metabolites and lipids. Proc. Natl Acad. Sci. USA 114 , 3334–3339 (2017).

Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144 , 646–674 (2011).

Böhm, D. et al. Comparison of tear protein levels in breast cancer patients and healthy controls using a de novo proteomic approach. Oncol. Rep. 28 , 429–438 (2012).

Wu, C.-C., Chu, H.-W., Hsu, C.-W., Chang, K.-P. & Liu, H.-P. Saliva proteome profiling reveals potential salivary biomarkers for detection of oral cavity squamous cell carcinoma. Proteomics 15 , 3394–3404 (2015).

Komor, M. A. et al. Proteins in stool as biomarkers for non-invasive detection of colorectal adenomas with high risk of progression. J. Pathol. 250 , 288–298 (2020).

Kwong, G. A. et al. Synthetic biomarkers: a twenty-first century path to early cancer detection. Nat. Rev. Cancer 21 , 655–668 (2021).

Mahmoudi, T., de la Guardia, M. & Baradaran, B. Lateral flow assays towards point-of-care cancer detection: a review of current progress and future trends. Trends Anal. Chem. 125 , 115842 (2020).

Bayoumy, S. et al. Glycovariant-based lateral flow immunoassay to detect ovarian cancer–associated serum CA125. Commun. Biol. 3 , 460 (2020).

Sachdeva, S., Davis, R. W. & Saha, A. K. Microfluidic point-of-care testing: commercial landscape and future directions. Front. Bioeng. Biotechnol. 8 , 602659 (2021).

McRae, M. P., Simmons, G., Wong, J. & McDevitt, J. T. Programmable bio-nanochip platform: a point-of-care biosensor system with the capacity to learn. Acc. Chem. Res. 49 , 1359–1368 (2016).

Low, C. A. Harnessing consumer smartphone and wearable sensors for clinical cancer research. NPJ Digit. Med. 3 , 140 (2020).

Yu, X. et al. Skin-integrated wireless haptic interfaces for virtual and augmented reality. Nature 575 , 473–479 (2019).

Yu, X. et al. Needle-shaped ultrathin piezoelectric microsystem for guided tissue targeting via mechanical sensing. Nat. Biomed. Eng. 2 , 165–172 (2018).

Williams, R. M. et al. Noninvasive ovarian cancer biomarker detection via an optical nanosensor implant. Sci. Adv. 4 , eaaq1090 (2018).

Hindley, J. W. et al. Building a synthetic mechanosensitive signaling pathway in compartmentalized artificial cells. Proc. Natl Acad. Sci. USA 116 , 16711–16716 (2019).

He, J. et al. The practical implementation of artificial intelligence technologies in medicine. Nat. Med. 25 , 30–36 (2019).

Huang, S., Yang, J., Fong, S. & Zhao, Q. Artificial intelligence in cancer diagnosis and prognosis: opportunities and challenges. Cancer Lett. 471 , 61–71 (2020).

Ardila, D. et al. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 25 , 954–961 (2019).

Esteva, A. et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 542 , 115–118 (2017).

FDA. Artificial intelligence and machine learning (AI/ML) software as a medical device action plan. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device (2021).

Finlayson, S. G. et al. Adversarial attacks on medical machine learning. Science 363 , 1287–1289 (2019).

Elemento, O., Leslie, C., Lundin, J. & Tourassi, G. Artificial intelligence in cancer research, diagnosis and therapy. Nat. Rev. Cancer 21 , 747–752 (2021).

Conroy, S. M. et al. Racial/ethnic differences in the impact of neighborhood social and built environment on breast cancer risk: the Neighborhoods and Breast Cancer Study. Cancer Epidemiol. Biomark. Prev. 26 , 541–552 (2017).

Welch, H. G. & Fisher, E. S. Income and cancer overdiagnosis — when too much care is harmful. N. Engl. J. Med. 376 , 2208–2209 (2017).

Wegwarth, O., Schwartz, L. M., Woloshin, S., Gaissmaier, W. & Gigerenzer, G. Do physicians understand cancer screening statistics? A national survey of primary care physicians in the United States. Ann. Intern. Med. 156 , 340–349 (2012).

de Koning, H. J. et al. Reduced lung-cancer mortality with volume CT screening in a randomized trial. N. Engl. J. Med. 382 , 503–513 (2020).

Park, Sm. et al. A mountable toilet system for personalized health monitoring via the analysis of excreta. Nat. Biomed. Eng. 4 , 624–635 (2020).

Kruger, S. et al. Advances in cancer immunotherapy 2019 — latest trends. J. Exp. Clin. Cancer Res. 38 , 268 (2019).

Pennycuick, A. et al. Immune surveillance in clinical regression of preinvasive squamous cell lung cancer. Cancer Discov. 10 , 1489–1499 (2020).

Davies, S. & Pearson-Stuttard, J. Whose Health Is It, Anyway? (Oxford University Press, 2020).

Zhang, C., Yan, Y., Zou, Q., Chen, J. & Li, C. Superparamagnetic iron oxide nanoparticles for MR imaging of pancreatic cancer: Potential for early diagnosis through targeted strategies. Asia Pac. J. Clin. Oncol. 12 , 13–21 (2016).

Li, Y. et al. A bioinspired nanoprobe with multilevel responsive T1-weighted MR signal-amplification illuminates ultrasmall metastases. Adv. Mater. 32 , 1906799 (2020).

Yu, B., Choi, B., Li, W. & Kim, D.-H. Magnetic field boosted ferroptosis-like cell death and responsive MRI using hybrid vesicles for cancer immunotherapy. Nat. Commun. 11 , 3637 (2020).

Kostevšek, N. et al. Magneto-liposomes as MRI contrast agents: a systematic study of different liposomal formulations. Nanomaterials 10 , 889 (2020).

Article   CAS   PubMed Central   Google Scholar  

Taylor, R. M. et al. Multifunctional iron platinum stealth immunomicelles: targeted detection of human prostate cancer cells using both fluorescence and magnetic resonance imaging. J. Nanopart. Res. 13 , 4717–4729 (2011).

Botta, M. & Tei, L. Relaxivity enhancement in macromolecular and nanosized gdIII-based MRI contrast agents. Eur. J. Inorg. Chem. 2012 , 1945–1960 (2012).

Jiang, Q. et al. NIR-laser-triggered gadolinium-doped carbon dots for magnetic resonance imaging, drug delivery and combined photothermal chemotherapy for triple negative breast cancer. J. Nanobiotechnology 19 , 64 (2021).

Bouché, M. et al. Recent advances in molecular imaging with gold nanoparticles. Bioconjugate Chem. 31 , 303–314 (2020).

Kinsella, J. M. et al. X-ray computed tomography imaging of breast cancer by using targeted peptide-labeled bismuth sulfide nanoparticles. Angew. Chem. Int. Ed. 50 , 12308–12311 (2011).

Hallouard, F. et al. Radiopaque iodinated nano-emulsions for preclinical X-ray imaging. RSC Adv. 1 , 792–801 (2011).

Karunamuni, R. et al. Development of silica-encapsulated silver nanoparticles as contrast agents intended for dual-energy mammography. Eur. Radiol. 26 , 3301–3309 (2016).

Al Zaki, A. et al. Gold-loaded polymeric micelles for computed tomography-guided radiation therapy treatment and radiosensitization. ACS Nano 8 , 104–112 (2014).

Oh, M. H. et al. Large-scale synthesis of bioinert tantalum oxide nanoparticles for X-ray computed tomography imaging and bimodal image-guided sentinel lymph node mapping. JACS 133 , 5508–5515 (2011).

Pan, D. et al. An early investigation of ytterbium nanocolloids for selective and quantitative “multicolor” spectral CT imaging. ACS Nano. 6 , 3364–3370 (2012).

Ramos-Membrive, R. et al. In vivo SPECT-CT imaging and characterization of technetium-99m-labeled bevacizumab-loaded human serum albumin pegylated nanoparticles. J. Drug Deliv. Sci. Technol. 64 , 101809 (2021).

Pérez-Medina, C. et al. PET imaging of tumor-associated macrophages with 89 Zr-labeled high-density lipoprotein nanoparticles. J. Nucl. Med. 56 , 1272–1277 (2015).

Bonvalot, S. et al. NBTXR3, a first-in-class radioenhancer hafnium oxide nanoparticle, plus radiotherapy versus radiotherapy alone in patients with locally advanced soft-tissue sarcoma (Act.In.Sarc): a multicentre, phase 2–3, randomised, controlled trial. Lancet Oncol. 20 , 1148–1159 (2019).

Liu, Q., Fang, H., Gai, Y. & Lan, X. pH-triggered assembly of natural melanin nanoparticles for enhanced PET imaging. Front. Chem. 8 , 755 (2020).

Nagachinta, S. et al. Radiolabelling of lipid-based nanocarriers with fluorine-18 for in vivo tracking by PET. Colloids Surf. B 188 , 110793 (2020).

Xing, Z. et al. The fabrication of novel nanobubble ultrasound contrast agent for potential tumor imaging. Nanotechnology 21 , 145607 (2010).

Ho, Y.-J. et al. Superhydrophobic drug-loaded mesoporous silica nanoparticles capped with β-cyclodextrin for ultrasound image-guided combined antivascular and chemo-sonodynamic therapy. Biomaterials 232 , 119723 (2020).

Lee, J. et al. Theranostic gas-generating nanoparticles for targeted ultrasound imaging and treatment of neuroblastoma. J. Controlled Release 223 , 197–206 (2016).

Li, J., Ji, H., Jing, Y. & Wang, S. pH- and acoustic-responsive platforms based on perfluoropentane-loaded protein nanoparticles for ovarian tumor-targeted ultrasound imaging and therapy. Nanoscale Res. Lett. 15 , 31 (2020).

Jiang, Y. et al. Metabolizable semiconducting polymer nanoparticles for second near-infrared photoacoustic imaging. Adv. Mater. 31 , 1808166 (2019).

Park, E.-Y., Oh, D., Park, S., Kim, W. & Kim, C. New contrast agents for photoacoustic imaging and theranostics: recent 5-year overview on phthalocyanine/naphthalocyanine-based nanoparticles. APL Bioeng. 5 , 031510 (2021).

Doan, V. H. M. et al. Fluorescence/photoacoustic imaging-guided nanomaterials for highly efficient cancer theragnostic agent. Sci. Rep. 11 , 15943 (2021).

García-Álvarez, R. et al. Optimizing the geometry of photoacoustically active gold nanoparticles for biomedical imaging. ACS Photonics 7 , 646–652 (2020).

Wang, C. et al. Cathespin B-initiated cypate nanoparticle formation for tumor photoacoustic imaging. Angew. Chem. Int. Ed. Engl. 61 , e202114766 (2021).

Zhai, T. et al. Hollow bimetallic complex nanoparticles for trimodality imaging and photodynamic therapy in vivo. ACS Appl. Mater. Interfaces 12 , 37470–37476 (2020).

Sun, M. et al. Thermally triggered in situ assembly of gold nanoparticles for cancer multimodal imaging and photothermal therapy. ACS Appl. Mater. Interfaces 9 , 10453–10460 (2017).

Rieffel, J. et al. Hexamodal imaging with porphyrin-phospholipid-coated upconversion nanoparticles. Adv. Mater. 27 , 1785–1790 (2015).

Download references

Author information

Authors and affiliations.

Early Detection Programme, Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, UK

Rebecca C. Fitzgerald

Centre for Cancer Genetic Epidemiology, Department of Public Health & Primary Care, University of Cambridge, Cambridge, UK

Antonis C. Antoniou

Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK

Ljiljana Fruk

Cancer Research UK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, UK

Nitzan Rosenfeld

Corresponding author

Correspondence to Rebecca C. Fitzgerald.

Ethics declarations

Competing interests.

R.C.F. is named on patents relating to Cytosponge and associated assays that have been licensed by the Medical Research Council to Covidien (now Medtronic). R.C.F. is a founder and shareholder for Cyted. A.C.A. is a named inventor of BOADICEA v5, licensed by Cambridge Enterprise (University of Cambridge). N.R. is co-founder and Chief Scientific Officer of Inivata and is an inventor on patents related to cancer detection and molecular analysis.

Peer review

Peer review information.

Nature Medicine thanks Marnix Jansen, Martin Eklund and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Karen O’Leary was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article.

Fitzgerald, R.C., Antoniou, A.C., Fruk, L. et al. The future of early cancer detection. Nat Med 28, 666–677 (2022). https://doi.org/10.1038/s41591-022-01746-x

Received: 21 December 2021

Accepted: 15 February 2022

Published: 19 April 2022

Issue Date: April 2022

DOI: https://doi.org/10.1038/s41591-022-01746-x

A comprehensive analysis of recent advancements in cancer detection using machine learning and deep learning models for improved diagnostics

  • Published: 04 August 2023
  • Volume 149, pages 14365–14408 (2023)

  • Hari Mohan Rai 1 &
  • Joon Yoo 1  

Millions of people lose their lives to fatal diseases. Cancer is among the most fatal, and its risk factors include obesity, alcohol consumption, infections, ultraviolet radiation, smoking, and unhealthy lifestyles. Cancer is abnormal, uncontrolled tissue growth that may spread from where it originated to other parts of the body. It is therefore essential to diagnose cancer at an early stage so that correct and timely treatment can be provided. Because manual diagnosis and diagnostic errors may contribute to the deaths of many patients, much research is under way on automatic and accurate detection of cancer at an early stage.

In this paper, we present a comparative analysis of diagnosis and recent advances in the detection of various cancer types using traditional machine learning (ML) and deep learning (DL) models. The study covers four types of cancer (brain, lung, skin, and breast) and their detection using ML and DL techniques. This extensive review includes a total of 130 publications, of which 56 cover ML-based and 74 cover DL-based cancer detection techniques. Only peer-reviewed research papers published in the recent 5-year span (2018–2023) were included, and they were analyzed by year of publication, features utilized, best model, dataset/images utilized, and best accuracy. We reviewed ML- and DL-based techniques for cancer detection separately and used accuracy as the performance evaluation metric to keep the comparison of classifier efficiency homogeneous.

Across the reviewed literature, the highest accuracy achieved by DL techniques was 100%, while the highest achieved by ML techniques was 99.89%. The lowest accuracies achieved using DL and ML approaches were 70% and 75.48%, respectively. For skin cancer detection, the difference in accuracy between the highest- and lowest-performing models is about 28.8%. In addition, the key findings and challenges for each type of cancer detection using ML and DL techniques are presented. A comparative analysis of the best- and worst-performing models, along with overall key findings and challenges, is provided for future research purposes. Although the analysis is based on accuracy as the performance metric, together with various other parameters, the results demonstrate significant scope for improvement in classification efficiency.
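To make the accuracy comparison above concrete, the following minimal Python sketch shows how accuracy is conventionally computed from confusion-matrix counts and how the spread between best and worst reported accuracies follows from the figures in this abstract. The function names and the example counts are hypothetical and chosen only for illustration. They are not taken from any reviewed paper. The per-cancer values behind the 28.8% skin cancer figure are not given here, so that figure is not reproduced.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Percentage of correctly classified samples: (TP + TN) / all samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return 100.0 * (tp + tn) / total


def accuracy_spread(best: float, worst: float) -> float:
    """Gap, in percentage points, between the best and worst accuracy."""
    return round(best - worst, 2)


# Overall extremes as reported in the abstract (percent accuracy).
REPORTED = {
    "DL": {"best": 100.0, "worst": 70.0},
    "ML": {"best": 99.89, "worst": 75.48},
}

spreads = {
    family: accuracy_spread(v["best"], v["worst"])
    for family, v in REPORTED.items()
}

if __name__ == "__main__":
    # Hypothetical counts for one classifier, for illustration only.
    print(accuracy(tp=45, tn=45, fp=5, fn=5))
    print(spreads)
```

Accuracy alone weights every sample equally, so on a class-imbalanced dataset it can look high even when the minority class is poorly detected. This is one reason why the accuracy values of different studies are best compared with their datasets in view.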

The paper concludes that both ML and DL techniques hold promise in the early detection of various cancer types. However, the study identifies specific challenges that need to be addressed for the widespread implementation of these techniques in clinical settings. The presented results offer valuable guidance for future research in cancer detection, emphasizing the need for continued advancements in ML and DL-based approaches to improve diagnostic accuracy and ultimately save more lives.


Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.


Suckling J, Parker J, Dance D, Astley S, Hutt I (2015) Mammographic Image Analysis Society (MIAS) database v1.21. Apollo - University of Cambridge Repository., 2015. https://www.repository.cam.ac.uk/handle/1810/250394 accessed 21 Jul 2023

Talo M, Yildirim O, Baloglu UB, Aydin G, Acharya UR (2019) Convolutional neural networks for multi-class brain disease detection using MRI images. Comput Med Imaging Graph 78:101673. https://doi.org/10.1016/j.compmedimag.2019.101673

Tharwat A (2018) Classification assessment methods. Appl Comput Inform. https://doi.org/10.1016/j.aci.2018.08.003

The Indian Express (2019) World brain tumour day 2019: know the symptoms, risk factors and treatment. https://indianexpress.com/article/lifestyle/health/world-brain-tumour-day-2019-symptoms-risk-factors-treatment-5770587 accessed 7 Jun 2020

Toğaçar M, Ergen B, Cömert Z (2020) BrainMRNet: Brain tumor detection using magnetic resonance images with a novel convolutional neural network model. Med Hypotheses 134:109531. https://doi.org/10.1016/j.mehy.2019.109531

Tschandl P, Rosendahl C, Kittler H (2018a) The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data 5(1):180161. https://doi.org/10.1038/sdata.2018.161

Tschandl P, Rosendahl C, Kittler H (2018b) Data descriptor: the HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci Data 5:1–9. https://doi.org/10.1038/sdata.2018.161

Tumpa PP, Kabir MA (2021) An artificial neural network based detection and classification of melanoma skin cancer using hybrid texture features. Sens Int 2:100128. https://doi.org/10.1016/j.sintl.2021.100128

United Nations Conference on Trade and Development (2022) UN list of least developed countries. UNCTAD. https://unctad.org/topic/least-developed-countries/list

Ur-Rehman K, Li J, Pei Y, Yasin A, Ali S, Mahmood T (2021) Computer vision-based microcalcification detection in digital mammograms using fully connected depthwise separable convolutional neural network. Sensors. https://doi.org/10.3390/s21144854

Vaiyapuri T, Liyakathunisa, Alaskar H, Parvathi R, Pattabiraman V, Hussain A (2022) Cat swarm optimization-based computer-aided diagnosis model for lung cancer classification in computed tomography images. Appl Sci (switzerland). https://doi.org/10.3390/app12115491

Vaka AR, Soni B, Reddy SK (2020) Breast cancer detection by leveraging machine learning. ICT Express 6(4):320–324. https://doi.org/10.1016/j.icte.2020.04.009

Valvano G et al (2019) Convolutional neural networks for the segmentation of microcalcification in mammography imaging. J Healthc Eng. https://doi.org/10.1155/2019/9360941

Vankdothu R, Hameed MA (2022) Brain tumor segmentation of MR images using SVM and fuzzy classifier in machine learning. Meas Sens 24:100440. https://doi.org/10.1016/j.measen.2022.100440

Vijayarajeswari R, Parthasarathy P, Vivekanandan S, Basha AA (2019) Classification of mammogram for early detection of breast cancer using SVM classifier and Hough transform. Measurement (lond) 146:800–805. https://doi.org/10.1016/j.measurement.2019.05.083

Vineeth J, Hemanth S, Rao CV, Pavankumar N, Javanna HS, Janardhan CN (2022) Skin cancer detection using deep learning. In: 2022 4th International Conference on Cognitive Computing and Information Processing, CCIP 2022, no. Icears, pp. 1724–1730, https://doi.org/10.1109/CCIP57447.2022.10058685

Virupakshappa, Amarapur B (2020) Computer-aided diagnosis applied to MRI images of brain tumor using cognition based modified level set and optimized ANN classifier. Multimed Tools Appl 79(5–6):3571–3599. https://doi.org/10.1007/s11042-018-6176-1

Wahba MA, Ashour AS, Guo Y, Napoleon SA, Abd MM (2018) Computer methods and programs in biomedicine a novel cumulative level difference mean based GLDM and modified ABCD features ranked using eigenvector centrality approach for four skin lesion types classification. Comput Methods Programs Biomed 165:163–174. https://doi.org/10.1016/j.cmpb.2018.08.009

Wang Z, Xin J, Sun P, Lin Z, Yao Y, Gao X (2018) Improved lung nodule diagnosis accuracy using lung CT images with uncertain class. Comput Methods Programs Biomed 162:197–209. https://doi.org/10.1016/j.cmpb.2018.05.028

World Health Organization (2019) Global cancer observatory. Malaysia Cancer Statistics. https://gco.iarc.fr/ accessed 19 May 2023

World Health Organization International Agency for Research on Cancer (2020) The Global Cancer Observatory—all cancers. International Agency for Research on Cancer - WHO, vol. 419, pp. 199–200

Woźniak M, Połap D, Capizzi G, Lo-Sciuto G, Kośmider L, Frankiewicz K (2018) Small lung nodules detection based on local variance analysis and probabilistic neural network. Comput Methods Programs Biomed 161:173–180. https://doi.org/10.1016/j.cmpb.2018.04.025

Yan F, Huang H, Pedrycz W, Hirota K (2023) Automated breast cancer detection in mammography using ensemble classifier and feature weighting algorithms. Expert Syst Appl 227:120282. https://doi.org/10.1016/j.eswa.2023.120282

Yu K-H et al (2020) Classifying non-small cell lung cancer types and transcriptomic subtypes using convolutional neural networks. J Am Med Inform Assoc 27(5):757–769. https://doi.org/10.1093/jamia/ocz230

Zakareya S, Izadkhah H, Karimpour J (2023) A new deep-learning-based model for breast cancer diagnosis from medical images. Diagnostics 13(11):1944. https://doi.org/10.3390/diagnostics13111944

Zeng W, Liao Y, Chen Y, Ying-Diao Q, Ying-Fu Z, Yao F (2023) Research on classification and recognition of the skin tumors by laser ultrasound using support vector machine based on particle swarm optimization. Opt Laser Technol 158:108810. https://doi.org/10.1016/j.optlastec.2022.108810

Zhang N, Cai YX, Wang YY, Tian YT, Wang XL, Badami B (2020) Skin cancer diagnosis based on optimized convolutional neural network. Artif Intell Med 102:101756. https://doi.org/10.1016/j.artmed.2019.101756

Zhao J, Chen T, Cai B (2022) A computer-aided diagnostic system for mammograms based on YOLOv3. Multimed Tools Appl 81(14):19257–19281. https://doi.org/10.1007/s11042-021-10505-y

Zhou H et al (2018) Diagnosis of distant metastasis of lung cancer: based on clinical and radiomic features. Transl Oncol 11(1):31–36. https://doi.org/10.1016/j.tranon.2017.10.010

Download references

This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIT) (NRF-2021R1F1A1063640).

Author information

Authors and affiliations.

School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, 13120, Gyeonggi-do, Republic of Korea

Hari Mohan Rai & Joon Yoo

Contributions

Hari Mohan Rai: Conducted the literature search, reviewed and selected the relevant articles, and drafted the manuscript. Joon Yoo: Provided critical input, revised the manuscript for clarity and accuracy, and contributed to the overall organization and structure of the review.

Corresponding author

Correspondence to Hari Mohan Rai.

Ethics declarations

Conflict of interest.

No conflict of interest is involved in this manuscript.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (VSDX 407 KB)

Rights and permissions.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Rai, H.M., Yoo, J. A comprehensive analysis of recent advancements in cancer detection using machine learning and deep learning models for improved diagnostics. J Cancer Res Clin Oncol 149, 14365–14408 (2023). https://doi.org/10.1007/s00432-023-05216-w

Received: 20 June 2023

Accepted: 26 July 2023

Published: 04 August 2023

Issue Date: November 2023

DOI: https://doi.org/10.1007/s00432-023-05216-w

  • Cancer detection
  • Machine learning
  • Deep learning
  • Feature extraction
  • State-of-art analysis
  • Brain tumor
  • Lung cancer
  • Skin cancer
  • Breast cancer
  • Open access
  • Published: 23 March 2023

A review and comparative study of cancer detection using machine learning: SBERT and SimCSE application

  • Mpho Mokoatle 1 ,
  • Vukosi Marivate 1 ,
  • Darlington Mapiye 2 ,
  • Riana Bornman 4 &
  • Vanessa. M. Hayes 3 , 4  

BMC Bioinformatics volume 24, Article number: 112 (2023)

Using visual, biological, and electronic health records data as the sole input source, pretrained convolutional neural networks and conventional machine learning methods have been heavily employed for the identification of various malignancies. Initially, a series of preprocessing steps and image segmentation steps are performed to extract region of interest features from noisy features. Then, the extracted features are applied to several machine learning and deep learning methods for the detection of cancer.

In this work, a review of all the methods that have been applied to develop machine learning algorithms that detect cancer is provided. With more than 100 types of cancer, this study only examines research on the four most common and prevalent cancers worldwide: lung, breast, prostate, and colorectal cancer. Next, by using state-of-the-art sentence transformers namely: SBERT (2019) and the unsupervised SimCSE (2021), this study proposes a new methodology for detecting cancer. This method requires raw DNA sequences of matched tumor/normal pair as the only input. The learnt DNA representations retrieved from SBERT and SimCSE will then be sent to machine learning algorithms (XGBoost, Random Forest, LightGBM, and CNNs) for classification. As far as we are aware, SBERT and SimCSE transformers have not been applied to represent DNA sequences in cancer detection settings.

The XGBoost model, which had the highest overall accuracy of 73 ± 0.13 % using SBERT embeddings and 75 ± 0.12 % using SimCSE embeddings, was the best performing classifier. In light of these findings, it can be concluded that incorporating sentence representations from SimCSE’s sentence transformer only marginally improved the performance of machine learning models.

Introduction

Cancer is a disease where some cells in the body grow destructively and may spread to other body organs [ 1 ]. Typically, cells grow and expand through a cell division process to create new cells that can be used to repair old and damaged ones. However, this phenomenon can be interrupted resulting in abnormal cells growing uncontrollably to form tumors that can be malignant (harmful) or benign (harmless) [ 2 , 3 , 4 ].

The introduction of genomic data, which allows physicians and healthcare decision-makers to learn more about their patients and their response to therapy, has facilitated the use of machine learning and deep learning to solve challenging cancer problems. These problems involve tasks such as designing cancer risk-prediction models that identify patients at a higher risk of developing cancer than the general population, studying the progression of the disease to improve survival rates, and building methods that track the effectiveness of treatment to improve treatment options [ 5 , 6 , 7 ].

Generally, the first step in analyzing genomic data to address cancer-related problems is selecting a data representation algorithm that will be used to estimate contiguous representations of the data. Examples of such algorithms include Word2vec [ 8 ], GloVe [ 9 ], and fastText [ 10 ]. The more recent and advanced versions of these algorithms are sentence transformers, which are used to compute dense vector representations for sentences, paragraphs, and images. Similar texts are found close together in the vector space and dissimilar texts are far apart [ 11 ]. In this work, two such sentence transformers (SBERT and SimCSE) are proposed for detecting cancer in tumor/normal pairs of colorectal cancer patients. In this new approach, the classification algorithm relies on raw DNA sequences as the only input source. Moreover, this work provides a review of the most recent developments in detecting cancers of the human body using machine learning and deep learning methods. While similar reviews already exist in the literature, this study focuses solely on work published in the last five years (2018–2022) that investigates four cancer types with high prevalence rates worldwide [ 12 ]: lung, breast, prostate, and colorectal cancer.

Detection of cancer using machine learning

Lung cancer

Lung cancer is the type of cancer that begins in the lungs and may spread to other organs in the body. This kind of cancer occurs when malignant cells develop in the tissue of the lung. There are two types of lung cancer: non-small-cell lung cancer (NSCLC) and small-cell lung cancer (SCLC). These cancers develop differently and thus their treatment therapies are different. Smoking (tobacco) is the leading cause of lung cancer. However, non-smokers can also develop lung cancer [ 13 , 14 ].

A considerable amount of work has been done on the detection of lung cancer using machine learning (Fig.  1 ); a summary is provided in Table 1 . Typically, a series of pre-processing steps using statistical methods and pretrained CNNs for feature extraction are carried out on several input sources (mostly images) to delineate the cancer region. Then, the extracted features are fed as input to several machine learning algorithms for various lung cancer classification tasks, such as the detection of malignant lung nodules from benign ones [ 15 , 16 , 17 ], the separation of a set of normalized biological data points into cancerous and non-cancerous groups [ 18 ], and a basic comparative analysis of powerful machine learning algorithms for lung cancer detection [ 19 ].

Fig. 1. Generalized machine learning framework for lung cancer prediction [ 33 ]

The lowest classification accuracy reported in Table 1 was 74.4% by work in [ 20 ]. In this work, a pretrained CNN model (DenseNet) was used to develop a lung cancer detection model. First, the model was fine-tuned to identify lung nodules from chest X-rays using the ChestX-ray14 dataset [ 21 ]. Second, the model was fine-tuned to identify lung cancer from images in the JSRT (Japanese Society of Radiological Technology) dataset [ 22 ].

The highest classification accuracy of 99.7% for lung cancer classification was reported by work in [ 18 ]. This study developed the Discrete AdaBoost Optimized Ensemble Learning Generalized Neural Network (DAELGNN) framework that uses a set of normalized biological data points to create a neural network that separates normal lung features from non-normal (cancerous) features.

Popular datasets used in lung cancer research using machine learning include the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) (LIDC-IDRI) database [ 23 ] initiated by the National Cancer Institute (NCI), and the histopathological images of lung and colon cancer (LC25000) database [ 24 ].

Breast cancer

Breast cancer is a malignant tumor or growth that develops in the cells of the breast [ 34 ]. Like lung cancer, breast cancer can metastasize to nearby lymph nodes or to other body organs. Towards the end of 2020, approximately 7.8 million women had been diagnosed with breast cancer, making this type of cancer the most prevalent cancer in the world. Risk factors of breast cancer include age, obesity, abuse of alcohol, and family history [ 35 , 36 , 37 ].

Currently, there is no identified prevention procedure for breast cancer. However, maintaining a healthy living habit such as physical exercise and less alcohol intake can reduce the risk of developing breast cancer [ 38 ]. It has also been said that early detection methods that rely on machine learning can improve the prognosis. As such, this type of cancer has been extensively studied using machine learning and deep learning [ 39 , 40 ].

As with lung cancer (Sect.  2.1 ), a great deal of work has gone into developing breast cancer detection models. A generalized approach that illustrates the process using machine learning is shown in Fig.  2 .

Fig. 2. Generalized machine learning framework for breast cancer prediction [ 45 ]

Several classification problems have been studied that mainly focus on the detection of breast cancer from thermogram images [ 41 ], handcrafted features [ 42 ], mammograms [ 43 ], and whole slide images [ 44 ]. To develop a breast cancer detection model, a pre-processing step is first implemented to extract features of interest. Then, the extracted features are provided as input to machine learning models for classification. This framework is implemented by several works such as [ 45 , 46 , 47 , 48 ].

One of the most popular datasets used for breast cancer detection using machine learning is the Wisconsin breast cancer dataset [ 42 ]. This dataset consists of features that describe the characteristics of the cell nuclei present in the image, such as radius, symmetry, and texture, together with the diagnosis (malignant or benign). Studies that used this dataset include [ 49 , 50 ]. In [ 49 ], the authors scaled the Wisconsin breast cancer features to the range between 0 and 1, then used a CNN to classify samples as benign or malignant. As opposed to using a CNN for classification, the authors of [ 50 ] used traditional machine learning classifiers (Linear Regression, Multilayer Perceptron (MLP), Nearest Neighbor search, Softmax Regression, Gated Recurrent Unit (GRU)-SVM, and SVM). For data pre-processing, the study used the Standard Scaler technique, which standardizes data points by removing the mean and scaling the data to unit variance. The MLP model outperformed the other models with the highest accuracy of 99.04%, which is close to the accuracy of 99.6% reported by [ 49 ].
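The two pre-processing schemes mentioned above, rescaling to the range 0 to 1 and standardizing to zero mean and unit variance, can be sketched in a few lines. This is a minimal illustration in plain Python, not the code used in [ 49 ] or [ 50 ]; the function names and the sample values are hypothetical.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

# Hypothetical nucleus radius measurements (illustrative, not from the dataset).
radii = [10.0, 12.0, 14.0, 16.0, 18.0]
scaled = min_max_scale(radii)
standardized = standardize(radii)
```

Either transform is applied per feature, so that features measured on different scales contribute comparably to the downstream classifier.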

In contrast to binary classification into benign or malignant classes, a study [ 46 ] proposed a two-step approach to design a breast cancer multi-class classification model that predicts eight categories of breast cancer. In the first approach, the study used handcrafted features generated from histopathology images. These features were then fed as input to classical machine learning algorithms (RF, SVM, Linear Discriminant Analysis (LDA)). In the second approach, the study applied a transfer learning method to develop the multi-classification deep learning framework, where pretrained CNNs (ResNet50, VGG16 and VGG19) were used as feature extractors and baseline models. The VGG16 pretrained CNN with the linear SVM provided the best accuracy, in the range of 91.23–93.97%. This study also found that using pretrained CNNs as feature extractors improved the classification performance of the models.

Table 2 provides a summary of the work that has been done to detect breast cancer using machine learning.

Prostate cancer

Prostate cancer is a type of cancer that develops when cells in the prostate gland start to grow uncontrollably (malignant). Prostate cancer often presents with no symptoms and grows at a slow rate. As a result, some men may die of other diseases before the cancer starts to cause notable problems. However, prostate cancer can also be aggressive and metastasize to organs outside the confines of the prostate gland. Risk factors associated with this type of cancer include age, specifically being above the age of 50. Other risk factors include ethnicity, a family history of prostate, breast, or ovarian cancer, and obesity [ 61 , 62 , 63 ].

Transfer learning, which is defined as the reuse of a pretrained model on a new problem, was frequently applied to develop prostate cancer detection models using machine learning (Fig.  3 ). For example, a study [ 64 ] applied a transfer learning approach to detect prostate cancer on magnetic resonance images (MRI) by using a pretrained GoogleNet. A series of features such as texture, entropy, morphological, scale invariant feature transform (SIFT), and Elliptic Fourier Descriptors (EFDs) were extracted from the images as described by [ 65 , 66 ]. Traditional machine learning classifiers such as decision trees and a Gaussian SVM were also evaluated; however, the GoogleNet model outperformed them.

Fig. 3. Generalized machine learning framework for prostate cancer prediction using 3-d CNNs, pooling layers, and a fully connected layer for classification [ 69 ]

Also using transfer learning, a study [ 67 ] developed a prostate cancer detection model by using MRI images and ultrasound (US) images. The model was developed in two stages. First, pretrained CNNs were used to classify the US and MRI images as benign or malignant. While the pretrained CNNs performed well on the US images (accuracy 97%), the performance on the MRI images was not adequate. Second, the best-performing pretrained CNN (VGG16) was therefore selected and used as a feature extractor, and the extracted features were provided as input to traditional machine learning classifiers.

Another study [ 68 ] used the same dataset as [ 64 ] to create a prostate cancer detection model. However, instead of GoogleNet, this study used a ResNet-101 and an autoencoder for feature reduction. Other machine learning models were also evaluated, but the study concluded that the pretrained ResNet-101 outperformed the other models with an accuracy of 100%. These results are consistent with the previous study [ 64 ], which showed that pretrained CNNs outperform traditional machine learning models for cancer detection.

Table 3 gives a summary of recent work that has been executed to create prostate cancer detection models.

Colorectal cancer

Colorectal cancer is a type of cancer that starts in the colon or rectum. The colon and rectum make up the large intestine, which is part of the digestive system. Most of the large intestine consists of the colon, which is divided into four parts: the ascending colon, transverse colon, descending colon, and sigmoid colon. The main function of the colon is to absorb water and salt from the remaining food waste after it has passed through the small intestine. The waste that is left after passing through the colon goes into the rectum and is stored there until it is passed through the anus. Some colorectal cancers first develop as growths, called polyps, on the inner lining of the colon or rectum. Over time, these polyps can develop into cancer, although not all of them become cancerous. Some of the risk factors of colorectal cancer include obesity, lack of exercise, diets that are rich in red meat, smoking, and alcohol [ 82 , 83 , 84 ].

In colorectal cancer research using machine learning (Fig.  4 ), various tasks have been investigated, such as predicting high-risk colorectal cancer from images, predicting five-year disease-specific survival, colorectal cancer tissue multi-class classification, and identifying the risk factors for lymph node metastasis (LNM) in colorectal cancer patients [ 85 , 86 , 87 , 88 ]. As with prostate cancer, transfer learning was mostly applied to extract features from various input sources such as colonoscopic images, tissue microarrays (TMA), and H&E slide images. Then, the extracted features were fed as input to machine learning algorithms for classification.

Fig. 4. Using a deep CNN network to predict colorectal cancer outcome using images [ 86 ]

One common observation with regard to colorectal cancer models is that the predictions made by the models were compared to those of experts. For example, a study [ 85 ] developed a deep learning model that detects high-risk colorectal cancer from whole slide images that were collected from colon biopsies. The deep learning model was created in two stages. First, a segmentation procedure was executed to extract high-risk regions from whole slide images. This segmentation procedure applied a Faster Region-Based Convolutional Neural Network (Faster-RCNN) that uses a ResNet-101 model as a backbone for feature extraction. The second stage applied a gradient-boosted decision tree to the output of the Faster-RCNN deep learning model to classify the slides into either high- or low-risk colorectal cancer, and achieved an AUC of 91.7%. The study then found that the predictions made on the validation set were in agreement with annotations made by expert pathologists.

Work in [ 89 ] also compared predictions made by the Microsatellite instability (MSI)-predictor model with those of expert pathologists and found that experts achieved a mean AUROC of 61% while the model achieved an AUROC of 93% on a hold-out set and 87% on a reader experiment.
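Since several of the studies above report AUC or AUROC, a short sketch of what that number measures may help: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The code below is a minimal pairwise implementation in plain Python with made-up labels and scores; it is not taken from any of the cited studies.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical ground truth (1 = cancer) and model scores for six slides.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
value = auroc(y_true, y_score)  # 8 of the 9 positive/negative pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why an expert mean of 61% against a model value of 93% is a large gap.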

A previous study [ 90 ] developed a model named CRCNet, based on a pretrained dense CNN, that automatically detects colorectal cancer from colonoscopic images and found that the model exceeded the average performance of expert endoscopists, with a recall rate of 91.3% versus 83.8%.

In Table 4 , a summary is provided that describes the work that has been executed in colorectal cancer research using machine learning.

To summarize the literature survey (Sect.  2 ), a series of machine learning approaches for the detection of cancer were analysed. Imaging datasets, biological and clinical data, and EHRs were primarily employed as the initial input source when developing cancer detection algorithms. This procedure involved a few preprocessing steps. First, the input source was typically preprocessed at the beginning of the experiment to extract regions or features of interest. Next, the retrieved set of features was applied to downstream machine learning classifiers for cancer prediction. As opposed to using imaging datasets, clinical and biological data, or EHRs as the starting input source, this work proposes to use raw DNA sequences as the only input source. Moreover, instead of statistical methods or advanced CNNs for data extraction and representation, this work proposes to use the state-of-the-art sentence transformers SBERT and SimCSE. As far as we are aware, these two sentence transformer models have not been applied for learning representations in cancer research. The learned representations will then be fed as input to machine learning algorithms for cancer prediction.

Data description

In this study, 95 samples from colorectal cancer patients and matched-normal samples from previous work [ 104 ] were analysed. Exon sequences from two key genes, APC and ATM, were used. The full details of the exons that were used in this study are shown in Tables 5 and 6 . Table 7 shows the data distribution among the normal/tumor DNA sequences. Ethics approval was granted by the University of Pretoria EBIT Research Ethics Committee (EBIT/139/2020).

Data encoding

To encode the DNA sequences, two state-of-the-art sentence transformers, Sentence-BERT [ 105 ] and SimCSE [ 110 ], were used. These transformers are explained in the next subsection.

Sentence-BERT

Sentence-BERT (SBERT) (Fig.  5 ) adapts the pretrained BERT [ 106 ] and RoBERTa [ 107 ] transformer networks and modifies them to use siamese and triplet network architectures to compute fixed-sized vectors for more than 100 languages. The sentence embeddings can then be compared using cosine similarity. SBERT was trained on the combination of SNLI data [ 108 ] and the Multi-Genre NLI dataset [ 109 ].
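As a small illustration of how two embeddings are compared, the sketch below computes cosine similarity in plain Python. The three-dimensional vectors are made up for readability; real SBERT embeddings have hundreds of dimensions.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings: emb_b points in the same direction as emb_a,
# emb_c is orthogonal to it.
emb_a = [1.0, 2.0, 0.0]
emb_b = [2.0, 4.0, 0.0]
emb_c = [0.0, 0.0, 3.0]
sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
```

Because the measure depends only on direction, `emb_a` and `emb_b` are maximally similar even though their lengths differ.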

Fig. 5. SBERT architecture with classification objective function (left) and the regression objective function (right) [ 105 ]

In its architecture, SBERT adds a default mean-pooling procedure on the output of the BERT or RoBERTa network to compute sentence embeddings. SBERT implements the following objective functions: the classification objective function, the regression objective function, and the triplet objective function. In the classification objective function, the sentence embeddings u and v of a sentence pair are concatenated with the element-wise difference \(\mid u-v \mid\) and multiplied by the trainable weight \(W_{t} \in {\mathbb {R}}^{3n \times k}\) :

$$o = \text{softmax}\left( W_{t}\,(u, v, \mid u-v \mid ) \right)$$

where n is the dimension of the sentence embeddings and k is the number of target labels.
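The classification objective described above can be sketched as a small function that builds the concatenated feature vector of length 3n, applies a weight matrix with 3n rows and k columns, and normalizes with a softmax. This is an illustrative sketch in plain Python with toy sizes (n = 2, k = 3) and arbitrary weights; the names `classification_head` and `W_t` and all numeric values are assumptions, not the trained SBERT parameters.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classification_head(u, v, weights):
    """Concatenate (u, v, |u - v|), apply a (3n x k) weight matrix, then softmax."""
    features = u + v + [abs(a - b) for a, b in zip(u, v)]  # length 3n
    k = len(weights[0])
    logits = [
        sum(features[i] * weights[i][j] for i in range(len(features)))
        for j in range(k)
    ]
    return softmax(logits)

# Toy example: n = 2 embedding dimensions, k = 3 labels, arbitrary weights.
u = [0.5, -1.0]
v = [0.5, 1.0]
W_t = [
    [0.1, 0.0, -0.1],
    [0.0, 0.2, 0.0],
    [0.3, -0.1, 0.0],
    [0.0, 0.1, 0.2],
    [0.5, 0.0, -0.5],
    [-0.2, 0.4, 0.1],
]
probs = classification_head(u, v, W_t)
```

During training, the weights would be learned by minimizing cross-entropy between `probs` and the true label.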

In the regression objective function, the cosine similarity between two sentence embeddings u and v is computed, and mean-squared-error loss is used as the objective.

The triplet objective function fine-tunes the network such that the distance between an anchor sentence a and a positive sentence p is smaller than the distance between sentence a and the negative sentence n .
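A minimal sketch of a triplet objective of this kind is shown below. It assumes Euclidean distance and a margin of 1, both chosen here for illustration; the vectors are toy values.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the positive is closer to the anchor than the negative by at least the margin."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Toy embeddings: the positive lies at distance 1 from the anchor, the negative at distance 5.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
negative = [3.0, 4.0]
loss_satisfied = triplet_loss(anchor, positive, negative)  # 1 - 5 + 1 < 0, so the loss is 0
loss_violated = triplet_loss(anchor, negative, positive)   # roles swapped: 5 - 1 + 1 = 5
```

The loss only penalizes triplets that violate the desired ordering, so well-separated triplets contribute no gradient.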

Using the pretrained SBERT model all-MiniLM-L6-v2, each DNA sequence was represented by a 384-dimensional vector.
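The mean-pooling step mentioned earlier is what turns a variable-length sequence of token vectors into a single fixed-size vector. The sketch below shows the idea in plain Python with 4-dimensional toy vectors in place of 384-dimensional ones; the token vectors are invented and no pretrained model is involved.

```python
def mean_pool(token_vectors):
    """Average the token vectors dimension by dimension."""
    n_tokens = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n_tokens for d in range(dim)]

# Hypothetical token-level outputs for a two-token and a three-token input.
short_seq = [[1.0, 0.0, 2.0, 4.0], [3.0, 2.0, 0.0, 0.0]]
long_seq = [[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0], [3.0, 3.0, 3.0, 3.0]]
short_emb = mean_pool(short_seq)
long_emb = mean_pool(long_seq)
```

Both inputs yield a vector of the same length regardless of how many tokens they contain, which is what allows sequences of different lengths to be fed to the same classifier.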

As with SBERT, Simple Contrastive Sentence Embedding (SimCSE) [ 110 ] (Fig. 6 ) is a transformer-based model that modifies the BERT/RoBERTa encoder to generate sentence embeddings. It uses a contrastive learning approach that learns sentence representations by pulling close neighbours together and pushing non-neighbours apart. SimCSE comes in two forms: unsupervised and supervised. In unsupervised SimCSE, the network is fine-tuned to predict the input sentence itself, using only dropout as noise; the other sentences in the mini-batch are taken as negatives. Here, dropout acts as a data augmentation method, whereas previous methods [ 111 , 112 ] used word deletion, reordering, and substitution to generate positive instances. An input sentence is fed to the encoder twice, and two embeddings with different dropout masks z and \(z'\) are generated as output. The training objective for SimCSE is:
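
For a mini-batch of N sentences, the unsupervised objective given in the SimCSE paper [ 110 ] can be written as follows. The notation is taken from that paper and is an assumption with respect to this article: \(h_i^{z_i}\) denotes the encoder output for sentence i under dropout mask \(z_i\), \(\tau\) is a temperature, and sim is the cosine similarity.

```latex
\ell_i = -\log \frac{e^{\mathrm{sim}\left(h_i^{z_i},\, h_i^{z'_i}\right)/\tau}}
                    {\sum_{j=1}^{N} e^{\mathrm{sim}\left(h_i^{z_i},\, h_j^{z'_j}\right)/\tau}}
```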

where z is the standard dropout mask found in Transformers; no additional dropout is added [ 110 ].

figure 6

Unsupervised SimCSE ( a ) and supervised SimCSE ( b ) [ 110 ]

In supervised SimCSE, positive pairs are taken from the natural language inference (NLI) datasets and used to optimise the following equation:
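
In the notation of the SimCSE paper [ 110 ], with \((h_i, h_i^{+})\) the representations of a positive pair and N the mini-batch size, the basic contrastive objective has the form below. It is reproduced from that paper as an assumption about which form is meant here; the paper also describes a variant that adds hard negatives (contradiction pairs) to the denominator.

```latex
\ell_i = -\log \frac{e^{\mathrm{sim}\left(h_i,\, h_i^{+}\right)/\tau}}
                    {\sum_{j=1}^{N} e^{\mathrm{sim}\left(h_i,\, h_j^{+}\right)/\tau}}
```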

where \(\tau\) is a temperature hyperparameter and \(sim(h_{1},h_{2})\) is the cosine similarity.
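
A minimal NumPy sketch of an in-batch contrastive loss of this kind is shown below. It assumes that row i of `h_pos` is the positive for row i of `h` and that all other rows in the batch act as negatives; the function name and the default temperature of 0.05 are illustrative choices, not values taken from this study.

```python
import numpy as np

def contrastive_loss(h, h_pos, tau=0.05):
    """Mean in-batch contrastive loss for embeddings of shape (N, d)."""
    # Normalise rows so that dot products equal cosine similarities.
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    h_pos = h_pos / np.linalg.norm(h_pos, axis=1, keepdims=True)
    sim = (h @ h_pos.T) / tau                       # (N, N) similarities divided by tau
    sim = sim - sim.max(axis=1, keepdims=True)      # shift for numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    # The diagonal holds the log-probability assigned to each true positive.
    return float(-np.mean(np.diag(log_prob)))
```

The loss is close to zero when every embedding is most similar to its own positive and grows when a negative in the batch is more similar than the positive.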

Using the unsupervised pretrained SimCSE model: unsup-simcse-bert-base-uncased , each DNA sequence was represented by a 768-dimensional vector.

K -means clustering

The k -means clustering algorithm was used to visualize, in an unsupervised manner, the sentence representations generated by SBERT and SimCSE. The k -means algorithm divides the data points into k clusters, where each data point belongs to the cluster whose centroid is closest to it. Since the data consist of two types of documents (tumor vs. normal), the k -means algorithm was set to find k = 2 clusters and assign each DNA sequence to its closest centroid [ 113 ].
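
The assignment and update steps of k -means can be sketched as follows. This is a plain NumPy version of Lloyd's algorithm written for illustration; the random initialisation, iteration cap, and function name are assumptions, and the library actually used in the study is not specified in the text.

```python
import numpy as np

def kmeans(X, k=2, n_iter=100, seed=0):
    """Cluster the rows of X into k groups; returns (labels, centroids)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    def assign(centres):
        # Distance from every point to every centroid, then nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        # Move each centroid to the mean of its points (keep it if the cluster is empty).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(centroids), centroids
```

With k = 2, the returned labels can be compared against the tumor/normal annotation or used to colour a two-dimensional projection of the embeddings.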

Machine learning experiments

Three ensemble machine learning algorithms were used for classification: Light Gradient Boosting (LightGBM), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF). A convolutional neural network (CNN) was also evaluated.

eXtreme gradient boosting (XGBoost)

eXtreme Gradient Boosting (XGBoost) is an efficient implementation of the gradient boosting algorithm. Gradient boosting belongs to a group of ensemble machine learning algorithms that can be used to solve classification or regression problems. The ensembles are built from decision trees that are added one at a time, each fit to correct the prediction errors made by the prior trees [ 114 ].
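
The idea of adding trees one at a time to correct earlier errors can be illustrated with a toy booster. The sketch below is not XGBoost: it assumes a single numeric feature with at least two distinct values, squared-error loss, and one-split trees (stumps), and all function names are hypothetical.

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-threshold split of a 1-D feature under squared error."""
    best = None
    for t in np.unique(x)[:-1]:                       # candidate thresholds
        left, right = residual[x <= t], residual[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]                                   # (threshold, left value, right value)

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the residuals left by the ensemble so far."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_trees):
        residual = y - pred                           # negative gradient of squared error
        stump = fit_stump(x, residual)
        pred = pred + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)
```

Each round shrinks the remaining residual by a factor set by the learning rate, which is why more trees give a closer fit on the training data.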

Light gradient boosting (LightGBM)

Light Gradient Boosting Machine (LightGBM) is also a gradient boosting model that is used for ranking, classification, and regression. In contrast to XGBoost, LightGBM grows trees leaf-wise (vertically) rather than level-wise (horizontally). Leaf-wise growth can yield a larger loss reduction and higher accuracy while also being faster. LightGBM uses the Gradient-based One-Side Sampling (GOSS) method to filter data instances when searching for the best split value, while XGBoost uses pre-sorted and histogram-based algorithms to calculate the best split value [ 115 ].
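
The GOSS sampling step can be sketched as below. This follows the general description of GOSS (keep the instances with the largest gradients, randomly sample from the rest, and re-weight the sampled instances); the function name and the default rates are assumptions for illustration rather than LightGBM's internal code.

```python
import numpy as np

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) of the instances kept for split finding."""
    rng = np.random.default_rng(seed)
    n = len(gradients)
    n_top = int(round(top_rate * n))
    n_other = int(round(other_rate * n))
    order = np.argsort(-np.abs(gradients))            # largest |gradient| first
    top_idx = order[:n_top]                           # always kept
    other_idx = rng.choice(order[n_top:], size=n_other, replace=False)
    weights = np.ones(n_top + n_other)
    # Up-weight the sampled small-gradient instances to compensate for the sampling.
    weights[n_top:] = (1.0 - top_rate) / other_rate
    return np.concatenate([top_idx, other_idx]), weights
```

With the default rates, 30% of the instances are used for each split search, and the sampled small-gradient instances each count eight times.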

Random forest (RF)

Random forest (RF) is a supervised machine learning algorithm that is used in classification and regression tasks. It builds decision trees on different samples of the data and takes the majority vote for classification or the average for regression. While XGBoost and LightGBM use gradient boosting, Random Forest uses bagging: each training subset is drawn from the training data with replacement, each model is trained separately, and the final result is based on a majority vote over all the models [ 116 ].
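
Bagging with a majority vote can be sketched with the standard library alone. In the example below a simple threshold rule stands in for a decision tree, which is an assumption made to keep the sketch short; the data are `(value, label)` pairs with labels 0 and 1, and all names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement, redrawing until both classes appear."""
    while True:
        sample = [rng.choice(data) for _ in data]
        if len({label for _, label in sample}) == 2:
            return sample

def train_threshold_model(sample):
    """Stand-in for a decision tree: threshold halfway between the two class means."""
    zeros = [x for x, label in sample if label == 0]
    ones = [x for x, label in sample if label == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x > threshold)

def train_bagged_models(data, n_models=25, seed=0):
    """Train each model separately on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_threshold_model(bootstrap_sample(data, rng)) for _ in range(n_models)]

def majority_vote(predictions):
    """Most common label among the individual predictions."""
    return Counter(predictions).most_common(1)[0][0]

def bagged_predict(models, x):
    return majority_vote([model(x) for model in models])
```

A random forest additionally samples a random subset of features at each split, which this sketch leaves out.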

Convolutional neural network (CNN)

Convolutional neural networks (CNNs) are a class of neural networks that are frequently used to process speech, audio, and visual input signals. CNNs generally contain three types of layers: convolutional, pooling, and fully connected (FC) layers. The convolutional layer is the fundamental component of a CNN; it performs convolution operations on the input and passes the result to the following layer. Pooling layers then reduce the dimensionality of the resulting feature maps, which lowers the number of parameters needed in subsequent layers. The FC layer carries out the classification task using the features extracted by the preceding layers, with an activation function such as softmax or sigmoid [ 117 , 118 ]. In this work, a three-layer CNN model with a sigmoid activation function was supplied with the embedding features extracted by the SBERT and SimCSE sentence transformers. Due to computational limitations, the network was trained for 10 epochs using the RMSprop optimizer and cross-validated over five folds.
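
The three layer types can be illustrated with a forward pass over a single embedding vector. This NumPy sketch is not the network used in the study, whose exact layer sizes are not given here: the single filter, the ReLU after the convolution, the pooling size, and all names are assumptions.

```python
import numpy as np

def conv1d(x, kernel, bias=0.0):
    """Valid 1-D cross-correlation, the operation used in convolutional layers."""
    width = len(kernel)
    return np.array([x[i:i + width] @ kernel for i in range(len(x) - width + 1)]) + bias

def max_pool1d(x, size=2):
    """Non-overlapping max pooling; a trailing remainder is dropped."""
    n = len(x) // size
    return x[:n * size].reshape(n, size).max(axis=1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(embedding, kernel, dense_w, dense_b=0.0):
    """Convolution, pooling, then a fully connected layer with a sigmoid output."""
    h = np.maximum(conv1d(embedding, kernel), 0.0)    # convolution + ReLU
    h = max_pool1d(h)                                 # dimensionality reduction
    return float(sigmoid(h @ dense_w + dense_b))      # probability of the positive class
```

For a 384-dimensional SBERT embedding and a kernel of width 3, the convolution yields 382 values, pooling with size 2 leaves 191, and `dense_w` would therefore need 191 entries.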

Performance evaluation metrics

To measure the performance of the machine learning models, average performance under 5-fold cross-validation was reported using the following metrics: accuracy, precision, recall, and F1 score. Table 8 provides the definitions of these metrics.
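
The four metrics and the fold construction can be sketched in plain Python using the standard definitions based on true/false positives and negatives. The function names are hypothetical, and contiguous unshuffled folds are an assumption; whether the study shuffled or stratified its folds is not stated.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    indices = list(range(n_samples))
    start = 0
    for fold in range(k):
        # Spread any remainder over the first n_samples % k folds.
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

The reported figure for each metric is then the mean (and spread) of the five per-fold values.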

This section described the datasets used in the study as well as data representation methods and machine learning algorithms that were applied in this work. In the next section, the results of the applied methods are described.

Visualizations

In this subsection, unlabeled data from SBERT and SimCSE representations were explored and visualized with the k -means clustering algorithm. The representations of the SBERT algorithm (Fig.  7 ) revealed more overlap between the data points in comparison to the representations of the SimCSE algorithm (Fig.  8 ). In the next subsection, machine learning models are evaluated to reveal if there is sufficient signal in the representations of the two sentence transformers that can discriminate between tumor and normal DNA sequences.

figure 7

Visualisation of the SBERT documents with k -means clustering

figure 8

Visualisation of the SimCSE documents with k -means clustering

Comparative performance of the machine learning results

SBERT before SMOTE

Table 9 presents the performance of the machine learning models on the dev set in terms of accuracy, averaged over the five folds, using the SBERT representations. Further performance metrics (F1 score, recall, and precision) are reported in Additional file 1 (Appendix A).

Because tumor DNA sequences made up \(\approx\) 64% of the APC gene data before SMOTE sampling, the machine learning models classified most sequences as positive (tumor); the CNN achieved the highest accuracy, 67.3 ± 0.04%.

In contrast to the APC gene, the original data distribution of sequences from the ATM gene was relatively balanced: tumor sequences made up 53% of the total data and normal DNA sequences 47%. Moreover, rather than predicting nearly all sequences as positive, the machine learning models demonstrated unbiased, above-average performance; the highest-performing model (XGBoost) achieved an accuracy of 73. ± 0.13 %.

SBERT after SMOTE

The performance of most machine learning classifiers remained consistent after applying SMOTE, with very little improvement or decline. However, while the CNN model obtained the highest overall accuracy before SMOTE oversampling, it performed the worst after SMOTE, with a reported accuracy of 47. ± 17.4 %. Although biased, the LightGBM classifier reached the highest accuracy of 64.9 ± 0.29 %. Its confusion matrix is shown in Fig. 9 .
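
For readers unfamiliar with SMOTE, the oversampling step can be sketched as below: each synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified NumPy illustration with hypothetical names and an assumed default of k = 5 neighbours; it needs at least two minority samples and is not the implementation used in the study.

```python
import numpy as np

def smote(X_minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)
    k = min(k, len(X) - 1)
    # Pairwise distances between minority samples, excluding self-matches.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]     # k nearest minority neighbours
    synthetic = np.empty((n_synthetic, X.shape[1]))
    for s in range(n_synthetic):
        i = rng.integers(len(X))                      # a random minority sample
        j = neighbours[i, rng.integers(k)]            # one of its nearest neighbours
        lam = rng.random()                            # interpolation factor in [0, 1)
        synthetic[s] = X[i] + lam * (X[j] - X[i])
    return synthetic
```

Because the synthetic points are interpolations in embedding space, they add no new sequence information, which is consistent with the small changes in accuracy reported after oversampling.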

figure 9

Confusion matrix of the LightGBM model using SBERT representations after SMOTE (dev set)

The same trend as in the previous Sect. 4.2.2 was observed with sequences from the ATM gene. The performance of the machine learning models after SMOTE sampling was similar to their performance before SMOTE sampling, and XGBoost maintained the best overall accuracy of 73. ± 0.13 % (Fig. 10 ).

figure 10

Confusion matrix of the XGBoost model using SBERT representations after SMOTE (dev set)

SimCSE before SMOTE

Table 9 also presents the performance of the machine learning models in terms of the average accuracy, averaged over the five folds using the SimCSE representations. Supplementary performance metrics are reported (Additional file 1 : Appendix A).

In this experimental setting, the performance of the machine learning models with SimCSE representations before SMOTE sampling was similar to that of the models with SBERT representations before SMOTE sampling. Here, the CNN achieved the best accuracy of 67. ± 0.0 %.

A similar pattern to that in the previous section ( APC , SimCSE before SMOTE) was observed when using sequences from the ATM gene: the performance of the SimCSE models was close to that of the SBERT models (before SMOTE), with a slight improvement. The LightGBM model achieved the highest accuracy of 74. ± 0.18 %, an improvement in accuracy of approximately 4 %.

SimCSE after SMOTE

The LightGBM model achieved the highest accuracy of 64.7 ± 0.29 % (Fig. 11 ), which was indistinguishable from the performance reported before SMOTE oversampling.

figure 11

Confusion matrix of the LightGBM model using SimCSE representations after SMOTE (dev set)

In this final experimental setting ( ATM , SimCSE after SMOTE), the results were consistent before and after SMOTE sampling. The highest-performing model was the Random forest model, which achieved an average accuracy of 71.6 ± 1.47 % (Fig. 12 ).

figure 12

Confusion matrix of the Random forest model using SimCSE representations after SMOTE (dev set)

Table 10 reports the experiments repeated on an additional unseen test set. Overall, the machine learning models showed a slight increase in accuracy; the highest-performing model, XGBoost, achieved an average accuracy of 75. ± 0.12 % using SimCSE representations from the ATM gene.

This paper provided a literature review of how cancer has been detected using various machine learning methods. Additionally, this work developed machine learning models that detect cancer using raw DNA sequences as the only input source. The DNA sequences were retrieved from matched tumor/normal pairs of colorectal cancer patients as described in previous work [ 104 ]. For data representation, two state-of-the-art sentence transformers were proposed: SBERT and SimCSE. To the best of our knowledge, these two methods have not previously been used to represent DNA sequences in cancer detection problems using machine learning. In summary, using SimCSE representations only marginally improved the performance of the machine learning models.

The ability to detect cancer using human DNA as the only input to a learning algorithm was one of the significant contributions of this work. We acknowledge that similar research investigating the role that DNA plays in various cancer types has been conducted in the past. In contrast, the way the DNA was represented for the learning algorithms in our work differs from that in earlier research. One example is the work in [ 120 ], which used cell-free DNA (cfDNA) data from shallow whole-genome sequencing to uncover patterns associated with a number of different cancers, including Hodgkin lymphoma, diffuse large B-cell lymphoma, and multiple myeloma. That study used PCA-transformed genome-wide coverage features as input to a support vector machine to predict cancer status, rather than employing sentence transformers for data representation as was done in our study. Another study [ 121 ] also used cfDNA sequences, in that case to distinguish cancer tissue sequences from healthy ones. In that work, reads from hepatocellular carcinoma (HCC) patients and healthy individuals were integrated with methylation information, and a deep learning model was created to predict the reads that originated from cancer tissue. The deep learning model consisted of a 1-d CNN followed by a max-pooling layer, a bi-directional LSTM, a 1-d CNN, and three dense layers. To represent the cfDNA sequences and methylation information, the variables were encoded into a one-hot matrix that was then provided as input to the deep learning model for classification. Rather than relying on raw DNA or cfDNA data to develop cancer detection frameworks, a further study [ 122 ] combined methods from variant calling and machine learning to develop a model that detects cancers of unknown primary (CUP) origin, which account for approximately 3% of all cancer diagnoses. That work employed whole-genome-sequencing-based mutation features derived from structural variants generated through variant calling, and fed them as input to an ensemble of random forest binary classifiers for the detection of 35 different cancers.

Limitations of the study

The machine learning experiments were performed on only two key genes, APC and ATM ; it would therefore be interesting to see how the models generalize across other genes. A common disadvantage of conducting experiments on multiple genes or on whole-genome sequencing data is that they require more computational resources, which has a direct impact on cost. Another limitation of this work is that only two pretrained models were used for generating the sentence representations. Several other pretrained models are publicly available, but some were slower to execute than others, so a decision was made to focus on pretrained models that provided fast execution.

This article reviewed the literature and demonstrated how various machine learning techniques have been used to identify cancer. Given that they are the most common malignancies worldwide, this work placed a special emphasis on four cancer types: lung, breast, prostate, and colorectal cancer. Then, a new method for the identification of colorectal cancer employing SBERT and SimCSE sentence representations was presented. Raw DNA sequences from matched tumor/normal pairs of colorectal cancer served as the sole input for this approach. The learned representations were then provided as input to machine learning classifiers for classification. In light of the performance of the machine learning classifiers, XGBoost was found to be the best performing classifier overall. Moreover, using SimCSE representations only marginally improved the classification performance of the machine learning models.

Availability of data and materials

The data can be accessed at the host database (The European Genome-phenome Archive at the European Bioinformatics Institute, accession number: EGAD00001004582).

Jones PA, Baylin SB. The epigenomics of cancer. Cell. 2007;128(4):683–92.

What Is Cancer? National Cancer Institute. https://www.cancer.gov/about-cancer/understanding/what-is-cancer

Zheng R, Sun K, Zhang S, Zeng H, Zou X, Chen R, Gu X, Wei W, He J. Report of cancer epidemiology in china, 2015. Zhonghua zhong liu za zhi. 2019;41(1):19–28.

Hegde PS, Chen DS. Top 10 challenges in cancer immunotherapy. Immunity. 2020;52(1):17–35.

Kourou K, Exarchos TP, Exarchos KP, Karamouzis MV, Fotiadis DI. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. 2015;13:8–17.

Iqbal MJ, Javed Z, Sadia H, Qureshi IA, Irshad A, Ahmed R, Malik K, Raza S, Abbas A, Pezzani R, et al. Clinical applications of artificial intelligence and machine learning in cancer diagnosis: looking into the future. Cancer Cell Int. 2021;21(1):1–11.

Loud JT, Murphy J. Cancer screening and early detection in the 21st century. Semin Oncol Nurs. 2017;33:121–8.

Goldberg Y, Levy O. word2vec explained: deriving mikolov et al.’s negative-sampling word-embedding method. 2014; arXiv preprint arXiv:1402.3722

Pennington J, Socher R, Manning CD. Glove: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014. p. 1532–43.

Bojanowski P, Grave E, Joulin A, Mikolov T. Enriching word vectors with subword information. Trans Assoc Comput Linguist. 2017;5:135–46.

Church KW. Word2vec. Natl Lang Eng. 2017;23(1):155–62.

Cancer. World Health Organization. https://www.who.int/news-room/fact-sheets/detail/cancer

Bade BC, Cruz CSD. Lung cancer 2020: epidemiology, etiology, and prevention. Clin Chest Med. 2020;41(1):1–24.

Barta JA, Powell CA, Wisnivesky JP. Global epidemiology of lung cancer. Ann Global Health. 2019;85:1.

de Carvalho Filho AO, Silva AC, de Paiva AC, Nunes RA, Gattass M. Classification of patterns of benignity and malignancy based on ct using topology-based phylogenetic diversity index and convolutional neural network. Pattern Recogn. 2018;81:200–12.

Rodrigues MB, Da Nobrega RVM, Alves SSA, Reboucas Filho PP, Duarte JBF, Sangaiah AK, De Albuquerque VHC. Health of things algorithms for malignancy level classification of lung nodules. IEEE Access. 2018;6:18592–601.

Asuntha A, Srinivasan A. Deep learning for lung cancer detection and classification. Multim Tools Appl. 2020;79(11):7731–62.

Shakeel PM, Tolba A, Al-Makhadmeh Z, Jaber MM. Automatic detection of lung cancer from biomedical data set using discrete adaboost optimized ensemble learning generalized neural networks. Neural Comput Appl. 2020;32(3):777–90.

Abdullah DM, Abdulazeez AM, Sallow AB. Lung cancer prediction and classification based on correlation selection method using machine learning techniques. Qubahan Acad J. 2021;1(2):141–9.

Ausawalaithong W, Thirach A, Marukatat S, Wilaiprasitporn T. Automatic lung cancer prediction from chest x-ray images using the deep learning approach. In: 2018 11th biomedical engineering international conference (BMEiCON). 2018; pp. 1–5. IEEE

Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2017; pp. 2097–106

Shiraishi J, Katsuragawa S, Ikezoe J, Matsumoto T, Kobayashi T, Komatsu K-I, Matsui M, Fujita H, Kodera Y, Doi K. Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists’ detection of pulmonary nodules. Am J Roentgenol. 2000;174(1):71–4.

Armato SG III, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, Zhao B, Aberle DR, Henschke CI, Hoffman EA, et al. The lung image database consortium (lidc) and image database resource initiative (idri): a completed reference database of lung nodules on ct scans. Med Phys. 2011;38(2):915–31.

Kaggle: Lung and Colon Cancer Histopathological Images. https://www.kaggle.com/andrewmvd/lung-and-colon-cancer-histopathological-images Accessed 16 July 2020.

Radhika P, Nair RA, Veena G. A comparative study of lung cancer detection using machine learning algorithms. In: 2019 IEEE international conference on electrical, computer and communication technologies (ICECCT). 2019; pp. 1–4. IEEE

Salaken SM, Khosravi A, Khatami A, Nahavandi S, Hosen MA. Lung cancer classification using deep learned features on low population dataset. In: 2017 IEEE 30th Canadian conference on electrical and computer engineering (CCECE). 2017; pp. 1–5. IEEE.

Bhattacharjee A, Richards WG, Staunton J, Li C, Monti S, Vasa P, Ladd C, Beheshti J, Bueno R, Gillette M, et al. Classification of human lung carcinomas by mrna expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci. 2001;98(24):13790–5.

Bhatia S, Sinha Y, Goel L. Lung cancer detection: a deep learning approach. In: Soft computing for problem solving. 2019; p. 699–705. Springer.

Shin H, Oh S, Hong S, Kang M, Kang D, Ji Y-G, Choi BH, Kang K-W, Jeong H, Park Y, et al. Early-stage lung cancer diagnosis by deep learning-based spectroscopic analysis of circulating exosomes. ACS Nano. 2020;14(5):5435–44.

Masud M, Sikder N, Nahid A-A, Bairagi AK, AlZain MA. A machine learning approach to diagnosing lung and colon cancer using a deep learning-based classification framework. Sensors. 2021;21(3):748.

Naseer I, Akram S, Masood T, Jaffar A, Khan MA, Mosavi A. Performance analysis of state-of-the-art cnn architectures for luna16. Sensors. 2022;22(12):4426.

Setio AAA, Traverso A, De Bel T, Berens MS, Van Den Bogaard C, Cerello P, Chen H, Dou Q, Fantacci ME, Geurts B, et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: the luna16 challenge. Med Image Anal. 2017;42:1–13.

Saba T. Recent advancement in cancer detection using machine learning: systematic survey of decades, comparisons and challenges. J Infect Pub Health. 2020;13(9):1274–89.

Sun Y-S, Zhao Z, Yang Z-N, Xu F, Lu H-J, Zhu Z-Y, Shi W, Jiang J, Yao P-P, Zhu H-P. Risk factors and preventions of breast cancer. Int J Biol Sci. 2017;13(11):1387.

Breast cancer. World Health Organization. https://www.who.int/news-room/fact-sheets/detail/breast-cancer

Kelsey JL, Gammon MD. The epidemiology of breast cancer. CA Cancer J Clin. 1991;41(3):146–65.

Harbeck N, Penault-Llorca F, Cortes J, Gnant M, Houssami N, Poortmans P, Ruddy K, Tsang J, Cardoso F. Breast cancer. Nat Rev Dis Prim. 2019;5(1):1–31.

Waks AG, Winer EP. Breast cancer treatment: a review. JAMA. 2019;321(3):288–300.

Tahmooresi M, Afshar A, Rad BB, Nowshath K, Bamiah M. Early detection of breast cancer using machine learning techniques. J Telecommun Electr Comput Eng. 2018;10(3):21–7.

Sharma S, Aggarwal A, Choudhury T. Breast cancer detection using machine learning algorithms. In: 2018 international conference on computational techniques, electronics and mechanical systems (CTEMS). 2018; p. 114–8 . IEEE.

VisualLab: A Methodology for Breast Disease Computer-Aided Diagnosis Using Dynamic Thermography. http://visual.ic.uff.br/en/proeng/thiagoelias/

Wolberg WH, Street WN, Mangasarian OL. Breast cancer wisconsin (diagnostic) data set. UCI machine learning repository. http://archive.ics.uci.edu/ml/ ; 1992.

Suckling JP. The mammographic image analysis society digital mammogram database. Digital Mammo. 1994; pp. 375–86.

Roy A. Deep convolutional neural networks for breast cancer detection. In: 2019 IEEE 10th annual ubiquitous computing, electronics & mobile communication conference (UEMCON). 2019; pp. 0169–71 . IEEE.

Mambou SJ, Maresova P, Krejcar O, Selamat A, Kuca K. Breast cancer detection using infrared thermal imaging and a deep learning model. Sensors. 2018;18(9):2799.

Sharma S, Mehra R. Conventional machine learning and deep learning approach for multi-classification of breast cancer histopathology images-a comparative insight. J Digit Imag. 2020;33(3):632–54.

Remya R, Rajini NH. Transfer learning based breast cancer detection and classification using mammogram images. In: 2022 international conference on electronics and renewable systems (ICEARS). 2022; pp. 1060–5 . IEEE.

Vaka AR, Soni B, Reddy S. Breast cancer detection by leveraging machine learning. ICT Express. 2020;6(4):320–4.

Khuriwal N, Mishra N. Breast cancer detection from histopathological images using deep learning. In: 2018 3rd international conference and workshops on recent advances and innovations in engineering (ICRAIE). 2018; pp. 1–4 . IEEE.

Agarap AFM. On breast cancer detection: an application of machine learning algorithms on the wisconsin diagnostic dataset. In: proceedings of the 2nd international conference on machine learning and soft computing. 2018; pp. 5–9.

Shen L, Margolies LR, Rothstein JH, Fluder E, McBride R, Sieh W. Deep learning to improve breast cancer detection on screening mammography. Sci Rep. 2019;9(1):1–12.

Sawyer Lee R, Gimenez F, Hoogi A, Rubin D. Curated Breast Imaging Subset of DDSM. The cancer imaging archive, 2016.

Moreira IC, Amaral I, Domingues I, Cardoso A, Cardoso MJ, Cardoso JS. Inbreast: toward a full-field digital mammographic database. Acad Radiol. 2012;19(2):236–48.

VRI: Breast Cancer Histopathological Database (BreakHis). https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/

Alanazi SA, Kamruzzaman M, Islam Sarker MN, Alruwaili M, Alhwaiti Y, Alshammari N, Siddiqi MH. Boosting breast cancer detection using convolutional neural network. J Healthc Eng 2021;2021.

Janowczyk A. Use case 6: invasive ductal carcinoma (IDC) segmentation. http://www.andrewjanowczyk.com/use-case-6-invasive-ductal-carcinoma-idc-segmentation/

Arooj S, et al. Breast cancer detection and classification empowered with transfer learning. Front Pub Health. 2022;10.

Nasir MU, Ghazal TM, Khan MA, Zubair M, Rahman A-u, Ahmed R, Hamadi HA, Yeun CY. Breast cancer prediction empowered with fine-tuning. Comput Intell Neurosci. 2022;2022.

Breast cancer patients mris. Kaggle. https://www.kaggle.com/uzairkhan45/breast-cancer-patients-mris

Khan MBS, Nawaz MS, Ahmed R, Khan MA, Mosavi A, et al. Intelligent breast cancer diagnostic system empowered by deep extreme gradient descent optimization. Mathem Biosci Eng. 2022;19(8):7978–8002.

What is Prostate Cancer. UCLA Health. https://www.uclahealth.org/urology/prostate-cancer/what-is-prostate-cancer

Desai MM, Cacciamani GE, Gill K, Zhang J, Liu L, Abreu A, Gill IS. Trends in incidence of metastatic prostate cancer in the us. JAMA Netw Open. 2022;5(3):222246.

Cackowski FC, Heath EI. Prostate cancer dormancy and recurrence. Cancer Lett. 2022;524:103–8.

Abbasi AA, Hussain L, Awan IA, Abbasi I, Majid A, Nadeem MSA, Chaudhary Q-A. Detecting prostate cancer using deep learning convolution neural network with transfer learning approach. Cogn Neurodyn. 2020;14(4):523–33.

Hussain L, Ahmed A, Saeed S, Rathore S, Awan IA, Shah SA, Majid A, Idris A, Awan AA. Prostate cancer detection using machine learning techniques by employing combination of features extracting strategies. Cancer Biomark. 2018;21(2):393–413.

Hussain L, et al. Detecting brain tumor using machines learning techniques based on different features extracting strategies. Curr Med Imag. 2019;15(6):595–606.

Hassan MR, Islam MF, Uddin MZ, Ghoshal G, Hassan MM, Huda S, Fortino G. Prostate cancer classification from ultrasound and mri images using deep learning based explainable artificial intelligence. Fut Gener Comput Syst. 2022;127:462–72.

Iqbal S, Siddiqui GF, Rehman A, Hussain L, Saba T, Tariq U, Abbasi AA. Prostate cancer detection using deep learning and traditional techniques. IEEE Access. 2021;9:27085–100.

Feng Y, Yang F, Zhou X, Guo Y, Tang F, Ren F, Guo J, Ji S. A deep learning approach for targeted contrast-enhanced ultrasound based prostate cancer detection. IEEE/ACM transactions on computational biology and bioinformatics. 2018;16(6):1794–801.

Reda I, Khalil A, Elmogy M, Abou El-Fetouh A, Shalaby A, Abou El-Ghar M, Elmaghraby A, Ghazal M, El-Baz A. Deep learning role in early diagnosis of prostate cancer. Technol Cancer Res Treat. 2018;17:1533034618775530.

Barlow H, Mao S, Khushi M. Predicting high-risk prostate cancer using machine learning methods. Data. 2019;4(3):129.

Yoo S, Gujrathi I, Haider MA, Khalvati F. Prostate cancer detection using deep convolutional neural networks. Sci Rep. 2019;9(1):1–10.

Tolkach Y, Dohmgörgen T, Toma M, Kristiansen G. High-accuracy prostate cancer pathology using deep learning. Nat Mach Intell. 2020;2(7):411–8.

Genomic Data Commons Data Portal. National Cancer Institute (NIH) GDC Data Portal. http://portal.gdc.cancer.gov

Zenodo. Zenodo. https://zenodo.org/deposit/3825933

Hosseinzadeh M, Saha A, Brand P, Slootweg I, de Rooij M, Huisman H. Deep learning–assisted prostate cancer detection on bi-parametric mri: minimum training data size requirements and effect of prior knowledge. Eur Radiol. 2021; 1–11.

Natarajan S, Priester A, Margolis D, Huang J, Marks L. Prostate mri and ultrasound with pathology and coordinates of tracked biopsy (prostate-mri-us-biopsy). 2020.

Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, et al. The cancer imaging archive (tcia): maintaining and operating a public information repository. J Dig Imaging. 2013;26(6):1045–57.

Sonn GA, Natarajan S, Margolis DJ, MacAiran M, Lieu P, Huang J, Dorey FJ, Marks LS. Targeted biopsy in the detection of prostate cancer using an office based magnetic resonance ultrasound fusion device. J Urol. 2013;189(1):86–92.

Tsuneki M, Abe M, Kanavati F. A deep learning model for prostate adenocarcinoma classification in needle biopsy whole-slide images using transfer learning. Diagnostics. 2022;12(3):768.

Otsu N. A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern. 1979;9(1):62–6.

What Is Colorectal Cancer? American Cancer Society. https://www.cancer.org/cancer/colon-rectal-cancer/about/what-is-colorectal-cancer.html

Center MM, Jemal A, Smith RA, Ward E. Worldwide variations in colorectal cancer. CA Cancer J Clin. 2009;59(6):366–78.

Weitz J, Koch M, Debus J, Höhler T, Galle PR, Büchler MW. Colorectal cancer. Lancet. 2005;365(9454):153–65.

Ho C, Zhao Z, Chen XF, Sauer J, Saraf SA, Jialdasani R, Taghipour K, Sathe A, Khor L-Y, Lim K-H, et al. A promising deep learning-assistive algorithm for histopathological screening of colorectal cancer. Sci Rep. 2022;12(1):1–9.

Bychkov D, Linder N, Turkki R, Nordling S, Kovanen PE, Verrill C, Walliander M, Lundin M, Haglund C, Lundin J. Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci Rep. 2018;8(1):1–11.

Damkliang K, Wongsirichot T, Thongsuksai P. Tissue classification for colorectal cancer utilizing techniques of deep learning and machine learning. Biomed Eng Appl Basis Commun. 2021;33(03):2150022.

Brockmoeller S, Echle A, Ghaffari Laleh N, Eiholm S, Malmstrøm ML, Plato Kuhlmann T, Levic K, Grabsch HI, West NP, Saldanha OL, et al. Deep learning identifies inflamed fat as a risk factor for lymph node metastasis in early colorectal cancer. J Pathol. 2022;256(3):269–81.

Yamashita R, Long J, Longacre T, Peng L, Berry G, Martin B, Higgins J, Rubin DL, Shen J. Deep learning model for the prediction of microsatellite instability in colorectal cancer: a diagnostic study. Lancet Oncol. 2021;22(1):132–41.

Zhou D, Tian F, Tian X, Sun L, Huang X, Zhao F, Zhou N, Chen Z, Zhang Q, Yang M, et al. Diagnostic evaluation of a deep learning model for optical diagnosis of colorectal cancer. Nat Commun. 2020;11(1):1–9.

Wang Y-H, Nguyen PA, Islam MM, Li Y-C, Yang H-C, et al. Development of deep learning algorithm for detection of colorectal cancer in ehr data. In: MedInfo. 2019; pp. 438–41

Echle A, Grabsch HI, Quirke P, van den Brandt PA, West NP, Hutchins GG, Heij LR, Tan X, Richman SD, Krause J, et al. Clinical-grade detection of microsatellite instability in colorectal tumors by deep learning. Gastroenterology. 2020;159(4):1406–16.

Macenko M, et al. A method for normalizing histology slides for quantitative analysis. In: 2009 IEEE international symposium on biomedical imaging: from Nano to Macro, pp. 1107–10 (2009). IEEE.

Amitay EL, Carr PR, Jansen L, Walter V, Roth W, Herpel E, Kloor M, Bläker H, Chang-Claude J, Brenner H, et al. Association of aspirin and nonsteroidal anti-inflammatory drugs with colorectal cancer risk by molecular subtypes. JNCI J Natl Cancer Inst. 2019;111(5):475–83.

Group QC, et al. Adjuvant chemotherapy versus observation in patients with colorectal cancer: a randomised study. Lancet. 2007;370(9604):2020–9.

van den Brandt PA, Goldbohm RA, Veer PV, Volovics A, Hermus RJ, Sturmans F. A large-scale prospective cohort study on diet and cancer in the netherlands. Journal of clinical epidemiology. 1990;43(3):285–95.

Taylor J, Wright P, Rossington H, Mara J, Glover A, West N, Morris E, Quirke P. Regional multidisciplinary team intervention programme to improve colorectal cancer outcomes: study protocol for the yorkshire cancer research bowel cancer improvement programme (ycr bcip). BMJ Open. 2019;9(11): 030618.

Histological images for MSI vs. MSS classification in gastrointestinal cancer, FFPE samples. Zenodo. https://zenodo.org/record/2530835#.Ypib9C8RpQI

Sarwinda D, Paradisa RH, Bustamam A, Anggia P. Deep learning in image classification using residual network (resnet) variants for detection of colorectal cancer. Proc Comput Sci. 2021;179:423–31.

Tissue Image Analytics (TIA) Centre. warwick. https://warwick.ac.uk/fac/cross_fac/tia/data/glascontest/download

Lorenzovici N, Dulf E-H, Mocan T, Mocan L. Artificial intelligence in colorectal cancer diagnosis using clinical data: non-invasive approach. Diagnostics. 2021;11(3):514.

Kather JN, Weis C-A, Bianconi F, Melchers SM, Schad LR, Gaiser T, Marx A, Zöllner FG. Multi-class texture analysis in colorectal cancer histology. Sci Rep. 2016;6(1):1–11.

Muti H, Loeffler C, Echle A, Heij L, Buelow R, Krause J, et al. The aachen protocol for deep learning histopathology: a hands-on guide for data preprocessing. Zenodo Aachen. 2020;10

Poulos RC, Perera D, Packham D, Shah A, Janitz C, Pimanda JE, Hawkins N, Ward RL, Hesson LB, Wong JW. Scarcity of recurrent regulatory driver mutations in colorectal cancer revealed by targeted deep sequencing. JNCI Cancer spectr. 2019;3(2):012.

Reimers N, Gurevych I. Sentence-bert: Sentence embeddings using siamese bert-networks. 2019; arXiv preprint arXiv:1908.10084

Devlin J, Chang M-W, Lee K, Toutanova K. Bert: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

Liu Y, Ott M, Goyal N, Du J, Joshi M, Chen D, Levy O, Lewis M, Zettlemoyer L, Stoyanov V. Roberta: a robustly optimized bert pretraining approach. 2019; arXiv preprint arXiv:1907.11692

Bowman, S.R., Angeli, G., Potts, C., Manning, C.D.: A large annotated corpus for learning natural language inference. arXiv preprint arXiv:1508.05326 (2015)

Williams A, Nangia N, Bowman SR. A broad-coverage challenge corpus for sentence understanding through inference. arXiv preprint arXiv:1704.05426 (2017)

Gao T, Yao X, Chen D. Simcse: simple contrastive learning of sentence embeddings. 2021;arXiv preprint arXiv:2104.08821

Wu, Z., Wang, S., Gu, J., Khabsa, M., Sun, F., Ma, H.: Clear: Contrastive learning for sentence representation. arXiv preprint arXiv:2012.15466 (2020)

Meng Y, Xiong C, Bajaj P, Bennett P, Han J, Song X, et al. Coco-lm: correcting and contrasting text sequences for language model pretraining. Advances in Neural Information Processing Systems. 2021;34

Hartigan JA, Wong MA. Algorithm as 136: a k-means clustering algorithm. J R Stat Soc. 1979;28(1):100–8.

Chen T, Guestrin C. Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. 2016; pp. 785–94

Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu T-Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst. 2017;30

Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.

Albawi S, Mohammed TA, Al-Zawi S. Understanding of a convolutional neural network. In: 2017 International conference on engineering and technology (ICET). 2017; pp. 1–6. IEEE

O’Shea K, Nash R. An introduction to convolutional neural networks. 2015; arXiv preprint arXiv:1511.08458

Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30.

Che H, Jatsenko T, Lenaerts L, Dehaspe L, Vancoillie L, Brison N, Parijs I, Van Den Bogaert K, Fischerova D, Heremans R, et al. Pan-cancer detection and typing by mining patterns in large genome-wide cell-free dna sequencing datasets. Clin Chem. 2022;68(9):1164–76.

Li J, Wei L, Zhang X, Zhang W, Wang H, Zhong B, Xie Z, Lv H, Wang X. Dismir: D eep learning-based noninvasive cancer detection by i ntegrating dna s equence and methylation information of i ndividual cell-free dna r eads. Brief Bioinf. 2021;22(6):250.

Nguyen L, Van Hoeck A, Cuppen E. Machine learning-based tissue of origin classification for cancer of unknown primary diagnostics using genome-wide mutation features. Nat Commun. 2022;13(1):4013.

Download references

Acknowledgements

The authors would like to thank the DAC for MCO colorectal cancer genomics at The University of New South Wales, for providing the data used in the study. The authors would also like to thank Prof. Jason Wong, for facilitating the data access requests and approvals.

The work reported herein was made possible through funding by the South African Medical Research Council (SAMRC) through its Division of Research Capacity Development under the Internship Scholarship Program from funding received from the South African National Treasury. The content hereof is the sole responsibility of the authors and does not necessarily represent the official views of the SAMRC or the funders.

Author information

Authors and affiliations.

Department of Computer Science, University of Pretoria, Pretoria, South Africa

Mpho Mokoatle & Vukosi Marivate

CapeBio™ Technologies, Centurion, South Africa

Darlington Mapiye

School of Medical Sciences, The University of Sydney, Sydney, Australia

Vanessa M. Hayes

School of Health Systems and Public Health, University of Pretoria, Pretoria, South Africa

Riana Bornman & Vanessa M. Hayes


Contributions

MM conceptualised the work and wrote the main manuscript. VM and DM co-supervised the work and validated the results of the experiments reported in the paper. RB and VMH provided expert advice on the topic and reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mpho Mokoatle.

Ethics declarations

Ethics approval and consent to participate.

Ethics approval was granted by the University of Pretoria EBIT Research Ethics Committee (EBIT/139/2020). Data approval was granted by the DAC for MCO colorectal cancer genomics at UNSW.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Appendix A.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Mokoatle, M., Marivate, V., Mapiye, D. et al. A review and comparative study of cancer detection using machine learning: SBERT and SimCSE application. BMC Bioinformatics 24, 112 (2023). https://doi.org/10.1186/s12859-023-05235-x


Received: 28 November 2022

Accepted: 17 March 2023

Published: 23 March 2023

DOI: https://doi.org/10.1186/s12859-023-05235-x


  • Cancer detection
  • Machine learning
  • SentenceBert

BMC Bioinformatics

ISSN: 1471-2105


A Systematic Review of Artificial Intelligence Techniques in Cancer Prediction and Diagnosis

Yogesh Kumar, Surbhi Gupta, Ruchi Singla.


Received 2021 May 23; Accepted 2021 Sep 11; Issue date 2022.

This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.

Artificial intelligence (AI) has aided the advancement of healthcare research. The availability of open-source healthcare data has prompted researchers to create applications that aid cancer detection and prognosis. Deep learning and machine learning models provide a reliable, rapid, and effective way to deal with such challenging diseases. PRISMA guidelines were used to select articles indexed in Web of Science, EBSCO, and EMBASE and published between 2009 and 2021. In this study, we performed an efficient search and included research articles that employed AI-based learning approaches for cancer prediction. A total of 185 papers were considered impactful for cancer prediction using conventional machine learning and deep learning-based classification. In addition, the survey discusses the work done by different researchers, highlights the limitations of the existing literature, and compares studies using parameters such as prediction rate, accuracy, sensitivity, specificity, Dice score, detection rate, area under the curve, precision, recall, and F1-score. Five investigations were designed, and answers to them were explored. Although multiple techniques recommended in the literature have achieved strong prediction results, cancer mortality has not yet been reduced. Thus, more extensive research is required to address the challenges in cancer prediction.

Introduction

The word cancer comes from the ancient Greek karkinos (καρκίνος), which means both crab and tumor. Cancer was introduced to the medical world in the 1600s and is associated with abnormally growing cells that can invade or spread to other parts of the body [136]. The uncontrolled growth of cells starts at one site in the human body and can spread to other body parts, a process known as metastasis [43, 172]. Cancer cells are categorized as benign or malignant. Benign cells do not spread to other parts of the body, while malignant cells metastasize and are considered more destructive. Because of cancer's high mortality and recurrence rates, treatment is long and costly, so accurate early diagnosis is needed to improve patients' survival rates. Cancer is a genetic disease triggered by mutations in the genes that control how cells function, especially how they grow and divide. As tumor cells continue to grow, additional changes occur. In short, cancer cells carry more genetic changes, such as DNA mutations, than normal cells [110, 116]. Although the immune system generally removes damaged or abnormal cells from the body, some cancer cells can hide from it. Tumors can also use the immune system to grow and stay alive [179]. A cancer is named after the site where the tumor cells originate; for example, cancer that arises in the lungs and spreads to the liver is still called lung cancer. Cancer diagnosis involves three prediction tasks: cancer risk assessment, cancer recurrence prediction, and cancer survivability prediction. First, the probability of cancer occurrence is assessed; second, cancer recurrence is predicted; finally, aspects such as progression, life expectancy, tumor-drug sensitivity, and survivability are predicted [95].

The motivation behind this research is the rapid growth in cancer incidence and mortality worldwide [10]. The reasons are complex but reflect both the aging and growth of the population and changes in the prevalence and distribution of the main risk factors for cancer. Figure 1 depicts the cancer incidence and death statistics reported by the American Cancer Society and other reliable sources.

Fig. 1

Estimated number of new cases and deaths in 2020 for common cancer types (www.cancer.net)

Multiple investigations have been carried out in cancer research. For example, Rong et al. [142] conducted a mortality and survival study by gender. Dolatkhah et al. [49] presented a study of survival data and trend analysis for breast cancer in Iran. Goodarzi et al. [65] presented an assessment based on distinct cross-sectional cancer studies. Azamjah et al. [13] aimed to determine the 25-year breast cancer mortality rate in seven super regions defined by the Institute for Health Metrics and Evaluation (IHME). Momenimovahed et al. [115] found that breast cancer incidence varies significantly with race and ethnicity and is higher in developed countries. Haggar et al. [66] examined the incidence, mortality, and survival rates for colorectal cancer, with attention paid to regional variations and changes over time. Zhang et al. [184] gathered colorectal cancer (CRC) incidence data from Cancer Incidence in Five Continents. Wong et al. [174] observed a positive correlation between incidence and country-specific socio-economic development. Nguyen et al. [124] summarized the diagnosis and treatment of thyroid cancer, with recommendations from the American Thyroid Association regarding thyroid nodules and differentiated thyroid cancer. Lee et al. [176] analyzed 800 patients with a diagnosis of cancer and symptomatic COVID-19 between March 18 and April 26, 2020: 412 (52%) patients had a mild COVID-19 disease course, 226 (28%) patients died, and the risk of death was significantly associated with advancing patient age. Al-Zhou et al. [6] evaluated the demographic characteristics and histological trends of skin cancer in southern areas of Yemen.

Artificial intelligence (AI) is one of the exceptional achievements of computer science, conceived around the 1940s [5, 130]. AI has marked its significance in advanced clinical diagnostics by providing unique opportunities to incorporate its tools into healthcare [4, 131]. AI aims to analyze the associations between treatment techniques and patient outcomes. In cancer research, AI has shown its potential to affect several facets of cancer therapy, improve the accuracy and speed of diagnosis, and support more reliable clinical decisions, leading to better health outcomes [182, 183]. AI can reach a cancer prediction accuracy higher than that of a general statistical expert [152, 180]. Thus, AI-based cancer detection models can assist health centers and help medical experts confirm their diagnoses. Hence, this article aims to highlight the contributions made by researchers applying artificial intelligence techniques to the early detection and diagnosis of cancer.

Contribution and Organization of Paper

We conducted an extensive survey of the conventional machine learning and deep learning models proposed in cancer research. The paper presents a comparative analysis of existing research on AI-based techniques, medical imaging, and automated analysis for cancer diagnosis. Most of the techniques proposed in the surveyed papers were based on deep learning frameworks and provided appreciable prediction outcomes. The paper describes cancer complications and clinical applications, cancer classification using AI-based techniques, the role of deep learning in cancer research, the limitations of cancer prediction using automated learning, multiple investigations, and the challenges of cancer research using AI-based techniques.

The rest of the paper is organized as follows. Section 2 describes the research methodology and the approach used for selecting the literature. Section 3 highlights cancer complications and clinical applications. Section 4 presents the reported work, covering the deep learning perspective in cancer, and provides a comparative analysis that includes the challenges of current work and a performance evaluation using various parameters. Section 5 provides a thorough discussion of all the investigations. Section 6 concludes the paper and discusses future directions.

Research Methodology

We conducted this systematic review under the PRISMA guidelines [40]. We searched three electronic databases for research articles: Web of Science, EBSCO, and EMBASE. These are all openly available web indexes that list the full content or metadata of academic writings. The articles were selected using the query ((Artificial Intelligence) or (Cancer Diagnosis) or (Early Detection) or (Machine Learning) or (Deep Learning)). The inclusion and exclusion criteria used to select the articles are discussed in Sect. 2.1. Figure 2 presents the PRISMA flowchart depicting the detailed screening of the collected papers.

Fig. 2

PRISMA flow chart

Articles published from 2009 to April 2021 were included in this study. In total, 350 studies were identified, and 275 remained after duplicates were removed. Subsequently, 210 papers were retained after excluding studies that focused on diseases other than cancer, dealt with treatment and surgery, or were written in a language other than English. The full texts were then evaluated, and articles that used methods other than AI-based techniques were excluded from further analysis. Finally, the 185 selected articles were analyzed in the study.
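
The screening counts above can be checked with a few lines of Python. This is a minimal sketch: the stage labels are shorthand introduced here for illustration, and only the four totals come from the text.

```python
# Study counts reported in the text at each screening stage.
# The stage labels are illustrative shorthand, not the authors' terms.
stages = [
    ("identified", 350),
    ("after duplicate removal", 275),
    ("after topic and language screening", 210),
    ("included in the review", 185),
]


def exclusions(stages):
    """Return (stage label, number of studies dropped on reaching it) pairs."""
    return [
        (label, prev_count - count)
        for (_, prev_count), (label, count) in zip(stages, stages[1:])
    ]


excluded = exclusions(stages)
for label, dropped in excluded:
    print(f"{label}: {dropped} studies dropped")

# Total dropped across all stages, from first count to last.
print("total dropped:", sum(dropped for _, dropped in excluded))
```

Under these figures, 75 duplicates were removed, 65 studies were dropped at screening, and 25 more were dropped at full-text evaluation, which accounts for the 165 studies between the initial 350 and the final 185.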

Investigations

Investigation 1: Which learning approach has most consistently provided appreciable prediction outcomes?

Investigation 2: Which cancer site and type of training data have been explored most extensively?

Investigation 3: In which year were most of the cancer prediction studies published?

Investigation 4: Which types of images have attained the highest prediction accuracy?

Investigation 5: What challenges do researchers face in constructing AI-based prediction models?

Cancer Complications and Clinical Applications

The DNA inside a cell is packaged into a vast number of individual genes and carries the instructions that direct the cell's functions [15]. DNA mutations are the cause of cancer development: errors in this multistage process disrupt the original functioning of cells, which ultimately turn cancerous [104, 185].

Figure 3 shows different factors that affect the spread of cancers. Tobacco, alcohol, improper diet, and physical inactivity are the leading cancer risk factors worldwide. Some chronic infections are also risk factors for cancer and are of major significance in low- and middle-income countries.

Fig. 3

Causes of cancers [ 26 ]

Cancer Complications

Patients undergoing cancer treatment can experience many complications that affect their health. Not all cancers are painful, but patients may still experience some pain during treatment; medications and other approaches can help treat cancer-related pain [129, 184]. Patients with cancer can experience fatigue among many other symptoms, but it is usually manageable [3]. Tiredness caused by radiation therapy or chemotherapy is generally short-term. Difficulty breathing is another complication of cancer or its treatment [120], though treatments may bring relief. Some types of cancer and cancer treatment can lead to nausea [34]. Cancerous cells deprive normal cells of required nutrients, which may ultimately cause weight loss. In most cases, providing nutrients artificially through tubes into a vein or the stomach does not reverse the weight loss [21, 169]. Cancer can also cause severe complications by upsetting the normal chemical balance of the body. Frequent urination, confusion, excessive thirst, and constipation may be signs and symptoms of chemical imbalances [46]. In some instances, the body's immune system reacts to cancer by attacking normal, healthy cells. This very uncommon reaction, called paraneoplastic syndrome, can bring on several signs and symptoms, such as difficulty walking and seizures [7]. Cancer can severely affect the functioning of a body part by pressing on nearby nerves. If it involves the brain, it can cause headaches, stroke-like signs and symptoms, and weakness on one side of the body [47]. Even after successful treatment, cancer survivors remain at risk of recurrence [36], so patients need to ask their doctors about precautions.

Clinical Applications

Doctors can develop a follow-up plan consisting of scans and examinations at regular intervals (over months or years) after the patient's treatment. In radiation treatment, cancerous cells are targeted [30, 54]. A significant fraction of cancer cases and deaths could be prevented through a sound epidemiological and mechanistic understanding of environmental and behavioral risk factors. Cancer therapeutics currently have the lowest clinical trial success rate of any major disease. Because of the scarcity of successful anti-cancer drugs, cancer is set to become the leading cause of mortality in developed countries. As a disease embedded in the fundamentals of our biology, cancer presents difficult challenges that would benefit from bringing together experts from a wide cross-section of related and unrelated fields [55]. Alongside causes, there are factors for identifying the initial stage of cancer. Diagnosing cancer at an early stage leads to higher survival rates, less morbidity, and less expensive treatment [27]. Three essential steps need to be taken in a timely way:

Awareness and precaution

Clinical evaluation, diagnosis, and staging

Access to treatment

Early diagnosis is highly relevant in every setting and for most cancers. Programs can be designed to reduce delays in, and barriers to, care so that patients receive treatment in good time [31].

Current methodologies applied in the medical sector for cancer prediction

This section describes the clinical practices currently applied in the medical sector for cancer prediction. The methodologies are as follows:

Screening: Screening aims to identify people with a particular cancer or pre-cancer who have not yet developed symptoms and to refer them promptly for diagnosis and treatment. For specific types of cancer, screening can be effective when tests are used according to need and stage [149]. Screening is, however, a more complicated process than early diagnosis. An accurate diagnosis is of the utmost importance [10], mainly because every type of cancer needs a specific treatment regimen that includes one or more modalities, such as chemotherapy, surgery, and radiotherapy [16]. The main aim is to treat the tumor and significantly extend lifespan; improving the patient's quality of life is also an important goal [28].

  • Hormone-level therapy: Hormone-level therapy works by changing how certain hormones act in the body. Hormones play a substantial role in prostate and breast cancers [53].
  • Immunotherapy: Immunotherapy aims to strengthen the body's immune system to fight cancerous cells. Checkpoint inhibitors and adoptive cell transfer are examples of immunotherapy [150].
  • Personalized medication: Personalized medication is a newly developed approach that uses genetic testing to determine a suitable treatment for a specific cancer. However, it has yet to be shown whether personalized medication can treat all kinds of cancer [24].
  • Radiation treatment: Radiation therapy kills cancerous cells or slows their growth by damaging their DNA. Medical experts often recommend this treatment to shrink tumors or reduce cancer symptoms before surgery [89].
  • Stem cell transplant: Stem cell transplant is helpful for blood-related cancers, such as leukemia or lymphoma. The process involves removing cells, such as red blood cells (RBCs) and white blood cells (WBCs), that have been destroyed by chemotherapy [34].
  • Surgery: Surgery is primarily performed when a person has a cancerous tumor. It is also used to limit the spread of the disease by removing lymph nodes [48].
  • Targeted therapies: Targeted therapies are used to prevent the spread of cancer and to boost immunity. Small-molecule drugs and monoclonal antibodies are examples of targeted therapies [90].

Related Work

Over the last couple of years, artificial intelligence has captured society's imagination and generated interest in its potential to improve our lives [91]. The use of AI is now increasing rapidly to improve disease recognition, disease management, and the development of therapies. The growing number of patients diagnosed with cancer and the large amount of data gathered during treatment [77, 119] create a need for AI to improve oncologic care. Cancer prediction can reduce the mortality rate [57, 118]. This section covers cancer diagnosis based on deep learning methods, medical imaging for cancer, mortality rates for different cancers, cancer datasets, and automated and semi-automated methods for cancer detection.

Artificial Intelligence in Medical Imaging for Cancers Diagnosis

In clinical imaging, computer-aided detection (CADe) and computer-aided diagnosis (CADx) are computer-based systems that help specialists make decisions quickly [70]. Medical imaging yields image data that clinicians and specialists must assess for abnormalities within a limited time [182, 183]. Medical images processed with AI techniques can improve accuracy across cancer stages [121]. Medical imaging is therefore a robust method for early cancer diagnosis and detection. Indeed, it has been widely used for early cancer detection, monitoring, and follow-up after treatment [44, 101, 102].

Figure 4 shows different kinds of scans used for cancer diagnosis. A computed tomography (CT) scan can help doctors diagnose cancer and determine the shape and size of a tumor. Nuclear medicine scans can help medical experts detect cancer metastasis. The most common nuclear scans are bone scans, PET (positron emission tomography) scans, thyroid scans, MUGA (multigated acquisition) scans, and gallium scans. MRI helps specialists find cancer in the body and look for signs that it has spread. X-rays can also help specialists plan cancer treatment, such as surgery or radiation, and mammograms are low-dose X-rays that can help detect breast cancer. Cancer detection usually includes radiological imaging that examines the extent of the cancer and the improvement after treatment. Oncological imaging is steadily becoming more wide-ranging and precise [95]. Suberi et al. [162] proposed an image-based computer-aided system for cancer immunotherapy. The proposed approach improved the preparation of vaccines for dendritic cell (DC) immunotherapy. The study incorporated various image-based algorithms into the system with low computational time.

Fig. 4

Types of imaging for cancer test

Nirupama and Damodhar [126] predicted lung cancer using MRI scans (DICOM images). Win et al. [171] developed a computer-aided decision system to detect cancer cells in cytological pleural effusion images. First, median filtering and intensity adjustment were applied to enhance image quality. They then used a hybrid segmentation method based on simple linear iterative clustering and K-means clustering to extract cell nuclei. In the K-means clustering algorithm, the error of each data point is computed as the Euclidean distance between the data point and its nearest centroid, as shown in Eq. (1), and the total sum of the squared errors is then computed.

In Eq. (1), $D$, $m$, and $n$ represent the objective function, the number of clusters, and the number of cases, respectively. Also, $x_j^{(i)}$ represents the $j$-th case of the $i$-th cluster, and $c_i$ is the centroid of the $i$-th cluster. Another distance metric used in K-means clustering is cosine similarity, expressed mathematically in Eq. (2).
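
Eq. (1) and Eq. (2) are not reproduced in this text. From the definitions above, Eq. (1) is presumably the usual K-means objective, $D = \sum_{i=1}^{m} \sum_{j=1}^{n} \lVert x_j^{(i)} - c_i \rVert^2$, and cosine similarity is conventionally $(a \cdot b) / (\lVert a \rVert \, \lVert b \rVert)$. The following Python sketch computes both under those assumptions; the function names and the toy points are illustrative and are not taken from the cited study.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sse(clusters, centroids):
    """Sum of squared Euclidean distances from each point to its centroid."""
    total = 0.0
    for points, c in zip(clusters, centroids):
        for x in points:
            total += sum((xk - ck) ** 2 for xk, ck in zip(x, c))
    return total


def cosine_similarity(a, b):
    """(a . b) / (||a|| * ||b||); assumes neither vector is all zeros."""
    dot = sum(ak * bk for ak, bk in zip(a, b))
    norm_a = math.sqrt(sum(ak * ak for ak in a))
    norm_b = math.sqrt(sum(bk * bk for bk in b))
    return dot / (norm_a * norm_b)


# Toy data, made up for illustration: two clusters of 2-D points.
clusters = [[(1.0, 1.0), (3.0, 1.0)], [(8.0, 8.0), (8.0, 10.0)]]
centroids = [centroid(points) for points in clusters]

print("centroids:", centroids)
print("objective D:", sse(clusters, centroids))
print("cosine similarity:", cosine_similarity((1.0, 0.0), (1.0, 1.0)))
```

For this toy data the centroids are (2, 1) and (8, 9), and the objective evaluates to 4.0. Because cosine similarity grows as vectors align, implementations that cluster with it commonly minimise one minus the similarity instead.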

In Eq. (2), $\lVert a \rVert$ and $\lVert b \rVert$ are the Euclidean norms of vector $a$ and vector $b$, respectively. Rosalidar et al. [140] studied the asymmetrical thermal distribution on breast thermograms using computer-assisted technology. The reported work showed that current neural learning models have increased the classification accuracy of breast cancer thermograms. Taher et al. [165] worked on a CAD system to diagnose lung cancer. They used a database of 100 sputum color images from different patients, collected from the Tokyo Centre of lung cancer. The new CAD system processed the sputum images and classified the cells as benign or cancerous. The study also observed that Bayesian classification outperformed rule-based heuristic classification. The Bayesian algorithm works by computing posterior probabilities, as shown in Eq. (3).
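
Eq. (3) is not reproduced in this text; it is presumably Bayes' rule, $f(c \mid x) = f(x \mid c)\, f(c) / f(x)$. A minimal Python sketch of that computation for a two-class problem follows. The class names, priors, and likelihoods are made-up values for illustration and do not come from Taher et al. [165].

```python
def posterior(priors, likelihoods):
    """Return f(c|x) for every class c via Bayes' rule.

    priors:      {class: f(c)}
    likelihoods: {class: f(x|c)} for one observed predictor value x
    """
    # f(x) = sum over classes of f(x|c) * f(c)  (law of total probability)
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


# Hypothetical numbers, for illustration only.
priors = {"benign": 0.7, "cancerous": 0.3}
likelihoods = {"benign": 0.1, "cancerous": 0.6}  # f(x|c) for some feature x

post = posterior(priors, likelihoods)
predicted = max(post, key=post.get)  # class with the highest posterior

print(post)
print("predicted class:", predicted)
```

With these illustrative values the posterior is roughly 0.28 for the benign class and 0.72 for the cancerous class, so the observation is assigned to the cancerous class even though its prior is lower.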

$$f(c \mid x) = \frac{f(x \mid c)\, f(c)}{f(x)} \qquad (3)$$

In Eq. (3), $f(c)$ and $f(x)$ are the prior probabilities of the class and the predictor, respectively. Also, $f(c \mid x)$ denotes the posterior probability of the target ($c$) given the predictor ($x$), and $f(x \mid c)$ denotes the probability of $x$ given $c$.

Naeem et al. [117] introduced machine learning (ML) strategies for liver cancer classification using a fused dataset of two-dimensional (2D) computed tomography (CT) and magnetic resonance imaging (MRI) scans. Combining the MRI and CT scan datasets produced a fused, optimized hybrid-feature dataset. Among all the classifiers deployed, the multilayer perceptron (MLP) showed a promising accuracy of 99%. Kalaiselvi et al. [80] proposed a fuzzy c-means method to automatically detect brain tumors in T2-weighted MRI brain images using the principle of modified minimum error thresholding (MET). Lee et al. [99] covered the most common cancer types, in particular breast cancer, prostate cancer, lung cancer, and skin cancer. Their proposed distributed computing framework encourages researchers to build on existing work in image-based cancer analysis and to develop a more flexible CAD system for detection. The study in [87] introduced an edge technique for segmenting mammographic images to identify breast cancer in its early stages. The study in [127] evaluated a computer-aided diagnosis (CADx) system for lung nodule classification. This retrospective study combined hand-crafted imaging features with machine learning algorithms and compared a support vector machine (SVM) with gradient tree boosting (XGBoost). Gradient boosting classifiers work by first computing the error contributed by each misclassified instance, as shown in Eq. (4), and then increasing the weight of the misclassified instances in the next layer, as shown in Eq. (5).

Here, E denotes the error, w is the weight associated with each instance, m is the size of the dataset, and p denotes the number of weak learners. The hypothesis ħ s m for each of the s instances is evaluated under the condition function C. The weight update formula is given in Eq. (5).
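The reweighting idea described here, where misclassified instances receive a larger weight in the next round, can be illustrated with the standard AdaBoost update. This is an assumption made for illustration: it is the textbook AdaBoost rule rather than a reproduction of Eqs. (4) and (5), and gradient tree boosting as implemented in XGBoost fits new trees to gradients of a loss instead of reweighting instances. All names and values are hypothetical.

```python
import math


def update_weights(weights, misclassified):
    """One AdaBoost-style round: compute the weighted error of a weak
    learner, then raise the weights of the instances it got wrong."""
    total = sum(weights)
    error = sum(w for w, miss in zip(weights, misclassified) if miss) / total
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    # Learner importance: larger when the weighted error is smaller.
    alpha = 0.5 * math.log((1.0 - error) / error)
    updated = [
        w * math.exp(alpha if miss else -alpha)
        for w, miss in zip(weights, misclassified)
    ]
    norm = sum(updated)  # renormalize so the weights sum to 1
    return error, alpha, [w / norm for w in updated]


# Four equally weighted instances; the weak learner misclassifies the third.
weights = [0.25, 0.25, 0.25, 0.25]
misclassified = [False, False, True, False]
error, alpha, new_weights = update_weights(weights, misclassified)
```

After the update, the single misclassified instance carries half of the total weight, so the next weak learner is pushed to classify it correctly.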

Deep learning methods for cancer detection

Deep learning is a subfield of machine learning, which in turn falls under artificial intelligence. It is a technique that learns features directly from data such as text, images, or sound, and it is one of the most significant branches of machine learning [101, 102]. Traditional machine learning methods require several sequential steps to accomplish a classification task, including pre-processing, feature extraction, careful feature selection, learning, and classification [113]. The performance of these systems depends heavily on the selected features, which may not be the right features to discriminate between classes. In contrast to conventional machine learning, deep learning enables automated learning of the feature sets for different tasks, and it can accomplish learning and classification in one shot [114].

Figure 5 shows the deep learning process for cancer diagnosis and detection, in which medical images are analyzed in successive steps. This section discusses the role of various deep learning models, such as auto-encoders, transfer learning, convolutional neural networks, gradient descent, generative adversarial networks, and Boltzmann machines, in cancer diagnosis and detection. Yu et al. [178] developed a knowledge-based discovery method that used deep learning techniques for lincRNA detection and DNA genome analysis [82]. The second objective was to validate the annotated lincRNA transcript regions and to test the performance of the deep learning method by comparing it with conventional techniques. For the first objective, the auto-encoder method achieved a 100% rate.

Fig. 5

Deep learning process for cancer diagnosis [ 1 ]

An auto-encoder strategy consists of three main steps, as shown in Fig. 6: building, pre-training, and validating. In the first step, the basic architecture is built, including an input layer, a hidden layer, and activation functions. Second, the encoder and the decoder are trained layer by layer following the pre-training procedure. Third, fine-grained training and validation are performed over the whole model. In short, the first step constructs the basic framework of the deep neural network, the second trains the layer-wise nodes, and the last passes through all layers for validation. Brosch et al. [35] described a method that learned 3D brain images using a deep belief network; their approach required low computational time and little memory. Kadam et al. [79] proposed feature ensemble learning based on sparse auto-encoders and softmax regression to classify breast cancer as benign (non-cancerous) or malignant (cancerous). An auto-encoder is an artificial neural network, consisting of an encoder part and a decoder part, that is trained by unsupervised learning using back-propagation. A sparse auto-encoder (SA) is an auto-encoder with sparsity constraints imposed on all hidden nodes through a sparsity penalty term. The cost function for training a sparse auto-encoder, given by Eq. (6), includes three terms. The first term is the mean squared error, which measures the discrepancy between the input and its reconstruction over the whole training data.

where λ is the coefficient for the L2 regularization term.

Fig. 6

Working of auto-encoder method [ 126 ]

Mean squared error (MSE) is the average squared difference between the predicted and the actual values. MSE is expressed mathematically in Eq. (7), where G and G i are the vectors of observed and predicted values.
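As a rough illustration of how the mean squared error enters a sparse auto-encoder cost, the sketch below adds an L2 weight penalty scaled by λ and a sparsity penalty to the reconstruction error. The Kullback-Leibler form of the sparsity penalty, the coefficient `beta`, the target activation of 0.05, and all numeric values are assumptions chosen for illustration; the exact terms of Eq. (6) are not reproduced here.

```python
import math


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def kl_sparsity_penalty(target, mean_activations):
    """Sum of KL divergences between a target activation level and the
    mean activation of each hidden unit (assumed penalty form)."""
    return sum(
        target * math.log(target / a)
        + (1 - target) * math.log((1 - target) / (1 - a))
        for a in mean_activations
    )


def sparse_autoencoder_cost(observed, predicted, weights, mean_activations,
                            lam=0.001, beta=1.0, target=0.05):
    """Reconstruction error + lam * L2 weight penalty + beta * sparsity penalty."""
    reconstruction = mean_squared_error(observed, predicted)
    l2_term = 0.5 * sum(w * w for w in weights)
    sparsity_term = kl_sparsity_penalty(target, mean_activations)
    return reconstruction + lam * l2_term + beta * sparsity_term


# Hypothetical input values and their reconstructions.
observed = [1.0, 0.0, 1.0, 1.0]
predicted = [0.5, 0.0, 1.0, 0.0]
mse = mean_squared_error(observed, predicted)
cost = sparse_autoencoder_cost(observed, predicted,
                               weights=[0.5, -0.5],
                               mean_activations=[0.05, 0.2])
```

The cost equals the plain reconstruction error only when the weights are zero and every hidden unit's mean activation matches the target, so both penalties steer training towards small weights and sparse hidden activity.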

Li [100] also proposed a practical and self-interpretable invasive cancer diagnosis solution for breast cancer. Krithiga et al. [88] carried out a systematic review on breast cancer that focused on the call for specific action in diagnostic processes. Similarly, Bulten et al. [32] and Sajja et al. [145] proposed a deep neural network based on GoogleNet with a maximum dropout ratio to reduce the processing time for detecting lung cancer in CT scan images. In the proposed approach, a dropout ratio of 60% is applied at a fully connected layer, a higher dropout rate than in the existing GoogleNet. Experiments were conducted with three pre-trained CNN architectures, AlexNet, GoogleNet, and ResNet50, on the pre-processed LIDC dataset. ResNet50 achieved higher accuracy than the other pre-trained architectures and the state-of-the-art methods. The basic computational units of a deep learning architecture are the "neurons", each of which computes a weighted average of the values of an input vector k, where q denotes the column vector of weights. This computation is expressed mathematically in Eq. (8).

Further, a bias (b), which is updated with each iteration, is added to adjust the output, as shown in Eq. (9).

The functioning of layer k is explained in Eq. (10), where g is the non-linear function and a is the activation.

The function of each is further computed, as shown in Eq. ( 11 ).
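The neuron computation described in this passage (a weighted sum of the inputs, plus a bias, passed through a non-linear function) can be sketched as follows. The sigmoid activation, the layer sizes, and all numeric values are assumptions for illustration; the exact forms of Eqs. (8) to (11) are not reproduced here.

```python
import math


def sigmoid(z):
    """Logistic activation, used here as an example non-linear function g."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs (k) with weights (q), plus bias (b),
    passed through the non-linear function (g)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def layer_forward(inputs, weight_rows, biases, activation=sigmoid):
    """Activations of one fully connected layer: one neuron per weight row."""
    return [
        neuron_output(inputs, row, b, activation)
        for row, b in zip(weight_rows, biases)
    ]


# Hypothetical two-input layer with two neurons.
inputs = [1.0, 2.0]
weight_rows = [[0.5, -0.25], [1.0, 1.0]]
biases = [0.0, -3.0]
activations = layer_forward(inputs, weight_rows, biases)
```

Stacking several such layers, each taking the previous layer's activations as its inputs, gives the forward pass of a deep network; training then adjusts the weights and biases by back-propagation.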

Kassani et al. [78] proposed an effective deep learning-based method that uses a DCNN descriptor and a pooling operation to classify breast cancer. The authors also used different data augmentation techniques to boost classification performance and investigated the effect of various stain normalization methods. The proposed approach, using the pre-trained Xception model, achieved 92.50% classification accuracy. Chen et al. [37] proposed a transfer learning-based snapshot ensemble (TLSE) method that integrates snapshot ensemble learning with transfer learning in a unified and coordinated way. Snapshot ensembling provides the benefits of an ensemble within a single model training procedure, while transfer learning addresses the small-sample problem in cervical cell classification.

Figure 7 depicts the transfer learning-based snapshot ensemble strategy for cervical cell classification. The TLSE method was evaluated on a pap-smear dataset, the Herlev dataset, and was shown to have several advantages over existing methods. The results show that TLSE can improve accuracy with just one training process for small samples in fine-grained cervical cell classification. Alzubaidi et al. [9] introduced a hybrid deep convolutional neural network to classify hematoxylin–eosin-stained breast biopsy images into four classes: invasive carcinoma, in-situ carcinoma, benign tumor, and normal tissue. The model combined two ideas: parallel convolutional layers with different filter sizes, and residual connections. The design of the proposed model is characterized by better feature representation and the combination of features at multiple levels. This study achieved a precision of 90% in predicting breast cancer. Sasikala et al. [151] classified skin cancer lesions as malignant (melanoma) or benign using a CNN. The system's performance was evaluated using accuracy and error rate at varying learning rates. Hosny et al. [76] introduced an automated skin lesion classification system with a higher classification rate, using transfer learning and a pre-trained deep neural network. Transfer learning was applied to AlexNet in several ways, including replacing the classification layer with a softmax layer. The performance of the system was measured on the ISIC dataset, where it reached 93% precision. Nivaashini and Soundariya [128] proposed a system that uses a deep Boltzmann machine (DBM) to find an efficient set of features and a deep neural network (DNN) classifier to classify tumors as benign or malignant breast cancer. The proposed system obtained a detection rate of 99.73%, higher than that of conventional machine learning models.

Fig. 7

Transfer learning-based snapshot ensemble method [ 37 ]

Figure 8 shows typical segmentation with deep learning, in which a convolutional neural network (CNN)-based model is learned. The model first compresses the input image with a stack of convolution, activation, and pooling layers. The inverse operation then expands the compressed latent representation. The network is kept end-to-end trainable, and at test time a forward pass yields the segmentation labels (Altaf et al. [1]). Gomez et al. [59] proposed a CNN-based breast cancer diagnosis technique using thermal images. The authors showed that a well-delimited dataset split method is required to reduce bias and overfitting during training. They also presented studies on the DMR-IR dataset, and the experimental results confirmed that the dataset split approach limits overfitting and bias during training. The authors further reported a benchmark of state-of-the-art CNN models, such as ResNet, SeResNet, VGG16, Inception, InceptionResNetV2, and Xception, on the DMR-IR dataset. Albahar [8] proposed a prediction model that classifies skin lesions as benign or malignant based on a novel regularizer. The proposed model achieved an average accuracy of 97.49%, which indicated its superiority over other state-of-the-art methods. The performance of the CNN with the embedded novel regularizer, in terms of AUC-ROC, was tested on several use cases. The area under the curve (AUC) achieved for nevus against melanoma lesions is 77%. Ragab et al. [135] proposed a computer-aided diagnosis (CAD) system for classifying benign and malignant mass tumors in breast mammography images. A deep convolutional neural network (DCNN) is used for feature extraction. A well-known DCNN architecture, AlexNet, is used and fine-tuned to classify two classes instead of 1,000 classes. The last fully connected layer is connected to a support vector machine (SVM) classifier to improve accuracy. The results were obtained using the following publicly available datasets: (1) the Digital Database for Screening Mammography (DDSM) and (2) the Curated Breast Imaging Subset of DDSM (CBIS-DDSM). The linear, polynomial, and radial basis function (RBF) kernels are expressed mathematically in Eqs. (12), (13), and (14), respectively.

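$$K(k_i, k_j) = k_i \cdot k_j \qquad (12)$$

Here, $k_i$ and $k_j$ are n-dimensional inputs.

$$K(k_i, k_j) = \left(k_i \cdot k_j + r\right)^{t} \qquad (13)$$

Here, $r$ is a constant and $t$ is the degree of the polynomial.

$$K(k_i, k_j) = \exp\left(-\frac{\lVert k_i - k_j \rVert^2}{2\sigma^2}\right) \qquad (14)$$

Here, $\sigma$ is the free parameter.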
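The three kernels can be sketched in a few lines, assuming their common textbook forms: the dot product for the linear kernel, the dot product plus a constant r raised to the power t for the polynomial kernel, and a Gaussian of the squared distance with width σ for the RBF kernel. The default parameter values and the input vectors are illustrative assumptions.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(ki, kj):
    """Linear kernel: the plain dot product of the two inputs."""
    return dot(ki, kj)


def polynomial_kernel(ki, kj, r=1.0, t=2):
    """Polynomial kernel with constant r and degree t."""
    return (dot(ki, kj) + r) ** t


def rbf_kernel(ki, kj, sigma=1.0):
    """Radial basis function kernel with free parameter sigma."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(ki, kj))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical two-dimensional feature vectors.
ki = (1.0, 2.0)
kj = (3.0, 0.0)
linear_value = linear_kernel(ki, kj)
polynomial_value = polynomial_kernel(ki, kj)
rbf_value = rbf_kernel(ki, kj)
```

The RBF kernel returns 1 for identical inputs and decays towards 0 as the inputs move apart, with σ controlling how quickly the similarity falls off.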

Fig. 8

Deep learning-based CNN model for segmentation of MRI imaging [ 1 ]

Saraf and Kalpana [148] presented work on classifying benign and malignant thyroid nodules in ultrasound images. The authors performed pre-processing, segmentation, feature extraction, and classification for thyroid nodule detection. Edge detection techniques were used for segmentation, and malignant nodules were detected using an ANN. Similarly, Dov et al. [51] presented work on predicting thyroid malignancy from ultra-high-resolution whole-slide cytopathology images. A deep learning-based algorithm was used for the cytopathological diagnosis of the slides. The proposed algorithm assigns local malignancy scores to the relevant image regions, which are then incorporated into a global malignancy prediction. The reported performance using the multiple instance learning (MIL) method is an area under the curve (AUC) of 0.87 and an average precision (AP) of 0.743. Ma et al. [106] proposed a CNN to diagnose thyroid diseases from SPECT images. The proposed method used a modified DenseNet architecture and an improved training method. The accuracy achieved with the proposed method is 99.08% for Graves' disease, 99.25% for Hashimoto disease, and 99.67% for subacute thyroiditis. Sokoutil et al. [161] presented work on detecting tumors in the thyroid gland, describing an image processing technique combined with a simple intelligent system such as the hill-climbing algorithm. Malathi et al. [107] presented a CNN method for the segmentation of brain tumors and achieved high prediction accuracy. The study in [132] compared three segmentation algorithms and proposed a random forest (RF) classifier and a convolutional neural network. RF and CNN yielded average Dice coefficients (DC) of 0.862 and 0.876, respectively. The RF classification method computes the information gain for a split using entropy (E), which is expressed mathematically in Eq. (15).

$$E = -\sum_{n=1}^{y} \rho_n \log_2 \rho_n \qquad (15)$$

Here, $y$ is the number of classes (binary or multi-class) and $\rho_n$ is the probability that an instance belongs to class $n$.
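The following sketch shows how entropy is used to score a candidate split in a decision tree, the building block of a random forest: the information gain is the entropy of the parent node minus the size-weighted entropy of the child nodes. The base-2 logarithm, the class labels, and the example split are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with a balanced binary class distribution.
parent = ["benign"] * 4 + ["malignant"] * 4

# A split that separates the two classes perfectly.
perfect_split = [["benign"] * 4, ["malignant"] * 4]
# A split that leaves both children as mixed as the parent.
useless_split = [["benign", "benign", "malignant", "malignant"],
                 ["benign", "benign", "malignant", "malignant"]]

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

A tree-growing procedure evaluates many candidate splits in this way and keeps the one with the largest gain; a random forest repeats this over bootstrapped samples and random feature subsets.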

Image processing techniques have been widely used in various health sectors, especially for detecting and diagnosing cancer early. Huidrom et al. [75] used a fully automated lung segmentation method with juxta-pleural nodule inclusion, consisting of two main stages. In the first stage, the lung region was extracted (lung field extraction); in the second stage, the lungs were segmented using boundary analysis and segmentation techniques. Their proposed method was reported to yield better results than the existing ones. Asideu et al. [12] proposed a technique in which features were automatically extracted and classified for acetic acid and Lugol's iodine cervigrams. The study employed various techniques for combining the features in cervigrams and used a support vector machine model to classify them. Cheng et al. [38] used a CAD system to detect and classify breast cancer in four stages: pre-processing, segmentation, feature extraction, and feature classification. Patil et al. [131] presented an automated mammogram-based breast cancer detection model with improved hybrid classifiers. Image pre-processing, tumor segmentation, feature extraction, and diagnosis are the steps of the proposed breast cancer detection model. The study in [122] introduced an automated multi-strategy lung nodule detection and classification system, with the objective of reducing false positives at the early stages. Cui et al. [41] proposed a strategy to recognize lung nodules in chest CT images using an improved DICOM window display. In this experiment, nodule detection achieved 92.65% sensitivity with 0.2468 FPs/scan.

Comparative Analysis

The comparative analysis section highlights the work of different researchers on cancer detection using AI techniques. The prediction outcomes are classified on the basis of parameters such as accuracy, sensitivity/recall, precision, specificity, Dice score, and area under the curve (AUC). Figure 9 describes these evaluation parameters.

Fig. 9

Evaluation parameters
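For reference, the evaluation parameters named above can be computed from the counts of a binary confusion matrix, and the AUC from the scores assigned to positive and negative cases. The sketch below uses the standard definitions of these metrics; the counts and scores are hypothetical and do not come from any study in Table 1.

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # also called recall
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),  # equals the F1 score
    }


def area_under_curve(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly,
    counting ties as half a correct pair."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positive_scores
        for n in negative_scores
    )
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical results for 100 test cases.
metrics = evaluation_metrics(tp=40, fp=10, tn=45, fn=5)
auc = area_under_curve([0.9, 0.8, 0.4], [0.5, 0.3])
```

Because accuracy alone can look high on imbalanced data, sensitivity and specificity are usually reported alongside it; the Dice score is the usual choice for segmentation tasks, where it is computed over pixels rather than cases.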

Table 1 comprises the comparative analysis based on multiple evaluation parameters for various cancer types.

Comparative analysis using AI techniques for different cancers

As shown in the comparative analysis, many research works on cancer diagnosis and detection using conventional machine learning and deep learning methods have been analyzed. Most of the deep learning techniques performed well and achieved high accuracy in terms of the prediction scores obtained. Most of the research articles were published recently (2020), and most of the studies addressed the diagnosis of breast cancer.

In the current review, we have presented recently published research studies that employed AI-based Learning techniques for predicting malignancy. This study highlights research works related to cancer diagnosis prediction and predicting post-operative life expectancy of cancer patients using AI-based learning techniques.

Investigation 1: Which learning approach has provided appreciable prediction outcomes most extensively?

AI-based techniques have contributed significantly to the field of cancer research. The research works covered in this review have focused mainly on deep learning techniques, and deep learning classifiers have dominated over conventional machine learning models. Among deep learning models, convolutional neural networks (CNNs) have been used most commonly for cancer prediction; approximately 41% of the studies used a CNN to classify cancer. Neural networks (NN) and deep neural networks (DNN) have also been used extensively. Apart from deep learning approaches, ensemble learning techniques (random forest classifiers, weighted voting, gradient boosting machines) and support vector machines (SVM) are the most commonly used in the literature. The distribution of the literature by AI-based prediction model is shown in Fig. 10.

Investigation 2: Which cancer site and training data have been explored most extensively?

Most of the research papers explored in this review focused on automated cancer diagnosis. The most extensively explored sites are the breast (22 papers), followed by the kidney (17 papers). Other than breast and kidney, most researchers have worked on brain, colorectal, cervical, and prostate cancer prediction. Figure 11 depicts the distribution of the research works by cancer site.

The type of data used to train the prediction model significantly affects its performance. The reliability of the model and its prediction outcomes depend on the data used to train it. Most of the research studies reviewed in this paper used magnetic resonance imaging (MRI). The second most commonly used data type is computed tomography (CT) scan images. Other image types, such as dermoscopic, mammographic, endoscopic, and pathological images, were also used in the literature. Figure 12 highlights the distribution of papers by the type of data used to train the prediction model.

Investigation 3: In which year were most of the cancer prediction studies published?

Research works published between 2009 and April 2021 were selected for this review article. Figure 13 shows the distribution of the articles by year of publication. Most of the research works were published in 2020 (35), 2019 (32), and 2018 (30). There are few papers from 2021 because we could only include papers published up to April 2021. Based on Fig. 13, we can conclude that the number of research studies has increased gradually in recent years.

Investigation 4: Which types of images have attained the highest prediction accuracy?

Most of the studies used MRI images for cancer diagnosis prediction. Approximately 23% of the literature used computed tomography (CT) scans to train the model. Many studies also employed mammographic, endoscopic, and pathological images. Low contrast in CT scan images makes classification difficult because the object is hard to differentiate from the background. Some cancers, such as prostate cancer and certain liver cancers, are hardly detectable on a CT scan. In such scenarios, Digital Imaging and Communications in Medicine (DICOM) images generated from MRI can help achieve greater prediction accuracy.

Regarding the types of classification models used for specific cancers: convolutional neural network models have been used to predict almost every type of cancer, including brain, colorectal, skin, thyroid, and lung cancer. Most of the studies on breast cancer diagnosis prediction used hybrid models or novel approaches. Neural networks have been applied to almost all breast and cervical cancer datasets. For stomach cancer, only convolutional neural networks have been used. Support vector machines have been used for the prediction of liver and breast cancer. In a nutshell, convolutional neural networks can be applied to different datasets, and ensemble learners have been used with almost every kind of cancer.

Investigation 5: Challenges faced by the researchers in the construction of AI-based prediction models.

Although AI-based techniques have marked their significance in the field of cancer prediction research, there are still many challenges faced by the researchers that need to be addressed.

Limited data size  The most common challenge, faced by most of the studies, was insufficient data to train the model. A small sample size implies a small training set, which cannot adequately demonstrate the effectiveness of the proposed approaches. A sufficiently large sample trains the model better than a limited one.

High dimensionality  Another data-related issue in cancer research is high dimensionality, which refers to a very large number of features relative to the number of cases. Multiple dimensionality reduction techniques [155] are available to deal with this issue, but a generic approach for handling it is still needed.

Class imbalance problem  A leading challenge in medical datasets, especially cancer data, is the uneven distribution of classes. Class imbalance arises from a mismatch in the sample sizes of the classes. Classification models tend to be biased towards the class with the majority of samples. Most of the existing techniques handle imbalance well for binary classes but fail on multi-class problems.

Computational time  About 90% of the studies favored deep learning approaches over other techniques for predicting cancer from medical images. However, deep learning-based approaches are highly complex. About 41% of the studies used a CNN classifier, which performed well but at the cost of high computational time and space.

Efficient feature selection technique  Many studies have achieved exceptional prediction outcomes. However, a computationally efficient feature selection method is still needed to eliminate data cleaning procedures while maintaining high cancer prediction accuracy.

Model generalizability  A shift in research towards improving model generalizability is required. Most of the studies proposed a prediction model validated on a single site. Models need to be validated on multiple sites to improve their generalizability.

Clinical implementation  AI-based models have proved their strength in cancer research, yet they have not been put into practical use in clinics. These models need to be validated in a clinical setting so that they can assist medical practitioners in confirming diagnoses.
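As one illustration of how the class imbalance problem listed above is commonly mitigated, the sketch below computes inverse-frequency class weights, so that errors on a rare class count more during training. This "balanced" heuristic (number of samples divided by the number of classes times the class count) is an assumption chosen for illustration and is not a method attributed to the reviewed studies; the label counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so that rarer classes receive proportionally larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical dataset with a 9:1 imbalance between classes.
labels = ["benign"] * 90 + ["malignant"] * 10
class_weights = inverse_frequency_weights(labels)

# With these weights, each class contributes equally to a weighted loss.
weighted_totals = {
    label: class_weights[label] * count
    for label, count in Counter(labels).items()
}
```

The same formula extends unchanged to more than two classes, although, as noted above, reweighting alone often does not resolve imbalance in multi-class settings.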

Fig. 10

AI-Based Prediction Models

Fig. 11

Cancer site-wise distribution of papers

Fig. 12

Distribution of papers based on the type of training data

Fig. 13

Year-wise distribution of papers

Conclusions and Future Directions

This review attempts to summarize the various research directions for AI-based cancer prediction models. AI has made its mark in healthcare, especially in cancer prediction. The paper provides a critical and analytical examination of current state-of-the-art approaches to cancer diagnosis and detection, with a thorough examination of the machine learning and deep learning models used for early cancer detection from medical imaging. AI techniques play a significant role in early cancer prognosis and detection, where machine learning and deep learning are used to extract and classify disease features. Our study concluded that most previous works employed deep learning techniques, especially convolutional neural networks. Another significant observation is that most studies worked on breast cancer data. We found that when deep learning models are applied to pre-processed and segmented medical images, they perform better on classification metrics such as AUC, sensitivity, Dice coefficient, and accuracy. There is scope to work on the early detection of head and neck cancers, because few studies have been conducted for both types of cancer. Federated learning can also be used for cancer detection based on distributed datasets; hence, we intend to use a federated learning model for cancer detection by creating a decentralized training model for cancer datasets in remote locations. This study also highlights the challenges faced by researchers in constructing AI-based prediction models. Although multiple studies have reported significant results, the challenges in cancer research still need to be addressed in the future.

Declarations

Conflict of interest.

The authors declare no conflict of interest.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Yogesh Kumar.

Surbhi Gupta.

Ruchi Singla.

Yu-Chen Hu.

  • 1. Altaf F, Islam S, Akhtar N, Janjua N. Going deep in medical image analysis. IEEE Access. 2019;7:1–6. doi: 10.1109/ACCESS.2019.2929365.
  • 2. Abdallah MY, Elgak SN, Zain H, Rafig M, Ebaid EA, Elnaema AA. Breast cancer detection using image enhancement and segmentation algorithms. Biomed Res. 2018;29(20):3732–3736. doi: 10.4066/biomedicalresearch.29-18-1106.
  • 3. Abraham A, Duncan D, Gange S, West S. Computer aided assessment of diagnostic images for epidemiological research. BMC Med Res Methodol. 2009;9:1–8. doi: 10.1186/1471-2288-9-74.
  • 4. Adegun A, Viriri S (2020) Deep learning techniques for skin lesion analysis and melanoma cancer detection: a survey of state-of-the-art. In: Artif Intell Rev (Issue 0123456789). Springer Netherlands. 10.1007/s10462-020-09865-y
  • 5. Al-shamasneh A, Obaidellah U (2017) Artificial intelligence techniques for cancer detection and classification. Eur Sci J 342–370
  • 6. AlZhou A, Thabit M, Sakkaf K, Basaleem H (2017) Skin Cancer. 17, 3195-3199
  • 7. Alakwaa W, Naseef M, Badr A. Lung cancer detection and classification with 3D convolutional neural network (3D-CNN). Int J Adv Comput Sci Appl. 2017;8(8):409–417.
  • 8. Albahar M. Skin lesion classification using convolutional neural network with novel regularizer. IEEE Access. 2019;7:38306–38313. doi: 10.1109/ACCESS.2019.2906241.
  • 9. Alzubaidi L, Al-Shamma O, Fadhel M, Farhan L, Zhang J, Duan Y. Optimizing the performance of breast cancer classification by employing the same domain transfer learning from hybrid deep convolutional neural network model. Electronics. 2020;9:1–21. doi: 10.3390/electronics9030445.
  • 10. Andriole G, Kramer B, Berg C. Mortality results from a randomized prostate cancer screening trial. N Engl J Med. 2009;360:1310–1319. doi: 10.1056/NEJMoa0810696.
  • 11. Anvari S, Nambiar S, Pang J, Maftoon N (2021) Computational models and simulations of cancer metastasis. Arch Comput Method Eng, 1–23.
  • 12. Asideu M, Simhal A, Chaudhary U, Mueller J, Lam C, Schmitt J, Venegas G, Sapiro G (2018) Development of algorithms for automated detection of cervical pre-cancers with a low-cost, point-of-care, Pocket colposcope. BioRxiv, 1–13. 10.1101/324541
  • 13. Azamjah N, Zadeh Y, Zayeri F. Global trend of breast cancer mortality rate: A 25-year study. Global Trend Breast Cancer Mortal. 2018;20:1–6. doi: 10.31557/APJCP.2019.20.7.2015.
  • 14. Assiri A, Nazir S, Velastin S. Breast tumor classification using an ensemble machine learning method. J Imaging. 2020;6(39):1–13. doi: 10.3390/jimaging6060039.
  • 17. Alam S, Rahman M, Hossain MA. Automatic human brain tumor detection in MRI image using template-based K means and improved fuzzy C means clustering algorithm. Big Data Cogn Comput. 2019;3(2):1–18. doi: 10.3390/bdcc3020027.
  • 18. Al-ayyoub M, Alabed-alaziz A, Darwish O (2012) Machine learning approach for brain tumor detection. In: ICICS '12: Proceedings of the 3rd international conference on information and communication systems, 1–4. 10.1145/2222444.2222467
  • 19. Ali AM, Zhuang H, Ibrahim A, Rehman O, Huang M, Wu A (2018) A machine learning approach for the classification of kidney cancer subtypes using miRNA genome data. Appl Sci (Switzerland) 8(12). 10.3390/app8122422
  • 20. Alyafeai Z, Ghouti L. A fully-automated deep learning pipeline for cervical cancer classification. Expert Syst Appl. 2020. doi: 10.1016/j.eswa.2019.112951.
  • 21. Ayman El-Baz, Beache GM, Gimel’Farb G, Suzuki K, Okada K, Elnakib A, Soliman A, Abdollahi B (2013) Computer-aided diagnosis systems for lung cancer: challenges and methodologies. In: International journal of biomedical imaging, 2013. 10.1155/2013/942353
  • 22. Asuntha A, Srinivasan A. Deep learning for lung cancer detection and classification. Multimed Tools Appl. 2020;79(11–12):7731–7762. doi: 10.1007/s11042-019-08394-3.
  • 23. Ausawalaithong W, Thirach A, Marukatat S, Wilaiprasitporn T (2019) Automatic lung cancer prediction from chest X-ray images using the deep learning approach. In: BMEiCON 2018—11th biomedical engineering international conference, 1–5. 10.1109/BMEiCON.2018.8609997
  • 24. Azer SA. Deep learning with convolutional neural networks for identification of liver masses and hepatocellular carcinoma: a systematic review. World J Gastrointest Oncol. 2019;11(12):1218–1230. doi: 10.4251/wjgo.v11.i12.1218.
  • 25. Bach P, Mirkin J, Oliver T, et al. Benefits and harms of CT screening for lung cancer. JAMA. 2012;307:2418–2429. doi: 10.1001/jama.2012.5521.
  • 26. Beilner D, Kuhn C, Kost BP, Jückstock J, Mayr D, Schmoeckel E, Dannecker C, Mahner S, Jeschke U, Heidegger HH. Lysine-specific histone demethylase 1A (LSD1) in cervical cancer. J Cancer Res Clin Oncol. 2020. doi: 10.1007/s00432-020-03338-z.
  • 27. Kumar Y, Singla R (2021) Federated learning systems for healthcare: perspective and recent progress. In: Rehman MH, Gaber MM (eds) Federated learning systems. Studies in computational intelligence, vol 965. Cham: Springer. 10.1007/978-3-030-70604-3_6
  • 28. Bengtsson E, Malm P (2014) Screening for cervical cancer using automated analysis of PAP-smears. Hindawi Publishing Corporation, vol 2014, 1–13. 10.1155/2014/842037
  • 29. Bidard F, Mathiot C, Delaloge S, Brain E, Giachetti S, Cremoux P, Marty M, Pierga J. Single circulating tumor cell detection and overall survival in nonmetastatic breast cancer. Ann Oncol. 2010;24:729–733. doi: 10.1093/annonc/mdp391.
  • 15. Boakye EA, Wang M, Sharma A, Jenkins WD, Osazuwa-Peters N, Chen B, Lee M, Schootman M. Risk of second primary cancers in individuals diagnosed with index smoking- and non-smoking- related cancers. J Cancer Res Clin Oncol. 2020;146(7):1765–1779. doi: 10.1007/s00432-020-03232-8.
  • 30. Bono J, Chi K, Jones R, Scher H, et al. Abiraterone and increased survival in metastatic prostate cancer. N Engl J Med. 2011;364:1995–2005. doi: 10.1056/NEJMoa1014618.
  • 31. Büntzel J, Klein M, Keinki C, Walter S, Büntzel J, Hübner J. Oncology services in corona times: a flash interview among German cancer patients and their physicians. J Cancer Res Clin Oncol. 2020. doi: 10.1007/s00432-020-03249-z.
  • 32. Bulten W, Litjens G (2018) Unsupervised prostate cancer detection on H&E using convolutional adversarial autoencoders. Med Imaging Deep Learn, 1–3
  • 33. Bur AM, Holcomb A, Goodwin S, Woodroof J, Karadaghy O, Shnayder Y, Kakarala K, Brant J, Shew M. Machine learning to predict occult nodal metastasis in early oral squamous cell carcinoma. Oral Oncol. 2019;92:20–25. doi: 10.1016/j.oraloncology.2019.03.011. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 34. Brito J, Morris J, Montori V (2013) Thyroid cancer: zealous imaging has increased detection and treatment of low risk tumors. BMJ, vol 347 [ DOI ] [ PubMed ]
  • 35. Brosch T, Tam R (2013) Manifold learning of brain MRIs by deep learning. In: International Conference on medical image computing and computer assisted intervention, 633–640. [ DOI ] [ PubMed ]
  • 36. Chan CWH, Law BMH, So WKW, Chow KM, Waye MMY. Pharmacogenomics of breast cancer: highlighting CYP2D6 and tamoxifen. J Cancer Res Clin Oncol. 2020;146(6):1395–1404. doi: 10.1007/s00432-020-03206-w. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 37. Chen W, Li X, Gao L, Shen W. Improving computer-aided cervical cells classification using transfer learning based snapshot ensemble. Appl Sci. 2020;10:1–14. doi: 10.3390/app10207292. [ DOI ] [ Google Scholar ]
  • 38. Cheng H, Shan J, Ju W, Guo Y, Zhang L. Automated breast cancer detection and classification using ultrasound images. Pattern Recogn. 2010;43(1):299–317. doi: 10.1016/j.patcog.2009.05.012. [ DOI ] [ Google Scholar ]
  • 39. Chillakuru YR, Kranen K, Doppalapudi V, Xiong Z, Fu L, Heydari A, Sohn JH. High precision localization of pulmonary nodules on chest CT utilizing axial slice number labels. BMC Med Imaging. 2021;21(1):1–13. doi: 10.1186/s12880-021-00594-4. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 40. Chlebus G, Schenk A, Moltz JH, Ginneken BV, Hahn HK, Meine H. Automatic liver tumor segmentation in CT with fully convolutional neural networks and object-based postprocessing. Sci Rep. 2018;8:15497. doi: 10.1038/s41598-018-33860-7. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 41. Cui G, Wu L, Zhou T, Gu Y, Lu X, Zhang B, Zhao Y, Yu D, Gao L. Automatic lung nodule detection using multi-scale dot nodule-enhancement filter and weighted support vector machines in chest computed tomography. PLoS ONE. 2019;14(1):1–25. doi: 10.1371/journal.pone.0210551. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 42. Das A, Acharya UR, Panda SS, Sabut S. Deep learning based liver cancer detection using watershed transform and Gaussian mixture model techniques. Cogn Syst Res. 2019;54:165–175. doi: 10.1016/j.cogsys.2018.12.009. [ DOI ] [ Google Scholar ]
  • 43. Denil M, Bazzani L, Larochelle H, Freitas N (2012) Learning where to attend with deep architectures for image tracking. Neural Comput 2151–2184 [ DOI ] [ PubMed ]
  • 44. Devi MA, Ravi S, Vaishnavi J, Punitha S. Classification of cervical cancer using artificial neural networks. Procedia Comput Sci. 2016;89:465–472. doi: 10.1016/j.procs.2016.06.105. [ DOI ] [ Google Scholar ]
  • 45. Devi N, Bhattacharyya K (2018) Automatic brain tumor detection and classification of grades of astrocytoma. In: International conference on computing and communication systems, Springer, vol 24. 10.1007/978-981-10-6890-4_11
  • 46. Devarajan P (2011) Biomarkers for the early detection of acute kidney injury. Current opinion in pediatrics, vol 23 [ DOI ] [ PMC free article ] [ PubMed ]
  • 47. Donald C, Johnson A, Werner N, Brody D. Detection of blast-related traumatic brain injury in US military personnel. N Engl J Med. 2011;364:2091–2100. doi: 10.1056/NEJMoa1008069. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 48. Donaldson M, Coldiron B. No end in sight: the skin cancer epidemic continues. Semin Cutan Med Surg. 2011;30:3–5. doi: 10.1016/j.sder.2011.01.002. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 49. Dolatkhah R et al (2020) Breast cancer survival and incidence: 10 Years cancer registry data in the northwest, Iran. Int J Breast Cancer, 1–6 [ DOI ] [ PMC free article ] [ PubMed ]
  • 50. Dong H, Yang G, Liu F, Mo Y, Guo Y (2017) Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks. In: Valdés Hernández M, González-Castro V (eds) Medical image understanding and analysis. MIUA 2017. Communications in computer and information science, 723. Cham: Springer
  • 51. Dov D, Kovalsky SZ, Cohen J, Range DE, Henao R, Carin L (2019) A deep-learning algorithm for thyroid malignancy prediction from whole slide cytopathology images. 1–10. http://arxiv.org/abs/1904.12739 [ DOI ] [ PMC free article ] [ PubMed ]
  • 52. Eleyan A, Saliha O, Ikram G, Tolga E (2018) Breast cancer classification using machine learning. 2018 Electric Electronics, Computer Science, Biomedical Engineerings’ Meeting (EBBT), Istanbul, Turkey
  • 53. Engel J, Schubert-Fritschle G, Emeny R, Hölzel D. Breast cancer: are long-term and intermittent endocrine therapies equally effective? J Cancer Res Clin Oncol. 2020;146(8):2041–2049. doi: 10.1007/s00432-020-03264-0. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 54. Feng S, Lin D, Lin J, Li B, Huang Z, Chen G, Wei Z, Wang L, Pan J, Chen R, Zeng H. Blood Plasma surface-enhanced Raman spectroscopy for non invasive optical detection of cervical cancer. Analyst. 2013;138:3967–3974. doi: 10.1039/c3an36890d. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 55. Ferlay J. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer. 2010;127:2893–2917. doi: 10.1002/ijc.25516. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 56. Figueiredo PN, Figueiredo IN, Prasath S, Tsai R. Automatic polyp detection in pillcam colon 2 capsule images and videos: preliminary feasibility report. Diagn Therap Endos. 2011 doi: 10.1155/2011/182435. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 57. Gonçalves WGE, Santos MHDPD, Lobato FMF, Ribeiro-Dos-Santos Â, Araújo GSD. Deep learning in gastric tissue diseases: A systematic review. BMJ Open Gastroenterol. 2020;7(1):1–11. doi: 10.1136/bmjgast-2019-000371. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 58. Godkhindi AM, Gowda RM (2018) Automated detection of polyps in CT colonography images using deep learning algorithms in colon cancer diagnosis. In: 2017 international conference on energy, communication, data analytics and soft computing, 1722–1728
  • 59. Gomez J, Masry Z, Benaggoune K, Meraghni S, Zerhouni N (2019) A CNN based methodology for breast cancer diagnosis using thermal images. arXiv, 1–19
  • 60. Gruber N, Antholzer S, Jaschke W, Kremser C, Haltmeier M (2019) A joint deep learning approach for automated liver and tumor segmentation. In: 2019 13th international conference on sampling theory and applications. 10.1109/SampTA45681.2019.9030909
  • 61. Guan Q, Wang Y, Du J, Qin Y, Lu H, Xiang J, Wang F (2019) Deep learning based classification of ultrasound images for thyroid nodules: a large scale of pilot study. Annal Translat Med 7(7):137–137. 10.21037/atm.2019.04.34 [ DOI ] [ PMC free article ] [ PubMed ]
  • 62. Gupta R, Sarwar A, Sharma V. Screening of cervical cancer by artificial intelligence based analysis of digitized papanicolaou-smear images. Int J Contemp Med Res. 2017;4(5):1–6. [ Google Scholar ]
  • 63. Gupta S, Gupta MK (2021) A comprehensive data‐level investigation of cancer diagnosis on imbalanced data. Comput Intell
  • 64. Gupta S, Gupta MK (2021) Computational prediction of cervical cancer diagnosis using ensemble-based classification algorithm. Comput J
  • 65. Goodarzi E, Moslem A, et al. (2021) Epidemiology, incidence and mortality of thyroid cancer and their relationship with the human development index in the world: an ecology study in 2018
  • 66. Haggar F, Boushey R. Colorectal cancer epidemiology: incidence, mortality, survival, and risk factors. Clinics in Colon. 2009;22:191–197. doi: 10.1055/s-0029-1242458. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 67. Han S, Hwang SI, Lee HJ. The classification of renal cancer in 3-phase ct images using a deep learning method. J Digit Imaging. 2019;32(4):638–643. doi: 10.1007/s10278-019-00230-2. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 68. Hasan M, Barman SD, Islam S, Reza AW (2019) Skin cancer detection using convolutional neural network. In: ICCAI '19: proceedings of the 2019 5th international conference on computing and artificial intelligence, 254–258. 10.1145/3330482.3330525
  • 69. Hasan MZ, Shoumik S, Zahan N (2019). Integrated use of rough sets and artificial neural network for skin cancer disease classification. In: 5th international conference on computer, communication, chemical, materials and electronic engineering, 1–4
  • 70. He Z, Liu H, Moch H, Simon H. Machine learning with autophagy-related proteins for discriminating renal cell carcinoma subtypes. Sci Rep. 2020;10:720. doi: 10.1038/s41598-020-57670-y. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 71. Hertrampf K, Pritzkuleit R, Baumann E, Wiltfang J, Wenz HJ, Waldmann A. Oral cancer awareness campaign in Northern Germany: first positive trends in incidence and tumour stages. J Cancer Res Clin Oncol. 2020;146(10):2489–2496. doi: 10.1007/s00432-020-03305-8. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 72. Hirasawa T, Aoyama K, Tanimoto T, Ishihara S, Shichijo S, Ozawa T, Ohnishi T, Fujishiro M, Matsuo K, Fujisaki J, Tada T. Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer. 2018;21(4):653–660. doi: 10.1007/s10120-018-0793-2. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 73. Hoerter N, Gross SA, Liang PS. Artificial intelligence and polyp detection. Current Treatment Options Gastroenterol. 2020;18(1):120–136. doi: 10.1007/s11938-020-00274-2. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 74. Hu N, Zhao J, Li Y, Fu Q, Zhao L, Chen H, Yang G. Breast cancer and background parenchymal enhancement at breast magnetic resonance imaging: a meta-analysis. BMC Med Imaging. 2021;21(1):1–7. doi: 10.1186/s12880-020-00536-6. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 75. Huidrom R, Chanu Y, Singh K (2018) Automated lung segmentation on computed tomography image for the diagnosis of lung cancer. Comput Syst 22(3):907–915. 10.13053/CyS-22-3-2526
  • 76. Hosny K, Kassem M, Foaud M. Classification of skin lesions using transfer learning and augmentation with Alexnet. PLoS ONE. 2019 doi: 10.1371/journal.pone.0217293. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 77. Jin P, Ji X, Kang W, Li Y, Liu H, Ma F, Ma S, Hu H, Li W, Tian Y. Artificial intelligence in gastric cancer: a systematic review. J Cancer Res Clin Oncol. 2020;146(9):2339–2350. doi: 10.1007/s00432-020-03304-9. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 78. Kassani S, Kassani P, Wesolowski M, Schneider K (2019) Breast cancer diagnosis with transfer learning and global pooling. arXiv, 1–6
  • 79. Kadam V, Jadhav S, Vijayakumar K. Breast cancer diagnosis using feature ensemble learning based on stacked sparse autoencoders and softmax regression. Image Signal Process. 2019;43:1–11. doi: 10.1007/s10916-019-1397-z. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 80. Kalaiselvi T, Nagaraja P. A rapid automatic brain tumor detection method for MRI images using modified minimum error thresholding technique. Int J Imaging Syst Technol. 2015;25(1):77–85. doi: 10.1002/ima.22123. [ DOI ] [ Google Scholar ]
  • 81. Kang S, Jeon K, Kim H, Seo J, Lee S (2014) Automatic three-dimensional cephalometric annotation system using three dimensional convolution neural networks
  • 82. Kaur P, Singh G, Kaur P. Intellectual detection and validation of automated mammogram breast cancer images by multi-class SVM using deep learning classification. Inform Med Unlock. 2019;16:100151. doi: 10.1016/j.imu.2019.01.001. [ DOI ] [ Google Scholar ]
  • 83. Kaushal C, Singla A. Automated segmentation technique with self driven post processing for histopathological breast cancer images. CAAI Trans Intell Technol. 2020 doi: 10.1049/trit.2019.0077. [ DOI ] [ Google Scholar ]
  • 84. Khan MQ, Hussain A, Rehman SU, Khan U, Maqsood M, Mehmood K, Khan MA. Classification of melanoma and nevus in digital images for diagnosis of skin cancer. IEEE Access. 2019;7:90132–90144. doi: 10.1109/ACCESS.2019.2926837. [ DOI ] [ Google Scholar ]
  • 85. Khryashchev VV, Stepanova OA, Lebedev AA, Kashin SV, Kuvaev RO (2019) Deep learning for gastric pathology detection in endoscopic images. In: ICGSP '19: 2019 The 3rd international conference on graphics and signal processing, 90–94. 10.1145/3338472.3338492
  • 86. Kloeckner J, Sansonowicz TK, Rodrigues ÁL, Nunes TWN. Multi-categorical classification using deep learning applied to the diagnosis of gastric cancer. Jornal Brasileiro de Patologia e Medicina Laboratorial. 2020;56:1–8. doi: 10.5935/1676-2444.20200013. [ DOI ] [ Google Scholar ]
  • 87. Kokare D, Gumaste P (2015) Mammographic cancer detection using computer aided diagnosis system. Int J Innov Res Electr Electron Instrum Control Eng 3:137–141. 10.17148/IJREEICE.2015.3229
  • 88. Krithiga R, Geetha P (2020) Breast cancer detection, segmentation and classification on histopathology images analysis: a systematic review. Arch Comput Methods Eng, 1–13
  • 89. Kruger DT, Opdam M, Noort VVD, Sanders J, Nieuwenhuis M, Valk BD, Beelen KJ, Linn SC, Boven E. PI3K pathway protein analyses in metastatic breast cancer patients receiving standard everolimus and exemestane. J Cancer Res Clin Oncol. 2020 doi: 10.1007/s00432-020-03291-x. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 90. Kumar Y, Sood K, Kaul S, Vasuja R (2020) Big data analytics and its benefits in healthcare. In: Kulkarni A et al (eds) Big data analytics in healthcare. Studies in big data, vol 66. Cham: Springer. 10.1007/978-3-030-31672-3_1
  • 91. Kumar Y, Mahajan M. Intelligent behavior of fog computing with IOT for healthcare system. Int J Sci Technol Res. 2019;8(7):674–679. [ Google Scholar ]
  • 92. Kurnianingsih AKHS, Nugroho LE, Widyawan LL, Prabuwono AS, Mantoro T. Segmentation and classification of cervical cells using deep learning. IEEE Access. 2019;7:116925–116941. doi: 10.1109/ACCESS.2019.2936017. [ DOI ] [ Google Scholar ]
  • 93. Lavanya L, Chandra J. Oral cancer analysis using machine learning techniques. Int J Eng Res Technol. 2019;12(5):596–601. [ Google Scholar ]
  • 94. Laura M (2018) Cancer cells vs normal cell. Cancer research from technology networks
  • 95. Lathwal A, Kumar R, Arora C, Raghava GPS. Identification of prognostic biomarkers for major subtypes of non-small-cell lung cancer using genomic and clinical data. J Cancer Res Clin Oncol. 2020 doi: 10.1007/s00432-020-03318-3. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 96. Le TN, Bao PT, Huynh HT. Liver tumor segmentation from MR images using 3d fast marching algorithm and single hidden layer feedforward neural network. Biomed Res Int. 2016;2016:3219068. doi: 10.1155/2016/3219068. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 97. Leon F, Gelvez M, Jaimes Z, Gelvez T, Arguello H (2019) Supervised classification of histopathological images using convolutional neuronal networks for gastric cancer detection. In: 2019 22nd symposium on image, signal processing and artificial vision, 1–5. 10.1109/STSIVA.2019.8730284
  • 98. Liu S, Zheng H, Feng Y, Li W (2017) Prostate cancer diagnosis using deep learning with 3D multiparametric MRI. Medical imaging 2017: Comput Aid Diagn 10134:1013428. 10.1117/12.2277121
  • 99. Lee T, Lin Y, Uedo N, Wang H, Chang H, Hung C (2013) Computer-aided diagnosis in endoscopy: a novel application toward automatic detection of abnormal lesions on magnifying narrow-band imaging endoscopy in the stomach. In: 35th annual international conference of the IEEE engineering in medicine and biology society (EMBC), 4430–4433 [ DOI ] [ PubMed ]
  • 100. Li X, Radulovic M, Kanjer K, Palatniotis K (2020) Discriminative pattern mining for breast cancer histopathology image classification via fully convolutional Autoencoder. arXiv, 1–12
  • 101. Liu B, Chi W, Li X, Li P, Liang W, Liu H, Wang W, He J (2020) Evolving the pulmonary nodules diagnosis from classical approaches to deep learning-aided decision support: three decades’ development course and future prospect. In: Journal of cancer research and clinical oncology 146. Berlin: Springer. 10.1007/s00432-019-03098-5 [ DOI ] [ PubMed ]
  • 102. Liu J, Ke F, Chen T, Zhou Q, Weng L, Tan J, Shen W, Li L, Zhou J, Xu C, Cheng H, Zhou J. MicroRNAs that regulate PTEN as potential biomarkers in colorectal cancer: a systematic review. J Cancer Res Clin Oncol. 2020;146(4):809–820. doi: 10.1007/s00432-020-03172-3. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 103. Lee H, Chen Y. Image based computer aided diagnosis system for cancer detection. Expert Syst Appl. 2015;42:5356–5365. doi: 10.1016/j.eswa.2015.02.005. [ DOI ] [ Google Scholar ]
  • 104. Lopez L, Morales J, Martin A, Diaz S, Barranco A. Prometeo: a CNN-based computer-aided diagnosis system for WSI prostate cancer detection. IEEE Access. 2020 doi: 10.1109/ACCESS.2020.3008868. [ DOI ] [ Google Scholar ]
  • 105. Iuga AI, Carolus H, Höink AJ, Brosch T, Klinder T, Maintz D, Püsken M. Automated detection and segmentation of thoracic lymph nodes from CT using 3D foveal fully convolutional neural networks. BMC Med Imaging. 2021;21(1):1–12. doi: 10.1186/s12880-021-00599-z. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 106. Ma L, Ma C, Liu Y, Wang X. Thyroid diagnosis from SPECT images using convolutional neural network with optimization. Comput Intell Neurosci. 2019 doi: 10.1155/2019/6212759. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 107. Malathi M, Sinthia P. Brain tumour segmentation using convolutional neural network with tensor flow. Asian Pac J Cancer Prev. 2019;20(7):2095–2101. doi: 10.31557/APJCP.2019.20.7.2095. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 108. Mane S, Shinde S (2018) A method for melanoma skin cancer detection using dermoscopy images. In: 4th international conference on computing, communication control and automation, 1–6. 10.1109/ICCUBEA.2018.8697804
  • 109. Marka A, Carter J, Toto E, Hassanpour S. Automated detection of nonmelanoma skin cancer using digital images: a systematic review. BMC Med Imaging. 2019 doi: 10.1186/s12880-019-0307-7. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 110. Mehrotra R, Gupta D. Exciting new advances in oral cancer diagnosis: avenues to early detection. Head Neck Oncol. 2011;3:1–9. doi: 10.1186/1758-3284-3-1. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 111. Mejia TM, Pérez MG, Andaluz VH, Conci A. Automatic segmentation and analysis of thermograms using texture descriptors for breast cancer detection. Asia-Pacific Conf Comput-Aid Syst Eng. 2015;2015:24–29. doi: 10.1109/APCASE.2015.12. [ DOI ] [ Google Scholar ]
  • 112. Mohsen H, El-dahshan EA, El-horbaty EM, Salem AM. Classification using deep learning neural networks for brain tumors. Future Comput Inform J. 2018;3(1):68–71. doi: 10.1016/j.fcij.2017.12.001. [ DOI ] [ Google Scholar ]
  • 113. Munir K, Elahi H, Ayub A, Frezza F, Rizzi A. Cancer diagnosis using deep learning. Cancers. 2019;11:1–36. doi: 10.3390/cancers11091235. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 114. Murtaza G, Shuib L, Wahab AWA, Mujtaba G, Mujtaba G, Nweke HF, Al-garadi MA, Zulfiqar F, Raza G, Azmi NA. Deep learning-based breast cancer classification through medical imaging modalities: state of the art and research challenges. Artif Intell Rev. 2020;53(3):1655–1720. doi: 10.1007/s10462-019-09716-5. [ DOI ] [ Google Scholar ]
  • 115. Momenimovahed Z, Salehiniya H (2019) Epidemiological characteristics of and risk factors for breast cancer in the world. Breast Cancer and Therapy, 151–164 [ DOI ] [ PMC free article ] [ PubMed ]
  • 116. Milne A, Carneiro F, O'Morain C, Offerhaus G. Nature meets nurture: molecular genetics of gastric cancer. Hum Genet. 2009;126:615–628. doi: 10.1007/s00439-009-0722-x. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 117. Naeem S, Ali A, Qadri S, Mashwani W, Tairan N, Shah H, Fayaz M, Jamal F, Chesneau C, Anam S. Machine learning based hybrid-feature analysis for liver cancer classification using fused images. Appl Sci. 2020;10:1–22. doi: 10.3390/app10093134. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 118. Nahar VK, Allison FM, Brodell RT, Boyas JF, Jacks SK, Biviji-Sharma R, Haskins MA, Bass MA. Skin cancer prevention practices among malignant melanoma survivors: a systematic review. J Cancer Res Clin Oncol. 2016;142(6):1273–1283. doi: 10.1007/s00432-015-2086-z. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 119. Nakano K, Nemoto H, Nomura R, Inaba H, Yoshioka H, Taniguchi K, Amano A, Ooshima T. Detection of oral bacteria in cardiovascular specimens. Oral Microbiol Immunol. 2009;24:64–68. doi: 10.1111/j.1399-302X.2008.00479.x. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 120. Narayanan D, Saladi R, Fox J. Ultraviolet radiation and skin cancer. Int J Dermatol. 2010;49:978–986. doi: 10.1111/j.1365-4632.2010.04474.x. [ DOI ] [ PubMed ] [ Google Scholar ]
  • 121. Nartowt BJ, Hart GR, Muhammad W, Liang Y, Stark GF, Deng J. Robust machine learning for colorectal cancer risk prediction and stratification. Front Big Data. 2020;3:1–12. doi: 10.3389/fdata.2020.00006. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 122. Nasrullah N, Sang J, Alam SA, Mateen M, Cai B, Hu H. Automated lung nodule detection and classification using deep learning combined with multiple strategies. Sensors. 2019;19(17):3722. doi: 10.3390/s19173722. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 123. Nallamala SH, Mishra P, Koneru SV. Breast cancer detection using machine learning way. Int J Recent Technol Eng. 2019;8(2–3):1402–1405. [ Google Scholar ]
  • 124. Nguyen Q, Lee J, Huang M, Khullar A, Raymond P. Diagnosis and treatment of patient with thyroid cancer. Clinical. 2015;1:1–40. [ Google Scholar ]
  • 125. Ning Y, Yu Z, Pan Y. A deep learning method for lincRNA detection using auto-encoder algorithm. BMC Bioinformatics. 2017;18:1–9. doi: 10.1186/s12859-017-1922-3. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 126. Nirupama T, Damodhar J (2016) A GSM based computer aided diagnosis system for lung cancer detection. In: National conference on emerging trends in information, digital and embedded systems, 137–142
  • 127. Nishio M, Nishizawa M, Sugiyama O, Kojima R, Yakami M, Kuroda T, Togashi K. Computer aided diagnosis of lung nodule using gradient tree boosting and Bayesian optimization. PLoS ONE. 2018;13(4):1–13. doi: 10.1371/journal.pone.0195875. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 128. Nivaashini M, Soundariya R. Deep Boltzmann machine based breast cancer risk detection for healthcare systems. Int J Pure Appl Math. 2018;119:581–590. [ Google Scholar ]
  • 129. Okuboyejo D, Olugbara O, Odunaike S (2013) Automating Skin disease diagnosis using image classification. In: Proceedings of the world congress in engineering and computer science, 23–25.
  • 130. Park HJ, Park B, Lee SS. Radiomics and deep learning: Hepatic applications. Korean J Radiol. 2020;21(4):387–401. doi: 10.3348/kjr.2019.0752. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 131. Patil RS, Biradar N. Automated mammogram breast cancer detection using the optimized combination of convolutional and recurrent neural network. Evol Intel. 2020 doi: 10.1007/s12065-020-00403-x. [ DOI ] [ Google Scholar ]
  • 132. Poudel P, Illanes A, Sheet D, Friebe M. Evaluation of commonly used algorithms for thyroid ultrasound images segmentation and improvement using machine learning approaches. J Healthcare Eng. 2018 doi: 10.1155/2018/8087624. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 133. Qayyum A, Basit A (2017) Automatic breast segmentation and cancer detection via SVM in mammograms. In: 2016 international conference on emerging technologies. 10.1109/ICET.2016.7813261
  • 134. Radu S, Jianu S, Ichim L, Popescu D (2019) Automatic diagnosis of skin cancer using neural networks. In: 2019 11th international symposium on advanced topics in electrical engineering, 1–4
  • 135. Ragab DA, Sharkas M, Marshall S, Ren J. Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ. 2019;1:1–23. doi: 10.7717/peerj.6201. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 136. Ramadan S. Methods used in computer aided diagnosis for breast cancer detection using mammograms. J Healthc Eng. 2020;2020:1–21. doi: 10.1155/2020/9162464. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 137. Raj A, Jayasree M. Automated liver tumor detection using markov random field segmentation. Procedia Technol. 2016;24:1305–1310. doi: 10.1016/j.protcy.2016.05.126. [ DOI ] [ Google Scholar ]
  • 138. Rajkumar TD, Deepa D, Jeyaranjani J (2019) Automatic diagnosis of liver tumor in CT images. Int J Eng Adv Technol 9(1S4):1105–1109. 10.35940/ijeat.a1116.1291s419
  • 139. Riquelme D, Akhloufi MA (2020). Deep learning for lung cancer nodules detection and classification in CT scans. AI, 1(1):28–67. 10.3390/ai1010003
  • 140. Rosalidar R, Rahman A, Muharar R, Syahputra M, Arnia F, Syukri M, Pradhan B, Munadi K. A review on recent progress in thermal imaging and deep learning approaches for breast cancer detection. IEEE Access. 2020;8:116176–116194. doi: 10.1109/ACCESS.2020.3004056. [ DOI ] [ Google Scholar ]
  • 141. Rudra P, Kanti MB, Bhattacharjee D (2015) Automated cervical cancer detection using pap smear images. In: Proceedings of fourth international conference on soft computing on problem solving, 267–278
  • 142. Rong F, Gong W, Pan J, Wang W (2019) Analysis of mortality and survival rate of liver cancer in Zhejiang Province in China: A general population-based study. Hindawi, 1–7 [ DOI ] [ PMC free article ] [ PubMed ]
  • 16. Sajeena AM, Jereesh AS (2015) Automated cervical cancer detection through RGVF segmentation and SVM classification. In: International conference on computing and network communications, Trivandrum, 663–669.
  • 143. Sajenna TA, Jereesh AS. Automated cervical cancer detection through rgvf segmentation and SVM classification. Int Conf Comput Network Commun Trivandrum. 2015;2015:663–669. [ Google Scholar ]
  • 144. Saha A, Harowicz MR, Wang W, Mazurowski MA. A study of association of oncotype DX recurrence score with DCE-MRI characteristics using multivariate machine learning models. J Cancer Res Clin Oncol. 2018;144(5):799–807. doi: 10.1007/s00432-018-2595-7. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 145. Sajja T, Devarapalli R, Kalluri H (2019) Lung cancer detection based on CT scan images by using deep transfer learning. Transm Signal, 36, 339–344. 10.18280/ts.360406
  • 146. Sakai Y, Takemoto S, Hori K, Nishimura M, Ikematsu H, Yano T, Yokota H (2018) Automatic detection of early gastric cancer in endoscopic images using a transferring convolutional neural network. In: 2018 40th annual international conference of the IEEE engineering in medicine and biology society, 4138–4141 [ DOI ] [ PubMed ]
  • 147. Santini G, Moreau N, Rubeaux M (2019) Kidney tumor segmentation using an ensembling multi-stage deep learning approach. A contribution to the KiTS19 challenge. 1–11. 10.24926/548719.023
  • 148. Saraf J, Kalpana V. Thyroid cancer detection using image processing. Int J Res Sci Innov. 2017;4(8):75–77. [ Google Scholar ]
  • 149. Sarwar A, Sheikh AA, Manhas J, Sharma V. Segmentation of cervical cells for automated screening of cervical cancer: a review. Artif Intell Rev. 2020;53(4):2341–2379. doi: 10.1007/s10462-019-09735-2. [ DOI ] [ Google Scholar ]
  • 150. Sasikala S, Bharathi M, Sowmiya B. Lung Cancer detection and classification using deep CNN. Int J Innov Technol Explor Eng. 2018;8:259–262. [ Google Scholar ]
  • 151. Sasikala S, Kumar S, Shivappriya S, Priyadarrshan T (2020) Towards Improving skin cancer detection using transfer learning. BBRC, 13(11), 55–60. 10.21786/bbrc/13.11/13
  • 152. Selvathi D, Poornila A. Breast cancer detection in mammogram images using deep learning technique. J Sci Res. 2017;25(2):417–426. doi: 10.5829/idosi.mejsr.2017.417.426. [ DOI ] [ Google Scholar ]
  • 153. Senthil KK, Venkatalakshmi K, Karthikeyan K. Lung cancer detection using image segmentation by means of various evolutionary algorithms. Comput Math Methods Med. 2019;2019:4909846. doi: 10.1155/2019/4909846. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 154. Shakeel PM, Burhanuddin MA, Desa MI. Automatic lung cancer detection from CT image using improved deep neural network and ensemble classifier. Neural Comput Appl. 2020 doi: 10.1007/s00521-020-04842-6. [ DOI ] [ Google Scholar ]
  • 155. Sharma A, Rani R (2021) A systematic review of applications of machine learning in cancer prediction and diagnosis. Arch Comput Methods Eng, 1–22
  • 156. Sobhaninia Z, Rezaei S, Karimi N, Emami A, Samavi S. Brain tumor segmentation by cascaded deep neural networks using multiple image scales. IEEE Xplorer. 2020;2:1–4. [ Google Scholar ]
  • 157. Song T, Zhang QW, Duan SF, Bian Y, Hao Q, Xing PY, Lu JP. MRI-based radiomics approach for differentiation of hypovascular non-functional pancreatic neuroendocrine tumors and solid pseudopapillary neoplasms of the pancreas. BMC Med Imaging. 2021;21(1):1–11. doi: 10.1186/s12880-020-00536-6. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • 158. Shibata T, Teramoto A, Yamada H, Ohmiya N, Saito K, Fujita H. Automated detection and segmentation of early gastric cancer from endoscopic images using mask R-CNN. Appl Sci (Switzerland) 2020;10(11):1–10. doi: 10.3390/app10113842. [ DOI ] [ Google Scholar ]
  • 159. Shin Y, Qadir HA, Aabakken L, Bergsland J, Balasingham I. Automatic colon polyp detection using region based deep CNN and post learning approaches. IEEE Access. 2018;6:40950–40962. doi: 10.1109/ACCESS.2018.2856402. [ DOI ] [ Google Scholar ]
  • 160. Skalski A. Kidney tumor segmentation and detection on computed tomography data. In: 2016 IEEE International conference on imaging systems and techniques, Chania, 2016, pp 238–242. doi: 10.1109/IST.2016.7738230.
  • 161. Sokoutil M, Sokouti M, Sokouti B. Computer aided diagnosis of thyroid cancer using image processing techniques. Int J Comput Sci Network Secur. 2018;18(4):1–8.
  • 162. Suberi A, Zakaria W, Tomari R. Dendritic cell recognition in computer aided system for cancer immunotherapy. Procedia Comput Sci. 2016;105:177–182. doi: 10.1016/j.procs.2017.01.201.
  • 163. Sudharani K, Sarma TC, Prasad KS. Advanced morphological technique for automatic brain tumor detection and evaluation of statistical parameters. Procedia Technol. 2016;24:1374–1387. doi: 10.1016/j.protcy.2016.05.153.
  • 164. Tabibu S, Vinod PK, Jawahar CV. Pan-renal cell carcinoma classification and survival prediction from histopathology images using deep learning. Sci Rep. 2019;9:10509. doi: 10.1038/s41598-019-46718-3.
  • 165. Taher F, Werghi N, Ahmad H. Computer aided diagnosis system for early lung cancer detection. Algorithms. 2015;8:1088–1110. doi: 10.3390/a8041088.
  • 166. Thapa S, Fischbach LA, Delongchamp R, Faramawi MF, Orloff MS. Using machine learning to predict progression in the gastric precancerous process in a population from a developing country who underwent a gastroscopy for dyspeptic symptoms. Gastroenterol Res Pract. 2019. doi: 10.1155/2019/8321942.
  • 167. Udrea A, Mitra GD. Generative adversarial neural networks for pigmented and non-pigmented skin lesions detection in clinical images. In: 2017 21st international conference on control systems and computer, 2017, pp 364–368. doi: 10.1109/CSCS.2017.56.
  • 168. Wang L, Gu J. Serum microRNA-29a is a promising novel marker for early detection of colorectal liver metastasis. Cancer Epidemiol. 2012;36:61–67. doi: 10.1016/j.canep.2011.05.002.
  • 169. Wei X, Liu W, Wang JQ, Tang Z. “Hedgehog pathway”: a potential target of itraconazole in the treatment of cancer. J Cancer Res Clin Oncol. 2020;146(2):297–304. doi: 10.1007/s00432-019-03117-5.
  • 170. Weng AM, Heidenreich JF, Metz C, Veldhoen S, Bley TA, Wech T. Deep learning-based segmentation of the lung in MR-images acquired by a stack-of-spirals trajectory at ultra-short echo-times. BMC Med Imaging. 2021;21(1):1–11. doi: 10.1186/s12880-021-00608-1.
  • 171. Win KP, Kitjaidure Y, Hamamoto K, Aung TM. Computer-assisted screening for cervical cancer using digital image processing of pap smear images. Appl Sci. 2020;10(5):1–22. doi: 10.3390/app10051800.
  • 172. Win Y, Choomchuay S, Hamamoto K, Raveesunthornkiat M, Rangsirattanakul L, Poongsawat S. Computer aided diagnosis system for detection of cancer cells on cytological pleural effusion images. Hindawi. 2018;2018:1–22. doi: 10.1155/2018/6456724.
  • 173. Wu M, Yan C, Liu H, Liu Q, Yin Y. Automatic classification of cervical cancer from cytological images by using convolutional neural network. Biosci Rep. 2018;38(6):1–9. doi: 10.1042/BSR20181769.
  • 174. Wong M, Goggins W, Fung F, et al. Incidence and mortality of kidney cancer: temporal patterns and global trends in 39 countries. Sci Rep. 2017;7:1–10. doi: 10.1038/s41598-016-0028-x.
  • 175. Yamada M, Saito Y, Imao H, Sai M, Yamada S. Development of a real-time endoscopic image diagnosis support system using deep learning technology in colonoscopy. Sci Rep. 2019;9:14465. doi: 10.1038/s41598-019-50567-5.
  • 176. Lee YM, Agelis V et al. (2020) COVID-19 mortality in patients with cancer on chemotherapy or other anticancer treatments: a prospective cohort study. Crossmark, 1–9
  • 177. Yoo S, Gujrathi I, Haider MA, Khalvati F. Prostate cancer detection using deep convolutional neural networks. Sci Rep. 2019;9:19518. doi: 10.1038/s41598-019-55972-4.
  • 178. Yu N, Yu Z, Pan Y. A deep learning method for lincRNA detection using auto-encoder algorithm. BMC Bioinform. 2017;18:1–10. doi: 10.1186/s12859-017-1922-3.
  • 179. Yue W, Wang Z, Chen H, Payme A, Liu X. Machine learning with applications in breast cancer diagnosis and prognosis. Designs. 2018;2(13):1–17. doi: 10.3390/designs2020013.
  • 180. Zhang C, Guo H. Smart software can diagnose prostate cancer as well as pathologist. Europ Assoc Urol. 2018:1–3.
  • 181. Zhang L, Gao HJ, Zhang J, Badami B. Optimization of the convolutional neural networks for automatic detection of skin cancer. Open Med (Poland). 2020;15(1):27–37. doi: 10.1515/med-2020-0006.
  • 182. Zhang N, Lou W, Ji F, Qiu L, Tsang BK, Di W. Low molecular weight heparin and cancer survival: clinical trials and experimental mechanisms. J Cancer Res Clin Oncol. 2016;142(8):1807–1816. doi: 10.1007/s00432-016-2131-6.
  • 183. Zhang R, Zheng Y, Wing T, Mak TWC, Yu R, Wong SH, Lau JYW, Poon CCY. Automatic detection and classification of colorectal polyps by transferring low-level CNN features from nonmedical domain. IEEE J Biomed Health Inform. 2016;21(1):41–47. doi: 10.1109/JBHI.2016.2635662.
  • 184. Zhang Y, Li M, Gao X, Chen Y, Liu T. Nanotechnology in cancer diagnosis: progress, challenges and opportunities. J Hematol Oncol. 2019;12:1–13. doi: 10.1186/s13045-019-0833-3.
  • 185. Zugazagoitia J, Guedes C, Ponce S, Ares L, Pinelo S, Ferrer I. Current challenges in cancer treatment. Clin Ther. 2016;38:1551–1566. doi: 10.1016/j.clinthera.2016.03.026.