Image processing articles within Scientific Reports

Article 08 April 2024 | Open Access

A novel vector field analysis for quantitative structure changes after macular epiretinal membrane surgery

  • Seok Hyun Bae
  • , Sojung Go
  •  &  Sang Jun Park

Article 05 April 2024 | Open Access

Advanced disk herniation computer aided diagnosis system

  • Maad Ebrahim
  • , Mohammad Alsmirat
  •  &  Mahmoud Al-Ayyoub

Article 28 March 2024 | Open Access

Brain temperature and free water increases after mild COVID-19 infection

  • Ayushe A. Sharma
  • , Rodolphe Nenert
  •  &  Jerzy P. Szaflarski

Article 26 March 2024 | Open Access

High-capacity data hiding for medical images based on the mask-RCNN model

  • Hadjer Saidi
  • , Okba Tibermacine
  •  &  Ahmed Elhadad

Article 25 March 2024 | Open Access

Integrated image and location analysis for wound classification: a deep learning approach

  • Tirth Shah
  •  &  Zeyun Yu

Article 21 March 2024 | Open Access

A number sense as an emergent property of the manipulating brain

  • Neehar Kondapaneni
  •  &  Pietro Perona

Article 16 March 2024 | Open Access

Lesion-conditioning of synthetic MRI-derived subtraction-MIPs of the breast using a latent diffusion model

  • Lorenz A. Kapsner
  • , Lukas Folle
  •  &  Sebastian Bickelhaupt

Article 14 March 2024 | Open Access

Dual ensemble system for polyp segmentation with submodels adaptive selection ensemble

  • Kefeng Fan
  •  &  Kaijie Jiao

Article 11 March 2024 | Open Access

Generalizable disease detection using model ensemble on chest X-ray images

  • Maider Abad
  • , Jordi Casas-Roma
  •  &  Ferran Prados

Article 08 March 2024 | Open Access

Segmentation-based cardiomegaly detection based on semi-supervised estimation of cardiothoracic ratio

  • Patrick Thiam
  • , Christopher Kloth
  •  &  Hans A. Kestler

Article 05 March 2024 | Open Access

Brain volume measured by synthetic magnetic resonance imaging in adult moyamoya disease correlates with cerebral blood flow and brain function

  • Kazufumi Kikuchi
  • , Osamu Togao
  •  &  Kousei Ishigami

Article 04 March 2024 | Open Access

Critical evaluation of artificial intelligence as a digital twin of pathologists for prostate cancer pathology

  • Okyaz Eminaga
  • , Mahmoud Abbas
  •  &  Olaf Bettendorf

Computational pathology model to assess acute and chronic transformations of the tubulointerstitial compartment in renal allograft biopsies

  • Renaldas Augulis
  • , Allan Rasmusson
  •  &  Arvydas Laurinavicius

Opportunistic screening with multiphase contrast-enhanced dual-layer spectral CT for osteoblastic lesions in prostate cancer compared with bone scintigraphy

  • Ming-Cheng Liu
  • , Chi-Chang Ho
  •  &  Yi-Jui Liu

Article 02 March 2024 | Open Access

Reduction of NIFTI files storage and compression to facilitate telemedicine services based on quantization hiding of downsampling approach

  • Ahmed Elhadad
  • , Mona Jamjoom
  •  &  Hussein Abulkasim

Article 29 February 2024 | Open Access

Attention-guided jaw bone lesion diagnosis in panoramic radiography using minimal labeling effort

  • Minseon Gwak
  • , Jong Pil Yun
  •  &  Chena Lee

End-to-end multimodal 3D imaging and machine learning workflow for non-destructive phenotyping of grapevine trunk internal structure

  • Romain Fernandez
  • , Loïc Le Cunff
  •  &  Cédric Moisy

Article 27 February 2024 | Open Access

An improved V-Net lung nodule segmentation model based on pixel threshold separation and attention mechanism

  • Handing Song
  •  &  Zhan Wang

Article 26 February 2024 | Open Access

Quantifying mangrove carbon assimilation rates using UAV imagery

  • Javier Blanco-Sacristán
  • , Kasper Johansen
  •  &  Matthew F. McCabe

Article 24 February 2024 | Open Access

Iterative pseudo balancing for stem cell microscopy image classification

  • Adam Witmer
  •  &  Bir Bhanu

Article 22 February 2024 | Open Access

Deep learning-based, fully automated, pediatric brain segmentation

  • Min-Jee Kim
  • , EunPyeong Hong
  •  &  Tae-Sung Ko

Article 21 February 2024 | Open Access

Correction of high-rate motion for photoacoustic microscopy by orthogonal cross-correlation

  • Qiuqin Mao
  •  &  Xiaojun Liu

Article 20 February 2024 | Open Access

ERCP-Net: a channel extension residual structure and adaptive channel attention mechanism for plant leaf disease classification network

  • Yannan Xu

A quality grade classification method for fresh tea leaves based on an improved YOLOv8x-SPPCSPC-CBAM model

  • Xiu’yan Zhao
  • , Yu’xiang He
  •  &  Kai’xing Zhang

Article 16 February 2024 | Open Access

Stripe noise removal in conductive atomic force microscopy

  • Jan Rieck
  •  &  Michael H. F. Wilkinson

Article 13 February 2024 | Open Access

Automatic enhancement preprocessing for segmentation of low quality cell images

  • Kazuhiro Hotta

Article 09 February 2024 | Open Access

An artificial intelligence based abdominal aortic aneurysm prognosis classifier to predict patient outcomes

  • Timothy K. Chung
  • , Pete H. Gueldner
  •  &  David A. Vorp

Article 08 February 2024 | Open Access

Application of PET imaging delta radiomics for predicting progression-free survival in rare high-grade glioma

  • Shamimeh Ahrari
  • , Timothée Zaragori
  •  &  Antoine Verger

Cluster-based histopathology phenotype representation learning by self-supervised multi-class-token hierarchical ViT

  • Shivam Kalra
  •  &  Mohammad Saleh Miri

Article 03 February 2024 | Open Access

YOLOX target detection model can identify and classify several types of tea buds with similar characteristics

  • Mengdao Yang
  • , Weihao Yuan
  •  &  Gaojian Xu

Phenotypic characterization of liver tissue heterogeneity through a next-generation 3D single-cell atlas

  • Dilan Martínez-Torres
  • , Valentina Maldonado
  •  &  Fabián Segovia-Miranda

Article 30 January 2024 | Open Access

Machine learning approaches for early detection of non-alcoholic steatohepatitis based on clinical and blood parameters

  • Amir Reza Naderi Yaghouti
  • , Hamed Zamanian
  •  &  Ahmad Shalbaf

Research on improved black widow algorithm for medical image denoising

  • Lina Zhang

Article 25 January 2024 | Open Access

Methodology of generation of CFD meshes and 4D shape reconstruction of coronary arteries from patient-specific dynamic CT

  • Krzysztof Psiuk-Maksymowicz
  • , Damian Borys
  •  &  Ryszard A. Bialecki

Article 23 January 2024 | Open Access

Comparison between a deep-learning and a pixel-based approach for the automated quantification of HIV target cells in foreskin tissue

  • Zhongtian Shao
  • , Lane B. Buchanan
  •  &  Jessica L. Prodger

Task design for crowdsourced glioma cell annotation in microscopy images

  • Svea Schwarze
  • , Nadine S. Schaadt
  •  &  Friedrich Feuerhake

Article 20 January 2024 | Open Access

Unlocking cardiac motion: assessing software and machine learning for single-cell and cardioid kinematic insights

  • Margherita Burattini
  • , Francesco Paolo Lo Muzio
  •  &  Michele Miragoli

Article 19 January 2024 | Open Access

Microstructural brain abnormalities, fatigue, and cognitive dysfunction after mild COVID-19

  • Lucas Scardua-Silva
  • , Beatriz Amorim da Costa
  •  &  Clarissa Lin Yasuda

Article 18 January 2024 | Open Access

Validation of reliability, repeatability and consistency of three-dimensional choroidal vascular index

  • Yifan Bai
  •  &  Qingli Shang

Integrated image and sensor-based food intake detection in free-living

  • Tonmoy Ghosh
  •  &  Edward Sazonov

Article 16 January 2024 | Open Access

Early stage black pepper leaf disease prediction based on transfer learning using ConvNets

  • Anita S. Kini
  • , K. V. Prema
  •  &  Smitha N. Pai

GPU-accelerated lung CT segmentation based on level sets and texture analysis

  • Daniel Reska
  •  &  Marek Kretowski

Article 12 January 2024 | Open Access

Accuracy of an AI-based automated plate reading mobile application for the identification of clinical mastitis-causing pathogens in chromogenic culture media

  • Breno Luis Nery Garcia
  • , Cristian Marlon de Magalhães Rodrigues Martins
  •  &  Marcos Veiga dos Santos

Crowdsourced human-based computational approach for tagging peripheral blood smear sample images from Sickle Cell Disease patients using non-expert users

  • José María Buades Rubio
  • , Gabriel Moyà-Alcover
  •  &  Nataša Petrović

Article 09 January 2024 | Open Access

Identification of wheel track in the wheat field

  • Wanhong Zhang

Article 04 January 2024 | Open Access

Multi scale-aware attention for pyramid convolution network on finger vein recognition

  • Huijie Zhang
  • , Weizhen Sun
  •  &  Ling Lv

Article 03 January 2024 | Open Access

Rapid artefact removal and H&E-stained tissue segmentation

  • B. A. Schreiber
  • , J. Denholm
  •  &  E. J. Soilleux

Article 02 January 2024 | Open Access

UNet based on dynamic convolution decomposition and triplet attention

  • Limei Fang

Multi-pose-based convolutional neural network model for diagnosis of patients with central lumbar spinal stenosis

  • Seyeon Park
  • , Jun-Hoe Kim
  •  &  Chun Kee Chung

Article 21 December 2023 | Open Access

Deep learning framework for automated goblet cell density analysis in in-vivo rabbit conjunctiva

  • Seunghyun Jang
  • , Seonghan Kim
  •  &  Ki Hean Kim

scikit-image: image processing in Python

1 code implementation • 23 Jul 2014

scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications.
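A minimal, illustrative use of the library (assuming scikit-image is installed) applies Otsu thresholding to a bundled sample image:

```python
from skimage import data, filters

# Load a bundled 8-bit grayscale sample image (ships with scikit-image)
image = data.camera()

# Estimate a global threshold with Otsu's method and binarize
threshold = filters.threshold_otsu(image)
binary = image > threshold
```

The same `filters` module exposes many other classical operators (Sobel, Gaussian, and so on) in the same array-in, array-out style.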

Loss Functions for Neural Networks for Image Processing

2 code implementations • 28 Nov 2015

Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems.

Picasso: A Modular Framework for Visualizing the Learning Process of Neural Network Image Classifiers

1 code implementation • 16 May 2017

Picasso is a free open-source (Eclipse Public License) web application written in Python for rendering standard visualizations useful for analyzing convolutional neural networks.

MAXIM: Multi-Axis MLP for Image Processing

1 code implementation • CVPR 2022

In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks.

Fast Image Processing with Fully-Convolutional Networks

2 code implementations • ICCV 2017

Our approach uses a fully-convolutional network that is trained on input-output pairs that demonstrate the operator's action.

Pre-Trained Image Processing Transformer

6 code implementations • CVPR 2021

To fully exploit the capability of the transformer, we use the well-known ImageNet benchmark to generate a large number of corrupted image pairs.

In Defense of Classical Image Processing: Fast Depth Completion on the CPU

2 code implementations • 31 Jan 2018

With the rise of data driven deep neural networks as a realization of universal function approximators, most research on computer vision problems has moved away from hand crafted classical image processing algorithms.

Image Processing Using Multi-Code GAN Prior

1 code implementation • CVPR 2020

Such an over-parameterization of the latent space significantly improves the image reconstruction quality, outperforming existing competitors.

Comparison of Image Quality Models for Optimization of Image Processing Systems

1 code implementation • 4 May 2020

The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human quality judgments.
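For background, the simplest objective IQA model is PSNR, which scores a distorted image purely by its mean squared error against the reference. The sketch below (plain Python, with illustrative pixel values) shows the computation that perceptual models are benchmarked against:

```python
import math

def mse(ref, dist):
    """Mean squared error between two equally sized grayscale images."""
    flat_ref = [p for row in ref for p in row]
    flat_dist = [p for row in dist for p in row]
    return sum((a - b) ** 2 for a, b in zip(flat_ref, flat_dist)) / len(flat_ref)

def psnr(ref, dist, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    err = mse(ref, dist)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / err)

reference = [[50, 60], [70, 80]]
distorted = [[52, 58], [70, 80]]  # two pixels off by +/-2, so MSE = 2
score = psnr(reference, distorted)
```

Perceptual IQA models aim to correct the well-known cases where this pixel-wise score disagrees with human judgments.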

Quaternion Convolutional Neural Networks for Heterogeneous Image Processing

1 code implementation • 31 Oct 2018

Convolutional neural networks (CNN) have recently achieved state-of-the-art results in various applications.


Editorial: Current Trends in Image Processing and Pattern Recognition

  • PAMI Research Lab, Computer Science, University of South Dakota, Vermillion, SD, United States

Editorial on the Research Topic Current Trends in Image Processing and Pattern Recognition

Technological advancements in computing have opened up multiple opportunities in a wide variety of fields, ranging from document analysis (Santosh, 2018), biomedical and healthcare informatics (Santosh et al., 2019; Santosh et al., 2021; Santosh and Gaur, 2021; Santosh and Joshi, 2021), and biometrics to intelligent language processing. These applications primarily leverage AI tools and/or techniques in topics such as image processing, signal and pattern recognition, machine learning, and computer vision.

With this theme, we opened a call for papers on Current Trends in Image Processing & Pattern Recognition, following the third International Conference on Recent Trends in Image Processing & Pattern Recognition (RTIP2R), 2020 (URL: http://rtip2r-conference.org ). Our call was not limited to RTIP2R 2020; it was open to all. Altogether, 12 papers were submitted, and seven of them were accepted for publication.

In Deshpande et al., the authors addressed the use of global fingerprint features (e.g., ridge flow, frequency, and other interest/key points) for matching. With a convolutional neural network (CNN) matching model, which they called “Combination of Nearest-Neighbor Arrangement Indexing (CNNAI),” they achieved their highest rank-1 identification rate, 84.5%, on the FVC2004 and NIST SD27 datasets. The authors claimed that their results are comparable with state-of-the-art algorithms and that their approach is robust to rotation and scale. Similarly, in Deshpande et al., using the same datasets, the same authors addressed the importance of minutiae extraction and matching in low-quality latent fingerprint images. Their minutiae extraction technique showed a remarkable improvement in results, which the authors claimed were comparable to state-of-the-art systems.

In Gornale et al., the authors used Hu’s invariant moments to extract distinguishing features from images that were geometrically distorted or transformed. With these features, they focused on early detection and grading of knee osteoarthritis, and they claimed that their results were validated by orthopedic surgeons and rheumatologists.

In Tamilmathi and Chithra, the authors introduced a new deep-learning, quantization-based coding scheme for 3D airborne LiDAR point cloud images. In their experiments, the model compressed an image into a constant 16 bits of data and decompressed it with approximately 160 dB PSNR, with a total execution time of 174.46 s (about 0.6 s per instruction). The authors claimed that their method compares favorably with previous techniques in terms of both space and time.
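The fixed 16-bit representation mentioned above is in the spirit of uniform quantization. A generic sketch (plain Python, not the authors' learned codec) of mapping float coordinates to 16-bit codes and back:

```python
def quantize16(values):
    """Uniformly map floats to 16-bit integer codes plus the (min, max) range."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) or 1.0  # avoid division by zero for constant input
    codes = [round((v - lo) / scale * 65535) for v in values]
    return codes, (lo, hi)

def dequantize16(codes, bounds):
    """Invert quantize16, up to half a quantization step of error per value."""
    lo, hi = bounds
    scale = (hi - lo) or 1.0
    return [lo + c / 65535 * scale for c in codes]

points = [0.0, 1.25, 7.5, 10.0]       # toy 1D coordinates
codes, bounds = quantize16(points)
restored = dequantize16(codes, bounds)
```

The reconstruction error of such a scheme is bounded by half the step size `(max - min) / 65535`, which is what makes a constant bit budget per value possible.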

In Tamilmathi and Chithra, the authors carefully inspected possible signs of plant leaf diseases. They employed feature learning and observed the correlations and/or similarities between disease-related symptoms, making disease identification possible.

In Das Chagas Silva Araujo et al., the authors proposed a benchmark environment for comparing multiple algorithms for depth reconstruction from two event-based sensors. In their evaluation, a stereo matching algorithm was implemented, and multiple experiments were run with several camera settings and parameters. The authors claimed that this work can serve as a benchmark for the robust evaluation of the multitude of new techniques in event-based stereo vision.

In Steffen et al. and Gornale et al., the authors employed handwritten signatures to better understand this behavioral biometric trait for document authentication/verification in documents such as letters, contracts, and wills. They used handcrafted features, such as LBP and HOG, extracted from 4,790 signatures so that shallow learning could be applied efficiently. Using k-NN, decision tree, and support vector machine classifiers, they reported promising performance.

Author Contributions

The author confirms being the sole contributor of this work and has approved it for publication.

Conflict of Interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Santosh, KC, Antani, S., Guru, D. S., and Dey, N. (2019). Medical Imaging Artificial Intelligence, Image Recognition, and Machine Learning Techniques . United States: CRC Press . ISBN: 9780429029417. doi:10.1201/9780429029417

Santosh, KC, Das, N., and Ghosh, S. (2021). Deep Learning Models for Medical Imaging, Primers in Biomedical Imaging Devices and Systems . United States: Elsevier . eBook ISBN: 9780128236505.

Santosh, KC (2018). Document Image Analysis - Current Trends and Challenges in Graphics Recognition . United States: Springer . ISBN 978-981-13-2338-6. doi:10.1007/978-981-13-2339-3

Santosh, KC, and Gaur, L. (2021). Artificial Intelligence and Machine Learning in Public Healthcare: Opportunities and Societal Impact . Spain: SpringerBriefs in Computational Intelligence Series . ISBN: 978-981-16-6768-8. doi:10.1007/978-981-16-6768-8

Santosh, KC, and Joshi, A. (2021). COVID-19: Prediction, Decision-Making, and its Impacts, Book Series in Lecture Notes on Data Engineering and Communications Technologies . United States: Springer Nature . ISBN: 978-981-15-9682-7. doi:10.1007/978-981-15-9682-7

Keywords: artificial intelligence, computer vision, machine learning, image processing, signal processing, pattern recognition

Citation: Santosh KC (2021) Editorial: Current Trends in Image Processing and Pattern Recognition. Front. Robot. AI 8:785075. doi: 10.3389/frobt.2021.785075

Received: 28 September 2021; Accepted: 06 October 2021; Published: 09 December 2021.

Copyright © 2021 Santosh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: KC Santosh, [email protected]

This article is part of the Research Topic

Current Trends in Image Processing and Pattern Recognition

Real-time intelligent image processing for security applications

  • Guest Editorial
  • Published: 05 September 2021
  • Volume 18 , pages 1787–1788, ( 2021 )

  • Akansha Singh 1 ,
  • Ping Li 2 ,
  • Krishna Kant Singh 3 &
  • Vijayalakshmi Saravana 4  

4092 Accesses

5 Citations

The advent of machine learning and image processing techniques has led to new research opportunities in this area. Machine learning has enabled automatic extraction and analysis of information from images, and its convergence with image processing is useful in a variety of security applications. Image processing plays a significant role in physical as well as digital security. Physical security applications include homeland security, surveillance, identity authentication, and so on. Digital security implies protecting digital data; techniques like digital watermarking, network security, and steganography enable it.

1 Accepted papers

The rapidly increasing capabilities of imaging systems and techniques have opened new research areas in the security domain, and the rise of cyber and physical crime requires novel techniques to control it. For both physical and digital security, real-time performance is crucial: the availability of the right image information at the right time enables situational awareness. Real-time image processing techniques perform the required operations with latency within the required time frame. Physical security applications like surveillance and object tracking are practical only if they run in real time. Similarly, biometric authentication, watermarking, and network security are time-restricted applications that require real-time image processing. This special issue aims to bring together researchers presenting novel tools and techniques for real-time image processing for security applications augmented by machine learning.

This special issue on Real-Time Intelligent Image Processing for Security Applications comprises contributions on the topics in theory and applications related to the latest developments in security applications using image processing. Real-time imaging and video processing can be used for finding solutions to a variety of security problems. The special issue consists of the articles that address such security problems.

The paper entitled “RGB + D and deep learning-based real-time detection of suspicious event in Bank ATMs” presents a real-time detection method for human activities. The method is applied to enhance the surveillance and security of Bank Automated Teller Machine (ATM) [ 1 ]. The increasing number of illicit activities at ATMs has become a security concern.

Existing surveillance methods involving human interaction are not very efficient, since they depend heavily on the security personnel’s behavior. The proposed solution achieves real-time surveillance of these machines: the authors present a deep learning-based method for detecting different kinds of motion in the video stream, and motions are classified as abnormal in case of suspicious activity.

The paper entitled “A real-time person tracking system based on SiamMask network for intelligent video surveillance” presents a real-time surveillance system by tracking persons. The proposed solution can be applied to various public places, offices, buildings, etc., for tracking persons [ 2 ]. The authors have presented a person tracking and segmentation system using an overhead camera perspective.

The paper entitled “Adaptive and stabilized real-time super-resolution control for UAV-assisted smart harbor surveillance platforms” presents a method for smart harbor surveillance platforms [ 3 ]. The method utilizes drones for flexible localization of nodes, and an algorithm is proposed for scheduling the data transmitted between the different drones and multi-access edge computing systems. In the second stage of the algorithm, all drones transmit their own data, which are then used for surveillance. Further, the authors use super-resolution to improve the quality of the data and of the surveillance. A Lyapunov optimization-based method is used to maximize the time-averaged performance of the system, subject to the stability of the self-adaptive super-resolution control.

The paper entitled “Real-Time Video Summarizing using Image Semantic Segmentation for CBVR” presents a real-time video summarizing method using image semantic segmentation for content-based video retrieval (CBVR) [ 4 ]. The paper presents a method for summarizing videos frame-wise using stacked generalization over an ensemble of different machine learning algorithms. Videos are also ranked on the basis of how long a particular building or monument appears in them, and they are retrieved using a k-d tree. The method can be applied to different security surveillance applications. The authors build the summary from prominent objects in the video scene and use it to query the video for the required frames. Labeling is done using machine learning and image matching algorithms.

The paper entitled “A real-time classification model based on joint sparse-collaborative representation” presents a classification model based on joint sparse-collaborative representation [ 5 ]. The paper proposes a two-phase test-sample representation method. The authors improve the first phase of the traditional two-phase method and, because the second phase suffers from an imbalance in the training samples, include the unselected training samples in the modeling. The proposed method is applied to numerous face databases and shows good recognition accuracy.

The paper entitled “Recognizing Human Violent Action Using Drone Surveillance within Real-Time Proximity” presents a method for recognizing human violent action using drone surveillance [ 6 ]. The authors have presented a machine-driven recognition and classification of human actions from drone videos. A database is also created from an unconstrained environment using drones. Key-point extraction is performed and 2D skeletons for the persons in the frame are generated. These extracted key points are given as features in the classification module to recognize the actions. For classification, the authors have used SVM and Random Forest methods. The violent actions can be recognized using the proposed method.
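The pipeline above reduces each frame to a feature vector of skeleton key points and hands it to a classifier (SVM and Random Forest in the paper). As a stand-in illustration with made-up 2D features, a minimal nearest-centroid classifier shows the shape of that final classification step:

```python
def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(samples):
    """samples: {label: [feature vectors]} -> {label: centroid}."""
    return {label: centroid(vecs) for label, vecs in samples.items()}

def predict(model, features):
    """Assign the label whose centroid is nearest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], features))

# Toy key-point features (e.g., joint-angle summaries); values are illustrative
model = train({
    "violent": [[0.9, 0.8], [1.0, 0.7]],
    "normal": [[0.1, 0.2], [0.2, 0.1]],
})
label = predict(model, [0.85, 0.75])  # falls near the "violent" centroid
```

The paper's SVM and Random Forest classifiers play the same role with far more expressive decision boundaries.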

2 Conclusion

The editors believe that the papers selected for this special issue will enhance the body of knowledge in the field of security using real-time imaging. We would like to thank the authors for contributing their works to this special issue. The editors would like to acknowledge and thank the reviewers for their insightful comments. These comments have been a guiding force in improving the quality of the papers. The editors would also like to thank the editorial staff for their support and help. We are especially thankful to the Journal of Real-Time Image Processing Chief Editors, Nasser Kehtarnavaz and Matthias F. Carlsohn, who provided us the opportunity to offer this special issue.

Khaire, P.A., Kumar, P.: RGB+ D and deep learning-based real-time detection of suspicious event in Bank-ATMs. J Real-Time Image Proc 23 , 1–3 (2021)

Ahmed, I., Jeon, G.: A real-time person tracking system based on SiamMask network for intelligent video surveillance. J Real-Time Image Proc 28 , 1–2 (2021)

Jung, S., Kim, J.: Adaptive and stabilized real-time super-resolution control for UAV-assisted smart harbor surveillance platforms. J Real-Time Image Proc 17 , 1–1 (2021)

Jain, R., Jain, P., Kumar, T., Dhiman, G.: Real-time video summarizing using image semantic segmentation for CBVR. J Real-Time Image Proc (2021)

Li, Y., Jin, J., Chen, C.L.P.: A real-time classification model based on joint sparse-collaborative representation. J Real-Time Image Proc (2021)

Srivastava, A., Badal, T., Garg, A., Vidyarthi, A., Singh, R.: Recognizing human violent action using drone surveillance within real-time proximity. J Real-Time Image Proc (2021)

Download references

Author information

Authors and Affiliations

Computer Science Engineering Department, Bennett University, Greater Noida, India

Akansha Singh

Department of Computing, The Hong Kong Polytechnic University, Kowloon, Hong Kong

Faculty of Engineering and Technology, Jain (Deemed-To-Be University), Bengaluru, India

Krishna Kant Singh

Department of Computer Science, University of South Dakota, Vermillion, USA

Vijayalakshmi Saravana

Corresponding author

Correspondence to Akansha Singh .

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Singh, A., Li, P., Singh, K.K. et al. Real-time intelligent image processing for security applications. J Real-Time Image Proc 18 , 1787–1788 (2021). https://doi.org/10.1007/s11554-021-01169-w

Published : 05 September 2021

Issue Date : October 2021

DOI : https://doi.org/10.1007/s11554-021-01169-w


Digital Image Processing: Recently Published Documents


Developing Digital Photomicroscopy

(1) The need for efficient ways of recording and presenting multicolour immunohistochemistry images in a pioneering laboratory developing new techniques motivated a move away from photography to electronic and ultimately digital photomicroscopy. (2) Initially, broadcast-quality analogue cameras were used in the absence of practical digital cameras; this allowed the development of digital image processing, storage and presentation. (3) As early adopters of digital cameras, the laboratory recognised their advantages and limitations during implementation. (4) The adoption of immunofluorescence for multiprobe detection prompted further developments, particularly a critical approach to probe colocalization. (5) Subsequently, whole-slide scanning was implemented, greatly enhancing histology for diagnosis, research and teaching.

Parallel Algorithm of Digital Image Processing Based on GPU

Quantitative Identification of Cracks in Heritage Rock Based on Digital Image Technology

Abstract Digital image processing technologies are used to extract and evaluate cracks in heritage rock in this paper. Firstly, the image goes through a series of preprocessing operations, such as graying, enhancement, filtering, and binarization, to filter out a large part of the noise. Then, to accurately extract the crack area, the image is segmented into crack regions and cleaned by morphological filtering. After evaluation, the obtained fracture area can provide data support for the restoration and protection of heritage rock. In this paper, the cracks of heritage rock are extracted at three different locations. The results show that the three groups of rock fractures affect the rocks differently, but all need to be repaired to maintain the appearance of the heritage rock.
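A minimal sketch of the kind of pipeline described (plain Python on a tiny synthetic grid, not the paper's code): threshold the grayscale image to isolate dark crack pixels, then apply a 3x3 morphological opening to suppress isolated noise:

```python
def binarize(gray, threshold):
    """Mark dark (crack) pixels as 1, background as 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]

def _neighborhood(mask, r, c):
    """3x3 neighborhood values around (r, c); out-of-bounds counts as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[i][j] if 0 <= i < h and 0 <= j < w else 0
            for i in range(r - 1, r + 2) for j in range(c - 1, c + 2)]

def erode(mask):
    return [[1 if all(_neighborhood(mask, r, c)) else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]

def dilate(mask):
    return [[1 if any(_neighborhood(mask, r, c)) else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]

def opening(mask):
    """Erosion then dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))

# 6x6 synthetic grayscale patch: a dark 3x3 crack blob plus one noise pixel
gray = [[200] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 4):
        gray[r][c] = 40          # crack region
gray[5][5] = 40                  # isolated dark noise pixel

cleaned = opening(binarize(gray, 100))
```

The opening keeps the 3x3 crack blob intact while the single-pixel speck is eroded away, which is exactly the noise-suppression role morphological filtering plays in the abstract.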

Determination of Optical Rotation Based on Liquid Crystal Polymer Vortex Retarder and Digital Image Processing

  • Discussion on curriculum reform of digital image processing under the certification of engineering education
  • Influence and application of digital image processing technology on oil painting creation in the era of big data
  • Geometric correction analysis of highly distorted near-equatorial satellite images using remote sensing and digital image processing techniques
  • Color enhancement of low illumination garden landscape images

The unfavorable shooting environment severely hinders the acquisition of actual landscape information in garden landscape design. Low quality, low illumination garden landscape images (GLIs) can be enhanced through advanced digital image processing. However, current color enhancement models have poor applicability: when the environment changes, they tend to lose image details and exhibit low robustness. Therefore, this paper tries to enhance the color of low illumination GLIs. Specifically, the color restoration of GLIs was realized based on a modified dynamic threshold. After color correction, the low illumination GLIs were restored and enhanced by a self-designed convolutional neural network (CNN). In this way, the authors achieved ideal effects of color restoration and clarity enhancement, while avoiding the difficulty of manual feature design in landscape design renderings. Finally, experiments were carried out to verify the feasibility and effectiveness of the proposed image color enhancement approach.

Discovery of EDA-Complex Photocatalyzed Reactions Using Multidimensional Image Processing: Iminophosphorane Synthesis as a Case Study

Abstract Herein, we report a multidimensional screening strategy for the discovery of EDA-complex photocatalyzed reactions using only photographic devices (webcam, cellphone) and TLC analysis. An algorithm was designed to automatically identify EDA-complex reactive mixtures in solution from digital image processing in a 96-well microplate and by TLC analysis. The code highlights the region of absorption of the mixture in the visible spectrum and quantifies the color change through grayscale values. Furthermore, the code automatically identifies the blurs on the TLC plate and classifies the mixtures as colorimetric reactions, non-reactive, or potentially reactive EDA mixtures. This strategy allowed us to discover and then optimize a new EDA-mediated approach for obtaining iminophosphoranes in up to 90% yield.

Mangosteen Quality Grading for Export Markets Using Digital Image Processing Techniques



Medical image analysis based on deep learning approach

Muralikrishna Puttagunta

Department of Computer Science, School of Engineering and Technology, Pondicherry University, Pondicherry, India

Medical imaging plays a significant role in different clinical applications such as medical procedures used for early detection, monitoring, diagnosis, and treatment evaluation of various medical conditions. Basics of the principles and implementations of artificial neural networks and deep learning are essential for understanding medical image analysis in computer vision. The Deep Learning Approach (DLA) in medical image analysis is a fast-growing research field. DLA has been widely used in medical imaging to detect the presence or absence of disease. This paper presents the development of artificial neural networks and a comprehensive analysis of DLA, which delivers promising medical imaging applications. Most DLA implementations concentrate on X-ray images, computerized tomography, mammography images, and digital histopathology images. The paper provides a systematic review of articles on classification, detection, and segmentation of medical images based on DLA. This review guides researchers toward appropriate choices in DLA-based medical image analysis.

Introduction

In the health care system, there has been a dramatic increase in demand for medical imaging services, e.g. radiography, endoscopy, Computed Tomography (CT), Mammography Images (MG), ultrasound images, Magnetic Resonance Imaging (MRI), Magnetic Resonance Angiography (MRA), nuclear medicine imaging, Positron Emission Tomography (PET), and pathological tests. Moreover, medical images can be challenging and time-consuming to analyze due to the shortage of radiologists.

Artificial Intelligence (AI) can address these problems. Machine Learning (ML) is an application of AI in which systems learn from data and make predictions or decisions based on past data without being explicitly programmed. ML uses three learning approaches: supervised learning, unsupervised learning, and semi-supervised learning. Classical ML techniques involve the extraction of features, and selecting suitable features for a specific problem requires a domain expert. Deep learning (DL) techniques solve this feature-selection problem. DL is a subset of ML that can automatically extract essential features from raw input data [ 88 ]. The concept of DL algorithms was introduced from cognitive and information theories. In general, DL has two properties: (1) multiple processing layers that can learn distinct features of data through multiple levels of abstraction, and (2) unsupervised or supervised learning of feature representations at each layer. A large number of recent review papers have highlighted the capabilities of advanced DLA in the medical field: MRI [ 8 ], Radiology [ 96 ], Cardiology [ 11 ], and Neurology [ 155 ].

Different forms of DLA were borrowed from the field of computer vision and applied to specific medical image analysis tasks. Recurrent Neural Networks (RNNs) and convolutional neural networks are examples of supervised DL algorithms. In medical image analysis, unsupervised learning algorithms have also been studied; these include Deep Belief Networks (DBNs), Restricted Boltzmann Machines (RBMs), Autoencoders, and Generative Adversarial Networks (GANs) [ 84 ]. DLA is generally applicable for detecting abnormalities and classifying specific types of disease. When DLA is applied to medical images, Convolutional Neural Networks (CNNs) are ideally suited for classification, segmentation, object detection, registration, and other tasks [ 29 , 44 ]. A CNN is an artificial visual neural network structure used for medical image pattern recognition based on the convolution operation. Deep learning (DL) applications in medical images are visualized in Fig.  1 .

Fig. 1 a X-ray image with pulmonary masses [ 121 ] b CT image with lung nodule [ 82 ] c Digitized histopathological tissue image [ 132 ]

Neural networks

History of neural networks

The study of artificial neural networks and deep learning derives from the ambition to create a computer system that simulates the human brain [ 33 ]. A neurophysiologist, Warren McCulloch, and a mathematician, Walter Pitts [ 97 ], developed a primitive neural network based on what was then known of biological structure in the early 1940s. In 1949, a book titled “Organization of Behavior” [ 100 ] was the first to describe the process of updating synaptic weights, now referred to as the Hebbian Learning Rule. In 1958, Frank Rosenblatt’s [ 127 ] landmark paper defined the structure of the neural network called the perceptron for the binary classification task.

In 1962, Widrow [ 172 ] introduced a device called the Adaptive Linear Neuron (ADALINE), implementing the design in hardware. The limitations of perceptrons were emphasized by Minsky and Papert (1969) [ 98 ]. The concept of the backward propagation of errors for training purposes was discussed by Werbos (1974) [ 171 ]. In 1979, Fukushima [ 38 ] designed an artificial neural network called the Neocognitron, with multiple pooling and convolution layers. In 1989, Yann LeCun [ 71 ] combined CNN with backpropagation to effectively perform the automated recognition of handwritten digits. One of the most important breakthroughs in deep learning occurred in 2006, when Hinton et al. [ 9 ] implemented the Deep Belief Network, stacking several layers of Restricted Boltzmann Machines and greedily training one layer at a time in an unsupervised fashion. Figure 2 shows important advancements in the history of neural networks that led to the deep learning era.

Fig. 2 Demonstrations of significant developments in the history of neural networks [ 33 , 134 ]

Artificial neural networks

Artificial Neural Networks (ANN) form the basis for most DLA. An ANN is a computational model structure that has some performance characteristics similar to biological neural networks. An ANN comprises simple processing units called neurons or nodes that are interconnected by weighted links. A biological neuron can be described mathematically as in Eq. ( 1 ). Figure 3 shows the simplest artificial neural model, known as the perceptron.

Fig. 3 Perceptron [ 77 ]
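As a concrete illustration of the neuron model above, the following minimal sketch computes a weighted sum of inputs plus a bias and thresholds it with a step activation; the weights, bias, and the AND example are illustrative choices, not taken from the paper.

```python
import numpy as np

def perceptron(x, w, b):
    """Single artificial neuron: weighted input sum plus bias,
    passed through a step activation (fires 1 if non-negative)."""
    return 1 if np.dot(w, x) + b >= 0 else 0

# Hand-picked weights that make the neuron compute logical AND:
# only the input (1, 1) drives the weighted sum above the threshold.
w, b = np.array([1.0, 1.0]), -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), w, b))
```

Rosenblatt's perceptron additionally learns w and b from labelled data; here they are fixed to keep the computation of a single neuron visible.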

Training a neural network with Backpropagation (BP)

In neural networks, the learning process is modeled as an iterative optimization of the weights to minimize a loss function. Based on network performance, the weights are modified on a set of examples belonging to the training set. The necessary steps of the training procedure comprise forward and backward phases. For neural network training, an activation function is selected for forward propagation, and BP training is used for updating the weights. The BP algorithm helps a multilayer FFNN learn input-output mappings from training samples [ 16 ]. Forward propagation and backpropagation are explained for a one-hidden-layer deep neural network in the following algorithm.

The backpropagation algorithm is as follows for a one-hidden-layer neural network:

  • 1. Initialize all weights to small random values.
  • 2. While the stopping condition is false, do steps 3 through 10.
  • 3. For each training pair ( x 1 ,  y 1 )…( x n ,  y n ), do steps 4 through 9.

Feed-forward propagation:

  • 4. Each input unit ( X i , i  = 1, 2, … n ) receives the input signal x i and sends this signal to all hidden units in the layer above.
  • 5. Each hidden unit ( Z j , j  = 1, …, p ) computes z j _ in = b j + ∑ i = 1 n w ij x i , applies the activation function Z j  =  f ( z j _ in ), and transmits the result to the output units.
  • 6. Each output unit ( Y k , k  = 1, 2, …, m ) computes y k _ in = b k + ∑ j = 1 p z j w jk and calculates the activation y k  =  f ( y k _ in ).

Backpropagation:

  • 7. At the output-layer neurons: δ k  = ( t k  −  y k ) f ′ ( y k _ in ).
  • 8. At the hidden-layer neurons: δ j  =  f ′ ( z j _ in ) ∑ k = 1 m δ k w jk .
  • 9. Update weights and biases, where η is the learning rate:

Each output unit ( Y k , k  = 1, 2, … m ) updates its weights ( j  = 0, 1, … p ) and bias:

w jk ( new ) =  w jk ( old ) +  ηδ k z j ; b k ( new ) =  b k ( old ) +  ηδ k

Each hidden unit ( Z j , j  = 1, 2, … p ) updates its weights ( i  = 0, 1, … n ) and bias:

w ij ( new ) =  w ij ( old ) +  ηδ j x i ; b j ( new ) =  b j ( old ) +  ηδ j

  • 10. Test the stopping condition.
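The steps above can be sketched in NumPy for a small 2-2-1 network trained on XOR; the network size, learning rate, epoch count, and sigmoid activation are illustrative choices, not specified in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
f  = lambda a: 1.0 / (1.0 + np.exp(-a))   # sigmoid activation
df = lambda out: out * (1.0 - out)        # f'(a) expressed via the output f(a)

# XOR training pairs (x, t)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0], [1], [1], [0]], float)

# Step 1: initialise weights and biases to small random values
W1, b1 = rng.normal(0, 0.5, (2, 2)), np.zeros(2)
W2, b2 = rng.normal(0, 0.5, (2, 1)), np.zeros(1)

eta = 0.5
for epoch in range(20000):                 # step 2: fixed-epoch stopping rule
    for x, t in zip(X, T):                 # step 3: each training pair
        z = f(b1 + x @ W1)                 # steps 4-5: hidden activations
        y = f(b2 + z @ W2)                 # step 6: output activations
        d_out = (t - y) * df(y)            # step 7: output-layer delta
        d_hid = df(z) * (W2 @ d_out)       # step 8: hidden-layer deltas
        W2 += eta * np.outer(z, d_out); b2 += eta * d_out   # step 9: updates
        W1 += eta * np.outer(x, d_hid); b1 += eta * d_hid

pred = f(b2 + f(b1 + X @ W1) @ W2)         # forward pass on all four inputs
print(np.round(pred.ravel()))
```

With this initialisation the network typically learns the XOR mapping; training on XOR is a classic demonstration that one hidden layer suffices for a non-linearly-separable problem.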

Activation function

The activation function is the mechanism by which artificial neurons process and transfer information [ 42 ]. Various types of activation functions can be used in neural networks, depending on the characteristics of the application. Activation functions are typically non-linear and continuously differentiable; the differentiability property is important mainly when training a neural network using the gradient descent method. Some widely used activation functions are listed in Table 1.

Activation functions
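A few activation functions commonly used in neural networks can be written as one-line vectorised definitions; this is a generic sketch, not a reproduction of Table 1.

```python
import numpy as np

# Common elementwise activation functions and their output ranges
sigmoid    = lambda x: 1.0 / (1.0 + np.exp(-x))           # (0, 1)
tanh       = np.tanh                                      # (-1, 1)
relu       = lambda x: np.maximum(0.0, x)                 # [0, inf)
leaky_relu = lambda x, a=0.01: np.where(x > 0, x, a * x)  # small slope for x < 0

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))
print(relu(x))
print(leaky_relu(x))
```

ReLU and its variants avoid the saturation of sigmoid/tanh for large inputs, which is one reason they dominate in deep architectures.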

Deep learning

Deep learning is a subset of the machine learning field which deals with the development of deep neural networks inspired by biological neural networks in the human brain.

Autoencoder

The Autoencoder (AE) [ 128 ] is one of the deep learning models which exemplifies the principle of unsupervised representation learning, as depicted in Fig.  4a . AE is useful when unlabelled data are far more plentiful than labeled data. The AE encodes the input x into a lower-dimensional space z; the encoded representation is then decoded to an approximate representation  x ′ of the input x through one hidden layer z.

Fig. 4 a Autoencoder [ 187 ] b Restricted Boltzmann Machine with n hidden and m visible units [ 88 ] c Deep Belief Networks [ 88 ]

Basic AE consists of three main steps:

Encode: Convert the input vector x ϵ R m into h ϵ R n , the hidden layer, by h  =  f ( wx  +  b ), where w ϵ R m ∗ n and b ϵ R n . m and n are the dimensions of the input vector and the hidden state, and the dimension n of the hidden layer h is smaller than that of x . f is an activation function.

Decode: Based on the above  h , reconstruct input vector z by equation z  =  f ′ ( w ′ h  +  b ′ ) where w ′ ϵ R n ∗ m and b ′ ϵ R m . The f ′ is the same as the above activation function.

Calculate the squared error: L recons ( x , z) =  ∥  x  − z∥ 2 , the reconstruction error cost function. Reconstruction error is minimized by optimizing this cost function (2)
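The three steps above can be sketched with untrained, randomly initialised parameters; the dimensions, weights, and sigmoid activation are illustrative assumptions chosen only to show the shapes involved.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 3                                  # input dim m, hidden dim n < m
f = lambda a: 1.0 / (1.0 + np.exp(-a))       # activation function

# Encoder (W, b) and decoder (W_, b_) parameters, untrained
W,  b  = rng.normal(0, 0.1, (m, n)), np.zeros(n)
W_, b_ = rng.normal(0, 0.1, (n, m)), np.zeros(m)

x = rng.random(m)
h = f(x @ W + b)                 # encode: compress x into an n-dim code h
z = f(h @ W_ + b_)               # decode: reconstruct the input from h
loss = np.sum((x - z) ** 2)      # reconstruction error ||x - z||^2
print(h.shape, z.shape, loss)
```

Training would minimise this loss over a dataset (e.g. by the backpropagation procedure described earlier), forcing h to capture the most informative structure of x.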

Another unsupervised representation-learning model is the Stacked Autoencoder (SAE). The SAE comprises stacks of autoencoder layers mounted on top of each other, where the output of each layer is wired to the inputs of the next layer. The Denoising Autoencoder (DAE), introduced by Vincent et al. [ 159 ], is trained to reconstruct the input from input data corrupted by random noise. The Variational Autoencoder (VAE) [ 66 ] modifies the encoder so that the latent vector space used to represent the images follows a unit Gaussian distribution. There are two losses in this model: a mean squared error, and the Kullback–Leibler divergence loss that determines how closely the latent variables match the unit Gaussian distribution. Sparse autoencoders [ 106 ] and variational autoencoders have applications in unsupervised learning, semi-supervised learning, and segmentation.

Restricted Boltzmann machine

A Restricted Boltzmann Machine (RBM) is a Markov Random Field (MRF) associated with a two-layer undirected probabilistic generative model, as shown in Fig. 4b. An RBM contains visible (input) units v and hidden (output) units  h . A significant feature of this model is that there are no direct connections between any two visible units or between any two hidden units. In binary RBMs, the random variables ( v ,  h ) take values ( v ,  h ) ∈ {0, 1} m  +  n . Like the general Boltzmann machine [ 50 ], the RBM is an energy-based model. The energy of the state { v ,  h } is defined as (3)

where v j , h i are the binary states of visible unit j  ∈ {1, 2, … m } and hidden unit i  ∈ {1, 2, .. n }, b j , c i  are the biases of the visible and hidden units, and w ij is the symmetric interaction term between the units v j and h i . The joint probability of ( v ,  h ) is given by the Gibbs distribution in Eq. ( 4 )

Z is a “partition function” that can be given by summing over all possible pairs of visual v  and hidden h (5).

In terms of probability, the conditional distributions p ( h ∣  v ) and p ( v ∣  h ) factorize as (6) p ( h ∣ v ) = ∏ i = 1 n p ( h i ∣ v )

For binary RBM condition distribution of visible and hidden are given by (7) and (8)

where σ( · ) is a sigmoid function

The RBM parameters ( w ij ,  b j ,  c i ) are efficiently estimated using the contrastive divergence learning method [ 150 ]. A batch version of k-step contrastive divergence learning (CD-k) is summarized in the algorithm below [ 36 ]

[Algorithm figure: batch version of k-step contrastive divergence learning (CD-k)]
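A compact sketch of one CD-1 update for a binary RBM, assuming the sigmoid conditionals of Eqs. (7)–(8); the dimensions, learning rate, and sampling scheme are illustrative, not the exact batch algorithm of [36].

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def cd1_update(v0, W, b, c, eta=0.1):
    """One CD-1 update of binary-RBM parameters from a visible vector v0.
    W: (m, n) weights, b: (m,) visible biases, c: (n,) hidden biases."""
    # Positive phase: hidden probabilities and a sample given the data
    ph0 = sigmoid(c + v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step to obtain a reconstruction
    pv1 = sigmoid(b + h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(c + v1 @ W)
    # Gradient approximation: data statistics minus reconstruction statistics
    W += eta * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += eta * (v0 - v1)
    c += eta * (ph0 - ph1)
    return W, b, c

m, n = 6, 4
W = rng.normal(0, 0.1, (m, n)); b = np.zeros(m); c = np.zeros(n)
v = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])
W, b, c = cd1_update(v, W, b, c)
print(W.shape, b.shape, c.shape)
```

CD-k with k > 1 simply repeats the Gibbs step k times before computing the negative-phase statistics; CD-1 is the cheapest and most common choice.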

Deep belief networks

The Deep Belief Network (DBN) proposed by Hinton et al. [ 51 ] is a non-convolutional model that can extract features and learn a deep hierarchical representation of training data. DBNs are generative models constructed by stacking multiple RBMs. The DBN is a hybrid model: the top two layers form an RBM, and the rest of the layers form a directed generative model. A DBN has one visible layer v and a series of hidden layers h (1) , h (2) , …, h ( l ) , as shown in Fig. 4c. The DBN models the joint distribution between the observed units v and the l  hidden layers h k (  k  = 1, … l ) as (9)

where v  =  h (0) , P ( h k |  h k  + 1 ) is a conditional distribution (10) for the layer k given the units of k  + 1

A DBN has l weight matrices W (1) , …, W ( l ) and l  + 1 bias vectors b (0) , …, b ( l ) . P ( h ( l ) ,  h ( l  − 1) ) is the joint distribution of the top-level RBM (11).

The probability distribution of DBN is given by Eq. ( 12 )

Convolutional neural networks (CNN)

Within neural networks, the CNN is a unique family of deep learning models, and the major artificial visual network for the identification of medical image patterns. The CNN family primarily emerged from studies of the animal visual cortex [ 55 , 116 ]. The major problem with a fully connected feed-forward neural network is that, even for shallow architectures, the number of neurons may be very high, which makes it impractical for image applications. The CNN reduces the number of parameters, allowing a network to be deeper with fewer parameters.

CNNs are designed based on three architectural ideas: shared weights, local receptive fields, and spatial sub-sampling [ 70 ]. The essential element of the CNN is the handling of unstructured data through the convolution operation. Convolution of the input signal  x ( t ) with the filter signal  h ( t ) creates an output signal y ( t ) that may reveal more information than the input signal itself. The 1-D convolution of discrete signals x ( t ) and h ( t ) is (13)

A digital image x ( n 1 ,  n 2 ) is a 2-D discrete signal. The convolution of images  x ( n 1 ,  n 2 ) and h ( n 1 ,  n 2 ) is (14)

where 0 ≤  n 1  ≤  M  − 1, 0 ≤  n 2  ≤  N  − 1.

The function of the convolution layer is to detect local features x l from input feature maps x l  − 1 using kernels k l by convolution operation (*) i.e. x l  − 1  ∗  k l . This convolution operation is repeated for every convolutional layer subject to non-linear transform (15)

where k mn l represents the weights between feature map  m at layer l  − 1 and feature map n at layer l , x m l − 1 represents the  m -th feature map of layer l  − 1, and x n l is the n -th feature map of layer l . b m l is the bias parameter, f (.) is the non-linear activation function, and  M l  − 1 denotes a set of feature maps. The CNN significantly reduces the number of parameters compared with a fully connected neural network because of local connectivity and weight sharing. Depth, zero-padding, and stride are three hyperparameters controlling the volume of the convolution layer output.
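The 2-D convolution of Eq. (14) can be sketched directly with nested loops (a "valid" convolution, no padding, stride 1); as in most CNN libraries, the kernel is applied without flipping, i.e. as cross-correlation. The image and kernel values are illustrative.

```python
import numpy as np

def conv2d(x, k):
    """'Valid' 2-D convolution of image x with kernel k (no padding, stride 1).
    Implemented, as in most CNN libraries, as cross-correlation."""
    M, N = x.shape
    kh, kw = k.shape
    out = np.empty((M - kh + 1, N - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel is the sum of an elementwise product
            # between the kernel and the patch under it
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
k = np.array([[1.0, 0.0], [0.0, -1.0]])        # diagonal-difference kernel
print(conv2d(x, k))                            # 3x3 feature map
```

A convolutional layer learns the kernel values; sliding the same small kernel over the whole image is exactly the weight sharing and local connectivity described above.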

A pooling layer comes after the convolutional layer to subsample the feature maps. The goal of the pooling layers is to achieve spatial invariance by reducing the spatial dimension of the feature maps passed to the next convolution layer. Max pooling and average pooling are two commonly used pooling operations for downsampling. Let the size of the pooling region be M  and each element in the pooling region be given as x j  = ( x 1 ,  x 2 , … x M  ×  M ); the output after pooling is given as x i . Max pooling and average pooling are described in the following Eqs. ( 16 ) and ( 17 ).

The max-pooling method chooses the most superior invariant feature in a pooling region; it holds texture information and can lead to faster convergence. The average-pooling method selects the average of all the features in the pooling area and thus keeps background information [ 133 ]. Spatial pyramid pooling [ 48 ], stochastic pooling [ 175 ], Def-pooling [ 109 ], multi-activation pooling [ 189 ], and detail-preserving pooling [ 130 ] are other pooling techniques in the literature. A fully connected layer is used at the end of the CNN model. Fully connected layers perform like a traditional neural network [ 174 ]: the input to this layer is a vector of numbers (the output of the pooling layer), and the output is an N-dimensional vector (N being the number of classes). After the pooling layers, the feature maps of the previous layer are flattened and connected to the fully connected layers.
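Both pooling operations of Eqs. (16)–(17) can be sketched in a few lines for non-overlapping M×M regions; the feature-map values and pooling size are illustrative.

```python
import numpy as np

def pool2d(x, M=2, mode="max"):
    """Non-overlapping MxM pooling (stride M) over a 2-D feature map."""
    H, W = x.shape
    x = x[:H - H % M, :W - W % M]              # trim to a multiple of M
    blocks = x.reshape(H // M, M, W // M, M)   # group pixels into MxM blocks
    if mode == "max":
        return blocks.max(axis=(1, 3))         # Eq. (16): keep the maximum
    return blocks.mean(axis=(1, 3))            # Eq. (17): keep the average

fmap = np.array([[1.0, 2.0, 5.0, 6.0],
                 [3.0, 4.0, 7.0, 8.0],
                 [0.0, 0.0, 1.0, 1.0],
                 [0.0, 4.0, 1.0, 1.0]])
print(pool2d(fmap, 2, "max"))   # 2x2 map of block maxima
print(pool2d(fmap, 2, "avg"))   # 2x2 map of block means
```

Either way, a 4×4 feature map shrinks to 2×2, which is the spatial sub-sampling that gives the next layer a degree of translation invariance.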

The first successful seven-layer CNN, LeNet-5, was developed by Yann LeCun for handwritten digit recognition. Krizhevsky et al. [ 68 ] proposed AlexNet, a deep convolutional neural network composed of 5 convolutional and 3 fully-connected layers. AlexNet replaced the sigmoid activation function with the ReLU activation function to make model training easier.

K. Simonyan and A. Zisserman invented VGG-16 [ 143 ], which has 13 convolutional and 3 fully connected layers. The Visual Geometry Group (VGG) released a series of CNNs: VGG-11, VGG-13, VGG-16, and VGG-19. The main intention of the VGG group was to understand how the depth of convolutional networks affects the accuracy of image classification and recognition models. The largest, VGG-19, has 16 convolutional layers and 3 fully connected layers, while the smallest, VGG-11, has 8 convolutional layers and 3 fully connected layers. The three fully connected layers are the same across the VGG variants.

Szegedy et al. [ 151 ] proposed GoogLeNet, an image classification network consisting of 22 layers. The main idea behind GoogLeNet is the introduction of inception layers, each of which convolves the input with several different filter sizes in parallel. Kaiming He et al. [ 49 ] proposed the ResNet architecture, which has 33 convolutional layers and one fully-connected layer. Many models had introduced the principle of using multiple hidden layers and extremely deep neural networks, but it was then realized that such models suffer from the vanishing or exploding gradient problem. To eliminate the vanishing gradient problem, skip layers (shortcut connections) were introduced. DenseNet, developed by Gao et al. [ 54 ], consists of several dense blocks and transition blocks placed between two adjacent dense blocks. A dense block layer consists of batch normalization, followed by a ReLU and a 3 × 3 convolution operation; the transition blocks are made of batch normalization, a 1 × 1 convolution, and average pooling.
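The shortcut connection that lets ResNet avoid vanishing gradients can be sketched with a simplified fully-connected residual block; the layer sizes, random weights, and use of dense (rather than convolutional) layers are illustrative assumptions.

```python
import numpy as np

relu = lambda x: np.maximum(0.0, x)

def residual_block(x, W1, W2):
    """Simplified residual block: the input x skips over two weight layers
    and is added back before the final activation, so gradients can always
    flow through the identity path even if the weight path saturates."""
    out = relu(x @ W1)        # first weight layer + activation
    out = out @ W2            # second weight layer
    return relu(out + x)      # shortcut (skip) connection, then activation

rng = np.random.default_rng(0)
x = rng.random(8)
W1 = rng.normal(0, 0.1, (8, 8))
W2 = rng.normal(0, 0.1, (8, 8))
y = residual_block(x, W1, W2)
print(y.shape)
```

Note the degenerate case: with all-zero weights the block reduces to the identity (for non-negative x), which is exactly why stacking many such blocks does not degrade the signal the way plain deep stacks can.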

Compared to state-of-the-art handcrafted feature detectors, CNNs are an efficient technique for detecting features of an object and achieving good classification performance. A drawback of CNNs is that the unique relationships, size, perspective, and orientation of features are not taken into account. To overcome the loss of information caused by the pooling operation in CNNs, Capsule Networks (CapsNet) are used to retain spatial information and the most significant features [ 129 ]. Special neurons, called capsules, can efficiently detect distinct information. The capsule network consists of four main components: matrix multiplication, scalar weighting of the input, a dynamic routing algorithm, and a squashing function.

Recurrent neural networks (RNN)

The RNN is a class of neural networks used for processing sequential information. The structure of the RNN shown in Fig.  5a is like an FFNN, the difference being that recurrent connections are introduced among the hidden nodes. In a generic RNN model at time t , the recurrent hidden unit h t receives input activation from the present input x t and the previous hidden state  h t  − 1 ; the output y t is calculated from the hidden state h t . This can be represented using Eqs. ( 18 ) and ( 19 ) as

h t = f ( w hx x t + w hh h t − 1 + b h ) (18)

y t = f ( w yh h t + b y ) (19)

Fig. 5 a Recurrent Neural Networks [ 163 ] b Long Short-Term Memory [ 163 ] c Generative Adversarial Networks [ 64 ]

Here f is a non-linear activation function, w hx is the weight matrix between the input and hidden layers, w hh is the matrix of recurrent weights between the hidden layer and itself, w yh is the weight matrix between the hidden and output layers, and b h and b y are bias terms. While the RNN is a simple and efficient model, in practice it is unfortunately difficult to train properly. The Real-Time Recurrent Learning (RTRL) algorithm [ 173 ] and Back Propagation Through Time (BPTT) [ 170 ] are used to train RNNs. Training with these methods frequently fails because of the vanishing (multiplication of many small values) or exploding (multiplication of many large values) gradient problem [ 10 , 112 ]. Hochreiter and Schmidhuber (1997) designed a new RNN model named Long Short-Term Memory (LSTM) that overcomes error-backflow problems with the aid of a specially designed memory cell [ 52 ]. Figure 5b shows an LSTM cell, which is typically configured by three gates: the input gate g t , forget gate  f t , and output gate  o t ; these gates add or remove information from the cell state.

An LSTM can be represented with the following Eqs. ( 20 ) to ( 25 )
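One LSTM step can be sketched in the standard formulation, using the gate names from the text (input gate g_t, forget gate f_t, output gate o_t); the parameter dictionary, dimensions, and sigmoid/tanh choices are conventional assumptions, not copied from Eqs. (20)–(25) verbatim.

```python
import numpy as np

sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, P):
    """One LSTM step: the forget gate f_t and input gate g_t control the
    memory cell c_t; the output gate o_t exposes it as hidden state h_t.
    P holds one (W, U, b) parameter triple per gate."""
    g_t = sigmoid(x_t @ P["Wg"] + h_prev @ P["Ug"] + P["bg"])      # input gate
    f_t = sigmoid(x_t @ P["Wf"] + h_prev @ P["Uf"] + P["bf"])      # forget gate
    o_t = sigmoid(x_t @ P["Wo"] + h_prev @ P["Uo"] + P["bo"])      # output gate
    c_tilde = np.tanh(x_t @ P["Wc"] + h_prev @ P["Uc"] + P["bc"])  # candidate
    c_t = f_t * c_prev + g_t * c_tilde    # forget old memory, add new content
    h_t = o_t * np.tanh(c_t)              # gated hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
d, k = 4, 3                               # input and hidden sizes
P = {}
for name in "gfoc":
    P[f"W{name}"] = rng.normal(0, 0.1, (d, k))
    P[f"U{name}"] = rng.normal(0, 0.1, (k, k))
    P[f"b{name}"] = np.zeros(k)
h, c = np.zeros(k), np.zeros(k)
h, c = lstm_step(rng.random(d), h, c, P)
print(h.shape, c.shape)
```

The additive update of c_t (rather than repeated multiplication by recurrent weights) is what lets error signals flow back over long sequences without vanishing.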

Generative adversarial networks (GAN)

In the field of deep learning, one class of deep generative models is the Generative Adversarial Network (GAN), introduced by Goodfellow et al. [ 43 ]. GANs are neural networks that can generate synthetic images that closely imitate the original images. In a GAN, shown in Fig. 5c, there are two neural networks, the generator and the discriminator, which are trained simultaneously. The generator G generates counterfeit data samples that aim to “fool” the discriminator  D , while the discriminator attempts to correctly distinguish true and false samples. In mathematical terms, D and G play a two-player minimax game with the cost function (26) [ 64 ].

where x represents the original image and z is a noise vector of random numbers. p data ( x ) and p z ( z ) are the probability distributions of x and  z , respectively.  D ( x ) represents the probability that x comes from the actual data p data ( x ) rather than the generated data, and 1 −  D ( G (z)) is the probability that a sample was generated from p z (z). The expectation of x over the real data distribution  p data is written E x ~ p data x , and the expectation of z sampled from noise is E z ~ P z z . The training goal for the discriminator is to maximize the loss function, while the training objective for the generator is to minimize the term log (1 −  D ( G ( z ))). The main uses of GANs in the field of medical image analysis are data augmentation (generating new data) and image-to-image translation [ 107 ]. Trustability of the generated data, unstable training, and evaluation of the generated data are three major drawbacks of GANs that might hinder their acceptance in the medical community [ 183 ].
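The value function (26) can be estimated numerically from discriminator outputs on real and generated samples; the function name and the toy scores below are illustrative.

```python
import numpy as np

def gan_value(d_real, d_fake):
    """Monte-Carlo estimate of the GAN value function:
    V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))],
    given the discriminator's probabilities on real and generated samples."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that is confident on real data and rejects fakes scores
# a higher value than one that cannot tell the two apart (D = 0.5 everywhere).
confident = gan_value(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
confused  = gan_value(np.array([0.5, 0.5]),  np.array([0.5, 0.5]))
print(confident > confused)
```

Training alternates between D maximizing this value and G minimizing it; at the game's equilibrium D outputs 0.5 everywhere, i.e. the generated samples are indistinguishable from real ones.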

Ronneberger et al. [ 126 ] proposed CNN based U-Net architecture for segmentation in biomedical image data. The architecture consists of a contracting path (left side) to capture context and an expansive symmetric path (right side) that enables precise localization. U-Net is a generalized DLA used for quantification tasks such as cell detection and shape measurement in medical image data [ 34 ].

Software frameworks

There are several software frameworks available for implementing DLA, which are regularly updated as new approaches and ideas are created. DLA encapsulates many levels of mathematical principles based on probability, linear algebra, calculus, and numerical computation. Several deep learning frameworks exist, such as Theano, TensorFlow, Caffe, CNTK, Torch, Neon, Pylearn, etc. [ 138 ]. Globally, Python is probably the most commonly used programming language for DL; PyTorch and TensorFlow were the most widely used libraries for research in 2019. Table 2 shows an analysis of various deep learning frameworks based on their core language and supported interface languages.

Comparison of various Deep Learning Frameworks

Use of deep learning in medical imaging

X-ray image

Chest radiography is widely used in diagnosis to detect heart pathologies and lung diseases such as tuberculosis, atelectasis, consolidation, pleural effusion, pneumothorax, and hyperinflation. X-ray images are accessible, affordable, and deliver a lower radiation dose compared to other imaging methods, making them a powerful tool for mass screening [ 14 ]. Table 3 presents a description of the DL methods used for X-ray image analysis.

An overview of the DLA for the study of X-ray images

S. Hwang et al. [ 57 ] proposed the first deep CNN-based tuberculosis screening system using a transfer learning technique. Rajaraman et al. [ 119 ] proposed modality-specific ensemble learning for the detection of abnormalities in chest X-rays (CXRs); the model predictions are combined using various ensemble techniques to minimize prediction variance, and class-selective relevance mapping (CRM) is used for visualizing the abnormal regions in the CXR images. Loey et al. [ 90 ] proposed a GAN with deep transfer learning for COVID-19 detection in CXR images; the GAN was used to generate more CXR images due to the scarcity of COVID-19 data. Waheed et al. [ 160 ] proposed CovidGAN, a model based on the Auxiliary Classifier Generative Adversarial Network (ACGAN), to produce synthetic CXR images for COVID-19 detection. S. Rajaraman and S. Antani [ 120 ] introduced weakly labeled data augmentation to enlarge the training dataset and improve COVID-19 detection performance in CXR images.

Computerized tomography (CT)

CT uses computers and rotating X-ray equipment to create cross-sectional images of the body. CT scans show the soft tissues, blood vessels, and bones in different parts of the body. CT has high detection ability, reveals small lesions, and provides a more detailed assessment. CT examinations are frequently used for pulmonary nodule identification [ 93 ]. The detection of malignant pulmonary nodules is fundamental to the early diagnosis of lung cancer [ 102 , 142 ]. Table 4 summarizes the latest deep learning developments in CT image analysis.

A review of articles that use DL techniques for the analysis of the CT image

AUC: area under ROC curve; FROC: area under the free-response ROC curve; SN: sensitivity; SP: specificity; MAE: mean absolute error; LIDC: Lung Image Database Consortium; LIDC-IDRI: Lung Image Database Consortium-Image Database Resource Initiative.

Li et al. (2016) [ 74 ] proposed a deep CNN for the detection of three types of nodules: semisolid, solid, and ground-glass opacity. Balagourouchetty et al. [ 5 ] proposed a GoogLeNet-based ensemble FCNet classifier for liver lesion classification; for feature extraction, the basic GoogLeNet architecture is modified with three changes. Masood et al. [ 95 ] proposed the multidimensional Region-based Fully Convolutional Network (mRFCN) for lung nodule detection/classification and achieved a classification accuracy of 97.91%. In lung nodule detection, future work is the detection of micronodules (less than 3 mm) without loss of sensitivity and accuracy. Zhao and Zeng (2019) [ 190 ] proposed DLA based on supervised MSS U-Net and 3D U-Net to automatically segment kidneys and kidney tumors from CT images. In the present pandemic situation, Fan et al. [ 35 ] and Li et al. [ 79 ] used deep learning-based techniques for COVID-19 detection from CT images.

Mammography (MG)

Breast cancer is one of the world’s leading causes of death among women. MG is a reliable tool and the most common modality for early detection of breast cancer. MG is a low-dose X-ray imaging method used to visualize the breast structure for the detection of breast diseases [ 40 ]. Detection of breast cancer on mammography screening is a difficult image classification task because the tumors constitute only a small part of the breast image. Analyzing breast lesions from MG involves three steps: detection, segmentation, and classification [ 139 ].

The automatic classification and detection of masses at an early stage in MG is still a hot research subject. Over the past decade, DLA has made significant progress on the breast cancer detection and classification problem. Table 5 summarizes the latest DLA developments in the study of mammogram image analysis.

Summary of DLA for MG image analysis

MIAS: Mammographic Image Analysis Society dataset; DDSM: Digital Database for Screening Mammography; BI-RADS: Breast Imaging Reporting and Data System; WBCD: Wisconsin Breast Cancer Dataset; DIB-MG: data-driven imaging biomarker in mammography; FFDMs: full-field digital mammograms; MAMMO: Man and Machine Mammography Oracle; FROC: free-response receiver operating characteristic analysis; SN: sensitivity; SP: specificity.

Fonseca et al. [ 37 ] proposed breast composition classification according to the ACR standard based on CNN feature extraction. Wang et al. [ 161 ] proposed a twelve-layer CNN to detect breast arterial calcifications (BACs) in mammogram images for risk assessment of coronary artery disease. Ribli et al. [ 124 ] developed a CAD system based on Faster R-CNN for detection and classification of benign and malignant lesions on mammogram images without any human involvement. Wu et al. [ 176 ] present a deep CNN trained and evaluated on over 1,000,000 mammogram images for breast cancer screening exam classification. Conant et al. [ 26 ] developed a deep CNN-based AI system to detect calcified lesions and soft tissue in digital breast tomosynthesis (DBT) images. Kang et al. [ 62 ] introduced a fuzzy fully connected layer (FFCL) architecture, which primarily fuses fuzzy rules with a traditional CNN for semantic BI-RADS scoring; the proposed FFCL framework achieved superior results in BI-RADS scoring for both triple and multi-class classifications.

Histopathology

Histopathology is the study of human tissue mounted on glass slides under a microscope to identify diseases such as kidney cancer, lung cancer, and breast cancer. Staining is used in histopathology to visualize and highlight specific parts of the tissue [ 45 ]. For example, hematoxylin and eosin (H&E) staining gives a dark purple color to the nucleus and a pink color to other structures. The H&E stain has played a key role in the diagnosis of different pathologies and in cancer diagnosis and grading over the last century. The most recent imaging modality is digital pathology.

Deep learning is emerging as an effective method in the analysis of histopathology images, including nucleus detection, image classification, cell segmentation, and tissue segmentation [ 178 ]. Tables 6 and 7 summarize the latest deep learning developments in pathology. In digital pathology image analysis, the latest development is the introduction of whole slide imaging (WSI), which allows digitizing glass slides with stained tissue sections at high resolution. Dimitriou et al. [ 30 ] reviewed challenges in analyzing multi-gigabyte WSI images for building deep learning models. A. Serag et al. [ 135 ] discuss different public “Grand Challenges” that have driven innovation using DLA in computational pathology.
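Because a WSI is far too large to feed to a network whole, deep learning pipelines typically tile the slide into fixed-size patches that fit in memory. A hypothetical sketch with a small stand-in array (real pipelines read slide regions with libraries such as OpenSlide):

```python
import numpy as np

slide = np.zeros((1024, 1024, 3), dtype=np.uint8)  # stand-in for one WSI region
patch = 256

# Non-overlapping tiling; each tile can then be fed to a CNN independently,
# and tile-level predictions are aggregated back to the slide level.
tiles = [slide[i:i + patch, j:j + patch]
         for i in range(0, slide.shape[0], patch)
         for j in range(0, slide.shape[1], patch)]
```

Here a 1024×1024 region yields a 4×4 grid of 256×256 tiles; at full WSI scale the same scheme produces tens of thousands of patches, which is the scale challenge Dimitriou et al. discuss.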

Summary of articles using DLA for digital pathology images: organ segmentation

Summary of articles using DLA for digital pathology images: detection and classification of disease

NODE: Neural Ordinary Differential Equations; IoU: mean Intersection over Union coefficient

Other images

Endoscopy is the insertion of a long nonsurgical tube directly into the body for the detailed visual examination of an internal organ or tissue. Endoscopy is beneficial in studying several systems inside the human body, such as the gastrointestinal tract, the respiratory tract, the urinary tract, and the female reproductive tract [ 60 , 101 ]. Du et al. [ 31 ] reviewed the applications of deep learning in the analysis of gastrointestinal endoscopy images. Wireless capsule endoscopy (WCE) is a revolutionary device for direct, painless, and non-invasive inspection of the gastrointestinal (GI) tract to detect and diagnose GI diseases such as ulcers and bleeding. Soffer et al. [ 145 ] performed a systematic analysis of the existing literature on the implementation of deep learning in WCE. He et al. [ 46 ] proposed the first deep learning-based framework for the detection of hookworm in WCE images, integrating two CNN networks (edge extraction and hookworm classification); since tubular structures are crucial elements for hookworm detection, the edge extraction network was used for tubular region detection. Yoon et al. [ 185 ] developed a CNN model for early gastric cancer (EGC) identification and prediction of invasion depth; the depth of tumor invasion in EGC is a significant factor in deciding the method of treatment. For the classification of endoscopic images as EGC or non-EGC, the authors employed a VGG-16 model. Nakagawa et al. [ 105 ] applied a CNN-based DL technique to enhance the diagnostic assessment of oesophageal wall invasion using endoscopy. J. Choi et al. [ 22 ] discuss the future aspects of DL in endoscopy.

Positron emission tomography (PET) is a nuclear imaging tool that visualizes molecular-level activities within tissues via the injection of specific radioactive tracers. T. Wang et al. [ 168 ] reviewed applications of machine learning in PET attenuation correction (PET AC) and low-count PET reconstruction, and discussed the advantages of deep learning over machine learning in PET applications. A. J. Reader et al. [ 123 ] reviewed PET image reconstruction methods in which deep learning can be used either directly or as part of traditional reconstruction methods.

The primary purpose of this paper is to review numerous publications in the field of deep learning applications in medical images. Classification, detection, and segmentation are essential tasks in medical image processing [ 144 ]. For specific deep learning tasks in medical applications, training deep neural networks requires a large amount of labeled data, but in the medical field even thousands of labeled examples are often unavailable. This issue is alleviated by transfer learning, of which two approaches are popular and widely applied: using the network as a fixed feature extractor, and fine-tuning a pre-trained network. In the classification task, deep learning models classify images into two or more classes. In the detection task, deep learning models identify tumors and organs in medical images. In the segmentation task, deep learning models segment the region of interest in medical images for further processing.
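The two transfer-learning modes can be contrasted in a minimal sketch (plain NumPy with random "pretrained" weights for illustration only; real work would freeze layers of a pretrained CNN in a framework such as PyTorch):

```python
import numpy as np

rng = np.random.default_rng(1)
W_pre = rng.normal(size=(8, 4))            # stands in for pretrained weights
W_pre_frozen = W_pre.copy()                # snapshot to show it never changes
W_head = np.zeros((4, 2))                  # new task-specific classifier head

x = rng.normal(size=(32, 8))               # toy inputs
y = np.eye(2)[(x[:, 0] > 0).astype(int)]   # toy one-hot labels

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Fixed feature extractor: W_pre receives no gradient updates; only the
# head is trained on the frozen features. (Fine-tuning would additionally
# update W_pre with a small learning rate.)
feats = np.maximum(x @ W_pre, 0)           # frozen ReLU features
for _ in range(300):
    p = softmax(feats @ W_head)
    W_head -= 0.05 * (feats.T @ (p - y)) / len(x)
# W_pre is untouched: the extractor stayed frozen throughout training.
```

Freezing the extractor lets a small labeled medical dataset train only the few head parameters, which is exactly why these two approaches dominate when labeled data is scarce.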

Segmentation

For medical image segmentation, deep learning has been widely used, and several articles have documented its progress in the area. Segmentation of breast tissue using deep learning alone has been successfully implemented [ 104 ]. Xing et al. [ 179 ] used a CNN to acquire the initial shape of the nucleus and then isolated the actual nucleus using a deformable model. Qu et al. [ 118 ] suggested a deep learning approach that could segment the individual nucleus and classify it as a tumor, lymphocyte, or stroma nucleus. Pinckaers and Litjens [ 115 ] show on a colon gland segmentation dataset (GlaS) that Neural Ordinary Differential Equations (NODEs) can be used within the U-Net framework to get better segmentation results. Sun 2019 [ 149 ] developed a deep learning architecture for gastric cancer segmentation that shows the advantage of utilizing multi-scale modules and specific convolution operations together. Figure 6 shows U-Net, the most commonly used network for segmentation.


U-Net architecture for segmentation, comprising encoder (downsampling) and decoder (upsampling) sections [ 135 ]
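The encoder-decoder-with-skip idea in Fig. 6 can be illustrated at the level of array shapes (a toy NumPy sketch of the data flow, not a trainable network):

```python
import numpy as np

x = np.random.rand(1, 16, 16, 8)                    # N, H, W, C feature map

# Encoder: 2x2 max-pooling halves the spatial resolution.
enc = x.reshape(1, 8, 2, 8, 2, 8).max(axis=(2, 4))  # -> (1, 8, 8, 8)

# Decoder: nearest-neighbour upsampling restores the resolution.
dec = enc.repeat(2, axis=1).repeat(2, axis=2)       # -> (1, 16, 16, 8)

# Skip connection: concatenate encoder features with decoder features at
# the same resolution, as the horizontal arrows in U-Net diagrams show.
merged = np.concatenate([x, dec], axis=-1)          # -> (1, 16, 16, 16)
```

The skip connection is what lets U-Net recover fine spatial detail lost in downsampling, which is why it dominates medical segmentation work.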

The main challenge posed by lesion detection methods is that they can give rise to multiple false positives while missing a good proportion of true positives. Deep learning methods for tuberculosis detection have been applied in [ 53 , 57 , 58 , 91 , 119 ]. Pulmonary nodule detection using deep learning has been successfully applied in [ 82 , 108 , 136 , 157 ].

Shin et al. [ 141 ] discussed the effect of pre-trained CNN architectures and transfer learning on the identification of enlarged thoracoabdominal lymph nodes and the diagnosis of interstitial lung disease on CT scans, and considered transfer learning to be helpful given that natural images differ from medical images. Litjens et al. [ 85 ] introduced a CNN for the identification of prostate cancer in biopsy specimens and breast cancer metastases in sentinel lymph nodes; the CNN has four convolution layers for feature extraction and three classification layers. Ribli et al. [ 124 ] proposed the Faster R-CNN model for the detection of mammography lesions and classified these lesions as benign or malignant, which finished second in the Digital Mammography DREAM Challenge. Figure 7 shows a VGG-based CNN architecture for detection.


CNN architecture for detection [ 144 ]

An object detection framework named Clustering CNN (CLU-CNNs) was proposed by Z. Li et al. [ 76 ] for medical images. CLU-CNNs use Agglomerative Nesting Clustering Filtering (ANCF) and a BN-IN Net to avoid the high computation cost of medical images. Image saliency detection aims at locating the most eye-catching regions in a given scene [ 21 , 78 ]. It also acts as a pre-processing tool in different applications, including video saliency detection [ 17 , 18 ], object recognition, and object tracking [ 20 ]. Saliency maps are a commonly used tool for determining which areas of the input image are most important to the prediction of a trained CNN [ 92 ]. N. T. Arun et al. [ 4 ] evaluated the performance of several popular saliency methods on the RSNA Pneumonia Detection dataset and found that GradCAM was sensitive to the model parameters and model architecture.
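The idea behind gradient-based saliency can be shown with a deliberately simplified model (a hypothetical linear scorer, not GradCAM or any method from the cited papers): for a score s(x) = w·x, the gradient of the score with respect to each pixel is just w, so |w| ranks pixel importance.

```python
import numpy as np

rng = np.random.default_rng(2)
h, w = 6, 6
weight = rng.normal(size=(h * w,))       # linear model parameters
image = rng.random(h * w)                # flattened input "image"

score = float(weight @ image)            # model prediction (a scalar score)
saliency = np.abs(weight).reshape(h, w)  # |d score / d pixel| for each pixel
hotspot = np.unravel_index(saliency.argmax(), saliency.shape)
```

For a deep CNN the gradient is computed by backpropagation rather than read off directly, but the output is the same kind of per-pixel importance map that the saliency evaluations above compare.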

Classification

In classification tasks, deep learning techniques based on CNNs have seen several advancements. The success of CNNs in image classification has led researchers to investigate their usefulness as a diagnostic method for identifying and characterizing pulmonary nodules in CT images. The classification of lung nodules using deep learning [ 74 , 108 , 117 , 141 ] has also been successfully implemented.

Breast parenchymal density is an important indicator of the risk of breast cancer. DL algorithms used for density assessment can significantly reduce the burden on the radiologist. Breast density classification using DL has been successfully implemented [ 37 , 59 , 72 , 177 ]. Ionescu et al. [ 59 ] introduced a CNN-based method to predict the Visual Analog Score (VAS) for breast density estimation. Figure 8 shows the AlexNet architecture for classification.


CNN architecture for classification [ 144 ]

Alcoholism, or alcohol use disorder (AUD), affects the brain; its structural effects have been observed using neuroimaging. S. H. Wang et al. [ 162 ] proposed a 10-layer CNN for the AUD problem using dropout, batch normalization, and PReLU techniques; the model obtained a sensitivity of 97.73%, a specificity of 97.69%, and an accuracy of 97.71%. Cerebral microbleeds (CMBs) are small chronic brain hemorrhages that can result in cognitive impairment, long-term disability, and neurologic dysfunction, so early-stage identification of CMBs for prompt treatment is essential. S. Wang et al. [ 164 ] proposed a transfer learning-based DenseNet to detect CMBs; the DenseNet-based model attained an accuracy of 97.71% (Fig. 8).
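PReLU, one of the techniques cited for the AUD classifier above, behaves like ReLU but with a slope for negative inputs that is learned during training; a one-function NumPy sketch (with the slope fixed for illustration):

```python
import numpy as np

def prelu(x, a=0.25):
    """Parametric ReLU: identity for x >= 0, slope `a` for x < 0.
    In a real network `a` is a learned parameter, not a constant."""
    return np.where(x >= 0, x, a * x)

out = prelu(np.array([-2.0, -0.4, 0.0, 3.0]))
```

Unlike plain ReLU, PReLU keeps a small gradient for negative activations, which helps avoid dead units in deep classifiers like the 10-layer CNN described above.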

Limitations and challenges

The application of deep learning algorithms to medical imaging is fascinating, but many challenges are slowing progress. One limitation to the adoption of DL in medical image analysis is inconsistency in the data itself (resolution, contrast, signal-to-noise ratio), typically caused by procedures in clinical practice [ 113 ]. The non-standardized acquisition of medical images is another limitation. The need for comprehensive medical image annotations further limits the applicability of deep learning: building DLA requires a large amount of annotated data, labeling medical images requires radiologists’ domain knowledge, and annotating adequate medical data is therefore time-consuming. The major challenge is limited data, and compared to other domains, the sharing of medical data is incredibly complicated; medical data privacy is both a sociological and a technological issue that needs to be discussed from both viewpoints. Semi-supervised learning could make combined use of existing labeled data and vast unlabelled data to alleviate the issue of limited labeled data. Another way to resolve data scarcity is to develop few-shot learning algorithms that work with considerably smaller amounts of data. Despite the successes of DL technology, many restrictions and obstacles remain in the medical field. Whether DL can reduce medical costs, increase medical efficiency, and improve patient satisfaction has not been adequately verified. It is therefore necessary to demonstrate the efficacy of deep learning methods in clinical trials and to develop guidelines for their application to medical image analysis.

Conclusion and future directions

Medical imaging is the origin of the information necessary for clinical decisions. This paper discusses new algorithms and strategies in the area of deep learning. This brief introduction to DLA in medical image analysis has two objectives: the first is an introduction to the field of deep learning and its associated theory; the second is a general overview of medical image analysis using DLA. It began with the history of neural networks since 1940 and ended with breakthroughs of recent DL algorithms in medical applications. Several supervised and unsupervised DL algorithms were first discussed, including auto-encoders, recurrent networks, CNNs, and restricted Boltzmann machines, along with optimization techniques and frameworks in this area, including Caffe, TensorFlow, Theano, and PyTorch. After that, the most successful DL methods were reviewed in various medical image applications, including classification, detection, and segmentation. Applications of the RBM network are rarely published in the medical image analysis literature, whereas CNN-based models have achieved good results in classification and detection and are most commonly used. Several existing solutions to medical challenges are available. However, there are still several issues in medical image processing that need to be addressed with deep learning. Many current DL implementations are supervised algorithms, while deep learning is slowly moving toward unsupervised and semi-supervised learning to manage real-world data without manual human labels.

DLA can support clinical decisions for next-generation radiologists. DLA can automate radiologist workflow and facilitate decision-making for inexperienced radiologists. DLA is intended to aid physicians by automatically identifying and classifying lesions to provide a more precise diagnosis. DLA can help physicians to minimize medical errors and increase medical efficiency in the processing of medical image analysis. DL-based automated diagnosis from medical images will likely be widely used for patient treatment in the next few decades. Therefore, physicians and scientists should seek the best ways to provide better care to the patient with the help of DLA. A potential future research direction for medical image analysis is the design of deep neural network architectures; the enhancement of network structures has a direct impact on medical image analysis, and since manual design of DL model structures requires rich knowledge, neural architecture search will probably replace manual design [ 73 ]. The design of various activation functions is also a meaningful research direction. Radiation therapy is crucial for cancer treatment, and different medical imaging modalities play a critical role in treatment planning. Radiomics is defined as the extraction of high-throughput features from medical images [ 28 ]. In the future, deep-learning analysis of radiomics will be a promising tool in clinical research for clinical diagnosis, drug development, and treatment selection for cancer patients. Due to limited annotated medical data, unsupervised, weakly supervised, and reinforcement learning methods are emerging research areas in DL for medical image analysis. Overall, deep learning, a new and fast-growing field, offers various obstacles as well as opportunities and solutions for a range of medical image applications.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Muralikrishna Puttagunta, Email: murali93940@gmail.com.

S. Ravi, Email: sravicite@gmail.com.



Playboy image from 1972 gets ban from IEEE computer journals

Use of "Lenna" image in computer image processing research stretches back to the 1970s.

Benj Edwards - Mar 29, 2024 9:16 pm UTC


On Wednesday, the IEEE Computer Society announced to members that, after April 1, it would no longer accept papers that include a frequently used image of a 1972 Playboy model named Lena Forsén. The so-called "Lenna image" (Forsén added an extra "n" to her name in her Playboy appearance to aid pronunciation) has been used in image processing research since 1973 and has attracted criticism for making some women feel unwelcome in the field.


In an email from the IEEE Computer Society sent to members on Wednesday, Technical & Conference Activities Vice President Terry Benzel wrote , "IEEE's diversity statement and supporting policies such as the IEEE Code of Ethics speak to IEEE's commitment to promoting an including and equitable culture that welcomes all. In alignment with this culture and with respect to the wishes of the subject of the image, Lena Forsén, IEEE will no longer accept submitted papers which include the 'Lena image.'"

An uncropped version of the 512×512-pixel test image originally appeared as the centerfold picture for the December 1972 issue of Playboy Magazine. Usage of the Lenna image in image processing began in June or July 1973 when an assistant professor named Alexander Sawchuck and a graduate student at the University of Southern California Signal and Image Processing Institute scanned a square portion of the centerfold image with a primitive drum scanner, omitting nudity present in the original image. They scanned it for a colleague's conference paper, and after that, others began to use the image as well.

The original 512×512

The image's use spread in other papers throughout the 1970s, '80s, and '90s, and it caught Playboy's attention, but the company decided to overlook the copyright violations. In 1997, Playboy helped track down Forsén, who appeared at the 50th Annual Conference of the Society for Imaging Science and Technology, signing autographs for fans. "They must be so tired of me... looking at the same picture for all these years!" she said at the time. Playboy's VP of new media, Eileen Kent, told Wired, "We decided we should exploit this, because it is a phenomenon."

The image, which features Forsén's face and bare shoulder as she wears a hat with a purple feather, was reportedly ideal for testing image processing systems in the early years of digital image technology due to its high contrast and varied detail. It is also a sexually suggestive photo of an attractive woman, and its use by men in the computer field has garnered criticism over the decades, especially from female scientists and engineers who felt that the image (especially related to its association with the Playboy brand) objectified women and created an academic climate where they did not feel entirely welcome.

Due to some of this criticism, which dates back to at least 1996, the journal Nature banned the use of the Lena image in paper submissions in 2018.

The comp.compression Usenet newsgroup FAQ document claims that in 1988, a Swedish publication asked Forsén if she minded her image being used in computer science, and she was reportedly pleasantly amused. In a 2019 Wired article, Linda Kinstler wrote that Forsén did not harbor resentment about the image, but she regretted that she wasn't paid better for it originally. "I'm really proud of that picture," she told Kinstler at the time.

Since then, Forsén has apparently changed her mind. In 2019, Creatable and Code Like a Girl created an advertising documentary titled Losing Lena , which was part of a promotional campaign aimed at removing the Lena image from use in tech and the image processing field. In a press release for the campaign and film, Forsén is quoted as saying, "I retired from modelling a long time ago. It’s time I retired from tech, too. We can make a simple change today that creates a lasting change for tomorrow. Let’s commit to losing me."

It seems like that commitment is now being granted. The ban in IEEE publications, which have been historically important journals for computer imaging development, will likely further set a precedent toward removing the Lenna image from common use. In the email, IEEE's Benzel recommended wider sensitivity about the issue, writing, "In order to raise awareness of and increase author compliance with this new policy, program committee members and reviewers should look for inclusion of this image, and if present, should ask authors to replace the Lena image with an alternative."



Electrical Engineering and Systems Science > Image and Video Processing

Title: FastHDRNet: A New Efficient Method for SDR-to-HDR Translation

Abstract: Modern displays possess the capability to render video content with a high dynamic range (HDR) and a wide color gamut (WCG). However, the majority of available resources are still in standard dynamic range (SDR), so an effective conversion methodology is needed. Existing deep neural network (DNN) based SDR-to-HDR conversion methods outperform conventional methods, but they are either too large to implement or generate artifacts. We propose a neural network for SDRTV-to-HDRTV conversion, termed "FastHDRNet". This network includes two parts, Adaptive Universal Color Transformation and Local Enhancement. The architecture is designed as a lightweight network that utilizes global statistics and local information with very high efficiency. Experiments show that our proposed method achieves state-of-the-art performance in both quantitative comparisons and visual quality, with a lightweight structure and enhanced inference speed.


IMAGES

  1. 😊 Research paper on digital image processing. Digital Image Processing

    image processing projects research papers

  2. Ieee Paper On Image Processing Based On HUMAN MACHINE INTERFACE

    image processing projects research papers

  3. (PDF) Image segmentation Techniques and its application

    image processing projects research papers

  4. The Eight-Step Process in Designing Your Research Project

    image processing projects research papers

  5. (PDF) Review Paper On Image Processing

    image processing projects research papers

  6. (PDF) A STUDY ON THE IMPORTANCE OF IMAGE PROCESSING AND ITS APLLICATIONS

    image processing projects research papers

VIDEO

  1. Image Processing Course in 2 hours

  2. Exact sum PDF and CDF wireless communication matlab code

  3. Battery fed buck boost converter and a supercapacitor fed bidirectional DC DC converter sources

  4. VOLTAGE AND CURRENT STABILITY OF HVDC MMC NON LINEAR CONTROLLER MATLAB SIMULINK

  5. Research on Key Technologies of Logistics Information Traceability Model Based on Consortium Chain

  6. Research on Fire Detection and Image Information Processing System Based on Image Processing

COMMENTS

  1. Image Processing: Research Opportunities and Challenges

    Image Processing: Research O pportunities and Challenges. Ravindra S. Hegadi. Department of Computer Science. Karnatak University, Dharwad-580003. ravindrahegadi@rediffmail. Abstract. Interest in ...

  2. Image processing

    Image processing is manipulation of an image that has been digitised and uploaded into a computer. Software programs modify the image to make it more useful, and can for example be used to enable ...

  3. Image processing

    Crowdsourced human-based computational approach for tagging peripheral blood smear sample images from Sickle Cell Disease patients using non-expert users. José María Buades Rubio. , Gabriel ...

  4. IEEE Transactions on Image Processing

    Communications Preferences. Profession and Education. Technical Interests. Need Help? US & Canada:+1 800 678 4333. Worldwide: +1 732 981 0060. Contact & Support. About IEEE Xplore. Contact Us.

  5. Search for image processing

    1 code implementation • CVPR 2022. In this work, we present a multi-axis MLP based architecture called MAXIM, that can serve as an efficient and flexible general-purpose vision backbone for image processing tasks. Ranked #1 on Deblurring on RealBlur-J (using extra training data) Deblurring Image Deblurring +6. 933.

  6. 471383 PDFs

    All kinds of image processing approaches. | Explore the latest full-text research PDFs, articles, conference papers, preprints and more on IMAGE PROCESSING. Find methods information, sources ...

  7. (PDF) Advances in Artificial Intelligence for Image Processing

    AI has had a substantial influence on image processing, allowing cutting-edge methods and uses. The foundations of image processing are covered in this chapter, along with representation, formats ...

  8. Advances in image processing using machine learning techniques

    With the recent advances in digital technology, there is an eminent integration of ML and image processing to help resolve complex problems. In this special issue, we received six interesting papers covering the following topics: image prediction, image segmentation, clustering, compressed sensing, variational learning, and dynamic light coding.

  9. Deep Learning-based Image Text Processing Research

    Deep learning is a powerful multi-layer architecture that has important applications in image processing and text classification. This paper first introduces the development of deep learning and two important algorithms of deep learning: convolutional neural networks and recurrent neural networks. The paper then introduces three applications of deep learning for image recognition, image ...

  10. [2404.00633] IPT-V2: Efficient Image Processing Transformer using

    To this end, we present an efficient image processing transformer architecture with hierarchical attentions, called IPTV2, adopting a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields. Specifically, FCSA applies the shifted window mechanism into ...

  11. Recent Trends in Image Processing and Pattern Recognition

    The 5th International Conference on Recent Trends in Image Processing and Pattern Recognition (RTIP2R) aims to attract current and/or advanced research on image processing, pattern recognition, computer vision, and machine learning. The RTIP2R will take place at the Texas A&M University—Kingsville, Texas (USA), on November 22-23, 2022, in ...

  12. [2404.06075] LIPT: Latency-aware Image Processing Transformer

    Transformer is leading a trend in the field of image processing. Despite the great success that existing lightweight image processing transformers have achieved, they are tailored to FLOPs or parameters reduction, rather than practical inference acceleration. In this paper, we present a latency-aware image processing transformer, termed LIPT. We devise the low-latency proportion LIPT block ...

  13. Real-time intelligent image processing for the internet of things

    The first theme of this special issue focuses on "Theories, models, and algorithms". Fan and Guan [] have developed a deep face verification framework based on SIFT (scale invariant feature transform) and CNN (convolutional neural network) methods. Their experimental results show how the proposed model outperformed some state-of-the-art methods on the LFW (Labeled Faces in the Wild) and YTB ...

  14. Frontiers

    The field of image processing has been the subject of intensive research and development activities for several decades. This broad area encompasses topics such as image/video processing, image/video analysis, image/video communications, image/video sensing, modeling and representation, computational imaging, electronic imaging, information forensics and security, 3D imaging, medical imaging ...

  15. Image Processing Technology Based on Machine Learning

    Machine learning is a relatively new field. With the deepening of people's research in this field, the application of machine learning is increasingly extensive. On the other hand, with the development of science and technology, image has become an indispensable medium of information transmission, and image processing technology is also booming. This paper introduces machine learning into ...

  16. J. Imaging

    When we consider the volume of research developed, there is a clear increase in published research papers targeting image processing and DL, over the last decades. ... This manuscript is a result of the research project "DarwinGSE: Darwin Graphical Search Engine", with code CENTRO-01-0247-FEDER-045256, co-financed by Centro 2020, Portugal ...

  17. Frontiers

    Technological advancements in computing have opened multiple opportunities in a wide variety of fields that range from document analysis (Santosh, 2018), biomedical and healthcare informatics (Santosh et al., 2019; Santosh et al., 2021; Santosh and Gaur, 2021; Santosh and Joshi, 2021), and biometrics to intelligent language processing. These applications primarily leverage AI tools and/or techniques, where ...

  18. Real-time intelligent image processing for security applications

    The advent of machine learning techniques and image processing techniques has led to new research opportunities in this area. Machine learning has enabled automatic extraction and analysis of information from images. The convergence of machine learning with image processing is useful in a variety of security applications. Image processing plays a significant role in physical as well as digital ...

  19. digital image processing Latest Research Papers

    Abstract Digital image processing technologies are used to extract and evaluate the cracks of heritage rock in this paper. Firstly, the image needs to go through a series of image preprocessing operations such as graying, enhancement, filtering and binaryzation to filter out a large part of the noise. Then, in order to achieve the requirements ...
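    The preprocessing chain this abstract names (graying, enhancement, filtering, binarization) can be sketched in a few lines. This is a minimal NumPy illustration of that generic pipeline, not the paper's implementation; the BT.601 luma weights, linear contrast stretch, 3x3 mean filter, and mean-based threshold are all assumptions chosen for brevity:

    ```python
    import numpy as np

    def preprocess_for_cracks(rgb, thresh=None):
        """Grayscale -> contrast stretch -> mean filter -> binary crack mask."""
        rgb = np.asarray(rgb, dtype=np.float64)
        # 1. Graying: ITU-R BT.601 luma weights
        gray = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
        # 2. Enhancement: linear contrast stretch to [0, 255]
        lo, hi = gray.min(), gray.max()
        gray = (gray - lo) / max(hi - lo, 1e-9) * 255.0
        # 3. Filtering: 3x3 mean filter (edge-padded) to suppress noise
        padded = np.pad(gray, 1, mode="edge")
        smooth = sum(padded[i:i + gray.shape[0], j:j + gray.shape[1]]
                     for i in range(3) for j in range(3)) / 9.0
        # 4. Binaryzation: dark pixels (cracks) become foreground (1)
        if thresh is None:
            thresh = smooth.mean()
        return (smooth < thresh).astype(np.uint8)
    ```

    Real crack extraction would replace the fixed threshold with Otsu's method and add morphological cleanup, but the four stages above are the skeleton the abstract describes.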

  20. [2404.05911] LATUP-Net: A Lightweight 3D Attention U-Net with Parallel

    Early-stage 3D brain tumor segmentation from magnetic resonance imaging (MRI) scans is crucial for prompt and effective treatment. However, this process faces the challenge of precise delineation due to the tumors' complex heterogeneity. Moreover, energy sustainability targets and resource limitations, especially in developing countries, require efficient and accessible medical imaging ...

  21. Recent advances of image processing techniques in agriculture

    According to the research, only light field cameras have been successful in capturing 3D plant growth in a single shot [23, 45]. Using a light field camera to generate 3D point clouds from a single photo is a simple approach to add variety to remote sensing monitoring and modeling applications [46]. Fig. 7.1 indicates a sample of a real light field image with different focus and depth map, where ...

  22. Physics-Inspired Synthesized Underwater Image Dataset

    This paper introduces the physics-inspired synthesized underwater image dataset (PHISWID), a dataset tailored for enhancing underwater image processing through physics-inspired image synthesis. Deep learning approaches to underwater image enhancement typically demand extensive datasets, yet acquiring paired clean and degraded underwater ones poses significant challenges. While several ...

  23. [2404.04916] Correcting Diffusion-Based Perceptual Image Compression

    The images produced by diffusion models can attain excellent perceptual quality. However, it is challenging for diffusion models to guarantee distortion, hence the integration of diffusion models and image compression models still needs more comprehensive explorations. This paper presents a diffusion-based image compression method that employs a privileged end-to-end decoder model as ...

  24. Medical image analysis based on deep learning approach

    Deep Learning Approach (DLA) in medical image analysis emerges as a fast-growing research field. DLA has been widely used in medical imaging to detect the presence or absence of the disease. This paper presents the development of artificial neural networks, comprehensive analysis of DLA, which delivers promising medical imaging applications.

  25. (PDF) A Review on Image Processing

    Abstract. Image processing involves changing the nature of an image in order to improve its pictorial information for human interpretation or for autonomous machine perception. Digital image ...

  26. Image Processing based Image to Cartoon Generation: Reducing complexity

    This paper proposes an approach to convert real life images into cartoon images using image processing. The cartoon images have sharp edges, reduced colour quantity compared to the original image, and smooth colour regions. With the rapid advancement in artificial intelligence, recently deep learning methods have been developed for image to cartoon generation. Most of these methods perform ...
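    The effect this abstract describes (sharp edges, reduced colour quantity, smooth colour regions) is a classic image-processing combination: quantize the colours, then overlay strong edges. A toy NumPy sketch under those assumptions; the bin count, gradient-based edge detector, and threshold are illustrative choices, not the paper's method:

    ```python
    import numpy as np

    def cartoonize(rgb, levels=4, edge_thresh=40.0):
        """Toy cartoon effect: snap colours to a few levels, darken strong edges."""
        img = np.asarray(rgb, dtype=np.float64)
        # Reduce colour quantity: snap each channel to `levels` bins
        step = 256.0 / levels
        quant = np.floor(img / step) * step + step / 2.0
        # Sharp edges: first-difference gradient magnitude on the grayscale image
        gray = img.mean(axis=2)
        gx = np.abs(np.diff(gray, axis=1, prepend=gray[:, :1]))
        gy = np.abs(np.diff(gray, axis=0, prepend=gray[:1, :]))
        edges = (gx + gy) > edge_thresh
        quant[edges] = 0.0  # draw edges in black
        return quant.astype(np.uint8)
    ```

    A production pipeline would smooth flat regions first (e.g. with a bilateral filter) so quantization boundaries stay clean, but the two stages above are what give cartoons their flat colours and dark outlines.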

  27. (PDF) IMAGE RECOGNITION USING MACHINE LEARNING

    Image classification is a classical problem in the fields of image processing, computer vision and machine learning. In this paper we study image classification using deep learning.

  28. Image Processing Based Project Topics With Abstracts and Base Papers

    Explore the latest M.Tech project topics in Image Processing for 2024, featuring trending IEEE base papers. Elevate your research with cutting-edge projects covering diverse aspects of visual intelligence, from computer vision to deep learning applications. Discover innovative titles, abstracts, and base papers to stay ahead in the dynamic field of Image Processing.

  29. Playboy image from 1972 gets ban from IEEE computer journals

    On Wednesday, the IEEE Computer Society announced to members that, after April 1, it would no longer accept papers that include a frequently used image of a 1972 Playboy model named Lena ...

  30. FastHDRNet: A new efficient method for SDR-to-HDR Translation

    Modern displays possess the capability to render video content with a high dynamic range (HDR) and a wide color gamut (WCG). However, the majority of available resources are still in standard dynamic range (SDR). Therefore, we need to identify an effective methodology for this objective. The existing deep neural network (DNN) based SDR (standard dynamic range) to HDR (high dynamic ...