Convolutional Neural Network

Convolutional neural network (CNN) is a computationally efficient model with special convolution and pooling operations for the detection of health-related problems by analyzing images.

From: Data Science for COVID-19, 2021

Chapters and Articles

Convolutional neural networks

Jonas Teuwen, Nikita Moriakov, in Handbook of Medical Image Computing and Computer Assisted Intervention, 2020

20.1 Introduction

Convolutional neural networks (CNNs) – or convnets, for short – have in recent years achieved results which were previously considered to be purely within the human realm. In this chapter we introduce CNNs, and for this we first consider regular neural networks, and how these methods are trained. After introducing the convolution, we introduce CNNs. They are very similar to the regular neural networks as they are also made up of neurons with learnable weights. But, in contrast to MLPs, CNNs make the explicit assumption that inputs have specific structure like images. This allows encoding this property into the architecture by sharing the weights for each location in the image and having neurons respond only locally.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780128161760000259

Convolutional neural networks

Maria Deprez, Emma C. Robinson, in Machine Learning for Biomedical Applications, 2024

11.1.2 Convolutional neural networks

Convolutional neural networks (CNNs) are designed to mimic the mammalian visual system. They learn spatial filters of increasing complexity, from edge filters to object detectors. The architecture of CNNs combines three ideas [28]:

Local receptive fields: Features are extracted only from local neighborhoods, rather than from the whole image. Convolution achieves that by applying a small filter to the image. In Fig. 11.2, the size of the filter (and therefore the local neighborhood) is 3×3 pixels. Local receptive fields support extraction of low-level features, such as edges and corners, as we illustrated in Fig. 11.1.

Shared weights: The features should be extracted the same way, independent of their location. This is called translation invariance. The convolution applies the same filter to all locations of the image.

Subsampling: The feature maps extracted by convolutions are subsampled and subsequently fed to the next convolutional layer to be combined into higher-level features. As the size of the feature maps reduces, the exact positions of extracted features become irrelevant, and only the patterns of salient features and their relative positions remain.

Fig. 11.3 shows an example CNN architecture diagram in LeNet style.1 The network input is one grey-scale image of size 64×64 (shown as 1@64x64). The first convolutional layer applies 8 different filters to extract 8 feature maps of the same size as the image (shown as 8@64x64). These feature maps are then downsampled by a factor of two using the max-pooling operation, which simply selects a maximum feature value from each 2×2 image patch (resulting feature maps shown as 8@32x32). The second convolutional layer applies 16 filters, resulting in 16 feature maps (shown as 16@32x32). The second max-pooling layer reduces the feature maps to size 16×16 (shown as 16@16x16). At this point a fully-connected layer (also called a dense layer) is applied to produce a vector of 128 features. A second fully connected layer produces 10 outputs, which could correspond to 10 output classes of objects, such as digits.

Figure 11.3

Figure 11.3. Diagram of an example CNN in LeNet style.

The key strength of CNNs compared to the classical filtering techniques is that they learn the filters to extract the most salient features. This is achieved by fitting the CNN model to the data where learnable parameters are the filters. The fitting is performed using backpropagation, similarly to the fully connected networks, as we have seen in Sect. 10.2.3.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780128229040000169

Integrated deep learning for self-driving robotic cars

Tad Gonsalves, Jaychand Upadhyay, in Artificial Intelligence for Future Generation Robotics, 2021

8.4.2 Deep Q Network

CNN is essentially a classification structure for classifying images into labeled classes. The various layers of the CNN extract image features and finally learn to classify the images. Hence, the outputs of a typical CNN represent the classes or the labels of the classes, the CNN has learnt to classify (Fig. 8.19). A DQN is a variation of CNN. The outputs are not classes, but the Q values (or rewards) corresponding to each of the actions the agent has learnt to take in response to its state in the environment. In our model, the input to the DQN is the image of the street the car sees in front of it at a given point of time. The output is the steering angle.

Figure 8.19. Layout of Deep Q Network learning.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780323854986000101

Applications of deep learning in biomedical engineering

S. Shajun Nisha, M. Nagoor Meeral, in Handbook of Deep Learning in Biomedical Engineering, 2021

5 Convolutional neural network

CNN is a particular type of feed-forward neural network in AI. It is widely used for image recognition [7]. CNN represents the input data in the form of multidimensional arrays [2]. It works well for a large number of labeled data. CNN extract the each and every portion of input image, which is known as receptive field. It assigns weights for each neuron based on the significant role of the receptive field. So that it can discriminate the importance of neurons from one another [8]. The architecture of CNN is shown in Fig. 9.4. The architecture of CNN consists of three types of layer: (1) convolution, (2) pooling, and (3) fully connected.

Figure 9.4. Architecture of CNN. CNN, convolutional neural network.

Based on https://commons.wikimedia.org/wiki/File:Molumen_zaz_965.svg.
Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780128230145000089

13th International Symposium on Process Systems Engineering (PSE 2018)

Wenbo Zhu, ... Jose Romagnoli, in Computer Aided Chemical Engineering, 2018

2.2 Convolutional neural network (CNN)

CNN is a deep neural network originally designed for image analysis. Recently, it was discovered that the CNN also has an excellent capacity in sequent data analysis such as natural language processing (Zhang, 2015). CNN always contains two basic operations, namely convolution and pooling. The convolution operation using multiple filters is able to extract features (feature map) from the data set, through which their corresponding spatial information can be preserved. The pooling operation, also called subsampling, is used to reduce the dimensionality of feature maps from the convolution operation. Max pooling and average pooling are the most common pooling operations used in the CNN. Due to the complicity of CNN, relu is the common choice for the activation function to transfer gradient in training by backpropagation.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780444642417503694

Multi-category classification problem

Shih-Chia Huang, Trung-Hieu Le, in Principles and Labs for Deep Learning, 2021

Abstract

A convolutional neural network, also known as CNN or ConvNet, is a class of deep neural network that has been successfully applied to various computer vision applications, especially for analyzing visual images. In this chapter, we first present the structure and operation of CNN to explain how it works. Next, we discuss multi-category classification, which refers to the problem of categorizing samples into one of three or more classes. Then, we build a CNN-based multi-category classification model to help readers improve their skills in working with CNN for solving the multi-category classification problem. Finally, we present and apply a data augmentation technique to prevent overfitting problem during training process and improve the performance of the CNN-based classification model.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780323901987000057

A comprehensive review on deep learning techniques for a BCI-based communication system

M. Bhuvaneshwari, ... S. Thomas George, in Demystifying Big Data, Machine Learning, and Deep Learning for Healthcare Analytics, 2021

7.5.1.3 RNN

CNN is the most commonly used architecture for DL. Apart from CNN, there are other DL architectures such as autoencoder, deep generative models, and the recurrent neural network (RNN). CNN and RNN experience the vanishing gradient problem, which gets resolved by a specially designed RNN model called a long short-term memory network. Toraman, Tuncer, and Balgetir (2019) used various CNN architectures such as VGG16, VGG19 (Liu et al., 2018), ResNet (Liu et al., 2018), DenseNet, MobileNet, NasNetMobile, and NasNetlarge to extract features from different spectrograms. Dose et al. (2018) used shallow CNN to classify the EEG signal. In order to enhance the quality and quantity of feature vector information, Ma et al. (2017) used the deep belief network (DBN) to map the feature. Table 7.6 summarizes the architectures used in recent studies.

Table 7.6. Deep learning architectures in various research works.

ArchitectureAuthors
Convolutional neural networkDose et al. (2018), Liu et al. (2018), Liu, Zhou, Zhang, and Xiong (2020) and Chen et al. (2019)
Deep belief networkMa et al. (2017) and Längkvist, Karlsson, and Loutfi (2012)
Recurrent neural networkReddy and Delen (2018),Tsiouris et al. (2018), Michielli, Acharya, and Molinari (2019), and Abbasvandi and Nasrabadi (2019)
LSTMHosseini, Tran, Pompili, Elisevich, and Soltanian-Zadeh (2020) and Zheng et al. (2019)
AutoencoderBoloukian and Safi-Esfahani (2020)
Sparse autoencoderKundu and Ari (2020)
Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780128216330000131

Convolutional neural network architectures

Shih-Chia Huang, Trung-Hieu Le, in Principles and Labs for Deep Learning, 2021

Abstract

Convolutional neural networks (CNNs) have been widely applied to many computer vision applications such as image classification, face recognition, object detection, and so on. This chapter introduces some of the most popular CNN architectures, including LeNet, AlexNet, VGG, GoogLeNet, and ResNet. Then, it presents two scenarios of implementing Inception-v3, a CNN for computer vision. In the first scenario, we create the Inception-v3 model by using Keras Applications. In the second scenario, we employ TensorFlow Hub to build the Inception-v3 model.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B978032390198700001X

Computational lung sound classification: a review

Truc Nguyen, Franz Pernkopf, in State of the Art in Neural Networks and Their Applications, 2023

9.3.1.2.3 Hybrid systems

Hybrid CNN models have been proposed, which combine CNN models with either RNN variants or special mechanisms such as mixture of expert (MoE) layers. In addition, hybrid LSC systems have been developed as combination of conventional LSC and deep learning. For instance, classical features are fed into deep multilayer perceptrons [24,25]. Furthermore, conventional classifiers are used for learned representations from CNN architectures [25,73].

Common deep learning architectures are summarized in Table 9.5.

Table 9.5. Summary of deep learning architectures.

TypeReferences
CNNsStandard CNN architectures
VGGs[29,37,52,65,72,76,77]
AlexNets[52,53,55,72,76,81]
ResNets[36,55,70,75,81]
InceptionNets[35,55,75]
MobileNets[72,75,77]
GoogLeNets[55,81]
Custom CNN architectures
1D CNNs[65,87]
CNNs[10,25,29,33,34,38,41,43,50,53,56,71,73,76,80,82]
Multi-Input (MI)—CNNs[19,39,51]
RNNsNoise Mask—RNN variations[26,78,79]

RNNs

GRUs/BiGRUs

LSTMs/BiLSTMs

[18,26,27,78]

[16,27,42,48,52]

[27,48,52]

Hybrid systems

CNN—RNNs

CNN—LSTMs/CNN—Bi LSTMs

CNN—GRUs/CNN—Bi GRUs

CNN—MoE

Autoencoder

[32]

[9,48,51,70,87]

[28,37,48]

[32,33]

[34,79]

Deep multilayer perceptrons

CNN—SVMs

[24,25]

[25,73]

In addition, some deep learning-based LSC systems are focusing on low computational costs by using or proposing small model architectures such as lightweight CNNs, MobileNets, depth-wise convolutional layers in custom CNN models [52] or weight quantization [77].

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780128198728000161

Driver drowsiness detection using heart rate and behavior methods: A study

Anmol Wadhwa, Sanjiban Sekhar Roy, in Data Analytics in Biomedical Engineering and Healthcare, 2021

3.2.3 Convolutional neural network

CNNs are a class of neural network that allow greater extraction of features from captured images [68]. Unlike classical models, CNNs take image data, train the model, and then classify the features automatically for healthier classification. There are three principal components of CNNs: convolution, maxpooling, and activation function. CNNs are used in many applications like image recognition, face recognition, and video analysis [68]. Yann Le Cun [69] was the first researcher to use CNN in computer vision applications, but results were not good until he used deep CNN for object recognition. In [68] driver drowsiness detection was accomplished using representation learning that used Viola and Jones’ algorithm for face classification. In this experiment, the pictures were cropped to 48 × 48 frames and then passed to the outmost layer of the framework, which comprised 20 filters. Then, its output was fed to the softmax layer. This system doesn’t consider face position. Other researchers have used a 3D deep neural network in which the face is passed through a combination of two more filters [70], thus making the system applicable even when the driver changes head position.

Read full chapter
URL: https://www.sciencedirect.com/science/article/pii/B9780128193143000112