Deep Learning and Neural Networks
Neural networks (Figure 3) pass input data through a series of interconnected nodes (analogous to biological neurons). Each node performs a simple mathematical operation (addition, multiplication, etc.); a group of interconnected nodes is referred to as a 'layer', and the overall arrangement of the layers as the 'architecture'. During training, every node is adjusted through an iterative optimization process called 'backpropagation',[2,3] allowing the neural network to improve its classification accuracy.
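The forward pass through layers of nodes and the backpropagation of errors can be illustrated with a minimal sketch. The architecture (one hidden layer of eight nodes), learning rate and toy XOR data below are illustrative choices, not taken from the article:

```python
# Minimal sketch of a single-hidden-layer neural network trained by
# backpropagation, using NumPy. All sizes and data are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: XOR, which no single-layer network can classify.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 8 nodes; each node computes a weighted sum of
# its inputs followed by a nonlinearity.
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)          # hidden layer activations
    return h, sigmoid(h @ W2 + b2)    # output node: class probability

def log_loss(p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

_, p0 = forward(X)
loss_start = log_loss(p0)

for _ in range(5000):
    h, p = forward(X)
    # Backward pass ('backpropagation'): the error gradient is passed
    # back through the layers and every weight is nudged downhill.
    dp = (p - y) / len(X)                 # gradient at the output
    dW2 = h.T @ dp; db2 = dp.sum(axis=0)
    dh = dp @ W2.T * h * (1 - h)          # gradient at the hidden layer
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    W2 -= 1.0 * dW2; b2 -= 1.0 * db2
    W1 -= 1.0 * dW1; b1 -= 1.0 * db1

_, p1 = forward(X)
loss_end = log_loss(p1)
print(f"loss {loss_start:.3f} -> {loss_end:.3f}")
```

Each training iteration repeats the same two steps — predict, then propagate the error backwards — and the loss falls as the weights are refined, which is the 'iterative process' the text describes.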
Figure 3. Schematic depicting how classification tasks are performed in convolutional neural networks. Pixel data from an image are passed through an architecture consisting of multiple layers of connected nodes. Convolutional neural networks contain distinctive 'convolutional layers', which operate as filters. Filters exploit the observation that the location of a feature within an image is often less important than whether that feature is present at all – an example might be (theoretically) the presence or absence of blue-grey veiling within a melanoma. A convolutional 'filter' learns a particular feature of the image irrespective of where it occurs within the image (represented by the black squares). The network is composed of a large number of hierarchical filters that learn increasingly high-level representations of the image. These could in principle learn dermoscopic features similar to those described by clinicians, although in practice the precise features recognized are likely to differ from classic diagnostic criteria.
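The location-independence of a convolutional filter can be demonstrated directly: the same small filter slides over the whole image, so it responds with equal strength to a feature wherever it occurs. The filter values and toy 'images' below are made up for illustration (a hand-coded vertical-edge detector rather than a learned dermoscopic feature):

```python
# Illustrative sketch of a convolutional filter sliding over an image.
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter (responds to a dark-to-bright transition).
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

# Two 6x6 images containing the same edge at different positions.
left = np.zeros((6, 6));  left[:, 2:] = 1.0    # edge at column 2
right = np.zeros((6, 6)); right[:, 4:] = 1.0   # edge at column 4

resp_left = convolve2d(left, kernel)
resp_right = convolve2d(right, kernel)

# The filter fires with the same strength in both images, just at a
# shifted location: presence matters more than position.
print(resp_left.max(), resp_right.max())
```

In a trained CNN the filter weights are learned by backpropagation rather than hand-coded, and hundreds of such filters are stacked hierarchically, but the sliding-window mechanism is the same.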
Neural networks with multiple 'hidden layers' of nodes (Figure 3) are referred to as 'deep' neural networks and perform 'deep learning'. Although the concept of deep neural networks was described decades ago, a lack of affordable, efficient computing power long prevented them from being trained effectively. In the early 2010s, however, it was recognized that graphics processing units (GPUs), originally designed to render three-dimensional graphics in computer games, could be repurposed to power the repetitive computations required to train neural networks.[4,5] Of note, convolutional neural networks (CNNs) are a specific deep learning architecture that has proven effective for the classification of image data. CNNs surged in popularity as a method for computer-based image classification after the GPU-powered CNN AlexNet won the 2012 ImageNet competition with a top-5 error rate of 15·3%, a remarkable improvement of roughly 10 percentage points on the next best competitor.
In the past few years, the use of CNNs for classification tasks has grown explosively, driven by their consistently superior accuracy and their wide availability. Novel CNN architectures have been developed, refined and released for public use by institutions with a high level of expertise and computational resources; examples include 'Inception' by Google and 'ResNet' by Microsoft. These architectures can be accessed through software libraries such as TensorFlow (developed by Google) or PyTorch (developed by Facebook), then trained further for a specific purpose or applied to a novel problem. A common approach is to take a network such as Inception that has been pretrained for general image recognition and specialize it by retraining it on a specific type of image data – a process referred to as 'transfer learning'.
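The idea of transfer learning — keep the pretrained layers frozen and train only a new classifier on their outputs — can be sketched without a deep learning framework. In practice one would load a pretrained network such as Inception via TensorFlow or PyTorch; below, a fixed random feature extractor merely stands in for those pretrained layers, and the data and dimensions are invented for illustration:

```python
# Conceptual sketch of transfer learning in NumPy: a frozen
# 'pretrained' feature extractor plus a newly trained classifier head.
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for pretrained layers: a frozen random feature extractor.
W_frozen = rng.normal(size=(10, 16))
def features(x):
    return np.maximum(x @ W_frozen, 0.0)   # ReLU features, never updated

# Task-specific toy data standing in for a new image dataset.
X = rng.normal(size=(200, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)  # synthetic binary labels

F = features(X)            # extract features once; extractor stays frozen
w = np.zeros(16); b = 0.0  # the only parameters we train

def log_loss(w, b):
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

loss_start = log_loss(w, b)
for _ in range(500):       # train only the new classifier head
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    grad = (p - y) / len(y)
    w -= 0.5 * F.T @ grad
    b -= 0.5 * grad.sum()
loss_end = log_loss(w, b)

acc = ((1.0 / (1.0 + np.exp(-(F @ w + b))) > 0.5) == y).mean()
print(f"loss {loss_start:.3f} -> {loss_end:.3f}, accuracy {acc:.2f}")
```

Because only the small head is trained, far less task-specific data and computation are needed than for training a full network from scratch — the practical appeal of transfer learning for medical imaging, where labelled datasets are often small.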
The British Journal of Dermatology. 2020;183(3):423-430. © 2020 Blackwell Publishing