ANN, CNN, FFN, RNN… That’s a lot of N’s! Even for someone ensconced in the field of neural networks, a subfield within machine learning, there is an overabundance of terms and it can be difficult to keep up. If you’re not in the data science field, and either want to break into it or think your business could benefit from implementing , then finding a starting point can be a challenge. I’ll try to make that task a little less daunting by explaining the three most common neural network architectures you’ll encounter and use.
Feed Forward Network (FFN)
The FFN is the very first neural network you’ll see when learning about neural networks, with the iconic architecture shown in Fig.1 and discussed in my blog post on demystifying neural networks.
Fig.1. A feed forward network – every neuron in one layer passes information to every neuron in the next layer.
The idea behind FFNs is relatively straightforward: each layer learns how to do something simple and passes its result to the next layer, with as few assumptions about the incoming data and as few restrictions on the network itself as possible. FFNs are, by far, the most general-purpose neural network: in principle, they can approximate the correct answer as closely as you like given enough layers (each column of objects in Fig.1), enough neurons (those round things in Fig.1) in each layer, enough data, and enough time to train on the data. They also require a little luck – FFNs are pretty good at getting stuck in “local minima,” which are solutions good enough that the training procedure can no longer tell which change would improve them, but not good enough that we, as people, are satisfied. Here’s a pros and cons list of FFNs:
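To make the layer-by-layer idea concrete, here is a minimal sketch of a forward pass through a small FFN in plain Python. The layer sizes, the `tanh` activation, and the helper names (`dense`, `make_layer`, `feed_forward`) are assumptions chosen for illustration, and the weights are random rather than trained, so the outputs mean nothing yet.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: every input feeds every output neuron."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def make_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases; illustrative only."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def feed_forward(inputs, layers):
    """Pass the data through each layer in turn, left to right as in Fig.1."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations

rng = random.Random(0)
layer_sizes = [4, 5, 3]  # hypothetical: 4 inputs, 5 hidden neurons, 3 outputs
layers = [
    make_layer(n_in, n_out, rng)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
outputs = feed_forward([0.5, -1.0, 0.25, 2.0], layers)
print(outputs)
```

Training would consist of adjusting the numbers inside `layers` until `outputs` matches the desired answers; that part is left out of this sketch.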
Convolutional Neural Network (CNN)
If I hand you a 4032×3024 resolution picture and ask you to identify what’s in it, I’m giving you 12,192,768 data points and asking you to classify them. That’s a lot of data, and that’s a big ask. Using an FFN in this scenario is not viable, since there would be hundreds of millions of variables to store and optimize to categorize even just 10 different objects in the picture (see last Con under FFNs).
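The arithmetic behind that claim can be checked in a few lines. This is a rough count under stated assumptions: one value per pixel (as in the text; a color image would have three), fully connected layers with one bias per neuron, and a hypothetical hidden layer of 100 neurons for the second figure.

```python
width, height = 4032, 3024
n_inputs = width * height  # one value per pixel, as counted in the text
n_classes = 10

# Smallest possible fully connected classifier: inputs wired straight to outputs.
direct_params = n_inputs * n_classes + n_classes

# Hypothetical variant with a single hidden layer of 100 neurons.
n_hidden = 100
hidden_params = (n_inputs * n_hidden + n_hidden) + (n_hidden * n_classes + n_classes)

print(f"{n_inputs:,} inputs")
print(f"{direct_params:,} parameters with no hidden layer")
print(f"{hidden_params:,} parameters with one 100-neuron hidden layer")
```

Even with no hidden layer at all, the count is already above a hundred million, and a single modest hidden layer pushes it past a billion.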
Convolutional neural networks were created to handle exactly these kinds of problems by learning to hierarchically identify small, patterned features contained within a larger dataset and aggregating and re-identifying features until the entire dataset has been examined.
Fig.2. A convolutional layer + pooling layer in a CNN.
Fig. 2 shows an example of a typical convolutional and pooling layer structure that executes precisely this feature-aggregating logic. Here, imagine the picture in the classification task above as a 1-dimensional line of data rather than a 2-dimensional set of pixels.
Row 1) is the set of input data that feeds into the CNN.
Row 2) examines every set of 3 consecutive input data points and applies the same logic to each of these sets. It’s like using a mini-FFN to learn the simpler task of identifying mini-features in the data rather than trying to classify it all at once. This is called the convolutional layer.
Row 3) identifies which of the overlapping chunks of data express specific features most strongly. For example, in an image, one chunk of data could contain part of a leaf, while the next one over contains the entire leaf. The network determines there is a leaf in some general area by taking the chunk in a region containing the full leaf as the representation for leaves. Another data chunk might contain a different feature that is not related to leaves. The process of keeping the champions of each feature in each segment of input data is called pooling, and the ‘layer’ that executes this logic is called the pooling layer.
The process of identifying the strongest features and aggregating is simply repeated until the entire dataset is appropriately classified.
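The three rows described above can be sketched in plain Python for the 1-dimensional case. Several things here are assumptions for illustration: the input values, the hand-picked `kernel` (a real CNN learns its kernels during training), the ReLU activation, and the use of max pooling over non-overlapping pairs as the way of "keeping the champions".

```python
def conv1d(signal, kernel, bias=0.0):
    """Slide one shared kernel over every window of len(kernel) consecutive points."""
    k = len(kernel)
    return [
        # ReLU: keep positive responses, zero out the rest
        max(0.0, sum(w * x for w, x in zip(kernel, signal[i:i + k])) + bias)
        for i in range(len(signal) - k + 1)
    ]

def max_pool(features, size=2):
    """Keep only the strongest response in each non-overlapping segment."""
    return [max(features[i:i + size]) for i in range(0, len(features), size)]

signal = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]  # Row 1: hypothetical input data
kernel = [-1, 2, -1]                     # hand-picked "peak detector"
features = conv1d(signal, kernel)        # Row 2: convolutional layer
pooled = max_pool(features, size=2)      # Row 3: pooling layer

print(features)
print(pooled)
```

The same three weights in `kernel` are reused for every window, which is why a convolutional layer needs far fewer variables than a fully connected one. Feeding `pooled` into another `conv1d` and `max_pool` pair would be the "repeat" step.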
Here’s a pros/cons list for CNNs:
Fig.3. Deep learning now outperforms humans at classifying images.
Recurrent Neural Network (RNN)
CNNs are great at pattern recognition. Image recognition is an obvious application for such a network class, but, it turns out, natural language processing is another! We frequently speak and write by using patterns of words as templates, and gluing those patterns together. Idiomatic expressions are inherently patterns, as they frequently mean something different than their naive semantic interpretation. But saying the same thing in slightly different ways can confuse a CNN. How, then, can we account for grammar and semantics when performing natural language processing tasks?
Enter recurrent neural networks. RNNs take a single, complicated function and apply it repeatedly to a sequence of data. The network keeps track of its understanding of the data it has read, and updates that understanding as new data is ingested with each application of the function it has learned. The architecture looks like Fig.4.
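The "same function applied repeatedly" idea can be sketched with a deliberately tiny RNN whose state is a single number. The weights `w_h`, `w_x`, and `b`, the `tanh` activation, and the function names are assumptions for illustration; a real RNN uses vectors and matrices and learns its weights from data.

```python
import math

def rnn_step(h_prev, x, w_h, w_x, b):
    """Combine the previous understanding with the new input, using the same weights every time."""
    return math.tanh(w_h * h_prev + w_x * x + b)

def run_rnn(xs, w_h=0.5, w_x=1.0, b=0.0, h0=0.0):
    """Read the sequence one element at a time, carrying the state forward."""
    h = h0
    states = []
    for x in xs:
        h = rnn_step(h, x, w_h, w_x, b)
        states.append(h)
    return states

xs = [0.1, 0.4, -0.2, 0.3]  # hypothetical sequence, e.g. daily price changes
states = run_rnn(xs)
reversed_states = run_rnn(xs[::-1])

print(states)
print(reversed_states)
```

Because each state depends on everything read so far, feeding the same values in a different order ends in a different final state, which is what lets an RNN be sensitive to word order or to the direction of a trend.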
Fig.4. Each sequential data element (x’s) is read by the RNN (A), which outputs an updated understanding of the data (h’s). From Chris Olah’s blog on LSTMs.
RNNs excel at analyzing any series of data – stock prices through time, rainfall throughout the year, network activity, etc. In fact, they are unreasonably effective at these tasks. They are also generative. As a result, they are behind many deep learning-created pieces of text, and a number of modern approaches to language translation.
Of course, they, like all other networks, face obstacles. Here’s the pros/cons list:
Hopefully this overview has provided some useful insight into the most popular neural networks – both how they work, and when they’re useful. Interested in learning more? Contact us.