How it is possible to drastically reduce the data annotation effort
In the previous posts I showed how we managed to train a neural network to detect objects with a very small number of annotated examples. Now I will tell you what our “secret” is.
A deep neural network is a software/mathematical artifact that converges from a dataset to an algorithm that models that data. It is an intrinsically complex process, but if we intuitively understand its inner workings, we can guide the network’s convergence/learning to quickly achieve good results.
The point I consider fundamental to understanding convolutional deep neural networks is their main element: the feature map.
You can think of a feature map as a small image patch that is trained to detect a particular pattern in the examples, and that flags whenever that pattern is found in an image.
The training of a neural network will be effective if its feature maps learn the fundamental patterns of the objects we want to detect.
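To make the idea concrete, here is a toy sketch of a single feature map: a small kernel slid over an image, producing a strong response wherever its pattern appears. This is an illustration of the concept only, not Eyeflow.AI’s implementation; the kernel and image are made up for the example.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide a small kernel over the image and record the response at
    each position -- a toy version of one convolutional feature map."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny vertical-edge "pattern detector"
kernel = np.array([[1., -1.],
                   [1., -1.]])

# Tiny image with a vertical edge between columns 1 and 2
image = np.array([[1., 1., 0., 0.],
                  [1., 1., 0., 0.],
                  [1., 1., 0., 0.]])

response = feature_map(image, kernel)
print(response)  # strongest response over the vertical edge
```

In a real network there are hundreds of such maps, and their kernels are learned from the data instead of written by hand.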
However, annotating a large number of examples is extremely laborious, and that is where data augmentation becomes a great tool. Data augmentation consists of applying changes to images that transform each one into a different image while keeping it a valid example. Thus, in each training epoch the example is presented differently to the neural network, forcing the feature maps to learn the fundamental patterns of the objects.
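The mechanism can be sketched in a few lines: each time an example is drawn, a random variant of it is produced, so the network never sees exactly the same image twice. The transforms below (flip, rotation, brightness jitter) are illustrative choices, not the specific set Eyeflow.AI uses.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image):
    """Return a randomly transformed copy of the image; the object it
    shows is unchanged, so it is still a valid training example."""
    out = image
    if rng.random() < 0.5:                # horizontal flip
        out = np.fliplr(out)
    if rng.random() < 0.5:                # 90-degree rotation
        out = np.rot90(out)
    out = out * rng.uniform(0.8, 1.2)     # brightness jitter
    return out

image = np.arange(12, dtype=float).reshape(3, 4)
# Each "epoch" the network sees a different variant of the same example
variants = [augment(image) for _ in range(5)]
```

One annotated example thus effectively becomes many distinct training examples.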
Let’s see how data augmentation works in practice with our demo application. At Eyeflow.AI we have a playground where the user can explore and configure several transformations of the images, creating a pipeline of transformations that will be applied to their examples during training.
In the left panel, we can see several effects that can be applied to the example images. Each of them has parameters that define its behavior, and a probability of being applied to an example at each step of training.
Let’s try the “Rotate Image” and observe the result.
The probability was set to 30, meaning that this effect will be applied to an example in 30% of the training steps. The range has been set from -30º to +30º, defining the minimum and maximum rotation that will be applied. Thus, each time the example is presented to the network in training, there is a 30% chance of the image being rotated by a random angle between -30º and +30º.
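The logic behind those two parameters can be sketched as follows. This is a minimal sketch, not Eyeflow.AI’s actual code: `rotate_fn` stands in for whatever rotation routine your image library provides (e.g. Pillow’s `Image.rotate`), and the names are hypothetical.

```python
import random

def maybe_rotate(image, rotate_fn, probability=0.30, max_angle=30.0):
    """With the given probability, rotate the image by a random angle
    drawn uniformly from [-max_angle, +max_angle] degrees; otherwise
    return it unchanged."""
    if random.random() < probability:
        return rotate_fn(image, random.uniform(-max_angle, max_angle))
    return image

# Simulate many training steps and check how often the effect fires
random.seed(42)
applied = sum(
    maybe_rotate("img", lambda im, angle: "rotated") == "rotated"
    for _ in range(10_000)
)
print(applied / 10_000)  # roughly 0.30
```

Over many epochs the network therefore sees the same example both upright and at many different angles.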
It is very important to understand how each of these effects will transform the image and to set the parameters correctly. Some effects can produce drastic deformations that make the objects unrecognizable. If such an example is presented to the network in training, it will have a negative effect, confusing the learning. See below the case of an extreme “Shear”.
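The difference between a useful and a destructive parameter value is easy to see in numbers. A shear maps each point (x, y) to (x + factor·y, y); below, a mild factor barely moves the corners of a unit square, while an extreme one slides the top edge several units sideways. This is a geometric illustration only, unrelated to Eyeflow.AI’s internals.

```python
import numpy as np

def shear_x(points, factor):
    """Apply a horizontal shear to (x, y) points: x' = x + factor * y."""
    m = np.array([[1.0, factor],
                  [0.0, 1.0]])
    return points @ m.T

# Corners of a unit square
square = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])

mild    = shear_x(square, 0.2)  # corners move slightly; shape still clear
extreme = shear_x(square, 3.0)  # top edge slides 3 units: unrecognizable
```

With factor 3.0 the corner (1, 1) lands at (4, 1); an object distorted that far no longer carries the patterns the feature maps should learn.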
After setting each of the effects, we can test them on a batch of images to see how they work together. Remember that our goal is to generate transformations of the examples that will be submitted to the network for learning, so the objects must still be clearly recognizable for the feature maps to learn their patterns.
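A pipeline of effects, each with its own probability, can be sketched like this. The function names and the string stand-ins for images are hypothetical; in practice each transform would be a real image operation configured in the playground.

```python
import random

def make_pipeline(effects):
    """effects: list of (probability, transform) pairs. Returns a function
    that applies each transform with its probability, in order -- a sketch
    of the kind of augmentation pipeline described above."""
    def pipeline(example):
        for prob, transform in effects:
            if random.random() < prob:
                example = transform(example)
        return example
    return pipeline

random.seed(7)
pipeline = make_pipeline([
    (0.3, lambda s: s + "+rotate"),
    (0.5, lambda s: s + "+flip"),
])

# Preview a batch: every variant should still be a recognizable example
batch = [pipeline("img") for _ in range(8)]
print(batch)
```

Previewing a whole batch like this is exactly what catches combinations of effects that, together, distort the examples too much.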
I often see articles saying that working with AI is laborious and painstakingly difficult because it requires a gigantic number of annotated examples. But for us here at Eyeflow.AI, this process has become faster and easier because of all the tools and advances we are adding to our Video Analytics platform.
If you don’t know us yet, come visit us: https://eyeflow.ai
Data Augmentation and the wonder of multiplication of examples was originally published in Analytics Vidhya on Medium, where people are continuing the conversation by highlighting and responding to this story.