Adding False Images to the Dataset
When training a network, valid datasets teach the neural network how it should behave. However, if all it has to go on is valid data, the network will probably not perform well when presented with new or unexpected data. In our scenario, on-track data is valid and everything else is invalid. It would be helpful if we could get the minidrone to stop when it is unable to identify a track.
For this purpose, I went around taking random pictures. These were then resized, thresholded, and saved in the same format as the valid data, but with zeros for the previous and current values of turn and speed. See InvalidConverter.java.
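The conversion described above can be sketched roughly as follows. This is not the code from InvalidConverter.java, only a minimal illustration. The 32 x 24 image size (which gives 768 pixels), the brightness cutoff of 128, the 0/1 pixel encoding, and every class and method name are assumptions made for the example.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

final class InvalidSampleSketch {
    // Assumed dimensions: 32 x 24 = 768 pixels, matching the pixel count per data item.
    static final int WIDTH = 32;
    static final int HEIGHT = 24;
    // Assumed brightness cutoff on a 0-255 scale.
    static final int THRESHOLD = 128;

    // Returns one data item: [prevTurn, prevSpeed, 768 pixels, turn, speed].
    static double[] toInvalidSample(BufferedImage photo) {
        // Resize the photo to the fixed input size.
        BufferedImage small = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = small.createGraphics();
        graphics.drawImage(photo, 0, 0, WIDTH, HEIGHT, null);
        graphics.dispose();

        // Java zero-fills new arrays, so the previous and current
        // turn/speed slots are already 0 for an invalid sample.
        double[] sample = new double[2 + WIDTH * HEIGHT + 2];
        int k = 2;
        for (int y = 0; y < HEIGHT; y++) {
            for (int x = 0; x < WIDTH; x++) {
                int rgb = small.getRGB(x, y);
                int red = (rgb >> 16) & 0xFF;
                int green = (rgb >> 8) & 0xFF;
                int blue = rgb & 0xFF;
                int gray = (red + green + blue) / 3;
                // Threshold: bright pixels become 1, dark pixels become 0.
                sample[k++] = gray >= THRESHOLD ? 1.0 : 0.0;
            }
        }
        return sample;
    }
}
```

Calling InvalidSampleSketch.toInvalidSample(image) on any photo yields 772 values in which only the pixel slots can be non-zero.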
Saving the Dataset to a CSV File
Once all the data is in the same format, the next step is to convert it to a .csv (comma-separated values) file with each line representing one data item. The first two numbers are the previous turn and speed values. The next 768 values are pixels from the image, and the last two are the current values of turn and speed, for a total of 772 values per line. The code in Generator.java performs this conversion. I’ve chosen the .csv format because it is the easiest format to feed to Encog (particularly the workbench, a GUI that simplifies interacting with Encog).
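To make the line layout concrete, here is a small sketch of how one data item could be written out. It is not taken from Generator.java. The class name, the method name, and the choice of integer pixel values are assumptions for illustration only.

```java
import java.util.StringJoiner;

final class CsvRowSketch {
    static final int PIXELS = 768;

    // Builds one CSV line: previous turn and speed, 768 pixels, current turn and speed.
    static String toCsvRow(double prevTurn, double prevSpeed, int[] pixels,
                           double turn, double speed) {
        if (pixels.length != PIXELS) {
            throw new IllegalArgumentException("expected " + PIXELS + " pixels, got " + pixels.length);
        }
        StringJoiner row = new StringJoiner(",");
        row.add(Double.toString(prevTurn));
        row.add(Double.toString(prevSpeed));
        for (int pixel : pixels) {
            row.add(Integer.toString(pixel));
        }
        row.add(Double.toString(turn));
        row.add(Double.toString(speed));
        return row.toString();
    }
}
```

Each call produces a single line with 772 comma-separated fields, so writing the whole dataset is a matter of appending one such line per data item.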
Choice of Neural Network
Now that the data is in a consistent format, it’s time to decide what kind of network to train and how to train it. For this example, we are interested in predicting the next values of turn and speed based on the previous values as well as the camera image from the drone. This means 770 inputs and 2 outputs. I have chosen to use a feedforward network and train it using resilient propagation. There are several other training methods, but I’ve chosen this one because it eliminates the need to choose a learning rate or momentum. Based on the number of inputs and outputs, the network will have 770 neurons in the input layer and 2 neurons in the output layer. The big question becomes: how do we choose the number of hidden layers and the number of neurons in each of them?
Sadly, there’s no set method for doing this other than experimentation. I started out with a 770-1155-2 network with HyperTan activation in the hidden layer and Linear activation in the output layer. However, varying the number of neurons in the hidden layer did not yield any positive results: the error was always above 100%. This is the part where patience comes in handy. After several days of no luck I almost gave up on the project. Eventually I decided to switch to a network with 2 hidden layers. After several trials, I settled on a 770-500-100-2 network and was able to get below 5% error.
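Whatever the hidden layers end up looking like, every line of the dataset has to be split into 770 inputs and 2 expected outputs before it reaches the network. The sketch below shows that split using only the standard library. The class and method names are hypothetical, and when you train through the workbench Encog handles this step for you.

```java
import java.util.Arrays;

final class RowSplitSketch {
    static final int INPUTS = 770;  // previous turn and speed plus 768 pixels
    static final int OUTPUTS = 2;   // current turn and speed

    // Returns { input, ideal } for one CSV line of 772 numbers.
    static double[][] split(String csvLine) {
        String[] fields = csvLine.split(",");
        if (fields.length != INPUTS + OUTPUTS) {
            throw new IllegalArgumentException(
                "expected " + (INPUTS + OUTPUTS) + " fields, got " + fields.length);
        }
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        double[] input = Arrays.copyOfRange(values, 0, INPUTS);
        double[] ideal = Arrays.copyOfRange(values, INPUTS, INPUTS + OUTPUTS);
        return new double[][] { input, ideal };
    }
}
```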
Training with the Encog Analyst
One of the convenient things about Encog is that it can analyze your data file and normalize it for you. Using the workbench, you can pick a goal (in our case, regression) and have it generate a .ega (Encog Analyst) file, which describes the data as well as the tasks to perform (such as splitting it into training and validation sets, randomizing it, and so on).
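To give an idea of what normalizing a continuous column involves, here is a sketch of plain range (min-max) normalization. It is not Encog's code. The target range of -1 to 1 is my assumption about what the analyst produces for continuous fields, and the class and method names are made up for the example.

```java
final class RangeNormalizeSketch {
    // Maps x from [min, max] onto [lo, hi].
    static double normalize(double x, double min, double max, double lo, double hi) {
        if (max == min) {
            // A constant column carries no information; place it mid-range.
            return (lo + hi) / 2.0;
        }
        return (x - min) / (max - min) * (hi - lo) + lo;
    }

    // Normalizes a whole column using its own minimum and maximum.
    static double[] normalizeColumn(double[] column, double lo, double hi) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double value : column) {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        double[] result = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            result[i] = normalize(column[i], min, max, lo, hi);
        }
        return result;
    }
}
```

For example, normalizeColumn(new double[] {0, 5, 10}, -1, 1) maps the smallest value to -1, the midpoint to 0, and the largest value to 1.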
To use the workbench, you can simply download the sources and run them. Create a project and then drag the .csv file with your data into it. Right-clicking on the data file gives you the option to analyze it, and a window comes up allowing you to set any necessary options.
Once you hit the OK button, an analyst file is generated. Sadly, Encog sometimes misclassifies some inputs as classes rather than continuous values. You can fix these by manually editing the file. It’s also necessary to specify the correct inputs and outputs. The analyst file also contains the network definition and target error. Here’s a copy of the analyst file that I used for training. Once it’s all set up, hitting execute performs all the defined tasks and trains the network. Encog also allows you to stop a command (for example, training) once you think the error is acceptable. I was satisfied with a 5% error.
Please leave a comment if you have a question.