Sunday, November 22, 2015

Creating Training Material for Recurrent Neural Networks


In my previous post I shared an experiment I did using Recurrent Neural Network (RNN) software. I started thinking that perhaps RNNs could learn not just the QSO language concepts but also how to decode Morse code from noisy signals. Since I was able to demonstrate learning of the syntax, structure and commonly used phrases in QSOs after just 50 epochs over the training material, wouldn't the same concept work for actual Morse signals?

Well, I don't really have any suitable training material to test this. For the Kaggle competitions (MLMv1, MLMv2) I created a lot of training material, but its focus was different. The audio files and corresponding transcript files were open ended, as I didn't want to narrow down the approaches that participants might take. The material was designed with a Kaggle competition in mind, so that participants' solutions could be scored.

In machine learning you typically have training and validation material with many different dimensions and a target variable (or variables) you are trying to model. With neural networks you can train the network to look for patterns in the input(s) and set outputs to target values when the input pattern is detected. With RNNs you can introduce a memory function. This is necessary because you need to remember signal values from the past to properly decode the Morse characters.

In Morse code you typically have just one signal variable, and the goal is to extract the decoded message from that signal. This could be done by having, for example, 26 outputs (one for each letter of the alphabet) and training the network to set output 'A' high when the pattern '.-' is detected in the signal input. Alternatively, you could have output lines for symbols like 'dit', 'dah' and 'element space' that are set high when the corresponding pattern is detected in the input signal.

Since a well working Morse decoder has to deal with different speeds (typically 5 ... 50 WPM), signals containing noise, QSB fading and other factors, I decided to create Morse Encoder software that generates artificial training signals together with the corresponding symbols, speed information etc. I chose the symbols approach because it is easier to debug errors and problems when you can plot the inputs vs. outputs graphically. See this Wikipedia article for details about representation, timing of symbols and speed.
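To make the speed range concrete, here is a minimal sketch of the standard timing arithmetic. It assumes the usual PARIS convention (one word is 50 dit units, so a dit lasts 1.2 / WPM seconds). The helper names and the 1000 Hz sample rate are made up for illustration and are not taken from the Morse Encoder itself.

```python
# Relative durations of Morse symbols, measured in dit units (standard timing).
UNITS = {"dit": 1, "dah": 3, "element_space": 1, "character_space": 3, "word_space": 7}


def dit_seconds(wpm):
    """Length of one dit in seconds, assuming the PARIS convention (50 units per word)."""
    return 1.2 / wpm


def samples_per_unit(wpm, sample_rate):
    """Number of samples in one dit unit; sample_rate is an assumed value in Hz."""
    return max(1, round(dit_seconds(wpm) * sample_rate))


for wpm in (5, 20, 50):
    ms = dit_seconds(wpm) * 1000
    print(f"{wpm} WPM: dit = {ms:.0f} ms, dah = {UNITS['dah'] * ms:.0f} ms")
```

Across the 5 ... 50 WPM range the dit length changes by a factor of ten, which is why the speed is worth keeping as its own training column.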

The Morse Encoder generates a set of time synchronized signals and can also add QSB type fading effects and Gaussian noise. See the example of 'QUICK BROWN FOX JUMPED OVER THE LAZY FOX ' plotted with deep QSB fading with a 4 second cycle time and 0.01 sigma Gaussian noise added in Figure 1 below.
Fig 1. Morse Encoder output signal with QSB and noise

The QSB for real life signals doesn't always follow a sin() curve like in Fig 1, but as you can see from the example below this is close enough. The big challenge is how to continue decoding correctly when the signal goes down to the noise level, as shown between 12000 and 14000 time samples (horizontal axis) below.
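The fading and noise described above can be sketched roughly as follows. This is not the Morse Encoder's actual code: the function name, the `depth` parameter and the 1000 Hz sample rate are assumptions for illustration, and only the 4 second cycle time and the 0.01 sigma come from the example in this post.

```python
import numpy as np


def add_qsb_and_noise(sig, sample_rate, qsb_period_s=4.0, depth=1.0, sigma=0.01, seed=0):
    """Apply sinusoidal fading and additive Gaussian noise to a keyed envelope."""
    sig = np.asarray(sig, dtype=float)
    t = np.arange(sig.size) / sample_rate
    # Gain swings between (1 - depth) and 1 once per qsb_period_s seconds.
    gain = 1.0 - depth * 0.5 * (1.0 + np.sin(2.0 * np.pi * t / qsb_period_s))
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=sig.size)
    return gain * sig + noise


# Usage: a constant carrier, 8 seconds long at an assumed 1000 samples per second.
envelope = np.ones(8000)
faded = add_qsb_and_noise(envelope, sample_rate=1000)
```

With `depth=1.0` the gain reaches zero once per cycle, so the signal drops fully into the noise, which is the hard case for a decoder.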


To provide proper target values for RNN training, the Morse Encoder creates a pandas DataFrame in Python with the following columns defined:

    P.t    # keep time  
    P.sig  # signal stored here
    P.dit  # probability of 'dit' stored here
    P.dah  # probability of 'dah' stored here
    P.ele  # probability of 'element space' stored here
    P.chr  # probability of 'character space' stored here
    P.wrd  # probability of 'word space' stored here
    P.spd  # WPM speed stored here 

The Morse Encoder takes the given text and parameters and fills in these columns. For example, when there is a 'dit' in the signal, P.dit has a probability of 1.0 on the corresponding rows. Likewise, if there is a 'dah' in the signal, P.dah has a probability of 1.0 on the corresponding rows. This is shown in Figure 2 below: dits are red and dahs are green, while the signal is shown in blue.

Fig 2.  Dit and Dah probabilities 

A zoomed section of the letters 'QUI ' is shown in Fig 3 below.

Fig 3. Zoomed section

Likewise we create probabilities for the spaces. In Figure 4 below the element space is shown in magenta and the character space in cyan. I decided to set the character space to probability 1.0 only after the element space has passed, as can be seen from the graph.

Fig 4. Element Space and Character Space 
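The labelling scheme for dits, dahs and spaces can be sketched in a few lines of plain Python. This is an illustration under my reading of the description above, not the encoder's actual code. The assumptions are that an element space follows every element, that the character space covers only the remaining 2 units of the 3 unit gap, and that the word space covers the remaining 4 units of the 7 unit gap. The function name and the tiny Morse table are hypothetical, and the time and speed columns are left out.

```python
MORSE = {"Q": "--.-", "U": "..-", "I": ".."}  # tiny subset, enough for the demo
LABELS = ("dit", "dah", "ele", "chr", "wrd")


def encode_labels(text, n=1):
    """Return per-sample columns for text, using n samples per dit unit."""
    cols = {name: [] for name in ("sig",) + LABELS}

    def emit(active, units, level):
        # Append units * n rows at signal level; only the active label is 1.0.
        for _ in range(units * n):
            cols["sig"].append(level)
            for name in LABELS:
                cols[name].append(1.0 if name == active else 0.0)

    for word in text.split():
        for ch in word:
            for sym in MORSE[ch]:
                if sym == ".":
                    emit("dit", 1, 1.0)
                else:
                    emit("dah", 3, 1.0)
                emit("ele", 1, 0.0)   # element space after every element
            emit("chr", 2, 0.0)       # rest of the 3 unit character gap
        emit("wrd", 4, 0.0)           # rest of the 7 unit word gap
    return cols


cols = encode_labels("QUI ")
print(len(cols["sig"]), "rows;", sum(cols["dit"]), "dit rows;", sum(cols["dah"]), "dah rows")
```

The resulting dictionary of equal-length lists can be passed directly to `pandas.DataFrame` to get the column layout described earlier.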

The resulting DataFrame can be saved into a CSV file with a simple Python command, and it is very easy to manipulate or plot. Conceptually it is like an Excel spreadsheet - see below:
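As a minimal sketch of that step, assuming pandas: the rows here are hand-written stand-ins rather than real encoder output, and the file name is hypothetical.

```python
import os
import tempfile

import pandas as pd

# A few hand-written rows standing in for real encoder output.
P = pd.DataFrame({
    "t":   [0, 1, 2, 3],
    "sig": [1.0, 0.0, 1.0, 1.0],
    "dit": [1.0, 0.0, 0.0, 0.0],
    "dah": [0.0, 0.0, 1.0, 1.0],
    "ele": [0.0, 1.0, 0.0, 0.0],
    "spd": [20, 20, 20, 20],
})

# Hypothetical file name, written to the system temp directory.
path = os.path.join(tempfile.gettempdir(), "morse_training.csv")
P.to_csv(path, index=False)

# Reading it back gives the same spreadsheet-like table.
restored = pd.read_csv(path)
print(restored.head())
```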


The Morse Encoder software is open source and hosted on GitHub.


Now that I have the capability to create proper training material automatically from a few parameters, like speed (WPM), fading (QSB) and noise level (sigma), it is a trivial exercise to produce large quantities of these training files.

My next focus area is to learn more about Recurrent Neural Networks (especially LSTM variants) and experiment with different network configurations. The goal is to find an RNN configuration that is able to learn how to model the symbols correctly, even in the presence of noise and QSB or at different speeds.

