Saturday, November 25, 2017

MORSE: DENOISING AUTO-ENCODER

Introduction

A denoising auto-encoder (DAE) is an artificial neural network used for unsupervised learning of efficient codings.  During training, a DAE takes a partially corrupted input and learns to recover the original, undistorted input.

For ham radio amateurs there are many potential use cases for de-noising auto-encoders.  In this blog post I share an experiment where I trained a neural network to decode Morse code from a very noisy signal.

Can you see the Morse character in Figure 1 below?   It looks like a bad waterfall display with a lot of background noise.

Fig 1.  Noisy Input Image
To my big surprise this trained DAE was able to decode the letter 'Y' on the top row of the image.  The reconstructed image is shown below in Figure 2.  To put this in perspective, how often can you totally eliminate the noise just by turning a knob on your radio?  The reconstruction is very clear, with the small exception that the timing of the last 'dah' in the letter 'Y' is a bit shorter than in the original training image.

Fig 2.  Reconstructed Output Image





For reference, below is the original image of the letter 'Y' that was used in the training phase.


Fig 3.   Original image used for training 




Experiment Details

As a starting point I used TensorFlow tutorials using Jupyter Notebooks, in particular this excellent de-noising autoencoder example that uses the MNIST database as the data source.  The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems. It is also widely used for training and testing in the field of machine learning. The MNIST database contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other halves were taken from NIST's testing dataset.

Fig 4. Morse images
I created a simple Python script that generates a Morse code dataset in MNIST format, using a text file as the input data. To keep things simple I kept the MNIST image size (28 x 28 pixels) and just 'painted' the Morse code as white pixels on the canvas.  These images look a bit like the waterfall display in modern SDR receivers or software like CW Skimmer.  I created altogether 55,000 training images, 5,000 validation images and 10,000 testing images.
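
The script itself is on GitHub; the snippet below is only a minimal sketch of the idea, not the actual script. The one-row layout, the dit/dah pixel lengths and the tiny character table are my own illustrative assumptions.

import numpy as np

# Hypothetical subset of the 60-character Morse table used for training
MORSE = {'Y': '-.--', 'B': '-...', 'Q': '--.-'}

def morse_image(ch, size=28):
    # Paint one Morse character as white pixels on a black 28x28 canvas.
    # Assumed timing: dit = 1 pixel, dah = 3 pixels, 1 pixel gap between elements.
    img = np.zeros((size, size), dtype=np.float32)
    col = 0
    for element in MORSE[ch]:
        length = 1 if element == '.' else 3
        img[0, col:col + length] = 1.0   # mark pixels, starting from the top left corner
        col += length + 1                # leave a one-pixel gap after each element
    return img.reshape(size * size)      # flatten to 784 values, MNIST style

y_img = morse_image('Y')                 # the letter 'Y' (dah-dit-dah-dah)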

To validate that these images look OK, I plotted the first ten characters, "BB 2BQA}VA", from the random text file I used for training.  Each image is 28x28 pixels in size, so even the longest Morse character fits easily on the image.  Right now all Morse characters start from the top left corner, but it would be easy to add more randomness to the starting point and even the length (or speed) of these characters.

In fact the original MNIST images have a lot of variability in the handwritten digits, and some are difficult even for humans to classify correctly.  In the MNIST case you have only ten classes to choose from (the digits 0 through 9), but for Morse code I had 60 classes, as I also wanted to include special characters in the training material.

Fig 5. MNIST images

Figure 4 shows example Morse images and Figure 5 shows example MNIST handwritten digits.

When training the DAE network I added a modest amount of Gaussian noise to these training images; see the example in Figure 6.  It is quite surprising that the DAE network is still able to produce correct reconstructions with three times more noise added to the test images.

Fig 6. Noise added to training input image
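
Adding the noise is essentially a one-liner with NumPy. Here is a minimal sketch; the noise levels (0.1 for training, 0.3 for testing) and the array names train_images / test_images are illustrative assumptions, not the exact values from my notebook.

import numpy as np

def add_noise(images, sigma):
    # Add zero-mean Gaussian noise and clip back to the valid 0..1 pixel range
    noisy = images + np.random.normal(0.0, sigma, images.shape)
    return np.clip(noisy, 0.0, 1.0)

train_noisy = add_noise(train_images, 0.1)   # modest noise for training
test_noisy  = add_noise(test_images, 0.3)    # roughly three times more noise for testing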





















Network model and functions

A typical feature of auto-encoders is having hidden layers with fewer features than the input or output layers.  The network is forced to learn a "compressed" representation of the input. If the input were completely random, this compression task would be very difficult. But if there is structure in the data, for example if some of the input features are correlated, then the algorithm will be able to discover some of those correlations.

# Network Parameters
n_input    = 784 # MNIST data input (img shape: 28*28)
n_hidden_1 = 256 # 1st layer num features
n_hidden_2 = 256 # 2nd layer num features
n_output   = 784 # output is the reconstructed 28*28 image
with tf.device(device2use):
    # tf Graph input
    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_output])
    dropout_keep_prob = tf.placeholder("float")
    # Store layers weight & bias
    weights = {
        'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
        'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
        'out': tf.Variable(tf.random_normal([n_hidden_2, n_output]))
    }
    biases = {
        'b1': tf.Variable(tf.random_normal([n_hidden_1])),
        'b2': tf.Variable(tf.random_normal([n_hidden_2])),
        'out': tf.Variable(tf.random_normal([n_output]))
    }

The functions for this neural network are below. The cost function calculates the mean squared difference between the output and the training images.

with tf.device(device2use):
    # MODEL
    out = denoising_autoencoder(x, weights, biases, dropout_keep_prob)
    # DEFINE LOSS AND OPTIMIZER
    cost = tf.reduce_mean(tf.pow(out-y, 2))
     
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost) 
    # INITIALIZE
    init = tf.initialize_all_variables()  # use tf.global_variables_initializer() in newer TensorFlow
    # SAVER
    savedir = "nets/"
    saver = tf.train.Saver(max_to_keep=3) 
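
The denoising_autoencoder() function called above is not shown in the excerpt. Below is a minimal sketch of what it looks like, assuming the tutorial's structure of sigmoid layers with dropout applied to the hidden activations; the exact code is in the linked notebook.

def denoising_autoencoder(_X, _weights, _biases, _keep_prob):
    # Encoder: 784 -> 256, sigmoid activation, dropout for regularization
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1']))
    layer_1 = tf.nn.dropout(layer_1, _keep_prob)
    # Hidden layer: 256 -> 256
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1, _weights['h2']), _biases['b2']))
    layer_2 = tf.nn.dropout(layer_2, _keep_prob)
    # Decoder: 256 -> 784, reconstructs the clean image
    return tf.nn.sigmoid(tf.add(tf.matmul(layer_2, _weights['out']), _biases['out']))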

Model Training 

I used the following parameters for training the model. Training took 1780 seconds on a MacBook Pro laptop. The cost curve of the training process is shown in Figure 7.

training_epochs = 300
batch_size      = 1000
display_step    = 5
plot_step       = 10
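
The training loop follows the usual pattern in these tutorials. A minimal sketch is below; the batch source mnist_morse, the add_noise() helper from the earlier sketch and the noise level are my assumptions, not the exact notebook code.

with tf.Session() as sess:
    sess.run(init)
    num_batches = int(55000 / batch_size)            # 55,000 training images
    for epoch in range(training_epochs):
        total_cost = 0.
        for _ in range(num_batches):
            batch_x, _ = mnist_morse.train.next_batch(batch_size)
            batch_noisy = add_noise(batch_x, 0.1)    # corrupt the input, keep the target clean
            _, c = sess.run([optimizer, cost],
                            feed_dict={x: batch_noisy, y: batch_x, dropout_keep_prob: 0.5})
            total_cost += c
        if epoch % display_step == 0:
            print("Epoch %03d  cost %.6f" % (epoch, total_cost / num_batches))
    saver.save(sess, savedir + "dae_morse.ckpt")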


Fig 7. Cost curve

It is interesting to observe what is happening to the weights.  Figure 8 shows the first hidden layer "h1" weights after training is completed. Each of these blocks has learned some internal representation of the Morse characters. You can also see the noise that was present in the training data.

Fig 8.  Filter shape for "h1" weights

Software

The Jupyter Notebook source code of this experiment has been posted to GitHub.  Many thanks to the original contributors of this and other TensorFlow tutorials. Without them this experiment would not have been possible.

Conclusions

This experiment demonstrates that de-noising auto-encoders could have many potential use cases for ham radio experiments. While I used the MNIST format (28x28 pixel images) in this experiment, it is quite feasible to use other kinds of data, such as audio WAV files, SSTV images or data from other digital modes commonly used by ham radio amateurs.

If your data has a clear structure that gets distorted by noise during a radio transmission, it would be quite feasible to experiment with a de-noising auto-encoder to restore near-original quality.   It is just a matter of re-configuring the DAE network and re-training it.

If this article sparked your interest in de-noising auto-encoders, please let me know.  Machine Learning algorithms are rapidly being deployed in many data-intensive applications.  I think it is time for ham radio amateurs to start experimenting with this technology as well.


73 
Mauri  AG1LE  



Sunday, November 5, 2017

TensorFlow revisited: a new LSTM Dynamic RNN based Morse decoder



It has been almost two years since I last played with a TensorFlow based Morse decoder.  This is a long time in the rapidly moving Machine Learning field.

I created a new version of the LSTM Dynamic RNN based Morse decoder using the TensorFlow package and Aymeric Damien's example.  This version is much faster and also has the ability to train and decode on variable-length sequences.  The training and testing sets are generated on the fly from sample text files. I have included the Python library and the new TensorFlow code on my GitHub page.

The demo has the ability to train and test using datasets with noise embedded.    Fig 1. shows the first 50 test vectors with Gaussian noise added. Each vector is padded to 32 values.  Unlike the previous version of the LSTM network, this new version can train on variable-length sequences.  The Morse class handles the generation of training vectors based on an input text file that contains randomized text.

Fig 1. "NOW 20 WPM TEXT IS FROM JANUARY 2015 QST PAGE 56 " 
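
The exact encoding is defined in the Morse class in the linked code. Purely to illustrate the idea, here is a sketch of how a character could be turned into a padded sequence; the dit/dah sample lengths and the 1.0/0.0 keying values are my assumptions.

import numpy as np

def encode(char, morse_table, max_len=32):
    # Assumed encoding: 1.0 while the key is down, 0.0 while it is up,
    # dit = 1 sample, dah = 3 samples, 1 sample gap between elements
    samples = []
    for element in morse_table[char]:
        samples += [1.0] * (1 if element == '.' else 3)
        samples += [0.0]                           # inter-element gap
    seqlen = len(samples)
    samples += [0.0] * (max_len - seqlen)          # zero-pad to the fixed length
    return np.array(samples).reshape(max_len, 1), seqlen

vec, seqlen = encode('Y', {'Y': '-.--'})           # a (32, 1) vector plus its true length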




Below are the TensorFlow model and network parameters I used for this experiment: 

# MODEL Parameters
learning_rate = 0.01
training_steps = 5000
batch_size = 512
display_step = 100
n_samples = 10000 

# NETWORK  Parameters
seq_max_len = 32 # Sequence max length
n_hidden = 64    # Hidden layer num of features  
n_classes = 60   # Each morse character is a separate class
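
The model graph follows Aymeric Damien's dynamic RNN example fairly closely: a single LSTM cell, tf.nn.dynamic_rnn with a sequence_length argument, and a classifier on the last valid output of each sequence. The sketch below is an outline under those assumptions; the actual implementation is in the GitHub repository.

x = tf.placeholder("float", [None, seq_max_len, 1])
y = tf.placeholder("float", [None, n_classes])
seqlen = tf.placeholder(tf.int32, [None])          # true (unpadded) length of each sequence

weights = {'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))}
biases  = {'out': tf.Variable(tf.random_normal([n_classes]))}

def dynamicRNN(x, seqlen, weights, biases):
    # LSTM over the padded sequences; dynamic_rnn stops updating state at seqlen
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden)
    outputs, _ = tf.nn.dynamic_rnn(lstm_cell, x, dtype=tf.float32,
                                   sequence_length=seqlen)
    # Select the output at the last valid time step of each sequence
    batch = tf.shape(outputs)[0]
    index = tf.range(0, batch) * seq_max_len + (seqlen - 1)
    last = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)
    return tf.matmul(last, weights['out']) + biases['out']

pred = dynamicRNN(x, seqlen, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)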


Fig 2. shows the training loss and accuracy by minibatch. This training took 446.9 seconds and the final testing accuracy reached was 0.9988.  This training session was done without any noise in the training dataset.


Fig 2. Training Loss and Accuracy plot.















A sample session using the trained model is below:

# ================================================================
#   Use saved model to predict characters from Morse sequence data
# ================================================================
NOISE = False    # set to True to add Gaussian noise to the test vectors

# numpy.random.normal is used below; x, y, seqlen and pred are defined in the model graph above
from numpy.random import normal

saver = tf.train.Saver()

testset = Morse(n_samples=10000, max_seq_len=seq_max_len, filename='arrl2.txt')
test_data = testset.data
if (NOISE):
    test_data = test_data + normal(0., 0.1, 32*10000).reshape(10000, 32, 1)
test_label = testset.labels
test_seqlen = testset.seqlen
# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/morse_model.ckpt")
    print("Model restored.")
    y_hat = tf.argmax(pred,1)
    ch = sess.run(y_hat, feed_dict={x: test_data, y: test_label,seqlen: test_seqlen})
    s = ''
    for c in ch:
        s += testset.decode(c)
    print( s)

Here is the output from the decoder (this is using the arrl2.txt file as input):

INFO:tensorflow:Restoring parameters from /tmp/morse_model.ckpt
Model restored.
NOW 20 WPM TEXT IS FROM JANUARY 2015 QST  PAGE 56 SITUATIONS WHERE I COULD HAVE BROUGHT A DIRECTIONAL ANTENNA WITH ME, SUCHAS A SMALL YAGI FOR HF OR VHF.  IF ITS LIGHT ENOUGH, ROTATING A YAGI CAN BEDONE WITH THE ARMSTRONG METHOD, BUT IT IS OFTEN VERY INCONVENIENT TO DO SO.PERHAPS YOU DONT WANT TO LEAVE THE RIG BEHIND WHILE YOU GO OUTSIDE TO ADJUST THE ANTENNA TOWARD THAT WEAK STATION, OR PERHAPS YOU'RE IN A TENT AND ITS DARK OUT THERE.  A BATTERY POWERED ROTATOR PORTABLE ROTATION HAS DEVELOPED A SOLUTION TO THESE PROBLEMS.  THE 12PR1A IS AN ANTENNA ROTATOR FIGURE 6 THAT FUNCTIONS ON 9 TO 14 V DC.  AT 12 V, THE UNIT IS SPECIFIED TO DRAW 40 MA IDLE CURRENT AND 200 MA OR LESS WHILE THE ANTENNA IS TURNING. IT CAN BE POWERED FROM THE BATTERY USED TO RUN A TYPICAL PORTABLE STATION.WHILE THE CONTROL HEAD FIGURE 7 WILL FUNCTION WITH AS LITTLE AS 6 V, A END OF 20 WPM TEXT QST DE AG1LE  NOW 20 WPM     TEXT IS FROM JANUARY 2014 QST  PAGE 46 TRANSMITTER MANUALS SPECIFI

As the reader can observe, the LSTM network has learned to translate incoming Morse sequences to text nearly perfectly.

Next I set the NOISE variable to True.  Here is the decoded message with noise:

NOW J0 O~M TEXT IS LRZM JANUSRQ 2015 QST  PAGE 56 SITRATIONS WHEUE I XOULD HAVE BRYUGHT A DIRECTIZNAF ANTENNS WITH ME{ SUYHSS A SMALL YAGI FYR HF OU VHV'  IV ITS LIGHT ENOUGH, UOTSTING A YAGI CAN BEDONE FITH THE ARMSTRONG METHOD8 LUT IT IS OFTEN VERQ INOGN5ENIENT TC DG SC.~ERHAPS YOR DZNT WINT TO LEAVE THE RIK DEHIND WHILE YOU KO OUTSIME TO ADJUST THE AATENNA TYOARD THNT WEAK STTTION0 OU ~ERHAPS COU'UE IN A TENT AND ITS MARK OUT THERE.  S BATTERC JYWERED RCTATOR ~ORTALLE ROTATION HAS DEVELOOED A SKLUTION TO THESE ~UOBLEMS.  THE 1.JU.A IS AN ANTENNA RYTATCR FIGURE 6 THAT FRACTIZNS ZN ) TO 14 V DC1  AT 12 W{ THE UNIT IS SPECIFIED TO DRSW }8 MA IDLE CURRENT AND 20' MA OR LESS WHILE THE ANTENNA IS TURNING. IT ZAN BE POOERED FROM THE BATTEUY USED TO RRN A T}~IXAL CQMTUBLE STATION_WHILE IHE }ZNTROA HEAD FIGURE 7 WILA WUNXTION WITH AS FITTLE AA 6 F8 N END ZF 2, WPM TEXT OST ME AG1LE  NOW 20 W~M     TEXT IS LROM JTNUARJ 201} QST  ~AGE 45 TRANSMITTER MANUALS S~ECILI

Interestingly, this text is still quite readable despite the noisy signals. The model seems to mis-decode some dits and dahs, but the word structure is still visible.

As a next step I re-trained the network using the same amount of noise in the training dataset.  I expected the loss and accuracy to be worse.   Fig 3. shows that reaching a training accuracy of 0.89338 took much longer, and the maximum testing accuracy was only 0.9837.

Fig. 3  Training Loss and Accuracy with noisy dataset


With the new model trained on noisy data, I re-ran the testing phase. Here is the decoded message with noise:

NOW 20 WPM TEXT IS FROM JANUARY 2015 QST  PAGE 56 SITUATIONS WHERE I COULD HAWE BROUGHT A DIRECTIONAL ANTENNA WITH ME0 SUCHAS A SMALL YAGI FOR HF OR VHF1  IF ITS LIGHT ENOUGH0 ROTATING A YAGI CAN BEDONE WITH THE ARMSTRONG METHOD0 BUT IT IS OFTEN VERY INCONVENIENT TO DO SO1PERHAPS YOU DONT WANT TO LEAVE THE RIG BEHIND WHILE YOU GO OUTSIDE TO ADJUST THE ANTENNA TOWARD THAT WEAK STATION0 OR PERHAPS YOU1RE IN A TENT AND ITS DARK OUT THERE1  A BATTERY POWERED ROTATOR PORTABLE ROTATION HAS DEVELOPED A SOLUTION TO THESE PROBLEMS1  THE 12PR1A IS AN ANTENNA ROTATOR FIGURE 6 THAT FUNCTIONS ON 9 TO 14 V DC1  AT 12 V0 THE UNIT IS SPECIFIED TO DRAW 40 MA IDLE CURRENT AND 200 MA OR LESS WHILE THE ANTENNA IS TURNING1 IT CAN BE POWERED FROM THE BATTERY USED TO RUN A TYPICAL PORTABLE STATION1WHILE THE CONTROL HEAD FIGURE Q WILL FUNCTION WITH AS LITTLE AS X V0 A END OF 20 WPM TEXT QST DE AG1LE  NOW 20 WPM     TEXT IS FROM JANUARY 2014 QST  PAGE 46 TRANSMITTER MANUALS SPECIFI

As the reader can observe, we now have nearly perfect copy from the noisy testing data.  The LSTM network has gained the ability to pick up the signals from the noise.  Note that the training data and testing data are two completely separate datasets.

CONCLUSIONS

Recurrent Neural Networks have gained a lot of momentum over the last two years. LSTM-type networks are used in machine learning systems, like Google Translate, that can translate sequences from one language to another efficiently and accurately.

This experiment shows that a relatively small TensorFlow based neural network can learn Morse code sequences and translate them to text.  It also shows that adding noise to the training data slows down learning and reduces the overall training accuracy achieved.  However, applying a similar noise level in the testing phase significantly improves the testing accuracy when using a model trained on noisy signals: the network has learned the signal distribution and is able to decode more accurately.

So what are the practical implications of this work?   With some signal pre-processing, an LSTM RNN could provide a self-learning Morse decoder that only needs a set of labeled audio files to learn a particular set of sequences.  With a large enough training dataset, the model could achieve over 95% accuracy.

73  de AG1LE 
Mauri