## Tuesday, November 24, 2015

### INTRODUCTION

In my previous post I created a Python script to generate training material for neural networks.
The goal is to test how well the modern Deep Learning algorithms would work in decoding noisy Morse signals with heavy QSB fading.

I did some research on various frameworks and found a write-up by Daniel Hnyk. My requirements were quite similar to his: full Python support, a built-in LSTM RNN and a simple interface.
He had selected Keras, which is available on GitHub. There is a fairly active mailing list for Keras users, which is quite useful for getting support from other users. I installed Keras on my Linux laptop, and with Jupyter interactive notebooks it was easy to start experimenting with various neural network configurations.

### SIMPLE RECURRENT NEURAL NETWORK EXPERIMENT

Using various sources and the above mailing list I came up with the following experiment. I have uploaded the Jupyter notebook file to GitHub in case the reader wants to replicate the experiment.

The source code and printed output are shown below; I have added some commentary, as well as the graphs as pictures.

In [12]:
#!/usr/bin/env python
# MorseEncoder.py  - Morse Encoder to generate training material for neural networks
# Generates raw signal waveforms with Gaussian noise and QSB (signal fading) effects
# Provides also the training target variables in separate columns. Example usage:
#
# WPM= 40 # speed 40 words per minute
# Tq = 4. # QSB cycle time in seconds (typically 5..10 secs)
# sigma = 0.02 # add some Gaussian noise
# P = signal('QUICK BROWN FOX JUMPED OVER THE LAZY FOX ',WPM,Tq,sigma)
# from matplotlib.pyplot import  plot,show,figure,legend
# from numpy.random import normal
# figure(figsize=(12,3))
# lb1,=plot(P.t,P.sig,'b',label="sig")
# lb2,=plot(P.t,P.dit,'g',label="dit")
# lb3,=plot(P.t,P.dah,'g',label="dah")
# lb4,=plot(P.t,P.ele,'m',label="ele")
# lb5,=plot(P.t,P.chr,'c',label="chr")
# lb6,=plot(P.t,P.wrd,'r*',label="wrd")
# legend([lb1,lb2,lb3,lb4,lb5,lb6])
# show()
# P.to_csv("MorseTest.csv")
#
# Copyright (C) 2015   Mauri Niininen, AG1LE
#
#
# MorseEncoder.py is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# MorseEncoder.py is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with MorseEncoder.py.  If not, see <http://www.gnu.org/licenses/>.

import numpy as np
import pandas as pd
from numpy import sin,pi
from numpy.random import normal
pd.options.mode.chained_assignment = None  #to prevent warning messages

Morsecode = {
'!': '-.-.--',
'$': '...-..-',
"'": '.----.',
'(': '-.--.',
')': '-.--.-',
',': '--..--',
'-': '-....-',
'.': '.-.-.-',
'/': '-..-.',
'0': '-----',
'1': '.----',
'2': '..---',
'3': '...--',
'4': '....-',
'5': '.....',
'6': '-....',
'7': '--...',
'8': '---..',
'9': '----.',
':': '---...',
';': '-.-.-.',
'<AR>': '.-.-.',
'<AS>': '.-...',
'<HM>': '....--',
'<INT>': '..-.-',
'<SK>': '...-.-',
'<VE>': '...-.',
'=': '-...-',
'?': '..--..',
'@': '.--.-.',
'A': '.-',
'B': '-...',
'C': '-.-.',
'D': '-..',
'E': '.',
'F': '..-.',
'G': '--.',
'H': '....',
'I': '..',
'J': '.---',
'K': '-.-',
'L': '.-..',
'M': '--',
'N': '-.',
'O': '---',
'P': '.--.',
'Q': '--.-',
'R': '.-.',
'S': '...',
'T': '-',
'U': '..-',
'V': '...-',
'W': '.--',
'X': '-..-',
'Y': '-.--',
'Z': '--..',
'\\': '.-..-.',
'_': '..--.-',
'~': '.-.-'}

def encode_morse(cws):
    s=[]
    for chr in cws:
        try: # try to find CW sequence from Codebook
            s += Morsecode[chr]
            s += ' '
        except KeyError:
            if chr == ' ':
                s += '_'
                continue
            print "error: '%s' not in Codebook" % chr
    return ''.join(s)

def len_dits(cws):
    # length of string in dit units, include spaces
    val = 0
    for ch in cws:
        if ch == '.': # dit len + el space
            val += 2
        if ch == '-': # dah len + el space
            val += 4
        if ch==' ':   # character space (2 dits on top of the el space)
            val += 2
        if ch=='_':   # word space
            val += 7
    return val

def signal(cw_str,WPM,Tq,sigma):
    # for given CW string i.e. 'ABC '
    # return a pandas dataframe with signals and  symbol probabilities
    # WPM = Morse speed in Words Per Minute (typically 5...50)
    # Tq  = QSB cycle time (typically 3...10 seconds)
    # sigma = adds gaussian noise with standard deviation of sigma to signal
    cws = encode_morse(cw_str)
    #print cws
    # calculate how many milliseconds this string will take at speed WPM
    ditlen = 1200/WPM # dit length in msec, given WPM
    msec = ditlen*(len_dits(cws)+7)  # reserve +7 for the last pause
    t = np.arange(msec)/ 1000.       # time array in seconds
    ix = range(0,msec)               # index for arrays

    # Create a DataFrame and initialize
    col =["t","sig","dit","dah","ele","chr","wrd","spd"]
    P = pd.DataFrame(index=ix,columns=col)
    P.t = t              # keep time
    P.sig=np.zeros(msec) # signal stored here
    P.dit=np.zeros(msec) # probability of 'dit' stored here
    P.dah=np.zeros(msec) # probability of 'dah' stored here
    P.ele=np.zeros(msec) # probability of 'element space' stored here
    P.chr=np.zeros(msec) # probability of 'character space' stored here
    P.wrd=np.zeros(msec) # probability of 'word space' stored here
    P.spd=np.ones(msec)*WPM #speed stored here

    #pre-made arrays with multiple(s) of ditlen
    z = np.zeros(ditlen)
    z2 = np.zeros(2*ditlen)
    z4 = np.zeros(4*ditlen)
    dit = np.ones(ditlen)
    dah = np.ones(3*ditlen)

    # For all dits/dahs in CW string generate the signal, update symbol probabilities
    i = 0
    for ch in cws:
        if ch == '.':
            dur = len(dit)
            P.sig[i:i+dur] = dit
            P.dit[i:i+dur] = dit
            i += dur
            dur=len(z)
            P.sig[i:i+dur] = z
            P.ele[i:i+dur] = np.ones(dur)
            i += dur

        if ch == '-':
            dur = len(dah)
            P.sig[i:i+dur] = dah
            P.dah[i:i+dur]=  dah
            i += dur
            dur=len(z)
            P.sig[i:i+dur] = z
            P.ele[i:i+dur] = np.ones(dur)
            i += dur

        if ch == ' ':
            dur = len(z2)
            P.sig[i:i+dur] = z2
            P.chr[i:i+dur]=  np.ones(dur)
            i += dur
        if ch == '_':
            dur = len(z4)
            P.sig[i:i+dur] = z4
            P.wrd[i:i+dur]=  np.ones(dur)
            i += dur
    if Tq > 0.:  # QSB cycle time impacts signal amplitude
        qsb = 0.5 * sin((1./float(Tq))*t*2*pi) +0.55
        P.sig = qsb*P.sig
    if sigma >0.:
        P.sig += normal(0,sigma,len(P.sig))
    return P
In [13]:
print ('MorseEncoder started')
%matplotlib inline
from matplotlib.pyplot import  plot,show,figure,legend, title
from numpy.random import normal
WPM= 40
Tq = 1.8 # QSB cycle time in seconds (typically 5..10 secs)
sigma = 0.01 # add some Gaussian noise
P = signal('QUICK',WPM,Tq,sigma)
figure(figsize=(12,3))
lb1,=plot(P.t,P.sig,'b',label="sig")
title("QUICK in Morse code - (c) 2015 AG1LE")
legend([lb1])
show()
print ('MorseEncoder finished. %d datapoints created' % len(P.sig))

MorseEncoder started
MorseEncoder finished. 1950 datapoints created

The Jupyter notebook plots this graph, which shows the text 'QUICK' converted to a noisy signal with strong QSB fading.  The signal drops close to zero between the letters C and K, as you can see below.

 Figure 1.  The training signal containing noise and QSB fading

The next section of the code imports some libraries (including Keras) that are used for neural network experimentation. I also prepare the data in the format that Keras requires.
In [14]:
# Time Series Testing - Morse case
import keras.callbacks
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM

import random
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Data preparation
# each of the nb_samples (850) windows uses `examples` (1000) input samples
# to predict the next y_examples (100) target samples
samples = 1950
examples = 1000
y_examples = 100

x = np.linspace(0,1950,samples)
nb_samples = samples - examples - y_examples
data = P.sig

# prepare input for RNN training  - 1 feature
input_list = [np.expand_dims(np.atleast_2d(data[i:examples+i]), axis=0) for i in xrange(nb_samples)]
input_mat = np.concatenate(input_list, axis=0)
lb1,=plot(x,data,label="input")
lb2,=plot(x,P.dit,label="target")
legend([lb1,lb2])
title("training input and target data")
Out[14]:
<matplotlib.text.Text at 0x10c119b50>

This graph shows the training data (the noisy, fading signal) and the target data (I selected 'dits' in this example). This is just to verify that I have the right datasets selected.

 Figure 2.  Training and target data

In the following sections I prepare the training target ('dits') in the proper format and set up the neural network model.  I am using LSTM with Dropout and the model has 300 hidden neurons.  I have also defined a callback function to capture the loss data during the training so that I can plot the loss curve to see the training progress.
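The windowing used for the inputs and targets can be sketched on its own. This is a minimal illustration written for Python 3 and NumPy, not the notebook's code: the random `sig` and `dit` arrays are stand-ins (assumptions) for `P.sig` and `P.dit`, while the sizes are the ones used in the notebook.

```python
import numpy as np

# Sizes taken from the notebook cells.
samples = 1950      # length of the generated signal
examples = 1000     # samples in each input window
y_examples = 100    # samples in each target window
nb_samples = samples - examples - y_examples   # number of windows: 850

# Stand-in arrays (assumption); in the notebook these are P.sig and P.dit.
rng = np.random.default_rng(0)
sig = rng.normal(0.5, 0.1, samples)
dit = (rng.random(samples) > 0.8).astype(float)

# Window i covers sig[i : i+examples]; its target is the next y_examples labels.
input_mat = np.stack([sig[i:i + examples] for i in range(nb_samples)])[:, np.newaxis, :]
target_mat = np.stack([dit[i + examples:i + examples + y_examples]
                       for i in range(nb_samples)])

print(input_mat.shape)   # (850, 1, 1000)
print(target_mat.shape)  # (850, 100)
```

With this layout each training example is a single time step carrying 1000 values, so `features` in the notebook evaluates to 1000.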

In [15]:
# prepare target - the first column in merged dataframe
ydata = P.dit
target_list = [np.atleast_2d(ydata[i+examples:examples+i+y_examples]) for i in xrange(nb_samples)]
target_mat = np.concatenate(target_list, axis=0)

# set up a model
trials = input_mat.shape[0]
features = input_mat.shape[2]
hidden = 300

model = Sequential()
model.compile(loss='mse', optimizer='rmsprop')

# Call back to capture losses
class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))
# Train the model
history = LossHistory()
model.fit(input_mat, target_mat, nb_epoch=100,callbacks=[history])

# Plot the loss curve
plt.plot( history.losses)
title("training loss")

Here I have started the training. I selected 100 epochs, which means that the software goes through the training material 100 times during the training.  As you can see, this goes very quickly; with a larger model or larger datasets the training might take minutes to hours per epoch. We have a very small model and a small dataset here.

Epoch 1/100
850/850 [==============================] - 0s - loss: 0.1050
Epoch 2/100
850/850 [==============================] - 0s - loss: 0.0927
Epoch 3/100
850/850 [==============================] - 0s - loss: 0.0870
Epoch 4/100
850/850 [==============================] - 0s - loss: 0.0823
Epoch 5/100
850/850 [==============================] - 0s - loss: 0.0788
Epoch 6/100
850/850 [==============================] - 0s - loss: 0.0756
Epoch 7/100
850/850 [==============================] - 0s - loss: 0.0724
Epoch 8/100
850/850 [==============================] - 0s - loss: 0.0693
Epoch 9/100
850/850 [==============================] - 0s - loss: 0.0668
Epoch 10/100
850/850 [==============================] - 0s - loss: 0.0639
Epoch 11/100
850/850 [==============================] - 0s - loss: 0.0611
Epoch 12/100
850/850 [==============================] - 0s - loss: 0.0586
Epoch 13/100
850/850 [==============================] - 0s - loss: 0.0561
Epoch 14/100
850/850 [==============================] - 0s - loss: 0.0539
Epoch 15/100
850/850 [==============================] - 0s - loss: 0.0519
Epoch 16/100
850/850 [==============================] - 0s - loss: 0.0495
Epoch 17/100
850/850 [==============================] - 0s - loss: 0.0476
Epoch 18/100
850/850 [==============================] - 0s - loss: 0.0456
Epoch 19/100
850/850 [==============================] - 0s - loss: 0.0441
Epoch 20/100
850/850 [==============================] - 0s - loss: 0.0430
Epoch 21/100
850/850 [==============================] - 0s - loss: 0.0411
Epoch 22/100
850/850 [==============================] - 0s - loss: 0.0400
Epoch 23/100
850/850 [==============================] - 0s - loss: 0.0387
Epoch 24/100
850/850 [==============================] - 0s - loss: 0.0378
Epoch 25/100
850/850 [==============================] - 0s - loss: 0.0370
Epoch 26/100
850/850 [==============================] - 0s - loss: 0.0356
Epoch 27/100
850/850 [==============================] - 0s - loss: 0.0350
Epoch 28/100
850/850 [==============================] - 0s - loss: 0.0340
Epoch 29/100
850/850 [==============================] - 0s - loss: 0.0334
Epoch 30/100
850/850 [==============================] - 0s - loss: 0.0328
Epoch 31/100
850/850 [==============================] - 0s - loss: 0.0322
Epoch 32/100
850/850 [==============================] - 0s - loss: 0.0317
Epoch 33/100
850/850 [==============================] - 0s - loss: 0.0309
Epoch 34/100
850/850 [==============================] - 0s - loss: 0.0302
Epoch 35/100
850/850 [==============================] - 0s - loss: 0.0299
Epoch 36/100
850/850 [==============================] - 0s - loss: 0.0296
Epoch 37/100
850/850 [==============================] - 0s - loss: 0.0290
Epoch 38/100
850/850 [==============================] - 0s - loss: 0.0285
Epoch 39/100
850/850 [==============================] - 0s - loss: 0.0283
Epoch 40/100
850/850 [==============================] - 0s - loss: 0.0277
Epoch 41/100
850/850 [==============================] - 0s - loss: 0.0272
Epoch 42/100
850/850 [==============================] - 0s - loss: 0.0268
Epoch 43/100
850/850 [==============================] - 0s - loss: 0.0265
Epoch 44/100
850/850 [==============================] - 0s - loss: 0.0258
Epoch 45/100
850/850 [==============================] - 0s - loss: 0.0256
Epoch 46/100
850/850 [==============================] - 0s - loss: 0.0253
Epoch 47/100
850/850 [==============================] - 0s - loss: 0.0251
Epoch 48/100
850/850 [==============================] - 0s - loss: 0.0248
Epoch 49/100
850/850 [==============================] - 0s - loss: 0.0246
Epoch 50/100
850/850 [==============================] - 0s - loss: 0.0241
Epoch 51/100
850/850 [==============================] - 0s - loss: 0.0236
Epoch 52/100
850/850 [==============================] - 0s - loss: 0.0233
Epoch 53/100
850/850 [==============================] - 0s - loss: 0.0234
Epoch 54/100
850/850 [==============================] - 0s - loss: 0.0230
Epoch 55/100
850/850 [==============================] - 0s - loss: 0.0229
Epoch 56/100
850/850 [==============================] - 0s - loss: 0.0224
Epoch 57/100
850/850 [==============================] - 0s - loss: 0.0223
Epoch 58/100
850/850 [==============================] - 0s - loss: 0.0218
Epoch 59/100
850/850 [==============================] - 0s - loss: 0.0218
Epoch 60/100
850/850 [==============================] - 0s - loss: 0.0215
Epoch 61/100
850/850 [==============================] - 0s - loss: 0.0215
Epoch 62/100
850/850 [==============================] - 0s - loss: 0.0212
Epoch 63/100
850/850 [==============================] - 0s - loss: 0.0208
Epoch 64/100
850/850 [==============================] - 0s - loss: 0.0209
Epoch 65/100
850/850 [==============================] - 0s - loss: 0.0207
Epoch 66/100
850/850 [==============================] - 0s - loss: 0.0205
Epoch 67/100
850/850 [==============================] - 0s - loss: 0.0203
Epoch 68/100
850/850 [==============================] - 0s - loss: 0.0200
Epoch 69/100
850/850 [==============================] - 0s - loss: 0.0200
Epoch 70/100
850/850 [==============================] - 0s - loss: 0.0197
Epoch 71/100
850/850 [==============================] - 0s - loss: 0.0197
Epoch 72/100
850/850 [==============================] - 0s - loss: 0.0198
Epoch 73/100
850/850 [==============================] - 0s - loss: 0.0193
Epoch 74/100
850/850 [==============================] - 0s - loss: 0.0191
Epoch 75/100
850/850 [==============================] - 0s - loss: 0.0189
Epoch 76/100
850/850 [==============================] - 0s - loss: 0.0188
Epoch 77/100
850/850 [==============================] - 0s - loss: 0.0189
Epoch 78/100
850/850 [==============================] - 0s - loss: 0.0185
Epoch 79/100
850/850 [==============================] - 0s - loss: 0.0185
Epoch 80/100
850/850 [==============================] - 0s - loss: 0.0184
Epoch 81/100
850/850 [==============================] - 0s - loss: 0.0183
Epoch 82/100
850/850 [==============================] - 0s - loss: 0.0181
Epoch 83/100
850/850 [==============================] - 0s - loss: 0.0180
Epoch 84/100
850/850 [==============================] - 0s - loss: 0.0179
Epoch 85/100
850/850 [==============================] - 0s - loss: 0.0177
Epoch 86/100
850/850 [==============================] - 0s - loss: 0.0177
Epoch 87/100
850/850 [==============================] - 0s - loss: 0.0174
Epoch 88/100
850/850 [==============================] - 0s - loss: 0.0177
Epoch 89/100
850/850 [==============================] - 0s - loss: 0.0175
Epoch 90/100
850/850 [==============================] - 0s - loss: 0.0173
Epoch 91/100
850/850 [==============================] - 0s - loss: 0.0172
Epoch 92/100
850/850 [==============================] - 0s - loss: 0.0171
Epoch 93/100
850/850 [==============================] - 0s - loss: 0.0171
Epoch 94/100
850/850 [==============================] - 0s - loss: 0.0167
Epoch 95/100
850/850 [==============================] - 0s - loss: 0.0167
Epoch 96/100
850/850 [==============================] - 0s - loss: 0.0170
Epoch 97/100
850/850 [==============================] - 0s - loss: 0.0164
Epoch 98/100
850/850 [==============================] - 0s - loss: 0.0166
Epoch 99/100
850/850 [==============================] - 0s - loss: 0.0163
Epoch 100/100
850/850 [==============================] - 0s - loss: 0.0164
Out[15]:
<matplotlib.text.Text at 0x11e055350>

The following graph shows the training loss during the training process. This gives you an idea whether the training is progressing well or if you have some problem with the model or the parameters.
 Figure 3.  Training loss curve

In [16]:
# Use training data to check prediction
predicted = model.predict(input_mat)
In [17]:
# Plot original data (green) and predicted data (red)
lb1,=plot(data,'g',label="training")
#lb2,=plot(ydata,'b',label="target")
lb3,=plot(xrange(examples,examples+nb_samples), predicted[:,1],'r',label="predicted")
legend([lb1,lb3])
title("training vs. predicted")
Out[17]:
<matplotlib.text.Text at 0x11f164610>

In this section I check the model's predictions. Since I am using the training material, this should show a good result if the training was successful.  As you can see from Figure 4 below, the predicted graph (red) is aligned with the 'dits' in the training signal (green) despite the QSB fading and noise in the signal.
 Figure 4.  Training vs. predicted graph

In the following section I will create another Morse signal, this time with text 'KCIUQ' but using the same noise, QSB and speed parameters.  I am planning to use this signal to validate how well the model has generalized the 'dit' concept.

In [18]:
# Let's change the input signal, instead of QUICK we have KCIUQ in Morse code
P = signal('KCIUQ',WPM,Tq,sigma)
data = P.sig

# prepare input - 1 feature
input_list = [np.expand_dims(np.atleast_2d(data[i:examples+i]), axis=0) for i in xrange(nb_samples)]
input_mat = np.concatenate(input_list, axis=0)
plt.plot(x,data)
Out[18]:
[<matplotlib.lines.Line2D at 0x136050f90>]

Here is the generated validation Morse signal.  It has the same letters as before but in reverse order. Can you read the letters 'KCIUQ' from the graph below?

 Figure 5.  Validation Morse signal

In this section I use the above validation signal to create a prediction and then plot the results.

In [19]:
predicted = model.predict(input_mat)
plt.plot(data,'g')
plt.plot(xrange(examples,examples+nb_samples), predicted[:,1],'r')
Out[19]:
[<matplotlib.lines.Line2D at 0x1217be9d0>]

As you can see from the graph below, the predicted 'dit' symbols (red) don't really line up with the actual 'dits' in the signal (green). This is not a surprise to me.  To build a good model that can generalize what it has learned, you need a lot of training material (typically millions of datapoints), and the model needs enough neural nodes to capture the details of the underlying signals.
In this simple experiment I had only 1950 datapoints and 300 hidden nodes. There are only 8 'dit' symbols in the training material. Learning the CW skill well requires a lot more material and many repetitions, as any human who has gone through the process can testify. The same principle applies to neural networks.
 Figure 6.  Validation test
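The visual comparison above could also be put into numbers. The following is a sketch written for Python 3 and NumPy, not part of the original notebook: the helper name `dit_detection_scores`, the 0.5 threshold and the toy arrays are all assumptions chosen for illustration.

```python
import numpy as np

def dit_detection_scores(predicted, actual, threshold=0.5):
    """Return (precision, recall) of thresholded predictions against 0/1 labels."""
    pred_on = np.asarray(predicted) >= threshold   # samples the model calls 'dit'
    true_on = np.asarray(actual) >= 0.5            # samples that really are 'dit'
    hits = np.sum(pred_on & true_on)
    precision = hits / pred_on.sum() if pred_on.any() else 0.0
    recall = hits / true_on.sum() if true_on.any() else 0.0
    return float(precision), float(recall)

# Toy data: the prediction fires one sample late and once spuriously.
actual = np.array([0, 1, 1, 1, 0, 0, 0, 0])
predicted = np.array([0.1, 0.2, 0.9, 0.8, 0.7, 0.1, 0.6, 0.0])
precision, recall = dit_detection_scores(predicted, actual)
print(precision, recall)
```

In the notebook the same function could be applied to one column of `predicted` and the matching slice of `P.dit`, for both the training and the validation signal.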

### CONCLUSIONS

In this experiment I built a proof of concept to test whether Recurrent Neural Networks (especially the LSTM variant) could be used to learn to detect symbols from noisy Morse code that has deep QSB fading.  This experiment may contain errors and misunderstandings on my part, as I have only had a few hours to play with the Keras neural network framework. The concept itself also still needs more validation, as I may have used the framework incorrectly.

I think that the results look quite promising.  In only 100 epochs the RNN model learned 'dits' from the noisy signal and was able to separate them from 'dah' symbols.  As the validation test shows, I overfitted the model to the small sample of training material used in the experiment.  It will take much more training data and a larger, more complex neural network to learn to generalize the symbols in Morse code.  The training process may also need more computing capacity; a graphics card with a GPU might help speed up training going forward.

Any comments or feedback?

73
Mauri AG1LE

## Sunday, November 22, 2015

### INTRODUCTION

In my previous post I shared an experiment I did using Recurrent Neural Network (RNN) software.  I started thinking that perhaps RNNs could learn not just the QSO language concepts but also learn how to decode Morse code from noisy signals. Since I was able to demonstrate learning of the syntax, structure and commonly used phrases in QSOs just in 50 epochs after going through the training material, wouldn't the same concept work for actual Morse signals?

Well, I don't really have any suitable training materials to test this. For the Kaggle competitions (MLMv1, MLMv2) I created a lot of training materials, but the focus of these materials was different. The audio files and corresponding transcript files were open ended, as I didn't want to narrow down the possible approaches that participants might take. The materials were designed with a Kaggle competition in mind, to make it possible to score participants' solutions.

In machine learning you typically have training & validation material that has many different dimensions and a target variable (or variables) you are trying to model. With neural networks you can train the network to look for patterns in the input(s) and set the outputs to target values when the input pattern is detected. With RNNs you can introduce a memory function - this is necessary because you need to remember signal values from the past to properly decode the Morse characters.

In Morse code you typically have just one signal variable, and the goal is to extract the decoded message from that signal. This could be done by having, for example, 26 outputs, one for each letter of the alphabet, and training the network to set output 'A' high when the pattern '.-' is detected in the signal input. Alternatively you could have output lines for symbols like 'dit', 'dah' and 'element space' that are set high when the corresponding pattern is detected in the input signal.
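The first of these two output layouts amounts to a one-hot target vector per character. Here is a small sketch for Python 3 and NumPy; the names `LETTERS`, `SYMBOLS` and `char_target` are hypothetical and only serve the illustration.

```python
import string
import numpy as np

LETTERS = string.ascii_uppercase                  # 'A'..'Z', 26 output lines
SYMBOLS = ['dit', 'dah', 'ele', 'chr', 'wrd']     # the alternative: one line per symbol

def char_target(ch):
    """One-hot target vector: the output line for `ch` is high, the rest low."""
    target = np.zeros(len(LETTERS))
    target[LETTERS.index(ch)] = 1.0
    return target

a = char_target('A')
print(a.argmax(), a.sum())
```

The symbol layout needs only a handful of output lines, each of which can be plotted directly against the input signal.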

Since a well-working Morse decoder has to deal with different speeds (typically 5 ... 50 WPM), signals containing noise and QSB fading, and other factors, I decided to create Morse Encoder software that creates artificial training signals as well as the corresponding symbols, speed information etc. I chose this symbols approach because it is easier to debug errors and problems when you can plot the inputs vs. outputs graphically. See this Wikipedia article for details about representation, timing of symbols and speed.
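The timing rules can be checked with a little arithmetic. The sketch below (Python 3) uses the dit length of 1200/WPM milliseconds that the encoder uses, counts one dit of element space after every dit or dah, two more dits after every character, and the encoder's trailing pause of 7 dits. The `MORSE` table is cut down to the letters of 'QUICK' and the function name `duration_ms` is an assumption.

```python
# Morse patterns for the letters used in this example only.
MORSE = {'Q': '--.-', 'U': '..-', 'I': '..', 'C': '-.-.', 'K': '-.-'}

def duration_ms(text, wpm):
    """Length of `text` in milliseconds at `wpm` (letters only, no word spaces)."""
    ditlen = 1200 // wpm                  # dit length in ms
    units = 0
    for ch in text:
        for element in MORSE[ch]:
            # mark (1 or 3 dits) plus a one-dit element space
            units += 2 if element == '.' else 4
        units += 2                        # extends the gap to a 3-dit character space
    units += 7                            # trailing pause reserved by the encoder
    return ditlen * units

print(duration_ms('QUICK', 40))   # 1950
```

At 40 WPM a dit is 30 ms, and 'QUICK' takes 65 dit units including the trailing pause, which gives 1950 samples at one sample per millisecond.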

The Morse Encoder generates a set of time-synchronized signals and can also add QSB-type fading effects and Gaussian noise. See the example of 'QUICK BROWN FOX JUMPED OVER THE LAZY FOX ' in Figure 1 below, plotted with deep QSB fading with a 4 second cycle time and Gaussian noise with a sigma of 0.01 added.
 Fig 1. Morse Encoder output signal with QSB and noise

The QSB for real-life signals doesn't always follow a sin() curve like in Fig 1, but as you can see from the example below this is close enough. The big challenge is how to continue decoding correctly when the signal goes down to the noise level, as shown between 12000 and 14000 time samples (horizontal axis) below.
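The fading model itself is short enough to try in isolation. This sketch for Python 3 and NumPy uses the same envelope formula as MorseEncoder.py, 0.5*sin(2*pi*t/Tq) + 0.55; the function name `apply_qsb` and the constant carrier used as input are assumptions made for the illustration.

```python
import numpy as np

def apply_qsb(keyed, t, Tq, sigma, rng):
    """Scale a 0/1 keyed signal by a sinusoidal fade and add Gaussian noise."""
    envelope = 0.5 * np.sin(2 * np.pi * t / Tq) + 0.55
    noisy = envelope * keyed + rng.normal(0.0, sigma, len(keyed))
    return noisy, envelope

t = np.arange(4000) / 1000.0          # 4 seconds at 1 ms per sample
keyed = np.ones_like(t)               # carrier held on, to expose the fade
rng = np.random.default_rng(1)
faded, envelope = apply_qsb(keyed, t, Tq=4.0, sigma=0.01, rng=rng)

print(envelope.min(), envelope.max())   # roughly 0.05 and 1.05
```

At the bottom of the fade the envelope is only about 0.05, so with larger noise levels the signal can disappear into the noise.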

### TRAINING MATERIALS

To provide proper target values for RNN training, the Morse Encoder creates a pandas DataFrame with the following columns defined:

P.t    # keep time
P.sig  # signal stored here
P.dit  # probability of 'dit' stored here
P.dah  # probability of 'dah' stored here
P.ele  # probability of 'element space' stored here
P.chr  # probability of 'character space' stored here
P.wrd  # probability of 'word space' stored here
P.spd  # WPM speed stored here

Using these columns, the Morse Encoder takes the given text and parameters and generates values for these columns. For example, when there is a 'dit' in the signal, P.dit has a probability of 1.0 on the corresponding rows. Likewise, if there is a 'dah' in the signal, P.dah has a probability of 1.0 on the corresponding rows. This is shown in Figure 2 below - dits are red and dahs are green, while the signal is shown in blue.

 Fig 2.  Dit and Dah probabilities

Zoomed section of letters 'QUI ' is shown on Fig 3. below.

 Fig 3. Zoomed section

Likewise we create probabilities for the spaces. In Figure 4 below element space is shown with magenta and character space with cyan color. I decided to set character space to probability 1.0 only after element space has passed, as can be seen from the graph.

 Fig 4. Element Space and Character Space
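The labelling scheme can be sketched independently of the encoder. The following Python 3 / NumPy example builds the dit, dah, element-space and character-space columns for a Morse string, with the character space starting only after the element space has passed. The function name `symbol_labels` is hypothetical, and word spaces are left out for brevity.

```python
import numpy as np

def symbol_labels(code, ditlen):
    """Per-sample 0/1 label arrays for a Morse string such as '--.- ..- '."""
    marks = {'.': ('dit', 1), '-': ('dah', 3)}
    names, lengths = [], []
    for ch in code:
        if ch in marks:
            name, n = marks[ch]
            names += [name, 'ele']             # mark, then a one-dit element space
            lengths += [n * ditlen, ditlen]
        elif ch == ' ':
            names.append('chr')                # begins after the element space
            lengths.append(2 * ditlen)
    total = sum(lengths)
    labels = {k: np.zeros(total) for k in ('dit', 'dah', 'ele', 'chr')}
    i = 0
    for name, n in zip(names, lengths):
        labels[name][i:i + n] = 1.0
        i += n
    return labels

labels = symbol_labels('--.- ', ditlen=30)     # the letter Q at 40 WPM
print({k: int(v.sum()) for k, v in labels.items()})
```

Exactly one label is high for every sample, which makes it easy to plot each column against the signal.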

The resulting DataFrame can be saved into a CSV file with a simple Python command and it is very easy to manipulate or plot graphs. Conceptually it is like an Excel spreadsheet - see below:

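|   | t | sig | dit | dah | ele | chr | wrd | spd |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.000 | 0.573355 | 0 | 1 | 0 | 0 | 0 | 40 |
| 1 | 0.001 | 0.531865 | 0 | 1 | 0 | 0 | 0 | 40 |
| 2 | 0.002 | 0.554412 | 0 | 1 | 0 | 0 | 0 | 40 |
| 3 | 0.003 | 0.551539 | 0 | 1 | 0 | 0 | 0 | 40 |
| 4 | 0.004 | 0.536430 | 0 | 1 | 0 | 0 | 0 | 40 |
| 5 | 0.005 | 0.561438 | 0 | 1 | 0 | 0 | 0 | 40 |
| 6 | 0.006 | 0.561170 | 0 | 1 | 0 | 0 | 0 | 40 |
| 7 | 0.007 | 0.546326 | 0 | 1 | 0 | 0 | 0 | 40 |
| 8 | 0.008 | 0.562902 | 0 | 1 | 0 | 0 | 0 | 40 |
| 9 | 0.009 | 0.533140 | 0 | 1 | 0 | 0 | 0 | 40 |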
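Saving and reloading such a DataFrame takes one call each way. This sketch (Python 3, pandas) uses a tiny made-up DataFrame with only some of the columns and an in-memory buffer instead of a file; the values are invented for illustration.

```python
import io
import numpy as np
import pandas as pd

# A tiny stand-in for the encoder's DataFrame (values are made up).
P = pd.DataFrame({
    't':   np.arange(5) / 1000.0,
    'sig': [0.57, 0.53, 0.55, 0.55, 0.54],
    'dah': [1.0] * 5,
    'spd': [40.0] * 5,
})

buffer = io.StringIO()
P.to_csv(buffer)                      # pass a file name here to write to disk instead
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)

print(restored.head())
```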

The Morse Encoder software is open source and available on GitHub as MorseEncoder.py.

### NEXT STEPS

Now that I have the capability to create proper training material automatically from a few parameters, like speed (WPM), fading (QSB) and noise level (sigma), it is a trivial exercise to produce large quantities of these training files.
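One way to organize such a batch is to enumerate every combination of the three parameters. This is only a sketch in Python 3: the parameter ranges and the file naming scheme are hypothetical, and the call to `signal()` from MorseEncoder.py is left as a comment so that the example runs on its own.

```python
from itertools import product

# Hypothetical parameter ranges; pick values that match the signals you expect.
speeds = [10, 20, 30, 40]          # WPM
qsb_cycles = [0.0, 4.0, 8.0]       # Tq in seconds; 0 disables fading in the encoder
noise_levels = [0.01, 0.05]        # sigma

jobs = []
for wpm, tq, sigma in product(speeds, qsb_cycles, noise_levels):
    filename = "morse_wpm%d_tq%.0f_sigma%.2f.csv" % (wpm, tq, sigma)
    jobs.append((wpm, tq, sigma, filename))
    # With MorseEncoder.py loaded, each job would be something like:
    # signal(text, wpm, tq, sigma).to_csv(filename)

print(len(jobs))   # 24
```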

My next focus area is to learn more about Recurrent Neural Networks (especially LSTM variants) and experiment with different network configurations. The goal is to find an RNN configuration that is able to learn how to model the symbols correctly, even in the presence of noise and QSB, or at different speeds.

73
AG1LE