Showing posts with label Morse decoder. Show all posts

Sunday, August 24, 2014

Cortical Learning Algorithm for Morse code - part 1

Cortical Learning Algorithm Overview 

Humans can perform many tasks that computers currently cannot. For example, understanding spoken language in a noisy environment, walking down a path in complex terrain, or winning the CQWW WPX CW contest are tasks currently not feasible for computers (and they might be difficult for humans, too).
Despite decades of machine learning and artificial intelligence research, we have few viable algorithms for achieving human-like performance on a computer. Morse decoding at the level of the best human operators would be a good target for testing these new algorithms.

Numenta Inc. has developed a technology called the Cortical Learning Algorithm (CLA) that was recently made available as the open source project NuPIC. This software provides an online learning system that learns from every data point fed into it. The CLA is constantly making predictions, which are continually verified as more data arrives; as the underlying patterns in the data change, the CLA adjusts accordingly. The CLA uses Sparse Distributed Representations (SDRs) in a similar fashion to how the neocortex in the human brain stores information. SDRs have many advantages over traditional ways of storing memories, such as the ability to associate and retrieve information using noisy data.

A detailed description of how the CLA works can be found in this whitepaper.

Experiment 

To learn more about how the CLA works I decided to start with a very simple experiment. I created a Python script that uses the Morse code book and calculates a Sparse Distributed Representation (SDR) for each character. Figure 1 below shows the Morse alphabet and numbers 0...9 converted to SDRs.

Fig 1. SDR for Morse characters A...Z, 0...9


NuPIC requires the input vector to be a binary representation of the input signal. I created a function that packs "dits" and "dahs" into a 36x1 vector; see the two examples below. Each "dit" is represented as 1. followed by 0., and each "dah" is represented by 1. 1. 1. followed by 0., to accommodate the 1:3 timing ratio between "dit" and "dah". This preserves the semantic structure of Morse code, which is important from a sequence learning perspective.

H ....
[ 1.  0.  1.  0.  1.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

O ---
[ 1.  1.  1.  0.  1.  1.  1.  0.  1.  1.  1.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

The Spatial Pooler uses a 64 x 64 vector, giving an SDR of size 4096. When the SDR is computed, a sparse set of seemingly random bits becomes active on this vector. I plotted all active cells (value = 1) across columns 0...4096 for each letter and number, as displayed in Fig 1. above. The respective character is shown in the rightmost column.

To better see the relationships between the SDRs and the Morse character set I created another SDR map with the letters 'EISH5' and 'TMO0' next to each other. These consecutive letters and numbers differ from each other by only one "dit" or one "dah". See Fig 2. for an SDR visualization of these characters.

There are no obvious visible patterns across these Morse characters; all values look quite different. Page 21 of the Numenta CLA whitepaper says: "Imagine now that the input pattern changes. If only a few input bits change, some columns will receive a few more or a few less inputs in the “on” state, but the set of active columns will not likely change much. Thus similar input patterns (ones that have a significant number of active bits in common) will map to a relatively stable set of active columns."

This doesn't seem to apply in these experiments, so I need to investigate a bit further.


Fig 2. SDR for Morse characters EISH5 and TMO0

I did another experiment, reducing the SDR size to only 16x16, i.e. 256 cells per SDR. In Fig 3. it is now easier to see common patterns between similar characters - for example, compare C with K and Y. These letters have 3 common cells active.
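A quick way to quantify that similarity is to count the active cells two SDRs share. A minimal sketch, where the 16-cell "SDRs" are made-up sets of active column indices for illustration, not real NuPIC output:

```python
def sdr_overlap(a, b):
    """Count active cells shared by two SDRs, given as sets of
    active column indices."""
    return len(set(a) & set(b))

# Toy "SDRs" (hypothetical active columns, not computed by NuPIC):
c = {1, 4, 7, 12}
k = {1, 4, 9, 12}
print(sdr_overlap(c, k))  # 3 common active cells
```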

Fig 3. SDR  map with reduced size 16x16 = 256 cells

The Python software to create the SDR pictures is below:

"""A simple program that demonstrates the working of the spatial pooler"""
import numpy as np
from matplotlib import pyplot as plt
from random import randrange, random
from nupic.research.spatial_pooler import SpatialPooler as SP


CODE = {
    ' ': '',
    'A': '.-',
    'B': '-...',
    'C': '-.-.', 
    'D': '-..',
    'E': '.',
    'F': '..-.',
    'G': '--.',    
    'H': '....',
    'I': '..',
    'J': '.---',
    'K': '-.-',
    'L': '.-..',
    'M': '--',
    'N': '-.',
    'O': '---',
    'P': '.--.',
    'Q': '--.-',
    'R': '.-.',
    'S': '...',
    'T': '-',
    'U': '..-',
    'V': '...-',
    'W': '.--',
    'X': '-..-',
    'Y': '-.--',
    'Z': '--..',
    '0': '-----',
    '1': '.----',
    '2': '..---',
    '3': '...--',
    '4': '....-',
    '5': '.....',
    '6': '-....',
    '7': '--...',
    '8': '---..',
    '9': '----.' }



class Morse():
  
  def __init__(self, inputShape, columnDimensions):
    """
     Parameters:
     ----------
     _inputShape : The size of the input. (m,n) will give a size m x n
     _columnDimensions : The size of the 2 dimensional array of columns
     """
    self.inputShape = inputShape
    self.columnDimensions = columnDimensions
    self.inputSize = np.array(inputShape).prod()
    self.columnNumber = np.array(columnDimensions).prod()
    self.inputArray = np.zeros(self.inputSize)
    self.activeArray = np.zeros(self.columnNumber)

    self.sp = SP(self.inputShape,
                 self.columnDimensions,
                 potentialRadius = self.inputSize,
                 numActiveColumnsPerInhArea = int(0.02*self.columnNumber),
                 globalInhibition = True,
                 synPermActiveInc = 0.01
                 )

    
  def createInputVector(self,elem,chr):
        
    print "\n\nCreating a Morse codebook input vector for character: " + chr + " " + elem
    
    #clear the inputArray to zero before creating a new input vector
    self.inputArray[0:] = 0
    j = 0
    i  = 0

    while j < len(elem) :
      if elem[j] == '.' :
        self.inputArray[i] = 1
        self.inputArray[i+1] = 0
        i = i + 2
      if elem[j] == '-' :
        self.inputArray[i] = 1
        self.inputArray[i+1] = 1
        self.inputArray[i+2] = 1
        self.inputArray[i+3] = 0
        i = i + 4
      j = j + 1
               
    print self.inputArray
     
     
  def createSDRs(self,row,x,y,ch):
    """Run the spatial pooler with the input vector"""
    print "\nComputing the SDR for character: " + ch
    
    #activeArray[column]=1 if column is active after spatial pooling
    self.sp.compute(self.inputArray, True, self.activeArray)
    
    # plot each row above previous one
    z = self.activeArray * row

    # Plot the SDR vectors - these are 4096 columns in the plot, with active cells visible
    plt.plot(x,y,z,'o')
    plt.text(4120,row-0.5,ch,fontsize=14);
    print self.activeArray.nonzero()
    
    
# Testing NuPIC for Morse decoding  
# Create SDRs from Morse Codebook input vectors
print "\n \nCreate SDRs from Morse Codebook input vectors"

# vector size 36x1 for input,  64x64 = 4096 for SDR 
example = Morse((36, 1), (64, 64))

x,y = np.meshgrid(np.linspace(1,1,4096), np.linspace(1,1,32))
plt.ylim([0,38])
plt.xlim([0,4170])

# Select the characters to be converted to SDRs
#str = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
str = 'EISH5 TMO0'
row = 1
for ch in str:
  example.createInputVector(CODE[ch],ch)
  example.createSDRs(row,x,y,ch)
  row = row +1

plt.show()
plt.clf()



Conclusions

The CLA is a promising new technology that is now open for ham radio experimenters to start tinkering with. Building a CLA-based Morse decoder would be a good performance benchmark for the technology. There are plenty of existing Morse decoders to compare against, and it is fairly easy to test decoder accuracy under noisy conditions. There is also plenty of audio test material available, including streaming sources like WebSDR stations.

73
Mauri 




Friday, July 25, 2014

New Morse Decoder - part 6

Multichannel CW decoder 

Development of the Bayesian CW decoder is moving forward. Many thanks to Dave W1HKJ, there is now an alpha build available for the Windows platform as well. The v3.21.83cw-a4 tarball sources and Windows version are available at http://www.w1hkj.com/ag1le/

This version still has multiple problems that need to be fixed. In Fig 1. below I have a screenshot with the multichannel signal browser and the Fldigi configuration screen for Modems / CW visible. I am running the Windows FLDIGI version v3.21.83cw-a4 connected to PowerSDR v2.6.4 and my Flex3000 radio.

The following description explains the options and expected behavior in this alpha software version. Things are not yet well optimized, so you are likely to see a lot of misdecoded signals. I am interested in getting feedback and improvement ideas to make the Bayesian decoder better.

The "Bayesian decoding" checkbox enables multichannel operation. If you have the Signal Browser open you can slide the horizontal control at the bottom to adjust the signal decoding threshold. With a lower threshold weaker CW stations can be decoded, but often you get just noise and the decoder produces garbage, as visible in Fig 1. The software detects peaks in the spectrum and starts a new decoder instance based on each detected peak signal frequency. It tracks each instance on a channel, which is the audio signal frequency rounded to the nearest 100 Hz. The number of channels and the timeout value can be set in the Configure/User Interface/Browser menu.

If two stations are on nearby frequencies less than 20 Hz apart, one of them is eliminated. This is done to reduce the impact of frequency splatter - otherwise one strong CW station would spin up decoders on multiple channels. Also, in this version this process is done for every FFT row in the waterfall display. Because I am not doing any kind of averaging yet, the detected peak signal center frequency may be incorrect and the decoder therefore tuned to the wrong frequency. With a narrow filter bandwidth setting this may cause garbage in the decoder output.
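The peak-elimination and channel-rounding logic described above could be sketched roughly as follows. This is my illustration of the idea, not the actual FLDIGI code; in particular, keeping the stronger of two close peaks is an assumption:

```python
def assign_channels(peaks, min_separation=20.0, channel_step=100.0):
    """Toy sketch: keep the strongest of any peaks closer than
    min_separation Hz, then map survivors to 100 Hz channels.
    `peaks` is a list of (frequency_hz, magnitude) tuples."""
    survivors = []
    for freq, mag in sorted(peaks, key=lambda p: -p[1]):  # strongest first
        if all(abs(freq - f) >= min_separation for f, _ in survivors):
            survivors.append((freq, mag))
    # round each surviving peak to its nearest 100 Hz channel
    return {round(freq / channel_step) * channel_step: freq
            for freq, mag in survivors}

chans = assign_channels([(612.0, 9.0), (618.0, 3.0), (893.0, 5.0)])
print(chans)  # {600.0: 612.0, 900.0: 893.0} - the 618 Hz splatter is dropped
```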

Fig 1. Multichannel CW decoder 

In this version each decoder instance uses the same filter bandwidth, which is set manually in the Configure/Modem/CW/General tab. This means that the Bayesian decoder does not have optimal, speed-dependent filtering: for faster stations the bandwidth should be larger, and for slow stations it can be narrow.
Decoding accuracy therefore suffers if multiple stations are operating at different speeds. Once again this functionality can be improved, as the Bayesian decoder does provide a speed estimate automatically, but I haven't had time to implement the automatic "matched filter" feature yet. The filter bandwidth is taken from the "Filter bandwidth" slider value for all Bayesian decoder instances.

Regarding receive speed estimation, the Rx WPM value is updated only for the CW signal selected on the waterfall. You can also observe that with the Bayesian decoder enabled the speed display is updated much more frequently than with the legacy CW decoder. The Bayesian decoder calculates a speed estimate every 5 msec and provides a WPM value whenever there is a state change (mark -> space or space -> mark) in the signal. Sometimes the speed estimator gets "stuck" at the lower or upper extreme. You can adjust the "Lower limit" or "Upper limit" in the CW / General Transmit section - this gives the speed estimator a hint to re-evaluate the speed. Obviously, if the speed estimate is incorrect you will get a lot of garbage text out of the decoder.

The tracking feature is not implemented properly yet for the Bayesian decoder. This means that if you start to transmit, your speed may be different from that of the station you were listening to. TX WPM is visible in the configuration screen.

Once again, this is an alpha release provided in order to get feedback and improvement ideas from FLDIGI users. You can provide feedback by submitting comments below or by sending me email at ag1le at innomore dot com. It would be very helpful if you could provide audio samples (WAV files can be recorded using the File / Audio / RX Capture feature of FLDIGI), a screenshot of the CW parameter settings you are using, and a general description of the problem or issue you discovered.

If you are interested in software development I would be very grateful for some additional help. Building a Bayesian Morse decoder has been a great learning experience in signal processing, machine learning algorithms and probability theory. There are plenty of problems to solve in this space as we build more intelligent software for using Morse code, the international language of hams.

73
Mauri AG1LE








Thursday, July 17, 2014

New Morse Decoder - part 5

Multichannel CW decoder for FLDIGI

I have been working on the Bayesian Morse decoder for a while.  The latest effort was focused on making it possible to automatically detect all CW signals in the audio band and spin up a new instance of the Bayesian decoder for each detected signal.

Figure 1. shows a running version implemented on top of FLDIGI version 3.21.77. The waterfall shows 9 CW signals from 400 Hz to 1200 Hz. The software uses the FLDIGI Signal Browser user interface, and you can set the signal threshold using the slider bar below the signal browser window. The user interface works very much like the PSK or RTTY browser.

Figure 1. Multichannel CW decoder for FLDIGI
The audio file in this demo was created using the Octave script below:

Fs = 8000;  % sampling frequency - 8 kHz
Ts = 10*Fs; % total number of samples - 10 seconds


% create 9 different parallel Morse sessions - 10 seconds each at 20-35 WPM speed
%          TEXT               audio file  noise  Hz    Fs  WPM  samples
x1=morse('CQ TEST DE AG1LE', 'cw1.wav',  20,    1200, Fs, 20,  Ts);
x2=morse('TEST DE SP3RQ CQ', 'cw2.wav',  15,    1100, Fs, 35,  Ts);
x3=morse('DE W3RQS CQ TEST', 'cw3.wav',  20,    1000, Fs, 30,  Ts);
x4=morse('SM0LXW CQ TEST DE','cw4.wav',  15,     900, Fs, 25,  Ts);
x5=morse('CQ TEST DE HS1DX', 'cw5.wav',  20,     800, Fs, 20,  Ts);
x6=morse('TEST DE JA1DX CQ', 'cw6.wav',  10,     700, Fs, 20,  Ts);
x7=morse('DE JA2ATA CQ TEST','cw7.wav',  20,     600, Fs, 20,  Ts);
x8=morse('UA2HH CQ TEST DE', 'cw8.wav',  15,     500, Fs, 20,  Ts);
x9=morse('CQ TEST DE CT1CX', 'cw9.wav',  20,     400, Fs, 20,  Ts);


% weighted sum - merge all the audio streams together 
% 9 signals arranged in frequency order 1200Hz ... 400Hz
y = 0.1*x1 + 0.1*x2 + 0.1*x3 + 0.1*x4 + 0.1*x5 + 0.1*x6 + 0.1*x7 + 0.1*x8 + 0.1*x9;


% write to cwcombo.wav file 
wavwrite(y,Fs,'cwcombo.wav');

I have saved the full sources of this experimental FLDIGI version on GitHub: FLDIGI source
UPDATE July 20, 2014: I rebased this using Source Forge latest branch  - new version is here: fldigi-3.22.0CN.tar.gz

Let me know if you are interested in testing this software. I would be interested in getting feedback on scalability and any performance problems, as well as how well the CW decoder works with real-life signals.

73
Mauri AG1LE

Wednesday, June 11, 2014

New Morse Decoder - Part 4

In the previous blog entry I shared some test results of the new experimental Bayesian Morse decoder. Thanks to the alpha testers I found the bug that was causing the sensitivity to signal amplitudes and causing overflows. I have fixed this bug in the software over the last few months.

The CQ WW WPX CW contest was held on May 24-25 and presented a good opportunity to put the new FLDIGI software version to the test. I worked some stations and left the software running for 48 hours to test stability.

In figure 1 the first 2½ lines are decoded using the new Bayesian CW decoder. The following 2 lines are the same audio material decoded using the legacy CW decoder. The CW settings are also visible in the picture. The matched filter bandwidth was set to 50 Hz based on the Rx speed of 32 WPM.

The legacy decoder is doing a pretty good job following ZF1A working the CW contest at 7005.52 kHz. At first look it appears that the new Bayesian decoder is having a lot of difficulties. Let's have a closer look at what is really going on here.

Figure 1.  FLDIGI  CW Decoder testing 

In the audio recording  ZF1A made 5 contacts, with VE1RGB, UR4MF, KP2M, SM6BZV and WA2MCR in 1 min 31 seconds.

I let the captured audio file run in a loop twice using both FLDIGI CW decoder versions. The decoded text is shown below, broken into separate lines by each QSO for clarity.

FLDIGI - the new experimental Bayesian Morse decoder: 
RN 512 X 
ZF1A N VE1RGB 5NN513 X 
ZF1A --R4MF 5NN 0MO0 N T E E E E E E E TU
 ------TM N T E XJ0TT 6NN 515 X 

ZF1A N QT SM6BZV 5NM516 X 
ZF1A N WA2MCR 5NN 517 
NN 5 --2 B
ZF1A N VE1RGB 5MK 613 X 
ZF1A N KR4MF 5NN 514 X 
ZF1A N KP2M 6NN 515 TU 
ZF1A N OT SM6BZV 5NN 516 X
ZH1A N WA2MCR 5NN 517 

The first line should have been decoded as NN 512 TU, but the Bayesian decoder misdecoded the last 2 letters: what was originally TU was decoded as X.

Let's look at the signal waveform (Fig 2.). It is a close call when looking at the waveform timing - the same decoding error happens in almost all the cases above.
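The reason it is such a close call: "TU" (- ..-) and "X" (-..-) consist of exactly the same sequence of marks, except that one gap is an inter-character space (3 dit units) instead of an inter-element space (1 unit). A small sketch using standard Morse timing (dah = 3 dits, element gap = 1 dit, character gap = 3 dits) makes this visible:

```python
def mark_space_units(morse_text):
    """Expand a string of space-separated Morse characters into a list
    of (state, units) pairs: dit = 1 unit, dah = 3 units,
    element gap = 1 unit, inter-character gap = 3 units."""
    out = []
    chars = morse_text.split()
    for ci, ch in enumerate(chars):
        for ei, el in enumerate(ch):
            out.append(('mark', 1 if el == '.' else 3))
            if ei < len(ch) - 1:
                out.append(('space', 1))   # gap inside a character
        if ci < len(chars) - 1:
            out.append(('space', 3))       # gap between characters
    return out

tu = mark_space_units('- ..-')  # T U
x = mark_space_units('-..-')    # X
print(tu)  # [('mark', 3), ('space', 3), ('mark', 1), ('space', 1), ('mark', 1), ('space', 1), ('mark', 3)]
print(x)   # [('mark', 3), ('space', 1), ('mark', 1), ('space', 1), ('mark', 1), ('space', 1), ('mark', 3)]
```

The two sequences differ only in the second entry - one space duration - which is exactly what the decoder has to judge from noisy timing.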

Figure 2.  TU or  X ? 

So what happened in the QSO with UR4MF? Let's look at the waterfall picture (Fig 3.) for possible clues.
UR4MF is clearly visible at 692 Hz, but what is the faint signal that starts 200 msec before the letter U? It is approximately 70 Hz below UR4MF and starts with "dah-dit".

Figure 3. UR4MF with another station 70 Hz below

The new Bayesian decoder is sensitive enough to pick up this other station, but unfortunately the selected 50 Hz filter bandwidth is not enough to separate these two stations from each other, thus causing what appears to be a decoding error. The legacy decoder did not even notice this other station.



FLDIGI - legacy Morse decoder: 
6N S W2 TU 
F1 VE1RGB 5NN 513 TU 
ZF1A UR4MF N 514 TU 
ZF1A KP2M 5NN 515 TU 
ZF1 SM6BZV 5NN 516 TU 
ZF1A WA2MCR 5NN 17 
NN S W2 TU 
F1 VE1RGB 5NN 513 TU 
ZF1A UR4MF E N 514 TU 
ZF1A KP2M 5NN 515 TU 
ZF1 SM6BZV 5NN 516 TU 
ZF1A WA2MCR 5NN 17


The first line should be NN 512 TU, but deep QSB in the signal mangled the decoding into 6N S W2 TU.
See figure 4 for how the amplitude drops rapidly between 5 and 1. The first "dit" in number one is barely visible in waterfall figure 5 below. While the legacy decoder made an error in this case, the new Bayesian decoder didn't seem to have any problem despite this deep and rapid QSB.

Once again, the Bayesian decoder appears to be much more sensitive and is automatically able to pick up signals even under deep QSB conditions.
Figure 4. QSB 

Figure 5. QSB in waterfall

CONCLUSIONS

Testing the performance of the new, experimental Bayesian Morse decoder with real-world contest signals showed some surprising behaviors. What initially appeared to be decoding errors turned out to be real signals that affected the probabilistic decoding process. It also appears that the Bayesian method is much more sensitive and may need a somewhat different strategy for signal pre-processing and pre-filtering compared to the legacy Morse decoder currently implemented in FLDIGI.

I am working on a better test framework to experiment more systematically with the impact of various parameters and find the performance limits of the new experimental Bayesian Morse decoder. Early results show, for example, that the minimum CER (character error rate) depends on the SNR of the signal as expected, but the CW speed dependent pre-filter bandwidth has some interesting effects on CER, as demonstrated in figure 6. The different graphs show how the speed (in WPM) setting impacts CER at different pre-filter bandwidths (for example 20 WPM ~ 33.3 Hz, 80 WPM ~ 133.3 Hz). You would expect the best CER performance with the narrowest filter setting that matches the actual speed (in this case the actual speed was 20 WPM). However, as seen in figure 6, this is not consistently the case for signals between SNR +2 ... +15 dB.
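For reference, the bandwidth figures quoted above work out to twice the dit rate: under the standard PARIS timing convention a dit lasts 1.2/WPM seconds, so 20 WPM gives 60 ms dits and 2/0.06 ≈ 33.3 Hz. A one-liner to reproduce the numbers:

```python
def matched_bandwidth_hz(wpm):
    """Dit duration under the PARIS convention is 1.2 / WPM seconds;
    the pre-filter bandwidths quoted above correspond to 2 / dit."""
    dit_s = 1.2 / wpm
    return 2.0 / dit_s

print(round(matched_bandwidth_hz(20), 1))  # 33.3
print(round(matched_bandwidth_hz(80), 1))  # 133.3
```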


Figure 6.  CER vs. SNR testing at different WPM settings 

Tuesday, December 31, 2013

New Morse Decoder - Part 3

New Bayesian Morse Decoder 

In my previous blog entry I shared the first test results using the Bayesian CW decoder integrated into FLDIGI. As I indicated, the software still has bugs, but when it works the accuracy improvement compared to the legacy FLDIGI decoder is pretty impressive. I was testing again tonight as the ham bands were full of CW stations working in the ARRL Straight Key Night. I also made many audio file captures to be used as additional testing material.

However, as my alpha testers have also noticed, the current decoder is very sensitive to proper audio signal settings. In order to analyze this problem in a bit more detail I re-implemented the FLDIGI audio processing chain, from audio file playback to the decoder (including the FFT filtering in CW.CXX), in a standalone version of the Bayesian Morse decoder. It was easier to run a series of tests by just varying command line variables and WAV audio files, capturing the results for analysis and plotting.

After running a series of tests I compared the signal amplitude against decoder errors and found a very sharp non-linear behavior related to the speed estimation algorithm. By increasing the signal volume by only 0.06% I was able to reproduce the event that the alpha testers found, and also capture the dynamic behavior of various parameters, including <spdhat>, which I used below to plot the speed estimate over time.

Figure 1. below shows that initially the speed tracking is trying to find the correct speed (in this case 40 WPM), but fairly soon it goes below the lower limit of 10 WPM and is not able to recover from this error situation. This is similar to the behavior the alpha testers have seen.
Figure 1.  Speed estimator bug

I scaled down the signal strength by only 0.06% and now the speed estimator works much better. In Figure 2. below the speed tracking fluctuates between 13 and 48 WPM until it settles at about 38 WPM. Despite this tracking behavior the decoder produces correct results from a noisy test file.

Figure 2.  Normal speed estimator behavior

I am very thankful to those brave hams who took this first (broken) alpha release and provided me not only good feedback but also some WAV file examples that I have been able to use to track down these problems.

I have also converted all the files from C to C++ to make it easier to re-integrate with FLDIGI. I will continue cleaning up the code and keep working to improve the speed estimator. Once I have the next alpha release ready for another round of tests I will announce it on the fldigi mailing lists and the eham.net CW forum.

73 and Happy New Year

Mauri  AG1LE



Tuesday, December 24, 2013

New Morse Decoder - Part 2

New Bayesian Morse Decoder 


In my previous post I shared initial results of the new Bayesian Morse decoder. I have worked a bit more on this topic to integrate the decoder software into FLDIGI v3.21.75. Dave W1HKJ provided me some space on his web server, so I posted the full source code, including all changes and new source files related to the Bayesian Morse decoder, here.

To demonstrate the Bayesian decoding capability I tuned my KX3 to the W1AW frequency 3.58105 MHz at 8:00 PM EST today and ran two identical copies of the same FLDIGI v3.21.75 software from the same audio source. I have a SignaLink USB connected between the KX3 and my ThinkPad laptop running Linux Mint OS.

In Fig 1. below I have the legacy SOM decoder on the left side and the new Bayesian decoder on the right. The legacy decoder makes a lot of errors, and the CW bulletin is not really readable. On the right side I have enabled the Bayesian decoder and you can actually read the bulletin, though the lack of proper word spaces still makes it a bit difficult to read. In any case, the Bayesian decoder provides a significant accuracy improvement with real-life noisy CW signals.

Figure 1.  FLDIGI running legacy and Bayesian Morse decoder

If you choose to compile FLDIGI from the sources yourself, please see Figure 2. below. Go to the menu item Configure / Modems / CW / General and click the "Bayes decoding" tick box; this enables the new Bayesian decoder (and disables the legacy decoder). You will also notice that the CW RX speed indicator (Fig 1., bottom left corner of the FLDIGI window) changes more rapidly - this is a feature of the new decoder. It evaluates the instantaneous speed continuously and uses this information as part of the decoding process.

Figure 2.  Enabling Bayesian decoder. 

I would be interested in getting your feedback and suggestions on how to improve this software. The current version is buggy and does crash occasionally. Also, I still have not found the root cause of the problem that causes the approximately 3% base error rate.

There are a number of improvement areas that need work. For example, the noise estimator ("noise.c") does not seem to be able to keep track of the real noise level. This noise power information is later used by the Kalman filters ("kalfil.c") in estimating the likelihood of the keystate given the observed signals. Bayesian inference is then used to update the posterior conditional probabilities for each new path ("probp.c"). I am getting intermittent overflows in this section of the code. Any help in this area would be very welcome.
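For readers new to the approach, the update in "probp.c" is conceptually a Bayesian posterior update over keystates. A deliberately simplified standalone sketch - a two-state mark/space filter with hypothetical Gaussian envelope likelihoods, not the actual code, which tracks many more states and paths:

```python
import math

def bayes_update(prior, observation, means=(1.0, 0.0), sigma=0.4):
    """One Bayesian update over the keystate (mark, space):
    posterior ∝ likelihood(observation | state) * prior.
    Gaussian likelihoods with made-up envelope means and sigma."""
    like = [math.exp(-((observation - m) ** 2) / (2 * sigma ** 2))
            for m in means]
    post = [l * p for l, p in zip(like, prior)]
    total = sum(post)
    return [p / total for p in post]

p = [0.5, 0.5]                 # flat prior over (mark, space)
for obs in (0.9, 0.8, 0.2):    # noisy envelope samples, mostly "key down"
    p = bayes_update(p, obs)
print(p[0] > p[1])             # mark is the more probable keystate
```

An overflow/underflow in this kind of recursion typically comes from the products of many small likelihoods, which is why practical implementations work in the log domain or renormalize every step, as done with `total` here.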

However,  given the visible improvement in the decoding accuracy at least during certain conditions  (when you have the signal level correct, not too much noise etc.) I am now posting this for alpha testing with hopes that I could get some more eyes to look at the code and find those bugs.

Happy Holidays!

73
Mauri AG1LE  




Friday, September 27, 2013

New Morse Decoder - Part 1

NEW BAYESIAN MORSE DECODER

I have been working on a new Bayesian Morse decoder algorithm. This work is based on Dr. Bell's doctoral thesis, which is one of the best and most comprehensive documents on this topic I have found so far. While it contains a lot of advanced mathematics, the thesis also has a very thorough description of the problems related to automatically transcribing hand-keyed manual Morse signals with an acceptable error rate. It covers a lot of ground and describes with mathematical rigor how to model and solve each of the problems. It also has software examples (in Fortran) and test results, compared to the theoretically optimal solution as well as to human performance under different conditions.

Using the thesis as a starting point I implemented the algorithms in the C language. The current version has some 3335 lines of code and is posted on GitHub as open source, with verbal permission from Dr. Bell. I managed to get the software working to the degree that I am able to run some performance tests. There is still a lot of work ahead to improve and clean up the code, but I decided to publish some early results for those who might be interested in this work.

CORRELATOR-ESTIMATOR TESTING

As described in the thesis, this algorithm represents a "correlator-estimator" technique in which sequences of all possible keystate transitions are hypothesized and correlated with the incoming signal, and the most likely sequence is output as the best estimate. To illustrate this point I plotted a segment of incoming signal with the letters "QUICK B", as shown in figure 1. below. The probability estimates of the various keystates [P(dit), P(dah), P(el-spc), P(chr-spc), P(wrd-spc), P(pause)] are plotted underneath the incoming signal. I used the Kst Data Viewer to create this plot.


Figure 1.  Correlator-estimator probabilities

The Morse audio file used in this test had a signal-to-noise ratio (SNR) of 8 dB @ 2 kHz bandwidth. The figure nicely shows the time-variant probability values of each keystate, as well as how a particular keystate correlates with the "mark"/"space" changes in the incoming signal.

CER vs. SNR  TESTING 

To test the decoder performance in the presence of noise I created a set of Morse sound files with -10 dB to +20 dB SNR @ 2 kHz bandwidth using a modified version of Rob Frohne's (KL7NA) morse.m Octave software. These files contain 200 words with 5 random letters and numbers each. The sound files are available here: Morse sound files -10 dB ... 20 dB SNR @ 2 kHz BW. For algorithm testing I also created corresponding text files where each sample is a real number on a separate line. These are easier to manipulate and plot using standard Linux tools, like gnuplot.

I ran the algorithm on files with different SNR levels and saved the decoded text output. To get the character error rate (CER) I created a small utility program that calculates the Levenshtein distance using this algorithm. Comparing the original text to the decoded text gives the character error rate. When plotting the results it was quite interesting to see a steep reduction in CER at SNRs over 6 dB. Note that the SNR figure includes noise over the whole 2 kHz bandwidth.
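The Levenshtein computation itself is the standard dynamic-programming edit distance. A minimal Python version of such a utility - my sketch, not the exact program used here - looks like:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cer(reference, decoded):
    """Character error rate: edit distance / reference length."""
    return levenshtein(reference, decoded) / float(len(reference))

print(levenshtein('QUICK', 'QUICX'))  # 1
print(cer('QUICK', 'QUICX'))          # 0.2
```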

Figure 2.  CER to SNR test results

The CER should go down to zero as SNR improves - however, the graph shows a base error rate of ~3%. I need to study this in a bit more detail to find the root cause. At higher noise levels the curve shape looks a bit closer to what I expected.

NEXT STEPS 

As mentioned before there is still a lot of work ahead to make this software useful.

I did some initial testing to integrate this decoder to a modified  FLDIGI package and got the software partially working as an external program connected to FLDIGI via a Linux FIFO pipe.

I am also trying to figure out automated test scripts for timing and speed variance testing. I would really like to find the limits where the algorithm breaks. This requires more work in creating synthetic tests similar to those described in the thesis.

A third area of work is testing the algorithm's performance against signal fading. This would provide limits for real-world signals and help to optimize the model parameters.

If you are interested in advancing the state of the art in Morse decoding, feel free to download the software and work on testing and improving it. Please provide your feedback and post comments below.

73
Mauri AG1LE




Monday, February 4, 2013

Probabilistic Neural Network Classifier for Morse Code

One of the challenges in building a Morse code decoder is how to build a well-working adaptive classifier for incoming symbols, given the variability of "dit/dah" timing in real-world CW traffic received on the ham bands. I have discussed these problems in my previous blog posting.


OVERVIEW 

I have collected data from different ham QSOs in order to understand the underlying patterns and timing relationships. One good way to represent this data is to think of Morse code as "symbols" instead of "dits" and "dahs". A symbol is a ["tone", "silence"] duration pair. For example, the letter A ( . - ) could be represented as two symbols ["dit", "ele"] + ["dah", "chr"], where "dit" and "dah" represent short and long tones respectively, "ele" represents inter-element space, and "chr" represents inter-character space. Morse code can then be represented as a sequence of symbols - the letter "A" would be {S0, S4}. I am using the following definitions as shorthand for the symbols:

  • S0 = [dit, ele]  //  dit  and inter-element space 
  • S1 = [dit, chr]  //  dit  and inter-character space 
  • S2 = [dit, wrd]  //  dit  and inter-word space 
  • S3 = [dah, ele]  //  dah  and inter-element space 
  • S4 = [dah, chr]  //  dah  and inter-character space 
  • S5 = [dah, wrd]  //  dah  and inter-word space 
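Mapping a character to this symbol alphabet is then mechanical: every element is paired with the silence that follows it, an inter-element space inside the character and an inter-character space after the last element. A small sketch (the three-entry code table is just for illustration):

```python
# Hypothetical mini code table; the real one covers A-Z and 0-9
MORSE = {'A': '.-', 'E': '.', 'T': '-'}

SYMBOL = {('.', 'ele'): 'S0', ('.', 'chr'): 'S1', ('.', 'wrd'): 'S2',
          ('-', 'ele'): 'S3', ('-', 'chr'): 'S4', ('-', 'wrd'): 'S5'}

def to_symbols(text):
    """Convert letters to the S0...S5 symbol sequence: each element
    is paired with the silence that follows it."""
    seq = []
    for ch in text:
        elems = MORSE[ch]
        for el in elems[:-1]:
            seq.append(SYMBOL[(el, 'ele')])   # space inside a character
        seq.append(SYMBOL[(elems[-1], 'chr')])  # space after the character
    return seq

print(to_symbols('A'))  # ['S0', 'S4'], as in the text above
```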

CLASSIFICATION EXAMPLES 

If we plot thousands of these symbols as data vectors in an (x, y) chart, where the x-axis is tone duration ("mark") and the y-axis is silence duration ("space"), we get a picture like figure 1 below. "Dit" duration is scaled to 0.1 in the picture and "dah" duration is 0.3, respectively. Figure 1. was created from a recorded test audio file with -3 dB SNR, courtesy of Dave W1HKJ.

You can easily see the different symbol clusters, and there is clean separation between them.  In the "mark" dimension (horizontal x-axis), "dit" and "dah" follow roughly a 1:3 ratio, as expected.

Dark blue and green areas represent S0 and S3 symbols - these are symbols within Morse letters, as the "space" value is around 0.1.

Red and brown areas represent S1 and S4 symbols - the longer inter-character "space" value of ~0.3 means these symbols are the last symbol in a letter.

Yellow and light blue areas represent S2 and S5 symbols - the long "space" value of around 0.6 - 0.7 shows that these are the last symbol in a word.

Figure 1.  Morse code symbols  ( -3 dB SNR test audio file) 

While figure 1 looks nice and neat, please take a look at figure 2 below.  Using the exact same classifier settings, I recorded several stations working on the 7 MHz band and collected some 1800 symbols.  This picture shows data from multiple stations - therefore the variation for each symbol is much larger.

There is much more variation in the "space" dimension.  When listening to the Morse code, this is consistent with people having "thinking pauses" as they form an idea of the next words to send.  Sometimes these pauses can be even longer - you can see those when a new subject is introduced in the QSO, such as asking a question or responding to something the other station sent.
I have capped the duration at 1.0 (this is 10 * the "dit" duration) for implementation reasons.  In real life these pauses can be a few seconds.

I created another symbol, S6, which I use for noise spike detection.  These random spikes are typically shorter than 1/2 of the "dit" duration.  Having a separate symbol for noise makes it easier to remove these spikes when decoding characters.
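The noise gate could be sketched as below.  This is an assumption for illustration, not the exact FLDIGI logic: with "dit" scaled to 0.1, any tone shorter than half a dit is tagged as the S6 noise class before further decoding:

```cpp
#include <cassert>

const int   S6_NOISE    = 6;     // symbol class reserved for noise spikes
const float NOISE_LIMIT = 0.05f; // 1/2 of the scaled "dit" duration (0.1)

// Returns true when a "mark" duration is short enough to be treated as a
// random noise spike rather than a real dit or dah.
bool is_noise_spike(float mark)
{
    return mark < NOISE_LIMIT;
}
```

A decoder front-end could then map such marks to S6 and simply drop them from the symbol stream before character decoding.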

Figure 2.  Morse code symbols (7 Mhz band ham traffic)

PNN AS A CLASSIFIER 

Finding a classifier algorithm that could identify each symbol in a consistent manner was not easy.  I used the Weka software to test different algorithms, trying to find a classifier that would classify the above symbols with over 90% accuracy.  After spending many hours running different data sets through Weka, I was quite disappointed.  None of the built-in classifiers seemed to handle the task well - or I was not able to provide good enough parameters as a starting point.

I was exchanging some emails with Dave, N7AIG, on how to build a classifier for Morse code symbols when he suggested that I look at the Probabilistic Neural Network (PNN) algorithm.

I started reading articles on PNN and discovered that it has several great benefits.  PNN is very fast at learning new probability distributions; in fact, it is a so-called "one-pass" learning algorithm.  This is a great benefit, as many of the more traditional neural network algorithms require hundreds or even thousands of training cycles before they converge to a solution.  With PNN you just have to show a few examples of each symbol class and it does the classification almost immediately.

PNN networks are also relatively insensitive to outliers, so a few bad examples do not kill the classification performance.  PNN networks approach Bayes-optimal classification and are guaranteed to converge to an optimal classifier as the size of the representative training set increases.  Training samples can also be added or removed without extensive retraining.

PROTOTYPING WITH EXCEL 

PNN has only one parameter, "sigma", that needs tuning.  I built an Excel sheet (copy in Google Sheets) to validate the PNN algorithm and to see how different "sigma" values impact the classification.  By changing the "sigma" value I could easily see how the classifier handles overlapping distributions.  Sigma determines the Gaussian shape of the PNN equation:

Figure 3.   PNN equation

Experimenting with different "sigma" values, I discovered that 0.03 seems to be a good compromise.  If you increase the value much beyond this, you tend to get more classification errors due to overlapping distributions.  If you make it much smaller, you get more classification errors because the distributions become too narrow.

See figure 4 for how the probability distribution function (PDF) looks for "dit", "dah" and "word space" with sigma = 0.03.  The scale has been set so that the "dit" duration corresponds to 0.1, as in the previous pictures.
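The per-example Gaussian from the PNN equation (Figure 3) can be sketched as a small helper, using the same [0 ... 1.0] scaling as the pictures above.  This is a minimal illustration, not the FLDIGI code itself:

```cpp
#include <cassert>
#include <cmath>

// The Gaussian kernel of the PNN equation:
//   z = exp(-((x - x0)^2 + (y - y0)^2) / (2 * sigma^2))
// A test point ("mark", "space") scores 1.0 when it lands exactly on a
// training example (ex_mark, ex_space) and falls off quickly with distance;
// sigma = 0.03 controls how quickly.
float pnn_pdf(float mark, float space,
              float ex_mark, float ex_space, float sigma = 0.03f)
{
    float d2 = (mark - ex_mark) * (mark - ex_mark)
             + (space - ex_space) * (space - ex_space);
    return std::exp(-d2 / (2.0f * sigma * sigma));
}
```

For example, a perfect "dit-ele" symbol (0.1, 0.1) scores 1.0 against the S0 training example (0.1, 0.1), while a "dah"-length mark (0.3, 0.1) scores essentially zero against it - which is why sigma values much larger than 0.03 start to blur the clusters together.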


Figure 4.  PNN tuning

IMPLEMENTATION IN C++ FOR FLDIGI

The final part of this work was to implement the PNN algorithm in C++ and integrate it with the FLDIGI software.  An alpha version of the classifier code is in the Appendix below - I have also included the Examples[] data I used to produce the classification results in Figures 1 and 2.

Figure 5. Probabilistic Neural Network 
Figure 5 demonstrates the role of the different "layers" in the PNN algorithm.  The input layer gets the symbol ("mark", "space") values, scaled between 0.0 and 1.0.  The pattern layer calculates the match against each training example of each symbol class; in this example there are two training examples per symbol class.  The summation layer calculates the sum of the probability distribution function (PDF - see Fig 3) values for each class.  Finally, the decision layer compares all these results and selects the largest sum, which identifies the best matching symbol class.

Part of my current work has also been to completely rewrite the state machine in the FLDIGI CW.CXX module that handles the conversion from KEY_DOWN and KEY_UP events to "mark" and "space" durations.



I have also rewritten the Morse decoder part - the new decoder gets a stream of classified symbols from the PNN() function and runs them through a state machine to come up with the correct character.  I am still in the process of getting this state machine to behave like a Viterbi decoder.

For this I would need the a-priori probabilities for each state, and then traverse the states using the symbols and their probabilities to calculate the optimal decoding path.  I am struggling a bit with this step currently - it is not obvious to me how I should use the constraints of Morse code to derive the best path.  If you have experience with Viterbi decoders, I would like to chat with you to get better insight into how this algorithm should really work.
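For reference, a generic Viterbi search over symbol classes could look like the sketch below.  This is not the FLDIGI implementation; the transition matrix is a placeholder, and the real a-priori probabilities derived from the Morse code book are exactly the open question above:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Generic Viterbi sketch: given per-step emission probabilities for each
// symbol class (e.g. the normalized PNN sums) and a state transition
// matrix, recover the most likely symbol path.  Works in log space to
// avoid underflow on long sequences.
std::vector<int> viterbi(const std::vector<std::vector<double>>& emit,
                         const std::vector<std::vector<double>>& trans)
{
    size_t T = emit.size(), N = trans.size();
    std::vector<std::vector<double>> score(T, std::vector<double>(N));
    std::vector<std::vector<int>>    back(T, std::vector<int>(N, 0));

    for (size_t s = 0; s < N; s++)
        score[0][s] = std::log(emit[0][s] + 1e-12);

    for (size_t t = 1; t < T; t++)
        for (size_t s = 0; s < N; s++) {
            double best = -1e300; int arg = 0;
            for (size_t p = 0; p < N; p++) {
                double v = score[t - 1][p] + std::log(trans[p][s] + 1e-12);
                if (v > best) { best = v; arg = (int)p; }
            }
            score[t][s] = best + std::log(emit[t][s] + 1e-12);
            back[t][s]  = arg;
        }

    // backtrack from the best final state
    std::vector<int> path(T);
    double best = -1e300;
    for (size_t s = 0; s < N; s++)
        if (score[T - 1][s] > best) { best = score[T - 1][s]; path[T - 1] = (int)s; }
    for (size_t t = T - 1; t > 0; t--)
        path[t - 1] = back[t][path[t]];
    return path;
}
```

The Morse constraints would enter through the transition matrix - for example, a word-space symbol (S2/S5) cannot be followed by another space-heavy symbol in a valid letter, so that transition probability would be near zero.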

73
Mauri  AG1LE


APPENDIX  -  C++ SOURCE CODE 



#define Classes    7 // Symbols S0 ... S5 - 6 different classes + S6 for noise
#define NEPC       2 // Number of Examples Per Class
#define Dimensions 2 // dimensions - use mark & space here - could add more features like AGC

const float Examples[Classes][NEPC][Dimensions] = {
    {{0.1,  0.1},   // S0 = dit-ele
     {0.08, 0.12}}, // S0 = dit-ele

    {{0.1,  0.3},   // S1 = dit-chr
     {0.08, 0.23}}, // S1 = dit-chr

    {{0.1,  0.7},   // S2 = dit-wrd
     {0.1,  0.6}},  // S2 = dit-wrd

    {{0.3,  0.1},   // S3 = dah-ele
     {0.34, 0.16}}, // S3 = dah-ele

    {{0.34, 0.4},   // S4 = dah-chr
     {0.29, 0.22}}, // S4 = dah-chr

    {{0.23, 0.54},  // S5 = dah-wrd
     {0.23, 0.9}},  // S5 = dah-wrd

    {{0.01, 0.01},  // S6 = noise-noise
     {0.05, 0.05}}  // S6 = noise-noise
};



//=======================================================================
// (C) Mauri Niininen AG1LE 
// PNN()  is a Probabilistic Neural Network algorithm to find best matching Symbol 
// given received Mark / Space pair. Mark and Space are scaled to [0 ... 1.0] where 
// Mark: standard  'dit' length is  0.1, 'dah' is 0.3  
// Space: standard element space is 0.1, character space 0.3 and word space 0.7 
// Examples[] contains 2 examples for each symbol class
//=======================================================================
#include <cmath> // for fabs() and exp()

int symbol::PNN (float mark, float space)
{
    float sigma = 0.03; // SIGMA determines the Gaussian shape: z = f(x,y) = exp(-((x-x0)^2+(y-y0)^2)/(2*sigma^2))
    int classify = -1;
    float largest = 0;
    float sum[Classes];
    float test_example[2];

    // clamp inputs to the [0 ... 1.0] scale
    if (fabs(mark) > 1)  mark = 1;
    if (fabs(space) > 1) space = 1;

    test_example[0] = mark;
    test_example[1] = space;

    // OUTPUT layer - compute PDF for each class k
    for (int k = 0; k < Classes; k++)
    {
        sum[k] = 0;

        // SUMMATION layer - accumulate the PDF over each example from class k
        for (int i = 0; i < NEPC; i++)
        {
            float product = 0;
            // PATTERN layer - squared distance between test example and training example
            for (int j = 0; j < Dimensions; j++)
            {
                product += (test_example[j] - Examples[k][i][j]) * (test_example[j] - Examples[k][i][j]);
            }
            product = -product / (2 * (sigma * sigma));
            product = exp(product);
            sum[k] += product;
        }

        sum[k] /= NEPC;
    }

    // DECISION layer - sum[k] has the accumulated PDF for each class k; pick the largest
    for (int k = 0; k < Classes; k++)
    {
        if (sum[k] > largest)
        {
            largest = sum[k];
            classify = k;
        }
    }

    learn (classify, mark, space);

    return classify;
}

