Sunday, May 27, 2012

Morse Code decoding - machine learning

In the previous blog entry I covered an experiment using Self-Organizing Maps (SOM) with the SOM Toolbox to automatically learn Morse code from input vectors that contained the durations of the "dit" and "dah" tones.

While I have been reading academic research papers, Dave W1HKJ and the alpha testing team have been working on a new version of FLdigi.


I tested the brand new FLdigi alpha release during the CQ WW WPX contest, when the band was full of CW stations from around the world. It was very difficult to find an empty frequency on the CW bands. Figure 1 below shows a screenshot from my Flex3000 PowerSDR panadapter / waterfall display. I recorded a 230 MB IQ audio WAV file for further testing purposes.

Figure 1. CQ WW WPX contest on the 7 MHz CW band

One of the practical challenges is detecting the start and end points of a tone correctly in the presence of noise or interference. When the detected signal amplitude varies greatly due to fading, noise or interference, a single fixed threshold does not work well. For example, using only the upper detector threshold in figure 2 would produce just two "dits" (aka "E E"). Having two detector thresholds with some hysteresis produces better results.

Figure 2. Upper and lower threshold.
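
The two-threshold idea can be sketched in a few lines of Python. This is only an illustration and not FLdigi's actual code: the function names, the envelope samples and the threshold values (0.3 and 0.7) are all made up for the example. The tone switches on only when the envelope rises above the upper threshold and switches off only when it falls below the lower one, so a fade in the middle of a "dah" does not split it in two.

```python
def detect_tone(envelope, lower, upper):
    """Return one boolean per sample: True while the tone is considered on.

    Hypothetical helper: turns on at >= upper, turns off at <= lower.
    """
    state = False
    out = []
    for level in envelope:
        if not state and level >= upper:
            state = True
        elif state and level <= lower:
            state = False
        out.append(state)
    return out


def run_lengths(states):
    """Collapse a boolean sequence into (state, length) runs."""
    runs = []
    for s in states:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs


# A fading "dah": the envelope dips to 0.5 mid-tone (made-up values).
envelope = [0.0, 0.1, 0.9, 0.8, 0.5, 0.85, 0.9, 0.1, 0.0]

# Single threshold at 0.7: the dip splits the tone into two short bursts.
single = [level >= 0.7 for level in envelope]

# Two thresholds with hysteresis: the dip stays above 0.3, so the tone holds.
hyst = detect_tone(envelope, lower=0.3, upper=0.7)

print(run_lengths(single))
print(run_lengths(hyst))
```

With the single threshold the detector sees two separate tone bursts, while the hysteresis version reports one continuous tone of five samples.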

In the FLdigi 3.22.0CB alpha version, Dave W1HKJ has added Lower and Upper Detector Threshold parameters, which are also visible in the Scope display. See figure 3 below.

Figure 3. Detector Threshold settings

This simple enhancement works surprisingly well. In figure 4 below I have used SOM detection and, alternatively, FFT filtering and Matched Filtering. As you can see from the figure, there are some stations with very high signal strength (you can see the Morse code spectrum spreading more than 2 kHz downward from 7072.7 kHz) and some with much lower signal strength. I am getting some decoding errors (mostly extra "E", "I" and "T" characters), but the new decoder works quite well compared to the previous decoder. Here are some ideas on how to improve tone start and end point detection even further, such as using energy and zero-crossing rate (ZCR) values as thresholds.

Figure 4. FLdigi 3.22.0CB version in action
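
To make the energy and ZCR idea concrete, here is a minimal sketch of frame-based end point detection in Python. Everything in it is an assumption for illustration: the 8000 Hz sample rate, the 600 Hz test tone, the 10 ms frames and the threshold values are made up and are not taken from FLdigi or from any particular EPD paper. A frame is accepted as tone only if its short-time energy is high enough and its zero-crossing rate is close to the 2 * f / fs expected for a tone at frequency f.

```python
import math

SAMPLE_RATE = 8000   # Hz, assumed for this example
TONE_FREQ = 600      # Hz, assumed CW pitch
FRAME_LEN = 80       # samples per frame (10 ms at 8000 Hz)


def frame_features(samples, frame_len):
    """Return a list of (energy, zcr) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of squared samples.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


# Synthetic test signal: 20 ms silence, 50 ms tone, 20 ms silence.
tone = [
    math.sin(2 * math.pi * TONE_FREQ * n / SAMPLE_RATE) for n in range(400)
]
signal = [0.0] * 160 + tone + [0.0] * 160

feats = frame_features(signal, FRAME_LEN)

# Hypothetical thresholds: energy above 0.1 and ZCR near 2 * 600 / 8000 = 0.15.
ENERGY_MIN = 0.1
ZCR_MIN, ZCR_MAX = 0.10, 0.20

tone_frames = [
    i for i, (energy, zcr) in enumerate(feats)
    if energy > ENERGY_MIN and ZCR_MIN < zcr < ZCR_MAX
]
print(tone_frames)
```

On a real recording the thresholds would have to adapt to the noise floor. The ZCR window is what helps reject broadband noise bursts that have enough energy but the wrong pitch.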


Over the last two weeks I have read many research papers on various topics such as Hidden Markov Models, Dynamic Time Warping, matching patterns in time with Bayesian classifiers, Restricted Boltzmann Machines (RBM), Deep Belief Networks (DBN), etc.

Some of these machine learning algorithms focus on creating a set of features that one or more classifiers then use to find the best match to pre-labeled training data. This PhD thesis is somewhat old (1992) but well written, and it covers the most commonly used techniques and algorithms.

The other alternative is the "unsupervised learning" approach, where a lot of unlabeled training data is used to find patterns and clusters that can then be classified and labeled. The RBM and DBN camp in particular seems to believe this is a better and more universal approach. This presentation on YouTube (see the demo from 21:38 onward) by Geoffrey E. Hinton explains this latter approach well.

Just looking at the amount of research done on machine learning, it is quite a challenge to find the best way to move forward. One idea that I was playing with was to create a database of Morse code audio WAV files as training and testing material. It looks like the machine learning community has so far focused only on the broad speech, music and environmental audio categories. These types of training sample audio files can be found on the websites of many machine learning teams. Having good quality labeled material seems to be a big problem for the machine learning community.

I think it would be a worthwhile technical challenge to advance the state of the art in machine learning by automatically decoding all the hundreds of Morse code conversations taking place during these ham radio contests. It would push machine learning algorithm development further and hopefully bring some bright minds into our ham community as well.

I wonder if ARRL or some other organization could set up this kind of public challenge for schools and universities engaged with machine learning? As we can see from the above, Morse code is very much alive and being used every day. It would be relatively easy for hams to produce such labeled audio content. We could use FLdigi to decode the audio file content as a training reference and have different kinds of CW files recorded from real ham RF bands, like the above CQ WW WPX contest example on 7 MHz.

Mauri  AG1LE

