One of the challenges in building a Morse code decoder is how to construct a well-working adaptive classifier for incoming symbols, given the variability in "dit/dah" timing in real-world CW traffic received on the ham bands. I discussed these problems in my previous blog post.
OVERVIEW
I have collected data from different ham QSOs in order to understand the underlying patterns and timing relationships. One good way to represent this data is to think of Morse code as "symbols" instead of "dits" and "dahs". A symbol is a ["tone", "silence"] duration pair. For example, the letter A ( . - ) could be represented as two symbols ["dit", "ele"] + ["dah", "chr"], where "dit" and "dah" represent the short and long tone, "ele" represents the inter-element space and "chr" represents the inter-character space. Morse code can then be represented as a sequence of symbols - the letter "A" would be {S0, S4}. I am using the following definitions as shorthand for the symbols (a small code sketch of this representation follows the list):
- S0 = [dit, ele] // dit and inter-element space
- S1 = [dit, chr] // dit and inter-character space
- S2 = [dit, wrd] // dit and inter-word space
- S3 = [dah, ele] // dah and inter-element space
- S4 = [dah, chr] // dah and inter-character space
- S5 = [dah, wrd] // dah and inter-word space
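To make the notation concrete, here is a minimal sketch of how these symbols could be represented in code; the enum and the letter_A example are only illustrations, not the actual decoder data structures.

// Minimal sketch of the symbol shorthand above; names are illustrative only.
enum Symbol {
    S0, // [dit, ele] - dit and inter-element space
    S1, // [dit, chr] - dit and inter-character space
    S2, // [dit, wrd] - dit and inter-word space
    S3, // [dah, ele] - dah and inter-element space
    S4, // [dah, chr] - dah and inter-character space
    S5  // [dah, wrd] - dah and inter-word space
};

// The letter "A" ( . - ) followed by an inter-character space is then {S0, S4}.
const Symbol letter_A[] = { S0, S4 };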
CLASSIFICATION EXAMPLES
If we plot thousands of these symbols as data vectors in an (x, y) chart, where the x-axis is tone duration ("mark") and the y-axis is silence duration ("space"), we get a picture like figure 1 below. "Dit" duration is scaled to 0.1 in the picture and "dah" duration to 0.3 respectively. Figure 1 was created from a recorded test audio file with -3 dB SNR, courtesy of Dave W1HKJ. You can easily see the different symbol clusters and the clean separation between them. In the "mark" dimension (horizontal x-axis) "dit" and "dah" follow roughly the expected 1:3 ratio.
Dark blue and green areas represent S0 and S3 symbols - these are symbols within Morse letters, as the "space" value is around 0.1.
Red and brown areas represent S1 and S4 symbols - the longer inter-character "space" value of ~0.3 means that these are the last symbols in a letter.
Yellow and light blue areas represent S2 and S5 symbols - the long "space" value around 0.6 - 0.7 shows that these are the last symbols in a word.
Figure 1. Morse code symbols (-3 dB SNR test audio file)
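As a side note on the scaling: a measured ("mark", "space") pair can be mapped into this chart simply by making the estimated "dit" length equal to 0.1. The helper below is only a sketch of that idea; the function name and the ten-dit full scale are my choices for illustration, not the exact FLDIGI code.

// Illustrative scaling of measured durations into the chart coordinates:
// the estimated "dit" length maps to 0.1, so a "dah" (about 3 dits) lands near 0.3
// and the full scale 1.0 corresponds to ten "dit" lengths.
struct ScaledSymbol {
    float mark;  // scaled tone duration
    float space; // scaled silence duration
};

ScaledSymbol scale_durations(float mark_ms, float space_ms, float dit_ms)
{
    float k = 0.1f / dit_ms; // one "dit" -> 0.1
    ScaledSymbol s;
    s.mark = mark_ms * k;
    s.space = space_ms * k;
    return s;
}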
While figure 1 looks nice and neat, please take a look at figure 2 below. Using the exact same classifier settings I recorded several stations working on the 7 MHz band and collected some 1800 symbols. This picture shows data from multiple stations - therefore the variation for each symbol is much larger.
There is much more variation in the "space" dimension. When listening to the Morse code this is consistent with people having "thinking pauses" as they form an idea of the next words to send. Sometimes these pauses can be even longer - you can see those when a new subject is introduced in the QSO, such as asking a question or responding to something the other station sent.
I have capped the duration at 1.0 (this is 10 * the "dit" duration) for implementation reasons. In real life these pauses can be a few seconds.
I created another symbol, S6, that I use for noise spike detection. These random spikes are typically less than 1/2 "dit" duration. Having a separate symbol for noise makes it easier to remove these spikes when decoding characters.
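To illustrate the capping and the S6 noise symbol, the pre-processing can be sketched like this; the thresholds follow the text above (cap at 1.0, spikes shorter than half a "dit"), but the functions themselves are only illustrative.

// Sketch of the pre-processing described above: cap scaled durations at 1.0
// (ten "dit" lengths) and treat very short tones as noise spikes (symbol S6).
bool is_noise_spike(float mark)
{
    return mark < 0.05f; // less than 1/2 "dit" (0.5 * 0.1)
}

void cap_durations(float &mark, float &space)
{
    if (mark > 1.0f) mark = 1.0f;   // real pauses can be seconds long,
    if (space > 1.0f) space = 1.0f; // but the classifier only sees values up to 1.0
}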
Figure 2. Morse code symbols (7 MHz band ham traffic)
PNN AS A CLASSIFIER
Finding a classifier algorithm that could identify each symbol in a consistent manner was not very easy. I used the Weka software to test different algorithms, trying to find a classifier that would classify the above symbols with over 90% accuracy. After spending many hours running different data sets through Weka I was quite disappointed. None of the built-in classifiers seemed to handle the task well, or I was not able to provide good enough parameters as a starting point. I was exchanging some emails with Dave, N7AIG, on how to build a classifier for Morse code symbols when he suggested that I should look at the Probabilistic Neural Network (PNN) algorithm.
I started reading articles on PNN and discovered that it has several great benefits. PNN is very fast in learning new probability distributions; in fact it is a so-called "one-pass" learning algorithm. This is a great benefit, as many of the more traditional neural network algorithms require hundreds or even thousands of training cycles before they converge to a solution. With PNN you just have to show a few examples of each symbol class and it does the classification almost immediately.
PNN networks are also relatively insensitive to outliers, so a few bad examples do not kill the classification performance. PNN networks approach Bayes optimal classification and are guaranteed to converge to an optimal classifier as the size of the representative training set increases. Training samples can also be added or removed without extensive retraining.
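The "one-pass" property is easy to see in code: training a PNN is essentially just storing labelled examples, so adding or removing a sample is a container operation rather than a retraining run. The sketch below uses a simple vector-based store of my own; the actual classifier in the appendix uses a fixed Examples[] array instead.

#include <utility>
#include <vector>

// Illustrative "one-pass" training store for a PNN: each class keeps a list of
// (mark, space) examples, and learning a new sample is just a push_back.
struct PNNStore {
    std::vector< std::pair<float, float> > examples[7]; // classes S0 ... S6

    void learn(int klass, float mark, float space) {
        examples[klass].push_back(std::make_pair(mark, space)); // no retraining needed
    }

    void forget_last(int klass) {
        if (!examples[klass].empty())
            examples[klass].pop_back(); // removing a sample is just as cheap
    }
};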
PROTOTYPING WITH EXCEL
PNN has only one parameter, "sigma", that needs tuning. I built an Excel sheet (copy in Google Sheets) to validate the PNN algorithm and to see how different "sigma" values impact the classification. By changing the "sigma" value I could easily see how the classifier handles overlapping distributions. Sigma determines the width of the Gaussian kernel in the PNN equation: for each class the PDF is estimated as the average of exp(-||x - x_i||^2 / (2 * sigma^2)) over that class's training examples x_i.
Figure 3. PNN equation
Experimenting with different "sigma" values, I found that 0.03 seems to be a good compromise. If you increase the value much beyond this you tend to get more classification errors due to overlapping distributions. If you make it much smaller you get more classification errors because the distributions become too narrow.
See figure 4 for how the probability density function (PDF) looks for "dit", "dah" and "word space" with sigma = 0.03. The scale has been set so that the "dit" duration corresponds to 0.1, as in the previous pictures.
Figure 4. PNN tuning
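The curves in figure 4 can be reproduced directly from the Gaussian kernel with sigma = 0.03 and the centers used in this post (0.1 for "dit", 0.3 for "dah", 0.7 for word space). The small standalone program below is just for printing the kernel values; it is not part of the FLDIGI code.

#include <cmath>
#include <cstdio>

// Evaluate the Gaussian kernel that the PNN PDF is built from:
// f(x) = exp(-(x - x0)^2 / (2 * sigma^2)).
static float kernel(float x, float x0, float sigma)
{
    float d = x - x0;
    return std::exp(-(d * d) / (2.0f * sigma * sigma));
}

int main()
{
    const float sigma = 0.03f;
    for (float x = 0.0f; x <= 1.0f; x += 0.02f)
        std::printf("%.2f dit=%.3f dah=%.3f wrd=%.3f\n",
                    x, kernel(x, 0.1f, sigma), kernel(x, 0.3f, sigma), kernel(x, 0.7f, sigma));
    return 0;
}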
IMPLEMENTATION IN C++ FOR FLDIGI
The final part of this work was to implement the PNN algorithm in C++ and integrate it with the FLDIGI software. The alpha version of the classifier code is in the appendix below - I have also included the Examples[] data I used to classify the results in figures 1 and 2.
Figure 5. Probabilistic Neural Network
Part of my current work has also been to completely rewrite the state machine in the FLDIGI CW.CXX module that handles the conversion from KEY_DOWN and KEY_UP events to "mark" and "space" durations.
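Conceptually the new state machine just timestamps the keying events: a KEY_UP closes a "mark" and the following KEY_DOWN closes the "space" that completes the symbol pair. The fragment below is only a sketch of that idea with made-up names; the real CW.CXX code is structured differently.

// Sketch of turning KEY_DOWN / KEY_UP events into ("mark", "space") durations.
// Names and structure are hypothetical, not the actual CW.CXX state machine.
struct KeyingTracker {
    double key_down_time; // time of the last KEY_DOWN
    double key_up_time;   // time of the last KEY_UP

    // On KEY_DOWN the elapsed silence since the last KEY_UP is the "space"
    // that completes the previous symbol.
    double on_key_down(double now) {
        double space = now - key_up_time;
        key_down_time = now;
        return space;
    }

    // On KEY_UP the elapsed tone since the last KEY_DOWN is the "mark".
    double on_key_up(double now) {
        double mark = now - key_down_time;
        key_up_time = now;
        return mark;
    }
};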
I have also rewritten the Morse decoder part - the new decoder gets a stream of classified symbols from the PNN() function and runs them through a state machine to come up with the correct character. I am still in the process of getting this state machine to behave like a Viterbi decoder.
For this I would need the a-priori probabilities for each state and would traverse the states using the symbols and their probabilities to calculate the optimal decoding path. I am struggling a bit with this step currently - it is not obvious to me how I should use the constraints in Morse code to derive the best path. If you have experience with Viterbi decoders I would like to chat with you to get better insight into how this algorithm should really work.
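In the meantime, the plain (non-Viterbi) symbol-to-character step described above can be sketched as a small state machine; the lookup table below is just a stub for illustration, not the decoder code in FLDIGI.

#include <string>

// Minimal stand-in for a Morse code table; a real decoder covers the full alphabet.
static char morse_lookup(const std::string &code)
{
    if (code == ".-")   return 'A';
    if (code == "-...") return 'B';
    if (code == "-.-.") return 'C';
    return '?'; // unknown pattern
}

// Sketch of decoding a stream of classified symbols: S0/S3 extend the current
// letter, S1/S4 close a letter, S2/S5 close a letter and a word, S6 (noise) is dropped.
std::string decode_symbols(const int *symbols, int n)
{
    std::string text, letter;
    for (int i = 0; i < n; i++) {
        switch (symbols[i]) {
        case 0: letter += '.'; break; // S0: dit inside a letter
        case 3: letter += '-'; break; // S3: dah inside a letter
        case 1: letter += '.'; text += morse_lookup(letter); letter.clear(); break; // S1: dit ends letter
        case 4: letter += '-'; text += morse_lookup(letter); letter.clear(); break; // S4: dah ends letter
        case 2: letter += '.'; text += morse_lookup(letter); text += ' '; letter.clear(); break; // S2: dit ends word
        case 5: letter += '-'; text += morse_lookup(letter); text += ' '; letter.clear(); break; // S5: dah ends word
        case 6: break; // S6: noise spike, ignore
        }
    }
    return text;
}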
73
Mauri AG1LE
APPENDIX - C++ SOURCE CODE
#include <cmath> // for exp() and fabs()
#define Classes 7 // Symbols S0 ... S5 - 6 different classes + S6 for noise
#define NEPC 2 // Number of Examples per Class
#define Dimensions 2 // dimensions - use mark & space here - could add more features like AGC
const float Examples[Classes][NEPC][Dimensions] = {
{{0.1, 0.1}, // S0 = dit-ele
{0.08, 0.12}}, // S0 = dit-ele
{{0.1, 0.3}, // S1 = dit-chr
{0.08, 0.23}}, // S1 = dit-chr
{{0.1, 0.7}, // S2 = dit-wrd
{0.1, 0.6}}, // S2 = dit-wrd
{{0.3, 0.1}, // S3 = dah-ele
{0.34, 0.16}}, // S3 = dah-ele
{{0.34, 0.4}, // S4 = dah-chr
{0.29, 0.22 }}, // S4 = dah-chr
{{0.23, 0.54}, // S5 = dah-wrd
{0.23, 0.9}}, // S5 = dah-wrd
{{0.01, 0.01}, // S6 = noise-noise
{0.05, 0.05}} // S6 = noise-noise
};
//=======================================================================
// (C) Mauri Niininen AG1LE
// PNN() is a Probabilistic Neural Network algorithm to find best matching Symbol
// given received Mark / Space pair. Mark and Space are scaled to [0 ... 1.0] where
// Mark: standard 'dit' length is 0.1, 'dah' is 0.3
// Space: standard element space is 0.1, character space 0.3 and word space 0.7
// Examples[] contains 2 examples for each symbol class
//=======================================================================
int symbol::PNN (float mark, float space)
{
float sigma = 0.03; // SIGMA determines the Gaussian shape z = f(x,y) = exp(-((x-x0)^2+(y-y0)^2)/(2*sigma^2))
int classify = -1;
float largest = 0;
float sum [Classes];
float test_example[2];
if (fabs(mark) > 1) mark = 1;
if (fabs(space) > 1) space = 1;
test_example[0] = mark;
test_example[1] = space;
// OUTPUT layer - compute PDF for each class k
for (int k=0; k < Classes; k++)
{
sum[k] = 0;
// SUMMATION layer - accumulated PDF for each example from particular class k
for (int i=0;i < NEPC; i++)
{
float product = 0;
// PATTERN layer - squared distance between test example and stored example
for (int j=0; j < Dimensions; j++)
{
product += (test_example[j] - Examples[k][i][j])*(test_example[j] - Examples[k][i][j]);
}
product = -product/(2*(sigma * sigma));
product= exp(product);
sum[k] += product;
}
sum[k] /= NEPC;
}
// sum[k] has accumulated PDF for each class k
for (int k=0; k<Classes; k++)
{
if ( sum[k] > largest)
{
largest = sum[k];
classify = k;
}
}
learn (classify, mark, space); // learn() is defined elsewhere in the classifier code (not shown in this listing)
return classify;
}