Tuesday, May 29, 2012

FLDIGI - Analysing SOM decoder errors

In my previous blog post I shared some test results on a new alpha version of Fldigi.

The new version of the Fldigi software works pretty well, but occasionally it still generates some decoding errors. I spent some time today instrumenting the CW.CXX module to collect measurements during normal CW decoding operation.


The first place to look was how the software detects "dits" and "dahs" in the incoming signal.
The cw::handle_event() function keeps track of CW_KEYDOWN_EVENTs and CW_KEYUP_EVENTs, based on signal thresholds applied in the new cw::decode_stream() function.
This was therefore an obvious place to put a hook. I created a new function cw::histogram() that collects the current duration value after a CW_KEYUP_EVENT is detected. I placed the collection after the noise spike code, since noise spikes are not used for decoding anyway.
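The bookkeeping such a hook does can be sketched in a few lines. The sketch below is in Python rather than the C++ of CW.CXX, and the names DurationHistogram, bin_ms and noise_limit_ms are made up for illustration; it is not the actual cw::histogram() code.

```python
from collections import Counter


class DurationHistogram:
    """Collects key-down durations (ms) into fixed-width bins."""

    def __init__(self, bin_ms=5, noise_limit_ms=10):
        self.bin_ms = bin_ms                  # bin width, an assumed value
        self.noise_limit_ms = noise_limit_ms  # shorter pulses count as noise spikes
        self.bins = Counter()

    def on_keyup(self, duration_ms):
        """Record one key-down duration when the key-up event arrives."""
        if duration_ms < self.noise_limit_ms:
            return  # noise spike: not used for decoding, so not collected
        self.bins[int(duration_ms // self.bin_ms) * self.bin_ms] += 1

    def probabilities(self):
        """Return {bin_start_ms: fraction of all collected durations}."""
        total = sum(self.bins.values())
        if total == 0:
            return {}
        return {b: n / total for b, n in sorted(self.bins.items())}
```

Calling on_keyup() once per key-up event and plotting probabilities() at the end gives a timing distribution of the kind shown in figure 1.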

I recorded about 3 minutes 17 seconds of audio from the 14 MHz band with multiple CW stations having contacts. The band was quite noisy and I could see some decoder errors on the Fldigi instance I had connected to my Flex3000. I replayed the processed audio file in a Linux environment with the instrumented version of Fldigi. Figure 1 below shows the probability distribution of "dits" and "dahs". There are multiple peaks visible, and I also noticed that the automatic Morse speed tracking changed from 16 WPM to 13 WPM, corresponding to "dit" values of 75 ms and 92 ms respectively. There are also a small number of outliers in the 100 to 250 ms range.
Figure 1.  Dit & Dah timing distribution.
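The "dit" values quoted above follow from the standard PARIS timing convention, where one word is 50 dit units, so a dit lasts 1200 / WPM milliseconds. A small sketch of that arithmetic (the function names are just for illustration):

```python
def dit_ms(wpm):
    """Dit length in milliseconds (PARIS standard: 50 dit units per word)."""
    return 1200.0 / wpm


def dah_ms(wpm):
    """A dah is three dit lengths."""
    return 3.0 * dit_ms(wpm)


for wpm in (16, 13):
    print(f"{wpm} WPM: dit {dit_ms(wpm):.1f} ms, dah {dah_ms(wpm):.1f} ms")
```

This gives a 75.0 ms dit at 16 WPM and a 92.3 ms dit at 13 WPM, matching the two "dit" peaks.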

Since I was using the SOM decoder for this experiment, I decided to utilize a very nice feature built into the Best Matching Unit algorithm. Since we are calculating the Euclidean distance between incoming data vectors and codebook weight vectors, and selecting the codebook vector with the smallest distance, we can use this distance as an error metric. In other words, if the best matching unit did not really provide a good match, the distance (the diffsf variable in the find_winner() function) should be higher than normal.
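As a rough illustration of the idea, here is a minimal best-matching-unit search in Python. It is a sketch, not Fldigi's find_winner(): the function name and the toy codebook in the usage note are assumptions, and whether diffsf holds the distance or its square is not something this sketch tries to reproduce.

```python
import math


def find_best_match(vector, codebook):
    """Return (index, distance) of the codebook entry closest to vector.

    codebook is a list of weight vectors with the same length as vector.
    """
    best_index, best_dist = -1, math.inf
    for i, weights in enumerate(codebook):
        dist = math.sqrt(sum((v - w) ** 2 for v, w in zip(vector, weights)))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist
```

With a toy codebook such as [[0.33, 1.0, 0.0, 0.0], [1.0, 0.33, 0.33, 0.33]], an input that equals one entry returns a distance of 0.0, while a distorted input returns the nearest entry together with a larger distance. That second return value is the error metric.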
I plotted figure 2 to show where the SOM decoder indicated it had trouble matching the right codebook entry. Not surprisingly, for each decoded character with a high error metric (over 0.05) you can see some problem in the input vector. The mean error value was 0.0125 and the standard deviation was 0.047556. As you can see in figure 2, the maximum was 0.50672 and there are several other peaks over 0.1. Each entry on the x-axis corresponds to one detected character and the y-axis is the SOM decoder error.

Figure 2.  SOM decoder - errors over time

Since there are so many error peaks in figure 2, I started wondering if something other than just noise peaks could be causing problems for the SOM decoder. In most cases where the SOM decoder indicates a bad match, there were either additional "dit" or "dah" elements concatenated to the input vector, or some values that were not a "dit" or a "dah" according to the timing distribution shown in figure 1.

I let the audio file play again and plotted two other variables on the same timescale. Agc_peak is used, together with the threshold, to determine when the signal is considered to be going up or down; it is itself a variable with fast attack and slow decay. If agc_peak falls, the signal level is also going down, which indicates a potential signal-to-noise problem.
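A fast attack, slow decay tracker can be sketched as below. This is a generic illustration in Python, not the Fldigi code, and the attack and decay factors are assumed values.

```python
def track_peak(envelope, attack=0.5, decay=0.01):
    """Follow signal peaks: rise quickly, fall slowly.

    attack and decay are smoothing factors in (0, 1]; the values are
    illustrative assumptions, not Fldigi's constants.
    """
    peak = 0.0
    trace = []
    for level in envelope:
        rate = attack if level > peak else decay
        peak += rate * (level - peak)
        trace.append(peak)
    return trace
```

Fed with a burst of signal followed by silence, the tracked peak climbs close to the signal level within a few samples and then sinks only slowly, so a sustained drop in the trace points to real fading rather than the gaps between elements.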

Cw_adaptive_receive_threshold is a variable tracking the duration of 2 * "dit" length with a time constant defined by trackingfilter. This is the key variable determining whether a received key-down event was a "dit" or a "dah". If this variable is not able to follow the changing Morse speed, the SOM decoder could get incorrectly assigned "dit" and "dah" values in the input vector.
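To make the role of this threshold concrete, here is a hedged Python sketch. A simple exponential average stands in for trackingfilter, and the names classify_elements, initial_dit_ms and alpha are assumptions for illustration; the real update rule in Fldigi may differ.

```python
def classify_elements(durations_ms, initial_dit_ms=75.0, alpha=0.2):
    """Label key-down durations as '.' or '-' with an adaptive threshold.

    The threshold tracks 2 * dit length. alpha is the smoothing factor of a
    simple exponential average standing in for the tracking filter.
    """
    two_dits = 2.0 * initial_dit_ms
    symbols = []
    for d in durations_ms:
        if d < two_dits:
            symbols.append(".")
            estimate = 2.0 * d          # a dit is one unit, so 2 * dit
        else:
            symbols.append("-")
            estimate = 2.0 * d / 3.0    # a dah is three units
        two_dits += alpha * (estimate - two_dits)
    return "".join(symbols), two_dits
```

If the sender slows from 16 to 13 WPM, the threshold drifts upward from 150 ms toward about 185 ms. With a filter that adapts too slowly, a long dit or a short dah near the stale threshold would be labeled incorrectly.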

I normalized these 3 variables to the [0.0, 1.0] range and plotted them on the same graph to see if there are any dependencies. Figure 3 shows the SOM error rate, agc_peak and cw_adaptive_receive_threshold variables over time. The x-axis represents the time of each decoded character (sampling is done when the SOM decoder emits a character) and the y-axis is the normalized value of these variables.

Figure 3.   Error rate, Automatic Gain Control peak and CW speed over time 
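The normalization step is not spelled out above; a plain min-max scaling is one common way to bring series with different units onto the same [0.0, 1.0] axis, and it is assumed here purely for illustration.

```python
def normalize(values):
    """Min-max scale a sequence to the range [0.0, 1.0]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # flat series: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]
```

For example, normalize([2, 4, 6]) returns [0.0, 0.5, 1.0].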

Looking at figure 3, it is not obvious that the agc_peak or cw_adaptive_receive_threshold values correlate with higher SOM error rates. Interestingly, even when the agc_peak value goes down significantly, showing that the signal fades down to the S2..S3 level, the SOM error rate does not increase during these fading events.

I decided to have another look at the audio file. The original audio file recorded by PowerSDR/Flexradio 3000 sounded OK when played with Windows Media Player. I usually import audio files to the Linux environment where I have my development environment. Alsaplayer also plays the audio files OK, but for some reason Fldigi does not show a waterfall when playing files that originate from PowerSDR. I used Octave to do a format conversion:

PowerSDR format: Uncompressed 32-bit IEEE float audio, stereo, 12000 Hz sample rate
Octave format: Uncompressed 16-bit PCM audio, stereo, 8000 Hz sample rate

I copied the Octave-formatted audio file back to the Windows environment and heard what sounded like clipping (there were several CW stations sending at the same time in this sample).
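One plausible way a float to 16-bit conversion produces clipping is when float samples exceed full scale (1.0) and are converted without rescaling. The Python sketch below shows a conversion that guards against this. It is not the Octave script used here, it leaves out the 12000 Hz to 8000 Hz resampling, and the names float_to_pcm16, count_clipped and headroom are assumptions for illustration.

```python
def float_to_pcm16(samples, headroom=0.9):
    """Convert float samples to 16-bit PCM integers without clipping.

    If the peak exceeds full scale (1.0), the whole signal is scaled down so
    the peak lands at headroom. headroom is an assumed value.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    gain = headroom / peak if peak > 1.0 else 1.0
    return [int(round(s * gain * 32767)) for s in samples]


def count_clipped(pcm, limit=32767):
    """Count samples sitting at (or beyond) the 16-bit full-scale limit."""
    return sum(1 for s in pcm if abs(s) >= limit)
```

Running count_clipped() on a converted file is a quick check: a large count, especially in long runs, suggests the conversion flattened the signal peaks.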

I decimated the audio file to the same length as the measurements in the figure above (235 samples), took the absolute value and plotted the signal on the graph with the other variables. Now the culprit for these odd errors is more visible - the signal is clipping severely at x = 20..40, x = 125..130 and x = 190..210. In the Fldigi waterfall display this clipping caused some visible interference spikes.

Figure 4.  Signal clipping is causing SOM decoder error rate peaks  
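The decimation step can be sketched as follows, assuming plain picking of evenly spaced samples with no filtering; the exact method used for the plot is not described above, and the function name is made up for illustration.

```python
def decimate_abs(samples, n_points):
    """Pick n_points evenly spaced samples and return their absolute values."""
    if n_points <= 0 or not samples:
        return []
    step = len(samples) / n_points
    return [abs(samples[int(i * step)]) for i in range(n_points)]
```

Calling decimate_abs(audio, 235) yields one amplitude value per decoded character position, so the signal can share the x-axis with the other variables. Picking single samples can miss short peaks, so taking the maximum over each block would be a reasonable alternative.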


The SOM decoder has an error metric feature that is very useful in debugging problems. In this particular case I was able to track the problem down to an incorrectly converted audio file that caused clipping and artefacts on the frequency that Fldigi was decoding.

This decoding error metric could be used for other purposes as well. Instead of printing incorrectly decoded characters on the Fldigi display, we could establish an error limit, perhaps based on the mean and standard deviation of normal SOM decoding. If the error metric goes above this limit, we can either stop printing characters or show "*", like the original decoder does when it cannot decode.
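A minimal sketch of such a limit, in Python for brevity: the mean and standard deviation defaults are the values measured in this experiment, while the factor k and the function name are assumptions that would need tuning against real decoding sessions.

```python
def filter_output(chars, errors, mean=0.0125, std=0.047556, k=2.0):
    """Replace characters whose SOM error exceeds mean + k * std with '*'.

    mean and std are the values measured in this experiment; k is an
    assumed tuning factor.
    """
    limit = mean + k * std
    return "".join(c if e <= limit else "*" for c, e in zip(chars, errors))
```

With k = 2.0 the limit is about 0.108, so filter_output("CQDE", [0.01, 0.5, 0.02, 0.11]) returns "C*D*".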

For hams who want to learn to send high quality Morse code by hand, these features could provide a numerical metric of progress. A non-Gaussian "dit" / "dah" distribution indicates problems in rhythm, and a high SOM error metric indicates timing problems in patterns.

Mauri AG1LE
