In my previous blog post I described a new Tensorflow based Machine Learning model that learns Morse code from annotated audio .WAV files with 8 kHz sample rate.
In order to evaluate the performance characteristic of the decoding accuracy from noisy audio source files I created a set of training & validation materials with Signal-to-Noise Ratio from -22dB to +30 dB. Target SNR_dB was created using the following Python code:
# Desired linear SNR
SNR_linear = 10.0**(SNR_dB/10.0)
# Measure power of signal - assume zero mean
power = morsecode.var()
# Calculate required noise power for desired SNR
noise_power = power/SNR_linear
# Generate noise with calculated power (mu=0, sigma=1)
noise = np.sqrt(noise_power)*np.random.normal(0,1,len(morsecode))
# Add noise to signal
morsecode = noise + morsecode
These audio .WAV files contain random words with maximum 5 characters - 5000 samples at each SNR level with 95% used for training and 5% for validation. The Morse speed in each audio sample was randomly selected from 30 WPM or 25 WPM.
The training was performed until 5 consecutive epochs did not improve the character error rate. The duration of these training sessions varied from 15 - 45 minutes on Macbook Pro with2.2 GHz Intel Core i7 CPU.
I captured and plotted the Character Error Rate (CER) and Signal-to-Noise Ratio (SNR) of each completed training and validation session. The following graph shows that the Morse decoder performs quite well until about -12 dB SNR level and below that the decoding accuracy drops fairly dramatically.
CER vs. SNR graph |
To view how noisy these files are here are some random samples - first 4 seconds of 8KHz audio file is demodulated, filtered using 25Hz 3rd order Butterworth filter and decimated by 125 to fit into a (128,32) vector. These vectors is shown as grayscale images below:
-6 db SNR |
-11 dB SNR |
-13 dB SNR |
-16 dB SNR |
Conclusions
The Tensorflow model appears to perform quite well on decoding noisy audio files, at least when the training set and validation set have the same SNR level.
The next experiments could include more variability with a much bigger training dataset that has a combination of different SNR, Morse speed and other variables. The training duration depends on the amount of training data so it can take a while to perform these larger scale experiments on a home computer.
73 de
Mauri AG1LE
It appears to me from the above images that lower SNR seems to be repesented by lower contrast, which is true to a point, but to make this a realistic test wouldn't the noise level have to increase which would appear as a speckled image. This would be like an old TV tuned to a far away channel such that the speckling dominates the image with faint lines visible in the background. That would much more closely represent low SNR CW.
ReplyDelete> 8 kHz sample rate
ReplyDeleteThe way you generate noise is around 1dB optimistic compared to a usual SNR calculated for 500Hz-3kHz SSB audio channel. Your noise power is spread over 4kHz audio channel, which is around 3dB more noise than the SSB channel, you calculate SNR towards the average power of the CW signal, while usually SNR is calculated to the carrier power, which optimizes by +3dB.
> The Morse speed in each audio sample was randomly selected from 30 WPM or 25 WPM.
This is not very representative, but I suppose it will take an ethernity to train on your Mac laptop. I have a desktop with NVIDIA 2080RTX Ti on order, this will take the Morse decoder training to another level.
Vojtech OK1IAK