tag:blogger.com,1999:blog-33267732143291832842024-03-16T03:08:03.767-04:00Ham Radio Blog by AG1LEHam radio projects and experiments by AG1LEag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.comBlogger56125tag:blogger.com,1999:blog-3326773214329183284.post-35646159163061384682021-07-04T19:56:00.010-04:002021-07-06T19:44:14.193-04:00Cloudberry Live - listen to your rig from anywhere with a Web UI using Raspberry Pi<h2 style="text-align: left;">Overview </h2><div>I wanted a fully open source solution for listening to my radio(s) from my mobile phone and laptop over the web, using a low-cost Raspberry Pi as the rig control server. While there are many different remote station solutions out there, I could not find one that would just work with a normal web browser (Chrome, Safari, etc.) and without complicated network configuration to expose your internal WiFi network via a router. I also wanted a solution that is really easy to install on a Raspberry Pi and easy to update as new features are added to the software. </div><div><br /></div><div>I revisited the <a href="http://ag1le.blogspot.com/2016/02/kx3-remote-control-and-audio-streaming.html" target="_blank">KX3 remote control project</a> I did in Feb 2016 and started a new <b>Cloudberry Live </b>project. Cloudberry Live has several new improvements; for example, there is no need to install a Mumble client on your devices - you can listen to your radio(s) using a regular web browser. I also upgraded my <a href="https://ag1le.blogspot.com/2017/01/amazon-echo-alexa-skills-for-ham-radio.html" target="_blank">Amazon Alexa skill</a> to stream audio to Amazon Echo devices and control the frequency using voice commands. 
</div><div><br /></div><div>Here is a short demo video showing how Cloudberry.live works: </div><div><br /></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><iframe allowfullscreen="" class="BLOG_video_class" height="413" src="https://www.youtube.com/embed/cj3abdFF-xA" width="496" youtube-src-id="cj3abdFF-xA"></iframe></div><br /><div><br /></div><div><br /></div><div><br /></div><h2 style="text-align: left;">Features </h2><div style="text-align: left;"><ul style="text-align: left;"><li>Listen to your radio via web streaming from anywhere.</li><li>Web UI that works with mobile, tablet and laptop browsers (Chrome and Safari tested) </li><li>View the top 10 DX cluster spots and switch the radio to a spot's frequency with one click. </li></ul></div><div style="text-align: left;">The software is currently at the alpha stage - all the parts work as shown in the demo above but need refactoring and general clean-up. The cloudberry.live proxy service currently uses a 3rd party open source proxy provider, <a href="https://github.com/azimjohn/jprq" target="_blank">jprq</a>. My plan is to host a reverse proxy myself in order to simplify the installation process. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">The software is written using the Python Flask framework and bash scripts. Deployment to the Raspberry Pi is done with an Ansible playbook that configures the environment correctly. I am using the NGINX web server to serve the web application. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">The audio streaming portion uses the <a href="https://en.wikipedia.org/wiki/HTTP_Live_Streaming" target="_blank">HTTP Live Streaming (HLS)</a> protocol: ffmpeg streams audio from an ALSA port and encodes it in AAC format. A Python http.server on port 8000 serves the HLS traffic. I have tested that Safari and Chrome browsers are able to stream HLS audio. 
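The exact ffmpeg invocation is not shown in this post; a minimal sketch of how such an ALSA-to-HLS pipeline could be assembled is below. The device name, bitrate, and segment settings are illustrative assumptions, not values taken from the Cloudberry Live sources.

```python
def build_ffmpeg_hls_cmd(alsa_device="hw:1,0", out_dir="/var/www/hls"):
    """Build an ffmpeg command line that captures from an ALSA device
    and writes an AAC-encoded HLS playlist with rolling segments."""
    return [
        "ffmpeg",
        "-f", "alsa", "-i", alsa_device,   # capture from the ALSA port
        "-ac", "1", "-ar", "44100",        # mono, 44.1 kHz
        "-c:a", "aac", "-b:a", "64k",      # AAC audio encoding
        "-f", "hls",
        "-hls_time", "2",                  # 2-second segments
        "-hls_list_size", "5",             # keep a short rolling playlist
        "-hls_flags", "delete_segments",   # drop old segments from disk
        f"{out_dir}/stream.m3u8",
    ]
```

Running `python3 -m http.server 8000` in the output directory then serves `stream.m3u8` to HLS-capable browsers, matching the http.server setup described above.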
Chrome requires the Play HLS M3u8 extension to be installed.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The home screen is shown below. It gives you the top 10 spots and a link to open the audio streaming window. When you click a frequency link in the freq column, the server sends hamlib commands to the radio to set the frequency and mode. Only USB and LSB modes are supported in the current software version.</div><div style="text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJIasQnaY01Z8QkH0g1P6sgiiNgxs-az_eEU9v0SuaPSp_0CpQ99fx6-tStU_JSqbRHwnGx7JJcjkwV-zuojEZTcxxCU0QpFfWBmBXekOOwpg61Cdli3a9f4K1Sbg7WDQVYTwSTiqMhWc/s1189/Screen+Shot+2021-07-04+at+7.09.29+PM.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="695" data-original-width="1189" height="374" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJIasQnaY01Z8QkH0g1P6sgiiNgxs-az_eEU9v0SuaPSp_0CpQ99fx6-tStU_JSqbRHwnGx7JJcjkwV-zuojEZTcxxCU0QpFfWBmBXekOOwpg61Cdli3a9f4K1Sbg7WDQVYTwSTiqMhWc/w640-h374/Screen+Shot+2021-07-04+at+7.09.29+PM.png" width="640" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div>The Tune screen is shown below. It is still work in progress and needs some polishing. The Select Frequency control lets you enter the frequency as digits. The VFO range bar lets you change the radio frequency by dragging the green selection bar. The band selection buttons don't do anything at the moment. 
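Setting the rig from a spot click boils down to two hamlib commands. A rough sketch using hamlib's rigctld network daemon is shown below; whether Cloudberry Live talks to rigctld over TCP or uses Hamlib bindings directly is my assumption, not something stated in the post.

```python
import socket

def rig_commands(freq_hz, mode):
    # 'F <hz>' sets frequency in Hz; 'M <mode> <passband>' sets mode,
    # where passband 0 requests the rig's default filter width.
    return [f"F {freq_hz}", f"M {mode} 0"]

def set_freq_mode(freq_hz, mode, host="localhost", port=4532):
    """Send set-frequency and set-mode commands to a running rigctld
    daemon (4532 is rigctld's default TCP port). Error handling is
    omitted for brevity."""
    with socket.create_connection((host, port), timeout=5) as sock:
        fh = sock.makefile("rw")
        for cmd in rig_commands(freq_hz, mode):
            fh.write(cmd + "\n")
            fh.flush()
            fh.readline()  # rigctld replies 'RPRT 0' on success
```

For example, `set_freq_mode(14074000, "USB")` would tune a rig to 14.074 MHz upper sideband.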
</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA9avFXfgmGy1-sIfULVg2W0ro3xOiCfwppRw9RmisNku4q43FSTThzTYNpVnymyQMTLGT0GpdBk9mS0H6xlk9Y-J89FVrOI8GvcG7jMqNSH-FaMTW49XIFMOMw0KkjHQ7wzalAHs1mIU/s1189/Screen+Shot+2021-07-04+at+7.11.35+PM.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="695" data-original-width="1189" height="374" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhA9avFXfgmGy1-sIfULVg2W0ro3xOiCfwppRw9RmisNku4q43FSTThzTYNpVnymyQMTLGT0GpdBk9mS0H6xlk9Y-J89FVrOI8GvcG7jMqNSH-FaMTW49XIFMOMw0KkjHQ7wzalAHs1mIU/w640-h374/Screen+Shot+2021-07-04+at+7.11.35+PM.png" width="640" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><div><br /></div>The Configure Rig screen allows you to select your rig from the list of hamlib supported radios. I am using an ICOM IC-7300, which is currently the default setting. <div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIwYdM7pecpjrs81xM_Q8sLULu6hyphenhyphen5V_ii2TJMu5EWdDQNmctt4qGAgAfZTBtj2A0lFjzWJZYi32eW1ee7H8f5QmLD24ldo92F1FjtiNVCNTlspMl-zAgS48D7JQRgtKUjUwe61iJ-GsE/s1189/Screen+Shot+2021-07-04+at+7.15.38+PM.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="695" data-original-width="1189" height="374" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhIwYdM7pecpjrs81xM_Q8sLULu6hyphenhyphen5V_ii2TJMu5EWdDQNmctt4qGAgAfZTBtj2A0lFjzWJZYi32eW1ee7H8f5QmLD24ldo92F1FjtiNVCNTlspMl-zAgS48D7JQRgtKUjUwe61iJ-GsE/w640-h374/Screen+Shot+2021-07-04+at+7.15.38+PM.png" width="640" /></a></div><br /><div style="text-align: left;">The Search button on the menu bar lets you check a call sign against the hamdb.org database. 
A pop-up window will show the station details:</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpe1G910VTcIdYPc_0HzGlj9FCD4zIhKpyIiIJpVByO_Gm09wsmXmS19blzV3H5zWE3QZ6y_Yp7IzUF7BvjEpiptWrXZIEXD_6vW8YJ_MbGHaQUDSlWQqd5KeQ0DIpVJgJCRvtvemk6Zc/s582/Screen+Shot+2021-07-04+at+7.18.39+PM.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="77" data-original-width="582" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpe1G910VTcIdYPc_0HzGlj9FCD4zIhKpyIiIJpVByO_Gm09wsmXmS19blzV3H5zWE3QZ6y_Yp7IzUF7BvjEpiptWrXZIEXD_6vW8YJ_MbGHaQUDSlWQqd5KeQ0DIpVJgJCRvtvemk6Zc/s320/Screen+Shot+2021-07-04+at+7.18.39+PM.png" width="320" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8uottihOzcEkXxNxSKbOn_AnkWhUPVT1dvh7yZMrIr2VSe-cCwuQLorIJ8fMFIroY623FoZG91mDaqc8B6Pyy7jVBmla3gmVy-pcx91fkK8Puw9tBjmTBuzNovr9X0uaHkwEvIwrztAg/s1192/Screen+Shot+2021-07-04+at+7.20.38+PM.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="539" data-original-width="1192" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8uottihOzcEkXxNxSKbOn_AnkWhUPVT1dvh7yZMrIr2VSe-cCwuQLorIJ8fMFIroY623FoZG91mDaqc8B6Pyy7jVBmla3gmVy-pcx91fkK8Puw9tBjmTBuzNovr9X0uaHkwEvIwrztAg/s320/Screen+Shot+2021-07-04+at+7.20.38+PM.png" width="320" /></a></div><br /><div style="text-align: left;"><br /></div><div style="text-align: left;"><br /></div><h2 style="text-align: left;">Amazon Alexa Skill</h2><div style="text-align: left;">I created a new Alexa Skill, Cloudberry Live (not published yet), that uses the web API interface to select the frequency based on DX cluster spots and HLS streaming to listen to your radio. 
While the skill currently uses only my station, my goal is to implement some sort of registration process so that Alexa users would have more choice in listening to ham radio traffic from DX stations around the world using the Cloudberry.live software. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">This would also give people with disabilities an opportunity to enjoy listening to the HF bands using voice-controlled, low cost ($20 - $35) smart speakers. By keeping your radio (Raspberry Pi server) online you could help grow the ham community. </div><div style="text-align: left;"><br /></div><h2>Installation</h2>I have posted the software to Github in a private repo. The software will have the following key features:<br /><ul style="text-align: left;"><li>One-step software installation to Raspberry Pi using <a href="https://www.ansible.com/" target="_blank">Ansible</a> playbooks.</li></ul><ul style="text-align: left;"><li>Configure your radio using Hamlib </li></ul><ul style="text-align: left;"><li>Get your personalized Cloudberry.live weblink</li></ul>I have been developing cloudberry.live on my Macbook Pro and pushing new versions to the Raspberry Pi server downstairs where my IC-7300 is located. A typical Ansible playbook update takes about 32 seconds (this includes restarting the services). I can see the access and error logs on the server using SSH consoles - this makes debugging quite easy. <p></p><h2 style="text-align: left;">Questions? </h2><div>I am looking for collaborators to work with me on this project. If you are interested in open source web development using the Python Flask framework, let me know by posting a comment below. 
</div><div><br /></div><div><br /></div><div>73 de </div><div>Mauri AG1LE</div></div><div><br /></div><br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com4tag:blogger.com,1999:blog-3326773214329183284.post-38905571144503215422021-03-31T17:07:00.005-04:002023-06-03T15:34:45.460-04:00New exciting Digital Mode CT8 for ham radio communications <p style="text-align: right;">April 1, 2021 </p><h3 style="text-align: left;">Overview </h3><p>CT8 is an exciting new digital mode designed for interactive ham radio communication where signals may be weak and fading, and openings may be short. </p><p>A beta release of CT8 offers sensitivity down to –48 dB on the AWGN channel, and DX contacts at 4 times the distance of FT8. An auto-sequencing feature offers the option to respond automatically to the first decoded reply to your CQ. </p><p>The best part of this new mode is that it is easy to learn to decode in your head, so no decoder software is needed. Alpha users of the CT8 mode report that learning to decode CT8 is ten times easier than learning Morse code. For those who would rather use a computer, open source Tensorflow based Machine Learning decoder software is included in this beta release. </p><p>CT8 is based on a novel avian vocalization encoding scheme. The character combinations were designed to be very easily recognizable, to leverage existing QSO practices from communication modes like CW. </p><p>Below is an example audio clip showing how to establish a CT8 contact - the message format should be familiar to anybody who has listened to Morse code on the ham radio bands before. </p><p>
</p><figure>
<figcaption>Listen to the "<a href="https://zappa-deepmorse.s3.amazonaws.com/cqag1le.mp3" target="_blank">CQ CQ DE AG1LE K</a>" - the audio has rich syllabic tonal and harmonic features that are very easy to recognize even under noisy band conditions. </figcaption></figure><div><br /></div>Fig 1. below shows the corresponding spectrogram. Notice the harmonic spectral features that ensure accurate symbol decoding and provide high sensitivity and tolerance against rapid fading, flutter and QRM.<br /><p></p><table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzJL7q6bvD0k-p2IUyeMohs47qqoHKUfshDsrkUgKumNZ-5RCRFQgNUMSy-f-ulqxXzKaSsT-QSwrOm82tL_kGGt-hw-n_C6i4u8TkAhi39IWGCKHinNySnJaCdHukIknQcoysOsXdqUg/s1586/Screen+Shot+2021-01-17+at+11.50.44+AM.png" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="486" data-original-width="1586" height="195" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjzJL7q6bvD0k-p2IUyeMohs47qqoHKUfshDsrkUgKumNZ-5RCRFQgNUMSy-f-ulqxXzKaSsT-QSwrOm82tL_kGGt-hw-n_C6i4u8TkAhi39IWGCKHinNySnJaCdHukIknQcoysOsXdqUg/w640-h195/Screen+Shot+2021-01-17+at+11.50.44+AM.png" title="Fig 1. CT8 spectrogram" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 1. CT8 spectrogram - CQ CQ CQ DE AG1LE K</td></tr></tbody></table><br /><p><br /></p><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;"><br /></h3><p style="text-align: left;"><br /><br /></p><p style="text-align: left;">The audio clip sample may sound a bit like a chicken. This is actually a key feature of avian vocalization encoding. 
</p><h3 style="text-align: left;">Scientific Background </h3><p style="text-align: left;">The idea behind the CT8 mode is not new. A lot of research has been done on avian vocalizations over the past hundred years. Since the late 1990s, digital signal processing software has become widely available, and vocal signals can be analyzed using sonograms and spectrograms on a personal computer.</p><p style="text-align: left;">In research article [1], Dr. Nicholas Collias described sound spectrograms of 21 of the 26 vocal signals in the extensive vocal repertoire of the African Village Weaver (Ploceus cucullatus). A spectrographic key to vocal signals helps make these signals comparable for different investigators. Short-distance contact calls are given in favorable situations and are generally characterized by low amplitude and great brevity of notes. Alarm cries are longer, louder, and often strident calls with much energy at high frequencies, whereas threat notes, also relatively long and harsh, emphasize lower frequencies. </p><p style="text-align: left;">In a very interesting research article [2], Kevin G. McCracken and Frederick H. Sheldon conclude that the characters most subject to ecological convergence, and thus of least phylogenetic value, are first peak-energy frequency and frequency range, because sound penetration through vegetation depends largely on frequency. The most phylogenetically informative characters are number of syllables, syllable structure, and fundamental frequency, because these are more reflective of behavior and syringeal structure. The figure below gives details about heron phylogeny, corresponding spectrograms, vocal characters, and habitat distributions. 
</p><div class="separator" style="clear: both; text-align: center;"><a href="https://www.pnas.org/cms/10.1073/pnas.94.8.3833/asset/76e00413-4ff6-4e39-bc06-538147d0c20a/assets/graphic/pq0870336001.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="578" data-original-width="800" height="578" src="https://www.pnas.org/cms/10.1073/pnas.94.8.3833/asset/76e00413-4ff6-4e39-bc06-538147d0c20a/assets/graphic/pq0870336001.jpeg" width="800" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div><br /><p style="text-align: left;">Habitat distributions suggest that avian species that inhabit open areas such as savannas, grasslands, and open marshes have higher peak-energy (J) frequencies (kHz) and broader frequency ranges (kHz) than do taxa inhabiting closed habitats such as forests. Number of syllables is the number most frequently produced. </p><p style="text-align: left;">Ibises, tiger-herons, and boat-billed herons emit a rapid series of similar syllables; other heron vocalizations generally consist of singlets, doublets, or triplets. Syllabic structure may be tonal (i.e., pure whistled notes) or harmonic (i.e., possessing overtones; integral multiples of the base frequency). Fundamental frequency (kHz) is the base frequency of a syllable and is a function of syringeal morphology. </p><p style="text-align: left;">These vocalization features can be used for training modern machine learning algorithms. In fact, in a series of studies published [3] between 2014 and 2016, Georgia Tech research engineer Wayne Daley and his colleagues exposed groups of six to 12 broiler chickens to moderately stressful situations—such as high temperatures, increased ammonia levels in the air and mild viral infections—and recorded their vocalizations with standard USB microphones. 
They then fed the audio into a machine learning program, training it to recognize the difference between the sounds of contented and distressed birds. According to the Scientific American article [4], Carolynn “K-lynn” Smith, a biologist at Macquarie University in Australia and a leading expert on chicken vocalizations, says that although the studies published so far are small and preliminary, they are “a neat proof of concept” and “a really fascinating approach.”</p><h3 style="text-align: left;">What does CT8 stand for? </h3><p>Building on this solid scientific foundation, it is easy to imagine very effective communication protocols that are based on millions of years of evolution of various avian species. After all, birds are social animals and have very expressive and effective communication protocols, whether to warn others about an approaching predator or to invite flock members to join a feast on a corn field. </p><p>Humans have domesticated several avian species and have been living with species like <a href="https://en.wikipedia.org/wiki/Chicken" target="_blank">chicken (Gallus gallus domesticus) for over 8000 years</a>. Therefore the CT8 mode sounds inherently natural to humans and, based on extensive alpha testing performed by the development team, it is much easier to learn to decode than Morse code. 
</p><div class="separator" style="clear: both; text-align: center;"><a href="https://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Icelandic_Settler_Chicken.jpg/1280px-Icelandic_Settler_Chicken.jpg" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="769" data-original-width="800" height="384" src="https://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Icelandic_Settler_Chicken.jpg/1280px-Icelandic_Settler_Chicken.jpg" width="400" /></a></div><p><br /></p><p>CT8 stands for "Chicken Talk" version 8 -- after over a year of development effort, seven previous encoding versions tested under difficult band conditions, and hundreds of Machine Learning models trained, the software development team has finally been able to release the CT8 digital mode. </p><h3 style="text-align: left;">Encoding Scheme </h3><div>From a ham radio perspective, the frequency range of these avian vocalizations is below 4 kHz in most cases. This makes it possible to use existing SSB or FM transceivers without any modifications, other than perhaps adjustment of the filter bandwidth available in modern rigs. The audio sampling rate used in this project was 8 kHz, so the original audio source files were re-sampled using a Linux command line tool: <br /><br /></div><div><p class="p1" style="font-family: Menlo; font-size: 11px; line-height: normal; margin: 0px;">sox -b16 -c 1 input.wav output.wav rate 8000</p></div><p>The encoding scheme for the CT8 mode was created by collecting various free audio sources of chicken sounds and carefully assembling vowels, plosives, fricatives and nasals using <a href="https://home.cc.umanitoba.ca/~krussll/phonetics/acoustic/spectrogram-sounds.html" target="_blank">this resource</a> as the model. 
Free open source cross-platform audio software <a href="https://www.audacityteam.org/download/" target="_blank">Audacity</a> was used to extract vocalizations using the spectrogram view and to create labeled audio files.</p><p>Figure 3. below shows a sample audio file with assigned character labels. </p><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left;"><tbody><tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgc_gOwv60VEq0jWIojeQ4FNKE7I9g9RZzShxepkWCqWOOpZwu0QsaFTvhf18L-ocmH9ESosC7fQ4hCEk7Z_YhCIYur8PrUzXeMAlIgmNeSs3nnPEwiD28n8dxc1h4rMxn9BQnXBXwCfME/s1732/Screen+Shot+2021-01-17+at+12.29.39+PM.png" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="994" data-original-width="1732" height="368" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgc_gOwv60VEq0jWIojeQ4FNKE7I9g9RZzShxepkWCqWOOpZwu0QsaFTvhf18L-ocmH9ESosC7fQ4hCEk7Z_YhCIYur8PrUzXeMAlIgmNeSs3nnPEwiD28n8dxc1h4rMxn9BQnXBXwCfME/w640-h368/Screen+Shot+2021-01-17+at+12.29.39+PM.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Fig 3. Labeled vocalizations using Audacity software</td></tr></tbody></table><br /><p><br /></p><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3><br /></h3><h3>CT8 Software</h3><p>The encoder software is written in C++ and Python and runs on Windows, OSX, and Linux. The sample decoder will be made available on Github as open source software if there is enough interest in this novel communication mode from the ham radio community. </p><p>For the CT8 decoder, a <a href="https://ag1le.blogspot.com/2019/02/training-computer-to-listen-and-decode.html" target="_blank">Machine Learning based decoder</a> software was built on top of the open source Tensorflow framework. 
The decoder was trained on short 4-second audio clips; in the experiments a character error rate of 0.1% and a word accuracy of 99.5% were achieved. With more real-world training material the ML model is expected to achieve even better decoding accuracy. </p><h3>Future Enhancements</h3><p>CT8 opens a new era for ham radio communication protocol development using <a href="https://en.wikipedia.org/wiki/Biomimetics" target="_blank">biomimetics</a> principles. Adding new phonemes using the principles of ecological signals, as described in article [2], could open up things like a "DX mode" for long distance communication. For example, the vocalizations of Cetaceans (whales) could also be used to build a new phoneme map for DX contacts - some of the lowest frequency whale sounds can travel <a href="https://journeynorth.org/tm/hwhale/SingingHumpback.html" target="_blank">through the ocean as far as 10,000 miles </a>without losing their energy. </p><p><br /></p><p>73 de AG1LE </p><p><br /></p><p>PS. If you made it down here, I hope that you enjoyed this figment of my imagination and I wish you a very happy April 1st.</p><p><br /></p><h3 style="text-align: left;">References</h3><p>[1] Nicholas E. Collias, <a href="https://www.researchgate.net/publication/232677880_Vocal_Signals_of_the_Village_Weaver_A_Spectrographic_Key_and_the_Communication_Code" target="_blank">Vocal Signals of the Village Weaver: A Spectrographic Key and the Communication Code</a></p><p>[2] Kevin G. McCracken and Frederick H. 
Sheldon, <a href="https://www.pnas.org/content/94/8/3833" target="_blank">Avian vocalizations and phylogenetic signal</a></p><p>[3] Wayne Daley et al., <a href="https://www.researchgate.net/publication/316903091_Identifying_rale_sounds_in_chickens_using_audio_signals_for_early_disease_detection_in_poultry" target="_blank">Identifying rale sounds in chickens using audio signals for early disease detection in poultry</a></p><p>[4] Scientific American, Ferris Jabr, <a href="https://www.scientificamerican.com/article/fowl-language-ai-decodes-the-nuances-of-chicken-ldquo-speech-rdquo/" target="_blank">Fowl Language: AI Decodes the Nuances of Chicken “Speech”</a></p><div><br /></div>ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com3tag:blogger.com,1999:blog-3326773214329183284.post-66771601327662610622020-04-12T09:20:00.002-04:002020-04-12T14:09:44.335-04:00New real-time deep learning Morse decoder<br />
<h3>
Introduction</h3>
I have done some experiments with deep learning models previously. <a href="http://ag1le.blogspot.com/2019/02/training-computer-to-listen-and-decode.html" target="_blank">This previous blog post </a> covers the approach of building a Morse decoder by training a CNN-LSTM-CTC model on audio that is converted to small image frames.<br />
<br />
In this latest experiment I trained a new Tensorflow based CNN-LSTM-CTC model using a 27.8-hour Morse audio training set (25,000 WAV files - each clip 4 seconds) and achieved a character error rate of 1.5% and a word accuracy of 97.2% after 2:29:19 of training time. The training data corpus was created from <a href="http://www.arrl.org/code-practice-files" target="_blank">ARRL Morse code practice files</a> (text files).<br />
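Character error rate is conventionally computed as the Levenshtein (edit) distance between the reference text and the decoded text, normalized by the reference length. A minimal illustration is below; this is not the evaluation code used for the model, just the standard dynamic-programming formulation.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insertions, deletions,
    substitutions), computed row by row to keep memory at O(len(b))."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,        # deletion
                           cur[j - 1] + 1,     # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def char_error_rate(reference, hypothesis):
    # edit distance normalized by the reference length
    return levenshtein(reference, hypothesis) / max(len(reference), 1)
```

For instance, a single wrong character in a 100-character reference gives a CER of 1%.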
<br />
<h3>
New real-time deep learning Morse decoder</h3>
I wanted to see if this new model is capable of decoding audio in real time, so I wrote a simple Python script to listen to the microphone, create a spectrogram, detect the CW frequency automatically, and feed 128 x 32 images to the model to perform the decoding inference.<br />
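The audio-to-image step can be sketched roughly as below. The STFT parameters, normalization, and nearest-neighbour resize are illustrative assumptions for this post, not the exact values used in MorseDecoder.py.

```python
import numpy as np

def audio_to_frame(samples, nfft=256, hop=64, out_shape=(32, 128)):
    """Turn a mono audio buffer (numpy array) into a small normalized
    spectrogram image like the 128x32 input the decoder model expects."""
    # short-time FFT magnitudes over Hann-windowed slices
    windows = [samples[i:i + nfft] * np.hanning(nfft)
               for i in range(0, len(samples) - nfft, hop)]
    spec = np.abs(np.fft.rfft(windows, axis=1)).T   # (freq_bins, time_steps)
    img = np.log1p(spec)                            # compress dynamic range
    img = (img - img.min()) / max(img.max() - img.min(), 1e-10)
    # crude nearest-neighbour resize to (height, width) = out_shape
    h, w = out_shape
    rows = np.arange(h) * img.shape[0] // h
    cols = np.arange(w) * img.shape[1] // w
    return img[np.ix_(rows, cols)]
```

A 4-second clip at 8 kHz (32,000 samples) then yields one 32 x 128 frame for the infer_image() call.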
<br />
With some tuning of the various components and parameters I was able to put together a working prototype using standard Python libraries and the Tensorflow Morse decoder that is available as <a href="https://github.com/ag1le/LSTM_morse/blob/master/MorseDecoder.py" target="_blank">open source in Github</a>.<br />
<br />
I recorded this sample YouTube video below in order to document this experiment.<br />
<br />
Starting from the top left, I have a <a href="http://www.w1hkj.com/files/fldigi/" target="_blank">FLDIGI</a> window open decoding CW at 30 WPM speed. In the top middle I have a console window printing the frame number and CW tone frequency, followed by "infer_image:", the decoded text, and the probability that the model assigns to this result.<br />
<br />
On the top right I have the <a href="https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.specgram.html" target="_blank">Spectrogram window</a> that plots 4 seconds of the audio on a frequency scale. The Morse code is quite readable on this graph.<br />
<br />
On the bottom left I have <a href="https://www.audacityteam.org/download/" target="_blank">Audacity</a> playing a sample <a href="http://www.arrl.org/files/file/Morse/Archive/30%20WPM/170103_30WPM.mp3" target="_blank">30 WPM practice file</a> from ARRL. Finally, on the bottom right I have the 128x32 image frame that I am feeding to the model.<br />
<br />
<br />
<iframe allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="315" src="https://www.youtube.com/embed/rUqpT7Q-mSs" width="560"></iframe>
<br />
<br />
<br />
<h3>
Analysis</h3>
The full text at 30 WPM is here - I have highlighted the text section that is playing in the above video clip.<br />
<br />
<pre style="overflow-wrap: break-word; white-space: pre-wrap;">� NOW 30 WPM � TEXT IS FROM JULY 2015 QST PAGE 99 �
AGREEMENT WITH SOUTHCOM GRANTED ATLAS ACCESS TO THE SC 130S TECHNOLOGY.
THE ATLAS 180 ADAPTED THE MAN PACK RADIOS DESIGN FOR AMATEUR USE. AN
ANALOG VFO FOR THE 160, 80, 40, AND 20 METER BANDS REPLACED THE SC 130S
STEP TUNED 2 12 MHZ SYNTHESIZER. OUTPUT POWER INCREASED FROM 20 W TO 100
W. AMONG THE 180S CHARMS <span style="background-color: yellow;">WAS ITS SIZE. IT MEASURED 9R5 X 9R5 X 3 INCHES.</span></pre>
<pre style="overflow-wrap: break-word; white-space: pre-wrap;"><span style="background-color: yellow;">THATS NOTHING SPECIAL TODAY, BUT IT WAS A TINY RIG IN 1974. THE FULLY
SOLID STATE TRANSCEIVER FEATURED NO TUNE OPERATION. THE VFOS 350 KHZ RANGE
REQUIRED TWO BAND</span> SWITCH SEGMENTS TO COVER 75/80 METERS, BUT WAS AMPLE FOR
THE OTHER BANDS. IN ORDER TO IMPROVE IMMUNITY TO OVERLOAD AND CROSS
</pre>
<pre style="overflow-wrap: break-word; white-space: pre-wrap;">MODULATION, THE 180S RECEIVER HAD NO RF AMPLIFIER STAGE THE ANTENNA INPUT
CIRCUIT FED THE RADIOS MIXER DIRECTLY. A PAIR OF SUCCESSORS EARLY IN 1975,
ATLAS INTRODUCED THE 180S SUCCESSOR IN REALITY, A PAIR OF THEM. THE NEW
210 COVERED 80 10 METERS, WHILE THE OTHERWISE IDENTICAL 215 COVERED 160 15
METERS HEREAFTER, WHEN THE 210 SERIES IS MENTIONED, THE 215 IS ALSO
IMPLIED. BECAUSE THE 210 USED THE SAME VFO AND BAND SWITCH AS THE 180,
SQUEEZING IN FIVE BANDS SACRIFICED PART OF 80 METERS. THAT BAND STARTED AT
� END OF 30 WPM TEXT � QST DE W1AW �</pre>
<br />
As can be seen from the YouTube video, FLDIGI is able to copy this CW quite well. The new deep learning Morse decoder is also able to decode the audio, with probabilities ranging from 4% to over 90% during this period.<br />
<br />
It has visible problems when the current image frame cuts a Morse character into parts. The scrolling 128x32 image produced from the spectrogram graph does not have any smarts - it is just copied at every update cycle and fed into the infer_image() function. This means that as a single Morse character moves out of the frame, part of the character can still be visible, causing incorrect decodes.<br />
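One possible mitigation (not implemented in this prototype) would be to cut the scrolling window at a low-energy spectrogram column, so that a frame ends in a gap between symbols rather than mid-character. The threshold value here is an arbitrary assumption for illustration.

```python
import numpy as np

def find_gap_column(frame, threshold=0.1):
    """Return the right-most low-energy column index of a normalized
    spectrogram frame - a candidate point to cut the sliding window so
    a Morse character is not split across two frames. If no column is
    quiet enough, return the frame width (i.e., cut at the edge)."""
    energy = frame.mean(axis=0)                 # per-column mean energy
    quiet = np.flatnonzero(energy < threshold)  # indices of quiet columns
    return int(quiet[-1]) if quiet.size else frame.shape[1]
```

The window update loop could then advance to the detected gap instead of sliding by a fixed amount.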
<br />
The decoder also has problems with some numbers, even when they are fully visible in the 128x32 image frame. In the ARRL training material that I used to build the corpus, about 8.6% of the words are numbers (such as bands, frequencies and years). I believe that the current model doesn't have enough examples to decode all the numbers correctly.<br />
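A corpus statistic like the 8.6% figure can be measured along these lines; treating any token containing a digit as a "number" is a simplification I am assuming here, not necessarily how the original count was done.

```python
def numeric_word_fraction(text):
    """Fraction of whitespace-separated tokens that contain a digit."""
    words = text.split()
    numeric = [w for w in words if any(ch.isdigit() for ch in w)]
    return len(numeric) / max(len(words), 1)
```

Running this over the ARRL practice-file corpus would show how under-represented digits are relative to letters in the training data.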
<br />
The final problem is the lack of spaces between words. The current model doesn't know about the "Space" character, so it just decodes what it has been trained on.<br />
<br />
<br />
<h3>
Software</h3>
The Python script running the model is quite simple and is listed below. I adapted the main spectrogram loop from <a href="https://github.com/ayared/Live-Specgram" target="_blank">this Github repo</a>. I used the following constants in mic_read.py.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">RATE = 8000</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">FORMAT = pyaudio.paInt16 #conversion format for PyAudio stream</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">CHANNELS = 1 #microphone audio channels</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">CHUNK_SIZE = 8192 #number of samples to take per read</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">SAMPLE_LENGTH = int(CHUNK_SIZE*1000/RATE) #length of each sample in ms</span><br />
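With these values each PyAudio read delivers about one second of audio. A quick standalone sanity check of that arithmetic (no PyAudio needed):

```python
# Sanity-check the timing implied by the mic_read.py constants above.
RATE = 8000          # samples per second
CHUNK_SIZE = 8192    # samples per read
SAMPLE_LENGTH = int(CHUNK_SIZE * 1000 / RATE)  # length of each read in ms
print(SAMPLE_LENGTH)  # 1024 ms, i.e. roughly one second of audio per chunk
```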
<br />
<br />
specgram.py<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">Created by Mauri Niininen (AG1LE)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">Real time Morse decoder using CNN-LSTM-CTC Tensorflow model</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">adapted from https://github.com/ayared/Live-Specgram</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">############### Import Libraries ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">from matplotlib.mlab import specgram</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">import matplotlib.pyplot as plt</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">import matplotlib.animation as animation</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">import numpy as np</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">import cv2</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">############### Import Modules ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">import mic_read</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">from morse.MorseDecoder import Config, Model, Batch, DecoderType</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">############### Constants ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">SAMPLES_PER_FRAME = 4 #Number of mic reads concatenated within a single window</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">nfft = 256 # NFFT value for spectrogram</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">overlap = nfft-56 # overlap value for spectrogram</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">rate = mic_read.RATE #sampling rate</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">############### Call Morse decoder ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">def infer_image(model, img):</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if img.shape == (128, 32):</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> batch = Batch(None, [img])</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> (recognized, probability) = model.inferBatch(batch, True)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return img, recognized, probability</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> else:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> print(f"ERROR: img shape:{img.shape}")</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return img, "", 0.0 # return a safe default so the caller can always unpack</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"># Load the Tensorlow model </span><br />
<span style="font-family: "courier new" , "courier" , monospace;">config = Config('model.yaml')</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">model = Model(open("morseCharList.txt").read(), config, decoderType = DecoderType.BestPath, mustRestore=True)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">stream,pa = mic_read.open_mic()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">############### Functions ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">get_sample:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">gets the audio data from the microphone</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">inputs: audio stream and PyAudio object</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">outputs: int16 array</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">def get_sample(stream,pa):</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> data = mic_read.get_data(stream,pa)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return data</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">get_specgram:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">takes the FFT to create a spectrogram of the given audio signal</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">input: audio signal, sampling rate</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">output: 2D Spectrogram Array, Frequency Array, Bin Array</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">see matplotlib.mlab.specgram documentation for help</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">def get_specgram(signal,rate):</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> arr2D,freqs,bins = specgram(signal,window=np.blackman(nfft), </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> Fs=rate, NFFT=nfft, noverlap=overlap,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> pad_to=32*nfft )</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return arr2D,freqs,bins</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">update_fig:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">updates the image, just adds on samples at the start until the maximum size is</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">reached, at which point it 'scrolls' horizontally by determining how much of the</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">data needs to stay, shifting it left, and appending the new data. </span><br />
<span style="font-family: "courier new" , "courier" , monospace;">inputs: iteration number</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">outputs: updated image</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">"""</span><br />
<span style="font-family: "courier new" , "courier" , monospace;">def update_fig(n):</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> data = get_sample(stream,pa)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> arr2D,freqs,bins = get_specgram(data,rate)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im_data = im.get_array()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if n < SAMPLES_PER_FRAME:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im_data = np.hstack((im_data,arr2D))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im.set_array(im_data)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> else:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> keep_block = arr2D.shape[1]*(SAMPLES_PER_FRAME - 1)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im_data = np.delete(im_data,np.s_[:-keep_block],1)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im_data = np.hstack((im_data,arr2D))</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im.set_array(im_data)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> # Get the image data array shape (Freq bins, Time Steps)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> shape = im_data.shape</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> # Find the CW spectrum peak - look across all time steps</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> f = int(np.argmax(im_data[:])/shape[1])</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> # Create a 32x128 array centered to spectrum peak </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if f > 16: </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> print(f"n:{n} f:{f}")</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> img = cv2.resize(im_data[f-16:f+16, :], (128,32)) # 32 freq bins around the peak, resized to 128x32</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if img.shape == (32,128):</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> cv2.imwrite("dummy.png",img)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> img = cv2.transpose(img)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> img, recognized, probability = infer_image(model, img)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> if probability > 0.0000001:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> print(f"infer_image:{recognized} prob:{probability}")</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> return im,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">def main():</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> global im</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> ############### Initialize Plot ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> fig = plt.figure()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> """</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> Launch the stream and the original spectrogram</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> """</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> stream,pa = mic_read.open_mic()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> data = get_sample(stream,pa)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> arr2D,freqs,bins = get_specgram(data,rate)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> """</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> Set up the plot parameters</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> """</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> extent = (bins[0],bins[-1]*SAMPLES_PER_FRAME,freqs[-1],freqs[0])</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> im = plt.imshow(arr2D,aspect='auto',extent = extent,interpolation="none",</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> cmap = 'Greys',norm = None) </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> plt.xlabel('Time (s)')</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> plt.ylabel('Frequency (Hz)')</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> plt.title('Real Time Spectrogram')</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> plt.gca().invert_yaxis()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> #plt.colorbar() #enable if you want to display a color bar</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> ############### Animate ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> anim = animation.FuncAnimation(fig,update_fig,blit = True,</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> interval=mic_read.CHUNK_SIZE/1000)</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> try:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> plt.show()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> except:</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> print("Plot Closed")</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;"> ############### Terminate ###############</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> stream.stop_stream()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> stream.close()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> pa.terminate()</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> print("Program Terminated")</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace;">if __name__ == "__main__":</span><br />
<span style="font-family: "courier new" , "courier" , monospace;"> main()</span><br />
<br />
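For those wondering how the 128x32 decoder input relates to the settings above: with nfft = 256 and a 56-sample hop (overlap = nfft - 56), each 8192-sample chunk yields roughly 142 spectrogram columns, which the code then resizes into the 128-wide model input. A standalone check of that framing arithmetic, assuming matplotlib's windowing convention (exact counts may differ by one in other libraries):

```python
# Spectrogram framing implied by the nfft/overlap/CHUNK_SIZE constants in specgram.py.
NFFT = 256
OVERLAP = NFFT - 56          # windows overlap by 200 samples
CHUNK_SIZE = 8192
hop = NFFT - OVERLAP         # 56 samples between successive windows
n_columns = (CHUNK_SIZE - NFFT) // hop + 1
print(n_columns)  # 142 time steps per one-second chunk
```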
I ran this experiment on a MacBook Pro (2.2 GHz Quad-Core Intel Core i7) running macOS Catalina 10.15.3. The Python version used was Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 05:52:31) [GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin.<br />
<h3>
Conclusions</h3>
This experiment demonstrates that it is possible to build a working real time Morse decoder based on a deep learning TensorFlow model, even using a slow interpreted language like Python. The approach taken here is quite simplistic and lacks some key functionality, such as alignment of the decoded text to the audio timeline.<br />
<br />
It also shows that there is still more work to do to build a fully functioning, open source, high performance Morse decoder. A better event driven software architecture would allow building a proper user interface with controls such as audio filtering. Such an architecture would also enable server side decoders running on audio feeds from WebSDR receivers and the like.<br />
<br />
Finally, the TensorFlow model in this experiment has a very small training set, only 27.8 hours of audio. Commercial ASR (automatic speech recognition) engines, by comparison, have been trained on over 1000X more labeled audio material. To get better performance from deep learning models you need a lot of high quality labeled training material that matches the typical sound environment the model will be used in.<br />
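To put that gap in perspective, the arithmetic on the numbers above is simple:

```python
# Rough scale comparison between this model's training set and commercial ASR engines.
morse_hours = 27.8
asr_hours = morse_hours * 1000       # "over 1000X more" labeled audio
print(asr_hours)                     # 27800.0 hours
print(asr_hours / 24 / 365)          # roughly 3.2 years of continuous audio
```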
<br />
<br />
73<br />
Mauri AG1LE<br />
<br />
<br />
<br />
<br />
<br /><h2>DeepMorse - Web based tool for CW Puzzles and Training</h2><div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjabq9ZWsJF9bCPcZy-1pGSRoEtuOCmE1vygUrvM0cOxOSakKjsTro70SiKgQdw1mTuY97JCLWpB_mmanDkQRGgPZIldkT6IBZ0UV8oasqxS6M8MeL_vneQDemmDYkbtj1ZE2ptP8na5Qs/s1600/Screen+Shot+2019-07-13+at+5.20.49+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a></div>
<br />
<h3>
Introduction </h3>
<br />
I recently started working on a new project. The idea behind this "DeepMorse" project is to create a web site that contains curated Morse code audio clips. The website would allow subscribers to upload annotated CW audio clips (MP3, WAV, etc.) along with associated metadata.<br />
<br />
As a subscriber you would be able to provide the story behind the clip as well as commentary or even photos. After uploading, the site would show a graphical view of the audio clip, much like a modern Software Defined Radio (SDR), and users would be able to play back the audio and see the metadata.<br />
<br />
Since the site would contain "real world" recordings, including some really difficult to copy audio clips, it would also provide the ultimate test of your CW copying skills. The system would score your copying accuracy before revealing the annotated "ground truth" of the audio. You could compete for the top scores with all the other CW aficionados.<br />
<br />
The site could also be used to share historical records of curated Morse code audio with the ham radio community. For CW newbies the site would offer a treasure trove of training material for when you get tired of listening to the ARRL Morse practice MP3 files. Experienced CW operators could share some of their best moments on their favorite operating mode, teaching newbies how to catch the "big fish".<br />
<br />
<h3>
User Interface</h3>
I wanted to experiment with combining audio and a graphical waveform view of the audio, giving the user the ability to listen, scroll back and re-listen, as well as zoom into the waveform.<br />
<br />
Part of the user interface is also a free text form where the user can enter the text they heard in the audio clip. By pressing the "Check" button the system calculates the accuracy compared to the "ground truth" text. The system uses the <a href="https://github.com/luozhouyang/python-string-similarity#normalized-levenshtein" target="_blank">normalized Levenshtein method</a> to express accuracy as a percentage (0...100%), where 100% is a perfect copy.<br />
<br />
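A minimal sketch of that scoring, assuming plain Levenshtein edit distance normalized by the longer string; the site may instead use the python-string-similarity library linked above:

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def accuracy(truth: str, entered: str) -> float:
    # Normalized Levenshtein similarity as a percentage (0...100).
    longest = max(len(truth), len(entered)) or 1
    return 100.0 * (1 - levenshtein(truth, entered) / longest)

print(accuracy("CQ CQ DE AG1LE", "CQ CQ DE AG1LE"))  # 100.0
print(accuracy("PARIS", "PARIX"))                    # 80.0
```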
Figure 1. below shows the main listening view.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2dqXvqdnFD1MBIpC4-UiahLTEiyJMl14j_LugtN-OCE3IgqjO3-DkLwzrRd8edZyElkCQSMx7dN_X3PWk8SYrPJurV9IuXRFu2wog36XItXpU2ne1GJsK97yR7ZM3LnARK5teNQhPeEA/s1600/Screen+Shot+2019-07-13+at+5.21.08+PM.png" imageanchor="1" style="clear: left; display: inline !important; margin-bottom: 1em; margin-left: auto; margin-right: auto; text-align: center;"><img border="0" data-original-height="495" data-original-width="1185" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi2dqXvqdnFD1MBIpC4-UiahLTEiyJMl14j_LugtN-OCE3IgqjO3-DkLwzrRd8edZyElkCQSMx7dN_X3PWk8SYrPJurV9IuXRFu2wog36XItXpU2ne1GJsK97yR7ZM3LnARK5teNQhPeEA/s640/Screen+Shot+2019-07-13+at+5.21.08+PM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. DeepMorse User Interface</td></tr>
</tbody></table>
<br />
<br />
<h3>
Architecture</h3>
<div>
I wrote this web application using the Python Django web framework, and it took only a few nights to get the basic structure together. The website runs in AWS using serverless Lambda functions and a serverless Aurora RDS MySQL database. The audio files are stored in an S3 bucket.<br />
<br />
Using a serverless database backend sounds like an oxymoron, since there is still a database server, managed by AWS. It also brings some challenges, such as a slow "cold start" that is visible to end users. When you click the "Puzzles" menu you normally get this view (see Figure 2. below).<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinRxzDiYD0AA1iOfAXmEMDtoinIlHOvnwuoRXpILrjpW8d1PjwbDYh4GrmMCqrDHKmTU09mjWXLYffandIyFsrSNLpeb9fI2K0X3uePjag9eDF4T2-fP7qUPvBa8WpmM5SY_ID_mTifNA/s1600/Screen+Shot+2019-07-14+at+8.59.11+AM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="757" data-original-width="1600" height="302" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinRxzDiYD0AA1iOfAXmEMDtoinIlHOvnwuoRXpILrjpW8d1PjwbDYh4GrmMCqrDHKmTU09mjWXLYffandIyFsrSNLpeb9fI2K0X3uePjag9eDF4T2-fP7qUPvBa8WpmM5SY_ID_mTifNA/s640/Screen+Shot+2019-07-14+at+8.59.11+AM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. Puzzles View </td></tr>
</tbody></table>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
However, if the serverless database server has timed out due to inactivity, it takes more than 30 seconds to come up. By that time the front end webserver has also timed out, and the user sees the error below instead (see Figure 3.). A simple browser refresh fixes the situation, after which both the front end and the backend are available. </div>
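One client-side mitigation is to retry the first request once the backend has had time to warm up. A minimal sketch of that idea; the fetch callable, retry counts and delays here are illustrative, not part of the DeepMorse code:

```python
import time

def fetch_with_retry(fetch, retries=2, delay=1.0):
    # Retry a request that may time out while a serverless backend cold-starts.
    for attempt in range(retries + 1):
        try:
            return fetch()
        except TimeoutError:
            if attempt == retries:
                raise
            time.sleep(delay)  # give the serverless database time to resume

# Example with a fake endpoint that times out once before succeeding:
calls = {"n": 0}
def fake_fetch():
    calls["n"] += 1
    if calls["n"] == 1:
        raise TimeoutError("backend cold start")
    return "puzzles page"

print(fetch_with_retry(fake_fetch, delay=0.01))  # "puzzles page"
```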
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJP9ri8hlxKZBEHZrYO_G-HCpTNZYBI0HaxFqbqJ2RWX2Hy0vAlxQAhMT4BboE1gQ8DNHIr6SEpeYJq3ciNicK4TWLF1mFdMhvK0yYjBjCctiLV-GmFDql2MZ0Z2lU_nsDjNtuCptPKX4/s1600/Screen+Shot+2019-07-14+at+9.08.04+AM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="374" data-original-width="1066" height="224" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJP9ri8hlxKZBEHZrYO_G-HCpTNZYBI0HaxFqbqJ2RWX2Hy0vAlxQAhMT4BboE1gQ8DNHIr6SEpeYJq3ciNicK4TWLF1mFdMhvK0yYjBjCctiLV-GmFDql2MZ0Z2lU_nsDjNtuCptPKX4/s640/Screen+Shot+2019-07-14+at+9.08.04+AM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 3. Serverless "Time Out" error message</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
So what, then, is the benefit of using AWS serverless technology? You get billed only for usage, and if the application is not used 24x7 this means significant cost savings. For a hobby project like DeepMorse I am able to run the service very cost efficiently. </div>
<div>
<br /></div>
<div>
The other benefit of serverless technologies is automatic scaling - if the service suddenly becomes hugely popular, the system can scale up rapidly. </div>
<div>
<br /></div>
<h3>
Next Steps</h3>
<div>
I am looking for feedback from early users, trying to figure out which features might be interesting to Morse code aficionados. </div>
<div>
<br /></div>
<div>
73 de Mauri </div>
<div>
AG1LE<br />
<br />
<br />
<h4>
SCREEN SHOTS</h4>
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifGhJvAa7sFThuEufmbNosXGhG0iaKkqbeq2ipmSjf7BLVVmRMsblfakRTHQSeE8NpeLRa9Gh3o4RhEC2kaVOMW-d0gxGigXwilmxQmaXfiI3cyMQ98Xm-WRjEz1MKVgMjCs4lRyR-Oy4/s1600/Screen+Shot+2019-07-15+at+9.15.25+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="595" data-original-width="617" height="385" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifGhJvAa7sFThuEufmbNosXGhG0iaKkqbeq2ipmSjf7BLVVmRMsblfakRTHQSeE8NpeLRa9Gh3o4RhEC2kaVOMW-d0gxGigXwilmxQmaXfiI3cyMQ98Xm-WRjEz1MKVgMjCs4lRyR-Oy4/s400/Screen+Shot+2019-07-15+at+9.15.25+PM.png" width="400" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO4zZp2Z2wjTioYICT21IbrcBQ7X2CocGa5j245wGXL3dzAXXb8YeF2JfVO71-lLn3klpBECxS6aulkhDcLylLvKb5Xvgfvxe3CwTzE729smLObeAKjUBOwiO1hbFtyucfLiZYoac63Zc/s1600/Screen+Shot+2019-07-15+at+9.30.56+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="598" data-original-width="804" height="297" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiO4zZp2Z2wjTioYICT21IbrcBQ7X2CocGa5j245wGXL3dzAXXb8YeF2JfVO71-lLn3klpBECxS6aulkhDcLylLvKb5Xvgfvxe3CwTzE729smLObeAKjUBOwiO1hbFtyucfLiZYoac63Zc/s400/Screen+Shot+2019-07-15+at+9.30.56+PM.png" width="400" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2Fl8DmVr_3yxzAKPhsE6g8OWCf60XymCXYj5V6yHcMTfWEQeYGvYmu0yAYbgzh_3OVygg_IQEXm-RnDATqI-2-d1KuH7UOU1HXI6yxI0uJhBKcZn77v25O09DiT42Cx6cI5jX3Kwd3Js/s1600/Screen+Shot+2019-07-19+at+8.14.41+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="616" data-original-width="553" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh2Fl8DmVr_3yxzAKPhsE6g8OWCf60XymCXYj5V6yHcMTfWEQeYGvYmu0yAYbgzh_3OVygg_IQEXm-RnDATqI-2-d1KuH7UOU1HXI6yxI0uJhBKcZn77v25O09DiT42Cx6cI5jX3Kwd3Js/s400/Screen+Shot+2019-07-19+at+8.14.41+PM.png" width="358" /></a><br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzPBX1O49UO-QjLK4i0bsKdf_NnDwFDm9VSHXfFV7Mt7BpAQ9FPeWWviAkFSsy8OIoz3WCoNFFXiGXtLE6GjbOMFyGVn-MGbdKAxuVGdC9ksl1TQlOpOcSWWRYZe8NI3C3ONrB6NGRvOQ/s1600/Screen+Shot+2019-08-18+at+2.44.23+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="539" data-original-width="769" height="280" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzPBX1O49UO-QjLK4i0bsKdf_NnDwFDm9VSHXfFV7Mt7BpAQ9FPeWWviAkFSsy8OIoz3WCoNFFXiGXtLE6GjbOMFyGVn-MGbdKAxuVGdC9ksl1TQlOpOcSWWRYZe8NI3C3ONrB6NGRvOQ/s400/Screen+Shot+2019-08-18+at+2.44.23+PM.png" width="400" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvw9dRhJ7V5E2vUjSG_rg-4qMUPpjyoLUj4azrAkgZUI0iXW5uUTcKshD4o11gihcD_fQt13pPBJxaS-NPRFzom0HWWVsi18onJH5uzo1vrQ52Edbapd58-svKmac3xYHQKez__rdr2eQ/s1600/Screen+Shot+2019-08-19+at+8.48.48+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="623" data-original-width="1092" height="227" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvw9dRhJ7V5E2vUjSG_rg-4qMUPpjyoLUj4azrAkgZUI0iXW5uUTcKshD4o11gihcD_fQt13pPBJxaS-NPRFzom0HWWVsi18onJH5uzo1vrQ52Edbapd58-svKmac3xYHQKez__rdr2eQ/s400/Screen+Shot+2019-08-19+at+8.48.48+PM.png" width="400" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdPFI2ANo21cQ_WPwlWNI3OZvWbH9fk4MdfWygQ_25dMR2jC0eoeMgQ9QkU-9rqMzH3DqCkkj6FGoWqOaBVE2lnlQPw6hIadma4qO86cqgFZPUNZl7kVmUAuw8AsT46yUR35QRsk5NYJk/s1600/Screen+Shot+2019-08-24+at+11.27.24+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="728" data-original-width="1294" height="225" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdPFI2ANo21cQ_WPwlWNI3OZvWbH9fk4MdfWygQ_25dMR2jC0eoeMgQ9QkU-9rqMzH3DqCkkj6FGoWqOaBVE2lnlQPw6hIadma4qO86cqgFZPUNZl7kVmUAuw8AsT46yUR35QRsk5NYJk/s400/Screen+Shot+2019-08-24+at+11.27.24+PM.png" width="400" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhApCGO-WKkGFPdmf72XExink0Gs-8ATLev09WfQrlTuNR1Kvb5gIhEC3BIh4DsOhBlmX4K6Yozu_oyVunzNTZdBtr_NHZqcM86cDw_RScSElwNcZDLOwQtOU0E80up9FZ-v__LragKrmc/s1600/Screen+Shot+2019-08-25+at+12.53.09+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="738" data-original-width="1071" height="220" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhApCGO-WKkGFPdmf72XExink0Gs-8ATLev09WfQrlTuNR1Kvb5gIhEC3BIh4DsOhBlmX4K6Yozu_oyVunzNTZdBtr_NHZqcM86cDw_RScSElwNcZDLOwQtOU0E80up9FZ-v__LragKrmc/s320/Screen+Shot+2019-08-25+at+12.53.09+PM.png" width="320" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbp70KQiTEUixLQM0QXD7h4OwVIWbjm8x1T4pHM_AbrN54djl759KwdXo6yX1jdC3w4Ee3v151v4rBZtb0uG_0zfddzk4tOMPNrHu0dm_bHU8BDBeQIjZAjMpC3yPxIIay5Fjsypk-vQc/s1600/Screen+Shot+2019-08-25+at+12.53.35+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="737" data-original-width="1082" height="217" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbp70KQiTEUixLQM0QXD7h4OwVIWbjm8x1T4pHM_AbrN54djl759KwdXo6yX1jdC3w4Ee3v151v4rBZtb0uG_0zfddzk4tOMPNrHu0dm_bHU8BDBeQIjZAjMpC3yPxIIay5Fjsypk-vQc/s320/Screen+Shot+2019-08-25+at+12.53.35+PM.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvt6sM7nSa5_USP-G9wEgrLxP8Y0ldbF96eetF7g33pKYq04X8gi8LhUSB8rMO6WxZpPE2fdTH1o_ljbkmn4tG5XqCTITnC0Zb1t6o93oEn6i-ZmhJ-1oyeI3Xe2ObKsDuY6L3mOBgTxk/s1600/Screen+Shot+2019-08-27+at+7.28.36+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="915" data-original-width="1600" height="182" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvt6sM7nSa5_USP-G9wEgrLxP8Y0ldbF96eetF7g33pKYq04X8gi8LhUSB8rMO6WxZpPE2fdTH1o_ljbkmn4tG5XqCTITnC0Zb1t6o93oEn6i-ZmhJ-1oyeI3Xe2ObKsDuY6L3mOBgTxk/s320/Screen+Shot+2019-08-27+at+7.28.36+PM.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRjuRkdjgAU1rlC5ZQ_8IrPrAcBmUrnxNmkyhLW2zIWQurOV6dg5zuO4SVi1lMoWhY47pMFqeX5O5FkT5NJULgLaifF7Txyb8C02hR4GrYEO4fGjhBUc9ApF9w_1ixKxwNxOY1TRM2FvU/s1600/Screen+Shot+2019-08-30+at+9.44.53+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="660" data-original-width="889" height="237" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiRjuRkdjgAU1rlC5ZQ_8IrPrAcBmUrnxNmkyhLW2zIWQurOV6dg5zuO4SVi1lMoWhY47pMFqeX5O5FkT5NJULgLaifF7Txyb8C02hR4GrYEO4fGjhBUc9ApF9w_1ixKxwNxOY1TRM2FvU/s320/Screen+Shot+2019-08-30+at+9.44.53+PM.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkjlWfBXLGtBQol2zrwWJiYvpfMAyqN7HgYi7VOZ7Wcob6X40C73yWlPyh9kkT0BFssjZYsZ2sJVTxOmzo88OIEEV87ZDHNzE2t4IDNACsHwUFFZYDyW2kR6QPZeHWIoVxYvqVG9ObiWs/s1600/Screen+Shot+2019-09-22+at+7.18.26+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="656" data-original-width="1189" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkjlWfBXLGtBQol2zrwWJiYvpfMAyqN7HgYi7VOZ7Wcob6X40C73yWlPyh9kkT0BFssjZYsZ2sJVTxOmzo88OIEEV87ZDHNzE2t4IDNACsHwUFFZYDyW2kR6QPZeHWIoVxYvqVG9ObiWs/s320/Screen+Shot+2019-09-22+at+7.18.26+PM.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtno4NkNbeKeTEPAbAk-HO52dM2jvltZBr3k9nPkGjZA9-xCPN9EmOW-YF_ICB2khW36SY4OWLaW9Zpj3Hbhup9PC3Xtd44c-K3IjCJW6yOyJkBcrpKQlzAU8AFuP7MNtXKMqZhDeRt5g/s1600/Screen+Shot+2019-09-30+at+8.10.11+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="791" data-original-width="1491" height="169" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgtno4NkNbeKeTEPAbAk-HO52dM2jvltZBr3k9nPkGjZA9-xCPN9EmOW-YF_ICB2khW36SY4OWLaW9Zpj3Hbhup9PC3Xtd44c-K3IjCJW6yOyJkBcrpKQlzAU8AFuP7MNtXKMqZhDeRt5g/s320/Screen+Shot+2019-09-30+at+8.10.11+PM.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-83733167940691399722019-02-10T14:35:00.000-05:002019-02-10T14:35:07.937-05:00Performance characteristics of the ML Morse Decoder<br />
<br />
<br />
<br />
In <a href="http://ag1le.blogspot.com/2019/02/training-computer-to-listen-and-decode.html" target="_blank">my previous blog post</a> I described a new Tensorflow based Machine Learning model that learns Morse code from annotated audio .WAV files with an 8 kHz sample rate.<br />
<br />
In order to evaluate the performance characteristics of the decoding accuracy on noisy audio source files, I created a set of training and validation materials with signal-to-noise ratios from -22 dB to +30 dB. The target SNR_dB was applied using the following Python code:<br />
<br />
<span style="font-family: "Courier New", Courier, monospace; font-size: xx-small;"> </span><b><span style="font-family: "Courier New", Courier, monospace; font-size: xx-small;"> </span><span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># Desired linear SNR</span></b><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> SNR_linear = 10.0**(SNR_dB/10.0)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <b># Measure power of signal - assume zero mean </b></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> power = morsecode.var()</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <b># Calculate required noise power for desired SNR</b></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> noise_power = power/SNR_linear</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <b> # Generate noise with calculated power (mu=0, sigma=1)</b></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> noise = np.sqrt(noise_power)*np.random.normal(0,1,len(morsecode))</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <b># Add noise to signal</b></span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> morsecode = noise + morsecode</span><br />
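For readers who want to experiment with this, the snippet above can be wrapped into a small self-contained function. This is a sketch of the same calculation, not the exact dataset script; the 600 Hz test tone and the fixed seed are just for illustration:

```python
import numpy as np

def add_noise(signal, snr_db, rng=None):
    """Add white Gaussian noise to `signal` to reach the desired SNR in dB."""
    rng = np.random.default_rng() if rng is None else rng
    snr_linear = 10.0 ** (snr_db / 10.0)    # desired linear SNR
    power = signal.var()                    # signal power, zero mean assumed
    noise_power = power / snr_linear        # required noise power for this SNR
    noise = np.sqrt(noise_power) * rng.normal(0, 1, len(signal))
    return signal + noise

# Example: a 600 Hz tone sampled at 8 kHz, degraded to -6 dB SNR
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 600 * t)
noisy = add_noise(tone, -6.0, rng=np.random.default_rng(42))
```

Measuring the variance of `noisy - tone` against the variance of `tone` recovers the target SNR to within a fraction of a dB, which is a quick sanity check that the scaling is right.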
<br />
These audio .WAV files contain random words of at most 5 characters - 5000 samples at each SNR level, with 95% used for training and 5% for validation. The Morse speed of each audio sample was randomly selected to be either 25 WPM or 30 WPM.<br />
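The generator script itself is not shown here, but the timing rules it relies on are the standard ones: a dit lasts 1.2/WPM seconds, a dah is 3 dits, the gap inside a character is 1 dit, between characters 3 dits, and between words 7 dits. A minimal sketch of turning text into an on/off keying envelope under those rules (the abbreviated Morse table and the function name are mine, for illustration only):

```python
# Standard Morse timing: dit = 1.2 / WPM seconds; dah = 3 dits;
# intra-character gap = 1 dit; character gap = 3 dits; word gap = 7 dits.
MORSE = {'A': '.-', 'E': '.', 'T': '-', 'M': '--', 'I': '..'}  # abbreviated table

def keying_envelope(text, wpm=25, fs=8000):
    """Return an on/off keying envelope (list of 0/1 samples) for `text`."""
    dit = int(1.2 / wpm * fs)             # dit duration in samples
    out = []
    for word in text.upper().split():
        for ch in word:
            for sym in MORSE[ch]:
                out += [1] * (dit if sym == '.' else 3 * dit)
                out += [0] * dit          # 1-dit gap after every element
            out += [0] * 2 * dit          # pad gap to 3 dits between characters
        out += [0] * 4 * dit              # pad gap to 7 dits between words
    return out
```

At 25 WPM and 8 kHz a dit is 384 samples, so an "E" (a single dit plus its trailing gaps) produces 3072 samples of which 384 are key-down.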
<br />
The training was performed until 5 consecutive epochs failed to improve the character error rate. The duration of these training sessions varied from 15 to 45 minutes on a MacBook Pro with a 2.2 GHz Intel Core i7 CPU. <br />
<br />
I captured and plotted the Character Error Rate (CER) and Signal-to-Noise Ratio (SNR) of each completed training and validation session. The following graph shows that the Morse decoder performs quite well down to about a -12 dB SNR level; below that the decoding accuracy drops fairly dramatically.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEiQ8PXNH08YPqu-Y-6ZzUt_ybZFxQ5IlCM5mDUvbzH8FkVv_u1lCB6Tu8P0kbcjkgpdUJeKscyy4pVSXThqCWec5KKFeDH8l0Qgfe9K_UEYjWjPzKFwy0QbIqdcH_olj4_h7BHrzigic/s1600/Screen+Shot+2019-02-10+at+1.50.29+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="891" data-original-width="1274" height="446" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEiQ8PXNH08YPqu-Y-6ZzUt_ybZFxQ5IlCM5mDUvbzH8FkVv_u1lCB6Tu8P0kbcjkgpdUJeKscyy4pVSXThqCWec5KKFeDH8l0Qgfe9K_UEYjWjPzKFwy0QbIqdcH_olj4_h7BHrzigic/s640/Screen+Shot+2019-02-10+at+1.50.29+PM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">CER vs. SNR graph</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
To show how noisy these files are, here are some random samples - the first 4 seconds of each 8 kHz audio file are demodulated, filtered with a 25 Hz 3rd-order Butterworth filter and decimated by a factor of 125 to fit into a (128,32) vector. These vectors are shown as grayscale images below:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGMvJtcWTzyO80HbJJqIvuK6JWMyTXjbKEGoGWRRhJ2vN62ZLvdOkbDmvAFcvLhrEEGS4s6j1r7-VmarNep0fHD-ybIrimDSWJjjDvn5hKHypm6Em7cBkRqGVH5TlFKevVQxKTxKS2vaU/s1600/SNR-6WPM305c6a4dad0a364f23acc612d221cd9f61.wav.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhGMvJtcWTzyO80HbJJqIvuK6JWMyTXjbKEGoGWRRhJ2vN62ZLvdOkbDmvAFcvLhrEEGS4s6j1r7-VmarNep0fHD-ybIrimDSWJjjDvn5hKHypm6Em7cBkRqGVH5TlFKevVQxKTxKS2vaU/s320/SNR-6WPM305c6a4dad0a364f23acc612d221cd9f61.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">-6 db SNR</td></tr>
</tbody></table>
<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgop-Cj2_OaM0IN6Cj10AV5O3rQnA_TYfKtPdqIq_fwqa4rLggu6mD3S7e_3eIXFRwXhHqNJvm8LYUI8J04ZyVQloova8QgslZG0qDrUZctURk0MfVX_XZ8xlJPQasdvPQfGXX3btGe8Gw/s1600/SNR-11WPM30ef70f1df28a7473c974a5a1b0cb564c6.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgop-Cj2_OaM0IN6Cj10AV5O3rQnA_TYfKtPdqIq_fwqa4rLggu6mD3S7e_3eIXFRwXhHqNJvm8LYUI8J04ZyVQloova8QgslZG0qDrUZctURk0MfVX_XZ8xlJPQasdvPQfGXX3btGe8Gw/s320/SNR-11WPM30ef70f1df28a7473c974a5a1b0cb564c6.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">-11 dB SNR</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkR31cn7Z-N5wYaj5ARKbzf2-NRuYQopb8m6OYzmjjuJOdmbw__651YoT-ocyaXw2JhdRrSJE-uO2yLct7VlaxA3rMOROoV2wfmGuH3RRB4i2x6tBhay9u7APWrea69Bp5fUZYAXQ9xoU/s1600/SNR-13WPM2540959a06a9cb4f8db8684bd1f9678e15.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkR31cn7Z-N5wYaj5ARKbzf2-NRuYQopb8m6OYzmjjuJOdmbw__651YoT-ocyaXw2JhdRrSJE-uO2yLct7VlaxA3rMOROoV2wfmGuH3RRB4i2x6tBhay9u7APWrea69Bp5fUZYAXQ9xoU/s320/SNR-13WPM2540959a06a9cb4f8db8684bd1f9678e15.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">-13 dB SNR</td></tr>
</tbody></table>
<div>
<br /></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_rMuvXSZhLhc8LXpkeC3gYNWewP2j2khcyHjtfrSFgSkJUl8Ce4DxUVepgx8tbeQHsyEKjiA-yue3l8V-rpEl6SpTZluh8M-0aUS6c3AfzTKOIWAVZX3FTZwxHdstJp4v1ciR2rUMKIY/s1600/SNR-16WPM302c58361ac3744fa5b9877ca9e700f93a.wav.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh_rMuvXSZhLhc8LXpkeC3gYNWewP2j2khcyHjtfrSFgSkJUl8Ce4DxUVepgx8tbeQHsyEKjiA-yue3l8V-rpEl6SpTZluh8M-0aUS6c3AfzTKOIWAVZX3FTZwxHdstJp4v1ciR2rUMKIY/s320/SNR-16WPM302c58361ac3744fa5b9877ca9e700f93a.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">-16 dB SNR</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
Conclusions</h3>
<div>
The Tensorflow model appears to perform quite well on decoding noisy audio files, at least when the training set and validation set have the same SNR level. </div>
<div>
<br /></div>
<div>
The next experiments could include more variability with a much bigger training dataset that has a combination of different SNR levels, Morse speeds and other variables. The training duration depends on the amount of training data, so it can take a while to perform these larger scale experiments on a home computer. </div>
<div>
<br /></div>
<div>
73 de</div>
<div>
Mauri AG1LE </div>
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com2tag:blogger.com,1999:blog-3326773214329183284.post-90704304516405169372019-02-02T23:51:00.002-05:002019-02-03T17:58:57.573-05:00Training a Computer to Listen and Decode Morse Code<h3>
Abstract</h3>
<div>
<div>
I trained a Tensorflow based CNN-LSTM-CTC model with a 5.2-hour Morse audio training set (5000 files) and achieved a character error rate of 0.1% and a word accuracy of 99.5%. I tested the model with audio files containing various levels of noise and found that it decodes relatively accurately down to a -3 dB SNR level. </div>
</div>
<h3>
Introduction</h3>
Decoding Morse code from audio signals is not a novel idea. The author has written many different software decoder implementations that use simplistic models to convert a sequence of "Dits" and "Dahs" to the corresponding text. When the audio signal is noise free and there is no interference, these simplistic methods work fairly well and produce nearly error-free decoding. Figure 1 below shows "Hello World" with a 35 dB signal-to-noise ratio, which most conventional decoders have no problem decoding.<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguBaC1PrEw-HFRzKqf1tnhl4diuuGvuoPH9UtI-yXbY7U0DfEyV6P7PcfL_DrMgqOZSP6vFR7w4Sl6G8iFSaNlO34vLtdp_UEERjSW3CLXpyXie3xWbEaoDPvaf-vS6LpkM_qcCMLMYsc/s1600/morse_recognition.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="471" data-original-width="1170" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguBaC1PrEw-HFRzKqf1tnhl4diuuGvuoPH9UtI-yXbY7U0DfEyV6P7PcfL_DrMgqOZSP6vFR7w4Sl6G8iFSaNlO34vLtdp_UEERjSW3CLXpyXie3xWbEaoDPvaf-vS6LpkM_qcCMLMYsc/s640/morse_recognition.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">"Hello World" with 30 dB SNR </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
Figure 2 below shows the same "Hello World" but with a -12 dB signal-to-noise ratio, using exactly the same process as above to extract the demodulated envelope. Humans can still hear and even recognize the Morse code faintly in the noise, but computers equipped with these simplistic models have great difficulty decoding anything meaningful out of this signal. In ham radio terms the difference of 47 dB corresponds to roughly eight S units - human ears & brain can still decode S2 level signals whereas conventional software based Morse decoders produce mostly gibberish.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZTk-7qZzhxtHVoVLWM2n60Q5KDhmusz-vWn_mVvkgM4fcQuOFAj9TWoIcBBxG7xQlR_7mW4rZkxLBDD6SUry7HAQXJrMBNaV1PDGujLtNJ_QfkOjW0WZ-RbRH_z4trXZFoZbCsXzHi6c/s1600/morse_recognition2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="471" data-original-width="1170" height="256" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgZTk-7qZzhxtHVoVLWM2n60Q5KDhmusz-vWn_mVvkgM4fcQuOFAj9TWoIcBBxG7xQlR_7mW4rZkxLBDD6SUry7HAQXJrMBNaV1PDGujLtNJ_QfkOjW0WZ-RbRH_z4trXZFoZbCsXzHi6c/s640/morse_recognition2.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">"Hello World" with -12 dB SNR </td></tr>
</tbody></table>
<br />
<br />
<br />
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<h3>
</h3>
<br />
<br />
<h3>
New Approach - Machine Learning</h3>
I have been quite interested in Machine Learning (ML) technologies for a while. From a software development perspective, ML is changing the paradigm of how we process data.<br />
<br />
In traditional programming we look at the input data and try to write a program that uses some processing steps to come up with the output data. Depending on the complexity of the problem, a software developer may need to spend quite a long time coming up with the correct algorithms to produce the right output data. From a Morse decoder perspective this is how most decoders work: they take input audio data that contains the Morse signals, and after many complex operations the correct decoded text appears on the screen. <br />
<br />
Machine Learning changes this paradigm. As a ML engineer you need to curate a dataset that has a representative selection of input data with corresponding output data (also known as label data). The computer then applies a training algorithm to this dataset that eventually discovers the correct "program" - the ML model that provides the best matching function that can infer the correct output, given the input data.<br />
<br />
See Figure 3. that tries to depict this difference between traditional programming and the new approach with Machine Learning.<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAljwlpV3mHd3-HwZ7tK0LH0q7jO5TG2SwOLUttufEnF1fU_Uy0ezE67DyHvQ6YmvXGvXQe67RQFdbCqP28wxIykdy3Ynas0fYPihENFJUSrV-8oQQTy0SILehffQOPaGTuvKybbjeiRU/s1600/Programming_vs_MachineLearning.png.001.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="768" data-original-width="1024" height="480" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAljwlpV3mHd3-HwZ7tK0LH0q7jO5TG2SwOLUttufEnF1fU_Uy0ezE67DyHvQ6YmvXGvXQe67RQFdbCqP28wxIykdy3Ynas0fYPihENFJUSrV-8oQQTy0SILehffQOPaGTuvKybbjeiRU/s640/Programming_vs_MachineLearning.png.001.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Programming vs. Machine Learning</td></tr>
</tbody></table>
<br />
<br />
<h3>
</h3>
<h3>
</h3>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<br />
So what does this new approach mean in practice? Instead of trying to figure out ever more complex software algorithms to improve your data processing and decoding accuracy, you can select from standard machine learning algorithms that are available in open source packages like Tensorflow, and focus on building a neural network model and curating a large dataset to train it. The trained model can then be used to decode the input audio data. This is exactly what I did in the following experiment.<br />
<br />
I took a Tensorflow implementation of <a href="https://github.com/ag1le/SimpleHTR" target="_blank">Handwritten Text Recognition</a> created by Harald Scheidl [3] that he has posted in Github as an open source project. He has provided excellent documentation on how the model works as well as references to <a href="http://www.fki.inf.unibe.ch/databases/iam-handwriting-database" target="_blank">the IAM dataset</a> that he is using for training the handwritten text recognition.<br />
<br />
Why would a model created for handwritten text recognition work for Morse code recognition?<br />
<br />
It turns out that the standard Tensorflow learning algorithms used for handwriting recognition are very similar to the ones used for speech recognition.<br />
<br />
The figures below are from <a href="https://distill.pub/2017/ctc/" target="_blank">Hannun, "Sequence Modeling with CTC", Distill, 2017</a>. In the article Hannun [2] shows that the (x,y) coordinates of a pen stroke or the pixels in an image can be recognized as text, much like the spectrogram of a speech audio signal. Morse code has similar properties to speech - the speed can vary a lot and hand-keyed code can have unique rhythm patterns that make it difficult to align signals to decoded text. The common theme is that we have some variable-length input data that needs to be aligned with variable-length output data. The algorithm that comes with Tensorflow is called Connectionist Temporal Classification (CTC) [1].<br />
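One way to get an intuition for what the CTC layer does at inference time is best-path decoding: take the most likely symbol at each time-step, collapse consecutive repeats, and drop the blank symbol. A minimal sketch with a made-up three-symbol alphabet (for illustration only - this is not the decoder used in the model below):

```python
import numpy as np

def best_path_decode(probs, alphabet, blank=0):
    """Greedy CTC decoding: argmax per time-step, collapse repeats, drop blanks."""
    best = np.argmax(probs, axis=1)       # most likely symbol index per time-step
    out, prev = [], blank
    for idx in best:
        if idx != prev and idx != blank:  # collapse repeats, skip the blank
            out.append(alphabet[idx])
        prev = idx
    return ''.join(out)

# Toy example: 6 time-steps over the alphabet ['-', 'H', 'I'] ('-' is the blank)
alphabet = ['-', 'H', 'I']
probs = np.array([
    [0.1, 0.8, 0.1],   # H
    [0.1, 0.8, 0.1],   # H (repeat, collapsed)
    [0.8, 0.1, 0.1],   # blank
    [0.1, 0.1, 0.8],   # I
    [0.1, 0.1, 0.8],   # I (repeat, collapsed)
    [0.8, 0.1, 0.1],   # blank
])
decoded = best_path_decode(probs, alphabet)   # "HI"
```

The blank symbol is what lets CTC align 6 time-steps of network output to a 2-character label without any per-time-step annotation, which is exactly the property that makes it a good fit for variable-speed Morse audio.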
<div>
<br /></div>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjABIP5Jg4K2q7_GJpmqb29BseWyUBqwFcTCOiE6XMe2INy8evuAKguMRF-TVX3obMF0lRP5uJFH1TXYfinKIW3jTd2Mq8Ccb_3Fp9QzGvnmutb0s9ylRsOTkYCzwXpETop9PPqsY6sLe4/s1600/Screen+Shot+2019-02-02+at+8.12.33+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="480" data-original-width="726" height="130" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjABIP5Jg4K2q7_GJpmqb29BseWyUBqwFcTCOiE6XMe2INy8evuAKguMRF-TVX3obMF0lRP5uJFH1TXYfinKIW3jTd2Mq8Ccb_3Fp9QzGvnmutb0s9ylRsOTkYCzwXpETop9PPqsY6sLe4/s200/Screen+Shot+2019-02-02+at+8.12.33+PM.png" width="200" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;"></td></tr>
</tbody></table>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9nzQzM5a3b46bKe2dogA6qQNTaaPGHfeWJE6aYXdSESJumY5Q__zZUPqLJ9MfFJNmex3lOYHiTQ-KnNkiVKyVB0SYItPwiePeWQEbaOwTBqV6z9wk2SOFOB6mEyj4vkIMm8B84fe0cIU/s1600/Screen+Shot+2019-02-02+at+8.12.51+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" data-original-height="480" data-original-width="726" height="130" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9nzQzM5a3b46bKe2dogA6qQNTaaPGHfeWJE6aYXdSESJumY5Q__zZUPqLJ9MfFJNmex3lOYHiTQ-KnNkiVKyVB0SYItPwiePeWQEbaOwTBqV6z9wk2SOFOB6mEyj4vkIMm8B84fe0cIU/s200/Screen+Shot+2019-02-02+at+8.12.51+PM.png" width="200" /></a><br />
<br />
<h3>
Morse Dataset</h3>
The Morse code audio file can easily be converted to a representation that is suitable as input data for these neural networks. I am using single-track (mono) WAV files with an 8 kHz sampling frequency.<br />
<br />
The following few lines of Python code take a 4-second sample from an existing WAV audio file, find the signal peak frequency, demodulate and decimate the data so that we get a (1,256) vector that we reshape to (128, 32) and write into a PNG file.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">def find_peak(fname):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> # Find the signal frequency and maximum value</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> Fs, x = wavfile.read(fname)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> f,s = periodogram(x, Fs,'blackman',8192,'linear', False, scaling='spectrum')</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> threshold = max(s)*0.9 # only 0.9 ... 1.0 of max value freq peaks included</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> maxtab, mintab = peakdet(abs(s[0:int(len(s)/2-1)]), threshold,f[0:int(len(f)/2-1)] )</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"></span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> return maxtab[0,0]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">def demodulate(x, Fs, freq):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> # demodulate audio signal with known CW frequency </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> t = np.arange(len(x))/ float(Fs)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> mixed = x*((1 + np.sin(2*np.pi*freq*t))/2 )</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> #calculate envelope and low pass filter this demodulated signal</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> #filter bandwidth impacts decoding accuracy significantly </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> #for high SNR signals 40 Hz is better, for low SNR 20Hz is better</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> # 25Hz is a compromise - could this be made an adaptive value?</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> low_cutoff = 25. # 25 Hz cut-off for lowpass</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> wn = low_cutoff/ (Fs/2.) </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> b, a = butter(3, wn) # 3rd order butterworth filter</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> z = filtfilt(b, a, abs(mixed))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> # decimate and normalize</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> decimate = int(Fs/64) # 8000 Hz / 64 = 125 Hz => 8 msec / sample </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> o = z[0::decimate]/max(z)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> return o</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">def process_audio_file(fname, x, y, tone):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> Fs, signal = wavfile.read(fname)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> dur = len(signal)/Fs</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> o = demodulate(signal[(Fs*(x)):Fs*(x+y)], Fs, tone)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> return o, dur</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">filename = "error.wav"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">tone = find_peak(filename)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">o,dur = process_audio_file(filename,0,4, tone)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">im = o[0::1].reshape(1,256)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">im = im*256.</span><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">img = cv2.resize(im, (128, 32), interpolation = cv2.INTER_AREA)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">cv2.imwrite("error.png",img)</span><br />
<br />
Here is the resulting PNG image - it contains "ERROR M". The labels are kept in a file that also contains the corresponding audio file name.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHb50QsyylxbpOm3W2jBXFv4tFH-40I_bV8k8dtjq-ew7W6hVbGjCww_be55jT5XfTqb-Kpv_rOfI-hC-qBl91ghZrncSR9-VeQCz9tXDM6wmQs9gyi-qQY1RA6rCeq00fLzbkk0l6vws/s1600/f5f0c1d7aa804ef3b1c1d4626a51eec2.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHb50QsyylxbpOm3W2jBXFv4tFH-40I_bV8k8dtjq-ew7W6hVbGjCww_be55jT5XfTqb-Kpv_rOfI-hC-qBl91ghZrncSR9-VeQCz9tXDM6wmQs9gyi-qQY1RA6rCeq00fLzbkk0l6vws/s320/f5f0c1d7aa804ef3b1c1d4626a51eec2.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">4 second audio sample converted to a (128,32) PNG file</td></tr>
</tbody></table>
<br />
<br />
It is very easy to produce a lot of training and validation data with this method. The important part is that each audio file must have an accurate "label": the textual representation of the Morse content in that audio file.<br />
<br />
I created a small Python script to produce this kind of Morse training and validation dataset. With a few parameters you can generate as much data as you want with different speed and noise levels.<br />
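The generator script itself is not listed here, but its core idea can be sketched as follows. This is a minimal illustration (the code table excerpt, function name and defaults are my assumptions, not the actual script), using the standard timing rule of dit = 1.2 / WPM seconds:

```python
import numpy as np

# Excerpt of the Morse code table, enough for a demo
MORSE = {'E': '.', 'T': '-', 'H': '....', 'L': '.-..', 'O': '---'}

def morse_tone(text, wpm=25, f_code=600, fs=8000, snr_db=40):
    """Render text as keyed sine-wave Morse audio with additive noise."""
    dit = 1.2 / wpm                        # standard dit length in seconds
    key = []                               # on/off pattern, one entry per dit
    for word in text.split():
        for ch in word:
            for i, sym in enumerate(MORSE[ch]):
                if i:
                    key += [0]             # 1 dit gap between symbols
                key += [1] * (1 if sym == '.' else 3)
            key += [0] * 3                 # 3 dit gap between characters
        key += [0] * 4                     # 7 dits total between words
    env = np.repeat(key, int(dit * fs)).astype(float)
    t = np.arange(len(env)) / fs
    signal = env * np.sin(2 * np.pi * f_code * t)
    noise = np.random.randn(len(env)) * 10 ** (-snr_db / 20)  # rough SNR scaling
    return signal + noise
```

Each generated waveform is written to a WAV file together with its text label, which is what makes unlimited labeled training data possible.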
<br />
<h3>
Model</h3>
<div>
I used Harald Scheidl's SimpleHTR model as the starting point for the Morse decoding experiments. </div>
<div>
<br /></div>
<div>
The model consists of 5 CNN layers, 2 RNN (LSTM) layers and the CTC loss and decoding layer. The illustration below gives an overview of the NN (green: operations, pink: data flowing through NN) and here follows a short description:</div>
<div>
<div>
<ul>
<li>The input image is a gray-value image and has a size of 128x32</li>
<li>5 CNN layers map the input image to a feature sequence of size 32x256</li>
<li>2 LSTM layers with 256 units propagate information through the sequence and map the sequence to a matrix of size 32x80. Each matrix-element represents a score for one of the 80 characters at one of the 32 time-steps</li>
<li>The CTC layer either calculates the loss value given the matrix and the ground-truth text (when training), or it decodes the matrix to the final text with best path decoding or beam search decoding (when inferring)</li>
<li>Batch size is set to 50</li>
</ul>
</div>
</div>
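The shapes listed above can be checked with a small walk-through. The pooling sizes are taken from Harald's SimpleHTR description and are an assumption for this Morse variant:

```python
def cnn_lstm_ctc_shapes(img_w=128, img_h=32, n_chars=80):
    """Trace tensor shapes through 5 CNN layers, 2 LSTMs and the CTC input."""
    # SimpleHTR-style pools: (2,2),(2,2),(1,2),(1,2),(1,2)
    # -> width shrinks by 4, height by 32, features grow to 256
    w, h, c = img_w, img_h, 1
    for pw, ph, feat in [(2, 2, 32), (2, 2, 64), (1, 2, 128), (1, 2, 128), (1, 2, 256)]:
        w, h, c = w // pw, h // ph, feat
    seq = [w, c]            # feature sequence: 32 time-steps x 256 features
    rnn_out = [w, 256]      # 2 LSTM layers with 256 units keep the step count
    ctc_in = [w, n_chars]   # projected to per-step scores over 80 characters
    return seq, rnn_out, ctc_in
```

Running it reproduces the 32x256 feature sequence and the 32x80 score matrix described in the bullet list.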
<div class="separator" style="clear: both; text-align: center;">
<a href="https://github.com/ag1le/SimpleHTR/raw/master/doc/nn_overview.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" data-original-height="416" data-original-width="640" height="260" src="https://github.com/ag1le/SimpleHTR/raw/master/doc/nn_overview.png" width="400" /></a></div>
<div>
<br /></div>
<br />
<br />
It is not hard to imagine changing the model to allow longer audio clips to be decoded. Right now the limit is about 4 seconds of audio converted to a (128x32) input image. Harald is actually providing <a href="https://towardsdatascience.com/faq-build-a-handwritten-text-recognition-system-using-tensorflow-27648fb18519" target="_blank">details of a model</a> that can handle a larger input image (800x64) and output text strings of up to 100 characters.<br />
<h3>
Experiment</h3>
Here are the parameters I used for this experiment:<br />
<br />
<ul>
<li>5000 samples, split into training and validation set: 95% training - 5% validation</li>
<li>Each sample has 2 random words, max word length is 5 characters</li>
<li>Morse speed randomly selected from [20, 25, 30] words-per-minute </li>
<li>Morse audio SNR: 40 dB </li>
<li>batchSize: 100 </li>
<li>imgSize: [128,32] </li>
<li>maxTextLen: 32</li>
<li>earlyStopping: 20 </li>
</ul>
<br />
Training took 1 hr 51 min on a MacBook Pro with a 2.2 GHz Intel Core i7.<br />
The training curves of character error rate, word accuracy and loss over 50 epochs were the following:<br />
<div style="text-align: left;">
</div>
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1gxx4NyPNdCsAcLJ41-vlhw7THXWVv0nyxmMDGF0R7-0VoF96GsS0m8MRTadYfL4GZkQrx8Dvc3PrJOEZgXnv7dQTXDhZcxQ1SKwHh5dgsU8BbfC6rs4Elv7g24yB7AEeD12kCHHmppY/s1600/Screen+Shot+2019-02-03+at+4.27.47+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="950" data-original-width="1600" height="377" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi1gxx4NyPNdCsAcLJ41-vlhw7THXWVv0nyxmMDGF0R7-0VoF96GsS0m8MRTadYfL4GZkQrx8Dvc3PrJOEZgXnv7dQTXDhZcxQ1SKwHh5dgsU8BbfC6rs4Elv7g24yB7AEeD12kCHHmppY/s640/Screen+Shot+2019-02-03+at+4.27.47+PM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Training over 50 epochs</td></tr>
</tbody></table>
<br />
<br />
<div>
<br /></div>
<div>
<br />
The best character error rate was 14.9% and word accuracy was 36.0%. These are not great numbers. The reason was that each training sample contained two words, which in many cases was too many characters to fit into the 4 second time window, so the training algorithm often never saw the second word of a sample. <br />
<br />
I re-ran the experiment with 5000 samples, but with just one word in each sample. This training took 54 min 7 s. The new parameters are below:<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">model:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> # model constants</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> batchSize: 100 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> imgSize: !!python/tuple [128,32] </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> maxTextLen: 32</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> earlyStopping: 5</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">morse:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnTrain: "morsewords.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnAudio: "audio/"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> count: 5000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> SNR_dB: </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 20</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 30</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 40</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> f_code: 600</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Fs: 8000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> code_speed: </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 30</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 25</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 20</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> length_N: 65000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> play_sound: False</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> word_max_length: 5</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> words_in_sample: 1</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">experiment:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> modelDir: "model/"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnAccuracy: "model/accuracy.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnTrain: "model/morsewords.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnInfer: "model/test.png"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnCorpus: "model/corpus.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnCharList: "model/charList.txt"</span><br />
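One detail worth noting in the configuration above: the `!!python/tuple` tag is a PyYAML-specific extension, so `yaml.safe_load` will reject it. A minimal sketch of loading such a config (assuming PyYAML 5.1 or later; the variable names are mine):

```python
import yaml

CONFIG = """
model:
  batchSize: 100
  imgSize: !!python/tuple [128, 32]
  maxTextLen: 32
"""

# safe_load raises a ConstructorError on !!python/tuple; the unsafe loader
# (or a registered custom constructor) is needed for this config format.
cfg = yaml.load(CONFIG, Loader=yaml.UnsafeLoader)
```

Only do this with trusted config files, since the unsafe loader can construct arbitrary Python objects.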
<br />
<br />
Here is the outcome of that second training session:<br />
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000}
span.s1 {font-variant-ligatures: no-common-ligatures}
</style>
<br />
<div class="p1">
<span class="s1">Total training time was 0:54:07.857731</span></div>
<div class="p1">
<span class="s1">Character error rate:<span class="Apple-converted-space"> </span>0.1%. Word accuracy: 99.5%.</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0IjvKkI6JYcawlJfzGeXmRaYxHS_HI0LiRwD-nbUva2_FfgcwMpfHC4Y6M2oVMufhOWxQSld8zdWy_E1qYgSfARcxopHXXSTiBIom90qiPaQONlpYOvBL0Dix5vbyuqJMTOOgLhNuU0g/s1600/Screen+Shot+2019-02-03+at+5.53.01+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="950" data-original-width="1600" height="377" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi0IjvKkI6JYcawlJfzGeXmRaYxHS_HI0LiRwD-nbUva2_FfgcwMpfHC4Y6M2oVMufhOWxQSld8zdWy_E1qYgSfARcxopHXXSTiBIom90qiPaQONlpYOvBL0Dix5vbyuqJMTOOgLhNuU0g/s640/Screen+Shot+2019-02-03+at+5.53.01+PM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Training over 33 epochs</td></tr>
</tbody></table>
<div class="p1">
<span class="s1"><br /></span></div>
<br />
<br />
With a larger dataset the training will take longer. One possibility would be to use an AWS cloud computing service to accelerate training on a much larger dataset. </div>
<div>
<br /></div>
<div>
Note that the model did not know anything about Morse code at the start. It learned the character set, the structure of the Morse code and the words just by "listening" to the provided sample files: approximately 5.3 hours of Morse code audio with random words (5000 files * 95% * 4 sec/file = 19000 seconds). </div>
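The dataset-size arithmetic above checks out:

```python
files, train_fraction, secs_per_file = 5000, 0.95, 4
total_seconds = files * train_fraction * secs_per_file
hours = total_seconds / 3600
print(total_seconds, round(hours, 2))  # 19000.0 seconds, about 5.28 hours
```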
<div>
<br /></div>
<div>
It would be great to get some comparative data on how quickly humans learn to reach a similar character error rate. </div>
<h3>
Results</h3>
<div>
I created a small "helloworld.wav" audio file with HELLO WORLD text at 25 WPM at different signal-to-noise ratios (-6, -3, +6, +50 dB) to test the first model. </div>
<div>
<br /></div>
<div>
Attempting to decode the content of the audio file, I got the following results. Given that the training was done with +40 dB samples, I was quite surprised to see relatively good decoding accuracy. The model also reports a probability expressing how confident it is about the result; these values vary between 0.4% and 5.7%. </div>
<div>
<br /></div>
<div>
<br /></div>
<div>
File: -6 dB SNR </div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">python MorseDecoder.py -f audio/helloworld.wav<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Validation character error rate of saved model: 15.4</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Python: 2.7.10 (default, Aug 17 2018, 19:45:58)<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)]</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Tensorflow: 1.4.0</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">2019-02-02 22:40:51.970393: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Init with stored values from model/snapshot-22</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">inferBatch: probs:[ 0.00420194] texts:['HELL Q PE']<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Recognized: "HELL Q PE"</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Probability: 0.00420194</span></span></div>
<br />
<div class="p1">
<span class="s1"><span style="font-family: inherit;">['HELL Q PE']</span></span></div>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj215m9-ncuvLIoaHnAsFUilNVgntPJw1gNkpx4wbt3F3zfAeUqq_MkZj8gPrXXjvbbzbUSaXP5PmEBFs8AR_a08DpTIu17At_n-1YKA-lmA_aAo5YaeejaAXY2V_pXWzJHNrNeQjgdHjo/s1600/helloworld.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj215m9-ncuvLIoaHnAsFUilNVgntPJw1gNkpx4wbt3F3zfAeUqq_MkZj8gPrXXjvbbzbUSaXP5PmEBFs8AR_a08DpTIu17At_n-1YKA-lmA_aAo5YaeejaAXY2V_pXWzJHNrNeQjgdHjo/s320/helloworld.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">-6 dB HELLO WORLD</td></tr>
</tbody></table>
<br />
<br />
<div class="p1">
<span class="s1"><span style="font-family: "times"; font-size: small; font-variant-ligatures: normal;">File: -3 dB SNR </span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">python MorseDecoder.py -f audio/helloworld.wav<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Validation character error rate of saved model: 15.4</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Python: 2.7.10 (default, Aug 17 2018, 19:45:58)<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)]</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Tensorflow: 1.4.0</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">2019-02-02 22:36:32.838156: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Init with stored values from model/snapshot-22</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">inferBatch: probs:[ 0.05750186] texts:['HELLO WOE']<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Recognized: "HELLO WOE"</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Probability: 0.0575019</span></span></div>
<br />
<div class="p1">
<span class="s1"><span style="font-family: "times" , "times new roman" , serif; font-size: xx-small;">['HELLO WOE']</span></span></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6TOcx8cHEFSdZ6yEkJ8eY4utSwEAEk_ryG7D1zLez21xoZ5dw_nmzuv7vjLAITm3hYfqvRZCo-iK3-goKqxjWkI8Cp1QJyTuX8oZuBuxU3yV_DixYUTH9u5XYRmKScUEp7QEFBB8Yb5Y/s1600/helloworld.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6TOcx8cHEFSdZ6yEkJ8eY4utSwEAEk_ryG7D1zLez21xoZ5dw_nmzuv7vjLAITm3hYfqvRZCo-iK3-goKqxjWkI8Cp1QJyTuX8oZuBuxU3yV_DixYUTH9u5XYRmKScUEp7QEFBB8Yb5Y/s320/helloworld.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">-3 dB HELLO WORLD</td></tr>
</tbody></table>
<br />
<br />
<h3>
<div class="p1">
<span style="font-family: "times"; font-size: small; font-weight: normal;">File: +6 dB SNR </span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">python MorseDecoder.py -f audio/helloworld.wav<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">Validation character error rate of saved model: 15.4</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">Python: 2.7.10 (default, Aug 17 2018, 19:45:58)<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)]</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">Tensorflow: 1.4.0</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">2019-02-02 22:38:57.549928: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">Init with stored values from model/snapshot-22</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">inferBatch: probs:[ 0.03523131] texts:['HELLO WOT']<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">Recognized: "HELLO WOT"</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small; font-weight: normal;">Probability: 0.0352313</span></span></div>
<div class="p1">
<span class="s1" style="font-family: "times" , "times new roman" , serif; font-size: xx-small; font-weight: normal;">['HELLO WOT']</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVGmRoCQ7aYItVd_wUGqIyWKr9hUsEfGs0eT9XfpTl3kTCmkHPk2qhZt_G6GwuqacxfBGXvEm_pe_nv7yDjbmfeDUL_GK4ttr7SDKymCV0F6D6e-6-dDTuN5CZ9PVF41lATIbPfY5apiY/s1600/helloworld.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVGmRoCQ7aYItVd_wUGqIyWKr9hUsEfGs0eT9XfpTl3kTCmkHPk2qhZt_G6GwuqacxfBGXvEm_pe_nv7yDjbmfeDUL_GK4ttr7SDKymCV0F6D6e-6-dDTuN5CZ9PVF41lATIbPfY5apiY/s320/helloworld.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">+6 dB HELLO WORLD</td></tr>
</tbody></table>
<div class="p1">
<span class="s1"><span style="font-family: inherit; font-size: x-small;"><br /></span></span></div>
</h3>
<div>
<div class="p1">
<span class="s1"><span style="font-family: inherit; font-size: x-small;">File: +50 dB SNR </span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">python MorseDecoder.py -f audio/helloworld.wav<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Validation character error rate of saved model: 15.4</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Python: 2.7.10 (default, Aug 17 2018, 19:45:58)<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)]</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Tensorflow: 1.4.0</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">2019-02-02 22:42:55.403738: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">inferBatch: probs:[ 0.03296029] texts:['HELLO WOT']<span class="Apple-converted-space"> </span></span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Recognized: "HELLO WOT"</span></span></div>
<div class="p1">
<span class="s1"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Probability: 0.0329603</span></span></div>
<div class="p1">
<span class="s1" style="font-size: xx-small;">['HELLO WOT']</span></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUeYs33D6yUNMBLjxtVpzky1ZViYvANnt2HNcA2H3ndmgC5aP3e6hmcijOrMI0Ggtd-2SIK_b8Nx-FY4iDE_xkU-a8dA7ZUxhiIPiPMUwEGtYo_BsGs7ZwjFjfDwkJyflfyOrO3ZW6wx0/s1600/helloworld.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUeYs33D6yUNMBLjxtVpzky1ZViYvANnt2HNcA2H3ndmgC5aP3e6hmcijOrMI0Ggtd-2SIK_b8Nx-FY4iDE_xkU-a8dA7ZUxhiIPiPMUwEGtYo_BsGs7ZwjFjfDwkJyflfyOrO3ZW6wx0/s320/helloworld.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">+50 dB HELLO WORLD</td></tr>
</tbody></table>
<div class="p1">
<span class="s1"><br /></span></div>
</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
In comparison, I took one file that was used in the training process. This file contains the text "HELLO HERO" at +40 dB SNR. Here is what the decoder produced, with a much higher probability of 51.8%: </div>
<div>
<br /></div>
<div>
<span style="font-family: "menlo"; font-size: x-small; font-variant-ligatures: no-common-ligatures;">File: +40 dB SNR </span></div>
<div>
<br />
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">python MorseDecoder.py -f audio/6e753ac57d4849ef87d5146e158610f0.wav</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Validation character error rate of saved model: 15.4</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Python: 2.7.10 (default, Aug 17 2018, 19:45:58)<span class="Apple-converted-space"> </span></span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)]</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Tensorflow: 1.4.0</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">2019-02-02 22:53:27.029448: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Init with stored values from model/snapshot-22</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">inferBatch: probs:[ 0.51824665] texts:['HELLO HERO']<span class="Apple-converted-space"> </span></span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Recognized: "HELLO HERO"</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">Probability: 0.518247</span></div>
<div class="p1">
<span class="s1" style="font-size: xx-small;">['HELLO HERO']</span></div>
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiucWSJojYCZDbU3cvUSfzxXjOHckmgDDy4-uF7DvHk6oW72XlvxqBb3XP9B1yVNn6NlB4SQZEqKfKzZPq1rFRf_iRDdDOpV96SP2PUzhwGivtJj1p7X-2Ah1dO36FGxfp8TGQRRmoC1Do/s1600/6e753ac57d4849ef87d5146e158610f0.wav.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="32" data-original-width="128" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiucWSJojYCZDbU3cvUSfzxXjOHckmgDDy4-uF7DvHk6oW72XlvxqBb3XP9B1yVNn6NlB4SQZEqKfKzZPq1rFRf_iRDdDOpV96SP2PUzhwGivtJj1p7X-2Ah1dO36FGxfp8TGQRRmoC1Do/s320/6e753ac57d4849ef87d5146e158610f0.wav.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">+40 dB HELLO HERO</td></tr>
</tbody></table>
<div>
<br /></div>
<h3>
Conclusions</h3>
This is my first machine learning experiment where I used Morse audio files for both training and validation of the model. The current model's limitation is that only 4 second audio clips can be used. However, it is very feasible to build a larger model that can decode a longer audio clip with a single inference operation. It would also be possible to feed a longer audio file in 4 second pieces to get decoding across the whole file.<br />
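That chunked-decoding idea can be sketched like this; the `decode` callback stands in for the model's inference step, and the function and parameter names are mine, not the project's:

```python
import numpy as np

def decode_long(audio, decode, fs=8000, win_s=4.0):
    """Decode a long recording by feeding the model fixed 4 second windows."""
    win = int(win_s * fs)
    texts = []
    # Step through the recording one full window at a time
    for start in range(0, max(len(audio) - win + 1, 1), win):
        texts.append(decode(audio[start:start + win]))
    return " ".join(texts)
```

A real implementation would overlap the windows, since a hard 4 second boundary can split a character in half.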
<br />
This Morse decoder doesn't have a single line of code that explicitly spells out the Morse codebook. The model literally learned from the training data what Morse code is and how to decode it. It represents a new paradigm in building decoders, and it uses technology similar to what companies like Google, Microsoft, Amazon and Apple use in their speech recognition products.<br />
<br />
I hope this experiment demonstrates to the ham radio community how to build high-quality, open source Morse decoders using a simple, standards-based ML architecture. With more computing capacity and larger training and validation datasets containing accurately annotated (labeled) audio files, it is now feasible to build a decoder that surpasses the accuracy of conventional decoders (like the one in the FLDIGI software).<br />
<br />
73 de Mauri<br />
AG1LE<br />
<br />
<h3>
Software and Instructions</h3>
The initial version of the software is available on GitHub - see <a href="https://github.com/ag1le/LSTM_morse/blob/master/MorseDecoder.py" target="_blank">here</a>.<br />
<br />
Usage from the command line:<br />
<br />
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">python MorseDecoder.py -h</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">usage: MorseDecoder.py [-h] [--train] [--validate] [--generate] [-f FILE]</span></div>
<div class="p2">
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><span class="s1"></span><br /></span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">optional arguments:</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><span class="Apple-converted-space"> </span>-h, --help<span class="Apple-converted-space"> </span>show this help message and exit</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><span class="Apple-converted-space"> </span>--train <span class="Apple-converted-space"> </span>train the NN</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><span class="Apple-converted-space"> </span>--validate<span class="Apple-converted-space"> </span>validate the NN</span></div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><span class="Apple-converted-space"> </span>--generate<span class="Apple-converted-space"> </span>generate a Morse dataset of random words</span></div>
<div class="p1">
</div>
<div class="p1">
<span class="s1" style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><span class="Apple-converted-space"> </span>-f FILE <span class="Apple-converted-space"> </span>input audio file</span></div>
<br />
<br />
To get started you need to generate audio training material. The count variable in the model.yaml config file specifies how many samples will be generated; the default is 5000.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: 11px;">python MorseDecoder.py </span><span style="font-size: 11px;">--generate</span></span><br />
<span style="font-family: "menlo"; font-size: 11px;"><br /></span>
<br />
Next you need to perform the training. You must have "audio/", "image/" and "model/" subdirectories in the folder where you run the program.<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: 11px;">python MorseDecoder.py </span><span style="font-size: 11px;">--</span><span style="font-size: 11px;">train</span></span><br />
<span style="font-family: "menlo"; font-size: 11px;"><br /></span>
<br />
The last thing to do is to validate the model:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: 11px;">python MorseDecoder.py </span><span style="font-size: 11px;">--validate</span></span><br />
<br />
To have the model decode an audio file, use:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: 11px;">python MorseDecoder.py -f audio/myfilename.wav </span><br />
<br />
<br />
<br />
<br />
Config file model.yaml (first training session):<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">model:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> # model constants</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> batchSize: 100 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> imgSize: !!python/tuple [128,32] </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> maxTextLen: 32</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> earlyStopping: 20 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">morse:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnTrain: "morsewords.txt"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnAudio: "audio/"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> count: 5000</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> SNR_dB: 20</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> f_code: 600</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> Fs: 8000</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> code_speed: 30</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> length_N: 65000</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> play_sound: False</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> word_max_length: 5</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> words_in_sample: 2</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">experiment:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> modelDir: "model/"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnAccuracy: "model/accuracy.txt"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnTrain: "model/morsewords.txt"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnInfer: "model/test.png"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnCorpus: "model/corpus.txt"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;"> fnCharList: "model/charList.txt"</span><br />
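Loading this config in Python needs one extra step: the <span style="font-family: "courier new" , "courier" , monospace;">!!python/tuple</span> tag is rejected by <span style="font-family: "courier new" , "courier" , monospace;">yaml.safe_load</span> by default. A minimal sketch (assuming PyYAML) registers a constructor for just that tag rather than falling back to the riskier <span style="font-family: "courier new" , "courier" , monospace;">yaml.unsafe_load</span>:

```python
import yaml

# Minimal excerpt of model.yaml; the !!python/tuple tag turns the
# imgSize list into a Python tuple when loaded.
CONFIG = """
model:
  batchSize: 100
  imgSize: !!python/tuple [128, 32]
  maxTextLen: 32
"""

# safe_load refuses python/* tags; allow only this specific one.
yaml.SafeLoader.add_constructor(
    "tag:yaml.org,2002:python/tuple",
    lambda loader, node: tuple(loader.construct_sequence(node)),
)

cfg = yaml.safe_load(CONFIG)
print(cfg["model"]["imgSize"])  # (128, 32)
```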
Config file model.yaml (second training session):<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">model:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> # model constants</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> batchSize: 100 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> imgSize: !!python/tuple [128,32] </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> maxTextLen: 32</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> earlyStopping: 5</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">morse:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnTrain: "morsewords.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnAudio: "audio/"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> count: 5000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> SNR_dB: </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 20</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 30</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 40</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> f_code: 600</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Fs: 8000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> code_speed: </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 30</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 25</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> - 20</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> length_N: 65000</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> play_sound: False</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> word_max_length: 5</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> words_in_sample: 1</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">experiment:</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> modelDir: "model/"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnAccuracy: "model/accuracy.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnTrain: "model/morsewords.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnInfer: "model/test.png"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnCorpus: "model/corpus.txt"</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> fnCharList: "model/charList.txt"</span><br />
<div>
<br /></div>
<h3>
References</h3>
<div>
[1] A. Graves, S. Fernandez, F. Gomez, and J. Schmidhuber, “Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks,” in Proceedings of the 23rd International Conference on Machine Learning. ACM, 2006, pp. 369–376. https://www.cs.toronto.edu/~graves/icml_2006.pdf</div>
<div>
[2] Hannun, "Sequence Modeling with CTC", Distill, 2017. https://distill.pub/2017/ctc/</div>
<div>
[3] Harald Scheidl "Handwritten Text Recognition with TensorFlow", https://github.com/githubharald/SimpleHTR</div>
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000}
p.p2 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000; min-height: 13.0px}
span.s1 {font-variant-ligatures: no-common-ligatures}
</style>ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com9tag:blogger.com,1999:blog-3326773214329183284.post-71607758080580819232017-11-25T22:07:00.003-05:002017-11-25T22:26:23.418-05:00MORSE: DENOISING AUTO-ENCODER<h3>
Introduction</h3>
A <a href="https://en.wikipedia.org/wiki/Autoencoder#Denoising_autoencoder" target="_blank">denoising auto-encoder</a> (DAE) is an artificial neural network used for unsupervised learning of efficient codings. A DAE is trained to recover the original, undistorted input from a partially corrupted version of it.<br />
<br />
For ham radio amateurs there are many potential use cases for denoising auto-encoders. In this blog post I share an experiment where I trained a neural network to decode Morse code from a very noisy signal.<br />
<br />
Can you see the Morse character in Figure 1 below? It looks like a bad waterfall display with a lot of background noise.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE49JamcFN-lmVMhsrd0mwKj4XQHZzu8FIXLRlLeJIWx9b4sfrCsuRWNPh_wnqNIz26GoN3FxbaxyHBNdcOz36aBkyhnindQT46iSLr11doNe9iVbY4qqASaOlHkVoP6_GiNxU9UpIJeA/s1600/Screen+Shot+2017-11-25+at+8.34.50+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="496" data-original-width="558" height="355" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiE49JamcFN-lmVMhsrd0mwKj4XQHZzu8FIXLRlLeJIWx9b4sfrCsuRWNPh_wnqNIz26GoN3FxbaxyHBNdcOz36aBkyhnindQT46iSLr11doNe9iVbY4qqASaOlHkVoP6_GiNxU9UpIJeA/s400/Screen+Shot+2017-11-25+at+8.34.50+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. Noisy Input Image</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: left;">
To my great surprise, the trained DAE was able to decode the letter 'Y' on the top row of the image. The reconstructed image is shown below in Figure 2. To put this in perspective: how often can you totally eliminate the noise just by turning a knob on your radio? The reconstruction is very clear, with the small exception that the timing of the last 'dah' in the letter 'Y' is a bit shorter than in the original training image. </div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9yWq4pE-eDnLwr3_ld5KNO-cmmVlxHJQVa0oxlX52bxGRA48BlRPYFwnO6HbodZ0X2TrvUvtItUOZMhTgS29fUp2Q7lA2HQPa4fxcF0VAu7J-C5IXYfJUnUxyWzmvMGUFwwbq3mGccRc/s1600/Screen+Shot+2017-11-25+at+8.35.02+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="510" data-original-width="574" height="355" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9yWq4pE-eDnLwr3_ld5KNO-cmmVlxHJQVa0oxlX52bxGRA48BlRPYFwnO6HbodZ0X2TrvUvtItUOZMhTgS29fUp2Q7lA2HQPa4fxcF0VAu7J-C5IXYfJUnUxyWzmvMGUFwwbq3mGccRc/s400/Screen+Shot+2017-11-25+at+8.35.02+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 2. Reconstructed output image </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
For reference, below is the original image of the letter 'Y' that was used in the training phase. </div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgohKBNv4W1-6zPimEtLbDPJU_54gPmR7phJJTFagcUT8hk1t4Q2w_ElOz0qinXVL9o1MunIgNTDAep-CRT0ucmRa4KdLGYFIE4SMQa_DAxGtHQVnCn9-DPNY1beP67x9RA_t3IszX-t5c/s1600/Screen+Shot+2017-11-25+at+8.34.33+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="505" data-original-width="546" height="368" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgohKBNv4W1-6zPimEtLbDPJU_54gPmR7phJJTFagcUT8hk1t4Q2w_ElOz0qinXVL9o1MunIgNTDAep-CRT0ucmRa4KdLGYFIE4SMQa_DAxGtHQVnCn9-DPNY1beP67x9RA_t3IszX-t5c/s400/Screen+Shot+2017-11-25+at+8.34.33+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 3. Original image used for training </td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<h3>
Experiment Details</h3>
As a starting point I used TensorFlow tutorials in Jupyter Notebooks, in particular <a href="https://github.com/sjchoi86/tensorflow-101/blob/master/notebooks/dae_mnist_dropout.ipynb" target="_blank">this excellent de-noising autoencoder</a> example that uses the MNIST database as the data source. The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems, and it is also widely used for training and testing in the field of machine learning. The MNIST database contains 60,000 training images and 10,000 testing images. Half of the training set and half of the test set were taken from NIST's training dataset, while the other halves were taken from NIST's testing dataset.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrUgy6vUwplxt8Z_DxFMiAMPWqjV_qXkY9qQM6M0ZxdxppkynU0453ZltCZoBciYiWaL6wKtk4nFx6cgxJf_ozeTPb03KQShWmT_0QpICyzOw58tltt5TuQ0aApLlt7sQ2ykgvseGP7n8/s1600/training.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1600" data-original-width="198" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrUgy6vUwplxt8Z_DxFMiAMPWqjV_qXkY9qQM6M0ZxdxppkynU0453ZltCZoBciYiWaL6wKtk4nFx6cgxJf_ozeTPb03KQShWmT_0QpICyzOw58tltt5TuQ0aApLlt7sQ2ykgvseGP7n8/s640/training.png" width="76" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 4. Morse images</td></tr>
</tbody></table>
I created a simple Python script that generates a Morse code dataset in MNIST format using a text file as the input. To keep things simple I kept the MNIST image size (28 x 28 pixels) and just 'painted' the Morse code as white pixels on the canvas. These images look a bit like the waterfall display in modern SDR receivers or in software like CW Skimmer. I created altogether 55,000 training images, 5,000 validation images and 10,000 testing images.<br />
<br />
To validate that these images look OK I plotted the first ten characters "BB 2BQA}VA" from the random text file I used for training. Each image is 28x28 pixels, so even the longest Morse character fits easily. Right now all Morse characters start from the top left corner, but it would be easy to introduce more randomness in the starting point and even in the length (or speed) of the characters. <br />
<br />
In fact the original MNIST images have a lot of variability in the handwritten digits, and some are difficult even for humans to classify correctly. In the MNIST case you have only ten classes to choose from (the digits 0-9), but for Morse code I had 60 classes, as I wanted to include special characters in the training material as well.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqCKVIsQa1zLis8ItnwFhLV2b6CXOGLLvv_RmXIgZZUXQPBdYs1C9Vyyr_Q50YGwPQVb6LSXI6my0rfpc0ljAjWRqq35Qr67WacRj_9qyBcGy_Vbpt-Qm4_Zv_-Lmd-nhtBWCewjNUslU/s1600/mnist.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="252" data-original-width="87" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgqCKVIsQa1zLis8ItnwFhLV2b6CXOGLLvv_RmXIgZZUXQPBdYs1C9Vyyr_Q50YGwPQVb6LSXI6my0rfpc0ljAjWRqq35Qr67WacRj_9qyBcGy_Vbpt-Qm4_Zv_-Lmd-nhtBWCewjNUslU/s1600/mnist.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 5. MNIST images</td></tr>
</tbody></table>
<br />
Figure 4. shows the Morse example images and Figure 5. shows the MNIST example handwritten images.<br />
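The generator script itself is not reproduced in this post; below is a minimal sketch of the 'painting' idea. The codebook excerpt and the unit widths (dot = 1 px, dash = 3 px, 1 px gap) are my assumptions for illustration, not the values from the original script.

```python
import numpy as np

MORSE = {"A": ".-", "B": "-...", "Y": "-.--"}  # excerpt of the codebook

def paint_morse(char, size=28, dot=1, dash=3, gap=1, row=0):
    """Paint one Morse character as white pixels on a black 28x28 canvas.

    Unit widths (dot=1 px, dash=3 px, gap=1 px) are assumptions, not the
    values used in the original generator script.
    """
    img = np.zeros((size, size), dtype=np.uint8)
    col = 0
    for sym in MORSE[char]:
        width = dot if sym == "." else dash
        img[row, col:col + width] = 255  # white pixels on black background
        col += width + gap
    return img

img = paint_morse("Y")  # "-.--" painted along the top row
```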
<br />
When training the DAE network I added a modest amount of Gaussian noise to the training images; see the example in Figure 6. It is quite surprising that the DAE network is still able to decode correct answers with three times more noise added to the test images.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKLVAr-hWzv8sFA0rj1ochvsX-6QhvV8oIVsnBAnVRhr01uduPt8PSvksBwNhCSv3t_Vj9q_H_4ZFUDLFg8LSAk9mHW8-Llpl4IhllTraoW3oeEG9M99OuGBDCw2uLHbt-2OhLN42WuI8/s1600/noisy.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="244" data-original-width="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKLVAr-hWzv8sFA0rj1ochvsX-6QhvV8oIVsnBAnVRhr01uduPt8PSvksBwNhCSv3t_Vj9q_H_4ZFUDLFg8LSAk9mHW8-Llpl4IhllTraoW3oeEG9M99OuGBDCw2uLHbt-2OhLN42WuI8/s1600/noisy.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 6. Noise added to training input image</td></tr>
</tbody></table>
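The noise corruption step can be sketched as follows; the 0.3 noise level is an assumption for illustration, not the exact value used in the notebook.

```python
import numpy as np

def corrupt(images, noise_level=0.3, rng=None):
    """Add zero-mean Gaussian noise and clip back to the valid [0, 1] range.

    noise_level=0.3 for training (roughly 3x that for testing) mirrors the
    text above; the exact values in the notebook may differ.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    noisy = images + noise_level * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0)
```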
<br />
<br />
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br /></div>
<h3>
Network model and functions</h3>
<div>
A typical feature of auto-encoders is hidden layers that have fewer features than the input or output layers. The network is forced to learn a "compressed" representation of the input. If the input were completely random this compression task would be very difficult, but if there is structure in the data (for example, if some of the input features are correlated), then the algorithm can discover some of those correlations.</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Network Parameters</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">n_input = 784 # MNIST data input (img shape: 28*28)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">n_hidden_1 = 256 # 1st layer num features</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">n_hidden_2 = 256 # 2nd layer num features</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">n_output = 784 # output layer, same size as input (28*28)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">with tf.device(device2use):</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # tf Graph input</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> x = tf.placeholder("float", [None, n_input])</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> y = tf.placeholder("float", [None, n_output])</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dropout_keep_prob = tf.placeholder("float")</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # Store layers weight & bias</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> weights = {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'out': tf.Variable(tf.random_normal([n_hidden_2, n_output]))</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> biases = {</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'b1': tf.Variable(tf.random_normal([n_hidden_1])),</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'b2': tf.Variable(tf.random_normal([n_hidden_2])),</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'out': tf.Variable(tf.random_normal([n_output]))</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span></div>
</div>
<br />
The functions for this neural network are below. The cost function is the mean squared difference between the output and the training images.<br />
<br />
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">with tf.device(device2use):</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # MODEL</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> out = denoising_autoencoder(x, weights, biases, dropout_keep_prob)</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # DEFINE LOSS AND OPTIMIZER</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> cost = tf.reduce_mean(tf.pow(out-y, 2))</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost) </span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # INITIALIZE</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> init = tf.initialize_all_variables()</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # SAVER</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> savedir = "nets/"</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><span style="font-size: x-small;"> saver = tf.train.Saver(max_to_keep=3)</span> </span></div>
</div>
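The denoising_autoencoder function called above is not reproduced in the listing. In the referenced tutorial it is a two-layer sigmoid network with dropout between the layers; below is a NumPy sketch of that forward pass for illustration (the notebook itself uses the corresponding TensorFlow ops).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def denoising_autoencoder(x, weights, biases, keep_prob, rng=None):
    """Forward pass mirroring the TF graph above: two sigmoid hidden
    layers with dropout, then a sigmoid output layer.

    Layer sizes and activations follow the referenced tutorial; this
    NumPy version is for illustration only.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    h1 = sigmoid(x @ weights["h1"] + biases["b1"])
    h1 *= rng.binomial(1, keep_prob, h1.shape) / keep_prob  # inverted dropout
    h2 = sigmoid(h1 @ weights["h2"] + biases["b2"])
    h2 *= rng.binomial(1, keep_prob, h2.shape) / keep_prob
    return sigmoid(h2 @ weights["out"] + biases["out"])
```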
<h3>
Model Training </h3>
<div>
I used the following parameters for training the model. Training took 1,780 seconds on a MacBook Pro laptop. The cost curve of the training process is shown in Figure 6. </div>
<div>
<div>
<br /></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">training_epochs = 300</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">batch_size = 1000</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">display_step = 5</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">plot_step = 10</span></div>
</div>
<div>
<br /></div>
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWiKIHAaMAjFexKN93-SGNuy_QpY-L4guZQkwfEUpHV2usosaIbIBzqiHODAn8kR1GwBFUrZZTyTyDr2hBHmWLI-QRBAtw2a5_VcOgIBcgQuY_C3mmwmtTrtCFENlBdiuCA47zpcp1wwY/s1600/Screen+Shot+2017-11-25+at+8.35.16+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="642" data-original-width="824" height="311" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgWiKIHAaMAjFexKN93-SGNuy_QpY-L4guZQkwfEUpHV2usosaIbIBzqiHODAn8kR1GwBFUrZZTyTyDr2hBHmWLI-QRBAtw2a5_VcOgIBcgQuY_C3mmwmtTrtCFENlBdiuCA47zpcp1wwY/s400/Screen+Shot+2017-11-25+at+8.35.16+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 6. Cost curve</td></tr>
</tbody></table>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><br />
<div style="text-align: left;">
It is interesting to observe what happens to the weights. Figure 7 shows the first hidden layer "h1" weights after training has completed. Each of these blocks has learned some internal representation of the Morse characters. You can also see the noise that was present in the training data.<br />
<br /></div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWKFkHyY60HOo_f3U3UJR604iCwFoImxP-sLDVuWuGR_kDmub_VRKyETo5KXEsSYSdMbyBpa9k-AVS4DYNxm6tE_uMa2CJ_fW4oH9BzAUP8RTFYsZa3o9BLEtwqV7j1_jOCNwpKR65Lw8/s1600/Screen+Shot+2017-11-25+at+8.35.35+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="610" data-original-width="642" height="380" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWKFkHyY60HOo_f3U3UJR604iCwFoImxP-sLDVuWuGR_kDmub_VRKyETo5KXEsSYSdMbyBpa9k-AVS4DYNxm6tE_uMa2CJ_fW4oH9BzAUP8RTFYsZa3o9BLEtwqV7j1_jOCNwpKR65Lw8/s400/Screen+Shot+2017-11-25+at+8.35.35+PM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 7. Filter shape for "h1" weights</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiWKFkHyY60HOo_f3U3UJR604iCwFoImxP-sLDVuWuGR_kDmub_VRKyETo5KXEsSYSdMbyBpa9k-AVS4DYNxm6tE_uMa2CJ_fW4oH9BzAUP8RTFYsZa3o9BLEtwqV7j1_jOCNwpKR65Lw8/s1600/Screen+Shot+2017-11-25+at+8.35.35+PM.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"></a><br /></div>
<h3>
Software</h3>
<a href="https://github.com/ag1le/LSTM_morse/blob/master/MNIST-MORSE-DE_NOISING_DECODER-ENCODER.ipynb" target="_blank">The Jupyter Notebook source code</a> of this experiment has been posted to Github. Many thanks to the original contributors of <a href="https://github.com/sjchoi86/tensorflow-101/blob/master/notebooks/dae_mnist_dropout.ipynb" target="_blank">this</a> and other Tensorflow tutorials. Without them this experiment would not have been possible.<br />
<div>
<br /></div>
<h3>
Conclusions
</h3>
<div>
This experiment demonstrates that de-noising auto-encoders could have many potential use cases for ham radio experiments. While I used MNIST format (28x28 pixel images) in this experiment, it is quite feasible to use other kinds of data, such as audio WAV files, SSTV images or some data from other digital modes commonly used by ham radio amateurs. </div>
<div>
<br /></div>
<div>
If your data has a clear structure that gets noise added and distorted during a radio transmission, it is quite feasible to experiment with implementing a de-noising auto-encoder to restore near-original quality. It is just a matter of re-configuring the DAE network and re-training the neural network.</div>
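<div>
<br /></div>
<div>
To make the re-training idea concrete, here is a minimal, self-contained de-noising auto-encoder sketch in plain NumPy. This is an illustration only, not the notebook's TensorFlow architecture; the sine-wave data, layer sizes and learning rate are assumptions chosen so the example runs quickly.</div>

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy structured "clean" data: 200 shifted sine patterns, 32 samples each
# (an illustrative stand-in for the 28x28 MNIST-style images in the post).
t = np.linspace(0, 2 * np.pi, 32)
clean = np.stack([np.sin(t + p) for p in rng.uniform(0, 2 * np.pi, 200)])
noisy = clean + rng.normal(0.0, 0.3, clean.shape)

# One-hidden-layer auto-encoder trained by plain gradient descent:
# the network sees the *noisy* input but is trained to reproduce the
# *clean* target -- that is the whole de-noising trick.
n_in, n_hid, lr = 32, 8, 0.02
W1 = rng.normal(0, 0.1, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_hid, n_in)); b2 = np.zeros(n_in)

for _ in range(2000):
    h = np.tanh(noisy @ W1 + b1)              # encoder
    out = h @ W2 + b2                         # linear decoder
    err = out - clean                         # compare against clean target
    gW2 = h.T @ err / len(noisy); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)          # back-propagate through tanh
    gW1 = noisy.T @ dh / len(noisy); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

denoised = np.tanh(noisy @ W1 + b1) @ W2 + b2
mse_before = float(np.mean((noisy - clean) ** 2))
mse_after = float(np.mean((denoised - clean) ** 2))
```

After training, the reconstruction error against the clean signals should be noticeably lower than the raw noise level, which is exactly the behavior the MNIST experiment shows at larger scale.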
<div>
<br /></div>
<div>
If this article sparked your interest in de-noising auto-encoders please let me know. Machine Learning algorithms are rapidly being deployed in many data intensive applications. I think it is time for ham radio amateurs to start experimenting with this technology as well. </div>
<div>
<br /></div>
<div>
<br /></div>
<div>
73 </div>
<div>
Mauri AG1LE </div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-90884900715029273362017-11-05T21:18:00.002-05:002017-11-06T12:07:06.640-05:00<h3 class="post-title entry-title" itemprop="name" style="background-color: white; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; font-size: 22px; font-stretch: normal; font-weight: normal; line-height: normal; margin: 0.75em 0px 0px; position: relative;">
TensorFlow revisited: a new LSTM Dynamic RNN based Morse decoder</h3>
<div>
<br /></div>
<div>
<br /></div>
<div>
It has been almost two years since I was playing with <a href="http://ag1le.blogspot.com/2015/12/tensorflow-new-lstm-rnn-based-morse.html" target="_blank"><span style="color: blue;">TensorFlow based Morse decoder</span></a>. This is a long time in the rapidly moving Machine Learning field.</div>
<div>
<br /></div>
<div>
I created a new version of the LSTM Dynamic RNN based Morse decoder using the TensorFlow package and <a href="https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/3_NeuralNetworks/dynamic_rnn.ipynb" target="_blank">Aymeric Damien's example</a>. This version is much faster and can also train and decode on variable-length sequences. The training and testing sets are generated on the fly from sample text files. I included the Python library and the new TensorFlow code on <a href="https://github.com/ag1le/LSTM_morse" target="_blank"><span style="color: blue;">my Github page</span></a>. </div>
<div>
<br /></div>
<div>
The demo can train and test using datasets with embedded noise. Fig 1. shows the first 50 test vectors with Gaussian noise added. Each vector is padded to 32 values. Unlike the previous version of the LSTM network, this new version can train on variable-length sequences. The Morse class handles the generation of training vectors from an input text file that contains randomized text. </div>
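<div>
<br /></div>
<div>
The actual Morse class lives in the Github repository linked above; purely to illustrate what a padded training vector looks like, a hypothetical character encoder could work as follows. The dictionary and the dit/dah sample counts are my illustrative assumptions, not the library code.</div>

```python
# Hypothetical sketch of turning one character into a fixed-length
# amplitude vector, in the spirit of the Morse class (not actual code).
MORSE = {"A": ".-", "B": "-...", "E": ".", "N": "-.", "T": "-"}

def encode_char(ch, max_len=32):
    """Encode a character as amplitude samples: dit = one 1,
    dah = three 1s, a single 0 between elements, zero-padded."""
    samples = []
    for sym in MORSE[ch]:
        samples += [1.0] * (1 if sym == "." else 3)
        samples.append(0.0)          # gap between dits/dahs
    samples = samples[:-1]           # drop the trailing gap
    seqlen = len(samples)
    samples += [0.0] * (max_len - seqlen)   # pad to fixed length
    return samples, seqlen

vec, n = encode_char("A")   # ".-" -> [1, 0, 1, 1, 1] plus padding
```

Both the padded vector and its true length are returned, because the dynamic RNN needs the true sequence length to ignore the padded tail.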
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0WS1kSuMNgW8MaOOx4xr-KN038LPhpSgbCTcI1eEpkBL1ryYIiIwKxkHQpbQRj7BLIPnLQB8YDIwwwux84I-_TtHQT-xe_4lIEmK1ObT45jeVsioCNCW45nJtqCfYooQ18OwhJYza08w/s1600/morse_encoding.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="578" data-original-width="385" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg0WS1kSuMNgW8MaOOx4xr-KN038LPhpSgbCTcI1eEpkBL1ryYIiIwKxkHQpbQRj7BLIPnLQB8YDIwwwux84I-_TtHQT-xe_4lIEmK1ObT45jeVsioCNCW45nJtqCfYooQ18OwhJYza08w/s400/morse_encoding.png" width="266" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. <span style="font-size: xx-small;">"NOW 20 WPM TEXT IS FROM JANUARY 2015 QST PAGE 56 " </span></td></tr>
</tbody></table>
<br />
<span style="font-size: xx-small;"><br /></span>
<br />
<div class="separator" style="clear: both; text-align: center;">
<span style="font-family: inherit;"><br /></span></div>
<div>
<div>
<span style="font-family: inherit;">Below are the TensorFlow model and network parameters I used for this experiment: </span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<pre><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace; font-size: x-small;"># MODEL Parameters
learning_rate = 0.01
training_steps = 5000
batch_size = 512
display_step = 100
n_samples = 10000

# NETWORK Parameters
seq_max_len = 32   # Sequence max length
n_hidden = 64      # Hidden layer num of features
n_classes = 60     # Each Morse character is a separate class</span></pre>
</div>
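<div>
<br /></div>
<div>
One detail behind the variable-length support is worth spelling out: with padded batches, the classifier must read each sequence's output at its true last step, not at the padded tail. A small NumPy sketch of that gather step is below; the shapes follow the parameters above, and the arrays are random placeholders rather than real RNN outputs.</div>

```python
import numpy as np

rng = np.random.default_rng(1)
batch, seq_max_len, n_hidden = 4, 32, 64

# Placeholder for the RNN outputs of a padded batch, plus true lengths.
outputs = rng.random((batch, seq_max_len, n_hidden))
seqlen = np.array([5, 12, 32, 7])

# Pick outputs[i, seqlen[i] - 1, :] for every sequence i in the batch;
# these vectors feed the final softmax over the n_classes characters.
last = outputs[np.arange(batch), seqlen - 1]
```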
<div>
<br /></div>
<div>
<br /></div>
<span style="font-family: inherit;">Fig 2. shows the training loss and accuracy by minibatch. This training took <span style="background-color: white; white-space: pre-wrap;">446.9 seconds and the final </span><span style="background-color: white; white-space: pre-wrap;">testing accuracy reached 0.9988. </span> This training session was done without any noise in the training dataset. </span><br />
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmtLDgKrelKGJpr9EvTCOFpOPalrDCbE2rTvQJkpMva__WLrsahmXGtOmfQthFIQVF0PbYKyz1jJMD2uIS3M5QFYBGy-8ydvAuA8LS82pBku0H_DZDeul_0Asvi_82YF4NqI9enbOJju0/s1600/training.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="288" data-original-width="432" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjmtLDgKrelKGJpr9EvTCOFpOPalrDCbE2rTvQJkpMva__WLrsahmXGtOmfQthFIQVF0PbYKyz1jJMD2uIS3M5QFYBGy-8ydvAuA8LS82pBku0H_DZDeul_0Asvi_82YF4NqI9enbOJju0/s320/training.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 2. Training Loss and Accuracy plot.</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
A sample session using the trained model is shown below: </div>
<div>
<br /></div>
<div>
<pre><span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace; font-size: xx-small;"># ================================================================
# Use saved model to predict characters from Morse sequence data
# ================================================================
NOISE = False

saver = tf.train.Saver()

testset = Morse(n_samples=10000, max_seq_len=seq_max_len, filename='arrl2.txt')
test_data = testset.data
if NOISE:
    test_data = test_data + normal(0., 0.1, 32*10000).reshape(10000, 32, 1)
test_label = testset.labels
test_seqlen = testset.seqlen

# Later, launch the model, use the saver to restore variables from disk,
# and do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/morse_model.ckpt")
    print("Model restored.")
    y_hat = tf.argmax(pred, 1)
    ch = sess.run(y_hat, feed_dict={x: test_data, y: test_label, seqlen: test_seqlen})
    s = ''
    for c in ch:
        s += testset.decode(c)
    print(s)</span></pre>
</div>
<div>
<br /></div>
<div>
Here is the output from the decoder (using the arrl2.txt file as input): </div>
<div>
<br /></div>
<div>
<pre style="background-color: white; border-radius: 0px; border: 0px; box-sizing: border-box; line-height: inherit; overflow: auto; padding: 0px; vertical-align: baseline; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">INFO:tensorflow:Restoring parameters from /tmp/morse_model.ckpt
Model restored.
NOW 20 WPM TEXT IS FROM JANUARY 2015 QST PAGE 56 SITUATIONS WHERE I COULD HAVE BROUGHT A DIRECTIONAL ANTENNA WITH ME, SUCHAS A SMALL YAGI FOR HF OR VHF. IF ITS LIGHT ENOUGH, ROTATING A YAGI CAN BEDONE WITH THE ARMSTRONG METHOD, BUT IT IS OFTEN VERY INCONVENIENT TO DO SO.PERHAPS YOU DONT WANT TO LEAVE THE RIG BEHIND WHILE YOU GO OUTSIDE TO ADJUST THE ANTENNA TOWARD THAT WEAK STATION, OR PERHAPS YOU'RE IN A TENT AND ITS DARK OUT THERE. A BATTERY POWERED ROTATOR PORTABLE ROTATION HAS DEVELOPED A SOLUTION TO THESE PROBLEMS. THE 12PR1A IS AN ANTENNA ROTATOR FIGURE 6 THAT FUNCTIONS ON 9 TO 14 V DC. AT 12 V, THE UNIT IS SPECIFIED TO DRAW 40 MA IDLE CURRENT AND 200 MA OR LESS WHILE THE ANTENNA IS TURNING. IT CAN BE POWERED FROM THE BATTERY USED TO RUN A TYPICAL PORTABLE STATION.WHILE THE CONTROL HEAD FIGURE 7 WILL FUNCTION WITH AS LITTLE AS 6 V, A END OF 20 WPM TEXT QST DE AG1LE NOW 20 WPM TEXT IS FROM JANUARY 2014 QST PAGE 46 TRANSMITTER MANUALS SPECIFI</span></pre>
<pre style="background-color: white; border-radius: 0px; border: 0px; box-sizing: border-box; line-height: inherit; overflow: auto; padding: 0px; vertical-align: baseline; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">
</span></pre>
</div>
<div>
As the reader can observe, the LSTM network has learned to translate incoming Morse sequences to text nearly perfectly. </div>
<div>
<br /></div>
<div>
Next I set the NOISE variable to True. Here is the decoded message with noise: </div>
<div>
<br /></div>
<div>
<pre style="background-color: white; border-radius: 0px; border: 0px; box-sizing: border-box; line-height: inherit; overflow: auto; padding: 0px; vertical-align: baseline; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">NOW J0 O~M TEXT IS LRZM JANUSRQ 2015 QST PAGE 56 SITRATIONS WHEUE I XOULD HAVE BRYUGHT A DIRECTIZNAF ANTENNS WITH ME{ SUYHSS A SMALL YAGI FYR HF OU VHV' IV ITS LIGHT ENOUGH, UOTSTING A YAGI CAN BEDONE FITH THE ARMSTRONG METHOD8 LUT IT IS OFTEN VERQ INOGN5ENIENT TC DG SC.~ERHAPS YOR DZNT WINT TO LEAVE THE RIK DEHIND WHILE YOU KO OUTSIME TO ADJUST THE AATENNA TYOARD THNT WEAK STTTION0 OU ~ERHAPS COU'UE IN A TENT AND ITS MARK OUT THERE. S BATTERC JYWERED RCTATOR ~ORTALLE ROTATION HAS DEVELOOED A SKLUTION TO THESE ~UOBLEMS. THE 1.JU.A IS AN ANTENNA RYTATCR FIGURE 6 THAT FRACTIZNS ZN ) TO 14 V DC1 AT 12 W{ THE UNIT IS SPECIFIED TO DRSW }8 MA IDLE CURRENT AND 20' MA OR LESS WHILE THE ANTENNA IS TURNING. IT ZAN BE POOERED FROM THE BATTEUY USED TO RRN A T}~IXAL CQMTUBLE STATION_WHILE IHE }ZNTROA HEAD FIGURE 7 WILA WUNXTION WITH AS FITTLE AA 6 F8 N END ZF 2, WPM TEXT OST ME AG1LE NOW 20 W~M TEXT IS LROM JTNUARJ 201} QST ~AGE 45 TRANSMITTER MANUALS S~ECILI</span></pre>
</div>
<div>
<br /></div>
<span style="font-family: inherit;">Interestingly, this text is still quite readable despite the noisy signals. The model mis-decodes some dits and dahs, but the word structure is still visible. </span><br />
<span style="font-family: inherit;"><br /></span>
As a next step I re-trained the network with the same amount of noise in the training dataset. I expected the loss and accuracy to be worse. Fig 3. shows that training took much longer: training accuracy reached only 0.89338 and the maximum testing accuracy was 0.9837.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUJLWAM1TtWSAt5e-pU7K8xHnJv7aXOhyphenhyphenqJlsBLjv9-2sJ8Df33xGqKxWWtatroutIYWevVdVH53e96fkMlLl0P3noXRwJG-5tqq5M42lYR1PZ2BsXaOIcyvwlAAgYzC5Gmwq4BHZgcGA/s1600/training.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" data-original-height="288" data-original-width="432" height="213" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUJLWAM1TtWSAt5e-pU7K8xHnJv7aXOhyphenhyphenqJlsBLjv9-2sJ8Df33xGqKxWWtatroutIYWevVdVH53e96fkMlLl0P3noXRwJG-5tqq5M42lYR1PZ2BsXaOIcyvwlAAgYzC5Gmwq4BHZgcGA/s320/training.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig. 3 Training Loss and Accuracy with noisy dataset</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
With the new model trained on noisy data I re-ran the testing phase. Here is the decoded message with noise:<br />
<br />
<pre style="background-color: white; border-radius: 0px; border: 0px; box-sizing: border-box; line-height: inherit; overflow: auto; padding: 0px; vertical-align: baseline; white-space: pre-wrap; word-break: break-all; word-wrap: break-word;"><span style="font-family: "courier new" , "courier" , monospace; font-size: xx-small;">NOW 20 WPM TEXT IS FROM JANUARY 2015 QST PAGE 56 SITUATIONS WHERE I COULD HAWE BROUGHT A DIRECTIONAL ANTENNA WITH ME0 SUCHAS A SMALL YAGI FOR HF OR VHF1 IF ITS LIGHT ENOUGH0 ROTATING A YAGI CAN BEDONE WITH THE ARMSTRONG METHOD0 BUT IT IS OFTEN VERY INCONVENIENT TO DO SO1PERHAPS YOU DONT WANT TO LEAVE THE RIG BEHIND WHILE YOU GO OUTSIDE TO ADJUST THE ANTENNA TOWARD THAT WEAK STATION0 OR PERHAPS YOU1RE IN A TENT AND ITS DARK OUT THERE1 A BATTERY POWERED ROTATOR PORTABLE ROTATION HAS DEVELOPED A SOLUTION TO THESE PROBLEMS1 THE 12PR1A IS AN ANTENNA ROTATOR FIGURE 6 THAT FUNCTIONS ON 9 TO 14 V DC1 AT 12 V0 THE UNIT IS SPECIFIED TO DRAW 40 MA IDLE CURRENT AND 200 MA OR LESS WHILE THE ANTENNA IS TURNING1 IT CAN BE POWERED FROM THE BATTERY USED TO RUN A TYPICAL PORTABLE STATION1WHILE THE CONTROL HEAD FIGURE Q WILL FUNCTION WITH AS LITTLE AS X V0 A END OF 20 WPM TEXT QST DE AG1LE NOW 20 WPM TEXT IS FROM JANUARY 2014 QST PAGE 46 TRANSMITTER MANUALS SPECIFI</span></pre>
<br />
As the reader can observe, we now have a nearly perfect copy from the noisy testing data. The LSTM network has gained the ability to pick up the signals from the noise. Note that the training data and testing data are two completely separate datasets.<br />
<h3>
CONCLUSIONS</h3>
<div>
Recurrent Neural Networks have gained a lot of momentum over the last two years. LSTM type networks are used in machine learning systems, like Google Translate, that can translate a sequence of characters from one language to another efficiently and accurately. </div>
<div>
<br /></div>
<div>
This experiment shows that a relatively small TensorFlow based neural network can learn Morse code sequences and translate them to text. It also shows that adding noise to the training data slows down training and reduces the overall training accuracy achieved. However, applying a similar noise level in the testing phase significantly improves the testing accuracy when using a model trained on noisy training signals. The network has learned the signal distribution and is able to decode more accurately. </div>
<div>
<br /></div>
<div>
So what are the practical implications of this work? With some signal pre-processing, an LSTM RNN could provide a self-learning Morse decoder that only needs a set of labeled audio files to learn a particular set of sequences. With a large enough training dataset the model could achieve over 95% accuracy.</div>
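<div>
<br /></div>
<div>
One way to put a number on accuracy claims like this is the character error rate, computed with edit distance between the decoded and reference texts. The helper below is standard practice rather than code from this post.</div>

```python
def edit_distance(a, b):
    """Levenshtein distance via the classic single-row DP table."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (ca != cb))  # substitution
    return dp[len(b)]

def char_accuracy(decoded, reference):
    """1.0 means a perfect copy; lower values mean more character errors."""
    return 1.0 - edit_distance(decoded, reference) / max(len(reference), 1)
```

Running this over the clean and noisy decoder outputs shown earlier would quantify exactly how much the noise-trained model improves over the clean-trained one.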
<div>
<br /></div>
<div>
73 de AG1LE </div>
<div>
Mauri </div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com2tag:blogger.com,1999:blog-3326773214329183284.post-13357847936628504492017-04-01T10:52:00.000-04:002017-04-01T21:34:53.972-04:00President Trump's "America First Energy Plan" Secrets Leaked: Quake Field Generator<div>
April 1st, 2017 Lexington, Massachusetts
</div>
<div>
<br /></div>
<div>
As President Trump has stated publicly many times, a sound energy policy begins with the recognition that we have vast untapped domestic energy reserves right here in America. Unfortunately, the secret details behind the ambitious <a href="https://www.whitehouse.gov/america-first-energy">America First Energy Plan</a> were leaked late last night.
</div>
<div>
<br /></div>
<div>
To pre-empt any fake news by the Liberal Media I am making a full disclosure of the secret project I have been working on for the last 18 months in propinquity of MIT Lincoln Laboratory, a federally funded research and development center chartered to apply advanced technology to problems of national security.
</div>
<div>
<br /></div>
<div>
I am unveiling a breakthrough technology that will lower energy costs for hardworking Americans and maximize the use of American resources, freeing us from dependence on foreign oil. This technology allows harvesting clean energy from around the world and making other nations pay for it according to President Trump's master plan.
</div>
<div>
<br /></div>
<div>
The technology is based on quake fields and provides virtually unlimited free energy, while protecting clean air and clean water, conserving our natural habitats, and preserving our natural reserves and resources.
</div>
<div>
<br /></div>
<h3>
<b>What is Quake Field?</b></h3>
<div>
<br /></div>
<div>
Quake field theory is a relatively unknown part of seismology. Seismology is the scientific study of earthquakes and the propagation of elastic waves through the Earth or through other planet-like bodies. The field also includes studies of earthquake environmental effects such as tsunamis as well as diverse seismic sources such as volcanic, tectonic, oceanic, atmospheric, and artificial processes such as explosions.
</div>
<div>
<br /></div>
<div>
Quake field theory was formulated by Dr. James von Hausen in 1945 as part of the Manhattan Project during World War II. Quake field theory provides a mathematical model of how energy propagates through elastic waves. During the development of the first nuclear weapons, scientists faced a big problem: nobody was able to provide an accurate estimate of the energy yield of the first atom bomb. People were concerned about possible side effects, and there was speculation that the fission reaction could ignite the Earth's atmosphere.
</div>
<div>
<br /></div>
<div>
Quake field theory provides precise field formulas to calculate energy propagation in planet-like bodies. The theory has been proven in hundreds of nuclear weapon tests during the Cold War period. However, most of the empirical research and scientific papers have been classified by the U.S. Government and therefore you cannot really find details in Wikipedia or other public sources due to the sensitivity of the information.</div>
<div>
<br /></div>
<div>
In recent years U.S. seismologists have started to use quake field theory to calculate the amount of energy released in earthquakes. This work was enabled by the creation of the global network of seismic sensors that is now available. These sensors provide real time information on earthquakes over the Internet.
</div>
<div>
<br /></div>
<div>
I have a <a href="http://www.raspberryshake.org/" target="_blank">Raspberry Shake</a> at home. This is a Raspberry Pi powered device that monitors quake field activity and is part of a global seismic sensor network. Figure 1 shows quake field activity on March 25, 2017. As you can see it was a very active day. This system gives me a prediction of when the quake field is activated. </div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgG8sMSIRKiFZQU3Rrvo1IhzjZA-bODoGewnl5-kkzc7_ypkpsbUR1RPmo7O9os5n42_WzygGG0zo2ow9G4sr7HaagNvV2H0EEbQ_VriSb3l6BhnJZieEQHFtC5hn2Un2-ToMdkuDh6uWg/s1600/R8C5B_SHZ_AM_00.2017032512.gif" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgG8sMSIRKiFZQU3Rrvo1IhzjZA-bODoGewnl5-kkzc7_ypkpsbUR1RPmo7O9os5n42_WzygGG0zo2ow9G4sr7HaagNvV2H0EEbQ_VriSb3l6BhnJZieEQHFtC5hn2Un2-ToMdkuDh6uWg/s640/R8C5B_SHZ_AM_00.2017032512.gif" width="579" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. Quake Field activity in Lexington, MA</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<h3>
<b>How much energy is available from Quake Field?</b></h3>
<div>
<br /></div>
<div>
A single magnitude 9 earthquake releases approximately 3.9e+22 Joules of seismic moment energy (Mo). Much of this energy is dissipated at the epicenter, but approximately 1.99e+18 Joules is radiated as seismic waves through the planet. To put this in perspective, you could power the whole United States for 7.1 days with this radiated energy. This radiated energy equals 15,115 million gallons of gasoline - just from a single large earthquake.
</div>
<div>
<br /></div>
<div>
The radiated energy is released as waves from the epicenter of a major earthquake and propagate outward as surface waves (S waves). In the case of compressional waves (P waves), the energy radiates from the focus under the epicenter and travels all the way through the globe. Figure 2 illustrates these two primary energy transfer mechanisms. Note that we don’t need to build any transmission network to transfer this energy so the capital cost would be very small. </div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<br /></div>
<div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://earth.rice.edu/mtpe/geo/geosphere/hot/earthquakes/propagation.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://earth.rice.edu/mtpe/geo/geosphere/hot/earthquakes/propagation.jpg" height="378" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. Energy Transfer by Radiated Waves</td></tr>
</tbody></table>
<br /></div>
<div>
<br /></div>
<div>
Magnitude 2 and smaller earthquakes occur several hundred times a day world wide. Major earthquakes, greater than magnitude 7, happen more than once per month. “Great earthquakes”, magnitude 8 and higher, occur about once a year. <br />
<br />
The real challenge has been that we have not had a technology to harvest this huge untapped energy - until today.
</div>
<div>
<br /></div>
<div>
<h3>
<b>Introducing Quake Field Generator</b></h3>
</div>
<div>
<br /></div>
<div>
The following introduction explains the operating principles of quake field generator (QFG) technology. <br />
<br /></div>
<div>
Using the quake field theory and the seismic sensor data it is now possible to predict accurately when the S and P waves arrive at any location on Earth. The big problem has been finding an efficient method to convert the energy of these waves to electricity.
</div>
<div>
<br /></div>
<a href="https://en.wikipedia.org/wiki/Nanogenerator" target="_blank">A triboelectric nanogenerator</a> (TENG) is an energy harvesting device that converts the external mechanical energy into electricity by a conjunction of triboelectric effect and electrostatic induction.<br />
<br />
Ever since the first report of the TENG in January 2012, the output power density of the TENG has been improved by five orders of magnitude within 12 months. The area power density reaches 313 W/m2, the volume density reaches 490 kW/m3, and a conversion efficiency of ~60% has been demonstrated. Besides the unprecedented output performance, this new energy technology also has a number of other advantages, such as low manufacturing and fabrication cost, excellent robustness and reliability, environmental friendliness, and so on.<br />
<br />
<div>
The Liberal Media outlets have totally misunderstood the "clean coal technology" that is the cornerstone of President Trump's master plan for energy independence. Graphene is coal, just in a different molecular configuration. Graphene is one of the materials exhibiting a strong triboelectric effect. With recent advances in 3D printing technology it is now feasible to mass produce low cost triboelectric nanogenerators. Graphene is now commercially available for most 3D printers.
</div>
<div>
<br /></div>
<div>
The geometry of Quake Field Generator is based on fractals, minimizing the size of resonant transducer. My prototype consists of 10,000 TENG elements organized into a fractal shape. In this prototype version that I have been working on the last 18 months I have also implemented an automated tuning circuit that uses flux capacitors to maximize the energy capture at the resonance frequency. This brings the efficiency of the QFG to 97.8% - I am quite pleased with this latest design.<br />
<br /></div>
<div>
Figure 3. shows my current Quake Field Generator prototype - this is a 10 kW version. It has four stacks of TENG elements. Due to the high efficiency of these elements the ventilation need is quite minimal.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-m6aIzEKT0Xt-hMuNlMLuleLDVzJwT_QjnAgWRrTZG8wIiA4dQJwlkEUsYMbgWk585bSygw_KL0gsicC4J0Au3SgR75yGh8i4blDoWpTPjYWO7fAr3i4-k6IgR94ndXtHM4U0sf9lIJc/s1600/QFG.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi-m6aIzEKT0Xt-hMuNlMLuleLDVzJwT_QjnAgWRrTZG8wIiA4dQJwlkEUsYMbgWk585bSygw_KL0gsicC4J0Au3SgR75yGh8i4blDoWpTPjYWO7fAr3i4-k6IgR94ndXtHM4U0sf9lIJc/s400/QFG.png" width="358" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 3. Quake Field Generator prototype - 10 kW version</td></tr>
</tbody></table>
<br /></div>
<div>
<h3>
<span style="font-weight: bold;">So what does this news mean to an average American?</span></h3>
</div>
<div>
Quake Field Generator will be fully open source technology that will create millions of new jobs in the U.S. energy market. It leverages our domestic coal sources to build TENG devices from graphene (aka “clean coal”). </div>
<div>
<br /></div>
<div>
A simple 10 kW generator can be 3D printed in one day and it can be mounted next to your power distribution panel at your home. The only requirements are that the unit must have connection to ground to harvest the quake field energy and you need to use a professional electrician to make a connection to your home circuit.
</div>
<div>
<br /></div>
<div>
I have been running such a DIY 10 kW generator for over a year. So far I have been very happy with the performance of this Quake Field Generator. Once I finalize the design, my plan is to publish the software, circuit design, transducer STL files, etc. on Github. <br />
<br />
Let me know if you are interested in QFG technology - happy April 1st. </div>
<div>
<br /></div>
<div>
73
</div>
<br />
<div>
Mauri </div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-19232937735393438502017-01-29T18:00:00.001-05:002017-01-29T18:18:15.911-05:00Amazon Echo - Alexa skills for ham radio <h2>
</h2>
<div>
Demo video showing a proof of concept Alexa DX Cluster skill with remote control of Elecraft KX3 radio. </div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/Xh6jVeaIQvM/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/Xh6jVeaIQvM?feature=player_embedded" width="320"></iframe></div>
<div class="separator" style="clear: both; text-align: left;">
<br /></div>
<div>
<br /></div>
<h2>
Introduction</h2>
According to a Wikipedia article, Amazon Echo is a smart speaker developed by Amazon. The device consists of a 9.25-inch (23.5 cm) tall cylinder speaker with a seven-piece microphone array. The device connects to the voice-controlled intelligent personal assistant service Alexa, which responds to the name "Alexa". The device is capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic and other real-time information. It can also control several smart devices, using itself as a home automation hub.<br />
<br />
Echo also has access to skills built with the Alexa Skills Kit. These are 3rd-party developed voice experiences that add to the capabilities of any Alexa-enabled device (such as the Echo). Examples of skills include the ability to play music, answer general questions, set an alarm, order a pizza, get an Uber, and more. Skills are continuously being added to increase the capabilities available to the user.<br />
<br />
The Alexa Skills Kit is a collection of self-service APIs, tools, documentation and code samples that make it fast and easy for any developer to add skills to Alexa. Developers can also use the "Smart Home Skill API", a new addition to the Alexa Skills Kit, to easily teach Alexa how to control cloud-controlled lighting and thermostat devices. A developer can follow tutorials to learn how to quickly build voice experiences for their new and existing applications.<br />
<h2>
Ham Radio Use Cases </h2>
For ham radio purposes, the Amazon Echo and Alexa service create a whole new set of opportunities to automate your station and build new audio experiences.<br />
<br />
Here is a list of ideas for what you could use Amazon Echo for:<br />
<br />
- listen to ARRL podcasts<br />
- practice Morse code or ham radio exam questions<br />
- check space weather and radio propagation forecasts<br />
- memorize Q codes (QSL, QTH, etc.)<br />
- check call sign details from QRZ.com<br />
- use APRS to locate a mobile ham radio station <br />
<br />
<br />
I started experimenting with the Alexa Skills APIs, mostly using Python. One of the ideas I had was to get Alexa to control my Elecraft KX3 radio remotely. To make the skill more useful I built some software to pull the latest list of spots from a DX cluster and use those to set the radio to the spotted frequency, to listen to some new station or country on my bucket list.<br />
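For illustration, the raw spot lines that DX cluster servers emit follow a fairly standard "DX de ..." shape. A minimal parser might look like the sketch below; the sample line format and the regex are my own simplification, not the actual skill code.

```python
import re

# Classic cluster spot line, e.g. "DX de AG1LE:  14205.0  OH2BH  CQ DX  1234Z"
SPOT_RE = re.compile(r"DX de (?P<spotter>\S+):\s+(?P<freq>\d+\.\d+)\s+(?P<dx>\S+)")

def parse_spot(line):
    """Extract spotter, frequency (kHz) and spotted call from a spot line."""
    m = SPOT_RE.search(line)
    if m is None:
        return None
    return {
        "spotter": m.group("spotter"),
        "freq_khz": float(m.group("freq")),
        "dx": m.group("dx"),
    }
```

Each parsed spot can then be read out by Alexa and mapped to a frequency for the radio.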
<h2>
<br />Alexa Skill Description</h2>
Imagine if you could use and listen to your radio station from anywhere just by saying the magic words "Alexa, ask DX Cluster to list spots."<br />
<br />
<br />
<br />
Alexa would then go to a DX cluster, find the latest spots on SSB (or CW) and allow you to select the spot you want to follow. By just saying "Select seven" Alexa would set your radio to that frequency and start playing the audio.<br />
<br />
<div style="text-align: left;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAK38M5S2DpumrlJHNmyFXnGAYlDkOlq4sHyflB9n5M4Z4AP6ZVu8xaQITfDL9f_ztpGPEka_3LiBnvLqe1WihwuAKgfPbbfyBQztkMOSzIHjG4fsgjwQl4ubKnbln8Ci69s-hX0Utzrc/s1600/List_of_Cluster_spots.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="311" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAK38M5S2DpumrlJHNmyFXnGAYlDkOlq4sHyflB9n5M4Z4AP6ZVu8xaQITfDL9f_ztpGPEka_3LiBnvLqe1WihwuAKgfPbbfyBQztkMOSzIHjG4fsgjwQl4ubKnbln8Ci69s-hX0Utzrc/s400/List_of_Cluster_spots.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. Alexa DX Cluster Skill output </td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<h2>
System Architecture </h2>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Figure 3 below shows all the main components of this solution. I have a Thinkpad X301 laptop connected to the Elecraft KX3 radio with a KXUSB serial cable, using the built-in audio interface. The X301 is running several processes: one for recording the audio into MP3 files, hamlib rigctld to control the radio, and a web server that allows the Alexa skill to control the frequency and retrieve the recorded MP3 files.<br />
<br />
I implemented the "DX Cluster" Alexa skill using the Amazon Web Services cloud. The main services used are Amazon API Gateway and AWS Lambda. <br />
<br />
The simplified sequence of events is shown in the figure below:<br />
<br />
1. User says "Alexa, ask DX Cluster to list spots". Amazon Echo device sends the voice file to Amazon Alexa service that does the voice recognition.<br />
<br />
2. Amazon Alexa determines that the skill is "DX Cluster" and sends a JSON-formatted request to the configured endpoint in API Gateway.<br />
<br />
3. API Gateway forwards the request to AWS Lambda, which loads my Python software.<br />
<br />
4. My "DX Cluster" software parses the JSON request and calls the "ListIntent" handler. If not already loaded, it makes a web API request to pull the latest DX cluster data from ham.qth.com. The software then converts the text to SSML format for speech output and returns the list of spots to the Amazon Echo device. <br />
<br />
5. If the user says "Select one" (the top one on the list), the frequency of the selected spot is sent to the webserver running on the X301 laptop. The webserver changes the radio frequency using the rigctl command and returns the URL of the most recently recorded MP3. This URL is passed to the Amazon Echo device to start the playback.<br />
<br />
6. The Amazon Echo device retrieves the MP3 file from the X301 web server and starts playing it.<br />
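The skill logic in steps 4 and 5 can be sketched as a minimal AWS Lambda handler. The intent names, the slot name, and the hard-coded spot list below are illustrative assumptions, not the actual skill code.

```python
# Minimal sketch of an Alexa skill Lambda handler. The intents
# ("ListIntent"/"SelectIntent"), the "number" slot and the spot data
# are hypothetical placeholders for the real skill's cluster feed.
SPOTS = [
    {"call": "OH2BH", "freq_khz": 14205.0, "mode": "SSB"},
    {"call": "ZL1AB", "freq_khz": 7155.0, "mode": "SSB"},
]

def build_response(text):
    """Wrap plain text into the Alexa JSON response envelope as SSML."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": f"<speak>{text}</speak>"},
            "shouldEndSession": False,
        },
    }

def lambda_handler(event, context):
    intent = event["request"].get("intent", {}).get("name", "")
    if intent == "ListIntent":
        lines = [f"Spot {i + 1}: {s['call']} on {s['freq_khz']} kilohertz."
                 for i, s in enumerate(SPOTS)]
        return build_response(" ".join(lines))
    if intent == "SelectIntent":
        n = int(event["request"]["intent"]["slots"]["number"]["value"])
        spot = SPOTS[n - 1]
        # The real skill would POST spot["freq_khz"] to the shack
        # webserver here and return an MP3 URL for Echo to play.
        return build_response(f"Tuning to {spot['call']}.")
    return build_response("Say list spots, or select a number.")
```

The real handler also manages the audio playback directives; this sketch only shows the request parsing and intent dispatch.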
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxls2gotW1TScKj9pPidBJnvUAZB-bOUuP1qkPSV8NRgQnX7wwaNXEUdzyFJA82wWC-xHGwDiTfMxMqixWtiVXyEwtrhdpwNCpvIK7ABWSfcZGX8d-uT3K9nqO8EhHrRnGD-d2MH7ckZo/s1600/Alexa_DX_Cluster_Skill.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxls2gotW1TScKj9pPidBJnvUAZB-bOUuP1qkPSV8NRgQnX7wwaNXEUdzyFJA82wWC-xHGwDiTfMxMqixWtiVXyEwtrhdpwNCpvIK7ABWSfcZGX8d-uT3K9nqO8EhHrRnGD-d2MH7ckZo/s400/Alexa_DX_Cluster_Skill.png" width="308" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 3. System Architecture</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<h2>
Software </h2>
<div>
As this is just a proof of concept, the software is still very fragile and not ready for publishing. It is written in Python and makes heavy use of open source components, such as </div>
<div>
<br /></div>
<div>
<ul>
<li>hamlib - for controlling the Elecraft KX3 radio</li>
<li>rotter - for recording MP3 files from the radio </li>
<li>Flask - Python web framework </li>
<li>Boto3 - AWS Python libraries</li>
<li>Zappa - serverless Python services</li>
</ul>
<div>
<br /></div>
</div>
<div>
Once the software is a bit more mature I could post it on Github if there is any interest in it from the ham radio community. </div>
<div>
<br /></div>
<div>
<br /></div>
<div>
73</div>
<div>
Mauri AG1LE </div>
<div>
<br /></div>
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 11.0px Menlo; color: #000000; background-color: #ffffff}
span.s1 {font-variant-ligatures: no-common-ligatures}
</style>ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-2307242307382283172016-02-06T21:08:00.002-05:002016-02-16T19:10:46.450-05:00KX3 Remote Control and audio streaming with Raspberry Pi 2 <h3>
REMOTE CONTROL OF ELECRAFT KX3</h3>
I wanted to control my Elecraft KX3 transceiver remotely using my Android phone. A quick Internet search yielded <a href="http://kx3companion.com/kx3remote/" target="_blank">this site</a> by Andrea IU4APC. His KX3 Companion application on Android allows remote control using a Raspberry Pi 2, and he also has links to an audio streaming application called Mumble.<br />
<br />
I did a quick ham shack inventory of hardware and software and realized that I already had everything required for this project. <br />
<br />
A short video of how this works is on YouTube:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<iframe allowfullscreen="" class="YOUTUBE-iframe-video" data-thumbnail-src="https://i.ytimg.com/vi/QRoJdlhE3VY/0.jpg" frameborder="0" height="266" src="https://www.youtube.com/embed/QRoJdlhE3VY?feature=player_embedded" width="320"></iframe></div>
<br />
<br />
KX3, Raspberry Pi 2 and Android phone connected together over WiFi.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghQxr6c_5fqATfRr_4uK0mXuwaXlhK58_LxOrbY-r_om2MfPuRKZ8LplgCV7hhf5habu7KeeoydLqhroj5pepk7HhNYq37cjWab0aOuk7MUPPbp9S86hGW6R80HEMTVZdh_ym1lIPz8yI/s1600/WP_20160206_006.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghQxr6c_5fqATfRr_4uK0mXuwaXlhK58_LxOrbY-r_om2MfPuRKZ8LplgCV7hhf5habu7KeeoydLqhroj5pepk7HhNYq37cjWab0aOuk7MUPPbp9S86hGW6R80HEMTVZdh_ym1lIPz8yI/s400/WP_20160206_006.jpg" width="225" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
HARDWARE COMPONENTS</h3>
Elecraft KX3<br />
<a href="http://www.elecraft.com/elecraft_prod_list.htm" target="_blank">Elecraft KXUSB Serial Cable for KX3</a><br />
Raspberry Pi 2 with Raspbian Linux. I have a 32 GB SD memory card; 8 GB should also work.<br />
<a href="http://www.amazon.com/gp/product/B000KW2YEI" target="_blank">Behringer UCA202 USB Audio Interface </a> and audio cables<br />
Android Phone (I have <a href="https://oneplus.net/one" target="_blank">OnePlus One</a>)<br />
<br />
<br />
<h3>
CONFIGURE RASPBERRY PI AND KX3 COMPANION APP</h3>
Following <a href="http://kx3companion.com/kx3remote/" target="_blank">the instructions</a>, I plugged the KXUSB serial cable into the KX3 ACC1 port and into one of the Raspberry Pi's USB ports. <br />
<br />
I installed ser2net with the following commands on the command line:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo apt-get update </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo apt-get install ser2net </span><br />
<br />
then I edited the /etc/ser2net.conf file:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo nano /etc/ser2net.conf </span><br />
<br />
and added the following line:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 7777:raw:0:/dev/ttyUSB0:38400 8DATABITS NONE 1STOPBIT</span><br />
<br />
and saved the file by pressing CTRL+X and then Y<br />
<br />
Then I started ser2net:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">ser2net </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo /etc/init.d/ser2net restart </span><br />
<br />
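Before moving to the phone app, you can verify the ser2net bridge from any machine on the network by sending an Elecraft/Kenwood-style CAT command over the raw TCP port. This helper is my own quick sanity check, not part of the KX3 Companion setup.

```python
import socket

def read_vfo_a(host, port=7777):
    """Read the KX3's VFO A frequency (in Hz) through the ser2net raw
    TCP bridge by sending the CAT command 'FA;'."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b"FA;")
        reply = s.recv(32).decode()  # e.g. "FA00014060000;"
    # Strip the "FA" prefix and trailing ";" to get the frequency digits
    return int(reply[2:-1])
```

If this returns the frequency shown on the KX3 display, the serial bridge is working and the app should connect.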
Once done with the host I downloaded the KX3 Companion app (<a href="https://play.google.com/store/apps/details?id=com.iu4apc.kx3companion_free&hl=en" target="_blank">link here</a>) on my Android phone and opened the app.<br />
<br />
To enable the KX3 remote functionality you have to edit three options in the “Remote Settings” section. First, check the “Use KX3Remote/Piglet/Pigremote” option.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7YReAmVHCr3PnLPGsAhLjcKj_4JkXMs_KL-mSBTrNB72lhDJLDUYy0YV8pWS4tRYvs742Nd9-txjj-1I3Zkj2hUdTmDxef15YXKkbm7Z7UWl-hmi6bUOvf1Kh5bJBOZ4ikN8dY4663vM/s1600/Screenshot_2016-02-06-16-10-20.png" imageanchor="1" style="clear: left; display: inline !important; margin-bottom: 1em; margin-right: 1em; text-align: center;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi7YReAmVHCr3PnLPGsAhLjcKj_4JkXMs_KL-mSBTrNB72lhDJLDUYy0YV8pWS4tRYvs742Nd9-txjj-1I3Zkj2hUdTmDxef15YXKkbm7Z7UWl-hmi6bUOvf1Kh5bJBOZ4ikN8dY4663vM/s320/Screenshot_2016-02-06-16-10-20.png" width="180" /></a><br />
<br />
Next, set your PC/Raspberry Pi IP address in the “KX3Remote/Piglet/Pigremote IP” option. The steps below assume that your Raspberry Pi and Android phone are connected to the same WiFi network.<br />
<br />
In my case the Raspberry Pi is using the wlan0 interface connected to the WiFi router, and its IP address is 192.168.0.47. This address depends on your local network configuration; you can get the Raspberry Pi IP address using the command<br />
<div style="font-family: serif;">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span></div>
<div style="font-family: serif;">
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">ip addr show </span></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2NDTFrnqX-udedQHJ4R9ZBfzyfJ9_oZG0qiIQdHetFUKV5BfeRem6V890_d1HGx7WbK0paAML5U5gZxWjb5RlcfQI7bV2v2ceIK9Z2qsGv0joRXkcilNuCEdnpfbc-sZ7zTGF-16rjXc/s1600/Screenshot_2016-02-06-16-10-35.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg2NDTFrnqX-udedQHJ4R9ZBfzyfJ9_oZG0qiIQdHetFUKV5BfeRem6V890_d1HGx7WbK0paAML5U5gZxWjb5RlcfQI7bV2v2ceIK9Z2qsGv0joRXkcilNuCEdnpfbc-sZ7zTGF-16rjXc/s320/Screenshot_2016-02-06-16-10-35.png" width="180" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Finally, set the chosen port number (7777) in the “KX3Remote/Piglet/Pigremote Port” option.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhriqyZITXtD0Hvzq1CoSynfpdqgWPREJd62rO7ICQx7Sw2hb7Nz9dTpNqZDWeQs4EitJR1wLJz0neCMyE0WsW5jZSDPRrBNhfv54TtSKMij_VmIfplqkmWSRpFqnGraoSAoDd0YkGzqzA/s1600/Screenshot_2016-02-06-16-10-42.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhriqyZITXtD0Hvzq1CoSynfpdqgWPREJd62rO7ICQx7Sw2hb7Nz9dTpNqZDWeQs4EitJR1wLJz0neCMyE0WsW5jZSDPRrBNhfv54TtSKMij_VmIfplqkmWSRpFqnGraoSAoDd0YkGzqzA/s320/Screenshot_2016-02-06-16-10-42.png" width="180" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Now you can test the connection. By tapping the "ON" button in the top left corner you can see if the connection was successful. The message "Connected to Piglet/Pigremote" should show up at the bottom - see below:<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6YoEYglxiEiYJxoBcgaZGB2oH2bDZuhat0CDc0Z81Lix7I5uaD4u3zC5Zx3jJOjfXMUuDMA3Lqe0TaF4qdVJhx7joF7erBVqSgik2-k_ysrv6cx-n7TG1J4gYmJB33dy7nV2rgU_Ztng/s1600/Screenshot_2016-02-06-16-02-05.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6YoEYglxiEiYJxoBcgaZGB2oH2bDZuhat0CDc0Z81Lix7I5uaD4u3zC5Zx3jJOjfXMUuDMA3Lqe0TaF4qdVJhx7joF7erBVqSgik2-k_ysrv6cx-n7TG1J4gYmJB33dy7nV2rgU_Ztng/s320/Screenshot_2016-02-06-16-02-05.png" width="180" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
If you are having problems with this, here are some troubleshooting ideas:<br />
<br />
<ul>
<li>check the <span style="font-family: serif;">Raspberry Pi</span> IP address again</li>
<li>check that Raspberry Pi and Android Phone are on the same Wifi network</li>
<li>check that your KX3 serial port is set to 38400 baud (this is the default in the KX3 Companion app) </li>
</ul>
If everything works, you should be able to change the frequency and the band on the KX3 by tapping the Band+/Band- and Freq+/Freq- buttons in the app. The current KX3 frequency will be updated in the FREQUENCY field between the buttons as you turn the VFO on the KX3.<br />
<br />
<br />
<h3 style="font-family: serif;">
CONFIGURE RASPBERRY PI 2 FOR AUDIO </h3>
Plug the USB audio interface into a Raspberry Pi 2 USB port. In my case I used a Behringer UCA202, but there are many other alternatives available.<br />
<br />
The audio server is called Mumble. This is a low-latency Voice over IP (VoIP) server designed for the gaming community, but it works well for streaming audio from the KX3 to the Android phone and back. There is <a href="http://pimylifeup.com/raspberry-pi-mumble-server/" target="_blank">a great page</a> that describes the installation in more detail.<br />
<br />
I used the following commands to install the Mumble VoIP server:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> sudo apt-get install mumble-server</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> sudo dpkg-reconfigure mumble-server</span><br />
<br />
<br />
The last command will present you with a few options; set these however you would like Mumble to operate.<br />
<br />
<ul>
<li>Autostart: I selected Yes </li>
<li>High Priority: I selected Yes (This ensures Mumble will always be given top priority even when the Pi is under a lot of stress) </li>
<li>SuperUser: Set the password here. This account will have full control over the server.</li>
</ul>
<br />
You need to know the Raspberry Pi 2's IP address when configuring the Mumble client. Write it down, as you will need it shortly. In my case it was 192.168.0.47:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">ip addr show</span><br />
<br />
You may want to edit the server configuration file. I didn't make any changes, but the installation page recommends changing the welcome text and server password. You can do this using:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo nano /etc/mumble-server.ini</span><br />
<br />
Finally, you need to restart the server:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo /etc/init.d/mumble-server restart</span><br />
<br />
Now that we have the mumble server running we need to install the Mumble client on Raspberry Pi 2. This can be done with this command:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sudo apt-get install mumble</span><br />
<br />
Next you start the client application by typing:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">mumble</span><br />
<br />
This starts the Mumble client. First you need to go through some configuration windows.<br />
<br />
You need to have the USB audio interface input connected to the KX3 Phones output when going through the Mumble Audio Wizard. I turned the audio volume to approximately 30.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicJhgnMoFNSNYoFxwGy9ILml7lZC3zsig7icKPQJkhhLWpzJsGtkDggNHMPhEx8k2hoNnVLkfFoWtHNOdk5geCwOKeEEumHh0AmQhHw1wMMRZTuL7DxpdxK4pH4lhai3ilVmqAsgZZg0Q/s1600/mumble_audio1.png" imageanchor="1" style="clear: left; display: inline !important; margin-bottom: 1em; margin-right: 1em; text-align: center;"><img border="0" height="244" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicJhgnMoFNSNYoFxwGy9ILml7lZC3zsig7icKPQJkhhLWpzJsGtkDggNHMPhEx8k2hoNnVLkfFoWtHNOdk5geCwOKeEEumHh0AmQhHw1wMMRZTuL7DxpdxK4pH4lhai3ilVmqAsgZZg0Q/s320/mumble_audio1.png" width="320" /></a><br />
<br />
You need to select the USB audio device as the input device. The default device is "Default ALSA Device", which is the onboard audio chip. In the Device drop-down list, select "SysDefault card - USB Audio Codec" as shown in the picture below.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ5jjzzCt4lvwoxaKo0ZDswT7TFwN1pXqew7nlWlytRiXG2TY4RfReWKwPnu6zcn85ZL371Fm3rrUZ2It5dXD5hbAtXcEJf_zHyti-HNvmB_VPI9jlpiu4RGpUf000WexJOABcJnE2q1o/s1600/mumble_audio2.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ5jjzzCt4lvwoxaKo0ZDswT7TFwN1pXqew7nlWlytRiXG2TY4RfReWKwPnu6zcn85ZL371Fm3rrUZ2It5dXD5hbAtXcEJf_zHyti-HNvmB_VPI9jlpiu4RGpUf000WexJOABcJnE2q1o/s320/mumble_audio2.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The drop-down list might look different depending on your hardware configuration. Select the SysDefault USB device.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwjuCWPEJ-xTHDAyRaThi2dOl_zVc8dsxh2ccYKe5fHb2Ac-HuAl7Sz4pj2HKNAVpo8wihFx3BsLn_MwXdndko02mnYpB_Xa4guYLE8xkYMllJvE37r6zApO4mGuDLRs_esd9WXxbL3aA/s1600/mumble_audio3.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="124" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwjuCWPEJ-xTHDAyRaThi2dOl_zVc8dsxh2ccYKe5fHb2Ac-HuAl7Sz4pj2HKNAVpo8wihFx3BsLn_MwXdndko02mnYpB_Xa4guYLE8xkYMllJvE37r6zApO4mGuDLRs_esd9WXxbL3aA/s320/mumble_audio3.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Once the Input and Output devices have been selected you can move forward with Next.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizc-sfTmwwThzPRHXsSpT4YRkqB-7eOLj36mwbKChaETv5fCAUxNYBOvVSpQiRm64bfpl3qTB7ZyK1fqIaXCnXLh_cyB1H6654zr4eGpHUCpybZ5uE7lCTwWT-iIpLdiWZcKP3Rqn998w/s1600/mumble_audio4.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizc-sfTmwwThzPRHXsSpT4YRkqB-7eOLj36mwbKChaETv5fCAUxNYBOvVSpQiRm64bfpl3qTB7ZyK1fqIaXCnXLh_cyB1H6654zr4eGpHUCpybZ5uE7lCTwWT-iIpLdiWZcKP3Rqn998w/s320/mumble_audio4.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Next comes device tuning. I selected the longest delay for best sound quality.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdIzxy0leJwIAX8xbCCwUpz7WuQmJhTIgW1eSsus9-YLoepIjdNToj5ahT3SEuwgcIMG4kXr3NzmK4PXUIiManLr_CPXFl68CHyOaK03IJ6ZlA4Iv9d08ceFbcG24nEVDwWWu6_IkTVCk/s1600/mumble_audio5.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgdIzxy0leJwIAX8xbCCwUpz7WuQmJhTIgW1eSsus9-YLoepIjdNToj5ahT3SEuwgcIMG4kXr3NzmK4PXUIiManLr_CPXFl68CHyOaK03IJ6ZlA4Iv9d08ceFbcG24nEVDwWWu6_IkTVCk/s320/mumble_audio5.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Next comes volume tuning. Make sure that the KX3 audio volume is at least 30. You should see a blue bar moving in sync with the KX3 audio. Follow the instructions.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8V1k2Fg2FVOnx_F0QP_2U_za8x-EzlmVlQ8_yQyahkFtwM98EJ6msuSy-Flt7SkZgQY01yim1bIaiJx-BH8ZFF0-ongcR5VUJSPzDrDRb5t9Y1t3KqgkeKUz-cemcM7AVRZD_65XaX44/s1600/mumble_audio6.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi8V1k2Fg2FVOnx_F0QP_2U_za8x-EzlmVlQ8_yQyahkFtwM98EJ6msuSy-Flt7SkZgQY01yim1bIaiJx-BH8ZFF0-ongcR5VUJSPzDrDRb5t9Y1t3KqgkeKUz-cemcM7AVRZD_65XaX44/s320/mumble_audio6.png" width="320" /></a></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhj_4-GwDMqs9XGkPuFLLhFdq_HBd9FzwKqsgkas175wqJjYuDg7JG9XJbjHy7aQrXT1cGa-3z4k0GI4u7VyWuA8z28enKHSuldhriOvqfgdqI-RNeuLCDhPOzvVSKJ7MyBewtzDWDPuQo/s1600/mumble_audio7.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhj_4-GwDMqs9XGkPuFLLhFdq_HBd9FzwKqsgkas175wqJjYuDg7JG9XJbjHy7aQrXT1cGa-3z4k0GI4u7VyWuA8z28enKHSuldhriOvqfgdqI-RNeuLCDhPOzvVSKJ7MyBewtzDWDPuQo/s1600/mumble_audio7.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhj_4-GwDMqs9XGkPuFLLhFdq_HBd9FzwKqsgkas175wqJjYuDg7JG9XJbjHy7aQrXT1cGa-3z4k0GI4u7VyWuA8z28enKHSuldhriOvqfgdqI-RNeuLCDhPOzvVSKJ7MyBewtzDWDPuQo/s1600/mumble_audio7.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhj_4-GwDMqs9XGkPuFLLhFdq_HBd9FzwKqsgkas175wqJjYuDg7JG9XJbjHy7aQrXT1cGa-3z4k0GI4u7VyWuA8z28enKHSuldhriOvqfgdqI-RNeuLCDhPOzvVSKJ7MyBewtzDWDPuQo/s1600/mumble_audio7.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhj_4-GwDMqs9XGkPuFLLhFdq_HBd9FzwKqsgkas175wqJjYuDg7JG9XJbjHy7aQrXT1cGa-3z4k0GI4u7VyWuA8z28enKHSuldhriOvqfgdqI-RNeuLCDhPOzvVSKJ7MyBewtzDWDPuQo/s320/mumble_audio7.png" width="320" /></a></div>
Next comes the voice activity detection setting. Follow the instructions. <br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3BiFJ6g3NhJ0apDkcMw6HBfapQjgZUTv6OQutLHmPxGbAXRsE9FhX5BuiMlWAczf2O8yZuBmmg7cG6ybYGYEdzxxvtJGmX8PRX9YRgmwRDLfmEf_YDf7yCevEduIL73HRE01U0Yt1iq0/s1600/mumble_audio8.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3BiFJ6g3NhJ0apDkcMw6HBfapQjgZUTv6OQutLHmPxGbAXRsE9FhX5BuiMlWAczf2O8yZuBmmg7cG6ybYGYEdzxxvtJGmX8PRX9YRgmwRDLfmEf_YDf7yCevEduIL73HRE01U0Yt1iq0/s1600/mumble_audio8.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3BiFJ6g3NhJ0apDkcMw6HBfapQjgZUTv6OQutLHmPxGbAXRsE9FhX5BuiMlWAczf2O8yZuBmmg7cG6ybYGYEdzxxvtJGmX8PRX9YRgmwRDLfmEf_YDf7yCevEduIL73HRE01U0Yt1iq0/s1600/mumble_audio8.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj3BiFJ6g3NhJ0apDkcMw6HBfapQjgZUTv6OQutLHmPxGbAXRsE9FhX5BuiMlWAczf2O8yZuBmmg7cG6ybYGYEdzxxvtJGmX8PRX9YRgmwRDLfmEf_YDf7yCevEduIL73HRE01U0Yt1iq0/s320/mumble_audio8.png" width="320" /></a></div>
Next comes quality selection. I selected high, as I am testing this on the local LAN.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq8AiwkajwFQkf48vwIATXvYsBWNrEgHMfy_Mk1UhkmEDJACCgplj4LiKpr89bV52ePnj2b3WXiTL_ZUeQsTs0qoHF9khDIx5Ll5-zWhhojJ0ZnLL5GnowjVnkGas3J6ggRjmUBAFCrdw/s1600/mumble_audio9.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq8AiwkajwFQkf48vwIATXvYsBWNrEgHMfy_Mk1UhkmEDJACCgplj4LiKpr89bV52ePnj2b3WXiTL_ZUeQsTs0qoHF9khDIx5Ll5-zWhhojJ0ZnLL5GnowjVnkGas3J6ggRjmUBAFCrdw/s1600/mumble_audio9.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq8AiwkajwFQkf48vwIATXvYsBWNrEgHMfy_Mk1UhkmEDJACCgplj4LiKpr89bV52ePnj2b3WXiTL_ZUeQsTs0qoHF9khDIx5Ll5-zWhhojJ0ZnLL5GnowjVnkGas3J6ggRjmUBAFCrdw/s1600/mumble_audio9.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="162" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq8AiwkajwFQkf48vwIATXvYsBWNrEgHMfy_Mk1UhkmEDJACCgplj4LiKpr89bV52ePnj2b3WXiTL_ZUeQsTs0qoHF9khDIx5Ll5-zWhhojJ0ZnLL5GnowjVnkGas3J6ggRjmUBAFCrdw/s320/mumble_audio9.png" width="320" /></a></div>
Audio settings are now completed.<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBiw61kjUaBDSfHu3fEfOju_-oP2_PuzMuvhsO3XS4QV2jhnk-nuEZAmuFYxsrE4ydyRNNwubY65jonf2BWr5epZhqc1qe1oBg830wa4XXBaEKKc3A2F7vta1OFOxG9lEJTgSkDUQg2Wc/s1600/mumble_audio10.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="223" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiBiw61kjUaBDSfHu3fEfOju_-oP2_PuzMuvhsO3XS4QV2jhnk-nuEZAmuFYxsrE4ydyRNNwubY65jonf2BWr5epZhqc1qe1oBg830wa4XXBaEKKc3A2F7vta1OFOxG9lEJTgSkDUQg2Wc/s320/mumble_audio10.png" width="320" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
Next comes the server connection. Tap "Add New..." and enter the IP address that you wrote down earlier. I gave the server the label "raspberrypi" and the username "pi". You don't have to change the port.</div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0Msj9C0X-dWb7ilCGScyQF4RRj_rcCAVLK5-q8YOe8H83eW8ZEgU-h5Hb9luxMxBy2vNuAk3POgRkeUPlkGG85wzTowUI_9SLfSWIIvjaNKh8W4-By3_kI0o6syAzC5DFeKP33oJZ6lU/s1600/mumble_audio11.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0Msj9C0X-dWb7ilCGScyQF4RRj_rcCAVLK5-q8YOe8H83eW8ZEgU-h5Hb9luxMxBy2vNuAk3POgRkeUPlkGG85wzTowUI_9SLfSWIIvjaNKh8W4-By3_kI0o6syAzC5DFeKP33oJZ6lU/s1600/mumble_audio11.png" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
When you connect to the server you should see a view like the one below.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9trSqUQduNi2TxMVlwg3cU6mV0VeQCndpy5Qo_NMpXrwUmR4mxonUHg8GyBK-77vVCVLuloSD6JrvWu97NuaFG0EKkT1-PuEX67Bfdi03y7nedMLOUoS2bG5dlYLpRqobz2nxiloloU4/s1600/mumble_audio12.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="224" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9trSqUQduNi2TxMVlwg3cU6mV0VeQCndpy5Qo_NMpXrwUmR4mxonUHg8GyBK-77vVCVLuloSD6JrvWu97NuaFG0EKkT1-PuEX67Bfdi03y7nedMLOUoS2bG5dlYLpRqobz2nxiloloU4/s320/mumble_audio12.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The next step is to download a Mumble client on the Android phone and configure it.<br />
<br />
<br />
<h3>
CONFIGURE ANDROID PHONE </h3>
I downloaded the <a href="https://play.google.com/store/apps/details?id=com.morlunk.mumbleclient.free&hl=en" target="_blank">free Mumble client called Plumble</a> on my Android phone. You need to configure the client to connect to the Mumble server running on the Raspberry Pi 2. Once you open the Plumble client, tap the "+" sign in the top right corner.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVGqXYyBbOtHBnCohBz0_kz6rRkjVeLpGtTKrIb3-asDaNIw2ZloJB0OKRQsjCEX45EUn_9DDxhRmBy_v-j_RIcZFOjf71FTOP35pFK2gi2Bmvdsf9iCYdJ7VzEvxvXlxKGaoRhuUVxf8/s1600/Screenshot_2016-02-06-20-50-14.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVGqXYyBbOtHBnCohBz0_kz6rRkjVeLpGtTKrIb3-asDaNIw2ZloJB0OKRQsjCEX45EUn_9DDxhRmBy_v-j_RIcZFOjf71FTOP35pFK2gi2Bmvdsf9iCYdJ7VzEvxvXlxKGaoRhuUVxf8/s320/Screenshot_2016-02-06-20-50-14.png" width="180" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
I gave the label "KX3" and the IP address of the Mumble server running on the Raspberry Pi 2 - in my case the IP address is 192.168.0.47. For the username I used my ham radio call sign.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhD2KOAY9q3ud8tT2chBnqhq7iko15AmcrNBr_dj1rc0RHQIkNlKS3WfgPabApypBBsV-cnV1xBAUSm1WcvxEBmmQzC5uI4Lu2B3MI6Z5iKSjRoE1PijslmLC4WV2m0FNc9j7AWvQk3-Uw/s1600/Screenshot_2016-02-06-20-50-51.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhD2KOAY9q3ud8tT2chBnqhq7iko15AmcrNBr_dj1rc0RHQIkNlKS3WfgPabApypBBsV-cnV1xBAUSm1WcvxEBmmQzC5uI4Lu2B3MI6Z5iKSjRoE1PijslmLC4WV2m0FNc9j7AWvQk3-Uw/s320/Screenshot_2016-02-06-20-50-51.png" width="180" /></a></div>
<br />
Since I did not configure a password on my server I left that field empty. Once the server has been added, you can try to connect to it.<br />
<br />
<h3 style="font-family: serif;">
OPERATION</h3>
If everything has gone well you should be able to connect to the Mumble VoIP server and hear the radio audio on your mobile phone.<br />
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfXPNdSSnfkd3QPc9f8FbZxtATUCkTeCONBM0Xr1qHxOibff_MAxVVgfBQo9OqGJdOmxZOoBQ6KV-GoIBpj-m7LBtRjbhTqAXChg-GUy7oJgeQBSJYSJ0xyy4x0DzEwwCKcLAKl_rGjAw/s1600/Screenshot_2016-02-06-20-51-47.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfXPNdSSnfkd3QPc9f8FbZxtATUCkTeCONBM0Xr1qHxOibff_MAxVVgfBQo9OqGJdOmxZOoBQ6KV-GoIBpj-m7LBtRjbhTqAXChg-GUy7oJgeQBSJYSJ0xyy4x0DzEwwCKcLAKl_rGjAw/s320/Screenshot_2016-02-06-20-51-47.png" width="180" /></a></div>
<br />
<br />
<div>
On the Raspberry Pi 2 you should see that another client ("AG1LE" in my case) has connected to the server. See the example below: </div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDuOwK-O22dQnmi4Tm8EpbmjYNPznmngiqoP6m3dgy4zfr4lexMtx86XAdcrVjQRbST3vbezSxGmBpUL45SZl_d_bRUV0WifKW9EZrF-TrVrF5-tONIyNl-KKEeHWhKnrwpJdNvpUJDGI/s1600/mumble_audio13.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="224" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDuOwK-O22dQnmi4Tm8EpbmjYNPznmngiqoP6m3dgy4zfr4lexMtx86XAdcrVjQRbST3vbezSxGmBpUL45SZl_d_bRUV0WifKW9EZrF-TrVrF5-tONIyNl-KKEeHWhKnrwpJdNvpUJDGI/s320/mumble_audio13.png" width="320" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
NEXT STEPS </h3>
If you want to extend from just listening to the KX3 to actually operating remotely, you need to configure your WiFi router to allow connections over the Internet. Also, the USB audio interface needs to be connected to the microphone (MIC) input of the KX3 radio, and the KX3 must have VOX turned on to enable audio transmission.<br />
<br />
Documenting these steps will take a bit more time, so I will leave them for the next session.<br />
<br />
Did you find these instructions useful? <span style="font-family: serif;">Any comments or feedback?</span><span style="font-family: serif;"> </span><br />
<span style="font-family: serif;"><br /></span>
<span style="font-family: serif;">73 </span><br />
<span style="font-family: serif;">Mauri AG1LE</span><br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-52696408158944983092015-12-27T23:35:00.003-05:002015-12-28T00:29:28.178-05:00TensorFlow: a new LSTM RNN based Morse decoder<h3>
INTRODUCTION</h3>
In my <a href="http://ag1le.blogspot.com/2015/11/experiment-deep-learning-algorithm-for.html">previous post</a> I created an experiment to train an LSTM recurrent neural network (RNN) to detect symbols in noisy Morse code. I continued the experiments, but this time I used the new <a href="https://www.tensorflow.org/">TensorFlow</a> open source library for machine intelligence. The flexible architecture of TensorFlow allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.<br />
<br />
TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.<br />
<br />
<h3>
EXPERIMENT </h3>
I started with the TensorFlow <a href="https://github.com/aymericdamien/TensorFlow-Examples/">MNIST example</a> authored by Aymeric Damien. <a href="https://en.wikipedia.org/wiki/MNIST_database">MNIST</a> is a large database of handwritten digits that is commonly used for machine learning experiments and algorithm development. Instead of training an LSTM RNN model using handwritten characters, I created a Python script to generate a large amount of Morse code training material. I downloaded ARRL Morse training text files and created a <a href="https://raw.githubusercontent.com/ag1le/LSTM_morse/master/arrl2.txt">large text file</a>. From this text file the Python script generates properly formatted training vectors, over 155,000 of them. The software is available as a <a href="https://github.com/ag1le/LSTM_morse/blob/master/LSTM%20RNN%20Example.ipynb">Python notebook in GitHub</a>.<br />
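To make the data-generation step concrete, here is a minimal, hypothetical sketch of how a single character can be turned into a fixed-length keying vector in dit units, using standard Morse timing (dit = 1 unit, dah = 3 units, 1 unit of space between elements). This is a simplified stand-in for the author's generator script, not the actual code; the name <span style="font-family: Courier New, Courier, monospace;">char_to_vector</span> is illustrative.

```python
# Simplified stand-in for the training-vector generator (illustrative only).
# Each character becomes a 0/1 "key down / key up" vector padded to a fixed
# length of 32 dit units, matching the n_steps used in the experiment.

MORSE = {'A': '.-', 'B': '-...', 'E': '.', 'O': '---', 'S': '...', 'T': '-'}

def char_to_vector(ch, n_steps=32):
    """Encode one character as a keying vector padded to n_steps dit units."""
    units = []
    for sym in MORSE[ch]:
        units += [1] * (1 if sym == '.' else 3)  # key down: dit=1 unit, dah=3
        units += [0]                             # key up: 1-unit element space
    units += [0] * (n_steps - len(units))        # pad to the fixed length
    return units

v = char_to_vector('A')       # 'A' = .-
assert len(v) == 32
assert v[:6] == [1, 0, 1, 1, 1, 0]
```

The real generator works from the downloaded ARRL text file and emits over 155,000 such vectors; this toy version only illustrates the encoding idea.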
<br />
The LSTM RNN model has the following parameters:<br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"># Parameters</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">learning_rate = 0.001 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">training_iters = 114000 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">batch_size = 126</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;"># Network Parameters</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">n_input = 1 # each Morse element is normalized to dit length 1 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">n_steps = 32 # timesteps (training material padded to 32 dit length)</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">n_hidden = 128 # hidden layer num of features </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">n_classes = 60 # Morse character set </span><br />
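The parameters above imply specific tensor shapes for each training batch: 126 sequences of 32 timesteps with 1 input value per step, and one-hot targets over 60 character classes. A minimal pure-Python sketch of those shapes (the names <span style="font-family: Courier New, Courier, monospace;">batch_x</span>, <span style="font-family: Courier New, Courier, monospace;">batch_y</span> and <span style="font-family: Courier New, Courier, monospace;">one_hot</span> are illustrative, not from the original notebook):

```python
# Sketch of the batch shapes implied by the parameters above (assumed, not
# the author's exact code). Inputs: (batch_size, n_steps, n_input);
# targets: (batch_size, n_classes) one-hot vectors.

batch_size, n_steps, n_input, n_classes = 126, 32, 1, 60

# One dummy input batch: batch_size sequences, each n_steps x n_input
batch_x = [[[0.0] for _ in range(n_steps)] for _ in range(batch_size)]

def one_hot(index, size):
    """One-hot target vector over the Morse character set."""
    v = [0] * size
    v[index] = 1
    return v

batch_y = [one_hot(0, n_classes) for _ in range(batch_size)]

assert len(batch_x) == 126 and len(batch_x[0]) == 32 and len(batch_x[0][0]) == 1
assert len(batch_y[0]) == 60 and sum(batch_y[0]) == 1
```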
<br />
The training takes approximately 15 minutes on my Thinkpad X301 laptop. The progress of the loss function and accuracy over the training run is depicted in Figure 1 below. The final accuracy was 93.6% after 114,000 training samples.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGlOYYMJSHoUK2DbZAo2BocvKnoaAwk4gzWdX1FJHWH57pNN4YgpiHJUD510_fGQ8VG2Z-4HBSKwA-axtUkdOJ_YXKQRL5laiBtAdN6rR6jTXabK2tcJtzm-GJdr1XwlqqBzpJXQa31Ps/s1600/machinelearning1.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="402" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGlOYYMJSHoUK2DbZAo2BocvKnoaAwk4gzWdX1FJHWH57pNN4YgpiHJUD510_fGQ8VG2Z-4HBSKwA-axtUkdOJ_YXKQRL5laiBtAdN6rR6jTXabK2tcJtzm-GJdr1XwlqqBzpJXQa31Ps/s640/machinelearning1.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. Training progress over time</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
I tested the model with generated data, gradually adding noise to the signals using the "sigma" parameter in the Python scripts. The results are below:<br />
<br />
<b>Test case:</b> QUICK BROWN FOX JUMPED OVER THE LAZY FOX 0123456789<br />
<b>Results:</b><br />
Noise 0.0: QUICK BROWN VOC YUMPED OVER THE LACY VOC ,12P45WOQ.<br />
Noise 0.02: QUICK BROWN VOC YUMPED OVER THE LACY FOC 012P45WOQ.<br />
Noise 0.05: QUICK BROWN VOC YUMPED OVER THE LACQ VOC ,,2P45WO2.<br />
Noise 0.1: Q5IOK BROWN FOX YUMPED O4ER THE LACY FOC 012P4FWO2,<br />
Noise 0.2: .4IOK WDOPD VOO 2FBPIM QFEF TRE WAC2 4OX 0,.PF52Q91<br />
As can be seen above, at "sigma" level 0.2 the decoder starts to make a lot of errors.
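One simple way to quantify these errors is per-position character accuracy between the sent text and the decoded output. The helper below is illustrative only and is not part of the original experiment code:

```python
# Illustrative helper (not from the original experiment): per-position
# character accuracy between the sent text and the decoded output.

def char_accuracy(expected, decoded):
    """Fraction of positions where the decoded character matches."""
    matches = sum(1 for a, b in zip(expected, decoded) if a == b)
    return matches / max(len(expected), len(decoded))

sent    = "QUICK BROWN FOX JUMPED OVER THE LAZY FOX 0123456789"
decoded = "QUICK BROWN VOC YUMPED OVER THE LACY VOC ,12P45WOQ."  # noise 0.0 line
acc = char_accuracy(sent, decoded)
assert 0.7 < acc < 0.8   # about 76% of characters decoded correctly
```

For the noise 0.0 result above this gives roughly 76% character accuracy, showing that even the "clean" decode leaves room for improvement.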
<br />
<h3>
CONCLUSIONS</h3>
<br />
The software learns the Morse code by going through the training vectors multiple times. By going through 114,000 characters in training the model achieves 96.3% accuracy. I did not try to optimize anything and I just used the reference material that came with the TensorFlow library.
This experiment shows that it is possible to build an intelligent Morse decoder that learns the patterns from the data, and that the approach can scale up to more complex models with better accuracy and better tolerance for QSB and noisy signals.<br />
<br />
TensorFlow proved to be a very powerful new machine learning library that was relatively easy to use. The biggest challenge was to figure out what data formats to use with the various API calls. Due to the complexity and richness of the TensorFlow library I am fairly sure that much can be done to improve the efficiency of this software. As TensorFlow has been designed to work on a desktop, server, tablet or even on a mobile phone, this opens new possibilities for building an intelligent, learning Morse decoder on different platforms.<br />
<br />
73
Mauri AG1LEag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com2tag:blogger.com,1999:blog-3326773214329183284.post-13233770027797102402015-11-24T22:35:00.000-05:002015-11-29T22:56:04.522-05:00Experiment: Deep Learning algorithm for Morse decoder using LSTM RNN<h3>
INTRODUCTION</h3>
In my <a href="http://ag1le.blogspot.com/2015/11/creating-training-material-for.html">previous post</a> I created a Python script to generate training material for neural networks.<br />
The goal is to test how well modern <a href="https://en.wikipedia.org/wiki/Deep_learning">deep learning algorithms</a> would work in decoding noisy Morse signals with heavy QSB fading.<br />
<br />
I did some research on various frameworks and found <span style="color: #38761d;"><a href="http://danielhnyk.cz/choosing-framework-building-neural-networks-mainly-rrn-lstm/">this article</a> </span>by Daniel Hnyk. My requirements were quite similar: full Python support, built-in LSTM RNNs, and a simple interface.<br />
He had selected <a href="https://github.com/fchollet/keras">Keras</a>, which is available on GitHub. There is a <a href="https://groups.google.com/forum/#!forum/keras-users">mailing list for Keras users</a> that is fairly active and quite useful for finding support from other users. I installed Keras on my Linux laptop, and using <a href="http://jupyter.org/">Jupyter</a> interactive notebooks it was easy to start experimenting with various neural network configurations.<br />
<br />
<h3>
<br />SIMPLE RECURRENT NEURAL NETWORK EXPERIMENT</h3>
Using various sources and the above mailing list I came up with the following experiment. I have uploaded the <a href="https://raw.githubusercontent.com/ag1le/RNN-Morse/master/RNN-Morse.ipyb">Jupyter notebook file</a> to GitHub in case the reader wants to replicate the experiment.<br />
<br />
The source code and printed output are shown below in <span style="font-family: &quot;courier new&quot; , &quot;courier&quot; , monospace;">courier font</span>, and I have added some commentary as well as the graphs as pictures.<br />
<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [12]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#!/usr/bin/env python</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># MorseEncoder.py - Morse Encoder to generate training material for neural networks</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Generates raw signal waveforms with Gaussian noise and QSB (signal fading) effects</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Provides also the training target variables in separate columns. Example usage:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># WPM= 40 # speed 40 words per minute</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Tq = 4. # QSB cycle time in seconds (typically 5..10 secs)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># sigma = 0.02 # add some Gaussian noise</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># P = signal('QUICK BROWN FOX JUMPED OVER THE LAZY FOX ',WPM,Tq,sigma)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># from matplotlib.pyplot import plot,show,figure,legend</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># from numpy.random import normal</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># figure(figsize=(12,3))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># lb1,=plot(P.t,P.sig,'b',label="sig")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># lb2,=plot(P.t,P.dit,'g',label="dit")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># lb3,=plot(P.t,P.dah,'g',label="dah")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># lb4,=plot(P.t,P.ele,'m',label="ele")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># lb5,=plot(P.t,P.chr,'c',label="chr")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># lb6,=plot(P.t,P.wrd,'r*',label="wrd")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># legend([lb1,lb2,lb3,lb4,lb5,lb6])</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># show()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># P.to_csv("MorseTest.csv")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Copyright (C) 2015 Mauri Niininen, AG1LE</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># MorseEncoder.py is free software: you can redistribute it and/or modify</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># it under the terms of the GNU General Public License as published by</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># the Free Software Foundation, either version 3 of the License, or</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># (at your option) any later version.</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># MorseEncoder.py is distributed in the hope that it will be useful,</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># but WITHOUT ANY WARRANTY; without even the implied warranty of</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># GNU General Public License for more details.</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># You should have received a copy of the GNU General Public License</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># along with bmorse.py. If not, see <http://www.gnu.org/licenses/>.</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">import numpy as np</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">import pandas as pd</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from numpy import sin,pi</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from numpy.random import normal</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">pd.options.mode.chained_assignment = None #to prevent warning messages</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Morsecode = {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '!': '-.-.--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '$': '...-..-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> "'": '.----.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '(': '-.--.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> ')': '-.--.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> ',': '--..--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '-': '-....-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '.': '.-.-.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '/': '-..-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '0': '-----',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '1': '.----',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '2': '..---',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '3': '...--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '4': '....-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '5': '.....',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '6': '-....',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '7': '--...',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '8': '---..',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '9': '----.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> ':': '---...',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> ';': '-.-.-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '<AR>': '.-.-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '<AS>': '.-...',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '<HM>': '....--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '<INT>': '..-.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '<SK>': '...-.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '<VE>': '...-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '=': '-...-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '?': '..--..',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '@': '.--.-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'A': '.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'B': '-...',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'C': '-.-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'D': '-..',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'E': '.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'F': '..-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'G': '--.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'H': '....',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'I': '..',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'J': '.---',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'K': '-.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'L': '.-..',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'M': '--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'N': '-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'O': '---',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'P': '.--.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'Q': '--.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'R': '.-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'S': '...',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'T': '-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'U': '..-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'V': '...-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'W': '.--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'X': '-..-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'Y': '-.--',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> 'Z': '--..',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '\\': '.-..-.',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '_': '..--.-',</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> '~': '.-.-'}</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">def encode_morse(cws):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> s=[]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> for chr in cws:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> try: # try to find CW sequence from Codebook</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> s += Morsecode[chr]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> s += ' '</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> except:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if chr == ' ':</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> s += '_'</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> continue</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> print "error: '%s' not in Codebook" % chr</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return ''.join(s)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">def len_dits(cws):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # length of string in dit units, include spaces</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> val = 0</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> for ch in cws:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch == '.': # dit len + el space </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> val += 2</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch == '-': # dah len + el space</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> val += 4</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch==' ': # el space</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> val += 2</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch=='_': # el space</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> val += 7</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return val</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">def signal(cw_str,WPM,Tq,sigma):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # for given CW string i.e. 'ABC ' </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # return a pandas dataframe with signals and symbol probabilities</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # WPM = Morse speed in Words Per Minute (typically 5...50)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # Tq = QSB cycle time (typically 3...10 seconds) </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # sigma = adds gaussian noise with standard deviation of sigma to signal</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> cws = encode_morse(cw_str)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> #print(cws)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # calculate how many milliseconds this string will take at speed WPM</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> ditlen = 1200//WPM # dit length in msec, given WPM (integer division)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> msec = ditlen*(len_dits(cws)+7) # reserve +7 for the last pause</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> t = np.arange(msec)/ 1000. # time array in seconds</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> ix = range(0,msec) # index for arrays</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # Create a DataFrame and initialize</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> col =["t","sig","dit","dah","ele","chr","wrd","spd"]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P = pd.DataFrame(index=ix,columns=col)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.t = t # keep time </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig=np.zeros(msec) # signal stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.dit=np.zeros(msec) # probability of 'dit' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.dah=np.zeros(msec) # probability of 'dah' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.ele=np.zeros(msec) # probability of 'element space' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.chr=np.zeros(msec) # probability of 'character space' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.wrd=np.zeros(msec) # probability of 'word space' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.spd=np.ones(msec)*WPM #speed stored here </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> #pre-made arrays with multiple(s) of ditlen</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> z = np.zeros(ditlen) </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> z2 = np.zeros(2*ditlen)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> z4 = np.zeros(4*ditlen)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dit = np.ones(ditlen)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dah = np.ones(3*ditlen)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> # For all dits/dahs in CW string generate the signal, update symbol probabilities</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i = 0</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> for ch in cws:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch == '.':</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dur = len(dit)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig[i:i+dur] = dit</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.dit[i:i+dur] = dit</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i += dur</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dur=len(z)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig[i:i+dur] = z</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.ele[i:i+dur] = np.ones(dur)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i += dur</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch == '-':</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dur = len(dah)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig[i:i+dur] = dah</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.dah[i:i+dur]= dah</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i += dur </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dur=len(z)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig[i:i+dur] = z</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.ele[i:i+dur] = np.ones(dur)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i += dur</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch == ' ':</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dur = len(z2)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig[i:i+dur] = z2</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.chr[i:i+dur]= np.ones(dur)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i += dur</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if ch == '_':</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> dur = len(z4)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig[i:i+dur] = z4</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.wrd[i:i+dur]= np.ones(dur)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> i += dur</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if Tq > 0.: # QSB cycle time impacts signal amplitude</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> qsb = 0.5 * sin((1./float(Tq))*t*2*pi) +0.55</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig = qsb*P.sig</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if sigma >0.:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig += normal(0,sigma,len(P.sig))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return P</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [13]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">print ('MorseEncoder started')</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">%matplotlib inline</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from matplotlib.pyplot import plot,show,figure,legend, title</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from numpy.random import normal</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">WPM= 40</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Tq = 1.8 # QSB cycle time in seconds (typically 5..10 secs)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">sigma = 0.01 # add some Gaussian noise</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">P = signal('QUICK',WPM,Tq,sigma)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">figure(figsize=(12,3))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">lb1,=plot(P.t,P.sig,'b',label="sig")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">title("QUICK in Morse code - (c) 2015 AG1LE")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">legend([lb1])</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">show()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">print ('MorseEncoder finished. %d datapoints created' % len(P.sig)) </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">MorseEncoder started</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">The Jupyter notebook will plot this graph that basically shows the text 'QUICK' converted to noisy signal with strong QSB fading. This signal goes down close to zero between letters C and K as you can see below. </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpLeu58DL1n9AQbLFQW2wV5hrP3a0DCZjv7aOno4i8sSjPOaKVG4WH0SDEgVkEXSfdPTRzoyGOS6fxtuErGfB-8VF6m2wvBiGCxUa4zR4JxRfQ_iQ9Nb3tXzPEfZnXzrcO96_P7iBtkmw/s1600/index2.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpLeu58DL1n9AQbLFQW2wV5hrP3a0DCZjv7aOno4i8sSjPOaKVG4WH0SDEgVkEXSfdPTRzoyGOS6fxtuErGfB-8VF6m2wvBiGCxUa4zR4JxRfQ_iQ9Nb3tXzPEfZnXzrcO96_P7iBtkmw/s640/index2.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. The training signal containing noise and QSB fading</td></tr>
</tbody></table>
<span style="font-family: inherit;">The next section of the code imports some libraries (including Keras) that is used for Neural Network experimentation. I am also preparing the data to the proper format that Keras requires. </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">MorseEncoder finished. 1950 datapoints created</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [14]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Time Series Testing - Morse case</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">import keras.callbacks</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from keras.models import Sequential </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from keras.layers.core import Dense, Activation, Dropout</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">from keras.layers.recurrent import LSTM</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">import random</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">import numpy as np</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">import matplotlib.pyplot as plt</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">%matplotlib inline</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Data preparation </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># use a 1000-sample window of data to predict the next 100 samples; 850 training windows (nb_samples)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">samples = 1950</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">examples = 1000</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">y_examples = 100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">x = np.linspace(0,1950,samples)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">nb_samples = samples - examples - y_examples</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">data = P.sig</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># prepare input for RNN training - each window becomes 1 timestep with 1000 features</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">input_list = [np.expand_dims(np.atleast_2d(data[i:examples+i]), axis=0) for i in range(nb_samples)]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">input_mat = np.concatenate(input_list, axis=0)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">lb1,=plot(x,data,label="input")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">lb2,=plot(x,P.dit,label="target")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">legend([lb1,lb2])</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">title("training input and target data")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Out[14]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">&lt;matplotlib.text.Text at 0x10c119b50&gt;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">This graph shows the training data (the noisy, fading signal) and the target data (I selected 'dits' in this example). This is just to verify that I have the right datasets selected. </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifPwjNP-Vzlnb_oDF_b4mllzGn_tRfU49BNUND4TUtJDAZcQOU5FoAjRpRQoZMBN1ewamqMqilpebkT1zYmPtN4KDmNekNQ9oXAFG9ihYDHXgeeWbbBR8E6sZGPvg8PNGgGuBhGRkhApQ/s1600/index3.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="438" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifPwjNP-Vzlnb_oDF_b4mllzGn_tRfU49BNUND4TUtJDAZcQOU5FoAjRpRQoZMBN1ewamqMqilpebkT1zYmPtN4KDmNekNQ9oXAFG9ihYDHXgeeWbbBR8E6sZGPvg8PNGgGuBhGRkhApQ/s640/index3.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. Training and target <span style="font-size: 12.8px;">data </span></td></tr>
</tbody></table>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">In the following sections I prepare the training target ('dits') to proper format and setup the neural network model. I am using LSTM with Dropout and the model has 300 hidden neurons. I have also a callback function defined to capture the loss data during the training so that I can plot the loss curve to see the training progress. </span><span style="font-family: "courier new" , "courier" , monospace;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [15]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># prepare target - the first column in merged dataframe</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">ydata = P.dit</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">target_list = [np.atleast_2d(ydata[i+examples:examples+i+y_examples]) for i in range(nb_samples)]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">target_mat = np.concatenate(target_list, axis=0)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># set up a model</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">trials = input_mat.shape[0]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">features = input_mat.shape[2]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">hidden = 300</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model = Sequential()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model.add(LSTM(input_dim=features, output_dim=hidden,return_sequences=False))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model.add(Dropout(.2))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model.add(Dense(input_dim=hidden, output_dim=y_examples))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model.add(Activation('linear'))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model.compile(loss='mse', optimizer='rmsprop')</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Call back to capture losses </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">class LossHistory(keras.callbacks.Callback):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> def on_train_begin(self, logs={}):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> self.losses = []</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> def on_batch_end(self, batch, logs={}):</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> self.losses.append(logs.get('loss'))</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Train the model</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">history = LossHistory()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">model.fit(input_mat, target_mat, nb_epoch=100,callbacks=[history])</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Plot the loss curve </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">plt.plot( history.losses)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">title("training loss")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">Here I have started the training. I selected 100 epochs - this means that the software will go through the training material for 100 times during the training. As you can see this goes very quickly - with larger model or larger datasets the training might take minutes to hours per epoch. We have a very small model and small dataset here. </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 1/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.1050 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 2/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0927 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 3/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0870 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 4/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0823 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 5/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0788 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 6/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0756 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 7/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0724 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 8/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0693 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 9/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0668 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 10/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0639 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 11/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0611 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 12/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0586 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 13/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0561 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 14/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0539 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 15/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0519 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 16/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0495 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 17/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0476 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">[... epochs 18-98 omitted; the loss decreases steadily to 0.0166 ...]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 99/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0163 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Epoch 100/100</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">850/850 [==============================] - 0s - loss: 0.0164 </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Out[15]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><matplotlib.text.Text at 0x11e055350></span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">The following graph shows the training loss over the course of training. It gives you an idea of whether training is progressing well or whether there is a problem with the model or its parameters. </span><br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGEdoPQ-3MggOZgRW9Mfbbh-lE37jStcDEmoQPCAslbrfYia-j5OVEwiacMYfYvqSikQlNYLEhBZ95kDd_snavoibz-o_nLvuqIP7Fkkwpxhx6o8jSq78798U3GO13SA4WaAxe7YmGih0/s1600/index4.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="443" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiGEdoPQ-3MggOZgRW9Mfbbh-lE37jStcDEmoQPCAslbrfYia-j5OVEwiacMYfYvqSikQlNYLEhBZ95kDd_snavoibz-o_nLvuqIP7Fkkwpxhx6o8jSq78798U3GO13SA4WaAxe7YmGih0/s640/index4.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 3. Training loss curve</td></tr>
</tbody></table>
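The convergence visible in the log can also be checked numerically. A minimal sketch using a handful of loss values copied from the log above (not the full history):

```python
# A few per-epoch loss values sampled from the Keras training log above
losses = [0.0519, 0.0495, 0.0476, 0.0456, 0.0441,   # early epochs
          0.0230, 0.0215, 0.0200, 0.0185, 0.0164]   # later epochs

# Overall relative improvement and how many sampled steps decreased the loss
improvement = (losses[0] - losses[-1]) / losses[0]
steps = list(zip(losses, losses[1:]))
decreasing = sum(1 for a, b in steps if b < a)
print(f"loss improved {improvement:.0%}; {decreasing}/{len(steps)} sampled steps decreased")
```

A roughly two-thirds reduction in loss with a consistently downward trend is the kind of curve you want to see; a flat or oscillating loss would suggest a learning-rate or model-capacity problem.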
<div style="clear: both;"></div>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [16]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Use training data to check prediction</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">predicted = model.predict(input_mat)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [17]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Plot original data (green) and predicted data (red)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">lb1,=plot(data,'g',label="training")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#lb2,=plot(ydata,'b',label="target")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">lb3,=plot(xrange(examples,examples+nb_samples), predicted[:,1],'r',label="predicted")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">legend([lb1,lb3])</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">title("training vs. predicted")</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Out[17]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><matplotlib.text.Text at 0x11f164610></span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">In this section I check the model's predictions. Since I am feeding it the same training data, it should produce a good result if training was successful. As you can see in Figure 4 below, the predicted graph (<span style="color: red;">red</span>) lines up with the 'dits' in the training signal (<span style="color: lime;">green</span>) despite the QSB fading and noise in the signal. </span><br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJEtTRM0ISIAzXQ1djl1s1h5dUgX4brLv4LtZo8nWf6aMh48isFuRzY2cDK2wainBkb3jhj6MlWAnEVZR8NDA9qgnbFUiLeR0P0VjMWdQs5En13YqOPBgdHpfDqQXqv-7zS-bmmjo4WVg/s1600/index5.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="438" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJEtTRM0ISIAzXQ1djl1s1h5dUgX4brLv4LtZo8nWf6aMh48isFuRzY2cDK2wainBkb3jhj6MlWAnEVZR8NDA9qgnbFUiLeR0P0VjMWdQs5En13YqOPBgdHpfDqQXqv-7zS-bmmjo4WVg/s640/index5.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 4. Training vs. predicted graph</td></tr>
</tbody></table>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">In the following section I will create another Morse signal, this time with text 'KCIUQ' but using the same noise, QSB and speed parameters. I am planning to use this signal to validate how well the model has generalized the 'dit' concept. </span><br />
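The signal is produced by the custom signal() class defined earlier in the notebook. As an illustration of what such a generator involves, here is my own rough sketch, a hypothetical morse_envelope helper using standard PARIS dit timing, Gaussian noise, and a sinusoidal QSB fade; the parameter names loosely mirror the notebook's WPM and sigma, but the implementation is not the author's:

```python
import numpy as np

def morse_envelope(text, wpm=20, fs=100, qsb_hz=0.5, qsb_depth=0.5, sigma=0.2):
    """Noisy, QSB-faded on/off keying envelope for `text` (illustrative only)."""
    MORSE = {'K': '-.-', 'C': '-.-.', 'I': '..', 'U': '..-', 'Q': '--.-'}
    dit = 1.2 / wpm                       # dit length in seconds (PARIS timing)
    n = int(dit * fs)                     # samples per dit
    keying = []
    for ch in text:
        for sym in MORSE[ch]:
            on = n if sym == '.' else 3 * n   # dah = 3 dit lengths
            keying += [1.0] * on + [0.0] * n  # element followed by a 1-dit gap
        keying += [0.0] * (2 * n)             # extend the gap to 3 dits between letters
    env = np.asarray(keying)
    t = np.arange(env.size) / fs
    # Slow sinusoidal fade between full strength and (1 - qsb_depth)
    fading = 1.0 - qsb_depth * (0.5 + 0.5 * np.sin(2 * np.pi * qsb_hz * t))
    rng = np.random.default_rng(1)
    return env * fading + sigma * rng.standard_normal(env.size)

sig = morse_envelope('KCIUQ')
print(sig.size)
```

The dit length of 1.2/WPM seconds comes from the standard PARIS word-timing convention; everything else about the envelope shape here is my own simplification.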
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [18]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># Let's change the input signal, instead of QUICK we have KCIUQ in Morse code </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">P = signal('KCIUQ',WPM,Tq,sigma)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">data = P.sig</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"># prepare input - 1 feature</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">input_list = [np.expand_dims(np.atleast_2d(data[i:examples+i]), axis=0) for i in xrange(nb_samples)]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">input_mat = np.concatenate(input_list, axis=0)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">plt.plot(x,data)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Out[18]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">[<matplotlib.lines.Line2D at 0x136050f90>]</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
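The windowing idiom above (a list comprehension over xrange, this notebook is Python 2) can be vectorized in modern NumPy with sliding_window_view (NumPy 1.20 and later). A sketch with a synthetic signal and illustrative window sizes, not the notebook's actual examples/nb_samples values:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Synthetic stand-ins for the notebook's variables (illustrative values only)
data = np.arange(20, dtype=float)
examples = 5                        # window length
nb_samples = len(data) - examples   # number of windows

# Original approach: one (1, 1, examples) slice per sample, concatenated
input_list = [np.expand_dims(np.atleast_2d(data[i:examples + i]), axis=0)
              for i in range(nb_samples)]
input_mat = np.concatenate(input_list, axis=0)

# Equivalent vectorized form: all windows at once, then add the feature axis
windows = sliding_window_view(data, examples)[:nb_samples]
input_mat2 = windows[:, np.newaxis, :]

assert np.array_equal(input_mat, input_mat2)
```

Both produce a (nb_samples, 1, examples) array; the vectorized form avoids building nb_samples temporary arrays in a Python loop.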
Here is the generated validation Morse signal. It has the same letters as before but in reverse order. Can you read the letters 'KCIUQ' from the graph below?<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEin6St2r_6yuWumS_Xnkn_o3PEh9b2nvwuM8a0XacP6VZ93hGKhm8r7noZoOqbSIdFmmBmZwJUusQ6mxut0HREh-RnRMDd7rbsO9ptS2w_c-klOEEr4BPin8wRDUyL9xfCOAxeO3fvJaq0/s1600/index6.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="422" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEin6St2r_6yuWumS_Xnkn_o3PEh9b2nvwuM8a0XacP6VZ93hGKhm8r7noZoOqbSIdFmmBmZwJUusQ6mxut0HREh-RnRMDd7rbsO9ptS2w_c-klOEEr4BPin8wRDUyL9xfCOAxeO3fvJaq0/s640/index6.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 5. Validation Morse signal</td></tr>
</tbody></table>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: inherit;">In this section I use the above validation signal to create a prediction and then plot the results. </span><br />
<span style="font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">In [19]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">predicted = model.predict(input_mat)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">plt.plot(data,'g')</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">plt.plot(xrange(examples,examples+nb_samples), predicted[:,1],'r')</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Out[19]:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">[<matplotlib.lines.Line2D at 0x1217be9d0>]</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">As you can see from the graph below, the predicted 'dit' symbols (<span style="color: red;">red color</span>) don't really line up with the actual 'dits' in the signal (<span style="color: lime;">green color</span>). This is not a surprise to me. To build a good model that generalizes, you need a lot of training material (typically millions of datapoints), and the model needs enough neural nodes to capture the details of the underlying signals. </span><br />
<span style="font-family: inherit;">In this simple experiment I had only 1950 datapoints and 300 hidden nodes. There are only 8 'dit' symbols in the training material - learning CW well requires much more material and many repetitions, as any human who has gone through the process can testify. The same principle applies to neural networks. </span><br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEja-puTdrmAjcYOsHBKWBxznW42AfGZm95GAUlGhZJ1at6T1SNlaluXeLnyJVkfGebEXEsghFRqcD4A587vlyAXeCo9D_yG8ObH63zrI5l8SheFD6UjcJZQppODpLQUtH6rtuQU2jHD6Lg/s1600/index7.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="414" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEja-puTdrmAjcYOsHBKWBxznW42AfGZm95GAUlGhZJ1at6T1SNlaluXeLnyJVkfGebEXEsghFRqcD4A587vlyAXeCo9D_yG8ObH63zrI5l8SheFD6UjcJZQppODpLQUtH6rtuQU2jHD6Lg/s640/index7.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 6. Validation test </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
CONCLUSIONS </h3>
In this experiment I built a proof of concept to test whether recurrent neural networks (especially the LSTM variant) could learn to detect symbols in noisy Morse code with deep QSB fading. This experiment may contain errors and misunderstandings on my part, as I have had only a few hours to play with the Keras neural network framework. Also, the concept itself still needs more validation, as I may have used the framework incorrectly. <br />
<br />
I think the results look quite promising. In only 100 epochs the RNN model learned 'dits' from the noisy signal and was able to separate them from 'dah' symbols. As the validation test shows, I overfitted the model to the small sample of training material used in the experiment. It will take much more training data and a larger, more complex neural network to learn to generalize the symbols in Morse code. The training process may also need more computing capacity; a graphics card with a GPU would help speed up training going forward.<br />
<br />
Any comments or feedback? <br />
<br />
73<br />
Mauri AG1LE<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com2tag:blogger.com,1999:blog-3326773214329183284.post-62952943951792730502015-11-22T12:15:00.001-05:002015-11-22T12:55:13.062-05:00Creating Training Material for Recurrent Neural Networks<h3>
INTRODUCTION</h3>
In my <a href="http://ag1le.blogspot.com/2015/11/your-next-qso-partner-artificial.html">previous post</a> I shared an experiment I did using Recurrent Neural Network (RNN) software. I started thinking that perhaps RNNs could learn not just QSO language concepts but also how to decode Morse code from noisy signals. Since I was able to demonstrate learning of the syntax, structure and commonly used phrases of QSOs in just 50 epochs of going through the training material, wouldn't the same concept work for actual Morse signals?<br />
<br />
Well, I don't really have any suitable training material to test this. For the Kaggle competitions (<a href="http://ag1le.blogspot.com/2015/01/morse-learning-machine-v1-challenge.html">MLMv1</a>, <a href="https://inclass.kaggle.com/c/morse-learning-machine-challenge-v2/forums/t/14822/and-the-winner-is/82501">MLMv2</a>) I created a lot of training material, but its focus was different. The audio files and corresponding transcript files were open ended, as I didn't want to narrow down the approaches participants might take. The materials were designed with a Kaggle competition in mind, to make it possible to score participants' solutions.<br />
<br />
In machine learning you typically have training and validation material with many different dimensions and a target variable (or variables) you are trying to model. With neural networks you can train the network to look for patterns in the input(s) and set outputs to target values when an input pattern is detected. With RNNs you can introduce a memory function - this is necessary because you need to remember signal values from the past to properly decode Morse characters.<br />
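To make the memory idea concrete, here is a minimal pure-Python illustration (not part of any decoder): a recurrent update carries a hidden state forward, so each output depends on past samples, unlike a stateless per-sample mapping.

```python
def recurrent_smoother(signal, alpha=0.5):
    """Toy recurrence: the hidden state 'remembers' past signal values."""
    state, out = 0.0, []
    for x in signal:
        # each new state mixes the current sample with the previous state
        state = alpha * x + (1 - alpha) * state
        out.append(state)
    return out

recurrent_smoother([1, 0, 0], 0.5)  # [0.5, 0.25, 0.125]
```

An LSTM does something far more sophisticated with learned gates, but the principle of carrying state across time steps is the same.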
<br />
In Morse code you typically have just one signal variable, and the goal is to extract the decoded message from that signal. This could be done by having, for example, 26 outputs, one per alphabet character, and training the network to set output 'A' high when the pattern '.-' is detected in the signal input. Alternatively, you could have output lines for symbols like 'dit', 'dah' and 'element space' that are set high when the corresponding pattern is detected in the input signal.<br />
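A sketch of the second encoding (one output line per symbol class): the target at each time step is a one-hot vector over the symbol classes. The class names here are illustrative.

```python
# one output unit per symbol class (illustrative names)
SYMBOLS = ['dit', 'dah', 'ele', 'chr', 'wrd']

def one_hot(symbol):
    """Target vector with 1.0 on the active symbol class, 0.0 elsewhere."""
    v = [0.0] * len(SYMBOLS)
    v[SYMBOLS.index(symbol)] = 1.0
    return v

one_hot('dah')  # [0.0, 1.0, 0.0, 0.0, 0.0]
```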
<br />
Since a well-working Morse decoder has to deal with different speeds (typically 5 ... 50 WPM), signals containing noise and <a href="https://en.wikipedia.org/wiki/Fading">QSB fading</a>, and other factors, I decided to create a Morse Encoder program that generates artificial training signals together with the corresponding symbols, speed information, etc. I chose this symbols approach because it is easier to debug errors and problems when you can plot the inputs vs. outputs graphically. See this <a href="https://en.wikipedia.org/wiki/Morse_code#Representation.2C_timing_and_speeds">Wikipedia article for details</a> about the representation, timing and speed of the symbols.<br />
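The timing article above boils down to one formula worth keeping at hand: with the standard PARIS word timing, one dit lasts 1.2 / WPM seconds. A tiny helper:

```python
def dit_seconds(wpm):
    """Duration of one dit at the given speed, per PARIS timing: 1.2 / WPM."""
    return 1.2 / wpm

dit_seconds(20)  # ~0.06 s per dit at 20 WPM
```

So the 5 ... 50 WPM range above corresponds to dit lengths from 240 ms down to 24 ms - a 10x spread the decoder must cope with.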
<br />
The Morse Encoder generates a set of time synchronized signals and also has the capability to add QSB-type fading effects and Gaussian noise. See the example of 'QUICK BROWN FOX JUMPED OVER THE LAZY FOX ' plotted with deep QSB fading (4 second cycle time) and 0.01 sigma Gaussian noise in Figure 1 below.<br />
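A minimal sketch of how such fading and noise can be applied to a keyed signal. This is illustrative only; the function name and parameters are my own, not the actual MorseEncoder.py API.

```python
import numpy as np

def add_qsb_and_noise(signal, fs=1000, cycle=4.0, depth=1.0, sigma=0.01):
    """Multiply a keyed Morse signal by a sinusoidal fading envelope,
    then add Gaussian noise.
    fs: samples per second; cycle: QSB cycle in seconds;
    depth=1.0 fades all the way down to zero amplitude."""
    t = np.arange(len(signal)) / fs                       # time in seconds
    envelope = 1.0 - depth * (0.5 + 0.5 * np.sin(2 * np.pi * t / cycle))
    return signal * envelope + np.random.normal(0.0, sigma, len(signal))
```

With depth=1.0 and cycle=4.0 the envelope periodically reaches zero, reproducing the deep 4-second QSB shown in the figure.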
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEge6Dfsx_15zbCvonwtQDmCIcRrvYQD9zOLVvFoFGJD9IuMsrKbvKJRVziQCHQXsfSJRSN4pEi9kTJ2EUFqm76hSA24xseX5N2LIUE3V70dcNPTBY0pcMHNjbjbffNfIcAM_Z4k0TI7bmM/s1600/MorseEncoder.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="185" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEge6Dfsx_15zbCvonwtQDmCIcRrvYQD9zOLVvFoFGJD9IuMsrKbvKJRVziQCHQXsfSJRSN4pEi9kTJ2EUFqm76hSA24xseX5N2LIUE3V70dcNPTBY0pcMHNjbjbffNfIcAM_Z4k0TI7bmM/s640/MorseEncoder.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. Morse Encoder output signal with QSB and noise</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The QSB of real-life signals doesn't always follow a sin() curve as in Fig 1, but as you can see from the example below, this is close enough. The big challenge is how to continue decoding correctly when the signal goes down to the noise level, as shown between the 12000 and 14000 time samples (horizontal axis) below.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9SzoAZIZD2Woxadmr55w5wcH5YwF_phnqf-99CxVxf31LZKY7fVxzXb2-2H_ZRXaMzxv-c51z2OJKREv7rZgH7FMKtFyUlY0R2SV_4Aa-Zjp4KMabPTfXCNQKyIMRv6zR2kVtalt-vHk/s1600/MorseEncoder5.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="246" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9SzoAZIZD2Woxadmr55w5wcH5YwF_phnqf-99CxVxf31LZKY7fVxzXb2-2H_ZRXaMzxv-c51z2OJKREv7rZgH7FMKtFyUlY0R2SV_4Aa-Zjp4KMabPTfXCNQKyIMRv6zR2kVtalt-vHk/s640/MorseEncoder5.png" width="640" /></a></div>
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
TRAINING MATERIALS</h3>
To provide proper target values for RNN training, the Morse Encoder creates a pandas DataFrame with the following columns:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.t # keep time </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.sig # signal stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.dit # probability of 'dit' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.dah # probability of 'dah' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.ele # probability of 'element space' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.chr # probability of 'character space' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.wrd # probability of 'word space' stored here</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> P.spd # WPM speed stored here </span><br />
<div>
<br /></div>
The Morse Encoder takes the given text and parameters and generates values for these columns. For example, when there is a 'dit' in the signal, P.dit is set to probability 1.0 on the corresponding rows. Likewise, if there is a 'dah' in the signal, P.dah is set to 1.0 on the corresponding rows. This is shown in Figure 2 below - dits are <span style="color: red;">red</span> and dahs are <span style="color: #38761d;">green</span>, while the signal is shown in <span style="color: blue;">blue</span>.<br />
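The column layout can be sketched as follows. The values here are placeholders for illustration, not actual MorseEncoder.py output; only the column names follow the list above.

```python
import numpy as np
import pandas as pd

n = 10                                    # ten samples on a 1 ms grid
P = pd.DataFrame({
    't':   np.arange(n) * 0.001,          # time axis in seconds
    'sig': np.full(n, 0.55),              # noisy keyed signal amplitude
    'dit': np.zeros(n),                   # P('dit') at each sample
    'dah': np.ones(n),                    # a 'dah' spans this whole window
    'ele': np.zeros(n),                   # P('element space')
    'chr': np.zeros(n),                   # P('character space')
    'wrd': np.zeros(n),                   # P('word space')
    'spd': np.full(n, 40),                # speed in WPM
})
```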
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgpYgRZN_DUuLoWkZ1gOblspjz9QYQrHndNzNBM-P4lMKYJOiwTl_AomQABoNgTN_R0_Cj4vM4D4sHW89gSMmvLKSf-91fne-tT3_kV9E06FC58F8VkTOz3UA1HTwuO6HmxHANHTWVdW0/s1600/MorseEncoder2.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgpYgRZN_DUuLoWkZ1gOblspjz9QYQrHndNzNBM-P4lMKYJOiwTl_AomQABoNgTN_R0_Cj4vM4D4sHW89gSMmvLKSf-91fne-tT3_kV9E06FC58F8VkTOz3UA1HTwuO6HmxHANHTWVdW0/s640/MorseEncoder2.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 2. Dit and Dah probabilities </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
A zoomed section of the letters 'QUI ' is shown in Fig 3 below.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVo3VhDG5sJUxbMl2cM8AqRTcibwO32NZanJ-A6f_gmAQFmU3o397lEDwzuKNgn351AzliAg9kGRYrhpvrJV5SXlmUtYqYQfeNIaMi-n5Z2_v45_nM6RbdZRA_T7N4DYtOLcsKlkXXpzY/s1600/MorseEncoder3.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVo3VhDG5sJUxbMl2cM8AqRTcibwO32NZanJ-A6f_gmAQFmU3o397lEDwzuKNgn351AzliAg9kGRYrhpvrJV5SXlmUtYqYQfeNIaMi-n5Z2_v45_nM6RbdZRA_T7N4DYtOLcsKlkXXpzY/s640/MorseEncoder3.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 3. Zoomed section</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
Likewise, we create probabilities for the spaces. In Figure 4 below, element space is shown in <span style="color: magenta;">magenta</span> and character space in <span style="color: cyan;">cyan</span>. I decided to set character space to probability 1.0 only after the element space has passed, as can be seen from the graph.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbwJVtXefEVo3ci6FrbkjsahDeTVF3wwAHe-5k2MLLX8LruAKHGnsx2lE42rG14AFf1h65Dl7yj7nntZXdRWbJdxI53pqqCySzApn2d0Vaf_8YDG12UYZ-KD5rdq-zqgcscHU6sV8kYPw/s1600/MorseEncoder4.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="186" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbwJVtXefEVo3ci6FrbkjsahDeTVF3wwAHe-5k2MLLX8LruAKHGnsx2lE42rG14AFf1h65Dl7yj7nntZXdRWbJdxI53pqqCySzApn2d0Vaf_8YDG12UYZ-KD5rdq-zqgcscHU6sV8kYPw/s640/MorseEncoder4.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 4. Element Space and Character Space </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The resulting DataFrame can be saved to a CSV file with a simple Python command, and it is very easy to manipulate or plot. Conceptually it is like an Excel spreadsheet - see below:<br />
<br />
<table border="1" class="dataframe" style="background-color: white; border-collapse: collapse; border-spacing: 0px; border: 1px solid black; box-sizing: border-box; color: black; font-family: 'Helvetica Neue', Helvetica, Arial, sans-serif; font-size: 14px; line-height: 20px; margin-left: 0px; margin-right: 0px;"><thead style="box-sizing: border-box;">
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; text-align: right;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;"></th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">t</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">sig</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">dit</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">dah</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">ele</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">chr</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">wrd</th><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; text-align: left; vertical-align: middle;">spd</th></tr>
</thead><tbody style="box-sizing: border-box;">
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.000</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.573355</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.001</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.531865</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">2</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.002</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.554412</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">3</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.003</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.551539</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">4</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.004</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.536430</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">5</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.005</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.561438</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">6</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.006</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.561170</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">7</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.007</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.546326</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">8</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.008</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.562902</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
<tr style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em;"><th style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">9</th><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.009</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0.533140</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">1</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">0</td><td style="border-collapse: collapse; border: 1px solid black; box-sizing: border-box; margin: 1em 2em; padding: 4px; vertical-align: middle;">40</td></tr>
</tbody></table>
<br />
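The save step mentioned above is indeed a one-liner with pandas. A round-trip sketch using the first two rows of the table (column names as listed; the filename is made up):

```python
import pandas as pd

# first two rows of the sample table above
df = pd.DataFrame({'t': [0.000, 0.001], 'sig': [0.573355, 0.531865],
                   'dit': [0, 0], 'dah': [1, 1], 'ele': [0, 0],
                   'chr': [0, 0], 'wrd': [0, 0], 'spd': [40, 40]})
df.to_csv('morse_sample.csv', index=False)   # the "simple Python command"
back = pd.read_csv('morse_sample.csv')        # load it back for training
```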
The Morse Encoder software is open source and available on GitHub: <a href="https://github.com/Morse-Learning-Machine-Challenge/MLMv1/blob/master/ag1le/MorseEncoder.py">MorseEncoder.py</a>.<br />
<br />
<h3>
NEXT STEPS</h3>
Now that I have the capability to create proper training material automatically with a few parameters, like speed (WPM), fading (QSB) and noise level (sigma), it is a trivial exercise to produce large quantities of these training files.<br />
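Sweeping the parameter grid could look like this. This is a hypothetical sketch; the encoder call in the comment is illustrative, not the actual MorseEncoder.py interface.

```python
import itertools

# illustrative parameter grid for batch generation of training files
wpm_values   = [5, 10, 20, 30, 40, 50]    # speed in WPM
qsb_cycles   = [2.0, 4.0, 8.0]            # fading cycle in seconds
noise_sigmas = [0.0, 0.01, 0.05]          # Gaussian noise sigma

jobs = list(itertools.product(wpm_values, qsb_cycles, noise_sigmas))
for i, (wpm, qsb, sigma) in enumerate(jobs):
    fname = 'train_%04d_wpm%d.csv' % (i, wpm)
    # hypothetical encoder call, one training file per combination:
    # encoder.encode(TEXT, wpm=wpm, qsb_cycle=qsb, sigma=sigma).to_csv(fname)
```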
<br />
My next focus area is to learn more about Recurrent Neural Networks (especially LSTM variants) and experiment with different network configurations. The goal would be to find an RNN configuration that is able to learn to model the symbols correctly, even in the presence of noise and QSB, or at different speeds.<br />
<br />
73<br />
AG1LE<br />
<br />
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-50222218156097061232015-11-15T17:38:00.000-05:002015-11-15T20:37:06.924-05:00Your next QSO partner - Artificial Intelligence by Recurrent Neural Network? <h3>
INTRODUCTION</h3>
A few months ago <a href="http://cs.stanford.edu/people/karpathy/" target="_blank">Andrej Karpathy</a> wrote a <a href="http://karpathy.github.io/2015/05/21/rnn-effectiveness/" target="_blank">great blog post</a> about recurrent neural networks. He explained how these networks work and implemented a character-level RNN language model that learns to generate Paul Graham essays, Shakespeare's works, Wikipedia articles, LaTeX articles and even C++ code from the Linux kernel. He also released the code of this RNN network on <a href="https://github.com/karpathy/char-rnn" target="_blank">Github</a>.<br />
<br />
It has been a while since I last experimented with RNNs. At the time I found RNNs difficult to train and did not pursue them any further. Well, all that has changed in the last year or so. I installed Andrej's <a href="https://github.com/karpathy/char-rnn" target="_blank">char-rnn package</a> from Github in less than 10 minutes on my Linux laptop using the instructions in the Readme.md file. I tested the installation by training the RNN with the collected texts of Shakespeare provided as part of the package.<br />
<br />
If you have a GPU graphics card (like an NVIDIA Titan) the training goes much faster. I did not have one, so I let the training run in the background for over 24 hours on my Lenovo X301 laptop. Looking at the results, the RNN indeed learned to output Shakespeare-like language, as Andrej explains in his blog post. It certainly took me more than 24 hours to learn the English language, and I never learned to write dialogue like Shakespeare. Please note that the RNN was a "<a href="https://en.wikipedia.org/wiki/Tabula_rasa" target="_blank">tabula rasa</a>", so it had to learn everything one character at a time - <b>this was a pretty amazing result</b>!<br />
<br />
I decided to do an experiment to find out if this RNN technology could be used to build a ham radio robot.<br />
<h3>
TRAINING A HAM RADIO ROBOT</h3>
The robot would have to learn how people make CW QSOs in real life. I collected some 10,000 lines of example ham radio CW QSOs from various sources. Some examples were complete QSOs, some were short contest-style exchanges and some were just calling CQ. The quality of the language model depends on the amount of examples in the training file.<br />
<br />
To do this properly I would need at least a few megabytes of examples, but I found only about 200 kBytes after a few minutes of Internet searching. I copied this material into a single file named "input.txt" in the data/QSOs directory and started the training with the following command:<br />
<br />
<span style="font-family: "courier new" , "courier" , monospace;">th train.lua -data_dir "data/QSOs" -gpuid -1 </span><br />
<br />
The software reads the input data, builds a dictionary, configures the neural network and its parameters, and starts the training. During training it writes snapshots into the cv directory, which lets you see how the RNN learns from the provided material.<br />
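To build an intuition for what "learning one character at a time" means, here is a toy character-level model in Python. It is only a bigram frequency model (my own illustration, far weaker than char-rnn's LSTM), but it shows the same basic loop: count which character tends to follow which, then generate text one character at a time:

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every character, which characters tend to follow it."""
    model = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        model[a][b] += 1
    return model

def sample(model, seed, length, rng=random.Random(0)):
    """Generate text one character at a time from the bigram counts."""
    out = seed
    for _ in range(length):
        counts = model.get(out[-1])
        if not counts:
            break  # dead end: no character ever followed this one
        chars, weights = zip(*counts.items())
        out += rng.choices(chars, weights=weights)[0]
    return out

corpus = "CQ CQ CQ DE AG1LE AG1LE K "
model = train_bigrams(corpus)
print(sample(model, "C", 20))
```

A real char-rnn model conditions on a long history through its hidden state instead of just the previous character, which is why it can learn call sign syntax and QSO structure.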
<br />
After only 12 epochs the RNN had already learned some basic concepts like call signs and basic words, but that was not enough training to learn English or the proper syntax of a CW QSO. <br />
<br />
<span style="background-color: #6fa8dc;">GG3QXP DE WA1AXP/M K</span><br />
<span style="background-color: #6fa8dc;"><br /></span>
<span style="background-color: #6fa8dc;">RARR WIS DE M8BPY RR NAMME FDO RR0S ANOEN CUX/I AO RNEL SNETSL CL ATRSO IANLOED UR RV SI MEE DO RU @ UR RSR IX NUOR IS QS IN WIEZ DE QEMUM/7 W</span><br />
<div>
<br /></div>
<div>
<br /></div>
<div>
After 25 epochs the RNN had reached the contest section (November Sweepstakes) of the input file. Note that the call signs below (like VE7LP or K2XC) were not included in the <a href="http://k7og.net/download-my-cw-practice-source/" target="_blank">original training file</a>. The RNN has learned the proper call sign syntax from the given examples and is generating the text below:</div>
<div>
<br /></div>
<div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">VE7LP 314 Q 52 SK=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">K2XC 153 M 68 WNY=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">K8CSC 285 A 15 MI=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">AE5MX 341 Q 47 SB=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">AE7J 419 A 06 WWA=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">K5RMC 137 M 43 OK=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">KB2QUT 34 U 21 NNY=</span></span></div>
<div>
<span style="background-color: #6fa8dc;"><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">K9MIP 120 M 08 WI=</span></span></div>
</div>
<div>
<br /></div>
Wow...the RNN has also learned the <a href="http://lists.contesting.com/_cq-contest/2007-11/msg00349.html" target="_blank">November Sweepstakes exchange</a> from the provided examples. Even the <a href="http://www.arrl.org/contest-sections-list" target="_blank">ARRL sections</a> are correct. Note that the RNN is generating these based on learned rules, not just copying the provided examples. All this by reading the input text file one character at a time.<br />
<br />
After 39 epochs the RNN had learned many CW acronyms, the RST report and some basic QSO structure, though there is still a lot of nonsense in the output. The RNN talks about name, QTH, weather (WX) and even the RIG and antennas:<br />
<br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">GUDXTEN/8 IN = NAME HR IS ED ED QTH IS CHAPOTON FNCI HOO OO DINED MIAD VALT W FO FON CUR DS MS ES TOT FER CL IW QSO OB ULLOO = QRHPO DE LOOFSD SORK/ISTO= = = RIG HR IS LACKORI WAZH PA WANFOO = WX WR = 2IT WINLE WOOD DES UP 55 FE HW? + MJ1GJO DE MJ3ASA K</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> GUD DX ES 73 G8XFO DE 2E3CUD/9RP @ </span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> KC6XQ DE M5WMM/M DE M1TGL/M K</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> W63ED DE M5YUE</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">VVV VVV</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">CQ CQ CQ DE WA1NX/WA50 WB4AJH/6 KC0AHH K1 WAJH K</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">WA3JRC DE W4DD/5MM DE KC3GJJ/8 K</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> GV8SDE DE 2I8APZ GD ENZ/QRP GD3BOB </span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> G1KHC DE G3ECQ/QCP M7Y</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">VVV VVVVV</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">CQ CQ CQ DE W3ARJ/0 W11IA DE M9TFO/P WA2MJH/4 K</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;">1BJ0 WA3BH DE MJ1GDS MJ6XW K</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> MW6PJP/M DE MW6QIC UR RST HR 459 W QSB M5YES DE 2E6AO QRS 6NT QSO D = RIG HR IS K70 WNTN = NAME HR IS JIM JEC QTH IN COLLA NCME HO FEL PU DS MS ES 2INTOOBA MCONTS = = UR RRTR MNNI W IS HOX WBOO ISTEL INYD = =</span><br />
<span style="background-color: #6fa8dc; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> GUD RIS HL W0 IN W WOME DEN UTD PO P =</span><br />
<div>
<br /></div>
At this point the RNN has seen the complete training material only 39 times. It has figured out many rules (like ending its turn on "K") but it still has problems with vocabulary, how to use call signs properly, and so on.<br />
<br />
After 50 epochs the RNN had already mastered idioms such as TNX FER NICE QSO CU AGN, and the structure resembles a real QSO:<br />
<br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">KC2RD/4 DE W1GJV K</span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> W1ARD DE KC2RD/3RO UR RST HR 459 W QSB GD3DIA DE GD9GEN @ CD GG2RD/9 DE W1GRV/4 DE GU5TCH/MM R TNX FER NICE QSO CU AGN </span><br />
<span style="background-color: #3d85c6; font-family: 'courier new', courier, monospace; font-size: x-small;">M2YXT DE GD5FM UR RST HR 529 W QSB W1GRD DE W1GRR RR K</span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">GG TI TE UR 33 </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">IWAMO DE WA6EN </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">KC2X DE W1YDH KE9NZE/0 OL TU </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">UR RST HR 309 W QSB = NAME HR IS AANNY WAVEL FNH COTE TNX QST </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">= UR 7S PR =</span><span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> UR RST HR 599 W QSB = HR VY NERVOUS D DE MYUE USD 1S = </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">NAME HR IS DI EESTY == RIG HR IS HEATH 71 INTO A NME HR IS VILL HW? </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">2E9AAT DE GW6QI UR TS TAX DEL NAME H5 UE EU 539 FE KHHJ RS 2E MES LANNY = </span><br />
<span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;">QRY = NAME HR IS ED ED </span><span style="background-color: #3d85c6; font-family: "courier new" , "courier" , monospace; font-size: x-small;"> QTH IS PARD VORETP</span><br />
<br />
You can also see that some parts (like NAME HR) repeat multiple times. This was also noted by Andrej in his experiments. Since the training is done one letter at a time, not word by word, the RNN doesn't really get the context of these phrases.<br />
<h3>
PRACTICAL APPLICATIONS</h3>
<div>
This kind of ability to provide predictive text based on language models is widely used in many Internet services. When you type letters into the Google search bar it offers alternatives based on predictions learned from many other search phrases. See figure 1 below. </div>
<div>
<br /></div>
<div>
<br /></div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjICWl9p-iLWEoKrvgZVjEGPF8ZNzU6wQgJhPcVL15LJ9YRezMeijSmkdjZwe4y67yWeGUvUYyIMavhZT1ZDWybFFN_BpKzuQDSrv1ety2vqlAutHQidh-B7yA9r8AO-73Wpgx_B4IS_U/s1600/predictive_text.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="115" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjICWl9p-iLWEoKrvgZVjEGPF8ZNzU6wQgJhPcVL15LJ9YRezMeijSmkdjZwe4y67yWeGUvUYyIMavhZT1ZDWybFFN_BpKzuQDSrv1ety2vqlAutHQidh-B7yA9r8AO-73Wpgx_B4IS_U/s640/predictive_text.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. Predictive search bar</td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<span style="background-color: #eeeeee;"><br /></span>
<br />
<br />
<br />
<br />
<br />
<br />
In the same manner an RNN could provide a prediction based on the characters entered so far and what it has learned from previous material. This would be a useful feature, for example, in a Morse decoder. Building a system that could respond semi-intelligently, for example in a contest situation, also seems feasible based on this experiment.<br />
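As a rough illustration of this kind of prediction (my own toy sketch, not an RNN), a Morse decoder could rank likely completions of a partially received word against a table of common CW words. The tiny corpus below is a hypothetical stand-in for frequencies learned from large amounts of QSO text:

```python
from collections import Counter

# Toy corpus of common CW words; a real predictor would use frequencies
# learned from a large body of QSO text.
CORPUS = ("CQ CQ DE AG1LE UR RST 599 599 NAME HR IS MAURI "
          "QTH IS BOSTON TNX FER QSO 73 K").split()

def predict(prefix, corpus=CORPUS, n=3):
    """Return the most frequent corpus words starting with the given prefix."""
    counts = Counter(w for w in corpus if w.startswith(prefix))
    return [w for w, _ in counts.most_common(n)]

print(predict("Q"))   # suggestions after copying a single 'Q'
print(predict("5"))   # a partially copied RST report
```

A decoder could use such predictions to fill in characters lost to QRM or QSB.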
<br />
However, there is a paradigm shift when we start using Machine Learning algorithms. In traditional programming you write a program that uses input data to come up with output data. In Machine Learning you provide both the input data and the output data, and the computer creates a program (aka a model) that is then used to make predictions. See figure 2 below for an illustration.<br />
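A minimal concrete example of this shift (my own illustration): in traditional programming you write the Celsius-to-Fahrenheit rule by hand; in the Machine Learning style you give the computer input/output pairs and let it fit the rule from the examples:

```python
# Traditional programming: the rule is written by hand.
def fahrenheit(c):
    return c * 9 / 5 + 32

# Machine Learning style: the "program" (a model y = a*x + b) is
# fitted from example input/output pairs by least squares.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

inputs = [0, 10, 20, 30, 40]
outputs = [fahrenheit(c) for c in inputs]   # the training data
a, b = fit_line(inputs, outputs)            # the learned "program"
print(a, b)                                 # recovers the 9/5 and 32
```

The fitted coefficients play the role of the model in Figure 2: nobody wrote the conversion rule into `fit_line`, it was recovered from the examples.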
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirCUp9XTey7_kIMPX_jGwZPzJma4xhzT5_LB9c07hb_p_s0LQE7Q-O1NdURCr1d4xcfe9AnqYlfKVn5h9SlQoVyq2x4deGzz7LLhP0oLsHuWT8BztU0k68f5Y0b4QIovaCiVEUhlmpOg4/s1600/machinelearning.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEirCUp9XTey7_kIMPX_jGwZPzJma4xhzT5_LB9c07hb_p_s0LQE7Q-O1NdURCr1d4xcfe9AnqYlfKVn5h9SlQoVyq2x4deGzz7LLhP0oLsHuWT8BztU0k68f5Y0b4QIovaCiVEUhlmpOg4/s400/machinelearning.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. Machine Learning paradigm shift</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
To build a ham radio robot we need to start by defining the input data and the expected output data. Then we need to collect a large number of examples that will be used to train the model. Once the model is able to accurately predict the correct output, you can embed it into the overall system. Some systems will continuously learn and update the model on the fly.<br />
<br />
In the case of a ham radio robot we could focus on automating contest QSOs, since the structure and syntax are well defined. In the experiment above the RNN learned the rules after seeing the examples only 25 times. The system could be monitoring a frequency, perhaps sending CQ TEST DE <MYCALL> or something similar. Once it receives a response it would generate the output using the learned rules, wait for an acknowledgement and log the QSO.<br />
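The contest flow described above can be sketched as a small state machine. This is a hand-written sketch of the control flow only; in the envisioned robot the responses would come from the trained network and a CW modem, and the call sign and exchange here are placeholders:

```python
MYCALL = "AG1LE"   # placeholder call sign

def contest_robot(received):
    """Run one scripted contest exchange over a list of received messages.

    Returns (messages sent, logged QSOs).
    States: calling CQ -> sent exchange, waiting for acknowledgement.
    """
    sent, log = [], []
    state, caller = "CQ", None
    for msg in received:
        if state == "CQ":
            sent.append(f"CQ TEST DE {MYCALL}")
            if msg:                          # somebody answered our CQ
                caller = msg.split()[0]
                sent.append(f"{caller} 599 001")   # placeholder exchange
                state = "WAIT_TU"
        elif state == "WAIT_TU":
            if "TU" in msg.split() or "R" in msg.split():
                log.append((caller, "599", "001"))  # acknowledged: log it
                sent.append(f"TU {MYCALL}")
                state = "CQ"
    return sent, log

sent, log = contest_robot(["K1ABC", "TU 599 123"])
print(log)   # one completed QSO with K1ABC
```

The interesting part is that with enough training material the RNN could learn these transitions implicitly, instead of them being hand-coded as above.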
<br />
If the training material covers enough "real life" cases, such as missed letters in call signs, out-of-sequence replies and non-standard responses, the ham radio robot would learn to act like a human operator and quickly resolve these situations. No extra programming needed, just enough training material to cover these cases.<br />
<h3>
CONCLUSIONS</h3>
A Recurrent Neural Network (RNN) is a powerful technology for learning sequences and building complex language models. A simple 100+ line program is able to learn the complex rules and syntax of ham radio QSOs in less than 50 epochs when presented with only a small amount of examples ( < 200 kBytes of text).<br />
<br />
Building a ham radio robot to operate a contest station seems to be within reach using ordinary computers. The missing pieces are enough real-world training material and an optimal neural network configuration for working with human CW operators. With the recent advances in deep learning and RNNs this seems an easier problem than, for example, building an automatic speech recognition system.<br />
<br />
<br />
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-20539116337798040412015-09-11T23:55:00.002-04:002018-08-15T11:00:32.602-04:00Happiness FormulaMany people have tried to express human happiness in a mathematical formula. One of my personal favorites was <a href="http://dilbertblog.typepad.com/the_dilbert_blog/2007/03/happiness_formu.html" target="_blank">created by Scott Adams</a> (of Dilbert fame). However, after much deep thought and a few drinks with my buddies, I have concluded that Scott did not get the formula quite right.<br />
<br />
The correct and official Happiness Formula is shown in Figure 1. below<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinFWiif_60Qfxo4fq-GcH154nVzWgb4xFH8SmTA8hjvPHb4HRF7PLjxIYsfV1GK0wLbVsI-2PDQeaB4805KPJQcxknBeVq_9ZDdmm2x-v7Z2TnHiaRWZQeLd0agFB1zw-F_luHV7OakBM/s1600/Screen+Shot+2015-09-12+at+12.33.14+PM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="68" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinFWiif_60Qfxo4fq-GcH154nVzWgb4xFH8SmTA8hjvPHb4HRF7PLjxIYsfV1GK0wLbVsI-2PDQeaB4805KPJQcxknBeVq_9ZDdmm2x-v7Z2TnHiaRWZQeLd0agFB1zw-F_luHV7OakBM/s640/Screen+Shot+2015-09-12+at+12.33.14+PM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. Happiness Formula</td></tr>
</tbody></table>
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
While Scott tried to explain happiness as a linear combination of its components, he missed a few important points.<br />
<br />
<b>Integral over time </b>- human happiness varies over time. Happiness is a fragile mental state that can easily go up or down. True happiness must be an integral over the observation time period. The time period could be one fantastic night out with good friends celebrating your promotion or over several months when you are fighting for your life in a cancer treatment center. It could also be over a lifetime when you are on your death bed thinking of your life and all the happy experiences. It can also be over the time period when you fell madly in love, got married and eventually divorced. When you select a different time horizon, you end up with a different happiness value.<br />
<br />
<b>Normalization</b> - to be able to measure happiness you need to normalize the value by dividing the sum of the components by your expectations. If you expect the world you might not be happy even with the greatest partner or a billion dollars in your pocket. Your happiness depends on your expectations; winning a million dollars in a lottery when you least expect it will boost your happiness for a while, but if you expected to win 2 million and only get 1 million you will be disappointed. A small kid visiting DisneyWorld for the first time is super happy about the experience; an adult visiting the same place for the 5th time gets easily bored and is not very happy.<br />
<br />
<b>Individual Coefficient</b> - C<span style="font-size: x-small;">i </span>, also known as "Ida's constant" after the Finnish waitress who validated the Happiness Formula once our happy group had spent significant effort, and had many drinks, formulating happiness. This coefficient scales the happiness value for each individual to comply with the International Unit of Happiness, aka the "Anand". <br />
<br />
<b>Standardized Unit</b> - Anand (<a href="http://dict.hinkhoj.com/words/meaning-of-%E0%A4%86%E0%A4%A8%E0%A4%A8%E0%A5%8D%E0%A4%A6-in-english.html"> आनन्द </a>) is a Hindi word for happiness. One Anand is the unit of happiness, much like the Tesla is the unit of magnetic flux density.<br />
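Putting the components together, the formula in Figure 1 can be written out roughly as follows (my own LaTeX reconstruction from the image, with symbol names as I read them):

```latex
% H   = happiness, measured in Anands
% S_k(t) = the individual happiness components at time t
% E(t)   = expectations at time t
% C_i    = the Individual Coefficient ("Ida's constant")
H = C_i \int_{t_0}^{t_1} \frac{\sum_k S_k(t)}{E(t)} \, dt
```

The integration limits t0 and t1 are the chosen observation period, which is exactly why different time horizons give different happiness values.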
<br />
So there you have it, a mathematically rigorous formula of Happiness.<br />
<br />
In the next post we focus on measurement techniques of Happiness and how to calibrate your measurement system against the International Anand standard held in safety in our Boston based laboratory.<br />
<br />
Until next time.<br />
<br />
Mauri<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-44046664991118585622015-09-10T02:13:00.001-04:002016-06-30T08:45:49.539-04:00Internet of Things (IoT) - hype vs. reality Over the last few years the hype around the "Internet of Things" (IoT) has been growing rapidly. According to the <a href="https://www.gartner.com/doc/3098434" target="_blank">Gartner Hype Cycle 2015</a>, IoT is currently peaking. Assuming IoT follows this cycle, this would mean that we are at the peak of inflated expectations and heading towards the trough of disillusionment. See figure 1 below for how the IoT concept is tracking on the hype cycle.<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://media.licdn.com/mpr/mpr/shrinknp_800_800/AAEAAQAAAAAAAAIKAAAAJGZkZmVmZjA1LTBiYjktNGY1Ni1iN2U3LTM0ZTMwNTQwN2M2Mw.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://media.licdn.com/mpr/mpr/shrinknp_800_800/AAEAAQAAAAAAAAIKAAAAJGZkZmVmZjA1LTBiYjktNGY1Ni1iN2U3LTM0ZTMwNTQwN2M2Mw.jpg" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. "Peak of Inflated Expectations"</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
I wanted to learn more about IoT technology and do some concrete experiments to better understand what IoT can offer. I found the <a href="https://store.particle.io/?product=particle-photon" target="_blank">Particle Photon</a>, a small $19 Arduino-compatible board the size of a U.S. quarter with WiFi-enabled Internet connectivity, very suitable for prototyping IoT ideas. See fig 2. below to get a sense of the size of this tiny board. I ordered two of them just to play a bit and try to build something useful.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://pbs.twimg.com/media/CM9ZBttUkAA2rtN.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="160" src="https://pbs.twimg.com/media/CM9ZBttUkAA2rtN.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 2. Particle Photon board with and without breadboard headers</td></tr>
</tbody></table>
<h3>
<br />HUMIDITY CONTROLLER EXPERIMENT</h3>
I already have some previous experience working with Arduino-compatible boards such as the <a href="https://www.sparkfun.com/products/11114" target="_blank">Arduino Pro Mini</a>, which is physically almost the same size as the Particle Photon. In fact I used that board to build a simple humidity controller in our bathroom. This was a quick weekend project where I prototyped on a breadboard a simple circuit with a humidity sensor, a LED indicator and a relay driver to turn the bathroom fan on and off. I assembled the prototype parts, including the breadboard, sensor, relay unit and 12V/5V power supply, inside an Apple mouse plastic enclosure. See Fig 3.<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNdQTx_9_F3hq4CX1boC50X42N5TrVu_GntjB98t_WfRIepejFie8qHcgRAnuYxPL-V2McWcvNYeIlnvXovQQbe2jlCZk45Je6gokHMurGUoFGzVxn8s5Zcrs52S_yhzYO3jC4KQqZW4c/s1600/IMG_20150830_132731.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="300" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgNdQTx_9_F3hq4CX1boC50X42N5TrVu_GntjB98t_WfRIepejFie8qHcgRAnuYxPL-V2McWcvNYeIlnvXovQQbe2jlCZk45Je6gokHMurGUoFGzVxn8s5Zcrs52S_yhzYO3jC4KQqZW4c/s400/IMG_20150830_132731.jpg" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 3. Arduino based humidity controller prototype.</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
With a few holes drilled in the plastic enclosure to allow air to flow over the sensor, I was able to fit the whole controller inside an existing vent box; see Fig 4. below. <br />
<div>
<br /></div>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyQc8kYFyBdWPxMssxj8sxWB6CvtNu4TCfAzaRddrz4TbsWDts_tSxnYWCHhLAz_ZodzuaXFKG-kGn41AV3q8x4ZjQZcyBAJbbZ_jOLcJupxIo1sTP7R4fmAFRv_6tHdWpp1traAo5LHs/s1600/IMG_20150830_180834.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyQc8kYFyBdWPxMssxj8sxWB6CvtNu4TCfAzaRddrz4TbsWDts_tSxnYWCHhLAz_ZodzuaXFKG-kGn41AV3q8x4ZjQZcyBAJbbZ_jOLcJupxIo1sTP7R4fmAFRv_6tHdWpp1traAo5LHs/s400/IMG_20150830_180834.jpg" width="300" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 4. Humidity controller installed inside the vent box.</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
However, I did have a problem with this simple controller. The few lines of software that I wrote in the winter, when relative humidity is normally quite low, worked very well for many months, but during the summer months, when relative humidity is much higher, the software didn't work that well. Debugging this kind of embedded software is not easy. I disassembled the prototype three times to upload yet another software version, but the damn thing kept starting the vent fan in the middle of the night or at some other random time.<br />
<h3>
<br />INTERNET ENABLED HUMIDITY CONTROLLER </h3>
I had to find a solution to this problem, so when I learned about the Particle Photon board I knew that it might just be the answer. After reading some of the documentation I was pretty sure that having an Internet connection would not only help me debug the problem but also save me a lot of trouble, as the Particle Photon allows you to install new firmware over the air. I wouldn't have to get the vent box open, remove all the wires, flash the Arduino board with new software and assemble everything back together. <br />
<br />
After connecting the Particle Photon board to my WiFi and adding the device in the Particle.io web IDE (see Fig 5. below), I used the web-based development environment to edit and debug the humidity controller software. I could simply edit and compile new code, press a button, and install the firmware almost instantly over the Internet using the WiFi connection on the Photon board.<br />
<br />
How cool is this? <br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkXNs_U7rB6E1n6GFwMXFg1yiO4a4vxLE8HvjbVnx5Hdw7iPbr7TrTyIx5l45Ov7-jyqUobmQ42bi5cn7B2MdGX2C5UozZV4I6j4eQwSnnSul54rpexQiJQ-sSwuuM1t5zJO0hkEiMiGE/s1600/Screen+Shot+2015-09-10+at+1.25.26+AM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="368" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkXNs_U7rB6E1n6GFwMXFg1yiO4a4vxLE8HvjbVnx5Hdw7iPbr7TrTyIx5l45Ov7-jyqUobmQ42bi5cn7B2MdGX2C5UozZV4I6j4eQwSnnSul54rpexQiJQ-sSwuuM1t5zJO0hkEiMiGE/s640/Screen+Shot+2015-09-10+at+1.25.26+AM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 5. Web based software development environment</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Particle.io also provides an excellent API with simple-to-use Internet-enabled functions that you can incorporate into your own software. In my case I wanted to debug how the humidity sensor behaves during the transient increase in relative humidity when taking a shower. I used the <a href="https://www.sparkfun.com/products/12064" target="_blank">HTU21D </a>sensor from Sparkfun. An easy way to debug is to publish your sensor data to the Internet using a simple function call like <span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">Spark.publish("RH_temp",str,60,PRIVATE); </span><br />
<br />
You can use the <a href="https://dashboard.particle.io/user/logs" target="_blank">Particle Dashboard</a> to view the sensor data in near real time. This was almost too easy to describe on a blog like this.<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtqmt9P3SMznAwqnSchmX4bq_g0rMk68pxLR_9S8xzQ6f9uDsv9tEZFcouGipGYyf21LQa6sIuvhXvHgdH5GXi8NMA8-PbQUJyo3DJpCekPgc-xDLllNK2ah8vNnajiOWaIKmnlTKmng/s1600/Screen+Shot+2015-09-10+at+1.39.32+AM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="264" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJtqmt9P3SMznAwqnSchmX4bq_g0rMk68pxLR_9S8xzQ6f9uDsv9tEZFcouGipGYyf21LQa6sIuvhXvHgdH5GXi8NMA8-PbQUJyo3DJpCekPgc-xDLllNK2ah8vNnajiOWaIKmnlTKmng/s640/Screen+Shot+2015-09-10+at+1.39.32+AM.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 6. Particle Dashboard</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
You can of course also use your own web or mobile applications to read and write data as well as control the input/output pins on the Photon board.<br />
<br />
I used a simple API command line call to capture the sensor data for plotting: <br />
<br />
<span style="font-size: x-small;"><span style="font-family: "courier new" , "courier" , monospace;">curl -k https://api.particle.io/v1/devices/<your device id>\/events/?access_token\=<your access token> >photon_data.txt</span></span><br />
<br />
Figure 7. below shows the relative humidity transient after taking a shower and then the decline as the vent fan runs. You can also see the small temperature increase while the hot water is running. When looking at the data I realized that my RH% threshold had been too small. When I increased the threshold value the controller started working much better. Being able to extract the sensor data and publish it over the Internet made a big difference in debugging the original problem. <br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk_wYj_ctY_oadFM3ScikVYwTBqDe1rob3Cil7Z3dySyK7WS-NRqPSIjn8oaOQ2es1YoGzdAaeywpD7wKnGpHJDPS1yTL0s1CW_J268OBIb172RoVo5vO6ukYTd-GkvgAElKCXOMmzB_Y/s1600/humidity_controller.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="297" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhk_wYj_ctY_oadFM3ScikVYwTBqDe1rob3Cil7Z3dySyK7WS-NRqPSIjn8oaOQ2es1YoGzdAaeywpD7wKnGpHJDPS1yTL0s1CW_J268OBIb172RoVo5vO6ukYTd-GkvgAElKCXOMmzB_Y/s640/humidity_controller.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 7. RH% delta and Temperature over time</td></tr>
</tbody></table>
<br />
<br />
<h3>
PLOTTING </h3>
In order to collect more data and have a dashboard to plot and review the measurements, I signed up for a free account at <a href="https://thingspeak.com/channels/public">ThingSpeak</a>. You get an API key and a channel number, and with these you can publish the measurements with a simple API call: <br />
<span style="font-family: "Courier New",Courier,monospace;">ThingSpeak.writeFields(myChannelNumber, myWriteAPIKey);</span><br />
<br />
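Under the hood, that library call just issues an HTTP GET against the channel's update endpoint of the ThingSpeak REST API. Here is a small plain C++ helper (the function name is illustrative, not from the project code) that assembles such an update URL for two fields:

```cpp
#include <string>
#include <sstream>

// Build a ThingSpeak channel-update URL of the form
// https://api.thingspeak.com/update?api_key=KEY&field1=..&field2=..
// (key name and field parameters follow the public ThingSpeak REST API).
std::string thingspeakUpdateUrl(const std::string& apiKey,
                                double field1, double field2) {
    std::ostringstream url;
    url << "https://api.thingspeak.com/update?api_key=" << apiKey
        << "&field1=" << field1 << "&field2=" << field2;
    return url.str();
}
```

Fetching the resulting URL with curl, or pasting it into a browser, updates the channel the same way the Arduino library call does.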
Fig 8 and 9 below show the sensor data plots. Relative humidity peaks at 100% when taking a shower, but since the fan is turned on almost instantly the humidity quickly drops back to normal. You can also see a small increase in temperature at the same time. The drop in temperature is due to the A/C, which turns on at 6:00 AM. <br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyNAYSjVl3qvjHEEHJ_6eHGIT_qo6BoY1U4q_3WFcfdsVlb-FZxQOQM23LK6rZCxHccPH54ZrxxjWkcbBlW6mMe_mXp7dOqdQkJAlba9LV0s8xvURm-A0XIj4JfRQmQo0WjlTf93OUp8M/s1600/Screen+Shot+2016-06-30+at+8.35.07+AM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="273" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhyNAYSjVl3qvjHEEHJ_6eHGIT_qo6BoY1U4q_3WFcfdsVlb-FZxQOQM23LK6rZCxHccPH54ZrxxjWkcbBlW6mMe_mXp7dOqdQkJAlba9LV0s8xvURm-A0XIj4JfRQmQo0WjlTf93OUp8M/s400/Screen+Shot+2016-06-30+at+8.35.07+AM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 8. Relative Humidity plot showing a peak </td></tr>
</tbody></table>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnsGVVbAazYkP7G116IeTrvmwJ1YRsqqGOuVrgM4Y9zKjP_11R-JLXEWas5P_JAxDPt9UeCf90uaodw6k_1YErlBonIIMtlE2bSR3XjunT2uWasIgK0SfpfZNxJOVFvBGXBOqYFC6rs-c/s1600/Screen+Shot+2016-06-30+at+8.35.14+AM.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" height="271" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnsGVVbAazYkP7G116IeTrvmwJ1YRsqqGOuVrgM4Y9zKjP_11R-JLXEWas5P_JAxDPt9UeCf90uaodw6k_1YErlBonIIMtlE2bSR3XjunT2uWasIgK0SfpfZNxJOVFvBGXBOqYFC6rs-c/s400/Screen+Shot+2016-06-30+at+8.35.14+AM.png" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 9. Temperature plot</td></tr>
</tbody></table>
<br />
<h3>
CONCLUSIONS </h3>
My quick foray into the world of "Internet of Things" took me about 2 hours on a Sunday afternoon. Using the latest Particle.io Photon board and the web-based IDE, I was able to convert my existing Arduino-based humidity controller into an Internet-enabled controller that publishes sensor data in near real time and allows me to update the software over the air. <br />
<br />
This whole project felt almost too easy - I expected building IoT prototypes to be much harder, but at least for this simple use case it took a novice like myself only a short time to solve a real-world problem. Now the bathroom vent works as expected and humidity is under control. <br />
<h3>
<br />APPENDIX - SOFTWARE </h3>
The current software version is listed below. As you can see, this is not rocket science - a few lines of code and you have an Internet-connected sensor / controller. <br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">// This #include statement was automatically added by the Particle IDE.</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#include "HTU21D/HTU21D.h"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#include "application.h"</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">/* </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> HTU21D Humidity Controller</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> By: Mauri Niininen </span><span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">(c) Innomore LLC</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> Date: Aug 30, 2015</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> Uses the HTU21D library to control humidity using a fan.</span><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> Hardware Connections (Breakout board to Photon)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> -VIN = 5.3 V</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> -VCC = 3.3 V</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> -GND = GND</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> -SDA = D0 (use inline 330 ohm resistor if your board is 5V)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> -SCL = D1 (use inline 330 ohm resistor if your board is 5V)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> -RLY = D3 relay board </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> */</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#define HOUR 3600/10 // 1 HOUR in seconds - divide by loop delay 10 secs</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#define HRS_24 24 // 24 hours of history </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">// define class for 24 hr relative humidity </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">class RH24 {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">private:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float rh24[HRS_24]; // Keep last 24 hours of humidity </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> int counter;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> int index; </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">public:</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> void init(float RH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> void update_h(float RH); </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float avg(float RH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">};</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">//Create an instance of the objects</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">HTU21D mh;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">RH24 rh; </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">// for time sync</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">#define ONE_DAY_MILLIS (24 * 60 * 60 * 1000)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">unsigned long lastSync = millis();</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">void setup()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">{</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> pinMode(D7, OUTPUT);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> pinMode(D3, OUTPUT);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> while (! mh.begin()){</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D7,HIGH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> delay(200);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D7,LOW);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> delay(200);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // turn on the fan</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D3,HIGH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // initialize sensor </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> rh.init(mh.readHumidity()); </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> delay(5000);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // turn off the fan</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D3,LOW);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">}</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">// MAIN PROGRAM LOOP </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">void loop()</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">{</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // read sensor humidity and temperature </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float humd = mh.readHumidity();</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float temp = mh.readTemperature();</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float avg = rh.avg(humd);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float delta = humd - avg;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // convert data to string</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> String h_str = String(humd,2);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> String t_str = String(temp,2);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> String d_str = String(delta,2);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> String avg_str = String(avg,2);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> String tm_str = String(millis());</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> String str = String(h_str+":"+t_str+":"+d_str+":"+avg_str+":"+tm_str);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // if relative humidity increases over 12% vs. 24 hour average, turn on the fan </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if (humd - avg > 12.0) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D7,HIGH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D3,HIGH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> Spark.publish("RH_temp","ON",60,PRIVATE);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> else {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D7,LOW);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> digitalWrite(D3,LOW);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // time sync over Internet once a day</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if (millis() - lastSync > ONE_DAY_MILLIS) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // Request time synchronization from the Particle Cloud</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> Spark.syncTime();</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> lastSync = millis();</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> // Send a published string to your devices...</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> Spark.publish("RH_temp",str,60,PRIVATE);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> delay(10000);</span><br />
<br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">}</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">void RH24::init(float RH){</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> counter = 0;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> index = 0;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> for (int i = 0; i < HRS_24; i++)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> rh24[i] = RH;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">}</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"><br /></span>
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">void RH24::update_h(float RH) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> counter += 1;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if (counter > HOUR) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> counter = 0; </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> rh24[index] = RH; </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> index += 1;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> if (index >=HRS_24) </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> index = 0;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> }</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">} </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> </span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">float RH24::avg(float RH) {</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> update_h(RH);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> float sum = 0.0;</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> for (int i = 0; i < HRS_24; i++)</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> sum += rh24[i];</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;"> return (sum/HRS_24);</span><br />
<span style="font-family: "courier new" , "courier" , monospace; font-size: x-small;">}</span><br />
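The 24-hour rolling-average logic above is easy to exercise off-device. Here is a host-side sketch of the same ring buffer in plain C++ (the Particle cloud calls and sensor reads are omitted; names and constants mirror the firmware):

```cpp
// Host-side sketch of the 24-hour rolling-average class from the
// firmware above (Particle-specific code omitted).
class RH24 {
    static const int HOUR = 3600 / 10; // loop iterations per hour at a 10 s delay
    static const int HRS_24 = 24;      // 24 hourly samples of history
    float rh24[HRS_24];
    int counter;
    int index;
public:
    // Fill the history with the current reading so the average starts sane.
    void init(float RH) {
        counter = 0;
        index = 0;
        for (int i = 0; i < HRS_24; i++) rh24[i] = RH;
    }
    // Store one sample per hour into the ring buffer.
    void update_h(float RH) {
        if (++counter > HOUR) {
            counter = 0;
            rh24[index] = RH;
            if (++index >= HRS_24) index = 0;
        }
    }
    // Update history, then return the 24-hour average.
    float avg(float RH) {
        update_h(RH);
        float sum = 0.0f;
        for (int i = 0; i < HRS_24; i++) sum += rh24[i];
        return sum / HRS_24;
    }
};
```

Testing the fan-trigger condition (current reading more than 12% above the 24-hour average) against this class on a PC is much faster than flashing the board for every tweak.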
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-72747033791300921992015-04-01T11:20:00.000-04:002015-04-01T11:45:57.851-04:00El-bug: a novel Morse decoder based on cockroach neural circuitsApril 1, 2015<br />
<br />
I have been working on a project to harness the power of biological neural circuits into a practical novel solution for digital communications. I decided to call this project "El-bug" as my focus was to find out how fast biological neural circuits can learn to decode Morse code.<br />
<br />
Biological neural circuits have some amazing properties compared to computer-based artificial neural networks. State-of-the-art deep learning algorithms require millions of data points and hours or days of repetition to learn patterns in the data, whereas biological neural circuits can often learn new patterns from only a few examples and on a time scale of tens of milliseconds. Based on the literature, biological neural circuits are also adaptive and work well with noisy real-world signals. <br />
<br />
Computer-based learning algorithms require expensive hardware to store gigabytes of training data, GPUs to accelerate the learning process, and complicated electronics to convert real-world signals into digital pictures, audio or other representations. In comparison, biological neural circuits are very small, typically come pre-integrated with sensory organs, and require very little power in the form of cheap organic energy sources such as glucose.<br />
<br />
The hardware used in this project is based on Arduino and the components are available for less than $50 from multiple sources online. The biological neural circuitry of the American cockroach (Periplaneta americana) is used to power the computation. These fascinating insects are readily available from many sources at low cost, or sometimes even free of charge.<br />
<br />
<h3>
SYSTEM ARCHITECTURE </h3>
The overall system architecture is shown in Figure 1 below. The Arduino Pro Mini (3.3 V/8 MHz) has analog and digital interfaces and is connected to an RFDuino Bluetooth module. The interface to the cockroach neural circuitry is an analog amplifier with a frequency response designed for capturing bio-electrical neural spike signals. Digital output lines provide electrical stimulation of the nerves.<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDKVZZJqInzir9mMZGUXPqEllJ26iNBO50lpp8X8zo8qcp_9jp5zHbry6yvnFp7Z4GbUZSDjW2aCctL_APLvT9ZPwUGldHuyhz6YihKOL3KvlnoK7li3iACloXu32VkuqMmcjH4hiOaoM/s1600/Slide1.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDKVZZJqInzir9mMZGUXPqEllJ26iNBO50lpp8X8zo8qcp_9jp5zHbry6yvnFp7Z4GbUZSDjW2aCctL_APLvT9ZPwUGldHuyhz6YihKOL3KvlnoK7li3iACloXu32VkuqMmcjH4hiOaoM/s1600/Slide1.jpg" height="480" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. El-bug Morse decoder system architecture</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
COCKROACH ANATOMY AND NEURAL CIRCUITRY</h3>
There is a surprising amount of research available on the neural circuitry of <i>Periplaneta</i><i> </i><i>americana</i>. For example, <a href="https://backyardbrains.com/experiments/teachersguide/spikerbox" target="_blank">this source</a> explains:<br />
<i>"The anatomy of the cockroach is exceptionally accessible to electrophysiological experimentation for a variety of reasons. First, from the dorsal, or top, view the cockroach has a distinctive prothorax (the section directly behind, and shielding the head) and wings that give the cockroach its distinctive armored look. When flipped on its back, the ventral aspect of the cockroach reveals the basic segmented body sections distinctive of insects: the head, thorax, abdomen, and legs." </i><br />
<br />
See Figure 2 for details of cockroach anatomical features.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcRYIqzSQ9R7aP_C4Ns34VC9bjSyWjq8zl-l-7CbS3o3K1fMpXHX" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="300" src="https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcRYIqzSQ9R7aP_C4Ns34VC9bjSyWjq8zl-l-7CbS3o3K1fMpXHX" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. Cockroach Anatomy</td></tr>
</tbody></table>
<br />
<br />
After studying possible circuits to utilize, I decided to focus on the "escaping behavior" that the common cockroach (Periplaneta americana) exhibits. This is a robust behavior of turning away from wind puffs (Camhi et al. 1978). It is termed "escaping behavior" since it is the initial movement when escaping from predators. This <a href="http://ediacaran.mech.northwestern.edu/neuromech/index.php?title=Roach_Lab_Spring_2010-_Dickinson&oldid=6002" target="_blank">source</a> explains the detailed mechanism and neural circuitry in use:<br />
<i>"Understanding the anatomy of the cockroach nervous system is helpful when examining this escape behavior. The ventral nerve cord (VNC) of the cockroach is along its underbelly, rather than the dorsal side where the nerve cord of most vertebrates is located. The VNC is composed of several giant interneurons (GIs) and at the terminal ganglion afferents project to the dendrites of these GIs.</i><br />
<i>To detect wind directions, the cockroach has two cerci that are covered by numerous filiform hairs located at the rear of its body (Figure 1). Mechanoreceptors are attached at the base of the filiform hairs and are sensitive to wind puffs. Afferents send the neural signal from the mechanoreceptors to the terminal ganglion and thus provide input to the GIs. Due to its specific location and orientation on the cerci, each mechanoreceptor is sensitive to wind puffs from a specific direction relative to the cockroach. Afferents that are sensitive to similar wind directions are located close to each other within the abdominal ganglion." </i>Figure 3. from the same source shows typical measurement results as explained in the experiment.<br />
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://ediacaran.mech.northwestern.edu/neuromech/img_auth.php/thumb/6/6c/Lindsay_figure.png/800px-Lindsay_figure.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://ediacaran.mech.northwestern.edu/neuromech/img_auth.php/thumb/6/6c/Lindsay_figure.png/800px-Lindsay_figure.png" height="393" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 3. Typical afferent response </td></tr>
</tbody></table>
<br />
The Arduino Pro Mini provides low-cost circuitry to measure the neural responses and has 4 analog-to-digital channels readily available. An analog pre-amplifier such as the one in the <a href="https://www.backyardbrains.com/products/spikerBox" target="_blank">Spikerbox</a> can easily produce the voltage levels required by the Arduino ADC. This is a 10-bit ADC and provides 3.2 mV resolution. According to this <a href="http://arduino.cc/en/Reference/analogRead" target="_blank">source</a>, a single ADC read takes about 100 microseconds, which is adequate for this purpose. The goal here is to establish a clear differentiation between responses to two types of electrical stimulation, similar to [<a href="http://ediacaran.mech.northwestern.edu/neuromech/index.php?title=Roach_Lab_Spring_2010-_Dickinson&oldid=6002" target="_blank">Yu-Wei 2010</a>] - see Figure 4 for an example. <br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://ediacaran.mech.northwestern.edu/neuromech/img_auth.php/thumb/d/da/Trial_1%263_Spike%26Amp.png/850px-Trial_1%263_Spike%26Amp.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://ediacaran.mech.northwestern.edu/neuromech/img_auth.php/thumb/d/da/Trial_1%263_Spike%26Amp.png/850px-Trial_1%263_Spike%26Amp.png" height="375" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 4. Neural responses to stimulation</td></tr>
</tbody></table>
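As a quick sanity check on that resolution figure: a 3.3 V reference spread over 1024 steps gives roughly 3.22 mV per count. A tiny helper (illustrative only, not from the project code) makes the conversion explicit:

```cpp
// Convert a raw 10-bit ADC reading to millivolts, assuming a 3.3 V
// reference: 3300 mV / 1024 steps ~= 3.22 mV per count.
double adcCountsToMillivolts(int counts) {
    const double vrefMv = 3300.0; // 3.3 V reference in millivolts
    const int steps = 1024;       // 2^10 levels for a 10-bit ADC
    return counts * vrefMv / steps;
}
```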
<br />
<br />
<h3>
MORSE DECODING PROBLEM </h3>
So, given the above, how could we build a functioning Morse code decoder using these key components? The schema is shown in Figure 5. Using the Bluetooth module, we stream noisy Morse code audio to the Arduino. The software on the Arduino does very basic signal processing: it calculates the envelope of the audio signal and, after low-pass filtering, generates stimulus signals that are sent over digital output lines to cockroaches organized in a 4-level hierarchy corresponding to the alphabet. If numerals were included, a fifth level would be needed. <br />
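The envelope step can be sketched as full-wave rectification followed by a one-pole low-pass filter. A minimal C++ version (the class name and the smoothing coefficient are illustrative choices, not from the actual firmware):

```cpp
#include <cmath>

// Track the envelope of an audio sample stream: rectify each sample,
// then smooth with a one-pole low-pass filter.
// alpha in (0, 1]; larger alpha tracks the envelope faster.
class EnvelopeDetector {
    double env = 0.0;
    double alpha;
public:
    explicit EnvelopeDetector(double a) : alpha(a) {}
    double process(double sample) {
        env += alpha * (std::fabs(sample) - env);
        return env;
    }
};
```

Feeding a keyed tone through this detector yields a slowly varying signal that rises during "key down" and decays during "key up", which is what the stimulus generator needs.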
<br />
At each level of the hierarchy the corresponding cockroach responds to the electrical stimulus and, based on a learned reaction, emits either a "dit" or a "dah" response. These "dit" and "dah" reactions are collected from 4 analog channels by the Arduino's 10-bit ADC and organized as a sequence. Once the complete character has been received, the Arduino performs a simple "best matching unit" lookup and sends the matched letter over Bluetooth using the serial interface. <br />
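The "best matching unit" lookup at the end is essentially a dit/dah-sequence-to-letter table. A minimal sketch (the table entries are standard International Morse; the function name is illustrative):

```cpp
#include <map>
#include <string>

// Map a dit/dah sequence (e.g. ".-") to its letter; '?' if no match.
char bestMatchingUnit(const std::string& pattern) {
    static const std::map<std::string, char> table = {
        {".-", 'A'},   {"-...", 'B'}, {"-.-.", 'C'}, {"-..", 'D'},
        {".", 'E'},    {"..-.", 'F'}, {"--.", 'G'},  {"....", 'H'},
        {"..", 'I'},   {".---", 'J'}, {"-.-", 'K'},  {".-..", 'L'},
        {"--", 'M'},   {"-.", 'N'},   {"---", 'O'},  {".--.", 'P'},
        {"--.-", 'Q'}, {".-.", 'R'},  {"...", 'S'},  {"-", 'T'},
        {"..-", 'U'},  {"...-", 'V'}, {".--", 'W'},  {"-..-", 'X'},
        {"-.--", 'Y'}, {"--..", 'Z'}
    };
    auto it = table.find(pattern);
    return it != table.end() ? it->second : '?';
}
```

On the Pro Mini the table would of course live in flash (PROGMEM) rather than a `std::map`, given the 2 KB RAM budget mentioned below.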
<br />
Implementing this scheme on the Arduino Pro Mini took some effort, as the available RAM is only 2 kilobytes. After about 2 weeks of coding I managed to squeeze all the functionality in and still have some 343 bytes of RAM free.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5-rt4MFBqv1fNpV5lyPujg2slm7HsMRL3dwmVxphs6zneRhldJjXk4FkGnz8UwEYaHua0n6ySAFcFIJ8QWJj39wNykvLoGi7T-Jaw9rjnGuXjXqxbAuoSEKdNGUXJZQ8zQaP5Tmo23Ps/s1600/Slide2.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5-rt4MFBqv1fNpV5lyPujg2slm7HsMRL3dwmVxphs6zneRhldJjXk4FkGnz8UwEYaHua0n6ySAFcFIJ8QWJj39wNykvLoGi7T-Jaw9rjnGuXjXqxbAuoSEKdNGUXJZQ8zQaP5Tmo23Ps/s1600/Slide2.jpg" height="481" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 5. Morse decoding schema using cockroach hierarchy</td></tr>
</tbody></table>
<br />
<h3>
EXPERIMENTAL RESULTS </h3>
<br />
I ran 20 hours of tests using the El-bug Morse decoder. I compared the character error rate (CER) against the signal-to-noise ratio (SNR) of the audio files with <a href="http://ag1le.blogspot.com/2014/06/new-morse-decoder-part-4.html" target="_blank">the previous results</a> achieved using the Bayesian Morse decoder. The results are shown in Figure 6 below. I had to stop the experiment after 20 hours as 2 of the 4 cockroaches got tired of the constant stimulation. They seem to have a maximum decoding rate of 100 words per minute. <br />
<br />
Surprisingly, the decoding accuracy of the El-bug system appears to be considerably better than my previous records. With a decent signal-to-noise ratio (> 12 dB @500Hz) the decoding accuracy approaches 99%. Even at lower SNR values the El-bug outperforms every machine learning algorithm that I have tested so far. <br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEK4HynUMcSs0dqs2OG0mYw7ffZLYk8l7wBl3zjphBeFp5EPhOqDCirVHzkDsv8XOZFD7pzoRWqyWlVuhmAoQD82dtLHGdK9khUZPUm2jJazTTxyaxnk7MpMmUD5OQYq1jkHElU1vnSeY/s1600/Slide3.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEK4HynUMcSs0dqs2OG0mYw7ffZLYk8l7wBl3zjphBeFp5EPhOqDCirVHzkDsv8XOZFD7pzoRWqyWlVuhmAoQD82dtLHGdK9khUZPUm2jJazTTxyaxnk7MpMmUD5OQYq1jkHElU1vnSeY/s1600/Slide3.jpg" height="481" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 6. Experimental CER vs. SNR results </td></tr>
</tbody></table>
<br />
For the next version I am looking into integrating the electronic circuitry into a smaller form factor, something similar to Figure 7 below. This <a href="http://www.engr.ncsu.edu/magazine/spring-summer2013/cockroaches.php" target="_blank">source</a> provides additional inspiration to pursue this project further. <br />
<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="http://www.engr.ncsu.edu/magazine/media/images/cockroach.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="http://www.engr.ncsu.edu/magazine/media/images/cockroach.png" height="411" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 7. Portable El-bug Morse Decoder </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<h3>
CONCLUSIONS </h3>
If the reader has had the patience to follow this story this far, I must congratulate you. You have amazing neural circuitry in your brain that is able to absorb this amount of information and form an opinion about what is being presented to you. You may have already realized that this story may be pure imagination with no connection to reality whatsoever. <br />
<br />
<br />
Happy April 1, 2015! <br />
<br />
Mauri AG1LE <br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-69783940537802627282015-01-02T21:40:00.003-05:002015-01-02T21:44:29.748-05:00Morse Learning Machine v1 Challenge<h2>
Morse Learning Machine v1 Challenge Results</h2>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbZJKHLN_VXW-UlwWWzFiH52m7iAKvi76sEv0t5uUbp8EA2jny7VG6Y5UHfKS6OQhe-AInMDE1Shy-bSnuLYwdvY9k2GSwwT57_iR3Lj1wX_Urg7kkiq_9PNOtWeN2412rXoswvNJf5-o/s1600/morse76_76.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img alt="MLMv1" border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbZJKHLN_VXW-UlwWWzFiH52m7iAKvi76sEv0t5uUbp8EA2jny7VG6Y5UHfKS6OQhe-AInMDE1Shy-bSnuLYwdvY9k2GSwwT57_iR3Lj1wX_Urg7kkiq_9PNOtWeN2412rXoswvNJf5-o/s1600/morse76_76.png" height="200" title="Morse Learning Machine Challenge" width="200" /></a></div>
<a href="http://ag1le.blogspot.com/2014/09/morse-learning-machine-challenge.html" target="_blank">Morse Learning Machine v1 Challenge</a> (MLMv1) is officially finished. <a href="https://inclass.kaggle.com/c/morse-challenge" target="_blank">This challenge</a> was hosted by Kaggle and it created much more interest than I expected. There was <a href="http://www.eham.net/ehamforum/smf/index.php?topic=98713.0" target="_blank">active discussion in eham.net</a> in the CW forum, as well as on Reddit <a href="http://www.reddit.com/r/morse/comments/2fey6z/morse_learning_machine_challenge/" target="_blank">here</a> and <a href="http://www.reddit.com/r/MachineLearning/comments/2fi18j/morse_learning_machine_challenge/" target="_blank">here</a>.<br />
The challenge made it to the <a href="http://www.arrl.org/news/morse-learning-machine-challenge-catching-on-with-hams" target="_blank">ARRL headline news</a> in September. <a href="https://www.google.com/search?q=%22morse+learning+machine%22" target="_blank">Google search</a> gives 1030 hits as different sites and bloggers worldwide picked up the news.<br />
<br />
The goal of this competition was to build a machine that learns how to decode audio files containing Morse code. To make it easier to get started I provided a sample Python Morse decoder and sample submission files.<br />
<br />
For humans it takes many months of effort to learn Morse code, and after years of practice the most proficient operators can decode Morse code at up to 60 words per minute or even beyond. Humans also have an extraordinary ability to quickly adapt to varying conditions, speed and rhythm. We wanted to find out if it is possible to create a machine learning algorithm that exceeds human performance and adaptability in Morse decoding.<br />
<br />
A total of 11 teams and 13 participants competed for almost 4 months for the perfect score of 0.0 (meaning no errors in decoding the <a href="https://inclass.kaggle.com/c/morse-challenge/data" target="_blank">sample audio files</a>). During the competition there were active discussions in the <a href="https://inclass.kaggle.com/c/morse-challenge/forums" target="_blank">Kaggle forum</a> where participants shared their ideas, asked questions and also got some help from the organizer (<a href="https://www.kaggle.com/users/104160/ag1le" target="_blank">ag1le</a>, aka myself).<br />
<br />
The <a href="https://inclass.kaggle.com/c/morse-challenge/details/evaluation" target="_blank">evaluation</a> was done by the Kaggle platform based on submissions that the participants uploaded. Levenshtein distance was used as the evaluation metric to compare predicted results to the corresponding lines in a truth value file that was hidden from the participants.<br />
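For readers unfamiliar with the metric: Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. A minimal implementation of the classic dynamic-programming recurrence (the challenge shipped its own levenshtein.py, listed in the download table below):

```python
def levenshtein(a, b):
    """Edit distance between strings a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

print(levenshtein("PARIS", "PARIS"))    # 0: a perfect decode
print(levenshtein("kitten", "sitting")) # 3
```

A submission scoring 0.0 therefore means every predicted line matched its truth line exactly.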
<br />
<h2>
Announcing the winners</h2>
According to the <a href="https://inclass.kaggle.com/c/morse-challenge/rules" target="_blank">challenge rules</a> I asked participants to make their submissions available as <a href="https://www.gnu.org/licenses/quick-guide-gplv3.html" target="_blank">open source with GPL v3</a> license or later to enable further development of machine learning algorithms. Resulting new Morse decoding algorithms, source code and supporting files are uploaded in <a href="https://github.com/Morse-Learning-Machine-Challenge/MLMv1" target="_blank">Github repository</a> by the MLMv1 participants.<br />
<div>
<br /></div>
I also asked the winners to provide a brief description about themselves, methods & tools used and any insights and takeaways from this competition.<br />
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;"><br /></strong>
<br />
<h3>
<span style="background-color: white; color: #222222;"><span style="font-family: inherit; font-size: large;">BrightMinds team: <a href="https://www.kaggle.com/users/238287/composman" target="_blank">Evgeniy Tyurin</a> and <a href="https://www.kaggle.com/users/238283/kdrobnyh" target="_blank">Klim Drobnyh</a></span></span></h3>
<a href="https://inclass.kaggle.com/c/morse-challenge/leaderboard/public" target="_blank">Public leaderboard score</a>: 0.0<br />
<a href="https://inclass.kaggle.com/c/morse-challenge/leaderboard/private" target="_blank">Private leaderboard score</a>: 0.0<br />
Source code & supporting files (waiting for posting by team)<br />
<div>
<br /></div>
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">What was your background prior to entering this challenge?</strong><br />
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;">We have been studying machine learning for 3 years. Our interests have been different until now, but there are several areas we share experience in, such as image processing, computer vision and applied statistics.</span></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;"><br /></strong></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">What made you decide to enter?</strong></div>
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;">Audio processing is a new and exciting field of computer science for us. We wanted to consolidate our expertise in our first joint machine learning project.</span></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;"><br /></strong></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">What preprocessing and supervised learning methods did you use?</strong></div>
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;">At first we tried to use Fourier transform to get robust features and train supervised machine learning algorithm. But then we would have had extremely large train dataset to work with. That was the reason to change the approach in favour of simple statistical tests.</span></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;"><br /></strong></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">What was your most important insight into the data?</strong></div>
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;">Our solution relies on the way the data was generated. So, observing the regularity in the data was this very insight that influenced the most.</span></div>
<div>
</div>
<div>
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">Were you surprised by any of your insights?</strong></div>
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;">Actually, we expected that the data would be real. For example, recorded live from radio.</span></div>
<div>
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;"><br /></strong></div>
<div>
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">Which tools did you use?</strong></div>
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;">Our code was written in Python. We used numpy and scipy to calculate values of normal c</span><span style="color: #333333; font-family: 'Open Sans', sans-serif; font-size: 13px; line-height: 19px;">umulative density function.</span></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;"><br /></strong></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<strong style="background-color: transparent; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 21.9999923706055px;">What have you taken away from this competition?</strong></div>
<div>
<span style="background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px;"> We gained great experience in audio signal processing and the applicability of machine learning approaches.</span></div>
<div style="-webkit-text-stroke-width: 0px; background-color: white; color: #222222; font-family: arial, sans-serif; font-size: 13px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px;">
<br /></div>
<br />
<h3>
<b><a href="https://www.kaggle.com/users/184612/dr-acid" target="_blank"><span style="font-size: large;">Tobias Lampert</span></a></b></h3>
<a href="https://inclass.kaggle.com/c/morse-challenge/leaderboard/public" target="_blank">Public leaderboard score</a>: 0.02<br />
<a href="https://inclass.kaggle.com/c/morse-challenge/leaderboard/private" target="_blank">Private leaderboard score</a>: 0.12<br />
<a href="https://github.com/Morse-Learning-Machine-Challenge/MLMv1/tree/master/acidtobi" target="_blank">Source code & supporting files from Tobias</a><br />
<br />
<div>
<div>
<span class="im" style="color: #500050;"><strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">What was your background prior to entering this challenge?</strong></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
I do not have a college degree, but I have 16 years of professional experience developing software in a variety of languages, mainly in the financial sector. I have been interested in machine learning for quite some time but seriously started getting into the topic after completing Andrew Ng's machine learning Coursera class about half a year ago.</div>
<span class="im" style="color: #500050;"></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span class="im" style="color: #500050;"><br clear="none" /></span></div>
<span class="im" style="color: #500050;">
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">What made you decide to enter?</strong></span></div>
<div>
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;"></strong></div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
Unlike most other Kaggle competitions, the raw data is very comprehensible and not just "raw numbers" - after all morse code audio can easily be decoded by humans with some experience. So to me it looked like an easy and fun exercise to write a program for the task. In the beginning this proved to be true and as expected I achieved decent results with relatively little work. However the chance of finding the perfect solution kept me trying hard until the very end!</div>
<span class="im" style="color: #500050;"></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span class="im" style="color: #500050;"><br clear="none" /></span></div>
<span class="im" style="color: #500050;">
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">What preprocessing and supervised learning methods did you use?</strong></span></div>
<div>
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;"></strong><br />
<div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
Preprocessing:<br />
- Butterworth filter for initial computation of dit length</div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
- FFT to transform data from time to frequency domain</div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
- PCA to reduce dimensionality to just one dimension<br />
Unsupervised learning:</div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
- K-Means to generate clusters of peaks and troughs<br />
Supervised learning:<br />
<div style="border: 0px none; margin: 0px; padding: 0px;">
- Neural network for morse code denoising</div>
</div>
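As an editor's illustration of the chunk-and-cluster approach described in this interview (not the participant's actual code, which is linked above), here is a toy numpy-only sketch: FFT magnitude per chunk, reduced to one number per chunk (dominant-bin energy stands in for the PCA projection), then split into "tone" and "silence" clusters with a tiny two-means loop:

```python
import numpy as np

def chunk_energies(audio, chunk):
    """Energy of the strongest FFT bin for each fixed-size chunk."""
    chunks = audio[:len(audio) // chunk * chunk].reshape(-1, chunk)
    spectra = np.abs(np.fft.rfft(chunks, axis=1))
    return spectra.max(axis=1)

def two_means(x, iters=20):
    """1-D k-means with k=2: True = chunk belongs to the louder cluster."""
    lo, hi = x.min(), x.max()
    for _ in range(iters):
        labels = np.abs(x - lo) > np.abs(x - hi)   # nearer the loud center
        lo, hi = x[~labels].mean(), x[labels].mean()
    return labels

fs, chunk = 8000, 400                 # 0.05 s chunks
t = np.arange(fs) / fs
keying = ((t % 0.2) < 0.1).astype(float)  # tone on/off every 0.1 s
audio = np.sin(2 * np.pi * 600 * t) * keying
labels = two_means(chunk_energies(audio, chunk))
print(labels.astype(int))  # repeating 1 1 0 0 pattern
```

With chunk boundaries aligned to the dit length, each chunk is fully "signal" or fully "no signal", which is exactly the structure the interview describes exploiting.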
<span class="im" style="color: #500050;"></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span class="im" style="color: #500050;"><br clear="none" /></span></div>
<span class="im" style="color: #500050;">
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">What was your most important insight into the data?</strong></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
The most important insight was probably the fact that all files have a structure which can be heavily exploited - due to the fact the pauses at the beginning and end have the same length in all files and wpm is constant the exact length of one dit can be computed. Using this information, the files can be cut into chunks that fully contain either signal or no signal making further analysis much easier.</div>
<span class="im" style="color: #500050;"></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
</div>
<span class="im" style="color: #500050;">
</span>
<br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span class="im" style="color: #500050;"><br clear="none" /></span></div>
<span class="im" style="color: #500050;">
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">Were you surprised by any of your insights?</strong></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
What surprised me most is that after trying several supervised learning methods like neural networks and SVMs with varying success, a very simple approach using an unsupervised method (K-Means) yielded the best results.</div>
<span class="im" style="color: #500050;"></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span class="im" style="color: #500050;"><br clear="none" /></span></div>
<span class="im" style="color: #500050;">
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">Which tools did you use?</strong></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
I started with R for some quick tests but switched to Python with scikit-learn very early, additionally I used the ffnet module for neural networks. To get a better grasp on the data, I did a lot of charting using matplotlib.</div>
<span class="im" style="color: #500050;"></span><br />
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span class="im" style="color: #500050;"><br clear="none" /></span></div>
<span class="im" style="color: #500050;">
<strong style="color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px;">What have you taken away from this competition?</strong></span></div>
<div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<span style="background-color: white;">First of all obviously I learned a lot about how morse code works, how it can be represented mathematically and which patterns random morse texts always have in common. I also deepened my knowledge about signal processing and filtering, even though in the end this only played a minor role in my solution. Like all Kaggle competitions, trying to make sense of data, competing with others and discussing solution approaches was great fun!</span></div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<br /></div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<br /></div>
<div style="border: 0px none; color: #383838; font-family: gotham, helvetica, arial, sans-serif; font-size: 14px; line-height: 22px; margin: 0px; padding: 0px;">
<h2>
Observations & Conclusions</h2>
<div>
I asked for advice in the <a href="http://kaggle.a.ssl.fastly.net/forums/t/10219/how-to-promote-kaggle-competitions" target="_blank">Kaggle forum</a> on how to promote the competition and attract participants. I tried to encourage people to join the challenge during the first 2 weeks by posting frequent forum updates. Based on download statistics (see the table below) the participation rate of this challenge was roughly 11%, as there were 13 actual participants and 120 unique users who downloaded the audio files. I don't know if this is typical of Kaggle competitions, but there were certainly many more interested people than actual participants. </div>
<div style="line-height: 100%; margin-bottom: 0in;">
<br /></div>
<table cellpadding="2" cellspacing="2">
<tbody>
<tr bgcolor="#cacaca">
<th style="border: none; padding: 0in;"><div align="left">
<span style="color: black;"><b>Filename</b></span></div>
</th>
<th style="border: none; padding: 0in;"><div align="left">
<span style="color: black;"><b>Size</b></span></div>
</th>
<th style="border: none; padding: 0in;"><div align="left">
<span style="color: black;"><b>Unique Users</b></span></div>
</th>
<th style="border: none; padding: 0in;"><div align="left">
<span style="color: black;"><b>Total Downloads</b></span></div>
</th>
<th style="border: none; padding: 0in;"><div align="left">
<span style="color: black;"><b>Bandwidth Used</b></span></div>
</th>
</tr>
<tr bgcolor="#fafafa">
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">sampleSubmission.csv</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">2.02 KB</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">126</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">226</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">254.21 KB</span></div>
</td>
</tr>
<tr>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">levenshtein.py</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">1.56 KB</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">83</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">120</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">129.69 KB</span></div>
</td>
</tr>
<tr bgcolor="#fafafa">
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">morse.py</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">12.69 KB</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">126</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">196</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">1.56 MB</span></div>
</td>
</tr>
<tr>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">audio_fixed.zip</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">65.91 MB</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">120</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">179</span></div>
</td>
<td style="border: none; padding: 0in;"><div align="left">
<span style="color: #444444;">7.72 GB</span></div>
</td>
</tr>
</tbody></table>
</div>
</div>
</div>
<br />
<div>
I also ran a short informal survey among the participants in preparation for the MLM v2 challenge. Here are some examples from the survey:<br />
<br />
<i>Q: Why did you decide to participate MLM v1 challenge?</i><br />
<br />
<table border="1" cellpadding="0" cellspacing="0" dir="ltr" style="border-collapse: collapse; border: 1px solid #ccc; font-family: arial,sans,sans-serif; font-size: 13px; table-layout: fixed;"><colgroup><col width="461"></col></colgroup><tbody>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"I have a background in signal processing and it seemed like a good way to refresh my memory. "]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">I have a background in signal processing and it seemed like a good way to refresh my memory. </td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"By participation I tried to strengthen my Computer Science knowledge. At the university I am attending Machine Learning & Statistics course, so the challenge can help me practice."]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">By participation I tried to strengthen my Computer Science knowledge. At the university I am attending Machine Learning & Statistics course, so the challenge can help me practice.</td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"You were enthusiastic about it and it seemed fun/challenging/new"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">You were enthusiastic about it and it seemed fun/challenging/new</td></tr>
</tbody></table>
</div>
<i>Q: How could we make the MLM v2 challenge more enjoyable?
</i><br />
<br />
<table border="1" cellpadding="0" cellspacing="0" dir="ltr" style="border-collapse: collapse; border: 1px solid #ccc; font-family: arial,sans,sans-serif; font-size: 13px; table-layout: fixed;"><colgroup><col width="494"></col></colgroup><tbody>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"More realistic problem setting. "]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">More realistic problem setting. </td></tr>
<tr style="height: 21px;"><td data-sheets-value="[null,2,"The task is interesting and fun by itself, but the details may vary, so making a challenge with accent on different aspects of the task would be exciting"]" style="padding: 2px 3px 2px 3px; vertical-align: bottom;">The task is interesting and fun by itself, but the details may vary, so making a challenge with accent on different aspects of the task would be exciting</td></tr>
</tbody></table>
<br />
<div>
<br /></div>
<div>
In MLM v1, all 200 audio files had a very similar structure: 20 random characters each, without spaces. Participants were able to leverage this structure to overcome the poor signal-to-noise ratio in many files. I was surprised to get such good decoding results given that many audio files had only -12 dB SNR.<br />
<br />
My conclusion for the next MLM v2 challenge is that I should provide real-world, recorded Morse signals and reduce the impact of audio file structure. Also, to make the challenge more realistic I need to incorporate RF propagation effects in the audio files. I could also stretch the SNR down to the -15 ... -20 dB range, making it harder to guess the correct answers.<br />
<br />
<br /></div>
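For reference, mixing noise into a clean Morse recording at a chosen SNR takes only a few lines. Below is a minimal, illustrative Python sketch (the function name and this pure-Python implementation are my own, not part of the challenge tools): it scales a noise buffer so the mix hits the requested dB figure.

```python
import math

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so the mix of signal + noise has the requested
    signal-to-noise ratio (in dB), then return the sample-wise mix."""
    p_signal = sum(s * s for s in signal) / len(signal)   # mean signal power
    p_noise = sum(n * n for n in noise) / len(noise)      # mean noise power
    target = p_signal / (10 ** (snr_db / 10.0))           # wanted noise power
    scale = math.sqrt(target / p_noise)
    return [s + scale * n for s, n in zip(signal, noise)]
```

Calling this with snr_db set to -15 or -20 would produce test files in the harder range discussed above.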
<div>
<br /></div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-58442929247992502002014-11-16T21:13:00.000-05:002014-11-25T20:53:16.114-05:00High Resolution, Low Power Temperature Logger using ArduinoI am working on a new project with the goal of developing a miniaturized, high-resolution temperature logger device. The objective is to make it small enough to be a wearable device that measures temperature with better than 0.1 °C resolution and stores the measured data on a memory card for plotting and analysis. The logger needs to run on a rechargeable battery for several weeks and must be comfortable enough to wear 24x7.<br />
<br />
I used the TMP102, a <a href="https://www.sparkfun.com/datasheets/Sensors/Temperature/tmp102.pdf" target="_blank">digital sensor manufactured by Texas Instruments</a> with an I2C, a.k.a. two-wire interface (TWI), and the following features:<br />
<br />
<ul>
<li>12-bit, 0.0625°C resolution</li>
<li>Accuracy: 0.5°C (-25°C to +85°C)</li>
<li>Low quiescent current</li>
<li>10µA Active (max)</li>
<li>1µA Shutdown (max)</li>
<li>1.4V to 3.6VDC supply range</li>
<li>Two-wire serial interface</li>
</ul>
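The sensor returns its reading as two bytes over I2C; converting them to degrees is a matter of assembling the 12-bit value and multiplying by the 0.0625 °C step. A minimal Python sketch of that conversion (the function name is my own; the actual logger does this in Arduino C):

```python
def tmp102_raw_to_celsius(msb, lsb):
    """Convert the two TMP102 temperature register bytes to Celsius.

    The 12-bit reading occupies the MSB plus the top 4 bits of the LSB;
    one LSB step is 0.0625 degrees C, and negative temperatures are in
    two's complement form.
    """
    raw = (msb << 4) | (lsb >> 4)    # assemble the 12-bit value
    if raw & 0x800:                  # sign bit set -> negative reading
        raw -= 1 << 12
    return raw * 0.0625

# 0x19, 0x00 -> 25.0 C; 0xE7, 0x00 -> -25.0 C
```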
<br />
This is a very handy high resolution sensor that requires a very low current to operate.<br />
<br />
There are quite a few data logger boards available, but only a few that are fully hackable and small in size. I ended up selecting the Sparkfun OpenLog board, which looked small enough and had all software, firmware and hardware designs available as open source. The only small problem was the I2C (TWI) interface signals SCL and SDA: I had to solder wires directly to the ATmega328 CPU pins, as they are not available on an external interface. Soldering to surface-mounted components requires a steady hand and a small soldering tip on your iron. Having a 20x optical microscope also helps. <br />
<h3>
Hardware components</h3>
The components for this project are listed below with links and cost at the time of writing this article. Most of these are available at Sparkfun.<br />
<ol>
<li><a href="https://www.sparkfun.com/products/9530" target="_blank">Sparkfun OpenLog</a> $24.95</li>
<li><a href="https://www.sparkfun.com/products/11931" target="_blank">Digital Temperature Sensor Breakout - TMP102</a> $5.96</li>
<li><a href="https://www.sparkfun.com/products/12722" target="_blank">Lithium Polymer USB Charger and Battery</a> $24.95</li>
<li><a href="http://www.amazon.com/gp/product/B00IVPU7KE/ref=oh_aui_detailpage_o03_s00?ie=UTF8&psc=1" target="_blank">Samsung 16GB EVO Class 10 Micro SDHC</a> $10.99</li>
<li>Enclosure - 3D printed $9.50 </li>
</ol>
The total cost of components was $76.35.
<br />
<br />
<h3>
Software Development </h3>
You can download the latest integrated software development environment from the <a href="http://arduino.cc/en/Main/Software" target="_blank">Arduino IDE page</a>. I used Arduino 1.5.8 with the following configuration:<br />
<br />
<br />
<ul>
<li>Tools/Board - Arduino Uno</li>
<li>Tools/Port - /dev/ttyUSB0</li>
</ul>
<br />
<br />
To connect my Lenovo X301 laptop running Linux Mint 17 operating system I used <a href="https://www.sparkfun.com/products/11736" target="_blank">FT231X Breakout</a> with a <a href="https://www.sparkfun.com/products/10660" target="_blank">Crossover Breakout for FTDI</a>. To make it easier to connect I soldered <a href="https://www.sparkfun.com/products/9429" target="_blank">Header - 6-pin Female (0.1", Right Angle)</a> and<br />
<a href="https://www.sparkfun.com/products/553" target="_blank">Break Away Male Headers - Right Angle</a> connectors for these small breakout boards.<br />
<br />
My focus in software development was to reuse the existing OpenLog software, implement temperature measurement over the I2C bus, and minimize battery consumption by shutting down power between measurements. This software is still a work in progress, but even with this version (v3) the average current consumption is about 200 µA at 3.3 V. During a measurement over the I2C bus and while storing the results to the microSD card, the current peaks at 25 mA for a few milliseconds.<br />
<br />
The software makes a measurement every 5000 milliseconds. Since this design does not have an RTC chip, timekeeping is done in software. The software has a routine to compensate for timing errors, but it is not very accurate yet; it meets my data logging needs for now. For a final product it might be useful to have an RTC chip built into the design.<br />
<br />
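The idea behind such drift compensation can be sketched as scheduling each wake-up against an absolute deadline instead of sleeping a fixed interval, so the time spent on the I2C read and SD write is not added to every cycle. A simplified Python illustration (the real routine lives in the Arduino logger code; these names are my own):

```python
INTERVAL_MS = 5000  # measurement period used by the logger

def next_deadline(start_ms, now_ms, interval_ms=INTERVAL_MS):
    """Next absolute wake-up time: a whole number of intervals after
    start_ms, so per-cycle overhead does not accumulate as drift."""
    elapsed = now_ms - start_ms
    periods = elapsed // interval_ms + 1
    return start_ms + periods * interval_ms
```

Even if one cycle overruns by a few milliseconds, the following deadline still lands on the original 5-second grid.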
A fully charged 3.7 V 850 mAh LiPo battery is estimated to power this data logger for over 170 days, though I haven't tested it for longer than 15 days so far.<br />
<br />
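The 170-day figure follows directly from the capacity and the average draw; a quick back-of-the-envelope check (ignoring LiPo self-discharge and usable-capacity derating):

```python
BATTERY_MAH = 850       # LiPo capacity
AVG_CURRENT_MA = 0.2    # measured average draw, about 200 uA

hours = BATTERY_MAH / AVG_CURRENT_MA   # 4250 hours
days = hours / 24                      # about 177 days
```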
The latest <a href="https://github.com/ag1le/logger" target="_blank">logger software</a> is posted in Github.<br />
<h3>
3D Printed Enclosure </h3>
I could not find a suitable small enclosure, so I started using the <a href="http://www.openscad.org/" target="_blank">OpenSCAD software</a> and built a 3D model based on <a href="http://www.thingiverse.com/thing:43899/#files" target="_blank">some examples I found on Thingiverse</a>. OpenSCAD lets you describe 3D objects in a simple language, so it was quite easy to experiment with different designs and view them before creating the .STL file required for 3D printing.<br />
<br />
The enclosure size was determined mostly by the LiPo battery dimensions. With a smaller battery it would be possible to make a much smaller enclosure. Also, it would be possible to design a single circuit board with all the required components and make it smaller. Since this project is still in concept phase I chose to use the selected components and 3D print an enclosure where I can fit them in.<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxKweDhCQi_4Tz95QD6u6LTfn0EV1c5h4qwo2Vc7etpOgEPd_AUxhQpCLlwQhVx3zu7jkWrHFf28RLEX3OGJlLfTsFOcTylw8-qukhz-pxdoVZwhGmZ_ErN13mzEuogcvUt_L0JcJTppM/s1600/enclosure4b.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxKweDhCQi_4Tz95QD6u6LTfn0EV1c5h4qwo2Vc7etpOgEPd_AUxhQpCLlwQhVx3zu7jkWrHFf28RLEX3OGJlLfTsFOcTylw8-qukhz-pxdoVZwhGmZ_ErN13mzEuogcvUt_L0JcJTppM/s1600/enclosure4b.png" height="338" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">3D model of enclosure</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
I sent the .STL file to a local company with some email instructions. In my first version the dimensions were in 1/8" units - this was not clear, so I had to call them to clarify. In the second version I used millimeters as units, and the enclosure came out fabricated as I expected. The enclosure is big enough for the LiPo battery, and it also has a small indentation at the bottom to get the TMP102 sensor chip closer to the outside surface. The wall thickness was set to 1.0 mm, and the sensor chip is only 0.5 mm from the bottom surface.<br />
<br />
The cover was dimensioned to fit tightly on the top. <br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrYfy5PrU6a3vEt_RPnXnFtkGdXjx-ota74I4jJYlBC0QQk8z6ZMzd7HZcYo7rgKwIIE0EGcZZsUli-IcFxxMR9gFTMNux0sXKxJvGeDNss4vjJE2eubTmzJaSLV_n8WFh711SiKBjO9E/s1600/enclosure_v2.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrYfy5PrU6a3vEt_RPnXnFtkGdXjx-ota74I4jJYlBC0QQk8z6ZMzd7HZcYo7rgKwIIE0EGcZZsUli-IcFxxMR9gFTMNux0sXKxJvGeDNss4vjJE2eubTmzJaSLV_n8WFh711SiKBjO9E/s1600/enclosure_v2.jpg" height="474" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">3D printed enclosure - top view</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
The enclosure also has fixtures on both sides for attaching a standard 22 mm wristband using spring bars.<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsrjI8NoXsKoq4araGxIsdZc_xewaMYRp7hCRyoDKrnR1W0u2qvZ9CuoAJaV3d16MgXuqR6MPanoRhLziS7B9l6gkfJva1hQg4X1lK3CY0I8NtGvMY9CpJQeQEW7rrrQ37F6xtLsvw5mg/s1600/enclosure_v2b.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjsrjI8NoXsKoq4araGxIsdZc_xewaMYRp7hCRyoDKrnR1W0u2qvZ9CuoAJaV3d16MgXuqR6MPanoRhLziS7B9l6gkfJva1hQg4X1lK3CY0I8NtGvMY9CpJQeQEW7rrrQ37F6xtLsvw5mg/s1600/enclosure_v2b.jpg" height="308" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">3D printed enclosure - side view </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
Below is a photograph of the 3D-printed enclosure (model v1) with wristbands attached.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfewflEiA3H8jKxknbWeWD2792LEEZclpM8QiUDvEF9HWjsLdIDOFSrSp7eY1rfRulVMM3WtxtVxM5npauGUmx2lTX93xRqcCLVesg3oN7wvQk-Wx_IC_j38hGS-56y_JY8n4aiWFO-fk/s1600/enclosure_v1_wristband.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjfewflEiA3H8jKxknbWeWD2792LEEZclpM8QiUDvEF9HWjsLdIDOFSrSp7eY1rfRulVMM3WtxtVxM5npauGUmx2lTX93xRqcCLVesg3oN7wvQk-Wx_IC_j38hGS-56y_JY8n4aiWFO-fk/s1600/enclosure_v1_wristband.jpg" height="474" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">3D printed enclosure with wristbands attached</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
In the photo below the hardware components fit in perfectly, with the LiPo battery on top. Power is connected to the OpenLog board using the red and black wires.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXwSIyCVG1rn-wJdKpWHO2jvpB5fYH9RxRHhbyWtlSY61FCzhmsHvFPAZF9SeI-WbYvA9t8VOMCKXMCWncizxh2UaD6r24Wa3jc0LlSfL5qhUTJp6aFYs06luJLIXQFpGElQFmpq7AjXw/s1600/logger.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiXwSIyCVG1rn-wJdKpWHO2jvpB5fYH9RxRHhbyWtlSY61FCzhmsHvFPAZF9SeI-WbYvA9t8VOMCKXMCWncizxh2UaD6r24Wa3jc0LlSfL5qhUTJp6aFYs06luJLIXQFpGElQFmpq7AjXw/s1600/logger.jpg" height="474" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">3D printed enclosure with components inside</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
Measurement Results </h3>
I tested the temperature logger by moving it from room temperature (T0 = 22.6 °C) into a freezer (Tm = -21.0 °C).<br />
<br />
Based on the measurement results below (red graph), it took 54 minutes for the sensor to reach this temperature. The thermal mass of the sensor itself is small, but it was in a closed enclosure with the LiPo battery, which has a much larger thermal mass. I did the same test again, this time leaving the enclosure open so that the sensor was exposed to the freezer temperature from both sides. The blue graph below shows that the sensor reached -21 °C in 30 minutes.<br />
<br />
<br />
<div style="text-align: left;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwtPi43RwZa0F1J5PtMDJkg1oevFt2j1E5CvZW2F_OgCtHIm3VZeQlyPBUV88lVLua4qbLILPZ9w02yQbUK9ELJiDb_DbPwAhZyzB-hmhERrEsvW4WkjyPjv7_R6bDMFJ0Lax9NFiF0Xc/s1600/logger_freezer_test_open_and_closed.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwtPi43RwZa0F1J5PtMDJkg1oevFt2j1E5CvZW2F_OgCtHIm3VZeQlyPBUV88lVLua4qbLILPZ9w02yQbUK9ELJiDb_DbPwAhZyzB-hmhERrEsvW4WkjyPjv7_R6bDMFJ0Lax9NFiF0Xc/s1600/logger_freezer_test_open_and_closed.png" height="358" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Freezer Test</td></tr>
</tbody></table>
<br />
<br />
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<div>
<br />
<br />
<br />
<br />
<br />
The thermal time constant is defined as the time required by a sensor to reach 63.2% of a step change in temperature under a specified set of conditions.</div>
<div>
<br /></div>
<div>
The response of a sensor to a sudden change in the surrounding temperature is exponential and it is described by the following equation:</div>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDzIL2gvqa8BDKQjzNOWgHuCPShK7ifRlSrrl2Am0T277czAKoSdmphNPzfn2ASB5nBMAeGvzSvCED4E2LmDrBTWMQbBO8irKUpXl3qA_dzGIgxfqQMIhOVDuiWUW9KyiKlxo2kRxS6BI/s1600/theory1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiDzIL2gvqa8BDKQjzNOWgHuCPShK7ifRlSrrl2Am0T277czAKoSdmphNPzfn2ASB5nBMAeGvzSvCED4E2LmDrBTWMQbBO8irKUpXl3qA_dzGIgxfqQMIhOVDuiWUW9KyiKlxo2kRxS6BI/s1600/theory1.png" height="38" width="320" /></a></div>
<div>
<br /></div>
<div>
where T is the sensor temperature, Tm is the surrounding medium temperature, T0 is the initial sensor temperature, t is time and τ is the time constant.</div>
<div>
<br /></div>
<div>
Looking at the results above and fitting the measurement data to this model using the <a href="http://en.wikipedia.org/wiki/Root_mean_square" target="_blank">RMS error method</a>, we can estimate that</div>
<div>
<br /></div>
<div>
<div>
τ = 0.009732 for closed enclosure</div>
</div>
<div>
τ = 0.003132 for open enclosure</div>
<div>
<br /></div>
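The fit can be sketched as a simple grid search that minimizes RMS error between the first-order model and the measured samples. This is an illustrative Python version under my own assumptions (function names are mine, and I write the exponent as exp(-k·t), so the 63.2% time constant is 1/k in whatever unit t is expressed):

```python
import math

def model(t, t0, tm, k):
    """First-order response: T(t) = Tm + (T0 - Tm) * exp(-k * t)."""
    return tm + (t0 - tm) * math.exp(-k * t)

def fit_rate(times, temps, t0, tm, candidates):
    """Return the candidate rate constant k with the smallest RMS error
    against the measured (times, temps) samples."""
    def rms(k):
        err = sum((model(t, t0, tm, k) - y) ** 2
                  for t, y in zip(times, temps))
        return math.sqrt(err / len(times))
    return min(candidates, key=rms)
```

In practice a dense range of candidate values (or a proper nonlinear least-squares routine) would replace the short candidate list.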
A graph of the models vs. actuals is below. To estimate temperatures correctly, this thermal time constant needs to be taken into account.<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhitKDxQl5dL5aofZi8Ui9typ2VjUiqA5IDPAekcX8DiYP1ZTf22xD839CTUvqeo7qGwJPlMg0VjM0wyp-GGS5jcf2btW60qQptg4lizctXO3XzGDmTqmnEnBLZIcagVxDwJ9Nfx34dbq8/s1600/logger_freezer_test_open_and_closed_tau.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhitKDxQl5dL5aofZi8Ui9typ2VjUiqA5IDPAekcX8DiYP1ZTf22xD839CTUvqeo7qGwJPlMg0VjM0wyp-GGS5jcf2btW60qQptg4lizctXO3XzGDmTqmnEnBLZIcagVxDwJ9Nfx34dbq8/s1600/logger_freezer_test_open_and_closed_tau.png" height="396" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Model of thermal time constants <span style="font-size: xx-small; text-align: start;">τ</span></td></tr>
</tbody></table>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<div>
<br /></div>
<div>
The datasheet gives the following guidelines for maintaining accuracy.</div>
<div>
<br /></div>
<div>
<i><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">The temperature sensor in the TMP102 is the chip itself. Thermal paths run through the package leads, as well as the plastic package. The lower thermal</span></i></div>
<div>
<i><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">resistance of metal causes the leads to provide the primary thermal path.</span></i></div>
<div>
<i><span style="font-family: Courier New, Courier, monospace; font-size: x-small;">To maintain accuracy in applications requiring air or surface temperature measurement, care should be taken to isolate the package and leads from ambient air temperature. A thermally-conductive adhesive is helpful in achieving accurate surface temperature measurement.</span></i></div>
</div>
<div>
<br /></div>
<div>
One improvement would be to apply thermally-conductive materials between the sensor chip and the bottom surface of the enclosure. A biologically inert metal like gold or titanium might provide a better thermal time constant. Some insulation might be needed to isolate the sensor from ambient air. </div>
<h3>
Conclusions </h3>
Temperature logging with modern, high-resolution, accurate digital sensors is quite easy. You need only a simple microcontroller and a few lines of code to build a data logger that is small enough to be wearable. <br />
<br />
Working with Arduinos is not only easy but also a lot of fun. With minimal investment you can build a powerful data logging device and add new sensors in an incremental fashion. Software development is also straightforward, and for most problems there are already open source examples available.<br />
<br />
The real challenges seem to be on the physics and mechanical enclosure design side. Building an enclosure for the sensor and electronics with a small thermal time constant is not trivial. For highly accurate wearable data logging devices you also need to consider many other topics, such as:<br />
<br />
<ul>
<li>physical appearance </li>
<li>hypoallergenic materials </li>
<li>fit and comfort for subjects of different sizes</li>
<li>thermal properties of materials</li>
<li>any variability in sensor contact with subject or ambient air </li>
</ul>
<br />
It would be interesting to see the thermal design details of popular fitness trackers that claim accurate body temperature measurements. In particular, for <a href="http://en.wikipedia.org/wiki/Basal_body_temperature" target="_blank">basal body temperature</a> measurements, where accuracy and high resolution are required, poor thermal design would impact results significantly.<br />
<br />
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-15324128394113991852014-09-03T20:32:00.000-04:002014-09-03T21:07:30.993-04:00Morse Learning Machine - Challenge<h3>
MACHINE LEARNING CHALLENGE</h3>
I was astonished to get an email acknowledgement that my <a href="https://inclass.kaggle.com/c/morse-challenge" target="_blank">Kaggle Morse Challenge</a> was approved today. I have spent the last few days preparing materials and editing the description and rules for this competition. <br />
<br />
<span style="background-color: yellow;"><b>The goal of this competition is to build a machine that learns how to decode audio files containing Morse code</b>. </span><br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCtaSUu6PRv6dMNKGvl1ilZYz94XrIeoZTvXJoYcLgTv8QgyRMLwfA-TuGmXuJdX-jFSPfbwjWoi45bj70S5Xx1PoqyTpPytVit6aUhNFgL0r8ZoYr0Mz3vjTjIJbwrBv3PcPeqzkXPD8/s1600/just_numbers.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCtaSUu6PRv6dMNKGvl1ilZYz94XrIeoZTvXJoYcLgTv8QgyRMLwfA-TuGmXuJdX-jFSPfbwjWoi45bj70S5Xx1PoqyTpPytVit6aUhNFgL0r8ZoYr0Mz3vjTjIJbwrBv3PcPeqzkXPD8/s1600/just_numbers.jpg" height="308" width="400" /></a></div>
<br />
For humans it takes many months of effort to learn Morse code, and after years of practice the most proficient operators can decode Morse code at up to 60 words per minute or even beyond. Humans also have an extraordinary ability to quickly adapt to varying conditions, speed and rhythm. <br />
<br />
I want to find out if it is possible to create a machine learning algorithm that exceeds human performance and adaptability in Morse decoding. I <a href="http://www.slideshare.net/mniinine/fldigi-neai-presentation" target="_blank">shared some of these ideas</a> at the <a href="http://www.meetup.com/intelligence/events/119478492/" target="_blank">New England Artificial Intelligence meetup</a> about a year ago and got enthusiastic feedback from the participants.<br />
<br />
<br />
<br />
<br />
<h3>
WHY KAGGLE? </h3>
<a href="http://en.wikipedia.org/wiki/Kaggle" target="_blank">Kaggle is a platform</a> for predictive modelling and analytics competitions on which companies and researchers post their data and statisticians and data miners from all over the world compete to produce the best models. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know at the outset which technique or analyst will be most effective. Kaggle aims at making data science a sport.<br />
<br />
Kaggle's community of data scientists comprises tens of thousands of PhDs from quantitative fields such as computer science, statistics, econometrics, maths and physics, and industries such as insurance, finance, science, and technology. They come from over 100 countries and 200 universities. In addition to the prize money and data, they use Kaggle to meet, learn, network and collaborate with experts from related fields.<br />
<br />
For the Morse Learning Machine competition I hope to attract people from the Kaggle community who are interested in solving new, difficult challenges using their predictive data modeling, computer science and machine learning expertise. For students this challenge provides a great opportunity to put theoretical concepts into practice and see how they can solve tough problems by applying knowledge gained in class rooms.<br />
<br />
<br />
<h3>
<b>COMPETITION DETAILS</b></h3>
<br />
During the competition, the participants build a learning system capable of decoding Morse code. To that end, they get development data consisting of 200 .WAV audio files containing short sequences of randomized Morse code. The data labels are provided for a training set so the participants can self-evaluate their systems. To evaluate their progress and compare themselves with others, they can submit their prediction results on-line to get immediate feedback. A real-time leaderboard shows participants their current standing based on their validation set predictions.<br />
<br />
I have also provided a <a href="https://inclass.kaggle.com/c/morse-challenge/data" target="_blank">sample Python Morse decoder</a> to make it easier to get started. While this software is a purely experimental version, it has some features of the <a href="http://www.w1hkj.com/FldigiHelp-3.21/html/cw_configuration_page.html" target="_blank">FLDIGI Morse decoder</a>, implemented in Python instead of C++.<br />
<br />
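The first processing step in any of these decoders is recovering the keying envelope from the audio. As an illustration only (this is not the sample decoder; the function name and parameters are my own), a crude envelope detector is just full-wave rectification followed by a moving average sized to a few tone periods:

```python
def envelope(samples, window=40):
    """Crude amplitude envelope: full-wave rectify, then smooth with a
    running average over `window` samples (a few tone periods)."""
    rect = [abs(s) for s in samples]
    out, acc = [], 0.0
    for i, r in enumerate(rect):
        acc += r
        if i >= window:
            acc -= rect[i - window]    # drop the sample leaving the window
        out.append(acc / min(i + 1, window))
    return out
```

Thresholding this envelope yields the on/off keying pattern from which dits, dahs and gaps can be timed.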
You can of course leverage the <a href="http://ag1le.blogspot.com/2014/07/new-morse-decoder-part-6.html" target="_blank">experimental multichannel CW decoder I recently implemented in FLDIGI</a> or the <a href="https://github.com/ag1le/morse-wip" target="_blank">standalone version of the Bayesian decoder</a> written in C++. There are also <a href="https://github.com/ag1le/morse.py" target="_blank">some new tools</a> I posted to GitHub.<br />
<br />
Please help me to spread this message to attract participants for the <a href="https://inclass.kaggle.com/c/morse-challenge" target="_blank">Morse Learning Machine challenge</a>!<br />
<br />
73<br />
Mauri AG1LE<br />
<br />
<br />
<br />
<div>
<br /></div>
<div>
<br /></div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com9tag:blogger.com,1999:blog-3326773214329183284.post-27198726063257087442014-08-24T20:00:00.004-04:002014-08-24T22:03:01.160-04:00Cortical Learning Algorithm for Morse code - part 1<h2>
Cortical Learning Algorithm Overview </h2>
Humans can perform many tasks that computers currently cannot do. For example, understanding spoken language in a noisy environment, walking down a path in complex terrain, or winning the CQWW WPX CW contest are tasks currently not feasible for computers (and might be difficult for humans, too).<br />
Despite decades of machine learning and artificial intelligence research, we have few viable algorithms for achieving human-like performance on a computer. Morse decoding at the best human performance level would be a good target for testing these new algorithms. <br />
<br />
<a href="http://numenta.com/" target="_blank">Numenta Inc.</a> has developed a technology called the <a href="http://numenta.org/cla.html" target="_blank">Cortical Learning Algorithm</a> (CLA), recently made available as the <a href="http://numenta.org/nupic.html" target="_blank">open source project NuPIC</a>. This software provides an online learning system that learns from every data point fed into it. The CLA is constantly making predictions, which are continually verified as more data arrives; as the underlying patterns in the data change, the CLA adjusts accordingly. The CLA uses Sparse Distributed Representations (SDRs) in a similar fashion to how the neocortex in the human brain stores information. SDRs have many advantages over traditional ways of storing memories, such as the ability to associate and retrieve information from noisy data. <br />
<br />
A detailed description of how the CLA works can be found in <a href="http://numenta.org/resources/HTM_CorticalLearningAlgorithms.pdf" target="_blank">this whitepaper</a>.<br />
<h2>
Experiment </h2>
To learn more about how the CLA works, I decided to start with a very simple experiment. I created a Python script that uses the Morse code book and calculates a Sparse Distributed Representation (SDR) for each character. Figure 1 below shows the Morse alphabet and the numbers 0...9 converted to SDRs.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh97TlpqgqVBihsFXcN-A2dCUMs8zwfHYKMlXwdhcFd7iJjk_WP4aQGsgEjGlfYo6jPr8aYjrsLfbJsOXf6zkE2LNszILF8d2q31GaFJZu1Sh-5YcOaqcQQmhoOoETCBEGdIqva_TjNBxU/s1600/HTM_CW_SDR.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh97TlpqgqVBihsFXcN-A2dCUMs8zwfHYKMlXwdhcFd7iJjk_WP4aQGsgEjGlfYo6jPr8aYjrsLfbJsOXf6zkE2LNszILF8d2q31GaFJZu1Sh-5YcOaqcQQmhoOoETCBEGdIqva_TjNBxU/s1600/HTM_CW_SDR.png" height="378" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. SDR for Morse characters A...Z, 0...9</td></tr>
</tbody></table>
<br />
<br />
NuPIC requires the input vector to be a binary representation of the input signal. I created a function that packs "dits" and "dahs" into a 36x1 vector; see the two examples below. Each "dit" is represented as 1. followed by 0., and each "dah" is represented by 1. 1. 1. followed by 0., to accommodate the 1:3 timing ratio between "dit" and "dah". This preserves <a href="http://en.wikipedia.org/wiki/Morse_code" target="_blank">the semantic structure of Morse code</a>, which is important from a sequence learning perspective.<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">H ....</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">[ 1. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]</span><br />
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">O ---</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">[ 1. 1. 1. 0. 1. 1. 1. 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0.</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]</span></div>
</div>
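The packing described above can be sketched as a small standalone function (the helper name <span style="font-family: Courier New, Courier, monospace;">encode_morse</span> is hypothetical, chosen for illustration; it is not part of the NuPIC script further below):

```python
import numpy as np

def encode_morse(elements, size=36):
    """Pack a Morse pattern such as '....' or '---' into a binary vector.

    Each dit becomes [1, 0] and each dah becomes [1, 1, 1, 0],
    preserving the 1:3 dit/dah timing ratio.
    """
    vec = np.zeros(size)
    i = 0
    for e in elements:
        if e == '.':
            vec[i] = 1        # dit: one "on" slot ...
            i += 2            # ... followed by one "off" slot
        elif e == '-':
            vec[i:i + 3] = 1  # dah: three "on" slots ...
            i += 4            # ... followed by one "off" slot
    return vec

# 'H' = four dits -> the first eight slots alternate 1 and 0
print(encode_morse('....')[:8])   # [1. 0. 1. 0. 1. 0. 1. 0.]
```

Running it for 'O' (<span style="font-family: Courier New, Courier, monospace;">encode_morse('---')</span>) reproduces the second example vector above.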
<br />
The Spatial Pooler uses a 64 x 64 vector, giving an SDR of size 4096. As the SDR is computed, a sparse, seemingly random set of bits becomes active on this vector. I plotted all active cells (value = 1) across columns 0...4096 for each letter and number, as displayed in Fig 1. above. The respective character is shown in the rightmost column.<br />
<br />
To better see the relationships between the SDRs and the Morse character set, I created another SDR map with the letters 'EISH5' and 'TMO0' next to each other. These consecutive characters differ from each other by only one "dit" or one "dah". See Fig 2. for an SDR visualization of these characters. <br />
<br />
There are no obvious visible patterns across these Morse characters; all values look quite different. On page 21 of the Numenta CLA whitepaper it says <i>"Imagine now that the input pattern changes. If only a few input bits change, some columns will receive a few more or a few less inputs in the “on” state, but the set of active columns will not likely change much. Thus similar input patterns (ones that have a significant number of active bits in common) will map to a relatively stable set of active columns."</i><br />
<br />
This doesn't seem to apply in these experiments, so I need to investigate a bit further.<br />
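One way to put the whitepaper's stability claim to a number is to count the active bits that two SDRs share. A minimal sketch, assuming the SDRs are available as binary NumPy vectors (as the script further below produces in <span style="font-family: Courier New, Courier, monospace;">activeArray</span>):

```python
import numpy as np

def sdr_overlap(a, b):
    """Number of active bits two binary SDR vectors have in common."""
    return int(np.sum(np.logical_and(a, b)))

# Toy 16-bit example: two SDRs sharing two of their three active bits
a = np.zeros(16); a[[1, 4, 9]] = 1
b = np.zeros(16); b[[1, 4, 12]] = 1
print(sdr_overlap(a, b))  # 2
```

Comparing the overlap of, say, E vs. I against E vs. O would show whether semantically close characters really map to more similar SDRs.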
<br />
<br />
<div style="text-align: left;">
</div>
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisrqNWBcouI3oACBWHgO2xSM1ZZeDqfhB_UuO0e5bxQ2NyPqD5GLEhCHIxRf8gAlzrHqgf5__ahamFYxeXqhgqHtahyphenhyphennVM7j8nkUO_wHkzNepTAerBb7CWUnY0bqG9aeqP4KO_nvfkzsY/s1600/HTM_CW_SDR1.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisrqNWBcouI3oACBWHgO2xSM1ZZeDqfhB_UuO0e5bxQ2NyPqD5GLEhCHIxRf8gAlzrHqgf5__ahamFYxeXqhgqHtahyphenhyphennVM7j8nkUO_wHkzNepTAerBb7CWUnY0bqG9aeqP4KO_nvfkzsY/s1600/HTM_CW_SDR1.png" height="140" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 2. SDR for Morse characters EISH5 and TMO0</td></tr>
</tbody></table>
<br />
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
I did another experiment, reducing the SDR size to only 16x16, i.e. 256 cells per SDR. In Fig 3. it is now easier to see common patterns between similar characters - for example, compare C with K and Y. These letters have 3 common cells active. <br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYz8KZB3AsqzKSXDDgM3XBfz-E9c4rRQU1KoI6LJtayGYT4g9NS3YPn29EH7oLKh31pLjNUyv801xhoLR6MHNBKzbavb6_kE2u1vaQRsiyokKFqlQ7N0FgEZdvywr9-ZxZopSjiNs3awM/s1600/HTM_CW_SDR_256(16x16).png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYz8KZB3AsqzKSXDDgM3XBfz-E9c4rRQU1KoI6LJtayGYT4g9NS3YPn29EH7oLKh31pLjNUyv801xhoLR6MHNBKzbavb6_kE2u1vaQRsiyokKFqlQ7N0FgEZdvywr9-ZxZopSjiNs3awM/s1600/HTM_CW_SDR_256(16x16).png" height="388" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 3. SDR map with reduced size 16x16 = 256 cells</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br /></div>
<div>
The Python software to create the SDR pictures is below:</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">"""A simple program that demonstrates the working of the spatial pooler"""</span></div>
<div>
<span style="font-family: 'Courier New', Courier, monospace; font-size: xx-small;">import numpy as np</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">from matplotlib import pyplot as plt</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">from random import randrange, random</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">from nupic.research.spatial_pooler import SpatialPooler as SP</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">CODE = {</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> ' ': '',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'A': '.-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'B': '-...',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'C': '-.-.', </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'D': '-..',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'E': '.',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'F': '..-.',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'G': '--.', </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'H': '....',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'I': '..',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'J': '.---',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'K': '-.-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'L': '.-..',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'M': '--',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'N': '-.',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'O': '---',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'P': '.--.',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'Q': '--.-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'R': '.-.',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'S': '...',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'T': '-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'U': '..-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'V': '...-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'W': '.--',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'X': '-..-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'Y': '-.--',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> 'Z': '--..',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '0': '-----',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '1': '.----',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '2': '..---',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '3': '...--',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '4': '....-',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '5': '.....',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '6': '-....',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '7': '--...',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '8': '---..',</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> '9': '----.' }</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">class Morse():</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> def __init__(self, inputShape, columnDimensions):</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> """</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> Parameters:</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> ----------</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> _inputShape<span class="Apple-tab-span" style="white-space: pre;"> </span>:<span class="Apple-tab-span" style="white-space: pre;"> </span>The size of the input. (m,n) will give a size m x n</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> _columnDimensions<span class="Apple-tab-span" style="white-space: pre;"> </span>:<span class="Apple-tab-span" style="white-space: pre;"> </span>The size of the 2 dimensional array of columns</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> """</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputShape = inputShape</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.columnDimensions = columnDimensions</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputSize = np.array(inputShape).prod()</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.columnNumber = np.array(columnDimensions).prod()</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray = np.zeros(self.inputSize)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.activeArray = np.zeros(self.columnNumber)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.sp = SP(self.inputShape, </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="Apple-tab-span" style="white-space: pre;"> </span> self.columnDimensions,</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="Apple-tab-span" style="white-space: pre;"> </span> potentialRadius = self.inputSize,</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="Apple-tab-span" style="white-space: pre;"> </span> numActiveColumnsPerInhArea = int(0.02*self.columnNumber),</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="Apple-tab-span" style="white-space: pre;"> </span> globalInhibition = True,</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="Apple-tab-span" style="white-space: pre;"> </span> synPermActiveInc = 0.01</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><span class="Apple-tab-span" style="white-space: pre;"> </span> )</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> def createInputVector(self,elem,chr):</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> print "\n\nCreating a Morse codebook input vector for character: " + chr + " " + elem</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> #clear the inputArray to zero before creating a new input vector</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[0:] = 0</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> j = 0</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> i = 0</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> while j < len(elem) :</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> if elem[j] == '.' :</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[i] = 1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[i+1] = 0</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> i = i + 2</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> if elem[j] == '-' :</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[i] = 1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[i+1] = 1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[i+2] = 1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.inputArray[i+3] = 0</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> i = i + 4</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> j = j + 1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> <span class="Apple-tab-span" style="white-space: pre;"> </span></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> print self.inputArray</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> def createSDRs(self,row,x,y,ch):</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> """Run the spatial pooler with the input vector"""</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> print "\nComputing the SDR for character: " + ch</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> #activeArray[column]=1 if column is active after spatial pooling</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> self.sp.compute(self.inputArray, True, self.activeArray)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> # plot each row above previous one</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> z = self.activeArray * row</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> # Plot the SDR vectors - these are 4096 columns in the plot, with active cells visible</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> plt.plot(x,y,z,'o')</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> plt.text(4120,row-0.5,ch,fontsize=14);</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> print self.activeArray.nonzero()</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># Testing NuPIC for Morse decoding </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># Create SDRs from Morse Codebook input vectors</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">print "\n \nCreate SDRs from Morse Codebook input vectors"</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># vector size 36x1 for input, 64x64 = 4096 for SDR </span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">example = Morse((36, 1), (64, 64))</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x,y = np.meshgrid(np.linspace(1,1,4096), np.linspace(1,1,32))</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">plt.ylim([0,38])</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">plt.xlim([0,4170])</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"># Select the characters to be converted to SDRs</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">#str = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">str = 'EISH5 TMO0'</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">row = 1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">for ch in str:</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> example.createInputVector(CODE[ch],ch)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> example.createSDRs(row,x,y,ch)</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"> row = row +1</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">plt.show()</span></div>
<div>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">plt.clf()</span></div>
</div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<h2>
Conclusions
</h2>
<div>
CLA is a promising new technology that is now open for ham radio experimenters to start tinkering with. Building a CLA based Morse decoder would be a good performance benchmark for CLA technology. There are plenty of existing Morse decoders to compare against, and it is fairly easy to test decoder accuracy under noisy conditions. There is also plenty of audio test material available, including streaming sources like <a href="http://websdr.org/" target="_blank">WebSDR stations</a>.
<div>
<br /></div>
<div>
73</div>
<div>
Mauri </div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0tag:blogger.com,1999:blog-3326773214329183284.post-47413664363661172942014-07-25T01:30:00.002-04:002014-08-24T22:03:13.404-04:00New Morse Decoder - part 6<h2>
Multichannel CW decoder </h2>
Development of the Bayesian CW decoder is moving forward. Thanks to Dave W1HKJ, there is now an alpha build available for the Windows platform as well. The v3.21.83cw-a4 tarball sources and the Windows version are available at <a href="http://www.w1hkj.com/ag1le/" target="_blank">http://www.w1hkj.com/ag1le/</a><br />
<br />
This version still has multiple problems that need to be fixed. Fig 1. below is a screenshot with the multichannel signal browser and the Fldigi configuration screen for Modems / CW visible. I am running the Windows FLDIGI version v3.21.83cw-a4 connected to PowerSDR v2.6.4 and my Flex3000 radio.<br />
<br />
The following description explains the options and the expected behavior of this alpha software version. Things are not yet well optimized, so you are likely to see a lot of misdecoded signals. I am interested in getting feedback and improvement ideas to make the Bayesian decoder better.<br />
<br />
The "Bayesian decoding" checkbox enables multichannel operation. If you have the Signal Browser open, you can slide the horizontal control at the bottom to adjust the signal decoding threshold. A lower threshold lets weaker CW stations be decoded, but often you get just noise and the decoder produces garbage, as visible in Fig 1. The software detects peaks in the spectrum and starts a new decoder instance at each detected peak signal frequency. It tracks each instance on a channel, which is the audio signal frequency rounded to the nearest 100 Hz. The number of channels and the timeout value can be set in the Configure/User Interface/Browser menu. <br />
<br />
If two stations are on nearby frequencies, less than 20 Hz apart, one of them is eliminated. This is done to reduce the impact of frequency splatter - otherwise one strong CW station would spin up decoders on multiple channels. Also, in this version this process is done for every FFT row in the waterfall display. Because I am not yet doing any kind of averaging, the detected peak signal center frequency may be incorrect, and the decoder is therefore tuned to the wrong frequency. With a narrow filter bandwidth setting this may cause garbage in the decoder output. <br />
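The peak-to-channel logic described above can be illustrated with a short sketch. This mirrors the behavior as described, not the actual FLDIGI C++ implementation, and the function name is hypothetical:

```python
def assign_channels(peak_freqs, min_sep=20.0, channel_step=100):
    """Sketch of the decoder-spawning logic described above:
    drop peaks closer than min_sep Hz to an already-kept peak
    (frequency splatter), then round each surviving peak to
    the nearest channel_step Hz channel."""
    kept = []
    for f in sorted(peak_freqs):
        if kept and f - kept[-1] < min_sep:
            continue  # splatter: too close to a peak we already kept
        kept.append(f)
    # one decoder instance per 100 Hz channel
    return sorted({int(round(f / channel_step)) * channel_step for f in kept})

# 605 and 612 Hz are 7 Hz apart, so only one decoder is spun up for them
print(assign_channels([605, 612, 700, 1195]))  # [600, 700, 1200]
```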
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinW-zkdtd_2omRZWmTHeQywJmGm7xEgxKPf7_2rTNTTqeTCNIhmRpQwRfqVUo4r5XQ7yTiUfWBXeuDc0RSz_61IgfreNJ6oZ158dgEyxHfayniCRE5gYd7ZMEhN871Fq51gmWxXK7_0_A/s1600/FLDIGI-3.21.83cwa4.PNG" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEinW-zkdtd_2omRZWmTHeQywJmGm7xEgxKPf7_2rTNTTqeTCNIhmRpQwRfqVUo4r5XQ7yTiUfWBXeuDc0RSz_61IgfreNJ6oZ158dgEyxHfayniCRE5gYd7ZMEhN871Fq51gmWxXK7_0_A/s1600/FLDIGI-3.21.83cwa4.PNG" height="640" width="595" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Fig 1. Multichannel CW decoder </td></tr>
</tbody></table>
<br />
In this version each decoder instance uses the same filter bandwidth, which is set manually in the Configure/Modem/CW/General tab. This means that the Bayesian decoder does not have optimal, speed-dependent filtering: for faster stations the bandwidth should be larger, and for slow stations it can be narrower.<br />
As a result, decoding accuracy suffers if multiple stations are operating at different speeds. This could be improved, since the Bayesian decoder provides a speed estimate automatically, but I haven't had time to implement the automatic "matched filter" feature yet. The filter bandwidth for all Bayesian decoder instances is taken from the "Filter bandwidth" slider value.<br />
<br />
Regarding receive speed estimation, the Rx WPM value is updated only for the CW signal selected on the waterfall. You can also observe that with the Bayesian decoder enabled, the speed display is updated much more frequently than with the legacy CW decoder. The Bayesian decoder calculates a speed estimate every 5 msec and provides a WPM value whenever there is a state change (mark -> space or space -> mark) in the signal. Sometimes the speed estimator gets "stuck" at the lower or upper extreme. You can adjust the "Lower limit" or "Upper Limit" in the CW / General Transmit section - this gives the speed estimator a hint to re-evaluate the speed. Obviously, if the speed estimate is incorrect you will get a lot of garbage text out of the decoder.<br />
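For reference, the WPM figure follows directly from the measured dit length via the standard PARIS timing convention, under which a dit lasts 1200 / WPM milliseconds:

```python
def wpm_from_dit_ms(dit_ms):
    """Standard PARIS timing: a dit lasts 1200 / WPM milliseconds,
    so the speed estimate follows directly from the mark length."""
    return 1200.0 / dit_ms

print(wpm_from_dit_ms(60))  # 20.0 -> a 60 ms dit means 20 WPM
```

So a decoder that tracks mark durations gets a fresh WPM estimate from every dit it sees.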
<br />
The tracking feature is not yet implemented properly for the Bayesian decoder. This means that if you start to transmit, your speed may differ from that of the station you were listening to. TX WPM is visible in the configuration screen.<br />
<br />
Once again, this is an alpha release, provided in order to get feedback and improvement ideas from FLDIGI users. You can provide feedback by submitting comments below or by sending me email at ag1le at innomore dot com. It would be very helpful if you could provide audio samples (WAV files can be recorded using the File / Audio / RX Capture feature of FLDIGI), a screenshot of the CW parameter settings you are using, and a general description of the problem or issue you discovered. <br />
<br />
If you are interested in software development, I would be very grateful for some additional help. Building a Bayesian Morse decoder has been a great learning experience in signal processing, machine learning algorithms, and probability theory. There are plenty of problems to solve in this space as we build more intelligent software for Morse code, the international language of hams. <br />
<br />
73<br />
Mauri AG1LE<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<div>
<br /></div>
ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com5tag:blogger.com,1999:blog-3326773214329183284.post-8216018635527173262014-07-17T21:04:00.002-04:002014-07-20T20:56:33.127-04:00New Morse Decoder - part 5<h2>
Multichannel CW decoder for FLDIGI</h2>
I have been working on the Bayesian Morse decoder for a while. The latest effort focused on automatically detecting all CW signals in the audio band and spinning up a new instance of the Bayesian decoder for each detected signal.<br />
<br />
Figure 1. shows a running version implemented on top of FLDIGI version 3.21.77. The waterfall shows 9 CW signals from 400 Hz to 1200 Hz. The software uses the FLDIGI Signal Browser user interface, and you can set the signal threshold using the slider bar below the signal browser window. The user interface works very much like the PSK or RTTY browser.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCmDJIBsyaO3lETwTEhtq6qtx0rSrA7JO5Niay7B1H7RgUXrPi9BJblfGinyEjqs99xUPLV1JPCqbtAG34ko4Y0O42uC16me8ZHt_QC6vcwRfHDqUeos9UJjUldkYvLk7-q9rUYh1FWY8/s1600/FLDIGI_multichannel_CW_decoding.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCmDJIBsyaO3lETwTEhtq6qtx0rSrA7JO5Niay7B1H7RgUXrPi9BJblfGinyEjqs99xUPLV1JPCqbtAG34ko4Y0O42uC16me8ZHt_QC6vcwRfHDqUeos9UJjUldkYvLk7-q9rUYh1FWY8/s1600/FLDIGI_multichannel_CW_decoding.png" height="514" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. Multichannel CW decoder for FLDIGI</td></tr>
</tbody></table>
The audio file in this demo was created using the Octave script below:<br />
<br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">Fs = 8000; % Fs is sampling frequency - 8 Khz</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">Ts = 10*Fs; % Total sample time is 10 seconds</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">% create 9 different parallel morse sessions - 10 seconds each at 20-35 WPM speed</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">% TEXT audio file noiselevel Hz speed WPM </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x1=morse('CQ TEST DE AG1LE','cw1.wav', 20,1200,Fs,20, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x2=morse('TEST DE SP3RQ CQ','cw2.wav', 15, 1100,Fs,35, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x3=morse('DE W3RQS CQ TEST','cw3.wav', 20, 1000,Fs,30, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x4=morse('SM0LXW CQ TEST DE','cw4.wav',15, 900,Fs, 25, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x5=morse('CQ TEST DE HS1DX','cw5.wav', 20, 800,Fs, 20, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x6=morse('TEST DE JA1DX CQ','cw6.wav', 10, 700,Fs, 20, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x7=morse('DE JA2ATA CQ TEST','cw7.wav',20, 600,Fs, 20, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x8=morse('UA2HH CQ TEST DE','cw8.wav', 15, 500,Fs, 20, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">x9=morse('CQ TEST DE CT1CX','cw9.wav', 20, 400,Fs, 20, Ts);</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">% weighted sum - merge all the audio streams together </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">% 9 signals arranged in frequency order 1200Hz ... 400Hz</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">y = 0.1*x1 + 0.1*x2 + 0.1*x3 + 0.1*x4 + 0.1*x5 + 0.1*x6 + 0.1*x7 + 0.1*x8 + 0.1*x9;</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;"><br /></span>
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">% write to cwcombo.wav file </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: xx-small;">wavwrite(y,Fs,'cwcombo.wav'); % in newer Octave use audiowrite('cwcombo.wav', y, Fs)</span><br />
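The custom <code>morse()</code> helper used above is not shown in this post. As a rough stand-in, the weighted-sum merge and WAV export can be sketched in Python; plain unkeyed tones replace the actual keyed Morse waveforms here, and only two of the nine stations are mixed:

```python
import wave

import numpy as np

fs = 8000                      # sampling frequency, as in the Octave script
t = np.arange(10 * fs) / fs    # 10 seconds of samples

# Stand-ins for two of the x1..x9 session waveforms (the real ones are
# on/off-keyed Morse produced by the custom morse() helper, not shown here)
x1 = np.sin(2 * np.pi * 1200 * t)   # highest-pitched station
x9 = np.sin(2 * np.pi * 400 * t)    # lowest-pitched station

# Weighted sum keeps the merged signal well inside [-1, 1]
y = 0.1 * x1 + 0.1 * x9

# Equivalent of Octave's wavwrite(y, Fs, 'cwcombo.wav')
with wave.open("cwcombo.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)          # 16-bit samples
    w.setframerate(fs)
    w.writeframes((y * 32767).astype(np.int16).tobytes())
```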
<br />
I have saved the full sources of this experimental FLDIGI version in Github: <a href="https://github.com/ag1le/morse-wip/blob/master/test/fldigi-3.21.77.multich.tar.gz" target="_blank">FLDIGI source</a><br />
<span style="background-color: yellow;">UPDATE July 20, 2014: I rebased this using Source Forge latest branch - new version is here: <a href="http://www.w1hkj.com/ag1le/" target="_blank">fldigi-3.22.0CN.tar.gz</a></span><br />
<br />
Let me know if you are interested in testing this software. I would be interested in feedback on scalability, any performance problems, and how well the CW decoder works with real-life signals.<br />
<br />
73<br />
Mauri AG1LE<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com31tag:blogger.com,1999:blog-3326773214329183284.post-14708808038139362162014-06-11T22:04:00.003-04:002014-07-10T00:58:01.682-04:00New Morse Decoder - Part 4In <a href="http://ag1le.blogspot.com/2013/12/new-morse-decoder-part-3.html" target="_blank">the previous blog entry</a> I shared some test results of the new experimental Bayesian Morse decoder. Thanks to the alpha testers, I found the bug that was causing the sensitivity to signal amplitudes and the overflow. I have fixed this bug in the software over the last few months.<br />
<br />
The CQ WW WPX CW contest took place May 24-25 and presented a good opportunity to put the new FLDIGI software version to the test. I worked some stations and let the software run for 48 hours to test its stability.<br />
<br />
In Figure 1, the first 2 1/2 lines are decoded using the new Bayesian CW decoder. The following two lines are the same audio material decoded using the legacy CW decoder. The CW settings are also visible in the picture. The matched filter bandwidth was set to 50 Hz, based on the Rx speed of 32 WPM.<br />
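The 50 Hz setting is consistent with a bandwidth of roughly twice the dot rate, the same ratio quoted later in this post (20 WPM ~ 33.3 Hz, 80 WPM ~ 133.3 Hz). A quick sketch of that rule of thumb (the function name is mine, not FLDIGI's):

```python
def matched_filter_bw(wpm: float) -> float:
    """Pre-filter bandwidth (Hz) for a given CW speed, assuming a
    bandwidth of twice the dot rate (a PARIS dot lasts 1.2/WPM seconds)."""
    dot_seconds = 1.2 / wpm
    return 2.0 / dot_seconds

print(round(matched_filter_bw(32), 1))  # 53.3, close to the 50 Hz setting used
```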
<br />
The legacy decoder is doing a pretty good job following ZF1A working the CW contest at 7005.52 kHz. At first look it appears that the new Bayesian decoder is having a lot of difficulties. Let's have a closer look at what is really going on here.<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7cvUnNpm6mLY1uD89T2ke_NSsdgs2iACUSTjqynLb7gzB4XTqpEE-RUFsrHCd9-hrT0m4738KvnNOxJnHJSti7PdSAkxXR_pgHOm4hL-SMRJ4m0JMItE6PvnTWsVpL2nEWNDs5-WHquY/s1600/CQ_WW_WPX_bayesian2.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7cvUnNpm6mLY1uD89T2ke_NSsdgs2iACUSTjqynLb7gzB4XTqpEE-RUFsrHCd9-hrT0m4738KvnNOxJnHJSti7PdSAkxXR_pgHOm4hL-SMRJ4m0JMItE6PvnTWsVpL2nEWNDs5-WHquY/s1600/CQ_WW_WPX_bayesian2.png" height="334" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 1. FLDIGI CW Decoder testing </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
In the audio recording ZF1A made 5 contacts, with VE1RGB, UR4MF, KP2M, SM6BZV and WA2MCR in 1 min 31 seconds.<br />
<br />
I ran the captured audio file in a loop twice through both FLDIGI CW decoder versions. The decoded text is shown below, broken into separate lines for each QSO for clarity.<br />
<br />
<b>FLDIGI - the new experimental Bayesian Morse decoder: </b><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">RN 512 X </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N VE1RGB 5NN513 X </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A --R4MF 5NN 0MO0 N T E E E E E E E TU<br /> ------TM N T E XJ0TT 6NN 515 X </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N QT SM6BZV 5NM516 X </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N WA2MCR 5NN 517 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">NN 5 --2 B</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N VE1RGB 5MK 613 X </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N KR4MF 5NN 514 X </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N KP2M 6NN 515 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A N OT SM6BZV 5NN 516 X</span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZH1A N WA2MCR 5NN 517 </span><br />
<br />
The first line should have been decoded as <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">NN 512 TU</span> but the Bayesian decoder misdecoded the last two letters: what was originally TU was decoded as X.<br />
<br />
Let's look at the signal waveform (Fig. 2). It is a close call when looking at the waveform timing; the same decoding error happens in almost all the cases above. <br />
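The confusion is easy to understand: T (dah) followed by U (dit-dit-dah) contains exactly the same element sequence as X (dah-dit-dit-dah), so only the inter-character gap between T and U separates the two readings. In code:

```python
# ITU Morse patterns for the characters involved
MORSE = {"T": "-", "U": "..-", "X": "-..-"}

# Run T and U together (drop the letter gap) and you get X exactly,
# so a slightly short gap is enough to flip the decode
assert MORSE["T"] + MORSE["U"] == MORSE["X"]
```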
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMZE1gaoDY-a2D2KFkQ0ChhpvkL3SS2CBxbMnaWIWhGOdtPrsrKsAFBCFBe6YJS4jgZdt1e7JXiGEFUkzNNQ9n1Y3yvf9WPPaYrn6pkWnxzrkZPRDVxngfxmOTYLi4LwctB1zO63Kv8h8/s1600/CQ_WW_WPX_TU_vs_X2.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMZE1gaoDY-a2D2KFkQ0ChhpvkL3SS2CBxbMnaWIWhGOdtPrsrKsAFBCFBe6YJS4jgZdt1e7JXiGEFUkzNNQ9n1Y3yvf9WPPaYrn6pkWnxzrkZPRDVxngfxmOTYLi4LwctB1zO63Kv8h8/s1600/CQ_WW_WPX_TU_vs_X2.png" height="200" width="171" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 2. TU or X ? </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
So what happened in the QSO with UR4MF? Let's look at the waterfall picture (Fig. 3) for possible clues.<br />
UR4MF is clearly visible at 692 Hz, but what is the faint signal that starts 200 msec before the letter U? It is approximately 70 Hz below UR4MF and starts with "dah-dit".<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4_juHKx0e4luQp2mI8RkfoaQLAUmCpXAlAAuPgSZpS4y_BS50Vd_c0Sfunn7_db6WVUlSAfakK8qOfbtkCw36MXGE56z3b8qBGrsodWp-rYobx5xDyo_AjxgVMxN_TRHnTAL1bLn6u6o/s1600/CQ_WW_WPX_UR4MF.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4_juHKx0e4luQp2mI8RkfoaQLAUmCpXAlAAuPgSZpS4y_BS50Vd_c0Sfunn7_db6WVUlSAfakK8qOfbtkCw36MXGE56z3b8qBGrsodWp-rYobx5xDyo_AjxgVMxN_TRHnTAL1bLn6u6o/s1600/CQ_WW_WPX_UR4MF.png" height="72" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 3. UR4MF with another station 70 Hz below</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
The new Bayesian decoder is sensitive enough to pick up this other station, but unfortunately the selected 50 Hz filter bandwidth is not narrow enough to separate the two stations from each other, causing what appears to be an error in decoding. The legacy decoder did not even notice this other station.<br />
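To get a feel for how little a 50 Hz filter suppresses a station 70 Hz away, we can model the matched filter as a 20 ms (1/50 s) integrate-and-dump window, whose frequency response is a sinc. This is a simplification on my part; FLDIGI's actual filter shape may differ:

```python
import numpy as np

T = 1 / 50.0    # window length (s) for a ~50 Hz matched filter
offset = 70.0   # the second station sits ~70 Hz below UR4MF

# A rectangular window's frequency response is a sinc;
# np.sinc(x) = sin(pi*x) / (pi*x)
rejection_db = 20 * np.log10(abs(np.sinc(offset * T)))
print(round(rejection_db, 1))   # about -13 dB: a strong neighbor leaks through
```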
<br />
<br />
<br />
<b>FLDIGI - legacy Morse decoder: </b><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">6N S W2 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">F1 VE1RGB 5NN 513 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A UR4MF N 514 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A KP2M 5NN 515 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1 SM6BZV 5NN 516 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A WA2MCR 5NN 17 </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">NN S W2 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">F1 VE1RGB 5NN 513 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A UR4MF E N 514 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A KP2M 5NN 515 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1 SM6BZV 5NN 516 TU </span><br />
<span style="font-family: Courier New, Courier, monospace; font-size: x-small;">ZF1A WA2MCR 5NN 17</span><br />
<br />
<br />
The first line should be <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">NN 512 TU</span> but deep QSB in the signal mangled the decoding into <span style="font-family: Courier New, Courier, monospace; font-size: x-small;">6N S W2 TU</span>.<br />
See Figure 4 for how the amplitude drops rapidly between "5" and "1". The first "dit" of the number one is barely visible in the waterfall in Figure 5 below. While the legacy decoder made an error in this case, the new Bayesian decoder didn't seem to have any problem despite the deep and rapid QSB.<br />
<br />
Once again, the Bayesian decoder appears to be much more sensitive and is automatically able to pick up signals even under deep QSB conditions.<br />
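The failure mode of a fixed-threshold detector under QSB is easy to reproduce: fade one element of a character well below the threshold and it simply disappears, while a probabilistic decoder can still explain the weak element as a low-amplitude dit. A toy simulation (the 600 Hz tone and 0.3 threshold are my choices for illustration, not FLDIGI's):

```python
import numpy as np

fs = 8000          # sample rate
f0 = 600           # tone frequency (an assumption for this sketch)
dot = 0.06         # dot length in seconds (~20 WPM)

def tone(duration, amp):
    t = np.arange(int(fs * duration)) / fs
    return amp * np.sin(2 * np.pi * f0 * t)

# Digit "1" is dit dah dah dah dah; deep QSB knocks the leading dit to 10%
elements = [tone(dot, 0.1)] + [tone(3 * dot, 1.0)] * 4
peaks = [np.abs(e).max() for e in elements]

# A fixed threshold tuned to the strong elements loses the faded dit,
# turning ".----" (1) into "----"
detected = [p > 0.3 for p in peaks]
print(detected)   # [False, True, True, True, True]
```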
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMzSFl5tnWpPtP5py48fm-IL5wESZuxFSF7BXKxw-Vq9UPYxUafa-kQUp70Vv9ek20VhiBTxEpMCwugGmp-E8JTEz6CGvYITa8JxaMBtt89RThQYcf5kzdVXVFusko6_qgjTORddMFtJs/s1600/CQ_WW_WPX_QSB.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMzSFl5tnWpPtP5py48fm-IL5wESZuxFSF7BXKxw-Vq9UPYxUafa-kQUp70Vv9ek20VhiBTxEpMCwugGmp-E8JTEz6CGvYITa8JxaMBtt89RThQYcf5kzdVXVFusko6_qgjTORddMFtJs/s1600/CQ_WW_WPX_QSB.png" height="207" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 4. QSB </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZVaLx9NuWrJGwHsA_irr1_29lWychoR-0vbHxEMnyEN3R_0tgDFnusgu2lgU8IKA6168W_A3yIHr2SYLxVrgINVvItQkKo7mncF8GD47XdsfT0TBcO9YiwkTyOHdhggcQh1tiyTncm7Q/s1600/CQ_WW_WPX_QSB_waterfall.png" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZVaLx9NuWrJGwHsA_irr1_29lWychoR-0vbHxEMnyEN3R_0tgDFnusgu2lgU8IKA6168W_A3yIHr2SYLxVrgINVvItQkKo7mncF8GD47XdsfT0TBcO9YiwkTyOHdhggcQh1tiyTncm7Q/s1600/CQ_WW_WPX_QSB_waterfall.png" height="101" width="400" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 5. QSB in waterfall</td></tr>
</tbody></table>
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<h3>
CONCLUSIONS</h3>
Testing the performance of the new, experimental Bayesian Morse decoder with real-world contest signals showed some surprising behaviors. What initially appeared to be decoding errors turned out to be real signals that influenced the probabilistic decoding process. It also appears that the Bayesian method is much more sensitive and may need a somewhat different signal pre-processing and pre-filtering strategy than the legacy Morse decoder currently implemented in FLDIGI.<br />
<br />
I am working on a better test framework to experiment more systematically with the impact of various parameters and find the performance limits of the new experimental Bayesian Morse decoder. Early results show, for example, that the minimum CER (character error rate) depends on the SNR of the signal, as expected, but the CW-speed-dependent pre-filter bandwidth has some interesting effects on CER, as demonstrated in Figure 6. The different graphs show how the speed (WPM) setting impacts CER at different pre-filter bandwidths (for example, 20 WPM ~ 33.3 Hz, 80 WPM ~ 133.3 Hz). You would expect the best CER performance with the narrowest filter setting that matches the actual speed (in this case the actual speed was 20 WPM). However, as seen in Figure 6, this is not consistently the case for signals between SNR +2 and +15 dB. <br />
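For reference, CER here means the edit distance between the decoded text and the reference text, divided by the reference length. The metric code in my test framework isn't shown in this post, but a minimal version looks like this:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cer(decoded: str, reference: str) -> float:
    return levenshtein(decoded, reference) / len(reference)

# The first misdecoded line from the Bayesian test run above
print(round(cer("RN 512 X", "NN 512 TU"), 2))   # 0.33
```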
<br />
<br />
<div style="text-align: left;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkkZ-IiBued4CgUZNMXL6NB1PqFeuAeW73NNuXTwXtah0TYcLlys53NoM7CpQxgTRuM_l8ccCm17RjALyeAGx39NFbO5Urc96vMazm4DIYyzrzTBmkAsEh92z05fF8V0ro5bEVvU5-J3E/s1600/CER_vs_SNR_Bayesian_June_11_2014.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkkZ-IiBued4CgUZNMXL6NB1PqFeuAeW73NNuXTwXtah0TYcLlys53NoM7CpQxgTRuM_l8ccCm17RjALyeAGx39NFbO5Urc96vMazm4DIYyzrzTBmkAsEh92z05fF8V0ro5bEVvU5-J3E/s1600/CER_vs_SNR_Bayesian_June_11_2014.png" height="578" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Figure 6. CER vs. SNR testing at different WPM settings </td></tr>
</tbody></table>
<br />
<br />
<br />
<br />ag1lehttp://www.blogger.com/profile/16415319751367496314noreply@blogger.com0