Sunday, November 16, 2014

High Resolution, Low Power Temperature Logger using Arduino

I am working on a new project with the goal of developing miniaturized high resolution temperature logger device.  The objective is to make it small enough to be a wearable device that would measure temperature with better than 0.1 °C resolution and store the measured data on a memory card for plotting and analysis. The logger needs to run on a chargeable battery for several weeks and must be comfortable enough to wear 24x7.

I used TMP102 that is a digital sensor manufactured by Texas Instruments with I2C a.k.a. two wire interface (TWI) and the following features:

  • 12-bit, 0.0625°C resolution
  • Accuracy: 0.5°C (-25°C to +85°C)
  • Low quiescent current
  • 10µA Active (max)
  • 1µA Shutdown (max)
  • 1.4V to 3.6VDC supply range
  • Two-wire serial interface

This is a very handy high resolution sensor that requires a very low current to operate.

There are quite many data logger boards available but only a few  that are fully hack-able and small in size.  I ended up selecting  Sparkfun OpenLog board that looked small enough and that had all software,  firmware and hardware designs available as open source. The only small problem was I2C (TWI) interface signals SCL and SDA. I had to solder wires directly to ATmega328 CPU pins, as they are not available on external interface. Soldering to surface mounted components requires steady hand and small soldering tip on your iron. Having a 20x optical microscope does also help.

Hardware components

The components for this project are listed below with links and cost at the time of writing this article. Most of these are available at Sparkfun.
  1. Sparkfun OpenLog   $24.95
  2. Digital Temperature Sensor Breakout - TMP102 $5.96
  3. Lithium Polymer USB Charger and Battery  $24.95
  4. Samsung 16GB EVO Class 10 Micro SDHC  $10.99
  5. Enclosure  - 3D printed   $9.50 
Total cost of components was  $76.35 

Software Development 

You can download the latest integrated software development environment from Arduino IDE page. I used the   Arduino 1.5.8  with the following configuration


  • Tools/Board -  Arduino Uno
  • Tools/Port -  /dev/ttyUSB0


To connect my Lenovo X301 laptop running Linux Mint 17  operating system I used FT231X Breakout  with a  Crossover Breakout for FTDI. To make it easier to connect I soldered  Header - 6-pin Female (0.1", Right Angle) and
Break Away Male Headers - Right Angle connectors for these small breakout boards.

My focus in software development was to re-use existing OpenLog software, implement temperature measurement over I2C bus and try to minimize battery consumption by implementing power shutdown between measurements.  This software is still works-in-progress but even with this software version (v3) the average current consumption is about 200 uA level @3.3 Volts. During measurement over I2C bus and when storing the results to the microSD card the current peaks to 25 mA for a few milliseconds.

The software makes a measurement every 5000 milliseconds. Since this design does not have a RTC chip the timekeeping is done with software. Software has a routine to compensate for timing errors but it is not very accurate yet. It meets my needs for data logging purposes for now. For the final product it might be useful to have RTC chip built into the design.

A fully charged 3.7 V 850mAh LiPo battery is estimated to provide power for this data logger over 170 days, though I haven't tested it for longer period than 15 days so far.

The latest logger software is posted in Github.

3D Printed Enclosure 

I could not find a suitable small enclosure so I took OpenSCAD software  into use and built a 3D model using some examples I found in Thingverse.  OpenSCAD allows you to write 3D objects in simple language so it was quite easy to experiment with different designs and view them before creating the .STL file required for 3D printing.

The enclosure size was determined mostly by the LiPo battery dimensions.  With a smaller battery it would be possible to make a much smaller enclosure.  Also, it would be possible to design a single circuit board with all the required components and make it smaller.  Since this project is still in concept phase I chose to use the selected components and 3D print an enclosure where I can fit them in.


3D model of enclosure





















I sent the .STL file to a local company with some email instructions. I my first version the dimensions were 1/8" units - this was not clear so I had to call them to clarify. In the 2nd version I used millimeters as units and the enclosure came out fabricated as I expected. The enclosure is big enough for the LiPo battery and it has also a small indention at the bottom to get the TMP102 sensor chip closer to the outside surface. The wall thickness was set to 1.0 mm and sensor chip is only 0.5 mm from the bottom surface.

The cover dimensions were designed it to be tightly fitted on the top.

3D printed enclosure - top view

The enclosure has also fixtures both sides attaching a 22mm standard wristband using spring bars.
3D printed enclosure - side view 



















Below is a photograph of 3D enclosure model v1 with wristbands attached.

3D printed enclosure with wristbands attached




























In the photo below the hardware components fit in perfectly and the LiPo battery comes on the top. The power is connected using red and black wires to OpenLog board.

3D printed enclosure with components inside





























Measurement Results 

I tested the temperature logger by putting it from room temperature T0 22.6 °C into a freezer at Tm -21.0 °C.

Based on the measurement results below (red graph) it took 54 minutes for the sensor to reach this temperature. The thermal mass of the sensor itself is small, but it was in a closed enclosure with the LiPo battery that has a much larger thermal mass.  I did the same test but this time left the enclosure open so that the sensor was exposed to freezer temperature from both sides.  The blue graph below shows that the sensor reached -21 °C in  30 minutes.


Freezer Test























The thermal time constant is defined as the time required by a sensor to reach 63.2% of a step change in temperature under a specified set of conditions.

The response of a sensor to a sudden change in the surrounding temperature is exponential and it is described by the following equation:

where T is sensor temperature, Tm is the surrounding medium temperature, T0 is the initial sensor temperature, t is the time and τ is the time constant.

Looking at the results above and trying to find best fit of measurement data to this model using RMS error method we can estimate that 

τ  = 0.009732  for closed enclosure
τ  = 0.003132 for open enclosure

See graph of models vs. actuals is below. To correctly estimate temperatures this thermal time constant need to be taken into account.
Model of thermal time constants τ
























In the datasheet there are the following guidelines to maintain accuracy.

The temperature sensor in the TMP102 is the chip itself. Thermal paths run through the package leads, as well as the plastic package. The lower thermal
resistance of metal causes the leads to provide the primary thermal path.
To maintain accuracy in applications requiring air or surface temperature measurement, care should be taken to isolate the package and leads from ambient air temperature. A thermally-conductive adhesive is helpful in achieving accurate surface temperature measurement.

One improvement would be to apply thermally-conductive materials between the sensor chip and the bottom surface of the enclosure. Some biologically inert metal like gold or titanium might provide better thermal time constant. Some insulation might be needed to isolate sensor from ambient air. 

Conclusions 

Temperature logging using modern, high resolution and accurate digital sensors is quite easy.  You need only simple a micro-controller and few lines of code to build a data logger that is small enough to be wearable.

Working with Arduinos is not only easy but also a lot of fun.  With minimal investment you can build a powerful data logging device and add new sensors in incremental fashion.  The software development is also straightforward and for most problems there is already some open source examples available.

The real challenges seem to be on the physics and mechanical enclosure design side. Building an enclosure for the sensor and electronics with a small thermal time constant is not trivial.  For highly accurate wearable data logging devices you need also to consider many other topics such as

  • physical appearance 
  • hypoallergic materials 
  • fit and comfort for different size of subjects
  • thermal properties of materials
  • any variability in sensor contact with subject or ambient air 

It would be interesting to see the thermal design details of popular fitness trackers that claim accurate body temperature measurements. In particular the basal body temperature measurements where accuracy and high resolution is required poor thermal design would impact results significantly.





Wednesday, September 3, 2014

Morse Learning Machine - Challenge

MACHINE LEARNING CHALLENGE

I was astonished to get email acknowledgement that my  Kaggle Morse Challenge was approved today. I have spent last few days preparing materials and editing the description and the rules for this competition.

The goal of this competition is to build a machine that learns how to decode audio files containing Morse code.


For humans it takes many months effort to learn Morse code and after years of practice the most proficient operators can decode Morse code up to 60 words per minute or even beyond. Humans have also extraordinary ability to quickly adapt to varying conditions, speed and rhythm.

I want to find out if it is possible to create a machine learning algorithm that exceeds human performance and adaptability in Morse decoding.  I have shared some of these ideas in New England Artificial Intelligence meetup about one year ago and got enthusiastic feedback from the participants.




WHY KAGGLE?   

Kaggle is a platform for predictive modelling and analytics competitions on which companies and researchers post their data and statisticians and data miners from all over the world compete to produce the best models. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know at the outset which technique or analyst will be most effective. Kaggle aims at making data science a sport.

Kaggle's community of data scientists comprises tens of thousands of PhDs from quantitative fields such as computer science, statistics, econometrics, maths and physics, and industries such as insurance, finance, science, and technology. They come from over 100 countries and 200 universities. In addition to the prize money and data, they use Kaggle to meet, learn, network and collaborate with experts from related fields.

For the Morse Learning Machine competition I hope to attract people from the Kaggle community who are interested in solving new, difficult challenges using their predictive data modeling, computer science and machine learning expertise.  For students this challenge provides a great opportunity to put theoretical concepts into practice and see how they can solve tough problems by applying knowledge gained in class rooms.


COMPETITION DETAILS


During the competition, the participants build a learning system capable of decoding Morse code. To that end, they get development data consisting of 200 .WAV audio files containing short sequences of randomized Morse code. The data labels are provided for a training set so the participants can self-evaluate their systems. To evaluate their progress and compare themselves with others, they can submit their prediction results on-line to get immediate feedback. A real-time leaderboard shows participants their current standing based on their validation set predictions.

I have also provided  sample Python Morse decoder  to make it easier too get started. While this software is purely experimental version it has some features of the FLDIGI Morse decoder   but implemented using Python instead of C++.

You can of course  leverage the experimental multichannel CW decoder I recently implemented on FLDIGI or the standalone version of Bayesian decoder written in C++.  There is also some new tools I posted to Github.

Please help me to spread this message to attract participants for the Morse Learning Machine challenge!

73
Mauri AG1LE





Sunday, August 24, 2014

Cortical Learning Algorithm for Morse code - part 1

Cortical Learning Algorithm Overview 

Humans can perform many tasks that computers currently cannot do. For example understanding spoken language in noisy environment, walking down a path in complex terrain or winning in CQWW WPX CW contest are tasks currently not feasible for computers (and might be difficult for humans, too).
Despite decades of machine learning & artificial intelligence research, we have few viable algorithms for achieving human-like performance on a computer. Morse decoding at the best human performance level would be a good target to test these new algorithms.

Numenta Inc.  has developed technology called Cortical Learning Algorithm (CLA) that was recently made available as open source project  NuPIC.  This software provides an online learning system that learns from every data point fed into the system. The CLA is constantly making predictions which are continually verified as more data arrives. As the underlying patterns in the data change the CLA adjusts accordingly. CLA uses Sparse Distributed Representation (SDR) in similar fashion as neocortex in human brain stores information. SDR has many advantages over traditional ways of storing memories, such as ability to associate and retrieve information using noisy data.

Detailed description on how CLA works can be found from this whitepaper.

Experiment 

To learn more how CLA works I decided to start with a very simple experiment.  I created a Python script that uses Morse code book and calculates Sparse Distributed Representation (SDR) for each character.  Figure 1 below shows the Morse alphabet and numbers 0...9 converted to SDRs.

Fig 1. SDR for Morse characters A...Z, 0...9


NuPIC requires input vector to be a binary representation of the input signal.  I created a function that packs "dits" and "dahs" into 36x1  vector, see two examples below. Each "dit"  is represented as 1. followed by 0. and each "dah" is represented by  1. 1. 1. followed by 0.  to accomodate 1:3 timing ratio between "dit" and "dah".  This preserves the semantic structure of Morse code that is important from sequence learning perspective.

H ....
[ 1.  0.  1.  0.  1.  0.  1.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

O ---
[ 1.  1.  1.  0.  1.  1.  1.  0.  1.  1.  1.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

The Spatial Pooler uses 64 x 64 vector giving SDR of size 4096. As you calculate the SDR random bits get active on this vector. I plotted all active cells (value = 1) per columns 0...4096 for each letters and numbers as displayed in Fig 1. above. The respective character is shown on the right most column.

To see better the relationships between SDR and Morse character set I created another SDR map with letters  'EISH5' and 'TMO0'  next to each other.  These consequent letters and numbers differ from each other only by one "dit" or one "dah".  See Fig 2.  for SDR visualization of these characters.

There is no obvious visible patterns across these Morse characters, all values look quite different.  In the Numenta CLA whitepaper page 21 it says "Imagine now that the input pattern changes. If only a few input bits change, some columns will receive a few more or a few less inputs in the “on” state, but the set of active columns will not likely change much. Thus similar input patterns (ones that have a significant number of active bits in common) will map to a relatively stable set of active columns."

This doesn't seem to apply in these experiments so I need to investigate this a bit further.


Fig 2. SDR for Morse characters EISH5 and TMO0










I did another experiment by reducing SDR size to only 16x16  so 256 cells per SDR. In Fig 3.  it is now easier to see common patterns between similar characters - for example compare C with K and Y.  These letters have 3 common cells active.

Fig 3. SDR  map with reduced size 16x16 = 256 cells
























The Python software to create the SDR pictures is below:

"""A simple program that demonstrates the working of the spatial pooler"""
import numpy as np
from matplotlib import pyplot as plt
from random import randrange, random
from nupic.research.spatial_pooler import SpatialPooler as SP


CODE = {
    ' ': '',
    'A': '.-',
    'B': '-...',
    'C': '-.-.', 
    'D': '-..',
    'E': '.',
    'F': '..-.',
    'G': '--.',    
    'H': '....',
    'I': '..',
    'J': '.---',
    'K': '-.-',
    'L': '.-..',
    'M': '--',
    'N': '-.',
    'O': '---',
    'P': '.--.',
    'Q': '--.-',
    'R': '.-.',
    'S': '...',
    'T': '-',
    'U': '..-',
    'V': '...-',
    'W': '.--',
    'X': '-..-',
    'Y': '-.--',
    'Z': '--..',
    '0': '-----',
    '1': '.----',
    '2': '..---',
    '3': '...--',
    '4': '....-',
    '5': '.....',
    '6': '-....',
    '7': '--...',
    '8': '---..',
    '9': '----.' }



class Morse():
  
  def __init__(self, inputShape, columnDimensions):
    """
     Parameters:
     ----------
     _inputShape : The size of the input. (m,n) will give a size m x n
     _columnDimensions : The size of the 2 dimensional array of columns
     """
    self.inputShape = inputShape
    self.columnDimensions = columnDimensions
    self.inputSize = np.array(inputShape).prod()
    self.columnNumber = np.array(columnDimensions).prod()
    self.inputArray = np.zeros(self.inputSize)
    self.activeArray = np.zeros(self.columnNumber)

    self.sp = SP(self.inputShape, 
self.columnDimensions,
potentialRadius = self.inputSize,
numActiveColumnsPerInhArea = int(0.02*self.columnNumber),
globalInhibition = True,
synPermActiveInc = 0.01
)

    
  def createInputVector(self,elem,chr):
        
    print "\n\nCreating a Morse codebook input vector for character: " + chr + " " + elem
    
    #clear the inputArray to zero before creating a new input vector
    self.inputArray[0:] = 0
    j = 0
    i  = 0

    while j < len(elem) :
      if elem[j] == '.' :
        self.inputArray[i] = 1
        self.inputArray[i+1] = 0
        i = i + 2
      if elem[j] == '-' :
        self.inputArray[i] = 1
        self.inputArray[i+1] = 1
        self.inputArray[i+2] = 1
        self.inputArray[i+3] = 0
        i = i + 4
      j = j + 1
               
    print self.inputArray
     
     
  def createSDRs(self,row,x,y,ch):
    """Run the spatial pooler with the input vector"""
    print "\nComputing the SDR for character: " + ch
    
    #activeArray[column]=1 if column is active after spatial pooling
    self.sp.compute(self.inputArray, True, self.activeArray)
    
    # plot each row above previous one
    z = self.activeArray * row

    # Plot the SDR vectors - these are 4096 columns in the plot, with active cells visible
    plt.plot(x,y,z,'o')
    plt.text(4120,row-0.5,ch,fontsize=14);
    print self.activeArray.nonzero()
    
    
# Testing NuPIC for Morse decoding  
# Create SDRs from Morse Codebook input vectors
print "\n \nCreate SDRs from Morse Codebook input vectors"

# vector size 36x1 for input,  64x64 = 4096 for SDR 
example = Morse((36, 1), (64, 64))

x,y = np.meshgrid(np.linspace(1,1,4096), np.linspace(1,1,32))
plt.ylim([0,38])
plt.xlim([0,4170])

# Select the characters to be converted to SDRs
#str = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'
str = 'EISH5 TMO0'
row = 1
for ch in str:
  example.createInputVector(CODE[ch],ch)
  example.createSDRs(row,x,y,ch)
  row = row +1

plt.show()
plt.clf()



Conclusions

CLA provides promising new technology that is now open for ham radio experimenters to start tinkering with. Building a CLA based Morse decoder would be a good performance benchmark for CLA technology. There are plenty of existing Morse decoders available to compare with and it is fairly easy to test decoder accuracy under noisy conditions.  There is also a plenty of audio test material available, including streaming sources like WebSDR stations.

73
Mauri 




Friday, July 25, 2014

New Morse Decoder - part 6

Multichannel CW decoder 

Development of the Bayesian CW decoder is moving forward. Many thanks to Dave W1HKJ  there is now an alpha build available also for Windows platform. The v3.21.83cw-a4  tarball sources and Windows version are available in http://www.w1hkj.com/ag1le/

This version has still multiple problems that need to fixed. In Fig 1. below I have screenshot where I have the multichannel signal browser and Fldigi configuration screen for Modems / CW  visible. I am running Windows FLDIGI version v3.21.83cw-a4  connected to PowerSDR v2.6.4 and my Flex3000 radio.

The following description explains the options and expected behavior in this alpha software version.  Things are not yet  well optimized so you are likely to see a lot of misdecoded signals.  I am interested getting feedback and improvement ideas to make the Bayesian decoder better.

Checkbox "Bayesian decoding" enables multichannel operation. If you have Signal Browser open you can slide the horizontal control at the bottom to adjust the signal decoding threshold. With lower threshold you can enable weaker CW stations to be decoded but often you get just noise and the decoder produces garbage as visible in Fig 1.   The software detects peaks in the spectrum and starts a new decoder instance based on the detected peak signal frequency. It tracks each instance on a channel which is rounded at 100 Hz of the audio signal frequency. Number of channels and timeout value can be set in  Configure/User Interface/Browser menu.

If there are two stations in nearby frequencies less than 20 Hz apart the other one is eliminated.  This is done to reduce impact of frequency splatter - otherwise one strong CW station would spin up decoders on multiple channels. Also, in this version this process is done for every FFT row in the waterfall display. Because I am not doing any kind of averaging yet the detected peak signal center frequency  may be incorrect and therefore decoder is tuned on the wrong frequency. With narrow filter bandwidth setting this may cause garbage in the decoder output.

Fig 1. Multichannel CW decoder 

In this version each decoder instance uses the same filter bandwidth that is set manually in Configure/Modem/CW/General  tab. This means that Bayesian decoder does not have optimal, speed dependent filtering. For faster stations the bandwidth should be larger and for slow stations it can be narrow.
This means that decoding accuracy suffers if there are multiple stations operating at different speeds. Once again this functionality can be improved as the Bayesian decoder does provide a speed estimate automatically but I haven't had time to implement the automatic "Matched filter"  feature yet. The filter bandwidth is taken from the "Filter bandwith" slider value  for all Bayesian  decoder instances.

On receiver speed estimation Rx WPM value is updated only for selected CW signal on the waterfall. You can also observe that with Bayesian decoder enabled the speed display is updated much more frequently than with the legacy CW decoder.  Bayesian decoder calculates speed estimate every 5 msec and provides a WPM value whenever there is a state change  (mark -> space or space->mark) in the signal.  Sometimes the speed estimator gets "stuck" in lower or upper extreme.  You can adjust the "Lower limit" or "Upper Limit" on the CW / General  Transmit  section - this will give a hint to the speed estimator to re-evaluate speed.  Obviously, if the speed estimate is incorrect you will get a lot of garbage text  out from the decoder.

Tracking feature is not implemented properly yet for the Bayesian decoder. This means that if you start to transmit  your speed may be different than the station that you were listening. TX WPM is visible in the configuration screen.

Once again,  this is an alpha release  provided in order to get some feedback and improvement ideas  from FLDIGI users.  You can provide feedback by submitting comments below or sending me email to  ag1le at innomore dot com.   It would be very helpful  if you could provide audio samples (WAV files can be recorded   using File / Audio / RX Capture feature of FLDIGI),  screenshot of what CW parameter settings you are using  and general description of the problem or issue you discovered.

If you are interested in software development I would be very grateful  to get some additional help. Building a Bayesian Morse decoder has been a great learning experience in signal processing, machine learning algorithms  and probability theory.  There are plenty of problems to solve in this space as we  build  more intelligent software to use Morse code, the international language of hams.  

73
Mauri AG1LE








Thursday, July 17, 2014

New Morse Decoder - part 5

Multichannel CW decoder for FLDIGI

I have been working on the Bayesian Morse decoder for a while.  The latest effort was focused on making it possible to automatically detect all CW signals in the audio band and spin up a new instance of the Bayesian decoder for each detected signal.

Figure 1. shows a running version implemented on top of FLDIGI  version 3.21.77.  The waterfall shows 9 CW signals from 400Hz to 1200 Hz. The software utilizes the FLDIGI Signal Browser user interface and you can set the signal threshold using the slider bar below the signal browser window. The user interface works very much like the PSK or RTTY browser.

Figure 1. Multichannel CW decoder for FLDIGI
The audio file in this demo was created using Octave script  below

Fs = 8000; % Fs is sampling frequency - 8 Khz
Ts = 10*Fs; % Total sample time is 10 seconds


% create 9 different parallel morse sessions - 10 seconds each at 20-35 WPM speed
%         TEXT           audio file  noiselevel Hz    speed WPM 
x1=morse('CQ TEST DE AG1LE','cw1.wav', 20,1200,Fs,20, Ts);
x2=morse('TEST DE SP3RQ CQ','cw2.wav', 15, 1100,Fs,35, Ts);
x3=morse('DE W3RQS CQ TEST','cw3.wav', 20,  1000,Fs,30, Ts);
x4=morse('SM0LXW CQ TEST DE','cw4.wav',15,   900,Fs, 25, Ts);
x5=morse('CQ TEST DE HS1DX','cw5.wav', 20,    800,Fs, 20, Ts);
x6=morse('TEST DE JA1DX CQ','cw6.wav', 10,      700,Fs, 20, Ts);
x7=morse('DE JA2ATA CQ TEST','cw7.wav',20,      600,Fs, 20, Ts);
x8=morse('UA2HH CQ TEST DE','cw8.wav', 15,      500,Fs, 20, Ts);
x9=morse('CQ TEST DE CT1CX','cw9.wav', 20,      400,Fs, 20, Ts);


% weighted sum - merge all the audio streams together 
% 9 signals arranged in frequency order 1200Hz ... 400Hz
y = 0.1*x1 + 0.1*x2 + 0.1*x3 + 0.1*x4 + 0.1*x5 + 0.1*x6 + 0.1*x7 + 0.1*x8 + 0.1*x9;


% write to cwcombo.wav file 
wavwrite(y,Fs,'cwcombo.wav');

I have saved the full sources of this experimental FLDIGI version in Github:  FLDIGI source
UPDATE July 20, 2014: I rebased this using Source Forge latest branch  - new version is here: fldigi-3.22.0CN.tar.gz

Let me know if you are interested in testing this software.  I would be interested in getting feedback on scalability, any performance problems as well as how well the CW decoder works with real life signals.

73
Mauri AG1LE

Wednesday, June 11, 2014

New Morse Decoder - Part 4

In the previous blog entry  I shared some test results of the new experimental Bayesian Morse decoder.  Thanks to the alpha testers I found the bug that was causing the sensitivity to signal amplitudes and causing overflow. I have fix this bug in the software over the last few months.

CQ WW WPX  CW contest was in May 24-25 and it presented a good opportunity to put the new FLDIGI software version to test.  I worked some stations and let the software running for 48 hours to test the stability.

In the figure 1 the first 2 1/2 lines are decoded using the new Bayesian CW decoder. The following 2 lines is the same audio material decoded using the legacy CW decoder.  CW settings are also visible in the picture.  Matched filter bandwidth was set to 50Hz based on Rx speed of 32 WPM.

 The legacy decoder is doing a pretty good job following ZF1A working CW contest at 7005.52 KHz.  At first look it appears that the new Bayesian decoder is having a lot of difficulties.   Let's have a closer look what is really going on here.

Figure 1.  FLDIGI  CW Decoder testing 






















In the audio recording  ZF1A made 5 contacts, with VE1RGB, UR4MF, KP2M, SM6BZV and WA2MCR in 1 min 31 seconds.

I let the captured audio file run in a loop for two times using both FLDIGI CW decoder versions.  The decoded text is shown below, broken into separate lines by each QSO for clarity.

FLDIGI - the new experimental Bayesian Morse decoder: 
RN 512 X 
ZF1A N VE1RGB 5NN513 X 
ZF1A --R4MF 5NN 0MO0 N T E E E E E E E TU
 ------TM N T E XJ0TT 6NN 515 X 

ZF1A N QT SM6BZV 5NM516 X 
ZF1A N WA2MCR 5NN 517 
NN 5 --2 B
ZF1A N VE1RGB 5MK 613 X 
ZF1A N KR4MF 5NN 514 X 
ZF1A N KP2M 6NN 515 TU 
ZF1A N OT SM6BZV 5NN 516 X
ZH1A N WA2MCR 5NN 517 

The first line should have been decoded as NN 512 TU but  the Bayesian decoder misdecoded last 2 letters. What was originally  TU  was decoded as X.

Let's look at the signal waveform (Fig 2.). It is a close call when looking at the waveform timing...same decoding error happens in almost all the cases above.

Figure 2.  TU or  X ? 














So what happened in the QSO with UR4MF? Let's look at waterfall picture (Fig 3.) for possible clues.
UR2MF is very visible at 692 Hz frequency but what is the faint signal that starts 200 msec before letter U? It is approximately 70Hz below UR2MF and starts "dah-dit".

Figure 3. UR4MF with another station 70 Hz below









The new Bayesian decoder is sensitive enough to be able to pick up this other station, but unfortunately the selected 50 Hz filter bandwidth is not enough to separate these two stations from each other, thus causing what appears an error in decoding.  Legacy decoder did not even notice this other station.



FLDIGI - legacy Morse decoder: 
6N S W2 TU 
F1 VE1RGB 5NN 513 TU 
ZF1A UR4MF N 514 TU 
ZF1A KP2M 5NN 515 TU 
ZF1 SM6BZV 5NN 516 TU 
ZF1A WA2MCR 5NN 17 
NN S W2 TU 
F1 VE1RGB 5NN 513 TU 
ZF1A UR4MF E N 514 TU 
ZF1A KP2M 5NN 515 TU 
ZF1 SM6BZV 5NN 516 TU 
ZF1A WA2MCR 5NN 17


First line should be NN 512 TU but deep QSB in the signal mangled the decoding into 6N S W2   TU.
See figure 4 on how the amplitude goes down rapidly between 5 and 1. The first "dit" in number one is barely visible in waterfall figure 5 below. While legacy decoder made an error in this case  the new Bayesian decoder didn't seem to have any problem despite this deep and rapid QSB.

Once again, the Bayesian decoder appears to be much more sensitive and automatically able to pickup signals even under deep QSB conditions.
Figure 4. QSB 













Figure 5. QSB in waterfall










CONCLUSIONS

Testing the performance of the new, experimental Bayesian Morse decoder with real world contest signals did show some surprising behaviors. What appeared initially to be errors in decoding turned out to be real signals that impacted the probabilistic decoding process. It also appears that the Bayesian method is much more sensitive and may need a bit different strategy related to signal pre-processing and pre-filtering compared to the legacy Morse decoder currently implemented in FLDIGI.

I am working on a better test framework to experiment more systematically the impact of various parameters to find the performance limits of the new experimental Bayesian Morse decoder.   Early results show for example that minimum CER  (character error rate) is dependent on SNR of signal as expected, but the CW speed dependent pre-filter bandwidth has some interesting effects on CER as demonstrated in figure 6.  Different graphs show how the speed (in WPM) setting impacts CER at different pre-filter bandwidth  (for example 20 WPM  ~ 33.3Hz, 80 WPM ~ 133.3 Hz ).  You would expect best CER performance with the most narrow filter setting that matches the actual speed (in this case actual speed was 20 WPM).  However,  as seen in figure 6  this is not consistently the case with signals between SNR  +2 ... +15 dB.


Figure 6.  CER vs. SNR testing at different WPM settings 




Monday, May 19, 2014

Noise Reduction Method for Morse Signals

MATCHING PURSUIT METHOD FOR NOISE REDUCTION 

In my previous post I shared a new method for reducing noise from Morse signals based on MPTK toolkit.  I contacted Dr. Remi Gribonval who is one of the authors of the MPTK toolkit and asked for his advice in how to optimize the atoms for Morse code signals.   To my surprise he responded the same day and gave me some great ideas. There was also a fruitful discussion in eham.net  that helped me to figure out how to optimize the dictionary suitable for processing audio signals containing Morse code.

FREQUENCY DOMAIN VIEW 

I used the same pileup.wav file than in the previous post and used the new, revised dictionary by issuing the following commands

mpd -D dic_mdct_morse2.xml -s 10 pileup.wav pileup.bin
mpr pileup.bin pileup_reco.wav

Runtime of mpd command was 73 seconds on Thinkpad X301 laptop. The length of pileup.wav is 71.5 seconds, so mpd can decompose audio almost in real time to selected 10 dB SNR.

To illustrate the noise reduction capability I zoomed into a four second long audio segment starting at 1:00.0.  The result is shown in figure 1 spectrum plot below.  Figure 2 shows the original noisy signal. It is easy to observe that there has been quite significant noise reduction and Morse signals are much more visible from background noise.  Unlike a normal bandpass filter that passes only the frequency of a single CW signal within range, this new method automatically detects and passes all CW signals in a broad frequency range while reducing surrounding noise. You may want to download the filtered audio file to listen and compare with the original audio file.


Fig 1. Filtered pileup signal



Fig 2. Original pileup signal























Here is the content of the dictionary file  dic_mdct_morse2.xml.  Sampling rate used in the original audio file was 8000 Hz.

<?xml version="1.0" encoding="ISO-8859-1"?>
<dict>
<libVersion>0.2</libVersion>
        <block>
        <param name="blockOffset" value="0" />
        <param name="fftSize" value="256" />
        <param name="type" value="mdst" />
        <param name="windowLen" value="256" />
        <param name="windowShift" value="32" />
        <param name="windowopt" value="0" />
        <param name="f0Max" value="800" />
        <param name="f0Min" value="400" />     
        <param name="windowtype" value="rectangle" />
   </block>

        <block>
        <param name="blockOffset" value="0" />
        <param name="fftSize" value="256" />
        <param name="type" value="mdst" />
        <param name="windowLen" value="256" />
        <param name="windowShift" value="64" />
        <param name="windowopt" value="0" />
        <param name="f0Max" value="800" />
        <param name="f0Min" value="400" />     
        <param name="windowtype" value="rectangle" />
   </block>

</dict>


TIME DOMAIN VIEW 

I wanted to see how this new noise reduction method impacts the amplitude envelope of the CW signals. I created a sample Morse audio test file with 20 dB SNR and used a dictionary file dic_mdct_morse.xml below for processing.   Figure 3. below shows letter Q  - as you can observe there is visible noise riding on signal and also between "dahs" and "dit".


Fig 3.  Letter Q  with 20 dB SNR

Figure 4. below shows filtered signal.  The waveform shows no noise between "dahs" and "dit".  Also, there is very little noise on signal but selected atoms are visible as 3 "bumps" riding on top of "dah".


Fig 4. Letter Q filtered


























I did another experiment with much worse signal-to-noise ratio.  Figure 6. shows letters QUI  with 6 dB SNR  (this was normalized to -3 dB before processing to prevent overflows).  Figure 5 shows the filtered version of this signal. Noise reduction is visible but there is still some noise left.

Fig 5.  Filtered QUI 



















Fig 6.  Letters QUI with 6 dB SNR


Here is the dictionary file  dic_mdct_morse.xml used in the time domain examples above.  The sampling rate in these artificially generated files is 4000 Hz therefore the atom sizes are scaled down.


<?xml version="1.0" encoding="ISO-8859-1"?>
<dict>
<libVersion>0.2</libVersion>
        <block>
        <param name="blockOffset" value="0" />
        <param name="fftSize" value="128" />
        <param name="type" value="mdst" />
        <param name="windowLen" value="128" />
        <param name="windowShift" value="16" />
        <param name="windowopt" value="0" />
        <param name="f0Max" value="800" />
        <param name="f0Min" value="400" />     
        <param name="windowtype" value="rectangle" />
   </block>

        <block>
        <param name="blockOffset" value="0" />
        <param name="fftSize" value="128" />
        <param name="type" value="mdst" />
        <param name="windowLen" value="128" />
        <param name="windowShift" value="32" />
        <param name="windowopt" value="0" />
        <param name="f0Max" value="800" />
        <param name="f0Min" value="400" />     
        <param name="windowtype" value="rectangle" />
   </block>

</dict>

CONCLUSIONS

Matching Pursuit is a powerful new digital audio processing method for reducing noise in Morse signals.  The experiments above demonstrate that it is possible to significantly improve signal to noise ratio even in cases where there are multiple CW stations sending simultaneously at nearby frequencies.
MPTK toolkit provides rich set of tools to further optimize the digital signal processing. Examples above were the result of some 3 hours of work testing various atoms and related parameters.

All software used is available as open source making it feasible to integrate this noise reduction method to various other projects.  If you find this article valuable please provide your comments below.


73
Mauri AG1LE



Sunday, May 11, 2014

Sparse representations of noisy Morse signals - matching pursuit algorithm in Morse decoders

MATCHING PURSUIT ALGORITHM IN MORSE DECODING

Recently I started doing some research on sparse representations of audio signals. There is a lot of research papers available in how to use sparse coding for machine learning purposes, such as computer vision and audio classification. There is also growing body of evidence that sparseness constitutes a general principle of sensory coding in the human nervous system.

There are also many algorithms designed to transform audio signals to sparse codes.  One such algorithm is Matching Pursuit  that decomposes any signal into linear expansion of waveforms that are selected from a redundant dictionary of functions. A matching pursuit isolates the signal structures that are coherent with respect to given dictionary.  A classic paper that explains this approach was written by Stephane G. Mallat and Zhifeng Zhang in 1993.

Matching Pursuit Tool Kit (MPTK) provides a fast implementation of the Matching Pursuit algorithm for the sparse decomposition of multichannel signals. It comprises a C++ library, standalone command line utilities, and some scripts for running it and plotting the results through Matlab. This article explains the concept and the implementation details of the toolkit.

To experiment with sparse coding I selected a noisy WAV audio file that contains 1:15 long CW pileup example.   I used Audacity to review the spectrum  plot of the original audio file as shown on Fig 1. below. The audio sample contains some 13 ... 15 CW stations, some hardly audible under noise.


Fig 1.  Spectrum display - original audio file
The MPTK toolkit comes with a library of dictionaries. I tested all of them and also created also my own variations to come up with a set of suitable "atoms"  that would be able to extract CW signals from the audio file above.  I also reconstructed the audio file from book of decomposed "atoms" to listen the outcome.  

The best initial results were obtained by using "dic_test.xml"  dictionary that is composed of a combination of  harmonic and gabor atoms. Figure 2. below shows the spectrum plot of reconstructed audio file and noise reduction is quite visible. The CW signals are highlighted with red and white colors whereas background noise has almost disappeared.


Fig 2.  Spectrum display - reconstructed audio file 




















I am also enclosing the reconstructed audio file.  It sounds quite different from the original, noisy audio. The higher frequency signals sound like echo (perhaps too much harmonics) and at 900 Hz  IN3NJB is clear but dits and dahs could have more sharp edges (perhaps too long gabor wavelet?).  

This experiments shows that Matching Pursuit is a very powerful signal processing tool that could be used to improve Morse decoder capabilities, especially with pileups and noisy signals.  However, more work is needed to build an optimized dictionary of "atoms".  For example the modified Morlet wavelet discussed in this article  might improve the reconstruction accuracy.  

If you find this article interesting please provide your comments and feedback below. 

73 
Mauri AG1LE 








Popular Posts