Friday, January 2, 2015

Morse Learning Machine v1 Challenge

Morse Learning Machine v1 Challenge Results

The Morse Learning Machine v1 Challenge (MLMv1) is officially finished. The challenge was hosted by Kaggle and it created much more interest than I expected. There was active discussion in the CW forum, as well as on Reddit here and here.
The challenge made the ARRL headline news in September. A Google search gives 1030 hits, as different sites and bloggers worldwide picked up the news.

The goal of this competition was to build a machine that learns how to decode audio files containing Morse code. To make it easier to get started, I provided a sample Python Morse decoder and sample submission files.

For humans it takes many months of effort to learn Morse code, and after years of practice the most proficient operators can decode Morse code at up to 60 words per minute or even beyond. Humans also have an extraordinary ability to adapt quickly to varying conditions, speed and rhythm. We wanted to find out whether it is possible to create a machine learning algorithm that exceeds human performance and adaptability in Morse decoding.

A total of 11 teams and 13 participants competed for almost 4 months for the perfect score of 0.0 (which means no errors in decoding the sample audio files). During the competition there were active discussions in the Kaggle forum, where participants shared their ideas, asked questions and also got some help from the organizer (ag1le, aka myself).

The evaluation was done by the Kaggle platform based on the submissions that participants uploaded. Levenshtein distance was used as the evaluation metric to compare the predicted results with the corresponding lines in the truth value file, which was hidden from the participants.
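For readers unfamiliar with the metric, here is a minimal Python sketch of the Levenshtein distance between a predicted line and the true line. The function name is my own, and how Kaggle aggregated the per-line distances into the leaderboard score is not shown here, so treat this as an illustration of the metric only.

```python
def levenshtein(predicted: str, truth: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn `predicted` into `truth`."""
    # prev[j] holds the distance between the processed prefix of
    # `predicted` and the first j characters of `truth`.
    prev = list(range(len(truth) + 1))
    for i, cp in enumerate(predicted, start=1):
        curr = [i]
        for j, ct in enumerate(truth, start=1):
            cost = 0 if cp == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # One wrong character out of five gives a distance of 1.
    print(levenshtein("CQ DX", "CQ DE"))
```

A perfectly decoded line has distance 0, which is consistent with 0.0 being the perfect score.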

Announcing the winners

According to the challenge rules, I asked participants to make their submissions available as open source under the GPL v3 license or later to enable further development of machine learning algorithms. The resulting new Morse decoding algorithms, source code and supporting files are uploaded to a GitHub repository by the MLMv1 participants.

I also asked the winners to provide a brief description of themselves, the methods & tools they used, and any insights and takeaways from this competition.

BrightMinds team: Evgeniy Tyurin and Klim Drobnyh

Public leaderboard score:   0.0
Private leaderboard score: 0.0
Source code & supporting files (waiting for posting by team)

What was your background prior to entering this challenge?
We have been studying machine learning for 3 years. Our interests have been different until now, but there are several areas we share experience in, such as image processing, computer vision and applied statistics.

What made you decide to enter?
Audio processing is a new and exciting field of computer science for us.  We wanted to consolidate our expertise in our first joint machine learning project.

What preprocessing and supervised learning methods did you use?
At first we tried to use the Fourier transform to get robust features and train a supervised machine learning algorithm. But then we would have had an extremely large training dataset to work with. That was the reason to change the approach in favour of simple statistical tests.

What was your most important insight into the data?
Our solution relies on the way the data was generated. Observing the regularity in the data was the insight that influenced us the most.

Were you surprised by any of your insights?
Actually, we expected that the data would be real, for example recorded live from the radio.

Which tools did you use?
Our code was written in Python. We used numpy and scipy to calculate values of the normal cumulative distribution function.
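As a side note for readers, the normal cumulative distribution function mentioned above can be sketched with the Python standard library alone. This uses math.erf as a stand-in for the numpy and scipy calls the team mentions, and it is not the team's code; the statistical tests they built on top of it are not described here.

```python
import math


def normal_cdf(x: float, mean: float = 0.0, std: float = 1.0) -> float:
    """Probability that a normal(mean, std) variable is <= x."""
    z = (x - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


if __name__ == "__main__":
    # Half of the probability mass lies below the mean.
    print(normal_cdf(0.0))
```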

What have you taken away from this competition?
We gained great experience in audio signal processing and in the applicability of the machine learning approach.

Tobias Lampert

Public leaderboard score:   0.02
Private leaderboard score: 0.12
Source code & supporting files from Tobias

What was your background prior to entering this challenge?
I do not have a college degree, but I have 16 years of professional experience developing software in a variety of languages, mainly in the financial sector. I have been interested in machine learning for quite some time but seriously started getting into the topic after completing Andrew Ng's machine learning Coursera class about half a year ago.

What made you decide to enter?
Unlike most other Kaggle competitions, the raw data is very comprehensible and not just "raw numbers" - after all morse code audio can easily be decoded by humans with some experience. So to me it looked like an easy and fun exercise to write a program for the task. In the beginning this proved to be true and as expected I achieved decent results with relatively little work. However the chance of finding the perfect solution kept me trying hard until the very end!

What preprocessing and supervised learning methods did you use?

Preprocessing:
- Butterworth filter for initial computation of dit length
- FFT to transform data from time to frequency domain
- PCA to reduce dimensionality to just one dimension
Unsupervised learning:
- K-Means to generate clusters of peaks and troughs
Supervised learning:
- Neural network for morse code denoising

What was your most important insight into the data?
The most important insight was probably the fact that all files have a structure which can be heavily exploited - due to the fact the pauses at the beginning and end have the same length in all files and wpm is constant the exact length of one dit can be computed. Using this information, the files can be cut into chunks that fully contain either signal or no signal making further analysis much easier.
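To illustrate the chunking idea, here is a minimal Python sketch. It is not Tobias's code: it assumes the standard "PARIS" timing convention, where one dit lasts 1.2 / wpm seconds, and it assumes the first chunk starts exactly at sample 0. The function names and the toy signal are hypothetical.

```python
def dit_seconds(wpm: float) -> float:
    """Dit duration under the standard PARIS convention (assumption)."""
    return 1.2 / wpm


def chunk_levels(samples, sample_rate: int, wpm: float):
    """Cut the signal into dit-length chunks and return the mean
    absolute amplitude of each chunk."""
    n = round(dit_seconds(wpm) * sample_rate)  # samples per dit
    return [
        sum(abs(x) for x in samples[i:i + n]) / n
        for i in range(0, len(samples) - n + 1, n)
    ]


if __name__ == "__main__":
    # Toy envelope at 1000 samples/s and 20 wpm (60 samples per dit):
    # one dit on, one dit off, one dah (three dits) on.
    signal = [1.0] * 60 + [0.0] * 60 + [1.0] * 180
    print(chunk_levels(signal, sample_rate=1000, wpm=20))
```

Because every chunk then contains either signal or silence, each one can be classified on its own before the dits and dahs are reassembled into characters.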

Were you surprised by any of your insights?
What surprised me most is that after trying several supervised learning methods like neural networks and SVMs with varying success, a very simple approach using an unsupervised method (K-Means) yielded the best results.
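A minimal sketch of the clustering idea, for readers who want to experiment: two-cluster K-Means on one-dimensional values, written in plain Python. Tobias used scikit-learn; this stand-in, its function name and the sample values are my own assumptions, not his implementation.

```python
def kmeans_two_clusters(values, iterations: int = 20):
    """Split 1-D values into a low cluster (label 0) and a high
    cluster (label 1) with plain K-Means, k = 2."""
    low_c, high_c = min(values), max(values)  # initial centroids
    for _ in range(iterations):
        low = [v for v in values if abs(v - low_c) <= abs(v - high_c)]
        high = [v for v in values if abs(v - low_c) > abs(v - high_c)]
        if not high:  # all values identical, nothing to split
            break
        new_low, new_high = sum(low) / len(low), sum(high) / len(high)
        if (new_low, new_high) == (low_c, high_c):  # converged
            break
        low_c, high_c = new_low, new_high
    labels = [0 if abs(v - low_c) <= abs(v - high_c) else 1 for v in values]
    return labels, (low_c, high_c)


if __name__ == "__main__":
    # Hypothetical per-chunk signal levels: troughs near 0, peaks near 1.
    levels = [0.1, 0.9, 0.2, 1.0, 0.95, 0.05]
    labels, centroids = kmeans_two_clusters(levels)
    print(labels)
```

No labelled training data is needed, which is part of what makes the unsupervised approach attractive here.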

Which tools did you use?
I started with R for some quick tests but switched to Python with scikit-learn very early. Additionally, I used the ffnet module for neural networks. To get a better grasp on the data, I did a lot of charting using matplotlib.

What have you taken away from this competition?
First of all obviously I learned a lot about how morse code works, how it can be represented mathematically and which patterns random morse texts always have in common. I also deepened my knowledge about signal processing and filtering, even though in the end this only played a minor role in my solution. Like all Kaggle competitions, trying to make sense of data, competing with others and discussing solution approaches was great fun!

Observations & Conclusions

I asked for advice in the Kaggle forum on how to promote the challenge and attract participants. I tried to encourage people to join the challenge during the first 2 weeks by posting frequent forum updates. Based on the download statistics (see the table below), the participation rate of this challenge was roughly 11%, as there were 13 actual participants and 120 unique users who downloaded the audio files. I don't know if this is typical in Kaggle competitions, but there were certainly many more interested people than actual participants.

Unique Users
Total Downloads
Bandwidth Used
2.02 KB
254.21 KB
1.56 KB
129.69 KB
12.69 KB
1.56 MB
65.91 MB
7.72 GB

I also did a short informal survey among the participants in preparation for the MLM v2 challenge. Here are some examples from the survey:

Q: Why did you decide to participate in the MLM v1 challenge?

- I have a background in signal processing and it seemed like a good way to refresh my memory.
- By participation I tried to strengthen my Computer Science knowledge. At the university I am attending a Machine Learning & Statistics course, so the challenge can help me practice.
- You were enthusiastic about it and it seemed fun/challenging/new.

Q: How could we make the MLM v2 challenge more enjoyable?

- More realistic problem setting.
- The task is interesting and fun by itself, but the details may vary, so making a challenge with an accent on different aspects of the task would be exciting.

In MLMv1 the 200 audio files all had a very similar structure: 20 random characters each, without spaces. Participants were able to leverage this structure to overcome the poor signal-to-noise ratio in many files. I was surprised to get such good decoding results given that many audio files had only -12 dB SNR.
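To put the SNR figure in perspective, here is a small Python sketch of the standard decibel conversions. At -12 dB the noise power is roughly 16 times the signal power. The function names are my own, and the sketch says nothing about the bandwidth over which the SNR of the challenge files was defined.

```python
def snr_db_to_power_ratio(snr_db: float) -> float:
    """Signal power divided by noise power for a given SNR in dB."""
    return 10 ** (snr_db / 10)


def noise_rms_for_snr(signal_rms: float, snr_db: float) -> float:
    """Noise RMS amplitude needed to reach the target SNR
    (amplitude ratios use 20 dB per decade)."""
    return signal_rms / 10 ** (snr_db / 20)


if __name__ == "__main__":
    ratio = snr_db_to_power_ratio(-12)
    print(f"signal/noise power at -12 dB: {ratio:.3f}")
    print(f"noise RMS for unit signal RMS: {noise_rms_for_snr(1.0, -12):.2f}")
```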

My conclusion for the next MLM v2 challenge is that I should provide real-world, recorded Morse signals and reduce the impact of the audio file structure. Also, to make the challenge more realistic, I need to incorporate RF propagation effects in the audio files. I could also stretch the SNR down to the -15 ... -20 dB range, making it harder to find the correct answers.
