Morse Learning Machine v1 Challenge Results
The Morse Learning Machine v1 Challenge (MLMv1) is officially finished. This challenge was hosted on Kaggle and it created much more interest than I expected. There was active discussion on eham.net in the CW forum, as well as on Reddit here and here. The challenge made it to the ARRL headline news in September. A Google search gives 1030 hits, as various sites and bloggers worldwide picked up the news.
The goal of this competition was to build a machine that learns how to decode audio files containing Morse code. To make it easier to get started, I provided a sample Python Morse decoder and sample submission files.
For humans it takes many months of effort to learn Morse code, and after years of practice the most proficient operators can decode Morse code at up to 60 words per minute or even beyond. Humans also have an extraordinary ability to quickly adapt to varying conditions, speed and rhythm. We wanted to find out whether it is possible to create a machine learning algorithm that exceeds human performance and adaptability in Morse decoding.
A total of 11 teams and 13 participants competed for almost 4 months for the perfect score of 0.0 (meaning no errors in decoding the sample audio files). During the competition there were active discussions in the Kaggle forum, where participants shared their ideas, asked questions and also got some help from the organizer (ag1le, aka myself).
The evaluation was done by the Kaggle platform based on the submissions that the participants uploaded. Levenshtein distance was used as the evaluation metric to compare the predicted results to the corresponding lines in the ground-truth file, which was hidden from the participants.
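For readers unfamiliar with the metric, Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn a predicted line into the true line. The sketch below is a minimal illustrative implementation in Python; it is not necessarily identical to the levenshtein.py file distributed with the challenge.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))   # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                current[j - 1] + 1,                # insertion
                previous[j] + 1,                   # deletion
                previous[j - 1] + (ca != cb)))     # substitution (or match)
        previous = current
    return previous[-1]

# Example: one wrong character in a prediction scores 1
print(levenshtein("PARIS", "PARIX"))  # -> 1
```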
Announcing the winners
According to the challenge rules, I asked participants to make their submissions available as open source under the GPL v3 license or later, to enable further development of machine learning algorithms. The resulting new Morse decoding algorithms, source code and supporting files are being uploaded to a GitHub repository by the MLMv1 participants.
BrightMinds team: Evgeniy Tyurin and Klim Drobnyh
Public leaderboard score: 0.0
Private leaderboard score: 0.0
Source code & supporting files (waiting for posting by team)
What was your background prior to entering this challenge?
We have been studying machine learning for 3 years. Our interests have been different until now, but there are several areas we share experience in, such as image processing, computer vision and applied statistics.
What made you decide to enter?
Audio processing is a new and exciting field of computer science for us. We wanted to consolidate our expertise in our first joint machine learning project.
What preprocessing and supervised learning methods did you use?
At first we tried to use the Fourier transform to get robust features and train a supervised machine learning algorithm, but then we would have had an extremely large training dataset to work with. That was the reason we changed our approach in favour of simple statistical tests.
What was your most important insight into the data?
Our solution relies on the way the data was generated, so observing the regularity in the data was the insight that influenced us the most.
Were you surprised by any of your insights?
Actually, we expected that the data would be real, for example recorded live from a radio.
Which tools did you use?
Our code was written in Python. We used numpy and scipy to calculate values of the normal cumulative distribution function.
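As an illustration of how a simple statistical test built on the normal CDF can separate tone from noise, the following sketch uses assumptions that are mine rather than the team's (the envelope chunks, the noise statistics and the significance threshold are all hypothetical):

```python
import numpy as np
from scipy.stats import norm

def chunk_has_signal(chunk, noise_mean, noise_std, alpha=0.01):
    """One-sided test: is the mean of this chunk of envelope samples
    improbably large under a noise-only hypothesis?"""
    n = len(chunk)
    # Under noise alone the chunk mean is roughly Normal(noise_mean, noise_std / sqrt(n))
    z = (np.mean(chunk) - noise_mean) / (noise_std / np.sqrt(n))
    p_value = 1.0 - norm.cdf(z)   # tail probability of seeing a mean this large
    return p_value < alpha        # True -> treat the chunk as key-down (tone)
```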
What have you taken away from this competition?
We gained great experience in audio signal processing and in the applicability of machine learning approaches.
Tobias Lampert
Public leaderboard score: 0.02
Private leaderboard score: 0.12
Source code & supporting files from Tobias
What was your background prior to entering this challenge?
I do not have a college degree, but I have 16 years of professional experience developing software in a variety of languages, mainly in the financial sector. I have been interested in machine learning for quite some time but seriously started getting into the topic after completing Andrew Ng's machine learning Coursera class about half a year ago.
What made you decide to enter?
Unlike most other Kaggle competitions, the raw data is very comprehensible and not just "raw numbers" - after all, Morse code audio can easily be decoded by humans with some experience. So to me it looked like an easy and fun exercise to write a program for the task. In the beginning this proved to be true and, as expected, I achieved decent results with relatively little work. However, the chance of finding the perfect solution kept me trying hard until the very end!
What preprocessing and supervised learning methods did you use?
Preprocessing:
- Butterworth filter for initial computation of dit length
- FFT to transform data from the time to the frequency domain
- PCA to reduce dimensionality to just one dimension
Unsupervised learning:
- K-Means to generate clusters of peaks and troughs
Supervised learning:
- Neural network for Morse code denoising
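To make this pipeline concrete, here is a rough sketch of how such steps could be combined; the tone frequency, filter order, chunk length and cluster handling below are my own assumptions, not Tobias's actual parameters:

```python
import numpy as np
from scipy.signal import butter, sosfilt
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def keying_from_audio(samples, fs, dit_seconds, tone_hz=600.0):
    """Return one 0/1 label per dit-length chunk: 1 = tone, 0 = silence."""
    # Band-pass filter around the assumed tone frequency
    sos = butter(4, [tone_hz - 100, tone_hz + 100],
                 btype="bandpass", fs=fs, output="sos")
    filtered = sosfilt(sos, samples)

    # Cut into dit-length chunks and take the FFT magnitude of each chunk
    chunk = int(dit_seconds * fs)
    n_chunks = len(filtered) // chunk
    spectra = np.abs(np.fft.rfft(
        filtered[:n_chunks * chunk].reshape(n_chunks, chunk), axis=1))

    # Project each chunk's spectrum down to a single feature value
    feature = PCA(n_components=1).fit_transform(spectra)

    # Two clusters: key-down (peaks) vs. key-up (troughs)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feature)

    # Make label 1 the cluster with the larger mean feature (the tone)
    if feature[labels == 0].mean() > feature[labels == 1].mean():
        labels = 1 - labels
    return labels
```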
What was your most important insight into the data?
The most important insight was probably the fact that all files have a structure which can be heavily exploited: because the pauses at the beginning and end have the same length in all files and the WPM is constant, the exact length of one dit can be computed. Using this information, the files can be cut into chunks that fully contain either signal or no signal, making further analysis much easier.
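For readers who want the timing made explicit: under the standard PARIS word rate one dit lasts 1.2 / WPM seconds, so dit-aligned chunk boundaries follow directly from the sample rate, the speed and the (assumed constant) leading silence. The values below are placeholders, not the actual challenge parameters:

```python
def dit_chunk_boundaries(n_samples, fs=8000, wpm=20, lead_in_seconds=1.0):
    """Sample indices at which dit-aligned chunks start.

    One dit lasts 1.2 / wpm seconds under the PARIS timing standard;
    if the leading silence is identical in every file, each chunk
    then contains either pure tone or pure silence.
    """
    dit_samples = int(round(1.2 / wpm * fs))
    start = int(round(lead_in_seconds * fs))
    return list(range(start, n_samples - dit_samples + 1, dit_samples))
```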
Were you surprised by any of your insights?
What surprised me most is that after trying several supervised learning methods like neural networks and SVMs with varying success, a very simple approach using an unsupervised method (K-Means) yielded the best results.
Which tools did you use?
I started with R for some quick tests but switched to Python with scikit-learn very early; additionally I used the ffnet module for neural networks. To get a better grasp on the data, I did a lot of charting using matplotlib.
What have you taken away from this competition?
First of all, I obviously learned a lot about how Morse code works, how it can be represented mathematically and which patterns random Morse texts always have in common. I also deepened my knowledge of signal processing and filtering, even though in the end this only played a minor role in my solution. Like all Kaggle competitions, trying to make sense of the data, competing with others and discussing solution approaches was great fun!
Observations & Conclusions
I asked for advice in the Kaggle forum on how to promote the challenge and attract participants, and I tried to encourage people to join during the first 2 weeks by posting frequent forum updates. Based on the download statistics (see the table below), the participation rate of this challenge was roughly 11%, as there were 13 actual participants and 120 unique users who downloaded the audio files. I don't know if this is typical for Kaggle competitions, but certainly there were many more interested people than actual participants.
Filename | Size | Unique Users | Total Downloads | Bandwidth Used
---|---|---|---|---
sampleSubmission.csv | 2.02 KB | 126 | 226 | 254.21 KB
levenshtein.py | 1.56 KB | 83 | 120 | 129.69 KB
morse.py | 12.69 KB | 126 | 196 | 1.56 MB
audio_fixed.zip | 65.91 MB | 120 | 179 | 7.72 GB
I also did a short informal survey among the participants in preparation for the MLM v2 challenge. Here are some example responses:
Q: Why did you decide to participate in the MLM v1 challenge?
- I have a background in signal processing and it seemed like a good way to refresh my memory.
- By participating I tried to strengthen my Computer Science knowledge. At the university I am attending a Machine Learning & Statistics course, so the challenge can help me practice.
- You were enthusiastic about it and it seemed fun/challenging/new.
- More realistic problem setting.
- The task is interesting and fun by itself, but the details may vary, so making a challenge with an accent on different aspects of the task would be exciting.
In MLMv1 the 200 audio files all had a very similar structure: 20 random characters each, without spaces. Participants were able to leverage this structure to overcome the poor signal-to-noise ratio in many files. I was surprised to get such good decoding results given that many audio files had only -12 dB SNR.
My conclusion for the next MLM v2 challenge is that I should provide real-world, recorded Morse signals and reduce the impact of the audio file structure. Also, to make the challenge more realistic I need to incorporate RF propagation effects in the audio files. I could also stretch the SNR down to the -15 ... -20 dB range, making it harder to search for the correct answers.