Wednesday, September 3, 2014

Morse Learning Machine - Challenge

MACHINE LEARNING CHALLENGE

I was astonished to get email acknowledgement that my  Kaggle Morse Challenge was approved today. I have spent last few days preparing materials and editing the description and the rules for this competition.

The goal of this competition is to build a machine that learns how to decode audio files containing Morse code.


For humans it takes many months effort to learn Morse code and after years of practice the most proficient operators can decode Morse code up to 60 words per minute or even beyond. Humans have also extraordinary ability to quickly adapt to varying conditions, speed and rhythm.

I want to find out if it is possible to create a machine learning algorithm that exceeds human performance and adaptability in Morse decoding.  I have shared some of these ideas in New England Artificial Intelligence meetup about one year ago and got enthusiastic feedback from the participants.




WHY KAGGLE?   

Kaggle is a platform for predictive modelling and analytics competitions on which companies and researchers post their data and statisticians and data miners from all over the world compete to produce the best models. This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know at the outset which technique or analyst will be most effective. Kaggle aims at making data science a sport.

Kaggle's community of data scientists comprises tens of thousands of PhDs from quantitative fields such as computer science, statistics, econometrics, maths and physics, and industries such as insurance, finance, science, and technology. They come from over 100 countries and 200 universities. In addition to the prize money and data, they use Kaggle to meet, learn, network and collaborate with experts from related fields.

For the Morse Learning Machine competition I hope to attract people from the Kaggle community who are interested in solving new, difficult challenges using their predictive data modeling, computer science and machine learning expertise.  For students this challenge provides a great opportunity to put theoretical concepts into practice and see how they can solve tough problems by applying knowledge gained in class rooms.


COMPETITION DETAILS


During the competition, the participants build a learning system capable of decoding Morse code. To that end, they get development data consisting of 200 .WAV audio files containing short sequences of randomized Morse code. The data labels are provided for a training set so the participants can self-evaluate their systems. To evaluate their progress and compare themselves with others, they can submit their prediction results on-line to get immediate feedback. A real-time leaderboard shows participants their current standing based on their validation set predictions.

I have also provided  sample Python Morse decoder  to make it easier too get started. While this software is purely experimental version it has some features of the FLDIGI Morse decoder   but implemented using Python instead of C++.

You can of course  leverage the experimental multichannel CW decoder I recently implemented on FLDIGI or the standalone version of Bayesian decoder written in C++.  There is also some new tools I posted to Github.

Please help me to spread this message to attract participants for the Morse Learning Machine challenge!

73
Mauri AG1LE