April 1, 2021
Overview
CT8 is an exciting new digital mode designed for interactive ham radio communication where signals may be weak and fading, and openings may be short.
A beta release of CT8 offers sensitivity down to –48 dB on the AWGN channel and DX contacts over distances four times longer than FT8. An auto-sequencing feature offers the option to respond automatically to the first decoded reply to your CQ.
The best part of this new mode is that it is easy to learn to decode in your head, so no decoder software is needed. Alpha users of CT8 mode report that learning to decode CT8 is ten times easier than learning Morse code. For those who would rather use a computer, an open source, TensorFlow-based Machine Learning decoder is included in this beta release.
CT8 is based on a novel avian vocalization encoding scheme. The character combinations were designed to be very easily recognizable and to leverage existing QSO practices from communication modes like CW.
Below is an example audio clip showing how to establish a CT8 contact. The message format should be familiar to anybody who has listened to Morse code on the ham radio bands before.
Fig 1. CT8 spectrogram - CQ CQ CQ DE AG1LE K
The audio clip sample may sound a bit like a chicken. This is actually a key feature of avian vocalization encoding.
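The message in Fig 1 follows the usual CW calling pattern, so a decoded text string can be handled the same way a CW operator would handle it. The sketch below is only an illustration of that pattern and of the auto-sequencing idea: the regular expression, the function names `parse_cq` and `build_reply`, and the callsign N0CALL are assumptions made for this example and are not taken from the CT8 software.

```python
import re

# Hypothetical pattern: one or more "CQ", then "DE", a callsign, and a closing "K".
CQ_PATTERN = re.compile(r"^(?:CQ\s+)+DE\s+([A-Z0-9/]+)\s+K$")


def parse_cq(message):
    """Return the calling station's callsign, or None if this is not a CQ call."""
    match = CQ_PATTERN.match(message.strip().upper())
    return match.group(1) if match else None


def build_reply(their_call, my_call):
    """Build a conventional CW-style reply: <their call> DE <my call> K."""
    return f"{their_call} DE {my_call} K"


caller = parse_cq("CQ CQ CQ DE AG1LE K")
if caller is not None:
    # N0CALL is a placeholder callsign used only for this example.
    print(build_reply(caller, "N0CALL"))
```

An auto-sequencer along these lines would simply answer the first decoded message for which `parse_cq` returns a callsign.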
Scientific Background
The idea behind CT8 mode is not new. A lot of research has been done on avian vocalizations over the past hundred years. Since the late 1990s, digital signal processing software has become widely available, and vocal signals can be analyzed using sonograms and spectrograms on a personal computer.
In research article [1] Dr. Nicholas Collias described sound spectrograms of 21 of the 26 vocal signals in the extensive vocal repertoire of the African Village Weaver (Ploceus cucullatus). A spectrographic key to vocal signals helps make these signals comparable for different investigators. Short-distance contact calls are given in favorable situations and are generally characterized by low amplitude and great brevity of notes. Alarm cries are longer, louder, and often strident calls with much energy at high frequencies, whereas threat notes, also relatively long and harsh, emphasize lower frequencies.
In a very interesting research article [2], Kevin G. McCracken and Frederick H. Sheldon conclude that the characters most subject to ecological convergence, and thus of least phylogenetic value, are first peak-energy frequency and frequency range, because sound penetration through vegetation depends largely on frequency. The most phylogenetically informative characters are number of syllables, syllable structure, and fundamental frequency, because these are more reflective of behavior and syringeal structure. The figure below gives details about heron phylogeny, corresponding spectrograms, vocal characters, and habitat distributions.
Habitat distributions suggest that avian species that inhabit open areas such as savannas, grasslands, and open marshes have higher peak-energy (J) frequencies (kHz) and broader frequency ranges (kHz) than do taxa inhabiting closed habitats such as forests. Number of syllables is the number most frequently produced.
Ibises, tiger-herons, and boat-billed herons emit a rapid series of similar syllables; other heron vocalizations generally consist of singlets, doublets, or triplets. Syllabic structure may be tonal (i.e., pure whistled notes) or harmonic (i.e., possessing overtones; integral multiples of the base frequency). Fundamental frequency (kHz) is the base frequency of a syllable and is a function of syringeal morphology.
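Two of the vocal characters described above, peak-energy frequency and frequency range, can be estimated from a short clip with a plain Fourier transform. The following is a minimal sketch under stated assumptions, not the method used in [2]: it uses a naive DFT from the Python standard library, an 8000 Hz sample rate, a synthetic two-tone "harmonic" syllable, and a 10% magnitude threshold for the frequency range, all chosen only for illustration.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2 (fine for short clips, slow for long ones)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc))
    return spectrum


def peak_energy_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest non-DC bin."""
    spectrum = magnitude_spectrum(samples)
    peak_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
    return peak_bin * sample_rate / len(samples)


def frequency_range(samples, sample_rate, threshold=0.1):
    """Lowest and highest frequency in Hz whose magnitude reaches threshold * peak."""
    spectrum = magnitude_spectrum(samples)
    limit = threshold * max(spectrum[1:])
    active = [k for k in range(1, len(spectrum)) if spectrum[k] >= limit]
    scale = sample_rate / len(samples)
    return active[0] * scale, active[-1] * scale


# Synthetic harmonic syllable: 500 Hz fundamental plus a weaker 1000 Hz overtone.
RATE = 8000  # assumed sample rate in Hz
N = 256
syllable = [
    math.sin(2 * math.pi * 500 * t / RATE) + 0.5 * math.sin(2 * math.pi * 1000 * t / RATE)
    for t in range(N)
]

print(peak_energy_frequency(syllable, RATE))
print(frequency_range(syllable, RATE))
```

For this synthetic syllable the strongest bin sits at the fundamental, and the overtone at twice that frequency sets the upper end of the range. Real recordings would need windowing and a faster FFT.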
These vocalization features can be used for training modern machine learning algorithms. In fact, in a series of studies published [3] between 2014 and 2016, Georgia Tech research engineer Wayne Daley and his colleagues exposed groups of six to 12 broiler chickens to moderately stressful situations (such as high temperatures, increased ammonia levels in the air, and mild viral infections) and recorded their vocalizations with standard USB microphones. They then fed the audio into a machine learning program, training it to recognize the difference between the sounds of contented and distressed birds. According to the Scientific American article [4], Carolynn “K-lynn” Smith, a biologist at Macquarie University in Australia and a leading expert on chicken vocalizations, says that although the studies published so far are small and preliminary, they are “a neat proof of concept” and “a really fascinating approach.”
What does CT8 stand for?
Building on this solid scientific foundation, it is easy to imagine very effective communication protocols based on millions of years of evolution of various avian species. After all, birds are social animals and have very expressive and effective communication protocols, whether to warn others about an approaching predator or to invite flock members to join in feasting on a corn field.
Humans have domesticated several avian species and have been living with species like the chicken (Gallus gallus domesticus) for over 8000 years. Therefore CT8 mode sounds inherently natural to humans and, based on extensive alpha testing performed by the development team, is much easier to learn to decode than Morse code.
CT8 stands for "Chicken Talk" version 8. After over a year of development effort, seven previous encoding versions tested over difficult band conditions, and hundreds of Machine Learning models trained, the software development team has finally been able to release the CT8 digital mode.
Encoding Scheme
sox -b16 -c 1 input.wav output.wav rate 8000
The encoding scheme for the CT8 mode was developed by collecting various free audio sources of chicken sounds and carefully assembling vowels, plosives, fricatives and nasals using this resource as the model. The free, open source, cross-platform audio software Audacity was used to extract vocalizations using the spectrogram view and to create labeled audio files.
Figure 3 below shows a sample audio file with assigned character labels.
Fig 3. Labeled vocalizations using Audacity software
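Audacity can export a label track as a tab-separated text file with one "start, end, label" row per label, which makes it straightforward to cut labeled vocalizations out of a recording. The sketch below shows one way to read such a file. The label values, the function names, and the 8000 Hz sample rate are assumptions for this example; the actual CT8 character-to-vocalization mapping is not given in this article.

```python
def parse_audacity_labels(text):
    """Parse an Audacity label export: one 'start<TAB>end<TAB>label' row per line."""
    labels = []
    for line in text.splitlines():
        parts = line.split("\t")
        # Skip blank or short lines, and the extra frequency-range rows
        # (starting with a backslash) that Audacity writes for spectral selections.
        if len(parts) < 3 or line.startswith("\\"):
            continue
        labels.append((float(parts[0]), float(parts[1]), parts[2]))
    return labels


def sample_span(start, end, sample_rate=8000):
    """Convert a label's start/end times in seconds to sample indices."""
    return int(round(start * sample_rate)), int(round(end * sample_rate))


# Hypothetical label file contents, for illustration only.
example = "0.10\t0.35\tC\n0.40\t0.62\tQ\n"

for start, end, label in parse_audacity_labels(example):
    first, last = sample_span(start, end)
    print(label, first, last)
```

Each span can then be used to slice the corresponding samples out of the recording to build a per-character training clip.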
CT8 Software
The encoder software is written in C++ and Python and runs on Windows, OSX, and Linux. The sample decoder will be made available on GitHub as open source software if there is enough interest in this novel communication mode from the ham radio community.
The CT8 decoder is Machine Learning based and was built on top of the open source TensorFlow framework. The decoder was trained on short 4 second audio clips, and in the experiments a character error rate of 0.1% and a word accuracy of 99.5% were achieved. With more real-world training material, the ML model is expected to achieve even better decoding accuracy.
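The article does not say exactly how these two figures were computed. A common convention is to define character error rate as the Levenshtein edit distance between the decoded text and the reference, divided by the reference length. The sketch below follows that convention and adds a simple position-by-position word accuracy; both definitions and all function names are assumptions for illustration, not the CT8 team's evaluation code.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance divided by the number of reference characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def word_accuracy(reference, hypothesis):
    """Fraction of reference words matched at the same position (a simplification)."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    matches = sum(r == h for r, h in zip(ref_words, hyp_words))
    return matches / len(ref_words)


truth = "CQ CQ CQ DE AG1LE K"
decoded = "CQ CQ CQ DE AG1LF K"  # one wrong character in the callsign
print(character_error_rate(truth, decoded))
print(word_accuracy(truth, decoded))
```

A single wrong character in a 19-character message already gives a character error rate above 5%, which puts the reported 0.1% figure in perspective.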
Future Enhancements
CT8 opens a new era for ham radio communication protocol development using biomimetic principles. Adding new phonemes using the principles of ecological signals as described in article [2] can open up things like a "DX mode" for long distance communication. For example, the vocalizations of cetaceans (whales) could also be used to build a new phoneme map for DX contacts: some of the lowest frequency whale sounds can travel through the ocean as far as 10,000 miles without losing their energy.
73 de AG1LE
PS. If you made it down here, I hope that you enjoyed this figment of my imagination and I wish you a very happy April 1st.
References
[1] Nicholas E. Collias, Vocal Signals of the Village Weaver: A Spectrographic Key and the Communication Code
[2] Kevin G. McCracken and Frederick H. Sheldon, Avian vocalizations and phylogenetic signal
[3] Wayne Daley et al., Identifying rale sounds in chickens using audio signals for early disease detection in poultry
[4] Scientific American, Ferris Jabr, Fowl Language: AI Decodes the Nuances of Chicken “Speech”