Public Lab Research note

Serial communication via headphone jack

by rmeister | June 10, 2016 18:06 10 Jun 18:06 | #13185 | #13185

rmeister was awarded the Basic Barnstar by warren for their work in this research note.

The main goal of this work is a communication channel between a sensor node (Arduino) and a browser client, that does not require installing drivers or other additional software. With a technology called WebRTC, browsers are capable of recording the audio input from the microphone. Therefore the headphone jack was choosen to implement the needed communication channel. As microphones use two-wire cable, data has to be transmitted serially. There is already an Arduino library that implementing this: SoftModem. And with modem.js a JavaScript implementation is given, that uses the same modulation scheme, but slightly different encoding. Thus, modem.js is not compatible with SoftModem (yet). That is why I started writing a decoder from scratch.

For reasons of familiarness, I've written an exemplary decoder in the Matlab/Octave language. Afterwards I'm going to translate the gained knowledge into a JavaScript implementation.

Before explaining the decoder itself I will recapitulate some technical fundamentals.

Technical background

The SoftModem library modulates data into signals with Frequency Shift Keying in the audible range. To be more exact, SoftModem uses BFSK (binary FSK, also sometimes called 2FSK) that is working with two frequencies. A binary 0 is mapped to the lower frequency and a binary 1 to the higher frequency. In the following picture we can see a recorded signal with its two frequencies.


But why do we even need this modulation and don't just pass the signal like this:


Well the reason is, that we want to transmit data via the headphone jack. This means we have a capacitively coupled link (due to safety and other reasons I will not further explain). From this we can conclude, that only alternating signals will pass the connection. So if we try to transmit the signal unmodulated, the receiver would see this:


Clearly, it is hard to detect bits correctly in this signal. For a sequence of identical bits the amplitude drops until a different bit follows. In the case of a (long) preamble the signal would fall very close to zero, making it impossible to detect its presence after a while.


First of all, we want to demodulate the signal, meaning we want to translate the low frequency into a constant low signal level and the high frequency into a high signal level. As we are interested in only two frequencies, we can calculate the correlation between the received signal and the two respective frequencies, instead of doing a full FFT. These are the steps for the correlation with one frequency:

  1. Obtain a window of the signal with a window length close to the number of samples per bit
  2. Multiply every sample with a sine and cosine of the frequency
  3. Sum up all products with the sine and separately all products of the cosine multiplication
  4. Square the two sums
  5. Add the squared sums
  6. Take the square root of the summation
  7. Save the value as first sample of the correlation
  8. Move the window one sample ahead and start over

Done for both frequencies, we obtain two signals containing the information how strong the frequency in question is present in the original signal. Finally, the difference of the correlation signals is the demodulated signal.

In the following picture we see (in this order) the recorded signal, the correlation with the lower frequency, the correlation with the higher frequency and the resulting demodulated signal.


One thing to note is the window length for the correlation process. Using a window leads to an averaged signal. Ideally the length fits to the number of samples per bit. If the length is to large, the influence from the previous and next bits will increase, which makes the resulting waveform blurred. In contrast to the demodulated signal above, the picture below shows a result with adjusted window length.


Now the signal is ready for decoding.


To understand the decoding process, let's first explain how data is encoded by SoftModem. This is similar to usual serial communication (e.g. UART). There are two main differences: befor sending the Start bit a preamble is prepended and the end of a transmission is denoted by a Push bit. Omitting the Push bit means more data will follow.

State 1 0 X X X X X X X X 1 1
Meaning Preamble Start d0 d1 d2 d3 d4 d5 d6 d7 Stop Push

Now let's see how we can decode the demodulated signal. The decoder is build as a finite state machine with the states PREAMBLE, START, DATA, STOP. Initially it is in the state PREAMBLE. In this state the decoder trys to find the preamble by counting the samples that are larger than a certain threshold. If a preamble was detected, the decoder jumps to the middle of the Start bit and goes into the START state. There the decoder evaluates if the Start bit is set or if it jumped into the START state by mistake. In case the Start bit is set, the decoder jumps ahead for the number of samples per bit. Now the decoder position should be in the middle of the first Data bit and in state DATA. In this position it evaluates the first Data bit and consecutively the following 7 bits. Afterwards, the decoder goes into the state STOP and checks if the Stop bit was set correctly. Following, the decoder checks the Push bit. If the Push bit is set, transmission is over and we go to the PREAMBLE state. Else the next state is DATA, as a missing Push bit is the equivalent of a Start bit and indicates that the sender will continue transmitting data.

In the next picture we see the positions (red lines), where the demodulated signal will evaluated by the decoder.


There is one issue with this proceeding: synchronization. Given that the transmitter sends faster (or slower) than the decoder expects, there will be an offset and the decoder may check bits at a wrong position. This problem occurs due to rounding errors in SoftModem. I calculated the relative error for the duration of a single bit for the default parameter sets of SoftModem.

Baud 126 315 630 1225
Lower frequency 882 1575 3150 4900
Higher frequency 1764 3150 6300 7350
relative error / bit HIGH 10,72% 4,43% 4,43% 0,04%
relative error / bit LOW 4,21% 1,75% 1,75% 0,04%

Regarding the error, the baudrate is negligible and the error rather depends on the used frequencies. To generate an alternating signal, SoftModem makes use of timers. After the timer hits a certain count, the signal will be flipped. When calculating the this count, a rounding error is induced. Another error may be induced by the choice of the sampling rate of the decoder. E.g. at the (common) sampling rate of 44,1kHz and a signal frequency of 1575Hz, one period will be exactly 28 samples. But if we choose a sampling rate of 29,4kHz (and the same signal frequency), one period would be aprox. 18.7 samples.

Furthermore I came across another problem regarding the detection of the preamble. In SoftModem the length of the preamble is variable, increasing with the time since the last transmission (until a maximum that depends on the baud rate). This is not necessarily bad, but the minimum preamble length may also be smaller than a sequence of 8 consecutive 1's. In case one transmission ended (with a push bit) and a new transmission starts immediately, the preamble is short and may not be detected. Other encoding schemes solve such issues with bit stuffing. In this case, I'd like to propose changing SoftModem to use a fixed length preamble.

I did this Help out by offering feedback! Browse other activities for "webjack"

People who did this (0)

None yet. Be the first to post one!


This is a fantastic and very clear explanation. Thanks so much for all the work and thought that went into this!

In the end, did you decide to modify Softmodem to use a fixed-length preamble, as per the discussion in the last paragraph?

Is this a question? Click here to post it to the Questions page.

Reply to this comment...

This is so fun!. I've been starting to reproduce the Matlab code in a Jupyter notebook, in order to understand the algorithm ... it's still a mess, but here is is in case it's interesting:

My interest at this point is in modifying both SoftModem and WebJack to implement Forward Error Correction. The simplest form being: send every bit 3 times from the Arduino side, and then do a 'majority vote' on the WebJack side. Then, could move onto more sophisticated codes ... but that'd be a really cool start, and might be enough to reduce error rates significantly ...

@rmeister, does that sound like a plausible modification? Or am I just to be simple because I have no idea what I'm doing yet? :)

Is this a question? Click here to post it to the Questions page.

Reply to this comment...

Update: actually, nevermind :) I've read more about error detection codes, and I'm learning that there might be better / easier ways of implementing error detection that won't require messing with SoftModem or the main Webjack code. Will post further updates as I learn more ... but meanwhile, don't waste your brain energy on my previous proposal, as it doesn't seem like the best way to go (at least, from my perspective now). Cheers!

Reply to this comment...

Crap, I thought I posted this comment but didn't, late last night; hopefully still helpful:

I think forward error correction could be done on top of webjacks encoder and decoder, without dipping into the encoder and decoder themselves, no?

Like, make a new library that uses webjack, and just for-loops to send each thing three times? But has the same API as webjack, like send, etc. That might be architecturally much simpler. The new library could be included with webjack, and/or chained into the pipeline between webjacks public API and the encoder/decoder.

Then you'd have to write the same thing as an Arduino library, or an extension to the modded soft modem lib -- perhaps with a similar abstraction to the webjack implementation.

Is this a question? Click here to post it to the Questions page.

Reply to this comment...

So if error detection is set to on, it pipes it through the error correction module before sending it to the encoder.

Reply to this comment...

Yes, exactly! That's what I finally came around to this morning :) That's the proper level of abstraction here; there are various clever schemes for detecting errors, and they can all be tried out on top of the system as it stands, just as you suggest.

I think there are also much more efficient and robust approaches than the simple "repeat three times and majority vote" approach I'd suggested ... going to post an update later today if I get something working ...


Reply to this comment...

Yeah and I'm also happy to add a callback function for any error correction module specified in the webjack options.

Reply to this comment...


Reply to this comment...

I left some more specific input on integration in this issue:

Reply to this comment...

Hi Don, thanks for your awesome feedback!

That notebook looks really cool. I wasn't aware of Jupyter and it looks like a great way to explain and visualize things.

I did not change SoftModem regarding the variable preamble. The current solution is to accept only preambles with a length of exactly 1 bit OR greater than 12 bits. This means that the preamble lengths 2-11 (times the length of one bit) can not be detected correctly. I've experimented with a probabilistic approach to come around this and detect all possible preamble lengths SoftModem will generate, but it did not work out well.

Regarding errors: if you want error correction, I think Hamming is a feasible choice. As you already pointed out, doing this as a layer on top of the SoftModem transmission scheme is a good idea.

I see on github that you already did some CRC stuff, very cool!

Reply to this comment...

Login to comment.