Recognize my doorbell

janvda · July 21, 2020, 9:06am

The use case:

Our doorbell is not clearly audible on the 2nd floor or in the garden.
I have an Intel NUC6CAYS device (with embedded microphone) running Docker.
I would like to create a Docker service that constantly listens for the doorbell sound and publishes a message to my MQTT broker whenever someone rings the doorbell.
Another Docker service will subscribe to this MQTT message and somehow notify me.

Solution
I would like to use Edge Impulse to train the model and run it in a Docker service on my Intel NUC.

My doorbell

A picture of my doorbell : <picture to be added once I have the right to add more than 1 picture to a forum post>

A 1-minute spectrogram covering 9 bell rings : <picture to be added once I have the right to add more than 1 picture to a forum post>

Below is the spectrogram (produced by Audacity) covering a single bell ring.

You can see that the bell started ringing at 14.75 s, ran at full power until 15.00 s (duration = 250 ms), and then gradually faded until about 15.60 s (duration = 600 ms). Of course, people can press the bell for longer.

My questions
I would appreciate some guidance on how best to train the model for this use case.
Any hints about suitable starting parameters for training are very welcome.

janjongboom · July 21, 2020, 9:24am

Hi @janvda:

Isolate the doorbell sound (e.g. in a 1 second sample that has the doorbell in the middle) and export it as a WAV file in Audacity. Do this for a bunch of recordings, with a variety of different background noises (not sure what’s close to the doorbell, e.g. TV noise if that’s an issue).
Grab background noise samples of normal things going on close to the doorbell. Make sure your dataset is balanced between the doorbell and noise classes.
Upload the samples on the Data acquisition page.
Train the model and try the default parameters for now (maybe make the window length a bit shorter, but I suggest experimenting a little if the initial model doesn’t work out of the box).
For deployment, the easiest option might be the WebAssembly output. Here’s how to interact with the library from Node.js. Grab the buffer in PCM format from the microphone and feed it into the library to get a classification. Some tips to harden the model (run multiple classifications on a rolling buffer) are here: https://www.edgeimpulse.com/blog/audio-based-shower-timer-with-a-phone-machine-learning-and-webassembly/
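
To illustrate the "multiple classifications on a rolling buffer" tip, here is a minimal sketch that averages the last few doorbell scores before deciding that the bell rang. It is not taken from the linked blog post. The function name `createScoreSmoother`, the history length of 4 and the threshold of 0.7 are all assumptions chosen for illustration, and the scores are made-up numbers standing in for real classifier output.

```typescript
// Hypothetical helper: smooths per-window doorbell scores so that a single
// noisy classification does not trigger a notification on its own.
function createScoreSmoother(historyLength: number, threshold: number) {
  const recentScores: number[] = [];

  // Takes the latest doorbell score (0..1) and returns true once the average
  // of the last `historyLength` scores reaches `threshold`.
  return function push(score: number): boolean {
    recentScores.push(score);
    if (recentScores.length > historyLength) {
      recentScores.shift(); // drop the oldest score
    }
    if (recentScores.length < historyLength) {
      return false; // not enough history yet
    }
    const mean =
      recentScores.reduce((sum, s) => sum + s, 0) / recentScores.length;
    return mean >= threshold;
  };
}

// Made-up scores: one isolated spike, then a sustained ring.
const pushScore = createScoreSmoother(4, 0.7);
const ringDecisions = [0.1, 0.95, 0.1, 0.1, 0.9, 0.9, 0.8, 0.9].map((s) =>
  pushScore(s),
);
console.log(ringDecisions); // only the last entry is true
```

The isolated 0.95 spike never pushes the average over the threshold. Only the sustained run of high scores at the end does.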

janvda · July 23, 2020, 11:22am

Thanks for the response.

I have looked at the link, which makes clear how to run it locally for a specific input sample.

It is not clear to me how I should deal with an audio stream coming from the microphone. My end goal is to do real-time classification.

janjongboom · July 23, 2020, 1:38pm

I’m not 100% sure how to do it on the NUC, but in essence:

Get the audio feed from the microphone (e.g. with something like https://github.com/noffle/mic-stream, but I’ve never used it) and let it sample at the frequency that you trained your model on (probably 16 kHz).
If the microphone delivers a higher sample rate, downsample to the frequency that you trained the model on (16 kHz).
Get 16,000 samples this way (which is 1 second of data).
Put those 16,000 samples in an array and call classifier.classify() on it.
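
The downsampling step could look roughly like the sketch below. It assumes the microphone delivers 16-bit PCM at 48 kHz, which is a guess and not something confirmed for the NUC, and it uses plain block averaging in place of a proper resampling filter. The function name `downsample` is hypothetical.

```typescript
// Downsample by an integer factor by averaging each group of `factor` samples.
// Averaging acts as a crude low-pass filter; a real resampler would do better.
function downsample(
  input: Int16Array,
  inputRate: number,
  targetRate: number,
): Int16Array {
  const factor = inputRate / targetRate;
  if (!Number.isInteger(factor) || factor < 1) {
    throw new Error("inputRate must be an integer multiple of targetRate");
  }
  const output = new Int16Array(Math.floor(input.length / factor));
  for (let i = 0; i < output.length; i++) {
    const block = input.subarray(i * factor, (i + 1) * factor);
    const sum = block.reduce((acc, v) => acc + v, 0);
    output[i] = Math.round(sum / factor);
  }
  return output;
}

// One second captured at an assumed 48 kHz becomes 16,000 samples at 16 kHz.
const capturedSecond = new Int16Array(48000).fill(1000);
const resampledSecond = downsample(capturedSecond, 48000, 16000);
console.log(resampledSecond.length); // 16000
```

If the microphone can be configured to record at 16 kHz directly, this step is unnecessary.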

Do this continuously with some overlap (say, every 250 ms you run the classifier on the last second of data) and there you go.
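
The overlapping-window loop can be sketched as below, assuming 16 kHz audio, a 1 second window and a 250 ms step as described above. The `classify` callback is a hypothetical stand-in for the real classifier.classify() call, and `createSlidingWindow` is an illustrative name, not part of any library. Microphone chunks of any size can be fed in as they arrive.

```typescript
// Keeps the most recent `windowSamples` samples in a ring buffer and calls
// `classify` with them every `hopSamples` new samples once the buffer is full.
function createSlidingWindow(
  classify: (window: number[]) => void, // stand-in for the real classifier
  windowSamples: number,
  hopSamples: number,
) {
  const ring = new Array<number>(windowSamples).fill(0);
  let total = 0; // number of samples received so far

  // Call this with every chunk of samples the microphone delivers.
  return function onChunk(chunk: number[]): void {
    chunk.forEach((sample) => {
      ring[total % windowSamples] = sample;
      total++;
      if (total >= windowSamples && (total - windowSamples) % hopSamples === 0) {
        const oldest = total % windowSamples; // index of the oldest sample
        // Reorder so the window runs from oldest to newest sample.
        classify(ring.slice(oldest).concat(ring.slice(0, oldest)));
      }
    });
  };
}

// Feed 2 seconds of silence in 1,000-sample chunks: 1 s window, 250 ms step.
let classifyRuns = 0;
const feedAudio = createSlidingWindow(() => { classifyRuns++; }, 16000, 4000);
for (let i = 0; i < 32; i++) {
  feedAudio(new Array<number>(1000).fill(0));
}
console.log(classifyRuns); // 5
```

The classifier first runs once a full second has been collected and then again after every further 250 ms, so each run shares 750 ms of audio with the previous one.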