To build smaller and better-performing embedded machine learning models, we lean heavily on signal processing to clean up sensor data and extract features before applying any machine learning. This leads to more efficient and more explainable ML models, as common signal processing algorithms are well understood, easy to debug, and can often be computed very efficiently in hardware. After the signal processing step, a much smaller machine learning model can do the final classification.
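As a rough illustration of that feature-extraction step, here is a minimal spectrogram sketch in Python with NumPy. It is not the Edge Impulse Spectrogram block: the function name `spectrogram`, the frame length and stride, the Hann window, and the 16 kHz test tone are all assumptions chosen for the example.

```python
import numpy as np


def spectrogram(signal, frame_length=256, frame_stride=128):
    """Magnitude spectrogram via a short-time FFT (illustrative sketch only)."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_length:
        raise ValueError("signal is shorter than one frame")
    # Number of full frames that fit when sliding by frame_stride samples.
    n_frames = 1 + (len(signal) - frame_length) // frame_stride
    window = np.hanning(frame_length)  # taper each frame to reduce spectral leakage
    frames = np.stack([
        signal[i * frame_stride: i * frame_stride + frame_length] * window
        for i in range(n_frames)
    ])
    # One row per frame, one column per frequency bin (0 .. sample_rate / 2).
    return np.abs(np.fft.rfft(frames, axis=1))


# Hypothetical input: a 100 ms fragment containing a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate // 10) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

features = spectrogram(tone)
peak_bin = int(np.argmax(features.mean(axis=0)))
peak_hz = peak_bin * sample_rate / 256  # bin width = sample_rate / frame_length
```

The resulting 2D array (frames by frequency bins) is the kind of compact input a small classifier can work with instead of raw audio samples.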
@janjongboom, @aurel, today I migrated my little doorbell project from the Audio (MFE) block to the new Spectrogram block. The first results are extremely promising!
With Audio (MFE) I got about 0.4% false positives, where a false positive is a 100 ms audio fragment that is incorrectly classified as a doorbell ring.
After switching to the new Spectrogram block I got the following results:
0.24% false positives for the first run.
0.10% false positives for the second run: I added 10 false positives detected during the first run to the training set, retrained the model, and deployed it.
0.03% false positives for the third run: I added another batch of false positives detected during the second run to the training set, reduced the window increase to get roughly the same number of “ring” and “other” samples, then retrained and deployed the model.
0 false positives (based on 132909 audio fragments) for the fourth run: I added 20 false positives detected during the third run to the training set, then retrained and deployed the model.
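The retraining loop above can be sketched roughly as follows. This is an illustration only, not the project's actual code: the function names, the file names, and the made-up predictions are hypothetical, and the false positive rate is assumed to be measured over fragments known to contain no ring.

```python
def false_positive_rate(predictions, labels, positive="ring"):
    """Share of non-ring fragments that the model classified as a ring."""
    negatives = [pred for pred, label in zip(predictions, labels) if label != positive]
    if not negatives:
        return 0.0
    return sum(pred == positive for pred in negatives) / len(negatives)


def hard_negatives(fragments, predictions, labels, positive="ring", limit=10):
    """Pick misclassified non-ring fragments to add to the training set."""
    picked = [
        frag
        for frag, pred, label in zip(fragments, predictions, labels)
        if pred == positive and label != positive
    ]
    return picked[:limit]


# Hypothetical run: 1000 fragments without a ring, 4 of them misclassified.
labels = ["other"] * 1000
predictions = ["ring"] * 4 + ["other"] * 996
fragments = [f"fragment_{i:04d}.wav" for i in range(1000)]

rate = false_positive_rate(predictions, labels)
to_add = hard_negatives(fragments, predictions, labels)
print(f"{rate:.2%} false positives, {len(to_add)} fragments to add to the training set")
```

Each run then repeats the same steps: label the selected fragments as “other”, add them to the training set, retrain, deploy, and measure again.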
To make sure it was not simply classifying everything as “other”, I briefly rang my bell, and it properly detected 4 ring sounds during a 500 ms period (and this after it had reported more than 130000 audio fragments as “other”).
Some more details about my settings and changes can be found in the following issue: