I am trying to create a rolling buffer for audio classification, so that audio is recorded while inference is happening. I am using 2-second audio clips. I added some print statements to the microphone_audio_signal_get_data() function in the Arduino example app to get a better sense of the pattern for reading data off the buffer, and printed the offset and length passed to the function during inference. The results are a little unexpected. It looks like it will read in 320 samples, then re-read a single sample at the end of the last read. It also never reads in the final set of 320 samples.
Is this the expected behavior? I also tried it with 1-second audio clips and the general pattern was the same.
Starting inferencing in 2 seconds...
offset: 31999 length: 1
offset: 0 length: 320
offset: 319 length: 1
offset: 320 length: 320
offset: 639 length: 1
offset: 640 length: 320
offset: 959 length: 1
offset: 960 length: 320
offset: 31359 length: 1
offset: 31360 length: 320
Predictions (DSP: 1452 ms., Classification: 139 ms., Anomaly: 0 ms.):
@robotastic, yeah it’s correct. Not reading the last 320 samples is a bug that was introduced early on in the Python implementation - which we use to drive Studio - and that we had to replicate on-device. This is really hard to fix without breaking older projects.
The 320/1 pattern is also correct; it’s due to the way we build up the filterbanks. Typically not that much of a problem unless you have high latency on flash, and with a double buffer we didn’t see much of a latency increase. If this is an issue on your end, please let us know!
Thanks @janjongboom !! When I coded with that pattern in mind, I was able to create a rolling buffer where the mic writes to one part of the buffer on an interrupt while inference is fed the audio recorded earlier. I have a double buffer working, but I ran out of space when I tried to work with 2-second clips.
Either way, it is really cool to be able to do continual inference!
Thanks for helping clear up how data gets read in.
@Robotastic, what target are you running on? It indeed adds up when doing 2-second samples (2 x 16000 x 2 bytes = 64,000 bytes for the scratch buffer).
It is an nRF52840-based board. I also think part of my problem is that I am doing a lot of DSP work: I am able to allocate both buffers, but then there is not enough memory left for the DSP.
ERR: -1002 (EIDSP_OUT_OF_MEM)
assertion “false” failed: file “/src/edge-impulse-sdk/dsp/speechpy/feature.hpp”, line 242, function: static int ei::speechpy::feature::mfe(ei::matrix_t*, ei
@Robotastic, ah, check. You could set the frame length and frame stride higher to lower the number of features. You can also enable the EIDSP_TRACK_ALLOCATIONS macro to log all allocations; that would at least give you an idea of how much space you’d need.