I’m working on making an audio classification (keyword spotting) system on an STM32 Nucleo-L476RG board. I’ve trained a classifier on the pre-made yes/no keyword dataset and downloaded the C++ library.
What I’d like to do is use the library in STM32CubeIDE without Arduino or mbed (I plan to use I2S and DMA with a double buffer to read in audio). I’ve read through the porting guide, and from what I gathered, I need to create a nucleo-l476rg directory in edge-impulse-sdk/porting with debug_log.cpp and ei_classifier_porting.cpp files. These files should contain the functions present in the other board .cpp files. The functions should define things like ei_printf() (using the Nucleo’s UART port) and ei_read_timer_ms() (i.e. reading from a timer that ticks once per millisecond).
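For illustration, here is a minimal sketch of what those two porting functions could look like. The signatures follow what the other board folders appear to use, so check them against the SDK's porting header. `board_uart_write()` and `board_get_tick_ms()` are hypothetical stand-ins that let the sketch build on a desktop compiler; they are not part of the SDK or the HAL.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-ins (not SDK or HAL names). On the Nucleo,
// board_uart_write() would wrap HAL_UART_Transmit() and
// board_get_tick_ms() would wrap HAL_GetTick().
static std::string g_uart_log;   // captures what would be sent over UART
static uint32_t g_tick_ms = 0;   // stands in for the 1 ms tick counter

static void board_uart_write(const char *data, size_t len) {
    g_uart_log.append(data, len);
}

static uint32_t board_get_tick_ms() {
    return g_tick_ms;
}

// Format into a fixed-size buffer, then push the bytes out over the UART.
void ei_printf(const char *format, ...) {
    assert(format != nullptr);
    char buffer[256];
    va_list args;
    va_start(args, format);
    int written = std::vsnprintf(buffer, sizeof(buffer), format, args);
    va_end(args);
    if (written <= 0) {
        return;  // nothing to send, or a formatting error
    }
    size_t len = static_cast<size_t>(written);
    if (len >= sizeof(buffer)) {
        len = sizeof(buffer) - 1;  // output was truncated to fit the buffer
    }
    board_uart_write(buffer, len);
}

// Milliseconds since start-up, read from the millisecond tick.
uint64_t ei_read_timer_ms() {
    return board_get_tick_ms();
}

// Microsecond variant derived from the millisecond tick (1 ms resolution).
uint64_t ei_read_timer_us() {
    return ei_read_timer_ms() * 1000ULL;
}
```

On the board, the body of `board_uart_write()` would become a call to `HAL_UART_Transmit()` on the UART handle generated by CubeMX, and `board_get_tick_ms()` would return `HAL_GetTick()`.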
Does this sound like I’m on the right path for porting?
If so, here’s my next question: once I’ve defined the “porting” functions for my particular board, how do I select those particular debug_log.cpp and ei_classifier_porting.cpp files to be included in the build process (and not compile the other board files)? I know it’s probably something simple, and I’m just missing it.
@ShawnHymel there already is an stm32-cubeai folder which uses the STM32HAL libraries (despite the name it’s only some utility functions around timing and printing in this folder for stm32), so that should be fine. Just set up the UART as described here: https://docs.edgeimpulse.com/docs/using-cubeai#configuring-printf. You can either exclude all the other folders in the porting layer, or just delete them (not sure how CubeIDE is doing that).
That should be it. C++ files should automatically be picked up by the compiler.
@janjongboom Awesome, thank you! I removed all but the stm32-cubeai folder and that seems to help. However, I’m now running into an issue where the compiler does not like some of the assembly calls in the CMSIS folder (inside the downloaded library).
Here is one such error: error: impossible constraint in 'asm'
I thought that TFLite could be used without the CMSIS-NN library. I created the project with CubeIDE (so, CubeMX), which imports some CMSIS functions. Is there something I need to do to enable the CMSIS-NN framework, can I delete the CMSIS folder in the EI downloaded library, or did I miss something entirely with getting this to compile?
@ShawnHymel very interesting - I have never seen that error. Will have a test later this week. You could disable / remove the NN folder, and then set this macro to 0:
@ShawnHymel my guess is the GCC7 version that ST ships with their IDE is an issue, perhaps GCC9 works better? But naturally there is no way to change that nor to just generate a @&* Makefile
Anyway I’ve managed to compile by:
Create new C++ library in STM32CubeIDE (tested on the DISCO-L475VG)
@janjongboom It works! Or, at least it’s now compiling. Thank you for helping out with this. I’m assuming that it’s going to be a bit slower without the CMSIS-NN calls, but it should work well enough for the demo (I hope). It looks like you are correct in that it’s a bug with the ARM gcc compiler. I found a reference to it here: https://github.com/ARM-software/CMSIS_5/issues/996
I’m not familiar with ARM assembly, so the solutions/workarounds presented were a bit over my head.
@ShawnHymel we’ve backported the fix to our SDK, and no regressions on our target platforms. Will be available in the next release (later today) in all new exports.
@janjongboom Thank you! I put the CMSIS/Core and CMSIS/NN folders back in, patched the code with your fix, and removed the -DEI_CLASSIFIER_TFLITE_ENABLE_CMSIS_NN=0 flag. It seems to compile and work.
It seems to be quite slow on my Nucleo board. I’m using the yes/no dataset from your tutorials, trained with the MFCC -> NN blocks (keeping all defaults). In my code, I copied in a raw 16-bit sound buffer from one of the known-good samples and fed it to the classifier. It looks like DSP is taking ~350 ms and classification ~280 ms. Do those seem reasonable on an 80 MHz ARM (I’m using a Nucleo-L476RG)? This is in the release configuration (using the -DDEBUG flag doubles the classification time).
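As a rough sketch of the "raw 16-bit buffer fed to the classifier" step: the SDK reads audio through a callback of the shape `(offset, length, out_ptr)`, as far as I understand it. The buffer contents and the names `raw_audio` and `get_audio_data` below are made up for illustration. The sketch passes the integer sample values through as floats without scaling, which is an assumption worth checking against the SDK's own examples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sample data; in the real project this would be the raw
// 16-bit audio copied from a known-good sample.
static const int16_t raw_audio[] = {0, 16384, -16384, 32767, -32768, 100, -100, 0};
static const size_t raw_audio_len = sizeof(raw_audio) / sizeof(raw_audio[0]);

// Copies `length` samples starting at `offset` into `out_ptr` as floats.
// Returns 0 on success and -1 if the request is invalid or out of range.
int get_audio_data(size_t offset, size_t length, float *out_ptr) {
    if (out_ptr == nullptr) {
        return -1;
    }
    if (offset > raw_audio_len || length > raw_audio_len - offset) {
        return -1;
    }
    for (size_t i = 0; i < length; i++) {
        // Assumption: values are passed through unscaled (e.g. 16384 -> 16384.0f).
        out_ptr[i] = static_cast<float>(raw_audio[offset + i]);
    }
    return 0;
}
```

A callback like this would be assigned to the signal structure handed to the classifier. The DSP and classification times quoted above would then come from the timing fields of the result, or from wrapping the call with `ei_read_timer_ms()`.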
@ShawnHymel, set the macro EI_CLASSIFIER_TFLITE_ENABLE_CMSIS_NN=1 - it’s only enabled by default when we can detect the target (which we can’t in STM32CubeIDE, as it doesn’t set any macros for the MCU family). Classification should go down to ~30 ms.
For DSP set EIDSP_QUANTIZE_FILTERBANK=0. Takes 10K more RAM but should save you 100ms.
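Since both settings are compile-time macros, it can help to confirm that the -D flags actually reach the compiler in a CubeIDE build. The sketch below is one way to do that, not something from the SDK: `describe_flag()` and `print_build_flags()` are hypothetical helpers. It should live in a file that does not include the SDK headers, since those may supply their own defaults for these macros.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Record each macro's value, or -1 if the flag never reached this file.
#ifdef EI_CLASSIFIER_TFLITE_ENABLE_CMSIS_NN
static const int kCmsisNnFlag = EI_CLASSIFIER_TFLITE_ENABLE_CMSIS_NN;
#else
static const int kCmsisNnFlag = -1;
#endif

#ifdef EIDSP_QUANTIZE_FILTERBANK
static const int kQuantizeFilterbankFlag = EIDSP_QUANTIZE_FILTERBANK;
#else
static const int kQuantizeFilterbankFlag = -1;
#endif

// Writes "NAME=value", or "NAME is not defined" when value is negative.
int describe_flag(char *out, size_t out_size, const char *name, int value) {
    assert(out != nullptr && out_size > 0);
    if (value < 0) {
        return std::snprintf(out, out_size, "%s is not defined", name);
    }
    return std::snprintf(out, out_size, "%s=%d", name, value);
}

// Prints both flags once at start-up; on the board, ei_printf() could be
// used in place of std::printf().
void print_build_flags() {
    char line[96];
    describe_flag(line, sizeof(line), "EI_CLASSIFIER_TFLITE_ENABLE_CMSIS_NN", kCmsisNnFlag);
    std::printf("%s\n", line);
    describe_flag(line, sizeof(line), "EIDSP_QUANTIZE_FILTERBANK", kQuantizeFilterbankFlag);
    std::printf("%s\n", line);
}
```

If either line reports "is not defined", the flag is probably set for the wrong build configuration or only for the C compiler rather than the C++ compiler.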
Note that when switching to continuous audio mode the DSP slices are smaller, so this will go down; you can easily do 4-5 inferences a second that way.
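Continuous mode pairs naturally with the DMA double buffer mentioned at the top of the thread: while one half of the buffer is being filled, the other half is handed to the classifier as a slice. The sketch below simulates that hand-off on a desktop compiler. All names (`kSliceSamples`, `dma_buffer`, `take_ready_slice`, and the two callbacks) are hypothetical, and the slice size is tiny for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kSliceSamples = 4;             // illustrative slice length
static int16_t dma_buffer[2 * kSliceSamples];   // DMA would fill this circularly
static volatile int ready_half = -1;            // -1: nothing ready, 0 or 1: finished half

// With the HAL I2S driver these would be called from
// HAL_I2S_RxHalfCpltCallback() and HAL_I2S_RxCpltCallback().
void on_dma_half_complete() { ready_half = 0; }  // first half is full
void on_dma_full_complete() { ready_half = 1; }  // second half is full

// Called from the main loop. Returns the finished slice, or nullptr if no
// new slice is available. On real hardware the read-and-clear below should
// be protected against the interrupt firing in between.
const int16_t *take_ready_slice() {
    int half = ready_half;
    if (half < 0) {
        return nullptr;
    }
    ready_half = -1;
    return &dma_buffer[static_cast<size_t>(half) * kSliceSamples];
}
```

The main loop would poll `take_ready_slice()` and run the classifier on each slice it returns. Processing has to finish before the DMA wraps around to the same half again, otherwise samples are overwritten.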
Hi, I tried deploying an audio recognition example on an STM32F401RE using https://github.com/edgeimpulse/example-standalone-inferencing-mbed with Mbed OS, and then tried the same example in STM32CubeIDE.
I noticed that the DSP times were very different:
Mbed OS: DSP_TIME = 150 ms
STM32CubeIDE: DSP_TIME = 420 ms
I did also use the macros that you suggested in STM32CubeIDE.
What could cause this increase in time?