Is it possible to have two channels of audio input?

Hi everyone, I followed the instructions here: Responding to your voice - Edge Impulse Documentation to create an ML model on the Sony Spresense for voice activity detection (using only the noise and unknown parts of the Google dataset). My question is: right now, the model is created from one channel of audio input. Is it possible to use two channels of audio input to create the model? I hope to use the coherence between the left and right channels to make the classification more accurate.

Thank you in advance for any help.

Hi @LarryWang,

Our Sony firmware handles only one digital audio channel (out of the 4 available) and one model.
You can have a look at this example spresense-arduino-compatible/MainAudio.ino at master · sonydevworld/spresense-arduino-compatible · GitHub to see how to use more than one channel, and then modify our base firmware.

regards,
fv

Yes, it is possible, but as ei_francesco stated, the EI code does not support it out of the box. Download the EI Spresense code here and modify it (see this for how to handle interleaved audio data).
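For reference, interleaved stereo PCM stores samples as L0, R0, L1, R1, … — so before feeding each channel to its own processing block you have to split the buffer. A minimal sketch (the function name is mine, not from the EI code):

```cpp
#include <cstddef>
#include <cstdint>

// Interleaved stereo PCM arrives as L0, R0, L1, R1, ...
// Split it into separate left/right buffers so each channel
// can be processed independently.
void deinterleave_stereo(const int16_t *interleaved, size_t n_frames,
                         int16_t *left, int16_t *right) {
    for (size_t i = 0; i < n_frames; i++) {
        left[i]  = interleaved[2 * i];
        right[i] = interleaved[2 * i + 1];
    }
}
```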

Hi Marcial,

I have no trouble using 2 mics on the Spresense (without using EI). My question has always been about how to use EI to generate the model for two channels. I followed the tutorial to build a project very similar to the ‘Hello World’ recognition example, but I am wondering how to modify it to support two channels.

What should I do on the EI side to generate the relevant libraries? I do not know whether the processing block and the learning block can be applied to each channel separately. In short, the tutorial only shows how to use two channels; it does not show how to use EI to generate code for my application (2 channels of audio input).

Thank you in advance for any help.

Best,

Larry

Hi @ei_francesco,

Thank you for your kind reply. I hope to get a more detailed answer from you.

I have finished modifying the Arduino code to support more than one channel of audio input (without using EI). However, when using EI to generate the model (the relevant libraries), I got confused about how to design the impulse.

Since I have 4 classes (left audio, left noise, right audio, right noise) from two channels, how should I design the impulse? What should I do to ensure that the processing block (e.g. MFCC) and the learning block can be applied to each channel separately?

Right now, I have followed the tutorial to create the impulse for one channel (almost the same as the current ‘Hello World’ example), but I still do not understand what you mean by ‘modify our base firmware’.

Thank you in advance for any help.

Best,

Larry

Hi @LarryWang

Here is our firmware for the Sony Spresense board: https://github.com/edgeimpulse/firmware-sony-spresense
Since dual model/dual channel is not supported out of the box, you can start by modifying the base firmware for the Sony Spresense, perhaps beginning with adding support for a dual audio channel (here is the audio handling: firmware-sony-spresense/ei_microphone.cpp at main · edgeimpulse/firmware-sony-spresense · GitHub).

Then you can get familiar with deploying the C++ library by following these guides: C++ library - Edge Impulse Documentation and As a generic C++ library - Edge Impulse Documentation.

The next step will be to test what works better: running two separate models (one for the left channel and one for the right) or one model with two processing blocks, one per channel.
In both cases you have to modify the way we feed data when running inference, see here: firmware-sony-spresense/ei_run_audio_impulse.cpp at bb93749d2e94b2bdf3878b69e7c392b0ea99465d · edgeimpulse/firmware-sony-spresense · GitHub.
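The EI C++ SDK pulls audio through a `signal_t` callback of the form `int get_data(size_t offset, size_t length, float *out_ptr)`, which converts raw int16 PCM into floats on demand (the SDK provides `numpy::int16_to_float` for this). A self-contained sketch of a per-channel callback, with an illustrative buffer name of my own:

```cpp
#include <cstddef>
#include <cstdint>

// Raw PCM for one channel; in real firmware this would be filled by the
// microphone driver. Size and contents here are illustrative only.
static const size_t kWindowSize = 4;
static int16_t left_buffer[kWindowSize] = {0, 16384, -16384, 32767};

// Callback matching the EI signal_t::get_data signature: copy a window
// of one channel's samples into out_ptr as floats.
int get_left_data(size_t offset, size_t length, float *out_ptr) {
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = static_cast<float>(left_buffer[offset + i]);
    }
    return 0;  // 0 signals success to the SDK
}
```

In the firmware you would set `signal.get_data = &get_left_data;` (and a second callback for the right channel) before calling `run_classifier()`; the exact wiring depends on which of the two flows above you choose.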

regards,
fv


Hi @ei_francesco, thank you so much for your kind reply. As a beginner in this area, I still have some questions (sorry for bothering you).

  1. How difficult is it to modify the base firmware? Can it be completed within 10 hours? (I have never modified base firmware before.)
  2. What is the benefit of using the C++ library rather than the Arduino method? I followed the code here: Sony Spresense Arduino Code, and I can successfully run the impulse for one channel.
  3. Based on this post: Running multiple (two) Edge impulse model simultaneously in a single device, @aurel said that it is currently impossible to run two models simultaneously on an MCU device. I think the Spresense also cannot run two separate models. Can you confirm this?
  4. My long-standing question is: how can I run inference using two channels? I have the source code generated by EI for my one-channel dataset; does that mean I have to modify the source code directly, since EI cannot support two channels right now? I hope to use the cross-correlation between the two channels to make the classification more precise. Is it possible to achieve this?

Thank you in advance for any help.

Best,

Larry

Hi @LarryWang

  1. It depends on your coding skills. If you are already familiar with Arduino, it shouldn’t be hard, as we use the Arduino code for handling the microphone input. Check our firmware for the Spresense; for your use case, I think you just need to modify the way we feed data when running inference.
  2. I suggested the C++ export because I’m more familiar with it, but you can also use the Arduino sketches (I was not aware of these examples :slight_smile: ).
  3. I can confirm that you cannot run two models simultaneously.
  4. I have the source code generated by EI for my dataset for one channel, does it mean that I have to modify the source code directly since EI cannot support two channels right now? Yes — check how to include a C++ export, or you can try the Arduino library and modify the sketch.
    I hope to use the cross-correlation between two channels to make the classification more precise. Is it possible to achieve this? I don’t know; this is something you should test using one of the two flows I suggested in the previous answer.
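As a starting point for testing the cross-correlation idea, a normalized correlation at zero lag between the two channel buffers is easy to compute on-device; values near 1.0 mean the channels are coherent. This is a generic sketch, not EI code:

```cpp
#include <cmath>
#include <cstddef>

// Normalized cross-correlation at zero lag between two equal-length
// channel buffers. Returns a value in [-1, 1]; near 1.0 means the
// left and right channels are strongly coherent.
float zero_lag_correlation(const float *left, const float *right, size_t n) {
    float dot = 0.0f, norm_l = 0.0f, norm_r = 0.0f;
    for (size_t i = 0; i < n; i++) {
        dot    += left[i] * right[i];
        norm_l += left[i] * left[i];
        norm_r += right[i] * right[i];
    }
    if (norm_l == 0.0f || norm_r == 0.0f) {
        return 0.0f;  // silent channel: no meaningful correlation
    }
    return dot / (std::sqrt(norm_l) * std::sqrt(norm_r));
}
```

Scanning a small range of lags instead of only lag 0 would additionally give a time-delay estimate between the mics, if that turns out to be useful for the classification.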

Regards,
fv

Thank you so much for your kind reply.

The reason you cannot run 2 models simultaneously is that Sony limits the Audio class to run only on Core0. However, you can run an IMU model on Core0 and a different IMU model, reading a different IMU, on Core1.