Where is the DSP calculation in the Python Linux SDK?

Question/Issue:
Hello, I am a beginner learning about voice/audio classification with the Raspberry Pi. I found the Edge Impulse Python Linux SDK, and I can run real-time inference on the RPi board perfectly — thanks to the dev team.
However, I have a few questions:

  1. Where are the math operations for the DSP and NN parts? I cannot find them in this file or in the library.
  2. What is the meaning of the values after the Result part (e.g. faucet)?
  3. In Edge Impulse Studio we can see performance information for the device, but how do we know the actual performance of the model we have created? In my case the deployment target is a Raspberry Pi 4.

Regards,

Hello @des_16,

  1. The operations are encapsulated in the Linux executable (.eim). See Edge Impulse for Linux - Edge Impulse Documentation for more info (section “.eim models?”). The executable is compiled using our C++ Linux SDK.

  2. (0 ms.) is the time it took to run the impulse; faucet: 0.XX is the probability that your model recognized this class, and noise: 0.XX is the probability that your model recognized the other class.

  3. Are you speaking about the RAM consumption / latency? The RAM you can see on Linux using a tool like top, and the latency is given by the inference time (0 ms. here in your case; note that we round the results). You can also change the print statement on this line:

 print('Result (%d ms.) ' % (res['timing']['dsp'] + res['timing']['classification']), end='')
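
For reference, here is roughly how the Python SDK drives the .eim file (a minimal sketch based on the classify.py example in the linux-sdk-python repository; the device index is an assumption and error handling is omitted):

 from edge_impulse_linux.audio import AudioImpulseRunner

 with AudioImpulseRunner('model.eim') as runner:
     model_info = runner.init()  # starts the .eim executable and reads its metadata
     for res, audio in runner.classifier(device_id=1):
         # res['timing'] holds the per-inference DSP and NN latency in ms
         print(res['timing'], res['result']['classification'])

All the DSP and NN math runs inside the .eim process; the Python side only streams audio in and reads the results back.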

Best,

Louis

Hi @louis,
Thanks for the response.

  1. Can we see the contents of the .eim file? I just want to learn how the real-time audio classification process works, from raw data/unknown sound to the classification results.
    Sorry if I ask a lot of questions; I just want to know the math behind it. Do you have any reference for it? I could not find the math or the DSP equations in the documentation.

  2. Thanks for the explanation. Are chunk_size and overlap here the same as window_size and window_increase in Edge Impulse Studio?
    [screenshot of the chunk_size and overlap parameters]

  3. Yes, that's what I meant: latency and RAM consumption. So is the inference time the whole process, from:
    detect sound with microphone → data acquisition → inference/model calculation → result? Correct me if I'm wrong.

Sincerely yours,

Hello @des_16,

  • For the Python implementation (the one we use in Studio), all our DSP source code can be found here: GitHub - edgeimpulse/processing-blocks: Signal processing blocks
  • For the C++ implementation: when you download the C++ library and extract the archive, look for the file edge-impulse-sdk/classifier/ei_run_dsp.h.
    For example, if you used the MFCC DSP block in Studio, you will find the C++ implementation in this function:
int extract_mfcc_features(signal_t *signal, matrix_t *output_matrix, void *config_ptr, const float sampling_frequency) {
...
}
  2. Good question, I'd need to check further; I haven't used the Linux Python SDK for audio projects in a while. However, I don't think they are the same. I believe it is used to run the inference in a “continuous mode”. See @AIWintermuteAI's explanation here:
    The goal of continuous mode is to process one chunk of data (part of the whole window) in less time than it takes to record it. This way continuous inference can be achieved. The time it takes to process one chunk of data depends on a) DSP processing b) NN inference. If it takes more time to process DSP + NN for one chunk of data (that can happen depending on your MFCC/MFE parameters and NN size/complexity) than it takes to record that chunk, you should get an error.

  3. Correct. In the results, if you want the split between the DSP and the classification, just print both values instead of summing them as in the print statement:

res['timing']['dsp'] + res['timing']['classification']
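
For example, reusing the same res dict as in the print statement above:

 print('DSP: %d ms, classification: %d ms' % (res['timing']['dsp'], res['timing']['classification']))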

Best,

Louis

Hi @louis,
Thanks for the answer.

  1. It’s clear for now.

  2. Correct me if I'm wrong: based on the explanation, window_size > chunk_size? If I have window_size = 1000 ms, should the chunk_size value be smaller than that?
    To be honest, I still don't understand the chunk_size section: what do the values chunk_size = 1024 and overlap = 0.25 mean? If we change these values, will it affect the inference time?
  3. Thanks again — I can now separate the NN and DSP inference times.

Last question, I promise. :sweat_smile:
  4. I want to run this Python script automatically on startup using crontab. Can we shorten the command to run this Python SDK from:
python classify.py model.eim 1
to:
python classify.py

Is it possible to reference model.eim inside the Python script? Maybe by modifying this part, or which part should I take care of?
[screenshot of the relevant part of classify.py]

Regards,

Hello @des_16,

  1. Great
  2. I am not sure about that; I'll ask around internally. It won't change the inference time, but rather the number of inference occurrences you run per “window”.
  3. Great
  4. Sure, that is up to you how you want to call it; you can remove the expected arguments in the main function and hardcode the values in your code.
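
For example (a sketch only; the exact structure of your classify.py may differ, and the model path is a placeholder):

 from edge_impulse_linux.audio import AudioImpulseRunner

 MODEL_PATH = '/home/pi/model.eim'   # hardcoded instead of sys.argv[1]
 AUDIO_DEVICE_ID = 1                 # hardcoded instead of sys.argv[2]

 def main():                         # no argv parameter needed any more
     with AudioImpulseRunner(MODEL_PATH) as runner:
         runner.init()
         for res, audio in runner.classifier(device_id=AUDIO_DEVICE_ID):
             print(res['result']['classification'])

 if __name__ == '__main__':
     main()                          # previously: main(sys.argv[1:])

With that change, python classify.py runs with no arguments. For crontab, use absolute paths to both the script and the model.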

Best,

Louis

Thanks @louis,
Let me know if there is any update regarding question number 2.

Regards,

You might also want to have a look at this. It's not using the Python implementation, but the idea is there: Continuous audio sampling - Edge Impulse Documentation
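
To put rough numbers on the idea (a back-of-the-envelope sketch; the values are hypothetical and the SDK's exact buffering may differ):

 # With a 16 kHz model, one chunk of 1024 samples is 64 ms of audio.
 sampling_rate = 16000                          # Hz
 chunk_size = 1024                              # samples per chunk
 chunk_ms = chunk_size / sampling_rate * 1000   # 64 ms to record one chunk

 # Continuous mode only works if DSP + NN for one chunk finish faster than the
 # chunk is recorded (hypothetical timings in place of res['timing'] values):
 dsp_ms, nn_ms = 10, 5
 if dsp_ms + nn_ms >= chunk_ms:
     print('Processing is slower than recording; continuous mode will fall behind')

So changing chunk_size changes how often a classification can run, not how long a single DSP + NN pass takes.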

Yeah, I read it before making this thread.
Since I am using the Python SDK, I didn't find the answer I was looking for there.