Size of your 'features' array in guide

Hi,

I ported my model to Mbed as described in this guide and copied the raw data from a sample on the ‘Live classification’ page
[screenshot: raw features copied from the ‘Live classification’ page]
On the serial port, I got the following error:
The size of your 'features' array is not correct. Expected 600 items, but had 375

I looked into model_metadata.h and realized this value is hardcoded. My assumption is that it is set based on the default 10 s capture window. Or are these values generated somehow?

The ‘Live classification’ page doesn’t seem to allow copying data for the whole dataset; fortunately, the full sample data can be copied from the ‘Spectral Features’ page. The only thing that isn’t possible at this point, which would have been interesting, is a direct comparison of results.

@AlessandroDevs, model_metadata.h is generated on deployment, and the number of features described there depends on the size of your window, your sample rate, and the number of data axes. 600 samples corresponds to 3 axes × 2 seconds at 100 Hz (3 × 2 × 100 = 600), while 375 samples corresponds to 3 axes × 2 seconds at 62.5 Hz (3 × 2 × 62.5 = 375).

Looking at your project, your training data is sampled at 100 Hz, but some of your testing data is at 62.5 Hz. If you take one of the samples recorded at 100 Hz, it will run on the device.

You can see the sample rate by going to Data acquisition > Test data, enlarging the table, and hovering over the sensors column.

The fact that you can classify data that has the wrong frequency is a bug in the Studio, though; we will make sure this is fixed.


Jan, thank you for explaining the steps in detail. I get it now :) It seems like the frequency should be set once at the beginning and then used consistently across training, testing, and live classification.


Correct. We should be better at guiding people toward this and highlighting when something is wrong. Unfortunately, this is a case where we don’t do that yet, but we will add some guidance in the coming week.

Amazing! Coming together very nicely, and top-notch support :)

I faced the same issue, though I keep the same sampling frequency (16,000 Hz) for both training and test samples. The error message is: “The size of your ‘features’ array is not correct. Expected 16000 items, but had 10453”.

If you can access Edge Impulse Studio projects, here is mine: https://studio.edgeimpulse.com/studio/31223/

One thing I don’t understand is why the ‘raw features’ change depending on the test sample selected.

Thanks

Hello @MarconiJiang,

Basically, the raw features are just the “raw” data (before any processing). Since every test sample is different, your raw data will also be different.

In your case, the raw data is the sound sampled at 16 kHz (so one value every 0.0625 ms). I don’t know your project, but if, for example, some lines of your code block execution (often a delay function), it will result in a different effective sampling rate or missing values.

I hope this helps,

Regards,

Louis

Thanks for quick reply.
The audio data I used is from Kaggle, with all samples at 48 kHz and 1 s duration, and I used sox to downsample to 16 kHz. Maybe the duration changed during that process; let me check first.

However, I am still not sure about the feature definition. The Arduino example I use is static_buffer, which requires me to copy the ‘raw features’ from Edge Impulse and paste them into “static const float features[] =”. Can you explain the purpose of this ‘features’ array in the app? I assume the app uses this feature set for inference, which is why I am confused about feature extraction using a single data sample.

Then I tried another example, “nano_ble33_sense_microphone”. It works, with no manual copy-and-paste from Edge Impulse required (though the performance is not good; the microphone is near the BLE module, and I am not sure whether that module introduces noise. I will try connecting an external microphone to validate).

It seems that your dataset is really clean.
Could you try adding some noise under the NN Classifier tab?
It will probably lower the accuracy a bit, but it should work better in real life.

Can you explain the purpose of this ‘features’ in the app?

About the features, you can consider it as the raw data that will be passed to the run_inference function.
In the static_buffer example, you can classify only one data sample.

The Arduino example I use is static_buffer… Can you explain the purpose of this ‘features’ in the app?

Here, features is the raw sample data. So for a 1-second clip at 16 kHz you need 16,000 values. We let you copy some data from Edge Impulse so you can quickly verify that the impulse works on the target and gets proper results (they will match the output in the Studio). When running live, you fill this buffer with sensor data as it comes in; here are a bunch of examples: https://docs.edgeimpulse.com/docs/cli-data-forwarder#classifying-data

Hello, I also face a similar issue.

Edge Impulse standalone inferencing (Arduino)
The size of your ‘features’ array is not correct. Expected 750 items, but had 0

I deployed motion detection on a Nicla Vision, but it does not work, although it works well on the Edge Impulse website.

Where did you get the code you are using? Or can you post your code?