Hi, I have several questions about image classification with different scenarios.
I have uploaded some 320x240 images to EI with a single label for each image (no object detection).
My purpose is to capture images and classify them on a Nano 33 BLE Sense with a 32x24 camera.
Case 1: I chose to create an impulse with input size 32x32 and “Fit shortest axis”, and trained my model with MobileNetV1 96x96.
I guess the code will expect a 32x32 frame to run the classifier once implemented on the Nano 33, right? So, what should I do?
I can’t “crop” the image to 24x24 (to fit the shortest axis) because the classifier wants 32x32, right?
So I would have to fill a 32x8 strip with something to get a 32x32 image. But with what? Black pixels?
What if I choose “squash” as an option instead? How should I handle the pre-processing of the image? Are there functions that allow me to perform the squash?

Case 2: the same as Case 1, but with a 96x96 impulse input size and the same MobileNetV1 96x96 model.
With my 32x24 camera frames, do I need to interpolate to get a 96x96 image to provide to the classifier?
Is this done automatically by the library or not?
Are these things I should not have to worry about because the library somehow does them automatically, or do I have to handle them manually? From what I understand, all the blocks created in the impulse are carried over into the deployed library.
There is no single specific question here, just many doubts.
Thanks in advance to anyone who can help.
If you look at the Image tab under Impulse design, it shows the Raw features, and the block labeled DSP result shows the processed features.
You can copy both of these datasets to the clipboard and, if required, paste them into a text editor.
By comparing the two, you can figure out what digital processing Edge Impulse has done with the data before it is fed to the learning block. Whatever processing is done here will also be performed when the impulse is deployed to your target hardware.
In my experience, Edge Impulse will downsize larger pictures, but I am not sure whether it upsizes smaller ones.
Squash is an option in the Image data block, in the resize mode drop-down list.
Here is an article that describes the difference between fit and squash.
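If you do end up pre-processing frames yourself, “squash” is just a resize that scales each axis independently and ignores the aspect ratio. Here is a minimal nearest-neighbour sketch (the function name and layout are my own, not from the Edge Impulse SDK) that covers both the 32x24 → 32x32 squash of Case 1 and the 32x24 → 96x96 upscale of Case 2:

```cpp
#include <stdint.h>

// Nearest-neighbour resize for 8-bit greyscale frames. "Squash" scales
// each axis independently, so the same routine handles 32x24 -> 32x32
// and 32x24 -> 96x96. Hypothetical helper, not an EI SDK function.
void resize_nearest(const uint8_t *src, int src_w, int src_h,
                    uint8_t *dst, int dst_w, int dst_h) {
    for (int y = 0; y < dst_h; y++) {
        int sy = y * src_h / dst_h;          // source row for this output row
        for (int x = 0; x < dst_w; x++) {
            int sx = x * src_w / dst_w;      // source column for this output column
            dst[y * dst_w + x] = src[sy * src_w + sx];
        }
    }
}
```

For example, `resize_nearest(frame, 32, 24, squashed, 32, 32)` squashes a camera frame to the 32x32 impulse input.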
As far as I know, the Image DSP block in Edge Impulse fills in black pixels to get a square image. So the image you show of the processed features (32x32) is consistent with what the Image DSP block does.
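Filling with black pixels amounts to a simple letterbox pad. A hypothetical helper (again not an EI SDK function) that centres a 32x24 greyscale frame inside a black 32x32 buffer would look like this:

```cpp
#include <stdint.h>
#include <string.h>

// Centre a w x h greyscale frame inside a black side x side buffer.
// For 32x24 -> 32x32 this adds 4 rows of black padding top and bottom.
void pad_to_square(const uint8_t *src, int w, int h,
                   uint8_t *dst, int side) {
    memset(dst, 0, (size_t)side * side);     // black background
    int y_off = (side - h) / 2;              // vertical offset (4 rows here)
    int x_off = (side - w) / 2;              // horizontal offset (0 here)
    for (int y = 0; y < h; y++) {
        memcpy(&dst[(y + y_off) * side + x_off], &src[y * w], w);
    }
}
```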
Regarding model accuracy, that is a difficult question. It is best to experiment with different configurations. I can suggest some alternative approaches:
In my experience, the Nano 33 BLE does not have enough RAM (limited to 256 KB) to run inferencing on RGB images. The best results I have achieved came from converting the images to greyscale (this reduces the number of input features by a factor of 3).
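If your camera delivers colour frames, the conversion is one line per pixel using the standard luma weights. A sketch assuming packed RGB888 input (names are illustrative):

```cpp
#include <stdint.h>

// Convert packed RGB888 to 8-bit greyscale, cutting the feature count by 3x.
// Integer approximation of Y = 0.299 R + 0.587 G + 0.114 B
// (weights 77 + 150 + 29 = 256, so the >> 8 normalises the sum).
void rgb888_to_grey(const uint8_t *rgb, uint8_t *grey, int n_pixels) {
    for (int i = 0; i < n_pixels; i++) {
        uint8_t r = rgb[3 * i + 0];
        uint8_t g = rgb[3 * i + 1];
        uint8_t b = rgb[3 * i + 2];
        grey[i] = (uint8_t)((77 * r + 150 * g + 29 * b) >> 8);
    }
}
```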
If you have a large enough dataset, the presence of black pixels in the processed features should not make much difference, because the trained network will take them into account. Train a model and see how accurate it is.
My experience shows there are two options for image recognition when it comes to the learning block in impulse design: Transfer Learning (Images) or Classification (Keras Neural Network). Transfer learning offers MobileNet options. I suggest trying MobileNetV1 96x96 0.25 (no final dense layer, 0.1 dropout). V2 networks are beyond the capability of the Nano 33 BLE. The default values for the Classification network seem to work OK.
If you are ambitious, it may be possible to write your own image processing block. What you would then do is convert your images to greyscale, then flatten the image into 32x24 = 768 raw features. Feed this into the Classification neural network and see how it works (see the sketch below).
Same as above, but skip the greyscale conversion. This gives 32x24x3 = 2304 raw features.
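On the device side, feeding the flattened features into the network looks roughly like this. `run_classifier()` and `numpy::signal_from_buffer()` come from the Edge Impulse C++ SDK; the 0..1 pixel scaling is my assumption and must match whatever your custom processing block did at training time:

```cpp
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

static float features[32 * 24];              // 768 raw features (greyscale case)

void classify_frame(const uint8_t *grey) {
    // Flatten the 32x24 greyscale frame into the feature buffer.
    for (int i = 0; i < 32 * 24; i++) {
        features[i] = grey[i] / 255.0f;      // assumed scaling, 0..1
    }

    // Wrap the buffer in a signal and run the impulse.
    signal_t signal;
    numpy::signal_from_buffer(features, 32 * 24, &signal);

    ei_impulse_result_t result;
    EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false);
    if (err != EI_IMPULSE_OK) {
        return;                              // handle the error as appropriate
    }

    // Print per-class scores.
    for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; i++) {
        ei_printf("%s: %.3f\n", result.classification[i].label,
                  result.classification[i].value);
    }
}
```

For the RGB variant you would size the buffer at 32x24x3 = 2304 and fill it from the colour frame instead.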