Spresense Wildlife Example with Color/RGB888

I wanted to see if there are any examples of doing Image Classification in Arduino with the Spresense. I would like to convert the LTE Wildlife camera example to work with color images. I have tried to use the RGB565 to RGB888 function from the BLE33_sense_camera example/firmware but it doesn’t seem to work: https://github.com/edgeimpulse/firmware-arduino-nano-33-ble-sense/blob/e868f1ba8c31175f5abbb0ba37ae728d87026fae/src/sensors/ei_camera.cpp#L479

I swapped out the grayscale ei_camera_cutout_get_data and changed the pixel format to RGB565: err = sized_img.convertPixFormat(CAM_IMAGE_PIX_FMT_RGB565);

But I am getting very inconsistent results, and it seems like the image is not even being captured. The model works great when I run it using the Spresense firmware. Any pointers on how to do this?

Here is what I have: Spresense-ei.ino · GitHub

So I think I have narrowed down the source of the instability. It has something to do with taking the image from the video preview, versus taking a snapshot and running that through the classifier.

I am configuring both the video preview and the snapshot the same way:

err = theCamera.begin(1, CAM_VIDEO_FPS_5, RAW_WIDTH, RAW_HEIGHT, CAM_IMAGE_PIX_FMT_YUV422);
err = theCamera.setStillPictureImageFormat(RAW_WIDTH, RAW_HEIGHT, CAM_IMAGE_PIX_FMT_YUV422);

With a static scene, here are the classification results when the Video Preview source is being used:

INFO: wildlife_camera initializing on wakeup...
Inferencing settings:
	Image resolution: 160x160
	Frame size: 25600
	No. of classes: 2
	Raw Image Width: 320 Height: 240
	Clip Width: 160 Height: 160
	Offset X: 80 Y: 40
Starting the camera:
Starting sending data:
Set format:
INFO: started wildlife camera recording
INFO: new frame processing...
convert format:
classify picture:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.035156
    lego: 0.964844
INFO: new frame processing...
convert format:
classify picture:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.367188
    lego: 0.632812
INFO: new frame processing...
convert format:
classify picture:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.593750
    lego: 0.406250
INFO: new frame processing...
convert format:
classify picture:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.476562
    lego: 0.523437
INFO: new frame processing...
convert format:
classify picture:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.351563
    lego: 0.648437
INFO: new frame processing...
convert format:
classify picture:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.324219
    lego: 0.675781

and here is when it is using the image from takePicture(). There is still some variance, which is odd since it should be the exact same scene each time, but it is only about 8%:

INFO: wildlife_camera initializing on wakeup...
Inferencing settings:
	Image resolution: 160x160
	Frame size: 25600
	No. of classes: 2
	Raw Image Width: 320 Height: 240
	Clip Width: 160 Height: 160
	Offset X: 80 Y: 40
Starting the camera:
Starting sending data:
Set format:
INFO: started wildlife camera recording
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.078125
    lego: 0.921875
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.132812
    lego: 0.867187
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.105469
    lego: 0.894531
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.035156
    lego: 0.964844
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.078125
    lego: 0.921875
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.050781
    lego: 0.949219
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.066406
    lego: 0.933594
resize:
convert format:
Predictions (DSP: 21 ms., Classification: 4319 ms., Anomaly: 0 ms.): 
    background: 0.105469
    lego: 0.894531

Here is an update of my program: spresense-picture.ino · GitHub

… and another data point: when I build the model as Spresense Firmware using EI Studio, I get the following error when I try to run it:

Inferencing settings:
	Image resolution: 160x160
	Frame size: 25600
	No. of classes: 2
Taking photo...
ERR: Failed to run DSP process (-1002)
Failed to run impulse (-5)

The model doesn’t seem too crazy though:


Hi @lukedc,

ERR code -1002 means “out of memory,” so it does look like you are running out of memory during the inference process.

Thanks @shawn_edgeimpulse !
Do you think that could also be happening silently with the Arduino library? Or if there wasn't enough memory during the DSP portion, would I get a similar error?

Hi @lukedc,

For images, not much happens during the DSP section (maybe a conversion to grayscale). I usually see this error when TensorFlow Lite for Microcontrollers attempts to malloc() a bunch of space for inference (every time ei_run_classifier() is called) and then runs out of RAM.

Check to see how much RAM you are using for your framebuffer whenever you capture an image. Are you scaling or cropping the captured image? If so, you’ll find that you need to maintain 2 different (rather large) image buffers. If both of those are still allocated, there’s a good chance ei_run_classifier() will run out of memory.

I am trying to run my model through the same Arduino code and getting this as output:


Why is the classifier not running?