Problems/code error with the built firmware in the case of FOMO, RGB images, and the ESP-EYE

Dear Edge Impulse Experts,

I have run into problems, and it seems I have found a bug, with the use case where I try to detect objects of the same shape but different colors.
(If it is about grayscale and objects with different shapes, then it works.)

Use Case

  • I train the FOMO (Faster Objects, More Objects) MobileNetV2 0.1 model
  • Color depth: RGB
  • Objects: same shape (e.g. simple cylinders) but in different colors
  • In Edge Impulse, with Desktop > Launch in browser (or with the QR code), it works perfectly: the neural net can detect objects of the same shape but in different colors.
    So the neural net itself works perfectly.
  • But the SW generated and built by Edge Impulse does not work on the ESP-EYE microcontroller with the RGB setting. I tried different build settings and got different issues.

With different settings:

  • In the Deployment section
  1. Set/unset the Enable EON Compiler option, Quantized (int8), RGB →
    it builds and runs on the ESP-EYE, but object detection does not work at all.
    Always "No objects detected".
    (I think something goes wrong during quantization, e.g. the float → int8 conversion.)

  2. Set/unset the Enable EON Compiler option, Unoptimized (float32), RGB →
    Build error in the Edge Impulse generated code:

…\src\edge-impulse-sdk\tensorflow\lite\micro\micro_graph.cpp" -o "
…\edge-impulse-sdk\tensorflow\lite\micro\micro_graph.cpp.o"
…\src\edge-impulse-sdk\tensorflow\lite\micro\kernels\softmax.cpp: In function ‘void tflite::{anonymous}::SoftmaxQuantized(TfLiteContext*, const TfLiteEvalTensor*, TfLiteEvalTensor*, const tflite::{anonymous}::NodeData*)’:
…\src\edge-impulse-sdk\tensorflow\lite\micro\kernels\softmax.cpp:301:14: error: return-statement with a value, in function returning ‘void’ [-fpermissive]
return kTfLiteError;
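For reference, the compiler message above is a plain C++ error class: a function declared `void` cannot `return` a value. A minimal, self-contained illustration of the broken pattern and one conventional fix follows; the names and the out-parameter design are made up for illustration, this is not the actual SDK code or its official fix:

```cpp
#include <cassert>

// Stand-in for TFLite Micro's status enum (illustrative only).
enum TfLiteStatus { kTfLiteOk = 0, kTfLiteError = 1 };

// Broken shape, as reported at softmax.cpp:301 (will not compile):
//   void SoftmaxQuantized(...) {
//       ...
//       return kTfLiteError;  // error: return-statement with a value [-fpermissive]
//   }

// One conventional fix: report the failure through an out-parameter and
// use a bare `return`, which is legal in a void function.
void softmax_quantized_fixed(bool input_type_supported, TfLiteStatus* status) {
    if (!input_type_supported) {
        *status = kTfLiteError;  // record the failure instead of returning it
        return;                  // bare return: fine in a void function
    }
    *status = kTfLiteOk;         // normal path
}
```

The sketch only shows why the compiler rejects the generated code; how the SDK itself resolves it (dropping the value, changing the return type, or logging via the context) is up to the SDK maintainers.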

  • I only used the generated code (the .zip library and the template file in the Arduino IDE) during these tests.

My questions are as follows:

  • Is the RGB TinyML code build not supported on the ESP-EYE yet?
  • Why do I get the code error above in the generated code? Maybe I forgot to set something.

Best regards,
Lehel

Hello @edghba,

Could you try with the Arduino library deployment type and let me know if you also have issues there?
I’ll try to have a look at your project in parallel with the ESP-EYE binary deployment type; can you share your project ID?

Best

Louis

Hello Louis,

ID: 289375.
I used the Arduino type:

Or were you thinking of something else?

Best regards,
Lehel

Hello @edghba,

I just tested with the quantized model (the float32 one might be too big).
I don’t have an ESP-EYE, but I tested with an ESPCAM AI Thinker.

Here are the results when I point the camera at a picture from your test dataset:

{Image removed}

Can you save your pictures to an SD card, upload them somewhere, or display the images with the same config, to make sure your images actually look as expected?
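One way to do this without an SD card is to print the captured frame over the serial port as base64 and decode it on the PC. A minimal base64 encoder is sketched below; it is plain, host-testable C++, and the `Serial`/`snapshot_buf` usage in the trailing comment is only an assumption about how it would slot into the Arduino sketch:

```cpp
#include <cstdint>
#include <string>

// Minimal base64 encoder (illustrative; not part of the Edge Impulse SDK).
std::string base64_encode(const uint8_t* data, size_t len) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((len + 2) / 3) * 4);
    size_t i = 0;
    // Full 3-byte groups map to 4 output characters.
    for (; i + 3 <= len; i += 3) {
        uint32_t n = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += tbl[(n >> 6) & 63];
        out += tbl[n & 63];
    }
    // Trailing 1 or 2 bytes are padded with '='.
    if (len - i == 1) {
        uint32_t n = data[i] << 16;
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += "==";
    } else if (len - i == 2) {
        uint32_t n = (data[i] << 16) | (data[i + 1] << 8);
        out += tbl[(n >> 18) & 63];
        out += tbl[(n >> 12) & 63];
        out += tbl[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// On the device one could then do something like:
//   Serial.println(base64_encode(snapshot_buf, buf_len).c_str());
// and decode the dump on the PC to inspect the exact frame the model sees.
```

Decoding the dump on the PC (e.g. with any base64 tool) shows exactly what the classifier receives, which separates camera problems from model problems.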

Best,

Louis

Hello Louis,

Thank you for your testing and support. It looks good and works in your snapshot.
It is a good idea to test it on an original image. I did so, but got "No objects found".
(Small remark: the PC version also worked with the image, so the algorithm is OK.)
I also use the quantized model.
I suspect there is an important difference in the generated code/library between the ESP-EYE and the ESPCAM AI Thinker.
Is the only difference in the GPIO assignment and the HW description, or is there a difference in the algorithm as well?
In my case #define CAMERA_MODEL_ESP_EYE is active, but in your case CAMERA_MODEL_AI_THINKER.
What is very strange is that I do not have this issue when I use grayscale images.
But I cannot use grayscale images when the only differences are in the colors.

I only use your Edge Impulse lib and the template example (esp32_camera.ino).
Could it be that there is a difference in the algorithm in the lib between the two cases?
(Maybe one define is missing for the ESP-EYE somewhere; I am just thinking aloud.)

  • ESP-EYE+RGB
  • ESPCAM AI Thinker+RGB

Best regards,
Lehel

What camera do you have on your ESP-EYE?

There might be a setting that is not correctly configured…
You can try to modify some parameters in this camera configuration struct:

static camera_config_t camera_config = {
    .pin_pwdn = PWDN_GPIO_NUM,
    .pin_reset = RESET_GPIO_NUM,
    .pin_xclk = XCLK_GPIO_NUM,
    .pin_sscb_sda = SIOD_GPIO_NUM,
    .pin_sscb_scl = SIOC_GPIO_NUM,

    .pin_d7 = Y9_GPIO_NUM,
    .pin_d6 = Y8_GPIO_NUM,
    .pin_d5 = Y7_GPIO_NUM,
    .pin_d4 = Y6_GPIO_NUM,
    .pin_d3 = Y5_GPIO_NUM,
    .pin_d2 = Y4_GPIO_NUM,
    .pin_d1 = Y3_GPIO_NUM,
    .pin_d0 = Y2_GPIO_NUM,
    .pin_vsync = VSYNC_GPIO_NUM,
    .pin_href = HREF_GPIO_NUM,
    .pin_pclk = PCLK_GPIO_NUM,

    //XCLK 20MHz or 10MHz for OV2640 double FPS (Experimental)
    .xclk_freq_hz = 20000000,
    .ledc_timer = LEDC_TIMER_0,
    .ledc_channel = LEDC_CHANNEL_0,

    .pixel_format = PIXFORMAT_JPEG, //YUV422,GRAYSCALE,RGB565,JPEG
    .frame_size = FRAMESIZE_QVGA,    //QQVGA-UXGA Do not use sizes above QVGA when not JPEG

    .jpeg_quality = 12, //0-63 lower number means higher quality
    .fb_count = 1,       //if more than one, i2s runs in continuous mode. Use only with JPEG
    .fb_location = CAMERA_FB_IN_PSRAM,
    .grab_mode = CAMERA_GRAB_WHEN_EMPTY,
};

Or you can also try to flip the frames, or adjust the brightness and saturation, with something like this:

sensor_t * s = esp_camera_sensor_get();
s->set_vflip(s, 1); // flip
s->set_hmirror(s, 1); // mirror
s->set_brightness(s, 1); // up the brightness just a bit
s->set_saturation(s, -2); // lower the saturation

How have you collected the images? Otherwise, you can try to use your current camera settings to collect the images and train your model on those.

Best,

Louis

Hello Louis,

Thank you for your proposals.

  • Image sensor: OV3660.
  • OK, I will look at these settings too. I know them from another camera application where I used them.
  • I collected the images with a mobile phone.
  • Yes, I know that I can use the ESP-EYE for collecting as well. I have also thought about that.

Best regards,
Lehel

@edghba Make your project public and I will clone it and run it on my ESP-EYE MCU v2.2.

Hello MMarcial,

Thank you very much for your quick and kind offer to help with your ESP-EYE.
Unfortunately, I cannot make the source images/objects public.

Now I am trying to get hold of an ESPCAM AI Thinker, which Louis has, and will try it with that.

Best regards,
Lehel


Hello Louis,

I tried out a lot of things (different light conditions, collecting images with the ESP-EYE using your proposed method in Collect Sensor Data Straight From Your Web Browser, setting the saturation, rotating the camera, etc.) and I also got hold of an ESPCAM AI Thinker, but I have the same issue. Neither of them (the EYE nor the AI Thinker) can detect objects which differ only in color. In the prototype testing in Edge Impulse it works, but the quantized model does not work on the ESP.
Only in about 1-2% of the cases does the ESP detect correctly.
Could you please share with me by e-mail the version (.zip) that works perfectly for you?
I would test it; I already have the same HW as you, and maybe I set something wrong in Edge Impulse when I create the library.
I use this net: FOMO (Faster Objects, More Objects) MobileNetV2 0.1

Best regards,
Lehel

Hello @edghba,

I tried with the default example; I just changed
#define CAMERA_MODEL_AI_THINKER.

Please note that it was not working perfectly: only about 10% of the frames contained a bounding box.

@matkelcey, have you encountered this kind of color limitation with FOMO? Are there any tips to make the model better at separating on color?

Best,

Louis

Hello Louis,

Thank you for the information. Then we are seeing the same issue.
I think the problem is in the quantized TensorFlow Lite model created by Edge Impulse, because the neural net works well on the PC in Edge Impulse: it detects objects that differ only in color perfectly.
Some accuracy degradation is to be expected on the ESP because of int8 instead of float etc., but unfortunately it does not work at all on the ESP.

Best regards,
Lehel

Hello Louis,

A short summary and a question.
After a lot of different trainings (RGB, grayscale, with more and less training data, under different light conditions and in different places, with MobileNetV2 0.1/0.35) and testing on the ESP-EYE and ESP32 AI Thinker, I found that:

  • FOMO (Faster Objects, More Objects) MobileNetV2 0.35, i.e. with alpha = 0.35, works very robustly in grayscale when I train it with a lot of data (two different objects with a total of 800 training images in different positions).

  • The ESP-EYE detects the objects very reliably, but since the model is trained in grayscale it cannot distinguish between objects of the exact same shape but different colors.

  • I have also found this in the Edge Impulse description:
    "FOMO (Faster Objects, More Objects) MobileNetV2 0.35: These models are designed to be <100KB in size and support a grayscale input at any resolution."

  • I have not found a neural net on the Edge Impulse site which works with colorful objects.

  • The MobileNetV2 algorithm and your FOMO (Faster Objects, More Objects) MobileNetV2 0.35 also work for colorful objects. Edge Impulse itself proves this, because it works perfectly on the PC and on a mobile phone in the Launch in browser section of your site.

  • I wonder why it cannot work at all after quantization?
    I think it should also work after quantization. I do not think it is an issue of light conditions or camera settings, because I tested under different conditions, and because the quantized grayscale model works under different light conditions too.
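For context on the float → int8 suspicion discussed above: TensorFlow Lite quantization is an affine mapping with a scale and zero point calibrated on representative data. The sketch below uses made-up calibration values (not from this model) to show the mechanics, including how values outside the calibrated range saturate, which can erase small activation differences:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Affine int8 quantization as used by TensorFlow Lite:
//   q = clamp(round(x / scale) + zero_point, -128, 127)
int8_t quantize(float x, float scale, int zero_point) {
    int q = static_cast<int>(std::lround(x / scale)) + zero_point;
    return static_cast<int8_t>(std::max(-128, std::min(127, q)));
}

// Inverse mapping, used when reading the int8 tensor back as float.
float dequantize(int8_t q, float scale, int zero_point) {
    return scale * (q - zero_point);
}
```

With scale 0.05, a value of 1.0 round-trips almost exactly, but 10.0 saturates to 127: if the calibration data does not represent the deployment conditions well, whole activation ranges can collapse like this, and a model that separates classes only by subtle color features is a plausible victim.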

Could you please investigate why we have this issue for colorful objects but not for grayscale ones?
The MobileNetV2 object detection algorithm is good for colorful objects as well.

If this issue could be clarified and solved, it would be really great for me.

Project Id for this entry: 288939

Thank you & Best regards,
Lehel

Hello @edghba,

I’m asking internally because I’m out of ideas :smiley:

Best,

Louis

Hello Louis,

Have you had time to ask within the team?
I have been thinking about what the problem could be. I suspect that FOMO is trained on color images (RGB, 3 channels), but during object detection the input image is read as a grayscale (1 channel) image. So FOMO, trained on color images, tries to detect from a grayscale input image, and therefore it does not work, or only very rarely detects anything.
You handle inside the lib whether the input image is grayscale (1 channel) or RGB (3 channels).
I can only see this line, converted = fmt2rgb888(fb->buf, fb->len, PIXFORMAT_JPEG, snapshot_buf);, in your ei_camera_capture(), where you read the captured JPEG image and convert it to RGB format.
I do not know what you do with the RGB (3 channel) input image later in the lib.
I suspect you convert it to a grayscale image when FOMO is set to grayscale, but what happens if FOMO is trained in RGB…
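This channel hypothesis can be narrowed down with a small host-side check. In the generated Arduino example, the camera callback packs each RGB888 pixel into one value laid out as 0xRRGGBB before handing it to the classifier. If the byte order coming out of fmt2rgb888() were ever swapped (BGR instead of RGB), a color-only detector would be hurt far more than a grayscale one, since a plain channel average is order-independent. A sketch of the packing (illustrative, not the exact SDK code):

```cpp
#include <cstdint>

// Pack one RGB888 pixel the way the Edge Impulse image DSP expects:
// a single value laid out as 0xRRGGBB.
uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b) {
    return (static_cast<uint32_t>(r) << 16) |
           (static_cast<uint32_t>(g) << 8)  |
            static_cast<uint32_t>(b);
}

// Simple channel average: roughly what a grayscale pipeline reduces to.
// Note it is unchanged if R and B are swapped, while pack_rgb() is not.
float average_gray(uint8_t r, uint8_t g, uint8_t b) {
    return (r + g + b) / 3.0f;
}
```

A pure red pixel packs to 0xFF0000, but the same bytes read in BGR order would pack to 0x0000FF (blue), while the grayscale average is identical in both cases; dumping a few packed pixels of a known-colored object over serial would confirm or rule this out.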

Best regards,
Lehel