Bounding box FOMO

GreatoR · May 22, 2024, 5:51pm

Hi!

I saw in the documentation that the FOMO model does not use bounding boxes; instead, it uses the object's location in the frame. On which line of code can I find the location of the object? I only see a bounding box inside a loop (I suppose it draws a box around the object as a reference so it is easier to understand).

Do I understand correctly that bb.x and bb.y are the top-left corner of the bounding box,
and that bb.height and bb.width are the height and the width of the bounding box?

#if EI_CLASSIFIER_OBJECT_DETECTION == 1
    bool bb_found = result.bounding_boxes[0].value > 0;
    for (size_t ix = 0; ix < result.bounding_boxes_count; ix++) {
        auto bb = result.bounding_boxes[ix];
        // Skip empty slots: a value of 0 means nothing was detected there
        if (bb.value == 0) {
            continue;
        }
        ei_printf("    %s (%f) [ x: %u, y: %u, width: %u, height: %u ]\n", bb.label, bb.value, bb.x, bb.y, bb.width, bb.height);
    }
#endif

Thanks!!!

shawn_edgeimpulse · May 22, 2024, 7:16pm

Hi @GreatoR,

FOMO uses a grid and identifies object location/size based on which cell(s) the object(s) are found in. It then translates those grid cells to bounding box information. So yes, bb.x and bb.y are the top-left corner of the box, and bb.width and bb.height are the width/height (in pixels) of the box. You can find more info about that struct here: ei_impulse_result_bounding_box_t
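
If a single point for the object's location is needed rather than a box, one option is to take the centre of each reported box. The sketch below is only an illustration: `Box` is a hypothetical stand-in whose fields are assumed to mirror the ones printed in the snippet above (label, value, x, y, width, height), and `box_centroid` is a helper name made up for this example, not part of the SDK.

```cpp
#include <cstdint>

// Hypothetical stand-in for the SDK's bounding box struct.
// Assumption: it exposes the same fields that the ei_printf call prints.
struct Box {
    const char *label;
    uint32_t x;       // left edge, in pixels
    uint32_t y;       // top edge, in pixels
    uint32_t width;   // box width, in pixels
    uint32_t height;  // box height, in pixels
    float value;      // confidence score
};

struct Point {
    float x;
    float y;
};

// Centre of the box, in the same pixel coordinates as bb.x and bb.y.
Point box_centroid(const Box &bb) {
    return { bb.x + bb.width / 2.0f, bb.y + bb.height / 2.0f };
}
```

For a box with x = 16, y = 24, width = 8 and height = 8, this gives the point (20, 28).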

GreatoR · May 25, 2024, 5:09pm

If I have an ESP32 AI Thinker camera that captures 320 x 240 and I downscale the image to 48 x 48 for the FOMO model, do bb.x and bb.y give the top-left corner of the box as a distance from the bottom-right corner of the rescaled image? Do I understand correctly?

shawn_edgeimpulse · May 28, 2024, 4:03pm

Hi @GreatoR,

In almost all cases, the origin (0, 0) is the top-left corner of the image. So, bb.x and bb.y measure the distance (in pixels) from the top-left corner.
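
To place a box on the original camera frame, its coordinates can be scaled back up from the model input size. This is a minimal sketch under two assumptions: that the reported coordinates are in pixels of the downscaled 48 x 48 image, and that the 320 x 240 frame was simply stretched to that size with no cropping or padding (a cropping resize would also need an offset). `to_frame_coords` and `FramePoint` are hypothetical names used only for this example.

```cpp
#include <cstdint>

struct FramePoint {
    uint32_t x;
    uint32_t y;
};

// Map a point from model-input coordinates to camera-frame coordinates.
// Both coordinate systems have their origin (0, 0) at the top-left corner.
// Assumption: the frame was stretched to the model size without cropping,
// so each axis is scaled independently. Integer division rounds down.
FramePoint to_frame_coords(uint32_t x, uint32_t y,
                           uint32_t model_w, uint32_t model_h,
                           uint32_t frame_w, uint32_t frame_h) {
    return { x * frame_w / model_w, y * frame_h / model_h };
}
```

With a 48 x 48 model input and a 320 x 240 frame, the point (24, 24) maps to (160, 120), the centre of the frame. The same scaling applies to bb.width and bb.height.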