Preparing image data for signal.get_data?

Hi all, I exported my model (a poison ivy image classifier, project id: poison-ivy) to C++ and I’m trying to get it to work on my pico4ml device. I’m trying to leverage the person_detection example that’s provided in the tflite library.

I believe I’m converting the image appropriately from RGB565, per the example provided on Edge Impulse (and using the GetImage() function from the tflite example). Both the camera and the classifier use 96 x 96 monochrome images, so no cutouts are required.

GetImage() returns an int8_t image array (a global variable called image), and I use that as input to the signal.get_data callback, converting from int8 to float along the way. Does that make sense? It’s not working so far, so I think I’m missing something.

int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) 
{
    return numpy::int8_to_float(image + offset, out_ptr, length);
}
signal.get_data = &raw_feature_get_data;
// this is where we'll write all the output
float image_output[96*96];

// read through the signal buffer, like the classifier lib also does
for (size_t ix = 0; ix < signal.total_length; ix += 1024) 
{
     size_t bytes_to_read = 1024;
     if (ix + bytes_to_read > signal.total_length) 
     {
         bytes_to_read = signal.total_length - ix;
     }

     int r = signal.get_data(ix, bytes_to_read, image_output + ix);
}

Once I have the signal structure formed correctly, I can pass it to run_classifier, but I seem to be stuck here. Any help would be appreciated. Could I possibly export the model as a tflite file via an OpenMV export and use it that way?

Hi @jlutzwpi,

This looks correct. The next step is to pass your buffer (image_output) to the classifier; you can do it as follows:

signal_t signal;
int err = numpy::signal_from_buffer(image_output, EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE, &signal);
err = run_classifier(&signal, &result, debug_nn);

You can also find more information on how to use the run_classifier function here: https://docs.edgeimpulse.com/docs/running-your-impulse-locally-1#input-to-the-run_classifier-function

It is also possible to export the model as a TFLite file using the OpenMV export (or through the Dashboard).

Aurelien

Thank you! It turns out I did need to do the RGB conversion, so I got the signal formed correctly and ran my first classification! However, on the second iteration it looks like it hangs at the run_classifier function.

Perhaps a memory issue?

I initialize result within the while loop:
ei_impulse_result_t result = {0};
err = run_classifier(&signal, &result, debug_nn);

err is not returned, so I’m assuming it’s hanging within the run_classifier function itself. Have you seen this behavior before?

It could be a memory issue. Could you actually share the whole code?

Also, as the Arducam Pico4ML is a community board, it may be worth asking on their forum too.

Aurelien

Thanks Aurelien, here you go. This is in a main_functions.cpp file. main.cpp is just a very simple executable that calls setup() once and runs a while loop on loop() to maintain Arduino compatibility.

// Globals, used for compatibility with Arduino-style sketches.
namespace {
tflite::ErrorReporter *   error_reporter = nullptr;
int8_t image[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE] = {0};
static bool debug_nn = true;
}  // namespace

#ifndef DO_NOT_OUTPUT_TO_UART
// RX interrupt handler
void on_uart_rx() {
  char cameraCommand = 0;
  while (uart_is_readable(UART_ID)) {
    cameraCommand = uart_getc(UART_ID);
    // Can we send it back?
    if (uart_is_writable(UART_ID)) {
      uart_putc(UART_ID, cameraCommand);
    }
  }
}

void setup_uart() {
  // Set up our UART with the required speed.
  uint baud = uart_init(UART_ID, BAUD_RATE);
  // Set the TX and RX pins by using the function select on the GPIO
  // See the datasheet for more information on function select
  gpio_set_function(UART_TX_PIN, GPIO_FUNC_UART);
  gpio_set_function(UART_RX_PIN, GPIO_FUNC_UART);
  // Set our data format
  uart_set_format(UART_ID, DATA_BITS, STOP_BITS, PARITY);
  // Turn off FIFO's - we want to do this character by character
  uart_set_fifo_enabled(UART_ID, false);
  // Set up a RX interrupt
  // We need to set up the handler first
  // Select correct interrupt for the UART we are using
  int UART_IRQ = UART_ID == uart0 ? UART0_IRQ : UART1_IRQ;

  // And set up and enable the interrupt handlers
  irq_set_exclusive_handler(UART_IRQ, on_uart_rx);
  irq_set_enabled(UART_IRQ, true);

  // Now enable the UART to send interrupts - RX only
  uart_set_irq_enables(UART_ID, true, false);
}
#else
void setup_uart() {}
#endif

void r565_to_rgb(uint16_t color, uint8_t *r, uint8_t *g, uint8_t *b) {
    *r = (color & 0xF800) >> 8;
    *g = (color & 0x07E0) >> 3;
    *b = (color & 0x1F) << 3;
}

int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) {
    size_t bytes_left = length;
    size_t out_ptr_ix = 0;

    // read byte for byte
    while (bytes_left != 0) {
        
        // grab the value and convert to r/g/b
        uint16_t pixel = (uint16_t)image[out_ptr_ix];

        uint8_t r, g, b;
        r565_to_rgb(pixel, &r, &g, &b);

        // then convert to out_ptr format
        float pixel_f = (r << 16) + (g << 8) + b;
        out_ptr[out_ptr_ix] = pixel_f;

        // and go to the next pixel
        out_ptr_ix++;
        offset++;
        bytes_left--;
    }
    // and done!
    return 0;
}

// The name of this function is important for Arduino compatibility.
void setup() {

  static tflite::MicroErrorReporter micro_error_reporter;
  error_reporter = &micro_error_reporter;
  
  setup_uart();
  stdio_usb_init();
  
  TfLiteStatus setup_status = ScreenInit(error_reporter);
  if (setup_status != kTfLiteOk) {
    TF_LITE_REPORT_ERROR(error_reporter, "Screen Set up failed\n");
    ei_printf("Failed to init screen\n");
  }

}

int loop_count = 0;

// The name of this function is important for Arduino compatibility.
void loop() {
  
  // Get image from camera.
  int kNumCols = 96;
  int kNumRows = 96;
  int kNumChannels = 1;
  
  if (kTfLiteOk
      != GetImage(error_reporter, kNumCols, kNumRows, kNumChannels, image)) {
    TF_LITE_REPORT_ERROR(error_reporter, "Image capture failed.");
    ei_printf("Failed to get image\n");
  }

  // Run the model on this input and make sure it succeeds.
  // put Edge Impulse model code here 
  // Turn the raw buffer into a signal which we can then classify
  // this is where we'll write all the output
  float image_output[EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE] = {0};
  
  signal_t signal;
  signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
  signal.get_data = &raw_feature_get_data;

  // read through the signal buffer, like the classifier lib also does
  for (size_t ix = 0; ix < signal.total_length; ix += 1024) {
     size_t bytes_to_read = 1024;
     if (ix + bytes_to_read > signal.total_length) {
         bytes_to_read = signal.total_length - ix;
     }

     int r = signal.get_data(ix, bytes_to_read, image_output + ix);
     ei_printf("Call to get_data (%d)\n", r);
        
  }
 
  // Run the classifier
  ei_impulse_result_t result = { 0 };
  ei_printf("Before run_classifier"); 
//this is where the program hangs on the 2nd iteration
  int err = run_classifier(&signal, &result, debug_nn);
  ei_printf("Call to run_classifier (%d)\n", err);
  if (err != EI_IMPULSE_OK) {
     ei_printf("ERR: Failed to run classifier (%d)\n", err);
     return;
  }
 
  // print the predictions
  ei_printf("Predictions ");
  for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
      ei_printf("    %s: %.5f\n", result.classification[ix].label, result.classification[ix].value);
  }
  // Process the inference results.
  int8_t no_poison_ivy_score    = floor(result.classification[0].value * 100);
  int8_t poison_ivy_score       = floor(result.classification[1].value * 100);
   
#if SCREEN
  char array[10];
  sprintf(array, "%d%%", poison_ivy_score);
  ST7735_FillRectangle(10, 120, ST7735_WIDTH, 60, ST7735_BLACK);
  if(poison_ivy_score > 60)
  {
    ST7735_WriteString(10, 120, array, Font_14x26, ST7735_RED, ST7735_BLACK);
  }
  else
  {
    ST7735_WriteString(10, 120, array, Font_14x26, ST7735_GREEN, ST7735_BLACK);
  }
#endif
  TF_LITE_REPORT_ERROR(error_reporter, "**********");
}

I’ll try out the Pico4ML board as well. I hadn’t thought of that.

thanks,
Justin

Hi Justin,

Not sure why the program hangs. Do you see the features being printed out when run_classifier is called? They should be, since the debug_nn flag is set to true.

Aurelien

Yes, I am stumped as well. It does print out the classifier info, but only ONCE. The second time, it hangs, which makes me think it is a memory issue/leak. If I comment out run_classifier, the Pico4ML runs fine and updates the image on the screen. It’s not until I run run_classifier the second time that I have the issue. Serial log printout below:

Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Before run_classifierFeatures (11 ms.): 0.898039 0.894118 0.898039 0.898039 0.890196 0.898039 0.894118 0.894118 0.898039 0.898039 0.898039 0.898039 0.898039 0.894118 0.898039 0.898039 0.894118 0.894118 0.898039 0.89
Predictions (time: 1878 ms.):
not-poison-ivy: 0.765625
poison-ivy: 0.234375
Call to run_classifier (0)
Predictions not-poison-ivy: 0.76562
poison-ivy: 0.23438
poison ivy score:23 no poison ivy score 76


Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)

That’s the last of the data that I receive. get_data seems to be running correctly, since it iterates 9 times (1024-element buffer × 9 = a 96x96 image). Given that the inference takes 1.8 s, could there be a race condition on the memory used by run_classifier()? Based on the debugging I’ve done, I think the issue lies there. Should I sleep for 2 s after the inference? I think everything runs serially rather than multithreaded, so I wouldn’t expect an issue there.

Thanks,
Justin

Looks like it doesn’t receive all the data, right? Does it print out the “Before run_classifier” in the 2nd iteration?
Maybe try printing the image_output array to see if it’s correctly filled.

You can also add an ei_sleep(5000) instruction at the end of the loop just to double check.

Aurelien

Hi Aurelien, I was finally able to test it out. I printed out the last 200 elements of the array, and it looks like it is populating properly (although I’m not quite sure whether the float values are correct or not). I also added a 5000 ms sleep to test that out. As I suspected, it looks like it is a memory issue:

Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
0.0 16317544.0 16317520.0 16317528.0 16317528.0 16317528.0 16317552.0 16317504.0 16317528.0 16317536.0 16317536.0 16317544.0 16317544.0 16317560.0 16317584.0 16317560.0 16317560.0
16317544.0 16317552.0 16317552.0 16317536.0 16316664.0 16315624.0 16315640.0 16316520.0 16317536.0 16317568.0 16317552.0 16317560.0 16317560.0 16317536.0 16317552.0 16317568.0 16317544.0 16317544.0 16317544.0 16317536.0
16317520.0 16317496.0 16317504.0 16317496.0 16317496.0 16317472.0 16317488.0 16317504.0 16317544.0 16317504.0 16317544.0 16317536.0 16317544.0 16317568.0 16317560.0 16317592.0 16317568.0 16317592.0 16317568.0 16317608.0
16317632.0 16317624.0 16317576.0 16317632.0 16317600.0 16317608.0 16317624.0 16317632.0 16316648.0 16316648.0 16316648.0 16316648.0 16316664.0 16316656.0 16317440.0 16317448.0 16317440.0 16317456.0 16317456.0 16317464.0
16317440.0 16317456.0 16317456.0 16317456.0 16317472.0 16317472.0 16317480.0 16317480.0 16317496.0 16317496.0 16317504.0 16317480.0 16317504.0 16317488.0 16317488.0 16317496.0 16317496.0 16317512.0 16317496.0 16317536.0
16317528.0 16317528.0 16317504.0 16317544.0 16317512.0 16317544.0 16317520.0 16317536.0 16317520.0 16317560.0 16317528.0 16317536.0 16317552.0 16317560.0 16317584.0 16317560.0 16317568.0 16317536.0 16317576.0 16317552.0
16317536.0 16316568.0 16315608.0 16316432.0 16316592.0 16317536.0 16317560.0 16317560.0 16317552.0 16317552.0 16317552.0 16317552.0 16317544.0 16317536.0 16317544.0 16317528.0 16317528.0 16317520.0 16317528.0 16317504.0
16317480.0 16317496.0 16317472.0 16317512.0 16317544.0 16317528.0 16317536.0 16317544.0 16317560.0 16317560.0 16317576.0 16317608.0 16317584.0 16317584.0 16317608.0 16317616.0 16317624.0 16317624.0 16317616.0 16317632.0
16317600.0 16317600.0 16317608.0 16317648.0 16316648.0 16316664.0 16316664.0 16317440.0 16317448.0 16316664.0 16317448.0 16317472.0 16317464.0 16317464.0 16317456.0 16317472.0 16317464.0 16317472.0 16317488.0 16317480.0
16317472.0 16317496.0 16317464.0 16317480.0 16317480.0 16317504.0 16317528.0 16317504.0 16317528.0 16317504.0 16317512.0 16317512.0 16317496.0 16317512.0 16317536.0 16317512.0 16317496.0 16317552.0 16317528.0 16317536.0
16317544.0 16317560.0 16317560.0
Before run_classifier
Call to run_classifier (0)
Predictions not-poison-ivy: 0.99219
poison-ivy: 0.00781
poison ivy score:0 no poison ivy score 99


Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
Call to get_data (0)
0.0 16315512.0 16315504.0 16315512.0 16315512.0 16315504.0 16315512.0 16315504.0 16315512.0 16315528.0 16315512.0 16315528.0 16315504.0 16315504.0 16315528.0 16315504.0 16315504.0
16315512.0 16315504.0 16315512.0 16315512.0 16315504.0 16315512.0 16315568.0 16315632.0 16316640.0 136.0 1160.0 1256.0 2168.0 2296.0 2280.0 2064.0 16317672.0 16316488.0 16316560.0 16317520.0
1120.0 2056.0 2152.0 2256.0 3152.0 2264.0 1224.0 16316640.0 16316424.0 16316512.0 16317504.0 128.0 152.0 200.0 1096.0 1208.0 1144.0 1048.0 16316512.0 16315616.0
16316424.0 16316560.0 16317648.0 16317608.0 16317608.0 16317608.0 16317592.0 16317560.0 1144.0 2056.0 3184.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0
3320.0 3136.0 16317648.0 16316472.0 16315616.0 16315608.0 16315568.0 16315552.0 16315552.0 16315568.0 16315568.0 16315568.0 16315552.0 16315568.0 16315552.0 16315552.0 16315528.0 16315552.0 16315568.0 16315552.0
16315528.0 16315544.0 16315528.0 16315544.0 16315544.0 16315552.0 16315528.0 16315528.0 16315552.0 16315528.0 16315512.0 16315528.0 16315528.0 16315544.0 16315528.0 16315544.0 16315528.0 16315528.0 16315512.0 16315528.0
16315528.0 16315528.0 16315528.0 16315528.0 16315576.0 16316448.0 16317544.0 1024.0 1120.0 200.0 224.0 1096.0 1032.0 128.0 16316608.0 16316464.0 16316536.0 16317440.0 16.0 16.0
104.0 176.0 152.0 112.0 16317584.0 16316432.0 16315632.0 16316448.0 16316640.0 16317528.0 16317544.0 16317656.0 0.0 64.0 16317672.0 16316640.0 16316424.0 16315616.0 16316464.0 16316648.0
16317504.0 16317520.0 16317568.0 16317560.0 1272.0 2232.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 3320.0 0.0 16317496.0
16317480.0 16317528.0 16317472.0 16317472.0 16317440.0 16317440.0 16316648.0 16316624.0 16316624.0 16316608.0 16316600.0 16316584.0 16316552.0 16316496.0 16316432.0 16315616.0 16315592.0 16315616.0 16315616.0 16316424.0
16316464.0 16316472.0 16316448.0
Before run_classifier

*** PANIC ***

Out of memory

I did an xxd of the tflite model from EI and the length was about 340k. The original person_detection model was 300k. I don’t think the model is too large, but I could be wrong. The EI dashboard predicted about 110k of RAM and less than 400k of flash would be required, which I believe falls within the capabilities of the board.

Any suggestions would be appreciated. I’m running out of ideas, ha. It does still seem to be getting snagged on the run_classifier though.
thanks,
Justin

I think I found a bug in the raw_feature_get_data method. I never use the offset input variable so I repeatedly start at 0. Will try that out in the morning. Thanks for all your guidance so far!

Hi @jlutzwpi,
It does appear to be a memory leak, but I can’t spot it immediately. However, I do spot some things that may be concerning.
In raw_feature_get_data():

  1. out_ptr_ix goes up to length-1, so only the first length elements of the image array are ever read; the index into image should depend on offset if you want to read the entire image array.
  2. You cast a single int8_t value from image to a uint16_t, so the high byte of pixel never comes from the image data. I think you want to grab 2 int8_t bytes from image before casting.

You may want to take a look at the nano_ble33_sense_camera.ino example when you deploy the Arduino library from the Studio. The part to look at is:

int ei_camera_cutout_get_data(size_t offset, size_t length, float *out_ptr) {
    size_t pixel_ix = offset * 2; 
    size_t bytes_left = length;
    size_t out_ptr_ix = 0;

    // read byte for byte
    while (bytes_left != 0) {
        // grab the value and convert to r/g/b
        uint16_t pixel = (ei_camera_capture_out[pixel_ix] << 8) | ei_camera_capture_out[pixel_ix+1];
        uint8_t r, g, b;
        r = ((pixel >> 11) & 0x1f) << 3;
        g = ((pixel >> 5) & 0x3f) << 2;
        b = (pixel & 0x1f) << 3;

        // then convert to out_ptr format
        float pixel_f = (r << 16) + (g << 8) + b;
        out_ptr[out_ptr_ix] = pixel_f;

        // and go to the next pixel
        out_ptr_ix++;
        pixel_ix+=2;
        bytes_left--;
    }

    // and done!
    return 0;
}

Note that ei_camera_capture_out is 2 bytes per pixel, not 1 BPP as in your case.

Hello, that is really good info, thank you. Taking a look at the Arduino code was helpful.

I was able to get the memory issue resolved. It turns out the model I was using was too large, so I changed to MobileNetV1 0.2 (the same one used in the TFLite person_detection example) and now I don’t have memory issues.

However, with the changes above implemented, my classification results are poor. I’m wondering if I’m still not processing the image into the signal structure correctly. Even with “truth data” images, I’m only getting a 5-10% probability.

One thing that confuses me from the Arduino example is that get_data never seems to be called:

 ei::signal_t signal;
 signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT;
 signal.get_data = &ei_camera_cutout_get_data;

 // run the impulse: DSP, neural network and the Anomaly algorithm
 ei_impulse_result_t result = { 0 };
 EI_IMPULSE_ERROR ei_error = run_classifier(&signal, &result, debug_nn);

I would have expected variables to be passed to get_data so the function knows which data to process. I did that in my code (and I believe I remember seeing it somewhere), but it seems to be missing from the Arduino code. This is what I did in my code before inference:

  signal_t signal;
  signal.total_length = EI_CLASSIFIER_DSP_INPUT_FRAME_SIZE;
  signal.get_data = &raw_feature_get_data;

  // read through the signal buffer, like the classifier lib also does
  for (size_t ix = 0; ix < signal.total_length; ix += 1024) {
     size_t bytes_to_read = 1024;
     if (ix + bytes_to_read > signal.total_length) {
         bytes_to_read = signal.total_length - ix;
     }
     int r = signal.get_data(ix, bytes_to_read, image_output + ix);
  }
// Run the classifier
  int err = run_classifier(&signal, &result, debug_nn);
  if (err != EI_IMPULSE_OK) {
     ei_printf("ERR: Failed to run classifier (%d)\n", err);
     return;
  }

Still fighting my way through this. At least I’m not crashing anymore! :slight_smile:

Ok, I think I got it working!

I found some useful code in the image_provider.cpp file in person_detection, which is essentially what you were trying to show me in terms of having two 8-bit values per pixel in the array (so it was the image size × 2):

  auto *displayBuf = new uint8_t[96 * 96 * 2];
  uint16_t index      = 0;
  for (int x = 0; x < 96 * 96; x++) {
    uint16_t imageRGB   = ST7735_COLOR565(image_data[x], image_data[x], image_data[x]);
    displayBuf[index++] = (uint8_t)(imageRGB >> 8) & 0xFF;
    displayBuf[index++] = (uint8_t)(imageRGB)&0xFF;
  }

I thought about it for a bit and coded this in raw_feature_get_data function:

//code from EI-generated Arduino library, recommended by Edge Impulse team
int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) {
       
    //size_t pixel_ix = offset * 2;
    size_t pixel_ix = offset; 
    size_t bytes_left = length;
    size_t out_ptr_ix = 0;

    // read byte for byte
    while (bytes_left != 0) {
        // grab the value and convert to r/g/b
        //uint16_t imageRGB   = ST7735_COLOR565(image[pixel_ix], image[pixel_ix], image[pixel_ix]);
        uint16_t pixel   = ST7735_COLOR565(image[pixel_ix], image[pixel_ix], image[pixel_ix]);
        //uint8_t pix1 = (uint8_t)(imageRGB >> 8) & 0xFF;
        //uint8_t pix2 = (uint8_t)(imageRGB)&0xFF;
        
        //uint16_t pixel = (pix1 << 8) | pix2;
        uint8_t r, g, b;
        r = ((pixel >> 11) & 0x1f) << 3;
        g = ((pixel >> 5) & 0x3f) << 2;
        b = (pixel & 0x1f) << 3;

        // then convert to out_ptr format
        float pixel_f = (r << 16) + (g << 8) + b;
        out_ptr[out_ptr_ix] = pixel_f;

        // and go to the next pixel
        out_ptr_ix++;
        //pixel_ix+=2;
        pixel_ix++;
        bytes_left--;
    }

    // and done!
    return 0;
}

I didn’t think I needed the uint8_t’s since they were just combining back into a uint16_t. I think the helper function ST7735_COLOR565 was what I needed.

Now I think I just need to improve my model for better results. Thanks for all the help!

Justin
