Question/Issue: I’d like to understand how EI creates features from grayscale images.
Project ID:
Context/Use case:
I have an application that captures 1-channel (single-byte luminance) imagery. EI requires training data to be uploaded in a specific format, so I convert the raw data to .png. My EI model trains, and I'm able to load my model on the target device with the C++ library export (very nice, by the way!).
My question is: how do I generate feature data on-device for the input tensor? I see the following in extract_image_features_quantized:
int32_t r = static_cast<int32_t>(pixel >> 16 & 0xff);
int32_t g = static_cast<int32_t>(pixel >> 8 & 0xff);
int32_t b = static_cast<int32_t>(pixel & 0xff);
// ITU-R 601-2 luma transform
// see: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.convert
int32_t gray = (iRedToGray * r) + (iGreenToGray * g) + (iBlueToGray * b);
gray >>= 16; // scale down to int8_t
gray += EI_CLASSIFIER_TFLITE_INPUT_ZEROPOINT;
if (gray < -128) gray = -128;
else if (gray > 127) gray = 127;
output_matrix->buffer[output_ix++] = static_cast<int8_t>(gray);
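For context, my understanding is that the standard image path hands this function each pixel packed as 0xRRGGBB inside a float, so for grayscale input I'd replicate the luminance byte into all three channels before the SDK ever sees it. A rough sketch of what I mean, where frame_buffer and pixel_count are placeholders for my capture code:

#include <stddef.h>
#include <stdint.h>

// Pack a 1-channel luminance buffer into the 0xRRGGBB-style values
// that (I believe) extract_image_features_quantized() unpacks.
static void pack_grayscale_as_rgb(const uint8_t *frame_buffer,
                                  size_t pixel_count,
                                  float *out_features) {
    for (size_t i = 0; i < pixel_count; i++) {
        uint32_t p = frame_buffer[i];
        // r == g == b == p once the SDK unpacks this value
        out_features[i] = (float)((p << 16) | (p << 8) | p);
    }
}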
I did notice that for grayscale images the lower 3 bytes of each raw feature are all the same (presumably the grayscale value replicated into R, G, and B). Does this mean I can take my 1-byte luminance value directly and quantize it as follows:
static const int32_t iRedToGray   = (int32_t)(0.299f * 65536.0f);
static const int32_t iGreenToGray = (int32_t)(0.587f * 65536.0f);
static const int32_t iBlueToGray  = (int32_t)(0.114f * 65536.0f);

static int8_t quantize_grayscale_pixel(uint8_t pixel_grayscale) {
    // ITU-R 601-2 luma transform
    // see: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.convert
    // With r == g == b, the three coefficients sum to 65535, so this
    // should reproduce the SDK's fixed-point arithmetic exactly.
    int32_t gray = (iRedToGray * pixel_grayscale) +
                   (iGreenToGray * pixel_grayscale) +
                   (iBlueToGray * pixel_grayscale);
    gray >>= 16; // scale down to int8_t
    gray += EI_CLASSIFIER_TFLITE_INPUT_ZEROPOINT;
    if (gray < -128) gray = -128;
    else if (gray > 127) gray = 127;
    return static_cast<int8_t>(gray);
}
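If that's valid, filling the whole input tensor would just be a loop like this (input_buffer is a placeholder for wherever the quantized features need to land):

for (size_t ix = 0; ix < pixel_count; ix++) {
    input_buffer[ix] = quantize_grayscale_pixel(frame_buffer[ix]);
}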