Question/Issue: I’d like to understand how EI creates features from grayscale images.
Project ID:
Context/Use case:
I have an application that captures 1-channel (single-byte luminance) imagery. EI requires training data to be uploaded in a specific format, so I convert the raw data to .png. My EI model trains, and I'm able to load my model on the target device with the C++ library export (very nice, by the way!).
My question is: how do I generate feature data on-device for the input tensor? I see the following in extract_image_features_quantized:
int32_t r = static_cast<int32_t>(pixel >> 16 & 0xff);
int32_t g = static_cast<int32_t>(pixel >> 8 & 0xff);
int32_t b = static_cast<int32_t>(pixel & 0xff);
// ITU-R 601-2 luma transform
// see: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.convert
int32_t gray = (iRedToGray * r) + (iGreenToGray * g) + (iBlueToGray * b);
gray >>= 16; // scale down to int8_t
gray += EI_CLASSIFIER_TFLITE_INPUT_ZEROPOINT;
if (gray < -128) gray = -128;
else if (gray > 127) gray = 127;
output_matrix->buffer[output_ix++] = static_cast<int8_t>(gray);
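For context, my understanding is that the standard image path hands this function each pixel packed as 0xRRGGBB inside a float, so for grayscale input I'd replicate the luminance byte into all three channels before the SDK ever sees it. A rough sketch of what I mean, where frame_buffer and pixel_count are placeholders for my capture code:

#include <stddef.h>
#include <stdint.h>

// Pack a 1-channel luminance buffer into the 0xRRGGBB-style values
// that (I believe) extract_image_features_quantized() unpacks.
static void pack_grayscale_as_rgb(const uint8_t *frame_buffer,
                                  size_t pixel_count,
                                  float *out_features) {
    for (size_t i = 0; i < pixel_count; i++) {
        uint32_t p = frame_buffer[i];
        // r == g == b == p once the SDK unpacks this value
        out_features[i] = (float)((p << 16) | (p << 8) | p);
    }
}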
I did notice that for grayscale images the lower 3 bytes of each raw feature are all the same (presumably the grayscale value replicated into R, G, and B). Does this mean I can take my 1-byte luminance value directly and quantize it as follows:
static const int32_t iRedToGray   = (int32_t)(0.299f * 65536.0f);
static const int32_t iGreenToGray = (int32_t)(0.587f * 65536.0f);
static const int32_t iBlueToGray  = (int32_t)(0.114f * 65536.0f);

static int8_t quantize_grayscale_pixel(uint8_t pixel_grayscale) {
    // ITU-R 601-2 luma transform
    // see: https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image.convert
    // With r == g == b, the three coefficients sum to 65535, so this
    // should reproduce the SDK's fixed-point arithmetic exactly.
    int32_t gray = (iRedToGray * pixel_grayscale) +
                   (iGreenToGray * pixel_grayscale) +
                   (iBlueToGray * pixel_grayscale);
    gray >>= 16; // scale down to int8_t
    gray += EI_CLASSIFIER_TFLITE_INPUT_ZEROPOINT;
    if (gray < -128) gray = -128;
    else if (gray > 127) gray = 127;
    return static_cast<int8_t>(gray);
}
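If that's valid, filling the whole input tensor would just be a loop like this (input_buffer is a placeholder for wherever the quantized features need to land):

for (size_t ix = 0; ix < pixel_count; ix++) {
    input_buffer[ix] = quantize_grayscale_pixel(frame_buffer[ix]);
}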