Running the model on my computer is very slow

Hello,
I am hoping to run the resulting model on my computer. I have messaged before about getting it running on a PC; it is working, but it is EXTREMELY slow. There are only two things I am trying to detect: a hummingbird, or that damn squirrel that bullies the hummingbird.

I am getting my video from an IP cam feed. Would anyone be kind enough to have a look at my code and let me know if I am doing anything obviously wrong, or is my PC just too slow? I am using the optimized build, and my IP cam is already delivering a 320x320 image (I set that on the camera).

#include <unistd.h>
#include "opencv2/opencv.hpp"
#include "opencv2/videoio/videoio_c.h"
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"
#include "iostream"

static bool use_debug = true;

// If you don't want to allocate this much memory you can use a signal_t structure as well
// and read directly from a cv::Mat object, but on Linux this should be OK
static float features[EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT];

/**
 * Resize and crop to the set width/height from model_metadata.h
 */
void resize_and_crop(cv::Mat *in_frame, cv::Mat *out_frame) {
    // to resize... we first need to know the factor
    float factor_w = static_cast<float>(EI_CLASSIFIER_INPUT_WIDTH) / static_cast<float>(in_frame->cols);
    float factor_h = static_cast<float>(EI_CLASSIFIER_INPUT_HEIGHT) / static_cast<float>(in_frame->rows);

    float largest_factor = factor_w > factor_h ? factor_w : factor_h;

    cv::Size resize_size(static_cast<int>(largest_factor * static_cast<float>(in_frame->cols)),
        static_cast<int>(largest_factor * static_cast<float>(in_frame->rows)));

    cv::Mat resized;
    cv::resize(*in_frame, resized, resize_size);

    int crop_x = resize_size.width > resize_size.height ?
        (resize_size.width - resize_size.height) / 2 :
        0;
    int crop_y = resize_size.height > resize_size.width ?
        (resize_size.height - resize_size.width) / 2 :
        0;

    cv::Rect crop_region(crop_x, crop_y, EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT);

    if (use_debug) {
        printf("crop_region x=%d y=%d width=%d height=%d\n", crop_x, crop_y, EI_CLASSIFIER_INPUT_WIDTH, EI_CLASSIFIER_INPUT_HEIGHT);
    }

    *out_frame = resized(crop_region);
}

int main(int argc, char** argv) {
    // If you see: OpenCV: not authorized to capture video (status 0), requesting... Abort trap: 6
    // This might be a permissions issue. Are you running this command from a simulated shell (like in Visual Studio Code)?
    // Try it from a real terminal.

    if (argc < 2) {
        printf("Requires one parameter (ID of the webcam).\n");
        printf("You can find these via `v4l2-ctl --list-devices`.\n");
        printf("E.g. for:\n");
        printf("    C922 Pro Stream Webcam (usb-70090000.xusb-2.1):\n");
        printf("            /dev/video0\n");
        printf("The ID of the webcam is 0\n");
        exit(1);
    }

    for (int ix = 2; ix < argc; ix++) {
        if (strcmp(argv[ix], "--debug") == 0) {
            printf("Enabling debug mode\n");
            use_debug = true;
        }
    }

    cv::VideoCapture vcap;
    const std::string videoStreamAddress = "http://root:Link888*@192.168.2.19/mjpg/video.mjpg";
    if(!vcap.open(videoStreamAddress)) {
        std::cout << "Error opening video stream or file" << std::endl;
        return -1;
    } else {
        std::cout << "VCAP open!" << std::endl;
    }

    // //open the webcam...
    // cv::VideoCapture camera(atoi(argv[1]));
    // if (!camera.isOpened()) {
    //     std::cerr << "ERROR: Could not open camera" << std::endl;
    //     return 1;
    // }

    cv::namedWindow("Webcam", cv::WINDOW_AUTOSIZE);

    // this will contain the image from the webcam
    cv::Mat frame;

    // display the frame until you press a key
    while (1) {
        // 100ms. between inference
        int64_t next_frame = (int64_t)(ei_read_timer_ms() + 100);

        // capture the next frame from the camera stream
        if (!vcap.read(frame)) {
            std::cout << "No frame" << std::endl;
            cv::waitKey();
            continue; // don't run imshow/inference on an empty frame
        }
        cv::imshow("Webcam", frame);
        if (cv::waitKey(1) >= 0) break;

        cv::Mat cropped;
        resize_and_crop(&frame, &cropped);

        size_t feature_ix = 0;
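        // pack each pixel as one 0xRRGGBB value per feature - the layout the Edge Impulse image signal expects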
        for (int rx = 0; rx < (int)cropped.rows; rx++) {
            for (int cx = 0; cx < (int)cropped.cols; cx++) {
                cv::Vec3b pixel = cropped.at<cv::Vec3b>(rx, cx);
                uint8_t b = pixel.val[0];
                uint8_t g = pixel.val[1];
                uint8_t r = pixel.val[2];
                features[feature_ix++] = (r << 16) + (g << 8) + b;
            }
        }

        ei_impulse_result_t result;

        // construct a signal from the features buffer
        signal_t signal;
        numpy::signal_from_buffer(features, EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT, &signal);

        // and run the classifier
        EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);
        if (res != 0) {
            printf("ERR: Failed to run classifier (%d)\n", res);
            return 1;
        }

    #if EI_CLASSIFIER_OBJECT_DETECTION == 1
        printf("Classification result (%d ms.):\n", result.timing.dsp + result.timing.classification);
        bool found_bb = false;
        for (size_t ix = 0; ix < EI_CLASSIFIER_OBJECT_DETECTION_COUNT; ix++) {
            auto bb = result.bounding_boxes[ix];
            if (bb.value == 0) {
                continue;
            }

            found_bb = true;
            printf("    %s (%f) [ x: %u, y: %u, width: %u, height: %u ]\n", bb.label, bb.value, bb.x, bb.y, bb.width, bb.height);
        }

        if (!found_bb) {
            printf("    no objects found\n");
        }
    #else
        printf("%d ms. ", result.timing.dsp + result.timing.classification);
        for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
            printf("%s: %.05f", result.classification[ix].label, result.classification[ix].value);
            if (ix != EI_CLASSIFIER_LABEL_COUNT - 1) {
                printf(", ");
            }
        }
        printf("\n");
    #endif

        int64_t sleep_ms = next_frame > (int64_t)ei_read_timer_ms() ? next_frame - (int64_t)ei_read_timer_ms() : 0;
        if (sleep_ms > 0) {
            usleep(sleep_ms * 1000);
        }
    }
    return 0;
}

#if !defined(EI_CLASSIFIER_SENSOR) || EI_CLASSIFIER_SENSOR != EI_CLASSIFIER_SENSOR_CAMERA
#error "Invalid model for current sensor."
#endif

Hi @kevin192291, on PCs you’ll need to build with hardware acceleration if you want to run object detection - otherwise it will indeed be very slow (it falls back to a pure software implementation).
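
To see where the time is actually going, here is a quick, untested sketch that wraps the existing run_classifier() call in your loop with std::chrono (it assumes the same signal and result variables as in your code). If the wall-clock time is much larger than the dsp + classification times the SDK reports, the bottleneck is elsewhere, e.g. capture or MJPEG decode:

#include <chrono>  // add at the top of the file

// drop-in replacement for the run_classifier() call inside the while loop
auto t0 = std::chrono::steady_clock::now();
EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);
auto t1 = std::chrono::steady_clock::now();
long long wall_ms =
    std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count();
printf("wall-clock: %lld ms (dsp: %d ms, classification: %d ms)\n",
       wall_ms, result.timing.dsp, result.timing.classification);
if (res != 0) {
    printf("ERR: Failed to run classifier (%d)\n", res);
    return 1;
}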

Hello @janjongboom,
Sorry to bug you again. I have been trying to get this working on my own, but I am at a bit of a standstill.
I used the command: APP_CAMERA=1 TARGET_LINUX_AARCH64=1 USE_FULL_TFLITE=1 CC=clang CXX=clang++ make -j
and I did the build with EON set to OFF.

The result is:

ftsg2d -lruy -lXNNPACK -lpthread
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libtensorflow-lite.a when searching for -ltensorflow-lite
/usr/bin/ld: cannot find -ltensorflow-lite
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libcpuinfo.a when searching for -lcpuinfo
/usr/bin/ld: cannot find -lcpuinfo
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libfarmhash.a when searching for -lfarmhash
/usr/bin/ld: cannot find -lfarmhash
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libfft2d_fftsg.a when searching for -lfft2d_fftsg
/usr/bin/ld: cannot find -lfft2d_fftsg
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libfft2d_fftsg2d.a when searching for -lfft2d_fftsg2d
/usr/bin/ld: cannot find -lfft2d_fftsg2d
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libruy.a when searching for -lruy
/usr/bin/ld: cannot find -lruy
/usr/bin/ld: skipping incompatible ./tflite/linux-aarch64/libXNNPACK.a when searching for -lXNNPACK
/usr/bin/ld: cannot find -lXNNPACK
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [Makefile:95: runner] Error 1

I am running Linux Mint (Ubuntu-based), not Arch. Would it be difficult for me to get these libraries working for my target?

Thanks again!
(btw, let me know if these questions are too much)

Hi @kevin192291 Your computer does not seem to be an AARCH64 target - where are you running this? We only support ARM targets with hardware acceleration right now, not x86.

Ohhh okay, I see. I had thought this would run on my PC. My setup is:

OS: Linux Mint 20
Processor: Ryzen 9 5900X
GPU: Nvidia GTX 970

Is there a way to get the pure TensorFlow model? Would that not work on my PC?

Yes - on the Dashboard page, under ‘Download block output’, you’ll find the TensorFlow models (in TFLite and SavedModel formats).
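
If you want to try the downloaded TFLite file outside of the Edge Impulse SDK, a rough, untested sketch with the TensorFlow Lite C++ API would look like this. The file name is a placeholder, and you'll need to check whether your model expects float32 or quantized (int8/uint8) input:

#include <cstdio>
#include <memory>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main() {
    // "model.tflite" stands in for whatever you downloaded from the Dashboard
    auto model = tflite::FlatBufferModel::BuildFromFile("model.tflite");
    if (!model) { printf("Failed to load model\n"); return 1; }

    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    if (!interpreter || interpreter->AllocateTensors() != kTfLiteOk) {
        printf("Failed to build interpreter\n");
        return 1;
    }

    // Assuming a float32 input tensor; copy your 320x320 RGB data in here
    float *input = interpreter->typed_input_tensor<float>(0);
    (void)input;

    if (interpreter->Invoke() != kTfLiteOk) {
        printf("Invoke failed\n");
        return 1;
    }

    float *output = interpreter->typed_output_tensor<float>(0);
    printf("first output value: %f\n", output ? output[0] : 0.0f);
    return 0;
}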