Edge Impulse with pan and tilt sevos

GreatoR · April 20, 2024, 3:04pm

Hi I want to make a pan tilt camera using esp32 camera (ai thinker) and a FTDI. I want the servos to move/track the object according the position (x,y)

I took the existing edge impulse code and I added( in order to find the center point )a line that calculates the center using the height and width of bounding_boxes

But the servos the don’t move accordingly they move a bit off

#if EI_CLASSIFIER_OBJECT_DETECTION == 1
    bool bb_found = result.bounding_boxes[0].value > 0;
    for (size_t ix = 0; ix < result.bounding_boxes_count; ix++) {
        auto bb = result.bounding_boxes[ix];
        if (bb.value == 0) {
           continue;
        }
        centerX=bb.x+(bb.width/2);
        centerY=bb.y+(bb.height/2); 
	    servoX.write(centerX);
		servoY.write(centerY);

        ei_printf("    %s (%f) [ x: %u, y: %u, width: %u, height: %u ]\n", bb.label, bb.value, bb.x, bb.y, bb.width, bb.height);
        ```
The output is :

Predictions (DSP: 2 ms., Classification: 131 ms., Anomaly: 0 ms.):
No objects found
Predictions (DSP: 2 ms., Classification: 131 ms., Anomaly: 0 ms.):
No objects found
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
No objects found
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
No objects found
Predictions (DSP: 2 ms., Classification: 131 ms., Anomaly: 0 ms.):
No objects found
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.902344) [ x: 16, y: 16, width: 8, height: 16 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.621094) [ x: 16, y: 16, width: 16, height: 8 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.855469) [ x: 24, y: 16, width: 8, height: 16 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.687500) [ x: 24, y: 16, width: 8, height: 16 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.773438) [ x: 8, y: 8, width: 16, height: 16 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.828125) [ x: 8, y: 8, width: 16, height: 16 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.835938) [ x: 8, y: 8, width: 16, height: 16 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.882812) [ x: 8, y: 16, width: 16, height: 8 ]
Predictions (DSP: 2 ms., Classification: 132 ms., Anomaly: 0 ms.):
Son Goku (0.835938) [ x: 16, y: 16, width: 8, height: 8 ]

shawn_edgeimpulse · April 23, 2024, 4:08pm

Hi @GreatoR,

Bounding box information (x, y, width, height) are given in pixels. So, when you calculate center coordinates (centerX and centerY), those are pixels (from 0 to whatever the width/height of your image is). Assuming you are using the Arduino servo library, the Servo.write() method expects an angle in degrees (0 to 180). You need to translate center coordinates in pixels to an angle to pass to your servos.

GreatoR · April 25, 2024, 6:53pm

Thanks for the response !
I am new into this!
I modify the code and look like this, still not working ;/

.
.
.
int servo_min_angle = 0;   
int servo_max_angle = 180; 
int image_width = 48;      
int image_height = 48;
.
.
.
void loop()
{
for (size_t ix = 0; ix < result.bounding_boxes_count; ix++) {
    auto bb = result.bounding_boxes[ix];
    if (bb.value == 0) {
        continue;
    }

    int servoX_angle = map(bb.x, 0, image_width, servo_min_angle, servo_max_angle);
    int servoY_angle = map(bb.y, 0, image_height, servo_min_angle, servo_max_angle);

    servoX.write(servoX_angle);
    servoY.write(servoY_angle);

    ei_printf("    %s (%f) [ x: %u, y: %u, width: %u, height: %u ]\n", bb.label, bb.value, bb.x, bb.y, bb.width, bb.height);
}
}

shawn_edgeimpulse · April 26, 2024, 2:36pm

Hi @GreatoR,

What behavior are you seeing from the servo and what are you expecting it to do?

Also, keep in mind if there is more than one bounding box, you will give conflicting instructions to your servos, as you attempt to move them for every bounding box (as given by the for loop).

GreatoR · April 28, 2024, 8:23am

Tilt servo is going downwards I check the code if I put wrong the pan movement into tilt
I provide you with a video and the code!
In the link the are 2 videos

/* Edge Impulse Arduino examples
 * Copyright (c) 2022 EdgeImpulse Inc.
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* Includes ---------------------------------------------------------------- */
#include "ESP32_inferencing.h"
#include "edge-impulse-sdk/dsp/image/image.hpp"
#include <ESP32Servo.h> 
#include "esp_camera.h"
#define LED1 4
#define SERVO_PIN_1 14 // GPIO14
#define SERVO_PIN_2 13 // GPIO13

Servo servoPan;
Servo servoTilt;


// Select camera model - find more camera models in camera_pins.h file here
// https://github.com/espressif/arduino-esp32/blob/master/libraries/ESP32/examples/Camera/CameraWebServer/camera_pins.h

//#define CAMERA_MODEL_ESP_EYE // Has PSRAM
#define CAMERA_MODEL_AI_THINKER // Has PSRAM

#if defined(CAMERA_MODEL_ESP_EYE)
#define PWDN_GPIO_NUM    -1
#define RESET_GPIO_NUM   -1
#define XCLK_GPIO_NUM    4
#define SIOD_GPIO_NUM    18
#define SIOC_GPIO_NUM    23

#define Y9_GPIO_NUM      36
#define Y8_GPIO_NUM      37
#define Y7_GPIO_NUM      38
#define Y6_GPIO_NUM      39
#define Y5_GPIO_NUM      35
#define Y4_GPIO_NUM      14
#define Y3_GPIO_NUM      13
#define Y2_GPIO_NUM      34
#define VSYNC_GPIO_NUM   5
#define HREF_GPIO_NUM    27
#define PCLK_GPIO_NUM    25

#elif defined(CAMERA_MODEL_AI_THINKER)
#define PWDN_GPIO_NUM     32
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM      0
#define SIOD_GPIO_NUM     26
#define SIOC_GPIO_NUM     27

#define Y9_GPIO_NUM       35
#define Y8_GPIO_NUM       34
#define Y7_GPIO_NUM       39
#define Y6_GPIO_NUM       36
#define Y5_GPIO_NUM       21
#define Y4_GPIO_NUM       19
#define Y3_GPIO_NUM       18
#define Y2_GPIO_NUM        5
#define VSYNC_GPIO_NUM    25
#define HREF_GPIO_NUM     23
#define PCLK_GPIO_NUM     22

#else
#error "Camera model not selected"
#endif

/* Constant defines -------------------------------------------------------- */
#define EI_CAMERA_RAW_FRAME_BUFFER_COLS           320
#define EI_CAMERA_RAW_FRAME_BUFFER_ROWS           240
#define EI_CAMERA_FRAME_BYTE_SIZE                 3

/* Private variables ------------------------------------------------------- */
static bool debug_nn = false; // Set this to true to see e.g. features generated from the raw signal
static bool is_initialised = false;
uint8_t *snapshot_buf; //points to the output of the capture

static camera_config_t camera_config = {
    .pin_pwdn = PWDN_GPIO_NUM,
    .pin_reset = RESET_GPIO_NUM,
    .pin_xclk = XCLK_GPIO_NUM,
    .pin_sscb_sda = SIOD_GPIO_NUM,
    .pin_sscb_scl = SIOC_GPIO_NUM,

    .pin_d7 = Y9_GPIO_NUM,
    .pin_d6 = Y8_GPIO_NUM,
    .pin_d5 = Y7_GPIO_NUM,
    .pin_d4 = Y6_GPIO_NUM,
    .pin_d3 = Y5_GPIO_NUM,
    .pin_d2 = Y4_GPIO_NUM,
    .pin_d1 = Y3_GPIO_NUM,
    .pin_d0 = Y2_GPIO_NUM,
    .pin_vsync = VSYNC_GPIO_NUM,
    .pin_href = HREF_GPIO_NUM,
    .pin_pclk = PCLK_GPIO_NUM,

    //XCLK 20MHz or 10MHz for OV2640 double FPS (Experimental)
    .xclk_freq_hz = 20000000,
    .ledc_timer = LEDC_TIMER_0,
    .ledc_channel = LEDC_CHANNEL_0,

    .pixel_format = PIXFORMAT_JPEG, //YUV422,GRAYSCALE,RGB565,JPEG
    .frame_size = FRAMESIZE_QVGA,    //QQVGA-UXGA Do not use sizes above QVGA when not JPEG

    .jpeg_quality = 12, //0-63 lower number means higher quality
    .fb_count = 1,       //if more than one, i2s runs in continuous mode. Use only with JPEG
    .fb_location = CAMERA_FB_IN_PSRAM,
    .grab_mode = CAMERA_GRAB_WHEN_EMPTY,
};

/* Function definitions ------------------------------------------------------- */
bool ei_camera_init(void);
void ei_camera_deinit(void);
bool ei_camera_capture(uint32_t img_width, uint32_t img_height, uint8_t *out_buf) ;

int centerX = 0;
int centerY = 0;
int servo_min_angle = 0;   // Ελάχιστη γωνία του servo
int servo_max_angle = 180; // Μέγιστη γωνία του servo
int image_width = 48;      // Πλάτος της εικόνας
int image_height = 48;     // Ύψος της εικόνας

/**
* @brief      Arduino setup function
*/
void setup()
{
	servoTilt.attach(SERVO_PIN_1);
	servoPan.attach(SERVO_PIN_2);
	pinMode(LED1,OUTPUT);
    // put your setup code here, to run once:
    Serial.begin(115200);
    //comment out the below line to start inference immediately after upload
    // move all servos to position 0
	servoPan.write(centerX);
	servoTilt.write(centerY);

    while (!Serial);
    Serial.println("Edge Impulse Inferencing Demo");
    if (ei_camera_init() == false) {
        ei_printf("Failed to initialize Camera!\r\n");
    }
    else {
        ei_printf("Camera initialized\r\n");
    }

    ei_printf("\nStarting continious inference in 2 seconds...\n");
    ei_sleep(2000);
}

/**
* @brief      Get data and run inferencing
*
* @param[in]  debug  Get debug info if true
*/
void loop()
{
	
    // instead of wait_ms, we'll wait on the signal, this allows threads to cancel us...
    if (ei_sleep(5) != EI_IMPULSE_OK) {
        return;
    }

    snapshot_buf = (uint8_t*)malloc(EI_CAMERA_RAW_FRAME_BUFFER_COLS * EI_CAMERA_RAW_FRAME_BUFFER_ROWS * EI_CAMERA_FRAME_BYTE_SIZE);

    // check if allocation was successful
    if(snapshot_buf == nullptr) {
        ei_printf("ERR: Failed to allocate snapshot buffer!\n");
        return;
    }

    ei::signal_t signal;
    signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT;
    signal.get_data = &ei_camera_get_data;

    if (ei_camera_capture((size_t)EI_CLASSIFIER_INPUT_WIDTH, (size_t)EI_CLASSIFIER_INPUT_HEIGHT, snapshot_buf) == false) {
        ei_printf("Failed to capture image\r\n");
        free(snapshot_buf);
        return;
    }

    // Run the classifier
    ei_impulse_result_t result = { 0 };

    EI_IMPULSE_ERROR err = run_classifier(&signal, &result, debug_nn);
    if (err != EI_IMPULSE_OK) {
        ei_printf("ERR: Failed to run classifier (%d)\n", err);
        return;
    }

    // print the predictions
    ei_printf("Predictions (DSP: %d ms., Classification: %d ms., Anomaly: %d ms.): \n",
                result.timing.dsp, result.timing.classification, result.timing.anomaly);

#if EI_CLASSIFIER_OBJECT_DETECTION == 1
    bool bb_found = result.bounding_boxes[0].value > 0;
    for (size_t ix = 0; ix < result.bounding_boxes_count; ix++) {
        auto bb = result.bounding_boxes[ix];
        if (bb.value == 0) {
            //digitalWrite(LED1,LOW);
            continue;
        }

        int servoPan_angle = map(bb.x, 0, image_width, servo_min_angle, servo_max_angle); 
        int servoTilt_angle = map(bb.y, 0, image_height, servo_min_angle, servo_max_angle);

        servoPan.write(servoPan_angle);
        servoTilt.write(servoTilt_angle);

        //try and error ..testing..
        //centerX=bb.x+(bb.width/2);
       // centerY=bb.y+(bb.height/2);
        //printf("KEDRO" , centerX,centerY);
		    //digitalWrite(LED1,HIGH);     
		    //servoX.write(centerX);
		    //servoY.write(centerY);


        ei_printf("    %s (%f) [ x: %u, y: %u, width: %u, height: %u ]\n", bb.label, bb.value, bb.x, bb.y, bb.width, bb.height);
        
    }
    if (!bb_found) {
        ei_printf("    No objects found\n");
        //digitalWrite(LED1,LOW);
        
		
    }
#else
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        ei_printf("    %s: %.5f\n", result.classification[ix].label,
                                    result.classification[ix].value);
    }
#endif

#if EI_CLASSIFIER_HAS_ANOMALY == 1
        ei_printf("    anomaly score: %.3f\n", result.anomaly);
#endif


    free(snapshot_buf);

}

/**
 * @brief   Setup image sensor & start streaming
 *
 * @retval  false if initialisation failed
 */
bool ei_camera_init(void) {

    if (is_initialised) return true;

#if defined(CAMERA_MODEL_ESP_EYE)
  pinMode(13, INPUT_PULLUP);
  pinMode(14, INPUT_PULLUP);
#endif

    //initialize the camera
    esp_err_t err = esp_camera_init(&camera_config);
    if (err != ESP_OK) {
      Serial.printf("Camera init failed with error 0x%x\n", err);
      return false;
    }

    sensor_t * s = esp_camera_sensor_get();
    // initial sensors are flipped vertically and colors are a bit saturated
    if (s->id.PID == OV3660_PID) {
      s->set_vflip(s, 1); // flip it back
      s->set_brightness(s, 1); // up the brightness just a bit
      s->set_saturation(s, 0); // lower the saturation
    }

#if defined(CAMERA_MODEL_M5STACK_WIDE)
    s->set_vflip(s, 1);
    s->set_hmirror(s, 1);
#elif defined(CAMERA_MODEL_ESP_EYE)
    s->set_vflip(s, 1);
    s->set_hmirror(s, 1);
    s->set_awb_gain(s, 1);
#endif

    is_initialised = true;
    return true;
}

/**
 * @brief      Stop streaming of sensor data
 */
void ei_camera_deinit(void) {

    //deinitialize the camera
    esp_err_t err = esp_camera_deinit();

    if (err != ESP_OK)
    {
        ei_printf("Camera deinit failed\n");
        return;
    }

    is_initialised = false;
    return;
}


/**
 * @brief      Capture, rescale and crop image
 *
 * @param[in]  img_width     width of output image
 * @param[in]  img_height    height of output image
 * @param[in]  out_buf       pointer to store output image, NULL may be used
 *                           if ei_camera_frame_buffer is to be used for capture and resize/cropping.
 *
 * @retval     false if not initialised, image captured, rescaled or cropped failed
 *
 */
bool ei_camera_capture(uint32_t img_width, uint32_t img_height, uint8_t *out_buf) {
    bool do_resize = false;

    if (!is_initialised) {
        ei_printf("ERR: Camera is not initialized\r\n");
        return false;
    }

    camera_fb_t *fb = esp_camera_fb_get();

    if (!fb) {
        ei_printf("Camera capture failed\n");
        return false;
    }

   bool converted = fmt2rgb888(fb->buf, fb->len, PIXFORMAT_JPEG, snapshot_buf);

   esp_camera_fb_return(fb);

   if(!converted){
       ei_printf("Conversion failed\n");
       return false;
   }

    if ((img_width != EI_CAMERA_RAW_FRAME_BUFFER_COLS)
        || (img_height != EI_CAMERA_RAW_FRAME_BUFFER_ROWS)) {
        do_resize = true;
    }

    if (do_resize) {
        ei::image::processing::crop_and_interpolate_rgb888(
        out_buf,
        EI_CAMERA_RAW_FRAME_BUFFER_COLS,
        EI_CAMERA_RAW_FRAME_BUFFER_ROWS,
        out_buf,
        img_width,
        img_height);
    }


    return true;
}

static int ei_camera_get_data(size_t offset, size_t length, float *out_ptr)
{
    // we already have a RGB888 buffer, so recalculate offset into pixel index
    size_t pixel_ix = offset * 3;
    size_t pixels_left = length;
    size_t out_ptr_ix = 0;

    while (pixels_left != 0) {
        out_ptr[out_ptr_ix] = (snapshot_buf[pixel_ix] << 16) + (snapshot_buf[pixel_ix + 1] << 8) + snapshot_buf[pixel_ix + 2];

        // go to the next pixel
        out_ptr_ix++;
        pixel_ix+=3;
        pixels_left--;
    }
    // and done!
    return 0;
}

#if !defined(EI_CLASSIFIER_SENSOR) || EI_CLASSIFIER_SENSOR != EI_CLASSIFIER_SENSOR_CAMERA
#error "Invalid model for current sensor"
#endif

GreatoR · April 28, 2024, 8:24am

The link! Sorry for the mess in the wiring
https://imgur.com/a/azvUZoC

herbk · April 28, 2024, 11:09am

Hi,
with only a few snippeds of the code it’s hard to help… I guess:
The camera don’t know where it is atm, so it don’t know where to move.

shawn_edgeimpulse · April 30, 2024, 3:10pm

Hi @GreatoR,

My guess is that you’re still moving the servo far more than necessary. You’re trying to map a pixel (x, y) location to the full range of servo motion, e.g. 0 to 180 degrees. So, if your Goku is at the top of the frame (y=0), the servo will try to move all the way to 0 deg. In all likelihood, that will move your camera too far and Goku will be out of the frame.

I recommend taking an iterative approach. If the center of the Goku bounding box (x, y) is above the center of the frame (image_width / 2, image_height / 2), then just move the camera up a little. Same thing for if Goku is below the middle point. You should also have a “dead zone” where the camera doesn’t move at all (start with something like 10 pixels in either direction):

if (y < (image_width / 2 - 10)) {
  servoTilt_angle -= 5;
else if (y > (image_width / 2 + 10)) {
 servoTilt_angle += 5;
}
servoTilt.write(servoTilt_angle);

Repeat for the x direction.

ashokghil · December 2, 2024, 3:45am

Hi GratoR, I’m also facing the same issue. Did you find the solution for the problem? Please share the code.