Best Approach for Custom Person Detection on ESP32-CAM — Manual Model or Edge Impulse?

rifat.projects · November 22, 2025, 4:24am

Context/Use case:
Building a low-cost, compact computer-vision system to detect unauthorized individuals outside a lab using an ESP32-CAM. The goal is to keep hardware minimal while still achieving reliable person detection on-device.

Details:
I’m planning to create an AI model that runs directly on the ESP32-CAM. During research, I found two main workflows:

Manual workflow

1. Collect a dataset
2. Train a custom lightweight model
3. Optimize (quantization, pruning, etc.)
4. Convert to TFLite (TensorFlow Lite for Microcontrollers), optionally running on ESP-NN optimized kernels
5. Deploy manually using ESP-IDF or Arduino
This gives full control but requires more expertise, especially in ML optimization and embedded deployment.
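
To make the quantization part of the optimization step more concrete, here is a minimal pure-Python sketch of the general idea behind int8 quantization: floats are mapped to 8-bit integer codes through a scale and a zero point. It is an illustration under simplifying assumptions (one range per tensor taken straight from min/max, no calibration data). Real converters use more refined schemes, and the names `quantize_int8` and `dequantize` are hypothetical helpers, not part of any library.

```python
def quantize_int8(values):
    """Map floats to int8 codes so that real ~= (q - zero_point) * scale."""
    # Extend the range to include 0.0 so that zero is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    span = hi - lo
    scale = span / 255.0 if span > 0 else 1.0  # 256 int8 levels -> 255 steps
    zero_point = round(-128 - lo / scale)      # int8 code that stands for 0.0
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(x - zero_point) * scale for x in q]


if __name__ == "__main__":
    weights = [-0.8, -0.1, 0.0, 0.3, 1.2]  # made-up example weights
    q, scale, zp = quantize_int8(weights)
    restored = dequantize(q, scale, zp)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 codes:", q)
    print("worst-case rounding error:", worst)
    # Storage drops from 4 bytes (float32) to 1 byte per weight.
    print("bytes per weight: float32 = 4, int8 = 1")
```

The trade-off this shows is the one that matters on a microcontroller: each weight takes a quarter of the float32 storage, at the cost of a rounding error of about half a quantization step per value.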

Automated workflow using Edge Impulse

Upload images (dataset)
Use built-in training pipelines
Auto-optimize for microcontrollers
Export as an ESP32-ready library
This greatly simplifies dataset management, training, optimization, and deployment, which is especially useful for beginners or early prototyping.

I want to understand which approach makes more sense for a practical project that needs custom detection but also fast development without too much complexity.

Questions:

1. For an ESP32-CAM, which approach is more practical: manual training or using Edge Impulse?
2. How large an accuracy difference can I expect between the two methods?
3. Does Edge Impulse handle model optimization well enough for real-time inference on ESP32-CAM?
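
One way to make the accuracy and real-time questions measurable is to run both models on the same held-out images and compare the results, then convert the measured per-frame latency into a frame rate. The sketch below assumes binary labels (1 = person, 0 = no person). The helper names and every number in the demo are made-up placeholders, not measurements from an ESP32-CAM or from Edge Impulse.

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = person, 0 = no person)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # person correctly detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # empty frame correctly ignored
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed person
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def fps_from_latency_ms(latency_ms):
    """Frames per second achievable if each inference takes latency_ms."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Placeholder ground truth and predictions for the same eight test images.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    manual_model = [1, 1, 0, 0, 0, 1, 1, 0]
    edge_impulse_model = [1, 1, 1, 0, 1, 0, 1, 0]

    print("manual:      ", detection_metrics(truth, manual_model))
    print("edge impulse:", detection_metrics(truth, edge_impulse_model))
    # Placeholder latency; replace with the value timed on the actual board.
    print("fps at 250 ms per frame:", fps_from_latency_ms(250.0))
```

For an intruder-detection use case, recall (missed people) and precision (false alarms) are probably more informative than accuracy alone, so it helps to report all three for each workflow.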