Topic:
Best Approach for Custom Person Detection on ESP32-CAM — Manual Model or Edge Impulse?
Context/Use case:
Building a low-cost, compact computer-vision system on an ESP32-CAM to detect unauthorized individuals outside a lab. The goal is to keep the hardware minimal while still achieving reliable on-device person detection.
Details:
I’m planning to create an AI model that runs directly on the ESP32-CAM. During research, I found two main workflows:
- Manual workflow
  - Collect a dataset
  - Train a custom lightweight model
  - Optimize it (quantization, pruning, etc.)
  - Convert to TFLite (run on-device with ESP-NN-optimized kernels)
  - Deploy manually using ESP-IDF or Arduino
This gives full control but requires more expertise, especially in ML optimization and embedded deployment.
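For the manual route, the on-device inference part is where I'm least sure of myself, so here's a minimal sketch of what I think the deployment code looks like with TFLite Micro (Espressif's esp-tflite-micro component). Header paths and the MicroInterpreter constructor differ between versions, and person_detect_model_data, the arena size, and the op list are placeholders for whatever my own quantized model would actually need:

```cpp
// Minimal sketch, assuming the TFLite Micro / esp-tflite-micro API.
// Model array, arena size, and registered ops are placeholders.
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/micro_mutable_op_resolver.h"
#include "tensorflow/lite/schema/schema_generated.h"

extern const unsigned char person_detect_model_data[];  // int8 .tflite model compiled in as a C array

namespace {
constexpr int kTensorArenaSize = 100 * 1024;  // working memory; must be tuned to the model
uint8_t tensor_arena[kTensorArenaSize];
tflite::MicroInterpreter* interpreter = nullptr;
}

void setup_person_detector() {
  const tflite::Model* model = tflite::GetModel(person_detect_model_data);

  // Register only the ops the model actually uses, to save flash and RAM.
  static tflite::MicroMutableOpResolver<5> resolver;
  resolver.AddConv2D();
  resolver.AddDepthwiseConv2D();
  resolver.AddAveragePool2D();
  resolver.AddReshape();
  resolver.AddSoftmax();

  static tflite::MicroInterpreter static_interpreter(model, resolver, tensor_arena, kTensorArenaSize);
  interpreter = &static_interpreter;
  interpreter->AllocateTensors();
}

bool frame_contains_person(const uint8_t* grayscale_frame, int frame_len) {
  // Copy the camera frame (already resized to the model's input resolution) into the
  // input tensor, shifting uint8 pixels into the int8 range the quantized model expects.
  TfLiteTensor* input = interpreter->input(0);
  for (int i = 0; i < frame_len; ++i) {
    input->data.int8[i] = static_cast<int8_t>(grayscale_frame[i] - 128);
  }

  if (interpreter->Invoke() != kTfLiteOk) return false;

  // For a two-class person / no-person model, the output is two int8 scores.
  TfLiteTensor* output = interpreter->output(0);
  return output->data.int8[1] > output->data.int8[0];  // person score vs. no-person score
}
```

Assuming this is roughly right, the parts that worry me are sizing the tensor arena and listing the ops by hand, which is exactly what the second workflow claims to automate.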
- Automated workflow using Edge Impulse
  - Upload images (dataset)
  - Use built-in training pipelines
  - Auto-optimize for microcontrollers
  - Export as an ESP32-ready library
This drastically simplifies dataset management, training, optimization, and deployment, which is especially useful for beginners or early prototyping.
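For comparison, here's a minimal sketch of how I understand the exported Arduino library would be used, based on Edge Impulse's examples. The header name is generated from the project name, so lab_person_detection_inferencing.h is a placeholder, and I'm assuming a plain classification impulse; a FOMO object-detection impulse would report bounding boxes instead of per-label scores:

```cpp
// Minimal sketch, assuming an Edge Impulse Arduino export for a classification impulse.
// The include below is a placeholder generated from the project name.
#include <lab_person_detection_inferencing.h>
#include <string.h>

// Feature buffer the camera code fills with one frame, already resized to the
// impulse's input resolution and converted to the format the DSP block expects.
static float features[EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT];

// Callback the SDK uses to pull feature data in chunks.
static int get_feature_data(size_t offset, size_t length, float* out_ptr) {
  memcpy(out_ptr, features + offset, length * sizeof(float));
  return 0;
}

void classify_frame() {
  signal_t signal;
  signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT;
  signal.get_data = &get_feature_data;

  ei_impulse_result_t result = { 0 };
  EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false /* debug */);
  if (err != EI_IMPULSE_OK) {
    return;
  }

  // Each trained label gets a confidence score between 0 and 1.
  for (size_t i = 0; i < EI_CLASSIFIER_LABEL_COUNT; ++i) {
    ei_printf("%s: %.2f\r\n", result.classification[i].label,
              result.classification[i].value);
  }
}
```

If I read the examples correctly, filling the features buffer from the camera frame (resizing and pixel-format conversion) is still my own code, but everything from run_classifier() onward comes from the export.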
I want to understand which approach makes more sense for a practical project where I need custom detection but also fast development without too many complications.
Questions:
- For an ESP32-CAM, which approach is more practical: manual training or using Edge Impulse?
- How much accuracy difference can I expect between the two methods?
- Does Edge Impulse handle model optimization well enough for real-time inference on the ESP32-CAM?