Question/Issue:
I trained an image classification model in Edge Impulse and got excellent results during training and testing. However, once deployed to the ESP32-CAM, real-world accuracy is poor: the model consistently misclassifies images, almost always outputting "rottenbanana" regardless of the actual class. I tried showing it fresh bananas, other physical items, and pictures, and the issue persisted every time.
Project ID:
562986
Context/Use case:
- Dataset: Fresh and rotten fruits dataset from Kaggle (dataset link).
- Model architecture: MobileNetV2 with 96x96 RGB input images, using transfer learning.
- Neural Network settings:
  - Input layer: 27,648 features (96x96x3).
  - Transfer learning with 8 final neurons and a dropout rate of 0.1.
  - Validation accuracy: ~86.93%.
  - Optimizer and learning rate left on Edge Impulse's automatic defaults, with data augmentation enabled.
- Deployment:
  - ESP32-CAM (AI Thinker) running the Edge Impulse Arduino library export.
  - Image preprocessing resizes frames to 96x96 and converts them to RGB (see the sketch after this list).
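For reference, this is roughly the capture-and-preprocess path my sketch follows. It is a minimal sketch, assuming the stock esp32-camera driver (`esp_camera.h`, `img_converters.h`) and the Edge Impulse SDK's RGB888 crop/resize helper; the QVGA raw resolution and buffer sizes are assumptions from my config, not requirements:

```cpp
// Capture -> RGB888 -> 96x96 -> packed-float path (sketch, not verbatim).
#include "esp_camera.h"
#include "img_converters.h"                      // fmt2rgb888()
#include "edge-impulse-sdk/dsp/image/image.hpp"  // crop_and_interpolate_rgb888()

static uint8_t rgb888_buf[320 * 240 * 3];  // raw frame, QVGA assumed
static uint8_t resized_buf[96 * 96 * 3];   // impulse input resolution

bool capture_and_preprocess() {
    camera_fb_t *fb = esp_camera_fb_get();
    if (!fb) return false;

    // Convert whatever the sensor delivers (JPEG/RGB565/...) to 24-bit color.
    bool ok = fmt2rgb888(fb->buf, fb->len, fb->format, rgb888_buf);
    esp_camera_fb_return(fb);
    if (!ok) return false;

    // Crop/scale the raw frame down to the impulse's 96x96 input.
    ei::image::processing::crop_and_interpolate_rgb888(
        rgb888_buf, 320, 240, resized_buf, 96, 96);
    return true;
}

// Edge Impulse image models expect each pixel packed as 0xRRGGBB in a float.
// NOTE: some esp32-camera versions emit B,G,R byte order from fmt2rgb888
// despite the name; if colors come out swapped, exchange the <<16 byte and
// the last byte here.
int ei_camera_get_data(size_t offset, size_t length, float *out_ptr) {
    size_t ix = offset * 3;  // 3 bytes per pixel in resized_buf
    for (size_t i = 0; i < length; i++) {
        out_ptr[i] = (resized_buf[ix] << 16) + (resized_buf[ix + 1] << 8) +
                     resized_buf[ix + 2];
        ix += 3;
    }
    return 0;
}
```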
Steps Taken:
- Collected and split data (training/testing).
- Resized images to 96x96 and converted them to RGB in Edge Impulse.
- Trained the model with MobileNetV2 transfer learning in Edge Impulse.
- Deployed the model using Edge Impulse’s Arduino library.
- Tested the deployed model on live camera input from the ESP32-CAM (the inference loop is sketched below).
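The inference call itself follows the generated library's standard pattern. In the sketch below, `fruit_classifier_inferencing.h` is a hypothetical stand-in for my project's generated header, and `capture_and_preprocess()` / `ei_camera_get_data()` come from the earlier sketch:

```cpp
#include <fruit_classifier_inferencing.h>  // hypothetical generated header name

void classify_frame() {
    if (!capture_and_preprocess()) return;  // from the earlier sketch

    // An image signal carries one packed-pixel float per pixel.
    ei::signal_t signal;
    signal.total_length = EI_CLASSIFIER_INPUT_WIDTH * EI_CLASSIFIER_INPUT_HEIGHT;
    signal.get_data = &ei_camera_get_data;

    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false /* debug */);
    if (err != EI_IMPULSE_OK) return;

    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        ei_printf("    %s: %.5f\n", result.classification[ix].label,
                  result.classification[ix].value);
    }
}
```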
Expected Outcome:
The deployed model should classify images accurately, with performance roughly in line with the ~86.93% validation accuracy observed during testing.
Actual Outcome:
The deployed model frequently misclassifies images, almost always labeling them "rottenbanana". There is a significant gap between Edge Impulse's test results and real-world performance: in effect, the model is heavily biased (a high false-positive rate) toward the class that performed best in my confusion matrix.
Reproducibility:
- [X] Always
- [ ] Sometimes
- [ ] Rarely
Environment:
- Platform: ESP32-CAM (AI Thinker)
- Build Environment Details: Arduino IDE 2.3.3 with ESP32 Core for Arduino (latest version)
- OS Version: macOS (latest)
- Edge Impulse Version: [Verify using Edge Impulse versioning instructions]
- Edge Impulse CLI Version: v1.30.0
- Project Version: 1.0.0
- Custom Blocks / Impulse Configuration: MobileNetV2 with transfer learning. RGB 96x96 input.
Logs/Attachments:
Two consecutive inferences, both saturated on the same class:

```
Predictions (DSP: 8 ms., Classification: 620 ms., Anomaly: 0 ms.):
Predictions:
freshapple: 0.00000
freshbanana: 0.00000
freshoranges: 0.00000
rottenapple: 0.00000
rottenbanana: 0.99609
rottenoranges: 0.00000
Predictions (DSP: 8 ms., Classification: 620 ms., Anomaly: 0 ms.):
Predictions:
freshapple: 0.00000
freshbanana: 0.00000
freshoranges: 0.00000
rottenapple: 0.00000
rottenbanana: 0.99609
rottenoranges: 0.00000
```
Additional Information:
The exported library code was used directly in the ESP32-CAM sketch. I suspect the camera preprocessing or the model's quantization might be the issue, but I need guidance to debug further; one isolation test I plan to run is sketched below.
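To separate the model from the camera path, I was going to run the standard static-buffer check: copy the "Raw features" of a known test image from Edge Impulse Studio into the sketch and classify that buffer on-device. A minimal sketch (again, `fruit_classifier_inferencing.h` is a hypothetical header name):

```cpp
// Static-buffer sanity check: paste the "Raw features" of a known test image
// from Edge Impulse Studio into `features` and classify it on-device.
#include <fruit_classifier_inferencing.h>  // hypothetical generated header name
#include <cstring>                         // memcpy

static const float features[] = {
    // paste raw features from Studio here
};

int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, features + offset, length * sizeof(float));
    return 0;
}

void test_static_buffer() {
    ei::signal_t signal;
    signal.total_length = sizeof(features) / sizeof(features[0]);
    signal.get_data = &raw_feature_get_data;

    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR err = run_classifier(&signal, &result, false);
    if (err != EI_IMPULSE_OK) return;

    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        ei_printf("    %s: %.5f\n", result.classification[ix].label,
                  result.classification[ix].value);
    }
}
```

If the on-device result matches Studio's prediction for the same image, the model and its quantization are fine and the camera/preprocessing path becomes the prime suspect.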
Request for Assistance:
- Suggestions for debugging the model in real-world deployment on ESP32-CAM.
- Best practices to improve alignment between Edge Impulse testing results and deployed model performance.
- Recommendations for testing camera input preprocessing on the ESP32-CAM (one check I had in mind is sketched below).
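On that last point, the check I had in mind was to dump the first few packed pixels over serial and compare them by eye against the "Raw features" Studio shows for a similar scene; constant or wildly off values would implicate the camera path. A sketch, reusing the `ei_camera_get_data()` callback from above:

```cpp
// Print the first few packed pixel values (0xRRGGBB each) for comparison
// against Studio's "Raw features" for a similar scene.
void dump_first_pixels() {
    float px[8];
    ei_camera_get_data(0, 8, px);  // callback from the earlier sketch
    for (size_t i = 0; i < 8; i++) {
        ei_printf("0x%06X ", (unsigned int)px[i]);
    }
    ei_printf("\n");
}
```

Passing `true` as the third argument to `run_classifier()` should also print extra debug output over serial, which may help here.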
Thank you for your help!
Edit: Originally, compilation failed in two files, conv.cpp and depthwise_conv.cpp, with the same error in a structure initializer. Here is one of the errors:

```
conv.cpp:1795:80: error: either all initializer clauses should be designated or none of them should be
 1795 | data_dims_t filter_dims = {.width = filter_width, .height = filter_height, 0, 0};
```

I patched it myself, but let me know if there is a more proper way of fixing it based on the data_dims_t structure.
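For reference, the workaround I applied: C++ forbids mixing designated and positional initializers, so either every member gets designated or none does. Assuming data_dims_t declares width, height, and then two trailing members named channels and extension (my assumption; verify against the header where data_dims_t is actually defined), the fix looks like:

```cpp
// Option 1: designate every member (trailing member names assumed; check the
// actual data_dims_t definition in the library headers).
data_dims_t filter_dims = {.width = filter_width, .height = filter_height,
                           .channels = 0, .extension = 0};

// Option 2: fully positional, which sidesteps the rule entirely and relies
// only on member order (width, height, then the two trailing fields):
// data_dims_t filter_dims = {filter_width, filter_height, 0, 0};
```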