Multiple Classes

I am using an OpenMV H7 Plus and am currently following their tutorials and yours to create a trained data set that can be deployed onto the OpenMV H7 or some other processor.

I am working on a mobile robot avoidance project and would like to identify people, tree trunks, grass and fallen branches. Later on I would expand the classes to animals, children etc. if successful.

My approach would be to set up 4 classes, take say 100 pictures of each class then use your platform to do the rest.

I have noticed that in your tutorials you only use 2 classes (coffee/lamp or lamp/plant). I am still not quite sure of the difference between the 2 tutorials. Can you please explain?
Is the difference due to the labelling? i.e. in the lamp/coffee tutorial you had both objects in the same image.

For each class I would take pictures at different distances from the object, and at different angles, times of day, times of year and lighting conditions. My default is grass, i.e. when I only see grass I am happy, and when I see trees, branches or people I need to take some action. I am not sure what happens when we get grass and a tree, or a human and grass.

Bearing in mind the above, what should I do to maximise my chances of success?

Should I use a different deployment processor, like the NVIDIA boards?
I may also use my mobile phone. Can I mix OpenMV and phone data sets?
I do know that I can add more images later on.
Also, should I take pictures of multiple objects together, like people, grass and a tree?

Can you recommend any reading? I am not interested in the nuts and bolts, just the implementation.

regards and thanks for your great efforts


Hello @Mdbirley,

These are 2 different tutorials. The lamp/plant tutorial is for classical image classification, where there is only one object in the viewpoint of the camera, and you want your machine learning model to identify what is in the image (a lamp or a plant, or unknown). See the tutorial here:

The coffee/lamp project is an object detection project. Object detection projects allow you to define the bounding box region of where an object that you want to identify is located in an image. Then after training your model, you can use the object detection model to determine which objects are predicted to be in view of your camera, and where they are approximately located in the frame. See the tutorial here:

For starting off with the OpenMV H7 Plus, I recommend building a classical image classification project first rather than object detection. Image classification projects require your training data and testing data images to be labeled with exactly one label (the object in the image). So take around 100 pictures of each of your 4 objects, upload them to your Edge Impulse project, and then use the Data Acquisition tab to label them accordingly. (Or if you label them locally on your computer, Edge Impulse can automatically infer what the label of the image should be from the filename, i.e. “Cat.jpg” would be labeled “Cat” in Edge Impulse.)
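
As a rough illustration of the filename-based labeling mentioned above, here is a minimal sketch. It assumes the label is whatever comes before the first dot in the file name; `infer_label` is a hypothetical helper, not Edge Impulse's actual code, and the numbered file name is only an example naming scheme.

```python
from pathlib import PurePath


def infer_label(filename):
    """Hypothetical helper: use the file name up to the first dot as the label."""
    # Assumption: "Cat.jpg" -> "Cat", and a numbered name such as
    # "Trunks.01.jpg" -> "Trunks". Any folder part of the path is ignored.
    return PurePath(filename).name.split(".", 1)[0]


print(infer_label("Cat.jpg"))
print(infer_label("photos/Trunks.01.jpg"))
```

Naming files this way before uploading may save relabeling each image by hand in the Data Acquisition tab.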

For the project that I just described, you will upload your labeled images for each of your 4 classes (plus images that contain none of the 4 class objects, labeled “unknown”). Once you train and deploy your model, you get a prediction score for how confident the model is that each of your known objects is present in the image. If it is not confident in any of them, the “unknown” score will be high.
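
To make the idea of prediction scores concrete, here is a minimal sketch of how the scores might be turned into a single decision on the robot. The class names, the example scores and the 0.6 threshold are all assumptions for illustration; `pick_label` is a hypothetical helper, not part of any SDK.

```python
def pick_label(scores, threshold=0.6):
    """Return (label, score) for the top class, or 'unknown' if confidence is low."""
    # scores: dict mapping class name -> confidence (assumed to be in 0..1)
    label, score = max(scores.items(), key=lambda item: item[1])
    if label == "unknown" or score < threshold:
        return "unknown", score
    return label, score


# Made-up scores for one camera frame
frame_scores = {"people": 0.05, "trunks": 0.85, "grass": 0.04,
                "branches": 0.02, "unknown": 0.04}
print(pick_label(frame_scores))  # ('trunks', 0.85)
```

A higher threshold makes the robot more cautious about acting on a weak prediction; the right value would need tuning on real outdoor images.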

Also, yes you can use any deployment option for image classification as long as the target device has a camera, like a mobile phone, NVIDIA Jetson with webcam, etc.

Please let me know if this answers your questions!

– Jenny

Hi Jenny
A comprehensive response. Very professional. I have just completed a small classical image classification project with some Trees around the house. They are classified as Trunks, Clear and unsure. I will change unsure to unknown. Clear are pictures of the grassy ground that do not pose a threat to the mobile robot.

I will leave the other classes (dogs humans etc) until I have a better grasp of the concepts.
In fact I may end up using two OpenMV H7 Plus cameras: one with their people detection model, as it works so well, and on the other camera I will build a model for Trunks and Clear etc. and allow them both to have an impact on the robot. As they will be focussed on different parts of the problem, I will get to know which is working more efficiently.

The training already has a success rate of 90% and I only took 30 pictures.
Now that I know it all works, I will add many more photos to the project to improve the training.

Currently we have all the trees located with RTK GPS, so we know their locations to within 2 cm. We also use a 360 Lidar. I am playing with this camera technology to see if we can add to our current setup and get better results.

I am enclosing a picture of the area and trees that I need to navigate. As you can see, there are many Trunks in the frame. So would you agree that an object detection method may in the long run be the best approach?

once again great support


Really great to hear you already have a model up and running!

Please do keep us updated on the progress of your project. :slight_smile:

– Jenny

The success rate I was referring to in the last post was when using the test photos indoors to verify that all was working. This is when I sent the last update. Yes, I was getting 90%, albeit on a very small data set. So pretty excited.

The following morning, when I took my laptop and OpenMV H7 camera outside, the system did not recognise trees with much repeatability. I was quite surprised, as I was expecting something more than I got. I am sure I have messed up somewhere.

Originally I took 2 small classes of photos, “Trunks” and “Clear”. The aim was to get some idea of performance, to learn the process of data collection and then take a lot more pictures to build a better model. I guess this is the lazy man's approach.

The picture I uploaded earlier was more of a long range Trunks picture. It shows many Trunks in the frame.

Due to restriction I can only post 1 picture here
The more typical Trunk will be posted separately

The Clear pictures are taken of the ground when it is clear and the robot can proceed.

The earlier photos were all acquired on an Apple iPhone, not using your QR code, as that is the easiest way to collect the pictures quickly. The outdoor test to verify the work was done with the OpenMV H7 Plus.

Clear photos

What could be causing the difference in recognition between the test verification data set and the real data taken live outside?

  • Strong shadows of the trees
  • Colour vs grayscale
  • Landscape vs portrait
  • Height above ground at which the pictures of the Trunks are being taken
  • Distance away from the trunks
  • Differences in the pictures generated by the iPhone and the OpenMV H7
  • The pictures containing multiple trees

I am very impressed with the person recognition example from the MV7 examples. I have used this indoors to great success. Based on this I thought I could build a relatively accurate Trunk detection system for the outside.

From what I have told you do you think I can do this? (with a little help)

Also, when I detect the Trunk I am eventually going to need to know where it is in the frame so I can plan to take some action.

Does this mean I will eventually need to move to an object detection model?
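
To illustrate what "where it is in the frame" could look like in practice: an object detection model reports a bounding box per detected object, and a rough steering zone can be derived from it. This is a minimal sketch, assuming a hypothetical `(x, y, w, h)` box in pixels and a 320-pixel-wide frame; `horizontal_zone` is an illustrative helper, not an OpenMV or Edge Impulse function.

```python
def horizontal_zone(bbox, frame_width):
    """Classify a bounding box as 'left', 'centre' or 'right' of the frame."""
    x, y, w, h = bbox          # top-left corner plus width and height, in pixels
    centre_x = x + w / 2       # horizontal centre of the box
    third = frame_width / 3
    if centre_x < third:
        return "left"
    if centre_x < 2 * third:
        return "centre"
    return "right"


# Made-up detections in a 320-pixel-wide frame
print(horizontal_zone((10, 0, 40, 100), 320))    # left
print(horizontal_zone((120, 60, 80, 150), 320))  # centre
print(horizontal_zone((200, 60, 80, 150), 320))  # right
```

A plain image classifier only says that a Trunk is somewhere in the image, so it cannot feed a function like this.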

Sorry for all the questions

I am posting here the code we use on the OpenMV H7 so you can see what we are using.


More typical Trunks photos

I have been adding more data to the model, both Clear and Trunks shots, acquired on an iPhone using your capture system. I seem to be getting 2 populations that each contain both Clear and Trunks shots.
Can you guess what is happening here?



Hello @Mdbirley,

That very much depends on your dataset.
I have not checked your data, but to give you an example, maybe some images in both categories contain a larger blue area (the sky) and the others do not.

You can click on two points that are next to each other in the 3D feature explorer to better understand what is happening. But don’t stop at the feature explorer view; your NN may still learn well even in this case.
I can see more green dots on the right part of your two clusters than on the left part (mostly blue).
If I can distinguish that, a NN will likely detect it too :smiley:

If you are not happy with the results, there are several options:

  • Add more data
  • Optimize the NN architecture

Also, you may want to apply some custom pre-processing to your images.
You are using the OpenMV H7, right?
If I remember correctly, it runs MicroPython and its image library offers some OpenCV-like functions, so it should be easy to replicate the pre-processing on the device side.

Finally, keep in mind that the OpenMV cam and your iPhone capture the images differently (color, contrast, quality, etc…).




Great response. I will check out what you say and go from there.
once again thanks for the helpful response

All the images in my data set that were collected with an iPhone, i.e. just taking snaps, have black borders down each side, and the raw features are all 0x0. The processed features are 0.0000.

Some images in the data set were acquired with the iPhone but using your software, and they do have proper hex values like 0x5d616a etc. However, the processed image appears on its side.
see below

By the way, what is a DSP result? What does DSP mean?

I guess that I need to trash these captures and only use the OpenMV H7 Plus, as that will be the camera I will use when deployed.
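
For reference on the raw feature values mentioned above, here is a minimal sketch of how a value like 0x5d616a can be read. It assumes each raw image feature is one pixel packed as 0xRRGGBB; `unpack_rgb` is a hypothetical helper written for this post.

```python
def unpack_rgb(value):
    """Split a packed 0xRRGGBB integer into (red, green, blue) components."""
    red = (value >> 16) & 0xFF
    green = (value >> 8) & 0xFF
    blue = value & 0xFF
    return red, green, blue


print(unpack_rgb(0x5d616a))  # (93, 97, 106)
print(unpack_rgb(0x0))       # (0, 0, 0)
```

Under that assumption, a raw feature of 0x0 is a pure black pixel.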


I started a new project for object classification. It has the following classes: pens, coins and unknown. I took 50, 50 and 100 photos respectively on the OpenMV H7 Plus. As in the mowing project, the sum of the outcomes in the serial terminal is about 2.9. In your demos they add up to 1.0. I believe yours and not mine.
I have watched your video on this closely and followed the instructions.
Can you please let me know what I am doing wrong?
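
For context on why scores in a demo would add up to 1.0: classification models commonly end in a softmax step, which normalises the raw scores into probabilities. The sketch below shows that normalisation with made-up numbers. It is only an illustration of the expected behaviour and does not diagnose the 2.9 total reported above.

```python
import math


def softmax(raw_scores):
    """Normalise a list of raw scores into probabilities that sum to 1."""
    peak = max(raw_scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up raw scores for pens, coins and unknown
probabilities = softmax([2.0, 0.5, 0.4])
print(round(sum(probabilities), 6))  # 1.0
```

Summing the values printed in the serial terminal for a single frame is a quick way to compare against this.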