Hi, I am using a combination of different sensors as input to a model that needs to run very quickly (within 1 ms). The target device is a BeagleBone Black single-board computer running Debian GNU/Linux 10 (buster). There is already a large C++ codebase that reads the sensor data and uses it to control a robot, and I’m looking for feedback or examples of the best way to incorporate a model trained on the Edge Impulse website into this project.
The pipeline for collecting and logging sensor data already exists in the C++ codebase on the BeagleBone, so I used that pipeline, uploaded the data files to Edge Impulse Studio, and trained a model. The Studio estimates the inference can run in 1 ms (my target speed). I’m not sure whether the next few integration steps I took are the ideal ones for implementing the model in my project on the BeagleBone and testing it in real time:
I exported the model as a C++ library. I also looked into selecting the “Linux Board” deployment option, but wasn’t sure which to pick. Which of these options is recommended given my hardware and my goal of integrating the model into a pre-existing codebase?
I then compiled the C++ library on the BeagleBone, following the steps here: As a generic C++ library - Edge Impulse Documentation. I slightly modified it so that it reads a data file containing the input data, and the compiled app then outputs the model results. One thing that is not ideal about this step is that compilation takes a very long time. Many of the files included in the library are ML operations that my model doesn’t use (e.g., depthwise conv takes a long time to compile, but I’m not using it). Is there a way to speed this up, or a different method of compiling it?
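For reference, my modified test harness looks roughly like this (simplified; the file format and output layout are specific to my setup, but it goes through the standard signal_t / run_classifier interface from the exported library):

```cpp
#include <cstdio>
#include <cstring>
#include <vector>
#include "edge-impulse-sdk/classifier/ei_run_classifier.h"

// Raw features read from the input data file (one float per line in my case).
static std::vector<float> features;

// Callback the SDK uses to pull feature data into its own buffer.
static int raw_feature_get_data(size_t offset, size_t length, float *out_ptr) {
    memcpy(out_ptr, features.data() + offset, length * sizeof(float));
    return 0;
}

int main(int argc, char **argv) {
    if (argc < 2) {
        printf("usage: %s <data file>\n", argv[0]);
        return 1;
    }

    // Load the logged sensor data from the file passed on the command line.
    FILE *f = fopen(argv[1], "r");
    float v;
    while (fscanf(f, "%f", &v) == 1) {
        features.push_back(v);
    }
    fclose(f);

    // Wrap the buffer in a signal_t and run one inference.
    signal_t signal;
    signal.total_length = features.size();
    signal.get_data = &raw_feature_get_data;

    ei_impulse_result_t result = { 0 };
    EI_IMPULSE_ERROR res = run_classifier(&signal, &result, false);
    if (res != EI_IMPULSE_OK) {
        printf("run_classifier failed (%d)\n", res);
        return 1;
    }

    // Print timing and per-class scores so the calling process can read them.
    printf("timing: dsp %d ms, classification %d ms\n",
           result.timing.dsp, result.timing.classification);
    for (size_t ix = 0; ix < EI_CLASSIFIER_LABEL_COUNT; ix++) {
        printf("%s: %.5f\n", result.classification[ix].label,
               result.classification[ix].value);
    }
    return 0;
}
```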
Finally, I call the compiled app from the pre-existing C++ codebase. What I like about this implementation is that I do not need to recompile the model together with the C++ codebase: I can test the model on its own with data I’ve collected before, or run it with real-time data. However, I suspect that calling the compiled model as a separate program and reading its output adds extra time to each inference. Is there a better way to integrate the model into the C++ codebase?
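Concretely, the call from the main codebase currently looks something like this (the binary name and the way I parse the output are placeholders; the point is that every inference spawns a new process and goes through stdout):

```cpp
#include <cstdio>
#include <string>
#include <stdexcept>

// Run one inference by spawning the standalone Edge Impulse app and
// capturing whatever it prints. Process startup plus file I/O happens
// on every call, which is the overhead I am worried about.
std::string run_inference(const std::string &data_file) {
    std::string cmd = "./edge_impulse_app " + data_file;  // placeholder binary name
    FILE *pipe = popen(cmd.c_str(), "r");
    if (!pipe) {
        throw std::runtime_error("failed to launch inference app");
    }

    std::string output;
    char buf[256];
    while (fgets(buf, sizeof(buf), pipe) != nullptr) {
        output += buf;  // collect the printed timing and class scores
    }
    pclose(pipe);
    return output;  // parsed elsewhere in the control loop
}
```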
Any feedback on my implementation would be greatly appreciated. Currently, the model takes about 500 ms per inference when deployed this way on the BeagleBone, far from the 1 ms estimate in the Studio. The .eim model seems like an alternative, but I’m not sure where to start or whether it would solve the issues I’ve described above.