Title: Architecture advice: Building a custom lightweight VIO/SLAM pipeline for OAK-D on Raspberry Pi (Alternative to Spectacular AI / VINS)
Background: Hi everyone, I’m a software team member for a university underwater ROV project. We are using the OAK-D for real-time 3D mapping and state estimation, and our main compute unit is a Raspberry Pi.
My Journey So Far: I’ve gone through a few iterations trying to get a robust pipeline working, and here is where I’m currently at:
- The Starting Point (Spectacular AI): I initially started building the pipeline using Spectacular AI. It worked beautifully and gave great results out of the box. However, when it came time to deploy on our actual ROV hardware (the Raspberry Pi), I discovered that their SDK requires a commercial license for ARM processors. Since we are a student team, I had to pivot.
- The Open-Source Pivot (VINS-Fusion + ROS 2): Looking for alternatives, I transitioned the stack to use VINS-Fusion integrated with ROS 2. While I successfully got the pipeline up and running, the overhead of ROS 2 combined with VINS-Fusion resulted in very high latency on the Raspberry Pi—too high for our real-time requirements.
- Current Goal (The Custom Build): Because of the ARM licensing constraints of Spectacular AI and the latency issues with VINS-Fusion/ROS 2, I have decided to build a lightweight, custom VIO and 3D reconstruction pipeline from scratch. The goal is to create something similar to Spectacular AI in terms of performance, but heavily optimized for the OAK-D and the Pi’s resource constraints, running entirely in C++ or Python without the heavy middleware.
What I’m Looking For: Since I am architecting this from the ground up, I’m looking for advice from anyone who has tackled a similar custom build. Specifically:
- Is there any open-source software similar to Spectacular AI that can use the OAK-D with a Raspberry Pi, so that I don’t have to build this from scratch?
- Are there specific lightweight backend optimization libraries (like GTSAM or Ceres) you recommend for a Pi?
- What is the best approach to tightly couple the OAK-D’s high-frequency IMU data with the stereo frames without introducing latency?
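To make the last question concrete, here is a minimal sketch of the kind of timestamp-based pairing I have in mind. It assumes IMU samples and stereo frames carry timestamps from the same device clock and that IMU samples arrive in time order. It is plain Python and not tied to the DepthAI API; `ImuSample` and `ImuFrameSynchronizer` are hypothetical names used only for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple


@dataclass
class ImuSample:
    t: float                           # seconds, assumed to share a clock with the frames
    gyro: Tuple[float, float, float]   # rad/s
    accel: Tuple[float, float, float]  # m/s^2


class ImuFrameSynchronizer:
    """Buffers IMU samples and hands out the batch belonging to each frame interval."""

    def __init__(self) -> None:
        self._buf: Deque[ImuSample] = deque()

    def add_imu(self, sample: ImuSample) -> None:
        # Assumption: samples are appended in increasing timestamp order.
        self._buf.append(sample)

    def has_imu_past(self, frame_t: float) -> bool:
        # True once an IMU sample at or after the frame time has arrived,
        # i.e. the interval up to this frame is fully covered.
        return bool(self._buf) and self._buf[-1].t >= frame_t

    def on_frame(self, frame_t: float) -> List[ImuSample]:
        # Pop every buffered sample with timestamp <= frame_t. Because earlier
        # frames already consumed their samples, this is the batch between
        # the previous frame and this one.
        batch: List[ImuSample] = []
        while self._buf and self._buf[0].t <= frame_t:
            batch.append(self._buf.popleft())
        return batch


if __name__ == "__main__":
    sync = ImuFrameSynchronizer()
    for t in (0.0, 0.005, 0.010, 0.015):
        sync.add_imu(ImuSample(t, (0.0, 0.0, 0.0), (0.0, 0.0, 9.81)))
    print(len(sync.on_frame(0.010)))
```

In this sketch a frame would only be processed once `has_imu_past(frame_t)` is true, so the batch handed to preintegration is complete. The part I am unsure about is whether that wait adds noticeable latency on the Pi.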
Any architectural advice, recommended libraries, or insights on avoiding common pitfalls would be massively appreciated!