Two New Papers Want to Fix Drone Navigation's Dirty Secret: Onboard Compute Is the Bottleneck
A pair of arXiv preprints tackle the same core problem from different angles: how do you do real-time, safe obstacle avoidance when your drone has the compute budget of a Raspberry Pi?
Drone navigation research has a credibility problem. Most of it looks great in simulation and falls apart the moment you strap a sensor suite to something that actually has to fly. Two new preprints posted to arXiv this week are at least honest about that constraint, and they're worth paying attention to because of it.
Both papers target the same fundamental bottleneck: the gap between what modern perception systems can see and what an onboard processor can actually do with that information in real time. One proposes compressing 3D scene representations down to something a microcontroller can reason over. The other rethinks how a drone decides where to look in the first place. Neither is a finished product. But the underlying ideas are more grounded than a lot of what crosses my desk.
Start with the PolyMerge paper on arXiv, from a team targeting resource-constrained navigation. The problem they're solving is real and well known to anyone who's worked in hardware: 3D Gaussian Splatting (3DGS), the photorealistic scene reconstruction method that's gotten a lot of attention over the past two years, produces models that are simply too large and compute-hungry to run on a drone's onboard processor during flight.
PolyMerge's answer is to convert a 3DGS scene model into a set of convex polytopes, basically geometric shapes that together over-approximate every obstacle in the original model. The key word there is "over-approximate." The system is designed to be conservative: it would rather tell the drone that an obstacle is slightly larger than it is than miss part of it. That's the right engineering instinct. In obstacle avoidance, false negatives are catastrophic; false positives just cost you a bit of efficiency.
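To make "over-approximate" concrete, here is a minimal Python sketch. It is not PolyMerge's algorithm. It assumes the obstacle arrives as a handful of 3D sample points and wraps them in an inflated axis-aligned box, which is about the simplest convex polytope there is. The names `Polytope` and `bounding_box_polytope`, the `margin` parameter, and the sample coordinates are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Polytope:
    """Convex polytope {x : n_i . x <= b_i for every face i}."""
    normals: list   # one outward normal per face
    offsets: list   # one offset b_i per face

    def contains(self, p):
        # A point is inside only if it satisfies every halfspace.
        return all(
            sum(n_k * p_k for n_k, p_k in zip(n, p)) <= b + 1e-9
            for n, b in zip(self.normals, self.offsets)
        )


def bounding_box_polytope(points, margin):
    """Hypothetical stand-in: an axis-aligned box around the points,
    grown by `margin` on every side so it errs toward being too big."""
    dim = len(points[0])
    normals, offsets = [], []
    for k in range(dim):
        lo = min(p[k] for p in points) - margin
        hi = max(p[k] for p in points) + margin
        e = [0.0] * dim
        e[k] = 1.0
        normals.append(tuple(e))                  # face:  x_k <= hi
        offsets.append(hi)
        normals.append(tuple(-v for v in e))      # face: -x_k <= -lo
        offsets.append(-lo)
    return Polytope(normals, offsets)


# Made-up obstacle samples (metres).
obstacle_samples = [(1.0, 0.2, 0.5), (1.4, -0.1, 0.7), (1.2, 0.3, 0.4)]
box = bounding_box_polytope(obstacle_samples, margin=0.1)

# Every sample is covered: no false negatives.
covered = all(box.contains(p) for p in obstacle_samples)
# A nearby point that is not one of the samples is flagged too: a false positive.
false_positive = box.contains((1.45, 0.35, 0.75))
```

The box shows which direction the error goes: every obstacle sample lands inside, and a little free space gets flagged along with it. A tighter polytope wastes less free space but needs more faces, and each face is one more inequality to evaluate.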
They tested this on a Crazyflie drone, which is a 27-gram quadrotor with severe onboard compute constraints. That's a meaningful hardware choice. I've seen enough spec sheets to know that a lot of navigation research quietly runs its "onboard" computation on a tethered laptop or a ground station. Running on a Crazyflie is a genuine constraint, not a marketing one.
The system integrates with control barrier functions (CBFs) to plan collision-free paths, and the paper claims it outperforms baselines in speed while maintaining safety guarantees. The abstract doesn't give the exact performance margins, so you'll have to go to the paper itself for the numbers. That's an important caveat: "outperforming baselines" covers a wide range of actual improvement, and it's too early to say how this holds up in more complex environments.
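For readers who haven't met control barrier functions, the idea fits in a few lines. The sketch below is a toy version of my own, not the paper's controller. It assumes a drone modeled as a single integrator (you command velocity directly), one box-shaped obstacle given as halfspaces `n . x <= b`, and a barrier built from whichever face currently separates the drone from the box. `cbf_filter`, `alpha`, and the example numbers are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cbf_filter(x, u_nom, normals, offsets, alpha=2.0):
    """Minimally modify the nominal velocity u_nom so the drone at x
    stays outside the convex obstacle {p : n_i . p <= b_i for all i}.

    Assumes single-integrator dynamics x_dot = u (an illustrative choice).
    """
    # h_i(x) = n_i . x - b_i is positive when x is on the outer side of face i.
    h_vals = [dot(n, x) - b for n, b in zip(normals, offsets)]
    # Use the face that separates the drone from the obstacle the most.
    i = max(range(len(h_vals)), key=h_vals.__getitem__)
    n, h = normals[i], h_vals[i]

    # CBF condition for x_dot = u:  n . u >= -alpha * h
    slack = dot(n, u_nom) + alpha * h
    if slack >= 0.0:
        return tuple(u_nom)          # nominal command is already safe

    # Closest velocity to u_nom that satisfies the condition with equality.
    scale = -slack / dot(n, n)
    return tuple(u + scale * c for u, c in zip(u_nom, n))


# A box obstacle spanning x in [1, 2], y in [-0.5, 0.5].
normals = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
offsets = [2.0, -1.0, 0.5, 0.5]

# Drone half a metre in front of the box, commanded straight at it at 2 m/s.
safe_u = cbf_filter((0.5, 0.0), (2.0, 0.0), normals, offsets)
```

With these numbers the filter cuts the approach speed from 2 m/s to 1 m/s, and it keeps cutting as the gap closes. Each face of each polytope contributes one inequality like this, which is why the polytope count matters so much on a 27-gram airframe.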
Key claims from PolyMerge:
Converts large 3DGS scene models into lightweight convex polytope representations
Provably over-approximates all obstacles in the original model (a formal safety guarantee, not just empirical)
Tunes polytope count to trade off conservativeness against compute cost
Integrates with CBFs for collision-free path planning
Demonstrated on Crazyflie hardware in real time, not just simulation
Code and videos available at the project website
The second paper, also on arXiv, introduces FLAP (FOV-Constrained Active Perception Planning) and attacks an adjacent problem. Most drone planners treat perception as passive: the sensors see what they see, and the planner works with that. FLAP argues that a smarter drone should actively steer itself to maximize what it can perceive, especially in cluttered unknown environments where obstacles can appear late relative to the drone's current velocity.
This is actually a subtle point that gets glossed over in a lot of navigation papers. If you're moving fast and your sensor has a narrow field of view, you can fly into an obstacle that your sensor technically could have detected if you'd been pointed slightly differently. FLAP's "velocity-triggered activation" mechanism tries to solve this: as the drone speeds up, the planner becomes more aggressive about orienting the vehicle to keep the flight path within the sensor's FOV.
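Here is a rough Python sketch of that idea, under my own assumptions rather than the paper's formulation. It measures how far the direction of travel sits outside the sensor's cone, then scales that by a smooth switch that turns on as speed passes a threshold. The sigmoid switch, `v_trigger`, `sharpness`, and the function names are hypothetical choices made for illustration.

```python
import math


def fov_violation(velocity, sensor_axis, half_fov_rad):
    """Radians by which the travel direction falls outside the sensor cone
    (0.0 if the drone is flying into what it can see)."""
    speed = math.sqrt(sum(v * v for v in velocity))
    axis_len = math.sqrt(sum(a * a for a in sensor_axis))
    if speed < 1e-9 or axis_len < 1e-9:
        return 0.0                     # hovering: no travel direction to check
    cos_angle = sum(v * a for v, a in zip(velocity, sensor_axis)) / (speed * axis_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))   # guard acos against rounding
    return max(0.0, math.acos(cos_angle) - half_fov_rad)


def activation(speed, v_trigger=1.0, sharpness=4.0):
    """Smooth 0-to-1 switch centred on v_trigger (an assumed form)."""
    return 1.0 / (1.0 + math.exp(-sharpness * (speed - v_trigger)))


def perception_cost(velocity, sensor_axis, half_fov_rad):
    speed = math.sqrt(sum(v * v for v in velocity))
    return activation(speed) * fov_violation(velocity, sensor_axis, half_fov_rad) ** 2


forward_camera = (1.0, 0.0, 0.0)
half_fov = 0.6                         # roughly 34 degrees, made up

aligned = perception_cost((3.0, 0.0, 0.0), forward_camera, half_fov)
slow_sideways = perception_cost((0.0, 0.2, 0.0), forward_camera, half_fov)
fast_sideways = perception_cost((0.0, 3.0, 0.0), forward_camera, half_fov)
```

Flying sideways costs almost nothing at a crawl and a lot at speed, which is the trade the article describes: the planner only fights for sensor alignment when a blind spot could actually hurt.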
The formulation is differentiable, meaning the perception constraints are baked directly into the trajectory optimization rather than handled as a separate post-processing step. That matters for computational efficiency. They also tested this in real-world experiments across diverse environments with different sensor configurations, not just a single lab setup. That's the kind of generalization testing that separates research with legs from research that's been overfit to one scenario.
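What "baked into the optimization" buys you is easiest to see in miniature. The toy below is my own construction, not FLAP's objective: a single yaw angle, a smooth perception term that is zero when the sensor faces the direction of travel, and a term that resists turning away from a reference yaw. Because both terms have gradients, one descent loop balances them, with no separate perception pass. The cost terms, weights, and names are assumptions.

```python
import math


def total_cost(yaw, vel_heading, yaw_ref, w_perc=1.0, w_smooth=0.2):
    perception = 1.0 - math.cos(yaw - vel_heading)   # 0 when sensor faces travel
    smooth = (yaw - yaw_ref) ** 2                     # penalise turning effort
    return w_perc * perception + w_smooth * smooth


def total_grad(yaw, vel_heading, yaw_ref, w_perc=1.0, w_smooth=0.2):
    # Analytic derivative of total_cost with respect to yaw.
    return w_perc * math.sin(yaw - vel_heading) + 2.0 * w_smooth * (yaw - yaw_ref)


def optimize_yaw(vel_heading, yaw_ref, steps=200, lr=0.1):
    """Plain gradient descent on the combined cost, starting from yaw_ref."""
    yaw = yaw_ref
    for _ in range(steps):
        yaw -= lr * total_grad(yaw, vel_heading, yaw_ref)
    return yaw


# Drone is travelling at a heading of 0.8 rad while the sensor points at 0 rad.
best_yaw = optimize_yaw(vel_heading=0.8, yaw_ref=0.0)
```

The optimizer settles on a yaw between the reference and the travel heading: it turns toward where the drone is going, but not all the way, because turning has a cost too. A real planner does this over a whole trajectory rather than one angle.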
Key claims from FLAP:
Integrates active perception directly into trajectory optimization
Derives perception constraints from the UAV's dynamic model in the sensor coordinate frame
Velocity-triggered activation balances perception and motion efficiency
Introduces a parametric start-time optimization for a perception sub-trajectory segment
Works for arbitrary 3D maneuvers, not just horizontal flight
Requires only a simple global path as input, no expensive perception-aware path generator
Validated in both simulation and real-world experiments across varying environments
Look, these two papers are solving different parts of the same problem. PolyMerge is about compressing what you already know about a scene into something a compute-limited platform can act on. FLAP is about making sure you're building that scene knowledge efficiently in the first place. In a real deployment, you'd want both working together, which raises several questions, chief among them whether the computational overhead of FLAP's active perception planning would still be tractable on the kind of hardware PolyMerge is targeting.
Neither paper addresses that integration question directly, which is fair since they're independent research efforts. But it's the obvious next question for anyone thinking about full-stack deployment.
The broader context here matters. UAV navigation in unknown cluttered environments is genuinely hard, and the field has a long history of papers that work beautifully in simulation and then hit a wall when real-world sensor noise, compute latency, and unpredictable obstacle geometry enter the picture. Both of these papers at least demonstrate hardware experiments, which is a higher bar than pure simulation work. Whether the performance holds at larger scales, higher speeds, or in more chaotic environments remains unclear from the preprint abstracts alone.
From my time in hardware, the bottleneck that kills most promising navigation systems isn't the algorithm. It's the interface between the algorithm and the physical constraints of the platform: weight, power draw, thermal limits, latency in the sensor pipeline. PolyMerge's choice to validate on a Crazyflie is encouraging specifically because the Crazyflie doesn't let you cheat those constraints. FLAP's multi-environment validation is similarly credible.
The real test for both systems is production volume and repeated real-world deployments. One successful hardware demo is a necessary condition, not a sufficient one. But as preprints go, these are doing the right things.