Full DROSS update
> *"The swarm worked perfectly at 100 bugs. At 500, the game caught fire."*
DROSS is a sci-fi horror game built in Unity 6 with ECS (Entity Component System) and Burst-compiled code. One of its core features is a **swarm AI system** — hundreds to thousands of alien bugs that move through procedurally generated levels, following leader-follower hierarchies, flanking the player, and reacting to threats in real-time.
It's the kind of feature that looks great on paper, but melts your CPU in practice.
This post walks through the 10-step optimization journey that took our swarm pathfinding from a **187ms main-thread catastrophe** to something that comfortably handles 5,000+ agents. No magic bullets — just a lot of profiling, several mistakes, and some patterns that might be useful if you're doing anything similar.
## The Problem: A* at Scale
We use the A* Pathfinding Project with its ECS integration. Every bug in the swarm is a fully pathfinding-capable agent: they avoid walls, navigate stairs, and respect the generated level’s navmesh. This was fantastic for 50-100 bugs. But past 300, several issues conspired to destroy our frame rate.
The profiler was genuinely alarming. 187ms on the main thread. Single-digit FPS. The swarm was eating itself alive.
## The Solution
## Step 1: Quality-Based Local Avoidance
The first instinct was to scale RVO neighbours by total bug count — fewer neighbours when there are more bugs. We shipped that, and it worked, but it had a nasty side effect: **close bugs looked stupid.** When you have 300 bugs total but only 20 near the player, those 20 visible bugs were getting 2 neighbours each and clipping through each other constantly.
The fix was obvious in retrospect: **tie avoidance quality to distance from the player, not total count.**
We already had a robust quality-culling system that assigns every bug a tier based on its distance from the player.
Close bugs always look good. Distant bugs save cycles. Total count doesn't matter — a thousand bugs far away don't degrade the twenty bugs right in front of you.
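As a rough illustration of distance-driven avoidance quality, here is a minimal language-neutral sketch in Python (not the game's C#). The band distances and neighbour counts are made-up placeholders, and `max_neighbours` is a hypothetical helper; only the 48 m cull distance appears elsewhere in this post.

```python
# Illustrative bands: (upper distance bound in metres, RVO neighbours to consider).
# These numbers are assumptions for the sketch, not DROSS's real tuning.
AVOIDANCE_BANDS = [
    (12.0, 10),  # right next to the player: full-quality avoidance
    (24.0, 6),
    (48.0, 3),
]
CULLED_NEIGHBOURS = 0  # assumption: at or beyond the last band, skip local avoidance


def max_neighbours(distance_to_player: float) -> int:
    """Pick the neighbour budget from distance alone; total bug count is never consulted."""
    for upper_bound, neighbours in AVOIDANCE_BANDS:
        if distance_to_player < upper_bound:
            return neighbours
    return CULLED_NEIGHBOURS
```

Because the function never sees the total bug count, spawning another thousand distant bugs cannot lower the budget of the ones next to the player.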
## Step 2: Destination Write Cooldowns
Here's something the A* documentation warns you about that's easy to miss:
> *"Repairing the path each frame can be a significant part of movement calculation time."*
Every time you write a new destination to an agent, A* repairs the path. If you're updating destinations every frame for 300 bugs, that's 300 path repairs per frame. Each one involves local graph queries, node traversal, and funnel calculations.
The fix is embarrassingly simple: **don't update destinations that haven't meaningfully changed.** We added quality-based cooldowns:
Combined with frame staggering (Q3 bugs only process every 4th frame, Q2 every 2nd), the destination write volume dropped by roughly 80%.
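The two ideas above (a write cooldown plus frame staggering) can be sketched roughly as follows. This is Python used as neutral pseudocode, not the project's C#; `DestinationState`, the function names, and the `min_move_m` threshold are all hypothetical, and the actual cooldown values per tier are not shown here.

```python
import math
from dataclasses import dataclass


@dataclass
class DestinationState:
    last_destination: tuple  # (x, y, z) last written to the pathfinder
    last_write_time: float   # seconds


def is_agent_frame(frame_index: int, agent_id: int, interval: int) -> bool:
    """Frame staggering: an agent is processed on one frame out of every `interval`."""
    return frame_index % interval == agent_id % interval


def should_write_destination(state, new_destination, now, cooldown_s, min_move_m):
    """Write only if the cooldown has expired AND the target moved meaningfully."""
    if now - state.last_write_time < cooldown_s:
        return False
    return math.dist(state.last_destination, new_destination) >= min_move_m


def write_destination(state, new_destination, now):
    """Record the write; in the game this is where the pathfinder call would go."""
    state.last_destination = new_destination
    state.last_write_time = now
```

Offsetting by `agent_id` means agents sharing an interval do not all land on the same frame, so the skipped work is spread out rather than bunched up.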
With destination updates now spread up to 2 seconds apart, more handling was needed, because cooldowns driven only by distance could leave swarm movement lagging behind. The solution was:
---
## Step 3: Predicting Where Leaders Are Going
Our swarm uses a leader-follower hierarchy. Leaders pathfind to nests, alarms, or the player. Followers... follow. But followers also need destinations for A* to pathfind them.
The problem: culled followers (Q0, off-screen, 48+ meters away) were still requesting full pathfinding through A*'s managed `PathTracer` — a main-thread operation that extracts path nodes one at a time. For bugs nobody can see.
The solution is to skip pathfinding entirely for distant followers. Instead, we share each leader's target destination via a lookup table. When a Q0 follower needs a destination, we do simple vector math: project a point 10 meters ahead along the line from the follower toward the leader's target.
It's not accurate. The bug might walk through a wall off-screen. Nobody cares — by the time the bug is close enough to be visible, it switches to real pathfinding and corrects itself within a second.
Zero main-thread cost. Fully Burst-compatible. Distant bugs keep moving in roughly the right direction.
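The vector math is small enough to show in full. A minimal sketch in Python (standing in for the Burst-compiled C#), where `predicted_destination`, the `leader_targets` table, and the leader id `7` are hypothetical names; clamping to the target when the follower is already within 10 meters is an assumption of this sketch.

```python
import math


def predicted_destination(follower_pos, leader_target, lookahead_m=10.0):
    """Project a point `lookahead_m` metres from the follower toward the leader's target."""
    distance = math.dist(follower_pos, leader_target)
    if distance <= lookahead_m:
        # Assumption: when already within range, aim straight at the target.
        return tuple(float(c) for c in leader_target)
    scale = lookahead_m / distance
    return tuple(f + (t - f) * scale for f, t in zip(follower_pos, leader_target))


# Leader targets shared through a lookup table keyed by a (hypothetical) leader id.
leader_targets = {7: (20.0, 0.0, 0.0)}
destination = predicted_destination((0.0, 0.0, 0.0), leader_targets[7])
```

No graph query is involved: one distance, one division, and a lerp per follower.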
---
## Step 4: The Death Spiral
This one was the scariest bug because it's self-reinforcing. Here's how it works:
1. Frame takes 50ms (maybe a GC spike, maybe a scene load hiccup).
2. A* sees it missed simulation time and runs 3 substeps to catch up.
3. Each substep is 40ms. Frame now takes 170ms.
4. Next frame, A* sees it's even further behind. Runs 4 substeps.
5. Frame takes 210ms. Game is now a slideshow.
The fix has two parts:
**Hard clamp:** We modified A*'s internal rate manager to cap substeps at 3 per frame, no matter what. If the simulation falls behind, it falls behind. Smooth 30fps is better than perfect simulation at 5fps.
**Dynamic timestep:** When bug count is high, we increase A*'s simulation timestep (lower frequency). Under 100 bugs, A* runs at 1/30s. At 300+, it drops to 1/15s. Distant bugs don't need 30Hz movement updates — nobody's counting their footsteps.
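Both parts can be sketched in a few lines. This is a Python illustration, not the modified A* rate manager itself: `plan_substeps` and `simulation_timestep` are hypothetical names, dropping the backlog beyond the cap is how this sketch interprets "if it falls behind, it falls behind", and the linear blend between 100 and 300 bugs is an assumption since the post only gives the two endpoints.

```python
MAX_SUBSTEPS = 3  # hard cap per frame, as described above


def plan_substeps(accumulated_s: float, timestep_s: float, max_substeps: int = MAX_SUBSTEPS):
    """Return (substeps to run this frame, time carried into the next frame)."""
    wanted, carry_s = divmod(accumulated_s, timestep_s)
    steps = min(int(wanted), max_substeps)
    # Only the sub-timestep remainder carries over. Whole steps beyond the cap
    # are dropped, so a slow frame cannot schedule extra work for the next one.
    return steps, carry_s


def simulation_timestep(bug_count: int) -> float:
    """1/30 s under 100 bugs, 1/15 s at 300+; the linear blend between is an assumption."""
    fast, slow = 1.0 / 30.0, 1.0 / 15.0
    if bug_count < 100:
        return fast
    if bug_count >= 300:
        return slow
    t = (bug_count - 100) / 200.0
    return fast + t * (slow - fast)
```

With the cap in place, a 210ms frame at a 1/30s timestep asks for six substeps but runs only three, which breaks the feedback loop from the list above.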
## Step 5: Staggering Leader Updates
All our leaders were updating on the exact same cadence: once per second. That means every 60th frame (at 60fps), every single leader simultaneously evaluates its state machine, picks a new destination, and writes to A*. This is the cost of getting it working before getting it working right.
The profiler showed a clean 1Hz spike — a sawtooth pattern where one frame every second was 3-4x more expensive than its neighbours.
The fix: easy. When a leader spawns, it gets a staggered time offset based on its swarm ID. Leader 0 updates at t=0.0, leader 1 at t=0.05, leader 2 at t=0.10, etc. The same total work happens per second, but it's spread evenly across frames instead of concentrated in one.
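A minimal sketch of the offset calculation, written in Python for illustration. The 0.05s step and 1s period come from the description above; wrapping the offset once the IDs run past one period is an assumption of this sketch, and both function names are hypothetical.

```python
def leader_phase_offset(swarm_id: int, step_s: float = 0.05, period_s: float = 1.0) -> float:
    """Offset within the update period; wraps so offsets stay inside one period."""
    slots = max(1, round(period_s / step_s))  # 20 slots for 0.05 s steps in a 1 s period
    return (swarm_id % slots) * step_s


def next_update_time(spawn_time: float, swarm_id: int) -> float:
    """First update for a freshly spawned leader; later updates repeat every period."""
    return spawn_time + leader_phase_offset(swarm_id)
```

Each leader still updates once per second; only the phase differs, which is what flattens the sawtooth.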
## Step 6: Making the Leader System Burst-Compatible
This is where things got interesting. Steps 1-5 were configuration and algorithmic changes. Step 6 was surgery. Actual optimizations. I'm going to get technical here:
OK, so Unity's Burst compiler can take ECS system code and compile it to highly optimized native code with SIMD vectorization. The catch is: "no managed code allowed."
This means: No `Debug.Log`s. No `string` operations. No calling managed class methods. No `System.DateTime`. No garbage-collected allocations.
The leader movement system had a few (seven) things blocking Burst:
Each one had to be replaced with a Burst-safe alternative.
After removing all seven blockers, we put `[BurstCompile]` on `OnUpdate` and the compiler accepted it. The profiler showed a ~0.5-1ms improvement from vectorization alone.
## Step 7: Deferred Cleanup (A Fallen Tag Pipeline)
When a bug falls through the world (Y < -5, this happens occasionally with procedural geometry), the leader system needs to return it to the object pool. But `BugPoolSystem.Release()` is a managed method — you can't call it from Burst code.
The solution is a **deferred tagging pipeline:**
1. Burst system detects fallen bug → sets `BugFallenTag` enabled via ECB (entity command buffer)
2. End of frame: ECB plays back, tag becomes active
3. Next frame: cleanup system queries for `BugFallenTag` entities → calls managed pool release → resets tag
The bug is returned to the pool one frame late. Nobody notices. The leader system stays Burst-compiled.
This pattern — "tag now, process later in a managed system" — is generally useful anywhere you need a Burst system to trigger managed-only operations.
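To make the ordering concrete, here is a toy model of the pipeline in Python. It is not Unity ECS code: a plain list stands in for the ECB, a boolean field stands in for the enableable tag, and `detect_fallen`, `play_back`, and `cleanup_fallen` are hypothetical names for the three stages.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    bug_id: int
    y: float
    fallen_tag: bool = False  # stands in for an enableable tag component


def detect_fallen(bugs, command_buffer, kill_y=-5.0):
    """'Burst-side' pass: only records intent, never touches the pool."""
    for bug in bugs:
        if bug.y < kill_y and not bug.fallen_tag:
            command_buffer.append(bug.bug_id)


def play_back(bugs_by_id, command_buffer):
    """End of frame: recorded commands take effect and the tag becomes active."""
    for bug_id in command_buffer:
        bugs_by_id[bug_id].fallen_tag = True
    command_buffer.clear()


def cleanup_fallen(active, pool):
    """'Managed-side' pass on the next frame: release tagged bugs, reset the tag."""
    for bug in [b for b in active if b.fallen_tag]:
        bug.fallen_tag = False
        active.remove(bug)
        pool.append(bug)
```

The detection pass has no side effects beyond appending to the buffer, which is the property that lets the real version stay Burst-compiled.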
## Step 8: Eliminating Per-Frame Garbage Collection
The profiler's GC column was showing 17KB/frame from the leader system alone. The culprits:
`Allocator.Temp` is supposed to be cheap, but "cheap × every frame × forever" adds up. And the GC doesn't care that your allocation is "temporary" — it still has to track and collect it.
The fix: **persistent containers.** Allocate the hash sets and lists once in `OnCreate` with `Allocator.Persistent`. Call `.Clear()` at the start of each frame instead of creating new ones. Dispose in `OnDestroy`.
Same data, same logic, zero allocations per frame. 17KB/frame → 0.
The cleanup system got the same treatment — persistent `NativeList`s for its three processing passes (dead bugs, orphaned leaders, fallen bugs), cleared each frame, disposed on shutdown.
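The lifecycle looks roughly like this. The sketch is in Python, whose memory model differs from Unity's native containers, so it only shows the shape of the pattern (create once, clear per frame, release on shutdown); `LeaderSystem` and its fields are hypothetical names.

```python
class LeaderSystem:
    """Lifecycle sketch: allocate once, clear per frame, release on shutdown."""

    def on_create(self):
        # Allocated once; stands in for Allocator.Persistent containers.
        self.visited_ids = set()
        self.pending = []

    def on_update(self, leader_ids):
        # Reuse the same containers instead of constructing new ones each frame.
        self.visited_ids.clear()
        self.pending.clear()
        for leader_id in leader_ids:
            if leader_id not in self.visited_ids:
                self.visited_ids.add(leader_id)
                self.pending.append(leader_id)
        return len(self.pending)

    def on_destroy(self):
        # Stands in for Dispose(); native containers must be freed explicitly.
        self.visited_ids = None
        self.pending = None
```

The one thing to remember with persistent containers is the clear at the top of the update: forgetting it leaks last frame's data into this frame's logic.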
## The Stress Test
Sometimes you just need to know: "how many bugs can we actually handle?" And you need to answer it *while the game is running,* without restarting.
Slide it down during a stress test and watch the profiler drop. Slide it up and see where the budget breaks. It's the single most useful debug tool we've built for swarm performance.
The swarm is no longer the bottleneck. A* pathfinding at 5K agents is still expensive — it's real pathfinding on a real navmesh — but it's a *predictable* expensive. No spikes, no spirals, no garbage collection pauses.
DROSS is in active development. Follow me and wishlist DROSS on Steam!
Thank you everyone for reading, Cheers
~Salt_
