Open worlds feel dead without people. You can build the most detailed city environment in Unreal Engine 5 — photorealistic buildings, volumetric fog rolling through alleys, dynamic weather — and it will still feel like a ghost town if the streets are empty. Crowds and traffic are what transform a beautiful environment into a believable world.
The challenge is scale. A single pedestrian NPC with full AI, skeletal mesh animation, and physics collision costs roughly 0.1-0.4ms per frame. Multiply that by the hundreds or thousands of NPCs needed to fill a city, and you have blown your frame budget several times over before rendering a single pixel.
This tutorial covers the approaches available in UE5 for crowd simulation and traffic AI, from the engine's built-in Mass Entity framework to third-party solutions. We will walk through implementation, compare performance benchmarks, discuss LOD strategies, and show how to integrate static environment population with dynamic crowds.
Why You Need Crowd Systems
Before diving into implementation, let us establish why dedicated crowd systems matter and why you cannot simply spawn hundreds of standard AI characters.
The Standard AI Character Problem
A typical UE5 AI character (the kind you would use for an enemy or companion NPC) includes:
- Skeletal Mesh Component: Full skeleton with bone transforms updated every frame
- Animation Blueprint: State machine evaluation, blend space sampling, montage playback
- Character Movement Component: Full movement simulation with navmesh pathfinding, acceleration, deceleration, slope handling
- AI Controller: Behavior tree tick, blackboard updates, perception system queries
- Capsule collision: Physics scene presence for collision detection
The approximate per-character cost per frame:
| Component | Cost per Character |
|---|---|
| Animation evaluation | 0.05-0.15ms |
| AI behavior tree tick | 0.02-0.08ms |
| Movement/pathfinding | 0.03-0.10ms |
| Skeletal mesh render | 0.02-0.05ms |
| Physics collision | 0.01-0.03ms |
| Total | 0.13-0.41ms |
At the high end, 100 standard AI characters would consume 41ms — far more than your total 16.67ms frame budget. Even at the low end, 100 characters take 13ms, leaving almost nothing for rendering, game logic, or anything else.
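The arithmetic behind these figures is easy to reproduce. The sketch below uses only the per-character totals from the table; the function names are illustrative and not part of any engine API.

```cpp
// Frame budget in milliseconds for a target frame rate.
constexpr double FrameBudgetMs(double targetFps) {
    return 1000.0 / targetFps;
}

// Total cost of a number of agents at a given per-agent cost.
constexpr double CrowdCostMs(int agentCount, double costPerAgentMs) {
    return agentCount * costPerAgentMs;
}

// Largest agent count whose total cost still fits in the given budget.
constexpr int MaxAgentsInBudget(double budgetMs, double costPerAgentMs) {
    return static_cast<int>(budgetMs / costPerAgentMs);
}
```

With these helpers, `MaxAgentsInBudget(FrameBudgetMs(60.0), 0.41)` gives 40 and `MaxAgentsInBudget(FrameBudgetMs(60.0), 0.13)` gives 128, and that assumes the entire frame is spent on characters.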
This is why crowd systems exist. They achieve 10-100x better per-agent performance by replacing expensive general-purpose systems with specialized, optimized alternatives designed specifically for large numbers of simple agents.
What Makes a Crowd System Different
Crowd systems achieve their performance advantage through several techniques:
Data-oriented design (ECS/DOD). Instead of each agent being an independent Actor with Components, crowd systems store agent data in contiguous arrays processed in bulk. This is cache-friendly and enables SIMD operations.
Simplified simulation. Crowd agents do not need full navmesh pathfinding, behavior trees, or physics collision. They follow pre-computed flow paths, use simple avoidance rules, and collide only with other agents and major obstacles.
LOD-based fidelity. Near agents get full skeletal animation. Mid-distance agents use simplified animation or pose snapshots. Far agents use static meshes, impostors, or are culled entirely.
Animation instancing. Instead of evaluating animation per-skeleton, crowd systems use GPU instanced animation — the same animation data is shared across thousands of instances with only per-instance phase offsets.
Batched rendering. Thousands of crowd agents render as instanced meshes (or Nanite meshes) in a handful of draw calls, rather than individual skeletal mesh draws.
MassAI and the Mass Entity Framework
Unreal Engine 5 includes a built-in Entity Component System (ECS) called Mass Entity, and a crowd/AI framework built on top of it called MassAI. This is the same technology Epic uses for crowd systems in Fortnite and other internal projects.
Understanding Mass Entity
Mass Entity is UE5's implementation of the ECS (Entity Component System) pattern. Unlike the traditional Actor-Component model where each Actor is an independent object with a pointer-based component hierarchy, ECS stores data in flat arrays organized by data type.
Entities are lightweight identifiers (essentially integer IDs). They have no behavior or data on their own.
Fragments (equivalent to Components in ECS terminology) are plain data structs. Examples: FTransformFragment (position/rotation), FMassMoveTargetFragment (movement goal), FMassVelocityFragment (current velocity).
Processors (equivalent to Systems in ECS) are functions that operate on entities that have specific fragment combinations. A movement processor iterates over all entities with Transform + Velocity + MoveTarget fragments and updates their positions.
Traits are configuration templates that define which fragments an entity archetype has and configure their initial values.
This architecture enables processing thousands of entities efficiently because:
- Data is stored contiguously in memory (cache-friendly)
- Processors batch-process all matching entities (no per-entity virtual function calls)
- Only the data needed for each processor is loaded (no loading unnecessary components)
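To make the layout concrete, here is a minimal sketch in plain C++ of fragments stored as contiguous arrays and a processor that walks them in one loop. It does not use the Mass API: `AgentFragments`, `IntegrateMovement`, and `ExampleStep` are hypothetical names chosen for illustration, and the real framework manages archetypes and chunks for you.

```cpp
#include <cstddef>

// Hypothetical stand-ins for Mass fragments: plain data, one contiguous array
// per field. These are NOT the engine's types.
template <std::size_t N>
struct AgentFragments {
    float posX[N] = {};
    float posY[N] = {};
    float velX[N] = {};
    float velY[N] = {};
};

// A processor in miniature: one tight loop over every agent that has the
// required data, with no per-agent virtual calls and no unrelated data loaded.
template <std::size_t N>
constexpr void IntegrateMovement(AgentFragments<N>& agents, float deltaSeconds) {
    for (std::size_t i = 0; i < N; ++i) {
        agents.posX[i] += agents.velX[i] * deltaSeconds;
        agents.posY[i] += agents.velY[i] * deltaSeconds;
    }
}

// Usage: two agents walking along X at 120 cm/s and 150 cm/s for half a second.
constexpr float ExampleStep() {
    AgentFragments<2> agents{};
    agents.velX[0] = 120.0f;
    agents.velX[1] = 150.0f;
    IntegrateMovement(agents, 0.5f);
    return agents.posX[0] + agents.posX[1];  // 60 cm + 75 cm
}
```

The point of the design is that the loop touches only the four arrays it needs, which is what makes it cache-friendly and easy for a compiler to vectorize.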
MassAI: The Crowd Layer
MassAI builds on Mass Entity to provide crowd-specific functionality:
MassNavigation: Pathfinding and movement for mass agents. Uses a simplified navigation system (zone graphs) rather than full navmesh pathfinding. Zone graphs define movement corridors (sidewalks, roads, paths) with pre-computed connectivity.
MassAvoidance: Local collision avoidance between agents. Prevents agents from walking through each other using velocity-based avoidance (similar to RVO/ORCA algorithms).
MassRepresentation: Manages the visual representation of agents at different LODs. Handles transitions between skeletal mesh (near), static mesh (mid), instanced static mesh (far), and culled (very far).
MassSmartObjects: Integration with UE5's Smart Object system, allowing mass agents to interact with environmental objects (sit on benches, use doors, browse shop stalls).
MassTraffic: Vehicle traffic simulation for Mass Entity agents, supporting lane-based driving, intersections, and traffic signals.
Step-by-Step: Populating a City with Mass Entity
Here is a practical walkthrough for setting up a crowd system using Mass Entity.
Step 1: Create Zone Graphs
Zone graphs define where crowd agents can walk. They are authored by placing zone graph actors in your level along pedestrian paths, sidewalks, and walkable areas.
- Open your city level in the editor
- Place ZoneGraphData actors along sidewalks and pedestrian areas
- Define lanes within each zone graph segment (a sidewalk might have 2-3 lanes for bidirectional flow)
- Connect zone graph segments at intersections and crossings
- Build the zone graph (similar to building a navmesh)
Zone graphs are much simpler than navmeshes. They define linear paths (lanes) rather than navigable surfaces. This is appropriate for crowds because pedestrians follow predictable paths — they walk on sidewalks, cross at crosswalks, and move through doorways.
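Because lanes are linear, an agent's state can be as small as a lane index plus a distance along that lane. The following sketch illustrates the idea under simplifying assumptions: lanes are straight segments with a precomputed length and a single successor. `Lane`, `LaneAgent`, `Advance`, `PositionX`, and `kExampleLanes` are hypothetical and unrelated to the actual ZoneGraph data structures.

```cpp
// Hypothetical lane record: a straight segment with a precomputed length and
// the index of the lane it connects to (-1 for a dead end).
struct Lane {
    float startX, startY;
    float endX, endY;
    float lengthCm;
    int nextLane;
};

struct LaneAgent {
    int lane;
    float distanceCm;  // distance travelled along the current lane
};

// Advance an agent along its lane, rolling over into connected lanes.
// Assumes every lane has a positive length.
constexpr LaneAgent Advance(const Lane* lanes, LaneAgent agent,
                            float speedCmPerSec, float deltaSeconds) {
    agent.distanceCm += speedCmPerSec * deltaSeconds;
    while (agent.distanceCm > lanes[agent.lane].lengthCm &&
           lanes[agent.lane].nextLane >= 0) {
        agent.distanceCm -= lanes[agent.lane].lengthCm;
        agent.lane = lanes[agent.lane].nextLane;
    }
    // Clamp at a dead end.
    if (agent.distanceCm > lanes[agent.lane].lengthCm) {
        agent.distanceCm = lanes[agent.lane].lengthCm;
    }
    return agent;
}

// World-space X of an agent (Y is analogous): linear interpolation along the lane.
constexpr float PositionX(const Lane* lanes, LaneAgent agent) {
    const Lane& lane = lanes[agent.lane];
    const float t = agent.distanceCm / lane.lengthCm;
    return lane.startX + (lane.endX - lane.startX) * t;
}

// Example data: a 10m sidewalk segment heading +X, then a 5m dead-end segment heading +Y.
constexpr Lane kExampleLanes[] = {
    {0.0f, 0.0f, 1000.0f, 0.0f, 1000.0f, 1},
    {1000.0f, 0.0f, 1000.0f, 500.0f, 500.0f, -1},
};
```

An agent 9m along the first lane walking at 120 cm/s ends the next second 20cm into the second lane, with no navmesh query involved.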
Step 2: Define Agent Archetypes
Create Mass Entity agent archetypes that define the fragments (data) each crowd agent carries.
- Create a Data Asset of type MassEntityConfig
- Add traits:
  - MassAssortedFragmentsTrait: basic fragment collection
  - MassMovementTrait: movement parameters (speed, acceleration)
  - MassNavigationTrait: zone graph navigation
  - MassAvoidanceTrait: collision avoidance
  - MassRepresentationTrait: visual representation LODs
  - MassSmartObjectUserTrait: (optional) interaction with smart objects
- Configure trait parameters:
  - Walking speed: 100-150 cm/s (typical pedestrian)
  - Avoidance radius: 40-60 cm
  - Representation LODs: Skeletal mesh (0-30m), static mesh (30-80m), instanced static mesh (80-200m), culled (200m+)
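The representation ranges above boil down to a distance lookup. Here is a minimal sketch of that selection logic in plain C++; the enum and function names are hypothetical, and in a real project this decision is made by the representation trait rather than hand-written code.

```cpp
enum class CrowdLod { SkeletalMesh, StaticMesh, InstancedStaticMesh, Culled };

// Upper bounds of each range in meters, taken from the ranges listed above.
constexpr float kSkeletalMaxM = 30.0f;
constexpr float kStaticMaxM = 80.0f;
constexpr float kInstancedMaxM = 200.0f;

constexpr CrowdLod SelectLod(float distanceM) {
    if (distanceM < kSkeletalMaxM) return CrowdLod::SkeletalMesh;
    if (distanceM < kStaticMaxM) return CrowdLod::StaticMesh;
    if (distanceM < kInstancedMaxM) return CrowdLod::InstancedStaticMesh;
    return CrowdLod::Culled;
}

// Same decision from a camera-to-agent offset, comparing squared distances so
// that no square root is needed per agent.
constexpr CrowdLod SelectLodFromOffset(float dxM, float dyM, float dzM) {
    const float distSq = dxM * dxM + dyM * dyM + dzM * dzM;
    if (distSq < kSkeletalMaxM * kSkeletalMaxM) return CrowdLod::SkeletalMesh;
    if (distSq < kStaticMaxM * kStaticMaxM) return CrowdLod::StaticMesh;
    if (distSq < kInstancedMaxM * kInstancedMaxM) return CrowdLod::InstancedStaticMesh;
    return CrowdLod::Culled;
}
```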
Step 3: Configure Visual Representations
Each LOD level needs a visual representation:
LOD 0 (Near): Skeletal Mesh. Use a standard Skeletal Mesh with a simplified animation set: walk, idle, and a few variation animations. This is the most expensive LOD but only applies to the closest agents (typically 10-30 visible at once).
LOD 1 (Mid): Static Mesh or Animated Static Mesh. Use a static mesh in a walking pose, or use vertex animation textures (VAT) that bake skeletal animation into texture data and replay it on static meshes via a material shader. This is dramatically cheaper than skeletal mesh animation.
LOD 2 (Far): Instanced Static Mesh. Use instanced static meshes (ISM) for maximum rendering efficiency. Thousands of instances render in a single draw call. Use a simple standing or walking pose mesh.
LOD 3 (Very Far): Impostor or Culled. Either use billboard impostors (flat cards facing the camera with a character texture) or cull agents entirely. At 200+ meters, individual pedestrians are barely visible anyway.
Step 4: Spawn Agents
Mass Entity agents can be spawned through:
- MassSpawner actors placed in the level — define spawn areas, agent counts, and archetype assignments
- Runtime spawning through code — spawn agents dynamically as the player enters new areas
- Stream-in spawning — tie agent spawning to World Partition level streaming
For a city, place MassSpawner actors in each city block. Configure them to spawn appropriate agent densities — dense sidewalks might have 50-100 agents per block, while quiet residential streets might have 10-20.
Step 5: Configure LOD Transitions
Smooth LOD transitions are critical. A crowd agent popping from a skeletal mesh to a static mesh is visually jarring. Configure transition distances with overlap ranges:
- Skeletal mesh active: 0-25m
- Transition zone: 25-30m (crossfade between skeletal and static mesh)
- Static mesh active: 30-75m
- Transition zone: 75-80m
- Instanced static mesh active: 80-200m
- Cull distance: 200m
Use dithered opacity or fade-in/fade-out during transitions to avoid popping.
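As a rough illustration of how a transition zone can drive a dithered fade, the sketch below computes a blend weight from distance and compares it against a 4x4 ordered-dither (Bayer) threshold per pixel. This is one common way to build such a fade and an assumption on our part, not a description of the engine's material nodes; `TransitionAlpha` and `DitherVisible` are hypothetical names.

```cpp
// Blend weight of the incoming (lower-detail) representation inside a
// transition zone: 0 at the zone start, 1 at the zone end.
constexpr float TransitionAlpha(float distanceM, float zoneStartM, float zoneEndM) {
    return distanceM <= zoneStartM ? 0.0f
         : distanceM >= zoneEndM   ? 1.0f
         : (distanceM - zoneStartM) / (zoneEndM - zoneStartM);
}

// Standard 4x4 Bayer ordered-dither matrix.
constexpr int kBayer4x4[4][4] = {
    { 0,  8,  2, 10},
    {12,  4, 14,  6},
    { 3, 11,  1,  9},
    {15,  7, 13,  5},
};

// True if a pixel of a representation with the given opacity should be drawn.
// The incoming mesh passes alpha, the outgoing mesh passes 1 - alpha.
constexpr bool DitherVisible(float opacity, int pixelX, int pixelY) {
    const float threshold =
        (static_cast<float>(kBayer4x4[pixelY & 3][pixelX & 3]) + 0.5f) / 16.0f;
    return opacity > threshold;
}
```

For the 25-30m zone, an agent at 27.5m has an alpha of 0.5, so half of the pixels in each 4x4 tile come from each representation.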
Mass Entity Performance Benchmarks
We benchmarked Mass Entity crowd simulation in a representative urban environment on three hardware tiers.
Test conditions: Urban environment, 1920x1080, Nanite + Lumen enabled, crowd agents using 3-LOD representation.
| Agent Count | RTX 4080 | RTX 4060 | PS5/XSX |
|---|---|---|---|
| 1,000 | 1.2ms | 1.8ms | 2.1ms |
| 5,000 | 2.8ms | 4.5ms | 5.2ms |
| 10,000 | 4.6ms | 7.8ms | 9.1ms |
| 20,000 | 8.2ms | 14.1ms | 16.5ms |
At 10,000 agents, the RTX 4080 comfortably maintains 60fps with 4.6ms crowd cost. The RTX 4060 is tight at 7.8ms but feasible if other systems are well-optimized. Console platforms struggle above 5,000 agents without aggressive LOD culling.
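One way to turn the table into a planning tool is a simple lookup: given a crowd budget in milliseconds, find the largest benchmarked agent count that fits. The sketch below encodes the measurements from the table verbatim; the type and function names are hypothetical, and it deliberately does not interpolate between rows.

```cpp
struct BenchmarkRow {
    int agents;
    float rtx4080Ms;
    float rtx4060Ms;
    float consoleMs;
};

// Measurements copied from the table above.
constexpr BenchmarkRow kBenchmarks[] = {
    { 1000, 1.2f,  1.8f,  2.1f},
    { 5000, 2.8f,  4.5f,  5.2f},
    {10000, 4.6f,  7.8f,  9.1f},
    {20000, 8.2f, 14.1f, 16.5f},
};
constexpr int kBenchmarkCount =
    static_cast<int>(sizeof(kBenchmarks) / sizeof(kBenchmarks[0]));

enum class HardwareTier { Rtx4080, Rtx4060, Console };

constexpr float CostMs(const BenchmarkRow& row, HardwareTier tier) {
    return tier == HardwareTier::Rtx4080 ? row.rtx4080Ms
         : tier == HardwareTier::Rtx4060 ? row.rtx4060Ms
         : row.consoleMs;
}

// Largest benchmarked agent count whose measured cost fits the budget,
// or 0 if even the smallest benchmarked count is over budget.
constexpr int LargestCountWithinBudget(HardwareTier tier, float budgetMs) {
    int best = 0;
    for (int i = 0; i < kBenchmarkCount; ++i) {
        if (CostMs(kBenchmarks[i], tier) <= budgetMs) {
            best = kBenchmarks[i].agents;
        }
    }
    return best;
}
```

With a 5ms crowd budget this returns 10,000 for the RTX 4080, 5,000 for the RTX 4060, and 1,000 for consoles, since the 5,000-agent console measurement (5.2ms) is just over.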
Key performance observations:
- Below 5,000 agents, the cost is dominated by rendering (instanced mesh draws and LOD management)
- Above 5,000 agents, simulation cost (navigation, avoidance) becomes the dominant factor
- Skeletal mesh agents (LOD 0) are approximately 50x more expensive per-agent than instanced static mesh agents (LOD 2)
- The number of near-LOD skeletal mesh agents has more impact on performance than the total agent count
Optimizing Mass Entity Performance
Limit skeletal mesh agent count. The single most impactful optimization. If only 20 agents are within 30m of the camera at any time, the skeletal mesh cost is fixed regardless of total agent count. Use aggressive LOD distances and ensure the near-LOD radius is as small as visually acceptable.
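A hard cap on skeletal mesh agents can be expressed as a radius: shrink the near-LOD radius until at most N agents fall inside it. The sketch below shows that calculation in plain C++ over a list of camera distances. The names are hypothetical, and the quadratic search keeps the example short; production code would more likely use `std::nth_element` on a scratch buffer.

```cpp
// Returns the k-th smallest value (1-based) in distances, or fallback if
// fewer than k values exist. O(n^2), kept simple for illustration.
constexpr float KthSmallest(const float* distances, int count, int k, float fallback) {
    for (int i = 0; i < count; ++i) {
        int less = 0;
        int lessOrEqual = 0;
        for (int j = 0; j < count; ++j) {
            if (distances[j] < distances[i]) ++less;
            if (distances[j] <= distances[i]) ++lessOrEqual;
        }
        if (less < k && k <= lessOrEqual) return distances[i];
    }
    return fallback;
}

// Largest near-LOD radius (capped at maxRadiusM) such that at most
// maxSkeletal agents are strictly closer than the radius.
constexpr float NearLodRadius(const float* distancesM, int count,
                              int maxSkeletal, float maxRadiusM) {
    const float cutoff = KthSmallest(distancesM, count, maxSkeletal + 1, maxRadiusM);
    return cutoff < maxRadiusM ? cutoff : maxRadiusM;
}

// Example: five agents at these distances from the camera, in meters.
constexpr float kExampleDistancesM[] = {5.0f, 12.0f, 18.0f, 22.0f, 40.0f};
```

With a cap of 3 skeletal agents and a 30m maximum, the radius shrinks to 22m so that only the agents at 5m, 12m, and 18m use skeletal meshes.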
Use zone graph lane budgets. Configure zone graph lanes with maximum occupancy. If a sidewalk lane has a budget of 5 agents visible in skeletal mesh LOD, the system limits near-LOD agents on that lane. This prevents crowds from bunching up near the camera and exploding the skeletal mesh count.
Disable avoidance for far agents. Agents beyond 50m do not need collision avoidance with each other. They are too far away for players to notice overlapping. Disable the avoidance processor for far agents.
Batch zone graph queries. Mass Entity naturally batches processor execution, but ensure your zone graph is well-partitioned for spatial queries. Avoid monolithic zone graphs that span the entire map — partition them per district or block.
Use lightweight agents where possible. Not all crowd agents need navigation and avoidance. Background crowd agents (visible only at distance) can be simple position-along-path agents with no avoidance, reducing their simulation cost to near zero.
Traffic Simulation
Pedestrian crowds are half the equation. For a believable city, you also need vehicle traffic — cars, trucks, buses flowing through streets, stopping at intersections, and responding to traffic signals.
MassTraffic: Built-in Traffic System
UE5's Mass Entity framework includes MassTraffic, a traffic simulation system that runs on the same ECS architecture as crowd agents.
Lane-based simulation: Traffic vehicles follow pre-defined lanes on roads. Each road segment has lanes with specified speed limits, directions, and connectivity at intersections.
Intersection management: Intersections have traffic signal controllers that manage green/red phases. Vehicles queue at red lights, proceed on green, and handle turn lanes.
Vehicle variety: Support for multiple vehicle types (sedan, truck, bus, motorcycle) with different sizes, speeds, and acceleration characteristics.
Integration with pedestrian crowds: Traffic signals coordinate with pedestrian crossings. Vehicles stop for pedestrians in crosswalks (when signals allow).
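To illustrate the intersection logic, here is a minimal fixed-time signal controller for two crossing roads, written in plain C++. It is a sketch of the general idea, not MassTraffic's implementation: the yellow phase, the durations, and all names (`SignalPlan`, `StateFor`, `ShouldQueue`) are assumptions made for the example.

```cpp
enum class SignalState { Green, Yellow, Red };

// Hypothetical fixed-time plan: each road gets green then yellow while the
// other road is held at red. Durations are in milliseconds.
struct SignalPlan {
    int greenMs;
    int yellowMs;
};

// State shown to road 0 or road 1 at a given time since the cycle started.
// Assumes elapsedMs is non-negative.
constexpr SignalState StateFor(const SignalPlan& plan, int roadIndex, long long elapsedMs) {
    const int phaseMs = plan.greenMs + plan.yellowMs;  // one road's turn
    const int cycleMs = 2 * phaseMs;                   // both roads served once
    const int t = static_cast<int>(elapsedMs % cycleMs);
    const int activeRoad = t / phaseMs;
    if (roadIndex != activeRoad) return SignalState::Red;
    return (t % phaseMs) < plan.greenMs ? SignalState::Green : SignalState::Yellow;
}

// Vehicles approaching the stop line queue on anything other than green.
constexpr bool ShouldQueue(SignalState state) {
    return state != SignalState::Green;
}
```

Because the state is a pure function of elapsed time, thousands of vehicles can query it without the intersection ticking any per-vehicle logic.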
Setting Up Traffic
Road Lane Configuration
- Define road splines in your level using the zone graph system
- Assign lane properties: speed limit, vehicle types allowed, lane direction
- Configure intersections: which lanes connect, signal phases, turn permissions
- Build the traffic zone graph
Vehicle Archetypes
Create vehicle Mass Entity archetypes similar to pedestrian archetypes:
- MassTrafficVehicleTrait: vehicle physics (acceleration, braking, steering)
- MassTrafficLaneTrait: lane following behavior
- MassTrafficIntersectionTrait: intersection signal awareness
- MassRepresentationTrait: visual LODs for vehicles
Vehicle LODs:
- LOD 0 (near): Full vehicle mesh with animated wheels, brake lights
- LOD 1 (mid): Static vehicle mesh
- LOD 2 (far): Simplified low-poly vehicle mesh, instanced
- LOD 3 (very far): Culled or point sprite
Traffic Spawning
Spawn vehicles on road lanes using traffic spawner actors. Configure density per road type:
- Highway: 30-50 vehicles per km
- Main road: 15-30 vehicles per km
- Side street: 5-15 vehicles per km
Custom Traffic Solutions
For games requiring more complex traffic behavior (emergency vehicles with sirens, police chases, traffic accidents, player-driven vehicles interacting with traffic), you may need custom traffic logic on top of MassTraffic.
Common extensions:
- Dynamic re-routing: Vehicles avoid blocked roads (destroyed bridge, player obstruction)
- Emergency vehicle protocols: Traffic pulls over for emergency vehicles
- Parking behavior: Vehicles park in designated areas, enabling players to commandeer parked cars
- Traffic incidents: Scripted or emergent accidents that create traffic jams
Comparison of Traffic Approaches
| Feature | MassTraffic (Built-in) | Custom ECS Traffic | Traditional AI Vehicles |
|---|---|---|---|
| Vehicle count | 1,000-5,000+ | 500-3,000 | 20-100 |
| Lane following | Built-in | Manual implementation | NavMesh-based |
| Intersection logic | Signal-based | Customizable | Behavior tree |
| Player interaction | Limited | Full control | Full control |
| Setup complexity | Medium | High | Low |
| Per-vehicle cost | ~0.005ms | ~0.01ms | ~0.3ms |
For most open-world games, MassTraffic with custom extensions provides the best balance of performance and features.
LOD Strategies for Crowd Agents
LOD (Level of Detail) management is the cornerstone of crowd performance. The strategy you choose for transitioning between detail levels determines whether your crowd looks convincing or breaks immersion.
The Four-Tier LOD Model
We recommend a four-tier LOD system for crowd agents:
Tier 1: Full Fidelity (0-25m)
- Skeletal mesh with bone-driven animation
- Full material complexity (skin shaders, cloth simulation if applicable)
- Active ragdoll or procedural animation blending
- AI behavior (smart object interaction, gaze direction, facial animation)
- Physics collision capsule
Tier 2: Animated Mesh (25-60m)
- Vertex Animation Texture (VAT) driven static mesh
- Simplified materials (no subsurface skin, no cloth simulation)
- No AI behavior (follows path waypoints only)
- No physics collision (collision avoidance through the avoidance processor only)
Tier 3: Instanced Crowd (60-150m)
- Instanced static mesh (single draw call per mesh variation)
- Flat material with baked diffuse/normal
- No individual animation (uniform pose or very simple position-based animation)
- No avoidance (agents follow paths, overlapping is acceptable)
Tier 4: Background (150m+)
- Billboard impostor or particle-based crowd representation
- Extremely low per-agent cost
- No individual behavior
- Optional: skip entirely if environment provides enough visual complexity
Transition Smoothing
Abrupt LOD transitions are the most common visual artifact in crowd systems. Strategies to smooth transitions:
Dithered transition: During the transition range (e.g., 23-27m for Tier 1 to Tier 2), render both representations simultaneously. The outgoing LOD uses a dithered opacity that decreases, while the incoming LOD uses a dithered opacity that increases. The dither pattern prevents both being fully visible at once.
Distance-based crossfade: Similar to dithered transition but uses smooth opacity blending instead of dither. More visually smooth but requires sorting for correct alpha rendering.
Temporal offset: Stagger LOD transitions across agents. Instead of all agents at exactly 25m switching simultaneously, add a small random offset (23-27m) per agent. This prevents visible "waves" of LOD switching as the camera moves.
Occlusion-aware transitions: Only apply Tier 1 (skeletal mesh) to agents that are actually visible on screen. An agent at 10m behind a wall does not need skeletal animation. Use occlusion queries or simplified visibility checks to downgrade occluded agents regardless of distance.
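The temporal offset strategy only works if each agent's offset is stable from frame to frame; a fresh random number every frame would make agents near the boundary flicker between LODs. A simple way to get a stable offset is to hash the agent's ID. The sketch below assumes agents have a 32-bit integer ID; the hash is a generic integer mixer and every name here is hypothetical.

```cpp
#include <cstdint>

// Generic 32-bit integer mixer. Any well-mixing hash would do.
constexpr std::uint32_t HashAgentId(std::uint32_t id) {
    std::uint32_t x = id + 0x9E3779B9u;
    x ^= x >> 16;
    x *= 0x85EBCA6Bu;
    x ^= x >> 13;
    x *= 0xC2B2AE35u;
    x ^= x >> 16;
    return x;
}

// Per-agent switch distance in [baseM - jitterM, baseM + jitterM].
// The same agent ID always yields the same threshold.
constexpr float JitteredThresholdM(float baseM, float jitterM, std::uint32_t agentId) {
    const float unit =
        static_cast<float>(HashAgentId(agentId) & 0xFFFFu) / 65535.0f;  // [0, 1]
    return baseM + (unit * 2.0f - 1.0f) * jitterM;
}

// Tier 1 to Tier 2 switch at 25m with a 2m jitter, giving the 23-27m spread
// described above.
constexpr bool UsesSkeletalMesh(float distanceM, std::uint32_t agentId) {
    return distanceM < JitteredThresholdM(25.0f, 2.0f, agentId);
}
```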
Animation Instancing
Animation instancing is a technique that enables thousands of animated crowd agents to share animation data, dramatically reducing the per-agent cost of animation.
How It Works
Traditional skeletal mesh animation evaluates the animation pose per-skeleton. Each skeleton samples its animation curves, evaluates blend spaces, applies IK, and outputs bone transforms. For 1,000 agents, that is 1,000 independent animation evaluations.
Animation instancing moves animation evaluation to the GPU. The animation data is baked into a texture: per-frame vertex positions for a Vertex Animation Texture, or per-frame bone transforms for a bone animation texture. At render time, each instance reads its current animation frame from the shared texture, offset by a per-instance phase value, and applies the result in the vertex shader.
Cost comparison:
- CPU skeletal animation: 0.05-0.15ms per agent
- GPU animation instancing: 0.001-0.005ms per agent (amortized across instances)
For 1,000 agents, this is the difference between 50-150ms (impossible) and 1-5ms (comfortable).
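The per-instance work reduces to picking a frame from the shared texture. The sketch below shows that lookup on the CPU side in plain C++ for clarity; in practice the same arithmetic would run in the vertex shader. It assumes a looping clip baked with one texture row per frame, and `BakedClip`, `FrameIndex`, and `FrameRowV` are hypothetical names.

```cpp
// Hypothetical description of a baked, looping animation clip.
struct BakedClip {
    int frameCount;
    float framesPerSecond;
};

// Frame this instance samples at the given time. The phase offset is what
// keeps a thousand instances of the same clip from walking in lockstep.
// Assumes a non-negative time and phase.
constexpr int FrameIndex(const BakedClip& clip, float timeSeconds, float phaseOffsetSeconds) {
    const float frames = (timeSeconds + phaseOffsetSeconds) * clip.framesPerSecond;
    return static_cast<int>(frames) % clip.frameCount;
}

// Normalized V coordinate of the center of that frame's row in the texture.
constexpr float FrameRowV(const BakedClip& clip, int frameIndex) {
    return (static_cast<float>(frameIndex) + 0.5f) / static_cast<float>(clip.frameCount);
}
```

For a 30-frame clip baked at 30 frames per second, an instance with no phase offset samples frame 15 at 0.5 seconds, while an instance offset by 0.25 seconds samples frame 22 at the same moment.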
Implementation in UE5
UE5 supports animation instancing through several approaches:
Vertex Animation Textures (VAT): Bake skeletal animation into a texture that stores vertex positions per frame. Apply this texture in a material shader that reads vertex positions based on a time parameter. This works with static meshes (no skeleton needed at runtime).
Tools like the SideFX Houdini Game Development Toolset can generate VAT from skeletal animations. The resulting texture and material can be used with instanced static meshes for maximum performance.
Animation Sharing Plugin: UE5 includes an Animation Sharing plugin that pools animation evaluation across multiple skeletal meshes. Instead of evaluating the same walk animation 100 times, it evaluates it once and applies the result to all instances. The per-instance cost is reduced to bone transform copying rather than full evaluation.
Custom GPU animation system: For advanced cases, implement a custom system that stores bone transforms in a structured buffer, evaluates animation on the GPU via compute shaders, and applies transforms during vertex processing. This provides maximum control and performance but requires significant engineering.
Choosing the Right Approach
| Approach | Quality | Performance (approx. cost per agent) | Complexity |
|---|---|---|---|
| Standard Skeletal Mesh | Highest | Lowest (0.1ms) | None |
| Animation Sharing | High | Medium (0.02ms) | Low |
| VAT Static Mesh | Medium | Highest (0.002ms) | Medium |
| Custom GPU Animation | High | Very high (0.003ms) | Very High |
For most projects, use standard skeletal meshes for Tier 1 (near) agents and VAT static meshes for Tier 2+ (mid/far) agents. This provides high quality where it matters and maximum performance where it does not.
Combining Static Scatter with Dynamic Crowds
Open-world environments need both static population (parked cars, market stalls, street furniture, NPCs in fixed positions) and dynamic crowds (moving pedestrians, flowing traffic). These two systems should complement each other, not compete.
Static Population with Procedural Placement Tool
Our Procedural Placement Tool excels at populating environments with static elements at scale — scattering 100,000+ instances per second across terrain and surfaces.
Use the Procedural Placement Tool for:
- Parked vehicles along streets and in parking lots
- Street furniture — benches, trash cans, mailboxes, fire hydrants, newspaper stands
- Market stalls and vendor carts in marketplace areas
- Static NPC groups — seated restaurant patrons, vendors behind counters, guards at fixed posts
- Vegetation that borders pedestrian areas — potted plants, street trees, flower boxes
These static elements provide visual density and environmental storytelling without any simulation cost. They fill the environment so that dynamic crowd agents are supplementary rather than solely responsible for making the world feel alive.
Integration Strategy
Layer 1 (Static, Procedural Placement Tool): Dense environmental population. Hundreds to thousands of placed instances. Zero simulation cost. This is the foundation — the world looks populated even before a single dynamic crowd agent appears.
Layer 2 (Background Crowd, Mass Entity Tier 3-4): Low-cost instanced crowd agents that provide movement and life in the middle and far distance. Thousands of agents, minimal per-agent cost. This layer adds the perception of a bustling city without expensive simulation.
Layer 3 (Foreground Crowd, Mass Entity Tier 1-2): Full-fidelity crowd agents near the player. Dozens of agents with skeletal animation, avoidance behavior, and smart object interaction. This layer provides close-up believability.
Layer 4 (Gameplay NPCs, Standard AI): Important NPCs — quest givers, vendors, enemies, allies — that use traditional UE5 AI with behavior trees, full physics, and gameplay interaction. These are limited in number (10-30 near the player) and are the only agents with full gameplay capability.
Spawning Dynamic Crowds Around Static Elements
Configure dynamic crowd agents to interact with static elements placed by the Procedural Placement Tool:
- Crowd agents walk along sidewalks that pass between statically-placed market stalls
- Agents use Smart Objects tagged on statically-placed benches (sitting behavior)
- Traffic vehicles drive between statically-placed parked cars
- Crowd density adapts to the density of static elements — crowded marketplaces get more agents, quiet residential streets get fewer
This layered approach means each system does what it does best. The Procedural Placement Tool provides mass static population. Mass Entity provides dynamic movement and interaction. Together, they create convincing open worlds.
MCP Automation of Mass Entity Configuration
Setting up Mass Entity for a large open world involves repetitive configuration — zone graph creation, agent archetype setup, spawner placement, LOD tuning. The Unreal MCP Server can automate much of this.
Automated Zone Graph Generation
The Unreal MCP Server can scan a city level and generate zone graphs from road and sidewalk geometry:
- Identify road and sidewalk meshes by material type or naming convention
- Generate center-line splines for each sidewalk segment
- Create zone graph data actors along these splines
- Configure lane width, direction, and connectivity based on the geometry
- Build zone graph navigation data
This turns a manual process that takes hours per city block into an automated process that handles entire districts in minutes.
Agent Archetype Management
MCP can manage agent archetypes systematically:
- Create archetype variants for different crowd types (pedestrians, joggers, workers, tourists)
- Configure variation parameters (speed ranges, visual mesh assignments, behavior traits)
- Assign archetypes to spawners based on zone type (business district gets worker archetypes, park gets jogger archetypes, tourist area gets tourist archetypes)
LOD Tuning Automation
MCP can profile crowd performance and adjust LOD distances automatically:
- Measure frame time contribution of crowd agents at various distances
- Identify the distance at which skeletal mesh agents push beyond budget
- Adjust Tier 1 LOD distance to ensure no more than N skeletal mesh agents are visible
- Report optimal LOD transition distances for the current scene complexity
Spawner Configuration
For large open worlds with dozens of city blocks, MCP can:
- Place MassSpawner actors at regular intervals along zone graph paths
- Configure density per spawner based on the area type (commercial = dense, residential = sparse)
- Set up spawn budgets to limit total agent count per streaming level
- Configure despawn distances tied to World Partition streaming ranges
Network Considerations
Crowd simulation in multiplayer games presents unique networking challenges. Not every crowd agent needs to be synchronized between clients.
Client-Side Crowds
For most multiplayer games, crowds are cosmetic — they provide visual atmosphere but do not affect gameplay. In this case, crowds can be entirely client-side:
- Each client runs its own Mass Entity simulation
- Crowd agents are not replicated between clients
- Different clients may see different crowd configurations — this is acceptable because players rarely compare crowd details
Advantages: Zero network bandwidth for crowds. Each client optimizes its own crowd independently based on local hardware capability.
Limitations: Players in the same location see different crowd agents. If gameplay involves interacting with crowd agents (shooting into crowds, crowd reactions to events), different clients see different results.
Selective Synchronization
For games where crowd interaction matters:
- Event synchronization: Replicate crowd events (a crowd agent reacting to an explosion) as server-authoritative events. All clients play the same reaction at the same location.
- Key agent synchronization: Promote important crowd agents to replicated actors when they become gameplay-relevant (a crowd agent that witnesses a crime, a pedestrian that the player talks to). Standard crowd agents remain client-side.
- Seed synchronization: Use identical random seeds on all clients so that crowd spawning and behavior produces the same results deterministically. This requires careful control of randomization sources.
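For seed synchronization, the main trap is any source of randomness whose output differs between platforms. Standard library distributions such as `std::uniform_real_distribution` are not guaranteed to produce identical sequences across implementations, so a hand-written integer generator is a safer base. Below is a minimal sketch using the well-known xorshift32 step; `Spawn`, `DrawSpawn`, and the lane-based spawn layout are hypothetical.

```cpp
#include <cstdint>

// One xorshift32 step: pure integer math, identical on every platform.
// The state must be non-zero.
constexpr std::uint32_t NextRandom(std::uint32_t state) {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

// Map to [0, 1) using the top 24 bits so the float conversion is exact.
constexpr float ToUnitFloat(std::uint32_t value) {
    return static_cast<float>(value >> 8) / 16777216.0f;
}

// Hypothetical spawn record: which lane, how far along it (0 to 1), and the
// generator state to pass to the next draw.
struct Spawn {
    std::uint32_t nextState;
    int laneIndex;
    float lanePosition01;
};

// Every client that starts from the same seed and makes the same sequence of
// draws gets the same spawns.
constexpr Spawn DrawSpawn(std::uint32_t state, int laneCount) {
    const std::uint32_t a = NextRandom(state);
    const std::uint32_t b = NextRandom(a);
    return Spawn{b,
                 static_cast<int>(a % static_cast<std::uint32_t>(laneCount)),
                 ToUnitFloat(b)};
}
```

The requirement that remains is ordering: clients must make the same draws in the same order, which is the "careful control of randomization sources" mentioned above.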
Crowd Density Scaling in Multiplayer
In multiplayer, each player's client renders crowds near their own camera. If 4 players are in different parts of the city, any machine that simulates crowds around every player (for example, a server or listen-server host running synchronized crowds) could be handling up to 4x the single-player agent count. Implement crowd budgets that scale with player count and with each client's hardware capability.
Sound Design for Crowds
Crowd audio is essential for believability but challenging to implement efficiently with thousands of agents.
Ambient Crowd Sound
Rather than individual audio per agent, use ambient crowd sound that scales with crowd density:
Crowd murmur layers: Create looping ambient sound cues with multiple crowd murmur layers — quiet (sparse crowd), moderate (medium crowd), dense (packed crowd). Blend between layers based on the number of crowd agents within a radius of the listener.
Footstep aggregate: Instead of individual footstep sounds per agent, use a procedural footstep system that plays randomized footstep sounds at a rate proportional to nearby agent count. 50 nearby agents might trigger 10-15 footstep sounds per second from random positions within the crowd area.
Vehicle traffic ambience: Similar approach for traffic — a looping traffic ambience layer that scales with vehicle density, supplemented by individual engine/horn sounds for the closest vehicles.
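The density-driven layers can be sketched as a few small functions that map a nearby agent count to layer volumes and a footstep rate. The thresholds (10, 40, and 100 agents), the rate cap, and all names are assumptions for illustration. The default of 0.25 footsteps per agent per second is chosen so that 50 agents land inside the 10-15 per second range mentioned above.

```cpp
struct MurmurMix {
    float quiet;
    float moderate;
    float dense;
};

// 0 below `from`, 1 above `to`, linear in between.
constexpr float Ramp(int value, int from, int to) {
    return value <= from ? 0.0f
         : value >= to   ? 1.0f
         : static_cast<float>(value - from) / static_cast<float>(to - from);
}

// Crossfade three looping murmur layers by nearby agent count.
// Each layer fades in as the count approaches its threshold and fades out
// as the next layer takes over.
constexpr MurmurMix MixForAgentCount(int nearbyAgents, int sparseAt = 10,
                                     int moderateAt = 40, int denseAt = 100) {
    const float intoQuiet = Ramp(nearbyAgents, 0, sparseAt);
    const float intoModerate = Ramp(nearbyAgents, sparseAt, moderateAt);
    const float intoDense = Ramp(nearbyAgents, moderateAt, denseAt);
    return MurmurMix{intoQuiet * (1.0f - intoModerate),
                     intoModerate * (1.0f - intoDense),
                     intoDense};
}

// Aggregate footstep trigger rate, capped so dense crowds do not flood the mixer.
constexpr float FootstepsPerSecond(int nearbyAgents, float perAgentRate = 0.25f,
                                   float maxRate = 20.0f) {
    return nearbyAgents * perAgentRate > maxRate ? maxRate
                                                 : nearbyAgents * perAgentRate;
}
```

The same pattern applies to traffic ambience, with vehicle count in place of agent count.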
Individual Sounds for Near Agents
For Tier 1 (near) agents, add individual audio:
- Footstep sounds tied to animation events
- Occasional voiced lines (coughs, phone conversations, greetings)
- Clothing rustle from material interaction
Limit concurrent individual agent sounds to 5-10 using UE5's sound concurrency system. Only the nearest and most relevant agents play individual audio.
Spatial Audio Integration
Use UE5's audio attenuation and spatialization to position crowd sounds correctly:
- Near agent sounds are fully spatialized (3D positioned)
- Ambient crowd murmur uses distance-based volume and filtering but not precise spatialization
- Traffic sounds use audio volumes along road corridors
Common Pitfalls
Pitfall 1: Too Many Skeletal Mesh Agents
The most common crowd performance issue is insufficient LOD aggressiveness. Developers want their crowds to look good at distance, so they extend skeletal mesh LOD ranges too far. Even 50 skeletal mesh agents cost roughly 2.5-7.5ms per frame in animation evaluation alone, based on the 0.05-0.15ms per-character figure earlier in this article.
Fix: Be aggressive with LOD distances. 20-30m for skeletal mesh is usually sufficient. Most players cannot distinguish skeletal animation from VAT animation beyond 15-20 meters.
Pitfall 2: Ignoring Memory Budget
10,000 crowd agents with unique meshes, materials, and animation assets consume significant memory. Each variation multiplied by thousands of instances adds up.
Fix: Use a limited set of mesh variations (8-12 body meshes with modular texture/color variation). Share animation assets across all agents. Use material instances with per-instance color parameters rather than unique materials per variation.
Pitfall 3: Crowd Agents Blocking Gameplay
Crowd agents that obstruct the player, block pathways, or interfere with combat break gameplay. This is a game design problem, not a technical one, but it is critical.
Fix: Implement crowd avoidance of the player and gameplay-important areas. Crowd agents within a radius of the player should move out of the way. During combat, clear crowds from combat zones (they flee or despawn). Never let crowd agents block doorways, quest objectives, or navigation chokepoints.
Pitfall 4: Uniform Crowd Behavior
All agents walking at the same speed in the same direction looks robotic. This breaks immersion more than low polygon counts.
Fix: Add variation to every parameter. Speed varies by +/-20%. Agents occasionally stop, look around, change direction, interact with smart objects. Mix walking and standing agents. Group some agents together (walking in pairs or groups) and let others walk alone.
Pitfall 5: Crowds Disappearing When the Player Turns
If crowds only exist in the player's view frustum, turning the camera causes agents to visibly pop in and out.
Fix: Simulate and persist agents in a radius around the player, not just in the view frustum. Agents behind the camera still exist and move, so they are already in position when the camera turns back. The render frustum culls them for free — the simulation cost of off-screen agents is minimal compared to rendering.
Pitfall 6: Not Testing on Target Hardware
A crowd that runs at 60fps on your development RTX 4080 may run at 25fps on a target console. Performance testing must happen on the lowest target hardware early in development.
Fix: Establish agent count budgets per platform early. Test on console devkits regularly. Implement scalable crowd density that reduces agent counts on lower-end hardware.
Practical Implementation Checklist
Here is a step-by-step checklist for implementing crowds and traffic in your UE5 open-world project:
- Define agent budgets per platform. PC high: 10,000. PC low: 3,000. Console: 5,000.
- Create zone graphs for all pedestrian and vehicle routes
- Build agent archetypes with 3-4 pedestrian types and 2-3 vehicle types
- Create LOD meshes for each agent type — skeletal mesh, VAT mesh, instanced mesh
- Configure LOD distances — start aggressive (20m skeletal), refine based on testing
- Place MassSpawners in each city block/area
- Populate static elements with the Procedural Placement Tool — parked cars, furniture, static NPCs
- Set up crowd audio — ambient layers and near-agent individual sounds
- Implement crowd management — player avoidance, combat clearing, density scaling
- Profile on target hardware and adjust budgets
- Automate configuration with the Unreal MCP Server for iteration speed
Conclusion
Building crowds and traffic in UE5 is achievable at scale thanks to Mass Entity and MassAI. The key principles:
- Use the right system for the right agent. ECS-based Mass Entity for crowds, traditional AI for gameplay-important NPCs. Do not try to make one system do everything.
- LOD is everything. By the figures in this article, 100 standard AI characters cost 13-41ms per frame, while 10,000 Mass Entity agents with LOD-based representation measured 4.6ms on an RTX 4080. That gap only holds if your LOD strategy is aggressive enough.
- Layer static and dynamic. Use the Procedural Placement Tool for dense static population and Mass Entity for dynamic crowds. Together, they create worlds that feel alive.
- Automate with MCP. Zone graph creation, archetype management, and spawner configuration across a large open world is impractical without automation. The Unreal MCP Server makes it manageable.
- Budget by platform. Establish agent count limits per platform early. Test on target hardware regularly. Scale density dynamically.
- Audio sells the crowd. Ambient crowd sound scaled by density plus individual sounds for near agents creates convincing atmosphere without excessive audio overhead.
10,000 NPCs at 60fps is not a marketing gimmick. With Mass Entity, proper LOD management, and smart integration between static and dynamic population, it is a practical, shippable reality.