StraySpark | March 13, 2026 | 5 min read
Procedural Placement + Nanite Foliage: Performance Guide for Dense Open Worlds in UE 5.7 
Tags: Unreal Engine, Nanite, Foliage, Performance, Procedural Placement, Open World, Optimization

Dense foliage is one of the most visually impactful elements of an open world game. It's also one of the most expensive. A forest that looks stunning in a screenshot can destroy your frame budget the moment a player walks through it.

Unreal Engine 5.7 has improved the situation significantly. Nanite support for foliage meshes, better instance culling, and refined World Partition streaming make dense open worlds more achievable than ever. But "more achievable" isn't the same as "easy" — there are specific techniques and settings that determine whether your procedurally-placed forest runs at 60fps or 30fps.

This guide covers the technical details of making dense Nanite foliage work at production frame rates. We'll go through draw call management, instanced rendering, LOD strategies, memory budgeting, streaming configuration, and profiling workflows — with specific numbers and before/after benchmarks where relevant.

The Performance Landscape in UE 5.7

Before diving into optimization techniques, let's establish what's changed in UE 5.7 compared to 5.4/5.5.

Nanite Foliage: What's New

UE 5.4 introduced Nanite support for foliage, but with significant limitations — no skeletal mesh support, limited wind animation, and performance overhead for small instances. UE 5.7 addresses most of these:

  • Nanite skeletal mesh support — foliage can now use skeletal meshes for wind animation and interaction, with Nanite handling the geometry. This eliminates the previous trade-off between visual fidelity (Nanite) and animation (skeletal mesh with traditional LODs).
  • Improved small-triangle culling — Nanite's cluster culling now handles small-triangle meshes (grass, ground cover) more efficiently. Previous versions had overhead per-instance that made very small meshes slower with Nanite than with traditional rendering.
  • Better instance streaming — Nanite instance data integrates with World Partition streaming, so instances outside the streaming radius are properly unloaded rather than occupying memory.
  • Reduced GPU memory overhead — Nanite mesh data sharing between instances is more aggressive, reducing the per-instance memory cost.

These improvements matter for procedurally-placed foliage because the density numbers are higher than hand-placed scenes. A procedurally-scattered forest might place 50,000-200,000 individual foliage instances in a single World Partition cell. At that scale, per-instance overhead multiplies quickly.

Key Performance Metrics

For dense foliage scenes, these are the metrics that matter:

  • GPU scene update time — how long the GPU spends updating instance transforms and visibility. Target: under 2ms.
  • Nanite rasterization time — how long Nanite takes to render visible geometry. Target: under 4ms for foliage (leaving budget for terrain, structures, characters).
  • Instance culling time — how long the CPU spends determining which instances are visible. Target: under 1ms.
  • Memory usage — total GPU memory consumed by Nanite mesh data and instance data. Target: under 2GB for foliage on a mid-range GPU.
  • Streaming throughput — how quickly new instances load as the player moves through the world. Target: no visible pop-in at intended movement speed.

These targets assume a 60fps budget (16.67ms total frame time) on a mid-range 2024 GPU (RTX 4070 / RX 7800 XT class). Adjust proportionally for your target hardware.
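
As a rough illustration of how these targets can be checked against captured numbers, the sketch below compares measured timings with the targets listed above. The metric labels and the `over_budget` helper are hypothetical names, not engine identifiers; the sample measurements are the unoptimized baseline values from the benchmark section later in this guide.

```python
# Targets from the list above, in milliseconds. The dictionary keys are
# illustrative labels, not engine stat names.
TARGETS_MS = {
    "gpu_scene_update": 2.0,
    "nanite_rasterize": 4.0,
    "instance_culling_cpu": 1.0,
}

FRAME_BUDGET_MS = 1000.0 / 60.0  # total frame time available at 60fps


def over_budget(measured_ms, targets_ms=TARGETS_MS):
    """Return {metric: overshoot_ms} for every measured metric above its target."""
    return {
        name: round(measured_ms[name] - limit, 3)
        for name, limit in targets_ms.items()
        if name in measured_ms and measured_ms[name] > limit
    }


# Unoptimized baseline from the benchmark section (GPU scene update is not reported there).
baseline = {"nanite_rasterize": 8.2, "instance_culling_cpu": 3.1}
print(over_budget(baseline))
```

Metrics that were not captured are simply skipped, so the same helper works with partial profiling data.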

Draw Call Management

Draw calls are the traditional bottleneck for dense foliage. Each unique mesh-material combination submitted to the GPU is a draw call, and the CPU has a per-frame budget for how many it can issue.

Nanite Eliminates Traditional Draw Call Overhead

With Nanite foliage, individual draw calls per instance are replaced by Nanite's cluster-based rendering. Nanite batches all visible geometry into a single rendering pass, regardless of how many individual instances exist. This means going from 10,000 to 100,000 foliage instances doesn't proportionally increase draw calls.

However, Nanite has its own overhead per unique mesh type. Each unique Nanite mesh in the scene adds to the GPU scene complexity. The practical impact:

  • 50 unique foliage mesh types — negligible overhead
  • 200 unique mesh types — measurable overhead (0.5-1ms)
  • 500+ unique mesh types — significant overhead, consider reducing variety

For procedurally-placed foliage, this means your mesh palette size matters more than your instance count. A forest with 100,000 instances of 20 mesh types is cheaper than 50,000 instances of 200 mesh types.

Hierarchical Instanced Static Mesh (HISM) Tuning

Even with Nanite, foliage uses Hierarchical Instanced Static Mesh (HISM) components for spatial organization and culling. The HISM settings affect how efficiently the CPU can cull instances:

Cluster Size

HISM Cluster Size: 64 (default) → 128 (recommended for dense foliage)

The cluster size determines how many instances are grouped together for culling. Larger clusters mean fewer culling tests but coarser results, because a cluster is kept or rejected as a whole. For dense foliage, increasing from 64 to 128 reduces CPU culling time by approximately 30%, and few additional off-screen instances slip through, because foliage is dense enough that most clusters are fully visible anyway.

Max Instance Count Per Component

Max Instance Count: 65536 (default) → 32768 (recommended)

Counter-intuitively, reducing the max instance count per HISM component can improve performance. Smaller components mean more components but better spatial locality for culling. With World Partition, each component maps to a streaming cell, and smaller components enable more granular streaming.
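
To make the grouping arithmetic concrete, the sketch below counts how many components and culling clusters a given instance count splits into. It assumes every component and cluster is filled evenly, which real spatial partitioning will not do exactly, and the function name is illustrative.

```python
import math


def hism_layout(instance_count, max_per_component, cluster_size):
    """Estimate (component count, cluster count), assuming evenly filled groups."""
    components = math.ceil(instance_count / max_per_component)
    clusters = math.ceil(instance_count / cluster_size)
    return components, clusters


# 180,000 instances, the total used in the benchmark scene later in this guide.
print(hism_layout(180_000, 65_536, 64))   # default settings
print(hism_layout(180_000, 32_768, 128))  # recommended settings
```

With the recommended settings the same scene splits into twice as many components but about half as many clusters to test.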

Instance Culling Optimization

With hundreds of thousands of foliage instances, culling efficiency determines whether your CPU frame budget is eaten by visibility calculations.

Frustum Culling

Frustum culling — removing instances outside the camera's view — is the first line of defense. It's handled automatically by the HISM system, but you can influence its efficiency:

  • Bounds accuracy — HISM components use hierarchical bounding boxes. If your foliage meshes have overly large bounds (common with meshes that have long, thin leaves extending far from the center), the bounding boxes are larger than necessary, and instances are considered "visible" when only empty space is in frame. Clean up mesh bounds in your asset preparation pipeline.
  • Culling distance — set per-foliage-type culling distances that match the visual importance of each type. Grass can cull at 50-80m. Bushes at 100-150m. Trees at 500-1000m+. This is the single highest-impact optimization for dense foliage.
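
A minimal sketch of per-type cull distances, using the values applied in the benchmark section of this guide. The type labels and helper names are illustrative; in the engine these distances are set on each foliage type and are not evaluated by user code.

```python
# End cull distances in meters, taken from the benchmark section of this guide.
# The type names are illustrative labels.
CULL_DISTANCE_M = {
    "grass": 60.0,
    "ground_cover": 100.0,
    "bush": 200.0,
    "tree": 800.0,
}


def is_within_cull_distance(foliage_type, distance_m):
    """True if an instance of this type should still be drawn at this distance."""
    return distance_m <= CULL_DISTANCE_M[foliage_type]


def visible_types(distance_m):
    """List the foliage types still drawn at a given camera distance."""
    return [t for t in CULL_DISTANCE_M if is_within_cull_distance(t, distance_m)]


print(visible_types(150.0))
```

At 150m only bushes and trees remain, which is where most of the instance-count savings come from.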

Occlusion Culling

Occlusion culling — removing instances hidden behind other geometry — is more complex with Nanite because Nanite handles some occlusion internally during rasterization. However, CPU-side occlusion culling still matters for preventing instances from being submitted to the GPU scene.

UE 5.7's improved occlusion system provides hardware occlusion queries that feed back to the HISM culling system. For this to work effectively:

  • Enable hardware occlusion queries in Project Settings → Rendering → Culling
  • Set occlusion query granularity to match your HISM cluster size. If clusters are 128 instances, occlusion queries should test at the cluster level, not per-instance
  • Large occluders — terrain, buildings, and cliff faces are effective occluders for foliage behind them. Make sure these have proper occlusion proxy geometry

Distance-Based Density Reduction

A technique that's particularly effective for procedurally-placed foliage: reduce instance density based on distance from the camera. The idea is simple — at 200m away, the player can't distinguish between 100% and 50% foliage density. At 500m, 25% density looks identical to 100%.

Implementation approaches:

Per-foliage-type end cull distance — the simplest approach. Set each foliage type's end cull distance to progressively remove small foliage at distance. Ground cover fades out at 50m, small plants at 100m, bushes at 200m. Only trees persist to maximum view distance.

HISM density scaling — UE 5.7 supports per-LOD density scaling on HISM components. At LOD transitions, you can skip rendering a percentage of instances. This is more granular than cull distance because it operates within the visible range.

Procedural placement with distance-aware density — if you're generating foliage placement at runtime or during a build step, you can generate sparser placement data for distant regions. The Procedural Placement Tool supports distance-based density falloff as a placement parameter, generating fewer instances in areas that are typically viewed from afar.
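
The idea of distance-aware density can be sketched as a falloff curve plus a stable per-instance decision. The 200m and 500m points come from the text above; the 0m point, the linear interpolation between points, and the hash-based selection are assumptions of this sketch and do not describe how the engine or the Procedural Placement Tool implements it.

```python
import zlib

# (distance in meters, fraction of instances kept). Only the 200 m and 500 m
# values come from the article; everything else is an assumption.
DENSITY_CURVE = [(0.0, 1.0), (200.0, 0.5), (500.0, 0.25)]


def keep_fraction(distance_m):
    """Piecewise-linear fraction of instances to keep at a given distance."""
    if distance_m <= DENSITY_CURVE[0][0]:
        return DENSITY_CURVE[0][1]
    for (d0, f0), (d1, f1) in zip(DENSITY_CURVE, DENSITY_CURVE[1:]):
        if distance_m <= d1:
            t = (distance_m - d0) / (d1 - d0)
            return f0 + t * (f1 - f0)
    return DENSITY_CURVE[-1][1]


def keep_instance(instance_id, distance_m):
    """Stable decision: the same ID always maps to the same threshold."""
    threshold = (zlib.crc32(str(instance_id).encode()) % 10_000) / 10_000
    return threshold < keep_fraction(distance_m)
```

Because each instance has a fixed threshold and the curve only decreases with distance, an instance that survives at long range also survives at any shorter range, so thinning never makes a nearby plant vanish while a farther one stays.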

LOD Transitions with Nanite Skeletal Meshes

Nanite's automatic LOD system works differently for foliage than for static objects. Understanding these differences is critical for performance tuning.

How Nanite LOD Works for Foliage

Nanite reduces triangle count based on screen-space size. For foliage, this means:

  • Distant trees automatically render at dramatically reduced triangle counts (potentially fewer than 100 triangles for a tree that has 50,000 at full resolution)
  • The transition is seamless — no visible LOD popping, which was a major problem with traditional foliage LODs
  • However, Nanite's LOD reduction for organic shapes (trees, bushes) can produce visual artifacts at extreme reduction levels. Thin branches may disappear while they are still large enough on screen to be noticed, creating a "melting" look at medium-long distances.

Mitigating Nanite LOD Artifacts

Custom Nanite fallback meshes. For tree types that have complex silhouettes (branches, leaves extending far from the trunk), create a custom fallback mesh that maintains the silhouette shape at low triangle counts. Set this as the Nanite fallback mesh in the static mesh settings.

Nanite pixel error threshold tuning.

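r.Nanite.MaxPixelsPerEdge = 1.0 (default) → 2.0 (cheaper distant foliage, at the cost of thin features)

r.Nanite.MaxPixelsPerEdge sets the triangle edge length, in pixels, that Nanite aims for on screen. Raising it makes Nanite choose coarser clusters, so fewer triangles are rendered at distance and rasterization gets cheaper, but thin features such as twigs collapse sooner. Lowering it back toward 1.0 preserves that detail at the cost of more triangles. This setting therefore trades detail against cost rather than removing artifacts. It is a global console variable, so it affects other Nanite geometry as well. Experiment with values between 1.0 and 3.0, and step back toward 1.0 if thin branches start to break up at medium-long range.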

Billboard crossfade distance. For the furthest distance, where even Nanite LOD isn't cheap enough, consider using impostor/billboard rendering. UE 5.7 supports Nanite-to-billboard crossfading where the Nanite mesh fades out while a billboard representation fades in. This is configured per foliage type:

Billboard Start Distance: 800m
Billboard Full Distance: 1000m
Crossfade Range: 200m

The billboard textures need to be generated from your actual foliage meshes for visual consistency. Several marketplace tools automate this process.
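
The crossfade range above can be read as a blend factor that ramps from 0 to 1 between the start and full distances. The sketch below assumes a linear ramp, which is an illustrative choice; the engine may use a different curve or a dithered transition.

```python
def billboard_blend(distance_m, start_m=800.0, full_m=1000.0):
    """Billboard opacity: 0.0 = Nanite mesh only, 1.0 = billboard only."""
    if full_m <= start_m:
        raise ValueError("full_m must be greater than start_m")
    t = (distance_m - start_m) / (full_m - start_m)
    return min(1.0, max(0.0, t))


for d in (700.0, 900.0, 1000.0):
    print(d, billboard_blend(d))
```

The Nanite mesh opacity is the complement of this value, so the two representations always sum to full coverage during the 200m crossfade.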

Nanite Skeletal Mesh Considerations

UE 5.7's Nanite skeletal mesh support enables wind-animated foliage with Nanite rendering. The performance considerations differ from static Nanite meshes:

  • Bone count matters. Each bone adds to the per-instance animation cost. For foliage wind animation, you typically need only 3-8 bones (trunk, major branches, leaf clusters). More than 12 bones per foliage mesh is excessive for background vegetation.
  • Animation update rate. Wind animation doesn't need to update at full frame rate. UE 5.7 supports LOD-based animation rate reduction — distant instances can update at 15fps while nearby instances update at 60fps with no visible difference.
  • GPU skinning budget. Nanite skeletal mesh skinning is GPU-accelerated, but it's still a per-instance cost. Budget 0.5-1ms for foliage skinning with 50,000+ visible animated instances on a mid-range GPU.

Memory Budgeting

Dense Nanite foliage can consume significant GPU memory. Understanding where that memory goes helps you make informed decisions about density and variety.

Memory Breakdown

For a typical dense forest scene with 100,000 visible instances across 30 mesh types:

| Category | Memory | Notes |
|---|---|---|
| Nanite mesh data (shared) | 200-500 MB | Shared across all instances of each type. Higher for more mesh types and higher-poly meshes |
| Instance transforms | 50-100 MB | 64 bytes per instance (transform + metadata) |
| HISM spatial data | 20-40 MB | Bounding volume hierarchy for culling |
| Nanite visibility buffer | 50-100 MB | Screen-resolution dependent, shared with all Nanite rendering |
| Total | 320-740 MB | |

Memory Optimization Strategies

Reduce unique mesh count. This has the highest impact on the "Nanite mesh data" category. Going from 50 unique tree types to 20 saves 200-300 MB of GPU memory. Use material variation (color tinting, bark texture swaps) to maintain visual variety with fewer unique meshes.

Optimize source mesh triangle counts. Nanite stores mesh data more efficiently than traditional meshes, but the source triangle count still affects memory. A tree with 100,000 triangles uses roughly 3x the Nanite memory of a tree with 30,000 triangles. For background foliage (not hero trees that players see up close), 20,000-40,000 source triangles is a reasonable target.

World Partition streaming radius. The streaming radius determines how many instances are loaded at once. A smaller radius means less memory but more streaming activity (and potential pop-in). For foliage:

Foliage Streaming Radius: 500m (trees) / 200m (bushes) / 80m (ground cover)

These values are starting points. Adjust based on your game's movement speed — a game where the player moves at 50 m/s needs a larger radius than one where the player walks at 5 m/s.
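
The relationship between radius and movement speed is simple arithmetic: the radius divided by the speed is the time the streaming system has before the player reaches newly loaded content. The sketch below assumes straight-line movement directly toward that content, and the function names are illustrative.

```python
def streaming_lead_time_s(radius_m, speed_m_per_s):
    """Seconds between content entering the streaming radius and the player reaching it."""
    return radius_m / speed_m_per_s


def radius_for_lead_time(lead_time_s, speed_m_per_s):
    """Radius needed to keep a given lead time at a given movement speed."""
    return lead_time_s * speed_m_per_s


print(streaming_lead_time_s(500.0, 5.0))   # walking pace
print(streaming_lead_time_s(500.0, 50.0))  # fast vehicle
```

A 500m tree radius gives 100 seconds of lead time at 5 m/s but only 10 seconds at 50 m/s, which is why fast traversal needs a larger radius to hide pop-in.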

Instance data compression. UE 5.7 supports compressed instance transforms for foliage. If your foliage doesn't need full 32-bit precision for position and rotation (most doesn't), enabling 16-bit instance data halves the instance transform memory:

Project Settings → Rendering → Foliage → Use Compressed Instance Data: True

This saves 25-50 MB for dense scenes with negligible visual impact.
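
To sanity-check instance memory for your own scene, the 64 bytes per instance figure from the memory breakdown above can be multiplied out directly. The sketch below models compression as a flat 50% reduction, following the "halves" claim above, and uses decimal megabytes; both are simplifying assumptions, and the helper names are illustrative.

```python
BYTES_PER_INSTANCE = 64  # figure quoted in the memory breakdown above
MB = 1_000_000           # decimal megabytes, an assumption of this sketch


def transform_memory_mb(instance_count, compressed=False):
    """Instance transform memory in MB; compression is modeled as a flat 50% cut."""
    bytes_each = BYTES_PER_INSTANCE // 2 if compressed else BYTES_PER_INSTANCE
    return instance_count * bytes_each / MB


def instances_for_memory(memory_mb):
    """Uncompressed instance count that would occupy the given amount of memory."""
    return int(memory_mb * MB // BYTES_PER_INSTANCE)


print(transform_memory_mb(100_000))
print(transform_memory_mb(100_000, compressed=True))
print(instances_for_memory(50), instances_for_memory(100))
```

At 64 bytes each, 100,000 instances account for about 6.4 MB, so the 50-100 MB range in the breakdown corresponds to roughly 0.8-1.6 million resident instances. One plausible reading is that the range counts every streamed-in instance and not only the visible ones; treat that as an assumption and measure in your own project.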

Streaming Configuration

Open world foliage requires careful streaming configuration. Loading too aggressively wastes memory and causes hitches. Loading too lazily causes visible pop-in.

World Partition Cell Size

For foliage-heavy open worlds, the World Partition cell size directly affects streaming granularity:

  • Large cells (512m) — fewer cells to manage, but each cell contains more foliage data. Loading a single cell can cause a hitch if it contains dense forest.
  • Small cells (128m) — more granular streaming, smaller per-cell hitches, but more streaming operations and more cell management overhead.
  • Recommended: 256m cells for foliage — this is a good middle ground that keeps per-cell foliage counts manageable (typically 5,000-20,000 instances per cell in dense areas) while limiting the total number of active cells.

Async Loading Configuration

Foliage instance loading should be asynchronous to avoid frame hitches. Key settings:

s.AsyncLoadingTimeLimit = 2.0  (ms per frame for async loading)
s.AsyncLoadingUseFullTimeLimit = 0  (don't use full limit if not needed)

These settings cap how much frame time async loading can consume. 2ms is conservative — you can increase to 3-4ms if your game has frame budget headroom and pop-in is noticeable.

Priority-Based Streaming

Not all foliage types need to load at the same priority. Trees are visible from further away and have more visual impact than ground cover. Configure streaming priority per foliage type:

  1. Trees — highest priority, load first. They define the silhouette of the landscape.
  2. Large bushes — medium priority. Fill in the mid-ground.
  3. Ground cover and grass — lowest priority. Only needed when the player is nearby.

This ensures that as a player approaches a new area, trees appear first (maintaining the landscape silhouette), followed by progressive detail fill-in. This is visually superior to all foliage types loading simultaneously at their respective distances.
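
The load ordering described above can be sketched as a sort by priority first and distance second. The priority numbers, type labels, and request format are hypothetical; the engine's streaming system has its own scheduling and does not expose this exact interface.

```python
# Lower number = loads earlier. The ordering follows the list above; the
# numeric values and type names are illustrative.
STREAM_PRIORITY = {"tree": 0, "large_bush": 1, "ground_cover": 2}


def order_load_requests(requests):
    """Sort (foliage_type, distance_m) requests by priority, then nearest first."""
    return sorted(requests, key=lambda r: (STREAM_PRIORITY[r[0]], r[1]))


pending = [
    ("ground_cover", 40.0),
    ("tree", 450.0),
    ("large_bush", 150.0),
    ("tree", 300.0),
]
print(order_load_requests(pending))
```

Both tree requests are served before the nearby ground cover, which matches the goal of establishing the landscape silhouette first.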

Profiling Techniques

Optimization without profiling is guessing. Here's how to profile foliage performance effectively.

GPU Profiling

Unreal Insights — the primary GPU profiling tool. Key counters for foliage:

  • Nanite.Rasterize — time spent rasterizing Nanite geometry. If this exceeds 4ms and your scene is mostly foliage, you need to reduce visible triangle count (culling distances, Nanite pixel error, billboard transitions).
  • Nanite.CullInstances — GPU-side instance culling. Should be under 1ms.
  • GPUScene.Update — time to update the GPU scene with instance data. If high, you have too many instances changing state per frame (usually a streaming problem).

RenderDoc / Nsight Graphics — for detailed GPU analysis. Useful when you need to understand exactly why Nanite rasterization is slow (specific mesh types, specific screen regions).

CPU Profiling

  • FoliageInstancedStaticMesh — CPU time for foliage visibility and culling. If this exceeds 1-2ms, your HISM configuration needs tuning (cluster size, component count, culling distances).
  • WorldPartitionStreaming — streaming-related CPU time. If high, your cell sizes may be too small or your streaming configuration too aggressive.

Automated Profiling

For systematic optimization, automated profiling catches regressions that manual spot-checks miss. The Unreal MCP Server can automate profiling workflows — running stat commands, capturing frame data at specific locations, and generating performance reports across multiple test points in your world. This is particularly useful for open world games where performance varies dramatically by location.

A practical automated workflow:

  1. Define test points across your world (coordinates where foliage density varies)
  2. Script a flythrough that visits each test point
  3. Capture frame timing data at each point
  4. Flag any point where frame time exceeds your budget
  5. Iterate on optimization and re-run

This catches problems like "the forest in the northeast quadrant is 20% denser than everywhere else and drops to 45fps" before your players find it.
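
The workflow above can be sketched as a loop over test points that flags anything over budget. The capture step is passed in as a function because it depends on your tooling; `capture_frame_time_ms`, `ProfilePoint`, and the sample timings are all hypothetical and do not represent the Unreal MCP Server's actual interface.

```python
from dataclasses import dataclass


@dataclass
class ProfilePoint:
    name: str
    x: float
    y: float


def find_over_budget(points, capture_frame_time_ms, budget_ms=1000.0 / 60.0):
    """Visit each point, capture a frame time, and return the points over budget.

    capture_frame_time_ms is a caller-supplied function (hypothetical here) that
    moves the camera to a point and returns an averaged frame time in ms.
    """
    flagged = []
    for point in points:
        frame_ms = capture_frame_time_ms(point)
        if frame_ms > budget_ms:
            flagged.append((point.name, frame_ms, round(1000.0 / frame_ms, 1)))
    return flagged


# Stand-in capture function with made-up timings, for demonstration only.
fake_timings = {"northeast_forest": 22.2, "river_valley": 14.0}
points = [
    ProfilePoint("northeast_forest", 1500.0, 1500.0),
    ProfilePoint("river_valley", 400.0, 900.0),
]
print(find_over_budget(points, lambda p: fake_timings[p.name]))
```

Each flagged entry carries the point name, the frame time, and the equivalent frame rate, which is enough to rank locations for the next optimization pass.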

Before/After Benchmarks

To make these recommendations concrete, here are benchmarks from a test scene: a 2km x 2km forested landscape with procedurally-placed vegetation.

Test Scene Specification

  • Engine: UE 5.7.1
  • Hardware: RTX 4070, Ryzen 7 7800X3D, 32GB DDR5
  • Resolution: 1920x1080 with TSR Quality
  • Foliage: 180,000 total instances (8 tree types, 12 bush types, 6 ground cover types)
  • Terrain: 4 landscape components with full Nanite support
  • Placement: Generated via Procedural Placement Tool with biome-aware distribution

Unoptimized Baseline

| Metric | Value |
|---|---|
| Frame time (average) | 24.3ms (41fps) |
| Nanite.Rasterize | 8.2ms |
| Instance culling (CPU) | 3.1ms |
| GPU memory (foliage) | 1.8 GB |
| Visible instances (avg) | 95,000 |

After Optimization Pass 1: Culling Distance Tuning

Applied per-type culling distances: grass at 60m, ground cover at 100m, bushes at 200m, trees at 800m.

| Metric | Value | Change |
|---|---|---|
| Frame time | 18.1ms (55fps) | -25% |
| Nanite.Rasterize | 4.8ms | -41% |
| Instance culling (CPU) | 1.8ms | -42% |
| GPU memory | 1.4 GB | -22% |
| Visible instances | 42,000 | -56% |

Culling distance tuning alone brought the scene from 41fps to 55fps by dramatically reducing visible instance count.

After Optimization Pass 2: HISM Configuration

Increased cluster size from 64 to 128. Reduced max instances per component to 32768. Enabled compressed instance data.

| Metric | Value | Change from Pass 1 |
|---|---|---|
| Frame time | 16.8ms (59fps) | -7% |
| Nanite.Rasterize | 4.6ms | -4% |
| Instance culling (CPU) | 1.1ms | -39% |
| GPU memory | 1.1 GB | -21% |
| Visible instances | 42,000 | no change |

HISM tuning primarily reduced CPU culling time and memory usage. The GPU impact was smaller because visible instance count didn't change.

After Optimization Pass 3: Mesh and Nanite Tuning

Reduced unique mesh types from 26 to 18 (merged similar bush variants). Increased Nanite pixel error to 1.5 for foliage. Added billboard crossfade at 600m for trees.

| Metric | Value | Change from Pass 2 |
|---|---|---|
| Frame time | 14.2ms (70fps) | -15% |
| Nanite.Rasterize | 3.1ms | -33% |
| Instance culling (CPU) | 0.9ms | -18% |
| GPU memory | 0.8 GB | -27% |
| Visible instances | 38,000 | -10% |

Summary

| Configuration | FPS | GPU Foliage Memory |
|---|---|---|
| Unoptimized | 41 | 1.8 GB |
| + Culling distances | 55 | 1.4 GB |
| + HISM tuning | 59 | 1.1 GB |
| + Mesh & Nanite tuning | 70 | 0.8 GB |

From 41fps to 70fps — a 71% improvement — with no reduction in visual quality at normal gameplay camera distances. The forest looks identical in gameplay; it just renders more efficiently.
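
The percentages and frame rates in these tables follow from two small formulas, shown here so you can apply the same arithmetic to your own captures. The helper names are illustrative; the frame times are the averages reported above.

```python
def fps(frame_time_ms):
    """Frames per second for a given average frame time."""
    return 1000.0 / frame_time_ms


def percent_change(before, after):
    """Signed percentage change from before to after."""
    return (after - before) / before * 100.0


# Average frame times reported in the tables above, in milliseconds.
frame_times_ms = [24.3, 18.1, 16.8, 14.2]

for before, after in zip(frame_times_ms, frame_times_ms[1:]):
    change = percent_change(before, after)
    print(f"{before} ms -> {after} ms: {change:.1f}% ({fps(after):.1f} fps)")

print(f"Overall fps gain: {percent_change(41, 70):.1f}%")
```

The computed values agree with the tables to within a percentage point; the tables list whole-number percentages.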

Procedural Placement Considerations

When foliage is procedurally placed rather than hand-placed, there are additional optimization opportunities and pitfalls.

Density Control

Procedural placement systems generate instance counts based on density parameters. It's easy to set density too high: a "dense forest" parameter that produces 500 instances per 100 m² might look great in a small test area but destroy performance at scale.

Recommended density targets for 60fps on mid-range hardware:

  • Large trees (trunk > 30cm diameter): 2-5 per 100 m²
  • Small trees and tall bushes: 5-15 per 100 m²
  • Medium bushes: 10-30 per 100 m²
  • Ground cover: 30-80 per 100 m²
  • Grass: 50-150 per 100 m²

These are starting points. Your specific meshes, materials, and target hardware will shift these numbers. The key discipline is profiling after each density change.
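
To see what these densities mean at scale, the sketch below multiplies them out over a square area. It assumes every category covers the whole area at a uniform density, which is a worst case; the category labels and function names are illustrative.

```python
# Density targets from the list above, as (low, high) instances per 100 m².
# The dictionary keys are illustrative labels.
DENSITY_PER_100_M2 = {
    "large_trees": (2, 5),
    "small_trees_tall_bushes": (5, 15),
    "medium_bushes": (10, 30),
    "ground_cover": (30, 80),
    "grass": (50, 150),
}


def instances_in_square(side_m, density_per_100_m2):
    """Instance count for a square area at a uniform density."""
    return side_m * side_m / 100.0 * density_per_100_m2


def total_range_for_cell(side_m):
    """(low, high) total instances if every category covers the whole cell."""
    low = sum(instances_in_square(side_m, lo) for lo, _ in DENSITY_PER_100_M2.values())
    high = sum(instances_in_square(side_m, hi) for _, hi in DENSITY_PER_100_M2.values())
    return round(low), round(high)


print(total_range_for_cell(256))
```

For a fully covered 256m cell this comes to roughly 64,000-184,000 instances, in line with the 50,000-200,000 per-cell figure mentioned earlier in this guide. Cells with clearings, rock, or water will land lower.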

Biome-Aware Density

Smart procedural placement adjusts density based on context. Dense forest areas need more instances. Rocky cliffsides need fewer. Transition zones between biomes need careful blending. The Procedural Placement Tool handles this through biome layers with per-layer density and exclusion rules, ensuring that procedural placement doesn't generate instances in areas where they're not needed.

Build-Time vs Runtime Placement

For open world games, procedural placement should happen at build time (or as an editor tool), not at runtime. Runtime generation adds CPU cost per frame and makes streaming more complex. Generate placement data in the editor, bake it into World Partition cells, and stream the pre-generated data.

Runtime procedural placement is appropriate for small-scale decoration (foliage around a player-built structure, vegetation regrowth in a survival game) but not for world-scale open world foliage.

Closing Thoughts

Dense Nanite foliage in UE 5.7 is achievable at 60fps on mid-range hardware — but only with deliberate optimization. The key principles:

  1. Culling distances are your most impactful optimization. Set them per foliage type based on visual importance.
  2. Unique mesh count matters more than instance count when using Nanite.
  3. HISM configuration affects CPU cost. Tune cluster size and component limits for your density.
  4. Memory budgeting prevents GPU memory exhaustion. Use compressed instance data and appropriate streaming radii.
  5. Profile systematically. Automated profiling across multiple world locations catches problems before players do.

The combination of procedural placement for generating dense, natural-looking foliage and Nanite for rendering it efficiently is the most practical pipeline for indie teams building open world environments. The tools exist. The performance is achievable. The work is in the tuning — and the tuning is guided by profiling, not by guesswork.
