StraySpark | April 2, 2026 | 5 min read
UE5 Performance Profiling 101: Finding and Fixing Bottlenecks 
Tags: Unreal Engine, Optimization, Performance, Profiling, Tutorial

Why Profiling Comes First

The number one optimization mistake is guessing where the problem is. Developers spend hours rewriting systems they think are slow, only to discover the actual bottleneck was somewhere else entirely.

Profiling turns optimization from guesswork into science. You measure, identify the real bottleneck, fix it, measure again. Repeat until you hit your target frame rate.

Understanding Frame Budgets

Before profiling, know your target:

Target FPS    Frame Budget    Common Use Case
30 fps        33.33ms         Console open-world
60 fps        16.67ms         Standard gameplay, competitive
120 fps       8.33ms          Competitive shooters, high-refresh
144 fps       6.94ms          Esports, enthusiast PC

The CPU and GPU work in parallel, so the budget applies to each of them rather than to their sum. A 60fps target means both the CPU and the GPU must finish their work within 16.67ms. If either one exceeds the budget, you miss the frame.
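The budgets in the table are simply one second divided by the target frame rate. A minimal sketch of that arithmetic (the function name is illustrative, not part of any engine API):

```python
def frame_budget_ms(target_fps: float) -> float:
    """Return the time available per frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


# Reproduce the table rows for the common targets.
for fps in (30, 60, 120, 144):
    print(f"{fps} fps -> {frame_budget_ms(fps):.2f}ms")
```

This is handy when your target is not in the table, for example a 90 fps or 72 fps target.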

CPU vs GPU Bound

The first question when profiling: which one is the bottleneck?

Use stat unit in the console:

stat unit

This shows:

  • Frame: Total frame time
  • Game: Game thread time (gameplay logic, AI, physics simulation)
  • Draw: Render thread time (draw call preparation, culling)
  • GPU: GPU execution time
  • RHIT: Render Hardware Interface thread

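The largest of the Game, Draw, RHIT, and GPU values is your bottleneck (Frame is the total, so it tracks whichever of them is slowest). If GPU shows 20ms and Game shows 8ms, you're GPU-bound, and optimizing gameplay code won't help.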
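If you log stat unit readings by hand, the comparison can be sketched as follows. This is an illustration only: the function name is hypothetical, and the Draw and RHIT values in the usage example are made up (only the 20ms GPU and 8ms Game figures come from the text above).

```python
def find_bottleneck(game_ms, draw_ms, rhit_ms, gpu_ms, budget_ms):
    """Return the slowest stat unit entry and how far it is over budget.

    A negative second value means the slowest entry is still within budget.
    """
    timings = {"Game": game_ms, "Draw": draw_ms, "RHIT": rhit_ms, "GPU": gpu_ms}
    slowest = max(timings, key=timings.get)
    return slowest, timings[slowest] - budget_ms


# Assumed readings for a 60fps target (Draw and RHIT are invented for the example).
name, over_ms = find_bottleneck(game_ms=8.0, draw_ms=5.0, rhit_ms=2.0,
                                gpu_ms=20.0, budget_ms=16.67)
print(f"Bottleneck: {name}, {over_ms:.2f}ms over budget")
```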

Essential Stat Commands

Overview Stats

stat fps          // Frame rate and frame time
stat unit         // Per-thread frame time breakdown
stat unitgraph    // Visual graph of thread times over time

CPU Profiling

stat game              // Game thread breakdown
stat ai                // AI system costs
stat physics           // Physics simulation timing
stat anim              // Animation evaluation costs
stat particles         // Particle system costs
stat tickgroups        // Tick function costs by group
stat startfile         // Begin capturing stats to a file (legacy stats format)
stat stopfile          // Stop the capture (Unreal Insights reads .utrace traces, not stats files)

GPU Profiling

stat gpu               // GPU pass breakdown
stat scenerendering    // Rendering pipeline timing
stat rhi               // Draw calls, triangles drawn, and GPU memory stats
stat Nanite            // Nanite-specific rendering costs
stat LumenScene        // Lumen GI costs
stat shadowrendering   // Shadow map costs
profilegpu             // Detailed single-frame GPU capture

Memory

stat memory            // Overall memory usage
stat memoryplatform    // Platform-specific memory
stat streaming         // Texture and mesh streaming stats

Unreal Insights: The Power Tool

Unreal Insights is the most powerful profiling tool in UE5. It captures detailed timing data for every system across multiple frames.

Launching with Insights

# Launch editor with trace enabled
UnrealEditor.exe YourProject.uproject -trace=default,gpu,memory

# Or enable at runtime via console
trace.enable default,gpu

Reading the Timeline

The Timing Insights panel shows horizontal bars for each thread:

  • GameThread: All gameplay logic, Blueprints, AI, physics
  • RenderThread: Draw call preparation, culling, command list building
  • RHIThread: GPU command submission
  • GPU: Actual GPU execution

Look for:

  1. Long bars: The widest bar in any thread is your frame bottleneck
  2. Gaps: Empty space between bars means threads are waiting for each other
  3. Spikes: Occasional long bars cause hitches even if average FPS is fine
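Spikes are easy to miss if you only look at an average. As a rough sketch, assuming you have exported per-frame times in milliseconds to a plain list (the export step and the function name are assumptions, not an Insights feature), you can compare the average against the 99th percentile and the worst frame:

```python
import math
import statistics


def summarize_frame_times(frame_times_ms, budget_ms):
    """Summarize a capture: average, 99th percentile, worst frame, and misses."""
    if not frame_times_ms:
        raise ValueError("frame_times_ms must not be empty")
    ordered = sorted(frame_times_ms)
    # Nearest-rank 99th percentile: 99% of frames are at or below this value.
    p99_index = max(0, math.ceil(0.99 * len(ordered)) - 1)
    return {
        "avg_ms": statistics.mean(ordered),
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
        "over_budget": sum(1 for t in ordered if t > budget_ms),
    }


# Made-up capture: 97 smooth frames and 3 hitches.
capture = [10.0] * 97 + [50.0] * 3
print(summarize_frame_times(capture, budget_ms=16.67))
```

In this made-up capture the average is 11.2ms, comfortably inside a 60fps budget, while the 99th percentile is 50ms. The average hides exactly the hitches a player would notice.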

Common Patterns in Insights

CPU-bound (Game Thread):

  • Large blocks labeled "BlueprintVM" → expensive Blueprint tick functions
  • Large "Physics" blocks → too many physics bodies or complex collision queries
  • Large "AI" blocks → behavior tree evaluation, perception queries, navigation

CPU-bound (Render Thread):

  • Large "Draw" blocks → too many draw calls (reduce actor count, use instancing)
  • Large "Occlusion" blocks → occlusion query bottleneck (simplify occluder geometry)

GPU-bound:

  • Large "BasePass" → too many triangles or expensive materials
  • Large "Lumen" → GI/reflection cost too high
  • Large "Shadow" → too many shadow-casting lights or large shadow maps
  • Large "PostProcess" → expensive post-processing chain

The GPU Profiler

For detailed GPU analysis, use profilegpu in the console. This captures a single frame and shows a hierarchical breakdown of every GPU pass.

Reading the Results

The profiler shows a tree of render passes with millisecond timings:

Scene (12.4ms)
├── PrePass (1.2ms)
├── BasePass (3.1ms)
│   ├── Nanite Raster (1.8ms)
│   └── Traditional (1.3ms)
├── Lumen (4.2ms)
│   ├── Screen Probe Gather (2.1ms)
│   ├── Reflections (1.4ms)
│   └── Scene Update (0.7ms)
├── Shadows (1.8ms)
├── Translucency (0.8ms)
└── PostProcess (1.3ms)

Focus your optimization on the largest passes. Reducing a 4.2ms Lumen pass by 25% saves more time than eliminating a 0.8ms translucency pass entirely.
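To make that comparison concrete, here is a small sketch that ranks the top-level passes from the example capture and computes the two savings. The timings are copied from the tree above; the helper names are illustrative.

```python
# Top-level pass timings from the example capture, in milliseconds.
passes_ms = {
    "PrePass": 1.2,
    "BasePass": 3.1,
    "Lumen": 4.2,
    "Shadows": 1.8,
    "Translucency": 0.8,
    "PostProcess": 1.3,
}


def rank_passes(passes):
    """Return (name, ms, share_of_total) tuples, most expensive first."""
    total = sum(passes.values())
    ranked = sorted(passes.items(), key=lambda item: item[1], reverse=True)
    return [(name, ms, ms / total) for name, ms in ranked]


def saving_ms(pass_ms, reduction_fraction):
    """Time saved by cutting a pass by the given fraction (1.0 = remove it)."""
    return pass_ms * reduction_fraction


for name, ms, share in rank_passes(passes_ms):
    print(f"{name:<13} {ms:4.1f}ms  {share:5.1%}")

print(f"Lumen -25%:         {saving_ms(passes_ms['Lumen'], 0.25):.2f}ms saved")
print(f"Translucency -100%: {saving_ms(passes_ms['Translucency'], 1.0):.2f}ms saved")
```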

Systematic Optimization Process

Step 1: Measure Baseline

Profile your worst-case scenario:

  • The most complex level
  • Maximum actor count
  • Worst camera angle (looking at the most geometry)
  • During gameplay (AI active, particles playing, physics simulating)

Record baseline numbers for reference.

Step 2: Identify the Bottleneck

Is it CPU or GPU? Which specific system within that?

Step 3: Research Solutions

Common optimizations by bottleneck:

Draw calls too high (>5000):

  • Enable mesh instancing (HISM for foliage)
  • Merge static meshes
  • Reduce unique material count
  • Use Nanite for complex geometry

Triangle count too high:

  • Enable Nanite on heavy meshes
  • Set appropriate LOD distances
  • Reduce foliage density at distance
  • Use impostor billboards for distant vegetation

Lumen too expensive:

  • See our Lumen Optimization Guide
  • Reduce trace quality settings
  • Simplify scene lighting
  • Use scalability profiles

Game thread overloaded:

  • Reduce tick frequency on non-critical actors
  • Move expensive calculations to async tasks
  • Optimize Blueprint hot paths (or move to C++)
  • Reduce physics body count

Shadows too expensive:

  • Reduce shadow-casting light count
  • Use cascade shadow map settings appropriate for your scene
  • Disable shadows on small objects
  • Use Virtual Shadow Maps (designed for Nanite workflows)

Step 4: Implement and Measure

Make ONE change at a time. Measure after each change. This is critical — if you make five changes at once, you don't know which one helped (or hurt).

Step 5: Repeat

Optimization is iterative. After fixing the biggest bottleneck, profile again — the next bottleneck may be in a completely different system.

Real-World Profiling Checklist

Before milestone reviews or ship:

  • Profile on target minimum spec hardware, not your dev machine
  • Test worst-case scenarios (maximum actors, complex levels)
  • Check for frame spikes, not just average FPS
  • Verify loading times are acceptable
  • Test memory usage stays within platform budget
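  • Profile a packaged build rather than the editor. Shipping is closest to what players run, but it disables most stat commands by default, so a Test build is often the practical choice (Development builds are noticeably slower)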
  • Run automated benchmarks on multiple levels
  • Document optimization settings for each scalability tier

Tools Beyond the Engine

RenderDoc

For deep GPU analysis:

  • Capture individual frames
  • Inspect every draw call, shader, and texture
  • Profile specific materials and passes
  • Free and open source

Platform-Specific Profilers

  • PIX (Xbox/Windows): Microsoft's GPU debugger
  • RGP (AMD): Radeon GPU Profiler for AMD-specific optimization
  • Nsight (NVIDIA): GPU profiling and debugging
  • Instruments (Mac/iOS): Apple's profiling suite

Automated Profiling

Set up automated performance tests that run on every build:

  • Fly-through cameras on key levels
  • Record and compare frame times across builds
  • Alert on regressions above a threshold
  • Track memory usage trends
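The comparison step can be sketched as below. It assumes your benchmark run already produces one average frame time per level; the level names, the numbers, and the 5% threshold are all placeholders to tune for your project.

```python
def find_regressions(baseline_ms, current_ms, threshold=0.05):
    """Return {level: relative_change} for levels slower than the threshold.

    baseline_ms and current_ms map level names to average frame time in ms.
    Levels missing from the current build are skipped rather than flagged.
    """
    regressions = {}
    for level, base in baseline_ms.items():
        current = current_ms.get(level)
        if current is None or base <= 0:
            continue
        change = (current - base) / base
        if change > threshold:
            regressions[level] = change
    return regressions


# Hypothetical results from two builds.
baseline = {"Forest": 14.0, "City": 15.5}
current = {"Forest": 14.2, "City": 17.1}

for level, change in find_regressions(baseline, current).items():
    print(f"REGRESSION in {level}: {change:+.1%} frame time")
```

Using a relative threshold instead of an absolute one keeps the check meaningful across levels with very different baseline costs, though run-to-run noise may call for averaging several runs before alerting.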

Performance profiling isn't glamorous work, but it's the difference between a game that runs smoothly and one that stutters. Make it part of your regular development workflow, not something you do in a panic before launch.
