Neural Texture Compression is the first genuinely new texture format most game developers will encounter since ASTC in 2012. Unlike prior attempts at ML-based texture formats (which all required offline decompression to a fixed format before rendering), NTC decompresses on the fly during shading. This is only possible because the hardware and APIs for the specific tensor operations NTC needs now exist — NVIDIA's Tensor Cores since Turing, AMD's WMMA instructions since RDNA 3, and Microsoft's cross-vendor Cooperative Vectors API in DirectX since late 2024.
UE5.7 shipped NTC support as production-ready for PC targets and experimental for everything else. The memory savings are real — we've seen 60–80% reductions in texture VRAM on scenes converted from BC7 to NTC. The runtime decode, however, is not free, and the hardware matrix is confusing enough that most teams get the adoption decision wrong on their first pass.
This guide covers what NTC actually is, how it's exposed in UE5.7, the platform support matrix, the runtime cost envelope, and practical memory budget examples from real projects.
What NTC Actually Is
A short technical refresher, because the marketing around NTC has been loose.
Traditional block-based texture compression (BC7, ASTC) works on fixed-size tiles (4×4 pixels for BC, variable for ASTC) and compresses each tile independently using a small set of hand-designed endpoints and interpolation rules. Decompression is a hardware fixed-function path on every GPU since 2002. The compression ratios are fixed: BC1 is 8:1, BC7 is 4:1, ASTC varies from 4:1 to 36:1.
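The fixed ratios fall straight out of the block sizes. A quick arithmetic check against uncompressed 8-bit RGBA (32 bits per pixel):

```python
# Illustrative arithmetic for fixed block-compression ratios, measured
# against uncompressed 8-bit RGBA (32 bits per pixel).
def compression_ratio(bits_per_pixel: float, source_bpp: float = 32.0) -> float:
    return source_bpp / bits_per_pixel

# BC1 stores a 4x4 tile in 64 bits -> 4 bits per pixel -> 8:1
assert compression_ratio(64 / 16) == 8.0
# BC7 stores a 4x4 tile in 128 bits -> 8 bits per pixel -> 4:1
assert compression_ratio(128 / 16) == 4.0
# ASTC always stores 128 bits per block, but block size varies:
# 4x4 blocks -> 8 bpp -> 4:1; 12x12 blocks -> ~0.89 bpp -> 36:1
print(compression_ratio(128 / 16), compression_ratio(128 / 144))
```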
NTC replaces the hand-designed endpoint schemes with a small neural network. Specifically:
- A set of latent feature grids, stored at multiple resolutions (a feature pyramid).
- A small MLP (typically 2 layers, 16 neurons wide) that maps a sampled latent feature vector plus a 2D UV position to an RGB(A) output.
- A set of per-texture channel scales and offsets.
To sample a texel, the GPU samples the feature pyramid at the target UV, concatenates the sampled vectors, and runs them through the MLP. The MLP inference is where hardware tensor ops matter — on a GPU without tensor acceleration, NTC's runtime cost is prohibitive. On hardware with Tensor Cores or equivalent, a 2-layer 16-wide MLP inference takes roughly 10–30 nanoseconds per texel, which is in the same ballpark as a BC7 decode plus a trilinear filter.
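As a schematic sketch of that decode path — shapes, sampling, and encoding details here are illustrative stand-ins, not the shipping NTC format:

```python
import numpy as np

# Schematic NTC-style texel decode: sample a latent feature pyramid,
# concatenate the sampled vectors with the UV, run a tiny MLP.
rng = np.random.default_rng(0)

# Feature pyramid: two latent grids at different resolutions, 4 channels each.
pyramid = [rng.standard_normal((64, 64, 4)), rng.standard_normal((16, 16, 4))]

def sample_grid(grid, uv):
    # Nearest-neighbour sample for brevity; real decoders filter the latents.
    h, w, _ = grid.shape
    y = min(int(uv[1] * h), h - 1)
    x = min(int(uv[0] * w), w - 1)
    return grid[y, x]

# 2-layer, 16-neuron-wide MLP mapping (latents + UV) -> RGBA.
in_dim = 4 * len(pyramid) + 2
W1, b1 = rng.standard_normal((in_dim, 16)), np.zeros(16)
W2, b2 = rng.standard_normal((16, 4)), np.zeros(4)

def decode_texel(uv):
    feats = np.concatenate([sample_grid(g, uv) for g in pyramid] + [np.asarray(uv)])
    hidden = np.maximum(feats @ W1 + b1, 0.0)  # ReLU
    return hidden @ W2 + b2                    # linear RGBA output

rgba = decode_texel((0.25, 0.75))
print(rgba.shape)  # -> (4,)
```

The two matrix multiplies inside `decode_texel` are exactly what the tensor hardware accelerates.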
The compression ratios NTC achieves depend on the network configuration:
- NTC-Tiny (smallest MLP, smallest feature pyramid): 12:1 to 20:1 compression, visible quality degradation on high-frequency textures.
- NTC-Small (the UE5.7 default): 8:1 to 14:1, competitive quality with BC7.
- NTC-Large: 4:1 to 8:1, approaches lossless for 8-bit source data.
For comparison, BC7 is a fixed 4:1. NTC-Small at ~10:1 is roughly 2.5× better compression at matched quality. For a 4K albedo+normal+ORM material set, that's the difference between ~42MB and ~17MB.
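The arithmetic behind that figure, using the ratios above:

```python
# BC7 is a fixed 4:1; NTC-Small is ~10:1 on the same source data,
# so NTC-Small is 10/4 = 2.5x smaller at matched source resolution.
bc7_set_mb = 42           # 4K albedo + normal + ORM at BC7 (from the text)
ntc_vs_bc7 = 10 / 4       # NTC-Small ratio relative to BC7
print(round(bc7_set_mb / ntc_vs_bc7, 1))  # -> 16.8, i.e. ~17 MB
```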
NVIDIA's RTX Neural Shading Framework
RTX Neural Shading is NVIDIA's umbrella term for neural techniques that inference inside a shader. NTC is the first production-grade application, but the framework also covers neural BRDFs, neural materials, and neural radiance caching. All of them rely on the same underlying hardware capability: the ability to do small matrix multiplies inside a pixel/compute shader at viable performance.
On NVIDIA hardware this is exposed via:
- Cooperative Matrix extensions in Vulkan.
- Wave Matrix intrinsics in DirectX 12 with the SM 6.8 shader model.
- NVIDIA's own nv_cooperative_vector proprietary extension for the most aggressive optimizations.
UE5.7's NTC path uses Cooperative Vectors when available and falls back to Wave Matrix on shader models that support it. On shader models below 6.8, NTC is not supported at all — there is no software fallback path in the shipping engine.
Microsoft's Cooperative Vectors
The DirectX team shipped Cooperative Vectors in late 2024 as a cross-vendor API for small matrix operations inside shaders. Before this, small-matrix math inside a pixel shader was fundamentally slow on consumer GPUs because it did not map onto the wide SIMD execution model. Cooperative Vectors lets a shader cooperatively execute a matrix operation across a wave, using whatever hardware path the GPU vendor provides.
This is what makes NTC portable across vendors. Without a vendor-neutral API, NTC would have been an NVIDIA-only feature. With it, AMD and Intel can implement NTC decode with their own hardware paths (AMD WMMA on RDNA 3+, Intel XMX on Battlemage+) and the shader code is the same.
The catch: Cooperative Vectors requires Windows 11 24H2 or newer, and driver support landed at different times for each vendor:
- NVIDIA: full support on RTX 20-series and newer, driver 551.xx+.
- AMD: RDNA 3 (RX 7000) and RDNA 4 (RX 9000) only, Adrenalin 24.9.1+.
- Intel: Arc Alchemist (A-series) has partial support with reduced performance; Battlemage (B-series) has full support.
RDNA 2 (RX 6000) and GTX 16-series do not have the hardware primitives for viable NTC decode. On those cards, UE5.7 falls back to the texture's BC7 representation (which the engine stores alongside the NTC data for this purpose).
Platform Support Matrix
Since platform support is the most common source of adoption mistakes, here's the concrete matrix as of UE5.7.3:
| Platform | NTC Support | Notes |
|---|---|---|
| PC DX12 + RTX 20/30/40/50 series | Full | Reference implementation. Best performance. |
| PC DX12 + RX 7000/9000 | Full | Slightly higher decode cost than NVIDIA. |
| PC DX12 + Arc B-series | Full | Similar to AMD RDNA 3. |
| PC DX12 + Arc A-series | Partial | Works, reduced performance. Not recommended. |
| PC DX12 + RX 6000 / GTX 16 | Fallback | BC7 used at runtime. NTC data wasted. |
| PC Vulkan | Full | Requires VK_KHR_cooperative_matrix. |
| PlayStation 5 | No | No shader support. |
| PlayStation 5 Pro | Experimental | Early PSSL support via vendor extension. |
| Xbox Series X/S | No | No shader support at launch; Microsoft has signaled future support. |
| Nintendo Switch 2 | No | No cooperative matrix support. |
| macOS / Metal | No | Metal has no equivalent API as of 2026. |
| iOS / Android | No | No mobile GPU has production-ready tensor primitives. |
The one-sentence summary: NTC is a PC-first feature. If your target platform mix is console-heavy, NTC buys you nothing on consoles and you still have to author BC7 for the console cooks — at which point the question is whether NTC is worth maintaining a second texture format for PC only.
That question is usually yes for AAA projects with high-resolution texture sets, and usually no for mid-budget cross-platform projects where the artist time to manage two texture variants is not justified by the PC-only VRAM savings.
Enabling NTC in UE5.7
The engine exposes NTC via Project Settings → Rendering → Neural Texture Compression. Three toggles:
- Enable Neural Texture Compression — master switch. Off by default.
- Use NTC for Streaming Textures — if on, textures marked for streaming are authored with NTC when supported, BC7 otherwise.
- NTC Quality Preset — Tiny / Small / Large, matching the network sizes above. Default is Small.
Per-texture, you can override the setting in the texture asset properties. A typical production pattern:
- Hero textures (characters, weapons, UI): NTC-Large. The runtime cost is absorbed by the limited screen coverage and the quality gain over BC7 is worthwhile.
- Environment albedo, normal, ORM: NTC-Small. Best memory-per-quality tradeoff.
- Detail textures, tiling textures: NTC-Tiny. Quality loss is masked by the high tiling factor.
- Lightmap textures: BC7. NTC has trouble with the low-frequency high-dynamic-range content in lightmaps.
- Virtual textures: NTC-Small, but see the virtual texturing section below.
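That routing is simple enough to express as a pure function, which is also how a batch preset-assignment pass can be driven. A sketch over illustrative tag names — this is policy logic only, not a UE API:

```python
# Per-texture NTC preset policy from the production pattern above.
# Tag names are illustrative assumptions, not engine metadata.
def ntc_preset(tags: set) -> str:
    if "lightmap" in tags:
        return "BC7"        # NTC struggles with low-frequency HDR content
    if tags & {"character", "weapon", "ui"}:
        return "NTC-Large"  # hero assets: limited screen coverage absorbs the cost
    if tags & {"detail", "tiling"}:
        return "NTC-Tiny"   # quality loss masked by the high tiling factor
    return "NTC-Small"      # environment default: best memory-per-quality

print(ntc_preset({"character"}))        # -> NTC-Large
print(ntc_preset({"tiling"}))           # -> NTC-Tiny
print(ntc_preset({"albedo", "env"}))    # -> NTC-Small
```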
Cook-Time NTC Baking
NTC textures bake during the cook, not during texture import. This is because the baking process is expensive (2–8 seconds per 4K texture on a 4090, depending on quality preset) and you do not want to pay that cost on every texture re-import during iteration.
The trade-off: your cook times balloon. A project with 3,000 textures might spend 30–60 additional minutes on cook for NTC baking. UE5.7 supports distributed NTC baking via the DDC (Derived Data Cache), so a team-shared DDC absorbs most of that cost, but a fresh cook on a build agent will feel it.
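A rough model of the incremental cook cost, treating DDC hit rate as the dominant variable. The 2–8 s/texture range comes from above; the hit rates are illustrative, and the model deliberately ignores bake parallelism:

```python
# Rough NTC cook-time model: per-texture bake cost times the number of
# DDC misses. Hit rates are illustrative assumptions.
def ntc_bake_minutes(textures: int, seconds_per_texture: float,
                     ddc_hit_rate: float) -> float:
    misses = textures * (1.0 - ddc_hit_rate)
    return misses * seconds_per_texture / 60.0

# 3,000 textures, warm team-shared DDC (90% hits): only fresh bakes run.
print(round(ntc_bake_minutes(3000, 5, 0.90)))  # -> 25 minutes
# Fresh cook on a build agent, cold DDC: the full cost lands on the cook.
print(round(ntc_bake_minutes(3000, 5, 0.0)))   # -> 250 minutes
```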
Sampling in Materials
In a material graph, an NTC-backed texture is sampled identically to a BC7 texture — you use a TextureSampler node, the engine handles the rest. The material compiler emits different shader code paths behind the scenes based on the texture's compression type and the target hardware's capabilities.
The one exception: sampler filtering is constrained. NTC currently supports trilinear filtering but not anisotropic filtering. Anisotropic texture filtering requires multiple samples along the anisotropy direction, and each sample is a full MLP inference, so the cost scales linearly with the anisotropy level. UE5.7's implementation caps NTC at 1× anisotropy. Textures that require high-anisotropy filtering (typically ground materials viewed at grazing angles) should stay on BC7 or use a virtual texture variant.
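The linear scaling is easy to see with the per-inference figure quoted earlier (the midpoint of the 10–30 ns range is an assumption):

```python
# Each anisotropic tap is a full MLP inference, so decode cost scales
# linearly with the anisotropy level. 20 ns is an assumed midpoint of
# the 10-30 ns per-texel inference range quoted earlier.
ns_per_inference = 20
costs = {aniso: aniso * ns_per_inference for aniso in (1, 4, 16)}
for aniso, ns in costs.items():
    print(f"{aniso:>2}x aniso: {ns} ns per texel")
```

At 16× anisotropy the decode alone is an order of magnitude past the per-texel budget, which is why UE5.7 caps NTC at 1×.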
Runtime Cost: The Numbers
Measured on RTX 4070 at 1440p TSR Balanced, a scene with 180 visible materials, 12GB texture working set:
| Configuration | GPU ms (shading pass) | Texture VRAM |
|---|---|---|
| All BC7 (baseline) | 4.1ms | 10.8 GB |
| NTC-Small where supported | 4.9ms | 3.6 GB |
| NTC-Large for hero, NTC-Small elsewhere | 5.2ms | 4.1 GB |
| NTC-Tiny across the board | 4.6ms | 1.9 GB |
Observations:
- NTC-Small and the hero mix add roughly 0.8–1.1ms to the shading pass on this hardware at this resolution; NTC-Tiny adds about 0.5ms.
- The cost scales with sampled texel count, not texture count or memory size. A scene with many small textures sampled heavily costs more in NTC than a scene with a few large textures sampled lightly.
- The VRAM reduction is substantial. The 7.2GB saved in the NTC-Small row is enough to change which SKU tiers you can target — a 10GB 3080 can fit a working set that would have required a 16GB card with BC7.
On a 4060 (8GB), the same scene could not fit in VRAM with BC7 (triggered PCIe streaming thrash, dropped to 30fps). With NTC-Small, it fit and held 60fps+. This is the real selling point of NTC: it expands the addressable hardware market for high-resolution texture budgets.
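Pulling the deltas out of the measurement table above makes the trade explicit — GPU milliseconds paid per configuration against gigabytes saved:

```python
# Deltas from the RTX 4070 measurement table above.
baseline_ms, baseline_gb = 4.1, 10.8
configs = {
    "NTC-Small":      (4.9, 3.6),
    "Hero-Large mix": (5.2, 4.1),
    "NTC-Tiny":       (4.6, 1.9),
}
savings = {name: (round(baseline_gb - gb, 1), round(ms - baseline_ms, 1))
           for name, (ms, gb) in configs.items()}
for name, (gb_saved, ms_cost) in savings.items():
    print(f"{name}: {gb_saved} GB saved for {ms_cost} ms")
```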
Practical Memory Budgets
A few worked examples from shipping projects.
Example 1: Open-world action game, 25GB texture budget
- Original BC7 budget: 25GB of textures at 4K across ~1,800 unique materials.
- NTC-Small conversion: 8.4GB runtime VRAM for textures. 16.6GB freed.
- The freed VRAM absorbed into: larger virtual texture cache (+3GB), higher-resolution detail textures (+5GB authored), Nanite geometry streaming cache (+2GB), and a reduced memory floor that let the 8GB SKU tier target run the game at High instead of Medium settings.
Example 2: Linear cinematic game, 14GB texture budget
- Original BC7 budget: 14GB at 4K, ~900 unique materials.
- NTC-Large for 200 hero assets, NTC-Small elsewhere: 4.9GB runtime.
- The freed VRAM used for 8K hero textures (which could not have fit at BC7) and a second set of seasonal material variants cached in VRAM for instant swap.
Example 3: Stylized multiplayer game, 6GB texture budget
- Original BC7 budget: 6GB at 2K, ~400 unique materials.
- NTC was evaluated and rejected. The runtime cost (0.8ms/frame) was too high relative to the VRAM savings (~3.5GB), because the game's target SKU (RX 6600) did not benefit from NTC and adding NTC only for the high-SKU tier was not worth the two-pipeline complexity.
The pattern: NTC is worth it when texture VRAM is your binding constraint and you have the shader budget to absorb the decode cost. It is not worth it when your constraint is elsewhere (draw calls, geometry, gameplay logic, cross-platform authoring) or when your target hardware does not support it.
Virtual Texturing Interaction
UE5.7's runtime virtual texture system has partial NTC support. The virtual texture tiles are stored as NTC, decompressed to the physical texture cache, and sampled from the cache with standard filtering. This means:
- The disk-storage and streaming-bandwidth savings of NTC apply to virtual textures.
- The runtime decompression cost is paid once per tile upload, not per frame.
- The physical cache is still BC7/uncompressed, so runtime VRAM savings are smaller.
Net: for virtual-textured environments, NTC mostly saves disk and streaming bandwidth, not runtime VRAM. This is still valuable (a 60GB texture-heavy install can shrink to 20GB with NTC'd VT data) but don't expect the same runtime memory reduction as non-virtual NTC usage.
Authoring Considerations
A few notes from production experience:
- Source quality matters more with NTC. BC7 is forgiving of noisy source data because the block quantization masks fine noise. NTC fits its feature pyramid to the source and can over-fit to noise, producing speckled artifacts. Clean up source textures before NTC baking.
- Normal maps work fine. The common worry — that NTC would struggle with normals because of the precision required — did not pan out. NTC-Small normals are visually indistinguishable from BC5 in our blind tests.
- Alpha channels are handled correctly. NTC treats alpha as a fourth channel and compresses it alongside RGB. Binary alpha (foliage cards) works well at NTC-Small and above; smooth alpha gradients benefit from NTC-Large.
- Tiling textures need NTC-Small minimum. NTC-Tiny produces visible seams when a tile is repeated many times across a surface, because the seam artifacts are coherent rather than randomized.
We use the Unreal MCP Server to batch-apply NTC settings across large material libraries — a migration pass across 2,000 textures to set per-texture quality presets based on asset tags takes a few minutes by prompt versus hours by hand.
For source-side texture preparation (denoising, channel packing, alpha cleanup), the Blender MCP Server handles the Blender-side compositor operations programmatically, which pairs well with NTC's sensitivity to source cleanliness.
Debugging NTC
Two tools worth knowing:
- r.NTC.Debug.ShowCompressionHeatmap 1 — colors the screen by per-pixel NTC decode cost. Red areas are expensive, green areas are cheap. Useful for finding materials that over-sample NTC textures.
- r.NTC.Debug.CompareWithBC7 1 — renders a BC7 reconstruction of the same textures side-by-side with the NTC version. Useful for QA passes on specific assets.
Both are developer-only and not safe to ship.
Streaming and Loading Behavior
One under-documented consequence of NTC adoption: streaming patterns change.
BC7 textures stream as raw blocks. The engine loads the bytes, the GPU samples them directly, no decompression involved. NTC textures still stream as raw bytes (the NTC representation itself), but the feature pyramid has to be resident in VRAM for sampling — partial streaming of an NTC texture is not supported. You either have the whole thing or you have nothing.
This matters for texture streaming pools. A virtual-texture-like system that streams individual mipmap levels of a BC7 texture works frictionlessly. With NTC, you stream whole textures at a time. On a streaming-heavy open-world title, this can produce visibly different LRU behavior — NTC textures get evicted as atomic units rather than partially, which can cause more aggressive re-streaming of popular textures.
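The eviction difference can be sketched with a toy LRU pool — not the engine's streamer, just the atomic-vs-partial distinction:

```python
from collections import OrderedDict

# Toy LRU texture pool: a BC7 pool can shed individual mip levels from a
# victim, an NTC pool must evict the whole texture.
class TexturePool:
    def __init__(self, budget_mb: float, atomic: bool):
        self.budget, self.atomic = budget_mb, atomic
        self.resident = OrderedDict()  # key -> size_mb, in LRU order

    def touch(self, key, size_mb):
        # Re-inserting moves the key to the most-recently-used end.
        self.resident.pop(key, None)
        self.resident[key] = size_mb
        while sum(self.resident.values()) > self.budget:
            victim = next(iter(self.resident))  # least recently used
            if self.atomic or self.resident[victim] <= 1:
                self.resident.pop(victim)       # NTC: the whole texture goes
            else:
                self.resident[victim] /= 2      # BC7: drop the top mip (~half the bytes)

bc7 = TexturePool(100, atomic=False)
ntc = TexturePool(100, atomic=True)
for pool in (bc7, ntc):
    for key, size in [("rock", 40), ("grass", 40), ("cliff", 40)]:
        pool.touch(key, size)
print(sorted(bc7.resident))  # rock survives at reduced mip residency
print(sorted(ntc.resident))  # rock was evicted outright
```

The BC7 pool keeps a degraded copy of the cold texture resident; the NTC pool drops it entirely and must re-stream the whole pyramid when it becomes hot again.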
UE5.7's NTC integration handles this with a pinning system: textures tagged as "streaming hot" stay resident longer, and the engine tracks sampling heat per texture to avoid thrashing. The default heuristics are reasonable, but we've seen projects benefit from manual pinning on critical textures (HUD elements, player-character textures) that would otherwise churn.
The streaming cost on eviction-and-reload is also slightly higher for NTC, because the texture pyramid has to be uploaded in full. Empirically: a 4K NTC-Small texture takes about 1.1× as long to stream in as the equivalent BC7, measured on NVMe with DirectStorage enabled. Small difference, but it compounds in worst-case streaming scenarios.
NTC in Shipping Titles
As a reality check, a few data points from titles that shipped with NTC in 2025–2026:
A large open-world action title shipped with NTC-Small on ~70% of environment textures. VRAM reduction on PC was measured at 5.8GB average. The team kept all character and weapon textures on BC7 because their anisotropic filtering requirements at grazing angles exceeded what NTC supports.
A linear cinematic game shipped NTC-Large on hero textures only, BC7 on everything else. The selective approach let them ship 8K hero albedo maps that would have been impossible at BC7 — a specific feature for their target audience that justified the two-pipeline maintenance cost.
A multiplayer shooter evaluated NTC and shipped without it. The primary reason was console parity — shipping 60% lower texture VRAM on PC alone would have broken art-direction parity across platforms, and the art team preferred a consistent cross-platform look over a PC-exclusive upgrade.
The common pattern in the wins: NTC adopted selectively, not wholesale, and with clear accounting of which assets benefit and which don't.
When to Adopt NTC
A decision framework:
Adopt NTC now if:
- PC is a primary or exclusive target.
- Your texture VRAM budget is the binding constraint on hardware tier scaling.
- Your cook infrastructure can absorb the additional bake time.
- Your target audience skews toward RTX 20-series and newer, or RDNA 3+.
Defer NTC if:
- Consoles are your primary target.
- Your texture budgets already fit comfortably in 8GB.
- You are late in a project and adding NTC is a rework tax without proportional benefit.
- Your texture pipeline has significant anisotropic-filtering dependencies.
Revisit in 12 months if:
- You're targeting PS5 Pro and Xbox equivalents that may add cooperative vector support.
- You're targeting mobile — the first mobile chips with viable tensor primitives for NTC-scale inference are expected in the 2027–2028 generation.
NTC is not a universal upgrade. It is a real, measurable win on PC-heavy projects with high-resolution texture budgets. It is a maintenance burden with no payoff on projects where texture VRAM was not the constraint. The engineering work is to know which bucket your project falls into — and the honest answer is more often the second bucket than the first.