NeAR

Coupled Neural Asset–Renderer Stack

*Equal Contribution · Corresponding Author
1BAAI 2BUAA 3FNii, CUHKSZ 4NJU 5UniTn 6ZJU 7NTU 8THU 9BNU 10CUHKSZ 11AIR, THU

Relightable 3D generative rendering results. Columns from left to right depict the target illumination, the casually lit input image, Blender-rendered results from Trellis 3D, Hunyuan 3D-2.1 (with PBR materials), our method's estimated multi-view PBR materials back-projected onto the given mesh, our neural rendering results, and ground truth. The final two columns keep illumination fixed while varying the camera pose. Compared with prior approaches, our method yields more faithful relighting, particularly in preserving rendering accuracy and enforcing cross-view illumination consistency.


Abstract

Neural asset authoring and neural rendering have emerged as largely disjoint threads: one generates digital assets using neural networks for traditional graphics pipelines, while the other develops neural renderers that map conventional assets to images. However, the joint design of the asset representation and renderer remains largely unexplored. We argue that coupling them can unlock an end-to-end learnable graphics stack with benefits in fidelity, consistency, and efficiency. In this paper, we explore this possibility with NeAR: a Coupled Neural Asset–Renderer Stack. On the asset side, we build on Trellis-style Structured 3D Latents and introduce a lighting-homogenized neural asset: from a casually lit input, a rectified-flow backbone predicts a Lighting-Homogenized SLAT that encodes geometry and intrinsic material cues in a compact, view-agnostic latent. On the renderer side, we design a lighting-aware neural renderer that uses this neural asset, along with explicit view embeddings and HDR environment maps, to produce relit renderings in real time. We validate NeAR on four tasks: (1) G-buffer–based forward rendering, (2) random-lit single-image reconstruction, (3) unknown-lit single-image relighting, and (4) novel-view relighting, where our coupled stack surpasses state-of-the-art baselines in quantitative metrics and perceptual quality. We hope this coupled asset–renderer perspective inspires new graphics stacks that view neural assets and renderers as co-designed components instead of independent ones.


Overview

Pipeline of NeAR. Top: Light Homogenization extracts a Lighting-Homogenized Structured 3D Latent (LH-SLAT) from a casually lit input image. The right side shows the pipeline for generating LH-SLAT from a single image: we first recover a shaded SLAT, then perform illumination homogenization in voxel space. Bottom: Relightable Neural 3DGS Synthesis generates a relightable 3D Gaussian Splatting (3DGS) field conditioned on the LH-SLAT, target illumination, and target viewpoint. The decoded 3DGS encodes geometry, appearance, and light–material interactions, and is rendered into the final relit image.
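A minimal PyTorch-style sketch of how the two stages could be wired together is given below. The module names (LightHomogenizer, RelightableGSDecoder), tensor shapes, and layer choices are illustrative assumptions, not the released implementation; the sketch only shows the conditioning interface between the asset and the renderer.

import torch
import torch.nn as nn

class LightHomogenizer(nn.Module):
    """Stage 1 (sketch): map a casually lit image to a lighting-homogenized
    structured 3D latent (LH-SLAT) defined on active voxels."""
    def __init__(self, latent_dim=8, num_voxels=256):
        super().__init__()
        self.num_voxels, self.latent_dim = num_voxels, latent_dim
        self.encoder = nn.Sequential(nn.Flatten(),
                                     nn.Linear(3 * 32 * 32, num_voxels * latent_dim))

    def forward(self, image):                                     # image: (B, 3, 32, 32)
        feats = self.encoder(image)
        return feats.view(-1, self.num_voxels, self.latent_dim)   # LH-SLAT: (B, N, C)

class RelightableGSDecoder(nn.Module):
    """Stage 2 (sketch): decode LH-SLAT, a target HDR environment map, and an
    explicit view embedding into per-Gaussian parameters of a 3DGS field."""
    def __init__(self, latent_dim=8, env_dim=3 * 8 * 16, view_dim=16, gs_params=14):
        super().__init__()
        self.env_proj = nn.Linear(env_dim, latent_dim)
        self.view_proj = nn.Linear(view_dim, latent_dim)
        self.head = nn.Linear(3 * latent_dim, gs_params)          # position, scale, rotation, opacity, color

    def forward(self, lh_slat, env_map, view_emb):
        b, n, _ = lh_slat.shape
        cond = torch.cat([self.env_proj(env_map), self.view_proj(view_emb)], dim=-1)
        cond = cond.unsqueeze(1).expand(b, n, -1)                 # broadcast lighting/view over voxels
        return self.head(torch.cat([lh_slat, cond], dim=-1))      # (B, N, gs_params)

# Toy forward pass with random inputs.
image = torch.randn(1, 3, 32, 32)
env_map = torch.randn(1, 3 * 8 * 16)    # flattened HDR environment map
view_emb = torch.randn(1, 16)           # explicit view embedding
lh_slat = LightHomogenizer()(image)
gaussians = RelightableGSDecoder()(lh_slat, env_map, view_emb)
print(gaussians.shape)                  # torch.Size([1, 256, 14])

In the actual stack, Stage 1 is a rectified-flow backbone operating in SLAT space and Stage 2 decodes a full relightable 3DGS field that is splatted into the final relit image.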

Visualization of illumination homogenization
Visualization of Illumination Homogenization: We present a breakdown of the rendering pipeline to demonstrate the rationale behind our Illumination Homogenization. Left: Ground-truth PBR intrinsics. Middle: Decomposed shading components (shadows, BRDFs) under random versus normalized lighting. Right: Comparison of latent features. The visualization confirms that unlike the Shaded-SLAT, which is corrupted by lighting artifacts, our LH-SLAT successfully decouples lighting from geometry, yielding a stable representation for relighting.
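For reference, a schematic form of the shading that the homogenization targets is given below; the notation is illustrative and the paper's exact formulation may differ.

\[
I(\mathbf{x}, \omega_o) \;=\; \int_{\Omega}
\underbrace{f_r(\mathbf{x}, \omega_i, \omega_o)}_{\text{BRDF (material)}}\,
\underbrace{L(\omega_i)}_{\text{illumination}}\,
\underbrace{V(\mathbf{x}, \omega_i)}_{\text{shadowing}}\,
(\mathbf{n}\cdot\omega_i)\,\mathrm{d}\omega_i
\]

A latent encoded from a casually lit rendering entangles $L$ and $V$ with the material term $f_r$; replacing $L$ with a fixed, uniform illumination before encoding removes these lighting-dependent factors, which is what the Shaded-SLAT versus LH-SLAT comparison on the right illustrates.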

Results

Forward Rendering

Forward Rendering Visualization
Visual comparison with Diffusion Renderer for image relighting under G-buffer/LH-SLAT conditioning. When conditioned on the real G-buffer and LH-SLAT, our method outperforms the diffusion-based baselines in shadows (upper right), reflections (left), and overall rendering quality.

Recon Rendering

Recon Rendering
Given an image with known illumination, our method reconstructs the image and achieves higher fidelity to the original compared with existing approaches.

Relight Rendering

Relit Rendering
Given an image under unknown lighting and a specified target lighting, our method produces more realistic relighting results.

PBR Textured 3D Generation

PBR Textured 3D Generation
Comparison of relit renderings between our neural rendering method and 3D generation methods that recover PBR material properties. Our method achieves more stable and accurate rendering results.

Material & Shadow Decomposition

Visualization of material and shadow decomposition results. Our method decomposes the input image into material and shadow components, providing well-conditioned intermediate supervision for relighting.
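One way such a decomposition can serve as intermediate supervision is a composite loss over the predicted material, shadow, and final relit images; the sketch below is a hypothetical form with illustrative weights, not the paper's exact objective.

import torch
import torch.nn.functional as F

def decomposition_loss(pred_material, pred_shadow, pred_relit,
                       gt_material, gt_shadow, gt_relit,
                       w_mat=1.0, w_shadow=1.0, w_img=1.0):
    # L1 terms on the intermediate material/shadow maps regularize the
    # relighting branch; the final term supervises the relit image itself.
    return (w_mat * F.l1_loss(pred_material, gt_material)
            + w_shadow * F.l1_loss(pred_shadow, gt_shadow)
            + w_img * F.l1_loss(pred_relit, gt_relit))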

Main Related Works

BibTeX

@article{li2025near,
  title={NeAR: Coupled Neural Asset-Renderer Stack},
  author={Li, Hong and Ye, Chongjie and Chen, Houyuan and Xiao, Weiqing and Yan, Ziyang and Xiao, Lixing and Chen, Zhaoxi and Xiang, Jianfeng and Xu, Shaocong and Liu, Xuhui and Wang, Yikai and Zhang, Baochang and Han, Xiaoguang and Yang, Jiaolong and Zhao, Hao},
  journal={arXiv preprint arXiv:2511.18600},
  year={2025}
}