GenesisTex: Adapting Image Denoising Diffusion to Texture Space

Chenjian Gao1,2#, Boyan Jiang2, Xinghui Li2, Yingpeng Zhang2#, Qian Yu1*
#Equal contribution
*Corresponding author


1Beihang University
2R&D Efficiency and Capability Department, Tencent IEG

CVPR 2024

Texturing results with GenesisTex

Abstract

We present GenesisTex, a novel method for synthesizing textures for 3D geometries from text descriptions. GenesisTex adapts a pretrained image diffusion model to texture space through Texture Space Sampling. Specifically, we maintain a latent texture map for each viewpoint, which is updated with the noise predicted on the rendering of the corresponding viewpoint. The sampled latent texture maps are then decoded into a final texture map. During the sampling process, we enforce both global and local consistency across multiple viewpoints: global consistency is achieved by integrating style consistency mechanisms into the noise prediction network, and local consistency is achieved by dynamically aligning the latent textures. Finally, we refine the texture by applying reference-based inpainting and Img2Img on denser views. Our approach overcomes the slow optimization of distillation-based methods and the instability of inpainting-based methods. Experiments on meshes from various sources demonstrate that our method surpasses baseline methods both quantitatively and qualitatively.
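
The core idea of Texture Space Sampling can be sketched as follows. This is a minimal, simplified illustration rather than the paper's implementation: `render_latent`, `bake_to_texture`, and `unet` are hypothetical stand-ins (a latent rasterizer, a back-projection into texture space, and a noise-prediction network such as the Stable Diffusion UNet), the style consistency mechanism is omitted, and dynamic alignment is reduced to a soft average of the per-view latent textures.

```python
import torch

def texture_space_sampling(unet, render_latent, bake_to_texture,
                           views, prompt_emb, timesteps, alphas_cumprod,
                           tex_res=64, latent_ch=4):
    """Sketch of texture space sampling; all callables are hypothetical stand-ins."""
    # One latent texture map is maintained per viewpoint.
    latent_textures = [torch.randn(latent_ch, tex_res, tex_res) for _ in views]

    # Deterministic DDIM-style denoising over a descending schedule, e.g. [999, ..., 0].
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        a_t, a_prev = float(alphas_cumprod[t]), float(alphas_cumprod[t_prev])
        for i, view in enumerate(views):
            # Render the current latent texture from this viewpoint.
            z = render_latent(latent_textures[i], view)
            # Predict noise on the rendering with the diffusion model.
            eps = unet(z.unsqueeze(0), t, prompt_emb).squeeze(0)
            # DDIM update: estimate the clean latent, then step to the next timestep.
            z0 = (z - (1 - a_t) ** 0.5 * eps) / a_t ** 0.5
            z_prev = a_prev ** 0.5 * z0 + (1 - a_prev) ** 0.5 * eps
            # Back-project the updated rendering into this view's latent texture.
            latent_textures[i] = bake_to_texture(z_prev, view)

        # Dynamic alignment, simplified here as a soft average: pull the per-view
        # latent textures toward their mean so overlapping regions stay consistent.
        mean_tex = torch.stack(latent_textures).mean(dim=0)
        latent_textures = [0.5 * tex + 0.5 * mean_tex for tex in latent_textures]

    # Merge the sampled latent textures into one map (the paper decodes them with
    # the VAE decoder and blends per-view textures; a plain average stands in here).
    return torch.stack(latent_textures).mean(dim=0)
```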

Comparisons

Consistency Ablations

Texturing with Stable Diffusion XL

How does it work?

Texturing Pipeline

GenesisTex generates a texture map for a given mesh based on a text prompt. Texture Space Sampling produces a texture map with Stable Diffusion while enforcing style consistency and dynamic alignment across multiple viewpoints. Inpainting and Img2Img are then applied to fill in blank regions and to enhance the detail of the texture map, respectively.
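
As a rough illustration of the refinement stage, the sketch below shows an Img2Img pass for a single dense view, reusing the same hypothetical helpers (`render_latent`, `bake_to_texture`, `unet`) assumed above; the reference-based inpainting stage works analogously but additionally restricts updates to the blank regions. This is not the paper's implementation, only a common Img2Img recipe: re-noise the rendered latent to an intermediate timestep and denoise it back, so details are sharpened while the overall content is preserved.

```python
import torch

def img2img_refine_view(unet, render_latent, bake_to_texture,
                        latent_texture, view, prompt_emb,
                        timesteps, alphas_cumprod, strength=0.3):
    """Sketch of Img2Img refinement for one view; helpers are hypothetical stand-ins."""
    # Render the current texture from this dense viewpoint.
    z = render_latent(latent_texture, view)

    # Keep only the last `strength` fraction of the descending schedule.
    start_idx = int(len(timesteps) * (1.0 - strength))
    t_start = timesteps[start_idx]
    a_start = float(alphas_cumprod[t_start])

    # Re-noise the rendering to the starting timestep (forward diffusion).
    noise = torch.randn_like(z)
    z = a_start ** 0.5 * z + (1 - a_start) ** 0.5 * noise

    # Denoise back to t = 0 with deterministic DDIM steps.
    for t, t_prev in zip(timesteps[start_idx:-1], timesteps[start_idx + 1:]):
        a_t, a_prev = float(alphas_cumprod[t]), float(alphas_cumprod[t_prev])
        eps = unet(z.unsqueeze(0), t, prompt_emb).squeeze(0)
        z0 = (z - (1 - a_t) ** 0.5 * eps) / a_t ** 0.5
        z = a_prev ** 0.5 * z0 + (1 - a_prev) ** 0.5 * eps

    # Bake the refined rendering back into the texture map.
    return bake_to_texture(z, view)
```

A smaller `strength` keeps the refinement conservative, which matches the goal here: sharpening texture detail on denser views without altering the content produced by Texture Space Sampling.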

Check out the paper to learn more 🤓