Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

JY Choi, JR Park, I Park, J Cho, A No… - arXiv preprint arXiv …, 2024 - arxiv.org
Current state-of-the-art diffusion models employ U-Net architectures containing
convolutional and (qkv) self-attention layers. The U-Net processes images while being
conditioned on the time embedding input for each sampling step and the class or caption
embedding input corresponding to the desired conditional generation. Such conditioning
applies scale-and-shift operations to the convolutional layers but does not directly affect the
attention layers. While these standard architectural choices are certainly effective, not …
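The abstract contrasts scale-and-shift conditioning on convolutional layers with a drop-in low-rank (LoRA) update on the attention's qkv projection. The following is a minimal sketch of that idea under stated assumptions: all shapes, the gating vector `g`, and the function names are illustrative, not the paper's actual implementation. The conditioning signal (a timestep/class embedding) is assumed to enter through a scalar gate on the low-rank branch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank, d_emb = 4, 2, 8

W = rng.standard_normal((d_model, d_model))  # frozen base qkv projection weight
A = rng.standard_normal((rank, d_model))     # LoRA down-projection (rank 2)
B = np.zeros((d_model, rank))                # LoRA up-projection, zero-initialized
                                             # so the adapter starts as a no-op
g = rng.standard_normal(d_emb)               # hypothetical gating vector mapping
                                             # the embedding to a scalar strength

def lora_qkv(x, emb):
    """Base projection plus an embedding-conditioned low-rank update."""
    scale = float(g @ emb)                   # conditioning enters only via the scale
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_model)
t_emb = rng.standard_normal(d_emb)

# With B zero-initialized, the conditioned LoRA branch is an exact no-op,
# so the adapter can be dropped into a pretrained model without changing it.
assert np.allclose(lora_qkv(x, t_emb), W @ x)
```

Zero-initializing the up-projection `B` is the usual LoRA convention: it lets the adapter be inserted into a pretrained network without perturbing its initial behavior, with the conditioning learned during fine-tuning.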
