Deepcache: Accelerating diffusion models for free

Z Zhu, X Wang, W Zhao, C Min, N Deng, M Dou… - arXiv preprint arXiv …, 2024 - arxiv.org

General world models represent a crucial pathway toward achieving Artificial General
Intelligence (AGI), serving as the cornerstone for various applications ranging from virtual …

Cited by 12 - Related articles - All 3 versions

[PDF] thecvf.com

Adapting visual-language models for generalizable anomaly detection in medical images

C Huang, A Jiang, J Feng, Y Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Recent advancements in large-scale visual-language pre-trained models have led to
significant progress in zero-/few-shot anomaly detection within natural image domains …

Cited by 8 - Related articles - All 3 versions

[PDF] thecvf.com

MindBridge: A cross-subject brain decoding framework

S Wang, S Liu, Z Tan, X Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Brain decoding, a pivotal field in neuroscience, aims to reconstruct stimuli from acquired brain
signals, primarily utilizing functional magnetic resonance imaging (fMRI). Currently, brain …

Cited by 5 - Related articles - All 4 versions

[PDF] aaai.org

Mutual-modality adversarial attack with semantic perturbation

J Ye, R Yu, S Liu, X Wang - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org

Adversarial attacks constitute a notable threat to machine learning systems, given their
potential to induce erroneous predictions and classifications. However, within real-world …

Cited by 6 - Related articles - All 3 versions

[PDF] aaai.org

TC-LIF: A two-compartment spiking neuron model for long-term sequential modelling

S Zhang, Q Yang, C Ma, J Wu, H Li… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

The identification of sensory cues associated with potential opportunities and dangers is
frequently complicated by unrelated events that separate useful cues by long delays. As a …

Cited by 5 - Related articles - All 4 versions

[PDF] arxiv.org

Cross-attention makes inference cumbersome in text-to-image diffusion models

W Zhang, H Liu, J Xie, F Faccio, MZ Shou… - arXiv preprint arXiv …, 2024 - arxiv.org

This study explores the role of cross-attention during inference in text-conditional diffusion
models. We find that cross-attention outputs converge to a fixed point after few inference …

Cited by 7 - Related articles - All 2 versions

[PDF] arxiv.org

Laptop-diff: Layer pruning and normalized distillation for compressing diffusion models

D Zhang, S Li, C Chen, Q Xie, H Lu - arXiv preprint arXiv:2404.11098, 2024 - arxiv.org

In the era of AIGC, the demand for low-budget or even on-device applications of diffusion
models emerged. In terms of compressing the Stable Diffusion models (SDMs), several …

Cited by 4 - Related articles - All 2 versions

[PDF] aaai.org

NightRain: Nighttime Video Deraining via Adaptive-Rain-Removal and Adaptive-Correction

B Lin, Y Jin, W Yan, W Ye, Y Yuan, S Zhang… - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Existing deep-learning-based methods for nighttime video deraining rely on synthetic data
due to the absence of real-world paired data. However, the intricacies of the real world …

Cited by 2 - Related articles - All 3 versions

[PDF] arxiv.org

Prompt-driven target speech diarization

Y Jiang, Z Chen, R Tao, L Deng… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

We introduce a novel task named 'target speech diarization', which seeks to determine
'when target event occurred' within an audio signal. We devise a neural architecture called …

Cited by 8 - Related articles - All 3 versions

[PDF] arxiv.org

Life regression based patch slimming for vision transformers

J Chen, L Chen, J Yang, T Shi, L Cheng, Z Feng… - Neural Networks, 2024 - Elsevier

Vision transformers have achieved remarkable success in computer vision tasks by using
multi-head self-attention modules to capture long-range dependencies within images …

Cited by 4 - Related articles - All 6 versions
