- Academic resource search

Large-scale multi-modal pre-trained models: A comprehensive survey

X Wang, G Chen, G Qian, P Gao, XY Wei… - Machine Intelligence …, 2023 - Springer

With the urgent demand for generalized deep models, many pre-trained big models are
proposed, such as bidirectional encoder representations (BERT), vision transformer (ViT) …

Cited by 190 · Related articles · All 8 versions

[PDF] arxiv.org

ChatGPT-like large-scale foundation models for prognostics and health management: A survey and roadmaps

YF Li, H Wang, M Sun - Reliability Engineering & System Safety, 2024 - Elsevier

PHM technology is vital in industrial production and maintenance, identifying and predicting
potential equipment failures and damages. This enables proactive maintenance measures …

Cited by 48 · Related articles · All 5 versions

[PDF] arxiv.org

Timemarker: A versatile video-llm for long and short video understanding with superior temporal localization ability

S Chen, X Lan, Y Yuan, Z Jie, L Ma - arXiv preprint arXiv:2411.18211, 2024 - arxiv.org

Rapid development of large language models (LLMs) has significantly advanced multimodal
large language models (LMMs), particularly in vision-language tasks. However, existing …

Cited by 5 · Related articles · All 2 versions

[PDF] arxiv.org

Large scale foundation models for intelligent manufacturing applications: a survey

H Zhang, SD Semujju, Z Wang, X Lv, K Xu… - Journal of Intelligent …, 2025 - Springer

Although the applications of artificial intelligence especially deep learning have greatly
improved various aspects of intelligent manufacturing, they still face challenges for broader …

Cited by 5 · Related articles · All 2 versions

[PDF] mdpi.com

An Efficient Product-Customization Framework Based on Multimodal Data under the Social Manufacturing Paradigm

Y Li, H Wu, TS Tamir, Z Shen, S Liu, B Hu, G Xiong - Machines, 2023 - mdpi.com

With improvements in social productivity and technology, along with the popularity of the
Internet, consumer demands are becoming increasingly personalized and diversified …

Cited by 7 · Related articles · All 3 versions

[PDF] arxiv.org

CBVS: A Large-Scale Chinese Image-Text Benchmark for Real-World Short Video Search Scenarios

BagFormer: Better cross-modal retrieval via bag-wise interaction

Enhanced image-text retrieval based on CLIP with YOLOv10 and Next-ViT

Chinese image description evaluation method based on target domain semantic constraints
