AxCell: Automatic extraction of results from machine learning papers

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org

Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

Cited by 64 · Related articles · All 4 versions

Unifying vision, text, and layout for universal document processing

Z Tang, Z Yang, G Wang, Y Fang… - Proceedings of the …, 2023 - openaccess.thecvf.com

We propose Universal Document Processing (UDOP), a foundation Document AI
model which unifies text, image, and layout modalities together with varied task formats …

Cited by 95 · Related articles · All 6 versions

ChemDataExtractor 2.0: Autopopulated ontologies for materials science

J Mavracic, CJ Court, T Isazawa… - Journal of Chemical …, 2021 - ACS Publications

The ever-growing abundance of data found in heterogeneous sources, such as scientific
publications, has forced the development of automated techniques for data extraction. While …

Cited by 82 · Related articles · All 3 versions

MATE: multi-view attention for table transformer efficiency

JM Eisenschlos, M Gor, T Müller, WW Cohen - arXiv preprint arXiv …, 2021 - arxiv.org

This work presents a sparse-attention Transformer architecture for modeling documents that
contain large tables. Tables are ubiquitous on the web, and are rich in information. However …

Cited by 86 · Related articles · All 7 versions

HiTab: A hierarchical table dataset for question answering and natural language generation

Z Cheng, H Dong, Z Wang, R Jia, J Guo, Y Gao… - arXiv preprint arXiv …, 2021 - arxiv.org

Tables are often created with hierarchies, but existing works on table reasoning mainly focus
on flat tables and neglect hierarchical tables. Hierarchical tables challenge existing methods …

Cited by 87 · Related articles · All 5 versions

DocLLM: A layout-aware generative language model for multimodal document understanding

D Wang, N Raman, M Sibue, Z Ma, P Babkin… - arXiv preprint arXiv …, 2023 - arxiv.org

Enterprise documents such as forms, invoices, receipts, reports, contracts, and other similar
records, often carry rich semantics at the intersection of textual and spatial modalities. The …

Cited by 34 · Related articles · All 2 versions

Large language models for tabular data: Progresses and future directions

H Dong, Z Wang - Proceedings of the 47th International ACM SIGIR …, 2024 - dl.acm.org

Tables contain a significant portion of the world's structured information. The ability to
efficiently and accurately understand, process, reason about, analyze, and generate tabular …

Cited by 11 · Related articles · All 2 versions

MMLongBench-Doc: Benchmarking long-context document understanding with visualizations

Y Ma, Y Zang, L Chen, M Chen, Y Jiao, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Understanding documents with rich layouts and multi-modal components is a long-standing
and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable …

Cited by 11 · Related articles · All 4 versions

Revise and resubmit: An intertextual model of text-based collaboration in peer review

I Kuznetsov, J Buchmann, M Eichler… - Computational …, 2022 - direct.mit.edu

Peer review is a key component of the publishing process in most fields of science.
Increasing submission rates put a strain on reviewing quality and efficiency, motivating the …

Cited by 31 · Related articles · All 8 versions

The current state of the art in deep learning for image classification: a review

A Byerly, T Kalganova, R Ott - Science and information conference, 2022 - Springer

We present a review of the methods behind the top 40 highest accuracies achieved on the
ILSVRC 2012 ImageNet validation set as ranked on Papers with Code. A significant …

Cited by 13 · Related articles · All 3 versions
