Accelerating materials discovery using artificial intelligence, high performance computing and robotics

EO Pyzer-Knapp, JW Pitera, PWJ Staar… - npj Computational …, 2022 - nature.com
New tools enable new ways of working, and materials science is no exception. In materials
discovery, traditional manual, serial, and human-intensive work is being augmented by …

DocLayNet: A large human-annotated dataset for document-layout segmentation

B Pfitzmann, C Auer, M Dolfi, AS Nassar… - Proceedings of the 28th …, 2022 - dl.acm.org
Accurate document layout analysis is a key requirement for high-quality PDF document
conversion. With the recent availability of public, large ground-truth datasets such as …

TableFormer: Table structure understanding with transformers

A Nassar, N Livathinos, M Lysak… - Proceedings of the …, 2022 - openaccess.thecvf.com
Tables organize valuable content in a concise and compact representation. This content is
extremely valuable for systems such as search engines, knowledge graphs, etc., since they …

Skin tone analysis for representation in educational materials (STAR-ED) using machine learning

GA Tadesse, C Cintas, KR Varshney, P Staar… - npj Digital …, 2023 - nature.com
Images depicting dark skin tones are significantly underrepresented in the educational
materials used to teach primary care physicians and dermatologists to recognize skin …

PDF malware detection based on optimizable decision trees

Q Abu Al-Haija, A Odeh, H Qattous - Electronics, 2022 - mdpi.com
Portable document format (PDF) files are one of the most universally used file types. This
has incentivized hackers to develop methods to use these normally innocent PDF files to …

An overview on the role of artificial intelligence in modern advancements of material science

M Das, TC Perez, D Shetty, P Hiremath, N Naik… - ES General, 2024 - espublisher.com
Artificial intelligence (AI) has become a disruptive force in many industries over the past few
decades, and the subjects of material science and engineering are no exception. This …

VILA: Improving structured content extraction from scientific PDFs using visual layout groups

Z Shen, K Lo, LL Wang, B Kuehl, DS Weld… - Transactions of the …, 2022 - direct.mit.edu
Accurately extracting structured content from PDFs is a critical first step for NLP over
scientific papers. Recent work has improved extraction accuracy by incorporating …

FETA: Towards specializing foundational models for expert task applications

A Alfassy, A Arbelle, O Halimi… - Advances in …, 2022 - proceedings.neurips.cc
Foundational Models (FMs) have demonstrated unprecedented capabilities
including zero-shot learning, high fidelity data synthesis, and out of domain generalization …

A benchmark of PDF information extraction tools using a multi-task and multi-domain evaluation framework for academic documents

N Meuschke, A Jagdale, T Spinde, J Mitrović… - International Conference …, 2023 - Springer
Extracting information from academic PDF documents is crucial for numerous indexing,
retrieval, and analysis use cases. Choosing the best tool to extract specific content elements …

ICDAR 2023 competition on robust layout segmentation in corporate documents

C Auer, A Nassar, M Lysak, M Dolfi… - … on Document Analysis …, 2023 - Springer
Transforming documents into machine-processable representations is a challenging task
due to their complex structures and variability in formats. Recovering the layout structure and …