Related articles - Academic Resource Search

Extracting bibliographical data for PDF documents with HMM and external resources

WF Hsiao, TM Chang, E Thomas - Program, 2014 - emerald.com

Purpose–The purpose of this paper is to propose an automatic metadata extraction and
retrieval system to extract bibliographical information from digital academic documents in …

Cited by 5, Related articles, All 5 versions

[PDF] aclanthology.org

Bootstrapping multilingual metadata extraction: a showcase in cyrillic

J Krause, I Shapiro, T Saier… - Proceedings of the Second …, 2021 - aclanthology.org

Applications based on scholarly data are of ever increasing importance. This results in
disadvantages for areas where high-quality data and compatible systems are not available …

Cited by 5, Related articles, All 6 versions

[PDF] arxiv.org

New methods for metadata extraction from scientific literature

D Tkaczyk - arXiv preprint arXiv:1710.10201, 2017 - arxiv.org

Within the past few decades we have witnessed digital revolution, which moved scholarly
communication to electronic media and also resulted in a substantial increase in its volume …

Cited by 20, Related articles, All 3 versions

[PDF] ucy.ac.cy

GROBID: Combining automatic bibliographic data recognition and term extraction for scholarship publications

P Lopez - Research and Advanced Technology for Digital …, 2009 - Springer

Based on state of the art machine learning techniques, GROBID (GeneRation Of
BIbliographic Data) performs reliable bibliographic data extractions from scholar articles …

Cited by 338, Related articles, All 11 versions

[PDF] arxiv.org

A benchmark of pdf information extraction tools using a multi-task and multi-domain evaluation framework for academic documents

N Meuschke, A Jagdale, T Spinde, J Mitrović… - International Conference …, 2023 - Springer

Extracting information from academic PDF documents is crucial for numerous indexing,
retrieval, and analysis use cases. Choosing the best tool to extract specific content elements …

Cited by 13, Related articles, All 10 versions

[PDF] aclanthology.org

[PDF] An End-to-End Pipeline for Bibliography Extraction from Scientific Articles

B Joshi, A Symeonidou, SM Danish… - Proceedings of the …, 2023 - aclanthology.org

We introduce a comprehensive end-to-end pipeline designed to extract complete
bibliography section from English scientific articles in digital-born PDF format and further …

OCR++: a robust framework for information extraction from scholarly articles

M Singh, B Barua, P Palod, M Garg… - arXiv preprint arXiv …, 2016 - arxiv.org

This paper proposes OCR++, an open-source framework designed for a variety of
information extraction tasks from scholarly articles including metadata (title, author names …

Cited by 41, Related articles, All 9 versions

[HTML] plos.org

[HTML] Building an annotated corpus for automatic metadata extraction from multilingual journal article references

W Choi, HM Yoon, MH Hyun, HJ Lee, JW Seol, KD Lee… - PloS one, 2023 - journals.plos.org

Bibliographic references containing citation information of academic literature play an
important role as a medium connecting earlier and recent studies. As references contain …

Cited by 4, Related articles, All 6 versions

[PDF] psu.edu

Metadata extraction from bibliographies using bigram HMM

P Yin, M Zhang, ZH Deng, DQ Yang - Digital Libraries: International …, 2005 - Springer

In recent years, we have seen huge volumes of research papers available on the World
Wide Web. Metadata provides a good approach for organizing and retrieving these useful …

Cited by 51, Related articles, All 12 versions

[PDF] arxiv.org

Structured references from pdf articles: assessing the tools for bibliographic reference extraction and parsing

A Cioffi, S Peroni - International Conference on Theory and Practice of …, 2022 - Springer

Many solutions have been provided to extract bibliographic references from PDF papers.
Machine learning, rule-based and regular expressions approaches were among the most …

Cited by 2, Related articles, All 6 versions
