A novel approach for content extraction from web pages

MO Samuel, AI Tolulope… - Journal of Physics …, 2019 - iopscience.iop.org

Knowledge in web documents, relevance ranking of webpages, and so on are
some of the under-researched areas in web content mining (WCM). Apart from the general …

Cited by 16 · 3 versions

HTML web content extraction using paragraph tags

HJ Carey, M Manic - 2016 IEEE 25th International Symposium …, 2016 - ieeexplore.ieee.org

With the ever expanding use of the internet to disseminate information across the world,
gathering useful information from the multitude of web page styles continues to be a difficult …

Cited by 26 · 4 versions

Semantic web mining for content-based online shopping recommender systems

IT Afolabi, OS Makinde, OO Oladipupo - International Journal of …, 2019 - igi-global.com

Currently, for content-based recommendations, semantic analysis of text from webpages
seems to be a major problem. In this research, we present a semantic web content mining …

Cited by 13 · 5 versions

Semantics based web ranking using a robust weight scheme

RV Priya, V Vijayakumar, L Yang - International Journal of Web …, 2019 - igi-global.com

In this paper, HTML tags and attributes are used to determine different structural position of
text in a web page. Tags-attributes based models are used to assign a weight to a text that …

Cited by 4 · 8 versions

Web content extraction based on subject detection and node density

W Petprasit, S Jaiyen - 2015 7th International Conference on …, 2015 - ieeexplore.ieee.org

Currently, very large data have been transferred from everywhere through World Wide Web.
Consequently, the information extraction systems have been arising and many researches …

Cited by 6 · 2 versions

Marketplace affiliates potential analysis using cosine similarity and vision-based page segmentation

WB Zulfikar, M Irfan, M Ghufron, J Jumadi… - Bulletin of Electrical …, 2020 - beei.org

One success factor of an online affiliate is determined by the quality of the content source.
Therefore, affiliate marketplaces need to do an objective assessment to retrieve content data …

Cited by 3 · 5 versions

Various Approaches for Content Extraction from Web Pages based on Factors

DM Kene, A Iqbal - Recent Advancements in Science and Technology, 2024 - vbmv.org

With the huge development of the internet and web publishing techniques generally create
numerous information sources published as HTML pages on World Wide Web. So Extraction …

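Research on Ranking Algorithms for Link-Free Documents

Jiang Zhaolong, Zhao Zemao - Journal of Hangzhou Dianzi University (Natural Sciences), 2015 - cqvip.com

With the arrival of the big data era, data formats have become diverse, and processing Web data is no longer limited to page links;
documents without a link structure must be handled as well. How to obtain the required information from massive document collections is a problem that search engines urgently need to solve …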

A Research on Web Content Extraction and Noise Reduction through Text Density Using Malicious URL Pattern Detection

C Patel, H Diwanji - 2016 - academia.edu

A Web Page has large amount of information including some additional
contents like hyperlinks, header footer, navigational panel; advertisements which may cause …

Cited by 2 · 2 versions

An Improved VIPS-based Algorithm of Extracting Web Content

L Li, AM Zhou, Y Fang, L Liu, Q Wu - Applied Mechanics and …, 2014 - Trans Tech Publ

The paper studies the VIPS algorithm, and improves VIPS which has the deficiency with
complex rules and low performance, according that the Web page has the feature of DIV …

Cited by 2 · 2 versions
