Authors
Allison Woodruff, Paul M Aoki, Eric Brewer, Paul Gauthier, Lawrence A Rowe
Publication date
1996/5/1
Journal
Computer Networks and ISDN Systems
Volume
28
Issue
7-11
Pages
963-980
Publisher
Elsevier
Description
We report on our examination of pages from the World Wide Web. We have analyzed data collected by the Inktomi Web crawler (this data currently comprises over 2.6 million HTML documents). We have examined many characteristics of these documents, including: document size; number and types of tags, attributes, file extensions, protocols, and ports; the number of in-links; and the ratio of document size to the number of tags and attributes. For a more limited set of documents, we have examined the following: the number and types of syntax errors and readability scores. These data have been aggregated to create a number of ranked lists, e.g., the ten most-used tags, the ten most common HTML errors.
Total citations
[Chart: citations per year, 1996 to 2023]
Scholar articles
A Woodruff, PM Aoki, E Brewer, P Gauthier, LA Rowe - Computer Networks and ISDN Systems, 1996