Towards publishing secure capsule-based analysis

J Murdock, J Jett, T Cole, Y Ma… - 2017 ACM/IEEE Joint …, 2017 - ieeexplore.ieee.org
Computational engagement with the HathiTrust Digital Library (HTDL) is confounded by the
in-copyright status and licensing restrictions on the majority of the content. Because of these …

Reliable access to massive restricted texts: Experience‐based evaluation

Z Peng, B Plale - Concurrency and Computation: Practice and …, 2020 - Wiley Online Library
Libraries are seeing growing numbers of digitized textual corpora that frequently come with
restrictions on their content. Computational analysis corpora that are large, while of interest …

Cloud-Based Service for Access Optimization to Textual Big Data

Z Peng - 2018 - search.proquest.com
Libraries are increasingly amassing large digitized textual corpora. Digitized volumes (i.e.,
books) are converted into page level searchable text, enabling page level or even phrase …

Exploiting graph‐based data to realize new functionalities for scholar‐built worksets

J Jett, TW Cole, JS Downie - Proceedings of the Association for …, 2017 - Wiley Online Library
In this poster, we describe how the HathiTrust Research Center (HTRC) is developing a
graph‐based approach to representing scholar‐built worksets in the HTRC's research …

Towards Evaluation of Cultural-scale Claims in Light of Topic Model Sampling Effects

J Murdock, J Zeng, C Allen - arXiv preprint arXiv:1512.05004, 2015 - arxiv.org
Cultural-scale models of full text documents are prone to over-interpretation by researchers
making unintentionally strong socio-linguistic claims (Pechenick et al., 2015) without …

Towards Cultural-Scale Models of Full Text

ACS HTRC, J Murdock, J Zeng… - arXiv preprint arXiv …, 2015 - researchgate.net
In this preliminary study, we examine whether random samples from within given Library of
Congress Classification Outline areas yield significantly different topic models. We find that …