A context-free method for the computational analysis of Buddhist texts

C Handy - Digital Humanities and Buddhism: An Introduction …, 2019 - degruyter.com

Digital Humanities and Buddhism: An Introduction. Berlin: De Gruyter, 2019 • degruyter.com

This study demonstrates a practical method for extracting recurrent strings from digitized texts in cases where grammar, vocabulary and other information about the texts are partly or entirely unknown. My method involves building concordances of words and phrases from digitized input sets of texts, using a simple but effective pattern-recognition algorithm. The algorithm can be generalized to work with information in any language, but I restrict this study to just three major languages of the Buddhist tradition: classical Sanskrit, classical Tibetan and classical Chinese. I utilize free text files available in online databases so that my examples can be verified easily. I also provide C source code examples of the algorithm, available at a web link mentioned later in this paper.

In modern English, and in many other modern languages, words in texts are separated by spaces, are non-inflected, and are essentially discrete particles that can be read into a computer as individual strings. A practical consequence of this linguistic feature is that computer spell-checkers, text search algorithms, and similar functions are computationally inexpensive: processing is fast and texts take up little storage. Most computer programming languages in use today have string-analysis functions based on European languages using a standard roman character set. Unicode standards make it easier to input and output non-roman scripts, but the general string functions in C and Unix (and in later

Thanks to Lance Adams for providing use of his high-performance computing environment, a Linux cluster of 128 logical processors, which greatly reduced the processing time required for this project. While my computer program is able to run on an ordinary desktop computer (or notebook computer), having this extra power was of great benefit in testing the limits of the algorithm, allowing me to process entire corpora in a reasonable amount of time.
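The abstract above describes extracting recurrent strings without any knowledge of grammar or vocabulary. As a rough illustration only, and not the author's published code, the following C sketch counts character n-grams in a UTF-8 string and reports those that recur. The function names (`utf8_len`, `utf8_offsets`, `recurrent_ngrams`), the parameters `n` and `min_count`, and the brute-force quadratic comparison are all assumptions made for this example; a program meant for whole corpora would need a more efficient data structure.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Byte length of the UTF-8 character whose lead byte is *s
 * (malformed lead bytes are treated as one-byte characters). */
static size_t utf8_len(const unsigned char *s)
{
    if (*s < 0x80) return 1;
    if ((*s & 0xE0) == 0xC0) return 2;
    if ((*s & 0xF0) == 0xE0) return 3;
    if ((*s & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Returns a malloc'd array of byte offsets, one per character, plus a
 * final entry equal to the byte length of text. Sets *nchars. */
static size_t *utf8_offsets(const char *text, size_t *nchars)
{
    size_t len = strlen(text), pos = 0, k = 0;
    size_t *off = malloc((len + 1) * sizeof *off);
    if (!off) return NULL;
    while (pos < len) {
        size_t step = utf8_len((const unsigned char *)text + pos);
        if (pos + step > len) step = 1;   /* truncated sequence at end */
        off[k++] = pos;
        pos += step;
    }
    off[k] = len;
    *nchars = k;
    return off;
}

/* Hypothetical helper: writes every distinct n-character string that
 * occurs at least min_count times (overlaps allowed) to out, one per
 * line with its count, and returns how many distinct strings it wrote.
 * No word boundaries or dictionaries are used, only raw characters. */
size_t recurrent_ngrams(const char *text, size_t n, size_t min_count, FILE *out)
{
    size_t nchars = 0, reported = 0;
    size_t *off = utf8_offsets(text, &nchars);
    if (!off) return 0;
    if (n == 0 || nchars < n) { free(off); return 0; }

    size_t last = nchars - n;             /* last valid start index */
    for (size_t i = 0; i <= last; i++) {
        size_t blen = off[i + n] - off[i];
        size_t count = 0;
        int seen_before = 0;
        for (size_t j = 0; j <= last; j++) {
            if (off[j + n] - off[j] == blen &&
                memcmp(text + off[i], text + off[j], blen) == 0) {
                if (j < i) { seen_before = 1; break; } /* already reported */
                count++;
            }
        }
        if (!seen_before && count >= min_count) {
            fprintf(out, "%.*s\t%zu\n", (int)blen, text + off[i], count);
            reported++;
        }
    }
    free(off);
    return reported;
}

#ifdef NGRAM_DEMO
int main(void)
{
    /* Two-character strings occurring at least twice in a short sample. */
    recurrent_ngrams("如是我聞如是", 2, 2, stdout);
    return 0;
}
#endif
```

Compiled with `-DNGRAM_DEMO`, the demo reports the two-character string 如是 with a count of 2. The sketch steps through the text by UTF-8 character rather than by byte, so the same routine can be applied to Chinese, Tibetan, or romanized Sanskrit input without language-specific tokenization.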
