Abstract |
In October 2017, Pervez Rizvi published online a database of lists of collocation matches (sets of matching words with gaps allowed and not necessarily in the same order) and n-gram matches (matching sets of consecutive words) between 527 early modern English plays (the 1602 Additions to The Spanish Tragedy being counted as a separate short play). In May 2018, he published additional datasets comprising extended lists of collocation matches, counts of all n-gram matches, and lists of function word skip n-gram matches. For each play, “summary” spreadsheets show the numbers of its collocation and n-gram matches with each other play. Rizvi has also published “attribution tester” spreadsheets, which further summarize the data in the summary files for authorship attribution. This paper reviews his database of collocations and n-grams and examines the reliability of the “summary” and “attribution tester” spreadsheets. It points out some inadequacies in the database and spreadsheets. For example, his lists of plays omit some important titles, such as The Famous Victories of Henry the Fifth and Monsieur Thomas, and wrongly include, as Beaumont’s “Comedies and Tragedies,” the epistle dedicatory and other preliminary matter prefaced to the Beaumont and Fletcher folio published in 1647. It is also problematic that Rizvi’s set of Shakespeare control texts includes texts of his “collaborative” plays that cannot be confidently attributed to the playwright. Nevertheless, this paper stresses that Rizvi’s database, the largest collection of collocations and n-grams shared by early modern English plays ever published, is a precious source of information for anyone engaged in attribution research in early modern dramatic literature.
The paper also illustrates how Rizvi’s “summary” and “attribution tester” spreadsheets can be used effectively for authorship attribution, arguing that they can serve as a powerful attribution tool.
|